Active Learning of Regular Expressions for Entity Extraction
Created by W.Langdon from
gp-bibliography.bib Revision:1.5787
@Article{Bartoli:2017:ieeeTC,
author = "A. Bartoli and A. {De Lorenzo} and E. Medvet and
F. Tarlao",
journal = "IEEE Transactions on Cybernetics",
title = "Active Learning of Regular Expressions for Entity
Extraction",
year = "2017",
abstract = "We consider the automatic synthesis of an entity
extractor, in the form of a regular expression, from
examples of the desired extractions in an unstructured
text stream. This is a long-standing problem for which
many different approaches have been proposed, which all
require the preliminary construction of a large dataset
fully annotated by the user. In this paper, we propose
an active learning approach aimed at minimizing the
user annotation effort: the user annotates only one
desired extraction and then merely answers extraction
queries generated by the system. During the learning
process, the system digs into the input text for
selecting the most appropriate extraction query to be
submitted to the user in order to improve the current
extractor. We construct candidate solutions with
genetic programming (GP) and select queries with a form
of querying-by-committee, i.e., based on a measure of
disagreement within the best candidate solutions. All
the components of our system are carefully tailored to
the peculiarities of active learning with GP and of
entity extraction from unstructured text. We evaluate
our proposal in depth, on a number of challenging
datasets and based on a realistic estimate of the user
effort involved in answering each single query. The
results demonstrate high accuracy with significant
savings in terms of computational effort, annotated
characters, and execution time over a state-of-the-art
baseline.",
keywords = "genetic algorithms, genetic programming",
DOI = "doi:10.1109/TCYB.2017.2680466",
ISSN = "2168-2267",
notes = "Also known as \cite{7886274}",
}
Genetic Programming entries for
Alberto Bartoli
Andrea De Lorenzo
Eric Medvet
Fabiano Tarlao