Inference of Regular Expressions for Text Extraction from Examples
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@Article{Bartoli:2016:ieeeTKDE,
  author = "Alberto Bartoli and Andrea {De Lorenzo} and
    Eric Medvet and Fabiano Tarlao",
  title = "Inference of Regular Expressions for Text Extraction
    from Examples",
  journal = "IEEE Transactions on Knowledge and Data Engineering",
  year = "2016",
  volume = "28",
  number = "5",
  pages = "1217--1230",
  month = may,
  keywords = "genetic algorithms, genetic programming",
  ISSN = "1041-4347",
  URL = "http://www.human-competitive.org/sites/default/files/bartoli-delorenzo-medvet-tarlao-text.txt",
  DOI = "doi:10.1109/TKDE.2016.2515587",
  abstract = "A large class of entity extraction tasks from text
    that is either semistructured or fully unstructured may
    be addressed by regular expressions, because in many
    practical cases the relevant entities follow an
    underlying syntactical pattern and this pattern may be
    described by a regular expression. In this work, we
    consider the long-standing problem of synthesizing such
    expressions automatically, based solely on examples of
    the desired behaviour. We present the design and
    implementation of a system capable of addressing
    extraction tasks of realistic complexity. Our system is
    based on an evolutionary procedure carefully tailored
    to the specific needs of regular expression generation
    by examples. The procedure executes a search driven by
    a multiobjective optimization strategy aimed at
    simultaneously improving multiple performance indexes
    of candidate solutions while at the same time ensuring
    an adequate exploration of the huge solution space. We
    assess our proposal experimentally in great depth, on a
    number of challenging datasets. The accuracy of the
    obtained solutions seems to be adequate for practical
    usage and improves over earlier proposals
    significantly. Most importantly, our results are highly
    competitive even with respect to human operators. A
    prototype is available as a web application at
    regex.inginf.units.it",
  notes = "Entered 2016 HUMIES Department of Engineering and
    Architecture (DIA), University of Trieste, Italy. Also
    known as \cite{7374717}",
}