The use of genetic programming to build queries for information retrieval
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@InProceedings{Kraft:1994:GPqir,
author = "Donald H. Kraft and Frederick E. Petry and
Bill P. Buckles and Thyagarajan Sadasivan",
title = "The use of genetic programming to build queries for
information retrieval",
booktitle = "Proceedings of the 1994 IEEE World Congress on
Computational Intelligence",
year = "1994",
volume = "1",
pages = "468--473",
address = "Orlando, Florida, USA",
month = "27-29 " # jun,
publisher = "IEEE Press",
keywords = "genetic algorithms, genetic programming, Boolean query
formulation, fitness function, index term space,
information retrieval system, parse tree, relevance
feedback, topicality, user defined measures, Boolean
functions, information retrieval systems, query
processing, search problems, trees (mathematics),
vocabulary",
DOI = "10.1109/ICEC.1994.349905",
size = "6 pages",
abstract = "Genetic programming is applied to an information
retrieval system in order to improve Boolean query
formulation via relevance feedback. This approach
brings together the concepts of information retrieval
and genetic programming. Documents are viewed as
vectors in index term space. A Boolean query, viewed as
a parse tree, is an organism in the genetic programming
sense. Through the mechanisms of genetic programming,
the query is modified in order to improve precision and
recall. Relevance feedback is incorporated, in part,
via user defined measures over a trial set of
documents. The fitness of a candidate query can be
expressed directly as a function of the relevance of
the retrieved set. Preliminary results based on a
testbed are given. The form of the fitness function has
a significant effect upon performance and the proper
fitness functions take into account relevance based on
topicality (and perhaps other factors)",
notes = "Syntax of programs restricted to Boolean disjunctive
normal form. Has 4923 terms (=terminals?). Seeds the
initial population so that 80% of terms are predetermined
to be relevant, with the rest chosen at random. 20%
mutation rate (three kinds). Fitness ~ relevance of
documents retrieved.",
}