Genome-Wide Genetic Analysis Using Genetic Programming: The Critical Need for Expert Knowledge
Created by W.Langdon from
gp-bibliography.bib Revision:1.8081
@InCollection{Moore:2006:GPTP,
author = "Jason H. Moore and Bill C. White",
title = "Genome-Wide Genetic Analysis Using Genetic
Programming: The Critical Need for Expert Knowledge",
booktitle = "Genetic Programming Theory and Practice {IV}",
year = "2006",
editor = "Rick L. Riolo and Terence Soule and Bill Worzel",
volume = "5",
series = "Genetic and Evolutionary Computation",
pages = "11--28",
address = "Ann Arbor",
month = "11-13 " # may,
publisher = "Springer",
keywords = "genetic algorithms, genetic programming",
ISBN = "0-387-33375-4",
DOI = "doi:10.1007/978-0-387-49650-4_2",
size = "16 pages",
abstract = "Human genetics is undergoing an information explosion.
The availability of chip-based technology facilitates
the measurement of thousands of DNA sequence variations
from across the human genome. The challenge is to sift
through these high-dimensional datasets to identify
combinations of interacting DNA sequence variations
that are predictive of common diseases. The goal of
this study is to develop and evaluate a genetic
programming (GP) approach to attribute selection and
classification in this domain. We simulated genetic
datasets of varying size in which the disease model
consists of two interacting DNA sequence variations
that exhibit no independent effects on class (i.e.
epistasis). We show that GP is no better than a simple
random search when classification accuracy is used as
the fitness function. We then show that including
pre-processed estimates of attribute quality using
Tuned ReliefF (TuRF) in a multi-objective fitness
function that also includes accuracy significantly
improves the performance of GP over that of random
search. This study demonstrates that GP may be a useful
computational discovery tool in this domain. This study
raises important questions about the general utility of
GP for these types of problems, the importance of data
pre-processing, the ideal functional form of the
fitness function, and the importance of expert
knowledge. We anticipate this study will provide an
important baseline for future studies investigating the
usefulness of GP as a general computational discovery
tool for large-scale genetic studies.",
notes = "part of \cite{Riolo:2006:GPTP} Published Jan 2007
after the workshop",
}