Abstract
Human genetics is undergoing a data explosion. Methods are now available to measure DNA sequence variation throughout the human genome. Given current knowledge, it seems likely that common human diseases are best predicted by interactions between biological components, which can be examined as interacting DNA sequence variations. The challenge is thus to search these high-dimensional datasets for combinations of variations likely to predict common diseases. The goal of this paper was to develop and evaluate a genetic programming (GP) mutation operator suited to this task, one that exploits expert knowledge in the form of Tuned ReliefF (TuRF) scores during mutation. We show that expert-knowledge-guided mutation performs similarly to expert-knowledge-guided selection. This study demonstrates that, in the context of an expert-knowledge-aware GP, mutation may be an appropriate component of the GP used to search for interacting predictors in this domain.
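To make the idea of score-guided mutation concrete, the following is a minimal sketch and not the authors' implementation. It assumes an individual is a list of attribute (SNP) indices, that one precomputed TuRF-style score is available per attribute, and that a replacement attribute is drawn with probability proportional to its shifted score. The function names `weighted_attribute_choice` and `guided_mutate`, the `rate` parameter, and the score-shifting scheme are all hypothetical choices made for illustration.

```python
import random


def weighted_attribute_choice(scores, rng):
    """Pick an attribute index with probability proportional to its shifted score.

    Assumption: scores may be negative (as ReliefF-style weights can be),
    so they are shifted to be strictly positive before sampling.
    """
    low = min(scores)
    weights = [s - low + 1e-9 for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


def guided_mutate(individual, scores, rate, rng):
    """Return a mutated copy of `individual`.

    Each position is replaced, with probability `rate`, by an attribute
    drawn from the score-weighted distribution rather than uniformly.
    """
    mutated = list(individual)
    for i in range(len(mutated)):
        if rng.random() < rate:
            mutated[i] = weighted_attribute_choice(scores, rng)
    return mutated


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical scores for five attributes; attribute 4 scores highest.
    example_scores = [-0.02, 0.01, 0.00, 0.03, 0.40]
    print(guided_mutate([0, 1, 2], example_scores, rate=0.5, rng=rng))
```

Under these assumptions, attributes with higher scores are proposed more often during mutation, while low-scoring attributes keep a small nonzero chance of being selected, so the search is biased without being restricted to the top-ranked attributes.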
Keywords
- Genetic Programming
- Expert Knowledge
- Mutation Operator
- Multifactor Dimensionality Reduction
- Common Human Disease
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Greene, C.S., White, B.C., Moore, J.H. (2007). An Expert Knowledge-Guided Mutation Operator for Genome-Wide Genetic Analysis Using Genetic Programming. In: Rajapakse, J.C., Schmidt, B., Volkert, G. (eds) Pattern Recognition in Bioinformatics. PRIB 2007. Lecture Notes in Computer Science(), vol 4774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75286-8_4
DOI: https://doi.org/10.1007/978-3-540-75286-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75285-1
Online ISBN: 978-3-540-75286-8
eBook Packages: Computer Science, Computer Science (R0)