Exploiting Expert Knowledge in Genetic Programming for Genome-Wide Genetic Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4193)

Abstract

Human genetics is undergoing an information explosion. The availability of chip-based technology facilitates the measurement of thousands of DNA sequence variations from across the human genome. The challenge is to sift through these high-dimensional datasets to identify combinations of interacting DNA sequence variations that are predictive of common diseases. The goal of this paper was to develop and evaluate a genetic programming (GP) approach for attribute selection and modeling that uses expert knowledge, such as Tuned ReliefF (TuRF) scores, during selection to ensure that trees with good building blocks are recombined and reproduced. We show here that using expert knowledge to select trees performs as well as a multiobjective fitness function but requires only a tenth of the population size. This study demonstrates that GP may be a useful computational discovery tool in this domain.
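
The idea of letting expert knowledge guide selection can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: each tree is reduced to the list of attribute (SNP) indices it uses, `weights` stands in for precomputed TuRF scores, a tree is scored by the best weight among its attributes, and parents are chosen by tournament. The names `tree_score` and `select_parent` are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence


def tree_score(attributes: Sequence[int], weights: Dict[int, float]) -> float:
    """Score a tree by the highest expert-knowledge weight among its attributes.

    Assumption: a tree is summarized by the attribute indices it references,
    and `weights` maps attribute index -> precomputed score (e.g. from TuRF).
    Unknown attributes count as 0.0; an empty tree scores 0.0.
    """
    return max((weights.get(a, 0.0) for a in attributes), default=0.0)


def select_parent(
    population: List[List[int]],
    weights: Dict[int, float],
    tournament_size: int = 3,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Tournament selection driven by expert-knowledge scores.

    Draws `tournament_size` distinct trees at random and returns the one whose
    attributes carry the highest expert-knowledge score.
    """
    rng = rng or random.Random()
    contestants = rng.sample(population, tournament_size)
    return max(contestants, key=lambda tree: tree_score(tree, weights))


if __name__ == "__main__":
    # Hypothetical scores for six attributes and a population of three trees.
    weights = {0: 0.1, 1: 0.2, 2: 0.9, 3: 0.0, 4: 0.3, 5: 0.4}
    population = [[0, 1], [2, 3], [4, 5]]
    parent = select_parent(population, weights, tournament_size=3)
    print(parent)
```

Scoring by the maximum weight favors any tree that contains at least one highly ranked attribute; a mean or sum would be an equally plausible choice under these assumptions.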

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moore, J.H., White, B.C. (2006). Exploiting Expert Knowledge in Genetic Programming for Genome-Wide Genetic Analysis. In: Runarsson, T.P., Beyer, H.-G., Burke, E., Merelo-Guervós, J.J., Whitley, L.D., Yao, X. (eds) Parallel Problem Solving from Nature - PPSN IX. PPSN 2006. Lecture Notes in Computer Science, vol 4193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11844297_98

  • DOI: https://doi.org/10.1007/11844297_98

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-38990-3

  • Online ISBN: 978-3-540-38991-0

  • eBook Packages: Computer Science, Computer Science (R0)
