Abstract
The etiology of common human disease often involves a complex genetic architecture, where numerous points of genetic variation interact to influence disease susceptibility. Automating the detection of such epistatic genetic risk factors poses a major computational challenge, as the number of possible gene-gene interactions increases combinatorially with the number of sequence variations. Previously, we addressed this challenge with the development of a computational evolution system (CES) that incorporates greater biological realism than traditional artificial evolution methods. Our results demonstrated that CES is capable of efficiently navigating these large and rugged epistatic landscapes toward the discovery of biologically meaningful genetic models of disease predisposition. Further, we have shown that the efficacy of CES is improved dramatically when the system is provided with statistical expert knowledge. We anticipate that biological expert knowledge, such as genetic regulatory or protein-protein interaction maps, will provide complementary information, and further improve the ability of CES to model the genetic architectures of common human disease. The goal of this study is to test this hypothesis, utilizing publicly available protein-protein interaction information. We show that by incorporating this source of expert knowledge, the system is able to identify functional interactions that represent more concise models of disease susceptibility with improved accuracy. Our ability to incorporate biological knowledge into learning algorithms is an essential step toward the routine use of methods such as CES for identifying genetic risk factors for common human diseases.
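The combinatorial growth described above can be made concrete with a short calculation. The sketch below is illustrative only and is not part of CES. It assumes an "interaction" means an unordered set of distinct sequence variations, and the function name `interaction_count` and the example variant counts are hypothetical.

```python
from math import comb


def interaction_count(num_variants: int, order: int) -> int:
    """Count unordered sets of `order` distinct variants among `num_variants`.

    Assumption: an interaction is an unordered combination of variants,
    so the count is the binomial coefficient C(num_variants, order).
    """
    return comb(num_variants, order)


if __name__ == "__main__":
    # Hypothetical variant counts, chosen only to show the growth rate.
    for n in (100, 1_000, 10_000):
        pairs = interaction_count(n, 2)
        triples = interaction_count(n, 3)
        print(f"{n} variants: {pairs} pairwise, {triples} three-way")
```

Under this assumption, a tenfold increase in the number of variants raises the pairwise count roughly a hundredfold and the three-way count roughly a thousandfold, which is why exhaustive enumeration quickly becomes impractical.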
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Pattin, K.A., Payne, J.L., Hill, D.P., Caldwell, T., Fisher, J.M., Moore, J.H. (2011). Exploiting Expert Knowledge of Protein-Protein Interactions in a Computational Evolution System for Detecting Epistasis. In: Riolo, R., McConaghy, T., Vladislavleva, E. (eds) Genetic Programming Theory and Practice VIII. Genetic and Evolutionary Computation, vol 8. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7747-2_12
DOI: https://doi.org/10.1007/978-1-4419-7747-2_12
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-7746-5
Online ISBN: 978-1-4419-7747-2
eBook Packages: Computer Science, Computer Science (R0)