Abstract
The search for the underlying heritability of complex traits has led to an explosion of data generation and analysis in the field of human genomics. With these technological advances, we have made some progress in identifying genes and proteins associated with common, complex human diseases. Still, our understanding of the genetic architecture of complex traits remains limited, and additional research is needed to illuminate the genetic and environmental factors important for the disease process. Much of that research will involve looking at variation in DNA, RNA, protein, etc. in a meta-dimensional analysis framework. We have developed a machine learning technique, ATHENA: Analysis Tool for Heritable and Environmental Network Associations, to address the problem of integrating data from multiple “-omics” technologies to identify models that explain or predict the genetic architecture of complex traits. In this chapter, we discuss the challenges in handling meta-dimensional data using grammatical evolution neural networks (GENN), which are one modeling component of ATHENA, and we characterize the models identified in simulation studies to explore the ability of GENN to build complex, meta-dimensional models. Challenges remain in understanding the evolutionary process for GENN and in explaining the simplicity of the models it identifies. This work highlights potential areas for extension and improvement of the GENN approach within ATHENA.
Acknowledgements
ERH was supported by NIH/NIGMS training grant T32 GM080178. MDR was supported by NIH grants LM010040 and P-STAR. P-STAR (PGRN Statistical Analysis Resource) is supported by funding from NIGMS and is part of the PGRN (Pharmacogenomics Research Network). P-STAR is a component of HL065962.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Ritchie, M.D., Holzinger, E.R., Dudek, S.M., Frase, A.T., Chalise, P., Fridley, B. (2013). Meta-Dimensional Analysis of Phenotypes Using the Analysis Tool for Heritable and Environmental Network Associations (ATHENA): Challenges with Building Large Networks. In: Riolo, R., Vladislavleva, E., Ritchie, M., Moore, J. (eds) Genetic Programming Theory and Practice X. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6846-2_8
DOI: https://doi.org/10.1007/978-1-4614-6846-2_8
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6845-5
Online ISBN: 978-1-4614-6846-2
eBook Packages: Computer Science, Computer Science (R0)