Abstract
One of the main unsolved problems in molecular biology is the prediction of functional associations between proteins. In this paper, Genetic Programming (GP) is applied to this domain. GP evolves an expression, equivalent to a binary classifier, that predicts whether a given pair of proteins interacts. We take advantage of GP's flexibility, in particular the possibility of defining new operations. The missing-values problem is addressed by defining if-unknown, a new operation that is better suited to the semantics of the domain data. In addition, to reduce solution size and computation time, we use the Tarpeian method, which controls the bloat effect in GP. The results confirm the feasibility of using GP in this domain and show that the Tarpeian method improves both search efficiency and the interpretability of the solutions.
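The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define if-unknown, so the three-argument form below (a value to test plus two branches) is an assumption, as are the `UNKNOWN` marker, the feature names `score_a` and `score_b`, the threshold, and the `p_kill` parameter. The Tarpeian step follows the general idea of the method: with some probability, an individual larger than the population's average size receives the worst possible fitness without being evaluated. Fitness is assumed here to be maximised.

```python
import math
import random

UNKNOWN = None  # assumed marker for a missing feature value


def if_unknown(value, if_missing, if_present):
    """Hypothetical GP primitive: choose a branch by whether value is missing."""
    return if_missing if value is UNKNOWN else if_present


def predict(pair):
    """Hand-written stand-in for an evolved expression over a protein pair.

    The feature names and the 0.5 threshold are illustrative only.
    """
    a = if_unknown(pair["score_a"], 0.0, pair["score_a"])
    b = if_unknown(pair["score_b"], 0.0, pair["score_b"])
    return a + b > 0.5


def tarpeian_fitness(sizes, evaluate, p_kill=0.3, rng=random):
    """Return one fitness per individual (higher is better).

    sizes[i] is the node count of individual i; evaluate(i) computes its
    real fitness. Above-average-size individuals are, with probability
    p_kill, given the worst fitness and never evaluated.
    """
    mean_size = sum(sizes) / len(sizes)
    fitness = []
    for index, size in enumerate(sizes):
        if size > mean_size and rng.random() < p_kill:
            fitness.append(-math.inf)  # penalised, evaluation skipped
        else:
            fitness.append(evaluate(index))
    return fitness
```

Skipping the evaluation of penalised individuals is where the saving in computation time comes from, while the selection pressure against oversized programs keeps the evolved expressions shorter.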
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Garcia, B., Aler, R., Ledezma, A., Sanchis, A. (2008). Genetic Programming for Predicting Protein Networks. In: Geffner, H., Prada, R., Machado Alexandre, I., David, N. (eds) Advances in Artificial Intelligence – IBERAMIA 2008. IBERAMIA 2008. Lecture Notes in Computer Science, vol 5290. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88309-8_44
DOI: https://doi.org/10.1007/978-3-540-88309-8_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88308-1
Online ISBN: 978-3-540-88309-8
eBook Packages: Computer Science, Computer Science (R0)