Controlling Overfitting in Symbolic Regression Based on a Bias/Variance Error Decomposition

Agapitos, Alexandros; Brabazon, Anthony; O’Neill, Michael

doi:10.1007/978-3-642-32937-1_44

Controlling Overfitting in Symbolic Regression Based on a Bias/Variance Error Decomposition

Alexandros Agapitos²¹,
Anthony Brabazon²¹ &
Michael O’Neill²¹

Conference paper

1961 Accesses
13 Citations

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7491))

Abstract

We consider the fundamental property of generalisation of data-driven models evolved by means of Genetic Programming (GP). The statistical treatment of decomposing the regression error into bias and variance terms provides insight into the generalisation capability of this modelling method. The error decomposition is used as a source of inspiration to design a fitness function that relaxes the sensitivity of an evolved model to a particular training dataset. Results on eight symbolic regression problems show that new method is capable on inducing better-generalising models than standard GP for most of the problems.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Agapitos, A., O’Neill, M., Brabazon, A.: Evolutionary Learning of Technical Trading Rules without Data-Mining Bias. In: Schaefer, R., Cotta, C., Kołodziej, J., Rudolph, G. (eds.) PPSN XI, Part I. LNCS, vol. 6238, pp. 294–303. Springer, Heidelberg (2010)
Chapter Google Scholar
Agapitos, A., O’Neill, M., Brabazon, A., Theodoridis, T.: Maximum Margin Decision Surfaces for Increased Generalisation in Evolutionary Decision Tree Learning. In: Silva, S., Foster, J.A., Nicolau, M., Machado, P., Giacobini, M. (eds.) EuroGP 2011. LNCS, vol. 6621, pp. 61–72. Springer, Heidelberg (2011)
Chapter Google Scholar
Banzhaf, W., Francone, F.D., Nordin, P.: The Effect of Extensive Use of the Mutation Operator on Generalization in Genetic Programming Using Sparse Data Sets. In: Ebeling, W., Rechenberg, I., Voigt, H.-M., Schwefel, H.-P. (eds.) PPSN IV. LNCS, vol. 1141, pp. 300–309. Springer, Heidelberg (1996)
Chapter Google Scholar
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer (2006)
Google Scholar
Castelli, M., Manzoni, L., Silva, S., Vanneschi, L.: A comparison of the generalization ability of different genetic programming frameworks. In: IEEE Congress on Evolutionary Computation (CEC 2010), July 18-23. IEEE Press, Barcelona (2010)
Google Scholar
Efron, B., Tibshirani, R.: An introduction to the bootstrap. Chapman and Hall (1993)
Google Scholar
Keijzer, M.: Improving Symbolic Regression with Interval Arithmetic and Linear Scaling. In: Ryan, C., Soule, T., Keijzer, M., Tsang, E.P.K., Poli, R., Costa, E. (eds.) EuroGP 2003. LNCS, vol. 2610, pp. 70–82. Springer, Heidelberg (2003)
Chapter Google Scholar
Keijzer, M., Babovic, V.: Genetic Programming, Ensemble Methods and the Bias/Variance Tradeoff - Introductory Investigations. In: Poli, R., Banzhaf, W., Langdon, W.B., Miller, J., Nordin, P., Fogarty, T.C. (eds.) EuroGP 2000. LNCS, vol. 1802, pp. 76–90. Springer, Heidelberg (2000)
Chapter Google Scholar
Poli, R., Langdon, W.B., McPhee, N.F.: A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk (2008), http://www.gp-field-guide.org.uk , (With contributions by J. R. Koza)
Theodoridis, T., Agapitos, A., Hu, H.: A gaussian groundplan projection area model for evolving probabilistic classifiers. In: Genetic and Evolutionary Computation Conference, GECCO 2011, July 12-16. ACM, Dublin (2011) (forthcoming)
Google Scholar
Tuite, C., Agapitos, A., O’Neill, M., Brabazon, A.: A Preliminary Investigation of Overfitting in Evolutionary Driven Model Induction: Implications for Financial Modelling. In: Di Chio, C., Brabazon, A., Di Caro, G.A., Drechsler, R., Farooq, M., Grahl, J., Greenfield, G., Prins, C., Romero, J., Squillero, G., Tarantino, E., Tettamanzi, A.G.B., Urquhart, N., Uyar, A.Ş. (eds.) EvoApplications 2011, Part II. LNCS, vol. 6625, pp. 120–130. Springer, Heidelberg (2011)
Chapter Google Scholar
Tuite, C., Agapitos, A., O’Neill, M., Brabazon, A.: Tackling Overfitting in Evolutionary-Driven Financial Model Induction. In: Brabazon, A., O’Neill, M., Maringer, D. (eds.) Natural Computing in Computational Finance. SCI, vol. 380, pp. 141–161. Springer, Heidelberg (2011)
Chapter Google Scholar
Vladislavleva, E.J., Smits, G.F., den Hertog, D.: Order of nonlinearity as a complexity measure for models generated by symbolic regression via pareto genetic programming. IEEE Transactions on Evolutionary Computation 13(2), 333–349 (2009)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Financial Mathematics and Computation Research Cluster, Natural Computing Research and Applications Group, University College Dublin, Ireland
Alexandros Agapitos, Anthony Brabazon & Michael O’Neill

Authors

Alexandros Agapitos
View author publications
You can also search for this author in PubMed Google Scholar
Anthony Brabazon
View author publications
You can also search for this author in PubMed Google Scholar
Michael O’Neill
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Centro de Investigacion y de Estudios, Avanzados del Instituto Politecnico Nacional (CINVESTAV-IPN), Departmento de Computation, Av. IPN No. 2508, Col. San Pedro Zacatenco, 0360, Mexico, D.F., Mexico
Carlos A. Coello Coello
Department of Mathematics and Computer Science, University of Catania, V.le A. Doria 6, 95125, Catania, Italy
Vincenzo Cutello & Mario Pavone &
Kanpur Genetic Algorithms Laboratory (KanGAL), Indian Institute of Technology, Kanpur, Kanpur, India
Kalyanmoy Deb
Department of Computer Science, University of New, Mexico, USA
Stephanie Forrest
Department of Mathematics and Computer Science, University of Catania, Viale A. Doria 6, 95125, Catania, Italy
Giuseppe Nicosia

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Agapitos, A., Brabazon, A., O’Neill, M. (2012). Controlling Overfitting in Symbolic Regression Based on a Bias/Variance Error Decomposition. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds) Parallel Problem Solving from Nature - PPSN XII. PPSN 2012. Lecture Notes in Computer Science, vol 7491. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32937-1_44

Download citation

DOI: https://doi.org/10.1007/978-3-642-32937-1_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32936-4
Online ISBN: 978-3-642-32937-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics