Abstract
This paper investigates early stopping as a method to counteract overfitting in evolutionary data modelling using Genetic Programming. Early stopping has been proposed as a way to avoid model overtraining, which has been shown to significantly degrade out-of-sample performance. Assuming that some performance metric is being maximised, the most widely used early stopping criterion is the point in the learning process at which an unbiased estimate of the model's performance begins to decrease after a strictly monotonic increase over the earlier learning iterations. We conduct an initial investigation of the effects of early stopping on the performance of Genetic Programming in symbolic regression and financial modelling. Empirical results suggest that early stopping using this criterion improves the extrapolation ability of symbolic regression models, but that it is by no means the optimal training-stopping criterion for a real-world financial dataset.
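The stopping criterion described in the abstract can be illustrated with a minimal sketch. It assumes that a validation score (higher is better) has been recorded once per generation; the function name `early_stopping_generation` and the example scores are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def early_stopping_generation(validation_scores: Sequence[float]) -> int:
    """Return the index of the generation whose model would be kept.

    Assumption: higher scores are better (performance maximisation).
    Training stops at the first generation whose validation score is
    lower than that of the previous generation, and the model from the
    previous generation is kept. If the score never decreases, the
    final generation is returned.
    """
    if not validation_scores:
        raise ValueError("validation_scores must be non-empty")
    for gen in range(1, len(validation_scores)):
        if validation_scores[gen] < validation_scores[gen - 1]:
            return gen - 1  # last generation before the first decrease
    return len(validation_scores) - 1


# Hypothetical validation scores, one per generation (illustrative only).
scores = [0.52, 0.61, 0.67, 0.66, 0.70, 0.64]
stop_at = early_stopping_generation(scores)
print(stop_at, scores[stop_at])
```

In this made-up series the criterion keeps the model from generation 2, although a higher validation score appears later at generation 4. This is only an illustration of how stopping at the first decrease can miss a later improvement; it says nothing about the paper's actual data.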
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tuite, C., Agapitos, A., O’Neill, M., Brabazon, A. (2011). A Preliminary Investigation of Overfitting in Evolutionary Driven Model Induction: Implications for Financial Modelling. In: Di Chio, C., et al. Applications of Evolutionary Computation. EvoApplications 2011. Lecture Notes in Computer Science, vol 6625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20520-0_13
DOI: https://doi.org/10.1007/978-3-642-20520-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20519-4
Online ISBN: 978-3-642-20520-0
eBook Packages: Computer Science (R0)