ABSTRACT
Covariant parsimony pressure is a theoretically motivated method aimed primarily at controlling bloat. In this contribution we describe an adaptive method for controlling covariant parsimony pressure that is aimed at reducing overfitting in symbolic regression. The method is based on the assumption that overfitting can be reduced by controlling the evolution of program length. Additionally, we propose an overfitting detection criterion based on the correlation between the fitness values of all models in the population on the training set and on a validation set.
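The detection criterion can be sketched as follows: rank the population's models by training fitness and by validation fitness, and compute the Spearman rank correlation between the two; a low correlation suggests that models which fit the training data well no longer generalize. This is a minimal sketch under assumed details — the `threshold` value and the helper names are illustrative, not taken from the paper.

```python
from statistics import mean

def ranks(values):
    """1-based ranks with average ranks for ties (standard Spearman treatment)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Group consecutive equal values and assign them their average rank.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg_rank
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def overfitting_detected(train_fitness, valid_fitness, threshold=0.5):
    # Low rank correlation between training and validation fitness across
    # the population is taken as a sign of overfitting. The threshold of
    # 0.5 is an assumed, tunable value.
    return spearman(train_fitness, valid_fitness) < threshold
```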
The proposed method uses covariant parsimony pressure to decrease the average program length when overfitting occurs and allows the average program length to increase in the absence of overfitting. The approach is applied to two real-world datasets. The experimental results show that the correlation of training and validation fitness can serve as an indicator of overfitting, and that the proposed adaptation of covariant parsimony pressure alleviates overfitting in symbolic regression experiments on both datasets.
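The underlying mechanism, following Poli and McPhee's covariant parsimony pressure, subtracts c(t)·l from each program's fitness, where c(t) = Cov(l, f)/Var(l) over the current population; this choice removes the correlation between length and selection probability and so holds the expected mean length constant. The sketch below shows the coefficient plus one plausible adaptation rule — amplifying the pressure when overfitting is detected and dropping it otherwise. The amplification factor `gamma` and the exact on/off rule are illustrative assumptions, not the paper's precise update.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def covariant_coefficient(lengths, fitnesses):
    # c(t) = Cov(l, f) / Var(l): subtracting c(t) * l from each fitness
    # decorrelates program length from selection probability, so the
    # expected mean length of the next generation stays constant.
    var_l = covariance(lengths, lengths)
    if var_l == 0:
        return 0.0
    return covariance(lengths, fitnesses) / var_l

def adjusted_fitnesses(lengths, fitnesses, overfitting, gamma=1.5):
    # Hypothetical adaptation rule: amplify the pressure (gamma > 1) to
    # push the average length down when overfitting is detected, and
    # remove it entirely (c = 0) to let programs grow otherwise.
    c = covariant_coefficient(lengths, fitnesses)
    c = gamma * c if overfitting else 0.0
    return [f - c * l for f, l in zip(fitnesses, lengths)]
```

With `gamma = 1` the rule would merely freeze mean length; values above 1 over-penalize long programs, which is what drives the average length down under detected overfitting.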
Index Terms
- Overfitting detection and adaptive covariant parsimony pressure for symbolic regression