Genetic Programming, Validation Sets, and Parsimony Pressure

  • Conference paper
Genetic Programming (EuroGP 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3905)

Abstract

Fitness functions based on test cases are very common in Genetic Programming (GP). Evaluating fitness this way can be likened to a learning task, in which models are inferred from a limited number of samples. This paper investigates two methods for improving generalization in GP-based learning: 1) selecting the best-of-run individuals using a three-data-set methodology, and 2) applying parsimony pressure to reduce the complexity of the solutions. Results using GP in a binary classification setup show that accuracy on the test sets is preserved, with lower variance than the baseline results, while the mean tree size obtained with the tested methods is significantly reduced.
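The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no split ratios, fitness measure, or parsimony scheme, so everything here is an assumption made for illustration. The names `Individual`, `split_three_ways`, `parsimony_key`, and `select_best_of_run` are hypothetical, the 50/25/25 split is arbitrary, and parsimony pressure is shown in one common lexicographic form (compare accuracy first, let the smaller tree win ties), which may differ from the scheme evaluated in the paper.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, binary class label)


@dataclass(frozen=True)
class Individual:
    """Hypothetical stand-in for an evolved GP tree."""
    name: str
    size: int                       # number of nodes in the tree
    predict: Callable[[float], int]


def accuracy(ind: Individual, data: Sequence[Sample]) -> float:
    """Fraction of samples that the individual classifies correctly."""
    return sum(ind.predict(x) == y for x, y in data) / len(data)


def split_three_ways(samples: Sequence[Sample],
                     fractions: Tuple[float, float] = (0.5, 0.25),
                     seed: int = 0):
    """Shuffle and split into training, validation and test sets.

    The 50/25/25 default is an arbitrary choice for this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_valid = int(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def parsimony_key(ind: Individual, data: Sequence[Sample]):
    """Lexicographic ordering: higher accuracy first, then smaller size."""
    return (accuracy(ind, data), -ind.size)


def select_best_of_run(candidates: List[Individual],
                       validation: Sequence[Sample]) -> Individual:
    """Pick the best-of-run individual on the validation set.

    The training set would drive evolution and the test set would only
    be used to report the final accuracy of the returned individual.
    """
    return max(candidates, key=lambda ind: parsimony_key(ind, validation))
```

In this sketch the validation set is never seen during evolution, so choosing the best-of-run individual on it guards against picking a tree that merely memorised the training samples, and the size tie-break favours the simpler of two equally accurate trees.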

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gagné, C., Schoenauer, M., Parizeau, M., Tomassini, M. (2006). Genetic Programming, Validation Sets, and Parsimony Pressure. In: Collet, P., Tomassini, M., Ebner, M., Gustafson, S., Ekárt, A. (eds) Genetic Programming. EuroGP 2006. Lecture Notes in Computer Science, vol 3905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11729976_10

  • DOI: https://doi.org/10.1007/11729976_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33143-8

  • Online ISBN: 978-3-540-33144-5

  • eBook Packages: Computer Science, Computer Science (R0)
