The Importance of Being Flat–Studying the Program Length Distributions of Operator Equalisation

Genetic Programming Theory and Practice IX

Part of the book series: Genetic and Evolutionary Computation (GEVO)

Abstract

The recent Crossover Bias theory has shown that bloat in Genetic Programming can be caused by the proliferation of small unfit individuals in the population. Inspired by this theory, Operator Equalisation is the most recent and successful bloat control method available. A recent work attempted to replicate the evolutionary dynamics of Operator Equalisation by joining two key ingredients found in older and newer bloat control methods. However, the resulting dynamics were very different from what was expected, which prompted a further investigation into the reasons that make Operator Equalisation so successful. It was revealed that, at least for complex symbolic regression problems, the distribution of program lengths enforced by Operator Equalisation is nearly flat, in contrast with the peaky and well-delimited distributions of the other approaches. In this work we study the importance of having flat program length distributions for bloat control. We measure the flatness of the distributions found in previous and new Operator Equalisation variants, and we correlate it with the amount of search performed by each approach. We also analyze where this search occurs and how bloat correlates with these properties. We conclude by presenting a possible explanation for the unique behavior of Operator Equalisation.
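The abstract does not say how flatness is measured, so the following is only an illustrative sketch of one possible measure and not the chapter's own method. It assumes program lengths are positive integers and scores a population by the normalised Shannon entropy of its length histogram. The names `length_histogram`, `flatness`, `bin_width` and `max_length` are hypothetical.

```python
import math
from collections import Counter


def length_histogram(lengths, bin_width=1):
    """Count how many programs fall into each length bin (lengths start at 1)."""
    return Counter((n - 1) // bin_width for n in lengths)


def flatness(lengths, bin_width=1, max_length=None):
    """Normalised Shannon entropy of the length histogram, in [0, 1].

    1.0 means every bin up to max_length is equally populated (flat);
    values near 0 mean the population sits in very few bins (peaky).
    Assumption: this is an illustrative measure, not the chapter's.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    if min(lengths) < 1:
        raise ValueError("program lengths must be positive integers")
    if max_length is None:
        max_length = max(lengths)
    elif max(lengths) > max_length:
        raise ValueError("a program is longer than max_length")
    n_bins = (max_length - 1) // bin_width + 1
    if n_bins < 2:
        return 0.0  # with a single bin, flatness is not meaningful
    counts = length_histogram(lengths, bin_width)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return max(0.0, entropy / math.log(n_bins))


# A population spread evenly over lengths 1..10 versus one clustered at length 5.
flat_population = list(range(1, 11))
peaky_population = [5] * 9 + [10]
print(flatness(flat_population), flatness(peaky_population))
```

Under this measure the evenly spread population scores close to 1 and the clustered one scores much lower. Tracking such a score per generation would be one way to relate flatness to the other quantities the abstract mentions, such as the amount of search and bloat.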

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Silva, S., Vanneschi, L. (2011). The Importance of Being Flat–Studying the Program Length Distributions of Operator Equalisation. In: Riolo, R., Vladislavleva, E., Moore, J. (eds) Genetic Programming Theory and Practice IX. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1770-5_12

  • DOI: https://doi.org/10.1007/978-1-4614-1770-5_12

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-1769-9

  • Online ISBN: 978-1-4614-1770-5

  • eBook Packages: Computer Science (R0)
