Skip to main content
Log in

Genetic programming with one-point crossover and subtree mutation for effective problem solving and bloat control

  • Original Paper
  • Published:
Soft Computing Aims and scope Submit manuscript

Abstract

Genetic programming (GP) is one of the most widely used paradigms of evolutionary computation due to its ability to automatically synthesize computer programs and mathematical expressions. However, because GP uses a variable length representation, the individuals within the evolving population tend to grow rapidly without a corresponding return in fitness improvement, a phenomenon known as bloat. In this paper, we present a simple bloat control strategy for standard tree-based GP that achieves a one order of magnitude reduction in bloat when compared with standard GP on benchmark tests, and practically eliminates bloat on two real-world problems. Our proposal is to substitute standard subtree crossover with the one-point crossover (OPX) developed by Poli and Langdon (Second online world conference on soft computing in engineering design and manufacturing, Springer, Berlin (1997)), while maintaining all other GP aspects standard, particularly subtree mutation. OPX was proposed for theoretical purposes related to GP schema theorems, however since it curtails exploration during the search it has never achieved widespread use. In our results, on the other hand, we are able to show that OPX can indeed perform an effective search if it is coupled with subtree mutation, thus combining the bloat control capabilities of OPX with the exploration provided by standard mutation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11

Similar content being viewed by others

Notes

  1. The word \(computations\) refers to a finite amount of machine instructions or CPU cycles, whichever is sufficient for the purposes of our argument.

  2. Our intention is to overestimate the total number of comparisons required at each generation.

References

  • Alfaro-Cid E, Merelo J, de Vega FF, Esparcia-Alcázar A, Sharman K (2009) Bloat control operators and diversity in genetic programming: acomparative study. Evol Comput 18(2):305

    Article  Google Scholar 

  • Angeline PJ (1997) Comparing subtree crossover with macromutation. In: EP ’97: Proceedings of the 6th international conference on evolutionary programming VI, Springer, London, pp 101–112

  • Archetti F, Lanzeni S, Messina E, Vanneschi L (2006) Genetic programming for human oral bioavailability of drugs. In: GECCO ’06: Proceedings of the 8th annual conference on genetic and evolutionary computation, ACM, New York, pp 255–262

  • Blickle T, Thiele L (1994) Genetic programming and redundancy. In: Hopf J (ed) Genetic algorithms within the framework of evolutionary computation (Workshop at KI-94), Saarbrucken, pp 33–38

  • De Jong K (2001) Evolutionary computation: a unified approach. The MIT Press, Cambridge

  • Dignum S, Poli R (2007) Generalisation of the limiting distribution of program sizes in tree-based genetic programming and analysis of its effects on bloat. In: GECCO ’07: Proceedings of the 9th annual conference on genetic and evolutionary computation, ACM, New York, pp 1588–1595

  • Dignum S, Poli R (2008a) Crossover, sampling, bloat and the harmful effects of size limits. In: O’Neill M, Vanneschi L, Gustafson S, Esparcia-Alcázar A, Falco ID, Cioppa AD, Tarantino E (eds) Genetic programming. In: Proceedings of 11th European conference, EuroGP 2008, Naples, Italy, March 26–28, 2008, vol 4971. Lecture Notes in Computer Science, Springer, Berlin, pp 158–169

  • Dignum S, Poli R (2008b) Operator equalisation and bloat free gp. In: EuroGP’08: Proceedings of the 11th European conference on genetic programming, Springer, Berlin, pp 110–121

  • Ekárt A, Németh SZ (2001) Selection based on the pareto nondomination criterion for controlling code growth in genetic programming. Genetic Program Evol Mach 2(1):61–73

    Article  MATH  Google Scholar 

  • Fernández F, Martín A (2004) Saving effort in parallel gp by means of plagues. Lecture Notes in Computer Science, vol 3003. Springer, Berlin, pp 269–278

  • Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: GECCO ’09: Proceedings of the 11th Annual conference on genetic and evolutionary computation, ACM, New York, pp 947–954

  • García S, Fernández A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13(10):959–977

    Article  Google Scholar 

  • Keijzer M, Babovic V (2002) Declarative and preferential bias in gp-based scientific discovery. Genetic Program Evol Mach 3(1):41–79

    Article  MATH  Google Scholar 

  • Kinnear K (1993) Evolving a sort: Lessons in genetic programming. In: Proceedings of the 1993 international conference on neural networks, vol 2. IEEE Press, Piscataway, pp 881–888

  • Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge

    MATH  Google Scholar 

  • Koza JR, Keane MA, Yu J, Forrest H, Bennett I, Mydlowec W (2000) Automatic creation of human-competitive programs and controllers by means of genetic programming. Genetic Program Evol Mach 1(1–2):121–164

    Article  MATH  Google Scholar 

  • Langdon WB (2000) Size fair and homologous tree crossovers for tree genetic programming. Genetic Program Evol Mach 1(1–2):95–119

    Article  MATH  Google Scholar 

  • Langdon WB, Poli R (2002) Foundations of genetic programming. Springer, New York

    MATH  Google Scholar 

  • Luke S (2003) Modification point depth and genome growth in genetic programming. Evol Comput 11(1):67–106

    Article  Google Scholar 

  • Luke S, Panait L (2002) Lexicographic parsimony pressure. In: GECCO ’02: Proceedings of the genetic and evolutionary computation conference, Morgan Kaufmann Publishers Inc., San Francisco, pp 829–836

  • Luke S, Panait L (2006) A comparison of bloat control methods for genetic programming. Evol Comput 14(3):309–344

    Article  Google Scholar 

  • Poli R (2003) A simple but theoretically-motivated method to control bloat in genetic programming. In: Ryan C, Soule T, Keijzer M, Tsang EPK, Poli R, Costa E (eds) Genetic programming. In: Proceedings of the 6th European conference, EuroGP 2003, Essex, UK, April 14–16, 2003, vol 2610. Lecture Notes in Computer Science, Springer, Berlin, pp 204–217

  • Poli R, Langdon WB (1997) Genetic programming with one-point crossover. In: Chawdhry PK, Roy R, Pant RK (eds) Second online world conference on soft computing in engineering design and manufacturing, Springer, Berlin

  • Poli R, McPhee NF (2003a) General schema theory for genetic programming with subtree-swapping crossover: Part i. Evol Comput 11(1):53–66

    Article  Google Scholar 

  • Poli R, McPhee NF (2003b) General schema theory for genetic programming with subtree-swapping crossover: Part ii. Evol Comput 11(2):169–206

    Article  Google Scholar 

  • Poli R, McPhee NF (2008) Parsimony pressure made easy. In: GECCO ’08: Proceedings of the 10th annual conference on genetic and evolutionary computation, ACM, New York, pp 1267–1274

  • Poli R, Langdon WB, McPhee NF (2008a) A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk (with contributions by J. R. Koza)

  • Poli R, McPhee NF, Vanneschi L (2008b) The impact of population size on code growth in gp: analysis and empirical validation. In: GECCO ’08: Proceedings of the 10th annual conference on genetic and evolutionary computation, ACM, New York, pp 1275–1282

  • Silva S, Almeida J (2003) Gplab—a genetic programming toolbox for matlab. In: Gregersen L (ed) Proceedings of the Nordic MATLAB conference, pp 273–278

  • Silva S, Costa E (2009) Dynamic limits for bloat control in genetic programming and a review of past and current bloat theories. Genetic Program Evol Mach 10(2):141–179

    Article  MathSciNet  Google Scholar 

  • Silva S, Dignum S (2009) Extending operator equalisation: fitness based self adaptive length distribution for bloat free gp. In: EuroGP ’09: Proceedings of the 12th European conference on genetic programming. Springer, Berlin, pp 159–170

  • Soule T, Foster JA (1998) Removal bias: a new cause of code growth in tree based evolutionary programming. In: ICEC 98: IEEE international conference on evolutionary computation 1998, IEEE Press, pp 781–786

  • Spector L, Clark DM, Lindsay I, Barr B, Klein J (2008) Genetic programming for finite algebras. In: GECCO ’08: Proceedings of the 10th annual conference on genetic and evolutionary computation, ACM, New York, pp 1291–1298

  • Tackett W (1994) Recombination, selection, and the genetic constructor of computer programs. PhD thesis, University of Southern California, Department of Electrical Engineering Systems

  • Trujillo L, Olague G (2008) Automated design of image operators that detect interest points. Evol Comput 16(4):483–507

    Article  Google Scholar 

  • Trujillo L, Legrand P, Lévy-Véhel J (2010) The estimation of hölderian regularity using genetic programming. In: GECCO ’10: Proceedings of the 12th annual conference on genetic and evolutionary computation, ACM, New York, pp 861–868

  • Vanneschi L, Castelli M, Silva S (2010) Measuring bloat, overfitting and functional complexity in genetic programming. In: GECCO ’10: Proceedings of the 12th annual conference on genetic and evolutionary computation, ACM, New York, pp 877–884

Download references

Acknowledgments

The author thanks his students from Instituto Tecnológico de Tijuana for their work in code development and experimentation, they are: Maria Lizette Gutierréz, Eduardo Ramírez, Jose Christian Romero, Luis Fernando Gaxiola, and Francisco Arce. Also, thanks are given to Dr. Sara Silva, the developer of GPLAB, for providing such a useful open-source tool for GP experimentation. Finally, special thanks are given to Dr. Pierrick Legrand for his invaluable comments and suggestions.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Leonardo Trujillo.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Trujillo, L. Genetic programming with one-point crossover and subtree mutation for effective problem solving and bloat control. Soft Comput 15, 1551–1567 (2011). https://doi.org/10.1007/s00500-010-0687-7

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-010-0687-7

Keywords

Navigation