Abstract
Genetic programming (GP) is one of the most widely used paradigms of evolutionary computation due to its ability to automatically synthesize computer programs and mathematical expressions. However, because GP uses a variable-length representation, the individuals within the evolving population tend to grow rapidly without a corresponding improvement in fitness, a phenomenon known as bloat. In this paper, we present a simple bloat control strategy for standard tree-based GP that reduces bloat by one order of magnitude compared with standard GP on benchmark tests, and practically eliminates bloat on two real-world problems. Our proposal is to substitute standard subtree crossover with the one-point crossover (OPX) developed by Poli and Langdon (1997), while keeping all other aspects of GP standard, particularly subtree mutation. OPX was proposed for theoretical purposes related to GP schema theorems; however, since it curtails exploration during the search, it has never achieved widespread use. Our results, on the other hand, show that OPX can indeed perform an effective search if it is coupled with subtree mutation, thus combining the bloat control capabilities of OPX with the exploration provided by standard mutation.
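To make the operator named in the abstract concrete, the following is a minimal sketch of one-point crossover on trees, not the implementation used in the paper. It assumes that a program is a nested tuple `(label, child, ...)`, that the common region of two parents is the set of positions reachable from the root while both nodes have the same number of children, and that the root position is always eligible. The names `common_region`, `get_subtree`, `replace_subtree` and `one_point_crossover` are hypothetical.

```python
import random

# Assumed representation: a tree is a nested tuple (label, child, child, ...);
# a leaf is a one-element tuple such as ('x',).


def common_region(a, b, path=()):
    """List the positions (paths of child indices) shared by trees a and b.

    The current position is always included. Children are visited only
    when both nodes have the same number of children.
    """
    paths = [path]
    if len(a) == len(b):
        for i in range(1, len(a)):
            paths.extend(common_region(a[i], b[i], path + (i,)))
    return paths


def get_subtree(tree, path):
    """Return the subtree found by following the child indices in path."""
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    child = replace_subtree(tree[i], path[1:], new)
    return tree[:i] + (child,) + tree[i + 1:]


def one_point_crossover(a, b, rng=random):
    """Swap the subtrees rooted at one shared position of both parents."""
    path = rng.choice(common_region(a, b))
    sub_a, sub_b = get_subtree(a, path), get_subtree(b, path)
    return replace_subtree(a, path, sub_b), replace_subtree(b, path, sub_a)


if __name__ == "__main__":
    parent_a = ('+', ('x',), ('*', ('x',), ('y',)))
    parent_b = ('-', ('sin', ('x',)), ('1',))
    print(one_point_crossover(parent_a, parent_b, random.Random(0)))
```

Because the same position is used in both parents, neither offspring in this sketch can be deeper than the deeper parent, which gives some intuition for why such an operator limits growth. Subtree mutation, which the abstract pairs with OPX to restore exploration, is not shown here.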
Notes
The word "computations" refers to a finite number of machine instructions or CPU cycles, whichever is sufficient for the purposes of our argument.
Our intention is to overestimate the total number of comparisons required at each generation.
References
Alfaro-Cid E, Merelo J, de Vega FF, Esparcia-Alcázar A, Sharman K (2009) Bloat control operators and diversity in genetic programming: a comparative study. Evol Comput 18(2):305
Angeline PJ (1997) Comparing subtree crossover with macromutation. In: EP ’97: Proceedings of the 6th international conference on evolutionary programming VI, Springer, London, pp 101–112
Archetti F, Lanzeni S, Messina E, Vanneschi L (2006) Genetic programming for human oral bioavailability of drugs. In: GECCO ’06: Proceedings of the 8th annual conference on genetic and evolutionary computation, ACM, New York, pp 255–262
Blickle T, Thiele L (1994) Genetic programming and redundancy. In: Hopf J (ed) Genetic algorithms within the framework of evolutionary computation (Workshop at KI-94), Saarbrücken, pp 33–38
De Jong K (2001) Evolutionary computation: a unified approach. The MIT Press, Cambridge
Dignum S, Poli R (2007) Generalisation of the limiting distribution of program sizes in tree-based genetic programming and analysis of its effects on bloat. In: GECCO ’07: Proceedings of the 9th annual conference on genetic and evolutionary computation, ACM, New York, pp 1588–1595
Dignum S, Poli R (2008a) Crossover, sampling, bloat and the harmful effects of size limits. In: O’Neill M, Vanneschi L, Gustafson S, Esparcia-Alcázar A, Falco ID, Cioppa AD, Tarantino E (eds) Genetic programming: Proceedings of the 11th European conference, EuroGP 2008, Naples, Italy, March 26–28, 2008, vol 4971. Lecture Notes in Computer Science, Springer, Berlin, pp 158–169
Dignum S, Poli R (2008b) Operator equalisation and bloat free GP. In: EuroGP ’08: Proceedings of the 11th European conference on genetic programming, Springer, Berlin, pp 110–121
Ekárt A, Németh SZ (2001) Selection based on the Pareto nondomination criterion for controlling code growth in genetic programming. Genetic Program Evol Mach 2(1):61–73
Fernández F, Martín A (2004) Saving effort in parallel GP by means of plagues. Lecture Notes in Computer Science, vol 3003. Springer, Berlin, pp 269–278
Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: GECCO ’09: Proceedings of the 11th Annual conference on genetic and evolutionary computation, ACM, New York, pp 947–954
García S, Fernández A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13(10):959–977
Keijzer M, Babovic V (2002) Declarative and preferential bias in GP-based scientific discovery. Genetic Program Evol Mach 3(1):41–79
Kinnear K (1993) Evolving a sort: Lessons in genetic programming. In: Proceedings of the 1993 international conference on neural networks, vol 2. IEEE Press, Piscataway, pp 881–888
Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge
Koza JR, Keane MA, Yu J, Bennett FH III, Mydlowec W (2000) Automatic creation of human-competitive programs and controllers by means of genetic programming. Genetic Program Evol Mach 1(1–2):121–164
Langdon WB (2000) Size fair and homologous tree crossovers for tree genetic programming. Genetic Program Evol Mach 1(1–2):95–119
Langdon WB, Poli R (2002) Foundations of genetic programming. Springer, New York
Luke S (2003) Modification point depth and genome growth in genetic programming. Evol Comput 11(1):67–106
Luke S, Panait L (2002) Lexicographic parsimony pressure. In: GECCO ’02: Proceedings of the genetic and evolutionary computation conference, Morgan Kaufmann Publishers Inc., San Francisco, pp 829–836
Luke S, Panait L (2006) A comparison of bloat control methods for genetic programming. Evol Comput 14(3):309–344
Poli R (2003) A simple but theoretically-motivated method to control bloat in genetic programming. In: Ryan C, Soule T, Keijzer M, Tsang EPK, Poli R, Costa E (eds) Genetic programming: Proceedings of the 6th European conference, EuroGP 2003, Essex, UK, April 14–16, 2003, vol 2610. Lecture Notes in Computer Science, Springer, Berlin, pp 204–217
Poli R, Langdon WB (1997) Genetic programming with one-point crossover. In: Chawdhry PK, Roy R, Pant RK (eds) Second online world conference on soft computing in engineering design and manufacturing, Springer, Berlin
Poli R, McPhee NF (2003a) General schema theory for genetic programming with subtree-swapping crossover: Part I. Evol Comput 11(1):53–66
Poli R, McPhee NF (2003b) General schema theory for genetic programming with subtree-swapping crossover: Part II. Evol Comput 11(2):169–206
Poli R, McPhee NF (2008) Parsimony pressure made easy. In: GECCO ’08: Proceedings of the 10th annual conference on genetic and evolutionary computation, ACM, New York, pp 1267–1274
Poli R, Langdon WB, McPhee NF (2008a) A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk (with contributions by J. R. Koza)
Poli R, McPhee NF, Vanneschi L (2008b) The impact of population size on code growth in GP: analysis and empirical validation. In: GECCO ’08: Proceedings of the 10th annual conference on genetic and evolutionary computation, ACM, New York, pp 1275–1282
Silva S, Almeida J (2003) Gplab—a genetic programming toolbox for matlab. In: Gregersen L (ed) Proceedings of the Nordic MATLAB conference, pp 273–278
Silva S, Costa E (2009) Dynamic limits for bloat control in genetic programming and a review of past and current bloat theories. Genetic Program Evol Mach 10(2):141–179
Silva S, Dignum S (2009) Extending operator equalisation: fitness based self adaptive length distribution for bloat free GP. In: EuroGP ’09: Proceedings of the 12th European conference on genetic programming. Springer, Berlin, pp 159–170
Soule T, Foster JA (1998) Removal bias: a new cause of code growth in tree based evolutionary programming. In: ICEC 98: IEEE international conference on evolutionary computation 1998, IEEE Press, pp 781–786
Spector L, Clark DM, Lindsay I, Barr B, Klein J (2008) Genetic programming for finite algebras. In: GECCO ’08: Proceedings of the 10th annual conference on genetic and evolutionary computation, ACM, New York, pp 1291–1298
Tackett W (1994) Recombination, selection, and the genetic constructor of computer programs. PhD thesis, University of Southern California, Department of Electrical Engineering Systems
Trujillo L, Olague G (2008) Automated design of image operators that detect interest points. Evol Comput 16(4):483–507
Trujillo L, Legrand P, Lévy-Véhel J (2010) The estimation of Hölderian regularity using genetic programming. In: GECCO ’10: Proceedings of the 12th annual conference on genetic and evolutionary computation, ACM, New York, pp 861–868
Vanneschi L, Castelli M, Silva S (2010) Measuring bloat, overfitting and functional complexity in genetic programming. In: GECCO ’10: Proceedings of the 12th annual conference on genetic and evolutionary computation, ACM, New York, pp 877–884
Acknowledgments
The author thanks his students from Instituto Tecnológico de Tijuana for their work in code development and experimentation: Maria Lizette Gutierréz, Eduardo Ramírez, Jose Christian Romero, Luis Fernando Gaxiola, and Francisco Arce. Thanks are also given to Dr. Sara Silva, the developer of GPLAB, for providing such a useful open-source tool for GP experimentation. Finally, special thanks are given to Dr. Pierrick Legrand for his invaluable comments and suggestions.
Cite this article
Trujillo, L. Genetic programming with one-point crossover and subtree mutation for effective problem solving and bloat control. Soft Comput 15, 1551–1567 (2011). https://doi.org/10.1007/s00500-010-0687-7