Abstract
This research examines the causes of code growth (bloat) in genetic programming (GP). Three causes are currently hypothesized: protection, drift, and removal bias. We show that single-node mutations increase code growth in evolving programs, which is strong evidence for the protection hypothesis. We also show a negative correlation between the size of the branch removed during crossover and the resulting change in fitness, but a much weaker correlation for added branches. These results support the removal bias hypothesis but seem to refute the drift hypothesis. Our results also suggest that the tree-structured programs commonly evolved with GP have serious disadvantages, because the nodes near the root are effectively fixed in the very early generations.
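The crossover measurement mentioned in the abstract (branch size removed or added versus change in fitness) could be instrumented roughly as follows. This is a minimal sketch and not the authors' code: the tuple-based tree representation and the names `size`, `subtrees`, `replace`, `crossover`, and `pearson` are all assumptions made for illustration.

```python
import random

# Assumed representation: a program tree is a nested tuple
# (operator, child, child, ...) or a terminal given as a string.

def size(tree):
    """Return the number of nodes in the tree."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(parent, donor, rng):
    """Subtree crossover that also reports the removed and added branch sizes."""
    path, removed = rng.choice(list(subtrees(parent)))
    _, added = rng.choice(list(subtrees(donor)))
    return replace(parent, path, added), size(removed), size(added)

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5
```

In use, one would call `crossover` many times with a seeded `random.Random`, record the removed size, the added size, and the child's fitness minus the parent's fitness under a problem-specific fitness function (not shown here), and then pass each size list together with the fitness changes to `pearson`.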
Cite this article
Soule, T., Heckendorn, R.B. An Analysis of the Causes of Code Growth in Genetic Programming. Genetic Programming and Evolvable Machines 3, 283–309 (2002). https://doi.org/10.1023/A:1020115409250
DOI: https://doi.org/10.1023/A:1020115409250