Coevolving functions in genetic programming
Introduction
The use of function sub-routines is ubiquitous in high-level computer programming languages. Functions provide the ability to reuse code efficiently, giving programs a modular and hierarchical structure. Since its formalization, genetic programming (GP) [8] has included the ability to exploit functional sub-routines, termed “automatically defined functions” (ADFs) [8, p. 534], whereby modular functions can form part of the genetic make-up of an evolving program.
In this paper we present a coevolutionary approach to the use of ADFs in genetic programming. Here each identified ADF is assigned its own independent sub-population which coevolves with other ADF sub-populations and a population of main program trees or “result-producing branches” (RPBs). For each evaluation, ADFs from each sub-population are selected randomly to be used by a program, where fitness can be assigned globally or locally. In the global case all ADFs and the main tree receive the same fitness. In the local case the main tree receives the global fitness, but each ADF receives the fitness available for that aspect of the task with which it is concerned. In either case selection and reproduction are done independently within each sub-population. In this paper we show that the coevolutionary approach performs better than traditional GP and traditional GP with ADFs on two well-known classification tasks.
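The evaluation scheme described above can be sketched roughly as follows for the global fitness case. This is a minimal illustration, not the authors' implementation: the function name `evaluate_coevolved`, the caller-supplied `fitness_fn`, and in particular the choice to average an ADF's scores when it is sampled more than once are all assumptions made here for the example.

```python
import random


def evaluate_coevolved(rpbs, adf_subpops, fitness_fn, rng):
    """Pair each main program (RPB) with one randomly chosen ADF from
    every sub-population and share the resulting fitness (global case).

    rpbs        -- list of main program individuals
    adf_subpops -- list of ADF sub-populations (one list per ADF slot)
    fitness_fn  -- hypothetical callable: fitness_fn(rpb, adfs) -> number
    rng         -- a random.Random instance
    """
    rpb_fitness = []
    # An ADF may be picked several times (or never), so collect its scores.
    adf_scores = [[[] for _ in pop] for pop in adf_subpops]

    for rpb in rpbs:
        # Choose one ADF index per sub-population, uniformly at random.
        picks = [rng.randrange(len(pop)) for pop in adf_subpops]
        adfs = [pop[i] for pop, i in zip(adf_subpops, picks)]
        score = fitness_fn(rpb, adfs)
        rpb_fitness.append(score)
        # Global credit assignment: every participating ADF gets the same score.
        for slot, i in enumerate(picks):
            adf_scores[slot][i].append(score)

    # Assumption: average repeated samples; ADFs never sampled get None.
    adf_fitness = [
        [sum(s) / len(s) if s else None for s in pop_scores]
        for pop_scores in adf_scores
    ]
    return rpb_fitness, adf_fitness
```

Selection and reproduction would then run separately on `rpbs` and on each entry of `adf_subpops`, using the fitness lists returned here. The local case would differ only in the score appended for each ADF, which would come from the part of the task that ADF addresses.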
We then introduce a version of the approach termed evolution defined functions (EDFs) which uses the coevolutionary strategy in conjunction with the two mutation operators, compression and expansion, of the genetic library builder (GLiB) [2]. We use the same two classification tasks to show that the automatic specification of EDF sub-populations via compression is beneficial when the existence of a particular function is determined by a measure of its worth/recent usage. We then extend the approach further to allow any number of functions to be created during evolution, rather than having an a priori fixed number of EDFs, again using the measure of existing function worth. It is again shown that improvements can be achieved over the previous approaches.
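To make the two operators concrete, here is a small sketch of compression and expansion on program trees stored as nested tuples of the form `(operator, child, ...)`. It is an illustration under simplifying assumptions rather than the GLiB algorithm itself: the tree encoding, the `module_N` naming, and the helper names are hypothetical, compressed modules take no parameters, and `expand` restores every module reference instead of a single chosen one.

```python
import random


def subtree_paths(tree, path=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtree_paths(child, path + (i,))


def get_at(tree, path):
    """Return the subtree found by following the child indices in path."""
    for i in path:
        tree = tree[i]
    return tree


def replace_at(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]


def compress(tree, library, rng):
    """Move a random non-root function subtree into the library and
    leave a zero-argument module reference in its place."""
    candidates = [
        p for p in subtree_paths(tree)
        if p and isinstance(get_at(tree, p), tuple)
    ]
    if not candidates:
        return tree  # nothing to compress
    path = rng.choice(candidates)
    name = f"module_{len(library)}"  # hypothetical naming scheme
    library[name] = get_at(tree, path)
    return replace_at(tree, path, (name,))


def expand(tree, library):
    """Replace every module reference with its stored definition."""
    if not isinstance(tree, tuple):
        return tree  # terminal: variable name or constant
    if tree[0] in library:
        return expand(library[tree[0]], library)
    return (tree[0],) + tuple(expand(child, library) for child in tree[1:])
```

For example, compressing `("and", ("gt", "x0", 1), ("lt", "x1", 2))` moves one of the two comparison subtrees into `library`, and expanding the result recovers the original tree. A usage counter per library entry, as described in the text, could be kept alongside each definition.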
Finally, we introduce a version of our coevolutionary technique specifically for classification tasks, based on the work of [10], in which ADFs are feature preprocessors/extractors for a classification algorithm.
The paper is arranged as follows: the following section details the basic ADF strategy of GP and our new approach. Section 3 describes the problems used to do the comparisons and Section 4 presents results from using our coevolutionary approach. Section 5 introduces the EDF approach with dynamic function creation and Section 6 presents the results of its use. Section 7 shows the results obtained from allowing the number of EDFs available to evolve over time and Section 8 presents a version of the coevolutionary approach designed specifically for classification tasks. Finally, all findings are discussed.
Section snippets
Automatically defined functions
Koza [8] presented ADFs as a refinement to genetic programming with the aim of enabling the composite evolution of larger programs. Here each identified ADF is genetically joined to the main program tree such that a child's ADFs are a mix of its parents'; each joined ADF recombines with the corresponding ADF of the other parent. Whenever a call to a particular type of ADF is made, the joined example individual is used. The number of ADFs available during evolution is fixed a priori and the ADFs
The classification tasks
In this paper we use two well-known classification problems to compare the different strategies (see [4] for an example of the comparative performance and potential benefits of GP for classification tasks).
Australian credit card
For this problem we use populations of 1410 individuals for the traditional approaches. In the COADF approach each population therefore contains 470 individuals (1410/3), as there are two function sub-populations and one population of main programs.
It has been found that the best results are obtained when crossover (for each population) is performed on a small percentage (20%) of the populations. A mutation rate of 0.02 and roulette-wheel selection are used.
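The settings above could be wired together along the following lines. This is only a sketch: the text does not say exactly how the 20% crossover figure or the 0.02 mutation rate are applied, so treating the former as a per-offspring probability and passing the latter through to a caller-supplied `mutate` function are assumptions, as are the names `roulette_select`, `next_generation`, `crossover`, and `mutate`. Fitness values are assumed to be non-negative.

```python
import random

CROSSOVER_FRACTION = 0.2  # crossover applied to a small share of offspring
MUTATION_RATE = 0.02      # passed through to the mutation operator


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)  # degenerate case: pick uniformly
    pick = rng.random() * total        # a point on the wheel, in [0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]              # guard against rounding error


def next_generation(population, fitnesses, crossover, mutate, rng):
    """Build one new population of the same size.

    crossover(a, b, rng) and mutate(individual, rate, rng) are
    hypothetical operators supplied by the caller.
    """
    offspring = []
    while len(offspring) < len(population):
        child = roulette_select(population, fitnesses, rng)
        if rng.random() < CROSSOVER_FRACTION:
            mate = roulette_select(population, fitnesses, rng)
            child = crossover(child, mate, rng)
        offspring.append(mutate(child, MUTATION_RATE, rng))
    return offspring
```

In the coevolutionary setting, `next_generation` would be called once per population (the main programs and each function sub-population), since selection and reproduction are carried out independently within each.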
The results shown for this task (always) the
Dynamic function creation
Angeline and Pollack [2] have demonstrated the use of a GLiB to alter the structure of the genotype. The GLiB uses two novel mutation operators (“compression” and “expansion”) to compress and expand modules (sub-trees) during evolution. A randomly selected sub-tree from the genotype is compressed as a module and added to the GLiB. The library keeps the definition of a module and the usage of the module by the subsequent generations which indicates its worth. The idea here is that compression
Australian credit card
We use the same parameters as in Section 4.1, with the new sub-trees taken from the main program mutated with probability 0.5 per node. The usage counters are checked (compression/expansion) at the end of every five generations here.
We also compare the performance of the standard GLiB approach to the use of EDFs in this section. We implement the GLiB mechanism such that compression/expansion events occur at the same frequency as in the EDF system, with the same conditions for
Fully dynamic coevolutionary functions
In the previous section it was found that, for the two classification tasks, more functions were defined a priori than were needed to solve the tasks with EDFs: two functions existed where only one proved beneficial to the search process in the Australian credit card task, and three were defined where only two proved significantly useful in the letter recognition task. For more complex tasks, specifying the correct number of functions is potentially both more difficult and more critical. To cope with this
Coevolving functions specifically for classification tasks
So far we have used two classification tasks to demonstrate our new approach to the use of functions in genetic programming; hierarchical, modular classification programs have been coevolved. This technique can of course be used in genetic programming for many types of task. In this section we introduce a final version of our coevolutionary approach, based on the work of Raymer et al. [10], specifically for classification tasks.
Raymer et al. [10] have presented an alternative approach to the
Conclusions
In this paper we have presented a coevolutionary approach to the use of ADFs in genetic programming. Here each identified ADF is assigned its own independent sub-population which coevolves with other ADF sub-populations and a population of main program trees or “result-producing branches” (RPBs). For each evaluation, random ADFs from each sub-population are selected to be used by a program (see [1] for a comparison of different ADF selection strategies). This is in contrast to Koza's original
Manu Ahluwalia received the Ph.D. degree in Computer Science from the University of the West of England, U.K., in 2000, the subject being the use of genetic programming with functional decomposition for data mining. He is currently a Research Scientist for Applied Predictive Technologies, Inc.
References (16)
- M. Ahluwalia, L. Bull, T.C. Fogarty, Coevolving functions in genetic programming: a comparison in ADF selection...
- et al., Coevolving high-level representations
- et al., Evolutionary computing in multi-agent environments: speciation and symbiogenesis
- A.E. Eiben, T.J. Euverman, W. Kowalczyk, F. Slisser, Modelling customer retention with statistical techniques, rough...
- et al., Letter recognition using Holland-style adaptive classifier systems, Machine Learning (1991)
- P. Husbands, F. Mill, Simulated coevolution as the mechanism for emergent planning and scheduling, in: R.L. Belew, L.B....
Larry Bull received the B.Sc. and Ph.D. degrees in Computer Science from the University of the West of England, U.K., in 1992 and 1995 respectively. He is currently a Senior Research Fellow at the University and award leader for an M.Sc. Machine Learning & Adaptive Computing.