ABSTRACT
The ability to generalize beyond the training set is important for Genetic Programming (GP). Interleaved Sampling is a recently proposed approach to improve generalization in GP. In this technique, GP alternates between using the entire data set and only a single data point. Initial results showed that the technique not only produces solutions that generalize well, but also does so at reduced computational expense, as half of the generations evaluate only a single data point.
This paper further investigates the merit of interleaving the use of the training set with two alternative approaches: the use of random search instead of a single data point, and simply minimising the tree size. Both alternatives are computationally even cheaper than the original setup, as they do not invoke the fitness function half of the time. We test the utility of these new methods on four well-cited, high-dimensional problems from the symbolic regression domain.
The results show that the new approaches continue to produce general solutions despite using only half the fitness evaluations. Size minimisation also prevents bloat while producing competitive results on both training and test data sets. The tree sizes with size minimisation are substantially smaller than those of the other setups, which further reduces training costs.
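The three interleaving schedules described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `mode` strings, the use of RMSE as the error measure, the strict even/odd alternation, and the random choice of the single data point are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]      # (inputs, target)
Model = Callable[[Sequence[float]], float]  # stands in for an evolved GP tree


def rmse(model: Model, data: List[Sample]) -> float:
    """Root mean squared error of the model on the given samples."""
    return (sum((model(x) - y) ** 2 for x, y in data) / len(data)) ** 0.5


def interleaved_fitness(model: Model, tree_size: int, data: List[Sample],
                        generation: int, mode: str,
                        rng: random.Random) -> float:
    """Return a score to minimise for one individual in one generation.

    Assumption: even generations use the full training set and odd
    generations use the cheaper alternative selected by `mode`.
    """
    if generation % 2 == 0:
        return rmse(model, data)              # full training set
    if mode == "single_point":
        return rmse(model, [rng.choice(data)])  # original interleaved sampling
    if mode == "random":
        return rng.random()                   # random search: no fitness call
    if mode == "size":
        return float(tree_size)               # size minimisation: no fitness call
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical usage with a hand-written model in place of an evolved tree.
data = [((1.0,), 2.0), ((2.0,), 4.0)]
model = lambda x: 2.0 * x[0]
full_score = interleaved_fitness(model, 3, data, 0, "size", random.Random(0))
size_score = interleaved_fitness(model, 3, data, 1, "size", random.Random(0))
```

In the `random` and `size` modes, odd generations never touch the training data, which is the source of the halved number of fitness evaluations that the abstract reports.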
- I. Goncalves and S. Silva. Balancing learning and overfitting in genetic programming with interleaved sampling of training data. In K. Krawiec, A. Moraglio, T. Hu, A. S. Uyar, and B. Hu, editors, Proceedings of the 16th European Conference on Genetic Programming, EuroGP 2013, volume 7831 of LNCS, pages 73--84, Vienna, Austria, 3-5 Apr. 2013. Springer Verlag.
Index Terms
- Efficient interleaved sampling of training data in genetic programming