A Comparison of Fitness-Case Sampling Methods for Symbolic Regression with Genetic Programming
Created by W.Langdon from
gp-bibliography.bib Revision:1.8081
@InProceedings{Martinez:2014:EVOLVE,
author = "Yuliana Martinez and Leonardo Trujillo and
Enrique Naredo and Pierrick Legrand",
title = "A Comparison of Fitness-Case Sampling Methods for
Symbolic Regression with Genetic Programming",
booktitle = "EVOLVE - A Bridge between Probability, Set Oriented
Numerics, and Evolutionary Computation V",
year = "2014",
editor = "Alexandru-Adrian Tantar and Emilia Tantar and
Jian-Qiao Sun and Wei Zhang and Qian Ding and
Oliver Schuetze and Michael Emmerich and Pierrick Legrand and
Pierre {Del Moral} and Carlos A. {Coello Coello}",
volume = "288",
series = "Advances in Intelligent Systems and Computing",
pages = "201--212",
address = "Peking",
month = "1-4 " # jul,
publisher = "Springer",
keywords = "genetic algorithms, genetic programming, Fitness-Case
Sampling, Symbolic Regression, Performance Evaluation",
isbn13 = "978-3-319-07493-1",
DOI = "doi:10.1007/978-3-319-07494-8_14",
abstract = "The canonical approach towards fitness evaluation in
Genetic Programming (GP) is to use a static training
set to determine fitness, based on a cost function
averaged over all fitness-cases. However, motivated by
different goals, researchers have recently proposed
several techniques that focus selective pressure on a
subset of fitness-cases at each generation. These
approaches can be described as fitness-case sampling
techniques, where the training set is sampled, in some
way, to determine fitness. This paper shows a
comprehensive evaluation of some of the most recent
sampling methods, using benchmark and real-world
problems for symbolic regression. The algorithms
considered here are Interleaved Sampling, Random
Interleaved Sampling, Lexicase Selection, and a newly
proposed sampling technique called Keep-Worst
Interleaved Sampling (KW-IS). The algorithms are
extensively evaluated based on test performance,
overfitting and bloat. Results suggest that sampling
techniques can improve performance compared with
standard GP. While on synthetic benchmarks the
difference is slight or none at all, on real-world
problems the differences are substantial. Some of the
best results were achieved by Lexicase Selection and
Keep-Worst Interleaved Sampling. Results also show that
on real-world problems overfitting correlates strongly
with bloating. Furthermore, the sampling techniques
provide efficiency, since they reduce the number of
fitness-case evaluations required over an entire run.",
}
Genetic Programming entries for
Yuliana Martinez
Leonardo Trujillo
Enrique Naredo
Pierrick Legrand