Choosing function sets with better generalisation performance for symbolic regression models
Created by W.Langdon from
gp-bibliography.bib Revision:1.7913
@Article{Nicolau:GPEM:funcset,
author = "Miguel Nicolau and Alexandros Agapitos",
title = "Choosing function sets with better generalisation
performance for symbolic regression models",
journal = "Genetic Programming and Evolvable Machines",
year = "2021",
volume = "22",
number = "1",
pages = "73--100",
month = mar,
keywords = "genetic algorithms, genetic programming, Symbolic
regression, Machine learning, Generalisation,
Overfitting, Data-driven modelling",
ISSN = "1389-2576",
DOI = "10.1007/s10710-020-09391-4",
abstract = "Supervised learning by means of Genetic Programming
(GP) aims at the evolutionary synthesis of a model that
achieves a balance between approximating the target
function on the training data and generalising on new
data. The model space searched by the Evolutionary
Algorithm is populated by compositions of primitive
functions defined in a function set. Since the target
function is unknown, the choice of function set's
constituent elements is primarily guided by the makeup
of function sets traditionally used in the GP
literature. Our work builds upon previous research of
the effects of protected arithmetic operators (i.e.
division, logarithm, power) on the output value of an
evolved model for input data points not encountered
during training. The scope is to benchmark the
approximation/generalisation of models evolved using
different function set choices across a range of 43
symbolic regression problems. The salient outcomes are
as follows. Firstly, Koza's protected operators of
division and exponentiation have a detrimental effect
on generalisation, and should therefore be avoided.
This result is invariant of the use of moderately sized
validation sets for model selection. Secondly, the
performance of the recently introduced analytic
quotient operator is comparable to that of the
sinusoidal operator on average, with their combination
being advantageous to both approximation and
generalisation. These findings are consistent across
two different system implementations, those of standard
expression-tree GP and linear Grammatical Evolution. We
highlight that this study employed very large test
sets, which create confidence when benchmarking the
effect of different combinations of primitive functions
on model generalisation. Our aim is to encourage GP
researchers and practitioners to use similar stringent
means of assessing generalisation of evolved models
where possible, and also to avoid certain primitive
functions that are known to be inappropriate.",
notes = "College of Business, University College Dublin,
Dublin, Ireland",
}