Evolving simple and accurate symbolic regression models via asynchronous parallel computing
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@Article{Sambo:2021:ASC,
author = "Aliyu Sani Sambo and R. Muhammad Atif Azad and
Yevgeniya Kovalchuk and
Vivek Padmanaabhan Indramohan and Hanifa Shah",
title = "Evolving simple and accurate symbolic regression
models via asynchronous parallel computing",
journal = "Applied Soft Computing",
year = "2021",
volume = "104",
pages = "107198",
month = jun,
keywords = "genetic algorithms, genetic programming, Model
complexity, Parallel computing, Evaluation time",
ISSN = "1568-4946",
URL = "https://www.sciencedirect.com/science/article/pii/S1568494621001216",
DOI = "doi:10.1016/j.asoc.2021.107198",
size = "15 pages",
abstract = "In machine learning, reducing the complexity of a
model can help to improve its computational efficiency
and avoid overfitting. In genetic programming (GP), the
model complexity reduction is often achieved by
reducing the size of evolved expressions. However,
previous studies have demonstrated that the expression
size reduction does not necessarily prevent model
overfitting. Therefore, we use the evaluation time, the
computational time required to evaluate a GP model on
data, as the estimate of model complexity. The
evaluation time depends not only on the size of evolved
expressions but also their composition, thus acting as
a more nuanced measure of model complexity than the
expression size alone. To discourage complexity, this
study employs a novel method called asynchronous
parallel GP (APGP) that introduces a race condition in
the evolutionary process of GP; the race offers an
evolutionary advantage to the simple solutions when
their accuracy is competitive. To evaluate the proposed
method, it is compared to the standard GP (GP) and GP
with bloat control (GP+BC) methods on six challenging
symbolic regression problems. APGP produced models that
are significantly more accurate (on 6/6 problems) than
those produced by both GP and GP+BC. In terms of
complexity control, APGP prevailed over GP but not over
GP+BC; however, GP+BC produced simpler solutions at the
cost of test-set accuracy. Moreover, APGP took a
significantly lower number of evaluations than both GP
and GP+BC to meet a target training fitness in all
tests. Our analysis of the proposed APGP also involved:
(1) an ablation study that separated the proposed
measure of complexity from the race condition in APGP
and (2) the study of an initialisation scheme that
encourages functional diversity in the initial
population that improved the results for all the GP
methods. These results question the overall benefits of
bloat control and endorse the employment of both the
evaluation time as an estimate of model complexity and
the proposed APGP method for controlling it.",
notes = "also known as \cite{SAMBO2021107198}",
}
Genetic Programming entries for
Aliyu Sani Sambo
R Muhammad Atif Azad
Yevgeniya Kovalchuk
Vivek Padmanaabhan Indramohan
Hanifa Shah