% Effects of reducing redundant parameters in parameter optimization for
% symbolic regression using genetic programming
% Created by W.Langdon from gp-bibliography.bib Revision:1.8414
@Article{Kronberger:2025:jsc,
  author =       "Gabriel Kronberger and Fabricio {Olivetti de Franca}",
  title =        "Effects of reducing redundant parameters in parameter
                 optimization for symbolic regression using genetic
                 programming",
  journal =      "Journal of Symbolic Computation",
  year =         "2025",
  volume =       "129",
  pages =        "102413",
  keywords =     "genetic algorithms, genetic programming, Symbolic
                 regression, Machine learning, Expression rewriting,
                 Equality saturation, Nonlinear least squares",
  ISSN =         "0747-7171",
  URL =          "https://www.sciencedirect.com/science/article/pii/S0747717124001172",
  DOI =          "10.1016/j.jsc.2024.102413",
  abstract =     "Gradient-based local optimisation has been shown to
                 improve results of genetic programming (GP) for
                 symbolic regression (SR) - a machine learning method
                 for symbolic equation learning. Correspondingly,
                 several state-of-the-art GP implementations use
                 iterative nonlinear least squares (NLS) algorithms for
                 local optimisation of parameters. An issue that has,
                 however, mostly been ignored in the SR and GP
                 literature is overparameterization of SR expressions
                 and, as a consequence, bad conditioning of the NLS
                 optimisation problem. The aim of this study is to
                 analyse the effects of overparameterization on the NLS
                 results and convergence speed, whereby we use Operon
                 as an example GP/SR implementation. First, we
                 demonstrate that numeric rank approximation can be
                 used to detect overparameterization using a set of six
                 selected benchmark problems. In the second part, we
                 analyse whether the NLS results or convergence speed
                 can be improved by simplifying expressions to remove
                 redundant parameters with equality saturation. This
                 analysis is done with the much larger Feynman symbolic
                 regression benchmark set after collecting all
                 expressions visited by GP, as the simplification
                 procedure is not fast enough to use it within GP
                 fitness evaluation. We observe that Operon frequently
                 visits overparameterized solutions, but the number of
                 redundant parameters is small on average. We analysed
                 the Pareto-optimal expressions of the first and last
                 generation of GP, and found that for 70\% to 80\% of
                 the simplified expressions, the success rate of
                 reaching the optimum was better than or equal to that
                 of the overparameterized form. The effect was smaller
                 for the number of NLS iterations until convergence,
                 where we found fewer or equal iterations for 51\% to
                 63\% of the expressions after simplification",
}
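% The abstract's first step, detecting overparameterization by numeric rank
% approximation of the parameter Jacobian, can be sketched as follows. This
% is a minimal illustration under assumed inputs (a toy model, a
% finite-difference Jacobian, and an ad hoc tolerance), not the paper's
% Operon implementation:

```python
import numpy as np

def model(theta, x):
    # Toy overparameterized SR expression: a and b only ever occur as
    # the product a*b, so one of the three parameters is redundant.
    a, b, c = theta
    return a * b * x + c

def numeric_jacobian(f, theta, x, eps=1e-6):
    # Central finite differences; column j holds d f / d theta_j.
    theta = np.asarray(theta, dtype=float)
    cols = []
    for j in range(theta.size):
        tp, tm = theta.copy(), theta.copy()
        tp[j] += eps
        tm[j] -= eps
        cols.append((f(tp, x) - f(tm, x)) / (2.0 * eps))
    return np.column_stack(cols)

x = np.linspace(-1.0, 1.0, 50)
theta = np.array([2.0, 3.0, 0.5])
J = numeric_jacobian(model, theta, x)

# Count singular values above a relative tolerance chosen large enough
# to dominate finite-difference noise; a rank below the parameter count
# flags redundant parameters and an ill-conditioned NLS problem.
s = np.linalg.svd(J, compute_uv=False)
rank = int(np.sum(s > 1e-6 * s[0]))
print(rank, theta.size)  # rank 2 < 3 parameters
```

% Here the Jacobian columns for a and b are both proportional to x, so the
% numeric rank drops to 2 for the 3-parameter expression.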