abstract = "Symbolic regression methods generate expression trees
that simultaneously define the functional form of a
regression model and the regression parameter values.
As a result, the regression problem can search many
nonlinear functional forms using only the specification
of simple mathematical operators such as addition,
subtraction, multiplication, and division, among
others. Currently, state-of-the-art symbolic regression
methods leverage genetic algorithms and adaptive
programming techniques. Genetic algorithms lack
optimality certifications and are typically stochastic
in nature. In contrast, we propose an optimization
formulation for the rigorous deterministic optimization
of the symbolic regression problem. We present a
mixed-integer nonlinear programming (MINLP) formulation
to solve the symbolic regression problem as well as
several alternative models to eliminate redundancies
and symmetries. We demonstrate this symbolic regression
technique using an array of experiments based upon
literature instances. We then use a set of 24 MINLPs
from symbolic regression to compare the performance of
five local and five global MINLP solvers. Finally, we
use larger instances to demonstrate that a portfolio of
models provides an effective solution mechanism for
problems of the size typically addressed in the
symbolic regression literature.",
notes = "Department of Chemical Engineering, Carnegie Mellon
University, Pittsburgh, PA, USA