abstract = "The estimation of problem difficulty is an open issue
in Genetic Programming (GP). The goal of this work is to
generate models that predict the expected performance
of a GP-based classifier when it is applied to an
unseen task. Classification problems are described
using domain-specific features, some of which are
proposed in this work, and these features are given as
input to the predictive models. These models are
referred to as predictors of expected performance
(PEPs). We extend this approach by using an ensemble of
specialized predictors (SPEP), dividing classification
problems into groups and choosing the corresponding
SPEP. The proposed predictors are trained using 2D
synthetic classification problems with balanced
datasets. The models are then used to predict the
performance of the GP classifier on unseen real-world
datasets that are multidimensional and imbalanced. This
work is the first to provide a performance prediction
of a GP system on test data, while previous works
focused on predicting training performance. Accurate
predictive models are generated by posing a symbolic
regression task and solving it with GP. These results
are achieved by using highly descriptive features and
including a dimensionality reduction stage that
simplifies the learning and testing process. The
proposed approach could be extended to other
classification algorithms and used as the basis of an
expert system for algorithm selection.",
notes = "Supervisor: Leonardo Trujillo Reyes
Also known as \cite{oai:HAL:tel-01668769v1} Also known
as \cite{DBLP:phd/hal/Martinez16a}",