abstract = "Typically, the quality of a solution in Genetic
Programming (GP) is represented by a score on a given
training sample. However, in Machine Learning, we are
most interested in estimating the quality of the
evolving individuals on unseen data. In this paper, we
propose to simulate the effect of unseen data to direct
training without actually using additional data, by
employing a technique called bootstrapping that
repeatedly re-samples with replacement from the
training data and helps estimate the sensitivity of the
individual in question to small variations across these
re-sampled data sets. We minimise this sensitivity, as
measured by the Bootstrap Standard Error, alongside the
training error, in a bid to evolve models that
generalise better to unseen data.
We evaluate the proposed technique on four binary
classification problems and compare with a standard GP
approach. The results show that, for the problems
undertaken, the proposed method generalises
significantly better than standard GP while also
improving training performance, and has the strong
side effect of containing tree sizes.",