abstract = "In classification, machine learning algorithms can
suffer a performance bias when data sets are
unbalanced. Data sets are unbalanced when at least one
class is represented by only a small number of training
examples (called the minority class) while the other
class(es) make up the majority. In this scenario,
classifiers can have good accuracy on the majority
class but very poor accuracy on the minority class(es).
This paper proposes a Multi-objective Genetic
Programming (MOGP) approach to evolving accurate and
diverse ensembles of genetic program classifiers with
good performance on both the minority and majority
classes. The evolved ensembles comprise nondominated
solutions in the population where individual members
vote on class membership. This paper evaluates the
effectiveness of two popular Pareto-based fitness
strategies in the MOGP algorithm (SPEA2 and NSGA-II),
and investigates techniques to encourage diversity
between solutions in the evolved ensembles.
Experimental results on six (binary) class imbalance
problems show that the evolved ensembles outperform
their individual members, as well as single-predictor
methods such as canonical GP, Naive Bayes and Support
Vector Machines, on highly unbalanced tasks. This
highlights the importance of developing an effective
fitness evaluation strategy in the underlying MOGP
algorithm to evolve good ensemble members.",