Evolving decision trees using oracle guides
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@InProceedings{Johansson:2009:ieeeCIDM,
author = "Ulf Johansson and Lars Niklasson",
title = "Evolving decision trees using oracle guides",
booktitle = "IEEE Symposium on Computational Intelligence and Data
Mining, CIDM '09",
year = "2009",
month = mar # " 30--" # apr # " 2",
pages = "238--244",
keywords = "genetic algorithms, genetic programming, data mining,
decision trees, high-accuracy techniques, human
inspection, neural network ensemble, opaque models,
oracle guides, predictive models, rule extraction,
transparent models, neural nets",
DOI = "10.1109/CIDM.2009.4938655",
abstract = "Some data mining problems require predictive models to
be not only accurate but also comprehensible.
Comprehensibility enables human inspection and
understanding of the model, making it possible to trace
why individual predictions are made. Since most
high-accuracy techniques produce opaque models,
accuracy is, in practice, regularly sacrificed for
comprehensibility. One frequently studied technique,
often able to reduce this accuracy vs.
comprehensibility tradeoff, is rule extraction, i.e.,
the activity where another, transparent, model is
generated from the opaque. In this paper, it is argued
that techniques producing transparent models, either
directly from the dataset, or from an opaque model,
could benefit from using an oracle guide. In the
experiments, genetic programming is used to evolve
decision trees, and a neural network ensemble is used
as the oracle guide. More specifically, the datasets
used by the genetic programming when evolving the
decision trees, consist of several different
combinations of the original training data and 'oracle
data', i.e., training or test data instances, together
with corresponding predictions from the oracle. In
total, seven different ways of combining regular
training data with oracle data were evaluated, and the
results, obtained on 26 UCI datasets, clearly show that
the use of an oracle guide improved the performance. As
a matter of fact, trees evolved using training data
only had the worst test set accuracy of all setups
evaluated. Furthermore, statistical tests show that two
setups, both using the oracle guide, produced
significantly more accurate trees, compared to the
setup using training data only.",
notes = "Also known as \cite{4938655}",
}