Trustable symbolic regression models: using ensembles, interval arithmetic and Pareto fronts to develop robust and trust-aware models
Created by W.Langdon from
gp-bibliography.bib Revision:1.7917
@InCollection{Kotanchek:2007:GPTP,
author = "Mark Kotanchek and Guido Smits and
Ekaterina Vladislavleva",
title = "Trustable symbolic regression models: using ensembles,
interval arithmetic and {Pareto} fronts to develop
robust and trust-aware models",
booktitle = "Genetic Programming Theory and Practice {V}",
year = "2007",
editor = "Rick L. Riolo and Terence Soule and Bill Worzel",
series = "Genetic and Evolutionary Computation",
chapter = "12",
pages = "201--220",
address = "Ann Arbor",
month = "17-19" # may,
publisher = "Springer",
keywords = "genetic algorithms, genetic programming",
isbn13 = "978-0-387-76308-8",
URL = "http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.457.5272",
URL = "http://www.evolved-analytics.com/sites/EA_Documents/Publications/GPTP07/GPTP07_TrustableModels_Preprint.pdf",
DOI = "10.1007/978-0-387-76308-8_12",
size = "19 pages",
abstract = "Trust is a major issue with deploying empirical models
in the real world since changes in the underlying
system or use of the model in new regions of parameter
space can produce (potentially dangerous) incorrect
predictions. The trepidation involved with model usage
can be mitigated by assembling ensembles of diverse
models and using their consensus as a trust metric,
since these models will be constrained to agree in the
data region used for model development and also
constrained to disagree outside that region. The
problem is to define an appropriate model complexity
(since the ensemble should consist of models of similar
complexity), as well as to identify diverse models from
the candidate model set. In this chapter we discuss
strategies for the development and selection of robust
models and model ensembles and demonstrate those
strategies against industrial data sets. An important
benefit of this approach is that all available data may
be used in the model development rather than a
partition into training, test and validation subsets.
The result is constituent models are more accurate
without risk of over-fitting, the ensemble predictions
are more accurate and the ensemble predictions have a
meaningful trust metric.",
notes = "part of \cite{Riolo:2007:GPTP} published 2008",
}
Genetic Programming entries for
Mark Kotanchek
Guido F Smits
Ekaterina (Katya) Vladislavleva