On explaining machine learning models by evolving crucial and compact features
Created by W.Langdon from
gp-bibliography.bib Revision:1.8120
@Article{VIRGOLIN:2020:swarm,
author = "Marco Virgolin and Tanja Alderliesten and
Peter A. N. Bosman",
title = "On explaining machine learning models by evolving
crucial and compact features",
journal = "Swarm and Evolutionary Computation",
volume = "53",
pages = "100640",
year = "2020",
ISSN = "2210-6502",
DOI = "10.1016/j.swevo.2019.100640",
URL = "http://www.sciencedirect.com/science/article/pii/S2210650219305036",
keywords = "genetic algorithms, genetic programming, Feature
construction, Interpretable machine learning, GOMEA",
abstract = "Feature construction can substantially improve the
accuracy of Machine Learning (ML) algorithms. Genetic
Programming (GP) has been proven to be effective at
this task by evolving non-linear combinations of input
features. GP additionally has the potential to improve
ML explainability since explicit expressions are
evolved. Yet, in most GP works the complexity of
evolved features is not explicitly bound or minimized
though this is arguably key for explainability. In this
article, we assess to what extent GP still performs
favorably at feature construction when constructing
features that are (1) Of small-enough number, to enable
visualization of the behavior of the ML model; (2) Of
small-enough size, to enable interpretability of the
features themselves; (3) Of sufficient informative
power, to retain or even improve the performance of the
ML algorithm. We consider a simple feature construction
scheme using three different GP algorithms, as well as
random search, to evolve features for five ML
algorithms, including support vector machines and
random forest. Our results on 21 datasets pertaining to
classification and regression problems show that
constructing only two compact features can be
sufficient to rival the use of the entire original
feature set. We further find that a modern GP
algorithm, GP-GOMEA, performs best overall. These
results, combined with examples that we provide of
readable constructed features and of 2D visualizations
of ML behavior, lead us to positively conclude that
GP-based feature construction still works well when
explicitly searching for compact features, making it
extremely helpful to explain ML models",
}