Complexity, interpretability and robustness of GP-based feature engineering in remote sensing
Created by W.Langdon from
gp-bibliography.bib Revision:1.8414
@Article{Batista:2025:swevo,
author = "Joao E. Batista and Adam K. Pindur and
Ana I. R. Cabral and Hitoshi Iba and Sara Silva",
title = "Complexity, interpretability and robustness of
{GP-based} feature engineering in remote sensing",
journal = "Swarm and Evolutionary Computation",
year = "2025",
volume = "92",
pages = "101761",
keywords = "genetic algorithms, genetic programming, Remote
sensing, Model complexity, Model interpretability,
Feature construction, Forest degradation,
Classification, Time series",
ISSN = "2210-6502",
URL = "https://www.sciencedirect.com/science/article/pii/S2210650224002992",
DOI = "doi:10.1016/j.swevo.2024.101761",
abstract = "Feature engineering is a crucial step in machine
learning that provides better data for the learning
algorithms to induce robust models, and this effort
should be adapted to the capabilities of each
algorithm. For example, classifiers that do not perform
data transformations (e.g., cluster-based) perform
better when the different classes are separated,
typically requiring preprocessed data. Other models
(e.g., decision trees) can perform several splits in
the feature space, easily obtaining perfect results in
training data, but have a higher risk of overfitting
with unprocessed data. We use the rbd-GP and M3GP
genetic programming algorithms to induce new features
based on the original features, to be used by shallow
and deep decision tree and random forest models. M3GP
is wrapped around a learning algorithm, using its
performance as fitness. This way, the induced features
are adapted to the classifier, allowing us to compare
the complexity of the features induced for the
different classifiers. We measure the complexity of the
induced features using several structural and
functional complexity metrics found in the literature,
also proposing a new metric that measures the
separability of classes in the feature space. Like
other authors, we use complexity as an interpretability
metric, selecting three models to discuss and validate
based on their performance and size. We apply these
methods to remote sensing classification problems and
solve two tasks that are hard due to the high
similarity between the land cover classes: detecting
cocoa agroforest and forecasting forest degradation up
to one year in the future.",
}
Genetic Programming entries for
Joao E Batista
Adam K Pindur
Ana Isabel Rosa Cabral
Hitoshi Iba
Sara Silva