Toward Data-Driven Generation and Evaluation of Model Structure for Integrated Representations of Human Behavior in Water Resources Systems
Created by W.Langdon from gp-bibliography.bib Revision:1.8276
@Article{ekblad:2021:WR,
author = "Liam Ekblad and Jonathan D. Herman",
title = "Toward Data-Driven Generation and Evaluation of Model
Structure for Integrated Representations of Human
Behavior in Water Resources Systems",
journal = "Water Resources Research",
year = "2021",
volume = "57",
number = "2",
pages = "e2020WR028148",
keywords = "genetic algorithms, genetic programming, Data-driven
modeling, human behavior, integrated systems, machine
learning, multiobjective optimization, California",
URL = "https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2020WR028148",
eprint = "https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2020WR028148",
DOI = "doi:10.1029/2020WR028148",
size = "22 pages",
abstract = "Simulations of human behavior in water
resources systems are challenged by uncertainty in
model structure and parameters. The increasing
availability of observations describing these systems
provides the opportunity to infer a set of plausible
model structures using data-driven approaches. This
study develops a three-phase approach to the inference
of model structures and parameterizations from data:
problem definition, model generation, and model
evaluation, illustrated on a case study of land use
decisions in the Tulare Basin, California. We encode
the generalized decision problem as an arbitrary
mapping from a high-dimensional data space to the
action of interest and use multiobjective genetic
programming to search over a family of functions that
perform this mapping for both regression and
classification tasks. To facilitate the discovery of
models that are both realistic and interpretable, the
algorithm selects model structures based on
multiobjective optimization of (1) their performance on
a training set and (2) complexity, measured by the
number of variables, constants, and operations
composing the model. After training, optimal model
structures are further evaluated according to their
ability to generalize to held-out test data and
clustered based on their performance, complexity, and
generalization properties. Finally, we diagnose the
causes of good and bad generalization by performing
sensitivity analysis across model inputs and within
model clusters. This study serves as a template to
inform and automate the problem-dependent task of
constructing robust data-driven model structures to
describe human behavior in water resources systems.",
}