COVID-19 seroprevalence estimation and forecasting in the USA from ensemble machine learning models using a stacking strategy
Created by W.Langdon from
gp-bibliography.bib Revision:1.8414
@Article{Sagastabeitia:2024:eswa,
author = "Gontzal Sagastabeitia and Josu Doncel and
Jose Aguilar and Antonio {Fernandez Anta} and Juan Marcos Ramirez",
title = "{COVID-19} seroprevalence estimation and forecasting
in the {USA} from ensemble machine learning models
using a stacking strategy",
journal = "Expert Systems with Applications",
year = "2024",
volume = "258",
pages = "124930",
keywords = "genetic algorithms, genetic programming, COVID-19,
Epidemiology, Stacking ensemble method, Machine
learning, Regression modelling, Neural networks, ANN",
ISSN = "0957-4174",
URL = "https://www.sciencedirect.com/science/article/pii/S0957417424017974",
DOI = "10.1016/j.eswa.2024.124930",
abstract = "The COVID-19 pandemic exposed the importance of
research on the spread of epidemic diseases. In this
paper, we apply Artificial Intelligence and statistics
techniques to build prediction models to estimate the
SARS-CoV-2 seroprevalence in the United States, using
multiple estimates of COVID-19 prevalence and other
explanatory variables. We propose the use of stacking
techniques based on multiple model building techniques
(Linear and Beta Regression, Genetic Programming and
Neural Networks) to obtain Predictive Ensemble Models.
There has been extensive research on this field, but
there has not been in-depth research on the application
of stacking methods to estimate and forecast
seroprevalence in the USA specifically. This paper
provides a novel comparison of the behaviour and
performance of different building techniques for
stacking ensemble models and presents which methods are
better for different scenarios. We find that Genetic
Programming and Neural Networks are the best models
with trained data within single states, and when
multiple states are considered Genetic Programming is
still better than the Regression models, but Neural
Networks fail to estimate the seroprevalence
accurately. Another novelty of our work is the use of
cross-state validation to evaluate the models with new
data, as well as temporal forecasting. Depending on how
the data is processed, Linear Regression performs very
well with cross-state validation and temporal
forecasting, and Genetic Programming is very accurate
with the former while Neural Networks work better with
the latter.",
}
Genetic Programming entries for
Gontzal Sagastabeitia
Josu Doncel
Jose Lisandro Aguilar Castro
Antonio Fernandez Anta
Juan Marcos Ramirez