A Bibliometric Analysis and Benchmark of Machine Learning and AutoML in Crash Severity Prediction: The Case Study of Three Colombian Cities
Created by W.Langdon from gp-bibliography.bib Revision:1.7917
@Article{angarita-zapata:2021:Sensors,
author = "Juan S Angarita-Zapata and Gina Maestre-Gongora and
Jenny Fajardo Calderin",
title = "A Bibliometric Analysis and Benchmark of Machine
Learning and {AutoML} in Crash Severity Prediction: The
Case Study of Three Colombian Cities",
journal = "Sensors (Basel, Switzerland)",
year = "2021",
volume = "21",
number = "24",
month = dec # " 16",
keywords = "genetic algorithms, genetic programming, TPOT, Bayes
Theorem, Benchmarking, Bibliometrics, Cities, Colombia,
Machine Learning, Internet of Things, automated machine
learning, crash severity prediction, intelligent
transportation systems, supervised learning",
ISSN = "1424-8220",
DOI = "doi:10.3390/s21248401",
abstract = "Traffic accidents are of worldwide concern, as they
are one of the leading causes of death globally. One
policy designed to cope with them is the design and
deployment of road safety systems. These aim to predict
crashes based on historical records, provided by new
Internet of Things (IoT) technologies, to enhance
traffic flow management and promote safer roads.
Increasing data availability has helped machine
learning (ML) to address the prediction of crashes and
their severity. The literature reports numerous
contributions regarding survey papers, experimental
comparisons of various techniques, and the design of
new methods at the point where crash severity
prediction (CSP) and ML converge. Despite such
progress, and as far as we know, there are no
comprehensive research articles that theoretically and
practically approach the model selection problem (MSP)
in CSP. Thus, this paper introduces a bibliometric
analysis and experimental benchmark of ML and automated
machine learning (AutoML) as a suitable approach to
automatically address the MSP in CSP. Firstly, 2318
bibliographic references were consulted to identify
relevant authors, trending topics, keywords evolution,
and the most common ML methods used in related-case
studies, which revealed an opportunity for the use of
AutoML in the transportation field. Then, we compared
AutoML (AutoGluon, Auto-sklearn, TPOT) and ML
(CatBoost, Decision Tree, Extra Trees, Gradient
Boosting, Gaussian Naive Bayes, Light Gradient Boosting
Machine, Random Forest) methods in three case studies
using open data portals belonging to the cities of
Medellin, Bogota, and Bucaramanga in Colombia. Our
experimentation reveals that AutoGluon and CatBoost are
competitive and robust ML approaches to deal with
various CSP problems. In addition, we concluded that
general-purpose AutoML effectively supports the MSP in
CSP without developing domain-focused AutoML methods
for this supervised learning problem. Finally, based on
the results obtained, we introduce challenges and
research opportunities that the community should
explore to enhance the contributions that ML and AutoML
can bring to CSP and other transportation areas.",
notes = "PMID: 34960494",
}
Genetic Programming entries for
Juan S Angarita-Zapata
Gina Paola Maestre-Gongora
Jenny Fajardo Calderin