A combined frequency–severity approach for the analysis of rear-end crashes on urban arterials
Highlights
► Genetic Programming approach to crash frequency and injury severity.
► Models for rear-end crashes (mid-block, intersection, access points).
► Graphical understanding of the variables' behavior.
Introduction
Safety assessment of roadway elements such as mid-block segments, signalized intersections and unsignalized intersections (access points) involves investigating both the severity and the frequency of crashes. Transportation safety engineers aim to reduce the number of crashes and to mitigate injury severity when a crash does occur. Research directed at only the frequency or only the severity of crashes may therefore not always be sufficient. Although this view of safety analysis is widely accepted, the existing literature contains very few complete analyses involving both the crash counts and the severity of the resulting injuries. Recently, Ma et al. (2008) used a multivariate Poisson-lognormal approach to model crash occurrence simultaneously at various levels of injury severity. However, the complex statistical structure of that study makes it less practical to implement.
The fundamental difference between the crash occurrence phenomenon and the injury severity levels is the response type. Crash occurrence is an integer-valued count response, while severity is an ordinal target. Most statistical studies of the two phenomena are based on this difference. For crash count prediction, models such as the negative binomial (Miaou, 1996; Harwood et al., 2000) and support vector machines (Li et al., 2008) are the norm. For injury severity, logistic regression (Huang et al., 2008; Sze and Wong, 2007), binary trees (Das et al., 2009; Chang and Wang, 2006), ordered probit and logit models (Das et al., 2008; Obeng, 2008) and the innovative proportional odds model (Wang and Abdel-Aty, 2008) are the standard modeling practices.
In this study the authors investigate a generalized heuristic approach, Genetic Programming (GP), to model injury severity as well as crash frequency. GP uses concepts from evolutionary biology, such as crossover and mutation, in the model development process. Models evolve through generations, with decreasing mean error as the objective function for regression problems and increasing hit rate as the objective function for classification problems.
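The evolutionary loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses tree-based GP for readability (the reference list suggests a linear GP variant may have been used), and the function set, population size, mutation rate, node cap, mean absolute error as the "mean error", and the zero threshold in `hit_rate` are all assumptions made for the example.

```python
import operator
import random

# Illustrative assumptions: function set, two inputs, and a node cap against bloat.
FUNCS = [(operator.add, "+"), (operator.sub, "-"), (operator.mul, "*")]
N_VARS = 2
MAX_NODES = 60


def random_tree(rng, depth=3):
    """Grow a random expression: a terminal (kind, value) or (func, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("x", rng.randrange(N_VARS))   # input variable index
        return ("c", rng.uniform(-1.0, 1.0))      # numeric constant
    return (rng.choice(FUNCS), random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, row):
    if len(tree) == 2:
        kind, value = tree
        return row[value] if kind == "x" else value
    (func, _symbol), left, right = tree
    return func(evaluate(left, row), evaluate(right, row))


def paths(tree, prefix=()):
    """List the position of every node as a tuple of child indices (1 or 2)."""
    out = [prefix]
    if len(tree) == 3:
        out += paths(tree[1], prefix + (1,)) + paths(tree[2], prefix + (2,))
    return out


def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace_subtree(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(rng, a, b):
    """Copy a random section of parent b into a random splice point of parent a."""
    donor = get_subtree(b, rng.choice(paths(b)))
    return replace_subtree(a, rng.choice(paths(a)), donor)


def mutate(rng, tree):
    """Alter one randomly chosen point by replacing it with a fresh small subtree."""
    return replace_subtree(tree, rng.choice(paths(tree)), random_tree(rng, 2))


def mean_error(tree, X, y):
    """Regression objective (lower is better): mean absolute error."""
    return sum(abs(evaluate(tree, r) - t) for r, t in zip(X, y)) / len(y)


def hit_rate(tree, X, labels):
    """Classification objective (higher is better): share of correct classes."""
    hits = sum((1 if evaluate(tree, r) > 0 else 0) == c for r, c in zip(X, labels))
    return hits / len(labels)


def evolve(X, y, pop_size=60, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda t: mean_error(t, X, y))
        history.append(mean_error(pop[0], X, y))
        survivors = pop[: pop_size // 2]          # the better half is kept unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = crossover(rng, a, b)
            if rng.random() < 0.2:
                child = mutate(rng, child)
            if len(paths(child)) > MAX_NODES:     # reject oversized offspring
                child = a
            children.append(child)
        pop = survivors + children
    pop.sort(key=lambda t: mean_error(t, X, y))
    history.append(mean_error(pop[0], X, y))
    return pop[0], history
```

Because the better half of each generation survives unchanged, the best mean error recorded in `history` can only stay level or decrease from one generation to the next. Swapping `mean_error` for `hit_rate` (and sorting in descending order) would give the classification variant of the same loop.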
Roadway incidents in which the front section of a vehicle collides with the back of another are categorized as rear-end crashes (Singh, 2003). This study investigates the frequency and severity of rear-end crashes specifically on urban arterials (not limited-access facilities). Although crash occurrence and injury severity are fundamentally different phenomena, they have an overlapping set of contributing factors. Crash occurrence and injury severity are sequential in time, i.e. they are not simultaneous: first a crash has to occur, and then an injury may result. Hence, there is a one-way dependency between the two events. The authors suggest independent approaches for building the severity and frequency models under the broader umbrella of heuristic GP. Since crash occurrence and injury severity are fundamentally different phenomena, it is not practical to have one model governing them. However, in this study the authors give a common heuristic model development process for both events. It must be noted that crash cause is the link between occurrence and injury severity, and different types of causes may affect injury severity differently.
The following section explains the GP methodology and the overall model development algorithm. The results and analyses follow, with the injury severity analysis preceding the crash count modeling for rear-end crashes on urban arterials. Data set preparation is described in the respective analysis sub-sections. The crash frequency analysis includes a graphical demonstration of how crash counts change as parameter values change. The sensitivity analysis also helps in identifying the continuous variables in the final models that most strongly affect the variation in crash count.
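The preview does not say how the sensitivity analysis was carried out. One common scheme, shown here purely as an illustration, varies a single input over a range while holding the others at baseline values and records the predicted crash count. The variable names (`traffic_volume`, `segment_length`) and the toy model are hypothetical placeholders, not the paper's variables or fitted model.

```python
def sensitivity_curve(model, baseline, name, values):
    """Predicted count as one variable sweeps over `values`, others held at baseline."""
    curve = []
    for v in values:
        point = dict(baseline)   # copy so the baseline is never modified
        point[name] = v
        curve.append((v, model(point)))
    return curve


def rank_by_range(model, baseline, sweeps):
    """Rank variables by the spread of predictions each sweep produces."""
    spans = {}
    for name, values in sweeps.items():
        preds = [p for _, p in sensitivity_curve(model, baseline, name, values)]
        spans[name] = max(preds) - min(preds)
    return sorted(spans.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical stand-in for a fitted crash-count model.
def toy_model(p):
    return 0.002 * p["traffic_volume"] + 1.5 * p["segment_length"]


baseline = {"traffic_volume": 1000.0, "segment_length": 1.0}
sweeps = {
    "traffic_volume": [500.0, 1000.0, 1500.0],
    "segment_length": [0.5, 1.0, 1.5],
}
ranking = rank_by_range(toy_model, baseline, sweeps)
```

Plotting each `sensitivity_curve` gives the kind of graph the text refers to, and the ranking gives a rough ordering of variable importance. The ordering depends on the sweep ranges chosen, so it should be read as indicative only.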
Section snippets
Genetic Programming (GP)
The GP methodology, a class of evolutionary algorithm, originates from the Genetic Algorithm (GA), in which the members evolve through generations. Biological concepts such as crossover and mutation are at the center of the evolutionary process. During crossover, sections are typically interchanged between two homologous chromosomes at a certain splice point. Mutation, on the other hand, is the alteration of a particular point in a chromosome. In GP the chromosome
Data preparation
The authors set up a binary classification problem for severe versus non-severe injury crashes. Injury-related crashes cover all types of injuries, with the degree of severity ranging from possible injury to death. Given the nature of the injuries, two groupings of the injury-related crashes are possible. The crashes with fatalities and incapacitating injuries have been grouped together. They are put into one level because the crashes that involve incapacitating injury could
Conclusion
In this study the authors have discussed a heuristic model development approach for understanding the frequency of crashes as well as the classification of injury severity. The fundamental difference between frequency modeling and classification is the response type. The frequency response is continuous, whereas the classification target variable is categorical. Previous transportation researchers have mostly focused on unique statistical modeling techniques specifically suited either for crash
References (26)
- et al. Exploring the overall and specific crash severity levels at signalized intersections. Accident Analysis and Prevention (2005)
- et al. Analysis of traffic injury severity: an application of non-parametric classification tree techniques. Accident Analysis and Prevention (2006)
- et al. Using conditional inference forests to identify the factors affecting crash severity on arterial corridors. Journal of Safety Research (2009)
- et al. Severity of driver injury and vehicle damage in traffic crashes at intersections: a Bayesian hierarchical analysis. Accident Analysis and Prevention (2008)
- et al. Predicting motor vehicle crashes using support vector machine models. Accident Analysis and Prevention (2008)
- et al. A multivariate Poisson-lognormal regression model for prediction of crash counts by severity, using Bayesian methods. Accident Analysis and Prevention (2008)
- et al. Diagnostic analysis of the logistic model for pedestrian injury severity in traffic crashes. Accident Analysis and Prevention (2007)
- et al. Analysis of left-turn crash injury severity by conflicting pattern using partial proportional odds models. Accident Analysis and Prevention (2008)
- et al. Underreporting in traffic accident data, bias in parameters and the structure of injury severity models. Accident Analysis and Prevention (2008)
- et al. Linear Genetic Programming (2007)
- Urban arterial crash characteristics related with proximity to intersections and injury severity. Transportation Research Record
- Effects of rural highway median treatments and access. Transportation Research Record
- Genetic Algorithms in Search, Optimization and Machine Learning
Cited by (46)
Advances, challenges, and future research needs in machine learning-based crash prediction models: A systematic review
2024, Accident Analysis and Prevention
Temporal stability of factors affecting injury severity in rear-end and non-rear-end crashes: A random parameter approach with heterogeneity in means and variances
2022, Analytic Methods in Accident Research
Machine learning applied to road safety modeling: A systematic literature review
2020, Journal of Traffic and Transportation Engineering (English Edition)
Citation Excerpt: The main explanatory variables used in crash modeling by severity consist of road-environmental factors, human factors, crash characteristics, and vehicle-related factors, in descending order of importance. However, Das and Abdel-Aty (2010), Das and Abdel-Aty (2011), and Iranitalab and Khattak (2017) only considered road-environmental factors. Furthermore, those authors underscored that there is no need to divide the roadway into segments for modeling for classification purposes, which means that these models can be based on a greater amount of data (i.e., each crash occurrence is a single observation for classification, whereas crash occurrences must be grouped for modeling frequency) and also lead to an improved generalization capacity.