Elsevier

Safety Science

Volume 49, Issues 8–9, October 2011, Pages 1156-1163

A combined frequency–severity approach for the analysis of rear-end crashes on urban arterials

https://doi.org/10.1016/j.ssci.2011.03.007

Abstract

Analysis of both the crash count and the severity of injury is required to provide a complete picture of the safety situation of any given roadway. The randomness of crashes, the one-way dependency of injury on crash occurrence and the difference in response types have typically led researchers to develop independent statistical models for crash count and severity classification. The Genetic Programming (GP) methodology adopts concepts from evolutionary biology, such as crossover and mutation, to provide a common heuristic approach to model development for the two different modeling objectives. The chosen GP models have the highest hit rate for the rear-end crash classification problem and the least error for the function fitting (regression) problems. Higher Average Daily Traffic (ADT) is more likely to result in more crashes. Absence of on-street parking may result in diminished severity of injuries resulting from crashes as they may provide a “soft” crash barrier in contrast to fixed roadside objects. Graphical presentation of the frequency of crashes with varying input variables sheds new light on the results and their interpretation. A higher friction coefficient of roadways results in reduced frequency of crashes during the morning peak hours, with the trend being reversed during the afternoon peak hours. Crash counts have been observed to be at a maximum at a surface width of 30 ft. Sensitivity analysis results reflect that ADT is responsible for the largest variation in crash counts on urban arterials.

Highlights

► Genetic Programming approach to crash frequency & injury severity.
► Models for rear-end crashes (mid block, intersection, access points).
► Graphical understanding for variables’ behavior.

Introduction

Safety assessment of roadway elements such as mid-block segments, signalized intersections and un-signalized intersections (access points) includes investigations into the severity as well as the frequency of crashes. The goals of transportation safety engineers are to reduce the number of crashes and to mitigate injury severity when a crash occurs. However, research directed only towards either the frequency or the severity of crashes may not always be sufficient. Though this aspect of safety analysis is widely accepted, the existing body of knowledge contains very few studies offering a complete analysis of both the crash counts and the severity of the injuries resulting from the crashes. Recently, Ma et al. (2008) used a multivariate Poisson-lognormal approach to model crash occurrence simultaneously at various levels of injury severity. However, the complex statistical structure of the study makes it less practical to implement.

The fundamental difference between the crash occurrence phenomenon and the injury severity levels is the response type. Crash occurrence is a numeric (integer count) response, while the severity is an ordinal target. Most statistical studies of the two phenomena are based on this difference. For crash count prediction, models such as negative binomial (Miaou, 1996; Harwood et al., 2000) and support vector machines (Li et al., 2008) are the norm. For injury severity, logistic regression (Huang et al., 2008; Sze and Wong, 2007), binary trees (Das et al., 2009; Chang and Wang, 2006), ordered probit and logit models (Das et al., 2008; Obeng, 2008) and the innovative proportional odds model (Wang and Abdel-Aty, 2008) are the standard modeling practices.

In this study the authors investigate a generalized heuristic approach, Genetic Programming (GP), to model injury severity as well as crash frequency. GP uses concepts from evolutionary biology, such as crossover and mutation, in the model development process. Models evolve over generations, with decreasing mean error as the objective function for regression problems and increasing hit rate as the objective function for classification problems.
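The two objective functions named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not say which form of mean error is used, so mean absolute error is assumed here, and the function names `mean_error` and `hit_rate` as well as the sample values are hypothetical.

```python
from typing import Sequence


def mean_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error (an assumed form of 'mean error'); lower is better."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def hit_rate(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of cases assigned to the correct class; higher is better."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical crash counts per segment and a candidate model's predictions.
regression_fitness = mean_error([3, 0, 5, 2], [2.5, 0.5, 4.0, 2.0])

# Hypothetical severe (1) / non-severe (0) labels and a candidate model's output.
classification_fitness = hit_rate([1, 0, 0, 1], [1, 0, 1, 1])

print(regression_fitness, classification_fitness)
```

An evolutionary search would keep candidates with a lower `mean_error` for the crash count problem and a higher `hit_rate` for the severity problem.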

Roadway incidents in which the front section of a vehicle collides with the back of another are categorized as rear-end crashes (Singh, 2003). This study investigates the frequency and severity of rear-end crashes, specifically on urban arterials (not limited-access facilities). Although crash occurrence and injury severity are fundamentally different phenomena, they share an overlapping set of contributing factors. It must be understood that crash occurrence and injury severity are sequential in time, i.e. they are not simultaneous. First, a crash has to occur and then an injury may result. Hence, there is a one-way dependency between the two events. The authors suggest independent approaches for building the severity and frequency models under the broader umbrella of the heuristic GP. Since crash occurrence and injury severity are fundamentally different phenomena, it is not practical to have one model governing them. However, in this study the authors present a common heuristic model development process for both events. It must be noted that crash cause is the link between occurrence and injury severity. Different types of causes may have an effect on the injury severity.

The following section explains the GP methodology and the overall model development algorithm. The results and analyses follow next, with the injury severity analysis preceding the crash count modeling for rear-end crashes on urban arterials. The data set preparation has been included in the respective analysis sub-sections. The crash frequency analysis includes a graphical demonstration of how crash counts change with parameter values. The sensitivity analysis also helps identify which continuous variable entering the final models most affects the variation of the crash count.

Section snippets

Genetic Programming (GP)

The GP methodology, which is a class of evolutionary algorithm, originates from the Genetic Algorithm (GA), in which the members evolve through generations. Concepts from biology such as crossover and mutation are at the center of the evolutionary process. Typically, crossover interchanges sections between two homologous chromosomes at a certain splice point. Mutation, on the other hand, alters a particular point in a chromosome. In GP the chromosome
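The crossover and mutation operations described above can be illustrated on a simple list-based chromosome. This is a minimal sketch under stated assumptions: the list encoding, the two-symbol gene alphabet, and the function names `single_point_crossover` and `point_mutation` are hypothetical and chosen only for illustration; the chromosome representation used in the GP models is not reproduced in this excerpt.

```python
import random
from typing import List, Sequence, Tuple


def single_point_crossover(
    parent_a: Sequence[str], parent_b: Sequence[str], rng: random.Random
) -> Tuple[List[str], List[str]]:
    """Swap the tails of two equal-length chromosomes at a random splice point."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    splice = rng.randrange(1, len(parent_a))  # splice point strictly inside
    child_a = list(parent_a[:splice]) + list(parent_b[splice:])
    child_b = list(parent_b[:splice]) + list(parent_a[splice:])
    return child_a, child_b


def point_mutation(
    chromosome: Sequence[str], alphabet: Sequence[str], rng: random.Random
) -> List[str]:
    """Replace the gene at one random position with a different symbol."""
    mutated = list(chromosome)
    position = rng.randrange(len(mutated))
    choices = [gene for gene in alphabet if gene != mutated[position]]
    if not choices:
        raise ValueError("alphabet offers no alternative symbol")
    mutated[position] = rng.choice(choices)
    return mutated


rng = random.Random(0)  # fixed seed so the sketch is repeatable
a, b = list("AAAAAA"), list("BBBBBB")  # hypothetical parent chromosomes
child_a, child_b = single_point_crossover(a, b, rng)
mutant = point_mutation(a, alphabet="AB", rng=rng)
print("".join(child_a), "".join(child_b), "".join(mutant))
```

Each child carries the head of one parent and the tail of the other, while the mutant differs from its parent at exactly one position.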

Data preparation

The authors set up a binary classification problem for the severe/non-severe injury crashes. Injury-related crashes represent all types of injuries, and the degree of severity ranges from possible injury to death. Given the nature of the injuries, two groupings of the injury-related crashes are possible. The crashes with fatalities and incapacitating injuries have been grouped together. They are put together into one level as the crashes that involve incapacitating injury could
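The binary target described above can be sketched as a simple mapping from injury level to a severe/non-severe label. This is an illustrative sketch only: the level names, the intermediate "non-incapacitating" level, and the function name `is_severe` are assumptions not given in the text; only the grouping of fatalities with incapacitating injuries comes from the source.

```python
# Levels grouped together per the text: fatalities and incapacitating injuries.
SEVERE_LEVELS = {"fatal", "incapacitating"}
# Remaining injury levels (names assumed for illustration).
NON_SEVERE_LEVELS = {"non-incapacitating", "possible"}


def is_severe(injury_level: str) -> int:
    """Return 1 for a severe injury crash, 0 for a non-severe injury crash."""
    level = injury_level.strip().lower()
    if level in SEVERE_LEVELS:
        return 1
    if level in NON_SEVERE_LEVELS:
        return 0
    raise ValueError(f"unknown injury level: {injury_level!r}")


# Hypothetical crash records, one injury level per crash.
crashes = ["possible", "Fatal", "incapacitating", "non-incapacitating"]
labels = [is_severe(level) for level in crashes]
print(labels)
```

Raising an error on unrecognized levels, rather than defaulting to non-severe, keeps miscoded records from silently entering the target variable.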

Conclusion

In this study the authors have discussed a heuristic model development approach to understand the frequency of crashes as well as the classification of injury severity. The fundamental difference between frequency modeling and classification is the response type. The frequency response is continuous, whereas the classification target variable is categorical. Previous transportation researchers have mostly focused on unique statistical modeling techniques specifically suited either for crash

References (26)

  • A. Das et al. Urban arterial crash characteristics related with proximity to intersections and injury severity. Transportation Research Record (2008)
  • J.L. Gettis et al. Effects of rural highway median treatments and access. Transportation Research Record (2005)
  • D.E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning (1989)
