An Approach For Predicting Road Accident Severity

B. Sikander
MTech Scholar 188150002
Under the Supervision of:
Dr. Anant Ram
Dept. of Computer Engineering & Applications

CONTENT
Area of Research
Introduction
Motivation
Applications
Issues and Challenges
Objective
Proposed Approach
Result
References

Area of Research
Accident Analysis and Prevention

Introduction
• Road traffic injuries and deaths are a major public health issue globally. According to the
World Health Organization (WHO), approximately 1.35 million people die in road traffic
accidents each year, while 20–50 million people suffer non-fatal injuries, many of which result in
disabilities [1].
• Road accidents claimed over 1.5 lakh lives in India in 2018, with over-speeding
being the leading cause of casualties [2].
• The Ministry of Road Transport and Highways report on road accidents in India in 2018
showed that road accidents increased by 0.46% compared with 2017 [2].

Cont’d…
Figure 1: Road Accidents Statistics from 2007 to 2017 [3]

Motivation
• Highway accidents and the resulting death toll are rising every day.
• “A total of 4,67,044 road accidents have been reported by States and Union Territories (UTs) in
the calendar year 2018, claiming 1,51,417 lives and causing injuries to 4,69,418 persons,” the
report said. Over-speeding accounted for 64.4% of the persons killed [2].
Figure 2: Car accident due to over-speeding [2]

Cont’d…
Figure 3: Number of Road Accidents, Deaths and Injuries [4]

Applications
Help in designing road geometry.
Help in identifying the spots on a road where the chance of an accident is high.
Help in deciding where to install road surveillance.
Road network simulation.
Help in the study of road intersections.

Issues and Challenges
Proper selection of features which can cause a road accident.
Removal of unwanted features.
Collected data should be valid and related to real life accident situations.
Removal of outlier instances

Objective
To evaluate factors contributing to road accidents.
To review related systems and models for predicting the likelihood of causing an accident.
To develop an algorithm for predicting the likelihood of causing an accident.
To test and validate the developed algorithm.

Proposed Approach
• In this research, we build a model that is able to predict road accidents based on road conditions.
• First, we consider 50 possible features/variables that can cause road accidents.
• Vehicle type: car, truck, motorcycle.
• Vehicle length: length of nearby vehicles.
• Road Type : One way street, Single carriageway, dual carriageway.
• Light condition : daylight, darkness – lights lit, darkness – no lighting.
• Weather condition : fine no high winds, raining no high winds, snowing, etc.
• Road surface condition : dry, wet, snow, mud, etc.
• Number of vehicles
• Speed limit

Cont’d…
• Number of passengers
• Pedestrian location
• Age of driver
• Engine capacity
• Age of vehicle (manufacture)
• Vehicle maneuver
• Pedestrian movement
• Braking behavior
• Passenger in adjacent seat
• Cellphone use
• Driver seatbelt
• Speeding
• Low speed
• Drowsiness
• Alcohol/drug impairment
• Travel lanes
• Traffic density
• Traffic flow
• Traffic control
• Vehicle-to-vehicle distance
• Fatigue
• Tire pressure
• Acceleration
• Deceleration
• Pedestrian crossing
• Pedestrian type
• Sex of driver
• Longitude and latitude

Cont’d…
We categorized the stated features into four categories as shown in figure 5.
Figure 5: Road Accidents Causes

Cont’d…
Assume $D = \{d_1, d_2, d_3, \ldots, d_n\}$, where D is the dataset with around 1 lakh instances.
$X = \{x_1, x_2, x_3, \ldots, x_m\}$, where X is the set of features and the number of features is m = 50.
$T = \{t_1, t_2, t_3\}$, where T is the set of target values, i.e. {Slight, Fatal, Serious}.
Proposed Work Process
Step 1: Manual selection of the features that are related to on-road conditions.
Step 2: In this step, we also filter the features, keeping those that have a direct impact on road accidents.

Cont’d…
Step 3: In this step, we label each instance with a target value by identifying the impact of the accident.
$D: X \rightarrow T$, where T = {Slight, Fatal, Serious}
Step 4: We now use machine learning feature selection techniques to find the important features and
mitigate the curse of dimensionality.
We use Univariate Selection, Feature Importance, and Recursive Feature Elimination.

Cont’d…
Step 4 (a): Univariate Feature Selection
• Statistical tests can be used to select those features that have the strongest relationship with
the output variable.
• The scikit-learn library provides the SelectKBest class that can be used with a suite of
different statistical tests to select a specific number of features.
• We use the chi-squared (χ²) statistical test, which requires non-negative features, to select the 11 best
features from the dataset.
• The formula for the chi-squared test is
$$\chi^2_c = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
where c = degrees of freedom, O = observed value(s), and E = expected value(s).
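As a rough illustration of Step 4 (a), the sketch below applies scikit-learn's SelectKBest with the chi-squared score to a small synthetic dataset. The data, the variable names, and the choice of k = 2 are assumptions made only for this example; the deck itself selects 11 features from the accident dataset.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical stand-in data: 500 instances, 3 severity classes (0, 1, 2).
rng = np.random.default_rng(0)
n_samples = 500
y = rng.integers(0, 3, size=n_samples)

# Two features that depend strongly on the class, plus four pure-noise features.
# All values are non-negative integers, as the chi-squared test requires.
informative = y[:, None] * 2 + rng.integers(0, 2, size=(n_samples, 2))
noise = rng.integers(0, 5, size=(n_samples, 4))
X = np.hstack([informative, noise])

# Keep the k features with the highest chi-squared score (k=2 here, 11 in the deck).
selector = SelectKBest(score_func=chi2, k=2)
X_selected = selector.fit_transform(X, y)

selected_columns = selector.get_support(indices=True)
print("chi2 scores:", np.round(selector.scores_, 2))
print("selected columns:", selected_columns)
```

The two class-dependent columns receive much larger scores than the noise columns, so they are the ones retained.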

Cont’d…
Step 4 (b): Feature Importance
• We can get the importance of each feature of the dataset by using the feature
importance property of the model.
• Feature importance gives a score for each feature of the data; the higher the score, the more
important or relevant the feature is to the output variable.
• Feature importance is a built-in attribute of tree-based classifiers (feature_importances_ in scikit-learn); we
use the Extra Trees Classifier to extract the top 11 features of the dataset.
• Scikit-learn calculates a node's importance using Gini importance, assuming only two child
nodes:
$$ni_j = w_j C_j - w_{left(j)} C_{left(j)} - w_{right(j)} C_{right(j)}$$

Cont’d…
where
$ni_j$ = the importance of node j
$w_j$ = the weighted number of samples reaching node j
$C_j$ = the impurity value of node j
$left(j)$ = the child node from the left split on node j
$right(j)$ = the child node from the right split on node j
The importance of each feature on a decision tree is then calculated as:
$$fi_i = \frac{\sum_{j:\ \text{node } j \text{ splits on feature } i} ni_j}{\sum_{k \in \text{all nodes}} ni_k}$$
where
$fi_i$ = the importance of feature i
$ni_j$ = the importance of node j

Cont’d…
These can then be normalized to a value between 0 and 1 by dividing by the sum of all feature importance values:
$$normfi_i = \frac{fi_i}{\sum_{j \in \text{all features}} fi_j}$$
The final feature importance, at the Random Forest level, is its average over all the trees. The sum of the feature's
importance values over all trees is calculated and divided by the total number of trees:
$$RFfi_i = \frac{\sum_{j \in \text{all trees}} normfi_{ij}}{T}$$
where
$RFfi_i$ = the importance of feature i calculated from all trees in the Random Forest model
$normfi_{ij}$ = the normalized feature importance for feature i in tree j
$T$ = the total number of trees
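The node-level and forest-level formulas above can be checked numerically. The following is a minimal sketch, assuming a synthetic dataset and a small Extra Trees ensemble: it recomputes the importances by hand from each fitted tree and compares the result with the feature_importances_ attribute. The helper name tree_feature_importance and all data parameters are hypothetical and not part of the original work.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier


def tree_feature_importance(fitted_tree, n_features):
    """Normalized feature importance of one tree, summed from node importances."""
    t = fitted_tree.tree_
    w = t.weighted_n_node_samples   # w_j: weighted samples reaching node j
    c = t.impurity                  # C_j: impurity of node j
    fi = np.zeros(n_features)
    for j in range(t.node_count):
        left, right = t.children_left[j], t.children_right[j]
        if left == -1:              # leaf node: no split, no contribution
            continue
        ni_j = w[j] * c[j] - w[left] * c[left] - w[right] * c[right]
        fi[t.feature[j]] += ni_j    # credit the feature that node j splits on
    return fi / fi.sum()            # normalize so the importances sum to 1


# Synthetic stand-in for the accident data (3 classes, 8 features).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)

forest = ExtraTreesClassifier(n_estimators=25, random_state=0).fit(X, y)

# Forest-level importance: average of the normalized per-tree importances.
per_tree = np.array([tree_feature_importance(est, X.shape[1])
                     for est in forest.estimators_])
manual_importance = per_tree.mean(axis=0)

# Indices of the top-k features (k=3 here; the deck keeps the top 11).
top_k = np.argsort(manual_importance)[::-1][:3]
print("manual :", np.round(manual_importance, 4))
print("sklearn:", np.round(forest.feature_importances_, 4))
print("top features:", top_k)
```

In practice only forest.feature_importances_ is needed; the manual loop is included to show how the formulas map onto the fitted trees.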

Cont’d…
Step 4 (c): Recursive Feature Elimination
Given an external estimator that assigns weights to features (e.g., the coefficients of a linear model), the goal of
recursive feature elimination (RFE) is to select features by recursively considering smaller and smaller sets of
features. First, the estimator is trained on the initial set of features and the importance of each feature is obtained
either through a coef_ attribute or through a feature_importances_ attribute. Then, the least important features are
pruned from the current set of features. That procedure is recursively repeated on the pruned set until the desired
number of features to select is eventually reached.
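The procedure described above corresponds to scikit-learn's RFE class. The sketch below is illustrative only: the synthetic data, the choice of logistic regression as the external estimator, and the target of 4 features are assumptions for the example (the deck selects 11 features and does not state which estimator was used).

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the accident data (3 classes, 10 features).
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           n_classes=3, random_state=0)

# External estimator whose coef_ attribute supplies the feature weights.
estimator = LogisticRegression(max_iter=1000)

# Drop one feature per iteration (step=1) until 4 features remain.
rfe = RFE(estimator=estimator, n_features_to_select=4, step=1)
rfe.fit(X, y)

selected = [i for i, keep in enumerate(rfe.support_) if keep]
print("selected feature indices:", selected)
print("ranking (1 = kept):", rfe.ranking_)
```

Features with ranking 1 are the ones kept; larger ranks were eliminated earlier in the recursion.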

Result
Step 5: The following models were trained and tested:
A. Naïve Bayes
Accuracy = 79.8%, Classification Error = 20.2%
Confusion matrix:
                True Slight   True Serious   True Fatal   Class precision
Pred. Slight          27965           6605          468            79.81%
Pred. Serious             0              0            0             0.00%
Pred. Fatal               0              0            0             0.00%
Class recall        100.00%          0.00%        0.00%
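The accuracy, per-class recall, and per-class precision reported for Naïve Bayes can be reproduced directly from the confusion matrix. The sketch below uses the counts from the table above, with rows as predicted classes and columns as true classes; the variable names are chosen only for illustration.

```python
import numpy as np

labels = ["Slight", "Serious", "Fatal"]

# Rows: predicted class, columns: true class (counts from the Naive Bayes table).
cm = np.array([[27965, 6605, 468],
               [0,     0,    0],
               [0,     0,    0]], dtype=float)

correct = np.diag(cm)            # correctly classified instances per class
true_totals = cm.sum(axis=0)     # number of instances of each true class
pred_totals = cm.sum(axis=1)     # number of instances predicted as each class

accuracy = correct.sum() / cm.sum()

# Guard against division by zero for classes that are never predicted.
recall = np.divide(correct, true_totals,
                   out=np.zeros(3), where=true_totals > 0)
precision = np.divide(correct, pred_totals,
                      out=np.zeros(3), where=pred_totals > 0)

print(f"accuracy = {accuracy:.2%}")
for name, r, p in zip(labels, recall, precision):
    print(f"{name}: recall = {r:.2%}, precision = {p:.2%}")
```

Because every instance is predicted as Slight, the accuracy equals the share of Slight instances in the test data, and recall for Serious and Fatal is zero.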

Cont’d…
B. Logistic Regression
Confusion matrix:
                True Slight   True Serious   True Fatal   Class precision
Pred. Slight           2862            439            7            86.52%
Pred. Fatal           25039           6220          472             1.49%
Class recall         10.26%          0.00%       98.54%

Cont’d…
C. Decision Tree
Confusion matrix:
                True Slight   True Serious   True Fatal   Class precision
Pred. Slight          27965           6605          468            79.81%
Pred. Fatal               0              0            0             0.00%
Class recall        100.00%          0.00%        0.00%

Cont’d…
D. Random Forest
Confusion matrix:
                True Slight   True Serious   True Fatal   Class precision
Pred. Slight          14240           3373          235            79.78%
Pred. Fatal               0              0            0             0.00%
Class recall         99.94%          0.06%        0.00%

Cont’d…
E. Support Vector Machine
Confusion matrix:
                True Slight   True Serious   True Fatal   Class precision
Pred. Slight           3404            822           59            79.44%
Pred. Fatal               0              0            0             0.00%
Class recall        100.00%          0.00%        0.00%

Proposed work flowchart
1. Input raw data
2. Remove null and noisy data
3. Input all road-condition-related features
4. Remove the dependent features
5. Perform feature selection techniques (Univariate Feature Selection, Feature Importance, and Recursive Feature Elimination)
6. Top 11 features
7. Perform the classification techniques (Naive Bayes, Logistic Regression, Decision Tree, Random Forest, Support Vector Machine)
8. Select the best prediction method
9. Resultant final model
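The flow above could be sketched end to end with scikit-learn as follows. This is only an illustrative outline under stated assumptions: the data are synthetic, features are rescaled to [0, 1] so that the chi-squared selector can be applied, only the univariate selector is shown, and 5-fold cross-validated accuracy is used as the selection criterion. None of these choices is confirmed by the original work.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned accident data (3 severity classes).
X, y = make_classification(n_samples=300, n_features=20, n_informative=6,
                           n_classes=3, random_state=0)

# The five classifier families named in the flowchart.
models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Support Vector Machine": SVC(),
}

scores = {}
for name, model in models.items():
    # Scale to [0, 1], keep the top 11 features, then classify.
    pipeline = make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=11), model)
    scores[name] = cross_val_score(pipeline, X, y, cv=5).mean()

# Select the best prediction method by mean cross-validated accuracy.
best_model = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best_model)
```

Placing the selector inside the pipeline means feature selection is refitted on each training fold, which avoids leaking test-fold information into the selection step.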

References
1. Global status report on road safety 2018. https://www.who.int/violence_injury_prevention/road_safety_status/2018/en/
(accessed 27/05/2020, 04:37 PM).
2. "Road accidents claimed over 1.5 lakh lives in 2018, over-speeding major killer," The Economic Times.
https://economictimes.indiatimes.com/news/politics-and-nation/road-accidents-claimed-over-1-5-lakh-lives-in-2018-over-speeding-major-killer/articleshow/72127418.cms?from=mdr
(accessed 27/05/2020, 04:37 PM).
3. "India way behind 2020 target, road accidents still kill over a lakh a year," India News, Times of India.
https://timesofindia.indiatimes.com/india/india-way-off-road-safety-targets-for-2020-road-accidents-still-kill-over-a-lakh-a-year/articleshow/65765549.cms
(accessed 27/05/2020, 04:55 PM).
4. Government of India, Ministry of Road Transport & Highways, Transport Research Wing, 2018.
5. Xiaoxia Xiong, Long Chen, and Jun Liang, "Vehicle Driving Risk Prediction Based on Markov Chain Model,"
Discrete Dynamics in Nature and Society, Volume 2018.
6. Chunjiao Dong, Chunfu Shao, Juan Li, and Zhihua Xiong, "An Improved Deep Learning Model for Traffic Crash
Prediction," Journal of Advanced Transportation, Volume 2018.
7. Nasim Arbabzadeh and Mohsen Jafari, "A Data-Driven Approach for Driving Safety Risk Prediction Using Driver
Behavior and Roadway Information Data," IEEE Transactions on Intelligent Transportation Systems.
8. Yutao Ba, Wei Zhang, Qinhua Wang, Ronggang Zhou, and Changrui Ren, "Crash prediction with behavioral and
physiological features for advanced vehicle collision avoidance system," Transportation Research Part C.
9. Yiping Chen, Jingkang Wang, Jonathan Li, Cewu Lu, Zhipeng Luo, Han Xue, and Cheng Wang, "LiDAR-Video Driving
Dataset: Learning Driving Policies Effectively."
10. Rishu Chhabra, Seema Verma, and C. Rama Krishna, "A survey on driver behavior detection techniques for intelligent
transportation systems."
11. Loukas Dimitriou, Katerina Stylianou, and Mohamed A. Abdel-Aty, "Assessing rear-end crash potential in urban
locations based on vehicle-by-vehicle interactions, geometric characteristics and operational conditions."

Published and Communicated Papers
• B. Sikander and Anant Ram, “Survey on Severity Rate of Road Accident Assessment and Estimation
Using Data Mining Techniques”, Test Engineering and Management (Scopus), accepted for publication.
• B. Sikander and Anant Ram, “Urban Road Accident Evaluation and Road Accident Severity Prediction”,
International Journal of Mathematical, Engineering and Management Sciences (Scopus), communicated.
