The presentation describes a project to predict traffic accident severity and vehicle count using machine learning models. The authors first reviewed related work, then collected and cleaned a Kaggle dataset of US traffic accidents. They applied several classification algorithms to predict severity, including logistic regression, random forest, support vector machines, KNN, decision trees, and Naive Bayes. The best-performing models were random forest and support vector machines, with over 80% accuracy. The presentation concludes by discussing opportunities for future work to improve the models.
1. PREDICTING TRAFFIC ACCIDENT SEVERITY AND VEHICLE COUNT USING MACHINE LEARNING
Madhu Kiran Chiti
Radhika Patel
Surabhi Parab
2. TABLE OF CONTENTS
MADHU KIRAN CHITI, Radhika PateL, Surabhi Parab
01 Introduction
02 Literature review
03 Methodology
04 Model training algorithms
05 Result and Analysis
06 Conclusion
07 References
3. INTRODUCTION
• Road traffic accidents result in 1.3 million deaths and 50 million injuries worldwide
each year.
• In the US alone, there were over 36,000 traffic fatalities in 2019.
• Economic costs of traffic accidents range from $230 billion to $900 billion globally.
• This project aims to build a multiclass classification model to predict whether an
accident is likely to result in a severe or minor injury.
• We use a subset of features from the US Accidents dataset, including distance,
temperature, humidity, pressure, precipitation, and weather condition.
4. LITERATURE REVIEW
• Conducted a literature review before project implementation
• This helped in precisely formulating the project objective and tailoring models
as per project requirements.
• Several user-created notebooks on Kaggle [4]
• An IEEE paper on Comparison of Machine Learning algorithms [5]
• An article on the risk of road accidents in rainy weather [6]
• Unlike the existing literature we came across, our project focuses on weather conditions and location to provide a severity report.
5. METHODOLOGY
DATASET
• Kaggle dataset
• Features include Severity, Time, Latitude, Longitude, Distance(mi), Street, Number, Side, Zipcode, Temperature(F), Wind_Chill(F), Humidity(%), Pressure(in), Precipitation(in), and Weather_Condition.
DATASET EVALUATION
• Feature selection for analysis
• Two separate analyses: one by time and location, the second by weather conditions
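The two analyses described above could be set up by splitting the columns into two feature groups. The following is a minimal sketch, not the authors' code: the sample rows are made up, the column names follow the feature list on this slide (the actual Kaggle file may name them differently), and `split_views` and the derived `Hour` column are hypothetical helpers introduced only for illustration.

```python
import pandas as pd

# Tiny made-up sample; column names follow the feature list on this slide.
df = pd.DataFrame({
    "Severity": [2, 3, 2, 4],
    "Time": pd.to_datetime([
        "2019-01-05 08:15", "2019-03-12 17:40",
        "2019-07-21 23:05", "2019-11-02 06:30",
    ]),
    "Latitude": [39.87, 34.05, 41.88, 29.76],
    "Longitude": [-84.06, -118.24, -87.63, -95.37],
    "Temperature(F)": [36.9, 68.0, 75.2, 55.4],
    "Humidity(%)": [91.0, 40.0, 65.0, 80.0],
    "Pressure(in)": [29.68, 29.95, 29.80, 30.01],
    "Precipitation(in)": [0.02, 0.0, 0.0, 0.10],
    "Weather_Condition": ["Light Rain", "Clear", "Clear", "Rain"],
})

TARGET = "Severity"
TIME_LOCATION = ["Time", "Latitude", "Longitude"]
WEATHER = [
    "Temperature(F)", "Humidity(%)", "Pressure(in)",
    "Precipitation(in)", "Weather_Condition",
]


def split_views(frame):
    """Return (time/location features, weather features, target)."""
    return frame[TIME_LOCATION].copy(), frame[WEATHER].copy(), frame[TARGET].copy()


X_time_loc, X_weather, y = split_views(df)

# Replace the raw timestamp with an hour-of-day column a classifier can use.
X_time_loc["Hour"] = X_time_loc["Time"].dt.hour
X_time_loc = X_time_loc.drop(columns="Time")

print(X_time_loc.columns.tolist())
print(X_weather.columns.tolist())
```

Keeping the two groups separate makes it possible to train and evaluate one model per analysis without the feature sets leaking into each other.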
6. METHODOLOGY
DATA CLEANING
• Finding missing values and columns with missing data
• Tried replacing missing numerical values with the column mean, but dropped the column itself instead
• Standardizing features and reducing dimensionality with PCA
MODEL TRAINING
• Logistic Regression (LR), Random Forest Classifier (RF), Support Vector Classifier (SVC), K Nearest Neighbor (KNN), Gaussian Naive Bayes (GNB), and Decision Tree models.
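The cleaning and training steps above might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: the data is synthetic (`make_classification`), the column names `f0` to `f5` are placeholders, missing values are injected artificially, and all models use scikit-learn defaults apart from `max_iter` and `random_state`.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the accident data (assumption: not the real dataset).
X_raw, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, n_redundant=0, random_state=0
)
df = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(6)])
df.loc[df.index[::3], "f5"] = np.nan  # inject missing values into one column

# Find missing values, then drop the affected column instead of imputing it.
missing_fraction = df.isna().mean()
clean = df.drop(columns=missing_fraction[missing_fraction > 0].index)

X_train, X_test, y_train, y_test = train_test_split(
    clean, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(random_state=0),
    "SVC": SVC(),
    "KNN": KNeighborsClassifier(),
    "GNB": GaussianNB(),
    "DT": DecisionTreeClassifier(random_state=0),
}

# Standardize inside a pipeline so the scaler is fit on training data only.
scores = {}
for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_train, y_train)
    scores[name] = pipe.score(X_test, y_test)

for name, acc in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {acc:.3f}")
```

The accuracies printed here come from synthetic data and say nothing about the results reported in the presentation.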
7. WORK FLOWCHART
Input raw data
Remove null and noisy data
Input all road-condition-related features
Remove the unwanted features
Perform feature selection techniques
Top 11 features
Perform the classification techniques (Naive Bayes, Logistic Regression, Decision Tree, Random Forest, Support Vector Machine)
Select the best prediction method
Resultant final model
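The flowchart does not say which feature selection technique produced the top 11 features. As one possible illustration, the sketch below uses scikit-learn's `SelectKBest` with the ANOVA F-score (`f_classif`); that choice, the synthetic data, and the `feature_i` column names are all assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in with 20 candidate features (assumption).
X_raw, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)
X = pd.DataFrame(X_raw, columns=[f"feature_{i}" for i in range(20)])

# Keep the 11 features with the highest F-score against the class label.
selector = SelectKBest(score_func=f_classif, k=11)
X_top = selector.fit_transform(X, y)

# Recover the names of the selected columns from the boolean support mask.
top_features = X.columns[selector.get_support()].tolist()
print(top_features)
```

Other selectors, such as model-based importance rankings, would slot into the same place in the workflow.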
8. PCA WITH NAÏVE BAYES ALGORITHM
Load the dataset and display its summary statistics.
Check for missing values and drop columns with more than 50% missing data.
Rename columns and convert date-time columns to DateTime format.
Impute missing numerical values with mean imputation.
Encode categorical variables.
Split the dataset into training and testing sets.
Scale the features using StandardScaler.
Reduce dimensionality using PCA and plot the first two principal components.
Train a Gaussian Naive Bayes classifier on the training set.
Evaluate the performance of the model.
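The steps above can be sketched end to end as follows. This is a minimal illustration under several assumptions: the data is randomly generated rather than loaded from Kaggle, so the resulting accuracy is meaningless; column renaming, DateTime conversion, and plotting are omitted; and the imputer is fit after the train/test split (a deviation from the listed order) so that test rows do not influence the imputed means.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Random stand-in data (assumption); labels are unrelated to the features.
df = pd.DataFrame({
    "Temperature(F)": rng.normal(60, 15, n),
    "Humidity(%)": rng.uniform(20, 100, n),
    "Pressure(in)": rng.normal(29.9, 0.3, n),
    "Wind_Chill(F)": np.where(rng.random(n) < 0.7, np.nan, 50.0),  # ~70% missing
    "Weather_Condition": rng.choice(["Clear", "Rain", "Snow"], n),
    "Severity": rng.integers(1, 5, n),
})
df.loc[df.index[::10], "Humidity(%)"] = np.nan  # 10% missing

print(df.describe())

# Drop columns with more than 50% missing data.
df = df.loc[:, df.isna().mean() <= 0.5]

# One-hot encode the categorical column.
X = pd.get_dummies(
    df.drop(columns="Severity"), columns=["Weather_Condition"], dtype=float
)
y = df["Severity"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Mean imputation, standardization, then projection onto two components.
imputer = SimpleImputer(strategy="mean")
scaler = StandardScaler()
pca = PCA(n_components=2)

X_train_p = pca.fit_transform(scaler.fit_transform(imputer.fit_transform(X_train)))
X_test_p = pca.transform(scaler.transform(imputer.transform(X_test)))

clf = GaussianNB().fit(X_train_p, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test_p))
print(f"Test accuracy: {accuracy:.3f}")
```

The two columns of `X_train_p` are the first two principal components and could be passed to a scatter plot colored by severity.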
9. PCA WITH LR, RF, AND SVC
19. FUTURE WORK
• Due to time constraints, we were unable to fine-tune our models for more accurate results. As future work, we can experiment further with learning rates and other hyperparameters to make our models more robust.
• In the SVC model, we can try changing the kernel to poly (Polynomial) or rbf (Radial Basis Function).
• A deeper analysis of the relationships between different variables and accident severity or vehicle count could provide valuable insights. We can also explore additional performance metrics to evaluate the models.
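One way the kernel experiment could be run is a cross-validated grid search. The sketch below is an assumption-laden illustration: the data is synthetic, the `linear` kernel is included only as a baseline for comparison (the slides do not state which kernel was used originally), and the `C` values are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the accident features and severity labels (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=0
)

# Scale inside the pipeline so each CV fold is standardized independently.
pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

param_grid = {
    "svc__kernel": ["linear", "poly", "rbf"],
    "svc__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

best_kernel = search.best_params_["svc__kernel"]
best_score = search.best_score_
print(f"Best kernel: {best_kernel}, mean CV accuracy: {best_score:.3f}")
```

Passing a different `scoring` value, for example `"f1_macro"`, is one way to explore the additional performance metrics mentioned above.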
20. REFERENCES
• [1] S. Moosavi, M. Ardalan, and M. K. H. Shafiei, "US Accidents (3.5 million records): A Countrywide Traffic Accident Dataset," Kaggle,
2019. https://www.kaggle.com/sobhanmoosavi/us-accidents
• [2] Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, and Rajiv Ramnath. “A Countrywide Traffic Accident
Dataset,” 2019. https://smoosavi.org/datasets/us_accidents
• [3] Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, Radu Teodorescu, and Rajiv Ramnath. "Accident Risk
Prediction based on Heterogeneous Sparse Data: New Dataset and Insights." In Proceedings of the 27th ACM SIGSPATIAL International
Conference on Advances in Geographic Information Systems, ACM, 2019. https://arxiv.org/abs/1909.09638
• [4] https://www.kaggle.com/code/satyabrataroy/60-insights-extraction-us-accident-analysis
• [5] R. E. AlMamlook, K. M. Kwayu, M. R. Alkasisbeh, and A. A. Frefer, "Comparison of Machine Learning Algorithms for Predicting Traffic
Accident Severity," 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT),
Amman, Jordan, 2019, pp. 272-276, doi: 10.1109/JEEIT.2019.8717393.
• [6] Harold Brodsky, A. Shalom Hakkert, “Risk of a road accident in rainy weather,” Accident Analysis & Prevention, Volume 20, Issue 3, 1988,
Pages 161-176, ISSN 0001-4575, https://doi.org/10.1016/0001-4575(88)90001-2
• [7] Anna Trawén, Pia Maraste, Ulf Persson, “International comparison of costs of a fatal casualty of road accidents in 1990 and 1999,”
Accident Analysis & Prevention, Volume 34, Issue 3, 2002, Pages 323-332, ISSN 0001-4575, https://doi.org/10.1016/S0001-4575(01)00029-X