Predictive Analysis of Traffic violations
Group 1
Introduction
Road accidents are a major worldwide threat:
● Up to 1.27 million deaths/year [2].
● Up to 50 million injuries/year [2].
● Over 2.5 million/year involved in the US [2].
● Huge economic and social impact.
Data mining of traffic violations is arduous:
● Huge data size and high dimensionality [2-4].
● Popularity of classification methods [3-5].
● High dependence on the collected data [3-5].
● Multiple methods must be tested to choose the best [3-5].
Flow of Project
The project followed the CRISP-DM model and the instructions provided:
A. Select a dataset from any free source on the web.
B. Study the variables in the selected dataset; draw
preliminary conclusions; and develop at least three
initial research questions/hypotheses.
C. Develop a Business Use Case.
D. Use visualizations (minimum of 5) to explore the dataset;
present research hypotheses based on the
visualizations.
E. Produce a dataset satisfying all of the criteria in part C.
F. Present 3 modeling techniques for the hypotheses.
G. Develop models using 3 algorithms for each model.
H. Provide recommendations based on the modeling results.
Dataset Chosen
Traffic Violations of MCP: records from 2012 to 2018
from https://catalog.data.gov/ (Jan 17, 2018)
Attributes:
● Date of Stop
● Time of Stop
● Belts
● Contributed to Accident
● Description
● Phone Used
● Fatal
● Property damage
● Violation Type
● Hazmat
Data Preprocessing
Missing Data Handling
● Two attributes, “Agency” and “Accident”, are removed from the dataset because they are
single-valued, with the value 'No' in every record.
● Records with null values are removed using XLMiner's missing data treatment.
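The two cleaning steps above can be sketched in a few lines. This is only an illustration in plain Python, not the XLMiner procedure used in the project; the helper names and the toy records are hypothetical, and only the column names come from the slides.

```python
def drop_constant_columns(rows):
    """Remove attributes that hold the same value in every record."""
    if not rows:
        return []
    constant = [k for k in rows[0] if len({r[k] for r in rows}) == 1]
    return [{k: v for k, v in r.items() if k not in constant} for r in rows]


def drop_incomplete_rows(rows):
    """Remove records in which any attribute is missing (None or empty string)."""
    return [r for r in rows if all(v not in (None, "") for v in r.values())]


# Hypothetical toy records: column names follow the slides, values are invented.
records = [
    {"Agency": "No", "Accident": "No", "Belts": "Yes", "Alcohol": "No"},
    {"Agency": "No", "Accident": "No", "Belts": "No", "Alcohol": None},
    {"Agency": "No", "Accident": "No", "Belts": "No", "Alcohol": "Yes"},
]

# "Agency" and "Accident" are constant and disappear; the second record has a
# missing "Alcohol" value and is dropped.
cleaned = drop_incomplete_rows(drop_constant_columns(records))
```

Constant columns are dropped first here so that removing incomplete records cannot change which columns count as single-valued; the slides do not say in which order the project applied the two steps.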
New Attributes
● Three new binary-valued variables are introduced, namely "Phone Usage",
"Contributed to accident" and "Fatal", based on the visualizations made for the
dataset.
Outliers detection
● Outliers are identified for the attribute "Year" and treated.
Filtering Datasets
● A separate dataset was created for modeling each research question.
● The R programming language is used to produce balanced datasets in which the
values of the target variable carry equal weight.
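The slides do not say how the balancing was done in R. One common approach is random undersampling of the larger class, sketched below in Python purely for illustration; the function name, the seed, and the toy rows are assumptions and are not taken from the project.

```python
import random
from collections import defaultdict


def balance_by_undersampling(rows, target, seed=0):
    """Keep an equal number of records for every value of the target column."""
    if not rows:
        return []
    groups = defaultdict(list)
    for row in rows:
        groups[row[target]].append(row)
    smallest = min(len(group) for group in groups.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 3 fatal records against 17 non-fatal ones.
rows = [{"Fatal": "Yes", "id": i} for i in range(3)]
rows += [{"Fatal": "No", "id": i} for i in range(3, 20)]

balanced = balance_by_undersampling(rows, "Fatal")
fatal_count = sum(1 for r in balanced if r["Fatal"] == "Yes")
```

Undersampling discards records from the majority class, so it trades data volume for balance; oversampling or class weighting would be alternatives if the project's datasets were small.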
Visualizations - Violations based on hours
Visualizations - Child violations with year & month
Visualizations - Phone violations with year & month
Visualizations - Violations with Personal injury & seat belt
Research Hypotheses
1. Whether a violation that occurred contributed to
an accident or not
2. Whether the violation that led to an accident
was fatal or not
3. The geolocation at which a violation is
likely to occur, based on
several factors
Modeling
Model 1
Predictors:
● Belt
● Alcohol
● Vehicle Type
● Year
● Phone Usage
Target variable:
Contributed to accident
Model 2
Predictors:
● Belt
● HAZMAT
● Alcohol
● Phone Usage
Target variable:
Fatal
Model 3
Predictors:
● Belt, Personal Injury
● Property Damage
● Alcohol
● Violation Type
Target variable:
Cluster ID (geo-coordinates)
Model I: contribution to accident
Algorithm     Single Tree   Random Trees   Naïve Bayes
Precision     0.626         0.503          0.621
Sensitivity   0.275         0.996          0.266
Specificity   0.836         0.023          0.838
F1-Score      0.383         0.669          0.373
[Figure labels: Belts, Phone usage, Alcohol, Type: Bus]
Best model: Random Trees, due to the highest
sensitivity and F1-score
Belts and phone usage are the top predictors
for contribution to accident
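The F1-scores reported for Model I can be checked against the precision and sensitivity figures, since F1 is the harmonic mean of the two. The sketch below uses only the numbers from the slide; the function and variable names are illustrative.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall)."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# (precision, sensitivity, reported F1) as given on the Model I slide.
model_1 = {
    "Single Tree": (0.626, 0.275, 0.383),
    "Random Trees": (0.503, 0.996, 0.669),
    "Naive Bayes": (0.621, 0.266, 0.373),
}

recomputed = {
    name: round(f1_score(p, s), 3) for name, (p, s, _) in model_1.items()
}
best_by_f1 = max(recomputed, key=recomputed.get)
```

The recomputed values agree with the reported ones to within rounding of the inputs, and Random Trees remains the highest. The same table also shows why F1 should be read alongside specificity: Random Trees pairs a sensitivity of 0.996 with a specificity of 0.023, which is the pattern of a model that labels almost every record as positive.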
Model II: fatality of accident
[Figure labels: Alcohol, Phone usage, HAZMAT]
Algorithm     Single Tree   Random Trees   Naïve Bayes
Precision     0.579         0.639          0.579
Sensitivity   0.790         0.383          0.543
Specificity   0.404         0.776          0.590
F1-Score      0.668         0.478          0.561
Best model: Single Tree, due to the highest
sensitivity and F1-score
Alcohol and phone usage are the top predictors for fatal accidents
Model III: geolocation for “likely” violations
● K-means clustering and Single Tree classification are the best algorithms
because they gave unbiased results
● Cluster 5 has the highest number of alcohol violations
and personal injuries
● Cluster 9 has the highest number of belt violations
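The clustering step behind these findings can be illustrated with a minimal k-means (Lloyd's algorithm) on coordinate pairs, followed by a per-cluster tally of one violation flag. This is a sketch under stated assumptions, not the project's XLMiner or R workflow: the coordinates and flags are invented, k is set to 2 for the toy data (the slides do not state the k that was used), and plain Euclidean distance on latitude/longitude is only a rough approximation of distance on the ground.

```python
import math
from collections import Counter


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical (latitude, longitude) pairs forming two separated groups.
points = [
    (39.00, -77.10), (39.20, -77.25), (39.01, -77.11),
    (39.21, -77.24), (39.02, -77.09), (39.19, -77.26),
]
# Hypothetical flag: whether each stop involved an alcohol violation.
alcohol = [True, False, True, False, False, True]

labels, centroids = kmeans(points, k=2)

# Count alcohol violations per cluster ID.
alcohol_per_cluster = Counter(lab for lab, flag in zip(labels, alcohol) if flag)
```

Once every record carries a cluster ID, the ID can serve as the target variable for a classifier such as a single decision tree, which is how the Modeling slide describes Model III.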
Recommendations
> Recommend that MCP step up enforcement of the rules on seat belts and phone usage,
the top predictors of a violation contributing to an accident
> Recommend that MCP pay extra attention to, and take extra measures against, drunk
driving and phone usage while driving
> Alert Maryland police to the areas where violations are most likely to occur:
>> high number of alcohol violations with personal injuries in cluster 5
>> multiple belt violations in cluster 9
> Make similar recommendations to the insurance companies in the state of
Maryland
Conclusion
● 5 visualizations were produced and 3 research hypotheses developed;
● The data was preprocessed into several datasets;
● 3 modeling techniques were applied to the hypotheses;
● Several classification, clustering and regression models were considered for
modeling: Single Tree, Random Trees, K-Means clustering, Multiple regression, etc.;
● Random Trees and Single Tree are the best algorithms for models 1 and 2 respectively,
because high sensitivity and a high F1-score were prioritized;
● K-means clustering and Single Tree classification were judged the best
algorithms for model 3, providing the numbers of different types of violations along with the
number of injuries for the various clusters;
● XLMiner, Tableau, and R were used for the analysis.
References
● Larose, D. T., & Larose, C. D. Discovering Knowledge in Data: An Introduction to Data Mining
(2nd ed.). Wiley. ISBN 978-0-470-90874-7.
● Abellan, J., Lopez, G., & De Oña, J. (2013). Analysis of traffic accident severity using Decision Rules via
Decision Trees. Expert Systems with Applications, 40, 6047–6054.
● Chang, L.-Y., & Chien, J.-T. (2015). Analysis of driver injury severity in truck-involved accidents using a non-
parametric classification tree model. Safety Science, 51(1), 17–22.
● Chen, W. H., & Jovanis, P. P. (2012). Method for identifying factors contributing to driver injury severity in
traffic crashes. Transportation Research Record, 1717, 1–9.
● Kashani, A. T., Rabieyan, R., & Besharati, M. M. (2014). A data mining approach to investigate the factors
influencing the crash severity of motorcycle pillion passengers. Journal of Safety Research, 51, 93–98.
● Kwon, O. H., Rhee, W., & Yoon, Y. (2015). Application of classification algorithms for analysis of road safety
risk factor dependencies. Accident Analysis and Prevention, 75, 1–15.
● Xie, Y., Zhang, Y., & Liang, F. (2009). Crash injury severity analysis using Bayesian ordered probit models.
Journal of Transportation Engineering ASCE, 135(1), 18–25.
● Mujalli, M. O., & de Oña, J. (2011). A method for simplifying the analysis of traffic accidents injury severity
on two-lane highways using Bayesian networks. Journal of Safety Research, 42, 317–326.
● De Oña, J., Lopez, G., & Abellan, J. (2013). Extracting decision rules from police accident reports through
decision trees. Accident Analysis & Prevention, 50, 1151–1160.
QUESTIONS ?


Editor's Notes

  1. Attribute definitions:
     ● Belts: whether the traffic violation involved a seat belt violation.
     ● Personal Injury: whether the traffic violation involved personal injury.
     ● Property Damage: whether the traffic violation involved property damage.
     ● Fatal: whether the traffic violation involved a fatality.
     ● HAZMAT: whether the traffic violation involved hazardous materials.
     ● Violation Type: the type of violation (examples: Warning, Citation, SERO).
     ● Geolocation: geo-coded location information.