Toronto Traffic Safety Analysis:
Unveiling Risk Factors in Fatal Collisions
Professor: Richard Boire
CAPSTONE DATA PROJECT
Vishal Sang 101458132
Yiran Hu 101442783
Agenda
Introduction
Approach
Analytical Results
Key Findings
Conclusions
Background
Overview of the Company:
• Toronto Police Service (TPS) mission: "To Serve and Protect."
• Role in maintaining public safety and upholding
the law in Toronto.
• Partnership with the community for safety, law
enforcement, and emergency services.
• Commitment to transparency and continuous
improvement.
Business Problem
Key Challenges of the Client:
• Data Overload
• Public Safety
• Community Engagement
• Resource Allocation
Key Expectations of the Client:
• Insightful Analysis
• Predictive Analytics
• Strategic Recommendations
• Community-Oriented Solutions
Identification of the Problem
Primary Problem:
TPS needs to analyze and interpret
extensive traffic collision data
effectively to enhance public safety.
Specific Challenges:
• Understanding Collision Patterns
• Addressing Road Safety Issues
• Data-Driven Decision Making
Approach
• Data Diagnostics
• Selected Variables
• Conversion to Numeric
• Correlation Analysis
Data Diagnostics
Missing Values
Data Diagnostics
Unique Values
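The two diagnostics named on these slides, missing values and unique values per column, can be sketched as follows. This is a minimal illustration and not the project's actual code: the sample rows are invented, the value "Local" is a made-up category, and treating `None` and blank strings as missing is an assumption about how the data marks gaps.

```python
def is_missing(value):
    """Assumption: None and blank strings count as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def column_diagnostics(rows):
    """Return {column: {"missing": n, "unique": n}} for a list of dict rows."""
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    report = {}
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if not is_missing(v)]
        report[name] = {
            "missing": len(values) - len(present),
            "unique": len(set(present)),
        }
    return report


# Invented sample rows, for illustration only.
sample_rows = [
    {"ROAD_CLASS": "Major Arterial", "LIGHT": "Daylight", "ALCOHOL": "Yes"},
    {"ROAD_CLASS": "Local", "LIGHT": "", "ALCOHOL": None},
    {"ROAD_CLASS": "Major Arterial", "LIGHT": "Dark", "ALCOHOL": None},
]

diagnostics = column_diagnostics(sample_rows)
for column, stats in diagnostics.items():
    print(column, stats)
```

Columns with many missing entries or very few unique values are the ones to review before selecting variables.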
Frequency Distribution Reports
• Fatal collisions recur on specific
streets such as "LAWRENCE AVE E,"
"FINCH AVE W," and "EGLINTON AVE E,"
marking them as high-risk locations.
• Pedestrians (517 incidents) are notably
vulnerable road users in fatal collisions.
• Contributing factors include
speeding (187 incidents),
aggressive driving (427
incidents), and alcohol
involvement (44 incidents).
• Temporal patterns across 846
unique dates and 629 unique
times emphasize the diverse
circumstances of fatal
collisions.
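A frequency distribution report of this kind amounts to counting how often each value occurs in a column. The sketch below uses Python's `collections.Counter`. The column names `STREET1` and `INVTYPE` and the sample records are assumptions made for illustration, not values taken from the project data.

```python
from collections import Counter


def frequency_report(rows, column):
    """Return (value, count) pairs for one column, most frequent first."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    return counts.most_common()


# Invented records; column names are assumed, not confirmed by the slides.
records = [
    {"STREET1": "FINCH AVE W", "INVTYPE": "Pedestrian"},
    {"STREET1": "LAWRENCE AVE E", "INVTYPE": "Driver"},
    {"STREET1": "FINCH AVE W", "INVTYPE": "Pedestrian"},
    {"STREET1": "EGLINTON AVE E", "INVTYPE": "Pedestrian"},
]

street_counts = frequency_report(records, "STREET1")
involvement_counts = frequency_report(records, "INVTYPE")
print(street_counts)
print(involvement_counts)
```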
Appendix
Derived Variables
Selected Variables
Target variable: SPEEDING, because collisions involving
speeding are more likely to be severe.
Other variables:
TIME, INVAGE, ROAD_CLASS, TRAFFCTL, RDSFCOND,
LIGHT, ALCOHOL
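The "Conversion to Numeric" step in the approach can be sketched as below, assuming categorical columns are mapped to integer codes and yes/no flags to 1/0. The helper names, the "Yes" marker, and the sample values are hypothetical, and the codes produced here are illustrative: they will not necessarily match the codes shown on the later slides.

```python
def encode_column(values):
    """Map each distinct non-missing category to an integer code.

    Codes start at 1 and follow sorted category order; missing values
    (None) stay None. Returns (codes, mapping).
    """
    categories = sorted({v for v in values if v is not None})
    mapping = {cat: code for code, cat in enumerate(categories, start=1)}
    return [mapping.get(v) for v in values], mapping


def encode_flag(values, positive="Yes"):
    """Turn a flag column into 1/0; anything other than `positive` is 0.

    Assumption: the positive marker is the string "Yes".
    """
    return [1 if v == positive else 0 for v in values]


# Invented sample values, for illustration only.
light_raw = ["Daylight", "Dark", None, "Daylight"]
speeding_raw = ["Yes", None, None, "Yes"]

light_codes, light_mapping = encode_column(light_raw)
speeding_codes = encode_flag(speeding_raw)
print(light_codes, light_mapping)
print(speeding_codes)
```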
Correlation
• ALCOHOL: a positive correlation
• INVAGE: a significant negative correlation
• Other variables: a smaller negative correlation
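The slides do not say which correlation measure was used. As one plausible reading, the sketch below computes the Pearson correlation between a 0/1 target and each numeric feature. The sample values are invented and chosen only so that the signs match the pattern described above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented values, for illustration only.
speeding = [1, 1, 0, 0, 1, 0]
alcohol = [1, 0, 0, 0, 1, 0]
invage = [22, 25, 60, 45, 30, 52]

r_alcohol = pearson(speeding, alcohol)
r_invage = pearson(speeding, invage)
print(round(r_alcohol, 3), round(r_invage, 3))
```

A positive coefficient means the feature tends to rise with the target; a negative one means it tends to fall. Correlation alone does not establish that a feature causes speeding.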
The Exploratory Data Analysis
Traffic Control
Certain types of traffic control are more commonly
associated with collisions.
Code 1.0 refers to No Traffic Control and 5.0 refers to
Traffic Signal.
The largest number of collisions occurred at
locations with no traffic control.
The Exploratory Data Analysis
INVAGE
• The age distribution shows a wide range,
indicating that individuals of various ages are
involved in collisions.
• There appears to be a higher frequency of
younger individuals involved in speeding and
collisions.
The Exploratory Data Analysis
Road Class
• The distribution across different road classes shows a varied number of
collisions associated with each class. This variation can be indicative of
the traffic volume, road conditions, or other factors specific to each
road class that may influence the occurrence of speeding-related
accidents.
• 1.0 refers to Expressway Ramp
• 6.0 refers to Major Arterial
• The Major Arterial road class has a noticeably higher count of
collisions, suggesting that it might be more prone
to accidents or carry higher traffic flow.
The Exploratory Data Analysis
LIGHT
The distribution across different light conditions
shows variability in collision occurrences.
1.0 refers to Dark conditions
4.0 refers to Daylight conditions
Most collisions happen in Daylight and Dark
conditions.
The Exploratory Data Analysis
Alcohol
• The majority of collisions did not involve alcohol,
as indicated by the higher count for '0' (No
Alcohol Involvement).
• However, there is still a notable number of
incidents where alcohol was involved,
highlighting its significance in traffic accidents.
ROC Curve
ROC Curve Overview:
• Shows how well the model distinguishes
instances with and without speeding.
AUC Score Significance:
• Quantifies overall model performance;
higher values indicate better discrimination.
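To make the ROC curve and AUC concrete, the sketch below sweeps a threshold over the predicted scores, records the false positive rate and true positive rate at each step, and integrates the curve with the trapezoid rule. The labels and scores are invented; the project's model and its actual predictions are not shown in the slides.

```python
def roc_curve_points(labels, scores):
    """Return ROC points (FPR, TPR), one per distinct score threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc_trapezoid(points):
    """Area under a curve given as (x, y) points sorted by x."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


# Invented labels (1 = speeding) and model scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_curve_points(labels, scores)
auc = auc_trapezoid(points)
print(points)
print(round(auc, 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.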
Decile/Gains Chart
• Reveals how accurately the model ranks instances
by predicted probabilities.
• Divides the dataset into deciles, showing
cumulative true positive percentages for each.
• Visualizes alignment between model predictions
and actual instances of speeding.
• Steep increases indicate the model effectively
identifies a higher percentage of positive instances
within subsets.
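The decile/gains calculation can be sketched as follows, assuming instances are sorted by predicted probability from highest to lowest and split into ten equal groups. The function name and the sample data are hypothetical.

```python
def cumulative_gains(labels, scores, n_bins=10):
    """Cumulative share of all positives captured by the top 1..n_bins bins.

    Instances are ranked by score, highest first, then cut into n_bins
    groups of (nearly) equal size.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    total_pos = sum(ranked)
    if total_pos == 0:
        raise ValueError("no positive instances to capture")
    n = len(ranked)
    gains = []
    for b in range(1, n_bins + 1):
        cutoff = (b * n) // n_bins  # number of rows in the top b bins
        gains.append(sum(ranked[:cutoff]) / total_pos)
    return gains


# Invented data: 20 instances, 5 of them positive (1 = speeding).
scores = [(20 - i) / 20 for i in range(20)]
labels = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0] + [0] * 10

gains = cumulative_gains(labels, scores)
for decile, share in enumerate(gains, start=1):
    print(decile, round(share, 2))
```

In this toy example the top three deciles already capture 80% of the positives, which is the steep early rise a gains chart is meant to reveal.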
Key Findings
Identified influential features: ‘TRAFFCTL’ and ‘ROAD_CLASS’ play crucial roles
in predicting speeding and collision incidents.
Model accuracy: Achieved an overall accuracy of approximately 77.54% on
the test set.
ROC Curve and AUC Score: The ROC curve and AUC score (around 0.80)
demonstrate good discriminative power in distinguishing speeding incidents.
Decile/Gains Chart: Indicates the model's effective ranking of instances, with
a cumulative increase in true positive rate across deciles.
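For reference, overall accuracy is the share of test instances whose predicted class matches the actual class. The sketch below assumes predicted probabilities are turned into classes with a 0.5 threshold; the slides do not state the threshold, and the data is invented.

```python
def accuracy_at_threshold(labels, scores, threshold=0.5):
    """Fraction of instances classified correctly at the given threshold."""
    if not labels or len(labels) != len(scores):
        raise ValueError("labels and scores must be non-empty and equal length")
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Invented labels (1 = speeding) and scores, for illustration only.
test_labels = [1, 0, 1, 0, 0]
test_scores = [0.8, 0.3, 0.4, 0.6, 0.1]

accuracy = accuracy_at_threshold(test_labels, test_scores)
print(accuracy)
```

When one class is much rarer than the other, accuracy can look high even for a weak model, so it is best read together with the AUC and the gains chart.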
Conclusion
• Identified influential features: ‘TRAFFCTL’ and
‘ROAD_CLASS’.
• Suggested model refinement for improved accuracy.
• Emphasized ongoing validation, ethical deployment
considerations, and stakeholder collaboration for an
impactful solution in predicting and preventing speeding
and collision incidents.
Thank you
