Driving Style Analysis
based on Trip Segmentation.
A Comparative Multi-Technique Approach
Marco Brambilla, Andrea Mauri, Paolo Mascetti
@marcobrambi
Agenda
Intro
Problem Definition
Dataset
Data Exploration and Preliminaries
Trip Segmentation Techniques
Validation
Conclusions
Intro: Relevance
1.24 million traffic-related fatalities occur annually
worldwide
Currently the leading cause of death for people aged
between 15 and 29 years
Majority of cases due to improper or risky driving
behavior
Source: World Health Organisation (WHO)
Intro: Driving Process
Driving Process: driving
a car is a complex task
that requires taking
informed decisions
based on information
at different
levels, such as the driver's own
state and other drivers’
behavior.
Intro: Relevant Information
Vehicle’s Status
Contextual Info
• Road State
• Weather Conditions
• Traffic Info
• Road Risk
• Traffic
Problem Statement
Data-driven driver profiling
with respect to driving risk
Essentially: Multivariate Time Series Segmentation
Application scenarios in insurance, promoting
pay-how-you-drive (PHYD) business models
State of the Art and Challenges
State of the art: many works on identification and
recognition of behavioural patterns (line following,
accelerations, braking, etc.) and maneuver
recognition, behavioural scoring, prediction of
driver intentions.
Supervised learning techniques require an intensive
and expensive data gathering process.
Proposed Solution
Unsupervised techniques to profile drivers'
behaviour based on recurrent patterns identified
through driving path segmentation
Comparison of 3 different approaches and use of
all of them for consolidated results
1. Unsupervised Segmentation Based on Clustering
2. Unsupervised Segmentation Based on HMM
3. Unsupervised Topic Extraction
Contextual Scenes
Observed driving behaviours that are
repeated in each driver's behaviour and
also across different drivers.
A reduced representation of the original
Multivariate Time Series conveying a
simplified characterization
Further reasoning is then applied
ETL Process
3 Steps:
Extract: read collected files and select candidate features
Transform:
Filter and Grouping
Features computation
Load: produce a unique dataset
Diagram: Trip File.csv -> Extract -> PreProcessing / Transform -> Load -> Global dataset.csv
Datasets
Collection devices:
Xsens MTi-G-710 (27 users)
and cell phones (10 users)
Retrieved signals:
Acceleration measurements
Altitude
GPS positioning
Speed
Orientation
Mounted in-vehicle aligned with
direction of movement.
No Ground truth knowledge
Features Selected
Acceleration (on Y and X axes),
Speed (on Y and X axes)
Difference in yaw
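The slides do not say how the difference in yaw is computed. A minimal sketch, assuming yaw is sampled in degrees and that the feature is the change between consecutive samples (the function name `yaw_differences` is hypothetical):

```python
def yaw_differences(yaw_deg):
    """Return the change in yaw between consecutive samples, in degrees.

    Assumption: yaw is given in degrees. Each difference is wrapped into
    [-180, 180) so that a small turn across the 359 -> 0 boundary is not
    mistaken for a near-full rotation.
    """
    diffs = []
    for prev, curr in zip(yaw_deg, yaw_deg[1:]):
        delta = (curr - prev + 180.0) % 360.0 - 180.0
        diffs.append(delta)
    return diffs


if __name__ == "__main__":
    # A 20-degree turn across the wrap-around, then a 5-degree turn back.
    print(yaw_differences([350.0, 10.0, 5.0]))
```

Wrapping matters here because an unwrapped difference of -340 degrees would look like an extreme maneuver to any downstream clustering step.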
Pre-Analysis 1: Data Exploration
Pre-Analysis 2: Application of Existing
Driving Safety Analyses
Vaiana et al. propose a Driving Safety Diagram based on the analysis of longitudinal and
lateral accelerations.
Aggressiveness Index formulation:
(A = Aggressive, S = Safe points)
Graphical representation:
DP-Means
1. Unsupervised Segmentation Based
on DP-Means Clustering
Problem: Bayesian nonparametric techniques require expensive sampling methods or
variational techniques.
DP-means: proposed by Kulis et al., revisiting k-means: k-means-like objective function +
penalty on the number of clusters
A new cluster is created whenever a point is farther than λ away from every existing centroid.
Note:
Clustering results depend on data ordering.
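The cluster-creation rule above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code used in the study: it thresholds plain Euclidean distance, following the slide's wording (formulations that threshold the squared distance differ only in the scale of λ), and the names `dp_means`, `lam` and `max_iter` are hypothetical.

```python
import math


def dp_means(points, lam, max_iter=100):
    """Cluster equal-length tuples with a DP-means style procedure.

    Returns (centroids, labels). Points are visited in the given order,
    which is why the result can depend on data ordering.
    """
    dim = len(points[0])
    n = len(points)
    # Start from a single cluster located at the global mean.
    centroids = [tuple(sum(p[d] for p in points) / n for d in range(dim))]
    labels = [0] * n
    for _ in range(max_iter):
        changed = False
        # Assignment pass: nearest centroid, or a new cluster if too far.
        for i, p in enumerate(points):
            dists = [math.dist(p, c) for c in centroids]
            k = min(range(len(centroids)), key=dists.__getitem__)
            if dists[k] > lam:
                centroids.append(tuple(p))
                k = len(centroids) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Update pass: drop empty clusters and recompute the means.
        used = sorted(set(labels))
        remap = {old: new for new, old in enumerate(used)}
        labels = [remap[label] for label in labels]
        centroids = []
        for k in range(len(used)):
            members = [p for p, label in zip(points, labels) if label == k]
            centroids.append(
                tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
            )
        if not changed:
            break
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
    centroids, labels = dp_means(data, lam=2.0)
    print(len(centroids), labels)
```

Unlike k-means, the number of clusters is not an input: it grows with the data and is controlled only through λ.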
Clusters
Silhouette
Results
Centroids
Distribution of features across
clusters
Trip Segmentation Examples
Hidden Markov Models
Unsupervised Segmentation based on
HMM
Goal: identify latent structure given observed data points,
assuming the existence of Gaussian hidden states.
Assign to each observed point the corresponding hidden state.
Hidden Markov Models (HMM):
Observations and hidden states
Markovian properties
Continuous observations
Unsupervised Segmentation based on
HMM
Training:
Baum-Welch EM algorithm to learn model parameters
Decoding:
Viterbi decoding to assign to each observed point the most
likely hidden state
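The decoding step can be illustrated with a small Viterbi implementation. This is a sketch under simplifying assumptions: observations are one-dimensional, each hidden state emits from a single Gaussian, and the model parameters are taken as given (in the slides they would come from Baum-Welch training). All names are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi(obs, start_p, trans_p, means, variances):
    """Return the most likely hidden-state sequence for obs.

    Works in the log domain; all start and transition probabilities
    must be strictly positive so that their logarithms exist.
    """
    n_states = len(start_p)
    # Best log-score of any path ending in each state at the first sample.
    score = [
        math.log(start_p[s]) + gaussian_logpdf(obs[0], means[s], variances[s])
        for s in range(n_states)
    ]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in obs[1:]:
        prev = score
        score, pointers = [], []
        for s in range(n_states):
            best = max(
                range(n_states),
                key=lambda r: prev[r] + math.log(trans_p[r][s]),
            )
            pointers.append(best)
            score.append(
                prev[best]
                + math.log(trans_p[best][s])
                + gaussian_logpdf(x, means[s], variances[s])
            )
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    obs = [0.1, -0.2, 0.0, 5.1, 4.9, 5.0]
    path = viterbi(
        obs,
        start_p=[0.5, 0.5],
        trans_p=[[0.9, 0.1], [0.1, 0.9]],
        means=[0.0, 5.0],
        variances=[1.0, 1.0],
    )
    print(path)
```

Consecutive runs of the same decoded state are what the approach treats as trip segments.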
HMM Results
A variation was also applied, the inertial HMM: lower transition
probabilities between different states enforce state persistence, which is sensible for driving.
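To illustrate the effect of state persistence, the sketch below blends each row of a transition matrix with the identity, which shifts probability mass onto the self-transitions. This is only a post-hoc illustration of what "sticky" transitions look like; it is not the inertial HMM training procedure, which the slides do not detail. The function name `make_sticky` and the parameter `stickiness` are hypothetical.

```python
def make_sticky(trans_p, stickiness):
    """Return a transition matrix with extra weight on self-transitions.

    Each row becomes (1 - stickiness) * row + stickiness * identity_row,
    so rows still sum to one. stickiness must lie in [0, 1].
    """
    n = len(trans_p)
    return [
        [
            (1.0 - stickiness) * trans_p[i][j] + (stickiness if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    uniform = [[0.5, 0.5], [0.5, 0.5]]
    for row in make_sticky(uniform, 0.8):
        print(row)
```

With higher self-transition probabilities, a decoder switches state only when the observations clearly support it, which yields longer and less fragmented segments.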
HMM Results
Clusters as hidden states.
Example of Trip Segmentation
Topic Extraction
Topic Extraction Approach
What is topic extraction?
Modeling the topical concepts that occur in a set of textual documents.
Data are described as documents, and the components are distributions of
terms that reflect recurring patterns, named topics.
Hierarchical Dirichlet Processes (HDPs)
Soft-clustering technique based on non-parametric Bayesian theory.
The number of topics is not set a priori, but learned from data.
Posterior probability approximated by the Variational Inference algorithm by
Wang et al.
Results:
Most relevant topics for each document and terms distribution in each topic.
Topic Extraction Process
Data Quantization
Documents creation
Topics Extraction
Topics Evaluation
Quantization – Binning Process
with static binning strategy
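The quantization and document-creation steps can be sketched as follows. The slides do not specify the bin edges or how terms are formed, so everything here is an assumption for illustration: fixed (static) edges per feature, one "word" per sample built from the bin indices, and one "document" per trip. The feature names and edge values are hypothetical.

```python
import bisect


def to_word(sample, edges_by_feature):
    """Turn one multivariate sample into a discrete term.

    Each feature value is replaced by the index of the static bin it
    falls into, and the per-feature tokens are joined into one word.
    """
    tokens = []
    for name in sorted(sample):
        bin_idx = bisect.bisect_right(edges_by_feature[name], sample[name])
        tokens.append(f"{name}{bin_idx}")
    return "_".join(tokens)


def trip_to_document(samples, edges_by_feature):
    """Represent a trip as the sequence of words of its samples."""
    return [to_word(s, edges_by_feature) for s in samples]


if __name__ == "__main__":
    # Hypothetical static bin edges, fixed before seeing the data.
    edges = {"acc": [-1.0, 1.0], "speed": [30.0, 60.0]}
    trip = [
        {"acc": 0.2, "speed": 45.0},
        {"acc": -2.0, "speed": 80.0},
    ]
    print(trip_to_document(trip, edges))
```

Documents in this form can then be passed to a topic model, where each topic is a distribution over such words.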
Documents
Terms Relevance on Top 7 Topics
Linguist…
Terms Relevance on Top 7 Topics
… and data analyst perspectives
…
Comparison and Validation
Big Issue: How to Compare?
1) Point-to-point or point distribution
2) Resulting grouping of trips
3) Perceived user similarity of trips
Solution 1: Point-to-Point
Overlap of clusters? Per trip? Overall?
Solution 2: Moving from Points to
Trips
Can we cluster trips based on how observation points have
been clustered?
-> Simple K-means clustering of trips for each approach.
-> Comparison of overlap of the different clusters
Coherent with original question: grouping of trips (and thus
drivers) by driving behavior
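One way to move from points to trips is to describe each trip by how its observation points are distributed over the segment labels. The slides do not state the exact trip representation, so this is an assumed one; `trip_profile` and `n_segments` are hypothetical names.

```python
from collections import Counter


def trip_profile(point_labels, n_segments):
    """Return the fraction of a trip's points assigned to each segment label.

    point_labels holds one integer label per observation point, as
    produced by any of the three segmentation approaches.
    """
    counts = Counter(point_labels)
    total = len(point_labels)
    return [counts.get(k, 0) / total for k in range(n_segments)]


if __name__ == "__main__":
    labels_for_one_trip = [0, 0, 1, 2, 2, 2, 2, 0]
    print(trip_profile(labels_for_one_trip, n_segments=4))
```

Each trip then becomes a fixed-length vector regardless of its duration, which is the kind of input K-means expects.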
Result of overlap analysis
K-means with K=6 clusters.
DP-means vs. HMM: 74% overlap
DP-means vs. Topic: 44%
HMM vs. Topic: 48%
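The slides do not define how overlap is measured. A plausible sketch, assuming overlap is the share of trips on which two clusterings agree after the cluster labels of one are matched to the other in the best possible way (the function name `best_overlap` is hypothetical):

```python
from itertools import permutations


def best_overlap(labels_a, labels_b, k):
    """Return the maximum agreement between two clusterings of the same trips.

    Cluster identifiers are arbitrary, so every one-to-one matching of
    the k labels is tried and the best agreement rate is returned.
    Brute force is fine for small k (720 matchings for k = 6).
    """
    # contingency[i][j] = trips in cluster i of A and cluster j of B
    contingency = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        contingency[a][b] += 1
    best = max(
        sum(contingency[i][perm[i]] for i in range(k))
        for perm in permutations(range(k))
    )
    return best / len(labels_a)


if __name__ == "__main__":
    a = [0, 0, 1, 1, 2, 2]
    b = [1, 1, 0, 0, 2, 0]
    print(best_overlap(a, b, k=3))
```

For larger k, an assignment solver such as the Hungarian algorithm would replace the exhaustive search.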
Human Validation of Trip Groups
Experts (knowledgeable about driving styles and driving
paths recorded) identify possible groups of trips in the
dataset
Problem:
- Unable to distinguish 6 categories of groups
- Only 3 categories are feasible
- Best matching 6 -> 3 categories for each method
Results
Conclusions
Three different clustering techniques of driving
behavior over trips
-> segmentation
Clustering of trips based on behavior
-> up to 74% overlap over 6 clusters
-> 100% overlap over 3 clusters
User Validation
-> 96% precision over 3 clusters
Future Work
About collection process:
Gathering process including contextual information (road
risk, traffic status, weather conditions)
Larger dataset to improve inference performance
About implemented methods:
Smarter data ordering for DP-means
Relax independence assumption in HMM
Improvements in data discretization process for HDP
Marco Brambilla, @marcobrambi, marco.brambilla@polimi.it
http://datascience.deib.polimi.it
Thanks! Questions?
