
Application of Frequent Pattern Mining to Traffic Accident Data

"An application of frequent pattern mining to traffic accident data"

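These are the slides I presented at the Society of Automotive Engineers of Japan (自動車技術会) 2016 Spring Congress at Pacifico Yokohama, in the session "Accident Analysis and Safety Measures I".
I have added back the slides that I reluctantly hid because the talk would otherwise have run over time.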

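Q&A
Q. Was the accident data labeled by people?
A. Police officers do the labeling.

Q. In the third analysis, why not try something other than correlation? (The relationship is not necessarily linear.)
A. That is a fair point (it had not occurred to me). Extracting the patterns whose support increases or decreases the most would also be worth trying.

Q. Could this handle big data that incorporates sensor data?
A. For numerical data, a different method seems preferable. It could work if the sensor data is processed to extract label data (for example, the severity of the crash impact or the specific damage to the vehicle).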

Published in: Engineering


  1. 1. An application of frequent pattern mining to traffic accident data. Yuta Takahashi 1), Masaru Kiyota 1), Yukuo Hayashida 1), Yuichiro Sakamoto 1). 1) Saga University
  2. 2. Background • For four years (2012 to 2015), Saga has had the highest number of traffic accidents per hundred thousand in Japan, over 2 times the average. • The cause has not been clearly identified because of its complexity. [Figure: number of traffic accidents (per hundred thousand) by year, 2012 to 2015; series: Saga, Average]
  3. 3. PDCA cycle for preventing accidents • Analyze the causes of accidents • Propose a prevention method • Take measures • Verify the effectiveness of the measures • Consider what the measures lack • Schedule the next measures [Diagram: PDCA cycle (Plan, Do, Check, Action)]
  4. 4. PDCA cycle for preventing accidents [Diagram: PDCA cycle (Plan, Do, Check, Action)] Procedure of “Plan”: select feature points; analyze accident causes; plan measures; research entities. Position of this study (marked on the diagram)
  5. 5. Problem with previous analyses • Ordinarily, we make many tables and graphs ⇒ making and checking them is costly • We attempt to introduce a data mining approach
  6. 6. Data mining • A method for gaining knowledge from large amounts of data • It differs from statistical analysis • Performance takes priority over strictness • Methods: SOM, clustering, k-means, association rule analysis, neural network, decision tree, SVM, Bayesian network • ① Learning ② Extracting patterns
  7. 7. Related works: machine learning approach. Dataset → classifier → model → patterns. Classes: fatal injury, incapacitating injury, no injury, … Methods: decision tree, neural network, DT & NN. Chong, Miao, et al. "Traffic accident data mining using machine learning paradigms." Fourth International Conference on Intelligent Systems Design and Applications (ISDA'04).
  8. 8. Related works: clustering approach. Dataset → clustering → clusters C1, C2, C3, …, C7 → Pattern 1, Pattern 2, Pattern 3, …, Pattern 7. Depaire, Benoît, Geert Wets, and Koen Vanhoof. "Traffic accident segmentation by means of latent class clustering." Accident Analysis & Prevention 40.4 (2008): 1257-1266.
  9. 9. Frequent pattern mining • A method for finding frequent items in a dataset • It can count item combinations efficiently • $\mathrm{support}(X) = \dfrac{\mathrm{count}(X)}{M}$, where $X$ is a pattern and $M$ is the number of transactions • A minsup parameter is specified, and patterns whose $\mathrm{support}(X)$ is below minsup are eliminated • We expect this to make pattern extraction easy
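The support definition and minsup filter on slide 9 can be sketched in a few lines of Python. This is only an illustration: the toy transactions, the candidate patterns, and the `minsup` value are hypothetical and are not taken from the accident data.

```python
def support(pattern, transactions):
    """support(X) = count(X) / M, where count(X) is the number of
    transactions that contain every item of X and M is the total."""
    items = set(pattern)
    count = sum(1 for t in transactions if items <= set(t))
    return count / len(transactions)


# Hypothetical toy records (not from the accident data).
transactions = [
    {"Sunny", "Weekday", "Intersection"},
    {"Sunny", "Weekday"},
    {"Rainy", "Holiday", "Intersection"},
    {"Sunny", "Holiday"},
]

# Hypothetical candidate patterns and threshold.
candidates = [{"Sunny"}, {"Rainy"}, {"Sunny", "Weekday"}, {"Sunny", "Intersection"}]
minsup = 0.5

# Patterns whose support is below minsup are eliminated.
kept = [p for p in candidates if support(p, transactions) >= minsup]
```

Here the candidates are listed by hand; an algorithm such as Apriori generates them level by level instead.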
  10. 10. Frequent pattern mining vs statistical analysis

      | | Frequent pattern mining | Statistical analysis |
      |---|---|---|
      | Strictness | △ | ○ |
      | Analysis cost | ○ | △ |
      | Finding knowledge | ○ | × |
  11. 11. Purpose of this study: apply frequent pattern mining to traffic accident data - Is the analysis efficient? - Can it yield knowledge? Data & analysis - Traffic accident data: 45,653 records - Algorithm: Apriori - Programming language: Python
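Slide 11 names Apriori and Python but shows no code. The following is a minimal level-wise Apriori sketch for illustration only, not the implementation used in the study; the function name `apriori`, the toy records, and the `minsup` value are assumptions.

```python
from itertools import combinations


def apriori(transactions, minsup):
    """Level-wise Apriori sketch: return {frozenset(pattern): support}."""
    rows = [frozenset(t) for t in transactions]
    m = len(rows)

    def support(pattern):
        return sum(1 for row in rows if pattern <= row) / m

    # Level 1: frequent single items.
    current = {}
    for item in {item for row in rows for item in row}:
        pattern = frozenset([item])
        s = support(pattern)
        if s >= minsup:
            current[pattern] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unions of frequent (k-1)-patterns that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= minsup:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy records (not from the accident data).
transactions = [
    ["Sunny", "Weekday", "Intersection"],
    ["Sunny", "Weekday"],
    ["Rainy", "Holiday", "Intersection"],
    ["Sunny", "Holiday"],
]
result = apriori(transactions, minsup=0.5)
```

The prune step is what keeps the counting efficient: a combination is only counted if all of its smaller sub-combinations already passed minsup.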
  12. 12. Three types of analysis with pattern mining: ① Intersection ⇒ time rules; ② Vehicle type ⇒ emerging patterns, similarity; ③ Age ⇒ correlation
  13. 13. ① Time rules in each intersection. Intersection analysis: the intersection names are labeled by the police • Prediction of occurrence time • Time-related causes. Association rule: for a rule $X \Rightarrow Y$ (for example, Sunny ⇒ Road is dry), $\mathrm{support}(X \Rightarrow Y) = \dfrac{\mathrm{count}(X \cup Y)}{M}$
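As a rough illustration of the rule support on slide 13, the sketch below groups records by intersection and computes the support of one rule within each group. The intersection names "A" and "B", the records, and the helper name `rule_support` are hypothetical.

```python
from collections import defaultdict


def rule_support(x, y, transactions):
    """support(X => Y) = count(X ∪ Y) / M within the given transactions."""
    both = set(x) | set(y)
    count = sum(1 for t in transactions if both <= set(t))
    return count / len(transactions)


# Hypothetical records: (intersection name, set of time labels).
records = [
    ("A", {"Weekday", "12:00-17:00", "Summer"}),
    ("A", {"Weekday", "12:00-17:00", "Winter"}),
    ("A", {"Holiday", "6:00-11:00", "Summer"}),
    ("B", {"Weekday", "18:00-23:00", "Autumn"}),
    ("B", {"Holiday", "12:00-17:00", "Autumn"}),
]

# Split the dataset by intersection, then evaluate the rule per intersection.
by_intersection = defaultdict(list)
for name, labels in records:
    by_intersection[name].append(labels)

x, y = {"Weekday"}, {"12:00-17:00"}
supports = {
    name: rule_support(x, y, rows) for name, rows in by_intersection.items()
}
```

Computing the support separately per intersection is what allows rules to be ranked by intersection.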
  14. 14. Flow of the analysis: Dataset → Intersection 1, Intersection 2, …, Intersection N → Patterns 1, Patterns 2, …, Patterns N → Rules 1, Rules 2, …, Rules N → All rules. Time span: 1. Long span (season, weekday or holiday, 6-hour time slots) 2. Short span (season, day of the week, 3-hour time slots)
  15. 15. Result of rules (Long span)

      | Rule | Intersection | Support |
      |---|---|---|
      | {Weekday} ⇒ {12:00-17:00} | 下村 | 0.588 |
      | {Autumn} ⇒ {Weekday} | 水上交差点 | 0.583 |
      | {Weekday} ⇒ {12:00-17:00} | 県庁前交差点 | 0.571 |
      | {Summer} ⇒ {Weekday} | 愛敬町 | 0.545 |
      | {Winter} ⇒ {Weekday} | 満穴 | 0.529 |
      | {Summer} ⇒ {Weekday} | 唐房入口交差点 | 0.500 |
      | {Summer} ⇒ {Weekday} | 伊勢町 | 0.500 |
      | {Weekday} ⇒ {12:00-17:00} | ハローワーク唐津入口 | 0.500 |
      | {Summer} ⇒ {Weekday} | 栄町北 | 0.474 |
      | {Autumn} ⇒ {Weekday} | 村徳永 | 0.462 |
  16. 16. Result of rules (Long span)

      | Rule | Intersection | Support |
      |---|---|---|
      | {Winter} ⇒ {Holiday} | 幡崎東 | 0.455 |
      | {Holiday} ⇒ {12:00-17:00} | 枯木の塔 | 0.364 |
      | {Holiday} ⇒ {12:00-17:00} | 長瀬交差点(信号) | 0.364 |
      | {Autumn} ⇒ {Holiday} | 神埼市役所前 | 0.357 |
      | {Holiday} ⇒ {6:00-11:00} | 五条(北) | 0.333 |
  17. 17. Result of rules

      | Rule | Intersection | Support |
      |---|---|---|
      | {Spring} ⇒ {Friday} | 千布北交差点(信号) | 0.364 |
      | {Saturday} ⇒ {12:00-14:00} | 鏡山入口交差点 | 0.364 |
      | {Winter} ⇒ {Saturday} | 幡崎東 | 0.364 |
      | {Winter} ⇒ {Monday} | 龍谷短大入口 | 0.350 |
      | {Winter} ⇒ {Sunday} | 脇田 | 0.333 |
      | {Summer} ⇒ {12:00-14:00} | 浜玉中学校前交差点 | 0.313 |
      | {Spring} ⇒ {18:00-21:00} | 中副交差点(R385) | 0.308 |
      | {Monday} ⇒ {12:00-14:00} | 東多久駅前 | 0.286 |
      | {Winter} ⇒ {15:00-17:00} | 県庁前交差点 | 0.286 |
      | {Summer} ⇒ {Saturday} | 材木町浦島通り | 0.273 |
  18. 18. ② Vehicle type analysis. Road structure: extract road structure patterns by comparing vehicle types; determine the degree of similarity by comparing the patterns. [Diagram: perpetrator (Car) and victim (Walker, Bicycle, Car, Auto bicycle)]
  19. 19. Flow of the analysis: Dataset → Walker, Bicycle, Car, Auto bicycle → Patterns 1, Patterns 2, Patterns 3, Patterns 4 → All patterns → Similarity, Emerging patterns
  20. 20. How to determine emerging patterns (groups: Walker, Bicycle, Car, Auto bicycle). Growth rate: $GR_G(e) = \dfrac{\mathrm{support}_G(e)}{\frac{1}{N}\sum_{i=0}^{N} \mathrm{support}_{G_i}(e)}$. A pattern $e$ with $GR_G(e) \ge 1.0$ is an emerging pattern.
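The growth rate on slide 20 divides a pattern's support in one group by its mean support across the groups. The sketch below assumes the mean is taken over all groups including the group itself, and that a pattern missing from a group has support 0; both are assumptions, and the support values are hypothetical.

```python
def growth_rate(pattern, group, supports):
    """GR_G(e): support of e in group G divided by the mean support of e
    over all groups. `supports` maps group -> {pattern: support}."""
    values = [supports[g].get(pattern, 0.0) for g in supports]
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0  # pattern never occurs, so it cannot be emerging
    return supports[group].get(pattern, 0.0) / mean


# Hypothetical per-group supports (not from the accident data).
supports = {
    "Walker": {"No traffic light": 0.5, "Intersection": 0.25},
    "Bicycle": {"No traffic light": 0.25, "Intersection": 0.5},
    "Car": {"No traffic light": 0.125, "Intersection": 0.5},
    "Auto bicycle": {"No traffic light": 0.125, "Intersection": 0.75},
}

patterns = ["No traffic light", "Intersection"]
# Emerging patterns for a group are those with a growth rate of at least 1.0.
emerging_walker = [
    e for e in patterns if growth_rate(e, "Walker", supports) >= 1.0
]
```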
  21. 21. Growth rate of each vehicle type
  22. 22. Elements of the emerging pattern (groups: Walker, Bicycle, Car, Auto bicycle) • No traffic light • Median divider (paint) • Inside of intersection • Single road, 1-lane road • Near intersection • Pedestrian-vehicle separation present • Median divider (paint) • General traffic location • No pedestrian-vehicle separation • No traffic light • Intersection • No median divider
  23. 23. Degree of similarity between vehicle types

      | | Walker | Bicycle | Car | Auto bicycle |
      |---|---|---|---|---|
      | Walker | - | 0.129 | 0.129 | 0.112 |
      | Bicycle | 0.135 | - | 0.137 | 0.056 |
      | Car | 0.129 | 0.137 | - | 0.087 |
      | Auto bicycle | 0.112 | 0.056 | 0.087 | - |

      $Si(G_1, G_2) = \dfrac{1}{N}\sum_{i=0}^{N} \left| \mathrm{support}_{G_1}(e_i) - \mathrm{support}_{G_2}(e_i) \right|$. $Si \to 0.0$: similar; $Si \to 1.0$: not similar.
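The degree of similarity on slide 23 can be sketched as a mean absolute difference of supports. The sketch assumes that N is the number of patterns appearing in either group and that a pattern missing from a group has support 0; the function name `si` and the toy supports are hypothetical.

```python
def si(support_g1, support_g2):
    """Mean absolute difference of pattern supports between two groups.
    Values near 0.0 mean similar, values near 1.0 mean not similar."""
    patterns = set(support_g1) | set(support_g2)
    if not patterns:
        return 0.0
    total = sum(
        abs(support_g1.get(e, 0.0) - support_g2.get(e, 0.0)) for e in patterns
    )
    return total / len(patterns)


# Hypothetical supports per group (not from the accident data).
walker = {"No traffic light": 0.5, "Near intersection": 0.25}
car = {"No traffic light": 0.25, "General traffic location": 0.5}

score = si(walker, car)  # (0.25 + 0.25 + 0.5) / 3
```

Defined this way, the measure is symmetric in its two arguments and is 0.0 when a group is compared with itself.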
  24. 24. ③ Correlation between age and support. Correlation patterns: among the many patterns, do some have a correlation with age? [Figure: amount of perpetrators by age; x-axis: Age (0 to 90), y-axis: Amount of perpetrators (0 to 1600)]
  25. 25. Flow of the analysis: Dataset → Age 18, Age 19, …, Age 80 → Patterns 1, Patterns 2, …, Patterns 62 → All patterns → Correlation patterns
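The flow on slide 25 computes each pattern's support per age and then keeps the patterns whose support is correlated with age. The slides do not say which correlation coefficient was used; the sketch below assumes Pearson's r, and the ages, support values, and threshold are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


# Hypothetical support of each pattern at each perpetrator age.
ages = [20, 30, 40, 50, 60, 70]
support_by_age = {
    "9:00-11:00": [0.10, 0.12, 0.15, 0.18, 0.22, 0.25],
    "18:00-20:00": [0.30, 0.26, 0.22, 0.18, 0.15, 0.11],
    "Intersection": [0.40, 0.38, 0.41, 0.39, 0.40, 0.41],
}

threshold = 0.9  # hypothetical cut-off on |r|
correlations = {p: pearson(ages, s) for p, s in support_by_age.items()}
correlated = {p: r for p, r in correlations.items() if abs(r) >= threshold}
```

Pearson's r only captures linear trends, so a pattern whose support rises and then falls with age would be missed by this filter.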
  26. 26. Positive correlation patterns. Before noon: 9:00-11:00
  27. 27. Negative correlation patterns. Beginning of night: 18:00-20:00
  28. 28. Conclusions: We tried applying frequent pattern mining to traffic accident data. We carried out three types of analysis: • Time rules at intersections → found some rules that appear to be related to time • Vehicle type and road structure → extracted emerging patterns and determined similarity • Frequent patterns correlated with age → extracted some patterns that show a correlation
  29. 29. Future works • The extracted knowledge has not yet been validated • Traffic accident data with location information can be enriched with more metadata • We would like to apply this method to a real case [Diagram: Dataset → Frequent pattern mining → Knowledge ?]
