Catch me if you can: detecting pickpocket suspects from large-scale transit records
ShaoHsuan Huang
sharing paper notes and highlights
1.
Catch Me If You Can: Detecting Pickpocket Suspects from Large-scale Transit Records
Sharing Topic, Youngmi Huang, 2019.3.11
2.
Agenda
• What problem is solved
• How it is solved
• How to evaluate the solution
• Follow-up applications
3.
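Identify thieves in the public transit system
• Difficulties:
(1) The mobility patterns of normal passengers and thieves overlap heavily
(2) The data is imbalanced (1:600 ≈ 0.0017)
• KDD 2016 paper: KDD is the top conference in knowledge discovery
• Contributions of the paper:
(1) Feature construction (transit behavior data + geographic functional zones + thief ground truth collected from social media)
(2) A two-step approach that first detects anomalous behavior and then identifies which of the anomalous passengers are thieves
(3) Beyond analyzing the behavior of potential thieves, it builds a monitoring and early-warning system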
4.
Framework Overview
1. Data sources
2. Feature construction and suspicious-behavior analysis
3. Two-step model
4. Visualization
5.
Framework: Data Source (1/3)
1. Data sources: map data
6.
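Framework: Data Source (2/3)
1. Data sources: transit data (in addition to map data)
• Each trip consists of multiple records
• A gap of more than 30 minutes starts a new trip
[Figure: one passenger's records grouped into trip1, trip2, trip3]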
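The 30-minute rule on this slide can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the 30 minutes is measured between consecutive records of the same passenger, and the function name `split_into_trips` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def split_into_trips(timestamps, max_gap=timedelta(minutes=30)):
    """Group one passenger's record timestamps into trips.

    Assumption: a new trip starts when the gap to the previous
    record exceeds max_gap (30 minutes on the slide).
    """
    trips = []
    for ts in sorted(timestamps):
        if trips and ts - trips[-1][-1] <= max_gap:
            trips[-1].append(ts)   # still within the current trip
        else:
            trips.append([ts])     # gap too large (or first record): new trip
    return trips

# Hypothetical records for one passenger on one day.
day = datetime(2019, 3, 11)
records = [day + timedelta(hours=h, minutes=m)
           for h, m in [(8, 0), (8, 20), (9, 10), (9, 15), (18, 0)]]
trips = split_into_trips(records)
print([len(t) for t in trips])
```

A gap of exactly 30 minutes stays in the same trip here, since the slide says only gaps exceeding 30 minutes start a new one.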
7.
Framework: Data Source (3/3)
1. Data sources: violation reports (in addition to transit data and map data)
• Person
• Location of the theft
• Time
• Weibo (official posts + disclosures by the public)
• Used as ground truth
8.
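Framework: Mobility Characteristics
2. Feature construction and suspicious-behavior analysis
1. More than 80% of passengers ride for less than 2 hours per day and have 2 ride records
2. Normal passengers tend to take as few short rides as possible, yet the distributions for "fewer than 7" and "fewer than 19" short rides differ markedly
3. Define the main purpose of each trip (e.g. sightseeing, work, ...)
4. Identify suspicious wandering behavior (via the frequency of the regions a passenger appears in)
5. Reasonable travel time between origin and destination (via the standard deviation relative to the general public)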
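Items 4 and 5 above can be illustrated with two small helpers. This is a sketch under assumptions, not the paper's feature definitions: the slide only says "frequency of regions" and "standard deviation relative to the public", so the exact formulas, the function names and the sample values below are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

def travel_time_zscore(ride_minutes, population_minutes):
    """How many standard deviations a ride's duration lies from the
    typical duration of other passengers on the same origin-destination pair."""
    mu = mean(population_minutes)
    sigma = pstdev(population_minutes)
    if sigma == 0:
        return 0.0  # no spread in the reference data: treat as not deviating
    return (ride_minutes - mu) / sigma

def region_visit_shares(visited_regions):
    """Share of a passenger's records that fall in each region."""
    counts = Counter(visited_regions)
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}

# Hypothetical data: typical durations for one origin-destination pair.
typical = [20, 22, 18, 21, 19]
z = travel_time_zscore(45, typical)
shares = region_visit_shares(["A", "A", "B", "A"])
print(round(z, 2), shares)
```

A large z-score marks a ride that took much longer than usual for that route, and a visit-share profile concentrated on a few crowded regions is one possible signal of wandering.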
9.
Framework: Two Step Approach (1/2)
3. Two-step model
step1: Anomaly Detection (One-Class SVM)
The anomalies include both real thieves and normal passengers misjudged as thieves (false positives, FP)
10.
Framework: Two Step Approach (1/2)
3. Two-step model
step1: Anomaly Detection (One-Class SVM)
The anomalies include both real thieves and normal passengers misjudged as thieves (false positives, FP)
step2: Supervised Classification (SVM)
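The control flow of the two steps can be sketched as a simple filter chain. This is only a structural illustration: the two predicates below are toy threshold rules standing in for the One-Class SVM and the supervised SVM, and every name, field and threshold is hypothetical.

```python
def two_step_detect(passengers, is_anomalous, is_thief):
    """Step 1 keeps only anomalous passengers; step 2 classifies
    those anomalies into suspects and false positives."""
    anomalies = [p for p in passengers if is_anomalous(p)]
    suspects = [p for p in anomalies if is_thief(p)]
    return anomalies, suspects

# Hypothetical passengers with two made-up features.
passengers = [
    {"id": "p1", "short_rides": 2, "wander_share": 0.1},
    {"id": "p2", "short_rides": 25, "wander_share": 0.2},
    {"id": "p3", "short_rides": 30, "wander_share": 0.9},
]

anomalies, suspects = two_step_detect(
    passengers,
    is_anomalous=lambda p: p["short_rides"] >= 20,  # stand-in for the One-Class SVM
    is_thief=lambda p: p["wander_share"] > 0.5,     # stand-in for the supervised SVM
)
print([p["id"] for p in anomalies], [p["id"] for p in suspects])
```

The point of the design is that the second classifier only ever sees the small anomalous subset, where thieves are far less rare than in the full population.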
11.
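Framework: Two Step Approach (2/2)
step1: Anomaly Detection (One-Class SVM), step2: Supervised Classification (SVM)
Non-linear decision boundaries. Objective function:
\[ \min_{w,\rho} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{n=1}^{N} \epsilon_n - \rho \]
\[ \text{s.t.} \quad \hat{g}(X_n) = \langle w, \phi(X_n) \rangle + \rho \le \epsilon_n \quad \text{and} \quad \epsilon_n \ge 0, \quad \text{for all collected passengers } n = 1, 2, \dots, N \]
and then
\[ \hat{g}(x) = \langle w, \phi(x) \rangle + \rho, \qquad g(x) = \begin{cases} 1 & \hat{g}(x) \ge 0 \\ 0 & \hat{g}(x) < 0 \end{cases} \]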
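The decision rule g(x) on this slide can be evaluated without ever computing ϕ explicitly, using the standard kernel expansion ⟨w, ϕ(x)⟩ = Σ α_i k(x_i, x). The sketch below assumes an RBF kernel; the support vectors, coefficients `alphas`, offset `rho` and `gamma` are hypothetical values chosen for illustration, not fitted ones.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def g_hat(x, support_vectors, alphas, rho, gamma=0.5):
    """g_hat(x) = <w, phi(x)> + rho, with <w, phi(x)> written as a kernel sum."""
    return sum(alpha * rbf_kernel(sv, x, gamma)
               for alpha, sv in zip(alphas, support_vectors)) + rho

def g(x, support_vectors, alphas, rho, gamma=0.5):
    """Thresholded label: 1 if g_hat(x) >= 0, else 0 (as on the slide)."""
    return 1 if g_hat(x, support_vectors, alphas, rho, gamma) >= 0 else 0

# Hypothetical model: one support vector at the origin.
svs, alphas, rho = [(0.0, 0.0)], [1.0], -0.5
print(g((0.0, 0.0), svs, alphas, rho), g((5.0, 5.0), svs, alphas, rho))
```

Because the kernel value decays with distance, the region labelled 1 is a closed blob around the support vectors, which is what makes the boundary non-linear in the original feature space.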
12.
Framework: Two Step Approach (2/2)
step1: Anomaly Detection (One-Class SVM), step2: Supervised Classification (SVM)
Same objective function and decision function g(x) as on slide 11. The figure additionally labels the margin and the decision plane alongside the non-linear decision boundaries.
13.
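Experiments & Discussion (1/2)
With very few thief ground-truth samples:
(1) Single model: anomaly detection is more effective than a classification model
(2) Two-step model: detecting anomalies first and then applying a binary classifier improves recall, precision and F1-score
• Data used: after data cleaning (removing extreme values), about 1.6 billion ride records and about 6 million passengers
• Model evaluation: 10-fold cross validation, frac = 0.2
• Is a precision of 7% good?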
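For reference, the three metrics named on this slide follow directly from confusion-matrix counts. The counts below are hypothetical, chosen only so that precision comes out at the 7% mentioned above; they are not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true positives, false positives
    and false negatives; 0.0 is returned where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical counts: 100 passengers flagged, 7 of them real thieves, 3 thieves missed.
precision, recall, f1 = precision_recall_f1(tp=7, fp=93, fn=3)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

With many false positives and few false negatives, precision stays low while recall stays high, so F1 is pulled toward the lower of the two.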
14.
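Experiments & Discussion (2/2)
• Is a precision of 7% good?
Among the misclassifications, FP is high and FN is low, which means the model would rather flag an innocent passenger than let a thief go.
• Catching thieves vs. catching high-risk people?
The data is similar in nature (few samples, diverse patterns of potential default), so later anomaly-detection work on quantifying risk can follow the approach of this paper.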
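One way to put the 7% in context is to compare it with the base rate from slide 3. The sketch below assumes the 1:600 ratio is read as a base rate of 1/600 (the slide's ≈ 0.0017) and that flagging passengers at random would yield a precision equal to that base rate; the function name `precision_lift` is hypothetical.

```python
def precision_lift(precision, base_rate):
    """How many times better than random flagging a given precision is."""
    return precision / base_rate

base_rate = 1 / 600            # thieves among passengers, per slide 3
lift = precision_lift(0.07, base_rate)
print(round(lift, 1))
```

Under these assumptions a 7% precision is roughly 42 times the rate expected from random flagging, which is why a figure that looks low in absolute terms can still be useful for prioritizing inspections.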
15.
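Application: Prototype & Pattern Discovery
• All passengers / valid group / outliers / suspected thieves
• Real-time traffic heat map
• Locations frequented by suspected thieves
• List of suspected thieves
• Interactive movement paths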
16.
THANK YOU