This document discusses multi-object tracking algorithms. It begins by introducing object tracking and classification of trackers. Simple Online and Realtime Tracking (SORT) is described, which uses a Kalman filter for state estimation and the Hungarian algorithm for data association. Deep SORT is then introduced, which improves on SORT by incorporating appearance features and using the Mahalanobis distance and cosine distance for data association, helping with short-term and long-term occlusion. Results show Deep SORT performs well on benchmark datasets.
1. MULTI-OBJECT TRACKING: SIMPLE ONLINE AND REALTIME TRACKING WITH A DEEP ASSOCIATION METRIC (DEEP SORT)
김 경 훈
Vision & Display Systems Lab.
Dept. of Electronic Engineering, Sogang University
3. Introduction
• Object Tracking
▪ Object tracking is the process of locating moving objects over time in videos.
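▪ Why can't object detection alone be used for tracking?
− Association problem: detections in one frame carry no link to detections in the next
− Occlusion problem: an object that is hidden produces no detection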
7. SORT 1/2
• Simple Online and Realtime Tracking (SORT), ICIP 2016 (235 citations)
▪ State Estimation
− Kalman filter (linear): [x, y, a, h, vx, vy, va, vh]
▪ Data Association
− Hungarian method
• High speed and high accuracy
• Simple model
▪ Only the location and size of the bounding box are used for state estimation
▪ Focuses on real-time speed
− Long-term occlusion is not handled
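As a rough illustration of the data association step, the sketch below scores each track/detection pair by 1 − IoU and keeps the assignment with the lowest total cost. It is a simplified stand-in, not the SORT implementation: the optimum is found by exhaustive search, which returns the same minimum-cost assignment the Hungarian method would but only scales to a handful of boxes. The helper names (`iou`, `associate`), the (x1, y1, x2, y2) box format, and the `min_iou` value of 0.3 are assumptions made for this example.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Return (track_idx, det_idx) pairs with minimum total (1 - IoU) cost.

    Brute-force search over assignments (assumption: only a few boxes).
    Pairs whose IoU falls below `min_iou` are dropped after the search.
    """
    cost = [[1.0 - iou(t, d) for d in detections] for t in tracks]
    n_t, n_d = len(tracks), len(detections)
    if n_t <= n_d:
        candidates = (list(zip(range(n_t), p))
                      for p in permutations(range(n_d), n_t))
    else:
        candidates = (list(zip(p, range(n_d)))
                      for p in permutations(range(n_t), n_d))
    best = min(candidates, key=lambda pairs: sum(cost[i][j] for i, j in pairs))
    return [(i, j) for i, j in best if 1.0 - cost[i][j] >= min_iou]


# Hypothetical predicted track boxes and new detections.
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
matches = associate(tracks, detections)
```

For realistic numbers of objects, `scipy.optimize.linear_sum_assignment` solves the same assignment problem in polynomial time and would replace the exhaustive search here.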
8. SORT 2/2
• Estimation Model
▪ Kalman Filter (Linear)
− X = [ x, y, a, h, vx, vy, va, vh ]
҉ x, y : location of bounding box
҉ a : aspect ratio
҉ h : height
҉ vx, vy, va, vh : respective velocities
• Data Association
▪ Hungarian method
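The estimation model above can be sketched as a constant-velocity linear Kalman filter. To stay short, this example tracks a single coordinate with state [position, velocity] and runs one such filter for each of x, y, a and h. That split is an assumption: it matches the 8-dimensional filter only when the noise covariances are diagonal. The class name and all noise values (`p_var`, `v_var`, `q`, `r`) are hypothetical and chosen for illustration.

```python
class ConstantVelocityKalman1D:
    """Linear Kalman filter for one coordinate: state = [position, velocity]."""

    def __init__(self, z0, p_var=1.0, v_var=100.0, q=0.01, r=1.0):
        self.p, self.v = float(z0), 0.0        # velocity is unknown at start
        self.P = [[p_var, 0.0], [0.0, v_var]]  # state covariance
        self.q, self.r = q, r                  # process / measurement noise

    def predict(self, dt=1.0):
        (a, b), (c, d) = self.P
        self.p += self.v * dt                  # x' = F x with F = [[1, dt], [0, 1]]
        # P' = F P F^T + Q, written out for the 2x2 case
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                         # innovation covariance, H = [1, 0]
        k0, k1 = a / s, c / s                  # Kalman gain
        y = z - self.p                         # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P = (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.p


# Hypothetical box moving right by 2 pixels per frame: (x, y, a, h).
boxes = [(10.0 + 2.0 * t, 50.0, 0.5, 80.0) for t in range(20)]
filters = [ConstantVelocityKalman1D(z) for z in boxes[0]]
for box in boxes[1:]:
    for f, z in zip(filters, box):
        f.predict()
        f.update(z)
```

After a few frames the x filter's velocity estimate settles near 2 while the other three stay near 0, so `predict()` can bridge a frame in which the detector misses the object.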
12. DEEP SORT 1/4
• Simple Online and Realtime Tracking with a Deep Association Metric, arXiv 2017 (154 citations)
• To solve the assignment problem more effectively
▪ Squared Mahalanobis distance
− Useful for short-term occlusion
▪ The cosine distance considers appearance information
− Useful for long-term occlusion
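A minimal sketch of the appearance term: the cosine distance between two feature vectors, plus a helper that compares a detection against several features stored for a track. Keeping a gallery of past features per track and taking the smallest distance is an assumption of this example, as are the function names. The gallery is what lets a track be matched again after a long occlusion, since features saved before the object disappeared are still available.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def appearance_distance(track_gallery, detection_feature):
    """Smallest cosine distance between a detection and any stored track feature."""
    return min(cosine_distance(f, detection_feature) for f in track_gallery)


# Hypothetical 3-D appearance features (real embeddings are much longer).
gallery = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0]]
same_person = [0.4, 0.3, 0.0]     # same direction as the second stored feature
other_person = [0.0, 0.0, 1.0]    # orthogonal to both stored features
d_same = appearance_distance(gallery, same_person)
d_other = appearance_distance(gallery, other_person)
```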
13. DEEP SORT 2/4
• Euclidean distance
▪ Treats every direction equally, so it does not reflect how the data are actually distributed
• Mahalanobis distance
▪ Incorporates the uncertainty of the Kalman filter estimate
▪ Thresholding this distance rules out unlikely associations
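The difference between the two distances can be seen in a small 2-D example. The sketch below assumes a predicted position with more uncertainty along x than along y, so two detections at the same Euclidean distance get very different squared Mahalanobis distances. The covariance values and function names are hypothetical. The threshold is the 0.95 quantile of the chi-square distribution with 2 degrees of freedom (about 5.99); a 4-dimensional measurement would use the 4-degree-of-freedom quantile (about 9.49) instead.

```python
import math

CHI2_95_2DOF = 5.9915  # 0.95 quantile of chi-square, 2 degrees of freedom


def squared_mahalanobis(z, mean, cov):
    """(z - mean)^T cov^{-1} (z - mean) for a 2-D measurement."""
    dx, dy = z[0] - mean[0], z[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # cov^{-1} = [[d, -b], [-c, a]] / det
    return (dx * (d * dx - b * dy) + dy * (a * dy - c * dx)) / det


def gate(detections, mean, cov, threshold=CHI2_95_2DOF):
    """Indices of detections whose squared Mahalanobis distance passes the gate."""
    return [i for i, z in enumerate(detections)
            if squared_mahalanobis(z, mean, cov) <= threshold]


predicted = (0.0, 0.0)
S = [[9.0, 0.0], [0.0, 1.0]]             # uncertain along x, confident along y
detections = [(3.0, 0.0), (0.0, 3.0)]    # both at Euclidean distance 3

euclidean = [math.hypot(z[0] - predicted[0], z[1] - predicted[1])
             for z in detections]
mahalanobis_sq = [squared_mahalanobis(z, predicted, S) for z in detections]
admissible = gate(detections, predicted, S)
```

Only the first detection, which lies along the uncertain x direction, passes the gate; the second is rejected even though it is equally close in Euclidean terms.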
14. DEEP SORT 3/4
• The occlusion problem
▪ To solve long-term occlusion
− Using the appearance feature vector
− D = λ · D_k + (1 − λ) · D_a
҉ D_k is the Mahalanobis distance
҉ D_a is the cosine distance between the appearance feature vectors
҉ λ is the weighting factor
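The weighted sum above can be sketched as an element-wise blend of two cost matrices, one row per track and one column per detection. The function name and the example values are hypothetical; the two inputs would come from the motion and appearance terms respectively.

```python
def combined_cost(d_k, d_a, lam):
    """Element-wise lam * d_k + (1 - lam) * d_a for two equally shaped matrices."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [[lam * k + (1.0 - lam) * a for k, a in zip(row_k, row_a)]
            for row_k, row_a in zip(d_k, d_a)]


# Hypothetical distances for 2 tracks x 2 detections.
d_k = [[1.0, 8.0], [7.0, 2.0]]   # Mahalanobis (motion) term
d_a = [[0.1, 0.9], [0.8, 0.2]]   # cosine (appearance) term

motion_only = combined_cost(d_k, d_a, lam=1.0)
appearance_only = combined_cost(d_k, d_a, lam=0.0)
blended = combined_cost(d_k, d_a, lam=0.5)
```

Setting λ = 0 ignores motion and relies on appearance alone, and λ = 1 does the opposite. Because the two terms live on different scales (the cosine distance is bounded, the Mahalanobis distance is not), the value of λ needs care.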
15. Result
• Tracking results on the MOT16 challenge. We compare to other published methods with non-standard detections.
• Results depend on the detection algorithm
16. DEEP SORT 4/4
• Conclusion
▪ If the bounding boxes are too big, they include too much background, which reduces the effectiveness of the algorithm
▪ If people are dressed similarly, as happens in sports, their appearance features become similar, which can cause ID switches
17. References
• Multiple Object Tracking: A Literature Review
• Simple Online and Realtime Tracking
• Simple Online and Realtime Tracking with a Deep Association Metric