DEEP LEARNING JP
[DL Papers]
http://deeplearning.jp/
Tracking Objects as Points
Shizuma Kubo, ACES.Inc
Multi Object Tracking (MOT)
MOT Challenge
MOT
MOT : Tracking-by-detection
MOT : Joint detection and tracking
Limitation
CenterTrack
• Our tracker, CenterTrack, applies a detection model to a pair of images and
detections from the prior frame. Given this minimal input, CenterTrack
localizes objects and predicts their associations with the previous frame.
That’s it.
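The association step in the quote above can be illustrated with a small sketch. It assumes each current detection comes with a predicted 2D offset pointing from its current center to where that object was in the previous frame, and that detections are already sorted by confidence. It then matches greedily to the nearest unclaimed previous center. The function name `greedy_associate`, the sign convention of `offsets`, and the `max_dist` radius are hypothetical choices for illustration, not taken from the slides.

```python
import numpy as np

def greedy_associate(curr_centers, offsets, prev_centers, prev_ids, max_dist, next_id=0):
    """Assign a track id to every current detection (illustrative sketch).

    curr_centers: (N, 2) centers in the current frame, assumed sorted by confidence.
    offsets:      (N, 2) predicted displacement from current center to previous center
                  (sign convention is an assumption of this sketch).
    prev_centers: (M, 2) centers detected in the previous frame.
    prev_ids:     M track ids belonging to prev_centers.
    max_dist:     matches farther than this start a new track.
    """
    curr = np.asarray(curr_centers, dtype=float).reshape(-1, 2)
    off = np.asarray(offsets, dtype=float).reshape(-1, 2)
    prev = np.asarray(prev_centers, dtype=float).reshape(-1, 2)

    projected = curr + off          # estimated positions in the previous frame
    used = set()                    # previous detections that are already claimed
    ids = []
    for i in range(len(curr)):
        best_j, best_d = -1, max_dist
        for j in range(len(prev)):
            if j in used:
                continue
            d = float(np.linalg.norm(projected[i] - prev[j]))
            if d <= best_d:
                best_j, best_d = j, d
        if best_j >= 0:
            used.add(best_j)
            ids.append(prev_ids[best_j])   # continue an existing track
        else:
            ids.append(next_id)            # no match nearby: start a new track
            next_id += 1
    return ids, next_id

ids, next_id = greedy_associate(
    curr_centers=[[10, 10], [50, 50], [90, 90]],
    offsets=[[-2, 0], [0, 3], [0, 0]],
    prev_centers=[[8, 10], [50, 53]],
    prev_ids=[7, 8],
    max_dist=5.0,
    next_id=9,
)
print(ids, next_id)
```

Greedy matching keeps the example short. An optimal assignment (for example the Hungarian algorithm) could be substituted without changing the inputs.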
CenterTrack
CenterNet (Objects as Points)
[Figure; axis labels: width, height]
Source: https://medium.com/machine-learning-bites/deeplearning-series-convolutional-neural-networks-a9c2f2ee1524
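As the title "Objects as Points" suggests, CenterNet represents each object by its center point on a heatmap. A minimal sketch of reading detections off such a heatmap follows: keep every location that is the maximum of its 3x3 neighborhood and exceeds a score threshold. The function name `heatmap_peaks` and the threshold value 0.3 are hypothetical and chosen only for illustration. A real detector would also read size predictions at each peak.

```python
import numpy as np

def heatmap_peaks(heatmap, threshold=0.3):
    """Return (x, y, score) for every local maximum above `threshold` (illustrative sketch)."""
    heatmap = np.asarray(heatmap, dtype=float)
    h, w = heatmap.shape
    # Pad with -inf so border pixels are compared only against real neighbors.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Stack the 9 shifted views that make up each pixel's 3x3 neighborhood.
    neighborhood = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    is_peak = (heatmap >= neighborhood.max(axis=0)) & (heatmap >= threshold)
    ys, xs = np.nonzero(is_peak)
    return [(int(x), int(y), float(heatmap[y, x])) for y, x in zip(ys, xs)]

hm = np.zeros((5, 5))
hm[1, 1] = 0.9   # a confident center
hm[1, 2] = 0.5   # neighbor of a stronger response, so it is suppressed
hm[3, 4] = 0.6   # a second center on the border
peaks = heatmap_peaks(hm)
print(peaks)
```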
CenterTrack
(offset prediction)
Ablation study
offset prediction
Ablation study
