3D panoramic multi-person localization and tracking are prominent in many applications. However, conventional methods that rely on LiDAR equipment can be expensive and computationally inefficient because they must process point cloud data. In this work, we propose an effective and efficient low-cost approach. First, we use RGB panoramic videos instead of LiDAR data. Then, we transform human locations from 2D panoramic image coordinates to 3D panoramic camera coordinates using camera geometry and a human biometric property (i.e., height). Finally, we generate 3D tracklets by associating human appearance and 3D trajectory. We verify the effectiveness of our method on three datasets, including a new one built by us, in terms of 3D single-view multi-person localization, 3D single-view multi-person tracking, and 3D panoramic multi-person localization and tracking. Our code is available at https://github.com/fandulu/MPLT.
Using Panoramic Videos for Multi-Person Localization and Tracking In A 3D Panoramic Coordinate
Fan Yang1,2, Feiran Li3, Yang Wu4, Sakriani Sakti1,2, Satoshi Nakamura1,2
1Nara Institute of Science and Technology, Japan
2RIKEN Center for Advanced Intelligence Project, Japan
3Osaka University, Japan
4Kyoto University, Japan
3D panoramic multi-person localization and tracking – challenges
3D panoramic multi-person localization and tracking are widely used in real-world applications. A common approach is to utilize LiDAR data.
The demo above uses LiDAR equipment: Jafari O. Hosseini, et al. "Real-Time Multi-Modal People Tracking for Mobile Robots in Crowded Environments." ICRA 2014.
However, LiDAR equipment can be expensive, and processing point cloud data can be computationally expensive.
3D panoramic multi-person localization and tracking – our proposal
An economical approach is to use only panoramic cameras to obtain 2D panoramic videos, or to convert each frame from equirectangular projection to cubic projection.
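The conversion from equirectangular projection to cubic projection can be sketched as a per-pixel lookup. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the axis convention (Z forward, X right, Y down), and the 90° field of view per face are all assumptions.

```python
import math


def cube_face_to_equirect(u, v, face_size, yaw_deg, pano_w, pano_h):
    """Map pixel (u, v) of a horizontal cube face to equirectangular pixel coordinates.

    Assumed convention: Z forward, X right, Y down. Each face covers a 90 degree
    field of view, so its focal length in pixels is face_size / 2.
    """
    focal = face_size / 2.0
    # Ray through the pixel, expressed in the face's own camera frame.
    x = u - face_size / 2.0
    y = v - face_size / 2.0
    z = focal
    # Rotate the ray about the vertical axis by the yaw of the face.
    yaw = math.radians(yaw_deg)
    xr = x * math.cos(yaw) + z * math.sin(yaw)
    zr = -x * math.sin(yaw) + z * math.cos(yaw)
    # Longitude and latitude of the rotated ray.
    lon = math.atan2(xr, zr)
    lat = math.atan2(y, math.hypot(xr, zr))
    # Longitude spans the panorama width, latitude spans its height.
    px = (lon / (2.0 * math.pi) + 0.5) * pano_w
    py = (lat / math.pi + 0.5) * pano_h
    return px, py
```

Sampling the panorama at the returned coordinates for every pixel of a face yields that face of the cubic projection. Interpolation and wrap-around at the panorama border are left out of this sketch.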
1. Localizing each person in 2D image coordinates
Pose Detection Module: we use openpifpaf (Kreiss, Sven, et al. "PifPaf: Composite Fields for Human Pose Estimation." CVPR 2019).
Cropping the upper body part
2. Localizing each person in a single-view 3D coordinate
For a person target, a biometric property (i.e., height) can be used for 3D localization.
[Figure: 2D image coordinates (u, v) are mapped to 3D coordinates using a real-scale person model with upper body ≈ 0.8 m; Y = 0. Part of the image is from OpenCV.]
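One way to read this step is through similar triangles in a pinhole camera: an object of real height H that spans h pixels lies at depth Z = f · H / h. The sketch below assumes a pinhole model with known focal lengths and principal point, an upright person, and the 0.8 m upper body length from the slide. The function and parameter names are hypothetical, and this is not necessarily the authors' exact formulation.

```python
def localize_from_upper_body(u_top, v_top, u_bottom, v_bottom,
                             fx, fy, cx, upper_body_m=0.8):
    """Estimate (X, Z) in metres from the two end points of the upper body.

    (u_top, v_top) and (u_bottom, v_bottom) are pixel coordinates, for example
    neck and hip keypoints from a pose detector. fx, fy, cx are assumed known
    camera intrinsics in pixels.
    """
    pixel_height = abs(v_bottom - v_top)
    if pixel_height == 0:
        raise ValueError("upper body has zero pixel height")
    # Similar triangles: pixel_height / fy = upper_body_m / Z.
    z = fy * upper_body_m / pixel_height
    # Back-project the horizontal centre of the upper body at that depth.
    u_center = (u_top + u_bottom) / 2.0
    x = (u_center - cx) * z / fx
    return x, z
```

For example, with fx = fy = 500 pixels, an upper body spanning 100 pixels is placed at a depth of 4 m.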
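3. Localizing each person in a panoramic 3D coordinate
[Figure: for each view with θ = 90°, 2D image coordinates (u, v) are converted to 3D coordinates (X, Z) using K and the 0.8 m upper body length; the views are then combined into 3D panoramic coordinates.]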
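Combining the per-view results into one panoramic coordinate frame can be sketched as a rotation of each view's (X, Z) location by the yaw of that view. The yaw values (for example 0°, 90°, 180°, 270° for four views), the rotation direction, and the function names are assumptions made for illustration.

```python
import math


def view_to_panoramic(x, z, view_yaw_deg):
    """Rotate a single-view location (X, Z) into the panoramic camera frame.

    view_yaw_deg is the assumed heading of the view's optical axis,
    measured from the panoramic Z axis towards the panoramic X axis.
    """
    yaw = math.radians(view_yaw_deg)
    xp = x * math.cos(yaw) + z * math.sin(yaw)
    zp = -x * math.sin(yaw) + z * math.cos(yaw)
    return xp, zp


def bearing_deg(xp, zp):
    """Bearing of a panoramic location in degrees, in the range [0, 360)."""
    return math.degrees(math.atan2(xp, zp)) % 360.0
```

A person 4 m straight ahead in the view with yaw 90° ends up at roughly (4, 0) in the panoramic frame, which corresponds to a bearing of 90°.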
4. Computing cross-frame trajectory cost
Compute the distance between observed locations and locations estimated by the Kalman filter, as [equation image not preserved], where $H_{body}$ is the height of the person.
Locations: $l_a^{k}$, $l_a^{k+1} = [X_a, Z_a]$, $l_c^{k+1} = [X_c, Z_c]$
Trajectory cost between a, b and c, d:
      a     b
c   0.1   0.7
d   0.6   0.1
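The trajectory cost can be sketched as a distance between each observed location and each predicted location. The code below is a simplified stand-in: it uses only a constant-velocity prediction of the state mean (the covariance and update steps of a Kalman filter are omitted) and a plain Euclidean distance. The motion model, the distance, and all names are assumptions; the slide's own formula involving $H_{body}$ is not reproduced here.

```python
import math


def predict_location(location, velocity, dt=1.0):
    """Constant-velocity prediction of the next (X, Z) location."""
    return (location[0] + velocity[0] * dt, location[1] + velocity[1] * dt)


def trajectory_cost_matrix(predicted, observed):
    """Return cost[i][j]: distance between observation i and prediction j.

    Rows correspond to observations in frame k+1 (e.g. c, d) and columns
    to predictions for tracks from frame k (e.g. a, b).
    """
    return [
        [math.hypot(obs[0] - pred[0], obs[1] - pred[1]) for pred in predicted]
        for obs in observed
    ]
```

A small cost means the observation lies close to where the track was expected to be, so the pair is a good candidate for the same identity.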
5. Knowing trajectories is insufficient for tracking, so let's compare appearances
[Figure: a query from frame k is compared against the gallery from frame k + Δ.]
5. Obtaining appearance embeddings and computing their appearance cost
Appearance Re-identification Module: we use Reid-strong-baseline (Luo, Hao, et al. "A Strong Baseline and Batch Normalization Neck for Deep Person Re-identification." TMM 2019).
[Figure: appearance embeddings are extracted for persons a and b in frame k and persons c and d in frame k+1.]
Appearance cost between a, b and c, d:
      a     b
c   0.2   0.8
d   0.7   0.3
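An appearance cost of this kind can be sketched as a pairwise distance between embedding vectors. The source does not state which distance is used, so the cosine distance below is an assumption, as are the function names; the embeddings themselves would come from the re-identification network.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embedding has zero norm")
    return 1.0 - dot / (norm_a * norm_b)


def appearance_cost_matrix(embeddings_k, embeddings_k1):
    """Return cost[i][j]: distance between person i in frame k+1 and person j in frame k."""
    return [
        [cosine_distance(new, old) for old in embeddings_k]
        for new in embeddings_k1
    ]
```

Identical embeddings give a cost of 0 and orthogonal embeddings give a cost of 1, so lower values indicate more similar appearance.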
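6. Fusing trajectory cost and appearance embedding
The Hungarian algorithm is used for cross-frame bipartite matching between frame k and frame k+1, as [equation image not preserved].
Appearance cost between a, b and c, d:
      a     b
c   0.2   0.8
d   0.7   0.3
Trajectory cost between a, b and c, d:
      a     b
c   0.1   0.7
d   0.6   0.1
Tracking Module: assign tracking IDs and update tracking records.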
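The fusion and matching step can be illustrated with the two example matrices from the slides. The sketch below makes two assumptions: the costs are fused with an equal-weight average (the slide's fusion formula is not preserved), and the minimum-cost matching is found by exhaustive search over permutations rather than by the Hungarian algorithm itself. For small square matrices both give the same optimal assignment; `scipy.optimize.linear_sum_assignment` solves the same assignment problem efficiently for larger ones.

```python
from itertools import permutations


def fuse_costs(trajectory, appearance, weight=0.5):
    """Weighted sum of two cost matrices of the same shape (weight is an assumption)."""
    return [
        [weight * t + (1.0 - weight) * a for t, a in zip(t_row, a_row)]
        for t_row, a_row in zip(trajectory, appearance)
    ]


def min_cost_matching(cost):
    """Exhaustive minimum-cost matching for a small square cost matrix.

    Returns a list of (row, column) pairs with the lowest total cost.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[row][perm[row]] for row in range(n)),
    )
    return list(enumerate(best))


# Example values from the slides: rows are c, d and columns are a, b.
trajectory = [[0.1, 0.7], [0.6, 0.1]]
appearance = [[0.2, 0.8], [0.7, 0.3]]
matches = min_cost_matching(fuse_costs(trajectory, appearance))
```

With these values the matching pairs c with a and d with b, which is the assignment that both individual cost matrices favor.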
6. Tracking Module
• Initializing the tracking module
• Starting tracking
• Updating existing tracking records
• Updating new tracking records
• Removing expired tracking records
• Predicting new locations for all tracking records
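The life cycle of tracking records listed above can be sketched as one update step per frame. This is an illustrative reading of the list, not the authors' code: the class names, the `max_misses` expiry threshold, and the finite-difference velocity (used here in place of a Kalman filter) are assumptions, and "updating new tracking records" is interpreted as creating records for unmatched observations.

```python
class Track:
    """One tracking record with a simple constant-velocity motion estimate."""

    def __init__(self, track_id, location):
        self.track_id = track_id
        self.location = location
        self.velocity = (0.0, 0.0)
        self.misses = 0

    def update(self, location):
        # Finite-difference velocity as a stand-in for a Kalman update.
        self.velocity = (location[0] - self.location[0],
                         location[1] - self.location[1])
        self.location = location
        self.misses = 0

    def predict(self):
        return (self.location[0] + self.velocity[0],
                self.location[1] + self.velocity[1])


class Tracker:
    def __init__(self, max_misses=30):
        self.tracks = []
        self.next_id = 0
        self.max_misses = max_misses

    def step(self, observations, matches):
        """Process one frame; matches holds (track_index, observation_index) pairs."""
        matched_tracks = {t for t, _ in matches}
        matched_obs = {o for _, o in matches}
        # Update existing tracking records with their matched observations.
        for t, o in matches:
            self.tracks[t].update(observations[o])
        # Count a miss for every record that received no observation.
        for index, track in enumerate(self.tracks):
            if index not in matched_tracks:
                track.misses += 1
        # Create new tracking records for unmatched observations.
        for o, location in enumerate(observations):
            if o not in matched_obs:
                self.tracks.append(Track(self.next_id, location))
                self.next_id += 1
        # Remove expired tracking records.
        self.tracks = [t for t in self.tracks if t.misses <= self.max_misses]
        # Predict the next location for all remaining records.
        return {t.track_id: t.predict() for t in self.tracks}
```

The `matches` argument would come from the cross-frame matching step, and the returned predictions feed the trajectory cost of the next frame.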
Put all components together
[Figure: overall pipeline for frame k and frame k+1. The Pose Detection Module produces upper body poses. The Geometry Transformation Module transforms 2D image coordinates (u, v) of each θ = 90° view into 3D coordinates (X, Z) and then into panoramic camera-centered coordinates covering 0°, 90°, 180°, and 270°. Upper body patches are cropped and passed to the Appearance Re-identification Module to compute the appearance distance. The distance between observed locations and locations estimated by the Kalman filter is computed. The Tracking Module assigns tracking IDs and updates tracking records.]
Potential applications
Service robot at tourist places
1. Trajectory planning
2. Moving to target person for service
…
Visitor's behavior analysis
1. Recording 3D trajectory and walking speed in a tourist site
2. Finding group activity information
…
Image is from Xueyang Wang, et al. "PANDA: A Gigapixel-level Human-centric Video Dataset." arXiv 2020.
Conclusion
1. Towards low-cost 3D panoramic multi-person tracking
• Using RGB panoramic videos, which are cheaper and incur lower computation cost because the data are 2D.
• Compared with 3D single-view works, our 3D panoramic approach can be used as a more powerful tool to enable other related projects.
2. Weakness
• It is not yet real-time, running at 5 FPS on a TITAN X GPU.
• To observe a far-away person, we need higher-resolution videos, which increases the computation cost.