Detection Tracking and Recognition of Human Poses for a Real Time Spatial Game

CASA workshop 3AMIGAS (supported by FOCUS K3D and GATE)

Technical program presentation no. 5:
presenter Feifei Huo, TU Delft

http://www.cs.uu.nl/events/3amigas/
http://www.focusk3d.eu/
http://gate.gameresearch.nl



  1. Detection Tracking and Recognition of Human Poses for a Real Time Spatial Game
     Feifei Huo, Emile Hendriks, A.H.J. Oomes, Pascal van Beek, Remco Veltkamp
     Presenter: Feifei Huo
     Information and Communication Theory (ICT) Group, Delft University of Technology
     June 16, 2009
  2. Outline
     • Introduction to visual analysis system
     • People detection, tracking and pose recognition system
       – Human body detection and body parts segmentation
       – Feature point representation and tracking
       – Pose recognition
     • Experimental results and conclusion
     • Spatial game application and future work
  3. Introduction to Visual Analysis System
     Video-based applications:
     1. virtual reality
     2. smart environment systems
     3. sports video indexing
     4. advanced user interfaces
     → Pose-Driven Spatial Game
  4. Pose-Driven Spatial Game
  5. The state of the art:
     • combining bottom-up and top-down approaches
     • incorporating appearance, kinematic, and temporal constraints
     The proposed system:
     • real-time operation
     • a variety of poses
     • spatial game control
     Fig. 1. The flowchart of the proposed system.
  6. People Detection, Tracking and Pose Recognition System
     Pipeline: Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game
     Detection stage: whole human blob detection in the initial frame, followed by segmentation into different body parts.
  7. Methodology
     • Background subtraction
       – Mixture of Gaussians
     • Head and torso detection and tracking
       – 2D upper-body model B matched to the foreground F, scaled such that Area(F) = Area(B)
     Fig. 2. (a) Foreground binary image of the initial frame, (b) 2D upper-body model for human torso detection and tracking.
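The background subtraction step named on this slide uses a Mixture-of-Gaussians model. As a minimal sketch of the underlying idea, the following keeps a running per-pixel Gaussian (a simplified single-Gaussian variant, not the full mixture; the class name and the `alpha`/`k` parameter values are illustrative assumptions, not taken from the talk):

```python
import numpy as np

class GaussianBackground:
    """Per-pixel running Gaussian background model: a simplified
    stand-in for the Mixture-of-Gaussians method named on the slide."""

    def __init__(self, first_frame, alpha=0.05, k=2.5):
        self.mean = first_frame.astype(np.float64)
        self.var = np.full(first_frame.shape, 15.0 ** 2)  # initial variance (assumed)
        self.alpha = alpha    # learning rate (assumed value)
        self.k = k            # foreground threshold, in standard deviations

    def apply(self, frame):
        """Return a boolean foreground mask for one grayscale frame."""
        frame = frame.astype(np.float64)
        d2 = (frame - self.mean) ** 2
        fg = d2 > (self.k ** 2) * self.var   # pixel deviates too far from the model
        # update the model only where the pixel matched the background
        a = self.alpha * (~fg)
        self.mean += a * (frame - self.mean)
        self.var += a * (d2 - self.var)
        return fg
```

Feeding in a frame identical to the background yields an empty mask; a bright patch against a learned dark background is flagged as foreground.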
  8. Particle Filtering
     A sample set {s^(n), n = 1, 2, …, N} with associated weights {π^(n), n = 1, 2, …, N} represents the posterior; each sample is scored by the observation likelihood P(B | A = s^(n)).
  9. People detection and tracking
     • A sample set {s^(n), π^(n), n = 1, 2, …, N} is generated with an initial distribution s^(n) = p^(n) = (x^(n), y^(n), scale^(n)).
     • Then the observation step takes place:
       P(B | A = s^(n)) = ω^(n) = (1 / Area(F)) × { Σ F^(n) − Σ B^(n), if Σ F^(n) > Σ B^(n);  0, otherwise }
  10. People detection and tracking
     • This observation is updated by taking the prior weight into account:
       ω_t^(n) = ω^(n) × π_{t-1}^(n) / Σ_{n=1}^{N} π_{t-1}^(n)
     • The normalized observations form the new set of particle weights:
       π_t^(n) = ω_t^(n) / Σ_{n=1}^{N} ω_t^(n)
     Fig. 3. 2D upper-body model for human torso detection and tracking.
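The two update formulas on slides 9–10 can be sketched as one weight-update step. This assumes the raw observation scores ω^(n) (the clipped foreground/background overlap term from slide 9) have already been computed elsewhere; the function name is illustrative:

```python
import numpy as np

def update_particle_weights(obs, prev_weights):
    """One observation/update step of the particle filter on slides 9-10.

    obs[n] is the raw observation score omega^(n) of particle n,
    i.e. (sum F - sum B) / Area(F) clipped at zero, computed elsewhere;
    prev_weights are the previous weights pi_{t-1}.
    """
    obs = np.asarray(obs, dtype=float)
    prev = np.asarray(prev_weights, dtype=float)
    # weight each observation by the normalized prior weight (slide 10, first formula)
    w_t = obs * (prev / prev.sum())
    # normalize to obtain the new particle weights pi_t (second formula)
    return w_t / w_t.sum()
```

With a uniform prior, the new weights are just the normalized observation scores, as expected.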
  11. Methodology
     • Hand detection and tracking
       – Foreground pixels are segmented into skin-color and non-skin-color regions:
         |arctan(B/R) − π/4| < π/8,  |arctan(G/R) − π/6| < π/18,  |arctan(B/G) − π/5| < π/15
       – The face is excluded from the candidate hand regions by using the size of the connected skin-color area.
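The three arctan ratio constraints translate directly into a vectorized mask. This is a sketch of my reading of the (garbled) slide formula; the channel ratios B/R, G/R, B/G and the threshold constants are reconstructed from its layout:

```python
import numpy as np

def skin_mask(r, g, b):
    """Skin-color test from slide 11: three arctan channel-ratio constraints.
    r, g, b are float arrays of equal shape with channel values > 0."""
    c1 = np.abs(np.arctan(b / r) - np.pi / 4) < np.pi / 8
    c2 = np.abs(np.arctan(g / r) - np.pi / 6) < np.pi / 18
    c3 = np.abs(np.arctan(b / g) - np.pi / 5) < np.pi / 15
    # a pixel counts as skin only when all three constraints hold
    return c1 & c2 & c3
```

Applying it per pixel yields the binary skin-color regions; the face is then removed by connected-component size, as the slide describes.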
  12. People Detection, Tracking and Pose Recognition System
     Pipeline: Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game
     Tracking stage: feature point location in multiple views, followed by feature point tracking in subsequent video frames.
  13. Torso and Hand Segmentation
     Fig. 4. Results of torso and hand segmentation.
  14. 3D Reconstruction
     • Three synchronized cameras are used:
       – one front view
       – two side views
     • The 3D positions of the torso and hands can be obtained.
     Fig. 5. Multiple camera settings.
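The slide does not state which reconstruction method is used; a standard way to recover a 3D point from two calibrated views is linear (DLT) triangulation, sketched here under that assumption with illustrative names:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices; x1, x2: (u, v) image
    coordinates of the same point in each view. Returns (X, Y, Z).
    """
    # each view contributes two linear constraints on the homogeneous point
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # the solution is the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

With three cameras, as on the slide, a third view simply adds two more rows to A, making the estimate more robust.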
  15. People Detection, Tracking and Pose Recognition System
     Pipeline: Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game
     Recognition stage: predefined key poses → classifier construction → pose recognition.
  16. Pose Recognition
     • Feature space construction
       – 2D and 3D positions of the torso center and the hands
       – relative positions between the hands and the torso center
       – normalized feature space
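The feature construction above can be sketched in a few lines. The exact normalization used in the talk is not specified, so scaling by a torso size estimate is an assumption here, as are the function and parameter names:

```python
import numpy as np

def pose_features(torso, left_hand, right_hand, torso_scale):
    """Build a normalized feature vector in the spirit of slide 16:
    hand positions relative to the torso center, divided by a torso
    size estimate so the features do not depend on the user's distance
    from the cameras. Inputs are 3D points; torso_scale is a scalar."""
    torso = np.asarray(torso, dtype=float)
    rel = np.concatenate([
        np.asarray(left_hand, dtype=float) - torso,   # left hand relative to torso
        np.asarray(right_hand, dtype=float) - torso,  # right hand relative to torso
    ])
    return rel / torso_scale
```

The resulting six-dimensional vector (per coordinate system) is what the classifiers on the next slides consume.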
  17. Predefined Key Poses: Pose Classification
     • 9 poses into 9 classes
     • 15 persons
     • 1515 samples in total
  18. Results and Discussion
     Cross-validation results of pose classifiers (mean errors with standard deviation):

                    LOPO                            FORO
     method         mean pose err.  max pose err.   mean pose err.  max pose err.
     NMC            0.06 (0.09)     0.18 (0.35)     0.04 (0.02)     0.09 (0.10)
     LDC            0.06 (0.07)     0.14 (0.35)     0.01 (0.01)     0.04 (0.05)
     QDC            0.10 (0.11)     0.23 (0.34)     0.01 (0.01)     0.04 (0.06)
     LDA+QDC        0.07 (0.09)     0.16 (0.35)     0.02 (0.01)     0.04 (0.06)
     Parzen         0.07 (0.09)     0.16 (0.35)     0.01 (0.01)     0.02 (0.04)
     LDA+Parzen     0.06 (0.07)     0.14 (0.35)     0.00 (0.00)     0.01 (0.03)

     Conclusion: the simplest method (NMC) provides performance comparable to the more complex classifiers.
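The NMC baseline that the table singles out is straightforward to implement: each pose class is summarized by its mean feature vector, and a new sample gets the label of the nearest mean. A minimal sketch (class and method names are illustrative):

```python
import numpy as np

class NearestMeanClassifier:
    """Nearest Mean Classifier (NMC), the simplest method in the table:
    each class is represented by the mean of its training features, and
    a sample is assigned to the class with the closest mean."""

    def fit(self, X, y):
        self.labels = np.unique(y)
        self.means = np.stack([X[y == c].mean(axis=0) for c in self.labels])
        return self

    def predict(self, X):
        # Euclidean distance from every sample to every class mean
        d = np.linalg.norm(X[:, None, :] - self.means[None, :, :], axis=2)
        return self.labels[np.argmin(d, axis=1)]
```

Its competitiveness in the table suggests the pose classes are well separated in the normalized feature space, so the extra modeling capacity of QDC or Parzen buys little.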
  19. Results and Discussion
     Confusion matrix of the nine poses (rows: true labels, columns: estimated labels):

           P1   P2   P3   P4   P5   P6   P7   P8   P9
     P1   198    0    0    0    0    0    0    0    0
     P2     0  193    0    0    0    0    0    0    0
     P3     2    0  157    0    0    0    0    0    0
     P4     0    0    0  159    0   20    0    0    0
     P5     1    0    1    0  164    0    2    0    0
     P6     2    3    6    0    0  129    0    0    0
     P7     0    0    1    0    3    0  164    0    0
     P8     0    0    9    0    6    0    1  162    0
     P9     0    0    5    3    0    0    0    0  133

     Conclusion: most poses are recognized very well; however, there is considerable confusion between Pose 4 and Pose 6.
  20. People Detection, Tracking and Pose Recognition System
     Pipeline: Video Sequence → People Detection → People Tracking → Pose Recognition → Spatial Game
     Game stage: the recognized pose controls color; the tracked location controls position.
  21. Spatial Game Demo
  22. Application: Spatial Game
     • Real-time application: 20 frames/second
     • Robust to different environments: different indoor settings
     • Adapts to different users: various users
     PRSD Studio, http://prsysdesign.net/
  23. Future Work
     • Improve the robustness of the system: better skin-color detection, more robust feature detection
     • Develop multiple-user applications: solve the occlusion problem
  24. Thanks for your attention!
