Human Action Recognition Using 3D Joint Information and HOOFD Features

Master's thesis defence presentation by Baris Can Ustundag

"Human Action Recognition Using 3D Joint Information and HOOFD Features"

Published in: Software

  1. 1. Human Action Recognition Using 3D Joint Information and Pyramidal HOOFD Features MSc Thesis by Barış Can Üstündağ Thesis Advisor: Prof. Dr. Mustafa Ünel (Images to be added here)
  2. 2. • Introduction to Human Action Recognition – Motivation, Applications – Related Work • Human Action Recognition Using 3D Joint Information and HOOFD Features – Acquiring Depth Data – Feature Extraction • 3D Joints • HOOFD – Feature Representation – Classification • Experiments – Datasets • MSR Action 3D Dataset • MSR Action Pairs Dataset • MSRC-12 Gesture Dataset • Conclusions & Future Work Outline
  3. 3. • Introduction to Human Action Recognition – Motivation, Applications – Related Work • Human Action Recognition Using 3D Joint Information and HOOFD Features – Acquiring Depth Data – Feature Extraction • 3D Joints • HOOFD – Feature Representation – Classification • Experiments – Datasets • MSR Action 3D Dataset • MSR Action Pairs Dataset • MSRC-12 Gesture Dataset • Conclusions & Future Work Outline
  4. 4. • Motion Perception – Gunnar Johansson [1971] • Sequence of images for Human Motion Analysis • ‘Moving Light Displays’ enable identification of people and gender • Motion Capture [2014] – Dawn of the Planet of the Apes Motivation
  5. 5. • Vast amount of Data YouTube • More than 34K hours of video uploaded every day Surveillance Cameras • ~30 M cameras in the US • ~700K video hours every day Motivation
  6. 6. • Video Categorization Movies TV YouTube Motivation
  7. 7. • Video Categorization – How many human-pixels are there? Movies TV YouTube Motivation
  8. 8. • Video Categorization – How many human-pixels are there? Movies TV YouTube 35% 34% 40% Motivation
  9. 9. • Rehabilitation – 15M people suffer from stroke every year – Automated systems – Gamification Motivation - Application
  10. 10. • Release of Low-cost Depth Cameras – Kinect (2010) – Google Tango (developers only, 2014) – Leap Motion (2013) • Effective and robust performance given – Complex background – Challenging viewpoints – Occlusions Motivation – Why depth? Google Tango Leap Motion
  11. 11. Intensity Based • Extraction of Cuboids • Motion History Images, Motion Energy Images Depth Map Based • Depth Motion Maps • Histogram of Oriented 4D Normals Skeletal Data Based • SMIJ – Sequence of most informative Joints • HOJ3D – Histogram of 3D Joint Locations Related Work
  12. 12. Related Work • Extraction of Cuboids, Dollar et al. [CVPR, 2005] • Motion History Images Motion Energy Images, Gorelick et al. [PAMI, 2007] Intensity Based
  13. 13. Related Work • Histogram of Oriented 4D Normals (HON4D) Oreifej et al. [CVPR, 2013] • Depth Motion Maps, Yang et al. [JRTIP, 2012] Depth Map Based
  14. 14. Related Work • Sequence of Most Informative Joints (SMIJ), Ofli et al. [CVIU, 2013] • View Invariant Human Action Recognition Using Histogram of 3D Joints, Xia et al. [CVPR, 2012] Skeletal Data Based
  15. 15. • Introduction to Human Action Recognition – Motivation, Applications – Related Work • Human Action Recognition Using 3D Joint Information and HOOFD Features – Acquiring Depth Data – Feature Extraction • 3D Joints • HOOFD – Feature Representation – Classification • Experiments – Datasets • MSR Action 3D Dataset • MSR Action Pairs Dataset • MSRC-12 Gesture Dataset • Conclusions & Future Work Outline
  16. 16. Human Action Recognition Using 3D Joint Information and HOOFD Features • Acquiring Depth Data – Depth Acquisition – Formation of shadows – Eliminating the noise • Feature Extraction – 3D Joints – HOOFD • Feature Representation – Signal Warping – Pyramidal HOOFD Features • Classification – Naive Bayes – Support Vector Machines
  17. 17. • Kinect – Depth data acquisition is accomplished using the ‘Light Coding’ method • Before the depth data can be processed in an application, two issues must be handled – Formation of shadows – Eliminating the noise
  18. 18. • Shadows – Generated by the foreground objects • Noise – Rough object boundaries cause gaps and holes in the depth data • Bilateral Filter – Combines a space term and a range term
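The bilateral filter named on this slide can be sketched as below. This is a minimal brute-force illustration, not the thesis implementation: the function name, the window radius and the two sigma values are assumptions chosen only for the example.

```python
import numpy as np

def bilateral_filter(depth, radius=2, sigma_space=2.0, sigma_range=30.0):
    """Smooth a 2D depth map while keeping depth edges (brute-force sketch)."""
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    padded = np.pad(depth, radius, mode="edge")
    num = np.zeros_like(depth)
    den = np.zeros_like(depth)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            # Space term: down-weights neighbours far away in the image plane.
            w_space = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_space ** 2))
            # Range term: down-weights neighbours with a very different depth.
            w_range = np.exp(-((shifted - depth) ** 2) / (2.0 * sigma_range ** 2))
            weight = w_space * w_range
            num += weight * shifted
            den += weight
    # The centre pixel always contributes weight 1, so den is never zero.
    return num / den
```

Because the range term collapses across large depth jumps, a sharp foreground/background boundary is left almost untouched while small fluctuations inside a surface are averaged out.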
  19. 19. • Joint Features – 20 joints are provided by the Kinect SDK – 10 joint angles and their derivatives are calculated: $\dot{\theta}_k \approx (\theta_k - \theta_{k-1}) / T$
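A joint angle and its time derivative could be computed roughly as follows. The slide does not say which joint triples form the 10 angles, so the triple used here, the backward difference and the 30 Hz sampling period are all assumptions for illustration.

```python
import numpy as np

def joint_angle(parent, joint, child):
    """Angle in radians at `joint` between the bones joint->parent and joint->child."""
    a = np.asarray(parent, dtype=float) - np.asarray(joint, dtype=float)
    b = np.asarray(child, dtype=float) - np.asarray(joint, dtype=float)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip guards against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def angle_derivative(angles, T=1.0 / 30.0):
    """Backward finite difference of an angle sequence sampled every T seconds."""
    angles = np.asarray(angles, dtype=float)
    return np.diff(angles) / T
```

For example, `joint_angle(shoulder, elbow, wrist)` with three hypothetical 3D positions gives the elbow angle for one frame; applying `angle_derivative` to the per-frame values gives its angular velocity.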
  20. 20. • Joint Features – Mapped to spherical coordinates – Origin is aligned to the hip center – Radius parameter is discarded
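The mapping on this slide can be sketched as follows: shift the origin to the hip center, convert to spherical coordinates and keep only the two angles. The axis convention (inclination measured from the z axis) and the function name are assumptions.

```python
import math

def joint_to_angles(joint, hip_center):
    """Return (azimuth, inclination) of a joint relative to the hip center."""
    dx = joint[0] - hip_center[0]
    dy = joint[1] - hip_center[1]
    dz = joint[2] - hip_center[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        # The hip center itself has no direction; return a neutral value.
        return 0.0, 0.0
    azimuth = math.atan2(dy, dx)
    inclination = math.acos(max(-1.0, min(1.0, dz / r)))
    # The radius r is deliberately discarded, as stated on the slide.
    return azimuth, inclination
```

Dropping the radius means the feature depends only on the direction of each joint from the hip, not on its distance from it.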
  21. 21. • Histogram of Oriented Optical Flows from Depth (HOOFD) – Optical Flow from Depth Data • Mapping of depth data to intensity image • Depth values (z) represented as intensity (I) • Optical flow field which is invariant to sudden change of brightness
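One way to represent depth values z as intensity I is a linear rescaling to 8 bits, sketched below. Treating 0 as "no depth reading" and scaling between the minimum and maximum valid depth are assumptions, not details given in the slides.

```python
import numpy as np

def depth_to_intensity(depth, invalid_value=0):
    """Map a depth image to an 8-bit intensity image by linear rescaling."""
    depth = np.asarray(depth, dtype=np.float64)
    valid = depth != invalid_value          # assumed marker for missing depth
    out = np.zeros(depth.shape, dtype=np.uint8)
    if not valid.any():
        return out
    d_min = depth[valid].min()
    d_max = depth[valid].max()
    scale = max(d_max - d_min, 1e-9)        # avoid division by zero on flat maps
    out[valid] = np.round(255.0 * (depth[valid] - d_min) / scale).astype(np.uint8)
    return out
```

The resulting image depends on scene geometry rather than lighting, so any standard intensity-based optical flow method can be run on it.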
  22. 22. • Histogram of Oriented Optical Flows from Depth (HOOFD) – Optical Flow • 2D displacement of pixel patches on the image plane • Brightness Constancy Equation: $I(x, y, t) = I(x + \Delta x, y + \Delta y, t + \Delta t)$ • Linearizing, assuming small $(u, v)$, using a Taylor series expansion: $I_x(x, y, t)\,u(x, y) + I_y(x, y, t)\,v(x, y) + I_t(x, y, t) = 0$, where $u(x, y) = \Delta x / \Delta t$ and $v(x, y) = \Delta y / \Delta t$
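  23. 23. • Optical Flow – Lucas-Kanade Method • Apply it within a local patch • Minimize using the least-squares method: $E(u, v) = \sum_{x, y} \left( I_x(x, y, t)\,u + I_y(x, y, t)\,v + I_t \right)^2$ • Setting the derivatives to zero gives $\begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = -\begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix}$ • Equivalently, stacking one row $[I_x \; I_y]$ per pixel into $A$ and the values $-I_t$ into $\mathbf{b}$: $A\mathbf{u} = \mathbf{b}$, so $\mathbf{u} = (A^T A)^{-1} A^T \mathbf{b}$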
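The Lucas-Kanade least-squares step for a single patch can be sketched as below. Image derivatives are approximated with `np.gradient` and a plain frame difference, which is an assumption; the slides do not specify the derivative filters.

```python
import numpy as np

def lucas_kanade_patch(prev, curr):
    """Estimate one flow vector (u, v) for a patch from two consecutive frames."""
    prev = np.asarray(prev, dtype=np.float64)
    curr = np.asarray(curr, dtype=np.float64)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    Iy, Ix = np.gradient(prev)
    It = curr - prev
    # One row [Ix, Iy] per pixel; right-hand side is -It.
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    # lstsq solves the same normal equations and copes with rank-deficient A.
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow
```

Using `lstsq` instead of an explicit inverse avoids a failure on textureless patches, where the 2x2 matrix is singular (the aperture problem).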
  24. 24. • Optical Flow – Horn-Schunck Method • Assumption: global smoothness in the flow over the whole image • Smoothness error: $E_s = \iint_D \left( u_x^2 + u_y^2 + v_x^2 + v_y^2 \right) dx\,dy$ • Error in brightness constancy equation: $E_c = \iint_D \left( I_x u + I_y v + I_t \right)^2 dx\,dy$ • Minimize: $E_c + \lambda E_s$
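A minimal sketch of the classic Horn-Schunck iteration is given below. The 4-neighbour average, the parameter `alpha` (whose square weights the smoothness term) and the fixed iteration count are assumptions made for the example.

```python
import numpy as np

def _neighbour_mean(f):
    """Mean of the four direct neighbours, with replicated borders."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck(prev, curr, alpha=1.0, n_iter=100):
    """Dense flow (u, v) balancing brightness constancy against smoothness."""
    prev = np.asarray(prev, dtype=np.float64)
    curr = np.asarray(curr, dtype=np.float64)
    Iy, Ix = np.gradient(prev)
    It = curr - prev
    u = np.zeros_like(prev)
    v = np.zeros_like(prev)
    for _ in range(n_iter):
        u_bar = _neighbour_mean(u)
        v_bar = _neighbour_mean(v)
        # Brightness-constancy residual of the smoothed flow, normalised.
        resid = (Ix * u_bar + Iy * v_bar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = u_bar - Ix * resid
        v = v_bar - Iy * resid
    return u, v
```

Unlike the per-patch Lucas-Kanade solve, this produces a flow vector at every pixel; larger `alpha` gives a smoother field.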
  25. 25. • Histogram of Oriented Optical Flow from Depth • Binning according to: – Primary angle between the flow vector and the horizontal axis: $\theta = \tan^{-1}(v / u)$ – Magnitude of the flow vector: $M = \sqrt{u^2 + v^2}$ • Orientation & Magnitude images • Histogram binning example with bin size = 4
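The binning can be sketched as follows. Here the primary angle is computed as `arctan2(v, |u|)`, which folds left- and right-pointing flow into the same bin; that reading of "primary angle", the magnitude weighting and the final normalisation are assumptions, and a literal `arctan(v / u)` would behave differently for negative `u`.

```python
import numpy as np

def hoof(u, v, n_bins=4):
    """Magnitude-weighted histogram of flow orientations, normalised to sum to 1."""
    u = np.asarray(u, dtype=np.float64).ravel()
    v = np.asarray(v, dtype=np.float64).ravel()
    mag = np.hypot(u, v)
    # Angle to the horizontal axis, in [-pi/2, pi/2].
    theta = np.arctan2(v, np.abs(u))
    idx = ((theta + np.pi / 2) / np.pi * n_bins).astype(int)
    idx = np.minimum(idx, n_bins - 1)       # theta == pi/2 goes to the last bin
    hist = np.bincount(idx, weights=mag, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

Normalising the histogram makes it less sensitive to how many pixels move and how fast, so the same action performed nearer to or farther from the camera produces a similar descriptor.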
  26. 26. • Signal Warping – If it is a longer action instance -> Discard frames – If it is a shorter action instance -> Replicate and insert frames
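A simple way to bring every action instance to a common length is uniform index resampling, sketched below. The function name and the choice of evenly spaced indices are assumptions; the slides only state that frames are discarded or replicated.

```python
def warp_to_length(frames, target_len):
    """Resample a frame sequence to exactly target_len frames."""
    n = len(frames)
    if n == 0 or target_len <= 0:
        raise ValueError("need a non-empty sequence and a positive target length")
    # Evenly spaced source indices: skips frames when n > target_len,
    # repeats frames when n < target_len.
    return [frames[(i * n) // target_len] for i in range(target_len)]
```

For instance, `warp_to_length(list(range(10)), 5)` keeps every second frame, while `warp_to_length([0, 1, 2], 6)` repeats each frame twice.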
  27. 27. • Pyramidal HOOFD Features – Histogram of Oriented Optical Flow from Depth • After obtaining the optical flow: 1. Patches are extracted around each joint
  28. 28. • Pyramidal HOOFD Features – Histogram of Oriented Optical Flow from Depth • After obtaining the optical flow: 1. Patches are extracted around each joint 2. HOOFDs are calculated in a pyramidal fashion (Level 1, Level 2, Level 3)
  29. 29. [Figure: HOOFD pyramid, Levels 1 to 3]
  30. 30. [Figure: HOOFD pyramid, Levels 1 to 3]
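The two steps (patch around a joint, then histograms at several pyramid levels) might be sketched as below. The slides do not define the levels, so splitting the patch into 1x1, 2x2 and 4x4 cells is an assumption, as are the patch size, the bin count and the helper `_hoof`.

```python
import numpy as np

def _hoof(u, v, n_bins):
    """Magnitude-weighted orientation histogram of one cell (normalised)."""
    mag = np.hypot(u, v).ravel()
    theta = np.arctan2(v, np.abs(u)).ravel()
    idx = np.minimum(((theta + np.pi / 2) / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(idx, weights=mag, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

def pyramidal_hoofd(u, v, joint_xy, half_size=8, levels=3, n_bins=4):
    """Concatenate cell histograms of a joint-centred flow patch over all levels."""
    x, y = joint_xy
    y0, x0 = max(y - half_size, 0), max(x - half_size, 0)
    pu = np.asarray(u, dtype=float)[y0:y + half_size, x0:x + half_size]
    pv = np.asarray(v, dtype=float)[y0:y + half_size, x0:x + half_size]
    feats = []
    for level in range(levels):
        cells = 2 ** level                  # assumed grid: 1x1, 2x2, 4x4
        rows = zip(np.array_split(pu, cells, axis=0),
                   np.array_split(pv, cells, axis=0))
        for row_u, row_v in rows:
            cols = zip(np.array_split(row_u, cells, axis=1),
                       np.array_split(row_v, cells, axis=1))
            for cell_u, cell_v in cols:
                feats.append(_hoof(cell_u, cell_v, n_bins))
    return np.concatenate(feats)
```

With these defaults each joint yields (1 + 4 + 16) x 4 = 84 values per frame; the coarse level captures the overall motion direction and the finer levels add spatial layout.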
  31. 31. • Supervised learning methods – Training examples are labelled with known classes • Example application: spam filtering in an e-mail client – Example classifiers: Naive Bayes, Support Vector Machines
  32. 32. • Naive Bayes Classifier – Independence assumption between features • For example: a car that is red and has 17-inch wheels may be classified as a ‘Volkswagen’; each of these features is assumed to contribute independently to that decision
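The independence assumption can be made concrete with a small Gaussian Naive Bayes classifier. Modelling each feature as a per-class Gaussian is an assumption; the slides do not say which likelihood model was used.

```python
import numpy as np

class GaussianNaiveBayes:
    """Each feature is modelled independently as a Gaussian per class."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=np.float64)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small constant keeps the variance strictly positive.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=np.float64)
        diff = X[:, None, :] - self.mean_[None, :, :]
        # Independence: the joint log-likelihood is a sum over features.
        log_lik = -0.5 * (np.log(2.0 * np.pi * self.var_)[None, :, :]
                          + diff ** 2 / self.var_[None, :, :]).sum(axis=2)
        return self.classes_[np.argmax(log_lik + self.log_prior_[None, :], axis=1)]
```

The sum over features in `predict` is exactly where the independence assumption enters: no covariance between features is ever estimated.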
  33. 33. • Support Vector Machines – Find the optimal (maximum-margin) hyperplane that defines the decision boundary between two classes
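A two-class linear SVM can be sketched with sub-gradient descent on the regularised hinge loss. This is an illustrative stand-in rather than the solver used in the thesis; the learning rate, regularisation weight and epoch count are arbitrary assumptions, and labels must be -1 or +1.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b minimising lam/2 * |w|^2 + mean hinge loss (labels in {-1, +1})."""
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                  # points inside or beyond the margin
        grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict_linear_svm(X, w, b):
    """Side of the hyperplane w.x + b = 0 for each row of X."""
    return np.where(np.asarray(X, dtype=np.float64) @ w + b >= 0, 1, -1)
```

Only points that violate the margin contribute to the gradient, which mirrors the idea that the decision boundary is determined by the support vectors.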
  34. 34. • Introduction to Human Action Recognition – Motivation, Applications – Related Work • Action Recognition Using 3D Joint Information and HOOFD Features – Acquiring Depth Data – Feature Extraction • 3D Joints • HOOFD – Feature Representation – Classification • Experiments – Datasets • MSR Action 3D Dataset • MSR Action Pairs Dataset • MSRC-12 Gesture Dataset • Conclusions & Future Work Outline
  35. 35. • Datasets – MSR Action 3D • 10 Subjects • 20 Actions – MSR Pairs 3D • 10 Subjects • 12 Actions – MSRC-12 Gesture • 30 Subjects • 12 Actions Experiments
  36. 36. Experiment - 1 Settings • Dataset: MSRC-12 Gesture • Feature: Joint Features • Ratio: • Leave-one-subject-out cross-validation • 50% Training 50% Test • 75% Training 25% Test
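Leave-one-subject-out cross-validation holds out all samples of one subject per fold, so the classifier is always tested on a person it has not seen. A minimal sketch, with a hypothetical `subject_ids` list giving the subject of each sample:

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

With 30 subjects, as in MSRC-12, this produces 30 folds; the reported accuracy would typically be the average over them.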
  37. 37. Experiment - 1
  38. 38. Experiment - 1
  39. 39. Experiment - 2 Settings • Feature: HOOFD Features • Dataset: MSR Action 3D • Ratio: 50% Training 50% Test
  40. 40. Experiment - 2 Settings • Feature: HOOFD Features • Dataset: MSR Action 3D • Ratio: 50% Training 50% Test
  41. 41. Experiment - 2 Settings • Feature: HOOFD Features • Dataset: MSR Action 3D • Ratio: 50% Training 50% Test Smash Action Forward Punch Action
  42. 42. Experiment - 3 Settings • Feature: HOOFD Features • Dataset: MSR Action Pairs • Ratio: 50% Training 50% Test
  43. 43. Conclusion & Future Work • We developed a novel human action recognition framework by fusing 3D Joint information and HOOFD features • We proposed a new feature called Histogram of Oriented Optical Flow from Depth (HOOFD) • Several experiments with publicly available datasets were conducted to assess the performance of the proposed technique. • Comparison with state-of-the-art algorithms shows the success of our algorithm. • As future work, – Potential of HOOFD will be fully explored – Different popular classification approaches will be employed (Bag of Words, Random Forest, Boosted Trees)
  44. 44. Thank you. Questions?
