
Semantic human activity detection in videos


  1. Semantic Human Activity Detection in Videos
     by Hirantha Pradeep Weerarathna, Dr. Anuja Dharmaratne
     University of Colombo School of Computing
  2. Definitions
     • Basic Human Action – one simple motion of a human body part
     • Human Activity – a combination of basic actions in sequence
     • Semantic Human Activity – a more meaningful human activity
  3. • Human action detection is a well-recognized problem in computer vision
     • It is a very hard problem due to:
       – Variations in recording settings
       – Interpersonal differences
       – Variations in how actions are performed
  4. • Many solutions have been proposed for human action detection
     • A significant observation: almost all of these solutions address basic human actions, and little attention has been paid to human activities
     • Most solutions are based on action pattern analysis
  5. Previous Human Action Detection Solutions
     • Laptev et al. proposed a space-time classifier and a keyframe-priming-based method for the action ‘drinking’
     • Ming Yang et al. proposed an efficient detection method based on motion history images
     • Qingshan Luo et al. proposed a novel action representation called the local motion histogram and a Gentle AdaBoost-based feature selection technique
     • Ke et al. proposed a solution to detect smooth human actions
  6. • These solutions are based on action pattern analysis
     • They fail to detect human activities because:
       1. Some activities have no internal pattern or structure
       2. Some activities are too complex to identify using basic action detection techniques
       3. For some activities an action template could be created, but the pattern is not preserved when the actions are actually performed
  7. Problem Statement
     How can such semantic human activities be identified?
  8. Our Solution
     • Identify human activities based on Context Specific Information (CSI)
     • We propose a solution prototype for the activity ‘smoking’
  9. Context Specific Information
     • An information set directly associated with a particular human activity
     • The best description of the activity
     • Strong enough to discriminate the activity from thousands of similar activity classes
  10. CSI Examples
     • Fighting
       – rapid hand and leg movements
       – collision of two or more human silhouettes
     • Delivering a speech
       – changing facial expressions
       – continuous hand movements
     • Riding a bicycle
       – continuous leg movements
       – bent hands and body
       – rapid movement through space
  11. Smoking
     • A well-known human activity
     • Causes fatal diseases
     • CSI set associated with smoking
       – Property I: repeated motion from hand to mouth and from mouth to hand
       – Property II: appearance of the main frame
       – Property III: appearance of smoke
  12. Solution Architecture
     [Diagram. Components: Input Video, Frame Grabber, Human Detector, Motion Analyzer, Main Frame Detector, Smoke Detector, Frame Collector, Output Video]
  13. Human Detector
     • Detects and localizes humans using a face detection technique
     • Deploys two classifiers: one for the frontal view of the face and one for the profile view
     • Haar cascades are used for detection
     • If the detector finds no human, the frame cannot be part of a smoking scene
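The slides do not say how the outputs of the two face classifiers are combined. A minimal sketch, assuming each classifier returns boxes as `(x, y, w, h)` tuples and that a profile box overlapping a frontal box is treated as a duplicate; the function names and the 0.5 overlap threshold are hypothetical, not taken from the presentation:

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def combine_detections(frontal, profile, overlap=0.5):
    """Union of both detectors' boxes; profile boxes that duplicate
    an already kept box (IoU >= overlap) are dropped."""
    merged = list(frontal)
    for box in profile:
        if all(iou(box, kept) < overlap for kept in merged):
            merged.append(box)
    return merged


def frame_has_human(frontal, profile):
    """Gate for the rest of the pipeline: no face, no smoking scene."""
    return bool(combine_detections(frontal, profile))
```

Taking the union of both detectors is one way a combined detector could reach a higher recall than either classifier alone, at some cost in precision.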
  14. Motion Analyzer
     • Associated with smoking Property I
     • Creates a motion history image to accumulate motion information
     • Alerts the Main Frame Detector (MFD) when there is motion from hand to mouth or from mouth to hand
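A motion history image (MHI) keeps, for each pixel, how recently motion was seen there. The sketch below uses the common update rule (stamp moving pixels with `tau`, let the rest decay by one per frame) on plain lists of grayscale values; the parameter values and the frame-difference motion test are assumptions, since the slides give no details:

```python
def update_mhi(mhi, prev_frame, frame, tau=5, threshold=30):
    """Return a new motion history image after observing `frame`.

    Pixels whose grayscale value changed by more than `threshold`
    since `prev_frame` are set to `tau`; all others decay by one
    step toward zero.
    """
    rows = []
    for h_row, p_row, c_row in zip(mhi, prev_frame, frame):
        row = []
        for h, p, c in zip(h_row, p_row, c_row):
            moved = abs(c - p) > threshold
            row.append(tau if moved else max(0, h - 1))
        rows.append(row)
    return rows


# A bright blob moving left to right across a 1x3 image.
f0, f1, f2 = [[200, 0, 0]], [[0, 200, 0]], [[0, 0, 200]]
mhi = [[0, 0, 0]]
mhi = update_mhi(mhi, f0, f1, tau=3)
mhi = update_mhi(mhi, f1, f2, tau=3)
```

After the two updates the older motion on the left has a lower value than the newer motion on the right, so the gradient of the MHI indicates the direction of movement. That is the kind of cue a hand-to-mouth test could use.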
  15. Main Frame Detector
     • Associated with smoking Property II
     • Detects main frames
     • Uses object detection techniques to detect actions
     • Deploys a HOG-feature-based SVM for detection
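To illustrate the idea behind a HOG-plus-SVM detector, here is a deliberately simplified sketch: one gradient-orientation histogram for a single grayscale cell, scored by a linear decision function. A full HOG descriptor also uses a grid of cells, block normalization, and bin interpolation, none of which are shown. The weights and bias are placeholders, not trained values from the presentation:

```python
import math


def orientation_histogram(patch, bins=9):
    """Unsigned gradient-orientation histogram (0-180 degrees) of one
    grayscale cell given as a list of rows, L2-normalized."""
    hist = [0.0] * bins
    height, width = len(patch), len(patch[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]   # central differences
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / (180.0 / bins)) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm else hist


def svm_score(features, weights, bias):
    """Linear SVM decision value; positive means 'main frame'."""
    return sum(w * f for w, f in zip(weights, features)) + bias


# Placeholder model that responds to vertical edges (bin 0).
weights = [1.0] + [0.0] * 8
bias = -0.5
vertical_edge = [[0, 0, 255, 255] for _ in range(4)]
score = svm_score(orientation_histogram(vertical_edge), weights, bias)
```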
  16. Smoke Detector
     • Associated with smoking Property III
     • Detects the appearance of smoke in the video sequence
     • Uses a modified version of the work of Phillips III et al.
     • Accumulates n frames to capture smoke properties
     • Uses two properties of smoke: its distinctive color distribution and rapid motion
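The slides name the two cues (color and motion over n accumulated frames) but not the actual rules. The following is only an illustrative stand-in, not the method of Phillips III et al. or the authors' modification: it assumes smoke pixels are roughly gray (small spread between R, G, B) and that their intensity fluctuates across the accumulated frames. Both thresholds are hypothetical:

```python
def is_grayish(pixel, max_spread=20):
    """Assumed color cue: R, G and B are close to each other."""
    r, g, b = pixel
    return max(r, g, b) - min(r, g, b) <= max_spread


def smoke_candidates(frames, max_spread=20, min_change=15):
    """Boolean mask of candidate smoke pixels.

    `frames` is a list of n frames, each a list of rows of (r, g, b)
    tuples. A pixel is a candidate if it is grayish in the latest
    frame and its mean intensity varied by at least `min_change`
    over the n frames (assumed motion cue).
    """
    latest = frames[-1]
    mask = []
    for y, row in enumerate(latest):
        mask_row = []
        for x, pixel in enumerate(row):
            intensities = [sum(f[y][x]) / 3 for f in frames]
            change = max(intensities) - min(intensities)
            mask_row.append(is_grayish(pixel, max_spread) and change >= min_change)
        mask.append(mask_row)
    return mask
```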
  17. Dataset
     • No public dataset with smoking videos or main frames is available
     • We used the movies ‘Coffee and Cigarettes’ and ‘Sea and Love’
     • Additional samples were downloaded from the WWW
     • Training datasets do not overlap with the testing data
  18. Results Evaluation
     • Results of face frontal view detector
         Dataset   Recall   Precision
         Movie     88%      98%
         Global    78%      92%
     • Results of face profile view detector
         Dataset   Recall   Precision
         Movie     53%      88%
         Global    57%      86%
     • Results of combined face detector
         Dataset   Recall   Precision
         Movie     92%      90%
         Global    84%      88%
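For reference, recall and precision in tables like these are computed from counts of true positives (TP), false positives (FP), and false negatives (FN). The counts in the usage line are made up for illustration and are not the presentation's data:

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical counts: 9 faces found correctly, 1 false alarm, 3 faces missed.
precision, recall = precision_recall(tp=9, fp=1, fn=3)
```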
  19. Results Evaluation
     • Results of main frame detector
       [Figure: ROC curve, Detection Rate vs. False Positive Rate]
  20. Results Evaluation
     • Results of smoke detector
         Dataset     Detection Rate   FP Rate
         Colored     92%              5%
         Grayscale   60%              40%
  21. Strengths of CSI Based Solutions
     • Can be designed as an evidence-collecting approach
     • Robust to variations in how actions are performed
     • Robust to dynamic and cluttered backgrounds
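One way to read "evidence collecting" is that each detector contributes an independent piece of evidence and a final rule combines them. The sketch below assumes the human detector acts as a gate and that at least two of the three smoking properties must be observed. The class, field names, and the `min_properties` threshold are all hypothetical; the slides do not state the decision rule:

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    human: bool = False              # Human Detector found a face
    hand_mouth_motion: bool = False  # Property I (Motion Analyzer)
    main_frame: bool = False         # Property II (Main Frame Detector)
    smoke: bool = False              # Property III (Smoke Detector)


def is_smoking(evidence, min_properties=2):
    """Assumed rule: a human must be present, and at least
    `min_properties` of the three CSI properties must be observed."""
    if not evidence.human:
        return False
    observed = sum([evidence.hand_mouth_motion,
                    evidence.main_frame,
                    evidence.smoke])
    return observed >= min_properties
```

Because each property is checked separately, a missed cue (for example, smoke hidden by a cluttered background) does not necessarily prevent detection.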
  22. Future Work
     • This work introduces the significance of using CSI for activity detection. We expect open discussion and more accurate solutions based on our concept.
     • Train the classifiers using more samples
     • Analyze the importance of sound information associated with a particular activity as a source of context specific information
  23. Conclusion
     • Action pattern recognition is sufficient for identifying basic human actions
     • But it is not sufficient for detecting human activities
     • CSI can be used to detect such human activities
     • The CSI set used to detect one activity class cannot be used to detect another activity class
     • The CSI set for a particular activity should be selected carefully
