Human Action Recognition Based on Spatio-temporal Features
Nikhil Sawant, K. K. Biswas
Department of Computer Science and Engineering, Indian Institute of Technology, Delhi

Target Localization
The possible search space is the xyt cube; target localization reduces it. The action and the actor are localized in space and time. Background subtraction helps localize the actor, and a region of interest (ROI) is marked around the actor. Only the ROI is processed; everything outside it is ignored.

Motion Features
We make use of optical flow: the pattern of relative motion between the object (or object feature points) and the viewer/camera. We use the Lucas-Kanade two-frame differential method, which yields comparatively robust and dense optical flows.

Fixed-sized Grid
A fixed-sized grid is overlaid on the region of interest. The grid has dimensions Xdiv x Ydiv, dividing the ROI into blocks bij with centres at cij.

Organizing Optical Flows
The unorganized optical flows within each block are organized by simple averaging or weighted averaging.

Noise Reduction
Noise is removed during averaging: optical flows with magnitude > C * Omean are ignored, where C is a constant in the range [1.5, 2] and Omean is the mean optical-flow magnitude within the ROI.
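The grid organization and noise-reduction steps can be sketched together in a few lines. This is a minimal sketch, not the authors' implementation: the function name, the ROI format (x, y, width, height), and the use of simple rather than weighted averaging are all assumptions.

```python
import numpy as np

def organize_flows(points, flows, roi, xdiv=4, ydiv=4, c=1.75):
    """Average optical-flow vectors into an xdiv x ydiv grid over the ROI,
    ignoring flows whose magnitude exceeds c * mean magnitude (noise)."""
    x0, y0, w, h = roi
    mags = np.linalg.norm(flows, axis=1)
    keep = mags <= c * mags.mean()               # noise reduction: drop outliers
    grid = np.zeros((ydiv, xdiv, 2))
    counts = np.zeros((ydiv, xdiv))
    for (px, py), f in zip(points[keep], flows[keep]):
        i = min(int((py - y0) / h * ydiv), ydiv - 1)   # block row
        j = min(int((px - x0) / w * xdiv), xdiv - 1)   # block column
        grid[i, j] += f
        counts[i, j] += 1
    nonzero = counts > 0
    grid[nonzero] /= counts[nonzero, None]       # simple averaging per block
    return grid
```

The result is one representative flow vector per block bij, i.e. the "organized" flow field.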
Shape Features
The shape of the person gives information about the action being performed. Viola-Jones box features are used to obtain shape features; we make use of 2-rectangle and 4-rectangle features, in which foreground pixels in the white region are subtracted from foreground pixels in the grey region. These features are applied at all possible locations on the rectangular grid.
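A 2-rectangle box feature on a binary foreground mask can be computed with four lookups into an integral image (summed-area table). This is a minimal sketch: the window layout (left half vs. right half, and which half plays the grey vs. white role) is an assumption.

```python
import numpy as np

def integral_image(mask):
    """Summed-area table, zero-padded so ii[y, x] = sum of mask[:y, :x]."""
    return np.pad(np.cumsum(np.cumsum(mask, axis=0), axis=1), ((1, 0), (1, 0)))

def rect_sum(ii, x, y, w, h):
    """Sum of mask pixels in [x, x+w) x [y, y+h) using four table lookups."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def two_rect_feature(mask, x, y, w, h):
    """2-rectangle feature: foreground count in the left half of the window
    minus the count in the right half."""
    ii = integral_image(mask)
    return (rect_sum(ii, x, y, w // 2, h)
            - rect_sum(ii, x + w // 2, y, w // 2, h))
```

A 4-rectangle feature is built the same way from four quadrant sums; since the integral image is computed once, evaluating the feature at every grid location stays cheap.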
Spatio-temporal Descriptor
Shape and motion features are combined over a span of time to form spatio-temporal features. TSPAN is the offset between the consecutive video frames used, and TLEN is the number of frames used; together they allow us to capture large changes within a possibly small number of frames.

Learning with Adaboost
We use the standard Adaboost algorithm, a state-of-the-art learning method. In Adaboost, the strong hypothesis is a weighted sum of weak hypotheses; we use linear decision stumps as the weak classifiers. Adaboost classification can be binary or multiclass; we make use of multiclass classification, giving 'n' action classes to the Adaboost system, which trains itself to detect the patterns produced by the different actions. The features extracted from the training set are provided to the learning system so that the patterns produced by the action classes are learned. We prepare mutually exclusive training and testing datasets: the system is first trained on the set of actions, and each given video is then classified into one of the action classes for which it was trained.
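The "weighted sum of weak hypotheses" idea can be sketched as a generic binary AdaBoost with decision stumps. This is not the authors' exact implementation: the exhaustive stump search and the omission of the multiclass extension (e.g. one-vs-rest) are simplifying assumptions.

```python
import numpy as np

def train_adaboost(X, y, rounds=10):
    """Binary AdaBoost with decision stumps as the weak hypotheses.
    X: (n, d) feature matrix, y: labels in {-1, +1}.
    Returns a list of weak hypotheses (feature, threshold, polarity, alpha)."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                      # sample weights
    model = []
    for _ in range(rounds):
        best = None
        for j in range(d):                       # exhaustive stump search
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, j] < thr, 1, -1)
                    err = w[pred != y].sum()     # weighted training error
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))  # stump weight
        w = w * np.exp(-alpha * y * pred)        # boost misclassified samples
        w /= w.sum()
        model.append((j, thr, pol, alpha))
    return model

def predict(model, X):
    """Strong hypothesis: sign of the weighted sum of the weak hypotheses."""
    score = np.zeros(len(X))
    for j, thr, pol, alpha in model:
        score += alpha * pol * np.where(X[:, j] < thr, 1, -1)
    return np.sign(score)
```

Each round re-weights the samples so that the next stump concentrates on the examples the current strong hypothesis still gets wrong.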
Dataset
We constructed our own dataset with 7 actions and 8 actors; the videos were shot in daylight against a stable background. The actions recorded are walk, run, wave1, wave2, bend, sit-down and stand-up. We also benchmark our method on the standard Weizmann dataset, which contains 10 actions (bend, jack, jump, pjump, run, side, skip, walk, wave1, wave2) performed by 9 actors.

Results and Conclusion
On our own dataset we observe only 10% error for the waving, stand-up and bending actions; all other actions show 0% error. In the Weizmann confusion matrix, error is observed only for the run and wave1 actions; all other actions are unambiguous. We report an overall error rate of 2.17%. We conclude that spatio-temporal features combining motion and shape cues can be used effectively for action recognition, and that Adaboost successfully classifies the descriptors formed from these spatio-temporal features.
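For completeness, the TLEN/TSPAN windowing of the spatio-temporal descriptor can be sketched as follows. The function and argument names are illustrative, assuming per-frame shape + motion feature vectors have already been computed.

```python
import numpy as np

def spatio_temporal_descriptor(frame_features, t, tlen, tspan):
    """Concatenate TLEN per-frame feature vectors (shape + motion),
    sampled every TSPAN frames starting at frame t."""
    idx = [t + k * tspan for k in range(tlen)]
    if idx[-1] >= len(frame_features):
        raise ValueError("not enough frames for this TLEN/TSPAN window")
    return np.concatenate([frame_features[i] for i in idx])
```

With a larger TSPAN the same TLEN frames span a longer stretch of the video, which is how a large change can be captured in a small number of frames.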
