
【ISVC2015】Evaluation of Vision-based Human Activity Recognition in Dense Trajectory Framework


ISVC2015 paper
http://www.hirokatsukataoka.net/pdf/isvc15_kataoka_dt13feature.pdf

Activity recognition has been an active research topic in computer vision. Recently, the most successful approaches use dense trajectories, which extract a large number of trajectories and encode features along the trajectories into a codeword. In this paper, we evaluate various features in the dense trajectory framework on several types of datasets. We implement 13 features in total, covering five different types of descriptor, namely motion-, shape-, texture-, trajectory- and co-occurrence-based feature descriptors. The experimental results show the relationship between feature descriptors and performance rate on each dataset. Different scenes of traffic, surgery, daily living and sports are used to analyze the feature characteristics. Moreover, we test how much the performance of concatenated vectors depends on the descriptor types, comparing the top-ranked descriptors from our experiments against the concatenation of all 13 feature descriptors on fine-grained datasets. Feature evaluation is beneficial not only for the activity recognition problem, but also for other spatio-temporal recognition domains.



  1. 1. Evaluation of Vision-based Human Activity Recognition in Dense Trajectory Framework Hirokatsu Kataoka, Yoshimitsu Aoki†, Kenji Iwata, Yutaka Satoh National Institute of Advanced Industrial Science and Technology (AIST) † Keio University http://www.hirokatsukataoka.net/
  2. 2. Background Computer vision for human sensing -  Detection, Tracking, Trajectory Analysis -  Posture Estimation, Activity Recognition -  Action recognition can extend human sensing applications (diagram labels: mental state, body, situation, attention, activity analysis, shaking hands, look at people, detection, gaze estimation, action recognition, posture estimation, face recognition, trajectory extraction, tracking)
  3. 3. Activity Recognition “Activity” is a low-level primitive with semantic meaning, e.g. walking, running, sitting (this image contains a man walking) - The classification (location is given): activity recognition - The classification and localization: activity detection
  4. 4. Dense Trajectories (DT) [Wang+, IJCV2013] •  State-of-the-art space-time recognition approach –  State-of-the-art: DT + Deep Learning [THUMOS2015] –  A practical motion analyzer –  Simply: (i) flow tracking, (ii) feature vectorization over a large number of optical flows [THUMOS2015] http://www.thumos.info/results.html
  5. 5. History of keypoint/traj.-based approach •  Space-time interest points (STIP) – DT STIP: Space-time interest points [Laptev et al., IJCV2005] Dense Trajectories [Wang et al., CVPR2011] [Laptev et al., CVPR2008] HOG + HOF on STIP Feature Mining for Activity Recognition [Gilbert et al., PAMI2011] Cuboid Features [Dollar et al., PETS2005] STR: Spatio-Temporal Relationship Match [Ryoo et al., ICCV2009] [Raptis et al., ECCV2010] Tracklet Descriptors
  6. 6. STIP & DT: Sampling •  Space-time interest points (STIP) – DT STIP: Space-time interest points [Laptev et al., IJCV2005] Dense Trajectories [Wang et al., CVPR2011] Action Bank [Sadanand et al., CVPR2012] [Laptev et al., CVPR2008] HOG + HOF on STIP Feature Mining for Activity Recognition [Gilbert et al., PAMI2011] Cuboid Features [Dollar et al., PETS2005] STR: Spatio-Temporal Relationship Match [Ryoo et al., ICCV2009] [Raptis et al., ECCV2010] Tracklet Descriptors
  7. 7. Co-occurrence features in DT •  Extended co-occurrence feature (ECoHOG) –  Feature •  CoHOG [Watanabe, PSIVT2009] (pair-count), ECoHOG (edge-magnitude accum.) •  PCA for codeword •  DT+Co-occurrence features (62.4%) > DT (59.2%) on MPII cooking CoHOG ECoHOG H. Kataoka+, “Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity Recognition”, in ACCV2014. Need for more features! Pose-based approach Holistic approach
  8. 8. Proposal •  Feature evaluation for better performance –  Evaluation of 13 features under fair settings –  5 categories •  Trajectory: traj. feature (originally in DT) •  Shape: HOG, SIFT •  Motion: HOF, MBHx, MBHy, MIP •  Texture: HLAC, LBP, iLBP, LTP •  Co-occurrence: CoHOG, ECoHOG –  4 different datasets •  NTSEL (traffic) •  INRIA surgery (surgery) •  MSR daily activity 3D (daily living) •  UCF50 (sports)
  9. 9. Simple algorithm •  (i) Flow tracking –  Pyramidal images & sampling –  Farneback optical flow & flow tracking •  (ii) Feature vectorization –  HOG, HOF, MBH, Trajectory, SIFT, LBP….. –  Bag-of-words (BoW) representation
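Below is a minimal sketch of the two-stage pipeline on this slide, assuming Python with scikit-learn: trajectory descriptors are quantized against a k-means codebook into a bag-of-words histogram. The codebook size (4000) and the random stand-in descriptors are illustrative assumptions, not values from the paper.

```python
# Sketch of the DT pipeline tail: quantize per-trajectory descriptors into a
# BoW histogram. Descriptor dimensionality and codebook size are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def video_to_bow(trajectory_descriptors, codebook):
    """trajectory_descriptors: (N, D) array, one row per tracked trajectory."""
    words = codebook.predict(trajectory_descriptors)            # nearest codeword id
    hist, _ = np.histogram(words, bins=np.arange(codebook.n_clusters + 1))
    return hist / max(hist.sum(), 1)                            # L1-normalized BoW

# Build the codebook once from descriptors pooled over training videos.
train_desc = np.random.rand(10000, 96)                          # stand-in for real HOG/HOF/MBH
codebook = KMeans(n_clusters=4000, n_init=1).fit(train_desc)
bow = video_to_bow(np.random.rand(500, 96), codebook)
```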
  10. 10. Pyramidal images & sampling •  Scaling and dense sampling –  Pyramidal images •  Scales *= 1/√2 –  Sampling at each scale •  Grid: 5x5 [pxls] (experimentally decided) •  Corner detection: keep points whose smaller eigenvalue λ exceeds a threshold T (scale invariant, detailed description)
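A sketch of the sampling step on this slide, assuming Python with OpenCV: a 1/√2 scale pyramid and a 5x5-pixel grid, with points in textureless regions rejected by an eigenvalue threshold. The 0.001 quality factor and the blockSize are assumptions for illustration.

```python
# Minimal sketch of dense sampling: scale pyramid, 5x5 grid, and removal of
# points whose smaller structure-tensor eigenvalue is below a threshold.
import cv2
import numpy as np

def dense_sample(gray, step=5, quality=0.001):
    eig = cv2.cornerMinEigenVal(gray, blockSize=3)       # min eigenvalue per pixel
    thresh = quality * eig.max()                          # assumed quality factor
    ys, xs = np.mgrid[step // 2:gray.shape[0]:step, step // 2:gray.shape[1]:step]
    keep = eig[ys, xs] > thresh                           # drop textureless points
    return np.stack([xs[keep], ys[keep]], axis=1)         # (N, 2) array of (x, y)

def pyramid(gray, n_scales=4):
    scale = 1.0
    for _ in range(n_scales):
        size = (int(gray.shape[1] * scale), int(gray.shape[0] * scale))
        yield scale, cv2.resize(gray, size)
        scale /= np.sqrt(2)                               # scales *= 1/sqrt(2)
```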
  11. 11. Farneback Optical Flow •  Dense Optical Flow + ST-patch –  Farneback optical flow is included in OpenCV –  Comparison with the KLT tracker and SIFT, which suffer from noise and tracking errors –  Local space-time patch around tracked sampling points
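A sketch of how the sampled points can be advanced by OpenCV's Farneback dense flow, as the slide describes; the flow parameters below are typical values, not the paper's settings.

```python
# Sketch of flow-based point tracking with OpenCV's Farneback dense flow.
import cv2
import numpy as np

def track(prev_gray, next_gray, points):
    """points: (N, 2) array of (x, y); returns the points displaced by the flow."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    xs = np.clip(points[:, 0].astype(int), 0, flow.shape[1] - 1)
    ys = np.clip(points[:, 1].astype(int), 0, flow.shape[0] - 1)
    return points + flow[ys, xs]                          # move each point by its flow
```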
  12. 12. Trajectory-based feature •  Trajectory shape –  Calculating flow between frames –  Scale normalization ΔP_t = (P_{t+1} − P_t) = (x_{t+1} − x_t, y_{t+1} − y_t) [Wang+, IJCV2013]
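A minimal sketch of the trajectory-shape descriptor defined above: frame-to-frame displacements, normalized by the total displacement magnitude.

```python
# Sketch of the trajectory-shape descriptor.
import numpy as np

def trajectory_descriptor(points):
    """points: (L, 2) tracked positions (x_t, y_t) over L frames."""
    disp = np.diff(points, axis=0)                        # dP_t = P_{t+1} - P_t
    total = np.linalg.norm(disp, axis=1).sum()
    return (disp / max(total, 1e-8)).ravel()              # scale-normalized shape
```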
  13. 13. Shape-based feature •  HOG, SIFT –  HOG: edge orientation and magnitude from a block representation with overlapping and normalization [Dalal+, CVPR2005] –  SIFT: simply divided into 4x4 blocks [Lowe, IJCV2004] –  Captures edge shape against the background
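A sketch of a HOG-style cell histogram as used by the shape descriptors: gradient orientation is quantized into bins and weighted by gradient magnitude. The bin count and the L2 normalization are assumptions.

```python
# Sketch of one HOG-style cell: orientation histogram weighted by edge magnitude.
import cv2
import numpy as np

def hog_cell(gray_patch, bins=8):
    gx = cv2.Sobel(gray_patch, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray_patch, cv2.CV_32F, 0, 1)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)                # orientation in [0, 2*pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    return hist / max(np.linalg.norm(hist), 1e-8)         # normalized cell histogram
```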
  14. 14. Motion •  HOF, MBHx, MBHy, MIP –  HOF: block optical flow extraction and quantization [Laptev+, CVPR2008] –  MBHx/MBHy: motion boundary with dense optical flow [Dalal+, ECCV2006] –  MIP: trinary (-1, 0, +1) patterns from block flow direction [Kliper-Gross+, ECCV2012]
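A sketch of the MBH idea from this slide: each optical-flow channel is treated as an image and the orientations of its spatial gradients (motion boundaries) are histogrammed. The bin count is an assumption.

```python
# Sketch of MBHx/MBHy on a flow patch.
import cv2
import numpy as np

def mbh(flow_patch, bins=8):
    """flow_patch: (H, W, 2) dense flow; returns (MBHx, MBHy) histograms."""
    descs = []
    for c in range(2):                                    # x- and y-flow channels
        gx = cv2.Sobel(flow_patch[:, :, c], cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(flow_patch[:, :, c], cv2.CV_32F, 0, 1)
        mag = np.hypot(gx, gy)
        ang = np.arctan2(gy, gx) % (2 * np.pi)
        hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
        descs.append(hist / max(np.linalg.norm(hist), 1e-8))
    return descs
```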
  15. 15. Texture •  HLAC, LBP, iLBP, LTP –  HLAC: higher-order local auto-correlation with 0th-, 1st-, and 2nd-order patterns [Otsu+, IAIP1988] [Kobayashi+, ICPR2004] –  LBP: texture binarization in a 3x3 patch [Ojala+, TPAMI2002]
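A sketch of the basic 3x3 LBP code mentioned above: each of the 8 neighbors is thresholded against the center pixel and the comparison bits are packed into one code per pixel; the histogram of codes is the texture feature.

```python
# Sketch of 3x3 LBP codes over a grayscale patch.
import numpy as np

def lbp_codes(gray):
    c = gray[1:-1, 1:-1]                                  # center pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]          # 8 neighbors, clockwise
    code = np.zeros(c.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        n = gray[1 + dy:gray.shape[0] - 1 + dy, 1 + dx:gray.shape[1] - 1 + dx]
        code += (n >= c).astype(np.int32) << bit          # pack comparison bits
    return code.astype(np.uint8)                          # histogram of codes = LBP feature
```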
  16. 16. Co-occurrence •  Extended co-occurrence feature (ECoHOG) –  Feature •  CoHOG[Watanabe, PSIVT2009] (pair-count), ECoHOG (edge-magnitude accum.) •  PCA for codeword •  DT+Co-occurrence features (62.4%) > DT (59.2%) on MPII cooking CoHOG ECoHOG H. Kataoka+, “Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity Recognition”, in ACCV2014.
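A sketch of the CoHOG idea behind this slide: for a fixed pixel offset, co-occurring pairs of quantized gradient orientations are counted into a 2D histogram; ECoHOG accumulates the pair's edge magnitudes instead of a simple count. The single offset and the bin count are illustrative assumptions.

```python
# Sketch of CoHOG pair counting for one (dy, dx) offset with non-negative components.
import cv2
import numpy as np

def cohog_pairs(gray, offset=(0, 1), bins=8):
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    ori = np.floor((np.arctan2(gy, gx) % (2 * np.pi)) / (2 * np.pi) * bins).astype(int)
    ori = np.clip(ori, 0, bins - 1)                       # quantized orientation per pixel
    dy, dx = offset
    a = ori[:ori.shape[0] - dy, :ori.shape[1] - dx]       # orientation at p
    b = ori[dy:, dx:]                                     # orientation at p + offset
    cooc = np.zeros((bins, bins), dtype=np.int64)
    np.add.at(cooc, (a.ravel(), b.ravel()), 1)            # pair counts (CoHOG)
    return cooc.ravel()
```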
  17. 17. Experiments •  Evaluation of 13 features in the dense trajectory framework –  4 different datasets •  Traffic scene (NTSEL dataset): 4 classes •  Surgery (INRIA surgery): 4 classes •  Daily living (MSR daily activity 3D): 12 classes •  Sports (UCF50): 50 classes
  18. 18. Results on the 4 datasets •  High-performance features –  Top three features at each dataset –  4 different scenes
  19. 19. Results on the 4 datasets •  High-performance features –  CoHOG, SIFT, MBH –  CoHOG gives stable accuracy across all datasets
  20. 20. Detailed performance rate •  The best features depend on the recognition task –  We need to experimentally concatenate several features –  Feature concatenation on the NTSEL and INRIA surgery datasets
  21. 21. Rate of feature concatenation •  Baseline, 5 categories and concatenated vector –  Baseline: DT + BoW model –  Motion and co-occurrence feature –  No need to apply all features
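A minimal sketch of the concatenation compared in slides 20-21, assuming each descriptor already has its own BoW histogram: only the selected categories (here motion and co-occurrence, as placeholders) are joined into the classifier input, rather than all 13 descriptors.

```python
# Sketch of concatenating per-descriptor BoW histograms for the classifier.
import numpy as np

def concatenate_bows(bow_by_feature, selected=("MBH", "CoHOG")):
    """bow_by_feature: dict mapping descriptor name -> BoW histogram."""
    return np.concatenate([bow_by_feature[name] for name in selected])
```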
  22. 22. Conclusion •  We evaluated 13 features in the framework of DT –  For more effective activity recognition –  4 different scenes at each dataset –  Detailed evaluation and concatenated vectors –  Top-N ranked concatenation is needed for activity recognition
  23. 23. Feature extraction •  Around trajectories –  Extraction of 13 features in the ST-patch –  2 (x dir.) x 2 (y dir.) x 3 (t dir.) regions –  Calculating features with bag-of-words (BoW) (figure: ST-patch and xyt block extraction, 13-feature extraction)
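A sketch of the 2 x 2 x 3 cell split around a trajectory described above: the space-time patch is divided along x, y and t, a descriptor is computed per cell, and the cells are concatenated before BoW coding. The per-cell descriptor function is a placeholder.

```python
# Sketch of splitting a space-time patch into 2 (x) x 2 (y) x 3 (t) cells.
import numpy as np

def st_patch_descriptor(patch, cell_descriptor, nx=2, ny=2, nt=3):
    """patch: (T, H, W) space-time volume around a trajectory."""
    T, H, W = patch.shape
    cells = []
    for ti in range(nt):
        for yi in range(ny):
            for xi in range(nx):
                cell = patch[ti * T // nt:(ti + 1) * T // nt,
                             yi * H // ny:(yi + 1) * H // ny,
                             xi * W // nx:(xi + 1) * W // nx]
                cells.append(cell_descriptor(cell))
    return np.concatenate(cells)                          # fed to the BoW codebook
```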
  24. 24. Trajectory feature •  Trajectory shape –  Calculating flow between frames –  Normalization by the overall flow magnitude ΔP_t = (P_{t+1} − P_t) = (x_{t+1} − x_t, y_{t+1} − y_t)
  25. 25. HOG feature •  Histograms of Oriented Gradients (HOG) –  Can represent the rough shape of an object –  Divides a local region into blocks to obtain features –  Builds quantized histograms from the edge gradient (g(x,y) below) –  Accumulates edge magnitude (m(x,y) below) for each gradient bin (figure: shape extracted from a pedestrian image vs. shape extracted from the background)
  26. 26. HOF feature •  Histograms of Optical Flow (HOF) –  Divides a local region into blocks –  Describes the flow between consecutive frames (t and t+1) per block –  Flow direction and magnitude (length) (figure: flow computed from two consecutive frames to obtain motion-based feature vectors)
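A sketch of an HOF-style cell histogram matching this slide: per-pixel flow direction is quantized and weighted by flow magnitude. The bin count is an assumption, and the extra "no motion" bin used in some HOF variants is omitted.

```python
# Sketch of one HOF cell: flow-direction histogram weighted by flow magnitude.
import numpy as np

def hof_cell(flow_patch, bins=8):
    """flow_patch: (H, W, 2) optical flow between frames t and t+1."""
    fx, fy = flow_patch[..., 0], flow_patch[..., 1]
    mag = np.hypot(fx, fy)
    ang = np.arctan2(fy, fx) % (2 * np.pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    return hist / max(hist.sum(), 1e-8)
```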
