【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recognition

Dominant Codewords Selec2on with Topic Model
for Ac2on Recogni2on
Hirokatsu KATAOKA, Masaki Hayashi†, Yoshimitsu AOKI†, Kenji IWATA,
Yutaka SATOH, Slobodan Ilic‡
Na2onal Ins2tute of Advanced Industrial Science and Technology (AIST)
† Keio University
‡ Technische Universität München

hPp://www.hirokatsukataoka.net/

Human Sensing
•  Several tasks are included in vision-based human sensing
–  Detec2on, tracking, face recogni2on
–  Posture es2ma2on, ac2on analysis (event recogni2on)
–  Ac2on recogni2on is able to extend human sensing applica2ons
Mental state
Body Situa2on
APen2on
Ac2on Analysis
shakinghands
Look at people
Detec2on
Gaze Es2ma2on
Ac2on Recogni2on
Posture Es2ma2on
Face Recogni2on
Trajectory extrac2on
Tracking

Understanding Human Ac2ons
•  Ac2on is defined as something that people do or cause to happen
–  e.g. walking, running, si^ng
This image contains a man walking
•  Ac2on recogni2on
Classifica2on
•  Ac2on detec2on
Classifica2on & localiza2on
Walking
Walking
Ac2on Recogni2on
Ac2on Detec2on

Current Ac2on Recogni2on
•  Trajectory-based representa2on
Improved Dense Trajectories (IDT) [Wang+, ICCV13]
Trajectory-pooled Deep-convolu2onal Descriptors
(TDD) [Wang+, CVPR15]
•  “Dense” representa2on is important
•  Trajectory-based hand-craied
descrip2on (HOG/HOF/MBH/Traj.)
•  Intersec2on of hand-craied and deep-
learned features
•  Intui2vely, IDT feature is replaced by
ConvMaps
•  CNN is based on Two-stream ConvNet
[Simonyan+, NIPS14]

Typical Ac2on Recogni2on Pipeline
Trajectories / keypoints extrac2on Feature descrip2on
Codeword pooling Classiﬁca2on
e.g. BoF, VLAD, Fisher Vectors hPp://www.analy2calway.com/images/SMV/svmFeatureSpace2.gif
e.g. SVM

Typical Ac2on Recogni2on Pipeline
•  Two strategies for more sophis2cated feature vector
Trajectory extrac2on Trajectory descrip2on
1. Trajectory reﬁnement!
(main contribu2on)
2. BePer feature!

Our proposal
•  1. Dominant codewords selec2on (DCS)
–  Signiﬁcant ac2on representa2on by using topic model (typical LDA)
–  Topic-based grouping from dense trajectories
–  Noise elimina2on at each topic
•  2. Co-occurrence feature representa2on for bePer feat.
–  As an addi2onal feature into typical HOG/ HOF / MBH / Traj.
–  The feature is based on [Kataoka+, ACCV14] which is improved co-occurrence
feature focusing on extrac2ng subtle mo2on
× × × ×
× ×
×
× ×
×
×
×
×
×
×
There are some noise even in the sta2c scene!

Flowchart of dominant codewords selec2on (DCS)
•  1. Feature extrac2on
–  Dense Trajectories
•  2. Topic modeling to divide primi2ve mo2on
–  Each topic is approxima2ng each primi2ve mo2on
•  3. Noise canceling at each topic
•  4. Dominant dense trajectories (DDT)

Feature extrac2on
•  Improved dense trajectories [Wang+, ICCV13]
–  HOG/ HOF/ MBHx, /MBHy
–  CoHOG, Extended CoHOG [Kataoka+, ACCV14]
–  Bag-of-features (BoF) to input of topic model
Extended CoHOG

Topic Model
•  Latent Dirichlet alloca2on (LDA) [Blei+, JMLR03]
–  We applied simpliﬁed model [Griﬃths+, 04]
–  Typical LDA
–  Parameters
•  Probability distribu2on of topic: Θ
•  Topic: T
•  Hyper parameter: α, β
•  DT-BoF from video: v

Topic Model for DCS
•  Noise rejec2on at each topic
–  Each topic indicates each “primi2ve mo2on”
–  In-topic adap2ve thresholding
× × × ×
× ×
×
× ×
×
×
“cut oﬀ ends” Topic1 Topic2 Topic3 Topic4
MPII cooking
Each topic indicates each “primi2ve mo2on”
× ×
× × × ×
× ×
× × × ×
× ×
×
× ×
×
×
“cut oﬀ ends”
×
× × × × × ×
×
×
×
× × × Unsophis2cated rejec2on is not desirable
× Eliminated vector

Dominant dense trajectoies (DDT)
•  All reﬁned topics are integrated
–  DDT
–  DDT is a mixture of dominant codewords using AND opera2on
e.g.
•  T1 = {vw1, vw3, vw5} => T1’ = {vw1, vw5}
•  T2 = {vw1, vw2, vw7} => T2’ = {vw1, vw2, vw7}
•  T3 = {vw3, vw6, vw7} => T3’ = {vw7}
•  DDT = {vw1, vw2, vw5, vw7}
•  DDT is sophis2cated representa2on with
dominant codewords

Experiments
•  Four datasets for ac2on recogni2on
【INRIA surgery】 View: 4 Activity: 4
view1 view2
view3 view4
【IXMAS】 View: 5 Activity: 11
【MPII cooking activities】 View: 1 Activity: 65
view1 view2
view3 view4
view5
【NTSEL traffic】 View: 1 Activity: 4

Parameters
•  DT, classiﬁca2on & LDA are based on the previous works
–  # of codeword: 4,000
–  # of trajectory pooling: 15 frame
–  The parameters are following original DT
–  α and β of LDA are set at 1.0 and 0.01
–  1,000 Gibbs sampler itera2ons
–  The topic modeling params are based on [Griﬃths and Steyvers, 04]
•  Dominant codewords selec2on (DCS)
–  Topic-based codewords are set as 1% of the frequency
–  # of topic: # of view x # of ac2on

Comparison of DT
•  Dominant DT vs DT
–  By using HOG, HOF, MBHx, MBHy and combined features
–  Dominant codewords selec2on is eﬀec2ve on the DT

Co-occurrence feature in DDT
•  On fine-grained dataset
–  MPII cooking [Rohrbach+, CVPR12]
–  Extended co-occurrence feature is improved by DCS
–  DDT is improved with extended co-occurrence feature
–  [Ni+, ECCV14] and [Ni+, CVPR15] are state-of-the-art with mid-level features

Topic Visualiza2on
×
×
×
×
×
×
×
×
×
× × ×
× × ×
×
× × × × ×
×
× ×
×
×
× ×
× ×
×
×
× × ×
×
×
×
× × × × ×
×
×
×
× × × × × ×
× ×
×
× ×
× × ×
× ×
× ×
× ×
×
× × ×
×
×
× ×
×
× ×
×
×
× ×
×
×
×
× ×
×
× ×
×
×
× × × × × ×
×
×
×
× ×
×
× × ×
× × × ×
× ×
×
× ×
×
×
Topic1 (View 2)
“sit” View 1
“sit” View 2
Topic1 (View 1)
Topic2 (View 2)
Topic2 (View 1)
Topic3 (View 2)
Topic3 (View 1)
Topic4 (View 2)
Topic4 (View 1)
“getup” View 2 Topic1 (View 2) Topic2 (View 2) Topic3 (View 2) Topic4 (View 2)
“crossing” Topic1 Topic2 Topic3 Topic4
“cut oﬀ ends” Topic1 Topic2 Topic3 Topic4
INRIA surgery
INRIA surgery
IXMAS
NTSEL traﬃc
MPII cooking

Conclusion
•  Dominant codewords selec2on (DCS) for ac2on recogni2on
–  We applied topic model for noise canceling in trajectory-based representa2on
–  Extra feature (co-occurrence feature)
•  Future works
–  Exploring parameters and more eﬀec2ve visualiza2on (on-going…)
–  Seman2c ﬂow with topic models

【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recognition

【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recognition

Recommended

Recommended

More Related Content

Viewers also liked

Viewers also liked (10)

Recently uploaded

Recently uploaded (20)

【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recognition