SlideShare a Scribd company logo
Recognition of Transitional Action for Short-Term Action
Prediction using Discriminative Temporal CNN Feature
Hirokatsu Kataoka, Ph.D.
Computer Vision Research Group (CVRG), AIST
http://www.hirokatsukataoka.net/
Yudai Miyashita (TDU), Masaki Hayashi (Liquid Inc., Keio Univ.),
Kenji Iwata, Yutaka Satoh (AIST)
Related work: Early Action Recognition
•  [Ryoo, ICCV2011]
M. S. Ryoo, “Human Activity Prediction: Early Recognition of Ongoing Activities from Streaming Videos”, International Conference on
Computer Vision (ICCV), pp.1036-1043, 2011.
Related work: Action Prediction
•  [Kataoka+, VISAPP2016]
???	Daytime
(Time Zone)	
Walking
(Previous Activity)	
Sitting
(Current Activity)	
???
(Next Activity)	
xtimezone	
xprevious	 xcurrent	
θ = “Using a PC”	
Given	 Not given	
Time series	
H. Kataoka, Y. Aoki, K. Iwata, Y. Satoh, “Activity Prediction using a Space-Time CNN and Bayesian Framework”, in VISAPP, 2016.
Problem of related works
•  Early action recognition
–  Action recognition in an early frame of the action
–  Enough cue is required, so almost equals to action recognition
•  Action prediction
–  Complete future prediction in an unstable situation
Proposal
•  Transitional Action (TA): Action-class while an action is transitive
–  TA contains cue of prediction: Earlier than early action recognition
–  Recognition-like future action prediction: More stable prediction
[Applications] Autonomous driving, active safety and robotics
Δt
【Proposal】
Short-term action prediction
recognize “cross” at time t5
【Previous works】
Early action recognition recognize
“cross” at time t9
Walk straight
(Action)
Cross
(Action)
Walk straight – Cross
(Transitional action)
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12
Problem settings
Framework Problem
Action Recognition
Early Action Recognition
Action Prediction
Transitional Action Recognition
f (F1...t
A
) → At
f (F1...t−L
A
) → At
f (F1...t
A
) → At+L
f (F1...t
TA
) → At+L
Difference
Framework Problem
Action Recognition
Early Action Recognition
Action Prediction
Transitional Action Recognition
f (F1...t
A
) → At
f (F1...t−L
A
) → At
f (F1...t
A
) → At+L
f (F1...t
TA
) → At+L
Walk straight
(Action)
Cross
(Action)
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12
f (F1...t−L
A
) → At
A(cross)The objective action is
-  Early action recognition is late response
Difference
Framework Problem
Action Recognition
Early Action Recognition
Action Prediction
Transitional Action Recognition
f (F1...t
A
) → At
f (F1...t−L
A
) → At
f (F1...t
A
) → At+L
f (F1...t
TA
) → At+L
Walk straight
(Action)
Cross
(Action)
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12
f (F1...t
A
) → At+L
A(cross)The objective action is
-  Action prediction is unstable
Difference
Framework Problem
Action Recognition
Early Action Recognition
Action Prediction
Transitional Action Recognition
f (F1...t
A
) → At
f (F1...t−L
A
) → At
f (F1...t
A
) → At+L
f (F1...t
TA
) → At+L
Walk straight
(Action)
Cross
(Action)
Walk straight – Cross
(Transitional action)
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12
A(cross)The objective action is
-  Transitional action recognition is reasonable
f (F1...t
TA
) → At+L
Details of transitional action (TA)
•  Annotation for TA
–  TA and normal action (NA) classes are partially overlapped each other
•  Difficulty of TA
–  Temporally mixed between NA and TA
Subtle Motion Descriptor (SMD)
•  A discriminative temporal CNN feature
–  To divide classes between NA and TA
Subtle Motion Descriptor (SMD)
•  Activation feature from VGG-16
–  Fully-connected layer (N = 4,096)
–  Based on pooled time series (PoT) [Ryoo+, CVPR2015]
Subtle Motion Descriptor (SMD)
•  Temporal difference ΔVt is calculated
–  (Frame t) – (Frame t-1)
Subtle Motion Descriptor (SMD)
•  Temporal pooling from ΔV t
–  Plus and minus
–  Zero-around values are pooled (→This is the contribution of SMD)
–  TH is experimentally fixed
Datasets
•  Temporal action datasets
–  NTSEL [Kataoka+, ITSC2015]
•  Walk (NA), cross (NA), bicycle (NA), turn (TA) with human bbox
–  UTKinect-Action [Xia+, CVPRW2012]
•  Ordered 10 NAs (e.g. walk, throw, sit)
•  8 TAs (excluding push/pull; next page)
•  Without human bbox
–  Watch-n-Patch [Wu+, CVPR2015]
•  Daily 10 NAs (e.g. read, turn on monitor, leave office)
•  Top frequent 10 TAs (next page)
•  Without human bbox
Experimental settings (list of TAs)
•  @UTKinect-Action @Watch-n-Patch
Implements
•  Action recognition appraoches
–  Temporal CNN models
•  Pooled Time-series (PoT) [Ryoo+, CVPR2015]
•  CNN accumulation
•  CNN + IDT [Jain+, ECCVW2014]
–  Improved dense trajectories (IDT) and with improved features
•  IDT [Wang+, ICCV2013]
•  IDT + cooccurrence-feature [Kataoka+, ACCV2014]
•  All Features in IDT
Exploration experiment
•  Parameters
–  Frame accumulation
–  Thresholding value TH
–  Layer fc6 vs fc7
Exploration experiment
•  Temporal accumulation [frames]
–  Faster prediction: 3 [frames] (0.1s)
–  Toward state-of-the-art: 10 [frames] (0.33s)
–  Baseline should be 3 and 10 frames accumulation
Exploration experiment
•  Thresholding value
–  Depending on data
Exploration experiment
•  Layer fc6 vs fc7
–  Layer fc6 is better
Results
•  SMD (ours) is state-of-the-art in transitional action recognition
Comparison of PoT
•  Subtle motion is effective for transitional action recognition
–  NTSEL: +2.18%, +8.63%
–  UTKinect: +7.19%, +4.31%
–  Watch-n-Patch: +4.82%, +5.12%
Conclulsion
•  Two contribusions:
1.  Definition of transitional action for short-term action prediction
2.  Subtle Motion Descriptor (SMD) to classify transitional and normal actions

More Related Content

What's hot

Medical Image Synthesis with Improved Cycle-GAN: CT from CECT
Medical Image Synthesis with Improved Cycle-GAN: CT from CECT Medical Image Synthesis with Improved Cycle-GAN: CT from CECT
Medical Image Synthesis with Improved Cycle-GAN: CT from CECT
BoahKim2
 
Motion Human Detection & Tracking Based On Background Subtraction
Motion Human Detection & Tracking Based On Background SubtractionMotion Human Detection & Tracking Based On Background Subtraction
Motion Human Detection & Tracking Based On Background Subtraction
International Journal of Engineering Inventions www.ijeijournal.com
 
Learning to discover monte carlo algorithm on spin ice manifold
Learning to discover monte carlo algorithm on spin ice manifoldLearning to discover monte carlo algorithm on spin ice manifold
Learning to discover monte carlo algorithm on spin ice manifold
Kai-Wen Zhao
 
Benchmarking of indoor localization and tracking systems (LTSs)
Benchmarking of indoor localization and tracking systems (LTSs)Benchmarking of indoor localization and tracking systems (LTSs)
Benchmarking of indoor localization and tracking systems (LTSs)
Kurata Takeshi
 
Digest of Human Detection from CVPR2015
Digest of Human Detection from CVPR2015Digest of Human Detection from CVPR2015
Digest of Human Detection from CVPR2015
belltailjp
 
Real time pedestrian detection, tracking, and distance estimation
Real time pedestrian detection, tracking, and distance estimationReal time pedestrian detection, tracking, and distance estimation
Real time pedestrian detection, tracking, and distance estimation
omid Asudeh
 
Multi-energy Bone Subtraction in Chest Radiography by Eigenvalue Decomposition
Multi-energy Bone Subtraction in Chest Radiography by Eigenvalue DecompositionMulti-energy Bone Subtraction in Chest Radiography by Eigenvalue Decomposition
Multi-energy Bone Subtraction in Chest Radiography by Eigenvalue Decomposition
BoahKim2
 

What's hot (7)

Medical Image Synthesis with Improved Cycle-GAN: CT from CECT
Medical Image Synthesis with Improved Cycle-GAN: CT from CECT Medical Image Synthesis with Improved Cycle-GAN: CT from CECT
Medical Image Synthesis with Improved Cycle-GAN: CT from CECT
 
Motion Human Detection & Tracking Based On Background Subtraction
Motion Human Detection & Tracking Based On Background SubtractionMotion Human Detection & Tracking Based On Background Subtraction
Motion Human Detection & Tracking Based On Background Subtraction
 
Learning to discover monte carlo algorithm on spin ice manifold
Learning to discover monte carlo algorithm on spin ice manifoldLearning to discover monte carlo algorithm on spin ice manifold
Learning to discover monte carlo algorithm on spin ice manifold
 
Benchmarking of indoor localization and tracking systems (LTSs)
Benchmarking of indoor localization and tracking systems (LTSs)Benchmarking of indoor localization and tracking systems (LTSs)
Benchmarking of indoor localization and tracking systems (LTSs)
 
Digest of Human Detection from CVPR2015
Digest of Human Detection from CVPR2015Digest of Human Detection from CVPR2015
Digest of Human Detection from CVPR2015
 
Real time pedestrian detection, tracking, and distance estimation
Real time pedestrian detection, tracking, and distance estimationReal time pedestrian detection, tracking, and distance estimation
Real time pedestrian detection, tracking, and distance estimation
 
Multi-energy Bone Subtraction in Chest Radiography by Eigenvalue Decomposition
Multi-energy Bone Subtraction in Chest Radiography by Eigenvalue DecompositionMulti-energy Bone Subtraction in Chest Radiography by Eigenvalue Decomposition
Multi-energy Bone Subtraction in Chest Radiography by Eigenvalue Decomposition
 

Viewers also liked

【ECCV 2016 BNMW】Human Action Recognition without Human
【ECCV 2016 BNMW】Human Action Recognition without Human【ECCV 2016 BNMW】Human Action Recognition without Human
【ECCV 2016 BNMW】Human Action Recognition without Human
Hirokatsu Kataoka
 
【慶應大学講演】なぜ、博士課程に進学したか?
【慶應大学講演】なぜ、博士課程に進学したか?【慶應大学講演】なぜ、博士課程に進学したか?
【慶應大学講演】なぜ、博士課程に進学したか?
Hirokatsu Kataoka
 
【論文紹介】Fashion Style in 128 Floats: Joint Ranking and Classification using Wea...
【論文紹介】Fashion Style in 128 Floats: Joint Ranking and Classification using Wea...【論文紹介】Fashion Style in 128 Floats: Joint Ranking and Classification using Wea...
【論文紹介】Fashion Style in 128 Floats: Joint Ranking and Classification using Wea...
Hirokatsu Kataoka
 
ECCV 2016 速報
ECCV 2016 速報ECCV 2016 速報
ECCV 2016 速報
Hirokatsu Kataoka
 
【チュートリアル】コンピュータビジョンによる動画認識
【チュートリアル】コンピュータビジョンによる動画認識【チュートリアル】コンピュータビジョンによる動画認識
【チュートリアル】コンピュータビジョンによる動画認識
Hirokatsu Kataoka
 
Deep Residual Learning (ILSVRC2015 winner)
Deep Residual Learning (ILSVRC2015 winner)Deep Residual Learning (ILSVRC2015 winner)
Deep Residual Learning (ILSVRC2015 winner)
Hirokatsu Kataoka
 
【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recogn...
【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recogn...【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recogn...
【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recogn...
Hirokatsu Kataoka
 
ILSVRC2015 手法のメモ
ILSVRC2015 手法のメモILSVRC2015 手法のメモ
ILSVRC2015 手法のメモ
Hirokatsu Kataoka
 
TensorFlowによるCNNアーキテクチャ構築
TensorFlowによるCNNアーキテクチャ構築TensorFlowによるCNNアーキテクチャ構築
TensorFlowによるCNNアーキテクチャ構築
Hirokatsu Kataoka
 
CVPR 2016 速報
CVPR 2016 速報CVPR 2016 速報
CVPR 2016 速報
Hirokatsu Kataoka
 
【2016.08】cvpaper.challenge2016
【2016.08】cvpaper.challenge2016【2016.08】cvpaper.challenge2016
【2016.08】cvpaper.challenge2016
cvpaper. challenge
 
PythonによるCVアルゴリズム実装
PythonによるCVアルゴリズム実装PythonによるCVアルゴリズム実装
PythonによるCVアルゴリズム実装
Hirokatsu Kataoka
 
CNN Tutorial
CNN TutorialCNN Tutorial
CNN Tutorial
Sungjoon Choi
 
Res netと派生研究の紹介
Res netと派生研究の紹介Res netと派生研究の紹介
Res netと派生研究の紹介
masataka nishimori
 
21世紀で最もセクシーな職業!?「データサイエンティスト」の実像に迫る
21世紀で最もセクシーな職業!?「データサイエンティスト」の実像に迫る21世紀で最もセクシーな職業!?「データサイエンティスト」の実像に迫る
21世紀で最もセクシーな職業!?「データサイエンティスト」の実像に迫る
Takashi J OZAKI
 
20150930
2015093020150930
20150930
nlab_utokyo
 
Convolutional Neural Networks のトレンド @WBAFLカジュアルトーク#2
Convolutional Neural Networks のトレンド @WBAFLカジュアルトーク#2Convolutional Neural Networks のトレンド @WBAFLカジュアルトーク#2
Convolutional Neural Networks のトレンド @WBAFLカジュアルトーク#2
Daiki Shimada
 
CVPR 2016 まとめ v1
CVPR 2016 まとめ v1CVPR 2016 まとめ v1
CVPR 2016 まとめ v1
cvpaper. challenge
 

Viewers also liked (18)

【ECCV 2016 BNMW】Human Action Recognition without Human
【ECCV 2016 BNMW】Human Action Recognition without Human【ECCV 2016 BNMW】Human Action Recognition without Human
【ECCV 2016 BNMW】Human Action Recognition without Human
 
【慶應大学講演】なぜ、博士課程に進学したか?
【慶應大学講演】なぜ、博士課程に進学したか?【慶應大学講演】なぜ、博士課程に進学したか?
【慶應大学講演】なぜ、博士課程に進学したか?
 
【論文紹介】Fashion Style in 128 Floats: Joint Ranking and Classification using Wea...
【論文紹介】Fashion Style in 128 Floats: Joint Ranking and Classification using Wea...【論文紹介】Fashion Style in 128 Floats: Joint Ranking and Classification using Wea...
【論文紹介】Fashion Style in 128 Floats: Joint Ranking and Classification using Wea...
 
ECCV 2016 速報
ECCV 2016 速報ECCV 2016 速報
ECCV 2016 速報
 
【チュートリアル】コンピュータビジョンによる動画認識
【チュートリアル】コンピュータビジョンによる動画認識【チュートリアル】コンピュータビジョンによる動画認識
【チュートリアル】コンピュータビジョンによる動画認識
 
Deep Residual Learning (ILSVRC2015 winner)
Deep Residual Learning (ILSVRC2015 winner)Deep Residual Learning (ILSVRC2015 winner)
Deep Residual Learning (ILSVRC2015 winner)
 
【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recogn...
【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recogn...【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recogn...
【CVPR2016_LAP】Dominant Codewords Selection with Topic Model for Action Recogn...
 
ILSVRC2015 手法のメモ
ILSVRC2015 手法のメモILSVRC2015 手法のメモ
ILSVRC2015 手法のメモ
 
TensorFlowによるCNNアーキテクチャ構築
TensorFlowによるCNNアーキテクチャ構築TensorFlowによるCNNアーキテクチャ構築
TensorFlowによるCNNアーキテクチャ構築
 
CVPR 2016 速報
CVPR 2016 速報CVPR 2016 速報
CVPR 2016 速報
 
【2016.08】cvpaper.challenge2016
【2016.08】cvpaper.challenge2016【2016.08】cvpaper.challenge2016
【2016.08】cvpaper.challenge2016
 
PythonによるCVアルゴリズム実装
PythonによるCVアルゴリズム実装PythonによるCVアルゴリズム実装
PythonによるCVアルゴリズム実装
 
CNN Tutorial
CNN TutorialCNN Tutorial
CNN Tutorial
 
Res netと派生研究の紹介
Res netと派生研究の紹介Res netと派生研究の紹介
Res netと派生研究の紹介
 
21世紀で最もセクシーな職業!?「データサイエンティスト」の実像に迫る
21世紀で最もセクシーな職業!?「データサイエンティスト」の実像に迫る21世紀で最もセクシーな職業!?「データサイエンティスト」の実像に迫る
21世紀で最もセクシーな職業!?「データサイエンティスト」の実像に迫る
 
20150930
2015093020150930
20150930
 
Convolutional Neural Networks のトレンド @WBAFLカジュアルトーク#2
Convolutional Neural Networks のトレンド @WBAFLカジュアルトーク#2Convolutional Neural Networks のトレンド @WBAFLカジュアルトーク#2
Convolutional Neural Networks のトレンド @WBAFLカジュアルトーク#2
 
CVPR 2016 まとめ v1
CVPR 2016 まとめ v1CVPR 2016 まとめ v1
CVPR 2016 まとめ v1
 

Similar to 【BMVC2016】Recognition of Transitional Action for Short-Term Action Prediction using Discriminative Temporal CNN Feature

Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Satoshi NAKAHIRA
 
Robot Localisation: An Introduction - Luis Contreras 2020.06.09 | RoboCup@Hom...
Robot Localisation: An Introduction - Luis Contreras 2020.06.09 | RoboCup@Hom...Robot Localisation: An Introduction - Luis Contreras 2020.06.09 | RoboCup@Hom...
Robot Localisation: An Introduction - Luis Contreras 2020.06.09 | RoboCup@Hom...
robocupathomeedu
 
ASQ Talk v4
ASQ Talk v4ASQ Talk v4
Reactive Programming at Cloud-Scale and Beyond
Reactive Programming at Cloud-Scale and BeyondReactive Programming at Cloud-Scale and Beyond
Reactive Programming at Cloud-Scale and Beyond
C4Media
 
Stack of Tasks Course
Stack of Tasks CourseStack of Tasks Course
Stack of Tasks Course
Thomas Moulard
 
On the Development of A Real-Time Multi-Sensor Activity Recognition System
On the Development of A Real-Time Multi-Sensor Activity Recognition SystemOn the Development of A Real-Time Multi-Sensor Activity Recognition System
On the Development of A Real-Time Multi-Sensor Activity Recognition System
Oresti Banos
 
Ph.D. Presentation
Ph.D. PresentationPh.D. Presentation
Ph.D. Presentation
matteodefelice
 
Anomaly Detection in Sequences of Short Text Using Iterative Language Models
Anomaly Detection in Sequences of Short Text Using Iterative Language ModelsAnomaly Detection in Sequences of Short Text Using Iterative Language Models
Anomaly Detection in Sequences of Short Text Using Iterative Language Models
Cynthia Freeman
 
Actors for Behavioural Simulation
Actors for Behavioural SimulationActors for Behavioural Simulation
Actors for Behavioural Simulation
ClarkTony
 
Yuki Oyama - Incorporating context-dependent energy into the pedestrian dynam...
Yuki Oyama - Incorporating context-dependent energy into the pedestrian dynam...Yuki Oyama - Incorporating context-dependent energy into the pedestrian dynam...
Yuki Oyama - Incorporating context-dependent energy into the pedestrian dynam...
Yuki Oyama
 
Time Series With OrientDB - Fosdem 2015
Time Series With OrientDB - Fosdem 2015Time Series With OrientDB - Fosdem 2015
Time Series With OrientDB - Fosdem 2015
wolf4ood
 
Real-Time Top-R Topic Detection on Twitter with Topic Hijack Filtering
Real-Time Top-R Topic Detection on Twitter with Topic Hijack FilteringReal-Time Top-R Topic Detection on Twitter with Topic Hijack Filtering
Real-Time Top-R Topic Detection on Twitter with Topic Hijack Filtering
Kohei Hayashi
 
A calculus of mobile Real-Time processes
A calculus of mobile Real-Time processesA calculus of mobile Real-Time processes
A calculus of mobile Real-Time processes
Polytechnique Montréal
 

Similar to 【BMVC2016】Recognition of Transitional Action for Short-Term Action Prediction using Discriminative Temporal CNN Feature (13)

Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Robot Localisation: An Introduction - Luis Contreras 2020.06.09 | RoboCup@Hom...
Robot Localisation: An Introduction - Luis Contreras 2020.06.09 | RoboCup@Hom...Robot Localisation: An Introduction - Luis Contreras 2020.06.09 | RoboCup@Hom...
Robot Localisation: An Introduction - Luis Contreras 2020.06.09 | RoboCup@Hom...
 
ASQ Talk v4
ASQ Talk v4ASQ Talk v4
ASQ Talk v4
 
Reactive Programming at Cloud-Scale and Beyond
Reactive Programming at Cloud-Scale and BeyondReactive Programming at Cloud-Scale and Beyond
Reactive Programming at Cloud-Scale and Beyond
 
Stack of Tasks Course
Stack of Tasks CourseStack of Tasks Course
Stack of Tasks Course
 
On the Development of A Real-Time Multi-Sensor Activity Recognition System
On the Development of A Real-Time Multi-Sensor Activity Recognition SystemOn the Development of A Real-Time Multi-Sensor Activity Recognition System
On the Development of A Real-Time Multi-Sensor Activity Recognition System
 
Ph.D. Presentation
Ph.D. PresentationPh.D. Presentation
Ph.D. Presentation
 
Anomaly Detection in Sequences of Short Text Using Iterative Language Models
Anomaly Detection in Sequences of Short Text Using Iterative Language ModelsAnomaly Detection in Sequences of Short Text Using Iterative Language Models
Anomaly Detection in Sequences of Short Text Using Iterative Language Models
 
Actors for Behavioural Simulation
Actors for Behavioural SimulationActors for Behavioural Simulation
Actors for Behavioural Simulation
 
Yuki Oyama - Incorporating context-dependent energy into the pedestrian dynam...
Yuki Oyama - Incorporating context-dependent energy into the pedestrian dynam...Yuki Oyama - Incorporating context-dependent energy into the pedestrian dynam...
Yuki Oyama - Incorporating context-dependent energy into the pedestrian dynam...
 
Time Series With OrientDB - Fosdem 2015
Time Series With OrientDB - Fosdem 2015Time Series With OrientDB - Fosdem 2015
Time Series With OrientDB - Fosdem 2015
 
Real-Time Top-R Topic Detection on Twitter with Topic Hijack Filtering
Real-Time Top-R Topic Detection on Twitter with Topic Hijack FilteringReal-Time Top-R Topic Detection on Twitter with Topic Hijack Filtering
Real-Time Top-R Topic Detection on Twitter with Topic Hijack Filtering
 
A calculus of mobile Real-Time processes
A calculus of mobile Real-Time processesA calculus of mobile Real-Time processes
A calculus of mobile Real-Time processes
 

Recently uploaded

waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
PirithiRaju
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
pablovgd
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
Advanced-Concepts-Team
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
hozt8xgk
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Selcen Ozturkcan
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
Scintica Instrumentation
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
RDhivya6
 
Katherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdfKatherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdf
Texas Alliance of Groundwater Districts
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
Carl Bergstrom
 

Recently uploaded (20)

waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
 
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
 
Katherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdfKatherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdf
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
 

【BMVC2016】Recognition of Transitional Action for Short-Term Action Prediction using Discriminative Temporal CNN Feature

  • 1. Recognition of Transitional Action for Short-Term Action Prediction using Discriminative Temporal CNN Feature Hirokatsu Kataoka, Ph.D. Computer Vision Research Group (CVRG), AIST http://www.hirokatsukataoka.net/ Yudai Miyashita (TDU), Masaki Hayashi (Liquid Inc., Keio Univ.), Kenji Iwata, Yutaka Satoh (AIST)
  • 2. Related work: Early Action Recognition •  [Ryoo, ICCV2011] M. S. Ryoo, “Human Activity Prediction: Early Recognition of Ongoing Activities from Streaming Videos”, International Conference on Computer Vision (ICCV), pp.1036-1043, 2011.
  • 3. Related work: Action Prediction •  [Kataoka+, VISAPP2016] ??? Daytime (Time Zone) Walking (Previous Activity) Sitting (Current Activity) ??? (Next Activity) xtimezone xprevious xcurrent θ = “Using a PC” Given Not given Time series H. Kataoka, Y. Aoki, K. Iwata, Y. Satoh, “Activity Prediction using a Space-Time CNN and Bayesian Framework”, in VISAPP, 2016.
  • 4. Problem of related works •  Early action recognition –  Action recognition in an early frame of the action –  Enough cue is required, so almost equals to action recognition •  Action prediction –  Complete future prediction in an unstable situation
  • 5. Proposal •  Transitional Action (TA): Action-class while an action is transitive –  TA contains cue of prediction: Earlier than early action recognition –  Recognition-like future action prediction: More stable prediction [Applications] Autonomous driving, active safety and robotics Δt 【Proposal】 Short-term action prediction recognize “cross” at time t5 【Previous works】 Early action recognition recognize “cross” at time t9 Walk straight (Action) Cross (Action) Walk straight – Cross (Transitional action) t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12
  • 6. Problem settings Framework Problem Action Recognition Early Action Recognition Action Prediction Transitional Action Recognition f (F1...t A ) → At f (F1...t−L A ) → At f (F1...t A ) → At+L f (F1...t TA ) → At+L
  • 7. Difference Framework Problem Action Recognition Early Action Recognition Action Prediction Transitional Action Recognition f (F1...t A ) → At f (F1...t−L A ) → At f (F1...t A ) → At+L f (F1...t TA ) → At+L Walk straight (Action) Cross (Action) t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 f (F1...t−L A ) → At A(cross)The objective action is -  Early action recognition is late response
  • 8. Difference Framework Problem Action Recognition Early Action Recognition Action Prediction Transitional Action Recognition f (F1...t A ) → At f (F1...t−L A ) → At f (F1...t A ) → At+L f (F1...t TA ) → At+L Walk straight (Action) Cross (Action) t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 f (F1...t A ) → At+L A(cross)The objective action is -  Action prediction is unstable
  • 9. Difference Framework Problem Action Recognition Early Action Recognition Action Prediction Transitional Action Recognition f (F1...t A ) → At f (F1...t−L A ) → At f (F1...t A ) → At+L f (F1...t TA ) → At+L Walk straight (Action) Cross (Action) Walk straight – Cross (Transitional action) t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 A(cross)The objective action is -  Transitional action recognition is reasonable f (F1...t TA ) → At+L
  • 10. Details of transitional action (TA) •  Annotation for TA –  TA and normal action (NA) classes are partially overlapped each other •  Difficulty of TA –  Temporally mixed between NA and TA
  • 11. Subtle Motion Descriptor (SMD) •  A discriminative temporal CNN feature –  To divide classes between NA and TA
  • 12. Subtle Motion Descriptor (SMD) •  Activation feature from VGG-16 –  Fully-connected layer (N = 4,096) –  Based on pooled time series (PoT) [Ryoo+, CVPR2015]
  • 13. Subtle Motion Descriptor (SMD) •  Temporal difference ΔVt is calculated –  (Frame t) – (Frame t-1)
  • 14. Subtle Motion Descriptor (SMD) •  Temporal pooling from ΔV t –  Plus and minus –  Zero-around values are pooled (→This is the contribution of SMD) –  TH is experimentally fixed
  • 15. Datasets •  Temporal action datasets –  NTSEL [Kataoka+, ITSC2015] •  Walk (NA), cross (NA), bicycle (NA), turn (TA) with human bbox –  UTKinect-Action [Xia+, CVPRW2012] •  Ordered 10 NAs (e.g. walk, throw, sit) •  8 TAs (excluding push/pull; next page) •  Without human bbox –  Watch-n-Patch [Wu+, CVPR2015] •  Daily 10 NAs (e.g. read, turn on monitor, leave office) •  Top frequent 10 TAs (next page) •  Without human bbox
  • 16. Experimental settings (list of TAs) •  @UTKinect-Action @Watch-n-Patch
  • 17. Implements •  Action recognition appraoches –  Temporal CNN models •  Pooled Time-series (PoT) [Ryoo+, CVPR2015] •  CNN accumulation •  CNN + IDT [Jain+, ECCVW2014] –  Improved dense trajectories (IDT) and with improved features •  IDT [Wang+, ICCV2013] •  IDT + cooccurrence-feature [Kataoka+, ACCV2014] •  All Features in IDT
  • 18. Exploration experiment •  Parameters –  Frame accumulation –  Thresholding value TH –  Layer fc6 vs fc7
  • 19. Exploration experiment •  Temporal accumulation [frames] –  Faster prediction: 3 [frames] (0.1s) –  Toward state-of-the-art: 10 [frames] (0.33s) –  Baseline should be 3 and 10 frames accumulation
  • 20. Exploration experiment •  Thresholding value –  Depending on data
  • 21. Exploration experiment •  Layer fc6 vs fc7 –  Layer fc6 is better
  • 22. Results •  SMD (ours) is state-of-the-art in transitional action recognition
  • 23. Comparison of PoT •  Subtle motion is effective for transitional action recognition –  NTSEL: +2.18%, +8.63% –  UTKinect: +7.19%, +4.31% –  Watch-n-Patch: +4.82%, +5.12%
  • 24. Conclulsion •  Two contribusions: 1.  Definition of transitional action for short-term action prediction 2.  Subtle Motion Descriptor (SMD) to classify transitional and normal actions