This document presents a thesis on using recurrent neural networks (RNNs) for skeleton-based human action recognition. The proposed method combines two RNNs: a temporal RNN that models how joint positions evolve across frames, and a spatial RNN that models dependencies among the joints within each frame. Both networks are trained on skeleton data extracted from video datasets such as NTU RGB+D and Kinetics. Experimental results show that the method achieves state-of-the-art accuracy on the NTU benchmarks and can recognize actions in real time from new video input. Future work includes exploring more advanced temporal modeling and evaluating on larger datasets.
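
To make the two-stream idea concrete, the following is a minimal PyTorch sketch, not the thesis's exact design: the use of LSTM cells, the layer sizes, fusion by concatenation, and mean-pooling of the spatial stream are all illustrative assumptions. The joint count (25) and class count (60) follow the NTU RGB+D dataset mentioned above.

```python
import torch
import torch.nn as nn

class TwoStreamSkeletonRNN(nn.Module):
    """Sketch: temporal stream unrolls over frames, spatial stream over joints."""

    def __init__(self, num_joints=25, coords=3, hidden=128, num_classes=60):
        super().__init__()
        # Temporal RNN: one step per frame, with the whole skeleton flattened
        # into a single feature vector at each step.
        self.temporal_rnn = nn.LSTM(num_joints * coords, hidden, batch_first=True)
        # Spatial RNN: one step per joint, traversing the skeleton within a frame.
        self.spatial_rnn = nn.LSTM(coords, hidden, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):
        # x: (batch, frames, joints, coords)
        b, t, j, c = x.shape

        # Temporal stream: flatten joints so each time step is a full pose.
        temporal_in = x.reshape(b, t, j * c)
        _, (h_t, _) = self.temporal_rnn(temporal_in)    # h_t: (1, b, hidden)

        # Spatial stream: run over the joint sequence of every frame,
        # then average the per-frame features across time (an assumed
        # aggregation choice for this sketch).
        spatial_in = x.reshape(b * t, j, c)
        _, (h_s, _) = self.spatial_rnn(spatial_in)      # h_s: (1, b*t, hidden)
        spatial_feat = h_s.squeeze(0).reshape(b, t, -1).mean(dim=1)

        # Fuse the two streams by concatenation and classify.
        fused = torch.cat([h_t.squeeze(0), spatial_feat], dim=1)
        return self.classifier(fused)

# Example: a batch of 4 clips, 60 frames, 25 joints with (x, y, z) coordinates.
model = TwoStreamSkeletonRNN()
logits = model(torch.randn(4, 60, 25, 3))
print(logits.shape)  # torch.Size([4, 60]) -> one score per action class
```

A design note on the sketch: keeping the two streams separate until a late concatenation is the simplest fusion strategy; alternatives such as weighted score fusion or cross-stream attention would slot in at the same point in `forward`.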