Human Motion Forecasting (Generation) with RNNs

Terry Taewoong Um (terry.t.um@gmail.com)
University of Waterloo
Department of Electrical & Computer Engineering
ON HUMAN MOTION PREDICTION USING RNNs (2017)

MOTIVATION TO CHOOSE THIS PAPER
• I have applied convolutional neural networks (CNNs) to classify wearable sensor data in my
research, but have not yet applied recurrent neural networks (RNNs).
Exercise Motion Classification from Large-Scale Wearable Sensor Data Using CNNs (2016)
Classified 50 gym exercises with 92%
Data Augmentation of Wearable Sensor Data for Parkinson’s Disease Monitoring using CNNs (2017)
classification accuracy 77% 92%

MOTION FORECASTING
• Motion forecasting (motion prediction): given a person’s past motions, forecast the most likely future 3D poses
• Analogy: sentence completion
motion forecasting ≈ a high-dimensional, nonlinear version of sentence completion
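The problem statement above can be made concrete with a small sketch: a recorded motion is split into the frames the model observes and the frames it must forecast. The function name `split_past_future`, the flat per-frame layout, and the toy data are assumptions for illustration, not taken from the paper.

```python
from typing import List, Tuple

Pose = List[float]  # one frame: flattened joint values (assumed layout)


def split_past_future(
    motion: List[Pose], n_past: int, n_future: int
) -> Tuple[List[Pose], List[Pose]]:
    """Split a pose sequence into the observed past and the future to forecast."""
    if len(motion) < n_past + n_future:
        raise ValueError("sequence too short for the requested windows")
    past = motion[:n_past]
    future = motion[n_past:n_past + n_future]
    return past, future


# Toy sequence: 6 frames with 3 values per frame (hypothetical data).
motion = [[float(t), float(t) * 0.5, 0.0] for t in range(6)]
past, future = split_past_future(motion, n_past=4, n_future=2)
```

A forecasting model receives `past` as input and is scored on how close its output is to `future`.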

BACKGROUND: RNN
• Recurrent Neural Networks (RNNs), unfolded through time
→ vanishing or exploding gradient problem
→ solved by using gate units
(Xavier Giro, https://www.slideshare.net/xavigiro/recurrent-neural-networks-1-d2l2-deep-learning-for-speech-and-language-upc-2017)
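A minimal illustration of the vanishing or exploding gradient problem, assuming the simplest possible recurrence, a scalar linear one, h_t = w * h_{t-1}. Real RNNs use weight matrices and nonlinearities, but the same repeated multiplication appears when the network is unfolded through time. The function name is hypothetical.

```python
def gradient_through_time(w: float, steps: int) -> float:
    """Return d h_T / d h_0 for the linear recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # chain rule: one factor of w per unfolded time step
    return grad


# |w| < 1: the gradient shrinks toward zero as the sequence gets longer.
vanishing = gradient_through_time(w=0.5, steps=50)
# |w| > 1: the gradient grows without bound as the sequence gets longer.
exploding = gradient_through_time(w=1.5, steps=50)
```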

BACKGROUND: LSTM & GRU
• Solution: let nodes decide whether to forget or bypass the information
→ Gate units: long short-term memory (LSTM) or gated recurrent unit (GRU)
(Christopher Olah, http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
LSTM vs. GRU: the GRU gives similar performance, but with less computation
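How a gate lets a node keep or overwrite its state can be sketched with a scalar GRU step. This follows the gate convention used in the Olah post cited above (new state = (1 - z) * old state + z * candidate). The parameter names and values are hypothetical, and a real GRU uses weight matrices rather than scalars.

```python
import math
from typing import Dict


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(x: float, h: float, p: Dict[str, float]) -> float:
    """One scalar GRU update: gates decide how much of the old state to keep."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])  # candidate
    return (1.0 - z) * h + z * h_cand


# Hypothetical parameters, chosen only to make the two regimes visible.
params = {"wz": 0.5, "uz": 0.5, "bz": 0.0,
          "wr": 0.5, "ur": 0.5, "br": 0.0,
          "wh": 1.0, "uh": 1.0, "bh": 0.0}

# Update gate driven shut (z close to 0): the old state is bypassed unchanged.
kept = gru_step(1.0, 0.7, {**params, "bz": -20.0})
# Update gate driven open (z close to 1): the old state is replaced.
overwritten = gru_step(1.0, 0.7, {**params, "bz": 20.0})
```

Because the kept path is a near-identity copy of the state, gradients can flow through many time steps without the repeated shrinking seen in a plain RNN.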

BACKGROUND: RNN
(Andrej Karpathy, http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

SEQUENCE GENERATION
(Andrej Karpathy, http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
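Sequence generation with an RNN is autoregressive: the model is first run over the observed frames, then each prediction is fed back as the next input. A minimal sketch of that loop, assuming a generic `step(frame, state)` function; the `toy_step` used here is a hypothetical stand-in for a trained recurrent cell, not a real model.

```python
from typing import Callable, List, Tuple

Step = Callable[[float, float], Tuple[float, float]]  # (frame, state) -> (prediction, state)


def generate(step: Step, seed: List[float], n_future: int, state: float = 0.0) -> List[float]:
    """Condition on the seed frames, then feed predictions back as inputs."""
    if not seed:
        raise ValueError("need at least one seed frame")
    pred = seed[-1]
    for frame in seed:  # warm-up on observed data
        pred, state = step(frame, state)
    outputs = []
    for _ in range(n_future):
        outputs.append(pred)
        pred, state = step(pred, state)  # the prediction becomes the next input
    return outputs


def toy_step(frame: float, state: float) -> Tuple[float, float]:
    """Hypothetical cell: remembers the previous frame and extrapolates linearly."""
    return 2.0 * frame - state, frame


future = generate(toy_step, seed=[1.0, 2.0, 3.0], n_future=3)
```

Since every generated frame depends on earlier generated frames, small errors can accumulate over long horizons.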

SIMPLEST APPROACH
• Just apply an LSTM to joint angle data
ERD: Encoder-Recurrent-Decoder LSTM (2015)
https://www.youtube.com/watch?v=CvaKD1NGcBk
[Result]
Contribution: it is the first LSTM work with skeleton data

RELATED WORK
ERD

RELATED WORK
https://youtu.be/JTr_wkPN-xs?t=1m18s
SRNN

CF.) HIERARCHICAL RNN
MHAD dataset (11 actions)
HDM05 dataset (65 actions)
MSR-Action3D dataset (20 actions)
(2016)

MOTION FORECASTING USING RNN
• Evaluation criteria for short-term (<=0.5s)
• Problem of RNN-based methods for long-term (>=1s)
Learning Human Motion Models for Long-term Predictions (2017), P. Ghosh et al.

WHAT ARE THE PROBLEMS?
Other problems:
- Models are so complicated that a large amount of data is needed
- Action-specific networks: each network uses data from only one action

PROPOSED SOLUTION
(from K. He’s ResNet tutorial at ICML 2016)
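A minimal sketch of the residual idea applied to pose prediction, assuming the network output is added to the current pose so that the network only has to model the change between frames. `delta_fn` is a hypothetical stand-in for the recurrent network, and the poses are toy values.

```python
from typing import Callable, List

Pose = List[float]


def residual_step(pose: Pose, delta_fn: Callable[[Pose], Pose]) -> Pose:
    """Predict the next pose as the current pose plus a predicted change."""
    delta = delta_fn(pose)
    return [p + d for p, d in zip(pose, delta)]


# A network that outputs zeros reproduces the current pose (no motion).
still = residual_step([1.0, 2.0], lambda p: [0.0] * len(p))
# A network that outputs a small change moves the pose by that amount.
moved = residual_step([1.0, 2.0], lambda p: [0.5, -0.25])
```

One consequence of this design is that a network whose output is near zero predicts a pose close to the last input, rather than an arbitrary pose.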

EXPERIMENT SETTING
• Details

RESULTS
(SA: single-action data, MA: multiple-action data)

RESULTS
• The zero-velocity baseline performs well
• Sampling-based loss gives plausible motion generation, and no noise scheduling is needed
• The residual connection improves performance
• Using single-action data < using all-action data (data quality < data quantity)
• Aperiodic motions are hard to model with RNNs
• Action labels help the learning process
• Small loss != good qualitative long-term motion → a new loss needs to be proposed
• The unsupervised approach gives comparable results
• This research area has not matured yet, so we have a chance.
Idea (for t+1 prediction):
Rather than the residual input $X_t$, use the residual input $X_t + \dot{X}_t\,dt$, or
explicitly exploit $\dot{X}$ and $\ddot{X}$
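The zero-velocity baseline and the first-order residual base from the idea above can be sketched as follows. The velocity is estimated here by a finite difference of the last two frames, which is an assumption of this sketch; the function names and toy data are hypothetical.

```python
from typing import List

Pose = List[float]


def zero_velocity(past: List[Pose], n_future: int) -> List[Pose]:
    """Baseline: repeat the last observed pose for every future frame."""
    return [list(past[-1]) for _ in range(n_future)]


def first_order_base(past: List[Pose], dt: float = 1.0) -> Pose:
    """Residual base X_t + dX_t * dt, with dX_t estimated from the last two frames."""
    if len(past) < 2:
        raise ValueError("need at least two frames to estimate velocity")
    x_t, x_prev = past[-1], past[-2]
    velocity = [(a - b) / dt for a, b in zip(x_t, x_prev)]  # finite difference (assumed)
    return [x + v * dt for x, v in zip(x_t, velocity)]


past = [[0.0, 1.0], [1.0, 1.5]]  # two toy frames with two values each
baseline = zero_velocity(past, n_future=2)
base = first_order_base(past)
```

With the plain residual input, the network's output is added to the last pose; with the first-order base, it would be added to the extrapolated pose instead, so the network only has to correct a constant-velocity guess.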

BONUS: MORE RESEARCH FROM 2017

BONUS: MORE RESEARCH FROM 2017
https://sites.google.com/a/umich.edu/rubenevillegas/hierch_vid
https://twitter.com/TerryUm_ML
