Skipping and Repeating Samples in Recurrent Neural Networks

Xavier Giro-i-Nieto
xavier.giro@upc.edu
Associate Professor
Intelligent Data Science and Artificial Intelligence Center
Universitat Politecnica de Catalunya (UPC)
Deep Learning Workshop
Dublin City University
21-22 May 2018
Case study III: Skipping and Repeating Samples in RNNs
#InsightDL2018

bit.ly/InsightDL2018
Our team @ UPC IDEAI Barcelona
Victor Campos, Amaia Salvador, Amanda Duarte, Dani Fojo, Görkem Çamli, Akis Linardos, Santiago Pascual, Xavi Giró, Miriam Bellver, Marta R. Costa-jussà, Ferran Marqués, Jordi Torres (BSC), Marc Assens, Sandra Roca, Miquel Llobet, Antonio Bonafonte

Our partners @ academia
Kevin McGuinness, Shih-Fu Chang, Noel E. O’Connor, Cathal Gurrin, Eva Mohedano, Carles Ventura, Francisco Roldan, Marta Coll, Paula Gómez, Alejandro Woodward, Marc Górriz

Our team @ industry
Dèlia Fernández, Elisenda Bou, Sebastian Palacio, Eduard Ramon, Andreu Girbau, Carlos Arenas

[Figure: timeline 2015-2018 of topics: Convolutional Neural Networks (CNNs), Evolution Strategies, Recurrent Neural Networks (RNNs), Adversarial Training, Unsupervised Learning, Reinforcement Learning]

Motivation: Dynamic Computation
2+2

Outline
● Recurrent Neural Networks (RNNs)
● SkipRNN [ICLR 2018]
● RepeatRNN [ICLRW 2018]

Feed-forward Neural Network
[Figure: input x, hidden states h1 and h2, output y, feed-forward weights Wi. Figure credit: Hugo Larochelle]

Recurrent Neural Network
[Figure: a recurrent layer with feed-forward weights (W) and recurrent weights (U). The updated state is computed from the previous state and the input x. Rotating the diagram by 90° unfolds the network along the time axis.]

[Figure: RNN unfolded in time. The same cell S maps the inputs x1, x2, …, xT to the states s1, s2, …, sT.]
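
The recurrence in the unfolded diagram can be sketched in a few lines of plain Python. The slides only name the feed-forward weights W, the recurrent weights U, the input x and the state s; the tanh nonlinearity, the missing bias term, and the helper names `matvec`, `rnn_step` and `run_rnn` are assumptions made for illustration.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rnn_step(W, U, x_t, s_prev):
    """One state update S: s_t = tanh(W x_t + U s_{t-1}) (tanh is an assumption)."""
    wx = matvec(W, x_t)
    us = matvec(U, s_prev)
    return [math.tanh(a + b) for a, b in zip(wx, us)]

def run_rnn(W, U, inputs, s0):
    """Unfold the same cell over time and return the states s_1 ... s_T."""
    states = []
    s = s0
    for x_t in inputs:
        s = rnn_step(W, U, x_t, s)
        states.append(s)
    return states

# Toy example: 2-dimensional inputs, 2-dimensional state, 3 time steps.
W = [[0.5, -0.2], [0.1, 0.3]]
U = [[0.0, 0.4], [-0.3, 0.2]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = run_rnn(W, U, xs, s0=[0.0, 0.0])
```

The same weights W and U are reused at every time step, which is what the repeated cell S in the unfolded diagram expresses.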

Video sequences (e.g. action recognition)
[Figure: each frame is processed by a CNN and the resulting features are fed to an RNN.]

Word sequences (e.g. machine translation)

Spectrogram sequences (e.g. speech recognition)
[Figure: output characters over time: T H _ _ E … D _ O _ _ G]

Skip RNN: Learning to Skip State Updates in RNNs
Victor Campos, Brendan Jou, Xavier Giro-i-Nieto, Jordi Torres, and Shih-Fu Chang. “Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks”, ICLR 2018.

Motivation
[Figure: a video processed by per-frame CNNs followed by an RNN, with each sample marked as used or unused.]

SkipRNN
[Figure: SkipRNN unfolded in time, built up step by step. At x1 the cell S produces s1. At x2 the state is copied (COPY), so it remains s1. At x3 the state is updated (UPDATE) to s3.]

Limiting computation
Intuition: the network can be encouraged to perform fewer updates by adding a penalization when u_t = 1.
[Equation annotations: the cost per sample is a hyperparameter; u_t is 1 if the sample is used and 0 otherwise.]
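
The penalization can be written as a budget term added to the task loss: L_budget = λ · Σ_t u_t, where λ is the cost per sample. This is the form given in the cited Skip RNN paper; the function name `budget_loss` and the toy values below are hypothetical and only illustrate the arithmetic.

```python
def budget_loss(update_gates, cost_per_sample):
    """Return lambda * sum_t u_t, where u_t is 1 if sample t was used, else 0."""
    return cost_per_sample * sum(update_gates)

# Toy sequence of 8 samples in which the network used 3 of them.
u = [1, 0, 0, 1, 0, 0, 0, 1]
task_loss = 0.25   # placeholder value standing in for the task loss
lam = 1e-4         # cost per sample (hyperparameter)
total_loss = task_loss + budget_loss(u, lam)
```

A larger cost per sample makes every update more expensive, so the trained network tends to skip more samples.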

Experiments: Sequential MNIST
[Figure: samples marked as used or unused.]

Epochs for Skip LSTM (λ = 10^-4)
~30% acc: 11 epochs
~50% acc: 51 epochs
~70% acc: 101 epochs
~95% acc: 400 epochs
[Figure: used and unused samples at each of these stages of training.]

[Figure: results for different values of the cost per sample (hyperparameter), compared with a random baseline.]
Ours: better accuracy with less computation.
#InsightDL2018
Experiments: Action Localization
CNN CNN CNN...
RNN RNN RNN...
better

#InsightDL2018
Experiments: Action Localization
CNN CNN CNN...
RNN RNN RNN...
faster

Open science
https://git.io/skip-rnn

Reproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlow
Fojo, Daniel, Víctor Campos, and Xavier Giró-i-Nieto. "Comparing Fixed and Adaptive Computation Time for Recurrent Neural Networks." ICLR Workshop 2018.

Adaptive Computation Time (ACT)
Graves, Alex. "Adaptive computation time for recurrent neural networks." arXiv preprint arXiv:1603.08983 (2016).
[Figure: ACT repeats the state update a variable number of times per sample (REPEAT 3 for one sample, REPEAT 2 for another).]
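
How ACT decides the number of repetitions can be sketched as follows. This is a minimal sketch of the halting rule described in Graves (2016), assuming the per-repetition halting probabilities are already given: repetitions continue until the accumulated probability reaches 1 - ε, and the last repetition receives the remainder so that the weights sum to 1. The function name and the value of `eps` are hypothetical.

```python
def act_halting_weights(halting_probs, eps=0.01):
    """Return the weight given to each repetition under the ACT halting rule.

    halting_probs: halting probabilities h^1, h^2, ... emitted at each repetition.
    The number of repetitions is the length of the returned list.
    """
    weights = []
    cumulative = 0.0
    for h in halting_probs:
        if cumulative + h >= 1.0 - eps:
            # Halt here: the final repetition gets the remainder.
            weights.append(1.0 - cumulative)
            return weights
        weights.append(h)
        cumulative += h
    raise ValueError("halting probabilities never reached the threshold")

# Toy example: the third repetition pushes the sum past 1 - eps, so N = 3.
weights = act_halting_weights([0.3, 0.5, 0.4])
repetitions = len(weights)
```

In ACT the state passed to the next time step is the average of the intermediate states under these weights, so the number of repetitions varies per sample and is learned rather than set by hand.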

ACT vs RepeatRNN
[Figure: side-by-side diagrams of ACT and RepeatRNN.]
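
The contrast with ACT can be illustrated with a minimal sketch, assuming RepeatRNN applies the state update a fixed number of times to every sample, as the "fixed" in the cited paper's title suggests. The function `repeat_rnn`, the `repetitions` parameter name and the toy cell are hypothetical and not taken from the authors' code.

```python
def repeat_rnn(step, inputs, s0, repetitions):
    """Apply `step` to every input `repetitions` times before moving on."""
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    state = s0
    states = []
    for x_t in inputs:
        for _ in range(repetitions):
            state = step(x_t, state)
        states.append(state)
    return states

# Toy scalar cell used only for illustration: a running average.
def toy_step(x, s):
    return 0.5 * s + 0.5 * x

once = repeat_rnn(toy_step, [1.0, 0.0], s0=0.0, repetitions=1)
twice = repeat_rnn(toy_step, [1.0, 0.0], s0=0.0, repetitions=2)
```

With `repetitions=1` this reduces to an ordinary RNN; larger values spend more computation on every sample, and the amount is set directly by one integer.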

Experiment: One-hot addition
[Figure: result plots, with the proposed method labelled "Ours".]
Ours: interpretable hyperparameter, faster training, and faster inference.

Open Science
bit.ly/repeat-rnn

Acknowledgements

Deep Learning courses @ UPC (videos)
● MSc course (2017)
● BSc course (2018)
● 1st edition (2016)
● 2nd edition (2017)
● 3rd edition (2018)
● 1st edition (2017)
● 2nd edition (2018)
Next edition Autumn 2018
Next edition Winter/Spring 2019
Summer School (starts 28/06)

@DocXavi
Xavier Giro-i-Nieto
Download slides from:
http://bit.ly/InsightDL2018
We are looking for both
industrial & academic partners:
xavier.giro@upc.edu

SkipRNN
Intuition: introduce a binary update state gate, u_t, deciding whether the RNN state is updated or copied:
$$s_t = \begin{cases} S(x_t, s_{t-1}) & \text{if } u_t = 1 \quad \text{(update operation)} \\ s_{t-1} & \text{if } u_t = 0 \quad \text{(copy operation)} \end{cases}$$
SkipRNN
[Equation annotations: update state gate u_t ∈ {0, 1}; update state probability ∈ [0, 1]; increment for the update state probability. One branch applies when the state is updated (u_t = 1), the other when it is copied (u_t = 0).]
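
The gate, the update probability and its increment fit together as in the sketch below, which follows the update rules of the cited Skip RNN paper: the gate is the binarized update probability; after an update the probability is reset to a freshly computed increment, and after a copy the increment is accumulated (clipped so the probability stays in [0, 1]). The scalar state, the toy cell, the initial probability of 1 and all function and parameter names are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def skip_rnn(step, inputs, s0, w_p, b_p, u_tilde0=1.0):
    """Run a scalar-state Skip RNN; return the states and the binary gates."""
    state, u_tilde = s0, u_tilde0
    states, gates = [], []
    for x_t in inputs:
        u_t = 1 if u_tilde >= 0.5 else 0        # binarize the update probability
        if u_t == 1:
            state = step(x_t, state)            # update operation
        # else: copy operation, the state is left untouched
        delta = sigmoid(w_p * state + b_p)      # increment for the probability
        if u_t == 1:
            u_tilde = delta                     # reset after an update
        else:
            u_tilde = u_tilde + min(delta, 1.0 - u_tilde)  # accumulate after a copy
        states.append(state)
        gates.append(u_t)
    return states, gates

# Toy scalar cell used only for illustration: a running average.
def toy_step(x, s):
    return 0.5 * s + 0.5 * x

# With w_p = 0 the increment is constant (sigmoid(-1), about 0.27),
# so the gate alternates between update and copy.
states, gates = skip_rnn(toy_step, [1.0, 2.0, 3.0, 4.0, 5.0],
                         s0=0.0, w_p=0.0, b_p=-1.0)
```

Because the probability only grows while samples are being skipped, a long run of copies eventually forces an update; how quickly depends on the learned increment.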

Straight-Through Estimator
[Figure: f_binarize(x) plotted against x on [0, 1], with one panel for the forward pass and one for the backward pass.]
G. Hinton, Neural Networks for Machine Learning, Coursera video lectures 2012
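
The binarization is a step function, so its true gradient is zero almost everywhere and cannot drive learning. A minimal sketch of the straight-through idea, assuming f_binarize rounds its input in the forward pass and is treated as the identity (gradient of 1) in the backward pass; the function names and the toy loss are hypothetical.

```python
def binarize_forward(x):
    """Forward pass: step function mapping x in [0, 1] to 0 or 1."""
    return 1.0 if x >= 0.5 else 0.0

def binarize_backward(grad_output):
    """Backward pass: pass the incoming gradient through unchanged."""
    return grad_output

# Chain rule through y = f_binarize(x) for a toy loss = (y - target)^2.
x, target = 0.7, 0.0
y = binarize_forward(x)
grad_y = 2.0 * (y - target)          # d loss / d y
grad_x = binarize_backward(grad_y)   # straight-through estimate of d loss / d x
```

The estimate is biased, but it gives the update probability a usable training signal while the forward pass keeps a hard 0/1 decision.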
