Time-series problems have traditionally been solved using engineered features obtained through heuristic processes.
https://www.bigdataspain.org/2017/talk/state-of-the-art-time-series-analysis-with-deep-learning
Big Data Spain 2017
November 16th - 17th
State of the art time-series analysis with deep learning by Javier Ordóñez at Big Data Spain 2017
1.
2. State of the art time-series analysis with deep learning
3. Who am I?
Francisco Javier Ordóñez
Lead Data Scientist
javier.ordonez@stylesage.co
http://stylesage.co
4. What is this about?
Approach for time series analysis using deep neural nets
What are we going to see:
Brief introduction
Deep learning concepts
Model
Use case
Core ref:
“Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition”, F. J. Ordóñez et al.
5. Time series classification vs. time series forecasting
● Time series classification: ECG anomaly detection, human activity recognition
● Time series forecasting: energy demand prediction, stock market prediction
Time series
A time series is a sequence of regular time-ordered observations
e.g. stock prices, weather readings, smartphone sensor data, health
monitoring data
“Traditional” approaches to time series analysis are based on autoregressive models
- Challenges: feature design has to be tackled by hand, usually only a single signal is involved, etc.
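As a point of reference for the “traditional” approach, here is a minimal sketch of fitting an autoregressive model to a univariate series with plain NumPy least squares (the toy signal and the order p=3 are illustrative assumptions, not from the talk):

```python
import numpy as np

def fit_ar(x, p):
    """Fit an AR(p) model x[t] = c + a1*x[t-1] + ... + ap*x[t-p] by least squares."""
    # Build the lagged design matrix: one row of the p previous values per target.
    rows = [x[t - p:t][::-1] for t in range(p, len(x))]
    X = np.column_stack([np.ones(len(rows)), np.array(rows)])
    y = x[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs  # [c, a1, ..., ap]

def forecast_next(x, coeffs):
    """One-step-ahead forecast using the fitted coefficients."""
    p = len(coeffs) - 1
    return coeffs[0] + np.dot(coeffs[1:], x[-1:-p - 1:-1])

# Toy usage: a noisy sine wave standing in for e.g. an energy demand signal.
t = np.arange(200)
x = np.sin(0.1 * t) + 0.1 * np.random.randn(200)
coeffs = fit_ar(x, p=3)
print("next value ≈", forecast_next(x, coeffs))
```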
8. Artificial neural nets
Model that learns by example:
● using many examples
● defined as a series of hierarchically connected functions (layers)
● can be very complex (deep!)
9. Artificial neural nets
Model that learns by example:
● using many examples
● defined as a series of hierarchically connected functions (layers)
● can be very complex (deep!)
[diagram: input → hidden layer → output]
10. Artificial neural nets
What does it know?
● composed of units (neurons), distributed in layers, which control whether the data flow should continue (activation level)
● controlled by “weights” and nonlinear functions
[diagram: input → hidden layer → output]
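A minimal sketch of what a single unit does, assuming a sigmoid nonlinearity (the input values and weights are made-up numbers for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # inputs arriving from the previous layer
w = np.array([0.8, 0.1, -0.4])   # learned weights of this unit
b = 0.2                          # bias

activation = sigmoid(np.dot(w, x) + b)  # how strongly this unit "fires"
print(activation)
```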
11. Artificial neural nets
How does it learn?
● by correcting its errors
● backpropagation! The weights are adjusted and readjusted, layer by layer, until the network makes as few errors as possible
[diagram: input → hidden layer → output]
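A minimal sketch of one such error-correction step for a single sigmoid unit, using gradient descent on a squared error (the target value and learning rate are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])
b, target, lr = 0.2, 1.0, 0.1

y = sigmoid(np.dot(w, x) + b)          # forward pass
error = y - target                     # how wrong the unit is
grad = error * y * (1 - y)             # chain rule through the sigmoid
w -= lr * grad * x                     # adjust the weights...
b -= lr * grad                         # ...and the bias to reduce the error
print(y, sigmoid(np.dot(w, x) + b))    # the output moves closer to the target
```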
12. Case: image processing
● Classical problem: the MNIST dataset
○ It’s the “Hello World” of image processing
● Recognition of handwritten digits
● Training: 60,000 pictures to learn the picture-label relation
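A minimal sketch of pulling in that training set, assuming the copy of MNIST bundled with Keras is acceptable (the talk itself does not prescribe a library):

```python
from tensorflow import keras

# 60,000 training images (28x28 grayscale) with digit labels, plus 10,000 for testing.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
```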
14. Convnets
● Convolutional nets are less dense = fewer weights
● Focus on local patterns, assuming that neighboring variables are locally correlated
- Images: pixels that are close together
● One simple operation is repeated over and over several times, starting from the raw input data
● They work very well: state-of-the-art results in different fields
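To make the “fewer weights” point concrete, a small back-of-the-envelope comparison on a 28x28 input (the layer sizes are illustrative assumptions):

```python
# Fully connected: every input pixel connects to every one of 100 hidden units.
dense_weights = 28 * 28 * 100   # 78,400 weights

# Convolutional: 32 filters of size 5x5, shared across the whole image.
conv_weights = 32 * 5 * 5       # 800 weights, reused at every position

print(dense_weights, conv_weights)
```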
22. Convnets: signals
● Same principles:
○ Operations applied in a hierarchy
○ Each filter will define a feature map
○ As many feature maps as filters
○ Each filter captures a pattern
● The result is another sequence/signal
○ Transformed by the operations
[diagram: feature maps produced at the 1st, 2nd and 3rd layers]
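A minimal sketch of that idea on a 1-D signal: a small bank of filters slides over the sequence, and each filter yields its own feature map (the signal and filter values are made up for illustration):

```python
import numpy as np

signal = np.random.randn(100)            # e.g. one channel of accelerometer data

filters = [
    np.array([1, 1, 1, 1, 1]) / 5.0,     # smoothing filter: responds to slow trends
    np.array([-1, -1, 0, 1, 1]),         # edge-like filter: responds to sharp changes
]

# One feature map per filter; the result is again a sequence, just transformed.
feature_maps = [np.convolve(signal, f, mode="valid") for f in filters]
for fm in feature_maps:
    print(fm.shape)                      # (96,) for 5-sample filters
```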
24. Long short-term memory
Memory cells which can maintain their state over time, and non-linear gating units which regulate the information flow into and out of the cell
“Generating Sequences With Recurrent Neural Networks”, A. Graves
25. LSTM: layers
● Also arranged in a hierarchy: the output of layer l is the input of layer l+1
● Can model more complex time relations
“Recurrent Neural Network Regularization”, W. Zaremba et al.
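A minimal sketch of such a stack in Keras (an assumption on my part; the talk's reference implementation uses Lasagne/Theano). `return_sequences=True` is what feeds the full output sequence of layer l into layer l+1:

```python
from tensorflow import keras
from tensorflow.keras import layers

timesteps, channels, n_classes = 24, 3, 5   # illustrative sizes, not from the talk

model = keras.Sequential([
    layers.Input(shape=(timesteps, channels)),
    layers.LSTM(128, return_sequences=True),   # layer l: outputs a full sequence...
    layers.LSTM(128),                          # layer l+1: ...which it consumes
    layers.Dense(n_classes, activation="softmax"),
])
model.summary()
```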
27. DeepConvLSTM
Deep framework based on convolutional and LSTM recurrent units
● The convolutional layers are feature extractors and provide abstract representations of the input data in feature maps
● The recurrent layers model the temporal dynamics of the activation of the feature maps
https://github.com/sussexwearlab/DeepConvLSTM
28. DeepConvLSTM
Parameters are learnt automatically, but what about the hyperparameters?
● Architecture
○ How many layers
○ How many nodes/filters
○ Which type
● Data
○ Batch size
○ Size of the filters
○ Number of steps the memory cells will learn
● Training
○ Regularization
○ Learning rate
○ Gradient expressions
○ Initialization policy
29. DeepConvLSTM: hyperparameters
● Architecture
○ Layers: Conv(64)−Conv(64)−Conv(64)−Conv(64)−LSTM(128)−LSTM(128)
○ Type: ReLU units for the conv layers
● Data
○ Batch size: 100 (careful with the GPU memory)
○ Size of the filters: 5 samples
○ Number of steps the memory cells will learn: 24 samples
● Training
○ Regularization: dropout in the conv layers
○ Learning rate: small (0.0001)
○ Gradient expressions: RMSProp, usually a good choice for RNNs
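Putting those hyperparameters together, a minimal Keras sketch of the architecture (the published DeepConvLSTM code uses Lasagne/Theano; the number of sensor channels, number of classes and dropout rate below are illustrative assumptions, not from the slides):

```python
from tensorflow import keras
from tensorflow.keras import layers

window, channels, n_classes = 24, 113, 18   # 24-sample window from the slide; channels/classes assumed

model = keras.Sequential([
    layers.Input(shape=(window, channels)),
    # Four convolutional feature extractors: 64 filters each, 5-sample kernels, ReLU units,
    # with dropout as the regularizer (rate assumed).
    layers.Conv1D(64, 5, activation="relu"),
    layers.Dropout(0.5),
    layers.Conv1D(64, 5, activation="relu"),
    layers.Dropout(0.5),
    layers.Conv1D(64, 5, activation="relu"),
    layers.Dropout(0.5),
    layers.Conv1D(64, 5, activation="relu"),
    layers.Dropout(0.5),
    # Two recurrent layers modelling the temporal dynamics of the feature maps.
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(128),
    layers.Dense(n_classes, activation="softmax"),
])

# Small learning rate and RMSProp, as listed above; training would use batches of 100 windows.
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.0001),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, batch_size=100, epochs=...)
```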
36. Metrics
F-score
● Considers all errors equally important
● Combines precision and recall
● Value between 0 and 1
● The higher the F-score, the better the model
Loss
● Measures the number of errors
● Value to be optimized during the learning process
● Value between 0 and 1
● The lower the loss, the better the model
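A minimal sketch of the F-score for a binary case, computed from precision and recall (the prediction and label arrays are made up):

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_score = 2 * precision * recall / (precision + recall)  # between 0 and 1; higher is better
print(precision, recall, f_score)
```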
40. Summary
● Automatic feature learning: a convolutional filter captures a specific salient pattern and acts as a feature detector
● Recurrent layers can learn the temporal dynamics of such features
● State-of-the-art performance with restrained nets (~1M params), capable of real-time processing
● We still have to deal with the hyperparameters
“Learning to learn by gradient descent by gradient descent”, M. Andrychowicz et al.
Core ref:
“Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition”, F. J. Ordóñez et al.