MACHINE LEARNING FOR TIME SERIES
WHAT WORKS AND WHAT DOESN’T
DR. MIKIO L. BRAUN
AI ARCHITECT AT ZALANDO
@mikiobraun
STRATA DATA LONDON, MARCH 23, 2018
TIME SERIES ANALYSIS
MIKIO BRAUN, MACHINE LEARNING FOR TIME SERIES: WHAT WORKS AND WHAT DOESN'T, STRATA DATA 2018 LONDON
TIME SERIES APPLICATIONS
MACHINE LEARNING FOR TIME SERIES
CLASSICAL METHODS
Strong assumptions on stationarity. Predictions as linear combinations of past data / i.i.d. noise.
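As a minimal sketch of the simplest model of this kind, the snippet below simulates an AR(1) process, where each value is a multiple of the previous value plus i.i.d. Gaussian noise, and recovers the coefficient by least squares. The function names, the coefficient 0.7 and the sample size are illustrative assumptions, not taken from the talk.

```python
import random

def simulate_ar1(phi, n, noise_std=1.0, seed=0):
    """Simulate x_t = phi * x_{t-1} + eps_t with i.i.d. Gaussian noise eps_t."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, noise_std))
    return x

def fit_ar1(x):
    """Least-squares estimate of phi from consecutive pairs (x_{t-1}, x_t)."""
    num = sum(cur * prev for cur, prev in zip(x[1:], x[:-1]))
    den = sum(prev * prev for prev in x[:-1])
    return num / den

series = simulate_ar1(phi=0.7, n=5000)
phi_hat = fit_ar1(series)   # should land close to the true coefficient 0.7
```

The estimate is only meaningful if the process is stationary (here, |phi| < 1), which is exactly the assumption the slide flags as strong.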
ESTIMATING WITH THE BOX-JENKINS PROGRAM
WHAT WORKS
• Solid theoretical background.
• Very explicit modeling.
• A lot of control as it is a manual process.
• Bayesian version available to provide uncertainty estimates.
CHALLENGES: SEASONALITY & NON-STATIONARITY
In reality, data is seldom stationary; it shows trends, seasonality, cycles, and so on.
In the classical approach, these are manually removed first.
DIFFERENCING AND SCALING
• Running means.
• De-trending by differencing.
• Variance stabilization by log, square root, or Box-Cox transformation.
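The three steps listed above can be sketched in plain Python as follows. The helper names and the toy inputs are assumptions made for illustration; the Box-Cox formula is the standard one for strictly positive data.

```python
import math

def running_mean(x, window):
    """Mean over each full window of consecutive values."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]

def difference(x, lag=1):
    """x_t - x_{t-lag}; lag=1 turns a linear trend into a constant,
    lag=s removes a seasonal pattern of period s."""
    return [x[i] - x[i - lag] for i in range(lag, len(x))]

def box_cox(x, lam):
    """Box-Cox transform for positive data.

    lam=0 is the log; lam=0.5 is a shifted and scaled square root.
    """
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]

trend = [2.0 * t + 1.0 for t in range(6)]
detrended = difference(trend)                        # constant once the trend is gone
smoothed = running_mean([1.0, 2.0, 3.0, 4.0], window=2)
stabilized = box_cox([1.0, 4.0, 9.0], lam=0.5)
```

Any transformation applied before fitting has to be inverted on the predictions afterwards, for example by cumulatively summing differenced forecasts.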
CLASSICAL METHODS: WHAT DOESN’T WORK SO WELL
• What if assumptions do not hold?
• Stationarity is a rather strong requirement.
• Linear autoregressive models are somewhat “boring.”
MORE GENERAL MACHINE LEARNING APPROACH
By explicitly collecting the past of each point, we can construct a supervised learning setting.
This still differs from the standard setting because the points are highly correlated.
Any number of methods can be used (linear models, SVMs, neural networks, …).
Easily extends to other areas as well:
• Multiple input variables.
• Multiple output variables.
• Additional variables to feed into the model.
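Collecting the past of each point amounts to sliding a window over the series. The sketch below is one way to do it; `make_supervised`, `n_lags` and `horizon` are names chosen here for illustration.

```python
def make_supervised(series, n_lags, horizon=1):
    """Turn a series into (features, target) pairs.

    Each feature row holds n_lags consecutive past values; the target is the
    value `horizon` steps after the last value in the row.
    """
    X, y = [], []
    for end in range(n_lags, len(series) - horizon + 1):
        X.append(series[end - n_lags:end])
        y.append(series[end + horizon - 1])
    return X, y

X, y = make_supervised([10, 11, 12, 13, 14, 15], n_lags=3)
# X == [[10, 11, 12], [11, 12, 13], [12, 13, 14]]
# y == [13, 14, 15]
```

The resulting rows can be passed to any regression method. Extra input variables can be appended to each row, and several targets can be collected per row in the same loop.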
EVALUATION AND CROSS-VALIDATION WITH TIME SERIES DATA
In ML, one often uses cross-validation to
estimate performance on future data.
Since time series data is highly correlated, one
cannot sample test data at random but should
sample block-wise.
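A block-wise split can be sketched as below. Each test fold is one contiguous block of indices. The optional `gap` parameter, which drops neighbours of the test block from the training set, is an addition made here and is not mentioned in the talk.

```python
def blocked_folds(n_samples, n_blocks, gap=0):
    """Yield (train, test) index lists where each test set is one contiguous block.

    `gap` removes that many points on either side of the test block from the
    training set, to reduce leakage through correlated neighbours.
    """
    block = n_samples // n_blocks
    for k in range(n_blocks):
        start = k * block
        # The last block absorbs any remainder.
        stop = n_samples if k == n_blocks - 1 else start + block
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start - gap or i >= stop + gap]
        yield train, test

folds = list(blocked_folds(n_samples=10, n_blocks=5, gap=1))
# folds[1] == ([0, 5, 6, 7, 8, 9], [2, 3])
```

When the goal is strictly to estimate performance on future data, a common alternative is to train only on indices that precede the test block.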
CHALLENGES: EXTEND PREDICTION
Prediction can be done one point at a time, using test data as past values as they become available.
Alternatively, one can feed the predictions themselves back in as past values, which leads to much less stable predictions.
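The two prediction modes can be compared with a small sketch. The `model` below is a hypothetical, deliberately mis-specified predictor (coefficient 0.9 on a constant series), chosen only to show how errors compound when predictions are fed back in.

```python
def one_step_forecasts(model, history, future, n_lags):
    """Predict each future point from true past values as they become available."""
    known = list(history)
    preds = []
    for actual in future:
        preds.append(model(known[-n_lags:]))
        known.append(actual)          # the observed value enters the history
    return preds

def recursive_forecasts(model, history, steps, n_lags):
    """Predict several steps ahead by feeding predictions back in as inputs."""
    known = list(history)
    preds = []
    for _ in range(steps):
        pred = model(known[-n_lags:])
        preds.append(pred)
        known.append(pred)            # the prediction stands in for the unseen value
    return preds

def model(window):
    """Hypothetical predictor: 0.9 times the most recent value."""
    return 0.9 * window[-1]

history, future = [1.0, 1.0], [1.0, 1.0, 1.0]
one_step = one_step_forecasts(model, history, future, n_lags=1)
multi_step = recursive_forecasts(model, history, steps=3, n_lags=1)
```

In the one-step mode the error stays at 0.1 for every point, while in the recursive mode it grows with each step because every prediction inherits the error of the one before.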
CLICK DATA & BEYOND SIMPLE TIME SERIES MODELS
Another interesting data source is event
data (click data, customer actions, …).
These show very similar properties: strong
dependence, predictions depend on past,
etc.
Often, data needs to be summarized and
transformed to get good predictions.
FEATURE ENGINEERING FOR TIME SERIES
• Aggregate histograms over time scales.
• Transform into Fourier space.
• Apply bandpass / low-pass / high-pass filters.
• Intelligent filtering: independent component analysis, canonical correlation analysis.
• Downside: quite costly to retrain on each iteration.
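Two of the transformations listed for this slide, aggregating events into time buckets and moving to Fourier space, can be sketched with the standard library alone. The function names and toy inputs are assumptions; the DFT here is the naive quadratic version, where real code would normally call an FFT routine.

```python
import cmath
import math

def count_events(timestamps, bucket_size, n_buckets):
    """Aggregate event timestamps into counts per time bucket."""
    counts = [0] * n_buckets
    for t in timestamps:
        idx = int(t // bucket_size)
        if 0 <= idx < n_buckets:
            counts[idx] += 1
    return counts

def dft_magnitudes(x):
    """Magnitude of each discrete Fourier coefficient (naive O(n^2) version)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

hourly = count_events([0.5, 1.2, 1.7, 3.9], bucket_size=1.0, n_buckets=4)
# A signal with period 4, sampled 8 times: energy sits at frequency index 2
# and its mirror image at index 6.
spectrum = dft_magnitudes([1.0, 0.0, -1.0, 0.0] * 2)
```

Changing `bucket_size` gives the same events at different time scales, and zeroing selected Fourier coefficients before transforming back is one way to build the filters mentioned in the list.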
DEEP LEARNING: LONG SHORT TERM MEMORY
Recurrent neural networks base predictions on the past data point and a hidden state.
The hidden state can aggregate features automatically.
LSTM is a particularly flexible variant that has (learnable) gates and transformations to control how the hidden state is updated.
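To make the gate mechanism concrete, here is a sketch of one update step of an LSTM cell with a single hidden unit, following the standard formulation. The weights are hypothetical values picked for illustration; in practice they are learned, and a library implementation works on vectors rather than scalars.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (standard formulation, one hidden unit).

    params maps each gate name to (w_x, w_h, b): the weight on the input,
    the weight on the previous hidden state, and a bias.
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate update
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

# Hypothetical weights, chosen only for illustration.
gate_names = ("forget", "input", "output", "candidate")
params = {name: (0.5, 0.5, 0.0) for name in gate_names}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.3]:
    h, c = lstm_step(x, h, c, params)
```

The cell state `c` is the part that carries information across many steps, since the forget gate can keep it nearly unchanged.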
APPLICATION: ANALYZING USER ACTIONS @ ZALANDO
• Goal is to predict buy probability based on user
histories.
• Before: many handcrafted features + logistic regression.
• Drawback: retune all the features again and again.
• With DL: embedding of user histories in an RNN plus user-specific features.
• Already performs pretty well.
Lang, Rettenmeier: "Understanding Customer Behavior with Recurrent Neural Networks", MLRec 2017
DEEP LEARNING FOR CUSTOMER ACTIONS @ ZALANDO
APPLICATION: DEMAND PREDICTION FOR RARE EVENTS @ UBER
Uber is interested in having reliable models also during extreme events like Thanksgiving or
New Year's Day—which have little coverage in usual data.
https://eng.uber.com/neural-networks/
DEMAND PREDICTION AT UBER: THE DATA
The available data includes a number of exogenous features such as weather and app views.
ARCHITECTURE: TIME SERIES AUTOENCODERS
A stacked LSTM autoencoder captures general dynamics and informative features.
These features are then concatenated with the actual input and fed into another LSTM forecast network.
APPLICATION: DEMAND PREDICTION @ AMAZON, MANY TIME SERIES & PROBABILITIES
https://arxiv.org/abs/1704.04110
Challenges of predicting article
demand over thousands of articles:
• Numbers on many scales.
• Amount of available data varies.
• We want probability distributions in
predictions.
• Predictions ahead in time.
DEEP AR @ AMAZON
• Use LSTM to learn interactions in the time series.
• LSTMs also propagate knowledge about dynamics to time series with few data points.
• LSTM predicts parameters of distributions at each point.
• Pre- & post-scale time series.
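The pre- and post-scaling step can be illustrated with a short sketch. The scale factor used here, one plus the mean of the observed history, is an assumption made for this example, as are the function names and the Gaussian output. The idea is only that the network sees values of comparable size for every series, and its predicted parameters are mapped back to the original units.

```python
def series_scale(history):
    """Scale factor for one series: 1 + mean of its observed values.

    Assumes non-negative data such as demand counts.
    """
    return 1.0 + sum(history) / len(history)

def scale_inputs(history):
    """Divide a series by its own scale factor before feeding it to the model."""
    nu = series_scale(history)
    return [v / nu for v in history], nu

def rescale_gaussian(mu_scaled, sigma_scaled, nu):
    """Map a Gaussian predicted on the scaled series back to original units."""
    return mu_scaled * nu, sigma_scaled * nu

scaled, nu = scale_inputs([99.0, 99.0, 99.0])      # nu == 100.0
# Pretend the model predicted mean 0.99 and spread 0.1 on the scaled series.
mu, sigma = rescale_gaussian(0.99, 0.1, nu)
```

The added 1 keeps the factor well defined for series that are all zeros.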
DEEP AR: TRAINING
• Procedure:
  1. Predict parameter
  2. Compute likelihood
  3. Sample next point
• Train by maximizing likelihood.
• Train directly on requested prediction into the future.
• Sample points to go into the future.
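The procedure on this slide can be sketched without a neural network by replacing the LSTM with a stand-in function. Everything named here is hypothetical: `predict_params` plays the role of the network, and a Gaussian output distribution is assumed for simplicity. Maximizing the likelihood is written as minimizing the summed negative log-likelihood.

```python
import math
import random

def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under a normal distribution N(mu, sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)

def training_loss(predict_params, series):
    """Sum of per-step negative log-likelihoods, each conditioned on the true past."""
    loss = 0.0
    for t in range(1, len(series)):
        mu, sigma = predict_params(series[:t])      # 1. predict parameters
        loss += gaussian_nll(series[t], mu, sigma)  # 2. compute likelihood
    return loss

def sample_path(predict_params, history, steps, rng):
    """Go into the future by sampling each next point and feeding it back in."""
    path = list(history)
    for _ in range(steps):
        mu, sigma = predict_params(path)
        path.append(rng.gauss(mu, sigma))           # 3. sample next point
    return path[len(history):]

def predict_params(past):
    """Hypothetical stand-in for the network: mean is the last value, spread is 1."""
    return past[-1], 1.0

loss = training_loss(predict_params, [0.0, 0.0, 0.0])
paths = [sample_path(predict_params, [5.0], steps=3, rng=random.Random(s))
         for s in range(100)]
```

Drawing many sample paths, as in the last line, gives an empirical distribution for each future time step, from which quantiles can be read off.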
SUMMARY
Classical Time Series Models
• Use to get started.
• Use if explicit modeling is good.
General Machine Learning
• If you are unsure about modeling assumptions.
• But: use proper validation to ensure good performance.
Feature Engineering
• For more complex data.
• If you have a priori knowledge about the domain.
Deep Learning
• If you have a lot of data.
• If you frequently want to iterate & experiment.
• If explicit modeling & feature engineering is too costly.
THANK YOU!
