Continuous Intelligence
Moving Machine Learning Applications into Production Reliably
Christoph Windheuser
Danilo Sato
Emily Gorcenski
Arif Wider
ThoughtWorks Inc.
WORKSHOP ON WHY AND HOW TO APPLY CONTINUOUS DELIVERY TO MACHINE LEARNING (CD4ML)
©ThoughtWorks 2019
Strata Data Conference London, April 30, 2019
Structure of Today’s Workshop
- INTRODUCTION TO THE TOPIC
- EXERCISE 1: SETUP
- EXERCISE 2: DEPLOYMENT PIPELINE
- BREAK
- EXERCISE 3: ML PIPELINE
- EXERCISE 4: TRACKING EXPERIMENTS
- EXERCISE 5: MODEL MONITORING
5,000+ technologists across 40 offices in 14 countries
Partner for technology-driven business transformation
join.thoughtworks.com
#1 in Agile and Continuous Delivery
100+ books written
TECHNIQUES #8 (ASSESS): Continuous delivery for machine learning (CD4ML) models
CONTINUOUS INTELLIGENCE CYCLE
CD4ML isn’t a technology or a
tool; it is a practice and a set of
principles. Quality is built into
software and improvement is
always possible.
But machine learning systems
have unique challenges; unlike
deterministic software, it is
difficult—or impossible—to
understand the behavior of
data-driven intelligent systems.
This poses a huge challenge
when it comes to deploying
machine learning systems in
accordance with CD principles.
PRODUCTIONIZING ML IS HARD
Production systems should be:
● Reproducible
● Testable
● Auditable
● Continuously Improving
Machine Learning is:
● Non-deterministic
● Hard to test
● Hard to explain
● Hard to improve
HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
MANY SOURCES OF CHANGE
Data + Model + Code
● Data: Schema, Sampling over Time, Volume, ...
● Model: Research, Experiments, Training on New Data, Performance, ...
● Code: New Features, Bug Fixes, Dependencies, ...
Icons created by Noura Mbarki and I Putu Kharismayadi from Noun Project
“Continuous Delivery is the ability to get changes of
all types — including new features, configuration
changes, bug fixes and experiments — into
production, or into the hands of users, safely and
quickly in a sustainable way.”
- Jez Humble & Dave Farley
PRINCIPLES OF CONTINUOUS DELIVERY
→ Create a Repeatable, Reliable Process for Releasing Software
→ Automate Almost Everything
→ Build Quality In
→ Work in Small Batches
→ Keep Everything in Source Control
→ Done Means “Released”
→ Improve Continuously
WHAT DO WE NEED IN OUR STACK?
Doing CD with Machine Learning is still a hard problem
● Discoverable and Accessible Data
● Version Control and Artifact Repositories
● Continuous Delivery Orchestration to Combine Pipelines
● Infrastructure for Multiple Environments and Experiments
● Model Performance Assessment Tracking
● Model Monitoring and Observability
PUTTING EVERYTHING TOGETHER
(Pipeline diagram, built up one stage at a time across the slides:)
Data Science, Model Building → Model Evaluation → Productionize Model → Integration Testing → Deployment → Monitoring
● Data inputs: Training Data, Test Data, Production Data
● Artifacts: Source Code + Executables, Model + parameters
● Foundation: CD Tools and Repositories; Discoverable and Accessible Data
WHAT WE WILL USE IN THIS WORKSHOP
There are many options for tools and technologies to implement CD4ML
THE MACHINE LEARNING PROBLEM WE ARE EXPLORING TODAY
A REAL BUSINESS PROBLEM
RETAIL / SUPPLY CHAIN
Loss of sales, opportunity cost, stock waste, discounting
REQUIRES
Accurate Demand Forecasting
TYPICAL CHALLENGES
→ Predictions Are Inaccurate
→ Development Takes A Long Time
→ Difficult To Adapt To Market Change Pace
SALES FORECASTING FOR GROCERY RETAILER
TASK: Predict how many of each product will be purchased in each store on a given date
Make predictions based on data from:
● 4,000 items
● 50 stores
● 125,000,000 sales transactions
● 4.5 years of data
THE SIMPLIFIED WEB APPLICATION
As a buyer, I want to be able to choose a product and
predict how many units the product will sell at a future date.
EXERCISE 1: SETUP
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions → 1-setup.md
● Follow the steps to set up your local development environment
User ID assignment sheet: http://bit.ly/cd4ml-strata19
DEPLOYMENT PIPELINE
Automates the process of building, testing, and deploying applications to production
Application code in version control repository → Container image as deployment artifact → Deploy container to production servers
GoCD: an Open Source Continuous Delivery server to model and visualise complex workflows
ANATOMY OF A GOCD PIPELINE
(Diagram of a GoCD Pipeline Group)
EXERCISE 2: DEPLOYMENT PIPELINE
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions → 2-deployment-pipeline.md
● Follow the steps to set up your deployment pipeline
● GoCD URL: https://gocd.cd4ml.net
BUT NOW WHAT?
Once your model is in production...
● How do we retrain the model more often?
● How do we deploy the retrained model to production?
● How do we make sure we don’t break anything when deploying?
● How do we make sure that our modeling approach or parameterization is still the best fit for the data?
● How do we monitor our model “in the wild”?
BASIC DATA SCIENCE WORKFLOW
Gather data and extract features → Separate into training and validation sets → Train model and evaluate performance
SALES FORECAST MODEL TRAINING PROCESS
download_data.py → Data
splitter.py → Training Data + Validation Data
decision_tree.py → model.pkl
evaluation.py → metrics.json
CHALLENGE 1: THESE ARE LARGE FILES!
(Same training-process diagram: the datasets and model.pkl are large artifacts.)
CHALLENGE 2: AD-HOC MULTI-STEP PROCESS
(Same training-process diagram: each step is run by hand, in order.)
SOLUTION: dvc (data science version control)
● dvc is git porcelain for storing large files using cloud storage
● dvc connects model training steps to create reproducible workflows
(Diagram: branches master, change-max-depth, and try-random-forest; decision_tree.py and model.pkl.dvc tracked in git, model.pkl stored in the cloud)
ANATOMY OF A DVC COMMAND
dvc run -d src/download_data.py -o data/raw/store47-2016.csv python src/download_data.py
This runs a command and creates a .dvc file. The .dvc file points to the dependencies. The output files are versioned and stored in the cloud by running dvc push.
When you use the output files (store47-2016.csv) as dependencies for the next step, a pipeline dependency is automatically created.
You can re-execute an entire pipeline with one command: dvc repro
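Chaining several such commands wires the whole training process into one reproducible dvc pipeline. A sketch only: the first command is from the slide, but the src/ script locations and the intermediate output paths for the later stages are invented here to illustrate how `-d` on one stage's output links it to the previous stage.

```shell
# Stage 1: fetch raw data (command from the slide)
dvc run -d src/download_data.py -o data/raw/store47-2016.csv \
    python src/download_data.py

# Stage 2: split into training and validation sets
# (hypothetical paths; depending on stage 1's output links the stages)
dvc run -d src/splitter.py -d data/raw/store47-2016.csv \
    -o data/splitter/train.csv -o data/splitter/validation.csv \
    python src/splitter.py

# Stage 3: train the model
dvc run -d src/decision_tree.py -d data/splitter/train.csv \
    -o data/trained_models/model.pkl \
    python src/decision_tree.py

dvc push    # store the large outputs (data files, model.pkl) in cloud storage
dvc repro   # re-execute the whole pipeline after any dependency changes
```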
EXERCISE 3: MACHINE LEARNING PIPELINE
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions → 3-machine-learning-pipeline.md
● Follow the steps on your local development environment and in GoCD to create your Machine Learning pipeline
HOW DO WE TRACK EXPERIMENTS?
We need to track the scientific process and evaluate our models:
● Which experiments and hypotheses are being explored?
● Which algorithms are being used in each experiment?
● Which version of the code was used?
● How long does it take to run each experiment?
● What parameters and hyperparameters were used?
● How fast are my models learning?
● How do we compare results from different runs?
MLflow: an Open Source platform for managing the end-to-end machine learning lifecycle
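To make the tracking questions above concrete, here is a minimal stdlib stand-in for an experiment tracker: each run is appended as one JSON record capturing the algorithm, code version, parameters, duration, and metrics. MLflow provides the production version of this idea (e.g. mlflow.log_param / mlflow.log_metric); every name below is illustrative, not the workshop's actual code.

```python
# Toy experiment tracker: one JSON line per training run.
import json, os, tempfile, time, uuid

def track_run(store, algorithm, code_version, params, train_fn):
    start = time.time()
    metrics = train_fn(params)                 # e.g. {"mae": 3.2}
    record = {
        "run_id": uuid.uuid4().hex,
        "algorithm": algorithm,
        "code_version": code_version,          # a git SHA in a real setup
        "params": params,
        "metrics": metrics,
        "duration_s": round(time.time() - start, 3),
    }
    with open(store, "a") as f:                # append-only run log
        f.write(json.dumps(record) + "\n")
    return record

store = os.path.join(tempfile.mkdtemp(), "runs.jsonl")
rec = track_run(store, "decision_tree", "abc123",
                {"max_depth": 5}, lambda p: {"mae": 3.2})
```

Because every run records its parameters and metrics in one place, comparing results from different runs becomes a query over the log rather than archaeology.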
EXERCISE 4: TRACKING EXPERIMENTS
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions → 4-tracking-experiments.md
● Follow the steps to track ML training in MLflow
● MLflow URL: https://mlflow.cd4ml.net
HOW TO LEARN CONTINUOUSLY?
We need to capture production data to improve our models:
● Track model usage
● Track model inputs to find training-serving skew
● Track model outputs
● Track model interpretability outputs to identify potential bias or overfitting
● Track model fairness to understand how it behaves against dimensions that could introduce unfair bias
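One way to capture the signals listed above is to emit a structured event for every prediction the model serves, so a log collector such as Fluentd can ship it to Elasticsearch. A minimal sketch; the field schema is invented for illustration, not the workshop's actual one.

```python
# Emit one JSON log line per prediction: inputs, output, and model version.
import datetime, json

def prediction_event(product, store, date, predicted_units, model_version):
    return json.dumps({
        "@timestamp": datetime.datetime.utcnow().isoformat() + "Z",
        "event": "prediction",
        "model_version": model_version,   # ties outputs back to a training run
        "input": {"product": product, "store": store, "date": date},
        "output": {"predicted_units": predicted_units},
    })

line = prediction_event("bread", 47, "2019-05-01", 12.4, "model-abc123")
```

Logging both inputs and outputs alongside the model version is what later makes training-serving skew detectable: you can compare the distribution of production inputs against the training data for that exact model.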
EFK STACK
Monitoring and Observability infrastructure
● Fluentd: Open Source data collector for unified logging
● Elasticsearch: Open Source search engine
● Kibana: Open Source web UI to explore and visualise data
Kibana: an Open Source UI that makes it easy to explore and visualise the data indexed in Elasticsearch
EXERCISE 5: MODEL MONITORING
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions → 5-model-monitoring.md
● Follow the steps to log prediction events
● Kibana URL: https://kibana.cd4ml.net
SUMMARY - WHAT HAVE WE LEARNED?
CD4ML
● Proper data/model versioning tools enable reproducible work to be done in
parallel.
● No need to manually maintain and run complex data processing/model training scripts.
● We can then put data science work into a Continuous Delivery workflow.
● Result: Continuous, on-demand AI development and deployment, from
research to production, with a single command.
● Benefit: production AI systems that are always as smart as your data
science team.
THANK YOU!
Danilo Sato (dsato@thoughtworks.com)
Christoph Windheuser (cwindheu@thoughtworks.com)
Emily Gorcenski (egorcens@thoughtworks.com)
Arif Wider (awider@thoughtworks.com)
join.thoughtworks.com
WHAT DO WE NEED IN OUR STACK?
Doing CD with Machine Learning is still a hard problem
● Model Performance Assessment Tracking
● Business Value Assessment
● Version Control (for code, models, and data)
● Deployment, Monitoring, and Logging
VERSION CONTROL & COLLABORATION
Our code, data, and models should be versioned and shareable without unnecessary work.
What are the challenges?
■ Data and models can be very large
■ Data can vary invisibly
■ Data scientists need to share work and it must be repeatable
What does the ideal solution look like?
■ Large artifacts stored in arbitrary storage linked to the source repo
■ Data scientists can encode their work process and repeat it in one step
What solutions are out there now?
■ Storage: S3, HDFS, etc.
■ Git LFS
■ Shell scripts
■ dvc
■ Pachyderm
■ jupyterhub
MODEL PERFORMANCE TRACKING
We should be able to scale model development to try multiple modeling approaches simultaneously.
What are the challenges?
■ Hyperparameter tuning and model selection are hard
■ Tracking performance depends on other moving parts (e.g. data)
What does the ideal solution look like?
■ Links models to specific training sets and parameter sets
■ Makes runs easy to differentiate
■ Allows visualization of results
What solutions are out there now?
■ dvc
■ MLflow
■ No shortage of options here
