Deploying ML models to production (frequently and safely) - PYCON 2018

How to deploy machine learning models to production (frequently and safely)

hello pycon
David Tan
@davified
Developer @ ThoughtWorks

About us
@thoughtworks
https://www.thoughtworks.com/intelligent-empowerment

1. First, a story about all of us...

Temperature check: who has...
● trained a ML model before?
● deployed a ML model for fun?
● deployed a ML model at work?
● an automated deployment pipeline for ML models?

The million-dollar question
How can we reliably and repeatably take our models
from our laptop to production?

What today’s talk is about
● Share principles and practices that make it easier for teams to iteratively deploy better ML products
● Share what to strive towards, and how to get there

Standing on the shoulders of giants
● @jezhumble
● @davefarley77
● @mat_kelcey
● @codingnirvana
● @kief

The stack for today’s demo

2. Why deploy frequently and safely?

Why deploy?
Until the model is in production,
it creates value for no one except ourselves

Why deploy frequently?
● Iteratively improve our model (training with new {data, hyperparameters, features})
● Correct any biases
● Counter model decay
● If it’s hard, do it more often

Why deploy safely?
One of these things is not like the others

Why deploy safely?
● ML models affect decisions that impact lives… in real-time
● Hippocratic oath for us: Do no harm.
● Safety enables us to iteratively improve ML products that better serve people

Machine learning is only one part of the problem/solution
Source: Hidden Technical Debt in Machine Learning Systems (Google, 2015)
[Figure: finding the right business problem to solve; collecting data / data engineering; training ML models; deploying and monitoring ML models (focus of this talk)]

Goal of today’s talk
[Figure: :-( Notebook / playground → PROD (maybe), versus :-) a Continuous Delivery cycle of Experiment / Develop → Test → Deploy → Monitor, started by commit and push]

4. So, how do we get there?
Challenges (and solutions from Continuous Delivery practices)

Our story’s main characters
Mario the data scientist
Luigi the engineer
local → PROD

Key concept: CI/CD Pipeline
[Figure: Local env → push → Version control → Run unit tests → Train and evaluate model → Deploy candidate model to STAGING → Deploy model to PROD. Also shown: trigger and manual trigger arrows, feedback, Data / feature repository, Model repository]
Source: Continuous Delivery (Jez Humble, Dave Farley)
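
The core idea of the pipeline can be sketched in a few lines of Python: stages run in a fixed order, and a failing stage stops everything downstream. This is only an illustration, not the tooling used in the demo; the stage names are borrowed from the slides, and `run_pipeline` and the always-passing stage functions are hypothetical stand-ins for real jobs.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and return the names of the stages that passed.

    The first failing stage stops the run, so nothing downstream of a
    failure (for example a deployment) is ever executed.
    """
    passed = []
    for name, step in stages:
        if not step():
            print(f"FAILED at {name}: later stages are skipped")
            break
        print(f"passed: {name}")
        passed.append(name)
    return passed


# Hypothetical stand-ins: a real pipeline would run the test suite,
# train the model, and call a deployment script here.
stages = [
    ("run-unit-tests", lambda: True),
    ("train-and-evaluate-model", lambda: True),
    ("deploy-to-staging", lambda: True),
]
completed = run_pipeline(stages)
```

A CI server does the same thing with more machinery: each push triggers the first stage, and the feedback is the name of the stage that failed.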

#1: Automated configuration management
Challenge
● Snowflake (dev) environments
● “Works on my machine!”
Solution
● Single-command setup
● Version control all dependencies, configuration
Benefits
● Enable experimentation by all teammates
● Production-like environment == discover potential deployment issues early on
Pipeline stage: dev
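
One small way to catch “works on my machine” drift is to compare what is installed against the pinned versions kept in version control. The sketch below assumes a pip-style requirements file with `name==version` lines; `parse_pins` and `find_mismatches` are hypothetical helper names, and this is not necessarily how the demo environment was set up.

```python
from importlib import metadata
from typing import Callable, Dict, List


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Extract {package: version} from lines of the form 'name==version'."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(
    pins: Dict[str, str],
    installed_version: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Return one message per package that is missing or at the wrong version."""
    problems = []
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed, want {wanted}")
            continue
        if found != wanted:
            problems.append(f"{name}: installed {found}, want {wanted}")
    return problems


# Example pins (illustrative versions only).
pins = parse_pins("numpy==1.14.0  # pinned\n\nscikit-learn==0.19.1\n")
```

Running `find_mismatches(pins)` as the first step of a setup script or CI job turns a silent environment difference into an explicit failure.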

#1: Automated environment configuration management (Demo)

#2: Test pyramid
Challenge
● How can I ensure my changes haven’t broken anything?
● How can I enforce the “goodness” of our models?
Solution
● Testing strategy
● Test every method
Benefits
● Fast feedback
● Safety harness allows team to boldly try new things / refactor
Test pyramid (bottom to top): unit tests, narrow/broad integration tests, ML metrics tests (all automated), then manual tests
Pipeline stage: dev
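
An “ML metrics test” can be an ordinary automated test that fails when a metric drops below an agreed threshold. The sketch below uses `unittest` with a hand-written `accuracy` function; the threshold `MIN_ACCURACY` and the toy labels are assumptions for illustration, and a real test would score the freshly trained model on a held-out set.

```python
import unittest


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical threshold: agree on this value as a team.
MIN_ACCURACY = 0.75


class ModelQualityTest(unittest.TestCase):
    def test_accuracy_meets_threshold(self):
        # Toy data standing in for held-out labels and model predictions.
        y_true = [1, 0, 1, 1, 0]
        y_pred = [1, 0, 1, 0, 0]
        self.assertGreaterEqual(accuracy(y_true, y_pred), MIN_ACCURACY)
```

Because it is a normal test case, the same runner that executes the unit tests can enforce model “goodness” on every change.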

#3: Continuous integration (CI) pipeline for automated testing
Challenge
● Not everyone runs the tests. “Goodness” checks are done manually.
● We could deploy {bugs, errors, bad models} to production
Solution
● CI/CD pipeline: automates unit tests → train → test → deploy (to staging)
● Every code change is tested (assuming tests exist)
● Source code as the only source of software/models
Benefits
● Fast feedback
Pipeline so far: dev → VCS → unit tests → train & test

#4: Artifact versioning
Challenge
● How can we revert to previous models?
● Retraining == time-consuming
● Manual renaming/redeployments of old models (if we still have them)
Solution
● Build your binaries once
● Each artifact is tagged with metadata (training data, hyperparameters, datetime)
Benefits
● Save on build times
● Confidence in artifact increases down the pipeline
● Metadata enables reproducibility
Pipeline so far: dev → VCS → unit tests → train & test → version artifact
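
Tagging an artifact with metadata can be as simple as writing a small JSON record next to the model file. The sketch below is one possible shape, assuming the model and training data are available as bytes; the field names and `build_metadata` are hypothetical, not a format used in the talk.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_metadata(
    model_bytes: bytes,
    training_data_bytes: bytes,
    hyperparameters: Dict[str, Any],
    trained_at: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Describe a model artifact so it can be identified and reproduced later."""
    trained_at = trained_at or datetime.now(timezone.utc)
    return {
        # Content hashes identify exactly which model and data were used.
        "model_sha256": hashlib.sha256(model_bytes).hexdigest(),
        "training_data_sha256": hashlib.sha256(training_data_bytes).hexdigest(),
        "hyperparameters": hyperparameters,
        "trained_at": trained_at.isoformat(),
    }


meta = build_metadata(
    model_bytes=b"serialized-model",          # stand-in for the saved model file
    training_data_bytes=b"x,y\n1,2\n",        # stand-in for the training set
    hyperparameters={"learning_rate": 0.1},
    trained_at=datetime(2018, 6, 1, tzinfo=timezone.utc),
)
print(json.dumps(meta, indent=2, sort_keys=True))
```

Storing this record with the artifact means the same binary can move through staging and production without being rebuilt, and an old version can be found again by its hash.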

#5: Continuous delivery (CD) pipeline for automated deployment
Challenge
● Deployments are scary
● Manual deployments == potential for mistakes
Solution
● Automated deployments triggered by pipeline
● Single-command deployment to staging/production
● Eliminate manual deployments
Benefits
● More rehearsal == More confidence
● Disaster recovery: (single-command) deployment of last good model in production
Pipeline so far: dev → VCS → unit tests → train & test → version artifact → deploy-staging

#5: CD pipeline for automated deployment (Demo)
# Deploy model (the actual model)
gcloud beta ml-engine versions create "$VERSION_NAME" \
  --model "$MODEL_NAME" \
  --origin "$DEPLOYMENT_SOURCE" \
  --runtime-version=1.5 \
  --framework "$FRAMEWORK" \
  --python-version=3.5
#5: CD pipeline for automated deployment (Demo)
# Deploy to prod
gcloud ml-engine versions set-default "$version_to_deploy_to_prod" \
  --model="$MODEL_NAME"
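
The “last good model” rollback mentioned under #5 could reuse the same `set-default` command. The sketch below only builds the command as an argument list and does not run it; the deployment history format, the helper names, and the `my_model` name are assumptions for illustration.

```python
from typing import List, Optional, Tuple


def last_good_version(history: List[Tuple[str, bool]]) -> Optional[str]:
    """Return the most recent version marked healthy, or None if there is none.

    `history` is ordered oldest first: (version_name, was_healthy).
    """
    for name, healthy in reversed(history):
        if healthy:
            return name
    return None


def set_default_command(version: str, model_name: str) -> List[str]:
    """Build the set-default command from the slide as an argument list."""
    return [
        "gcloud", "ml-engine", "versions", "set-default",
        version, f"--model={model_name}",
    ]


# Hypothetical history: v3 is the bad release we want to roll back from.
history = [("v1", True), ("v2", True), ("v3", False)]
target = last_good_version(history)
command = set_default_command(target, "my_model")
print(" ".join(command))
```

Passing `command` to `subprocess.run(command, check=True)` would execute it; keeping that step in the pipeline is what makes recovery a single command.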

#6: Canary releases + monitoring
Challenge
● How can I know if I’m deploying a better / worse model?
● Deployment to production may not work as expected
Solution
● Request shadowing pattern (credit: @codingnirvana)
Benefits
● Confidence increases along the pipeline, backed by metrics
● Monitoring in production == Important source of feedback
Pipeline so far: dev → VCS → unit tests → train & test → version artifact → deploy-staging → deploy-canary-prod
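
Request shadowing, as the pattern is commonly described, sends each production request to the candidate model as well, records both answers, and returns only the live model’s answer. The sketch below is a minimal in-process version under that assumption; the demo may have been wired differently, and `predict_with_shadow` and the toy models are hypothetical.

```python
import logging
from typing import Any, Callable, List, Optional

logger = logging.getLogger("shadow")

Model = Callable[[Any], Any]


def predict_with_shadow(
    features: Any,
    live_model: Model,
    candidate_model: Model,
    log: Optional[List[dict]] = None,
) -> Any:
    """Serve the live prediction; record the candidate's prediction on the side."""
    live = live_model(features)
    try:
        shadow = candidate_model(features)
    except Exception as exc:  # a broken candidate must never affect users
        logger.warning("candidate model failed: %s", exc)
        shadow = None
    if log is not None:
        log.append({"features": features, "live": live, "shadow": shadow})
    return live


# Toy models standing in for the deployed and candidate versions.
comparison_log: List[dict] = []
answer = predict_with_shadow(
    {"x": 2},
    live_model=lambda f: f["x"] * 2,
    candidate_model=lambda f: f["x"] * 3,
    log=comparison_log,
)
```

Comparing the `live` and `shadow` fields over real traffic gives metrics on the candidate before any user sees its predictions.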

#6: Canary releases + monitoring (Demo)
ML App

#7: Start simple (tracer bullet)
Challenge
● Complex models == longer time to develop / debug
● Getting all the “right” features == weeks / months
Solution
● Start with simple model + simple features
● Create solid pipeline first
● But, not simpler than what is required (and, don’t take expensive shortcuts)
Benefits
● Discover integration issues/requirements sooner
● Demonstrate working software to stakeholders in less time
Pipeline stage: dev

#7: Start simple (tracer bullet) (Demo)
dev → run-unit-tests → train-and-evaluate-model → deploy

#8: Collect more and better data with every release
Challenge
● Data collection is hard
● Garbage in, garbage out
Solution
● Think about how you can collect labels (immediately or eventually) after serving predictions (credit: @mat_kelcey)
● Create bug reports for clients
● Complete the data pipeline cycle
● Caution: watch for attempts to game your ML system
Benefits
● More and better data. Nuff said.
Pipeline stages shown: VCS, unit tests, deploy-prod, prod
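
Collecting labels “eventually” usually means giving every served prediction an ID so the true outcome can be attached later. The sketch below keeps everything in memory for illustration; `PredictionLog` and its methods are hypothetical, and a real system would write to durable storage.

```python
import uuid
from typing import Any, Dict, List, Tuple


class PredictionLog:
    """Remember served predictions so labels can be joined to them later."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, Any]] = {}

    def record(self, features: Any, prediction: Any) -> str:
        """Store a served prediction and return its ID."""
        prediction_id = uuid.uuid4().hex
        self._records[prediction_id] = {
            "features": features,
            "prediction": prediction,
            "label": None,  # unknown until feedback arrives
        }
        return prediction_id

    def add_label(self, prediction_id: str, label: Any) -> None:
        """Attach the true outcome once it becomes known."""
        self._records[prediction_id]["label"] = label

    def labelled_examples(self) -> List[Tuple[Any, Any]]:
        """Return (features, label) pairs ready for the next training run."""
        return [
            (r["features"], r["label"])
            for r in self._records.values()
            if r["label"] is not None
        ]


log = PredictionLog()
first = log.record({"amount": 120}, prediction="fraud")
second = log.record({"amount": 5}, prediction="ok")
log.add_label(first, "ok")  # feedback arrives later: the prediction was wrong
```

Only labelled records flow back into training, which closes the data pipeline cycle described above.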

#9: Build cross-functional teams
Challenge
● How can we do all of the above?
Solution
● Build cross-functional teams (data scientist, data engineer, software engineer, UX, BA)
Benefits
● Fewer nails (because not everyone is a hammer)
● Improve empathy + reduce silos == productivity
Pipeline stage: prod

#10: Kaizen mindset
Challenge
● How can we do all of the above?
Solution
● Kaizen == 改善 == change for better
● Go through deployment health checklists as a team
Benefits
● Iteratively get to good
Pipeline stage: prod

#10: Kaizen - Health checklists
❏ General software engineering practices
    ❏ Source control (e.g. git)
    ❏ Unit tests
    ❏ CI pipeline to run automated tests
    ❏ Automated deployments
❏ Data / feature-related tests
    ❏ Test all code that creates input features, both in training and serving
    ❏ ...
❏ Model-related tests
    ❏ Test against a simpler model as a baseline
    ❏ ...
Source: A rubric for ML production systems (Google, 2016)
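
The “simpler model as a baseline” item can be automated with a check like the one sketched below, which compares a candidate against a majority-class baseline. The baseline choice, the toy data, and the helper names are assumptions for illustration; the rubric itself does not prescribe a particular baseline.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

Example = Tuple[Sequence[float], Hashable]
Model = Callable[[Sequence[float]], Hashable]


def majority_class_baseline(train: List[Example]) -> Model:
    """Build a model that always predicts the most common training label."""
    most_common_label, _count = Counter(label for _, label in train).most_common(1)[0]
    return lambda _features: most_common_label


def accuracy(model: Model, examples: List[Example]) -> float:
    """Fraction of examples the model labels correctly."""
    hits = sum(1 for features, label in examples if model(features) == label)
    return hits / len(examples)


def beats_baseline(candidate: Model, train: List[Example], test: List[Example]) -> bool:
    """True only if the candidate is strictly better than the baseline."""
    return accuracy(candidate, test) > accuracy(majority_class_baseline(train), test)


# Toy data: the label is 1 when the single feature is positive.
train = [([-2.0], 0), ([-1.0], 0), ([-0.5], 0), ([1.0], 1), ([2.0], 1)]
test = [([-3.0], 0), ([0.5], 1), ([4.0], 1), ([-0.1], 0)]


def candidate(features: Sequence[float]) -> int:
    return 1 if features[0] > 0 else 0
```

A pipeline stage can fail the build when `beats_baseline(candidate, train, test)` is false, so a complex model never ships unless it earns its complexity.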

#10: Kaizen - Health checks
● How much calendar time to deploy a model from staging to production?
● How much calendar time to add a new feature to the production model?
● How comfortable does your team feel about iteratively deploying
models?

A generalizable approach for deploying ML models frequently and safely
[Figure: Local env → push → Version control → Run unit tests → Train and evaluate model → Deploy candidate model to STAGING → Deploy model to PROD. Also shown: trigger and manual trigger arrows, feedback, Data / feature repository, Model repository]
Credit: Continuous Delivery (Jez Humble, Dave Farley)

Solve the right problem
We don’t have a machine learning problem.
We have a {business, data, software delivery, ML, UX}
problem

Solve the right problem
01 Data collection
02 Machine learning
03 Deployment and monitoring (focus of today’s talk)

How to deploy models to prod {frequently, safely, repeatably, reliably}?
1. Automate configuration management
2. Think about your test pyramid
3. Set up a continuous integration (CI) pipeline
4. Version your artifacts (i.e. models)
5. Automated deployment
6. Try canary releases
7. Start simple (tracer bullet)
8. Collect more and better data with every release
9. Build cross-functional teams
10. Kaizen / continuous improvement

We’re hiring!
● Software Developers (>= junior-level devs welcome)
● UX Designer
● Senior Information Security Consultant

Resources for further reading
● Visibility and monitoring for machine learning (12-min video)
● Using continuous delivery with machine learning models to tackle fraud
● What’s your ML Test Score? A rubric for ML production systems (Google)
● Rules of Machine Learning (Google)
● Continuous Delivery (Jez Humble, Dave Farley)
● Why you need to improve your training data and how to do it
