How to deploy machine learning models to
production (frequently and safely)
hello pycon
David Tan
@davified
Developer @ ThoughtWorks
About us
@thoughtworks
https://www.thoughtworks.com/intelligent-empowerment
1. First, a story about all
of us...
Temperature check: who has...
● trained an ML model before?
● deployed an ML model for fun?
● deployed an ML model at work?
● an automated deployment pipeline for ML models?
The million-dollar question
How can we reliably and repeatably take our models
from our laptop to production?
What today’s talk is about
Share principles and practices that can
make it easier for teams to iteratively deploy better ML
products
Share what to strive towards, and
how to strive towards it
Standing on the shoulders of giants
● @jezhumble
● @davefarley77
● @mat_kelcey
● @codingnirvana
● @kief
The stack for today’s demo
Demo
2. Why deploy
frequently and safely?
Why deploy?
Until the model is in production,
it creates value for no one except ourselves
Why deploy frequently?
● Iteratively improve our model (training with new {data, hyperparameters,
features})
● Correct any biases
● Counter model decay
● If it’s hard, do it more often
Why deploy safely?
One of these things is not like the others
Why deploy safely?
● ML models affect decisions that impact lives… in real-time
● Hippocratic oath for us: Do no harm.
● Safety enables us to iteratively improve ML products that better serve
people
Machine learning is only one part of the problem/solution
Source: Hidden Technical Debt in Machine Learning Systems (Google, 2015)
● Finding the right business problem to solve
● Collecting data / data engineering
● Training ML models
● Deploying and monitoring ML models (focus of this talk)
Goal of today’s talk
[Diagram: from :-( (notebook / playground → PROD, maybe) to :-) (a Continuous Delivery cycle of experiment / develop, test, deploy and monitor, driven by commit and push)]
4. So, how do we get there?
Challenges (and solutions from Continuous Delivery practices)
Our story’s main characters
Mario the data scientist
Luigi the engineer
local → PROD
Key concept: CI/CD Pipeline
[Diagram: Local env → push → Version control → trigger → Run unit tests → Train and evaluate model → Deploy candidate model to STAGING → manual trigger → Deploy model to PROD, with feedback along the way. Supporting stores: Data / feature repository, Model repository]
Source: Continuous Delivery (Jez Humble, Dave Farley)
#1: Automated configuration management
Challenge
● Snowflake (dev) environments
● “Works on my machine!”
Solution
● Single-command setup
● Version control all dependencies, configuration
Benefits
● Enable experimentation by all teammates
● Production-like environment == discover potential deployment issues early on
Pipeline stages: dev
#1: Automated environment configuration management (Demo)
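The "works on my machine" problem can be caught early by comparing a machine against the version-controlled dependency list. The following is a minimal sketch, assuming dependencies are pinned as `name==version` lines; the function names and the sample pins are hypothetical and not taken from the demo.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {package: (pinned, found)} for packages that differ or are missing."""
    return {
        name: (pinned, installed.get(name))
        for name, pinned in pins.items()
        if installed.get(name) != pinned
    }


def installed_versions(names):
    """Look up the versions actually installed in the current environment."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


# Hypothetical pins; a real project would read its own requirements file.
example_pins = parse_pins("numpy==1.14.0\nscikit-learn==0.19.1  # example pin\n")
```

Calling `find_mismatches(example_pins, installed_versions(example_pins))` as part of a setup script would list every package whose installed version differs from the pinned one.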
#2: Test pyramid
Challenge
● How can I ensure my changes haven’t broken anything?
● How can I enforce the “goodness” of our models?
Solution
● Testing strategy
● Test every method
Benefits
● Fast feedback
● Safety harness allows team to boldly try new things / refactor
Test pyramid levels: unit tests, narrow/broad integration tests, ML metrics tests (automated); manual tests
Pipeline stages: dev
#2: Test pyramid (Demo)
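One way to automate the "ML metrics tests" layer is a test that fails when a model's evaluation metric drops below an agreed threshold. The sketch below is written as plain test functions that a runner such as pytest could collect; the threshold value, the `accuracy` helper and the stand-in model are hypothetical illustrations, not the demo's code.

```python
# Assumed quality bar; a team would agree on its own value.
MIN_ACCURACY = 0.75


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def predict_is_positive(x):
    """Stand-in for a trained model: classifies a number as positive or not."""
    return x > 0


def test_accuracy_counts_matches():
    # Unit test (base of the pyramid): small, fast, no model needed.
    assert accuracy([1, 0, 1, 1], [1, 0, 0, 1]) == 0.75


def test_model_meets_minimum_accuracy():
    # ML metrics test: evaluate the model on a held-out set.
    holdout_inputs = [-2, -1, 1, 2, 3]
    holdout_labels = [False, False, True, True, True]
    predictions = [predict_is_positive(x) for x in holdout_inputs]
    assert accuracy(predictions, holdout_labels) >= MIN_ACCURACY
```

Keeping the metric check in the same suite as the unit tests means a model that regresses fails the build in the same way a code bug would.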
#3: Continuous integration (CI) pipeline for automated testing
Challenge
● Not everyone runs the tests; “goodness” checks are done manually
● We could deploy {bugs, errors, bad models} to production
Solution
● CI/CD pipeline: automates unit tests → train → test → deploy (to staging)
● Every code change is tested (assuming tests exist)
● Source code as the only source of software/models
Benefits
● Fast feedback
Pipeline stages: dev → VCS → unit tests → train & test
#3: CI pipeline (Demo)
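The "unit tests → train → test → deploy (to staging)" flow can be pictured as an ordered list of stages that stops at the first failure. This is a toy sketch of that idea, not the configuration of any particular CI server; the stage functions are hypothetical placeholders.

```python
def run_pipeline(stages):
    """Run (name, callable) stages in order and stop at the first failure.

    Returns a list of (name, status) pairs, where status is 'passed',
    'failed' or 'skipped'.
    """
    results = []
    failed = False
    for name, stage in stages:
        if failed:
            results.append((name, "skipped"))
            continue
        try:
            stage()
            results.append((name, "passed"))
        except Exception:
            results.append((name, "failed"))
            failed = True
    return results


def unit_tests():
    assert 1 + 1 == 2


def train_and_test():
    # Placeholder: pretend the evaluation metric fell below the bar.
    raise RuntimeError("accuracy below threshold")


def deploy_to_staging():
    pass


results = run_pipeline([
    ("unit tests", unit_tests),
    ("train & test", train_and_test),
    ("deploy-staging", deploy_to_staging),
])
print(results)
```

Because the failing stage blocks everything after it, a bad model in this sketch never reaches the staging deployment.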
#4: Artifact versioning
Challenge
● How can we revert to previous models?
● Retraining == time-consuming
● Manual renaming/redeployments of old models (if we still have them)
Solution
● Build your binaries once
● Each artifact is tagged with metadata (training data, hyperparameters, datetime)
Benefits
● Save on build times
● Confidence in artifact increases down the pipeline
● Metadata enables reproducibility
Pipeline stages: dev → VCS → unit tests → train & test → version artifact
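Tagging each artifact with metadata could look like the sketch below. It assumes the model and training data are available as bytes and identifies them by SHA-256 hash; the field names and the version label format are hypothetical choices, not the talk's.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_artifact_metadata(model_bytes, training_data_bytes, hyperparameters, built_at=None):
    """Describe one trained model so it can be identified and reproduced later."""
    built_at = built_at or datetime.now(timezone.utc)
    model_hash = hashlib.sha256(model_bytes).hexdigest()
    return {
        # Version label: build time plus the first 8 hex digits of the model hash.
        "version": f"{built_at:%Y%m%dT%H%M%S}-{model_hash[:8]}",
        "model_sha256": model_hash,
        "training_data_sha256": hashlib.sha256(training_data_bytes).hexdigest(),
        "hyperparameters": hyperparameters,
        "built_at": built_at.isoformat(),
    }


example = build_artifact_metadata(
    b"serialized model weights",
    b"training rows",
    {"learning_rate": 0.1, "max_depth": 3},
)
# The metadata is plain JSON, so it can be stored next to the artifact.
print(json.dumps(example, indent=2))
```

Storing this record alongside the model file lets a later pipeline stage deploy or roll back to an exact build without retraining.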
#5: Continuous delivery (CD) pipeline for automated deployment
Challenge
● Deployments are scary
● Manual deployments == potential for mistakes
Solution
● Automated deployments triggered by pipeline
● Single-command deployment to staging/production
● Eliminate manual deployments
Benefits
● More rehearsal == More confidence
● Disaster recovery: (single-command) deployment of last good model in production
Pipeline stages: dev → VCS → unit tests → train & test → version artifact → deploy-staging
#5: CD pipeline for automated deployment (Demo)
# Deploy model (the actual model)
gcloud beta ml-engine versions create \
  "$VERSION_NAME" --model "$MODEL_NAME" \
  --origin "$DEPLOYMENT_SOURCE" \
  --runtime-version=1.5 \
  --framework "$FRAMEWORK" \
  --python-version=3.5
#5: CD pipeline for automated deployment (Demo)
# Deploy to prod
gcloud ml-engine versions set-default \
  "$version_to_deploy_to_prod" \
  --model="$MODEL_NAME"
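The "disaster recovery" benefit can be scripted on top of the same `set-default` command. This is a minimal sketch, assuming a deployment history is available as a list of records with a health flag; the record layout, the function names and the sample values are hypothetical. It only builds the command and leaves running it to the caller.

```python
def last_good_version(history):
    """Return the most recent version marked healthy, or None if there is none.

    `history` is ordered oldest to newest; each item is a dict with
    'version' and 'healthy' keys (an assumed layout for this sketch).
    """
    for record in reversed(history):
        if record["healthy"]:
            return record["version"]
    return None


def rollback_command(history, model_name):
    """Build the argv that makes the last good version the default again."""
    version = last_good_version(history)
    if version is None:
        raise LookupError("no healthy version to roll back to")
    # Same command and flag as the 'Deploy to prod' step in the slides.
    return ["gcloud", "ml-engine", "versions", "set-default", version, f"--model={model_name}"]


history = [
    {"version": "v1", "healthy": True},
    {"version": "v2", "healthy": True},
    {"version": "v3", "healthy": False},
]
print(rollback_command(history, "my_model"))
```

The returned list can be passed to `subprocess.run(argv, check=True)`, which keeps the rollback a single command.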
#6: Canary releases + monitoring
Challenge
● How can I know if I’m deploying a better / worse model?
● Deployment to production may not work as expected
Solution
● Request shadowing pattern (credit: @codingnirvana)
Benefits
● Confidence increases along the pipeline, backed by metrics
● Monitoring in production == Important source of feedback
Pipeline stages: dev → VCS → unit tests → train & test → version artifact → deploy-staging → deploy-canary-prod
#6: Canary releases + monitoring (Demo)
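A sketch of request shadowing as the pattern is commonly described: every request is answered by the live model, while the candidate model sees the same input and its output is only recorded for comparison. The class and method names are hypothetical, and this is not the demo's actual implementation.

```python
class ShadowingPredictor:
    """Serve the live model; run the candidate in the shadow and log both outputs."""

    def __init__(self, live_model, candidate_model):
        self.live_model = live_model
        self.candidate_model = candidate_model
        self.comparisons = []  # (input, live_output, candidate_output_or_error)

    def predict(self, features):
        live_output = self.live_model(features)
        try:
            candidate_output = self.candidate_model(features)
        except Exception as error:
            # A broken candidate must never affect the response to the user.
            candidate_output = error
        self.comparisons.append((features, live_output, candidate_output))
        return live_output

    def agreement_rate(self):
        """Share of requests where both models returned the same output."""
        if not self.comparisons:
            return None
        same = sum(1 for _, live, cand in self.comparisons if live == cand)
        return same / len(self.comparisons)


# Two stand-in models that disagree only on the input 0.
predictor = ShadowingPredictor(lambda x: x > 0, lambda x: x >= 0)
responses = [predictor.predict(x) for x in (-1, 0, 1, 2)]
print(responses, predictor.agreement_rate())
```

The recorded comparisons are the kind of metric that can back the decision to promote the candidate, since users only ever see the live model's answers.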
#7: Start simple (tracer bullet)
Challenge
● Complex models == longer time to develop / debug
● Getting all the “right” features == weeks / months
Solution
● Start with simple model + simple features
● Create solid pipeline first
● But, not simpler than what is required (and, don’t take expensive shortcuts)
Benefits
● Discover integration issues/requirements sooner
● Demonstrate working software to stakeholders in less time
Pipeline stages: dev
#7: Start simple (tracer bullet) (Demo)
Pipeline stages: dev → run-unit-tests → train-and-evaluate-model → deploy
#8: Collect more and better data with every release
Challenge
● Data collection is hard
● Garbage in, garbage out
Solution
● Think about how you can collect labels (immediately or eventually) after serving predictions (credit: @mat_kelcey)
● Create bug reports for clients
● Complete the data pipeline cycle
● Caution: watch for attempts to game your ML system
Benefits
● More and better data. Nuff said.
Pipeline stages: dev → VCS → unit tests → train & test → version artifact → deploy-staging → deploy-canary-prod → deploy-prod
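Collecting labels "immediately or eventually" after serving predictions needs a way to match a later outcome to the prediction it belongs to. A minimal in-memory sketch, assuming each served prediction gets an ID that the client reports back with the true outcome; the class, method and field names are hypothetical.

```python
import uuid


class PredictionLog:
    """Record served predictions and attach labels when they arrive later."""

    def __init__(self):
        self.records = {}

    def log_prediction(self, features, prediction):
        """Store a served prediction and return an ID the client can report back."""
        prediction_id = str(uuid.uuid4())
        self.records[prediction_id] = {
            "features": features,
            "prediction": prediction,
            "label": None,
        }
        return prediction_id

    def add_label(self, prediction_id, label):
        """Attach the true outcome once it is known."""
        self.records[prediction_id]["label"] = label

    def training_examples(self):
        """Return (features, label) pairs for every prediction that has a label."""
        return [
            (record["features"], record["label"])
            for record in self.records.values()
            if record["label"] is not None
        ]
```

Feeding `training_examples()` back into the next training run is one way to close the data pipeline cycle; a production system would persist the records instead of keeping them in memory.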
#9: Build cross-functional teams
Challenge
● How can we do all of the above?
Solution
● Build cross-functional teams (data scientist, data engineer, software engineer, UX, BA)
Benefits
● Fewer nails (because not everyone is a hammer)
● Improve empathy + reduce silos == productivity
Pipeline stages: dev → VCS → unit tests → train & test → version artifact → deploy-staging → deploy-canary-prod → deploy-prod
#10: Kaizen mindset
Challenge
● How can we do all of the above?
Solution
● Kaizen == 改善 == change for better
● Go through deployment health checklists as a team
Benefits
● Iteratively get to good
Pipeline stages: dev → VCS → unit tests → train & test → version artifact → deploy-staging → deploy-canary-prod → deploy-prod
#10: Kaizen - Health checklists
❏ General software engineering practices
  ❏ Source control (e.g. git)
  ❏ Unit tests
  ❏ CI pipeline to run automated tests
  ❏ Automated deployments
❏ Data / feature-related tests
  ❏ Test all code that creates input features, both in training and serving
  ❏ ...
❏ Model-related tests
  ❏ Test against a simpler model as a baseline
  ❏ ...
Source: A rubric for ML production systems (Google, 2016)
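The "test against a simpler model as a baseline" item can be automated as a check that the candidate beats a trivial model on held-out data. A minimal sketch, assuming a majority-class baseline and accuracy as the metric; both are illustrative choices, and all names and sample data are hypothetical.

```python
from collections import Counter


def majority_class_baseline(training_labels):
    """Return a 'model' that always predicts the most common training label."""
    most_common_label = Counter(training_labels).most_common(1)[0][0]
    return lambda features: most_common_label


def accuracy(model, inputs, labels):
    """Fraction of inputs for which the model's prediction equals the label."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(labels)


def beats_baseline(candidate, training_labels, test_inputs, test_labels, margin=0.0):
    """True if the candidate's accuracy exceeds the baseline's by more than `margin`."""
    baseline = majority_class_baseline(training_labels)
    candidate_score = accuracy(candidate, test_inputs, test_labels)
    baseline_score = accuracy(baseline, test_inputs, test_labels)
    return candidate_score > baseline_score + margin


# Toy data: the candidate flags messages longer than 3 units as spam.
train_labels = ["spam", "ham", "ham"]
test_inputs = [5, 1, 7, 2]
test_labels = ["spam", "ham", "spam", "ham"]
candidate = lambda length: "spam" if length > 3 else "ham"
print(beats_baseline(candidate, train_labels, test_inputs, test_labels))
```

A candidate that cannot clear this bar is a signal to look at the data or features before investing in a more complex model.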
#10: Kaizen - Health checks
● How much calendar time to deploy a model from staging to production?
● How much calendar time to add a new feature to the production model?
● How comfortable does your team feel about iteratively deploying
models?
Conclusion
A generalizable approach for deploying ML models frequently and safely
[Diagram: Local env → push → Version control → trigger → Run unit tests → Train and evaluate model → Deploy candidate model to STAGING → manual trigger → Deploy model to PROD, with feedback along the way. Supporting stores: Data / feature repository, Model repository]
Credit: Continuous Delivery (Jez Humble, Dave Farley)
Solve the right problem
We don’t have a machine learning problem.
We have a {business, data, software delivery, ML, UX}
problem
Solve the right problem
01 Data collection
02 Machine learning
03 Deployment and monitoring (focus of today’s talk)
How to deploy models to prod {frequently, safely, repeatably, reliably}?
1. Automate configuration management
2. Think about your test pyramid
3. Set up a continuous integration (CI) pipeline
4. Version your artifacts (i.e. models)
5. Automate deployments
6. Try canary releases
7. Start simple (tracer bullet)
8. Collect more and better data with every release
9. Build cross-functional teams
10. Kaizen / continuous improvement
THANK YOU
We’re hiring!
● Software Developers (>= junior-level devs welcome)
● UX Designer
● Senior Information Security Consultant
Resources for further reading
● Visibility and monitoring for machine learning (12-min video)
● Using continuous delivery with machine learning models to tackle fraud
● What’s your ML Test Score? A rubric for ML production systems (Google)
● Rules of Machine Learning (Google)
● Continuous Delivery (Jez Humble, Dave Farley)
● Why you need to improve your training data and how to do it

 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 

Deploying ML models to production (frequently and safely) - PYCON 2018

  • 1. How to deploy machine learning models to production (frequently and safely)
  • 4. 1. First, a story about all of us...
  • 6. Temperature check: who has...
      ● trained an ML model before?
      ● deployed an ML model for fun?
      ● deployed an ML model at work?
      ● an automated deployment pipeline for ML models?
  • 7. The million-dollar question: How can we reliably and repeatably take our models from our laptop to production?
  • 8. What today’s talk is about
      ● Share principles and practices that can make it easier for teams to iteratively deploy better ML products
      ● Share what to strive towards, and how to strive towards it
  • 9. Standing on the shoulders of giants
      ● @jezhumble
      ● @davefarley77
      ● @mat_kelcey
      ● @codingnirvana
      ● @kief
  • 10. The stack for today’s demo
  • 13. Why deploy? Until the model is in production, it creates value for no one except ourselves.
  • 14. Why deploy frequently?
      ● Iteratively improve our model (training with new {data, hyperparameters, features})
      ● Correct any biases
      ● Counter model decay
      ● If it’s hard, do it more often
  • 15. Why deploy safely? One of these things is not like the others
  • 16. Why deploy safely?
      ● ML models affect decisions that impact lives… in real-time
      ● Hippocratic oath for us: Do no harm.
      ● Safety enables us to iteratively improve ML products that better serve people
  • 17. Machine learning is only one part of the problem/solution
      ● Finding the right business problem to solve
      ● Collecting data / data engineering
      ● Training ML models
      ● Deploying and monitoring ML models (focus of this talk)
      Source: Hidden Technical Debt in Machine Learning Systems (Google, 2015)
  • 18. Goal of today’s talk
      [Diagram labels: :-( Notebook / playground, PROD (maybe); :-) Continuous Delivery: Experiment / Develop, commit and push, Test, Deploy, Monitor]
  • 19. 4. So, how do we get there? Challenges (and solutions from Continuous Delivery practices)
  • 20. Our story’s main characters: Mario the data scientist, Luigi the engineer
      [Diagram labels: local, PROD]
  • 21. Key concept: CI/CD Pipeline
      [Diagram stages: Run unit tests, Train and evaluate model, Deploy candidate model to STAGING, Deploy model to PROD. Other labels: Local env, push, Version control, trigger, manual trigger, feedback, Data / feature repository, Model repository]
      Source: Continuous Delivery (Jez Humble, Dave Farley)
  • 22. #1: Automated configuration management
      Challenge
      ● Snowflake (dev) environments
      ● “Works on my machine!”
      Solution
      ● Single-command setup
      ● Version control all dependencies and configuration
      Benefits
      ● Enable experimentation by all teammates
      ● Production-like environment == discover potential deployment issues early on
      [Pipeline diagram labels: local, dev, PROD]
  • 23. #1: Automated environment configuration management (Demo)
  • 24. #2: Test pyramid
      Challenge
      ● How can I ensure my changes haven’t broken anything?
      ● How can I enforce the “goodness” of our models?
      Solution
      ● Testing strategy
      ● Test every method
      Benefits
      ● Fast feedback
      ● Safety harness allows team to boldly try new things / refactor
      [Test pyramid labels: Unit tests, narrow/broad integration tests, ML metrics tests, Manual tests, Automated. Pipeline diagram labels: local, dev, PROD]
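The "ML metrics tests" layer of the pyramid can be sketched as an ordinary automated test that fails when a model's metric drops below an agreed threshold. This is a minimal sketch and not the demo's actual test suite: the `accuracy` helper, the toy `predict` function, the hold-out data, and the 0.8 threshold are all hypothetical.

```python
# Hypothetical ML metrics test: fail the build if accuracy drops below a threshold.
# The model, data, and threshold are illustrative assumptions, not from the talk's demo.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def predict(texts):
    """Toy stand-in for a trained sentiment model: 1 if 'good' appears, else 0."""
    return [1 if "good" in text.lower() else 0 for text in texts]


HOLDOUT_TEXTS = ["Good service", "bad food", "really good", "terrible", "not great"]
HOLDOUT_LABELS = [1, 0, 1, 0, 0]
MIN_ACCURACY = 0.8  # assumed threshold agreed on by the team


def test_model_meets_minimum_accuracy():
    score = accuracy(HOLDOUT_LABELS, predict(HOLDOUT_TEXTS))
    assert score >= MIN_ACCURACY, f"accuracy {score:.2f} is below {MIN_ACCURACY}"
```

A test runner such as pytest would pick up `test_model_meets_minimum_accuracy` alongside the unit tests, so a model that regresses fails the same pipeline stage as a code bug would.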
  • 26. #3: Continuous integration (CI) pipeline for automated testing
      Challenge
      ● Not everyone may run tests. “Goodness” checks are done manually.
      ● We could deploy {bugs, errors, bad models} to production
      Solution
      ● CI/CD pipeline: automates unit tests → train → test → deploy (to staging)
      ● Every code change is tested (assuming tests exist)
      ● Source code as the only source of software/models
      Benefits
      ● Fast feedback
      [Pipeline diagram labels: local, dev, VCS, unit tests, train & test, PROD]
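The "unit tests → train → test → deploy (to staging)" flow can be sketched as a fail-fast sequence of stages, where a later stage runs only if every earlier one passed. This is an illustrative sketch under assumptions: `run_pipeline` and the placeholder stage functions are hypothetical, and a real CI server would provide this orchestration itself.

```python
# Hypothetical sketch of a fail-fast pipeline: each stage runs only if the previous one passed.

def run_pipeline(stages):
    """Run (name, callable) stages in order; stop at the first one that returns False.

    Returns the list of stage names that completed successfully.
    """
    completed = []
    for name, stage in stages:
        if not stage():
            print(f"stage '{name}' failed; stopping pipeline")
            break
        completed.append(name)
    return completed


# Placeholder stages; a real pipeline would call the test runner,
# the training job, and the deployment script instead.
stages = [
    ("unit tests", lambda: True),
    ("train", lambda: True),
    ("test", lambda: False),  # e.g. the model metric is below the threshold
    ("deploy-staging", lambda: True),
]

completed = run_pipeline(stages)
```

Because the "test" stage fails in this example, "deploy-staging" never runs, which is the property that keeps bad models out of later environments.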
  • 28. #4: Artifact versioning
      Challenge
      ● How can we revert to previous models?
      ● Retraining == time-consuming
      ● Manual renaming/redeployments of old models (if we still have them)
      Solution
      ● Build your binaries once
      ● Each artifact is tagged with metadata (training data, hyperparameters, datetime)
      Benefits
      ● Save on build times
      ● Confidence in artifact increases down the pipeline
      ● Metadata enables reproducibility
      [Pipeline diagram labels: local, dev, VCS, unit tests, train & test, version artifact, PROD]
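Tagging each artifact with metadata (training data, hyperparameters, datetime) can be sketched with the standard library alone. The function names, the metadata keys, and the naming scheme below are assumptions for illustration, not the format used in the talk's demo.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_artifact_metadata(training_data: bytes, hyperparameters: dict) -> dict:
    """Describe a model artifact so it can be traced and reproduced later."""
    return {
        # A content hash identifies exactly which training data was used.
        "training_data_sha256": hashlib.sha256(training_data).hexdigest(),
        "hyperparameters": hyperparameters,
        "trained_at": datetime.now(timezone.utc).isoformat(),
    }


def artifact_name(model_name: str, metadata: dict) -> str:
    """Derive a version tag from the metadata instead of renaming files by hand."""
    digest = hashlib.sha256(
        json.dumps(metadata, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return f"{model_name}-{digest[:12]}"


# Hypothetical training data and hyperparameters.
metadata = build_artifact_metadata(b"text,label\ngreat,1\nawful,0\n", {"max_depth": 3})
name = artifact_name("sentiment", metadata)
```

Storing the metadata next to the model file means a previous version can be found and redeployed without retraining it.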
  • 29. #5: Continuous delivery (CD) pipeline for automated deployment
      Challenge
      ● Deployments are scary
      ● Manual deployments == potential for mistakes
      Solution
      ● Automated deployments triggered by pipeline
      ● Single-command deployment to staging/production
      ● Eliminate manual deployments
      Benefits
      ● More rehearsal == More confidence
      ● Disaster recovery: (single-command) deployment of last good model in production
      [Pipeline diagram labels: local, dev, VCS, unit tests, train & test, version artifact, deploy-staging, PROD]
  • 30. #5: CD pipeline for automated deployment (Demo)
      # Deploy model (the actual model)
      gcloud beta ml-engine versions create $VERSION_NAME --model $MODEL_NAME --origin $DEPLOYMENT_SOURCE --runtime-version=1.5 --framework $FRAMEWORK --python-version=3.5
  • 31. #5: CD pipeline for automated deployment (Demo)
      # Deploy to prod
      gcloud ml-engine versions set-default $version_to_deploy_to_prod --model=$MODEL_NAME
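The "disaster recovery" benefit can be sketched by reusing the `set-default` command from the slide above: pick the most recent version known to be healthy and build the command that makes it the default again. The `history` structure, the health flags, and both helper functions are hypothetical; only the `gcloud ml-engine versions set-default` invocation comes from the slide.

```python
def set_default_command(model_name, version):
    """Build the argv for the set-default command shown in the demo slide."""
    return [
        "gcloud", "ml-engine", "versions", "set-default",
        version, f"--model={model_name}",
    ]


def last_good_version(history):
    """Return the most recent version marked healthy, or None if there is none.

    `history` is ordered oldest to newest; each entry is a (version, healthy) pair.
    """
    for version, healthy in reversed(history):
        if healthy:
            return version
    return None


# Hypothetical deployment history: v3 is the bad release we want to roll back from.
history = [("v1", True), ("v2", True), ("v3", False)]
rollback = set_default_command("sentiment", last_good_version(history))
# To execute it: subprocess.run(rollback, check=True)
```

The sketch only builds the command; a real rollback script would also handle the case where no healthy version exists before calling `gcloud`.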
  • 32. #6: Canary releases + monitoring
      Challenge
      ● How can I know if I’m deploying a better / worse model?
      ● Deployment to production may not work as expected
      Solution
      ● Request shadowing pattern (credit: @codingnirvana)
      Benefits
      ● Confidence increases along the pipeline, backed by metrics
      ● Monitoring in production == Important source of feedback
      [Pipeline diagram labels: local, dev, VCS, unit tests, train & test, version artifact, deploy-staging, deploy-canary-prod, PROD]
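One common reading of the request shadowing pattern is that every request is answered by the production model while a copy is also sent to the candidate model, whose output is only logged for comparison. The sketch below follows that reading; it is not the talk's implementation, and `serve_with_shadow`, the log format, and the toy models are hypothetical.

```python
def serve_with_shadow(request, production_model, candidate_model, shadow_log):
    """Answer with the production model; record the candidate's output for comparison."""
    production_prediction = production_model(request)
    try:
        candidate_prediction = candidate_model(request)
    except Exception as error:  # a broken candidate must never affect the user
        shadow_log.append({"request": request, "error": repr(error)})
    else:
        shadow_log.append({
            "request": request,
            "production": production_prediction,
            "candidate": candidate_prediction,
            "agree": production_prediction == candidate_prediction,
        })
    return production_prediction


# Toy models: flag a transaction amount as suspicious above different cut-offs.
shadow_log = []
answer = serve_with_shadow(
    5, lambda amount: amount > 3, lambda amount: amount > 10, shadow_log
)
```

Monitoring the share of entries where `agree` is false gives an early signal about how the candidate would behave before any user sees its predictions.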
  • 33. #6: Canary releases + monitoring (Demo)
      [Diagram label: ML App]
  • 34. #7: Start simple (tracer bullet)
      Challenge
      ● Complex models == longer time to develop / debug
      ● Getting all the “right” features == weeks / months
      Solution
      ● Start with simple model + simple features
      ● Create solid pipeline first
      ● But, not simpler than what is required (and, don’t take expensive shortcuts)
      Benefits
      ● Discover integration issues/requirements sooner
      ● Demonstrate working software to stakeholders in less time
      [Pipeline diagram labels: local, dev, PROD]
  • 35. #7: Start simple (tracer bullet) (Demo)
      [Pipeline diagram labels: dev, run-unit-tests, train-and-evaluate-model, deploy]
  • 36. #8: Collect more and better data with every release
      Challenge
      ● Data collection is hard
      ● Garbage in, garbage out
      Solution
      ● Think about how you can collect labels (immediately or eventually) after serving predictions (credit: @mat_kelcey)
      ● Create bug reports for clients
      ● Complete the data pipeline cycle
      ● Caution: attempts to game your ML system
      Benefits
      ● More and better data. Nuff said.
      [Pipeline diagram labels: local, dev, VCS, unit tests, train & test, version artifact, deploy-staging, deploy-canary-prod, deploy-prod, PROD]
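Collecting labels "immediately or eventually" after serving predictions can be sketched as a join between a prediction log and label events that arrive later. The record fields (`request_id`, `features`, `predicted`, `label`) and the function name are assumptions for illustration.

```python
def build_training_examples(prediction_log, label_events):
    """Pair logged predictions with labels that arrive later, matched by request id.

    Predictions without a label yet are skipped; they can be joined on a later run.
    """
    labels_by_id = {event["request_id"]: event["label"] for event in label_events}
    examples = []
    for record in prediction_log:
        if record["request_id"] not in labels_by_id:
            continue
        examples.append({
            "features": record["features"],
            "predicted": record["predicted"],
            "label": labels_by_id[record["request_id"]],
        })
    return examples


# Hypothetical logs: request "b" has not received a label yet.
prediction_log = [
    {"request_id": "a", "features": [0.2, 1.0], "predicted": 1},
    {"request_id": "b", "features": [0.9, 0.0], "predicted": 0},
]
label_events = [{"request_id": "a", "label": 0}]
examples = build_training_examples(prediction_log, label_events)
```

The joined examples feed the next training run, which is what closes the data pipeline cycle.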
  • 37. #9: Build cross-functional teams
      Challenge
      ● How can we do all of the above?
      Solution
      ● Build cross-functional teams (data scientist, data engineer, software engineer, UX, BA)
      Benefits
      ● Fewer nails (because not everyone is a hammer)
      ● Improve empathy + reduce silos == productivity
      [Pipeline diagram labels: local, dev, VCS, unit tests, train & test, version artifact, deploy-staging, deploy-canary-prod, deploy-prod, PROD]
  • 38. #10: Kaizen mindset
      Challenge
      ● How can we do all of the above?
      Solution
      ● Kaizen == 改善 == change for better
      ● Go through deployment health checklists as a team
      Benefits
      ● Iteratively get to good
      [Pipeline diagram labels: local, dev, VCS, unit tests, train & test, version artifact, deploy-staging, deploy-canary-prod, deploy-prod, PROD]
  • 39. #10: Kaizen - Health checklists
      ❏ General software engineering practices
          ❏ Source control (e.g. git)
          ❏ Unit tests
          ❏ CI pipeline to run automated tests
          ❏ Automated deployments
      ❏ Data / feature-related tests
          ❏ Test all code that creates input features, both in training and serving
          ❏ ...
      ❏ Model-related tests
          ❏ Test against a simpler model as a baseline
          ❏ ...
      Source: A rubric for ML production systems (Google, 2016)
  • 40. #10: Kaizen - Health checks
      ● How much calendar time to deploy a model from staging to production?
      ● How much calendar time to add a new feature to the production model?
      ● How comfortable does your team feel about iteratively deploying models?
  • 43. A generalizable approach for deploying ML models frequently and safely
      [Diagram stages: Run unit tests, Train and evaluate model, Deploy candidate model to STAGING, Deploy model to PROD. Other labels: Local env, push, Version control, trigger, manual trigger, feedback, Data / feature repository, Model repository]
      Credit: Continuous Delivery (Jez Humble, Dave Farley)
  • 44. Solve the right problem: We don’t have a machine learning problem. We have a {business, data, software delivery, ML, UX} problem.
  • 45. Solve the right problem
      01 Data collection
      02 Machine learning
      03 Deployment and monitoring (focus of today’s talk)
  • 46. How to deploy models to prod {frequently, safely, repeatably, reliably}?
      1. Automate configuration management
      2. Think about your test pyramid
      3. Set up a continuous integration (CI) pipeline
      4. Version your artifacts (i.e. models)
      5. Automated deployment
      6. Try canary releases
      7. Start simple (tracer bullet)
      8. Collect more and better data with every release
      9. Build cross-functional teams
      10. Kaizen / continuous improvement
  • 48. We’re hiring!
      ● Software Developers (>= junior-level devs welcome)
      ● UX Designer
      ● Senior Information Security Consultant
  • 49. Resources for further reading
      ● Visibility and monitoring for machine learning (12-min video)
      ● Using continuous delivery with machine learning models to tackle fraud
      ● What’s your ML Test Score? A rubric for ML production systems (Google)
      ● Rules of Machine Learning (Google)
      ● Continuous Delivery (Jez Humble, Dave Farley)
      ● Why you need to improve your training data and how to do it

Editor's Notes

  1. I’m David and here’s Ramsey, and we’re going to share about how you can deploy ML models to production frequently and safely. Note to self: “A talk is more about telling a story around a topic, changing people's perspective, inspiring them to try something else and giving them the tools for that.” Empathize with the audience. Don’t preach.
  2. Note: use “we”, rather than “you”. Got an idea (e.g. NLP sentiment analysis). Followed an ML tutorial. Built a model. Asked to deploy. (click) “You want me to... what?” Bombarded with questions: How do I deploy? How do I load new data? How do I call .predict() without hitting shift+enter? How do I vectorize user input strings before passing them to the model? We’re stumped. We don’t know where to start. We give up.
  3. Before we go on, we want to take a quick temperature check
  4. Bear this question in mind throughout the talk
  5. Most of these are not ideas that Ramsey and I thought of. They are practices that these smart folks have thought of, and that have been tried and tested at our clients.
  6. We built a sample app: what it does, why we chose this stack / data source, how you can use it. To make this tangible, we’ve had to pick a stack. But focus on the patterns, and not our implementation.
  7. We built a demo so that we can have code to illustrate some points, but we ran out of time. So for the last few points, we'll talk about concepts and how we would implement them.
  8. Just read the title. Don’t talk too much here.
  9. Use fraud detection as an example. Share about tracer bullet idea here
  10. In other programming languages / frameworks, when we build something, we can share a link on Twitter and the rest of the world can use it. In ML, in my experience, people just share screenshots of the loss curve (insert picture) or some object detection bounding boxes (insert pictures). This is the problem facing many of us today. We have tons of ML tutorials for local environments / Jupyter notebooks, but very few / none about serving those models or the continuous delivery/evolution of these models. Until something is in production, it creates value for no one except ourselves.
  11. Model decay (our model can get stale / dangerous). Deploying frequently allows us to make iterative improvements to our model (training with new {data, hyperparameters, features}).
  12. Cars, phones, IKEA chairs go through multiple rounds of testing. Why should ML models be any different? The irony is that ML has already started to impact all of our lives, but testing and safety are things that we rarely talk about in ML.
  13. ML models affect decisions that impact lives… in real-time. Safety is essential.
  14. Goal of today’s talk (in pictures)
  15. “OK, David - I’m sold on why this frequent and safe deployment thing is important. But what does it look like in practice?”
  16. CI/CD pipeline: the main vehicle for everything we’re sharing today. It’s all about feedback. 30 seconds: quick overview of this. The model goes through different stages. Each of them solves a different problem, which we’ll talk about next. Generalizable approach: we can see it working for classifiers, regression models, deep learning models, NLP models, etc.
  17. Snowflake: every dataset is unique, non-reproducible, hand-cleaned with TLC.
  18. Challenge: brittle glue code in ML. Unit tests: at lower levels, check edge cases and add more tests for all of that; at higher levels, check the happy path and integration.
  19. Skip if people get CI pipeline
  20. Deployment: provisioning, configuration, deploying your app.
  21. Tracer bullet: deploying a simple thing is easier than a complex thing. Focus on deploying first. Focus on the deployment pipeline. Don’t get distracted. We can come back to tuning models later.
  22. Benefits: Monitoring === important source of feedback. Find out when models are getting stale / dangerous. LIME - Local Interpretable Model-Agnostic Explanations. Caveat: monitoring ML metrics can be challenging because labels take time to come.
  23. Training-serving skew: the data seen at serving time differs in some way from the data used to train the model, leading to reduced prediction quality.
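One simple way to look for training-serving skew is to compare a summary statistic of each feature between the training data and recent serving traffic. The sketch below measures how far the serving mean has moved, in units of the training standard deviation. The function names, the threshold of 2.0, and the sample data are assumptions; real monitoring would usually compare full distributions rather than means alone.

```python
from statistics import mean, stdev


def skew_score(training_values, serving_values):
    """Shift of the serving mean from the training mean, in training standard deviations."""
    spread = stdev(training_values)
    if spread == 0:
        raise ValueError("training feature is constant; cannot scale the shift")
    return abs(mean(serving_values) - mean(training_values)) / spread


def features_with_skew(training, serving, threshold=2.0):
    """Return names of features whose serving mean drifted past the threshold."""
    return [
        name for name in training
        if name in serving and skew_score(training[name], serving[name]) > threshold
    ]


# Hypothetical data: "amount" is served in different units than it was trained on.
training = {"age": [30, 35, 40, 45, 50], "amount": [10.0, 12.0, 11.0, 9.0, 13.0]}
serving = {"age": [31, 36, 41, 44, 49], "amount": [100.0, 120.0, 110.0, 90.0, 130.0]}
skewed = features_with_skew(training, serving)
```

Here only "amount" is flagged, the kind of unit mismatch between training and serving code that the checklist item on testing feature code in both places is meant to catch.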
  24. Talk about just the first bullet
  25. Pyception (Anaconda 2018 video) - a battle between data scientists and software engineers
  26. Generalizable approach: we can see it working for classifiers, regression models, deep learning models, NLP models, etc.
  27. TODO: add bitly link here