Continuous Intelligence
Moving Machine Learning Applications into Production Reliably
Christoph Windheuser
Danilo Sato
Emily Gorcenski
Arif Wider
ThoughtWorks Inc.
WORKSHOP ON WHY AND HOW TO APPLY CONTINUOUS DELIVERY TO MACHINE LEARNING (CD4ML)
©ThoughtWorks 2019
Strata Data Conference London, April 30, 2019
Structure of Today’s Workshop
- INTRODUCTION TO THE TOPIC
- EXERCISE 1: SETUP
- EXERCISE 2: DEPLOYMENT PIPELINE
- BREAK
- EXERCISE 3: ML PIPELINE
- EXERCISE 4: TRACKING EXPERIMENTS
- EXERCISE 5: MODEL MONITORING
5000+ technologists with 40 offices in 14 countries
Partner for technology driven business transformation
join.thoughtworks.com
#1 in Agile and Continuous Delivery
100+ books written
TECHNIQUES
#8 Continuous delivery for machine learning (CD4ML) models
ASSESS
CONTINUOUS INTELLIGENCE CYCLE
PRODUCTIONIZING ML IS HARD
CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible.
But machine learning systems have unique challenges: unlike deterministic software, data-driven intelligent systems behave in ways that are difficult, or even impossible, to fully understand. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles.
Production systems should be:
● Reproducible
● Testable
● Auditable
● Continuously Improving
Machine Learning is:
● Non-deterministic
● Hard to test
● Hard to explain
● Hard to improve
HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO
INTELLIGENT SYSTEMS?
MANY SOURCES OF CHANGE
Data + Model + Code
Data: Schema; Sampling over Time; Volume; ...
Model: Research, Experiments; Training on New Data; Performance; ...
Code: New Features; Bug Fixes; Dependencies; ...
Icons created by Noura Mbarki and I Putu Kharismayadi from Noun Project
“Continuous Delivery is the ability to get changes of
all types — including new features, configuration
changes, bug fixes and experiments — into
production, or into the hands of users, safely and
quickly in a sustainable way.”
- Jez Humble & Dave Farley
PRINCIPLES OF CONTINUOUS DELIVERY
→ Create a Repeatable, Reliable Process for Releasing Software
→ Automate Almost Everything
→ Build Quality In
→ Work in Small Batches
→ Keep Everything in Source Control
→ Done Means “Released”
→ Improve Continuously
WHAT DO WE NEED IN OUR STACK?
Doing CD with Machine Learning is still a hard problem
● MODEL PERFORMANCE ASSESSMENT TRACKING
● VERSION CONTROL AND ARTIFACT REPOSITORIES
● MODEL MONITORING AND OBSERVABILITY
● DISCOVERABLE AND ACCESSIBLE DATA
● CONTINUOUS DELIVERY ORCHESTRATION TO COMBINE PIPELINES
● INFRASTRUCTURE FOR MULTIPLE ENVIRONMENTS AND EXPERIMENTS
PUTTING EVERYTHING TOGETHER
[Diagram, built up stage by stage over several slides]
Stages: Data Science, Model Building → Model Evaluation → Productionize Model → Integration Testing → Deployment → Monitoring
Data: Training Data; Test Data; Production Data
Artifacts: Source Code + Executables; Model + parameters
Foundation: CD Tools and Repositories; Discoverable and Accessible Data
WHAT WE WILL USE IN THIS WORKSHOP
There are many options for tools and technologies to implement CD4ML
THE MACHINE LEARNING PROBLEM WE ARE EXPLORING TODAY
A REAL BUSINESS PROBLEM
RETAIL / SUPPLY CHAIN
Loss of sales, opportunity cost, stock waste, discounting
REQUIRES
Accurate Demand Forecasting
TYPICAL CHALLENGES
→ Predictions Are Inaccurate
→ Development Takes A Long Time
→ Difficult To Adapt To Market Change Pace
SALES FORECASTING FOR GROCERY RETAILER
TASK: Predict how many of each product will be purchased in each store on a given date
Make predictions based on data from:
● 4,000 items
● 50 stores
● 125,000,000 sales transactions
● 4.5 years of data
THE SIMPLIFIED WEB APPLICATION
As a buyer, I want to be able to choose a product and
predict how many units the product will sell at a future date.
EXERCISE 1: SETUP
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions → 1-setup.md
● Follow the steps to set up your local development environment
User ID assignment sheet: http://bit.ly/cd4ml-strata19
DEPLOYMENT PIPELINE
Automates the process of building, testing, and deploying applications to production
→ Application code in version control repository
→ Container image as deployment artifact
→ Deploy container to production servers
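A deployment pipeline of this shape can be read as an ordered list of stages in which a failure stops everything downstream. The sketch below is a minimal illustration in plain Python, not GoCD configuration; the stage names and the `run_pipeline` helper are hypothetical, and the stage actions are placeholders for a real test runner, container build and deploy tool.

```python
from typing import Callable, List, Tuple

# A stage is a name plus an action that reports success (True) or failure (False).
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, action in stages:
        if not action():
            break  # a failed stage blocks every later stage
        completed.append(name)
    return completed


# Hypothetical placeholder actions; each one always succeeds here.
stages = [
    ("test", lambda: True),
    ("build-image", lambda: True),
    ("deploy", lambda: True),
]

completed = run_pipeline(stages)
print(completed)
```

If any stage returns False, nothing after it runs, which is what keeps a broken build from reaching production.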
GoCD: An Open Source Continuous Delivery server to model and visualise complex workflows
ANATOMY OF A GOCD PIPELINE
[Diagram: Pipeline Group]
EXERCISE 2:
DEPLOYMENT PIPELINE
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions → 2-deployment-pipeline.md
● Follow the steps to set up your deployment pipeline
● GoCD URL: https://gocd.cd4ml.net
BUT NOW WHAT?
Once your model is in production...
● How do we retrain the model more often?
● How to deploy the retrained model to production?
● How to make sure we don’t break anything when deploying?
● How to make sure that our modeling approach or parameterization is still the best fit for the data?
● How to monitor our model “in the wild”?
BASIC DATA SCIENCE WORKFLOW
→ Gather data and extract features
→ Separate into training and validation sets
→ Train model and evaluate performance
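The three steps above can be sketched end to end on synthetic data. This is an illustration only: the data is generated in the script, the "model" is a per-weekday average rather than the decision tree used in the workshop, and the chronological split and every function name are assumptions made for the example.

```python
from statistics import mean

# Step 1: gather data and extract features.
# Synthetic daily sales: the feature is the weekday, the target is units sold.
# Sales grow by one unit per week so the validation weeks differ from training.
rows = [(day % 7, 10 + 2 * (day % 7) + day // 7) for day in range(56)]


# Step 2: separate into training and validation sets.
def split(rows, validation_fraction=0.25):
    """Chronological split: the most recent rows become the validation set."""
    cut = int(len(rows) * (1 - validation_fraction))
    return rows[:cut], rows[cut:]


# Step 3: train a model and evaluate its performance.
def train(rows):
    """Baseline model: average units sold per weekday."""
    by_weekday = {}
    for weekday, units in rows:
        by_weekday.setdefault(weekday, []).append(units)
    return {
        "by_weekday": {w: mean(u) for w, u in by_weekday.items()},
        "default": mean(units for _, units in rows),
    }


def predict(model, weekday):
    return model["by_weekday"].get(weekday, model["default"])


def mean_absolute_error(model, rows):
    return mean(abs(predict(model, weekday) - units) for weekday, units in rows)


training, validation = split(rows)
model = train(training)
error = mean_absolute_error(model, validation)
print(f"validation MAE: {error}")
```

Because the baseline cannot see the weekly growth, it underestimates the later weeks; a gap like this on held-out data is the signal that the model or features need work.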
SALES FORECAST MODEL TRAINING PROCESS
download_data.py → Data → splitter.py → Training Data + Validation Data
Training Data → decision_tree.py → model.pkl
model.pkl + Validation Data → evaluation.py → metrics.json
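Each step in this process hands its result to the next one through a file. A minimal sketch of that hand-off, assuming the model is pickled to model.pkl and the evaluation metrics are written as JSON to metrics.json, might look as follows. The model object and the metric value are placeholders, and `save_artifacts` and `load_model` are hypothetical helpers rather than code from the workshop repository.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_artifacts(model, metrics, out_dir):
    """Write the trained model and its evaluation metrics as files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    model_path = out / "model.pkl"
    metrics_path = out / "metrics.json"
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    with open(metrics_path, "w") as f:
        json.dump(metrics, f, indent=2, sort_keys=True)
    return model_path, metrics_path


def load_model(path):
    """Load a pickled model. Only unpickle files you produced yourself."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Placeholder values standing in for a trained model and a real metric.
model = {"kind": "placeholder-model", "max_depth": 5}
metrics = {"validation_error": 0.42}

with tempfile.TemporaryDirectory() as tmp:
    model_path, metrics_path = save_artifacts(model, metrics, tmp)
    restored = load_model(model_path)
    reported = json.loads(metrics_path.read_text())

print(restored, reported)
```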
CHALLENGE 1: THESE ARE LARGE FILES!
CHALLENGE 2: AD-HOC MULTI-STEP PROCESS
SOLUTION: dvc
data science version control
● dvc is git porcelain for storing large files using cloud storage
● dvc connects model training steps to create reproducible workflows
[Diagram labels: master, change-max-depth, try-random-forest; decision_tree.py, model.pkl.dvc, model.pkl]
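The idea behind keeping a small model.pkl.dvc file in git while the large model.pkl lives elsewhere can be illustrated with a content hash. The sketch below is not dvc's implementation or file format: the `.pointer.json` file, the local cache directory standing in for cloud storage, and the `track` and `restore` helpers are all assumptions made for the example.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_sha256(path):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(path, cache_dir):
    """Copy the file into the cache under its hash and write a small pointer file.

    The pointer file is what would be committed to git instead of the large file.
    """
    path = Path(path)
    checksum = file_sha256(path)
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(path, cache / checksum)
    pointer = path.with_name(path.name + ".pointer.json")
    pointer.write_text(json.dumps({"path": path.name, "sha256": checksum}))
    return pointer


def restore(pointer, cache_dir):
    """Recreate the large file next to its pointer from the cached copy."""
    pointer = Path(pointer)
    info = json.loads(pointer.read_text())
    target = pointer.with_name(info["path"])
    shutil.copyfile(Path(cache_dir) / info["sha256"], target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp) / "workspace"
    workspace.mkdir()
    model_file = workspace / "model.pkl"
    original = b"pretend this is a large binary model" * 1000
    model_file.write_bytes(original)

    pointer = track(model_file, Path(tmp) / "cache")
    model_file.unlink()  # simulate a fresh checkout that only has the pointer
    restored = restore(pointer, Path(tmp) / "cache").read_bytes()

print(pointer.name, restored == original)
```

Since the pointer changes whenever the file content changes, each git branch can reference its own version of the model without storing the binary in the repository.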
ANATOMY OF A DVC COMMAND
dvc run -d src/download_data.py \
    -o data/raw/store47-2016.csv python src/download_data.py
This runs a command and creates a .dvc file. The .dvc file points to the dependencies. The output files are versioned and stored in the cloud by running dvc push.
When you use the output files (store47-2016.csv) as dependencies for the next step, a pipeline is automatically created.
You can re-execute an entire pipeline with one command: dvc repro
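The re-execution behaviour described here, where a step runs again only when one of its dependencies has changed, can be sketched with a fingerprint over the dependency files. This is a simplified illustration of the concept and not how dvc is implemented: the `Stage` class, its in-memory fingerprint and the file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(paths):
    """Combine the names and contents of all dependency files into one hash."""
    digest = hashlib.sha256()
    for p in sorted(str(p) for p in paths):
        digest.update(p.encode())
        digest.update(Path(p).read_bytes())
    return digest.hexdigest()


class Stage:
    """One pipeline step: dependencies, outputs, and the action that links them."""

    def __init__(self, deps, outs, action):
        self.deps, self.outs, self.action = deps, outs, action
        self.last_fingerprint = None

    def repro(self):
        """Run the action only if a dependency changed or an output is missing."""
        current = fingerprint(self.deps)
        outputs_exist = all(Path(o).exists() for o in self.outs)
        if current == self.last_fingerprint and outputs_exist:
            return False  # up to date, nothing to do
        self.action()
        self.last_fingerprint = current
        return True


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.txt"
    raw = Path(tmp) / "raw.csv"
    train = Path(tmp) / "train.csv"
    source.write_text("a,1\nb,2\n")

    # The output of the first stage is the dependency of the second one,
    # which is what chains the two steps into a pipeline.
    download = Stage([source], [raw], lambda: raw.write_text(source.read_text()))
    split = Stage([raw], [train],
                  lambda: train.write_text(raw.read_text().splitlines()[0] + "\n"))
    pipeline = [download, split]

    runs = [[stage.repro() for stage in pipeline]]      # first run: everything executes
    runs.append([stage.repro() for stage in pipeline])  # nothing changed: all skipped
    source.write_text("c,3\nd,4\n")
    runs.append([stage.repro() for stage in pipeline])  # input changed: both re-run

print(runs)
```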
EXERCISE 3: MACHINE LEARNING
PIPELINE
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions →
3-machine-learning-pipeline.md
● Follow the steps on your local development environment
and in GoCD to create your Machine Learning pipeline
HOW DO WE TRACK EXPERIMENTS?
We need to track the scientific process and evaluate our models:
● Which experiments and hypotheses are being explored?
● Which algorithms are being used in each experiment?
● Which version of the code was used?
● How long does it take to run each experiment?
● What parameters and hyperparameters were used?
● How fast are my models learning?
● How do we compare results from different runs?
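Most of the questions above come down to recording the same few facts for every training run. The sketch below uses a plain JSON-lines file as a stand-in for a tracking server; it is not the MLflow API, and the record fields, the helper names and the metric values are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def log_run(log_path, experiment, algorithm, params, metrics, code_version, duration_seconds):
    """Append one record per training run so runs can be compared later."""
    record = {
        "experiment": experiment,
        "algorithm": algorithm,
        "params": params,
        "metrics": metrics,
        "code_version": code_version,
        "duration_seconds": duration_seconds,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def load_runs(log_path):
    lines = Path(log_path).read_text().splitlines()
    return [json.loads(line) for line in lines if line]


def best_run(runs, metric, lower_is_better=True):
    chooser = min if lower_is_better else max
    return chooser(runs, key=lambda run: run["metrics"][metric])


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "runs.jsonl"
    # Placeholder values; real runs would record measured metrics and a real commit id.
    log_run(log_path, "change-max-depth", "decision tree",
            {"max_depth": 8}, {"validation_error": 0.71}, "commit-aaa", 12.5)
    log_run(log_path, "try-random-forest", "random forest",
            {"n_estimators": 50}, {"validation_error": 0.64}, "commit-bbb", 48.0)
    runs = load_runs(log_path)
    winner = best_run(runs, "validation_error")

print(winner["experiment"], winner["metrics"])
```

A dedicated tracking tool adds a UI, artifact storage and concurrency handling on top of this basic record-and-compare loop.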
MLflow: An Open Source platform for managing the end-to-end machine learning lifecycle
EXERCISE 4: TRACKING
EXPERIMENTS
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions →
4-tracking-experiments.md
● Follow the steps to track ML training in MLflow
● MLflow URL: https://mlflow.cd4ml.net
HOW TO LEARN CONTINUOUSLY?
We need to capture production data to improve our models:
● Track model usage
● Track model inputs to find training-serving skew
● Track model outputs
● Track model interpretability outputs to identify potential bias or overfitting
● Track model fairness to understand how it behaves against dimensions that could introduce unfair bias
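Training-serving skew can be checked by comparing the inputs a model sees in production with the inputs it was trained on. One simple way to do that, sketched below, is to measure how far the mean of a feature has moved in units of the training standard deviation. The metric, the sample values and the alert threshold are all illustrative assumptions, not a method prescribed by the workshop.

```python
from statistics import mean, stdev


def skew_score(training_values, serving_values):
    """Shift of the serving mean from the training mean, in training standard deviations."""
    spread = stdev(training_values)
    shift = abs(mean(serving_values) - mean(training_values))
    if spread == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


# Illustrative values for a single numeric input feature.
training = [10, 12, 11, 13, 9, 11]
serving_similar = [11, 10, 12]
serving_shifted = [20, 22, 21]

ALERT_THRESHOLD = 3.0  # hypothetical cut-off; tune per feature in practice

for name, values in [("similar", serving_similar), ("shifted", serving_shifted)]:
    score = skew_score(training, values)
    status = "ALERT" if score > ALERT_THRESHOLD else "ok"
    print(f"{name}: score={score:.2f} {status}")
```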
EFK STACK
Monitoring and Observability infrastructure
● Fluentd: Open Source data collector for unified logging
● Elasticsearch: Open Source Search Engine
● Kibana: Open Source web UI to explore and visualise data
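For a log collector to index predictions, the application has to emit them as structured events. The sketch below writes one JSON object per prediction using Python's standard logging module; it writes to an in-memory stream instead of sending anything to a collector, and the event field names and the `build_prediction_event` helper are assumptions for illustration.

```python
import io
import json
import logging
from datetime import datetime, timezone


def build_prediction_event(item, date, prediction, model_version):
    """Describe a single prediction as a flat, JSON-serialisable record."""
    return {
        "event": "prediction",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "item": item,
        "date": date,
        "prediction": prediction,
        "model_version": model_version,
    }


def make_json_logger(stream):
    """Logger that writes each message on its own line, ready for a log collector."""
    logger = logging.getLogger("prediction-events")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


stream = io.StringIO()  # stands in for stdout or a log file watched by a collector
logger = make_json_logger(stream)

event = build_prediction_event("milk", "2019-05-01", 42.0, "model-v1")
logger.info(json.dumps(event, sort_keys=True))

logged = json.loads(stream.getvalue())
print(logged["event"], logged["item"], logged["prediction"])
```

Recording the model version with each event makes it possible to compare the behaviour of different deployed models once the data is searchable.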
Kibana: An Open Source UI that makes it easy to explore and visualise the data indexed in Elasticsearch
EXERCISE 5: MODEL MONITORING
https://github.com/ThoughtWorksInc/continuous-intelligence-workshop
● Click on instructions → 5-model-monitoring.md
● Follow the steps to log prediction events
● Kibana URL: https://kibana.cd4ml.net
SUMMARY - WHAT HAVE WE LEARNED?
CD4ML
● Proper data/model versioning tools enable reproducible work to be done in
parallel.
● No need to maintain complex data processing/model training scripts.
● We can then put data science work into a Continuous Delivery workflow.
● Result: Continuous, on-demand AI development and deployment, from
research to production, with a single command.
● Benefit: production AI systems that are always as smart as your data
science team.
THANK YOU!
Danilo Sato (dsato@thoughtworks.com)
Christoph Windheuser (cwindheu@thoughtworks.com)
Emily Gorcenski (egorcens@thoughtworks.com)
Arif Wider (awider@thoughtworks.com)
join.thoughtworks.com
WHAT DO WE NEED IN OUR STACK?
Doing CD with Machine Learning is still a hard problem
● MODEL PERFORMANCE ASSESSMENT TRACKING
● BUSINESS VALUE ASSESSMENT
● VERSION CONTROL (FOR CODE AND MODELS AND DATA)
● DEPLOYMENT, MONITORING, AND LOGGING
VERSION CONTROL & COLLABORATION
Our code, data, and models should be versioned and shareable without unnecessary work.
What are the challenges?
■ Data and models can be very large
■ Data can vary invisibly
■ Data scientists need to share work and it must be repeatable
What does the ideal solution look like?
■ Large artifacts stored in arbitrary storage linked to source repo
■ Data scientists can encode work process and repeat in one step
What solutions are out there now?
■ Storage: S3, HDFS, etc.
■ Git LFS
■ Shell scripts
■ dvc
■ Pachyderm
■ jupyterhub
MODEL PERFORMANCE TRACKING
We should be able to scale model development to try multiple modeling approaches simultaneously.
What are the challenges?
■ Hyperparameter tuning and model selection are hard
■ Tracking performance depends on other moving parts (e.g. data)
What does the ideal solution look like?
■ Links models to specific training sets and parameter sets
■ Is differentiable
■ Allows visualization of results
What solutions are out there now?
■ dvc
■ MLflow
■ No shortage of options here

Open Call Webinar presentation
 
Fraunhofer – SINTEF: towards an initiative on Data Sovereignty in Europe
Fraunhofer – SINTEF: towards an initiative on Data Sovereignty in EuropeFraunhofer – SINTEF: towards an initiative on Data Sovereignty in Europe
Fraunhofer – SINTEF: towards an initiative on Data Sovereignty in Europe
 
About OPEN DEI
About OPEN DEIAbout OPEN DEI
About OPEN DEI
 

More from Dr. Arif Wider

Data Mesh - It's not about technology, it's about people
Data Mesh - It's not about technology, it's about peopleData Mesh - It's not about technology, it's about people
Data Mesh - It's not about technology, it's about people
Dr. Arif Wider
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Dr. Arif Wider
 
Continuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionContinuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in Production
Dr. Arif Wider
 
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & AnalyticsDataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
Dr. Arif Wider
 
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & AnalyticsDataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
Dr. Arif Wider
 
DataDevOps - A Manifesto on Shared Data Responsibility in Times of Microservices
DataDevOps - A Manifesto on Shared Data Responsibility in Times of MicroservicesDataDevOps - A Manifesto on Shared Data Responsibility in Times of Microservices
DataDevOps - A Manifesto on Shared Data Responsibility in Times of Microservices
Dr. Arif Wider
 
Predictive Analytics for Vehicle Price Prediction - Delivered Continuously at...
Predictive Analytics for Vehicle Price Prediction - Delivered Continuously at...Predictive Analytics for Vehicle Price Prediction - Delivered Continuously at...
Predictive Analytics for Vehicle Price Prediction - Delivered Continuously at...
Dr. Arif Wider
 
A High-Performance Solution to Microservice UI Composition @ XConf Hamburg
A High-Performance Solution to Microservice UI Composition @ XConf HamburgA High-Performance Solution to Microservice UI Composition @ XConf Hamburg
A High-Performance Solution to Microservice UI Composition @ XConf Hamburg
Dr. Arif Wider
 
An Unexpected Solution to Microservices UI Composition
An Unexpected Solution to Microservices UI CompositionAn Unexpected Solution to Microservices UI Composition
An Unexpected Solution to Microservices UI Composition
Dr. Arif Wider
 

More from Dr. Arif Wider (9)

Data Mesh - It's not about technology, it's about people
Data Mesh - It's not about technology, it's about peopleData Mesh - It's not about technology, it's about people
Data Mesh - It's not about technology, it's about people
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
 
Continuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionContinuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in Production
 
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & AnalyticsDataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
 
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & AnalyticsDataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
DataDevOps: A Manifesto for a DevOps-like Culture Shift in Data & Analytics
 
DataDevOps - A Manifesto on Shared Data Responsibility in Times of Microservices
DataDevOps - A Manifesto on Shared Data Responsibility in Times of MicroservicesDataDevOps - A Manifesto on Shared Data Responsibility in Times of Microservices
DataDevOps - A Manifesto on Shared Data Responsibility in Times of Microservices
 
Predictive Analytics for Vehicle Price Prediction - Delivered Continuously at...
Predictive Analytics for Vehicle Price Prediction - Delivered Continuously at...Predictive Analytics for Vehicle Price Prediction - Delivered Continuously at...
Predictive Analytics for Vehicle Price Prediction - Delivered Continuously at...
 
A High-Performance Solution to Microservice UI Composition @ XConf Hamburg
A High-Performance Solution to Microservice UI Composition @ XConf HamburgA High-Performance Solution to Microservice UI Composition @ XConf Hamburg
A High-Performance Solution to Microservice UI Composition @ XConf Hamburg
 
An Unexpected Solution to Microservices UI Composition
An Unexpected Solution to Microservices UI CompositionAn Unexpected Solution to Microservices UI Composition
An Unexpected Solution to Microservices UI Composition
 

Recently uploaded

First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
Tendenci - The Open Source AMS (Association Management Software)
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
Srikant77
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
IES VE
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 

Recently uploaded (20)

First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 

Continuous Intelligence: Moving Machine Learning into Production Reliably

  • 1. Continuous Intelligence: Moving Machine Learning Application into Production Reliably. Christoph Windheuser, Danilo Sato, Emily Gorcenski, Arif Wider (ThoughtWorks Inc.). WORKSHOP ON WHY AND HOW TO APPLY CONTINUOUS DELIVERY TO MACHINE LEARNING (CD4ML). ©ThoughtWorks 2019, Strata Data Conference London, April 30, 2019
  • 2. Structure of Today’s Workshop: INTRODUCTION TO THE TOPIC; EXERCISE 1: SETUP; EXERCISE 2: DEPLOYMENT PIPELINE; BREAK; EXERCISE 3: ML PIPELINE; EXERCISE 4: TRACKING EXPERIMENTS; EXERCISE 5: MODEL MONITORING
  • 3. 5000+ technologists with 40 offices in 14 countries. Partner for technology-driven business transformation. join.thoughtworks.com
  • 4. #1 in Agile and Continuous Delivery. 100+ books written.
  • 6. TECHNIQUES: Continuous delivery for machine learning (CD4ML) models, #8, ASSESS
  • 7. CONTINUOUS INTELLIGENCE CYCLE
  • 8. PRODUCTIONIZING ML IS HARD. CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible. But machine learning systems have unique challenges: unlike deterministic software, the behavior of data-driven intelligent systems is difficult, or impossible, to understand. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles. Production systems should be: Reproducible; Testable; Auditable; Continuously Improving. HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  • 9. PRODUCTIONIZING ML IS HARD. CD4ML isn’t a technology or a tool; it is a practice and a set of principles. Quality is built into software and improvement is always possible. But machine learning systems have unique challenges: unlike deterministic software, the behavior of data-driven intelligent systems is difficult, or impossible, to understand. This poses a huge challenge when it comes to deploying machine learning systems in accordance with CD principles. Production systems should be: Reproducible; Testable; Auditable; Continuously Improving. Machine Learning is: Non-deterministic; Hard to test; Hard to explain; Hard to improve. HOW DO WE APPLY DECADES OF SOFTWARE DELIVERY EXPERIENCE TO INTELLIGENT SYSTEMS?
  • 10. MANY SOURCES OF CHANGE: Data (Schema, Sampling over Time, Volume, ...) + Model (Research, Experiments, Training on New Data, Performance, ...) + Code (New Features, Bug Fixes, Dependencies, ...). Icons created by Noura Mbarki and I Putu Kharismayadi from Noun Project.
  • 11. “Continuous Delivery is the ability to get changes of all types, including new features, configuration changes, bug fixes and experiments, into production, or into the hands of users, safely and quickly in a sustainable way.” - Jez Humble & Dave Farley
  • 12. PRINCIPLES OF CONTINUOUS DELIVERY: Create a Repeatable, Reliable Process for Releasing Software; Automate Almost Everything; Build Quality In; Work in Small Batches; Keep Everything in Source Control; Done Means “Released”; Improve Continuously.
  • 13. WHAT DO WE NEED IN OUR STACK? Doing CD with Machine Learning is still a hard problem. MODEL PERFORMANCE ASSESSMENT TRACKING; VERSION CONTROL AND ARTIFACT REPOSITORIES; MODEL MONITORING AND OBSERVABILITY; DISCOVERABLE AND ACCESSIBLE DATA; CONTINUOUS DELIVERY ORCHESTRATION TO COMBINE PIPELINES; INFRASTRUCTURE FOR MULTIPLE ENVIRONMENTS AND EXPERIMENTS.
  • 14. PUTTING EVERYTHING TOGETHER (diagram labels): Data Science, Model Building; Training Data; Source Code + Executables; Model + parameters; CD Tools and Repositories; Discoverable and Accessible Data.
  • 15. PUTTING EVERYTHING TOGETHER (diagram labels): Data Science, Model Building; Training Data; Source Code + Executables; Model Evaluation; Test Data; Model + parameters; CD Tools and Repositories; Discoverable and Accessible Data.
  • 16. PUTTING EVERYTHING TOGETHER (diagram labels): Data Science, Model Building; Training Data; Source Code + Executables; Model Evaluation; Productionize Model; Test Data; Model + parameters; CD Tools and Repositories; Discoverable and Accessible Data.
  • 17. PUTTING EVERYTHING TOGETHER (diagram labels): Data Science, Model Building; Training Data; Source Code + Executables; Model Evaluation; Productionize Model; Integration Testing; Test Data; Model + parameters; CD Tools and Repositories; Discoverable and Accessible Data.
  • 18. PUTTING EVERYTHING TOGETHER (diagram labels): Data Science, Model Building; Training Data; Source Code + Executables; Model Evaluation; Productionize Model; Integration Testing; Deployment; Test Data; Model + parameters; CD Tools and Repositories; Discoverable and Accessible Data.
  • 19. PUTTING EVERYTHING TOGETHER (diagram labels): Data Science, Model Building; Training Data; Source Code + Executables; Model Evaluation; Productionize Model; Integration Testing; Deployment; Monitoring; Production Data; Test Data; Model + parameters; CD Tools and Repositories; Discoverable and Accessible Data.
  • 21. WHAT WE WILL USE IN THIS WORKSHOP. There are many options for tools and technologies to implement CD4ML.
  • 22. THE MACHINE LEARNING PROBLEM WE ARE EXPLORING TODAY
  • 23. A REAL BUSINESS PROBLEM. RETAIL / SUPPLY CHAIN: Loss of sales, opportunity cost, stock waste, discounting. REQUIRES: Accurate Demand Forecasting. TYPICAL CHALLENGES: Predictions Are Inaccurate; Development Takes A Long Time; Difficult To Adapt To Market Change Pace.
  • 24. SALES FORECASTING FOR GROCERY RETAILER. TASK: Predict how many of each product will be purchased in each store on a given date. Make predictions based on data from: 4,000 items; 50 stores; 125,000,000 sales transactions; 4.5 years of data.
  • 25. THE SIMPLIFIED WEB APPLICATION. As a buyer, I want to be able to choose a product and predict how many units the product will sell at a future date.
  • 26. EXERCISE 1: SETUP (https://github.com/ThoughtWorksInc/continuous-intelligence-workshop). Click on instructions → 1-setup.md; follow the steps to set up your local development environment. User ID assignment sheet: http://bit.ly/cd4ml-strata19
  • 27. DEPLOYMENT PIPELINE: Automates the process of building, testing, and deploying applications to production. Application code in version control repository → Container image as deployment artifact → Deploy container to production servers.
  • 28. GoCD: An Open Source Continuous Delivery server to model and visualise complex workflows
  • 29. ANATOMY OF A GOCD PIPELINE: Pipeline Group
  • 30. EXERCISE 2: DEPLOYMENT PIPELINE (https://github.com/ThoughtWorksInc/continuous-intelligence-workshop). Click on instructions → 2-deployment-pipeline.md; follow the steps to set up your deployment pipeline. GoCD URL: https://gocd.cd4ml.net
  • 31. BUT NOW WHAT? Once your model is in production: How do we retrain the model more often? How do we deploy the retrained model to production? How do we make sure we don’t break anything when deploying? How do we make sure that our modeling approach or parameterization is still the best fit for the data? How do we monitor our model “in the wild”?
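One of the questions above, how to avoid breaking anything when a retrained model is deployed, is often approached with an automated quality gate in the pipeline. The following is a minimal sketch only, assuming the training step reports its metrics as JSON; the function name, the metric name `validation_error`, and the threshold values are hypothetical and are not taken from the workshop material.

```python
import json


def passes_quality_gate(metrics_json, metric_name, max_allowed):
    """Return True when the reported metric is at or below the allowed maximum."""
    metrics = json.loads(metrics_json)
    if metric_name not in metrics:
        # A missing metric fails the gate instead of passing silently.
        return False
    return metrics[metric_name] <= max_allowed


# Hypothetical metrics for a candidate model (illustrative numbers only).
candidate = json.dumps({"validation_error": 0.58})

ok_to_deploy = passes_quality_gate(candidate, "validation_error", max_allowed=0.80)
blocked = passes_quality_gate(candidate, "validation_error", max_allowed=0.50)
```

A pipeline stage could run such a check after evaluation and stop the deployment when it returns False.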
  • 32. BASIC DATA SCIENCE WORKFLOW: Gather data and extract features → Separate into training and validation sets → Train model and evaluate performance.
  • 33. SALES FORECAST MODEL TRAINING PROCESS: download_data.py → Data → splitter.py → Training Data, Validation Data → decision_tree.py → model.pkl → evaluation.py → metrics.json
  • 34. CHALLENGE 1: THESE ARE LARGE FILES! download_data.py → Data → splitter.py → Training Data, Validation Data → decision_tree.py → model.pkl → evaluation.py → metrics.json
  • 35. CHALLENGE 2: AD-HOC MULTI-STEP PROCESS. download_data.py → Data → splitter.py → Training Data, Validation Data → decision_tree.py → model.pkl → evaluation.py → metrics.json
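The multi-step process on the slides above (split, train, evaluate, write metrics) can be illustrated end to end with a toy example. This is a sketch under stated assumptions, not the workshop's code: the rows are invented, the function names are hypothetical, and a per-item mean stands in for the decision tree so that the example needs only the standard library.

```python
import json
import statistics
from datetime import date

# Hypothetical toy rows: (date, item_id, unit_sales). Invented for illustration.
ROWS = [
    (date(2016, 8, 1), "item_a", 10.0),
    (date(2016, 8, 2), "item_a", 12.0),
    (date(2016, 8, 3), "item_a", 13.0),
    (date(2016, 8, 1), "item_b", 3.0),
    (date(2016, 8, 2), "item_b", 5.0),
    (date(2016, 8, 3), "item_b", 4.5),
]


def split_by_date(rows, cutoff):
    """Splitter step: rows before the cutoff train the model, the rest validate it."""
    train = [r for r in rows if r[0] < cutoff]
    validation = [r for r in rows if r[0] >= cutoff]
    return train, validation


def train_baseline(train_rows):
    """Training step: a per-item mean, standing in for the decision tree."""
    by_item = {}
    for _, item, units in train_rows:
        by_item.setdefault(item, []).append(units)
    return {item: statistics.mean(values) for item, values in by_item.items()}


def evaluate(model, validation_rows):
    """Evaluation step: mean absolute error on the held-out rows."""
    errors = [abs(model[item] - units) for _, item, units in validation_rows]
    return {"mean_absolute_error": statistics.mean(errors), "n_validation": len(errors)}


train, validation = split_by_date(ROWS, cutoff=date(2016, 8, 3))
model = train_baseline(train)
metrics = evaluate(model, validation)
metrics_json = json.dumps(metrics, sort_keys=True)  # would be written to metrics.json
```

Each function corresponds to one script in the diagram, which is what makes the process ad hoc when the steps are run by hand in the right order.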
  • 36. SOLUTION: dvc (data science version control). dvc is git porcelain for storing large files using cloud storage; dvc connects model training steps to create reproducible workflows. Diagram labels: branches master, change-max-depth, try-random-forest; files model.pkl, decision_tree.py, model.pkl.dvc.
  • 37. ANATOMY OF A DVC COMMAND: dvc run -d src/download_data.py -o data/raw/store47-2016.csv python src/download_data.py. This runs a command and creates a .dvc file. The .dvc file points to the dependencies. The output files are versioned and stored in the cloud by running dvc push. When you use the output files (store47-2016.csv) as dependencies for the next step, a pipeline is automatically created. You can re-execute an entire pipeline with one command: dvc repro
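The idea behind re-executing a pipeline from recorded dependencies can be sketched in a few lines. This is a conceptual illustration only and not how dvc is implemented: the `Stage` class, its fields, and the use of SHA-256 content hashes are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path):
    """Content hash of one file; a changed hash means the file changed."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


class Stage:
    """Hypothetical stand-in for one pipeline step; not dvc's data model."""

    def __init__(self, deps, out, action):
        self.deps = [Path(d) for d in deps]
        self.out = Path(out)
        self.action = action   # callable that writes self.out
        self.recorded = None   # dependency hashes from the last run

    def reproduce(self):
        """Run the action only if a dependency changed or the output is missing."""
        current = {str(d): file_hash(d) for d in self.deps}
        if current == self.recorded and self.out.exists():
            return False       # up to date, nothing to do
        self.action()
        self.recorded = current
        return True


workdir = Path(tempfile.mkdtemp())
script = workdir / "download_data.py"
output = workdir / "store47-2016.csv"
script.write_text("# version 1\n")

stage = Stage([script], output, lambda: output.write_text("date,unit_sales\n"))
first = stage.reproduce()    # output missing, so the stage runs
second = stage.reproduce()   # nothing changed, so the stage is skipped
script.write_text("# version 2\n")
third = stage.reproduce()    # dependency changed, so the stage runs again
```

Chaining such stages, with one stage's output listed as the next stage's dependency, gives the pipeline behavior the slide describes.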
  • 38. EXERCISE 3: MACHINE LEARNING PIPELINE (https://github.com/ThoughtWorksInc/continuous-intelligence-workshop). Click on instructions → 3-machine-learning-pipeline.md; follow the steps on your local development environment and in GoCD to create your Machine Learning pipeline.
  • 39. HOW DO WE TRACK EXPERIMENTS? We need to track the scientific process and evaluate our models: Which experiments and hypotheses are being explored? Which algorithms are being used in each experiment? Which version of the code was used? How long does it take to run each experiment? What parameters and hyperparameters were used? How fast are my models learning? How do we compare results from different runs?
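The questions above suggest what a single run record needs to hold. The sketch below is a standard-library illustration of such a record and is not the MLflow API; the `RunRecord` fields, the `abc123` code version, and the placeholder training function with its made-up error values are all hypothetical.

```python
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class RunRecord:
    """Hypothetical record of one training run; field names are illustrative."""
    hypothesis: str
    algorithm: str
    code_version: str
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    duration_seconds: float = 0.0


def track(record, train_fn):
    """Time the training function and store the metrics it returns."""
    started = time.perf_counter()
    record.metrics = train_fn(**record.params)
    record.duration_seconds = time.perf_counter() - started
    return record


def best_run(records, metric):
    """Compare runs by one metric, where lower is better."""
    return min(records, key=lambda r: r.metrics[metric])


def fake_train(max_depth):
    # Placeholder for real training: returns a made-up error that shrinks with depth.
    return {"validation_error": 1.0 / max_depth}


runs = [
    track(RunRecord("deeper trees fit better", "decision_tree", "abc123",
                    {"max_depth": depth}), fake_train)
    for depth in (2, 4, 8)
]
winner = best_run(runs, "validation_error")
log_line = json.dumps(asdict(winner), sort_keys=True)
```

A tracking server plays the same role at scale: it stores these records centrally so that runs from different people and branches can be compared.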
  • 40. MLflow: An Open Source platform for managing the end-to-end machine learning lifecycle
  • 41. EXERCISE 4: TRACKING EXPERIMENTS (https://github.com/ThoughtWorksInc/continuous-intelligence-workshop). Click on instructions → 4-tracking-experiments.md; follow the steps to track ML training in MLflow. MLflow URL: https://mlflow.cd4ml.net
  • 42. HOW TO LEARN CONTINUOUSLY? We need to capture production data to improve our models: Track model usage; Track model inputs to find training-serving skew; Track model outputs; Track model interpretability outputs to identify potential bias or overfitting; Track model fairness to understand how it behaves against dimensions that could introduce unfair bias.
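Training-serving skew, mentioned above, can be checked in many ways. One simple option, shown here purely as an assumption-laden sketch, compares the mean of each feature in production against its training distribution; the feature names, the values, and the threshold of 2.0 standard deviations are hypothetical.

```python
import statistics


def mean_shift(training_values, serving_values):
    """Shift of the serving mean, measured in training standard deviations."""
    train_mean = statistics.mean(training_values)
    train_std = statistics.pstdev(training_values)
    if train_std == 0:
        raise ValueError("training feature is constant; shift is undefined")
    return (statistics.mean(serving_values) - train_mean) / train_std


def skewed_features(training, serving, threshold=2.0):
    """Names of features whose serving mean moved more than the threshold."""
    return sorted(
        name for name in training
        if abs(mean_shift(training[name], serving[name])) > threshold
    )


# Invented feature samples for illustration.
training = {"price": [1, 2, 3, 4, 5], "day_of_week": [0, 1, 2, 3, 4, 5, 6]}
serving = {"price": [9, 10, 11], "day_of_week": [2, 3, 4]}

flagged = skewed_features(training, serving)
```

A mean comparison misses changes in spread or shape, so a production check would likely add further statistics per feature.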
  • 43. EFK STACK: Monitoring and Observability infrastructure. Fluentd: Open Source data collector for unified logging. Elasticsearch: Open Source Search Engine. Kibana: Open Source web UI to explore and visualise data.
  • 44. Kibana: An Open Source UI that makes it easy to explore and visualise the data indexed in Elasticsearch
  • 45. EXERCISE 5: MODEL MONITORING (https://github.com/ThoughtWorksInc/continuous-intelligence-workshop). Click on instructions → 5-model-monitoring.md; follow the steps to log prediction events. Kibana URL: https://kibana.cd4ml.net
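Logging prediction events usually means emitting one structured record per prediction that a log collector can forward to a search index. The sketch below uses only Python's standard `logging` module and writes JSON lines to a stream; it does not use a Fluentd client, and the event fields, the logger name, and the sample item number are hypothetical.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    def format(self, record):
        event = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "event": record.getMessage(),
        }
        # Extra fields are attached by log_prediction through the `extra` argument.
        event.update(getattr(record, "fields", {}))
        return json.dumps(event, sort_keys=True)


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("prediction_events")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def log_prediction(logger, model_version, inputs, prediction):
    """Record the model version, inputs, and output of one prediction."""
    logger.info("prediction", extra={"fields": {
        "model_version": model_version,
        "inputs": inputs,
        "prediction": prediction,
    }})


stream = io.StringIO()
logger = build_logger(stream)
log_prediction(logger, "model-v1", {"item_nbr": 99197, "date": "2017-08-16"}, 4.2)
logged = json.loads(stream.getvalue())
```

Recording inputs alongside outputs and the model version is what later allows usage, skew, and output drift to be explored in a dashboard.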
  • 46. SUMMARY - WHAT HAVE WE LEARNED?
  • 47. CD4ML: Proper data/model versioning tools enable reproducible work to be done in parallel. No need to maintain complex data processing/model training scripts. We can then put data science work into a Continuous Delivery workflow. Result: Continuous, on-demand AI development and deployment, from research to production, with a single command. Benefit: production AI systems that are always as smart as your data science team.
  • 48. THANK YOU! Danilo Sato (dsato@thoughtworks.com), Christoph Windheuser (cwindheu@thoughtworks.com), Emily Gorcenski (egorcens@thoughtworks.com), Arif Wider (awider@thoughtworks.com). join.thoughtworks.com
  • 49. WHAT DO WE NEED IN OUR STACK? Doing CD with Machine Learning is still a hard problem. MODEL PERFORMANCE ASSESSMENT TRACKING; BUSINESS VALUE ASSESSMENT; VERSION CONTROL (FOR CODE AND MODELS AND DATA); DEPLOYMENT, MONITORING, AND LOGGING.
  • 50. VERSION CONTROL & COLLABORATION. Our code, data, and models should be versioned and shareable without unnecessary work. What are the challenges? Data and models can be very large; data can vary invisibly; data scientists need to share work and it must be repeatable. What does the ideal solution look like? Large artifacts stored in arbitrary storage linked to the source repo; data scientists can encode their work process and repeat it in one step. What solutions are out there now? Storage (S3, HDFS, etc.); Git LFS; shell scripts; dvc; Pachyderm; jupyterhub.
  • 51. MODEL PERFORMANCE TRACKING. We should be able to scale model development to try multiple modeling approaches simultaneously. What are the challenges? Hyperparameter tuning and model selection are hard; tracking performance depends on other moving parts (e.g. data). What does the ideal solution look like? Links models to specific training sets and parameter sets; is differentiable; allows visualization of results. What solutions are out there now? dvc; MLflow; no shortage of options here.