ODSC London - ML in production

ML IN PRODUCTION
Serverless and Painless
Oliver Gindele
@tinyoli
oliver@datatonic.com
22.11.2019 ODSC London

Who is Oliver?
+ Head of Machine Learning
+ PhD in computational physics
Who is datatonic?
We are a strong team of data scientists, machine learning
experts, software engineers and mathematicians.
Our mission is to provide tailor-made systems to help your
organization get smart actionable insights from large data
volumes.

Why is moving models
into production hard?

ML Project Life Cycle
Start a new ML project
Phases: Discover, Model, Build, Deploy
1. Define ML use cases: define specific ML use cases for the project
2. Data exploration: perform exploratory analysis to understand the data
3. Select algorithm: choose the right ML algorithm for the task
4. Data pipeline & feature engineering: create the right features from raw data for the ML task
5. Build ML model: develop the first iteration of the ML model
6. Iterate ML model: refine the ML model to improve performance and efficacy
7. Present results: present results of the model in a way that demonstrates its value to stakeholders
8. Plan for deployment: prepare for deployment in production
9. Operationalize model: deploy and operationalize ML model in production
10. Monitor model: monitor deployed ML model and retrain or rebuild when performance degrades

ML Project Life Cycle (same diagram as above)
The hard part!
Needs SE/DevOps skills

The Model
+ MobileNetV2
+ Lots of image augmentation, careful data selection
+ Transfer learning → retrain top layers
+ Small changes to overall architecture
+ Hyperparameter tuning
→ F1 score: from 45% to 97%
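
The "retrain top layers" step can be sketched roughly as below. This is a minimal illustration, not the code used in the talk: the helper name `freeze_all_but_top`, the stand-in layers, and the count of 20 trainable layers in the comment are all hypothetical. The helper only relies on each layer exposing a `trainable` attribute, which Keras layers do.

```python
from types import SimpleNamespace


def freeze_all_but_top(layers, n_trainable):
    """Freeze every layer except the last `n_trainable` ones (in place)."""
    if n_trainable < 0:
        raise ValueError("n_trainable must be non-negative")
    cutoff = max(len(layers) - n_trainable, 0)
    for index, layer in enumerate(layers):
        # Layers before the cutoff keep their pretrained weights fixed.
        layer.trainable = index >= cutoff
    return layers


# Stand-in layers so the sketch runs without TensorFlow installed.
demo_layers = [SimpleNamespace(name=f"block_{i}", trainable=True) for i in range(5)]
freeze_all_but_top(demo_layers, n_trainable=2)
print([(layer.name, layer.trainable) for layer in demo_layers])

# With TensorFlow available, the same helper could be applied to a pretrained
# backbone (assumed input size and layer count, not taken from the talk):
#   base = tf.keras.applications.MobileNetV2(
#       input_shape=(224, 224, 3), include_top=False, weights="imagenet")
#   freeze_all_but_top(base.layers, n_trainable=20)
```

How many layers to unfreeze is itself a hyperparameter; fewer trainable layers usually means faster training and less risk of overfitting on a small image set.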

We want:
+ Quick development cycles 🚀
+ Continuous delivery 🔁
+ Ship ML models like “normal”
software 📦
+ As much automation as possible 🤖

Why MLOps?
+ Orchestration of multiple
pipelines
+ Scalable ML Applications
+ Flexible, self-serve R&D
+ Reliable APIs
+ Ongoing data validation
+ Monitoring and validation of
ML predictions
+ Model governance/versioning
+ Continuous Integration and
Deployment

The Landscape
(a sample)
+ TensorFlow Extended (TFX)
+ Machine-learning Studio
+ AWS SageMaker
+ GCP AI Platform

Current issues with data science platforms
(generalisations, 1/2)
+ Incomplete, missing features
+ Not stable yet, still maturing (and changing!)
+ New custom languages/APIs to learn
+ Vendor Lock-in

Current issues with data science platforms
(generalisations, 2/2)
+ Requires non-DS/ML skillsets (Docker, Spark, K8s)
+ Experimentation can’t be contained to a platform
+ Data access/data silos not solved
+ Focus on DS/ML but not business value

What now? (2019 View)
Check if one of these tools ticks all your boxes
→ If not: roll your own tailored solution - It’s not that hard
→ Serverless and managed options are your friend!
Google Cloud Platform example tools:
+ Composer: managed Airflow (orchestration)
+ BigQuery: fully managed data warehouse
+ GCS: blob storage
+ AI Platform: managed training, deployment, notebooks
+ Dataflow: managed ETL

Let’s Build a Machine Learning Pipeline
for Computer Vision
🖼 Preprocess images
⚙ Automate model training
and evaluation
🔬 Monitor and track model
performance
🏷 Store and version models

Training images → ❓ → .tflite model → ❓ → Android and iOS apps 🤷🏻‍♀ 🤷🏼‍♂

Even more challenges on mobile
🏎 Processor speed
📦 Storage space
🔋 Power
⏰ Latency
📴 Disconnections
📡 Bandwidth

Pipeline Overview
Stages: Data preparation, Machine Learning
+ Image Upload (GCS)

Data preparation
+ Image Upload (GCS)
+ Convert images to TFRecords (Dataflow)
+ Data augmentation (AI Platform)
+ Train/Eval Sets (GCS)
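
One way to produce the train/eval sets is a deterministic, hash-based split, sketched below. This is an assumption about how such a split could be done, not the method from the talk; `assign_split` and the 20% eval fraction are hypothetical.

```python
import hashlib


def assign_split(image_id, eval_fraction=0.2):
    """Deterministically assign an image to 'train' or 'eval' from its ID."""
    digest = hashlib.sha256(image_id.encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "eval" if bucket < eval_fraction else "train"


image_ids = [f"img_{i:05d}.jpg" for i in range(10)]
for image_id in image_ids:
    print(image_id, assign_split(image_id))
```

Because the assignment depends only on the image ID, re-running the pipeline after new uploads never moves an existing image between the two sets.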

Convert images to TFRecords
🏃 Dataflow is a serverless
runner for Beam pipelines
🌬 Converted 50k jpeg
training images in 10
minutes
🖼 TFRecords are serialized
images stored in a set of
files
🐍 Batch data processing in
Apache Beam (Python)
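
A batch conversion job of this kind might look like the sketch below. It assumes a hypothetical manifest file with one `gs://path,label` pair per line and hypothetical feature names; the real pipeline from the talk is not shown in the slides. `apache_beam` and `tensorflow` are imported lazily, so they are only needed when the pipeline actually runs.

```python
import csv
import io


def parse_manifest_line(line):
    """Parse one 'gs://bucket/path.jpg,label' line into (uri, label)."""
    uri, label = next(csv.reader(io.StringIO(line)))
    return uri.strip(), label.strip()


def to_serialized_example(record):
    """Read the image bytes and pack them with the label into a tf.train.Example."""
    import tensorflow as tf  # lazy import: resolved on the worker at run time

    uri, label = record
    with tf.io.gfile.GFile(uri, "rb") as handle:
        image_bytes = handle.read()
    feature = {
        # Feature names are an assumption for this sketch.
        "image/encoded": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        "image/label": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[label.encode("utf-8")])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


def run(manifest_path, output_prefix, pipeline_args=None):
    """Build and run the Beam pipeline: manifest lines in, TFRecord shards out."""
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadManifest" >> beam.io.ReadFromText(manifest_path)
            | "ParseLine" >> beam.Map(parse_manifest_line)
            | "ToExample" >> beam.Map(to_serialized_example)
            | "WriteTFRecords" >> beam.io.WriteToTFRecord(
                output_prefix, file_name_suffix=".tfrecord")
        )
```

The same code runs locally with the default runner; passing arguments such as `--runner=DataflowRunner`, `--project`, `--region` and `--temp_location` in `pipeline_args` would submit it to Dataflow instead.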

Machine Learning
+ Model training & evaluation (AI Platform)
+ Convert model to TFLite (AI Platform)
+ Store evaluation metrics (BigQuery)
+ Model versions (GCS)
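
Storing metrics and versioned models could be as simple as the sketch below. The path layout, the row schema, and the helper names are assumptions made for illustration; the slides do not show the actual schema.

```python
from datetime import datetime, timezone


def model_version_uri(bucket, model_name, version):
    """Build a versioned GCS path for an exported model (hypothetical layout)."""
    return f"gs://{bucket}/models/{model_name}/v{version:04d}/model.tflite"


def metrics_row(model_name, version, f1, evaluated_at=None):
    """Build one JSON-serialisable row for an evaluation-metrics table."""
    evaluated_at = evaluated_at or datetime.now(timezone.utc)
    return {
        "model_name": model_name,
        "version": version,
        "f1": round(float(f1), 4),
        "evaluated_at": evaluated_at.isoformat(),
    }


def should_promote(candidate_f1, best_f1_so_far, min_gain=0.0):
    """Promote a new version only if it beats the best recorded F1 score."""
    return candidate_f1 > best_f1_so_far + min_gain


row = metrics_row("image-classifier", 7, 0.97)
print(model_version_uri("my-bucket", "image-classifier", 7))
print(row, should_promote(row["f1"], 0.45))
```

Rows in this shape could then be written to BigQuery, for example with `Client.insert_rows_json` from the `google-cloud-bigquery` library, which keeps a queryable history of every model version.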

Train, evaluate and deploy model
📈 Serverless, scalable model training with AI Platform
🔀 Distributed training & hyperparameter tuning
♻ TensorFlow code can run on AI Platform with no change
🔗 Easily attach GPUs and TPUs
📲 Convert to .tflite (3.5 MB with full integer quantization)
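
The .tflite conversion with full integer quantization might be sketched as follows, assuming a recent TensorFlow 2.x release and a model exported as a SavedModel. The function names, the limit of 100 calibration images, and the choice of `uint8` input/output are assumptions for this sketch, not details from the talk.

```python
def representative_batches(images, limit=100):
    """Yield up to `limit` single-input batches for converter calibration.

    Each image is expected to be a float32 array that already includes the
    batch dimension, e.g. shape (1, height, width, 3).
    """
    for index, image in enumerate(images):
        if index >= limit:
            break
        yield [image]


def convert_to_int8_tflite(saved_model_dir, images, output_path):
    """Convert a SavedModel to a fully integer-quantized .tflite file."""
    import tensorflow as tf  # lazy import: only needed when converting

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Calibration data lets the converter pick value ranges for activations.
    converter.representative_dataset = lambda: representative_batches(images)
    # Fail instead of silently falling back to float ops.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8

    tflite_model = converter.convert()
    with open(output_path, "wb") as handle:
        handle.write(tflite_model)
    return len(tflite_model)  # model size in bytes
```

The returned byte count makes it easy to log the model size next to the evaluation metrics for each version.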

Quantization
https://heartbeat.fritz.ai/8-bit-quantization-and-tensorflow-lite-speeding-up-mobile-inference-with-low-precision-a882dfcafbbd
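
The arithmetic behind 8-bit quantization can be sketched in plain Python. This is the common affine scheme (real value ≈ scale × (quantized value − zero point)), shown as an illustration under that assumption rather than as TensorFlow Lite's exact implementation; the function names are hypothetical.

```python
def choose_params(rmin, rmax, qmin=0, qmax=255):
    """Pick scale and zero point so [rmin, rmax] maps onto [qmin, qmax]."""
    # The range must contain 0.0 so that zero is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    if rmax == rmin:
        return 1.0, qmin
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a real value to the nearest 8-bit level, clamping out-of-range input."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an 8-bit level back to its approximate real value."""
    return scale * (q - zero_point)


scale, zero_point = choose_params(-1.0, 3.0)
for value in (-1.0, 0.0, 0.5, 3.0):
    q = quantize(value, scale, zero_point)
    print(value, q, dequantize(q, scale, zero_point))
```

Storing each weight in 8 bits instead of a 32-bit float shrinks the weights roughly fourfold, at the cost of a rounding error of at most half a quantization step for in-range values.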

Automation
Composer (orchestration)
Data preparation
+ Image Upload (GCS)
+ Convert images to TFRecords (Dataflow)
+ Data augmentation (AI Platform)
+ Train/Eval Sets (GCS)
Machine Learning
+ Model training & evaluation (AI Platform)
+ Convert model to TFLite (AI Platform)
+ Store evaluation metrics (BigQuery)
+ Model versions (GCS)
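
The ordering an orchestrator has to enforce can be sketched with the standard library alone (Python 3.9+). This is a stand-in for an Airflow DAG, not Composer's API; the task names and the dependency edges are assumptions read off the diagram.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
PIPELINE = {
    "upload_images": set(),
    "convert_to_tfrecords": {"upload_images"},
    "augment_data": {"convert_to_tfrecords"},
    "write_train_eval_sets": {"augment_data"},
    "train_and_evaluate": {"write_train_eval_sets"},
    "convert_to_tflite": {"train_and_evaluate"},
    "store_metrics": {"train_and_evaluate"},
    "store_model_version": {"convert_to_tflite"},
}


def execution_order(graph):
    """Return one valid run order; raises graphlib.CycleError on circular deps."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(PIPELINE)
print(order)
```

In a managed Airflow setup each key would become a task and each edge a dependency between tasks; the scheduler then handles retries, scheduling and running independent tasks in parallel.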

✅ Serverless ML pipeline built on GCP
✅ Exported to TFLite and deployed to Android and iOS apps
✅ Huge improvement of model performance

Custom MLOps on GCP
+ Orchestration of multiple pipelines ✔
+ Scalable ML Applications ✔
+ Monitoring and validation of ML predictions ✔
+ Ongoing data validation ✔
+ Model Governance ✔
+ Continuous Integration and Deployment ✔
Built with: AI Platform, Composer, BigQuery, GCS, GitLab

Takeaways
+ Building your own custom solution is not that hard!
+ Cloud vendors already offer easy-to-use, fully
managed and battle-tested components
+ New MLOps platforms are maturing rapidly
→ Kubeflow, TFX, MLflow
→ Can’t wait for 2020!

Thank you.
oliver@datatonic.com
@tinyoli
www.datatonic.com
