Deploying ML models in production, with or without CI/CD, is significantly more complicated than deploying traditional applications, mainly because ML models consist not just of their training code but also depend on the data they are trained on and on the supporting code. Monitoring ML models adds further complexity beyond what is usual for traditional applications. This talk covers these problems and best practices for solving them, with a special focus on how it is done on the Databricks platform.
2. inteligencija.com
Agenda
• Why is productionising Machine Learning hard?
• Overview of the Machine Learning lifecycle best practices
• Overview of how Databricks solves Machine Learning
productionisation
ML lifecycle – the naive version
[Diagram: a simple data → training → deployment pipeline, annotated with the questions it glosses over]
• Streaming data? Is the data fresh? Schema changes?
• ETL code versioning? ETL testing?
• Has the data distribution changed?
• Which data was used for training?
• Preparation code versioning and testing?
• Are all the features really needed?
• Are features in prod equivalent to the ones from training?
• What parameters and algos worked?
• Which environment? Can the environment be reproduced?
• Is the model performance still OK?
• Is the whole pipeline integration tested?
How is ML development different?
"The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction", Breck et al., Google 2017
What is DevOps?
Image source: Wikipedia
• Application code versioning
• Continuous integration – CI
• Continuous deployment – CD
• Automated testing
• Infrastructure as code
• Configuration management
• Monitoring
What is DataOps?
Image source: Monte Carlo Data
• ETL/ELT pipelines
• Code versioning
• Data lineage
• Data testing
• Data privacy
• Data self service
• Feature engineering
What is ModelOps?
Image source: Aksel Yap
• Feature engineering
• Tracking experiments
• Model validation and testing
• Versioning of model code
• Apply models to real-life data (deployment)
• Model performance monitoring
Deployment paradigms
• Batch – most applications
• Streaming – latency of seconds to minutes
• Real time – latency under one second
• Edge (on-device) – specially tuned models
Data best practices
• Version control data pipeline code
• Document feature expectations and automate data quality
checks
• Design for reusable and modular data pipelines
• Test feature creation and data processing code
• Use a Feature Store to ensure that features are consistent
across different models and pipelines
• Adopt CI/CD for data pipelines
• Adopt Infrastructure as Code
• Beware of sensitive data and compliance requirements
• Watch for training/serving skew – check that training and serving features are computed in the same way (a.k.a. online/offline skew)
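The skew check in the last bullet can be sketched in plain Python: compute the same feature through the training-path and serving-path implementations and assert they agree. The feature and function names here are hypothetical, chosen only to illustrate the pattern:

```python
# Hypothetical feature: average order value, computed two ways.

def avg_order_value_training(orders):
    # Training path: batch computation over a list of order amounts.
    return sum(orders) / len(orders) if orders else 0.0

def avg_order_value_serving(orders):
    # Serving path: incremental computation, as an optimized
    # online implementation might do it.
    total, count = 0.0, 0
    for amount in orders:
        total += amount
        count += 1
    return total / count if count else 0.0

def check_skew(orders, tolerance=1e-9):
    """Fail loudly if the two feature paths disagree."""
    offline = avg_order_value_training(orders)
    online = avg_order_value_serving(orders)
    assert abs(offline - online) <= tolerance, (
        f"training/serving skew: {offline} vs {online}"
    )
    return offline

check_skew([10.0, 20.0, 30.0])
```

In practice a check like this would run over a sample of real entities in CI, before a new serving implementation is rolled out.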
Model best practices
• Version control model training code and track experiments
• Model testing:
• Check for feature usefulness and cost
• Tune all hyperparameters
• Compare models to simpler alternatives – sanity check
• Test performance on important subsets of data (e.g.
regions)
• Understand the real-world impact of the model outputs
• Use canary deployments and A/B testing in production
• Have a rollback strategy
• Monitor for model degradation in production
• Understand how fast the model goes stale
• Set up automatic retraining pipelines (continuous learning)
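The "compare to simpler alternatives" sanity check above can be sketched in a few lines: a candidate model should clearly beat a trivial baseline such as always predicting the training mean. The numbers below are made up for illustration:

```python
# Sanity check: compare a candidate model against a trivial baseline.

def mae(y_true, y_pred):
    # Mean absolute error over paired true/predicted values.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

train_y = [3.0, 5.0, 7.0, 9.0]
test_y  = [4.0, 6.0, 8.0]

# Trivial baseline: predict the training mean for every row.
mean_pred = sum(train_y) / len(train_y)
baseline_mae = mae(test_y, [mean_pred] * len(test_y))

# Stand-in for the candidate model's test-set predictions.
model_preds = [4.5, 5.5, 7.5]
model_mae = mae(test_y, model_preds)

# Promote only if the model clearly beats the baseline.
assert model_mae < baseline_mae, "model is no better than predicting the mean"
```

A check like this is cheap to automate as a gate in the promotion pipeline.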
Delta Lake
• ACID transactions – ensures data consistency and reliability
• Schema enforcement and evolution – helps with data quality
• Time travel (Data versioning) – facilitates experimentation
• Deletes and upserts (MERGE INTO) – iterative and incremental
feature preparation
• Data skipping and other optimizations – improves
performance
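The MERGE INTO (upsert) behaviour above can be modelled in plain Python, ignoring transactions and storage: rows whose key matches are updated, the rest are inserted. The table layout here is a made-up toy:

```python
def merge_into(target, updates, key="id"):
    """Toy model of Delta Lake's MERGE INTO, with tables as lists of dicts:
    update rows whose key matches, insert the rest."""
    by_key = {row[key]: dict(row) for row in target}
    for row in updates:
        if row[key] in by_key:
            by_key[row[key]].update(row)      # WHEN MATCHED THEN UPDATE
        else:
            by_key[row[key]] = dict(row)      # WHEN NOT MATCHED THEN INSERT
    return sorted(by_key.values(), key=lambda r: r[key])

features = [{"id": 1, "clicks": 3}, {"id": 2, "clicks": 5}]
new_batch = [{"id": 2, "clicks": 6}, {"id": 3, "clicks": 1}]
print(merge_into(features, new_batch))
# [{'id': 1, 'clicks': 3}, {'id': 2, 'clicks': 6}, {'id': 3, 'clicks': 1}]
```

Delta Lake does the same keyed update/insert, but atomically and at table scale, which is what makes incremental feature preparation practical.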
Delta Live Tables
• Framework for building data processing pipelines
• You define transformations and DLT manages:
• Orchestration
• Cluster management
• Monitoring
• Data quality (Expectations)
• Error handling
• Can perform CDC with APPLY CHANGES INTO .. FROM ..
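DLT Expectations attach data-quality predicates to a table and can drop, flag, or fail on violating rows. The idea can be sketched without DLT; the column names and predicate below are made up:

```python
def expect_or_drop(rows, name, predicate):
    """Keep rows satisfying the expectation and report how many were
    dropped — mimicking DLT's expect_or_drop policy on a list of dicts."""
    kept = [r for r in rows if predicate(r)]
    dropped = len(rows) - len(kept)
    print(f"expectation {name!r}: kept {len(kept)}, dropped {dropped}")
    return kept

events = [
    {"user_id": 1, "amount": 25.0},
    {"user_id": None, "amount": 10.0},   # fails: missing user_id
    {"user_id": 3, "amount": -5.0},      # fails: negative amount
]

clean = expect_or_drop(
    events, "valid_row",
    lambda r: r["user_id"] is not None and r["amount"] >= 0,
)
```

In DLT itself this is declared with decorators such as `@dlt.expect_or_drop`, and the kept/dropped counts surface in the pipeline's quality monitoring.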
Unity Catalog
• Centralized data discovery and access – quick search for and
reuse of existing datasets
• Centralized data governance and security – Fine-grained
access control management from a central location
• Data lineage – tables, columns, notebooks, workflows and
dashboards provide automatically collected lineage information
• Collaboration – Cross-workspace sharing enables teams to
share datasets across projects without data duplication,
promoting consistent use
Feature Store
• Centralized Feature Management – discoverable and reusable
• Any table in Unity Catalog can serve as a feature table (since
DBR 13.2)
• Lineage – upstream and downstream
• Consistency Across Models – Features used for training
models are also served in production
• Simplified Serving – models include feature metadata
• Use it consistently (log_model) so that feature usage is tracked
• You can publish features to online stores (Amazon
DynamoDB, Aurora or RDS MySQL)
• for models served with Databricks Model Serving
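The consistency guarantee above comes from training and serving assembling feature vectors through the same keyed lookup against the same table. A toy version, with made-up table and feature names:

```python
# Toy feature table keyed by entity id; the same lookup function is
# used both to assemble training rows and to serve online requests.
feature_table = {
    "user_42": {"avg_session_min": 12.5, "purchases_30d": 3},
    "user_99": {"avg_session_min": 2.0,  "purchases_30d": 0},
}

def lookup_features(entity_id, feature_names):
    # Single source of truth: both paths read the same stored values.
    row = feature_table[entity_id]
    return [row[f] for f in feature_names]

training_vec = lookup_features("user_42", ["avg_session_min", "purchases_30d"])
serving_vec  = lookup_features("user_42", ["avg_session_min", "purchases_30d"])
assert training_vec == serving_vec
```

A real Feature Store adds versioning, point-in-time correctness, and online/offline synchronization on top of this basic contract.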
MLflow
Integrated within the Databricks platform (notebooks and
workflows):
• MLflow Tracking – Log and query experiments and runs in
terms of code, data, config, and results
• MLflow Projects – Package data science code in a reusable,
reproducible form to share with other data scientists or transfer
to production.
• MLflow Models – Manage and deploy machine learning models
from a variety of ML libraries to a variety of model serving and
inference platforms.
• MLflow Model Registry – A centralized model store, set of APIs,
and UI, to collaboratively manage the lifecycle of a MLflow
Model.
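MLflow Tracking records, per run, the parameters, metrics, and artifacts that produced a model (via `mlflow.start_run`, `mlflow.log_param`, `mlflow.log_metric`). A minimal stdlib-only sketch of that per-run record, with illustrative experiment and parameter names:

```python
import json
import time
import uuid

def log_run(params, metrics, experiment="churn-model"):
    """Record one training run as a JSON document, MLflow-Tracking style.
    Experiment, parameter, and metric names here are illustrative."""
    run = {
        "run_id": uuid.uuid4().hex,
        "experiment": experiment,
        "start_time": time.time(),
        "params": params,      # e.g. hyperparameters
        "metrics": metrics,    # e.g. validation scores
    }
    return json.dumps(run)

record = log_run({"max_depth": 5, "lr": 0.1}, {"val_auc": 0.87})
run = json.loads(record)
```

The value of keeping such records is exactly the "what parameters and algos worked?" question from the lifecycle diagram: every result stays queryable and reproducible.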
AutoML
• Generates ML classification, regression or forecasting code
automatically, based on an input table and the target field
• Features from the Feature Store can be joined
• Jupyter notebooks with code for splitting data, setting up
libraries etc.
• Provides a good starting point for experiments and/or models
ready to be registered
CI/CD integration
• Databricks Repos UI is used for checking out Git branches,
merging and pushing changes
• It provides a REST API that can be invoked by Git automation
• In production you can:
• directly reference notebooks in remote Git repos by tags or
branches
• set up read-only folders with checked-out repos and update
them automatically using Git automation
• MLflow Model Registry provides an API so that Git automation
can automatically transition models between environments
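The Registry-driven promotion in the last bullet is essentially a gated stage transition. Sketched without the real MlflowClient API, using the classic Registry stage names:

```python
# Legal stage transitions, mirroring the classic Model Registry stages.
ALLOWED = {
    "None": {"Staging"},
    "Staging": {"Production", "Archived"},
    "Production": {"Archived"},
}

def transition(model, target_stage, tests_passed):
    """Move a model version to a new stage only if tests pass and the
    transition is legal — what Git automation would call the Registry
    API for."""
    if not tests_passed:
        raise ValueError("tests failed; refusing transition")
    if target_stage not in ALLOWED[model["stage"]]:
        raise ValueError(f"illegal transition {model['stage']} -> {target_stage}")
    model["stage"] = target_stage
    return model

m = {"name": "churn", "version": 7, "stage": "None"}
transition(m, "Staging", tests_passed=True)
transition(m, "Production", tests_passed=True)
assert m["stage"] == "Production"
```

Encoding the gate in CI (rather than in someone's head) is what makes promotions auditable and repeatable.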
Promoting code and models
Use Git branches to separate code versions:
• dev branch for development
• specific feature branches for feature development
• release branches for different versions
Lifecycle of models might be independent of code
The recommended approach for model promotion
The workflow recommended by Databricks:
• Dev environment:
• Develop training and other code
• Promote code
• Staging environment:
• Test training code on subset of data
• Test other code
• Promote code
• Prod environment:
• Train model on production data
• Test model
• Deploy model
• Deploy code
Links & Resources
• Big Book of MLOps – Databricks
• The Comprehensive Guide to Feature Stores – Databricks
• ML in Production – Databricks course
• https://github.com/databricks/mlops-stacks
• https://ml-ops.org/
• https://cloud.google.com/architecture/mlops-continuous-
delivery-and-automation-pipelines-in-machine-learning
We are a Data & Analytics consulting company committed to delivering great solutions and products that enable our clients to unlock hidden opportunities within data, become data-driven, and make better business decisions.
Our goal is to enable data-driven business decisions.
• Offices in UK, Sweden, Austria, Slovenia and Croatia
• 180+ employees
• 20 years in Data & Analytics
• 250+ projects
• 100+ clients on 5 continents
Data extraction & preparation – data collection, cleaning, wrangling, aggregating, transforming, feature engineering, etc.
Exploratory Data Analysis – gaining an understanding of the data, its statistical properties and its mapping to business use case
Model Training – train models on training data
Model Validation – validate models on validation data
Deployment – deploy models to production
Monitoring – keeping track of how well models behave in production
It should be clear by now that managing ML development and deployment is more complicated than traditional software development, and hence we need a distinct methodology: MLOps.
The whole point of DevOps is to enable fast, flexible and reliable delivery of applications. Besides development, it also comprises configuration management and monitoring.
Training/serving skew can arise, for example, when optimized code running in production computes features differently from the training pipeline.