MLOps pipelines using MLFlow - From training to production
Dr. Andreas Weiden, skillbyte
CAIML#24, April 13th, 2023
Problem Description
Team A, a (majority) Data Scientist team, creates many machine learning models
Team B, a (majority) Data Engineer team, needs to deploy these models into production and use them for a recommender system
So two problems:
1. Technical
Need to deploy these models to multiple targets (→ this talk)
2. Organizational
Need to make two teams work together (→ not this talk)
MLOps
Got its name from DevOps and GitOps
Continuous training and deployment for machine learning systems
ml-ops.org
Pipelines
Created and maintained by Team A
Daily training runs, since fresh data is constantly coming in
Output various artifacts which are needed by the prediction services, run by Team B
Examples of what the Data Science Magic™ can be:
popularity of items
user embeddings from user-item interactions
item embeddings from item descriptions
Need somewhere to store the outputs of those pipelines
And deploy them, too
MLflow
Manage end-to-end machine learning lifecycle
Open source: GitHub
Four pillars: Tracking, Projects, Models, Registry
Tracking
Log parameters, code versions, metrics, artifacts
Basic unit is a Run
Whenever your pipeline runs, a new Run is created
Runs can be grouped under Experiments
You can add arbitrary data to a Run as well as the output artifacts
import datetime
import pickle
import random
from tempfile import TemporaryDirectory

import mlflow
import numpy as np

# Group all runs of this pipeline under one experiment
mlflow.set_experiment("Pipeline A")
run_name = f"Pipeline A {datetime.datetime.now().isoformat()}"
tags = {"version": "0.0.1"}
with mlflow.start_run(run_name=run_name, tags=tags) as run:
    mlflow.log_param("ndims", 1024)
    mlflow.log_metric("recall", random.random())  # placeholder metric
    with TemporaryDirectory() as temp_dir:
        with open(f"{temp_dir}/out.pickle", "wb") as f:
            pickle.dump([np.random.rand(1024) for _ in range(100)], f)
        # Upload everything in temp_dir as artifacts of this run
        mlflow.log_artifacts(temp_dir)
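On the consuming side, a prediction service eventually has to read such an artifact back. The following is a minimal sketch of that step, assuming the file has already been downloaded to local disk (the download itself is not shown). The function name `load_embeddings` and the dimension check are illustrative assumptions, and plain lists stand in for the NumPy arrays so the sketch runs with the standard library only.

```python
import pickle
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import List, Sequence


def load_embeddings(path: Path, expected_dim: int) -> List[Sequence[float]]:
    """Unpickle a list of vectors and check that each has the expected length."""
    with open(path, "rb") as f:
        vectors = pickle.load(f)  # only unpickle files from a trusted source
    bad = [i for i, v in enumerate(vectors) if len(v) != expected_dim]
    if bad:
        raise ValueError(f"{len(bad)} vectors do not have dimension {expected_dim}")
    return vectors


# Round trip with a tiny stand-in artifact (3 vectors of dimension 4)
with TemporaryDirectory() as temp_dir:
    path = Path(temp_dir) / "out.pickle"
    with open(path, "wb") as f:
        pickle.dump([[0.0] * 4 for _ in range(3)], f)
    loaded = load_embeddings(path, expected_dim=4)
```

Checking the dimension against the logged `ndims` parameter is one possible way to catch a mismatch between pipeline and service early.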
Projects
Package Data Science code including dependencies
Git and containerization already do this, if you lock your dependencies, which you should
Models
Package ML models and deploy them
Containers and/or model artifacts
A standard format for packaging machine learning models that can be used in a variety of downstream tools, e.g.
real-time serving through a REST API
batch inference on Apache Spark
Saves model-specific data and environment data:
# Directory written by mlflow.sklearn.save_model(model, "my_model")
my_model/
├── MLmodel
├── model.pkl
├── conda.yaml
├── python_env.yaml
└── requirements.txt
Registry
Model storage and lifecycle (versioning, stage transitions)
Can associate runs with a Model:
import mlflow

with mlflow.start_run() as run:
    ...  # pipeline body elided in the original

# Register the run's artifacts as a new version of "Model A"
mlflow.register_model(f"runs:/{run.info.run_id}", "Model A")
Sounds interesting, but only very rudimentary
Deployment options
ml-ops.org
Allow deploying arbitrary machine learning artifacts
Full control of the API
Not so fast…
… managed vector stores are also a thing
They give you e.g.
pre-filtering
a full-blown query syntax
Quite a few options exist nowadays
Elasticsearch
Google Vertex AI
RediSearch
Milvus
…
→ Need a generic way to deploy machine learning artifacts to multiple targets
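To make "pre-filtering" concrete, here is a minimal pure-Python sketch. It is not how any of the listed stores implement it: candidates are first restricted by a metadata field, and only the survivors are scored by dot product. The record layout (`id`, `vector`, `type`) mirrors the example documents used in this talk; the function name `search` and its parameters are illustrative assumptions.

```python
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def search(items: List[Dict], query: Sequence[float], item_type: str, k: int = 2) -> List[str]:
    """Return the ids of the top-k items of the given type, ranked by dot product."""
    # Pre-filter: drop everything that does not match the metadata constraint
    candidates = [item for item in items if item["type"] == item_type]
    # Score only the remaining candidates (brute force, fine for a sketch)
    ranked = sorted(candidates, key=lambda item: dot(item["vector"], query), reverse=True)
    return [item["id"] for item in ranked[:k]]


items = [
    {"id": "a", "vector": [1.0, 0.0], "type": "bar"},
    {"id": "b", "vector": [0.9, 0.1], "type": "baz"},
    {"id": "c", "vector": [0.0, 1.0], "type": "bar"},
]
result = search(items, query=[1.0, 0.0], item_type="bar", k=1)
```

A managed store does the same thing at scale with an index instead of a full scan, which is a large part of why it is attractive as a deployment target.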
Watcher
Simple microservice that periodically polls the MLflow registry
Pushes the updated artifacts to all ML deployment options
Uploads embeddings to managed databases
Updates ConfigMap definitions with the correct Run ID per model and environment
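The talk does not show the Watcher's code, so the following is only a sketch of what one polling step could look like. `poll_once`, the `GetLatestRunIds` and `Target` callables, and the in-memory `deployed` dictionary are all hypothetical names chosen for illustration; the registry lookup and the real targets are passed in as plain functions.

```python
from typing import Callable, Dict, List

# Hypothetical signatures, not taken from the talk
GetLatestRunIds = Callable[[], Dict[str, str]]  # model name -> latest run ID
Target = Callable[[str, str], None]             # push (model name, run ID) to one target


def poll_once(get_latest: GetLatestRunIds, targets: List[Target], deployed: Dict[str, str]) -> List[str]:
    """Push every model whose run ID changed to all targets; return the updated model names."""
    updated = []
    for model, run_id in get_latest().items():
        if deployed.get(model) == run_id:
            continue  # nothing new for this model
        for push in targets:
            push(model, run_id)
        deployed[model] = run_id  # recorded only after every target accepted the push
        updated.append(model)
    return updated


# Usage with in-memory stand-ins for the registry and a single target
pushed = []
deployed: Dict[str, str] = {}
registry = lambda: {"Model A": "1234abcde"}
target = lambda model, run_id: pushed.append((model, run_id))
first = poll_once(registry, [target], deployed)
second = poll_once(registry, [target], deployed)
```

A real service would call such a step in a loop with a sleep in between and would need to persist `deployed` somewhere; both are left out here.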
Deployment targets
Elasticsearch
POST /${ES_INDEX}/_doc/ HTTP/1.1
Host: ${ES_URL}
Content-Type: application/json

{
  "id": "foo",
  "vector": [1, 2, 3],
  "popularity": 123,
  "type": "bar"
}
Google Vertex AI
POST /v1/${INDEX_URL}:upsertDatapoints HTTP/1.1
Host: ${VERTEX_ENDPOINT}
Content-Type: application/json
Authorization: Bearer `gcloud auth print-access-token`

{
  "datapoints": [
    {
      "datapoint_id": "foo",
      "feature_vector": [1, 2, 3],
      "restricts": {
        "namespace": "type",
        "allow_list": ["bar"]
      }
    }
  ]
}
Kubernetes ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: foo-configmap
data:
  RUN_ID: "1234abcde"
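All three payloads above carry the same underlying record, so a deployment service could derive them from one source. The sketch below builds the three bodies as plain dictionaries with the field names shown in the examples; the helper names (`to_es_doc`, `to_vertex_upsert`, `to_configmap`) are hypothetical, and sending the requests is deliberately left out.

```python
import json
from typing import Dict, List


def to_es_doc(item_id: str, vector: List[float], popularity: int, item_type: str) -> Dict:
    """Document body in the shape of the Elasticsearch example."""
    return {"id": item_id, "vector": vector, "popularity": popularity, "type": item_type}


def to_vertex_upsert(item_id: str, vector: List[float], item_type: str) -> Dict:
    """Request body in the shape of the Vertex AI example."""
    return {
        "datapoints": [
            {
                "datapoint_id": item_id,
                "feature_vector": vector,
                "restricts": {"namespace": "type", "allow_list": [item_type]},
            }
        ]
    }


def to_configmap(name: str, run_id: str) -> Dict:
    """ConfigMap manifest as a dictionary, in the shape of the Kubernetes example."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {"RUN_ID": run_id},
    }


es_body = json.dumps(to_es_doc("foo", [1, 2, 3], 123, "bar"))
vertex_body = to_vertex_upsert("foo", [1, 2, 3], "bar")
configmap = to_configmap("foo-configmap", "1234abcde")
```

Keeping the builders free of I/O makes them easy to unit-test independently of the targets.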
ConfigMap reloader
https://github.com/stakater/Reloader
Ensures that all deployments that rely on a ConfigMap get restarted whenever that ConfigMap changes
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    configmap.reloader.stakater.com/reload: "foo-configmap"
spec:
  template:
    spec:
      containers:
        - name: foo
          image: foo:0.0.1
          env:
            - name: MLFLOW_RUN_ID
              valueFrom:
                configMapKeyRef:
                  name: foo-configmap
                  key: RUN_ID
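Inside the container, the service only sees the `MLFLOW_RUN_ID` environment variable. A minimal sketch of how it might turn that into an artifact location at startup follows; the helper name `artifact_uri_from_env` and the artifact path `out.pickle` are assumptions for illustration, and the actual download is not shown.

```python
import os
from typing import Mapping


def artifact_uri_from_env(artifact_path: str, env: Mapping[str, str] = os.environ) -> str:
    """Build a runs:/ URI from the MLFLOW_RUN_ID variable injected by the Deployment."""
    run_id = env.get("MLFLOW_RUN_ID", "").strip()
    if not run_id:
        # Fail fast at startup instead of serving without a model
        raise RuntimeError("MLFLOW_RUN_ID is not set")
    return f"runs:/{run_id}/{artifact_path}"


# The env mapping is injectable so the function can be tested without touching os.environ
uri = artifact_uri_from_env("out.pickle", env={"MLFLOW_RUN_ID": "1234abcde"})
```

Because the variable is read once at startup, a restart is what picks up a new run ID when the ConfigMap changes.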
Alternatives
Possible alternatives that we considered:
Model deployment directly through MLFlow
Seldon Core
AWS SageMaker
(Got more? Let me know!)
However, they all have the same drawbacks:
No control over final images
Image size not optimised
Custom logging, metrics, tracing, … difficult
No control over API
Only deploy to REST APIs, but we also want other targets
Summary
Embeddings and models are centrally produced → Need some central model storage
Each of the targets supports training and/or deploying ML models individually, but none of them supports doing so for all targets
→ If your needs are diverse enough, you may need to roll your own (ML deployment)
→ But use existing tools where applicable
Questions