SlideShare a Scribd company logo
1 of 41
Model Serving
Andre Mesarovic
Resident Solutions Architect at Databricks
19 November 2020 - updated 10 March 2021
Agenda
• MLflow model serving
• How to score models with MLflow
○ Online scoring with MLflow scoring server
○ Offline scoring with Spark
• Custom model deployment and scoring
Agenda
Offline Prediction
● High throughput
● Bulk predictions
● Predict with Spark
● Batch Prediction
● Structured Streaming
Prediction
Online Prediction
● Low latency
● Individual predictions
● Real-time model serving
● MLflow scoring server
● MLflow deployment plugin
● Serving outside MLflow
● Databricks model serving
Vocabulary - Synonyms
Model Serving
● Model serving
● Model scoring
● Model prediction
● Model inference
● Model deployment
Prediction
● Offline == batch prediction
○ Spark batch prediction
○ Spark streaming prediction
● Online == real-time prediction
Model Serving Ecosystem
Offline Scoring
• Spark is ideally suited for offline scoring
• Can use either Spark batch or structured streaming
• Use SQL with MLflow UDFs
• Distributed scoring with Spark
• Load model from MLflow Model Registry
MLflow Offline Scoring Example
Directly invoke model
udf = mlflow.pyfunc.spark_udf(spark, model_uri)
spark.udf.register("predict", udf)
%sql select *, predict(*) as prediction from my_data
UDF - SQL
model = mlflow.spark.load_model(model_uri)
predictions = model.transform(data)
UDF - Python
udf = mlflow.pyfunc.spark_udf(spark, model_uri)
predictions = data.withColumn("prediction", udf(*data.columns))
model_uri = "models:/sklearn-wine/production"
Model URI
Online Scoring
• Score models outside of Spark
• Variety of real-time deployment options:
○ REST API - web servers - self-managed or as containers in cloud providers
○ Embedded in browsers, .e.g TensorFlow Lite
○ Edge devices: mobile phones, sensors, routers, etc.
○ Hardware accelerators - GPU, NVIDIA, Intel
Options to serve real-time models
• Embed model in application code
• Model deployed as service
• Model published as data
• Martin Fowler’s Continuous Delivery for Machine Learning
MLflow Scoring Server
• Basis of different MLflow deployment targets
• Standard REST API:
○ Input: CSV or JSON (pandas-split or pandas-records formats)
○ Output: JSON list of predictions
• Implementation: Python Flask or Java Jetty server
• MLflow CLI builds a server with embedded model
• See Built-In Deployment Tools
MLflow Scoring Server deployment options
• Local web server
• SageMaker docker container
• Azure ML docker container
• Generic docker container
MLflow Scoring Server container types
Python Server
Model and its ML
framework embedded
in Flask server.
Only for non-Spark ML
models.
Python Server + Spark
Flask server delegates
scoring to Spark
server. Only for Spark
ML models.
Java MLeap Server
Model and MLeap
runtime are embedded
in Jetty server. Only
for Spark ML models.
No Spark runtime.
MLflow Scoring Server container types
Launch Local MLflow Scoring Server
mlflow models serve --model-uri models:/sklearn-iris/production --port 5001
Launch server
Launch local server
CSV format
curl http://localhost:5001/invocations 
-H "Content-Type:text/csv" 
-d '
sepal_length,sepal_width,petal_length,petal_width
5.1,3.5,1.4,0.2
4.9,3.0,1.4,0.2'
Launch server
Request
[0, 0]
Launch server
Response
JSON split-oriented format
curl http://localhost:5001/invocations 
-H "Content-Type:application/json" 
-d ' {
"columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"],
"data": [
[ 5.1, 3.5, 1.4, 0.2 ],
[ 4.9, 3.0, 1.4, 0.2 ] ] }'
Launch server
Request
[0, 0]
Launch server
Response
JSON record-oriented format
curl http://localhost:5001/invocations 
-H "Content-Type:application/json; format=pandas-records" 
-d '[
{ "sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2 },
{ "sepal_length": 4.9, "sepal_width": 3.0, "petal_length": 1.4, "petal_width": 0.2 } ]'
Launch server
Request
[0, 0]
Launch server
Response
MLflow SageMaker and Azure ML containers
• Two types of containers to deploy to cloud providers
○ Python container - Flask web server process with embedded model
○ SparkML container - Flask web server process and Spark process
• SageMaker container
○ Most versatile container type
○ Can run in local mode on laptop as regular docker container
Python container
• Flask web server
• Model (e.g. sklearn or Keras/TF) is embedded inside web server
SparkML container
• Two processes:
○ Flask web server
○ Spark server - OSS Spark - no Databricks
• Web server delegates scoring to Spark server
MLflow SageMaker container deployment
• CLI:
○ mlflow sagemaker build-and-push-container
○ mlflow sagemaker deploy
○ mlflow sagemaker run-local
• API: mlflow.sagemaker API
• Deploy a model on Amazon SageMaker
mlflow sagemaker build-and-push-container --build --container my-container
mlflow sagemaker deploy --app-name wine-quality 
--model-uri models:/wine-quality/production 
--image-url my-image-url 
--region us-west-2 --mode replace
Launch server
Launch container on SageMaker
Launch container on SageMaker
mlflow sagemaker build-and-push-container --build --no-push --container my-container
mlflow sagemaker run-local 
--model-uri models:/wine-quality/production 
--port 5051 --image my-container
Launch server
Launch local container
Deploy MLflow SageMaker container
MLflow AzureML container deployment
• Deploy to Azure Kubernetes Service (AKS) or Azure Container
Instances (ACI)
• CLI - mlflow azureml build-image
• API - mlflow.azureml.build-image
• Deploy a python_function model on Microsoft Azure ML
model_image, azure_model = mlflow.azureml.build_image(
model_uri="models:/sklearn_wine",
workspace="my_workspace",
model_name="sklearn_wine",
image_name="sklearn_wine_image",
description="Sklearn Wine Quality",
synchronous=True)
Deploy MLflow Azure ML container
MLeap
• Non-Spark serialization format for Spark models
• Advantage is ability to score Spark model without overhead of
Spark
• Faster than scoring Spark ML models with Spark
• Problem is stability, maturity and lack of dedicated commercial
support
End-to-end ML Pipeline Example with MLflow
See mlflow-examples - e2e-ml-pipeline
Databricks Model Serving
• Expose MLflow model predictions as REST endpoint
• One-node cluster is automatically provisioned
• HTTP endpoint publicly exposed
• Limited production capability - intended for light loads and testing
• Production-grade serving on the product roadmap
Model Serving on Databricks
Model Serving
Turnkey serving
solution to expose
models as REST
endpoints
Reports
Applications
...
REST
Endpoint
Tracking
Record and query
experiments: code,
metrics, parameters,
artifacts, models
Models
General format
that standardizes
deployment options
Logged
Model
Model Registry
Centralized and
collaborative
model lifecycle
management
Staging Production Archived
Data Scientists Deployment Engineers
Databricks Model Serving Example
Databricks Model Serving Resources
• Blog posts
○ Quickly Deploy, Test, and Manage ML Models as REST Endpoints with MLflow
Model Serving on Databricks - 2020-11-02
○ Announcing MLflow Model Serving on Databricks - 2020-06-25
• Documentation
○ MLflow Model Serving on Databricks
MLflow Deployment Plugins
• MLflow Deployment Plugins - Deploy model to custom serving tool
• Current deployment plugins:
○ https://github.com/RedisAI/mlflow-redisai
○ https://github.com/mlflow/mlflow-torchserve
MLflow PyTorch
• MLflow PyTorch integration released in MLflow 1.12.0 (Nov. 2020)
• Features:
○ PyTorch Autologging
○ TorchScript Models
○ TorchServing
• Deploy MLflow PyTorch models into TorchServe
MLflow PyTorch and TorchServe Resources
• PyTorch and MLflow Integration Announcement - 2020-11-12
• MLflow 1.12 Features Extended PyTorch Integration - 2012-11-13
• MLflow and PyTorch — Where Cutting Edge AI meets MLOps
• https://github.com/mlflow/mlflow-torchserve
MLflow and TensorFlow Serving Custom
Deployment
• Example how to deploy models to a custom deployment target
• MLflow deployment plugin has not yet been developed
• Deployment builds a standard TensorFlow Serving docker
container with a MLflow model from Model Registry
• https://github.com/amesar/mlflow-
tools/tree/master/mlflow_tools/tensorflow_serving
Keras/TensorFlow Model Formats
• HD5 format
○ Default in Keras TF 1.x but is legacy in Keras TF 2.x
○ Not supported by TensorFlow Serving
• SavedModel format
○ TensorFlow standard format is default in Keras TF 2.x
○ Supported by TensorFlow Serving
MLflow Keras/TensorFlow Flavors
• MLflow currently only supports Keras HD5 format as flavor
• MLflow does not support SavedModel flavor until this is resolved:
○ https://github.com/mlflow/mlflow/issues/3224
○ https://github.com/mlflow/mlflow/pull/3552
• Can save SavedModel as an unmanaged artifact for now
and
Serving
Serving
Ideal Current
TensorFlow Serving Example
docker run -t --rm --publish 8501 
--volume /opt/mlflow/mlruns/1/f48dafd70be044298f71488b0ae10df4/artifacts/tensorflow-model:/models/keras_wine
--env MODEL_NAME=keras_wine 
tensorflow/serving
Launch server
Launch server in Docker container
curl -d '{"instances": [12.8, 0.03, 0.48, 0.98, 6.2, 29, 1.2, 0.4, 75 ] }' 
-X POST http://localhost:8501/v1/models/keras_wine:predict
Launch server
Request
{ "predictions": [[-0.70597136]] }
Launch server
Response
MLflow Keras/TensorFlow Run Example
MLflow Flavors - Managed Models
● Models
○ keras-hd5-model
○ onnx-model
● Can be served as PyFunc models
MLflow Unmanaged Models
● Models
○ tensorflow-model - Standard TF SavedModel format
○ tensorflow-lite-model
○ tensorflow.js model
● Load as raw artifacts and then serve manually
● Sample code: wine_train.py and wine_predict.py
ONNX
• Interoperable model format supported by:
○ MSFT, Facebook, AWS, Nvidia, Intel, etc.
• Train once, deploy anywhere (in theory) - depends on ML tool
• Save MLflow model as ONNX flavor
• Deploy in standard MLflow scoring server
• ONNX runtime is embedded in Python server in MLflow container
Thank You
Happy model serving!

More Related Content

What's hot

Use MLflow to manage and deploy Machine Learning model on Spark
Use MLflow to manage and deploy Machine Learning model on Spark Use MLflow to manage and deploy Machine Learning model on Spark
Use MLflow to manage and deploy Machine Learning model on Spark Herman Wu
 
Ml ops deployment choices
Ml ops   deployment choicesMl ops   deployment choices
Ml ops deployment choicesAvinash Patil
 
모델 서빙 파이프라인 구축하기
모델 서빙 파이프라인 구축하기모델 서빙 파이프라인 구축하기
모델 서빙 파이프라인 구축하기SeongIkKim2
 
DevOps for Applications in Azure Databricks: Creating Continuous Integration ...
DevOps for Applications in Azure Databricks: Creating Continuous Integration ...DevOps for Applications in Azure Databricks: Creating Continuous Integration ...
DevOps for Applications in Azure Databricks: Creating Continuous Integration ...Databricks
 
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Databricks
 
Model versioning done right: A ModelDB 2.0 Walkthrough
Model versioning done right: A ModelDB 2.0 WalkthroughModel versioning done right: A ModelDB 2.0 Walkthrough
Model versioning done right: A ModelDB 2.0 WalkthroughManasi Vartak
 
Hamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature StoreHamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature StoreMoritz Meister
 
Building machine learning applications locally with Spark — Joel Pinho Lucas ...
Building machine learning applications locally with Spark — Joel Pinho Lucas ...Building machine learning applications locally with Spark — Joel Pinho Lucas ...
Building machine learning applications locally with Spark — Joel Pinho Lucas ...PAPIs.io
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsFatih Baltacı
 
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Databricks
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowJan Kirenz
 
AWS IoT Edge Management
AWS IoT Edge ManagementAWS IoT Edge Management
AWS IoT Edge ManagementAvinash Patil
 
MLFlow 1.0 Meetup
MLFlow 1.0 Meetup MLFlow 1.0 Meetup
MLFlow 1.0 Meetup Databricks
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsDataPhoenix
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkDatabricks
 
Productionzing ML Model Using MLflow Model Serving
Productionzing ML Model Using MLflow Model ServingProductionzing ML Model Using MLflow Model Serving
Productionzing ML Model Using MLflow Model ServingDatabricks
 
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycleKyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycleLviv Startup Club
 
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusRobust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusManasi Vartak
 
ML-Ops: Philosophy, Best-Practices and Tools
ML-Ops:Philosophy, Best-Practices and ToolsML-Ops:Philosophy, Best-Practices and Tools
ML-Ops: Philosophy, Best-Practices and ToolsJorge Davila-Chacon
 
A Microservices Framework for Real-Time Model Scoring Using Structured Stream...
A Microservices Framework for Real-Time Model Scoring Using Structured Stream...A Microservices Framework for Real-Time Model Scoring Using Structured Stream...
A Microservices Framework for Real-Time Model Scoring Using Structured Stream...Databricks
 

What's hot (20)

Use MLflow to manage and deploy Machine Learning model on Spark
Use MLflow to manage and deploy Machine Learning model on Spark Use MLflow to manage and deploy Machine Learning model on Spark
Use MLflow to manage and deploy Machine Learning model on Spark
 
Ml ops deployment choices
Ml ops   deployment choicesMl ops   deployment choices
Ml ops deployment choices
 
모델 서빙 파이프라인 구축하기
모델 서빙 파이프라인 구축하기모델 서빙 파이프라인 구축하기
모델 서빙 파이프라인 구축하기
 
DevOps for Applications in Azure Databricks: Creating Continuous Integration ...
DevOps for Applications in Azure Databricks: Creating Continuous Integration ...DevOps for Applications in Azure Databricks: Creating Continuous Integration ...
DevOps for Applications in Azure Databricks: Creating Continuous Integration ...
 
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
 
Model versioning done right: A ModelDB 2.0 Walkthrough
Model versioning done right: A ModelDB 2.0 WalkthroughModel versioning done right: A ModelDB 2.0 Walkthrough
Model versioning done right: A ModelDB 2.0 Walkthrough
 
Hamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature StoreHamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature Store
 
Building machine learning applications locally with Spark — Joel Pinho Lucas ...
Building machine learning applications locally with Spark — Joel Pinho Lucas ...Building machine learning applications locally with Spark — Joel Pinho Lucas ...
Building machine learning applications locally with Spark — Joel Pinho Lucas ...
 
Managing the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOpsManaging the Machine Learning Lifecycle with MLOps
Managing the Machine Learning Lifecycle with MLOps
 
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
 
AWS IoT Edge Management
AWS IoT Edge ManagementAWS IoT Edge Management
AWS IoT Edge Management
 
MLFlow 1.0 Meetup
MLFlow 1.0 Meetup MLFlow 1.0 Meetup
MLFlow 1.0 Meetup
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
 
Productionzing ML Model Using MLflow Model Serving
Productionzing ML Model Using MLflow Model ServingProductionzing ML Model Using MLflow Model Serving
Productionzing ML Model Using MLflow Model Serving
 
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycleKyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
 
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusRobust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
 
ML-Ops: Philosophy, Best-Practices and Tools
ML-Ops:Philosophy, Best-Practices and ToolsML-Ops:Philosophy, Best-Practices and Tools
ML-Ops: Philosophy, Best-Practices and Tools
 
A Microservices Framework for Real-Time Model Scoring Using Structured Stream...
A Microservices Framework for Real-Time Model Scoring Using Structured Stream...A Microservices Framework for Real-Time Model Scoring Using Structured Stream...
A Microservices Framework for Real-Time Model Scoring Using Structured Stream...
 

Similar to DAIS Europe Nov. 2020 presentation on MLflow Model Serving

MLflow Model Serving - DAIS 2021
MLflow Model Serving - DAIS 2021MLflow Model Serving - DAIS 2021
MLflow Model Serving - DAIS 2021amesar0
 
MLflow Model Serving
MLflow Model ServingMLflow Model Serving
MLflow Model ServingDatabricks
 
MLflow with Databricks
MLflow with DatabricksMLflow with Databricks
MLflow with DatabricksLiangjun Jiang
 
Mlflow with databricks
Mlflow with databricksMlflow with databricks
Mlflow with databricksLiangjun Jiang
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Predicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksPredicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksSarah Dutkiewicz
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...Databricks
 
Azure Resource Manager templates: Improve deployment time and reusability
Azure Resource Manager templates: Improve deployment time and reusabilityAzure Resource Manager templates: Improve deployment time and reusability
Azure Resource Manager templates: Improve deployment time and reusabilityStephane Lapointe
 
A Collaborative Data Science Development Workflow
A Collaborative Data Science Development WorkflowA Collaborative Data Science Development Workflow
A Collaborative Data Science Development WorkflowDatabricks
 
MLflow MLOps Architecture - Technical Perspective
MLflow MLOps Architecture - Technical PerspectiveMLflow MLOps Architecture - Technical Perspective
MLflow MLOps Architecture - Technical Perspectiveamesar0
 
Cnam azure ze cloud resource manager
Cnam azure ze cloud  resource managerCnam azure ze cloud  resource manager
Cnam azure ze cloud resource managerAymeric Weinbach
 
PLSSUG - Troubleshoot SQL Server performance problems like a Microsoft Engineer
PLSSUG - Troubleshoot SQL Server performance problems like a Microsoft EngineerPLSSUG - Troubleshoot SQL Server performance problems like a Microsoft Engineer
PLSSUG - Troubleshoot SQL Server performance problems like a Microsoft EngineerMarek Maśko
 
9th docker meetup 2016.07.13
9th docker meetup 2016.07.139th docker meetup 2016.07.13
9th docker meetup 2016.07.13Amrita Prasad
 
Containerized architectures for deep learning
Containerized architectures for deep learningContainerized architectures for deep learning
Containerized architectures for deep learningAntje Barth
 
Database CI Demo Using Sql Server
Database CI  Demo Using Sql ServerDatabase CI  Demo Using Sql Server
Database CI Demo Using Sql ServerUmesh Kumar
 
Running Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesRunning Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesDatabricks
 
Staying Close to Experts with Executable Specifications
Staying Close to Experts with Executable SpecificationsStaying Close to Experts with Executable Specifications
Staying Close to Experts with Executable SpecificationsVagif Abilov
 
GCP Deployment- Vertex AI
GCP Deployment- Vertex AIGCP Deployment- Vertex AI
GCP Deployment- Vertex AITriloki Gupta
 
ML_Development_with_Sagemaker.pptx
ML_Development_with_Sagemaker.pptxML_Development_with_Sagemaker.pptx
ML_Development_with_Sagemaker.pptxTemiReply
 
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...HostedbyConfluent
 

Similar to DAIS Europe Nov. 2020 presentation on MLflow Model Serving (20)

MLflow Model Serving - DAIS 2021
MLflow Model Serving - DAIS 2021MLflow Model Serving - DAIS 2021
MLflow Model Serving - DAIS 2021
 
MLflow Model Serving
MLflow Model ServingMLflow Model Serving
MLflow Model Serving
 
MLflow with Databricks
MLflow with DatabricksMLflow with Databricks
MLflow with Databricks
 
Mlflow with databricks
Mlflow with databricksMlflow with databricks
Mlflow with databricks
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Predicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksPredicting Flights with Azure Databricks
Predicting Flights with Azure Databricks
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 
Azure Resource Manager templates: Improve deployment time and reusability
Azure Resource Manager templates: Improve deployment time and reusabilityAzure Resource Manager templates: Improve deployment time and reusability
Azure Resource Manager templates: Improve deployment time and reusability
 
A Collaborative Data Science Development Workflow
A Collaborative Data Science Development WorkflowA Collaborative Data Science Development Workflow
A Collaborative Data Science Development Workflow
 
MLflow MLOps Architecture - Technical Perspective
MLflow MLOps Architecture - Technical PerspectiveMLflow MLOps Architecture - Technical Perspective
MLflow MLOps Architecture - Technical Perspective
 
Cnam azure ze cloud resource manager
Cnam azure ze cloud  resource managerCnam azure ze cloud  resource manager
Cnam azure ze cloud resource manager
 
PLSSUG - Troubleshoot SQL Server performance problems like a Microsoft Engineer
PLSSUG - Troubleshoot SQL Server performance problems like a Microsoft EngineerPLSSUG - Troubleshoot SQL Server performance problems like a Microsoft Engineer
PLSSUG - Troubleshoot SQL Server performance problems like a Microsoft Engineer
 
9th docker meetup 2016.07.13
9th docker meetup 2016.07.139th docker meetup 2016.07.13
9th docker meetup 2016.07.13
 
Containerized architectures for deep learning
Containerized architectures for deep learningContainerized architectures for deep learning
Containerized architectures for deep learning
 
Database CI Demo Using Sql Server
Database CI  Demo Using Sql ServerDatabase CI  Demo Using Sql Server
Database CI Demo Using Sql Server
 
Running Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesRunning Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using Kubernetes
 
Staying Close to Experts with Executable Specifications
Staying Close to Experts with Executable SpecificationsStaying Close to Experts with Executable Specifications
Staying Close to Experts with Executable Specifications
 
GCP Deployment- Vertex AI
GCP Deployment- Vertex AIGCP Deployment- Vertex AI
GCP Deployment- Vertex AI
 
ML_Development_with_Sagemaker.pptx
ML_Development_with_Sagemaker.pptxML_Development_with_Sagemaker.pptx
ML_Development_with_Sagemaker.pptx
 
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
 

Recently uploaded

Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Recently uploaded (20)

Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

DAIS Europe Nov. 2020 presentation on MLflow Model Serving

  • 1. Model Serving Andre Mesarovic Resident Solutions Architect at Databricks 19 November 2020 - updated 10 March 2021
  • 2. Agenda • MLflow model serving • How to score models with MLflow ○ Online scoring with MLflow scoring server ○ Offline scoring with Spark • Custom model deployment and scoring
  • 3. Agenda Offline Prediction ● High throughput ● Bulk predictions ● Predict with Spark ● Batch Prediction ● Structured Streaming Prediction Online Prediction ● Low latency ● Individual predictions ● Real-time model serving ● MLflow scoring server ● MLflow deployment plugin ● Serving outside MLflow ● Databricks model serving
  • 4. Vocabulary - Synonyms Model Serving ● Model serving ● Model scoring ● Model prediction ● Model inference ● Model deployment Prediction ● Offline == batch prediction ○ Spark batch prediction ○ Spark streaming prediction ● Online == real-time prediction
  • 6. Offline Scoring • Spark is ideally suited for offline scoring • Can use either Spark batch or structured streaming • Use SQL with MLflow UDFs • Distributed scoring with Spark • Load model from MLflow Model Registry
  • 7. MLflow Offline Scoring Example Directly invoke model udf = mlflow.pyfunc.spark_udf(spark, model_uri) spark.udf.register("predict", udf) %sql select *, predict(*) as prediction from my_data UDF - SQL model = mlflow.spark.load_model(model_uri) predictions = model.transform(data) UDF - Python udf = mlflow.pyfunc.spark_udf(spark, model_uri) predictions = data.withColumn("prediction", udf(*data.columns)) model_uri = "models:/sklearn-wine/production" Model URI
  • 8. Online Scoring • Score models outside of Spark • Variety of real-time deployment options: ○ REST API - web servers - self-managed or as containers in cloud providers ○ Embedded in browsers, .e.g TensorFlow Lite ○ Edge devices: mobile phones, sensors, routers, etc. ○ Hardware accelerators - GPU, NVIDIA, Intel
  • 9. Options to serve real-time models • Embed model in application code • Model deployed as service • Model published as data • Martin Fowler’s Continuous Delivery for Machine Learning
  • 10. MLflow Scoring Server • Basis of different MLflow deployment targets • Standard REST API: ○ Input: CSV or JSON (pandas-split or pandas-records formats) ○ Output: JSON list of predictions • Implementation: Python Flask or Java Jetty server • MLflow CLI builds a server with embedded model • See Built-In Deployment Tools
  • 11. MLflow Scoring Server deployment options • Local web server • SageMaker docker container • Azure ML docker container • Generic docker container
  • 12. MLflow Scoring Server container types Python Server Model and its ML framework embedded in Flask server. Only for non-Spark ML models. Python Server + Spark Flask server delegates scoring to Spark server. Only for Spark ML models. Java MLeap Server Model and MLeap runtime are embedded in Jetty server. Only for Spark ML models. No Spark runtime.
  • 13. MLflow Scoring Server container types
  • 14. Launch Local MLflow Scoring Server mlflow models serve --model-uri models:/sklearn-iris/production --port 5001 Launch server Launch local server
  • 15. CSV format curl http://localhost:5001/invocations -H "Content-Type:text/csv" -d ' sepal_length,sepal_width,petal_length,petal_width 5.1,3.5,1.4,0.2 4.9,3.0,1.4,0.2' Launch server Request [0, 0] Launch server Response
  • 16. JSON split-oriented format curl http://localhost:5001/invocations -H "Content-Type:application/json" -d ' { "columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"], "data": [ [ 5.1, 3.5, 1.4, 0.2 ], [ 4.9, 3.0, 1.4, 0.2 ] ] }' Launch server Request [0, 0] Launch server Response
  • 17. JSON record-oriented format curl http://localhost:5001/invocations -H "Content-Type:application/json; format=pandas-records" -d '[ { "sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2 }, { "sepal_length": 4.9, "sepal_width": 3.0, "petal_length": 1.4, "petal_width": 0.2 } ]' Launch server Request [0, 0] Launch server Response
  • 18. MLflow SageMaker and Azure ML containers • Two types of containers to deploy to cloud providers ○ Python container - Flask web server process with embedded model ○ SparkML container - Flask web server process and Spark process • SageMaker container ○ Most versatile container type ○ Can run in local mode on laptop as regular docker container
  • 19. Python container • Flask web server • Model (e.g. sklearn or Keras/TF) is embedded inside web server
  • 20. SparkML container • Two processes: ○ Flask web server ○ Spark server - OSS Spark - no Databricks • Web server delegates scoring to Spark server
  • 21. MLflow SageMaker container deployment • CLI: ○ mlflow sagemaker build-and-push-container ○ mlflow sagemaker deploy ○ mlflow sagemaker run-local • API: mlflow.sagemaker API • Deploy a model on Amazon SageMaker
  • 22. mlflow sagemaker build-and-push-container --build --container my-container mlflow sagemaker deploy --app-name wine-quality --model-uri models:/wine-quality/production --image-url my-image-url --region us-west-2 --mode replace Launch server Launch container on SageMaker Launch container on SageMaker mlflow sagemaker build-and-push-container --build --no-push --container my-container mlflow sagemaker run-local --model-uri models:/wine-quality/production --port 5051 --image my-container Launch server Launch local container Deploy MLflow SageMaker container
  • 23. MLflow AzureML container deployment • Deploy to Azure Kubernetes Service (AKS) or Azure Container Instances (ACI) • CLI - mlflow azureml build-image • API - mlflow.azureml.build-image • Deploy a python_function model on Microsoft Azure ML
  • 24. model_image, azure_model = mlflow.azureml.build_image( model_uri="models:/sklearn_wine", workspace="my_workspace", model_name="sklearn_wine", image_name="sklearn_wine_image", description="Sklearn Wine Quality", synchronous=True) Deploy MLflow Azure ML container
  • 25. MLeap • Non-Spark serialization format for Spark models • Advantage is ability to score Spark model without overhead of Spark • Faster than scoring Spark ML models with Spark • Problem is stability, maturity and lack of dedicated commercial support
  • 26. End-to-end ML Pipeline Example with MLflow See mlflow-examples - e2e-ml-pipeline
  • 27. Databricks Model Serving • Expose MLflow model predictions as REST endpoint • One-node cluster is automatically provisioned • HTTP endpoint publicly exposed • Limited production capability - intended for light loads and testing • Production-grade serving on the product roadmap
  • 28. Model Serving on Databricks Model Serving Turnkey serving solution to expose models as REST endpoints Reports Applications ... REST Endpoint Tracking Record and query experiments: code, metrics, parameters, artifacts, models Models General format that standardizes deployment options Logged Model Model Registry Centralized and collaborative model lifecycle management Staging Production Archived Data Scientists Deployment Engineers
  • 30. Databricks Model Serving Resources • Blog posts ○ Quickly Deploy, Test, and Manage ML Models as REST Endpoints with MLflow Model Serving on Databricks - 2020-11-02 ○ Announcing MLflow Model Serving on Databricks - 2020-06-25 • Documentation ○ MLflow Model Serving on Databricks
  • 31. MLflow Deployment Plugins • MLflow Deployment Plugins - Deploy model to custom serving tool • Current deployment plugins: ○ https://github.com/RedisAI/mlflow-redisai ○ https://github.com/mlflow/mlflow-torchserve
  • 32. MLflow PyTorch • MLflow PyTorch integration released in MLflow 1.12.0 (Nov. 2020) • Features: ○ PyTorch Autologging ○ TorchScript Models ○ TorchServing • Deploy MLflow PyTorch models into TorchServe
  • 33. MLflow PyTorch and TorchServe Resources • PyTorch and MLflow Integration Announcement - 2020-11-12 • MLflow 1.12 Features Extended PyTorch Integration - 2012-11-13 • MLflow and PyTorch — Where Cutting Edge AI meets MLOps • https://github.com/mlflow/mlflow-torchserve
  • 34. MLflow and TensorFlow Serving Custom Deployment • Example how to deploy models to a custom deployment target • MLflow deployment plugin has not yet been developed • Deployment builds a standard TensorFlow Serving docker container with a MLflow model from Model Registry • https://github.com/amesar/mlflow- tools/tree/master/mlflow_tools/tensorflow_serving
  • 35. Keras/TensorFlow Model Formats • HD5 format ○ Default in Keras TF 1.x but is legacy in Keras TF 2.x ○ Not supported by TensorFlow Serving • SavedModel format ○ TensorFlow standard format is default in Keras TF 2.x ○ Supported by TensorFlow Serving
  • 36. MLflow Keras/TensorFlow Flavors • MLflow currently only supports Keras HD5 format as flavor • MLflow does not support SavedModel flavor until this is resolved: ○ https://github.com/mlflow/mlflow/issues/3224 ○ https://github.com/mlflow/mlflow/pull/3552 • Can save SavedModel as an unmanaged artifact for now
  • 38. TensorFlow Serving Example docker run -t --rm --publish 8501 --volume /opt/mlflow/mlruns/1/f48dafd70be044298f71488b0ae10df4/artifacts/tensorflow-model:/models/keras_wine --env MODEL_NAME=keras_wine tensorflow/serving Launch server Launch server in Docker container curl -d '{"instances": [12.8, 0.03, 0.48, 0.98, 6.2, 29, 1.2, 0.4, 75 ] }' -X POST http://localhost:8501/v1/models/keras_wine:predict Launch server Request { "predictions": [[-0.70597136]] } Launch server Response
  • 39. MLflow Keras/TensorFlow Run Example MLflow Flavors - Managed Models ● Models ○ keras-hd5-model ○ onnx-model ● Can be served as PyFunc models MLflow Unmanaged Models ● Models ○ tensorflow-model - Standard TF SavedModel format ○ tensorflow-lite-model ○ tensorflow.js model ● Load as raw artifacts and then serve manually ● Sample code: wine_train.py and wine_predict.py
  • 40. ONNX • Interoperable model format supported by: ○ MSFT, Facebook, AWS, Nvidia, Intel, etc. • Train once, deploy anywhere (in theory) - depends on ML tool • Save MLflow model as ONNX flavor • Deploy in standard MLflow scoring server • ONNX runtime is embedded in Python server in MLflow container

Editor's Notes

  1. What did they do with us? what are they trying to do? recommendation? content curation? how does that work? How come Delta and Spark and those things can help with that thing (recommendation, or whatever they do)?
  2. What did they do with us? what are they trying to do? recommendation? content curation? how does that work? How come Delta and Spark and those things can help with that thing (recommendation, or whatever they do)?
  3. What did they do with us? what are they trying to do? recommendation? content curation? how does that work? How come Delta and Spark and those things can help with that thing (recommendation, or whatever they do)?
  4. What did they do with us? what are they trying to do? recommendation? content curation? how does that work? How come Delta and Spark and those things can help with that thing (recommendation, or whatever they do)?
  5. 1. Aha! feature 2. Description of the feature: With MLflow we already have a Model format that abstracts away the ML framework. It doesn’t matter whether you use TensorFlow, Scikit, or SparkML, if you log it as an MLflow Model you can deploy the models in the same way. The tracking server allows you to log those models in experiments, with metrics, parameters, etc. Once you want to deploy a model, the Model Registry takes care of versioning, the deployment lifecycle, and hand-off between Data Scientists and Deployment Engineers From the Model Registry, there are several deployment options. However, many customers have asked us for a turnkey solution to serve models as REST endpoints. So we are building a Serving solution to do exactly that, with two clicks. 3. Value of the feature (aka what databricks was before this feature, and what this feature will do for databricks) Compared to other serving solutions this will require only 2 clicks to be enabled. However, important to highlight that this is for reporting / debug purposes and will not autoscale to arbitrary request loads. 4. Associated summary slide bullet point: Native support for serving MLflow Models as REST endpoints 5. Cloud: All 6. Deployment (MT, ST, Azure) All