Andre Mesarovic
31 May 2024
MLflow MLOps Architecture
A Technical Perspective
Peeling Back the MLflow MLOps Tech Stack Onion
The Many Meanings of a Model
Model is an overloaded term with several meanings in MLflow:
● Native model - simply the native flavor’s serialized format. For sklearn it’s a pickle file, for TensorFlow it’s a
directory with SavedModel format files.
● MLflow model - wrapper around the native model with metadata in the MLmodel file and environment
information in conda.yaml and requirements.txt files. Lives in an MLflow run.
● Model version
○ Wrapper around an MLflow model of which there are two:
■ The source run's MLflow model used for UI and lineage purposes - lives in workspace.
■ A copy of the run's MLflow model that lives in model registry storage. For UC it lives in UC storage, for non-UC it lives
in the workspace locked-down DBFS.
● Registered model - a bucket of model versions.
See MLflow documentation Components of a Model in MLflow
Meanings of a Model
MLflow Model - Executable Unit
artifact_path: model
databricks_runtime: 13.2.x-cpu-ml-scala2.12
flavors:
  python_function:
    env:
      conda: conda.yaml
      virtualenv: python_env.yaml
    loader_module: mlflow.sklearn
    model_path: model.pkl
    predict_fn: predict
    python_version: 3.10.6
  sklearn:
    code: null
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 1.1.1
mlflow_version: 2.5.0
model_uuid: 2daecace267f4de29ec73062a10e2036
run_id: c62ccf932e0649a2b9247cc76d89b637
saved_input_example_info:
  artifact_path: input_example.json
  pandas_orient: split
  type: dataframe
signature:
  inputs: '[{"type": "double", "name": "fixed_acidity"}, {"type": "double", "name":
    "volatile_acidity"}, {"type": "double", "name": "citric_acid"}, {"type": "double",
    "name": "residual_sugar"}, {"type": "double", "name": "chlorides"}, {"type": "double",
    "name": "free_sulfur_dioxide"}, {"type": "double", "name": "total_sulfur_dioxide"},
    {"type": "double", "name": "density"}, {"type": "double", "name": "pH"}, {"type":
    "double", "name": "sulphates"}, {"type": "double", "name": "alcohol"}]'
  outputs: '[{"type": "tensor", "tensor-spec": {"dtype": "float64", "shape": [-1]}}]'
utc_time_created: '2023-08-11 08:40:16.227603'
MLflow Model - MLmodel metadata file
● Always has a Pyfunc flavor - python_function
● Most often has a native flavor, e.g. sklearn
● Should have a signature defining the input and output
schemas
● Signature is required to register the run's model in the
Unity Catalog model registry
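The checks in the bullets above can be sketched in a few lines. This is a minimal illustration, not MLflow's own validation: the helper name `mlmodel_problems` and its messages are hypothetical, and the metadata is assumed to be available as a plain dict (for example from parsing the MLmodel YAML, or possibly from `mlflow.models.Model.load(path).to_dict()`).

```python
from typing import Any, Dict, List


def mlmodel_problems(meta: Dict[str, Any], for_unity_catalog: bool = False) -> List[str]:
    """Return human-readable problems found in MLmodel metadata (hypothetical helper)."""
    problems = []
    flavors = meta.get("flavors") or {}
    # Every MLflow model is expected to expose the generic pyfunc flavor.
    if "python_function" not in flavors:
        problems.append("missing python_function (pyfunc) flavor")
    # A signature is recommended everywhere and required for Unity Catalog.
    if not meta.get("signature"):
        message = "missing signature"
        if for_unity_catalog:
            message += " (required for Unity Catalog registration)"
        problems.append(message)
    return problems


# Trimmed-down version of the MLmodel file shown above.
example = {
    "flavors": {
        "python_function": {"loader_module": "mlflow.sklearn"},
        "sklearn": {"sklearn_version": "1.1.1"},
    },
    "signature": {"inputs": "[...]", "outputs": "[...]"},
}
print(mlmodel_problems(example, for_unity_catalog=True))  # -> []
```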
● Model version is the MLflow deployable unit
● Lives either in workspace model registry or Unity Catalog model registry
● Wraps an MLflow model
● Retrievable from model registry by version number:
models:/my_catalog.my_schema.my_model/12
● Or by alias:
models:/my_catalog.my_schema.my_model@champ
Model Version - Deployable Unit
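A small sketch of how the two URI forms differ. The parser below is hypothetical (it is not part of MLflow) and assumes a Unity Catalog model name is written as a single dotted `catalog.schema.model` segment; it deliberately ignores other forms such as `models:/<name>/latest`. In practice a URI like these would typically be passed straight to a loader such as `mlflow.pyfunc.load_model`.

```python
import re
from typing import NamedTuple, Optional


class ModelVersionRef(NamedTuple):
    name: str
    version: Optional[int]  # set when the URI addresses a version number
    alias: Optional[str]    # set when the URI addresses an alias


_BY_VERSION = re.compile(r"^models:/(?P<name>[^/@]+)/(?P<version>\d+)$")
_BY_ALIAS = re.compile(r"^models:/(?P<name>[^/@]+)@(?P<alias>[^/@]+)$")


def parse_model_uri(uri: str) -> ModelVersionRef:
    """Split a models:/ URI into model name plus version number or alias."""
    match = _BY_VERSION.match(uri)
    if match:
        return ModelVersionRef(match["name"], int(match["version"]), None)
    match = _BY_ALIAS.match(uri)
    if match:
        return ModelVersionRef(match["name"], None, match["alias"])
    raise ValueError(f"Not a version or alias model URI: {uri!r}")


print(parse_model_uri("models:/my_catalog.my_schema.my_model/12"))
print(parse_model_uri("models:/my_catalog.my_schema.my_model@champ"))
```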
MLflow Object Relationships - Overview
● Model Version - points to a Run
● Run points to:
○ MLflow Model - contains:
■ Native Model
■ Metadata in MLmodel file
■ Python library dependencies
○ Notebook code:
■ Notebook revision in workspace
■ Git link (if applicable)
○ Delta Table - Captured in several ways:
■ Run.inputs aka mlflow.data
■ Run tag sparkDatasourceInfo (undocumented)
MLflow Object Relationships - Unity Catalog Registry
● Native Model - native model’s serialized format, e.g.
pickle file
● MLflow Model - metadata wrapper around native
model
● Run - Contains MLflow model + params and metrics
● Model Version - deployable unit in registry pointing to
an MLflow model
● Registered Model - bucket of model versions
● Experiment - bucket of runs
MLflow Object Relationships - Workspace Registry
● Native Model - native model’s serialized format, e.g.
pickle file
● MLflow Model - metadata wrapper around native
model
● Run - Contains MLflow model + params and metrics
● Model Version - deployable unit in registry pointing to
an MLflow model
● Registered Model - bucket of model versions
● Experiment - bucket of runs
MLflow Object Relationships - Class Diagram
● An experiment has 0 or more runs
● A run belongs to only one experiment
● A run has 0 or more MLflow models
● A model version points to one run
● A run can be linked to one or more model versions
● A registered model has 0 or more model versions
● A model version belongs to only one registered model
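The cardinalities listed above can be written down as plain dataclasses. This is only an illustrative in-memory model; the class and method names are hypothetical and are not MLflow's entity classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MlflowModel:
    artifact_path: str


@dataclass
class Run:
    run_id: str
    experiment_id: str  # a run belongs to exactly one experiment
    models: List[MlflowModel] = field(default_factory=list)  # 0 or more


@dataclass
class Experiment:
    experiment_id: str
    runs: List[Run] = field(default_factory=list)  # 0 or more

    def start_run(self, run_id: str) -> Run:
        run = Run(run_id, self.experiment_id)
        self.runs.append(run)
        return run


@dataclass
class ModelVersion:
    name: str     # the one registered model this version belongs to
    version: int
    run_id: str   # the one run this version points to


@dataclass
class RegisteredModel:
    name: str
    versions: List[ModelVersion] = field(default_factory=list)  # 0 or more

    def add_version(self, run: Run) -> ModelVersion:
        mv = ModelVersion(self.name, len(self.versions) + 1, run.run_id)
        self.versions.append(mv)
        return mv


exp = Experiment("1")
run = exp.start_run("r1")
run.models.append(MlflowModel("model"))
registered = RegisteredModel("dev.models.sklearn_wine")
v1 = registered.add_version(run)
v2 = registered.add_version(run)  # one run can back several versions
```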
MLOps "Model First" Pipeline - Unity Catalog
Workflow steps
1. Dev environment
1a. Train model
1b. Register best model in dev catalog
1c. Copy MLflow run to staging workspace
2. Staging environment
2a. Run model evaluation and non-ML code tests
2b. Copy (promote) model version to staging catalog
2c. Copy model version to prod catalog when ready
2d. Copy MLflow run to prod workspace for lineage and governance
3. Prod environment - run model inference on data
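The promotion steps in this workflow could be scripted roughly as follows. This is a sketch under assumptions: the catalogs are assumed to be named `dev`, `staging` and `prod`, the helpers `next_stage_name` and `promote` are hypothetical, and `client` is any object with a `copy_model_version(src_model_uri, dst_name)` method, such as `mlflow.MlflowClient` in recent MLflow releases. It covers only the model version copy; the "Copy MLflow run" steps need separate tooling.

```python
STAGES = ["dev", "staging", "prod"]  # assumed catalog names, one per environment


def next_stage_name(model_name: str) -> str:
    """Map catalog.schema.model to the same model in the next stage's catalog."""
    catalog, schema, model = model_name.split(".")
    idx = STAGES.index(catalog)
    if idx == len(STAGES) - 1:
        raise ValueError(f"{model_name} is already in the last stage")
    return f"{STAGES[idx + 1]}.{schema}.{model}"


def promote(client, model_name: str, version: int):
    """Copy one model version into the next stage's catalog (shallow copy)."""
    src_uri = f"models:/{model_name}/{version}"
    return client.copy_model_version(src_uri, next_stage_name(model_name))


print(next_stage_name("dev.models.sklearn_wine"))  # -> staging.models.sklearn_wine
```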
● MlflowClient.copy_model_version(src_model_uri, dst_name) - shallow copy
○ src_model_uri – model registry URI with a models:/ scheme (e.g., models:/iris_model@champion)
● MlflowClient.create_model_version(name, source, run_id: Optional) - shallow copy
○ source – URI indicating the location of the model artifacts. The artifact URI can be run relative (e.g.
runs:/<run_id>/<model_artifact_path>), a model registry URI (e.g. models:/<model_name>/<version>), or other URIs
supported by the model registry backend (e.g. “s3://my_bucket/my/model”).
○ run_id – Run ID from MLflow tracking server that generated the model
● mlflow.register_model(model_uri, name) - shallow copy
○ models:/ URIs are currently not supported
○ Use a runs:/ URI if you want to record the run ID with the model in model registry (recommended), or pass the local
filesystem path of the model
● Copy Model Version - mlflow-export-import - with full run lineage - deep copy
Different ways to create/copy a model version
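As one illustration of the options above, registering a run's model through `create_model_version` might look like this. The helper names are hypothetical, the artifact path `model` is an assumption, and `client` is expected to behave like `mlflow.MlflowClient`. Note that `create_model_version` expects the registered model to exist already, whereas `mlflow.register_model` creates it when missing.

```python
def run_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build the run-relative URI of an MLflow model logged under artifact_path."""
    return f"runs:/{run_id}/{artifact_path}"


def register_from_run(client, name: str, run_id: str, artifact_path: str = "model"):
    """Create a model version whose source is a run's MLflow model.

    Passing run_id as well records the run with the version, which
    preserves the version-to-run link used for lineage.
    """
    source = run_model_uri(run_id, artifact_path)
    return client.create_model_version(name=name, source=source, run_id=run_id)


print(run_model_uri("c62ccf932e0649a2b9247cc76d89b637"))
```

With a real tracking server this would be called as `register_from_run(mlflow.MlflowClient(), "dev.models.sklearn_wine", run_id)`.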
Copy Model Version - Unity Catalog Registry
Copy Model Version - Workspace Registry
Deep vs Shallow Model Version Copy
Shallow Copy
● Does not copy the version's run
● Version in new workspace points to run in original workspace
● Problem is that a prod version now points to a run in a dev workspace
● Can use MlflowClient.copy_model_version
● Need to create registered model in target UC schema and copy its metadata
Deep Copy
● Copies the version's run to target workspace
● Version in prod catalog now points to run in prod workspace
● Copies the run's experiment metadata to new experiment in target workspace
● Copies the version's registered model metadata
● Can copy version to another UC metastore
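The difference comes down to which workspace's run the copied version ends up pointing at. The toy model below makes that explicit; it is purely in-memory, every name in it is hypothetical, and it does not call MLflow or mlflow-export-import.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class VersionRecord:
    model_name: str
    run_id: str
    run_workspace: str  # workspace holding the run this version points to


def shallow_copy(src: VersionRecord, dst_model_name: str) -> VersionRecord:
    """New version under a new name, still pointing at the source run."""
    return replace(src, model_name=dst_model_name)


def deep_copy(src: VersionRecord, dst_model_name: str, dst_workspace: str,
              src_runs: Dict[str, dict], dst_runs: Dict[str, dict]) -> VersionRecord:
    """Copy the run into the destination workspace and point the version at it."""
    new_run_id = f"{src.run_id}-copy"
    dst_runs[new_run_id] = dict(src_runs[src.run_id])
    return VersionRecord(dst_model_name, new_run_id, dst_workspace)


dev_runs = {"r1": {"params": {"max_depth": 4}}}
prod_runs: Dict[str, dict] = {}
dev_version = VersionRecord("dev.models.sklearn_wine", "r1", "dev")

shallow = shallow_copy(dev_version, "prod.models.sklearn_wine")
deep = deep_copy(dev_version, "prod.models.sklearn_wine", "prod", dev_runs, prod_runs)
print(shallow.run_workspace, deep.run_workspace)  # -> dev prod
```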
GitHub
● Python script: copy-model-version
○ https://github.com/mlflow/mlflow-export-import/blob/master/README_copy.md#copy-model-version
● Notebook: Copy_Model_Version
○ https://github.com/mlflow/mlflow-export-import/blob/master/databricks_notebooks/copy/Copy_Model_Version.py
e2-demo-west workspace
● Notebook: Copy_Model_Version
Copy Model Version Code - mlflow-export-import
● Copies a model version and its run (optional).
● Two types of model version copy:
○ Shallow copy - does not copy the source version's run. The destination model version will point to the source
version's run.
○ Deep copy - the destination model version will point to a new copy of the run in the destination workspace.
■ Recommended for full governance and lineage tracking.
● Supports copying within the workspace (WS) registry, within the UC registry, and from the WS registry to the UC registry.
● For WS registry, the destination model version can be either in the same workspace or in another workspace.
● For UC registry, the destination model version can be either in the same UC metastore or in another UC metastore.
● Databricks registry URIs should name Databricks profiles, e.g. databricks-uc://prod-env.
● Note: MLflow 2.8.0 introduced MlflowClient.copy_model_version. However, it performs only a shallow copy and does not work
across external workspaces or UC metastores.
● Source:
○ Copy_Model_Version.py - Python script
○ Copy_Model_Version - Databricks notebook
Copy Model Version Details
Copy Model Version - script examples
Copy UC model version in the same UC metastore
copy-model-version \
  --src-model dev.models.sklearn_wine \
  --src-version 1 \
  --dst-model prod.models.sklearn_wine \
  --dst-experiment-name /Users/first.last@mycompany.com/My_Experiment \
  --src-registry-uri databricks-uc://test-env \
  --dst-registry-uri databricks-uc://test-env
Copy UC model version to another UC metastore
copy-model-version \
  --src-model dev.models.sklearn_wine \
  --src-version 1 \
  --dst-model prod.models.sklearn_wine \
  --dst-experiment-name /Users/first.last@mycompany.com/My_Experiment \
  --src-registry-uri databricks-uc://test-env \
  --dst-registry-uri databricks-uc://prod-env
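When the script is driven from Python, for example in a CI job, the argument list can be assembled programmatically. The sketch below is a hypothetical helper: it assumes the `copy-model-version` command is on the PATH and that the option names match the examples above, written without a trailing colon.

```python
import shlex
from typing import List


def build_copy_command(src_model: str, src_version: int, dst_model: str,
                       dst_experiment_name: str,
                       src_registry_uri: str, dst_registry_uri: str) -> List[str]:
    """Assemble the argv list for the copy-model-version script."""
    return [
        "copy-model-version",
        "--src-model", src_model,
        "--src-version", str(src_version),
        "--dst-model", dst_model,
        "--dst-experiment-name", dst_experiment_name,
        "--src-registry-uri", src_registry_uri,
        "--dst-registry-uri", dst_registry_uri,
    ]


cmd = build_copy_command(
    "dev.models.sklearn_wine", 1, "prod.models.sklearn_wine",
    "/Users/first.last@mycompany.com/My_Experiment",
    "databricks-uc://test-env", "databricks-uc://prod-env",
)
print(shlex.join(cmd))
# To actually run the copy: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with experiment paths that contain spaces.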
Copy Model Version - notebook example

MLflow_MLOps_Databricks_Architecture.pptx

