This article describes how you can use MLOps on the Databricks platform to optimize the performance and long-term efficiency of your machine learning (ML) systems. It includes general recommendations for an MLOps architecture and describes a generalized workflow using the Databricks platform that you can use as a model for your ML development-to-production process.
3. WHAT IS MLOPS?
MLOps is a set of processes and automated steps to manage code, data, and models. It combines DevOps, DataOps, and ModelOps.
4. GENERAL RECOMMENDATIONS FOR MLOPS
This section includes some general
recommendations for MLOps on Databricks with
links for more information.
5. CREATE A SEPARATE ENVIRONMENT FOR EACH STAGE
• An execution environment is the place where models and data are created or consumed by code. Each execution environment consists of compute instances, their runtimes and libraries, and automated jobs.
• Databricks recommends creating separate environments for the different stages of ML code and model development, with clearly defined transitions between stages. The workflow described in this article follows this process, using the common names for the stages:
• Development
• Staging
• Production
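One common way to keep the three environments separate on Databricks is to qualify every table (and model) name with a per-stage catalog, so the same pipeline code runs unchanged in each environment. A minimal sketch; the catalog names (`dev`, `staging`, `prod`) are an illustrative convention, not prescribed by Databricks:

```python
# Sketch: map each stage to its own Unity Catalog catalog so pipeline
# code stays identical across environments. Names are hypothetical.
STAGE_CATALOGS = {
    "development": "dev",
    "staging": "staging",
    "production": "prod",
}

def qualified_name(stage: str, schema: str, table: str) -> str:
    """Return a three-level Unity Catalog name for the given stage."""
    return f"{STAGE_CATALOGS[stage]}.{schema}.{table}"

print(qualified_name("development", "features", "customer_features"))
# dev.features.customer_features
```

Promotion between stages then only changes the stage parameter, not the code.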
6. ACCESS CONTROL AND VERSIONING
• Use Git for version control.
Pipelines and code should be stored in Git for version control. Moving ML logic between stages can then be interpreted as moving code from the development branch to the staging branch to the release branch. Use Databricks Git folders to integrate with your Git provider and sync notebooks and source code with Databricks workspaces. Databricks also provides additional tools for Git integration and version control; see Developer tools and guidance.
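The dev-to-staging-to-release promotion described above can be sketched on a throwaway local repository. This uses the branch names from this article; in practice the merges would go through pull requests and CI rather than direct branch creation:

```python
# Sketch: promote ML logic by moving code between branches, demonstrated
# on a temporary local Git repository (requires the git CLI).
import os
import subprocess
import tempfile

repo = tempfile.mkdtemp()

def git(*args):
    subprocess.run(["git", "-C", repo, *args], check=True,
                   capture_output=True)

git("init")
git("checkout", "-b", "dev")
git("config", "user.email", "dev@example.com")
git("config", "user.name", "Dev")
with open(os.path.join(repo, "train.py"), "w") as f:
    f.write("# ML pipeline code\n")
git("add", "train.py")
git("commit", "-m", "Add training pipeline")

# Promote code, not models: carry the same commit forward.
git("checkout", "-b", "staging")   # staging branch for integration tests
git("checkout", "-b", "release")   # release branch triggers production CD
```

Each stage sees the identical, reviewed commit rather than a copied artifact.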
• Store data in a lakehouse architecture using Delta tables.
Data should be stored in a lakehouse architecture in your cloud account. Both raw data and feature tables should be stored as Delta tables, with access controls that determine who can read and modify them.
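A sketch of the access-control side: generating the Unity Catalog GRANT statements that control who can read or modify a Delta feature table. In a Databricks notebook you would execute each statement with `spark.sql(...)`; the table and group names here are hypothetical:

```python
# Sketch: build GRANT statements for a Delta feature table. Readers get
# SELECT only; writers also get MODIFY. Names are illustrative.
def grants_for_feature_table(table: str, readers: str, writers: str):
    return [
        f"GRANT SELECT ON TABLE {table} TO `{readers}`",
        f"GRANT SELECT, MODIFY ON TABLE {table} TO `{writers}`",
    ]

for stmt in grants_for_feature_table(
        "prod.features.customer_features",
        readers="data-scientists",
        writers="ml-engineers"):
    print(stmt)
```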
7. • Manage model development with MLflow.
You can use MLflow to track the model development process and save code
snapshots, model parameters, metrics, and other metadata.
• Use Models in Unity Catalog to manage the model lifecycle.
Use Models in Unity Catalog to manage model versioning, governance, and
deployment status.
8. DEPLOY CODE, NOT MODELS
• In most situations, Databricks recommends that during the ML development
process, you promote code, rather than models, from one environment to the
next. Moving project assets this way ensures that all code in the ML
development process goes through the same code review and integration
testing processes. It also ensures that the production version of the model is
trained on production code. For a more detailed discussion of the options and
trade-offs, see Model deployment patterns.
• URL: https://docs.databricks.com/en/machine-learning/mlops/deployment-patterns.html
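In CI/CD terms, "deploy code, not models" means a merge to the release branch redeploys the pipeline code and re-runs training in production, rather than copying a model artifact between environments. An illustrative pseudo-config (not an official Databricks format; job and step names are invented):

```yaml
# Sketch only: under the "deploy code" pattern, the release branch
# triggers deployment of pipeline code and a production training run.
on:
  push:
    branches: [release]
jobs:
  deploy-and-train:
    steps:
      - checkout-code         # the reviewed, integration-tested pipeline code
      - deploy-to-production  # e.g. sync jobs/notebooks to the prod workspace
      - run-training-job      # production model is trained on production code
```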
9. RECOMMENDED MLOPS WORKFLOW
• The following sections describe a typical MLOps workflow, covering each of the
three stages: development, staging, and production.
• This section uses the terms “data scientist” and “ML engineer” as archetypal
personas; specific roles and responsibilities in the MLOps workflow will vary
between teams and organizations.
10. DEVELOPMENT STAGE
• The focus of the development stage is experimentation. Data scientists develop features and models and run experiments to optimize model performance. The output of the development process is ML pipeline code that can include feature computation, model training, inference, and monitoring.
Ref link: https://docs.databricks.com/en/machine-learning/mlops/mlops-workflow.html
11. DEVELOPMENT STAGE
• Data sources
• Exploratory data analysis (EDA)
• Code
• Train model (development)
• Validate and deploy model
• Commit code
12. STAGING STAGE
• The focus of this stage is testing the ML pipeline code to ensure it is ready for production. All of the ML pipeline code is tested in this stage, including code for model training as well as feature engineering pipelines, inference code, and so on.
• ML engineers create a CI pipeline to implement the unit and integration tests run in this stage. The output of the staging process is a release branch that triggers the CI/CD system to start the production stage.
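A minimal sketch of the kind of unit test this CI pipeline runs against pipeline code. The feature function is a hypothetical example; in CI these tests would typically be collected and run by pytest:

```python
# Sketch: a unit test for a feature-engineering function, as run by the
# staging CI pipeline. The transform itself is illustrative.
def normalize_amount(amount_cents: int) -> float:
    """Example feature transform: convert cents to dollars."""
    return round(amount_cents / 100, 2)

def test_normalize_amount():
    assert normalize_amount(1999) == 19.99
    assert normalize_amount(0) == 0.0

test_normalize_amount()  # pytest would discover and run this automatically
```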
13. STAGING STAGE
• Data
• Merge code
• Integration tests (CI)
• Merge to staging branch
• Create a release branch
14. PRODUCTION STAGE
• ML engineers own the production environment where ML pipelines are deployed and executed. These pipelines trigger model training, validate and deploy new model versions, publish predictions to downstream tables or applications, and monitor the entire process to avoid performance degradation and instability.
• Data scientists typically do not have write or compute access in the production environment. However, it is important that they have visibility into test results, logs, model artifacts, production pipeline status, and monitoring tables. This visibility allows them to identify and diagnose problems in production and to compare the performance of new models to models currently in production. You can grant data scientists read-only access to assets in the production catalog for these purposes.
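The "validate and deploy new model versions" step usually includes a promotion gate: a candidate model only replaces the current production ("champion") model if it performs at least as well. A minimal sketch; the metric name and threshold are illustrative:

```python
# Sketch: a validation gate applied by the production pipeline before
# deploying a newly trained model version. Metric names are hypothetical.
def should_promote(candidate_metrics: dict, champion_metrics: dict,
                   min_improvement: float = 0.0) -> bool:
    """Promote the candidate only if it matches or beats the current
    production model on the primary metric (plus an optional margin)."""
    return (candidate_metrics["val_accuracy"]
            >= champion_metrics["val_accuracy"] + min_improvement)

print(should_promote({"val_accuracy": 0.91}, {"val_accuracy": 0.89}))
# True -> deploy the new version
```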
15. PRODUCTION STAGE
• Train model
• Validate model
• Deploy model
• Model Serving
• Inference: batch or streaming
• Lakehouse Monitoring
• Retraining
16. MLOPS — END-TO-END PIPELINE DEMO
• This demo covers a full MLOps pipeline. We’ll show you how the Databricks Lakehouse can be used to orchestrate and deploy models in production while ensuring governance, security, and robustness.
• Ingest data and save it to a feature store
• Build ML models with Databricks AutoML
• Set up MLflow hooks to automatically test your models
• Create the model test job
• Automatically move models to production once the tests pass
• Periodically retrain your model to prevent drift
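The periodic-retraining step in the list above needs a trigger. In a real deployment that signal would come from Lakehouse Monitoring's drift metrics; as a stand-in, here is a plain mean-shift check with an illustrative threshold:

```python
# Sketch: a simple drift trigger for periodic retraining. Production
# setups would read drift metrics from Lakehouse Monitoring instead;
# the 10% threshold here is illustrative.
from statistics import mean

def drift_detected(baseline: list, recent: list,
                   threshold: float = 0.1) -> bool:
    """Flag drift when the recent feature mean shifts by more than
    `threshold` (relative) from the baseline mean."""
    base = mean(baseline)
    return abs(mean(recent) - base) > threshold * abs(base)

baseline = [10.0, 10.2, 9.8, 10.1]
recent = [12.0, 12.3, 11.8, 12.1]
print(drift_detected(baseline, recent))  # True -> trigger the retraining job
```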
17. COMMAND
• To install the demo, get a free Databricks workspace and execute the following two commands in a Python notebook:
• %pip install dbdemos
• import dbdemos
  dbdemos.install('mlops-end2end')
Try Databricks free
https://www.databricks.com/try-databricks?itm_data=demo_center#account