ML development brings many new complexities beyond the traditional software development lifecycle. MLflow is an open-source project from Databricks that addresses several of these challenges, including experiment tracking, reproducibility, model packaging, deployment, and governance, to help you manage and accelerate the lifecycle of your ML projects.
16. Software vs. ML Models
Workflow
  Software: Write code; Write unit tests; Send for review; Get approvals; Commit; Release testing; Release
  ML Models: Analyze data; Put data into the right format; Write model code; Train and evaluate model; Experiment with params, model structure; Deploy … by email?; Monitor performance and trigger retraining
Goal
  Software: Meet a functional specification
  ML Models: Optimize a metric, e.g. CTR
Quality
  Software: Depends on code
  ML Models: Depends on data, code, model, params, ...
Tools
  Software: Typically one software stack
  ML Models: Combination of many libraries, tools, ...
Outcome
  Software: Works deterministically
  ML Models: Keeps changing with data, etc.
20. Components
Tracking: Record and query experiments (code, data, config, results)
Projects: Packaging format for reproducible runs on any platform
Models: General format that standardizes deployment paths
Model Registry (new): Centralized and collaborative model lifecycle management
mlflow.org github.com/mlflow twitter.com/MLflow databricks.com/mlflow
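To make "record and query experiments" concrete, here is a minimal sketch using only the Python standard library. It is not the MLflow API: the `Run` and `Tracker` classes, their method names, and the metric values are all hypothetical and exist only to illustrate the idea of storing each run's inputs (params, code version, data version) next to its results so that runs can be compared later.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Run:
    """One experiment run: what went in and what came out."""
    run_id: int
    params: Dict[str, Any]
    code_version: str
    data_version: str
    metrics: Dict[str, float] = field(default_factory=dict)


class Tracker:
    """Hypothetical in-memory tracker; a real tracking server would persist runs."""

    def __init__(self) -> None:
        self.runs: List[Run] = []

    def start_run(self, params: Dict[str, Any], code_version: str, data_version: str) -> Run:
        # Record the inputs of the run up front so they cannot be forgotten later.
        run = Run(len(self.runs) + 1, dict(params), code_version, data_version)
        self.runs.append(run)
        return run

    def best_run(self, metric: str) -> Run:
        """Query: the run with the highest recorded value of `metric`."""
        scored = [r for r in self.runs if metric in r.metrics]
        return max(scored, key=lambda r: r.metrics[metric])


tracker = Tracker()
for lr in (0.1, 0.01):
    run = tracker.start_run({"lr": lr}, code_version="abc123", data_version="2020-01")
    # Made-up metric values, used only to have something to query.
    run.metrics["auc"] = 0.80 if lr == 0.1 else 0.85

best = tracker.best_run("auc")
print(best.run_id, best.params)
```

Because every run carries its own params, code version, and data version, the question "which settings produced the best result?" becomes a query rather than a search through notebooks.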
30. Model Format
Models: Simple model flavors (Flavor 1, Flavor 2) usable by many tools
Deployment paths: Containers, Batch & Stream Scoring, Cloud Inference Services, In-Line Code
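The flavor idea can be sketched as follows: one saved model is described in more than one way, and each tool loads whichever description it understands. This is a toy illustration in plain Python, not MLflow's on-disk format. The flavor names `linear_weights` and `generic_function`, the `saved_model` dictionary, and the loader functions are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Predictor = Callable[[List[float]], float]

# Hypothetical saved model: the same linear model described under two flavors.
saved_model = {
    "flavors": {
        "linear_weights": {"weights": [2.0, -1.0], "bias": 0.5},  # native flavor
        "generic_function": {"wraps": "linear_weights"},          # generic flavor
    }
}


def load_linear(model: dict) -> Predictor:
    """Loader for tools that understand the native flavor."""
    conf = model["flavors"]["linear_weights"]
    weights, bias = conf["weights"], conf["bias"]
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) + bias


def load_generic(model: dict) -> Predictor:
    """Loader for tools that only need 'something callable'."""
    native = model["flavors"]["generic_function"]["wraps"]
    return LOADERS[native](model)


LOADERS: Dict[str, Callable[[dict], Predictor]] = {
    "linear_weights": load_linear,
    "generic_function": load_generic,
}


def load_for_tool(model: dict, supported: List[str]) -> Predictor:
    """Pick the first flavor that the tool supports and the model provides."""
    for flavor in supported:
        if flavor in model["flavors"]:
            return LOADERS[flavor](model)
    raise ValueError(f"model offers none of the flavors {supported}")


# A batch scorer that knows nothing about linear models can still use this one.
batch_scorer = load_for_tool(saved_model, supported=["generic_function"])
print(batch_scorer([1.0, 3.0]))
```

The design point is that a deployment tool depends on a flavor, not on the library that trained the model, so new tools and new model types can be added independently.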
31. Registry motivation
What’s deployed in production? Who did it?
How can I roll back to the previous model?
Can I get approval before deploying?
Where do I store and archive my models?
Example model names: my_model, my_model_last, my_model_new
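One way to picture the alternative to names like my_model_last and my_model_new is a store that keeps numbered versions under a single name and remembers who registered each one. The sketch below is a stdlib-only illustration under that assumption. `VersionStore`, its methods, the user names, and the artifact paths are hypothetical and do not reflect the MLflow Model Registry API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    artifact_uri: str   # where the model files are stored and archived
    registered_by: str  # who did it


class VersionStore:
    """Hypothetical store: one model name, many numbered versions."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[ModelVersion]] = {}
        self._deployments: Dict[str, List[int]] = {}  # deployment history per name

    def register(self, name: str, artifact_uri: str, user: str) -> ModelVersion:
        versions = self._versions.setdefault(name, [])
        mv = ModelVersion(name, len(versions) + 1, artifact_uri, user)
        versions.append(mv)
        return mv

    def deploy(self, name: str, version: int) -> None:
        if not 1 <= version <= len(self._versions.get(name, [])):
            raise KeyError(f"{name} has no version {version}")
        self._deployments.setdefault(name, []).append(version)

    def in_production(self, name: str) -> ModelVersion:
        """Answers: what is deployed in production, and who registered it?"""
        version = self._deployments[name][-1]
        return self._versions[name][version - 1]

    def rollback(self, name: str) -> ModelVersion:
        """Return to the previously deployed version."""
        history = self._deployments[name]
        if len(history) < 2:
            raise RuntimeError("no earlier deployment to roll back to")
        history.pop()
        return self.in_production(name)


store = VersionStore()
store.register("my_model", "models/my_model/1", user="alice")
store.register("my_model", "models/my_model/2", user="bob")
store.deploy("my_model", 1)
store.deploy("my_model", 2)

current = store.rollback("my_model")
print(current.version, current.registered_by)
```

With explicit versions and a deployment history, "roll back to the previous model" is a single operation rather than a guess about which file was last in use.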
32. Model Registry
Users: Data Scientists, Deployment Engineers
Stages: Experimental, Staging, A/B Tests, Production
Downstream Users: Automated Jobs, REST Serving
Reviewers + CI/CD Tools
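The stages and the reviewer role can be sketched as a small request-and-approve workflow. The stage names come from the slide; everything else is an assumption made for illustration, including the `RegisteredVersion` class, the audit-log layout, and the rule that a request must be approved by someone other than the requester. It is a plain-Python toy, not the MLflow Model Registry API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

STAGES = ("Experimental", "Staging", "A/B Tests", "Production")


@dataclass
class RegisteredVersion:
    """Hypothetical model version that moves between stages under review."""
    name: str
    version: int
    stage: str = "Experimental"
    pending: Optional[Tuple[str, str]] = None  # (target stage, requester)
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def request_transition(self, to_stage: str, requested_by: str) -> None:
        if to_stage not in STAGES:
            raise ValueError(f"unknown stage: {to_stage}")
        self.pending = (to_stage, requested_by)
        self.audit_log.append(("requested", to_stage, requested_by))

    def approve(self, approved_by: str) -> None:
        if self.pending is None:
            raise RuntimeError("no transition has been requested")
        to_stage, requested_by = self.pending
        # Assumed policy for this sketch: no self-approval.
        if approved_by == requested_by:
            raise PermissionError("a request must be approved by someone else")
        self.stage, self.pending = to_stage, None
        self.audit_log.append(("approved", to_stage, approved_by))


mv = RegisteredVersion("my_model", 3)
mv.request_transition("Staging", requested_by="data_scientist")
mv.approve(approved_by="reviewer")
print(mv.stage)
print(mv.audit_log)
```

The audit log is what lets a team answer, after the fact, who requested and who approved each stage change.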
33. MLflow Model Registry Benefits
One Collaborative Hub: The Model Registry provides a central hub for
making models discoverable, improving collaboration and knowledge
sharing across the organization.
Manage the entire Model Lifecycle (MLOps): The Model Registry provides
lifecycle management for models from experimentation to deployment,
improving reliability and robustness of the model deployment process.
Visibility and Governance: The Model Registry provides full visibility into
the deployment stage of every model and into who requested and approved
each change, allowing for full governance and auditability.