Managing the Machine Learning Lifecycle with MLflow
Arduino Cascella
@ MLOps Paris, January 16th 2020
Agenda
• What is MLflow?
• Demo
• Questions & Answers
Deployment lifecycle of ML Models?
Can you find what’s wrong with this true story?
• Did anything change in the feature engineering?
• How did the hyperparameters change?
• What data was this model trained on?
• How did the offline metrics change?
• ....
The difference between releasing Software and deploying ML Models

Software:
• Write code
• Write unit tests
• Send for review
• Get approvals
• Commit
• Release testing
• Release

ML Models:
• Analyze data
• Put data into the right format
• Write model code
• Train and evaluate model
• Experiment with params, model structure
• Deploy … by email?
• Monitor performance and trigger retraining

Goal: Software meets a functional specification; ML Models optimize a metric, e.g. CTR.
Quality: Software quality depends on code; ML Model quality depends on data, code, model, params, ...
Tools: Software typically uses one stack; ML Models combine many libraries and tools.
Outcome: Software works deterministically; ML Models keep changing with data, etc.
In summary, deploying ML Models is hard!
MLflow: An Open Source ML Platform
[Chart: # of Contributors vs. Months since Project Launch, growing to ~140 contributors over the first 45 months]
Components
• Tracking: record and query experiments: code, data, config, results
• Projects: packaging format for reproducible runs on any platform
• Models: general format that standardizes deployment paths
• Model Registry (new): centralized and collaborative model lifecycle management
Tracking
[Diagram: Notebooks, Local Apps, and Cloud Jobs log to the Tracking Server through the API; the UI queries it. The server stores Parameters, Metrics, Artifacts, Models, and Metadata, and experiment data can be read back as a Spark Data Source.]
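A minimal sketch of wiring this up (the SQLite path, host, and port are illustrative assumptions, not from the talk): start a tracking server with the MLflow CLI, then point clients at it.

$ mlflow server \
    --backend-store-uri sqlite:///mlflow.db \
    --default-artifact-root ./mlruns \
    --host 0.0.0.0 --port 5000

import mlflow

# Direct this client to the server started above (assumed URL)
mlflow.set_tracking_uri("http://localhost:5000")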
Tracking: record and query experiments: code, configs, results, etc.
Tracking API

import mlflow

# log the model's tuning parameters
with mlflow.start_run():
    mlflow.log_param("layers", layers)
    mlflow.log_param("alpha", alpha)

    # log the model's metrics
    mlflow.log_metric("mse", model.mse())

    # log a plot as a run artifact; log_artifact takes a local file path,
    # so save the figure first (assumes model.plot returns a matplotlib figure)
    model.plot(test_df).savefig("plot.png")
    mlflow.log_artifact("plot.png")
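Logged runs can be queried back programmatically as well as through the UI. A small sketch, assuming an experiment ID of "0" for illustration:

import mlflow

# Fetch runs as a pandas DataFrame and pick the one with the lowest MSE
runs = mlflow.search_runs(experiment_ids=["0"])
best = runs.sort_values("metrics.mse").iloc[0]
print(best["run_id"], best["params.alpha"], best["metrics.mse"])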
X, y = get_training_data()
opt = keras.optimizers.Adam(lr=params["learning_rate"],
                            beta_1=params["beta_1"],
                            beta_2=params["beta_2"],
                            epsilon=params["epsilon"])
model = Sequential()
model.add(Dense(int(params["units"]), ...))
model.add(Dense(1))
model.compile(loss="mse", optimizer=opt)
history = model.fit(X, y, epochs=50, batch_size=64, validation_split=.2)
Automatic Logging from TF and Keras

# Enable MLflow Autologging: params, metrics, and the model
# are then logged without explicit log_* calls
mlflow.keras.autolog()

X, y = get_training_data()
opt = keras.optimizers.Adam(lr=params["learning_rate"],
                            beta_1=params["beta_1"],
                            beta_2=params["beta_2"],
                            epsilon=params["epsilon"])
model = Sequential()
model.add(Dense(int(params["units"]), ...))
model.add(Dense(1))
model.compile(loss="mse", optimizer=opt)
history = model.fit(X, y, epochs=50, batch_size=64, validation_split=.2)

https://www.mlflow.org/docs/latest/tracking.html#automatic-logging-from-tensorflow-and-keras-experimental
Projects
[Diagram: a Project Spec bundles Code, Config, and Metadata for reproducible Local or Remote Execution.]
my_project/
├── MLproject
├── conda.yaml
├── main.py
└── model.py

MLproject file:

conda_env: conda.yaml
entry_points:
  main:
    parameters:
      training_data: path
      lambda: {type: float, default: 0.1}
    command: python main.py {training_data} {lambda}

$ mlflow run git://<my_project>
mlflow.run("git://<my_project>", ...)
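Entry-point parameters can be overridden per run; a small sketch with illustrative values (-P is the standard mlflow run flag):

$ mlflow run my_project -P training_data=data.csv -P lambda=0.5

# equivalent call from Python
mlflow.run("my_project", parameters={"training_data": "data.csv", "lambda": 0.5})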
Models
[Diagram: the Model Format packages a model in multiple flavors (Flavor 1, Flavor 2); simple model flavors usable by many tools feed downstream deployment targets: Containers, Batch & Stream Scoring, Cloud Inference Services, In-Line Code.]
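As a sketch of what a flavor buys you (sk_model and test_df are assumed to exist): a model logged with a library-specific flavor can be loaded back through the generic python_function flavor, so downstream tools don't need to know which library trained it.

import mlflow
import mlflow.pyfunc
import mlflow.sklearn

# Log a scikit-learn model under the current run
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(sk_model, "model")

# Load it back via the generic pyfunc flavor, independent of the training library
loaded = mlflow.pyfunc.load_model("runs:/{}/model".format(run.info.run_id))
preds = loaded.predict(test_df)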
Registry motivation
• What’s deployed in production? Who did it?
• How can I roll back to the previous model?
• Can I get approval before deploying?
• Where do I store and archive my models?
Ad-hoc naming conventions don’t answer these questions: my_model, my_model_last, my_model_new
Model Registry
[Diagram: Data Scientists and Deployment Engineers move models through stages (Experimental, Staging, A/B Tests, Production) in the Model Registry; Reviewers and CI/CD Tools gate the transitions; Automated Jobs, REST Serving, and Downstream Users consume the registered models.]
MLflow Model Registry Benefits
One Collaborative Hub: the Model Registry provides a central hub for making models discoverable, improving collaboration and knowledge sharing across the organization.
Manage the entire Model Lifecycle (MLOps): the Model Registry provides lifecycle management for models from experimentation to deployment, improving the reliability and robustness of the model deployment process.
Visibility and Governance: the Model Registry provides full visibility into the deployment stage of all models and who requested and approved changes, allowing for full governance and auditability.
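A minimal sketch of this workflow in code (the model name "churn_model" and run_id are illustrative; the calls are MLflow's registry API in recent versions):

import mlflow
from mlflow.tracking import MlflowClient

# Register a logged model as a new version of "churn_model"
result = mlflow.register_model("runs:/{}/model".format(run_id), "churn_model")

# Promote that version to Production once it has been reviewed
client = MlflowClient()
client.transition_model_version_stage(
    name="churn_model", version=result.version, stage="Production")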
Demo steps:
• Using the latest production Model
• Deploying the new best Model as a new version to the Model Registry
• Using the new production Model
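Fetching whatever version currently sits in a stage goes through the models:/ URI scheme; a small sketch, reusing the illustrative names from above:

import mlflow.pyfunc

# "models:/<name>/<stage>" resolves to the latest version in that stage
model = mlflow.pyfunc.load_model("models:/churn_model/Production")
predictions = model.predict(batch_df)  # batch_df is assumed to exist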
Model Deployment Options
• Containers
• Batch & Stream Scoring
• Cloud Inference Services
• In-Line Code
• OSS Inference Solutions
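As one concrete example of these options (run ID and port are placeholders): the MLflow CLI can stand up a local REST scoring server for any pyfunc model.

# Serve a logged model as a local REST endpoint
$ mlflow models serve -m runs:/<run_id>/model -p 5001

# Score a record against it; pandas-split is the JSON layout
# MLflow's scoring server expects
$ curl http://localhost:5001/invocations \
    -H 'Content-Type: application/json; format=pandas-split' \
    -d '{"columns": ["x1", "x2"], "data": [[1.0, 2.0]]}'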
Get started:

$ pip install mlflow

mlflow.org | github.com/mlflow | twitter.com/MLflow | databricks.com/mlflow

Try Managed MLflow on Databricks Community Edition!
databricks.com/try