Simplifying ML Model Management with MLflow
ML development is harder than traditional software development
Traditional Software
• Goal: Meet a functional specification
• Quality depends only on code
• Typically pick one software stack

Machine Learning
• Goal: Optimize a metric (e.g., accuracy); constantly experiment to improve it
• Quality depends on input data and tuning parameters
• Compare and combine many libraries, models & algorithms for the same task
Production ML is Even Harder
[Diagram: Raw Data → Data Prep → Training → Deployment, spanning the data engineer, ML engineer, and application developer roles]

• ML apps must be fed new data to keep working
• Design, retraining & inference are done by different people
Solution: Machine Learning Platforms
Software to manage the ML lifecycle
Examples: Uber Michelangelo,
Google TFX, Facebook FBLearner
[Diagram: the same Raw Data → Data Prep → Training → Deployment pipeline, wrapped in platform services: versioning, CI/CD, QA, ops, monitoring, etc.]
MLflow: An Open Source ML Platform
Three components:
• Tracking: experiment tracking
• Projects: reproducible runs
• Models: model packaging
140 contributors, 800K downloads/month
Works with any ML library, programming language, and deployment tool
MLflow Tracking: Experiments

[Diagram: notebooks, local apps, and cloud jobs log runs to a Tracking Server through the tracking API; runs are inspected through the UI or queried over the REST API]

mlflow.log_param("alpha", 0.5)
mlflow.log_metric("accuracy", 0.9)
...
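
As context, a minimal, self-contained sketch of logging a run with the Tracking API (the server URI, parameter, and metric values are placeholders):

import mlflow

# Optional: point at a remote tracking server; without this, MLflow
# writes to a local ./mlruns directory.
# mlflow.set_tracking_uri("http://tracking-server:5000")  # placeholder URI

with mlflow.start_run():
    mlflow.log_param("alpha", 0.5)      # placeholder hyperparameter
    mlflow.log_metric("accuracy", 0.9)  # placeholder evaluation metric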
Tracking UI: Inspecting Runs
MLflow Projects: Reproducible Runs

[Diagram: a project spec bundles code, data, and config; the project can then be run locally or on a remote cluster]
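
As a sketch, a project can be launched from Python with mlflow.projects.run (the repository URL and parameter names below are hypothetical):

import mlflow

# Run a project straight from a Git repo. MLflow reads the MLproject
# spec, sets up the declared environment, and records the run.
submitted = mlflow.projects.run(
    uri="https://github.com/example/my-ml-project",  # hypothetical repo
    entry_point="main",
    parameters={"alpha": 0.5},                       # hypothetical parameter
)
print(submitted.run_id)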
MLflow Models: Model Packaging

[Diagram: a model is packaged once in MLflow's model format, with the model logic exposed through one or more flavors (Python flavor, ONNX flavor, ...); the same package can then be deployed for batch & stream scoring or REST serving, or used for evaluation & debugging with tools such as LIME and TCAV]
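
A minimal sketch of the packaging round trip: log a model under a run, then load it back through the generic python_function flavor for scoring (the sklearn model is just an illustrative choice):

import mlflow
import mlflow.pyfunc
import mlflow.sklearn
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Log the model under the run's "model" artifact path.
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(model, "model")

# Any flavor can be loaded back as a generic pyfunc for scoring.
loaded = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
print(loaded.predict(pd.DataFrame(X[:5])))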
MLflow Talks at This Summit
New in Last 6 Months
• MLflow 1.0 (and 1.1, 1.2, 1.3)
• Autologging in TensorFlow & Keras
• DataFrame search API (sketched below)
• Kubernetes, HDFS & Seldon integrations
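
The DataFrame search API returns runs from the tracking server as a pandas DataFrame, which makes it easy to filter and sort experiments programmatically; a minimal sketch (the experiment ID, metric, and parameter names are placeholders):

import mlflow

# Fetch all runs of an experiment as a pandas DataFrame.
runs = mlflow.search_runs(experiment_ids=["0"])  # placeholder experiment ID

# Metric and param columns are named "metrics.<name>" / "params.<name>".
best = runs.sort_values("metrics.accuracy", ascending=False).head(3)
print(best[["run_id", "metrics.accuracy", "params.alpha"]])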
MLflow Autologging
model = keras.models.Sequential()
model.add(layers.Dense(hidden_units, ...))
model.fit(X_train, y_train)
test_loss = model.evaluate(X_test, y_test)
MLflow Autologging
with mlflow.start_run():
    model = keras.models.Sequential()
    model.add(layers.Dense(hidden_units, ...))
    history = model.fit(X_train, y_train)
    test_loss = model.evaluate(X_test, y_test)
    mlflow.log_param("hidden_units", hidden_units)
    mlflow.log_param("learning_rate", learning_rate)
    mlflow.log_metric("train_loss", history.history["loss"][-1])
    mlflow.log_metric("test_loss", test_loss)
    mlflow.keras.log_model(model, "model")
mlflow.keras.autolog()
model = keras.models.Sequential()
model.add(layers.Dense(hidden_units, ...))
model.fit(X_train, y_train)
test_loss = model.evaluate(X_test, y_test)
MLflow’s Next Goal: Model Management
The Model Management Problem
When you’re working on one ML app alone, storing your models in files is manageable:

classifier_v1.h5
classifier_v2.h5
classifier_v3_sept_19.h5
classifier_v3_new.h5
…
The Model Management Problem
When you work in a large organization with many models, management becomes a major challenge:
• Where can I find the best version of this model?
• How was this model trained?
• How can I track docs for each model?
• How can I review models?

[Diagram: model developers, reviewers, and model users with no clear hand-off between them]
MLflow Model Registry
• Repository of named, versioned models with comments & tags
• Track each model’s stage: dev, staging, production, archived
• Easily load a specific version (sketched below)
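
A minimal sketch of that workflow with the registry APIs (the model name, run ID, and stage choices below are placeholders):

import mlflow
import mlflow.pyfunc
from mlflow.tracking import MlflowClient

run_id = "..."  # placeholder: ID of a run that logged a model

# Register the logged model under a name; each registration
# creates a new version (1, 2, ...).
result = mlflow.register_model(
    model_uri=f"runs:/{run_id}/model",
    name="wind-power-forecast",  # placeholder model name
)

# After review, promote the version through stages.
client = MlflowClient()
client.transition_model_version_stage(
    name="wind-power-forecast",
    version=result.version,
    stage="Production",
)

# Downstream code loads whatever version is currently in Production.
model = mlflow.pyfunc.load_model("models:/wind-power-forecast/Production")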
Model Registry Workflow
[Diagram: model developers push versions to the Model Registry; reviewers and CI/CD tools gate stage transitions; downstream users, automated jobs, and REST serving pull models from the registry]
Model Registry Availability
Pull request available: tinyurl.com/registry-pr
Available to Databricks customers
[Diagram: wind direction + wind speed = power]
Modeling wind power availability
[Diagram: an ML model turns a weather forecast into a power forecast]
Modeling wind power availability
[Diagram: an hourly job feeds the latest weather forecast to the ML model to produce the power forecast]
MLflow Model Registry
[Diagram: the hourly job now pulls its model from the Model Registry to turn the weather forecast into the power forecast]

Before the registry:
• Easy to deploy bad models
• Limited model information
• No audit trail

With the registry:
• Model administration and review
• Model tracking with MLflow
• Centralized activity logs and comments
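
As a sketch, the hourly job then reduces to loading whatever model version is in Production and scoring the latest forecast (the model name and input schema are hypothetical):

import mlflow
import mlflow.pyfunc
import pandas as pd

# Hourly job: fetch the current Production model from the registry.
model = mlflow.pyfunc.load_model("models:/wind-power-forecast/Production")

# Score the latest weather forecast (hypothetical feature schema).
weather = pd.DataFrame({"wind_direction": [210.0], "wind_speed": [8.5]})
power_forecast = model.predict(weather)
print(power_forecast)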
Thank you
Getting Started with MLflow
pip install mlflow
Docs and tutorials: mlflow.org
Databricks Community Edition: databricks.com/try