Advanced MLflow: Multi-Step Workflows, Hyperparameter Tuning and Integrating Custom Libraries
Matei Zaharia, Sid Murching, Tomas Nykodym
September 20th, 2018
1
What is MLflow?
Open source platform to accelerate the ML lifecycle
● Tracking: record and query experiments (code, data, config, results)
● Projects: packaging format for reproducible runs on any platform
● Models: general model format that supports diverse deployment tools
What’s New in MLflow?
Many updates since our last meetup at MLflow 0.2.1!
• Java API
• PyTorch, Keras, H2O, GCS, SFTP integrations
• APIs for tags & querying run history
• Optimized Spark ML serving with MLeap
• UX improvements
3
What’s New in MLflow?
4
What’s New in MLflow?
5
Ongoing Development
R API for MLflow (led by RStudio)
• See Javier’s talk later in this meetup!
CRUD UI and APIs (naming, annotating & deleting runs)
Improved experiment view and search UX
6
Learning More About MLflow
pip install mlflow to get started
Find docs & examples at mlflow.org
tinyurl.com/mlflow-slack
7
This Meetup
Go beyond the basics to show how to use MLflow’s components
for complex ML workflows
• Multi-step workflows with caching
• Hyperparameter tuning
8
Multistep ML Workflows
[Diagram: a project spec bundles code, data, and config, and can be executed locally or remotely]
MLflow Projects
● With projects: parametrized, dependency-agnostic runs of arbitrary code
● Can chain projects into multistep workflows
● Tracking server: source of truth for output of individual steps
11
Multistep Workflows
Can debug & develop steps independently
12
Demo
● Find this example at mlflow/examples/multistep_workflow
● MovieLens: given a user and a movie, predict a rating
13
Demo
14
Hyperparameter Optimization
ML algorithms have many parameters that affect performance:
● Learning rate, momentum, number and size of network layers, …
Parameter selection strategies:
● Manual
○ Depends on the data scientist's skills; high variance, error prone
● Algorithmic
○ Reduced variance, less bias, lower chance of error
○ Strategies include grid search, random search, and model-based optimization
15
Model-Based Hyperparam Optimization
1. Select the "best" parameters based on the current model:
Params[n+1] = Select(Model[n])
2. Obtain a new data point by training with the new parameters:
Metric[n+1] = Train(Params[n+1])
3. Use the new data point to update the model:
Model[n+1] = Update(Model[n], Metric[n+1])
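The loop above can be sketched in pure Python, with a toy quadratic objective standing in for real training and the observation history standing in for the surrogate model (all names and the sampling strategy here are illustrative, not a specific optimizer):

```python
import random

def train(params):
    # Toy stand-in for Train(): a quadratic with its minimum at lr = 0.1.
    return (params["lr"] - 0.1) ** 2

def select(model):
    # Select(): propose Params[n+1] from the current "model".
    # Here the model is simply the history of (params, metric) pairs,
    # and we sample near the best point seen so far.
    if not model:
        return {"lr": random.uniform(0.0, 1.0)}
    best_params, _ = min(model, key=lambda pair: pair[1])
    return {"lr": min(1.0, max(0.0, best_params["lr"] + random.gauss(0.0, 0.05)))}

def update(model, params, metric):
    # Update(): fold the new observation into the model.
    return model + [(params, metric)]

random.seed(0)
model = []
for _ in range(30):
    params = select(model)                 # Params[n+1] = Select(Model[n])
    metric = train(params)                 # Metric[n+1] = Train(Params[n+1])
    model = update(model, params, metric)  # Model[n+1] = Update(Model[n], Metric[n+1])

best_params, best_metric = min(model, key=lambda pair: pair[1])
```

Real model-based optimizers (e.g. the Gaussian-process methods behind hyperopt and gpyopt) replace the naive history-plus-local-sampling here with a proper surrogate model and acquisition function.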
Hyperparameter Tuning with MLflow
17
[Diagram: a hyperparameter search run (Projects) launches child train-model runs via mlflow run ...; each child run logs its metric with mlflow.log_metric() and its model with mlflow.log_artifact (Models); the search run reads metrics back from the tracking server (Tracking) via mlflow.get_metric()]
MLflow Hyperparameter Example
You can find this example at mlflow/examples/hyperparam.
Goal: predict wine quality from measured properties
• data: acidity, sugar content, chlorides, alcohol, ...
• target: quality score, an integer between 3 and 9
• metric: "rmse"
The MLproject file has the following entry points:
• train - train a deep learning model with Keras; has two tunable parameters: learning rate and momentum
• hyperparam - tune train with random search, hyperopt, or gpyopt
18
Thank you
19