Simplifying Model Management with MLflow


Last summer, Databricks launched MLflow, an open source platform to manage the machine learning lifecycle, including experiment tracking, reproducible runs and model packaging. MLflow has grown quickly since then, with over 120 contributors from dozens of companies, including major contributions from RStudio and Microsoft. It has also gained new capabilities such as automatic logging from TensorFlow and Keras, Kubernetes integrations, and a high-level Java API. In this talk, we'll cover some of the new features that have come to MLflow, and then focus on a major upcoming feature: model management with the MLflow Model Registry. Many organizations face challenges tracking which models are available in the organization and which ones are in production. The MLflow Model Registry provides a centralized database to keep track of these models, share and describe new model versions, and deploy the latest version of a model through APIs. We'll demonstrate how these features can simplify common ML lifecycle tasks.


  1. Simplifying ML Model Management with MLflow
  2. ML development is harder than traditional software development
  3. Traditional Software vs. Machine Learning. Traditional software: the goal is to meet a functional specification, quality depends only on code, and teams typically pick one software stack. Machine learning: the goal is to optimize a metric (e.g., accuracy), which means constantly experimenting to improve it; quality depends on input data and tuning parameters; and many libraries, models & algorithms must be compared and combined for the same task.
  4. Production ML Is Even Harder. The pipeline runs raw data through data prep, training, and deployment, spanning a data engineer, an ML engineer, and an application developer. ML apps must be fed new data to keep working, and design, retraining & inference are done by different people.
  5. Solution: Machine Learning Platforms. Software to manage the ML lifecycle across data prep, training, and deployment, plus versioning, CI/CD, QA, ops, monitoring, etc. Examples: Uber Michelangelo, Google TFX, Facebook FBLearner.
  6. MLflow: An Open Source ML Platform. Three components: Tracking (experiment tracking), Projects (reproducible runs), and Models (model packaging). 140 contributors, 800K downloads/month. Works with any ML library, programming language, and deployment tool.
  7. MLflow Tracking: Experiments. Notebooks, local apps, and cloud jobs log to a tracking server (with a UI and a REST API) through calls such as mlflow.log_param("alpha", 0.5) and mlflow.log_metric("accuracy", 0.9).
  8. Tracking UI: Inspecting Runs
  9. MLflow Projects: Reproducible Runs. A project spec ties together code, config, and data for local execution or execution on a remote cluster. MLflow Models: Model Packaging. A packaging format with multiple flavors (e.g., an ONNX flavor and a Python flavor) wrapping the model logic, supporting batch & stream scoring, REST serving, and evaluation & debugging tools such as LIME and TCAV.
  10. MLflow Talks at This Summit
  11. New in the Last 6 Months. MLflow 1.0 (and 1.1, 1.2, 1.3); autologging in TensorFlow & Keras; a DataFrame search API; Kubernetes, HDFS & Seldon integrations.
  12. MLflow Autologging. Typical training code with no tracking: model = keras.models.Sequential(); model.add(layers.Dense(hidden_units, ...)); model.fit(X_train, y_train); test_loss = model.evaluate(X_test, y_test)
  13. MLflow Autologging. Manual logging wraps the same training code in with mlflow.start_run(): and adds explicit calls: mlflow.log_param("hidden_units", hidden_units), mlflow.log_param("learning_rate", learning_rate), mlflow.log_metric("train_loss", train_loss), mlflow.log_metric("test_loss", test_loss), and mlflow.keras.log_model(model). With autologging, a single mlflow.keras.autolog() call before the unchanged training code records all of this automatically.
  14. MLflow’s Next Goal: Model Management
  15. The Model Management Problem. When you’re working on one ML app alone, storing your models in files is manageable: classifier_v1.h5, classifier_v2.h5, classifier_v3_sept_19.h5, classifier_v3_new.h5, …
  16. The Model Management Problem. When you work in a large organization with many models, management becomes a major challenge for model developers, reviewers, and model users: Where can I find the best version of this model? How was this model trained? How can I track docs for each model? How can I review models?
  17. MLflow Model Registry. A repository of named, versioned models with comments & tags. Track each model’s stage: dev, staging, production, archived. Easily load a specific version.
  18. Model Registry Workflow. The model developer pushes versions to the Model Registry; downstream users, automated jobs, and REST serving pull from it, with reviewers and CI/CD tools in the loop.
  19. Model Registry Availability. Pull request available: tinyurl.com/registry-pr. Available to Databricks customers.
  20. Wind speed + wind direction = power
  21. Modeling wind power availability: an ML model turns a weather forecast into a power forecast.
  22. Modeling wind power availability: an hourly job feeds the weather forecast through the ML model to produce the power forecast.
  23. MLflow Model Registry for the power forecast. Without it: easy to deploy bad models, limited model information, no audit trail. With it: model administration and review, model tracking with MLflow, centralized activity logs and comments.
  24. Thank you
  25. Getting Started. pip install mlflow. Docs and tutorials: mlflow.org. Databricks Community Edition: databricks.com/try
