: Infrastructure for Complete
Machine Learning Lifecycle
Andrew Chen & Mani Parkhe
Sept 6th, 2018
Outline
ML development challenges
How MLflow tackles these
Demo
How to get started
Machine Learning
Development is Complex
ML Lifecycle
4
Delta
Data Prep
Training
Deploy
Raw Data
μ
λ θ Tuning
Scale
μ
λ θ Tuning
Scale
Scale
Scale
Model
Exchange
Governance
“I build 100s of models/day to lift revenue, using any library:
MLlib, PyTorch, R, etc. There’s no easy way to see what data
went in a model from a week ago, tune it and rebuild it.”
-- Chief scientist at ad tech firm
Example
Example
“Our company has 100 teams using ML worldwide. We can’t
share work across them: when a new team tries to run some
code, it often doesn’t even give the same result.”
-- Large consumer electronics firm
Custom ML Platforms
Facebook FBLearner, Uber Michelangelo, Google TFX
+Standardize the data prep / training / deploy loop:
if you work with the platform, you get these!
– Limited to a few algorithms or frameworks
– Tied to one company’s infrastructure
Can we provide similar benefits in an open manner?
Introducing
Open machine learningplatform
• Works with any ML library& language
• Runs the same wayanywhere (e.g. any cloud)
• Designed to be useful for 1 or 100,000 person orgs
MLflow Components
9
Tracking
Record and query
experiments: code,
configs, results,…etc
Projects
Packaging format
for reproducible runs
on any platform
Models
General model format
that supports diverse
deployment tools
Demo
bathrooms: 1
bedrooms: 2
accommodates: 4
total_reviews: 45
cleanliness_rating:9
location_rating:10
checkin_rating: 10
zip_code: 94105
price: 150
f (x)
Model
listing attributes
based on data from insideairbnb.com
Goal: Predict Price of Airbnb Listings
Notebooks
LocalApps
CloudJobs
Tracking Server
UI
API
MLflow	Tracking
Python or
REST API
Project Spec
Code DataConfig
LocalExecution
Remote Execution
MLflow	Projects
Model Format
Flavor 2Flavor 1
Run Sources
InferenceCode
Batch & Stream Scoring
Cloud ServingTools
MLflow	Models
Simple model flavors
usableby many tools
Getting Started with MLflow
Install with pip install mlflow
Find detailed tutorials at mlflow.org
Sign up at databricks.com/mlflow for futureupdates
Ongoing MLflow Roadmap
• TensorFlow, Keras, PyTorch, H2O, MLlib integrations ✔
• Java and R language APIs (both in review!)
• Multi-step workflows
• Hyperparameter tuning
• Data source API based on Spark data sources
• Model metadata & management
Conclusion
Workflow tools can greatly simplifythe ML lifecycle
• Improve usability for both data scientists and engineers
• Same way software dev lifecycle tools simplify development
Learn more about MLflow at mlflow.org
Thank you!
https://databricks.com/sparkaisummit/europe
30% Discount Code: Mani30
MLflow Design Philosophy
1. “API-first”, open platform
• Allow submittingruns,models,etc from anylibrary & language
• Example: a “model” can justbe a lambdafunction thatMLflow
can thendeploy in many places (Docker, AzureML, Spark UDF, …)
Key enabler: built aroundREST APIs and CLI
MLflow Design Philosophy
2. Modular design
• Let people use different components individually(e.g.,use
MLflow’s project format but not its deployment tools)
• Easy to integrateinto existing ML platforms & workflows
Key enabler: distinct components (Tracking/Projects/Models)

MLflow: Infrastructure for a Complete Machine Learning Life Cycle