Building an ML Platform with Ray and MLflow
Amog Kamsetty and Archit Kulkarni
Ray Team @ Anyscale
The Team
Archit Kulkarni, Amog Kamsetty, Dmitri Gekhtman, Edward Oakes, Richard Liaw, Kai Fricke, Simon Mo, Kathryn Zhou
Overview of Talk
▪ What are ML Platforms?
▪ Ray and its libraries
▪ MLflow
▪ Demo: An ML Platform built with MLflow and Ray
What are ML Platforms?
Typical ML Process
Fuzzy
search!
NLP, DL …
Typical ML Process -- Simplified
Execution
- Feature engineering
- Training
  - Including tuning
- Serving
  - Offline scoring, inference
  - Online serving
Management
- Tracking
  - Data, Code, Configurations
- Reproducing Results
- Deployment
  - Deploy in a variety of environments
Challenges with the ML Process
Data/Features
• Data Preparation
• Data Analysis
• Feature Engineering
• Data Pipeline
• Data Management/Feature Store
• Manages big data clusters
Model
• ML Expertise
• Implement SOTA ML Research
• Experimentation
• Manage GPU infrastructure
• Scalable training & hyperparameter tuning
Production
• A/B Testing
• Model Evaluation
• Analysis of Predictions
• Deploy in variety of environments
• CI/CD
• Highly Available prediction service
Data/Research
Scientist
Engineers
Software/Data/ML Engineer
ML Platform Abstraction
ML Platforms -- Scale
- LinkedIn:
  - 500+ “AI engineers” building models; 50+ MLP engineers
  - > 50% offline compute demand (12K servers each with 256G RAM)
  - More than 2x a year
- Uber Michelangelo, AirBnB Bighead, Facebook FBLearner, etc.
- Globally, a few Billion $ now, growing 40%+ YoY
- Many companies building ML Platforms from the ground up
ML Platforms -- Landscape
(Source: Intel Capital)
Typical ML Process -- Simplified
Execution
- Feature engineering 🔪
- Training 🍳
  - Including tuning 🧂
- Serving 🍽
  - Offline scoring, inference
  - Online serving
Management
- Tracking 📝
  - Data, Code, Configurations
- Reproducing Results 📖
- Deployment 🚚 💻
  - Deploy in a variety of environments
Ray and its Libraries
What is Ray?
• A simple/general library for distributed computing
  • Single machine or 100s of nodes
  • Agnostic to the type of work
• An ecosystem of libraries (for scaling ML and more)
  • Native: Ray RLlib, Ray Tune, Ray Serve
  • Third party: Modin, Dask, Horovod, XGBoost, PyTorch Lightning
• Tools for launching clusters on any cloud provider
Three key ideas
Execute remote functions as tasks, and instantiate remote classes as actors
• Support both stateful and stateless computations
Asynchronous execution using futures
• Enable parallelism
Distributed (immutable) object store
• Efficient communication (send arguments by reference)
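The three ideas above can be illustrated without a cluster. The following is a minimal single-machine sketch that uses Python's standard `concurrent.futures` as a stand-in. It is not how Ray is implemented, and the names `read_array`, `add`, and `Counter` simply mirror the slides that follow. Submitting a function returns a future immediately (a task), and a class whose methods all run on one dedicated worker keeps its state between calls (a rough analogue of an actor).

```python
from concurrent.futures import ThreadPoolExecutor


def read_array(values):
    # Stand-in for loading an array from storage.
    return list(values)


def add(a, b):
    # Element-wise sum of two equal-length lists.
    return [x + y for x, y in zip(a, b)]


class Counter:
    """Stateful object whose calls all go through one worker, like an actor."""

    def __init__(self):
        self.value = 0
        self._worker = ThreadPoolExecutor(max_workers=1)

    def _inc(self):
        self.value += 1
        return self.value

    def inc(self):
        # Returns a future instead of blocking the caller.
        return self._worker.submit(self._inc)

    def shutdown(self):
        self._worker.shutdown()


def run_demo():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Both reads are submitted before either result is requested.
        f1 = pool.submit(read_array, [1, 2, 3])
        f2 = pool.submit(read_array, [10, 20, 30])
        # Unlike Ray, the futures are resolved by hand before the dependent task.
        f3 = pool.submit(add, f1.result(), f2.result())
        total = f3.result()

    counter = Counter()
    futures = [counter.inc(), counter.inc()]
    counts = [f.result() for f in futures]
    counter.shutdown()
    return total, counts
```

Calling `run_demo()` returns the summed list and the two counter values in submission order, since the single worker processes calls one at a time.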
Ray API
API
Functions -> Tasks
import numpy as np

def read_array(file):
    # read array "a" from "file" (assumes a NumPy .npy file)
    a = np.load(file)
    return a

def add(a, b):
    return np.add(a, b)
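API
Functions -> Tasks
import numpy as np
import ray

ray.init()

@ray.remote
def read_array(file):
    # read array "a" from "file" (assumes a NumPy .npy file)
    a = np.load(file)
    return a

@ray.remote
def add(a, b):
    return np.add(a, b)

# .remote() returns immediately with an object reference (a future)
id1 = read_array.remote("/input1")
id2 = read_array.remote("/input2")
# task graph: id1 and id2 (read_array) feed into id3 (add)
id3 = add.remote(id1, id2)
# ray.get() blocks until the result is available
ray.get(id3)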
API
Classes -> Actors
@ray.remote
class Counter(object):
    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1
        return self.value

# Counter.remote() starts the actor; method calls return futures
c = Counter.remote()
id4 = c.inc.remote()
id5 = c.inc.remote()
ray.get([id4, id5])  # [1, 2]
API
Functions -> Tasks
@ray.remote
def read_array(file):
    # read array "a" from "file" (assumes a NumPy .npy file)
    a = np.load(file)
    return a

# num_gpus=1 asks Ray to run each call where one GPU is available
@ray.remote(num_gpus=1)
def add(a, b):
    return np.add(a, b)

id1 = read_array.remote("/input1")
id2 = read_array.remote("/input2")
id3 = add.remote(id1, id2)

Classes -> Actors
# the actor reserves one GPU for as long as it is alive
@ray.remote(num_gpus=1)
class Counter(object):
    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1
        return self.value

c = Counter.remote()
id4 = c.inc.remote()
id5 = c.inc.remote()
ray.get([id4, id5])
Ray Ecosystem
[Figure: Ray as a universal framework for distributed computing, with an ecosystem of native libraries and 3rd party libraries on top, alongside your own app.]
Ray Tune
Ray Tune: Scalable Hyperparameter Tuning
• Wide variety of algorithms (HyperBand, PBT, Bayesian optimization)
• Compatible with ML frameworks
Ray Tune focuses on simplifying execution
Easily launch distributed multi-GPU tuning jobs
Automatic fault tolerance to save 3x on GPU costs
$ ray up {cluster config}
ray.init(address="auto")
tune.run(func, num_samples=100)
Ray Tune interoperates with other HPO libraries
Ray Tune
Ax
Optuna
scikit-optimize
…
# ConvNet and steps are assumed to be defined elsewhere
def train_model(config={}):
    model = ConvNet(config)
    for i in range(steps):
        current_loss = model.train()

from ray import tune

def train_model(config={}):
    model = ConvNet(config)
    for i in range(steps):
        current_loss = model.train()
        # report the metric to Tune after every step
        tune.report(loss=current_loss)
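def train_model(config):
    model = ConvNet(config)
    for i in range(epochs):
        current_loss = model.train()
        tune.report(loss=current_loss)

# run a single trial with a fixed learning rate
tune.run(train_model,
         config={"lr": 0.1})

# run 100 trials, each with a learning rate sampled uniformly from [0.001, 0.1]
tune.run(
    train_model,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=100
)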
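Conceptually, the second `tune.run` call above draws `num_samples` configurations from the search space and runs one trial per configuration. A minimal pure-Python sketch of that random search follows. `toy_loss` is a hypothetical stand-in for training a model, and nothing here uses Ray Tune's actual machinery (no parallelism, scheduling, or fault tolerance).

```python
import random


def toy_loss(lr):
    # Hypothetical objective: pretend the best learning rate is 0.03.
    return (lr - 0.03) ** 2


def random_search(num_samples, low=0.001, high=0.1, seed=0):
    rng = random.Random(seed)
    trials = []
    for _ in range(num_samples):
        # One sampled configuration per trial.
        config = {"lr": rng.uniform(low, high)}
        trials.append((toy_loss(config["lr"]), config))
    # Return the (loss, config) pair with the lowest loss.
    return min(trials, key=lambda trial: trial[0])


best_loss, best_config = random_search(num_samples=100)
```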
from ray.tune.schedulers import ASHAScheduler

# same train_model as before; ASHA stops poorly performing trials early
# metric and mode tell the scheduler which reported value to minimize
tune.run(
    train_model,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=100,
    metric="loss",
    mode="min",
    scheduler=ASHAScheduler())
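ASHA builds on successive halving: give every trial a small budget, keep the better-performing fraction, and repeat with a larger budget. Below is a minimal synchronous sketch of that idea in pure Python. It is not Ray Tune's `ASHAScheduler`, which promotes trials asynchronously, and `toy_loss`, the reduction factor of 2, and the budget schedule are illustrative assumptions.

```python
import random


def toy_loss(lr, budget):
    # Hypothetical: loss shrinks with more budget and is lowest near lr = 0.03.
    return (lr - 0.03) ** 2 + 1.0 / budget


def successive_halving(configs, min_budget=1, reduction_factor=2):
    survivors = list(configs)
    budget = min_budget
    history = []  # (budget, number of trials evaluated at that budget)
    while True:
        # Rank the surviving trials by their loss at the current budget.
        scored = sorted(survivors, key=lambda c: toy_loss(c["lr"], budget))
        history.append((budget, len(scored)))
        if len(scored) == 1:
            return scored[0], history
        # Keep the best fraction and give the survivors a larger budget.
        keep = max(1, len(scored) // reduction_factor)
        survivors = scored[:keep]
        budget *= reduction_factor


rng = random.Random(0)
configs = [{"lr": rng.uniform(0.001, 0.1)} for _ in range(8)]
best, history = successive_halving(configs)
```

With 8 starting configurations, only one trial ever receives the largest budget, which is where the savings over running every trial to completion come from.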
from ray.tune.schedulers import PopulationBasedTraining

# same train_model as before, now scheduled with Population Based Training
tune.run(
    train_model,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=100,
    metric="loss",
    mode="min",
    scheduler=PopulationBasedTraining(...))
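import os

def train_model(config, checkpoint_dir=None):
    model = ConvNet(config)
    # resume from a checkpoint if Tune provides one
    if checkpoint_dir is not None:
        model.load_checkpoint(os.path.join(checkpoint_dir, "model.pt"))
    for i in range(epochs):
        current_loss = model.train()
        # tune.checkpoint_dir takes the current step to name the checkpoint
        with tune.checkpoint_dir(step=i) as ckpt_dir:
            model.save_checkpoint(os.path.join(ckpt_dir, "model.pt"))
        tune.report(loss=current_loss)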
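The checkpointing pattern above (load state if a checkpoint directory is given, save state every iteration) can be sketched on its own with the standard library. This is an illustration under assumptions, not Tune's checkpoint mechanism: the `state.json` file name and the `train` function are hypothetical.

```python
import json
import os
import tempfile


def train(total_epochs, checkpoint_dir):
    path = os.path.join(checkpoint_dir, "state.json")
    state = {"epoch": 0, "loss": 1.0}
    # Resume if a previous run left a checkpoint behind.
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    while state["epoch"] < total_epochs:
        state["epoch"] += 1
        state["loss"] *= 0.5  # stand-in for one epoch of training
        # Save after every epoch so an interruption loses at most one epoch.
        with open(path, "w") as f:
            json.dump(state, f)
    return state


with tempfile.TemporaryDirectory() as ckpt_dir:
    # First run stops after 2 epochs, as if it had been interrupted.
    first = train(total_epochs=2, checkpoint_dir=ckpt_dir)
    # Second run picks up at epoch 2 instead of starting over.
    resumed = train(total_epochs=5, checkpoint_dir=ckpt_dir)
```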
Ray Serve
Ray Serve is a Web Framework Built for Model Serving
Model Serving in Python
Ray Serve is high-performance and flexible
• Framework-agnostic
• Easily scales
• Supports batching
• Query your endpoints from HTTP and from Python
• Easily integrate with other tools
Ray Serve is built on top of Ray
Users do not need to think about:
• Interprocess communication
• Failure management
• Scheduling
Just tell Ray Serve to scale up your model.
Serve functions and stateful classes.
Ray Serve will use multiple replicas to parallelize across cores and across nodes in your cluster.
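As a rough mental model of replicas, the sketch below routes requests round-robin across several copies of a model. It is a single-process illustration only: Ray Serve's actual routing, scaling, and failure handling are not shown, and `Replica`, `Router`, and the doubling "model" are hypothetical names.

```python
import itertools


class Replica:
    """One copy of the model; counts how many requests it has served."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.handled = 0

    def predict(self, x):
        self.handled += 1
        # Stand-in for real model inference.
        return {"replica": self.replica_id, "prediction": x * 2}


class Router:
    """Sends each incoming request to the next replica in turn."""

    def __init__(self, num_replicas):
        self.replicas = [Replica(i) for i in range(num_replicas)]
        self._next = itertools.cycle(self.replicas)

    def handle(self, x):
        return next(self._next).predict(x)


router = Router(num_replicas=3)
responses = [router.handle(x) for x in range(6)]
```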
Ray Serve API
Flexibility
Query your model from HTTP:
> curl "http://127.0.0.1:8000/my/route"
Or query from Python using ServeHandle:
MLflow
Challenges of ML in production
• It’s difficult to keep track of experiments.
• It’s difficult to reproduce code.
• There’s no standard way to package and deploy models.
• There’s no central store to manage models (their versions and stage transitions).
Source: mlflow.org
What is MLflow?
• Open-source ML lifecycle management tool
• Single solution for all of the above challenges
• Library-agnostic and language-agnostic
• (Works with your existing code)
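To make the tracking challenge concrete, here is a minimal standard-library sketch of what an experiment tracker records for each run: an ID, the parameters, and the metrics over time. It is only an illustration. It does not use or mirror MLflow's API, and `RunLog` and its methods are hypothetical.

```python
import json
import os
import tempfile
import uuid


class RunLog:
    """Collects the parameters and metrics of one run and writes them to disk."""

    def __init__(self, root, experiment):
        self.run_id = uuid.uuid4().hex
        self.path = os.path.join(root, experiment, self.run_id + ".json")
        os.makedirs(os.path.dirname(self.path), exist_ok=True)
        self.record = {"run_id": self.run_id, "params": {}, "metrics": {}}

    def log_param(self, key, value):
        self.record["params"][key] = value

    def log_metric(self, key, value):
        # Metrics are kept as a history so progress can be plotted later.
        self.record["metrics"].setdefault(key, []).append(value)

    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.record, f)


with tempfile.TemporaryDirectory() as root:
    run = RunLog(root, "my_experiment")
    run.log_param("lr", 0.01)
    for loss in (0.9, 0.5, 0.3):
        run.log_metric("loss", loss)
    run.save()
    # Reading the file back is what makes the run comparable and reproducible.
    with open(run.path) as f:
        saved = json.load(f)
```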
Four key functions of MLflow
• MLflow Tracking
• MLflow Projects
• MLflow Models
• Model Registry
Source: MLflow
MLflow Tracking
MLflow Models
Ray + MLflow
Ray Tune + MLflow Tracking
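from ray.tune.integration.mlflow import MLflowLoggerCallback

def train_model(config):
    model = ConvNet(config)
    for i in range(epochs):
        current_loss = model.train()
        tune.report(loss=current_loss)

# the callback logs each trial's config and reported metrics to MLflow
tune.run(
    train_model,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=100,
    callbacks=[MLflowLoggerCallback(experiment_name="my_experiment")])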
Ray Tune + MLflow Tracking
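from ray.tune.integration.mlflow import mlflow_mixin

@mlflow_mixin
def train_model(config):
    mlflow.autolog()
    xgboost_results = xgb.train(config, ...)

tune.run(
    train_model,
    config={
        "lr": tune.uniform(0.001, 0.1),
        # mlflow_mixin reads its settings from the "mlflow" key
        "mlflow": {
            "experiment_name": "my_experiment",
            "tracking_uri": mlflow.get_tracking_uri()}},
    num_samples=100)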
MLflow + Ray Serve
> pip install mlflow-ray-serve
> ray start --head
> serve start
MLflow deployments CLI
Create deployment
> mlflow deployments create -t ray-serve -m <model URI> --name my_model -C num_replicas=100
Model URI:
• models:/MyModel/1
• runs:/93203689db9c4b50afb6869
• s3://<bucket>/<path>
• ...
MLflow deployments Python API
Create model
Integrating with Ray Serve is easy.
• Ray Serve endpoints can be called from Python.
• Clean conceptual separation:
  • Ray Serve handles data plane (processing)
  • MLflow handles control plane (metadata, configuration)
Demo: An ML Platform built with MLflow and Ray
Acknowledgements
Thanks to Jules Damji, Sid Murching, and Paul Ogilvie for their help and guidance with MLflow.
Thanks to Dmitri Gekhtman, Kai Fricke, Simon Mo, Edward Oakes, Richard Liaw, Kathryn Zhou and the rest of the Ray team!
Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.

More Related Content

What's hot

MLflow: A Platform for Production Machine Learning
MLflow: A Platform for Production Machine LearningMLflow: A Platform for Production Machine Learning
MLflow: A Platform for Production Machine Learning
Matei Zaharia
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleMLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
Databricks
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
Databricks
 
MLOps by Sasha Rosenbaum
MLOps by Sasha RosenbaumMLOps by Sasha Rosenbaum
MLOps by Sasha Rosenbaum
Sasha Rosenbaum
 
"Managing the Complete Machine Learning Lifecycle with MLflow"
"Managing the Complete Machine Learning Lifecycle with MLflow""Managing the Complete Machine Learning Lifecycle with MLflow"
"Managing the Complete Machine Learning Lifecycle with MLflow"
Databricks
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
Databricks
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOps
Carl W. Handlin
 
Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)
Animesh Singh
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
Márton Kodok
 
Learn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML LifecycleLearn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML Lifecycle
Databricks
 
Using MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOpsUsing MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOps
Weaveworks
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&M
Databricks
 
MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle
Databricks
 
How to Build a ML Platform Efficiently Using Open-Source
How to Build a ML Platform Efficiently Using Open-SourceHow to Build a ML Platform Efficiently Using Open-Source
How to Build a ML Platform Efficiently Using Open-Source
Databricks
 
Frame - Feature Management for Productive Machine Learning
Frame - Feature Management for Productive Machine LearningFrame - Feature Management for Productive Machine Learning
Frame - Feature Management for Productive Machine Learning
David Stein
 
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDPBuild Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Databricks
 
Accelerate Your ML Pipeline with AutoML and MLflow
Accelerate Your ML Pipeline with AutoML and MLflowAccelerate Your ML Pipeline with AutoML and MLflow
Accelerate Your ML Pipeline with AutoML and MLflow
Databricks
 
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Databricks
 
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudVertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Márton Kodok
 
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
Chris Hoyean Song
 

What's hot (20)

MLflow: A Platform for Production Machine Learning
MLflow: A Platform for Production Machine LearningMLflow: A Platform for Production Machine Learning
MLflow: A Platform for Production Machine Learning
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleMLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
 
MLOps by Sasha Rosenbaum
MLOps by Sasha RosenbaumMLOps by Sasha Rosenbaum
MLOps by Sasha Rosenbaum
 
"Managing the Complete Machine Learning Lifecycle with MLflow"
"Managing the Complete Machine Learning Lifecycle with MLflow""Managing the Complete Machine Learning Lifecycle with MLflow"
"Managing the Complete Machine Learning Lifecycle with MLflow"
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOps
 
Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 
Learn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML LifecycleLearn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML Lifecycle
 
Using MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOpsUsing MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOps
 
Apply MLOps at Scale by H&M
Apply MLOps at Scale by H&MApply MLOps at Scale by H&M
Apply MLOps at Scale by H&M
 
MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle
 
How to Build a ML Platform Efficiently Using Open-Source
How to Build a ML Platform Efficiently Using Open-SourceHow to Build a ML Platform Efficiently Using Open-Source
How to Build a ML Platform Efficiently Using Open-Source
 
Frame - Feature Management for Productive Machine Learning
Frame - Feature Management for Productive Machine LearningFrame - Feature Management for Productive Machine Learning
Frame - Feature Management for Productive Machine Learning
 
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDPBuild Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
 
Accelerate Your ML Pipeline with AutoML and MLflow
Accelerate Your ML Pipeline with AutoML and MLflowAccelerate Your ML Pipeline with AutoML and MLflow
Accelerate Your ML Pipeline with AutoML and MLflow
 
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
 
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudVertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
 
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
 

Similar to Building an ML Platform with Ray and MLflow

Ray and Its Growing Ecosystem
Ray and Its Growing EcosystemRay and Its Growing Ecosystem
Ray and Its Growing Ecosystem
Databricks
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Chetan Khatri
 
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflowImproving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Databricks
 
slide-keras-tf.pptx
slide-keras-tf.pptxslide-keras-tf.pptx
slide-keras-tf.pptx
RithikRaj25
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
PAPIs.io
 
Python and Oracle : allies for best of data management
Python and Oracle : allies for best of data managementPython and Oracle : allies for best of data management
Python and Oracle : allies for best of data management
Laurent Leturgez
 
ProgrammingPrimerAndOOPS
ProgrammingPrimerAndOOPSProgrammingPrimerAndOOPS
ProgrammingPrimerAndOOPS
sunmitraeducation
 
Rails Tips and Best Practices
Rails Tips and Best PracticesRails Tips and Best Practices
Rails Tips and Best Practices
David Keener
 
ACM Sunnyvale Meetup.pdf
ACM Sunnyvale Meetup.pdfACM Sunnyvale Meetup.pdf
ACM Sunnyvale Meetup.pdf
Anyscale
 
Flux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / PipelineFlux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / Pipeline
Jan Wiegelmann
 
Database programming
Database programmingDatabase programming
Viktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceViktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning Service
Lviv Startup Club
 
DML Syntax and Invocation process
DML Syntax and Invocation processDML Syntax and Invocation process
DML Syntax and Invocation process
Arvind Surve
 
S1 DML Syntax and Invocation
S1 DML Syntax and InvocationS1 DML Syntax and Invocation
S1 DML Syntax and Invocation
Arvind Surve
 
Eclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science ProjectEclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science Project
Matthew Gerring
 
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Universitat Politècnica de Catalunya
 
Michael Hall [InfluxData] | Become an InfluxDB Pro in 20 Minutes | InfluxDays...
Michael Hall [InfluxData] | Become an InfluxDB Pro in 20 Minutes | InfluxDays...Michael Hall [InfluxData] | Become an InfluxDB Pro in 20 Minutes | InfluxDays...
Michael Hall [InfluxData] | Become an InfluxDB Pro in 20 Minutes | InfluxDays...
InfluxData
 
D Trace Support In My Sql Guide To Solving Reallife Performance Problems
D Trace Support In My Sql Guide To Solving Reallife Performance ProblemsD Trace Support In My Sql Guide To Solving Reallife Performance Problems
D Trace Support In My Sql Guide To Solving Reallife Performance ProblemsMySQLConference
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
Databricks
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Databricks
 

Similar to Building an ML Platform with Ray and MLflow (20)

Ray and Its Growing Ecosystem
Ray and Its Growing EcosystemRay and Its Growing Ecosystem
Ray and Its Growing Ecosystem
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
 
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflowImproving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
 
slide-keras-tf.pptx
slide-keras-tf.pptxslide-keras-tf.pptx
slide-keras-tf.pptx
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
 
Python and Oracle : allies for best of data management
Python and Oracle : allies for best of data managementPython and Oracle : allies for best of data management
Python and Oracle : allies for best of data management
 
ProgrammingPrimerAndOOPS
ProgrammingPrimerAndOOPSProgrammingPrimerAndOOPS
ProgrammingPrimerAndOOPS
 
Rails Tips and Best Practices
Rails Tips and Best PracticesRails Tips and Best Practices
Rails Tips and Best Practices
 
ACM Sunnyvale Meetup.pdf
ACM Sunnyvale Meetup.pdfACM Sunnyvale Meetup.pdf
ACM Sunnyvale Meetup.pdf
 
Flux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / PipelineFlux - Open Machine Learning Stack / Pipeline
Flux - Open Machine Learning Stack / Pipeline
 
Database programming
Database programmingDatabase programming
Database programming
 
Viktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceViktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning Service
 
DML Syntax and Invocation process
DML Syntax and Invocation processDML Syntax and Invocation process
DML Syntax and Invocation process
 
S1 DML Syntax and Invocation
S1 DML Syntax and InvocationS1 DML Syntax and Invocation
S1 DML Syntax and Invocation
 
Eclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science ProjectEclipse Con Europe 2014 How to use DAWN Science Project
Eclipse Con Europe 2014 How to use DAWN Science Project
 
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
 
Michael Hall [InfluxData] | Become an InfluxDB Pro in 20 Minutes | InfluxDays...
Michael Hall [InfluxData] | Become an InfluxDB Pro in 20 Minutes | InfluxDays...Michael Hall [InfluxData] | Become an InfluxDB Pro in 20 Minutes | InfluxDays...
Michael Hall [InfluxData] | Become an InfluxDB Pro in 20 Minutes | InfluxDays...
 
D Trace Support In My Sql Guide To Solving Reallife Performance Problems
D Trace Support In My Sql Guide To Solving Reallife Performance ProblemsD Trace Support In My Sql Guide To Solving Reallife Performance Problems
D Trace Support In My Sql Guide To Solving Reallife Performance Problems
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Building an ML Platform with Ray and MLflow

  • 1. Building an ML Platform with Ray and MLflow Amog Kamsetty and Archit Kulkarni Ray Team @ Anyscale
  • 2. The Team Archit Kulkarni Amog Kamsetty Dmitri Gekhtman Edward Oakes Richard Liaw Kai Fricke Simon Mo Kathryn Zhou
  • 3. Overview of Talk ▪ What are ML Platforms? ▪ Ray and its libraries ▪ MLflow ▪ Demo: An ML Platform built with MLflow and Ray
  • 4. What are ML Platforms?
  • 6. Execution - Feature engineering - Training - Including tuning - Serving - Offline scoring, inference - Online serving Typical ML Process -- Simplified Management - Tracking - Data, Code, Configurations - Reproducing Results - Deployment - Deploy in a variety of environments
  • 7. Challenges with the ML Process Data/Features • Data Preparation • Data Analysis • Feature Engineering • Data Pipeline • Data Management/Feature Store • Manages big data clusters Model • ML Expertise • Implement SOTA ML Research • Experimentation • Manage GPU infrastructure • Scalable training & hyperparameter tuning Production • A/B Testing • Model Evaluation • Analysis of Predictions • Deploy in variety of environments • CI/CD • Highly Available prediction service Data/Research Scientist Engineers
  • 8. Challenges with the ML Process Data • Data Preparation • Data Analysis • Feature Engineering • Data Pipeline • Data Management/Feature Store • Manages big data clusters Model • ML Expertise • Implement SOTA ML Research • Experimentation • Manage GPU infrastructure • Scalable training & hyperparameter tuning Production • A/B Testing • Model Evaluation • Analysis of Predictions • Deploy in variety of environments • CI/CD • Highly Available prediction service Data/Research Scientist Software/Data/ ML Engineer ML Platform Abstraction
  • 9. ML Platforms -- Scale - LinkedIn: - 500+ “AI engineers” building models; 50+ MLP engineers - > 50% offline compute demand (12K servers each with 256G RAM) - More than 2x a year - Uber Michelangelo, AirBnB Bighead, Facebook FBLearner, etc. - Globally, a few Billion $ now, growing 40%+ YoY - Many companies building ML Platforms from the ground up
  • 10. ML Platforms -- Landscape (Source: Intel Capital)
  • 11. ML Platforms -- Landscape (Source: Intel Capital)
  • 12. Execution - Feature engineering 🔪 - Training 🍳 - Including tuning 🧂 - Serving 🍽 - Offline scoring, inference - Online serving Typical ML Process -- Simplified Management - Tracking 📝 - Data, Code, Configurations - Reproducing Results 📖 - Deployment 🚚 💻 - Deploy in a variety of environments
  • 13. Execution - Feature engineering 🔪 - Training 🍳 - Including tuning 🧂 - Serving 🍽 - Offline scoring, inference - Online serving Typical ML Process -- Simplified Management - Tracking 📝 - Data, Code, Configurations - Reproducing Results 📖 - Deployment 🚚 💻 - Variety of environments
  • 14. Ray and its Libraries
  • 15. What is Ray? • A simple/general library for distributed computing • Single machine or 100s of nodes • Agnostic to the type of work • An ecosystem of libraries (for scaling ML and more) • Native: Ray RLlib, Ray Tune, Ray Serve • Third party: Modin, Dask, Horovod, XGBoost, PyTorch Lightning • Tools for launching clusters on any cloud provider
  • 16. Three key ideas Execute remote functions as tasks, and instantiate remote classes as actors • Support both stateful and stateless computations Asynchronous execution using futures • Enable parallelism Distributed (immutable) object store • Efficient communication (send arguments by reference)
  • 18. API Functions -> Tasks def read_array(file): # read array "a" from "file" return a def add(a, b): return np.add(a, b)
  • 19. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b)
  • 20. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id1 read_array
  • 21. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id1 read_array id2 zeros read_array
  • 22. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) id1 read_array id2 zeros read_array id3 add
  • 23. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2); ray.get(id3) id1 read_array id2 zeros read_array id3 add
  • 24. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) Classes -> Actors
  • 25. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) Classes -> Actors @ray.remote class Counter(object): def __init__(self): self.value = 0 def inc(self): self.value += 1 return self.value
  • 26. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) Classes -> Actors @ray.remote class Counter(object): def __init__(self): self.value = 0 def inc(self): self.value += 1 return self.value c = Counter.remote() id4 = c.inc.remote() id5 = c.inc.remote() ray.get([id4, id5])
  • 27. API Functions -> Tasks @ray.remote def read_array(file): # read array "a" from "file" return a @ray.remote(num_gpus=1) def add(a, b): return np.add(a, b) id1 = read_array.remote("/input1") id2 = read_array.remote("/input2") id3 = add.remote(id1, id2) Classes -> Actors @ray.remote(num_gpus=1) class Counter(object): def __init__(self): self.value = 0 def inc(self): self.value += 1 return self.value c = Counter.remote() id4 = c.inc.remote() id5 = c.inc.remote() ray.get([id4, id5])
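
The task and actor pattern on the API slides above can be approximated with the Python standard library alone. The sketch below is an analogy built on `concurrent.futures`, not Ray itself: `FAKE_FILES` is a hypothetical stand-in for the input files, plain lists replace NumPy arrays, and a single-worker executor only loosely imitates an actor by serialising calls to `Counter`. One visible difference is that Ray accepts futures such as `id1` directly as task arguments, whereas here they have to be resolved by hand.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the files read on the slides.
FAKE_FILES = {"/input1": [1, 2, 3], "/input2": [10, 20, 30]}


def read_array(path):
    """Return the 'array' stored at path (a plain list in this sketch)."""
    return FAKE_FILES[path]


def add(a, b):
    """Element-wise sum of two equal-length lists."""
    return [x + y for x, y in zip(a, b)]


class Counter:
    """Stateful object, analogous to the Counter actor on the slides."""

    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1
        return self.value


# Tasks: both reads are submitted immediately and can run in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    f1 = pool.submit(read_array, "/input1")
    f2 = pool.submit(read_array, "/input2")
    # Stdlib futures are not resolved automatically when passed as
    # arguments, so the dependent task waits on them explicitly.
    f3 = pool.submit(lambda: add(f1.result(), f2.result()))
    total = f3.result()

# "Actor": one worker thread, so method calls run one at a time, in order.
counter = Counter()
with ThreadPoolExecutor(max_workers=1) as actor:
    f4 = actor.submit(counter.inc)
    f5 = actor.submit(counter.inc)
    counts = [f4.result(), f5.result()]
```

In this sketch `total` plays the role of `ray.get(id3)` and `counts` the role of `ray.get([id4, id5])`.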
  • 28. at Anyscale Your app here! Native Libraries 3rd Party Libraries Ecosystem Universal framework for Distributed computing Ray Ecosystem
  • 30. Ray Tune: Scalable Hyperparameter Tuning Wide variety of algorithms Compatible with ML frameworks HYPERBAND PBT BAYESIAN OPT.
  • 31. Ray Tune focuses on simplifying execution Easily launch distributed multi-GPU tuning jobs Automatic fault tolerance to save 3x on GPU costs https://www.vecteezy.com/ $ ray up {cluster config} ray.init(address="auto") tune.run(func, num_samples=100)
  • 32. Ray Tune interoperates with other HPO libraries Ray Tune Ax Optuna scikit-optimize …
  • 33. def train_model(config={}): model = ConvNet(config) for i in range(steps): current_loss = model.train()
  • 34. from ray import tune def train_model(config={}): model = ConvNet(config) for i in range(steps): current_loss = model.train() tune.report(loss=current_loss)
  • 35. def train_model(config): model = ConvNet(config) for i in range(epochs): current_loss = model.train() tune.report(loss=current_loss) tune.run(train_model, config={"lr": 0.1})
  • 36. tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100 ) def train_model(config): model = ConvNet(config) for i in range(epochs): current_loss = model.train() tune.report(loss=current_loss)
  • 37. tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100, scheduler=ASHAScheduler()) def train_model(config): model = ConvNet(config) for i in range(epochs): current_loss = model.train() tune.report(loss=current_loss)
  • 38. tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100, scheduler=PopulationBasedTraining(...)) def train_model(config, checkpoint_dir=None): model = ConvNet(config) if checkpoint_dir is not None: model.load_checkpoint(checkpoint_dir+"model.pt") for i in range(epochs): current_loss = model.train() with tune.checkpoint_dir() as dir: model.save_checkpoint(dir+"model.pt") tune.report(loss=current_loss)
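
To make the idea behind sampling a learning rate and stopping weak trials early more concrete, here is a minimal pure-Python sketch. It implements a simplified, synchronous successive-halving loop, which is not Ray Tune's `ASHAScheduler` (that scheduler is asynchronous and more elaborate). The loss function `train_step`, its optimum at `lr = 0.03`, and every parameter name are assumptions made up for illustration.

```python
import random


def train_step(lr, step):
    """Hypothetical loss: lowest near lr = 0.03 and shrinking with more steps."""
    return abs(lr - 0.03) + 1.0 / (step + 1)


def successive_halving(num_samples=16, rounds=3, steps_per_round=5, seed=0):
    """Sample learning rates, train in rounds, keep the better half each round."""
    rng = random.Random(seed)
    trials = [
        {"lr": rng.uniform(0.001, 0.1), "step": 0, "loss": float("inf")}
        for _ in range(num_samples)
    ]
    for _ in range(rounds):
        # Give every surviving trial the same extra training budget.
        for trial in trials:
            for _ in range(steps_per_round):
                trial["step"] += 1
                trial["loss"] = train_step(trial["lr"], trial["step"])
        # Rank by reported loss and drop the worse half.
        trials.sort(key=lambda t: t["loss"])
        trials = trials[: max(1, len(trials) // 2)]
    return trials[0]


best = successive_halving()
```

With the defaults, 16 sampled trials shrink to 8, 4, and then 2 survivors, so most of the training budget goes to the configurations that look most promising.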
  • 40. Ray Serve is a Web Framework Built for Model Serving
  • 42. Ray Serve is high-performance and flexible • Framework-agnostic • Easily scales • Supports batching • Query your endpoints from HTTP and from Python • Easily integrate with other tools
  • 43. Ray Serve is built on top of Ray For user, no need to think about: • Interprocess communication • Failure management • Scheduling Just tell Ray Serve to scale up your model.
  • 44. Serve functions and stateful classes. Ray Serve will use multiple replicas to parallelize across cores and across nodes in your cluster. Ray Serve API
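
The notion of serving a stateful class through several replicas can be illustrated with a toy dispatcher. This is only a sketch under stated assumptions: the slides do not describe how Ray Serve routes requests, so the round-robin policy, the `ReplicaPool` class, and the `Model` class below are all hypothetical, and threads stand in for the separate processes and nodes a real cluster would use.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


class Model:
    """Hypothetical stateful model that counts the requests it has served."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.served = 0

    def __call__(self, x):
        self.served += 1
        return x * 2


class ReplicaPool:
    """Toy round-robin dispatcher over several copies of a model."""

    def __init__(self, factory, num_replicas):
        self.replicas = [factory(i) for i in range(num_replicas)]
        # One single-threaded executor per replica, so each replica's
        # state is only ever touched by one thread.
        self.executors = [ThreadPoolExecutor(max_workers=1) for _ in self.replicas]
        self._next = cycle(range(num_replicas))

    def submit(self, x):
        """Send a request to the next replica and return a future."""
        i = next(self._next)
        return self.executors[i].submit(self.replicas[i], x)

    def shutdown(self):
        for executor in self.executors:
            executor.shutdown()


pool = ReplicaPool(Model, num_replicas=3)
futures = [pool.submit(i) for i in range(6)]
results = [f.result() for f in futures]
pool.shutdown()
```

Six requests spread over three replicas leave each replica with two served requests, while the caller sees only a list of futures.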
  • 45. Flexibility Query your model from HTTP: > curl "http://127.0.0.1:8000/my/route" Or query from Python using ServeHandle:
  • 47. Challenges of ML in production • It’s difficult to keep track of experiments. • It’s difficult to reproduce code. • There’s no standard way to package and deploy models. • There’s no central store to manage models (their versions and stage transitions). Source: mlflow.org
  • 48. What is MLflow? • Open-source ML lifecycle management tool • Single solution for all of the above challenges • Library-agnostic and language-agnostic • (Works with your existing code)
  • 49. Four key functions of MLflow Source: MLflow
  • 53. Ray Tune + MLflow Tracking def train_model(config): model = ConvNet(config) for i in range(epochs): current_loss = model.train() tune.report(loss=current_loss) tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100, callbacks=[MLflowLoggerCallback("my_experiment")])
  • 54. Ray Tune + MLflow Tracking @mlflow_mixin def train_model(config): mlflow.autolog() xgboost_results = xgb.train(config, ...) tune.run( train_model, config={"lr": tune.uniform(0.001, 0.1)}, num_samples=100)
  • 55. + > pip install mlflow-ray-serve > ray start --head > serve start
  • 56. MLflow deployments CLI Create deployment > mlflow deployments create -t ray-serve -m <model URI> --name my_model -C num_replicas=100 Model URI: • models:/MyModel/1 • runs:/93203689db9c4b50afb6869 • s3://<bucket>/<path> • ...
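
The model URI forms listed on the slide above differ mainly in their scheme. The helper below is a purely illustrative sketch of how the three listed forms could be told apart; MLflow resolves these URIs itself, and `describe_model_uri`, its return format, and the example bucket and path are hypothetical names introduced here.

```python
from urllib.parse import urlparse


def describe_model_uri(uri):
    """Classify a model URI by scheme (illustrative only, not MLflow code)."""
    parsed = urlparse(uri)
    if parsed.scheme == "models":
        # models:/<name>/<version>
        name, _, version = parsed.path.strip("/").partition("/")
        return {"kind": "registry", "name": name, "version": version}
    if parsed.scheme == "runs":
        # runs:/<run id>[/<artifact path>]
        run_id, _, artifact_path = parsed.path.strip("/").partition("/")
        return {"kind": "run", "run_id": run_id, "artifact_path": artifact_path}
    if parsed.scheme == "s3":
        # s3://<bucket>/<path>
        return {"kind": "s3", "bucket": parsed.netloc, "key": parsed.path.lstrip("/")}
    return {"kind": "other", "uri": uri}


registry_info = describe_model_uri("models:/MyModel/1")
s3_info = describe_model_uri("s3://example-bucket/path/to/model")
```

Here `registry_info` identifies a registered model by name and version, while `s3_info` points at a storage location.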
  • 57. MLflow deployments Python API Create model
  • 58. Integrating with Ray Serve is easy. • Ray Serve endpoints can be called from Python. • Clean conceptual separation: • Ray Serve handles data plane (processing) • MLflow handles control plane (metadata, configuration)
  • 59. Demo: An ML Platform built with MLflow and Ray
  • 60. Acknowledgements Thanks to Jules Damji, Sid Murching, and Paul Ogilvie for their help and guidance with MLflow. Thanks to Dmitri Gekhtman, Kai Fricke, Simon Mo, Edward Oakes, Richard Liaw, Kathryn Zhou and the rest of the Ray team!
  • 62. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.