Dan Crankshaw
crankshaw@cs.berkeley.edu
http://clipper.ai
https://github.com/ucbrise/clipper
June 5, 2017
Clipper: A Low-Latency Online Prediction Serving System
Collaborators
Joseph Gonzalez, Ion Stoica, Michael Franklin, Xin Wang, Giulio Zhou, Alexey Tumanov, Corey Zumar, Nishad Singh, Yujia Luo
Learning Produces a Trained Model
[Diagram, Learning: Big Data → Training → Complex Model; a Query to the Model yields a Decision (“CAT”)]
[Diagram, Serving: the Application sends a Query and receives a Decision from the Model; how the Model is served is the open question (?)]
Prediction-Serving for interactive applications
Timescale: ~10s of milliseconds
In this talk…
Ø Challenges of prediction serving
Ø Clipper overview and architecture
Ø Deploy and serve Spark models with Clipper
Prediction-Serving Raises
New Challenges
Prediction-Serving Challenges
Ø Support low-latency, high-throughput serving workloads
Ø Large and growing ecosystem of ML models and frameworks
Support low-latency, high-throughput serving workloads
Ø Models getting more complex
  Ø 10s of GFLOPs [1]
Ø Using specialized hardware for predictions
Ø Deployed on critical path
  Ø Maintain SLOs under heavy load
[1] Deep Residual Learning for Image Recognition. He et al. CVPR 2016.
Google Translate Serving
Ø 140 billion words a day [1]
Ø 82,000 GPUs running 24/7
Ø Invented New Hardware! Tensor Processing Unit (TPU)
[1] https://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html
Big Companies Build One-Off Systems
Problems:
Ø Expensive to build and maintain
Ø Highly specialized and require ML and
systems expertise
Ø Tightly-coupled model and application
Ø Difficult to change or update model
Ø Only support a single ML framework
Large and growing ecosystem of ML models and frameworks
[Diagram: a Fraud Detection application in front of an unspecified (???) serving system]
Ø Different programming languages:
Ø TensorFlow: Python and C++
Ø Spark: JVM-based
Ø Different hardware:
Ø CPUs and GPUs
Ø Different costs
Ø DNNs in TensorFlow are generally much slower
Large and growing ecosystem of ML models and frameworks
[Diagram: applications (Fraud Detection, Content Rec., Personal Asst., Robotic Control, Machine Translation) on top of frameworks (Create, VW, Caffe, ???)]
But building new serving systems for each framework is expensive…
Use existing systems: Offline Scoring
[Diagram: Big Data → Training → Model; a Batch Analytics scoring job writes (X, Y) pairs to a Datastore]
[Diagram: the Application sends a Query; Low-Latency Serving looks up the decision in the datastore]
Problems:
Ø Requires full set of queries ahead of time
  Ø Small and bounded input domain
Ø Wasted computation and space
  Ø Can render and store unneeded predictions
Ø Costly to update
  Ø Re-run batch job
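The offline-scoring pattern and its first limitation can be sketched in a few lines of Python. This is only an illustration: `toy_model`, `score_offline`, and `serve` are hypothetical names, and a plain dict stands in for the datastore.

```python
from typing import Callable, Dict, Iterable


def score_offline(model: Callable[[int], str], queries: Iterable[int]) -> Dict[int, str]:
    """Batch job: evaluate the model on every query known ahead of time."""
    return {x: model(x) for x in queries}


def serve(datastore: Dict[int, str], query: int) -> str:
    """Low-latency path: a lookup only; the model is never invoked here."""
    return datastore[query]


def toy_model(x: int) -> str:
    """Hypothetical stand-in for a trained model."""
    return "fraud" if x % 7 == 0 else "ok"


# Precompute decisions for a small, bounded input domain.
datastore = score_offline(toy_model, range(1000))

print(serve(datastore, 14))   # precomputed decision
print(5000 in datastore)      # a query outside the precomputed domain is not covered
```

Every entry is computed and stored whether or not it is ever queried, and changing the model means re-running `score_offline` over the whole domain.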
Prediction-Serving Today
Ø Highly specialized systems for specific problems
Ø Offline scoring with existing frameworks and systems
Clipper aims to unify these approaches
What would we like this to look like?
[Diagram: applications (Content Rec., Fraud Detection, Personal Asst., Robotic Control, Machine Translation) on top of frameworks (Create, VW, Caffe, ???)]
Ø Encapsulate each system in isolated environment
Ø Decouple models and applications
Clipper Decouples Applications and Models
[Diagram: Applications call Predict on Clipper; Clipper communicates over RPC with Model Containers (MC), each wrapping a framework such as Caffe]
Clipper Implementation
[Diagram: Clipper contains a Model Abstraction Layer with Caching and Adaptive Batching, connected over RPC to Model Containers (MC)]
Core system: 35,000 lines of C++
Open source system: https://github.com/ucbrise/clipper
Goal: Community-owned platform for model deployment and serving
Clipper Implementation
[Diagram components: Applications, Predict, Clipper Query Processor, Model Containers, Redis, Clipper Management, clipper admin]
Clipper Connects Analytics and Serving
[Diagram: Spark cluster (Driver Program with SparkContext; Worker Nodes running Executors and Tasks) connected through Clipper to the application stack (Web Server, Database, Cache)]
Getting Started with Clipper
Docker images available on DockerHub
Clipper admin is distributed as a pip package:
pip install clipper_admin
Get up and running without cloning or compiling!
Technologies inside Clipper
Ø Containerized frameworks: unified abstraction and isolation
Ø Cross-framework caching and batching: optimize throughput and latency
Ø Cross-framework model composition: improved accuracy through ensembles and bandits
[Diagram: Applications use the Predict/Feedback RPC/REST interface; Clipper reaches each Model Container (MC) over RPC]
Container-based Model Deployment
Implement Model API:
    class ModelContainer:
        def __init__(self, model_data):
            ...
        def predict_batch(self, inputs):
            ...
Ø API support for many programming languages
  Ø Python
  Ø Java
  Ø C/C++
Package model implementation and dependencies
[Diagram: the ModelContainer class is packaged into a Model Container (MC) that Clipper reaches over RPC]
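As a rough illustration of the Model API above, here is a minimal container wrapping a toy linear classifier. The weights, the string labels, and the example inputs are assumptions made for this sketch; only the two method names come from the slide.

```python
from typing import List, Sequence


class ModelContainer:
    def __init__(self, model_data: Sequence[float]):
        # Hypothetical model state: the weights of a linear classifier.
        self.weights = list(model_data)

    def predict_batch(self, inputs: List[Sequence[float]]) -> List[str]:
        # Evaluate the whole batch in one call and return one label per input.
        outputs = []
        for x in inputs:
            score = sum(w * v for w, v in zip(self.weights, x))
            outputs.append("1" if score > 0 else "0")
        return outputs


container = ModelContainer([0.5, -1.0])
print(container.predict_batch([[2.0, 0.5], [0.0, 1.0]]))  # ['1', '0']
```

Because the interface takes a batch rather than a single input, the serving layer is free to choose how many queries to hand over per call.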
Common Interface → Simplifies Deployment:
Ø Evaluate models using original code & systems
Ø Models run in separate processes as Docker containers
  Ø Resource isolation: Cutting edge ML frameworks can be buggy
Ø Scale-out
Overhead of decoupled architecture
[Diagram: Applications calling TensorFlow-Serving through its Predict RPC interface, compared with Applications calling Clipper through its Predict/Feedback RPC/REST interface with Model Containers (MC) behind RPC]
[Figure: throughput (QPS) and P99 latency (ms)]
Model: AlexNet trained on CIFAR-10
Clipper Architecture
[Diagram: Applications use the Predict/Observe RPC/REST interface; the Model Abstraction Layer provides Caching and Latency-Aware Batching and connects over RPC to Model Containers (MC)]
Latency-aware Batching to Improve Throughput
Ø A single page load may generate many queries
Ø Why batching helps:
  Ø Throughput-optimized frameworks
  [Figure: throughput vs. batch size]
Ø Optimal batch depends on:
  Ø hardware configuration
  Ø model and framework
  Ø system load
Clipper Solution:
Adaptively trade off latency and throughput…
Ø Increase batch size until the latency objective is exceeded (Additive Increase)
Ø If latency exceeds SLO, cut batch size by a fraction (Multiplicative Decrease)
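The additive-increase, multiplicative-decrease rule above can be sketched as a small controller. The step size, the 0.9 backoff factor, and the linear latency model used to exercise it are assumptions for illustration, not values taken from Clipper.

```python
class AIMDBatchSizer:
    """Additive-increase / multiplicative-decrease batch size controller."""

    def __init__(self, slo_ms: float, additive_step: int = 1,
                 backoff: float = 0.9, initial: int = 1):
        self.slo_ms = slo_ms
        self.additive_step = additive_step  # assumed increase per round
        self.backoff = backoff              # assumed multiplicative factor
        self.batch_size = initial

    def update(self, observed_latency_ms: float) -> int:
        if observed_latency_ms > self.slo_ms:
            # Latency objective exceeded: cut the batch size by a fraction.
            self.batch_size = max(1, int(self.batch_size * self.backoff))
        else:
            # Still within the objective: probe a slightly larger batch.
            self.batch_size += self.additive_step
        return self.batch_size


def simulated_latency_ms(batch_size: int) -> float:
    # Hypothetical container: fixed overhead plus a per-item cost.
    return 2.0 + 0.5 * batch_size


sizer = AIMDBatchSizer(slo_ms=20.0)
history = [sizer.update(simulated_latency_ms(sizer.batch_size)) for _ in range(200)]
print(history[-10:])
```

Under this toy latency model the batch size climbs until the 20 ms objective is crossed and then oscillates in a narrow band just below that point.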
[Figure: throughput (QPS) with adaptive batching vs. no batching for No-Op, Random Forest (SKLearn), Linear SVM (PySpark), Linear SVM (SKLearn), Kernel SVM (SKLearn), and Log Regression (SKLearn) model containers]
Clipper Architecture
[Diagram: Applications use the Predict/Observe RPC/REST interface; a Model Selection Layer (Selection Policy) sits above the Model Abstraction Layer, which connects over RPC to Model Containers (MC)]
Model Selection Layer: Selection Policy
Ø Periodic retraining (Version 1, Version 2, Version 3)
Ø Experiment with new models and frameworks
Selection Policy: Estimate confidence
Ø Models agree (“CAT”): CONFIDENT
Ø Models disagree (“CAT”, “MAN”, “SPACE”): UNSURE
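One simple way to turn agreement between models into a confidence signal is a majority vote, sketched below. The function name and the 0.75 agreement threshold are assumptions for illustration; the slides do not specify how the policy computes confidence.

```python
from collections import Counter
from typing import List, Tuple


def select_with_confidence(predictions: List[str],
                           threshold: float = 0.75) -> Tuple[str, float, str]:
    """Return the majority label, the fraction of models that agree, and a verdict."""
    label, votes = Counter(predictions).most_common(1)[0]
    agreement = votes / len(predictions)
    verdict = "CONFIDENT" if agreement >= threshold else "UNSURE"
    return label, agreement, verdict


print(select_with_confidence(["CAT", "CAT", "CAT", "CAT"]))
print(select_with_confidence(["CAT", "MAN", "CAT", "SPACE"]))
```

An application could use the verdict to fall back to a default action when the models disagree.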
Model Selection Layer: Selection Policy
Ø Exploit multiple models to estimate confidence
Ø Use multi-armed bandit algorithms to learn optimal model-selection online
Ø Online personalization across ML frameworks
*See paper for details [NSDI 2017]
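To make the bandit idea concrete, here is a simplified Exp3-style selector: each model keeps a weight, a model is sampled in proportion to its weight, and the weight shrinks when feedback reports a loss. The class name, the learning rate `eta`, and the omission of an explicit exploration term are assumptions of this sketch; see the paper for the policies Clipper actually uses.

```python
import math
import random
from typing import Dict, Iterable


class Exp3Selector:
    def __init__(self, model_names: Iterable[str], eta: float = 0.1, seed: int = 0):
        self.names = list(model_names)
        self.eta = eta  # assumed learning rate
        self.weights = {name: 1.0 for name in self.names}
        self.rng = random.Random(seed)

    def probabilities(self) -> Dict[str, float]:
        total = sum(self.weights.values())
        return {name: w / total for name, w in self.weights.items()}

    def select(self) -> str:
        # Sample one model in proportion to its current weight.
        probs = self.probabilities()
        return self.rng.choices(self.names, weights=[probs[n] for n in self.names])[0]

    def observe(self, name: str, loss: float) -> None:
        # Importance-weighted update: a loss in [0, 1] is scaled by 1 / p(name).
        p = self.probabilities()[name]
        self.weights[name] *= math.exp(-self.eta * loss / p)


selector = Exp3Selector(["version-1", "version-2"])
for _ in range(100):
    chosen = selector.select()
    # Hypothetical feedback: version-1 is always wrong, version-2 always right.
    selector.observe(chosen, 1.0 if chosen == "version-1" else 0.0)
print(selector.probabilities())
```

Only the selected model is updated each round, which is what makes this a bandit method rather than a full ensemble evaluation.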
Serving Spark Models with Clipper
Deploy Models from Spark to Clipper
[Diagram: Spark cluster (Driver Program with SparkContext; Worker Nodes running Executors and Tasks) alongside Clipper and the application stack (Web Server, Database, Cache)]
You can always implement your own model container…
Model Container (MC)
    class ModelContainer:
        def __init__(self, model_data):
            ...
        def predict_batch(self, inputs):
            ...
Clipper admin will automatically deploy Spark models
Clipper.deploy_pyspark_model(model, predict_function)
Ø model: instance of PySpark model:
  Ø pyspark.mllib.*
  Ø pyspark.ml.pipeline.PipelineModel
Ø predict_function: Python closure
    def predict_function(spark, model, inputs)
    Params:
      spark : SparkSession
      model : ml or mllib model instance
      inputs : list of inputs to predict
    Returns:
      list of str
  Ø Can be used for business logic, featurization, post-processing…
pyspark.mllib example:
Clipper.deploy_pyspark_model(model, predict_function)
pyspark.ml.pipeline example:
Clipper.deploy_pyspark_model(model, predict_function)
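A predict function with the signature described above might look like the following sketch. `ThresholdModel` is a hypothetical stand-in so the example runs without a Spark installation; with a real pyspark.mllib model, `model.predict(x)` would be called on each input in the same way, and the function would then be passed to the deploy call shown on the slide.

```python
from typing import List, Sequence


def predict_function(spark, model, inputs: List[Sequence[float]]) -> List[str]:
    # Featurization, business logic, or post-processing could be added here.
    # The `spark` argument is unused in this minimal sketch.
    return [str(model.predict(x)) for x in inputs]


class ThresholdModel:
    """Hypothetical stand-in for a trained pyspark.mllib model."""

    def predict(self, features: Sequence[float]) -> int:
        return 1 if sum(features) > 1.0 else 0


print(predict_function(None, ThresholdModel(), [[0.2, 0.3], [0.9, 0.8]]))  # ['0', '1']
```

Returning strings matches the "list of str" return type stated for the closure.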
[Diagram: the model trained on the Spark cluster is deployed to Clipper as a Model Container (MC)]
Clipper can automatically deploy other models:
Ø Arbitrary Python closures:
  Ø Scikit-Learn models
  Ø pred_function can contain any Python code that is pickle-able
Ø Deploy Spark models from Scala or Java:
  Ø Supports both MLlib and Pipelines APIs
DEMO
http://clipper.ai/examples/spark-demo/
Check out Clipper
Ø Alpha release (0.1.3) available
Ø First class support for Scikit-Learn and Spark models
Ø Deploy directly from training job without writing Docker containers
Ø Actively working with early users
Ø Post issues and questions on GitHub and subscribe to our
mailing list clipper-dev@googlegroups.com
http://clipper.ai
What’s next
Ø Ongoing efforts to support additional frameworks
Ø TensorFlow (supported but requires writing Docker containers)
Ø R (open pull request using Python-R interop library)
Ø Caffe2
Ø Kubernetes integration (early internal prototype)
Ø Bandit methods, online personalization, and cascaded predictions
Ø Studied in research prototypes but need to integrate into system
Ø Actively looking for contributors:
https://github.com/ucbrise/clipper

 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 

Clipper: A Low-Latency Online Prediction Serving System