Ray: The alternative to
distributed frameworks
李泓旻(Andrew Li)
2
About me
- Data Engineer
@Data Science & Technology, Cathay Financial Holdings
- Former one-stop engineer for data science(Manufacturing)
- Former Chemical Engineer
- Polymer material, Genetic engineering, Bacterial fermentation
- D4SG (Data for Social Good) #4, winter 2018
- First prize, Genius For Home competition, MediaTek, 2018
- : orcahmlee
3
Source
What will you get
4
Why We Need
5
6
Four Reasons Why Leading Companies Are Betting On Ray, Anyscale
How Ray’s ecosystem powers Spotify’s ML scientists and engineers
7
8
What if We Could
9
10
Four Reasons Why Leading Companies Are Betting On Ray, Anyscale
How Ray’s ecosystem powers Spotify’s ML scientists and engineers
What is Ray?
11
12
Ray
13
Ray
14
Ray Tune:
Tuning with your favorite
ML framework
15
Ray Tune: Tuning with your favorite framework
and more......
Ray Tune: Tuning with your favorite framework
search_optimization    Algorithm
"random"               Random Search
"bayesian"             SkoptSearch
"hyperopt"             HyperOptSearch
"bohb"                 TuneBOHB
"optuna"               Optuna
22
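As a rough illustration of what the "random" option in the table above means, the sketch below runs a plain random search over a small search space using only the standard library. The objective function, the parameter names, and `random_search` are assumptions made for illustration; they are not part of Ray Tune or any Ray library.

```python
import random

def objective(config):
    # Hypothetical score that peaks at lr=0.1 and depth=5 (illustration only).
    return -((config["lr"] - 0.1) ** 2) - 0.01 * (config["depth"] - 5) ** 2

def random_search(space, n_trials, seed=0):
    """Sample n_trials configurations at random and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

space = {"lr": [0.001, 0.01, 0.1, 1.0], "depth": [3, 5, 7]}
best_config, best_score = random_search(space, n_trials=20)
```

The other options in the table replace the uniform sampling step with a model-guided choice of the next configuration; the surrounding loop stays conceptually the same.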
Modin:
A drop-in replacement for
pandas
Modin: A drop-in replacement for pandas
23
Modin: A drop-in replacement for pandas
24
Modin: Architecture
25
Modin: Architecture
pandas API coverage
26
Modin vs. Dask DataFrame vs. Koalas
- Dask DataFrame and Koalas
  - Lazy execution
  - Support row-oriented partitioning and parallelism
- Modin
  - Eager execution
  - Support row, column, and cell-oriented partitioning and parallelism
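The three partitioning schemes named above can be pictured on a small table stored as a list of rows. This is a toy sketch in plain Python, not Modin's implementation; the function names and the block sizes are assumptions for illustration.

```python
def partition_rows(table, n):
    """Split a list-of-rows table into at most n contiguous row blocks."""
    size = -(-len(table) // n)  # ceiling division
    return [table[i:i + size] for i in range(0, len(table), size)]

def partition_columns(table, n):
    """Split a table into at most n contiguous column blocks."""
    width = len(table[0])
    size = -(-width // n)
    return [[row[j:j + size] for row in table] for j in range(0, width, size)]

def partition_cells(table, n_row_blocks, n_col_blocks):
    """Split a table into a grid: row blocks first, then column blocks."""
    return [partition_columns(block, n_col_blocks)
            for block in partition_rows(table, n_row_blocks)]

# A 4x4 table holding the values 0..15.
table = [[r * 4 + c for c in range(4)] for r in range(4)]
row_blocks = partition_rows(table, 2)
col_blocks = partition_columns(table, 2)
cell_blocks = partition_cells(table, 2, 2)
```

Row blocks suit row-wise operations such as filters, column blocks suit column-wise reductions, and the grid lets both kinds of operation run in parallel on the same data.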
Modin vs. Dask DataFrame vs. Koalas
27
Modin vs. Dask DataFrame vs. Koalas
Decomposition
28
Flexible Rule-Based Decomposition and Metadata Independence in Modin: A Parallel Dataframe System
- Modin: if an API is not supported yet, it is executed in "default to pandas" mode
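The fallback described above can be pictured as a dispatch table: operations with a parallel implementation run on the partitions, and everything else gathers the data and uses a single-process default. The sketch below is a toy model in plain Python; `PARALLEL_IMPLS`, `DEFAULT_IMPLS`, and `dispatch` are hypothetical names, not Modin internals.

```python
import warnings

# Hypothetical registry of operations with a partition-aware implementation.
PARALLEL_IMPLS = {
    "sum": lambda partitions: sum(sum(p) for p in partitions),
}

# Hypothetical single-process fallbacks that need the whole dataset at once.
DEFAULT_IMPLS = {
    "sum": sum,
    # Upper median: the middle element of the sorted values.
    "median": lambda values: sorted(values)[len(values) // 2],
}

def dispatch(op, partitions):
    """Run op on the partitions if supported, else gather and fall back."""
    if op in PARALLEL_IMPLS:
        return PARALLEL_IMPLS[op](partitions), "parallel"
    warnings.warn(f"{op!r} is not implemented in parallel; using the default path")
    gathered = [value for part in partitions for value in part]
    return DEFAULT_IMPLS[op](gathered), "default"

partitions = [[1, 2, 3], [4, 5], [6, 7]]
total, total_path = dispatch("sum", partitions)
median, median_path = dispatch("median", partitions)
```

The gather step is why a defaulted operation tends to be slower than a natively supported one: the partitions have to be brought together before the call and split again afterwards.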
Modin vs. Dask DataFrame vs. Koalas
29
Modin vs. Dask DataFrame vs. Koalas
default to pandas
30
Defaulting to pandas
Supported APIs
Y: implemented in Modin; D: defaults to pandas
- pd.DataFrame
  - Y: iloc, T, all, any, quantile, apply, applymap...
  - D: plot, to_parquet, to_pickle, to_json...
- pd.Series
  - Y: iloc, T, all, any, quantile, apply, value_counts, to_frame...
  - D: plot, to_parquet, to_pickle, to_json...
- pd.read_<file>
  - Y: read_csv, read_parquet...
  - D: read_pickle, read_html...
- Utilities
  - Y: pd.concat, pd.unique, pd.get_dummies...
  - D: pd.cut, pd.to_datetime, pd.to_numeric...
31
Supported APIs
32
Ray Core
Ray 33
Ray:
Programming Model
34
Actor, Stateful
Task, Stateless
Programming model
35
Fire and Forget, AIM-120 AMRAAM
Actor Model
- What the Actor Model is and why to use it
- Related languages/frameworks that implement the Actor Model:
  - Erlang, RabbitMQ, Akka
- Super useful references:
  - https://blog.techbridge.cc/2019/06/21/actor-model-in-web/
  - [COSCUP 2011] Programming for the Future, Introduction to the Actor Model and Akka Framework
36
Function —> Task
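In Ray, decorating a function with `@ray.remote` turns it into a task, and calling it returns a future immediately. As a rough standard-library analogy (not Ray itself), the sketch below submits a stateless function to a thread pool and collects the results; `square` and the pool size are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Stateless: the result depends only on the argument.
    return x * x

with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns a Future right away, much like a task handle.
    futures = [pool.submit(square, i) for i in range(5)]
    # result() blocks until the corresponding call has finished.
    results = [f.result() for f in futures]
```

Because the function keeps no state, any worker can run any call, and a failed call can simply be run again.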
37
38
Class —> Actor
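In Ray, decorating a class with `@ray.remote` turns it into an actor: a stateful worker whose method calls are processed one after another by default. The sketch below imitates that idea with a thread and a mailbox from the standard library; `CounterActor` and its message names are hypothetical and do not reflect how Ray implements actors.

```python
import queue
import threading

class CounterActor:
    """A toy actor: private state, a mailbox, and one worker thread."""

    def __init__(self):
        self._count = 0                      # state owned by the actor only
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Messages are handled one at a time, so _count needs no lock.
        while True:
            message, reply = self._mailbox.get()
            if message == "stop":
                reply.put(None)
                break
            if message == "increment":
                self._count += 1
            reply.put(self._count)

    def send(self, message):
        """Enqueue a message and return a queue that will hold the reply."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((message, reply))
        return reply

counter = CounterActor()
replies = [counter.send("increment") for _ in range(3)]
values = [r.get() for r in replies]
final = counter.send("get").get()
counter.send("stop").get()
```

Unlike the stateless task, the order of messages matters here, which is why the actor serializes them through a single mailbox.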
Programming model
39
Ray: A Distributed Framework for Emerging AI Applications
Programming model
40
Ray: A Distributed Framework for Emerging AI Applications
Specifying Resources
41
Ray:
Architecture
42
Architecture
43
Ray: A Distributed Framework for Emerging AI Applications
Architecture - Application Layer
44
Ray: A Distributed Framework for Emerging AI Applications
Architecture - System Layer
The system layer consists of three major components
- Global Control Store(GCS)
- Bottom-Up Distributed Scheduler
- In-Memory Distributed Object Store
45
Ray: A Distributed Framework for Emerging AI Applications
Global Control Store(GCS)
46
Global Control Store
47
Ray: A Distributed Framework for Emerging AI Applications
Global Control Store
48
- Maintains fault tolerance and low latency
- Enables every component in the system to be stateless
- Key-value store with pub-sub functionality
  - < v1.11.0: uses Redis
  - >= v1.11.0: no longer starts Redis by default
Ray: A Distributed Framework for Emerging AI Applications
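A key-value store with pub-sub functionality can be sketched in a few lines. The class below is a minimal in-memory model for illustration only; `ControlStore`, the key names, and the status strings are assumptions and do not mirror the GCS schema.

```python
from collections import defaultdict

class ControlStore:
    """Toy in-memory key-value store that notifies subscribers on writes."""

    def __init__(self):
        self._data = {}
        self._subscribers = defaultdict(list)  # key -> list of callbacks

    def subscribe(self, key, callback):
        self._subscribers[key].append(callback)

    def put(self, key, value):
        self._data[key] = value
        # Publish the update to everyone watching this key.
        for callback in self._subscribers[key]:
            callback(key, value)

    def get(self, key, default=None):
        return self._data.get(key, default)

store = ControlStore()
events = []
store.subscribe("actor:1", lambda key, value: events.append((key, value)))
store.put("actor:1", "ALIVE")
store.put("task:7", "PENDING")   # no subscriber, so no event is recorded
store.put("actor:1", "DEAD")
```

Keeping the state in one store is what lets the other components stay stateless: after a restart they can read what they need back from the store and subscribe to further changes.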
Global Control Store (< v1.11.0)
49
Redis in Ray: Past and future
Global Control Store (>=v1.11.0)
50
Redis in Ray: Past and future
Global Control Store
Fault tolerance
- Decouple the durable lineage storage from other
system components
- Heartbeat table, Job table, Actor table
52
Ray: A Distributed Framework for Emerging AI Applications
Global Control Store
Low latency
- Centralized schedulers couple task scheduling and task dispatch (Dask, Spark, CIEL)
- Involving the scheduler in each object transfer is prohibitively expensive
- Ray stores object metadata in the GCS rather than in the scheduler, fully decoupling task dispatch from task scheduling
53
Ray: A Distributed Framework for Emerging AI Applications
Bottom-Up
Distributed Scheduler
54
Bottom-Up Distributed Scheduler
55
Ray: A Distributed Framework for Emerging AI Applications
Existing cluster computing frameworks:
- Centralized schedulers: provide locality but at latencies in the tens of ms (Spark, CIEL, Dryad)
- Distributed schedulers: can achieve high scale, but they either don’t consider data locality (work stealing), or assume tasks belong to independent jobs (Sparrow), or assume the computation graph is known (Canary)
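In contrast to the schedulers listed above, the Ray paper describes a bottom-up scheme: a task goes first to the local scheduler on its node and is forwarded to a global scheduler only when the node is overloaded or lacks the required resources. The sketch below is a simplified model of that idea; the queue threshold, the `Node` fields, and the use of queue length as a stand-in for estimated waiting time are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    cpus: int
    queue: list = field(default_factory=list)

QUEUE_THRESHOLD = 2  # assumed overload threshold, not a Ray default

def global_schedule(task, cpus_needed, nodes):
    """Pick the feasible node with the shortest queue (stand-in for wait time)."""
    feasible = [n for n in nodes if n.cpus >= cpus_needed]
    if not feasible:
        raise RuntimeError(f"no node can run {task!r}")
    target = min(feasible, key=lambda n: len(n.queue))
    target.queue.append(task)
    return target.name

def submit(task, cpus_needed, local, nodes):
    """Try the local node first; escalate only when it cannot take the task."""
    overloaded = len(local.queue) >= QUEUE_THRESHOLD
    if local.cpus >= cpus_needed and not overloaded:
        local.queue.append(task)
        return local.name
    return global_schedule(task, cpus_needed, nodes)

nodes = [Node("a", cpus=2), Node("b", cpus=8), Node("c", cpus=8)]
# Three small tasks submitted on node "a": the third one overflows.
placements = [submit(f"t{i}", 1, nodes[0], nodes) for i in range(3)]
# A task that needs more CPUs than node "a" has is escalated immediately.
big = submit("big", 4, nodes[0], nodes)
```

Most tasks never reach the global scheduler in this model, which is the point of the bottom-up design: the common case stays on the local node.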
Bottom-Up Distributed Scheduler
56
Ray: A Distributed Framework for Emerging AI Applications
Bottom-Up Distributed Scheduler
57
Ray: A Distributed Framework for Emerging AI Applications
In-Memory Distributed
Object Store
58
In-Memory Distributed Object Store
59
Ray: A Distributed Framework for Emerging AI Applications
- Plasma: A High-Performance Shared-Memory Object Store
- Plasma was initially developed as part of Ray and was later developed as part of Apache Arrow
- On each node, Ray implements the object store via shared memory. This allows zero-copy data sharing between tasks running on the same node
- Plasma holds immutable objects in shared memory
In-Memory Distributed Object Store
60
Ray: A Distributed Framework for Emerging AI Applications
- To minimize task latency, Plasma is used to store the inputs and outputs of every task, or stateless computation
- For low latency, Ray keeps objects entirely in memory and evicts them as needed to disk using an LRU policy
- Small objects (< 100 KiB): stored in the in-process object store
- Large objects: stored in the shared-memory object store
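The LRU policy mentioned above can be sketched with an ordered dictionary. This is a toy model under stated assumptions: `LRUObjectStore` is a hypothetical class, the capacity is counted in bytes of payload, and a plain dictionary stands in for the disk.

```python
from collections import OrderedDict

class LRUObjectStore:
    """Toy object store: bounded memory, least recently used objects go to 'disk'."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.memory = OrderedDict()  # object id -> bytes, least recent first
        self.disk = {}               # stand-in for evicted objects

    def put(self, object_id, data):
        # Evict least recently used objects until the new one fits.
        while self.memory and self.used + len(data) > self.capacity:
            old_id, old_data = self.memory.popitem(last=False)
            self.disk[old_id] = old_data
            self.used -= len(old_data)
        self.memory[object_id] = data
        self.used += len(data)

    def get(self, object_id):
        if object_id in self.memory:
            self.memory.move_to_end(object_id)  # mark as recently used
            return self.memory[object_id]
        return self.disk[object_id]

store = LRUObjectStore(capacity_bytes=10)
store.put("a", b"1234")
store.put("b", b"5678")
store.get("a")            # "a" is now more recently used than "b"
store.put("c", b"abcd")   # does not fit, so "b" is evicted
```

Reading "a" before inserting "c" is what saves it from eviction; without that access, "a" would have been the oldest entry.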
In-Memory Distributed Object Store
61
Ray: A Distributed Framework for Emerging AI Applications
In-Memory Distributed Object Store
62
In-Memory Distributed Object Store
63
In-Memory Distributed Object Store
Object spilling and persistence
- Spilling objects to external storage once the capacity of the object store is used up (v1.3+)
- Two types of external storage supported by default
- For local storage, the OS would run out of inodes very quickly if every object were written to its own file. If objects are smaller than 100 MB, Ray fuses objects into a single file to avoid this problem
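Fusing small objects into one file can be sketched as an append-only file plus an index of offsets. The code below is a minimal illustration, not Ray's spilling format; `spill_fused`, `restore`, and the file name are hypothetical.

```python
import os
import tempfile

def spill_fused(objects, path):
    """Write many small objects into one file; return id -> (offset, size)."""
    index = {}
    with open(path, "wb") as f:
        for object_id, data in objects.items():
            index[object_id] = (f.tell(), len(data))
            f.write(data)
    return index

def restore(object_id, index, path):
    """Read one object back using its recorded offset and size."""
    offset, size = index[object_id]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)

objects = {"x": b"hello", "y": b"ray", "z": b"object store"}
with tempfile.TemporaryDirectory() as tmp:
    fused_path = os.path.join(tmp, "spill.bin")
    index = spill_fused(objects, fused_path)
    restored = {oid: restore(oid, index, fused_path) for oid in objects}
    n_files = len(os.listdir(tmp))
```

Three objects end up in a single file here, so the number of inodes used grows with the number of spill files rather than with the number of objects.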
64
In-Memory Distributed Object Store
65
Fault Tolerance
- Ray recovers any needed objects through lineage
re-execution. The lineage stored in the GCS tracks
both stateless tasks and stateful actors during
initial execution
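Lineage re-execution can be illustrated for the stateless case: if the store remembers which function and inputs produced each object, a lost object can be rebuilt by running that function again, recursively rebuilding lost inputs first. The sketch below is a toy model; `LineageStore` is a hypothetical class and it does not cover actors.

```python
class LineageStore:
    """Toy model: remember how each object was produced so it can be rebuilt."""

    def __init__(self):
        self.lineage = {}   # object id -> (function, ids of input objects)
        self.objects = {}   # object id -> value (may be lost)
        self.executions = 0

    def run(self, object_id, func, *input_ids):
        self.lineage[object_id] = (func, input_ids)
        return self._execute(object_id)

    def _execute(self, object_id):
        func, input_ids = self.lineage[object_id]
        args = [self.get(i) for i in input_ids]
        self.executions += 1
        self.objects[object_id] = func(*args)
        return self.objects[object_id]

    def get(self, object_id):
        # A missing object is rebuilt by re-running the task that produced it.
        if object_id not in self.objects:
            return self._execute(object_id)
        return self.objects[object_id]

store = LineageStore()
store.run("a", lambda: 2)
store.run("b", lambda a: a * 10, "a")
store.run("c", lambda b: b + 1, "b")
del store.objects["b"], store.objects["c"]   # simulate losing two objects
recovered = store.get("c")
```

Only the lost objects are recomputed: "a" is still available, so recovering "c" costs two extra executions rather than three.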
Ray: A Distributed Framework for Emerging AI Applications
Ray:
Cluster Launcher
66
Ray Cluster on GCP/AWS/Azure
67
VM VM VM
Ray Cluster on K8s
68
POD POD POD
Ray:
Handling Dependencies
69
Handling Dependencies
70
Source
Handling Dependencies
71
RECAP
72
73
Ray
Thank you for your time
74
