Dynamic Resource Allocation in Apache Spark
Yuta Imai
The talk about Dynamic Resource Allocation and External Shuffle Service.
Dynamic Resource Allocation in Apache Spark
1.
Dynamic Resource Allocation in Apache Spark
Yuta Imai @imai_factory
2.
1. RDD Graph
val text = "Hello Spark, this is my first Spark application."
val textArray = text.split(" ").map(_.replaceAll(" ", ""))
val result = sc.parallelize(textArray)
  .map(item => (item, 1))
  .reduceByKey((x, y) => x + y)
  .collect()
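To make the shape of `result` concrete, the same pipeline can be mimicked on plain Scala collections without a cluster. This is only a local sketch: `WordCountSketch` and `wordCount` are hypothetical names, and `groupBy` stands in for `reduceByKey`, which in Spark runs as a distributed shuffle.

```scala
object WordCountSketch {
  // Local stand-in for sc.parallelize(...).map(...).reduceByKey(...).collect().
  // groupBy plays the role of the shuffle that brings equal keys together.
  def wordCount(text: String): Map[String, Int] =
    text.split(" ")
      .map(item => (item, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    val counts = wordCount("Hello Spark, this is my first Spark application.")
    // Splitting on spaces keeps punctuation, so "Spark," and "Spark" are distinct keys.
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

Because the split is on spaces only, tokens keep their punctuation, so every key in this particular sentence ends up with a count of 1.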
3.
2. DAG Scheduler
[Diagram: Array → sc.parallelize() → ParallelCollectionRDD (Partition0-Partition3) → .map(…) → MapPartitionsRDD (Partition0-Partition3) → .reduceByKey(…) → ShuffledRDD (Partition0-Partition1) → .collect() → Array]
4.
2. DAG Scheduler
[Diagram: Array → sc.parallelize() → ParallelCollectionRDD (Partition0-Partition3) → .map(…) → MapPartitionsRDD (Partition0-Partition3) → .reduceByKey(…) → ShuffledRDD (Partition0-Partition1) → .collect() → Array. Labels: Narrow Dependency (ParallelCollectionRDD to MapPartitionsRDD), Shuffle Dependency (MapPartitionsRDD to ShuffledRDD)]
5.
2. DAG Scheduler
[Diagram: Array → sc.parallelize() → ParallelCollectionRDD (Partition0-Partition3) → .map(…) → MapPartitionsRDD (Partition0-Partition3) → .reduceByKey(…) → ShuffledRDD (Partition0-Partition1) → .collect() → Array. Labels: Narrow Dependency, Shuffle Dependency; Stage0 (Task0-Task3), Stage1 (Task4-Task5)]
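The stage and task counts on this slide follow from two rules: a new stage begins at each shuffle dependency, and each stage runs one task per partition. A toy model of that rule is sketched below. It is not Spark's DAGScheduler; `StageSketch`, `Node`, `Dep`, and the linear-lineage simplification are all assumptions made for illustration.

```scala
object StageSketch {
  sealed trait Dep
  case object Narrow extends Dep
  case object Shuffle extends Dep

  // Hypothetical, simplified model of one RDD in a linear lineage.
  final case class Node(name: String, partitions: Int, depOnParent: Option[Dep])

  // The lineage from the slides: 4 partitions, 4 partitions, then 2 after the shuffle.
  val lineage: List[Node] = List(
    Node("ParallelCollectionRDD", 4, None),
    Node("MapPartitionsRDD", 4, Some(Narrow)),
    Node("ShuffledRDD", 2, Some(Shuffle))
  )

  // Cut the lineage into stages: a new stage starts at every shuffle dependency.
  def stages(nodes: List[Node]): List[List[Node]] =
    nodes.foldLeft(List.empty[List[Node]]) {
      case (Nil, node) => List(List(node))
      case (acc, node) if node.depOnParent.contains(Shuffle) => acc :+ List(node)
      case (acc, node) => acc.init :+ (acc.last :+ node)
    }

  // One task per partition of the last RDD in each stage.
  def tasksPerStage(nodes: List[Node]): List[Int] =
    stages(nodes).map(_.last.partitions)

  def main(args: Array[String]): Unit =
    stages(lineage).zipWithIndex.foreach { case (stage, i) =>
      println(s"Stage$i: ${stage.map(_.name).mkString(" -> ")}, tasks = ${stage.last.partitions}")
    }
}
```

For the lineage above this yields two stages with 4 and 2 tasks, matching Task0-Task3 and Task4-Task5 on the slide.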
6.
3. Task Scheduler
[Diagram: the two RDDs of the first stage (Partition0-Partition3 each) are covered by Task0-Task3, which are assigned to Executors]
7.
Shuffle File
[Diagram: two Worker Nodes, each running an Executor whose Thread evaluates iterator.map(…).map(…)…; the Shuffle File is kept in Storage on a Worker Node]
8.
DYNAMIC RESOURCE ALLOCATION
9.
Dynamic Resource Allocation
• Adds extra executors to an app that has pending tasks.
  – Removes the need to plan an app's resources exactly in advance.
• Removes idle executors from an app.
  – Lets a long-running app release executors it is not using.
10.
Overview
[Diagram: Tasks and Executors]
11.
Overview
[Diagram: Tasks and Executors. Label: Insufficient capacity]
12.
Overview
[Diagram: Tasks and Executors. Label: Insufficient capacity]
13.
Overview
[Diagram: Tasks and Executors. Label: Insufficient capacity]
14.
Overview
[Diagram: Tasks and Executors. Labels: Insufficient capacity, Optimal capacity]
15.
Overview
[Diagram: Tasks and Executors, with two check marks (✔). Labels: Insufficient capacity, Optimal capacity, Idle executors]
16.
Overview
[Diagram: Tasks and Executors, with two check marks (✔). Labels: Insufficient capacity, Optimal capacity, Idle executors, Optimal capacity]
17.
Request Policy
• An app starts with a user-specified number of executors:
  ./bin/spark-submit --class <main-class> --master <master-url> --num-executors <# of executors>
• After spark.dynamicAllocation.schedulerBacklogTimeout (seconds), the app starts requesting new executors if it has pending tasks.
• While tasks remain pending, the app requests more executors every spark.dynamicAllocation.sustainedSchedulerBacklogTimeout (seconds), doubling the number requested each round: 1, 2, 4, 8, 16, …
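The timing of this ramp-up can be worked out with a few lines of arithmetic. The sketch below is a simplified model, not Spark's allocation manager: `RequestPolicySketch` and `requestSchedule` are hypothetical names, timeouts are whole seconds, and it assumes the backlog never clears. Real Spark also bounds the total it asks for (for example by a configured maximum), which is omitted here.

```scala
object RequestPolicySketch {
  // Returns (seconds since the backlog began, executors requested in that round).
  // Round 0 fires after the initial backlog timeout; later rounds fire every
  // sustained timeout, and the request size doubles each round.
  def requestSchedule(
      backlogTimeoutSec: Int,
      sustainedTimeoutSec: Int,
      rounds: Int): List[(Int, Int)] =
    List.tabulate(rounds) { i =>
      (backlogTimeoutSec + i * sustainedTimeoutSec, 1 << i)
    }

  def main(args: Array[String]): Unit =
    requestSchedule(backlogTimeoutSec = 1, sustainedTimeoutSec = 1, rounds = 5)
      .foreach { case (t, n) => println(s"t=${t}s: request $n executor(s)") }
}
```

With both timeouts at 1 second, five rounds request 1, 2, 4, 8, and 16 executors at t = 1 s through t = 5 s, for 31 in total. Starting small and doubling keeps a short backlog from over-allocating while still reaching a large executor count quickly.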
18.
Remove Policy
• An app removes an executor when it has been idle for more than spark.dynamicAllocation.executorIdleTimeout seconds.
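The removal rule reduces to a per-executor comparison against the idle timeout. A minimal sketch follows, assuming a hypothetical `ExecutorState` record that stores when an executor last became idle; it illustrates the rule only and is not Spark's internal bookkeeping.

```scala
object RemovePolicySketch {
  // idleSinceSec is None while the executor is running at least one task.
  final case class ExecutorState(id: String, idleSinceSec: Option[Long])

  // Select executors that have been idle for strictly longer than the timeout.
  def executorsToRemove(
      executors: Seq[ExecutorState],
      nowSec: Long,
      idleTimeoutSec: Long): Seq[String] =
    executors.collect {
      case ExecutorState(id, Some(since)) if nowSec - since > idleTimeoutSec => id
    }

  def main(args: Array[String]): Unit = {
    val executors = Seq(
      ExecutorState("exec-1", None),        // busy
      ExecutorState("exec-2", Some(30L)),   // idle for 70 s at t = 100
      ExecutorState("exec-3", Some(90L))    // idle for 10 s at t = 100
    )
    println(executorsToRemove(executors, nowSec = 100L, idleTimeoutSec = 60L))
  }
}
```

In the usage above only exec-2 is selected: exec-1 is busy and exec-3 has been idle for less than the timeout.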
19.
External Shuffle Service
[Diagram: two Worker Nodes, each running an Executor whose Thread evaluates iterator.map(…).map(…)…; shuffle output is kept in Storage on a Worker Node]
20.
External Shuffle Service
[Diagram: two Worker Nodes, each running an Executor whose Thread evaluates iterator.map(…).map(…)…; shuffle output is kept in Storage on a Worker Node]
21.
External Shuffle Service
[Diagram: the same two Worker Nodes, now each with a Shuffle Service running alongside the Executor]
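An executor that has been removed can no longer serve the shuffle files it wrote, which is why dynamic allocation is normally paired with the external shuffle service. The sketch below collects the relevant settings and renders them as `--conf` arguments for spark-submit. The property names are Spark's; the values, `DynamicAllocationConfSketch`, and `toSubmitArgs` are illustrative assumptions, and the shuffle service itself must also be running on each worker node, which is set up differently per cluster manager.

```scala
object DynamicAllocationConfSketch {
  // Property names are Spark's; the values are illustrative assumptions.
  val settings: Seq[(String, String)] = Seq(
    "spark.dynamicAllocation.enabled" -> "true",
    "spark.shuffle.service.enabled" -> "true",
    "spark.dynamicAllocation.schedulerBacklogTimeout" -> "1s",
    "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout" -> "1s",
    "spark.dynamicAllocation.executorIdleTimeout" -> "60s"
  )

  // Render each setting as a "--conf key=value" pair for spark-submit.
  def toSubmitArgs(conf: Seq[(String, String)]): Seq[String] =
    conf.flatMap { case (key, value) => Seq("--conf", s"$key=$value") }

  def main(args: Array[String]): Unit =
    println(("./bin/spark-submit" +: toSubmitArgs(settings)).mkString(" "))
}
```

The same key-value pairs could instead be placed in spark-defaults.conf so that every application on the cluster picks them up.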