Cloud Computing
MapReduce in Heterogeneous
Environments
Eva Kalyvianaki
ek264@cam.ac.uk
Contents
• Looking at MapReduce performance in heterogeneous clusters
• Material is from the paper:
  “Improving MapReduce Performance in Heterogeneous Environments”,
  by Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz and
  Ion Stoica, published at the USENIX OSDI conference, 2008
• and their presentation at OSDI
2
Motivation: MapReduce is becoming popular
• Open-source implementation, Hadoop, used by Yahoo!, Facebook, Last.fm, …
• Scale: 20 PB/day at Google, O(10,000) nodes at Yahoo!, 3000 jobs/day at Facebook
3
Stragglers in MapReduce
• A straggler is a node that performs poorly or does not perform at all.
• The original MapReduce mitigation approach was:
  - Run a speculative copy of the slow task (called a backup task)
  - Use the result of whichever finishes first, the original or the copy
• Without speculative execution, a job would be as slow as its slowest sub-task
• Google notes that speculative execution can improve job response times by 44%
• Is this approach good enough for modern clusters?
4
Modern Clusters: Heterogeneity is the norm
• Cloud computing providers like Amazon’s Elastic Compute Cloud (EC2) provide cheap on-demand computing:
  - Price: 2 cents / VM / hour
  - Scale: thousands of VMs
  - Caveat: less control of performance
• The main challenge for Hadoop on EC2 is performance heterogeneity, which breaks the task scheduler’s assumptions
• This lecture/paper is on a new scheduler, LATE, that can cut response time in half
5
MapReduce Revised
6
MapReduce Implementation, Hadoop
7
Scheduling in MapReduce
• When a node has an empty slot, Hadoop chooses a task from one of three categories, in the following priority order:
1. Failed tasks are given the highest priority.
2. Unscheduled tasks. For maps, tasks with data local to the node are chosen first.
3. Finally, Hadoop looks for a task to run speculatively.
8
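The priority order on the slide above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Task`, `pick_task` and their fields are hypothetical names rather than Hadoop's actual API, and the speculative candidate is assumed to have been chosen elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    task_id: str
    kind: str                 # "map" or "reduce"
    local_nodes: tuple = ()   # nodes holding this task's input (maps only)


def pick_task(node: str,
              failed: List[Task],
              unscheduled: List[Task],
              speculative_candidate: Optional[Task]) -> Optional[Task]:
    """Choose a task for a free slot on `node` (hypothetical helper)."""
    # 1. Failed tasks have the highest priority.
    if failed:
        return failed[0]
    # 2. Unscheduled tasks; maps with input data local to this node go first.
    if unscheduled:
        for task in unscheduled:
            if task.kind == "map" and node in task.local_nodes:
                return task
        return unscheduled[0]
    # 3. Otherwise fall back to a speculative (backup) task, if there is one.
    return speculative_candidate
```

Only when both the failed and unscheduled lists are empty does the slot go to a speculative copy, which is why the choice of that copy matters most near the end of a job.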
Deciding on Speculative Tasks
• Which task to execute speculatively?
• Hadoop monitors task progress using a progress score: a number between 0 and 1
• For mappers: the score is the fraction of input data read
• For reducers: the execution is divided into three equal phases, each accounting for 1/3 of the score:
  - Copy phase: fraction of maps whose output has been copied
  - Sort phase: map outputs are sorted by key; fraction of data merged
  - Reduce phase: fraction of data passed through the reduce function
• Example: a task halfway through the copy phase has progress score = 1/2 * 1/3 = 1/6.
• Example: a task halfway through the reduce phase has progress score = 1/3 + 1/3 + 1/2 * 1/3 = 5/6.
9
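The progress-score rules above reduce to a small calculation. The sketch below is a minimal illustration; the function names and the string phase labels are assumptions made here, not Hadoop identifiers.

```python
def map_progress(bytes_read: int, total_bytes: int) -> float:
    """Progress score of a map task: fraction of input data read."""
    return bytes_read / total_bytes


def reduce_progress(phase: str, fraction_done: float) -> float:
    """Progress score of a reduce task.

    `phase` is "copy", "sort" or "reduce"; `fraction_done` is the fraction
    (0..1) of the current phase that has completed.
    """
    phases = ["copy", "sort", "reduce"]
    completed_phases = phases.index(phase)  # phases fully finished so far
    # Each phase contributes one third of the total score.
    return (completed_phases + fraction_done) / 3


print(reduce_progress("copy", 0.5))    # halfway through the copy phase
print(reduce_progress("reduce", 0.5))  # halfway through the reduce phase
```

The two calls reproduce the slide's examples, 1/6 and 5/6, up to floating-point rounding.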
Deciding on Speculative Tasks (cont’d)
• Hadoop looks at the average progress of each category of tasks (maps and reduces) and defines a threshold:
  threshold = avgProgress - 0.2
• When a task’s progress score is below the threshold for its category, and the task has run for at least one minute, it is marked as a straggler
• All tasks with progress score < threshold are treated as stragglers
• Ties are broken by data locality
• This approach works reasonably well in homogeneous clusters
10
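A minimal sketch of the threshold rule from the slide above, applied to the tasks of a single category. The function name, the dictionary inputs and the use of seconds for run time are assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, List


def find_stragglers(progress: Dict[str, float],
                    runtime_s: Dict[str, float],
                    margin: float = 0.2,
                    min_runtime_s: float = 60.0) -> List[str]:
    """Return ids of tasks in one category that count as stragglers.

    `progress` maps task id -> progress score, `runtime_s` maps task id ->
    seconds the task has been running (both hypothetical inputs).
    """
    threshold = mean(progress.values()) - margin
    return [tid for tid, score in progress.items()
            if score < threshold and runtime_s[tid] >= min_runtime_s]
```

Because the threshold is a fixed offset from the category average, a task can only be flagged once the average exceeds 0.2, and every task below the threshold is treated alike regardless of how far behind it is.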
Scheduler’s Assumptions
1. Nodes can perform work at roughly the same rate
2. Tasks progress at a constant rate throughout their execution
3. There is no cost to starting a speculative task
4. A task’s progress score is roughly equal to the fraction of its total work completed
5. Tasks tend to finish in waves, so a task with a low progress score is likely a slow task
6. Different tasks of the same category (maps or reduces) require roughly the same amount of work
11
Revising Scheduler’s Assumptions
1. Nodes can perform work at roughly the same rate
2. Tasks progress at a constant rate throughout their execution
• (1) In heterogeneous clusters some nodes are slower (older) than others
• (2) Virtualized clusters “suffer” from co-location interference
12
Heterogeneity in Virtualized Environments
• VM technology isolates CPU and memory, but disk and network are shared
  - Full bandwidth when there is no contention
  - Equal shares when there is contention
• 2.5x performance difference
[Figure: IO performance per VM (MB/s) versus number of VMs on the physical host (1 to 7)]
13
Revising Scheduler’s Assumptions
3. There is no cost to starting a speculative task
4. A task’s progress score is roughly equal to the fraction of its total work completed
5. Tasks tend to finish in waves, so a task with a low progress score is likely a slow task
• (3) Too many speculative tasks can take resources away from other running tasks
• (4) The copy phase of reducers is the slowest part, because it involves all-pairs communication, yet it counts for only 1/3 of the reduce progress score.
• (5) Tasks from different generations execute concurrently, so newer, faster tasks are compared with older, slower tasks, and avgProgress changes a lot.
14
Idea: Progress Rates
• Instead of using progress score values, compute progress rates, and back up tasks that are “far enough” below the mean
• Problem: can still select the wrong tasks
15
Progress Rate Example
[Figure: task timeline over time (min), marked at 1 min and 2 min. Node 1: 1 task/min; Node 2: 3x slower; Node 3: 1.9x slower]
16
Progress Rate Example
What if the job had 5 tasks?
[Figure: task timeline at time 2 min. Node 2: time left 1 min; Node 3: time left 1.8 min]
Node 2 is slowest, but should back up Node 3’s task!
17
Our Scheduler: LATE
• Insight: back up the task with the largest estimated finish time
  - “Longest Approximate Time to End” → LATE
  - Look forward instead of looking backward
• Sanity thresholds:
  - Cap number of backup tasks
  - Launch backups on fast nodes
  - Only back up tasks that are sufficiently slow
18
LATE Details
• Estimating finish times:
  progress rate = progress score / execution time
  estimated time left = (1 - progress score) / progress rate
19
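Substituting the first definition into the second gives the estimate directly in terms of quantities the scheduler already tracks. The symbols p (progress score) and T (execution time so far) are shorthand introduced here, not notation from the slides, and the estimate presumes the task keeps progressing at its observed average rate.

```latex
\[
\text{progress rate} = \frac{p}{T},
\qquad
\text{estimated time left} = \frac{1 - p}{p / T} = T \cdot \frac{1 - p}{p}
\]
```

For instance, a task with p = 2/3 after T = 2 min has 2 * (1/3) / (2/3) = 1 min left. The estimate is undefined for p = 0, so a task that has reported no progress needs separate handling.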
LATE Scheduler
• If a task slot becomes available and there are fewer than SpeculativeCap speculative tasks running, then:
1. Ignore the request if the node’s total progress is below SlowNodeThreshold (= 25th percentile)
2. Rank currently running tasks that are not already being speculated by estimated time left
3. Launch a copy of the highest-ranked task with progress rate below SlowTaskThreshold (= 25th percentile)
• Threshold values:
  - 10% cap on backups, 25th percentiles for slow node/task
  - Validated by sensitivity analysis
20
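The three steps above can be sketched as follows. This is an illustration under several assumptions, not the Hadoop implementation: `RunningTask`, `late_pick` and `percentile_25` are hypothetical names, the per-node total progress is supplied by the caller, the percentile uses a simple nearest-rank rule (the slides do not say how it is computed), and a task with zero progress is given an infinite time estimate.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RunningTask:
    task_id: str
    progress: float            # progress score in [0, 1]
    elapsed: float             # execution time so far, must be > 0
    speculated: bool = False   # True if a backup copy is already running

    @property
    def rate(self) -> float:
        return self.progress / self.elapsed

    @property
    def time_left(self) -> float:
        if self.rate == 0:
            return float("inf")  # assumption: no progress yet ranks first
        return (1 - self.progress) / self.rate


def percentile_25(values: List[float]) -> float:
    """Nearest-rank 25th percentile of a non-empty list (an assumption)."""
    ordered = sorted(values)
    return ordered[math.ceil(len(ordered) / 4) - 1]


def late_pick(requesting_node: str,
              tasks: List[RunningTask],
              node_progress: Dict[str, float],
              running_backups: int,
              speculative_cap: int) -> Optional[RunningTask]:
    """Return the task to back up on `requesting_node`, or None."""
    if running_backups >= speculative_cap or not tasks:
        return None
    # 1. Ignore requests from slow nodes.
    slow_node_threshold = percentile_25(list(node_progress.values()))
    if node_progress[requesting_node] < slow_node_threshold:
        return None
    # 2. Rank running, not-yet-speculated tasks by estimated time left.
    candidates = sorted((t for t in tasks if not t.speculated),
                        key=lambda t: t.time_left, reverse=True)
    # 3. Back up the highest-ranked task that is slow enough.
    slow_task_threshold = percentile_25([t.rate for t in tasks])
    for task in candidates:
        if task.rate < slow_task_threshold:
            return task
    return None
```

With a strict "below the 25th percentile" test and nearest-rank percentiles, very small task or node sets never cross the threshold, so this sketch only behaves sensibly with more than a handful of tasks and nodes.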
LATE Example
[Figure: task timeline at time 2 min for Node 1, Node 2 and Node 3]
• Node 2: progress = 66%, estimated time left: (1-0.66) / (1/3) = 1 min
• Node 3: progress = 5.3%, estimated time left: (1-0.05) / (1/1.9) = 1.8 min
LATE correctly picks Node 3
21
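The example can be replayed numerically to contrast the two heuristics. In this sketch the elapsed time of 0.1 min for Node 3's task is derived here from the slide figures (a 1.9x slower node finishes its first task at 1.9 min, so its next task has run for 0.1 min at the 2 min mark); the function and variable names are illustrative.

```python
# State of the two still-running tasks at time = 2 min.
tasks = {
    "Node 2": {"progress": 0.66, "elapsed": 2.0},   # 3x slower node
    "Node 3": {"progress": 0.053, "elapsed": 0.1},  # 1.9x slower node
}


def progress_rate(progress: float, elapsed: float) -> float:
    """Progress score gained per minute of execution."""
    return progress / elapsed


def time_left(progress: float, elapsed: float) -> float:
    """Estimated minutes until the task finishes at its current rate."""
    return (1 - progress) / progress_rate(progress, elapsed)


# Progress-rate heuristic: back up the task with the lowest rate.
slowest_by_rate = min(tasks, key=lambda n: progress_rate(**tasks[n]))
# LATE: back up the task with the longest estimated time left.
late_choice = max(tasks, key=lambda n: time_left(**tasks[n]))

print("lowest progress rate:", slowest_by_rate)
print("longest time left:   ", late_choice)
```

The rate-based rule selects Node 2, while the time-left rule selects Node 3, whose task will finish last and therefore determines the job's response time.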
Evaluation
• Environments:
  - EC2 (3 job types, 200-250 nodes)
  - Small local testbed
• Self-contention through VM placement
• Stragglers through background processes
22
EC2 Sort without Stragglers (Sec 5.2.1)
• 106 machines, 7-8 VMs per machine → total of 243 VMs
• 128 MB data per host, 30 GB in total
• 486 map tasks and 437 reduce tasks
• Average 27% speedup over native, 31% over no backups
[Figure: normalized response time (worst, best, average) for No Backups, Hadoop Native and LATE Scheduler]
23
EC2 Sort with Stragglers (Sec 5.2.2)
• 8 VMs out of 100 in total are manually slowed down
  - by running CPU- and disk-intensive background jobs
• Average 58% speedup over native, 220% over no backups
• 93% max speedup over native
[Figure: normalized response time (worst, best, average) for No Backups, Hadoop Native and LATE Scheduler]
24
Conclusion
• Heterogeneity is a challenge for parallel apps, and is growing more important
• Lessons:
  - Back up tasks which hurt response time most
  - 2x improvement using a simple algorithm
25
Summary
• MapReduce is a very powerful and expressive model
• Performance depends a lot on implementation details
• Material is from the paper:
  “Improving MapReduce Performance in Heterogeneous Environments”,
  by Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz and
  Ion Stoica, published at the USENIX OSDI conference, 2008
• and their presentation at OSDI
26
