Yarn
Ayub Mohammad
Yarn AKA MapReduce version 2
1.
YARN and MapReduce v2
©2014 Zaloni, Inc. All Rights Reserved.
2.
Agenda
• What is YARN?
• Why YARN?
• Components of YARN Architecture
• API in MRv2
• Gain with MRv2
• Failure in MRv2
3.
So what is YARN, really?
4.
YARN INTRODUCTION
YARN (Yet Another Resource Negotiator) is responsible for:
• Cluster resource management
• Scheduling
Various applications may run on YARN; MapReduce is just one choice.
5.
YARN INTRODUCTION
HADOOP 1.0: Single Use System (Batch Apps)
• MapReduce (cluster resource management & data processing)
• HDFS (redundant, reliable storage)
HADOOP 2.0: Multi Purpose Platform (Batch, Interactive, Online, Streaming, …)
• MapReduce (data processing) and Others (data processing)
• YARN (cluster resource management)
• HDFS2 (redundant, reliable storage)
6.
YARN INTRODUCTION
Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS
Applications Run Natively IN Hadoop:
• BATCH (MapReduce)
• INTERACTIVE (Tez)
• STREAMING (Storm, S4, …)
• GRAPH (Giraph)
• IN-MEMORY (Spark)
• HPC MPI (OpenMPI)
• ONLINE (HBase)
• OTHER (Search) (Weave…)
YARN (Cluster Resource Management)
HDFS2 (Redundant, Reliable Storage)
7.
Why YARN and MRv2?
8.
Hadoop MapReduce Classic
JobTracker
• Manages cluster resources and job scheduling
TaskTracker
• Per-node agent
• Manages tasks
9.
MRv1 Limitations
• Scalability – the JobTracker (JT) limits horizontal scaling
• Cluster utilization – fixed-size slots degrade cluster utilization
• Availability – when the JT dies, jobs must restart
• Upgradability – jobs must be stopped to upgrade the JT
• Hardwired – the JT only supports MapReduce
10.
[Diagram: the JobTracker's role is split into the ResourceManager and the ApplicationMaster]
11.
Architecture
12.
YARN Components
So, what was developed:
• Resource Manager
• Node Manager
• Application Master
• Container
13.
Resource Manager (RM)
• Manages the global assignment of compute resources to applications.
• A pure scheduler.
• Does not monitor or track the status of applications.
14.
Resource Manager (RM)
• Each client/application may request multiple resources:
– Memory
– Network
– CPU
– Disk ..
• This is a significant change from the static Mapper/Reducer model.
15.
Application Master
• A per-application ApplicationMaster (AM) manages the application's life cycle (scheduling and coordination).
• An application is either a single job in the classic MapReduce sense or a DAG of such jobs.
16.
Application Master
• The Application Master is responsible for:
– negotiating appropriate resource containers from the Scheduler
– launching tasks
– tracking their status
– monitoring progress
– handling task failures
17.
Node Manager
• NodeManager: the per-machine framework agent
– responsible for launching the applications' containers, monitoring their resource usage (CPU, memory, disk, network), and reporting it to the Scheduler.
18.
Container
– Basic unit of allocation
– Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
– Replaces the fixed map/reduce slots
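To make the contrast with fixed slots concrete, here is a toy Python model. It is not the YARN scheduler: the node size, slot counts, and task sizes are invented for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """An amount of memory and CPU (hypothetical model, not a YARN class)."""
    memory_mb: int
    vcores: int


def fixed_slot_capacity(map_slots: int, reduce_slots: int, task_kind: str) -> int:
    """MRv1 style: a node runs at most as many tasks of a kind as it has slots of that kind."""
    return map_slots if task_kind == "map" else reduce_slots


def container_capacity(node: Resource, ask: Resource) -> int:
    """Container style: a node runs as many containers as fit in every resource dimension."""
    return min(node.memory_mb // ask.memory_mb, node.vcores // ask.vcores)


# Assumed example node: 16 GB of memory and 8 cores, configured in MRv1 as 4 map + 4 reduce slots.
node = Resource(memory_mb=16384, vcores=8)
small_task = Resource(memory_mb=1024, vcores=1)

slots = fixed_slot_capacity(map_slots=4, reduce_slots=4, task_kind="map")
containers = container_capacity(node, small_task)
print(slots, containers)
```

During a map-only phase the reduce slots in the slot model sit idle, while in the container model the whole node can be given to whatever tasks are waiting.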
19.
Lifecycle of a job
[Sequence diagram. Participants: Client, Resource Manager, App Master, Node Managers, Containers. Messages shown: Submit, OK, "I need resources!", "Here you are", Start containers, Go, "Do work!", Done, and the client's repeated "Done?" polls answered No, No, Yes.]
20.
Lifecycle of a job
21.
Job execution on MRv2
1. The client submits a MapReduce job by interacting with Job objects.
2. The job's code interacts with the Resource Manager to acquire application metadata, such as the application ID.
3. The job's code moves all job-related resources to HDFS to make them available for the rest of the job.
4. The job's code submits the application to the Resource Manager.
5. The Resource Manager chooses a Node Manager with available resources and requests a container for the MRAppMaster.
6. The Node Manager allocates a container for the MRAppMaster; the MRAppMaster will execute and coordinate the MapReduce job.
22.
Job execution on MRv2
7. The MRAppMaster fetches the required resources from HDFS, copied there in step 3.
8. The MRAppMaster negotiates with the Resource Manager for available resources; the Resource Manager selects the Node Manager that has the most resources.
9. The MRAppMaster tells the selected NodeManager to start Map and Reduce tasks.
10. The NodeManager creates YarnChild containers that will coordinate and run tasks.
11. The YarnChild acquires the job resources from HDFS that are required to execute Map and Reduce tasks.
12. The YarnChild executes Map and Reduce tasks.
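The twelve steps can be traced with a small simulation. This is a toy sketch in plain Python, not Hadoop code: the classes, memory sizes, and the "most free memory" node choice (taken from the wording of step 8) are simplifying assumptions, and a dictionary stands in for HDFS.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Hypothetical stand-in for a NodeManager that only tracks free memory."""
    name: str
    free_mb: int
    running: list = field(default_factory=list)

    def launch(self, what: str, mb: int) -> None:
        self.free_mb -= mb
        self.running.append(what)


class ResourceManager:
    """Hypothetical stand-in for the ResourceManager."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.next_app = 1

    def new_application_id(self) -> str:
        app_id = f"app_{self.next_app:04d}"
        self.next_app += 1
        return app_id

    def pick_node(self, mb: int) -> NodeManager:
        # Choose the node with the most free memory among those that can fit the ask.
        candidates = [n for n in self.nodes if n.free_mb >= mb]
        if not candidates:
            raise RuntimeError("no node has enough free memory")
        return max(candidates, key=lambda n: n.free_mb)


def run_job(rm, hdfs, job_files, num_tasks, am_mb=1024, task_mb=512):
    app_id = rm.new_application_id()                     # step 2
    hdfs[app_id] = list(job_files)                       # step 3
    am_node = rm.pick_node(am_mb)                        # steps 4-5
    am_node.launch(f"{app_id}/MRAppMaster", am_mb)       # step 6
    _ = hdfs[app_id]                                     # step 7: AM reads the job resources
    for i in range(num_tasks):
        node = rm.pick_node(task_mb)                     # step 8
        node.launch(f"{app_id}/YarnChild-{i}", task_mb)  # steps 9-12
    return app_id


nm1, nm2 = NodeManager("nm1", 2048), NodeManager("nm2", 2048)
rm = ResourceManager([nm1, nm2])
hdfs = {}
app_id = run_job(rm, hdfs, ["job.jar", "job.xml", "job.split"], num_tasks=3)  # step 1
print(app_id, nm1.running, nm2.running)
```

Because the MRAppMaster itself occupies a container, the first task lands on the other node; a real scheduler also weighs queues, locality, and CPU, which this sketch ignores.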
23.
YARN – Resource Allocation & Usage
ResourceRequest
– Fine-grained resource ask to the ResourceManager
– Asks for a specific amount of resources (memory, CPU, etc.) on a specific machine or rack
– Use the special value * as the resource name to mean any machine
ResourceRequest fields: priority, resourceName, capability, numContainers
© Hortonworks Inc. 2013
24.
YARN – Resource Allocation & Usage
ResourceRequest
priority   capability       resourceName   numContainers
0          <2gb, 1 core>    host01         1
0          <2gb, 1 core>    rack0          1
0          <2gb, 1 core>    *              1
1          <4gb, 1 core>    *              1
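The rows of the ResourceRequest table can be written down as plain data. The sketch below is a Python model of the four fields named on the slide, not the Hadoop ResourceRequest class; the helper locality_level and the set of known racks are assumptions added for illustration.

```python
from dataclasses import dataclass

ANY = "*"  # special resource name meaning "any machine"


@dataclass(frozen=True)
class Capability:
    memory_gb: int
    cores: int


@dataclass(frozen=True)
class ResourceRequest:
    priority: int
    capability: Capability
    resource_name: str   # a host name, a rack name, or ANY
    num_containers: int


def locality_level(req: ResourceRequest, racks: set) -> str:
    """Classify a request as host-, rack-, or any-level (hypothetical helper)."""
    if req.resource_name == ANY:
        return "any"
    if req.resource_name in racks:
        return "rack"
    return "host"


requests = [
    ResourceRequest(0, Capability(2, 1), "host01", 1),
    ResourceRequest(0, Capability(2, 1), "rack0", 1),
    ResourceRequest(0, Capability(2, 1), ANY, 1),
    ResourceRequest(1, Capability(4, 1), ANY, 1),
]

levels = [locality_level(r, racks={"rack0"}) for r in requests]
print(levels)
```

Read this way, the three priority-0 rows describe the same 2 GB, 1-core container at three levels of locality (preferred host, its rack, anywhere), while the priority-1 row is a separate ask with no placement preference.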
25.
YARN – Resource Allocation & Usage
Container
– The basic unit of allocation in YARN
– The result of a ResourceRequest, provided by the ResourceManager to the ApplicationMaster
– A specific amount of resources (CPU, memory, etc.) on a specific machine
Container fields: containerId, resourceName, capability, tokens
26.
YARN – Resource Allocation & Usage
ContainerLaunchContext
– The context provided by the ApplicationMaster to the NodeManager to launch the Container
– Complete specification for a process
– LocalResource is used to specify the container binary and dependencies
– The NodeManager is responsible for downloading them from a shared namespace (typically HDFS)
ContainerLaunchContext fields: container, commands, environment, localResources
LocalResource fields: uri, type
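As a rough picture of what a launch context carries, the following Python sketch models the fields listed on the slide. It is not the Hadoop API: the command line, paths, and the helper files_to_localize are made up for illustration, and the container field is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LocalResource:
    uri: str    # where the NodeManager downloads the file from
    type: str   # e.g. "FILE" or "ARCHIVE"


@dataclass
class ContainerLaunchContext:
    commands: list
    environment: dict = field(default_factory=dict)
    local_resources: dict = field(default_factory=dict)  # local name -> LocalResource


def files_to_localize(ctx: ContainerLaunchContext) -> list:
    """URIs that would have to be fetched before the commands can run."""
    return sorted(res.uri for res in ctx.local_resources.values())


# Hypothetical task: the jar and archive locations are placeholders.
ctx = ContainerLaunchContext(
    commands=["java -Xmx512m example.Task"],
    environment={"CLASSPATH": "./job.jar"},
    local_resources={
        "job.jar": LocalResource("hdfs:///apps/example/job.jar", "FILE"),
        "libs": LocalResource("hdfs:///apps/example/libs.zip", "ARCHIVE"),
    },
)
print(files_to_localize(ctx))
```

The point of the structure is that the ApplicationMaster never copies binaries to a node itself; it only describes them, and the NodeManager localizes them before starting the process.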
27.
YARN – ApplicationMaster
ApplicationMaster
– ApplicationSubmissionContext is the complete specification of the ApplicationMaster, provided by the Client
– The ResourceManager is responsible for allocating and launching the ApplicationMaster container
ApplicationSubmissionContext fields: resourceRequest, containerLaunchContext, appName, queue
28.
API in MRv2
• ClientRMProtocol (Client–RM): the protocol a client uses to communicate with the RM to launch a new application (i.e. an AM), check on the status of the application, or kill the application.
• AMRMProtocol (AM–RM): the protocol the AM uses to register/unregister itself with the RM, as well as to request resources from the RM Scheduler to run its tasks.
• ContainerManager (AM–NM): the protocol the AM uses to communicate with the NM to start or stop containers and to get status updates on its containers.
29.
YARN Application API – ClientRM
[Diagram: Client2, the ResourceManager with its Scheduler, and a grid of NodeManagers. The client makes two calls to the ResourceManager:]
1. Application request: YarnClient.createApplication
2. Submit application: YarnClient.submitApplication
30.
YARN Application API – AMRM
[Diagram: an AM running on one NodeManager talks to the ResourceManager and its Scheduler:]
1. registerApplicationMaster
2. AMRMClient.allocate
3. Container
4. unregisterApplicationMaster
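The register, allocate, unregister sequence can be sketched as a small state machine. This is a toy Python model with hypothetical names (ToyResourceManager and its methods); it only mirrors the order of calls on the slide and does not reproduce the Hadoop client API.

```python
class ToyResourceManager:
    """Hypothetical RM that hands out a fixed pool of identical containers."""

    def __init__(self, free_containers: int):
        self.free_containers = free_containers
        self.held = {}    # app id -> container ids currently granted
        self.issued = 0

    def register_application_master(self, app_id: str) -> None:
        self.held[app_id] = []

    def allocate(self, app_id: str, wanted: int) -> list:
        """Grant up to `wanted` containers; fewer if the pool is nearly empty."""
        if app_id not in self.held:
            raise RuntimeError("AM must register before asking for containers")
        granted = []
        for _ in range(min(wanted, self.free_containers)):
            self.issued += 1
            granted.append(f"container_{self.issued:02d}")
        self.free_containers -= len(granted)
        self.held[app_id].extend(granted)
        return granted

    def unregister_application_master(self, app_id: str) -> None:
        # Everything the application held goes back to the pool.
        self.free_containers += len(self.held.pop(app_id))


rm = ToyResourceManager(free_containers=3)
rm.register_application_master("app_1")     # 1. register
first = rm.allocate("app_1", wanted=2)      # 2-3. allocate, containers returned
second = rm.allocate("app_1", wanted=2)     # only one container is left
rm.unregister_application_master("app_1")   # 4. unregister
print(first, second, rm.free_containers)
```

A partial grant, as in the second call, is the normal case: the AM keeps calling allocate until its outstanding requests are satisfied.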
31.
YARN Application API – ContainerManager
[Diagram: an AM on one NodeManager talks to the NodeManager hosting its Container. Calls shown, labelled 1 and 1.1 on the slide: AMNMClient.startContainer and AMNMClient.getContainerStatus.]
32.
Gain with New Architecture
• RM and job manager segregated
• The Hadoop MapReduce JobTracker spends a very significant portion of its time and effort managing the life cycle of applications
Gains:
• Scalability
• Availability
• Wire-compatibility
• Innovation & Agility
• Cluster Utilization
• Support for programming paradigms other than MapReduce
33.
Gain with New Architecture: Availability
• ResourceManager
– Uses ZooKeeper for fail-over.
– When the primary fails, the secondary can quickly start using the state stored in ZooKeeper (ZK).
• Application Master
– The MapReduce ApplicationMaster can recover from failures by restoring itself from a checkpoint.
34.
Gain with New Architecture: Wire-compatibility
• MRv2 uses wire-compatible protocols to allow different versions of servers and clients to communicate with each other.
• This enables rolling upgrades of the cluster in the future.
35.
Gain with New Architecture: Innovation & Agility
• The new framework is generic.
– Different versions of MR can run in parallel
– End users can upgrade to MR versions on their own schedule
36.
Gain with New Architecture: Cluster Utilization
• MRv2 uses a general concept of a resource for scheduling and allocating to individual applications.
• Better cluster utilization
37.
Gain with New Architecture: Support for programming paradigms other than MapReduce
• Store all data in one place and use it for more than one application.
38.
Failures in YARN
For MapReduce programs running on YARN, we need to consider the failure of any of the following entities: the task, the application master, the node manager, and the resource manager.
• Task failure: failure of a running task is handled much as in classic MapReduce. Task attempts that throw runtime exceptions, exit the JVM suddenly, or time out are marked as failed. A task is marked as failed after four attempts (set by mapreduce.map.maxattempts for map tasks and mapreduce.reduce.maxattempts for reduce tasks).
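The attempt limit behaves like a bounded retry loop. The sketch below is plain Python with hypothetical function names; the default of four attempts mirrors the figure quoted above, and the simulated crash is invented for the example.

```python
def run_with_attempts(task, max_attempts=4):
    """Run task(attempt_number) until it succeeds or the attempts run out.

    Returns (result, errors_from_failed_attempts); raises if every attempt fails.
    """
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            return task(attempt), errors
        except Exception as exc:  # any failure counts as one failed attempt
            errors.append(f"attempt {attempt}: {exc}")
    raise RuntimeError(f"task failed after {max_attempts} attempts")


def flaky_task(attempt):
    """Hypothetical task that crashes on its first two attempts."""
    if attempt < 3:
        raise ValueError("simulated crash")
    return f"ok on attempt {attempt}"


result, errors = run_with_attempts(flaky_task)
print(result)
print(errors)
```

A single failed attempt therefore does not fail the task, and the task fails only once the configured number of attempts is exhausted.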
39.
Failures in YARN
• NodeManager failure: if a node manager fails, it stops sending heartbeats to the resource manager and is removed from the resource manager's pool of available nodes.
• Node managers may be blacklisted if the number of failures for the application is high. Blacklisting is done by the application master; for MapReduce, the application master will try to reschedule tasks on different nodes if more than three tasks fail on a node manager. The threshold may be set with mapreduce.job.maxtaskfailures.per.tracker.
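Blacklisting amounts to a per-node failure counter checked against a threshold. This Python sketch uses hypothetical names and follows the slide's wording ("more than three tasks fail") for the comparison; the exact comparison in Hadoop itself may differ.

```python
from collections import Counter


class NodeBlacklist:
    """Hypothetical per-application record of task failures by node."""

    def __init__(self, max_failures_per_node: int = 3):
        self.max_failures_per_node = max_failures_per_node
        self.failures = Counter()

    def record_failure(self, node: str) -> None:
        self.failures[node] += 1

    def is_blacklisted(self, node: str) -> bool:
        # Strictly greater than the threshold, per the wording above.
        return self.failures[node] > self.max_failures_per_node

    def usable(self, nodes: list) -> list:
        """Nodes the application master would still schedule tasks on."""
        return [n for n in nodes if not self.is_blacklisted(n)]


blacklist = NodeBlacklist()
for _ in range(4):
    blacklist.record_failure("nm1")   # four failures on nm1
blacklist.record_failure("nm2")       # a single failure on nm2
print(blacklist.usable(["nm1", "nm2", "nm3"]))
```

Because the counter lives in the application master, a node avoided by one application can still be used by others.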
40.
Failures in YARN
• Application master failure: an application master sends periodic heartbeats to the resource manager. If the application master fails, the resource manager detects the failure and starts a new instance of the master in a new container (managed by a node manager).
• ResourceManager failure: the most critical failure, as it can bring the whole system down. Mitigated by checkpointing or a standby node (HA).
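Heartbeat-based failure detection can be sketched as a table of last-seen timestamps. The Python below is a toy model: the class, the 10-second timeout, and the restart bookkeeping are assumptions for illustration, and time is passed in explicitly so the example is deterministic.

```python
class LivenessMonitor:
    """Hypothetical tracker of the last heartbeat seen from each application master."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def heartbeat(self, name: str, now: float) -> None:
        self.last_seen[name] = now

    def expired(self, now: float) -> list:
        """Names whose last heartbeat is older than the timeout."""
        return sorted(n for n, seen in self.last_seen.items()
                      if now - seen > self.timeout_s)


def restart_expired(monitor: LivenessMonitor, attempts: dict, now: float) -> list:
    """Start a new attempt for every expired master and reset its timer."""
    restarted = monitor.expired(now)
    for name in restarted:
        attempts[name] = attempts.get(name, 1) + 1
        monitor.heartbeat(name, now)
    return restarted


monitor = LivenessMonitor(timeout_s=10.0)
attempts = {}
monitor.heartbeat("am_1", now=0.0)
monitor.heartbeat("am_2", now=0.0)
monitor.heartbeat("am_1", now=8.0)   # am_1 keeps reporting, am_2 goes silent
restarted = restart_expired(monitor, attempts, now=12.0)
print(restarted, attempts)
```

The same pattern applies to node managers, except that an expired node is removed from the pool rather than restarted.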