YARN - Hadoop's Resource Manager

Raymie Stata, ex-CTO of Yahoo, talks about YARN, Hadoop's new Resource Manager, and other improvements in Hadoop 2.0.


Published in Technology

  • 1. YARN: Hadoop’s new Resource Manager
      Raymie Stata, VertiCloud
  • 2. Main features of Hadoop 2.0
      • High availability for HDFS
      • Federation for HDFS
      • Generalized Resource Management (YARN)
      • Plus: performance improvements, security improvements, compatibility improvements…
  • 3. HDFS 2.0
  • 4. HDFS 1.0 (and earlier)
      [Diagram]
      • Name node (Gets to be huge!)
      • Data nodes (Lots of them!)
  • 5. Problems having a single NN
      • Scalability: NN limits horizontal scaling
      • Performance: NN is a performance bottleneck
      • Isolation: all tenants share the same NN
          – One misbehaving tenant brings everyone down
          – Can’t provide higher QoS to mission-critical apps
          – This is a problem even for small clusters!
  • 6. HDFS Federation
      [Diagram]
      • ViewFS on top of several name nodes: NN1, NN2, NN3, NN4
      • Data nodes (Even more of them!)
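The federation slide puts ViewFS in front of four name nodes. One way to picture that arrangement is a client-side mount table that maps a path prefix to the name node owning that part of the namespace. The sketch below is only an illustration of the idea: the prefixes, the `MOUNT_TABLE` name, and the `resolve_namenode` function are all hypothetical and are not taken from the talk or from Hadoop's own code.

```python
from typing import Dict

# Hypothetical mount table: path prefix -> name node (labels from the slide).
MOUNT_TABLE: Dict[str, str] = {
    "/user": "NN1",
    "/data": "NN2",
    "/logs": "NN3",
    "/tmp": "NN4",
}


def resolve_namenode(path: str, mounts: Dict[str, str] = MOUNT_TABLE) -> str:
    """Return the name node responsible for `path`.

    A prefix matches only on whole path components, so "/datasets"
    is not treated as living under "/data".
    """
    for prefix, namenode in mounts.items():
        if path == prefix or path.startswith(prefix + "/"):
            return namenode
    raise KeyError(f"no mount point covers {path!r}")


if __name__ == "__main__":
    for p in ("/user/alice/report.csv", "/tmp/scratch"):
        print(p, "->", resolve_namenode(p))
```

Because each prefix is served by a different name node, tenants whose data lives under different prefixes no longer share one name node, which is the isolation benefit the previous slide asks for.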
  • 7. Future possibilities for HDFS
      • Snapshots (!)
      • Partial name spaces
      • Alternative namespace managers
      • Global replication management
      • Disaster recovery
  • 8. YARN AND MAPREDUCE 2.0
  • 9. MapReduce 1.0 (and earlier)
      [Diagram]
      • JobTracker
          – Queue of jobs
          – Queue of tasks
          – Job and task scheduling and monitoring
      • Slave nodes (Lots of them!)
  • 10. Problems with JT
      • Scalability: JT limits horizontal scaling
      • Availability: when JT dies, jobs must restart
      • Upgradability: must stop jobs to upgrade JT
      • Hardwired: JT only supports MapReduce
      • Increasingly hard to improve
          – Performance, scheduling, or utilization
  • 11. Observation
      Move intra-job management out of the central node!
      [Diagram]
      • JobTracker
          – Queue of jobs
          – Queue of tasks
          – Job and task scheduling and monitoring
      • Slave nodes (Lots of them!)
      Callout: Why are we doing all of this on a single node, when we have all these nodes?
  • 12. YARN: Yet Another Resource Negotiator
      [Diagram]
      • Resource Manager
          – Job queue
          – Resource list
          – Job scheduling
          – Resource allocation
      • App Master
          – Tasks
          – Task queue
          – Job lifecycle logic
      • Slave nodes
  • 13. YARN Components
      • Resource Manager (per cluster)
          – Manages job scheduling and execution
          – Global resource allocation
      • Application Master (per job)
          – Manages task scheduling and execution
          – Local resource allocation
      • Node Manager (per-machine agent)
          – Manages the lifecycle of task containers
          – Reports to RM on health and resource usage
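The split of responsibilities on this slide can be sketched as three small classes. This is a toy model under strong assumptions, not YARN's API: resources are reduced to whole "slots" per node, the Resource Manager grants slots first-come-first-served, and every class, method, and container id format below is invented for illustration.

```python
from typing import Dict, Iterable, List, Tuple


class NodeManager:
    """Per-machine agent: runs containers and reports free capacity."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.capacity = capacity              # slots: a deliberate simplification
        self.containers: Dict[str, str] = {}  # container id -> task name

    def start_container(self, container_id: str, task: str) -> None:
        self.containers[container_id] = task

    def report(self) -> int:
        """Free slots, as would be reported to the Resource Manager."""
        return self.capacity - len(self.containers)


class ResourceManager:
    """Per-cluster: global allocation of slots across all nodes."""

    def __init__(self, nodes: Iterable[NodeManager]) -> None:
        self.nodes = {n.name: n for n in nodes}
        self.free = {n.name: n.report() for n in self.nodes.values()}
        self._next_id = 0

    def allocate(self, wanted: int) -> List[Tuple[str, str]]:
        """Grant up to `wanted` containers as (container id, node name)."""
        grants: List[Tuple[str, str]] = []
        for name in self.free:
            while self.free[name] > 0 and len(grants) < wanted:
                self.free[name] -= 1
                grants.append((f"container_{self._next_id}", name))
                self._next_id += 1
        return grants


class ApplicationMaster:
    """Per-job: decides which task runs in which granted container."""

    def __init__(self, tasks: Iterable[str]) -> None:
        self.tasks = list(tasks)

    def run(self, rm: ResourceManager) -> Dict[str, str]:
        placement: Dict[str, str] = {}
        grants = rm.allocate(len(self.tasks))
        # Tasks beyond the number of grants stay unplaced in this sketch.
        for task, (container_id, node) in zip(self.tasks, grants):
            rm.nodes[node].start_container(container_id, task)
            placement[task] = node
        return placement
```

The point of the design is visible in where the loops live: the Resource Manager only iterates over nodes, while the per-job loop over tasks sits in the Application Master, so intra-job work no longer runs on the central node.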
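  • 14. Lifecycle of a job
      [Sequence diagram; the exact message order is not recoverable from the transcript]
      • Participants: Client, Resource Manager, App Master, Node Managers, Containers
      • Messages shown: Submit, OK, Go, I need resources!, Here you are, Start containers, Do work!, Done? (answered No until the job finishes, then Yes), Done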
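The message labels on the lifecycle slide suggest a loop: the App Master keeps asking for resources and starting containers until the work is finished, while the client polls for completion. The function below generates one plausible trace of that exchange. The ordering, the assumption that the client polls the Resource Manager once per round, and the names `job_lifecycle` and `containers_per_round` are all assumptions made for this sketch; only the message texts come from the slide.

```python
from typing import List, Tuple

Message = Tuple[str, str, str]  # (sender, receiver, text)

CLIENT, RM, AM, NMS, CONTAINERS = (
    "Client", "Resource Manager", "App Master", "Node Managers", "Containers",
)


def job_lifecycle(num_tasks: int, containers_per_round: int) -> List[Message]:
    """Return an assumed message trace for a job with `num_tasks` tasks."""
    if num_tasks < 1 or containers_per_round < 1:
        raise ValueError("both arguments must be at least 1")

    log: List[Message] = [
        (CLIENT, RM, "Submit"),
        (RM, CLIENT, "OK"),
        (RM, AM, "Go"),
    ]
    remaining = num_tasks
    while remaining > 0:
        granted = min(remaining, containers_per_round)
        log += [
            (AM, RM, "I need resources!"),
            (RM, AM, "Here you are"),
            (AM, NMS, "Start containers"),
            (NMS, CONTAINERS, "Do work!"),
        ]
        remaining -= granted
        if remaining == 0:
            log.append((AM, RM, "Done"))
        # Assumed: the client polls once after every round.
        log.append((CLIENT, RM, "Done?"))
        log.append((RM, CLIENT, "Yes" if remaining == 0 else "No"))
    return log


if __name__ == "__main__":
    for sender, receiver, text in job_lifecycle(num_tasks=3, containers_per_round=2):
        print(f"{sender} -> {receiver}: {text}")
```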
  • 15. Why YARN is important
      • Fixes scalability and availability problems
      • Supports experimentation
          – At both YARN and MapReduce levels
      • Supports alternatives to MapReduce!!
          – OpenMPI
          – Interactive SQL (Impala)
          – Streaming
              • Storm, Apache S4, others…
          – HBase integration
          – Graph processing (Apache Giraph)
  • 16. Futures of YARN and MR
      • YARN
          – Models beyond MapReduce
          – Scheduling improvements (including preemption)
          – Container isolation
      • MapReduce
          – Decompose into reusable pieces
          – Push as well as pull in shuffle
          – Simple hash (no sort) in shuffle
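The last bullet, "Simple hash (no sort) in shuffle", can be read as follows: when a reducer only needs all values for a key brought together, and does not need the keys in order, grouping with a hash table can stand in for sorting. The sketch below contrasts the two approaches on in-memory key/value pairs. It illustrates the general idea only and makes no claim about how Hadoop implements its shuffle; both function names are hypothetical.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter
from typing import Dict, Hashable, Iterable, List, Tuple

Pair = Tuple[Hashable, int]


def group_by_sort(pairs: Iterable[Pair]) -> Dict[Hashable, List[int]]:
    """Sort-based grouping: sort by key, then collect runs of equal keys."""
    ordered = sorted(pairs, key=itemgetter(0))  # O(n log n), stable
    return {
        key: [value for _, value in run]
        for key, run in groupby(ordered, key=itemgetter(0))
    }


def group_by_hash(pairs: Iterable[Pair]) -> Dict[Hashable, List[int]]:
    """Hash-based grouping: a single pass, with no ordering among keys."""
    groups: Dict[Hashable, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


if __name__ == "__main__":
    mapped = [("yarn", 1), ("hdfs", 1), ("yarn", 1), ("mr", 1), ("hdfs", 1)]
    counts = {word: sum(ones) for word, ones in group_by_hash(mapped).items()}
    print(counts)
```

Both functions produce the same groups. The hash version gives up the sorted key order that the sort version yields as a side effect, which is acceptable for jobs such as word count that never rely on it.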