Apache Hadoop YARN - Hortonworks Meetup Presentation


  1. Apache Hadoop YARN
  2. A Cursory Look At The Architecture
     [Diagram: Client, Resource Manager, Node Manager, App Mstr, and Container boxes; arrow legend: MapReduce Status, Job Submission, Node Status, Resource Request]
     © Hortonworks Inc. 2012. Confidential and Proprietary.
  3. Global Scheduler (ResourceManager)
     • Pure resource arbitration
     • Multiple resource dimensions
       – <priority, data-locality, memory, cpu, …>
     • In-built support for data-locality
       – Node, Rack etc.
       – Unique to YARN
  4. Scheduler Concepts
     • Input from AM(s) is a dynamic list of ResourceRequests
       – <resource-name, resource-capability>
       – Resource name: (hostname / rackname / any)
       – Resource capability: (memory, cpu, …)
       – Essentially an inverted <name, capability> request map from AM to RM
       – No notion of tasks!
     • Output - Container
       – Resource(s) grant on a specific machine
       – Verifiable grant
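
The "inverted <name, capability> request map" can be pictured with a small sketch. This is a toy model in Python, not the Hadoop API: the `Capability` class, the `build_request_table` helper, the per-entry container count, and the task placement preferences are all assumptions made for illustration. Only the resource-name kinds (hostname, rackname, any) and the capability fields (memory, cpu) come from the slide; the host and rack names are borrowed from the walkthrough slides.

```python
from collections import Counter
from dataclasses import dataclass

ANY = "*"  # resource name standing for "any host"


@dataclass(frozen=True)
class Capability:
    """Hypothetical stand-in for a resource capability (memory, cpu, ...)."""
    memory_mb: int
    cpu: int = 1


def build_request_table(task_hosts, rack_of, capability):
    """Invert per-task host preferences into {(resource_name, capability): count}.

    The table no longer mentions tasks at all: it only says how many
    containers of a given capability are wanted per host, per rack, and overall.
    """
    table = Counter()
    for hosts in task_hosts:
        hosts = set(hosts)
        for host in hosts:
            table[(host, capability)] += 1
        # One rack-level entry per distinct rack this task could run in.
        for rack in {rack_of[h] for h in hosts}:
            table[(rack, capability)] += 1
        # Every task also counts once toward the "any" entry.
        table[(ANY, capability)] += 1
    return dict(table)


# Assumed example: two tasks, each preferring a single host.
rack_of = {"h1010": "r11", "h2121": "r22"}
cap = Capability(memory_mb=1024)
table = build_request_table([["h1010"], ["h2121"]], rack_of, cap)
```

In this model the AM would send `table` to the RM and resend it whenever its needs change, which is one way to read "dynamic list" on the slide.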
  5. Scheduling Walkthrough
     MapReduce job with 2 maps and 1 reduce
  6. Scheduling Walkthrough
     Container allocation on r22/h2121
  7. Scheduling Walkthrough
     Container allocation on r11/h1010
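
A minimal sketch of the bookkeeping such a walkthrough implies, written as a toy Python model rather than the actual scheduler. The starting request counts, the `allocate_on` helper, and the locality labels are assumptions; only the two maps and the rack/host pairs r22/h2121 and r11/h1010 come from the slides. The reduce is left out here; it would presumably be a separate request with no locality preference.

```python
def allocate_on(requests, rack, host):
    """Grant one container on rack/host if any request is outstanding.

    `requests` maps a resource name (host, rack, or "*") to the number of
    containers still wanted. Returns a grant dict, or None if nothing is owed.
    """
    if requests.get("*", 0) <= 0:
        return None  # the "*" entry bounds the total number of containers

    if requests.get(host, 0) > 0:
        locality = "node-local"
    elif requests.get(rack, 0) > 0:
        locality = "rack-local"
    else:
        locality = "off-switch"

    # Simplification: decrement every matching entry that is still positive.
    # A real AM would recompute and resend its requests after each grant.
    for name in (host, rack, "*"):
        if requests.get(name, 0) > 0:
            requests[name] -= 1
    return {"rack": rack, "host": host, "locality": locality}


# Assumed outstanding requests for the two maps, one per preferred host.
requests = {"h1010": 1, "r11": 1, "h2121": 1, "r22": 1, "*": 2}

first = allocate_on(requests, "r22", "h2121")   # container on r22/h2121
second = allocate_on(requests, "r11", "h1010")  # container on r11/h1010
third = allocate_on(requests, "r11", "h1010")   # nothing left to grant
```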
  8. Writing Custom Applications
     • Grand total of 3 protocols
       – ClientRMProtocol
         – Application launching program
         – submitApplication
       – AMRMProtocol
         – Protocol between AM & RM for resource allocation
         – registerApplication / allocate / finishApplication
       – ContainerManagerProtocol
         – Protocol between AM & NM for container start/stop
         – startContainer / stopContainer
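
The order in which an application touches the three protocols can be sketched as follows. This is a toy in-memory model in Python, not the Hadoop Java API: `FakeResourceManager`, `FakeNodeManager`, `run_application`, and every argument and return value are hypothetical. Only the operation names recorded in the log (submitApplication, registerApplication, allocate, finishApplication, startContainer, stopContainer) are taken from the slide.

```python
class FakeResourceManager:
    """Stand-in for the RM side of ClientRMProtocol and AMRMProtocol."""

    def __init__(self, log):
        self.log = log
        self._next_id = 1

    def submit_application(self, name):           # ClientRMProtocol
        app_id = f"app_{self._next_id}"
        self._next_id += 1
        self.log.append("submitApplication")
        return app_id

    def register_application(self, app_id):       # AMRMProtocol
        self.log.append("registerApplication")

    def allocate(self, app_id, num_containers):   # AMRMProtocol
        self.log.append("allocate")
        return [f"{app_id}_c{i}" for i in range(1, num_containers + 1)]

    def finish_application(self, app_id):         # AMRMProtocol
        self.log.append("finishApplication")


class FakeNodeManager:
    """Stand-in for the NM side of ContainerManagerProtocol."""

    def __init__(self, log):
        self.log = log

    def start_container(self, container_id):
        self.log.append("startContainer")

    def stop_container(self, container_id):
        self.log.append("stopContainer")


def run_application(rm, nm, name, num_containers):
    """Walk one application through all three protocols in order."""
    app_id = rm.submit_application(name)      # client -> RM
    rm.register_application(app_id)           # AM -> RM
    containers = rm.allocate(app_id, num_containers)
    for c in containers:                      # AM -> NM
        nm.start_container(c)
    for c in containers:
        nm.stop_container(c)
    rm.finish_application(app_id)             # AM -> RM
    return app_id


log = []
app_id = run_application(FakeResourceManager(log), FakeNodeManager(log), "demo", 1)
```

In a real deployment the client and the AM are separate processes; they are folded into one function here only to show the call sequence.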
  9. API improvements
     • Overload of the ‘*’ entry.
     • Release / reject containers
     • Ask for specific nodes/racks (only)
     • Don’t give me containers on these racks/nodes
     • Single client thread allowed to request containers
     • Overloaded allocate call
  10. Recent advancements
      • Tools for debugging AMs
        – Unmanaged AM
      • Generic AM – Utility libraries for writing
        – YARN-103, YARN-29
      • YARN project split and how multiple versions of MapReduce can coexist.
  11. Roadmap
      • MapReduce container reuse
      • RM restart capability
      • Multi-resource scheduling
      • Generic application history server
  12. Questions? Thank You!