Apache Hadoop YARN - Hortonworks Meetup Presentation
Published in: Technology

Transcript

  • 1. Apache Hadoop YARN
  • 2. A Cursory Look At The Architecture
      [Architecture diagram. Components: Client, Resource Manager, Node Manager, App Mstr, Container. Arrow legend: MapReduce Status, Job Submission, Node Status, Resource Request.]
      © Hortonworks Inc. 2012. Confidential and Proprietary.
  • 3. Global Scheduler (ResourceManager)
      • Pure resource arbitration
      • Multiple resource dimensions
          – <priority, data-locality, memory, cpu, …>
      • In-built support for data-locality
          – Node, Rack etc.
          – Unique to YARN
  • 4. Scheduler Concepts
      • Input from AM(s) is a dynamic list of ResourceRequests
          – <resource-name, resource-capability>
          – Resource name: (hostname / rackname / any)
          – Resource capability: (memory, cpu, …)
          – Essentially an inverted <name, capability> request map from AM to RM
          – No notion of tasks!
      • Output - Container
          – Resource(s) grant on a specific machine
          – Verifiable grant
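The inverted <name, capability> request map described on slide 4 can be sketched as a small data model. This is a conceptual illustration in Python, not the YARN Java API: the class names, field names, priority value, and memory size are assumptions made for the example. Only the host and rack names (h1010/r11, h2121/r22) come from the walkthrough slides.

```python
from dataclasses import dataclass

ANY = "*"  # wildcard resource name meaning "any host" (the '*' entry)


@dataclass(frozen=True)
class Capability:
    # Hypothetical capability record: the slide lists (memory, cpu, ...)
    memory_mb: int
    vcores: int = 1


@dataclass
class ResourceRequest:
    priority: int
    resource_name: str      # hostname, rack name, or ANY
    capability: Capability
    num_containers: int


def build_requests(priority, capability, preferred):
    """Invert per-container (host, rack) preferences into per-name counts.

    `preferred` holds one (host, rack) pair for each container wanted.
    Each wanted container adds one to its host, its rack, and ANY, so the
    scheduler sees demand by location and never sees individual tasks.
    """
    counts = {}
    for host, rack in preferred:
        for name in (host, rack, ANY):
            counts[name] = counts.get(name, 0) + 1
    return [ResourceRequest(priority, name, capability, n)
            for name, n in counts.items()]


# Two containers wanted, one near each host (assumed sizes and priority).
requests = build_requests(1, Capability(memory_mb=1024),
                          [("h1010", "r11"), ("h2121", "r22")])
by_name = {r.resource_name: r.num_containers for r in requests}
print(by_name)
```

The point of the inversion is that the ApplicationMaster sends demand keyed by location rather than a list of tasks, which matches the "No notion of tasks!" bullet.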
  • 5. Scheduling Walkthrough
      MapReduce job with 2 maps and 1 reduce
  • 6. Scheduling Walkthrough
      Container allocation on r22/h2121:
  • 7. Scheduling Walkthrough
      Container allocation on r11/h1010:
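The request tables shown on the walkthrough slides are not part of this transcript. As a rough sketch of the idea, assuming that each node-local grant reduces the outstanding count for the host, its rack, and the '*' entry, the two allocations could be modeled as below. The starting counts are hypothetical and are not the slide's figures; the function name `allocate_on` is also invented for the example.

```python
ANY = "*"  # wildcard entry covering all hosts


def allocate_on(asks, host, rack):
    """Try to grant one node-local container on `host`.

    `asks` maps resource name -> outstanding container count.
    Returns True and decrements host, rack, and ANY when all three
    still show demand; otherwise leaves the table unchanged.
    This sketch ignores rack-local and off-switch grants.
    """
    names = (host, rack, ANY)
    if any(asks.get(name, 0) <= 0 for name in names):
        return False
    for name in names:
        asks[name] -= 1
    return True


# Hypothetical outstanding asks for two map containers.
asks = {"h1010": 1, "r11": 1, "h2121": 1, "r22": 1, ANY: 2}

first = allocate_on(asks, "h2121", "r22")   # allocation on r22/h2121
second = allocate_on(asks, "h1010", "r11")  # allocation on r11/h1010
third = allocate_on(asks, "h1010", "r11")   # no demand left
print(first, second, third, asks)
```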
  • 8. Writing Custom Applications
      • Grand total of 3 protocols
          – ClientRMProtocol
              – Application launching program
              – submitApplication
          – AMRMProtocol
              – Protocol between AM & RM for resource allocation
              – registerApplication / allocate / finishApplication
          – ContainerManagerProtocol
              – Protocol between AM & NM for container start/stop
              – startContainer / stopContainer
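To show how the three protocols fit together over an application's lifetime, here is a toy in-memory model. It is not the YARN API: the `FakeResourceManager` and `FakeNodeManager` classes, their state strings, and the container id format are assumptions for illustration. Only the order of calls mirrors the operations named on slide 8.

```python
class FakeResourceManager:
    """Stand-in for the RM side of ClientRMProtocol and AMRMProtocol."""

    def __init__(self):
        self.apps = {}
        self._next_container = 0

    def submit_application(self, app_id):    # client -> RM (submitApplication)
        self.apps[app_id] = "SUBMITTED"

    def register_application(self, app_id):  # AM -> RM (registerApplication)
        self.apps[app_id] = "RUNNING"

    def allocate(self, app_id, count):       # AM -> RM (allocate)
        start = self._next_container
        self._next_container += count
        return [f"container_{i}" for i in range(start, start + count)]

    def finish_application(self, app_id):    # AM -> RM (finishApplication)
        self.apps[app_id] = "FINISHED"


class FakeNodeManager:
    """Stand-in for the NM side of ContainerManagerProtocol."""

    def __init__(self):
        self.running = set()

    def start_container(self, container_id):  # AM -> NM (startContainer)
        self.running.add(container_id)

    def stop_container(self, container_id):   # AM -> NM (stopContainer)
        self.running.discard(container_id)


def run_app(rm, nm, app_id, count):
    """Drive one application through submit, register, allocate, start, stop, finish."""
    rm.submit_application(app_id)
    rm.register_application(app_id)
    containers = rm.allocate(app_id, count)
    for cid in containers:
        nm.start_container(cid)
    started = set(nm.running)  # snapshot while containers are up
    for cid in containers:
        nm.stop_container(cid)
    rm.finish_application(app_id)
    return containers, started


rm, nm = FakeResourceManager(), FakeNodeManager()
containers, started = run_app(rm, nm, "app_1", 3)
print(containers, rm.apps)
```

In a real deployment the client, the ApplicationMaster, and the NodeManagers are separate processes talking over RPC; the single-process model only makes the call sequence visible.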
  • 9. API improvements
      • Overload of the ‘*’ entry.
      • Release / reject containers
      • Ask for specific nodes/racks (only)
      • Don’t give me containers on these racks/nodes
      • Single client thread allowed to request containers
      • Overloaded allocate call
  • 10. Recent advancements
      • Tools for debugging AMs
          – Unmanaged AM
      • Generic AM
          – Utility libraries for writing
          – YARN-103, YARN-29
      • YARN project split and how multiple versions of MapReduce can coexist.
  • 11. Roadmap
      • MapReduce container reuse
      • RM restart capability
      • Multi-resource scheduling
      • Generic application history server
  • 12. Questions? Thank You!
