Apache Hadoop YARN - Hortonworks Meetup Presentation
 

    Presentation Transcript

    • Apache Hadoop YARN
    • A Cursory Look At The Architecture
      [Architecture diagram. Components shown: Client, Resource Manager, Node Manager, App Mstr, Container. Legend: MapReduce Status, Job Submission, Node Status, Resource Request.]
      © Hortonworks Inc. 2012. Confidential and Proprietary.
    • Global Scheduler (ResourceManager)
      • Pure resource arbitration
      • Multiple resource dimensions
        – <priority, data-locality, memory, cpu, …>
      • In-built support for data-locality
        – Node, Rack etc.
        – Unique to YARN
    • Scheduler Concepts
      • Input from AM(s) is a dynamic list of ResourceRequests
        – <resource-name, resource-capability>
        – Resource name: (hostname / rackname / any)
        – Resource capability: (memory, cpu, …)
        – Essentially an inverted <name, capability> request map from AM to RM
        – No notion of tasks!
      • Output - Container
        – Resource(s) grant on a specific machine
        – Verifiable grant
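
The "inverted request map" idea on this slide can be sketched with a small toy model. This is not the YARN API: `RequestTable`, `Capability`, the priority value 20, and the memory/vcore figures are all hypothetical names and numbers chosen for illustration. The point it tries to show is that the AM does not describe tasks; it only says how many containers of a given capability it would accept at each resource name (host, rack, or any).

```python
from collections import defaultdict
from dataclasses import dataclass

ANY = "*"  # assumed spelling of the "any" resource name


@dataclass(frozen=True)
class Capability:
    """Hypothetical resource capability: memory and CPU only."""
    memory_mb: int
    vcores: int


class RequestTable:
    """Toy model of an AM's outstanding asks, keyed by resource name."""

    def __init__(self):
        # (priority, resource_name, capability) -> containers wanted
        self.asks = defaultdict(int)

    def add(self, priority, capability, host, rack, count=1):
        # One wanted container is counted at every locality level,
        # so the table is keyed by location rather than by task.
        for name in (host, rack, ANY):
            self.asks[(priority, name, capability)] += count

    def wanted(self, priority, name, capability):
        return self.asks.get((priority, name, capability), 0)


table = RequestTable()
small = Capability(memory_mb=1024, vcores=1)
table.add(priority=20, capability=small, host="h1010", rack="r11")
table.add(priority=20, capability=small, host="h2121", rack="r22")
print(table.wanted(20, ANY, small))  # 2
```

Because every ask is also counted under the rack and under "any", a scheduler in this model can fall back from node-local to rack-local to off-rack placement by reading a single table.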
    • Scheduling Walkthrough MapReduce job with 2 maps and 1 reduce
    • Scheduling Walkthrough Container allocation on r22/h2121:
    • Scheduling Walkthrough Container allocation on r11/h1010:
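
The request tables from these walkthrough slides did not survive in the transcript, so the following is only a guess at the bookkeeping they illustrate. It assumes (none of this is stated in the source) that the two maps prefer r11/h1010 and r22/h2121, that the reduce has no locality preference, and that priorities 20 and 10 distinguish maps from reduces. `build_asks` and `allocate` are hypothetical helpers, not YARN calls.

```python
ANY = "*"  # assumed spelling of the "any" resource name


def build_asks(tasks):
    """tasks: list of (priority, host, rack); host and rack may be None."""
    asks = {}
    for priority, host, rack in tasks:
        for name in (host, rack, ANY):
            if name is not None:
                key = (priority, name)
                asks[key] = asks.get(key, 0) + 1
    return asks


def allocate(asks, priority, host, rack):
    """Record a container granted on rack/host.

    Returns False if nothing is outstanding at this priority; otherwise
    decrements every matching entry (host, rack, any) and returns True.
    """
    if asks.get((priority, ANY), 0) == 0:
        return False
    for name in (host, rack, ANY):
        if asks.get((priority, name), 0) > 0:
            asks[(priority, name)] -= 1
    return True


MAP, REDUCE = 20, 10  # hypothetical priority values
asks = build_asks([
    (MAP, "h1010", "r11"),
    (MAP, "h2121", "r22"),
    (REDUCE, None, None),
])
allocate(asks, MAP, "h2121", "r22")  # container allocation on r22/h2121
allocate(asks, MAP, "h1010", "r11")  # container allocation on r11/h1010
print(asks[(MAP, ANY)])     # 0
print(asks[(REDUCE, ANY)])  # 1
```

In this sketch a grant that is only rack-local leaves the original host entry untouched; a real ApplicationMaster would have to correct that in a later request.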
    • Writing Custom Applications
      • Grand total of 3 protocols
        – ClientRMProtocol
          – Application launching program
          – submitApplication
        – AMRMProtocol
          – Protocol between AM & RM for resource allocation
          – registerApplication / allocate / finishApplication
        – ContainerManagerProtocol
          – Protocol between AM & NM for container start/stop
          – startContainer / stopContainer
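
To make the order of calls across the three protocols concrete, here is a minimal in-memory sketch. The method names follow the slide's wording and may not match the exact Java signatures; `FakeResourceManager`, `FakeNodeManager`, and `run_application` are hypothetical stand-ins that only log calls. The toy `allocate` grants everything in one call, which is a simplification rather than a claim about how the real call behaves.

```python
class FakeResourceManager:
    """Stand-in for the RM side of ClientRMProtocol and AMRMProtocol."""

    def __init__(self, log):
        self.log = log
        self._next_id = 0

    # ClientRMProtocol (client -> RM)
    def submitApplication(self, app):
        self.log.append(f"submitApplication({app})")

    # AMRMProtocol (AM -> RM)
    def registerApplication(self, app):
        self.log.append(f"registerApplication({app})")

    def allocate(self, wanted):
        # Simplification: grant all requested containers immediately.
        ids = list(range(self._next_id, self._next_id + wanted))
        self._next_id += wanted
        self.log.append(f"allocate({wanted})")
        return ids

    def finishApplication(self, app):
        self.log.append(f"finishApplication({app})")


class FakeNodeManager:
    """Stand-in for the NM side of ContainerManagerProtocol (AM -> NM)."""

    def __init__(self, log):
        self.log = log

    def startContainer(self, container_id, command):
        self.log.append(f"startContainer({container_id}, {command})")

    def stopContainer(self, container_id):
        self.log.append(f"stopContainer({container_id})")


def run_application(rm, nm, app, commands):
    """Walk one application through all three protocols in order."""
    rm.submitApplication(app)            # client launches the application
    rm.registerApplication(app)          # AM announces itself to the RM
    containers = rm.allocate(len(commands))
    for cid, cmd in zip(containers, commands):
        nm.startContainer(cid, cmd)      # AM asks an NM to start work
    for cid in containers:
        nm.stopContainer(cid)
    rm.finishApplication(app)            # AM tells the RM it is done


log = []
run_application(FakeResourceManager(log), FakeNodeManager(log),
                "demo", ["map-0", "map-1"])
for entry in log:
    print(entry)
```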
    • API improvements
      • Overload of the ‘*’ entry.
      • Release / reject containers
      • Ask for specific nodes/racks (only)
      • Don’t give me containers on these racks/nodes
      • Single client thread allowed to request containers
      • Overloaded allocate call
    • Recent advancements
      • Tools for debugging AMs
        – Unmanaged AM
      • Generic AM
        – Utility libraries for writing
        – YARN-103, YARN-29
      • YARN project split and how multiple versions of MapReduce can coexist.
    • Roadmap
      • MapReduce container reuse
      • RM restart capability
      • Multi-resource scheduling
      • Generic application history server
    • Questions? Thank You!