YARN - Presented At Dallas Hadoop User Group


Published on

Developing Hadoop Solutions With YARN.

Published in: Technology
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide
  • Traditional Batch vsNextGen
  • First Hadoop use case was to map the whole internet. Graph with millions of nodes and trillions of edges. Still back to the same problem of silos in order to manipulate data for interactive or Online applicationsLong Story Short, no support for alterative processing models. Iterative tasks can take 10x longer…. because of IO barriers
  • In Classic Hadoop MapReduce was part of the JobTracker and TaskTracker. Hence everything had to be built on MapReduce firstHas some scalbility limits 4k nodesIterative processes can take forever on map reduceJT Failures kill jobs, and everything in the que
  • Typical Open Soruce stack, as we can see MapReduce is On TOP of MapReduce, All other applications like Pig and Hive and HBase are also on top of MapReduce
  • So while Hadoop 1.x had its uses this is really about turning Hadoop into the next generation platform. So what does that mean? A platform should be able to do multiple things, ergo more then just batch processing. Need Batch, Interactive, Online, and Streaming capabilities to really turn Hadoop into a Next Gen Platform.SCALES! Yahoo plans to move into a 10k node cluster
  • So what does this really do? It provides a distributed application framework.Hadoop is now providing a platform were we can store all data in a reliable wayThen on the same platform be able to processes this data without having to move it. Data locality
  • New additions to the family like Falcon for data lifecycle management TEZ a new way of processing which avoids some of the IO barriers that MapReduce experiencedKnox for security and other enterprise features.But most importantly Yarn which as you can see everything is now on top of.
  • AKA a container spawned by an AM can be a Client and ask for another application to start which in turn can do the same thing.
  • Now we have a concept of deploying applications into the hadoop clusterThese applications run in containers of set resources
  • RM takes place of JT and still has scheduling ques and such like the fair, capacity and hierarchical ques
  • Datalocality – attempts to find a local host, if that fails moves to nearest rackFault Tolerance – Rebust in terms of managing containers Recovery- If the AM dies MapReduce application master writes a checkpoint to HDFS. This way we can recover from an AM that dies, it will read the checkpoint and continue.Inter-application Priorites -Maps have to be completed before Reducers right- so there is a complex process in the application master to balance mappers and reducers. Complex feedback from RM -- App master can now kinda look ahead and find out how many resources it can get in the next 20 minutesMigrate directly to YARN without changing a single line of code. Just need to recompile.
  • Havn’t used Weave but its on the 2do list. Saddly my tasks keep growing and I can’t do them in parallel
  • Application Attempt Id ( combination of attemptId and fail count )Application Submission Context – submitted by the client
  • getAllQueues - metrics related to the que such as max capacity, current capacity, application count.getApplications - list of applicationsgetNodeReports – id, rack, host, numCansApplicationSubContext needs a ContainerLaunchContext as well + resources, priority, que etc.
  • YARN - Presented At Dallas Hadoop User Group

    1. 1. Hadoop 2.0 –YARN Yet Another Resource Negotiator Rommel Garcia Solutions Engineer © Hortonworks Inc. 2013 Page 1
    2. 2. Agenda • Hadoop 1.X & 2.X – Concepts Recap • YARN Architecture – How does this affect MRv1? • Slots be gone – What does this mean for MapReduce? • Building YARN Applications •Q &A © Hortonworks Inc. 2013
    3. 3. Hadoop 1.X vs. 2.X Recap over the differences © Hortonworks Inc. 2013
    4. 4. The 1st Generation of Hadoop: Batch HADOOP 1.0 Built for Web-Scale Batch Apps Single App Single App INTERACTIVE ONLINE Single App Single App Single App BATCH BATCH BATCH HDFS HDFS HDFS © Hortonworks Inc. 2013 • All other usage patterns must leverage that same infrastructure • Forces the creation of silos for managing mixed workloads
    5. 5. Hadoop MapReduce Classic • JobTracker –Manages cluster resources and job scheduling • TaskTracker –Per-node agent –Manage tasks © Hortonworks Inc. 2013 Page 5
    6. 6. Hadoop 1 • Limited up to 4,000 nodes per cluster • O(# of tasks in a cluster) • JobTracker bottleneck - resource management, job scheduling and monitoring • Only has one namespace for managing HDFS • Map and Reduce slots are static • Only job to run is MapReduce © Hortonworks Inc. 2013
    7. 7. Hadoop 1.X Stack OPERATIONAL SERVICES AMBARI DATA SERVICES FLUME HIVE & PIG OOZIE HCATALOG SQOOP HBASE LOAD & EXTRACT HADOOP CORE PLATFORM SERVICES NFS MAP REDUCE WebHDFS HDFS Enterprise Readiness High Availability, Disaster Recovery, Security and Snapshots HORTONWORKS DATA PLATFORM (HDP) OS © Hortonworks Inc. 2013 Cloud VM Appliance Page 7
    8. 8. Our Vision: Hadoop as Next-Gen Platform Single Use System Multi Purpose Platform Batch Apps Batch, Interactive, Online, Streaming, … HADOOP 1.0 HADOOP 2.0 MapReduce (data processing) MapReduce Others (data processing) YARN (cluster resource management & data processing) (cluster resource management) HDFS HDFS2 (redundant, reliable storage) (redundant, reliable storage) © Hortonworks Inc. 2012 Confidential and Proprietary. 2013. Page 8
    9. 9. YARN: Taking Hadoop Beyond Batch Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS with Predictable Performance and Quality of Service Applications Run Natively IN Hadoop BATCH INTERACTIVE (MapReduce) (Tez) ONLINE (HBase) STREAMING (Storm, S4,…) GRAPH (Giraph) IN-MEMORY (Spark) HPC MPI (OpenMPI) OTHER (Search) (Weave…) YARN (Cluster Resource Management) HDFS2 (Redundant, Reliable Storage) © Hortonworks Inc. 2013 Page 9
    10. 10. Hadoop 2 • Potentially up to 10,000 nodes per cluster • O(cluster size) • Supports multiple namespace for managing HDFS • Efficient cluster utilization (YARN) • MRv1 backward and forward compatible • Any apps can integrate with Hadoop • Beyond Java © Hortonworks Inc. 2013
    11. 11. Hadoop 2.X Stack OPERATIONAL SERVICES AMBARI DATA SERVICES FLUME HBASE FALCON* OOZIE HIVE & PIG HCATALOG SQOOP LOAD & EXTRACT HADOOP CORE PLATFORM SERVICES NFS WebHDFS KNOX* MAP REDUCE TEZ* YARN HDFS Enterprise Readiness High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots HORTONWORKS DATA PLATFORM (HDP) OS/VM Cloud Appliance *included Q1 2013 © Hortonworks Inc. 2013 Page 11
    12. 12. YARN Architecture © Hortonworks Inc. 2013
    13. 13. A Brief History of YARN • Originally conceived & architected by the team at Yahoo! – Arun Murthy created the original JIRA in 2008, led the PMC – Currently Arun is the Lead for Map-Reduce/YARN/Tez at Hortonworks and was formerly Architect Hadoop MapReduce at Yahoo • The team at Hortonworks has been working on YARN for 4 years • YARN based architecture running at scale at Yahoo! – Deployed on 35,000 nodes for about a year – Implemented Storm-on-Yarn that processes 133,000 events per second. © Hortonworks Inc. 2013 Page 13
    14. 14. Concepts • Application –Application is a job submitted to the framework –Example – Map Reduce Job • Container –Basic unit of allocation –Fine-grained resource allocation across multiple resource types (memory, cpu, disk, network, gpu etc.) – container_0 = 2GB, 1CPU – container_1 = 1GB, 6 CPU –Replaces the fixed map/reduce slots © Hortonworks Inc. 2013 14
    15. 15. Architecture • Resource Manager –Global resource scheduler –Hierarchical queues –Application management • Node Manager –Per-machine agent –Manages the life-cycle of container –Container resource monitoring • Application Master –Per-application –Manages application scheduling and task execution –E.g. MapReduce Application Master © Hortonworks Inc. 2013 15
    16. 16. YARN – Running Apps create app1 Hadoop Client 1 ASM NM ResourceManager .......negotiates....... Containers .......reports to....... ASM submit app1 Scheduler .......partitions....... Resources create app2 Hadoop Client 2 submit app2 Scheduler ASM queues status report NodeManager C2.1 NodeManager C2.2 NodeManager AM2 Rack1 © Hortonworks Inc. 2012 NodeManager NodeManager C1.3 NodeManager C2.3 C1.2 NodeManager AM1 Rack2 NodeManager C1.4 NodeManager C1.1 RackN
    17. 17. Slots be gone! How does MapReduce run on YARN © Hortonworks Inc. 2013
    18. 18. Apache Hadoop MapReduce on YARN • Original use-case • Most complex application to build – Data-locality – Fault tolerance – ApplicationMaster recovery: Check point to HDFS – Intra-application Priorities: Maps v/s Reduces – Needed complex feedback mechanism from ResourceManager – Security – Isolation • Binary compatible with Apache Hadoop 1.x © Hortonworks Inc. 2013 Page 18
    19. 19. Apache Hadoop MapReduce on YARN ResourceManager Scheduler NodeManager NodeManager NodeManager NodeManager map 1.1 map2.1 reduce2.1 NodeManager NodeManager MR AM 1 NodeManager map1.2 NodeManager reduce1.1 © Hortonworks Inc. 2012 NodeManager MR AM2 NodeManager NodeManager map2.2 NodeManager reduce2.2
    20. 20. Efficiency Gains of YARN • Key Optimizations – No hard segmentation of resource into map and reduce slots – Yarn scheduler is more efficient – All resources are fungible • Yahoo has over 30000 nodes running YARN across over 365PB of data. • They calculate running about 400,000 jobs per day for about 10 million hours of compute time. • They also have estimated a 60% – 150% improvement on node usage per day. • Yahoo got rid of a whole colo (10,000 node datacenter) because of their increased utilization. © Hortonworks Inc. 2013
    21. 21. An Example Calculating Node Capacity • Important Parameters – mapreduce.[map|reduce].memory.mb – This is the physical ram hard-limit enforced by Hadoop on the task – mapreduce.[map|reduce].java.opts – The heapsize of the jvm –Xmx – yarn.scheduler.minimum-allocation-mb – The smallest container yarn will allow – yarn.nodemanager.resource.memory-mb – The amount of physical ram on the node – yarn.nodemanager.vmem-pmem-ratio – The amount of virtual ram each container is allowed. – This is calculated by containerMemoryRequest*vmem-pmem-ratio © Hortonworks Inc. 2013
    22. 22. Calculating Node Capacity Continued • Lets pretend we need a 1g map and a 2g reduce • mapreduce[map|reduce].memory.mb = [-Xmx 1g | -Xmx 2g] • Remember a container has more overhead then just your heap! Add 512mb to the container limit for overhead • mapreduce.[map.reduce].memory.mb= [1536 | 2560] • We have 36g per node and minimum allocations of 512mb • yarn.nodemanager.resource.memory-mb=36864 • yarn.scheduler.minimum-allocation-mb=512 • Virtual Memory for each container is • Map: 1536mb*vmem-pmem-ratio (default is 2.1) = 3225.6mb • Reduce 2560mb*vmem-pmem-ratio = 5376mb • Our 36g node can support • 24 Maps OR 14 Reducers OR any combination allowed by the resources on the node © Hortonworks Inc. 2013
    23. 23. Building YARN Apps Super Simple APIs © Hortonworks Inc. 2013
    24. 24. YARN – Implementing Applications • What APIs do I need to use? –Only three protocols – Client to ResourceManager – Application submission – ApplicationMaster to ResourceManager – Container allocation – ApplicationMaster to NodeManager – Container launch –Use client libraries for all 3 actions –Module yarn-client –Provides both synchronous and asynchronous libraries –Use 3rd party like Weave – http://continuuity.github.io/weave/ © Hortonworks Inc. 2013 24
    25. 25. YARN – Implementing Applications • What do I need to do? –Write a submission Client –Write an ApplicationMaster (well copy-paste) –DistributedShell is the new WordCount –Get containers, run whatever you want! © Hortonworks Inc. 2013 25
    26. 26. YARN – Implementing Applications • What else do I need to know? –Resource Allocation & Usage –ResourceRequest –Container –ContainerLaunchContext –LocalResource –ApplicationMaster –ApplicationId –ApplicationAttemptId –ApplicationSubmissionContext © Hortonworks Inc. 2013 26
    27. 27. YARN – Resource Allocation & Usage • ResourceRequest – Fine-grained resource ask to the ResourceManager – Ask for a specific amount of resources (memory, cpu etc.) on a specific machine or rack – Use special value of * for resource name for any machine ResourceRequest priority resourceName capability numContainers © Hortonworks Inc. 2013 Page 27
    28. 28. YARN – Resource Allocation & Usage • ResourceRequest priority 1 © Hortonworks Inc. 2013 <4gb, 1 core> numContainers 1 rack0 1 * <2gb, 1 core> resourceName host01 0 capability 1 * 1 Page 28
    29. 29. YARN – Resource Allocation & Usage • Container – The basic unit of allocation in YARN – The result of the ResourceRequest provided by ResourceManager to the ApplicationMaster – A specific amount of resources (cpu, memory etc.) on a specific machine Container containerId resourceName capability tokens © Hortonworks Inc. 2013 Page 29
    30. 30. YARN – Resource Allocation & Usage • ContainerLaunchContext – The context provided by ApplicationMaster to NodeManager to launch the Container – Complete specification for a process – LocalResource used to specify container binary and dependencies – NodeManager responsible for downloading from shared namespace (typically HDFS) ContainerLaunchContext container commands environment localResources LocalResource uri type © Hortonworks Inc. 2013 Page 30
    31. 31. YARN - ApplicationMaster • ApplicationMaster – Per-application controller aka container_0 – Parent for all containers of the application – ApplicationMaster negotiates all it’s containers from ResourceManager – ApplicationMaster container is child of ResourceManager – Think init process in Unix – RM restarts the ApplicationMaster attempt if required (unique ApplicationAttemptId) – Code for application is submitted along with Application itself © Hortonworks Inc. 2013 Page 31
    32. 32. YARN - ApplicationMaster • ApplicationMaster – ApplicationSubmissionContext is the complete specification of the ApplicationMaster, provided by Client – ResourceManager responsible for allocating and launching ApplicationMaster container ApplicationSubmissionContext resourceRequest containerLaunchContext appName queue © Hortonworks Inc. 2013 Page 32
    33. 33. YARN Application API - Overview • YarnClient is submission client api • Both synchronous & asynchronous APIs for resource allocation and container start/stop • Synchronous API – AMRMClient – AMNMClient • Asynchronous API – AMRMClientAsync – AMNMClientAsync © Hortonworks Inc. 2013 Page 33
    34. 34. YARN Application API – The Client New Application Request: YarnClient.createApplication ResourceManager 1 Client2 Scheduler Submit Application: YarnClient.submitApplication 2 NodeManager NodeManager NodeManager NodeManager Container 1.1 Container 2.2 Container 2.4 NodeManager NodeManager AM 1 NodeManager Container 1.2 NodeManager Container 1.3 © Hortonworks Inc. 2012 NodeManager AM2 NodeManager NodeManager Container 2.1 NodeManager Container 2.3
    35. 35. YARN Application API – The Client • YarnClient – createApplication to create application – submitApplication to start application – Application developer needs to provide ApplicationSubmissionContext – APIs to get other information from ResourceManager – getAllQueues – getApplications – getNodeReports – APIs to manipulate submitted application e.g. killApplication © Hortonworks Inc. 2013 Page 35
    36. 36. YARN Application API – Resource Allocation ResourceManager AMRMClient.allocate Scheduler Container 3 NodeManager NodeManager NodeManager 2 4 unregisterApplicationMaster NodeManager NodeManager NodeManager NodeManager NodeManager NodeManager AM 1 NodeManager registerApplicationMaster NodeManager © Hortonworks Inc. 2012
    37. 37. YARN Application API – Resource Allocation • AMRMClient - Synchronous API for ApplicationMaster to interact with ResourceManager – Prologue / epilogue – registerApplicationMaster / unregisterApplicationMaster – Resource negotiation with ResourceManager – Internal book-keeping - addContainerRequest / removeContainerRequest / releaseAssignedContainer – Main API – allocate – Helper APIs for cluster information – getAvailableResources – getClusterNodeCount © Hortonworks Inc. 2013 Page 37
    38. 38. YARN Application API – Resource Allocation • AMRMClientAsync - Asynchronous API for ApplicationMaster – Extension of AMRMClient to provide asynchronous CallbackHandler – Callbacks make it easier to build mental model of interaction with ResourceManager for the application developer – onContainersAllocated – onContainersCompleted – onNodesUpdated – onError – onShutdownRequest © Hortonworks Inc. 2013 Page 38
    39. 39. YARN Application API – Using Resources ResourceManager Scheduler NodeManager NodeManager NodeManager NodeManager NodeManager NodeManager NodeManager NodeManager Container 1.1 AMNMClient.startContainer NodeManager NodeManager AM 1 AMNMClient.getContainerStatus NodeManager NodeManager © Hortonworks Inc. 2012
    40. 40. YARN Application API – Using Resources • AMNMClient - Synchronous API for ApplicationMaster to launch / stop containers at NodeManager – Simple (trivial) APIs – startContainer – stopContainer – getContainerStatus © Hortonworks Inc. 2013 Page 40
    41. 41. YARN Application API – Using Resources • AMNMClient - Asynchronous API for ApplicationMaster to launch / stop containers at NodeManager – Simple (trivial) APIs – startContainerAsync – stopContainerAsync – getContainerStatusAsync – CallbackHandler to make it easier to build mental model of interaction with NodeManager for the application developer – onContainerStarted – onContainerStopped – onStartContainerError – onContainerStatusReceived © Hortonworks Inc. 2013 Page 41
    42. 42. Hadoop Summit 2014 © Hortonworks Inc. 2013 Page 42
    43. 43. THANK YOU! Rommel Garcia, Solution Engineer – Big Data rgarcia@hortonworks.com © Hortonworks Inc. 2012 Page 43