Apache Hadoop YARN: Past, Present and Future

Published in: Technology

  1. Apache Hadoop YARN: Past, Present and Future. Junping Du. Melbourne, Aug. 31, 2016. © Hortonworks Inc. 2011 – 2016. All Rights Reserved
  2. Who.JSON
       {
         "name": "Junping Du",
         "job_title": "Lead Software Engineer @ Hortonworks YARN core team",
         "experiences": [
           {
             "software_industry_years": 10,
             "hadoop_experience": "Hadoop contributor before YARN came out, Apache Hadoop committer & PMC, Release Manager for Apache Hadoop 2.6",
             "non_hadoop_experience": "Architect in cloud computing and enterprise software"
           }
         ],
         "email": "junping_du@apache.org"
       }
  3. What is Apache Hadoop YARN?
       ⬢ YARN is short for “Yet Another Resource Negotiator”
       ⬢ Big Data Operating System
         – Resource management and scheduling
         – Support for “colorful” applications, like: Batch, Interactive, Real-Time, etc.
       ⬢ Enterprise adoption accelerating
         – Secure mode becoming more widespread
         – Multi-tenant support
         – Diverse workloads
       ⬢ SLAs
         – Tolerance for slow running jobs decreasing
         – Consistent performance desired
  4. Past
  5. A brief timeline
       ⬢ June-July 2010: first line of code
       ⬢ August 2011: open sourced
       ⬢ May 2012: first 2.0 alpha
       ⬢ August 2013: first 2.0 beta
       ⬢ Sub-project of Apache Hadoop
       ⬢ Releases tied to Hadoop releases
       ⬢ Alphas and betas
  6. GA Releases
       ⬢ Releases: 2.2 (Oct. 2013), 2.3 (Feb. 2014), 2.4 (Apr. 2014), 2.5 (Aug. 2014)
       ⬢ Highlights across these releases:
         – 1st GA
         – MR binary compatibility
         – YARN API cleanup
         – Testing!
         – 1st post-GA release
         – Bug fixes
         – Alpha features: load simulator, LCE enhancements
         – RM fail-over
         – CS preemption
         – Timeline Service V1
         – Writable REST APIs
         – Timeline Service V1 security
  7. GA Releases (Recent + Planning)
       ⬢ Releases: 2.6 (Nov. 2014), 2.7 (Apr. 2015), 2.8/2.9 (2nd half of 2016, estimated), 3.0 (TBD)
       ⬢ Highlights across these releases:
         – KMS
         – Long running service support
         – Rolling Upgrade
         – Node Label Support
         – Docker Container
         – Pluggable Authorization
         – Shared Resource Cache
         – Timeline Service V1.5
         – Graceful Decommission
         – Log CLI Enhancement
         – Timeline Service V2
  8. Outstanding YARN Features released in 2.6/2.7
       ⬢ Rolling Upgrade
       ⬢ Node Label
       ⬢ Pluggable ACLs
       [Diagram labels: Default Partition; Partition B (GPUs); Partition C (Windows); JDK 8, JDK 7, JDK 7]
  9. Recent Maintenance Release Updates
       ⬢ 2.6 and 2.7 maintenance releases are carried out
         – Only blockers and critical fixes are added
       ⬢ Apache Hadoop 2.6
         – 2.6.4 released in Feb. 2016
         – 2.6.3 released in Dec. 2015
         – 2.6.2 released in Oct. 2015
       ⬢ Apache Hadoop 2.7
         – 2.7.3 released in Aug. 2016
         – 2.7.2 released in Jan. 2016
         – 2.7.1 released in Jul. 2015
  10. Present
  11. YARN in Modern Data Architecture
        ⬢ Modern Data Architecture
          – Enables applications to access all your enterprise data through an efficient centralized platform
          – Supported by a centralized approach to governance, security and operations
          – Versatile enough to handle any application and dataset, no matter the size or type
        ⬢ YARN’s Evolution
          – The “CORE” of Modern Data Architecture
          – Centralized resource management, highly efficient scheduling, flexible resource model, isolation in security and performance, “colorful” applications support, etc.
  12. Apache Hadoop YARN
        [Diagram: ResourceManager (active), ResourceManager (standby), NodeManager1, NodeManager2, NodeManager3, NodeManager4. Annotations: Resources: 128G, 16 vcores; Auto-calculate node resources; Label: SAS; Dynamically update node resources]
  13. NodeManager Resource Management
        ⬢ Options to report NM resources based on node hardware
          – YARN-160
          – Restart of the NM required to enable the feature
        ⬢ Alternatively, admins can use the rmadmin command to update the node’s resources
          – YARN-291
          – Looks at dynamic-resources.xml
          – No restart of the NM or the RM required
  14. Apache Hadoop YARN Scheduler
        [Diagram: a User submits an Application (Queue A, 4G, 1 vcore) to the ResourceManager (active). Queues: Queue A – 50%, Queue B – 25%, Queue C – 25%; Label: SAS (non-exclusive); ordering: Priority/FIFO, Fair. Annotations: Inter queue pre-emption; Improvements to pre-emption; Support for application priority; Reservation for application; Support for cost based placement agent]
  15. Capacity scheduler
        ⬢ Support for application priority within a queue
          – YARN-1963
          – Users can specify application priority
          – Specified as an integer; a higher number means higher priority
          – Application priority can be updated while the application is running
        ⬢ Improvements to reservations
          – YARN-2572
          – Support for a cost based placement agent added in addition to greedy
        ⬢ Queue allocation policy can be switched to fair sharing
          – YARN-3319
          – Containers allocated on a fair share basis instead of FIFO
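The two in-queue ordering ideas on this slide can be illustrated with a minimal sketch. This is not the CapacityScheduler code: the `App` record, its field names, and the tie-breaking rules are assumptions made for illustration only. The only details taken from the slide are that a higher integer means higher priority and that fair ordering replaces FIFO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    """Hypothetical application record; field names are illustrative only."""
    app_id: str
    priority: int        # higher number means higher priority (per the slide)
    submit_order: int    # lower value means submitted earlier
    used_memory_mb: int = 0


def fifo_with_priority(apps):
    """Order by priority (highest first), then by submission order (FIFO)."""
    return sorted(apps, key=lambda a: (-a.priority, a.submit_order))


def fair_order(apps):
    """Assumed fair ordering: apps using the least memory are served first;
    ties fall back to submission order."""
    return sorted(apps, key=lambda a: (a.used_memory_mb, a.submit_order))


apps = [
    App("app_a", priority=1, submit_order=0, used_memory_mb=4096),
    App("app_b", priority=5, submit_order=1, used_memory_mb=1024),
    App("app_c", priority=5, submit_order=2, used_memory_mb=0),
]
print([a.app_id for a in fifo_with_priority(apps)])
print([a.app_id for a in fair_order(apps)])
```

Updating a running application's priority, as the slide describes, would amount to re-sorting with the new `priority` value in this simplified model.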
  16. Capacity scheduler
        ⬢ Support for non-exclusive node labels
          – YARN-3214
          – Improvement over partition that existed earlier
          – Better for cluster utilization
        ⬢ Improvements to pre-emption
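Why non-exclusive labels help utilization can be shown with a small sketch. It assumes a simplified model in which the empty string stands for the default partition and a non-exclusive partition lends its idle resources to requests that ask for no label; the function name and arguments are hypothetical, not a YARN API.

```python
def can_allocate(node_label, node_exclusive, requested_label):
    """Return True if a request may be placed on a node (simplified model).

    node_label:      label of the node's partition ("" means default partition)
    node_exclusive:  True if the partition is exclusive
    requested_label: label the application asked for ("" means none)
    """
    # A request always matches a node in the partition it asked for.
    if requested_label == node_label:
        return True
    # Assumption: only requests for the default partition may borrow idle
    # resources, and only from partitions marked non-exclusive.
    return requested_label == "" and not node_exclusive


# A default-partition request can use an idle non-exclusive "SAS" node,
# but not an exclusive one.
print(can_allocate("SAS", False, ""))
print(can_allocate("SAS", True, ""))
```

In this model an exclusive partition sits idle whenever no application asks for its label, which is the utilization gap the slide refers to.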
  17. Apache Hadoop YARN Application Lifecycle
        [Diagram: ResourceManager (active); Node 1 with a NodeManager (128G, 16 vcores); Application 1 AM and Container 1 / Container 2, each a process or Docker container (alpha); History Server (ATS 1.5 – leveldb + HDFS, JHS – HDFS); HDFS; log aggregation. Steps and annotations:]
        ⬢ Support added for graceful decommissioning of the NodeManager
        ⬢ Launch Application 1 AM: the AM process is launched via a ContainerExecutor (DCE, LCE, WSCE)
        ⬢ Request containers, allocate containers; support added to resize containers
        ⬢ Launch containers on the node using DCE, LCE, WSCE
        ⬢ Monitor/isolate memory and CPU; support added for disk and network isolation via CGroups (alpha)
  18. Apache Hadoop YARN
        ⬢ Graceful decommissioning of NodeManagers
          – YARN-914
          – Drains a node that’s being decommissioned to allow running containers to finish
        ⬢ Resource isolation support for disk and network
          – YARN-2619, YARN-2140
          – Containers get a fair share of disk and network resources using CGroups
          – Alpha feature
        ⬢ Docker support in LinuxContainerExecutor
          – YARN-3853
          – Support to launch Docker containers alongside process containers
          – Alpha feature
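The draining behaviour described for graceful decommissioning can be sketched as a tiny simulation. Everything here is an assumption for illustration: time is modelled as discrete ticks, each container has a known remaining runtime, and a timeout after which leftover containers are stopped is assumed rather than taken from the slide.

```python
def drain_node(running, timeout_ticks):
    """Simulate draining a node that is being decommissioned.

    running:       dict of container id -> remaining ticks of work (hypothetical)
    timeout_ticks: assumed upper bound on how long the drain may take
    Returns (finished, killed): containers that completed on their own and
    containers still running when the timeout expired.
    """
    remaining = dict(running)   # copy so the caller's dict is untouched
    finished = []
    for _ in range(timeout_ticks):
        if not remaining:
            break               # node is empty, safe to decommission
        for cid in list(remaining):
            remaining[cid] -= 1
            if remaining[cid] <= 0:
                finished.append(cid)
                del remaining[cid]
    killed = sorted(remaining)  # whatever is left after the timeout
    return finished, killed


print(drain_node({"c1": 2, "c2": 5}, timeout_ticks=3))
```

No new containers are placed on the node in this model, which is the essential difference from a normal running node.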
  19. Apache Hadoop YARN
        ⬢ Support for container resizing
          – YARN-1197
          – Allows applications to change the size of an existing container
        ⬢ ATS 1.5
          – YARN-4233
          – Store timeline events on HDFS
          – Better scalability and reliability
  20. Operational support
        ⬢ Improvements to existing tools (like yarn logs)
        ⬢ New tools added (yarn top)
        ⬢ Improvements to the RM UI to expose more details about running applications
  21. Future
  22. Packaging
        ⬢ Containers
          – Lightweight mechanism for packaging and resource isolation
          – Popularized and made accessible by Docker
          – Can replace VMs in some cases
          – Or more accurately, VMs got used in places where they didn’t need to be
        ⬢ Native integration ++ in YARN
          – Support for “Container Runtimes” in LCE: YARN-3611
          – Process runtime
          – Docker runtime
  23. APIs
        ⬢ Applications need simple APIs
        ⬢ Need to be deployable “easily”
        ⬢ Simple REST API layer fronting YARN
          – https://issues.apache.org/jira/browse/YARN-4793
          – [Umbrella] Simplified API layer for services and beyond
        ⬢ Spawn services and manage them
  24. YARN as a Platform
        ⬢ YARN itself is evolving to support services and complex apps
          – https://issues.apache.org/jira/browse/YARN-4692
          – [Umbrella] Simplified and first-class support for services in YARN
        ⬢ Scheduling
          – Application priorities: YARN-1963
          – Affinity / anti-affinity: YARN-1042
          – Services as first-class citizens: preemption, reservations, etc.
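As a rough illustration of what an anti-affinity constraint asks of a scheduler, the sketch below places each container of a service on a different node. The function, its arguments, and the all-or-nothing policy are assumptions for illustration; they do not reflect the design tracked in YARN-1042.

```python
def place_with_anti_affinity(num_containers, nodes, occupied):
    """Pick one distinct node per container, skipping nodes that already
    host a container of the same service.

    num_containers: how many new containers the service wants
    nodes:          candidate node names, in preference order
    occupied:       set of nodes already running this service
    Returns a list of chosen nodes, or None if the constraint cannot be met.
    """
    free = [n for n in nodes if n not in occupied]
    if len(free) < num_containers:
        return None  # assumed policy: all-or-nothing placement
    return free[:num_containers]


print(place_with_anti_affinity(2, ["n1", "n2", "n3"], occupied={"n2"}))
```

Affinity would be the mirror image: prefer nodes that are already in `occupied` instead of excluding them.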
  25. YARN as a Platform (Contd)
        ⬢ Application & Services upgrades
          – “Do an upgrade of my Spark / HBase apps with minimal impact to end-users”
          – YARN-4726
        ⬢ Simplified discovery of services via DNS mechanisms: YARN-4757
        ⬢ YARN Federation – to infinity and beyond: YARN-2915
  26. YARN Service Framework
        ⬢ Platform is only as good as the tools
        ⬢ A native YARN framework
          – https://issues.apache.org/jira/browse/YARN-4692
          – [Umbrella] Native YARN framework layer for services and beyond
        ⬢ Slider supporting a DAG of apps:
          – https://issues.apache.org/jira/browse/SLIDER-875
  27. Operational and User Experience
        ⬢ Modern YARN web UI: YARN-3368
        ⬢ Enhanced shell interfaces
        ⬢ Metrics: Timeline Service V2 (YARN-2928)
        ⬢ Application & Services monitoring, integration with other systems
        ⬢ First class support for YARN hosted services in Ambari
          – https://issues.apache.org/jira/browse/AMBARI-17353
  28. Use-cases.. Assemble!
        [Diagram: YARN and Other Platform Services (Storage, Resource Management, Security, Service Discovery, Management, Monitoring, Alerts, Governance) supporting assemblies and engines: Holiday Assembly (HBase, Web Server), IOT Assembly (Kafka, Storm, HBase, Solr), MR, Tez, Spark, …]
  29. Future Work List (I)
        ⬢ Arbitrary resource types
          – YARN-3926
          – Admins can decide what resource types to support
          – Resource types read via a config file
        ⬢ New scheduler features
          – YARN-4902
          – Support richer placement strategies such as affinity, anti-affinity
        ⬢ Distributed scheduling
          – YARN-2877, YARN-4742
          – NMs run a local scheduler
          – Allows faster scheduling turnaround
        ⬢ YARN federation
          – YARN-2915
          – Allows YARN to scale out to tens of thousands of nodes
          – Cluster of clusters which appear as a single cluster to an end user
        ⬢ Better support for disk and network isolation
          – Tied to supporting arbitrary resource types
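The idea behind arbitrary resource types can be sketched by treating a resource as a mapping from type name to amount, instead of a fixed memory and vcores pair. The type names (`gpu`, `fpga`) and helper functions below are hypothetical and are not the YARN-3926 API; in YARN the supported types would come from admin configuration.

```python
def fits(request, available):
    """True if every requested resource type is available in sufficient
    quantity. Types missing from the node count as zero."""
    return all(available.get(name, 0) >= amount
               for name, amount in request.items())


def allocate(request, available):
    """Return the node's remaining resources after the allocation,
    or None if the request does not fit."""
    if not fits(request, available):
        return None
    return {name: amount - request.get(name, 0)
            for name, amount in available.items()}


# Hypothetical node: the slide deck's 128G / 16 vcores plus an assumed "gpu" type.
node = {"memory_mb": 131072, "vcores": 16, "gpu": 2}
print(allocate({"vcores": 2, "gpu": 1}, node))
print(fits({"fpga": 1}, node))
```

Because the check iterates over whatever keys are present, adding a new type such as disk or network bandwidth needs no code change in this model, which is why the slide ties better isolation to this feature.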
  30. Future Work List (II)
        ⬢ Simplified and first-class support for services in YARN
          – YARN-4692
          – Container restart (YARN-3988)
            • Allow container restart without losing allocation
          – Service discovery via DNS (YARN-4757)
            • Running services can be discovered via DNS
          – Allocation re-use (YARN-4726)
            • Allow AMs to stop a container but not lose resources on the node
        ⬢ Enhance Docker support
          – YARN-3611
          – Support to mount volumes
          – Isolate containers using CGroups
        ⬢ ATS v2 Phase 2
          – YARN-2928 (Phase 1), YARN-5355 (Phase 2)
          – Run timeline service on HBase
          – Support for more data, better performance
        ⬢ Also in the pipeline
          – Switch to Java 8 with Hadoop 3.0
          – Add support for GPU isolation
          – Better tools to detect limping nodes
          – New RM UI: YARN-3368
  31. HDP Evolution with Apache Hadoop YARN
  32. Thank you!
