Bring your Service to YARN

Published in: Technology, Business

  • We recently integrated the Ambari agent into Slider. Currently, the heartbeat and registration messages are identical to those exchanged between the host agent and the Ambari server. The agent is hosted within each application component container and communicates with the Slider App Master, which currently uses a simple state machine to guide the components through install and start. Integration between an Ambari server and components hosted in Slider containers should therefore be relatively straightforward, although the messages exchanged may change going forward, and the fact that the agent is per-container rather than per-host may require some modifications on the server side.
  • Bring your Service to YARN

    1. Applications on YARN using Slider: Provisioning, Managing, and Monitoring YARN Applications. Sumit Mohanty @smohanty (@hortonworks), Steve Loughran @steveloughran (@hortonworks). © Hortonworks Inc. 2014: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
    2. Backdrop …
    3. Hadoop as Next-Gen Platform
        [Diagram: HADOOP 1.0, a single-use system for batch apps: MapReduce (cluster resource management & data processing) on HDFS (redundant, reliable storage). HADOOP 2.0, a multi-purpose platform for batch, interactive, online, streaming, …: MapReduce (data processing) and others (data processing) on YARN (cluster resource management) on HDFS2 (redundant, reliable storage)]
    4. The Platform
        [Diagram: HDFS2 (redundant, reliable storage) at the base, YARN (cluster resource management) above it, and the applications on top: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4, …), GRAPH (Giraph), HPC MPI (OpenMPI), IN-MEMORY (Spark), App X, App Z]
    5. Advantages: Applications Run Natively IN Hadoop
        [Diagram: HDFS2 (redundant, reliable storage), YARN (cluster resource management), and the applications BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4, …), GRAPH (Giraph), HPC MPI (OpenMPI), IN-MEMORY (Spark), App X, App Z, annotated with: Availability (always-on), Flexibility (dynamic scaling), Resource Management (optimization)]
    6. Long Lived Applications
    7. Application – in simple terms
        • Set of active processes (component instance)
        • Communicating within as well as with outside
        • Shared configurations
        • …
    8. Long lived applications
        • Management of the application is critical
            – Start/Stop
            – Reconfigure
            – Scale up/down
            – Rolling-restart
            – Decommission/Recommission
        • Resource failure and downtime become a real concern
            – Detection of failure and remedy
            – Managed downtime
        • Logs
        • Upgrade
        • Metrics – Ganglia, JMX
        • Alerts
    9. Applications on YARN
    10. YARN runs code across the cluster
        [Diagram: one YARN Resource Manager (“The RM”) and three YARN Node Managers, each shown alongside HDFS]
        • Servers run YARN Node Managers (NMs)
        • NMs heartbeat to the Resource Manager (RM)
        • RM schedules work over cluster
        • RM allocates containers to apps
        • NMs start containers
        • NMs report container health
    11. Client creates App Master
        [Diagram: a Client and an Application Master, with the YARN Resource Manager (“The RM”) and three YARN Node Managers]
    12. AM asks for containers
        [Diagram: YARN Resource Manager, three YARN Node Managers, the Application Master, and three Containers]
    13. YARN notifies AM of failures
        [Diagram: YARN Resource Manager, three YARN Node Managers, the Application Master, and three Containers]
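
The interaction on slides 10 to 13 (the AM asks for containers, YARN notifies the AM of failures) can be sketched as a toy model. This is only an illustration: `ToyResourceManager`, `ToyAppMaster`, and all of their methods are hypothetical names invented here, not the YARN client API, and the "replace every failed container" policy is an assumption rather than something the slides state.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for the RM: hands out container ids."""
    _ids: count = field(default_factory=lambda: count(1))
    live: dict = field(default_factory=dict)  # container id -> node name

    def allocate(self, node: str) -> int:
        cid = next(self._ids)
        self.live[cid] = node
        return cid

    def fail(self, cid: int, app_master: "ToyAppMaster") -> None:
        # The RM forgets the container and notifies the owning AM.
        self.live.pop(cid)
        app_master.on_container_failed(cid)


@dataclass
class ToyAppMaster:
    """Hypothetical AM that keeps `desired` containers running."""
    rm: ToyResourceManager
    desired: int
    containers: set = field(default_factory=set)

    def reconcile(self, node: str = "node-1") -> None:
        # Ask for containers until the desired count is reached.
        while len(self.containers) < self.desired:
            self.containers.add(self.rm.allocate(node))

    def on_container_failed(self, cid: int) -> None:
        # Assumed policy: drop the failed container and request a replacement.
        self.containers.discard(cid)
        self.reconcile()
```

Calling `reconcile()` once fills the desired count; a later `rm.fail(cid, am)` triggers the notification path and a replacement request.
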
    14. Application on YARN
        • Do I need to re-write parts of my application?
        • How do I package my application for YARN?
        • How do I configure my application?
        • How do I debug my application?
        • Can I still manage my application?
        • Can I monitor my application?
        • Can I manage inter-/intra-application dependencies?
        • How will the external clients communicate?
        • What does it take to secure the application?
    15. Apache Slider
    16. Apache Slider
        • Several brownfield, LOB applications exist
            – Many run alongside Hadoop clusters
        • Many Hadoop clusters exist, some with large compute capacity
            – Full spectrum of interactions with Hadoop services (0 → All)
        Apache Slider is a project in incubation at the Apache Software Foundation with the goal of making it possible and easy to deploy existing applications onto a YARN cluster
        • History
            – HBase on YARN (HOYA)
            – HBase/Accumulo/Flume/… on YARN
            – Agent Provider + App Packages
    17. Components of Slider
        1. AppMaster
        2. AgentProvider
        3. Agent
        4. AppPackage
        5. CLI
        6. Registry
        [Diagram: Slider CLI and Slider App Package; YARN Resource Manager (“The RM”); two YARN Node Managers, each hosting an Agent and a component instance; App Master / Agent Provider; Registry]
    18. Application by Slider
        Similar to any YARN application
        1. CLI starts an instance of the AM
        2. AM requests containers
        3. Containers activate with an Agent
        4. Agent gets application definition
        5. Agent registers with AM
        6. AM issues commands
        7. Agent reports back status, configuration, etc.
        8. AM publishes endpoints, configurations
        [Diagram: Slider CLI and Slider App Package; YARN Resource Manager (“The RM”); two YARN Node Managers, each hosting an Agent and a component instance; App Master / Agent Provider; Application Registry]
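
Steps 5 to 7 of slide 18 (the Agent registers, the AM issues commands, the Agent reports back) can be sketched as a small state machine that walks a component through install and start. The state names, the `ToyAppMaster` class, and the `heartbeat` signature below are assumptions made for illustration; they are not Slider's actual classes or wire protocol.

```python
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    INIT = 1
    INSTALLING = 2
    INSTALLED = 3
    STARTING = 4
    STARTED = 5


# Resting state -> (command to issue, state while that command runs)
NEXT_COMMAND = {
    State.INIT: ("INSTALL", State.INSTALLING),
    State.INSTALLED: ("START", State.STARTING),
}

# In-progress state -> state reached when the agent reports success
ON_SUCCESS = {
    State.INSTALLING: State.INSTALLED,
    State.STARTING: State.STARTED,
}


class ToyAppMaster:
    """Hypothetical AM side of the agent conversation."""

    def __init__(self) -> None:
        self.agents: Dict[str, State] = {}

    def register(self, agent_id: str) -> None:
        # Step 5: a freshly launched agent announces itself.
        self.agents[agent_id] = State.INIT

    def heartbeat(self, agent_id: str, completed: bool = False) -> Optional[str]:
        """Steps 6 and 7: take the agent's report, return the next command."""
        state = self.agents[agent_id]
        if completed and state in ON_SUCCESS:
            state = ON_SUCCESS[state]
        command = None
        if state in NEXT_COMMAND:
            command, state = NEXT_COMMAND[state]
        self.agents[agent_id] = state
        return command
```

A heartbeat that arrives while a command is still running returns `None`, so the agent simply keeps working until it can report completion.
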
    19. Slider Application Specification
    20. Slider Metainfo
        <metainfo><services><service>
          <name>HBASE</name>
          <version>0.96.0.2.1.1</version>
          <exportGroups><exportGroup>
            <name>QuickLinks</name>
            <exports><export>
              <name>org.apache.slider.jmx</name>
              <value>http://${HBASE_MASTER_HOST}:${site.hbase-site.hbase.master.info.port}/jmx</value>
            </export></exports>
          </exportGroup></exportGroups>
          <commandOrders><commandOrder>
            <command>HBASE_REGIONSERVER-START</command>
            <requires>HBASE_MASTER-STARTED</requires>
          </commandOrder></commandOrders>
          <components><component>
            <name>HBASE_MASTER</name>
            <category>MASTER</category>
            <minInstanceCount>1</minInstanceCount>
            <commandScript>
              <script>scripts/hbase_master.py</script>
            </commandScript>
          </component></components>
        </service></services></metainfo>
        Callouts: Application Info; Commands have dependencies; Publish a URI; Contains components; Commands are implemented as scripts
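
To make the "Commands have dependencies" callout on slide 20 concrete, here is a minimal sketch that reads a trimmed copy of the slide's metainfo with Python's standard `xml.etree.ElementTree` and checks whether a command's prerequisites are met. The helper names `parse_metainfo` and `can_run` are hypothetical, and treating `<requires>` as a comma-separated list is an assumption; this is not how Slider itself loads the file.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the metainfo shown on slide 20.
METAINFO = """
<metainfo><services><service>
  <name>HBASE</name>
  <version>0.96.0.2.1.1</version>
  <commandOrders><commandOrder>
    <command>HBASE_REGIONSERVER-START</command>
    <requires>HBASE_MASTER-STARTED</requires>
  </commandOrder></commandOrders>
  <components><component>
    <name>HBASE_MASTER</name>
    <category>MASTER</category>
    <minInstanceCount>1</minInstanceCount>
    <commandScript>
      <script>scripts/hbase_master.py</script>
    </commandScript>
  </component></components>
</service></services></metainfo>
"""


def parse_metainfo(xml_text):
    """Return (service name, {component: script}, {command: [required states]})."""
    service = ET.fromstring(xml_text).find("services/service")
    components = {
        c.findtext("name"): c.findtext("commandScript/script")
        for c in service.iterfind("components/component")
    }
    orders = {}
    for order in service.iterfind("commandOrders/commandOrder"):
        # Assumption: several prerequisites would be comma-separated.
        requires = (order.findtext("requires") or "").split(",")
        orders[order.findtext("command")] = [r.strip() for r in requires if r.strip()]
    return service.findtext("name"), components, orders


def can_run(command, reached, orders):
    """A command may run once every state it requires has been reached."""
    return all(req in reached for req in orders.get(command, []))
```

With the sample above, `HBASE_REGIONSERVER-START` is held back until `HBASE_MASTER-STARTED` appears in the set of reached states, while commands with no `commandOrder` entry are never blocked.
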
    21. Slider Application Resource Spec
        {
          "schema": "http://example.org/specification/v2.0.0",
          "metadata": { },
          "global": { },
          "components": {
            "HBASE_MASTER": {
              "yarn.role.priority": "1",
              "yarn.component.instances": "1",
              "yarn.memory": "1024",
              "yarn.vcores": "1"
            },
            "slider-appmaster": { },
            "HBASE_REGIONSERVER": {
              "yarn.role.priority": "2",
              "yarn.component.instances": "1"
            }
          }
        }
        Callouts: YARN resource requirements; Unique priorities
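
Slide 21 calls out that component priorities are unique. A minimal sketch of checking that property before submitting a resource spec follows; the function `duplicate_priorities` is a hypothetical helper, not part of Slider, and the embedded JSON is the slide's example with its quoting and trailing comma corrected so that it parses.

```python
import json
from collections import defaultdict

# The resource spec from slide 21, corrected so that it is valid JSON.
RESOURCES = json.loads("""
{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {},
  "global": {},
  "components": {
    "HBASE_MASTER": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "1",
      "yarn.memory": "1024",
      "yarn.vcores": "1"
    },
    "slider-appmaster": {},
    "HBASE_REGIONSERVER": {
      "yarn.role.priority": "2",
      "yarn.component.instances": "1"
    }
  }
}
""")


def duplicate_priorities(spec):
    """Map each priority used more than once to the components sharing it."""
    seen = defaultdict(list)
    for name, props in spec["components"].items():
        priority = props.get("yarn.role.priority")
        if priority is not None:  # e.g. slider-appmaster declares none here
            seen[priority].append(name)
    return {p: names for p, names in seen.items() if len(names) > 1}
```

An empty result means every component that declares a priority has its own value; any entry in the result names the components that collide.
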
    22. Slider Application Config Spec
        {
          "application.def": "/slider/hbase_v096.zip",
          "java_home": "/usr/jdk64/jdk1.7.0_45",
          "site.global.app_log_dir": "${AGENT_LOG_ROOT}/app/log",
          "site.global.app_pid_dir": "${AGENT_WORK_ROOT}/app/run",
          "site.global.hbase_master_heapsize": "1024m",
          "site.global.ganglia_server_host": "${NN_HOST}",
          "site.global.ganglia_server_port": "8667",
          "site.global.ganglia_server_id": "Application1",
          "site.hbase-site.hbase.tmp.dir": "${AGENT_WORK_ROOT}/work/app/tmp",
          "site.hbase-site.hfile.block.cache.size": "0.40",
          "site.hbase-site.hbase.security.authentication": "simple",
          "site.hbase-site.hbase.master.info.port": "${HBASE_MASTER.ALLOCATED_PORT}",
          "site.hbase-site.hbase.regionserver.port": "0",
          "site.hbase-site.hbase.zookeeper.quorum": "${ZK_HOST}",
          "site.core-site.fs.defaultFS": "${NN_URI}"
        }
        Callouts: Configurations needed by Slider; Named variables; Site variables for application; Named variables for cluster details; Allocate and advertise; Variables for the application scripts
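
The named variables on slide 22 (`${AGENT_LOG_ROOT}`, `${ZK_HOST}`, and so on) are placeholders that get filled in at deployment time. The sketch below shows one way such substitution could work; `resolve` is a hypothetical helper, the variable values are invented for the example, and leaving unknown placeholders untouched is an assumption, not documented Slider behavior. A regular expression is used rather than `string.Template` because names such as `HBASE_MASTER.ALLOCATED_PORT` contain a dot.

```python
import re

# Matches ${NAME}, where NAME may contain dots and hyphens.
_VAR = re.compile(r"\$\{([^}]+)\}")


def resolve(config, variables):
    """Replace ${NAME} in every value; unknown names are left as they are."""
    def substitute(match):
        return str(variables.get(match.group(1), match.group(0)))

    return {key: _VAR.sub(substitute, value) for key, value in config.items()}


# A few entries from slide 22, with made-up values for the variables.
CONFIG = {
    "site.global.app_log_dir": "${AGENT_LOG_ROOT}/app/log",
    "site.hbase-site.hbase.zookeeper.quorum": "${ZK_HOST}",
    "site.hbase-site.hbase.master.info.port": "${HBASE_MASTER.ALLOCATED_PORT}",
}
VARIABLES = {"AGENT_LOG_ROOT": "/var/log/agent", "ZK_HOST": "zk1.example.com"}

RESOLVED = resolve(CONFIG, VARIABLES)
```

Leaving an unresolved placeholder in place keeps the gap visible, which suits values like the allocated port that are only known after the container has started.
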
    23. [Image-only slide; no text recovered]
    24. Application Registry
        • A common problem (not specific to Slider)
            – https://issues.apache.org/jira/browse/YARN-913
        • Currently,
            – Apache Curator based
            – Register URLs pointing to actual data
            – AM doubles up as a webserver for published data (obvious issues!)
        • Plan
            – Registry should be stand-alone
            – Slider is a consumer as well as publisher
            – Slider focuses on declarative solution for Applications to publish data
            – Allows integration of Applications independent of how they are hosted
    25. Slider View for Apache Ambari
    26. Manage YARN Application
        • Goal is to have Slider integrate with any application management framework, e.g. Ambari
        • Apache Ambari is an open source framework for provisioning, managing and monitoring Apache Hadoop clusters
        • Ambari Views allow development of custom Ambari web interfaces
        • Slider App View is to deploy, monitor, and manage YARN apps using Slider, embedded in Ambari (currently, Tech Preview)
        [Diagram: Ambari Server with Ambari Web FE, View UI, View BE, and Slider CLI; three YARN Node Managers, each shown alongside HDFS]
    27. [Image-only slide; no text recovered]
    28. Slider “Bound” to a Mgmt. Infrastructure
        • Ambari imports app packages
        • Starts the AM
        • Interacts with AM to start containers
        • Agents register with Ambari
        • Ambari sends commands/receives results
        • YARN maintains ownership of containers
        • Ambari interacts with Registry
        [Diagram: Ambari and Slider App Package; YARN Resource Manager (“The RM”); two YARN Node Managers, each hosting an Agent and a component instance; App Master; Registry]
    29. In closing
    30. What’s Next in Slider
        • Lock-in Application Specification
        • Integration with the YARN Registry
        • Inter/Intra-Application Dependencies
        • Robust failure handling
        • Improved debugging
        • Security
        • And, more applications
    31. Everyone is welcome to contribute
        • Bring your favorite Applications to YARN
            – Create packages, give feedback, create patches, …
        • Useful Links
            – Website – http://slider.incubator.apache.org/
            – Dev Mailing Lists – dev@slider.incubator.apache.org
            – JIRA – https://issues.apache.org/jira/browse/SLIDER
        • Current and Upcoming Releases
            – Slider 0.30 (May)
            – Slider 0.40 (planned)
    32. Thank you. @smohanty
