YARN Ready - Integrating to YARN using Slider Webinar


YARN Ready webinar series: this is the second in a series of webinars showing developers how to integrate with YARN using Apache Slider.

  • Register for Office Hours:
  • Recording "YARN Native": https://hortonworks.webex.com/hortonworks/lsr.php?RCID=7bad7795d07e408ccbe7412e9950603d
  • Integrating with Tez, Aug 21: https://hortonworks.webex.com/hortonworks/onstage/g.php?t=a&d=622317625

    1. © Hortonworks Inc. 2014. YARN Ready – Apache Slider: Provisioning, Managing, and Monitoring YARN Applications. Sumit Mohanty @smohanty, Steve Loughran @steveloughran (@hortonworks)
    2. Agenda • Long running applications on YARN • Introduction to Slider • Writing a Slider Application • Key Slider Features • Conclusion • Q/A
    3. Applications on YARN
    4. YARN runs code across the cluster • Servers run YARN Node Managers • NMs heartbeat to the Resource Manager ("the RM") • The RM schedules work over the cluster and allocates containers to apps • NMs start containers and report container health (diagram: the RM and several Node Managers, each co-located with HDFS)
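The allocation loop on this slide can be illustrated with a small, self-contained sketch: an AppMaster asks the RM for containers, and the RM places them on Node Managers with spare capacity. The class names, the slot-based capacity model, and the fill-first placement policy are all invented for illustration; YARN's real scheduler is far more sophisticated.

```python
class NodeManager:
    def __init__(self, host, capacity):
        self.host = host
        self.free = capacity        # free container slots (toy model)

class ResourceManager:
    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, n_containers):
        """Grant up to n_containers, placing them on NMs with free slots."""
        granted = []
        for nm in self.nodes:
            while nm.free > 0 and len(granted) < n_containers:
                nm.free -= 1
                granted.append(nm.host)
        return granted

rm = ResourceManager([NodeManager("nm1", 2), NodeManager("nm2", 2)])
print(rm.allocate(3))   # ['nm1', 'nm1', 'nm2']
```

In the real API the AM expresses requests through the RM's allocate protocol and receives container grants asynchronously; the toy above only captures the "requests in, placements out" shape of the interaction.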
    5. Client creates App Master (diagram: the client asks the RM to launch an Application Master in a container on a Node Manager)
    6. AM asks for containers (diagram: the AM requests containers from the RM, which places them on Node Managers)
    7. YARN notifies AM of failures (diagram: a failed container is reported back to the Application Master)
    8. Long Running Applications
    9. Management • Application instance must be managed: install/configure/start, restart, reconfigure/rolling update, stop/graceful stop, status, activate/deactivate/rebalance • Upgrade: long running applications need to provide upgrade support, preferably rolling upgrade
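The lifecycle commands listed above map naturally onto one method per command, which is roughly the shape a managed component takes in a Slider app package script. This is a self-contained toy with no Slider imports; the method names follow the slide, and the state strings are invented.

```python
class ComponentScript:
    """Toy lifecycle handler: one method per management command."""

    def __init__(self):
        self.state = "uninstalled"

    def install(self):   self.state = "installed"
    def configure(self): self.state = "configured"
    def start(self):     self.state = "started"
    def stop(self):      self.state = "stopped"
    def status(self):    return self.state

    def restart(self):
        # restart = graceful stop followed by start
        self.stop()
        self.start()

svc = ComponentScript()
for cmd in ("install", "configure", "start"):
    getattr(svc, cmd)()     # dispatch commands by name, as an agent would
print(svc.status())          # started
```

Dispatching by command name (`getattr`) mirrors how an agent can map management requests from the AppMaster onto script entry points without hard-coding each one.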
    10. Registration and Discovery • Application must declare itself: URLs, host/port, config (client config) • Application must be discoverable: registry, name-based lookups, regularly updated • Client support: callback if "data" changes (thick clients); configurable gateway (thin clients)
    11. Monitoring • Metrics: instantaneous metrics (JMX), time-series metrics (Ganglia); configure Ganglia or other metrics stores • Alerts: based on JMX/port scan/container status; configure Nagios or another alerting mechanism
    12. Logs and Events • Logs: continuous log gathering, single view for logs across all containers • Lifecycle events: integration with the Application Timeline Server
    13. In addition to … • Security: configured for security, token renewal • High Availability: runs on a highly available cluster (NN, RM HA) and is itself highly available (multi-master) • Packaging • Configurability • …
    14. Apache Slider
    15. Why? • Many mature applications exist • Full YARN integration takes effort • Running under YARN delivers access to all the data in HDFS, and the CPU power alongside it • As the Hadoop stack evolves, there is more to integrate with • Management tools, e.g. Ambari, exist to monitor applications in-cluster
    16. Slider is an incubating Apache project with one goal: make it possible and easy to deploy and manage existing applications on a YARN cluster. Status: currently in Tech Preview; GA with the next HDP release, tentatively November
    17. Slider view of an Application • An application is a set of components • A component is a daemon/launched executable, plus configuration, scripts, data files, etc. • A component may have one or more instances • Component instances are managed, and by extension so is the app instance • Example: an HBase application has 3 components – HBase Master, HBase RegionServer, HBase REST service
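Slider's application model above (an app is a set of components, each with some number of managed instances) can be sketched as plain data, using the HBase example from the slide. The dictionary layout here is illustrative, not Slider's actual package format.

```python
# Toy model of a Slider application: named components, each with a
# desired instance count. Structure and field names are invented.
app = {
    "name": "hbase",
    "components": {
        "HBASE_MASTER":       {"instances": 1},
        "HBASE_REGIONSERVER": {"instances": 3},
        "HBASE_REST":         {"instances": 1},
    },
}

def total_containers(app):
    """Containers needed = sum of desired instances over all components."""
    return sum(c["instances"] for c in app["components"].values())

print(total_containers(app))   # 5
```

Because the desired instance counts live in data rather than code, "flexing" a running app (growing or shrinking a component) reduces to editing a number and letting the manager reconcile reality against it.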
    18. YARN Containers with Slider (diagram: the Slider Client talks to the RM; the Slider AppMaster runs in one container; each application component runs in its own container alongside a Slider Agent; HDFS and a Node Manager on every node)
    19. Application by Slider • Similar to any YARN application: 1. CLI starts an instance of the AM 2. AM requests containers 3. Containers activate with an Agent 4. Agent gets the application definition 5. Agent registers with the AM 6. AM issues commands 7. Agent reports back status, configuration, etc. 8. AM publishes endpoints and configurations (diagram: Slider app package, Slider CLI, Application Registry, App Master/Agent Provider)
    20. Slider AppMaster/Agent/Client • AppMaster: common YARN interactions, common *-client interactions, publishing needs • Agent: configure and start, re-configure and restart, heartbeats and failure detection, port allocations and publishing, custom commands if any (e.g. graceful stop) • Client: app life cycle commands (flex, status, …)
    21. Memcached on YARN – a Sample Slider App
    22. Other Application Packages • Reference doc for the Memcached application: http://slider.incubator.apache.org/docs/slider_specs/hello_world_slider_app.html • The Slider github repo has other app packages: Accumulo, HBase, Storm, Memcached-windows
    23. Other Capabilities
    24. App Packaging Capabilities • Dynamic port allocation and sharing • Inter-component dependency: specify the start order of components • Exports: construct arbitrary name-value pairs, e.g. URLs (org.apache.slider.monitor: http://${HBASE_MASTER_HOST}:${site.hbase-site.hbase.master.info.port}/master-status) • Default HDFS and ZK isolation
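The exports feature above fills `${...}` tokens in a template with values known at runtime. A minimal sketch of that substitution, using the HBase URL from the slide: the lookup-table keys and the flat-dictionary resolution are assumptions for illustration; Slider's real variable resolution rules are richer.

```python
import re

def resolve(template, values):
    """Replace every ${NAME} token with its value from a flat table."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: values[m.group(1)], template)

# Hypothetical runtime values for the slide's example export.
values = {
    "HBASE_MASTER_HOST": "host1.example.com",
    "site.hbase-site.hbase.master.info.port": "16010",
}

url = resolve(
    "http://${HBASE_MASTER_HOST}:${site.hbase-site.hbase.master.info.port}"
    "/master-status",
    values,
)
print(url)   # http://host1.example.com:16010/master-status
```

The payoff is that an app package can publish a working master-status URL without knowing at packaging time which host or port the component will land on.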
    25. Application Registry • A common problem (not specific to Slider): https://issues.apache.org/jira/browse/YARN-913 • Currently: Apache Curator based; registers URLs pointing to actual data; the AM doubles up as a webserver for published data • Plan: the registry should be stand-alone; Slider is a consumer as well as a publisher; Slider focuses on a declarative solution for applications to publish data; this allows integration of applications independent of how they are hosted
    26. Plan: YARN Service Registry
        # YARN-wide registry in ZooKeeper
        # Services listed by (user, service class, name)
        /yarnRegistry/users/sumit/slider/cluster1
        # Ephemeral liveness node
        /yarnRegistry/users/sumit/slider/cluster1/live
        # Service entry lists bindings: URLs, IPC (host, port), ZK
        # Individual components have their own (ephemeral) entries and endpoints
        /yarnRegistry/users/sumit/slider/cluster1/components/appmaster
        # ZK read/write API, REST read-only API
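The (user, service class, name) layout above can be sketched as a simple path builder. The `ROOT` constant and helper function are illustrative only; the planned registry is ZooKeeper-backed, not a string formatter.

```python
# Toy builder for registry paths following the slide's layout.
ROOT = "/yarnRegistry/users"

def service_path(user, service_class, name, *extra):
    """Compose /yarnRegistry/users/<user>/<class>/<name>[/<extra>...]."""
    return "/".join((ROOT, user, service_class, name) + extra)

print(service_path("sumit", "slider", "cluster1"))
# /yarnRegistry/users/sumit/slider/cluster1
print(service_path("sumit", "slider", "cluster1", "components", "appmaster"))
# /yarnRegistry/users/sumit/slider/cluster1/components/appmaster
```

Keeping user, service class, and instance name in the path is what makes name-based lookups work: a client that knows only those three strings can find the service's bindings without any out-of-band configuration.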
    27. Security • Applications validated to work in a Kerberos-secured cluster: secure cluster created and keytabs available to application components; security parameters specified in application configuration; user obtains a TGT (kinit) prior to Slider application creation; e.g. HBase 0.98.4 • Agent-AM SSL communication: one-way by default, two-way can be enabled • Work initiated on ticket renewal for long running applications (YARN, HDFS)
    28. Failure Handling • Application component failure: component instance restarted • AppMaster failure: YARN restarts the AppMaster; Slider reconstructs state and the registry; app lifecycle commands are temporarily unavailable • NodeManager failure: app remains unaffected • ResourceManager/NodeManager failures with HA: app remains unaffected
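The component-failure policy above is a reconciliation loop: when a container is reported failed, the manager requests replacements until the desired instance count is restored. A toy version, with invented names and none of the real protocol:

```python
class AppMaster:
    """Toy reconciler: keep running instances at the desired count."""

    def __init__(self, desired):
        self.desired = desired
        self.running = desired

    def on_container_failed(self):
        """Handle one failure report; return replacements requested."""
        self.running -= 1
        replacements = self.desired - self.running
        self.running += replacements   # pretend the RM granted them
        return replacements

am = AppMaster(desired=3)
print(am.on_container_failed())   # 1
print(am.running)                 # 3
```

The same desired-vs-actual comparison also covers AppMaster restart: after YARN relaunches the AM, reconstructing state and re-running the reconciliation brings the app back to its declared shape.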
    29. Windows and Linux Support • Feature set parity on both platforms • Similar packaging constructs: typically only the path spec needs to change • Both Linux and Windows Server work as a platform for the client (hosting the Slider client) and the cluster (hosting the Hadoop cluster)
    30. Join in: Bring your favorite Applications to YARN
    31. Slider-ifying a new application 1. Grab Slider: http://slider.incubator.apache.org/downloads/ 2. Look at the App Package docs: http://slider.incubator.apache.org/docs/slider_specs/ 3. Look at the source code examples under app-packages 4. Start with memcached/memcached-windows
    32. YARN API vs. Slider • Native YARN app: your own AppMaster is in charge of container placement and fault handling; you can implement an IPC API for callers to manipulate the application; the AppMaster can send out event notifications. Ideal for large-scale distributed algorithms with specific placement and scheduling needs • Slider app: the Slider AppMaster handles YARN integration with best-effort placement history and fault handling (recreating component instances); a simple API/Web UI for cluster manipulation and endpoint listing; lots of failure and security testing; you only need to write (and test) the app package. Ideal for long-lived applications where failures can be addressed by restarting elsewhere, with flexing decisions made by admins
    33. Everyone is welcome • Useful links: Website http://slider.incubator.apache.org/ – Dev mailing list dev@slider.incubator.apache.org – JIRA https://issues.apache.org/jira/browse/SLIDER • Current and upcoming releases: Slider 0.30 (May), Slider 0.40 (July), Slider 0.50 (planned)
    34. Q/A http://slider.incubator.apache.org/
    35. Next Steps
    36. Next Steps 1. Review YARN Slider resources 2. Review the webinar recording or attend the next webinar 3. Attend Office Hours 4. Sign up for a 2-day class 5. Attend the next YARN webinar
    37. Resources • Set up an HDP 2.1 environment: leverage the Sandbox, Hortonworks.com/Sandbox • Get started with YARN: http://hortonworks.com/get-started/YARN • Technical Preview: http://hortonworks.com/blog/apache-slider-technical-preview-now-available/ • Apache: http://slider.incubator.apache.org/ • Dev mailing list: dev@slider.incubator.apache.org • JIRA: https://issues.apache.org/jira/browse/SLIDER
    38. Hortonworks Office Hours • YARN Office Hours: dial in and chat with YARN experts • Next Office Hour: Thursday August 14 @ 10-11am PDT. Register: https://hortonworks.webex.com/hortonworks/onstage/g.php?t=a&d=628190636 • Office Hours are planned for September 11th and October 9th @ 10am PT (2nd Thursdays); invitations will go out to those who attended or reviewed YARN webinars
    39. And from Hortonworks University • Course: Developing Custom YARN Applications • Format: online • Duration: 2 days • When: September, date TBD • Cost: no charge to Hortonworks Partners • Space: very limited • Interested? Please contact Lisa
    40. Next in the Series! Join us for the full series of YARN Ready webinars: • YARN Native, July 24 @ 9am PT (recording link) • Tez, August 21 @ 9am PT (registration link) • Additional webinar topics are being added; watch the blog or visit Hortonworks.com/webinars: September: Ambari and Scalding; October: Spark • http://hortonworks.com/webinars
    41. Thank you.