Field Notes: YARN Meetup at LinkedIn

Notes from a variety of speakers at the YARN Meetup at LinkedIn on Sept 27, 2013

  1. YARN Meet Up, Sep 2013, @LinkedIn – by (lots of speakers)
  2. (Editable) Agenda
     • Hadoop 2.0 beta – Vinod Kumar Vavilapalli
       – YARN APIs stability
       – Existing applications
     • Application History Server – Mayank Bansal
     • RM reliability – Bikas Saha, Jian He, Karthik Kambatla
       – RM restartability
       – RM fail-over
     • Apache Tez – Hitesh Shah, Siddharth Seth
     • Apache Samza
     • Apache Giraph
     • Apache Helix
     • gohadoop: Go YARN application – Arun
     • Llama – Alejandro
  3. Hadoop 2.0 beta – Vinod Kumar Vavilapalli
  4. Hadoop 2.0 beta
     • Stable YARN APIs
     • MR binary compatibility
     • Testing with the whole stack
     • Ready for prime-time!
  5. YARN API stability
     • YARN-386
     • Broke APIs for one last time – hopefully :)
     • Exceptions/method-names
     • Security: tokens used irrespective of Kerberos
     • Read-only IDs, factories for creating records
     • Protocols renamed!
     • Client libraries
  6. Compatibility for existing applications
     • MAPREDUCE-5108
     • Old mapred APIs: binary compatible
     • New mapreduce APIs: source compatible
     • Pig, Hive, Oozie, etc. work with the latest versions; no need to rewrite your scripts.
  7. Application History Server – Mayank Bansal
  8. Contributions (YARN-321)
     • Mayank Bansal
     • Zhijie Shen
     • Devaraj K
     • Vinod Kumar Vavilapalli
  9. Why do we need the AHS?
     • The Job History Server is MR-specific
     • Jobs which are not MR
     • RM restart
     • Hard-coded limits on the number of jobs
     • Longer-running jobs
  10. AHS
     • Different process, or embedded in the RM
     • Contains generic application data
       – Application
       – Application attempts
       – Container
     • Client interfaces
       – Web UI
       – Client interface
       – Web services
  11. AHS history store
     • Pluggable history store
     • Storage format is PB (Protocol Buffers)
       – Backward compatible
       – Much easier to evolve the storage
     • HDFS implementation
  12. AHS store (architecture diagram): the RM writes application data to the store through a write interface when an app finishes; the AHS reads it back through a reading interface and serves it via the web app, web services (WS), and RPC.
  13. Remaining work
     • Security
     • Command-line interface
  14. Next phase
     • Application-specific data???
     • Long-running services
  15. DEMO
  16. RM reliability – Bikas Saha, Jian He, Karthik Kambatla
  17. RM reliability
     • Restartability
     • High availability
  18. RM restartability – Jian He, Bikas Saha
  19. Design and work plan
     • YARN-128 – RM restart
       – Creates the framework to store and load state information; forms the basis of failover and HA. Work is close to completion and being actively tested.
     • YARN-149 – RM HA
       – Adds an HA service to the RM in order to fail over between instances. Work in active progress.
     • YARN-556 – Work-preserving RM restart
       – Supports lossless recovery of cluster state when the RM restarts or fails over. Design proposal is up.
     • All the work is being done in a carefully planned manner directly on trunk; the code is always stable and ready.
  20. RM restart (YARN-128)
     • Current state of the implementation
     • Internal details
     • Impact on applications/frameworks
     • How to use
  21. RM restart
     • Supports ZooKeeper, HDFS, and the local FileSystem as the underlying store.
     • ClientRMProxy – all clients of the RM (NMs, AMs, clients) have the same retry behavior while the RM is down.
     • RM restart now works in a secure environment!
  22. Internal details
  23. RMStateStore
     • Two types of state info:
       – Application-related state info (stored asynchronously)
         • ApplicationState – ApplicationSubmissionContext (AM ContainerLaunchContext, queue, etc.)
         • ApplicationAttemptState – AM container, AMRMToken, ClientTokenMasterKey, etc.
       – RMDelegationTokenSecretManager state (not application-specific; stored synchronously)
         • RMDelegationToken
         • RMDelegationToken master key
         • RMDelegationToken sequence number
  24. RM recovery workflow
     • Save the app on app submission
       – User-provided credentials (HDFSDelegationToken)
     • Save the attempt on AM attempt launch
       – AMRMToken, ClientToken
     • RMDelegationTokenSecretManager
       – Save the token and sequence number on token generation
       – Save the master key when it rolls
     • RM crashes….
  25. What happens after the RM restarts?
     • Instruct the old AM to shut down
     • Load the ApplicationSubmissionContext
       – Submit the application
     • Load the earlier attempts
       – Load the attempt credentials (AMRMToken, ClientToken)
     • Launch a new attempt
  26. Impact on applications/frameworks
  27. Consistency between downstream consumers of the AM and YARN
     • The AM should notify its consumers that the job is done only after YARN reports it is done (see the sketch after this slide)
       – FinishApplicationMasterResponse.getIsUnregistered()
       – The user is expected to retry this API until it returns true.
       – Similarly for kill-application (fix in progress)
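     A minimal sketch of that retry loop, assuming an already-created ApplicationMasterProtocol proxy (here called amRmProtocol) and a SUCCEEDED final status; this is an illustration of the pattern described on the slide, not code from the talk:

        import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
        import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
        import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;

        public class UnregisterUntilAcked {
          // Keep calling finishApplicationMaster until the RM reports the
          // unregister as recorded, so a restarted RM cannot "lose" it.
          static void unregister(ApplicationMasterProtocol amRmProtocol) throws Exception {
            FinishApplicationMasterRequest request = FinishApplicationMasterRequest.newInstance(
                FinalApplicationStatus.SUCCEEDED, "job complete", /* trackingUrl */ null);
            while (!amRmProtocol.finishApplicationMaster(request).getIsUnregistered()) {
              Thread.sleep(1000); // back off before retrying the unregister call
            }
            // Only now is it safe to tell downstream consumers that the job is done.
          }
        }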
  28. For the MR AM
     • Races:
       – JobClient: the AM crashes after the JobClient sees FINISHED but before the RM removes the app when the app finishes
         • Bug: relaunching a FINISHED application (succeeded, failed, killed)
       – HistoryServer: history files are flushed before the RM removes the app when the app finishes
  29. How to use?
  30. How to use: 3 steps (see the sketch after this slide)
     1. Enable RM restart
        – yarn.resourcemanager.recovery.enabled
     2. Choose the underlying store you want (HDFS, ZooKeeper, local FileSystem)
        – yarn.resourcemanager.store.class
        – FileSystemRMStateStore / ZKRMStateStore
     3. Configure the address of the store
        – yarn.resourcemanager.fs.state-store.uri
        – hdfs://localhost:9000/rmstore
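     As a rough illustration of the three steps, a sketch that sets the properties named above programmatically on a YarnConfiguration; in a real deployment they would normally live in yarn-site.xml. The store class and URI are the examples from the slide:

        import org.apache.hadoop.yarn.conf.YarnConfiguration;

        public class RmRestartConfigSketch {
          public static void main(String[] args) {
            YarnConfiguration conf = new YarnConfiguration();
            // Step 1: enable RM restart/recovery.
            conf.setBoolean("yarn.resourcemanager.recovery.enabled", true);
            // Step 2: pick the underlying state store (HDFS-backed here).
            conf.set("yarn.resourcemanager.store.class",
                "org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore");
            // Step 3: point the store at an HDFS location (example URI from the slide).
            conf.set("yarn.resourcemanager.fs.state-store.uri", "hdfs://localhost:9000/rmstore");
          }
        }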
  31. YARN fail-over – Karthik Kambatla
  32. RM HA (YARN-149)
     ● Architecture
     ● Failover / Admin
     ● Fencing
     ● Config changes
     ● FailoverProxyProvider
  33. Architecture
     ● Active / Standby
       ○ The standby is powered up, but doesn’t have any state
     ● Restructure RM services (YARN-1098)
       ○ Always-on services
       ○ Active services (e.g. Client <-> RM, AM <-> RM)
     ● RMHAService (YARN-1027)
  34. Failover / Admin
     ● CLI: yarn rmhaadmin
     ● Manual failover
     ● Automatic failover (YARN-1177)
       ○ Use ZKFC
       ○ Start it as an RM service instead of a separate daemon
       ○ Revisit and strip out unnecessary parts
  35. Fencing (YARN-1222)
     ● Implicit fencing through the ZK RM state store
     ● ACL-based fencing on store.load() during the transition to active (see the sketch after this slide)
       ○ Shared read-write-admin access to the store
       ○ Claim exclusive create-delete access
       ○ All store operations create and delete a fencing node
       ○ The other RM can’t write to the store anymore
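     A purely illustrative sketch of that ACL split, written against the plain ZooKeeper client API (the root path and Ids are hypothetical, not the RM’s actual internals): the shared Id keeps read-write-admin access, while only the RM transitioning to active holds create-delete access, so the other RM can no longer complete store operations that must create and delete the fencing node.

        import java.util.Arrays;
        import java.util.List;
        import org.apache.zookeeper.ZooDefs.Perms;
        import org.apache.zookeeper.ZooKeeper;
        import org.apache.zookeeper.data.ACL;
        import org.apache.zookeeper.data.Id;

        public class FencingAclSketch {
          static void claimStore(ZooKeeper zk, String rootPath,
                                 Id sharedId, Id activeRmId) throws Exception {
            List<ACL> acls = Arrays.asList(
                // Both RMs keep read-write-admin access to the store...
                new ACL(Perms.READ | Perms.WRITE | Perms.ADMIN, sharedId),
                // ...but only the RM becoming active gets create-delete access.
                new ACL(Perms.CREATE | Perms.DELETE, activeRmId));
            zk.setACL(rootPath, acls, -1); // -1: ignore the ACL version check
          }
        }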
  36. Config changes (YARN-1232, YARN-986)
     1. <name>yarn.resourcemanager.address</name> <value>clusterid</value>
     2. <name>yarn.resourcemanager.ha.nodes.clusterid</name> <value>rm1,rm2</value>
     3. <name>yarn.resourcemanager.ha.id</name> <value>rm1</value>
     4. <name>yarn.resourcemanager.address.clusterid.rm1</name> <value>host1:23140</value>
     5. <name>yarn.resourcemanager.address.clusterid.rm2</name> <value>host2:23140</value>
  37. FailoverProxyProvider
     ● ConfiguredFailoverProxyProvider (YARN-1028)
       ○ Use alternate RMs from the config during retry
       ○ ClientRMProxy
         ■ addresses client-based RPC addresses
       ○ ServerRMProxy
         ■ addresses server-based RPC addresses
  38. Apache Tez – Hitesh Shah
  39. What is Tez?
     • A data-processing framework that can execute a complex DAG of tasks.
  40. Tez DAG and tasks (diagram)
  41. Tez as a YARN application
     • No deployment of Tez jars required on all nodes in the cluster (see the sketch after this slide)
       – Everything is pushed from either the client or HDFS to the cluster using YARN’s LocalResource functionality
       – Ability to run multiple different versions
     • Tez sessions
       – A single AM can handle multiple DAGs (“jobs”)
       – Amortize and hide platform latency
     • Exciting new features
       – Support for complex DAGs – broadcast joins (Hive map joins)
       – Support for lower latency – container reuse and shared objects
       – Support for dynamic concurrency control – determine reduce parallelism at runtime
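     A rough, generic illustration of the LocalResource mechanism mentioned above (not Tez’s actual submission code; the jar path and resource name are made up): a jar already sitting in HDFS is registered as a LocalResource so YARN downloads it onto whichever node launches the container.

        import java.util.Collections;
        import java.util.Map;
        import org.apache.hadoop.fs.FileStatus;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.yarn.api.records.LocalResource;
        import org.apache.hadoop.yarn.api.records.LocalResourceType;
        import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
        import org.apache.hadoop.yarn.util.ConverterUtils;

        public class LocalResourceSketch {
          static Map<String, LocalResource> jarAsLocalResource(FileSystem fs) throws Exception {
            Path jar = new Path("/apps/tez/tez-dag.jar"); // hypothetical HDFS location
            FileStatus status = fs.getFileStatus(jar);
            LocalResource resource = LocalResource.newInstance(
                ConverterUtils.getYarnUrlFromPath(jar),   // where to fetch it from
                LocalResourceType.FILE,
                LocalResourceVisibility.PUBLIC,
                status.getLen(),
                status.getModificationTime());
            // This map would go into the ContainerLaunchContext of the AM or task containers.
            return Collections.singletonMap("tez-dag.jar", resource);
          }
        }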
  42. Tez: community
     • Early adopters and contributors welcome
       – Adopters to drive more scenarios; contributors to make them happen.
       – The Hive and Pig communities are on board and making great progress: HIVE-4660 and PIG-3446 for Hive-on-Tez and Pig-on-Tez
     • Meetup group – please sign up to know more
       – http://www.meetup.com/Apache-Tez-User-Group
     • Useful links:
       – Website: http://tez.incubator.apache.org/
       – Code: http://git-wip-us.apache.org/repos/asf/incubator-tez.git
       – JIRA: https://issues.apache.org/jira/browse/TEZ
       – Mailing lists: dev-subscribe@tez.incubator.apache.org, user-subscribe@tez.incubator.apache.org
       – https://issues.apache.org/jira/browse/TEZ-65
  43. © Hortonworks Inc. 2011
  44. YARN usage @LinkedIn
     • Apache Tez
     • Apache Samza
     • Apache Giraph
     • Apache Helix
  45. YARN Go demo – Arun C Murthy
     • https://github.com/hortonworks/gohadoop
  46. Llama – Alejandro Abdelnur
