2013-11-19 Hoya status

Speaker notes:
  • "hoya" is actually the Hoya AM: it lets us define the memory and requirements of that node in the cluster
  • JMX port binding & publishing of ports; web port; rolling restart of NM/RM; AM retry logic
    – Testing: a chaos monkey for YARN
    – Logging: running Samza without HDFS (to avoid the ops cost & latency); they are only running Samza clusters with YARN. YARN puts logs into HDFS by default, so without HDFS you are stuffed. Rollover of stdout & stderr: have the NM implement the rolling. Samza could log from log4j by appending to Kafka, but then needs a way to pull the logs out and view them; adds 15 min
    – YARN can use http: URLs to pull in local resources
    – Samza can handle outages of a few minutes, but other services need rolling restart/upgrade
    – No classic scheduling: assume the full code can run, or buy more hardware; the RM should tell the AM when a request can't be satisfied
  • Co-existence

Transcript:

    1. Hoya: HBase on YARN. Steve Loughran & Devaraj Das, {stevel, ddas} at hortonworks.com, @steveloughran, @ddraj. November 2013. © Hortonworks Inc. 2013
    2. Hadoop as Next-Gen Platform. Hadoop 1.0 was a single-use system for batch apps: MapReduce (data processing and cluster resource management) on HDFS (redundant, reliable storage). Hadoop 2.0 is a multi-purpose platform for batch, interactive, online, streaming, and more: MapReduce and other engines (data processing) on YARN (cluster resource management) on HDFS2 (redundant, reliable storage).
    3. YARN: Taking Hadoop Beyond Batch. Store ALL DATA in one place, interact with that data in MULTIPLE WAYS, with predictable performance and quality of service. Applications run natively IN Hadoop: batch (MapReduce), interactive (Tez), in-memory (Spark), streaming (Storm, S4, Samza, …), graph (Giraph), HPC MPI (OpenMPI), other (Search, Weave, …), all on YARN (cluster resource management) over HDFS2 (redundant, reliable storage).
    4. And HBase? The same stack is shown twice: once with HBase sitting beside YARN, directly on HDFS2, and once with HBase running as just another YARN-managed application alongside the batch, interactive, in-memory, streaming, graph, HPC MPI and other workloads.
    5. (Image-only slide.)
    6. Hoya: On-demand HBase clusters
       1. Small HBase cluster in large YARN cluster
       2. Dynamic HBase clusters
       3. Self-healing HBase cluster
       4. Elastic HBase clusters
       5. Transient/intermittent clusters for workflows
       6. Custom versions & configurations
       7. More efficient utilization/sharing of cluster
    7. Goal: No code changes in HBase
       • Today: none. But we'd like:
       • ZK reporting of web UI ports
       • A way to get from failed RS to YARN container (configurable ID is enough)
    8. Hoya – the tool
       • Hoya (Hbase On YArn) – a Java tool, completely CLI driven
       • Input: cluster description as JSON – specification of cluster: node options, ZK params; configuration generated; entire state persisted
       • Actions: create, freeze/thaw, flex, exists <cluster>
       • Can change cluster state later – add/remove nodes, started/stopped states
    9. YARN manages the cluster
       • Servers run YARN Node Managers
       • NMs heartbeat to the Resource Manager
       • RM schedules work over the cluster
       • RM allocates containers to apps
       • NMs start containers
       • NMs report container health
       (Diagram: one YARN Resource Manager and several YARN Node Managers, each co-located with HDFS.)
    10. Hoya Client creates App Master. (Diagram: the Hoya Client submits the application to the YARN Resource Manager, and the Hoya AM is started in a container on one of the Node Managers.)
    11. AM deploys HBase with YARN. (Diagram: the Hoya AM obtains containers from the Resource Manager and starts an HBase Master and HBase Region Servers in them across the Node Managers; a container-request sketch appears after the transcript.)
    12. HBase & clients bind via ZooKeeper. (Diagram: the HBase Client reaches the HBase Master and Region Servers through ZooKeeper, bypassing YARN entirely; a client-binding sketch appears after the transcript.)
    13. YARN notifies AM of failures. (Diagram: a Region Server container is lost, the Resource Manager reports this to the Hoya AM, and a replacement Region Server is started in a new container.)
    14. HOYA - cool bits
       • Cluster specification stored as JSON in HDFS
       • Conf dir cached, dynamically patched before pushing up as local resources for master & region servers (a local-resource sketch appears after the transcript)
       • HBase .tar file stored in HDFS – clusters can use the same/different HBase versions
       • Handling of cluster flexing is the same code as unplanned container loss
       • No Hoya code on region servers
    15. HOYA - AM RPC API (an interface sketch appears after the transcript)
       // change cluster role counts
       flexCluster(ClusterSpec)
       // get current cluster state
       getJSONClusterStatus() : ClusterSpec
       listNodeUUIDsByRole(role): UUID[]
       getNode(UUID): RoleInfo
       getClusterNodes(UUID[]): RoleInfo[]
       stopCluster()
    16. Flexing/failure handling is same code (a sketch of the review step appears after the transcript)
       boolean flexCluster(ClusterDescription updated) {
         providerService.validateClusterSpec(updated);
         appState.updateClusterSpec(updated);
         return reviewRequestAndReleaseNodes();
       }

       void onContainersCompleted(List<ContainerStatus> completed) {
         for (ContainerStatus status : completed) {
           appState.onCompletedNode(status);
         }
         reviewRequestAndReleaseNodes();
       }
    17. Cluster Specification: persistent & wire
       {
         "version" : "1.0",
         "name" : "TestLiveTwoNodeRegionService",
         "type" : "hbase",
         "options" : {
           "zookeeper.path" : "/yarnapps_hoya_stevel_live2nodes",
           "cluster.application.image.path" : "hdfs://bin/hbase-0.96.tar.gz",
           "zookeeper.hosts" : "127.0.0.1"
         },
         "roles" : {
           "worker" : { "role.instances" : "2" },
           "hoya" : { "role.instances" : "1" },
           "master" : { "role.instances" : "1" }
         },
         ...
       }
    18. Role Specifications
       "roles" : {
         "worker" : {
           "yarn.memory" : "256",
           "role.instances" : "5",
           "jvm.heapsize" : "256",
           "yarn.vcores" : "1",
           "app.infoport" : "0",
           "env.MALLOC_ARENA_MAX" : "4"
         },
         "master" : {
           "yarn.memory" : "128",
           "role.instances" : "1",
           "jvm.heapsize" : "128",
           "yarn.vcores" : "1",
           "app.infoport" : "8585"
         }
       }
    19. Current status
       • HBase clusters on-demand
       • Accumulo clusters (5+ roles, different "provider")
       • Cluster freeze, thaw, flex, destroy
       • Location of role instances tracked & persisted – for placement close to data after failure, thaw
       • Secure cluster support
    20. Ongoing
       • Multiple roles: worker, master, monitor: --role worker --roleopts worker yarn.vcores 2
       • Multiple providers: HBase + others – client side: preflight, configuration patching; server side: starting roles, liveness
       • Liveness probes: HTTP GET, RPC port, RPC op? (a probe sketch appears after the transcript)
       • What do we need in YARN for production?
    21. Ongoing
       • Better failure handling, blacklisting
       • Liveness probes: HTTP GET, RPC port, RPC op?
       • Testing: functional, scale & load
       • What do we need in Hoya for production?
       • What do we need in YARN for production?
    22. Requirements of an App: MUST
       • Install from tarball; run as normal user
       • Deploy/start without human intervention
       • Pre-configurable, static instance config data
       • Support dynamic discovery/binding of peers
       • Co-existence with other app instances in cluster/nodes
       • Handle co-located role instances
       • Persist data to HDFS
       • Support 'kill' as a shutdown option
       • Handle failed role instances
       • Support role instances moving after failure
    23. Requirements of an App: SHOULD
       • Be configurable by Hadoop XML files
       • Publish dynamically assigned web UI & RPC ports
       • Support cluster flexing up/down
       • Support API to determine role instance status
       • Make it possible to determine role instance ID from app
       • Support simple remote liveness probes
    24. YARN-896: long-lived services (a registry sketch appears after the transcript)
       1. Container reconnect on AM restart
       2. YARN token renewal on long-lived apps
       3. Containers: signalling, >1 process sequence
       4. AM/RM managed gang scheduling
       5. Anti-affinity hint in container requests
       6. Service Registry - ZK?
       7. Logging
    25. SLAs & co-existence with MapReduce
       1. Make IO bandwidth/IOPS a resource used in scheduling & limits
       2. Need to monitor what's going on w.r.t. IO & net load from containers → apps → queues
       3. Dynamic adaptation of cgroup HDD, net, RAM limits
       4. Could we throttle MR job file & HDFS IO bandwidth?
    26. Hoya needs a home! https://github.com/hortonworks/hoya
    27. Questions? hortonworks.com
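
The container-request sketch referenced from slide 11: a minimal illustration, not Hoya's actual code, of how a YARN application master of that era (Hadoop 2.2 AMRMClient API) could ask the Resource Manager for region-server containers. The worker count and the 256 MB / 1 vcore capability mirror the example role spec on slide 18; launching HBase in the granted containers (via NMClient) is omitted.

    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class WorkerRequestSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new YarnConfiguration();

        // Register this process as the application master with the RM.
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(conf);
        rmClient.start();
        rmClient.registerApplicationMaster("", 0, "");

        // Ask for one container per desired region server ("worker" role);
        // 256 MB / 1 vcore mirrors the example role spec on slide 18.
        Resource capability = Resource.newInstance(256, 1);
        Priority priority = Priority.newInstance(1);
        int desiredWorkers = 2;
        for (int i = 0; i < desiredWorkers; i++) {
          rmClient.addContainerRequest(
              new ContainerRequest(capability, null, null, priority));
        }

        // Heartbeat until the RM has granted the containers; each grant is a
        // node plus resources where the AM could now launch a region server.
        int granted = 0;
        while (granted < desiredWorkers) {
          AllocateResponse response = rmClient.allocate(0.0f);
          List<Container> allocated = response.getAllocatedContainers();
          granted += allocated.size();
          Thread.sleep(1000);
        }
      }
    }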
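
The client-binding sketch referenced from slide 12: an HBase 0.96-era client reaching a Hoya-created cluster purely through ZooKeeper, using the quorum host and znode path from the cluster specification on slide 17. The table name, row key, and the assumption that the cluster's znode path is handed to the client as "zookeeper.znode.parent" are illustrative only.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HoyaClusterClientSketch {
      public static void main(String[] args) throws Exception {
        // The client never talks to YARN: it only needs the ZooKeeper quorum
        // and the znode path under which the Hoya-launched HBase cluster
        // registered itself (values from the cluster spec on slide 17).
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "127.0.0.1");
        conf.set("zookeeper.znode.parent", "/yarnapps_hoya_stevel_live2nodes");

        HConnection connection = HConnectionManager.createConnection(conf);
        HTableInterface table = connection.getTable("mytable");  // hypothetical table
        try {
          Result row = table.get(new Get(Bytes.toBytes("row-1"))); // hypothetical row
          System.out.println("cells: " + row.size());
        } finally {
          table.close();
          connection.close();
        }
      }
    }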
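
A sketch for slide 14's "pushed up as local resources" point: how an HDFS file such as the HBase tarball from slide 17 (hdfs://bin/hbase-0.96.tar.gz) or a patched conf dir that has been copied up can be turned into a YARN LocalResource, which the Node Manager downloads (and, for archives, unpacks) into the container before the role's process starts. The method and link names are hypothetical; only the YARN records calls are standard API.

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;

    public class LocalResourceSketch {

      // Turn an HDFS path into a LocalResource entry for a container launch
      // context. ARCHIVE resources are unpacked by the NM into the
      // container's working directory under the given link name.
      static Map<String, LocalResource> tarballResource(Configuration conf,
          String hdfsTarball, String linkName) throws Exception {
        Path path = new Path(hdfsTarball);
        FileSystem fs = path.getFileSystem(conf);
        FileStatus status = fs.getFileStatus(path);
        LocalResource resource = LocalResource.newInstance(
            ConverterUtils.getYarnUrlFromPath(path),
            LocalResourceType.ARCHIVE,              // unpack the .tar.gz
            LocalResourceVisibility.APPLICATION,    // visible to this app only
            status.getLen(),
            status.getModificationTime());
        Map<String, LocalResource> resources = new HashMap<String, LocalResource>();
        resources.put(linkName, resource);          // link name seen by the process
        return resources;
      }
    }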
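
The interface sketch referenced from slide 15, writing the AM RPC operations down as a compilable Java interface. The operation names and shapes come from the slide, and ClusterDescription matches the type used on slide 16; ClusterNode, the String UUIDs, and the IOException signatures are assumptions made so the sketch compiles, not Hoya's real classes.

    import java.io.IOException;
    import java.util.List;

    public interface HoyaAppMasterApi {

      /** Change cluster role counts (the same entry point slide 16 shows). */
      boolean flexCluster(ClusterDescription updated) throws IOException;

      /** Get current cluster state as a cluster specification. */
      ClusterDescription getJSONClusterStatus() throws IOException;

      /** UUIDs of the live instances of one role, e.g. "worker" or "master". */
      List<String> listNodeUUIDsByRole(String role) throws IOException;

      /** Details of a single role instance. */
      ClusterNode getNode(String uuid) throws IOException;

      /** Bulk lookup of role instances. */
      List<ClusterNode> getClusterNodes(List<String> uuids) throws IOException;

      /** Tell the AM to shut the whole cluster down. */
      void stopCluster() throws IOException;

      /** Placeholder for the JSON cluster specification (slides 17-18). */
      final class ClusterDescription {
        public String json;
      }

      /** Placeholder for per-instance role information. */
      final class ClusterNode {
        public String uuid;
        public String role;
        public String hostname;
      }
    }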
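
The sketch referenced from slide 16 of what a reviewRequestAndReleaseNodes() style method might do: compare the desired instance count per role with the actual count, then request or release containers to close the gap. The field and method names here are hypothetical; only the idea that one reconciliation path serves both planned flexing and unplanned container loss comes from the talk.

    import java.util.Map;

    public abstract class ReviewSketch {

      /** role name -> number of instances the cluster spec asks for */
      protected Map<String, Integer> desiredInstances;
      /** role name -> number of live containers the AM currently tracks */
      protected Map<String, Integer> actualInstances;

      protected abstract void requestContainer(String role);
      protected abstract void releaseContainer(String role);

      /** One code path serves planned flexing and unplanned container loss. */
      public boolean reviewRequestAndReleaseNodes() {
        boolean changed = false;
        for (Map.Entry<String, Integer> entry : desiredInstances.entrySet()) {
          String role = entry.getKey();
          int desired = entry.getValue();
          int actual = actualInstances.containsKey(role) ? actualInstances.get(role) : 0;
          for (int i = actual; i < desired; i++) {   // too few: ask the RM for more
            requestContainer(role);
            changed = true;
          }
          for (int i = desired; i < actual; i++) {   // too many: give containers back
            releaseContainer(role);
            changed = true;
          }
        }
        return changed;
      }
    }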
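
The probe sketch referenced from slide 20: the simplest of the proposed liveness probes, an HTTP GET against a role instance's web port (8585 is the master app.infoport from the example on slide 18). Treating any 2xx response as "alive", the URL, and the timeout handling are assumptions for illustration.

    import java.io.IOException;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class HttpLivenessProbe {

      public static boolean isAlive(String url, int timeoutMillis) {
        try {
          HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
          conn.setRequestMethod("GET");
          conn.setConnectTimeout(timeoutMillis);
          conn.setReadTimeout(timeoutMillis);
          int status = conn.getResponseCode();   // performs the request
          conn.disconnect();
          return status >= 200 && status < 300;  // treat any 2xx as "alive"
        } catch (IOException e) {
          return false;                          // unreachable counts as dead
        }
      }

      public static void main(String[] args) {
        // e.g. probe a master info port such as the one assigned on slide 18
        System.out.println(isAlive("http://localhost:8585/", 2000));
      }
    }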
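
A sketch for the "Service Registry - ZK?" item on slide 24: one plausible shape for such a registry, in which each role instance publishes an ephemeral znode holding its host:port so clients and the AM can discover dynamically assigned endpoints. The use of Apache Curator, the path layout, and the payload format are all assumptions, not a decided design.

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.zookeeper.CreateMode;

    public class ZkRegistrySketch {
      public static void main(String[] args) throws Exception {
        CuratorFramework zk = CuratorFrameworkFactory.newClient(
            "127.0.0.1:2181", new ExponentialBackoffRetry(1000, 3));
        zk.start();

        // Ephemeral: the entry disappears automatically if the process dies,
        // which is exactly the liveness signal a long-lived service wants.
        String path = "/registry/hoya/TestLiveTwoNodeRegionService/worker/instance-1";
        zk.create()
          .creatingParentsIfNeeded()
          .withMode(CreateMode.EPHEMERAL)
          .forPath(path, "host1.example.com:60020".getBytes("UTF-8"));

        // The instance keeps its ZK session open for as long as it is alive.
        Thread.sleep(Long.MAX_VALUE);
      }
    }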
