Hoya : HBase on YARN (2013-08-20 HBase Hug)

An introduction to Hoya, a prototype YARN application for deploying HBase in a YARN cluster, using YARN to allocate space on nodes for the processes. The Hoya Application Master monitors the health of the Region Servers and, whenever one exits or its host drops out of the cluster, requests a new YARN container and deploys a new instance.

Hoya also supports live cluster flexing: increasing and decreasing cluster size dynamically.

The source code is at https://github.com/hortonworks/hoya

Transcript

  • 1. © Hortonworks Inc. 2013 Hoya: HBase on YARN Steve Loughran & Devaraj Das {stevel, ddas} at hortonworks.com @steveloughran, @ddraj August 2013
  • 2. Hadoop as Next-Gen Platform
      HADOOP 1.0 (Single Use System: Batch Apps)
      – HDFS (redundant, reliable storage)
      – MapReduce (cluster resource management & data processing)
      HADOOP 2.0 (Multi Purpose Platform: Batch, Interactive, Online, Streaming, …)
      – HDFS2 (redundant, reliable storage)
      – YARN (cluster resource management)
      – MapReduce (data processing)
      – Others (data processing)
  • 3. YARN: Taking Hadoop Beyond Batch
      Applications Run Natively IN Hadoop
      – BATCH (MapReduce)
      – INTERACTIVE (Tez)
      – STREAMING (Storm, S4,…)
      – GRAPH (Giraph)
      – IN-MEMORY (Spark)
      – HPC MPI (OpenMPI)
      – OTHER (Search) (Weave…)
      – Samza
      YARN (Cluster Resource Management)
      HDFS2 (Redundant, Reliable Storage)
      Store ALL DATA in one place… Interact with that data in MULTIPLE WAYS with Predictable Performance and Quality of Service
  • 4. And HBase?
      Diagram: HDFS2 (Redundant, Reliable Storage); YARN (Cluster Resource Management); BATCH (MapReduce); INTERACTIVE (Tez); STREAMING (Storm, S4,…); GRAPH (Giraph); IN-MEMORY (Spark); HPC MPI (OpenMPI); OTHER (Search) (Weave…); HBase
  • 6. Hoya: On-demand HBase clusters
      1. Small HBase cluster in large YARN cluster
      2. Dynamic HBase clusters
      3. Elastic HBase clusters
      4. Transient/intermittent clusters for workflows
      5. Custom versions & configurations
      6. More efficient utilization/sharing of cluster
  • 7. Goal: No code changes in HBase
      • Today: none
        HBase 0.95.2
        $ mvn install -Dhadoop.version=2.0
      But we'd like
      • ZK reporting of web UI ports
      • Allocation of tables in RS to be block location aware
      • A way to get from failed RS to YARN container (configurable ID is enough)
  • 8. Hoya – the tool
      • Hoya (HBase On YArn)
        – Java tool
        – Completely CLI driven
      • Input: cluster description as JSON
        – Specification of cluster: node options, ZK params
        – Configuration generated
        – Entire state persisted
      • Actions: create, freeze/thaw, flex, exists <cluster>
      • Can change cluster state later
        – Add/remove nodes, started / stopped states
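The actions on this slide can be read as operations on a persisted cluster record. The following is a minimal sketch of that reading, not Hoya's actual code: the class and enum names are hypothetical, an in-memory map stands in for the persisted state, and it assumes that freeze stops a cluster while keeping its description, and that thaw restarts it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lifecycle states; the names are not taken from the Hoya source. */
enum ClusterState { LIVE, FROZEN }

/** In-memory stand-in for the persisted cluster descriptions (assumption). */
class ClusterRegistrySketch {
    private final Map<String, ClusterState> states = new HashMap<>();
    private final Map<String, Integer> workers = new HashMap<>();

    /** create: record the description and mark the cluster as started. */
    void create(String name, int workerCount) {
        if (states.containsKey(name)) {
            throw new IllegalStateException("cluster already exists: " + name);
        }
        states.put(name, ClusterState.LIVE);
        workers.put(name, workerCount);
    }

    /** exists: true if a description is recorded, whether started or stopped. */
    boolean exists(String name) {
        return states.containsKey(name);
    }

    /** freeze: stop the cluster but keep its description. */
    void freeze(String name) {
        require(name);
        states.put(name, ClusterState.FROZEN);
    }

    /** thaw: restart a cluster from its recorded description. */
    void thaw(String name) {
        require(name);
        states.put(name, ClusterState.LIVE);
    }

    /** flex: change the recorded worker count; allowed in either state. */
    void flex(String name, int workerCount) {
        require(name);
        workers.put(name, workerCount);
    }

    ClusterState state(String name) {
        require(name);
        return states.get(name);
    }

    int workers(String name) {
        require(name);
        return workers.get(name);
    }

    private void require(String name) {
        if (!states.containsKey(name)) {
            throw new IllegalArgumentException("unknown cluster: " + name);
        }
    }
}
```

Because the description outlives the running processes, a flex applied to a frozen cluster in this sketch simply takes effect at the next thaw.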
  • 9. YARN manages the cluster
      Diagram: HDFS + YARN Node Manager (×3); HDFS + YARN Resource Manager
      • Servers run YARN Node Managers
      • NMs heartbeat to Resource Manager
      • RM schedules work over cluster
      • RM allocates containers to apps
      • NMs start containers
      • NMs report container health
  • 10. Hoya Client creates App Master
      Diagram: HDFS + YARN Node Manager (×3); HDFS + YARN Resource Manager; Hoya Client; Hoya AM
  • 11. AM deploys HBase with YARN
      Diagram: HDFS + YARN Node Manager (×3); HDFS + YARN Resource Manager; Hoya Client; Hoya AM [HBase Master]; HBase Region Server (×2)
  • 12. HBase & clients bind via ZooKeeper
      Diagram: HDFS + YARN Node Manager (×3); HDFS + YARN Resource Manager; HBase Region Server (×2); Hoya AM [HBase Master]; HBase Client; Hoya Client
  • 13. YARN notifies AM of failures
      Diagram: HDFS + YARN Node Manager (×3); HDFS + YARN Resource Manager; Hoya Client; Hoya AM [HBase Master]; HBase Region Server (×3)
  • 14. HOYA - cool bits
      • Cluster specification stored as JSON in HDFS
      • Conf dir cached, dynamically patched before pushing up as local resources for master & region servers
      • HBase .tar file stored in HDFS - clusters can use the same/different HBase versions
      • Handling of cluster flexing is the same code as unplanned container loss.
      • No Hoya code on region servers
  • 15. HOYA - AM RPC API
        //shut down
        public void stopCluster();
        //change #of worker nodes in cluster
        public boolean flexNodes(int workers);
        //get JSON description of live cluster
        public String getClusterStatus();
  • 16. Flexing/failure handling is same code
        public boolean flexNodes(int workers) throws IOException {
          log.info("Flexing cluster count from {} to {}", numTotalContainers, workers);
          if (numTotalContainers == workers) {
            //no-op
            log.info("Flex is a no-op");
            return false;
          }
          //update the #of workers
          numTotalContainers = workers;
          // ask for more containers if needed
          reviewRequestAndReleaseNodes();
          return true;
        }
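The slide does not show the body of reviewRequestAndReleaseNodes(). One plausible reading is a reconcile step that compares the desired worker count with what is running or already requested, so that a flex and an unplanned container loss both end in the same call. The sketch below illustrates that idea only: the class, fields, and callback names are hypothetical, and plain counters stand in for real YARN container requests and releases.

```java
/** Minimal reconcile-loop sketch; counters stand in for real YARN requests (assumption). */
class FlexSketch {
    private int desired;    // target number of workers
    private int live;       // containers currently running
    private int requested;  // container requests not yet satisfied

    FlexSketch(int desired) {
        this.desired = desired;
        review();
    }

    /** Change the target size; returns false when the request is a no-op. */
    boolean flexNodes(int workers) {
        if (workers < 0) {
            throw new IllegalArgumentException("workers must be >= 0");
        }
        if (desired == workers) {
            return false;
        }
        desired = workers;
        review();
        return true;
    }

    /** Hypothetical callback: the resource manager granted a container. */
    void onContainerAllocated() {
        if (requested > 0) {
            requested--;
        }
        live++;
        review();
    }

    /** Hypothetical callback: a container exited or its node was lost. */
    void onContainerLost() {
        if (live > 0) {
            live--;
        }
        review();
    }

    /** Shared path: request what is missing, or cancel and release the surplus. */
    private void review() {
        int delta = desired - (live + requested);
        if (delta > 0) {
            requested += delta;              // stand-in for asking YARN for containers
        } else if (delta < 0) {
            int surplus = -delta;
            int cancelled = Math.min(surplus, requested);
            requested -= cancelled;          // cancel outstanding requests first
            live -= (surplus - cancelled);   // then release running containers
        }
    }

    int live() { return live; }

    int requested() { return requested; }
}
```

In this sketch a lost container and a flex up are indistinguishable to review(): both leave the cluster below target and both produce a new request.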
  • 17. Cluster Specification: persistent & wire
        {
          "name" : "TestHBaseMaster",
          "createTime" : 1371738651059,
          "flags" : {
            "--Xtest" : "true"
          },
          "originConfigurationPath" : "file:/Users/stevel/.hoya/cluster/TestHBaseMaster/orig",
          "generatedConfigurationPath" : "file:/Users/stevel/.hoya/cluster/TestHBaseMaster/gen",
          "hBaseClientProperties" : { },
          "hbaseHome" : "/Users/stevel/Java/Apps/hbase",
          "hbaseRootPath" : "file:/Users/stevel/.hoya/cluster/TestHBaseMaster/hbase",
          "zkHosts" : "127.0.0.1",
          "zkPath" : "/hbase",
          "zkPort" : 49564,
          "workers" : 5,
          "masterHeap" : 128,
          "masters" : 1,
          "workerHeap" : 256,
          "startTime" : 0,
          "state" : 1,
          "statusTime" : 0,
          "stopTime" : 0
        }
      Spec: declarative parts; this is persisted
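The ZooKeeper and root-path fields of a specification like this presumably end up in the generated HBase configuration. The sketch below shows one way such patching could look. The property keys are standard HBase configuration names, but the mapping from spec fields to keys, the class name, and the method signature are assumptions rather than Hoya's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of patching an HBase site configuration from cluster spec fields (assumed mapping). */
class ConfPatchSketch {
    /** Returns a patched copy; the original map is left untouched. */
    static Map<String, String> patch(Map<String, String> original,
                                     String zkHosts, int zkPort,
                                     String zkPath, String hbaseRootPath) {
        Map<String, String> patched = new LinkedHashMap<>(original);
        // Standard HBase keys; which spec field feeds which key is an assumption.
        patched.put("hbase.zookeeper.quorum", zkHosts);
        patched.put("hbase.zookeeper.property.clientPort", Integer.toString(zkPort));
        patched.put("zookeeper.znode.parent", zkPath);
        patched.put("hbase.rootdir", hbaseRootPath);
        return patched;
    }
}
```

Returning a copy keeps the original configuration intact, which fits the slide's split between an origin configuration path and a generated one.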
  • 18. Current status
      • Able to create & stop on-demand HBase clusters
        – RegionServer failures handled
      • Able to specify specific HBase configuration: hbase-home or .tar.gz
      • Cluster stop, restart, flex
      • get (dynamic) conf as XML, properties
  • 19. What's Next
      • Multiple roles: worker, master, monitor
        --role worker --roleopts worker yarn.vcores 2
      • Multiple Providers: HBase + others
        – client side: preflight, configuration patching
        – server side: starting roles, liveness
      • Liveness probes: HTTP GET, RPC port, RPC op?
      • YARN enhancements
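The first two liveness probes listed on this slide, a check that an RPC port accepts connections and an HTTP GET, can be sketched with the JDK alone. This is an illustration rather than Hoya's design: the class and method names are hypothetical, and treating any 2xx response as "live" is an assumption.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

/** Two simple remote liveness probes built on the JDK only. */
class LivenessProbeSketch {
    /** Port probe: true if a TCP connection can be opened within the timeout. */
    static boolean portOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** HTTP probe: true if a GET to an http:// URL returns a 2xx status (assumed policy). */
    static boolean httpOk(String url, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            int code = conn.getResponseCode();
            return code >= 200 && code < 300;
        } catch (IOException e) {
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}
```

A port probe only shows that something is listening. The slide's third option, an RPC operation, would be needed to confirm that the server actually answers requests.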
  • 20. YARN-896: long-lived services
      1. Container reconnect on AM restart
      2. Token renewal on long-lived apps
      3. Containers: signalling, >1 process sequence
      4. AM/RM managed gang scheduling
      5. Anti-affinity hint in container requests
      6. Service Registry - ZK?
      7. Logging
      All post Hadoop-2.1
  • 21. Hoya needs a home! https://github.com/hortonworks/hoya
  • 22. Questions? hortonworks.com
  • 23. P.S: we are hiring http://hortonworks.com/careers/
  • 24. Requirements of an App: MUST
      • Install from tarball; run as normal user
      • Pre-configurable, static instance config data
      • Deploy/start without human intervention
      • Support dynamic discovery/binding of peers
      • Co-exist with other app instances in cluster/nodes
      • Handle co-located role instances
      • Persist data to HDFS
      • Support 'kill' as a shutdown option
      • Support role instances moving after failure
      • Handle failed role instances
  • 25. Requirements of an App: SHOULD
      • Be configurable by Hadoop XML files
      • Publish dynamically assigned web UI & RPC ports
      • Support cluster flexing up/down
      • Support API to determine role instance status
      • Make it possible to determine role instance ID from app
      • Support simple remote liveness probes
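The requirement to publish dynamically assigned ports matters when several role instances share a host, because fixed ports would collide. A minimal sketch of the pattern: bind to port 0 so the operating system picks a free port, then record the chosen port where peers and clients can look it up. The in-memory map stands in for a shared registry such as ZooKeeper, and all names here are hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of binding to an OS-assigned port and publishing it under a role instance ID. */
class PortPublishSketch {
    /** Stand-in for a shared registry such as ZooKeeper (assumption). */
    static final Map<String, Integer> REGISTRY = new ConcurrentHashMap<>();

    /** Binds to port 0, letting the OS choose a free port, then records the result. */
    static ServerSocket bindAndPublish(String roleInstanceId) throws IOException {
        ServerSocket socket = new ServerSocket(0);
        REGISTRY.put(roleInstanceId, socket.getLocalPort());
        return socket;
    }
}
```

Keying the registry by role instance ID also touches the related requirement on this slide, that a role instance's ID can be determined from the application.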