HBaseCon 2015: Elastic HBase on Mesos

Adobe has packaged HBase in Docker containers and uses Marathon and Mesos to schedule them, allowing us to decouple the RegionServer from the host, express resource requirements declaratively, and open the door to unassisted real-time deployments, elastic (up and down) real-time scalability, and more. In this talk, you'll hear what we've learned and why this approach could fundamentally change HBase operations.


  1. Elastic HBase on Mesos (Cosmin Lehene, Adobe)
  2. Industry Average Resource Utilization: <10% (used capacity: 1-10%, spare/unused capacity: 90-99%)
  3. Cloud Resource Utilization: ~60% (used capacity: 60%, spare/unused capacity: 40%)
  4. Actual utilization: 3-6% (used capacity: 1-10%, spare/unused capacity: 90-99%)
  5. Why
     • peak load provisioning (can be 30X)
     • resource imbalance (CPU- vs. I/O- vs. RAM-bound)
     • incorrect usage predictions
     • all of the above (and others)
  6. Typical HBase Deployment
     • (mostly) static deployment footprint
     • infrequent scaling out by adding more nodes
     • scaling down uncommon
     • OLTP and OLAP workloads as separate clusters
     • < 32 GB heap (compressed oops, GC)
  7. Wasted Resources
  8. Idleness Costs
     • idle servers draw > ~50% of their nominal power
     • hardware depreciation accounts for ~40%
     • in public clouds, idleness translates to 100% waste (charged by time, not by resource use)
  9. Workload segregation nullifies economy-of-scale benefits
  10. Load is not Constant
     • daily, weekly, seasonal variation (both up and down)
     • load varies across workloads
     • peaks are not synchronized
  11. Opportunities
     • datacenter as a single pool of shared resources
     • resource oversubscription
     • mixed workloads can scale elastically within pools
     • shared extra capacity
  12. Elastic HBase
  13. Goals
  14. Cluster Management “Bill of Materials”
     • single pool of resources ★ Mesos
     • multi-tenancy ★ Mesos
     • mixed short- and long-running tasks ★ Mesos (through frameworks)
     • elasticity ★ Marathon / Mesos
     • realtime scheduling ★ Marathon / Mesos
  15. Multitenancy: mixing multiple workloads
     • daily, weekly variation
     • balance resource usage (e.g. CPU-bound + I/O-bound)
     • off-peak scheduling (e.g. nightly batch jobs)
     • no “analytics” clusters
  16. HBase “Bill of Materials”: task portability
     • statelessness ✓ built-in (HDFS and ZK)
     • auto discovery ✓ built-in
     • self-contained binary ★ Docker
     • resource isolation ★ Docker (through cgroups)
  17. Node Level (stack diagram)
     • Hardware
     • OS/Kernel
     • Mesos Slave, Docker, Salt Minion
     • Containers: Kafka Broker, HBase HRS, [APP]
  18. Cluster Level (layer diagram)
     • Resource Management: Mesos
     • Scheduling: Kubernetes, Marathon, Aurora
     • Storage: HDFS, Tachyon, HBase
     • Compute: MapReduce, Storm, Spark
  19. Docker (and containers in general)
  20. Why: Docker Containers
     • “static link” everything (including the OS)
     • standard interface (resources, lifecycle, events)
     • lightweight: just another process, no overhead, native performance
     • fine-grained resources (e.g. 0.5 cores, 32MB RAM, 32MB disk)
  21. From .tgz/rpm + Puppet to Docker
     • goal: optimize for Mesos (not standalone)
     • cluster- and host-agnostic (portability)
     • env config injected through Marathon
     • self-contained: OS base + JDK + HBase (centos-7 + java-1.8u40 + hbase-1.0)
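     A minimal sketch of what such a self-contained image could look like. The base image, package name, download URL, and entrypoint script below are illustrative assumptions, not Adobe's actual Dockerfile; only the ingredient list (centos-7, java-1.8, hbase-1.0) comes from the slide.

       # Hypothetical Dockerfile: self-contained RegionServer image (OS base + JDK + HBase).
       FROM centos:7

       # JDK from the CentOS repos; the slide pins java-1.8u40, package pinning omitted here.
       RUN yum install -y java-1.8.0-openjdk-headless && yum clean all

       # Unpack an HBase 1.0 binary tarball under /opt (download URL is an assumption).
       ENV HBASE_VERSION=1.0.0
       RUN curl -fsSL https://archive.apache.org/dist/hbase/hbase-${HBASE_VERSION}/hbase-${HBASE_VERSION}-bin.tar.gz \
             | tar -xz -C /opt \
        && ln -s /opt/hbase-${HBASE_VERSION} /opt/hbase

       # No cluster-specific settings are baked in: ZK and HDFS URIs arrive as environment
       # variables from Marathon and are rendered into hbase-site.xml by the (omitted)
       # entrypoint script before the RegionServer starts.
       COPY entrypoint.sh /entrypoint.sh
       ENTRYPOINT ["/entrypoint.sh"]
       CMD ["regionserver"]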
  22. Marathon
  23. Marathon “runs” Applications on Mesos
     • REST API to start / stop / scale apps
     • maintains desired state (e.g. # instances)
     • kills / restarts unhealthy containers
     • reacts to node failures
     • constraints (e.g. locality)
  24. Marathon Manifest
     • env information: ZK, HDFS URIs
     • container resources: CPU, RAM
     • cluster resources: # container instances
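     A minimal sketch of such a manifest, using Marathon's v2 app-definition format. The app id, image name, URIs, sizes, and environment variable names are illustrative assumptions; only the kinds of settings (ZK/HDFS URIs, CPU/RAM, instance count) come from the slide.

       {
         "id": "/hbase/regionserver",
         "instances": 10,
         "cpus": 2,
         "mem": 8192,
         "container": {
           "type": "DOCKER",
           "docker": {
             "image": "example-registry/hbase-regionserver:1.0",
             "network": "HOST"
           }
         },
         "env": {
           "HBASE_ZOOKEEPER_QUORUM": "zk1:2181,zk2:2181,zk3:2181",
           "HBASE_ROOT_DIR": "hdfs://namenode:8020/hbase"
         },
         "constraints": [["hostname", "UNIQUE"]]
       }

     The hostname UNIQUE constraint is one way to express the locality constraints mentioned on the previous slide: at most one instance of this app per host.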
  25. Marathon “deployment”
     • REST call
     • Marathon (and Mesos) handle the actual deployment automatically
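     Concretely, deploying is a single call against Marathon's REST API (the host below is a placeholder; the JSON file is the manifest sketched above):

       # Create the app; Marathon accepts Mesos offers and launches the containers.
       curl -X POST -H "Content-Type: application/json" \
            --data @hbase-regionserver.json \
            http://marathon.example.com:8080/v2/apps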
  26. Benefits
  27. Easy
     • no code needed
     • trivial Docker container
     • could be released with HBase
     • straightforward Marathon manifest
  28. Efficiency
     • improved resource utilization
     • mixed workloads
     • elasticity
  29. Elasticity
     • scale up / down based on load (traffic spikes, compactions, etc.)
     • yield unused resources
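     Scaling is then just an update of the desired instance count through the same API, for example (a sketch, reusing the app id and placeholder host from the earlier manifest):

       # Grow from 10 to 15 RegionServer containers for a traffic spike or compaction burst;
       # the same call with a lower number shrinks the group and yields the resources back.
       curl -X PUT -H "Content-Type: application/json" \
            --data '{"instances": 15}' \
            http://marathon.example.com:8080/v2/apps/hbase/regionserver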
  30. Smaller, Better?
     • multiple RegionServers per node
     • use all RAM without losing compressed oops
     • smaller failure domain
     • smaller heaps
     • less GC-induced latency jitter
  31. Simplified Tuning
     • standard container sizes
     • decoupled from physical hosts: portable, same tuning everywhere
     • invariants based on resource ratios (# threads to # cores to RAM to bandwidth)
  32. Collocated Clusters
     • multiple versions (e.g. 0.94, 0.98, 1.0)
     • simplifies multi-tenancy aspects (e.g. cluster-per-table resource isolation)
  33. NEXT
  34. Improvements
     • drain regions before suspending
     • schedule for data locality: collocate RegionServers and HFile blocks
     • DN short-circuit reads through shared volumes
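     The shared-volume idea could be expressed directly in the Marathon manifest's container section, roughly as below; the socket directory path is the commonly used dfs.domain.socket.path location and is an assumption here:

       {
         "container": {
           "type": "DOCKER",
           "docker": {
             "image": "example-registry/hbase-regionserver:1.0",
             "network": "HOST"
           },
           "volumes": [
             {
               "hostPath": "/var/lib/hadoop-hdfs",
               "containerPath": "/var/lib/hadoop-hdfs",
               "mode": "RW"
             }
           ]
         }
       }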
  35. HBase Ergonomics
     • auto-tune to available resources: JVM heap, number of threads, etc.
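     A sketch of what such ergonomics could look like as a container entrypoint. The cgroup path is the v1 layout; the CPUS variable, the 70% heap ratio, and the handlers-per-core ratio are assumptions for illustration, while HBASE_HEAPSIZE and hbase.regionserver.handler.count are standard HBase settings:

       #!/bin/sh
       # Derive JVM and server settings from the container's allotment instead of
       # host-level defaults (assumes a memory limit is always set, as Mesos does).
       MEM_MB=$(( $(cat /sys/fs/cgroup/memory/memory.limit_in_bytes) / 1024 / 1024 ))

       # Give the heap ~70% of the container; leave the rest for off-heap buffers and JVM overhead.
       export HBASE_HEAPSIZE=$(( MEM_MB * 70 / 100 ))

       # CPUS is assumed to be injected via the Marathon manifest's env section.
       HANDLERS=$(( ${CPUS:-2} * 10 ))
       sed -i "s|</configuration>|<property><name>hbase.regionserver.handler.count</name><value>${HANDLERS}</value></property></configuration>|" \
           /opt/hbase/conf/hbase-site.xml

       exec /opt/hbase/bin/hbase regionserver start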
  36. Disaggregating HBase
     • HBase is a consistent, highly available, distributed cache on top of HFiles in HDFS
     • most *real* resource-wise, multi-tenant concerns revolve around a (single) table
     • each table could have its own cluster (minus some security-group concerns)
  37. HMaster as a Scheduler?
     • could fully manage HRS lifecycle (start/stop) in conjunction with region allocation
     • considerations: Marathon is a generic long-running app scheduler; extend its scheduling capabilities instead of “reinventing” them?
  38. FIN
  39. Resources
     • The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition - http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y201306CAC024
     • Omega: flexible, scalable schedulers for large compute clusters - http://research.google.com/pubs/pub41684.html
     • Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center - https://www.cs.berkeley.edu/~alig/papers/mesos.pdf
     • https://github.com/mesosphere/marathon
  40. Contact
     • @clehene
     • clehene@[gmail | adobe].com
     • hstack.org
