Valta: A Resource Management Layer over Apache HBase
Lars George | Director, EMEA Services
Andrew Wang | Software Engineer
June 13, 2013
Background on HBase
• Write-heavy processing pipelines
• Web crawling, personalization, time-series
• Storing a lot of data (many TBs)
• Random reads/writes
• Tight MapReduce and Hadoop integration
Workloads
• Very much a shared system
• One system, multiple workloads
• Frontend doing random reads/writes
• Analytical MR doing sequential scans
• Bulk import/export with MR
• Hard to isolate multitenant workloads
Example: Rolling RS failures
• Happened in production
• Bad bulk import wiped out the entire cluster
• MR writes kill the RS
• Region gets reassigned
• Repeat until cluster is dead
• Applies to any high-load traffic
Current state of the art
• Run separate clusters and replicate between them
• $$$, poor utilization, more complex
• Namespace-based hardware partitioning
• Same issues as above
• Delay big tasks until periods of low load
• Ad-hoc, weak guarantees
Other Problems
• Long requests impact frontend latency
• I/O latency (HDFS, OS, disk)
• Unpredictable ops (compaction, cron, …)
• Some straightforward to fix, some not
Outline
• Project Valta (HBase)
• Resource limits
• Blueprint for further issues
• Request scheduling
• Auto-tuning scheduling for SLOs
• Multiple read replicas
Project Valta
• Need basic resource limits in HBase
• Single shared system
• Ill-behaved HBase clients are unrestricted
• Take resources from other clients
• Worst case: rolling RS failures
• Want to limit damage from bad clients
Resource Limits
• Collect RPC metrics
• Payload size and throughput
• Impose per-client throughput limits
• e.g. an MR import limited to 100 × 1 MB puts/s
• Limits are enforced per-regionserver
• Soft state
• Think of it as a firewall
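
A minimal sketch of the per-client throughput limit described above, assuming a token bucket that a chore thread refills once per second; the class and method names are illustrative, not Valta's actual API:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.atomic.AtomicLong;

    // Illustrative per-client byte-rate limiter, kept as soft state on each
    // regionserver: if the bucket is empty the request is rejected or delayed.
    public class ClientRateLimiter {
      private final long bytesPerSecond;   // e.g. 100 * 1 MB for the MR import
      private final ConcurrentMap<String, AtomicLong> buckets =
          new ConcurrentHashMap<String, AtomicLong>();

      public ClientRateLimiter(long bytesPerSecond) {
        this.bytesPerSecond = bytesPerSecond;
      }

      // Called once per second, e.g. from a chore thread, to refill every bucket.
      public void refill() {
        for (AtomicLong bucket : buckets.values()) {
          bucket.set(bytesPerSecond);
        }
      }

      // Returns true if the client may send a payload of the given size right now.
      public boolean tryAcquire(String clientName, long payloadBytes) {
        buckets.putIfAbsent(clientName, new AtomicLong(bytesPerSecond));
        return buckets.get(clientName).addAndGet(-payloadBytes) >= 0;
      }
    }

Because the state is soft, it can simply be rebuilt after a regionserver restart; nothing needs to be persisted.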
Implementation
• Client-side table wrapper
• Server-side coprocessor
• Github
• https://github.com/larsgeorge/Valta
• Follow HBASE-8481
• https://issues.apache.org/jira/browse/HBASE-8481
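
Server-side, the check can live in a RegionObserver; a rough sketch against the 0.94-era coprocessor API, reusing the illustrative ClientRateLimiter above and assuming a hypothetical getClientId() helper for identifying the caller:

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

    // Illustrative server-side coprocessor: every Put is checked against the
    // per-client limiter before it is applied to the region.
    public class ThrottlingObserver extends BaseRegionObserver {
      private final ClientRateLimiter limiter =
          new ClientRateLimiter(100L * 1024 * 1024);   // 100 MB/s, matching the example limit

      @Override
      public void prePut(ObserverContext<RegionCoprocessorEnvironment> c,
          Put put, WALEdit edit, boolean writeToWAL) throws IOException {
        String client = getClientId();   // hypothetical: derive from the RPC connection
        if (!limiter.tryAcquire(client, put.heapSize())) {
          throw new IOException("Throughput limit exceeded for client " + client);
        }
      }

      private String getClientId() {
        // Placeholder only: in practice the name would come from the RPC layer.
        return "unknown";
      }
    }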
Limitations
• Important first steps, still more to do
• Static limits need baby-sitting
• Workloads and the set of clients change over time
• Doesn’t fix some parts of HBase
• Compactions
• Doesn’t fix the rest of the stack
• HDFS, OS, disk
Blueprint for further issues
• Ideas on other QoS issues
• Full-stack request scheduling
• HBase, HDFS, OS, disk
• Auto-tuning to meet high-level SLOs
• Random latency (compaction, cron, …)
• Let’s file some JIRAs 
Full-stack request scheduling
• Need scheduling in all layers
• HBase, HDFS, OS, disk
• Run high-priority requests first
• Preemption of long operations
• Some pieces already available
• RPC priority field (HADOOP-9194)
• Client names in MR/HBase/HDFS
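
Conceptually, "run high-priority requests first" is a priority queue in front of the handler threads; a generic sketch (not HBase's actual RPC code), assuming each call carries a HADOOP-9194-style priority value:

    import java.util.concurrent.PriorityBlockingQueue;

    // Conceptual sketch: calls tagged with a priority are drained highest-first
    // by the handler threads, so frontend gets jump ahead of bulk-import puts.
    public class PriorityCallQueue {

      public static class Call implements Comparable<Call> {
        final int priority;      // e.g. copied from the RPC header's priority field
        final Runnable work;

        Call(int priority, Runnable work) {
          this.priority = priority;
          this.work = work;
        }

        @Override
        public int compareTo(Call other) {
          return Integer.compare(other.priority, this.priority);  // higher priority first
        }
      }

      private final PriorityBlockingQueue<Call> queue = new PriorityBlockingQueue<Call>();

      public void enqueue(Call call) { queue.put(call); }

      // Handler thread loop: always takes the highest-priority waiting call.
      public void handlerLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
          queue.take().work.run();
        }
      }
    }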
HBase request scheduling
• Add more HBase scheduling hooks
• RPC handling
• Between HDFS I/Os
• During long coprocessors or scans
• Expose hooks to coprocessors
• Could be used by Valta
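
One shape such hooks could take, purely hypothetical and sketched only to show the idea: a callback invoked at each yield point that a policy such as Valta could use to delay low-priority work:

    // Hypothetical scheduling hook (not an existing HBase API): the regionserver
    // would call onYield() at each RPC handler step, between HDFS I/Os, and inside
    // long scans or coprocessor calls.
    public interface RequestSchedulingHook {
      // Invoked at a safe yield point; may block to deprioritize the request.
      void onYield(String clientName, int priority) throws InterruptedException;
    }

    // Trivial example policy: low-priority requests back off at every yield point.
    class BackoffHook implements RequestSchedulingHook {
      private final int threshold;

      BackoffHook(int threshold) { this.threshold = threshold; }

      @Override
      public void onYield(String clientName, int priority) throws InterruptedException {
        if (priority < threshold) {
          Thread.sleep(10);   // crude: real policies would consult load and SLOs
        }
      }
    }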
HDFS request scheduling
• Same scheduling hooks as in HBase
• RPC layer, between I/Os
• Bound # of requests per disk
• Reduces queue length and contention
• Preempt queues in OS and disk
• OS block layer (CFQ, ioprio_set)
• Disk controller (SATA NCQ, ???)
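
A minimal sketch of bounding in-flight requests per disk, assuming one semaphore per data directory; names and limits are illustrative:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.Semaphore;

    // Illustrative sketch: each data directory gets a semaphore, so OS and disk
    // queues stay short and a low-priority scan cannot build a deep queue in
    // front of latency-sensitive reads.
    public class PerDiskIoLimiter {
      private final int maxInFlight;
      private final ConcurrentMap<String, Semaphore> disks =
          new ConcurrentHashMap<String, Semaphore>();

      public PerDiskIoLimiter(int maxInFlight) {
        this.maxInFlight = maxInFlight;
      }

      private Semaphore forDisk(String dataDir) {
        disks.putIfAbsent(dataDir, new Semaphore(maxInFlight, true));
        return disks.get(dataDir);
      }

      public void acquire(String dataDir) throws InterruptedException {
        forDisk(dataDir).acquire();   // blocks once maxInFlight I/Os are outstanding
      }

      public void release(String dataDir) {
        forDisk(dataDir).release();
      }
    }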
High-level SLO enforcement
• Research work I did at Berkeley (Cake)
• Specify high-level SLOs directly to HBase
• “100ms 99th percentile latency for gets”
• Added hooks to HBase and HDFS
• System auto-tunes to satisfy SLOs
• Read the paper or hit me up!
• http://www.umbrant.com/papers/socc12-cake.pdf
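
A toy sketch of the feedback-loop idea, not Cake's actual controller: each tuning interval, compare the measured 99th-percentile latency to the SLO and shift scheduling share accordingly:

    // Sketch of the auto-tuning loop: if the SLO is missed, give the
    // latency-sensitive client more share; if it is met, release capacity
    // back to batch work.
    public class SloTuner {
      private final double targetP99Ms;     // e.g. 100.0 for "100ms 99th percentile for gets"
      private double frontendShare = 0.5;   // fraction of handler/disk capacity for the frontend

      public SloTuner(double targetP99Ms) {
        this.targetP99Ms = targetP99Ms;
      }

      // Called once per tuning interval with the latest measured latency.
      public double adjust(double measuredP99Ms) {
        if (measuredP99Ms > targetP99Ms) {
          frontendShare = Math.min(1.0, frontendShare + 0.05);   // SLO missed
        } else {
          frontendShare = Math.max(0.1, frontendShare - 0.05);   // SLO met
        }
        return frontendShare;
      }
    }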
Multiple read replicas
• Also proposed for MTTR, availability
• Many unpredictable sources of latency
• Compactions
• Also: cron, MR spill, shared caches, network, …
• Sidestep the problem!
• Read from 3 RS, return the fastest result
• Unlikely all three will be slow
• Weaker consistency, better latency
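
The "read from 3, return the fastest" idea in a few lines, assuming three interchangeable replica tables exposed as HTableInterface handles (a hypothetical setup); invokeAny() returns the first successful result and cancels the rest:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.Result;

    // Illustrative hedged read: issue the same Get to each replica and return
    // whichever answers first; slower reads are cancelled by invokeAny().
    public class HedgedReader {
      private final ExecutorService pool = Executors.newCachedThreadPool();

      public Result hedgedGet(final List<HTableInterface> replicas, final Get get)
          throws Exception {
        List<Callable<Result>> reads = new ArrayList<Callable<Result>>();
        for (final HTableInterface replica : replicas) {
          reads.add(new Callable<Result>() {
            @Override
            public Result call() throws Exception {
              return replica.get(get);    // same row, different regionserver
            }
          });
        }
        return pool.invokeAny(reads);     // fastest successful read wins
      }
    }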
Conclusion
• HBase is a great system!
• Let’s make it multitenant
• Request limits
• Full-stack request scheduling
• High-level SLO enforcement
• Multiple read replicas
Thanks!
lars@cloudera.com
andrew.wang@cloudera.com
