Solving Multi-tenancy and G1GC in Apache HBase

Graham Baecher & Patrick Dignan (HubSpot)

At HubSpot, all HBase clusters run with G1GC and are highly multi-tenant, powering hundreds of unique APIs, Hadoop jobs, daemons, and crons. This two-part talk will cover challenges and solutions involving HBase multi-tenancy and G1GC tuning at HubSpot, including an overview of our request-by-request monitoring and analysis tools and how we identify/address G1 settings and behaviors that might be causing performance or stability problems.

Published in: Software

  1. HBase @ HubSpot
  2. Multitenancy
  3. What does Multitenancy Mean?
  4. Benefits of Multitenancy
  5. Identifying Bad Actors
  6. Our Tool: HBasetracing
  7. Ad-hoc Querying with HBasetracing
  8. HBasetracing Roll-up
  9. Example Incident
  10. Dealing with Bad Actors
  11. Quotas
  12. Deploying Quotas
      ● HADOOP_USER_NAME
  13. When Quotas Help
  14. Remember this?
  15. When Quotas Fall Flat
  16. Detention Queues
  17. The Dream
  18. Handling Failure & Managing Risk
  19. Read Replicas
  20. Read Replica Usage
      ● Consistency.TIMELINE
      ● isStale()
  21. Read Replica Timeout Settings
  22. Read Replica Limitations
  23. Cluster Replication
  24. Failover Client
  25. @StaleReadOnly Annotation
  26. Monitoring
  27. [no text captured]
  28. ● next()
  29. [no text captured]
  30. [no text captured]
  31. G1GC: Making it work with HBase
  32. Why G1GC?
      ● Designed for large heaps.
        ○ Divides heap into many smaller G1 regions.
        ○ G1 regions scanned and collected independently.
      ● Instead of occasional very long pauses, G1GC has more frequent, shorter pauses.
      If tuned properly, G1GC can provide performant GC that scales well for large RegionServer heaps.
  33. The Need for Tuning
      Out of the box, G1GC hurt our HBase clusters’ performance:
      ● Too much time spent in GC pauses.
      ● Occasional very long GC pauses.
      ● “To-space Exhaustion”, leading to Full GCs, which led to slow RegionServer deaths.
  34. Recommended Defaults
  35. Important Metrics for Tuning
      ● G1GC Eden & Tenured size.
        ○ GC logs: “[Eden: … Survivors: … Heap: …]”
      ● HBase memory used by Memstore.
        ○ RegionServer JMX: “memStoreSize”
      ● HBase memory used by Block Cache.
        ○ RegionServer JMX: “blockCacheSize”
      ● HBase memory used by “static index”.
        ○ RegionServer JMX: “staticIndexSize”
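The Eden and Tenured sizes named on slide 35 can be pulled out of the GC log summary line with a small parser. The following is a minimal sketch, not the speakers' tooling: the sample line is made up, it assumes the JDK 8 style `-XX:+PrintGCDetails` layout, and the function name `parse_heap_summary` is hypothetical. Tenured is estimated as whatever remains in the heap after the pause that is not Eden or Survivor space.

```python
import re

# Sizes in the summary line carry a unit suffix: B, K, M or G.
_UNIT = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
_SIZE = r"([\d.]+)([BKMG])"

# "[Eden: used(cap)->used(cap) Survivors: before->after Heap: used(cap)->used(cap)]"
_PATTERN = re.compile(
    (
        r"\[Eden: {s}\({s}\)->{s}\({s}\) "
        r"Survivors: {s}->{s} "
        r"Heap: {s}\({s}\)->{s}\({s}\)\]"
    ).replace("{s}", _SIZE)
)


def parse_heap_summary(line):
    """Return sizes in bytes from one G1 summary line, or None if it does not match."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    g = m.groups()
    # Each size is a (number, unit) pair; convert all ten of them to bytes.
    sizes = [float(g[i]) * _UNIT[g[i + 1]] for i in range(0, len(g), 2)]
    (eden_before, _, eden_after, _,
     _surv_before, surv_after,
     _heap_before, _, heap_after, heap_cap) = sizes
    return {
        "eden_before": eden_before,
        "survivors_after": surv_after,
        "heap_after": heap_after,
        "heap_capacity": heap_cap,
        # Assumption: post-pause occupancy outside Eden/Survivor is Tenured.
        "tenured_after": heap_after - eden_after - surv_after,
    }


# Illustrative line only; real values come from your RegionServer GC log.
sample = ("   [Eden: 1024.0M(1024.0M)->0.0B(896.0M) Survivors: 0.0B->128.0M "
          "Heap: 4096.0M(8192.0M)->3200.0M(8192.0M)]")
summary = parse_heap_summary(sample)
print(summary["tenured_after"] / 1024 ** 2, "MB tenured after the pause")
```

Tracking the maximum of `tenured_after` over time gives a rough ceiling to compare against the Memstore, Block Cache, and static index sizes reported over JMX.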
  36. Necessary Tuning Params
      ● JVM args:
        -Xms, -Xmx
        -XX:G1NewSizePercent
        -XX:InitiatingHeapOccupancyPercent (aka “IHOP”)
      ● HBase configs (hbase-site.xml):
        hfile.block.cache.size
        hbase.regionserver.global.memstore.size
  37. Necessary Tuning: Method
      A. Find max block cache size, memstore size, and static index size from the past month.
      B. Sum 110% of the (A) maxes, then add heap waste.
      C. Set IHOP and heap size such that the Initiating Heap Occupancy exceeds (B) by at least 10% of heap.
      D. Ensure IHOP + G1NewSizePercent < 90%.
         (90% = 100% - G1ReservePercent, which defaults to 10.)
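Steps A through D on slide 37 are plain arithmetic, so a candidate heap size and IHOP can be checked before deploying them. The sketch below is one reading of the slide, not the speakers' code: the function name `plan_heap`, its parameters, and every number in the example are hypothetical, and "heap waste" is treated as an input you measure yourself.

```python
def plan_heap(max_block_cache_gb, max_memstore_gb, max_static_index_gb,
              heap_waste_gb, heap_gb, ihop_pct, new_size_pct, reserve_pct=10):
    """Check a candidate heap size / IHOP / G1NewSizePercent against steps B-D."""
    # Step B: 110% of the month's maxes from step A, plus heap waste.
    observed = max_block_cache_gb + max_memstore_gb + max_static_index_gb
    baseline_gb = 1.10 * observed + heap_waste_gb

    # Step C: the occupancy at which marking starts must exceed (B)
    # by at least 10% of the total heap.
    initiating_gb = heap_gb * ihop_pct / 100.0
    headroom_ok = (initiating_gb - baseline_gb) >= 0.10 * heap_gb

    # Step D: IHOP plus the Eden floor must stay under 100% - G1ReservePercent.
    budget_ok = (ihop_pct + new_size_pct) < (100 - reserve_pct)

    return {
        "baseline_gb": baseline_gb,
        "initiating_gb": initiating_gb,
        "headroom_ok": headroom_ok,
        "budget_ok": budget_ok,
    }


# Made-up month of maxes: 8 GB block cache, 6 GB memstore, 1 GB static index,
# 1.5 GB heap waste, evaluated against a 32 GB heap.
for ihop, new_size in [(65, 15), (70, 15), (70, 20)]:
    plan = plan_heap(8, 6, 1, 1.5, heap_gb=32, ihop_pct=ihop, new_size_pct=new_size)
    print(ihop, new_size, plan)
```

If no IHOP value passes both checks, the remaining lever in step C is the heap size itself, which is why the slide tunes the two together.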
  38. Necessary Tuning: cont.
      In hbase-site.xml:
      ● Set the hfile.block.cache.size ratio value to 110% of the max block cache size from the past month.
      ● Set the hbase.regionserver.global.memstore.size ratio value to 110% of the max Memstore size from the past month.
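Both hbase-site.xml settings on slide 38 are fractions of the RegionServer heap, so the "110% of max" rule needs the heap size to turn an observed size into a ratio. A minimal sketch, with hypothetical helper names (`cap_ratio`, `as_property`) and made-up sizes:

```python
def cap_ratio(max_observed_gb, heap_gb, margin=1.10):
    """Fraction of heap that covers 110% of the largest size seen last month."""
    return margin * max_observed_gb / heap_gb


def as_property(name, ratio):
    """Render one hbase-site.xml <property> element for a ratio setting."""
    return ("<property>\n"
            "  <name>{}</name>\n"
            "  <value>{:.3f}</value>\n"
            "</property>").format(name, ratio)


# Illustrative inputs: 8 GB max block cache and 6 GB max Memstore on a 32 GB heap.
heap_gb = 32
print(as_property("hfile.block.cache.size", cap_ratio(8, heap_gb)))
print(as_property("hbase.regionserver.global.memstore.size", cap_ratio(6, heap_gb)))
```

Changing the heap size changes both ratios, so these values are worth recomputing whenever -Xmx moves.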
  39. Further Tuning & Considerations
      ● -XX:G1ReservePercent
        ○ Accommodating for burst-y usage.
      ● -XX:G1HeapRegionSize
        ○ Reducing occurrence of humongous objects.
        ○ Reducing long tail of slow GCs in some cases.
      ● -XX:G1NewSizePercent
        ○ Tuning individual pause time vs. % time in GC.
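The trade-off named for -XX:G1NewSizePercent, individual pause time versus % time in GC, can be made concrete with two numbers computed from young-collection pause durations. This is a sketch under assumptions: the function name `gc_summary` is hypothetical, the pause values are invented, and extracting pauses from your own GC logs is left out.

```python
def gc_summary(pauses_secs, window_secs):
    """Summarize GC pauses observed during a window of wall-clock time."""
    total = sum(pauses_secs)
    count = len(pauses_secs)
    return {
        # Share of the window during which the application was paused for GC.
        "pct_time_in_gc": 100.0 * total / window_secs,
        # Mean length of a single pause, in milliseconds.
        "avg_pause_ms": 1000.0 * total / count if count else 0.0,
    }


# Four invented young-GC pauses seen during a 10 second window.
print(gc_summary([0.12, 0.08, 0.10, 0.10], window_secs=10.0))
```

A larger Eden generally means fewer but longer young collections, so comparing these two figures before and after a change shows which side of the trade-off moved.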
  40. HBase Usage & Tuning Limits
      A Full GC isn’t necessarily G1GC’s fault. There’s a level of “bad usage” that’s unreasonable to tune around:
      ● Unexpected, excessively burst-y traffic.
      ● Too many/enormous Humongous objects.
      In either of these cases, the real solution is to fix the client code.
  41. Usage Note: Caching isn’t Free!
      [Chart] Yellow: % time spent in Mixed GC (left axis) | Blue: block cache churn, MB/sec (right axis)
  42. ...to Summarize:
      ● Tune heap size, IHOP, & HBase memory caps based on HBase memory usage.
      ● Tune Eden size based on % time in GC & average Young GC pause times.
      ● Make adjustments as needed, based on cluster usage.
      ● Look for suboptimal usage in your HBase clients to further improve HBase GC.
  43. Links & Reference
      ● Blog Post: http://bit.ly/hbasegc
      ● G1GC CollectD Plugin: http://bit.ly/collectdgc
      ● G1GC Log Visualizer: http://bit.ly/gclogviz
