TriHUG 3/14: HBase in Production

talk by Michael Webster, software engineer at Bronto

Published in: Internet, Technology

TriHUG 3/14: HBase in Production

  1. HBase In Production
     Hey, we’re hiring!
  2. Contents
     ● Bronto Overview
     ● HBase Architecture
     ● Operations
     ● Table Design
     ● Questions?
  3. Bronto Overview
     Bronto Software provides a cloud-based marketing platform for organizations to drive revenue through their email, mobile and social campaigns.
  4. Bronto Contd.
     ● ESP (email service provider) for e-commerce retailers
     ● Our customers are marketers
     ● Charts, graphs, reports
     ● Market segmentation
     ● Automation
     ● We are also hiring
  5. Where We Use HBase
     ● High volume scenarios
     ● Realtime data
     ● Batch processing
     ● HDFS staging area
     ● Sorting/Indexing not a priority
       ○ We are working on this
  6. HBase Overview
     ● Implementation of Google’s BigTable
     ● Sparse, sorted, versioned map
     ● Built on top of HDFS
     ● Row-level ACID
     ● Get, Put, Scan
     ● Assorted RMW (read-modify-write) operations
  7. Tables Overview
     Tables are sorted (lexicographically) key-value pairs of uninterpreted byte[]s. The keyspace is divided up into regions of keys. Each region is hosted by exactly one machine.
  8. Table Overview
     Key    Value
     a      byte[]
     aa     byte[]
     b      byte[]
     bb     byte[]
     c      byte[]
     ca     byte[]
     Regions: R1: [a, b), R2: [b, c), R3: [c, d)
     Diagram labels: R1 and R3 on Server 1; R2 on Server 1
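
The region layout on slides 7 and 8 can be sketched as a lookup over sorted region start keys. This is a minimal illustration in plain Python, not HBase client code; `find_region` and the constant names are hypothetical, and only the region boundaries come from the slide.

```python
from bisect import bisect_right

# Region boundaries from the slide: R1 = [a, b), R2 = [b, c), R3 = [c, d).
REGION_START_KEYS = [b"a", b"b", b"c"]
REGION_NAMES = ["R1", "R2", "R3"]
TABLE_END_KEY = b"d"  # exclusive upper bound of the last region


def find_region(row_key: bytes) -> str:
    """Return the name of the region whose [start, end) range holds row_key."""
    if not (REGION_START_KEYS[0] <= row_key < TABLE_END_KEY):
        raise KeyError(f"{row_key!r} is outside the table's key range")
    # bisect_right counts the start keys <= row_key; the last of those
    # is the start key of the owning region.
    index = bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_NAMES[index]


if __name__ == "__main__":
    for key in [b"a", b"aa", b"b", b"bb", b"c", b"ca"]:
        print(key.decode(), "->", find_region(key))
```

Python compares `bytes` values lexicographically, which matches the "sorted, uninterpreted byte[]" ordering described on slide 7.
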
  9. Operations
     ● Layers of complexity
     ● Normal failure modes
       ○ Hardware dies (or combusts)
       ○ Human error
     ● JVM
     ● HDFS considerations
     ● Lots of knobs
  10. Cascading Failure
      1. High write volume fragments the heap
      2. GC promotion failure
      3. Stop-the-world GC
      4. ZK (ZooKeeper) timeout
      5. Receive YouAreDeadException, die
      6. Failover
      7. Goto 1
  11. Useful Tunings
      ● MSLAB (MemStore-Local Allocation Buffer) enabled
      ● hbase.regionserver.handler.count
        ○ Increasing it puts more I/O load on the RegionServer (RS)
        ○ 50 is our sweet spot
      ● JVM tuning
        ○ -XX:+UseConcMarkSweepGC
        ○ -XX:+UseParNewGC
  12. Monitoring Tools
      ● Nagios for hardware checks
      ● Cloudera Manager
        ○ Reporting and health checks
        ○ Apache Ambari and MapR provide similar tools
      ● Hannibal + custom scripts
        ○ Identify hot regions for splitting
  13. Table Design
      ● Table design is deceptively simple
      ● Main considerations:
        ○ Row key structure
        ○ Number of column families
      ● Know your queries in advance
  14. Additional Context
      ● SaaS environment
        ○ “Twitter clone” model won’t work
      ● Thousands of users, millions of attributes
      ● Skewed customer base
        ○ Biggest clients have 10MM+ contacts
        ○ Smallest have thousands
  15. Row Keys
      ● Most important decision
      ● The only (native) index in HBase
      ● Random reads and writes are fast
        ○ Sorted on disk and in memory
        ○ Bloom filters speed up reads (not in use)
  16. Hotspotting
      ● Associated with monotonically increasing keys
        ○ MySQL AUTO_INCREMENT
      ● Writes lock onto one region at a time
      ● Consequences:
        ○ Flush and compaction storms
        ○ $500K cluster limited by $10K machine
  17. Row Key Advice
      ● Read/write ratio should drive design
        ○ We pay a write-time penalty for faster reads
      ● Identify queries you need to support
      ● Consider composite keys instead of indexes
      ● Bucketed/salted keys are an option
        ○ Distribute writes across N buckets
        ○ Rebucketing is difficult
        ○ Requires N reads, slow workers
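
The bucketed/salted key idea from slide 17 can be sketched as below. This is an illustration under stated assumptions, not the scheme Bronto used: the bucket count, the CRC32 hash, the `NN|` prefix format, and the function names are all hypothetical.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 8  # hypothetical N; changing it later means rewriting every key


def salted_key(row_key: bytes, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix the key with a bucket derived from a stable hash of the key."""
    bucket = zlib.crc32(row_key) % num_buckets
    return b"%02d|" % bucket + row_key


def bucket_prefixes(num_buckets: int = NUM_BUCKETS) -> list:
    """A range read must issue one scan per prefix: the 'N reads' cost."""
    return [b"%02d|" % bucket for bucket in range(num_buckets)]


if __name__ == "__main__":
    # Monotonically increasing ids (the hotspotting case) get spread out
    # because consecutive ids hash to different buckets.
    sequential_ids = [b"%d" % i for i in range(1000, 2000)]
    print(Counter(salted_key(key)[:2] for key in sequential_ids))
    print(bucket_prefixes())
```

Because the bucket is derived from the key itself, a point read can recompute the prefix directly; only range reads pay the N-scan cost listed on the slide.
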
  18. Variable Width Keys
      customer_hash::email
      ● Allows scans for a single customer
      ● Hashed id distributes customers
      ● Sorted by email address
        ○ Could also use reverse domain for gmail, yahoo, etc.
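
A minimal sketch of the `customer_hash::email` key from slide 18, including the reverse-domain variant. The choice of MD5, the 8-character hash width, and all function names are assumptions made for illustration; the slide does not say how the hash is computed.

```python
import hashlib

SEPARATOR = b"::"


def customer_hash(customer_id: str) -> bytes:
    """Fixed-width hash prefix (assumed: first 8 hex chars of MD5)."""
    digest = hashlib.md5(customer_id.encode("utf-8")).hexdigest()
    return digest[:8].encode("ascii")


def reverse_domain_email(email: str) -> str:
    """'Jane@Gmail.com' -> 'com.gmail@jane', so one domain sorts together."""
    local, _, domain = email.lower().rpartition("@")
    reversed_domain = ".".join(reversed(domain.split(".")))
    return f"{reversed_domain}@{local}"


def contact_row_key(customer_id: str, email: str, group_by_domain: bool = False) -> bytes:
    email_part = reverse_domain_email(email) if group_by_domain else email.lower()
    return customer_hash(customer_id) + SEPARATOR + email_part.encode("utf-8")


def customer_scan_range(customer_id: str) -> tuple:
    """[start, stop) row range covering every key of one customer."""
    prefix = customer_hash(customer_id) + SEPARATOR
    # The smallest key greater than every key with this prefix: bump the last byte.
    stop = prefix[:-1] + bytes([prefix[-1] + 1])
    return prefix, stop


if __name__ == "__main__":
    print(contact_row_key("acme", "Jane@Gmail.com"))
    print(contact_row_key("acme", "Jane@Gmail.com", group_by_domain=True))
    print(customer_scan_range("acme"))
```

The scan range is what "allows scans for a single customer" amounts to: every key for that customer shares the hashed prefix and sorts between `start` and `stop`.
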
  19. Fixed Width Keys
      site::contact::create::email
      ● FuzzyRowFilter
        ○ Can fix site, contact, and reverse_create
        ○ Can search for any email address
        ○ Could use a fixed width encoding for domain
          ■ Search for just gmail, yahoo, etc.
      ● Distributes sites and users
      ● Contacts sorted by create date
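
The matching rule behind a fuzzy row filter over fixed-width keys can be sketched in plain Python. This does not call HBase; it only imitates the idea of a pattern plus a per-byte mask. The field widths (4 bytes each for site, contact, and reverse_create), the helper names, and the mask convention (0 = byte must match, 1 = any byte) are assumptions for illustration.

```python
import struct

MAX_TS = 0xFFFFFFFF  # assumed 4-byte timestamps; reverse_create = MAX_TS - create


def make_key(site: int, contact: int, create_ts: int, email: str) -> bytes:
    """Hypothetical layout: site(4) + contact(4) + reverse_create(4) + email."""
    return struct.pack(">III", site, contact, MAX_TS - create_ts) + email.encode("utf-8")


def fuzzy_match(row: bytes, pattern: bytes, mask: bytes) -> bool:
    """True if row agrees with pattern at every position where mask is 0."""
    if len(pattern) != len(mask):
        raise ValueError("pattern and mask must have the same length")
    if len(row) < len(pattern):
        return False
    # zip stops at the end of the pattern, so trailing bytes (the email) are free.
    return all(m == 1 or r == p for r, p, m in zip(row, pattern, mask))


def site_and_date_pattern(site: int, create_ts: int) -> tuple:
    """Fix site and reverse_create, leave the contact bytes as wildcards."""
    pattern = struct.pack(">III", site, 0, MAX_TS - create_ts)
    mask = bytes([0] * 4 + [1] * 4 + [0] * 4)
    return pattern, mask


if __name__ == "__main__":
    rows = [
        make_key(7, 1, 100, "a@x.com"),
        make_key(7, 2, 100, "b@x.com"),
        make_key(7, 3, 200, "c@x.com"),
        make_key(8, 1, 100, "d@x.com"),
    ]
    pattern, mask = site_and_date_pattern(7, 100)
    print([row for row in rows if fuzzy_match(row, pattern, mask)])
```

Fixed widths are what make this work: a mask addresses byte positions, so every field has to start at the same offset in every key.
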
  20. Column Families
      ● Groupings of named columns
      ● Versioning, compression, TTL
      ● Different from BigTable
        ○ BigTable: 100s of column families
        ○ HBase: 1 or 2
  21. Column Family Example
      Column families: d {VERSIONS => 2}, s7 {TTL => 604800}
      Columns: Id | d: a (address), p (phone) | s7: o:3-27 (open), c:3-20 (click)
      Sample rows (values as listed on the slide):
        dfajkdh: byte[], byte[]:555-5555, byte[]
        hnvdzu9: byte[]:1234 St., XXXX
        hnvdzu9: byte[]:1233 St.
        hnvdzu9: XXXX, byte[]
        er9asyjk: byte[]: 324 Ave
      ● PROTIP: Keep CF and qualifier names short
        ○ They are repeated on disk for every cell
      ● “d” supports 2 versions of each column, maps to demographics
      ● “s7” has a seven-day TTL, maps to stats kept for 7 days
  22. Column Families In Depth
      Diagram: region my_table,,1328551097416.12921bbc0c91869f88ba6a044a6a1c50.
        family f1: MemStore; StoreFiles s1, s2, s3 on HDFS
        family f2: MemStore; StoreFiles s1, s2 on HDFS
      ● StoreFile(s) for each CF in region
      ● Sparse
      ● One memstore per CF
        ○ Must flush together
      ● Compactions happen at region level
  23. Compactions
      ● Rewrites StoreFiles
        ○ Improves read performance
        ○ I/O intensive
      ● Region scope
      ● Used to take > 50 hours
      ● Custom script took it down to 18 hours
        ○ Can (theoretically) run during the day
  24. Compaction Before and After
      Diagram: region my_table,,1328551097416.12921bbc0c91869f88ba6a044a6a1c50., family f1
        Before: MemStore; StoreFiles s1, s2, s3, s4, s5, s6 on HDFS
        After (K-way merge): MemStore; a single StoreFile S1 on HDFS
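
The K-way merge on slide 24 can be sketched with the standard library's `heapq.merge`. This is a simplified model, not HBase's compaction code: each store file is assumed to be a list of `(row_key, timestamp, value)` cells already sorted by row and then newest-first, and the sketch only merges; it does not drop deleted or expired cells.

```python
import heapq


def sort_key(cell: tuple) -> tuple:
    """Order cells by row key, then newest timestamp first."""
    row_key, timestamp, _value = cell
    return (row_key, -timestamp)


def compact(store_files: list) -> list:
    """Merge K individually sorted store files into one sorted store file."""
    # heapq.merge keeps one cursor per input, so it never needs to load
    # or re-sort the inputs as a whole.
    return list(heapq.merge(*store_files, key=sort_key))


if __name__ == "__main__":
    s1 = [(b"a", 1, b"v1"), (b"c", 1, b"v1")]
    s2 = [(b"a", 2, b"v2"), (b"b", 1, b"v1")]
    s3 = [(b"b", 3, b"v3")]
    for cell in compact([s1, s2, s3]):
        print(cell)
```

After the merge a read consults one sorted file instead of K, which is the read-performance gain the Compactions slide refers to; the cost is rewriting every cell, hence "I/O intensive".
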
  25. The Table From Hell
      ● 19 column families
      ● 60% of our region count
      ● Skewed write pattern
        ○ KB-sized store files
        ○ Frequent compaction storms
        ○ hbase.hstore.compaction.min.size (HBASE-5461)
      ● Moved to its own cluster
  26. And yet...
      ● Cluster remained operational
        ○ Table is still in use today
      ● Met read and write demand
      ● Regions only briefly active
        ○ Row keys by date and customer
  27. What saved us
      ● Keyed by customer and date
      ● Effectively write once
        ○ Kept “active” region count low
      ● Custom compaction script
        ○ Skipped old regions
      ● More hardware
      ● We were able to selectively migrate
  28. Column Family Advice
      ● Bad choice for fine-grained partitioning
      ● Good for
        ○ Similarly typed data
        ○ Varying versioning/retention requirements
      ● Prefer intra-row scans
        ○ CF and qualifiers are sorted
        ○ ColumnRangeFilter
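
Because qualifiers are sorted within a row, an intra-row scan reduces to a range lookup. The sketch below imitates what a column range filter selects, in plain Python rather than the HBase API; the qualifier format (`o:MM-DD` for opens, `c:MM-DD` for clicks, loosely following slide 21) and the function name are assumptions.

```python
from bisect import bisect_left, bisect_right


def column_range(sorted_cells: list, min_col: bytes, max_col: bytes,
                 min_inclusive: bool = True, max_inclusive: bool = False) -> list:
    """Return the (qualifier, value) cells whose qualifier lies in the range.

    sorted_cells must already be sorted by qualifier, as cells are in a row.
    """
    qualifiers = [qualifier for qualifier, _value in sorted_cells]
    if min_inclusive:
        low = bisect_left(qualifiers, min_col)
    else:
        low = bisect_right(qualifiers, min_col)
    if max_inclusive:
        high = bisect_right(qualifiers, max_col)
    else:
        high = bisect_left(qualifiers, max_col)
    return sorted_cells[low:high]


if __name__ == "__main__":
    row = sorted([
        (b"o:03-27", b"open"),
        (b"c:03-20", b"click"),
        (b"o:04-02", b"open"),
        (b"o:03-20", b"open"),
    ])
    # All opens recorded in March: qualifiers in [o:03-, o:04-).
    print(column_range(row, b"o:03-", b"o:04-"))
```

Encoding the partition (here the date) in the qualifier rather than in a separate column family keeps the family count low, which is the point of the advice on this slide.
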
  29. Questions?
