Keynote: Apache HBase at Yahoo! Scale

  1. Apache HBase at Yahoo! Scale: Pushing the Limits. Francis Liu, HBase, Yahoo
  2. HBase @ [Yahoo! logo]
  3. HBase @ Y! • Hosted multi-tenant clusters • 3 Production • 3 Sandbox • HBase-only • Off-stage use cases • Internal 0.98 releases • Security [Architecture diagram: a compute cluster (Resource Mgr, Node Mgr, TaskTrackers, M/R tasks on DataNodes) and an HBase cluster (HBase Master, Zookeeper Quorum, Namenode, RegionServers co-located with DataNodes), accessed by HBase and MR clients via a gateway/launcher and by HTTP clients via a Rest Proxy]
  4. Workload Jungle
  5. Multi-tenancy
  6. Multi-tenancy at Scale • 35 tenants • 800 RegionServers • 300k regions • Peak of 115k requests/sec on a single RegionServer
  7. Divide and Conquer [Diagram: RegionServers partitioned into Groups A through E]
  8. RegionServer Groups • Group Membership • Table • RegionServer • Coarse Isolation • Group customization • Namespace integration
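The placement rule behind RegionServer groups can be sketched in a few lines: a table belongs to a group, and its regions may only be assigned to servers in that group, giving coarse isolation between tenants. This is an illustrative model, not the HBase rsgroup API; all class and server names are assumptions.

```python
# Sketch of coarse isolation via RegionServer groups: a table's regions
# may only land on servers that belong to the table's group.
class GroupBasedPlacement:
    def __init__(self):
        self.server_groups = {}   # server -> group name
        self.table_groups = {}    # table -> group name

    def add_server(self, server, group):
        self.server_groups[server] = group

    def move_table(self, table, group):
        self.table_groups[table] = group

    def candidate_servers(self, table):
        # Only servers in the table's group are assignment candidates.
        group = self.table_groups.get(table, "default")
        return sorted(s for s, g in self.server_groups.items() if g == group)

placement = GroupBasedPlacement()
placement.add_server("rs1:16020", "analytics")
placement.add_server("rs2:16020", "analytics")
placement.add_server("rs3:16020", "serving")
placement.move_table("clicks", "analytics")
print(placement.candidate_servers("clicks"))  # ['rs1:16020', 'rs2:16020']
```

Because assignment candidates never cross group boundaries, a hot tenant in one group cannot steal capacity from another, which is the "coarse isolation" the slide refers to.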
  9. Multi-tenancy at Scale • 800 RegionServers • 40 namespaces • 40 Region server groups • 4 to 100s of servers • Up to 2000+ regions per server • ~1 week rolling upgrade
  10. Scaling to 10’s of PBs (and Beyond) • Scale to Millions of Regions (and Beyond) • Avoid large regions • Data Locality • Network utilization • Datanode load • Performance
  11. Filesystem Layout • Region directories under table directory • HDFS data structure bottleneck • Namenode hard limit of ~6.7 million entries per directory
  12. Filesystem Layout [Chart: create-file operations for a 5M-region table]
  13. Filesystem Layout • Hierarchical table layout
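A hierarchical layout avoids putting millions of region directories directly under one table directory by inserting an intermediate bucket level. A minimal sketch, assuming the bucket is the first two hex characters of the encoded region name (the exact fan-out scheme is an assumption, not the implementation from the talk):

```python
import hashlib

FANOUT_CHARS = 2  # assumption: 16^2 = 256 buckets per table

def flat_region_path(ns, table, encoded_region):
    # Flat layout: every region directory is a direct child of the table dir.
    return f"/hbase/data/{ns}/{table}/{encoded_region}"

def hierarchical_region_path(ns, table, encoded_region):
    # Hierarchical layout: regions fan out under small bucket directories.
    bucket = encoded_region[:FANOUT_CHARS]
    return f"/hbase/data/{ns}/{table}/{bucket}/{encoded_region}"

# Encoded region names are hex digests, so prefix bucketing spreads evenly.
region = hashlib.md5(b"clicks,row-000042,1700000000000").hexdigest()
print(hierarchical_region_path("default", "clicks", region))
```

With 256 buckets, a 5M-region table holds roughly 20k directories per bucket instead of 5M under one parent, staying well clear of the Namenode's per-directory limit.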
  14. Performance Comparison (region directory creation time)
      Test         | 1M Regions      | 5M Regions      | 10M Regions
      Normal Table | 20 mins         | 4 hours 23 mins | DNF
      Humongous    | 15 mins 48 secs | 1 hour 27 mins  | 2 hours 53 mins
  15. ZK Region Assignment ▪ Lock thrashing ▪ ZK bottlenecks › List/mutate millions of znodes › Notification firehose ▪ State is kept in 3 places › Cached in master › Zookeeper › Meta [Diagram: Master, Zookeeper quorum, and RegionServers coordinating region assignment through Meta]
  16. ZKLess Region Assignment ▪ ZK no longer involved ▪ Master approves all assignments ▪ State is persisted only in Meta ▪ State is updated by the Master [Diagram: Master updating region state directly in the Meta region hosted on a RegionServer]
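The core of ZK-less assignment is that the master is the single writer and region state lives in exactly one place (hbase:meta) instead of three. A minimal sketch of that idea, with a dict standing in for the meta table; the class and method names are illustrative, not the real AssignmentManager:

```python
# Sketch of master-driven ("ZK-less") assignment: the master approves every
# transition, and region state is persisted in one place (a stand-in for
# hbase:meta), so no ZK znodes or watcher notifications are involved.
class Master:
    def __init__(self):
        self.meta = {}  # region -> (state, server); single source of truth

    def assign(self, region, server):
        # Master initiates the assignment and records it immediately.
        self.meta[region] = ("OPENING", server)

    def report_opened(self, region, server):
        # RegionServers report back; the master validates before updating,
        # so stale or rogue reports cannot corrupt the state.
        state, assigned = self.meta[region]
        if state == "OPENING" and assigned == server:
            self.meta[region] = ("OPEN", server)

master = Master()
master.assign("region-1", "rs1")
master.report_opened("region-1", "rs2")   # stale report from wrong server: ignored
master.report_opened("region-1", "rs1")
print(master.meta["region-1"])            # ('OPEN', 'rs1')
```

Because there is one writer and one persisted copy, the lock thrashing and cache/ZK/meta reconciliation problems from the previous slide disappear by construction.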
  17. Performance Comparison (assignment time for 1 million regions)
      Test              | Latency
      ZK                | 1 hr 16 mins
      ZK w/o force-sync | 11 mins
      ZKLess            | 11 mins
  18. Single Meta Region ▪ Meta not splittable ▪ Large compactions ▪ Longer failover times
  19. Splittable Meta Table ▪ Scale Horizontally › I/O load › Caching › RPC Load
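Once meta is splittable, a client lookup must first be routed to the meta region covering the queried row, so I/O, cache, and RPC load spread across every server hosting a meta region. A sketch of that routing under an assumed range partitioning (the split points and server names here are invented for illustration):

```python
import bisect

# Sketch: meta split into key ranges; a lookup binary-searches the sorted
# start keys to find the one meta region responsible for the row.
meta_start_keys = ["", "g", "p"]         # assumed split points: 3 meta regions
meta_servers = ["rs-a", "rs-b", "rs-c"]  # illustrative hosting servers

def meta_region_for(row_key):
    # Rightmost start key <= row_key owns the range containing the row.
    idx = bisect.bisect_right(meta_start_keys, row_key) - 1
    return idx, meta_servers[idx]

print(meta_region_for("clicks,row1"))  # (0, 'rs-a')
print(meta_region_for("users,row9"))   # (2, 'rs-c')
```

With 32 meta regions spread over 3 servers, as in the next slide's measurements, the meta scan is parallelized instead of bottlenecking on a single RegionServer.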
  20. Performance Comparison (assignment time for 3 million regions)
      Setup          | Scan Meta | Assignment | Total
      1 Meta / 1 RS  | 56 min    | 19.79 min  | 75.79 min
      1 Meta / 1 RS  | 58.63 min | 28.16 min  | 86.79 min
      32 Meta / 3 RS | 2.91 min  | 12.56 min  | 15.47 min
      32 Meta / 3 RS | 3.6 min   | 12.54 min  | 16.4 min
  21. Data Locality ▪ HDFS › Hadoop Distributed Filesystem ▪ Region Server › Serves Regions › Locality of a Region’s Data blocks
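"Locality of a Region's data blocks" can be made concrete as the fraction of the region's HFile blocks that have a replica on the RegionServer's own host. A small sketch of that metric (the function and data shapes are assumptions for illustration):

```python
# Sketch: a region's locality is the fraction of its HFile blocks whose
# replica hosts include the RegionServer's own host.
def region_locality(block_locations, rs_host):
    if not block_locations:
        return 1.0  # a region with no blocks is trivially local
    local = sum(1 for hosts in block_locations if rs_host in hosts)
    return local / len(block_locations)

# Each inner list is the set of DataNodes holding one block's replicas.
blocks = [["dn1", "dn2", "dn3"], ["dn2", "dn4", "dn5"], ["dn1", "dn4", "dn6"]]
print(region_locality(blocks, "dn1"))  # 2 of 3 blocks local -> 0.6666666666666666
```

Low locality means reads cross the network to remote DataNodes, which is exactly the network-utilization and performance cost called out on the earlier scaling slide.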
  22. Favored Nodes ▪ HDFS › Dictate block placement on file creation ▪ HBase › Partially completed in Apache HBase › Select 3 favored nodes per Region › 1 node on-rack, 2 nodes off-rack › Restrict Region Assignment
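The selection policy on this slide, one favored node on the region's primary rack and two on a remote rack, can be sketched as follows. This is a toy model of the policy only; rack names, tie-breaking, and the exact selection logic are assumptions, not the HBase implementation.

```python
import random

# Sketch of picking 3 favored nodes per region: one on the primary rack,
# two together on a single different rack, so a region can be reassigned
# to any favored node without losing block locality.
def pick_favored_nodes(nodes_by_rack, primary_rack, rng=random):
    primary = rng.choice(nodes_by_rack[primary_rack])
    # Candidate remote racks must hold both off-rack replicas.
    other_racks = [r for r, nodes in nodes_by_rack.items()
                   if r != primary_rack and len(nodes) >= 2]
    remote_rack = rng.choice(other_racks)
    secondary, tertiary = rng.sample(nodes_by_rack[remote_rack], 2)
    return [primary, secondary, tertiary]

racks = {"r1": ["dn1", "dn2"], "r2": ["dn3", "dn4", "dn5"]}
favored = pick_favored_nodes(racks, "r1", random.Random(7))
print(favored)
```

Restricting region assignment to these three nodes is what keeps locality high after failures: wherever the region lands among its favored nodes, its blocks are already there.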
  23. Favored Nodes – Fault Testing [Charts: locality under fault injection, Control vs. Favored Nodes]
  24. THANK YOU Icon Courtesy – iconfinder.com (under Creative Commons)