Trends in Supporting Production Apache HBase Clusters

Apache HBase is a distributed data store that is in production today at many enterprises and sites, serving large volumes of near-real-time random accesses. By supporting a wide range of production Apache HBase clusters with diverse use cases and sizes over the past year, we've noticed several new trends, learned lessons, and taken action to improve the HBase experience. We'll present aggregated root-cause statistics on resolved support tickets from the past year. The comparison with the previous year's statistics shows an interesting shift away from problems internal to HBase (splitting, repairs, recovery time) and toward user-inflicted problems, such as poor application-level architecture, that can be mitigated by tuning (bulk load, read/write latencies, and compaction policies). The talk will discuss several tuning tips used for a variety of production workloads running on HBase 0.92.x/0.94.x clusters with tens to hundreds of nodes, including settings and their justification for sizing clusters, tuning bulk loads, region counts, and memory settings. We'll also discuss recently added HBase features that alleviate these problems, including improved mean time to recovery, improved predictability, and improved performance.

Speaker notes:
  • HBase is a project that solves this problem. In a sentence, HBase is an open source, distributed, sorted map modeled after Google's BigTable. Open source: Apache HBase is an open source project with an Apache 2.0 license. Distributed: HBase is designed to use multiple machines to store and serve data. Sorted map: HBase stores data as a map, and guarantees that adjacent keys will be stored next to each other on disk. HBase is modeled after BigTable, a system that is used for hundreds of applications at Google.
  • This pie chart comes from analyzing critical production HBase tickets over the past 6 months: misconfig 44%, patch 12%, HW/NW 16%, repair 28%. Meaning that correcting a misconfiguration was all that it took to bring HBase back up again. As you can see, misconfigurations and bugs break the most HBase clusters. Fixing bugs is up to the community. Fixing misconfigurations is up to you, and is the focus of the next segment. Because it's hard to diagnose, a misconfiguration is not what you want to spend your time on. If your cluster is broken, it's probably a misconfiguration. This is a hard problem because the error messages are not tightly tied to the root cause.
  • Old edits: HBASE-6440. 0-length: HBASE-6443. Bad refs: HBASE-7199 (hbck). Invalid HFile: cosmic.
  • Hannibal helped a lot with identifying balance issues.

    1. Trends in Supporting Production Apache HBase Clusters
       Jonathan Hsieh | @jmhsieh | Software Engineer at Cloudera / HBase PMC Member
       Kevin O’Dell | kevin.odell@cloudera | Systems Engineer at Cloudera
       June 26, 2013
    2. Who are we?
       Jonathan Hsieh (Cloudera):
       • Software Engineer
       • Apache HBase committer / PMC
       • Apache Flume founder
       Kevin O’Dell (Cloudera):
       • Systems Engineer
       • Apache HBase contributor
       • Cloudera HBase Support Lead
    3. What is Apache HBase?
       Apache HBase is a reliable, column-oriented data store that provides consistent, low-latency, random read/write access.
       [Diagram: App and MR clients on top of HBase, which runs on ZK and HDFS]
       (A minimal client sketch follows.)
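To make "random read/write access" concrete, here is a minimal client sketch against the 0.92/0.94-era Java API; the table name, column family, and values are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseHello {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
    HTable table = new HTable(conf, "test_table");    // hypothetical table with family "cf"

    // Random write: a single-row Put.
    Put put = new Put(Bytes.toBytes("row1"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
    table.put(put);

    // Random read: a single-row Get.
    Result result = table.get(new Get(Bytes.toBytes("row1")));
    System.out.println(Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"))));

    table.close();
  }
}
```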
    4. HBase Architecture
       [Diagram: App and MR clients on top of HBase, which runs on ZK and HDFS]
       • HBase is designed to be fault tolerant and highly available
       • It depends on the systems beneath it being fault tolerant and highly available as well:
       • Replication for fault tolerance
       • Serve regions from any RegionServer
       • Failover HMasters
       • ZK quorums
       • HDFS block replication on DataNodes
    5. Trends Supporting HBase (from the trenches at Cloudera Customer Operations)
    6. Customers in 2011-12 vs. 2012-13
       0.90.x / CDH3 era:
       • Red Hat 5.x
       • Java JVM 1.6.13
       • 4-8 disk machines
       • 24-48 GB RAM
       • Dual 4-core HT
       • CDH3 (Apache HBase 0.90, Apache Hadoop 0.20.x)
       0.92.x/0.94.x / CDH4 era:
       • Red Hat 6.x
       • Java JVM 1.6.31
       • 12-15 disk machines
       • 48-96 GB RAM
       • Dual 6-core HT
       • CDH4 (Apache HBase 0.92/0.94, Apache Hadoop 2.0)
    7. Support Incidents 6/2011-6/2012
       • Patched bug: patch delivered, or fixed in the next version
       • Operational workaround: misconfiguration; schema design / tuning; hbck used to fix
       • Network/HW/OS: problems with underlying systems
       [Pie chart. 6/11-6/12, CDH3 / 0.90.x HBase support tickets: Workaround (config) 44%, Workaround (hbck) 28%, Net/HW/OS 16%, Patched 12%]
    8. Comparing 6/11-6/12 to 6/12-6/13
       [Pie charts. 6/11-6/12, CDH3 / 0.90.x HBase support tickets: Workaround (config) 44%, Workaround (hbck) 28%, Net/HW/OS 16%, Patched 12%. 6/12-6/13, CDH3+CDH4 HBase support tickets: Net/HW/OS 42%, Workaround (config/hbck, merged) 36%, Patched 14%, Documentation (new category) 8%. Annotations: the workaround slice is much smaller; the Net/HW/OS slice is bigger.]
    9. Comparing 2011 to 2012
       • The majority of customers upgraded to CDH4.
       • More customers, but a similar volume of support incidents.
       • CDH3’s largest trouble spots shrank significantly.
       • A larger number of issues due to underlying systems. This is actually a good thing!
       [Pie chart. 6/12-6/13, CDH3+CDH4 HBase support tickets: Net/HW/OS 42%, Workaround (config/hbck) 36%, Patched 14%, Documentation 8%]
    10. HBase Operations Challenges
    11. Operations pain points from 6/12 to 6/13
       • Hardware (Net/OS/HW)
       • Upgrade (0.90 -> 0.92)
       • HBase configuration
    12. Hardware / Network / Operating System
       • Leap second
       • Transparent Huge Pages
       • Bad 10GbE firmware
       [Pie chart. 6/12-6/13, CDH3+CDH4 HBase support tickets: Net/HW/OS 42%, Workaround (config/hbck) 36%, Bug 14%, Documentation 8%]
    13. Cloudera Manager (CM) system host checker
    14. Upgrade Issues
       • Old .edits (HBASE-6440)
       • 0-length HLogs (HBASE-6443)
       • Bad region refs (HBASE-7199)
       • Invalid HFile (Heisenbug)
       [Pie chart. 6/12-6/13, CDH3+CDH4 HBase support tickets: Net/HW/OS 42%, Workaround (config/hbck) 36%, Bug 14%, Documentation 8%]
    15. Upgrade Assistance
       • Parcels: simplified distribution; flexibility of install location; side-by-side installs for rolling upgrades
       • Rolling upgrades via CM: hot fixes; minor version upgrades
       • Automated tests for upgrades and compatibility
    16. Configuration / Feature
       • Continuous bulk load: avoid it and use Puts instead
       • Region tuning: updated defaults + CM
       • GC tuning: updated defaults + CM
       • Balancer: manual / custom tools
       • Bad schema: trial and error
       [Pie chart. 6/12-6/13, CDH3+CDH4 HBase support tickets: Net/HW/OS 42%, Workaround (config/hbck) 36%, Bug 14%, Documentation 8%]
       (A tuning sketch follows.)
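As an illustration of the region and memstore knobs involved, here is a minimal sketch using HBase's Java Configuration API. The keys are standard 0.92/0.94-era settings, but the values are illustrative assumptions, not recommendations:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class TuningSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();

    // Fewer, larger regions per RegionServer (value is an illustrative assumption).
    conf.setLong("hbase.hregion.max.filesize", 10L * 1024 * 1024 * 1024);

    // Cap the fraction of the heap used by all memstores combined.
    conf.setFloat("hbase.regionserver.global.memstore.upperLimit", 0.4f);

    // Minimum number of HFiles before a minor compaction is considered.
    conf.setInt("hbase.hstore.compactionThreshold", 3);

    System.out.println("hbase.hregion.max.filesize = " + conf.get("hbase.hregion.max.filesize"));
  }
}
```

In practice these values land in hbase-site.xml rather than application code; the Java form above is just a compact way to show the keys.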
    17. CM helps
       • Sanity checks on configurations
       • Wizard-based installation and setup
       • Wizard-based rolling upgrades (minor versions)
       • Wizard-based backup and disaster recovery strategies
    18. Configuration Management
    19. Support improvement wishlist
       • Improved “ergonomics”: better default configuration and guard rails (“I’m sorry Dave, I can’t let you do that”)
       • Improved error messaging: suggest likely root causes in logs; improve the log signal-to-noise ratio
       • Further improved ops tooling and frameworks for app development
    20. Good news
       • All bug fixes go into the Apache versions before CDH
       • HBase is maturing:
       • A higher percentage of incidents caused by the underlying OS/HW/NW
       • More performance- and tuning-oriented questions
       • A similar percentage of incidents caused by bugs
       • We’re getting better:
       • A lower percentage of incidents managed with workarounds
       • More tools in place to help operational support (hbck, CM, defaults)
       • We can still do better!
    21. Trends Developing HBase (getting rid of workarounds)
    22. Developer Community
       • Vibrant, highly active community!
       • We’re growing!
    23. Upstream Development Improvements for 0.95+
       • Improving usability
       • Improving reliability
       • Improving predictability
       [Pie chart. 6/12-6/13, CDH3+CDH4 HBase support tickets: Net/HW/OS 42%, Workaround (config/hbck) 36%, Patched 14%, Documentation 8%]
    24. Improving Usability: Metrics and Frameworks
    25. Usability Concerns
       • Administering HBase has been too hard:
       • Difficult to see what is happening in HBase
       • Easy to make bad design decisions early without realizing it
       • New developments:
       • Metrics revamp
       • HTrace
       • Frameworks for schema design
    26. Metrics Options
       • Cloudera Manager
       • OpenTSDB
       • Ganglia
       (Ganglia image from http://www.flickr.com/photos/hongiiv/)
    27. HTrace
       • Problem: where is time being spent inside HBase?
       • Solution: the HTrace framework
       • Inspired by Google Dapper
       • Threaded through HBase and HDFS
       • Tracks time spent in calls in a distributed system by tracking spans* on different machines
       *Some assembly still required. (A span sketch follows.)
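As a rough illustration, this is a minimal sketch of wrapping a unit of work in an HTrace span. It assumes the early standalone HTrace client API (the package moved between org.cloudera.htrace and org.apache.htrace across releases), and the span name and sampler choice are arbitrary:

```java
import org.apache.htrace.Sampler;
import org.apache.htrace.Trace;
import org.apache.htrace.TraceScope;

public class TraceSketch {
  public static void main(String[] args) {
    // Open a span; with Sampler.ALWAYS every request is traced.
    TraceScope scope = Trace.startSpan("exampleOperation", Sampler.ALWAYS);
    try {
      doWork(); // HBase/HDFS calls made here are attributed to this span
    } finally {
      scope.close(); // closing the scope ends the span
    }
  }

  private static void doWork() { /* ... */ }
}
```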
    28. HBase Schemas
       • HBase application developers must iterate to find a suitable HBase schema
       • Schema is critical for performance at scale
       • How can we make this easier? How can we reduce the expertise required to do this?
       • Today:
       • Lots of tuning knobs
       • Developers need to understand column families, rowkey design, data encoding, …
       • Some choices are expensive to change after the fact
    29. Row key design techniques
       • Numeric keys and lexicographic sort:
       • Store numbers big-endian.
       • Pad ASCII numbers with 0’s. (Unpadded keys sort as Row100, Row3, Row31; padded keys sort as Row003, Row031, Row100.)
       • Use reversal to put the most significant traits first:
       • Reverse URLs. (blog.cloudera.com, hbase.apache.org, strataconf.com vs. com.cloudera.blog, com.strataconf, org.apache.hbase)
       • Reverse timestamps to get most recent first: (MAX_LONG - ts), so “time” gets monotonically smaller.
       • Use composite keys so keys distribute nicely and work well with sub-scans. (Ex: User-ReverseTimeStamp)
       • Do not use the current timestamp as the first part of the row key!
       (A row key construction sketch follows.)
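A minimal sketch of these techniques using HBase's Bytes utility; the user/reverse-timestamp composite is the slide's example, and the concrete names and values are hypothetical:

```java
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeySketch {
  public static void main(String[] args) {
    // Big-endian numeric key: Bytes.toBytes(long) serializes big-endian,
    // so the byte-wise (lexicographic) sort matches the numeric sort.
    byte[] numericKey = Bytes.toBytes(31L);

    // Zero-padded ASCII numeric key: "Row003" < "Row031" < "Row100".
    byte[] paddedKey = Bytes.toBytes(String.format("Row%03d", 31));

    // Reverse timestamp so the newest row sorts (and scans) first.
    long ts = System.currentTimeMillis();
    byte[] reverseTs = Bytes.toBytes(Long.MAX_VALUE - ts);

    // Composite key: user first (groups rows by user for sub-scans),
    // reverse timestamp second (newest-first within each user).
    byte[] rowKey = Bytes.add(Bytes.toBytes("user42"), reverseTs);

    System.out.println(Bytes.toStringBinary(rowKey));
  }
}
```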
    31. Improving Reliability: MTTR
    32. Reliable / Highly Available
       • Reliable: ability to recover service if a component fails, without losing data.
       • Highly available: ability to quickly recover service if a component fails, without losing data.
       • Goal: minimize downtime!
    33. Mean Time To Recovery (MTTR)
       • Average time taken to automatically recover from a failure:
       • Detection time
       • Repair time
       • Notification time
       • Measure: HTrace (Dapper) infrastructure (0.96+)
       [Timeline: detect, repair, notify]
    34. Reduce Detection Time
       • Proactive notification of HMaster failure (0.95)
       • Proactive notification of RS failure (0.95)
       • Fast server failover (hardware)
       [Timeline: the detect phase shrinks]
       (A configuration sketch follows.)
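Before the proactive notifications above, detecting a dead RegionServer largely meant waiting for its ZooKeeper session to expire, so one common lever on detection time is the session timeout. A minimal sketch, assuming the standard zookeeper.session.timeout key; the value is illustrative, not a recommendation:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class DetectionSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // A lower value detects a dead RegionServer sooner, at the cost of
    // spurious session expirations during long GC pauses.
    conf.setInt("zookeeper.session.timeout", 30000); // milliseconds; illustrative value
  }
}
```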
    36. Reduce Recovery Time
       • Distributed log splitting (0.92)
       • Distributed log replay (0.95)
       • Fast write recovery (0.95)
       • Pristine read recovery (0.96+)
       [Timeline: the repair phase shrinks]
    38. Reduce Notification Time
       • Notify clients on recovery
       • Async client rewrite (0.96+)
       [Timeline: the notify phase shrinks]
    40. Improving Predictability: Compactions
    42. Reliable / Highly Available / Latency Tolerant
       • Reliable: ability to recover service if a component fails, without losing data.
       • Highly available: ability to quickly recover service if a component fails, without losing data.
       • Latency tolerant: ability to perform and recover in a predictable amount of time, without losing data.
       • New goal: predictable performance
    43. Common causes of performance variability
       • Compaction
       • Garbage collection
       • Locality loss
    44. Compaction
       • Compactions optimize read layout by rewriting files:
       • Reduce the seeks required to read a row
       • Improve random read performance
       • Age off expired or deleted data
       • Assumes a uniformly distributed write workload
       • But we have new workloads:
       • Continuous bulk load write pattern
       • Time-series write pattern
    45. Compactions: Put workload
       • Minor compactions: optimize a subset of adjacent files
       • Major compactions: optimize all files
       • Choosing:
       • Assume older files should be larger than newer files.
       • If “new” files are “larger” than “older” files, run a major compaction.
       • Else, look at newer files and select files for a minor compaction.
       [Diagram: a stream of newly flushed HFiles consumed by repeated minor compactions and an occasional major compaction]
       (A simplified selection sketch follows.)
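This size-based selection can be sketched in a few lines. The following is a simplified illustration inspired by the ratio-based compaction policy of that era, not the actual HBase implementation; the ratio and file sizes are made up, and real policies also enforce minimum/maximum file counts:

```java
import java.util.Arrays;
import java.util.List;

public class SelectionSketch {
  /**
   * Given HFile sizes ordered oldest-first, pick the start index of a minor
   * compaction: skip any old file that is larger than ratio * (sum of the
   * files newer than it), and compact the remaining suffix of newer files.
   */
  static int selectStart(List<Long> sizesOldestFirst, double ratio) {
    long sumNewer = 0;
    for (long s : sizesOldestFirst) sumNewer += s;
    int start = 0;
    for (long s : sizesOldestFirst) {
      sumNewer -= s; // sum of files strictly newer than this one
      if (s > ratio * sumNewer) start++; // too big relative to newer files: leave it alone
      else break;
    }
    return start;
  }

  public static void main(String[] args) {
    // Oldest-first sizes in MB: one huge old file, then similar-size flushes.
    List<Long> sizes = Arrays.asList(1000L, 60L, 50L, 40L);
    System.out.println("Minor-compact files from index "
        + selectStart(sizes, 1.2) + " onward"); // skips the 1000 MB file
  }
}
```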
    46. Compactions: Bulkload workload
       • Functionality for loading data en masse; intended for bootstrapping HBase tables
       • New write workload: frequently ingest data only via bulk load
       • Problem:
       • Breaks the age/size assumption!
       • Major compaction storms!
       • Compactions unnecessarily rewrite large files.
       [Diagram: newly bulk-loaded and newly flushed HFiles repeatedly triggering major compactions]
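For context, bulk loading typically means preparing HFiles with a MapReduce job and then handing them to the cluster. A minimal sketch of the handoff step using the 0.92/0.94-era client API; the HDFS path and table name are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "events"); // hypothetical table
    // Moves the prepared HFiles under /user/etl/hfiles into the table's regions.
    new LoadIncrementalHFiles(conf).doBulkLoad(new Path("/user/etl/hfiles"), table);
    table.close();
  }
}
```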
    47. Bulkload: Exploring Compactor
       • Explores all compaction possibilities
       • Chooses the minor compaction that reduces the number of files while incurring the least I/O
       • “The best bang for the buck”
       • Compaction workload is more manageable
       [Diagram: newly bulk-loaded and newly flushed HFiles handled by explored minor compactions]
       (An enabling-configuration sketch follows.)
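In 0.96+ the compaction policy for the default store engine is selectable via configuration; ExploringCompactionPolicy eventually became the default. A hedged sketch; the key and class name below are the 0.96-era ones and worth verifying against your release:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ExploringPolicySketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Select the exploring compaction policy for the default store engine (0.96+).
    conf.set("hbase.hstore.defaultengine.compactionpolicy.class",
        "org.apache.hadoop.hbase.regionserver.compactions.ExploringCompactionPolicy");
  }
}
```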
    48. Conclusions
    49. Comparing 6/11-6/12 to 6/12-6/13
       [Pie charts. 6/11-6/12, CDH3 / 0.90.x HBase support tickets: Workaround (config) 44%, Workaround (hbck) 28%, Net/HW/OS 16%, Patched 12%. 6/12-6/13, CDH3+CDH4 HBase support tickets: Net/HW/OS 42%, Workaround (config/hbck) 36%, Patched 14%, Documentation 8%]
       • Development and tooling efforts continue to reduce workarounds
       • Improved testing
       • HBase is becoming more robust
    50. Summary by Version
       0.90: requires HBase developer expertise • distributed log splitting* • metrics • recovery in hours
       0.92 / 0.94: requires HBase operational experience • true durability, consistency, performance • distributed log splitting • CF+region granularity metrics • recovery in minutes
       0.95-dev / 0.96: requires distributed-systems admin experience • MTTR, protobufs, snapshots, table locks • distributed log splitting, distributed log replay†, fast write recovery† • CF+region granularity metrics, improved failure detection time • recovery in seconds (for writes)
       0.98 / trunk: (predictability), (file block affinity†) • distributed log splitting, distributed log replay†, fast write recovery†, (pristine region read recovery) • CF+region granularity metrics, improved failure detection time, (HTrace) • recovery in seconds
       † experimental (in progress)   * backported in CDH
    51. Questions?
       @kevinrodell   @jmhsieh
