Hadoop Summit 2012 | Improving HBase Availability and Repair

Apache HBase is a rapidly-evolving random-access distributed data store built on top of Apache Hadoop's HDFS and Apache ZooKeeper. Drawing from real-world support experiences, this talk provides administrators insight into improving HBase's availability and recovering from situations where HBase is not available. We share tips on the common root causes of unavailability, explain how to diagnose them, and prescribe measures for ensuring maximum availability of an HBase cluster. We discuss new features that improve recovery time such as distributed log splitting as well as supportability improvements. We will also describe utilities including new failure recovery tools that we have developed and contributed that can be used to diagnose and repair rare corruption problems on live HBase systems.

  • This pie chart is the product of analyzing critical production HBase tickets over the past six months: misconfiguration 44%, patch required 12%, hardware/network 16%, repair needed 28%. "Misconfiguration" means that correcting a misconfiguration was all it took to bring HBase back up again. As you can see, misconfigurations and bugs break the most HBase clusters. Fixing bugs is up to the community; fixing misconfigurations is up to you, and it is the focus of the next segment. If your cluster is broken, it is probably a misconfiguration. This is a hard problem because the error messages are not tightly tied to the root cause, so diagnosis is not where you want to spend your time.
  • CDH3 goes GA 4/12/12
  • HDFS-2379
  • CDH4 GA 6/5/12
  • Tested under HBase
  • Transparent to clients
  • Coupled with HBase Master failover means no SPOF
  • Cause: disconnect a region server for a while, or kill -9 a region server. Why is recovery needed? All writes at a region server go to a single HLog, which can contain edits from multiple regions, and those regions may get reassigned to multiple other region servers. The HLog therefore needs to be split up, as sketched below.
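Conceptually, HLog splitting regroups one server's interleaved log into per-region edit sets that the regions' new hosts can replay. A minimal Java sketch of that regrouping, with hypothetical class and field names rather than HBase's real ones:

```java
// Minimal conceptual sketch of HLog ("WAL") splitting, not HBase's actual
// implementation: read every edit in a dead region server's single HLog
// and bucket it by region, so each region's new host can replay only its
// own edits. Class and field names here are hypothetical.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WalEdit {
    final String regionName;  // region the edit belongs to
    final byte[] payload;     // serialized key/values
    WalEdit(String regionName, byte[] payload) {
        this.regionName = regionName;
        this.payload = payload;
    }
}

public class LogSplitSketch {
    // Group a dead server's HLog entries by region. The real splitter
    // also writes each group to a per-region recovered-edits location
    // in HDFS and deals with partial or corrupt log entries.
    static Map<String, List<WalEdit>> splitByRegion(List<WalEdit> hlog) {
        Map<String, List<WalEdit>> perRegion = new HashMap<>();
        for (WalEdit edit : hlog) {
            perRegion.computeIfAbsent(edit.regionName, r -> new ArrayList<>())
                     .add(edit);
        }
        return perRegion;
    }
}
```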
  • region_mover.rb: move regions off a server, recording which regions were present; later restore the regions based on the recorded list. Still mostly a manual process.

Hadoop Summit 2012 | Improving HBase Availability and Repair Presentation Transcript

  • 1. Improving HBase Availability and Repair. Jeff Bean, Jonathan Hsieh {jwfbean,jon}@cloudera.com 6/13/12
  • 2. Who Are We?• Jeff Bean • Designated Support Engineer, Cloudera • Education Program Lead, Cloudera• Jonathan Hsieh • Software Engineer, Cloudera • Apache HBase Committer and PMC member Hadoop Summit 2012. 6/13/12 Copyright 2012 2 Cloudera Inc, All Rights Reserved
  • 3. What is Apache HBase? Apache HBase is a reliable, column-oriented data store that provides consistent, low-latency, random read/write access. Hadoop Summit 2012. 6/13/12 Copyright 2012 3 Cloudera Inc, All Rights Reserved
  • 4. Fault Tolerance vs. High Availability • Fault tolerant: the ability to recover service if a component fails, without losing data. • Highly available: the ability to quickly recover service if a component fails, without losing data. • Goal: minimize downtime! Hadoop Summit 2012. 6/13/12 Copyright 2012 4 Cloudera Inc, All Rights Reserved
  • 5. HBase Architecture• HBase is designed to be fault tolerant and highly available • It depends on the systems beneath it (ZK, HDFS) being fault tolerant and highly available as well.• Replication for fault tolerance • Serve regions from any region server • Failover HMasters • ZK quorums • HDFS block replication on DataNodes• But replication doesn't guarantee high availability • There can still be software or human faults Hadoop Summit 2012. 6/13/12 Copyright 2012 5 Cloudera Inc, All Rights Reserved
  • 6. Causes of HBase Downtime • Unplanned maintenance • Hardware failures • Software errors • Human error • Planned maintenance • Upgrades • Migrations (Chart: HBase downtime distribution, planned vs. unplanned.) Hadoop Summit 2012. 6/13/12 Copyright 2012 6 Cloudera Inc, All Rights Reserved
  • 7. Causes of Unexpected Maintenance Incidents • Misconfiguration • Metadata corruptions • Network / HW problems • SW problems (HBase, ZK, MR, HDFS) • Long recovery time (automated and manual) (Chart: Unplanned Maintenance, root cause from Cloudera Support: Misconfig Fix 44%, Repair Needed 28%, HW/NW 16%, Patch Required 12%.) Source: Cloudera's production HBase support tickets, CDH3's HBase 0.90.x, Hadoop 0.20.x/1.0.x. Hadoop Summit 2012. 6/13/12 Copyright 2012 7 Cloudera Inc, All Rights Reserved
  • 8. Outline• Where we were • HBase 0.90.x + Hadoop 0.20.x/1.0.x • Case Studies• Where we are today • HBase 0.92.x/0.94.x + Hadoop 2.0.x • Feature Summary• Where we are going • HBase 0.96.x + Hadoop 2.x • Feature Preview Hadoop Summit 2012. 6/13/12 Copyright 2012 8 Cloudera Inc, All Rights Reserved
  • 9. [T]here are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – there are things we do not know we don't know. —United States Secretary of Defense Donald Rumsfeld WHERE WE WERE: CASE STUDIES Hadoop Summit 2012. 6/13/12 Copyright 2012 9 Cloudera Inc, All Rights Reserved
  • 10. Best Practices to Avoid Hazards. BEST PRACTICES CAN PREVENT HBASE MISCONFIGURATIONS. (Chart: Unplanned Maintenance, root cause from Cloudera Support: Misconfig Fix 44%, Repair Needed 28%, HW/NW 16%, Patch Required 12%.) Source: Cloudera's production HBase support tickets, CDH3's HBase 0.90.x, Hadoop 0.20.x/1.0.x. Hadoop Summit 2012. 6/13/12 Copyright 2012 10 Cloudera Inc, All Rights Reserved
  • 11. Case #1: Memory Over-subscription. Misconfig: too many MR slots; MR slots too large; "arbitrary" processes on the node. Hazard: node A swaps; node A comes under load; processes pause or become unresponsive; node B can't connect to node A. Bad outcome: MapReduce tasks fail; HDFS datanode operations time out; HBase client operations fail; jobs fail or run slow. Masters take action: the JobTracker blacklists the TaskTracker; the NameNode re-replicates blocks from node A. (See the arithmetic sketch below.) Hadoop Summit 2012. 6/13/12 Copyright 2012 11 Cloudera Inc, All Rights Reserved
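To see why over-subscription happens, it helps to add up the worst-case committed memory on a slave node. A back-of-envelope Java sketch; all sizes below are illustrative assumptions, not recommendations:

```java
// Illustrative back-of-envelope check for memory over-subscription on a
// slave node running DataNode, TaskTracker, and RegionServer. All sizes
// are hypothetical examples; plug in your own configuration values.
public class MemoryBudget {
    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        long physicalRam      = 32 * gb;
        long datanodeHeap     = 1 * gb;
        long tasktrackerHeap  = 1 * gb;
        long regionServerHeap = 8 * gb;
        int  mapSlots = 10, reduceSlots = 6;
        long childHeap = 2 * gb;  // -Xmx per MapReduce child task

        // Worst case: every slot occupied by a task at full heap.
        long committed = datanodeHeap + tasktrackerHeap + regionServerHeap
                + (long) (mapSlots + reduceSlots) * childHeap;

        System.out.printf("Committed %.1f GB of %.1f GB physical RAM%n",
                (double) committed / gb, (double) physicalRam / gb);
        if (committed > physicalRam) {
            System.out.println("Over-subscribed: the node will swap under load.");
        }
    }
}
```

With these example numbers the node commits 42 GB against 32 GB of RAM, so a fully loaded slot configuration alone guarantees swapping.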
  • 12. Case #2, #3: Hazards of Abusing HDFS and ZK. Case #2, millions of HDFS files: misconfiguration (500,000 blocks per datanode) → SW bug (heartbeat thread blocks on IO) → bad outcome (region servers cannot access HDFS; HBase goes down). Case #3, millions of ZK znodes: bad practice (millions of ZK znodes, 400MB snapshot) → ZK fails to create new snapshots and fails → bad outcome (HBase goes down) → SW bug, worse outcome (HBase fails to restart). Hadoop Summit 2012. 6/13/12 Copyright 2012 12 Cloudera Inc, All Rights Reserved
  • 13. Case #4: Splitting Corruption from HW Failure. HW failure: network failure (takes out the NN) → region split incomplete → SW bug: recovery attempts to split → HBase has region inconsistencies (overlaps / holes) → multiple 6-hour manual repair sessions. Manual, slow, and requires an expert. Hadoop Summit 2012. 6/13/12 Copyright 2012 13 Cloudera Inc, All Rights Reserved
  • 14. Case #5: Slow Recovery from HW Failure. HW failure (network) and human error → RS loses HDFS; WAL recovery fails (SW error) → manual repairs → on restart, ROOT and .META. assign → 9-hour HLog splitting. Correct, but slow! Hadoop Summit 2012. 6/13/12 Copyright 2012 14 Cloudera Inc, All Rights Reserved
  • 15. Initial Lessons• Use best practices to avoid problems • Be conservative first • Avoid unstable features• What can we do? • Fix the bugs • Recover from problems faster • Make people smarter to avoid hazards and misconfigurations • Make software smarter to prevent hazards and misconfigurations Hadoop Summit 2012. 6/13/12 Copyright 2012 15 Cloudera Inc, All Rights Reserved
  • 16. In war, then, let your great object be victory, not lengthy campaigns. -- Sun Tzu WHERE WE ARE TODAY: HBASE 0.92.X + HADOOP 2.0.X Hadoop Summit 2012. 6/13/12 Copyright 2012 16 Cloudera Inc, All Rights Reserved
  • 17. Goal: Reduce unexpected downtime by recovering faster • Removing the SPOFs • HA HDFS • Faster recovery • Improved hbck • Distributed log splitting Hadoop Summit 2012. 6/13/12 Copyright 2012 17 Cloudera Inc, All Rights Reserved
  • 18. Problem: HDFS NN goes down under HBase• HBase depends on HDFS. • If HDFS is down, HBase goes down.• Ramifications: • Forces the recovery mechanism • Has caused some data corruptions• Ideally we avoid having to do recovery at all. Hadoop Summit 2012. 6/13/12 Copyright 2012 18 Cloudera Inc, All Rights Reserved
  • 19. HBase-HDFS HA Nodes. NameNode (active, metadata server) and NameNode (standby): active-standby hot failover. HMaster (region metadata) and HMaster (hot standby). ZooKeeper Quorum. HDFS DataNodes. HBase RegionServers. Hadoop Summit 2012. 6/13/12 Copyright 2012 19 Cloudera Inc, All Rights Reserved
  • 20. HBase-HDFS HA Nodes: Transparent to HBase. HMaster (region metadata) and HMaster (hot standby); NameNode (active). ZooKeeper Quorum. HDFS DataNodes. HBase RegionServers. Hadoop Summit 2012. 6/13/12 Copyright 2012 20 Cloudera Inc, All Rights Reserved
  • 21. HBase-HDFS HA Nodes: No more SPOF. HMaster (active); NameNode (active). ZooKeeper Quorum. HDFS DataNodes. HBase RegionServers. Hadoop Summit 2012. 6/13/12 Copyright 2012 21 Cloudera Inc, All Rights Reserved
  • 22. Recovery operations• If a network switch fails or if there is a power outage, HBase, ZK, and HA HDFS will fail; we will always still rely on recovery mechanisms.• Need to be able to quickly recover • Metadata invariants to fix metadata corruptions • Data consistency to restore ACID guarantees Hadoop Summit 2012. 6/13/12 Copyright 2012 22 Cloudera Inc, All Rights Reserved
  • 23. HBase Metadata Corruptions • Internal HBase metadata corruptions prevent HBase from starting and cause some regions to be unavailable. • Repairs are intricate and can cause extended periods of downtime. (Chart: Unplanned Maintenance, root cause from Cloudera Support: Misconfig Fix 44%, Repair Needed 28%, HW/NW 16%, Patch Required 12%.) Hadoop Summit 2012. 6/13/12 Copyright 2012 23 Cloudera Inc, All Rights Reserved
  • 24. HBase Metadata Invariants. Table integrity: every key shall get assigned to a single region; the regions [' ',A), [A,B), [B,C), [C,D), [D,E), [E,F), [F,G), [G,' ') tile the keyspace. Region consistency: metadata about a region should agree in HDFS (.regioninfo file), in META (regioninfo entry), and in region server assignment (region assigned to an RS). A sketch of the table-integrity check follows. Hadoop Summit 2012. 6/13/12 Copyright 2012 24 Cloudera Inc, All Rights Reserved
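The table-integrity invariant is easy to express in code: sort the regions by start key and check that their [start, end) ranges tile the keyspace. A toy Java sketch with hypothetical types; hbck's real check also reconciles META, HDFS, and assignments:

```java
// Conceptual table-integrity check: do the regions' [start, end) key
// ranges tile the keyspace with no holes or overlaps? This is a toy
// version of one invariant hbck verifies; names are hypothetical.
import java.util.Comparator;
import java.util.List;

class RegionInfo {
    final String startKey;  // "" means the beginning of the keyspace
    final String endKey;    // "" means the end of the keyspace
    RegionInfo(String startKey, String endKey) {
        this.startKey = startKey;
        this.endKey = endKey;
    }
}

public class TableIntegrityCheck {
    static boolean isConsistent(List<RegionInfo> regions) {
        regions.sort(Comparator.comparing(r -> r.startKey));
        String expectedStart = "";          // first region must start at ""
        for (RegionInfo r : regions) {
            int cmp = r.startKey.compareTo(expectedStart);
            if (cmp > 0) return false;      // hole in the keyspace
            if (cmp < 0) return false;      // overlap with previous region
            expectedStart = r.endKey;
        }
        return expectedStart.equals("");    // last region must end at ""
    }
}
```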
  • 25. Detecting and Repairing Corruption with hbck• HBase 0.90 hbck • Checks an HBase instance's internal invariants.• HBase hbck today • Checks, and can fix, problems in an HBase instance's internal invariants • 0.90.7, 0.92.2, 0.94.0 • CDH3u4, CDH4 Hadoop Summit 2012. 6/13/12 Copyright 2012 25 Cloudera Inc, All Rights Reserved
  • 26. Case #4 redux: Splitting Corruption. HW failure: network failure (takes out the NN) → region split incomplete → SW bug: recovery attempts to split → HBase has region inconsistencies (overlaps / holes) → multiple 6-hour manual repair sessions. Manual, slow, and requires an expert. Hadoop Summit 2012. 6/13/12 Copyright 2012 26 Cloudera Inc, All Rights Reserved
  • 27. Case #4 redux: Splitting Corruption. HW failure: network failure (takes out the NN) → region split incomplete → SW bug: recovery attempts to split → HBase has region inconsistencies (overlaps / holes) → automated repair tool (minutes). Fixes are quicker, and an operator can use the tool. Hadoop Summit 2012. 6/13/12 Copyright 2012 27 Cloudera Inc, All Rights Reserved
  • 28. Case #4 redux: Splitting Corruption. HW failure: network failure (takes out the NN) → region split incomplete → SW bug: recovery attempts to split → minor HBase inconsistencies (bad assignments) → automated repair tool (seconds). Fixed. Hadoop Summit 2012. 6/13/12 Copyright 2012 28 Cloudera Inc, All Rights Reserved
  • 29. Data Consistency• When a region server goes down, it tries to flush data in memory to HDFS.• If it cannot write to HDFS, it relies on the WAL/HLog.• Recovery via the HLog is vital to prevent data loss • Understand the write path. • Recovery: HLog splitting. • Faster Recovery: Distributed HLog splitting. Hadoop Summit 2012. 6/13/12 Copyright 2012 29 Cloudera Inc, All Rights Reserved
  • 30. Write Path (Put / Delete / Increment). (Diagram: an HBase client sends a Put to the Region Server; the server appends it to the HLog and applies it to the target HRegion's MemStore; MemStores flush into HStores.) Hadoop Summit 2012. 6/13/12 Copyright 2012 30 Cloudera Inc, All Rights Reserved
  • 31. Write Path (Put / Delete / Increment). Note: both regions write to the same HLog. (Diagram: two client Puts land in different HRegions' MemStores but are appended to the single shared HLog. A client-side example follows.) Hadoop Summit 2012. 6/13/12 Copyright 2012 31 Cloudera Inc, All Rights Reserved
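For orientation, here is what a client write looked like with the 0.9x-era API; every such put is appended to the region server's shared HLog before it reaches the MemStore. Table, family, and value names are made-up examples:

```java
// Minimal HBase 0.9x-era client write. Everything written this way is
// logged in the region server's HLog (WAL) before it is acknowledged.
// Table, family, and row names are hypothetical examples.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
        table.put(put);  // appended to the HLog, then to the MemStore
        table.close();
    }
}
```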
  • 32. Log Splitting HMaster RegionServer RegionServer RegionServer HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion mem mem mem mem mem mem Hadoop Summit 2012. 6/13/12 Copyright 2012 32 Cloudera Inc, All Rights Reserved
  • 33. Log Splitting HMaster RegionServer RegionServer RegionServer HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion mem mem mem mem mem mem Hadoop Summit 2012. 6/13/12 Copyright 2012 33 Cloudera Inc, All Rights Reserved
  • 34. Log Splitting HMaster HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 34 Cloudera Inc, All Rights Reserved
  • 35. Log Splitting Splitting log 1 HMaster HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 35 Cloudera Inc, All Rights Reserved
  • 36. Log Splitting Splitting log 2 HMaster HLog HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 36 Cloudera Inc, All Rights Reserved
  • 37. Log Splitting Splitting log 3 HMaster HLog HLog1 HLog HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 37 Cloudera Inc, All Rights Reserved
  • 38. Log Splitting Splitting log 100 HMaster HLog HLog HLog … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 38 Cloudera Inc, All Rights Reserved
  • 39. Log Splitting. HMaster: "Whew. I did a lot of splitting work. That took 9 hours!" HLog HLog HLog … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 39 Cloudera Inc, All Rights Reserved
  • 40. Log Splitting. HMaster: "RegionServers, here are your region assignments." RegionServer4 RegionServer5 RegionServer6 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 40 Cloudera Inc, All Rights Reserved
  • 41. Log Splitting Victory! HMaster RegionServer4 RegionServer5 RegionServer6 … HRegion HRegion HRegion HRegion HRegion HRegion mem mem mem mem mem mem Hadoop Summit 2012. 6/13/12 Copyright 2012 41 Cloudera Inc, All Rights Reserved
  • 42. Can we recover more quickly?• In the case study, this is all done serially by the master • The master took 9 hours to recover. • The 100 region server nodes were idle.• Let's use the idle machines to do the splitting in parallel (see the sketch after this slide)!• Distributed log splitting (HBASE-1364) • Introduced in 0.92.0 by Prakash Khemani (Facebook) • Included in CDH4 (0.92.1) • Backported to CDH3u3 (off by default) Hadoop Summit 2012. 6/13/12 Copyright 2012 42 Cloudera Inc, All Rights Reserved
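The idea is simple even though the real coordination happens through ZooKeeper: the master publishes one split task per HLog and idle region servers race to claim them. A conceptual Java sketch with threads standing in for region servers and an in-process queue standing in for ZooKeeper:

```java
// Conceptual sketch of distributed log splitting: the master enqueues one
// task per HLog and idle region servers (threads here) claim and split
// them in parallel. The real feature (HBASE-1364) coordinates the task
// queue through ZooKeeper, not an in-process queue.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DistributedSplitSketch {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> tasks = new LinkedBlockingQueue<>();
        for (int i = 1; i <= 100; i++) {
            tasks.add("hlog-" + i);          // master publishes split tasks
        }
        int workers = 100;                   // one per surviving region server
        Thread[] pool = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            pool[w] = new Thread(() -> {
                String log;
                while ((log = tasks.poll()) != null) {
                    split(log);              // each worker splits the logs it claims
                }
            });
            pool[w].start();
        }
        for (Thread t : pool) t.join();      // master waits, then assigns regions
    }

    static void split(String log) {
        // Placeholder: group the log's edits by region and write
        // per-region recovered-edits files (see the earlier sketch).
    }
}
```

With 100 logs and 100 workers, each log is split roughly in parallel, which is how the case study's 9 hours collapses to minutes.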
  • 43. Distributed Log Splitting I’m the boss. HMaster RegionServer RegionServer RegionServer HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion mem mem mem mem mem mem Hadoop Summit 2012. 6/13/12 Copyright 2012 43 Cloudera Inc, All Rights Reserved
  • 44. Distributed Log Splitting. HMaster: "There is a lot of splitting work here, let's split it up." HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 44 Cloudera Inc, All Rights Reserved
  • 45. Distributed Log Splitting You guys do the work for me. HMaster RegionServer4 RegionServer5 RegionServer6 HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 45 Cloudera Inc, All Rights Reserved
  • 46. Distributed Log Splitting You guys do the work for me. HMaster RegionServer4 RegionServer5 RegionServer6 HLog1 HLog2 HLog3 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 46 Cloudera Inc, All Rights Reserved
  • 47. Distributed Log Splitting Great, that took 5.4 minutes. HMaster RegionServer4 RegionServer5 RegionServer6 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 47 Cloudera Inc, All Rights Reserved
  • 48. Distributed Log Splitting. HMaster: "Good job, here are your region assignments." RegionServer4 RegionServer5 RegionServer6 … HRegion HRegion HRegion HRegion HRegion HRegion Hadoop Summit 2012. 6/13/12 Copyright 2012 48 Cloudera Inc, All Rights Reserved
  • 49. Distributed Log Splitting Like a Boss. HMaster RegionServer4 RegionServer5 RegionServer6 … HRegion HRegion HRegion HRegion HRegion HRegion mem mem mem mem mem mem Hadoop Summit 2012. 6/13/12 Copyright 2012 49 Cloudera Inc, All Rights Reserved
  • 50. Case #5 redux: Network failure and slow recovery. HW failure (network) and human error → RS loses HDFS; WAL recovery fails → manual repair → on restart, ROOT and .META. assign → 9-hour HLog splitting. Correct, but slow! Hadoop Summit 2012. 6/13/12 Copyright 2012 50 Cloudera Inc, All Rights Reserved
  • 51. Case #5 redux: Network failure and slow recovery. HW failure (network) and human error → RS loses HDFS; WAL recovery fails → automatic repairs → on restart, ROOT and .META. assign → 5.4-minute HLog splitting. Fixed! Correct, and faster! Hadoop Summit 2012. 6/13/12 Copyright 2012 51 Cloudera Inc, All Rights Reserved
  • 52. WHERE WE ARE GOING: HBASE 0.96 + HADOOP 2.X Hadoop Summit 2012. 6/13/12 Copyright 2012 52 Cloudera Inc, All Rights Reserved
  • 53. Themes • Minimizing planned downtime • Changing configurations • Online schema change (experimental in 0.92, 0.94) • Rolling restarts • Wire compatibility (Chart: HBase downtime distribution, planned vs. unplanned.) Hadoop Summit 2012. 6/13/12 Copyright 2012 53 Cloudera Inc, All Rights Reserved
  • 54. Table unavailable when changing schema• Changing table schema requires disabling the table • disable table, alter table schema, enable table (a sketch of this offline flow follows) • Schema includes compression, column families, caching, TTL, versions.• Goal: quickly change table and column configuration settings without having to disable HBase tables. • Feature: Online Schema Change (HBASE-1730) • Included in, but considered experimental in, HBase 0.92/0.94. • Contributed by Facebook Hadoop Summit 2012. 6/13/12 Copyright 2012 54 Cloudera Inc, All Rights Reserved
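The offline flow the slide describes looks like this with the 0.9x-era HBaseAdmin API (table and family names are made-up examples); the point of online schema change is to make the disable/enable steps unnecessary:

```java
// The classic offline schema change that online schema change
// (HBASE-1730) aims to avoid: the table is unavailable between
// disableTable() and enableTable(). 0.9x-era API; names are examples.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SchemaChangeExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        HColumnDescriptor cf = new HColumnDescriptor("cf");
        cf.setMaxVersions(5);               // e.g. keep more versions

        admin.disableTable("mytable");      // table goes offline here
        admin.modifyColumn("mytable", cf);  // apply the new schema
        admin.enableTable("mytable");       // ...and back online
    }
}
```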
  • 55. Changing Server Configs and Software Updates• A rolling restart upgrades an HBase cluster to a compatible version while keeping HBase available and serving data; a conceptual sketch of the procedure follows. • Handles server config changes. • Handles code changes such as hotfixes or compatible upgrades. Hadoop Summit 2012. 6/13/12 Copyright 2012 55 Cloudera Inc, All Rights Reserved
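A conceptual outline of that procedure; the real tooling is HBase's graceful_stop.sh wrapping region_mover.rb (mentioned in the notes above), and the method stubs here are hypothetical:

```java
// Conceptual rolling-restart loop: one region server at a time, so the
// cluster keeps serving. Real deployments use HBase's graceful_stop.sh,
// which wraps region_mover.rb; these method stubs are hypothetical.
import java.util.Collections;
import java.util.List;

public class RollingRestartSketch {
    public static void rollingRestart(List<String> regionServers) {
        for (String server : regionServers) {
            List<String> regions = unloadRegions(server); // record and move regions off
            restart(server);                              // apply config/software change
            loadRegions(server, regions);                 // move the same regions back
        }
        // Masters are restarted separately; the standby takes over meanwhile.
    }

    static List<String> unloadRegions(String server) { return Collections.emptyList(); } // stub
    static void restart(String server) {}                                                // stub
    static void loadRegions(String server, List<String> regions) {}                      // stub
}
```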
  • 56. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 56 Cloudera Inc, All Rights Reserved
  • 57. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 57 Cloudera Inc, All Rights Reserved
  • 58. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 58 Cloudera Inc, All Rights Reserved
  • 59. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 59 Cloudera Inc, All Rights Reserved
  • 60. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 60 Cloudera Inc, All Rights Reserved
  • 61. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 61 Cloudera Inc, All Rights Reserved
  • 62. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 62 Cloudera Inc, All Rights Reserved
  • 63. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 63 Cloudera Inc, All Rights Reserved
  • 64. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 64 Cloudera Inc, All Rights Reserved
  • 65. Rolling Restart Admin operations ZK Client Shell User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 65 Cloudera Inc, All Rights Reserved
  • 66. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 66 Cloudera Inc, All Rights Reserved
  • 67. Rolling Restart Admin operations ZK Client Shell HM1 User operations RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 67 Cloudera Inc, All Rights Reserved
  • 68. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 68 Cloudera Inc, All Rights Reserved
  • 69. Rolling Restart Admin operations ZK Client Shell HM1 User operations HM2 RS1 RS2 RS3 RS4 Internal operations Hadoop Summit 2012. 6/13/12 Copyright 2012 69 Cloudera Inc, All Rights Reserved
  • 70. Rolling restart limitations• There are limitations on rolling restarts • All servers and clients must be wire compatible • All must be able to read old data in the FS and ZK.• Ramifications: • Only minor version upgrades are possible • New features that change RPCs require custom compatibility shims. • Data format changes are not possible across minor versions. (Chart: Unplanned Maintenance, root cause from Cloudera Support: Misconfig Fix 44%, Repair Needed 28%, HW/NW 16%, Patch Required 12%.) Source: Cloudera's production HBase support tickets, CDH3's HBase 0.90.x, Hadoop 0.20.x/1.0.x. Hadoop Summit 2012. 6/13/12 Copyright 2012 70 Cloudera Inc, All Rights Reserved
  • 71. HBase Compatibility and Extensibility• Coming in HBase 0.96 • HBASE-5305 and friends• Goals: • Allow API changes and persistent data structure changes while guaranteeing compatibility between different minor versions (0.96.0 -> 0.96.1) • HBase client-server compatibility between major versions (0.96.x -> 0.98.x) Hadoop Summit 2012. 6/13/12 Copyright 2012 71 Cloudera Inc, All Rights Reserved
  • 72. HDFS Wire Compatibility• Here in HDFS 2.0.x • HADOOP-7347 and friends• Goals: • Allow API changes while guaranteeing wire compatibility between different minor versions • HDFS client-server compatibility between major versions. Hadoop Summit 2012. 6/13/12 Copyright 2012 72 Cloudera Inc, All Rights Reserved
  • 73. HDFS Wire Compatibility• Here in HDFS 2.0.x • HADOOP-7347 and friends• Goals: • Allow API changes while guaranteeing wire compatibility between different minor versions • HDFS client-server compatibility between major versions. Hadoop Summit 2012. 6/13/12 Copyright 2012 73 Cloudera Inc, All Rights Reserved
  • 74. CONCLUSIONS Hadoop Summit 2012. 6/13/12 Copyright 2012 74 Cloudera Inc, All Rights Reserved
  • 75. Improving how we handle the causes of downtime. (Diagram: the downtime distribution and root-cause charts annotated with remedies: best practices, hbck, hbck and distributed log splitting, and wire compatibility. Unplanned root causes: Misconfig Fix 44%, Repair Needed 28%, HW/NW 16%, Patch Required 12%.) Hadoop Summit 2012. 6/13/12 Copyright 2012 75 Cloudera Inc, All Rights Reserved
  • 76. jon@cloudera.com Twitter: @jmhsieh We're hiring! QUESTIONS? Hadoop Summit 2012. 6/13/12 Copyright 2012 76 Cloudera Inc, All Rights Reserved