Improving HBase Availability and Repair

Apache HBase is a rapidly-evolving random-access distributed data store built on top of Apache Hadoop's HDFS and Apache ZooKeeper. Drawing from real-world support experiences, this talk provides administrators with insight into improving HBase's availability and recovering from situations where HBase is not available. We share tips on the common root causes of unavailability, explain how to diagnose them, and prescribe measures for ensuring maximum availability of an HBase cluster. We discuss new features that improve recovery time, such as distributed log splitting, as well as supportability improvements. We also describe utilities, including new failure-recovery tools that we have developed and contributed, that can be used to diagnose and repair rare corruption problems on live HBase systems.

Improving HBase Availability and Repair: Presentation Transcript

  • 1. Improving HBase Availability and Repair. Jeff Bean, Jonathan Hsieh {jw2ean,jon}@cloudera.com. 6/13/12
  • 2. Who Are We? • Jeff Bean: Designated Support Engineer, Cloudera; Education Program Lead, Cloudera. • Jonathan Hsieh: Software Engineer, Cloudera; Apache HBase Committer and PMC member. Hadoop Summit 2012, 6/13/12. Copyright 2012 Cloudera Inc, All Rights Reserved.
  • 3. What is Apache HBase? Apache HBase is a reliable, column-oriented data store that provides consistent, low-latency, random read/write access.
  • 4. Fault Tolerance vs Highly Available • Fault tolerant: the ability to recover service if a component fails, without losing data. • Highly available: the ability to quickly recover service if a component fails, without losing data. • Goal: minimize downtime!
  • 5. HBase Architecture • HBase is designed to be fault tolerant and highly available; it depends on the systems around it (applications, MR, ZK, HDFS) to be as well. • Replication for fault tolerance: serve regions from any region server; failover HMasters; ZK quorums; HDFS block replication on DataNodes. • But replication doesn't guarantee high availability: there can still be software or human faults.
  • 6. Causes of HBase Downtime • Unplanned maintenance: hardware failures, software errors, human error. • Planned maintenance: upgrades, migrations. (Pie chart: distribution of HBase downtime between planned and unplanned maintenance.)
  • 7. Causes of Unexpected Maintenance Incidents • Misconfiguration • Metadata corruptions • Network/HW problems • SW problems • Long recovery time (automated and manual). (Chart: unplanned-maintenance root cause from Cloudera Support: misconfig 44%, repair needed 28%, fix HW/NW 16%, patch required 12%. Source: Cloudera's production HBase support tickets, CDH3's HBase 0.90.x, Hadoop 0.20.x/1.0.x.)
  • 8. Outline • Where we were: HBase 0.90.x + Hadoop 0.20.x/1.0.x; case studies. • Where we are today: HBase 0.92.x/0.94.x + Hadoop 2.0.x; feature summary. • Where we are going: HBase 0.96.x + Hadoop 2.x; feature preview.
  • 9. "[T]here are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns; there are things we do not know we don't know." —United States Secretary of Defense Donald Rumsfeld. WHERE WE WERE: CASE STUDIES
  • 10. Best Practices to Avoid Hazards • Best practices can prevent HBase misconfigurations, the largest slice of the root-cause chart. (Chart: misconfig 44%, repair needed 28%, fix HW/NW 16%, patch required 12%. Source: Cloudera's production HBase support tickets, CDH3's HBase 0.90.x, Hadoop 0.20.x/1.0.x.)
  • 11. Case #1: Memory Over-subscription Hazard • Misconfig: too many MR slots; MR slots too large; "arbitrary" processes. • Node A swaps: processes pause or become unresponsive; node A is under load; node B can't connect to node A. • Bad outcome: MapReduce tasks fail; HDFS datanode operations time out; HBase client operations fail; jobs fail or run slow. • Masters take action: JobTracker blacklists the TT; NameNode re-replicates blocks from node A.
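The over-subscription hazard in Case #1 is ultimately arithmetic: the sum of every task slot's heap plus every daemon's heap can exceed physical RAM. A minimal sketch of that check, with all slot counts, heap sizes, and daemon names being illustrative assumptions rather than values from the talk:

```python
# Hypothetical sketch: estimate worst-case committed memory on a worker node
# to catch the over-subscription misconfiguration. All numbers are made up.

def worst_case_memory_mb(map_slots, reduce_slots, task_heap_mb,
                         daemon_heaps_mb, os_reserve_mb=1024):
    """Sum the memory every process could use at once."""
    task_total = (map_slots + reduce_slots) * task_heap_mb
    return task_total + sum(daemon_heaps_mb.values()) + os_reserve_mb

# Example node: 24 GB RAM, 12 map + 6 reduce slots at 2 GB each,
# plus assumed DataNode, TaskTracker, and RegionServer heaps.
daemons = {"datanode": 1024, "tasktracker": 1024, "regionserver": 8192}
commit = worst_case_memory_mb(12, 6, 2048, daemons)
ram_mb = 24 * 1024
print(commit, commit > ram_mb)  # prints "48128 True": over-subscribed, node may swap
```

If the worst-case commit exceeds RAM, the kernel will eventually swap exactly as the slide describes, stalling the RegionServer and DataNode.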
  • 12. Case #2, #3: Hazards of Abusing HDFS and ZK • Case #2, bad practice: millions of HDFS files -> 500,000 blocks per datanode -> heartbeat thread blocks on IO (SW bug) -> RS cannot access HDFS -> HBase goes down (bad outcome). • Case #3, misconfiguration: millions of ZK znodes -> 400MB snapshot -> ZK fails to create new snapshots and fails (SW bug) -> HBase goes down (bad outcome) -> HBase fails to restart (SW bug, worse outcome).
  • 13. Case #4: Splitting Corruption from HW Failure • Network failure takes out the NN (HW failure) -> region split recovery incomplete (SW bug) -> HBase has region inconsistencies (overlaps/holes) -> multiple 6-hour manual repair sessions. Manual, slow, and requires an expert.
  • 14. Case #5: Slow Recovery from HW Failure • Network HW failure (human error) -> RS loses HDFS, WALs (SW error) -> 9-hour hlog splitting recovery -> on restart, Root and .META. assignment fails -> manual repairs. Correct but slow!
  • 15. Initial Lessons • Use best practices to avoid problems: be conservative first; avoid unstable features. • What can we do? Fix the bugs; recover from problems faster; make people smarter to avoid hazards and misconfigurations; make software smarter to prevent hazards and misconfigurations.
  • 16. "In war, then, let your great object be victory, not lengthy campaigns." -- Sun Tzu. WHERE WE ARE TODAY: HBASE 0.92.X + HADOOP 2.0.X
  • 17. Goal: Reduce unexpected downtime by recovering faster • Removing the SPOFs: HA HDFS. • Faster recovery: improved hbck; distributed log splitting.
  • 18. Problem: HDFS NN goes down under HBase • HBase depends on HDFS: if HDFS is down, HBase goes down. • Ramifications: forces the recovery mechanism; caused some data corruptions. • Ideally we avoid having to do recovery at all.
  • 19. HBase-HDFS HA Nodes (diagram: NameNode (active) and NameNode (standby), the metadata server with active-standby hot failover; HMaster (region metadata) and HMaster (hot standby); ZooKeeper quorum; HDFS DataNodes; HBase RegionServers)
  • 20. HBase-HDFS HA Nodes: Transparent to HBase (diagram: active NameNode; HMaster (region metadata) with hot standby; ZooKeeper quorum; HDFS DataNodes; HBase RegionServers)
  • 21. HBase-HDFS HA Nodes: No More SPOF (diagram: active NameNode; active HMaster; ZooKeeper quorum; HDFS DataNodes; HBase RegionServers)
  • 22. Recovery Operations • If a network switch fails or there is a power outage, HBase, ZK, and HA HDFS will fail; we will always still rely on recovery mechanisms. • Need to be able to quickly recover: metadata invariants to fix metadata corruptions; data consistency to restore ACID guarantees.
  • 23. HBase Metadata Corruptions • Internal HBase metadata corruptions prevent HBase from starting or cause some regions to be unavailable. • Repairs are intricate and can cause extended periods of downtime. (Chart: unplanned-maintenance root cause from Cloudera Support: misconfig 44%, repair needed 28%, fix HW/NW 16%, patch required 12%.)
  • 24. HBase Metadata Invariants • Table integrity: every key shall get assigned to a single region, e.g. [' ',A) [A,B) [B,C) [C,D) [D,E) [E,F) [F,G) [G,' '). • Region consistency: metadata about a region should agree across HDFS (.regioninfo in HDFS), META (regioninfo in META), and region server assignment (region assigned to RS).
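The table-integrity invariant above can be stated mechanically: sorted region [start, end) ranges must tile the keyspace with no holes and no overlaps. A hedged sketch of such a check (the tuple representation is an assumption, not HBase's actual data model; hbck performs the real version of this):

```python
# Sketch of the "table integrity" invariant: every row key falls in exactly
# one region. Regions are (start_key, end_key) pairs; '' marks the open
# boundary, and we assume only the first start and last end are ''.

def check_table_integrity(regions):
    """Return a list of problem descriptions; empty means the invariant holds."""
    problems = []
    regions = sorted(regions)          # sort by start key
    if not regions:
        return ["no regions at all"]
    if regions[0][0] != "":
        problems.append("first region does not start at ''")
    for (s1, e1), (s2, e2) in zip(regions, regions[1:]):
        if e1 < s2:
            problems.append(f"hole between {e1!r} and {s2!r}")
        elif e1 > s2:
            problems.append(f"overlap between {e1!r} and {s2!r}")
    if regions[-1][1] != "":
        problems.append("last region does not end at ''")
    return problems

print(check_table_integrity([("", "A"), ("B", "")]))  # reports the hole A..B
```

A healthy table such as `[("", "A"), ("A", "B"), ("B", "")]` yields an empty problem list; the overlaps and holes from Case #4 would show up as entries here.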
  • 25. Detecting and Repairing Corruption with hbck • HBase 0.90 hbck: checks an HBase instance's internal invariants. • HBase hbck today: checks and can fix problems in an HBase instance's internal invariants; available in 0.90.7, 0.92.2, 0.94.0; CDH3u4, CDH4.
  • 26. Case #4 Redux: Splitting Corruption • As before: network failure takes out the NN (HW failure) -> region split recovery incomplete (SW bug) -> HBase has region inconsistencies (overlaps/holes) -> multiple 6-hour manual repair sessions. Manual, slow, and requires an expert.
  • 27. Case #4 Redux: Splitting Corruption • With hbck: network failure (takes out NN) -> region split recovery incomplete (SW bug) -> HBase has region inconsistencies (overlaps/holes) -> automated repair tool (minutes). Fixes are quicker, and an operator can run them.
  • 28. Case #4 Redux: Splitting Corruption • With the SW bug fixed: network failure (takes out NN) -> region split recovery incomplete -> minor HBase inconsistencies (bad assignments) -> automated repair tool (seconds).
  • 29. Data Consistency • When a region server goes down, it tries to flush data in memory to HDFS. • If it cannot write to HDFS, it relies on the WAL/HLog. • Recovery via the HLog is vital to prevent data loss: understand the write path; recovery is HLog splitting; faster recovery is distributed HLog splitting.
  • 30. Write Path (Put / Delete / Increment) (diagram: the HBase client sends a Put to a Region Server; the put is appended to the HLog, then applied to the target HRegion's MemStore, backed by HStores)
  • 31. Write Path (Put / Delete / Increment) • Note: both regions write to the same HLog. (diagram: puts for two HRegions on one Region Server append to the shared HLog before landing in each region's MemStore)
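The write path in the two diagrams above can be condensed into a few lines: every put is appended to a single shared write-ahead log before it is applied to the per-region in-memory store, which is why replaying the log can rebuild a crashed server's state. A minimal sketch (these classes are illustrative, not HBase's actual implementation):

```python
# Toy model of the write path: WAL append first, memstore second, and
# recovery by replaying the WAL. All class/field names are assumptions.

class RegionServer:
    def __init__(self, region_names):
        self.hlog = []                                   # one WAL shared by all regions
        self.memstores = {r: {} for r in region_names}   # one memstore per region

    def put(self, region, key, value):
        self.hlog.append((region, key, value))           # 1. append to the HLog
        self.memstores[region][key] = value              # 2. then update the MemStore

def replay(hlog, region_names):
    """Rebuild the memstores of a crashed server from its surviving WAL."""
    fresh = RegionServer(region_names)
    for region, key, value in hlog:
        fresh.memstores[region][key] = value
    return fresh

rs = RegionServer(["r1", "r2"])
rs.put("r1", "a", 1)
rs.put("r2", "b", 2)                       # both regions share rs.hlog
recovered = replay(rs.hlog, ["r1", "r2"])  # same memstore contents as before the crash
```

Because the log interleaves edits for every region on the server, recovery must first sort ("split") the log per region, which is exactly the HLog splitting the next slides walk through.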
  • 32. Log Splitting (diagram: HMaster; RegionServers holding HLog1, HLog2, HLog3, ...; HRegions with memstores)
  • 33. Log Splitting (diagram: the RegionServers fail)
  • 34. Log Splitting (diagram: the HMaster is left with HLog1, HLog2, HLog3, ... and unassigned HRegions)
  • 35. Log Splitting • HMaster: "Splitting log 1."
  • 36. Log Splitting • HMaster: "Splitting log 2."
  • 37. Log Splitting • HMaster: "Splitting log 3."
  • 38. Log Splitting • HMaster: "Splitting log 100."
  • 39. Log Splitting • HMaster: "Whew. I did a lot of splitting work. That took 9 hours!"
  • 40. Log Splitting • HMaster: "RegionServers, here are your region assignments." (diagram: RegionServer4, RegionServer5, RegionServer6, ...)
  • 41. Log Splitting • HMaster: "Victory!" (diagram: regions reassigned and serving again with memstores)
  • 42. Can we recover more quickly? • In the case study, this is all done serially by the master: the master took 9 hours to recover while the 100 region server nodes sat idle. • Let's use the idle machines to do the splitting in parallel! • Distributed log splitting (HBASE-1364): introduced in 0.92.0 by Prakash Khemani (Facebook); included in CDH4 (0.92.1); backported to CDH3u3 (off by default).
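The serial-vs-distributed contrast above is a classic embarrassingly parallel workload: the logs form a task queue, and it does not matter whether the master drains it alone or the region servers drain it together, the resulting per-region edit groups are the same. A hedged toy sketch (the grouping function is illustrative, not HBase's splitter):

```python
# Sketch of the idea behind distributed log splitting: the same per-log work,
# done either serially by one "master" or in parallel by a pool of "workers".
from concurrent.futures import ThreadPoolExecutor

def split_one(log):
    """Toy 'split': group one HLog's edits by the region they belong to."""
    by_region = {}
    for region, edit in log:
        by_region.setdefault(region, []).append(edit)
    return by_region

def split_serially(logs):
    return [split_one(log) for log in logs]       # the 0.90 master, working alone

def split_distributed(logs, workers):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(split_one, logs))    # region servers share the queue

logs = [[("r1", "a"), ("r2", "b")], [("r1", "c")]]
assert split_serially(logs) == split_distributed(logs, workers=4)  # same result
```

With 100 logs and 100 workers the wall time collapses from the sum of all splits (the 9 hours on slide 39) to roughly the cost of one split, which matches the "5.4 minutes" on the next slides.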
  • 43. Distributed Log Splitting • HMaster: "I'm the boss." (diagram: HMaster; RegionServers with HLog1, HLog2, HLog3, ... and HRegions with memstores)
  • 44. Distributed Log Splitting • HMaster: "There is a lot of splitting work here, let's split it up."
  • 45. Distributed Log Splitting • HMaster: "You guys do the work for me." (diagram: RegionServer4, RegionServer5, RegionServer6 each take a log)
  • 46. Distributed Log Splitting • HMaster: "You guys do the work for me." (the region servers split the logs in parallel)
  • 47. Distributed Log Splitting • HMaster: "Great, that took 5.4 minutes."
  • 48. Distributed Log Splitting • HMaster: "Good job, here are your region assignments."
  • 49. Distributed Log Splitting • "Like a boss." (diagram: regions reassigned and serving again with memstores)
  • 50. Case #5 Redux: Network Failure and Slow Recovery • Network HW failure (human error) -> RS loses HDFS, WALs (SW error) -> 9-hour hlog splitting recovery -> on restart, Root and .META. assignment fails -> manual repair. Correct but slow!
  • 51. Case #5 Redux: Network Failure and Slow Recovery • Network HW failure (human error) -> RS loses HDFS, WALs (fixed!) -> 5.4-minute hlog splitting recovery -> on restart, Root and .META. assignment fails -> automatic repairs. Correct and faster!
  • 52. WHERE WE ARE GOING: HBASE 0.96 + HADOOP 2.X
  • 53. Themes • Minimizing planned downtime: changing configurations; online schema change (experimental in 0.92, 0.94); rolling restarts; wire compatibility. (Chart: HBase downtime distribution, planned vs unplanned.)
  • 54. Table Unavailable When Changing Schema • Changing a table's schema requires disabling the table: disable table, alter table schema, enable table. Schema includes compression, column families, caching, TTL, versions. • Goal: quickly change table and column configuration settings without having to disable HBase tables. • Feature: Online Schema Change (HBASE-1730); included in, but considered experimental in, HBase 0.92/0.94; contributed by Facebook.
  • 55. Changing Server Configs and Software Updates • Rolling restart is an operation for upgrading an HBase cluster to a compatible version while keeping HBase available and serving data. • Handles server config changes. • Handles code changes like hotfixes or compatible upgrades.
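The essence of the rolling restart that the following slides animate is an ordering constraint: restart one server at a time so that the rest of the cluster keeps serving throughout. A minimal sketch of that loop (server names and the `version` field are illustrative; a real rolling restart also drains regions and waits for health checks between steps):

```python
# Hedged sketch of a rolling restart: take down, upgrade, and restore one
# server at a time, so availability is reduced by at most one server.

def rolling_restart(servers, new_version):
    """servers: dict name -> {'up': bool, 'version': str}. Mutated in place.
    Returns how many servers were up while each one was being restarted."""
    up_counts = []
    for name in servers:
        servers[name]["up"] = False                       # stop exactly one server
        up_counts.append(sum(s["up"] for s in servers.values()))
        servers[name]["version"] = new_version            # upgrade it
        servers[name]["up"] = True                        # bring it back before moving on
    return up_counts

cluster = {f"rs{i}": {"up": True, "version": "0.92.0"} for i in range(4)}
up_log = rolling_restart(cluster, "0.92.1")
# At every step 3 of 4 servers stayed up, so the cluster never stopped serving.
```

The same loop applies to the HMasters at the end of the sequence, relying on the standby to take over while the active master restarts.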
  • 56. Rolling Restart (diagram: a client shell drives admin operations via ZK; HM1 active with HM2 standby; user operations go to RS1-RS4; internal operations connect them)
  • 57. Rolling Restart (RS1 taken down)
  • 58. Rolling Restart (RS1 restarted)
  • 59. Rolling Restart (RS2 taken down)
  • 60. Rolling Restart (RS2 restarted)
  • 61. Rolling Restart (RS3 taken down)
  • 62. Rolling Restart (RS3 restarted)
  • 63. Rolling Restart (RS4 taken down)
  • 64. Rolling Restart (RS4 restarted)
  • 65. Rolling Restart (HM1 taken down; HM2 serves)
  • 66. Rolling Restart (HM1 restarted)
  • 67. Rolling Restart (HM2 taken down)
  • 68. Rolling Restart (HM2 restarted)
  • 69. Rolling Restart (all servers upgraded and serving)
  • 70. Rolling Restart Limitations • There are limitations on rolling restarts: all servers and clients must be wire compatible, and all must be able to read old data in the FS and ZK. • Ramifications: only minor version upgrades are possible; new features that change RPCs require custom compatibility shims; data format changes are not possible across minor versions. (Chart: unplanned-maintenance root cause from Cloudera Support: misconfig 44%, repair needed 28%, fix HW/NW 16%, patch required 12%. Source: Cloudera's production HBase support tickets, CDH3's HBase 0.90.x, Hadoop 0.20.x/1.0.x.)
  • 71. HBase Compatibility and Extensibility • Coming in HBase 0.96: HBASE-5305 and friends. • Goals: allow API changes and persistent data structure changes while guaranteeing compatibility between different minor versions (0.96.0 -> 0.96.1); HBase client/server compatibility between major versions (0.96.x -> 0.98.x).
  • 72. HDFS Wire Compatibility • Here in HDFS 2.0.x: HADOOP-7347 and friends. • Goals: allow API changes while guaranteeing wire compatibility between different minor versions; HDFS client/server compatibility between major versions.
  • 73. HDFS Wire Compatibility (repeat of slide 72)
  • 74. CONCLUSIONS
  • 75. Improving How We Handle Causes of Downtime (the two downtime charts annotated with remedies: best practices, hbck, distributed log splitting, and wire compatibility, mapped onto the planned and unplanned causes; root-cause chart: misconfig 44%, repair needed 28%, fix HW/NW 16%, patch required 12%)
  • 76. QUESTIONS? jon@cloudera.com. Twitter: @jmhsieh. We're hiring!