
Improving Hadoop Cluster Performance via Linux Configuration

Administering a Hadoop cluster isn't easy. Many Hadoop clusters suffer from Linux configuration problems that can negatively impact performance. With vast and sometimes confusing config/tuning options, it can be tempting (and scary) for a cluster administrator to make changes to Hadoop when cluster performance isn't as expected. Learn how to improve Hadoop cluster performance and eliminate common problem areas, applicable across use cases, using a handful of simple Linux configuration changes.

Published in: Software

Improving Hadoop Cluster Performance via Linux Configuration

  1. Improving Hadoop Cluster Performance via Linux Configuration
     DevIgnition 2014 – Dulles, Virginia
     Alex Moundalexis // @technmsg
  2. Tips from a former system administrator
     © Cloudera, Inc. All rights reserved.
  3. Been there, done that.
     CC BY 2.0 / Richard Bumgardner
  4. Tips from a former system administrator field guy
  5. Home sweet home.
     CC BY 2.0 / Alex Moundalexis
  6. Tips
     Easy steps to take…
  7. Tips
     Easy steps to take… that most people don’t.
  8. What this talk isn’t about
     • Deploying
       • Puppet, Chef, Ansible, homegrown scripts, intern labor
     • Sizing & Tuning
       • Depends heavily on data and workload
     • Coding
       • Unless you count STDOUT redirection
     • Algorithms
       • I suck at math, but we’ll try some multiplication later
  9. “The answer to most Hadoop questions is…
  10. “The answer to most Hadoop questions is… it depends.”
  11. “The answer to most Hadoop questions is… it depends.”
      (helpful, right?)
  12. So what ARE we talking about?
      • Seven simple things
        • Quick
        • Safe
        • Viable for most environments and use cases
      • Identify issue, then offer solution
      • Note: Commands run as root or sudo
  13. 1. Swapping
      Bad news, best not to.
  14. Swapping
      • A form of memory management
      • When OS runs low on memory…
        • write blocks to disk
        • use now-free memory for other things
        • read blocks back into memory from disk when needed
      • Also known as paging
  15. Swapping
      • Problem: Disks are slow, especially to seek
      • Hadoop is about maximizing IO
        • spend less time acquiring data
        • operate on data in place
        • large streaming reads/writes from disk
      • Memory usage is somewhat limited within JVM
        • we should be able to manage our memory
        • account for JVM overhead
  16. Limit swapping in kernel
      • Well, as much as possible.
      • Immediate:
        # echo 1 > /proc/sys/vm/swappiness
      • Persist after reboot:
        # echo "vm.swappiness = 1" >> /etc/sysctl.conf
  17. Swapping peculiarities
      • Behavior varies based on Linux kernel
        • CentOS 6.4+ / Ubuntu 10.10+
        • For you kernel gurus, that’s Linux 2.6.32-303+
      • Prior
        • We don’t swap, except to avoid OOM condition.
      • After
        • We don’t swap, ever.
      • Details: http://tiny.cloudera.com/noswap
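The `>>` append on slide 16 adds another copy of the line every time it is run. A minimal sketch of an idempotent alternative follows. The helper name `persist_sysctl` and its optional file argument are hypothetical, introduced here only for illustration; they are not part of the talk.

```shell
#!/bin/sh
# persist_sysctl KEY VALUE [FILE]  (hypothetical helper, not from the talk)
# Writes "KEY = VALUE" to FILE (default /etc/sysctl.conf), dropping any
# earlier uncommented assignment of KEY so repeated runs leave one line.
persist_sysctl() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    [ -f "$file" ] || : > "$file"
    tmp=$(mktemp) || return 1
    # Keep every line that is not an assignment of KEY.
    awk -v k="$key" '{
        name = $0
        sub(/=.*/, "", name)                 # text before the first "="
        gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim surrounding whitespace
        if (index($0, "=") == 0 || name != k) print
    }' "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"                     # preserves FILE's permissions
    rm -f "$tmp"
}

# Example (as root): persist_sysctl vm.swappiness 1 && sysctl -p
```

Running it twice leaves a single `vm.swappiness = 1` line, and `sysctl -p` then loads the file without a reboot.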
  18. 2. File Access Time
      Disable this too.
  19. File access time
      • Linux tracks access time
        • writes to disk even if all you did was read
      • Problem
        • more disk seeks
        • HDFS is write-once, read-many
        • NameNode tracks access information for HDFS
  20. Don’t track access time
      • Mount volumes with noatime option
        • In /etc/fstab:
          /dev/sdc /data01 ext3 defaults,noatime 0
        • Note: noatime implies nodiratime as well
      • What about relatime?
        • Faster than atime but slower than noatime
      • No reboot required
        • # mount -o remount /data01
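After remounting, it is worth confirming that the option actually took effect. The sketch below reads a mounts table in the `/proc/mounts` layout (device, mount point, type, options). The function name `has_noatime` and the optional file argument are hypothetical, added so the check can be tried against a sample file.

```shell
#!/bin/sh
# has_noatime MOUNTPOINT [MOUNTS_FILE]  (hypothetical helper)
# Succeeds if the last entry for MOUNTPOINT lists the noatime option.
# MOUNTS_FILE defaults to /proc/mounts.
has_noatime() {
    mnt=$1
    mounts=${2:-/proc/mounts}
    awk -v m="$mnt" '
        $2 == m {
            found = 0                       # a later entry overrides an earlier one
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "noatime") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$mounts"
}

# Example: has_noatime /data01 || echo "/data01 still tracks access time"
```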
  21. 3. Root Reserved Space
      Reclaim it, impress your bosses!
  22. Root reserved space
      • EXT3/4 reserve 5% of disk for root-owned files
        • On an OS disk, sure
        • System logs, kernel panics, etc
  23. Disks used to be much smaller, right?
      CC BY 2.0 / Alex Moundalexis
  24. Do the math
      • Conservative
        • 5% of 1 TB disk = 46 GB
        • 5 data disks per server = 230 GB
        • 5 servers per rack = 1.15 TB
      • Quasi-Aggressive
        • 5% of 4 TB disk = 186 GB
        • 12 data disks per server = 2.23 TB
        • 18 servers per rack = 40.1 TB
      • That’s a LOT of unused storage!
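The per-disk figures on slide 24 appear to be in binary units (a 1 TB drive is roughly 931 GiB) and truncated to whole gigabytes. That reading is an assumption inferred from the numbers, not something the slide states. A small sketch that reproduces them, using a hypothetical function name:

```shell
#!/bin/sh
# reserved_gib DISK_TB [PERCENT]  (hypothetical helper)
# Prints the root-reserved space in whole GiB for a disk of DISK_TB
# decimal terabytes, assuming PERCENT (default 5) is reserved.
reserved_gib() {
    awk -v tb="$1" -v pct="${2:-5}" 'BEGIN {
        bytes = tb * 1e12 * pct / 100
        printf "%d\n", bytes / (1024 * 1024 * 1024)   # truncate to whole GiB
    }'
}

reserved_gib 1    # 1 TB disk
reserved_gib 4    # 4 TB disk
```

The two calls print 46 and 186. Multiplying by disks per server and servers per rack gives the remaining lines: 46 × 5 = 230 GB and 230 × 5 = 1,150 GB, then 186 × 12 = 2,232 GB and 2,232 × 18 = 40,176 GB.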
  25. Root reserved space
      • On a Hadoop data disk, no root-owned files
      • When creating a partition
        # mkfs.ext3 -m 0 /dev/sdc
      • On existing partitions
        # tune2fs -m 0 /dev/sdc
        • 0 is safe, 1 is for the ultra-paranoid
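To confirm the change, `tune2fs -l` prints the filesystem's block counts. The sketch below turns that output into a percentage. `reserved_pct` is a hypothetical helper name, and it assumes the `Block count:` and `Reserved block count:` lines that `tune2fs -l` prints for ext2/3/4 filesystems.

```shell
#!/bin/sh
# reserved_pct  (hypothetical helper)
# Reads `tune2fs -l DEVICE` output on stdin and prints the percentage of
# blocks reserved for root, to one decimal place.
reserved_pct() {
    awk -F: '
        $1 == "Block count"          { total = $2 + 0 }
        $1 == "Reserved block count" { reserved = $2 + 0 }
        END {
            if (total <= 0) exit 1          # counts not found in the input
            printf "%.1f\n", 100 * reserved / total
        }
    '
}

# Example (as root): tune2fs -l /dev/sdc | reserved_pct
```

A data disk created or tuned with `-m 0` should report 0.0.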
  26. 4. Name Service Cache
      Turn it on, already!
  27. Name Service Cache Daemon
      • Daemon that caches name service requests
        • Passwords
        • Groups
        • Hosts
      • Helps weather network hiccups
      • Helps more with high latency LDAP, NIS, NIS+
      • Small footprint
      • Zero configuration required
  28. Name Service Cache Daemon
      • Hadoop nodes
        • largely a network-based application
        • on the network constantly
        • issue lots of name lookups, especially HBase & distcp
        • can thrash name servers
      • Reducing latency of service requests? Smart.
      • Reducing impact on shared infrastructure? Smart.
  29. Name Service Cache Daemon
      • Turn it on, let it work, leave it alone:
        # chkconfig --level 345 nscd on
        # service nscd start
      • Check on it later:
        # nscd -g
      • Unless using Red Hat SSSD; modify nscd config first!
        • Don’t use nscd to cache passwd, group, or netgroup
        • Red Hat, Using NSCD with SSSD. http://goo.gl/68HTMQ
  30. 5. File Handle Limits
      Not a problem, until they are.
  31. File handle limits
      • Kernel refers to files via a handle
        • Also called descriptors
      • Linux is a multi-user system
      • File handle limits protect the system from
        • Poor coding
        • Malicious users
        • Poor coding of malicious users
        • Pictures of cats on the Internet
  32. java.io.FileNotFoundException: (Too many open files)
      Microsoft Office EULA. Really.
  33. File handle limits
      • Linux defaults usually not enough
      • Increase maximum open files (default 1024)
        # echo hdfs - nofile 32768 >> /etc/security/limits.conf
        # echo mapred - nofile 32768 >> /etc/security/limits.conf
        # echo hbase - nofile 32768 >> /etc/security/limits.conf
      • Bonus: Increase maximum processes too
        # echo hdfs - nproc 32768 >> /etc/security/limits.conf
        # echo mapred - nproc 32768 >> /etc/security/limits.conf
        # echo hbase - nproc 32768 >> /etc/security/limits.conf
      • Note: Cloudera Manager will do this for you.
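The six `echo` commands on slide 33 can be folded into a loop that skips lines already present, so re-running it does not pile up duplicates. This is a sketch only: `add_limit` and its optional file argument are hypothetical names, while the users, items, and value are the ones from the slide.

```shell
#!/bin/sh
# add_limit USER ITEM VALUE [FILE]  (hypothetical helper)
# Appends "USER - ITEM VALUE" to FILE (default /etc/security/limits.conf)
# unless that exact line is already there.
add_limit() {
    user=$1
    item=$2
    value=$3
    file=${4:-/etc/security/limits.conf}
    line="$user - $item $value"
    [ -f "$file" ] || : > "$file"
    # -x: whole-line match, -F: fixed string rather than a regex
    if ! grep -q -x -F -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (as root):
#   for u in hdfs mapred hbase; do
#       add_limit "$u" nofile 32768
#       add_limit "$u" nproc 32768
#   done
```

The `-` in the second column sets both the soft and the hard limit. The new values apply to sessions started after the change.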
  34. 6. Dedicated Disks
      Don’t be tempted to share, even with monster disks.
  35. The Situation
      1. Your new server has a dozen 1 TB disks
      2. Eleven disks are used to store data
      3. One disk is used for the OS
         • 20 GB for the OS
         • 980 GB sits unused
      4. Someone asks “can we store data there too?”
      5. Seems reasonable, lots of space… “OK, why not.”
      Sound familiar?
  36. “I don’t understand it, there’s no consistency to these run times!”
      Microsoft Office EULA. Really.
  37. No love for shared disk
      • Our quest for data gets interrupted a lot:
        • OS operations
        • OS logs
        • Hadoop logging, quite chatty
        • Hadoop execution
        • userspace execution
      • Disk seeks are slow, remember?
  38. Dedicated disk for OS and logs
      • At install time
        • Disk 0, OS & logs
        • Disk 1-n, Hadoop data
      • After install, more complicated effort, requires manual HDFS block rebalancing:
        1. Take down HDFS
           • If you can do it in under 10 minutes, just the DataNode
        2. Move or distribute blocks from disk0/dir to disk[1-n]/dir
        3. Remove dir from HDFS config (dfs.data.dir)
        4. Start HDFS
  39. 7. Name Resolution
      Sane, both forward and reverse.
  40. Name resolution options
      1. Hosts file, if you must
      2. DNS, much preferred
  41. Name resolution with hosts file
      • Set canonical names properly
      • Right
        10.1.1.1 r01m01.cluster.org r01m01 master1
        10.1.1.2 r01w01.cluster.org r01w01 worker1
      • Wrong
        10.1.1.1 r01m01 r01m01.cluster.org master1
        10.1.1.2 r01w01 r01w01.cluster.org worker1
  42. Name resolution with hosts file
      • Set loopback address properly
        • Ensure 127.0.0.1 resolves to “localhost,” NOT hostname
      • Right
        127.0.0.1 localhost
      • Wrong
        127.0.0.1 r01m01
  43. Name resolution with DNS
      • Forward
      • Reverse
      • Hostname should match the FQDN in DNS
  44. This is what you ought to see
  45. Name resolution errata
      • Mismatches? Expect odd results.
        • Problems starting DataNodes
        • Non-FQDN in Web UI links
        • Security features are extra sensitive to FQDN
      • Errors so common that link to FAQ is included in logs!
        • http://wiki.apache.org/hadoop/UnknownHost
      • Get name resolution working BEFORE enabling nscd!
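The two hosts-file rules from slides 41 and 42 can be checked mechanically. The sketch below is one possible linter, not a tool from the talk: `check_hosts` is a hypothetical name, it only inspects IPv4 entries, and it also accepts `localhost.localdomain` as the first loopback name, which is an assumption beyond what the slides say.

```shell
#!/bin/sh
# check_hosts [FILE]  (hypothetical helper)
# Flags IPv4 entries whose first name is not fully qualified, and a
# 127.0.0.1 entry whose first name is not localhost.
# Prints each offending line and exits non-zero if any were found.
check_hosts() {
    awk '
        /^[ \t]*#/ || NF < 2 { next }        # skip comments and blank lines
        $1 == "127.0.0.1" {
            if ($2 != "localhost" && $2 != "localhost.localdomain") {
                print "loopback: " $0
                bad = 1
            }
            next
        }
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && index($2, ".") == 0 {
            print "short name first: " $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "${1:-/etc/hosts}"
}

# Example: check_hosts /etc/hosts && echo "hosts file looks sane"
```

This covers only the hosts file. Forward and reverse DNS still need to be checked against the resolver itself.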
  46. Summary
      Now is the appropriate time to take out your camera phone.
  47. A white background is supposedly better for printing.
      (who prints things anymore?)
  48. A white background is supposedly better for printing.
      (but makes for very pale slides)
  49. Summary
      1. set vm.swappiness to 1
      2. data disks: mount with noatime option
      3. data disks: disable root reserve space
      4. enable nscd
      5. increase file handle limits
      6. use dedicated OS/logging disk
      7. sane name resolution
      http://tiny.cloudera.com/7steps
  50. Recommended reading
      • Hadoop Operations
        http://amzn.to/1ydMrLf
  51. Questions?
      Preferably related to the talk…
  52. Thanks!
      Alex Moundalexis | @technmsg
  53. 8. Bonus Round
      Because we have enough time (or I talked really fast)…
  54. Other things to check
      • Disk IO
        • hdparm
          • # hdparm -Tt /dev/sdc
          • Looking for at least 70 MB/s from 7200 RPM disks
          • Slower could indicate a failing drive, disk controller, array, etc.
        • dd
          • http://romanrm.ru/en/dd-benchmark
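The 70 MB/s rule of thumb can be applied to `hdparm` output automatically. The sketch below assumes `hdparm -Tt` reports a line of the form `Timing buffered disk reads: ... = NNN.NN MB/sec`. If your version formats it differently, the parsing will need adjusting. `disk_read_ok` is a hypothetical helper name.

```shell
#!/bin/sh
# disk_read_ok [MIN_MB_PER_SEC]  (hypothetical helper)
# Reads `hdparm -Tt DEVICE` output on stdin. Exits 0 if the buffered
# disk read rate is at least MIN (default 70), 1 if it is lower, and
# 2 if no rate could be found in the input.
disk_read_ok() {
    awk -v min="${1:-70}" '
        /buffered disk reads/ {
            # the rate is the field just before the "MB/sec" unit
            for (i = 2; i <= NF; i++)
                if ($i == "MB/sec") rate = $(i - 1)
        }
        END {
            if (rate == "") exit 2
            exit ((rate + 0 >= min + 0) ? 0 : 1)
        }
    '
}

# Example (as root): hdparm -Tt /dev/sdc | disk_read_ok 70 || echo "check /dev/sdc"
```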
  55. Other things to check
      • Disable Red Hat Transparent Huge Pages (RH6+ until 6.5)
        • Can reduce elevated CPU usage
        • In rc.local:
          echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
          echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
        • Reference: Linux 6 Transparent Huge Pages and Hadoop Workloads, http://goo.gl/WSF2qC
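To see which setting is currently active, read the same sysfs files back: the kernel marks the current choice in square brackets. A minimal sketch, with `thp_mode` as a hypothetical helper name:

```shell
#!/bin/sh
# thp_mode FILE  (hypothetical helper)
# Prints the active value from a THP sysfs file, where the kernel marks
# the current choice in brackets, e.g. "always madvise [never]".
thp_mode() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Example on RHEL 6 (path from the slide):
#   thp_mode /sys/kernel/mm/redhat_transparent_hugepage/enabled
# Upstream kernels use /sys/kernel/mm/transparent_hugepage/enabled instead.
```

After the `rc.local` lines have run, both files should report `never`.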
  56. Other things to check
      • Enable Jumbo Frames
        • Only if your network infrastructure supports it!
        • Can easily (and arguably) boost throughput by 10-20%
  57. Other things to check
      • Enable Jumbo Frames
        • Only if your network infrastructure supports it!
        • Can easily (and arguably) boost throughput by 10-20%
      • Monitor and Chart Everything
        • How else will you know what’s happening?
        • Nagios
        • Ganglia
  58. Questions?
      Preferably related to the talk…
  59. Thanks!
      Alex Moundalexis | @technmsg
