Improving Hadoop Performance via Linux

Administering a Hadoop cluster isn't easy. Many Hadoop clusters suffer from Linux configuration problems that can negatively impact performance. With vast and sometimes confusing config/tuning options, it can be tempting (and scary) for a cluster administrator to make changes to Hadoop when cluster performance isn't as expected. Learn how to improve Hadoop cluster performance and eliminate common problem areas, applicable across use cases, using a handful of simple Linux configuration changes.

Transcript

  • 1. Improving Hadoop Cluster Performance via Linux Configuration. 2014 Hadoop Summit, San Jose, California. Alex Moundalexis, alexm at clouderagovt.com, @technmsg
  • 2. Tips from a Former SA
  • 3. Been there, done that. (Photo: CC BY 2.0 / Richard Bumgardner)
  • 4. Tips from a Former SA Field Guy
  • 5. Home sweet home. (Photo: CC BY 2.0 / Alex Moundalexis)
  • 6. Tips from a Former SA Field Guy: Easy steps to take…
  • 7. Tips from a Former SA Field Guy: Easy steps to take… that most people don't.
  • 8. What This Talk Isn't About • Deploying • Puppet, Chef, Ansible, homegrown scripts, intern labor • Sizing & Tuning • Depends heavily on data and workload • Coding • Unless you count STDOUT redirection • Algorithms • I suck at math, but we'll try some multiplication later
  • 9. "The answer to most Hadoop questions is it depends."
  • 10. So What ARE We Talking About? • Seven simple things • Quick • Safe • Viable for most environments and use cases • Identify issue, then offer solution • Note: Commands run as root or sudo
  • 11. Bad news, best not to… 1. Swapping
  • 12. Swapping • A form of memory management • When OS runs low on memory… • write blocks to disk • use now-free memory for other things • read blocks back into memory from disk when needed • Also known as paging
  • 13. Swapping • Problem: Disks are slow, especially to seek • Hadoop is about maximizing IO • spend less time acquiring data • operate on data in place • large streaming reads/writes from disk • Memory usage is limited within JVM • we should be able to manage our memory
  • 14. Disable Swap in Kernel • Well, as much as possible. • Immediate: # echo 0 > /proc/sys/vm/swappiness • Persist after reboot: # echo "vm.swappiness = 0" >> /etc/sysctl.conf
  • 15. Swapping Peculiarities • Behavior varies based on Linux kernel • CentOS 6.4+ / Ubuntu 10.10+ • For you kernel gurus, that's Linux 2.6.32-303+ • Prior: we don't swap, except to avoid OOM condition. • After: we don't swap, ever. • Details: http://tiny.cloudera.com/noswap
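
A minimal sketch of how to confirm the change and check that the node isn't already swapping (verification commands assumed, not shown on the slides):

    # sysctl vm.swappiness        (should report 0)
    # swapon -s                   (lists any swap devices in use)
    # free -m                     (swap "used" should stay at or near 0)
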
  • 16. Disable this too. 2. File Access Time
  • 17. File Access Time • Linux tracks access time • writes to disk even if all you did was read • Problem • more disk seeks • HDFS is write-once, read-many • NameNode tracks access information for HDFS
  • 18. Don't Track Access Time • Mount volumes with noatime option • In /etc/fstab: /dev/sdc /data01 ext3 defaults,noatime 0 0 • Note: noatime assumes nodiratime as well • What about relatime? • Faster than atime but slower than noatime • No reboot required • # mount -o remount /data01
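
A quick check, assuming the /data01 mount point from the example above, that the remount picked up the option:

    # mount | grep /data01        (options should include noatime)
    # grep /data01 /proc/mounts   (same information, straight from the kernel)
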
  • 19. Reclaim it, impress your bosses! 3. Root Reserved Space
  • 20. Root Reserved Space • EXT3/4 reserve 5% of disk for root-owned files • On an OS disk, sure • System logs, kernel panics, etc.
  • 21. Disks used to be much smaller, right? (Photo: CC BY 2.0 / Alex Moundalexis)
  • 22. Do The Math • Conservative • 5% of 1 TB disk = 46 GB • 5 data disks per server = 230 GB • 5 servers per rack = 1.15 TB • Quasi-Aggressive • 5% of 4 TB disk = 186 GB • 12 data disks per server = 2.23 TB • 18 servers per rack = 40.1 TB • That's a LOT of unused storage!
  • 23. Root Reserved Space • On a Hadoop data disk, no root-owned files • When creating a partition: # mkfs.ext3 -m 0 /dev/sdc • On existing partitions: # tune2fs -m 0 /dev/sdc • 0 is safe, 1 is for the ultra-paranoid
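
A sketch of verifying the change, assuming /dev/sdc is mounted at /data01 as in the earlier examples:

    # tune2fs -l /dev/sdc | grep -i "reserved block count"   (should be 0)
    # df -h /data01                                          (available space grows accordingly)
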
  • 24. Turn it on, already! 4. Name Service Cache Daemon
  • 25. Name Service Cache Daemon • Daemon that caches name service requests • Passwords • Groups • Hosts • Helps weather network hiccups • Helps more with high-latency LDAP, NIS, NIS+ • Small footprint • Zero configuration required
  • 26. Name Service Cache Daemon • Hadoop nodes • largely a network-based application • on the network constantly • issue lots of DNS lookups, especially HBase & distcp • can thrash DNS servers • Reducing latency of service requests? Smart. • Reducing impact on shared infrastructure? Smart.
  • 27. Name Service Cache Daemon • Turn it on, let it work, leave it alone: # chkconfig --level 345 nscd on # service nscd start • Check on it later: # nscd -g • Unless using Red Hat SSSD; modify nscd config first! • Don't use nscd to cache passwd, group, or netgroup • Red Hat, Using NSCD with SSSD: http://goo.gl/68HTMQ
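
For the SSSD case, the caveat amounts to editing /etc/nscd.conf so nscd caches hosts only, then restarting it. A minimal sketch; treat the Red Hat document linked above as the authoritative reference, and add a matching netgroup line if your glibc's nscd caches netgroups:

    enable-cache passwd no
    enable-cache group  no
    enable-cache hosts  yes

    # service nscd restart
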
  • 28. Not a problem, until they are. 5. File Handle Limits
  • 29. File Handle Limits • Kernel refers to files via a handle • Also called descriptors • Linux is a multi-user system • File handles protect the system from • Poor coding • Malicious users • Pictures of cats on the Internet
  • 30. Microsoft Office EULA. Really. java.io.FileNotFoundException: (Too many open files)
  • 31. File Handle Limits • Linux defaults usually not enough • Increase maximum open files (default 1024): # echo hdfs - nofile 32768 >> /etc/security/limits.conf # echo mapred - nofile 32768 >> /etc/security/limits.conf # echo hbase - nofile 32768 >> /etc/security/limits.conf • Bonus: Increase maximum processes too: # echo hdfs - nproc 32768 >> /etc/security/limits.conf # echo mapred - nproc 32768 >> /etc/security/limits.conf # echo hbase - nproc 32768 >> /etc/security/limits.conf • Note: Cloudera Manager will do this for you.
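
limits.conf is applied through PAM at login, so a sanity check (a sketch; the forced bash shell is an assumption, since service accounts often have a nologin shell) is to open a login shell as one of the accounts:

    # su -s /bin/bash - hdfs -c 'ulimit -n'   (expect 32768)
    # su -s /bin/bash - hdfs -c 'ulimit -u'   (expect 32768)
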
  • 32. Don't be tempted to share, even on monster disks. 6. Dedicated Disk for OS and Logs
  • 33. The Situation in Easy Steps 1. Your new server has a dozen 1 TB disks 2. Eleven disks are used to store data 3. One disk is used for the OS • 20 GB for the OS • 980 GB sits unused 4. Someone asks "can we store data there too?" 5. Seems reasonable, lots of space… "OK, why not." Sound familiar?
  • 34. Microsoft Office EULA. Really. I don't understand it, there's no consistency to these run times!
  • 35. No Love for Shared Disk • Our quest for data gets interrupted a lot: • OS operations • OS logs • Hadoop logging, quite chatty • Hadoop execution • userspace execution • Disk seeks are slow, remember?
  • 36. Dedicated Disk for OS and Logs • At install time • Disk 0, OS & logs • Disks 1-n, Hadoop data • After install, more complicated effort, requires manual HDFS block rebalancing: 1. Take down HDFS • If you can do it in under 10 minutes, just the DataNode 2. Move or distribute blocks from disk0/dir to disk[1-n]/dir 3. Remove dir from HDFS config (dfs.data.dir) 4. Start HDFS
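
Step 3 trims the DataNode's data-directory list in hdfs-site.xml; a hypothetical sketch with the OS disk's directory removed (the paths and directory count here are examples only, and the property is named dfs.datanode.data.dir on Hadoop 2):

    <property>
      <name>dfs.data.dir</name>
      <value>/data01/dfs/dn,/data02/dfs/dn,/data03/dfs/dn</value>
    </property>
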
  • 37. Sane, both forward and reverse. 7. Name Resolution
  • 38. Name Resolution Options 1. Hosts file, if you must 2. DNS, much preferred
  • 39. Name Resolution with Hosts File • Set canonical names properly • Right: 10.1.1.1 r01m01.cluster.org r01m01 master1 10.1.1.2 r01w01.cluster.org r01w01 worker1 • Wrong: 10.1.1.1 r01m01 r01m01.cluster.org master1 10.1.1.2 r01w01 r01w01.cluster.org worker1
  • 40. Name Resolution with Hosts File • Set loopback address properly • Ensure 127.0.0.1 resolves to localhost, NOT hostname • Right: 127.0.0.1 localhost • Wrong: 127.0.0.1 r01m01
  • 41. Name Resolution with DNS • Forward • Reverse • Hostname should MATCH the FQDN in DNS
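
A minimal sketch of the checks, reusing the example worker from the hosts-file slides; the FQDN the host reports, the forward lookup, and the reverse lookup should all agree:

    # hostname -f                        (expect r01w01.cluster.org)
    # getent hosts r01w01.cluster.org    (forward, via the resolver stack)
    # host r01w01.cluster.org            (forward, straight from DNS)
    # host 10.1.1.2                      (reverse, should return the same FQDN)
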
  • 42. This Is What You Ought to See
  • 43. Name Resolution Errata • Mismatches? Expect odd results. • Problems starting DataNodes • Non-FQDN in Web UI links • Security features are extra sensitive to FQDN • Errors so common that a link to the FAQ is included in the logs! • http://wiki.apache.org/hadoop/UnknownHost • Get name resolution working BEFORE enabling nscd!
  • 44. Time to take out your camera phones… Summary
  • 45. Summary 1. disable vm.swappiness 2. data disks: mount with noatime option 3. data disks: disable root reserved space 4. enable nscd 5. increase file handle limits 6. use dedicated OS/logging disk 7. sane name resolution • http://tiny.cloudera.com/7steps
  • 46. Recommended Reading • Hadoop Operations, http://amzn.to/1hDaN9B
  • 47. Preferably related to the talk… Questions?
  • 48. Thank You! Alex Moundalexis, alexm at clouderagovt.com, @technmsg • We're hiring, kids! Well, not kids.
  • 49. Because we had enough time… 8. Bonus Round
  • 50. Other Things to Check • Disk IO • hdparm • # hdparm -Tt /dev/sdc • Looking for at least 70 MB/s from 7200 RPM disks • Slower could indicate a failing drive, disk controller, array, etc. • dd • http://romanrm.ru/en/dd-benchmark
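
The dd link above describes a simple sequential-write test; a minimal sketch with a hypothetical scratch file on a data disk (conv=fdatasync forces the data to disk so the reported MB/s is honest):

    # dd if=/dev/zero of=/data01/dd.test bs=1M count=1024 conv=fdatasync
    # rm /data01/dd.test
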
  • 51. Other Things to Check • Disable Red Hat Transparent Huge Pages (RH6+ only) • Can reduce elevated CPU usage • In rc.local: echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled • Reference: Linux 6 Transparent Huge Pages and Hadoop Workloads, http://goo.gl/WSF2qC
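
A quick check after the next boot, using the same RHEL 6 sysfs path as the slide; the active setting appears in brackets and should read [never] in both files:

    # cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
    # cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
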
  • 52. Other Things to Check • Enable Jumbo Frames • Only if your network infrastructure supports it! • Can easily (and arguably) boost throughput by 10-20%
  • 53. Other Things to Check • Enable Jumbo Frames • Only if your network infrastructure supports it! • Can easily (and arguably) boost throughput by 10-20% • Monitor Everything • How else will you know what's happening? • Nagios, Ganglia, CM, Ambari
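
A hedged sketch of enabling a 9000-byte MTU on a RHEL-style node and verifying the path end to end; the interface name and peer host are hypothetical, and every switch and NIC in between must support jumbo frames, per the slide's warning:

    # ip link set dev eth0 mtu 9000
    # echo 'MTU="9000"' >> /etc/sysconfig/network-scripts/ifcfg-eth0   (persist across reboots)
    # ping -M do -s 8972 r01w01.cluster.org                            (don't-fragment; 8972 + headers = 9000)
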
  • 54. Thank You! Alex Moundalexis, alexm at clouderagovt.com, @technmsg • We're hiring, kids! Well, not kids.