Improving Hadoop Performance via Linux

Administering a Hadoop cluster isn't easy. Many Hadoop clusters suffer from Linux configuration problems that can negatively impact performance. With vast and sometimes confusing config/tuning options, it can be tempting (and scary) for a cluster administrator to make changes to Hadoop when cluster performance isn't as expected. Learn how to improve Hadoop cluster performance and eliminate common problem areas, applicable across use cases, using a handful of simple Linux configuration changes.

Published in: Technology

Transcript of "Improving Hadoop Performance via Linux"

  1. Improving Hadoop Cluster Performance via Linux Configuration
     2014 Hadoop Summit – San Jose, California
     Alex Moundalexis, @technmsg
  2. Tips from a Former SA
  3. Been there, done that. (Photo: CC BY 2.0 / Richard Bumgardner)
  4. Tips from a Former SA Field Guy
  5. Home sweet home. (Photo: CC BY 2.0 / Alex Moundalexis)
  6. Tips from a Former SA Field Guy
     Easy steps to take…
  7. Tips from a Former SA Field Guy
     Easy steps to take… that most people don’t.
  8. What This Talk Isn’t About
     • Deploying
       • Puppet, Chef, Ansible, homegrown scripts, intern labor
     • Sizing & Tuning
       • Depends heavily on data and workload
     • Coding
       • Unless you count STDOUT redirection
     • Algorithms
       • I suck at math, but we’ll try some multiplication later
  9. “The answer to most Hadoop questions is it depends.”
  10. So What ARE We Talking About?
      • Seven simple things
      • Quick
      • Safe
      • Viable for most environments and use cases
      • Identify issue, then offer solution
      • Note: Commands run as root or sudo
  11. 1. Swapping
      Bad news, best not to…
  12. Swapping
      • A form of memory management
      • When OS runs low on memory…
        • write blocks to disk
        • use now-free memory for other things
        • read blocks back into memory from disk when needed
      • Also known as paging
  13. Swapping
      • Problem: Disks are slow, especially to seek
      • Hadoop is about maximizing IO
        • spend less time acquiring data
        • operate on data in place
        • large streaming reads/writes from disk
      • Memory usage is limited within JVM
        • we should be able to manage our memory
  14. Disable Swap in Kernel
      • Well, as much as possible.
      • Immediate:
          # echo 0 > /proc/sys/vm/swappiness
      • Persist after reboot:
          # echo "vm.swappiness = 0" >> /etc/sysctl.conf
  15. Swapping Peculiarities
      • Behavior of vm.swappiness = 0 varies based on Linux kernel
      • CentOS 6.4+ / Ubuntu 10.10+
      • For you kernel gurus, that’s Linux 2.6.32-303+
      • Prior
        • We don’t swap, except to avoid OOM condition.
      • After
        • We don’t swap, ever.
      • Details: http://tiny.cloudera.com/noswap
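The persist command on slide 14 appends a new line every time it is run. A minimal sketch of an idempotent variant follows; the function names `current_swappiness` and `persist_swappiness` are hypothetical, and the optional path argument exists only so the functions can be pointed at a scratch file instead of the real system files.

```shell
# Print the current swappiness value.
# $1 (optional): file to read, defaults to /proc/sys/vm/swappiness.
current_swappiness() (
  cat "${1:-/proc/sys/vm/swappiness}"
)

# Append "vm.swappiness = 0" to a sysctl-style file unless a
# vm.swappiness line is already present. An existing line with a
# different value is left untouched, so review the file by hand
# in that case.
# $1 (optional): file to edit, defaults to /etc/sysctl.conf.
persist_swappiness() (
  conf="${1:-/etc/sysctl.conf}"
  if ! grep -Eq '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf" 2>/dev/null; then
    echo "vm.swappiness = 0" >> "$conf"
  fi
)
```

Run as root, `persist_swappiness` with no argument targets `/etc/sysctl.conf`; running it twice leaves a single entry.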
  16. 2. File Access Time
      Disable this too.
  17. File Access Time
      • Linux tracks access time
        • writes to disk even if all you did was read
      • Problem
        • more disk seeks
        • HDFS is write-once, read-many
        • NameNode tracks access information for HDFS
  18. Don’t Track Access Time
      • Mount volumes with noatime option
      • In /etc/fstab:
          /dev/sdc /data01 ext3 defaults,noatime 0 0
      • Note: noatime implies nodiratime as well
      • What about relatime?
        • Faster than atime but slower than noatime
      • No reboot required
          # mount -o remount /data01
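To find data volumes that still track access time, the fstab entries can be scanned for a missing `noatime` option. The sketch below is one way to do that; the function name `mounts_without_noatime` is hypothetical, and it assumes the data disks are ext3 or ext4 as in the slides.

```shell
# List mount points of ext3/ext4 entries whose options lack noatime.
# $1 (optional): fstab-style file, defaults to /etc/fstab.
mounts_without_noatime() (
  awk '
    # Skip comments and short lines; wrap the option list in commas so
    # "noatime" only matches as a whole option.
    $1 !~ /^#/ && NF >= 4 && $3 ~ /^ext[34]$/ && ("," $4 ",") !~ /,noatime,/ {
      print $2
    }
  ' "${1:-/etc/fstab}"
)
```

Each mount point it prints is a candidate for adding `noatime` and remounting as shown on slide 18.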
  19. 3. Root Reserved Space
      Reclaim it, impress your bosses!
  20. Root Reserved Space
      • EXT3/4 reserve 5% of disk for root-owned files
      • On an OS disk, sure
        • System logs, kernel panics, etc
  21. Disks used to be much smaller, right? (Photo: CC BY 2.0 / Alex Moundalexis)
  22. Do The Math
      • Conservative
        • 5% of 1 TB disk = 46 GB
        • 5 data disks per server = 230 GB
        • 5 servers per rack = 1.15 TB
      • Quasi-Aggressive
        • 5% of 4 TB disk = 186 GB
        • 12 data disks per server = 2.23 TB
        • 18 servers per rack = 40.1 TB
      • That’s a LOT of unused storage!
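The per-disk figures on slide 22 can be reproduced with a little arithmetic. The sketch below assumes a vendor terabyte of 10^12 bytes and reports whole binary gigabytes (2^30 bytes), which appears to be how the slide arrives at 46 and 186; the function names `reserved_gib` and `rack_reserved_gib` are hypothetical.

```shell
# Reserved space per disk, truncated to whole GiB.
# $1: disk size in TB (10^12 bytes), $2 (optional): reserve percent, default 5.
reserved_gib() (
  awk -v tb="$1" -v pct="${2:-5}" \
    'BEGIN { printf "%d\n", tb * 1e12 * pct / 100 / 1073741824 }'
)

# Reserved space per rack in GiB at the default 5% reserve.
# $1: disk size in TB, $2: data disks per server, $3: servers per rack.
rack_reserved_gib() (
  per_disk="$(reserved_gib "$1")"
  echo $(( per_disk * $2 * $3 ))
)

# reserved_gib 1            prints 46
# reserved_gib 4            prints 186
# rack_reserved_gib 1 5 5   prints 1150
# rack_reserved_gib 4 12 18 prints 40176
```

The rack totals line up with the slide's 1.15 TB and roughly 40 TB once the per-disk values are multiplied out.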
  23. Root Reserved Space
      • On a Hadoop data disk, no root-owned files
      • When creating a partition
          # mkfs.ext3 -m 0 /dev/sdc
      • On existing partitions
          # tune2fs -m 0 /dev/sdc
      • 0 is safe, 1 is for the ultra-paranoid
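After changing the reserve, the result can be checked from the filesystem superblock. This sketch assumes the `Block count:` and `Reserved block count:` lines printed by `tune2fs -l`; the function name `reserved_pct` is hypothetical and it reads that listing from standard input.

```shell
# Compute the reserved-block percentage from `tune2fs -l` output on stdin.
# Exits non-zero if no usable "Block count:" line is found.
reserved_pct() (
  awk '
    /^Block count:/          { total = $3 }
    /^Reserved block count:/ { reserved = $4 }
    END {
      if (total > 0) { printf "%.1f\n", 100 * reserved / total }
      else           { exit 1 }
    }
  '
)
```

Typical use would be `tune2fs -l /dev/sdc | reserved_pct`, run as root against a data disk.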
  24. 4. Name Service Cache Daemon
      Turn it on, already!
  25. Name Service Cache Daemon
      • Daemon that caches name service requests
        • Passwords
        • Groups
        • Hosts
      • Helps weather network hiccups
      • Helps more with high latency LDAP, NIS, NIS+
      • Small footprint
      • Zero configuration required
  26. Name Service Cache Daemon
      • Hadoop nodes
        • largely a network-based application
        • on the network constantly
        • issue lots of DNS lookups, especially HBase & distcp
        • can thrash DNS servers
      • Reducing latency of service requests? Smart.
      • Reducing impact on shared infrastructure? Smart.
  27. Name Service Cache Daemon
      • Turn it on, let it work, leave it alone:
          # chkconfig --level 345 nscd on
          # service nscd start
      • Check on it later:
          # nscd -g
      • Unless using Red Hat SSSD; modify nscd config first!
        • Don’t use nscd to cache passwd, group, or netgroup
        • Red Hat, Using NSCD with SSSD. http://goo.gl/68HTMQ
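When SSSD is in use, it helps to confirm which caches nscd still has switched on. The sketch below assumes the usual `enable-cache <service> <yes|no>` lines in `/etc/nscd.conf`; the function name `nscd_enabled_caches` is hypothetical.

```shell
# Print the services for which nscd caching is enabled.
# $1 (optional): nscd configuration file, defaults to /etc/nscd.conf.
nscd_enabled_caches() (
  awk '$1 == "enable-cache" && $3 == "yes" { print $2 }' "${1:-/etc/nscd.conf}"
)
```

Following the slide's advice for SSSD hosts, the output should not include `passwd`, `group`, or `netgroup`.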
  28. 5. File Handle Limits
      Not a problem, until they are.
  29. File Handle Limits
      • Kernel refers to files via a handle
        • Also called descriptors
      • Linux is a multi-user system
      • File handles protect the system from
        • Poor coding
        • Malicious users
        • Pictures of cats on the Internet
  30. Microsoft Office EULA. Really.
      java.io.FileNotFoundException: (Too many open files)
  31. File Handle Limits
      • Linux defaults usually not enough
      • Increase maximum open files (default 1024)
          # echo hdfs - nofile 32768 >> /etc/security/limits.conf
          # echo mapred - nofile 32768 >> /etc/security/limits.conf
          # echo hbase - nofile 32768 >> /etc/security/limits.conf
      • Bonus: Increase maximum processes too
          # echo hdfs - nproc 32768 >> /etc/security/limits.conf
          # echo mapred - nproc 32768 >> /etc/security/limits.conf
          # echo hbase - nproc 32768 >> /etc/security/limits.conf
      • Note: Cloudera Manager will do this for you.
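The six `echo` lines on slide 31 can be folded into one helper that skips entries already present. This is a sketch only; the function name `add_limit` is hypothetical, and the optional file argument is there so it can be tried on a scratch file first.

```shell
# Add "<user> - <item> <value>" to a limits file unless an entry for
# that user and item already exists (its value is not modified).
# $1: user, $2: item (nofile or nproc), $3: value,
# $4 (optional): file, defaults to /etc/security/limits.conf.
add_limit() (
  user="$1"; item="$2"; value="$3"
  conf="${4:-/etc/security/limits.conf}"
  pattern="^[[:space:]]*${user}[[:space:]]+-[[:space:]]+${item}[[:space:]]"
  if ! grep -Eq "$pattern" "$conf" 2>/dev/null; then
    echo "${user} - ${item} ${value}" >> "$conf"
  fi
)
```

For example, `for u in hdfs mapred hbase; do add_limit "$u" nofile 32768; add_limit "$u" nproc 32768; done` covers the same users and values as the slide. A new login session for each service user is needed before `ulimit -n` reflects the change.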
  32. 6. Dedicated Disk for OS and Logs
      Don’t be tempted to share, even on monster disks.
  33. The Situation in Easy Steps
      1. Your new server has a dozen 1 TB disks
      2. Eleven disks are used to store data
      3. One disk is used for the OS
         • 20 GB for the OS
         • 980 GB sits unused
      4. Someone asks “can we store data there too?”
      5. Seems reasonable, lots of space… “OK, why not.”
      Sound familiar?
  34. Microsoft Office EULA. Really.
      I don’t understand it, there’s no consistency to these run times!
  35. No Love for Shared Disk
      • Our quest for data gets interrupted a lot:
        • OS operations
        • OS logs
        • Hadoop logging, quite chatty
        • Hadoop execution
        • userspace execution
      • Disk seeks are slow, remember?
  36. Dedicated Disk for OS and Logs
      • At install time
        • Disk 0, OS & logs
        • Disk 1-n, Hadoop data
      • After install, more complicated effort, requires manual HDFS block rebalancing:
        1. Take down HDFS
           • If you can do it in under 10 minutes, just the DataNode
        2. Move or distribute blocks from disk0/dir to disk[1-n]/dir
        3. Remove dir from HDFS config (dfs.data.dir)
        4. Start HDFS
  37. 7. Name Resolution
      Sane, both forward and reverse.
  38. Name Resolution Options
      1. Hosts file, if you must
      2. DNS, much preferred
  39. Name Resolution with Hosts File
      • Set canonical names properly
      • Right
          10.1.1.1 r01m01.cluster.org r01m01 master1
          10.1.1.2 r01w01.cluster.org r01w01 worker1
      • Wrong
          10.1.1.1 r01m01 r01m01.cluster.org master1
          10.1.1.2 r01w01 r01w01.cluster.org worker1
  40. Name Resolution with Hosts File
      • Set loopback address properly
      • Ensure 127.0.0.1 resolves to localhost, NOT hostname
      • Right
          127.0.0.1 localhost
      • Wrong
          127.0.0.1 r01m01
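Both hosts-file rules from slides 39 and 40 can be checked mechanically. The sketch below treats a first name without a dot as non-canonical and requires `localhost` to be the first name on the 127.0.0.1 line, which is a deliberately strict reading; the function names `noncanonical_hosts` and `loopback_ok` are hypothetical.

```shell
# Print "address name" for entries whose first name is not an FQDN
# (contains no dot). Loopback entries are skipped.
# $1 (optional): hosts-style file, defaults to /etc/hosts.
noncanonical_hosts() (
  awk '
    $1 !~ /^#/ && NF >= 2 && $1 != "127.0.0.1" && $1 != "::1" && $2 !~ /\./ {
      print $1, $2
    }
  ' "${1:-/etc/hosts}"
)

# Succeed only if a 127.0.0.1 entry exists and every such entry lists
# localhost as its first name.
loopback_ok() (
  awk '
    $1 == "127.0.0.1" { found = 1; if ($2 != "localhost") bad = 1 }
    END { if (found && !bad) exit 0; exit 1 }
  ' "${1:-/etc/hosts}"
)
```

An empty result from `noncanonical_hosts` together with a zero exit status from `loopback_ok` matches the "Right" examples above.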
  41. Name Resolution with DNS
      • Forward
      • Reverse
      • Hostname should MATCH the FQDN in DNS
  42. This Is What You Ought to See
  43. Name Resolution Errata
      • Mismatches? Expect odd results.
        • Problems starting DataNodes
        • Non-FQDN in Web UI links
        • Security features are extra sensitive to FQDN
      • Errors so common that link to FAQ is included in logs!
        • http://wiki.apache.org/hadoop/UnknownHost
      • Get name resolution working BEFORE enabling nscd!
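A quick consistency check for forward and reverse resolution can be built from `hostname -f` and `getent hosts`. This is a rough sketch under the assumption that the node uses the standard glibc resolver; the function names `resolve_name`, `names_consistent`, and `check_host` are hypothetical.

```shell
# First name that getent reports for a host name or an address.
resolve_name() (
  getent hosts "$1" | awk 'NR == 1 { print $2 }'
)

# Succeed only if the three names are identical and look like an FQDN.
# $1: local FQDN, $2: forward lookup result, $3: reverse lookup result.
names_consistent() (
  [ -n "$1" ] && [ "$1" = "$2" ] && [ "$2" = "$3" ] || exit 1
  case "$1" in
    *.*) exit 0 ;;
    *)   exit 1 ;;
  esac
)

# Compare this node's FQDN with its forward and reverse lookups.
check_host() (
  fqdn="$(hostname -f)"
  addr="$(getent hosts "$fqdn" | awk 'NR == 1 { print $1 }')"
  names_consistent "$fqdn" "$(resolve_name "$fqdn")" "$(resolve_name "$addr")"
)
```

Running `check_host && echo consistent` on each node before enabling nscd is one way to catch the mismatches listed on slide 43.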
  44. Summary
      Time to take out your camera phones…
  45. Summary
      1. disable swapping (vm.swappiness = 0)
      2. data disks: mount with noatime option
      3. data disks: disable root reserved space
      4. enable nscd
      5. increase file handle limits
      6. use dedicated OS/logging disk
      7. sane name resolution
      http://tiny.cloudera.com/7steps
  46. Recommended Reading
      • Hadoop Operations
        http://amzn.to/1hDaN9B
  47. Questions?
      Preferably related to the talk…
  48. Thank You!
      Alex Moundalexis, @technmsg
      We’re hiring, kids! Well, not kids.
  49. 8. Bonus Round
      Because we had enough time…
  50. Other Things to Check
      • Disk IO
        • hdparm
          • # hdparm -Tt /dev/sdc
          • Looking for at least 70 MB/s from 7200 RPM disks
          • Slower could indicate a failing drive, disk controller, array, etc.
        • dd
          • http://romanrm.ru/en/dd-benchmark
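The 70 MB/s rule of thumb can be applied to `hdparm -Tt` output with a small filter. The sketch assumes the report contains a line of the form `Timing buffered disk reads: ... = 126.91 MB/sec`; the function names `buffered_read_rate` and `meets_threshold` are hypothetical, and the sample figure is illustrative rather than a measurement.

```shell
# Extract the buffered disk read rate (MB/sec) from hdparm output on stdin.
buffered_read_rate() (
  awk '/Timing buffered disk reads/ { print $(NF - 1) }'
)

# Succeed if a rate meets the minimum.
# $1: measured rate in MB/sec, $2 (optional): minimum, default 70.
meets_threshold() (
  awk -v rate="$1" -v min="${2:-70}" \
    'BEGIN { if (rate + 0 >= min + 0) exit 0; exit 1 }'
)
```

A typical run as root would be `rate="$(hdparm -Tt /dev/sdc | buffered_read_rate)"` followed by `meets_threshold "$rate" || echo "check /dev/sdc"`.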
  51. Other Things to Check
      • Disable Red Hat Transparent Huge Pages (RH6+ Only)
        • Can reduce elevated CPU usage
      • In rc.local:
          echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
          echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
      • Reference: Linux 6 Transparent Huge Pages and Hadoop Workloads, http://goo.gl/WSF2qC
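The kernel marks the active Transparent Huge Pages setting with square brackets, for example `always madvise [never]`. A small sketch to read it back follows; the function name `thp_mode` is hypothetical and the default path is the Red Hat 6 location from the slide.

```shell
# Print the active setting (the bracketed word) from a THP control file.
# $1 (optional): file to read, defaults to the RH6 "enabled" control file.
thp_mode() (
  sed -n 's/.*\[\([a-z]*\)].*/\1/p' \
    "${1:-/sys/kernel/mm/redhat_transparent_hugepage/enabled}"
)
```

After the rc.local lines have run, `thp_mode` should print `never`, and the same holds when it is pointed at the `defrag` file.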
  52. Other Things to Check
      • Enable Jumbo Frames
        • Only if your network infrastructure supports it!
        • Can easily (and arguably) boost throughput by 10-20%
  53. Other Things to Check
      • Enable Jumbo Frames
        • Only if your network infrastructure supports it!
        • Can easily (and arguably) boost throughput by 10-20%
      • Monitor Everything
        • How else will you know what’s happening?
        • Nagios
        • Ganglia