
HDP2 and YARN operations point


Published at TreasureData Tech Talk #1



  1. HDP2 and YARN operations point. Ryu Kobayashi, Treasure Data Tech Talk, 11 and 12 Mar 2015
  2. Who am I? • Ryu Kobayashi • @ryu_kobayashi • https://github.com/ryukobayashi • Treasure Data, Inc. • Software Engineer • Background: Hadoop, Cassandra, Machine Learning, ... • I developed the Huahin (Hadoop) Framework: http://huahinframework.org/
  3. What is YARN?
  4. YARN (Yet Another Resource Negotiator) Architecture
  5. • MRv1 • JobTracker • TaskTracker
  6. • YARN • ResourceManager • NodeManager • ApplicationMaster • Job History Server • YARN Timeline Server
  7. • MRv1 • JobTracker • TaskTracker • YARN • ResourceManager • NodeManager • ApplicationMaster • Job History Server (we cannot see job history logs if it is not installed) • YARN Timeline Server (we cannot see YARN history logs if it is not installed)
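  To actually see those logs, the two history services have to be configured and their daemons started. A minimal sketch of the relevant properties (not from the slides; the hostnames below are placeholder assumptions):

  mapred-site.xml:
    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>history.example.com:10020</value>  <!-- placeholder host -->
    </property>
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>history.example.com:19888</value>  <!-- Job History web UI -->
    </property>

  yarn-site.xml:
    <property>
      <name>yarn.timeline-service.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>yarn.timeline-service.hostname</name>
      <value>timeline.example.com</value>  <!-- placeholder host -->
    </property>
    <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>  <!-- aggregate container logs so the history UIs can show them -->
    </property>

  The Job History Server and Timeline Server daemons still have to be started separately (e.g. mr-jobhistory-daemon.sh start historyserver, yarn-daemon.sh start timelineserver).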
  8. YARN Timeline Server • It includes container info
  9. Note!!!
  10. Use Hadoop 2.4.0 and later!!!
  11. • Versions that must not be used: • Apache Hadoop 2.2.0 • Apache Hadoop 2.3.0 • HDP 2.0 (2.2.0 based)
  12. • Currently: • Apache Hadoop 2.6.0 • CDH 5.3.2 (2.5.0 based plus patches) • HDP 2.2 (2.6.0 based plus patches)
  13. • Why should they not be used? • Capacity Scheduler: there is a bug • Fair Scheduler: there is a bug
  14. • What kind of bug? • Each scheduler can cause a deadlock
  15. • In fact, there are bugs in 2.4.0 and 2.6.0 as well… • It is better to use the newest version. • Note: 2.7.0 and later is a different story
  16. Backport Patches • I backported some patches • https://github.com/ryukobayashi/patches
  17. Backport Patches • They include the deadlock patch • Format of the counter • Application kill in Web UI
  18. Format of the counter
  19. Format of the counter
  20. Application kill in Web UI
  21. Application kill in Web UI
  22. Application kill in Web UI • Job kill in Web UI: mapreduce.jobtracker.webinterface.trusted (default false) • Application kill in Web UI: yarn.resourcemanager.webapp.ui-actions.enabled (default true)
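  A hedged sketch of setting these two properties (values are illustrative; the YARN-side property comes from the backported patch mentioned above and is not present in every 2.x release):

  mapred-site.xml:
    <property>
      <name>mapreduce.jobtracker.webinterface.trusted</name>
      <value>true</value>  <!-- default false; enables the job kill action in the web UI -->
    </property>

  yarn-site.xml:
    <property>
      <name>yarn.resourcemanager.webapp.ui-actions.enabled</name>
      <value>true</value>  <!-- default true; allows killing applications from the ResourceManager web UI -->
    </property>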
  23. Backport Patches • What we want next… • Job task attempt kill in Web UI patch (in development) • Currently, command line only: $ mapred job -kill-task attempt_*
  24. Matter of resources • total containers = 4 • concurrent applications = 2
  25. Matter of resources • total containers = 4 • concurrent applications = 2 • [Diagram: cluster running two applications, each with an App Master and one task container]
  26. Matter of resources • total containers = 4 • concurrent applications = 4
  27. Matter of resources • total containers = 4 • concurrent applications = 4 • [Diagram: cluster with four applications, each holding only an App Master]
  28. Matter of resources • total containers = 4 • concurrent applications = 4 • [Diagram: all four containers are occupied by App Masters, leaving none for tasks]
  29. Matter of resources • total containers = 4 • concurrent applications = 4 • [Diagram: four App Masters occupy every container] • Livelock!
  30. Matter of resources • total containers = 4 • concurrent applications = 4 • [Diagram: killing one application frees a container so another application can run a task] • Kill
  31. Matter of resources • total containers = 4 • concurrent applications = 4 • → limit the number of concurrent applications
  32. Matter of resources • total containers = 4 • concurrent applications = 4 • → limit the number of concurrent applications • set maxRunningApps on the root queue (see the sketch after slide 33)
  33. Matter of resources • total containers = 4 • concurrent applications = 4 • root maxRunningApps = 2 • [Diagram: two applications running with an App Master and a task container each; the other two applications are pending]
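  A minimal fair-scheduler.xml sketch of the limit shown above (the queue layout is an assumption; the value 2 matches the example):

  <?xml version="1.0"?>
  <allocations>
    <!-- Cap the number of applications running concurrently across the whole cluster -->
    <queue name="root">
      <maxRunningApps>2</maxRunningApps>
    </queue>
  </allocations>

  With this cap in place, the extra applications stay pending instead of starting App Masters that starve each other of containers.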
  34. YARN Resource Management • yarn-site.xml: • yarn.nodemanager.resource.memory-mb • (yarn.nodemanager.vmem-pmem-ratio) • (yarn.scheduler.minimum-allocation-mb) • mapred-site.xml: • yarn.app.mapreduce.am.resource.mb • mapreduce.map.memory.mb • mapreduce.reduce.memory.mb • fair-scheduler.xml: • maxResources, minResources etc…
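  As a rough illustration only (the numbers are invented for a node that dedicates 32 GB to YARN; they are not recommendations from the talk):

  yarn-site.xml:
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>32768</value>  <!-- total memory a NodeManager may hand out to containers -->
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>2048</value>  <!-- smallest container the scheduler will allocate -->
    </property>
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>2.1</value>  <!-- virtual-to-physical memory ratio before a container is killed -->
    </property>

  mapred-site.xml:
    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>4096</value>  <!-- memory for the MapReduce ApplicationMaster container -->
    </property>
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>8192</value>
    </property>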
  35. YARN Resource Management • e.g. Use the hdp-configuration-utils.py script: http://goo.gl/L2hxyq • Use Ambari: http://ambari.apache.org/ • See Cloudera's documentation: http://goo.gl/EBreca
  36. Thanks!!!
