Automatic Self-Tuning Architecture for Batch Scheduler on Large Scale Computing System

From sugree, 5 months ago

Presentation of academic research with 140-char limit

Slideshow Transcript

  1. Slide 1: Automatic Self-Tuning Architecture for Batch Scheduler on Large Scale Computing System sugree June 9, 2008 1
  2. Slide 2: I am Sugree Phatanapherom from Kasetsart University.
  3. Slide 3: This research is joint work with Asst. Prof. Putchong Uthayopas.
  4. Slide 4: Ready, steady, go.
  5. Slide 5: What is a batch scheduler?
  6. Slide 6: A batch scheduler is responsible for scheduling jobs to execute on resources at the right time.
  7. Slide 7: Why do we need a batch scheduler?
  8. Slide 8: To utilize resources efficiently.
  9. Slide 9: To finish all jobs as fast as possible.
  10. Slide 10: To minimize power consumption.
  11. Slide 11: In general, this is the so-called "resource scheduling problem".
  12. Slide 12: Jobs, Resources and Time [figure: axes labeled resources and time]
  13. Slide 13: In this research, the main criterion is to minimize the cost of running the resources.
  14. Slide 14: In the past, most work focused on improving algorithms.
  15. Slide 15: To simplify the problem, this research limits the scope of job characteristics to independent sequential jobs.
  16. Slide 16: In short, a job contains one and only one task.
  17. Slide 17: In other words, job = task.
  18. Slide 18: Scheduling Algorithms [diagram: scheduling divides into on-line (RR, OLB, MET, MCT) and batch (MinMin, MaxMin, Sufferage, CSufferage, XSufferage, CMinMin, CMaxMin)]
  19. Slide 19: There are on-line and batch scheduling algorithms.
  20. Slide 20: The simplest algorithm is "Round Robin".
  21. Slide 21: "Opportunistic Load Balancing" assigns each job to the next available machine.
  22. Slide 22: "Minimum Execution Time" assigns each job to the fastest machine.
  23. Slide 23: "Minimum Completion Time" assigns each job to the machine with the minimum completion time for that job.
  24. Slide 24: Next are batch scheduling algorithms.
  25. Slide 25: "MinMin" assigns the shortest job to the fastest machine.
  26. Slide 26: "MaxMin" assigns the longest job to the fastest machine.
  27. Slide 27: "Sufferage" is reassignable MaxMin.
  28. Slide 28: "XSufferage" is Sufferage with data locality.
  29. Slide 29: CMinMin, CMaxMin and CSufferage are derivatives that take cost into account.
  30. Slide 30: How to verify? How to evaluate?
  31. Slide 31: The answer is simulation. Why?
  32. Slide 32: Closed. Controllable. Reproducible.
  33. Slide 33: Simulation is assumption and modeling.
  34. Slide 34: A grid is a meta-scheduler on top of underlying cluster schedulers that manage hosts.
  35. Slide 35: Grid [diagram: jobs enter a grid scheduler, which sits above cluster schedulers, each managing hosts]
  36. Slide 36: Interconnections between the scheduler and the processors are dedicated.
  37. Slide 37: [diagram: network, storage, scheduler, and four processors]
  38. Slide 38: A job consists of inputs, outputs and an executable.
  39. Slide 39: [diagram: job with input, output and executable, and a machine]
  40. Slide 40: Operations take 2 steps: mapping and scheduling.
  41. Slide 41: Mapping a "job" to a "machine".
  42. Slide 42: Scheduling a "job" at an exact time.
  43. Slide 43: In short, the result is a generic priority index.
  44. Slide 44: $p_{ij} = e_{ij} \, r_j \, c_{ij} \, g_j \, d_{ij}$
  45. Slide 45: Time [diagram: timeline showing ready time $r_j$, execution time $e_{ij}$, period before deadline $d_{ij}$, and deadline $D_i$]
  46. Slide 46: Cost [diagram: cost $c_{ij}$ and cumulative cost $g_j$]
  47. Slide 47: Experiments were based on the GAMESS job log from ThaiGrid, modeling a small system and a big system named KUGrid and ThaiGrid, respectively.
  48. Slide 48: Makespan and cost are observed.
  49. Slide 49: Makespan is the period of time from when the first job is submitted to when the last job finishes.
  50. Slide 50: Price-Performance [bar chart: cost-time ratio ($/h), axis 0 to 30000, for KU Grid and Thai Grid; series rr, olb, mct, met, minmin, maxmin, sufferage, cminmin, cmaxmin, csufferage]
  51. Slide 51: Cost [bar chart: cost ($, thousands), axis 0 to 4500, for KU Grid and Thai Grid; same series]
  52. Slide 52: Makespan [bar chart: makespan (hours), axis 138.800 to 139.150, for KU Grid and Thai Grid; same series]
  53. Slide 53: Looks great! Any problems? Yes!
  54. Slide 54: Priority index contains 5 factors. What are the right values?
  55. Slide 55: What are the factors of those factors?
  56. Slide 56: There are so many dependencies. Job characteristics. Resource characteristics. User characteristics.
  57. Slide 57: This problem is the so-called "Multi-variate Optimization".
  58. Slide 58: Plus, it is a bit more complex because evaluation happens in a simulator.
  59. Slide 59: How to solve?
  60. Slide 60: Optimization Architecture [diagram: monitoring system, accounting system, batch scheduler, optimizer, and four simulators]
  61. Slide 61: Optimization Algorithm?
  62. Slide 62: Particle Swarm Optimization is selected as the first one to try.
  63. Slide 63: The position of each particle in n-dimensional space represents a solution.
  64. Slide 64: PSO models social influence at various scopes.
  65. Slide 65: Local, neighbor and global.
  66. Slide 66: Usually, one trusts oneself, friends and the world, respectively. These are the levels of trust.
  67. Slide 67: PSO [figure]
  68. Slide 68: How to fully automate the self-tuning process?
  69. Slide 69: Historical data are the key.
  70. Slide 70: The quality of the solution depends on the optimizer.
  71. Slide 71: Running the optimizer longer may return a better solution.
  72. Slide 72: The precision gained from historical data depends on the data period and the amount of data.
  73. Slide 73: How to use historical data? Log replay or estimation.
  74. Slide 74: How to maximize solution quality to near optimal?
  75. Slide 75: Just run more simulations using the whole grid system to optimize itself at night!
  76. Slide 76: Results? Please accept my apologies. They are not published yet.
  77. Slide 77: Conclusion.
  78. Slide 78: Flexible algorithms introduce more adjustable factors.
  79. Slide 79: The factors vary from time to time.
  80. Slide 80: Viewed another way, these algorithms are improved periodically by external optimization.
  81. Slide 81: Particle swarm optimization is selected to solve multi-variate optimization.
  82. Slide 82: Improve the scheduler by the scheduler itself.
  83. Slide 83: Any questions?
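
The particle swarm idea from slides 62 to 66 can be illustrated with a minimal sketch. It is not the authors' optimizer: it uses only the personal (local) and global bests and leaves out the neighbor scope, and the parameters `w`, `c1`, `c2` as well as the objective `toy_simulated_cost` and its target vector are assumptions made up for illustration. In the architecture on slide 60, the objective would presumably be a simulator run scored by cost or makespan.

```python
import random


def pso(objective, dim, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, lo=0.0, hi=1.0, seed=0):
    """Minimize `objective` over [lo, hi]^dim with a basic global-best PSO."""
    rng = random.Random(seed)
    # Each particle's position is one candidate solution (slide 63).
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests: what each particle "trusts" from its own history.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # Global best: what the whole swarm has found so far.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]                            # inertia
                             + c1 * r1 * (pbest[i][d] - pos[i][d])    # trust in oneself
                             + c2 * r2 * (gbest[d] - pos[i][d]))      # trust in the world
                # Keep the position inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_simulated_cost(weights):
    """Hypothetical stand-in for a simulator: squared distance to a made-up optimum."""
    target = [0.2, 0.4, 0.6, 0.8, 0.5]
    return sum((x - t) ** 2 for x, t in zip(weights, target))


# Five dimensions, one per factor of the priority index (slide 54).
best_weights, best_cost = pso(toy_simulated_cost, dim=5)
print(best_weights, best_cost)
```

Because the objective is just a function argument, swapping the toy cost for a log-replay simulation would not change the optimizer loop; only the time per evaluation would grow, which is why slide 75 suggests running many simulations in parallel.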