Automatic Self-Tuning Architecture for Batch Scheduler on Large Scale Computing System

    Automatic Self-Tuning Architecture for Batch Scheduler on Large Scale Computing System - Presentation Transcript

    1. Automatic Self-Tuning Architecture for Batch Scheduler on Large Scale Computing System
    2. I am Sugree Phatanapherom from Kasetsart University.
    3. This research is joint work with Asst. Prof. Putchong Uthayopas.
    4. Ready, steady, go.
    5. What is a batch scheduler?
    6. A batch scheduler is responsible for scheduling jobs to execute on resources at the right time.
    7. Why do we need a batch scheduler?
    8. To utilize resources efficiently.
    9. To finish all jobs as fast as possible.
    10. To minimize power consumption.
    11. In general, this is the so-called "resource scheduling problem".
    12. Jobs, Resources and Time [figure: axes labeled time and resources]
    13. In this research, the main criterion is to minimize the cost of running the resources.
    14. In the past, most work focused on improving algorithms.
    15. To simplify the problem, this research limits the scope of job characteristics to independent sequential jobs.
    16. In short, a job contains one and only one task.
    17. In other words, job = task.
    18. Scheduling Algorithms [diagram: Scheduling splits into On-line (RR, OLB, MET, MCT) and Batch (MinMin, MaxMin, Sufferage, XSufferage, CMinMin, CMaxMin, CSufferage)]
    19. There are on-line and batch scheduling algorithms.
    20. The simplest algorithm is "Round Robin".
    21. "Opportunistic Load Balancing" assigns each job to the next available machine.
    22. "Minimum Execution Time" assigns each job to the fastest machine.
    23. "Minimum Completion Time" assigns each job to the machine with the minimum completion time for that job.
    24. Next are batch scheduling algorithms.
    25. "MinMin" assigns the shortest job to the fastest machine.
    26. "MaxMin" assigns the longest job to the fastest machine.
    27. "Sufferage" is a reassignable MaxMin.
    28. "XSufferage" is Sufferage with data locality.
    29. CMinMin, CMaxMin and CSufferage are derivatives that take cost into account.
    30. How to verify? How to evaluate?
    31. The answer is simulation. Why?
    32. Closed. Controllable. Reproducible.
    33. Simulation is about assumptions and modeling.
    34. A grid is a meta-scheduler on top of underlying cluster schedulers that manage hosts.
    35. Grid [diagram: a Grid Scheduler above three Cluster Schedulers managing Hosts; label: jobs]
    36. Interconnections between the scheduler and the processors are dedicated.
    37. Network [diagram: a Scheduler, Storage and four Processors]
    38. A job consists of inputs, outputs and an executable.
    39. Job [diagram: Executable, Input, Output, Machine]
    40. There are two steps: mapping and scheduling.
    41. Map each "job" to a "machine".
    42. Schedule each "job" at an exact time.
    43. In short, the result is a generic priority index.
    44.  
    45. Time [figure: timeline showing ready time, execution time, deadline and period before deadline along the time axis]
    46. Cost [figure: labels include cost and cumulative cost]
    47. Experiments were based on the GAMESS job log from ThaiGrid, assuming a small and a big system, named KUGrid and ThaiGrid, respectively.
    48. Makespan and cost are observed.
    49. Makespan is the period of time from when the first job is submitted to when the last job finishes.
    50. Price-Performance
    51. Cost
    52. Makespan
    53. Looks great! Any problems? Yes!
    54. The priority index contains 5 factors. What are the right values?
    55. What are the factors of those factors?
    56. There are so many dependencies. Job characteristics. Resource characteristics. User characteristics.
    57. This problem is the so-called "multivariate optimization".
    58. Plus, it is a bit more complex because the evaluation happens in a simulator.
    59. How to solve?
    60. Optimization Architecture [diagram: an Optimizer, four Simulators, the Batch Scheduler, a Monitoring System and an Accounting System]
    61. Optimization Algorithm?
    62. Particle Swarm Optimization is selected as the first one to try.
    63. The position of each particle in an n-dimensional space represents a solution.
    64. PSO models social influence at various scopes.
    65. Local, neighbor and global.
    66. Usually, one trusts oneself, friends and the world, in that order. This is the level of trust.
    67. PSO
    68. How to fully automate the self-tuning process?
    69. Historical data are the key.
    70. The quality of the solution depends on the optimizer.
    71. Running the optimizer longer may return a better solution.
    72. The precision of using historical data depends on the data period and the amount of data.
    73. How to use historical data? Log replay or estimation.
    74. How to bring solution quality close to optimal?
    75. Just run more simulations using the whole grid system to optimize itself at night!
    76. Results? Please accept my apologies. They are not published yet.
    77. Conclusion.
    78. Flexible algorithms introduce more adjustable factors.
    79. The factors vary from time to time.
    80. From another view, these algorithms are improved periodically by external optimization.
    81. Particle swarm optimization is selected to solve the multivariate optimization problem.
    82. Improve the scheduler by the scheduler itself.
    83. Any questions?
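
Slides 23 to 26 describe the heuristics in one line each. The sketch below is one common textbook reading of on-line Minimum Completion Time and of batch MinMin/MaxMin, not the implementation used in this research. It assumes every job is available at time 0 and that an expected-time table `etc[j][m]` (execution time of job j on machine m) is known; the names `mct`, `min_min`, `pick_max` and the data in `EXAMPLE_ETC` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical table: EXAMPLE_ETC[j][m] = execution time of job j on machine m.
EXAMPLE_ETC: List[List[float]] = [[4, 6], [2, 3], [8, 5]]


def mct(etc: List[List[float]], n_machines: int) -> Tuple[List[int], List[float]]:
    """On-line Minimum Completion Time: handle jobs one by one, in arrival order."""
    ready = [0.0] * n_machines  # time at which each machine becomes free
    assignment = []
    for times in etc:
        # completion time on machine m = machine ready time + execution time
        best = min(range(n_machines), key=lambda m: ready[m] + times[m])
        ready[best] += times[best]
        assignment.append(best)
    return assignment, ready


def min_min(etc: List[List[float]], n_machines: int,
            pick_max: bool = False) -> Tuple[Dict[int, int], List[float]]:
    """Batch MinMin, or MaxMin when pick_max is True, over a whole batch of jobs."""
    ready = [0.0] * n_machines
    assignment: Dict[int, int] = {}
    unscheduled = set(range(len(etc)))
    while unscheduled:
        # For each waiting job, find its earliest completion time and machine.
        best = {}
        for j in unscheduled:
            m = min(range(n_machines), key=lambda m: ready[m] + etc[j][m])
            best[j] = (ready[m] + etc[j][m], m)
        # MinMin picks the job with the smallest such time, MaxMin the largest.
        chooser = max if pick_max else min
        j = chooser(sorted(best), key=lambda job: best[job][0])
        finish, m = best[j]
        ready[m] = finish
        assignment[j] = m
        unscheduled.remove(j)
    return assignment, ready
```

Because all jobs are assumed to arrive at time 0, the makespan of a schedule in this sketch is simply `max(ready)`. On `EXAMPLE_ETC` with two machines, the on-line rule ends with a makespan of 8, while both batch variants reach 6, which illustrates why seeing the whole batch can help.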
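
Slides 48 and 49 name the two observed metrics. As a minimal sketch, makespan can be computed from a job log exactly as slide 49 defines it. The cost model here is an assumption, since the slides do not state one: each machine is charged a flat price per unit of busy time. The record fields, `JobRecord`, `makespan`, `total_cost` and the sample prices are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JobRecord:
    submit: float   # time the job was submitted
    start: float    # time the job started running
    finish: float   # time the job finished
    machine: str    # machine that ran the job


def makespan(log: List[JobRecord]) -> float:
    """Time from the first submission to the last completion (slide 49)."""
    return max(r.finish for r in log) - min(r.submit for r in log)


def total_cost(log: List[JobRecord], price: Dict[str, float]) -> float:
    """Assumed cost model: busy time on a machine times that machine's price."""
    return sum((r.finish - r.start) * price[r.machine] for r in log)


# Hypothetical three-job log on two machines.
EXAMPLE_LOG = [
    JobRecord(submit=0, start=0, finish=4, machine="a"),
    JobRecord(submit=1, start=4, finish=6, machine="a"),
    JobRecord(submit=2, start=2, finish=7, machine="b"),
]
EXAMPLE_PRICE = {"a": 1.0, "b": 2.0}
```

With the sample data, `makespan(EXAMPLE_LOG)` is 7 and `total_cost(EXAMPLE_LOG, EXAMPLE_PRICE)` is 16.0. A real accounting system would likely also charge for idle or powered-on time, which this sketch ignores.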
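
Slides 62 to 66 outline particle swarm optimization with three scopes of influence: oneself, friends and the world. The sketch below is one possible reading, not the optimizer built in this research. It assumes a ring neighbourhood for "friends", factor values bounded to [0, 1], and a placeholder objective standing in for a simulator run; `pso`, `placeholder_cost`, the coefficients `c_self`, `c_friends`, `c_world` and the target value 0.3 are all hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def placeholder_cost(factors: Sequence[float]) -> float:
    """Stand-in for one simulator run; a real run would replay the job log."""
    return sum((x - 0.3) ** 2 for x in factors)


def pso(objective: Callable[[Sequence[float]], float], dim: int,
        n_particles: int = 20, iterations: int = 100, w: float = 0.7,
        c_self: float = 1.0, c_friends: float = 1.0, c_world: float = 1.0,
        seed: int = 0) -> Tuple[List[float], float, List[float]]:
    """Minimize objective over [0, 1]^dim; returns (best, value, history)."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's own best
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best seen by the swarm
    history = [gbest_val]
    for _ in range(iterations):
        for i in range(n_particles):
            # "Friends": best record among ring neighbours i-1, i, i+1.
            ring = [(i - 1) % n_particles, i, (i + 1) % n_particles]
            nb = min(ring, key=pbest_val.__getitem__)
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c_self * rng.random() * (pbest[i][d] - pos[i][d])
                             + c_friends * rng.random() * (pbest[nb][d] - pos[i][d])
                             + c_world * rng.random() * (gbest[d] - pos[i][d]))
                # Keep every factor inside the assumed [0, 1] bounds.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history
```

A call such as `pso(placeholder_cost, dim=5)` searches over five values, matching the five factors of the priority index on slide 54. The three coefficients play the role of the "level of trust" on slide 66, and raising `iterations` corresponds to slide 71: running the optimizer longer may return a better solution.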

    Sugree Phatanapherom, 2 years ago

    Presentation of academic research with 140-char lim...

    © All Rights Reserved
