Apache Hadoop India Summit 2011 talk "Scheduling in MapReduce using Machine Learning Techniques" by Vasudeva Varma

  • The Search and Information Extraction Lab (SIEL) at LTRC, IIIT Hyderabad is actively involved in research in many areas relevant to cloud computing. The main motivation behind establishing a cloud computing research team at SIEL was to enable researchers in the lab to experiment with very large datasets, which are now the norm in search and information extraction research. To handle such datasets, we began exploring several methods for operating on them using a cluster of machines. We eventually chose MapReduce as the preferred model because it suits data-intensive applications very well, and began exploring MapReduce and its most popular implementation, Apache Hadoop. We soon realized there was huge potential for research in improving the core MapReduce framework in areas such as fault tolerance, resource management and user accessibility. As a result, we established a team dedicated to research on Hadoop and MapReduce.
    1. Scheduling in MapReduce using Machine Learning Techniques<br />Cloud Computing Group<br />Search and Information Extraction Lab<br />http://search.iiit.ac.in<br />IIIT Hyderabad<br /> Vasudeva Varma vv@iiit.ac.in<br /> Radheshyam Nanduri radheshyam.nanduri@research.iiit.ac.in<br />
    2. Agenda<br />Cloud Computing Group @ IIIT Hyderabad<br />Admission Control<br />Task Assignment<br />Conclusion<br />2<br />
    3. Cloud Computing Group @ IIIT Hyderabad<br /><ul><li>Search and Information Extraction
    4. Large datasets
    5. Clusters of machines
    6. Web crawling
    7. Data intensive applications
    8. MapReduce
    9. Apache Hadoop</li></ul>3<br />
    10. Research Areas<br />Resource management for MapReduce<br />Scheduling<br />Data Placement<br />Power aware resource management<br />Data management in cloud<br />Virtualization<br />4<br />
    11. Teaching<br />Cloud Computing course<br />Monsoon semester (2008 onwards)<br />Special focus on Apache Hadoop<br />MapReduce and HDFS<br />Mahout<br />Virtualization<br />NoSQL databases<br />Guest lectures from industry experts<br />5<br />
    12. Learning Based Admission Control and Task Assignment in MapReduce<br /><ul><li>Learning based approach
    13. Admission Control
    14. Should we accept a job for execution in the cluster?
    15. Task Assignment
    16. Which task to choose for running on a given node?</li></ul>6<br />
    17. Admission Control<br /><ul><li>Deciding if and which request to accept from a set of incoming requests
    18. Critical in achieving better QoS
    19. Important to prevent overcommitting
    20. Needed to maximize the utility from the perspective of a service provider</li></ul>7<br />
    21. MapReduce as a Service<br /><ul><li>Web services interface for MR jobs
    22. Users search for jobs in repositories
    23. Select one that matches their criteria
    24. Launch it on clusters managed by the service provider
    25. Service providers rent infrastructure from an IaaS provider</li></ul>8<br />
    26. Utility Functions<br /><ul><li>Three-phase
    27. Soft and hard deadlines
    28. Decay parameters
    29. Provision for service provider penalty</li></ul>9<br />
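The three-phase utility shape listed above can be sketched as follows. The parameter names and the linear decay between the soft and hard deadlines are illustrative assumptions; the talk does not give the exact functional form.

```python
def utility(completion_time, max_utility, soft_deadline, hard_deadline,
            decay, penalty):
    """Three-phase utility with soft/hard deadlines, a decay parameter,
    and a service-provider penalty (all values are illustrative)."""
    if completion_time <= soft_deadline:
        # Phase 1: full utility up to the soft deadline.
        return max_utility
    if completion_time <= hard_deadline:
        # Phase 2: utility decays between the soft and hard deadlines.
        return max_utility - decay * (completion_time - soft_deadline)
    # Phase 3: missing the hard deadline costs the provider a penalty.
    return -penalty
```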
    30. Our Approach<br /><ul><li>Based on the Expected Utility Hypothesis from decision theory
    31. Accept the job that maximizes the expected utility
    32. Use a pattern classifier to classify incoming jobs
    33. Two classes
    34. Utility functions for prioritizing</li></ul>10<br />
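The expected-utility admission rule above fits in a few lines. Here `p_success` stands in for the classifier's probability estimate and `utility` for the job's utility function; both names are hypothetical.

```python
def expected_utility_admission(jobs, p_success, utility):
    """Admit the job with the highest expected utility: the classifier's
    success probability weighted by the job's utility if served."""
    return max(jobs, key=lambda j: p_success(j) * utility(j))
```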
    35. Feature Vector<br /><ul><li>Given as input to the classifier
    36. Contains job-specific and cluster-specific parameters
    37. Includes variables that might affect the admission decision</li></ul>11<br />
    38. Bayesian Classifier<br /><ul><li>Naive Bayes assumption
    39. Conditionally independent parameters
    40. Works well in practice
    41. Uses past events to predict future outcomes
    42. Applies Bayes' theorem while computing probabilities
    43. Incremental learning – efficient w.r.t. memory usage
    44. Simple to implement</li></ul>12<br />
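A minimal incremental naive Bayes over discrete features illustrates the properties above: counting-based updates, Bayes' theorem under the conditional-independence assumption, and memory proportional to the distinct feature values seen. This is a sketch, not the classifier used in the talk.

```python
from collections import defaultdict

class IncrementalNaiveBayes:
    """Tiny incremental naive Bayes over discrete feature tuples."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        # (label, feature_index, feature_value) -> count
        self.feature_counts = defaultdict(int)

    def update(self, features, label):
        # Learning is just counting, so updates are cheap and the model
        # grows only with the number of distinct feature values seen.
        self.class_counts[label] += 1
        for i, v in enumerate(features):
            self.feature_counts[(label, i, v)] += 1

    def score(self, features, label):
        total = sum(self.class_counts.values())
        prior = self.class_counts[label] / total
        likelihood = 1.0
        for i, v in enumerate(features):
            # Laplace smoothing so unseen values do not zero out the score.
            likelihood *= ((self.feature_counts[(label, i, v)] + 1)
                           / (self.class_counts[label] + 2))
        return prior * likelihood

    def predict(self, features):
        return max(self.class_counts, key=lambda c: self.score(features, c))
```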
    45. Evaluation<br /><ul><li>Success/Failure criteria: Load management
    46. Simulation
    47. Baseline
    48. Myopic – Immediately select the job that has maximum utility
    49. Random – Randomly select one job from the candidate jobs</li></ul>13<br />
    50. Algorithm Accuracy<br />14<br />
    51. Comparison with baseline<br />15<br />
    52. Meeting Deadlines<br />16<br />
    53. Task Assignment<br /><ul><li>Deciding if a task can be assigned to a node
    54. Learning based technique
    55. Extension of the work presented before</li></ul>17<br />
    56. Learning Scheduler<br />18<br />
    57. Features of Learning Scheduler<br />Flexible task assignment – based on state of resources<br />Considers job profile while allocating<br />Tries to avoid overloading task trackers<br />Allows users to control assignment by specifying priority functions<br />Incremental learning<br />19<br />
    58. Using Classifier<br />Use a pattern classifier to classify candidate jobs<br />Two classes: good and bad<br />Good tasks don't overload task trackers<br />Overload: A limit set on system load average by the admin<br />20<br />
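The good/bad labelling against the admin-set load limit might be generated as below; the function name and signature are assumptions, sketched only to make the feedback loop concrete.

```python
def label_outcome(observed_load_avg, max_load_avg):
    """Turn an observed node outcome into a training label: a task is
    'good' if it did not push the system load average past the
    admin-configured limit. These labels feed back into the classifier."""
    return "good" if observed_load_avg <= max_load_avg else "bad"
```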
    59. Feature Vector<br />Job features<br />CPU, memory, network and disk usage of a job<br />Node properties<br />Static: Number of processors, maximum physical and virtual memory, CPU Frequency<br />Dynamic: State of resources, Number of running map tasks, Number of running reduce tasks<br />21<br />
    60. Job Selection<br />From the candidates labelled as good, select the one with maximum priority<br />Create a task of the selected job<br />22<br />
    61. Priority (Utility) Functions<br />Policy enforcement<br />FIFO: U(J) = J.age<br />Revenue oriented<br />If the priority of all jobs is equal, the scheduler will always assign the task that has the maximum likelihood of being labelled good.<br />23<br />
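Selection with a priority function can be sketched as follows; with U(J) = J.age as the priority, the rule reduces to FIFO over the jobs the classifier labels good. The helper names (`is_good`, `priority`) are hypothetical.

```python
def select_task(candidate_jobs, is_good, priority):
    """From the jobs the classifier labels good, pick the one with
    maximum priority, then a task of that job would be created."""
    good = [j for j in candidate_jobs if is_good(j)]
    return max(good, key=priority) if good else None
```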
    62. Job Profile<br />Users submit 'hints' about job performance<br />Estimate the job's resource consumption on a scale of 1 to 10, 10 being the highest.<br />This data is passed at job submission time through job parameters:<br />learnsched.jobstat.map - “1:2:3:4”<br />The scheduler is open-sourced at http://code.google.com/p/learnsched/<br />24<br />
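A sketch of parsing a `learnsched.jobstat.map`-style hint. Mapping the four fields to CPU, memory, network and disk follows the feature-vector slide and is an assumption, since the talk does not spell out the field order for this parameter.

```python
def parse_job_hint(hint):
    """Parse a hint such as "1:2:3:4" into per-resource scores on a
    1-10 scale. The CPU/memory/network/disk field order is assumed."""
    cpu, mem, net, disk = (int(x) for x in hint.split(":"))
    for score in (cpu, mem, net, disk):
        if not 1 <= score <= 10:
            raise ValueError("resource scores must be between 1 and 10")
    return {"cpu": cpu, "memory": mem, "network": net, "disk": disk}
```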
    63. Evaluation<br /><ul><li>Evaluation workload
    64. TextWriter
    65. WordCount
    66. WordCount + 10ms delay
    67. URLGet
    68. URLToDisk
    69. CPU Activity</li></ul>25<br />
    70. Learning Behaviour<br />26<br />
    71. Classifier Accuracy<br />27<br />
    72. Conclusions<br /><ul><li>Feedback-informed classifiers can be used effectively
    73. Better QoS than naive approaches
    74. Less runtime → happy users → more revenue for the service provider</li></ul>28<br />
    75. Thank you<br />Cloud Computing Group<br />Search and Information Extraction Lab<br />http://search.iiit.ac.in<br />IIIT Hyderabad<br />Questions/Suggestions/Comments?<br /> Vasudeva Varma vv@iiit.ac.in<br /> Radheshyam Nanduri radheshyam.nanduri@research.iiit.ac.in<br />
