Apache Hadoop India Summit 2011 talk "An Extension of Fairshare-Scheduler and a Novel SLA based Learning Scheduler in Hadoop" by G Sudha Sadhasivam and Priya N


  1. AN EXTENSION OF FAIRSHARESCHEDULER AND A NOVEL SLA BASED LEARNING SCHEDULER IN HADOOP, by Dr G SUDHA SADHASIVAM (Professor) and PRIYA N (Student), PSG College of Technology, Coimbatore
  2. Agenda
     • Introduction: metascheduler in the Fairshare scheduler
     • Features
     • Extended Fairscheduler architecture
     • Workflow
     • Experimental results
     • Learning scheduler with SLA
     • Design of the proposed system
     • Workflow
  3. Fairshare scheduler
     Existing system:
     • Jobs in a pool are executed in a fair-share manner.
     Proposed system:
     • Fair-share execution of jobs from the pool, with large jobs first and small jobs backfilled.
     Features
     • Jobs in pools
     • Guaranteed capacity
     • Minimum shares
     • Job limits
     • Job priorities
     • Pool weights
  4. Architecture (diagram labels): User 1 to User 4; Pool; Fairshare scheduler (large job first + small job backfilling); Node 1 to Node 4.
  5. Calculate
     • User estimated time = (no. of maps * map time) + (no. of reduces * reduce time)
     Update
     • Runnability
  6. Task count = total_Tasks - running_Tasks - finished_Tasks + needed_Tasks_for_job
  7. Weight = weight * priority factor
  8. Fairshare = (weight * oldslots) / totalweight
  9. Deficit (MR_Deficit) = (fairshare - running) * timedelta
     Workflow (flowchart labels)
     • Calculate the number of maps and reduces
     • Find the user estimated time = no. of maps * map time + no. of reduces * reduce time
     • Create a list of jobs
     • Get jobs in pool
     • Finished/running
     • fairscheduler.start()
     • Get run state of the job in progress
     • Remove from list
     • Categorize jobs as small and large
     • Update: weight, task count, min. slots, runnability, fairshare
     • Job finish time < user estimated time
     • Bring the large job first and backfill small jobs
     • Backfill if exe_time < delay
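The formulas and the large-job-first, small-job-backfilling workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Hadoop patch: the `Job` class, the `threshold` separating small from large jobs, and the reading of "backfill if exe_time < delay" as "a small job is backfilled only if its estimated time is shorter than the delay window" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical job record; the field names are not taken from Hadoop.
    name: str
    maps: int            # number of map tasks
    reduces: int         # number of reduce tasks
    map_time: float      # estimated seconds per map task
    reduce_time: float   # estimated seconds per reduce task
    weight: float = 1.0
    running: int = 0     # slots currently held by the job


def estimated_time(job: Job) -> float:
    """User estimated time = maps * map time + reduces * reduce time."""
    return job.maps * job.map_time + job.reduces * job.reduce_time


def updated_weight(job: Job, priority_factor: float) -> float:
    """Weight = weight * priority factor."""
    return job.weight * priority_factor


def fair_share(job: Job, old_slots: int, total_weight: float) -> float:
    """Fairshare = (weight * oldslots) / totalweight."""
    return job.weight * old_slots / total_weight


def deficit(job: Job, share: float, time_delta: float) -> float:
    """Deficit = (fairshare - running) * timedelta."""
    return (share - job.running) * time_delta


def lfsb_order(jobs, threshold, delay):
    """Large jobs first, then the small jobs that fit into the delay window.

    Returns (ordered, deferred); `threshold` and `delay` are assumed inputs.
    """
    large = sorted(
        (j for j in jobs if estimated_time(j) >= threshold),
        key=estimated_time,
        reverse=True,
    )
    small = [j for j in jobs if estimated_time(j) < threshold]
    backfill = [j for j in small if estimated_time(j) < delay]
    deferred = [j for j in small if estimated_time(j) >= delay]
    return large + backfill, deferred


jobs = [
    Job("sort", maps=10, reduces=2, map_time=30, reduce_time=60),
    Job("grep", maps=2, reduces=0, map_time=5, reduce_time=0),
    Job("join", maps=4, reduces=1, map_time=20, reduce_time=40),
]
ordered, deferred = lfsb_order(jobs, threshold=200, delay=50)
print([j.name for j in ordered], [j.name for j in deferred])
```

With these made-up numbers the estimated times are 420 s, 10 s and 120 s, so "sort" is scheduled first, "grep" is backfilled, and "join" waits.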
  11. Result (LFSB): different jobs
  12. More small jobs
  13. A novel SLA-based learning scheduler
  14. Schedulers in Hadoop
      • Hadoop on Demand: FIFO with Torque; no data locality
      • Fairshare: shares resources fairly among jobs in pools; excess resources are shared between pools
      • Capacity: fair sharing among organisations; inter-queue priority is maintained manually (not dynamic)
      • Dynamic priority scheduler: priority is adjustable dynamically; based on the demand / budget of the user; more priority for smaller jobs; large jobs have to be broken up into smaller ones
  15. Patches
      • Security features to isolate users
      • Launching multiple tasks per heartbeat
      • Parallelise jobs and launch smaller jobs faster
      • Prevent oversubscribing nodes (only after job submission): RAM / HD
  16. Existing system:
  17. Task assignment to the right node.
  18. No policies and less user-level response.
  19. Proposed system:
  20. SLA: the user specifies requirements.
  21. The job executes at the right node.
  22. Classify jobs as I/O bound or CPU bound, set priority and assign jobs.
      Proposed methodology
      • SLA: user details, job requirements and charge sheet.
      Scheduler:
      • Classifies new jobs based on (SLA + job features) and node features.
  23. Classification based on job trace history (learning).
  24. Creation of queues for I/O and CPU jobs.
  25. Assignment to queues based on the utility function.
      Gather all node details and check for SLA approval. If approved, the user is allowed to submit jobs.
      Owner, description, user details and requirements.
      Architecture (diagram labels): User; SLA; Learning scheduler; Node 1 to Node 5.
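The "classification based on job trace history (learning)" step can be illustrated with a small categorical Naive Bayes classifier, the technique the talk names for finding the right node. This is a minimal sketch, not the authors' classifier: the feature names (`job`, `node_load`), the labels (`good`, `bad`) and the use of Laplace smoothing are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical Naive Bayes trained on (features, label) trace records."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.values = defaultdict(set)              # feature -> values seen in training

    def fit(self, traces):
        for features, label in traces:
            self.label_counts[label] += 1
            for name, value in features.items():
                self.feature_counts[(label, name)][value] += 1
                self.values[name].add(value)

    def log_scores(self, features):
        total = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in features.items():
                seen = self.feature_counts[(label, name)][value]
                # Laplace smoothing keeps unseen values from zeroing the product.
                score += math.log((seen + 1) / (count + len(self.values[name])))
            scores[label] = score
        return scores

    def predict(self, features):
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Hypothetical history: was placing this kind of job on such a node a good decision?
history = [
    ({"job": "io", "node_load": "low"}, "good"),
    ({"job": "io", "node_load": "high"}, "bad"),
    ({"job": "cpu", "node_load": "low"}, "good"),
    ({"job": "cpu", "node_load": "high"}, "bad"),
    ({"job": "io", "node_load": "low"}, "good"),
]
model = NaiveBayes()
model.fit(history)
print(model.predict({"job": "io", "node_load": "low"}))
print(model.predict({"job": "io", "node_load": "high"}))
```

Each finished task would add one more record to `history`, which is how the scheduler could keep learning from past assignments.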
  29. Workflow of scheduler (flowchart labels)
      • Classifier inputs: node features, job features + SLA, job trace history
      • Classifier output: right node and job type
      • Job type test: (MIS + MOS) / MTCT > average disk I/O rate
      • I/O or CPU: I/O queue, CPU queue
      • Calculate and compare utility; change priority
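The last stage of this workflow, placing classified jobs into an I/O queue or a CPU queue and ordering them by utility, might look like the sketch below. It assumes that a higher utility means the job runs sooner and that the utility function is pluggable; the `JobQueues` class, the job dictionaries and the three example utilities are hypothetical readings of "FIFO, Deficit, SJF", not code from the talk.

```python
import heapq
import itertools


class JobQueues:
    """Two priority queues (I/O and CPU) ordered by a pluggable utility."""

    def __init__(self, utility):
        self.utility = utility
        self.queues = {"io": [], "cpu": []}
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def submit(self, job, job_type):
        # heapq is a min-heap, so the utility is negated to pop the highest first.
        entry = (-self.utility(job), next(self._counter), job)
        heapq.heappush(self.queues[job_type], entry)

    def next_job(self, job_type):
        queue = self.queues[job_type]
        return heapq.heappop(queue)[2] if queue else None


# Assumed utility functions: higher value means the job is served earlier.
def fifo(job):
    return -job["submit_time"]


def sjf(job):
    return -job["estimated_time"]


def by_deficit(job):
    return job["deficit"]


queues = JobQueues(utility=sjf)
queues.submit({"name": "a", "submit_time": 0, "estimated_time": 100, "deficit": 2}, "io")
queues.submit({"name": "b", "submit_time": 1, "estimated_time": 10, "deficit": 5}, "io")
queues.submit({"name": "c", "submit_time": 2, "estimated_time": 50, "deficit": 1}, "cpu")
first_io = queues.next_job("io")
print(first_io["name"])
```

Swapping `sjf` for `fifo` or `by_deficit` changes the order without touching the queue logic, which is one way to "calculate and compare utility" across policies.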
  30. Example: node feature values (table)
  31. Job submitted (job features)
      • Job 1: RAM = 400 MB, HD = 100 GB, M = 6, R = 2
      • Job 2: RAM = 500 MB, HD = 120 GB, M = 8, R = 0
      P(node) = {no. of job features + no. of node features * (P(F1) + P(F2) + … + P(Fn))} / total features
      • Job 1: P(J1M1) = 1, P(J1M2) = 0.875, P(J1M3) = 0.8, P(J1M4) = 1, P(J1M5) = 1, P(J1M6) = 0.625
      • Job 2: P(J2M1) = 1, P(J2M2) = 0.857, P(J2M3) = 0.857, P(J2M4) = 0.514, P(J2M5) = 0.857, P(J2M6) = 0.514
      Job 1 = M1, M4, M5; M4 satisfies.
      Job 2 = M1.
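The candidate lists at the end of this example follow from picking, for each job, the nodes with the highest score. The short sketch below reproduces that selection from the slide's numbers. It assumes the second row of probabilities belongs to Job 2, and `best_nodes` is a hypothetical helper name. How M4 is then preferred for Job 1 depends on the node feature table, which is not reproduced here.

```python
def best_nodes(scores):
    """Return the nodes whose score equals the maximum score."""
    top = max(scores.values())
    return [node for node, p in scores.items() if p == top]


# Scores copied from the slide; the second row is read as Job 2 (assumption).
job1 = {"M1": 1, "M2": 0.875, "M3": 0.8, "M4": 1, "M5": 1, "M6": 0.625}
job2 = {"M1": 1, "M2": 0.857, "M3": 0.857, "M4": 0.514, "M5": 0.857, "M6": 0.514}

print(best_nodes(job1))  # ['M1', 'M4', 'M5']
print(best_nodes(job2))  # ['M1']
```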
  32. CPU or I/O bound job
      • I/O rate: 10 MB/s
      • MTCT: 10 s
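The job-type test used in the talk, (MIS + MOS) / MTCT > average disk I/O rate, can be written as a single function. The sketch assumes MIS and MOS are the map input and map output sizes in MB and MTCT is the map task completion time in seconds; the slides do not expand these abbreviations, and the job sizes below are made up. Only the 10 MB/s rate and the 10 s MTCT come from the slide.

```python
def job_type(mis_mb, mos_mb, mtct_s, avg_disk_io_rate_mb_s):
    """Classify a job as "io" or "cpu" bound.

    A job is treated as I/O bound when the data it moves per second of
    task time exceeds the average disk I/O rate.
    """
    throughput = (mis_mb + mos_mb) / mtct_s
    return "io" if throughput > avg_disk_io_rate_mb_s else "cpu"


# Slide values: average disk I/O rate 10 MB/s, MTCT 10 s.
print(job_type(mis_mb=80, mos_mb=40, mtct_s=10, avg_disk_io_rate_mb_s=10))
print(job_type(mis_mb=30, mos_mb=20, mtct_s=10, avg_disk_io_rate_mb_s=10))
```

With these slide values, any job whose map input plus output exceeds 100 MB would be routed to the I/O queue.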
  33. Scheduler
      • Find the right node for the job using a classifier: Naïve Bayes classifier.
      • Find the job type, whether I/O or CPU bound: (MIS + MOS) / MTCT > average disk I/O rate.
      • Calculate the utility function value: FIFO, Deficit, SJF.
      • Pass the jobs to the queue.
      Advantages
      • Fairscheduler with backfilling improves the waiting time for large jobs. It adopts a "no starvation" principle and improves response time.
  34. The SLA-based scheduler brings high user-level response and better utilization of resources.
      Thank you