Wei's notes on MapReduce Scheduling

  1. Wei’s Notes on Map-Reduce Job Scheduling
     Feb 2011
  2. [Map-Reduce] Workflow
     Master splits a job into small chunks (SIMD model)
     Assigns them to slaves with available mapper slots (taking data locality into account)
     Mapper collects the required data and puts it through the user-defined map function
     Mapper writes intermediate results to local disk and reports their locations to the Master
     Master records status, picks slaves with available reducer slots, and pushes the location info over for the reduce phase (*locality? Yes!)
     Reducer copies data from the mappers via RPC, waits for all mappers to finish, then sorts by intermediate key and puts everything through the user-defined reduce function
     Reducer writes the final output to DFS and reports to the Master
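A minimal single-process sketch of this workflow (function and variable names are mine, not from the paper): the master's split/assign steps become a loop over input chunks, the per-reducer files on local disk become in-memory partitions, and the RPC copy + sort step becomes a sorted pass over each partition.

```python
from collections import defaultdict

def run_job(chunks, map_fn, reduce_fn, num_reducers=2):
    # "Map phase": one map task per input chunk; each task partitions its
    # (k2, v2) output by hash(k2), standing in for the per-reducer local files.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for chunk_id, chunk in enumerate(chunks):
        for k2, v2 in map_fn(chunk_id, chunk):
            partitions[hash(k2) % num_reducers][k2].append(v2)

    # "Reduce phase": after all map tasks finish, each reducer sorts its
    # partition by intermediate key and applies the user-defined reduce function.
    results = []
    for part in partitions:
        for k2 in sorted(part):
            results.append((k2, reduce_fn(k2, part[k2])))
    return results
```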
  3. [Map-Reduce] Data flow
     Raw
     Map(k1, v1) -> list(k2, v2)
     Reduce(k2, list(v2)) -> list(v2) *why not v3?
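On the "*why not v3?" note: the paper types the reduce output values as v2 because intermediate and output values are drawn from the same domain; only the input pair (k1, v1) lives in a different domain. Word count makes the signatures concrete; this reuses the run_job sketch above:

```python
def wc_map(doc_id, text):      # Map(k1, v1) -> list(k2, v2): (doc id, text) -> [(word, 1), ...]
    return [(word, 1) for word in text.split()]

def wc_reduce(word, counts):   # Reduce(k2, list(v2)) -> list(v2): sum the per-word counts
    return [sum(counts)]

print(run_job(["a b a", "b c"], wc_map, wc_reduce))
# e.g. [('a', [2]), ('b', [2]), ('c', [1])]  (ordering depends on the hash partitioning)
```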
  4. [Map-Reduce] Fault Tolerance
     Upon machine failure:
  5. [Map-Reduce] To-Dos
     Splitting:
     When: upon arrival or upon reaching the head of the queue
     How is the split size M determined? (based on chunk size)
     “can be processed in parallel by different machines”
     Cost of re-execution
     Map & reduce
  6. [Fair Scheduler] 3-phase allocation
     Satisfy the pools whose min share >= demand
     Allocate resources to the other pools up to their min shares
     Give the residual to the still-unfilled pools, starting with the least fulfilled
     Notes
     Resource allocation is pool-based rather than job-based
     Pool: the min share is user-specified
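A sketch of the three-phase pass described above, assuming each pool is described only by its user-specified min share and its current demand, both in slots; names are illustrative, not the actual FairScheduler code.

```python
def allocate(total_slots, pools):
    """pools: {name: {'min_share': int, 'demand': int}}. Assumes the min
    shares collectively fit within total_slots."""
    alloc = {}
    # Phase 1: a pool whose min share already covers its demand just gets its demand.
    for name, p in pools.items():
        if p['min_share'] >= p['demand']:
            alloc[name] = p['demand']
    # Phase 2: every other pool is brought up to its min share.
    for name, p in pools.items():
        alloc.setdefault(name, p['min_share'])
    # Phase 3: hand the residual out one slot at a time to unfilled pools,
    # always to the pool that is currently least fulfilled (lowest alloc/demand).
    residual = total_slots - sum(alloc.values())
    while residual > 0:
        unfilled = [n for n, p in pools.items() if alloc[n] < p['demand']]
        if not unfilled:
            break
        n = min(unfilled, key=lambda n: alloc[n] / pools[n]['demand'])
        alloc[n] += 1
        residual -= 1
    return alloc

# allocate(10, {'A': {'min_share': 3, 'demand': 2},
#               'B': {'min_share': 2, 'demand': 8},
#               'C': {'min_share': 1, 'demand': 6}})
# -> {'A': 2, 'B': 5, 'C': 3}
```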
  7. [Fair Scheduler] Reschedule
     Policy: wait & kill
     Algorithm:
     Wait Tmin. If the min share is still not achieved, kill other pools' tasks.
     Wait Tfair. If the fair share is still not achieved, kill more.
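A sketch of that policy for a single pool, checked periodically by the scheduler; the field names and the concrete Tmin/Tfair values are assumptions for illustration.

```python
import time

T_MIN, T_FAIR = 60.0, 300.0   # assumed grace periods (seconds); Tmin < Tfair

def slots_to_preempt(pool, now=None):
    """pool: dict with 'running', 'min_share', 'fair_share' (slot counts) and
    'starved_since' (timestamp when it last dropped below its shares).
    Returns how many slots to reclaim by killing other pools' tasks."""
    now = time.time() if now is None else now
    waited = now - pool['starved_since']
    if waited >= T_FAIR and pool['running'] < pool['fair_share']:
        return pool['fair_share'] - pool['running']   # escalate: reclaim up to fair share
    if waited >= T_MIN and pool['running'] < pool['min_share']:
        return pool['min_share'] - pool['running']    # reclaim up to min share
    return 0                                          # keep waiting
```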
  8. [Fair Scheduler] Issues & Solutions
     Data locality
     Delay scheduling: addresses the sticky-slots issue (sketch below)
     IO-rate biasing: addresses hot-spot nodes
     Map/Reduce interdependency
     Copy-Compute Splitting: overlap the IO-intensive copy with CPU-intensive reducing
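A sketch of delay scheduling's core rule (the skip bound and all names are illustrative; the real scheduler bounds the wait in time, per locality level): when the job at the head of the fairness-sorted queue has no task local to the node that just freed a slot, skip it for a while rather than immediately launching a non-local task on that slot.

```python
MAX_SKIPS = 3   # assumed bound on how long a job may hold out for locality

def pick_task(node, jobs):
    """jobs: sorted most-starved first; each has .pending_tasks (tasks with
    .preferred_nodes) and a .skip_count."""
    for job in jobs:
        if not job.pending_tasks:
            continue
        local = [t for t in job.pending_tasks if node in t.preferred_nodes]
        if local:
            job.skip_count = 0
            return local[0]                    # data-local launch
        if job.skip_count >= MAX_SKIPS:
            job.skip_count = 0
            return job.pending_tasks[0]        # waited long enough: launch non-locally
        job.skip_count += 1                    # skip this job, offer the slot to the next
    return None                                # leave the slot idle this heartbeat
```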
  9. [Fair Scheduler] Tradeoffs
     Batch response time: fairness vs. utilization (throughput) tradeoff
     Average response time
     Space usage with intermediate data
     User isolation: “ability to provide worst-case performance comparable to owning a small private cluster regardless of user workload”
  10. [Fair Scheduler] To-Dos <done>
      Reschedule/reassignment
      FairScheduler keeps an UPDATE_INTERVAL; on each update it checks all pools for tasks to preempt, sets the status of those tasks, and places them in an action queue.
      The next heartbeat picks up the changes in task status and carries out the kills. (sketch below)
      Relationship between batch response time and throughput: they measure the same thing.
      Relationship between average response time and user isolation: they can be correlated, but not always. ART is not a quantitative measure of user isolation.
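A sketch of that two-step mechanism (helper names and fields are hypothetical, not the actual Hadoop code): the periodic update only marks victims; the kills are actually issued on the next heartbeat from the tasktracker that runs them.

```python
pending_kills = []   # the "action queue" from the note above (name assumed)

def update(pools):
    """Runs every UPDATE_INTERVAL: decide which tasks to preempt, but only mark them."""
    for pool in pools:
        for task in tasks_to_preempt_for(pool):    # hypothetical helper
            task.status = 'MARKED_FOR_KILL'
            pending_kills.append(task)

def on_heartbeat(tracker):
    """Runs when a tasktracker reports in: carry out the kills that belong to it."""
    global pending_kills
    mine = [t for t in pending_kills if t.tracker is tracker]
    for task in mine:
        tracker.kill(task)                         # the kill rides on the heartbeat reply
    pending_kills = [t for t in pending_kills if t.tracker is not tracker]
```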
  11. [Quincy]
      Model the problem as a flow network
      Flow network: a directed graph in which
      each edge e is annotated with a non-negative integer capacity and a cost, and
      each node v is annotated with an integer “supply”, where the total supply of the graph equals zero
      Construct the simplest graph in which the only hard constraint is no starvation
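A toy instance of that formulation using networkx's min-cost-flow solver (layout and costs are simplified; Quincy's real graph also has per-rack aggregator nodes, per-job unscheduled nodes, and costs derived from data transfer and preemption):

```python
import networkx as nx

G = nx.DiGraph()
tasks, machines = ['t1', 't2'], ['m1', 'm2']

# Each task node supplies one unit of flow (networkx: supply = negative demand);
# the sink absorbs everything, so the total supply over the graph is zero.
for t in tasks:
    G.add_node(t, demand=-1)
G.add_node('sink', demand=len(tasks))

for t in tasks:
    for m in machines:
        cost = 1 if (t, m) in {('t1', 'm1'), ('t2', 'm2')} else 5   # cheap where the data lives
        G.add_edge(t, m, capacity=1, weight=cost)
    G.add_edge(t, 'unscheduled', capacity=1, weight=20)             # leaving a task unplaced is costly

for m in machines:
    G.add_edge(m, 'sink', capacity=1, weight=0)                     # one slot per machine
G.add_edge('unscheduled', 'sink', capacity=len(tasks), weight=0)

flow = nx.min_cost_flow(G)       # task t is assigned to machine m iff flow[t][m] == 1
print(flow['t1'], flow['t2'])    # -> {'m1': 1, 'm2': 0, 'unscheduled': 0} {'m1': 0, 'm2': 1, 'unscheduled': 0}
```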
  12. Quincy vs. Fair Scheduler
  13. Readings
      MapReduce. Jeffrey Dean*
      Google: Cluster Computing and MapReduce
      Job Scheduling for Multi-User MapReduce Clusters. Matei Zaharia*
      Max-min fairness. Wikipedia + algorithm*
      Quincy. Michael Isard*
      An update on Google’s infrastructure
  14. Topic
      Before: existing systems predetermine a fixed allocation of resources/slots to queries/tasks. Intuitively, if resources can be allocated to tasks dynamically, they can be better utilized.
      After: enable the scheduler to make resource-aware decisions (IO, CPU, memory) + bring the fair scheduler from the pool level down to the job level.
  15. Tips from Prof Tan
      Keep references for all the literature reviewed and note where each piece was published
