Weighted Flowtime on Capacitated Machines


Published on

It is well-known that SRPT is optimal for minimizing

ow time on machines that run one job at a time.
However, running one job at a time is a big under-
utilization for modern systems where sharing, simultane-
ous execution, and virtualization-enabled consolidation
are a common trend to boost utilization. Such machines,
used in modern large data centers and clouds, are
powerful enough to run multiple jobs/VMs at a time
subject to overall CPU, memory, network, and disk
capacity constraints.
Motivated by this pr

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Weighted Flowtime on Capacitated Machines

  1. 1. Weighted Flowtime on Capacitated Machines Kyle Fox∗ Madhukar Korupolu University of Illinois at Urbana-Champaign Google Research Urbana, IL Mountain View, CA kylefox2@illinois.edu mkar@google.comAbstract has been a driving force for virtualization adoptionIt is well-known that SRPT is optimal for minimizing and transformation of IT in data centers.flow time on machines that run one job at a time. This adoption, in turn, is driving a proliferationHowever, running one job at a time is a big under- of compute farms and public, private, and hybridutilization for modern systems where sharing, simultane- clouds which share resources and machines amongous execution, and virtualization-enabled consolidation multiple users and jobs [1, 2, 4, 32]. Large computeare a common trend to boost utilization. Such machines, farms and clouds often have excess capacity provi-used in modern large data centers and clouds, are sioned for peak workloads and these can be sharedpowerful enough to run multiple jobs/VMs at a time with new users to improve utilization during off-peaksubject to overall CPU, memory, network, and diskcapacity constraints. times. Tasks such as MapReduce [21] which are Motivated by this prominent trend and need, in this I/O driven, are often run with multiple instanceswork, we give the first scheduling algorithms to minimize (of different MapReduces) on the same machine soweighted flow time on such capacitated machines. To they all progress toward completion simultaneously.capture the difficulty of the problem, we show that A characterization of workloads in Google computewithout resource augmentation, no online algorithm can clusters is given in [20, 32, 34].achieve a bounded competitive ratio. We then investigate A common scenario in all of these situations isalgorithms with a small resource augmentation in speed that of jobs (or VMs) arriving in an online mannerand/or capacity. Our first result is a simple (2 + ε)- with their resource requirements and processing timecapacity O(1/ε)-competitive greedy algorithm. Usingonly speed augmentation, we then obtain a 1.75-speed requirements and a scheduler scheduling them onO(1)-competitive algorithm. Our main technical result one of the machines in the cluster. Jobs leave whenis a near-optimal (1 + ε)-speed, (1 + ε)-capacity O(1/ε3 )- they are completed. The problem is interesting evencompetitive algorithm using a novel combination of with a single machine. A common goal in suchknapsacks, densities, job classification into categories, a scenario is that of minimizing average weightedand potential function methods. We show that our flow time, defined as the weighted average of theresults also extend to the multiple unrelated capacitated job completion times minus arrival times, i.e., themachines setting. total time spent by jobs in the system. Other1 Introduction common goals include load-balancing (minimizing maximum load on machines), minimizing averageFour trends are becoming increasingly prominent stretch (stretch of a job defined as completion timein modern data centers. Modern machines are minus arrival time divided by processing time), orincreasingly powerful and able to run multiple completing as many jobs as possible before theirjobs at the same time subject to overall capacity deadlines if there are deadlines. Adding weightsconstraints. Running only one job at a time on them lets us prioritize the completion of certain jobs overleads to significant under-utilization. Virtualization others. In fact, weighted flow time subsumes thetechnologies such as VMware [3] and Xen [13] are stretch metric if we set the weight of a job to be theenabling workloads to be packaged as VMs and reciprocal of its processing time. In this paper wemultiple VMs run on the same physical machine [5]. focus on the weighted flow time objective.This packing of VMs on the same machine en-ables server consolidation [6, 7, 15] and data center 1.1 Prior work on minimizing flow time:cost reduction in the form of reduced hardware, Single job at a time model. Traditionally,floorspace, energy, cooling, and maintenance costs. almost all the known flow time results have beenThis utilization boost and the resulting cost savings for machines that run at most one job at a time. ∗ Portions of this work were done while this author was Consider the following model. A server sees n jobsvisiting Google Research. arrive over time. Each job j arrives at time rj
  2. 2. and requires a processing time pj . At each point scalable. Chekuri et al. [18] gave the first speed-in time t, the server may choose to schedule one scalable online algorithms for minimizing flow timejob j that has already arrived (rj ≤ t). We say a on multiple machines with unit weights. Becchettijob j has been completed when it has been scheduled et al. [14] gave the first analysis that the simplefor pj time units. Preemption is allowed, so job j greedy algorithm Highest Density First (HDF) ismay be scheduled, paused, and then later scheduled speed-scalable with arbitrary job weights on a singlewithout any loss of progress. Each job j is given a machine. HDF always schedules the alive job withweight wj . The goal of the scheduler is to minimize the highest weight to processing time ratio. Bussemathe total weighted flow time of the system, defined and Torng [16] showed the natural generalizationas j wj (Cj −rj ) where Cj is the completion time of of HDF to multiple machines is also speed-scalablejob j. For this paper in particular, we are interested with arbitrary job weights. Table 1 gives a summaryin the clairvoyant online version of this problem of these results.where the server is not aware of any job j until its 1.2 The capacitated flow time problem.release, but once job j arrives, the server knows pj The above results assume that any machine canand wj . run at most one job at a time. This assumption It is well known that Shortest Remaining Pro- is wasteful and does not take advantage of thecessing Time (SRPT) is the optimal algorithm sharing, simultaneous execution, and virtualizationfor minimizing flow time with unit weights in the abilities of modern computers. The current trend inabove single-job machine model. In fact, SRPT industry is not only to focus on distributing largeis the best known algorithm for unit weights on workloads, but also to run multiple jobs/VMs atmultiple machines in both the online and offline the same time on individual machines subject tosetting. Leonardi and Raz [30] show SRPT has com- overall capacity constraints. By not addressing thepetitive/approximation ratio O(log(min {n/m, P })) sharing and virtualization trend, theory is gettingwhere m is the number of machines and P is the ratio out of touch with the industry.of the longest processing time to shortest. They also The problem of scheduling an online sequencegive a matching lower bound for any randomized of jobs (VMs) in such scenarios necessitates theonline algorithm. Chekuri, Khanna, and Zhu [19] “capacitated machines” variant of the flow timeshow every randomized online algorithm has com- problem. In addition to each job j having apetitive ratio Ω(min W 1/2 , P 1/2 , (n/m)1/4 ) when processing time and weight, job j also has a height hjthe jobs are given arbitrary weights and there are that represents how much of a given resource (e.g.multiple machines. Bansal and Chan [11] partially RAM, network throughput) the job requires to beextended this result to show there is no deterministic scheduled. The server is given a capacity that weonline algorithm with constant competitive ratio assume is 1 without loss of generality. At eachwhen the jobs are given arbitrary weights, even point in time, the server may schedule an arbitrarywith a single machine. However, there are some number of jobs such that the sum of heights forpositive online results known for a single machine jobs scheduled is at most the capacity of the server.where the competitive ratio includes a function of We assume all jobs have height at most 1, or somethe processing times and weights [12, 19]1 . jobs could never be scheduled. The setting can be The strong lower bounds presented above moti- easily extended to use multiple machines and multi-vate the use of resource augmentation, originally dimensional heights, but even the single-machine,proposed by Kalyanasundaram and Pruhs [28], single-dimension case given above is challenging fromwhich gives an algorithm a fighting chance on worst a technical standpoint. We focus on the single-case scheduling instances when compared to an machine, single-dimension problem first, and lateroptimal offline algorithm. We say an algorithm show how our results can be extended to multipleis s-speed c-competitive if for any input instance I, machines. We leave further handling of multiplethe algorithm’s weighted flow time while running at dimensions as an interesting open problem.speed s is at most c times the optimal schedule’s The case of jobs having sizes related to machineweighted flow time while running at unit speed. capacities has been studied before in the context ofIdeally, we would like an algorithm to be constant load-balancing [9, 10] and in operations research andfactor competitive when given (1 + ε) times a systems [6,15,31,33]. However, this paper is the firstresource over the optimal offline algorithm for any instance we are aware of that studies capacitatedconstant ε > 0. We call such algorithms [resource]- machines with the objective of minimizing flow time. 1 The result of [19] is actually a semi-online algorithm in We call this the Capacitated Flow Time (CFT)that P is known in advance. problem. As flow time is a fundamental scheduling
  3. 3. Capacitated Weights Machines Competitive Ratio no unit single 1 [folklore] no unit multiple Ω(log n/m + log P ) [30] O(log min {n/m, P }) [30] (1 + ε)-speed O(1/ε)-comp. [18] no arbitrary single ω(1) [11] O(log2 P ) [19, (semi-online)] O(log W ) [12] O(log n + log P ) [12] (1 + ε)-speed O(1/ε)-comp. [14] no arbitrary multiple Ω min W 1/2 , P 1/2 , (n/m)1/4 [19] (1 + ε)-speed O(1/ε2 )-comp. [16] yes unit single (1.2 − ε)-capacity Ω(P ) [this paper] (1.5 − ε)-capacity Ω(log n + log P ) [this paper] yes weighted single (2 − ε)-capacity ω(1) [this paper] yes weighted multiple (2 + ε)-capacity O(1/ε)-comp. [this paper] 1.75-speed O(1)-comp. [this paper] (1 + ε)-speed (1 + ε)-capacity O(1/ε3 )-comp. [this paper]Table 1. A summary of prior and new results in online scheduling to minimize flowtime. Here, W denotes the number of distinctweights given to jobs, and P denotes the ratio of the largest to least processing time.objective, we believe this is an important problem HRDF uses greedy knapsack packing with jobto study and bridge the gap to practice. residual densities (defined later). The algorithm and analysis are tight for capacity-only augmentation1.3 Our results and outline of the paper. given the lower bound in Theorem 2.4. We then turnIn Section 2, we show that online scheduling to the question of whether speed augmentation helpsin the capacitated machines case is much more beat the (2 − ε)-capacity lower bound. Our initialdifficult than scheduling in the single-job model. In attempt and result (not described in this paper)particular, Theorem 2.1 shows there exists no online was an s-speed u-capacity constant-competitiverandomized algorithm for the CFT problem with a algorithm for all s and u with s · u ≥ 2 + ε. Webounded competitive ratio, even when all jobs have subsequently obtained an algorithm High Priorityunit weight. This result stands in stark contrast Category Scheduling (HPCS) which breaks throughto the same situation where SRPT is actually an the 2 bound. HPCS is 1.75-speed O(1)-competitiveoptimal algorithm for the single job at a time model. and is described in Section 4. Motivated by the difficulty of online scheduling, Our main result is a (1 + ε)-speed (1 + ε)-we consider the use of resource augmentation in capacity O(1/ε3 )-competitive algorithm called Knap-speed and capacity. In a capacity augmentation sack Category Scheduling (KCS). KCS builds onanalysis, an algorithm is given capacity greater HPCS and is presented in Section 5. The KCSthan 1 while being compared to the optimal schedule result shows the power of combining both speed andon a unit capacity server. We define u-capacity capacity augmentation and is the first of its kind.2c-competitive accordingly. We also say an algo- Pseudocode descriptions of all our algorithms arerithm is s-speed u-capacity c-competitive if it is c- given in Figures 1, 3, and 4. We conclude with acompetitive when given both s-speed and u-capacity brief summary of our results as well as some openover the optimal unit resource schedule. We show problems related to the CFT problem in Section 6.in Corollary 2.2 and Theorem 2.3 that our lowerbound holds even when an algorithm is given a small All of our results can be extended to the multipleconstant factor extra capacity without additional unrelated capacitated machine setting as detailedspeed. In fact, strong lower bounds continue to exist in Appendix D. In the unrelated machines model,even as the augmented capacity approaches 2. each machine-job pair is given a weight, processing time, and height unrelated to any other machine-job In Section 3, we propose and analyze a greedy pairs.algorithm Highest Residual Density First (HRDF)for the CFT problem and show it is (2 + ε)-capacity 2 Inpractice, machines are rarely filled to full capacity, soO(1/ε)-competitive for any small constant ε > 0. small augmentations in capacity are not unreasonable.
  4. 4. Technical novelty: Our algorithms use novel Suppose case (b) occurs. We release one jobcombinations and generalizations of knapsacks, job of type II at each time step t = T, T + P, T +densities, and grouping of jobs into categories. While 2P, . . . , L, L T . There are always T /2 · P jobsHRDF uses knapsacks and job densities and HPCS alive in A’s schedule during [T, L]. So A’s total flowuses job densities and categories, KCS combines time is Ω(T · L/P ). The optimal schedule finishesall three. In fact, our algorithms are the first we all jobs of type II during [0, T ]. It has 2T jobs ofare aware of that do not make flow time scheduling type I alive at time T , but it finishes these jobsdecisions based on a single ordering of jobs. Our during [T, 3T ] while also processing jobs of type II.analysis methods combining potential functions, So, after time 3T , the optimal schedule has no moreknapsacks, and categories are the first of their kind alive jobs and its total flow time is O(L). We getand may have other applications. the theorem by setting T ≥ P 2 .2 Lower Bounds Corollary 2.2. Any randomized onlineWe begin the evaluation of the CFT problem by algorithm A for the CFT problem is (1.2 − ε)-giving lower bounds on the competitive ratio for any capacity Ω(P )-competitive for any constant ε > 0online scheduling algorithm. Some lower bounds even when all jobs have unit weight.hold even when jobs are given unit weights. Recall Pdenotes the ratio of the largest processing time to In fact, superconstant lower bounds continuethe least processing time. to exist even when an online algorithm’s server is given more than 1.2 capacity. When we stopTheorem 2.1. Any randomized online algorithm A requiring input instances to have unit weights, thenfor the CFT problem has competitive ratio Ω(P ) the lower bounds become resilient to even largereven when all jobs have unit weight. capacity augmentations.Proof: Garg and Kumar [24] consider a variant on Theorem 2.3. Any randomized online algorithm Athe single-job capacity identical machines model they for the CFT problem is (1.5 − ε)-capacitycall Parallel Scheduling with Subset Constraints. Ω(log n + log P )-competitive for any constant ε > 0Our proof very closely follows the proof of [24, even when all jobs have unit weight.Theorem 3.2]. While our proof is a lower boundon any deterministic algorithm, it easily generalizes Proof: There is a simple reduction from onlineto randomized algorithms with oblivious adversaries. scheduling in the single-job identical machines modelFurther, the proof continues to hold even with to the capacitated server model. Let I be any inputsmall amounts of capacity augmentation. Fix a instance for the single-job identical machines modeldeterministic online algorithm A. We use only one where there are two machines and all jobs havemachine and all jobs have unit weight. There are unit weight. We create an input instance I usingtwo types of jobs: Type I jobs have unit processing the same jobs as I with each job given height 0.5.time and height 0.4; Type II jobs have processing Any schedule of I is a schedule for I. The lowertime P 1 and height 0.6. Let T P be a bound of [30, Theorem 9] immediately implies thelarge constant. On each time step t = 0, . . . , T − 1, theorem.we release two jobs of type I. At each time stept = 0, P, 2P, . . . , T − 1, we release one job of type Theorem 2.4. No deterministic onlineII (we assume T − 1 is a multiple of P ). The total algorithm A for the CFT problem is constantprocessing time of all released jobs is 3T . The server competitive even when given (2 − ε)-capacity forcan schedule two type I jobs at once, and one type any constant ε > 0 assuming jobs are allowedI job with a type II job, but it cannot schedule two arbitrary weights.type II jobs at once. Therefore, one of the followingmust occur: (a) at least T /2 jobs of type I remain Proof: We apply a similar reduction to that givenalive at time T , or (b) at least T /2 · P jobs of type in the proof of Theorem 2.3. Let I be any inputII remain alive at time T . instance for the single-job identical machines model Suppose case (a) occurs. We release 2 jobs of where there is one machine and jobs have arbitrarytype I at each time step t = T, . . . , L, L T . The weight. We create an input instance I using thetotal flow time of A is Ω(T ·L). However, the optimal same jobs as I with each job given height 1. Anyschedule finishes all jobs of type I during [0, T ]. It has schedule of I is a schedule for I. The loweronly T /P jobs (of type II) waiting at each time step bound of [11, Theorem 2.1] immediately impliesduring [T, L], so the optimal flow time is O(T · L/P ). the theorem.
  5. 5. HRDF(I, t): /* QA (t) : Jobs alive at time t. pA (t): Remaining processing time of j */ j 1. For each j ∈ QA (t): dA (t) ← wj /(hj pA (t)) /* Assign residual densities */ j j 2. Sort QA (t) by dA (t) descending and pack jobs from QA (t) up to capacity j Figure 1. The algorithm HRDF.3 Analysis Warmup: Highest Residual Den- We give the remaining notation for our potential sity First function. Let pO (t) be the remaining processing time jIn this section, we analyze the deterministic online of job j at time t in the optimal schedule. Let QO (t)algorithm Highest Residual Density First (HRDF) be the set of alive jobs in the optimal schedule atfor the CFT problem. A description of HRDF time t. Let QAj (t) = k ∈ QA (t) : dA (t) > dA (t) k jappears in Figure 1. Despite the simplicity of the be the set of alive jobs in HRDF’s schedule morealgorithm, we are able to show the following result dense than job j. Let S A (t) and S O (t) be the setmatching the lower bound of Theorem 2.4. of jobs processed at time t by HRDF’s and the optimal schedule respectively. Finally, let W A (t) =Theorem 3.1. HRDF is (2 + ε)-capacity O(1/ε)- j∈QA (t) wj be the total weight of all jobs alive in HRDF’s schedule at time t. The rest of the notationcompetitive for any constant ε where 0 < ε < 1. we use in our potential function is given in Figure 1 and Table 2.3.1 Analysis of HRDF. We use a potential The potential function we use has two terms forfunction analysis to prove Theorem 3.1. Set an each job j:input instance I, and consider running HRDF with2+ε capacity alongside the optimal schedule without wj ΦA (t) = j Rj (t) + (1 + ε)pA (t) − Vj> (t) jresource augmentation. We begin our analysis by ε A + Yj (t) + Fj (t)defining some notation. For any job j let Cj be thecompletion time of j in HRDF’s schedule and let Cj O wj < ΦO (t) = j R (t)be the completion time of j in the optimal schedule. ε j AFor any time t ≥ rj , let Aj (t) = min Cj , t − rj The potential function is defined asbe the accumulated weighted flow time of job j at Atime t and let Aj = Aj (Cj ). Finally, let A = j Aj Φ(t) = ΦA (t) − ΦO (t). j jbe the total weighted flow time for HRDF’s schedule. j∈QA (t) j∈QO (t)Let OPTj (t), OPTj , and OPT be defined similarlyfor the optimal schedule. 3.2 Main proof idea: Using extra capacity Our analysis of HRDF builds on the potential to keep up with the optimal. We give fullfunction approach outlined in [27], specifically [17, details of the potential function analysis in Ap-22, 23], for the single-job at a time problem and pendix A including the effects from job arrivalsextends it to the capacitated case. Unlike in [22, 23], and completions, but briefly mention here someour method supports arbitrary job weights and of the continuous changes made to A + Φ due tocapacities, and unlike in [17], we do not use the processing from HRDF and the optimal schedule.notion of fractional flow time.3 We give a potential Intuitively, these changes compare S A (t) and S O (t),function Φ that is almost everywhere differentiable the jobs processed at time t by HRDF’s and thesuch that Φ(0) = Φ(∞) = 0 and then focus optimal schedule respectively. As is the case foron bounding all continuous and discrete increases many potential function arguments, we must showto A + Φ across time by a function of OPT in order that HRDF removes the remaining area of jobs moreto bound A by a function of OPT.4 efficiently than the optimal schedule. The extra 3 Avoiding the use of fractional flow time allows us to efficiency shows Φ falls fast enough to counteractanalyze HRDF using capacity-only augmentation without changes in A. Showing an algorithm’s schedulespeed augmentation [17]. It also saves us a factor of Ω(1/ε) removes area more efficiently than the optimalin the competitive ratio for KCS. schedule is often the most difficult part of a potential 4 An analysis of an HRDF-like algorithm for the unrelatedsingle-job machine problem was given in [8] by relaxing an integrality gap of the natural linear program for our settinginteger program and applying dual fitting techniques. We is unbounded for the small amounts of resource augmentionfocus on potential function arguments in this work as the we use in later algorithms.
  6. 6. Term Definition Explanation Total remaining area of jobs more dense than j. Continuous Rj (t) k∈QAj (t) hk pA (t) k decreases in Rj counteract continuous increases in A. Total remaining area of jobs more dense than j upon j’s < Rj (t) k∈QAj (rj ) hk pA (t) k < arrival. The decrease from Rj cancels the increase to Φ caused by Rj when j arrives. The optimal schedule’s total remaining area for each job k more dense than j upon k’s arrival. The increase to Rj Vj> (t) k∈QO (t)∩QAj (rk ) hk pO (t) k caused by the arrival of a job k is canceled by an equal increase in Vj> . Yj (t) = 0 for all t < rj and Captures work done by the optimal schedule that does not Yj (t) d dt Yj (t) = k∈(S O (t)QAj (rk )) hk for affect Vj> . Used to bound increases in Φ from the removal < all t ≥ rj of Rj terms when jobs are completed. d Fj (t) = 0 for all t < rj , dt Fj (t) = 0 if j ∈/ Captures the increase in flow time by the optimal schedule Fj (t) S (t), and dt Fj (t) = hj k∈QO (t) wk oth- A d wj while our schedule processes j. Used to bound increases erwise. in Φ from the removal of Vj terms when jobs are completed. Table 2. A summary of notation used in our potential functions. Input HRDF OPT 0.8 0.3 time time timeFigure 2. A lower bound example for HRDF. Type I jobs appear every time step as green squares with height 0.3 and weight 2while type II jobs appear at time steps {0, 1} mod 3 as purple rectangles with height 0.8 and weight 1. The process continuesuntil time T 1. Left: The input set of jobs. Center: As type I jobs have higher residual density, HRDF picks one at each timestep as it arrives. Right: Optimal schedule uses type II jobs at time steps {0, 1} mod 3 and three type I jobs at time steps 2mod 3. dfunction analysis and greatly motivates the other process j. Therefore, dt Rj (t) ≤ −(1 + ε) andalgorithms given in this paper. In our potentialfunction, rate of work is primarily captured in the d wj 1+ε A (Rj + (1 + ε)pA )(t) ≤ − j W (t).Rj , pA , Vj , and Yj terms. dt ε ε j j∈QA (t)Lemma 3.2. If HRDF’s schedule uses 2+ε capacity The optimal schedule is limited to schedulingand the optimal schedule uses unit capacity, then jobs with a total height of 1. No job contributes to both Vj> and Yj , so d wj ΦA − j Fj (t) = dt ε d wj 1j∈QA (t) (−Vj> + Yj )(t) ≤ W A (t). d wj dt ε ε j∈QA (t) (Rj + (1 + ε)pA − Vj> + Yj )(t) j dt ε j∈QA (t) ≤ −W A (t). 4 Beyond Greedy: High Priority CategoryProof: Consider any time t. For any job j, Schedulingneither Rj nor (1 + ε)pA can increase due to j The most natural question after seeing the abovecontinuous processing of jobs. If HRDF’s schedule analysis is whether there exists an algorithm that is dprocesses j at time t, then dt (1 + ε)pA (t) = −(1 + ε). j constant factor competitive when given only speedIf HRDF’s schedule does not process j, then HRDF’s augmentation. In particular, a (1+ε)-speed constantschedule processes a set of more dense jobs of height competitive algorithm for any small constant ε wouldat least (1 + ε). Otherwise, there would be room to be ideal.
  7. 7. HPCS(I, t, κ): /* κ is an integer determined in analysis */ 1. For each 1 ≤ i < κ: /* Divide jobs into categories by height */ QA (t) ← j ∈ QA (t) : 1/(i + 1) < hj ≤ 1/i /* Assign alive category i jobs */ i Wi (t) ← j∈QA (t) wj /* Assign queued weight of category i */ i For each j ∈ QA (t): hj ← 1/i/* Assign modified height of job j */ i QA (t) ← j ∈ QA (t) : hj ≤ 1/κ /* X is the small category */ X WX (t) ← j∈QA (t) wj X For each j ∈ QA (t): hj ← hj X 2. For each j ∈ QA (t): dA (t) ← wj /(hj pA (t)) /* Assign densities by modified height */ j j 3. i∗ ← highest queued weight category Sort QA (t) by dA (t) descending and pack jobs from QA (t) up to capacity i∗ j i∗ Figure 3. The algorithms HPCS. Unfortunately, the input sequence given in Theorem 4.1. HPCS is 1.75-speed O(1)-Figure 2 shows that greedily scheduling by residual competitive.density does not suffice. HRDF’s schedule willaccumulate two type II jobs every three time steps, 4.2 Analysis of HPCS. We use a potentialgiving it a total flow time of Ω(T 2 ) by time T . function argument to show the competitiveness ofThe optimal schedule will have flow time O(T ), HPCS. The main change in our proof is that thegiving HRDF a competitive ratio of Ω(T ). Even terms in the potential function only consider jobsif we change HRDF to perform an optimal knapsack within a single category at a time. For any fixedpacking by density or add some small constant input instance I, consider running HPCS’s scheduleamount of extra speed, the lower bound still holds. at speed 1.75 alongside the optimal schedule at unit speed. For a non-small category i job j, let4.1 Categorizing jobs. The above example QAj (t) = k ∈ QA (t) : dA (t) > dA (t) be the set i i k jshows that greedily scheduling the high value jobs of alive category i jobs more dense than job j.and ignoring others does not do well. In fact, type II Define QAj (t) similarly. We use the notation in Xjobs, though lower in value individually, cannot be Figure 3 along with the notation given in theallowed to build up in the queue forever. To address proof for HRDF, but we use the modified heightthis issue, we use the notion of categories and group in every case height is used. Further, we restrictjobs of similar heights together. The intuition is the definitions of Rj , Rj , Vj> , Yj , and Fj to <that once a category or group of jobs accumulates include only jobs in the same category as j. Forenough weight, it becomes important and should be example, if job j belongs to category i then Rj (t) =scheduled. The algorithm HPCS using this intuition Ais given in Figure 3. Note that HPCS schedules from k∈QAj (t) hk pk (t). ione category at a time and packs as many jobs as Let λ be a constant we set later. The potentialpossible from that category. function we use has two terms for each job j. Within the category HPCS chooses, we canbound the performance of HPCS compared to the ΦA (t) = λκwj Rj (t) + pA (t) − Vj> (t) j joptimal. The only slack then is the ability of + Yj (t) + Fj (t)the optimal schedule to take jobs from multiple ΦO (t) j < = λκwj Rj (t)categories at once. However, using a surprisingapplication of a well known bound for knapsacks,we show that the optimal schedule cannot affect the The potential function is again defined aspotential function too much by scheduling jobs frommultiple categories. In fact, we are able to show Φ(t) = ΦA (t) − j O Φj (t).HPCS is constant competitive with strictly less than j∈QA (t) j∈QO (t)double speed.5 with speed strictly less than 1.75, but we do not have a closed form expression for the minimum speed possible using our 5 We note that we can achieve constant competitiveness techniques.
  8. 8. 4.3 Main proof idea: Knapsack upper packing problem. For each non-small category i,bounds on the optimal efficiency. As before, there are i items in K each with value λκW ∗ (t)/iwe defer the full analysis for Appendix B, but we and size 1/(i+1). These items represent the optimalmention here the same types of changes mentioned schedule being able to process i jobs at once fromin Section 3. Our goal is to show the optimal category i. Knapsack instance K also contains oneschedule cannot perform too much work compared item for each category X job j alive in the optimalto HPCS’s schedule even if the optimal schedule schedule with value λκW ∗ (t)hj and size hj . Theprocesses jobs from several categories at once. Our capacity for K is 1. A valid solution to K is aproof uses a well known upper bound argument set U of items of total size at most 1. The valuefor knapsack packing. Consider any time t and of a solution U is the sum of values for items in U .let W ∗ (t) be the queued weight of the category The optimal solution to K has a value at least asscheduled by HPCS at time t. large as the largest possible increase to Φ due to changes in the Vj and Yj terms. Note that we cannotLemma 4.2. If HPCS’s schedule uses 1.75 speed pack an item of size 1 and two items of size 1/2and the optimal schedule uses unit speed, then for in K. Define K1 and K2 to be K with one item ofall κ ≥ 200 and λ ≥ 100 size 1 removed and one item of size 1/2 removed respectively. The larger of the optimal solutions d ΦA (t) − λκwj Fj (t) = to K1 and K2 is the optimal solution to K. dt j j∈QA (t) Let the knapsack density of an item be the ratio d of its value to its size. Let v1 , v2 , . . . be the values λκwj (Rj +pA − Vj + Yj )(t) j of the items in K in descending order of knapsack dt j∈QA (t) density. Let s1 , s2 , . . . be the sizes of these same ≤ −κW ∗ (t). items. Greedily pack the items in decreasing order of knapsack density and let k be the index of theThere are κ categories, so this fall counteracts the first item that does not fit. Folklore states thatincrease in A. v1 + v2 + · · · + αvk is an upper bound on the optimal solution to K where α = 1−(s1 +s2sk +···+sk−1 ) . ThisProof: Suppose HPCS schedules jobs from small fact applies to K1 and K2 as well.category X. Let j be some job in category X. In all three knapsack instances described,If HPCS’s schedule processes j at time t, then ordering the items by size gives their order by d Adt pj (t) = −1.75. If HPCS’s schedule does not knapsack density. Indeed, items of size 1/iprocess j at time t, then it does process jobs from have knapsack density λκW ∗ (t)(i + 1)/i. Itemscategory X with height at least 1 − 1/κ. Therefore, created from category X jobs have knapsack d Xdt Rj (t) ≤ −1.75(1 − 1/κ). Similar bounds hold density λκW ∗ (t). We may now easily verifyif HPCS’s schedule processes jobs from non-small that 1.45λκW ∗ (t) and 1.73λκW ∗ (t) are uppercategory i, so bounds for the optimal solutions to K1 and K2 respectively. Therefore, d λκwj (Rj + pA )(t) ≤ j d dt λκwj (−Vj + Yj )(t) ≤ 1.73λκW ∗ (t). j∈QA (t) dt i∈QA (t) 1 ∗ − 1.75λ 1 − κW (t). κ Bounding the increases to Φ from changes to 5 Main Algorithm: Knapsack Categorythe Vj and Yj terms takes more subtlety than before. SchedulingWe pessimistically assume every category has queuedweight W ∗ (t). We also imagine that for each non- As stated in the previous section, a limitation ofsmall category i job j processed by the optimal HPCS is that it chooses jobs from one categoryschedule, job j has height 1/(i + 1). The optimal at a time, putting it at a disadvantage. In thisschedule can still process the same set of jobs with section, we propose and analyze a new algorithmthese assumptions, and the changes due to Vj and Yj called Knapsack Category Scheduling (KCS) whichterms only increase with these assumptions. uses a fully polynomial-time approximation scheme We now model the changes due to the Vj and Yj (FPTAS) for the knapsack problem [25,29] to chooseterms as an instance K to the standard knapsack jobs from multiple categories. Categories still play
  9. 9. KCS(I, t, ε): /* ε is any real number with 0 < ε ≤ 1/2 */ 1. For each integer i ≥ 0 with (1 + ε/2)i < 2/ε /* Divide jobs into categories by height */ QA (t) ← {j ∈ QA (t) : 1/(1 + ε/2)i+1 < hj ≤ 1/(1 + ε/2)i } i Wi (t) ← j∈QA (t) wj i For each j ∈ QA (t): hj ← 1/(1 + ε/2)i i QA (t) ← QA (t) ∪i QA (t) /* Category X jobs have height at most ε/2 */ X i WX (t) ← j∈QA (t) wj X For each j ∈ QA (t): hj ← hj X 2. For each j ∈ QA (t): dA (t) ← wj /(hj pA (t)) /* Assign densities by modified height */ j j 3. Initialize knapsack K(t) with capacity (1 + ε)/* Setup a knapsack instance to choose jobs */ For each category i and category X; for each j ∈ QA (t) i QAj (t) ← k ∈ QA (t) : dA (t) > dA (t) /* Assign category i jobs more dense than j */ i i k j Add item j to K(t) with size hj and value wj + hj k∈(QA (t)QAj (t){j}) wk i i 4. Pack jobs from a (1 − ε/2)-approximation for K(t) Figure 4. The algorithm KCS.a key role in KCS, as they help assign values to jobs KCS. Let κ be the number of categories used bywhen making scheduling decisions. KCS. Observe Detailed pseudocode for KCS is given in Figure 4.Steps 1 and 2 of KCS are similar to those of HPCS, κ ≤ 2 + log1+ε/2 (2/ε) = O(1/ε2 ).except that KCS uses geometrically decreasingheights for job categories.6 Another interesting The potential function has two terms for each job j.aspect of KCS is the choice of values for jobs in step4. Intuitively, the value for each knapsack item j is 8κwj ΦA (t) = j Rj (t) + pA (t) − Vj> (t) jchosen to reflect the total weight of less dense jobs εwaiting on j’s completion in a greedy scheduling + Yj (t) + Fj (t)of j’s category. More practically, the values are 8κwj <chosen so the optimal knapsack solution will affect ΦO (t) = j Rj (t) εthe potential function Φ as much as possible. KCS combines the best of all three techniques: The potential function is defined asknapsack, densities, and categories, and we are ableto show the following surprising result illustrating Φ(t) = ΦA (t) − j ΦO (t). jthe power of combining speed and capacity augmen- j∈QA (t) j∈QO (t)tation. Picking categories: Full details of the analysisTheorem 5.1. KCS is (1+ε)-speed (1+ε)-capacity appear in Appendix C, but we mention here theO(1/ε3 )-competitive. same types of changes mentioned in Section 3. The optimal schedule may cause large continuous5.1 Main proof idea: Combining categories increases to Φ by scheduling jobs from multiple ‘highoptimally. We use a potential function analysis weight’ categories. We show that KCS has the optionto prove Theorem 5.1. Fix an input instance I and of choosing jobs from these same categories. Becauseconsider running KCS’s schedule with (1 + ε) speed KCS approximates an optimal knapsack packing,and capacity alongside the optimal schedule with KCS’s schedule performs just as much work as theunit resources. We continue to use the same notation optimal schedule. Fix a time t. Let W ∗ (t) be theused for HPCS, but modified when necessary for maximum queued weight of any category in KCS’s schedule at time t. 6 As we shall see in the analysis, knapsacks and capacityaugmentation enable KCS to use geometrically decreasingheights, where as HPCS needs harmonically decreasing Lemma 5.2. If KCS’s schedule uses 1 + ε speedheights to pack jobs tightly and get a bound below 2. and capacity and the optimal schedule uses unit
  10. 10. speed and capacity, then then there are jobs with higher height density in U with total height at least HX . d 8κwj ΦA − j Fj (t) = dt ε d 8κwj j∈QA (t) (4) Rj + pA (t) ≤ j d 8κwj dt ε j∈QA (t) (Rj +pA − Vj> + Yj )(t) j X dt ε 8κWi (t) j∈QA (t) − HX . ≤ −κW ∗ (t). εProof: Suppose the optimal schedule processes Ti Let ∆U be the sum of the right hand sides of (3)jobs from non-small category i which have a total and (4).unmodified height of Hi . Then, Recall the knapsack instance K(t) used by KCS. Let U be a solution to K(t) and d 8κwj(1) −Vj> + Yj (t) ≤ let f (U ) be the value of solution U . Let i dt ε ∆A = d 8κwj (Rj + pA )(t). If KCS j∈QA (t) i j∈Q A (t) dt ε j 8κWi (t) schedules the jobs corresponding to the U , then Ti . ∆A = − 8(1+ε)κ f (U ). Let U ∗ be the optimal i (1 + ε/2)i ε ε solution to K(t) and let ∆∗ = − 8(1+ε)κ f (U ∗ ). εLet HX be the capacity not used by the optimal Because U as defined above is one potential solutionschedule for jobs in non-small categories. We see to K(t) and KCS runs an FPTAS on K(t) to decide which jobs are scheduled, we see d 8κwj(2) −Vj> + Yj (t) ≤ dt ε ∆O + ∆A ≤ ∆O + (1 − ε/2)∆∗ j∈QA (t) X 8κWX (t) 8(1 + ε/4)κ HX . ≤ ∆O − f (U ∗ ) ε ε ≤ ∆O + ∆U − 2κf (U ∗ )Let ∆O be the sum of the right hand sides of (1) and(2). Note that ∆O may be larger than the actual = −2κf (U ∗ ).rate of increase in Φ due to the Vj and Yj terms. Now, consider the following set of If category X has the maximum queued weight,jobs U ∈ QA (t). For each non-small category i, then one possible solution to K(t) is to greedilyset U contains the min |QA (t)|, Ti most dense jobs schedule jobs from category X in descending order ifrom QA (t). Each of the jobs chosen from QA (t) has of height density. Using previously shown tech- i iheight at most 1/(1 + ε/2)i and the optimal schedule niques, we see f (U ∗ ) ≥ W ∗ (t). If a non-smallprocesses Hi height worth of jobs, each with height category i has maximum queued weight, then aat least 1/(1 + ε/2)i+1 . Therefore, the jobs chosen possible solution is to greedily schedule jobs fromfrom QA (t) have total height at most (1 + ε/2)Hi . category i in descending order of height density. iSet U also contains up to HX + ε/2 height worth of Now we see f (U ∗ ) ≥ (1/2)W ∗ (t). Using our boundjobs from QA (t) chosen greedily in decreasing order on ∆O + ∆A , we prove the lemma. Xof density. The jobs of U together have height atmost i (1 + ε/2)Hi + HX + ε/2 ≤ 1 + ε. 6 Conclusions and Open Problems Now, suppose KCS schedules the jobs from In this paper, we introduce the problem of weightedset U at unit speed or faster. Consider a non-small flow time on capacitated machines and give the firstcategory i. Observe Ti ≤ (1 + ε/2)i . Using similar upper and lower bounds for the problem. Our maintechniques to those given earlier, we observe result is a (1 + ε)-speed (1 + ε)-capacity O(1/ε3 )- d 8κwj competitive algorithm that can be extended to the(3) Rj + pA (t) ≤ j dt ε unrelated multiple machines setting. However, it i j∈QA (t) i remains an open problem whether (1 + ε)-speed and 8κWi (t) unit capacity suffices to be constant competitive −Ti . i (1 + ε/2)i ε for any ε > 0. Further, the multi-dimensional case remains open, where jobs can be scheduled Now, consider any job j from small category X. simulateously given that no single dimension of the dIf j ∈ U , then dt pA (t) =≤ −1 ≤ −HX . If j ∈ U , j / server’s capacity is overloaded.
  11. 11. Acknowledgements. The authors would like to [15] N. Bobroff, A. Kochut, and K. Beaty. Dynamicthank Chandra Chekuri, Sungjin Im, Benjamin placement of virtual machines for managing slaMoseley, Aranyak Mehta, Gagan Aggarwal, and violations. Proc. 10th Ann. IEEE Symp. Integratedthe anonymous reviewers for helpful discussions Management (IM), pp. 119 – 128. 2007.and comments on an earlier version of the paper. [16] C. Bussema and E. Torng. Greedy multiprocessorResearch by the first author is supported in part by server scheduling. Oper. Res. Lett. 34(4):451–458, 2006.the Department of Energy Office of Science Graduate [17] J. Chadha, N. Garg, A. Kumar, and V. N. Mu-Fellowship Program (DOE SCGF), made possible in ralidhara. A competitive algorithm for minimizingpart by the American Recovery and Reinvestment weighted flow time on unrelated machines withAct of 2009, administered by ORISE-ORAU under speed augmentation. Proc. 41st Ann. ACM Symp.contract no. DE-AC05-06OR23100. Theory Comput., pp. 679–684. 2009.References [18] C. Chekuri, A. Goel, S. Khanna, and A. Kumar. [1] Amazon elastic compute cloud (amazon ec2). http: Multi-processor scheduling to minimize flow time //aws.amazon.com/ec2/ . with epsilon resource augmentation. Proc. 36th [2] Google appengine cloud. http://appengine.google. Ann. ACM Symp. Theory Comput., pp. 363–372. com/ . 2004. [3] Vmware. http://www.vmware.com/ . [19] C. Chekuri, S. Khanna, and A. Zhu. Algorithms [4] Vmware cloud computing solutions for minimizing weighted flow time. Proc. 33rd Ann. (public/private/hybrid). http://www.vmware. ACM Symp. Theory Comput., pp. 84–93. 2001. com/solutions/cloud-computing/index.html . [20] V. Chudnovsky, R. Rifaat, J. Hellerstein, B. Sharma, [5] Vmware distributed resource scheduler. http:// and C. Das. Modeling and synthesizing task www.vmware.com/products/drs/overview.html . placement constraints in google compute clusters. Symp. on Cloud Computing. 2011. [6] Vmware server consolidation solutions. http: //www.vmware.com/solutions/consolidation/ [21] J. Dean and S. Ghemawat. Mapreduce: Simplified index.html . data processing on large clusters. Proc. 6th Symp. [7] Server consolidation using VMware ESX Server, on Operating System Design and Implementation. 2005. http://www.redbooks.ibm.com/redpapers/ 2004. pdfs/redp3939.pdf . [22] K. Fox. Online scheduling on identical machines [8] S. Anand, N. Garg, and A. Kumar. Resource using SRPT. Master’s thesis, University of Illinois, augmentation for weighted flow-time explained by Urbana-Champaign, 2010. dual fitting. Proc. 23rd Ann. ACM-SIAM Symp. [23] K. Fox and B. Moseley. Online scheduling on Discrete Algorithms, pp. 1228–1241. 2012. identical machines using SRPT. Proc. 22nd ACM- [9] B. Awerbuch, Y. Azar, S. Plotkin, and O. Waarts. SIAM Symp. Discrete Algorithms, pp. 120–128. Competitive routing of virtual circuits with un- 2011. known duration. Proc. 5th Ann. ACM-SIAM Symp. [24] N. Garg and A. Kumar. Minimizing average flow- Discrete Algorithms, pp. 321–327. 1994. time : Upper and lower bounds. Proc. 48th Ann.[10] Y. Azar, B. Kalyanasundaram, S. Plotkin, K. Pruhs, IEEE Symp. Found. Comput. Sci., pp. 603–613. and O. Waarts. On-line load balancing of temporary 2007. tasks. Proc. 3rd Workshop on Algorithms and Data [25] H. Ibarra and E. Kim. Fast approximation algo- Structures, pp. 119–130. 1993. rithms for the knapsack and sum of subset problems.[11] N. Bansal and H.-L. Chan. Weighted flow time does J. ACM 22(4):463–468. ACM, Oct. 1975. not admit O(1)-competitive algorithms. Proc. 20th [26] S. Im and B. Moseley. An online scalable algorithm Ann. ACM-SIAM Symp. Discrete Algorithms, pp. for minimizing lk-norms of weighted flow time on 1238–1244. 2009. unrelated machines. Proc. 22nd ACM-SIAM Symp.[12] N. Bansal and K. Dhamdhere. Minimizing weighted Discrete Algorithms, pp. 95–108. 2011. flow time. ACM Trans. Algorithms 3(4):39:1–39:14, [27] S. Im, B. Moseley, and K. Pruhs. A tutorial on 2007. amortized local competitiveness in online scheduling.[13] P. Barham, B. Dragovic, K. Fraser, S. Hand, SIGACT News 42(2):83–97. ACM, June 2011. T. Harris, A. Ho, R. Neuge-bauer, I. Pratt, and [28] B. Kalyanasundaram and K. Pruhs. Speed is as A. Warfield. Xen and the art of virtualization. powerful as clairvoyance. Proc. 36th Ann. IEEE Proc. 19th Symp. on Operating Systems Principles Symp. on Found. Comput. Sci., pp. 214–221. 1995. (SOSP), pp. 164–177. 2003. [29] E. Lawler. Fast approximation algorithms for the[14] L. Becchetti, S. Leonardi, A. Marchetti-Spaccamela, knapsack problems. Mathematics of Operations and K. Pruhs. Online weighted flow time and dead- Research 4(4):339–356, 1979. line scheduling. Proc. 5th International Workshop [30] S. Leonardi and D. Raz. Approximating total flow on Randomization and Approximation, pp. 36–47. time on parallel machines. J. Comput. Syst. Sci. 2001. 73(6):875–891, 2007.
  12. 12. [31] S. Martello and P. Toth. Knapsack Problems: • If HRDF’s schedule processes job j at time t, Algorithms and Computer Implementations. John then Wiley, 1990.[32] A. Mishra, J. Hellerstein, W. Cirne, and C. Das. d wj hj wj wk Fj (t) = Towards characterizing cloud backend workloads: dt ε ε wj k∈QO (t) Insights from Google compute clusters, 2010. hj d[33] A. Singh, M. Korupolu, and D. Mohapatra. = OPT(t). Server-storage virtualization: Integration and load- ε dt balancing in data centers. Proc. IEEE-ACM Recall S A (t) is the set of jobs processed by Supercomputing Conference (SC), pp. 53:1–53:12. 2008. HRDF’s schedule at time t. Observe that[34] Q. Zhang, J. Hellerstein, and R. Boutaba. Char- hj ≤ (2 + ε). acterizing task usage shapes in Google compute j∈S A (t) clusters. Proc. 5th International Workshop on Large Scale Distributed Systems and Middleware. 2011. Therefore, d wj 2+ε d Fj (t) ≤ OPT(t).Appendix j∈S A (t) dt ε ε dtA Analysis of HRDF • Finally,In this appendix, we complete the analysis of HRDF.We use the definitions and potential function as d wj < (2 + ε)wj − (Rj )(t) ≤given in Section 3. Consider the changes that occur dt ε ε j∈QO (t) j∈QO (t)to A + Φ over time. The events we need to consider 2+ε dfollow: = OPT(t) ε dtJob arrivals: Consider the arrival of job j at Combining the above inequalities gives us thetime rj . The potential function Φ gains the terms ΦA j following:and −ΦO . We see j (1 + ε)wj d d d ΦA (rj ) − ΦO (rj ) = j j pj (A + Φ)(t) = Aj (t) + ΦA (t) ε dt dt dt j j∈QA (t) 1+ε ≤ OPTj d O ε − Φ (t) dt j j∈QO (t) For any term ΦA with k = j, job j may only k >contribute to Rk and −Vk , and its contributions to 2+ε d ≤ W A (t) − W A (t) + OPT(t)the two terms cancel. Further, job j has no effect on ε dtany term ΦO where k = j. All together, job arrivals 2+ε d k + OPT(t)increase A + Φ by at most ε dt 4 + 2ε d 1+ε = OPT(t)(5) OPT. ε dt ε Therefore, the total increase to A + Φ due to continuous processing of jobs is at mostContinuous job processing: Consider any time t. 4 + 2ε d 4 + 2ε (6) OPT(t)dt = OPT.We have the following: t≥0 ε dt ε • By Lemma 3.2, d wj Job completions: The completion of job j in (Rj + (1 + ε)pA − Vj> + Yj )(t) ≤ j dt ε HRDF’s schedule increases Φ by j∈QA (t) wj − W A (t). −ΦA (Cj ) = j A Vj> (Cj ) − Yj (Cj ) − Fj (Cj ) . A A A ε
  13. 13. Similarly, the completion of job j by the optimal A.1 Completing the proof. We may nowschedule increases Φ by complete the proof of Theorem 3.1. By summing wj the increases to A + Φ from (5) and (6), we see −ΦO (Cj ) = j O < O Rj (Cj ) . ε 5 + 3ε 1 We argue that the negative terms from these (7) A≤ OPT = O OPT. ε εincreases cancel the positive terms. Fix a job j andconsider any job k contributing to Vj> (Cj ). By A B Analysis of HPCS >definition of Vj , job k arrives after job j and is In this appendix, we complete the analysis of HPCS.more height dense than job j upon its arrival. We We use the definitions and potential function assee given in Section 4. Consider the changes that occur wk wj hj pA (rk )wk j to A + Φ over time. As is the case for HRDF, > A (r ) =⇒ > hk pk . job completions and density inversions cause no hk pk hj pj k wj overall increase to A + Φ. The analyses for theseJob A k is alive to contribute a total changes are essentially the same as those given in hj pj (rk )wkof wj > hk pk to Fj during times HRDF’s Appendix A, except the constants may change. Theschedule processes job j. The decrease in Φ of k’s remaining events we need to consider follow: Acontribution to Fj (Cj ) negates the increase causedby k’s contribution to Vj> (Cj ). A Job arrivals: Consider the arrival of job j at < O Now, consider job k contributing to Rj (Cj ). By time rj . The potential function Φ gains the terms ΦA jdefinition we know k arrives before job j and that and −ΦO . We see jjob j is less height dense than job k at the time j ΦA (rj ) − ΦO (rj ) = λκwj pj j jarrives. We see wk wj ≤ λκOPTj . > =⇒ wk hj pj > wj hk pA (rj ). k hk pA (rj ) k hj pj For any term ΦA with k = j, job j may only k >Job j does not contribute to > Vk , because it is less contribute to Rk and −Vk , and its contributions toheight dense than job k upon its arrival. Further, the two terms cancel. Further, job j has no effect onjob k is alive in HRDF’s schedule when the optimal any term ΦO where k = j. All together, job arrivals kschedule completes job j. Therefore, all processing increase A + Φ by at mostdone to j by the optimal schedule contributes (8) λκOPT. Ato Yk . So, removing job j’s contribution to Yk (Ck ) wk wj Acauses Φ to decrease by ε hj pj > ε hk pk (rj ), anupper bound on the amount of increase caused < Oby removing k’s contribution to −Rj (Cj ). We Continuous job processing: Consider any time t.conclude that job completions cause no overall We have the following:increase in A + Φ. • By Lemma 4.2, dDensity inversions: The final change we need λκwj (Rj +pA −Vj +Yj )(t) ≤ −κW ∗ (t) jto consider for Φ comes from HRDF’s schedule dt j∈QA (t)changing the ordering of the most height dense jobs. if κ ≥ 200 and λ ≥ 100.The Rj terms are the only ones changed by theseevents. Suppose HRDF’s schedule causes some job j • Note j∈S A (t) hj ≤ 1. Therefore,to become less height dense than another job k attime t. Residual density changes are continuous, so d wk λκwj Fj (t) = λκhj wj wj wk dt wj j∈S A (t) j∈S A (t) k∈QO (t) = =⇒ wj hk pA (t) = wk hj pA (t). k j hj pA (t) j hk pA (t) k d ≤ λκ OPT(t). At time t, job k starts to contribute to Rj , but dtjob j stops contributing to Rk . Therefore, the • Observe:change to Φ from the inversion is d < wj wk − λκwj (Rj )(t) ≤ 1.75λκwj hk pA (t) − k hj pA (t) = 0. j O dt j∈QO (t) ε ε j∈Q (t)We conclude that these events cause no overall d = 1.75λκ OPT(t)increase in A + Φ. dt
  14. 14. Combining the above inequalities gives us the • By Lemma 5.2,following: d 8κwj (Rj +pA −Vj> +Yj )(t) ≤ −κW ∗ (t). j dt ε d d d j∈QA (t) (A + Φ)(t) = Aj (t) + ΦA (t) dt dt dt j j∈QA (t) • For any job j in non-small category i, we d O have hj ≤ (1 + ε/2)hj by definition. Also, − Φ (t) dt j j∈S A (t) hj ≤ 1 + ε. Therefore, j∈QO (t) d ≤ κW ∗ (t) − κW ∗ (t) + λκ OPT(t) d 8κwj 8κhj wj wk dt Fj (t) = d dt ε ε wj i∈S A (t) i∈S A (t) + 1.75λκ OPT(t) dt 8(1 + ε)2 κ d d ≤ OPT(t). ≤ 2.75λκ OPT(t) ε dt dt Therefore, the total increase to A + Φ due to • The number of non-small category i jobs pro-continuous processing of jobs is at most cessed by KCS’s schedule is at most (1 + ε)2 /hj . d Therefore,(9) 2.75λκ OPT(t)dt = 2.75λκOPT. t≥0 dt d 8κwj < 8(1 + ε)3 κwj − (Rj )(t) ≤B.1 Completing the proof. We may now dt ε ε i∈QO (t) i∈QO (t)complete the proof of Theorem 4.1. Set κ = 200 8(1 + ε)3 κ dand λ = 100. By summing the increases to A + Φ = OPT(t) ε dtfrom (8) and (9), we see(10) A ≤ 3.75λκOPT = O(OPT). Combining the above inequalities gives us the following:C Analysis of KCSIn this appendix, we complete the analysis of KCS. d d dWe use the definitions and potential function as (A + Φ)(t) = Aj (t) + ΦA (t)given in Section 5. Consider the changes that occur dt dt dt j j∈QA (t)to A + Φ over time. As is the case for HRDF, d Ojob completions and density inversions cause no − Φ (t) dt joverall increase to A + Φ. The analyses for these j∈QO (t)changes are essentially the same as those given in ≤ κW (t) − κW ∗ (t) ∗Appendix A, except the constants may change. The 8(1 + ε)2 κ dremaining events we need to consider follow: + OPT(t) ε dtJob arrivals: Consider the arrival of job j at 8(1 + ε)3 κ d + OPT(t)time rj . The potential function Φ gains the terms ΦA j ε dtand −ΦO . We see 16(1 + ε)3 κ d j ≤ OPT(t) ε dt 8κwj ΦA (rj ) − ΦO (rj ) = j j pj Therefore, the total increase to A + Φ due to ε 1 continuous processing of jobs is at most ≤O OPTj . (12) ε3 16(1 + ε)3 κ d 1Again, the changes to other terms cancel. All OPT(t)dt = O OPT t≥0 ε dt ε3together, job arrivals increase A + Φ by at most 1 C.1 Completing the proof. We may now(11) O OPT. ε3 complete the proof of Theorem 5.1. By summing the increases to A + Φ from (11) and (12), we seeContinuous job processing: Consider any time t. 1 (13) A≤O OPT.We have the following: ε3
  15. 15. D Extensions to Multiple Machines scheduled on the machine chosen for job j by the optimal schedule. For example, if A is HRDF, thenHere, we describe how our results can be extended let pA (t) be the remaining processing time of j jto work with multiple unrelated machines. In theunrelated machines variant of the CFT problem, we on machine . Let dA (t) = w j /(h j pA (t)). Let j jhave m machines. Job j has weight w j , processing QAj (t) = k ∈ QA (t) : dA (t) > dA (t) k j Finally, wetime p j , and height h j if it is assigned to machine . have Rj (t) = k∈QAj (t) h k pA (t). The definition kAt each point in time, the server may schedule an of the potential function for each algorithm A thenarbitrary number of jobs assigned to each machine remains unchanged.as long as the sum of their heights does not exceed 1.For each machine and job j, we assume either h j ≤ The analysis for continuous job processing, job1 or p j = ∞. completitions, and density inversions is exactly the same as those given above for A as we restrict our The algorithms in this paper can be extended attention to a single machine in each case. We onlyeasily to the unrelated machines model. These need to consider job arrivals. Let cj be the coefficientextensions are non-migratory, meaning a job is of pA (t) in the potential function for A (for example, jfully processed on a single machine. In fact, our cj = (1 + ε)wj /ε for HRDF). Recall that in eachextensions are immediate dispatch, meaning each analysis, the increase to Φ from job arrivals wasjob is assigned to a machine upon its arrival. Let A at most cj OPTj . Consider the arrival of job j atbe one of the algorithms given above. In order time rj . Let A be the machine chosen for j byto extend A, we simply apply an idea of Chadha jet al. [17] also used by Im and Moseley [26]. Namely, our extension and let O be the machine chosen by jwe run A on each machine independently and assign the optimal algorithm. By our extension’s choicejobs to the machine that causes the least increase of machine, the potential function Φ increases byin A’s potential function. cj p O j upon job j’s arrival. This value is at most j A more precise explanation follows. We use cj OPTj . Summing over all jobs, we see job arrivalsall the definitions given for A and its analysis. increase Φ by the same amount they did in theLet QA (t) be the set of alive jobs assigned to earlier analyses.machine at time t. Upon job j’s arrival, assign Finally, we compare against optimal schedulesjob j to the machine that causes the least increase that can be migratory. In this case, we modelin A a migratory optimal schedule as assigning an α j k∈QA (t) w k (Rk + pk ) where each term Rk A fraction of job j to each machine ( α i = 1) uponand pk is modified to consider only jobs assigned job j’s arrival. We further adjust our definitions ofto and their processing times and heights on Vj> and Rj to account for these fractional jobs; Yj> < . Each machine independently runs A on the only includes fractional jobs that are more densejobs to which it is assigned using processing times, < than j and Rj includes contributions by jobs thatweights, and heights as given for that machine. are more height dense than the fractional job theThe extensions to each algorithm A given above optimal schedule assigns in place of j. Now by ourhave the same competitive ratios as their single extension’s choice in machine, the arrival of job jmachine counterparts, given the amount of resource increases Φ by α j cj p j ≤ cj OPTj . Summingaugmentation required in the single machine model over all jobs completes our analysis of job arrivals.for A. Continous job processing, job completions, andD.1 Analysis. Here, we sketch the analysis of density inversions work as before.our extension for multiple unrelated machines. Ouranalysis is essentially the same as the single machinecase, except the terms in the potential functionsare slightly changed. Let A be the algorithm weare extending. We begin by assuming the optimalschedule is non-migratory and immediate dispatch;whenever a job j arrives, it is immediately assignedto a machine and all processing for j is performedon . For each term Rj , Vj> , Yj , and Fj , wechange its definition to only include contributionsby jobs scheduled on the machine chosen for jobj by our extension to A. For Rj , we change its <definition to only include contributions by jobs