Queuing Model Estimating Response Time Goals Feasibility
CMG Proceedings, 2009
Anatoliy Rikun, PhD
Full Capture Solutions, Inc.
Anatoliy_Rikun@yahoo.com
Customers expect a fast response from their systems, but what are the limits
of the system? This paper analyzes a multi-class queuing model which
evaluates whether job response time goals are reachable and, if not, what
would be, in some sense, the optimal alternatives. This queuing optimization
model may be useful for estimating the limits of tuning, for comparing
hardware upgrade and tuning alternatives, or for estimating the minimal
hardware upgrade necessary to reach a given set of response time goals.
1. Introduction
Response time is an important characteristic of
system performance, and response time metrics may
be part of a service level agreement.
Suppose we have a set of workloads competing for
system resources, with known response time targets
and known levels of system resource consumption.
Based on this information, how can we estimate whether
these performance goals are achievable? And, if the
performance goals are not achievable, what is the
"best" possible solution we can get from the system?
This paper is organized as follows. Section 2
describes the "fairness" or "equitability" concept for
finding a good substitute solution in the case of unachievable
performance goals [GNT4, L9, BGTV3]. This concept
reflects, to some extent, the logic used in IBM WLM Goal
Mode [BB9, SBG9], and it leads to a lexicographically
minimal response time distribution in the cases when
the response time goals are not reachable. Section 3
gives a more formal model definition and the necessary
assumptions. A greedy algorithm is presented in
Section 4. Section 5 gives some examples, along with
estimates for comparing tuning and upgrade
options, and Section 6 summarizes the paper.
2. Concept of Fairness, and What Can Be a
Reasonable Substitution For Unreachable
Goals?
Let us consider a CPU-bound queuing
system of n job classes. For the sake of simplicity,
suppose all the job classes have the same importance
level and the same service times, but may have different
CPU utilizations, {Ui}, and different performance
goals, {Gi}, i=1,…,n. The job response time goals
may be defined in different ways: they may be, in particular,
average response time goals, deadlines, percentile
response time goals, execution velocities, and others. Even
though only average response time goals are
considered in this paper, some of the conclusions can be
extended to the other metrics, too.
To evaluate system performance, it is natural to use a
relative performance, or performance index, metric
[SBG9], pi:

pi = Ri/Gi (1)

Thus, if all the performance indices are less than or equal to 1,
then each job meets its performance goal. Now, suppose
that in this system the goals {Gi} are not reachable, i.e. for
any feasible response time vector R = {Ri} ∈ X, at least
one job's average response time exceeds its goal, Gi.
This may be expressed in terms of the "worst-case"
performance index, γ1:

γ1 = max{ pi | i=1,…,n } > 1 (2)
What could be the best possible response time distribution
in the case when the system capacity is not sufficient to
reach all the performance goals? A traditional approach
would be to minimize the average, or some weighted
average, of all job classes' response times:

R*avg = min { ∑j=1..n cjRj | R ∈ X } (3)

Here cj are job-class weight coefficients¹, X is the set of all
reachable response time vectors R, and R*avg is
the minimal possible weighted average response time
which can be attained in the system. A well-known
analytical approach to solving problems like (3), the
so-called cµ rule, is very simple: it recommends giving
higher priority to the jobs with the smaller ratio Uj/cj
("small job first")².

¹ In the calculations below cj = 1/Gj, i.e. (3) minimizes
the average performance index. But cj may also be a cost
associated with delay for job class j.
But, in spite of the beauty of the cµ rule, the
underlying optimization model (3) may lead to
undesired results in the goal mode context. The reason
is that the minimal average response time approach may
lead to solutions where "small" jobs outperform their
goals at the expense of "big" jobs, which do not meet
their goals. An example of this situation is considered in
Table 1. In that example, all the jobs have the
same response time goal, 1 sec., but the optimal
average response time proves to be 0.17 for small jobs
at the expense of the big job, which has response time
1.67 sec. At the same time, all the goals are reachable
in this example. Thus, minimization of the weighted
average response time leads to unfair solutions.
At the same time, we need to reach all the response
time goals or, if this is not possible, to approach all of
them as closely as possible. In particular [IBM00], IBM
WLM tries to make all the workloads within a given
importance level have similar performance indices,
and when an important workload is missing its goal,
WLM tries to provide it with additional resources.
What this means mathematically is called mini-max or
lexicographical optimization [BB0³, GNT4, BGTV3, L9].
In other words, we are trying to reach a solution where
all the jobs reach their goals and have the same
performance index, γ. If this is not possible, the jobs which
miss their goals most (and have the highest
performance index, γ1) receive the highest priority;
then we try to reach the same performance
index, γ2, for the rest of the system; if this is not possible,
again the most troubled jobs in the second group receive
the highest priority, and so on. The algorithm details,
with simple calculations, are presented in Section 4.
A comparison between the lexicographical
optimization, mimicking IBM WLM behavior, and the
weighted average response time minimization is
presented in Tables 1 and 2 for a simple example of n=5
jobs. In this example the corresponding weight factors in
the "average response time minimization" model (3) are
in inverse proportion to the response time goals, cj = 1/Gj;
in this way jobs with more challenging response time
goals have higher weight in the optimization. This
approach leads to the minimal average performance
index.

² The cµ rule leads to an optimal solution of (3) if the feasible
area X satisfies the conservation law [CY1]. In the
case of different service times and identical weights cj,
the cµ rule recommends short-job-first (SJF) priority.
³ Prof. E. Bolker and Prof. J. Buzen did not use this
terminology explicitly in their paper, but de facto
their goal mode modeling, which was inspired by
IBM WLM, is equivalent to lexicographical
optimization, and all the results presented in Tables
1 and 2 could be obtained using their approach, too.
In the example below, the first 4 jobs are "small"
(each of them has utilization 0.1) and the last job is a
"big" one (its utilization is 0.5); we consider an M/M/1
preemptive priority queuing system. Let all the jobs have
the same response time goal, Gi = 1.0 sec, i=1,…,5, and
the same service time, Si = 0.1 sec:
Table 1. Minimal Average vs. "Fair" Solution when
response time goals are reachable.

Job #   Ui    Gi   | Minimal Average  | Lexicographically
                   | Response Time    | Minimal Resp. Time
                   | Ri,avg   pi,avg  | Ri*      pi*
1       0.1   1    | 0.17     0.17    | 1        1
2       0.1   1    | 0.17     0.17    | 1        1
3       0.1   1    | 0.17     0.17    | 1        1
4       0.1   1    | 0.17     0.17    | 1        1
5       0.5   1    | 1.67     1.67    | 1        1
Ravg               | 0.47     0.47    | 1        1
γ1*                | 1.67             | 1
If all the response time goals are identical, the
solution of (3) is defined by the "small job first" priority
distribution πS = (1,1,1,1,2), which runs the
small jobs with the highest priority, 1, and gives
lower priority to the big job (#5). As a result, the small jobs'
response times are ten times faster than the big job's
response: the small jobs' response time (0.17)
is more than 5 times better than their goal, while the big job's
response time is almost 70% greater than its goal.
In some situations, this kind of "optimization" is
not acceptable because, for example, a user may not feel
the difference between system responses of 0.3 and 0.1
sec but may be sensitive if the system response is 3 sec
instead of 1 sec.
At the same time, all the goals in this example
are reachable, and the algorithm of lexicographical
minimization, which will be presented in Section 4, can
find such a solution. To check this, consider the response time
vector which corresponds to the "big job first" priority rule,
πB = (2,1,1,1,1). In this case the big job's response time is
R5 = S5/(1 - U5) = 0.1/0.5 = 0.2, and for the small jobs,
Rj = Sj/((1 - U5)(1 - ∑j=1..5 Uj)) = 0.1/0.05 = 2.0, j=1,…,4.
Thus, if we use the priority rule πS 54% of the time and
the πB priority rule 46% of the time, the average response
time in the system will match the response time goal
for each job.
The last two rows of Table 1 present the
weighted average response times and the
"worst-case" performance index. As expected, the minimal
weighted average response time (Ravg = 0.47) is smaller
than the weighted average for the "fair" solution (1.0), but its
"most missed goal" index γ1 is worse: γ1* = 1.67 > 1.
Now, consider the case when the performance
goals are not reachable. In Table 2, presented
below, all the jobs' utilizations and service times are the
same as in the previous case, but their goals are
different. In this case the average response time
minimization approach suggests using the priority ordering
π = (1,2,3,5,4), which corresponds to sorting 1/(UiGi) in
decreasing order.
Even though in this case all the response time goals
are quite different and are not reachable, one
can see that the lexicographical optimization leads to a
"fair" solution, in which jobs {1,2,3,5} all miss their
goals by 19%, and job 4 significantly outperforms its goal⁴.
Table 2. Minimal Average vs. "Fair" Solution
when response time goals are NOT reachable.

Job #   Ui    Gi      | Minimal Average  | Lexicographically
                      | Response Time    | Minimal Resp. Time
                      | Ri,avg   pi,avg  | Ri*      pi*
1       0.1   0.125   | 0.11     0.889   | 0.15     1.185
2       0.1   0.250   | 0.14     0.556   | 0.30     1.185
3       0.1   0.500   | 0.18     0.357   | 0.59     1.185
4       0.1   10.00   | 5.00     0.500   | 5.00     0.500
5       0.5   0.50    | 0.71     1.429   | 0.59     1.185
Ravg                  | 1.67     0.47    | 1.33     1.05
γ1*                   | 1.43             | 1.19
At the same time, the average response time
solution leads to quite different performance indices:
the small jobs' response times are 11-64% below their
goals, while the big job exceeds its goal by 43%.

⁴ Job 4 outperforms its goal simply because its goal G4 = 10
sec is not restrictive at all (R4 = 5 in both solutions,
corresponding to job 4 running with the lowest
priority).
3. Model Definition
3.1 Conservation Law and Two Equivalent
Approaches to Describe the Set of Achievable
Solutions. Our goal is to find an algorithm which can
evaluate whether the response time goals are reachable and, if they
are not, suggest a solution which is (in relative terms)
as close to the goals as possible.
However, the problem of modeling priority
queuing systems in the general case is very complicated. It
becomes significantly more tractable in the cases when the
so-called conservation law holds. There exist numerous
formulations of the conservation law for
different types of queues [CY1, BGTV3, LK5, GM0, BB9,
FG8]. For example, if we have a preemptive-priority M/M/m
queue and all jobs have the same service time, changing
priorities does not change the total queue (or average
response time) in the system. Intuitively, this sounds
obvious: a higher priority and shorter response
time for one job leads to a comparable delay for other
jobs. On the other hand, if the service times for different job
classes are significantly different, giving higher priority to
shorter jobs leads to a smaller average queue than,
e.g., FIFO, and the conservation law does not hold [BB9].
There are a few general conditions which are necessary
for the conservation law. For example, server performance
should be the same over time, and the server is not allowed to
be idle when there are jobs waiting to be served. Besides,
scheduling should not be anticipating, i.e. it should not be
based on the service time requirements of the customers
[CY1].
It should be clarified that the restriction to analyzing
only conservative queuing models is not related to IBM
Workload Manager logic. We need the conservation law
only to justify the algorithm below. This algorithm is precise if
the conservation law holds, and leads to an approximate
solution in the case of a moderate deviation from this law. As
noted in [BB9], "…normally, conservation law is quite
robust, meaning that it is likely to be very close to correct,
even when the assumption of equal (average) service times
across is violated. Time slicing and other preemption
mechanisms also contribute to the robustness of the
conservation law by reducing the likelihood of very long
processing bursts".
As an illustration of the conservation law in action
for quite different resource management tools, one can
look at Figs. 3, 10, and 13 in [BD0], which show measured total queues⁵
for SUN SRM, HP's Process Resource Manager, and IBM's
Workload Manager for AIX. In all these cases, different
workloads received different CPU shares, but the resulting
total queue was approximately constant, independently of
which workload received the bigger share.

⁵ In [BD0] the authors presented the "weighted average response
time", which is proportional to the total queue.
Now, consider a description of all achievable
performance states for a system of n job classes. Let the
vector q = {qi} denote the jobs' average queues, µ their
service rates, λ = {λj} the arrival rates, u = {ui} the utilizations,
and R = {Ri} the response time vector. Let N = {1,…,n}
denote the set of all jobs, and for each subset of job
classes S ⊆ N denote the total utilization of the jobs
in this group as US = ∑j∈S uj; QS = Q(US) is the
M/M/m queue corresponding to this utilization and FCFS
scheduling. According to the conservation law,
the set X of all feasible combinations of queues may be
described by the set of 2^n - 1 inequalities (4) and the equality
(5), for each possible subset S of all jobs:

∑j∈S qj ≥ QS (4)

and, for the total queue in the system, QN (when S = N,
inequality (4) becomes an equality):

∑j∈N qj = QN (5)
This amazingly simple description of all possible
states of the queuing system is made possible by the
conservation law. The inequalities (4)-(5) have a very
simple explanation: for any group of jobs S, the minimal
possible total queue in this group, ∑j∈S qj = QS, corresponds
to the case when all the jobs from this group run
with the highest priority. Due to the preemptive
scheduling, their total queue is not affected by any other
jobs in the system; it may be calculated as an FCFS
queue and is defined by the total
utilization of the jobs in this group, US. In all other
cases, when some of the other jobs j ∉ S run with
the same or higher priorities as the jobs from group S, they
affect the total queue in group S and we have the strict
inequality ∑j∈S qj > QS. Combining these two observations,
we obtain inequality (4).
Let us illustrate this approach with an example of
n=2 jobs which have utilizations u = (0.25, 0.50) on a
1-processor CPU. The total queue in the system is QN =
(u1+u2)/(1-u1-u2) = 3; if job 1 runs with the highest
priority, its queue will be Q1 = u1/(1-u1) = 0.33. When job 2
has the highest priority, its queue will be Q2 = u2/(1-u2)
= 1. Thus, all possible states q = (q1, q2) of this system are
described by a system of three constraints:

q1 ≥ Q1
q2 ≥ Q2
q1 + q2 = QN

For example, in this system the queue state (q1, q2)
= (1.5, 1.5) is feasible because all the constraints above are
fulfilled. Of course, in the case of two jobs it is clear
how to reach this state: job 1 should run with
high priority (QN - q1 - Q2)/(QN - Q1 - Q2) ≈ 30% of the time,
and the rest, 70%, of the time it should have low priority.
This approach of representing all possible states of a
queuing system as a combination of the queues corresponding
to static priority rules was used in the pioneering paper
[GM0], and for conservative queuing systems it is
equivalent to the representation (4)-(5) [BB9, CY1].
But in any case, whether the feasibility area is described
by inequalities like (4)-(5) or by priority mixes, the number
of inequalities, or of possible priority orderings, grows
enormously as n becomes large: description (4)-(5)
contains 2^n - 1 inequalities, and there exist
n! possible priority orderings. For example, for
n=20 this leads to ~10^6 and ~10^18 objects, respectively.
The algorithm below is based on some intrinsic features of
the conservation law⁶ and is very fast: all it needs is just n
calculations of M/M/m queues.
3.2 Formal Definition of the "Fair" (Lexicographically
Optimal) Solution.
As already discussed, the measure of
success in reaching the vector of performance goals {Gi} is
the vector of performance indices {pi} defined by (1).
Ideally, when all the goals are reachable and the solution is
perfectly "fair", we will have identical performance indices,
all less than 1, for all the jobs:

p1 = p2 = … = pn = γ1 < 1

But what should we do in a situation when some or all
of the goals are not reachable? Tuning/resource-balancing
systems try to give more system
resources to the jobs which are missing their goals most,
thus improving the "worst" performance index, γ1; then,
when no further improvement of γ1 is possible, the next most
troubled jobs receive their share to improve their
performance index, γ2, and so on. The result of such
activity, a vector of performance indices p = {p1,…,pn},
should also be evaluated component-wise, with the worst
(i.e. biggest) components compared first.
Thus, if we have two competing tuning systems,
and in a comparable situation one leads to the performance
index vector p¹ = {p1¹,…,pn¹} and the other to the
performance vector p² = {p1²,…,pn²}, these results should

⁶ The so-called submodularity property of the function Q(US)
[F1, CY1].
be compared lexicographically.
First, the performance indices should be sorted
in descending order. Let the corresponding sorted vectors
be γ¹ = {γ1¹,…,γn¹} and γ² = {γ1²,…,γn²}. If
γ1¹ < γ1², the first performance result γ¹ is
"better", or more preferable, than γ² (γ¹ ≺ γ²);
otherwise, if γ1¹ > γ1², the second result is better (γ² ≺ γ¹).
If γ1¹ = γ1², the next pair of indices,
γ2¹ and γ2², should be compared, and so forth.
As an illustration, consider the performance vectors
from Table 2. The minimization of the average
response time leads to the performance vector
p¹ = (0.89, 0.56, 0.36, 0.5, 1.43), and after sorting we
have the vector γ¹ = (1.43, 0.89, 0.56, 0.5, 0.36). The
alternative vector is p² = (1.19, 1.19, 1.19, 0.5, 1.19) and
γ² = (1.19, 1.19, 1.19, 1.19, 0.5). Thus, because
γ1² = 1.19 < γ1¹ = 1.43, the second solution is more
preferable.
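The comparison rule above amounts to sorting each index vector in descending order and comparing element-wise, which Python's list comparison does directly. A sketch, with the Table 2 vectors as input (`lex_better` is an illustrative name):

```python
# Sketch of the lexicographic ("fair") comparison described above: sort each
# performance index vector in descending order, then compare element-wise.
def lex_better(p_a, p_b):
    """True if vector p_a is lexicographically preferable to p_b."""
    return sorted(p_a, reverse=True) < sorted(p_b, reverse=True)

p1 = [0.89, 0.56, 0.36, 0.5, 1.43]   # average-response-time minimization
p2 = [1.19, 1.19, 1.19, 0.5, 1.19]   # lexicographically minimal solution
print(lex_better(p2, p1))            # True: 1.19 < 1.43 at the first position
```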
Our goal is to find a feasible solution, a vector
of queues q = {qj} satisfying the conservation law (4)-(5)
and leading to the lexicographically minimal performance
index vector, γ⁷.
4. Algorithm Description
The algorithm is based on the following simple
observations⁸. Suppose all the jobs are sorted by their
response time goals in increasing order:

G1 ≤ G2 ≤ … ≤ Gn (6)

Because all the jobs' service times are
identical, it is clear that at least part of the time job 1,
which has the most challenging goal, should
run with the highest priority; correspondingly, job 2
should run at least some of the time with priority 1 or
2, and so forth.
On the other hand, because goal G1 is the
most challenging one, it is natural to expect that job
1 will have the "worst" performance index, γ1; job 2 will
have either the performance index γ1 or the next worst, γ2;
and so on, i.e.:

γ1 ≥ γ2 ≥ … ≥ γn (7)

Suppose a group of jobs S has the same
performance index γ = Ri/Gi, i.e. Ri = γGi ∀i ∈ S.
Using Little's law, qi = λiRi = γλiGi. Summing up these
equalities, we have:

γS = ∑j∈S qj / ∑j∈S λjGj = Q(S)/ω(S) (8)

where Q(S) and ω(S) are the total queue and the weighted
goal in the job group S.

⁷ An amazing and general fact of submodularity theory,
discovered by Fujishige [F0], is that solving the problem of
lexicographical optimization is equivalent to minimizing a
certain quadratic function (in our case, ∑j qj²/Gj).
⁸ A formal proof of the algorithm is presented in [R9].

Now, all we need is to find the right
separation of the set of all jobs 1,…,n into m subsets S1,
…, Sm, m ≤ n, so that the jobs from the same group k
have the same performance index γk (and γ1 > γ2 > …
> γm).
Algorithm

Initialization:
First, sort all the jobs by their goals, as in (6). Set
m=1 and include the current job, j := 1, into the current group m:
Sm = {1}. Set the higher-groups utilization Uh := 0 and the total
utilization Utot := u1. Evaluate, using the M/M/m formula, the total
queue in the group: Q(S1) = Q(Utot) - Q(Uh) = Q(u1); set
ω(S1) = λ1G1 and γ1 := Q(S1)/ω(S1).

Iteration: repeat while j < n:
Update the current job index, j := j+1.
First, test whether job j belongs to the current job set
Sm: tentatively set U~tot := Utot + uj; re-evaluate the current
group's total queue Q~m := Q(U~tot) - Q(Uh), the weighted goal
ω~m := ω(Sm) + λjGj, and the updated performance index
p~ := Q~m/ω~m.

1. If p~ ≥ γm, the current job j belongs to the current
group; do:
1.a Set Sm := Sm ∪ {j}, γm := p~, ωm := ω~m,
Utot := U~tot, and update the response times for each
job i in the current group: Ri := γmGi ∀i ∈ Sm.
1.b If m > 1, check the monotonicity (7): if γm > γm-1, the
groups m and m-1 should be merged (Sm-1 := Sm ∪ Sm-1,
ωm-1 := ωm-1 + ωm, Qm-1 := Qm-1 + Qm, γm-1 := Qm-1/ωm-1,
m := m-1), and this merging operation should be repeated
until γ1,…,γm is a decreasing sequence.

2. If p~ < γm, a new group m := m+1 is started:
Uh := Utot; Utot := U~tot; Qm := Q(Utot) - Q(Uh); ωm := λjGj;
γm := Qm/ωm; Rj := γmGj.

And, if j < n, the iteration is repeated.

After the algorithm completes, we have the
lexicographically optimal response times for each job
and m different values of the performance indices γk,
k=1,…,m.
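For the M/M/1 case the algorithm above can be sketched compactly. The version below uses the equivalent "add each job as a singleton group, then merge while monotonicity (7) is violated" formulation; by the mediant property of ratios this produces the same groups as the tentative test with p~. The function name `fair_solution` is illustrative:

```python
# A sketch of the greedy grouping algorithm above for the M/M/1 case
# (Q(U) = U/(1-U)); jobs are given as (utilization, goal) pairs already
# sorted by goal, and arrival rates follow from the common service time s.
def fair_solution(utils, goals, s=0.1):
    lam = [u / s for u in utils]            # arrival rates (Little's law)
    Q = lambda u: u / (1.0 - u)             # M/M/1 mean queue
    groups = []    # each group: [job indices, U_low, U_high, omega]
    u_tot = 0.0
    for j in range(len(utils)):
        u_new = u_tot + utils[j]
        groups.append([[j], u_tot, u_new, lam[j] * goals[j]])
        u_tot = u_new

        def gamma(g):                       # index of a group, formula (8)
            return (Q(g[2]) - Q(g[1])) / g[3]

        # Merge while monotonicity gamma_1 > gamma_2 > ... is violated
        while len(groups) > 1 and gamma(groups[-1]) >= gamma(groups[-2]):
            jobs2, _, hi, w2 = groups.pop()
            groups[-1][0] += jobs2
            groups[-1][2] = hi
            groups[-1][3] += w2
    gammas, resp = [], [0.0] * len(utils)
    for g in groups:
        gm = (Q(g[2]) - Q(g[1])) / g[3]
        gammas.append(gm)
        for i in g[0]:
            resp[i] = gm * goals[i]
    return gammas, resp

# Jobs from Table 2, sorted by goal (the big job is now #4, cf. Table 3):
gammas, R = fair_solution([0.1, 0.1, 0.1, 0.5, 0.1],
                          [0.125, 0.25, 0.5, 0.5, 10.0])
print([round(g, 2) for g in gammas])   # [1.19, 0.5]
print([round(r, 2) for r in R])        # [0.15, 0.3, 0.59, 0.59, 5.0]
```

Run on the Table 2 jobs, this reproduces the final row of Table 3: two groups with γ ≈ 1.19 and γ = 0.5.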
To illustrate how this algorithm works, consider
its application to the example from Table 2.

Table 3. Algorithm iteration results for 5 jobs

j   Uj    Gj      m   γm     Response time estimates R1,…,Rj
1   0.1   0.125   1   0.89   0.11
2   0.1   0.25    2   0.56   0.11, 0.14
3   0.1   0.50    3   0.36   0.11, 0.14, 0.18
4   0.5   0.50    1   1.19   0.15, 0.30, 0.59, 0.59
5   0.1   10.0    2   0.50   0.15, 0.30, 0.59, 0.59, 5.0
Table 3 presents the utilizations and goals of the five
jobs from Table 2⁹, as well as the algorithm
iteration results: the number of groups m, the
performance index γm of the current group, and the
estimated job response times Ri = γkGi after each
iteration j. As one can see from this table, until the
iteration with the "big" job, j=4, each job has a different
performance index, and each job group contains just
one job: S1={1}, S2={2}, S3={3}. At the iteration with the big
job (j=4) all these groups are merged into one
group, S={1,2,3,4}, with performance index γ1 ≈ 1.19,
at step 1.b. At the final iteration, with the
"unchallenging" goal G5, one more job group is
created (γ2 = 0.5).
This example illustrates a few important
properties of splitting all the jobs into groups with
identical performance indices, S1,…,Sm. This splitting
corresponds to the "best" possible priority distribution.
Each job from group Sk should, on average,
run with higher priority than any job from the
groups Sk+1,…,Sm. Thus, even if all the jobs from the
groups k+1,…,m were eliminated from the
system, there would be no way to further improve the performance
results for the jobs j ∈ S1 ∪ … ∪ Sk. Besides, this
grouping gives a way of simplifying, or
aggregating, the original performance problem. If
instead of the n original jobs we consider m aggregated
jobs with the arrival rate λka and the goal Gka defined as
the total arrival rate and the arrival-weighted goal of
group k:

λka = ∑j∈Sk λj ,  Gka = ∑j∈Sk λjGj / ∑j∈Sk λj

then the solution of the "aggregated" problem will lead to
the same set of performance indices, γ1,…,γm, as in
the original non-aggregated problem of n jobs.
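The aggregation claim is easy to verify numerically for the group S1 = {1,2,3,4} of the Table 3 example. A minimal sketch, assuming M/M/1 and the common service time S = 0.1 used earlier:

```python
# A quick numerical check of the aggregation claim above for the Table 2/3
# example (M/M/1, Q(U) = U/(1-U)): aggregating group S1 = {1,2,3,4} into one
# job reproduces its performance index gamma_1 ~ 1.19.
Q = lambda u: u / (1.0 - u)

# Group S1: utilizations, arrival rates (u/S with S = 0.1) and goals
utils = [0.1, 0.1, 0.1, 0.5]
lam   = [1.0, 1.0, 1.0, 5.0]
goals = [0.125, 0.25, 0.5, 0.5]

lam_a = sum(lam)                                      # aggregated arrival rate
G_a = sum(l * g for l, g in zip(lam, goals)) / lam_a  # weighted goal
U_a = sum(utils)

gamma = Q(U_a) / (lam_a * G_a)   # performance index of the aggregated job
print(round(gamma, 2))           # 1.19, matching gamma_1 in Table 3
```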
⁹ All 5 jobs are sorted by their goals, so jobs #4 and
#5 are swapped compared to Table 2.

The presented algorithm reproduces the results of
the original paper [BB0], devoted to goal mode
scheduling and motivated by IBM WLM. The major
difference is that this algorithm avoids scanning the ~2^n - 1
constraints (4)-(5), and thus it is scalable for large n.
In this algorithm all the jobs have the same
importance level, but the case of different importance
levels may easily be accommodated.
In the case of different, but relatively close, service
times, the algorithm becomes approximate and can be used
after rescaling the job parameters. However, in the case
when the service rates are significantly different, more
general methods [FG8] or further research are needed. Also,
this algorithm may be used for the more general G/M/m
system with preemption, in the case when all the job
classes have the same exponential service-time
distribution¹⁰; but, of course, in this case the calculation of
the functions Q(U) may become much more complicated.
5. Tuning vs. Upgrade and Other Examples
Using tuning software tools may be an option to
avoid or postpone a hardware upgrade. Thus, it is important
to evaluate the "limits of tuning", i.e. can we reach the
performance goals without upgrading the system?
Besides, in the cases when the goals are not
reachable without a hardware upgrade, it may be important
to evaluate the upgrade + tuning option: what is the
minimal hardware upgrade level which
allows meeting the performance goals if the system is
perfectly tuned? Answering this question may lead to the
conclusion that the performance goals themselves are
not realistic or not cost effective.
To simplify the analysis, let us start from the case of
one job, which has response time goal G, actual response
time R, and total utilization U. Suppose the goal is not
reached, and the performance index γ = R/G > 1. What
should be the minimal upgrade level υ > 1 such that the
response time goal becomes reachable on a server which
is υ times faster than the existing one?
Using Little's law and the fact that the
throughput on the faster server (for open systems) will be
the same, the equation for the upgrade factor υ is:

Q( U/υ ) = Q( U )/γ (9)

In the case when the original and the "upgraded" servers
have the same number of processors, m, the equation (9)
may be rewritten as (10):

Q( m, U/υ ) = Q( m, U )/γ (10)

¹⁰ A G/M/m queue with the same average service time satisfies the
conservation law (4)-(5) [CY1].
For M/M/m queues, it is easy to solve (10)
analytically for small m, or numerically for arbitrary
m. In particular, for m=1 the solution of (10) is:

υ = U + γ(1 - U) (11)

In the case of m=2, the performance improvement
factor υ is the solution of a quadratic equation (ρ =
U/m is the per-processor utilization):

υ² - ρ² = γυ(1 - ρ²) (12)

It is interesting to analyze the solution υ of equation
(10) as a function of γ. For any combination of the
parameters m, ρ this function υ(γ) is monotonically
increasing. However, its behavior is quite different in
the areas of small and high utilization. For small
utilization, υ ≈ γ: this reflects the obvious fact that
at small utilization there is no queuing problem, and we
need a faster server just to make the service time smaller. If
the utilization is high (ρ ≈ 1) this dependency is much
weaker (e.g. for m=1,2: ∂υ/∂γ ~ 1-ρ), which reflects the
fact that at high utilization even a relatively small
improvement in CPU speed can have a significant
effect.
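For arbitrary m, equation (10) has no closed form, but it can be solved by bisection against the M/M/m mean queue. A minimal sketch; the names `mmm_queue` and `upgrade_factor` are illustrative, and the queue formula is the standard Erlang-C result:

```python
# A minimal numerical sketch of equation (10): find the upgrade factor v such
# that Q(m, U/v) = Q(m, U)/gamma, using the Erlang-C based M/M/m mean queue
# and simple bisection.
from math import factorial

def mmm_queue(m: int, U: float) -> float:
    """Mean number in system for M/M/m at total utilization U (rho = U/m)."""
    rho = U / m
    inv_p0 = sum(U**k / factorial(k) for k in range(m)) \
             + U**m / (factorial(m) * (1 - rho))
    erlang_c = (U**m / (factorial(m) * (1 - rho))) / inv_p0
    return erlang_c * rho / (1 - rho) + U    # Lq + mean number in service

def upgrade_factor(m: int, U: float, gamma: float) -> float:
    target = mmm_queue(m, U) / gamma
    lo, hi = 1.0, 100.0
    for _ in range(200):                     # bisection on v
        mid = (lo + hi) / 2
        if mmm_queue(m, U / mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Cross-check against the analytic m=1 solution (11): v = U + gamma*(1 - U)
U, gamma = 0.8, 1.5
print(round(upgrade_factor(1, U, gamma), 3))   # ~1.1 = 0.8 + 1.5*0.2
```

For m = 1 the numerical result matches the analytic formula (11).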
It is important to mention that modern
technology sometimes allows avoiding a "physical" upgrade:
customers can use (and pay for) additional
processors during periods of peak load [SB2]. In this
case, when we are trying to solve performance
problems by increasing the number of processors from
m1 to m2, the analog of (10) is:

m2 = min{ k | Q(k, U) ≤ Q(m1, U)/γ } (13)
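Equation (13) can be evaluated by simply incrementing the processor count until the M/M/k queue drops below the target. A sketch with illustrative names and parameter values, reusing the standard Erlang-C mean-queue formula:

```python
# Sketch of (13): the smallest processor count k whose M/M/k mean queue is at
# most Q(m1, U)/gamma.
from math import factorial

def mmm_queue(m: int, U: float) -> float:
    """Mean number in system for M/M/m at total utilization U (rho = U/m)."""
    rho = U / m
    inv_p0 = sum(U**k / factorial(k) for k in range(m)) \
             + U**m / (factorial(m) * (1 - rho))
    erlang_c = (U**m / (factorial(m) * (1 - rho))) / inv_p0
    return erlang_c * rho / (1 - rho) + U

def min_processors(m1: int, U: float, gamma: float) -> int:
    target = mmm_queue(m1, U) / gamma
    k = max(m1, int(U) + 1)          # need k > U for stability
    while mmm_queue(k, U) > target:
        k += 1
    return k

print(min_processors(1, 0.8, 2.0))   # 2: Q(2, 0.8) <= Q(1, 0.8)/2
```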
All these estimates were made for the
simplest case of one "aggregated" job in the system.
However, this simple back-of-the-envelope formula (9)
can be used for an approximate estimate of the
necessary upgrade if we aggregate all the jobs which
do not meet their goals.
Besides, the algorithm presented in the
previous section allows estimating the minimal
necessary upgrade level in the case of a tuning +
upgrade analysis. Suppose that, after using the algorithm
from the previous section for n jobs, we found that the
best possible performance indices which can be
achieved by tuning are γ1* ≥ γ2* ≥ … ≥ γn*, and
suppose that for k of the jobs their goals are not
reachable: γk* > 1. In this case we can consider one
aggregated job with total utilization Ua and performance
index γa, defined by (14):
Ua = ∑j=1..k λj/µ ,  γa = ∑j=1..k λjγj* / ∑j=1..k λj (14)

Solving equation (9) for the aggregated parameters
Ua, γa gives an estimate of the necessary upgrade
factor, υ. If the response times Rj* in (14) correspond to the
lexicographically optimal solution, (14) gives a good
estimate of the minimally necessary upgrade level
(see Tables 6, 7: tuning + upgrade sections). The high
accuracy of the aggregation may be explained by the
fact that the aggregated jobs 1,…,k are not affected by the
jobs k+1,…,n, which have lower priorities. In the general
case, aggregation gives a reasonable but not very
accurate approximation. Thus, in the example presented
in Table 5, using the aggregated formulas (14) and (11) instead of
the detailed information on the average priority
distributions leads to the estimate υ ≈ 1.4 instead of υ ≈ 1.6.
Now, to compare the tuning and hardware
upgrade options, let us consider the following illustrative
example.
Suppose we have 3 jobs running on a one-processor
CPU, and each job has a unit service time.
The parameters of these jobs (called, according to their
utilizations, Light, Medium, and Heavy) are presented
in Table 4. The job response times in Table 4
correspond to a mix of the priority orderings π1={2,1,0},
π2={2,0,1}, π3={0,1,2}, with probabilities
p={0.75, 0.15, 0.1}.
Table 4. Original job utilizations and response times

Jobs         Utilization   Actual Resp. Time   Goal
Light (L)    0.1           10.97               2
Medium (M)   0.2           4.30                4
Heavy (H)    0.45          2.65                6
In Table 5 we compare the two approaches, tuning
and a CPU upgrade. The tuning results correspond to the
lexicographically optimal solution obtained with the
algorithm from the previous section. As one can see from
Table 5, the goals are reachable: the system can be tuned in
such a way that all three jobs have the same performance
index, pj ≈ 0.8 (almost 20% better than the goals). At the
same time, reaching the goals with the existing priority
distribution would require an almost 60% faster processor
than the existing one (υ ≈ 1.58).
The significant difference between the existing and the
upgraded CPU (almost 60%) may be explained in this
example by the fact that the original priorities did not
follow the short-job-first rule (the most resource-consuming
Heavy job had the highest priority 75% of the
time). Or, putting it another way, there is a high (≈97%)
correlation between the goals and the loads.
Table 5. Tuning and upgrade options
comparison for correlated loads and goals
(60% faster CPU), G={2,4,6}

        Tuning                  Upgrade (υ≈1.58)
Jobs    Response   Resp/Goal    Response   Resp/Goal
L       1.85       0.81         2.00       1.00
M       3.69       0.81         1.35       0.34
H       4.62       0.81         1.48       0.25
In the case of a high correlation between the utilizations
and the goals (if all the service times are the same), one can
expect especially significant results from lexicographical
optimization. The opposite is true, too: when
loads and goals are highly negatively correlated, the potential
effect of tuning may be insignificant.
To illustrate this, we compared tuning vs. upgrade for the
same model, but with the goals for the Light and
Heavy jobs swapped, making the utilizations and goals negatively
(≈ -0.97) correlated. In this case of goals G={6,4,2}, tuning
alone is not sufficient (γ ≈ 1.3) and some upgrade is needed
(υ ≈ 1.08).
Table 6. Tuning and upgrade options comparison for balanced
(negatively correlated) loads and goals, G={6,4,2}

                 Tuning + upgrade (υ≈1.08)    Upgrade (υ≈1.21)
  Jobs   Goals    Response    Resp/Goal       Response    Resp/Goal
  L        6        6.0         1.0             4.54        0.76
  M        4        4.0         1.0             2.45        0.61
  H        2        2.0         1.0             1.99        1.0
As one can see from Table 6, in the case of "balanced"
goals, when tight goals correspond to the small jobs,
tuning does not help much: the necessary upgrade is about
21% without tuning and about 8% with tuning. This effect
is even more evident for multiprocessor systems. But, of
course, the major factor defining the necessary upgrade
level is how challenging the goals are.
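The Resp/Goal entries in these tables are simply the ratio of the achieved response time to the corresponding goal, with the worst ratio giving the overall index γ; a quick check against the Upgrade column of Table 6:

```python
# Upgrade (v≈1.21) column of Table 6: achieved responses vs. goals G={6,4,2}
responses = {"L": 4.54, "M": 2.45, "H": 1.99}
goals = {"L": 6.0, "M": 4.0, "H": 2.0}

# Per-job performance index: response time divided by its goal
index = {job: responses[job] / goals[job] for job in responses}
gamma = max(index.values())  # overall index: the worst resp/goal ratio

print({job: round(r, 2) for job, r in index.items()})
```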
In Table 7 we present results when all three goals are
unreachable and much more challenging than in the previous
examples (Gi = 1). In this case the "un-tuned" system needs
a 2.75 times faster processor to reach its goals, while the
tuning + upgrade combination needs a much smaller speedup
(υ = 1.75). Note that the estimates of the upgrade factor υ
(1.08 and 1.75) in Tables 6 and 7, obtained from the
aggregated formulas (10) and (14) applied to the
lexicographical solutions, prove to be quite accurate.
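Formulas (10) and (14) are not repeated here. As an illustration of the kind of computation involved, the sketch below finds the minimal upgrade factor υ by bisection, using the classical non-preemptive priority M/M/1 response-time formula (Cobham's formula, cf. [LK5]) as a stand-in response-time model; the three-class arrival rates, service times, and goals are hypothetical, not taken from the paper's examples.

```python
def response_times(lams, svcs, v):
    """Per-class response times in a non-preemptive priority M/M/1 queue
    (Cobham's formula; classes listed highest priority first), with the
    processor sped up by factor v, i.e. service times divided by v."""
    s = [x / v for x in svcs]
    rho = [l * si for l, si in zip(lams, s)]
    # Mean residual work: sum of lam_i * E[S_i^2] / 2 = sum lam_i * s_i^2
    # for exponential service times.
    w0 = sum(l * si * si for l, si in zip(lams, s))
    resp, sigma = [], 0.0
    for j in range(len(lams)):
        sig_prev = sigma
        sigma += rho[j]
        resp.append(w0 / ((1 - sig_prev) * (1 - sigma)) + s[j])
    return resp

def min_upgrade(lams, svcs, goals, lo=1.0, hi=64.0, iters=60):
    """Smallest speedup v with max_j resp_j(v)/goal_j <= 1 (bisection;
    response times decrease monotonically in v)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        r = response_times(lams, svcs, mid)
        if max(ri / gi for ri, gi in zip(r, goals)) <= 1:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical three-class example (priorities: class 0 highest)
lams, svcs, goals = [0.1, 0.2, 0.45], [1.0, 1.0, 1.0], [1.5, 2.0, 3.0]
v = min_upgrade(lams, svcs, goals)
```

At the returned υ the binding job's response time sits exactly on its goal, mirroring how the tightest goal determines the necessary upgrade level in the tables above.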
Table 7. Tuning and upgrade options comparison
in the case of challenging goals

                 Tuning + upgrade (υ≈1.75)    Upgrade (υ≈2.75)
  Jobs   Goals    Response    Resp/Goal       Response    Resp/Goal
  L        1        1.00        1.00            0.66        0.66
  M        1        1.00        1.00            0.55        0.55
  H        1        1.00        1.00            1.00        1.00
6. Summary
Lexicographical optimization is an important approach in
analyzing performance problems. This relatively simple
approach, which reflects the behavior of complicated real
tuning systems, finds a feasible solution when the
response time goals are achievable, and leads to fair,
lexicographically minimal solutions otherwise.
It was also shown that a straightforward optimization
approach based on the weighted average of the response
times is not very helpful in analyzing such systems.
Also presented are simplified, back-of-the-envelope
formulas for estimating the necessary level of upgrade in
cases when the performance goals are not reachable.
References
[GNT4] Georgiadis, L., Nikolaou, C., Thomasian, A., "A fair
workload allocation policy for heterogeneous systems",
Journal of Parallel and Distributed Computing, v. 64,
No. 4, April 2004, pp. 507-519.
[L9] Luss, H., "On equitable resource allocation problems:
a lexicographic minimax approach", Operations Research,
v. 47, No. 3, 1999, pp. 361-376.
[BGTV3] Bhattacharya, P.P., Georgiadis, L., Tsoucas, P.,
Viniotis, I., "Adaptive Lexicographic Optimization in
Multi-class M/GI/1 Queues", Mathematics of Operations
Research, v. 18, No. 3, 1993, pp. 705-740.
[LK5] Kleinrock, L., Queueing Systems, v. 1 and v. 2, 1975.
[GM0] Coffman, E., Mitrani, I., "A Characterization of
Waiting Time Performance Realizable by Single-Server
Queues", Operations Research, v. 28, 1980, pp. 810-821.
[CY1] Chen, H., Yao, D., Fundamentals of Queueing Networks,
Springer, 2001.
[BD0] Bolker, E., Ding, Y., "On the Performance Impact of
Fair Share Scheduling", Proc. CMG 2000, pp. 71-81.
[BB9] Bolker, E., Buzen, J., "Goal Mode: Part 1 - Theory",
CMG Transactions, v. 96, May 1999, pp. 9-15.
[SBG9] Shum, A., Buzen, J., Ginis, B., "Goal Mode: Part 2 -
Practice", CMG Transactions, v. 96, May 1999, pp. 16-21.
[IBM00] IBM AIX V4.3.3 Workload Manager, Technical
References, February 2000 Update.
[F1] Fujishige, S., Submodular Functions and Optimization,
Annals of Discrete Mathematics, v. 47, 1991, 270 p.
[R9] Rikun, A., "A Polynomial Algorithm for Evaluation of
Achievable Performance Solutions for Multi-class Queues",
submitted for publication.
[FG8] Federgruen, A., Groenevelt, H., "Characterization and
Optimization of Achievable Performance in General Queuing
Systems", Operations Research, v. 36, No. 5, 1988,
pp. 733-741.
[SB2] Shum, A., Buzen, J., "Industry-wide Implications of
the 'Capacity on Demand' & 'Pay-As-You-Go' Phenomena:
Intertwining of IT Budget and Capacity Planning", Journal
of Computer Resource Management, 2002, pp. 66-88.