Combating Software Aging: Use Two-Level
Rejuvenation to Maximize Average Resource
Performance and Minimize Tasks Deadline Miss
Rate
Hao Wu ∗, Student Member, IEEE, Chunhui Guo ∗, Student Member, IEEE, Xiayu Hua ∗, Student
Member, IEEE, Igor Lopes †, Shangping Ren ∗, Senior Member, IEEE
Abstract—Software aging is a common phenomenon which is often manifested through system performance degradation.
Rejuvenation is one of the most commonly used approaches to handle issues caused by software aging. To combat resource
performance degradation, we present a two-level rejuvenation strategy, i.e., interleaving a set of n warm rejuvenations with one cold
rejuvenation. Our first target is to find the optimal n that maximizes system average performance when no application’s information is
known a priori. We first define a resource model that takes performance degradation and two-level rejuvenations into consideration.
Based on the resource model, we formally analyze the resource supply and present the MAX-PERFORMANCE algorithm to determine
the optimal rejuvenation pattern that maximizes the average resource performance. When a task is deployed on the resource,
maximized average resource performance does not necessarily guarantee minimized task’s deadline miss rate. Hence, our second
target is to design a dynamic two-level rejuvenation strategy that minimizes task’s deadline miss rate when a real-time periodic task is
deployed on the resource. The simulation results show that with a two-level rejuvenation strategy, we can achieve 25.22% higher
average resource performance compared with a single-level rejuvenation strategy. In addition, with a two-level rejuvenation strategy, a
task’s deadline miss rate is always lower than when a single-level rejuvenation strategy is applied. The experimental
results also show that the asynchronized rejuvenation strategy outperforms the synchronized rejuvenation strategy in most
scenarios.
Index Terms—Software Aging, Performance Degradation, Resource Model, Two-Level Rejuvenation, Resource Supply Analysis,
Deadline Miss Rate Minimization, Resource Performance Maximization
1 INTRODUCTION
HARDWARE aging is a well-known phenomenon in computer systems that slows down system performance and eventually leads to transient failures [2]. Like hardware, software running on computer systems also ages [17]. Software aging is usually caused by memory leaks and error accumulation. Unlike hardware aging, which may take a long time to affect system performance, software aging may reveal itself within a relatively short period [11], [2]. As modern computer systems become more complex and support more concurrent applications, software aging becomes more evident and has a significant impact on system performance. According to [3], computer system outages are now caused more often by software failures than by hardware failures.
Software aging also affects the performance of today’s mobile devices. To provide evidence of this slowdown on cellphones, we have written an Android APP that computes the product of two 500 × 500
∗The authors are with Department of Computer Science, Illinois Institute of
Technology, Chicago, IL 60616, USA
∗{ hwu28, cguo13, xhua}@hawk.iit.edu, ren@iit.edu
∗The research is supported in part by NSF under grant number CAREER
0746643, CNS 1018731, CNS 1035894, and CPS 1545008.
†Igor Lopes is with Department of Computer Science, University of Idaho.
oliv7721@vandals.uidaho.edu
matrices and records the computation time. The APP runs
on a cellphone with a Qualcomm 1.5GHz dual-core, 1G
RAM, and 2G internal storage. The APP is the only appli-
cation running on the cellphone under Android 2.3.6. Fig. 1
shows the measurements of the computation time of matrix
multiplication over about 5 days. Each point represents
the average computation time of 300 matrix multiplication
computations. From Fig. 1, we make the following observations:
1) The computation time within the intervals [1, 10], [15, 20], [24, 39], and [42, 53] has an increasing trend, which indicates that the cellphone suffers from aging effects.
2) The computation time within the intervals [10, 15], [20, 24], and [39, 42] has a decreasing trend. The log file indicates that the cellphone was rebooted at point 10, and that the matrix multiplication application was restarted at points 20 and 39.
The second observation also indicates that both cell-
phone reboot (cold rejuvenation) and application restart
(warm rejuvenation) can restore a cellphone’s performance,
but the restore capability of cold rejuvenation is higher than that of warm rejuvenation. In addition, the resource performance after the second warm rejuvenation is lower than after the first warm rejuvenation.
Fig. 1. Aging Effect of Matrix Multiplication Time on Cellphone (x-axis: No. of Points; y-axis: Seconds)
Currently, smartphone subsystems incorporate reset mechanisms, called “silent resets”, to restore functionality while minimizing user impact. However, these resets happen reactively and are neither predicted nor scheduled. For instance, WiFi has a reset mechanism known as subsystem restart. It is a structure that allows the WiFi firmware to restart its execution point. The reset is initiated by either a program fault (e.g., out-of-bound memory access, bad instruction jump, or memory corruption) or a health-monitoring trigger (e.g., inability to transmit packets for a period of time (hardware lockup), packet memory overflow, or register value lockup).
For systems that support long-lasting applications, software aging is an unavoidable phenomenon that may cause the applications running on the system to violate their QoS requirements. Hence, rejuvenation is necessary to maintain system performance at an expected level. However, rejuvenation takes time, during which the system is not available to user applications. Different levels of rejuvenation have different overheads and different performance restore capabilities, and result in different system performance. In this paper, we design two-level rejuvenation strategies that combine warm and cold software rejuvenation to maximize the system’s average performance when no task information is known a priori, and to minimize the task’s deadline miss rate when a real-time periodic task is deployed on the resource. In particular, we extend our previous resource with performance degradation and periodic rejuvenation ($P^2$-resource) model [10] and integrate it with the two-level rejuvenation strategy [6]. Based on the extended resource model, we formally analyze the resource supply and give the optimal combination of warm and cold software rejuvenations that maximizes the system’s average performance when no task information is given. We further theoretically analyze the deadline miss rate of a single real-time periodic task when the two-level rejuvenation strategy is applied to restore the resource’s performance. The optimal rejuvenation period that minimizes the task’s deadline miss rate when the rejuvenation period is synchronized with the task is also calculated.
The rest of the paper is organized as follows. First,
we discuss related work in Section 2. Two-level rejuvena-
tion strategy design for maximizing average system per-
formance is presented in Section 3. In Section 4, we dis-
cuss the dynamic two-level rejuvenation strategy design for
minimizing real-time task’s deadline miss rate. We verify
the theoretical analysis and evaluate the proposed resource
model by simulations in Section 5. Section 6 concludes the
paper and points out our future work.
2 RELATED WORK
Software rejuvenation is a preventive and proactive main-
tenance solution for handling system aging effects. Huang
et al. [11] first proposed the concept of software rejuvena-
tion and developed a four-state (i.e., Robust State, Failure
Probable State, Failure State, and Rejuvenation State) system
model to reflect system operational states. Since then, many
rejuvenation models have been developed by the research
community [11], [8]. For instance, Koutras et al. extended the
initial rejuvenation model by considering two levels of re-
juvenation actions [15], i.e., perfect rejuvenation action and
minimal rejuvenation action. The perfect rejuvenation (cold
rejuvenation) results in the system returning to the Robust
State (initial state), while the minimal rejuvenation (warm
rejuvenation) results in the system returning to the Failure
Probable State (the state before rejuvenation). Alonso et al.
experimentally compared the overhead by taking different
software rejuvenation technologies [1]. They categorize the
software rejuvenation into three different granularities, i.e.
application level, operating system (OS) level and hardware
level. The application level rejuvenation takes the least time
but also has the least impact on the system performance. The
hardware level rejuvenation takes the longest time but leads
to the best system performance. The OS level rejuvenation
is in the middle for both time cost and performance impact.
To analyze software aging and study aging related fail-
ures, Trivedi et al. [20] presented two approaches: analyt-
ical modeling approach for determining optimal times to
rejuvenate and measurement based approach for detection
and validation. Tai et al. [19] identified key factors that
may impact system reliability and developed an approach
to maximizing system reliability by analyzing the optimal
interval between maintenances. Guo et al. considered both
transient faults caused by software aging effects and net-
work transmission faults and analyzed the optimal software
rejuvenation period that maximizes system reliability [7].
Okamura et al. [16] discussed a maintenance policy that
combines aperiodic rejuvenations and periodic checkpoints
to maximize the system availability. The estimations of
reliability and availability were analyzed in [18], [15].
The two-level rejuvenation model has also been ana-
lyzed by the research community. Hong et al. studied two-
level closed-loop rejuvenation techniques and proposed an
approach to minimize the average rejuvenation cost [9].
Koutras et al. observed the effects of a two-level software
rejuvenation model on availability, downtime and rejuvena-
tion cost indicators [14]. The two-level rejuvenation model
was also modeled by a Semi-Markov process and analyzed
to find the optimal rejuvenation policy to maximize the
system availability [21], [13], [12].
As pointed out in [4], a general characteristic of software
aging is the gradual performance degradation and/or an in-
crease in the software failure rate. The above works mainly
focus on aging related failure effects on QoS and how to
perform rejuvenations to optimize the system QoS, such
as availability and reliability. However, not much work has
been done on designing rejuvenation strategies on when to
perform rejuvenations to improve the resource performance.
Recently, Hua et al. [10] proposed a new resource with performance degradation and periodic rejuvenation ($P^2$-resource) model, which takes software aging and periodic resource rejuvenations into consideration. It gives a formal schedulability analysis under the $P^2$-resource model for both the EDF (earliest deadline first) and RM (rate monotonic) scheduling algorithms.
In this paper, we extend the $P^2$-resource model to consider both warm and cold software rejuvenations along with their impacts on system performance. Based on the extended resource model, we formally analyze the resource’s supply and present a linear search algorithm to determine the optimal interleaving of warm and cold rejuvenations that maximizes the average resource performance.
3 TWO-LEVEL REJUVENATION STRATEGY TO
MAXIMIZE AVERAGE SYSTEM PERFORMANCE
3.1 Models and Assumptions
Resource Performance Function
We use function f(t) to denote the resource perfor-
mance at time t. The resource performance represents the
computation cycles per unit time provided by the resource
to applications. As the system performance degrades over
time, we assume that the resource performance function
f(t) is a decreasing function and f(0) = 1 [17], [10]. As for
any decreasing resource performance function, the strategy
to analyze the resource’s performance is the same. Hence, to
simplify the discussion of our approach, we further assume
that the resource performance function is a linear decreasing
function, i.e.,
f(t) = 1 − at
where a denotes the resource performance decreasing rate
which is assumed to be a constant and 0 ≤ a < 1. If a = 0,
the resource’s performance does not degrade.
Resource Rejuvenation Pattern
Similar to [13], [18], [1], the system can perform two levels of rejuvenation, i.e., cold rejuvenation and warm rejuvenation. Once the resource’s performance f(t) degrades to a threshold r (0 ≤ r < 1), we take a warm or cold rejuvenation to restore its performance. After a cold rejuvenation, the system returns to the Robust State, and the resource performance is restored to f(t) = 1. When a warm rejuvenation is completed, the system goes back to the Failure Probable State, and the resource performance function becomes $f_i(t) = p^i - a(t - t_i)$, where i denotes the $i$th warm rejuvenation, p is the resource performance restore factor (0 < p < 1), and $t_i$ is the finish time of the $i$th warm rejuvenation. The resource is unavailable while it goes through the rejuvenation process. The downtime caused by each cold rejuvenation and each warm rejuvenation is assumed to be a constant, ΦC and ΦW, respectively. We further assume ΦC > ΦW.
As the resource performance after each warm rejuve-
nation is smaller than the previous warm rejuvenation, if
we only take warm rejuvenations, the resource performance
will eventually be below the threshold r and hence a cold
rejuvenation becomes necessary. We define the rejuvenation
pattern as n (n ∈ N) warm rejuvenations followed by one
cold rejuvenation, as shown in Fig. 2. The time interval of
an entire rejuvenation pattern is denoted as rejuvenation
hyperperiod Π. We assume that the resource is repeatedly
rejuvenated by the above pattern with period Π.
Fig. 2. Resource Rejuvenation Pattern
As the initial resource performance is f(0) = 1, the restored resource performance after n warm rejuvenations is $f_n(t_n) = p^n$. The resource performance after the nth warm rejuvenation must not be smaller than the threshold, i.e., $p^n \ge r$; otherwise the nth rejuvenation should be a cold rejuvenation. Since 0 < p < 1, we have $n \le \log_p r$, and the maximal number of warm rejuvenations before a cold rejuvenation in the rejuvenation pattern is $N_{max} = \lfloor \log_p r \rfloor$.
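As a quick check of the bound on the number of warm rejuvenations, the following Python sketch counts the largest n with $p^n \ge r$ by direct iteration rather than by evaluating the logarithm, which avoids floating-point rounding near exact powers. The helper name `max_warm_rejuvenations` is hypothetical and not part of the resource model, and the sketch assumes 0 < r < 1 so that the bound is finite.

```python
def max_warm_rejuvenations(p: float, r: float) -> int:
    """Return the largest n with p**n >= r, i.e. floor(log_p r).

    Hypothetical helper for illustration; assumes 0 < p < 1 and 0 < r < 1.
    """
    if not (0.0 < p < 1.0 and 0.0 < r < 1.0):
        raise ValueError("expected 0 < p < 1 and 0 < r < 1")
    n = 0
    # Keep adding warm rejuvenations while the restored performance
    # p**(n + 1) would still be at or above the threshold r.
    while p ** (n + 1) >= r:
        n += 1
    return n
```

For instance, with p = 0.9 and r = 0.5 the function returns 6, because $0.9^6 \approx 0.531 \ge 0.5$ while $0.9^7 \approx 0.478 < 0.5$.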
Resource Model
The resource model is characterized by a 6-tuple
R(f(t), r, p, ΦW , ΦC, n), where f(t) is the initial resource
performance function, r is the resource performance thresh-
old to start a cold rejuvenation, p is the resource perfor-
mance restore factor of a warm rejuvenation, ΦW is the
warm rejuvenation time cost, ΦC is the cold rejuvenation
time cost, and n is the number of warm rejuvenations before
a cold rejuvenation in the rejuvenation pattern. We assume
the resource starts at time zero.
If the resource only takes cold rejuvenations, i.e., n = 0, the resource model degenerates to the $P^2$-resource model in [10].
Average Resource Performance
We define the average resource performance within a system’s longevity L as the ratio between the total resource supply $S_L$ within L and the longevity L, i.e.,
$$A_L = \frac{S_L}{L} \tag{1}$$
3.2 Problem Formulation
The problem we are to address is defined below:
Problem: Given a resource R(f(t), r, p, ΦW , ΦC, n), decide n
that maximizes the average resource performance, i.e., AL, within
its operational interval [0, L].
According to Eq. (1), maximizing the average resource
performance AL is to maximize the total resource supply SL
with given system longevity L. We take two steps to address
the problem. First, we analyze the total resource supply SL
with a given rejuvenation pattern (Section 3.3). Second, we
present the MAX-PERFORMANCE algorithm (Section 3.4)
to determine the optimal rejuvenation pattern, i.e., n with
respect to maximizing average resource performance.
3.3 Resource Supply Analysis
In this subsection, we first analyze the resource supply SΠ
within a rejuvenation hyperperiod, i.e., within two cold
rejuvenations, and then formalize the total resource supply
SL within the system longevity on the basis of SΠ.
3.3.1 Resource Supply within Rejuvenation Hyperperiod Π
Suppose the rejuvenation pattern is given as follows: n
warm rejuvenations followed by one cold rejuvenation, as
shown in Fig. 2.
Before the $i$th ($1 \le i \le n + 1$) rejuvenation, the start performance and the end performance of the resource are $p^{i-1}$ and r, respectively. The resource available time length of the $i$th rejuvenation is
$$l_i = f^{-1}(r) - f^{-1}(p^{i-1}) = \frac{p^{i-1} - r}{a}. \tag{2}$$
The resource supply of the $i$th rejuvenation is
$$S_i = \int_{f^{-1}(p^{i-1})}^{f^{-1}(r)} f(t)\,dt = \int_{\frac{1-p^{i-1}}{a}}^{\frac{1-r}{a}} f(t)\,dt. \tag{3}$$
To generalize Eq. (2) and Eq. (3), we assume $l_0 = 0$ and $S_0 = 0$.
A rejuvenation pattern contains n + 1 resource available intervals, n warm rejuvenations, and one cold rejuvenation. The rejuvenation hyperperiod is
$$\Pi = \sum_{i=1}^{n+1} l_i + n \cdot \Phi_W + \Phi_C. \tag{4}$$
The resource supply within the rejuvenation hyperperiod Π is the sum of the supplies of the n + 1 resource available intervals, i.e.,
$$S_\Pi = \sum_{i=1}^{n+1} S_i. \tag{5}$$
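To make Eqs. (2) through (5) concrete, the sketch below evaluates the hyperperiod and its supply for the linear model f(t) = 1 − at. It uses the closed form $S_i = (p^{2(i-1)} - r^2)/(2a)$, which follows from integrating Eq. (3) for this particular f. The function name `hyperperiod_and_supply` and the parameter values in the usage note are illustrative assumptions, not values taken from the paper.

```python
def hyperperiod_and_supply(a, r, p, n, phi_w, phi_c):
    """Return (hyperperiod, supply) following Eqs. (2)-(5) for f(t) = 1 - a*t.

    Hypothetical helper; assumes a > 0 and p**n >= r so that every
    resource available interval has non-negative length.
    """
    if a <= 0:
        raise ValueError("a must be positive")
    if p ** n < r:
        raise ValueError("n exceeds the maximal number of warm rejuvenations")
    hyperperiod = n * phi_w + phi_c              # rejuvenation downtime in Eq. (4)
    supply = 0.0
    for i in range(1, n + 2):
        start = p ** (i - 1)                     # performance when interval i begins
        hyperperiod += (start - r) / a           # l_i, Eq. (2)
        supply += (start ** 2 - r ** 2) / (2 * a)  # S_i, Eq. (3) in closed form
    return hyperperiod, supply
```

With the assumed values a = 0.1, r = 0.5, p = 0.9, n = 1, ΦW = 1, and ΦC = 2, the two available intervals have lengths 5 and 4, so the hyperperiod is 12 and the supply is 3.75 + 2.8 = 6.55.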
3.3.2 Resource Supply within System Longevity L
In practical cases, the system longevity is much larger than
the rejuvenation hyperperiod, i.e., L >> Π [11]. We divide
the analysis of the resource supply SL within the longevity
into two cases based on if the longevity L is divisible by the
rejuvenation hyperperiod Π or not.
Case 1: L mod Π = 0
In this case, the system longevity contains L/Π entire rejuvenation hyperperiods. The total resource supply within the longevity is L/Π times the resource supply $S_\Pi$ of one rejuvenation hyperperiod, i.e.,
$$S_L = S_\Pi \cdot \frac{L}{\Pi}. \tag{6}$$
Case 2: L mod Π ≠ 0
In this case, we divide the total resource supply into two parts: the resource supply in the interval containing $\lfloor L/\Pi \rfloor$ entire rejuvenation hyperperiods and the resource supply of the remaining time interval $I_R$. Hence, the total resource supply within the longevity is
$$S_L = S_\Pi \cdot \left\lfloor \frac{L}{\Pi} \right\rfloor + S_R, \tag{7}$$
where $S_R$ is the resource supply of the remaining time interval $I_R$ with length $l_R = L \bmod \Pi$.
We further divide the analysis of the remaining resource
supply SR into two cases based on if the remaining time
interval IR ends in the time period when a rejuvenation is
in process.
Fig. 3. Resource Supply Analysis
Case 2.1: $I_R$ ends during a rejuvenation
As shown in Fig. 3, the remaining interval $I_R$ ending during the $j$th rejuvenation implies
$$\begin{cases} \sum_{i=1}^{j} l_i + (j-1)\Phi_W \le l_R \le \sum_{i=1}^{j} l_i + j\Phi_W & \text{if } 1 \le j \le n \\ \Pi - \Phi_C \le l_R \le \Pi & \text{if } j = n+1 \end{cases} \tag{8}$$
where j ∈ N.
The resource supply $S_R$ within $I_R$ is hence
$$S_R = \sum_{i=1}^{j} S_i \tag{9}$$
where the value of j can be calculated from $l_R$ and Eq. (8).
Case 2.2: $I_R$ ends when the resource is available
Similar to Case 2.1, $I_R$ may end in the $j$th resource available interval, which implies
$$\sum_{i=0}^{j-1} l_i + (j-1)\Phi_W \le l_R \le \sum_{i=0}^{j} l_i + (j-1)\Phi_W \tag{10}$$
where 1 ≤ j ≤ n + 1 and j ∈ N.
Hence, as shown in Fig. 3, the resource supply within $I_R$ is
$$S_R = \sum_{i=0}^{j-1} S_i + \int_{f^{-1}(p^{j-1})}^{f^{-1}(p^{j-1}) + l_R - \sum_{i=0}^{j-1} l_i - (j-1)\Phi_W} f(t)\,dt \tag{11}$$
where the value of j can be calculated from $l_R$ and Eq. (10).
3.4 Average Resource Performance Maximization
As n ∈ N and 0 ≤ n ≤ $N_{max}$, the possible choices of the number of warm rejuvenations n that maximizes the average resource performance $A_L$ are limited. We present a linear search method, i.e., the MAX-PERFORMANCE algorithm, to determine the optimal number of warm rejuvenations $N^*$ before a cold rejuvenation. The MAX-PERFORMANCE algorithm is given as Algorithm 1.
Algorithm 1 MAX-PERFORMANCE
Input: A resource R(f(t), r, p, ΦW, ΦC, n) and the system longevity L.
Output: The optimal warm rejuvenation number $N^*$ before a cold rejuvenation that maximizes the average resource performance, and the maximal average resource performance $f_{max}$ during the system longevity.
1: $N^* = 0$
2: $f_{max} = 0$
3: $N_{max} = \lfloor \log_p r \rfloor$
4: for n = 0 to $N_{max}$ do
5:   Calculate $S_L$ according to Eq. (6) or Eq. (7)
6:   $A_L = S_L / L$
7:   if $A_L > f_{max}$ then
8:     $N^* = n$
9:     $f_{max} = A_L$
10:   end if
11: end for
12: return $N^*$ and $f_{max}$
In particular, the MAX-PERFORMANCE algorithm initializes both the optimal number of warm rejuvenations $N^*$ and the maximal average resource performance $f_{max}$ to 0 (lines 1-2), and calculates the maximal possible number of warm rejuvenations $N_{max}$ (line 3). In the for loop (lines 4-11), for each possible number of warm rejuvenations n, we calculate the total resource supply $S_L$ (line 5) according to the analysis in Section 3.3 and the average resource performance $A_L$ (line 6). Then we determine whether the current n maximizes the average resource performance (lines 7-10). The algorithm returns the maximal average resource performance $f_{max}$ and the corresponding number of warm rejuvenations $N^*$ (line 12).
Based on the resource supply analysis in Section 3.3, the resource supply calculation in Algorithm 1 (line 5) costs O(n) time. Hence, the time complexity of Algorithm 1 is $O(n^2)$.
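The linear search can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes the linear model f(t) = 1 − at with a > 0, 0 < r < 1, and positive downtimes, and it obtains $S_L$ by walking through the rejuvenation pattern interval by interval instead of evaluating the case analysis of Eqs. (6) through (11) symbolically, so its running time grows with L/Π. The names `total_supply` and `max_performance` are hypothetical.

```python
def total_supply(L, a, r, p, n, phi_w, phi_c):
    """Total supply S_L over [0, L] for f(t) = 1 - a*t with n warm + 1 cold per pattern."""
    t, supply, i = 0.0, 0.0, 1
    while t < L:
        start = p ** (i - 1)                   # performance at the start of interval i
        length = max(0.0, (start - r) / a)     # l_i, Eq. (2)
        usable = min(length, L - t)            # truncate the last interval at L
        supply += start * usable - 0.5 * a * usable ** 2
        t += usable
        if i <= n:                             # warm rejuvenation follows interval i
            t += phi_w
            i += 1
        else:                                  # cold rejuvenation closes the pattern
            t += phi_c
            i = 1
    return supply


def max_performance(L, a, r, p, phi_w, phi_c):
    """Linear search over n = 0..N_max in the spirit of Algorithm 1."""
    if not (a > 0 and 0 < r < 1 and 0 < p < 1 and phi_w > 0 and phi_c > 0):
        raise ValueError("parameters outside the assumed ranges")
    n_max = 0
    while p ** (n_max + 1) >= r:               # N_max = floor(log_p r)
        n_max += 1
    best_n, best_avg = 0, 0.0
    for n in range(n_max + 1):
        avg = total_supply(L, a, r, p, n, phi_w, phi_c) / L   # A_L = S_L / L
        if avg > best_avg:
            best_n, best_avg = n, avg
    return best_n, best_avg
```

With the assumed values L = 8400, a = 0.1, r = 0.5, p = 0.9, ΦW = 1, and ΦC = 2, the search selects one warm rejuvenation per cold rejuvenation, with an average performance of 6.55/12 ≈ 0.546, compared with 3.75/7 ≈ 0.536 for cold rejuvenation only.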
4 TWO-LEVEL REJUVENATION STRATEGY TO
MINIMIZE REAL-TIME TASK’S DEADLINE MISS
RATE
Fig. 4. Schedule Example: (a) schedule for task τ(3, 6) on $P^2$-resource R(f(t) = 1 − 0.1t, 7, 2); (b) schedule for task τ(3, 6) on $P^2$-resource R(f(t) = 1 − 0.1t, 6, 2)
Fig. 5. Improved Resource Model
So far, we have discussed how frequently warm rejuvenation needs to be applied between two cold rejuvenations so that the average resource performance is maximized. The discussion focuses on maximizing the resource’s average performance without any application information and relies on a designated threshold r. However, when tasks are deployed on the resource, a constant threshold r may not be optimal for minimizing the tasks’ deadline miss rate on a resource with given performance degradation. In other words, maximizing the resource’s average performance does not necessarily guarantee that task deadlines are met. As an example, consider a simplified periodic resource with only cold rejuvenation R(f(t), Π, ΦC), where f(t) = 1 − 0.1t is the resource performance degradation function, Π is the resource period, and ΦC = 2 is the cold rejuvenation overhead. The average resource performance is
$$A = \frac{\int_0^{\Pi-2} f(t)\,dt}{\Pi} = \frac{-0.05\Pi^2 + 1.2\Pi - 2.2}{\Pi}.$$
Letting $\frac{dA}{d\Pi} = \frac{-0.05\Pi^2 + 2.2}{\Pi^2} = 0$, we obtain that when $\Pi = \lceil 6.63 \rceil = 7$, the average resource performance is maximized at 0.5357. However, if a task τ(3, 6) is deployed on the resource R(f(t) = 1 − 0.1t, 7, 2), both job instances J6 and J7 miss their deadlines, as depicted in Fig. 4(a). If we set $\Pi = \lfloor 6.63 \rfloor = 6$, then all the job instances of task τ(3, 6) meet their deadlines (Fig. 4(b)), but the average resource performance is only 0.5333, which is smaller than 0.5357. The example reveals that the rejuvenation schedule that optimizes average resource performance does not necessarily meet the task’s deadline requirement. Hence, we are to decide the optimal rejuvenation schedule that minimizes a given task’s deadline miss rate.
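The numbers in the example above can be reproduced with a few lines of Python. The sketch assumes the cold-only model of the example (f(t) = 1 − 0.1t, ΦC = 2). The function names are hypothetical, and the closed form $\sqrt{\Phi_C^2 + 2\Phi_C/a}$ for the stationary point is derived here by setting the derivative of the average performance to zero; it is not quoted from the paper.

```python
import math


def average_performance(period, a=0.1, phi_c=2.0):
    """Average performance of a cold-only periodic resource with f(t) = 1 - a*t."""
    uptime = period - phi_c
    supply = uptime - 0.5 * a * uptime ** 2    # integral of f over [0, uptime]
    return supply / period


def stationary_period(a=0.1, phi_c=2.0):
    """Period at which d(average)/d(period) = 0 (derived for this sketch)."""
    return math.sqrt(phi_c ** 2 + 2.0 * phi_c / a)
```

Here `stationary_period()` evaluates to $\sqrt{44} \approx 6.63$, `average_performance(7)` to 3.75/7 ≈ 0.5357, and `average_performance(6)` to 3.2/6 ≈ 0.5333, matching the values in the text.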
4.1 Models and Definitions
In this paper, we only consider the scheduling problem of a single real-time periodic task on resources with performance degradation. A real-time periodic task τ is represented as a 2-tuple τ(e, T), where e is the execution demand in computational cycles and T is task τ’s period, which is also its deadline.
In Section 3, we use a constant performance degradation threshold r to determine when to perform a rejuvenation. However, when a real-time task is deployed on the resource, rejuvenation may need to be performed according to the task’s execution, and a constant performance degradation threshold may not be sufficient to capture the dynamics of when to perform rejuvenation. We extend the resource model to a 7-tuple R(f(t), p, ΦC, ΦW, n, Πc, Πw). In addition to the performance function f(t), the resource performance restore factor of a warm rejuvenation p, the cold rejuvenation downtime ΦC, the warm rejuvenation downtime ΦW, and the number of warm rejuvenations n between two cold rejuvenations as defined in Section 3, the resource model is characterized by a set of cold rejuvenation periods $\Pi_c = \{\pi_c^1, \ldots, \pi_c^j\}$ and a set of warm rejuvenation periods $\Pi_w = \{\pi_w^1, \ldots, \pi_w^k\}$ within a rejuvenation hyperperiod. As illustrated in Fig. 5, the rejuvenation period is defined as the time interval that starts from the resource available time after the previous rejuvenation and lasts to the end of the current rejuvenation. All the elements in Πc and Πw are in chronological order. For a rejuvenation hyperperiod, we have $\pi_c^j \bmod T = 0$, where $\pi_c^j$ is the last element in Πc. According to the relationship between the rejuvenation period and the task period, we define two categories of rejuvenation strategy: synchronized rejuvenation and asynchronized rejuvenation.
Definition 1 (Synchronized Rejuvenation). Consider a real-time periodic task τ(e, T) and a resource R(f(t), p, ΦC, ΦW, n, Πc, Πw). The resource performs synchronized rejuvenation on task τ if $\forall \pi_c^i \in \Pi_c, \pi_w^j \in \Pi_w$: $\pi_c^i \bmod T = 0 \wedge \pi_w^j \bmod T = 0$.
Definition 2 (Asynchronized Rejuvenation). Consider a real-time periodic task τ(e, T) and a resource R(f(t), p, ΦC, ΦW, n, Πc, Πw). The resource performs asynchronized rejuvenation on task τ if $\exists \pi_c^i \in \Pi_c \vee \pi_w^j \in \Pi_w$: $\pi_c^i \bmod T \ne 0 \vee \pi_w^j \bmod T \ne 0$.
4.2 Cold Rejuvenation Strategy for Single Periodic
Task
We first consider a simple case where only cold rejuvenation is performed and only one real-time periodic task is deployed on the resource. The resource model can be presented as R(f(t), 1, ΦC, ΦW, 0, Πc = {π}, Πw = ∅). We can further simplify the resource model as R(f(t), π, ΦC), where f(t) is the resource degradation function, π is the rejuvenation period, and ΦC is the cold rejuvenation overhead.
We assume that synchronized rejuvenation is performed. From Fig. 4, we can see that when the task’s release time is synchronized with the resource restore time, the task in the example can be scheduled. The following lemma also indicates that when $\Phi_C - \lfloor \Phi_C/T \rfloor T + e_{i_{max}} \le T$, synchronized rejuvenation can minimize the task’s deadline miss rate.
Lemma 1. Consider a resource R(f(t), p = 1, ΦC, ΦW, n = 0, Πc = {πc}, Πw = ∅) and a real-time task τ(e, T) deployed on it. Synchronized rejuvenation minimizes the task’s deadline miss rate if $\Phi_C - \lfloor \Phi_C/T \rfloor T + e_{i_{max}} \le T$, where $e_{i_{max}}$ is the execution time of the last task instance before rejuvenation.
Proof. At least $\lfloor \Phi_C/T \rfloor$ task instances miss their deadlines during the cold rejuvenation downtime. Assume that $i_{max}$ task instances can finish their execution before their deadlines after each cold rejuvenation. Hence, the minimum possible deadline miss rate when only cold rejuvenation is performed is
$$\frac{\lfloor \Phi_C/T \rfloor}{\lfloor \Phi_C/T \rfloor + i_{max}}.$$
During the cold rejuvenation downtime, a total of $\lceil \Phi_C/T \rceil$ task instances are released. To achieve the minimum deadline miss rate, we have to guarantee that the additional task instance released during the cold rejuvenation downtime is the $i_{max}$th task instance. Therefore, we have
$$\Phi_C - \left\lfloor \frac{\Phi_C}{T} \right\rfloor T \le T - e_{i_{max}} \;\Rightarrow\; \Phi_C - \left\lfloor \frac{\Phi_C}{T} \right\rfloor T + e_{i_{max}} \le T.$$
Guo et al. proved that when the task periods are harmonic with a periodic resource, the utilization bound can be maximized [5]. We therefore hypothesize that when synchronized rejuvenation is performed, the task’s deadline miss rate can be minimized. Accordingly, we design a rejuvenation strategy such that rejuvenation is synchronized with the task and the task’s deadline miss rate is minimized. Such a design not only possibly reduces the task’s deadline miss rate, but also simplifies the implementation. We will validate the hypothesis in Section 5.
Our first step is to find the maximum number of instances that can finish their execution without any rejuvenation.
The resource supply for each instance can be calculated by the following equation:
$$S_i = \int_{(i-1)T}^{iT} f(t)\,dt \tag{12}$$
Lemma 2. For a resource with performance degradation function f(t) and a real-time task τ(e, T) deployed on it, if $S_{i_{max}} \ge e > S_{i_{max}+1}$, at most $i_{max}$ task instances can finish their execution before their deadlines.
Proof. The proof of Lemma 2 is straightforward. Since the performance degradation function is a linear decreasing function, once the resource supply in one task period cannot provide e cycles of computation, the task instance in that period misses its deadline, and the following task instances also miss their deadlines.
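For the linear model f(t) = 1 − at, Eq. (12) has the closed form $S_i = T - aT^2(2i-1)/2$, so $i_{max}$ in Lemma 2 can be found by a simple scan. The sketch below follows Eq. (12) literally, without clamping f(t) at zero, and assumes a > 0 so that the scan terminates. The function names are hypothetical.

```python
def supply_in_period(i, T, a):
    """Eq. (12) for f(t) = 1 - a*t: supply over the i-th task period [(i-1)T, iT]."""
    return T - 0.5 * a * T * T * (2 * i - 1)


def max_schedulable_instances(e, T, a):
    """Largest i_max with S_{i_max} >= e > S_{i_max + 1} (0 if even S_1 < e)."""
    if a <= 0:
        raise ValueError("a must be positive for the scan to terminate")
    i_max = 0
    # S_i decreases linearly in i, so the first period that cannot supply
    # e cycles marks the end of the schedulable instances.
    while supply_in_period(i_max + 1, T, a) >= e:
        i_max += 1
    return i_max
```

For the task τ(3, 6) and a = 0.1 used in the earlier example, $S_1 = 4.2 \ge 3 > S_2 = 0.6$, so the function returns 1.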
With Lemma 2, we know that after the $i_{max}$th instance, we have to perform rejuvenation so that the following task instances have an opportunity to finish their executions. However, cold rejuvenation takes time, and some task instances cannot be executed during the rejuvenation.
Lemma 3. For a resource with performance degradation function f(t), cold rejuvenation downtime ΦC, and a real-time task τ(e, T) deployed on it, there are at least $\left\lceil \frac{\Phi_C - T + e}{T} \right\rceil$ task instances that miss their deadlines during the rejuvenation downtime.
Proof. In a cold rejuvenation downtime ΦC, there are $n = \lfloor \Phi_C/T \rfloor$ task instances that certainly miss their deadlines. A task instance has at most T − e spare time in its period. If the remaining rejuvenation time ΦC − nT > (T − e), then another task instance misses its deadline. If the remaining rejuvenation time ΦC − nT < (T − e), it is possible that the following task instance finishes within its deadline.
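The counting argument in the proof can be written out directly. The sketch below implements the two-step count (whole task periods covered by the downtime, plus one more instance if the leftover downtime exceeds the spare time T − e) and, for comparison, a closed form that reads the lemma’s bound as a ceiling; that reading is an assumption of this sketch. Integer inputs are assumed, and the function names are hypothetical.

```python
def min_missed_in_downtime(phi_c, T, e):
    """Count from the proof: floor(phi_c / T), plus one if the remainder exceeds T - e."""
    full = phi_c // T                    # instances whose whole period is downtime
    remaining = phi_c - full * T         # leftover downtime after the full periods
    return full + (1 if remaining > T - e else 0)


def min_missed_closed_form(phi_c, T, e):
    """ceil((phi_c - T + e) / T) using integer arithmetic (assumed reading of the bound)."""
    return -((T - e - phi_c) // T)
```

For T = 6 and e = 3, a downtime of 13 covers two full periods with a remainder of 1 ≤ 3, giving 2 missed instances, while a downtime of 16 leaves a remainder of 4 > 3 and gives 3.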
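Lemma 4. Consider a resource R(f(t), π, ΦC) with performance degradation function f(t) and cold rejuvenation downtime ΦC, and a real-time task τ(e, T) deployed on it. The task’s deadline miss rate is minimized when
$$\Pi = \begin{cases} i_{max}T + \lfloor \Phi_C/T \rfloor T - \Phi_C, & \text{if } \Phi_C - \lfloor \Phi_C/T \rfloor T \le T - e_{i_{max}} \\ i_{max}T + \lceil \Phi_C/T \rceil T - \Phi_C, & \text{if } \Phi_C - \lfloor \Phi_C/T \rfloor T > T - e_{i_{max}} \end{cases} \tag{13}$$
where $e_{i_{max}}$ is the execution time of the $i_{max}$th instance.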
Proof. Assume the execution time of the $i_{max}$th instance is $e_{i_{max}}$; the idle time in the $i_{max}$th instance’s period is then $T - e_{i_{max}}$. As the end of rejuvenation coincides with a task instance’s release time, at least $\lfloor \Phi_C/T \rfloor$ instances miss their deadlines. The remaining time in the cold rejuvenation downtime is then $\Phi_C - \lfloor \Phi_C/T \rfloor T$.
Fig. 6. Resource with Cold Synchronized Rejuvenation
As Case 1 in Fig. 6 illustrates, if the remaining time $\Phi_C - \lfloor \Phi_C/T \rfloor T \le T - e_{i_{max}}$, then a rejuvenation that starts at $i_{max}T + \lfloor \Phi_C/T \rfloor T - \Phi_C$ synchronizes the resource restore with a task instance’s release.
If the remaining time $\Phi_C - \lfloor \Phi_C/T \rfloor T > T - e_{i_{max}}$, there are two options to ensure that the resource restore synchronizes with a task instance’s release. As shown in Fig. 7, the first strategy is to preempt the $i_{max}$th instance’s execution, and the second strategy is to wait until the $(i_{max}+1)$th instance.
Fig. 7. Two Cold Synchronized Rejuvenation Strategies
Denote $\lfloor \Phi_C/T \rfloor = a$ and $i_{max} + \lfloor \Phi_C/T \rfloor = n$. If we preempt the execution of the $i_{max}$th instance and start rejuvenation during that instance, then by the end of the rejuvenation a total of n task instances have been released and a + 1 of them miss their deadlines. Therefore, the deadline miss rate is $\frac{a+1}{n}$. If rejuvenation starts after the $i_{max}$th instance finishes its execution, then a total of n + 1 instances are released and a + 1 of them miss their deadlines. Hence, the deadline miss rate is $\frac{a+1}{n+1}$. The second option always gives the smaller deadline miss rate. Hence, it is better to perform rejuvenation after the $i_{max}$th instance finishes its execution.
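The case split of Lemma 4 and the comparison of the two strategies can be sketched as follows. The sketch assumes integer T and ΦC, takes the execution time of the last instance as a given input, and reads the second case as delaying the end of rejuvenation to the next release time after the $i_{max}$th instance completes; these are assumptions of the illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def sync_cold_rejuvenation_start(i_max, T, phi_c, e_last):
    """Start time of a cold rejuvenation whose end coincides with a task release."""
    full = phi_c // T                    # whole task periods covered by the downtime
    remaining = phi_c - full * T
    if remaining <= T - e_last:          # leftover fits into the idle time of instance i_max
        return i_max * T + full * T - phi_c
    return i_max * T + (full + 1) * T - phi_c   # otherwise end one period later


def miss_rates(i_max, T, phi_c):
    """(preempt, wait) deadline miss rates from the proof: (a+1)/n and (a+1)/(n+1)."""
    a = phi_c // T
    n = i_max + a
    return Fraction(a + 1, n), Fraction(a + 1, n + 1)
```

With $i_{max}$ = 1, T = 6, ΦC = 2, and an assumed last execution time of 3.7, the rejuvenation starts at time 4 and ends at the release time 6, which is consistent with the resource R(f(t) = 1 − 0.1t, 6, 2) in the earlier example. With ΦC = 3 the leftover downtime exceeds the idle time, the rejuvenation starts at 9, and the miss rates of the two strategies are 1 (preempt) and 1/2 (wait).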
4.3 Two-level Rejuvenation Strategy for Single Periodic
Task
Once we obtain the optimal strategy to minimize the task’s deadline miss rate using a single synchronized cold rejuvenation schedule, we can easily add warm rejuvenations between two cold rejuvenations. Each warm rejuvenation can be treated as an individual resource with a single rejuvenation, and Lemma 4 can be applied to calculate the optimal rejuvenation strategy. However, as warm rejuvenation cannot fully restore the resource’s performance, the number of warm rejuvenations between two cold rejuvenations is subject to the following two conditions.
Lemma 5. For a resource $R(f(t) = 1 - at, p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$ and a real-time periodic task $\tau(e, T)$ deployed on it, there are at most $n = \lfloor \frac{1}{2} \log_p 2ae \rfloor$ warm rejuvenations that can be performed between two cold rejuvenations.
Proof. After the $n$th warm rejuvenation, the resource's restored performance is $p^n$. The remaining supply $S_n$ can be calculated as:
$$S_n = \frac{1}{2}\left(\frac{1}{a} - \frac{1 - p^n}{a}\right) p^n = \frac{p^{2n}}{2a} \tag{14}$$
Once $S_n < e$, no task instance can finish after the warm rejuvenation. Hence, the maximum number of warm rejuvenations $n$ satisfies the following condition:
$$S_n = \frac{p^{2n}}{2a} = e \tag{15}$$
$$p^{2n} = 2ae \tag{16}$$
$$n = \frac{1}{2} \log_p 2ae \tag{17}$$
As $n$ must be an integer, at most $\lfloor \frac{1}{2} \log_p 2ae \rfloor$ warm rejuvenations leave enough supply for a task instance; after any further warm rejuvenation, no task instance can finish its execution.
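As a numerical check of Lemma 5, the sketch below evaluates the remaining supply $p^{2n}/(2a)$ and the resulting bound on the number of warm rejuvenations. The function names and the sample values (a = 0.005, e = 20, p = 0.95) are assumptions chosen for illustration, and the bound is rounded down on the assumption that the last useful warm rejuvenation must still leave a supply of at least e.

```python
import math


def remaining_supply(n: int, a: float, p: float) -> float:
    """Supply available after the n-th warm rejuvenation: p^(2n) / (2a)."""
    return p ** (2 * n) / (2 * a)


def max_warm_rejuvenations(a: float, e: float, p: float) -> int:
    """Largest n with remaining_supply(n) >= e, i.e. floor(0.5 * log_p(2ae))."""
    if not 0 < p < 1 or a <= 0 or e <= 0:
        raise ValueError("expected 0 < p < 1, a > 0 and e > 0")
    if 2 * a * e > 1:
        raise ValueError("even a fully restored resource cannot supply e")
    # Floating-point rounding could matter if 0.5 * log_p(2ae) is an exact integer.
    return math.floor(0.5 * math.log(2 * a * e, p))


# Hypothetical task and resource parameters, for illustration only.
a, e, p = 0.005, 20.0, 0.95
n = max_warm_rejuvenations(a, e, p)
print(n)                                   # 15
print(remaining_supply(n, a, p) >= e)      # True
print(remaining_supply(n + 1, a, p) >= e)  # False
```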
Lemma 6. For a resource $R(f(t) = 1 - at, p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$ and a real-time periodic task $\tau(e, T)$ deployed on it, if performing $n$ warm rejuvenations between two cold rejuvenations minimizes the task's deadline miss rate, we have:
$$i^{n}_{max} \ge \left\lceil \frac{\lceil \Phi_W / T \rceil \left( i^{0}_{max} + \sum_{j=1}^{n} i^{j}_{max} \right)}{\lceil \Phi_C / T \rceil + n \lceil \Phi_W / T \rceil} \right\rceil > i^{n+1}_{max} \tag{18}$$
where $i^{0}_{max}$ is the maximum number of instances that can finish their execution after a cold rejuvenation and $i^{j}_{max}$ is the maximum number of instances that can finish their execution after $j$ warm rejuvenations.
Fig. 8. Resource with Two-Level Synchronized Rejuvenation
Proof. Denote the minimum number of instances that miss their deadlines in a cold rejuvenation and in a warm rejuvenation as $x$ and $y$, respectively. According to Lemma 3, we have $x = \lceil \Phi_C / T \rceil$ and $y = \lceil \Phi_W / T \rceil$. Denote $Suc(n)$ as the number of task instances that meet their deadlines from the first warm rejuvenation to the $n$th warm rejuvenation. We have $Suc(n) = \sum_{j=1}^{n} i^{j}_{max}$. Since the resource performance restored by each warm rejuvenation decreases as the number of warm rejuvenations increases, the function $Suc(n)$ has the following properties:
1) $Suc(0) = 0$
2) $1 \le Suc(n+1) - Suc(n) \le i^{0}_{max}$ if $1 \le n \le \lfloor \frac{1}{2} \log_p 2ae \rfloor$
3) $Suc(n+1) = Suc(n)$ if $n \ge \lfloor \frac{1}{2} \log_p 2ae \rfloor$
As illustrated in Fig. 8, for a given resource $R(f(t) = 1 - at, p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$ and real-time periodic task $\tau(e, T)$, the number of instances that miss their deadlines is $Miss(n) = x + ny$ and the number of instances released is $Rel(n) = x + i^{0}_{max} + ny + Suc(n)$. Then the deadline miss rate can be calculated as follows:
$$d(n) = \frac{Miss(n)}{Rel(n)} = \frac{x + ny}{x + i^{0}_{max} + ny + Suc(n)} \tag{19}$$
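To find the minimum of $d(n)$, we first need to determine whether $d(n)$ has a minimum value. Consider the following equation:
$$d(n+1) - d(n) = \frac{Miss(n) + y}{Rel(n) + i^{n+1}_{max} + y} - \frac{Miss(n)}{Rel(n)} = \frac{Rel(n)\,y - (i^{n+1}_{max} + y)\,Miss(n)}{(Rel(n) + i^{n+1}_{max} + y)\,Rel(n)}$$
We have:
$$\begin{cases} d(n+1) \ge d(n), & \text{if } Rel(n)\,y - (i^{n+1}_{max} + y)\,Miss(n) \ge 0 \\ d(n+1) < d(n), & \text{if } Rel(n)\,y - (i^{n+1}_{max} + y)\,Miss(n) < 0 \end{cases}$$
$$\Rightarrow \begin{cases} d(n+1) \ge d(n), & \text{if } \frac{y}{i^{n+1}_{max} + y} \ge \frac{Miss(n)}{Rel(n)} \\ d(n+1) < d(n), & \text{if } \frac{y}{i^{n+1}_{max} + y} < \frac{Miss(n)}{Rel(n)} \end{cases}$$
$$\Rightarrow \begin{cases} d(n+1) \ge d(n), & \text{if } \frac{y}{i^{n+1}_{max} + y} \ge d(n) \\ d(n+1) < d(n), & \text{if } \frac{y}{i^{n+1}_{max} + y} < d(n) \end{cases} \tag{20}$$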
In equation (20), $\frac{y}{i^{n+1}_{max} + y}$ represents the deadline miss rate of the single $(n+1)$th warm rejuvenation. Since $y$ is a constant for all warm rejuvenations and $i^{n}_{max}$ decreases as $n$ increases, the deadline miss rate for a single warm rejuvenation, $\frac{y}{i^{n}_{max} + y}$, increases as $n$ increases. Hence, $d(n)$ keeps decreasing as long as $\frac{y}{i^{n}_{max} + y} < d(n-1)$ and keeps increasing once $\frac{y}{i^{n}_{max} + y} > d(n-1)$. Therefore, the function $d(n)$ has a minimum value.
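As $d(n)$ is the minimum deadline miss rate, we have:
$$\begin{cases} d(n+1) > d(n) \\ d(n-1) \ge d(n) \end{cases} \Rightarrow \begin{cases} \frac{Miss(n) + y}{Rel(n) + i^{n+1}_{max} + y} > \frac{Miss(n)}{Rel(n)} \\ \frac{Miss(n-1)}{Rel(n-1)} \ge \frac{Miss(n-1) + y}{Rel(n-1) + i^{n}_{max} + y} \end{cases} \tag{21}$$
By solving the inequalities in (21), we obtain:
$$i^{n}_{max} \ge \frac{(Suc(n) + i^{0}_{max})\,y}{x + ny} > i^{n+1}_{max} \tag{22}$$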
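Substituting $x$, $y$, and $Suc(n)$ with $\lceil \Phi_C / T \rceil$, $\lceil \Phi_W / T \rceil$, and $\sum_{j=1}^{n} i^{j}_{max}$, respectively, we have:
$$i^{n}_{max} \ge \left\lceil \frac{\lceil \Phi_W / T \rceil \left( i^{0}_{max} + \sum_{j=1}^{n} i^{j}_{max} \right)}{\lceil \Phi_C / T \rceil + n \lceil \Phi_W / T \rceil} \right\rceil > i^{n+1}_{max} \tag{23}$$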
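The deadline miss rate d(n) of equation (19) is easy to evaluate directly, which gives a way to cross-check the condition of Lemma 6 on a concrete case. The sketch below is illustrative only: the helper names and the list `i_max` (instances completed after the cold rejuvenation and after each warm rejuvenation) are hypothetical values, not results from the paper.

```python
import math
from fractions import Fraction


def miss_rate(n: int, x: int, y: int, i_max: list[int]) -> Fraction:
    """d(n) = (x + n*y) / (x + i_max[0] + n*y + Suc(n)), Suc(n) = sum(i_max[1..n])."""
    suc = sum(i_max[1:n + 1])
    return Fraction(x + n * y, x + i_max[0] + n * y + suc)


def best_warm_count(phi_c: float, phi_w: float, period: float, i_max: list[int]):
    """Return (n, rates) where n minimizes d(n) over n = 0 .. len(i_max) - 1."""
    x = math.ceil(phi_c / period)  # instances missed per cold rejuvenation
    y = math.ceil(phi_w / period)  # instances missed per warm rejuvenation
    rates = [miss_rate(n, x, y, i_max) for n in range(len(i_max))]
    best = min(range(len(rates)), key=rates.__getitem__)
    return best, rates


# Hypothetical, non-increasing completion counts i_max[0], i_max[1], ...
i_max = [10, 8, 6, 4, 2, 1]
best, rates = best_warm_count(phi_c=150, phi_w=45, period=50, i_max=i_max)
print(best, rates[best])  # 2 5/29

# Cross-check against the condition of Lemma 6 for the selected n.
x, y = 3, 1
bound = math.ceil(Fraction(y * (i_max[0] + sum(i_max[1:best + 1])), x + best * y))
print(i_max[best] >= bound > i_max[best + 1])  # True
```

Exact fractions are used so that the comparison between neighbouring values of d(n) is not affected by floating-point rounding.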
5 EMPIRICAL STUDY
In this section, we first use simulation to evaluate the relationship between the warm rejuvenation number $n$ and the average resource performance $A_L$, and the impact of the warm/cold rejuvenation time cost on the optimal warm rejuvenation number $N^*$ that maximizes the average resource performance $A_L$. Since we proposed a hypothesis in Section 4.2 that the task deadline miss rate is minimized when the task release time is synchronized with the rejuvenation period, the second part of the empirical study tests this hypothesis through experiments.
5.1 Experimental Study on Average Performance Maxi-
mization
Alonso et al. conducted a set of experiments to evaluate the
rejuvenation overhead of different rejuvenation techniques
[1]. Their experimental results show that standalone appli-
cation restart and virtual/physical machine reboot consume
about 45 seconds and 150 seconds, respectively. The application restart can be treated as a warm rejuvenation, while the machine reboot is one kind of cold rejuvenation. In our simulations, we use the above experimental results as a guide for setting the warm and cold rejuvenation time cost parameters.
5.1.1 Relationship between n and AL
To evaluate the relationship between the number of warm
rejuvenation n and average resource performance AL, we
conduct a simulation with the following parameters:
• Resource performance degradation rate: a = 0.005
• Resource performance threshold: r = 0.3
• Resource performance restore factor of a warm reju-
venation: p = 0.95
• Cold rejuvenation time cost: ΦC = 150
• Warm rejuvenation time cost: ΦW = 45
• System longevity: $L \in \{1 \times 10^4, 3 \times 10^4, 5 \times 10^4, 1 \times 10^5\}$
The possible maximal number of warm rejuvenations is $N_{max} = \lfloor \log_p r \rfloor = 23$. With a given system longevity $L$, for each possible warm rejuvenation number, i.e., $n \in [0, N_{max}]$, we calculate the average resource performance $A_L$ according to the analysis in Section 3.3 and Eq. (1). Fig. 9 shows the average resource performance under different numbers of warm rejuvenations for each system longevity. From Fig. 9, we have the following observations:
1) When the number of warm rejuvenations $n$ increases, the average resource performance $A_L$ first increases and then decreases. For instance, $A_L$ increases when $n$ increases from 0 to 3 and starts to decrease when $n$ increases from 3 to 23.
2) When the number of warm rejuvenations $n$ is too small or too large, the average resource performance $A_L$ is relatively low. For instance, when $n = N_{max} = 23$, $A_L$ reaches its minimal value.
3) The system longevity $L$ does not have a significant impact on the rejuvenation behavior when $L \gg \Pi$.
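The value of 23 used as the largest warm rejuvenation count in these observations can be reproduced from the simulation parameters p = 0.95 and r = 0.3. The sketch below assumes the bound is the logarithm of r to base p rounded down, i.e. the largest n for which the restored performance $p^n$ stays at or above the threshold r; the function name is hypothetical.

```python
import math


def max_warm_count(p: float, r: float) -> int:
    """Largest n with p**n >= r, computed as floor(log_p(r))."""
    if not 0 < p < 1 or not 0 < r <= 1:
        raise ValueError("expected 0 < p < 1 and 0 < r <= 1")
    return math.floor(math.log(r, p))


n_max = max_warm_count(p=0.95, r=0.3)
print(n_max)                        # 23
print(0.95 ** n_max >= 0.3)         # True
print(0.95 ** (n_max + 1) >= 0.3)   # False
```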
In our models given in Section 3.1, we assume a reju-
venation pattern starts with the initial state, i.e., f(t) = 0,
which indicates the rejuvenation behaviors in each rejuve-
nation hyperperiod are the same. In addition, we have also
made the assumption that the system periodically repeats
the rejuvenation pattern with period $\Pi$. If $L \gg \Pi$, the system longevity does not have a significant impact on rejuvenation effects. This observation is supported by the following aspects:
1) For different system longevities, the optimal number of warm rejuvenations that maximizes the average resource performance is the same. In particular, for the four tested longevity cases, $N^* = 3$.
2) With the same number of warm rejuvenations $n$, the average resource performance $A_L$ of the four longevity cases is similar. For instance, the maximal difference of $A_L$ among the four longevity cases is 3.76%.
3) The trend of the average resource performance over the number of warm rejuvenations is similar across the longevity cases.
The observations are consistent with our analysis, i.e., there is an optimal number of warm rejuvenations between 0 and $N_{max}$ that maximizes the average resource performance. When $n = 0$, i.e., the system only takes cold rejuvenations, the resource becomes a $P^2$-resource [10]. The simulation results also show that the extended resource model achieves 25.22% higher average resource performance than the $P^2$-resource model.
Fig. 9. Average Resource Performance vs Warm Rejuvenation Number (x-axis: $n$; y-axis: $A_L$; one curve per system longevity $L$)
5.1.2 Warm/Cold Rejuvenation Time Cost Impact
We conduct a simulation to evaluate the impact of the warm/cold rejuvenation time cost on the optimal number of warm rejuvenations $N^*$ that maximizes $A_L$ and on the maximal average resource performance $f_{max}$. The simulation parameters are the same as in Section 5.1.1 except for the following parameters:
• Cold rejuvenation time cost: $\Phi_C \in \{100, 150, 200, 300\}$
• Warm rejuvenation time cost: $\Phi_W \in [0, 100]$ with step 5
• System longevity: $L = 10 \times 10^4$
With a given cold rejuvenation time cost $\Phi_C$, for each warm rejuvenation time cost $\Phi_W$, we use the MAX-PERFORMANCE algorithm (Algorithm 1) to determine the optimal number of warm rejuvenations $N^*$ that maximizes $A_L$ and the maximal average resource performance $f_{max}$. Fig. 10(a) and Fig. 10(b) depict the warm/cold rejuvenation time cost impact on $N^*$ and $f_{max}$, respectively. From Fig. 10, we have the following observations:
1) In general, the optimal number of warm rejuvenations $N^*$ decreases when the warm rejuvenation time cost $\Phi_W$ increases, and it increases when the cold rejuvenation time cost $\Phi_C$ increases.
2) The maximal average resource performance $f_{max}$ decreases as the warm and cold rejuvenation time costs increase.
3) Both the optimal number of warm rejuvenations $N^*$ and the maximal average resource performance $f_{max}$ decrease as the warm/cold rejuvenation time cost ratio $\Phi_W / \Phi_C$ increases.
The observations are consistent with the intuition behind the proposed resource model. If the warm rejuvenation costs less time or the cold rejuvenation costs more time, i.e., the ratio $\Phi_W / \Phi_C$ is smaller, we should perform more warm rejuvenations to take advantage of their low time cost. As the resource is unavailable during rejuvenations, the average resource performance decreases if the rejuvenation time cost increases. When the ratio $\Phi_W / \Phi_C$ is smaller, the proposed resource model benefits more from the low time cost of warm rejuvenations, which results in a higher average resource performance $f_{max}$.
Fig. 10. Warm/Cold Rejuvenation Time Cost Impact. Panels: (a) Optimal Number of Warm Rejuvenations (x-axis: $\Phi_W / \Phi_C$; y-axis: $N^*$); (b) Maximal Average Resource Performance (x-axis: $\Phi_W / \Phi_C$; y-axis: $f_{max}$)
5.2 Synchronized vs. Asynchronized Rejuvenation
In Section 4.2, we hypothesized that a task's deadline miss rate is minimized when synchronized rejuvenation is performed. In this section, we test the hypothesis through both analytic and empirical study. We start from the simple case where only cold rejuvenation is performed.
Lemma 1 shows that when $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} \le T$, synchronized rejuvenation can minimize the task's deadline miss rate.
However, consider a resource $R(f(t), p = 1, \Phi_C, \Phi_W, n = 0, \Pi_c = \{\pi_c\}, \Pi_w = \emptyset)$ and a real-time task $\tau(e, T)$ deployed on it. If $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} > T$, where $e_{i_{max}}$ is the execution time of the last task instance before rejuvenation, the situation becomes complicated when asynchronized rejuvenation is performed. Two extreme scenarios may occur:
Best Case:
As depicted in Fig. 11, synchronized cold rejuvenation is performed to restore the resource's performance. If $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} > T$, the first $i_{max}$ task instances can finish their execution before their deadlines. To keep the rejuvenation synchronized with the task, the rejuvenation starts in the middle of task instance $i_{max} + 1$. Since the rejuvenation is synchronized with the task, assume that $n$ task instances are released during the rejuvenation period. Then the deadline miss rate of the synchronized cold rejuvenation is:
$$d_{syn} = \frac{\lceil \Phi_C / T \rceil}{n} \tag{24}$$
Fig. 11. Comparison between Synchronized Cold Rejuvenation and Asynchronized Cold Rejuvenation (Best Case Scenario)
For the asynchronized rejuvenation, we consider the case in which the rejuvenation always starts immediately after task instance $i_{max}$ finishes its execution. We consider the same time interval as the synchronized rejuvenation period, $nT$. As illustrated in Fig. 11, the first task instance $J_1$ may be released before the resource becomes available, yet it can still finish its execution before its deadline once the resource is available. Because the second task instance $J_2$ can start earlier than $J_2$ in the synchronized rejuvenation scenario, it receives more resource supply. Likewise, each of the following task instances receives more resource supply than in the synchronized rejuvenation scenario. Hence, it is possible that task instance $i_{max} + 1$ can also finish its execution before its deadline. Then the number of task instances that meet their deadlines before rejuvenation is $i_{max} + 1$. Since the number of task instances released during the time interval $nT$ is $n$, the deadline miss rate of the asynchronized scenario is:
$$d_{asyn} = \frac{\lfloor \Phi_C / T \rfloor}{n} \tag{25}$$
In the best case, the deadline miss rate of asynchronized cold rejuvenation is therefore less than the deadline miss rate of synchronized cold rejuvenation. However, in the worst case scenario, the deadline miss rate of asynchronized cold rejuvenation can be larger than the deadline miss rate of synchronized cold rejuvenation.
Worst Case:
Fig. 12. Comparison between Synchronized Cold Rejuvenation and Asynchronized Cold Rejuvenation (Worst Case Scenario)
As shown in Fig. 12, in the worst case scenario, the first task instance $J_1$ is released during the last rejuvenation downtime and before the resource's restore time. However, unlike in the best case scenario, $J_1$ misses its deadline. Although the resource supply for task instances $J_2$ to $J_{i_{max}+1}$ increases compared to the synchronized rejuvenation scenario, it is possible that $J_{i_{max}+1}$ still cannot finish its execution before its deadline. Hence, the cold rejuvenation is performed immediately after $J_{i_{max}}$ finishes its execution. Since the rejuvenation starts earlier, the resource restores its supply during the last task instance's execution. The resource can support part of the execution of the last task instance $J_n$ in the time interval $nT$, but cannot fully execute $J_n$ within its deadline. Hence, a total of $i_{max} - 1$ out of $n$ task instances meet their deadlines. Then the deadline miss rate of the worst case scenario is:
$$d_{asyn} = \frac{n - i_{max} + 1}{n} \tag{26}$$
which is larger than the deadline miss rate of synchronized rejuvenation shown in equation (24).
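To make the three cold rejuvenation cases concrete, the sketch below evaluates equations (24) to (26) side by side. It assumes that the numerator of (24) is the downtime-to-period ratio rounded up and that the numerator of (25) is the same ratio rounded down, which is one reading of the formulas consistent with the surrounding argument. The function names and sample numbers are hypothetical.

```python
import math
from fractions import Fraction


def sync_miss_rate(phi_c: float, period: float, n: int) -> Fraction:
    """Eq. (24), assuming ceil(phi_c / period) instances miss their deadlines."""
    return Fraction(math.ceil(phi_c / period), n)


def async_best_miss_rate(phi_c: float, period: float, n: int) -> Fraction:
    """Eq. (25), assuming one fewer miss than the synchronized case (floor)."""
    return Fraction(math.floor(phi_c / period), n)


def async_worst_miss_rate(i_max: int, n: int) -> Fraction:
    """Eq. (26): only i_max - 1 of the n released instances meet their deadlines."""
    return Fraction(n - i_max + 1, n)


# Hypothetical scenario: downtime of 2.5 periods, i_max = 5, n = 8 instances.
phi_c, period, i_max, n = 150, 60, 5, 8
print(sync_miss_rate(phi_c, period, n))        # 3/8
print(async_best_miss_rate(phi_c, period, n))  # 1/4
print(async_worst_miss_rate(i_max, n))         # 1/2
```

In this example the synchronized rate lies between the asynchronized best and worst cases, which is why the analysis alone cannot rank the two strategies.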
From the above analysis, we know that when only cold rejuvenation is performed to restore the resource's performance, if $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} \le T$, the synchronized rejuvenation outperforms the asynchronized rejuvenation. If $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} > T$, it is difficult to tell which rejuvenation strategy performs better. When rejuvenation is asynchronized with the task, both the best case and the worst case scenarios may occur during execution, so it is difficult to compare the synchronized and asynchronized rejuvenation through theoretical analysis. The analysis becomes even more complicated when two-level rejuvenation is enabled. Hence, we study the performance of both rejuvenation strategies through simulations. The simulations are conducted with the following parameters:
• Resource performance degradation rate: $a \in \{0.01, 0.001, 0.0001\}$
• Resource performance restore factor of a warm rejuvenation: $p = 0.9$
• Task period: $T \in [100, 200]$
• Task utilization $U = \frac{e}{T}$: $U \in \{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1\}$
• Cold rejuvenation time cost: $\Phi_C = nT$, $n \in \{2, 5, 10\}$
• Warm rejuvenation time cost: $\Phi_W = \frac{1}{3} \Phi_C$
• System longevity: $L = 10^5$
Four different rejuvenation strategies are evaluated in the simulations: synchronized cold rejuvenation, asynchronized cold rejuvenation, synchronized two-level rejuvenation, and asynchronized two-level rejuvenation. Under each set of parameters, we repeat the simulation one hundred times with random task periods using these four rejuvenation strategies. An average deadline miss rate of each rejuvenation strategy is calculated for the comparison. In addition to the average deadline miss rate, an outperform ratio is also calculated to evaluate in how many simulations the synchronized rejuvenation strategy outperforms the asynchronized rejuvenation strategy under the same set of parameters.
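The two reported metrics can be computed from paired simulation runs as sketched below. This is a minimal illustration, not the authors' simulator: it assumes that "outperforms" means a strictly lower deadline miss rate in the same run (ties are not counted), and the sample miss rates are made-up numbers rather than results from the paper.

```python
from statistics import mean


def summarize_runs(sync_rates: list[float], async_rates: list[float]) -> dict:
    """Average miss rate per strategy and the share of runs where sync wins."""
    if not sync_rates or len(sync_rates) != len(async_rates):
        raise ValueError("expected two non-empty lists of equal length")
    # Assumption: a run counts for the synchronized strategy only if it is strictly better.
    sync_wins = sum(s < a for s, a in zip(sync_rates, async_rates))
    return {
        "avg_sync": mean(sync_rates),
        "avg_async": mean(async_rates),
        "outperform_ratio": sync_wins / len(sync_rates),
    }


# Made-up miss rates for four paired runs, for illustration only.
summary = summarize_runs([0.5, 0.4, 0.6, 0.3], [0.4, 0.5, 0.6, 0.2])
print(summary["outperform_ratio"])  # 0.25
```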
Fig. 13. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.01$, $\Phi_C = 2T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
Fig. 14. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.01$, $\Phi_C = 5T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
Fig. 15. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.01$, $\Phi_C = 10T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
Fig. 13, Fig. 14 and Fig. 15 compare the average deadline miss rate of the synchronized and asynchronized rejuvenation strategies under a resource degradation rate $a = 0.01$ and a cold rejuvenation downtime $\Phi_C$ equal to two, five and ten times the task's period, respectively. The figures show that when the two-level rejuvenation strategy is applied, the average deadline miss rate is lower than with the cold rejuvenation strategy. When the task utilization is small, the asynchronized rejuvenation strategies have a lower average deadline miss rate than the synchronized rejuvenation strategies. When the task utilization increases, the synchronized rejuvenation strategies outperform the asynchronized rejuvenation strategies. When the task's utilization is above 50%, none of the task instances can meet their deadlines. This is because, for a resource with degradation rate $a = 0.01$, the resource performance reaches 0 at time 100. As the task period is at least 100, the maximum task utilization the resource can support is 0.5.
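The 0.5 bound follows from the area under the degradation curve. The sketch below assumes that a task instance can only use the supply delivered between its release and its deadline on a freshly restored resource with $f(t) = 1 - at$, with no rejuvenation inside the period; the function names are hypothetical.

```python
def supply_within_period(a: float, period: float) -> float:
    """Area under f(t) = 1 - a*t on [0, min(period, 1/a)]."""
    if a <= 0 or period <= 0:
        raise ValueError("expected a > 0 and period > 0")
    t_end = min(period, 1.0 / a)  # performance reaches 0 at t = 1/a
    return t_end - a * t_end ** 2 / 2.0


def max_supported_utilization(a: float, period: float) -> float:
    """Largest e / T for which one instance can finish within its period."""
    return supply_within_period(a, period) / period


# With a = 0.01 the performance reaches 0 at t = 100, so the supply is capped at 50.
for period in (100, 150, 200):
    print(period, round(max_supported_utilization(0.01, period), 3))
```

For the shortest simulated period, T = 100, the bound is 50 / 100 = 0.5, and it only decreases for longer periods.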
Fig. 16. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.001$, $\Phi_C = 2T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
Fig. 17. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.001$, $\Phi_C = 5T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
Fig. 16, Fig. 17 and Fig. 18 compare the average deadline miss rate of the synchronized and asynchronized rejuvenation strategies under a resource degradation rate $a = 0.001$ and a cold rejuvenation downtime $\Phi_C$ equal to two, five and ten times the task's period, respectively.
Fig. 19, Fig. 20 and Fig. 21 compare the average deadline miss rate of the synchronized and asynchronized rejuvenation strategies under a resource degradation rate $a = 0.0001$ and a cold rejuvenation downtime $\Phi_C$ equal to two, five and ten times the task's period, respectively. Two-level rejuvenation strategies always have a lower average deadline miss rate than the cold rejuvenation strategies. However, when the resource degradation rate decreases, asynchronized rejuvenation strategies always outperform synchronized rejuvenation strategies in terms of average deadline miss rate. When the resource degradation rate is $a = 0.001$ and the task utilization is small (less than 0.4), asynchronized rejuvenation strategies perform much better than synchronized rejuvenation strategies. As Fig. ??, Fig. ?? and Fig. ?? indicate, when only the cold rejuvenation strategy is applied, a few cases with synchronized rejuvenation strategies have a lower deadline miss rate than with asynchronized rejuvenation strategies. However, when two-level rejuvenation is applied, all simulations with asynchronized rejuvenation strategies have a lower deadline miss rate than those with synchronized rejuvenation strategies. When the resource degradation rate decreases to 0.0001, the average deadline miss rates of the synchronized and asynchronized rejuvenation strategies are almost at the same level, but asynchronized rejuvenation strategies still outperform synchronized rejuvenation strategies.
Fig. 18. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.001$, $\Phi_C = 10T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
Fig. 19. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.0001$, $\Phi_C = 2T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
Fig. 20. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.0001$, $\Phi_C = 5T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
Fig. 21. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.0001$, $\Phi_C = 10T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation
The experimental results refute the initial hypothesis that synchronized rejuvenation strategies minimize the task's deadline miss rate. On the contrary, in most scenarios, asynchronized rejuvenation strategies, i.e., starting rejuvenation immediately after the last task instance that can meet its deadline finishes its execution, perform better at minimizing the task's deadline miss rate.
6 CONCLUSION
To combat resource performance degradation due to software aging, we have extended our previous resource with performance degradation and periodic rejuvenation ($P^2$-resource) model by using a two-level rejuvenation strategy to maintain resource performance. Based on the extended resource model, we have formally analyzed the resource supply function and presented the MAX-PERFORMANCE algorithm to determine the optimal rejuvenation pattern that maximizes the average resource performance. When a given real-time task is deployed on the resource, maximized average resource performance does not necessarily guarantee that the task's deadline miss rate is minimized. Hence, the second contribution of the paper is the design of a dynamic two-level rejuvenation strategy that minimizes the task's deadline miss rate when a real-time periodic task is deployed on the resource. The extensive simulation results show that with a two-level rejuvenation strategy, we can achieve 25.22% higher average resource performance compared with a single-level rejuvenation strategy. In addition, with a two-level rejuvenation strategy, a task's deadline miss rate is always lower than the deadline miss rate when a single-level rejuvenation strategy is applied. The experimental results also show that the asynchronized rejuvenation strategy outperforms the synchronized rejuvenation strategy in most of the scenarios.
The paper focuses on minimizing the deadline miss rate
for a single real-time periodic task. However, when multiple
tasks are deployed on the resource, the current two-level
rejuvenation strategy may not apply in terms of minimizing
deadline miss rate. Our future work is to analyze task
schedulability and study the optimal rejuvenation pattern
for a given task set with the goal of maximizing the task set
schedulability and minimizing task set’s deadline miss rate.
REFERENCES
[1] J. Alonso, R. Matias, E. Vicente, A. Maria, and K. Trivedi. A com-
parative experimental study of software rejuvenation overhead.
Performance Evaluation, 70(3):231 – 250, 2013. Special Issue on
Software Aging and Rejuvenation.
[2] J. R. Black. Electromigration: a brief survey and some recent results.
Electron Devices, IEEE Transactions on, 16(4):338–347, 1969.
[3] S. Garg, A. van Moorsel, K. Vaidyanathan, and K. S. Trivedi.
A methodology for detection and estimation of software aging.
In Software Reliability Engineering, 1998. Proceedings. The Ninth
International Symposium on, pages 283–292. IEEE, 1998.
[4] M. Grottke, R. Matias, and K. Trivedi. The fundamentals of
software aging. In Software Reliability Engineering Workshops, 2008.
ISSRE Wksp 2008. IEEE International Conference on, pages 1–6, Nov
2008.
[5] C. Guo, X. Hua, H. Wu, D. Lautner, and S. Ren. Best-harmonically-
fit periodic task assignment algorithm on multiple periodic re-
sources. IEEE Transactions on Parallel and Distributed Systems, pp:1,
2015.
[6] C. Guo, H. Wu, X. Hua, D. Lautner, and S. Ren. Use two-
level rejuvenation to combat software aging and maximize av-
erage resource performance. In High Performance Computing and
Communications (HPCC), 2015 IEEE 7th International Symposium on
Cyberspace Safety and Security (CSS), 2015 IEEE 12th International
Conference on Embedded Software and Systems (ICESS), 2015 IEEE 17th
International Conference on, pages 1160–1165. IEEE, 2015.
[7] C. Guo, H. Wu, X. Hua, S. Ren, and J. Nogiec. Maximize system
reliability for long lasting and continuous applications. In New
Contributions in Information Systems and Technologies, volume 353
of Advances in Intelligent Systems and Computing, pages 603–612.
Springer International Publishing, 2015.
[8] R. Hanmer and V. Mendiratta. Rejuvenation with workload
migration. In Dependable Systems and Networks Workshops (DSN-
W), 2010 International Conference on, pages 80–85, June 2010.
[9] Y. Hong, D. Chen, L. Li, and K. S. Trivedi. Closed loop design for
software rejuvenation. In Workshop on Self-Healing, Adaptive, and
Self-Managed Systems, 2002.
[10] X. Hua, C. Guo, H. Wu, and S. Ren. Schedulability analysis
for real-time task set on resource with performance degradation
and periodic rejuvenation. In Embedded and Real-Time Computing
Systems and Applications (RTCSA), 2015 IEEE 21th International
Conference on, Aug 2015.
[11] Y. Huang, C. Kintala, N. Kolettis, and N. Fulton. Software rejuve-
nation: analysis, module and applications. In Fault-Tolerant Com-
puting, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International
Symposium on, pages 381–390, June 1995.
[12] V. Koutras. Two-level software rejuvenation model with increasing
failure rate degradation. In Dependable Computer Systems, vol-
ume 97 of Advances in Intelligent and Soft Computing, pages 101–115.
Springer Berlin Heidelberg, 2011.
[13] V. Koutras and A. Platis. Semi-markov availability modeling of
a redundant system with partial and full rejuvenation actions. In
Dependability of Computer Systems, 2008. DepCos-RELCOMEX ’08.
Third International Conference on, pages 127–134, June 2008.
[14] V. Koutras and A. Platis. Applying partial and full rejuvenation
in different degradation levels. In Software Aging and Rejuvenation
(WoSAR), 2011 IEEE Third International Workshop on, pages 20–25,
Nov 2011.
[15] V. Koutras, A. Platis, and N. Limnios. Availability and reliability
estimation for a system undergoing minimal, perfect and failed
rejuvenation. In Software Reliability Engineering Workshops, 2008.
ISSRE Wksp 2008. IEEE International Conference on, pages 40–45,
Nov 2008.
[16] H. Okamura and T. Dohi. Availability optimization in operational
software system with aperiodic time-based software rejuvenation
scheme. In Software Reliability Engineering Workshops, 2008. ISSRE
Wksp 2008. IEEE International Conference on, pages 22–27, Nov 2008.
[17] D. L. Parnas. Software aging. In Proceedings of the 16th International
Conference on Software Engineering, ICSE ’94, pages 279–287, Los
Alamitos, CA, USA, 1994. IEEE Computer Society Press.
[18] A. Sadek and N. Limnios. Nonparametric estimation of reliability
and survival function for continuous-time finite markov processes.
Journal of Statistical Planning and Inference, 133(1):1 – 21, 2005.
[19] A. Tai, S. Chau, L. Alkalaj, and H. Hecht. On-board preventive
maintenance: analysis of effectiveness and optimal duty period.
In Object-Oriented Real-Time Dependable Systems, 1997. Proceedings.,
Third International Workshop on, pages 40–47, Feb 1997.
[20] K. Trivedi, K. Vaidyanathan, and K. Goseva-Popstojanova. Model-
ing and analysis of software aging and rejuvenation. In Simulation
Symposium, 2000. (SS 2000) Proceedings. 33rd Annual, pages 270–
279, 2000.
[21] W. Xie, Y. Hong, and K. Trivedi. Analysis of a two-level software
rejuvenation policy. Reliability Engineering & System Safety, 87(1):13
– 22, 2005.
Hao Wu is now a Ph.D candidate in Computer
Science Department at Illinois Institute of Tech-
nology. He received B.E in Information Security
from Sichuan University, Chengdu, China, 2007.
He received M.S. in Computer Science from Uni-
versity of Bridgeport, Bridgeport, CT, 2009. His
current research interests mainly focus on cloud
computing, real-time distributed open systems,
Cyber-Physical System, parallel and distributed
systems, and real-time applications.
Chunhui Guo is now a Ph.D candidate in the
Computer Science Department at Illinois Insti-
tute of Technology. He earned his BSEE and
MSEE from Shandong University, China, in 2010
and 2013, respectively. His current research in-
terests mainly focus on real-time systems and
Cyber-Physical System.
Xiayu Hua is a Ph.D. student in the Computer
Science Department at Illinois Institute of Tech-
nology. His research interest is in distributed
file system, virtualization technology, real-time
scheduling and cloud computing. He earned his
B.S. degree from the Northwestern Polytechnic
University, China, in 2008 and his M.S. degree
from the East China Normal University, China, in
2012.
Igor Lopes is an exchange student majoring in
Computer Science at the University of Idaho.
He is part of the Science without Borders Pro-
gram, sponsored by the Brazilian Government.
His fields of interest are Software Development
and Software Engineering.
Dr. Shangping Ren is an associate professor
in the Computer Science Department at the Illinois
Institute of Technology. She earned her Ph.D.
from UIUC in 1997. Before she joined IIT in
2003, she worked in software and telecommunication
companies as a software engineer and then
as a lead software engineer. Her current research
interests include coordination models for real-time
distributed open systems, real-time, fault-tolerant
and adaptive systems, Cyber-Physical
Systems, parallel and distributed systems, cloud
computing, and application-aware many-core virtualization for
embedded and real-time applications.

Combating Software Aging: Use Two-Level Rejuvenation to Maximize Average Resource Performance and Minimize Tasks Deadline Miss Rate

  • 1. 1 Combating Software Aging: Use Two-Level Rejuvenation to Maximize Average Resource Performance and Minimize Tasks Deadline Miss Rate Hao Wu ∗, Student Member, IEEE, Chunhui Guo ∗, Student Member, IEEE, Xiayu Hua ∗, Student Member, IEEE, Igor Lopes †, Shangping Ren ∗, Senior Member, IEEE Abstract—Software aging is a common phenomenon which is often manifested through system performance degradation. Rejuvenation is one of the most commonly used approaches to handle issues caused by software aging. To combat resource performance degradation, we present a two-level rejuvenation strategy, i.e., interleaving a set of n warm rejuvenations with one cold rejuvenation. Our first target is to find the optimal n that maximizes system average performance when no application’s information is known a priori. We first define a resource model that takes into consideration of performance degradation and two-level rejuvenations. Based on the resource model, we formally analyze the resource supply and present the MAX-PERFORMANCE algorithm to determine the optimal rejuvenation pattern that maximizes the average resource performance. When a task is deployed on the resource, maximized average resource performance does not necessarily guarantee minimized task’s deadline miss rate. Hence, our second target is to design a dynamic two-level rejuvenation strategy that minimizes task’s deadline miss rate when a real-time periodic task is deployed on the resource. The simulation results show that with a two-level rejuvenation strategy, we can achieve 25.22% higher average resource performance compared with a single level rejuvenation strategy. In addition, with a two-level rejuvenation strategy, a task’s deadline miss rate is always less than the deadline miss rate when a single rejuvenation strategy is applied. The experimental results also show that asynchronized rejuvenation strategy outperforms the synchronized rejuvenation strategy in most of the scenarios. 
Index Terms—Software Aging, Performance Degradation, Resource Model, Two-Level Rejuvenation, Resource Supply Analysis, Deadline Miss Rate Minimization Resource Performance Maximization ! 1 INTRODUCTION HARDWARE aging is a well know phenomenon in com- puter systems that slows down the system perfor- mance and eventually leads to transient failure [2]. Simi- lar to the hardware aging, software running on computer systems also ages [17]. Software aging is usually caused by memory leaks and error accumulation. Unlike the hard- ware aging that may take long time to impact the system performance, the software aging may reveal in a relatively short time period [11], [2]. As modern computer systems are getting more complex and supporting more concurrent applications, software aging becomes more obvious and has significant impacts on system performance. According to [3], nowadays, computer system outages are caused more by software failures than by hardware failures. Software aging also has an impact on today’s mobile de- vice performance. To provide evidences for such slowdown phenomena on cellphones, we have written an Android APP which computes the multiplication of two 500 × 500 ∗The authors are with Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, USA ∗{ hwu28, cguo13, xhua}@hawk.iit.edu, ren@iit.edu ∗The research is supported in part by NSF under grant number CAREER 0746643, CNS 1018731, CNS 1035894, and CPS 1545008. †Igor Lopes is with Department of Computer Science, University of Idaho. oliv7721@vandals.uidaho.edu matrices and records the computation time. The APP runs on a cellphone with a Qualcomm 1.5GHz dual-core, 1G RAM, and 2G internal storage. The APP is the only appli- cation running on the cellphone under Android 2.3.6. Fig. 1 shows the measurements of the computation time of matrix multiplication over about 5 days. Each point represents the average computation time of 300 matrix multiplication computations. From Fig. 
1, we have following observations: 1) The computation time within the interval [1, 10], [15, 20], [24, 39], and [42, 53] has an increasing trend, which indicates that the cellphone suffers from aging effects. 2) The computation time within the [10, 15], [20, 24] and [39, 42] intervals have a decreasing trend. The log file indicates that the cellphone was rebooted at point 10; and the matrix multiplication application was restarted at point 20 and 39. The second observation also indicates that both cell- phone reboot (cold rejuvenation) and application restart (warm rejuvenation) can restore a cellphone’s performance, but the restore capability of cold rejuvenation is higher than the warm rejuvenation. In addition, the resource perfor- mance after the second warm rejuvenation is lower than the first warm rejuvenation. Currently, smartphone subsystems have reset mecha-
  • 2. 2 1 10 20 30 40 50 24.2 24.4 24.6 24.8 No. of Points Seconds Fig. 1. Aging Effect of Matrix Multiplication Time on Cellphone nisms, called “silent resets”, incorporated to restore func- tionality while minimizing user impact. However, these resets happen in a reactive manner and are not predicted or scheduled. For instance, WiFi has a reset mechanism a.k.a. sub system restart. It’s a structure for the WiFi Firmware to restart its execution point. The reset is initiated by ei- ther a program fault (e.g. out-of-bound memory access, bad instruction jump, or memory corruption) or a health- monitoring trigger (e.g. inability to transmit packets for a period of time (hardware lockup), packet memory overflow, or register value lockup). For systems that support long lasting applications, soft- ware aging is an unavoidable phenomenon which may lead the applications running on the system violate their QoS requirements. Hence, rejuvenation is a necessary process to maintain the system performance at an expected level. How- ever, rejuvenation takes time during which the system is not available to user applications. Different levels of rejuvena- tion have different overhead, different performance restore capability, and result in different system performance. In this paper, we are to design two-level rejuvenation strategies that uses the combination of warm and cold software rejuve- nation to maximize the system’s average performance when no task’s information is known a priori and to minimize the task’s deadline miss rate when a real-time periodic task is deployed on the resource. In particular, we are to extend our previous resource with performance degradation and pe- riodic rejuvenation (P2 -resource) model [10] and integrate it with the two-level rejuvenation strategy [6]. 
Based on the extended resource model, we formally analyze resource supply and give the optimal combination of using warm and cold software rejuvenation that maximizes the system’s average performance when no task’s information is given. We further theoretically analyze the deadline miss rate of single real-time periodic task when two-level rejuvenation strategy is applied to restore resource’s performance. The optimal rejuvenation period that minimizes task’s deadline miss rate when rejuvenation period is synchronized with task is also be calculated. The rest of the paper is organized as follows. First, we discuss related work in Section 2. Two-level rejuvena- tion strategy design for maximizing average system per- formance is presented in Section 3. In Section 4, we dis- cuss the dynamic two-level rejuvenation strategy design for minimizing real-time task’s deadline miss rate. We verify the theoretical analysis and evaluate the proposed resource model by simulations in Section 5. Section 6 concludes the paper and points out our future work. 2 RELATED WORK Software rejuvenation is a preventive and proactive main- tenance solution for handling system aging effects. Huang et al. [11] first proposed the concept of software rejuvena- tion and developed a four-state (i.e., Robust State, Failure Probable State, Failure State, and Rejuvenation State) system model to reflect system operational states. Since then, many rejuvenation models have been developed by the research community [11], [8]. For instance, Koutras et al. extended the initial rejuvenation model by considering two levels of re- juvenation actions [15], i.e., perfect rejuvenation action and minimal rejuvenation action. The perfect rejuvenation (cold rejuvenation) results in the system returning to the Robust State (initial state), while the minimal rejuvenation (warm rejuvenation) results in the system returning to the Failure Probable State (the state before rejuvenation). Alonso et al. 
experimentally compared the overhead by taking different software rejuvenation technologies [1]. They categorize the software rejuvenation into three different granularities, i.e. application level, operating system (OS) level and hardware level. The application level rejuvenation takes the least time but also has the least impact on the system performance. The hardware level rejuvenation takes the longest time but lead to the best system performance. The OS level rejuvenation is in the middle for both time cost and performance impact. To analyze software aging and study aging related fail- ures, Trivedi et al. [20] presented two approaches: analyt- ical modeling approach for determining optimal times to rejuvenate and measurement based approach for detection and validation. Tai et al. [19] identified key factors that may impact system reliability and developed an approach to maximizing system reliability by analyzing the optimal interval between maintenances. Guo et al. considered both transient faults caused by software aging effects and net- work transmission faults and analyzed the optimal software rejuvenation period that maximizes systems reliability [7]. Okamura et al. [16] discussed a maintenance policy that combines aperiodic rejuvenations and periodic checkpoints to maximize the system availability. The estimations of reliability and availability were analyzed in [18], [15]. The two-level rejuvenation model has also been ana- lyzed by the research community. Hong et al. studied two- level closed-loop rejuvenation techniques and proposed an approach to minimize the average rejuvenation cost [9]. Koutras et al. observed the effects of a two-level software rejuvenation model on availability, downtime and rejuvena- tion cost indicators [14]. The two-level rejuvenation model was also modeled by a Semi-Markov process and analyzed to find the optimal rejuvenation policy to maximize the system availability [21], [13], [12]. 
As pointed out in [4], a general characteristic of software aging is the gradual performance degradation and/or an in- crease in the software failure rate. The above works mainly focus on aging related failure effects on QoS and how to perform rejuvenations to optimize the system QoS, such
  • 3. 3 as availability and reliability. However, not much work has been done on designing rejuvenation strategies on when to perform rejuvenations to improve the resource performance. Recently, Hua et al. [10] proposed a new resource with performance degradation and periodic rejuvenation (P2 - resource) model which takes software aging and periodical resource rejuvenations into consideration. It gives formally schedulability analysis under the P2 -resource model for both EDF (earliest deadline first) and RM (rate monotonic) scheduling algorithms. In this paper, we are to extend the P2 -resource with the consideration of both warm and cold software rejuvenations along with their impacts on the system performance. Based on the extended resource model, we formally analyze re- source’s supply and present a linear search algorithm to determine the optimal interleaving between warm and cold rejuvenations that maximizes the average resource perfor- mance. 3 TWO-LEVEL REJUVENATION STRATEGY TO MAXIMIZE AVERAGE SYSTEM PERFORMANCE 3.1 Models and Assumptions Resource Performance Function We use function f(t) to denote the resource perfor- mance at time t. The resource performance represents the computation cycles per unit time provided by the resource to applications. As the system performance degrades over time, we assume that the resource performance function f(t) is a decreasing function and f(0) = 1 [17], [10]. As for any decreasing resource performance function, the strategy to analyze the resource’s performance is the same. Hence, to simplify the discussion of our approach, we further assume that the resource performance function is a linear decreasing function, i.e., f(t) = 1 − at where a denotes the resource performance decreasing rate which is assumed to be a constant and 0 ≤ a < 1. If a = 0, the resource’s performance does not degrade. 
Resource Rejuvenation Pattern Similar to [13], [18], [1], the system can perform two levels of rejuvenations, i.e., cold rejuvenation and warm rejuvenation. Once the resource’s performance f(t) de- grades to a threshold r (0 ≤ r < 1), we take a warm or cold rejuvenation to restore its performance. After a cold rejuvenation, the system returns to the Robust State, and the resource performance is restored to f(t) = 1. When a warm rejuvenation is completed, the system goes back to the Failure Probable State, and the resource performance function becomes to fi(t) = pi − a(t − ti) where i denotes ith warm rejuvenation, p is the resource performance restore factor (0 < p < 1) and ti represents the time of ith warm rejuvenation’s finish time. The resource is unavailable when it goes through the rejuvenation process. The downtime caused by each cold rejuvenation or warm rejuvenation is assumed to be a constant ΦC and ΦW , respectively. We further assume ΦC > ΦW . As the resource performance after each warm rejuve- nation is smaller than the previous warm rejuvenation, if we only take warm rejuvenations, the resource performance will eventually be below the threshold r and hence a cold rejuvenation becomes necessary. We define the rejuvenation pattern as n (n ∈ N) warm rejuvenations followed by one cold rejuvenation, as shown in Fig. 2. The time interval of an entire rejuvenation pattern is denoted as rejuvenation hyperperiod Π. We assume that the resource is repeatedly rejuvenated by the above pattern with period Π. Fig. 2. Resource Rejuvenation Pattern As the initial resource performance is f(0) = 1, the restored resource performance after n warm rejuvenations is fn(t) = pn . The resource performance after the nth warm rejuvenation must not be smaller than the threshold, i.e., pn ≥ r, otherwise the nth rejuvenation should be a cold rejuvenation. 
Hence, we have n ≤ logp r and the maximal warm rejuvenation number before a cold rejuvenation in the rejuvenation pattern is Nmax = logp r . Resource Model The resource model is characterized by a 6-tuple R(f(t), r, p, ΦW , ΦC, n), where f(t) is the initial resource performance function, r is the resource performance thresh- old to start a cold rejuvenation, p is the resource perfor- mance restore factor of a warm rejuvenation, ΦW is the warm rejuvenation time cost, ΦC is the cold rejuvenation time cost, and n is the number of warm rejuvenations before a cold rejuvenation in the rejuvenation pattern. We assume the resource starts at time zero. If the resource only takes cold rejuvenations, i.e., n = 0, the resource model degenerates to the P2 -resource model in [10]. Average Resource Performance We define the average resource performance within a system’s longevity L as the ratio between the total resource supply SL within L, i.e., AL = SL L (1) 3.2 Problem Formulation The problem we are to address is defined below: Problem: Given a resource R(f(t), r, p, ΦW , ΦC, n), decide n that maximizes the average resource performance, i.e., AL, within its operational interval [0, L]. According to Eq. (1), maximizing the average resource performance AL is to maximize the total resource supply SL with given system longevity L. We take two steps to address the problem. First, we analyze the total resource supply SL with a given rejuvenation pattern (Section 3.3). Second, we present the MAX-PERFORMANCE algorithm (Section 3.4) to determine the optimal rejuvenation pattern, i.e., n with respect to maximizing average resource performance.
  • 4. 4 3.3 Resource Supply Analysis In this subsection, we first analyze the resource supply SΠ within a rejuvenation hyperperiod, i.e., within two cold rejuvenations, and then formalize the total resource supply SL within the system longevity on the basis of SΠ. 3.3.1 Resource Supply within Rejuvenation Hyperperiod Π Suppose the rejuvenation pattern is given as follows: n warm rejuvenations followed by one cold rejuvenation, as shown in Fig. 2. Before the ith (1 ≤ i ≤ n + 1) rejuvenation, the start performance and the end performance of the resource are pi−1 and r, respectively. The resource available time length of the ith rejuvenation is li = f−1 (r) − f−1 (pi−1 ) = pi−1 − r a . (2) The resource supply of the ith rejuvenation is Si = f−1 (r) f−1(pi−1) f(t)dt = 1−r a 1−pi−1 a f(t)dt. (3) To generalize Eq. (2) and Eq. (3), we assume l0 = 0 and S0 = 0. A rejuvenation pattern contains n + 1 resource available intervals, n warm rejuvenations, and one cold rejuvenation. The rejuvenation hyperperiod is Π = n+1 i=1 li + n · ΦW + ΦC. (4) The resource supply within the rejuvenation hyperperiod Π is the summation of n + 1 resource available intervals, i.e., SΠ = n+1 i=1 Si. (5) 3.3.2 Resource Supply within System Longevity L In practical cases, the system longevity is much larger than the rejuvenation hyperperiod, i.e., L >> Π [11]. We divide the analysis of the resource supply SL within the longevity into two cases based on if the longevity L is divisible by the rejuvenation hyperperiod Π or not. Case 1: L mod Π = 0 In this case, the system longevity contains L/Π entire rejuvenation hyperperiods. The total resource supply within the longevity is the sum of L/Π resource supplies SΠ in a rejuvenation hyperperiod, i.e., SL = SΠ · L Π . (6) Case 2: L mod Π = 0 In this case, we divide the total resource supply into two parts: the resource supply in the interval containing L/Π entire rejuvenation hyperperiods and the resource supply of the remaining time interval IR. 
Hence, the total resource supply within the longevity is

$$S_L = S_\Pi \cdot \left\lfloor \frac{L}{\Pi} \right\rfloor + S_R, \tag{7}$$

where $S_R$ is the resource supply of the remaining time interval $I_R$ with length $l_R = L \bmod \Pi$. We further divide the analysis of the remaining resource supply $S_R$ into two cases, depending on whether the remaining time interval $I_R$ ends while a rejuvenation is in progress.

Fig. 3. Resource Supply Analysis

Case 2.1: $I_R$ ends during a rejuvenation

As shown in Fig. 3, if the remaining interval $I_R$ ends during the $j$th rejuvenation, then

$$\begin{cases} \sum_{i=1}^{j} l_i + (j-1)\Phi_W \le l_R \le \sum_{i=1}^{j} l_i + j\Phi_W & \text{if } 1 \le j \le n \\ \Pi - \Phi_C \le l_R \le \Pi & \text{if } j = n+1 \end{cases} \tag{8}$$

where $j \in \mathbb{N}$. The resource supply $S_R$ within $I_R$ is hence

$$S_R = \sum_{i=1}^{j} S_i, \tag{9}$$

where the value of $j$ can be calculated from $l_R$ and Eq. (8).

Case 2.2: $I_R$ ends when the resource is available

Similar to Case 2.1, $I_R$ may end in the $j$th resource available interval, which implies

$$\sum_{i=0}^{j-1} l_i + (j-1)\Phi_W \le l_R \le \sum_{i=0}^{j} l_i + (j-1)\Phi_W, \tag{10}$$

where $1 \le j \le n+1$ and $j \in \mathbb{N}$. Hence, as shown in Fig. 3, the resource supply within $I_R$ is

$$S_R = \sum_{i=0}^{j-1} S_i + \int_{f^{-1}(p^{j-1})}^{f^{-1}(p^{j-1}) + l_R - \sum_{i=0}^{j-1} l_i - (j-1)\Phi_W} f(t)\,dt, \tag{11}$$

where the value of $j$ can be calculated from $l_R$ and Eq. (10).

3.4 Average Resource Performance Maximization

As $n \in \mathbb{N}$ and $0 \le n \le N_{max}$, the number of candidate values of $n$ that may maximize the average resource performance $A_L$ is limited. We present a linear search method, i.e., the MAX-PERFORMANCE algorithm, to determine the optimal number of warm rejuvenations $N^*$ before a cold rejuvenation. The MAX-PERFORMANCE algorithm is given as Algorithm 1.

In particular, the MAX-PERFORMANCE algorithm initializes both the optimal number of warm rejuvenations $N^*$ and the maximal average resource performance $f_{max}$ as 0 (lines 1-2), and calculates the possible maximal number of warm rejuvenations $N_{max}$ (line 3). In the for loop (lines 4-11), for each possible number of warm rejuvenations $n$, we calculate the total resource supply $S_L$ (line 5) according to the analysis in Section 3.3 and the average resource performance $A_L$ (line 6). Then we determine whether the current $n$ maximizes the average resource performance (lines 7-10). The algorithm returns the maximal average resource performance $f_{max}$ and the corresponding number of warm rejuvenations $N^*$ (line 12).

Algorithm 1 MAX-PERFORMANCE
Input: A resource $R(f(t), r, p, \Phi_W, \Phi_C, n)$ and the system longevity $L$.
Output: The optimal warm rejuvenation number $N^*$ before a cold rejuvenation that maximizes the average resource performance, and the maximal average resource performance $f_{max}$ during the system longevity.
 1: $N^* = 0$
 2: $f_{max} = 0$
 3: $N_{max} = \lfloor \log_p r \rfloor$
 4: for $n = 0$ to $N_{max}$ do
 5:     Calculate $S_L$ according to Eq. (6) or Eq. (7)
 6:     $A_L = S_L / L$
 7:     if $A_L > f_{max}$ then
 8:         $N^* = n$
 9:         $f_{max} = A_L$
10:     end if
11: end for
12: return $N^*$ and $f_{max}$

Based on the resource supply analysis in Section 3.3, the resource supply calculation in Algorithm 1 (line 5) costs $O(n)$ time. Hence, the time complexity of Algorithm 1 is $O(n^2)$.

4 TWO-LEVEL REJUVENATION STRATEGY TO MINIMIZE REAL-TIME TASKS' DEADLINE MISS RATE

Fig. 4. Schedule Example. (a) Schedule for Task $\tau(3, 6)$ on P²-resource $R(f(t) = 1 - 0.1t, 7, 2)$; (b) Schedule for Task $\tau(3, 6)$ on P²-resource $R(f(t) = 1 - 0.1t, 6, 2)$

Fig. 5. Improved Resource Model

So far, we have discussed how frequently warm rejuvenation should be applied between two cold rejuvenations so that the average resource performance is maximized. The discussion focuses on maximizing the resource's average performance without any application information and relies on a designated threshold $r$. However, when tasks are deployed on the resource, a constant threshold $r$ may not be optimal for minimizing the tasks' deadline miss rate on a resource with a given performance degradation. In other words, maximizing the resource's average performance does not necessarily guarantee task deadline satisfaction.
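For reference, the linear search of Algorithm 1 (Section 3.4) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it assumes the linear degradation function $f(t) = 1 - at$, takes the start performance of the $i$th available interval to be $p^{i-1}$, and evaluates only the steady-state case of Eq. (6), where $A_L = S_\Pi / \Pi$. The function names `hyperperiod_supply` and `max_performance` are hypothetical.

```python
import math

def hyperperiod_supply(a, r, p, phi_w, phi_c, n):
    """Return (S_Pi, Pi) for n warm rejuvenations followed by one cold one.

    Assumption: f(t) = 1 - a*t, and the i-th available interval runs from
    performance p**(i-1) down to the threshold r (Eqs. (2)-(5)).
    """
    # l_i = (p^(i-1) - r) / a for i = 1..n+1 (Eq. (2))
    lengths = [(p ** k - r) / a for k in range(n + 1)]
    # S_i = (p^(2(i-1)) - r^2) / (2a): closed form of Eq. (3) for linear f
    supplies = [(p ** (2 * k) - r ** 2) / (2 * a) for k in range(n + 1)]
    period = sum(lengths) + n * phi_w + phi_c  # Eq. (4)
    return sum(supplies), period               # Eq. (5)

def max_performance(a, r, p, phi_w, phi_c):
    """Linear search over n in [0, N_max] for the best steady-state average."""
    n_max = math.floor(math.log(r) / math.log(p))  # N_max = floor(log_p r)
    best_n, best_avg = 0, 0.0
    for n in range(n_max + 1):
        supply, period = hyperperiod_supply(a, r, p, phi_w, phi_c, n)
        avg = supply / period  # A_L when L is a multiple of Pi (Eq. (6))
        if avg > best_avg:
            best_n, best_avg = n, avg
    return best_n, best_avg

if __name__ == "__main__":
    # Parameter values taken from the simulation setup in Section 5.1.1.
    print(max_performance(a=0.005, r=0.3, p=0.95, phi_w=45, phi_c=150))
```

The sketch ignores the remainder term $S_R$ of Eq. (7), which is a reasonable simplification only when $L \gg \Pi$.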
As an example, consider a simplified periodic resource with only cold rejuvenation, $R(f(t), \Pi, \Phi_C)$, where $f(t) = 1 - 0.1t$ is the resource performance degradation function, $\Pi$ is the resource period and $\Phi_C = 2$ is the cold rejuvenation overhead. The average resource performance is

$$A = \frac{\int_0^{\Pi-2} f(t)\,dt}{\Pi} = \frac{-0.05\Pi^2 + 1.2\Pi - 2.2}{\Pi}.$$

Setting $\frac{dA}{d\Pi} = \frac{-0.05\Pi^2 + 2.2}{\Pi^2} = 0$ gives $\Pi \approx 6.63$. With $\Pi = \lceil 6.63 \rceil = 7$, the average resource performance is maximized at 0.5357. However, if a task $\tau(3, 6)$ is deployed on the resource $R(f(t) = 1 - 0.1t, 7, 2)$, both job instances $J_6$ and $J_7$ miss their deadlines, as depicted in Fig. 4(a). If we set $\Pi = \lfloor 6.63 \rfloor = 6$, then all the job instances of task $\tau(3, 6)$ meet their deadlines (Fig. 4(b)), but the average resource performance is only 0.5333, which is smaller than 0.5357.

The example reveals that the rejuvenation schedule that optimizes the average resource performance does not necessarily meet the task's deadline requirement. Hence, we aim to find the rejuvenation schedule that minimizes a given task's deadline miss rate.

4.1 Models and Definitions

In this paper, we only consider the problem of scheduling a single real-time periodic task on a resource with performance degradation. A real-time periodic task $\tau$ is represented as a 2-tuple $\tau(e, T)$, where $e$ is the execution demand in computational cycles and $T$ is the period of task $\tau$, which is also its deadline.

In Section 3, we use a constant performance degradation threshold $r$ to determine when to perform a rejuvenation. However, when a real-time task is deployed on the resource, rejuvenation may need to be performed according to the task's execution, and a constant performance degradation threshold may not be sufficient to capture the dynamics of when to perform rejuvenation. We extend the resource model to a 7-tuple $R(f(t), p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$.
In addition to the performance function $f(t)$, the resource performance restore factor of a warm rejuvenation $p$, the cold rejuvenation downtime $\Phi_C$, the warm rejuvenation downtime $\Phi_W$ and the number of warm rejuvenations between two cold rejuvenations $n$, as defined in Section 3, the resource model also includes a set of cold rejuvenation periods $\Pi_c = \{\pi_c^1, \ldots, \pi_c^j\}$ and a set of warm rejuvenation periods $\Pi_w = \{\pi_w^1, \ldots, \pi_w^k\}$ within a rejuvenation hyperperiod. As illustrated in Fig. 5, a rejuvenation period is defined as the time interval that starts from the resource available time after the previous rejuvenation and lasts to the end of the current rejuvenation. All the elements in $\Pi_c$ and $\Pi_w$ are in chronological order. For a rejuvenation hyperperiod, we have $\pi_c^j \bmod T = 0$, where $\pi_c^j$ is the last element in $\Pi_c$.

According to the relationship between the rejuvenation period and the task period, we define two categories of rejuvenation strategy: synchronized rejuvenation and asynchronized rejuvenation.

Definition 1 (Synchronized Rejuvenation). Consider a real-time periodic task $\tau(e, T)$ and a resource $R(f(t), p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$. The resource performs synchronized rejuvenation on task $\tau$ if $\forall \pi_c^i \in \Pi_c, \pi_w^j \in \Pi_w$: $\pi_c^i \bmod T = 0 \wedge \pi_w^j \bmod T = 0$.

Definition 2 (Asynchronized Rejuvenation). Consider a real-time periodic task $\tau(e, T)$ and a resource $R(f(t), p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$. The resource performs asynchronized rejuvenation on task $\tau$ if $\exists \pi_c^i \in \Pi_c \vee \pi_w^j \in \Pi_w$: $\pi_c^i \bmod T \ne 0 \vee \pi_w^j \bmod T \ne 0$.

4.2 Cold Rejuvenation Strategy for Single Periodic Task

We first consider a simple case where only cold rejuvenation is performed and only one real-time periodic task is deployed on the resource. The resource model can be presented as $R(f(t), 1, \Phi_C, \Phi_W, 0, \Pi_c = \{\pi\}, \Pi_w = \emptyset)$. We can further simplify the resource model as $R(f(t), \pi, \Phi_C)$, where $f(t)$ is the resource degradation function, $\pi$ is the rejuvenation period and $\Phi_C$ is the cold rejuvenation overhead. We assume that synchronized rejuvenation is performed.

From Fig. 4 we can see that when the task's release time is synchronized with the resource restore time, the task in the example can be scheduled. The following lemma also indicates that when $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} \le T$, synchronized rejuvenation can minimize the task's deadline miss rate.

Lemma 1. Consider a resource $R(f(t), p = 1, \Phi_C, \Phi_W, n = 0, \Pi_c = \{\pi_c\}, \Pi_w = \emptyset)$ and a real-time task $\tau(e, T)$ deployed on it. Synchronized rejuvenation minimizes the task's deadline miss rate if $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} \le T$, where $e_{i_{max}}$ is the execution time of the last task instance before rejuvenation.

Proof.
At least $\lfloor \Phi_C / T \rfloor$ task instances miss their deadlines during the cold rejuvenation downtime. Assume that $i_{max}$ task instances can finish their execution before their deadlines after each cold rejuvenation. Hence, the minimum possible deadline miss rate when only cold rejuvenation is performed is

$$\frac{\lfloor \Phi_C / T \rfloor}{\lfloor \Phi_C / T \rfloor + i_{max}}.$$

During the cold rejuvenation downtime, a total of $\lceil \Phi_C / T \rceil$ task instances are released. In order to reach the minimum deadline miss rate, we have to guarantee that the additional task instance released during the cold rejuvenation downtime is the $i_{max}$th task instance. Therefore, we have

$$\Phi_C - \left\lfloor \frac{\Phi_C}{T} \right\rfloor T \le T - e_{i_{max}} \;\Rightarrow\; \Phi_C - \left\lfloor \frac{\Phi_C}{T} \right\rfloor T + e_{i_{max}} \le T.$$

Guo et al. proved that when task periods are harmonic with a periodic resource, the utilization bound can be maximized [5]. We hypothesize that when synchronized rejuvenation is performed, the task's deadline miss rate can be minimized. Therefore, we aim to design a rejuvenation strategy such that rejuvenation is synchronized with the task and the task's deadline miss rate is minimized. Such a design may not only reduce the task's deadline miss rate but also simplify the implementation. We will validate the hypothesis in Section 5.

Our first step is to find the maximum number of instances that can finish their execution without any rejuvenation. The resource supply for the $i$th instance can be calculated as follows:

$$S_i = \int_{(i-1)T}^{iT} f(t)\,dt. \tag{12}$$

Lemma 2. For a resource with performance degradation function $f(t)$ and a real-time task $\tau(e, T)$ deployed on it, if $S_{i_{max}} \ge e > S_{i_{max}+1}$, then at most $i_{max}$ task instances can finish their execution before their deadlines.

Proof. The proof of Lemma 2 is straightforward. The performance degradation function is a linearly decreasing function, so once the resource supply in one task period cannot provide $e$ cycles of computation, the task instance in that period misses its deadline and all following task instances also miss their deadlines.
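To make Eq. (12) and Lemma 2 concrete, the following Python sketch computes $i_{max}$ for the linear degradation function $f(t) = 1 - at$. It is a minimal illustration under stated assumptions: time is measured from the moment the resource is fully restored, the first instance is released at $t = 0$, performance is clipped at zero once $t \ge 1/a$, and $e > 0$. The helper names `supply` and `max_instances` are hypothetical.

```python
def supply(a, t):
    """Cumulative supply of f(t) = max(0, 1 - a*t) over [0, t]."""
    t = min(t, 1.0 / a)  # assumption: performance stays at zero after t = 1/a
    return t - 0.5 * a * t * t

def max_instances(a, e, T):
    """Largest i such that S_i >= e, with S_i the supply in [(i-1)T, iT].

    Follows Lemma 2: because the per-period supply only decreases, the first
    period whose supply drops below e ends the run of successful instances.
    Requires e > 0 so that the loop terminates.
    """
    i = 0
    while supply(a, (i + 1) * T) - supply(a, i * T) >= e:
        i += 1
    return i

if __name__ == "__main__":
    # Task tau(3, 6) on f(t) = 1 - 0.1t, as in the Fig. 4 example.
    print(max_instances(a=0.1, e=3, T=6))
```

For the task $\tau(3, 6)$ and $f(t) = 1 - 0.1t$, the first period supplies 4.2 cycles and the second only 0.8, so only one instance completes before a rejuvenation is needed.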
With Lemma 2, we know that after the $i_{max}$th instance, we have to perform rejuvenation so that the following task instances have the opportunity to finish their executions. However, cold rejuvenation takes time, and some task instances cannot be executed during the rejuvenation.

Lemma 3. For a resource with performance degradation function $f(t)$, cold rejuvenation downtime $\Phi_C$, and a real-time task $\tau(e, T)$ deployed on it, at least $\lceil \frac{\Phi_C - T + e}{T} \rceil$ task instances miss their deadlines during the rejuvenation downtime.

Proof. In a cold rejuvenation downtime $\Phi_C$, there are $n = \lfloor \Phi_C / T \rfloor$ task instances that certainly miss their deadlines. A task instance has at most $T - e$ spare time. If the remaining rejuvenation time $\Phi_C - nT > T - e$, then another task instance misses its deadline. If the remaining rejuvenation time $\Phi_C - nT < T - e$, it is possible that the following task instance finishes within its deadline.

Lemma 4. Consider a resource $R(f(t), \pi, \Phi_C)$ with performance degradation function $f(t)$, cold rejuvenation downtime $\Phi_C$, and a real-time task $\tau(e, T)$ on it. The task's deadline miss rate is minimized when

$$\Pi = \begin{cases} i_{max}T + \lfloor \Phi_C / T \rfloor T - \Phi_C, & \text{if } \Phi_C - \lfloor \Phi_C / T \rfloor T \le T - e_{i_{max}} \\ i_{max}T + \lceil \Phi_C / T \rceil T - \Phi_C, & \text{if } \Phi_C - \lfloor \Phi_C / T \rfloor T > T - e_{i_{max}} \end{cases} \tag{13}$$

where $e_{i_{max}}$ is the execution time of the $i_{max}$th instance.

Proof. Assume the execution time of the $i_{max}$th instance is $e_{i_{max}}$; the idle time in its period is then $T - e_{i_{max}}$. As the end of rejuvenation coincides with a task instance's release time, at least $\lfloor \Phi_C / T \rfloor$ instances miss their deadlines. The remaining time in the cold rejuvenation downtime is then $\Phi_C - \lfloor \Phi_C / T \rfloor T$.

Fig. 6. Resource with Cold Synchronized Rejuvenation

As Case 1 in Fig. 6 illustrates, if the remaining time $\Phi_C - \lfloor \Phi_C / T \rfloor T \le T - e_{i_{max}}$, then a rejuvenation that starts at $i_{max}T + \lfloor \Phi_C / T \rfloor T - \Phi_C$ synchronizes the resource restore with a task instance's release. If the remaining time $\Phi_C - \lfloor \Phi_C / T \rfloor T > T - e_{i_{max}}$, there are two options to ensure that the resource restore is synchronized with a task instance's release. As shown in Fig. 7, the first strategy is to preempt the execution of the $i_{max}$th instance, and the second strategy is to wait until the $(i_{max}+1)$th instance.

Fig. 7. Two Cold Synchronized Rejuvenation Strategies

Denote $\lfloor \Phi_C / T \rfloor = a$ and $i_{max} + \lfloor \Phi_C / T \rfloor = n$. If we preempt the execution of the $i_{max}$th instance and start the rejuvenation during that instance, then by the end of the rejuvenation a total of $n$ task instances have been released and $a + 1$ of them miss their deadlines. Therefore, the deadline miss rate is $\frac{a+1}{n}$. If the rejuvenation starts after the $i_{max}$th instance finishes its execution, then a total of $n + 1$ instances are released and $a + 1$ of them miss their deadlines. Hence, the deadline miss rate is $\frac{a+1}{n+1}$. The second option always gives the smaller deadline miss rate. Hence, it is better to perform rejuvenation after the $i_{max}$th instance finishes its execution.

4.3 Two-level Rejuvenation Strategy for Single Periodic Task

Once we obtain the optimal strategy to minimize the task's deadline miss rate using a single synchronized cold rejuvenation schedule, we can add warm rejuvenations between two cold rejuvenations. Each single warm rejuvenation can be treated as an individual resource with a single rejuvenation, and Lemma 4 can be applied to calculate the optimal rejuvenation strategy. However, as a warm rejuvenation cannot fully restore the resource's performance, the number of warm rejuvenations between two cold rejuvenations is subject to the following two conditions.

Lemma 5.
For a resource $R(f(t) = 1 - at, p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$ and a real-time periodic task $\tau(e, T)$ deployed on it, at most $n = \lfloor \frac{1}{2} \log_p 2ae \rfloor$ warm rejuvenations can be performed between two cold rejuvenations.

Proof. After the $n$th warm rejuvenation, the resource's restored performance is $p^n$. The remaining supply $S_n$ can be calculated as

$$S_n = \frac{1}{2}\left(\frac{1}{a} - \frac{1 - p^n}{a}\right) p^n = \frac{p^{2n}}{2a}. \tag{14}$$

Once $S_n < e$, no task instance can be finished after the warm rejuvenation. Hence, the maximum number of warm rejuvenations $n$ satisfies the following condition:

$$S_n = \frac{p^{2n}}{2a} = e \tag{15}$$

$$p^{2n} = 2ae \tag{16}$$

$$n = \frac{1}{2} \log_p 2ae \tag{17}$$

As $n$ is an integer, no task instance can finish its execution after more than $\lfloor \frac{1}{2} \log_p 2ae \rfloor$ warm rejuvenations.

Lemma 6. For a resource $R(f(t) = 1 - at, p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$ and a real-time periodic task $\tau(e, T)$ deployed on it, if performing $n$ warm rejuvenations between two cold rejuvenations minimizes the task's deadline miss rate, we have

$$i_{max}^{n} \ge \left\lceil \frac{\lfloor \Phi_W / T \rfloor \left( i_{max}^{0} + \sum_{j=1}^{n} i_{max}^{j} \right)}{\lfloor \Phi_C / T \rfloor + n \lfloor \Phi_W / T \rfloor} \right\rceil > i_{max}^{n+1} \tag{18}$$

where $i_{max}^{0}$ is the maximum number of instances that can finish their execution after a cold rejuvenation and $i_{max}^{j}$ is the maximum number of instances that can finish their execution after $j$ warm rejuvenations.

Fig. 8. Resource with Two-Level Synchronized Rejuvenation

Proof. Denote the minimum numbers of instances that miss their deadlines in a cold rejuvenation and in a warm rejuvenation as $x$ and $y$, respectively. According to Lemma 3, we have $x = \lfloor \Phi_C / T \rfloor$ and $y = \lfloor \Phi_W / T \rfloor$. Denote by $Suc(n)$ the number of task instances that meet their deadlines from the first warm rejuvenation to the $n$th warm rejuvenation. We have $Suc(n) = \sum_{j=1}^{n} i_{max}^{j}$. Since the resource performance restored by each warm rejuvenation decreases as the number of warm rejuvenations increases, the function $Suc(n)$ has the following properties:

1) $Suc(0) = 0$
2) $1 \le Suc(n+1) - Suc(n) \le i_{max}^{0}$ if $1 \le n \le \frac{1}{2} \log_p 2ae$
3) $Suc(n+1) = Suc(n)$ if $n \ge \frac{1}{2} \log_p 2ae$

As illustrated in Fig. 8, for a given resource $R(f(t) = 1 - at, p, \Phi_C, \Phi_W, n, \Pi_c, \Pi_w)$ and real-time periodic task $\tau(e, T)$, the number of instances that miss their deadlines is $Miss(n) = x + ny$ and the number of instances released is $Rel(n) = x + i_{max}^{0} + ny + Suc(n)$. The deadline miss rate can then be calculated as follows:

$$d(n) = \frac{Miss(n)}{Rel(n)} = \frac{x + ny}{x + i_{max}^{0} + ny + Suc(n)}. \tag{19}$$

In order to find the minimum of $d(n)$, we first need to determine whether $d(n)$ has a minimum value. Consider the following difference:

$$d(n+1) - d(n) = \frac{Miss(n) + y}{Rel(n) + i_{max}^{n+1} + y} - \frac{Miss(n)}{Rel(n)} = \frac{Rel(n)\,y - (i_{max}^{n+1} + y)\,Miss(n)}{(Rel(n) + i_{max}^{n+1} + y)\,Rel(n)}.$$

We have

$$\begin{cases} d(n+1) \ge d(n), & \text{if } Rel(n)\,y - (i_{max}^{n+1} + y)\,Miss(n) \ge 0 \\ d(n+1) < d(n), & \text{if } Rel(n)\,y - (i_{max}^{n+1} + y)\,Miss(n) < 0 \end{cases}$$

$$\Rightarrow \begin{cases} d(n+1) \ge d(n), & \text{if } \frac{y}{i_{max}^{n+1} + y} \ge \frac{Miss(n)}{Rel(n)} \\ d(n+1) < d(n), & \text{if } \frac{y}{i_{max}^{n+1} + y} < \frac{Miss(n)}{Rel(n)} \end{cases}$$

$$\Rightarrow \begin{cases} d(n+1) \ge d(n), & \text{if } \frac{y}{i_{max}^{n+1} + y} \ge d(n) \\ d(n+1) < d(n), & \text{if } \frac{y}{i_{max}^{n+1} + y} < d(n) \end{cases} \tag{20}$$

In Eq. (20), $\frac{y}{i_{max}^{n+1} + y}$ represents the deadline miss rate of the $(n+1)$th warm rejuvenation alone. Since $y$ is a constant for all warm rejuvenations and $i_{max}^{n}$ decreases as $n$ increases, the deadline miss rate of a single warm rejuvenation, $\frac{y}{i_{max}^{n} + y}$, increases as $n$ increases. Hence, $d(n)$ keeps decreasing as long as $\frac{y}{i_{max}^{n} + y} < d(n-1)$ and keeps increasing once $\frac{y}{i_{max}^{n} + y} > d(n-1)$. Therefore, the function $d(n)$ has a minimum value.
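The behavior of $d(n)$ in Eq. (19) can be checked numerically. The sketch below is a hedged illustration rather than part of the original analysis: it assumes that the per-rejuvenation instance counts $i_{max}^{0}, i_{max}^{1}, \ldots$ are already known and passed in as a list, and the numeric values in the usage example are made up for demonstration. The function names `miss_rate` and `best_warm_count` are hypothetical.

```python
def miss_rate(x, y, i_max, n):
    """Eq. (19): d(n) = Miss(n) / Rel(n).

    x, y  : instances missed per cold / warm rejuvenation
    i_max : list with i_max[0] = i^0_max and i_max[j] = i^j_max
    """
    suc = sum(i_max[1:n + 1])                    # Suc(n)
    missed = x + n * y                           # Miss(n)
    released = x + i_max[0] + n * y + suc        # Rel(n)
    return missed / released

def best_warm_count(x, y, i_max):
    """Exhaustively pick the n in [0, len(i_max) - 1] that minimizes d(n)."""
    return min(range(len(i_max)), key=lambda n: miss_rate(x, y, i_max, n))

if __name__ == "__main__":
    # Made-up example values: x = 2, y = 1, decreasing i^j_max.
    counts = [10, 8, 5, 2, 1]
    print(best_warm_count(2, 1, counts))
    print([round(miss_rate(2, 1, counts, n), 4) for n in range(len(counts))])
```

With these made-up values, $d(n)$ first decreases and then increases, which matches the single-minimum behavior argued above.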
As $d(n)$ is the minimum deadline miss rate, we have

$$\begin{cases} d(n+1) > d(n) \\ d(n-1) \ge d(n) \end{cases} \Rightarrow \begin{cases} \frac{Miss(n) + y}{Rel(n) + i_{max}^{n+1} + y} > \frac{Miss(n)}{Rel(n)} \\ \frac{Miss(n-1)}{Rel(n-1)} \ge \frac{Miss(n-1) + y}{Rel(n-1) + i_{max}^{n} + y} \end{cases} \tag{21}$$

By solving inequality (21), we obtain

$$i_{max}^{n} \ge \frac{(Suc(n) + i_{max}^{0})\,y}{x + ny} > i_{max}^{n+1}. \tag{22}$$

Substituting $x$, $y$, and $Suc(n)$ with $\lfloor \Phi_C / T \rfloor$, $\lfloor \Phi_W / T \rfloor$ and $\sum_{j=1}^{n} i_{max}^{j}$, respectively, we have

$$i_{max}^{n} \ge \left\lceil \frac{\lfloor \Phi_W / T \rfloor \left( i_{max}^{0} + \sum_{j=1}^{n} i_{max}^{j} \right)}{\lfloor \Phi_C / T \rfloor + n \lfloor \Phi_W / T \rfloor} \right\rceil > i_{max}^{n+1}. \tag{23}$$

5 EMPIRICAL STUDY

In this section, we first use simulation to evaluate the relationship between the warm rejuvenation number $n$ and the average resource performance $A_L$, and the impact of the warm/cold rejuvenation time cost on the optimal warm rejuvenation number $N^*$ that maximizes the average resource performance $A_L$. Since we proposed a hypothesis in Section 4.2 that the task deadline miss rate is minimized when the task release time is synchronized with the rejuvenation period, the second part of the empirical study validates this hypothesis through experiments.

5.1 Experimental Study on Average Performance Maximization

Alonso et al. conducted a set of experiments to evaluate the rejuvenation overhead of different rejuvenation techniques [1]. Their experimental results show that a standalone application restart and a virtual/physical machine reboot consume about 45 seconds and 150 seconds, respectively. The application restart can be treated as a warm rejuvenation, while the machine reboot is a kind of cold rejuvenation. In our simulations, we use these experimental results as a guide for setting the warm and cold rejuvenation time cost parameters.
5.1.1 Relationship between $n$ and $A_L$

To evaluate the relationship between the number of warm rejuvenations $n$ and the average resource performance $A_L$, we conduct a simulation with the following parameters:

• Resource performance degradation rate: $a = 0.005$
• Resource performance threshold: $r = 0.3$
• Resource performance restore factor of a warm rejuvenation: $p = 0.95$
• Cold rejuvenation time cost: $\Phi_C = 150$
• Warm rejuvenation time cost: $\Phi_W = 45$
• System longevity: $L \in \{1 \times 10^4, 3 \times 10^4, 5 \times 10^4, 1 \times 10^5\}$

The possible maximal number of warm rejuvenations is $N_{max} = \lfloor \log_p r \rfloor = 23$. With a given system longevity $L$, for each possible warm rejuvenation number, i.e., $n \in [0, N_{max}]$, we calculate the average resource performance $A_L$ according to the analysis in Section 3.3 and Eq. (1). Fig. 9 shows the average resource performance under different numbers of warm rejuvenations for each system longevity.

From Fig. 9, we have the following observations:

1) When the number of warm rejuvenations $n$ increases, the average resource performance $A_L$ first increases and then decreases. For instance, $A_L$ increases when $n$ increases from 0 to 3 and starts to decrease when $n$ increases from 3 to 23.
2) When the number of warm rejuvenations $n$ is too small or too large, the average resource performance $A_L$ is relatively low. For instance, when $n = N_{max} = 23$, $A_L$ reaches its minimal value.
3) The system longevity $L$ does not have a significant impact on the rejuvenation behavior when $L \gg \Pi$.

In our models given in Section 3.1, we assume a rejuvenation pattern starts with the initial state, i.e., f(t) = 0, which indicates that the rejuvenation behaviors in each rejuvenation hyperperiod are the same. In addition, we have also made the assumption that the system periodically repeats the rejuvenation pattern with period $\Pi$. If $L \gg \Pi$, the system longevity does not have a significant impact on rejuvenation effects. This observation is supported by the following aspects:

1) For different system longevities, the optimal number of warm rejuvenations that maximizes the average resource performance is the same. In particular, for the four tested longevity cases, $N^* = 3$.
2) With the same number of warm rejuvenations $n$, the average resource performance $A_L$ of the four longevity cases is similar. For instance, the maximal difference of $A_L$ among the four longevity cases is 3.76%.
3) The trends of the average resource performance over the number of warm rejuvenations are similar.

The observations are consistent with our analysis, i.e., there is an optimal number of warm rejuvenations between 0 and $N_{max}$ that maximizes the average resource performance. When $n = 0$, i.e., the system only takes cold rejuvenations, the resource becomes a P²-resource [10]. The simulation results also show that the extended resource model achieves 25.22% higher average resource performance than the P²-resource model.

Fig. 9. Average Resource Performance vs Warm Rejuvenation Number (x-axis: $n$; y-axis: $A_L$; curves for $L = 1 \times 10^4$, $3 \times 10^4$, $5 \times 10^4$, $10 \times 10^4$)

5.1.2 Warm/Cold Rejuvenation Time Cost Impact

We conduct a simulation to evaluate the impact of the warm/cold rejuvenation time cost on the optimal number of warm rejuvenations $N^*$ that maximizes $A_L$ and on the maximal average resource performance $f_{max}$. The simulation parameters are the same as in Section 5.1.1 except for the following parameters:

• Cold rejuvenation time cost: $\Phi_C \in \{100, 150, 200, 300\}$
• Warm rejuvenation time cost: $\Phi_W \in [0, 100]$ with step 5
• System longevity: $L = 10 \times 10^4$

With a given cold rejuvenation time cost $\Phi_C$, for each choice of warm rejuvenation time cost $\Phi_W$, we use the MAX-PERFORMANCE algorithm (Algorithm 1) to determine the optimal number of warm rejuvenations $N^*$ that maximizes $A_L$ and the maximal average resource performance $f_{max}$. Fig. 10(a) and Fig. 10(b) depict the warm/cold rejuvenation time cost impact on $N^*$ and $f_{max}$, respectively.

From Fig. 10, we have the following observations:

1) In general, the optimal number of warm rejuvenations $N^*$ decreases when the warm rejuvenation time cost $\Phi_W$ increases; it increases when the cold rejuvenation time cost $\Phi_C$ increases.
2) The maximal average resource performance $f_{max}$ decreases when either the warm or the cold rejuvenation time cost increases.
3) Both the optimal number of warm rejuvenations $N^*$ and the maximal average resource performance $f_{max}$ decrease as the warm/cold rejuvenation time cost ratio $\Phi_W / \Phi_C$ increases.

The observations are consistent with the intuition behind the proposed resource model. If the warm rejuvenation costs less time or the cold rejuvenation costs more time, i.e., the ratio $\Phi_W / \Phi_C$ is smaller, we should perform more warm rejuvenations to take advantage of their low time cost. As the resource is unavailable during rejuvenations, the average resource performance decreases if the rejuvenation time cost increases.
When the ratio $\Phi_W / \Phi_C$ is smaller, the proposed resource model can benefit more from the low time cost of warm rejuvenations, which results in a higher average resource performance $f_{max}$.

Fig. 10. Warm/Cold Rejuvenation Time Cost Impact. (a) Optimal Number of Warm Rejuvenations (x-axis: $\Phi_W / \Phi_C$; y-axis: $N^*$); (b) Maximal Average Resource Performance (x-axis: $\Phi_W / \Phi_C$; y-axis: $f_{max}$). Curves for $\Phi_C = 100$, $150$, $250$, $300$.

5.2 Synchronized vs. Asynchronized Rejuvenation

In Section 4.2, we made a hypothesis that the task's deadline miss rate is minimized when synchronized rejuvenation is performed. In this section, we validate the hypothesis through both analytic and empirical study.

We start from the simple case where only cold rejuvenation is performed. Lemma 1 shows that when $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} \le T$, synchronized rejuvenation can minimize the task's deadline miss rate. However, for a resource $R(f(t), p = 1, \Phi_C, \Phi_W, n = 0, \Pi_c = \{\pi_c\}, \Pi_w = \emptyset)$ and a real-time task $\tau(e, T)$ deployed on it with $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} > T$, where $e_{i_{max}}$ is the execution time of the last task instance before rejuvenation, the situation becomes complicated if asynchronized rejuvenation is performed. Two extreme scenarios may occur.

Best Case:

Fig. 11. Comparison between Synchronized Cold Rejuvenation and Asynchronized Cold Rejuvenation (Best Case Scenario)

As depicted in Fig. 11, synchronized cold rejuvenation is performed to restore the resource's performance. If $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} > T$, the first $i_{max}$ task instances can finish their execution before their deadlines. In order to keep the rejuvenation synchronized with the task instances, the rejuvenation starts in the middle of the $(i_{max}+1)$th task instance. Since the rejuvenation is synchronized with the task, assume $n$ task instances are released during the rejuvenation period. Then the deadline miss rate of the synchronized cold rejuvenation is

$$d_{syn} = \frac{\lceil \Phi_C / T \rceil}{n}. \tag{24}$$

For the asynchronized rejuvenation, we consider the case in which rejuvenation always starts immediately after the $i_{max}$th task instance finishes its execution. We consider the same time interval as the synchronized rejuvenation period, $nT$. As illustrated in Fig. 11, it is possible that the first task instance $J_1$ is released before the resource's available time but can still finish its execution before its deadline once the resource becomes available. Because the second task instance $J_2$ can start earlier than $J_2$ in the synchronized rejuvenation scenario, it receives more resource supply than in the synchronized rejuvenation scenario. Therefore, each of the following task instances also receives more resource supply than in the synchronized rejuvenation scenario. Hence, it is possible that the $(i_{max}+1)$th task instance can also finish its execution before its deadline. Then, the number of task instances that meet their deadlines before rejuvenation is $i_{max} + 1$. Since the number of task instances released during the time interval $nT$ is $n$, the deadline miss rate of the asynchronized scenario is

$$d_{asyn} = \frac{\lfloor \Phi_C / T \rfloor}{n}. \tag{25}$$

In the best case, the deadline miss rate of asynchronized cold rejuvenation is less than the deadline miss rate of synchronized cold rejuvenation. However, in the worst case scenario, the deadline miss rate of asynchronized cold rejuvenation can be larger than the deadline miss rate of synchronized cold rejuvenation.

Worst Case:

Fig. 12. Comparison between Synchronized Cold Rejuvenation and Asynchronized Cold Rejuvenation (Worst Case Scenario)

As shown in Fig. 12, in the worst case scenario, the first task instance $J_1$ is released during the last rejuvenation downtime and before the resource's restore time. However, unlike the aforementioned best case scenario, $J_1$ misses its deadline. Although the resource supply for task instances $J_2$ to $J_{i_{max}+1}$ increases compared to the synchronized rejuvenation scenario, it is possible that $J_{i_{max}+1}$ still cannot finish its execution before its deadline. Hence, the cold rejuvenation is performed immediately after $J_{i_{max}}$ finishes its execution. Since the rejuvenation starts earlier, the resource restores its supply during the last task instance's execution. The resource can support part of the execution of the last task instance $J_n$ in the time interval $nT$, but cannot fully execute $J_n$ within its deadline. Hence, a total of $i_{max} - 1$ out of $n$ task instances can meet their deadlines. Then the deadline miss rate of the worst case scenario is

$$d_{asyn} = \frac{n - i_{max} + 1}{n}, \tag{26}$$

which is larger than the deadline miss rate of synchronized rejuvenation shown in Eq. (24).

From the above analysis, we know that when only cold rejuvenation is performed to restore the resource's performance, if $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} \le T$, the synchronized rejuvenation outperforms the asynchronized rejuvenation. If $\Phi_C - \lfloor \Phi_C / T \rfloor T + e_{i_{max}} > T$, it is difficult to tell which rejuvenation strategy performs better. Because both the best case and the worst case scenario may occur during the execution when rejuvenation is asynchronized with the task, it is difficult to compare the synchronized and asynchronized rejuvenation through theoretical analysis. The analysis is even more complicated when two-level rejuvenation is enabled. Hence, we study the performance of both rejuvenation strategies through simulations.
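The case split used in this analysis can be written as a small helper. The Python sketch below is only an illustration under stated assumptions: it treats $e_{i_{max}}$ and $i_{max}$ as given inputs (in the paper they depend on the degradation function), and the function name `cold_sync_check` is hypothetical. It reports whether the Lemma 1 condition holds and the lower bound $\lfloor \Phi_C / T \rfloor / (\lfloor \Phi_C / T \rfloor + i_{max})$ on the deadline miss rate from the proof of Lemma 1.

```python
import math

def cold_sync_check(phi_c, T, e_last, i_max):
    """Return (condition_holds, lower_bound) for cold-only rejuvenation.

    phi_c  : cold rejuvenation downtime
    T      : task period (also the deadline)
    e_last : execution time of the last instance before rejuvenation
             (e_{i_max} in the text; taken as an input here)
    i_max  : instances that finish between two cold rejuvenations
    """
    full_periods = math.floor(phi_c / T)       # instances certainly missed
    remainder = phi_c - full_periods * T       # leftover downtime
    condition_holds = remainder + e_last <= T  # Lemma 1 condition
    lower_bound = full_periods / (full_periods + i_max)
    return condition_holds, lower_bound

if __name__ == "__main__":
    # Illustrative values only, not taken from the paper's simulations.
    print(cold_sync_check(phi_c=250, T=100, e_last=40, i_max=5))
    print(cold_sync_check(phi_c=280, T=100, e_last=40, i_max=5))
```

When the returned condition is false, neither strategy is guaranteed to be better, which is why the comparison is carried out by simulation.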
The simulations are conducted with the following parameters:

• Resource performance degradation rate: $a \in \{0.01, 0.001, 0.0001\}$
• Resource performance restore factor of a warm rejuvenation: $p = 0.9$
• Task period: $T \in [100, 200]$
• Task utilization $U = e/T$: $U \in \{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1\}$
• Cold rejuvenation time cost: $\Phi_C = nT$, $n \in \{2, 5, 10\}$
• Warm rejuvenation time cost: $\Phi_W = \frac{1}{3}\Phi_C$
• System longevity: $L = 10^5$

Four different rejuvenation strategies are evaluated in the simulations: synchronized cold rejuvenation, asynchronized cold rejuvenation, synchronized two-level rejuvenation, and asynchronized two-level rejuvenation. Under each set of parameters, we repeat the simulation one hundred times with random task periods using these four rejuvenation strategies. The average deadline miss rate of each rejuvenation strategy is calculated for the comparison. In addition to the average deadline miss rate, an outperform ratio is also calculated to evaluate in how many simulations the synchronized rejuvenation strategy outperforms the asynchronized rejuvenation strategy under the same set of parameters.

Fig. 13. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation ($a = 0.01$, $\Phi_C = 2T$). Panels: (a) Cold Rejuvenation; (b) Outperform Ratio of Cold Rejuvenation; (c) Two-Level Rejuvenation; (d) Outperform Ratio of Two-Level Rejuvenation. Curves in (a) and (c): Synchronized, Asynchronized.

Fig. 13, Fig. 14 and Fig. 15 show the comparison of the average deadline miss rate between the synchronized rejuvenation strategy and the asynchronized rejuvenation strategy for a resource degradation rate $a = 0.01$ and a cold rejuvenation downtime $\Phi_C$ equal to two, five and ten times the task's period, respectively. From the figures, it is clear that when the two-level rejuvenation strategy is applied, the average deadline miss rate is reduced compared to the cold rejuvenation strategy.
It is noticed that when the task utilization is small, the asynchronized rejuvenation strategies have a lower average deadline miss rate than the synchronized rejuvenation strategies. When the task utilization increases, the synchronized rejuvenation strategies outperform the asynchronized rejuvenation strategies. When the task’s utilization is above 50%, none of the tasks can meet their deadlines. This is because, for a resource with degradation rate a = 0.01, the resource performance reaches 0 at time 100. As our task period is larger than 100, the maximum task utilization the resource can support is 0.5.

Fig. 14. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation (a = 0.01, ΦC = 5T). Panels: (a) Cold Rejuvenation, (b) Outperform Ratio of Cold Rejuvenation, (c) Two-Level Rejuvenation, (d) Outperform Ratio of Two-Level Rejuvenation.

Fig. 15. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation (a = 0.01, ΦC = 10T). Panels: (a) Cold Rejuvenation, (b) Outperform Ratio of Cold Rejuvenation, (c) Two-Level Rejuvenation, (d) Outperform Ratio of Two-Level Rejuvenation.

Fig. 16, Fig. 17 and Fig. 18 compare the average deadline miss rate of the synchronized and asynchronized rejuvenation strategies for resource degradation rate a = 0.001 and cold rejuvenation downtime ΦC equal to two, five and ten times the task’s period, respectively. Fig. 19, Fig. 20 and Fig. 21 depict the same comparison for resource degradation rate a = 0.0001 with the same three values of ΦC.

Fig. 16. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation (a = 0.001, ΦC = 2T). Panels: (a) Cold Rejuvenation, (b) Outperform Ratio of Cold Rejuvenation, (c) Two-Level Rejuvenation, (d) Outperform Ratio of Two-Level Rejuvenation.

Fig. 17. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation (a = 0.001, ΦC = 5T). Panels: (a) Cold Rejuvenation, (b) Outperform Ratio of Cold Rejuvenation, (c) Two-Level Rejuvenation, (d) Outperform Ratio of Two-Level Rejuvenation.

Fig. 18. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation (a = 0.001, ΦC = 10T). Panels: (a) Cold Rejuvenation, (b) Outperform Ratio of Cold Rejuvenation, (c) Two-Level Rejuvenation, (d) Outperform Ratio of Two-Level Rejuvenation.

Fig. 19. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation (a = 0.0001, ΦC = 2T). Panels: (a) Cold Rejuvenation, (b) Outperform Ratio of Cold Rejuvenation, (c) Two-Level Rejuvenation, (d) Outperform Ratio of Two-Level Rejuvenation.

Fig. 20. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation (a = 0.0001, ΦC = 5T). Panels: (a) Cold Rejuvenation, (b) Outperform Ratio of Cold Rejuvenation, (c) Two-Level Rejuvenation, (d) Outperform Ratio of Two-Level Rejuvenation.

Fig. 21. Deadline Miss Rate Comparison between Syn. Rejuvenation and Asyn. Rejuvenation (a = 0.0001, ΦC = 10T). Panels: (a) Cold Rejuvenation, (b) Outperform Ratio of Cold Rejuvenation, (c) Two-Level Rejuvenation, (d) Outperform Ratio of Two-Level Rejuvenation.

It is clear that two-level rejuvenation strategies always have a lower average deadline miss rate than the cold rejuvenation strategies. However, when the resource degradation rate decreases, asynchronized rejuvenation strategies always outperform synchronized rejuvenation strategies in terms of average deadline miss rate. When the resource degradation rate is a = 0.001 and the task utilization is small (less than 0.4), asynchronized rejuvenation strategies perform much better than synchronized rejuvenation strategies. As Fig. ??, Fig. ?? and Fig. ?? indicate, when only the cold rejuvenation strategy is applied, a few cases with synchronized rejuvenation strategies have a lower deadline miss rate than those with asynchronized rejuvenation strategies. However, when two-level rejuvenation is applied, all simulations with asynchronized rejuvenation strategies have a lower deadline miss rate than those with synchronized rejuvenation strategies. When the resource degradation rate decreases to 0.0001, the average deadline miss rates of the synchronized and asynchronized rejuvenation strategies are almost at the same level, but asynchronized rejuvenation strategies still outperform synchronized rejuvenation strategies.
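The utilization ceiling reported for a = 0.01 can be checked with a short calculation. The following is a minimal sketch, assuming that performance degrades linearly as P(t) = max(0, 1 - a·t) from full performance 1 and that no rejuvenation restores the resource within the period. The function names are illustrative and do not come from the paper.

```python
def supply_until_exhaustion(a: float) -> float:
    """Total work delivered from t = 0 until performance hits 0.

    Assumes P(t) = max(0, 1 - a*t); the area under that line is a
    triangle with base 1/a and height 1.
    """
    return 1.0 / (2.0 * a)


def max_utilization(a: float, period: float) -> float:
    """Upper bound on execution time / period for one task instance.

    Assumes the resource starts the period at full performance and is
    not rejuvenated before the period ends (illustrative simplification).
    """
    time_to_zero = 1.0 / a
    if period <= time_to_zero:
        # Resource is still delivering work at the end of the period:
        # integrate 1 - a*t over [0, period].
        supply = period - a * period ** 2 / 2.0
    else:
        # Resource is exhausted before the period ends.
        supply = supply_until_exhaustion(a)
    return supply / period


if __name__ == "__main__":
    print(max_utilization(0.01, 100.0))  # 0.5
    print(max_utilization(0.01, 200.0))  # 0.25
```

Under these assumptions a resource with a = 0.01 delivers at most 50 units of work before its performance reaches 0 at time 100, so a task whose period is 100 or longer cannot have utilization above 0.5, which agrees with the observation that no task above 50% utilization meets its deadline.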
The experimental results overturn the initial hypothesis that synchronized rejuvenation strategies minimize the task’s deadline miss rate. On the contrary, in most of the scenarios, asynchronized rejuvenation strategies, i.e., starting rejuvenation immediately after the last task instance that can meet its deadline finishes its execution, perform better at minimizing the task’s deadline miss rate.

6 CONCLUSION

To combat resource performance degradation due to software aging, we have extended our previous resource with performance degradation and periodic rejuvenation (P2-resource) model by using a two-level rejuvenation strategy to maintain resource performance. Based on the extended resource model, we have formally analyzed the resource supply function and presented the MAX-PERFORMANCE algorithm to determine the optimal rejuvenation pattern that maximizes the average resource performance. When a given real-time task is deployed on the resource, maximized average resource performance does not necessarily guarantee that the task’s deadline miss rate is minimized. Hence, the second contribution of the paper is the design of a dynamic two-level rejuvenation strategy that minimizes the task’s deadline miss rate when a real-time periodic task is deployed on the resource. The extensive simulation results show that with a two-level rejuvenation strategy, we can achieve 25.22% higher average resource performance compared with a single-level rejuvenation strategy. In addition, with a two-level rejuvenation strategy, a task’s deadline miss rate is always lower than the deadline miss rate when a single rejuvenation strategy is applied. The experimental results also show that the asynchronized rejuvenation strategy outperforms the synchronized rejuvenation strategy in most of the scenarios. The paper focuses on minimizing the deadline miss rate for a single real-time periodic task.
However, when multiple tasks are deployed on the resource, the current two-level rejuvenation strategy may no longer minimize the deadline miss rate. Our future work is to analyze task schedulability and study the optimal rejuvenation pattern for a given task set, with the goal of maximizing the task set’s schedulability and minimizing its deadline miss rate.

REFERENCES

[1] J. Alonso, R. Matias, E. Vicente, A. Maria, and K. Trivedi. A comparative experimental study of software rejuvenation overhead. Performance Evaluation, 70(3):231–250, 2013. Special Issue on Software Aging and Rejuvenation.
[2] J. R. Black. Electromigration - a brief survey and some recent results. Electron Devices, IEEE Transactions on, 16(4):338–347, 1969.
[3] S. Garg, A. van Moorsel, K. Vaidyanathan, and K. S. Trivedi. A methodology for detection and estimation of software aging. In Software Reliability Engineering, 1998. Proceedings. The Ninth International Symposium on, pages 283–292. IEEE, 1998.
[4] M. Grottke, R. Matias, and K. Trivedi. The fundamentals of software aging. In Software Reliability Engineering Workshops, 2008. ISSRE Wksp 2008. IEEE International Conference on, pages 1–6, Nov 2008.
[5] C. Guo, X. Hua, H. Wu, D. Lautner, and S. Ren. Best-harmonically-fit periodic task assignment algorithm on multiple periodic resources. IEEE Transactions on Parallel and Distributed Systems, pp:1, 2015.
[6] C. Guo, H. Wu, X. Hua, D. Lautner, and S. Ren. Use two-level rejuvenation to combat software aging and maximize average resource performance. In High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on, pages 1160–1165. IEEE, 2015.
[7] C. Guo, H. Wu, X. Hua, S. Ren, and J. Nogiec. Maximize system reliability for long lasting and continuous applications. In New Contributions in Information Systems and Technologies, volume 353 of Advances in Intelligent Systems and Computing, pages 603–612. Springer International Publishing, 2015.
[8] R. Hanmer and V. Mendiratta. Rejuvenation with workload migration. In Dependable Systems and Networks Workshops (DSN-W), 2010 International Conference on, pages 80–85, June 2010.
[9] Y. Hong, D. Chen, L. Li, and K. S. Trivedi. Closed loop design for software rejuvenation. In Workshop on Self-Healing, Adaptive, and Self-Managed Systems, 2002.
[10] X. Hua, C. Guo, H. Wu, and S. Ren. Schedulability analysis for real-time task set on resource with performance degradation and periodic rejuvenation. In Embedded and Real-Time Computing Systems and Applications (RTCSA), 2015 IEEE 21st International Conference on, Aug 2015.
[11] Y. Huang, C. Kintala, N. Kolettis, and N. Fulton. Software rejuvenation: analysis, module and applications. In Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on, pages 381–390, June 1995.
[12] V. Koutras. Two-level software rejuvenation model with increasing failure rate degradation. In Dependable Computer Systems, volume 97 of Advances in Intelligent and Soft Computing, pages 101–115. Springer Berlin Heidelberg, 2011.
[13] V. Koutras and A. Platis. Semi-markov availability modeling of a redundant system with partial and full rejuvenation actions. In Dependability of Computer Systems, 2008. DepCos-RELCOMEX ’08. Third International Conference on, pages 127–134, June 2008.
[14] V. Koutras and A. Platis. Applying partial and full rejuvenation in different degradation levels. In Software Aging and Rejuvenation (WoSAR), 2011 IEEE Third International Workshop on, pages 20–25, Nov 2011.
[15] V. Koutras, A. Platis, and N. Limnios. Availability and reliability estimation for a system undergoing minimal, perfect and failed rejuvenation. In Software Reliability Engineering Workshops, 2008. ISSRE Wksp 2008. IEEE International Conference on, pages 40–45, Nov 2008.
[16] H. Okamura and T. Dohi. Availability optimization in operational software system with aperiodic time-based software rejuvenation scheme. In Software Reliability Engineering Workshops, 2008. ISSRE Wksp 2008. IEEE International Conference on, pages 22–27, Nov 2008.
[17] D. L. Parnas. Software aging. In Proceedings of the 16th International Conference on Software Engineering, ICSE ’94, pages 279–287, Los Alamitos, CA, USA, 1994. IEEE Computer Society Press.
[18] A. Sadek and N. Limnios. Nonparametric estimation of reliability and survival function for continuous-time finite markov processes. Journal of Statistical Planning and Inference, 133(1):1–21, 2005.
[19] A. Tai, S. Chau, L. Alkalaj, and H. Hecht. On-board preventive maintenance: analysis of effectiveness and optimal duty period. In Object-Oriented Real-Time Dependable Systems, 1997. Proceedings., Third International Workshop on, pages 40–47, Feb 1997.
[20] K. Trivedi, K. Vaidyanathan, and K. Goseva-Popstojanova. Modeling and analysis of software aging and rejuvenation. In Simulation Symposium, 2000. (SS 2000) Proceedings. 33rd Annual, pages 270–279, 2000.
[21] W. Xie, Y. Hong, and K. Trivedi. Analysis of a two-level software rejuvenation policy. Reliability Engineering & System Safety, 87(1):13–22, 2005.

Hao Wu is now a Ph.D. candidate in the Computer Science Department at Illinois Institute of Technology. He received a B.E. in Information Security from Sichuan University, Chengdu, China, in 2007, and an M.S. in Computer Science from the University of Bridgeport, Bridgeport, CT, in 2009. His current research interests mainly focus on cloud computing, real-time distributed open systems, Cyber-Physical Systems, parallel and distributed systems, and real-time applications.

Chunhui Guo is now a Ph.D. candidate in the Computer Science Department at Illinois Institute of Technology. He earned his BSEE and MSEE from Shandong University, China, in 2010 and 2013, respectively. His current research interests mainly focus on real-time systems and Cyber-Physical Systems.

Xiayu Hua is a Ph.D. student in the Computer Science Department at Illinois Institute of Technology. His research interests are in distributed file systems, virtualization technology, real-time scheduling and cloud computing. He earned his B.S. degree from the Northwestern Polytechnic University, China, in 2008 and his M.S. degree from the East China Normal University, China, in 2012.

Igor Lopes is an exchange student majoring in Computer Science at the University of Idaho.
He is part of the Science without Borders Program, sponsored by the Brazilian Government. His fields of interest are Software Development and Software Engineering.

Dr. Shangping Ren is an associate professor in the Computer Science Department at the Illinois Institute of Technology. She earned her Ph.D. from UIUC in 1997. Before she joined IIT in 2003, she worked in software and telecommunication companies as a software engineer and then as a lead software engineer. Her current research interests include coordination models for real-time distributed open systems, real-time, fault-tolerant and adaptive systems, Cyber-Physical Systems, parallel and distributed systems, cloud computing, and application-aware many-core virtualization for embedded and real-time applications.