This document presents a new scheduling algorithm called ITS-RT that uses information theory principles to schedule real-time tasks. ITS-RT selects the task with the highest amount of information per studied interval to schedule. It is shown to have slightly better performance than EDF in terms of average number of context switches and preemptions for the task sets studied. The document defines information theory concepts used in ITS-RT like probability, entropy, and information of tasks. It also presents the design, feasibility analysis, and performance comparison of ITS-RT against EDF.
INTRODUCTION
This paper presents a scheduling solution based on information
theory principles to schedule real-time tasks. We provide the
mathematical background for using information as a parameter in
real-time systems, as well as the relationship between information
and utilization. We present a new dynamic-priority scheduling
solution that selects the task with the highest amount of information
per studied interval. We then give a feasibility analysis of the solution
and compare its performance against the Earliest Deadline First
(EDF) scheduling algorithm, using the number of context switches
and the number of preemptions as dependent variables.
CONTRIBUTION OF THIS PAPER
• Presenting the mathematical background to measure the
information of a task set in a real-time system as well as
proposing the relationship between information theory and
utilization.
• Presenting the design, feasibility analysis and implementation of a
solution based on information theory principles to
schedule real-time tasks.
• Comparing the performance of our scheduling solution
against EDF using as metrics the number of context switches
and number of preemptions.
RELATED WORK
Entropy has been used as a new governing parameter in real-time distributed
systems to replace utilization and to implement a task migration
technique that selects where to execute a task in a
multiprocessor environment.
Three advantages are given for using entropy instead of utilization:
(a) Utilization is dimensionless (it has no units) because it is a ratio,
while entropy is expressed in an information unit. Entropy is therefore
more suitable for comparison with other parameters expressed
in an information unit, such as memory.
(b) Entropy scales better
because, being expressed in an information unit, it can be used
for both uniprocessors and multiprocessors.
(c) Entropy gives more information than the utilization
factor because it also describes the used
and empty space on the system.
USING INFORMATION THEORY TO SCHEDULE
REAL-TIME TASKS
We consider a periodic, independent, synchronous-release task
system with m tasks and implicit deadlines (di = pi for 1 ≤ i ≤ m), where
task i has release time ri, computation time ci, deadline di, and
period pi. The hyperperiod (hperiod) of the system is the least common
multiple of the periods, and the utilization of the task set is
$U = \sum_{i=1}^{m} c_i / p_i$ [2]. We apply information-theoretic concepts to
define the following parameters:
Definition 1: Probability of a task P task
It is the probability of execution of the task during the
hyper-period represented by the ratio between the computation
time (ci) and the period (pi) of the task.
Ptaski= ci / pi (1)
Definition 2: Normalized Probability of a task NP task
It is the probability of a task normalized by the sum of all the tasks’
probabilities. If U < 1 we use this value to account for the idle task, and
if U > 1 we use it to consider the behavior of the task set per
unit of time.
$NP_{task_i} = \frac{c_i / p_i}{\sum_{j=1}^{m} c_j / p_j}$ (2)
Definition 3 : Information of a Task Itaski
Using Shannon's definition of information, we define the
information of a task as the information generated by the task during
an instant of time (Itinstanti) multiplied by the computation time of
the task.
Itinstanti = log2 (1 / NPtaski) (3)
Then using equation (3), we have:
Itaski = ci ∗ Itinstanti (4)
Definition 4: Entropy of the system Hsys
Entropy is defined as the expected average size of a time instant in
the scheduling diagram. Based on equations (2) and (3) we have:
$H_{sys} = \sum_{i=1}^{m} NP_{task_i} \cdot I_{tinstant_i}$ (5)
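Definitions 1 to 4 can be evaluated directly for a concrete task set. The following is a minimal Python sketch, not the authors' implementation; the function names are illustrative, and the (ci, pi) values are taken from the three-task example (J1, J2, J3) used later in this document. Exact fractions are used for the probabilities so that the normalized values can be checked to sum to 1.

```python
from fractions import Fraction
from math import log2

# (c_i, p_i) pairs; values taken from the three-task example (J1, J2, J3).
TASKS = [(1, 3), (2, 4), (1, 12)]

def normalized_probabilities(tasks):
    """Equations (1) and (2): P_i = c_i / p_i, NP_i = P_i / sum of all P."""
    probs = [Fraction(c, p) for c, p in tasks]
    total = sum(probs)
    return [p / total for p in probs]

def instant_information(tasks):
    """Equation (3): information of one time instant of each task, in bits."""
    return [log2(float(1 / np_i)) for np_i in normalized_probabilities(tasks)]

def task_information(tasks):
    """Equation (4): I_task_i = c_i * (information per instant)."""
    return [c * info for (c, _), info in zip(tasks, instant_information(tasks))]

def system_entropy(tasks):
    """Equation (5): H_sys = sum of NP_i * (information per instant)."""
    return sum(float(np_i) * info
               for np_i, info in zip(normalized_probabilities(tasks),
                                     instant_information(tasks)))

if __name__ == "__main__":
    print(normalized_probabilities(TASKS))  # exact fractions that sum to 1
    print(task_information(TASKS))
    print(system_entropy(TASKS))
```

For this task set the normalized probabilities are 4/11, 6/11 and 1/11, so the task with the smallest share of the processor (J3) carries the most information per time instant.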
Definition 5: Total Information of the task set T Its
It is the amount of information needed to represent all the
tasks in the task set, based on Itaski and the number of instances of
each task (ni).
ni = hperiod / pi
Then using equation (4), we have:
$T\,I_{ts} = \sum_{i=1}^{m} n_i \cdot I_{task_i}$ (6)
Definition 6: Total Information of the scheduling diagram T Isched
It is the amount of information needed to represent
the scheduling diagram, based on the entropy of the system and
the hyper-period.
T Isched = hperiod ∗ Hsys (7)
Relationship between Information Theory and Utilization:
Based on previous definitions we can relate the ratio between the
total information of the task set (T Its) and the total information of
the scheduling diagram (T Isched) to the utilization of the task set (U).
Using equations (6) and (7), we define the information ratio of the
system (IRsys) as:
IRsys =T Its / T Isched (8)
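From equation (8) we show that IRsys is equal to the
utilization of the task set. Substituting equations (6) and (7) and
using ci / pi = U ∗ NPtaski, the entropy terms cancel:
$IR_{sys} = \sum_{i=1}^{m} \frac{c_i}{p_i} = U$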
This equation allows us to conclude that IRsys can be
used as a metric to determine the schedulability condition of
a scheduling solution based on information theory principles.
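The equality between IRsys and U can be checked numerically. The sketch below is an illustration under stated assumptions: the function name `information_ratio` is hypothetical, tasks are given as (ci, pi) integer pairs, and at least two tasks are required (with a single task Hsys is 0 and the ratio in equation (8) is undefined).

```python
from fractions import Fraction
from functools import reduce
from math import gcd, isclose, log2

def information_ratio(tasks):
    """Return (IR_sys, U) for a list of (c_i, p_i) pairs with at least two tasks."""
    utilization = sum(Fraction(c, p) for c, p in tasks)
    np_task = [Fraction(c, p) / utilization for c, p in tasks]       # eq. (2)
    instant_info = [log2(float(1 / x)) for x in np_task]              # eq. (3)
    h_sys = sum(float(x) * i for x, i in zip(np_task, instant_info))  # eq. (5)
    # Hyperperiod: least common multiple of the periods.
    hperiod = reduce(lambda a, b: a * b // gcd(a, b), (p for _, p in tasks))
    # eq. (6): n_i = hperiod / p_i instances, each carrying c_i * instant_info.
    total_task_info = sum((hperiod // p) * c * i
                          for (c, p), i in zip(tasks, instant_info))
    total_sched_info = hperiod * h_sys                                # eq. (7)
    return total_task_info / total_sched_info, float(utilization)    # eq. (8)

if __name__ == "__main__":
    ir_sys, u = information_ratio([(1, 3), (2, 4), (1, 12)])
    print(ir_sys, u, isclose(ir_sys, u))
```

The two returned values agree up to floating-point rounding, including for overloaded task sets with U > 1.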
INFORMATION THEORY BASED SCHEDULING
SOLUTION FOR REAL-TIME SYSTEMS
(ITS-RT)
The heuristic that is used to assign the priorities of the tasks is
represented by the amount of information that must be contained in
a studied time interval. This interval is represented by the closest
deadline that the system must meet based on the current scheduling
time t.
The amount of information per studied interval is equal
to Itaski,dt = ci ∗ Hsys ∗ dt / di, where dt is the closest
deadline to the scheduling time t and di is the deadline of task
i. We use the amount of information per studied
interval (Itaski,dt) instead of the information of a task (Itaski)
because of the discrete nature of the scheduling problem.
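As a rough illustration of this priority heuristic, the sketch below computes the information per studied interval and picks the task with the largest value. The function names and the tie-breaking rule (lowest index wins) are assumptions, and the sketch does not track which tasks are ready or how much work they have left, which the text does not specify. Since Hsys is a positive factor common to all tasks, any positive value gives the same ranking.

```python
def interval_information(c_i, d_i, d_t, h_sys):
    """Information per studied interval: c_i * H_sys * d_t / d_i."""
    return c_i * h_sys * d_t / d_i

def select_task(tasks, d_t, h_sys):
    """Index of the task with the highest information for the interval d_t.

    tasks is a list of (c_i, d_i) pairs; ties go to the lowest index
    (an assumption, not stated in the source).
    """
    scores = [interval_information(c, d, d_t, h_sys) for c, d in tasks]
    return max(range(len(tasks)), key=lambda i: scores[i])

if __name__ == "__main__":
    # (c_i, d_i) of J1, J2, J3 from the example; closest deadline at t = 0 is 3.
    print(select_task([(1, 3), (2, 4), (1, 12)], d_t=3, h_sys=1.0))
```

With these inputs the scores are proportional to 1, 1.5 and 0.25, so index 1 (J2) is selected, whereas EDF would start with J1 because it has the earliest deadline.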
A. Algorithm’s Design
ITS-RT uses the amount of information per studied interval
(Itaski,dt) as the parameter to assign the priorities of the tasks.
In order to minimize the number of context switches and
the number of preemptions, we use the following algorithm:
1. Select the task i with the highest amount of information
for the studied interval (Itaski,dt).
2. Select the next scheduling point as follows:
if the computation time ci of the selected task is less than
$d_t - \sum_{k=1, k \neq i}^{m} \lceil c_k \cdot d_t / d_k \rceil$, then the next scheduling point is the
current time plus ci. Otherwise, the next scheduling point is the current
time plus $d_t - \sum_{k=1, k \neq i}^{m} \lceil c_k \cdot d_t / d_k \rceil$.
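Step 2 of the algorithm can be sketched as a small function. This is an illustrative reading of the rule, not the authors' code: the name `next_scheduling_point` is hypothetical, all parameters are assumed to be positive integers, and dt is treated as a length that is added to the current time, as the rule itself does.

```python
def next_scheduling_point(t, selected, tasks, d_t):
    """Next scheduling point after choosing task `selected` at time t.

    tasks is a list of (c_k, d_k) positive-integer pairs and d_t is the
    studied interval (assumed here to be a length added to t).
    """
    c_i = tasks[selected][0]
    # Sum of ceil(c_k * d_t / d_k) over every task except the selected one,
    # using integer arithmetic for the ceiling.
    reserved = sum((c_k * d_t + d_k - 1) // d_k
                   for k, (c_k, d_k) in enumerate(tasks) if k != selected)
    slack = d_t - reserved
    # Run the selected task to completion only if it fits within the slack.
    return t + c_i if c_i < slack else t + slack

if __name__ == "__main__":
    # J2 selected at t = 0 with d_t = 3 in the three-task example.
    print(next_scheduling_point(0, 1, [(1, 3), (2, 4), (1, 12)], 3))
```

In the example call the other two tasks reserve one time unit each, leaving a slack of 1, so the scheduler is invoked again at time 1 rather than after the full computation time of J2.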
B. Feasibility Analysis
Our solution uses an event-driven method to calculate the next
scheduling point and selects the task with the highest amount of
information for the studied interval (Itaski,dt). Assume that
we have a task set S with:
(a) implicit deadlines, and
(b) $U = \sum_{i=1}^{m} c_i / p_i \le 1$.
We say that S is not schedulable by
our solution if, for any deadline dt, the sum of the amount of
information for all the scheduled tasks in the studied interval
is greater than the amount of information used to represent the
interval (dt).
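Mathematically, we have:
$\sum_{i=1}^{m} I_{task_i, d_t} > d_t \cdot H_{sys}$, then:
$\sum_{i=1}^{m} \frac{c_i \cdot H_{sys} \cdot d_t}{p_i} > d_t \cdot \sum_{i=1}^{m} NP_{task_i} \cdot H_{sys}$
$d_t \cdot \sum_{i=1}^{m} H_{sys} \cdot \frac{c_i}{p_i} > d_t \cdot \sum_{i=1}^{m} \frac{c_i / p_i}{\sum_{j=1}^{m} c_j / p_j} \cdot H_{sys}$
$\sum_{i=1}^{m} H_{sys} \cdot \frac{c_i}{p_i} > \frac{1}{\sum_{j=1}^{m} c_j / p_j} \cdot \sum_{i=1}^{m} H_{sys} \cdot \frac{c_i}{p_i}$
Then: $\sum_{i=1}^{m} \frac{c_i}{p_i} > 1$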
This result shows that task set S is not schedulable by our
solution only if U of the studied task set is greater than 100%.
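The condition can also be tested numerically for a given task set. The sketch below is a hedged illustration with a hypothetical function name; it assumes implicit deadlines (di = pi), at least two tasks, and floating-point arithmetic, so a task set with U exactly equal to 1 may be affected by rounding.

```python
from math import log2

def exceeds_interval_information(tasks, d_t):
    """True if the summed information per interval exceeds d_t * H_sys.

    tasks is a list of (c_i, p_i) pairs with implicit deadlines.
    """
    u = sum(c / p for c, p in tasks)
    # H_sys = sum of NP_i * log2(1 / NP_i), with NP_i = (c_i / p_i) / U.
    h_sys = sum((c / p / u) * log2(u / (c / p)) for c, p in tasks)
    demanded = sum(c * h_sys * d_t / p for c, p in tasks)
    return demanded > d_t * h_sys

if __name__ == "__main__":
    print(exceeds_interval_information([(1, 3), (2, 4), (1, 12)], 3))  # U = 11/12
    print(exceeds_interval_information([(2, 3), (3, 4)], 3))           # U = 17/12
```

The first call reports that the bound holds and the second that it is exceeded, in line with the result that the bound is violated only when U is greater than 100%.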
C. Executing the ITS-RT algorithm
Given the following example (with implicit deadlines) consisting of 3
tasks (J1, J2 and J3) with the following parameters (ri, ci, di, pi):
J1= (0,1,3,3), J2= (0,2,4,4), J3= (0,1,12,12)
We show the schedules derived by the ITS-RT algorithm
(Figure 1), and the EDF algorithm (Figure 2):
For this example, both ITS-RT and EDF generate 9 context
switches and 0 preemptions (but the schedules are different).
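The EDF side of this example can be regenerated with a small unit-time simulation. This is a sketch under stated assumptions rather than the setup used in the paper: time advances in integer slots, tasks are (ri, ci, di, pi) tuples with synchronous release, and deadline ties are broken in favour of the task that is already running, then the lower index. It returns the per-slot schedule only; it does not count context switches, since the counting convention is not given in the text.

```python
from functools import reduce
from math import gcd

def edf_timeline(tasks):
    """Per-slot EDF schedule over one hyperperiod.

    tasks is a list of (r, c, d, p) integer tuples. Each entry of the
    result is the index of the task running in that slot, or None if idle.
    """
    hyper = reduce(lambda a, b: a * b // gcd(a, b), (p for _, _, _, p in tasks))
    remaining = [0] * len(tasks)   # unfinished work of each task's current job
    deadline = [0] * len(tasks)    # absolute deadline of each task's current job
    timeline = []
    running = None
    for t in range(hyper):
        for i, (r, c, d, p) in enumerate(tasks):
            if t >= r and (t - r) % p == 0:   # a new job of task i is released
                remaining[i] = c
                deadline[i] = t + d
        ready = [i for i in range(len(tasks)) if remaining[i] > 0]
        if ready:
            # Earliest deadline first; ties favour the running task, then index.
            running = min(ready, key=lambda i: (deadline[i], i != running, i))
            remaining[running] -= 1
        else:
            running = None
        timeline.append(running)
    return timeline

if __name__ == "__main__":
    print(edf_timeline([(0, 1, 3, 3), (0, 2, 4, 4), (0, 1, 12, 12)]))
```

For J1, J2 and J3 this yields the slot sequence J1, J2, J2, J1, J2, J2, J1, J3, J2, J2, J1, idle over the hyperperiod of 12, with no job interrupted before it completes.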
PERFORMANCE COMPARISON BETWEEN ITS-RT AND EDF
A. Methods
ITS-RT is evaluated by comparing its performance against the EDF
algorithm. The dependent variables are the number of context
switches and the number of preemptions; the independent variables are
the utilization (70%, 80%, 90% and 100%) and the number of tasks per
task set (2, 3, 4 or 5 tasks). The hyper-period is fixed at 40 time
instants to minimize the effect that this parameter has on
the selected dependent variables.
We used a linear programming solution to select the periods for the
tasks from all the possible factors of the hyper-period (making sure
that 40 was always selected) and we randomly generated the
computation time (ci) of each task to meet the selected utilization for
the task set.
B. Results
• Number of Context Switches: Tables I and II compare the
average number of context switches for ITS-RT and EDF
by utilization and by number of tasks per task set,
respectively.
• Number of Preemptions: Tables III and IV compare the
average number of preemptions for ITS-RT and EDF
by utilization and by number of tasks per task set,
respectively.
• Similarity Ratio: We define the similarity ratio as the number of similar schedules
between ITS-RT and EDF per test file. After comparing the
scheduling diagrams (100 task sets per test file) generated by
the studied algorithms, we analyzed the results by the
independent variables: utilization and number of tasks per task set.
CONCLUSION
• Based on the average number of context switches, ITS-RT shows an
improvement of 1.1384% over EDF for the studied task sets.
• Based on the average number of preemptions, ITS-RT shows an
improvement of 2.0428% over EDF for the studied task sets.
• The similarity ratio between the two algorithms was
42.68%.
• Even though the performance increase shown by ITS-RT can be seen
as minimal, we think that, because of the additional information
our solution provides to the scheduler, our approach can
perform better in a multiprocessor environment, where EDF
is no longer optimal.