Many machine-learning inference workloads compute predictions with a limited number of models deployed together in the same system. These models often share common structure and state, which leaves significant room for runtime and memory optimizations. Current systems fall short of exploiting this because they treat ML models and tasks as black boxes and are therefore unaware of optimization and sharing opportunities.
By contrast, Pretzel adopts a white-box description of ML models, which allows the framework to optimize across deployed models and running tasks, saving memory and improving overall system performance. In this talk we will present the motivation behind Pretzel, its current design, and possible future developments.
Elementary Parallel Algorithms - sum of n numbers on Hypercube, Shuffle-Exchange, and Mesh SIMD computers and on UMA multiprocessors; broadcasting and prefix sum on multicomputers.
In today's world, developers face the problem of writing high-performance algorithms that scale efficiently across a range of multi-core processors. Traditional blocked algorithms must be tuned to each processor, but the discovery of cache-oblivious algorithms gives developers new tools to tackle this emerging challenge. In this talk you will learn about the external-memory model, the cache-oblivious model, and how to use these tools to create faster, scalable algorithms.
Ukrainian Catholic University
Faculty of Applied Sciences
Data Science Master Program
January 21st
Abstract. The maritime industry is vast and involves many complex processes, since it carries most of the world's goods transportation. During transportation, a crew serves each vessel, which raises the problem of optimally distributing crew across vessels. The problem can be formalized as an integer program, but in practice solving it is time-consuming because of the large number of free variables, making the solution impractical for end users. In this work, we describe an approach that speeds up crew optimization for the maritime industry using the Rolling Time Horizon technique. Our approach is 3.5 times faster than the benchmark and deviates from the optimal solution by less than 1%.
Google BACAT talk.
Abstract: Dialectica categories (also known as dialectica spaces), the main construction of my PhD thesis, have had several (unrelated) applications. I've used them to model Linear Logic, FILL (Full Intuitionistic Linear Logic), the Lambek Calculus, and classes of Petri nets (with C. Brown and D. Gurr). They were also used to model state in programming language semantics, after U. Reddy (with M. Correa and H. Hausler), fuzzy Petri nets (with A. Syropoulos), and several 'superpower games' (A. Blass). Recently Mihai Budiu, Joel Galenson and Gordon Plotkin used dialectica categories in the modelling of partial compilers. I want to discuss this application, presented in the preprint "The Compiler Forest" (#90), available from Plotkin's webpage, to see if I understand it. Since I know little about compilers, audience participation will be very welcome!
Optimized Assignment of Independent Task for Improving Resources Performance ... (ijgca)
Grid computing has emerged from the category of distributed and parallel computing in which heterogeneous resources from different networks are used simultaneously to solve a particular problem that needs a huge amount of resources. The potential of grid computing depends on many issues, such as security of resources, heterogeneity of resources, fault tolerance, resource discovery, and job scheduling. Scheduling is one of the core steps in efficiently exploiting the capabilities of heterogeneous distributed computing resources and is an NP-complete problem. To achieve the promising potential of grid computing, an effective and efficient job scheduling algorithm is proposed that optimizes two important criteria for improving resource performance: makespan and resource utilization. In addition, we classify various task scheduling heuristics in the grid on the basis of their characteristics.
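As a hypothetical illustration (not code from the paper), the two criteria named above can be computed for a given assignment of independent tasks to grid resources; the task names, resource names, and execution times below are made up:

```python
def makespan_and_utilization(assignment, exec_time):
    """assignment: task -> resource; exec_time[(task, resource)] -> seconds."""
    finish = {}  # total busy time accumulated on each resource
    for task, res in assignment.items():
        finish[res] = finish.get(res, 0.0) + exec_time[(task, res)]
    makespan = max(finish.values())
    # average utilization: busy time of each resource relative to the makespan
    utilization = sum(finish.values()) / (len(finish) * makespan)
    return makespan, utilization

# Two tasks on r1 (3 + 4 = 7) and one on r2 (5): makespan 7, utilization 12/14
assignment = {"t1": "r1", "t2": "r1", "t3": "r2"}
exec_time = {("t1", "r1"): 3.0, ("t2", "r1"): 4.0, ("t3", "r2"): 5.0}
print(makespan_and_utilization(assignment, exec_time))  # (7.0, 0.857...)
```

A scheduler optimizing both criteria would search over assignments, trading a shorter makespan against keeping every resource busy.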
Transfer Learning for Performance Analysis of Configurable Systems: A Causal ... (Pooyan Jamshidi)
Modern systems (e.g., deep neural networks, big data analytics, and compilers) are highly configurable, which means they expose different performance behavior under different configurations. The fundamental challenge is that one cannot simply measure all configurations due to the sheer size of the configuration space. Transfer learning has been used to reduce the measurement efforts by transferring knowledge about performance behavior of systems across environments. Previously, research has shown that statistical models are indeed transferable across environments. In this work, we investigate identifiability and transportability of causal effects and statistical relations in highly-configurable systems. Our causal analysis agrees with previous exploratory analysis [Jamshidi17] and confirms that the causal effects of configuration options can be carried over across environments with high confidence. We expect that the ability to carry over causal relations will enable effective performance analysis of highly-configurable systems.
VaMoS 2022 - Transfer Learning across Distinct Software Systems (Luc Lesoil)
Many research studies predict the performance of configurable software using machine learning techniques, thus requiring large amounts of data. Transfer learning aims to reduce the amount of data needed to train these models and has been successfully applied across different execution environments (hardware) or software versions. In this paper we investigate, for the first time, the idea of applying transfer learning between distinct configurable systems. We design a study involving two video encoders (namely x264 and x265) coming from different code bases. Our results are encouraging, since transfer learning outperforms traditional learning for two performance properties (out of three). We discuss the open challenges to overcome for a more general application.
A Framework and Methods for Dynamic Scheduling of a Directed Acyclic Graph on... (IDES Editor)
The data-flow model is gaining popularity as a programming paradigm for multi-core processors. Efficient scheduling of an application modeled by a Directed Acyclic Graph (DAG) is a key issue when performance is very important. A DAG represents a computational solution in which the nodes represent tasks to be executed and the edges represent precedence constraints among the tasks. The task scheduling problem in general is NP-complete [2]. Several static scheduling heuristics have been proposed, but the major problems in static list scheduling are the inherent difficulty of exactly estimating task and edge costs in a DAG and its inability to cope with the runtime behavior of tasks. This underlines the need for dynamic scheduling of a DAG. This paper presents how dynamic scheduling of a DAG can be done in general and proposes four simple methods to perform it. These methods have been simulated and evaluated using a representative set of DAG-structured computations from both synthetic and real problems. The proposed dynamic scheduler's performance is found to be comparable with that of static scheduling methods. A performance comparison of the proposed dynamic scheduling methods is also carried out.
Parallel Implementation of K-Means Clustering on CUDA (prithan)
K-Means clustering is a popular clustering algorithm in data mining. Clustering large data sets can be time-consuming, and in an attempt to minimize this time, our project is a parallel implementation of the K-Means clustering algorithm on CUDA using C. We present the performance analysis and implementation of our approach to parallelizing K-Means clustering.
Scheduling Using Multi Objective Genetic Algorithm (iosrjce)
It is rather surprising that in software engineering, standard measurement units have yet to be widely accepted and used; every other engineering discipline has its own. By and large, effort is the most commonly used parameter for measuring software initiatives. The problem, of course, is that effort is not an independent variable: it depends on who is doing the work and how it is done. This presentation looks at an approach that has been used to convert the large amount of effort data usually collected in an organization into something that can meaningfully be used for estimation and comparison purposes.
SWARM INTELLIGENCE SCHEDULING OF SOFT REAL-TIME TASKS IN HETEROGENEOUS MULTIP... (ecij)
In this paper, a hybrid swarm intelligence algorithm (named VNABCSA) is presented for scheduling non-preemptive soft real-time tasks on heterogeneous multiprocessor platforms. The method combines the artificial bee colony and simulated annealing algorithms. The multi-objective function of the VNABCSA algorithm is defined to minimize the total tardiness of all tasks, the total number of utilized processors, the total completion time, the total waiting time of all tasks, and the total waiting time of all processors. We introduce a hybrid variable neighborhood search strategy to improve the convergence speed of the algorithm. Simulation results demonstrate the efficiency of the proposed methodology compared with existing scheduling algorithms.
Multiprocessor scheduling of dependent tasks to minimize makespan and reliabi... (ijfcstjournal)
Algorithms developed for scheduling applications on heterogeneous multiprocessor systems focus on a single objective such as execution time, cost, or total data transmission time. However, if more than one objective (e.g., execution cost and time, which may be in conflict) is considered, the problem becomes more challenging. This project develops a multiobjective scheduling algorithm using evolutionary techniques for scheduling a set of dependent tasks on the available resources in a multiprocessor environment, minimizing both makespan and reliability cost. A Non-dominated Sorting Genetic Algorithm-II (NSGA-II) procedure has been developed to obtain the Pareto-optimal solutions. NSGA-II is an elitist evolutionary algorithm: it carries the parental solutions unchanged into every iteration to avoid losing Pareto-optimal solutions, and it uses the crowding-distance concept to maintain diversity among the solutions.
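The crowding-distance measure that NSGA-II uses for diversity can be sketched as follows. This is my own illustration of the standard computation, not the paper's code: boundary solutions on each objective get infinite distance, and interior solutions accumulate the normalized gap between their neighbors.

```python
def crowding_distance(front):
    """front: list of objective vectors, all in the same non-dominated front."""
    n = len(front)
    if n == 0:
        return []
    m = len(front[0])
    dist = [0.0] * n
    for k in range(m):  # one pass per objective
        order = sorted(range(n), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")  # keep boundary points
        if hi == lo:
            continue
        for j in range(1, n - 1):  # interior points: normalized neighbor gap
            i = order[j]
            dist[i] += (front[order[j + 1]][k] - front[order[j - 1]][k]) / (hi - lo)
    return dist

# Boundary solutions score infinity; more crowded interior ones score lower.
print(crowding_distance([(1, 5), (2, 3), (4, 1)]))  # [inf, 2.0, inf]
```

Selection then prefers solutions with larger crowding distance among equal-rank candidates, spreading the population along the Pareto front.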
An Improved Adaptive Multi-Objective Particle Swarm Optimization for Disassem... (IJRESJOURNAL)
With the development of productivity and the fast growth of the economy, environmental pollution, resource utilization, and low product recovery rates have emerged, so more and more attention has been paid to the recycling and reuse of products. However, since the complexity of the disassembly line balancing problem (DLBP) increases with the number of parts in the product, finding the optimal balance is computationally intensive. To improve the ability of particle swarm optimization (PSO) to solve the DLBP, this paper proposes an improved adaptive multi-objective particle swarm optimization (IAMOPSO) algorithm. First, an evolution-factor parameter is introduced to judge the state of evolution using the idea of fuzzy classification, and the feedback from the evolutionary environment is then used to dynamically adjust the inertia weight and acceleration coefficients. Finally, a dimensional learning strategy based on information entropy is used in which each learning object is uncertain. Test results on a series of instances of different sizes verify the effectiveness of the proposed algorithm.
Efficient Dynamic Scheduling Algorithm for Real-Time Multi-Core Systems (iosrjce)
The imprecise computation model is used in a dynamic scheduling algorithm with a heuristic function to schedule task sets. A task is characterized by its ready time, worst-case computation time, deadline, and resource requirements. A task failing to meet its deadline and resource requirements on time is split into a mandatory part and an optional part. These sub-tasks can execute concurrently on multiple cores, thus exploiting the parallelism provided by the multi-core system. The mandatory part produces acceptable results, while the optional part refines the result further. To study the effectiveness of the proposed scheduling algorithm, extensive simulation studies have been carried out, comparing its performance with the myopic and improved myopic scheduling algorithms. The simulation studies show that the schedulability of the task-split myopic algorithm is always higher than that of the myopic and improved myopic algorithms.
TMPA-2017: Evolutionary Algorithms in Test Generation for digital systems (Iosif Itkin)
TMPA-2017: Tools and Methods of Program Analysis
3-4 March, 2017, Hotel Holiday Inn Moscow Vinogradovo, Moscow
Evolutionary Algorithms in Test Generation for digital systems
Yuriy Skobtsov, Vadim Skobtsov, St. Petersburg Polytechnic University
For the presentation, follow the link: https://www.youtube.com/watch?v=gUnKmPg614k
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin... (csandit)
A Computational Grid (CG) creates a large heterogeneous and distributed paradigm for managing and executing computationally intensive applications. In grid scheduling, tasks are assigned to the appropriate processors in the grid system for execution, taking into account the execution policy and the optimization objectives. In this paper, makespan and the fault tolerance of the grid's computational nodes, two important parameters for task execution, are considered and optimized. As grid scheduling is NP-hard, meta-heuristic evolutionary techniques are often used to find a solution, and we propose an NSGA-II for this purpose. The performance of the proposed Fault-Tolerance Aware NSGA-II (FTNSGA-II) has been estimated with a program written in Matlab. The simulation results evaluate the performance of the proposed algorithm and compare it with the existing Min-Min and Max-Min algorithms, demonstrating the effectiveness of the model.
Observations on DAG Scheduling and Dynamic Load-Balancing Using Genetic Algorithm
3. Introduction
1. Heterogeneous Computing System: electronic systems that use a variety of different types of computational units.
2. Task Scheduling: the multiprocessor scheduling problem is to allocate the tasks of a parallel program to processors in a way that minimizes the completion time and optimizes performance.
3. Load Balancing: the technique of distributing load among processors in order to avoid overloading any single processor.
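The load-balancing idea defined above can be sketched with a simple greedy rule (my illustration, not the thesis algorithm): always place the next task on the currently least-loaded processor, so no processor ends up overloaded. Task names and costs are made up.

```python
import heapq

def balance(task_costs, num_procs):
    """Greedy longest-processing-time placement: returns task -> processor."""
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, processor id)
    placement = {}
    # Place the heaviest tasks first; each goes to the least-loaded processor.
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)
        placement[task] = p
        heapq.heappush(heap, (load + cost, p))
    return placement

print(balance({"a": 5, "b": 4, "c": 3, "d": 2}, 2))
# each processor ends with load 7: {'a': 0, 'b': 1, 'c': 1, 'd': 0}
```

The genetic-algorithm approach studied in Part II searches over such placements instead of committing to one greedy rule.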
4. 4
Thesis Objective
This thesis comprises the study of two research projects: DAG
Scheduling using Genetic Algorithm in Part I and Dynamic Load
Balancing using Genetic Algorithm in Part II.
Part I: The objective of Part I is to design an algorithm to schedule
the DAG tasks on heterogeneous processors in such a way that the
total completion time (makespan) is minimized.
Part II: This part is based on designing an algorithm for distributing
the load among the processors in such a way that none of the processors
is overloaded.
Comparison of various metrics is to be done for DAG-Scheduling and
Dynamic Load Balancing.
5. 5
Directed Acyclic Graph
A process or an application can be broken down into a set of tasks;
we represent these tasks in the form of a directed acyclic graph
(DAG).
A parallel program with n tasks can be represented by a 4-tuple (T, E,
D, [Ai])
1) T = {t1, t2, . . . , tn} is the set of tasks.
2) E, the edges, represents the communication between tasks.
3) D is an n x n matrix, where the element dij of D is the data volume
which ti should transmit to tj.
4) Ai, 1 <= i <= n, is a vector [ei1, ei2, . . . , eim], where eiu is the
execution time of ti on processor pu.
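As an illustration, the 4-tuple (T, E, D, [Ai]) can be sketched as a small Python structure; the class and field names here are hypothetical, not part of the thesis:

```python
from dataclasses import dataclass

@dataclass
class TaskGraph:
    """A parallel program as the 4-tuple (T, E, D, [Ai])."""
    n_tasks: int    # |T|, tasks t0 .. t(n-1)
    edges: set      # E: (i, j) means ti sends data to tj
    data: list      # D: data[i][j] = volume ti transmits to tj
    exec_time: list # [Ai]: exec_time[i][u] = execution time of ti on pu

    def predecessors(self, j):
        """Tasks whose output tj needs before it can start."""
        return [i for (i, k) in self.edges if k == j]

# A toy 3-task graph on 2 processors: t0 -> t1, t0 -> t2
g = TaskGraph(
    n_tasks=3,
    edges={(0, 1), (0, 2)},
    data=[[0, 4, 2], [0, 0, 0], [0, 0, 0]],
    exec_time=[[2, 3], [4, 2], [3, 3]],
)
print(g.predecessors(2))  # [0]
```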
7. 7
DAG-Scheduling
Basic Assumptions
Any processor can execute a task and communicate with other
machines at the same time.
Each processor can only execute one process at each moment.
The processor network is fully connected.
Once a processor has started task execution it continues without
interruption, and after completing the execution it immediately sends
the output data to the child tasks in parallel.
Intra-processor communication cost is negligible compared to the
inter-processor communication cost.
8. 8
DAG-Scheduling
1. Task Selection and Schedule Phase
Task Selection phase
Tasks are selected according to their height in the DAG.
Calculation of a task's start and finish time:
ST(ti, pu) = max( PAT(ti, pu), max over predecessors tk of DAT(ti, tk, pu) )
FT(ti, pu) = ST(ti, pu) + ET(ti, pu)
where PAT(ti, pu) = processor available time (when pu becomes free),
DAT(ti, tk, pu) = data available time (when the output of tk reaches pu),
ET(ti, pu) = execution time of ti on pu.
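A minimal Python sketch of this calculation, assuming a unit inter-processor transfer rate and using the stated assumption that intra-processor communication is free; all names are illustrative:

```python
def start_and_finish(i, u, pat, finish, proc_of, data, exec_time, bw=1.0):
    """ST and FT of task i on processor u.

    pat[u]     : PAT(ti, pu), time at which pu becomes free
    finish[k]  : FT of an already-scheduled task k (dict)
    proc_of[k] : processor that runs task k
    data[k][i] : d_ki, data volume task k transmits to task i
    bw         : assumed transfer rate; intra-processor transfer is free
    """
    def dat(k):  # DAT(ti, tk, pu): when tk's output is available on pu
        comm = 0.0 if proc_of[k] == u else data[k][i] / bw
        return finish[k] + comm

    preds = [k for k in finish if data[k][i] > 0]
    st = max([pat[u]] + [dat(k) for k in preds])  # ST(ti, pu)
    return st, st + exec_time[i][u]               # FT = ST + ET

# t0 finished at time 2 on p0; where can t1 (which needs 4 units from t0) run?
data = [[0, 4, 2], [0, 0, 0], [0, 0, 0]]
et = [[2, 3], [4, 2], [3, 3]]
print(start_and_finish(1, 1, [2.0, 0.0], {0: 2.0}, {0: 0}, data, et))  # (6.0, 8.0)
print(start_and_finish(1, 0, [2.0, 0.0], {0: 2.0}, {0: 0}, data, et))  # (2.0, 6.0)
```

Running t1 on p0 avoids the communication delay, so it finishes earlier despite p0 being the slower choice in pure execution time.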
9. 9
DAG-Scheduling
2. Scheduling Encoding (Chromosome)
A string is a candidate solution for the problem. String consists of
several lists. Each list is associated with a processor.
Suppose that for an application of 10 tasks the generated schedule is:
• 1st Processor : t3 t4 t8
• 2nd Processor : t5 t7 t9
• 3rd Processor : t0 t1 t2 t6
Then the chromosome can be represented as a matrix of size [No. of
Tasks x No. of Processors]:
P1 P2 P3
t3 t5 t0
t4 t7 t1
t8 t9 t2
10. 10
DAG-Scheduling
3. Initialization
A population of size POP_SIZE is initialized.
4.Fitness Function
The GA requires a fitness function that assigns a score to each
chromosome in the population.
The fitness function in a GA is the objective function that is to be
optimized.
In the proposed algorithm, the fitness function returns the time at which all
tasks in the DAG complete their executions (the makespan). A fitness function f
of a string x is defined as follows:
11. 11
DAG-Scheduling
5.Roulette-Wheel Selection
Roulette-wheel selection is used for selecting potentially useful solutions
for recombination ( Crossover ).
The probability of any chromosome being selected is proportional to its fitness.
Example: fitness values 0.5, 1.5, 4 and 2, so the sum of fitness = 8.
Rand(8) = 3, which falls in the third chromosome's slice of the wheel,
so Chromosome 3 is the parent.
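Roulette-wheel selection can be sketched as follows; the `rnd` hook is an illustrative addition that makes the spin of Rand(8) = 3 from the example reproducible:

```python
import random

def roulette_select(fitness, rnd=random.random):
    """Pick an index with probability fitness[i] / sum(fitness)."""
    total = sum(fitness)
    r = rnd() * total          # the spin: a point in [0, total)
    acc = 0.0
    for i, f in enumerate(fitness):
        acc += f               # slices: [0,.5), [.5,2), [2,6), [6,8)
        if r < acc:
            return i
    return len(fitness) - 1    # guard against float round-off

# Slide example: fitness .5, 1.5, 4, 2 (sum 8); a spin landing at 3
# falls in the slice [2, 6) of the third chromosome.
print(roulette_select([0.5, 1.5, 4, 2], rnd=lambda: 3 / 8))  # 2 (chromosome 3)
```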
12. 12
DAG-Scheduling
6.Crossover
New chromosomes are generated with this operator.
A parent chromosome is selected by the roulette-wheel operator, and then
two processors are selected from this chromosome.
Single-point crossover is applied to the two selected processor lists.
Figure: Modified Single Point Crossover
13. 13
DAG-Scheduling
7. Mutation
A mutation operation is designed to reduce the idle time of a
processor waiting for data from other processors.
The Data Dominating Parent (DDP) of a task ti is the task which
transmits the largest volume of data to ti. That is,
DDP(ti) = tk such that dki = max over all j of dji.
Example of Mutation :
Figure : Mutation.
14. 14
DAG-Scheduling
8. Termination Conditions
Condition 1: If we find an individual whose makespan is less than
the specified minimum, the GA stops evolving.
Condition 2: The variable gen stores how many generations the GA
should run; the user supplies it on each run. When the generation
count exceeds gen, the GA stops evolving.
15. 15
DAG-Scheduling
9. Pseudo code
Begin
  initialize P(k);  {create an initial population of size POP_SIZE}
  evaluate P(k);    {evaluate the fitness of all the chromosomes}
  Repeat
    For i = 1 to POP_SIZE do
      Select a chromosome as parent from the population;
      Child1 <= Crossover( parent );
      Child2 <= Mutation( Child1 );
      Add( new temporary population, Child1, Child2 );
    End For;
    Make( new population, new temporary population, old population );
    population = new population;
  While (not termination condition);
  Select the best chromosome in the population as the solution
  and return it;
End
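The pseudo code above translates roughly into the following generic Python skeleton; the plug-in callables and the toy problem at the bottom are hypothetical stand-ins for the thesis's operators:

```python
import random

def genetic_algorithm(init_population, fitness, crossover, mutate,
                      max_gen=100, target=None):
    """Generic GA loop mirroring the pseudo code above.

    fitness is minimized (a makespan), so roulette weights use 1/fitness.
    """
    pop = init_population()
    for _ in range(max_gen):
        scores = [fitness(c) for c in pop]
        if target is not None and min(scores) < target:
            break                                     # termination condition 1
        weights = [1.0 / (1e-9 + s) for s in scores]  # roulette on inverted fitness
        children = []
        for _ in range(len(pop)):
            parent = random.choices(pop, weights=weights)[0]
            child1 = crossover(parent)
            child2 = mutate(child1)
            children += [child1, child2]
        # Make(new pop, new temp pop, old pop): keep the POP_SIZE fittest
        pop = sorted(pop + children, key=fitness)[:len(pop)]
    return min(pop, key=fitness)

# Toy run: one chromosome, mutation decrements every gene, so the
# minimal fitness (here just the sum) is driven down each generation.
best = genetic_algorithm(lambda: [[9, 9, 9]], sum,
                         lambda p: p[:],
                         lambda c: [max(0, v - 1) for v in c],
                         max_gen=20)
print(best)  # [0, 0, 0]
```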
16. 16
Dynamic Load Balancing
1. Basic Definitions
Load calculation: Load(pi) is the sum of the execution times of the
processes allocated to processor pi.
Maxspan: the maximal finishing time over all processes.
maxspan(T) = max( Load(pi) )
∀ 1 ≤ i ≤ Number of Processors
Processor utilization: the ratio of Load(pi) to maxspan.
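These three definitions can be sketched directly in Python; the dict-of-lists assignment encoding and the names are illustrative:

```python
def load(assignment, exec_time, p):
    """Load(p): total execution time of the tasks placed on processor p."""
    return sum(exec_time[t] for t in assignment[p])

def maxspan(assignment, exec_time):
    """maxspan(T) = max over all processors of Load(pi)."""
    return max(load(assignment, exec_time, p) for p in assignment)

def utilization(assignment, exec_time, p):
    """Processor utilization: Load(pi) / maxspan."""
    return load(assignment, exec_time, p) / maxspan(assignment, exec_time)

# 3 tasks on 2 processors; exec_time[t] is task t's execution time
a = {0: [0, 1], 1: [2]}
print(maxspan(a, [2, 3, 4]))         # loads are 5 and 4 -> 5
print(utilization(a, [2, 3, 4], 1))  # 4 / 5 = 0.8
```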
17. 17
Dynamic Load Balancing
2. Basic Assumptions
Each processor can only execute one process at each moment.
Tasks are non-preemptive.
Tasks are totally independent, i.e. no data transfer takes place
among tasks and there are no precedence relations.
Heterogeneity of processors is defined by a multiplying factor x. If the
1st processor's speed is P1, then the ith processor's speed can be
calculated as
Pi = (1 + (i-1)*x) * P1
18. 18
Dynamic Load Balancing
3. Sliding Window Technique
The size of the sliding window decides how many tasks are selected at
a time from the task pool.
The size is supplied by the user. The number of tasks in a chromosome
is equal to the size of the sliding window.
The sliding window contains task IDs.
Example: Sliding window of size 10
1 2 3 4 5 6 7 8 9 10
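The windowing itself is plain chunking of the task pool; a sketch (the function name is illustrative):

```python
def sliding_windows(task_pool, size):
    """Yield successive windows of task IDs from the task pool."""
    for start in range(0, len(task_pool), size):
        yield task_pool[start:start + size]

# 24 tasks, window size 10: two full windows and a final partial one
windows = list(sliding_windows(list(range(1, 25)), 10))
print(len(windows))  # 3
print(windows[0])    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```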
19. 19
Dynamic Load Balancing
4. Scheduling Encoding (Chromosome)
This is a 2D matrix of size[no. of processors x size of sliding
window].
Figure: Chromosome Representation
5. Initialization
Population of size POP_SIZE is initialized by randomly assigning
tasks to processors.
P1 P2 P3
3 5 0
4 7 1
8 9 2
6
20. 20
Dynamic Load Balancing
6. Fitness Function
Fitness function attaches a value to each chromosome in the
population, which indicates the quality of the schedule.
7. Roulette-wheel selection
Roulette-wheel selection is used, as described in the DAG-Scheduling
section.
21. 21
Dynamic Load Balancing
8. Cycle Crossover
Single-point crossover cannot be used in this GA, as it may cause some
tasks to be assigned more than once while others are not assigned at all.
A new crossover operator called cycle crossover is designed. The example
below shows how it works.
Parents:
A 8 6 4 10 9 7 1 5 3 2
B 10 2 3 5 6 9 8 7 4 1
Step 1 (start the cycle at the first position):
A` 8 - - - - - - - - -
B` 10 - - - - - - - - -
Step 2 (follow the cycle to completion):
A` 8 6 - 10 9 7 1 5 - 2
B` 10 2 - 5 6 9 8 7 - 1
Step 3 (fill the remaining positions from the other parent):
A` 8 6 3 10 9 7 1 5 4 2
B` 10 2 4 5 6 9 8 7 3 1
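A sketch of the operator in Python, replayed on the parents from the worked example (the function name is illustrative):

```python
def cycle_crossover(a, b):
    """Cycle crossover: each offspring keeps one parent's genes on the
    cycle positions and takes the other parent's genes elsewhere, so no
    task is duplicated or lost."""
    n = len(a)
    in_cycle = [False] * n
    i = 0
    while not in_cycle[i]:   # follow the cycle starting at position 0
        in_cycle[i] = True
        i = a.index(b[i])    # jump to the position in a holding b's gene
    child_a = [a[i] if in_cycle[i] else b[i] for i in range(n)]
    child_b = [b[i] if in_cycle[i] else a[i] for i in range(n)]
    return child_a, child_b

A = [8, 6, 4, 10, 9, 7, 1, 5, 3, 2]
B = [10, 2, 3, 5, 6, 9, 8, 7, 4, 1]
A2, B2 = cycle_crossover(A, B)
print(A2)  # [8, 6, 3, 10, 9, 7, 1, 5, 4, 2]
print(B2)  # [10, 2, 4, 5, 6, 9, 8, 7, 3, 1]
```

Only positions 3 and 9 (1-based) lie outside the cycle here, so only those two genes are exchanged between the offspring.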
22. 22
Dynamic Load Balancing
9. Random Swap Mutation
Random swap mutation is applied to each newly generated child.
Two processors are selected at random from the processor list; they
must be different.
Then one random task is selected from each processor, and the two
tasks are swapped.
Example (task 8 on P1 is swapped with task 1 on P3):
Before mutation:
P1 P2 P3
3 5 0
4 7 1
8 9 2
6
After mutation:
P1 P2 P3
3 5 0
4 7 8
1 9 2
6
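A sketch of the operator in Python, using an illustrative dict-of-lists encoding of the chromosome:

```python
import random

def random_swap_mutation(chrom, rng=random):
    """Pick two different (non-empty) processors and swap one randomly
    chosen task between them; the parent chromosome is left untouched."""
    child = {p: tasks[:] for p, tasks in chrom.items()}      # deep-ish copy
    p1, p2 = rng.sample([p for p in child if child[p]], 2)   # distinct processors
    i = rng.randrange(len(child[p1]))
    j = rng.randrange(len(child[p2]))
    child[p1][i], child[p2][j] = child[p2][j], child[p1][i]  # the swap
    return child

# Slide example encoding: with the right random draws this swaps
# task 8 (on P1) with task 1 (on P3)
before = {"P1": [3, 4, 8, 6], "P2": [5, 7, 9], "P3": [0, 1, 2]}
after = random_swap_mutation(before)
```

Whatever the draws, the mutated child is still a valid schedule: the same multiset of tasks, merely redistributed.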
23. 23
Dynamic Load Balancing
10. Termination Condition
The variable gen stores how many generations the GA should run.
When the generation count exceeds gen, the GA stops evolving.
11. Task Allocation and Updating the Window
When the termination condition is met, the fittest chromosome is
assigned to the final schedule.
The window is then refilled by sliding along to the subsequent tasks
waiting in the task pool.
24. 24
Dynamic Load Balancing
12. Pseudo Code
Begin
  Repeat
    save the next tasks into the sliding window;
    initialize P(k);  {create an initial population of size POP_SIZE}
    evaluate P(k);    {evaluate the fitness of all the chromosomes}
    Repeat
      For i = 1 to POP_SIZE do
        Select two chromosomes as parents from the population;
        Child1, Child2 <= Crossover( parent1, parent2 );
        Child3 <= Mutation( Child1 );
        Child4 <= Mutation( Child2 );
        Add( new temporary population, Child1, Child2, Child3, Child4 );
      End For;
      Make( new population, new temporary population, old population );
      population = new population;
    While (not termination condition);
    Assign the best chromosome in the population to the final schedule;
  While (task pool has more tasks);
End
25. 25
Experimental results and Discussion
1.Dynamic Load Balancing
1. Test Parameters
The measurement of performance of the proposed algorithm was based
on two metrics: total completion time (makespan) and average
processor utilization. The calculation of these metrics depends on
the following parameters (default values):
• Population size ( POP_SIZE ): 100
• Sliding window size ( sizeSlidingWindow ): 10
• No. of generations ( gen ): 100
• No. of processors ( no_of_Proc ): 10
26. 26
Experimental results and Discussion
1.Dynamic Load Balancing
2. Changing the population size
The population size ranged from 20 to 200.
It was observed that increasing the population does not increase
the performance beyond a certain limit: after a size of 120, the
completion time is approximately constant.
Increasing the population size had a positive effect on processor
utilization.
27. 27
Experimental results and Discussion
1.Dynamic Load Balancing
3. Changing the No. of Generation Cycles
The number of generation cycles was changed from 1 to 500.
As the number of generation cycles was increased, the performance of
the schedule also increased. The total completion time was significantly
reduced as the number of generations was increased from 1 to 50.
Increasing the number of generations also had a positive effect on
processor utilization.
28. 28
Experimental results and Discussion
1.Dynamic Load Balancing
4. Changing the No. Processors
The no. of processors was changed from 2 to 20.
As the number of processors was increased for the same number of
tasks, the completion time decreased, because the system then has
more processing elements.
When the number of processors was increased, avg. processor
utilization decreased.
29. 29
Experimental results and Discussion
1.Dynamic Load Balancing
5. Changing the No. Tasks
The number of tasks was varied from 10 to 1000.
As the number of tasks increases, the completion time also increases.
When the number of tasks is large, avg. utilization is more than 96%.
30. 30
Experimental results and Discussion
1.Dynamic Load Balancing
6. Changing the Sliding Window
The sliding window size was changed from 2 to 50;
the effect on completion time and avg. processor utilization is
shown below:
31. 31
Experimental results and Discussion
2.DAG-Scheduling
1. Test Parameters
The performance of the DAG scheduling algorithm was measured by
speedup.
The speedup value for a given graph is computed by dividing the
sequential execution time by the parallel execution time.
The speedup of the proposed DAG scheduling algorithm depends on
the following parameters:
• No. of Generations
32. 32
Experimental results and Discussion
2.DAG-Scheduling
2. Changing the no. of Generation cycles
The no. of generations was varied from 1 to 1000.
As the number of generation cycles was increased, the performance of
the schedule also increased.
After 250 generations, it is observed that running the GA further does
not improve performance much.
33. 33
Experimental results and Discussion
2.DAG-Scheduling
3. Changing the no. of Tasks
The no. of tasks was varied from 10 to 60.
34. 34
Conclusion
The results generated by the proposed dynamic load-balancing
mechanism using a Genetic Algorithm were extremely good when the
number of tasks is large.
The avg. processor utilization of the proposed algorithm was found to
be more than 97-98%.
The complete genetic algorithm for DAG scheduling was
implemented and tested on various input task graphs in a
heterogeneous system.
The proposed DAG-Scheduling algorithm gives the best speedup when
the number of generation cycles is more than 250.
35. 35
References
1) Albert Y. Zomaya, Chris Ward, and Ben Macey, "Genetic Scheduling for Parallel Processor Systems: Comparative
Studies and Performance Issues," IEEE Trans. Parallel and Distributed Systems, vol. 10, no. 8, August 1999.
2) Andrew J. Page , Thomas M. Keane, Thomas J. Naughton, "Multi-heuristic dynamic task allocation using genetic algorithms in a
heterogeneous distributed system" Journal of Parallel and Distributed Computing Volume 70, Issue 7, July 2010, Pages 758–766.
3) Yuan, Yuan , Xue, Huifeng “Modified Genetic Algorithm for Task Scheduling in Multiprocessor Systems” Jisuanji Celiang yu Kongzhi/
Computer Measurement & Control (China). Vol. 13, no. 5, pp. 488-490. May 2005
4) Y.K. Kwok and I. Ahmad, “Dynamic Critical-Path Scheduling: An Effective Technique for Allocating Task Graphs to Multiprocessors”,
IEEE Trans. Parallel and Distributed Systems, Vol. 7, No. 5, pp. 506-521, May 1996.
5) A.T. Haghighat, K. Faez, M. Dehghan, A. Mowlaei, and Y. Ghahremani, "GA-based heuristic algorithms for bandwidth-delay-constrained
least-cost multicast routing," International Journal of Computer Communications 27, 2004, 111-127.
6) D.E. Goldberg, “Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, Mass” Addison-Wesley, 1989.
7) Albert Y. Zomaya and Yee-Hwei, "Observations on Using Genetic Algorithms for Dynamic Load-Balancing," IEEE
Transactions on Parallel and Distributed Systems, vol. 12, no. 9, September 2001.
8) H.C. Lin and C.S. Raghavendra, “A Dynamic Load-Balancing Policy with a Central Job Dispatcher (LBC)” IEEE Trans. Software Eng.,
vol. 18, no. 2, pp. 148-158, Feb. 1992.
9) M. Munetomo, Y. Takai, and Y. Sato, “A Genetic Approach to Dynamic Load-Balancing in a Distributed Computing System” Proc. First
Int'l Conf. Evolutionary Computation, IEEE World Congress Computational Intelligence, vol. 1, pp. 418-421, 1994.