Adaptive job scheduling with load balancing for workflow application



International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367 (Print), ISSN 0976 – 6375 (Online), Volume 2, Number 1, Dec - Jan (2011), pp. 09-21, © IAEME

JOB SCHEDULING WITH LOAD BALANCING FOR WORKFLOW APPLICATION IN GRID PLATFORM

D. Daniel, PG Scholar, Karunya University
Mrs. S.P. Jeno Lovesum, M.E., Asst. Professor, Karunya University
D. Asir, PG Scholar, Karunya University
A. Catherine Esther Karunya, PG Scholar, Karunya University

Grid computing serves as a globally connected system that performs high-performance computing for many practical applications. Scheduling plays a key role in the performance of grid workflow applications. Various scheduling strategies have been proposed, including static strategies, which map jobs to resources before execution time, and dynamic alternatives, which schedule an individual job only when it is ready to execute. Both kinds of strategy incur significantly high scheduling cost, and they may not produce a good-quality schedule at low cost. This paper proposes a novel semidynamic algorithm with a load balancing concept, which allows the schedule to adapt and to schedule jobs according to changes in the dynamic grid environment. The proposed algorithm schedules the jobs statically and then continues with dynamic scheduling because of the dynamic nature of the grid. Makespan and resource usage are the two main objectives of this scheduling algorithm. When resource and performance fluctuations occur in the grid environment, they affect the processing of jobs, which delays job completion. Load balancing is incorporated into the algorithm to handle such situations after jobs have been dispatched to their respective hosts. When resource fluctuation occurs because of the dynamic nature of the grid, or when a processor is overloaded with jobs and the makespan is delayed, load balancing is performed to keep the jobs executing and to achieve the desired makespan.

Index Terms: DAG, Tasks, Makespan, Resource usage, Semidynamic scheduling.

I. INTRODUCTION

Grids are geographically distributed computing systems in which a variety of resources, often dispersed geographically, are interconnected and shared to address scientific and engineering challenges. The majority of the applications involved fall into the interdependent task model and are generally known as workflow applications [4]. Due to the growing popularity of grid computing systems, many applications have been attempting to take advantage of these computing environments. Such applications are generally constructed by interweaving interdependent jobs. Workflow applications are essentially the same as typical parallel programs, with one exception: a workflow application consists of a set of interdependent applications (not partitioned tasks of a parallel program). Like conventional parallel programs, workflow applications can be represented by a directed acyclic graph (DAG). A DAG, G = (V, E), consists of a set V of v nodes and a set E of e edges. A DAG is also known as a task graph or macro dataflow graph. The nodes usually represent jobs of a workflow application, and the edges usually represent precedence constraints.
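
As an illustration of the task-graph model, the following minimal Python sketch stores a DAG G = (V, E) as adjacency lists and derives an execution order that respects every precedence constraint. The job names, the four-job example graph, and the helper names `build_dag` and `topological_order` are assumptions made for this illustration; they do not come from the paper.

```python
from collections import deque


def build_dag(jobs, edges):
    """Return successor and predecessor adjacency lists for G = (V, E)."""
    succ = {j: [] for j in jobs}
    pred = {j: [] for j in jobs}
    for i, j in edges:
        succ[i].append(j)  # edge (i, j): job j depends on job i
        pred[j].append(i)
    return succ, pred


def topological_order(jobs, edges):
    """Kahn's algorithm: order jobs so that every edge (i, j) has i before j.

    Raises ValueError if the graph contains a cycle, i.e. it is not a DAG.
    """
    succ, pred = build_dag(jobs, edges)
    remaining = {j: len(pred[j]) for j in jobs}  # unfinished predecessors
    queue = deque(j for j in jobs if remaining[j] == 0)
    order = []
    while queue:
        job = queue.popleft()
        order.append(job)
        for nxt in succ[job]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(jobs):
        raise ValueError("graph has a cycle")
    return order


# Hypothetical four-job workflow: n1 feeds n2 and n3, which both feed n4.
JOBS = ["n1", "n2", "n3", "n4"]
EDGES = [("n1", "n2"), ("n1", "n3"), ("n2", "n4"), ("n3", "n4")]
ORDER = topological_order(JOBS, EDGES)
```

Any order produced this way is only one of possibly many valid orders; a scheduler still has to decide which host runs each job and when.
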
An edge (i, j) ∈ E between job n_i and job n_j represents the inter-job communication. Specifically, the outputs of job n_i must be transmitted to job n_j before job n_j can start its execution. A job with no predecessors is called an entry job, n_entry; an exit job, n_exit, is one that has no successors. Among the predecessors of a job n_i, the predecessor that completes its communication at the latest time is the most influential parent (MIP) of the job, denoted MIP(n_i). A job is called a ready job if all of its predecessors have been completed. The longest path of a task graph is the critical path (CP) [1].

Workflow applications can take advantage of a grid computing platform; however, these applications, in addition to resource heterogeneity and dynamism, impose a great burden on scheduling. In some systems, workflow scheduling is left for manual dispatch by users, while other systems employ automated workflow management platforms (WMPs) [1]. These WMPs tend to focus on minimizing the application's completion time. However, there are other important performance considerations for WMPs, such as resource usage, load balancing, and fault tolerance. Although some WMPs have facilities to deal with these considerations, they often lack the capability of explicit resource usage control. Rather, for the sake of fault tolerance, resources are overly used (task duplication).

Job scheduling has a close relationship with load balancing. For a given set of jobs and resources, load balancing can be done in two ways: prediction based and non-prediction based. Prediction-based load balancing collects in advance the number of jobs to be scheduled and the amount of resources, i.e., processors, available to run them. In this case job scheduling takes resource availability into account: the load is spread evenly over all resources according to their status, and the scheduled jobs are then dispatched to the hosts [11]. The non-prediction-based approach has no prior information about the resources and the jobs. It schedules the jobs according to the scheduling strategy and dispatches them to the hosts; as resource availability changes dynamically, the load is migrated and the jobs are executed, so the load is balanced dynamically.
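
The prediction-based variant can be illustrated with a small Python sketch. It assumes, purely for illustration, that each job has a known length and each host a known speed, and it uses a simple greedy rule (longest job first, placed on the host with the earliest predicted finish time). The rule, the function name `balance_jobs`, and the sample numbers are assumptions of this sketch, not the method of [11].

```python
def balance_jobs(job_lengths, host_speeds):
    """Spread jobs over hosts using predicted finish times.

    job_lengths: job name -> amount of work (hypothetical units)
    host_speeds: host name -> work units processed per time unit
    Returns (assignment, busy_until), where busy_until[h] is the predicted
    time at which host h finishes everything assigned to it.
    """
    busy_until = {h: 0.0 for h in host_speeds}
    assignment = {h: [] for h in host_speeds}
    # Longest jobs first; ties are broken by job name for determinism.
    for job, length in sorted(job_lengths.items(), key=lambda kv: (-kv[1], kv[0])):
        # Pick the host on which this job would finish earliest.
        best = min(
            host_speeds,
            key=lambda h: (busy_until[h] + length / host_speeds[h], h),
        )
        busy_until[best] += length / host_speeds[best]
        assignment[best].append(job)
    return assignment, busy_until


# Hypothetical workload: four jobs, one fast host and one slow host.
JOB_LENGTHS = {"j1": 8, "j2": 6, "j3": 4, "j4": 2}
HOST_SPEEDS = {"h1": 2.0, "h2": 1.0}
ASSIGNMENT, BUSY_UNTIL = balance_jobs(JOB_LENGTHS, HOST_SPEEDS)
```

With these sample numbers the fast host receives three jobs and the slow host one, and both finish at nearly the same time, which is the balancing effect the prediction-based approach aims for. The sketch ignores job dependencies and communication cost.
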

The rest of the paper is organized as follows. Section II reviews related work, Section III presents the adaptive scheduling strategy in detail, Section IV gives the system model, Section V describes the proposed scheduling with load balancing, Section VI compares and evaluates the proposed scheduling, and Section VII presents the conclusion and future work.

II. RELATED WORKS

Since workflow scheduling in grids is in many respects similar to the conventional task scheduling problem in tightly coupled heterogeneous computing systems (e.g., clusters), some well-known task scheduling algorithms (e.g., HEFT) have been adopted and modified for grid workflow scheduling. Most of these algorithms are designed to cope with the dynamic nature of the grid. More specifically, rescheduling and advance reservation, among other techniques, are often used to deal with uncertainties in resource performance. Most job scheduling approaches adapted from traditional task scheduling algorithms fall into two categories: look-ahead and just-in-time. The major difference between these two categories is whether scheduling decisions are made before the actual job dispatch or at the time any ready jobs are identified, i.e., when their predecessors have all been completed. Clearly, for look-ahead approaches, the acquisition of accurate performance information on resources plays a critical role in decision making [9]. One major drawback of just-in-time approaches is the loss of timely data transfers. For example, if a job has three predecessors that complete at different times, the data transfers from these predecessors to the job start only when the last predecessor completes its execution. The time between the completion of each of the first two predecessors and the completion of the last predecessor is therefore wasted.

The challenge of scheduling grid workflow applications with a static strategy is discussed in many studies, but few research efforts address it. Rescheduling is implemented in GrADS, where it is normally activated by contract violation. However, these efforts are all conducted for iterative applications, allowing the system to make rescheduling decisions at each iteration.
The plan switching approach constructs a family of activity graphs and investigates the means of switching from one member of the family to another when the execution of one activity graph fails. Its major drawback is that all the plans are made without knowledge of future environment changes, since the grid does not ensure a stable computing environment [5]. Another rescheduling policy considers rescheduling at a few carefully selected points during the execution. That research tackles one shortcoming of static scheduling, namely that it always assumes accurate prediction of job performance. After the initial schedule is made, it selectively reschedules some jobs if the run-time performance variance exceeds a predefined threshold. However, this approach deals only with inaccurate estimation and does not consider changes in the resource pool [10].

Since the majority of the tasks that grid computing handles are interdependent, and most of them are workflow applications, scheduling must concentrate on resource usage. To use resources in a well-organized way, the scheduler must have information about the resources, and not just their number. At a minimum, the scheduler needs to know how many resources are available in the grid in order to schedule n jobs. To produce an effective schedule, the scheduler also needs the status of each resource, its processor speed, how much time is left before it starts executing another job, how many jobs it can handle in a given amount of time, and so on. The adaptive scheduling strategy is built on this information: when the schedule does not produce optimal performance, or when the availability of resources changes dynamically, the scheduler adapts to the situation and schedules the jobs so that they complete their execution.

III. ADAPTIVE SCHEDULING STRATEGY

Even though static and dynamic scheduling can perform close to optimal, their effectiveness in a dynamic grid environment is questionable. The proposed novel adaptive job scheduling algorithm with load balancing is based on a semidynamic strategy, by which the workflow scheduler can adapt to grid dynamics and realize its strengths in practice.

A. Issues with Traditional Scheduling

Planning is a one-time activity in traditional static scheduling. Static scheduling does not consider future changes in the grid environment after the resource mapping is made. Rescheduling during the execution phase has been proposed, but it is mainly used to support fault tolerance. Overall, the issues with traditional static scheduling are: (1) accuracy of the estimation of communication and computation costs, (2) adaptation to a dynamic environment, and (3) separation of the workflow scheduler from the executor.

Figure 3.1 Classification of Static scheduling

Fundamentally, the first two issues are related to the lack of collaboration between the workflow scheduler and executor.
With collaboration, a scheduler is aware of changes in the grid environment, including job performance variance and resource availability, and is able to reschedule adaptively based on increasingly accurate estimations. This approach can both continuously improve performance by considering new resources and minimize the impact caused by unexpected resource downgrade or unavailability [9]. The main issue with dynamic scheduling in workflow applications is the execution procedure of the interdependent tasks: the output of one job can be the input of another job. In dynamic scheduling, jobs are executed in whatever way the resources considered at scheduling time allow.

B. Adaptive Scheduling

The basic idea of adaptive scheduling is that, for a given DAG and a set of currently available resources, the scheduler makes the initial resource mapping as any traditional static approach does [5]. In addition, the scheduler receives updates from the executor about resource information, such as:

Resource Pool Change: If a new resource is discovered after the current plan is made, rescheduling may reduce the makespan of a DAG by taking the added resource into account. When a resource fails, a fault-tolerance mechanism is triggered and the failure is handled by the Executor. However, if the failure is predictable, rescheduling can minimize the impact of the failure on overall performance.

Resource Performance Variance: The accuracy of performance estimation depends largely on history data, and inaccurate estimation leads to a bad schedule. If the run-time Performance Monitor notifies the scheduler of any significant performance variance, the scheduler, together with the predictor, evaluates its impact and reschedules if necessary.

The scheduler reacts to an event by evaluating whether the makespan can be reduced by rescheduling. For example, if a new resource becomes available, the scheduler evaluates whether a new schedule that takes the extra resource into account can produce a smaller makespan [7].
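
The event-driven check can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of [7]: the stand-in planner `plan` ignores job dependencies and already finished jobs, and the names `plan`, `on_resource_event`, and the sample workload are assumptions made for this sketch.

```python
def plan(job_lengths, host_speeds):
    """Stand-in static planner (an assumption of this sketch).

    Places the longest job first on the host with the earliest predicted
    finish time. Returns (assignment, makespan).
    """
    busy_until = {h: 0.0 for h in host_speeds}
    assignment = {}
    for job in sorted(job_lengths, key=lambda j: (-job_lengths[j], j)):
        host = min(
            host_speeds,
            key=lambda h: (busy_until[h] + job_lengths[job] / host_speeds[h], h),
        )
        busy_until[host] += job_lengths[job] / host_speeds[host]
        assignment[job] = host
    return assignment, max(busy_until.values())


def on_resource_event(current, job_lengths, host_speeds):
    """Re-plan against the updated resource pool.

    The candidate schedule replaces the current one only if it yields a
    strictly smaller makespan; otherwise the current schedule is kept.
    """
    candidate = plan(job_lengths, host_speeds)
    return candidate if candidate[1] < current[1] else current


# Hypothetical scenario: one host at first, then a second host appears.
JOBS = {"a": 6, "b": 4, "c": 2}
CURRENT = plan(JOBS, {"h1": 1.0})
UPDATED = on_resource_event(CURRENT, JOBS, {"h1": 1.0, "h2": 1.0})
```

In this scenario the extra host halves the makespan, so the candidate schedule is adopted. A second event that leaves the pool unchanged would return the same schedule object, because the candidate is no better.
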
If the new schedule has a smaller makespan, the scheduler replaces the current schedule with the new one by submitting it to the Executor.

IV. SYSTEM MODEL

This paper proposes an adaptive rescheduling approach that can both continuously improve performance by considering new resources and minimize the impact caused by unexpected resource downgrade or unavailability.

A grid consists of a number of sites. Each site is autonomous: it has its own local users as well as global users, and its hosts are time shared and resource shared. The hosts of the same group or organization are clustered together and termed a site. The resources of a cluster or site can be accessed by the hosts of that site as their own. Complexity arises when a resource has to be accessed from another site. Every site has administrators, who deploy access policies, access rights, processor allocation, and so on; these differ for every organization. A workflow application has n interrelated jobs in one task. The time from the start of the first job of the task to the finish of the nth job is termed the makespan. The jobs are represented by a DAG (directed acyclic graph): each node is a job, and the edges denote the relations between jobs. Each cluster has a load scheduler, which performs the load balancing and also reports the status of the resources to the job scheduler, which keeps it in a history repository for future job scheduling.

Figure 4.1 System Design

V. ADAPTIVE SCHEDULING WITH LOAD BALANCING FOR WORKFLOW APPLICATIONS

While the task scheduling problem in heterogeneous computing systems remains very difficult even with perfectly accurate performance information on resources and applications, uncertainties about resource performance and the lack of control over grid resources make workflow scheduling even more complex. Unlike many other workflow scheduling schemes, we consider makespan and resource usage to be equally important and take this into account in our scheduling model. Efficient resource usage is crucial in grid scheduling because 1) a grid consists of multiple sites administered by different entities that use their own resources for other tasks besides the grid jobs, and 2) due to the fluctuations and uncertainty surrounding sites in a grid system, lower resource usage (not necessarily minimizing the number of resources used, but rather minimizing resource time) means lower overall variance in the expected completion time (makespan) of an application [1].

A. Job Scheduling

To start with, the schedule is made statically from the predefined tasks and their execution procedures. The tasks are scheduled statically based on their priority. The priority of each task is set to its upward rank value, rank_u, which is based on the mean computation and communication costs. The task list is created by sorting the tasks in decreasing order of rank_u, so that a task always precedes its successors. Tie-breaking is done randomly. There can be alternative policies for tie-breaking, such as selecting the task whose immediate successor tasks have higher upward ranks. Since these alternative policies increase time complexity, the random selection strategy is preferred [7]. The task list made on the basis of priority is taken as S* (the current schedule).

At the end of each iteration, mutation is considered if no improvement on S* was made during the current iteration. The scheduler randomly chooses a mutation method between point and swap mutation and mutates each job in S* with a probability of 0.5, which is sufficient to generate substantially different schedules. The mutated schedule is then used as the current schedule (S*). If there have been some improvements on S* in the current iteration, it is passed on to the next iteration for further improvement. This schedule manipulation process repeats for a predefined number of iterations.
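
The priority computation can be sketched as follows. The paper does not spell out the rank formula, so the sketch assumes the standard HEFT definition from [5]: the upward rank of a job is its mean computation cost plus the maximum, over its successors, of the mean communication cost on the edge plus the successor's rank; an exit job's rank is its mean computation cost alone. The sketch also follows the HEFT convention of listing higher-ranked jobs first. Function names, cost values, and the name-based tie-break (used here instead of random selection so that the result is reproducible) are assumptions of this illustration.

```python
def upward_ranks(mean_cost, mean_comm, succ):
    """Compute rank_u for every job, assuming the HEFT definition.

    mean_cost: job -> mean computation cost over all hosts
    mean_comm: (job, successor) -> mean communication cost of that edge
    succ:      job -> list of successor jobs
    """
    ranks = {}

    def rank(job):
        if job not in ranks:
            # Exit jobs have no successors, so the max term defaults to 0.
            ranks[job] = mean_cost[job] + max(
                (mean_comm[(job, s)] + rank(s) for s in succ.get(job, [])),
                default=0.0,
            )
        return ranks[job]

    for job in mean_cost:
        rank(job)
    return ranks


def priority_list(ranks):
    """Jobs in decreasing rank_u order; ties broken by name (assumption)."""
    return sorted(ranks, key=lambda j: (-ranks[j], j))


# Hypothetical diamond-shaped workflow with made-up mean costs.
SUCC = {"n1": ["n2", "n3"], "n2": ["n4"], "n3": ["n4"], "n4": []}
MEAN_COST = {"n1": 2, "n2": 3, "n3": 4, "n4": 1}
MEAN_COMM = {("n1", "n2"): 1, ("n1", "n3"): 2, ("n2", "n4"): 3, ("n3", "n4"): 1}
RANKS = upward_ranks(MEAN_COST, MEAN_COMM, SUCC)
TASK_LIST = priority_list(RANKS)
```

For this example the ranks are 10, 7, 6 and 1 for n1 to n4, so the entry job comes first and the exit job last. The rank of the entry job equals the length of the critical path measured in mean costs.
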
Now, jobs in the current schedule S* are dispatched to their assigned hosts as they become ready, i.e., when their predecessor jobs have finished [1].

B. Cluster Based Dynamic Load Balancing

In grid computing the resources are globally distributed, and resources located together geographically are termed clusters. A cluster is a group of processors in an organization or a LAN (Local Area Network). A number of clusters group together and perform the computation. The user can access any number of resources from anywhere at any time. Each cluster has a scheduler that holds information about its resources or processors: their current computation, the number of jobs they have executed, the amount of resources they hold, etc. The scheduler also has the details of the neighboring clusters. The clusters communicate with each other to balance the load dynamically and to execute the jobs effectively [11].

Once the scheduled jobs have been dispatched to the hosts, the next step is load balancing. When the execution of a job is delayed, the Actual Latest Finish Time (ALFT) of the job is calculated. When the delay is less than the ALFT, the load is balanced. If the delay is greater than the ALFT, the scheduler communicates with the neighboring cluster, and the load is dynamically allocated to the processor that is optimal for executing the remaining job. Migration is used to move the job from one resource to another.

VI. EVALUATION AND COMPARISON

The HEFT-based adaptive rescheduling algorithm (AHEFT), which is built on the adaptive scheduling strategy, has the advantage that it continuously improves performance by considering new resources and minimizes the impact caused by unexpected resource downgrade or unavailability [7]. The drawbacks of the adaptive rescheduling technique are that rescheduling the jobs takes time and that, to implement the collaboration model, rescheduling has to be integrated with advance resource reservation and a resource availability prediction model. GridFlow has the advantage that its grid performance service combines performance prediction capability with a new application response measurement technique [2], which can be used to enable prediction-based scheduling as well as response-based scheduling. Its disadvantage is that the process of a grid workflow encompasses multiple administrative domains (organizations) [7]. The lack of central ownership and control results in incomplete information, and computational and networking capabilities can vary significantly over time in the grid environment.
Application performance prediction becomes difficult, and real-time resource information updates within a large-scale global grid become impossible, which lowers its performance.
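
Returning to the cluster-based load-balancing rule of Section V-B, a hedged Python sketch of the decision is given here. The paper compares the delay with the ALFT without defining the comparison precisely; this sketch assumes it means that the delayed finish time is checked against the ALFT. That reading, the function names `balance_decision` and `pick_processor`, and all numbers are assumptions made for illustration only.

```python
def balance_decision(expected_finish, delay, alft):
    """Decide how to react to a delayed job.

    Assumed interpretation: if the job, despite the delay, still finishes
    by its ALFT, balancing inside the cluster is enough; otherwise the
    job is migrated with the help of a neighboring cluster.
    """
    if delay <= 0:
        return "on_time"
    if expected_finish + delay <= alft:
        return "balance_locally"
    return "migrate"


def pick_processor(remaining_work, processors):
    """Choose the processor that would finish the remaining work earliest.

    processors: name -> (time at which it becomes available, speed)
    Ties are broken by processor name.
    """
    return min(
        processors,
        key=lambda p: (processors[p][0] + remaining_work / processors[p][1], p),
    )


# Hypothetical neighboring-cluster processors: (available_at, speed).
NEIGHBORS = {"p1": (5, 1.0), "p2": (0, 0.5), "p3": (2, 2.0)}
SMALL_DELAY = balance_decision(expected_finish=40, delay=5, alft=50)
LARGE_DELAY = balance_decision(expected_finish=40, delay=15, alft=50)
TARGET = pick_processor(10, NEIGHBORS)
```

A small delay that keeps the job within its ALFT leads to local balancing, a larger one triggers migration, and the migration target is the processor with the earliest predicted finish for the remaining work. Transfer cost for the migrated job is not modeled here.
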

Figure 5.1 The structure of adaptive semidynamic scheduling strategy with Load Balancing

The Critical-Path-on-a-Processor (CPOP) algorithm and the Heterogeneous Earliest Finish Time (HEFT) algorithm give more or less the same results: high performance and fast scheduling time. Their slight disadvantage is a high scheduling cost [4]. The Duplication-based Bottom-Up Scheduling (DBUS) algorithm uses both task insertion and task duplication, which helps to minimize the schedule length [6]. It does not impose any restriction on the number of task duplications, and task duplication mainly causes an increase in resource usage, which is a disadvantage of the algorithm. The ADOS algorithm has the highest potential for reducing the makespan of the task, which is the total amount of time from the start of the first job to the end of the last job. It also reduces resource usage. Its disadvantage is that the algorithm itself is complicated, which makes each iteration more complex when selecting the best schedule. The adaptive scheduling with load balancing strategy provides the best performance on both parameters of the grid workflow: it reduces the makespan of the jobs and makes effective use of resources. The concept provides a less complex static and dynamic scheduling strategy and a clear condition for when load balancing occurs, which reduces the scheduling time.

VII. CONCLUSION

In this paper, the scheduling of workflow applications in grids is addressed. Unlike many previous scheduling approaches for this class of applications, the semidynamic scheduling strategy takes into account both makespan and resource usage. The schedule achieves the two objectives by effectively combining a static heuristic scheduling scheme with a dynamic scheduling technique that includes load balancing. Based on the research and study conducted and the results obtained, the resource-usage-conscious scheduling scheme significantly improves resource utilization without sacrificing too much makespan. The load balancing strategy incorporated into scheduling helps to ensure the quality of schedules against performance fluctuations of grid resources.

Future work is the implementation of the proposed adaptive scheduling with load balancing concept for workflow applications in a grid environment. The results will be collected and compared, and a complete performance evaluation study will be conducted to determine the performance of the schedule on the grid platform.

REFERENCE

1. Y.C. Lee, R. Subrata, and A.Y. Zomaya, "On the Performance of a Dual-Objective Optimization Model for Workflow Applications on Grid Platforms," IEEE Trans. Parallel and Distributed Systems, vol. 20, no. 9, Sept. 2009.
2. J. Cao, S.A. Jarvis, S. Saini, and G.R. Nudd, "GridFlow: Workflow Management for Grid Computing," Proc. Third IEEE/ACM Int'l Symp. Cluster Computing and the Grid (CCGrid '03), pp. 198-205, 2003.
3. H. Casanova, "Simgrid: A Toolkit for the Simulation of Application Scheduling," Proc. First IEEE/ACM Int'l Symp. Cluster Computing and the Grid (CCGrid '01), pp. 430-437, 2001.
4. R. Wolski, "Dynamically Forecasting Network Performance Using the Network Weather Service," Proc. Sixth IEEE Int'l Symp. High Performance Distributed Computing (HPDC '97), pp. 316-325, 1997.
5. H. Topcuoglu, S. Hariri, and M. Wu, "Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing," IEEE Trans. Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, Mar. 2002.
6. D. Bozdag, U. Catalyurek, and F. Ozguner, "A Task Duplication Based Bottom-Up Scheduling Algorithm for Heterogeneous Environments," Proc. 19th Int'l Parallel and Distributed Processing Symp. (IPDPS '05), Apr. 2005.
7. Z. Yu and W. Shi, "An Adaptive Rescheduling Strategy for Grid Workflow Applications," Proc. 21st Int'l Parallel and Distributed Processing Symp. (IPDPS), 2007.
8. Y. Gil, V. Ratnakar, E. Deelman, G. Mehta, and J. Kim, "Wings for Pegasus: Creating Large-Scale Scientific Applications Using Semantic Representations of Computational Workflows," Proc. 19th Conf. Innovative Applications of Artificial Intelligence (IAAI '07), pp. 1767-1774, 2007.
9. M. Wieczorek, R. Prodan, and T. Fahringer, "Scheduling of Scientific Workflows in the ASKALON Grid Environment," ACM SIGMOD Record, vol. 34, no. 3, pp. 56-62, Sept. 2005.
10. G. Singh, E. Deelman, G. Mehta, K. Vahi, M.-H. Su, G.B. Berriman, J. Good, J.C. Jacob, D.S. Katz, A. Lazzarini, K. Blackburn, and S. Koranda, "The Pegasus Portal: Web Based Grid Computing," Proc. 20th Ann. ACM Symp. Applied Computing (SAC '05), pp. 680-686, 2005.
11. A. Mondal, K. Goda, and M. Kitsuregawa, "Effective Load Balancing via Migration and Replication in Spatial Grids," LNCS 2736, pp. 201-211, 2003.
12. B.A. Shirazi, A.R. Hurson, and K.M. Kavi, Scheduling and Load Balancing in Parallel and Distributed Systems. IEEE CS Press, 1995.
13. A.M. Dobber, G.M. Koole, and R.D. van der Mei, "Dynamic Load Balancing Experiments in a Grid," Proc. Fifth IEEE Int'l Symp. Cluster Computing and the Grid (CCGrid '05), pp. 123-130, 2005.

D. Daniel received the B.E. degree in Information Technology from Karunya University in 2009. He is currently pursuing his postgraduate degree at Karunya University, works on a project in parallel and distributed systems, and continues his research on adaptive scheduling techniques in grid computing.

D. Asir received the B.E. degree in Information Technology from Anna University in 2009. He is currently pursuing his postgraduate degree at Karunya University, works on a project in parallel and distributed systems, and continues his research on dynamic load balancing techniques in grid computing.

Mrs. S.P. Jeno Lovesum, Asst. Professor, completed her Master of Engineering (CSE) at Annamalai University, Chidambaram, and is doing her research in cloud computing.

A. Catherine Esther Karunya received the B.E. degree in Information Technology from Karunya University in 2009. She is currently pursuing her postgraduate degree at Karunya University, works on a project in computing security, and continues her research in networking.