The document discusses resource scheduling techniques for cloud computing including single processor scheduling algorithms, cloud scheduling approaches for multi-tenant systems like the Dominant Resource Fairness scheduler, and Hadoop schedulers like the fair scheduler and capacity scheduler. It also proposes a management model for analyzing elasticity of cluster capacity and job dependencies to enable data bursting between private and public clouds.
2. Scheduling levels
Multiple-core machine/OS (the operating
system decides the schedule across cores
simultaneously)
Single-core machine/OS (the operating
system decides the schedule)
The tasks of a single Hadoop job (the
cluster decides the schedule)
The tasks of multiple Hadoop jobs (the
cluster decides the schedule)
3. Scheduling goals
Good throughput or response time for
tasks (jobs)
High utilization of resources
4. Single processor scheduling
algorithms
Which task runs when?
First-In First-Out (FIFO)/FCFS (useful for batch
applications)
Shortest Task First (STF) – priority scheduling
(useful for batch applications)
Round-robin – fair (useful for interactive
applications)
Hybrid scheduling approaches (combining the
above algorithms in hierarchical schemes)
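The trade-off between the batch policies above shows up in average waiting time. A minimal sketch, assuming all tasks arrive at time 0; the burst times are illustrative:

```python
# Contrast FIFO and Shortest-Task-First (STF) on a single processor,
# assuming all tasks arrive at time 0. Burst times are illustrative.

def avg_wait(bursts):
    """Average waiting time when tasks run back-to-back in the given order."""
    wait, elapsed = 0, 0
    for burst in bursts:
        wait += elapsed        # this task waited for every task before it
        elapsed += burst
    return wait / len(bursts)

bursts = [6, 8, 3, 4]
print(avg_wait(bursts))          # FIFO (arrival order): 9.25
print(avg_wait(sorted(bursts)))  # STF (shortest first): 5.75
```

Running short tasks first minimizes the average wait, which is why STF suits batch workloads where burst lengths are known or estimable.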
5. Cloud scheduling for multi-
tenant systems
Cloud scheduling handles two types of jobs:
1. Jobs with single-resource requirements
(Hadoop schedulers)
2. Jobs with multi-resource requirements
(Dominant Resource Fairness scheduler, DRF)
6. Advantages of DRF
Generalizes to multiple jobs
Generalizes to more than two resource types,
such as CPU, RAM, network, and disk
Ensures that each job gets a fair share of
the resource type it demands most – hence
fairness
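The core DRF idea can be sketched as "progressive filling": repeatedly launch one task for the job with the smallest dominant share. The cluster capacities and per-task demands below are illustrative assumptions:

```python
# A minimal sketch of Dominant Resource Fairness (DRF) via progressive
# filling. Cluster capacities and per-task demands are illustrative.

def drf_schedule(capacity, demands, max_tasks):
    """Repeatedly launch a task for the job with the lowest dominant share."""
    used = {r: 0.0 for r in capacity}                      # cluster-wide usage
    alloc = {j: {r: 0.0 for r in capacity} for j in demands}
    tasks = {j: 0 for j in demands}

    def dominant_share(j):
        # Dominant share = largest fraction of any one resource the job holds.
        return max(alloc[j][r] / capacity[r] for r in capacity)

    for _ in range(max_tasks):
        # Jobs whose next task still fits in the remaining cluster capacity.
        fitting = [j for j in demands
                   if all(used[r] + demands[j][r] <= capacity[r]
                          for r in capacity)]
        if not fitting:
            break
        j = min(fitting, key=dominant_share)
        for r in capacity:
            used[r] += demands[j][r]
            alloc[j][r] += demands[j][r]
        tasks[j] += 1
    return tasks

# Two jobs on a 9-CPU / 18-GB cluster: job A tasks need <1 CPU, 4 GB>,
# job B tasks need <3 CPU, 1 GB>.
print(drf_schedule({"cpu": 9, "ram": 18},
                   {"A": {"cpu": 1, "ram": 4},
                    "B": {"cpu": 3, "ram": 1}}, 10))  # {'A': 3, 'B': 2}
```

Here A's dominant resource is RAM (12/18 = 2/3) and B's is CPU (6/9 = 2/3): both jobs end up with an equal dominant share, which is exactly the fairness property claimed above.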
7. Dominant Resource
Scheduler
Dominant Resource Fairness (DRF) can be used to:
1. Schedule VMs in a cluster.
2. Schedule Hadoop jobs in a cluster.
Also used in Mesos, an "operating system" intended
for cloud environments.
8. Hadoop YARN schedulers
1. Hadoop fair scheduler
2. Hadoop capacity scheduler (good for
hierarchical management)
9. Hadoop fair scheduler
Goal: all jobs get an equal share of resources.
Solution: divide the cluster into pools (typically one
pool per user) and divide resources equally among
pools, giving each user a fair share of the cluster.
Fair-share scheduling or FIFO/FCFS can be used
within each pool (configurable).
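The pool-level division can be sketched as handing out capacity one unit at a time to the pool currently holding the least, so small demands are fully met and the remainder goes to heavier users. Capacity is modeled here as task slots; the pool names and demands are illustrative assumptions:

```python
# A minimal sketch of fair-share division across per-user pools, assuming
# cluster capacity measured in task slots. Names and demands are illustrative.

def fair_shares(total_slots, demands):
    """Give slots one at a time to the pool with the fewest, skipping pools
    whose demand is already satisfied."""
    shares = {p: 0 for p in demands}
    for _ in range(total_slots):
        active = [p for p in demands if shares[p] < demands[p]]
        if not active:
            break                          # every pool's demand is met
        poorest = min(active, key=lambda p: shares[p])
        shares[poorest] += 1
    return shares

# 12 slots, three users: bob and carol get everything they asked for,
# and alice absorbs the rest.
print(fair_shares(12, {"alice": 10, "bob": 4, "carol": 2}))
# {'alice': 6, 'bob': 4, 'carol': 2}
```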
10. Hadoop capacity scheduler
This scheduler contains multiple queues; each
queue holds multiple jobs and is guaranteed a
portion of the cluster capacity.
Example:
Queue 1 is given 80% of the cluster for high-priority
jobs.
Queue 2 is given 20% of the cluster for less important
jobs.
Facts:
FIFO is typically used for jobs within the same queue.
The portion of cluster capacity should not be
fixed, so we need elasticity.
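The elasticity requirement can be sketched as: each queue first receives up to its guaranteed capacity, and any capacity it leaves idle flows to queues with unmet demand. Capacity is modeled as slots, the 80%/20% split mirrors the example above, and the demand figures are illustrative assumptions:

```python
# A minimal sketch of capacity-scheduler-style allocation with elasticity.
# Capacity is modeled as slots; demand figures are illustrative.

def capacity_allocate(total, guarantees, demands):
    """Each queue first gets min(demand, guaranteed capacity); spare
    capacity then flows elastically to queues with unmet demand."""
    alloc = {q: min(demands[q], int(total * guarantees[q]))
             for q in guarantees}
    spare = total - sum(alloc.values())
    for q in guarantees:                   # hand spare slots to hungry queues
        extra = min(spare, demands[q] - alloc[q])
        alloc[q] += extra
        spare -= extra
    return alloc

# Queue 1 (80% guarantee) is mostly idle, so Queue 2 elastically grows
# past its 20% guarantee.
print(capacity_allocate(100, {"q1": 0.8, "q2": 0.2},
                        {"q1": 30, "q2": 60}))
# {'q1': 30, 'q2': 60}
```

When both queues are saturated, the guarantees reassert themselves: with demands of 90 and 30 slots, the split returns to 80/20.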
11. HCS features
Needs elastic capacity analysis.
Needs elasticity-dependency analysis
for hierarchical models:
queues are hierarchical and contain child
sub-queues, so sibling sub-queues can share
resources equally.
12. Background: DaaS model for
cloud bursting
The elasticity of dataflow analysis during data
migration within hierarchical clouds can follow a
Data as a Service (DaaS) model:
• Runs in the private cloud
• Bursts into the public cloud
13. Proposal: A management model for
cluster capacity and job dependencies
[Diagram: components of the proposed model]
• Elasticity capacity analyzer
• Elasticity dependencies analyzer
• Elasticity management and control for
capacity and dependency scheduling
• DaaS monitoring
• Job tracker
• Scheduler