Scaling and scheduling to maximize application performance within budget constraints

Ming Mao, Marty Humphrey
CS Department, UVa
Scaling and Scheduling to Maximize Application Performance within Budget Constraints in Cloud Workflows
IPDPS 2013
(May 21st 2013)
1

2
 Dynamic scalability and cost savings are two of the most important factors when considering cloud adoption
Two major benefits - dynamic scalability and cost
A survey from 39 major technology companies [1]
 Cloud benefits
  On-demand self-service
  Broad network access
  Resource pooling
  Rapid elasticity
  Measured service
 Cheaper maintenance
 ……
Why do you move into the cloud?

3
 Dynamic scalability – the ability to acquire and release resources dynamically in response to demand
 Dynamic scalability challenge → users must decide the size of the resource pool themselves
  Over-provisioning → costs more than necessary and offsets the cloud's advantages
  Under-provisioning → hurts application performance, fails to meet service level agreements, and loses application customers
Cloud dynamic scalability
[Figure: over-provisioning vs. under-provisioning]

4
 Problem – What resources should be acquired or released in the cloud, and how should the computing activities be mapped to the cloud resources, so that application performance is maximized within the budget constraints?
  This paper discusses the limited-budget case
  The unlimited-budget case was discussed in our SC 11 paper
 Solution – This paper argues that an automatic resource provisioning and allocation mechanism, i.e., an auto-scaling solution, is the key to successful cloud adoption. Essentially, an auto-scaling solution needs to answer the following two questions:
  Capacity determination (or resource provisioning)
   what types of resources, how many, and for how long
  Job scheduling (or resource allocation)
   how to map computing activities onto the cloud resources
Problem statement

5
 An application consists of service components. A workflow goes through different
service components and therefore consists of multiple connected tasks
 Workload is a stream of workflow jobs not known in advance
 Task precedence constraints need to be preserved
 Jobs have individual priorities
Service oriented architecture (SOA) & workflow jobs

6
Minimize job turnaround time within budget constraints
Problem formulation
 Problem terminology
  Cloud application: $app = \{S_i\}$
  Job class: $J = \{DAG(S_i),\ priority_J \mid S_i \in app\}$
  Cloud VM: $VM_v = \{[J^{S_i}]_v,\ c_v,\ lag_v\}$
  Workload: $W_t = \{job_J^{S_i}\}$
  Scaling plan: $Scaling_t = \{VM_v \to N_v\}$
  Scheduling plan: $Schedule_t = \{j_J^{S_i} \to VM_v\}$
  Goal: $\min \left( \sum_{job} turnaround_{job} \times priority_{job} \,/\, \sum_{job} priority_{job} \right)$
  subject to $Cost(app) \le B$ (budget, dollars/hour)
 Target - The service provider has a limited budget and
aims to maximize the application performance.
 Solution idea – a monitor-control loop that
makes scaling and scheduling decisions based
on updated workload and VM information
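The goal on this slide is a priority-weighted average of job turnaround times, subject to an hourly cost cap. The sketch below only illustrates that arithmetic; the `Job` class and the function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    turnaround_hours: float  # completion time minus submission time
    priority: float          # larger value means a more important job


def weighted_turnaround(jobs):
    """Priority-weighted average turnaround: sum(t * p) / sum(p)."""
    total_priority = sum(j.priority for j in jobs)
    if total_priority <= 0:
        raise ValueError("total priority must be positive")
    return sum(j.turnaround_hours * j.priority for j in jobs) / total_priority


def within_budget(vm_counts, hourly_price, budget_per_hour):
    """Check Cost(app) <= B for a scaling plan {vm_type: count}."""
    cost = sum(n * hourly_price[vm] for vm, n in vm_counts.items())
    return cost <= budget_per_hour


# Two jobs: the high-priority job dominates the average.
jobs = [Job(turnaround_hours=2.0, priority=1.0),
        Job(turnaround_hours=4.0, priority=3.0)]
score = weighted_turnaround(jobs)  # (2*1 + 4*3) / (1 + 3) = 3.5
```

A lower score is better; a monitor-control loop would compare candidate plans by this score while rejecting any plan for which `within_budget` is false.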

7
 Scheduling-first
 Idea – allocate application budget to individual jobs based on priorities
and schedule tasks within job budget
  Step 1 – Distribute budget: $B_j = B \times p_j / \sum_j p_j$
  Step 2 – Schedule tasks
   for each job, schedule as many tasks as possible on their fastest machines
 Step 3 – Consolidate budget
 return job budget to the application
 the application uses the remaining budget collected from individual jobs to schedule
high priority tasks
 Step 4 – Acquire instance
 acquire instances and execute tasks based on the determined schedule plans
Solution: scheduling-first
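Step 1 of scheduling-first splits the application budget across jobs in proportion to their priorities. A minimal sketch of that split, assuming priorities are given as a dictionary (the function name `distribute_budget` is illustrative only):

```python
def distribute_budget(budget, priorities):
    """Split `budget` across jobs: B_j = B * p_j / sum of all p_j."""
    total = sum(priorities.values())
    if total <= 0:
        raise ValueError("total priority must be positive")
    return {job: budget * p / total for job, p in priorities.items()}


# Two jobs with equal priority share a $1/h budget evenly.
shares = distribute_budget(1.0, {"job1": 1, "job2": 1})
```

With equal priorities each job receives half of the budget; a job with three times the priority of another would receive three times its share.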

8
 Scheduling-first
  Step 1 – Distribute budget: $B_j = B \times p_j / \sum_j p_j$
  Step 2 – Schedule tasks
  Step 3 – Consolidate budget
  Step 4 – Acquire instance
Solution: scheduling-first
e.g. Budget(B) = $1/h; Large(L) = $0.5/h; Medium(M) = $0.3/h; Small(S) = $0.1/h
 Step 1: job1 and job2 have the same priority, so job1 → $0.5/h, job2 → $0.5/h
 Step 2: job1(T1) → $0.5(L); job2(T5) → $0.5(L)
 Step 3: job1(T2+T3) → $0.5(S+M); job2(T6) → $0.5(L); job1 returns $0.1 to system; job2(T7) → $0.1(S)
 Step 4: acquire instances when necessary
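The consolidation round in the example (job1 returns $0.1, which then pays for T7) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the per-task preferred VM types are invented so that the outcome matches the slide, a higher price is assumed to mean a faster machine, leftover tasks are assumed to already be in priority order, and prices are kept in integer cents to avoid rounding issues.

```python
PRICE_CENTS = {"L": 50, "M": 30, "S": 10}  # hourly prices from the example


def schedule_round(job_budgets, ready_tasks, preferred_type):
    """One round of Step 2 (schedule within job budget) and Step 3 (pool leftovers)."""
    plan = {}
    unscheduled = []  # tasks whose preferred VM did not fit the job budget
    pool = 0          # budget returned to the application
    for job, budget in job_budgets.items():
        for task in ready_tasks[job]:
            vm = preferred_type[task]
            if PRICE_CENTS[vm] <= budget:
                plan[task] = vm
                budget -= PRICE_CENTS[vm]
            else:
                unscheduled.append(task)
        pool += budget  # Step 3: the job returns what it did not spend
    for task in unscheduled:
        # Use the pooled budget on the fastest (assumed: priciest) affordable type.
        affordable = [vm for vm, price in PRICE_CENTS.items() if price <= pool]
        if affordable:
            vm = max(affordable, key=PRICE_CENTS.get)
            plan[task] = vm
            pool -= PRICE_CENTS[vm]
    return plan, pool


# Hypothetical preferred types chosen to mirror Step 3 of the example.
plan, left = schedule_round(
    job_budgets={"job1": 50, "job2": 50},
    ready_tasks={"job1": ["T2", "T3"], "job2": ["T6", "T7"]},
    preferred_type={"T2": "S", "T3": "M", "T6": "L", "T7": "M"},
)
```

Here job1 spends 40 cents on S+M and returns 10 cents, job2 spends its full 50 cents on L for T6, and the pooled 10 cents buys an S instance for T7.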

9
Solution: scaling-first
 Scaling-first
 Idea – determine the computing capacity by looking at the overall
workload and schedule tasks based on priority
 Step 1 – determine the VMs
 assume tasks run on their fastest machines and calculate the cost Cfast for the next
hour
 acquire VMs proportionally based on Budget/Cfast
 Step 2 – consolidate budget
   use the remaining budget to purchase new machines
 Step 3 – schedule tasks
 schedule tasks based on task priority

10
Solution: scaling-first
 Scaling-first
  Step 1 – determine the VMs
   assume tasks run on their fastest machines, calculate Cfast, and acquire VMs proportionally based on B/Cfast
  Step 2 – consolidate budget
   the remaining $0.5 can be used to purchase 1 L machine
  Step 3 – schedule tasks
   tasks are scheduled based on their priorities
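Steps 1 and 2 of scaling-first can be sketched as a proportional scale-down followed by spending the remainder. This is only one plausible reading of the slide: rounding down to whole machines and spending the remainder on the most expensive wanted type first are assumptions, the demand figures are invented, and prices are in integer cents.

```python
PRICE_CENTS = {"L": 50, "M": 30, "S": 10}  # hourly prices, as in the earlier example


def scaling_first(budget, wanted):
    """Return (acquired VM counts, unspent budget) for one hour.

    `wanted` is the number of VMs of each type needed if every task
    ran on its fastest machine; its total cost is Cfast.
    """
    c_fast = sum(n * PRICE_CENTS[vm] for vm, n in wanted.items())
    if c_fast <= budget:
        return dict(wanted), budget - c_fast
    # Step 1: scale each count by budget / c_fast, rounding down.
    acquired = {vm: n * budget // c_fast for vm, n in wanted.items()}
    remaining = budget - sum(n * PRICE_CENTS[vm] for vm, n in acquired.items())
    # Step 2: spend the remainder on still-wanted machines, priciest first.
    for vm in sorted(wanted, key=PRICE_CENTS.get, reverse=True):
        while acquired[vm] < wanted[vm] and PRICE_CENTS[vm] <= remaining:
            acquired[vm] += 1
            remaining -= PRICE_CENTS[vm]
    return acquired, remaining


# Invented demand: 3 L and 1 M wanted (Cfast = 180 cents), budget 100 cents.
acquired, remaining = scaling_first(100, {"L": 3, "M": 1})
```

In this run the proportional step buys one L machine and leaves 50 cents, which the consolidation step spends on a second L machine.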

11
 Instance consolidation
 Schedule tasks on different VM types to save partial instance hour cost
 Budget allocation schemes
 Evenly distributed – e.g. daily x/365, hourly x/8760
 Based on workload – e.g. high on busy times, low on non-busy times
 Workload prediction – $/hour → $/job
Other considerations
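The two budget allocation schemes can be compared with a short sketch. The function names and the load figures are hypothetical; the workload-based split simply weights each period by its expected load.

```python
def even_hourly_budget(annual_budget, hours_per_year=8760):
    """Evenly distributed scheme: the same budget for every hour (x / 8760)."""
    return annual_budget / hours_per_year


def workload_weighted_budget(period_budget, expected_load):
    """Workload-based scheme: busier intervals receive a larger share."""
    total = sum(expected_load)
    if total <= 0:
        raise ValueError("expected load must be positive")
    return [period_budget * load / total for load in expected_load]


hourly = even_hourly_budget(8760.0)                    # 1.0 dollar per hour
by_load = workload_weighted_budget(24.0, [1, 1, 2])    # the last interval is twice as busy
```

The even split is simplest to administer, while the weighted split concentrates spending on busy intervals at the cost of needing a load estimate.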

12
Evaluation – experiment setup
 Workload patterns and application models [figures]
 Time: 72 hours
 Task execution: randomly generated
 VM lag: 5 min
 Baseline: Standard
VM Type       Price
Micro         $0.02/hour
Standard      $0.080/hour
High-CPU      $0.66/hour
High-Memory   $0.45/hour
Extra-Large   $1.3/hour

13
Evaluation – job turnaround time
 above – weighted average job turnaround time for the hybrid application and cycle
workload pattern
  Scheduling-first and scaling-first can save 9.8%–45.2% of the cost compared to the standard machine choice.
 Scaling-first works better under small budget ranges while scheduling-first works better
under large budget ranges.

14
Evaluation – sensitivity to inaccurate parameters
 left – scheduling-first’s sensitivity to inaccurate parameters (Hybrid application + Cycle
workload pattern)
 right – scaling-first’s sensitivity to inaccurate parameters (Hybrid application + Cycle workload
pattern)
  When the estimation error is within ±20%, the job turnaround time differs by -10.2% to +16.7%.
  When the task estimation error reaches ±60%, the performance of both algorithms degrades significantly (more than ±25% difference).

15
Evaluation – instance consolidation
 left – job turnaround time / resource utilization of scheduling-first’s instance consolidation
(Hybrid application + Cycle workload pattern)
 right – job turnaround time / resource utilization of scaling-first’s instance consolidation
(Hybrid application + Cycle workload pattern)
  When the budget is low or high, the improvement is small. When the budget is in between, the improvement is more significant (e.g. the utilization rate improves by 2.2% to 19.9% when the budget is between $15/hour and $25/hour).
  Scaling-first benefits more from the instance consolidation process than scheduling-first.

16
 Conclusions
  Choose appropriate VM types based on the workload.
  Scheduling-first and scaling-first make different trade-offs between task execution time and waiting time.
 As long as the VM performance can be correctly ranked, the proposed mechanisms have
good tolerance to inaccurate parameters.
 Instance consolidation is an efficient strategy to save partial instance hours and improve
resource utilization.
 Future work
 Other billing models – reserved instances, spot instances, $/min
 Maximize application performance within budget constraints for data-intensive
applications
  Hybrid and federated cloud environments
 Develop evaluation benchmarks and simulation platforms
Conclusion and future work
