Refining the Estimation of the Available Bandwidth in Inter-Cloud Links for Task Scheduling
1. Refining the Estimation of
the Available Bandwidth in Inter-Cloud
Links for Task Scheduling
Thiago A. L. Genez, Luiz F. Bittencourt,
Nelson L. S. da Fonseca, Edmundo R. M. Madeira
Institute of Computing (IC)
University of Campinas (UNICAMP)
Campinas, SP, Brazil
December 10, 2014
IEEE GLOBECOM 2014
3. Introduction
Workflow Scheduling Problem in Hybrid Clouds
Peak demand time:
• Private resources → overloaded or insufficient
• Hybrid Cloud: Public resources + private resources
What are the advantages of using public clouds?
• Elasticity
• Pay-as-you-go basis
Workflow scheduling problem
4. Introduction
Current schedulers
Not designed to cope with imprecise information
Produce schedules without taking into account the variability of the
available bandwidth in inter-cloud links
Available bandwidth can increase or decrease at run time
Application execution can lead to:
• Violation of deadlines
5. Introduction
Purpose of this work
How can we reduce the negative impact of imprecise information about the
inter-cloud available bandwidth on the schedules produced by a
scheduler that was not designed to cope with such imprecise
information?
Challenge
Use the original scheduling algorithm
Proposed Mechanism
Deflate the estimate of the inter-cloud available bandwidth based on
the expected imprecision of that estimate, and provide the deflated
bandwidth estimate as input to the scheduler
9. Related Works
Rahman et al. (2010)
– Performance of the Amazon EC2 network
– Analysis of the packet delay of VMs to/from Amazon EC2
– Large delay variations
– Negatively impact the performance of scientific applications
Batista et al. (2010)
– Describe tools for estimating available bandwidth
– Produce estimations with large variability
11. Procedure for Deflating Estimates of the Available
Bandwidth in Inter-cloud Links
[Diagram: the available bandwidth estimation tool supplies the estimate of the available bandwidth in the hybrid cloud to the scheduler. The scheduler also receives the application workflow and deadline value and produces the schedule.]
12. Procedure for Deflating Estimates of the Available
Bandwidth in Inter-cloud Links
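[Diagram: the procedure sits between the available bandwidth estimation tool and the scheduler. It takes the estimate of the available bandwidth and the expected uncertainty value and passes a deflated available bandwidth estimate to the scheduler. The scheduler also receives the application workflow and deadline value and produces the schedule for the hybrid cloud.]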
13. Procedure for Deflating Estimates of the Available
Bandwidth in Inter-cloud Links
Procedure
Relies on a history of past executions of the target workflow
When a workflow is about to be scheduled:
1. Obtain the estimate of the available bandwidth
2. Obtain the expected uncertainty value
3. Query the history of past executions of the target workflow
4. Calculate the deflating factor U
e.g., U = 10 ⇒ the scheduler receives 90% of the estimate of the available bandwidth
The schedule produced thus accounts for the expected uncertainty of the
estimate of available bandwidth in inter-cloud links
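The deflation step can be sketched in a few lines of Python. This is a minimal illustration, assuming U is a percentage in [0, 100] (consistent with U = 10 keeping 90% of the estimate); the function name `deflate_bandwidth` is hypothetical and does not come from the slides.

```python
def deflate_bandwidth(estimate_mbps: float, u_percent: float) -> float:
    """Return the bandwidth value handed to the scheduler.

    u_percent is the deflating factor U expressed as a percentage
    (assumption), so U = 10 keeps 90% of the original estimate.
    """
    if not 0.0 <= u_percent <= 100.0:
        raise ValueError("U must lie in [0, 100]")
    return estimate_mbps * (1.0 - u_percent / 100.0)


# Example: a 60 Mbps estimate deflated with U = 10 gives roughly 54 Mbps.
deflated = deflate_bandwidth(60.0, 10.0)
print(deflated)
```

The scheduler itself is untouched: only its bandwidth input changes.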
16. Procedure for Deflating Estimates of the Available
Bandwidth in Inter-cloud Links
[Diagram: the same pipeline with a database added. Labels: database, available bandwidth estimation tool, observed available bandwidth value, expected uncertainty value, hybrid cloud, procedure, deflated available bandwidth, scheduler, application workflow and deadline value, schedule.]
21. Procedure for Deflating Estimates of the Available
Bandwidth in Inter-cloud Links
[Diagram: final build of the pipeline. Labels: database, available bandwidth estimation tool, estimate of the available bandwidth, expected uncertainty value, hybrid cloud, procedure, deflated available bandwidth, scheduler, application workflow and deadline value, schedule, untouched qualified solution.]
22. Procedure for Deflating Estimates of the Available
Bandwidth in Inter-cloud Links
Computation of the Deflating factor U for the Target
Workflow
Multiple Linear Regression: f(x, y) = ax + by + c
• x: Current estimate of the available bandwidth
• y: Current expected uncertainty
• Deflating factor U = f(x, y)
Computation of the coefficients a, b and c
Target workflow G: dataset H_G
• 5-tuple h_i = ⟨bw, p, U, error^m_G, error^$_G⟩
Subset H_k ⊆ H_G
• For each pair (bw, p) in H_G
• ⟨bw, p, U^m⟩ and ⟨bw, p, U^$⟩ are added into H_k
Subset H_k is used by the multiple linear regression
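The construction of H_k and the fit of f(x, y) = ax + by + c can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: it assumes U^m and U^$ are the U values with the smallest makespan error and the smallest cost error for a given (bw, p) pair, and it uses ordinary least squares via the normal equations. The names `build_hk`, `solve3` and `fit_mlr`, and the toy history, are hypothetical.

```python
from collections import defaultdict


def build_hk(history):
    """Build H_k from records (bw, p, U, makespan_error, cost_error).

    Assumption: for each (bw, p) pair, U^m is the U with the smallest
    makespan error and U^$ is the U with the smallest cost error.
    """
    groups = defaultdict(list)
    for bw, p, u, err_m, err_c in history:
        groups[(bw, p)].append((u, err_m, err_c))
    hk = []
    for (bw, p), rows in groups.items():
        u_m = min(rows, key=lambda r: r[1])[0]
        u_c = min(rows, key=lambda r: r[2])[0]
        hk.append((bw, p, u_m))
        hk.append((bw, p, u_c))
    return hk


def solve3(m, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system: need more varied (bw, p) samples")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                for c in range(col, n + 1):
                    a[r][c] -= factor * a[col][c]
    return [a[i][n] / a[i][i] for i in range(n)]


def fit_mlr(hk):
    """Least-squares fit of U = a*x + b*y + c, with x = bw and y = p."""
    sxx = sum(x * x for x, _, _ in hk)
    sxy = sum(x * y for x, y, _ in hk)
    syy = sum(y * y for _, y, _ in hk)
    sx = sum(x for x, _, _ in hk)
    sy = sum(y for _, y, _ in hk)
    sxu = sum(x * u for x, _, u in hk)
    syu = sum(y * u for _, y, u in hk)
    su = sum(u for _, _, u in hk)
    n = float(len(hk))
    matrix = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    return tuple(solve3(matrix, [sxu, syu, su]))


# Made-up toy history, for illustration only (not data from the slides).
toy_history = [
    (10.0, 50.0, 0.0, 5.0, 2.0),
    (10.0, 50.0, 25.0, 1.0, 3.0),
    (10.0, 50.0, 50.0, 4.0, 0.5),
    (60.0, 50.0, 0.0, 0.5, 0.2),
    (60.0, 50.0, 25.0, 2.0, 1.0),
    (60.0, 90.0, 0.0, 6.0, 4.0),
    (60.0, 90.0, 50.0, 1.5, 0.3),
]
a, b, c = fit_mlr(build_hk(toy_history))
u_for_new_run = a * 60.0 + b * 70.0 + c  # U = f(x, y) for bw = 60, p = 70
```

In practice the predicted U would probably need clamping to a valid range before it is used to deflate the estimate; the slides do not say how out-of-range values are handled.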
24. Evaluation
Experimental Parameters
Scheduler
• HCOC scheduling algorithm
Hybrid Cloud Scenario
• 1 private cloud and 2 public clouds
• Inter-cloud bandwidths of 10 to 60 Mbps
• Intra-cloud bandwidths of 1 Gbps
Simulator
• Estimates the makespan and cost of the execution of the workflow
26. Evaluation
Experimental Steps
1. History of execution was created
• Fixed bandwidth deflating factors U ∈ {0, 25, 50}
• p varying from 45% to 99%
• 100 simulations
2. Multiple linear regression (MLR) procedure
• f(x, y) = ax + by + c
• Fitted using 50% and 100% of the dataset
3. Use the equation f(x, y) to calculate the deflating factor U
• 100 simulations
27. Evaluation
[Figure: Montage DAG. Percentage of qualified solutions (y-axis, 40 to 100) versus uncertainty p (x-axis, 0 to 100) for U = 0, U = 25, U = 50, MLR 50% and MLR 100%. Inter-cloud available bandwidth of 60 Mbps; deadline D = Tmax × 3/7.]
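The plotted metric can be illustrated with a small sketch. It assumes that a "qualified solution" is a schedule whose makespan does not exceed the deadline D, and it treats Tmax as a given reference makespan; neither definition is spelled out on the slides. The function names and the sample makespans are hypothetical.

```python
def deadline_from_tmax(t_max: float) -> float:
    """Deadline used in the plotted scenario: D = Tmax * 3/7."""
    return t_max * 3.0 / 7.0


def percent_qualified(makespans, deadline):
    """Share (in %) of simulated runs whose makespan meets the deadline.

    Assumption: a run counts as qualified when makespan <= deadline.
    """
    if not makespans:
        raise ValueError("need at least one simulated makespan")
    met = sum(1 for m in makespans if m <= deadline)
    return 100.0 * met / len(makespans)


# Made-up makespans, for illustration only (not results from the slides).
d = deadline_from_tmax(70.0)
runs = [25.0, 28.0, 31.0, 40.0]
share = percent_qualified(runs, d)
print(d, share)
```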
28. Evaluation
[Figure: Montage DAG. Average makespan estimation (y-axis, 25 to 60) versus uncertainty p (x-axis, 0 to 100) for U = 0, U = 25, U = 50, MLR 50% and MLR 100%. Inter-cloud available bandwidth of 60 Mbps; deadline D = Tmax × 3/7.]
30. Final Considerations
Conclusion
Current schedulers
• Assume the estimated available bandwidth is precise at scheduling time
• Produce inefficient scheduling decisions
• Miss deadlines and increase cost and makespan beyond what was expected
The proposed procedure
• Deflates the estimate of the available bandwidth in inter-cloud links
• Multiple linear regression approach
• Increases the number of qualified solutions