Self-Adaptive SLA-Driven Capacity Management for Internet Services

This work considers the problem of hosting multiple third-party Internet services in a cost-effective manner so as to maximize a provider's business objective. For this purpose, we present a dynamic capacity management framework based on an optimization model, which links a cost model based on SLA contracts with an analytical queuing-based performance model, in an attempt to adapt the platform to changing capacity needs in real time. In addition, we propose a two-level SLA specification for different operation modes, namely normal and surge, which allows for per-use service accounting with respect to requirements on throughput and on the tail of the response time distribution. The proposed cost model is based on penalties, incurred by the provider due to SLA violations, and rewards, received when the service level expectations are exceeded. Finally, we evaluate approximations for predicting the performance of the hosted services under two different scheduling disciplines, namely FCFS and processor sharing. Through simulation, we assess the effectiveness of the proposed approach as well as the level of accuracy resulting from the performance model approximations.

  1. IEEE NOMS 2006, 6 April 2006. Self-Adaptive SLA-Driven Capacity Management for Internet Services. Bruno Abrahao, Virgilio Almeida, Jussara Almeida (Federal University of Minas Gerais, Brazil); Alex Zhang, Dirk Beyer, Fereydoon Safai (Hewlett-Packard Labs, Palo Alto, CA).
  2. Motivation
     • IT outsourcing for Internet services
       − Contracts with a provider
       − Multi-service shared Internet Data Centers (IDCs)
     • The provider's challenging task
       − Cost effectiveness while satisfying the customers' SLA requirements
     • Complexity
       − Keeping track of different application requirements, system characteristics, and simultaneous workload variations while also, and more importantly, considering the business goal of the provider
  3. Challenges
     • New customer demands
       − Multiple-metric requirements
       − Probabilistic performance requirements
       − Per-use service accounting
     • Application characteristics
       − High workload fluctuations
       − Unexpected workload peaks
       − Application heterogeneity
     • Manual management becomes impractical
     • Even more complex business and systems models
  4. Goal
     • To present a self-adaptive capacity management scheme for IDCs that aims at maximizing the provider's service revenue
       − Takes into account the new challenges of the modern IT business and infrastructure
       − Allows providers to offer customers flexible service plans
       − Minimizes management costs for service providers
  5. IDC Environment
     • Virtualization
     • VMs provide admission control mechanisms
     • Transparent and flexible capacity expansion/contraction
  6. Self-Adaptive Framework
     • Control interval
  7. Capacity Manager Scheme
     • Provides IDC configurations that maximize the business objective of the provider
  8. Cost Model
     • Allows per-use service accounting
       − Customers pay for capacity beyond what is normally required only when they need it
     • Service accounting
       − Based on the performance achieved by virtual machines rather than on resource utilization alone
  9. Cost Model
     • Allows probabilistic response time requirements: $P(R_i > R_i^{SLA}) \le \alpha_i$
     • Allows multiple-metric service levels
       − Throughput, subject to a guarantee on the response time of the processed transactions: $\{X \mid P(R > R^{SLA}) \le \alpha\}$
  10. Cost Model
      • Two-level SLA contracts
        − Normal operation mode
        − Surge operation mode
      • Figure labels: normal processing requirement, extra processing limit
      • Penalty/reward model
      • Provider's business objective: maximize the net result from the penalties and rewards
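
The two-level contract on the slide above can be illustrated with a small sketch. The slide does not give the payoff function, so the piecewise-linear form below, the function name `net_result`, and every parameter (`x_nsla`, `x_ssla`, `reward_rate`, `penalty_rate`) are assumptions made purely for illustration and should not be read as the authors' model.

```python
def net_result(throughput, demand, x_nsla, x_ssla, reward_rate, penalty_rate):
    """Hypothetical net payoff for one application over one control interval.

    throughput -- transactions/sec actually served within the SLA
    demand     -- transactions/sec offered by the customer
    x_nsla     -- normal processing requirement (normal mode level)
    x_ssla     -- extra processing limit (surge mode level), x_ssla >= x_nsla
    """
    # Assumed rule: the provider owes at most the normal level, and never
    # more than the customer actually asked for.
    required = min(demand, x_nsla)
    shortfall = max(0.0, required - throughput)
    # Assumed rule: throughput above the normal level is rewarded, but only
    # up to the surge limit (per-use accounting for extra capacity).
    extra = max(0.0, min(throughput, x_ssla) - x_nsla)
    return reward_rate * extra - penalty_rate * shortfall


print(net_result(80, 100, 100, 150, 1.0, 2.0))   # served below the normal level
print(net_result(130, 130, 100, 150, 1.0, 2.0))  # surge demand fully served
print(net_result(200, 200, 100, 150, 1.0, 2.0))  # reward capped at the surge limit
```

Summing such a term over all hosted applications would give one possible form of the "net result" that the provider tries to maximize.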
  11. Performance Model
      • Based on queuing theory
      • Inputs: capacity allocation decision, application and system characteristics, performance requirements, current workload intensity
      • Outputs: throughput, utilization, response time probability distribution
  12. Performance Model
      • Utilization and throughput can be estimated using well-known queuing-based formulas
      • Approximations are often needed to estimate the response time probability distribution
        − Markov: $P(R_i > R_i^{SLA}) \le \dfrac{E[R_i]}{R_i^{SLA}}$
        − Chebyshev: $P(R_i > R_i^{SLA}) \le \dfrac{\mathrm{var}[R_i]}{(R_i^{SLA} - E[R_i])^2}$
        − Percentile (M/M/1): $P(R_i > R_i^{SLA}) = e^{-R_i^{SLA} (f_i / E[S_i]) (1 - \rho_i)}$
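
The three approximations on the slide above can be compared numerically with a minimal sketch. It assumes an M/M/1 FCFS queue, for which the response time is exponential with mean 1/(mu - lambda), so var[R] = E[R]^2. The function name `mm1_tail_estimates` and the sample workload (950 transactions/sec, the whole server, a 0.1 sec target) are illustrative choices, not values taken from the experiments.

```python
import math


def mm1_tail_estimates(lam, mean_service, f, r_sla):
    """Estimate P(R > r_sla) for an M/M/1 queue that receives a fraction f
    of the server, using the three approximations listed on the slide."""
    mu = f / mean_service      # effective service rate, f_i / E[S_i]
    rho = lam / mu             # utilization
    if rho >= 1.0:
        raise ValueError("unstable queue: utilization must be below 1")
    mean_r = 1.0 / (mu - lam)  # E[R] for M/M/1
    var_r = mean_r ** 2        # var[R], since R is exponential in M/M/1 FCFS
    markov = min(1.0, mean_r / r_sla)
    # Chebyshev form from the slide; only meaningful when r_sla > E[R].
    if r_sla > mean_r:
        chebyshev = min(1.0, var_r / (r_sla - mean_r) ** 2)
    else:
        chebyshev = 1.0
    percentile = math.exp(-r_sla * mu * (1.0 - rho))
    return {"markov": markov, "chebyshev": chebyshev, "percentile": percentile}


est = mm1_tail_estimates(lam=950.0, mean_service=1e-3, f=1.0, r_sla=0.1)
for name, value in est.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the Markov estimate is the loosest of the three and the percentile formula the tightest, because the percentile formula is exact for an exponential response time while the other two are distribution-free bounds.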
  13. Optimization Model
      • Provider's business objective (cost model)
      • Performance model
      • Capacity allocation
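
How a cost model and a performance model can be linked in one optimization is sketched below for two applications sharing one server. This is a toy grid search under loud assumptions: the payoff rule (a flat reward per transaction when the tail requirement holds, a flat penalty otherwise), the names `tail_prob`, `payoff`, and `best_split`, and all numeric inputs are hypothetical, and the deck does not say which solver the authors use.

```python
import math


def tail_prob(f, lam, mean_service, r_sla):
    """M/M/1 percentile formula for P(R > r_sla) with capacity fraction f."""
    mu = f / mean_service
    if mu <= lam:
        return 1.0  # unstable queue: treat the requirement as violated
    return math.exp(-r_sla * (mu - lam))


def payoff(f, lam, mean_service, r_sla, alpha, reward, penalty):
    """Hypothetical payoff: reward per transaction when the tail requirement
    holds, penalty per transaction otherwise."""
    if tail_prob(f, lam, mean_service, r_sla) <= alpha:
        return reward * lam
    return -penalty * lam


def best_split(lams, mean_service, r_sla, alpha, reward, penalty, steps=100):
    """Grid search over f1 (with f2 = 1 - f1) for the highest total payoff."""
    best_f1, best_total = None, -math.inf
    for k in range(1, steps):
        f1 = k / steps
        total = (
            payoff(f1, lams[0], mean_service, r_sla, alpha, reward, penalty)
            + payoff(1.0 - f1, lams[1], mean_service, r_sla, alpha, reward, penalty)
        )
        if total > best_total:  # strict comparison keeps the smallest such f1
            best_f1, best_total = f1, total
    return best_f1, best_total


f1, total = best_split([600.0, 200.0], 1e-3, 0.1, 0.10, reward=1.0, penalty=2.0)
print(f1, total)
```

A self-adaptive manager would rerun a search of this kind at every control interval with the current workload intensities, which is what distinguishes it from a static configuration.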
  14. Experimental Analysis
      • Self-adaptive versus static configuration
        − Examine the resulting provider's payoff
        − Examine whether performance requirements are met and queue stability is maintained
      • Compare the degree of accuracy provided by each of the performance approximations
      • How
        − Simulate and analyze the behavior of two competing applications that receive different workload levels over time
  15. Experimental Analysis
      • Net result of the provider (M/M/1)
  16. Experimental Analysis
      • Queue size, M/M/1
      • Theoretical value: $Q_i = \dfrac{\rho_i^2}{1 - \rho_i} = \dfrac{(0.95)^2}{1 - 0.95} = 18.05$
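
The theoretical value on the slide above can be checked directly. In standard M/M/1 results, rho^2 / (1 - rho) is the mean number of customers waiting (excluding the one in service), while rho / (1 - rho) is the mean number in the system. The function names below are illustrative.

```python
def mm1_mean_queue_length(rho):
    """Mean number of customers waiting (excluding the one in service)
    in an M/M/1 queue with utilization rho."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return rho ** 2 / (1.0 - rho)


def mm1_mean_number_in_system(rho):
    """Mean number of customers in the system (waiting plus in service)."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return rho / (1.0 - rho)


print(round(mm1_mean_queue_length(0.95), 2))
print(round(mm1_mean_number_in_system(0.95), 2))
```

Both quantities grow without bound as the utilization approaches 1, which is why queue stability is one of the properties examined in the experiments.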
  17. Experimental Analysis
      • Response time, M/M/1
      • Requirement: $P(R > 0.1) \le 0.10$
  18. Experimental Analysis
      • Penalty/rewards, M/M/1
  19. Conclusions
      • The self-adaptive capacity management model, with any of the approximations, is able to
        − Increase the business potential of the provider (higher payoffs)
        − Maintain application stability (stable service queues, response time requirements satisfied)
      • Markov's approximation overestimates capacity needs
      • Chebyshev and Percentile result in an equivalent degree of precision in the M/M/1 model
      • Allows for the new challenges of the problem
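
The remark about Markov's approximation can be made concrete by solving each approximation for the smallest capacity slack, mu - lambda, at which it predicts that the requirement P(R > 0.1) <= 0.10 is met. The sketch below assumes an M/M/1 queue with E[R] = 1/(mu - lambda) and var[R] = E[R]^2. The function name `required_slack` is illustrative, and the closed forms are derived here from the slide formulas, not quoted from the deck.

```python
import math


def required_slack(r_sla, alpha):
    """Smallest mu - lambda (transactions/sec) for which each approximation
    predicts P(R > r_sla) <= alpha in an M/M/1 queue."""
    root = math.sqrt(alpha)
    return {
        # Markov: E[R] / r_sla <= alpha, with E[R] = 1 / (mu - lambda)
        "markov": 1.0 / (alpha * r_sla),
        # Chebyshev: var[R] / (r_sla - E[R])^2 <= alpha, with var[R] = E[R]^2
        "chebyshev": (1.0 + root) / (root * r_sla),
        # Percentile: exp(-r_sla * (mu - lambda)) <= alpha
        "percentile": math.log(1.0 / alpha) / r_sla,
    }


slack = required_slack(r_sla=0.1, alpha=0.10)
for name, value in slack.items():
    print(f"{name}: mu - lambda >= {value:.2f}")
```

Since the percentile formula is exact for an exponential response time, any slack above its value is capacity that the bound requests but the requirement does not need. Under these assumptions Markov's bound requests the most.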
  20. Time for questions
  21. Backup slides
  22. Experimental Analysis
      • Experimental setup
      • Two similar applications
      • Service demand: $E[S_i] = 10^{-3}$ sec
  23. Environment
      • Virtualization
      • Utilization = busy time / total time
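  24. Cost Model [equation involving Y; operators not recoverable]
  25. Cost Model [equation involving Y, X, and NSLA; operators not recoverable]
  26. Cost Model [equation involving Z, X, and NSLA; operators not recoverable]
  27. Cost Model [equation involving Z, X, SSLA, and NSLA; operators not recoverable]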
  28. Net result, M/M/1 and M/G/1 PS
  29. Experimental Analysis
      • Queue size, M/G/1 (PS)
      • Theoretical value: $Q_i = \dfrac{\rho_i^2}{1 - \rho_i} = \dfrac{(0.95)^2}{1 - 0.95} = 18.05$
  30. Experimental Analysis
      • Response time, M/G/1 (PS)
      • Requirement: $P(R > 0.1) \le 0.10$
  31. Experimental Analysis
      • Penalty/reward, M/G/1 (PS)