Presentation at the 16th IEEE International Conference on Autonomic Computing (ICAC 2019) by Vladimir Podolskiy.
Link to the preprint: https://www.researchgate.net/publication/332290963_Capacity-Driven_Scaling_Schedules_Derivation_for_Coordinated_Elasticity_of_Containers_and_Virtual_Machines
Abstract: With the growing complexity of microservice applications and proliferation of containers, scaling of cloud applications became challenging. Containers enabled the adaptation of the application capacity to the changing workload on the finer level of granularity than it was possible only with virtual machines. The common way to automate the adaptation of a cloud application is via autoscaling. Autoscaling is provided both on the level of virtual machines and containers. Its accuracy on dynamic workloads suffers significantly from the reactive nature of the available autoscaling solutions. The aim of the paper is to explore potential improvements of autoscaling by designing and evaluating several predictive-based autoscaling policies. These policies are naive (used as a baseline), best resource pair, only-Delta-load, always-resize, resize when beneficial. The scaling policies were implemented in Scaling Policy Derivation Tool (SPDT). SPDT takes the long-term forecast of the workload and the capacity model of microservices as input to produce the sequence of scaling actions scheduled for the execution in future with the aims to meet the service level objectives and minimize the costs. Policies implemented in SPDT were evaluated for three microservice applications and several workload patterns. The tests demonstrate that the combination of horizontal and vertical scaling enables more flexibility and reduces costs. Schedule derivation according to some policies might be compute-intensive, therefore careful consideration of the optimization objective (e.g. cost minimization or timeliness of the scaling policy) is required from the user of SPDT.
Capacity-Driven Scaling Schedules Derivation for Coordinated Elasticity of Containers and Virtual Machines
1. Capacity-Driven Scaling Schedules Derivation
for Coordinated Elasticity
of Containers and Virtual Machines
Yesika Ramirez (1, 2), Vladimir Podolskiy (1), Prof. Dr. Michael Gerndt (1)
(1) Technical University of Munich (TUM), Germany
Chair for Computer Architecture and Parallel Systems
http://www.caps.in.tum.de/en
(2) SAP AG, Germany
IEEE ICAC 2019
Umeå, Sweden, June 19th 2019
Full Paper
Resource management and cloud – 2
2. Cloud and IoT Systems Research Group @ TUM
Team
Research highlights
Key Publications
3. Vladimir Podolskiy (TUM) | Capacity-Driven Scaling Schedules Derivation for Coordinated Elasticity...
Team
Cloud and IoT Systems RG @ Chair of Computer Architecture and Parallel Systems
Prof. Dr.
Michael Gerndt
Ph.D. student
Vladimir Podolskiy
Ph.D. student
Anshul Jindal
+ Our incredible students!
Student Researcher
Harshit Chopra
4. Research Highlights (1)
• Research group has existed since Fall 2016
• Group is a ~spin-off~ of the HPC group of Prof. Gerndt, which has existed at TUM since 2000
• Research areas:
  Self-adaptive cloud (in particular, predictive autoscaling for VMs and apps)
  AI for Smart Cloud Operations (failure prediction in cloud)
  Self-adaptive IoT middleware
• Research Funding:
  German Academic Exchange Service (DAAD)
  German Federal Ministry of Education and Research (BMBF)
  BMW, AWS, Google
5. Research Highlights (2)
• Research Collaborations:
  ORCA Lab at the University of Waikato, New Zealand
  Software Engineering Group at the University of Würzburg, Germany
  Instana (Application Performance Management for Microservice Applications), USA
  Huawei's German Research Center, Germany
6. Key Publications
• [2019, SASO] Vladimir Podolskiy, Michael Mayo, Abigail Koay, Michael Gerndt, Panos Patros. Maintaining SLOs of Cloud-native Applications via Self-Adaptive Resource Sharing
• [2019, ICPE] Anshul Jindal, Vladimir Podolskiy, Michael Gerndt. Performance Modeling for Cloud Microservice Applications
• [2019, Int. Journal of Applied Mathematics and Computer Science, University of Zielona Góra, Poland] Vladimir Podolskiy, Anshul Jindal, Michael Gerndt. Multilayered Autoscaling Performance Evaluation: Can Virtual Machines and Containers Co-Scale?
• [2018, SASO] Vladimir Podolskiy, Anshul Jindal, Michael Gerndt, Yury Oleynik. Forecasting Models for Self-Adaptive Cloud Applications: A Comparative Study
• [2018, CLOUD] Vladimir Podolskiy, Anshul Jindal, Michael Gerndt. IaaS Reactive Autoscaling Performance Challenges
14. Motivation of the Study
The Downside of the Reactive Autoscaling
Predictive Autoscaling Pipeline
Research Problem
15. The Downside of the Reactive Autoscaling: Method
16. The Downside of the Reactive Autoscaling: Evaluation of AWS+Kubernetes
17. The Downside of the Reactive Autoscaling: Evaluation of AWS+Kubernetes
18. The Downside of the Reactive Autoscaling: Evaluation of Azure+Kubernetes
19. The Downside of the Reactive Autoscaling: Evaluation of Google Cloud Platform+Kubernetes
20. The Downside of the Reactive Autoscaling: Preliminary Work
[ to appear in a few days, journal paper ] Vladimir Podolskiy, Anshul Jindal, Michael Gerndt. Multilayered Autoscaling Performance Evaluation: Can Virtual Machines and Containers Co-Scale? International Journal of Applied Mathematics and Computer Science (AMCS).
[ 2018, conference proceedings ] Vladimir Podolskiy, Anshul Jindal, Michael Gerndt. IaaS Reactive Autoscaling Performance Challenges. In 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), San Francisco, CA, USA, 2018, pp. 954-957. DOI: 10.1109/CLOUD.2018.00144
[ 2018, conference proceedings ] Anshul Jindal, Vladimir Podolskiy, Michael Gerndt. Autoscaling Performance Measurement Tool. In Companion of the 2018 ACM/SPEC International Conference on Performance Engineering (ICPE '18). ACM, New York, NY, USA, pp. 91-92. DOI: 10.1145/3185768.3186293
[ 2017, conference proceedings ] Anshul Jindal, Vladimir Podolskiy, Michael Gerndt. Multilayered Cloud Applications Autoscaling Performance Estimation. In Proceedings of the 2017 IEEE 7th International Symposium on Cloud and Service Computing. IEEE, pp. 24-31. DOI: 10.1109/SC2.2017.12
22. Predictive Autoscaling Pipeline
[Diagram of the pipeline; recoverable labels are listed below]
Stages: Requests Forecasting; Capacity / Performance Modeling; Structural Modeling and Capacity Balancing; Scaling Schedule Derivation; Scaling Actions Execution
Inputs: Historical data; Monitored number of requests; Microservice application; SLOs for application; Budget / Further constraints
Intermediate results: Forecasted workload; Capacity models for pods; Balanced Application Graph; Scaling schedule; Scaling actions
Actors and interfaces: User; Monitoring API; Kubernetes / CSPs APIs
24. Research Problem
Question: How and when to scale under a dynamic workload so that SLOs are met and costs are minimized?
Goal: To derive scaling schedules that ensure, to a certain degree, that SLOs are met and costs are minimized
27. Workload Forecast*
Time series forecasting uses previously measured values of a variable to estimate future values of the same variable.
[Figure: WWWusage time series (variable over time) showing historical data and a forecast with prediction interval; produced by R package forecast version 8.2]
*In V. Podolskiy et al. Forecasting Models for Self-Adaptive Cloud Applications: A Comparative Study. SASO-2018.
28. Performance Profile*
A Performance Profile characterizes performance aspects of the given cloud-native application or individual containers; it also contains performance information on the infrastructure.
Microservice Capacity (MSC): the maximum number of requests per second (RPS) that a microservice can handle under a certain configuration with the given SLOs.
Example Profile fields:
  Application name
  Application type
  Resource Limits (CPU, Memory)
  Max. number of pod replicas
  Booting time
  Microservice Capacity, MSC
*In A. Jindal et al. Performance Modeling for Cloud Microservice Applications
29. Autoscaling Policy
• Describes how to adapt the system under certain conditions
• Governs how and when to add/remove resources in a cloud environment
• Should comply with the system constraints
30. Scaling Schedule
• Schedule defines a sequence of scaling actions
• Each scaling action describes a transition between states
• A state describes a deployment configuration (VMs, Services)
State:
    ---
    LaunchTime: "2018-11-01T06:56:54Z"
    Services:
      movieapp:
        Replicas: 3
        Cpu: 700m
        Memory: 700000000
    VMs:
      "t2.micro": 3
    ExpectedTime: "2018-11-01T07:00:00Z"
[Diagram: timeline of State 1, State 2, State 3, …, each state with its own set of VMs]
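The relation between states and scaling actions can be sketched in a few lines of Python. This is only an illustration, not SPDT's data model: the `State` class, the `diff` and `scaling_actions` helpers, and the second and third states are hypothetical; only the first state mirrors the example on the slide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class State:
    """A deployment configuration: pod replicas per service and VM counts per type."""
    expected_time: str        # ISO 8601 timestamp, as in the schedule example
    replicas: Dict[str, int]  # service name -> pod replicas
    vms: Dict[str, int]       # VM type -> number of VMs


def diff(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Signed change per key; positive means scale out, negative means scale in."""
    keys = sorted(set(before) | set(after))
    changes = {k: after.get(k, 0) - before.get(k, 0) for k in keys}
    return {k: d for k, d in changes.items() if d != 0}


def scaling_actions(states: List[State]) -> List[dict]:
    """One scaling action per pair of consecutive states."""
    return [
        {"at": nxt.expected_time,
         "pods": diff(cur.replicas, nxt.replicas),
         "vms": diff(cur.vms, nxt.vms)}
        for cur, nxt in zip(states, states[1:])
    ]


states = [
    State("2018-11-01T07:00:00Z", {"movieapp": 3}, {"t2.micro": 3}),
    # The two states below are invented for illustration.
    State("2018-11-01T08:00:00Z", {"movieapp": 5}, {"t2.micro": 3, "t2.small": 1}),
    State("2018-11-01T09:00:00Z", {"movieapp": 2}, {"t2.micro": 2}),
]
actions = scaling_actions(states)
```

Three states yield two actions: the first adds two pods and one t2.small VM, the second removes three pods and one VM of each type.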
33. Planning "How": Resource estimation
• Query application profiles
Estimation of Pods:
$L$ = Pod Limits (CPU, Mem)
$N_{Pods} = \frac{\text{Requests demand}}{\text{MaxServiceCapacity}_L}$
Estimation of VMs:
$\text{Allocable resource}_T = \text{Node Capacity} - \text{Reserved}$
$n_{VM_T} = \frac{\text{Number of Pods}}{\text{PodCapacity}_{VM_T}}$
Example: Node_001 (CPU: 2, Mem: 4 GB); pod limits (CPU: 0.5, Mem: 0.5 GB)
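A minimal Python sketch of the two estimations follows. It rests on several assumptions not stated on the slide: counts are rounded up to whole pods and VMs, the scarcest resource bounds how many pods fit into a VM, and resources are given in integer units (millicores, megabytes) to avoid floating-point division. The reserved amounts, the demand of 1000 RPS, and the capacity of 150 RPS per pod are invented; only the node size and pod limits come from the slide.

```python
import math


def pods_per_vm(node, reserved, limits):
    """Pods with the given limits that fit into one VM (all values in integer units)."""
    # Allocable resource = node capacity - reserved, per resource.
    allocable = {res: node[res] - reserved[res] for res in limits}
    # The scarcest resource bounds the pod count.
    return min(allocable[res] // limits[res] for res in limits)


def estimate(requests_demand, max_service_capacity, pod_capacity):
    """Pods needed for the demand, then VMs needed to host those pods."""
    n_pods = math.ceil(requests_demand / max_service_capacity)
    n_vms = math.ceil(n_pods / pod_capacity)
    return n_pods, n_vms


node = {"cpu_m": 2000, "mem_mb": 4000}      # Node_001: 2 CPU, 4 GB
limits = {"cpu_m": 500, "mem_mb": 500}      # pod limits: 0.5 CPU, 0.5 GB
reserved = {"cpu_m": 500, "mem_mb": 1000}   # hypothetical system reservation
cc = pods_per_vm(node, reserved, limits)
n_pods, n_vms = estimate(1000, 150, cc)
```

With these numbers CPU is the bottleneck: three pods fit per node, seven pods are needed, and they require three VMs.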
34. Planning "When"
[Diagram: transition from State 0 to State 1 for a given number of requests at time t; timing components shown: VM set booting time, pull Docker image]
35. Planning "When"
[Diagram: transition from State 0 to State 1 for a given number of requests at time t; timing components shown: pods booting time, VM set termination time]
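One way to read the scale-out case is that a scaling action must start early enough for the VMs to boot, the image to be pulled, and the pods to start before the forecasted load arrives. The Python sketch below assumes these delays simply add up, which the slides do not state explicitly. The individual durations are invented; they are chosen to sum to 186 seconds, the gap between LaunchTime and ExpectedTime in the Scaling Schedule example.

```python
from datetime import datetime, timedelta, timezone


def launch_time(expected_time, vm_boot_s=0, image_pull_s=0, pod_boot_s=0):
    """Latest moment to start a scale-out so that capacity is ready on time."""
    lead = timedelta(seconds=vm_boot_s + image_pull_s + pod_boot_s)
    return expected_time - lead


expected = datetime(2018, 11, 1, 7, 0, 0, tzinfo=timezone.utc)
# Hypothetical delays: 120 s VM boot, 40 s image pull, 26 s pod boot.
start = launch_time(expected, vm_boot_s=120, image_pull_s=40, pod_boot_s=26)
```

For a scale-in, the analogous inputs would be the pod booting time on the remaining VMs and the VM set termination time.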
36. Scaling policy I: Naïve
Pods: Horizontal
VMs: Horizontal
For a given time $t$ in the future:
Inputs: forecasted RPS, $R$; current VM type, $T$
Performance Profile: Microservice Capacity, $MSC_L$; max. pod replicas, $CC_T$
$m_L = \frac{R}{MSC_L}$
$n_T = \frac{m_L}{CC_T}$
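Applied to every point of a forecast, the two formulas give one (pods, VMs) pair per interval. The sketch below assumes both quotients are rounded up to whole units; the forecast values, the capacity of 150 RPS per pod, and the four pods per VM are hypothetical.

```python
import math


def naive_policy(forecast_rps, msc, cc):
    """Pods and VMs for one forecast point with fixed pod limits and VM type."""
    m = math.ceil(forecast_rps / msc)  # m_L = R / MSC_L, rounded up
    n = math.ceil(m / cc)              # n_T = m_L / CC_T, rounded up
    return m, n


forecast = [300, 620, 1000, 450]       # hypothetical RPS per interval
schedule = [naive_policy(r, msc=150, cc=4) for r in forecast]
```

Because neither the pod limits nor the VM type ever change, the quality of the result depends entirely on how well the initial configuration was chosen.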
37. Scaling policy II: Best Resource Pair
Instead of always using the same VM type, we try to identify the VM type that:
• can host the number of pods computed for the forecasted workload under the given resource limits
• is the cheapest among all combinations
[Diagram: VM types T2.small, T2.medium, T2.large; pod limits (Cpu: 0.2, Mem: 0.2), (Cpu: 0.5, Mem: 0.5), (Cpu: 1, Mem: 1), each with its MSC = Max Service Capacity]
38. Scaling policy II: Best Resource Pair
Pods & VMs: One time vertical then horizontal
For a given time $t$ in the future:
Inputs: forecasted RPS, $R$; various VM types, $T_i$; VM types prices, $P_{T_i}$
Performance Profile: Microservice Capacity, $MSC_L$; max. pod replicas, $CC_T$
$m_L = \frac{R}{MSC_L}$
$n_{T_i} = \frac{m_L}{CC_{T_i}}$
$PS(T_i) = n_{T_i} \cdot P_{T_i}$
Select the cheapest VM set
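The selection step amounts to pricing a homogeneous VM set for every candidate type and taking the minimum. A rough Python sketch, assuming rounding up to whole pods and VMs; the per-type pod capacities and hourly prices are invented for illustration.

```python
import math


def best_resource_pair(forecast_rps, msc, vm_types):
    """vm_types maps a VM type name to (CC, price per hour)."""
    m = math.ceil(forecast_rps / msc)         # m_L
    options = []
    for name, (cc, price) in vm_types.items():
        n = math.ceil(m / cc)                 # n_{T_i}
        options.append((n * price, name, n))  # PS(T_i) = n_{T_i} * P_{T_i}
    cost, name, n = min(options)              # cheapest VM set
    return {"pods": m, "vm_type": name, "vms": n, "cost": cost}


# Hypothetical capacities (pods per VM) and prices.
vm_types = {"t2.small": (2, 0.023), "t2.medium": (5, 0.0464), "t2.large": (10, 0.0928)}
choice = best_resource_pair(1000, msc=150, vm_types=vm_types)
```

Here seven pods are needed; four t2.small VMs are marginally cheaper than two t2.medium or one t2.large.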
39. Scaling policy III: Only-Delta-Load
Pods: Horizontal
VMs: Horizontal & Heterogeneous
For a given time $t$ in the future:
Inputs: current replicas, $M_L$; forecasted RPS, $R$
Performance Profile: Microservice Capacity, $MSC_L$; max. pod replicas, $CC_T$
$\Delta = R - M_L \cdot MSC_L$
$\Delta > 0$?
  NO: removing extra pods/VMs
  YES: $m_L = \frac{R}{MSC_L}$
$m_L - CC_T \cdot N_T < 0$?
  YES: done
  NO: continue with the VM estimation on the next slide
40. Scaling policy III: Only-Delta-Load (cont…)
Pods: Horizontal
VMs: Horizontal & Heterogeneous
For a given time $t$ in the future:
Inputs: current VMs, $N_T$; various VM types, $T_i$; VM types prices, $P_{T_i}$
Performance Profile: Microservice Capacity, $MSC_L$; max. pod replicas, $CC_T$
$\delta = m_L - CC_T \cdot N_T$
Schedule pods that can be scheduled → $\delta_1$
$n_{T_i} = \frac{\delta_1}{CC_{T_i}}$
$PS(T_i) = n_{T_i} \cdot P_{T_i}$
Select the cheapest VM set
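The flowchart spread over the two Only-Delta-Load slides can be approximated as follows. This sketch makes some simplifying assumptions: counts are rounded up, the pods left over after filling the current VMs are taken as $\delta_1$, and the case where the pods fit exactly is treated like the "done" branch. Function and field names, capacities, and prices are hypothetical.

```python
import math


def only_delta_load(forecast_rps, msc, current_pods, current_vms, cc_current, vm_types):
    """Plan one forecast point; vm_types maps a VM type name to (CC, price)."""
    delta = forecast_rps - current_pods * msc   # Delta = R - M_L * MSC_L
    if delta <= 0:
        return {"action": "remove extra pods/VMs"}
    pods = math.ceil(forecast_rps / msc)        # m_L, rounded up
    unplaced = pods - cc_current * current_vms  # delta = m_L - CC_T * N_T
    if unplaced <= 0:                           # current VMs can host all pods
        return {"action": "add pods", "pods": pods}
    # Price every VM type for the unplaced pods only and keep the cheapest set.
    cost, name, n = min(
        (math.ceil(unplaced / cc) * price, name, math.ceil(unplaced / cc))
        for name, (cc, price) in vm_types.items()
    )
    return {"action": "add pods and VMs", "pods": pods,
            "extra_vms": {name: n}, "cost": cost}


vm_types = {"t2.small": (2, 0.023), "t2.medium": (5, 0.0464), "t2.large": (10, 0.0928)}
plan = only_delta_load(1000, msc=150, current_pods=4, current_vms=1,
                       cc_current=5, vm_types=vm_types)
```

In this example the existing VM hosts five of the seven pods and one t2.small is added for the remaining two, so the resulting VM set becomes heterogeneous.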
41. Scaling policy IV: Always resize
Pods: Hybrid
VMs: Hybrid
For a given time $t$ in the future:
Inputs: forecasted RPS, $R$; various VM types, $T_i$; VM types prices, $P_{T_i}$
Performance Profiles: Microservice Capacity, $MSC_L$; max. pod replicas, $CC_T$
$m_{L_i} = \frac{R}{MSC_{L_i}}$
$R(A/P)_i = \frac{m_{L_i} \cdot L_i}{MSC_{L_i}}$
Select profile $j$ with the smallest ratio. The Allocation / Performance Ratio makes it possible to select the profile with the highest MSC and the smallest resource usage.
$m_{L_j} = \frac{R}{MSC_{L_j}}$
$n_{T_i} = \frac{m_{L_j}}{CC_{T_i}}$
$PS(T_i) = n_{T_i} \cdot P_{T_i}$
Select the cheapest VM set
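A possible reading of this policy in Python is shown below. It relies on assumptions the slide leaves open: the limit $L_i$ is reduced to a single number (the CPU limit in millicores), the pod capacity of a VM type is derived from CPU alone, and counts are rounded up. All profiles, VM sizes, and prices are invented.

```python
import math


def always_resize(forecast_rps, profiles, vm_types):
    """profiles: list of (cpu_limit_m, msc); vm_types: name -> (cpu_m, price)."""
    def ratio(profile):
        limit, msc = profile
        m = math.ceil(forecast_rps / msc)         # m_{L_i}
        return m * limit / msc                    # allocation / performance ratio

    limit, msc = min(profiles, key=ratio)         # profile j with the smallest ratio
    m = math.ceil(forecast_rps / msc)             # m_{L_j}
    options = []
    for name, (vm_cpu, price) in vm_types.items():
        cc = vm_cpu // limit                      # pods per VM, assumed CPU-bound
        if cc > 0:
            n = math.ceil(m / cc)                 # n_{T_i}
            options.append((n * price, name, n))  # PS(T_i)
    cost, name, n = min(options)                  # cheapest VM set
    return {"cpu_limit_m": limit, "pods": m, "vm_type": name, "vms": n, "cost": cost}


profiles = [(200, 40), (500, 150), (1000, 260)]   # hypothetical (limit, MSC) pairs
vm_types = {"t2.small": (1000, 0.023), "t2.medium": (2000, 0.0464),
            "t2.large": (2000, 0.0928)}
plan = always_resize(1000, profiles, vm_types)
```

Unlike Only-Delta-Load, this computation ignores the current deployment, so every forecast point may lead to different pod limits and a different VM type.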
42. Scaling policy V: Resize When Beneficial
Pods: Hybrid
VMs: Hybrid
For a given time $t$ in the future:
Inputs: current replicas, $M_L$; current VMs, $N_T$; forecasted RPS, $R$; various VM types, $T_i$; VM types prices, $P_{T_i}$
Performance Profile: Microservice Capacity, $MSC_L$; max. pod replicas, $CC_T$
Only-Delta-Load: VM set $S_1$
Always Resize: VM set $S_2$
$C_1 = n_{S_1} \cdot P_{T_{S_1}} \cdot \Delta t + \Delta C_1$
$C_2 = n_{S_2} \cdot P_{T_{S_2}} \cdot \Delta t + \Delta C_2$
$C_1 < C_2$?
  YES: select $S_1$
  NO: select $S_2$
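The comparison can be illustrated with a short Python sketch. It interprets $\Delta C$ as a one-off transition cost, which is an assumption; the candidate sets and all numbers are made up.

```python
def resize_when_beneficial(s1, s2, duration_h):
    """Each candidate is a dict with 'vms', 'price' (per VM-hour) and 'transition_cost'."""
    def total_cost(s):
        # C = n_S * P_{T_S} * delta_t + delta_C
        return s["vms"] * s["price"] * duration_h + s["transition_cost"]

    c1, c2 = total_cost(s1), total_cost(s2)
    return ("S1", c1) if c1 < c2 else ("S2", c2)


# Hypothetical candidates: S1 (Only-Delta-Load) is cheap to reach but costlier to run,
# S2 (Always Resize) is cheaper to run but has a higher one-off transition cost.
s1 = {"vms": 3, "price": 0.05, "transition_cost": 0.01}
s2 = {"vms": 2, "price": 0.06, "transition_cost": 0.10}
short = resize_when_beneficial(s1, s2, duration_h=1)
long = resize_when_beneficial(s1, s2, duration_h=10)
```

With these numbers the break-even point is three hours: for a one-hour interval $S_1$ wins, for a ten-hour interval the resize to $S_2$ pays off.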
46. Adaptivity of SPDT
Perfect World: forecast is 100% accurate
[Plot: capacity over time]
47. Adaptivity of SPDT
Perfect World: forecast is 100% accurate
[Plots: capacity over time]
When the workload forecast is updated:
• update the scaling actions of the policy;
• trigger invalidation of scheduled states;
• schedule new states.
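The update steps listed for a changed forecast could look roughly like this in Python. The sketch is a strong simplification: a state is reduced to a pod count per time slot, and the `reschedule` function, its `tolerance` parameter, and all values are hypothetical rather than part of SPDT.

```python
import math


def reschedule(schedule, new_forecast, msc, tolerance=0):
    """schedule maps a time slot to pods; new_forecast maps a time slot to RPS."""
    updated = dict(schedule)
    invalidated = []
    for slot, rps in new_forecast.items():
        needed = math.ceil(rps / msc)
        if abs(needed - schedule.get(slot, 0)) > tolerance:
            invalidated.append(slot)  # scheduled state no longer matches the forecast
            updated[slot] = needed    # schedule a new state for this slot
    return updated, invalidated


schedule = {"07:00": 3, "08:00": 5, "09:00": 2}
new_forecast = {"08:00": 700, "09:00": 900}  # updated RPS forecast
updated, invalidated = reschedule(schedule, new_forecast, msc=150)
```

Only the 09:00 state is invalidated here, since the updated forecast for 08:00 still requires five pods.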
56. Conclusions
• Scaling on both pods and VMs is more beneficial than scaling on each layer separately
• Success of the Naive policy requires a good understanding of the application's performance to select the initial deployment configuration accurately
• The Only-Delta-Load policy has the shortest transition time. It is ideal for quick adaptation to changes, although expensive in the long run due to configuration fragmentation
• Migration between VM types is beneficial if its cost can be mitigated by keeping the new configuration for a sufficiently long period