Capacity-Driven Scaling Schedules Derivation for Coordinated Elasticity of Containers and Virtual Machines

Capacity-Driven Scaling Schedules Derivation
for Coordinated Elasticity
of Containers and Virtual Machines
Yesika Ramirez (1,2), Vladimir Podolskiy (1), Prof. Dr. Michael Gerndt (1)
(1) Technical University of Munich (TUM), Germany
Chair for Computer Architecture and Parallel Systems
http://www.caps.in.tum.de/en
(2) SAP AG, Germany
IEEE ICAC 2019
Umeå, Sweden, June 19th 2019
Full Paper
Resource management and cloud – 2

Cloud and IoT Systems Research Group @ TUM
Team
Research highlights
Key Publications

Vladimir Podolskiy (TUM) | Capacity-Driven Scaling Schedules Derivation for Coordinated Elasticity...
Team
Cloud and IoT Systems RG @ Chair of Computer Architecture and Parallel Systems
Prof. Dr. Michael Gerndt
Ph.D. student: Vladimir Podolskiy
Ph.D. student: Anshul Jindal
Student Researcher: Harshit Chopra
+ Our incredible students!

• The research group has existed since Fall 2016
• The group is a spin-off of Prof. Gerndt's HPC group, which has existed at TUM since 2000
• Research areas:
  - Self-adaptive cloud (in particular, predictive autoscaling for VMs and apps)
  - AI for Smart Cloud Operations (failure prediction in cloud)
  - Self-adaptive IoT middleware
• Research Funding:
  - German Academic Exchange Service (DAAD)
  - German Federal Ministry of Education and Research (BMBF)
  - BMW, AWS, Google
Research Highlights (1)

• Research Collaborations:
  - ORCA Lab at the University of Waikato, New Zealand
  - Software Engineering Group at the University of Würzburg, Germany
  - Instana (Application Performance Management for Microservice Applications), USA
  - Huawei’s German Research Center, Germany
Research Highlights (2)

• [2019, SASO] Vladimir Podolskiy, Michael Mayo, Abigail Koay, Michael Gerndt, Panos
Patros. Maintaining SLOs of Cloud-native Applications via Self-Adaptive Resource Sharing
• [2019, ICPE] Anshul Jindal, Vladimir Podolskiy, Michael Gerndt. Performance Modeling for
Cloud Microservice Applications
• [2019, Int. Journal of Applied Mathematics and Computer Science, University of
Zielona Góra, Poland] Vladimir Podolskiy, Anshul Jindal, Michael Gerndt. Multilayered
Autoscaling Performance Evaluation: Can Virtual Machines and Containers Co-Scale?
• [2018, SASO] Vladimir Podolskiy, Anshul Jindal, Michael Gerndt, Yury Oleynik. Forecasting
Models for Self-Adaptive Cloud Applications: A Comparative Study
• [2018, CLOUD] Vladimir Podolskiy, Anshul Jindal, Michael Gerndt. IaaS Reactive
Autoscaling Performance Challenges
Key Publications

Capacity-Driven Scaling Schedules Derivation
for Coordinated Elasticity
of Containers and Virtual Machines

• Background:
  - Autoscaling
• Motivation of the Study
• Theoretical Framework:
  - Terms
  - Building blocks of an autoscaling policy
  - Autoscaling policies for predictive autoscaling
• Scaling Policy Derivation Tool (SPDT)
• Evaluation of Autoscaling Policies
• Conclusions
Contents

Scaling Types
Application scaling of a deployed cloud application:
• Manual Scaling
• Autoscaling

Autoscaling Types
• Reactive
• Scheduled
• Predictive

Motivation of the Study
The Downside of the Reactive Autoscaling
Predictive Autoscaling Pipeline
Research Problem

The Downside of the Reactive Autoscaling: Method

The Downside of the Reactive Autoscaling:
Evaluation of AWS+Kubernetes

Evaluation of AWS+Kubernetes

Evaluation of Azure+Kubernetes

Evaluation of Google Cloud Platform+Kubernetes

The Downside of the Reactive Autoscaling: Preliminary Work
[ to appear in a few days, journal paper ] Vladimir Podolskiy, Anshul Jindal,
Michael Gerndt. Multilayered Autoscaling Performance Evaluation: Can Virtual
Machines and Containers Co-Scale? International Journal of Applied Mathematics
and Computer Science (AMCS).
[ 2018, conference proceedings ] Vladimir Podolskiy, Anshul Jindal, Michael
Gerndt. IaaS Reactive Autoscaling Performance Challenges. In 2018 IEEE 11th
International Conference on Cloud Computing (CLOUD), San Francisco, CA, USA,
pp. 954-957. DOI: 10.1109/CLOUD.2018.00144
[ 2018, conference proceedings ] Anshul Jindal, Vladimir Podolskiy, Michael
Gerndt. Autoscaling Performance Measurement Tool. In Companion of the
2018 ACM/SPEC International Conference on Performance Engineering (ICPE '18).
ACM, New York, NY, USA, pp. 91-92. DOI: 10.1145/3185768.3186293
[ 2017, conference proceedings ] Anshul Jindal, Vladimir Podolskiy, Michael
Gerndt. Multilayered Cloud Applications Autoscaling Performance Estimation.
In Proceedings of the 2017 IEEE 7th International Symposium on Cloud and Service
Computing. IEEE, pp. 24-31. DOI: 10.1109/SC2.2017.12

[Figure: predictive autoscaling pipeline.
Components: Microservice application, Monitoring API, Requests Forecasting, Capacity / Performance Modeling, Structural Modeling and Capacity Balancing, Scaling Schedule Derivation, Scaling Actions Execution, Kubernetes/CSPs APIs.
Data exchanged: historical data, monitored number of requests, forecasted workload, capacity models for pods, balanced application graph, scaling schedule, scaling actions.
User inputs: SLOs for application, budget / further constraints.]

Question: how and when should the application be scaled under a dynamic workload
so that SLOs are met and costs are minimized?
Goal: derive scaling schedules that ensure, to a certain degree,
that SLOs are met and costs are minimized.
Research Problem

Theoretical Framework
Terms
Building Blocks of an Autoscaling Policy
Autoscaling Policies for Predictive Autoscaling

I/O
Scaling
Schedule
Derivation
INPUTS:
• Workload Forecast
• Autoscaling Policy
• Performance Profile
OUTPUT:
• Scaling Schedule

Workload Forecast*
Time series forecasting uses previously measured values of a variable to estimate
future values of the same variable.
[Figure: historical data and forecast with prediction interval; plotted variable: WWWusage; x-axis: Time; produced by the R package forecast, version 8.2]
*In V. Podolskiy et al. Forecasting Models for Self-Adaptive Cloud Applications: A Comparative Study. SASO-2018.
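
As a rough illustration of the idea, the sketch below forecasts with simple exponential smoothing in Python. This is a stand-in technique chosen for brevity, not one of the models from the cited study and not the R forecast package used for the figure. The function name, the `alpha` parameter, and the sample values are assumptions.

```python
from typing import List, Sequence


def ses_forecast(history: Sequence[float], alpha: float = 0.5,
                 horizon: int = 3) -> List[float]:
    """Simple exponential smoothing: a flat forecast at the last smoothed level."""
    if not history:
        raise ValueError("history must contain at least one value")
    level = float(history[0])
    for y in history[1:]:
        # New level = weighted mix of the latest observation and the old level.
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


# Hypothetical requests-per-second measurements.
forecast = ses_forecast([100, 120, 140], alpha=0.5, horizon=3)
print(forecast)  # [125.0, 125.0, 125.0]
```

A flat forecast is the simplest possible output; models with trend or seasonality would produce a curve plus a prediction interval, as in the figure.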

Performance Profile*
A performance profile characterizes the performance aspects of a given cloud-native
application or of its individual containers. It also contains performance information
about the infrastructure.
Microservice Capacity (MSC): the maximum number of requests per second (RPS) that a
microservice can handle under a certain configuration while meeting the given SLOs.
Example Profile:
• Application name
• Application type
• Resource limits (CPU, memory)
• Max. number of pod replicas
• Booting time
• Microservice Capacity, MSC
*In A. Jindal et al. Performance Modeling for Cloud Microservice Applications
• Describes how to adapt the system under certain conditions
• Governs how and when to add/remove resources in a cloud environment
• Should comply with the system constraints
Autoscaling Policy

Scaling Schedule
• A schedule defines a sequence of scaling actions
• Each scaling action describes a transition between states
• A state describes a deployment configuration (VMs, Services)
Example state:
---
LaunchTime: "2018-11-01T06:56:54Z"
Services:
  movieapp:
    Replicas: 3
    Cpu: 700m
    Memory: 700000000
VMs:
  "t2.micro": 3
ExpectedTime: "2018-11-01T07:00:00Z"
[Figure: timeline of states (State 1, State 2, State 3, …), each with its own set of VMs]
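
The relation between states, scaling actions, and the schedule can be sketched in Python as follows. This is a minimal illustration, not SPDT's data model: the class and function names are hypothetical, and the first state in the usage part (including its timestamps) is made up. Only the second state mirrors the example above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ServiceConfig:
    replicas: int
    cpu: str      # e.g. "700m"
    memory: int   # bytes


@dataclass
class State:
    launch_time: str      # when the transition into this state starts
    expected_time: str    # when the state should be in effect
    services: Dict[str, ServiceConfig] = field(default_factory=dict)
    vms: Dict[str, int] = field(default_factory=dict)   # VM type -> count


def vm_delta(prev: State, nxt: State) -> Dict[str, int]:
    """One scaling action, expressed as the change in VM counts per type."""
    types = set(prev.vms) | set(nxt.vms)
    delta = {t: nxt.vms.get(t, 0) - prev.vms.get(t, 0) for t in types}
    return {t: d for t, d in delta.items() if d != 0}


def schedule_actions(states: List[State]) -> List[Dict[str, int]]:
    """A schedule is the sequence of transitions between consecutive states."""
    return [vm_delta(a, b) for a, b in zip(states, states[1:])]


# Hypothetical earlier state, followed by the state from the example.
s1 = State("2018-11-01T06:00:00Z", "2018-11-01T06:03:00Z",
           {"movieapp": ServiceConfig(1, "200m", 200000000)}, {"t2.small": 1})
s2 = State("2018-11-01T06:56:54Z", "2018-11-01T07:00:00Z",
           {"movieapp": ServiceConfig(3, "700m", 700000000)}, {"t2.micro": 3})
actions = schedule_actions([s1, s2])
```

Here the single action removes one `t2.small` and adds three `t2.micro` VMs. Changes to service replicas and limits could be diffed in the same way.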

Building Blocks of an Autoscaling Policy
• Scaling Indicator: CSP Perspective, User Perspective
• Scaling Timing: Reactive, Proactive
• Virtualization Level: Virtual Machines, Containers
• Scaling Method: Vertical, Horizontal; Homogeneous, Heterogeneous
• Resource Estimation: Rule-based, Application Profiling, Analytical modeling
• Pricing Model: On Demand, Reserved
• Adaptivity: Non-Adaptive, Self-Adaptive

Planning "How": Resource estimation
• Query application profiles
Estimation of Pods:
  $L$ = Pod Limits (CPU, Mem)
  $N_{Pods} = \frac{\text{Requests demand}}{\text{MaxServiceCapacity}_L}$
Estimation of VMs:
  $\text{Allocable resource}_T = \text{Node Capacity} - \text{Reserved}$
  $nVM_T = \frac{\text{Number of Pods}}{\text{PodCapacity}_{VM_T}}$
Example: Node_001 with CPU: 2, Mem: 4 GB; pod limits CPU: 0.5, Mem: 0.5 GB
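
The two estimation steps can be tried out with a short Python sketch. Several parts are assumptions rather than statements from the slides: results are rounded up to whole pods and VMs, the pod capacity of a VM is taken as the minimum over CPU and memory, reserved resources default to zero, and the request demand and service capacity are made-up numbers. Integer units (millicores, MB) are used to avoid floating-point surprises.

```python
import math


def pod_capacity_per_vm(node_cpu_m: int, node_mem_mb: int,
                        pod_cpu_m: int, pod_mem_mb: int,
                        reserved_cpu_m: int = 0, reserved_mem_mb: int = 0) -> int:
    """Pods that fit on one VM (assumption: the scarcer resource decides)."""
    alloc_cpu = node_cpu_m - reserved_cpu_m    # allocable = node capacity - reserved
    alloc_mem = node_mem_mb - reserved_mem_mb
    return min(alloc_cpu // pod_cpu_m, alloc_mem // pod_mem_mb)


def pods_needed(requests_demand: float, max_service_capacity: float) -> int:
    """Requests demand / MaxServiceCapacity, rounded up (assumption)."""
    return math.ceil(requests_demand / max_service_capacity)


def vms_needed(n_pods: int, pod_capacity_vm: int) -> int:
    """Number of pods / pod capacity of the VM type, rounded up (assumption)."""
    return math.ceil(n_pods / pod_capacity_vm)


# Node_001: 2 CPUs and 4 GB; pod limits: 0.5 CPU and 0.5 GB.
capacity = pod_capacity_per_vm(2000, 4096, 500, 512)   # 4 pods, CPU-bound
pods = pods_needed(900, 100)                           # 9 pods (hypothetical demand and MSC)
vms = vms_needed(pods, capacity)                       # 3 VMs
```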

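Planning "When"
[Figure: transition from State 0 to State 1 for a given number of requests at time t; the transition accounts for the VM set booting time and the time to pull the Docker image]

Planning "When"
[Figure: transition from State 0 to State 1 for a given number of requests at time t; the transition accounts for the pods booting time and the VM set termination time]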
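
One plausible reading of the timing question is that a transition has to start early enough for the new state to be ready when the forecasted load arrives. The Python sketch below works under that assumption and treats the delays as sequential, which the slides do not state. The function name and the delay values are hypothetical; they were picked so that the result matches the LaunchTime and ExpectedTime pair of the scaling schedule example.

```python
from datetime import datetime, timedelta, timezone


def launch_time(expected: datetime, vm_boot_s: float, image_pull_s: float) -> datetime:
    """Start the transition early enough to cover boot and image pull delays."""
    return expected - timedelta(seconds=vm_boot_s + image_pull_s)


expected = datetime(2018, 11, 1, 7, 0, 0, tzinfo=timezone.utc)
start = launch_time(expected, vm_boot_s=150, image_pull_s=36)
print(start.isoformat())  # 2018-11-01T06:56:54+00:00
```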

Scaling policy I: Naïve
Pods: Horizontal
VMs: Horizontal
Inputs, for a given time $t$ in the future:
• Forecasted RPS, $R$
• Performance Profile: Microservice Capacity, $MSC_L$; Max. pod replicas, $CC_T$
• Current VM type, $T$
Number of pods: $m_L = \frac{R}{MSC_L}$
Number of VMs: $n_T = \frac{m_L}{CC_T}$
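
A minimal Python sketch of the two formulas, assuming that fractional results are rounded up to whole pods and VMs. The function name and the numbers in the usage line are hypothetical.

```python
import math
from typing import Tuple


def naive_policy(forecast_rps: float, msc: float, cc: int) -> Tuple[int, int]:
    """Return (pods, VMs) for one forecast point with a fixed limit and VM type."""
    pods = math.ceil(forecast_rps / msc)   # m_L = R / MSC_L
    vms = math.ceil(pods / cc)             # n_T = m_L / CC_T
    return pods, vms


pods, vms = naive_policy(forecast_rps=950, msc=100, cc=4)
print(pods, vms)  # 10 3
```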

Scaling policy II: Best Resource Pair
Instead of always using the same VM type, we try to identify the VM type that:
• can host the number of pods computed for the forecasted workload at the given resource limit
• is the cheapest among all combinations
[Figure: VM types t2.small, t2.medium, t2.large; pod limits (Cpu: 0.2, Mem: 0.2), (Cpu: 0.5, Mem: 0.5), (Cpu: 1, Mem: 1), each with its own MSC = Max Service Capacity]

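Scaling policy II: Best Resource Pair
Pods & VMs: one-time vertical, then horizontal
Inputs:
• Microservice Capacity, $MSC_L$
• Max. pod replicas, $CC_{T_i}$
• Various VM types, $T_i$
• VM type prices, $P_{T_i}$
Number of pods: $m_L = \frac{R}{MSC_L}$
Number of VMs of type $T_i$: $n_{T_i} = \frac{m_L}{CC_{T_i}}$
Price of the VM set: $PS(T_i) = n_{T_i} \cdot P_{T_i}$
Select the cheapest VM set.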
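
The VM type selection can be sketched in Python as a loop over candidate types. The sketch covers only the formulas above for a single pod limit; rounding up is an assumption, and the type names, capacities, and prices are invented for illustration and are not real cloud prices.

```python
import math
from typing import Dict, Tuple


def best_resource_pair(forecast_rps: float, msc: float,
                       cc_by_type: Dict[str, int],
                       price_by_type: Dict[str, float]) -> Tuple[int, str, int, float]:
    """Return (pods, VM type, VM count, set price) for the cheapest VM set."""
    pods = math.ceil(forecast_rps / msc)            # m_L = R / MSC_L
    candidates = []
    for vm_type, cc in cc_by_type.items():
        n = math.ceil(pods / cc)                    # n_Ti = m_L / CC_Ti
        candidates.append((n * price_by_type[vm_type], vm_type, n))  # PS(T_i)
    cost, vm_type, n = min(candidates)
    return pods, vm_type, n, cost


# Hypothetical VM types: pods per VM and price per hour.
cc = {"small": 2, "medium": 4, "large": 8}
price = {"small": 0.02, "medium": 0.05, "large": 0.09}
pods, vm_type, n, cost = best_resource_pair(950, 100, cc, price)
```

With these numbers, ten pods are needed and five `small` VMs form the cheapest set.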

Scaling policy III: Only-Delta-Load
Pods: Horizontal
VMs: Horizontal & Heterogeneous
Inputs:
• Current replicas, $M_L$
• Current VMs, $N_T$
• Microservice Capacity, $MSC_L$
• Max. pod replicas, $CC_T$, $CC_{T_i}$
Steps:
1. Compute the load delta: $\Delta = R - M_L \cdot MSC_L$
2. $\Delta > 0$?
   NO: remove extra pods/VMs.
   YES: $m_L = \frac{R}{MSC_L}$
3. $m_L - CC_T \cdot N_T < 0$?
   YES: done.
   NO: continue with step 4.
4. $\delta = m_L - CC_T \cdot N_T$
5. Schedule the pods that can be scheduled → $\delta_1$
6. For each VM type $T_i$: $n_{T_i} = \frac{\delta_1}{CC_{T_i}}$, with set price $n_{T_i} \cdot P_{T_i}$
7. Select the cheapest VM set.
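
The decision flow of this policy can be approximated in Python as shown below. This is a sketch under stated assumptions, not the tool's implementation: counts are rounded up, the remaining pods $\delta_1$ are taken to be the pods that do not fit on the current VMs, the case where the pods fit exactly is also treated as done, and the scale-in branch only returns a marker. All names, capacities, and prices are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Result = Tuple[str, int, Optional[Tuple[str, int]]]


def only_delta_load(rps: float, current_replicas: int, msc: float,
                    cc_current: int, current_vms: int,
                    cc_by_type: Dict[str, int],
                    price_by_type: Dict[str, float]) -> Result:
    """Return (decision, target pods, optional (VM type, count) to add)."""
    pods = math.ceil(rps / msc)                      # m_L = R / MSC_L
    delta_load = rps - current_replicas * msc        # load not covered by current pods
    if delta_load <= 0:
        return ("scale_in", pods, None)              # removal of extras is not sketched
    delta = pods - cc_current * current_vms          # pods that do not fit (assumed delta_1)
    if delta <= 0:
        return ("done", pods, None)                  # current VMs can host all pods
    # Cheapest set of additional VMs; the type may differ from the current one.
    options = [(t, math.ceil(delta / cc)) for t, cc in cc_by_type.items()]
    best = min(options, key=lambda o: o[1] * price_by_type[o[0]])
    return ("add_vms", pods, best)


cc = {"small": 2, "medium": 4, "large": 8}
price = {"small": 0.02, "medium": 0.05, "large": 0.09}
result = only_delta_load(950, current_replicas=4, msc=100, cc_current=4,
                         current_vms=1, cc_by_type=cc, price_by_type=price)
```

Because existing VMs are kept and only the missing capacity is added, the resulting VM set can mix several types.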

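Scaling policy IV: Always resize
Pods: Hybrid
VMs: Hybrid
Inputs:
• Performance Profiles: Microservice Capacity, $MSC_{L_i}$; Max. pod replicas, $CC_{T_i}$
Steps:
1. For each profile $i$: $m_{L_i} = \frac{R}{MSC_{L_i}}$
2. Allocation/Performance ratio: $R(A/P)_i = \frac{m_{L_i} \cdot L_i}{MSC_{L_i}}$
3. Select the profile $j$ with the smallest ratio. The Allocation/Performance ratio makes it possible to select the profile with the highest MSC and the smallest resource usage.
4. $m_{L_j} = \frac{R}{MSC_{L_j}}$
5. For each VM type $T_i$: $n_{T_i} = \frac{m_{L_j}}{CC_{T_i}}$, with set price $n_{T_i} \cdot P_{T_i}$
6. Select the cheapest VM set.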
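
A compact Python sketch of the profile and VM selection follows. To keep the ratio a single number, it assumes that a limit $L_i$ is represented by its CPU limit in millicores only and that the pod capacity of a VM type is its CPU divided by that limit. Both are simplifications not taken from the slides, as are the rounding, the names, and all numbers.

```python
import math
from typing import Dict, Optional, Tuple


def always_resize(rps: float,
                  profiles: Dict[str, Tuple[int, float]],
                  vm_types: Dict[str, Tuple[int, float]]
                  ) -> Tuple[str, int, Optional[Tuple[str, int, float]]]:
    """profiles: name -> (cpu_limit_m, msc); vm_types: name -> (cpu_m, price)."""
    def ap_ratio(item: Tuple[str, Tuple[int, float]]) -> float:
        _, (limit, msc) = item
        return math.ceil(rps / msc) * limit / msc    # (m_Li * L_i) / MSC_Li

    name, (limit, msc) = min(profiles.items(), key=ap_ratio)
    pods = math.ceil(rps / msc)                      # m_Lj for the selected profile
    best = None
    for vm, (cpu_m, price) in vm_types.items():
        cc = cpu_m // limit                          # pods of this limit per VM (CPU only)
        if cc == 0:
            continue                                 # pod does not fit on this type
        n = math.ceil(pods / cc)
        cost = n * price
        if best is None or cost < best[2]:
            best = (vm, n, cost)
    return name, pods, best


profiles = {"A": (200, 30.0), "B": (500, 100.0), "C": (1000, 180.0)}
vm_types = {"small": (1000, 0.02), "medium": (2000, 0.05), "large": (4000, 0.09)}
profile, pods, vm_set = always_resize(950, profiles, vm_types)
```

With these made-up inputs, profile C has the smallest ratio, six pods are needed, and six `small` VMs form the cheapest set.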

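Scaling policy V: Resize When Beneficial
Pods: Hybrid
VMs: Hybrid
Inputs:
• Current replicas, $M_L$
• Current VMs, $N_T$
• Microservice Capacity
• Max. pod replicas
Steps:
1. Derive VM set $S_1$ with Only-Delta-Load and VM set $S_2$ with Always Resize.
2. $C_1 = n_{S_1} \cdot P_{T_{S_1}} \cdot \Delta t + \Delta C_1$
3. $C_2 = n_{S_2} \cdot P_{T_{S_2}} \cdot \Delta t + \Delta C_2$
4. $C_1 < C_2$? YES: select $S_1$. NO: select $S_2$.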
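
The cost comparison is easy to try out in Python. In this sketch, $\Delta C$ is interpreted as a one-off cost of the transition, which is an assumption since the slide does not define it. The class name, field names, and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSet:
    name: str
    n_vms: int
    price_per_hour: float
    transition_cost: float   # assumed meaning of the delta-C term


def total_cost(s: VmSet, dt_hours: float) -> float:
    """C = n * P * dt + delta-C."""
    return s.n_vms * s.price_per_hour * dt_hours + s.transition_cost


def resize_when_beneficial(s1: VmSet, s2: VmSet, dt_hours: float) -> VmSet:
    """Keep the Only-Delta-Load set S1 if it is cheaper, else take S2."""
    return s1 if total_cost(s1, dt_hours) < total_cost(s2, dt_hours) else s2


s1 = VmSet("only-delta-load", n_vms=4, price_per_hour=0.05, transition_cost=0.0)
s2 = VmSet("always-resize", n_vms=2, price_per_hour=0.09, transition_cost=0.10)
short = resize_when_beneficial(s1, s2, dt_hours=1)    # S1: 0.20 vs 0.28
long_ = resize_when_beneficial(s1, s2, dt_hours=10)   # S2: 2.00 vs 1.90
```

The example shows the trade-off behind the policy: resizing pays off only when the new configuration is kept long enough to amortize the transition cost.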

Scaling Policy Derivation Tool (SPDT)

SPDT – Scaling Policy Derivation Tool
[Figure: SPDT and the components it interacts with: Forecast Service, Performance Profiles Service, Executor. Data exchanged: forecasted RPS, performance profile, current deployment configuration, VM types available, costs, scaling schedule]

SPDT Components

Adaptivity of SPDT
[Figures: capacity over time]
Perfect world: the forecast is 100% accurate.
When the workload forecast is updated:
• update the scaling actions of the policy;
• trigger invalidation of scheduled states;
• schedule new states.
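
The update steps can be sketched in Python as a small re-planning function. It assumes that only states whose launch time is still in the future are invalidated, which the slides do not spell out. The function name, the dictionary layout, and the integer timestamps are hypothetical.

```python
from typing import Dict, List, Tuple

StateDict = Dict[str, int]


def replan(scheduled: List[StateDict], new_states: List[StateDict],
           now: int) -> Tuple[List[StateDict], List[StateDict]]:
    """Return (updated schedule, invalidated states) after a forecast update."""
    # States that already started stay in place.
    kept = [s for s in scheduled if s["launch"] <= now]
    # Pending states derived from the old forecast are invalidated.
    invalidated = [s for s in scheduled if s["launch"] > now]
    # New states from the updated forecast are scheduled in launch order.
    fresh = sorted((s for s in new_states if s["launch"] > now),
                   key=lambda s: s["launch"])
    return kept + fresh, invalidated


old = [{"launch": 10, "replicas": 2}, {"launch": 50, "replicas": 5}]
new = [{"launch": 60, "replicas": 4}, {"launch": 40, "replicas": 3}]
schedule, dropped = replan(old, new, now=30)
```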

Evaluation of Autoscaling Policies

Use Case: Database Access Application
[Figure: number of scaling actions per policy (Best-pair, Naive, Only, When, Always) for containers and VMs; bar labels: 78, 224, 224, 86, 122, 78, 78, 48, 76, 122]
[Figure: cost ($) per policy, axis 0 to 50; bar labels: 15.12, 15.18, 24.53, 38.57, 26.89]
[Figure: average transition time (seconds) per policy, axis 0 to 200; bar labels: 85.5, 77.04, 24.36, 180.43, 34.57]

Policy                   Derivation duration, s
Best-resource-pair       14.45
Resize-when-beneficial   0.7
Always-resize            1.03
Naive                    1.79
Only-delta-load          0.54

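Comparison for workload patterns
[Figure: cost ($) of the policies for three workload patterns, each shown next to its capacity plot; bar labels per pattern: (1.83, 1.59, 2.85, 2.99, 2.77), (0.88, 0.9, 1.63, 2.45, 1.65), (0.72, 0.77, 1.33, 1.79, 1.3)]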

Scaling Schedules
Initial State:
---
Services:
  movieapp:
    Replicas: 1
    Cpu: 200m
    Memory: 200000000
VMs:
  "t2.small": 1
[Figures: capacity plots of the scaling schedules derived by each policy: Naive, Best resource pair, Resize when beneficial, Only-Delta-Load, Always resize]

Comparison of different types of application
      Web Access                         Database Access                    Compute Intensive
Rank  Policy (Sorted)          Cost ($)  Policy (Sorted)          Cost ($)  Policy (Sorted)          Cost ($)
1     Always resize            0.13      Best resource pair       0.72      Best resource pair       2.6
2     Best resource pair       0.15      Resize when beneficial   0.77      Resize when beneficial   2.2
3     Resize when beneficial   0.15      Naive                    1.3       Only-delta-load          3.11
4     Naive                    0.31      Only-delta-load          1.33      Naive                    3.59
5     Only-delta-load          0.31      Always resize            1.79      Always resize            5.76

• Scaling both pods and VMs is more beneficial than scaling each layer separately
• The success of the Naive policy requires a good understanding of the application’s
performance, so that the first deployment configuration is selected accurately
• The Only-delta-load policy has the shortest transition time. It is ideal for quick adaptations to
changes, although expensive in the long run due to configuration fragmentation
• Migration between VM types is beneficial if its cost can be mitigated by keeping the new
configuration for a long enough period
Conclusions

Contacts
Vladimir Podolskiy
v.podolskiy@tum.de
/vladimirpodolskiy
/Vladimir_Podolskiy
