Autoscaling in kubernetes v1

Autoscaling in Kubernetes
Marian Soltys
DevOps Engineer

www.pixelfederation.com
Introduction
We are a game studio based in Slovakia developing free-to-play mobile games.
● Trainstation
● Diggy’s Adventure
● Seaport
● Trainstation 2
● AFK Cats
● Emporea

TL;DR Summary
● Autoscaling - do we need it?
● Three levels of scaling - VPA, HPA, CA
● Cluster Autoscaler
● Horizontal Pod Autoscaler
● Vertical Pod Autoscaler
● Custom and External metrics for scaling
● Real life example

Autoscaling - do we need it?
● Number of active players changes with time of day and day of week
● Batch processing
● Combination of both
● Cost optimization

Three levels of scaling - VPA, HPA, CA
Kubernetes
● CA - Cluster Autoscaler


Node Autoscaling
● Cluster Autoscaler
(https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler)
● Escalator
(https://github.com/atlassian/escalator)
● Cerebral
(https://github.com/containership/cerebral)

Cluster Autoscaler
Cluster Autoscaler automatically adjusts the size of the Kubernetes cluster across availability zones (AZs):
● Watches for pods stuck in the Pending state due to insufficient resources.
● Periodically checks for underutilized nodes whose pods can be placed on other existing nodes.
● Respects PodDisruptionBudget, affinity rules, annotations, ...
How we use it:
● minReplicas of 2 with podAntiAffinity on hostname
● Parameters:
○ scale-down-delay-after-add: 10m
○ scale-down-delay-after-delete: 10s
○ scale-down-unneeded-time: 10m
○ scale-down-utilization-threshold: 0.65
● Spot instances
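
As a rough illustration of how the scale-down parameters above interact, the sketch below models a single check: a node is treated as a scale-down candidate only when its utilization has stayed below scale-down-utilization-threshold for the whole scale-down-unneeded-time. It assumes utilization is the larger of the CPU and memory request ratios (requested / allocatable). The class and function names are hypothetical, and this is a simplified model, not the Cluster Autoscaler's actual code.

```python
from dataclasses import dataclass

UTILIZATION_THRESHOLD = 0.65   # scale-down-utilization-threshold
UNNEEDED_TIME_S = 10 * 60      # scale-down-unneeded-time: 10m


@dataclass
class Node:
    cpu_requested: float    # sum of pod CPU requests, in cores
    cpu_allocatable: float
    mem_requested: float    # sum of pod memory requests, in GiB
    mem_allocatable: float


def utilization(node: Node) -> float:
    """Assumed definition: the larger of the CPU and memory request ratios."""
    return max(node.cpu_requested / node.cpu_allocatable,
               node.mem_requested / node.mem_allocatable)


def is_scale_down_candidate(node: Node, seconds_below_threshold: float) -> bool:
    """True when the node has been underutilized for the full unneeded time."""
    return (utilization(node) < UTILIZATION_THRESHOLD
            and seconds_below_threshold >= UNNEEDED_TIME_S)


busy = Node(cpu_requested=3.0, cpu_allocatable=4.0, mem_requested=4.0, mem_allocatable=16.0)
idle = Node(cpu_requested=1.0, cpu_allocatable=4.0, mem_requested=4.0, mem_allocatable=16.0)
print(is_scale_down_candidate(busy, 900))  # False: CPU ratio 0.75 is above 0.65
print(is_scale_down_candidate(idle, 900))  # True: both ratios are 0.25
print(is_scale_down_candidate(idle, 120))  # False: underutilized for only 2 minutes
```

The delay parameters (scale-down-delay-after-add, scale-down-delay-after-delete) are not modelled here; they would add a further cool-down on top of this check.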

Vertical Pod Autoscaler
● Can automatically adjust pod requests and limits
● Calculation based on current and historical metrics
● Modes: Auto, Recreate, Initial, Off
Cons:
● Pods restart when requests change (Auto/Recreate modes)
● All pod start events go through VPA
● Can conflict with HPA (when both act on CPU and memory)
Pros:
● Recommender
● Can fix under- or over-provisioned pods
How we use it: We don’t.

Horizontal Pod Autoscaler
● Scales deployments (number of pods) based on metrics
○ Container resources - CPU/Memory
○ Object
○ Custom/External metrics
● Think twice before using it with stateful deployments
Take into consideration:
● Default metrics sync loop: 15 seconds
● Metric tolerance: 10%
● Downscale stabilization window: 5 minutes (5m0s) by default
● Up to Kubernetes 1.17, these parameters are configured at the cluster level
● Since Kubernetes 1.18, some parameters can be tuned per HPA under .spec.behavior
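
To make the tolerance and the downscale stabilization window more concrete, here is a minimal sketch of both ideas. It assumes that scaling is skipped while the current/target ratio stays within 10% of 1.0, and that a scale-down uses the highest recommendation seen inside the window. The names within_tolerance and DownscaleStabilizer are hypothetical, and the model leaves out much of what the real controller does.

```python
from collections import deque

TOLERANCE = 0.10        # default metric tolerance (10%)
STABILIZATION_S = 300   # default downscale stabilization window (5 minutes)


def within_tolerance(current: float, target: float, tolerance: float = TOLERANCE) -> bool:
    """Scaling is skipped while current/target stays within tolerance of 1.0."""
    return abs(current / target - 1.0) <= tolerance


class DownscaleStabilizer:
    """Remembers recent recommendations and returns the highest one in the window."""

    def __init__(self, window_s: float = STABILIZATION_S) -> None:
        self.window_s = window_s
        self.history = deque()  # (timestamp in seconds, recommended replicas)

    def stabilize(self, now: float, recommended: int) -> int:
        self.history.append((now, recommended))
        # Drop recommendations that have fallen out of the window.
        while self.history and self.history[0][0] < now - self.window_s:
            self.history.popleft()
        return max(replicas for _, replicas in self.history)


print(within_tolerance(0.70, 0.65))   # True: about 7.7% over target
print(within_tolerance(0.781, 0.65))  # False: about 20% over target

stabilizer = DownscaleStabilizer()
print(stabilizer.stabilize(0, 10))    # 10
print(stabilizer.stabilize(60, 6))    # 10: the earlier recommendation is still in the window
print(stabilizer.stabilize(400, 6))   # 6: the recommendation of 10 has expired
```

This is why a drop in load does not remove pods immediately: the larger, older recommendation keeps winning until it ages out of the window.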

Horizontal Pod Autoscaler

Metrics types (autoscaling/v2beta2)
Resource:
● CPU/memory
● Container request(s) must be set
● API: metrics.k8s.io
External:
● Metrics not related to Kubernetes objects
● AWS SQS, RDS, …
● API: external.metrics.k8s.io
Custom:
● Pod/Object (in same namespace)
● Time series DB required (e.g. Prometheus)
● API: custom.metrics.k8s.io
Example with target average 65%:
desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
ceil[4 * (0.781 / 0.650)] = ceil[4.8] = 5 (means +1 pod)
Note: With multiple metrics, the highest resulting replica count is chosen.
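
The replica formula and the multiple-metrics rule can be checked with a few lines of Python. This is only a sketch of the arithmetic on this slide; the function names are hypothetical, and the second metric pair in the usage lines is made up for illustration.

```python
import math
from typing import Iterable, Tuple


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    return math.ceil(current_replicas * current_value / target_value)


def desired_for_all(current_replicas: int, metrics: Iterable[Tuple[float, float]]) -> int:
    """With several (current, target) pairs, the largest proposal wins."""
    return max(desired_replicas(current_replicas, cur, tgt) for cur, tgt in metrics)


print(desired_replicas(4, 0.781, 0.650))                    # 5
print(desired_for_all(4, [(0.781, 0.650), (0.30, 0.65)]))   # 5
```

The second call shows that a metric asking for fewer pods (here 2) does not pull the result down when another metric asks for more.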

Custom and External metrics
Limitation: only one adapter can serve each metrics API (custom or external); a single adapter may serve both.
% kubectl get APIService v1beta1.external.metrics.k8s.io -o yaml
kind: APIService
metadata:
  labels: …
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  service:
    name: k8s-cloudwatch-adapter
    namespace: ...
    port: …
...

Custom and External metrics
● Check available metrics:
% kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "ts2-numberOfMessagesSent",
      "singularName": "",
      "namespaced": true,
      "kind": "ExternalMetricValueList",
      "verbs": ["get"]
    }
  ]
}
% kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/<ns name>/<metric name> | jq .
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {...},
  "items": [
    {
      "metricName": "ts2-numberOfMessagesSent",
      "metricLabels": null,
      "timestamp": "2021-02-17T11:05:20Z",
      "value": "27"
    }
  ]
}

Tested Adapters for HPA
● Prometheus adapter - we no longer use it
○ Doesn’t fit our needs any more - redesign of infrastructure needed
○ (https://github.com/kubernetes-sigs/prometheus-adapter)
● K8s-cloudwatch adapter - in use now
○ (https://github.com/awslabs/k8s-cloudwatch-adapter)
● Kube-metrics-adapter - evaluated
○ Collectors: Pod, Prometheus, AWS, HTTP, …
○ (https://github.com/zalando-incubator/kube-metrics-adapter)
● KEDA - evaluating now
○ (https://keda.sh)

Trainstation 2 - Real Life Example
Types of workloads:
● Live traffic - changes over the course of the day
● Start/End of the Event - once per month
● Start/End of the Competitions - twice per week
● Event cleanup - a few days after event ends
Goals:
● Scaling backend - based on live traffic
● Batch/Asynchronous workers scaling - based on queue size
● Limit batch processing under DB pressure
● Maximize batch processing during off-peak hours
● Start of the Event/Competitions - scale ahead

Cloudwatch adapter
Common approach:
1. Monitor queue
2. Calculate the optimal number of workers
3. Return metrics
Advanced approach:
1. Monitor queue
2. Calculate the optimal number of workers
3. Monitor RDS utilization
4. Tune the number of workers so they do not overload the live workload
5. Return metrics

apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
  name: "ts2-numberOfMessagesSent"
spec:
  name: "ts2-numberOfMessagesSent"
  resource:
    resource: "deployment"
  queries:
    - id: queue_metric
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "NumberOfMessagesSent"
          dimensions:
            - name: QueueName
              value: "ts2-demo"
        period: 30
        stat: Sum
        unit: Count
      returnData: false
    - id: db_cpuutilization
      metricStat:
        metric:
          namespace: "AWS/RDS"
          metricName: "CPUUtilization"
          dimensions:
            - name: DBClusterIdentifier
              value: "ts2-demo-cluster"
            - name: Role
              value: WRITER
        period: 300
        stat: Average
        unit: Percent
      returnData: false
    - id: workers_calculated
      expression: "IF((queue_metric / 300) > 100, 100, queue_metric / 300)"
      returnData: false
    - id: workers_desired
      expression: "IF(db_cpuutilization < 80, workers_calculated,
        IF(db_cpuutilization < 90, workers_calculated * 80 / 100, 0))"
      returnData: true
Desired pods: get queue size -> calculate desired pods (messages / 300) -> cap at 100 -> if DB CPU utilization is 80% or more, reduce by 20%; at 90% or more, return 0
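
The two CloudWatch expressions above can be mirrored in plain Python to check how they behave at the boundaries. This is only a sketch of the arithmetic; the function names reuse the query ids, and the input values in the usage lines are made up for illustration.

```python
def workers_calculated(queue_metric: float) -> float:
    """Mirrors IF((queue_metric / 300) > 100, 100, queue_metric / 300)."""
    per_worker = queue_metric / 300
    return 100 if per_worker > 100 else per_worker


def workers_desired(queue_metric: float, db_cpuutilization: float) -> float:
    """Mirrors IF(db < 80, calc, IF(db < 90, calc * 80 / 100, 0))."""
    calc = workers_calculated(queue_metric)
    if db_cpuutilization < 80:
        return calc
    if db_cpuutilization < 90:
        return calc * 80 / 100
    return 0


print(workers_desired(6000, 50))    # 20.0: DB is healthy, use the full count
print(workers_desired(6000, 85))    # 16.0: DB under pressure, 20% fewer workers
print(workers_desired(6000, 95))    # 0: DB overloaded, no batch workers requested
print(workers_desired(60000, 50))   # 100: capped at the maximum
```

A returned value of 0 only means the metric asks for no workers; the HPA's own minReplicas still applies on top of it.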

Horizontal Pod Autoscaler config:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app: ts2-demo
  name: ts2-demo-worker
spec:
  behavior:
    scaleDown:
      policies:
      - periodSeconds: 15
        type: Percent
        value: 100
      selectPolicy: Max
      stabilizationWindowSeconds: 120
    scaleUp:
      ...
      stabilizationWindowSeconds: 0
  minReplicas: 4
  maxReplicas: 100
  metrics:
  - external:
      metric:
        name: ts2-numberOfMessagesVisible
      target:
        averageValue: "1"
        type: AverageValue
    type: External
  - external:
      metric:
        name: ts2-numberOfMessagesSent
      target:
        averageValue: "1"
        type: AverageValue
    type: External
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ts2-demo-worker

% k get hpa ts2-demo-worker
NAME              REFERENCE                    TARGETS                    MINPODS   MAXPODS   REPLICAS   AGE
ts2-demo-worker   Deployment/ts2-demo-worker   0/1 (avg), 1038m/1 (avg)   4         100       27         2d4h

It’s working :) but you have to take the stabilization window and metrics delay into account.
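
The TARGETS column in the output above uses Kubernetes quantity notation, where 1038m means 1.038. The sketch below parses that value and shows why the deployment can stay at 27 replicas even though the average is slightly above the target of 1. It assumes the default 10% tolerance applies to this metric; parse_quantity and next_replicas are hypothetical helpers, and parse_quantity handles only plain and milli-suffixed values.

```python
import math


def parse_quantity(text: str) -> float:
    """Parse a plain or milli-suffixed quantity such as '1038m' or '1'."""
    if text.endswith("m"):
        return int(text[:-1]) / 1000
    return float(text)


def next_replicas(replicas: int, average: float, target: float, tolerance: float = 0.10) -> int:
    """Keep the current count while the average is within tolerance of the target."""
    ratio = average / target
    if abs(ratio - 1.0) <= tolerance:
        return replicas
    return math.ceil(replicas * ratio)


average = parse_quantity("1038m")   # second TARGETS entry from the output above
print(average)                      # 1.038
print(next_replicas(27, average, parse_quantity("1")))   # 27: 3.8% over target is within tolerance
print(next_replicas(27, 1.5, 1.0))                       # 41: well outside tolerance, scale up
```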

Questions?
msoltys@pixelfederation.com
linkedin.com/in/mariansoltys

https://portal.pixelfederation.com/sk/career
