Journey of Kubernetes Scaling
Code Mania 111 @ Siam University
June 10, 2018
#whoami
● Setthasarun Prasanpun (Beer)
● Former PHP developer
● DevOps Engineer @ Opsta
#whoami
● Jirayut Nimsaeng (Dear)
● Interested in Cloud and Open Source
● Agile practitioner, DevOps driven
● CEO and Founder of Opsta
Agenda
● What is Docker and Kubernetes?
● Batch Processing
● Solution to scale Batch Processing
● Optimization
● Benchmark
● Future
What is a Docker Container?
One Server
[Diagram: one node running a container]
Multiple Servers
[Diagram: Node 1, Node 2, Node 3 and a container, annotated "???"]
Kubernetes Automatic Bin Packing
[Diagram: kube-scheduler placing two Service A containers and one Service B container across Node 1, Node 2 and Node 3]
Some more features of Kubernetes
● Self-healing
● Service discovery & load balancing
● Automated rollouts and rollbacks
● Secret and configuration management
● Storage orchestration
● Batch execution
● Horizontal manual/auto-scaling
Batch Processing
[Diagram: Users submit Jobs to a Queue; Workers consume from the Queue and produce a Result]
Challenge
[Diagram: Users, API, Queue, Workers and DB; Jobs are pushed to the Queue and consumed by the Workers]
First Design on AWS
[Diagram: Users, API, SQS, Workers and DB]
Problem
[Diagram: the same design (API, SQS, Workers, DB) with many more Users]
60,000 QUEUES!!!
Solution with Elastic Beanstalk
[Diagram: API and SQS in front of an Elastic Beanstalk Container environment; an Auto Scaling Instance Group of three EC2 instances, each running Sqsd and a Worker]
Set scale condition by CPU utilization
Problems
- CPU utilization is not a good metric for the autoscaling condition
- 1 EC2 instance contains only 1 Worker container
- EC2 specs do not fit the worker's requirements, which wastes resources
- Very slow to scale up; Auto Scaling isn't really intended for bursting
Kubernetes Solution
[Diagram: Users, API, SQS, Workers and DB]
Solution with Kubernetes
[Diagram: SQS and a Kubernetes Cluster of Node1, Node2 and Node3 running 3 WORKER pods]
Scale Pod with Kubernetes
[Diagram: SQS and a Kubernetes Cluster of Node1, Node2 and Node3, now running 6 WORKER pods]
Scale Node with Kubernetes
[Diagram: SQS and a Kubernetes Cluster grown to Node1, Node2, Node3 and Node4, running 8 WORKER pods]
What needs to be done
● Change the code so it does not depend on Sqsd
● Build a Kubernetes cluster on AWS
● Find a solution to automatically scale pods and nodes
Scale Pod with kube-sqs-autoscaler
● https://github.com/Wattpad/kube-sqs-autoscaler
● Pod autoscaler based on queue size in AWS SQS
● Periodically retrieves the number of messages in SQS and scales pods according to its configuration
○ --scale-down-cool-down=30s
○ --scale-up-cool-down=5m
○ --scale-up-messages=100
○ --scale-down-messages=10
○ --max-pods=5
○ --min-pods=1
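The threshold behaviour described by these flags can be sketched roughly as follows. This is a minimal illustration, not the kube-sqs-autoscaler source: the names `ThresholdConfig` and `next_pod_count` are hypothetical, the exact comparison operators are an assumption, and the two cool-down timers are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ThresholdConfig:
    # Values mirror the flags quoted on the slide.
    scale_up_messages: int = 100
    scale_down_messages: int = 10
    max_pods: int = 5
    min_pods: int = 1


def next_pod_count(current_pods: int, queue_messages: int, cfg: ThresholdConfig) -> int:
    """Return the pod count after one polling cycle, moving by at most one pod."""
    # Assumption: ">=" and "<=" comparisons; cool-down periods are not modelled.
    if queue_messages >= cfg.scale_up_messages:
        return min(current_pods + 1, cfg.max_pods)
    if queue_messages <= cfg.scale_down_messages:
        return max(current_pods - 1, cfg.min_pods)
    return current_pods
```

For example, with the defaults above, `next_pod_count(1, 150, ThresholdConfig())` moves from 1 pod to 2, and repeated cycles never go past `max_pods`.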
SQS Autoscaling Pods (1)
[Diagram: SQS Autoscale watching SQS at 10 QUEUES; Kubernetes Cluster of Node1, Node2 and Node3 running 3 WORKER pods]
SQS Autoscaling Pods (2)
[Diagram: SQS at 5 QUEUES; SQS Autoscale has added a WORKER pod, 4 in total]
SQS Autoscaling Pods (3)
[Diagram: SQS at 0 QUEUES; SQS Autoscale has added another WORKER pod, 5 in total]
Scale Node with OpenAI
● https://github.com/openai/kubernetes-ec2-autoscaler
● Works with an AWS Auto Scaling group to scale instances up and down
● Scales nodes up when pods are in Pending status and no node has free capacity left
● Scales nodes down by checking for idle CPU
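The two rules above can be read as a small decision function. The sketch below is only an illustration of that reading: `node_scaling_decision` and its inputs are hypothetical names, and it is not taken from the kubernetes-ec2-autoscaler code.

```python
def node_scaling_decision(pending_pods: int, nodes_with_free_capacity: int, idle_nodes: int) -> str:
    """Return "scale_up", "scale_down" or "no_change" for one check cycle."""
    # Scale up: pods are waiting and no existing node can take them.
    if pending_pods > 0 and nodes_with_free_capacity == 0:
        return "scale_up"
    # Scale down: nothing is waiting and at least one node sits idle.
    # Assumption: "idle" is decided elsewhere, e.g. from CPU usage.
    if pending_pods == 0 and idle_nodes > 0:
        return "scale_down"
    return "no_change"
```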
Scale Node with OpenAI
[Diagram: Kubernetes Cluster of Node1, Node2 and Node3 running 3 WORKER pods while more WORKER pods are PENDING; EC2 Autoscaler sits between the cluster and the Auto Scaling Instance Group]
Scale Node with OpenAI
[Diagram: the Auto Scaling Instance Group has added Node4 and the previously pending WORKER pods are now scheduled]
Optimization
Enhance kube-sqs-autoscaler
● Scaling 1 pod at a time is too slow!
● So we improved the kube-sqs-autoscaler code to scale pods by the ratio between SQS messages and pods
○ --scale-by-ratio
○ --queue-per-pod-ratio=100
○ --scale-down-cool-down=30s
○ --scale-up-cool-down=5m
○ --max-pods=5
○ --min-pods=1
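One plausible reading of ratio-based scaling is sketched below. The rounding rule (round up) and the function name `desired_pods` are assumptions for illustration; the slide only states that pods are scaled by the ratio between SQS messages and pods.

```python
import math


def desired_pods(queue_messages: int, queue_per_pod_ratio: int = 100,
                 min_pods: int = 1, max_pods: int = 5) -> int:
    """Return the pod count needed so each pod handles at most `queue_per_pod_ratio` messages."""
    if queue_per_pod_ratio <= 0:
        raise ValueError("queue_per_pod_ratio must be positive")
    # Assumption: round up so a partially filled "slot" still gets a pod.
    wanted = math.ceil(queue_messages / queue_per_pod_ratio)
    # Clamp to the configured bounds, as --min-pods and --max-pods suggest.
    return max(min_pods, min(wanted, max_pods))
```

With 250 messages and a ratio of 100 this jumps straight to 3 pods in one step, instead of adding one pod per cycle.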
Move from OpenAI to autoscaler
● https://github.com/kubernetes/autoscaler
● OpenAI's autoscaler lacks active development since its developers moved from AWS to Azure
● OpenAI's autoscaler does not support multiple instance groups
● Autoscaler is more mature since it is one of the Kubernetes components
Worker parallel optimization
- Each worker consumed only 1 job at a time.
- CPU usage was below 15% but memory went up to ~35% per worker on a node, which was not good for us.
- We improved our worker to consume and process multiple jobs simultaneously (configurable setting).
- After some trials, a worker can run 5 concurrent jobs with the same processing time, using more CPU and a bit more memory.
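A worker with a configurable concurrency setting could look roughly like the sketch below. It assumes a thread pool, which may or may not match the actual worker; `run_worker` and `process_job` are hypothetical names, and fetching jobs from SQS is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_worker(jobs: Iterable[str], process_job: Callable[[str], str],
               concurrency: int = 5) -> List[str]:
    """Process jobs with up to `concurrency` running at once; results keep input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map yields results in the same order as the submitted jobs.
        return list(pool.map(process_job, jobs))


if __name__ == "__main__":
    # Stand-in job handler; a real worker would run the actual processing here.
    print(run_worker(["job-1", "job-2", "job-3"], lambda job: job.upper()))
```

Threads suit work that waits on I/O or releases the interpreter lock in native code; for pure-Python CPU-bound jobs a process pool would be the more likely choice.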
Worker CPU optimization
- Our worker uses TensorFlow installed via pip
- TensorFlow warned that the library wasn't compiled to use AVX and SSE4.1 instructions, although they are available on the machine. The pip version is not built with these CPU instructions
- So we built TensorFlow with all CPU instructions available on the EC2 (t2.medium) machine
- Result: jobs are processed about 35% faster!
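Before rebuilding, it can help to confirm which instruction sets the machine actually reports. The sketch below parses the text of Linux `/proc/cpuinfo`; the helper name `supported_flags` is hypothetical, and the flag spellings `avx` and `sse4_1` are the ones Linux uses on x86.

```python
from typing import Iterable, Set


def supported_flags(cpuinfo_text: str, wanted: Iterable[str] = ("avx", "sse4_1")) -> Set[str]:
    """Return the subset of `wanted` CPU flags listed in /proc/cpuinfo-style text."""
    present: Set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Each logical CPU has its own "flags" line; collect them all.
        if key.strip() == "flags":
            present.update(value.split())
    return {flag for flag in wanted if flag in present}
```

On a Linux host this can be called as `supported_flags(open("/proc/cpuinfo").read())`.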
Benchmark
Benchmark questions
● How do we run the load test?
○ Python script sending 5,000 requests (200 concurrent users x 25 requests per user) within 1 minute
● Which instance size is the most cost-effective?
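The shape of such a load test (200 concurrent users x 25 requests each = 5,000 requests) can be sketched as below. This is not the script used for the benchmark: `run_load_test` and `send_request` are hypothetical names, the actual HTTP call is injected by the caller, and the 1-minute window is not enforced.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_load_test(send_request: Callable[[int, int], bool],
                  users: int = 200, requests_per_user: int = 25) -> int:
    """Run `users` simulated users concurrently; return the number of successful requests."""

    def simulate_user(user_id: int) -> int:
        # Each user sends its requests one after another.
        return sum(1 for n in range(requests_per_user) if send_request(user_id, n))

    # One thread per simulated user, so all users are active at the same time.
    with ThreadPoolExecutor(max_workers=users) as pool:
        return sum(pool.map(simulate_user, range(users)))
```

Passing a `send_request` that posts a job to the API and returns whether it was accepted turns this into a basic load generator.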
Benchmark Result Graph
t2.medium wins @ 1,570 queues/minute
Benchmark result
● Worker scaling speed:
○ EB: 5-10 mins per worker instance
○ K8s: <2 mins (node available, use free node)
○ K8s: <5 mins (node not available, spin up a new one)
Conclusions
● K8s is flexible for batch processing jobs
● K8s has many components for autoscaling
● K8s helps us optimize resources cost-effectively
● K8s can finish 60,000 queues in 10 mins
Future
● Use Kubernetes with AWS GPU Instance
● Change Queue
○ RabbitMQ
○ Kafka
● Optimize cost with AWS Spot Instance
Q/A

"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 

Journey of Kubernetes Scaling

  • 1. Journey of Kubernetes Scaling Code Mania 111 @ Siam University June 10, 2018
  • 2. #whoami ● Setthasarun Prasanpun (Beer) ● Former PHP developer ● DevOps Engineer @ Opsta
  • 3. #whoami ● Jirayut Nimsaeng (Dear) ● Interested in Cloud and Open Source ● DevOps-driven Agile practitioner ● CEO and Founder, Opsta
  • 4. Agenda ● What are Docker and Kubernetes? ● Batch Processing ● Solution to scale Batch Processing ● Optimization ● Benchmark ● Future
  • 5. What is a Docker Container?
  • 6. One Server (diagram: one node running one container)
  • 7. Multiple Servers (diagram: one container, three nodes labelled Node 1 to Node 3, and a "???" marker)
  • 8. Kubernetes Automatic Bin Packing (diagram: kube-scheduler placing two Service A containers and one Service B container across Node 1, Node 2 and Node 3)
  • 9. Some more features of Kubernetes ● Self-healing ● Service discovery & load balancing ● Automated rollouts and rollbacks ● Secret and configuration management ● Storage orchestration ● Batch execution ● Horizontal manual/auto-scaling
  • 10. Batch Processing (diagram: four users submit jobs to a queue; three workers consume the jobs and produce a result)
  • 11. Challenge (diagram labels: User, API, Queue, Job, Worker, DB, Push, Consume)
  • 12. First Design on AWS (diagram labels: User, API, SQS, Worker, DB)
  • 13. Problem (diagram: the same design with many more users; 60,000 QUEUES!!!)
  • 14. Solution with Elastic Beanstalk (diagram: API, SQS, and an Elastic Beanstalk container running an Auto Scaling instance group of three EC2 instances, each with Sqsd and a Worker; scale condition set by CPU utilization)
  • 15. Problems - CPU utilization is not a good metric for the autoscaling condition - Each EC2 instance contains only 1 Worker container - EC2 specs do not fit the worker's requirements, which wastes resources - Very slow to scale up; Auto Scaling isn't really intended for bursting
  • 16. Kubernetes Solution (diagram labels: User, API, SQS, Worker, DB)
  • 17. Solution with Kubernetes (diagram: SQS and 3 workers on a Kubernetes cluster of Node1, Node2 and Node3)
  • 18. Scale Pod with Kubernetes (diagram: SQS and 6 workers on a Kubernetes cluster of Node1, Node2 and Node3)
  • 19. Scale Node with Kubernetes (diagram: SQS and 8 workers on a Kubernetes cluster of Node1, Node2, Node3 and Node4)
  • 20. What needs to be done ● Change the code so it does not depend on Sqsd ● Build a Kubernetes cluster on AWS ● Find a solution to automatically scale pods and nodes
  • 21. Scale Pod with kube-sqs-autoscaler ● https://github.com/Wattpad/kube-sqs-autoscaler ● Pod autoscaler based on queue size in AWS SQS ● Periodically retrieves the number of messages in SQS and scales pods according to its configuration ○ --scale-down-cool-down=30s --scale-up-cool-down=5m --scale-up-messages=100 --scale-down-messages=10 --max-pods=5 --min-pods=1
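
The threshold behaviour on this slide can be sketched as a pure function. This is a simplified illustration in Python, not the actual kube-sqs-autoscaler code; the class, function and parameter names are hypothetical, and only the default values mirror the flags listed on the slide. It assumes one pod is added or removed per polling cycle and that a cool-down simply blocks a repeat action until enough time has passed.

```python
from dataclasses import dataclass


@dataclass
class ThresholdConfig:
    # Defaults mirror the flags on the slide; the field names are hypothetical.
    scale_up_messages: int = 100
    scale_down_messages: int = 10
    scale_up_cool_down_s: float = 300.0   # --scale-up-cool-down=5m
    scale_down_cool_down_s: float = 30.0  # --scale-down-cool-down=30s
    max_pods: int = 5
    min_pods: int = 1


def next_pod_count(cfg, current_pods, queue_messages,
                   since_last_scale_up_s, since_last_scale_down_s):
    """Return the pod count after one polling cycle (one-pod steps, assumed)."""
    if (queue_messages >= cfg.scale_up_messages
            and since_last_scale_up_s >= cfg.scale_up_cool_down_s):
        # Queue is long and the scale-up cool-down has elapsed: add one pod.
        return min(current_pods + 1, cfg.max_pods)
    if (queue_messages <= cfg.scale_down_messages
            and since_last_scale_down_s >= cfg.scale_down_cool_down_s):
        # Queue is short and the scale-down cool-down has elapsed: remove one pod.
        return max(current_pods - 1, cfg.min_pods)
    return current_pods
```

With these defaults, a queue of 150 messages and both cool-downs elapsed moves a deployment from 1 pod to 2, and never beyond 5.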
  • 22. SQS Autoscaling Pods (1) (diagram: SQS Autoscale watching SQS at 10 QUEUES; 3 workers on Node1, Node2 and Node3)
  • 23. SQS Autoscaling Pods (2) (diagram: SQS Autoscale watching SQS at 5 QUEUES; 4 workers on Node1, Node2 and Node3)
  • 24. SQS Autoscaling Pods (3) (diagram: SQS Autoscale watching SQS at 0 QUEUES; 5 workers on Node1, Node2 and Node3)
  • 25. Scale Node with OpenAI's kubernetes-ec2-autoscaler ● https://github.com/openai/kubernetes-ec2-autoscaler ● Works with an AWS Auto Scaling group to scale instances up and down ● Scales nodes up when a pod is in Pending status and no node has free capacity left ● Scales nodes down by checking for idle CPU
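
The scale-up rule on this slide (a pod is pending and no node has room for it) can be sketched as follows. This is an illustrative Python model, not code from kubernetes-ec2-autoscaler; the `Node` and `PendingPod` types and their fields are hypothetical, and the check deliberately ignores the case where several pending pods compete for the same free capacity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    cpu_free_millicores: int
    memory_free_mib: int


@dataclass
class PendingPod:
    cpu_request_millicores: int
    memory_request_mib: int


def fits(pod, node):
    """True when the node's free CPU and memory cover the pod's requests."""
    return (pod.cpu_request_millicores <= node.cpu_free_millicores
            and pod.memory_request_mib <= node.memory_free_mib)


def needs_new_node(pending_pods, nodes):
    """True when at least one pending pod fits on no existing node."""
    return any(
        not any(fits(pod, node) for node in nodes)
        for pod in pending_pods
    )
```

For example, a pending pod requesting 800 millicores triggers a scale-up when the roomiest node has only 500 millicores free.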
  • 27. Scale Node with OpenAI (diagram: SQS, workers on Node1, Node2 and Node3 in the Kubernetes cluster, further workers in PENDING state, and the EC2 Autoscaler next to the Auto Scaling instance group)
  • 28. Scale Node with OpenAI (diagram: the same cluster after the EC2 Autoscaler adds Node4 through the Auto Scaling instance group, giving the pending workers a node)
  • 29. Optimization
  • 30. Enhance kube-sqs-autoscaler ● Scaling 1 pod at a time is too slow! ● So we improved the kube-sqs-autoscaler code to scale pods by the ratio between SQS messages and pods ○ --scale-by-ratio --queue-per-pod-ratio=100 --scale-down-cool-down=30s --scale-up-cool-down=5m --max-pods=5 --min-pods=1
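
Ratio-based scaling can be sketched as a one-line calculation. This is a minimal Python illustration, not the authors' modified autoscaler; the function name is hypothetical, the defaults mirror the flags on the slide, and rounding up to a whole pod is an assumption.

```python
import math


def desired_pods(queue_messages, queue_per_pod_ratio=100, min_pods=1, max_pods=5):
    """Pods needed so that each pod handles at most `queue_per_pod_ratio` messages."""
    wanted = math.ceil(queue_messages / queue_per_pod_ratio)  # rounding up is assumed
    # Clamp to the configured bounds.
    return max(min_pods, min(wanted, max_pods))
```

With 250 messages in the queue this jumps straight to 3 pods in a single step, instead of adding one pod per cycle.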
  • 31. Move from OpenAI to autoscaler ● https://github.com/kubernetes/autoscaler ● The OpenAI autoscaler lacks active development since its developers moved from AWS to Azure ● The OpenAI autoscaler does not support multiple instance groups ● Autoscaler is more mature since it is one of the Kubernetes components
  • 32. Worker parallel optimization - Each worker consumed only 1 job at a time - CPU usage was below 15% while memory went to ~35% per worker on a node, which was not good for us - We improved our worker to consume and process multiple jobs simultaneously (configurable setting) - After some trials, a worker can run 5 concurrent jobs with the same processing time, using more CPU and a bit more memory
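
One way to let a single worker process several jobs at once is a bounded thread pool. The sketch below is a generic Python illustration, not the authors' worker; `process_job` is a hypothetical stand-in for the real per-job work, and the default of 5 matches the concurrency the slide reports.

```python
from concurrent.futures import ThreadPoolExecutor


def process_job(job):
    """Hypothetical stand-in for the real per-job work."""
    return job * 2


def run_worker(jobs, concurrency=5):
    """Process jobs with up to `concurrency` running at once; results keep input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(process_job, jobs))
```

Threads suit work that waits on I/O or runs inside native libraries that release Python's global interpreter lock; CPU-bound pure-Python work would more likely need a process pool.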
  • 33. Worker CPU optimization - Our worker uses TensorFlow installed via pip - TensorFlow warned that the library wasn't compiled to use AVX and SSE4.1 instructions, although these are available on the machine; the pip build is not compiled for these CPU instructions - So we built TensorFlow with all CPU instructions available on the EC2 (t2.medium) machine - Result: jobs are processed about 35% faster!
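
Before rebuilding, it can help to confirm which instruction sets the machine actually exposes. The helper below is a small sketch that parses `/proc/cpuinfo`-style text on Linux; the function names are hypothetical, and `avx` and `sse4_1` are the flag names Linux uses for the two instruction sets the slide mentions.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough; cores normally report the same set.
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=("avx", "sse4_1")):
    """List the wanted flags that the CPU does not report."""
    have = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in have]
```

On a Linux host this could be called as `missing_flags(open("/proc/cpuinfo").read())`; an empty list means both instruction sets are reported by the CPU.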
  • 34. Benchmark
  • 35. Benchmark questions ● How to do the load test? ○ Python script sending 5,000 requests (200 concurrent users x 25 requests per user) within 1 minute ● Which instance size is the most cost-effective?
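
The load shape described here (200 concurrent users, 25 requests each, 5,000 in total) can be sketched with a thread pool. This is not the authors' script; `run_load_test` and the `send_request` callback are hypothetical names, and the callback is left to the caller so the sketch does not assume any particular API endpoint or HTTP client.

```python
from concurrent.futures import ThreadPoolExecutor


def run_load_test(send_request, users=200, requests_per_user=25):
    """Run `users` concurrent sessions, each sending its requests one after another.

    Returns the total number of requests sent.
    """
    def session(user_id):
        for i in range(requests_per_user):
            send_request(user_id, i)  # caller-supplied, e.g. an HTTP POST to the API
        return requests_per_user

    # One thread per simulated user so all sessions run concurrently.
    with ThreadPoolExecutor(max_workers=users) as pool:
        return sum(pool.map(session, range(users)))
```

Passing a function that posts one job to the API under test as `send_request` reproduces the 200 x 25 pattern; timing the call shows whether the run fits within the 1 minute window.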
  • 36. Benchmark Result Graph: t2.medium wins @ 1570 queues/minute
  • 37. Benchmark result ● Worker scaling speed: ○ EB: 5-10 mins per worker instance ○ K8s: <2 mins (node available, use free node), <5 mins (node not available, spin up a new one)
  • 38. Conclusions ● K8s is flexible for batch processing jobs ● K8s has many components for autoscaling ● K8s helps us optimize resources cost-effectively ● K8s can finish 60,000 queues in 10 mins
  • 39. Future ● Use Kubernetes with AWS GPU instances ● Change the queue ○ RabbitMQ ○ Kafka ● Optimize cost with AWS Spot Instances
  • 40. Q/A