Making Kubernetes Production Ready
Harry Zhang
Abhinav Das
VoIP or Dial-in (see chat)
Questions? Send via the GTW ‘questions’ chat
But first, some quick housekeeping
• We have about 40 minutes of content but more time for questions
• We will post slides today and email a video by Monday
• Send any questions via the GTW ‘questions’ chat
• If audio fails, let us know on chat! We will re-dial in quickly…
• Apologies for the train that goes by at about the :24 minute mark
July 21, 2017
Who are we?
Harry Zhang
Abhinav Das
A Short Poll
About Applatix
• Platform to build and run containerized apps in the cloud
▪ Built on Kubernetes
• Simplify the journey to the cloud with:
▪ Infrastructure automation
▪ End to end DevOps workflows
▪ Monitoring, audit and governance
Outline
• What is “Production Ready” to us
• Kubernetes Design at a Glance
• How we hardened Kubernetes Master
• How we hardened Kubernetes Minion
What is “Production Ready”?
Our Workload
(Diagram: Workflow 1 and Workflow 2, each composed of Tasks)
Our Workload
High Pod churn
• Large number of Pods created and deleted per unit time
Current Applatix Production Workload
Everybody talks to the API server!
• 20+ Controllers
• All Kubelets
• All Kube-Proxy
• Scheduler
• Add-ons
• Other custom microservices
Default configurations
do not work for us!
Kubernetes At A Glance
Problem 1: Master is Crashing
When the master crashes …
Knobs to manage the API server
• Throttle API requests: --max-requests-inflight
  ▪ Rule of thumb: we use 1 in-flight request per 2 Pods
• Control memory consumption: --target-ram-mb
  ▪ Configures the watch cache and deserialization cache
  ▪ Rule of thumb: we use 2.5 MB per Pod (see the sketch below)
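As a concrete sketch, here is how these two flags might be set for a hypothetical cluster of about 2,000 Pods, sized with the rules of thumb above (the cluster size and the shell invocation are illustrative assumptions, not values from the talk):

```bash
# Hypothetical sizing for a ~2,000-Pod cluster using the rules of thumb above:
#   1 in-flight request per 2 Pods -> 2000 / 2   = 1000
#   2.5 MB per Pod                 -> 2000 * 2.5 = 5000 MB
kube-apiserver \
  --max-requests-inflight=1000 \
  --target-ram-mb=5000
  # ...plus the rest of your existing API server flags
```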
Knobs for the controller manager
• Control level of parallelism:
  ▪ --concurrent-deployment-syncs
  ▪ --concurrent-endpoint-syncs
  ▪ --concurrent-gc-syncs
  ▪ --concurrent-namespace-syncs
  ▪ --concurrent-replicaset-syncs
  ▪ --concurrent-resource-quota-syncs
  ▪ --concurrent-service-syncs
  ▪ --concurrent-serviceaccount-token-syncs
  ▪ --concurrent-rc-syncs
• Rule of thumb:
  ▪ Set a large value for the controllers you use frequently and that require fast response
  ▪ For example, our production cluster can have a couple hundred Deployments, so we assign 20 workers each to Deployment, ReplicaSet, and ReplicationController syncs (see the sketch below)
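For illustration, a minimal sketch of those settings on the controller-manager command line; the 20-worker value comes from the slide, everything else is assumed:

```bash
# 20 workers each for the controllers we exercise most heavily
# (Deployments, ReplicaSets, ReplicationControllers); others stay at defaults.
kube-controller-manager \
  --concurrent-deployment-syncs=20 \
  --concurrent-replicaset-syncs=20 \
  --concurrent-rc-syncs=20
  # ...plus the rest of your existing controller-manager flags
```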
Knobs for the controller manager
• Control memory consumption:
  ▪ --replication-controller-lookup-cache-size
  ▪ --replicaset-lookup-cache-size
  ▪ --daemonset-lookup-cache-size
• Rule of thumb:
  ▪ Available only in versions prior to 1.6
  ▪ We use ~4G/4G/1G respectively for the three flags on our production cluster, and scale them down based on master resources for other cluster types
Knobs to control API calls
• Throttle API query rate: --kube-api-qps, --kube-api-burst
• Rule of thumb:
  ▪ Keep the API server's maximum in-flight request limit in mind when setting these
  ▪ We set 3 QPS per 10 Pods for the scheduler, with burst set to double that (see the sketch below)
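Applying that rule of thumb to the same hypothetical 2,000-Pod cluster (the Pod count is an assumption for illustration):

```bash
# 3 QPS per 10 Pods -> 2000 * 3 / 10 = 600 QPS; burst is double the QPS.
kube-scheduler \
  --kube-api-qps=600 \
  --kube-api-burst=1200
  # ...plus the rest of your existing scheduler flags
```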
Admission Control
• Another observation: if we let Pod creation go unconstrained, the Kubernetes master became unstable
  ▪ We have an admission controller that manages the creation of Pods
  ▪ This ensures we only create Pods that the cluster has the resources to run (a rough stock analogue is sketched below)
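The admission controller described here is custom to Applatix and is not shown in the talk. As a rough analogue using stock Kubernetes, the ResourceQuota admission plugin rejects Pod creation once a namespace's resource budget would be exceeded; the namespace and limits below are illustrative:

```bash
# Reject new Pods at admission time once the namespace's budget is exhausted.
kubectl create -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: pod-budget
  namespace: workflows   # illustrative namespace
spec:
  hard:
    pods: "500"          # cap on concurrently existing Pods
    requests.cpu: "200"  # total CPU that Pods may request
    requests.memory: 400Gi
EOF
```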
Further Reduce Master Workload
Problem 2: Minion Becomes “Unhealthy”
Many things can go wrong
What we do
Kernel CFS Bug (Kubernetes Issue #874)
[ 3960.004144] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
[ 3960.008059] IP: [<ffffffff810b332f>] pick_next_task_fair+0x30f/0x4a0
[ 3960.008059] PGD 6e7bd7067 PUD 72813c067 PMD 0
[ 3960.008059] Oops: 0000 [#1] SMP
[ 3960.008059] Modules linked in: xt_statistic(E) xt_nat(E) ......
[ 3960.008059] CPU: 4 PID: 10158 Comm: mysql_tzinfo_to Tainted: G E 4.4.41-k8s #1
[ 3960.008059] Hardware name: Xen HVM domU, BIOS 4.2.amazon 11/11/2016
[ 3960.008059] task: ffff8807578fae00 ti: ffff88075f028000 task.ti: ffff88075f028000
[ 3960.008059] RIP: 0010:[<ffffffff810b332f>] [<ffffffff810b332f>] pick_next_task_fair+0x30f/0x4a0
[ 3960.008059] RSP: 0018:ffff88075f02be38 EFLAGS: 00010046
[ 3960.008059] RAX: 0000000000000000 RBX: ffff8807250ff400 RCX: 0000000000000000
[ 3960.008059] RDX: ffff88078fc95e30 RSI: 0000000000000000 RDI: ffff8807250ff400
[ 3960.008059] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88076bc13700
[ 3960.008059] R10: 0000000000001cf7 R11: ffffea001c98a100 R12: 0000000000015dc0
[ 3960.008059] R13: 0000000000000000 R14: ffff88078fc95dc0 R15: 0000000000000004
[ 3960.008059] FS: 00007fa34b7f6740(0000) GS:ffff88078fc80000(0000) knlGS:0000000000000000
[ 3960.008059] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3960.008059] CR2: 0000000000000080 CR3: 000000067762d000 CR4: 00000000001406e0
[ 3960.008059] Stack:
[ 3960.008059] ffff8807578fae00 0000000000001000 0000000200000000 0000000000015dc0
[ 3960.008059] ffff88078fc95e30 00007fa34b7fc000 000000005ef04228 ffff88078fc95dc0
[ 3960.008059] ffff8807578fae00 0000000000015dc0 0000000000000000 ffff8807578fb2a0
[ 3960.008059] Call Trace:
[ 3960.008059] [<ffffffff8159cd1f>] ? __schedule+0xdf/0x960
[ 3960.008059] [<ffffffff8159d5d1>] ? schedule+0x31/0x80
[ 3960.008059] [<ffffffff810031cb>] ? exit_to_usermode_loop+0x6b/0xc0
[ 3960.008059] [<ffffffff81003bcf>] ? syscall_return_slowpath+0x8f/0x110
[ 3960.008059] [<ffffffff815a1518>] ? int_ret_from_sys_call+0x25/0x8f
[ 3960.008059] Code: c6 44 24 17 00 eb ......
[ 3960.008059] RIP [<ffffffff810b332f>] pick_next_task_fair+0x30f/0x4a0
[ 3960.008059] RSP <ffff88075f02be38>
[ 3960.008059] CR2: 0000000000000080
[ 3960.008059] ---[ end trace e1b9f0775b83e8e3 ]---
[ 3960.008059] Kernel panic - not syncing: Fatal exception
What we do
Summary
• Kubernetes resource consumption is directly related to the number of Pods and the rate of Pod churn
• Find a balance among performance, stability, and cost
• Kubernetes is stable and production ready
Thank You!
Q&A
