Lessons learned with Kubernetes in production at PlayPass, presented at the 6th Docker Birthday Meetup in Antwerpen: what went well and what are some open issues. We also discussed some security measures after the presentations.
3. New apps: new tech stack
● existing apps: pets, Ruby/Rails, PostgreSQL, classic stack
● Docker for development (docker-compose, ruby, elixir)
● Kubernetes for deployment, for HA
● on GKE (Google Kubernetes Engine)
● Gitlab for CI/CD
● terraform for base infra (clusters, networking, firewall)
● kubectl and helm for app deployment (sketch after this list)
● Vault for secrets management
● Stackdriver and DataDog for logging and monitoring
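As a concrete sketch of that deploy step (release name, chart path, and the exact invocation are assumptions, not taken verbatim from the talk), a Gitlab CI job might run roughly:

    # deploy or upgrade the release, then wait for the rollout to finish
    helm upgrade --install my-app ./chart \
      --namespace play \
      --set image.tag="$CI_COMMIT_SHA"
    kubectl --namespace play rollout status deployment/my-app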
4. Environments (GCP projects + namespaces)
● k8s-dev: cluster on pre-emptible nodes; applications (namespaces): default, play, experimental apps
● edge: cluster on pre-emptible nodes; applications (namespaces): default, play, gitlab runners, dashboard, new apps ...
● production: cluster on fixed VM’s; applications (namespaces): default, play, gitlab runners, dashboard, new apps ...
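Switching between these environments then comes down to fetching credentials per GCP project and picking a namespace; a sketch (project name and zone are placeholders):

    # point kubectl at the edge cluster in its own GCP project
    gcloud container clusters get-credentials edge \
      --project my-edge-project --zone europe-west1-b
    # work in the play namespace by default
    kubectl config set-context --current --namespace=play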
5. Kubernetes => pet infrastructure firewalling
● “pet infra” side: RMQ behind a firewall that only allows ingress from a fixed IP
● “cloud-native” side: node 1, node 2 and node 3 are preemptible, each getting a new public IP on pre-emption
● => the fixed-IP allowlist breaks whenever a node is pre-empted and comes back with a new IP
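The pet-infra side of that picture boils down to a source-IP allowlist; a sketch with gcloud, assuming RMQ speaks AMQP on 5672 (network name, target tag, and address are placeholders):

    # allow RMQ ingress only from one fixed source IP
    gcloud compute firewall-rules create allow-rmq-ingress \
      --network pet-net \
      --allow tcp:5672 \
      --source-ranges 203.0.113.10/32 \
      --target-tags rmq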
7. NAT GW: Kubernetes => NAT GW => outside
● node 1, node 2 and node 3 remain preemptible, each still getting a new public IP on pre-emption
● the nodes’ default GW routes egress through a NAT GW, which presents one fixed IP to the outside
● also in the picture: the GKE master and an ssh jump host
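One way to get that fixed egress IP on GCP is Cloud NAT on a Cloud Router (the setup in the talk may equally have been a NAT instance; names, region, and the reserved address are placeholders):

    # router + NAT config; "egress-ip" is a pre-reserved static address
    gcloud compute routers create nat-router \
      --network default --region europe-west1
    gcloud compute routers nats create k8s-nat \
      --router nat-router --region europe-west1 \
      --nat-all-subnet-ip-ranges \
      --nat-external-ip-pool egress-ip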
8. Node pre-emption and upgrades (planned events)
● Is the GCP LB aware upfront of nodes going down ??
● Do nodes do a proper `drain` on pre-emption ??
● Are nodes left hanging in `cordoned` state after an upgrade ??
● Node pre-emption => short service interruptions
● Production => non-preemptible nodes $$$
● Node upgrade => short service interruption, manual uncordon needed (see next slide)
9. Node upgrade
● upgrading from the GKE GUI leaves most nodes in the SchedulingDisabled state
● => manual uncordon needed (sketch below)
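The manual uncordon is easy to script; a minimal sketch:

    # uncordon every node still marked SchedulingDisabled after the upgrade
    kubectl get nodes --no-headers \
      | awk '/SchedulingDisabled/ {print $1}' \
      | xargs -r kubectl uncordon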
10. Cloud SQL postgres performance issues
● Try to be “servicefull”, obviously :-)
● postgres service with HA, backups, auto-scaling of disk
● a basic performance test revealed that a workload with a few INSERTs per transaction dropped to only 20 INSERTs per second and started disconnecting cloudsql-proxy
● => reverted 1 “fast aggregate” to self-hosted postgres, using Docker (back-up, HA => rebuild the aggregate), as sketched below
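A minimal sketch of such a self-hosted fallback (image tag, volume path, and container name are assumptions, not from the talk):

    # single-node postgres in Docker; data on a persistent disk mount
    docker run -d --name fast-aggregate-db \
      -p 5432:5432 \
      -e POSTGRES_PASSWORD="$PGPASSWORD" \
      -v /mnt/disks/pgdata:/var/lib/postgresql/data \
      postgres:11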
12. Extra security measures as discussed
● Kaniko for building images without “docker-in-docker” or privileged mode (see the CI sketch after this list)
● PodSecurityPolicy with non-privileged and MustRunAsNonRoot:
      spec:
        privileged: false
        allowPrivilegeEscalation: false
        runAsUser:
          # Require the container to run without root privileges.
          rule: 'MustRunAsNonRoot'
● “Private” cluster with NAT for external access
● RBAC on different namespaces
● tiller per namespace with limited roleBinding and `--listen=localhost:44134` (sketch after this list)
(ref https://engineering.bitnami.com/articles/helm-security.html)
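For the Kaniko point, a minimal sketch of a build step as it might run inside a CI job using the kaniko executor image (gcr.io/kaniko-project/executor); the registry path is an illustrative assumption:

    # build and push without a Docker daemon or privileged mode
    /kaniko/executor \
      --context "$CI_PROJECT_DIR" \
      --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
      --destination "eu.gcr.io/my-project/my-app:$CI_COMMIT_SHA"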
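For the per-namespace tiller, the pattern from the referenced Bitnami article binds tiller to localhost so that only helm’s port-forward can reach it; the namespace and service-account names are assumptions, and the override below is the assumed Helm 2 syntax:

    # per-namespace tiller, unreachable from other pods in the cluster
    helm init --tiller-namespace play \
      --service-account tiller-play \
      --override 'spec.template.spec.containers[0].command'='{/tiller,--storage=secret,--listen=localhost:44134}'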