
Kubernetes at Zalando - CNCF End User Committee Presentation

Inaugural presentation by Zalando in the first CNCF End User Committee Call on 2017-04-13.



  1. Kubernetes at Zalando
     CNCF END USER COMMITTEE
     HENNING JACOBS, @try_except_, 2017-04-13
  2. …HAS BECOME THE EUROPEAN ONLINE PLATFORM FOR FASHION
  3. OUR VISION: CONNECTING PEOPLE AND FASHION
  4. WE OFFER A SUCCESSFUL AND CURATED ASSORTMENT
     • ~200,000 articles from >1,500 international brands
     • 17 private labels
     • HIGHLY EXPERIENCED category management
     • >350 designers & stylists
     • LOCALIZATION of the assortment
     • CURATED SHOPPING with Zalon
  5. ZALANDO
     • 15 markets
     • 6 fulfillment centers
     • 20 million active customers
     • 3.6 billion € net sales 2016
     • 165 million visits per month
     • 12,000 employees in Europe
  6. OUR FOOTPRINT AROUND EUROPE (as at Dec 2016)
     • BERLIN: HEADQUARTERS / TECH HUB
     • BRIESELANG: FULFILLMENT CENTER
     • ERFURT: FULFILLMENT CENTER
     • MÖNCHENGLADBACH: FULFILLMENT CENTER
     • LAHR: FULFILLMENT CENTER
     • DORTMUND: TECH HUB
     • FRANKFURT: OUTLET
     • DUBLIN: TECH HUB
     • HELSINKI: TECH HUB
     • STRADELLA: FULFILLMENT CENTER
     • KÖLN: OUTLET
     • MOISSY-CRAMAYEL: FULFILLMENT CENTER
     • GRYFINO (start autumn 2017): FULFILLMENT CENTER
  7. ZALANDO TECHNOLOGY
     • HOME-BREWED, CUTTING-EDGE & SCALABLE technology solutions
     • >1,600 employees from 77 nations
     • 6 tech locations + HQs in Berlin
     • help our brand to WIN ONLINE
  8. KUBERNETES ON AWS: CONTEXT
     • 200 engineering teams
     • 30 prod. clusters
     • AWS
     • Dockerized apps
     • No manual operations
     • Reliability
     • Autoscaling
     • Seamless migration
  9. CURRENT STATUS
     • First service in prod. on Kubernetes since Nov 2016
     • 8 production clusters
     • 8 non-production clusters
     • “Early Access” phase (onboarding of individual teams)
  10. ARCHITECTURE
  11. ISOLATED AWS ACCOUNTS
      [Diagram labels: Internet; *.abc.example.org; *.xyz.example.org; Product ABC; Product XYZ; LB; EC2]
  12. KUBERNETES ON AWS
  13. ARCHITECTURE DECISIONS
      • One prod. cluster per AWS account / “product”
      • API server behind SSL ELB, OAuth webhook
      • Read only access to production
      • CI/CD for write access
      • etcd running separately on EC2
      • Multi AZ clusters
  14. CLUSTER PROVISIONING
  15. CLUSTER PROVISIONING
      • Two Cloud Formation stacks
      • Master & worker ASGs + etcd
      • Nodes w/ Container Linux
      • K8s manifests applied separately
      • kube-system Deployments
      • DaemonSets
  16. DEPLOYMENT
  17. DEPLOYMENT CONFIGURATION
      .
      ├── apply
      │   ├── credentials.yaml       # K8s TPR
      │   ├── ingress.yaml           # K8s Ingress
      │   ├── redis-deployment.yaml  # K8s Deployment
      │   ├── redis-service.yaml     # K8s Service
      │   └── service.yaml           # K8s Service
      ├── deployment.yaml            # K8s Deployment
      └── pipeline.yaml              # proprietary config
  18. JENKINS DEPLOY PIPELINE
  19. INGRESS
  20. INGRESS CONTROLLER
      https://github.com/zalando-incubator/kube-ingress-aws-controller
  21. AWS INTEGRATION
  22. CLOUD FORMATION VIA CI/CD
      .
      ├── apply
      │   ├── cf-iam-role.yaml   # AWS IAM Role
      │   ├── cf-rds.yaml        # AWS RDS Database
      │   ├── kube-ingress.yaml  # K8s Ingress
      │   ├── kube-secret.yaml   # K8s Secret
      │   └── kube-service.yaml  # K8s Service
      ├── deployment.yaml        # K8s Deployment
      └── pipeline.yaml          # CI/CD config
  23. CLUSTER AUTOSCALING
  24. CLUSTER AUTOSCALING
      Control # of worker nodes in ASG:
      • Satisfy all resource requests
      • One spare node per AZ
      • No manual config “tweaking”
      • Scale down, but not too fast
      https://github.com/hjacobs/kube-aws-autoscaler
  25. OAUTH / IAM INTEGRATION
  26. OAUTH INTEGRATION
      • App declares needed credentials via Kubernetes Third Party Resource
      • OAuth client/tokens are provisioned as Kubernetes secrets
  27. OPERATIONS & MONITORING
  28. OPERATIONS
      • Cluster updates automatic via CLM
      • CronJob is great, but needs cleanup
      • Docker can be PITA
  29. CLUSTER UPDATES
  30. MONITORING
  31. MONITORING
      • Each cluster contains ZMON appliance
      • K8s resources are available as ZMON entities
      • Users can create app checks/alerts via UI
      https://zmon.io/
  32. https://github.com/hjacobs/kube-ops-view
  33. OPEN SOURCE
  34. OPEN SOURCE
      • Kube AWS Ingress Controller: https://github.com/zalando-incubator/kube-ingress-aws-controller
      • External DNS: https://github.com/kubernetes-incubator/external-dns
      • Zalando Cluster Config & Docs: https://github.com/zalando-incubator/kubernetes-on-aws
      • more to come...
  35. OUR EXPERIENCE SO FAR
      • Missing best practices are a big pain point
      • Community is great and welcoming
      • Talking with other users is essential
      • Slack channels are “compensating” lack of docs
  36. LIST OF KUBERNETES ON AWS USERS
      If you are using Kubernetes on AWS, please fill out the Google form:
      https://github.com/hjacobs/kubernetes-on-aws-users
  37. QUESTIONS?
      HENNING JACOBS, TECH INFRASTRUCTURE CLOUD ENGINEER
      henning@zalando.de
      @try_except_
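
The sizing rules on the CLUSTER AUTOSCALING slide (satisfy all resource requests, keep one spare node per AZ, scale down but not too fast) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the code of kube-aws-autoscaler. The names `Resources`, `nodes_needed`, `desired_nodes_per_az`, and `next_asg_size` are hypothetical. Requests are simply summed per AZ, which ignores bin-packing effects a real scheduler has to deal with. The limit of one node per scale-down step is an assumed reading of "not too fast".

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """CPU and memory, either requested by a pod or offered by a node."""
    cpu_millicores: int
    memory_mib: int


def nodes_needed(requests, node_capacity):
    """Smallest node count whose summed capacity covers the summed requests.

    Assumption: all worker nodes have the same capacity.
    """
    total_cpu = sum(r.cpu_millicores for r in requests)
    total_mem = sum(r.memory_mib for r in requests)
    return max(
        math.ceil(total_cpu / node_capacity.cpu_millicores),
        math.ceil(total_mem / node_capacity.memory_mib),
    )


def desired_nodes_per_az(requests_by_az, node_capacity, spare_nodes=1):
    """Nodes per AZ: enough for all requests, plus spare capacity."""
    return {
        az: nodes_needed(requests, node_capacity) + spare_nodes
        for az, requests in requests_by_az.items()
    }


def next_asg_size(current, desired, max_scale_down_step=1):
    """Scale up immediately, but shrink by at most one step per run."""
    if desired >= current:
        return desired
    return max(desired, current - max_scale_down_step)


if __name__ == "__main__":
    # Hypothetical node size and pod requests, for illustration only.
    capacity = Resources(cpu_millicores=4000, memory_mib=16384)
    requests_by_az = {
        "eu-central-1a": [Resources(1500, 2048), Resources(3000, 4096)],
        "eu-central-1b": [Resources(500, 1024)],
    }
    per_az = desired_nodes_per_az(requests_by_az, capacity)
    desired_total = sum(per_az.values())
    print(per_az, next_asg_size(current=8, desired=desired_total))
```

Keeping the scale-down step separate from the sizing calculation means the "how many nodes do we need" logic stays a pure function of the resource requests, while the damping policy can be tuned on its own.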
