Kubernetes on AWS at Europe's Leading Online Fashion Platform

Henning Jacobs is a Kubernetes on AWS Hacker at Zalando Tech. His talk briefly covers what we at Zalando Tech have learned from running Kubernetes on AWS in production.

Topics include:

- Cluster provisioning,
- AWS integration,
- Ingress,
- Cluster autoscaling,
- OAuth/IAM and
- Operations/monitoring.

https://www.meetup.com/Zalando-Tech-Events-Berlin/events/238212872/

Published in: Technology

  1. KUBERNETES ON AWS AT EUROPE’S LEADING ONLINE FASHION PLATFORM
      Henning Jacobs, @try_except_, 2017-03-27
  2. ZALANDO
       • 15 markets
       • 6 fulfillment centers
       • 20 million active customers
       • 3.6 billion € net sales 2016
       • 165 million visits per month
       • 12,000 employees in Europe
  3. ZALANDO TECHNOLOGY
       • HOME-BREWED, CUTTING-EDGE & SCALABLE technology solutions
       • >1,600 employees from 77 nations
       • 6 tech locations + HQs in Berlin
       • help our brand to WIN ONLINE
  4. KUBERNETES ON AWS: CONTEXT
       • 200 engineering teams
       • 30 prod. clusters
       • AWS
       • Dockerized apps
       • No manual operations
       • Reliability
       • Autoscaling
       • Seamless migration
  5. ARCHITECTURE
  6. ISOLATED AWS ACCOUNTS
      [Diagram labels: Internet; *.abc.example.org, *.xyz.example.org; Product ABC, Product XYZ; LB, LB; EC2]
  7. KUBERNETES ON AWS
  8. ARCHITECTURE DECISIONS
       • API server behind SSL ELB
       • Webhook for authn & authz
       • OAuth Bearer token
       • Group membership lookup
       • Read only access to production
       • CI/CD for write access
       • etcd running separately on EC2
       • Multi AZ clusters
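
Slide 8 lists a webhook for authorization, group membership lookup, read-only access to production, and write access via CI/CD. The slides do not show the webhook itself, so the following is only a minimal sketch of how such a decision could look. It assumes the request arrives as a Kubernetes `SubjectAccessReview` document; the group names `employees` and `cicd` and the function name `authorize` are hypothetical.

```python
# Verbs treated as read-only in this sketch.
READ_VERBS = frozenset({"get", "list", "watch"})


def authorize(review, read_groups=frozenset({"employees"}),
              write_groups=frozenset({"cicd"})):
    """Answer a SubjectAccessReview-shaped dict with an allow/deny status.

    The group names are placeholders, not taken from the talk.
    """
    spec = review.get("spec", {})
    # authorization.k8s.io/v1 uses "groups"; v1beta1 serialized it as "group".
    groups = set(spec.get("groups") or spec.get("group") or [])
    attrs = (spec.get("resourceAttributes")
             or spec.get("nonResourceAttributes") or {})
    verb = attrs.get("verb", "")

    if groups & write_groups:
        allowed, reason = True, "member of a write group"
    elif verb in READ_VERBS and groups & read_groups:
        allowed, reason = True, "read-only access"
    else:
        allowed, reason = False, "not permitted by policy"

    return {
        "apiVersion": review.get("apiVersion", "authorization.k8s.io/v1"),
        "kind": "SubjectAccessReview",
        "status": {"allowed": allowed, "reason": reason},
    }


# Usage: a human user in a read group tries to delete pods.
review = {
    "apiVersion": "authorization.k8s.io/v1",
    "kind": "SubjectAccessReview",
    "spec": {
        "user": "jdoe",
        "groups": ["employees"],
        "resourceAttributes": {
            "verb": "delete", "resource": "pods", "namespace": "default",
        },
    },
}
decision = authorize(review)
```

In this sketch the group lookup is assumed to have happened already (the groups arrive in the review), so the function only maps verbs and groups to a decision.
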
  9. CLUSTER PROVISIONING
  10. CLUSTER PROVISIONING
       • Two Cloud Formation stacks
       • Master & worker ASGs + etcd
       • Nodes w/ Container Linux
       • K8s manifests applied separately
       • kube-system Deployments
       • DaemonSets
  11. DEPLOYMENT
  12. DEPLOYMENT CONFIGURATION

        .
        ├── apply
        │   ├── credentials.yaml       # K8s TPR
        │   ├── ingress.yaml           # K8s Ingress
        │   ├── redis-deployment.yaml  # K8s Deployment
        │   ├── redis-service.yaml     # K8s Service
        │   └── service.yaml           # K8s Service
        ├── deployment.yaml            # K8s Deployment
        └── pipeline.yaml              # proprietary config

  13. JENKINS DEPLOY PIPELINE
  14. INGRESS
  15. INGRESS.YAML

        apiVersion: extensions/v1beta1
        kind: Ingress
        metadata:
          name: "{{ application }}"
          annotations:
            # optional: SSL certificate ARN to use for the ALB (auto discovery for ACM)
            zalando.org/aws-load-balancer-ssl-cert: "arn:aws:iam:..:..:..1a"
        spec:
          rules:
            # DNS name your application should be exposed on
            - host: "myapp.foo.example.org"
              http:
                paths:
                  - backend:
                      serviceName: "{{ application }}"
                      servicePort: 80

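
The manifests on slide 15 and later use `{{ application }}` placeholders. The slides do not say which tool fills them in, so the following is only a sketch that assumes plain mustache-style substitution of named values; the function name `render` and the sample value `myapp` are hypothetical.

```python
import re

# Matches "{{ name }}" with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, values):
    """Replace every {{ name }} placeholder with values[name]."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            # Fail loudly instead of shipping a manifest with a raw placeholder.
            raise KeyError(f"no value for placeholder {key!r}")
        return str(values[key])

    return PLACEHOLDER.sub(substitute, template)


manifest = 'metadata:\n  name: "{{ application }}"\n'
rendered = render(manifest, {"application": "myapp"})
```

Raising on an unknown placeholder is a design choice of this sketch: a missing value then stops the deployment early rather than producing an object with a literal `{{ ... }}` name.
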
  16. INGRESS CONTROLLER
  17. AWS INTEGRATION
  18. CLOUD FORMATION VIA CI/CD

        .
        ├── apply
        │   ├── cf-iam-role.yaml   # AWS IAM Role
        │   ├── cf-rds.yaml        # AWS RDS Database
        │   ├── kube-ingress.yaml  # K8s Ingress
        │   ├── kube-secret.yaml   # K8s Secret
        │   └── kube-service.yaml  # K8s Service
        ├── deployment.yaml        # K8s Deployment
        └── pipeline.yaml          # CI/CD config

  19. ASSIGNING AWS IAM ROLE TO POD

        kind: Deployment
        spec:
          template:
            metadata:
              annotations:
                # annotation for kube2iam
                iam.amazonaws.com/role: "app-{{ application }}-1"
            spec:
              containers:
                - name: ...
                  ...

  20. CLUSTER AUTOSCALING
  21. CLUSTER AUTOSCALING
      Control # of worker nodes in ASG:
       • Satisfy all resource requests
       • One spare node per AZ
       • No manual config “tweaking”
       • Scale down, but not too fast
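
The goals on slide 21 can be sketched as a small sizing calculation. This is not the code of the autoscaler used at Zalando; it is a simplified illustration that assumes identical worker nodes, ignores bin-packing fragmentation, and limits scale-down to a fixed step per run. The names `desired_nodes`, `next_asg_size`, and `max_scale_down_step` are hypothetical.

```python
import math

GIB = 2 ** 30


def desired_nodes(pod_requests, node_capacity, spare_nodes=1):
    """Nodes needed in one AZ: enough for all requests plus spare nodes."""
    needed = 0
    for resource, capacity in node_capacity.items():
        total = sum(pod.get(resource, 0) for pod in pod_requests)
        # The scarcest resource determines the node count.
        needed = max(needed, math.ceil(total / capacity))
    return needed + spare_nodes


def next_asg_size(current, pods_by_az, node_capacity, max_scale_down_step=1):
    """Scale up immediately, but shrink by at most a fixed step per run."""
    desired = sum(desired_nodes(pods, node_capacity)
                  for pods in pods_by_az.values())
    if desired < current:
        return max(desired, current - max_scale_down_step)
    return desired


# Assumed node size: 4 CPU cores and 16 GiB of memory.
capacity = {"cpu": 4, "memory": 16 * GIB}
pods_by_az = {
    "eu-central-1a": [
        {"cpu": 1.5, "memory": 4 * GIB},
        {"cpu": 3, "memory": 2 * GIB},
    ],
    "eu-central-1b": [],
}
size_when_over = next_asg_size(7, pods_by_az, capacity)   # shrinks slowly
size_when_under = next_asg_size(2, pods_by_az, capacity)  # grows at once
```

Because the size is derived from the pods' resource requests alone, there are no per-cluster thresholds to tune, which is one way to read the "no manual config tweaking" goal.
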
  22. CURRENT SETUP
       • https://github.com/hjacobs/kube-aws-autoscaler
       • Node draining via systemd unit
      Open topic: node “readiness” during scale out
  23. OAUTH / IAM INTEGRATION
  24. DECLARING NEEDED CREDENTIALS

        # apply/credentials.yaml
        apiVersion: "zalando.org/v1"
        kind: PlatformCredentialsSet
        metadata:
          name: "{{ application }}"
        spec:
          application: "{{ application }}"
          tokens: # OAuth service tokens
            mytok: # the token name used in application code
              privileges:
                - com.zalando::foobar.write
          clients: # OAuth clients
            implicit:
              grant: implicit # grant type according to RFC-6749
              realm: users
              redirectUri: https://myapp.foo.example.org/oauth

  25. MOUNTING THE OAUTH CREDENTIALS

        kind: Deployment
        spec:
          template:
            spec:
              containers:
                - name: ...
                  ...
                  volumeMounts:
                    - name: "{{ application }}-credentials"
                      mountPath: /meta/credentials
                      readOnly: true
              volumes:
                - name: "{{ application }}-credentials"
                  secret:
                    secretName: "{{ application }}"

  26. USING THE OAUTH CREDENTIALS

        #!/bin/bash
        type=$(cat /meta/credentials/read-only-token-type)
        secret=$(cat /meta/credentials/read-only-token-secret)
        curl -H "Authorization: $type $secret" https://resource-server.example.org/protected

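
The bash snippet on slide 26 reads two files from the mounted secret and combines them into an `Authorization` header. As a rough equivalent for application code, here is a sketch in Python that assumes the same `<token name>-token-type` and `<token name>-token-secret` file layout under `/meta/credentials`; the helper names `authorization_header` and `build_request` are hypothetical.

```python
from pathlib import Path
import urllib.request


def authorization_header(token_name, credentials_dir="/meta/credentials"):
    """Build the header value from the mounted token files."""
    base = Path(credentials_dir)
    token_type = (base / f"{token_name}-token-type").read_text().strip()
    secret = (base / f"{token_name}-token-secret").read_text().strip()
    return f"{token_type} {secret}"


def build_request(url, token_name, credentials_dir="/meta/credentials"):
    """Prepare (but do not send) a request carrying the token."""
    header = authorization_header(token_name, credentials_dir)
    return urllib.request.Request(url, headers={"Authorization": header})


# Usage inside a pod with the secret mounted, mirroring the bash example:
# request = build_request(
#     "https://resource-server.example.org/protected", "read-only")
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

The files are read on every call rather than cached, so a changed secret on disk is picked up without a restart; whether and how the mounted tokens change over time is not stated on the slides.
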
  27. OPERATIONS & MONITORING
  28. OPERATIONS
       • Cluster updates automatic via CLM
       • CronJob is great, but needs cleanup
       • Docker can be PITA
  29. CLUSTER UPDATES
  30. LIMIT RANGE

        kubectl describe limitrange
        Name:       limits
        Namespace:  default
        Type       Resource  Min  Max   Default Req  Default Limit  Max Limit/Request Ratio
        ----       --------  ---  ---   -----------  -------------  -----------------------
        Container  memory    -    64Gi  100Mi        1Gi            -
        Container  cpu       -    16    100m         3              -

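
To make the limit range on slide 30 concrete, the sketch below applies its values to a container's `resources` section. It is a simplified model of LimitRange defaulting, not the Kubernetes implementation: it handles only the `m` CPU suffix and the `Ki`/`Mi`/`Gi` memory suffixes, and the function names are hypothetical. The keys `max`, `defaultRequest`, and `default` follow the LimitRange spec fields.

```python
BINARY_SUFFIXES = {"Ki": 2 ** 10, "Mi": 2 ** 20, "Gi": 2 ** 30}

# Values from the "kubectl describe limitrange" output on the slide.
LIMIT_RANGE = {
    "memory": {"max": "64Gi", "defaultRequest": "100Mi", "default": "1Gi"},
    "cpu": {"max": "16", "defaultRequest": "100m", "default": "3"},
}


def parse_cpu_millicores(quantity):
    """'100m' -> 100, '3' -> 3000."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def parse_memory_bytes(quantity):
    """'100Mi' -> 104857600; plain integers are taken as bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)


PARSERS = {"cpu": parse_cpu_millicores, "memory": parse_memory_bytes}


def apply_defaults(resources):
    """Fill in missing requests and limits from the limit range."""
    requests = dict(resources.get("requests", {}))
    limits = dict(resources.get("limits", {}))
    for name, rule in LIMIT_RANGE.items():
        if name not in requests:
            # An explicit limit without a request makes the request equal it.
            requests[name] = limits.get(name, rule["defaultRequest"])
        limits.setdefault(name, rule["default"])
    return {"requests": requests, "limits": limits}


def exceeds_max(resources):
    """Names of resources whose limit is above the limit range maximum."""
    limits = apply_defaults(resources)["limits"]
    return [name for name, rule in LIMIT_RANGE.items()
            if PARSERS[name](limits[name]) > PARSERS[name](rule["max"])]


defaults = apply_defaults({})
too_big = exceeds_max({"limits": {"memory": "128Gi"}})
```

A container that declares nothing therefore ends up with a 100m CPU / 100Mi memory request and a 3 CPU / 1Gi memory limit, matching the two default columns of the table.
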
  31. MONITORING
  32. SIMPLE ZMON CHECK/ALERT EXAMPLE
  33. MONITORING
       • Each cluster contains ZMON appliance
       • K8s resources are available as ZMON entities
       • Users can create app checks/alerts via UI
  34. https://github.com/hjacobs/kube-ops-view
  35. OPEN SOURCE
  36. OPEN SOURCE
       • Kube AWS Ingress Controller: https://github.com/zalando-incubator/kube-ingress-aws-controller
       • External DNS: https://github.com/kubernetes-incubator/external-dns
       • Zalando Cluster Config & Docs: https://github.com/zalando-incubator/kubernetes-on-aws
      more to come...
  37. QUESTIONS?
      Henning Jacobs, Tech Infrastructure Cloud Engineer
      henning@zalando.de, @try_except_