Amazon EKS: the good, the bad, and the ugly

Geoff Flarity, Software Engineer at Cash App (Square), gave a talk covering everything you need to know about EKS, AWS's managed Kubernetes offering, at the Kubernetes + Cloud Native meetups in Toronto and Kitchener-Waterloo.

Published in: Technology

  1. 1. Amazon EKS The good, the bad, and the ugly.
  2. 2. I am... Geoff Flarity (gflarity) Software Engineer Cash App (Square)
  3. 3. Uphill BOTH WAYS ● Cash started using Kubernetes back around 1.2-1.3 ● On GKE ● YOLO!?
  4. 4. About that Cash App... ● Over 15m MAU, December 2018 ○ We define active as making a money movement ● Cash Card GPV: ○ 90m Dec 2017 ○ 250m June 2018 => ● People love us so much they write songs about us!
  5. 5. About that Cash App... Songs written about Cash… ~90
  6. 6. About that Cash App...
  7. 7. Cash App on EKS +
  8. 8. Cash App on EKS https://blog.hasura.io/gke-vs-aks-vs-eks-411f080640dc/ ● Check out the comparison chart ● Some of the info is out of date ● This talk will focus on the issues that matter to the Cash App platform
  9. 9. The Good
  10. 10. The Good ● Managed control plane ● Automatic patch updates (security) ● Click to upgrade for major releases ● Yadda...
  11. 11. The Good Google doesn’t run Search/AdSense on GCP. AWS > GCP Also: https://www.youtube.com/watch?v=1oXAGBDZnXw If your laptop gets owned, your clusters have been owned too.
  12. 12. The Good ● AWS, AWS, AWS ● AWS IAM (Identity and Access Management) ● Temporary credentials for roles ● Multi-factor authentication If your laptop gets owned, has your cluster been owned too?
  13. 13. The Good - Kubernetes On AWS => 63% ● This is *pre* EKS ● Via KOPS and other tooling ● EKS leverages this work, and the cloud vendor support that is baked into Kubernetes (more on this)
  14. 14. The Good ● Everything is free as in speech... and beer* ● No magic, just AWS primitives ● Active community on github ● Fork and customize! * does not include control plane management system
  15. 15. The Bad
  16. 16. The Bad Service Limit/ELB Issues ● Hard cap on number of services is 300 due to firewall limits (in reality MUCH lower) ● Cloud provider specific logic is currently built directly into Kubernetes ● Won’t be separated for a while ● Workarounds are rather hacky
  17. 17. The Bad ● AWS has great support for private/isolated virtual networking (VPC) ● Well designed, super configurable! ● The Kubernetes API doesn’t use it ● It’s public! ● Well encrypted, but all communication with master still goes over “internet” (private to AWS but still)
  18. 18. The Ugly
  19. 19. The Ugly ● GA (Generally Available) ○ ...BNPR (But Not Production Ready) ● AMI shipped with no Docker log rotation ○ But… wasn’t this the image that much of that 63% were using? ○ What were those 63% doing? Anything serious?
  20. 20. The Ugly ● Single kube-dns pod by default ○ Single point of failure for all your communication (internal/external) ● Certain availability zones within regions don’t have much capacity. But it’s random! ○ Scaling can fail after you’ve set everything up ○ Trial and error unless you have pro support
  21. 21. The Ugly ● Resources are reserved for the system/kubelet ○ If you run out of disk space, kubectl might die silently. ○ Have fun debugging! ● Control plane logging doesn’t automatically ship anywhere. ○ Have fun debugging!
  22. 22. The Ugly ● AWS-CNI (networking architecture for EKS) didn’t support multiple subnets properly. ○ Wait… how many of that 63% were using it? Many/most of these issues have been resolved or will be soon. But much confidence has eroded :(
  23. 23. Questions And More Info https://techmovers.salemove.com/infrastructure/2018/11/01/Productionproofing+EKS.html#limited-pod-capacity-per-subnet--vpc https://kubedex.com/90-days-of-aws-eks-in-production/ https://blog.hasura.io/gke-vs-aks-vs-eks-411f080640dc/
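
Slide 19 notes that the AMI shipped with no Docker log rotation. As a rough illustration of what turning rotation on can look like, the sketch below builds the settings Docker's json-file log driver reads from /etc/docker/daemon.json. The function name and the default values (10m, 3 files) are assumptions made for this example, not something taken from the talk or from the EKS AMI.

```python
import json


def docker_log_rotation_config(max_size="10m", max_file=3):
    """Return a daemon.json-style dict that enables json-file log rotation.

    max_size and max_file are illustrative defaults (assumptions), not
    values from the talk.
    """
    if max_file < 1:
        raise ValueError("max_file must be at least 1")
    return {
        "log-driver": "json-file",
        "log-opts": {
            "max-size": max_size,
            # Docker expects log-opts values in daemon.json to be strings.
            "max-file": str(max_file),
        },
    }


if __name__ == "__main__":
    # Print the JSON that could be written to /etc/docker/daemon.json.
    print(json.dumps(docker_log_rotation_config(), indent=2))
```

Settings like these only apply to containers created after the Docker daemon is restarted, so on a running node existing containers would keep their old logging behaviour.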
