
Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - AWS re:Invent 2018

How do you ensure that a containerized system can handle the needs of your application? Designing and testing for performance is a critical aspect of operating containerized architectures at scale. In this session, we cover best practices for designing performant containerized applications on AWS using Kubernetes. We also show you how State Street deployed a high-performance database at scale using Amazon Elastic Container Service for Kubernetes (Amazon EKS).



  1. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Running a High Performance Kubernetes Cluster with Amazon EKS (CON318). Nathan Peck, Developer Advocate, Amazon Web Services. Yekesa Kosuru, Managing Director, State Street.
  2. Breakout repeats. Tuesday, November 27th: Running a high performance Kubernetes cluster with Amazon EKS, 6:15 PM, Venetian, Level 3, Murano 3205. Wednesday, November 28th: Running a high performance Kubernetes cluster with Amazon EKS, 1:45 PM, Venetian, Level 4, Delfino 4002.
  3. Agenda: best practices for designing for performance; how do I test performance?; State Street: database at scale demo.
  5. Basic components of Kubernetes (diagram): you, the Amazon EKS control plane (etcd), worker nodes, and your container.
  6. Optimizing your container. Optimize for smaller size: use a multistage Docker build to reduce the size of the runtime container. Use a minimalist operating system such as Alpine Linux, or no operating system at all (for example, a statically linked Go binary). Not all runtimes are equal: does your app have a cold start that requires an initial burst of resources?
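The multistage pattern on this slide can be sketched as follows (image names, the Go version, and the build path are illustrative, not from the talk):

```dockerfile
# Stage 1: build a statically linked Go binary in a full toolchain image.
FROM golang:1.11 AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 removes the libc dependency so the binary is fully static.
RUN CGO_ENABLED=0 go build -o /app .

# Stage 2: the runtime image contains only the binary, no operating system.
FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

Only the final stage ships, so the runtime image is megabytes instead of the 674MB of a full `node:latest`-style base.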
  7. Popular base images have a huge range by size:
     REPOSITORY        SIZE
     node:latest       674MB
     java:latest       643MB
     node:slim         184MB
     ubuntu:latest     85.8MB
     alpine:latest     4.41MB
     busybox:latest    1.15MB
  8. Optimize pods. How many sidecar containers do you have in each pod? Admission controllers make it easy to add a lot of sidecars, but don't underestimate the overhead cost. Keep pods as lightweight as you can.
  9. Optimize pod placement. Make sure you use resource constraints: request the baseline average resource needs of the app, and put a limit on the maximum resources made available to the pod to prevent one pod from interfering with the performance of another.
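The request/limit pattern on this slide looks like the following in a pod spec (the pod name, image, and exact values are hypothetical):

```yaml
# Sketch of resource constraints: request the app's average baseline,
# and cap it with a limit so a busy pod cannot starve its neighbors.
apiVersion: v1
kind: Pod
metadata:
  name: api
spec:
  containers:
    - name: api
      image: example/api:latest
      resources:
        requests:
          cpu: 500m       # baseline: half a CPU for scheduling decisions
          memory: 256Mi
        limits:
          cpu: "1"        # hard ceiling enforced at runtime
          memory: 512Mi
```

The scheduler places the pod using the requests; the kubelet enforces the limits.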
  10. Optimize density vs. size of pods: 4 pods at 0.5 CPU / 256 memory, or 2 pods at 1 CPU / 512 memory.
  11. Anti-affinity. Anti-affinity constraints can keep CPU-heavy pods away from each other, on different hosts. Warning: anti-affinity is a beta feature before Kubernetes 1.12, which improves anti-affinity performance 100x in large clusters. Tradeoff: a heavier scheduling burden on the control plane, in exchange for an application pod performance bonus.
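An anti-affinity rule of the kind described here can be sketched as follows (the `app=worker` label and names are illustrative): pods carrying the label are never co-scheduled onto the same node.

```yaml
# Required anti-affinity: no two app=worker pods on the same hostname.
apiVersion: v1
kind: Pod
metadata:
  name: worker-1
  labels:
    app: worker
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values: ["worker"]
          # Spread across nodes; use a zone topology key to spread across AZs.
          topologyKey: kubernetes.io/hostname
  containers:
    - name: worker
      image: example/worker:latest
```

Using `preferredDuringSchedulingIgnoredDuringExecution` instead makes the rule a soft hint, which lightens the scheduling burden mentioned on the slide.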
  12. Observability for pod performance.
  13. Optimizing your worker nodes. Use the latest generation of Amazon Elastic Compute Cloud (Amazon EC2) instances. The c5 instance generation has up to 25% better price/performance than c4 instances.
  14. Optimizing your worker nodes. Choose an instance class that matches your primary pod resource needs:
     • "c" instances are optimized for CPU-heavy work
     • "r" instances are optimized for memory-heavy work
     • "m" instances are general purpose
     • "p" instances are optimized for GPU-powered machine learning
  15. Kubernetes control plane on tap, optimized by Amazon EKS (diagram): kubectl talks to mycluster.eks.amazonaws.com, and EKS workers run in your AWS account across AZ 1, AZ 2, and AZ 3.
  16. Networking performance: AWS VPC CNI plugin (diagram: the kubelet issues CNI add/delete calls to the VPC CNI plugin, which creates ENIs on the EC2 instance and sets up veth pairs for pods on the VPC network). Thin layer, no overhead. Gives K8s pods native IP addresses in the VPC. Multiple ENIs per Amazon EC2 instance, multiple pods per ENI, all configurable.
  17. Pod to pod networking (diagram): traffic leaves a pod's network namespace over a veth pair into the instance's default namespace, is routed by the main and ENI route tables onto the VPC fabric, and reaches the destination pod the same way in reverse.
  18. Kubernetes performance envelope (radar chart dimensions): number of nodes, pod churn, pod density, networking, secrets, anti-affinity, active namespaces.
  19. Heavy monolithic pods in a very large cluster (envelope chart over the same dimensions).
  20. Numerous densely bin-packed microservice pods (envelope chart over the same dimensions).
  21. The EKS team is here to help! We are constantly learning from the varied use cases of the many large deployments orchestrated using an EKS control plane. Reach out to the team via a support ticket, and we will help you optimize your control plane for your exact performance needs.
  23. State Street Disclaimer
     • Views and opinions expressed herein are those of the presenter as of 11/27/18 and they are subject to change based on market and other conditions and in any event may not reflect the views of State Street Corporation and its subsidiaries and affiliates ("State Street").
     • The information herein is for informational purposes only and it does not constitute investment research or investment, legal, or tax advice, and it is not an offer or solicitation to buy or sell any product, service, or securities or any financial instrument, and it does not constitute any binding contractual arrangement or commitment of any kind.
     • This information is provided "as-is" and State Street makes no guarantee, representation, or warranty of any kind regarding such information.
     • This information is not intended to be relied upon by any person or entity. State Street disclaims all liability, whether arising in contract, tort or otherwise, for any losses, liabilities, damages, expenses or costs arising, either direct, indirect, consequential, special or punitive, from or in connection with the use of the information herein.
     • No permission is granted to reprint, sell, copy, distribute, or modify any material herein, in any form or by any means without the prior written consent of State Street.
  24. Agenda
     • What and why
     • How: leverage Kubernetes primitives to build a high-performance system
     • Design considerations and best practices
     • Scaling factors and bottlenecks
     • Live demo: a scale-out database on Amazon EKS, serving 1 million+ queries per second, with measured latencies
  25. What and why. A transactional database with unlimited scale and concurrency: high concurrency, low latencies, open source, cloud native architecture, custom database features.
  26. Replication architecture
     • MySQL master/slave replication with the RocksDB engine: an LSM data structure optimized for writes, suited to write-intensive workloads, with low memory demands
     • MariaDB or Percona with RocksDB (MyRocks): standard MySQL features
     • Semi-sync replication with GTID; failover; cloud native
     • SST files and WALs synchronized
     (Diagram: Percona Server with binlogs and the MyRocks engine writing SST files and the WAL to RocksDB on local-attached storage.)
  27. Resilient scale-out architecture
     • Scale-out with Vitess: a sharded cluster for scale; each shard is one master plus multiple slaves; custom sharding key
     • Read scale-out: add more slaves. Write scale-out: add more shards
     • ACID compliance across the cluster
     • Connection pool; restrain bad queries
     • Amazon S3 and Amazon EKS: backups stored to Amazon Simple Storage Service (Amazon S3); cluster hosted on Amazon EKS
  29. Storage
     • Persistent Volume (PV): persistent storage that survives pod restarts
     • HostPath PV: local storage on SSD/NVMe devices
     • PVs are attached via PV Claims
     • PV Claims: dynamic, an abstraction over the underlying storage, ReadWriteOnce
     • Tradeoff between resiliency and performance
     (Diagram: a pod on a remote persistent volume gives the best resiliency but low performance; a pod with only a local data volume gives the best performance but no resiliency; a pod claiming a hostPath PV gives the best performance with medium resiliency.)
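The hostPath PV pattern described on this slide can be sketched as a PV backed by a local NVMe mount plus a claim that binds to it (names, paths, sizes, and the storage class are all hypothetical):

```yaml
# A PV exposing a local NVMe mount on the worker node.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nvme-pv-0
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-nvme
  hostPath:
    path: /mnt/nvme0
---
# The claim the database pod references; it binds to a matching PV.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-nvme
  resources:
    requests:
      storage: 100Gi
```

Because the data lives on one node's disk, this is the "best performance, medium resiliency" corner of the tradeoff: replication, not the storage layer, provides durability.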
  30. Getting the most out of K8S
     • Taints & tolerations: place one master per worker, and taint the node to maximize performance
     • Affinity & anti-affinity: affinity schedules pods on SSD/NVMe nodes; anti-affinity ensures masters aren't scheduled with replicas
     • Services
     • Resource requests and limits
     • SYSCTL
     • StatefulSet: a replicated group of pods with unique properties; a StatefulSet restarts a pod on the same node; requires the use of PV Claims
     • Operator + etcd
     • DaemonSet: background processes running on every node, for monitoring and upgrades (e.g., metrics agent, local volume provisioner)
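The taint-and-toleration pattern above can be sketched like this (the node name, taint key/value, and pod are hypothetical): first taint the node so ordinary pods are repelled, then give only the database master a matching toleration.

```yaml
# First, taint the dedicated node (run once, from the command line):
#   kubectl taint nodes ip-10-0-1-23 dedicated=db-master:NoSchedule
#
# Then the master pod declares a toleration so it alone may land there.
apiVersion: v1
kind: Pod
metadata:
  name: db-master-0
spec:
  tolerations:
    - key: dedicated
      operator: Equal
      value: db-master
      effect: NoSchedule
  containers:
    - name: mysql
      image: percona:5.7
```

Combined with the anti-affinity rules on this slide, this keeps each master alone on its worker so it never competes with replicas for CPU or disk.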
  31. Pod networking.
  32. Best practices for high-performance clusters
     • Lean and mean container
     • Amazon EKS optimized AMI
     • Image pull policy
     • Master on SSD/NVMe; slaves on SSD/NVMe, or EBS for increased resiliency
     • Monitor key metrics; watch for an overcommitted state
     • Cluster autoscaler, HPA, VPA
     • Placement groups: 30% higher packets per second
     • Cross-AZ deployment: place writes closer to the master
     • Choose right-sized nodes: good network performance; more CPU is better than more RAM
     • Use the EKS CNI plugin, and upgrade your CNI plugin (aws-k8s-cni-g74ecf61.yaml or later)
     • The bottleneck is packets per second, not bandwidth; jumbo packets increased QPS by 90%
  33. Bottleneck @ 400K QPS (chart).
  34. Scaling factors
     Query Throughput       | Query Latency             | Network PPS | Configuration
     300K (overloaded)      | P95: 300 ms, P50: 8 ms    | N/A         | 2 shards, 4 workers
     400K                   | P95: 4 ms, P50: 500 nano  | 6.8M        | Single VPC, no placement group, one replica, 3 subnets
     600K                   | P95: 4 ms, P50: 500 nano  | 9.5M        | Single AZ, one replica, 1.5K MTU
     948K (placement group) | P95: 3 ms, P50: 500 nano  | 9.5M        | Single AZ, placement group, one replica
     1.36M (jumbo frames)   | P95: 3 ms, P50: 500 nano  | 9.5M        | Single AZ, placement group, jumbo frames, 4 replicas
  37. Thank you! Yekesa Kosuru, ykosuru@statestreet.com
