Envoy @ Lyft: developer productivity (kubecon 2.0)

How can infrastructure engineers empower their product developers with easy-to-use systems and processes that abstract the complexity of core infrastructure? This talk focuses on Envoy configuration management, and how the networking team at Lyft builds on top of Envoy to allow Lyft engineers to focus on business logic. I gave this talk twice and made some edits the second time; this is the most recent version.

Published in: Software

  1. 1. Envoy @ Lyft José Niño jnino@lyft.com - @junr03
  2. 2. Who am I? Envoy Maintainer Networking Team @ Lyft @junr03
  3. 3. My time at Lyft 1. Initial Envoy open sourcing: documentation, and docker sandbox examples 2. Create Envoyoutbound: enable developers to easily communicate with partners over stable IPs 3. Open sourcing ratelimit, and a couple other golang libraries: provide ample documentation for consumers 4. Expand Envoy’s outlier detection system, and build tooling (stats, logging) to help developers understand anomalies in their services 5. xDS APIs and the future of Envoy configuration management at Lyft: how do we make the control plane accessible and easy to use @junr03
  4. 4. There is a pattern... 1. Open sourcing Envoy: documentation, and docker sandbox examples 2. Create Envoyoutbound: enable developers to easily communicate with partners over stable IPs 3. Open sourcing ratelimit, and a couple other golang libraries: provide ample documentation for consumers 4. Expand Envoy’s outlier detection system, and build tooling (stats, logging) to help developers understand anomalies in their services 5. xDS APIs and the future of Envoy configuration management at Lyft: how do we make the control plane accessible and easy to use The focus is on developer productivity! @junr03
  5. 5. The Story Envoy is a powerful and complex tool. How does the Networking Team at Lyft hide the complexity to allow service developers to leverage the power of Envoy? @junr03
  6. 6. Why is this important? • Lyft engineers are the Infra org’s customers • Lyft is about to have a lot more engineers • The number of services at Lyft is ever increasing @junr03
  7. 7. Frame of Reference - The Control Plane • Proxy configuration is complicated: Envoy is no exception • Operating the data plane should be reserved for a select few • Configuring some options of the data plane via the control plane should be open to all service owners @junr03
  8. 8. Envoy Rollout @ Lyft @junr03
  9. 9. Envoy Design Goals 1. Out of process architecture 2. Low latency, high performance, dev productivity 3. Filter Architecture: L3/L4 & L7 4. HTTP/2 first 5. Service/Config discovery 6. Active/passive health checking 7. Advanced load balancing 8. Envoy everywhere 9. Best in class observability @junr03
  10. 10. Envoy Rollout - Edge Proxy 1. Microservice architectures need an edge proxy 2. Easy to show value with: ‒ Stats ‒ Enhanced Load Balancing ‒ Routing ‒ Protocols [Diagram labels: AWS TCP ELB; routes /foo, /bar, /baz; Service Foo, Service Bar, Service Baz] @junr03
  11. 11. Envoy Rollout - TCP Proxy / MongoDB 1. Parse Mongo at L7 and get useful stats 2. Ratelimit Mongo to avoid death spirals 3. Better connection handling than connecting directly to Mongo 4. We can do this with all services! [Diagram labels: Global Ratelimit Service; Mongo Router; MongoDB] @junr03
  12. 12. Envoy Rollout - Service Sidecar 1. Proxying to Mongo meant all services already had Envoy running 2. Still used internal ELB for service-to-service traffic 3. Use for: ‒ Ingress buffering ‒ Circuit Breaking ‒ Observability [Diagram labels: AWS TCP ELB (edge); AWS TCP ELB (internal)] @junr03
  13. 13. Envoy Rollout - Service Mesh 1. Direct Connect 2. Service Discovery [Diagram labels: Discovery; Cron job (on each host)] @junr03
  14. 14. Envoy @ Lyft Mesh Front Envoy Envoyoutbound Tracing Collectors Ratelimit Discovery Ancillary Services > 200 services > 20,000 Hosts > 5 million RPS @junr03
  15. 15. Control Plane @junr03
  16. 16. Configuration Management - The Past Initially static files ‒ Only two types: edge proxy, service sidecar ‒ Deployed as a bundle to the edge proxy and to all services in the mesh [Diagram: Human → Static Files → “Deploy Magic” → Proxies] @junr03
  17. 17. Configuration Management - The Past As complexity grew we moved to templated files ‒ Jinja2 templates, and some Python glue ‒ Expose certain “knobs” to the service engineers at Lyft ‒ At deploy time, create the configuration file [Diagram: Human → Exposed Knobs + Jinja2 Templates → “Deploy Magic” → Proxies] @junr03
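The templating step on slide 17 can be sketched roughly as follows. This is a minimal illustration, not Lyft's implementation: it swaps in Python's standard-library `string.Template` for Jinja2 so it runs without dependencies, and the template fields, the knob names (`timeout_ms`, `max_requests`), and the `render_cluster` helper are all hypothetical.

```python
import json
from string import Template

# Hypothetical template for one upstream cluster. The slides used Jinja2;
# string.Template stands in here so the sketch needs only the stdlib.
CLUSTER_TEMPLATE = Template(
    '{"name": "$service", "connect_timeout_ms": $timeout_ms, '
    '"max_requests": $max_requests}'
)

# Defaults owned by the infrastructure team (assumed values).
DEFAULT_KNOBS = {"timeout_ms": 250, "max_requests": 1024}


def render_cluster(service: str, knobs: dict) -> dict:
    """Render one cluster config at deploy time from the exposed knobs."""
    unknown = set(knobs) - set(DEFAULT_KNOBS)
    if unknown:
        # Only the exposed knobs may be set; everything else stays hidden.
        raise ValueError(f"unsupported knobs: {sorted(unknown)}")
    values = {**DEFAULT_KNOBS, **knobs, "service": service}
    return json.loads(CLUSTER_TEMPLATE.substitute(values))


if __name__ == "__main__":
    print(render_cluster("roads", {"max_requests": 100}))
```

Rejecting unknown knobs is one way to keep the rest of the proxy configuration out of service engineers' hands while still letting them tune the values they own.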
  18. 18. Use case: create a new public route • Service developers manipulate edge proxy route table • Deploying public routing changes was tied to an Envoy binary deployment • Erroneous configuration could be deployed next to complex code Front Envoy /new/route New Service @junr03
  19. 19. Pain points • No clear ownership • Configuration deployment was tied to binary deployment • UX is tedious and fragmented The Complexity is in Plain Sight @junr03
  20. 20. Configuration Management - The Present Mid 2017: xDS APIs for configuration management. • gRPC/protobuf based • Bi-directional gRPC streaming • Interacting with the control plane is separated from data plane operation • Enable us to develop smart, robust control plane solutions RDS - Route Discovery Service CDS - Cluster DS LDS - Listener DS ... @junr03
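To make the separation between control plane and data plane concrete, here is a simplified in-memory model of how a proxy might accept or reject a versioned configuration update. It is a sketch under assumptions: there is no gRPC or protobuf, the field names mirror those in the xDS discovery messages (`version_info`, `nonce`, `response_nonce`, `error_detail`) but `error_detail` is reduced to a string, and the `Proxy` class and its `validate` callback are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DiscoveryResponse:
    version_info: str      # version of the config snapshot being pushed
    resources: List[dict]  # e.g. route entries for RDS
    nonce: str             # identifies this particular response


@dataclass
class DiscoveryRequest:
    version_info: str                   # last version the proxy accepted
    response_nonce: str = ""            # nonce of the response being answered
    error_detail: Optional[str] = None  # set only when rejecting (NACK)


@dataclass
class Proxy:
    validate: Callable[[dict], bool]  # hypothetical per-resource check
    version: str = ""
    routes: List[dict] = field(default_factory=list)

    def handle(self, resp: DiscoveryResponse) -> DiscoveryRequest:
        if all(self.validate(r) for r in resp.resources):
            # ACK: apply the update and report the new version.
            self.version = resp.version_info
            self.routes = list(resp.resources)
            return DiscoveryRequest(self.version, resp.nonce)
        # NACK: keep the last good config and report the old version.
        return DiscoveryRequest(self.version, resp.nonce, "invalid resource")


if __name__ == "__main__":
    proxy = Proxy(validate=lambda r: "cluster" in r)
    good = DiscoveryResponse("v1", [{"path": "/rider/", "cluster": "pagelauncher"}], "n1")
    bad = DiscoveryResponse("v2", [{"path": "/broken"}], "n2")
    print(proxy.handle(good))
    print(proxy.handle(bad))
```

The point of the model is that a bad update does not take the proxy down: the rejected version is never applied, and the control plane learns about the rejection from the reply.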
  21. 21. Configuration Management - The Present [Diagram labels: service deployment; envoy-static-config; service “manifest”; Document Cloud Storage; Envoymanager] @junr03
  22. 22. Configuration Management - The Present envoy-static-config service “manifest” @junr03
          match:
            path: /rider/
          route:
            cluster: pagelauncher
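A route entry like the one on slide 22 pairs a match condition with a target cluster. The sketch below shows how such an entry could be evaluated, assuming `path` means an exact match (as it does in Envoy's route configuration). The `ROUTES` list and the `pick_cluster` helper are hypothetical names used only for illustration.

```python
from typing import List, Optional

# The route entry from the slide, written as plain Python data.
ROUTES = [
    {"match": {"path": "/rider/"}, "route": {"cluster": "pagelauncher"}},
]


def pick_cluster(request_path: str, routes: List[dict] = ROUTES) -> Optional[str]:
    """Return the cluster of the first route whose exact path matches."""
    for entry in routes:
        if entry["match"].get("path") == request_path:
            return entry["route"]["cluster"]
    return None  # no route matched


if __name__ == "__main__":
    print(pick_cluster("/rider/"))
    print(pick_cluster("/rider/123"))
```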
  23. 23. Configuration Management - The Present envoy-static-config service “manifest” @junr03
          internal_hosts:
            - jobscheduler
            - roads
          external_hosts:
            - dynamodb_iad
            - kinesis_iad
          circuit_breaker:
            default:
              max_requests: 100
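One way a control plane could turn the short declaration on slide 23 into per-upstream configuration is sketched below. It assumes that the `default` circuit breaker applies to every listed host, which the slide does not state, and the output shape and the `expand_manifest` helper are hypothetical.

```python
# The declaration from the slide as plain Python data (a real system would
# presumably parse it from a file).
MANIFEST = {
    "internal_hosts": ["jobscheduler", "roads"],
    "external_hosts": ["dynamodb_iad", "kinesis_iad"],
    "circuit_breaker": {"default": {"max_requests": 100}},
}


def expand_manifest(manifest: dict) -> list:
    """Expand the declaration into one cluster definition per upstream."""
    # Assumption: the "default" circuit breaker applies to all upstreams.
    default_cb = manifest.get("circuit_breaker", {}).get("default", {})
    clusters = []
    for key, kind in (("internal_hosts", "internal"), ("external_hosts", "external")):
        for host in manifest.get(key, []):
            clusters.append(
                {"name": host, "kind": kind, "circuit_breaker": dict(default_cb)}
            )
    return clusters


if __name__ == "__main__":
    for cluster in expand_manifest(MANIFEST):
        print(cluster)
```

The service owner writes a handful of lines; the generated configuration, which is what the proxy actually consumes, stays out of sight.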
  24. 24. Configuration Management - The Present [Diagram labels: service deployment; Document Cloud Storage] @junr03
  25. 25. Configuration Management - The Present [Diagram labels: Envoymanager, comprising Caching, Data processing, and xDS Server] @junr03
  26. 26. Configuration Management - The Present [Diagram labels: service deployment; envoy-static-config; service “manifest”; Document Cloud Storage; Envoymanager] @junr03
  27. 27. Envoy @ Lyft Mesh Front Envoy Envoyoutbound Tracing Collectors Ratelimit Discovery Envoymanager Ancillary Services @junr03
  28. 28. How is the complexity hidden? @junr03
  29. 29. Use case: create a new public route [Diagram labels: envoy-static-config; Document Cloud Storage; Envoymanager] @junr03
  30. 30. Documentation • Documentation built on top of the public Envoy documentation • Clear step-by-step guides • FAQs • Video Tutorials @junr03
  31. 31. Making a Change
  32. 32. Deployment • Same mechanics as service deployments • Easy to use deployment pipeline • Canary is part of the deployment process @junr03
  33. 33. Deployment • Same mechanics as service deployments • Easy to use deployment pipeline • Canary is part of the deployment process @junr03
  34. 34. Versioning service deployment envoy-static-config service “manifest” Document Cloud Storage • Leverage git as a versioning system • Easy rollback and roll forward • Git shas have semantic meaning • Versions are used throughout the system • Used in monitoring tooling @junr03
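The rollback and roll-forward behavior on slide 34 can be modeled as moving a pointer through an ordered history of deployed versions. This is a hypothetical sketch: the `ConfigHistory` class, its method names, and the example SHAs are invented for illustration and say nothing about how Lyft's tooling is built.

```python
class ConfigHistory:
    """Ordered history of deployed configs, keyed by a git SHA."""

    def __init__(self):
        self._history = []  # list of (sha, config) in deployment order
        self._pos = -1      # index of the active version

    def deploy(self, sha: str, config: dict) -> None:
        # A new deploy is appended and becomes the active version.
        self._history.append((sha, config))
        self._pos = len(self._history) - 1

    @property
    def active_sha(self) -> str:
        return self._history[self._pos][0]

    def current(self) -> dict:
        return self._history[self._pos][1]

    def rollback(self) -> str:
        if self._pos <= 0:
            raise RuntimeError("no earlier version to roll back to")
        self._pos -= 1
        return self.active_sha

    def roll_forward(self) -> str:
        if self._pos >= len(self._history) - 1:
            raise RuntimeError("already at the newest version")
        self._pos += 1
        return self.active_sha


if __name__ == "__main__":
    history = ConfigHistory()
    history.deploy("a1b2c3", {"max_requests": 100})  # made-up SHA
    history.deploy("d4e5f6", {"max_requests": 500})  # made-up SHA
    print(history.rollback(), history.current())
```

Because the active version is just a SHA, the same identifier can be attached to stats and logs, which is one reading of "versions are used throughout the system".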
  35. 35. Stats - Envoymanager @junr03
  36. 36. Stats - Front Envoy @junr03
  37. 37. Stats - Per Service Metrics
  38. 38. Wins • Allows service developers to own configuration changes all the way to production • Most configuration changes do not entail an Envoy restart • Most configuration changes do not entail an Envoy binary deploy • Opens the door to a friendlier UX for configuration changes @junr03
  39. 39. The Future @junr03
  40. 40. The networking team focuses on building accessible and easy-to-use systems for service developers to successfully configure, operate, and debug Envoy @junr03
  41. 41. Thanks jnino@lyft.com - @junr03
