Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Paved PaaS to Microservices

1,093 views

Published on

Traditionally, a tug of war has existed between service reliability and engineering velocity. Increasing speed to fuel product innovation has meant making tradeoffs in reliability.

Netflix standardizes common functionality, like service discovery, configuration, metrics, logging, and RPC across services. This frees teams to focus on the unique business value of their service. It also enables us to evolve and maintain platform components independently from individual services.

Even with a standard set of components, service owners still need to combine these disparate elements into a coherent platform. We reduce this friction by providing a preassembled platform where teams only need to provide their business logic, and not worry about assembling the service from scratch.

We can further streamline the service lifecycle by providing automation and tooling for development, testing, deployment and operations. We provide "one click" solutions to automatically generate the associated pipelines, machinery, and infrastructure that's required to run their service reliably in production.

These patterns, while described in a Netflix context, can be broadly applicable to increase both reliability and velocity of your microservices architecture.

Published in: Technology
  • Be the first to comment

Paved PaaS to Microservices

  1. 1. The Paved PaaS to Microservices Yunong Xiao, Principal Software Engineer, Netflix yunong@netflix.com, @yunongx, http://yunong.io June 26, 2017
  2. 2. 100 million customers in over 190 countries streaming 125 million hrs/day
  3. 3. –Wikipedia “Platform as a service (PaaS)… allows customers to develop, run, and manage applications without the complexity of building and maintaining the infrastructure and platform…” What is a Platform as a Service (Paas), Anyway?
  4. 4. Our Use Case
  5. 5. Netflix Edge API Script A Script B Script C Script D Script N … Client Library A Client Library B Client Library C Client Library N … Backend Service A Backend Service B Backend Service C Backend Service N … 1000+100
  6. 6. TV iOS Android Windows Browsers Discovery Playback Non- member … Backend Service A Backend Service B Backend Service C Backend Service N … Clients Standalone Services Edge API Backend Services Owned by client teams Mostly
  7. 7. Know Your Customer
  8. 8. Goals Velocity Reliability
  9. 9. Our PaaS to Achieving Both 1. Standardized components 2. Preassembled platform 3. Automation and tooling
  10. 10. 1. Standardized Components
  11. 11. What’s Inside a Microservice? RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing
  12. 12. Why Standards?
  13. 13. Mélange of RPC
  14. 14. Microservice Interactions µServiceClient uService µService µService µService … µService … µService
  15. 15. Failure: When not If
  16. 16. – Managers everywhere “Is it fixed yet?” Mean Time to Detection (MTTD) Mean Time to Repair (MTTR)
  17. 17. N Flavors of RPC µServiceClient uService µService µService µService … µService … µService
  18. 18. One Standard RPC µServiceClient uService µService µService µService … µService … µService
  19. 19. Benefits of Standardizing Consistency Leverage Interoperability Quality Support RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing
  20. 20. But I'm a Snowflake... Freedom Responsibility& Off-road Innovation Reintegrate BurdenVacuumNew
  21. 21. 2. Preassembled Platform
  22. 22. Assembly Required RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing
  23. 23. Getting Out of the Blocks Docs Copy/paste Which versions? Initialization Configuration Missing components Days or weeks
  24. 24. Velocity Reliability Product Innovation Not a single line of business logic!
  25. 25. Assembly Required RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing
  26. 26. Preassembled Platform RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing
  27. 27. Out of the Box
  28. 28. Component Management
  29. 29. Insights
  30. 30. Application, system, and runtime metrics & logs
  31. 31. Consistent application, system, and runtime metrics & logs Reduces MTTD & MTTR
  32. 32. Configures and Initializes Correctly BOEING 747-400 NORMAL PROCEDURES CHECKLIST POWER UP / SAFETY CHECK First Officer Captain CIRCUITBREAKERS...………………...CHECKED BATTERY……………...………………….………ON STANDBY POWER………………...………..AUTO HYDRAULIC DEMAND PUMPS….………..…..OFF WINDSHIELD WIPERS....……………….……..OFF ALTERNATE FLAPS AND GEAR…………….OFF GEAR LEVER…….………….....…………..DOWN FLAPS…..……………………………....CHECKED APU….………………………………...….RUNNING ELECTRICAL SYSTEM....……SET/APU AVAIL ON APU BLEED AIR…...……………………………ON ISOLATION VALVES…………………………OPEN PACKS………………...……………………NORMAL PREFLIGHT First Officer Captain EMERGENCY EQUIPMENT…….…….CHECKED FIRE PROTECTION…………...………CHECKED INTERRUPT SWITCHES………………………ON PASSENGER OXYGEN……………..……NORMAL STAB TRIM CUTOUT SWITCHES………….AUTO TRANSPONDER…………………………………SET SOURCE SELECTORS…………………...…….SET CLOCKS…………………………………………..SET CRT SELECTORS…………………………NORMAL PFD………………………………………CHECKED ND…………………………………………CHECKED AUTOBRAKES…………………………………RTO EIU SELECTOR………………………………AUTO HDG REFERENCE SWITCH….…………NORMAL FMC MASTER SELECTOR…………………LEFT GROUND PROX SYSTEM……………CHECKED BEFORE STARTING First Officer Captain HYDRAULIC DEMAND PUMPS……………………. .............................................AUTO, AUX (1 AND 4) BRAKE PRESSURE……..……………….NORMAL FUEL QUANTITY………………..…………. ____KG FUEL SYSTEM………………………………….SET X-FEEDS…………………OPEN (1 & 4 CLOSED) SEAT BELTS SIGN………………………………ON NOTOC………………………………….CHECKED SHIPS PAPERS…………………………ON BOARD PERFORMANCE DATA......…CHECKED AND SET
  33. 33. Versions Updates Compatibility
  34. 34. Important Questions
  35. 35. What’s in and out?
  36. 36. Maintenance vs Convenience RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing Base Platform
  37. 37. Solution: Layers & Flavors Base platform Data access Rendering Backend …
  38. 38. How to Ensure Platform Correctness?
  39. 39. Test, Test, Test Unit Integration Functional Cloud Every PR
  40. 40. Dog Food with your own Service
  41. 41. Component Correctness Lock down versions Test components Updates require PRs Gate Keeper RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream ProcessingCustomers
  42. 42. How Locked Down is it?
  43. 43. Tradeoffs vs Flexibility Reliability Consistency Support
  44. 44. Stay on paved path!
  45. 45. Season to Taste Config overrides Startup & shutdown hooks Access to 3rd party libs Swap, disable, or configure components Raw component access
  46. 46. Platform Versioning?
  47. 47. API Semantic Versioning 1.2.3 ^1.0.0 ~1.3.0
  48. 48. Use Conventional Changelog
  49. 49. 3. Automation and Tooling
  50. 50. Ship a Feature
  51. 51. Development Testing Deployment Operations Steps
  52. 52. Development
  53. 53. Env bootstrap Integrate tooling & services Run local & cloud CLI for common dev experience
  54. 54. Local Development Live reload Attach debugger localhost PROD
  55. 55. Testing
  56. 56. Testing
  57. 57. Testing Preassembled Platform RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing
  58. 58. Preassembled Platform Pre-Prod RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing
  59. 59. Provide First Class Mocks RPC Discovery Registration Runtime OS Configuration Metrics Logging Tracing Dashboards Alerts Stream Processing Preassembled Platform
  60. 60. Mock Data Generation RPC +
  61. 61. Mock Ownership RPC Discovery Registration Configuration Metrics Logging Tracing Alerts Stream Processing
  62. 62. Platform Testing API • Just like a runtime API, need a testing API • Provide mocks interface for components • Gets platform out of the loop for providing mocks
  63. 63. Deployment
  64. 64. “Production is war!”
  65. 65. Experience Differences
  66. 66. Deploy and Manage Services • Pre-configured pipelines for deployment and rollback • Single command deploy to any stack • Integration for automated canary analysis • Pre-configured autoscaling
  67. 67. Operations
  68. 68. Consolidated View
  69. 69. RPS Latency Error rates CPU Memory Generated Dashboards & Alerts
  70. 70. CPU profiling Core dump analysis Automated Analytics & Tooling
  71. 71. Our PaaS to Velocity & Reliability 1. Standardized components 2. Preassembled platform 3. Automation and tooling
  72. 72. The Paved PaaS to Microservices Yunong Xiao, Principal Software Engineer, Netflix yunong@netflix.com, @yunongx, http://yunong.io June 26, 2017

×