Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Building Resilient Serverless Systems


Published on

Video and slides synchronized, mp3 and slide download available at URL

John Chapin explains how to use serverless technologies and an infrastructure-as-code approach to architect, build, and operate large-scale systems that are resilient to vendor failures, even while taking advantage of fully managed vendor services and platforms. He leads an end-to-end demo of the resilience of a well-architected serverless system in the face of massive simulated failure. Filmed at

John Chapin is a co-founder of Symphonia, an expert Serverless and Cloud Technology Consultancy based in NYC. He has over 15 years of experience as a technical executive and senior engineer. He was previously VP Engineering, Core Services & Data Science at Intent Media, where he helped teams transform how they delivered business value through serverless technology and Agile practices.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Building Resilient Serverless Systems

  1. 1. Building Resilient Serverless Systems @johnchapin |
  2. 2. News & Community Site Watch the video with slide synchronization on! serverless-resiliency-infrastructure-as- code/ • Over 1,000,000 software developers, architects and CTOs read the site world- wide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week
  3. 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon London
  4. 4. John Chapin • Currently Partner, Symphonia • Former VP Engineering, Technical Lead • Data Engineering and Data Science teams • 20+ yrs experience in govt, healthcare, travel, and ad-tech • Intent Media, RoomKey, Meddius, SAIC, Booz Allen
  5. 5. Agenda • What is Serverless? • Resiliency • Demo • Discussion and Questions
  6. 6. What is Serverless?
  7. 7. Serverless = FaaS + BaaS! • FaaS = Functions as a Service • AWS Lambda, Auth0 Webtask, Azure Functions, Google Cloud Functions, etc... • BaaS = Backend as a Service • Auth0, Amazon DynamoDB, Google Firebase, Parse, Amazon S3, etc...
  8. 8. Serverless attributes • No managing of hosts or processes • Self auto-scaling and provisioning • Costs based on precise usage (down to zero!) • Implicit high availability
  9. 9. Serverless benefits • Cloud benefits ++ • Reduced TCO • Scaling flexibility • Shorter lead time
  10. 10. Loss of control • Limited configuration options • Fewer opportunities for optimization • Hands-off issue resolution
  11. 11. Resiliency
  12. 12. –Werner Vogels 
 ( “Failures are a given and everything will eventually fail over time ...”
  13. 13. Werner on Embracing Failure • Systems will fail • At scale, systems will fail a lot • Embrace failure as a natural occurrence • Limit the blast radius of failures • Keep operating • Recover quickly (automate!)
  14. 14. K.C. Green, Gunshow #648
  15. 15. Failures in Serverless land • Serverless is all about using vendor-managed services. • Two classes of failures: • Application failures (your problem, your resolution) • All other failures (your problem, but not your resolution) • What happens when those vendor-managed services fail? • Or when the services used by those services fail?
  16. 16. Mitigation through architecture • No control over resolving acute vendor failures. • Plan for failure, architect and build applications to be resilient. • Take advantage of: • Vendor-designed isolation mechanisms (like AWS regions). • Vendor services designed to work across regions (like Route 53). • Take advantage of vendor-recommended architectural practices, like the AWS Well- Architected Framework's Reliability Pillar:
  17. 17. AWS isolation mechanisms us-east-1a us-east-1b eu-west-2a eu-west-2b sa-east-1a sa-east-1b eu-west-2c us-east-1d us-east-1c us-east-1e us-east-1f sa-east-1c
  18. 18. Serverless resiliency on AWS • Regional high-availability = services running across multiple availability zones in one region. • With EC2 (and other traditional instance-based services), it's our problem. • With Serverless (Lambda, DynamoDB, S3, etc), AWS handle it for us. • Global high-availability = services running across multiple regions. • We must architect our systems for global high-availability. • The Serverless cost model is a huge advantage!
  19. 19. Serverless resiliency on AWS (cont) • Event-driven Serverless systems with externalized state mean: • Little or no data in-flight when a failure occurs • Data persisted to reliable stores (like DynamoDB or S3) • Serverless continuous deployment means: • No persistent infrastructure to re-hydrate • Highly likely to be a portable, infrastructure-as-code approach • Again, Serverless is a huge advantage!
  20. 20. Demo
  21. 21. Overview • Global, highly-available API • • Serverless Application Model (SAM) template • Lambda code (Typescript) • Build system (NPM + shell) • Elm front-end
  22. 22.
 (eu-west-2) messages
 (us-west-2) wss:// https:// /health messages wss:// https:// /health eu-west-2 us-west-2 conns conns
  23. 23. Request flow • DNS lookup for • Route 53 responds with IP address for • lowest latency regional API Gateway endpoint • that has a passing health check (HTTP 2xx or 3xx from /health endpoint) • Request traverses regional API Gateway to regional Lambda • Regional Lambda writes to regional DynamoDB table • DynamoDB replicates data to all replica tables in other regions, last write wins
  24. 24. Simulating failure • Alter eu-west-2 health check to return HTTP error status • Observe request routed to us-east-1 or us-west-2 instead • Observe DynamoDB writes propagated from us-west-2 back to eu-west-2
  25. 25. Rough edges • DynamoDB Global Tables not available in CloudFormation • API Gateway WebSockets + Custom Domains not available in CloudFormation • Can't add new replicas to DynamoDB global tables after inserting data • SAM not compatible with CloudFormation Stack Sets
  26. 26. Additional approaches • Multi-region deployment via Code Pipeline • CloudFront Origin Failover high_availability_origin_failover.html • Global Accelerator (for ELB, ALB, and EIP)
  27. 27. AWS Resources • James Hamilton's "Amazon Global Network Overview" • Rick Houlihan's DAT401: Advanced Design Patterns for DynamoDB • gateway-and-aws-lambda/
 (Magnus Bjorkman, November 2017) • multiregion-architectures/
 (Adrian Hornsby, December 2018) • (Diego Magalhaes, December 2018)
  28. 28. Symphonia resources • What is Serverless? Our 2017 report, published by O'Reilly. • Programming AWS Lambda - Our upcoming full-length book with O'Reilly. • Serverless Architectures - Mike's de facto industry primer on Serverless. • Learning Lambda - A 9-part blog series to help new Lambda devs get started. • Serverless Insights - Our email newsletter covering Serverless news, event, etc. • The Symphonium - Our blog, featuring technical content and analysis.
  29. 29. Stay in touch! @johnchapin @symphoniacloud
  30. 30. Watch the video with slide synchronization on! serverless-resiliency-infrastructure-as- code/