Resilient serverless architectures on AWS
Three key factors in building resilient serverless architectures
Lee Gilmore
https://twitter.com/LeeJamesGilmore
Lee Gilmore - about me
https://www.linkedin.com/in/lee-james-gilmore
Principal Developer at AO.com, blogger & AWS Community Builder
Linktree
What are we covering today?
Q. How do we build resilient serverless architectures on AWS?
Event-driven - Event-driven first mindset with Amazon EventBridge
Scalable - Load testing serverless architectures with Artillery
Monitoring - Using synthetic canaries to find issues proactively
A. Scalable + Event-driven + Monitoring
20 minutes
Load testing serverless architectures
Key takeaway: Serverless is not a silver bullet for scaling. Understand how your solutions work at unexpected scale.
01.
Scan for deep dive
Why load test your serverless architectures?
Using Artillery for load testing serverless solutions
Artillery is an open source load, functional and smoke testing solution. It can
be installed as a dependency of your serverless solution using npm, configured
using a YML file with an accompanying CSV file for load test data, and run within
your pipelines for regular testing.
Configuration
Config
Allows you to pull in load test data
and configure plugins
Environments
This is where you can split out
between Dev/QA/Staging/Prod
Scenarios
This allows you to set up the actual
tests against endpoints
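As a rough illustration, the three sections could be laid out as below, using the values mentioned in the speaker notes for these slides (a p95 latency limit of 3 seconds, a 10 second phase capped at two virtual users, and a POST that expects a 201). Artillery scripts are normally written in YML; this sketch mirrors the same structure as a typed TypeScript object so it can be checked and serialised. Artillery has historically accepted JSON scripts as well, but check the version you use. The CSV path, field names, target URLs, endpoint and the `arrivalRate` value are assumptions made for the example.

```typescript
// Sketch of an Artillery test script as a typed object.
// Section and field names follow Artillery's v1-style script format; all values are illustrative.

interface ArtilleryPhase {
  duration: number;    // length of the phase in seconds
  arrivalRate: number; // new virtual users started per second
  maxVusers?: number;  // upper limit on concurrent virtual users
}

interface ArtilleryEnvironment {
  target: string;
  phases: ArtilleryPhase[];
}

interface ArtilleryScript {
  config: {
    plugins: { expect: object };
    ensure: { p95: number; maxErrorRate: number };
    payload: { path: string; fields: string[] };
    environments: { [name: string]: ArtilleryEnvironment };
  };
  scenarios: Array<{
    name: string;
    flow: Array<{
      post: {
        url: string;
        json: { [field: string]: string };
        expect: Array<{ [assertion: string]: string | number }>;
      };
    }>;
  }>;
}

// Assumption: the same short phase is reused for every environment.
const shortPhase: ArtilleryPhase = { duration: 10, arrivalRate: 1, maxVusers: 2 };

const artilleryScript: ArtilleryScript = {
  config: {
    // Config: plugins, fail limits and repeatable test data.
    plugins: { expect: {} },
    ensure: { p95: 3000, maxErrorRate: 0 }, // p95 latency in ms; fail on any errors
    payload: { path: "./data/employees.csv", fields: ["firstName", "surname"] }, // hypothetical file
    // Environments: one target per stage (hypothetical URLs).
    environments: {
      dev: { target: "https://dev.api.example.com", phases: [shortPhase] },
      staging: { target: "https://staging.api.example.com", phases: [shortPhase] },
    },
  },
  // Scenarios: the actual requests and the assertions made on the responses.
  scenarios: [
    {
      name: "Create an employee",
      flow: [
        {
          post: {
            url: "/employees", // hypothetical endpoint
            json: { firstName: "{{ firstName }}", surname: "{{ surname }}" },
            expect: [{ statusCode: 201 }, { contentType: "json" }],
          },
        },
      ],
    },
  ],
};

// Serialise to JSON so the script could be written to disk for the Artillery CLI.
const artilleryScriptJson = JSON.stringify(artilleryScript, null, 2);
```

Keeping the phases short and the virtual user cap low, as here, suits smoke and functional runs in a pipeline; a real load test would use longer phases against staging.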
What are the benefits of Artillery?
Easy to set up and to run through npm scripts
Assert that the correct responses, status codes and headers are returned
Can also be used for smoke, fuzz and functional testing (not just load testing)
Can be run very easily in pipelines as part of the CI process
Additional plugins allow for fuzzing and for writing the test results to DynamoDB/CloudWatch/SNS
Artillery produces test reports in HTML format which can be saved in pipelines as assets
Event-driven first mindset
Key takeaway: Don’t build tightly coupled, brittle architectures, or you will be regularly crying into your coffee at 2am
02.
Scan for deep dive
Sync vs Async - why does it matter?
Importance of being event-driven?
There are numerous benefits of event-driven domain services which are detailed below:
Domain services are individually testable
Domain services are individually deployable
Shared versioned schemas for events
They have their own data stores
Totally decoupled
They can scale independently
What is an event anyway?
“By using Event Messages you can easily decouple senders and receivers both in terms of identity (you broadcast events without caring
who responds to them) and time (events can be queued and forwarded when the receiver is ready to process them). Such architectures
offer a great deal for scalability and modifiability due to this loose coupling.” - Martin Fowler
An event is a change of state within a
domain (past)
A command is an intent aimed at
another domain which results in some
output (future)
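One way to make the event/command distinction concrete is in the message types themselves. The sketch below is an illustration only, not taken from the talk: the envelope fields, the `version` field and the `OrderPlaced` and `SendOrderConfirmationEmail` names are all hypothetical.

```typescript
// An event records something that has already happened in a domain.
// It is named in the past tense and is immutable (readonly fields).
interface DomainEvent<TDetail> {
  readonly kind: "event";
  readonly source: string;     // the producing domain
  readonly detailType: string; // past-tense name
  readonly version: string;    // schema version shared between teams
  readonly occurredAt: string; // ISO 8601 timestamp
  readonly detail: Readonly<TDetail>;
}

// A command expresses an intent aimed at one specific domain,
// which is expected to act on it and produce some output.
interface DomainCommand<TPayload> {
  readonly kind: "command";
  readonly targetDomain: string;
  readonly commandType: string; // imperative name
  readonly payload: Readonly<TPayload>;
}

type DomainMessage = DomainEvent<unknown> | DomainCommand<unknown>;

const orderPlacedExample: DomainEvent<{ orderId: string }> = {
  kind: "event",
  source: "orders",
  detailType: "OrderPlaced",
  version: "1",
  occurredAt: "2021-01-01T09:00:00Z",
  detail: { orderId: "order-123" },
};

const sendEmailExample: DomainCommand<{ orderId: string; to: string }> = {
  kind: "command",
  targetDomain: "notifications",
  commandType: "SendOrderConfirmationEmail",
  payload: { orderId: "order-123", to: "customer@example.com" },
};

// Events are broadcast to whoever is interested; commands go to one target.
function describeRouting(message: DomainMessage): string {
  return message.kind === "event"
    ? `broadcast ${message.detailType} from ${message.source}`
    : `send ${message.commandType} to ${message.targetDomain}`;
}
```

Note that the event carries no reference to its consumers, while the command names exactly one target domain.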
Serverless + Events
“Amazon EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale using events
generated from your applications, integrated Software-as-a-Service (SaaS) applications, and AWS services” - AWS
Amazon EventBridge should be your default for serverless event-driven architectures for the
following reasons:
There are no servers to maintain or manage
Schema discovery and sharing using the registry
Content based filtering
Input transformation
Archive and Replay
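To illustrate content-based filtering: an EventBridge rule carries an event pattern, and only events that match the pattern are sent to the rule's targets. In a pattern, each field lists the values it will accept, and nested objects match nested fields. The matcher below is a deliberately simplified sketch that handles exact-value matching only. It is not the EventBridge implementation, which runs inside the service and supports many more operators. The `orders` source, `OrderPlaced` detail type and `channel` field are hypothetical.

```typescript
// A simplified event pattern: each leaf is a list of accepted values.
type PatternLeaf = Array<string | number | boolean>;
interface EventPattern {
  [field: string]: PatternLeaf | EventPattern;
}
type EventLike = { [field: string]: unknown };

// Returns true when every field named in the pattern matches the candidate event.
// Fields that the pattern does not mention are ignored.
function matchesEventPattern(pattern: EventPattern, candidate: EventLike): boolean {
  return Object.keys(pattern).every((field) => {
    const expected = pattern[field];
    const actual = candidate[field];
    if (expected === undefined) {
      return true;
    }
    if (Array.isArray(expected)) {
      // A leaf matches if the event value equals any accepted value.
      return expected.some((allowed) => allowed === actual);
    }
    // A nested pattern matches against the nested object in the event.
    return (
      typeof actual === "object" &&
      actual !== null &&
      matchesEventPattern(expected, actual as EventLike)
    );
  });
}

// Only orders placed through the web or app channels are of interest to this consumer.
const onlineOrderPattern: EventPattern = {
  source: ["orders"],
  "detail-type": ["OrderPlaced"],
  detail: { channel: ["web", "app"] },
};

const webOrderEvent = {
  source: "orders",
  "detail-type": "OrderPlaced",
  detail: { orderId: "order-123", channel: "web" },
};

const storeOrderEvent = {
  source: "orders",
  "detail-type": "OrderPlaced",
  detail: { orderId: "order-456", channel: "store" },
};
```

Filtering in the rule means a consumer is never invoked for events it would only discard.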
Areas to consider
When building out your new architectures in an event-driven manner, it is worth planning for the
following to ensure your solutions are resilient:
Consider idempotency
Queues, batching and failures
Version events with the Schema Registry
Event-carried state transfer
Potentially use Amazon SNS for low latency/high frequency messages
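On idempotency: because an event may be delivered more than once (after a retry, or when replaying from a dead letter queue or an archive), a consumer should produce the same outcome if it sees the same event twice. A minimal sketch of that idea follows. The in-memory `ProcessedEventStore` is a stand-in for a durable store, which is an assumption for the example, and all names are hypothetical.

```typescript
interface ReceivedEvent {
  id: string; // unique per event; the same id arrives again on redelivery
  detailType: string;
}

// Stand-in for a durable record of processed event ids.
// A real service would need a shared store so that all function instances see it.
class ProcessedEventStore {
  private readonly seenIds = new Set<string>();

  // Returns true only the first time an id is claimed.
  claim(eventId: string): boolean {
    if (this.seenIds.has(eventId)) {
      return false;
    }
    this.seenIds.add(eventId);
    return true;
  }

  // Releases a claim so that a failed event can be retried later.
  release(eventId: string): void {
    this.seenIds.delete(eventId);
  }
}

// Wraps a handler so that duplicates of an already processed event are skipped.
function makeIdempotent(
  store: ProcessedEventStore,
  handle: (received: ReceivedEvent) => void
): (received: ReceivedEvent) => "processed" | "skipped" {
  return (received) => {
    if (!store.claim(received.id)) {
      return "skipped";
    }
    try {
      handle(received);
    } catch (error) {
      store.release(received.id); // allow the retry to run the handler again
      throw error;
    }
    return "processed";
  };
}
```

Releasing the claim on failure is a design choice: it favours retrying a failed event over the risk of silently dropping it.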
Using canaries to find issues proactively
Key takeaway: “Everything fails all the time” - Werner Vogels
03.
Scan for deep dive
Amazon CloudWatch Synthetic Canaries
You can use Amazon CloudWatch Synthetics to create ‘canaries’, which are configurable scripts that
run on a schedule, to monitor your endpoints and APIs.
Canaries follow the same routes and perform the same actions as a customer, which makes it
possible for you to continually verify your customer experience.
By using canaries, you can discover issues before your customers do.
CloudWatch Synthetics features
Amazon CloudWatch Synthetics is a powerful, fully serverless
way of continually checking that your APIs are working correctly
and that there are no broken links in your web pages. It also offers
visual diff checks, to make sure your web pages are displaying
correctly, and heartbeat checks.
How do canaries work?
Canaries are Lambda functions which run on a schedule
Can be written in Node.js or Python
Offer programmatic access to a headless Google Chrome browser via Puppeteer or Selenium WebDriver
Can check the latency of your endpoints and can store load time data and screenshots of the UI
Can check for unauthorized changes from phishing, code injection and cross-site scripting
Can alarm and send alerts based on failures
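The core of a heartbeat-style check can be sketched in plain TypeScript. This is not the CloudWatch Synthetics runtime API: `HttpProbe`, `evaluateHeartbeat`, `runHeartbeat` and the limits are hypothetical names, used only to show the shape of "call the endpoint, measure it, and fail loudly". In a real canary the probe would be the request issued by the canary script.

```typescript
interface ProbeResult {
  statusCode: number;
  durationMs: number;
}

// Injected so the check can be exercised without a real network call.
type HttpProbe = (url: string) => Promise<ProbeResult>;

interface HeartbeatLimits {
  expectedStatus: number;
  maxLatencyMs: number;
}

// Pure evaluation step: returns a list of human-readable failures (empty when healthy).
function evaluateHeartbeat(result: ProbeResult, limits: HeartbeatLimits): string[] {
  const failures: string[] = [];
  if (result.statusCode !== limits.expectedStatus) {
    failures.push(`expected status ${limits.expectedStatus}, got ${result.statusCode}`);
  }
  if (result.durationMs > limits.maxLatencyMs) {
    failures.push(`latency ${result.durationMs}ms exceeded ${limits.maxLatencyMs}ms`);
  }
  return failures;
}

// Runs one check and throws on failure, so the scheduled run is recorded as failed.
async function runHeartbeat(
  url: string,
  probe: HttpProbe,
  limits: HeartbeatLimits
): Promise<void> {
  const failures = evaluateHeartbeat(await probe(url), limits);
  if (failures.length > 0) {
    throw new Error(`Heartbeat failed for ${url}: ${failures.join("; ")}`);
  }
}
```

Separating the pure evaluation from the network call keeps the pass/fail rules easy to unit test.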
Blueprints
Deploy canaries using the AWS CDK
Amazon CloudWatch Synthetics dashboard
Summary
These are just three examples of the many ways we can make serverless architectures on AWS more
resilient for our customers
Event-driven - Event-driven first mindset with Amazon EventBridge
Scalable - Load testing serverless architectures with Artillery
Monitoring - Using synthetic canaries to find issues proactively
Thank You!
Lee Gilmore - https://linktr.ee/leegilmore

Serverless Summit 21 - Resilient serverless architecture on AWS


Editor's Notes

  1. Hey everyone, thank you so much for joining today! My name is Lee Gilmore, and I am a Principal Developer & Cloud Architect at AO.com, which is one of the UK’s largest online retailers. I am also an AWS Community Builder and an active blogger in the serverless industry. I love connecting with like-minded people, so feel free to connect with me on LinkedIn, Twitter or Medium. So let’s get started.
  2. Today I aim to cover as much ground as I can in the 20 minutes we have, helping to answer the question “how do we build resilient serverless architectures on AWS?” Hopefully this will give you a taster of three key areas that I personally consider massively important: serverless architectures being scalable, and load testing them using Artillery; an event-driven first mindset using Amazon EventBridge; and finally the importance of proactive monitoring using CloudWatch Synthetics canaries. You will see QR codes to scan with your smartphone cameras as we go along, which link off to detailed articles I have written which cover these areas in a lot more detail (including code examples in GitHub, typically written in TypeScript and the Serverless Framework).
  3. First of all we will cover ‘Load testing serverless architectures’, and no, I haven’t gone crazy, even though serverless architectures should obviously scale! The key takeaway of this section is ‘Serverless is not a silver bullet for scaling. Understand how your solutions work at unexpected scale’.
  4. Some of the key reasons (but not the only ones) that you may see issues under high load are: 1. Lambda functions scaling out horizontally really quickly, opening and closing database connections, which can quickly spike the CPU and memory on your database server and make it crash. 2. Less scalable legacy systems downstream that can’t cope with the sudden scale out of the Lambda functions. These may be on-premises or third-party services, for example. 3. Asynchronous, eventually consistent processes taking too long for your customers due to poor batching and configuration, for example customers waiting to receive time-bound activation emails. 4. Reserved concurrency causing throttling at high load: think of orders going through a system on Black Friday and customers being throttled. 5. Hitting the regional Lambda concurrency account limit at high scale! It is much better to be able to speak to AWS prior to an event rather than in the middle of it!
  5. This is where load testing with Artillery comes in. Artillery is an open source load, functional and smoke testing solution which you can configure using a YML file, pulling in repeatable load test data from an associated CSV file. It can be run within your pipelines regularly with minimal setup. There are also additional community-led plugins to use alongside Artillery, with the big advantage that you can create your own plugins to extend it however you need!
  6. Now let’s have a look at how you can configure Artillery very easily with a YML file. (I have had load tests running in as little as 15 minutes with Artillery.)
  7. The first section allows us to configure additional plugins to use alongside the load tests. An example here is the expect plugin, so we can make assertions on the responses from our API requests. It also allows us to set fail limits, for example ensuring that the 95th percentile request latency is 3 seconds or less, and that we fail hard if there are any errors (great for pipelines and functional tests). Finally, it allows us to pull in repeatable load test data from an associated CSV file (which you can then clear up using your delete endpoints or by invoking a Lambda function to do this).
  8. The next section of the YML file configures the various environments that you want to test against: typically staging for load tests, but for smoke and functional tests this may be all of your environments. It allows us to add our target APIs for each environment, as well as configuring the actual phases of the tests. As you can see from this example, it is going to run for 10 seconds, with one virtual user when the test phase starts, and it will never have more than 2 virtual users.
  9. Finally we have the scenarios section, which allows us to configure the actual flows we want to run through, and to use the expect plugin we configured at the top to assert on the responses coming back. We can see in this example that we are creating an employee with a POST request, pulling through the JSON from the CSV file for the POST body, and asserting that the response status code is 201 and the headers are correct.
  10. In summary for this section of the talk, load testing will give you the confidence that your serverless architectures will scale for your customers, making them more resilient to high load, and this is why I think load testing your serverless solutions is key!
  11. Now we are going to cover why it is so important to have an ‘event-driven first mindset’, with the key takeaway being ‘don’t build tightly coupled, brittle architectures, or you will be regularly crying into your coffee at 2am’ - nobody wants to be getting alerts throughout the night because one of your domain services is having issues!
  12. Getting started really quickly is one of the main benefits of serverless, and it is very easy to spin up domain services, typically using API Gateway, Lambda and DynamoDB. With a lot of serverless architectures I have seen in the past, it is easy to chain multiple domain services together in an organisation with synchronous requests, but this: 1. Increases the overall latency of calls for the end user as they wait for all requests to resolve. 2. Makes the architecture hugely coupled, like a spider web of links, which makes it massively brittle!
  13. What we find now is that if one domain service goes down (our example of a database on fire) - then all other domain services are affected as they are so tightly coupled and intrinsically linked, and you have very unhappy customers!
  14. The alternative is to use event-driven architectures, which are eventually consistent and asynchronous in nature, where serverless domain services are loosely coupled and interact with each other using events rather than synchronous requests. This way the domain services produce events without any concern for who is consuming them, with the added benefit that we can use dead letter queues to make sure we can replay events if one of the domain services has issues.
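To make the dead letter queue idea concrete, here is a minimal in-memory sketch. It is not the code from the talk and it calls no AWS API: `ConsumerWithDlq`, `deliver` and `replay` are hypothetical names, and a real setup would typically use an SQS queue as the DLQ rather than an array.

```typescript
// Hypothetical in-memory model of a consumer with a dead letter queue (DLQ).
// It does not call any AWS API.
type Handler<E> = (event: E) => void;

class ConsumerWithDlq<E> {
  // Events the handler failed to process, kept for later replay.
  readonly deadLetters: E[] = [];

  constructor(private readonly handler: Handler<E>) {}

  // Try to process one event; on failure park it in the DLQ instead of losing it.
  deliver(event: E): boolean {
    try {
      this.handler(event);
      return true;
    } catch (_error) {
      this.deadLetters.push(event);
      return false;
    }
  }

  // Re-deliver everything currently in the DLQ; returns how many succeeded.
  // Events that fail again are parked once more by deliver().
  replay(): number {
    const pending = this.deadLetters.splice(0);
    return pending.filter((event) => this.deliver(event)).length;
  }
}
```

The producer never needs to know that the consumer was down: the failed events simply wait in the DLQ until `replay()` is called after the service recovers.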
  15. You can see from the diagram that all of the domain services remain online other than the one at the bottom right, but its failed records are safely kept for reprocessing once the service is brought back online, so your customers are not aware of any issues.
  16. You can test a domain service in isolation without coordinating with several other teams or dealing with multiple dependencies (for example, by mocking APIs). In the same vein, you can deploy your domain services in isolation without being dependent on other teams, as long as the agreed event schemas have not changed. Historically teams would share contracts through NuGet or NPM packages containing actual code, whereas now teams can simply share versioned schemas, so work can be developed, tested and deployed in a loosely coupled manner. This reduces the overall dependencies between teams. Domain services should have their own data stores (typically databases) so they don’t have this dependency at the data layer. If domain services share a database they become tightly coupled, risking cross-contamination of bugs, deployment issues and security risks. Domain services should not be aware of each other. A producer can produce events without caring which consumers are using them, and consumers don’t care who produced the events. And finally, domain services can scale independently, without needing to coordinate with other teams and domain services.
  17. So what is an event anyway? An event is something which has already happened in the past and is immutable, which you are producing to allow any other domains that are interested to consume and act upon it. A command, on the other hand, is made with the intent for another domain to do something which results in some kind of output, and this is typically a one-to-one mapping between domains. An example is sending an email, where the producer expects the consumer to deal with errors and retries if there are any.
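One way to picture the difference is in types. The sketch below is illustrative only: the interfaces, the `OrderPlaced` and `SendEmail` names, and the `email-service` target are all assumptions made for this example, not shapes from the talk or from EventBridge.

```typescript
// All names below are hypothetical, chosen only to illustrate the distinction.

// An event records a fact that has already happened. It is named in the past
// tense, and its fields are readonly because the fact cannot be changed.
interface DomainEvent {
  readonly kind: "event";
  readonly name: string;       // e.g. "OrderPlaced"
  readonly occurredAt: string; // ISO 8601 timestamp
  readonly detail: Readonly<Record<string, unknown>>;
}

// A command expresses intent: it asks one specific domain to do something.
interface Command {
  readonly kind: "command";
  readonly name: string;   // e.g. "SendEmail"
  readonly target: string; // the single domain expected to act on it
  readonly payload: Readonly<Record<string, unknown>>;
}

function orderPlaced(orderId: string, occurredAt: string): DomainEvent {
  // Frozen at runtime as well, so the event stays immutable once produced.
  return Object.freeze({
    kind: "event" as const,
    name: "OrderPlaced",
    occurredAt,
    detail: Object.freeze({ orderId }),
  });
}

function sendEmail(to: string, subject: string): Command {
  return {
    kind: "command",
    name: "SendEmail",
    target: "email-service",
    payload: { to, subject },
  };
}

// The discriminated union lets the compiler tell the two apart.
function describeMessage(message: DomainEvent | Command): string {
  return message.kind === "event"
    ? `${message.name} happened at ${message.occurredAt}; any interested domain may react`
    : `${message.name} is a request for ${message.target} to act`;
}
```

Note that only the command carries a `target`: the event has no idea who, if anyone, is listening.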
  18. So now we have covered at a high level why we want to design our serverless architectures to be event-driven, and what events and commands are. Now let's cover Amazon EventBridge as an enterprise serverless event bus, and why it is so important. It is completely serverless and allows us to decouple our domain services with the smallest of overheads. Sharing event schemas has historically been difficult, but the schema registry allows us to easily find and share schema structures between domains and teams. Content-based filtering, even at the body level of the event (so the data), allows us to consume only the events that we are interested in. Input transformations allow us to reshape the event structure and property names to meet the requirements of our consumers. Archive and Replay functionality allows us to replay events to hydrate new domain service data stores, or to replay the events for failed records once a bug is fixed.
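As a rough illustration of content-based filtering, the sketch below uses the same shape as an EventBridge event pattern (each leaf is an array of allowed values, nested under keys such as `detail`) and a small local matcher. The matcher is a simplified stand-in written for this example: it only does exact-value matching, whereas real event patterns support more operators, and the `paidOrdersPattern` rule is hypothetical.

```typescript
// Simplified, local stand-in for EventBridge content-based filtering.
// It supports exact-value matching only; real event patterns offer more operators.
type Pattern = { [key: string]: Array<string | number | boolean> | Pattern };
type EventBody = { [key: string]: unknown };

function matchesPattern(pattern: Pattern, event: EventBody): boolean {
  return Object.keys(pattern).every((key) => {
    const expected = pattern[key];
    const actual = event[key];
    if (Array.isArray(expected)) {
      // A leaf lists the allowed values; the field must equal one of them.
      return expected.some((value) => value === actual);
    }
    // A nested pattern must match the corresponding nested object.
    return typeof actual === "object" && actual !== null && !Array.isArray(actual)
      ? matchesPattern(expected, actual as EventBody)
      : false;
  });
}

// Hypothetical rule: only consume paid orders from the orders domain.
const paidOrdersPattern: Pattern = {
  source: ["orders"],
  detail: { status: ["PAID"] },
};
```

With this rule, an event with `source: "orders"` and `detail.status: "PAID"` matches regardless of its other fields, while a pending order or an event from another source is filtered out before it ever reaches the consumer.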
  19. As you look to use EventBridge it is worth considering the following: Build your services to be idempotent, so that if you get the same event more than once you always end up with the same result. EventBridge guarantees at-least-once delivery, which means consumers can get the same event multiple times. You don’t want to be taking additional payments from your customers, for example! When we have issues we need to ensure that we utilise dead letter queues to store failed events, and remember that a failed record in SQS will force the full batch to be replayed. This is why idempotency is so important! Also remember that you need DLQs on your event rules in case EventBridge can’t route events to the targets. Use the Schema Registry auto discovery mode in development only, as this can be costly if left on in production! And use the Schema Registry to manually upload your own custom schemas to share with other teams. The maximum event size for EventBridge is 256 KB, which is typically fine for most applications, but for events bigger than this AWS recommends putting the event, or part of it, into S3 and including a link to it in the event. For architectures which need low latency and a high frequency of messages it may be worth looking at SNS over EventBridge, but this is only in exceptional circumstances. So in summary, this event-driven first approach will help ensure that your serverless architectures are more resilient to issues for your customers!
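The idempotency point can be sketched as a consumer that remembers which event IDs it has already processed. This is a minimal in-memory illustration under assumptions: `PaymentEvent` and `PaymentConsumer` are hypothetical names, and a real service would keep the processed IDs in a durable store (for example a DynamoDB table written with a conditional write) rather than in a `Set`.

```typescript
// Hypothetical event shape; the id is what makes deduplication possible.
interface PaymentEvent {
  id: string;
  customerId: string;
  amount: number;
}

class PaymentConsumer {
  // In-memory stand-ins for durable storage, used here only for illustration.
  private readonly processedIds = new Set<string>();
  private readonly chargedTotals = new Map<string, number>();

  // Handling the same event twice has the same effect as handling it once.
  handle(event: PaymentEvent): "processed" | "duplicate" {
    if (this.processedIds.has(event.id)) {
      return "duplicate"; // already charged for this event, so do nothing
    }
    const newTotal = this.totalFor(event.customerId) + event.amount;
    this.chargedTotals.set(event.customerId, newTotal);
    this.processedIds.add(event.id);
    return "processed";
  }

  totalFor(customerId: string): number {
    return this.chargedTotals.get(customerId) ?? 0;
  }
}
```

If at-least-once delivery or a replayed SQS batch hands this consumer the same event again, the customer is still only charged once.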
  20. Lastly we will be covering “using canaries to find issues proactively before our customers do” - with the key takeaway of this section being “Everything fails all the time”, which is a famous quote by the fantastic Werner Vogels - and not a truer word has been said!
  21. Typically you find out that your customers are experiencing issues when either a.) your support line is ringing off the hook or b.) your monitoring, which could be a paid-for service through a 3rd party provider, alerts you, but this typically only happens for the specific errors you alert on. CloudWatch Synthetics is an AWS offering that allows you to create ‘canaries’, which are configurable scripts that run regularly to monitor your solutions.
  22. Amazon CloudWatch Synthetics is a powerful, yet in my experience largely unknown, way of monitoring your applications proactively! Canaries perform the same actions and follow the same routes as your customers, so you can continually verify the customer experience and proactively find issues before your customers do (even when there are no customers on the system). Canaries can: 1. Check that your APIs are working correctly. 2. Crawl your web pages to check that there are no broken links. 3. Check the latency of your endpoints, storing the information as HAR files (HTTP Archive format). 4. Run visual diff checks so you know if a change has broken some webpages. 5. Run heartbeat checks to ensure that your services are up and running correctly.
  23. Canaries are essentially Lambda functions that are invoked via CloudWatch events, and can be written in either Node.js or Python. They offer programmatic access to a headless Google Chrome browser via Puppeteer or Selenium WebDriver, so you can easily navigate your webpages through code as customers would (and then verify the experience). The canaries will check the latency of your endpoints and store this with other information, alongside any screenshots of your webpages, for 31 days by default (this is configurable). You can also set up alerting so you know about issues with your serverless solutions as they happen (even when there are no customers on your service).
  24. To get started very quickly with CloudWatch Synthetic Canaries you can use the blueprints which AWS has already created, available through the console (as shown in the screenshot). The console also lets you use the AWS Canary Recorder plugin in your Chrome browser to automatically generate your scripts based on the actions you perform on your web pages, or the workflow builder to generate sequences that your customers would typically follow (for example navigating a page, clicking on buttons, typing into text boxes etc).
  25. The other way to set up Synthetic Canaries, if you want to do something more bespoke, is to use an IaC tool such as the CDK with your own Lambda code, which is very simple to set up and deploy, as you can see from the code on the screen. This example creates a canary that runs every minute, stores the screenshots in an S3 bucket called ‘assetsBucket’, pulls in the Node.js Lambda code from a local directory, and uses Puppeteer version 3.2. That is the actual infrastructure...
  26. So here is some TypeScript pseudocode for the Lambda itself. As you can see, it utilises the AWS Synthetics package to do the interesting work of taking the screenshots, setting the variance threshold used when comparing each screenshot to the baseline image, and logging so you can view the results in the dashboard and alert on them. Super simple to code and set up.
  27. And once this is fully deployed you will be able to view previous runs in the dashboard, view detailed logs on things like latency, view previous screenshots which are stored (and more…). In summary, Synthetic Canaries, and proactive monitoring as an approach, make your services more resilient by alerting on issues which could affect customers, potentially before they are even aware of them.
  28. So in closing, there are obviously a lot more factors involved in building resilient serverless architectures on AWS, but I am hoping these three key areas and supporting technologies have piqued your interest enough to learn more in your own time outside of this short talk.
  29. And as I said earlier, I have detailed articles and GitHub repos for each of the three areas, so feel free to pull down the code and have a play about!
  30. Thank you so much for taking the time to listen to me today, it has been a real pleasure, and thank you to Marc, Fabian, and the team for inviting me!