© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved
How to Determine if You Are Well-Architected for Reliability
Rodney Lester,
Reliability Pillar Lead, Well-Architected
Setting Up Your Test Environment
Downloading the Lab Guide
• Navigate to or download the lab guide: https://bit.ly/2rgjFAh
• Execute the first section: Deploying the Infrastructure.
• This will take about 5 minutes to start the execution and about 30 minutes to
deploy everything.
Background on the AWS Well-Architected Reliability Pillar
What is it?
• “The reliability pillar encompasses the ability of a system to recover from
infrastructure or service disruptions, dynamically acquire computing resources
to meet demand, and mitigate disruptions such as misconfigurations or
transient network issues.” – AWS Well-Architected Framework whitepaper
Design principles
• Test recovery procedures
• Automatically recover from failures
• Scale horizontally to increase aggregate system availability
• Stop guessing capacity
• Manage change using automation
Failure Management
• “Everything fails, all the time.” - Werner Vogels, Amazon CTO
• We are going to work on withstanding COMPONENT failures and planning
for RECOVERY
• The Netflix OSS Simian Army, specifically its “Chaos” suite, is an example
of how to test fault tolerance
• Chaos Monkey—injects failures on instances/containers
• Chaos Gorilla (deprecated)—simulates an AZ failure
• Chaos Kong—simulates an AWS Region failure
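The instance-level failure injection described above can be sketched in a few lines of Python. This is a minimal illustration, not Chaos Monkey's implementation: it assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`), the function names and the VPC filter are hypothetical, and pagination of `describe_instances` is ignored for brevity.

```python
import random


def pick_victim(reservations, rng=random):
    """Return the ID of one randomly chosen instance, or None if there are none."""
    instance_ids = [
        instance["InstanceId"]
        for reservation in reservations
        for instance in reservation["Instances"]
    ]
    return rng.choice(instance_ids) if instance_ids else None


def terminate_random_instance(ec2, vpc_id, rng=random):
    """Terminate one running instance in the given VPC and return its ID.

    `ec2` is assumed to be a boto3 EC2 client; only the first page of
    describe_instances results is considered in this sketch.
    """
    response = ec2.describe_instances(
        Filters=[
            {"Name": "vpc-id", "Values": [vpc_id]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    victim = pick_victim(response["Reservations"], rng)
    if victim is not None:
        ec2.terminate_instances(InstanceIds=[victim])
    return victim
```

Passing the client in as a parameter keeps the selection logic testable without touching a real account.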
Automated Recovery
• Manual recovery will take much longer
• Use AWS services to automate
• Auto Scaling
• Amazon Relational Database Service (Amazon RDS)
• Amazon Route 53
• Amazon Simple Storage Service (Amazon S3)
Why are we here?
• Just use Netflix OSS, right?
• Wait, Chaos Monkey is written in Java or Go?
• Wait, I need to share SSH keys to inject failure?
• Wait, what about simulating an AZ failure (no more Chaos Gorilla)?
• This creates fear, uncertainty, and doubt
• Can I do this?
Yes, you can do this!
• You can write code to perform these scenarios in any language
• You can use AWS Systems Manager and EC2 Run Command to inject
failure without using SSH
• You can write your own AZ Failure simulation
• You can simulate your own regional failure
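As one illustration of injecting a failure without SSH, the sketch below uses Systems Manager Run Command to kill a process on an instance. It assumes `ssm` is a boto3 SSM client (for example `boto3.client("ssm")`), that the target instance is managed by Systems Manager, and that killing a named process is a useful failure for your workload. The function names and the `pkill` command are illustrative choices, not taken from the lab guide.

```python
import shlex


def kill_process_commands(process_name):
    """Build the shell commands that kill every process with the given name."""
    return [f"sudo pkill -9 {shlex.quote(process_name)}"]


def inject_process_failure(ssm, instance_id, process_name):
    """Kill a process on an instance through Run Command and return the command ID.

    `ssm` is assumed to be a boto3 SSM client. AWS-RunShellScript is the
    AWS-provided document for running shell commands on Linux instances.
    """
    response = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": kill_process_commands(process_name)},
        Comment="Resiliency test: kill a process",
    )
    return response["Command"]["CommandId"]
```

The returned command ID can be used afterwards to check whether the command succeeded on the instance.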
Scenario: Simple three tier web application
Scenario for testing
[Architecture diagram. Labels: AWS US-EAST-2 Region; virtual private cloud; three Availability Zones, each with an IGW routed subnet and a private subnet; Application Load Balancer; Auto Scale App Tier instances; MySQL DB (Multi-AZ); bucket]
Lab Time
Start at the “Discussion and Example Failure Scenarios” section in
the lab guide
Summary of Tests
Simulated Failures
[Architecture diagram. Labels: AWS US-EAST-2 Region; virtual private cloud; three Availability Zones, each with an IGW routed subnet and a private subnet; Application Load Balancer; Auto Scale App Tier instances; MySQL DB (Multi-AZ); bucket]
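One possible way to approximate an AZ failure against an architecture like this is sketched below. It is an assumption about how such a test could be written, not necessarily what the lab code does: it removes the "failed" AZ's subnets from the Auto Scaling group and, if the Multi-AZ database primary sits in that AZ, forces a failover. The clients are assumed to be boto3 `ec2`, `autoscaling`, and `rds` clients, and all function and parameter names are hypothetical.

```python
def subnets_outside_az(subnets, failed_az):
    """Return the IDs of the subnets that are not in the 'failed' AZ."""
    return [s["SubnetId"] for s in subnets if s["AvailabilityZone"] != failed_az]


def simulate_az_failure(ec2, autoscaling, rds, asg_name, db_id, failed_az):
    """Take one AZ out of service for the app tier and the database.

    Returns the subnets left in the Auto Scaling group and whether a
    database failover was requested.
    """
    # Look up the subnets the Auto Scaling group currently uses.
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[asg_name]
    )["AutoScalingGroups"][0]
    current_ids = [s for s in group["VPCZoneIdentifier"].split(",") if s]
    subnets = ec2.describe_subnets(SubnetIds=current_ids)["Subnets"]

    # Keep only the subnets outside the AZ being "failed".
    remaining = subnets_outside_az(subnets, failed_az)
    if not remaining:
        raise ValueError("the group would have no subnets left")
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        VPCZoneIdentifier=",".join(remaining),
    )

    # Force a Multi-AZ failover only if the primary is in the failed AZ.
    db = rds.describe_db_instances(DBInstanceIdentifier=db_id)["DBInstances"][0]
    failed_over = bool(db["MultiAZ"]) and db["AvailabilityZone"] == failed_az
    if failed_over:
        rds.reboot_db_instance(DBInstanceIdentifier=db_id, ForceFailover=True)
    return remaining, failed_over
```

This changes the group's configuration, so the original subnet list should be recorded first if the environment needs to be restored afterwards.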
What did you learn?
We want you to lose the fear of writing these tests!
Failure and Resiliency Testing
• The simulation might not be obvious—discuss and think about it before
implementation
• Writing this code is not difficult
• On very large implementations, you’ll need to use orchestration to have
things happen almost simultaneously
[Diagram: AWS Step Functions with three Lambda functions]
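For the orchestration point above, a Step Functions state machine with a Parallel state is one way to start several failure injections at nearly the same time. The sketch below only builds the Amazon States Language definition as a Python dictionary; the function name, the state names, and the example ARNs are hypothetical, and each Lambda function is assumed to perform one failure injection.

```python
import json


def parallel_failure_definition(lambda_arns):
    """Build a state machine definition that runs one Lambda task per branch.

    All branches of a Parallel state are started together, so the
    injections begin at roughly the same time.
    """
    branches = []
    for index, arn in enumerate(lambda_arns):
        state_name = f"InjectFailure{index}"
        branches.append(
            {
                "StartAt": state_name,
                "States": {
                    state_name: {"Type": "Task", "Resource": arn, "End": True}
                },
            }
        )
    return {
        "Comment": "Run failure injections in parallel",
        "StartAt": "InjectFailures",
        "States": {
            "InjectFailures": {
                "Type": "Parallel",
                "Branches": branches,
                "End": True,
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical ARNs, shown only to illustrate the output shape.
    arns = [
        "arn:aws:lambda:us-east-2:123456789012:function:fail-instance",
        "arn:aws:lambda:us-east-2:123456789012:function:fail-rds",
    ]
    print(json.dumps(parallel_failure_definition(arns), indent=2))
```

The resulting JSON can be supplied as the definition when creating a state machine.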
Things to watch out for
• You will find problems and resolve them, but real failure might not look
exactly like the simulation
• Expect this to be an ongoing effort
• Add to your acceptance testing
• Some failure modes are destructive
• Automated deployment will help bring the environment back to
original deployment configuration
• Failover is easy; failback is more difficult
• Game day testing is essential
• Practice, practice, practice
This Code is Available for Download
• Download it at:
• https://bit.ly/2low3dY
• Which expands to:
• https://s3.us-east-2.amazonaws.com/aws-well-architected-labs-ohio/Reliability/AWSLoftReliabilityLabCode.zip
• It is licensed under the Apache License, Version 2.0
• https://aws.amazon.com/apache2.0
Thank you!

Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Rodney Lester: Well-Architected - Reliability Instructor Led Lab.pdf

  • 1. How to Determine if You Are Well-Architected for Reliability
      Rodney Lester, Reliability Pillar Lead, Well-Architected
  • 2. Setting Up Your Test Environment
  • 3. Downloading the Lab Guide
      • Navigate to or download the lab guide: https://bit.ly/2rgjFAh
      • Execute the first section: Deploying the Infrastructure.
      • This will take about 5 minutes to start the execution and about 30 minutes to deploy everything.
  • 4. Background on the AWS Well-Architected Reliability Pillar
  • 5. What is it?
      • “The reliability pillar encompasses the ability of a system to recover from infrastructure or service disruptions, dynamically acquire computing resources to meet demand, and mitigate disruptions such as misconfigurations or transient network issues.” – AWS Well Architected Framework Whitepaper
      Design principles
      • Test recovery procedures
      • Automatically recover from failures
      • Scale horizontally to increase aggregate system availability
      • Stop guessing capacity
      • Manage change using automation
  • 7. Failure Management
      • “Everything fails, all the time.” - Werner Vogels, Amazon CTO
      • We are going to work on withstanding COMPONENT failures and planning for RECOVERY
      • Netflix OSS Simian Army, specifically the “Chaos” suite, is an example of how to test fault tolerance
          • Chaos Monkey: injects failures on instances/containers
          • Chaos Gorilla (deprecated): simulates an AZ failure
          • Chaos Kong: simulates an AWS Region failure
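
The instance-level failure injection described above can be sketched in a few lines. This is a minimal illustration, not the Netflix implementation and not the lab's code: the function names are hypothetical, and the `autoscaling` and `ec2` arguments are assumed to be boto3 clients (or stand-ins with the same `describe_auto_scaling_groups` and `terminate_instances` methods).

```python
import random


def pick_victim(asg_description, rng=random):
    """Pick one in-service instance ID from a describe_auto_scaling_groups response."""
    candidates = [
        instance["InstanceId"]
        for group in asg_description["AutoScalingGroups"]
        for instance in group["Instances"]
        if instance["LifecycleState"] == "InService"
    ]
    if not candidates:
        raise RuntimeError("no in-service instances to terminate")
    return rng.choice(candidates)


def terminate_random_instance(autoscaling, ec2, group_name, rng=random):
    """Terminate one random in-service instance of the named Auto Scaling group.

    The clients are passed in (assumption: boto3 'autoscaling' and 'ec2' clients)
    so the function can be exercised with stubs before touching a real account.
    """
    description = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    victim = pick_victim(description, rng)
    ec2.terminate_instances(InstanceIds=[victim])
    return victim
```

Against a real account the clients would typically come from `boto3.client("autoscaling")` and `boto3.client("ec2")`. The Auto Scaling group is then expected to notice the missing instance and launch a replacement, which is the behavior the test is meant to confirm.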
  • 8. Automated Recovery
      • Manual recovery will take much longer
      • Use AWS services to automate
          • Auto Scaling
          • Amazon Relational Database Service (Amazon RDS)
          • Amazon Route 53
          • Amazon Simple Storage Service (Amazon S3)
  • 9. Why are we here?
      • Just use Netflix OSS, right?
      • Wait, Chaos Monkey is written in Java or Go?
      • Wait, I need to share SSH keys to inject failure?
      • Wait, what about simulating an AZ failure (no more Chaos Gorilla)?
      • This creates fear, uncertainty, and doubt
      • Can I do this?
  • 10. Yes, you can do this!
      • You can write code to perform these scenarios in any language
      • You can use AWS Systems Manager and EC2 Run Command to inject failure without using SSH
      • You can write your own AZ failure simulation
      • You can simulate your own regional failure
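
As a rough sketch of two of these points, the code below injects a process failure through Systems Manager Run Command and approximates an AZ failure. It is not the lab's code. The function names and the `pkill` command are illustrative assumptions, and the AZ simulation (removing the failed AZ's subnets from the Auto Scaling group and forcing a Multi-AZ RDS failover) is one possible approach among several. The clients are assumed to be boto3 `ssm`, `autoscaling`, and `rds` clients.

```python
import shlex


def kill_process_via_run_command(ssm, instance_id, process_name):
    """Kill a process on an instance with Run Command, so no SSH keys are shared.

    Uses the AWS-RunShellScript document; the pkill command is an illustrative
    choice of failure, not something prescribed by the slides.
    """
    response = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [f"sudo pkill -9 {shlex.quote(process_name)}"]},
        Comment="Failure injection test",
    )
    return response["Command"]["CommandId"]


def simulate_az_failure(autoscaling, rds, group_name, surviving_subnet_ids,
                        db_instance_id):
    """Approximate an AZ outage for a web tier and a Multi-AZ database.

    Assumption: restricting the Auto Scaling group to the surviving subnets makes
    it replace capacity outside the "failed" AZ, and the forced failover is only
    meaningful when the primary database currently runs in that AZ.
    """
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        VPCZoneIdentifier=",".join(surviving_subnet_ids),
    )
    rds.reboot_db_instance(
        DBInstanceIdentifier=db_instance_id,
        ForceFailover=True,
    )
```

Passing the clients in keeps the functions easy to dry-run with stubs. A fuller simulation would probably also block network traffic for the affected subnets, which this sketch leaves out.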
  • 11. Scenario: Simple three tier web application
  • 12. Scenario for testing
      [Architecture diagram: AWS US-EAST-2 Region; a virtual private cloud spanning three Availability Zones, each with an IGW routed subnet and a private subnet; Application Load Balancer; Auto Scale App Tier of instances; MySQL DB Multi-AZ; bucket]
  • 13. Lab Time
  • 14. Start at Discussion and Example Failure Scenarios Section in the Lab Guide
  • 15. Summary of Tests
  • 16. Simulated Failures
      [Architecture diagram: same environment as slide 12 (AWS US-EAST-2 Region, VPC across three Availability Zones, Application Load Balancer, Auto Scale App Tier, MySQL DB Multi-AZ, bucket)]
  • 17. What did you learn?
      We want you to lose the fear of writing these tests!
  • 18. Failure and Resiliency Testing
      • The simulation might not be obvious: discuss and think about it before implementation
      • Writing this code is not difficult
      • On very large implementations, you’ll need to use orchestration to have things happen almost simultaneously
      [Diagram: AWS Step Functions and three Lambda functions]
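
One way to picture the orchestration point is a Step Functions state machine whose single Parallel state runs several failure-injection Lambda functions at roughly the same time. The sketch below only builds the Amazon States Language definition as a Python dictionary. The function name, state names, and ARNs are hypothetical, and the slides do not say that the lab uses this exact structure.

```python
import json


def build_parallel_definition(lambda_arns):
    """Build a state machine definition that runs one Lambda task per branch.

    Each branch of the Parallel state holds a single Task state, so all
    failure injections start at approximately the same time.
    """
    branches = []
    for index, arn in enumerate(lambda_arns):
        state_name = f"InjectFailure{index}"
        branches.append({
            "StartAt": state_name,
            "States": {
                state_name: {"Type": "Task", "Resource": arn, "End": True},
            },
        })
    return {
        "Comment": "Run failure injections in parallel (illustrative sketch)",
        "StartAt": "InjectAll",
        "States": {
            "InjectAll": {"Type": "Parallel", "Branches": branches, "End": True},
        },
    }


def definition_as_json(lambda_arns):
    """Serialize the definition in the form Step Functions accepts as a string."""
    return json.dumps(build_parallel_definition(lambda_arns), indent=2)
```

The resulting JSON string could be supplied as the `definition` when creating a state machine. Each Lambda function would then carry its own failure scenario, such as the instance or AZ simulations from the lab.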
  • 19. Things to watch out for
      • You will find problems and resolve them, but real failure might not look exactly like the simulation
      • Expect this to be an ongoing effort
      • Add to your acceptance testing
      • Some failure modes are destructive
      • Automated deployment will help bring the environment back to original deployment configuration
      • Failover is easy; failback is more difficult
      • Game day testing is essential
      • Practice, practice, practice
  • 20. This Code is Available for Download
      • Download it at: https://bit.ly/2low3dY
      • Which expands to: https://s3.us-east-2.amazonaws.com/aws-well-architected-labs-ohio/Reliability/AWSLoftReliabilityLabCode.zip
      • It is licensed under the Apache License, Version 2.0: https://aws.amazon.com/apache2.0
  • 21. Thank you!