SlideShare a Scribd company logo
1 of 39
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
DevOps and Highly-Available Systems
Rich Alberth
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Approach
• Part 1: Understand High-Availability
• Part 2: Measure Deployment Risk
• Part 3: Minimize Probability and Impact
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
A system is
Highly-Available if …
Part 1:
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Degraded
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Trick Question: a system degrades when…
A. Out of capacity (ASG maxed)
B. Network outage (us-east-1a down?)
C. SNS unavailable (temporary credential
expired)
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Cause & Effect
Cause internallyis what happens to your application.
Effect externallyis what your app does to your users.
Database
failure
Can’t accept
orders
Can’t accept
orders
Delayed
orders
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Rule #1:
Internal ≠ customer
failure impact
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
High-Availability
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Risk Management
• Risk
• Probability
• Impact
• P × I
Part 2:
Low Med High
Low
Med
High
Amazon
SQS
AWS Elastic
Beanstalk
AWS
Lambda
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Reduce
Probability
Reduce
Impact
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Rule #2:
Reduce P × I
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Part 3: Architecture that Degrades Gracefully
Proactive: Reduce Probability
Reactive: Reduce the Impact
No single point
of failure
Build
horizontally
Build
vertically
Local cache Retry Reroute
1
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Proactive: No Single Point of Failure1
Fault-Tolerant System  Remove single points of failure
Highly-Available System  Remove single points of business failure
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Proactive: Build Horizontally
Amazon
RDS
Amazon EC2
Instances
Elastic Beanstalk Amazon S3
Bucket
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Proactive: Build Horizontally
Amazon
RDS
Amazon EC2
Instances
Amazon S3
Bucket
Amazon S3
Bucket
Amazon API
Gateway
AWS
Lambda
Web
browser
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Proactive: Build Horizontally
Web
browser
Amazon Simple
Queue Service
Amazon
ElastiCache
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Proactive: Build Vertically
Amazon
RDS
Amazon EC2
Instances
Elastic Beanstalk
Amazon S3
Bucket
External
Web Service
External
Web Service
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Proactive: Build Vertically
Amazon
RDS
Amazon EC2
Instances
Elastic Beanstalk
Amazon
S3
External
Web
Service
Amazon SWF
Worker Worker Worker Worker
External
Web
Service
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Rule #1:
Build ‘em small.
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Reactive: Local Cache
Other approaches concentrate on “all or nothing” semantics:
• Publish to SNS works, or fail-over to something else.
• Database there or not
• Web server up or fail-over to backup region
Local Cache:
• Serving some data is better than serving no data
• We degrade by using stale data until data store RTS
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Reactive: Local Cache
Amazon RDS
Database
Amazon EC2
Web
browser
AWS Elastic
Beanstalk
Application
Amazon
ElastiCache
Local file
cache
HTML5
Local
Storage
Amazon Simple
Queue Service
Mobile
Application
Amazon
CloudFront
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Rule #3:
Put some data from some
systems somewhere else too.
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Reactive: Retry
Wait a bit, try same request again
Try something else
Schedule it for later
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Reactive: Retry
AWS Elastic
Beanstalk
Amazon Simple
Queue Service
Mobile
Application
Amazon
S3
DynamoDB
Amazon
CloudSearch
Web
browser
AWS
Lambda
AWS
Lambda
Other Web
Service
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Reactive: Reroute
decider worker
Route 53 Elastic Load
Balancing EC2
Instances
Amazon SWF
worker
Amazon SQS
Amazon
Aurora
Dead
Letter
Queue
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Reactive: Reroute
AWS Elastic
Beanstalk
Amazon Simple
Queue Service
Mobile
Application
Web
browser
Amazon RDS
DynamoDB
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Reactive: Local Cache + Reroute
us-east-2
Elastic
Beanstalk
application
Route 53
Hosted Zone
Amazon
S3
DynamoDBAmazon
CloudSearch
sa-east-1
Elastic
Beanstalk
application
Route 53
Hosted Zone
Amazon
S3
DynamoDBAmazon
CloudSearch
+
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Rule #3:
Have a “Plan B”
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Operations to Degrade Gracefully
Proactive
Reactive
Canary
System
Monitors
Gremlins
Detection Playbooks
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Canary
Measure the user
experience,
Alarm when
distasteful.
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
System Monitors
Measure the internal
health of the system,
Alarm when it is not
within control.
Amazon
SQS
AWS Elastic
Beanstalk
AWS
Lambda
Monitor
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Custom System Monitor
Amazon Simple
Queue Service
Mobile
Application
Amazon
S3
AWS
Lambda
Other Web
Service
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Gremlins
Separate components
Create controlled failure
Audit what they do
Elastic
Load
Balancing
EC2
Instances
Amazon
Aurora
Amazon Simple
Queue Service
AWS
Lambda
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Detection
Time breakdown:
• Alert when a problem
probably will occur
• Alarm when a problem
happens
Control Range:
• Alarm over and under
• Alarm when data missing
0
20
40
60
80
100
1 6 11 16
CPUUtilization
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Playbooks
Document how to RTS: Return To Service
Versioned along with code
Reviewed and approved with code changes
Not about diagnosing problems, all about RTS
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Playbook: “Website 5xx errors”
Route 53 Elastic
Load
Balancing
EC2
Instances
Runbook:
1. Dig DNS name
2. Ping ELB IP
3. Ssh EC2 instances
4. Kill off EC2 instances one at a time
5. Fail-over to us-west-2
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Playbook: database intermittent errors
EC2
Instances
Amazon
Aurora
Amazon
ElastiCache
Runbook:
1. Connect to the Aurora endpoint
2. Create 2 new replicas
3. Isolate all databases
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
SWF worker EC2 instances keep rebooting
decider worker
Amazon SWF
worker
Amazon
Aurora
Runbook:
1. Don’t touch anything!
2. Monitor other metrics to look
for customer impact.
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Thank you

More Related Content

What's hot

From Monolith to Microservices – and Beyond!
From Monolith to Microservices – and Beyond!From Monolith to Microservices – and Beyond!
From Monolith to Microservices – and Beyond!Jules Pierre-Louis
 
Deploying Java Applicationson Ec2
Deploying Java Applicationson Ec2Deploying Java Applicationson Ec2
Deploying Java Applicationson Ec2Chris Richardson
 
Start Up Austin 2017: If How and When to Adopt Microservices
Start Up Austin 2017: If How and When to Adopt MicroservicesStart Up Austin 2017: If How and When to Adopt Microservices
Start Up Austin 2017: If How and When to Adopt MicroservicesAmazon Web Services
 
A Modern Data Architecture for Microservices
A Modern Data Architecture for MicroservicesA Modern Data Architecture for Microservices
A Modern Data Architecture for MicroservicesAmazon Web Services
 
Microservices Architectures on Amazon Web Services
Microservices Architectures on Amazon Web ServicesMicroservices Architectures on Amazon Web Services
Microservices Architectures on Amazon Web ServicesAmazon Web Services
 
DIY guide to runbooks, incident reports, and incident response
DIY guide to runbooks, incident reports, and incident responseDIY guide to runbooks, incident reports, and incident response
DIY guide to runbooks, incident reports, and incident responseNathan Case
 
Start Up Austin 2017: Manual vs Automation - When to Start Automating your Pr...
Start Up Austin 2017: Manual vs Automation - When to Start Automating your Pr...Start Up Austin 2017: Manual vs Automation - When to Start Automating your Pr...
Start Up Austin 2017: Manual vs Automation - When to Start Automating your Pr...Amazon Web Services
 
Microservices: Decomposing Applications for Deployability and Scalability (ja...
Microservices: Decomposing Applications for Deployability and Scalability (ja...Microservices: Decomposing Applications for Deployability and Scalability (ja...
Microservices: Decomposing Applications for Deployability and Scalability (ja...Chris Richardson
 
AWS Canberra WWPS Summit 2013 - Opening Keynote
AWS Canberra WWPS Summit 2013 - Opening KeynoteAWS Canberra WWPS Summit 2013 - Opening Keynote
AWS Canberra WWPS Summit 2013 - Opening KeynoteAmazon Web Services
 
Introduction to Microservices
Introduction to MicroservicesIntroduction to Microservices
Introduction to MicroservicesMahmoudZidan41
 
Debugging Modern Applications: Introduction to AWS X-Ray
Debugging Modern Applications: Introduction to AWS X-RayDebugging Modern Applications: Introduction to AWS X-Ray
Debugging Modern Applications: Introduction to AWS X-RayAmazon Web Services
 
Aufbau von agilen und effizienten IT Organisationen mit DevOps
Aufbau von agilen und effizienten IT Organisationen mit DevOpsAufbau von agilen und effizienten IT Organisationen mit DevOps
Aufbau von agilen und effizienten IT Organisationen mit DevOpsAWS Germany
 
re:Invent for Introverts 2021
re:Invent for Introverts 2021re:Invent for Introverts 2021
re:Invent for Introverts 2021AWS Chicago
 
AWS Finance Symposium_AWS와 함께 하는 디지털 금융 혁신 사례
AWS Finance Symposium_AWS와 함께 하는 디지털 금융 혁신 사례 AWS Finance Symposium_AWS와 함께 하는 디지털 금융 혁신 사례
AWS Finance Symposium_AWS와 함께 하는 디지털 금융 혁신 사례 Amazon Web Services Korea
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice Architecturetyrantbrian
 
AWS re:Inforce 2021 re:Cap 1
AWS re:Inforce 2021 re:Cap 1 AWS re:Inforce 2021 re:Cap 1
AWS re:Inforce 2021 re:Cap 1 Hayato Kiriyama
 
Adding the Sec to Your DevOps Pipelines
Adding the Sec to Your DevOps PipelinesAdding the Sec to Your DevOps Pipelines
Adding the Sec to Your DevOps PipelinesAmazon Web Services
 

What's hot (20)

From Monolith to Microservices – and Beyond!
From Monolith to Microservices – and Beyond!From Monolith to Microservices – and Beyond!
From Monolith to Microservices – and Beyond!
 
Deploying Java Applicationson Ec2
Deploying Java Applicationson Ec2Deploying Java Applicationson Ec2
Deploying Java Applicationson Ec2
 
Start Up Austin 2017: If How and When to Adopt Microservices
Start Up Austin 2017: If How and When to Adopt MicroservicesStart Up Austin 2017: If How and When to Adopt Microservices
Start Up Austin 2017: If How and When to Adopt Microservices
 
A Modern Data Architecture for Microservices
A Modern Data Architecture for MicroservicesA Modern Data Architecture for Microservices
A Modern Data Architecture for Microservices
 
Microservices Architectures on Amazon Web Services
Microservices Architectures on Amazon Web ServicesMicroservices Architectures on Amazon Web Services
Microservices Architectures on Amazon Web Services
 
DIY guide to runbooks, incident reports, and incident response
DIY guide to runbooks, incident reports, and incident responseDIY guide to runbooks, incident reports, and incident response
DIY guide to runbooks, incident reports, and incident response
 
Start Up Austin 2017: Manual vs Automation - When to Start Automating your Pr...
Start Up Austin 2017: Manual vs Automation - When to Start Automating your Pr...Start Up Austin 2017: Manual vs Automation - When to Start Automating your Pr...
Start Up Austin 2017: Manual vs Automation - When to Start Automating your Pr...
 
Microservices: Decomposing Applications for Deployability and Scalability (ja...
Microservices: Decomposing Applications for Deployability and Scalability (ja...Microservices: Decomposing Applications for Deployability and Scalability (ja...
Microservices: Decomposing Applications for Deployability and Scalability (ja...
 
Introduction to Microservices
Introduction to MicroservicesIntroduction to Microservices
Introduction to Microservices
 
AWS Canberra WWPS Summit 2013 - Opening Keynote
AWS Canberra WWPS Summit 2013 - Opening KeynoteAWS Canberra WWPS Summit 2013 - Opening Keynote
AWS Canberra WWPS Summit 2013 - Opening Keynote
 
Introduction to Microservices
Introduction to MicroservicesIntroduction to Microservices
Introduction to Microservices
 
Debugging Modern Applications: Introduction to AWS X-Ray
Debugging Modern Applications: Introduction to AWS X-RayDebugging Modern Applications: Introduction to AWS X-Ray
Debugging Modern Applications: Introduction to AWS X-Ray
 
Aufbau von agilen und effizienten IT Organisationen mit DevOps
Aufbau von agilen und effizienten IT Organisationen mit DevOpsAufbau von agilen und effizienten IT Organisationen mit DevOps
Aufbau von agilen und effizienten IT Organisationen mit DevOps
 
re:Invent for Introverts 2021
re:Invent for Introverts 2021re:Invent for Introverts 2021
re:Invent for Introverts 2021
 
AWS Finance Symposium_AWS와 함께 하는 디지털 금융 혁신 사례
AWS Finance Symposium_AWS와 함께 하는 디지털 금융 혁신 사례 AWS Finance Symposium_AWS와 함께 하는 디지털 금융 혁신 사례
AWS Finance Symposium_AWS와 함께 하는 디지털 금융 혁신 사례
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice Architecture
 
Cloud security ppt
Cloud security pptCloud security ppt
Cloud security ppt
 
AWS re:Inforce 2021 re:Cap 1
AWS re:Inforce 2021 re:Cap 1 AWS re:Inforce 2021 re:Cap 1
AWS re:Inforce 2021 re:Cap 1
 
Adding the Sec to Your DevOps Pipelines
Adding the Sec to Your DevOps PipelinesAdding the Sec to Your DevOps Pipelines
Adding the Sec to Your DevOps Pipelines
 
D do s_white_paper
D do s_white_paperD do s_white_paper
D do s_white_paper
 

Similar to Workshop: DevOps and Highly-Available Systems

Getting Started with AWS Lambda and Serverless Computing
Getting Started with AWS Lambda and Serverless ComputingGetting Started with AWS Lambda and Serverless Computing
Getting Started with AWS Lambda and Serverless ComputingAmazon Web Services
 
Wild rydes serverless website workshop
Wild rydes   serverless website workshopWild rydes   serverless website workshop
Wild rydes serverless website workshopAmazon Web Services
 
The AWS Shared Responsibility Model in Practice
The AWS Shared Responsibility Model in PracticeThe AWS Shared Responsibility Model in Practice
The AWS Shared Responsibility Model in PracticeAlert Logic
 
CSS17: Dallas - The AWS Shared Responsibility Model in Practice
CSS17: Dallas - The AWS Shared Responsibility Model in PracticeCSS17: Dallas - The AWS Shared Responsibility Model in Practice
CSS17: Dallas - The AWS Shared Responsibility Model in PracticeAlert Logic
 
AWS Shared Security Model in Practice
AWS Shared Security Model in PracticeAWS Shared Security Model in Practice
AWS Shared Security Model in PracticeAlert Logic
 
Detective Controls: Gain Visibility and Record Change
Detective Controls: Gain Visibility and Record ChangeDetective Controls: Gain Visibility and Record Change
Detective Controls: Gain Visibility and Record ChangeAmazon Web Services
 
A Tale of Two Pizzas: Accelerating Software Delivery with AWS Developer Tools
A Tale of Two Pizzas: Accelerating Software Delivery with AWS Developer ToolsA Tale of Two Pizzas: Accelerating Software Delivery with AWS Developer Tools
A Tale of Two Pizzas: Accelerating Software Delivery with AWS Developer ToolsAmazon Web Services
 
Scaling up to and beyond 10M users
Scaling up to and beyond 10M usersScaling up to and beyond 10M users
Scaling up to and beyond 10M usersAmazon Web Services
 
Using Amazon VPC Flow Logs for Predictive Security Analytics (NET319) - AWS r...
Using Amazon VPC Flow Logs for Predictive Security Analytics (NET319) - AWS r...Using Amazon VPC Flow Logs for Predictive Security Analytics (NET319) - AWS r...
Using Amazon VPC Flow Logs for Predictive Security Analytics (NET319) - AWS r...Amazon Web Services
 
Introduction to the Security Perspective of the Cloud Adoption Framework (CAF)
 Introduction to the Security Perspective of the Cloud Adoption Framework (CAF) Introduction to the Security Perspective of the Cloud Adoption Framework (CAF)
Introduction to the Security Perspective of the Cloud Adoption Framework (CAF)Amazon Web Services
 
Going Serverless at AWS Startup Day Bangalore
Going Serverless at AWS Startup Day Bangalore Going Serverless at AWS Startup Day Bangalore
Going Serverless at AWS Startup Day Bangalore Madhusudan Shekar
 
Workshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECSWorkshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECSAmazon Web Services
 
Bootcamp: Getting Started on AWS
Bootcamp: Getting Started on AWSBootcamp: Getting Started on AWS
Bootcamp: Getting Started on AWSAmazon Web Services
 
DevOps on AWS: Advanced Techniques for Amazon EC2 Deployments on AWS
DevOps on AWS: Advanced Techniques for Amazon EC2 Deployments on AWSDevOps on AWS: Advanced Techniques for Amazon EC2 Deployments on AWS
DevOps on AWS: Advanced Techniques for Amazon EC2 Deployments on AWSAmazon Web Services
 
How to build scalable and resilient applications in the cloud - AWS Summit Ca...
How to build scalable and resilient applications in the cloud - AWS Summit Ca...How to build scalable and resilient applications in the cloud - AWS Summit Ca...
How to build scalable and resilient applications in the cloud - AWS Summit Ca...Amazon Web Services
 

Similar to Workshop: DevOps and Highly-Available Systems (20)

Getting Started with AWS Lambda and Serverless Computing
Getting Started with AWS Lambda and Serverless ComputingGetting Started with AWS Lambda and Serverless Computing
Getting Started with AWS Lambda and Serverless Computing
 
Elasticity and Management
Elasticity and ManagementElasticity and Management
Elasticity and Management
 
Wild rydes serverless website workshop
Wild rydes   serverless website workshopWild rydes   serverless website workshop
Wild rydes serverless website workshop
 
Deep Dive on the IoT at AWS
Deep Dive on the IoT at AWSDeep Dive on the IoT at AWS
Deep Dive on the IoT at AWS
 
The AWS Shared Responsibility Model in Practice
The AWS Shared Responsibility Model in PracticeThe AWS Shared Responsibility Model in Practice
The AWS Shared Responsibility Model in Practice
 
CSS17: Dallas - The AWS Shared Responsibility Model in Practice
CSS17: Dallas - The AWS Shared Responsibility Model in PracticeCSS17: Dallas - The AWS Shared Responsibility Model in Practice
CSS17: Dallas - The AWS Shared Responsibility Model in Practice
 
AWS Shared Security Model in Practice
AWS Shared Security Model in PracticeAWS Shared Security Model in Practice
AWS Shared Security Model in Practice
 
Deep Dive on IoT at AWS
Deep Dive on IoT at AWSDeep Dive on IoT at AWS
Deep Dive on IoT at AWS
 
Automating Operations Workload
Automating Operations WorkloadAutomating Operations Workload
Automating Operations Workload
 
Amazon EC2 & VPC HOL
Amazon EC2 & VPC HOLAmazon EC2 & VPC HOL
Amazon EC2 & VPC HOL
 
Detective Controls: Gain Visibility and Record Change
Detective Controls: Gain Visibility and Record ChangeDetective Controls: Gain Visibility and Record Change
Detective Controls: Gain Visibility and Record Change
 
A Tale of Two Pizzas: Accelerating Software Delivery with AWS Developer Tools
A Tale of Two Pizzas: Accelerating Software Delivery with AWS Developer ToolsA Tale of Two Pizzas: Accelerating Software Delivery with AWS Developer Tools
A Tale of Two Pizzas: Accelerating Software Delivery with AWS Developer Tools
 
Scaling up to and beyond 10M users
Scaling up to and beyond 10M usersScaling up to and beyond 10M users
Scaling up to and beyond 10M users
 
Using Amazon VPC Flow Logs for Predictive Security Analytics (NET319) - AWS r...
Using Amazon VPC Flow Logs for Predictive Security Analytics (NET319) - AWS r...Using Amazon VPC Flow Logs for Predictive Security Analytics (NET319) - AWS r...
Using Amazon VPC Flow Logs for Predictive Security Analytics (NET319) - AWS r...
 
Introduction to the Security Perspective of the Cloud Adoption Framework (CAF)
 Introduction to the Security Perspective of the Cloud Adoption Framework (CAF) Introduction to the Security Perspective of the Cloud Adoption Framework (CAF)
Introduction to the Security Perspective of the Cloud Adoption Framework (CAF)
 
Going Serverless at AWS Startup Day Bangalore
Going Serverless at AWS Startup Day Bangalore Going Serverless at AWS Startup Day Bangalore
Going Serverless at AWS Startup Day Bangalore
 
Workshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECSWorkshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECS
 
Bootcamp: Getting Started on AWS
Bootcamp: Getting Started on AWSBootcamp: Getting Started on AWS
Bootcamp: Getting Started on AWS
 
DevOps on AWS: Advanced Techniques for Amazon EC2 Deployments on AWS
DevOps on AWS: Advanced Techniques for Amazon EC2 Deployments on AWSDevOps on AWS: Advanced Techniques for Amazon EC2 Deployments on AWS
DevOps on AWS: Advanced Techniques for Amazon EC2 Deployments on AWS
 
How to build scalable and resilient applications in the cloud - AWS Summit Ca...
How to build scalable and resilient applications in the cloud - AWS Summit Ca...How to build scalable and resilient applications in the cloud - AWS Summit Ca...
How to build scalable and resilient applications in the cloud - AWS Summit Ca...
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Workshop: DevOps and Highly-Available Systems

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved DevOps and Highly-Available Systems Rich Alberth
  • 2. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Approach • Part 1: Understand High-Availability • Part 2: Measure Deployment Risk • Part 3: Minimize Probability and Impact
  • 3. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved A system is Highly-Available if … Part 1:
  • 4. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Degraded
  • 5. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Trick Question: a system degrades when… A. Out of capacity (ASG maxed) B. Network outage (us-east-1a down?) C. SNS unavailable (temporary credential expired)
  • 6. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Cause & Effect Cause internallyis what happens to your application. Effect externallyis what your app does to your users. Database failure Can’t accept orders Can’t accept orders Delayed orders
  • 7. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Rule #1: Internal ≠ customer failure impact
  • 8. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved High-Availability
  • 9. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Risk Management • Risk • Probability • Impact • P × I Part 2: Low Med High Low Med High Amazon SQS AWS Elastic Beanstalk AWS Lambda
  • 10. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Reduce Probability Reduce Impact
  • 11. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Rule #2: Reduce P × I
  • 12. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Part 3: Architecture that Degrades Gracefully Proactive: Reduce Probability Reactive: Reduce the Impact No single point of failure Build horizontally Build vertically Local cache Retry Reroute 1
  • 13. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Proactive: No Single Point of Failure1 Fault-Tolerant System  Remove single points of failure Highly-Available System  Remove single points of business failure
  • 14. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Proactive: Build Horizontally Amazon RDS Amazon EC2 Instances Elastic Beanstalk Amazon S3 Bucket
  • 15. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Proactive: Build Horizontally Amazon RDS Amazon EC2 Instances Amazon S3 Bucket Amazon S3 Bucket Amazon API Gateway AWS Lambda Web browser
  • 16. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Proactive: Build Horizontally Web browser Amazon Simple Queue Service Amazon ElastiCache
  • 17. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Proactive: Build Vertically Amazon RDS Amazon EC2 Instances Elastic Beanstalk Amazon S3 Bucket External Web Service External Web Service
  • 18. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Proactive: Build Vertically Amazon RDS Amazon EC2 Instances Elastic Beanstalk Amazon S3 External Web Service Amazon SWF Worker Worker Worker Worker External Web Service
  • 19. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Rule #1: Build ‘em small.
  • 20. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Reactive: Local Cache Other approaches concentrate on “all or nothing” semantics: • Publish to SNS works, or fail-over to something else. • Database there or not • Web server up or fail-over to backup region Local Cache: • Serving some data is better than serving no data • We degrade by using stale data until data store RTS
  • 21. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Reactive: Local Cache Amazon RDS Database Amazon EC2 Web browser AWS Elastic Beanstalk Application Amazon ElastiCache Local file cache HTML5 Local Storage Amazon Simple Queue Service Mobile Application Amazon CloudFront
  • 22. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Rule #3: Put some data from some systems somewhere else too.
  • 23. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Reactive: Retry Wait a bit, try same request again Try something else Schedule it for later
  • 24. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Reactive: Retry AWS Elastic Beanstalk Amazon Simple Queue Service Mobile Application Amazon S3 DynamoDB Amazon CloudSearch Web browser AWS Lambda AWS Lambda Other Web Service
  • 25. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Reactive: Reroute decider worker Route 53 Elastic Load Balancing EC2 Instances Amazon SWF worker Amazon SQS Amazon Aurora Dead Letter Queue
  • 26. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Reactive: Reroute AWS Elastic Beanstalk Amazon Simple Queue Service Mobile Application Web browser Amazon RDS DynamoDB
  • 27. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Reactive: Local Cache + Reroute us-east-2 Elastic Beanstalk application Route 53 Hosted Zone Amazon S3 DynamoDBAmazon CloudSearch sa-east-1 Elastic Beanstalk application Route 53 Hosted Zone Amazon S3 DynamoDBAmazon CloudSearch +
  • 28. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Rule #3: Have a “Plan B”
  • 29. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Operations to Degrade Gracefully Proactive Reactive Canary System Monitors Gremlins Detection Playbooks
  • 30. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Canary Measure the user experience, Alarm when distasteful.
  • 31. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved System Monitors Measure the internal health of the system, Alarm when it is not within control. Amazon SQS AWS Elastic Beanstalk AWS Lambda Monitor
  • 32. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Custom System Monitor Amazon Simple Queue Service Mobile Application Amazon S3 AWS Lambda Other Web Service
  • 33. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Gremlins Separate components Create controlled failure Audit what they do Elastic Load Balancing EC2 Instances Amazon Aurora Amazon Simple Queue Service AWS Lambda
  • 34. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Detection Time breakdown: • Alert when a problem probably will occur • Alarm when a problem happens Control Range: • Alarm over and under • Alarm when data missing 0 20 40 60 80 100 1 6 11 16 CPUUtilization
  • 35. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Playbooks Document how to RTS: Return To Service Versioned along with code Reviewed and approved with code changes Not about diagnosing problems, all about RTS
  • 36. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Playbook: “Website 5xx errors” Route 53 Elastic Load Balancing EC2 Instances Runbook: 1. Dig DNS name 2. Ping ELB IP 3. Ssh EC2 instances 4. Kill off EC2 instances one at a time 5. Fail-over to us-west-2
  • 37. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Playbook: database intermittent errors EC2 Instances Amazon Aurora Amazon ElastiCache Runbook: 1. Connect to the Aurora endpoint 2. Create 2 new replicas 3. Isolate all databases
  • 38. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved SWF worker EC2 instances keep rebooting decider worker Amazon SWF worker Amazon Aurora Runbook: 1. Don’t touch anything! 2. Monitor other metrics to look for customer impact.
  • 39. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Thank you