© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
BERLIN
25.10.19
Resiliency and availability design
patterns for the cloud
Cobus Bernard
Senior Technical Evangelist
Amazon Web Services
@cobusbernard
cobusbernard
cobusbernard
B A R 3
Can you guess what will happen?
Distributed Systems are hard
“Failures are a given and everything will eventually fail over time.”
Werner Vogels
CTO – Amazon.com
Resiliency: Ability for a system to handle and eventually recover from unexpected conditions
Partial failure mode
Why do we build resilient software systems?
The cost of downtime
• Annual Fortune 1000 application downtime costs (IDC): $1.25 to $2.5B
• Average cost of a data breach (Ponemon Institute): $3.6M
• Cost/hr of a critical application failure (IDC): $500K to $1M
• Average cost/hr of downtime (Ponemon Institute): $474K
• Average cost per lost or stolen record (Ponemon Institute): $141
How do we build resilient software systems?
People
Application
Network & Data
Infrastructure
Let’s talk about Availability
System availability
Availability = Normal Operation Time / Total Time = MTBF / (MTBF + MTTR)
MTBF: Mean Time Between Failures
MTTR: Mean Time To Repair
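The availability formula can be checked with a few lines of Python. This is a minimal sketch: the function name `availability` and the MTBF/MTTR figures are made up for illustration and do not come from the talk.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system: one failure every 1,000 hours, 1 hour to repair.
print(f"{availability(1000, 1):.4%}")

# Halving the repair time improves availability exactly as much as
# doubling the time between failures.
print(f"{availability(1000, 0.5):.4%}")
print(f"{availability(2000, 1):.4%}")
```

The last two lines print the same value, which is one reason fast recovery (low MTTR) gets as much attention as failure prevention.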
Reading homework
Availability in parallel
A = 1 – (1 – A_X)^2
(two copies of Part X in parallel, each with availability A_X)
Availability in parallel
Component            | Availability        | Downtime per year
X                    | 99% (2-nines)       | 3 days 15 hours
Two X in parallel    | 99.99% (4-nines)    | 52 minutes
Three X in parallel  | 99.9999% (6-nines)  | 31 seconds
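The table can be reproduced from the parallel-availability formula, generalized from 2 to n copies. A small sketch, assuming the copies fail independently and that downtime is measured over a 365-day year; the function names are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def parallel_availability(component: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures."""
    return 1 - (1 - component) ** copies


def downtime_seconds_per_year(avail: float) -> float:
    """Expected unavailable time per 365-day year for a given availability."""
    return (1 - avail) * SECONDS_PER_YEAR


for n in (1, 2, 3):
    a = parallel_availability(0.99, n)
    print(n, f"{a:.6%}", f"{downtime_seconds_per_year(a):.1f} s/year")
```

The independence assumption matters: copies that share a power feed, a deployment, or a bug fail together, and the real number is worse than the formula suggests.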
Component redundancy increases availability significantly!
AWS Global Infrastructure
• 22 Regions with 69 Availability Zones
• 3 Regions coming soon: Cape Town, Jakarta and Milan
• 100Gbps redundant network
• 99.99% SLA
Fully-scaled Availability Zone
Highly redundant regional network
AWS Region and availability zones
[Diagram: a Region containing Availability zone a, Availability zone b and Availability zone c, each with several data centers]
1 or more data centers per AZ
2 or more AZs per region (new regions min 3)
How about a global architecture?
Once upon a time …
Origin
And Now …
Origin
~300ms
Improve latency for end-users
Improve availability and disaster recovery
[Diagram: Users from San Francisco and Applications in US West (Service 1, Service 2, Service 3, Service 4); Users from New York and Applications in US East (Service 1, Service 2, Service 3, Service 4)]
So should we go for a global architecture?
Perfect your regional architecture first!
Let’s talk about Multi-AZ
Multi-AZ architecture
[Diagram: a Region with Availability zone a, Availability zone b and Availability zone c; Elastic Load Balancing (ELB); Instances in each Availability zone; a DB instance and a standby DB instance]
Multi-AZ architecture
[Diagram: same Region layout; the second DB instance is now labelled “new master”]
Multi-AZ architecture
• Enables fault-tolerant applications
• AWS regional services designed to withstand AZ failures
• Leveraged by AWS regional services such as Amazon S3, Amazon DynamoDB, Amazon Aurora, Amazon ELBs, etc.
Let’s talk about auto scaling
Auto-Scaling
[Charts: Fixed, Variable]
Auto-scaling for self-healing
[Diagram: AWS Region; Elastic Load Balancing (ELB); an Auto Scaling group across Availability zone 1 and Availability zone 2; one instance marked X]
Let’s talk about the AWS responsibility models
AWS operational responsibility models
Category   | On-Premises     | Cloud (AWS operational responsibility: Less to More)
Compute    | Virtual Machine | EC2, Elastic Beanstalk, AWS Lambda, Fargate
Databases  | MySQL           | MySQL on EC2, RDS MySQL, RDS Aurora, Aurora Serverless, DynamoDB
Storage    | Storage         | S3
Messaging  | ESBs            | Amazon MQ, Kinesis, SQS / SNS
Analytics  | Hadoop          | Hadoop on EC2, EMR, Elasticsearch Service, Athena, Firehose
Let’s talk about databases
Common resiliency issues with databases?
REPLICATION, BACKUPS, SCALING
[Diagram: PutItem. Labels: RR nodes spread across AZ 1, AZ 2 and AZ 3; Network; Storage Node; Leader]
[Diagram: GetItem. Labels: RR nodes spread across AZ 1, AZ 2 and AZ 3; Network; Storage Node; Leader]
Amazon.com, Nike, Netflix, Duolingo, Lyft, Airbnb, Samsung, Toyota, and Capital One depend on the scale and performance of DynamoDB to support their workloads.
10 trillion requests per day
20 million requests per second
Purpose-built databases
• Relational: Amazon RDS (Aurora, Commercial, Community)
• Key-value: DynamoDB
• Document: DocumentDB
• In-memory: ElastiCache
• Graph: Neptune
• Time-series: Timestream
• Ledger: QLDB
Read-Write separation
[Diagram: Master, Read Replica, Read Replica, Read Replica; Instance, Instance, Instance]
Database Federation
[Diagram: Users DB with Master and (Read) Replica; Products DB with Master and (Read) Replica; Instance, Instance, Instance]
Database Sharding
User    | ShardID
002345  | A
002346  | B
002347  | C
002348  | B
002349  | A
[Diagram: shards A, B and C, each with a Master and a (Read) Replica; Instance, Instance, Instance]
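The lookup table on the sharding slide, combined with the master/replica split from the read-write separation slide, can be sketched as a small routing function. The user IDs and shard letters come from the slide; the endpoint host names and the `endpoint_for` helper are hypothetical.

```python
# Directory-based sharding: a lookup table maps each user ID to a shard,
# mirroring the User / ShardID table on the slide.
SHARD_DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}

# Hypothetical endpoints: each shard has one master and one read replica.
SHARD_ENDPOINTS = {
    shard: {
        "master": f"shard-{shard.lower()}-master.example.internal",
        "replica": f"shard-{shard.lower()}-replica.example.internal",
    }
    for shard in "ABC"
}


def endpoint_for(user_id: str, write: bool) -> str:
    """Writes go to the shard's master, reads to its read replica."""
    shard = SHARD_DIRECTORY[user_id]
    role = "master" if write else "replica"
    return SHARD_ENDPOINTS[shard][role]


print(endpoint_for("002346", write=True))
print(endpoint_for("002349", write=False))
```

A directory like this makes it possible to move a single user between shards, at the cost of the directory itself becoming a dependency that has to be highly available.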
Let’s talk about backups
AWS Backup service
Preventing Accidental Table Deletion
https://aws.amazon.com/blogs/database/preventing-accidental-table-deletion-in-dynamodb/
Preventing Accidental Table Deletion (sql)
Practice and test recovery from your backups!!
Let’s talk about timeouts, backoff & retries!
[Diagram: Users, App, Conn Pool, DB; four INSERTs in flight]
What happens if the DB “slows down”?
Timeout client side ?? Timeout backend side ??
[Diagram: User 1, App, Conn Pool, DB]
Timeout client side = 10s; Timeout backend side = default = Infinite
INSERT
Retry INSERT
Retry INSERT
Retry
ERROR: Failed to get connection from pool
https://docs.microsoft.com/en-us/dotnet/api/system.net.httpwebrequest.timeout
https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html
import requests
import timeout_decorator

# Abort the call if it takes longer than 5 seconds overall.
@timeout_decorator.timeout(5, timeout_exception=StopIteration)
def timed_get(url):
    return requests.get(url)
https://pypi.org/project/timeout-decorator/
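The same idea can be shown with nothing but the standard library, by setting an explicit timeout on the socket instead of relying on the default of waiting forever. This is a sketch under assumptions: `fetch_with_timeout` is a hypothetical helper, not part of any library, and a real client would usually set the timeout through its HTTP or database driver.

```python
import socket


def fetch_with_timeout(host: str, port: int, timeout_s: float) -> bytes:
    """Read one chunk from host:port, giving up after timeout_s seconds."""
    # The timeout applies to the connect and to every later recv on this socket.
    with socket.create_connection((host, port), timeout=timeout_s) as conn:
        try:
            return conn.recv(1024)
        except socket.timeout:
            # Fail fast and release the connection instead of holding it forever.
            raise TimeoutError(
                f"no response from {host}:{port} within {timeout_s}s"
            ) from None
```

Because the `with` block closes the socket on the way out, a slow backend costs the caller at most `timeout_s` seconds per call and never pins a connection indefinitely.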
How else could we have prevented the error?
[Diagram: User 1, Conn Pool, DB]
INSERT
Retry INSERT
Retry INSERT
Retry
ERROR: Failed to get connection from pool
Backing off between retries
[Diagram: User 1, Conn Pool, DB; labels: Releasing connections, Backoff]
Timeout client side = 10s; Timeout backend side = 10s
INSERT
Wait 2s before Retry
Wait 4s before Retry
Wait 8s before Retry
Wait 16s before Retry
Simple Exponential Backoff is not enough: Add Jitter
[Charts: No jitter, With jitter]
https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
Example: add 0-1000 ms of jitter
import random
import time
import requests

MAX_TRIES = 12

def get_item(self, url, n=1):
    try:
        return requests.get(url)
    except requests.RequestException:
        if n > MAX_TRIES:
            return None
        # Exponential backoff (2^n seconds) plus 0-1000 ms of random jitter.
        time.sleep((2 ** n) + (random.randint(0, 1000) / 1000.0))
        return self.get_item(url, n + 1)
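The recursive example above can also be written as a loop. The sketch below uses the "full jitter" variant described in the AWS Architecture Blog post linked earlier, where each wait is drawn uniformly between 0 and a capped exponential bound. The function name, the default values and the injectable `sleep` parameter are assumptions made for illustration.

```python
import random
import time


def retry_with_backoff(operation, max_tries=5, base=0.1, cap=10.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_tries attempts are used up."""
    for attempt in range(max_tries):
        try:
            return operation()
        except Exception:
            if attempt == max_tries - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: wait a random time in [0, min(cap, base * 2^attempt)].
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Capping the bound keeps late retries from sleeping for minutes, and passing `sleep` as a parameter lets tests run without real delays. Catching `Exception` is a simplification; production code would retry only on errors known to be transient.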
Idempotent operation
No additional effect if it is called more than
once with the same input parameters.
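Retries are only safe when the retried operation is idempotent. One common way to get there is an idempotency key supplied by the caller; the sketch below is a toy in-memory version, and the `PaymentService` class and its method names are hypothetical.

```python
class PaymentService:
    """Toy in-memory service; a real one would store the keys durably."""

    def __init__(self):
        self.balance = 0
        self._results = {}  # idempotency key -> result of the first call

    def deposit(self, idempotency_key: str, amount: int) -> int:
        """Apply the deposit once per key; repeated calls return the first result."""
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # replay: no additional effect
        self.balance += amount
        self._results[idempotency_key] = self.balance
        return self.balance


svc = PaymentService()
svc.deposit("req-1", 50)
svc.deposit("req-1", 50)  # a retry of the same request
print(svc.balance)
```

With the key in place, a client that timed out and does not know whether its first request succeeded can simply send it again.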
Let’s talk about health checking!
Probing for health
[Diagram: AWS Region with Availability zone 1 and Availability zone 2; an Auto Scaling group in each zone running Service A and Service B; dependencies: database, Email, Cluster]
Shallow health check
[Diagram: Instance; dependencies: Cache node (Cluster), Email, database]
Are you healthy? yes
Deep health check
[Diagram: Instance; dependencies: Cache node (Cluster), Email, database]
Are you healthy? yes
Are you healthy? yes, yes, yes, yes
Deep health check
[Diagram: Instance; dependencies: Cache node (Cluster), Email, database]
Are you healthy? no
Are you healthy? no, yes, yes, yes
Prioritize shallow health checks during hard times.
Cache.
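The difference between the two kinds of check, and the advice to cache, can be sketched as follows. This is an illustration only: the `HealthChecker` class, the probe callables and the 5-second TTL are assumptions, not something shown in the talk.

```python
import time


class HealthChecker:
    def __init__(self, dependency_probes, cache_ttl=5.0, clock=time.monotonic):
        self.dependency_probes = dependency_probes  # name -> callable returning bool
        self.cache_ttl = cache_ttl
        self.clock = clock
        self._cached = None  # (timestamp, result) of the last deep check

    def shallow(self) -> bool:
        """The process is up and can answer; no dependency is touched."""
        return True

    def deep(self) -> bool:
        """Probe every dependency, reusing a recent result to keep the check cheap."""
        now = self.clock()
        if self._cached is not None and now - self._cached[0] < self.cache_ttl:
            return self._cached[1]
        result = all(probe() for probe in self.dependency_probes.values())
        self._cached = (now, result)
        return result
```

Caching the deep result means a burst of health-check pings does not turn into a burst of extra load on a database that is already struggling.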
Let’s talk about load shedding.
Cheaply reject excess work
Be careful when selecting the right metric
Load Shedding
• Don’t be overly optimistic and take on more than you can.
• Find an operational metric to reject what you cannot take in.
• Favor cached and static content
• Prioritize ELB health check (shallow) pings
• In an overload situation you have precious resources, do not let any of it go to waste.
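A minimal load shedder can be built around a single operational metric. The sketch below uses the number of in-flight requests; that choice, the `LoadShedder` class and the 503 response are assumptions for illustration, and a real service would pick the metric that best tracks its own bottleneck.

```python
import threading


class LoadShedder:
    """Reject work once `max_in_flight` requests are already being processed."""

    def __init__(self, max_in_flight: int):
        self.max_in_flight = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def handle(self, request, handler, is_health_check=False):
        with self._lock:
            # Shallow health-check pings are always admitted, so the load balancer
            # does not mark an overloaded but living instance as dead.
            if not is_health_check and self._in_flight >= self.max_in_flight:
                return 503, "overloaded, retry later"  # cheap rejection
            self._in_flight += 1
        try:
            return 200, handler(request)
        finally:
            with self._lock:
                self._in_flight -= 1
```

The rejection path does no work beyond a counter comparison, which is the point: under overload, capacity goes to requests that can still finish in time.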
Let’s talk about resiliency (chaos) engineering
Fire Drills
GameDay at Amazon
Creating Resiliency Through Destruction
https://www.youtube.com/watch?v=zoz0ZjfrQ9s
Chaos engineering
https://github.com/Netflix/SimianArmy
“Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production.”
http://principlesofchaos.org
Failure injection
• Start small & build confidence
• Application level
• Host failure
• Resource attacks (CPU, memory, …)
• Network attacks (dependencies, latency, …)
• Region attacks
• “Paul” attack
https://www.gremlin.com
https://github.com/Netflix/SimianArmy
https://chaostoolkit.org
Phases of Chaos Engineering
STEADY STATE → HYPOTHESIS → RUN EXPERIMENT → VERIFY → FIX!
https://aws.amazon.com/wellarchitected
Thank you!
@cobusbernard
cobusbernard
cobusbernard
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ..."How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
 
AWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With ServerlessAWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With Serverless
 
DevConZM - Modern Applications Development in the Cloud
DevConZM - Modern Applications Development in the CloudDevConZM - Modern Applications Development in the Cloud
DevConZM - Modern Applications Development in the Cloud
 
Cloud Migration Workshop
Cloud Migration WorkshopCloud Migration Workshop
Cloud Migration Workshop
 
Secure and Fast microVM for Serverless Computing using Firecracker
Secure and Fast microVM for Serverless Computing using FirecrackerSecure and Fast microVM for Serverless Computing using Firecracker
Secure and Fast microVM for Serverless Computing using Firecracker
 
建構全球跨區域 x Active-Active架構的無伺服器化後台服務
建構全球跨區域  x Active-Active架構的無伺服器化後台服務建構全球跨區域  x Active-Active架構的無伺服器化後台服務
建構全球跨區域 x Active-Active架構的無伺服器化後台服務
 
Creating Serverless apps for NASA in GovCloud
Creating Serverless apps for NASA in GovCloudCreating Serverless apps for NASA in GovCloud
Creating Serverless apps for NASA in GovCloud
 
AWS Lambda 내부 동작 방식 및 활용 방법 자세히 살펴 보기 - 김일호 솔루션즈 아키텍트 매니저, AWS :: AWS Summit ...
AWS Lambda 내부 동작 방식 및 활용 방법 자세히 살펴 보기 - 김일호 솔루션즈 아키텍트 매니저, AWS :: AWS Summit ...AWS Lambda 내부 동작 방식 및 활용 방법 자세히 살펴 보기 - 김일호 솔루션즈 아키텍트 매니저, AWS :: AWS Summit ...
AWS Lambda 내부 동작 방식 및 활용 방법 자세히 살펴 보기 - 김일호 솔루션즈 아키텍트 매니저, AWS :: AWS Summit ...
 
Tools for building your Startup on AWS
Tools for building your Startup on AWSTools for building your Startup on AWS
Tools for building your Startup on AWS
 
How to build scalable and resilient applications in the cloud - AWS Summit Ca...
How to build scalable and resilient applications in the cloud - AWS Summit Ca...How to build scalable and resilient applications in the cloud - AWS Summit Ca...
How to build scalable and resilient applications in the cloud - AWS Summit Ca...
 
Migrate and Modernize Your Database
Migrate and Modernize Your DatabaseMigrate and Modernize Your Database
Migrate and Modernize Your Database
 
Containers on AWS
Containers on AWSContainers on AWS
Containers on AWS
 
All Databases Are Equal, But Some Databases Are More Equal than Others: How t...
All Databases Are Equal, But Some Databases Are More Equal than Others: How t...All Databases Are Equal, But Some Databases Are More Equal than Others: How t...
All Databases Are Equal, But Some Databases Are More Equal than Others: How t...
 
Firecracker: Secure and fast microVMs for serverless computing - SEP316 - AWS...
Firecracker: Secure and fast microVMs for serverless computing - SEP316 - AWS...Firecracker: Secure and fast microVMs for serverless computing - SEP316 - AWS...
Firecracker: Secure and fast microVMs for serverless computing - SEP316 - AWS...
 

More from Cobus Bernard

London Microservices Meetup: Lessons learnt adopting microservices
London Microservices  Meetup: Lessons learnt adopting microservicesLondon Microservices  Meetup: Lessons learnt adopting microservices
London Microservices Meetup: Lessons learnt adopting microservicesCobus Bernard
 
AWS SSA Webinar 34 - Getting started with databases on AWS - Managing DBs wit...
AWS SSA Webinar 34 - Getting started with databases on AWS - Managing DBs wit...AWS SSA Webinar 34 - Getting started with databases on AWS - Managing DBs wit...
AWS SSA Webinar 34 - Getting started with databases on AWS - Managing DBs wit...Cobus Bernard
 
AWS SSA Webinar 33 - Getting started with databases on AWS Amazon DynamoDB
AWS SSA Webinar 33 - Getting started with databases on AWS Amazon DynamoDBAWS SSA Webinar 33 - Getting started with databases on AWS Amazon DynamoDB
AWS SSA Webinar 33 - Getting started with databases on AWS Amazon DynamoDBCobus Bernard
 
AWS SSA Webinar 32 - Getting Started with databases on AWS: Choosing the righ...
AWS SSA Webinar 32 - Getting Started with databases on AWS: Choosing the righ...AWS SSA Webinar 32 - Getting Started with databases on AWS: Choosing the righ...
AWS SSA Webinar 32 - Getting Started with databases on AWS: Choosing the righ...Cobus Bernard
 
AWS SSA Webinar 30 - Getting Started with AWS - Infrastructure as Code - Terr...
AWS SSA Webinar 30 - Getting Started with AWS - Infrastructure as Code - Terr...AWS SSA Webinar 30 - Getting Started with AWS - Infrastructure as Code - Terr...
AWS SSA Webinar 30 - Getting Started with AWS - Infrastructure as Code - Terr...Cobus Bernard
 
AWS SSA Webinar 28 - Getting Started with AWS - Infrastructure as Code
AWS SSA Webinar 28 - Getting Started with AWS - Infrastructure as CodeAWS SSA Webinar 28 - Getting Started with AWS - Infrastructure as Code
AWS SSA Webinar 28 - Getting Started with AWS - Infrastructure as CodeCobus Bernard
 
AWS Webinar 24 - Getting Started with AWS - Understanding DR
AWS Webinar 24 - Getting Started with AWS - Understanding DRAWS Webinar 24 - Getting Started with AWS - Understanding DR
AWS Webinar 24 - Getting Started with AWS - Understanding DRCobus Bernard
 
AWS Webinar 23 - Getting Started with AWS - Understanding total cost of owner...
AWS Webinar 23 - Getting Started with AWS - Understanding total cost of owner...AWS Webinar 23 - Getting Started with AWS - Understanding total cost of owner...
AWS Webinar 23 - Getting Started with AWS - Understanding total cost of owner...Cobus Bernard
 
AWS SSA Webinar 21 - Getting Started with Data lakes on AWS
AWS SSA Webinar 21 - Getting Started with Data lakes on AWSAWS SSA Webinar 21 - Getting Started with Data lakes on AWS
AWS SSA Webinar 21 - Getting Started with Data lakes on AWSCobus Bernard
 
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSAWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSCobus Bernard
 
AWS SSA Webinar 19 - Getting Started with Multi-Region Architecture: Services
AWS SSA Webinar 19 - Getting Started with Multi-Region Architecture: ServicesAWS SSA Webinar 19 - Getting Started with Multi-Region Architecture: Services
AWS SSA Webinar 19 - Getting Started with Multi-Region Architecture: ServicesCobus Bernard
 
AWS SSA Webinar 18 - Getting Started with Multi-Region Architecture: Data
AWS SSA Webinar 18 - Getting Started with Multi-Region Architecture: DataAWS SSA Webinar 18 - Getting Started with Multi-Region Architecture: Data
AWS SSA Webinar 18 - Getting Started with Multi-Region Architecture: DataCobus Bernard
 
AWS EMEA Online Summit - Live coding with containers
AWS EMEA Online Summit - Live coding with containersAWS EMEA Online Summit - Live coding with containers
AWS EMEA Online Summit - Live coding with containersCobus Bernard
 
AWS EMEA Online Summit - Blending Spot and On-Demand instances to optimizing ...
AWS EMEA Online Summit - Blending Spot and On-Demand instances to optimizing ...AWS EMEA Online Summit - Blending Spot and On-Demand instances to optimizing ...
AWS EMEA Online Summit - Blending Spot and On-Demand instances to optimizing ...Cobus Bernard
 
AWS SSA Webinar 17 - Getting Started on AWS with Amazon RDS
AWS SSA Webinar 17 - Getting Started on AWS with Amazon RDSAWS SSA Webinar 17 - Getting Started on AWS with Amazon RDS
AWS SSA Webinar 17 - Getting Started on AWS with Amazon RDSCobus Bernard
 
AWS SSA Webinar 16 - Getting Started on AWS with Amazon EC2
AWS SSA Webinar 16 - Getting Started on AWS with Amazon EC2AWS SSA Webinar 16 - Getting Started on AWS with Amazon EC2
AWS SSA Webinar 16 - Getting Started on AWS with Amazon EC2Cobus Bernard
 
AWS SSA Webinar 15 - Getting started on AWS with Containers: Amazon EKS
AWS SSA Webinar 15 - Getting started on AWS with Containers: Amazon EKSAWS SSA Webinar 15 - Getting started on AWS with Containers: Amazon EKS
AWS SSA Webinar 15 - Getting started on AWS with Containers: Amazon EKSCobus Bernard
 
AWS SSA Webinar 13 - Getting started on AWS with Containers: Amazon ECS
AWS SSA Webinar 13 - Getting started on AWS with Containers: Amazon ECSAWS SSA Webinar 13 - Getting started on AWS with Containers: Amazon ECS
AWS SSA Webinar 13 - Getting started on AWS with Containers: Amazon ECSCobus Bernard
 
AWS SSA Webinar 11 - Getting started on AWS: Security
AWS SSA Webinar 11 - Getting started on AWS: SecurityAWS SSA Webinar 11 - Getting started on AWS: Security
AWS SSA Webinar 11 - Getting started on AWS: SecurityCobus Bernard
 
AWS SSA Webinar 12 - Getting started on AWS with Containers
AWS SSA Webinar 12 - Getting started on AWS with ContainersAWS SSA Webinar 12 - Getting started on AWS with Containers
AWS SSA Webinar 12 - Getting started on AWS with ContainersCobus Bernard
 

More from Cobus Bernard (20)

London Microservices Meetup: Lessons learnt adopting microservices
London Microservices  Meetup: Lessons learnt adopting microservicesLondon Microservices  Meetup: Lessons learnt adopting microservices
London Microservices Meetup: Lessons learnt adopting microservices
 
AWS SSA Webinar 34 - Getting started with databases on AWS - Managing DBs wit...
AWS SSA Webinar 34 - Getting started with databases on AWS - Managing DBs wit...AWS SSA Webinar 34 - Getting started with databases on AWS - Managing DBs wit...
AWS SSA Webinar 34 - Getting started with databases on AWS - Managing DBs wit...
 
AWS SSA Webinar 33 - Getting started with databases on AWS Amazon DynamoDB
AWS SSA Webinar 33 - Getting started with databases on AWS Amazon DynamoDBAWS SSA Webinar 33 - Getting started with databases on AWS Amazon DynamoDB
AWS SSA Webinar 33 - Getting started with databases on AWS Amazon DynamoDB
 
AWS SSA Webinar 32 - Getting Started with databases on AWS: Choosing the righ...
AWS SSA Webinar 32 - Getting Started with databases on AWS: Choosing the righ...AWS SSA Webinar 32 - Getting Started with databases on AWS: Choosing the righ...
AWS SSA Webinar 32 - Getting Started with databases on AWS: Choosing the righ...
 
AWS SSA Webinar 30 - Getting Started with AWS - Infrastructure as Code - Terr...
AWS SSA Webinar 30 - Getting Started with AWS - Infrastructure as Code - Terr...AWS SSA Webinar 30 - Getting Started with AWS - Infrastructure as Code - Terr...
AWS SSA Webinar 30 - Getting Started with AWS - Infrastructure as Code - Terr...
 
AWS SSA Webinar 28 - Getting Started with AWS - Infrastructure as Code
AWS SSA Webinar 28 - Getting Started with AWS - Infrastructure as CodeAWS SSA Webinar 28 - Getting Started with AWS - Infrastructure as Code
AWS SSA Webinar 28 - Getting Started with AWS - Infrastructure as Code
 
AWS Webinar 24 - Getting Started with AWS - Understanding DR
AWS Webinar 24 - Getting Started with AWS - Understanding DRAWS Webinar 24 - Getting Started with AWS - Understanding DR
AWS Webinar 24 - Getting Started with AWS - Understanding DR
 
AWS Webinar 23 - Getting Started with AWS - Understanding total cost of owner...
AWS Webinar 23 - Getting Started with AWS - Understanding total cost of owner...AWS Webinar 23 - Getting Started with AWS - Understanding total cost of owner...
AWS Webinar 23 - Getting Started with AWS - Understanding total cost of owner...
 
AWS SSA Webinar 21 - Getting Started with Data lakes on AWS
AWS SSA Webinar 21 - Getting Started with Data lakes on AWSAWS SSA Webinar 21 - Getting Started with Data lakes on AWS
AWS SSA Webinar 21 - Getting Started with Data lakes on AWS
 
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSAWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
 
AWS SSA Webinar 19 - Getting Started with Multi-Region Architecture: Services
AWS SSA Webinar 19 - Getting Started with Multi-Region Architecture: ServicesAWS SSA Webinar 19 - Getting Started with Multi-Region Architecture: Services
AWS SSA Webinar 19 - Getting Started with Multi-Region Architecture: Services
 
AWS SSA Webinar 18 - Getting Started with Multi-Region Architecture: Data
AWS SSA Webinar 18 - Getting Started with Multi-Region Architecture: DataAWS SSA Webinar 18 - Getting Started with Multi-Region Architecture: Data
AWS SSA Webinar 18 - Getting Started with Multi-Region Architecture: Data
 
AWS EMEA Online Summit - Live coding with containers
AWS EMEA Online Summit - Live coding with containersAWS EMEA Online Summit - Live coding with containers
AWS EMEA Online Summit - Live coding with containers
 
AWS EMEA Online Summit - Blending Spot and On-Demand instances to optimizing ...
AWS EMEA Online Summit - Blending Spot and On-Demand instances to optimizing ...AWS EMEA Online Summit - Blending Spot and On-Demand instances to optimizing ...
AWS EMEA Online Summit - Blending Spot and On-Demand instances to optimizing ...
 
AWS SSA Webinar 17 - Getting Started on AWS with Amazon RDS
AWS SSA Webinar 17 - Getting Started on AWS with Amazon RDSAWS SSA Webinar 17 - Getting Started on AWS with Amazon RDS
AWS SSA Webinar 17 - Getting Started on AWS with Amazon RDS
 
AWS SSA Webinar 16 - Getting Started on AWS with Amazon EC2
AWS SSA Webinar 16 - Getting Started on AWS with Amazon EC2AWS SSA Webinar 16 - Getting Started on AWS with Amazon EC2
AWS SSA Webinar 16 - Getting Started on AWS with Amazon EC2
 
AWS SSA Webinar 15 - Getting started on AWS with Containers: Amazon EKS
AWS SSA Webinar 15 - Getting started on AWS with Containers: Amazon EKSAWS SSA Webinar 15 - Getting started on AWS with Containers: Amazon EKS
AWS SSA Webinar 15 - Getting started on AWS with Containers: Amazon EKS
 
AWS SSA Webinar 13 - Getting started on AWS with Containers: Amazon ECS
AWS SSA Webinar 13 - Getting started on AWS with Containers: Amazon ECSAWS SSA Webinar 13 - Getting started on AWS with Containers: Amazon ECS
AWS SSA Webinar 13 - Getting started on AWS with Containers: Amazon ECS
 
AWS SSA Webinar 11 - Getting started on AWS: Security
AWS SSA Webinar 11 - Getting started on AWS: SecurityAWS SSA Webinar 11 - Getting started on AWS: Security
AWS SSA Webinar 11 - Getting started on AWS: Security
 
AWS SSA Webinar 12 - Getting started on AWS with Containers
AWS SSA Webinar 12 - Getting started on AWS with ContainersAWS SSA Webinar 12 - Getting started on AWS with Containers
AWS SSA Webinar 12 - Getting started on AWS with Containers
 

AWS DevDay Berlin - Resiliency and availability design patterns for the cloud

  • 1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. BERLIN, 25.10.19. Resiliency and availability design patterns for the cloud. Cobus Bernard, Senior Technical Evangelist, Amazon Web Services. @cobusbernard. BAR 3
  • 2. Can you guess what will happen?
  • 3. Distributed Systems are hard
  • 4. “Failures are a given and everything will eventually fail over time.” Werner Vogels, CTO, Amazon.com
  • 5. Resiliency: Ability for a system to handle and eventually recover from unexpected conditions
  • 6. Partial failure mode
  • 7. Why do we build resilient software systems?
  • 8. The cost of downtime. Annual Fortune 1000 application downtime costs (IDC): $1.25 to $2.5B; Average cost of a data breach (Ponemon Institute): $3.6M; Cost/hr of a critical application failure (IDC): $500K to $1M; Average cost/hr of downtime (Ponemon Institute): $474K; Average cost per lost or stolen record (Ponemon Institute): $141
  • 9. How do we build resilient software systems?
  • 10. People, Application, Network & Data, Infrastructure
  • 11. Let’s talk about Availability
  • 12. System availability: $\text{Availability} = \frac{\text{Normal Operation Time}}{\text{Total Time}} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$, where MTTR is Mean Time To Repair and MTBF is Mean Time Between Failure
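As a rough worked example of the formula on slide 12, the sketch below plugs in hypothetical figures (an MTBF of 1,000 hours and two MTTR values; none of these numbers come from the talk) to show how shortening repair time raises availability:

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures for illustration only, not taken from the slides.
baseline = availability(mtbf_hours=1000, mttr_hours=1.0)
faster_repair = availability(mtbf_hours=1000, mttr_hours=0.1)

print(f"MTTR 1 h:   {baseline:.4%}")
print(f"MTTR 0.1 h: {faster_repair:.4%}")
```

With the failure rate unchanged, cutting repair time alone moves the result closer to 100%.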
  • 13. Reading homework
  • 14. Availability in parallel: $A = 1 - (1 - A_x)^2$ (Part X, Part X)
  • 15. Availability in parallel. Component, availability, downtime: X, 99% (2-nines), 3 days 15 hours; Two X in parallel, 99.99% (4-nines), 52 minutes; Three X in parallel, 99.9999% (6-nines), 31 seconds
  • 16. Component redundancy increases availability significantly!
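The table on slide 15 can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the downtime figures are per 365-day year, that the copies fail independently, and that any single healthy copy is enough to serve traffic; the function names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def parallel_availability(a_x: float, n: int) -> float:
    """Availability of n redundant copies of X: A = 1 - (1 - A_x)^n."""
    return 1 - (1 - a_x) ** n


def downtime_seconds_per_year(availability: float) -> float:
    """Expected unavailable time over one 365-day year."""
    return (1 - availability) * SECONDS_PER_YEAR


for copies in (1, 2, 3):
    a = parallel_availability(0.99, copies)
    seconds = downtime_seconds_per_year(a)
    print(f"{copies} x X: {a:.6%} available, about {seconds / 60:.1f} minutes down per year")
```

The independence assumption matters: copies that share a failure domain (same rack, same deployment, same bad config) will not reach these numbers.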
  • 17. AWS Global Infrastructure: 22 Regions with 69 Availability Zones; 3 Regions coming soon: Cape Town, Jakarta, and Milan; 100 Gbps redundant network; 99.99% SLA
  • 18. Fully-scaled Availability Zone
  • 19. Highly redundant regional network
  • 20. AWS Region and availability zones. Region with Availability zones a, b, and c. 1 or more data centers per AZ; 2 or more AZs per region (new regions min 3)
  • 21. How about a global architecture?
  • 22. Once upon a time … Origin
  • 23. And Now … Origin ~300ms
  • 24. Improve latency for end-users (diagram: Origin, Origin)
  • 25. Improve availability and disaster recovery. Diagram: Applications in US West (Service 1, Service 2, Service 3, Service 4), Applications in US East (Service 1, Service 2, Service 3, Service 4), Users from San Francisco, Users from New York
  • 26. So should we go for a global architecture?
  • 27. Perfect your regional architecture first!
  • 28. Let’s talk about Multi-AZ
  • 29. Multi-AZ architecture. Diagram: Region with Availability zones a, b, and c; Instances in each zone; DB instance and DB instance standby; Elastic Load Balancing (ELB)
  • 30. Multi-AZ architecture. Diagram: Region with Availability zones a, b, and c; Instances in each zone; DB instance and DB instance standby; Elastic Load Balancing (ELB)
  • 31. Multi-AZ architecture. Diagram: Region with Availability zones a, b, and c; Instances in each zone; DB instance and DB instance standby; Elastic Load Balancing (ELB)
  • 32. Multi-AZ architecture. Diagram: Region with Availability zones a, b, and c; Instances in each zone; DB instance and DB instance new master; Elastic Load Balancing (ELB)
  • 33. Multi-AZ architecture: Enables fault-tolerant applications; AWS regional services designed to withstand AZ failures; Leveraged by AWS regional services such as Amazon S3, Amazon DynamoDB, Amazon Aurora, Amazon ELBs, etc. Diagram: Region with Availability zones a, b, and c; Instances in each zone; DB instance and DB instance standby; Elastic Load Balancing (ELB)
  • 34. Let’s talk about auto scaling
  • 35. Auto-Scaling: Fixed, Variable
  • 36. Auto-scaling for self-healing. Diagram: AWS Region; Availability zone 1 and Availability zone 2; Auto Scaling group; Elastic Load Balancing (ELB); one instance marked X
  • 37. Let’s talk about the AWS responsibility models
  • 38. AWS operational responsibility models (On-Premises to Cloud, Less to More). Compute: Virtual Machine, EC2, Elastic Beanstalk, AWS Lambda, Fargate. Databases: MySQL, MySQL on EC2, RDS MySQL, RDS Aurora, Aurora Serverless, DynamoDB. Storage: Storage, S3. Messaging: ESBs, Amazon MQ, Kinesis, SQS / SNS. Analytics: Hadoop, Hadoop on EC2, EMR, Elasticsearch Service, Athena, Firehose
  • 39. Let’s talk about databases
  • 40. Common resiliency issues with databases? Replication, backups, scaling
  • 41.
  • 42. Diagram: PutItem. AZ 1, AZ 2, AZ 3; Network; RR nodes; Storage Node; Leader
  • 43. Diagram: GetItem. AZ 1, AZ 2, AZ 3; Network; RR nodes; Storage Node; Leader
  • 44. Amazon.com, Nike, Netflix, Duolingo, Lyft, Airbnb, Samsung, Toyota, and Capital One depend on the scale and performance of DynamoDB to support their workloads. 10 trillion requests per day; 20 million requests per second
  • 45. Purpose-built databases. Relational: Amazon RDS (Aurora, Commercial, Community); Key-value: DynamoDB; Document: DocumentDB; In-memory: ElastiCache; Graph: Neptune; Time-series: Timestream; Ledger: QLDB
  • 46. Read-Write separation. Diagram: Master; three Read Replicas; three Instances
  • 47. Database Federation. Diagram: Users DB and Products DB, each with a Master and a (Read) Replica; three Instances
  • 48. Database Sharding. User to ShardID: 002345 → A, 002346 → B, 002347 → C, 002348 → B, 002349 → A. Diagram: shards A, B, and C, each with a Master and a (Read) Replica; three Instances
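The lookup table on slide 48 suggests directory-based sharding: the application consults a user-to-shard mapping and then picks the master for writes or a replica for reads. A minimal sketch follows, using the five rows from the slide; the endpoint host names and the `endpoint_for` helper are hypothetical and only there to make the routing concrete.

```python
# Shard directory copied from the slide: user ID -> shard ID.
SHARD_DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}

# Hypothetical endpoints, one master and one read replica per shard.
SHARD_ENDPOINTS = {
    shard: {
        "master": f"shard-{shard.lower()}-master.example.internal",
        "replica": f"shard-{shard.lower()}-replica.example.internal",
    }
    for shard in ("A", "B", "C")
}


def endpoint_for(user_id: str, write: bool) -> str:
    """Route writes to the shard's master and reads to its replica."""
    shard = SHARD_DIRECTORY[user_id]  # KeyError means the user is not mapped yet
    role = "master" if write else "replica"
    return SHARD_ENDPOINTS[shard][role]


print(endpoint_for("002346", write=True))
print(endpoint_for("002349", write=False))
```

Reading from a replica assumes the caller can tolerate replication lag; reads that must see the latest write would go to the master instead.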
  • 49. Let’s talk about backups
  • 50.
  • 51. AWS Backup service
  • 52. Preventing Accidental Table Deletion https://aws.amazon.com/blogs/database/preventing-accidental-table-deletion-in-dynamodb/
  • 53. Preventing Accidental Table Deletion (sql)
  • 54. Practice and test recovery from your backups!!
  • 55. Let’s talk about timeouts, backoff & retries!
  • 56. Diagram: Users, App, Conn Pool, DB; four INSERTs. What happens if the DB “slows down”? Timeout client side? Timeout backend side?
  • 57. Diagram: User 1, App, Conn Pool, DB. Timeout client side = 10s; Timeout backend side = default = Infinite. INSERT, Retry INSERT, Retry INSERT, Retry. ERROR: Failed to get connection from pool
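The failure on slide 57 can be imitated with a toy pool. In this sketch the client gives up and retries, but the backend call has no timeout, so each abandoned INSERT keeps its connection and the pool eventually has nothing left to hand out. `ConnectionPool` and `PoolExhausted` are hypothetical stand-ins built on the standard library `queue.Queue`, not a real database driver.

```python
import queue


class PoolExhausted(Exception):
    """Raised when no connection becomes free within the acquire timeout."""


class ConnectionPool:
    def __init__(self, size: int):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self, timeout_s: float) -> str:
        try:
            return self._free.get(timeout=timeout_s)
        except queue.Empty:
            raise PoolExhausted("Failed to get connection from pool") from None

    def release(self, conn: str) -> None:
        self._free.put(conn)


pool = ConnectionPool(size=2)

# Two INSERTs hang on a slow DB; with an infinite backend timeout they never release.
held = [pool.acquire(timeout_s=0.01), pool.acquire(timeout_s=0.01)]

try:
    pool.acquire(timeout_s=0.01)  # the client's retry needs a third connection
except PoolExhausted as exc:
    print("ERROR:", exc)

# A finite backend timeout would abort the stuck query and free its connection.
pool.release(held.pop())
print("acquired after release:", pool.acquire(timeout_s=0.01))
```

The point of the sketch is that a client-side timeout alone does not free backend resources; both sides need a bound.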
  • 58. https://docs.microsoft.com/en-us/dotnet/api/system.net.httpwebrequest.timeout
  • 59.
  • 60. https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html
  • 61. https://pypi.org/project/timeout-decorator/
        @timeout_decorator.timeout(5, timeout_exception=StopIteration)
        def timed_get(url):
            return requests.get(url)
  • 62. How else could we have prevented the error? Diagram: User 1, Conn Pool, DB. INSERT, Retry INSERT, Retry INSERT, Retry. ERROR: Failed to get connection from pool
  • 63. Backoff: backing off between retries, releasing connections. Diagram: User 1, Conn Pool, DB. Timeout client side = 10s; Timeout backend side = 10s. INSERT; Wait 2s before Retry; Wait 4s before Retry; Wait 8s before Retry; Wait 16s before Retry
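The waits on slide 63 (2s, 4s, 8s, 16s) follow a doubling schedule. Below is a minimal sketch of a retry wrapper that produces that schedule; `retry_with_backoff` and its parameters are illustrative names, and the `sleep` argument is injectable so the demo records the waits instead of actually pausing.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 4,
    base_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation; after each failure wait base * 2^attempt seconds, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")


# Demo with a fake sleep: the operation fails twice, then succeeds.
waits = []
calls = {"count": 0}


def flaky_insert() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("DB slow")
    return "inserted"


print(retry_with_backoff(flaky_insert, sleep=waits.append), waits)
```

For the "releasing connections" part of the slide, the operation should acquire and release its connection inside each attempt, so that nothing is held while the caller waits.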
  • 64. Simple Exponential Backoff is not enough: Add Jitter. Charts: No jitter, With jitter. https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
  • 65. Example: add jitter 0-1000ms
        import random
        import time
        import requests

        def get_item(self, url, n=1):
            # Method excerpt: retry with exponential backoff plus 0-1000 ms of jitter.
            MAX_TRIES = 12
            try:
                res = requests.get(url)
            except requests.RequestException:
                if n > MAX_TRIES:
                    return None  # give up once the retry budget is spent
                n += 1
                time.sleep((2 ** n) + (random.randint(0, 1000) / 1000.0))
                return self.get_item(url, n)
            else:
                return res
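The blog post linked on slide 64 also describes a "full jitter" variant, where the whole delay is drawn at random between zero and a capped exponential ceiling, rather than adding a small random amount on top as in the slide's example. A minimal sketch of that delay calculation follows; the function name, the 1-second base, and the 30-second cap are assumptions chosen for illustration.

```python
import random
from typing import Optional


def full_jitter_delay(
    attempt: int,
    base_s: float = 1.0,
    cap_s: float = 30.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Full jitter: a uniform random delay in [0, min(cap, base * 2^attempt)]."""
    rng = rng if rng is not None else random.Random()
    ceiling = min(cap_s, base_s * 2 ** attempt)
    return rng.uniform(0, ceiling)


# A seeded generator keeps the demo repeatable; real callers would omit it.
demo_rng = random.Random(42)
for attempt in range(6):
    print(attempt, round(full_jitter_delay(attempt, rng=demo_rng), 3))
```

Spreading the delays across the whole interval keeps many clients that failed at the same moment from retrying at the same moment.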
  • 66. Idempotent operation: No additional effect if it is called more than once with the same input parameters.
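Idempotency is what makes the retries above safe to send. One common way to get it is a caller-supplied idempotency key: the service remembers the result for each key and returns it again instead of repeating the side effect. The sketch below is a hypothetical in-memory `PaymentService`, not an AWS API.

```python
class PaymentService:
    """Toy service whose charge() is idempotent per idempotency key."""

    def __init__(self) -> None:
        self._results = {}      # idempotency key -> stored result
        self.charges_made = 0   # counts real side effects

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A repeated key returns the stored result without charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": self.charges_made, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
first = service.charge("order-1001", 2500)
retry = service.charge("order-1001", 2500)  # e.g. the client timed out and retried
print(first == retry, service.charges_made)
```

A production version would persist the keys and expire them; this sketch only shows the contract.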
  • 67. Let’s talk about health checking!
  • 68. Probing for health. Diagram: AWS Region; Availability zone 1 and Availability zone 2, each with an Auto Scaling group running Service A and Service B; database; Email; Cluster
  • 69. Shallow health check. Diagram: Instance, Cache node, Email, database, Cluster. “Are you healthy?” Instance: yes
  • 70. Shallow health check. Diagram: Instance, Cache node, Email, database, Cluster. “Are you healthy?” Instance: yes
  • 71. Deep health check. Diagram: Instance, Cache node, Email, database, Cluster. “Are you healthy?” Instance: yes. Dependency checks: yes, yes, yes, yes
  • 72. Deep health check. Diagram: Instance, Cache node, Email, database, Cluster. “Are you healthy?” Instance: no. Dependency checks: no, yes, yes, yes
  • 73. Prioritize shallow health checks during hard times. Cache.
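The difference between the two checks, and the "Cache." advice on slide 73, can be sketched as follows. The shallow check only reports that the process can answer; the deep check probes each dependency and caches its verdict for a short time so frequent probes do not add load. `HealthChecker`, the 5-second TTL, and the dependency callables are assumptions for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class HealthChecker:
    def __init__(
        self,
        dependencies: Dict[str, Callable[[], bool]],
        ttl_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._dependencies = dependencies
        self._ttl_s = ttl_s
        self._clock = clock
        self._cached: Optional[Tuple[float, bool]] = None  # (checked_at, healthy)

    def shallow(self) -> bool:
        """Answers only: is this process up and able to respond?"""
        return True

    def deep(self) -> bool:
        """Probe every dependency, reusing a recent verdict when one exists."""
        now = self._clock()
        if self._cached is not None and now - self._cached[0] < self._ttl_s:
            return self._cached[1]
        healthy = all(self._probe(check) for check in self._dependencies.values())
        self._cached = (now, healthy)
        return healthy

    @staticmethod
    def _probe(check: Callable[[], bool]) -> bool:
        try:
            return bool(check())
        except Exception:
            return False  # a probe that raises counts as unhealthy


checker = HealthChecker({"database": lambda: True, "cache node": lambda: False})
print(checker.shallow(), checker.deep())
```

A load balancer wired to the deep check would take this instance out of rotation because one dependency is down, which is why the slide suggests favoring the shallow check when the system is under stress.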
  • 74. Let’s talk about load shedding.
  • 75.
  • 76.
  • 77. Cheaply reject excess work
  • 78.
  • 79. Be careful when selecting the right metric
  • 80. Load Shedding: Don’t be overly optimistic and take on more than you can. Find an operational metric to reject what you cannot take in. Favor cached and static content. Prioritize ELB health check (shallow) pings. In an overload situation you have precious resources; do not let any of it go to waste.
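A minimal sketch of these points, assuming the chosen operational metric is the number of in-flight requests (the slides do not name a specific metric) and a hypothetical `/health` path for the shallow ping: excess work gets a cheap 503 before any real processing starts, while health pings are always answered.

```python
import threading
from typing import Callable, Tuple


class LoadShedder:
    """Admits work only while in-flight requests stay under a fixed limit."""

    def __init__(self, max_in_flight: int) -> None:
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        with self._lock:
            if self._in_flight >= self._max:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        with self._lock:
            self._in_flight -= 1


def handle(shedder: LoadShedder, path: str, work: Callable[[], str]) -> Tuple[int, str]:
    if path == "/health":
        return 200, "ok"  # shallow health pings bypass shedding
    if not shedder.try_acquire():
        return 503, "overloaded"  # cheap rejection: no work has been started
    try:
        return 200, work()
    finally:
        shedder.release()


shedder = LoadShedder(max_in_flight=1)
shedder.try_acquire()  # simulate one request already being processed
print(handle(shedder, "/orders", lambda: "created"))
print(handle(shedder, "/health", lambda: "unused"))
```

The limit itself should come from load testing; a guess that is too high sheds nothing, and one that is too low rejects work the service could have handled.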
  • 81. Let’s talk about resiliency (chaos) engineering
  • 82. Fire Drills
  • 83. GameDay at Amazon: Creating Resiliency Through Destruction https://www.youtube.com/watch?v=zoz0ZjfrQ9s
  • 84. Chaos engineering https://github.com/Netflix/SimianArmy
  • 85. “Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production.” http://principlesofchaos.org
  • 86. Failure injection: Start small & build confidence; Application level; Host failure; Resource attacks (CPU, memory, …); Network attacks (dependencies, latency, …); Region attacks; “Paul” attack. https://www.gremlin.com https://github.com/Netflix/SimianArmy https://chaostoolkit.org
  • 87. Phases of Chaos Engineering: Steady state, Hypothesis, Run experiment, Verify, Fix!
  • 88. https://aws.amazon.com/wellarchitected
  • 89. Thank you! @cobusbernard