Multi-Region Active-Active Architecture
楊海俊
Solutions Architect
Session objectives
1. Understand how to think about application availability
2. Understand how AWS makes services highly-available
3. Understand why you might need Multi-Region Active-Active architecture
4. Review AWS services that are useful for Multi-Region Active-Active design
Understand how to think about
application availability
Where are we?
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Calculating availability with hard dependencies
Application: <= 90%
Dependency 1: 99%
Dependency 2: 95%
Dependency 3: 90%
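As a rough sketch of the arithmetic behind this slide: if the three dependencies fail independently (an assumption, not something the slide states), the application's availability is the product of their availabilities, which is always at or below the weakest one. The function name `serial_availability` is made up for this illustration.

```python
from math import prod


def serial_availability(dependencies):
    """Availability of an application that needs every hard dependency to be up.

    Assumes the dependencies fail independently of each other.
    """
    return prod(dependencies)


# Figures from the slide: 99%, 95%, and 90%.
deps = [0.99, 0.95, 0.90]
app = serial_availability(deps)
print(f"application availability: {app:.2%}")
print(f"weakest dependency:       {min(deps):.2%}")
```

The product comes out below the 90% of the weakest dependency, which is why the slide shows "<= 90%" for the application.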
“A chain is only as strong as its
weakest link.”
Calculating availability with redundant
components
Application
Component 1 Component 1 Replica
90% 90%
~99%
Assuming
instantaneous
failover!
Let us understand
intuitively
“Component redundancy
increases availability
significantly!”
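A minimal sketch of why two 90% components give roughly 99%: the application is down only when every replica is down at the same time. This assumes independent failures and instantaneous failover, as the slide notes; `redundant_availability` is an illustrative name.

```python
def redundant_availability(component, replicas):
    """Availability of a component deployed as `replicas` redundant copies.

    Assumes independent failures and instantaneous failover, so the service
    is unavailable only when all copies are unavailable at once.
    """
    return 1 - (1 - component) ** replicas


for copies in (1, 2, 3):
    print(copies, f"{redundant_availability(0.90, copies):.3%}")
```

Each extra replica multiplies the remaining unavailability by another 10% in this example, so the gain is large at first and then tapers off.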
“Nines” of availability
Reliability Pillar whitepaper (Nov 2017)
Availability | Max Disruption (per year) | Application Categories
99% | 3 days 15 hours | Batch processing, data extraction, transfer, and load jobs
99.9% | 8 hours 45 minutes | Internal tools like knowledge management, project tracking
99.95% | 4 hours 22 minutes | Online commerce, point of sale
99.99% | 52 minutes | Video delivery, ridesharing services, broadcast systems
99.999% | 5 minutes | ATM transactions, telecommunications systems
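The disruption figures in the table can be reproduced from the availability percentage alone. The sketch below assumes a 365-day year; the function name is chosen for this example only.

```python
from datetime import timedelta


def max_disruption_per_year(availability_percent, days_per_year=365):
    """Maximum yearly downtime allowed by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return timedelta(days=days_per_year * unavailable_fraction)


for target in (99, 99.9, 99.95, 99.99, 99.999):
    print(f"{target}% -> {max_disruption_per_year(target)}")
```

The whitepaper's table truncates these values to whole hours or minutes.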
Your application’s availability
depends …
… not only on the availability of AWS services, but also on the quality of your software and your adherence to operational best practices!
“Nines” of availability
Reliability Pillar whitepaper (Nov 2017)
Availability | Max Disruption (per year) | Application Categories
99% | 3 days 15 hours | Batch processing, data extraction, transfer, and load jobs
99.9% | 8 hours 45 minutes | Internal tools like knowledge management, project tracking
99.95% | 4 hours 22 minutes | Online commerce, point of sale
99.99% | 52 minutes | Video delivery, ridesharing services, broadcast systems
99.999% | 5 minutes | ATM transactions, telecommunications systems
Consider Multi-Region Active-Active for this class
Understand how AWS makes services
highly available
Region-wide AWS services
Amazon DynamoDB, Amazon RDS, Amazon ElastiCache, Amazon S3, Amazon EFS, Amazon SQS, Amazon Kinesis, Amazon ES
Legend: Default / Configurable for multi-AZ deployment
High availability for EC2 – instance recovery
Problems that trigger recovery:
• Loss of network connectivity
• Loss of system power
• Software issues on the physical host
• Hardware issues on the physical host that impact network reachability
The recovered instance is identical to the original instance and keeps its instance ID, private IP addresses, Elastic IP addresses, and all instance metadata.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html
High availability for EC2 – Auto Scaling
Availability Zone 1 Availability Zone 2
Auto Scaling group
Fresh server from AMI
Understand why you might need
Multi-Region Active-Active
architecture
Serving geographically distributed customer
base
Regions: USA West, USA East, EU-East, Taiwan
Users from San Francisco, New York, London, and Taipei
Guarding against failure of your applications
in one region
Users from San Francisco and New York
Applications in US West: AWS Services 1, 2, 3, 4
Applications in US East: AWS Services 1, 2, 3, 4
Cost effective DR: Why not use DR all the
time?
DR environments that don’t get used
1. Fall out of sync, eventually
2. Waste money
Key Technology Requirements
Reliable & secure network
• Network backbone
• VPN or encrypted (SSL) communication over public IP
[Diagram: encrypted data flowing between Region A and Region B over the backbone]
Data replication
1. Synchronous
2. Asynchronous
o Nearly continuous (lag of a few seconds or minutes)
o Batch (hourly/daily/weekly copy)
Code synchronization
1. Staged deployment across regions
o Rolling
o Blue-Green
o Rollback facility
2. Parameterized localization
Traffic segregation & management
• Segregation options
– Explicit: different URLs, e.g. east.abc-corp.com and west.abc-corp.com
– Implicit (DNS level): the same URL, e.g. www.abc-corp.com
• Traffic management infrastructure
– Throttling
– Internal redirecting
– External redirecting
Redirecting applies when users reach the wrong region.
Monitoring
• Application & infrastructure health
• Replication lag & code sync monitoring
Multi-tenancy
• What is a tenant?
– A unit of movement/failover
[Diagram: tenants California, Texas, New York, and Pennsylvania placed across US West and US East, which are linked by the backbone]
Direction of replication – before
Texas
containers DB DB
Region A Region B
Direction of replication – after
Texas
containers DB DB
Region A Region B
Failover scripts
Failover scripts should be able to:
1. Reroute traffic for a tenant
2. Change the direction of replication
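The two duties of a failover script can be sketched as a single per-tenant operation. Everything here is hypothetical: the `TenantState` record, the region names, and the in-memory routing table stand in for whatever DNS and database tooling a real script would drive.

```python
from dataclasses import dataclass


@dataclass
class TenantState:
    active_region: str   # region currently serving the tenant's traffic
    standby_region: str  # region currently receiving replicated data

    @property
    def replication_direction(self):
        """Data flows from the active region to the standby region."""
        return (self.active_region, self.standby_region)


def fail_over(tenants, tenant_id):
    """Reroute one tenant's traffic and reverse its replication direction."""
    state = tenants[tenant_id]
    state.active_region, state.standby_region = (
        state.standby_region,
        state.active_region,
    )
    return state


# Hypothetical tenants, both initially served from region-a.
tenants = {
    "texas": TenantState("region-a", "region-b"),
    "california": TenantState("region-a", "region-b"),
}
fail_over(tenants, "texas")
print(tenants["texas"].replication_direction)
```

Keeping the unit of failover at the tenant level means one tenant can move without disturbing the others.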
Key architectural considerations
Tolerance for network partitioning
• Failure of one region should not lead to failure of applications in
another
• Regional independence for request serving – no API calls from
one region to another
[Diagram: Region A and Region B connected by the backbone]
Minimal data replication requirements
• Does all data need to be replicated?
• If yes, does it need to be replicated synchronously?
• Does all data need to be replicated continuously?
Classification of data
Ordered from low volume but highly critical to high volume but less critical:
1. Transactions (e.g. purchase record)
2. Catalog information (e.g. product details)
3. Events, objects (e.g. click stream)
4. Server logs (e.g. HTTPS logs)
Sync vs. async replication modes
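[Diagram: in both modes, Tenant 1 writes through the app to the DB in Region A, which replicates to the DB in Region B]
Synchronous: the write is acknowledged only after the replica in Region B acknowledges it.
• Pros: Guarantees consistency
• Cons: Network & target dependent!
Asynchronous: the write is acknowledged by the local DB and replicated afterwards.
• Pros: Network & target independent
• Cons: Two databases can go out of sync
Async preferred, when possible!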
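The trade-off can be illustrated with a toy model. This is only a sketch: the `Region` class, the in-memory `rows` dictionary, and the `pending` queue are invented for the example and do not correspond to any AWS API.

```python
from collections import deque


class Region:
    """Toy database in one region; `reachable` models the network and target."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.rows = {}

    def apply(self, key, value):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.rows[key] = value


def write_sync(source, target, key, value):
    """Acknowledge only after both regions hold the write."""
    target.apply(key, value)  # the whole write fails if the target is down
    source.apply(key, value)


def write_async(source, pending, key, value):
    """Acknowledge after the local write; replicate later from the queue."""
    source.apply(key, value)
    pending.append((key, value))


def drain(pending, target):
    """Replicate queued writes in order, stopping at the first failure."""
    while pending:
        key, value = pending[0]
        target.apply(key, value)  # raises if still unreachable; record stays queued
        pending.popleft()


region_a, region_b = Region("A"), Region("B", reachable=False)
pending = deque()
write_async(region_a, pending, "order-1", "paid")  # succeeds while B is down
print(region_a.rows, region_b.rows)                # the two copies differ for now
```

With the target unreachable, `write_sync` would raise instead of acknowledging, while the asynchronous write succeeds and leaves the two databases temporarily out of sync until `drain` catches up.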
Concept of data replication lanes
• Synchronous replication: Transactions (most difficult to manage)
• Asynchronous nearly continuous replication: Catalog information
• Asynchronous batch replication: Events (easiest to manage)
Ideal replication system
• Each data store type will need a different technology.
• But at the minimum:
1. It should report replication lag
2. It should report record offset
3. It should be able to retry replication of failed records
[Diagram: Source DB → Replicator → Target DB, trying until successful; replication lag and record offset feed high-level metrics monitoring]
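A minimal sketch of a replicator that meets the three requirements above. The `Replicator` class, the list-based source log of `(commit_timestamp, record)` pairs, and the injected `apply_to_target` callable are all assumptions made for illustration; a real replicator would read a database change log.

```python
import time


class Replicator:
    """Replays a source change log to a target, in order."""

    def __init__(self, source_log, apply_to_target, clock=time.time):
        self.source_log = source_log            # list of (commit_timestamp, record)
        self.apply_to_target = apply_to_target  # callable that may raise on failure
        self.clock = clock
        self.offset = 0                         # index of the next record to replicate

    def replication_lag(self):
        """Seconds since the oldest unreplicated record was committed."""
        if self.offset >= len(self.source_log):
            return 0.0
        committed_at, _ = self.source_log[self.offset]
        return max(0.0, self.clock() - committed_at)

    def run_once(self):
        """Replicate pending records; stop at the first failure so it is retried."""
        while self.offset < len(self.source_log):
            _, record = self.source_log[self.offset]
            try:
                self.apply_to_target(record)
            except Exception:
                return False  # offset not advanced: the same record is retried next run
            self.offset += 1
        return True
```

Calling `run_once` on a schedule gives "try until successful" behaviour, and `offset` and `replication_lag()` are the two values a monitoring system would scrape.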
Distributed system design best practices
• Eventual consistency
• Idempotency
• Static stability
• Throttling
• Exponential backoff
• Circuit breaking
Reliability Pillar whitepaper (Nov 2017)
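Of the practices listed, exponential backoff is the easiest to show in a few lines. The sketch below uses randomized ("full jitter") delays; the function name, default values, and injectable `sleep` and `rng` parameters are choices made for this example, not something prescribed by the whitepaper.

```python
import random
import time


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep, rng=random.random):
    """Retry `operation`, doubling the delay ceiling after every failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(ceiling * rng())  # random delay between 0 and the ceiling
```

Retries like this are only safe when the operation is idempotent, which is why the two practices appear together on the slide.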
How AWS enables customers to deploy
Multi-Region Active-Active
Architecture
AWS worldwide network backbone
Multi-Region VPN – over non-AWS
network
https://aws.amazon.com/answers/networking/aws-multiple-region-multi-vpc-connectivity/
Multi-Region VPN – over AWS network
https://aws.amazon.com/answers/networking/aws-multiple-region-multi-vpc-connectivity/
Inter-Region VPC Peering (New)
[Diagram: an HQ customer network connects to VPCs and a shared services VPC in the Oregon region; these are peered over the Amazon backbone with VPCs and a shared services VPC in the Ireland region. The cross-region peered connection is encrypted. No overlapping IP address space.]
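Because peered VPCs cannot have overlapping IP address space, it can help to check address plans before requesting a peering connection. This sketch uses only Python's standard `ipaddress` module; the VPC names and CIDR blocks are invented for the example.

```python
from ipaddress import ip_network
from itertools import combinations


def overlapping_pairs(vpc_cidrs):
    """Return the pairs of VPC names whose CIDR blocks overlap."""
    networks = {name: ip_network(cidr) for name, cidr in vpc_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical address plan for VPCs in two regions.
plan = {
    "oregon-app": "10.0.0.0/16",
    "ireland-app": "10.1.0.0/16",
    "ireland-shared": "10.0.128.0/17",
}
print(overlapping_pairs(plan))
```

Any pair reported here would need to be renumbered before the two VPCs could be peered.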
Key benefits
1. Works similarly to existing intra-region VPC peering
2. Data always stays on the AWS backbone
3. Data is always encrypted by default
4. No need to use gateways or third-party VPN solutions to connect across regions
5. No additional charges for using inter-region VPC peering; customers pay standard data transfer rates
Amazon RDS Multi-AZ
deployment
Availability Zone 1 Availability Zone 2
Security group
mydb1.abc45345.eu-west-1.rds.amazonaws.com:3306
VPC subnet VPC subnet
Synchronous
physical replication
• Standbys ensure zero data loss in event of the master’s failure.
• Always have a stand-by. Always!!!
• Also note, cross AZ failovers are automatic & fast; whereas cross-
region failovers take time and need nontrivial planning.
Supported DBs
• MySQL
• MariaDB
• PostgreSQL
• Aurora
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html#USER_ReadRepl.XRgn
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Replication.CrossRegion.html
Monitoring RDS replication lag
Very simple
plan
All regions send
critical write traffic
to a single master
Replication traffic
“Simple, but higher latency and
network dependency”
Better plan (for most cases)
Tenant in
Region A
Tenant in
Region B
Tenant in
Region C
Sync
Async
Async
Applications get faster response
However:
• Some committed transactions may not make it to the failover region before the switch.
• Committed transactions are safe due to the standby.
• They can be recovered with the help of reconciliation techniques.
Aurora multi-master: scale out reads & writes
First MySQL-compatible DB service with scale-out across multiple data centers
• Zero application downtime from ANY instance failure
• Zero application downtime from ANY AZ failure
• Faster write performance and higher scale
• Sign up for the single-region multi-master preview today
• Multi-Region Multi-Master coming in 2018
[Diagram: the application scales out both reads and writes across Read/Write Master 1, 2, and 3 in Availability Zones 1, 2, and 3, all backed by a shared distributed storage volume]
Database Migration Service (DMS) for granular replication
[Diagram: a DMS replication instance sits between Source and Target; transactions t1 and t2 that update the source are captured and applied as changes on the target after the bulk load]
Change data capture (supported for MySQL, MariaDB, Aurora, PostgreSQL)
Details: http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.html
Amazon DynamoDB
Fast and flexible NoSQL database service for any scale
NoSQL database that supports both document and key-value structures
• Fast, consistent performance: Consistently single-digit millisecond latencies at any scale. DAX speeds up response times to microseconds.
• Highly scalable: Auto-scaling tables serving millions of requests per second, storing hundreds of terabytes of data.
• Fully managed: Automatic provisioning and infrastructure management.
• Business-critical reliability: Data is replicated across fault-tolerant Availability Zones, with fine-grained access control.
Amazon DynamoDB Global Tables (GA)
First fully managed, multi-master, multi-region database
• Build high performance, globally distributed applications
• Low latency reads & writes to locally available tables
• Disaster proof with multi-region redundancy
• Easy to set up and no application rewrites required
[Diagram: a global app serves globally dispersed users from one global table with replicas in N. America, Europe, and Asia]
Amazon DynamoDB cross-region replication with streams
• Open-source Cross-Region Replication Library, or
• Lambda
What is AWS Lambda
• It is a serverless, event driven compute service
• Define functions, and the service executes them as many times as input
arrives
• Many supported event sources including: Kinesis and DynamoDB streams
• Scales automatically with event rate
• Supported languages: Java, .Net, Node.js, Python
DynamoDB Streams and AWS Lambda
Ensures streamed records in shards are processed (replicated in this case) in order!
Error handling in Lambda
1. For stream-based event sources (Amazon Kinesis Streams and DynamoDB streams), AWS Lambda polls your stream and invokes your Lambda function.
2. If a Lambda function fails, AWS Lambda retries the failing batch of records until the data expires from the stream.
3. The exception is treated as blocking: AWS Lambda will not read any new records from the stream until the failed batch of records either expires or is processed successfully.
This ensures that AWS Lambda processes the stream events in order.
http://docs.aws.amazon.com/lambda/latest/dg/retries-on-errors.html
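A hedged sketch of what a replication function driven by a DynamoDB stream might look like. It assumes the stream is configured to include new item images, that `target_client` is a DynamoDB client for the destination region (for example one created with `boto3.client("dynamodb", region_name=...)`), and that the table name is supplied by the caller. `make_handler` is a made-up helper, and conflict handling for two-way replication is deliberately left out.

```python
def make_handler(target_client, target_table):
    """Build a Lambda-style handler that replays stream records to another table."""

    def handler(event, context=None):
        for record in event["Records"]:
            change = record["dynamodb"]
            if record["eventName"] in ("INSERT", "MODIFY"):
                # NewImage is already in DynamoDB attribute-value format.
                target_client.put_item(TableName=target_table, Item=change["NewImage"])
            elif record["eventName"] == "REMOVE":
                target_client.delete_item(TableName=target_table, Key=change["Keys"])
        # Any exception raised above propagates, so the whole batch is retried
        # and later records are not processed ahead of the failed one.
        return {"replicated": len(event["Records"])}

    return handler
```

Letting errors propagate instead of swallowing them is what preserves ordering here: the blocking retry behaviour described above only applies when the function reports failure.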
Elasticsearch cross-region replication
[Diagram: the source posts to Amazon Kinesis Firehose in Region A and in Region B; each Firehose delivers to the Amazon ES domain in its own region]
The source needs to track successful posting to both regions.
Redshift cross-region replication
[Diagram: the source posts to Amazon Kinesis Firehose in Region A and in Region B; each Firehose delivers to the Amazon Redshift cluster in its own region]
The source needs to track successful posting to both regions.
Global Traffic Management with
Route 53
Global service
• Load balancers and/or servers across multiple regions globally
Region A
Region B
Region C
User for Region A
User for Region C
User for Region B
Traffic flow: endpoints
• Hybrid/low level infrastructure: IP address or
CNAME
• ELB Classic Load Balancer / Application Load
Balancer
• Amazon S3 website
• Amazon CloudFront distribution
• AWS Elastic Beanstalk environment
Traffic flow: Quick overview of rules
• Simple routing policy – Use for a single resource that performs a given function for
your domain, for example, a web server that serves content for the example.com
website.
• Failover routing policy – Use when you want to configure active-passive failover.
• Geolocation routing policy – Use when you want to route traffic based on the
location of your users.
Traffic flow: Quick overview of rules
• Geoproximity routing policy – Use when you want to route traffic based on the
location of your resources and, optionally, shift traffic from one resource in one
location to resources in another.
• Latency routing policy – Use when you have resources in multiple locations and
you want to route traffic to the resource that provides the best latency.
• Multivalue answer routing policy – Use when you want Amazon Route 53 to
respond to DNS queries with up to eight healthy records selected at random.
• Weighted routing policy – Use to route traffic to multiple resources in proportions
that you specify.
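Two of these policies reduce to very small rules, sketched below. This is a simplified model and not Route 53's implementation: the failover function ignores the case where both records are unhealthy, and the record names are invented. The weighted share follows the documented rule that a record receives its weight divided by the sum of all weights.

```python
def failover_answer(primary, secondary, healthy):
    """Failover policy: answer with the primary while healthy, else the secondary."""
    return primary if healthy.get(primary, False) else secondary


def weighted_shares(weights):
    """Weighted policy: expected traffic share is weight / sum of all weights."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


print(failover_answer("us-west", "us-east", {"us-west": False, "us-east": True}))
print(weighted_shares({"us-west": 3, "us-east": 1}))
```

Shifting a tenant gradually from one region to another amounts to changing the weights over time.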
Traffic policy example
[Diagram: example of an advanced policy combining a geolocation rule, failover rules, and endpoints]
Conclusions
Conclusions
o Avoid synchronous replication & simultaneous deployments as much as
possible
o Design applications for idempotency & eventual consistency as much as
possible
o Closely monitor replication & code sync delays
o Have push buttons ready to switch traffic for tenants
o Make High-Level Metrics monitoring systems also Multi-Region
Conclusions
o It is an involved exercise. It requires careful planning and
design.
o However, various AWS services make implementation much
easier by doing undifferentiated heavy lifting for our
customers.
Conclusions
o For companies with extremely high availability requirements
and/or a geographically distributed user base, the benefits of
Multi-Region Active-Active architecture can be profound.
o In such cases, consider designing applications for Multi-Region
Active-Active implementation from DAY ONE.
o But, again, as we say at Amazon, today is DAY ONE!

Running Mission Critical Workloads on AWS
 
ARC306_High Resiliency & Availability Of Online Entertainment Communities Usi...
ARC306_High Resiliency & Availability Of Online Entertainment Communities Usi...ARC306_High Resiliency & Availability Of Online Entertainment Communities Usi...
ARC306_High Resiliency & Availability Of Online Entertainment Communities Usi...
 
Design, Build, and Modernize Your Web Applications with AWS
 Design, Build, and Modernize Your Web Applications with AWS Design, Build, and Modernize Your Web Applications with AWS
Design, Build, and Modernize Your Web Applications with AWS
 
Scale Website dan Mobile Applications Anda di AWS hingga 10 juta pengguna
Scale Website dan Mobile Applications Anda di AWS hingga 10 juta penggunaScale Website dan Mobile Applications Anda di AWS hingga 10 juta pengguna
Scale Website dan Mobile Applications Anda di AWS hingga 10 juta pengguna
 
Airbnb Runs on Amazon Aurora - DAT331 - re:Invent 2017
Airbnb Runs on Amazon Aurora - DAT331 - re:Invent 2017Airbnb Runs on Amazon Aurora - DAT331 - re:Invent 2017
Airbnb Runs on Amazon Aurora - DAT331 - re:Invent 2017
 
ARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million UsersARC201_Scaling Up to Your First 10 Million Users
ARC201_Scaling Up to Your First 10 Million Users
 
Learn how to build serverless applications using the AWS Serverless Platform-...
Learn how to build serverless applications using the AWS Serverless Platform-...Learn how to build serverless applications using the AWS Serverless Platform-...
Learn how to build serverless applications using the AWS Serverless Platform-...
 
STG401_This Is My Architecture
STG401_This Is My ArchitectureSTG401_This Is My Architecture
STG401_This Is My Architecture
 
AWS Database and Analytics State of the Union - 2017 - DAT201 - re:Invent 2017
AWS Database and Analytics State of the Union - 2017 - DAT201 - re:Invent 2017AWS Database and Analytics State of the Union - 2017 - DAT201 - re:Invent 2017
AWS Database and Analytics State of the Union - 2017 - DAT201 - re:Invent 2017
 
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...
From Mainframe to Microservices: Vanguard’s Move to the Cloud - ENT331 - re:I...
 
CTD405_Building Serverless Video Workflows
CTD405_Building Serverless Video WorkflowsCTD405_Building Serverless Video Workflows
CTD405_Building Serverless Video Workflows
 
在 AWS 上運行任務關鍵工作負載
在 AWS 上運行任務關鍵工作負載在 AWS 上運行任務關鍵工作負載
在 AWS 上運行任務關鍵工作負載
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
Amazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
Amazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
Amazon Web Services
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Amazon Web Services
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
Amazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
Amazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Amazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
Amazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Amazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

How to Design a Multi-Region Active-Active Architecture

  • 2. Session objectives 1. Understand how to think about application availability 2. Understand how AWS makes services highly-available 3. Understand why you might need Multi-Region Active-Active architecture 4. Review AWS services that are useful for Multi-Region Active-Active design
  • 3. Understand how to think about application availability
  • 5. Calculating availability with hard dependencies. The application depends on Dependency 1 (99%), Dependency 2 (95%), and Dependency 3 (90%), so its availability is <= 90%.
  • 6. “A chain is only as strong as its weakest link.”
  • 7. Calculating availability with redundant components. Component 1 (90%) and Component 1 Replica (90%) together give the application ~99% availability, assuming instantaneous failover! Let us understand intuitively.
  • 8. “Component redundancy increases availability significantly!”
  • 9. “Nines” of availability (Reliability Pillar whitepaper, Nov 2017). Availability: max disruption per year; application categories.
      99%: 3 days 15 hours; batch processing, data extraction, transfer, and load jobs
      99.90%: 8 hours 45 minutes; internal tools like knowledge management, project tracking
      99.95%: 4 hours 22 minutes; online commerce, point of sale
      99.99%: 52 minutes; video delivery, ridesharing services, broadcast systems
      99.999%: 5 minutes; ATM transactions, telecommunications systems
  • 10. Your application’s availability depends not only on the availability of AWS services, but also on the quality of your software and your adherence to operational best practices!
  • 11. “Nines” of availability (Reliability Pillar whitepaper, Nov 2017). The table from slide 9 is repeated with a callout on one availability class: “Consider Multi-Region Active-Active for this class.”
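The arithmetic behind slides 5, 7, and 9 can be sketched as follows. This is a minimal illustration that assumes component failures are independent and, for the redundant case, that failover is instantaneous (as slide 7 states); the function names are hypothetical and not from the deck.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(availabilities):
    """Hard dependencies: every one must be up, so availabilities multiply."""
    return prod(availabilities)


def redundant_availability(availabilities):
    """Redundant replicas: the system is down only if all replicas are down."""
    return 1 - prod(1 - a for a in availabilities)


def max_disruption_minutes(availability):
    """Yearly downtime budget implied by an availability target."""
    return (1 - availability) * MINUTES_PER_YEAR


# Slide 5: dependencies at 99%, 95%, and 90% (consistent with "<= 90%").
hard = serial_availability([0.99, 0.95, 0.90])

# Slide 7: a 90% component plus a 90% replica.
redundant = redundant_availability([0.90, 0.90])

print(f"hard dependencies: {hard:.4f}")
print(f"redundant pair:    {redundant:.4f}")
print(f"99.99% budget:     {max_disruption_minutes(0.9999):.1f} minutes/year")
```

Under the independence assumption the chain of hard dependencies comes out below its weakest link, which is why the slide writes the result as an upper bound.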
  • 12. Understand how AWS makes services highly available
  • 15. Region-wide AWS services: Amazon DynamoDB, Amazon RDS, Amazon ElastiCache, Amazon S3, Amazon EFS, Amazon SQS, Amazon Kinesis, Amazon ES. Legend: Default; Configurable for multi-AZ deployment.
  • 16. High availability for EC2 – instance recovery. Triggers: loss of network connectivity; loss of system power; software issues on the physical host; hardware issues on the physical host that impact network reachability. Result: a new instance, the same in all respects as the old instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata. http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html
  • 17. High availability for EC2 – Auto Scaling. Diagram labels: Availability Zone 1; Availability Zone 2; Auto Scaling group; fresh server from AMI.
  • 18. Understand why you might need Multi-Region Active-Active architecture
  • 19. Serving a geographically distributed customer base. Regions: USA West, USA East, EU-East, and Taiwan. Users from San Francisco, New York, London, and Taipei.
  • 20. Guarding against failure of your applications in one region. Applications in US West (users from San Francisco) and applications in US East (users from New York) each use AWS Services 1 to 4.
  • 21. Cost-effective DR: why not use DR all the time? DR environments that don’t get used: 1. Fall out of sync, eventually 2. Waste money
  • 23. Reliable & secure network • Network backbone • VPN or encrypted (SSL) communication over public IP. Diagram labels: Region A; Region B; encrypted data; backbone.
  • 24. Data replication 1. Synchronous 2. Asynchronous o Nearly continuous (lag of a few seconds or minutes) o Batch (hourly/daily/weekly copy)
  • 25. Code synchronization 1. Staged deployment across regions o Rolling o Blue-Green o Rollback facility 2. Parameterized localization
  • 26. Traffic segregation & management • Segregation options – Explicit: different URLs, e.g. east.abc-corp.com and west.abc-corp.com – Implicit (DNS level): the same URL, e.g. www.abc-corp.com • Traffic management infrastructure – Throttling – Internal redirecting – External redirecting. Callout: when users reach wrong regions.
  • 27. Monitoring • Application & infrastructure health • Replication lag & code sync monitoring
  • 28. Multi-tenancy • What is a tenant? A unit of movement/failover. Diagram labels: US West; US East; backbone; California, Texas, New York, Pennsylvania, Texas.
  • 29. Direction of replication – before. Diagram labels: Texas containers; DB in Region A; DB in Region B.
  • 30. Direction of replication – after. Diagram labels: Texas containers; DB in Region A; DB in Region B.
  • 31. Failover scripts should be able to: 1. Reroute traffic for a tenant 2. Change the direction of replication
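The two responsibilities of a failover script can be sketched as a single state change per tenant. This is an illustrative model only: the `TenantState` type, the `fail_over` function, and the choice of Region A as the Texas tenant's starting region are assumptions, and a real script would call DNS and database tooling rather than swap fields in memory.

```python
from dataclasses import dataclass


@dataclass
class TenantState:
    name: str
    active_region: str   # region currently serving the tenant's traffic
    standby_region: str  # region currently receiving replicated data

    @property
    def replication_direction(self):
        # Data flows from the region taking writes to the standby.
        return (self.active_region, self.standby_region)


def fail_over(tenant: TenantState) -> TenantState:
    """Reroute the tenant's traffic and reverse its replication direction."""
    return TenantState(
        name=tenant.name,
        active_region=tenant.standby_region,
        standby_region=tenant.active_region,
    )


texas = TenantState("Texas", active_region="Region A", standby_region="Region B")
after = fail_over(texas)
print(after.active_region, after.replication_direction)
```

Modeling both steps as one transition keeps traffic routing and replication direction from drifting apart, and applying `fail_over` twice returns the tenant to its original state.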
  • 33. Tolerance for network partitioning • Failure of one region should not lead to failure of applications in another • Regional independence for request serving: no API calls from one region to another. Diagram labels: Region A; Region B; backbone.
  • 34. Minimal data replication requirements • Does all data need to be replicated? • If yes, does it need to be replicated synchronously? • Does all data need to be replicated continuously?
  • 35. Classification of data, from low volume but highly critical to high volume but less critical:
      Transactions: purchase record
      Catalog information: product details
      Events, objects: click stream
      Server logs: HTTPS logs
  • 36. Sync vs. async replication modes. Diagram: a Tenant 1 app writes to a DB in Region A, which replicates to a DB in Region B, with acks.
      Synchronous. Pros: guarantees consistency. Cons: network & target dependent!
      Asynchronous. Pros: network & target independent. Cons: two databases can go out of sync.
      Async preferred, when possible!
  • 37. Concept of data replication lanes, from most difficult to manage to easiest to manage: synchronous replication; asynchronous nearly continuous replication; asynchronous batch replication. Data classes shown: transactions, catalog information, events.
  • 38. Ideal replication system • Each data store type will need a different technology. • But at the minimum: 1. It should report replication lag 2. It should report record offset 3. It should be able to retry replication of failed records. Diagram labels: Source DB; Replicator; Target DB; high-level metrics monitoring; try until successful; replication lag; record offset.
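The three minimum requirements on slide 38 can be sketched as a small replicator class. Everything here is hypothetical: the `Replicator` class, its `apply_to_target` callback, and the bounded retry count (the slide says "try until successful"; the bound is an assumption so the example terminates). The `sleep` and `now` parameters are injectable only to make the sketch easy to run.

```python
import time


class Replicator:
    """Copies records to a target and exposes lag and offset as metrics."""

    def __init__(self, apply_to_target, max_attempts=5, sleep=time.sleep):
        self.apply_to_target = apply_to_target
        self.max_attempts = max_attempts
        self.sleep = sleep
        self.record_offset = -1      # offset of the last record replicated
        self.replication_lag = 0.0   # seconds from source commit to target apply

    def replicate(self, offset, record, committed_at, now=time.time):
        for attempt in range(self.max_attempts):
            try:
                self.apply_to_target(offset, record)
            except ConnectionError:
                # Wait a little longer after each failure, capped at 5 seconds.
                self.sleep(min(0.1 * 2 ** attempt, 5.0))
                continue
            self.record_offset = offset
            self.replication_lag = now() - committed_at
            return True
        return False  # caller should alert; the record is still unreplicated


# Demo: a target that is unreachable for the first two attempts.
target = {}
failures_left = {"count": 2}


def flaky_apply(offset, record):
    if failures_left["count"] > 0:
        failures_left["count"] -= 1
        raise ConnectionError("target unreachable")
    target[offset] = record


replicator = Replicator(flaky_apply, sleep=lambda seconds: None)
ok = replicator.replicate(0, {"order": 1}, committed_at=100.0, now=lambda: 103.5)
print(ok, replicator.record_offset, replicator.replication_lag)
```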
  • 39. Distributed system design best practices (Reliability Pillar whitepaper, Nov 2017): eventual consistency, idempotency, static stability, throttling, exponential backoff, circuit breaking.
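Two of the practices named on slide 39, idempotency and exponential backoff, can be sketched briefly. The names below are hypothetical, and the "full jitter" variant of backoff (a random delay up to an exponentially growing cap) is one common choice that the slide does not specify.

```python
import random


def backoff_delays(base=0.1, cap=5.0, attempts=6, rng=random.random):
    """Delay i is drawn uniformly from [0, min(cap, base * 2**i)]."""
    return [rng() * min(cap, base * 2 ** i) for i in range(attempts)]


class IdempotentStore:
    """Applies each request at most once, keyed by a client-supplied ID."""

    def __init__(self):
        self.balance = 0
        self._seen = set()

    def apply(self, request_id, amount):
        if request_id in self._seen:
            return False  # duplicate delivery or client retry: no effect
        self._seen.add(request_id)
        self.balance += amount
        return True


store = IdempotentStore()
store.apply("req-1", 50)
store.apply("req-1", 50)  # retried request, safely ignored

# With rng fixed at 1.0 the delays equal their upper bounds.
upper_bounds = backoff_delays(rng=lambda: 1.0)
print(store.balance, upper_bounds)
```

Idempotent writes are what make retries (and asynchronous replication that may redeliver a record) safe, while jittered backoff keeps many retrying clients from hitting a recovering region at the same moment.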
  • 40. How AWS enables customers to deploy Multi-Region Active-Active Architecture
  • 41. AWS worldwide network backbone
  • 42. Multi-Region VPN – over non-AWS network https://aws.amazon.com/answers/networking/aws-multiple-region-multi-vpc-connectivity/
  • 43. Multi-Region VPN – over AWS network https://aws.amazon.com/answers/networking/aws-multiple-region-multi-vpc-connectivity/
  • 44. Inter-Region VPC peering (new). Diagram labels: HQ customer network; Oregon Region VPCs and shared services VPC; Ireland Region VPCs and shared services VPC; Amazon backbone; VPC peer; no overlapping IP address space; cross-region peered connection; encrypted.
  • 45. Key benefits 1. Works similarly to existing intra-Region VPC peering 2. Data always stays on the AWS backbone 3. Data is always encrypted by default 4. No need to use gateways or third-party VPN solutions to connect across regions 5. No additional charges for using inter-Region VPC peering; customers pay standard data transfer rates
  • 47. Amazon RDS Multi-AZ deployment. Diagram labels: Availability Zone 1; Availability Zone 2; security group; VPC subnets; synchronous physical replication; mydb1.abc45345.eu-west-1.rds.amazonaws.com:3306. • Standbys ensure zero data loss in the event of the master’s failure. • Always have a standby. Always!!! • Also note: cross-AZ failovers are automatic & fast, whereas cross-region failovers take time and need nontrivial planning.
  • 48. Supported DBs • MySQL • MariaDB • PostgreSQL • Aurora http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html#USER_ReadRepl.XRgn http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Replication.CrossRegion.html
  • 49. Monitoring RDS replication lag
  • 50. Very simple plan: all regions send critical write traffic to a single master. Diagram label: replication traffic.
  • 51. “Simple, but higher latency and network dependency”
  • 52. Better plan (for most cases). Diagram labels: tenant in Region A; tenant in Region B; tenant in Region C; sync; async; async.
  • 53. Applications get faster responses. Committed transactions are safe due to the standby. However, some committed transactions may not make it to the failover region before the switch. They can be recovered with the help of reconciliation techniques.
  • 54. Aurora multi-master: scale out reads & writes. First MySQL-compatible DB service with scale-out across multiple data centers. • Zero application downtime from ANY instance failure • Zero application downtime from ANY AZ failure • Faster write performance and higher scale • Sign up for the single-region multi-master preview today • Multi-Region Multi-Master coming in 2018. Diagram labels: application; Read/Write Master 1, 2, and 3; Availability Zone 1, 2, and 3; shared distributed storage volume; scale out both reads and writes.
  • 55. Database Migration Service (DMS) for granular replication. Diagram labels: source; DMS replication instance; target; tables t1 and t2; update; transactions; change apply after bulk load. Change data capture (supported for MySQL, MariaDB, Aurora, PostgreSQL). Details: http://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.html
  • 56. Amazon DynamoDB: fast and flexible NoSQL database service for any scale. NoSQL database that supports both document and key-value structures.
      Fast, consistent performance: consistently single-digit millisecond latencies at any scale. DAX speeds up response times to microseconds.
      Highly scalable: auto-scaling tables serving millions of requests per second, storing hundreds of terabytes of data.
      Fully managed: automatic provisioning and infrastructure management.
      Business-critical reliability: data is replicated across fault-tolerant Availability Zones, with fine-grained access control.
  • 57. Amazon DynamoDB Global Tables (GA): first fully managed, multi-master, multi-region database. • Build high-performance, globally distributed applications • Low-latency reads & writes to locally available tables • Disaster-proof with multi-region redundancy • Easy to set up and no application rewrites required. Diagram labels: globally dispersed users; global app; global table; replicas in N. America, Europe, and Asia.
  • 58. Amazon DynamoDB cross-region replication with streams, using the open source Cross-Region Replication Library or Lambda.
  • 59. What is AWS Lambda? • It is a serverless, event-driven compute service • Define functions, and the service executes them as many times as input arrives • Many supported event sources, including Kinesis and DynamoDB streams • Scales automatically with event rate • Supported languages: Java, .NET, Node.js, Python. Ensures streamed records in shards are processed (replicated in this case) in order!
  • 60. DynamoDB Streams and AWS Lambda
  • 61. Error handling in Lambda 1. For stream-based event sources (Amazon Kinesis Streams and DynamoDB streams), AWS Lambda polls your stream and invokes your Lambda function. 2. Therefore, if a Lambda function fails, AWS Lambda attempts to process the erring batch of records until the data expires from the stream. 3. The exception is treated as blocking, and AWS Lambda will not read any new records from the stream until the failed batch of records either expires or is processed successfully. This ensures that AWS Lambda processes the stream events in order. http://docs.aws.amazon.com/lambda/latest/dg/retries-on-errors.html
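The blocking retry behavior described on slide 61 can be illustrated with a small simulation. This is not the AWS Lambda API: `process_stream` is a hypothetical stand-in for the service's poller, and `retention_attempts` is an assumed substitute for the stream's time-based retention period.

```python
def process_stream(batches, handler, retention_attempts=3):
    """Process batches strictly in order, retrying a failing batch in place.

    A batch that still fails after retention_attempts tries is treated as
    expired, and only then does processing move on to the next batch.
    """
    processed, expired = [], []
    for batch in batches:
        for _ in range(retention_attempts):
            try:
                handler(batch)
            except Exception:
                continue  # blocking: retry the same batch, read nothing newer
            processed.append(batch)
            break
        else:
            expired.append(batch)
    return processed, expired


# Demo handler: "b2" fails once (transient), "poison" fails every time.
attempts = {}


def handler(batch):
    attempts[batch] = attempts.get(batch, 0) + 1
    if batch == "b2" and attempts[batch] == 1:
        raise RuntimeError("transient failure")
    if batch == "poison":
        raise RuntimeError("always fails")


processed, expired = process_stream(["b1", "b2", "poison", "b3"], handler)
print(processed, expired, attempts)
```

The simulation shows the trade-off implied by the slide: ordering is preserved, but a batch that can never succeed holds up everything behind it until it expires, so replication functions should be written to fail only on genuinely retryable errors.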
  • 62. Elasticsearch cross-region replication. Diagram labels: source; Amazon Kinesis Firehose and Amazon ES in Region A; Amazon Kinesis Firehose and Amazon ES in Region B. The source needs tracking to ensure successful posting to both regions.
  • 63. Redshift cross-region replication. Diagram labels: source; Amazon Kinesis Firehose and Amazon Redshift in Region A; Amazon Kinesis Firehose and Amazon Redshift in Region B. The source needs tracking to ensure successful posting to both regions.
  • 64. Global Traffic Management with Route 53
  • 65. Global service • Load balancers and/or servers across multiple regions globally Region A Region B Region C User for Region A User for Region C User for Region B
  • 66. Traffic flow: endpoints • Hybrid/low level infrastructure: IP address or CNAME • ELB Classic Load Balancer / Application Load Balancer • Amazon S3 website • Amazon CloudFront distribution • AWS Elastic Beanstalk environment
  • 67. Traffic flow: Quick overview of rules • Simple routing policy – Use for a single resource that performs a given function for your domain, for example, a web server that serves content for the example.com website. • Failover routing policy – Use when you want to configure active-passive failover. • Geolocation routing policy – Use when you want to route traffic based on the location of your users.
  • 68. Traffic flow: Quick overview of rules • Geoproximity routing policy – Use when you want to route traffic based on the location of your resources and, optionally, shift traffic from one resource in one location to resources in another. • Latency routing policy – Use when you have resources in multiple locations and you want to route traffic to the resource that provides the best latency. • Multivalue answer routing policy – Use when you want Amazon Route 53 to respond to DNS queries with up to eight healthy records selected at random. • Weighted routing policy – Use to route traffic to multiple resources in proportions that you specify.
  • 70. Example of an advanced policy. Diagram labels: Failover Rule; Geolocation Rule; Failover Rule; Endpoints.
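One plausible reading of the slide 70 diagram is a geolocation rule that fans out to a failover rule per location, each ending in a primary and a secondary endpoint. The sketch below models that decision tree in plain Python. It does not use the Route 53 API, and the rule names, location codes, and `example.com` endpoints are all hypothetical.

```python
# Geolocation rule: map a user's location to a failover rule.
GEO_RULES = {"US": "us-failover", "EU": "eu-failover"}
DEFAULT_RULE = "us-failover"

# Failover rules: (primary endpoint, secondary endpoint).
FAILOVER_RULES = {
    "us-failover": ("us.example.com", "eu.example.com"),
    "eu-failover": ("eu.example.com", "us.example.com"),
}


def resolve(user_location, healthy_endpoints):
    """Pick an endpoint: geolocation first, then failover on health."""
    rule = GEO_RULES.get(user_location, DEFAULT_RULE)
    primary, secondary = FAILOVER_RULES[rule]
    if primary in healthy_endpoints:
        return primary
    if secondary in healthy_endpoints:
        return secondary
    return None  # simplification: no healthy endpoint to answer with


all_healthy = {"us.example.com", "eu.example.com"}
print(resolve("EU", all_healthy))
print(resolve("EU", {"us.example.com"}))
```

In this arrangement users are normally served from their own region and are moved to the other region only when a health check marks their primary endpoint unhealthy.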
  • 72. Conclusions o Avoid synchronous replication & simultaneous deployments as much as possible o Design applications for idempotency & eventual consistency as much as possible o Closely monitor replication & code sync delays o Have push buttons ready to switch traffic for tenants o Make high-level metrics monitoring systems Multi-Region as well
  • 73. Conclusions o It is an involved exercise. It requires careful planning and design. o However, various AWS services make implementation much easier by doing undifferentiated heavy lifting for our customers.
  • 74. Conclusions o For companies with extremely high availability requirements and/or a geographically distributed user base, the benefits of Multi-Region Active-Active architecture can be profound. o In such cases, consider designing applications for Multi-Region Active-Active implementation from DAY ONE. • But, again, as we say at Amazon, today is DAY ONE!