2. Overview
General High Availability (HA) Database Principles
HA Database Options on AWS
Amazon Relational Database Service
3. Architecting for Availability and Durability
Logical Corruption
Compute Node Failure
Storage Failure
Network Disruption
Data Center or Availability Zone Disruption
Database Maintenance (Scaling, Patching)
4. Technologies for HA - Backups
Logical corruption
Restore operations after storage volume or AZ disruption
5. Technologies for HA - Replication
Faster than reboot in case of failure
Don’t Forget: Multi-AZ architecture
Don’t Forget: Backups still have value
Advanced: replication/failover for maintenance
Make changes to replica, failover to it
Practice
[Diagram: two Availability Zones, one marked with an X as failed]
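The "make changes to replica, failover to it" pattern can be sketched as a toy model. This is only an illustration: `DbNode`, `rolling_upgrade`, the node names, and the version strings are hypothetical, and the third step (changing the old primary afterwards) is an assumption that the slide does not state.

```python
from dataclasses import dataclass


@dataclass
class DbNode:
    name: str
    version: str
    role: str  # "primary" or "replica"


def rolling_upgrade(primary, replica, new_version):
    """Change the replica first, switch roles, then change the old primary."""
    steps = []
    # 1. Make the change on the replica while the primary keeps serving traffic.
    replica.version = new_version
    steps.append(f"upgraded {replica.name}")
    # 2. Fail over: the changed replica becomes the primary.
    primary.role, replica.role = "replica", "primary"
    steps.append(f"failed over to {replica.name}")
    # 3. Assumption: the old primary is changed afterwards, off the critical path.
    primary.version = new_version
    steps.append(f"upgraded {primary.name}")
    return steps


a = DbNode("db-a", "5.1", "primary")
b = DbNode("db-b", "5.1", "replica")
steps = rolling_upgrade(a, b, "5.5")
```

In this model the application only notices the role switch in step 2; both changes themselves happen on a node that is not serving as primary.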
6. Tale of Two Companies
Company A              Company B
Single-AZ              Multi-AZ
Nightly Backup Only    Backups from Standby
                       Maintenance on Standby
                       Online DDL on Replica
8. Asynchronous vs. Synchronous Replication
Asynchronous Replication
Acknowledged as soon as written to the local storage
Lower durability than synchronous
Can fall behind on shipping
Synchronous Replication
Write is not committed until it is written on both replicas
Highest durability
Higher transaction latency
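The difference in acknowledgement timing can be sketched with a toy model. `Node` and `ReplicatedPair` are hypothetical names used only for illustration, not a real database API; the model ignores networking and failure handling.

```python
class Node:
    """A toy storage node that records durably written rows."""

    def __init__(self):
        self.rows = []

    def write(self, row):
        self.rows.append(row)


class ReplicatedPair:
    """Hypothetical primary/secondary pair, for illustration only."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = Node()
        self.secondary = Node()
        self.pending = []  # rows written locally but not yet shipped

    def commit(self, row):
        """Return how many locations hold the row when the client gets its ACK."""
        self.primary.write(row)
        if self.synchronous:
            # Synchronous: the secondary must have the row before the ACK.
            self.secondary.write(row)
            return 2
        # Asynchronous: ACK after the local write, ship later.
        self.pending.append(row)
        return 1

    def ship(self):
        """Deliver queued rows to the secondary (the asynchronous catch-up step)."""
        while self.pending:
            self.secondary.write(self.pending.pop(0))


async_pair = ReplicatedPair(synchronous=False)
sync_pair = ReplicatedPair(synchronous=True)
async_copies = async_pair.commit("ray")   # durable in one location at ACK
sync_copies = sync_pair.commit("ray")     # durable in two locations at ACK
lag_before_ship = len(async_pair.pending)
async_pair.ship()
```

The `pending` queue is what "can fall behind on shipping" refers to: anything still in it is lost if the primary fails before `ship()` runs.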
9. Asynchronous
insert into person values ('ray');
commit;
Primary -> client: ACK (DURABLE in ONE LOCATION)
Primary (Replication Log) -> Secondary (Relay Log): 'ray'
Secondary -> Primary: ACK (DURABLE in TWO LOCATIONS)
10. Synchronous
insert into person values ('ray');
commit;
Primary (Replication Log) -> Secondary (Relay Log): 'ray'
Secondary -> Primary: ACK
Primary -> client: ACK (DURABLE in TWO LOCATIONS)
11. Logical vs. Physical
Logical
Standard MySQL Replication
Logical statement or transaction is shipped
With MySQL, enables operations on standby (reads, DDL)
Physical
Shipping the physical block changes
• Oracle Data Guard
• Filesystem or block layer replication
• Physical SAN device replication
May not allow operations on standby
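A toy contrast between the two approaches, as a sketch only. `ToyDatabase` and its one-row block layout are hypothetical and far simpler than any real storage engine; the point is just what gets shipped in each case.

```python
class ToyDatabase:
    """Hypothetical in-memory store: block number -> list of rows."""

    BLOCK_SIZE = 1  # rows per block, chosen only for illustration

    def __init__(self):
        self.blocks = {}

    def row_count(self):
        return sum(len(rows) for rows in self.blocks.values())

    def insert(self, row):
        """Execute an insert and return (block_id, new block contents)."""
        block_id = self.row_count() // self.BLOCK_SIZE
        self.blocks.setdefault(block_id, []).append(row)
        return block_id, list(self.blocks[block_id])

    def apply_block(self, block_id, contents):
        """Overwrite a block with shipped contents, without executing a statement."""
        self.blocks[block_id] = list(contents)


primary = ToyDatabase()
logical_replica = ToyDatabase()
physical_replica = ToyDatabase()

for name in ["ray", "grant"]:
    block_id, contents = primary.insert(name)
    # Logical: ship the statement; the replica executes it itself.
    logical_replica.insert(name)
    # Physical: ship the changed block; the replica copies it as-is.
    physical_replica.apply_block(block_id, contents)
```

Because the logical replica runs a full database engine of its own, it can also serve reads or DDL; the physical replica in this sketch only copies blocks.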
12. Logical Replication
insert into person values ('ray');
commit;
[Diagram: Primary and Replica columns, each with Parse, Buffer, Replication Log, Recovery Log, and Data stages; a Relay Log sits on the Replica side, which is labeled Single Threaded]
13. Physical Replication
insert into person values ('grant');
commit;
[Diagram: Primary and Replica columns, each with Parse, Buffer, Replication Log, Recovery Log, and Data stages; a Relay Log sits on the Replica side]
14. Popular Database Engines on EC2
MySQL
Native, asynchronous
(Semi)-synchronous
Oracle
Data Guard, RMAN, Oracle Secure Backup Cloud Module
Oracle Database Machine Image (AMI)
Microsoft SQL Server
Log shipping and DB mirroring
15. What is Amazon RDS
Managed Relational Database Service
Goals
Make it easy to set up, operate, and scale relational databases in the cloud.
Maintain compatibility with applications and tools.
Supported Engines
MySQL, Oracle
16. Amazon RDS Features
Pre-configured deployments in minutes with console
DB Instance type scaling (CPU, memory)
Online storage scaling
Amazon CloudWatch integration
Automatic software patching with user control
Backups
Replication
17. Amazon RDS Backups
Automated Backups
Nightly system snapshots + transaction backup
Enables point-in-time restore to any point in retention period, up to the last 5 minutes
Max retention period = 8 days
DB Snapshots
User-driven snapshots of database
Kept until explicitly deleted
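The point-in-time restore window described above can be worked through as simple date arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, the sample date is made up, and the window is modeled as exactly "now minus retention" to "now minus 5 minutes", which simplifies how a real service tracks its restorable range.

```python
from datetime import datetime, timedelta

MAX_RETENTION = timedelta(days=8)    # from the slide: max retention period
RESTORE_LAG = timedelta(minutes=5)   # from the slide: "up to the last 5 minutes"


def restorable_window(now, retention=MAX_RETENTION, lag=RESTORE_LAG):
    """Return (earliest, latest) restorable timestamps under the assumptions above."""
    if retention > MAX_RETENTION:
        raise ValueError("retention exceeds the maximum stated on the slide")
    return now - retention, now - lag


def can_restore_to(target, now, retention=MAX_RETENTION):
    """True if target falls inside the modeled restore window."""
    earliest, latest = restorable_window(now, retention)
    return earliest <= target <= latest


# Hypothetical "current time" used only for the example.
now = datetime(2012, 1, 10, 12, 0, 0)
one_hour_ago_ok = can_restore_to(now - timedelta(hours=1), now)
two_minutes_ago_ok = can_restore_to(now - timedelta(minutes=2), now)
nine_days_ago_ok = can_restore_to(now - timedelta(days=9), now)
```

DB Snapshots fall outside this window logic, since they are kept until explicitly deleted.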
18. Multi-AZ Deployments for Amazon RDS
Managed fault-tolerance solution for production databases
What is a Multi-AZ deployment?
Amazon RDS creates and maintains a hot standby in a different Availability Zone
In the event of an unplanned or planned outage, Amazon RDS automatically fails over to the standby
19. Things to know about Multi-AZ
Replication Type
Synchronous - designed for high data durability
Physical - standby cannot be accessed directly
What events result in automatic failover?
Unplanned (instance failure, storage volume failure)
Planned (instance scaling, patching)
1-2 minute average failover time
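Since failover takes a minute or two on average, an application may want to retry its connection for a little longer than that. The following is a minimal sketch, not part of Amazon RDS: `connect_with_retry` and its parameters are hypothetical, `connect` stands in for whatever function opens the database connection, and the 180-second budget is an assumed margin over the stated average.

```python
import time


def connect_with_retry(connect, max_wait_seconds=180.0, initial_delay=1.0,
                       max_delay=15.0, sleep=time.sleep):
    """Call connect() until it succeeds or the wait budget is used up."""
    waited = 0.0
    delay = initial_delay
    while True:
        try:
            return connect()
        except ConnectionError:
            # Give up once the next pause would exceed the budget.
            if waited + delay > max_wait_seconds:
                raise
            sleep(delay)
            waited += delay
            # Exponential backoff, capped so retries stay reasonably frequent.
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting; in normal use the default `time.sleep` applies.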