AWS Summit 2011: High Availability Database Architectures in AWS Cloud


Published in: Technology


  1. High Availability Database Architecture. Ray Bradford, Principal Product Manager
  2. Overview: General High Availability (HA) Database Principles; HA Database Options on AWS; Amazon Relational Database Service
  3. Architecting for Availability and Durability: Logical Corruption; Compute Node Failure; Storage Failure; Network Disruption; Data Center or Availability Zone Disruption; Database Maintenance (Scaling, Patching)
  4. Technologies for HA - Backups: Logical corruption; Restore operations after storage volume or AZ disruption
  5. Technologies for HA - Replication: Faster than reboot in case of failure. Don't Forget: Multi-AZ architecture. Don't Forget: Backups still have value. Advanced: replication/failover for maintenance (make changes to replica, fail over to it; practice). [Diagram: two Availability Zones]
  6. Tale of Two Companies. Company A: Single-AZ; Nightly Backup Only. Company B: Multi-AZ; Backups from Standby; Maintenance on Standby; Online DDL on Replica
  7. Replication Options: Synchronous vs. Asynchronous; Logical vs. Physical
  8. Asynchronous vs. Synchronous Replication. Asynchronous Replication: write is acknowledged as soon as it is written to local storage; lower durability than synchronous; replica can fall behind on shipping. Synchronous Replication: write is not committed until it is written on both replicas; highest durability; higher transaction latency
  9. Asynchronous: insert into person values ('ray'); commit; [Diagram: Primary (Replication Log) and Secondary (Relay Log), row 'ray'; labels: DURABLE in ONE LOCATION, DURABLE in TWO LOCATIONS, ACK, ACK]
  10. Synchronous: insert into person values ('ray'); commit; [Diagram: Primary (Replication Log) and Secondary (Relay Log), row 'ray'; labels: ACK, DURABLE in TWO LOCATIONS, ACK]
  11. Logical vs. Physical. Logical: standard MySQL replication; the logical statement or transaction is shipped; with MySQL, enables operations on standby (reads, DDL). Physical: ships the physical block changes (Oracle Data Guard; filesystem or block layer replication; physical SAN device replication); may not allow operations on standby
  12. Logical Replication: insert into person values ('ray'); commit; [Diagram labels: Primary, Replica, Single Threaded, Parse, Buffer, Replication Log, Relay Log, Recovery Log, Data]
  13. Physical Replication: insert into person values ('grant'); commit; [Diagram labels: Primary, Replica, Parse, Buffer, Replication Log, Relay Log, Recovery Log, Data]
  14. Popular Database Engines on EC2: Native, asynchronous; (Semi)-synchronous; Data Guard, RMAN, Oracle Secure Backup Cloud Module; Oracle Database Machine Image (AMI); Log shipping and DB mirroring
  15. What is Amazon RDS: Managed Relational Database Service. Goals: make it easy to set up, operate, and scale relational databases in the cloud; maintain compatibility with applications and tools. Supported Engines: MySQL, Oracle
  16. Amazon RDS Features: Pre-configured deployments in minutes with console; DB Instance type scaling (CPU, memory); Online storage scaling; Amazon CloudWatch integration; Automatic software patching with user control; Backups; Replication
  17. Amazon RDS Backups. Automated Backups: nightly system snapshots + transaction backup; enables point-in-time restore to any point in retention period, up to the last 5 minutes; max retention period = 8 days. DB Snapshots: user-driven snapshots of database; kept until explicitly deleted
  18. Multi-AZ Deployments for Amazon RDS: Managed fault-tolerance solution for production databases. What is a Multi-AZ deployment? Amazon RDS creates and maintains a hot standby in a different Availability Zone; in the event of an unplanned or planned outage, Amazon RDS automatically fails over to the standby. [Diagram label: MySQL DB Instance]
  19. Things to know about Multi-AZ. Replication Type: Synchronous - designed for high data durability; Physical - standby cannot be accessed directly. What events result in automatic failover? Unplanned (instance failure, storage volume failure); Planned (instance scaling, patching). 1-2 minute average failover time
  20. Multi-AZ Deployments, Read Replicas: Highly Available, Durable, & Scalable MySQL Deployments
  21. THANK YOU
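
The contrast in the "Asynchronous vs. Synchronous Replication" slide can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `ReplicatedPair`, `commit`, and `ship` are hypothetical names invented for illustration, and the code models just the acknowledgement rule from the slides, not any real engine's replication protocol.

```python
from collections import deque


class Node:
    """Toy database node that 'durably' stores rows in a list."""

    def __init__(self, name):
        self.name = name
        self.rows = []

    def write(self, row):
        self.rows.append(row)


class ReplicatedPair:
    """Hypothetical primary/secondary pair (illustration only)."""

    def __init__(self, synchronous):
        self.primary = Node("primary")
        self.secondary = Node("secondary")
        self.synchronous = synchronous
        self.unshipped = deque()  # changes the secondary has not applied yet

    def commit(self, row):
        """Write a row; return how many locations hold it at ACK time."""
        self.primary.write(row)
        if self.synchronous:
            # Synchronous: do not ACK until the secondary has the row too.
            self.secondary.write(row)
            return 2
        # Asynchronous: ACK after the local write and ship the change later.
        self.unshipped.append(row)
        return 1

    def ship(self):
        """Apply queued changes on the secondary (asynchronous catch-up)."""
        while self.unshipped:
            self.secondary.write(self.unshipped.popleft())


async_pair = ReplicatedPair(synchronous=False)
sync_pair = ReplicatedPair(synchronous=True)
print(async_pair.commit("ray"), async_pair.secondary.rows)  # 1 []
print(sync_pair.commit("ray"), sync_pair.secondary.rows)    # 2 ['ray']
```

In the asynchronous case the row reaches the secondary only after `ship()` runs, which is the window in which a primary failure could lose an acknowledged write.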
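
The point-in-time restore window described in the "Amazon RDS Backups" slide can be worked through as simple date arithmetic. The sketch below assumes the two figures quoted on the slide (8-day maximum retention, restore up to the last 5 minutes); the function names are hypothetical, and in practice the window also depends on the retention period actually configured and on when automated backups were first enabled.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=8)        # maximum retention period quoted on the slide
RESTORE_LAG = timedelta(minutes=5)   # "up to the last 5 minutes"


def restorable_window(now, retention=RETENTION, lag=RESTORE_LAG):
    """Return (earliest, latest) restore targets under the slide's figures."""
    return now - retention, now - lag


def can_restore_to(target, now, retention=RETENTION, lag=RESTORE_LAG):
    """True if target falls inside the point-in-time restore window."""
    earliest, latest = restorable_window(now, retention, lag)
    return earliest <= target <= latest


now = datetime(2011, 6, 10, 12, 0)
print(can_restore_to(now - timedelta(minutes=10), now))  # True
print(can_restore_to(now - timedelta(minutes=1), now))   # False
print(can_restore_to(now - timedelta(days=9), now))      # False
```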
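
The "Things to know about Multi-AZ" slide quotes a 1-2 minute average failover time, during which connections to the DB instance fail. One way an application might ride through that window is to retry with backoff. This is a minimal sketch, not something the slides prescribe: `connect_with_retry` and its parameters are hypothetical, and `connect` stands in for whatever function opens a connection in a real driver.

```python
import time


def connect_with_retry(connect, max_wait_seconds=180.0, first_delay=1.0,
                       max_delay=15.0, sleep=time.sleep):
    """Call connect() until it succeeds or the total wait budget is spent.

    The 180-second default budget is an assumption chosen to exceed the
    1-2 minute average failover time quoted on the slide.
    """
    delay = first_delay
    waited = 0.0
    while True:
        try:
            return connect()
        except ConnectionError:
            if waited >= max_wait_seconds:
                raise  # budget exhausted: surface the last error
            sleep(delay)
            waited += delay
            delay = min(delay * 2, max_delay)  # exponential backoff, capped


# Demo with a fake connection that fails three times, as if mid-failover.
attempts = {"n": 0}


def flaky_connect():
    attempts["n"] += 1
    if attempts["n"] <= 3:
        raise ConnectionError("standby not promoted yet")
    return "connected"


delays = []
result = connect_with_retry(flaky_connect, sleep=delays.append)
print(result, delays)  # connected [1.0, 2.0, 4.0]
```

Passing `sleep=delays.append` in the demo records the backoff delays instead of actually waiting, which keeps the example fast to run.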