
Getting Started with Amazon Aurora



Amazon Aurora is a MySQL-compatible database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. This session introduces you to Amazon Aurora, explains common use cases for the service, and helps you get started with building your first Amazon Aurora–powered application.

Published in: Technology

  1. 1. Getting Started with Amazon Aurora. Dave Lang, Sr. Product Manager, Amazon Aurora; Puneet Agarwal, Solutions Architect, AWS; Jeffrey Schmidt, Sr. Performance Engineer, Pearson. April 19, 2016. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. 2. Meet Amazon Aurora: databases reimagined for the cloud • Speed and availability of high-end commercial databases • Simplicity and cost-effectiveness of open source databases • Drop-in compatibility with MySQL • Simple pay-as-you-go pricing Delivered as a managed service
  3. 3. Not much has changed in the last 30 years. Even when you scale it out, you’re still replicating the same stack. [Diagram: applications on top of repeated monolithic stacks of SQL, transactions, caching, and logging, with storage underneath]
  4. 4. Reimagining the relational database What if you were inventing the database today? You wouldn’t design it the way we did in 1970. You’d build something • that can scale out … • that is self-healing … • that leverages existing AWS services …
  5. 5. A service-oriented architecture applied to the database (1) Moved the logging and storage layer into a multi-tenant, scale-out, database-optimized storage service (2) Integrated with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control plane operations (3) Integrated with Amazon S3 for continuous backup with 99.999999999% durability [Diagram: data plane (SQL, transactions, caching, logging + storage), control plane (Amazon DynamoDB, Amazon SWF, Amazon Route 53), and Amazon S3]
  6. 6. Rapid adoption of Amazon Aurora
  7. 7. Fastest growing service in AWS history Aurora customer adoption
  8. 8. Expedia: Online travel marketplace. World’s leading online travel company, with a portfolio that includes 150+ travel sites in 70 countries. • Real-time business intelligence and analytics on a growing corpus of online travel marketplace data. • Current Microsoft SQL Server–based architecture is too expensive. Performance degrades as data volume grows. • Cassandra with Solr index requires large memory footprint and hundreds of nodes, adding cost. Aurora benefits: • Aurora meets scale and performance requirements with much lower cost. • 25,000 inserts/sec with peak up to 70,000. 30 ms average response time for write and 17 ms for read, with 1 month of data.
  9. 9. Higher performance, lower cost • [Customer name missing in transcript] lowered their bill by 40% by switching from sharded MySQL to a single Aurora instance. • Double Down Interactive (gaming) lowered their bill by 67% while also achieving better latencies (most queries ran faster) and lower CPU utilization. Aurora benefits: • Due to high performance and large storage support, sharded MySQL instances can be consolidated in fewer Aurora instances. • High performance allows for smaller instances. • Automated storage provisioning removes the need for storage “headroom.” • No additional storage for Read Replicas.
  10. 10. “When we ran Alfresco’s workload on Aurora, we were blown away to find that Aurora was 10x faster than our MySQL environment,” said John Newton, founder and CTO of Alfresco. “Speed matters in our business, and Aurora has been faster, cheaper, and considerably easier to use than MySQL.” Amazon Aurora is fast
  11. 11. SQL benchmark results • MySQL SysBench • R3.8XL with 32 cores and 244 GB RAM • Write performance: four client machines with 1,000 threads each • Read performance: single client with 1,000 threads
  12. 12. Consistent performance as table count increases: up to 11x faster • Write-only workload • 1,000 connections • Query cache (default on for Amazon Aurora, off for MySQL) • i2.8XL instances for MySQL SSD and RAM • r3.8XL instances for Aurora and Amazon RDS MySQL
  13. 13. Writes scale with connection count: up to 8x faster • OLTP workload • Variable connection count • 250 tables • Query cache (default on for Amazon Aurora, off for MySQL)
  14. 14. Consistent performance with growing dataset: up to 67x faster • SysBench write-only workload • Aurora r3.8XL • Amazon RDS MySQL r3.8XL with 30K IOPS (single AZ)
  15. 15. How do we achieve these results? DO LESS WORK: do fewer I/Os, minimize network packets, cache prior results, offload the database engine. BE MORE EFFICIENT: process asynchronously, reduce latency path, use lock-free data structures, batch operations together.
  16. 16. Aurora requires fewer I/Os [Diagram, MySQL with standby: primary instance in AZ 1 and standby instance in AZ 2, each writing sequentially to Amazon Elastic Block Store (EBS) and an EBS mirror, with Amazon S3. Diagram, Amazon Aurora: primary instance and replica instance across AZ 1, AZ 2, and AZ 3, with asynchronous, distributed writes under a 4/6 quorum, and Amazon S3. Types of writes shown: log records, binlog, data, double-write buffer, FRM files and metadata]
  17. 17. Amazon Aurora is highly available
  18. 18. Amazon Aurora is highly available Highly available storage • Six copies of data across three AZs • Latency-tolerant quorum system for read/write • Up to 15 replicas with low replication lag Survivable caches • Cache remains warm in the event of a database restart • Lets you resume fully loaded operations much faster Instant crash recovery • Underlying storage replays redo records on demand as part of a disk read • Parallel, distributed, asynchronous
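The quorum idea on this slide, together with the "4/6 quorum" label on the I/O slide, can be illustrated with a small model. This is only a sketch, not Aurora's implementation: it assumes the six copies are spread evenly at two per Availability Zone, and the names `healthy_copies` and `write_succeeds` are made up for the illustration.

```python
from collections.abc import Iterable

AZS = ("az-1", "az-2", "az-3")   # three Availability Zones, as on the slide
COPIES_PER_AZ = 2                # assumption: six copies spread evenly
WRITE_QUORUM = 4                 # the "4/6 quorum" from the I/O slide


def healthy_copies(failed_azs: Iterable[str] = (), failed_copies: int = 0) -> int:
    """Count storage copies still reachable after the given failures."""
    down = set(failed_azs)
    surviving = sum(COPIES_PER_AZ for az in AZS if az not in down)
    return max(surviving - failed_copies, 0)


def write_succeeds(copies_reachable: int) -> bool:
    """A write is acknowledged once a quorum of copies has accepted it."""
    return copies_reachable >= WRITE_QUORUM


if __name__ == "__main__":
    print(write_succeeds(healthy_copies()))                           # all six copies up
    print(write_succeeds(healthy_copies(["az-1"])))                   # one whole AZ lost
    print(write_succeeds(healthy_copies(["az-1"], failed_copies=1)))  # AZ plus one copy
```

Under these assumptions, losing one whole AZ leaves four copies, which still meets the write quorum; losing an AZ plus one more copy does not.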
  19. 19. Faster, more predictable failover [Timeline diagram: after a DB failure, failure detection, DNS propagation, and recovery precede the app running again. RDS MySQL: 15–30 sec. Aurora with MariaDB driver: 5–20 sec]
  20. 20. Amazon Aurora is easy to use “Amazon Aurora’s new user-friendly monitoring interface made it easy to diagnose and address issues. Its performance, reliability, and monitoring really shows Amazon Aurora is an enterprise-grade AWS database.” —Mohamad Reza, information systems officer at United Nations
  21. 21. Simplify storage management • Automatic storage scaling up to 64 TB in 10 GB increments, with no performance impact • Continuous, incremental backups to Amazon S3 • Instantly create user snapshots, with no performance impact • Automatic restriping, mirror repair, hot spot management, encryption
  22. 22. Simplify monitoring with AWS Management Console Amazon CloudWatch metrics for RDS • CPU utilization • Storage • Memory • 50+ system/OS metrics • 1–60 second granularity • DB connections • Selects per second • Latency (read and write) • Cache hit ratio • Replica lag CloudWatch alarms • Similar to on-premises custom monitoring tools
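As one way to act on the CloudWatch metrics and alarms listed on this slide, the sketch below builds the arguments for a CPU utilization alarm. The parameter names are those of boto3's CloudWatch `put_metric_alarm` call; the helper name `cpu_alarm_params`, the alarm naming scheme, the 80% threshold, and the five-minute evaluation window are illustrative assumptions, not recommendations from the talk.

```python
def cpu_alarm_params(instance_id: str, threshold_pct: float = 80.0) -> dict:
    """Build keyword arguments for a CloudWatch alarm on RDS CPU utilization.

    The threshold and evaluation window are placeholder values (assumptions).
    """
    return {
        "AlarmName": f"{instance_id}-high-cpu",   # hypothetical naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 60,               # seconds per datapoint
        "EvaluationPeriods": 5,     # alarm after 5 consecutive breaching periods
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**cpu_alarm_params("my-aurora-writer"))
```

Keeping the parameters in a plain function makes them easy to review and unit test before any call reaches AWS.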
  23. 23. Simplify data security • Encryption to secure data at rest: AES-256, hardware accelerated; all blocks on disk and in Amazon S3 are encrypted; key management by using AWS KMS • SSL to secure data in transit • Network isolation by using Amazon VPC by default • No direct access to nodes • Supports industry standard security and data protection certifications
  24. 24. Delivered as a managed service
  25. 25. If you host your databases on-premises Power, HVAC, net Rack and stack Server maintenance OS patches DB software patches Database backups Scaling High availability DB software installs OS installation you App optimization
  26. 26. If you host your databases in Amazon EC2 Power, HVAC, net Rack and stack Server maintenance OS patches DB software patches Database backups Scaling High availability DB software installs OS installation you App optimization
  27. 27. If you choose Amazon RDS Power, HVAC, net Rack and stack Server maintenance OS patches DB software patches Database backups App optimization High availability DB software installs OS installation you Scaling
  28. 28. Amazon Aurora saves you money
  29. 29. Simple pricing: enterprise grade, open source pricing • No licenses • No lock-in • Pay only for what you use • Discounts: 44% with a 1-year Reserved Instance, 63% with a 3-year Reserved Instance • Hourly price by instance class (vCPU, memory in GB): db.r3.large (2, 15.25) $0.29; db.r3.xlarge (4, 30.5) $0.58; db.r3.2xlarge (8, 61) $1.16; db.r3.4xlarge (16, 122) $2.32; db.r3.8xlarge (32, 244) $4.64 • Storage consumed, up to 64 TB, is $0.10/GB-month • I/Os consumed are billed at $0.20 per million I/Os • Prices are for the US East (N. Virginia) region
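The three pricing components on this slide (instance hours, storage, and I/O) combine into a monthly estimate along the following lines. This is a rough sketch using the 2016 US East prices quoted above; the function name `monthly_cost` and the 730-hour month are assumptions, and Reserved Instance discounts are ignored.

```python
# Prices as quoted on the slide (US East, N. Virginia, 2016).
HOURLY_PRICE_USD = {
    "db.r3.large": 0.29,
    "db.r3.xlarge": 0.58,
    "db.r3.2xlarge": 1.16,
    "db.r3.4xlarge": 2.32,
    "db.r3.8xlarge": 4.64,
}
STORAGE_USD_PER_GB_MONTH = 0.10
IO_USD_PER_MILLION = 0.20


def monthly_cost(instance_class: str, instances: int, storage_gb: float,
                 io_requests: int, hours: float = 730) -> float:
    """Estimate one month's bill; 730 hours is an assumed average month."""
    compute = HOURLY_PRICE_USD[instance_class] * instances * hours
    storage = storage_gb * STORAGE_USD_PER_GB_MONTH
    io = io_requests / 1_000_000 * IO_USD_PER_MILLION
    return round(compute + storage + io, 2)


if __name__ == "__main__":
    # One db.r3.large, 100 GB stored, 50 million I/Os in the month.
    print(monthly_cost("db.r3.large", 1, 100, 50_000_000))
```

Because storage and I/O are billed on consumption, only the instance term is fixed in advance; the other two inputs have to come from measurement or estimates.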
  30. 30. Cost of ownership: Aurora vs. MySQL. MySQL configuration hourly cost • Instances: primary, standby, and two replicas, each r3.8XL at $3.78/hr • Storage: four 6 TB volumes, two at 10K PIOPS and two at 5K PIOPS, each listed at $2.42/hr • Instance cost: $15.12/hr • Storage cost: $8.30/hr • Total cost: $23.42/hr
  31. 31. Cost of ownership: Aurora vs. MySQL. Aurora configuration hourly cost • Instances: primary and two replicas, each r3.8XL at $4.64/hr • Storage: one shared 6 TB volume at $4.43/hr • Instance cost: $13.92/hr • Storage cost: $4.43/hr • Total cost: $18.35/hr • 21.6% savings • No idle standby instance • Single shared storage volume • No PIOPS: pay for the I/O you use • Reduction in overall IOPS *At a macro level, Aurora saves over 50% in storage cost compared to RDS MySQL.
  32. 32. Cost of ownership: Aurora vs. MySQL. Further opportunity for saving • Instances: primary and two replicas, downsized from r3.8XL to r3.4XL at $2.32/hr each • Storage: one shared 6 TB volume at $4.43/hr • Instance cost: $6.96/hr • Storage cost: $4.43/hr • Total cost: $11.39/hr • 51.3% savings • Use smaller instance size • Pay-as-you-go storage. Storage IOPS assumptions: 1. Average IOPS is 50% of maximum IOPS 2. 50% savings from shipping logs vs. full pages
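The savings percentages on the three cost-of-ownership slides can be reproduced from the hourly totals they state. The sketch below uses only the instance prices and the storage totals given on the slides; the helper name `savings_pct` is made up for the illustration.

```python
def savings_pct(baseline: float, alternative: float) -> float:
    """Percentage saved by moving from baseline to alternative."""
    return (baseline - alternative) / baseline * 100


# Hourly totals from the slides: instance count x hourly price + storage total.
mysql_total = 4 * 3.78 + 8.30          # four r3.8XL instances plus PIOPS storage
aurora_total = 3 * 4.64 + 4.43         # three r3.8XL instances plus shared storage
aurora_small_total = 3 * 2.32 + 4.43   # same cluster on r3.4XL instances

if __name__ == "__main__":
    print(f"MySQL:  ${mysql_total:.2f}/hr")
    print(f"Aurora: ${aurora_total:.2f}/hr, "
          f"{savings_pct(mysql_total, aurora_total):.2f}% lower")
    print(f"Aurora on r3.4XL: ${aurora_small_total:.2f}/hr, "
          f"{savings_pct(mysql_total, aurora_small_total):.2f}% lower")
```

The results come out within a tenth of a percentage point of the 21.6% and 51.3% shown on the slides. Note that the storage totals already include the slides' IOPS assumptions, so the comparison is only as good as those assumptions.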
  33. 33. Migration to Aurora is easy
  34. 34. Start your first migration in 10 minutes or less Keep your apps running during the migration Replicate within, to, or from Amazon EC2 or RDS Move data to the same or different database engine AWS Database Migration Service
  35. 35. Customer premises Application users AWS Internet VPN Start a replication instance Connect to source and target databases Select tables, schemas, or databases Let AWS Database Migration Service create tables, load data, and keep them in sync Switch applications over to the target at your convenience Keep your apps running during the migration AWS Database Migration Service
  36. 36. Migrate from Oracle and SQL Server Move your tables, views, stored procedures, and data manipulation language (DML) to MySQL, MariaDB, and Amazon Aurora Know exactly where manual edits are needed Download at AWS Schema Conversion Tool
  37. 37. Know exactly where manual edits are needed
  38. 38. Pearson Assessments and Amazon Aurora. Jeffrey Schmidt, Senior Performance Engineer, Pearson. April 19, 2016.
  39. 39. Pearson Assessments—core business • Computer-based assessments • K-12 students, college prep, clinical, etc. • Often high-stakes • A typical platform will often contain… o A customer-facing product to take tests on o A customer-facing product to register students, administer tests, and view reports o A back-end product to score tests and generate reports
  40. 40. Context • We’ve been using MySQL heavily for 5 years o Percona’s XtraDB-based products are what we’ve used most frequently • Deployed first products into AWS in 2012–2013 • In 2012–2013, RDS MySQL was inadequate for our use cases • Currently manage (lots, but I’m not allowed to say how many) MySQL servers deployed in AWS, mostly on traditional EC2 instances • Most product/customer combinations are deployed on separate/isolated application and database servers. Each deployment only needs to perform “fast enough” for the product/customer. • We’ve been testing Aurora since day 1 of the Aurora beta • We aim for entirely open-source tech stacks whenever possible
  41. 41. Database needs and priorities • ACID • Durability above all else. We can’t lose data. • Failover with no data loss (if at all avoidable) o Better to be down longer and lose no data than to recover quickly and lose data • “Fast enough” performance
  42. 42. Aurora alternatives: MySQL master/slave Pros • Simple • Well supported/documented • “Fast enough” for many use cases Cons • Replication is asynchronous • Heavy write traffic on the master can cause replication lag • Failover to the slave can result in data loss, so failover only occurs when the master is unrecoverable • The slave can fall out of sync with the master. It’s rare, but it can happen.
  43. 43. Aurora alternatives: Galera-based products Pros • Synchronous replication • Highly available Cons • Cost (need three or more servers for minimum footprint) • Difficult to manage at scale • Slow writes due to network latency (must commit on all nodes, preferably in 3+ AWS Availability Zones) • High volumes of small write transactions mean a performance loss of 50% or more compared to master/slave • Deadlocks if writing to all nodes, so must effectively use one node as a read/write node and the other nodes as read-only/failover • Must avoid large transactions, which isn’t always feasible
  44. 44. Aurora alternatives: MySQL RDS Pros • Don’t have to manage the servers ourselves • Failover is easier than DIY MySQL master/slave • Infrastructure changes are easier than DIY MySQL • RDS servers keep up to date with OS patches and MySQL Community Edition patches Cons • Same cons as MySQL master/slave • 2 TB file size limit • 6 TB total DB size limit • Lack of thread pooling hurts performance on platforms that require many connections to the DB
  45. 45. Aurora alternatives: Other topologies • MySQL master with a hot standby using DRBD • MySQL semi-synchronous replication We ultimately decided that veering too far outside of mainstream MySQL patterns does more harm than good.
  46. 46. Aurora—why we’re moving towards it: Technology • Failover is “quick enough” and reliable o MariaDB driver has excellent failover support • Performance is “fast enough” • “Readers” don’t slow down the “writer” • Technology ensures that failover will not result in data loss • Full compatibility with MySQL Community Edition means that we can back out of Aurora at any time o This is important for us, as some contracts may not allow us to use AWS
  47. 47. Aurora—why we’re moving towards it: Logistics • Don’t have to manage the servers anymore • Don’t have to plan or manage disk space allocation anymore • Don’t have to develop, maintain, or test automation code for deploying and managing databases and database backups • No more babysitting traditional MySQL asynchronous replication • Adding a new “reader” node is easy
  48. 48. Aurora advice • Use the MariaDB Connector/J driver • The primary difference between the MariaDB Connector/J driver and the MySQL Connector/J driver is that MariaDB Connector/J has better failover support for Aurora • Test Test Test • Don’t do your dev, test, and QA work solely against RDS MySQL (or some other flavor of MySQL). Test your app on Aurora before you deploy it to production. • Fail over Aurora regularly in your QA environment. In production, you want to make sure that a failover doesn’t cause unexpected issues. • Don’t send all read-only queries out to replica nodes • Recommended reader use cases: reports, CPU-heavy queries, asynchronous tasks, analytics
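For the advice to fail over Aurora regularly in QA, one option is to script the failover so it can run on a schedule. The sketch below wraps boto3's RDS `failover_db_cluster` call; the wrapper `trigger_failover`, the `RecordingClient` stub, and the identifiers are hypothetical and exist only so the example runs without AWS access.

```python
class RecordingClient:
    """Hypothetical stand-in for boto3.client("rds"); it only records calls."""

    def __init__(self):
        self.calls = []

    def failover_db_cluster(self, **kwargs):
        self.calls.append(kwargs)
        return {"DBCluster": {"DBClusterIdentifier": kwargs["DBClusterIdentifier"]}}


def trigger_failover(rds_client, cluster_id, target_instance_id=None):
    """Ask RDS to fail the cluster over, optionally to a specific replica."""
    kwargs = {"DBClusterIdentifier": cluster_id}
    if target_instance_id:
        kwargs["TargetDBInstanceIdentifier"] = target_instance_id
    return rds_client.failover_db_cluster(**kwargs)


if __name__ == "__main__":
    # Dry run against the stub. In a QA environment, and assuming boto3 is
    # installed and credentials are configured, pass boto3.client("rds").
    client = RecordingClient()
    trigger_failover(client, "qa-aurora-cluster", "qa-aurora-replica-1")
    print(client.calls)
```

Passing the client in as an argument keeps the drill testable offline and makes it harder to point it at a production cluster by accident.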
  49. 49. Aurora advice: performance • Don’t enable binary logging (unless you need it) • Aurora replicas do not require binary logging to function • Binary logging can degrade performance in Aurora • Only enable if you are planning on having a traditional MySQL asynchronous slave attached to an Aurora cluster • Enable query cache • Query cache isn’t performance poison on Aurora like it can be in MySQL Community Edition. Aurora’s default behavior is to have query cache enabled and many applications will benefit from query cache. • Aurora won’t fix a bad query • Untuned SQL will oftentimes consume as much CPU on Aurora as on MySQL Community Edition • On writer nodes, CPU can only scale so much …
  50. 50. Feature requests Here’s some stuff that we wish Aurora could do. If enough of us ask Amazon really nicely... • Smaller and cheaper instance types for dev and test? • Transaction isolation on replicas is currently locked at REPEATABLE READ. We wish we could set a replica to another isolation level, like READ COMMITTED. • Right now, all replicas are failover candidates. We’d like a replica that we can mark as “do not fail over to.” • Aurora clusters have a DNS endpoint that always points to the writer. Can we get one that points to the reader(s)? • Copy a snapshot to another region? • Copy Aurora database cluster parameter groups?
  51. 51. Special thanks to… Our AWS solutions architects, technical account managers, and the Aurora engineering team. Thanks!