
Beyond Aurora. Scale-out SQL databases for AWS

As enterprises move to AWS, they have strong choices for MySQL-compatible databases, and knowing the best database for the specific job can save time and money. In this webinar, Lokesh Khosla discusses high-performance databases for AWS and shares findings from a benchmark test that simulates the workload of a high-transaction AWS-based solution.

If you run high-transaction workloads and need a relational database to keep track of economically valuable items such as revenue, inventory, and monetary transactions, you'll be interested in this discussion of the strengths and weaknesses of Aurora and other MySQL solutions for AWS.


  1. Beyond Aurora: scale-out SQL databases for AWS. © 2015 Clustrix, the first scale-out SQL database engineered for today's cloud.
  2. Magnus Data, scale-out RDBMS. 3/4/2016.
  3. Agenda
     - Database market landscape
     - Options to scale a DBMS
     - Scale-out architecture
     - Comparison of solutions for high-transaction relational databases
  4. Generalized and Specialized
     [Diagram: database categories mapped to workloads. Traditional databases, NoSQL, DW/analytical DBMSs, Hadoop, and operational/OLTP "NewSQL" systems are positioned against transactional analytics, historical analytics, exploratory work, and high-concurrency / write-heavy / real-time analytics.]
  5. Scale-Up vs. Scale-Out
     [Chart: latency vs. transactions per second, contrasting scale-up databases (like Aurora and MySQL) with scale-out databases.]
  6. RDBMS Scaling Techniques
     - Scale-up
     - Master-slave
     - Master-master
     - MySQL clustering technologies
     - Sharding
     - Scale-out
  7. Options to Scale a DBMS
     - Scale-up (e.g., Aurora): reads scale, but write scalability is limited; not a shared-nothing scale-out.
     - Scale-out, NoSQL (e.g., MongoDB): no transactions; may have weak consistency (CAP); pushes database logic into application code.
     - Scale-out, shared-nothing SQL (e.g., ClustrixDB): ACID, with proven scalability for both reads and writes.
  8. Scaling Up
     - Keep increasing the size of the (single) database server
     - Pros
       - Simple; no application changes needed
     - Cons
       - Expensive: at some point you're paying 5x for 2x the performance
       - 'Exotic' hardware (128 cores and above) becomes price-prohibitive
       - Eventually you 'hit the wall' and literally cannot scale up any further
  9. Scaling Reads: Master/Slave
     - Add one or more 'slave' read servers to your 'master' database server
     - Pros
       - Reasonably simple to implement
       - Read/write fan-out can be done at the proxy level (see the routing sketch below)
     - Cons
       - Only adds read performance
       - Data consistency issues can occur, especially if the application isn't coded to ensure reads from a slave are consistent with reads from the master
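To make the proxy-level fan-out concrete, here is a minimal Python sketch of a read/write router. The `master` and `slaves` connection objects are assumed to be ordinary DB-API connections, and the SELECT-prefix test is a deliberate simplification of what a real proxy does.

```python
import itertools

class ReadWriteRouter:
    """Send writes to the master; rotate reads across slave replicas."""

    def __init__(self, master, slaves):
        self.master = master                    # hypothetical DB-API connection
        self.slaves = itertools.cycle(slaves)   # round-robin over read replicas

    def execute(self, sql, params=()):
        # Crude classification: anything that isn't a SELECT goes to the master.
        is_read = sql.lstrip().upper().startswith("SELECT")
        conn = next(self.slaves) if is_read else self.master
        cur = conn.cursor()
        cur.execute(sql, params)
        return cur
```

Note the slide's consistency caveat: a slave can lag the master, so any read that must observe the application's own writes should be pinned to the master explicitly.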
  10. Scaling Writes: Master/Master
     - Add one or more additional 'master' servers alongside your 'master' database server
     - Pros
       - Adds write scaling without needing to shard
     - Cons
       - Adds write scaling at the cost of read slaves
       - Adding read slaves would add even more latency
       - Application changes are required to ensure data consistency / conflict resolution (one common strategy is sketched below)
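The deck only notes that conflict resolution falls to the application; one common (and lossy) strategy is last-write-wins, sketched below with hypothetical row dictionaries carrying an `updated_at` timestamp.

```python
def last_write_wins(local_row, remote_row):
    # Keep whichever version carries the newer timestamp. This silently
    # discards one side of a concurrent update, which is exactly the kind
    # of application-level burden the slide warns about.
    return local_row if local_row["updated_at"] >= remote_row["updated_at"] else remote_row

a = {"id": 1, "qty": 5, "updated_at": 1700000010}
b = {"id": 1, "qty": 7, "updated_at": 1700000020}
print(last_write_wins(a, b))   # keeps the newer row with qty=7
```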
  11. Scaling Reads & Writes: Sharding
     - Partition tables across separate database servers, e.g., shard 1 holds A-K, shard 2 holds L-O, shard 3 holds P-S, shard 4 holds T-Z
     - Pros
       - Adds both write and read scaling
     - Cons
       - Loses the RDBMS's ability to manage transactionality, referential integrity, and ACID
       - ACID compliance and transactionality must be managed at the application level
       - Consistent backups across all the shards are very hard to manage
       - Reads and writes can be skewed / unbalanced
       - Application changes can be significant (a routing sketch follows below)
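A sketch of the range routing implied by the slide's A-K / L-O / P-S / T-Z split; the shard names are invented. Anything touching two shards (a join, a transaction, a backup) now has to be coordinated by the application, which is the cost the cons list describes.

```python
import bisect

UPPER_BOUNDS = ["K", "O", "S", "Z"]                      # inclusive upper bound per shard
SHARDS = ["shard01", "shard02", "shard03", "shard04"]    # hypothetical shard names

def shard_for(key: str) -> str:
    # Route on the first letter of the key, matching the slide's ranges.
    return SHARDS[bisect.bisect_left(UPPER_BOUNDS, key[0].upper())]

assert shard_for("alice") == "shard01"
assert shard_for("miller") == "shard02"
assert shard_for("zoe") == "shard04"
```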
  12. Scaling Reads & Writes: MySQL Cluster
     - Provides shared-nothing clustering and auto-sharding for MySQL (designed for telco deployments: minimal cross-node transactions, emphasis on HA)
     - Pros
       - Distributed, multi-master model
       - Provides high availability and high throughput
     - Cons
       - Only supports read-committed isolation
       - Long-running transactions can block a node restart
       - Statement-based replication (SBR) is not supported
       - Range scans are expensive, with lower performance than stock MySQL
       - Unclear how it scales with many nodes
  13. Application Workload Partitioning
     - Partition the entire application + RDBMS stack across several 'pods'
     - Pros
       - Adds both write and read scaling
       - Flexible: can keep scaling by adding pods
     - Cons
       - No data consistency across pods (only suited to cases where none is needed)
       - High overhead in DBMS maintenance and upgrades
       - Queries / reports across all pods can be very complex
       - Complex environment to set up and support (a tenant-routing sketch follows below)
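A minimal sketch of how such a scheme typically pins a tenant to one pod; the pod names and hash scheme are assumptions, not from the deck. Because a tenant's data never leaves its pod, cross-pod reporting needs a separate aggregation path, as the cons list notes.

```python
import hashlib

PODS = ["pod-1", "pod-2", "pod-3"]   # each pod = a full app + RDBMS stack

def pod_for_tenant(tenant_id: str) -> str:
    # Stable hash so a given tenant always lands on the same pod.
    digest = hashlib.sha256(tenant_id.encode()).digest()
    return PODS[int.from_bytes(digest[:4], "big") % len(PODS)]

print(pod_for_tenant("acme-corp"))
```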
  14. DBMS Capacity, Elasticity, and Resiliency

      Scaling          Capacity                    Resiliency                  Elasticity  Application Impact
      Scale-up         Many cores: very expensive  Single point of failure     No          None
      Master-slave     Reads only                  Fail-over                   No          Yes, for read scale
      Master-master    Read / write                Yes                         No          High: update conflicts
      MySQL Cluster    Read / write                Yes                         No          None (or minor)
      Sharding         Unbalanced reads/writes     Multiple points of failure  No          Very high
      Scale-out DBMS   Read / write                Yes                         Yes         None
  15. DBMS Architecture: Scale-Out (Shared-Nothing)
     - Each node contains:
       - Query parser/planner: distributes partial query fragments to the nodes
       - Data map: every node holds metadata about where data lives across the cluster
       - Database engine: every node can perform all database operations (no leader, aggregator, leaf, or data-only nodes)
       - Data: tables are automatically redistributed across the cluster (a data-map lookup is sketched below)
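A toy illustration of the data-map idea: every node carries the same (table, slice) to owner mapping, so any peer can plan a query and route fragments. The map contents and the hash-to-slice scheme are invented for the example.

```python
# Shared metadata: which node owns each slice of each table (illustrative).
DATA_MAP = {
    ("accounts", 0): "node1",
    ("accounts", 1): "node2",
    ("accounts", 2): "node3",
}

def owning_node(table: str, key: int, num_slices: int = 3) -> str:
    # Hash the key to a slice, then look up the slice's owner. Since every
    # node holds this map, any node can do the routing.
    return DATA_MAP[(table, hash(key) % num_slices)]

print(owning_node("accounts", 42))   # node1
```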
  16. Intelligent Data Distribution
      [Diagram: a ClustrixDB table of billions of rows split into slices S1-S5, with each slice stored on more than one node.]
     - Tables are automatically distributed across nodes
     - Tunable amount of data redundancy across nodes
     - Tables are auto-distributed and auto-protected (a placement sketch follows below)
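The actual placement algorithm isn't described in the deck; the sketch below only shows the shape of the idea, i.e., slices spread round-robin with a tunable replica count so losing one node never loses a slice.

```python
def place_slices(num_slices, nodes, replicas=2):
    """Assign each slice to `replicas` distinct nodes, round-robin."""
    placement = {}
    for s in range(num_slices):
        start = s % len(nodes)
        placement[s] = [nodes[(start + r) % len(nodes)] for r in range(replicas)]
    return placement

# Five slices over three nodes, two copies each:
print(place_slices(5, ["node1", "node2", "node3"]))
# {0: ['node1', 'node2'], 1: ['node2', 'node3'], 2: ['node3', 'node1'], ...}
```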
  17. Distributed Query Processing
      [Diagram: a load balancer spreading transactions across ClustrixDB nodes.]
     - Queries can be fielded by any peer node
     - Each query is routed to the node holding the data
     - Complex queries are split into steps and processed in parallel
     - Distribution is automatic and optimized for performance
     - All nodes handle both writes and reads
     - Results are aggregated and returned to the user (a fan-out sketch follows below)
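A toy version of the split / parallelize / aggregate flow, with nodes faked as in-memory row lists. Real fragment planning is far richer; this only mirrors the control flow the slide describes.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_fragment(node, predicate):
    # Stand-in for a per-node partial scan over that node's local slices.
    return [row for row in node["rows"] if predicate(row)]

def distributed_query(nodes, predicate):
    # Fan one fragment out to every node in parallel, then aggregate.
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda n: scan_fragment(n, predicate), nodes)
    return [row for part in parts for row in part]

nodes = [{"rows": [1, 5]}, {"rows": [2, 8]}, {"rows": [3]}]
print(distributed_query(nodes, lambda r: r > 2))   # [5, 8, 3]
```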
  18. Capacity, Elasticity, and Resiliency: ClustrixDB vs. Aurora

      Feature                            ClustrixDB                    Aurora
      Write scalability                  Scales by adding nodes        Cannot add write nodes
      Latency at high concurrency        Stays low                     Climbs quickly
      ACID                               Yes                           Yes
      On-demand write scale              Yes                           No
      Automatically distributed queries  Yes: no application changes   No: read/write fan-out needed; write contention on the master
      Cloud / on-premises                Yes                           No, AWS only
      Shared-nothing storage             Yes: parallel data access     No: contention at high write concurrency
  19. Benchmark Results
      [Charts: Sysbench OLTP 90:10 mix, average latency (ms) vs. throughput (tps), comparing a 4-node Clustrix cluster, Aurora, and MySQL RDS.]
  20. Scalability Test
      [Chart: Sysbench OLTP 90:10 mix, average latency (ms) vs. throughput (tps), 0-50,000 tps.]
      A toy driver for this kind of 90:10 read/write mix is sketched below.
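For readers who want to reproduce the shape of this test, the real tool is sysbench's OLTP workload; the Python sketch below only mimics its 90:10 read/write mix and its two reported metrics, taking hypothetical `do_read` / `do_write` callables that wrap real queries.

```python
import random
import time

def run_mix(do_read, do_write, duration_s=10.0, read_ratio=0.9):
    """Issue a 90:10 read/write mix for `duration_s` seconds and return
    (throughput in tps, average latency in ms)."""
    ops, total_latency = 0, 0.0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        op = do_read if random.random() < read_ratio else do_write
        start = time.monotonic()
        op()
        total_latency += time.monotonic() - start
        ops += 1
    avg_latency_ms = (total_latency / ops) * 1000.0 if ops else 0.0
    return ops / duration_s, avg_latency_ms
```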
  21. Thank you. Q&A
