As enterprises move to AWS, they have a rich set of choices for MySQL-compatible databases. Knowing the best database for a specific job can save you time and money. In this webinar, Lokesh Khosla will discuss high-performance databases for AWS and share findings from a benchmark test that simulates the workload of a high-transaction AWS-based solution.
If you work with high-transaction workloads and need a relational database to keep track of economically valuable items like revenue, inventory, and monetary transactions, you'll be interested in this discussion of the strengths and weaknesses of Aurora and other MySQL solutions for AWS.
3. Agenda
Database market landscape
Options to scale a DBMS
Scale-out architecture
Comparison of solutions for high-transaction relational databases
3/4/2016
4. Generalized and Specialized
[Diagram: spectrum of databases from generalized to specialized. Transactional side (high concurrency, write-heavy, real-time): operational/OLTP systems (NewSQL) and NoSQL; traditional databases; analytics side (historical and exploratory): DW/analytical DBMS and Hadoop]
7. Options to Scale DBMS
Scale Out
e.g., MongoDB: no transactions, may have weak consistency (CAP), application involves DB coding
e.g., ClustrixDB: ACID, proven scalability (reads and writes), shared nothing
Scale Up
e.g., Aurora: reads scale, limited scalability on writes, not shared-nothing scale-out
8. Scaling Up
Keep increasing the size of the (single) database server
Pros
Simple, no application changes needed
Cons
Expensive: at some point, you’re paying 5x for 2x the performance
‘Exotic’ hardware (128 cores and above) becomes price-prohibitive
Eventually you ‘hit the wall’ and literally cannot scale up anymore
9. Scaling Reads: Master/Slave
Add one or more ‘Slave’ read servers to your ‘Master’ database server
Pros
Reasonably simple to implement.
Read/write fan-out can be done at the proxy level
Cons
Only adds Read performance
Data consistency issues can occur, especially if the application isn’t coded to ensure reads from the slave are consistent with reads from the master
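The read/write fan-out mentioned above can be sketched at the proxy level. This is a minimal illustration, not any specific proxy product: the hostnames and the simple verb-based query classifier are assumptions for the example.

```python
# Hypothetical proxy-level read/write fan-out for a master/slave setup.
# Hostnames and the verb-based classifier are illustrative assumptions.
import random

MASTER = "db-master:3306"
SLAVES = ["db-slave-1:3306", "db-slave-2:3306"]

WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}

def route(query: str) -> str:
    """Send writes to the master; spread reads across the slaves."""
    verb = query.lstrip().split(None, 1)[0].upper()
    if verb in WRITE_VERBS:
        return MASTER
    # Note: reads routed here may lag the master (replication delay),
    # which is exactly the consistency caveat noted above.
    return random.choice(SLAVES)
```

A real deployment would also classify `SELECT ... FOR UPDATE` and transactions as writes; this sketch only shows where the fan-out decision lives.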
10. Scaling Writes: Master/Master
Add one or more additional ‘Master’ servers alongside your existing ‘Master’ database server
Pros
Adds Write scaling without needing to shard
Cons
Adds write scaling at the cost of read-slaves
Adding read-slaves would add even more latency
Application changes are required to ensure data consistency / conflict resolution
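As a rough illustration of the conflict-resolution burden that falls on the application, here is a minimal last-write-wins sketch. The row format and timestamps are assumptions for the example; real multi-master deployments need far more careful schemes than this.

```python
# Illustrative last-write-wins conflict resolution for a multi-master
# setup. The row dictionaries and updated_at values are assumptions.
def resolve(local_row: dict, remote_row: dict) -> dict:
    """Keep whichever version carries the newer updated_at timestamp."""
    if local_row["updated_at"] >= remote_row["updated_at"]:
        return local_row
    return remote_row

a = {"id": 1, "qty": 5, "updated_at": 100}  # write applied on master 1
b = {"id": 1, "qty": 3, "updated_at": 120}  # conflicting write on master 2
winner = resolve(a, b)  # the later write wins; the qty=5 update is silently lost
```

The silent loss of the earlier write is precisely why the slide flags application changes for data consistency / conflict resolution as a cost of master/master.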
11. Scaling Reads & Writes: Sharding
[Diagram: four shards, SHARD01 through SHARD04, each holding a key range]
Partitioning tables across separate database servers
Pros
Adds both write and read scaling
Cons
Loses the ability of an RDBMS to manage transactionality, referential integrity and ACID
ACID compliance & transactionality must be managed at the application level
Consistent backups across all the shards are very hard to manage
Read and Writes can be skewed / unbalanced
Application changes can be significant
(shard key ranges: A–K, L–O, P–S, T–Z)
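The A–K / L–O / P–S / T–Z split above can be sketched as application-level shard routing; the shard names below are illustrative assumptions for the example.

```python
# Range-based shard routing matching the A-K / L-O / P-S / T-Z split
# shown on the slide; shard names are illustrative assumptions.
SHARD_RANGES = [
    ("A", "K", "shard01"),
    ("L", "O", "shard02"),
    ("P", "S", "shard03"),
    ("T", "Z", "shard04"),
]

def shard_for(key: str) -> str:
    """Pick the shard whose letter range covers the key's first letter."""
    first = key[0].upper()
    for lo, hi, shard in SHARD_RANGES:
        if lo <= first <= hi:
            return shard
    raise ValueError(f"no shard covers key {key!r}")
```

Every cross-shard operation (a transaction touching ‘alice’ and ‘zoe’, say) now has to be coordinated in application code, which is the ACID cost listed above.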
12. Scaling Reads & Writes: MySQL Cluster
Provides shared-nothing clustering and auto-sharding for MySQL (designed for Telco deployments: minimal cross-node transactions, HA emphasis)
Pros
Distributed, multi-master model
Provides high availability and high throughput
Cons
Only supports read-committed isolation
Long-running transactions can block a node restart
Statement-based replication (SBR) is not supported
Range scans are expensive and perform worse than stock MySQL
Unclear how it scales with many nodes
13. Application Workload Partitioning
Partition entire application + RDBMS stack across several “pods”
Pros
Adds both write and read scaling
Flexible: can keep scaling with addition of pods
Cons
No data consistency across pods (only suited for cases where it is not needed)
High overhead in DBMS maintenance and upgrades
Queries / Reports across all pods can be very complex
Complex environment to setup and support
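Routing traffic to pods can be sketched with a stable hash, so each tenant consistently lands on the same app+DB pod. The pod URLs and the tenant-hash scheme below are assumptions for illustration.

```python
# Illustrative tenant-to-pod routing for application workload
# partitioning; pod URLs and the hash scheme are assumptions.
import hashlib

PODS = [
    "https://pod-1.example.com",
    "https://pod-2.example.com",
    "https://pod-3.example.com",
]

def pod_for(tenant_id: str) -> str:
    """Stable hash so a given tenant always lands on the same pod."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).hexdigest()
    return PODS[int(digest, 16) % len(PODS)]
```

Because each pod owns its data outright, a report spanning all tenants must query every pod and merge the results by hand, which is the cross-pod complexity noted above.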
14. DBMS Capacity, Elasticity and Resiliency
DBMS Scaling  | Capacity                   | Resiliency                 | Elasticity | Application Impact
Scale-up      | Many cores: very expensive | Single point of failure    | No         | None
Master/Slave  | Reads only                 | Fail-over                  | No         | Yes: for read scale
Master/Master | Read/Write                 | Yes                        | No         | High: update conflicts
MySQL Cluster | Read/Write                 | Yes                        | No         | None (or minor)
Sharding      | Unbalanced reads/writes    | Multiple points of failure | No         | Very high
Scale-Out     | Read/Write                 | Yes                        | Yes        | None
15. DBMS Architecture: Scale-Out
Shared Nothing Architecture
[Diagram: three identical peer nodes, each containing a query compiler, data map, engine, and data]
Each node contains:
Query Parser/Planner: distributes partial query fragments to the nodes
Data Map: every node holds metadata about data across the cluster
Database Engine: all nodes can perform all database operations (no leader, aggregator, leaf, or data-only nodes)
Data: tables are automatically distributed and redistributed across nodes
17. Distributed Query Processing
[Diagram: a load balancer routing transactions (TRX) to any ClustrixDB peer node]
Queries are fielded by any peer node
Routed to node holding the data
Complex queries are split into steps and processed in parallel
Automatically distributed for optimized performance
All nodes handle writes and reads
Result is aggregated and returned to the user
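The steps above can be sketched as a scatter-gather flow: fan a query fragment out to every node, then aggregate the partial results. The in-memory "nodes" below stand in for real cluster peers, and the grouped-sum fragment is an assumption for the example.

```python
# Minimal scatter-gather sketch of distributed query processing:
# each node runs the fragment over its local rows, and the partial
# results are merged before returning to the user.
nodes = [
    [("us", 10), ("eu", 5)],   # rows held by node 1
    [("us", 7)],               # rows held by node 2
    [("eu", 2), ("ap", 4)],    # rows held by node 3
]

def sum_by_region(rows):
    """The per-node query fragment: a local grouped sum."""
    out = {}
    for region, count in rows:
        out[region] = out.get(region, 0) + count
    return out

def scatter_gather(fragment):
    """Run the fragment on every node, then merge the partial sums."""
    partials = [fragment(rows) for rows in nodes]  # parallel in practice
    totals = {}
    for partial in partials:
        for region, count in partial.items():
            totals[region] = totals.get(region, 0) + count
    return totals

result = scatter_gather(sum_by_region)
# result == {'us': 17, 'eu': 7, 'ap': 4}
```

Pushing the fragment to the data, rather than pulling all rows to one coordinator, is what lets every node contribute to both reads and writes.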
18. DBMS Capacity, Elasticity and Resiliency
Feature                           | ClustrixDB                    | Aurora
Write scalability                 | Writes scale by adding nodes  | Cannot add write nodes
Latency at high concurrency       | Low, even at high concurrency | Climbs quickly at high concurrency
ACID                              | Yes                           | Yes
On-demand write scale             | Yes                           | No
Automatically distributed queries | Yes: no application changes   | No: read/write fan-out needed; write contention on the master
Cloud / on-premises               | Yes                           | No: AWS cloud only
Shared-nothing storage            | Yes: parallel data access     | No: contention at high write concurrency