UP2012: Scaling MySQL in the Cloud, by Moshe Shadmon, Founder, CTO, ScaleDB
1. Scaling MySQL in the cloud
Moshe Shadmon
Founder, CTO
ScaleDB Inc.
2. Scaling the Database Tier in the Cloud
• Shared Nothing
• Shared Disk
• Shared Data
3. Shared Nothing
MySQL - Shared Nothing
• Scaling by Partitioning Data
– Multi-month project
– Manual – not a “cloud ready” approach
– Not doable with many applications
• High Availability
– Manual – not a “cloud ready” approach
– Very hard to automate
– Poor use of cloud resources
[Diagram: masters, each with slaves]
4. MySQL Shared Nothing Functionality
[Diagram: the Customers table (cust_num, f_name, l_name, add1, add2, city, state, zip, phone1, phone2) is split by cust_num into four partitions: 1-10K, 10,001-20K, 20,001-30K, 30,001-40K. Each partition sits on its own master, with slaves. The Orders, Prod_Ord, and Products tables are also shown (columns include po_num, date, amount, p_num, p_name, p_des, inv_count).]
All relations must be redefined and handled in the application tier.
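As an illustration of what "handled in the application tier" means, here is a minimal sketch of range-based shard routing in Python. The four cust_num ranges come from the slide; the master host names and the function names are hypothetical, and real deployments would also have to route joins against Orders, Prod_Ord, and Products.

```python
from bisect import bisect_left

# Inclusive upper bound of each cust_num range, as drawn on the slide.
SHARD_UPPER_BOUNDS = [10_000, 20_000, 30_000, 40_000]
# Hypothetical host names, invented for this example.
SHARD_MASTERS = ["master-1", "master-2", "master-3", "master-4"]


def shard_for_customer(cust_num: int) -> str:
    """Return the master that owns the given customer number."""
    if cust_num < 1 or cust_num > SHARD_UPPER_BOUNDS[-1]:
        raise ValueError(f"cust_num {cust_num} is outside every partition")
    # bisect_left finds the first range whose upper bound is >= cust_num.
    return SHARD_MASTERS[bisect_left(SHARD_UPPER_BOUNDS, cust_num)]


def shards_for_query(cust_nums):
    """Group customer numbers by shard; one request may touch several masters."""
    plan = {}
    for cust_num in cust_nums:
        plan.setdefault(shard_for_customer(cust_num), []).append(cust_num)
    return plan
```

Every query path in the application has to go through logic like this, and adding a fifth partition means changing the map and moving data, which is the repartitioning cost the slide refers to.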
5. Shared Disk
Oracle RAC / Exadata - Shared Disk
[Diagram: RAC Databases (Oracle Database Instances) connected to Exadata Storage Cells (Oracle’s Exadata)]
• Dynamic scaling
• No need to partition the data
• Automated HA
• Very expensive (in the millions)
• Needs dedicated hardware
• Not a cloud solution
6. Shared Data
[Diagram: MySQL + ScaleDB Database Nodes (MySQL Database Instances) connected to ScaleDB Storage Nodes]
• Dynamic scaling
• No need to partition the data
• Automated HA
• Leverage cloud infrastructure
• Supports all types of applications
• Builds into MySQL ecosystem
• Provides Exadata functionality on the cloud and with a much lower TCO
7. Shared Data Maintains the Schema
[Diagram: one schema with four related tables: Customers (cust_num, f_name, l_name, add1, add2, city, state, zip, phone1, phone2), Orders, Prod_Ord, and Products (other columns shown include po_num, date, amount, p_num, p_name, p_des, inv_count). Below the schema: three ScaleDB Compute nodes over four ScaleDB Storage nodes.]
All relations remain intact. No partitioning or repartitioning.
8. ScaleDB DBMS Cluster Infrastructure
[Diagram: two layers, each running on physical or VM nodes.]
• Database Layer: Node 1, Node 2, … Node N
– MySQL tier: a DBMS on each node
– ScaleDB Database tier: ScaleDB Storage Engine with a Local Cache on each node
– ScaleDB Cluster Manager
• Storage Layer (ScaleDB Storage tier)
– Connectors: TCP/UDP
– Global Cache: a cache on each storage node
– Global Storage: striped storage on each storage node
9. Scaling the Storage Tier
Based on “shared disk” approach
[Diagram: the ScaleDB Database Layer (Node 1, Node 2, … Node N, each a DBMS over ScaleDB, plus the ScaleDB Cluster Manager) sits above the ScaleDB Storage Layer. The storage layer holds Stripe 1, Stripe 2, … Stripe N; each stripe has Striped Storage, a Striped Mirror, and a Striped Hot Backup.]
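The stripe-plus-mirror layout in the diagram can be sketched as follows. This is a toy model only: the slides do not say how ScaleDB assigns data to stripes, so the round-robin placement, the node names, and both function names are assumptions made for illustration.

```python
def place_block(block_id: int, num_stripes: int) -> dict:
    """Map a block to a stripe and name its primary and mirror storage nodes."""
    if num_stripes < 1:
        raise ValueError("need at least one stripe")
    stripe = block_id % num_stripes  # round-robin striping (assumed policy)
    return {
        "stripe": stripe,
        "primary": f"storage-{stripe}",  # hypothetical node names
        "mirror": f"mirror-{stripe}",
    }


def read_block(block_id: int, num_stripes: int, failed_nodes: set) -> str:
    """Pick the node to read from, falling back to the mirror if the primary is down."""
    placement = place_block(block_id, num_stripes)
    for node in (placement["primary"], placement["mirror"]):
        if node not in failed_nodes:
            return node
    raise RuntimeError(f"stripe {placement['stripe']} has no live copy")
```

With three stripes, `read_block(7, 3, set())` reads from `storage-1`, and the same call with `{"storage-1"}` marked as failed falls back to `mirror-1`. Adding stripes spreads blocks over more nodes without the database layer changing its view of the data.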
10. Traditional Query Processing
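Query: What were yesterday’s sales?
[Diagram: the DBMS Server asks the Storage Array to “Get the Sales table”. The Storage Array retrieves the table data and returns the entire Sales table, which the DBMS Server then processes.]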
11. ScaleDB Query Processing
Query: What were yesterday’s sales?
[Diagram: the DBMS Server sends “Get October 15 Sales” to each of three Storage Nodes.]
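The difference between the two query paths can be illustrated with a small simulation. It is a sketch under stated assumptions: each storage node is modeled as a list of (sale_day, amount) rows, the function names are hypothetical, and the number of rows shipped to the DBMS server stands in for network and processing cost.

```python
from datetime import date


def traditional_query(storage_nodes, day):
    """Ship every row to the DBMS server, then filter there."""
    shipped = [row for node in storage_nodes for row in node]
    total = sum(amount for sale_day, amount in shipped if sale_day == day)
    return total, len(shipped)


def pushdown_query(storage_nodes, day):
    """Each storage node filters its own rows; only matches are shipped."""
    shipped = []
    for node in storage_nodes:
        shipped.extend(row for row in node if row[0] == day)
    total = sum(amount for _, amount in shipped)
    return total, len(shipped)


if __name__ == "__main__":
    # Invented sample data; the year is an assumption, the slide only says "October 15".
    oct14, oct15 = date(2012, 10, 14), date(2012, 10, 15)
    nodes = [
        [(oct14, 10), (oct15, 20)],
        [(oct15, 5), (oct14, 7)],
        [(oct14, 1), (oct15, 15)],
    ]
    print(traditional_query(nodes, oct15))  # same total, all 6 rows shipped
    print(pushdown_query(nodes, oct15))     # same total, only 3 rows shipped
```

Both paths return the same answer; the pushdown path moves fewer rows and lets the storage nodes filter in parallel.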
12. Performance: Shared Data
Performance tests running on DL380 servers, large data set
[Bar chart: Operations per Second, axis 0 to 2500]
• MySQL: 550
• ScaleDB 1-Node: 1238
• ScaleDB 2-Nodes: 1884
• ScaleDB 3-Nodes: 2236
Benchmark Details: YCSB Workload A, 1:1 Read/Write Ratio, Database Size: 200M Rows, MariaDB V5.3.5
13. Performance: Shared Data
Performance tests running on public cloud (Read:Write Ratio = 1:1)
[Bar chart: Operations per Second, axis 0 to 5000]
• MySQL: 544
• ScaleDB 1-Node: 3542
• ScaleDB 2-Nodes: 4668
Benchmark Details: YCSB Workload A, 1:1 Read/Write Ratio, Database Size: 40M Rows, MySQL V5.1.42
14. Performance: Shared Data
Performance tests running on public cloud (Read-Only)
[Bar chart: Operations per Second, axis 0 to 12000]
• MySQL: 930
• ScaleDB 1-Node: 6117
• ScaleDB 2-Nodes: 11920
Benchmark Details: YCSB Workload A, 1:0 Read/Write Ratio, Database Size: 40M Rows, MySQL V5.1.42
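To compare the three charts on one scale, the throughput figures can be expressed relative to the MySQL baseline. The sketch below assumes the bar values map to the labels in ascending order (MySQL lowest, most ScaleDB nodes highest), which is inferred from the chart layout rather than stated on the slides; the function and variable names are illustrative.

```python
def speedups(results):
    """Return each configuration's throughput relative to the first entry (the baseline)."""
    baseline = results[0][1]
    return [(label, round(ops / baseline, 2)) for label, ops in results]


# Operations per second, read off slides 12, 13, and 14 (bar-to-label mapping assumed).
DL380_READ_WRITE = [("MySQL", 550), ("ScaleDB 1-Node", 1238),
                    ("ScaleDB 2-Nodes", 1884), ("ScaleDB 3-Nodes", 2236)]
CLOUD_READ_WRITE = [("MySQL", 544), ("ScaleDB 1-Node", 3542), ("ScaleDB 2-Nodes", 4668)]
CLOUD_READ_ONLY = [("MySQL", 930), ("ScaleDB 1-Node", 6117), ("ScaleDB 2-Nodes", 11920)]

if __name__ == "__main__":
    for name, data in [("DL380 1:1", DL380_READ_WRITE),
                       ("Cloud 1:1", CLOUD_READ_WRITE),
                       ("Cloud read-only", CLOUD_READ_ONLY)]:
        print(name, speedups(data))
```

Under that mapping, the read-only cloud test nearly doubles throughput when going from one ScaleDB node to two, while the 1:1 read/write tests scale less than linearly.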
15. Value Proposition – Shared Data
• Scaling the Database Tier
– Scales by adding database nodes to the cluster
– No need to partition the data
– No need to change the application
• Scaling the Storage Tier
– Transparent to the application
– Provides parallel processing at the storage layer
– Leverages the cache of the storage nodes
• “Built-In” HA
• Leverages cloud infrastructure
• Lower TCO
16. From a single MySQL instance to a cluster of integrated databases and storage nodes in the Cloud
Scaling MySQL in the Cloud