M|18 How MariaDB Server Scales with Spider

How MariaDB Server Scales with Spider
Jacob Mathew, Senior Software Engineer, MariaDB
Kentoku Shiba, Author of Spider, Spiral Arms

Spider
● What is Spider?
● Why should I use Spider?
● Sharding with Spider.
● Redundant Data.
● Data Consistency.
● Getting Started with Spider.
● What’s New in Spider?
● What’s Ahead for Spider?

What is Spider?
● Storage engine plugin.
○ Spider doesn’t itself store data.
● Manage storage and retrieval of data stored using other storage engines.
● Sharding solution that stores data remotely on other servers.
● Partition tables using the Partition Engine.
● View the data as if it is local.

Why Should I Use Spider?
● Very large tables.
● Volume of data is growing.
● Lots of concurrent operations on the data.
● Few or no application code changes required.

Why Should I Use Spider?
● Spider pushes down query information.
● Reduces the amount of result data returned by the data nodes.
● Parallel execution.
● Data consistency.
[Diagram: an SQL client connects to a Spider node (MariaDB, table_a), which connects to four data nodes (MariaDB, table_a) holding the shards A-F, G-L, M-R and S-Z.]

Sharding with Spider
1. Receive a request.
2. Execute the request.
a. Distribute SQL to data nodes.
b. Receive and consolidate results from data nodes.
3. Send reply.
[Diagram: steps 1 and 3 run between the SQL client and the Spider node (MariaDB, table_a); steps 2a and 2b run between the Spider node and four data nodes (MariaDB, table_a) holding the shards A-F, G-L, M-R and S-Z.]

Sharding with Spider
● Partition Engine
○ Supports all partitioning rules.
■ Range.
■ Key.
■ Hash.
■ List.
● CREATE SERVER
○ Comment for connection details.
○ Useful when each data node has different connection information.
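As a minimal sketch of a partitioning rule other than range, the statement below shards a Spider table by hash across two data nodes. The host names h1 and h2 are hypothetical, and the user and password placeholders must be replaced. It assumes a table r_table_a already exists in database test on each data node.

```sql
-- Sketch only: hash sharding across two data nodes.
-- "h1" and "h2" are hypothetical host names; replace the <...> placeholders.
CREATE TABLE table_a
(c1 INT PRIMARY KEY,
 c2 VARCHAR(100))
ENGINE=spider
DEFAULT CHARSET=utf8
COMMENT 'table "r_table_a", database "test", port "3306", user "<user name for data node>", password "<password for user>"'
PARTITION BY HASH(c1)
(PARTITION p1 COMMENT 'host "h1"',
 PARTITION p2 COMMENT 'host "h2"');
```

Hash partitioning spreads rows evenly without choosing range boundaries, at the cost of range scans touching every data node.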

Sharding with Spider
Spider cluster pushdown
● Engine condition.
● Index hints.
● Join.
● Aggregation.
● Direct update/delete.

Redundant Data
● Full copy of the table on each data node.
● For SELECTs, Spider performs load balancing and chooses the data node.
● INSERTs, UPDATEs and DELETEs are parallelized to the data nodes.
[Diagram: an SQL client connects to a Spider node (MariaDB, table_a), which connects to four data nodes (MariaDB, table_a), each holding a full copy, A-Z.]
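One way a redundant table might be declared is sketched below. It assumes two CREATE SERVER definitions, hypothetically named server_1 and server_2, each pointing at a data node that holds a full copy of r_table_a. The space-separated list in the srv parameter follows the form used in Spider's high-availability documentation. Monitoring and link-status options are omitted here, and support for this feature varies by MariaDB version, so treat it as an illustration rather than a tested configuration.

```sql
-- Sketch only: one Spider table linked to two full copies of r_table_a.
-- server_1 and server_2 are hypothetical CREATE SERVER names.
CREATE TABLE table_a
(c1 INT PRIMARY KEY,
 c2 VARCHAR(100))
ENGINE=spider
DEFAULT CHARSET=utf8
COMMENT 'table "r_table_a", srv "server_1 server_2"';
```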

Data Consistency
● Data needs to be written to multiple data nodes.
● Spider uses 2-phase commit.
[Diagram: an SQL client connects to a Spider node (MariaDB, table_a), which connects to four data nodes (MariaDB, table_a) holding the shards A-F, G-L, M-R and S-Z.]
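The two phases can be illustrated with MariaDB's standard XA statements. This is a sketch of what a two-phase-commit coordinator does on a single data node, not a trace of the exact statements Spider sends. The transaction id 'trx1' and the inserted row are hypothetical.

```sql
-- Sketch only: the two phases as plain XA statements run on ONE data node.
-- A coordinator would run the same sequence on every participating node.
-- The transaction id 'trx1' and the inserted row are hypothetical.
XA START 'trx1';
INSERT INTO r_table_a (c1, c2) VALUES (1, 'alpha');
XA END 'trx1';
XA PREPARE 'trx1';   -- phase 1: the node promises it can commit
-- ...once every data node has prepared successfully...
XA COMMIT 'trx1';    -- phase 2: make the change permanent
-- If any node fails to prepare, the coordinator instead issues:
-- XA ROLLBACK 'trx1';
```

Because no node commits until all nodes have prepared, a write that spans several shards is applied everywhere or nowhere.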

Getting Started with Spider
1. Get MariaDB.
a. Spider is bundled with MariaDB.
2. Install the database.
a. mysql_install_db
3. Start MariaDB server.
4. Install Spider engine.
a. mysql < scripts/install_spider.sql
5. CREATE TABLE with options to use Spider.
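After step 4, a quick check that the engine is registered can be run from any SQL client. The query below is a small sketch using the standard information_schema.ENGINES table; a row for SPIDER should appear once the install script has succeeded.

```sql
-- Sketch only: confirm that the Spider engine is available on this server.
SELECT engine, support, comment
FROM information_schema.ENGINES
WHERE engine = 'SPIDER';
```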

Getting Started with Spider
On the Data Node:
CREATE TABLE r_table_a
(c1 INT PRIMARY KEY,
c2 VARCHAR(100))
ENGINE=innodb
DEFAULT CHARSET=UTF8;

Getting Started with Spider
On the Spider Node:
CREATE TABLE table_a
(c1 INT PRIMARY KEY,
c2 VARCHAR(100))
ENGINE=spider
DEFAULT CHARSET=UTF8
COMMENT
'table "r_table_a", database "test",
port "3306",
host "<host name of data node>",
user "<user name for data node>",
password "<password for user>"';
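Once both tables exist, the Spider table can be used like a local table. The statements below are a minimal usage sketch; the sample rows are hypothetical.

```sql
-- Sketch only: run on the Spider node. The rows are hypothetical sample data.
INSERT INTO table_a (c1, c2) VALUES (1, 'alpha'), (2, 'beta');
SELECT c1, c2 FROM table_a WHERE c1 = 1;

-- Run on the data node: the rows are physically stored in r_table_a.
SELECT c1, c2 FROM r_table_a;
```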

Getting Started with Spider
Omit column definitions on the Spider Node:
CREATE TABLE table_a
ENGINE=spider
DEFAULT CHARSET=UTF8
COMMENT
'table "r_table_a", database "test",
port "3306",
host "<host name of data node>",
password "<password for user>"';

Sharding on the Spider Node
CREATE TABLE table_a (c1 INT PRIMARY KEY, c2 VARCHAR(100))
ENGINE=spider DEFAULT CHARSET=UTF8
COMMENT
'table "r_table_a", database "test", port "3306",
password "<password for user>"'
PARTITION BY RANGE(c1)
(PARTITION p1 VALUES LESS THAN (100000) COMMENT 'host "h1"',
PARTITION p2 VALUES LESS THAN (200000) COMMENT 'host "h2"',
PARTITION p3 VALUES LESS THAN (300000) COMMENT 'host "h3"',
PARTITION p4 VALUES LESS THAN MAXVALUE COMMENT 'host "h4"');
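To see which shard a query is routed to, the partition list reported by EXPLAIN can be inspected. This is a small sketch against the range-sharded table_a; the value 150000 is a hypothetical key that falls inside the range defined for p2.

```sql
-- Sketch only: show which partitions (and therefore which data nodes)
-- the optimizer considers for this lookup.
EXPLAIN PARTITIONS
SELECT c1, c2 FROM table_a WHERE c1 = 150000;
```

A query whose WHERE clause does not constrain c1 cannot be pruned this way and is sent to all four data nodes.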

CREATE SERVER for connection information on the Spider Node
CREATE SERVER server_1
FOREIGN DATA WRAPPER mysql OPTIONS (
HOST 'host name of data node',
DATABASE 'test',
USER 'user name for data node',
PASSWORD 'password for data node',
PORT 3306);
In the CREATE TABLE statement:
COMMENT 'table "r_table_a", server "server_1"';
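The COMMENT fragment above belongs to a CREATE TABLE statement on the Spider node. A complete statement might look like the sketch below, which reuses the column definitions from the earlier slides and assumes server_1 has already been created.

```sql
-- Sketch only: a Spider table whose connection details come from server_1.
CREATE TABLE table_a
(c1 INT PRIMARY KEY,
 c2 VARCHAR(100))
ENGINE=spider
DEFAULT CHARSET=utf8
COMMENT 'table "r_table_a", server "server_1"';
```

Keeping credentials in the server definition means the table definition does not need to change when a password is rotated.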

CREATE SERVER for shard connection information on the Spider Node
CREATE SERVER server_1 FOREIGN DATA WRAPPER mysql OPTIONS (
HOST 'host name of data node 1', DATABASE 'test',
USER 'user name for data node 1', PASSWORD 'password for data node 1', PORT 3306);
CREATE SERVER server_2 FOREIGN DATA WRAPPER mysql OPTIONS (
HOST 'host name of data node 2', DATABASE 'test',
USER 'user name for data node 2', PASSWORD 'password for data node 2', PORT 3306);
In the CREATE TABLE statement:
COMMENT 'table "r_table_a"'
PARTITION BY RANGE(c1)
(PARTITION p1 VALUES LESS THAN (200000) COMMENT 'server "server_1"',
PARTITION p2 VALUES LESS THAN MAXVALUE COMMENT 'server "server_2"');

What’s New in Spider?
● Support in the Partition Engine for additional features.
○ Engine condition pushdown to the data nodes.
○ Multi-range read.
○ Full-text search.
○ Auto-increment columns.
● Direct aggregation of min, max, avg, count, sum.
● Direct update/delete.
● Direct join.
● Options to log
○ Result errors.
○ Stored Procedure Queries.
● Contributions from Tencent.

What’s New in Spider?
Direct Aggregation
● Aggregation is pushed down to the data nodes: min, max, avg, count, sum.
● Aggregation results are returned by the data nodes.
[Diagram: an SQL client connects to a Spider node (MariaDB, table_a), which connects to four data nodes (MariaDB, table_a) holding the shards A-F, G-L, M-R and S-Z.]
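A query of the kind that benefits from direct aggregation is sketched below. The status variable name Spider_direct_aggregate is given as I understand it from Spider's status variables. Whether the counter increases for a given query depends on the Spider version and settings, so treat the check as an assumption to verify.

```sql
-- Sketch only: an aggregate query over the sharded table.
SELECT COUNT(*), MIN(c1), MAX(c1) FROM table_a;

-- Assumed status counter for aggregations pushed down to the data nodes.
SHOW STATUS LIKE 'Spider_direct_aggregate';
```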

What’s New in Spider?
Direct Update/Delete
● Entire update/delete operation is pushed down to the data nodes.
● Update/delete executed as a single cluster operation instead of one row at a time.
[Diagram: an SQL client connects to a Spider node (MariaDB, table_a), which connects to four data nodes (MariaDB, table_a) holding the shards A-F, G-L, M-R and S-Z.]
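Statements of the following shape are candidates for direct update/delete, because the whole operation can be expressed as one statement per data node. The values are hypothetical, and the status variable names are given as I understand them from Spider's status variables; whether a particular statement is pushed down depends on the Spider version and settings.

```sql
-- Sketch only: set-based changes that can be sent to the data nodes whole.
UPDATE table_a SET c2 = 'archived' WHERE c1 < 100000;
DELETE FROM table_a WHERE c1 >= 300000;

-- Assumed status counters for pushed-down updates and deletes.
SHOW STATUS LIKE 'Spider_direct_update';
SHOW STATUS LIKE 'Spider_direct_delete';
```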

What’s New in Spider?
Direct Join
● Join is pushed down to the data nodes.
● Join results are consolidated by the Spider node.
[Diagram: an SQL client connects to a Spider node (MariaDB, table_a), which connects to four data nodes (MariaDB, table_a) holding the shards A-F, G-L, M-R and S-Z.]

What’s New in Spider?
Contributions from Tencent
● Force pushdown of index hints.
● Optimization for LIMIT.
● Added max connection pool size feature to Spider.
● Bug fixes.

What’s Ahead for Spider?
● Vertical Partition (VP) Engine.
○ Multi-dimensional sharding.
○ VP merges multiple child tables into a single view.
○ VP efficiently chooses child tables for each query.

Vertical Partitioning with VP
[Diagram: an SQL client connects to a Spider / VP node (MariaDB) whose table table_a has two child tables, table_a_ca and table_a_cb, one partitioned by column col_a and the other by column col_b.]
CREATE TABLE table_a_ca (
col_a int,
col_b date,
col_c int,
primary key(col_a))
ENGINE=innodb partition by ...
CREATE TABLE table_a_cb (
col_a int,
col_b date,
col_c int,
key idx1(col_a),
key idx2(col_b))
ENGINE=innodb partition by ...

[Diagram: the SQL client sends the query below to the Spider / VP node (MariaDB, table_a), whose child tables are partitioned by column col_a and by column col_b.]
SELECT … FROM table_a WHERE col_a = 1

[Diagram: the SQL client sends the query below to the Spider / VP node (MariaDB, table_a), whose child tables are partitioned by column col_a and by column col_b.]
SELECT … FROM table_a WHERE col_b = '2016-01-01'

● When the VP child tables are Spider tables sharded with different partitioning rules, VP efficiently chooses which sharded Spider table to use for each query.

Vertical Partitioning with VP
SELECT … FROM table_a WHERE col_a = 1
[Diagram: the SQL client connects to a Spider / VP node (MariaDB, table_a). The child tables table_a_ca and table_a_cb, partitioned by column col_a and by column col_b, are each sharded across two data nodes (MariaDB) holding the ranges A-L and M-Z.]
Vertical Partitioning with VP
SELECT … FROM table_a WHERE col_b = '2016-01-01'
[Diagram: the SQL client connects to a Spider / VP node (MariaDB, table_a). The child tables table_a_ca and table_a_cb, partitioned by column col_a and by column col_b, are each sharded across two data nodes (MariaDB) holding the ranges A-L and M-Z.]
