Sharding and scale out

MySQL 5.6 Fabric:
High Availability and Sharding
By: Omar Eisa
Software Engineer
Eng.omar.essa@gmail.com

Agenda
● Introduction
● High Availability (HA)
● Replication
● What is sharding?
● Managing a sharded database
● Limitations
● Best practices

Introduction
● Enterprises often start with a single-server setup.
● More and more page requests... more and more reads.
● What to do? Scale out!

Goals of distributed databases
Availability
● Cluster as a whole unaffected by loss of nodes
Scalability
● Geographic distribution
● Scale size in terms of users and data
● Database specific: read and/or write load
Distribution Transparency
● Access, Location, Migration, Relocation (while in use)
● Replication
● Concurrency, Failure

HA & Replication
As the enterprise grows, so do the data and the number of requests for it. A
single-server setup makes it difficult to manage the increasing load, which
creates the requirement to scale out.

What is sharding?
● When nearing the capacity or write performance limit of a single MySQL
Server (or HA group), MySQL Fabric can be used to scale out the database
servers by partitioning the data across multiple MySQL Server “groups”.
● Note that a group could contain a single MySQL Server or it could be an HA
group.

Types of sharding
● HASH: A hash function is run on the shard key to generate the shard number.
If the column used as the sharding key has few repeated values, this should
result in an even partitioning of rows across the shards.
● RANGE: The administrator defines an explicit mapping between ranges of
values for the sharding key and shards. This gives the user maximum control
over how data is partitioned and which rows are co-located.
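The two mapping types can be sketched in a few lines of Python. This is a minimal illustration, not MySQL Fabric's internal algorithm: the group names, the range bounds, and the use of MD5 with a modulo are all assumptions made for the example.

```python
import bisect
import hashlib

# Hypothetical shard group names, chosen for this example only.
SHARDS = ["shard1", "shard2"]

# Hypothetical RANGE mapping: (lower bound, shard group), sorted by bound.
RANGES = [(1, "shard1"), (11, "shard2")]


def hash_shard(key, shards=SHARDS):
    """HASH: hash the shard key and map the digest onto one of the shards."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def range_shard(key, ranges=RANGES):
    """RANGE: pick the last range whose lower bound is <= key."""
    bounds = [low for low, _ in ranges]
    idx = bisect.bisect_right(bounds, key) - 1
    if idx < 0:
        raise KeyError("no range defined for shard key %r" % (key,))
    return ranges[idx][1]
```

With HASH the administrator does not choose where a row lands; with RANGE the table of bounds is the administrator's explicit decision.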

MySQL Fabric tool
Basics first. MySQL Fabric is a daemon for managing farms of MySQL servers.
Farms consist of "groups". A group either consists of any number of
individual MySQL servers, or holds a MySQL Replication cluster. A group
describing a replication cluster consists of a master and any number of
slaves, as usual. MySQL Fabric can set up, administer and monitor groups.
Once a MySQL Server has been installed, Fabric can take care of the
replication setup details. DBAs might, for example, use virtual machine
images to add new MySQL Servers whenever needed. Fabric is then used to
integrate those servers into the replication cluster. For example,
integrating a new node boils down to one line on the CLI. More later...

Scalability: sharding
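[Diagram: MySQL Fabric, with its backing store, manages two shard groups.
Shard group shard1 (primary/master with slaves) holds the rows with shard_key
1 (Abkhazia) and 2 (Afghanistan); shard group shard2 (master with a slave)
holds the rows with shard_key 11 (Azerbaijan) and 12 (Bahamas).]
● Fabric operations: Setup, Monitor, Split, Merge, Move
● Sharding types: RANGE, HASH, LIST etc.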
Schema updates, global tables
[Diagram: a global group (primary) and two shard groups, Shard_1 and Shard_2,
each with a primary and slaves. The same global table row appears in the
global group and in both shard groups:]
id  cur  rate
1   USD  1.3531
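The idea behind global tables, that a write made on the global group ends up in every shard group, can be mimicked with a small in-memory model. This is only a sketch: the `Group` class, the `write_global` helper and the `rates` table name are hypothetical, and real propagation happens through MySQL replication rather than a loop.

```python
class Group:
    """In-memory stand-in for a Fabric group (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.tables = {}  # table name -> {primary key: row}


def write_global(global_group, shard_groups, table, key, row):
    """Apply a global-table write on the global group and copy it to each shard group."""
    for group in [global_group, *shard_groups]:
        group.tables.setdefault(table, {})[key] = dict(row)


# Usage: one write, visible in the global group and in both shard groups.
global_group = Group("global")
shard_groups = [Group("Shard_1"), Group("Shard_2")]
write_global(global_group, shard_groups, "rates", 1, {"cur": "USD", "rate": 1.3531})
```

Because every shard holds its own copy, a query on one shard can join its sharded rows against the global table locally.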

The DBA's view on Fabric
New mysqlfabric command line tool
● Central administration tool
● Easy for you to integrate in your favourite deployment tool
● Easy for us to integrate into our admin/management GUIs
Extensible HTTP XML-RPC interface
● No SSH access required for remote deployment
● Power users may add custom commands long term
Self-deploying clients
● Use "Fabric-aware" drivers, or improve your clients

Replication with auto failover
> # Install MySQL servers, edit Fabric config, set up MySQL backing server for Fabric
> mysqlfabric manage start
> # Create group to manage master/slave replication
> mysqlfabric group create my_master_group
> # Assign servers to master group
> mysqlfabric group add my_master_group mysql_host mysql_user mysql_password
> …
> # Choose primary, start replication
> mysqlfabric group promote my_master_group
> # Add heartbeating for automatic failover
> mysqlfabric group activate my_master_group

How fabric aware clients tick...
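[Diagram: the application talks to a Fabric-aware driver. The driver fetches
the server list from Fabric Core over HTTP XML-RPC (dump_servers()) and opens
a lazy connection to the master or a slave.]
Application calls on the driver:
connect('fabric');
set('group=...');
begin_trx(READ_WRITE);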
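The flow in this slide can be approximated with a toy driver. Everything here is a hypothetical stand-in written for illustration (`FakeFabric`, `FabricAwareDriver`, the server addresses); it is not the API of any real MySQL connector. It shows the server list being fetched once and the server being chosen only when a transaction begins.

```python
READ_WRITE, READ_ONLY = "READ_WRITE", "READ_ONLY"


class FakeFabric:
    """Stand-in for Fabric Core; a real driver would ask it over XML-RPC."""

    def __init__(self, groups):
        self.groups = groups
        self.calls = 0  # how often the server list was requested

    def dump_servers(self):
        self.calls += 1
        return {name: dict(servers) for name, servers in self.groups.items()}


class FabricAwareDriver:
    """Toy driver: caches the server list and picks a server lazily."""

    def __init__(self, fabric):
        self.fabric = fabric
        self.cache = None
        self.group = None
        self.connected_to = None  # nothing is connected until a transaction begins

    def set_group(self, group):
        self.group = group

    def begin_trx(self, mode):
        if self.cache is None:  # first use: ask Fabric once, then reuse the cache
            self.cache = self.fabric.dump_servers()
        servers = self.cache[self.group]
        if mode == READ_WRITE or not servers["slaves"]:
            self.connected_to = servers["master"]
        else:
            self.connected_to = servers["slaves"][0]
        return self.connected_to


# Usage with made-up addresses:
fabric = FakeFabric({"my_master_group": {"master": "db1:3306", "slaves": ["db2:3306"]}})
driver = FabricAwareDriver(fabric)
driver.set_group("my_master_group")
```

Writes go to the master, reads may go to a slave, and the application never names a server itself.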

Limitations
• Sharding is not completely transparent to the application. While the
application need not be aware of which server stores a set of rows, and need
not be concerned when that data is moved, it does need to provide the sharding
key when accessing the database.
• Auto-increment columns cannot be used as a sharding key.
• All transactions and queries need to be limited in scope to the rows held in a
single shard, together with the global (non-sharded) tables. For example, joins
involving multiple shards are not supported.
• Because the connectors perform the routing function, the extra latency
involved in proxy-based solutions is avoided, but it does mean that Fabric-aware
connectors are required. At the time of writing these exist for .NET, Python and
Java.

• The MySQL Fabric process itself is not fault-tolerant and must be restarted if
it fails. Note that this does not represent a single point of failure for the
server farm (HA and/or sharding), as the connectors are able to continue routing
operations using their local caches while the MySQL Fabric process is
unavailable.
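Why a stopped Fabric process does not stop routing can be shown with a small cache sketch. The `RoutingCache` class, the `FabricUnavailable` exception and the `fetch` callable are hypothetical names for this example, not part of any connector: a failed refresh simply leaves the last known routes in place.

```python
class FabricUnavailable(Exception):
    """Raised by the fetch callable when the Fabric process cannot be reached."""


class RoutingCache:
    """Keeps the last routing table received from Fabric (illustrative only)."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable returning {group name: master address}
        self._routes = {}

    def refresh(self):
        """Try to reload routes; on failure keep the stale ones and report False."""
        try:
            self._routes = self._fetch()
            return True
        except FabricUnavailable:
            return False

    def master_for(self, group):
        return self._routes[group]
```

The trade-off is staleness: while Fabric is down, the cached routes cannot reflect a new failover or a moved shard.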
