MySQL 5.6 Fabric:
High Availability and Sharding
By: Omar Eisa
Software Engineer
Eng.omar.essa@gmail.com
Agenda
● Introduction.
● High Availability (HA).
● Replication.
● What is sharding?
● Managing a sharded database.
● Limitations.
● Best practices.
Introduction
● Enterprises often start with a single server
setup.
● More and more page requests
... more and more reads
● What to do?
Scale out!
Goals of distributed databases
Availability
● Cluster as a whole unaffected by loss of nodes
Scalability
● Geographic distribution
● Scale size in terms of users and data
● Database specific: read and/or write load
Distribution Transparency
● Access, Location, Migration, Relocation (while in use)
● Replication
● Concurrency, Failure
HA & Replication
As the enterprise grows, so do the data and the number of requests for that data.
A single-server setup makes it difficult to manage the increasing load. This creates
the requirement to scale out.
What is sharding?
● When nearing the capacity or write
performance limit of a single MySQL Server (or
HA group), MySQL Fabric can be used to scale
out the database servers by partitioning the
data across multiple MySQL Server “groups”.
● Note that a group could contain a single MySQL
Server or it could be an HA group.
Types of sharding
HASH: A hash function is run on the shard key to generate
the shard number. If values held in the column used as the
sharding key don’t tend to have too many repeated values
then this should result in an even partitioning of rows across
the shards.
RANGE: The administrator defines an explicit mapping
between ranges of values for the sharding key and shards.
This gives maximum control to the user of how data is
partitioned and which rows should be co-located.
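The two schemes above can be sketched in a few lines of Python. This is only an illustration, not Fabric's own implementation: the group names, the CRC32 hash and the helper names hash_shard and range_shard are assumptions made for the example. The RANGE lower bounds (1 and 11) follow the shard_key values shown on the sharding diagram slide.

```python
import bisect
import zlib

# Hypothetical shard group names, used only for this sketch.
GROUPS = ["shard1", "shard2", "shard3"]


def hash_shard(key, groups=GROUPS):
    """HASH: run a stable hash on the shard key to pick a shard.

    CRC32 is an arbitrary choice for the sketch, not Fabric's hash function.
    """
    digest = zlib.crc32(str(key).encode("utf-8"))
    return groups[digest % len(groups)]


# RANGE: explicit mapping from lower bound to group, sorted by lower bound.
RANGE_MAP = [(1, "shard1"), (11, "shard2")]


def range_shard(key, mapping=RANGE_MAP):
    """RANGE: find the entry with the largest lower bound <= key."""
    bounds = [lower for lower, _ in mapping]
    idx = bisect.bisect_right(bounds, key) - 1
    if idx < 0:
        raise KeyError(f"no shard covers key {key}")
    return mapping[idx][1]
```

With RANGE the administrator decides that keys 1 to 10 live together on shard1, whereas with HASH neighbouring keys may land on different shards.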
MySQL Fabric Tool
Basics first. MySQL Fabric is a daemon for managing
farms of MySQL servers. Farms consist of „groups“. A
group either consists of any number of individual MySQL
servers, or holds a MySQL Replication cluster. A group
describing a replication cluster consists of a master and any
number of slaves, as has always been the case. MySQL Fabric
can set up, administer and monitor groups. Once a MySQL Server
has been installed, Fabric can take care of the replication
setup details. DBAs might, for example, use virtual machine
images to add new MySQL Servers whenever needed.
Then, Fabric is used to integrate those servers into the
replication cluster. For example, integrating a new node
boils down to one line on the CLI. More later...
Scalability: sharding
[Diagram: MySQL Fabric, with its backing store, manages two shard groups, each a primary/master with slaves.
Shard group shard1 holds shard_key values 1 (Abkhazia) and 2 (Afghanistan);
shard group shard2 holds shard_key values 11 (Azerbaijan) and 12 (Bahamas).
Fabric operations: Setup, Monitor, Split, Merge, Move. Partitioning: RANGE, HASH, LIST etc.]
Schema updates, global tables
[Diagram: a Global Group (a primary with slaves) and two shard groups, Shard_1 and Shard_2.
The primary of each shard group is also a slave of the global group (Slave/Primary).
The same global table row (id 1, cur USD, rate 1.353) appears in the global group and in both shard groups.]
The DBA's view on Fabric
New mysqlfabric command line tool
● Central administration tool
● Easy for you to integrate in your favourite deployment tool
● Easy for us to integrate into our admin/management GUIs
Extensible HTTP XML RPC interface
● No SSH access required for remote deployment
● Power users may add custom commands long term
Self-deploying clients
● Use „Fabric-aware“ drivers, or improve your clients
Replication with auto failover
> # Install MySQL servers, edit Fabric config, set up MySQL backing server for Fabric
> mysqlfabric manage start
> # Create group to manage master/slave replication
> mysqlfabric group create my_master_group
> # Assign servers to master group
> mysqlfabric group add my_master_group mysql_host mysql_user mysql_password
> …
> # Choose primary, start replication
> mysqlfabric group promote my_master_group
> # Add heartbeating for automatic failover
> mysqlfabric group activate my_master_group
How Fabric-aware clients tick...
[Diagram: the Application calls the Fabric-aware driver:
connect('fabric');
set('group=...');
begin_trx(READ_WRITE);
The driver calls dump_servers() on Fabric Core over HTTP XML RPC, then opens a lazy connection to the Master or a Slave.]
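The flow on this slide can be sketched as follows. This is a toy model under stated assumptions, not a real connector: the class LazyGroupConnection, the (address, role) server list and the injected connect function are hypothetical stand-ins for what a Fabric-aware driver might keep after a dump_servers() call.

```python
READ_ONLY = "READ_ONLY"
READ_WRITE = "READ_WRITE"


class LazyGroupConnection:
    """Handle for a group; opens a real connection only when a transaction starts."""

    def __init__(self, servers, connect):
        # servers: list of (address, role) pairs, as a dump_servers()-like
        # call might return them (hypothetical shape).
        self.servers = servers
        # connect: function that opens a connection to one address.
        self._connect = connect
        self.opened = {}

    def begin_trx(self, mode):
        # Read-write work must go to the master; reads may use a slave.
        wanted = "master" if mode == READ_WRITE else "slave"
        candidates = [addr for addr, role in self.servers if role == wanted]
        if not candidates and wanted == "slave":
            # No slave available: fall back to the master for reads.
            candidates = [addr for addr, role in self.servers if role == "master"]
        if not candidates:
            raise RuntimeError(f"no server available for {mode}")
        address = candidates[0]
        # Connect on first use only, then reuse the open connection.
        if address not in self.opened:
            self.opened[address] = self._connect(address)
        return self.opened[address]
```

Creating the handle opens nothing; the first begin_trx(READ_WRITE) is what triggers the connect to the master.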
Limitations
• Sharding is not completely transparent to the application. While the application
need not be aware of which server stores a set of rows and need not be
concerned when that data is moved, it does need to provide the sharding key
when accessing the database.
• Auto-increment columns cannot be used as a sharding key.
• All transactions and queries need to be limited in scope to the rows held in a
single shard, together with the global (non-sharded) tables. For example, joins
involving multiple shards are not supported.
• Because the connectors perform the routing function, the extra latency
involved in proxy-based solutions is avoided, but it does mean that Fabric-aware
connectors are required. At the time of writing these exist for .NET, Python and
Java.
• The MySQL Fabric process itself is not fault-tolerant and must be restarted in
the event of it failing.
Note that this does not represent a single point of failure for the server farm (HA
and/or sharding), as the connectors are able to continue routing operations using
their local caches while the MySQL Fabric process is unavailable.
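The last point, connectors routing from their local caches while Fabric is down, can be sketched like this. It is a simplified model: FakeFabric, CachingRouter and FabricUnavailable are hypothetical names, and refreshing the cache on every call is an assumption made to keep the example short.

```python
class FabricUnavailable(Exception):
    """Raised when the (simulated) Fabric process cannot be reached."""


class FakeFabric:
    """Stand-in for the Fabric process; hypothetical, for illustration only."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)
        self.up = True

    def lookup(self, shard):
        if not self.up:
            raise FabricUnavailable("Fabric process is down")
        return self.mapping[shard]


class CachingRouter:
    """Connector-side router that falls back to its local cache."""

    def __init__(self, fabric):
        self.fabric = fabric
        self.cache = {}

    def server_for(self, shard):
        try:
            # Refresh the cached entry while Fabric is reachable.
            self.cache[shard] = self.fabric.lookup(shard)
        except FabricUnavailable:
            # Fabric is down: only shards already cached can be routed.
            if shard not in self.cache:
                raise
        return self.cache[shard]
```

Shards that were looked up before the outage keep routing; a shard never seen before cannot be resolved until Fabric is restarted.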
Q&A

Sharding and scale out


Editor's Notes

  • #6 File-size limits by operating system:
      Operating System                        File-size Limit
      Win32 w/ FAT/FAT32                      2GB/4GB
      Win32 w/ NTFS                           2TB (possibly larger)
      Linux 2.2-Intel 32-bit                  2GB (LFS: 4GB)
      Linux 2.4+ (using ext3 file system)     4TB
      Solaris 9/10                            16TB
      OS X w/ HFS+                            2TB
      NetWare w/ NSS file system              8TB
  • #8 Optimize everything else first, and only if performance still isn't good enough is it time to take a very bitter medicine. The need to shard basically comes down to one of these two reasons:
    Very large working set: the amount of memory you require to keep your frequently accessed data loaded exceeds what you can (economically) fit in a commodity machine. 5 years ago this was 4GB, today it is 128GB or even 256GB. Defining “working set” is always interesting here, since with good schema and indexing it normally doesn't need to be the same size as your entire database.
    Too many writes: either the IO system or a slave can't keep up with the amount of writes being sent to the server. While the IO system can be improved with a RAID 10 controller w/ battery-backed write cache, the slave delay problem is actually very hard to solve. Maatkit has a partial solution (via Paul Tuckfield), but it doesn't work for all workloads. (Yes, I am simplifying some of the scalability issues with MySQL on big machines, but I have faith that Yasufumi is making this better.)
  • #10 What types of sharding are there? Despite my cautions, if you have established that you need to shard, there are quite a few options available to you:
    Partitioning by application function: this is usually the best way to fix any of the problems mentioned above. You pick a few very busy tables and move them onto their own MySQL server. Partition-by-function keeps the architecture simple, and should work for most cases unless you have a single table which by itself can't fit into the above constraints.
    Sharding by hash or key: this method works by picking a column on a table and dividing up your data based on it. You can choose any column to hash on; you just need to make sure that it distributes the data equally. In practice this method can be really hard to get working right, since even if each shard has the same number of ‘customers’, demanding users tend to exceed average users by far, and some servers are overloaded while others are not. (Tip: there are a few famous cases of both (a) bad hashing algorithms and (b) users becoming unequal all of a sudden. You don't want to shard based on the first character of a username, as there will be a lot more ‘M’ than ‘Z’. For users becoming unequal all of a sudden, it's always interesting to think of what scaling challenges Flickr would have had for the official Obama photographer in the lead-up to the 08 election.)
    Sharding via a lookup service: this method works by having some sort of directory service which you query first to ask “what shard number will this user's data exist on?”. It's a highly scalable architecture, and once you write scripts to migrate users to/from shards you can tweak and rebalance to make sure that all your hardware is utilized efficiently. The only problem with this method is what I stated at the start: it's complicated.
    (Note: I've left out some of the more complicated sharding architectures. For example, another solution is to have shards all store fragments of data, and to cross-backup those fragments across shards.)
  • #12 Fabric supports range, hash or list based partitioning of tables using one column as a shard key. Each partition is assigned to a logical shard group, or shard for short. Recall that a group consists of an individual server, or forms a replication cluster in itself. There are Fabric commands for defining the sharding rules (shard mappings), for assigning nodes to shards, for populating shards from unsharded database servers, for splitting shards, for merging shards, and for moving shards. Clients can ask MySQL Fabric for a list of nodes and sharding rules. Given a shard key, clients can route requests to the appropriate servers.
  • #13 A global group can be defined to replicate global tables to all shards and to manage schema changes to partitioned tables. Updates to global tables and DDL operations on partitioned tables are performed on the global group. Then, all shards replicate from the global group to copy the changes. Clients ask Fabric where to send global updates and route their requests to the appropriate servers.
  • #14 A major goal was to create an extensible, flexible base which integrates smoothly in existing deployments. We currently do not offer integration in our own free GUI administration tool MySQL Workbench and our own commercial GUI management tool MySQL Enterprise Monitor. However, look at the architecture and draw your own conclusions. As I expect mostly developers not DBAs reading this, and the pre-production release tends to use basic methods of performing actions (3rd-party: Call for Patches and Branches is open ;-)) , I skip further details.
  • #16 Fabric-aware drivers communicate with Fabric to learn about master groups, shard groups, global groups and their nodes. Application developers think in terms of groups instead of individual servers. If, for example, an application requests use of a master group, the driver asks Fabric for a list of all nodes using the dump_servers() XML RPC call. Then, it returns a connection handle to the application. At this point, no connection to any MySQL server has been established yet. The driver uses lazy connections to delay the actual connect until it knows enough about the transaction/query to choose an appropriate server. In the example, the master would be used to run a read-write transaction. Lazy connection: when the handle is created the driver skips the connect step, so even if the database is down or does not exist at that point, no connection attempt is made.
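Note #10 above warns against sharding on the first character of a username. A small sketch makes the skew visible; the usernames are made up for illustration, and MD5 is simply one convenient stable hash chosen for the example, not a recommendation.

```python
from collections import Counter
import hashlib


def first_letter_shard(username):
    """Naive scheme: the shard is the first character of the username."""
    return username[0].upper()


def hashed_shard(username, n_shards=4):
    """Hash the whole username, then take it modulo the shard count."""
    digest = hashlib.md5(username.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_shards


def distribution(usernames, shard_fn):
    """Count how many usernames land on each shard."""
    return Counter(shard_fn(name) for name in usernames)


# Made-up usernames, purely illustrative.
names = ["maria", "mike", "mona", "max", "zoe"]
print(distribution(names, first_letter_shard))
print(distribution(names, hashed_shard))
```

With the first-letter scheme four of the five users pile onto the ‘M’ shard. Hashing the full name removes that particular bias, though it cannot fix the other problem the note describes: a few users generating far more load than the rest.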