Advanced Sharding Techniques with Spider (MUC2010)

    Presentation Transcript

    • Advanced sharding techniques with Spider. Kentoku SHIBA (kentokushiba at gmail dot com)
    • How to shard database without stopping the service
    • How to shard database. What is database sharding? When the data volume or the update traffic increases, the database server that handles updates can no longer process the load effectively. A common solution is to divide the data across two or more databases. This is database sharding. Here, I will explain how to shard data without stopping the service.
    • Initial Structure: There is 1 MySQL server without Spider.
      Diagram: DB1 holds tbl_a.
      Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
    • Step 1 (for sharding): Create table on DB2 and DB3. Then create tables on DB1.
      Diagram: DB1 holds tbl_a, tbl_a2, tbl_a3, tbl_a4; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).
      On DB2 and DB3: Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
      On DB1: Create table tbl_a3 (col_a int, col_b int, primary key(col_a)) engine = Spider Connection ' table "tbl_a", user "user", password "pass" ' partition by list(mod(col_a, 2)) (partition pt1 values in(0) comment 'host "DB2"', partition pt2 values in(1) comment 'host "DB3"');
      On DB1: Create table tbl_a4 (col_a int, col_b int, primary key(col_a)) engine = VP Comment ' cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3" ';
    • Step 2: Rename table on DB1. (rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a)
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a3, tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).
    • Step 3: Copy data from tbl_a2 to tbl_a3 on DB1. (select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3'))
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a3, tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).
    • Step 4: Rename table on DB1. (rename table tbl_a to tbl_a4, tbl_a3 to tbl_a)
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a, tbl_a4; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).
    • Finish: Drop table on DB1. (drop table tbl_a2, tbl_a4, tbl_a5)
      Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).
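The rename sequence in Steps 2 to 4 is easy to misread, so here is a minimal Python sketch that tracks which kind of table each name refers to on DB1 as the steps run. It is only a bookkeeping model, not Spider or VP code. The role labels are illustrative, and treating tbl_a2 as an empty placeholder table is an assumption (the sharding slides show it in the diagram but do not show its definition).

```python
def rename(tables, pairs):
    """Model a multi-table RENAME TABLE: MySQL applies the pairs left to right."""
    for old, new in pairs:
        assert new not in tables, f"{new} already exists"
        tables[new] = tables.pop(old)


def run_swap(tables):
    """Apply Step 2 through Finish to a name -> role mapping and return it."""
    # Step 2: the VP table takes over the name tbl_a, so the application
    # keeps using the same table name while both child tables sit behind it.
    rename(tables, [("tbl_a2", "tbl_a5"), ("tbl_a", "tbl_a2"), ("tbl_a4", "tbl_a")])
    assert tables["tbl_a"] == "VP table over tbl_a2 and tbl_a3"
    # Step 3: vp_copy_tables copies rows from tbl_a2 to tbl_a3; no names change.
    # Step 4: the Spider table takes over the name tbl_a.
    rename(tables, [("tbl_a", "tbl_a4"), ("tbl_a3", "tbl_a")])
    # Finish: drop the leftovers.
    for name in ("tbl_a2", "tbl_a4", "tbl_a5"):
        del tables[name]
    return tables


initial = {
    "tbl_a": "original InnoDB data",
    "tbl_a2": "empty placeholder",  # assumption: not defined on the sharding slides
    "tbl_a3": "Spider table sharded to DB2 and DB3",
    "tbl_a4": "VP table over tbl_a2 and tbl_a3",
}
final = run_swap(dict(initial))
print(final)
```

At the end only the name tbl_a remains, and it refers to the Spider table. The same swap pattern is reused in the index, schema-change and partitioning procedures.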
    • How to re-shard database without stopping the service
    • How to re-shard database. What is re-sharding? When the data volume or the update traffic increases further, even a sharded database can again fail to keep up with the updates. The solution is to increase the number of servers and distribute the load across them. Increasing the number of servers and redistributing the load is called re-sharding. Here, I will explain how to re-shard without stopping the service.
    • Initial Structure: There is 1 MySQL server with Spider, and there are 2 remote MySQL servers without Spider.
      Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 holds tbl_a (col_a%2=1).
      On DB1: Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = Spider Connection ' table "tbl_a", user "user", password "pass" ' partition by list(mod(col_a, 2)) (partition pt1 values in(0) comment 'host "DB2"', partition pt2 values in(1) comment 'host "DB3"');
    • Step 1 (for re-sharding): Create table on DB4 and DB5. Then create tables on DB3.
      Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 (col_a%2=1) holds tbl_a, tbl_a2, tbl_a3, tbl_a4; DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).
      On DB3: Create table tbl_a3 (col_a int, col_b int, primary key(col_a)) engine = Spider Connection ' table "tbl_a", user "user", password "pass" ' partition by list(mod(col_a, 4)) (partition pt1 values in(1) comment 'host "DB4"', partition pt2 values in(3) comment 'host "DB5"');
      On DB3: Create table tbl_a4 (col_a int, col_b int, primary key(col_a)) engine = VP Comment ' cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3" ';
    • Step 2: Rename table on DB3. (rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a)
      Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 (col_a%2=1) holds tbl_a2, tbl_a5, tbl_a3, tbl_a; DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).
    • Step 3: Copy data from tbl_a2 to tbl_a3 on DB3. (select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3'))
      Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 (col_a%2=1) holds tbl_a2, tbl_a5, tbl_a3, tbl_a; DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).
    • Step 4: Rename table on DB3. Then alter table on DB1.
      Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB3 (col_a%2=1) holds tbl_a2, tbl_a5, tbl_a, tbl_a4; DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).
      On DB3: Rename table tbl_a to tbl_a4, tbl_a3 to tbl_a;
      On DB1: Alter table tbl_a partition by list(mod(col_a, 4)) (partition pt1 values in(0,2) comment 'host "DB2"', partition pt2 values in(1) comment 'host "DB4"', partition pt3 values in(3) comment 'host "DB5"');
    • Finish: Drop DB3.
      Diagram: DB1 holds tbl_a; DB2 holds tbl_a (col_a%2=0); DB4 holds tbl_a (col_a%4=1); DB5 holds tbl_a (col_a%4=3).
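To see why only DB3's rows need to move during this re-shard, the two partitioning rules can be compared directly. The following Python sketch encodes the mod-2 rule from the initial structure and the mod-4 rule from Step 4. The function names are illustrative, and the sketch assumes non-negative col_a values, since Python's `%` and MySQL's `mod()` differ for negative numbers.

```python
def route_before(col_a):
    """Initial structure: partition by list(mod(col_a, 2))."""
    return "DB2" if col_a % 2 == 0 else "DB3"


def route_after(col_a):
    """After Step 4: partition by list(mod(col_a, 4))."""
    remainder = col_a % 4
    if remainder in (0, 2):
        return "DB2"
    if remainder == 1:
        return "DB4"
    return "DB5"  # remainder == 3


keys = range(1000)  # sample of non-negative key values
moved = [k for k in keys if route_before(k) != route_after(k)]

# Every key that changes host was previously on DB3; DB2's keys stay put.
assert all(route_before(k) == "DB3" for k in moved)
assert all(route_after(k) == "DB2" for k in keys if k % 2 == 0)
print(len(moved), "of", len(keys), "sample keys change host")
```

Because values 0 and 2 of mod(col_a, 4) are exactly the even keys, DB2 keeps its data unchanged, and the copy in Step 3 only has to run on DB3.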
    • How to add an index without stopping the service
    • How to add an index. If you add an index in MySQL, you cannot update the data until the operation completes. On a big table the operation takes a long time, and sometimes the service is unavailable during the change. Here, I will explain how to add an index without stopping updates to your data.
    • Initial Structure: There is 1 MySQL server.
      Diagram: DB1 holds tbl_a.
      Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
    • Step 1 (for adding an index): Create tables on DB1.
      Diagram: DB1 holds tbl_a, tbl_a2, tbl_a3, tbl_a4.
      Create table tbl_a2 (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
      Create table tbl_a3 (col_a int, col_b int, primary key(col_a), key idx1(col_b)) engine = InnoDB;
      Create table tbl_a4 (col_a int, col_b int, primary key(col_a)) engine = VP Comment ' cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3" ';
    • Step 2: Rename table on DB1. (rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a)
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a3, tbl_a.
    • Step 3: Copy data from tbl_a2 to tbl_a3 on DB1. (select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3'))
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a3, tbl_a.
    • Step 4: Rename table on DB1. (rename table tbl_a to tbl_a4, tbl_a3 to tbl_a)
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a, tbl_a4.
    • Finish: Drop table on DB1. (drop table tbl_a2, tbl_a4, tbl_a5)
      Diagram: DB1 holds tbl_a.
    • How to change the schema without stopping the service
    • How to change the schema. If you change the schema in MySQL, you cannot update the data until the operation completes. On a big table the operation takes a long time, and sometimes the service is unavailable during the change. Here, I will explain how to change the schema without stopping updates to your data.
    • Initial Structure: There is 1 MySQL server.
      Diagram: DB1 holds tbl_a.
      Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
    • Step 1 (for adding a column): Create tables on DB1.
      Diagram: DB1 holds tbl_a, tbl_a2, tbl_a3, tbl_a4.
      Create table tbl_a2 (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
      Create table tbl_a3 (col_a int, col_b int, col_c int default null, primary key(col_a)) engine = InnoDB;
      Create table tbl_a4 (col_a int, col_b int, primary key(col_a)) engine = VP Comment ' cit "2", cil "2", ctm "1", ist "1", zru "1", tnl "tbl_a2 tbl_a3" ';
    • Step 2: Rename table on DB1. (rename table tbl_a2 to tbl_a5, tbl_a to tbl_a2, tbl_a4 to tbl_a)
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a3, tbl_a.
    • Step 3: Copy data from tbl_a2 to tbl_a3 on DB1. (select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3'))
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a3, tbl_a.
    • Step 4: Rename table on DB1. (rename table tbl_a to tbl_a4, tbl_a3 to tbl_a)
      Diagram: DB1 holds tbl_a2, tbl_a5, tbl_a, tbl_a4.
    • Finish: Drop table on DB1. (drop table tbl_a2, tbl_a4, tbl_a5)
      Diagram: DB1 holds tbl_a.
    • How to set up a cluster for fault tolerance without stopping the service
    • How to set up a cluster for fault tolerance. Spider can set up a cluster for fault tolerance on a per-table basis. Here, I will explain how to set up a cluster without stopping the service. The 'monitoring node' in these slides is a node that watches for failures of each node in the cluster. 'Spider_copy_tables' in these slides is still in development, so please wait a while before using it.
    • Initial Structure: There is 1 MySQL server with Spider and 1 remote MySQL server without Spider.
      Diagram: DB1 holds tbl_a; DB2 holds tbl_a.
      On DB2: Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
      On DB1: Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = Spider Connection ' table "tbl_a", user "user", password "pass", host "DB2" ';
    • Step 1 (for clustering): Add new data nodes (DB3 and DB4) and tables.
      Diagram: DB1 holds tbl_a; DB2, DB3 and DB4 each hold tbl_a.
      On DB3 and DB4: Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
    • Step 2: Add new monitoring nodes (DB5, DB6, DB7) and tables.
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3 and DB4 each hold tbl_a.
      On DB5, DB6 and DB7: Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = Spider Connection ' table "tbl_a", user "user", password "pass", host "DB2 DB3 DB4" ';
    • Step 3: Register monitoring node information to MySQL servers with Spider. Then alter table on DB1.
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3 and DB4 each hold tbl_a.
      insert into mysql.spider_link_mon_servers (db_name, table_name, link_id, sid, server, scheme, host, port, socket, username, password) values ('db_name', 'tbl_a', 0, DB5_sid, null, 'mysql', 'DB5', 3306, null, 'user', 'pass'), ('db_name', 'tbl_a', 0, DB6_sid, null, 'mysql', 'DB6', 3306, null, 'user', 'pass'), ('db_name', 'tbl_a', 0, DB7_sid, null, 'mysql', 'DB7', 3306, null, 'user', 'pass');
      On DB1: Alter table tbl_a Connection ' table "tbl_a", user "user", password "pass", host "DB2 DB3 DB4", mbk "2", mkd "2", msi "DB5_sid", link_status "0 2 2" ';
    • Step 4: Copy data from DB2 to DB3 and DB4.
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3 and DB4 each hold tbl_a.
      Select spider_copy_tables('tbl_a', '', '');
    • Finish: Alter table on DB1.
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3 and DB4 each hold tbl_a.
      On DB1: Alter table tbl_a Connection ' table "tbl_a", user "user", password "pass", host "DB2 DB3 DB4", mbk "2", mkd "2", msi "DB5_sid", link_status "0 1 1" ';
    • How to add new node after failover and preparing new server without stopping the service
    • Create a table on a new node for the clustered table. When a node in the cluster fails, you need to add a new node in order to maintain redundancy. Here, I will explain how to add a table on a new node without stopping the service. The 'monitoring node' in these slides is a node that watches for failures of each node in the cluster. 'Spider_copy_tables' in these slides is still in development; it will be available in future releases.
    • Initial Structure: There are 4 MySQL servers with Spider (including 3 monitoring nodes) and 3 MySQL servers without Spider (including 1 broken node).
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3 and DB4 each hold tbl_a.
    • Step 1: Add new data node (DB8) and table.
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3, DB4 and DB8 each hold tbl_a.
      On DB8: Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
    • Step 2: Alter table on monitoring nodes (DB5, DB6 and DB7).
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3, DB4 and DB8 each hold tbl_a.
      On DB5, DB6 and DB7: Alter table tbl_a Connection ' table "tbl_a", user "user", password "pass", host "DB2 DB4 DB8" ';
    • Step 3: Alter table on DB1.
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3, DB4 and DB8 each hold tbl_a.
      On DB1: Alter table tbl_a Connection ' table "tbl_a", user "user", password "pass", host "DB2 DB4 DB8", mbk "2", mkd "2", msi "DB5_sid", link_status "0 0 2" ';
    • Step 4: Copy data from DB2 to DB8.
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3, DB4 and DB8 each hold tbl_a.
      Select spider_copy_tables('tbl_a', '', '');
    • Finish: Alter table on DB1.
      Diagram: DB1 holds tbl_a; monitoring nodes DB5, DB6 and DB7 hold tbl_a; DB2, DB3, DB4 and DB8 each hold tbl_a.
      On DB1: Alter table tbl_a Connection ' table "tbl_a", user "user", password "pass", host "DB2 DB4 DB8", mbk "2", mkd "2", msi "DB5_sid", link_status "0 0 1" ';
    • How to avoid table partitioning UNIQUE column limitation without stopping the service
    • How to avoid table partitioning UNIQUE column limitation. Currently, MySQL has a restriction: when a table has a PK or UNIQUE key, you cannot partition it by columns that are not part of that key. Here, I will show you how to partition a table by any column even if there is a PK or UNIQUE key.
    • Initial Structure: There is 1 MySQL server.
      Diagram: DB1 holds tbl_a.
      Create table tbl_a (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
    • Step 1 (for avoiding partitioning limitation): Create tables on DB1.
      Diagram: DB1 holds tbl_a, tbl_a2, tbl_a3, tbl_a4, tbl_a5.
      Create table tbl_a2 (col_a int, col_b int, primary key(col_a)) engine = InnoDB;
      Create table tbl_a3 (col_a int, primary key(col_a)) engine = InnoDB partition by linear hash(col_a) partitions 4;
      Create table tbl_a4 (col_a int, col_b int, key idx1(col_a), key idx2(col_b)) engine = InnoDB partition by list(mod(col_b, 2)) (partition pt1 values in(0), partition pt2 values in(1));
      Create table tbl_a5 (col_a int, col_b int, primary key(col_a)) engine = VP Comment ' ctm "1", ist "1", zru "1", pcm "1" ' Connection ' tnl "tbl_a2 tbl_a3 tbl_a4" ';
    • Step 2: Rename table on DB1. (rename table tbl_a2 to tbl_a6, tbl_a to tbl_a2, tbl_a5 to tbl_a)
      Diagram: DB1 holds tbl_a2, tbl_a6, tbl_a3, tbl_a4, tbl_a.
    • Step 3: Copy data from tbl_a2 to tbl_a3 and tbl_a4. (select vp_copy_tables('tbl_a', 'tbl_a2', 'tbl_a3 tbl_a4'))
      Diagram: DB1 holds tbl_a2, tbl_a6, tbl_a3, tbl_a4, tbl_a.
    • Step 4: Alter table tbl_a.
      Diagram: DB1 holds tbl_a2, tbl_a6, tbl_a3, tbl_a4, tbl_a.
      Alter table tbl_a Comment ' ctm "1", ist "1", pcm "1" ', Connection ' tnl "tbl_a3 tbl_a4" ';
    • Finish: Drop table. (drop table tbl_a2, tbl_a6)
      Diagram: DB1 holds tbl_a3, tbl_a4, tbl_a.
    • Case study
    • About MicroAd: MicroAd is an advertising company. It advertises efficiently using "behavioral targeting" technology. [MicroAd, Inc.] http://www.microad.jp/english/
    • The previous architecture: Batch processing updates new statistical rules every day (for every advertiser, every advertising medium and every user).
      Diagram: the AP servers connect through LVS to the Slave DBs; the Slave DBs replicate from the Master DB; the batch server registers new statistical rules in the Master DB.
    • The problem with business expansion: Data and requests increased. At that time, the update limit was 20 million records a day, but they needed to update 100 million records a day. They also wanted to improve the read performance of the slaves by reducing the volume of updates each slave had to apply. They did not want to change or modify their application to support the increase. So they adopted Spider.
    • The architecture with Spider: They created the shards in units of replication.
      Diagram: the AP servers run with Spider and shard their requests (Spider sharding) across several LVS; behind each LVS, SlaveDBs replicate from a MasterDB; the batch server registers new statistical rules through SpiderDB (MySQL with Spider), which shards them across the MasterDBs.
    • Resolved the problem: As a result, they achieved updates of 100 million records a day and improved read performance. They did not need to change their applications much. They are planning to re-shard in the near future as the business expands.
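For a sense of scale, the two daily figures quoted in the case study (20 million and 100 million records a day) can be converted to average per-second rates. This short Python sketch does only that arithmetic. It assumes updates are spread evenly over 24 hours, which the slides do not state; a batch job that runs in a shorter window would need proportionally higher rates.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(records_per_day):
    """Average sustained rate implied by a daily total (even spread assumed)."""
    return records_per_day / SECONDS_PER_DAY


old_limit = 20_000_000   # daily update limit before Spider, from the slides
target = 100_000_000     # daily update volume they needed, from the slides

growth = target / old_limit
print(f"required growth factor: {growth:.1f}x")
print(f"old limit: about {per_second(old_limit):.0f} updates per second")
print(f"target: about {per_second(target):.0f} updates per second")
```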
    • Any Questions? Thank you for taking your time!! Kentoku SHIBA (kentokushiba at gmail dot com) http://wild-growth.blogspot.com/ http://spiderformysql.com