© SkySQL Ab. Commercial in Confidence

CCM Escape Case Study - Elastic Statistics Cluster
Damien Mangin, Nicolas Payart, Stéphane Varoqui

ESCape Challenges
❏ Scale writes
❏ Should not hurt API performance
❏ Reduce the number of servers

ESCape candidates
❏ Redis
Cluster: is this it? (Still a work in progress.) Is it really that fast for writes?
❏ Riak - LevelDB - InnoDB
Do I really need Erlang for this? To store billions of entries, use Innostore.
InnoDB is a robust and well-known storage engine. Performance aside, it
appears that LevelDB may become a preferred choice for Riak.
❏ Cassandra – HBase
Do I really need Java for this? 10K inserts per node group is the price to
pay for a high-level design. Do we even need eventual consistency? Read
performance will suffer from such a design.
❏ Syslog-ng & UDP
We do real-time processing; what if the database stops?

MariaDB 10: my toolkit
❏ Storage Engines
InnoDB, Cassandra, HBase, LevelDB, TokuDB, OQGRAPH
❏ Sharding, Clustering, and Federation
Spider, Multi-source and Parallel replication, Galera, CONNECT, FederatedX, Sphinx, Mroonga

ESCape Architecture

ESCape TokuDB

TokuDB and MyISAM performance: 3 times faster on 32G of memory on the most
demanding query

ESCape Architecture

ESCape Proxy Node Howto
-- Local BLACKHOLE tables on the proxy node, one per shard schema
CREATE TABLE shard01.kw (
  ip          INT UNSIGNED NOT NULL,
  date        DATETIME NOT NULL,
  firstseenon BINARY(255) NOT NULL,
  keyword     BINARY(128) NOT NULL,
  domaine     BINARY(128) NOT NULL,
  referer     BINARY(255) NOT NULL,
  crc64       BIGINT UNSIGNED PRIMARY KEY
) ENGINE=blackhole;

CREATE TABLE shard02.kw (
  ip          INT UNSIGNED NOT NULL,
  date        DATETIME NOT NULL,
  firstseenon BINARY(255) NOT NULL,
  keyword     BINARY(128) NOT NULL,
  c1          BINARY(128) NOT NULL,
  c2          BINARY(255) NOT NULL,
  crc64       BIGINT UNSIGNED PRIMARY KEY
) ENGINE=blackhole;

-- Spider table: routes each row to a shard according to crc64
CREATE TABLE ccmkw (
  ip          INT UNSIGNED NOT NULL,
  date        DATETIME NOT NULL,
  firstseenon VARCHAR(255) NOT NULL,
  keyword     VARCHAR(128) NOT NULL,
  c1          VARCHAR(128) NOT NULL,
  c2          VARCHAR(255) NOT NULL,
  -- shard key: first 16 hex digits of an MD5, converted from base 16 to base 10
  crc64       BIGINT UNSIGNED AS (CONV(LEFT(MD5(
                IF(keyword='',CONCAT(firstseenon,c1),keyword)
              ),16),16,10)) PERSISTENT
) ENGINE=spider COMMENT='user "skysql",password "skyvodka"'
PARTITION BY LIST (mod(crc64,24))
(PARTITION pt01 VALUES IN (0) COMMENT = 'tbl "ccmkw", host "127.0.0.1", port "3306", database "ccmstats_shard01"' ENGINE = SPIDER,
 PARTITION pt02 VALUES IN (1) COMMENT = 'tbl "ccmkw", host "127.0.0.1", port "3306", database "ccmstats_shard02"' ENGINE = SPIDER,
 ...
);

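The routing rule in the Spider table can be mirrored on the client side, for example to predict which shard a row lands on. The following is only a sketch: the function names are hypothetical, UTF-8 encoding of the hashed string is an assumption, and the mapping of partition value n to ccmstats_shard(n+1) simply continues the pattern of the two partitions shown on the slide.

```python
import hashlib

SHARD_COUNT = 24  # from PARTITION BY LIST (mod(crc64,24))


def crc64(keyword: str, firstseenon: str, c1: str) -> int:
    """Mirror the crc64 column: first 16 hex digits of an MD5, read as base 16."""
    # IF(keyword='', CONCAT(firstseenon, c1), keyword)
    source = keyword if keyword != "" else firstseenon + c1
    # Assumption: the hashed bytes are the UTF-8 encoding of the string.
    digest = hashlib.md5(source.encode("utf-8")).hexdigest()
    return int(digest[:16], 16)


def shard_database(keyword: str, firstseenon: str, c1: str) -> str:
    """Return the shard schema a row would be routed to."""
    partition_value = crc64(keyword, firstseenon, c1) % SHARD_COUNT
    # Slide: value 0 -> ccmstats_shard01, value 1 -> ccmstats_shard02.
    # Assumption: the remaining partitions follow the same pattern.
    return f"ccmstats_shard{partition_value + 1:02d}"


if __name__ == "__main__":
    print(shard_database("mariadb", "/forum/page", "example.org"))
```

Because the key is derived from the row content, the same keyword always reaches the same shard, which is what lets each data node aggregate its counters locally.
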
ESCape Proxy Node Howto
default-storage-engine=MyISAM
skip-innodb
skip_name_resolve
back_log=1024
max_connections=1024
table_open_cache=4096
table_definition_cache=2048
max_allowed_packet=32K
binlog_cache_size=32K
max_heap_table_size=64M
thread_cache_size=1024
query_cache_size=0
expire_logs_days=4
progress_report_time=0
binlog_ignore_db=ccmstats

spider_use_handler=1
spider_sts_sync=0
spider_remote_sql_log_off=1
spider_remote_autocommit=0
spider_direct_dup_insert=1
spider_local_lock_table=0
spider_support_xa=0
spider_sync_autocommit=0
spider_sync_trx_isolation=0
spider_crd_sync=0
spider_conn_recycle_mode=1
spider_reset_sql_alloc=0

ESCape Data Node Multi source Howto
MariaDB [(none)]> pager egrep 'Connection_name|Slave_IO_Running|Slave_SQL_Running|Gtid_Slave_Pos'
MariaDB [(none)]> show all slaves status\G
Connection_name:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Gtid_Slave_Pos: 0-31091-138522330
Connection_name: ccmstats_gertrude
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Gtid_Slave_Pos: 0-31091-138522330
Connection_name: ccmstats_lucifer
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Gtid_Slave_Pos: 0-31091-138522330
Connection_name: ccmstats_mysql1
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Gtid_Slave_Pos: 0-31091-138522330

Subset of shards per data node

SET GLOBAL ESCapeProxy1.replicate_do_db='ccmstats_shard01,ccmstats_shard02,ccmstats_shard03,ccmstats_shard04,ccmstats_shard05,ccmstats_shard06,ccmstats_shard07,ccmstats_shard08';

SET GLOBAL ESCapeProxy1.replicate_do_db='ccmstats_shard09,ccmstats_shard10,ccmstats_shard11,ccmstats_shard12,ccmstats_shard13,ccmstats_shard14,ccmstats_shard15,ccmstats_shard16';

SET GLOBAL ESCapeProxy1.replicate_do_db='ccmstats_shard01,ccmstats_shard02,ccmstats_shard03,ccmstats_shard04,ccmstats_shard05,ccmstats_shard22,ccmstats_shard23,ccmstats_shard24';
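
The per-node replication filters can be generated rather than typed by hand. The sketch below assumes an even, contiguous split of the shard schemas across data nodes, which is not necessarily the split used on the slide; the helper name is hypothetical, while the connection name ESCapeProxy1 and the ccmstats_shardNN naming come from the statements above.

```python
def replicate_do_db_statements(connection: str, shard_count: int, node_count: int) -> list:
    """Build one SET GLOBAL <connection>.replicate_do_db statement per data node."""
    if shard_count % node_count != 0:
        raise ValueError("this sketch assumes shards divide evenly across nodes")
    per_node = shard_count // node_count
    statements = []
    for node in range(node_count):
        first = node * per_node + 1
        # Contiguous block of shard schemas for this node, e.g. 01..08.
        shards = [f"ccmstats_shard{n:02d}" for n in range(first, first + per_node)]
        statements.append(
            f"SET GLOBAL {connection}.replicate_do_db='{','.join(shards)}';"
        )
    return statements


statements = replicate_do_db_statements("ESCapeProxy1", 24, 3)

if __name__ == "__main__":
    for node, statement in enumerate(statements, start=1):
        print(f"-- data node {node}")
        print(statement)
```

Each generated statement is meant to be run on a different data node, so that every node replicates only its own subset of shard schemas from the proxy.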

ESCape Data Node: I’m a dummy
Do not count multiple hits from a single IP.

DECLARE l_idUrl INT UNSIGNED DEFAULT 0;
DECLARE i_idDom TINYINT UNSIGNED DEFAULT 0;
DECLARE c_kw VARCHAR(128);
-- Only count the hit if this (keyword, ip) pair has not been seen yet
IF NOT EXISTS (SELECT 1 FROM ccmreferers
               WHERE keyword = NEW.keyword AND ip = NEW.ip)
THEN
  SET l_idUrl = ccmstats.GetIdUrl(NEW.firstseenon);
  SET i_idDom = ccmstats.GetIdDomaine(NEW.domaine);
  INSERT INTO stats_url_cur
    SET keyword_crc64 = NEW.keyword_crc64,
        DATE = NEW.date,
        idUrl = l_idUrl,
        idDomaine = i_idDom,
        nb = 1
    ON DUPLICATE KEY UPDATE nb=nb+1;
  IF LENGTH(NEW.keyword) > 0 THEN
    -- Trim the keyword and collapse double spaces
    SET c_kw = REPLACE(TRIM(NEW.keyword),'  ',' ');
    INSERT INTO stats_url_kw_cur
      SET keyword_crc64 = NEW.keyword_crc64,
          DATE = NEW.date,
          idUrl = l_idUrl,
          keyword = c_kw,
          idDomaine = i_idDom,
          nb = 1
      ON DUPLICATE KEY UPDATE nb=nb+1;
  END IF;
END IF;

Massive DEADLOCKING
Trigger execution should never re-enter the same table
The solution is a blacklist table of (IP, keyword) pairs
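
The blacklist idea can be sketched outside the trigger. The example below uses Python's built-in sqlite3 module purely as a stand-in for the data node; the table names seen_ip_kw and stats_kw and the function record_hit are hypothetical and not taken from the slides. The point it illustrates is that a unique key on (ip, keyword) lets the duplicate check happen as part of the insert itself, instead of a SELECT against the table being written.

```python
import sqlite3


def record_hit(db: sqlite3.Connection, ip: int, keyword: str) -> bool:
    """Count a hit once per (ip, keyword); return True if it was counted."""
    try:
        # The PRIMARY KEY on (ip, keyword) rejects repeats, so no lookup
        # against the table being written is needed.
        db.execute(
            "INSERT INTO seen_ip_kw (ip, keyword) VALUES (?, ?)", (ip, keyword)
        )
    except sqlite3.IntegrityError:
        return False
    # Increment the counter, creating the row on first sight of the keyword.
    cur = db.execute("UPDATE stats_kw SET nb = nb + 1 WHERE keyword = ?", (keyword,))
    if cur.rowcount == 0:
        db.execute("INSERT INTO stats_kw (keyword, nb) VALUES (?, 1)", (keyword,))
    return True


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE seen_ip_kw (ip INTEGER, keyword TEXT, PRIMARY KEY (ip, keyword))"
)
db.execute("CREATE TABLE stats_kw (keyword TEXT PRIMARY KEY, nb INTEGER NOT NULL)")

hits = [(1, "mariadb"), (1, "mariadb"), (2, "mariadb")]
results = [record_hit(db, ip, kw) for ip, kw in hits]
count = db.execute("SELECT nb FROM stats_kw WHERE keyword = 'mariadb'").fetchone()[0]
```

Here the second hit from IP 1 is rejected by the key and never reaches the counter.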

ESCape Benchmarks Writes
Write performance should never be an
issue: full scalability with group commit

ESCape Spider, OpenResty and LibDrizzle
❏ Non-blocking all the way
❏ As fast as mysqlslap in C
❏ A gazillion times faster than Apache/PHP
❏ Limited usage, but scriptable via Lua cosockets
❏ The non-blocking MariaDB client is not in OpenResty, but it should be faster

http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    upstream backend {
        drizzle_server 192.168.0.30:5059 dbname=stat_shard01
            password=skyvodka user=skysql protocol=mysql;
        drizzle_keepalive max=100 mode=single overflow=reject;
    }
    server {
        listen 81;
        server_name localhost;
        location / {
            root html;
            index index.html index.htm;
        }
        location /referers {
            set_unescape_uri $ip $arg_ip;
            set_unescape_uri $date $arg_date;
            set_unescape_uri $firstseenon $arg_firstseenon;
            set_quote_sql_str $quoted_firstseenon $firstseenon;
            set_unescape_uri $keyword $arg_keyword;
            set_quote_sql_str $quoted_keyword $keyword;
            set_unescape_uri $domaine $arg_domaine;
            set_quote_sql_str $quoted_domaine $domaine;
            set_unescape_uri $ref $arg_ref;
            set_quote_sql_str $quoted_ref $ref;
            set_unescape_uri $crc64 $arg_crc64;
            echo_location /mysql "INSERT INTO ref VALUES($ip,$date,$quoted_firstseenon,$quoted_keyword,$quoted_domaine,$quoted_ref,$crc64)";
        }
        location /mysql {
            drizzle_pass backend;
            drizzle_module_header off;
            drizzle_query $query_string;
            rds_json on;
        }
    }
}
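
From the client side, a hit is just a GET request whose query arguments match the `$arg_*` variables read in `/referers`. The helper below is a hypothetical sketch of how such a URL could be built; the argument names come from the configuration above, while the function name and the sample values are made up. Spaces are percent-encoded as %20 rather than "+", to avoid relying on how the server-side unescaping treats "+".

```python
from urllib.parse import quote, urlencode

# Query arguments read by the /referers location, in INSERT order.
FIELDS = ("ip", "date", "firstseenon", "keyword", "domaine", "ref", "crc64")


def referers_url(base_url: str, **values) -> str:
    """Build the /referers URL for one hit; every field is required."""
    missing = [name for name in FIELDS if name not in values]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    query = urlencode([(name, values[name]) for name in FIELDS], quote_via=quote)
    return f"{base_url}/referers?{query}"


url = referers_url(
    "http://localhost:81",
    ip=1,
    date=20131217,
    firstseenon="/a b",
    keyword="maria db",
    domaine="x.org",
    ref="http://r/",
    crc64=42,
)

if __name__ == "__main__":
    print(url)
```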

ESCape Benchmarks
Latency for 1000 Key Point Access

ESCape Spider Direct SQL voodoo
HOW TO MAP REDUCE
-- Temporary table on the proxy that collects the rows returned by every shard
CREATE TEMPORARY TABLE `res` (
  `keyword_crc64` bigint(20) unsigned NOT NULL,
  `date` date NOT NULL DEFAULT '0000-00-00',
  `idUrl` int(10) unsigned NOT NULL,
  `keyword` varchar(128) NOT NULL DEFAULT '',
  `idDomaine` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `nb` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `id` bigint(20) unsigned NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=latin1;

-- One background direct SQL call per partition listed in mysql.spider_tables
select spider_bg_direct_sql(
  'SELECT * FROM stats_url_kw_cur s WHERE s.id IN (241448386253908686)',
  'res',
  concat('host "', host, '", port "', port, '", user "', username,
         '", password "', password, '", database "', tgt_db_name, '"')
) a
from mysql.spider_tables
where db_name = 'commentcamarche'
  and table_name like 'stats_url_kw_cur#P#pt%';
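
The same scatter-gather shape can be modelled in a few lines. This is a toy sketch with in-memory lists standing in for shards: the sample rows, function names, and the final aggregation are illustrative assumptions, not the production queries. The map step runs on every shard concurrently, the partial results are gathered into one list (the role of `res`), and the reduce step aggregates them on the proxy side.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for three shards: rows of (id, keyword, nb).
SHARDS = [
    [(1, "mariadb", 3), (2, "spider", 1)],
    [(1, "mariadb", 4)],
    [(3, "tokudb", 2), (2, "spider", 5)],
]


def map_shard(rows, wanted_ids):
    """Map step: each shard filters its own rows (the remote SELECT)."""
    return [row for row in rows if row[0] in wanted_ids]


def map_reduce(shards, wanted_ids):
    """Fan the query out to every shard, gather the rows, then aggregate."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # One concurrent call per shard, like one background call per partition.
        partials = pool.map(map_shard, shards, [wanted_ids] * len(shards))
        res = [row for part in partials for row in part]
    # Reduce step: sum the counters per keyword over the gathered rows.
    totals = {}
    for _id, keyword, nb in res:
        totals[keyword] = totals.get(keyword, 0) + nb
    return totals


totals = map_reduce(SHARDS, {1, 2})

if __name__ == "__main__":
    print(totals)
```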

ESCape Running non RC
Be prepared for beta testing
❏ Have a compilation machine
❏ Learn about reproducible test cases
❏ Learn about Valgrind and debug mode
❏ Introduce yourself and the feature you would like to beta test on IRC: freenode #mariadb
❏ Don’t get frustrated if there is no answer; it is a dev channel

ESCape Running non RC
10.0.4: oops, my InnoDB datetimes are
broken after ALTER TABLE
Fixed in 10.0.6
In MySQL 5.5, it is NOT marked as
DATA_UNSIGNED despite being treated by
the handler as HA_KEYTYPE_ULONGLONG.
This should perhaps be considered a bug in
InnoDB, but it is ancient history.

Spider debug
mysqld --debug=S:T:t:r:p:n:L:i:F:f:D:d,info,error,query,qcache,my,exit,general,where:O,/tmp/mysqld.trace

Having a compilation server
-DBUILD_CONFIG=mysql_release -DCMAKE_BUILD_TYPE=Debug -DWITH_VALGRIND=ON

Replication Gotcha 10.0.6

Thanks


CCM Escape Case Study - SkySQL Paris Meetup 17.12.2013
