3. We have been fighting against “Big Data”!
2006 - 2008
Mobage grew to 600M PV/day
2009 - 2010
Started social games -> 2,000M PV/day
2011 -
Globalization
11. What is “Big Data”?
Many Servers?
+500 Database Servers
Large Data Size?
+100TB InnoDB (including replicas)
High Traffic?
+1M qps across all MySQL servers at peak time
13. Scale out - Replication
Master - Slaves Architecture
critical SELECT -> master
used by many HTTP states
used by UPDATE(lock)
non-critical SELECT -> slaves
used only for displaying information
easy to scale out
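The routing rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not DeNA's implementation: the `Router` class, its `target` method, and the host names are hypothetical.

```python
import random


class Router:
    """Pick a connection target per query class (hypothetical helper)."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def target(self, is_write, critical=False):
        # Writes and critical SELECTs need the latest data, so they go to the master.
        if is_write or critical or not self.slaves:
            return self.master
        # Non-critical SELECTs tolerate replication delay and spread over the slaves.
        return random.choice(self.slaves)


# Host names are placeholders, not real servers.
router = Router("db-master", ["db-slave1", "db-slave2"])
```

Adding a slave to the list is all it takes to scale out non-critical reads, which is why this layout is easy to grow.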
14. Diagram: replication architecture
App servers -> DB Master: INSERT/DELETE/UPDATE, critical SELECT
App servers -> DB Slaves: non-critical SELECT
DB Master -> DB Slaves: replication (slaves are scalable)
15. Scale out - Sharding
At first -> 1 database holds all tables
Sharding 1
split by table
Sharding 2
split by record
mapping table / hashing
Build the new DB series with replication before sharding
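The two record-level options, hashing and a mapping table, might look like the following sketch. The shard names, the choice of CRC32, and the round-robin assignment are assumptions made for illustration; the slide does not say which hash or assignment policy is used.

```python
import zlib

# Hypothetical shard names; each would be its own master/slave series.
SHARDS = ["user_db_0", "user_db_1", "user_db_2", "user_db_3"]


def shard_by_hash(user_id: int) -> str:
    # crc32 gives the same result in every process, so all app servers agree.
    return SHARDS[zlib.crc32(str(user_id).encode()) % len(SHARDS)]


# Stand-in for a mapping table: user_id -> shard name.
MAPPING = {}


def shard_by_mapping(user_id: int) -> str:
    # A user is assigned once (round-robin here) and then keeps that shard.
    return MAPPING.setdefault(user_id, SHARDS[len(MAPPING) % len(SHARDS)])
```

Hashing needs no lookup but makes it hard to move a single user; a mapping table costs one extra lookup but lets individual records be relocated.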
20. Scale out - auto increment
AUTO_INCREMENT usually can’t be used
same schema lives in different databases
Use a MyISAM sequence table
UPDATE seq SET id=LAST_INSERT_ID(id+1)
SELECT LAST_INSERT_ID()
does not affect other InnoDB transactions
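As a sketch, the two statements from the slide can be wrapped in a small helper that takes any DB-API style cursor. `next_id` and `FakeCursor` are hypothetical names; `FakeCursor` only imitates the behaviour of the two statements in memory so the example runs without a MySQL server.

```python
def next_id(cursor):
    """Allocate the next ID from the one-row sequence table `seq`."""
    # LAST_INSERT_ID(expr) remembers expr for this connection only,
    # so the following SELECT returns the value this caller just set.
    cursor.execute("UPDATE seq SET id=LAST_INSERT_ID(id+1)")
    cursor.execute("SELECT LAST_INSERT_ID()")
    return cursor.fetchone()[0]


class FakeCursor:
    """In-memory stand-in for a MySQL cursor (assumption, for demonstration)."""

    def __init__(self):
        self.seq = 0       # the single row of the seq table
        self.last = None   # this connection's LAST_INSERT_ID()
        self._row = None

    def execute(self, sql):
        if sql.startswith("UPDATE seq"):
            self.seq += 1
            self.last = self.seq
        elif sql.startswith("SELECT LAST_INSERT_ID"):
            self._row = (self.last,)

    def fetchone(self):
        return self._row
```

With a real driver, the same `next_id(cursor)` call would be issued on a connection to the server that holds `seq`, and the returned ID used in the INSERT on whichever shard owns the record.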
21. Scale back - Multi instances
Reduce the number of scaled-out servers
servers get higher specs / services get smaller
MySQL replication doesn’t support multiple masters
Run multiple mysqld instances on 1 server
bind different virtual IP addresses
nothing changes on the app servers
22. Diagram: DB(1) Master + DB(2) Master -> multiple-master replication -> new DB Master
23. Diagram: DB(1) Master (192.168.10.1) and DB(2) Master (192.168.10.2) move onto one new DB server
DB(1): own mysqld, my.cnf with bind-address=xxx
DB(2): own mysqld, my.cnf with bind-address=yyy
24. Data Availability - backup
use a slave that serves no traffic as the “backup”
same spec as the master
daily logical backup
easy to add a new slave
use the daily backup and the binary log
new-schema slave
ALTER before importing the data
25. Diagram: adding a slave from the backup
(1) AM 3:00: mysqldump on the DB Backup, with the binary log position
(2) Anytime: add a slave from the dump
Servers shown: DB Master, DB Backup, DB Slaves
26. Data Availability - MHA
MHA - MySQL Master High Availability
Switch masters when the master fails
prevent split brain / do IP failover
Online schema change
change the slaves' and backup's schema offline
backup (new schema) -> new master
old master -> new backup, then change its schema
28. Purge
Too many records degrade DB performance
especially range scans
and cause storage capacity problems
Purging unnecessary records is important
e.g. old messages, logs
usually selected by a timestamp column
29. Purge - Before MySQL 5.1
Purge records using DELETE
A range DELETE is too heavy
e.g. DELETE ... WHERE t <= ...
Delete by primary key instead
keys pre-selected from the backup
needs an index on the time column
adjust the speed while watching slave replication delay, and stop at peak time
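The purge procedure above can be sketched as a chunked delete by primary key. This is an illustration under stated assumptions: SQLite stands in for MySQL so the block runs anywhere, and the table `msg`, the `purge` function, and the `get_delay` callback (which would report slave replication delay in seconds) are hypothetical.

```python
import sqlite3
import time


def purge(conn, cutoff, chunk=2, get_delay=lambda: 0, max_delay=5):
    """Delete rows with t <= cutoff in small primary-key chunks."""
    # Pre-select the keys once (the slide runs this SELECT on the backup).
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM msg WHERE t <= ? ORDER BY id", (cutoff,))]
    deleted = 0
    for i in range(0, len(ids), chunk):
        # Throttle: wait while the reported replication delay is too high.
        while get_delay() > max_delay:
            time.sleep(1)
        batch = ids[i:i + chunk]
        marks = ",".join("?" * len(batch))
        # Re-check t so a row that changed since the pre-select is kept.
        cur = conn.execute(
            f"DELETE FROM msg WHERE id IN ({marks}) AND t <= ?",
            (*batch, cutoff))
        conn.commit()
        deleted += cur.rowcount
    return deleted


conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE msg (id INTEGER PRIMARY KEY, t INTEGER)")
conn.executemany("INSERT INTO msg VALUES (?, ?)",
                 [(i, 1000 + i) for i in range(10)])
conn.commit()
```

Each small DELETE commits on its own, so locks are held briefly and the single replication thread never has to replay one huge statement.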
30. Purge - After MySQL 5.1
RANGE partitioning is supported
DROP PARTITION ≒ DROP TABLE
Partition pruning
reduces useless scanning
The timestamp column must be added to every UNIQUE INDEX
31. Example rows (id, date):
101 1337191650
102 1337191650
103 1337192655
104 1337192660
105 1337192662
213 1337192650
214 1337192650
215 1337192655
216 1337192660
217 1337192662
CREATE TABLE t (
id int NOT NULL,
date int NOT NULL,
...
PRIMARY KEY (id, date)
) ENGINE=InnoDB ...
PARTITION BY RANGE(date) (
PARTITION p20120517 VALUES LESS THAN ...,
PARTITION p20120518 VALUES LESS THAN ...,
...
PARTITION over VALUES LESS THAN MAXVALUE
)
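Daily partitions like p20120517 need one boundary per day, so the DDL is usually generated. The sketch below builds such statements; it assumes `date` holds UNIX timestamps, that days are cut at UTC midnight (the slide does not state a time zone), and that the catch-all partition is named `over` as on the slide. The function names are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def partition_clause(day: date) -> str:
    """One PARTITION clause for rows whose timestamp falls on `day` (UTC)."""
    nxt = day + timedelta(days=1)
    # Upper bound is the first second of the following day.
    upper = int(datetime(nxt.year, nxt.month, nxt.day,
                         tzinfo=timezone.utc).timestamp())
    return f"PARTITION p{day:%Y%m%d} VALUES LESS THAN ({upper})"


def add_partition_sql(table: str, day: date) -> str:
    # Split the catch-all partition so the new day gets its own partition.
    return (f"ALTER TABLE {table} REORGANIZE PARTITION over INTO "
            f"({partition_clause(day)}, "
            f"PARTITION over VALUES LESS THAN MAXVALUE)")


def drop_partition_sql(table: str, day: date) -> str:
    # Purging a whole day is a single metadata-level operation.
    return f"ALTER TABLE {table} DROP PARTITION p{day:%Y%m%d}"
```

A daily job could call `add_partition_sql` for an upcoming day and `drop_partition_sql` for the oldest day that is no longer needed.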
33. SELECT range scan
Range scan using Index
Index is important for Big Data
InnoDB index = B+ Tree
Primary Key = Data
search PK = access data
WHERE a = 1 AND b = 2 ORDER BY c
KEY i1 (a, b, c)
Covering Index
34. SELECT * FROM t
WHERE Col1 = 1
AND Col2 = 'b'
AND Col3 >= 6
Index leaf block
Col1 Col2 Col3 PK
1 a 3 10
1 b 2 400
1 b 6 103
2 a 1 201
2 c 5 9
...
Data leaf block
PK Col1 Col2 ...
100
101
102
103
104
...
35. SELECT Primary Key
many queries use WHERE PK = ...
like Key-Value Store
HandlerSocket plugin (made by Higuchi)
skips the SQL parse phase
uses the Handler interface directly
higher performance than memcached
without having to consider cache consistency!
MySQL 5.6 added a memcached API
36. Diagram: SQL path vs. HandlerSocket path
SQL: SELECT PK,Col1 FROM t WHERE PK = 101 -> Listener for libmysql -> SQL Layer -> Handler Interface
HandlerSocket: P 0 db t PRIMARY PK,Col1 / 0 = 1 101 -> HandlerSocket Plugin -> Handler Interface
Table t (PK, Col1, ...): 100 a / 101 b / 102 c / 103 d / ...
http://engineer.dena.jp/2010/08/handlersocket-plugin-for-mysql.html
37. Too many UPDATE
Shorten locking time
Avoid slow procedures inside transactions
Connect before locking
to avoid waiting on SYN resends while holding locks
conn A, conn B and then lock A, lock B
Move out of transactions
connections to other servers, such as HTTP APIs or memcached
38. UPDATE distributed masters
Distributed Master (Sharding)
MySQL can’t detect deadlocks across masters
-> waits for innodb_lock_wait_timeout
Social games need locks on many records
sort the lock order as much as possible
Optimistic lock (use a version column)
raise an error when updating a record that has already been updated by others
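Sorting the lock order means every transaction takes its record locks in one agreed sequence, so two transactions can never each hold a lock the other is waiting for. A minimal sketch, assuming records are identified by hypothetical `(shard, pk)` pairs and that the caller then issues its locking reads in the returned order:

```python
def lock_order(keys):
    """Return the (shard, pk) pairs to lock, de-duplicated and globally sorted."""
    # Sorting by shard first, then primary key, gives every transaction
    # the same acquisition order regardless of how it listed the records.
    return sorted(set(keys))


# Two transactions that touch the same records in different orders...
tx_a = lock_order([("db2", 5), ("db1", 9), ("db1", 3)])
tx_b = lock_order([("db1", 3), ("db2", 5), ("db1", 9)])
# ...end up locking them in the same sequence.
```

This does not remove lock waits, but it turns a potential cross-server deadlock, which would otherwise sit until the lock wait timeout, into ordinary queuing.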
41. Optimistic Lock
(1) App: SELECT * FROM t WHERE PK = 101 (no lock) => ver = 36786
(2) App: UPDATE t SET Col1 = 100, ver = ver+1 WHERE PK = 101 AND ver = 36786
Table t (PK, ver, Col1, ...):
100 25643
101 36786
102 14624
...
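The two steps of the optimistic lock can be tried end to end with a small script. SQLite stands in for MySQL here so the sketch is runnable without a server; the table and values follow the slide, while `read_version`, `update_if_unchanged`, and `StaleRecord` are hypothetical names.

```python
import sqlite3


class StaleRecord(Exception):
    """Raised when another writer updated the row first."""


def read_version(conn, pk):
    # Step (1): plain read, no lock taken.
    return conn.execute("SELECT ver FROM t WHERE PK = ?", (pk,)).fetchone()[0]


def update_if_unchanged(conn, pk, new_value, seen_ver):
    # Step (2): write only if ver is still what we read, and bump it.
    cur = conn.execute(
        "UPDATE t SET Col1 = ?, ver = ver + 1 WHERE PK = ? AND ver = ?",
        (new_value, pk, seen_ver))
    conn.commit()
    if cur.rowcount == 0:
        # No row matched, so someone else changed the record in between.
        raise StaleRecord(pk)


conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE t (PK INTEGER PRIMARY KEY, ver INTEGER, Col1 INTEGER)")
conn.execute("INSERT INTO t VALUES (101, 36786, 0)")
conn.commit()
```

On `StaleRecord` the application would typically re-read the row and retry, or report the conflict to the user.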
42. Replication Delay - monitor
MySQL replication is single-threaded
High CUD (create/update/delete) qps is the slave's bottleneck
difficult to benchmark in advance
-> monitor the delay of the backup server
the backup is lower spec than the slaves
so replication delay problems show up there before they happen on the slaves
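A delay check on the backup server could be as simple as the sketch below. It assumes the monitoring job has already fetched one row of `SHOW SLAVE STATUS` as a dict; `Seconds_Behind_Master` is the real column name, while `delay_alert` and the 10-second threshold are hypothetical.

```python
def delay_alert(slave_status, threshold_sec=10):
    """Classify one SHOW SLAVE STATUS row given as a dict."""
    delay = slave_status.get("Seconds_Behind_Master")
    if delay is None:
        # MySQL reports NULL here when replication is not running.
        return "replication stopped"
    return "delayed" if delay > threshold_sec else "ok"
```

Running this against the backup first gives early warning, since it is expected to fall behind sooner than the slaves that serve traffic.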
43. Replication Delay - SSD
SSD is quite effective for slaves
high IOPS means high throughput of the replication thread
Multiple instances have multiple replication threads
SATA SSD is a good choice for most cases
cheap enough / enough storage capacity
PCIe SSD is too expensive for now
SAS SSD can’t use its full storage capacity
45. DeNA needs
Big Data lovers
We have been fighting “Big Data”
MySQL/Hadoop/App/Network/etc...
There are still many problems ;(
We need more power
We look forward to you joining us
Please contact riywo / DeNA staff :)