"Mobage DBA Fight against Big Data" - NHN TE

NHN Technology Conference

Mobage DBA
Fight against
the Big Data
Ryosuke IWANAGA
a.k.a. riywo @ DeNA
2012/05/19

Self Introduction

Ryosuke IWANAGA
a.k.a. riywo(りーお)

DeNA (2009-)
Mobage Server-Side
Ops-Engi / DBA / Manager

We have been fighting
against “Big Data”!
2006 - 2008
Mobage grew up to 600M PV/day
2009 - 2010
Started SocialGame -> 2000M PV/day
2011 -
Globalization

http://www.flickr.com/photos/pmarkham/7115593769/
http://www.flickr.com/photos/severud/40227346/

What is “Big Data”?

Many Servers?
+500 Database Servers

Large Data Size?
+100TB InnoDB (including replicas)

High Traffic?
+1M qps to all MySQL at peak time

Scale out - Replication
Master - Slaves Architecture
critical SELECT -> master
used by many HTTP states
used by UPDATE (lock)
non-critical SELECT -> slaves
used only for showing information
easy to scale out
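
The read/write split above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `ReplicatedPool`, the `critical` flag and the host names are hypothetical, and round-robin is one possible way to spread non-critical SELECTs over slaves.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ReplicatedPool:
    """Picks a server for each statement: master or one of the slaves."""
    master: str
    slaves: list
    _next_slave: itertools.cycle = field(init=False, repr=False)

    def __post_init__(self):
        # Fall back to the master when no slave is configured.
        self._next_slave = itertools.cycle(self.slaves or [self.master])

    def route(self, sql: str, critical: bool = False) -> str:
        verb = sql.lstrip().split(None, 1)[0].upper()
        # Writes and critical SELECTs must see the latest data: use the master.
        if verb != "SELECT" or critical:
            return self.master
        # Non-critical SELECTs tolerate replication delay: rotate over slaves.
        return next(self._next_slave)
```

Adding a slave only changes the `slaves` list, which is why this layer scales out easily for read traffic.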

[Diagram: App servers send INSERT/DELETE/UPDATE and critical SELECT to the DB Master. The master replicates to the DB Slaves, which serve non-critical SELECT and are scalable.]

Scale out - Sharding
At first -> 1 database - all tables
Sharding 1
divide by table
Sharding 2
divide by record
mapping table / hashing
Make new DB series using replication
before sharding
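
Record sharding by "mapping table / hashing" could look like the sketch below. The function and class names are hypothetical, and CRC32 is just one stable hash chosen for the example; the slides do not say which hash or mapping store is used.

```python
import zlib


def shard_by_hash(record_id: int, shard_count: int) -> int:
    """Hashing: the shard is computed from the key, no lookup needed."""
    return zlib.crc32(str(record_id).encode("ascii")) % shard_count


class MappingTable:
    """Mapping: the shard of each key is stored explicitly and can be moved."""

    def __init__(self, default_shard: int):
        self._shard_of = {}
        self._default = default_shard

    def assign(self, record_id: int, shard: int) -> None:
        self._shard_of[record_id] = shard

    def lookup(self, record_id: int) -> int:
        return self._shard_of.get(record_id, self._default)
```

Hashing needs no extra storage but makes re-sharding harder; a mapping table costs one lookup but lets individual records be moved between DB series.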

[Diagram 1: Before table sharding. DB(1) (one master, two slaves) holds Table A and Table B. A new series DB(2) (one master, two slaves) is built from DB(1) by replication.]

[Diagram 2: Table Sharding. Apps access one table on DB(1) and the other on DB(2); each series has its own master and slaves. (no JOIN!)]

[Diagram 3: Before record sharding. DB(2) is built from DB(1) by replication; both hold records 1, 2, 3, 4, 5, ...]

[Diagram 4: Record Sharding. Apps choose DB(1) or DB(2) for each record by hash or mapping; each series has its own master and slaves. (difficult range scan)]

Scale out - auto increment

AUTO INCREMENT usually can’t be used
same schema - different database
MyISAM sequence table
UPDATE seq SET id=LAST_INSERT_ID(id+1)
SELECT LAST_INSERT_ID()
does not affect other InnoDB transactions
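
The sequence-table idea can be tried out with the sketch below. It is a stand-in, not the MySQL mechanism itself: it uses SQLite from Python's standard library, which has no `LAST_INSERT_ID(expr)`, so the increment and the read are wrapped in one short transaction instead. In MySQL, `LAST_INSERT_ID(expr)` keeps the new value per connection, so the two statements from the slide are enough. The helper names are hypothetical.

```python
import sqlite3


def make_sequence(conn: sqlite3.Connection) -> None:
    """Create a one-row sequence table starting at 0."""
    conn.execute("CREATE TABLE seq (id INTEGER NOT NULL)")
    conn.execute("INSERT INTO seq (id) VALUES (0)")
    conn.commit()


def next_id(conn: sqlite3.Connection) -> int:
    """Increment the counter and return the new value."""
    with conn:  # commits on success, rolls back on error
        conn.execute("UPDATE seq SET id = id + 1")
        (value,) = conn.execute("SELECT id FROM seq").fetchone()
    return value
```

Every shard asks the same sequence for its next id, so ids stay unique across databases that share a schema.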

Scale back - Multi instances

Reduce the number of scaled-out servers
server specs go up / service gets smaller
MySQL replication doesn't support multiple masters
Run multiple mysqld instances on 1 server
bind different virtual IP addresses
nothing changes on app servers

[Diagram 1: Merging DB(1) Master and DB(2) Master into one new DB Master would need multiple-master replication.]

[Diagram 2: DB(1) Master (192.168.10.1) and DB(2) Master (192.168.10.2) move to one new DB server that runs two mysqld instances, each with its own my.cnf (bind-address=xxx and bind-address=yyy).]

Data Availability - backup
use a non-service slave as “backup”
the same spec as master
daily logical backup
easy to add new slave
use daily backup and binary log
new schema slave
ALTER before importing data

[Diagram: DB Master replicates to DB Slaves and DB Backup. Numbered steps 1-3 in the figure: at AM 3:00 the backup runs mysqldump and records the position; a slave can be added at any time.]

Data Availability - MHA
MHA - MySQL Master High Availability
Switch master when the master fails
prevent split brain / do IP failover
Online schema change
change slaves/backup schema offline
backup (new schema) -> new master
master -> new backup, then change its
schema

Purge

Too many records degrade DB performance
especially range scans
and cause storage capacity problems
Purging unnecessary records is important
ex.) old messages, logs
usually based on a timestamp column

Purge - Before MySQL 5.1
purge records using DELETE
Range DELETE is too heavy
ex.) DELETE ... WHERE t <= ...
Use the Primary Key instead
pre-select target PKs from backup
needs an index on the time column
adjust speed while watching slave repli delay
and stop at peak time
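
A minimal sketch of the chunked purge follows, using SQLite so it runs anywhere. Several things are assumptions for illustration: the table name `messages`, the columns `id` and `t`, the `is_busy` hook (standing in for "slave delay is high" or "peak time") and the default sizes. It also selects the target keys on the same connection, whereas the slide pre-selects them from the backup server.

```python
import sqlite3
import time


def purge_older_than(conn, cutoff, chunk_size=500, pause=0.0,
                     is_busy=lambda: False):
    """Delete rows with t <= cutoff in small primary-key batches."""
    deleted = 0
    while not is_busy():  # stop when delay grows or peak time starts
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM messages WHERE t <= ? ORDER BY id LIMIT ?",
            (cutoff, chunk_size))]
        if not ids:
            break
        marks = ",".join("?" * len(ids))
        with conn:  # one short transaction per batch
            conn.execute(f"DELETE FROM messages WHERE id IN ({marks})", ids)
        deleted += len(ids)
        time.sleep(pause)  # throttle so replication can keep up
    return deleted
```

Each batch deletes by primary key only, so locks are held briefly and the speed can be tuned through `chunk_size` and `pause`.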

Purge - After MySQL 5.1

Supported RANGE partitioning
DROP PARTITION ≒ DROP TABLE
Partition pruning
reduce useless scanning
Need to add the timestamp column to every
UNIQUE INDEX (including the PRIMARY KEY)

CREATE TABLE t (
  id int not NULL,
  date int not NULL,
  ...
  PRIMARY KEY (id, date)
) ENGINE=InnoDB ...
PARTITION BY RANGE(date) (
  PARTITION p20120517 VALUES LESS THAN ...,
  PARTITION p20120518 VALUES LESS THAN ...,
  ...
  PARTITION over VALUES LESS THAN MAXVALUE
)

[Diagram: sample rows held in two partitions]
id date ...
101 1337191650
102 1337191650
103 1337192655
104 1337192660
105 1337192662
...

id date ...
213 1337192650
214 1337192650
215 1337192655
216 1337192660
217 1337192662
...

SELECT range scan
Range scan using Index
Index is important for Big Data
InnoDB index = B+ Tree
Primary Key = Data
search PK = access data
WHERE a = 1 AND b = 2 ORDER BY c
KEY i1 (a, b, c)
Covering Index
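
The effect of a covering index can be observed with a small experiment. The sketch below uses SQLite rather than InnoDB, so plan wording and internals differ from MySQL; the table layout, the extra column `d` and the helper `query_plan` are assumptions made for the example. Only the index `i1 (a, b, c)` and the WHERE/ORDER BY shape come from the slide.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable steps of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (pk INTEGER PRIMARY KEY, a INT, b INT, c INT, d TEXT)")
conn.execute("CREATE INDEX i1 ON t (a, b, c)")

# Every column the query needs is inside i1: the index alone answers it.
covered = query_plan(
    conn, "SELECT c FROM t WHERE a = 1 AND b = 2 ORDER BY c")

# Column d is not in i1: each match needs an extra lookup in the table.
not_covered = query_plan(
    conn, "SELECT d FROM t WHERE a = 1 AND b = 2 ORDER BY c")

print(covered)
print(not_covered)
```

In both cases the index order (a, b, c) also satisfies the ORDER BY, so no separate sort step is needed.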

SELECT * FROM t
WHERE Col1 = 1
AND Col2 = 'b'
AND Col3 >= 6

Index leaf block
Col1 Col2 Col3 PK
1 a 3 10
1 b 2 400
1 b 6 103
2 a 1 201
2 c 5 9
...

Data leaf block
PK Col1 Col2 ...
100
101
102
103
104
...

SELECT Primary Key
many queries use WHERE PK = ...
like Key-Value Store
HandlerSocket plugin (made by Higuchi)
skip SQL parse phase
use Handler interface directly
higher performance than memcached
without considering cache consistency!
MySQL 5.6 added memcached API

SQL
SELECT PK,Col1
FROM t
WHERE PK = 101

HandlerSocket
P 0 db t PRIMARY PK,Col1
0 = 1 101

[Diagram: SQL enters through the Listener for libmysql and the SQL Layer; HandlerSocket Plugin requests skip the SQL Layer and call the Handler Interface directly.]

PK Col1 ...
100 a
101 b
102 c
103 d
...

http://engineer.dena.jp/2010/08/handlersocket-plugin-for-mysql.html

Too many UPDATE
Shorten locking time
Avoid too-slow procedures in transactions
Connect before locking
to avoid SYN resending
conn A, conn B and lock A, lock B
Move out of transactions
connections to other servers, such as
HTTP API, memcached

UPDATE distributed masters
Distributed Master (Sharding)
MySQL can't detect deadlocks across masters
-> wait innodb_lock_wait_timeout
SocialGame needs many record locks
sort lock order as much as possible
Optimistic lock (use version column)
raise an error when updating a record
that has already been updated by others

Deadlock at distributed
App1 (1) lock 101 -> ok
App2 (2) lock 200 -> ok
App1 (3) lock 200 -> waiting
App2 (4) lock 101 -> waiting
[Diagram: record 101 lives on one master (PK 100, 101, 102, ...) and record 200 on another (PK 200, 201, 202, ...).]

Ordered Lock
App1 (1) lock 101 -> ok
App2 (2) lock 101 -> waiting
App1 (3) lock 200 -> ok
App2 (4) lock 200 -> waiting
[Diagram: same two masters; both apps lock 101 first, then 200.]
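
Sorting the lock order can be modelled in process with plain threads. This is a sketch under assumptions: `RecordLocks` is a hypothetical helper, and `threading.Lock` stands in for row locks held on different masters. The point it shows is that every caller takes locks in the same sorted key order, whatever order it asked for.

```python
import threading


class RecordLocks:
    """Hands out one lock per record key and always locks in sorted order."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def acquire_all(self, keys):
        ordered = sorted(set(keys))  # same global order for every caller
        for key in ordered:
            self._lock_for(key).acquire()
        return ordered

    def release_all(self, ordered):
        for key in reversed(ordered):
            self._lock_for(key).release()
```

With this helper, an app that needs records 200 and 101 still locks 101 first, so it waits at the first lock instead of holding one lock while waiting for the other.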

Optimistic Lock
PK ver Col1 ...
100 25643
101 36786
102 14624
...

(1)
SELECT * FROM t
WHERE PK = 101 (no lock)
=> ver = 36786

(2)
UPDATE t SET Col1 = 100, ver = ver+1
WHERE PK = 101 AND ver = 36786
Replication Delay - monitor
MySQL replication is single-threaded
High CUD qps is the bottleneck of slaves
difficult to benchmark beforehand
-> monitor delay of backup server
backup is low spec compared with
slave
able to detect repli delay problems
before they happen on slaves

Replication Delay - SSD
SSD is quite effective for slaves
high IOPS means high throughput of
replication thread
Multi instances have multiple repli threads
SATA-SSD is a good choice for most cases
cheap enough / storage capacity
PCIe-SSD is too expensive now
SAS-SSD can’t use full storage capacity

DeNA needs
Big Data lovers
We have been fighting “Big Data”
MySQL/Hadoop/App/Network/etc...
There are still many problems ;(
We need more power
We look forward to you joining us
Please contact riywo / DeNA staff :)