"Mobage DBA Fight against Big Data" - NHN TE
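  • Why is the average per-machine QPS so low with MySQL? With 500 servers, assuming one master and one slave (1M1S), 1M QPS is only about 4,000 QPS per machine on average, and 100 TB of total data is only about 400 GB per machine.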

"Mobage DBA Fight against Big Data" - NHN TE Presentation Transcript

  • 1. NHN Technology Conference. Mobage DBA: Fight against the Big Data. Ryosuke IWANAGA a.k.a. riywo @ DeNA, 2012/05/19
  • 2. Self Introduction
    Ryosuke IWANAGA, a.k.a. riywo (りーお)
    DeNA (2009-): Mobage Server-Side Ops-Engi / DBA / Manager
  • 3. We have been fighting against “Big Data”!
    2006 - 2008: Mobage grew up to 600M PV/day
    2009 - 2010: Started SocialGame -> 2000M PV/day
    2011 - : Globalization
  • 4. [Photos: http://www.flickr.com/photos/pmarkham/7115593769/ and http://www.flickr.com/photos/severud/40227346/]
  • 5. What is “Big Data”?
  • 6. What is “Big Data”? Many Servers?
  • 7. What is “Big Data”? Many Servers? +500 Database Servers
  • 8. What is “Big Data”? Many Servers? +500 Database Servers. Large Data Size?
  • 9. What is “Big Data”? Many Servers? +500 Database Servers. Large Data Size? +100TB InnoDB (including replicas)
  • 10. What is “Big Data”? Many Servers? +500 Database Servers. Large Data Size? +100TB InnoDB (including replicas). High Traffic?
  • 11. What is “Big Data”? Many Servers? +500 Database Servers. Large Data Size? +100TB InnoDB (including replicas). High Traffic? +1M qps to all MySQL at peak time
  • 12. +500 Servers
  • 13. Scale out - Replication
    Master - Slaves architecture
    critical SELECT -> master: used by many HTTP states, used by UPDATE (lock)
    non-critical SELECT -> slaves: used only for showing information, easy to scale out
  • 14. [Diagram: App servers send INSERT/DELETE/UPDATE and critical SELECT to the DB Master, and non-critical SELECT to the DB Slaves. The slaves are fed by replication from the master and are marked "scalable".]
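
The routing rule on slides 13 and 14 can be sketched in a few lines of Python. This is only an illustration under assumptions: the `ReplicatedDB` class, the host names, and the `critical` flag are hypothetical and not taken from the talk; the slides only state that writes and critical SELECTs go to the master while non-critical SELECTs go to slaves.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ReplicatedDB:
    """Hypothetical router for one master and its read-only slaves."""
    master: str
    slaves: list = field(default_factory=list)

    def pick(self, sql: str, critical: bool = False) -> str:
        """Return the host that should run this statement."""
        is_select = sql.lstrip().upper().startswith("SELECT")
        # Writes and critical reads must see the latest data, so use the master.
        if not is_select or critical or not self.slaves:
            return self.master
        # Non-critical reads can tolerate replication delay, so use any slave.
        return random.choice(self.slaves)


# Host names below are made up for the example.
db = ReplicatedDB("db1-master", ["db1-slave1", "db1-slave2"])
write_host = db.pick("UPDATE user SET coin = coin - 10 WHERE id = 1")
critical_host = db.pick("SELECT coin FROM user WHERE id = 1", critical=True)
read_host = db.pick("SELECT name FROM user WHERE id = 1")
```

Adding a slave only changes the `slaves` list, which is one way to read the "easy to scale out" remark for non-critical reads.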
  • 15. Scale out - Sharding
    At first -> 1 database, all tables
    Sharding 1: divide per table
    Sharding 2: divide per record (mapping table / hashing)
    Make new DB series using replication before sharding
  • 16. [Diagram: Table Sharding (no JOIN!), before the split. DB(1) Master, with two DB(1) Slaves, holds Table A and Table B (records 1, 2, 3, 4, 5, ...). Replication feeds a new DB(2) Master with two DB(2) Slaves. Three App servers are shown.]
  • 17. [Diagram: Table Sharding (no JOIN!), after the split. DB(1) and DB(2) each consist of a master and two slaves and each hold one table (records 1, 2, 3, 4, 5, ...). Three App servers are shown.]
  • 18. [Diagram: Record Sharding (difficult range scan), before the split. DB(1) Master, with two slaves, holds records 1, 2, 3, 4, 5, ... and replicates to a new DB(2) Master with two slaves. Three App servers are shown.]
  • 19. [Diagram: Record Sharding (difficult range scan), after the split. App servers choose between the DB(1) and DB(2) series by hash or mapping.]
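
Slide 15 names two ways to divide records across masters: hashing and a mapping table. A minimal Python sketch of both follows. The shard names, the use of CRC32 as the hash, and the contents of `MAPPING` are assumptions made for illustration; the talk does not say which hash function or mapping storage Mobage uses.

```python
import zlib

# Hypothetical shard masters; the talk does not name its hosts.
SHARDS = ["db1-master", "db2-master"]

# Hypothetical mapping table: explicit placement for selected record keys.
MAPPING = {101: "db2-master"}


def shard_by_hash(key: int) -> str:
    """Pick a shard from a stable hash of the record key."""
    digest = zlib.crc32(str(key).encode("ascii"))
    return SHARDS[digest % len(SHARDS)]


def shard_by_mapping(key: int) -> str:
    """Look the key up in the mapping table, falling back to the hash."""
    return MAPPING.get(key, shard_by_hash(key))
```

Hashing needs no extra lookup but moves many keys when the shard count changes, while a mapping table costs a lookup and allows moving individual records. Either way, a range scan over keys has to visit every shard, which matches the "difficult range scan" note on the slides.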
  • 20. Scale out - auto increment
    AUTO INCREMENT usually can't be used: same schema, different databases
    MyISAM sequence table:
      UPDATE seq SET id=LAST_INSERT_ID(id+1)
      SELECT LAST_INSERT_ID()
    does not affect other InnoDB transactions
  • 21. Scale back - Multi instances
    Reduce scaled-out servers: servers get higher spec / services get smaller
    MySQL replication doesn't support multiple masters
    Run multiple mysqld instances on 1 server
    bind different virtual IP addresses
    nothing changes on app servers
  • 22. [Diagram: DB(1) Master and DB(2) Master both replicating into one new DB Master, labeled "multiple master replication".]
  • 23. [Diagram: one new DB server runs two mysqld instances, DB(1) Master on 192.168.10.1 and DB(2) Master on 192.168.10.2, each with its own my.cnf (bind-address=xxx and bind-address=yyy).]
  • 24. Data Availability - backup
    using non-service slave as “backup”: the same spec as master, daily logical backup
    easy to add new slave: use daily backup and binary log
    new schema slave: ALTER before importing data
  • 25. [Diagram: (1) at AM 3:00 the Backup DB takes a mysqldump together with the replication position; (2), (3) a new slave can be added anytime from that dump. Shows the DB Master, the Backup DB, and DB Slaves.]
  • 26. Data Availability - MHA
    MHA - MySQL Master High Availability
    Change master when master failed: prevent split brain / do IP failover
    Online schema change:
      change slaves/backup schema offline
      backup (new schema) -> new master
      master -> new backup and change schema
  • 27. +100TB Data
  • 28. Purge
    Many records make DB performance bad: especially range scanning; they also cause storage capacity problems
    Purging unnecessary records is important, e.g. old messages, logs
    Usually using a timestamp column
  • 29. Purge - Before MySQL 5.1
    purge records using DELETE
    Range DELETE is too heavy, e.g. DELETE ... WHERE t <= ...
    Use Primary Keys pre-selected from backup: needs an index on the time column
    adjust speed: watch slave replication delay and stop at peak time
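
The pre-5.1 purge on slide 29 (select primary keys on the backup, then delete by primary key in small batches) might look roughly like the Python sketch below. It uses the standard-library `sqlite3` module as a stand-in for MySQL so that it runs anywhere; the table name `messages`, the column names, the batch size, and the fixed `pause` are all hypothetical. The slide's throttle is based on watching slave replication delay, which is not modeled here.

```python
import sqlite3
import time


def purge_in_chunks(backup, master, cutoff, chunk_size=500, pause=0.0):
    """Delete rows with t <= cutoff from `master` in small primary-key batches.

    Keys are read from `backup` so the heavy range SELECT stays off the master.
    Assumes positive integer ids; `pause` is a crude stand-in for throttling.
    """
    deleted = 0
    last_id = 0
    while True:
        ids = [row[0] for row in backup.execute(
            "SELECT id FROM messages WHERE t <= ? AND id > ? ORDER BY id LIMIT ?",
            (cutoff, last_id, chunk_size))]
        if not ids:
            return deleted
        placeholders = ",".join("?" * len(ids))
        cur = master.execute(
            f"DELETE FROM messages WHERE id IN ({placeholders})", ids)
        master.commit()
        deleted += cur.rowcount
        last_id = ids[-1]  # continue after the last key, even on a stale backup
        time.sleep(pause)


# Demo: one in-memory database plays both the backup and the master role.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, t INTEGER NOT NULL)")
conn.execute("CREATE INDEX idx_t ON messages (t)")
conn.executemany("INSERT INTO messages (id, t) VALUES (?, ?)",
                 [(i, 1000 + i) for i in range(1, 11)])
conn.commit()
purged = purge_in_chunks(conn, conn, cutoff=1007, chunk_size=3)
remaining = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
```

Each batch is a short primary-key DELETE, so locks are held briefly and the slaves replay small events instead of one long range delete.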
  • 30. Purge - After MySQL 5.1
    Supported RANGE partitioning
    DROP PARTITION ≒ DROP TABLE
    Partition pruning reduces useless scanning
    Need to add the timestamp column to all UNIQUE INDEXes
  • 31. CREATE TABLE t (
      id int not NULL,
      date int not NULL,
      ...
      PRIMARY KEY (id, date)
    ) ENGINE=InnoDB ...
    PARTITION BY RANGE(date) (
      PARTITION p20120517 VALUES LESS THAN ...,
      PARTITION p20120518 VALUES LESS THAN ...,
      ...
      PARTITION over VALUES LESS THAN MAXVALUE
    )
    [Diagram: two partitions with example rows (id, date, ...): (101, 1337191650), (102, 1337191650), (103, 1337192655), (104, 1337192660), (105, 1337192662), ... and (213, 1337192650), (214, 1337192650), (215, 1337192655), (216, 1337192660), (217, 1337192662), ...]
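
The partition boundaries elided on slide 31 are not known, so the sketch below only shows one plausible way to generate daily `PARTITION` clauses for an integer UNIX-timestamp column. The assumptions are that each partition named `pYYYYMMDD` ends at the following midnight and that days are cut in UTC; the talk states neither, and the helper `daily_partitions` is hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def daily_partitions(first_day: date, days: int) -> list:
    """Build RANGE partition clauses, one per day, plus a catch-all."""
    clauses = []
    for offset in range(days):
        day = first_day + timedelta(days=offset)
        # Assumed boundary: rows before the next UTC midnight belong to `day`.
        next_midnight = datetime.combine(
            day + timedelta(days=1), datetime.min.time(), tzinfo=timezone.utc)
        clauses.append(
            f"PARTITION p{day:%Y%m%d} VALUES LESS THAN "
            f"({int(next_midnight.timestamp())})")
    clauses.append("PARTITION over VALUES LESS THAN MAXVALUE")
    return clauses


clauses = daily_partitions(date(2012, 5, 17), 2)
ddl_fragment = ",\n".join(clauses)
```

With daily partitions like these, purging a day of old records becomes a single `DROP PARTITION`, which slide 30 compares to `DROP TABLE`.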
  • 32. +1 Mqps
  • 33. SELECT range scan
    Range scan using Index: Index is important for Big Data
    InnoDB index = B+ Tree
    Primary Key = Data: search PK = access data
    WHERE a = 1 AND b = 2 ORDER BY c -> KEY i1 (a, b, c)
    Covering Index
  • 34. SELECT * FROM t WHERE Col1 = 1 AND Col2 = b AND Col3 >= 6
    [Diagram: Index leaf block (Col1, Col2, Col3, PK): (1, a, 3, 10), (1, b, 2, 400), (1, b, 6, 103), (2, a, 1, 201), (2, c, 5, 9), ... The matching index entry (1, b, 6) points to PK 103 in the data leaf block (PK, Col1, Col2, ...), which holds PK 100, 101, 102, 103, 104, ...]
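
To make the "WHERE a = 1 AND b = 2 ORDER BY c" with "KEY i1 (a, b, c)" example from slide 33 concrete, the following Python sketch asks a query planner whether that query can be answered from the index alone. It uses the standard-library `sqlite3` module, so it demonstrates SQLite's planner as a stand-in, not InnoDB's; the extra `payload` column is a made-up addition. On MySQL the comparable check would be `EXPLAIN` reporting "Using index".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, "
    "c INTEGER, payload TEXT)")
conn.execute("CREATE INDEX i1 ON t (a, b, c)")

# Only indexed columns are selected, so the table rows need not be read.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT c FROM t WHERE a = 1 AND b = 2 ORDER BY c"
).fetchall()
plan_text = " | ".join(row[3] for row in plan)

uses_covering_index = "COVERING INDEX i1" in plan_text
needs_sort = "TEMP B-TREE" in plan_text  # SQLite's marker for an extra sort step
```

Because the index is ordered by (a, b, c), rows matching fixed `a` and `b` are already sorted by `c`, so no separate sort is needed. Selecting `*` instead would force a lookup into the row data for each match, as in the slide 34 diagram.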
  • 35. SELECT Primary Key
    many queries use WHERE PK = ..., like a Key-Value Store
    HandlerSocket plugin (made by Higuchi):
      skips the SQL parse phase
      uses the Handler interface directly
      higher performance than memcached
      without considering cache consistency!
    MySQL 5.6 added a memcached API
  • 36. [Diagram: SELECT PK,Col1 FROM t WHERE PK = 101 expressed as the HandlerSocket requests "P 0 db t PRIMARY PK,Col1" and "0 = 1 101". The listener for libmysql passes through the SQL Layer, while the HandlerSocket Plugin calls the Handler Interface directly. Example rows (PK, Col1, ...): (100, a), (101, b), (102, c), (103, d), ... Source: http://engineer.dena.jp/2010/08/handlersocket-plugin-for-mysql.html]
  • 37. Too many UPDATE
    Shorten locking time
    Avoid too-slow procedures in transactions
    Connect before locking to avoid SYN resending: conn A, conn B, and then lock A, lock B
    Kick out of transactions: connections to other servers, such as HTTP API or memcached calls
  • 38. UPDATE distributed masters
    Distributed Master (Sharding): MySQL can't detect a deadlock across masters -> waits for innodb_lock_wait_timeout
    SocialGame needs many record locks
    sort lock order as much as possible
    Optimistic lock (use a version column): raise an error when updating a record that has already been updated by others
  • 39. Deadlock at distributed
    App1: (1) lock 101 -> ok; (3) lock 200 -> waiting
    App2: (2) lock 200 -> ok; (4) lock 101 -> waiting
    [Diagram: record 101 lives in a table with PK 100, 101, 102, ... and record 200 in another table with PK 200, 201, 202, ...]
  • 40. Ordered Lock
    App1: (1) lock 101 -> ok; (3) lock 200 -> ok
    App2: (2) lock 101 -> waiting; (4) lock 200 -> waiting
    [Diagram: same two tables as on the previous slide.]
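
The ordered-lock idea on slide 40 can be illustrated without a database. In this Python sketch, `threading.Lock` objects stand in for row locks on two different masters, and the helper `with_ordered_locks` is hypothetical. Both workers request records 101 and 200 in different orders, but the helper always acquires them in sorted key order, so the circular wait from slide 39 cannot form.

```python
import threading

# Stand-ins for row locks held on two different shard masters.
locks = {101: threading.Lock(), 200: threading.Lock()}
results = []


def with_ordered_locks(keys, action):
    """Acquire the locks for `keys` in ascending key order, then run `action`."""
    ordered = sorted(set(keys))
    for key in ordered:
        locks[key].acquire()
    try:
        return action()
    finally:
        for key in reversed(ordered):
            locks[key].release()


def worker(name, keys):
    with_ordered_locks(keys, lambda: results.append(name))


threads = [
    threading.Thread(target=worker, args=("App1", [101, 200])),
    threading.Thread(target=worker, args=("App2", [200, 101])),  # reversed request
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

One worker may still wait for the other, as App2 does on the slide, but it waits on the first lock while holding nothing, so the other worker can always finish.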
  • 41. Optimistic Lock
    (1) SELECT * FROM t WHERE PK = 101 (no lock) => ver = 36786
    (2) UPDATE t SET Col1 = 100, ver = ver+1 WHERE PK = 101 AND ver = 36786
    [Diagram: table t (PK, ver, Col1, ...): (100, 25643), (101, 36786), (102, 14624), ...]
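
The two statements on slide 41 can be exercised end to end with a small Python sketch. It uses the standard-library `sqlite3` module in place of MySQL, and the helper names `read_version` and `write_if_unchanged` are hypothetical. The core of the technique is the `AND ver = ?` condition: if another writer has already bumped the version, the UPDATE matches zero rows and the caller can raise an error or retry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, ver INTEGER, col1 INTEGER)")
conn.execute("INSERT INTO t (pk, ver, col1) VALUES (101, 36786, 0)")
conn.commit()


def read_version(conn, pk):
    """Step (1): plain SELECT, no lock taken."""
    return conn.execute("SELECT ver FROM t WHERE pk = ?", (pk,)).fetchone()[0]


def write_if_unchanged(conn, pk, new_value, seen_ver):
    """Step (2): update only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE t SET col1 = ?, ver = ver + 1 WHERE pk = ? AND ver = ?",
        (new_value, pk, seen_ver))
    conn.commit()
    return cur.rowcount == 1


seen = read_version(conn, 101)
first_ok = write_if_unchanged(conn, 101, 100, seen)   # version still matches
second_ok = write_if_unchanged(conn, 101, 200, seen)  # stale version, rejected
final_row = conn.execute("SELECT col1, ver FROM t WHERE pk = 101").fetchone()
```

No lock is held between the read and the write, which suits the slide 37 advice to shorten locking time, at the cost of handling rejected updates in the application.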
  • 42. Replication Delay - monitor
    MySQL replication is single-threaded
    High CUD qps is the bottleneck of slaves: difficult to benchmark beforehand
    -> monitor delay of the backup server
      backup is low spec compared with slaves
      able to detect replication delay problems before they happen on slaves
  • 43. Replication Delay - SSD
    SSD is quite effective for slaves
      high IOPS means high throughput of the replication thread
      multi-instance servers have multiple replication threads
    SATA-SSD is a good choice in most cases: cheap enough / storage capacity
    PCIe-SSD is too expensive now
    SAS-SSD can't use full storage capacity
  • 44. Finally...
  • 45. DeNA needs Big Data lovers
    We have been fighting “Big Data”: MySQL/Hadoop/App/Network/etc...
    There are still many problems ;( We need more power
    We are looking forward to your joining
    Please contact riywo / DeNA staff :)
  • 46. Thank you!