NHN Technology Conference

Mobage DBA
Fight against
the Big Data
Ryosuke IWANAGA a.k.a. riywo @ DeNA
2012/05/19
Self Introduction

Ryosuke IWANAGA
a.k.a. riywo(りーお)

DeNA (2009-)
  Mobage Server-Side
  Ops-Engi / DBA / Manager
We have been fighting
  against “Big Data”!
2006 - 2008
  Mobage grew up to 600M PV/day
2009 - 2010
  Started SocialGame ->2000M PV/day
2011 -
  Globalization
http://www.flickr.com/photos/pmarkham/7115593769/
http://www.flickr.com/photos/severud/40227346/
What is “Big Data”?

Many Servers?
  +500 Database Servers
Large Data Size?
  +100TB InnoDB(include replicas)
High Traffic?
  +1Mqps to all MySQL at peak time
+500 Servers
Scale out - Replication
Master - Slaves Architecture
  critical SELECT -> master
     used by many HTTP states
     used by UPDATE (lock)
  non-critical SELECT -> slaves
     used only for displaying information
     easy to scale out
[Diagram: App servers send INSERT/DELETE/UPDATE and critical SELECT to the DB Master, and non-critical SELECT to the DB Slaves. The slaves are fed from the master by replication and are scalable.]
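The routing rule on this slide can be sketched in a few lines of Python. This is only an illustration: the `Router` class, the `critical` flag and the host names are hypothetical and do not come from the talk.

```python
import itertools

class Router:
    """Route statements to the master or round-robin across slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Cycle through slaves so non-critical reads are spread evenly.
        self._slaves = itertools.cycle(slaves)

    def target(self, sql, critical=False):
        verb = sql.lstrip().split(None, 1)[0].upper()
        # Writes and critical reads must see the latest data: use the master.
        if verb != "SELECT" or critical:
            return self.master
        return next(self._slaves)

# Hypothetical host names, for illustration only.
router = Router("db1-master", ["db1-slave1", "db1-slave2"])
print(router.target("UPDATE user SET coin = coin - 10 WHERE id = 1"))
print(router.target("SELECT coin FROM user WHERE id = 1", critical=True))
print(router.target("SELECT name FROM user WHERE id = 1"))
```

Adding a slave only changes the list passed to `Router`, which is why non-critical reads are the easy part to scale out.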
Scale out - Sharding
At first -> 1 database holds all tables
Sharding 1
   divide by table
Sharding 2
   divide by record
       mapping table / hashing
Build the new DB series with replication
before sharding
[Diagram: Table Sharding (no JOIN!), preparation. DB(1) Master holds Table A and Table B (records 1, 2, 3, 4, 5, ...) and has DB(1) Slaves. DB(2) Master, with its own DB(2) Slaves, is built from DB(1) by replication.]
[Diagram: Table Sharding (no JOIN!), after the split. DB(1) Master with DB(1) Slaves and DB(2) Master with DB(2) Slaves each hold one table (records 1, 2, 3, 4, 5, ...).]
[Diagram: Record Sharding (difficult range scan), preparation. DB(1) Master (records 1, 2, 3, 4, 5, ...) has DB(1) Slaves. DB(2) Master, with its own DB(2) Slaves, is built from DB(1) by replication.]
[Diagram: Record Sharding (difficult range scan), after the split. Apps choose between DB(1) Master and DB(2) Master by hash or mapping; each master has its own slaves.]
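A minimal sketch of the two ways an app might pick a shard for a record, hashing or a mapping table. The shard names, the use of CRC32 and the contents of the mapping are assumptions made for illustration; the slides do not say which hash or mapping scheme is used.

```python
import zlib

SHARDS = ["db1", "db2"]  # hypothetical shard names

def shard_by_hash(record_id, shards=SHARDS):
    # crc32 gives the same result in every process, unlike Python's
    # built-in hash() on strings, which is randomized per process.
    return shards[zlib.crc32(str(record_id).encode()) % len(shards)]

# Mapping-table variant: an explicit lookup, which allows moving a single
# record to another shard without rehashing everything.
MAPPING = {1: "db1", 2: "db1", 3: "db2"}  # hypothetical contents

def shard_by_mapping(record_id, mapping=MAPPING, default="db1"):
    return mapping.get(record_id, default)

print(shard_by_hash(101), shard_by_mapping(3))
```

Either way, records that sit next to each other in key order can land on different shards, which is why range scans become difficult.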
Scale out - auto increment

 AUTO_INCREMENT usually can’t be used
   same schema - different databases
 MyISAM sequence table
   UPDATE seq SET id=LAST_INSERT_ID(id+1)
   SELECT LAST_INSERT_ID()
   does not affect other InnoDB transactions
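The sequence-table idea can be tried out with Python's built-in sqlite3 as a stand-in. This is only an analogue: `LAST_INSERT_ID(expr)` is MySQL-specific, so the sketch reads the counter back inside the same transaction instead, and the helper names are hypothetical.

```python
import sqlite3

def make_sequence(conn):
    # One-row table holding the last ID handed out.
    conn.execute("CREATE TABLE seq (id INTEGER NOT NULL)")
    conn.execute("INSERT INTO seq (id) VALUES (0)")
    conn.commit()

def next_id(conn):
    # Increment and read back in one transaction, so two callers
    # never receive the same value.
    with conn:
        conn.execute("UPDATE seq SET id = id + 1")
        (value,) = conn.execute("SELECT id FROM seq").fetchone()
    return value

conn = sqlite3.connect(":memory:")
make_sequence(conn)
print([next_id(conn) for _ in range(3)])
```

Every shard asks the same sequence for its next ID, so IDs stay unique even though the shards share one schema.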
Scale back - Multi instances

  Reduce the number of scaled-out servers
    server specs go up / a service gets smaller
  MySQL replication doesn’t support multiple masters
  Run multiple mysqld instances on 1 server
    bind different virtual IP addresses
    nothing changes on app servers
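[Diagram: merging DB(1) Master and DB(2) Master into one new DB Master would need multiple-master replication.]

[Diagram: DB(1) Master (192.168.10.1) and DB(2) Master (192.168.10.2) move to one new DB server running two mysqld instances, DB(1) Master and DB(2) Master, each with its own my.cnf.]
my.cnf                 my.cnf
bind-address=xxx       bind-address=yyy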
Data Availability - backup
 use a non-service slave as “backup”
   the same spec as master
   daily logical backup
 easy to add a new slave
   use the daily backup and binary log
   slave with a new schema
      ALTER before importing data
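[Diagram: DB Master replicates to DB Slaves and to DB Backup. (1) At 3:00 AM a mysqldump is taken on the backup together with its binary log position. (2) A new slave can be added at any time from that dump. (3) The new DB Slave joins the other slaves.]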
Data Availability - MHA
MHA - MySQL Master High Availability
Switches the master when the master fails
  prevents split brain / does IP failover
Online schema change
  change the schema of slaves/backup offline
  backup (new schema) -> new master
  master -> new backup, then change its schema
+100TB Data
Purge

Too many records degrade DB performance
  Especially range scans
  They also cause storage capacity problems
Purging unnecessary records is important
  ex.) old messages, logs
Usually based on a timestamp column
Purge - Before MySQL 5.1
 purge records using DELETE
   Range DELETE is too heavy
      ex.) DELETE ... WHERE t <= ...
   DELETE by primary key instead
      keys pre-selected from the backup
          needs an index on the time column
   adjust speed while watching slave replication delay
      and stop at peak time
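A rough sketch of the batched purge described above, using sqlite3 so it runs anywhere. The `messages` table, the function name and the fixed `pause` are assumptions; on the slides the keys are pre-selected on the backup server and the pace follows the slaves' replication delay.

```python
import sqlite3
import time

def purge_in_batches(conn, cutoff, batch_size=500, pause=0.0):
    """Delete rows with t <= cutoff in small primary-key batches."""
    deleted = 0
    while True:
        # Pre-select a small batch of primary keys (uses the index on t).
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM messages WHERE t <= ? ORDER BY id LIMIT ?",
            (cutoff, batch_size))]
        if not ids:
            return deleted
        placeholders = ",".join("?" * len(ids))
        with conn:  # one short transaction per batch
            conn.execute(
                f"DELETE FROM messages WHERE id IN ({placeholders})", ids)
        deleted += len(ids)
        time.sleep(pause)  # fixed throttle; a real job would watch slave delay

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, t INTEGER)")
conn.execute("CREATE INDEX idx_t ON messages (t)")
conn.executemany("INSERT INTO messages VALUES (?, ?)",
                 [(i, i) for i in range(10)])
conn.commit()
print(purge_in_batches(conn, cutoff=4, batch_size=2))
```

Each batch is its own short transaction, so the job can be slowed down or stopped between batches without leaving a long-running DELETE behind.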
Purge - After MySQL 5.1

RANGE partitioning is supported
DROP PARTITION ≒ DROP TABLE
Partition pruning
  reduces useless scanning
Need to add the timestamp column to every
UNIQUE INDEX (including the PRIMARY KEY)
CREATE TABLE t (
  id   int not NULL,
  date int not NULL,
  ...
  PRIMARY KEY (id, date)
) ENGINE=InnoDB

Sample rows, one box per partition:

 id    date          ...        id    date          ...
 101   1337191650               213   1337192650
 102   1337191650               214   1337192650
 103   1337192655               215   1337192655
 104   1337192660               216   1337192660
 105   1337192662               217   1337192662
 ...                            ...

PARTITION BY RANGE(date) (
  PARTITION p20120517 VALUES LESS THAN ...,
  PARTITION p20120518 VALUES LESS THAN ...,
  ...
  PARTITION over VALUES LESS THAN MAXVALUE
)
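Daily partitions need a new partition added and the oldest one dropped on a schedule. The helper below only generates the statements, as a sketch under assumptions: the boundary is taken at UTC midnight (the slides elide the actual values), the function names are hypothetical, and the partition name `over` is backquoted because OVER is a reserved word in newer MySQL versions.

```python
from datetime import date, datetime, timedelta, timezone

def partition_name(day):
    return "p" + day.strftime("%Y%m%d")

def upper_bound(day):
    """Unix timestamp of the following midnight (UTC is an assumption)."""
    nxt = day + timedelta(days=1)
    midnight = datetime(nxt.year, nxt.month, nxt.day, tzinfo=timezone.utc)
    return int(midnight.timestamp())

def add_partition_sql(table, day):
    # The catch-all partition is split, because MySQL does not allow
    # ADD PARTITION after a MAXVALUE partition.
    return (f"ALTER TABLE {table} REORGANIZE PARTITION `over` INTO ("
            f"PARTITION {partition_name(day)} "
            f"VALUES LESS THAN ({upper_bound(day)}), "
            f"PARTITION `over` VALUES LESS THAN MAXVALUE)")

def drop_partition_sql(table, day):
    # Dropping a partition removes its rows without a row-by-row DELETE.
    return f"ALTER TABLE {table} DROP PARTITION {partition_name(day)}"

print(add_partition_sql("t", date(2012, 5, 19)))
print(drop_partition_sql("t", date(2012, 5, 17)))
```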
+1 Mqps
SELECT range scan
Range scan using an index
   Indexes are important for Big Data
InnoDB index = B+ Tree
   Primary Key = Data
      search by PK = access the data
   WHERE a = 1 AND b = 2 ORDER BY c
      KEY i1 (a, b, c)
   Covering Index
SELECT * FROM t
WHERE Col1 = 1
AND   Col2 = 'b'
AND   Col3 >= 6

Index leaf block
 Col1   Col2   Col3   PK
  1      a      3     10
  1      b      2     400
  1      b      6     103
  2      a      1     201
  2      c      5     9
  ...

Data leaf block
 PK    Col1   Col2   ...
 100
 101
 102
 103
 104
 ...
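To see why the column order in the index matters, the index leaf entries above can be modelled as a sorted list of tuples. This toy model is not how InnoDB is implemented; it only shows the "seek once, then read sequentially" behaviour of a range scan on a composite index.

```python
from bisect import bisect_left

# Index leaf entries from the slide, sorted by (Col1, Col2, Col3);
# the last field is the primary key stored in the index entry.
INDEX = sorted([
    (1, "a", 3, 10),
    (1, "b", 2, 400),
    (1, "b", 6, 103),
    (2, "a", 1, 201),
    (2, "c", 5, 9),
])

def range_scan(index, col1, col2, col3_min):
    """Return PKs where Col1 = col1 AND Col2 = col2 AND Col3 >= col3_min."""
    # Seek to the first entry >= (col1, col2, col3_min) ...
    pos = bisect_left(index, (col1, col2, col3_min))
    pks = []
    # ... then read neighbours until the equality prefix stops matching.
    while pos < len(index) and index[pos][:2] == (col1, col2):
        pks.append(index[pos][3])
        pos += 1
    return pks

print(range_scan(INDEX, 1, "b", 6))  # the query on the slide finds PK 103
```

The index yields only the primary key, so `SELECT *` still has to visit the data leaf block for PK 103. A covering index avoids that second lookup when every requested column is already in the index.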
SELECT Primary Key
many queries use WHERE PK = ...
  like a Key-Value Store
HandlerSocket plugin (made by Higuchi)
  skips the SQL parse phase
  uses the Handler interface directly
  higher performance than memcached
  without having to consider cache consistency!
MySQL 5.6 added a memcached API
SQL:
  SELECT PK,Col1
  FROM t
  WHERE PK = 101

HandlerSocket:
  P 0 db t PRIMARY PK,Col1
  0 = 1 101

[Diagram: SQL passes through the Listener for libmysql and the SQL Layer to reach the Handler Interface; the HandlerSocket Plugin calls the Handler Interface directly.]

 PK    Col1   ...
 100    a
 101    b
 102    c
 103    d
 ...

http://engineer.dena.jp/2010/08/handlersocket-plugin-for-mysql.html
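The two HandlerSocket request lines on this slide can be built as follows. This is a sketch based on my understanding of the HandlerSocket text protocol (fields separated by tabs, each request ended by a newline). The function names are hypothetical, nothing is sent over the network, and escaping of special characters is not handled.

```python
def open_index(index_id, db, table, index, columns):
    """Build an open_index request: P <id> <db> <table> <index> <columns>."""
    fields = ["P", str(index_id), db, table, index, ",".join(columns)]
    return ("\t".join(fields) + "\n").encode()

def find(index_id, op, values):
    """Build a find request: <id> <op> <number of values> <v1> ..."""
    fields = [str(index_id), op, str(len(values))] + [str(v) for v in values]
    return ("\t".join(fields) + "\n").encode()

# Equivalent of: SELECT PK,Col1 FROM t WHERE PK = 101
print(open_index(0, "db", "t", "PRIMARY", ["PK", "Col1"]))
print(find(0, "=", [101]))
```

The index is opened once per connection and then reused by many find requests, which is where skipping the SQL parse phase pays off.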
Too many UPDATE
Shorten locking time
Avoid slow procedures inside transactions
  Connect before locking
     to avoid SYN resending while holding locks
     conn A, conn B and then lock A, lock B
  Move out of transactions
     connections to other servers, such as
     HTTP API, memcached
UPDATE distributed masters
  Distributed Master (Sharding)
     MySQL can’t detect a deadlock across masters
     -> waits for innodb_lock_wait_timeout
  SocialGame needs locks on many records
     sort the lock order as much as possible
     Optimistic lock (use a version column)
        raise an error when updating a record
        that has already been updated by others
Deadlock at distributed masters
  (1) App1: lock 101 -> ok
  (2) App2: lock 200 -> ok
  (3) App1: lock 200 -> waiting
  (4) App2: lock 101 -> waiting
[Diagram: record 101 is in one table (PK 100, 101, 102, ...), record 200 in another (PK 200, 201, 202, ...).]
Ordered Lock
  (1) App1: lock 101 -> ok
  (2) App2: lock 101 -> waiting
  (3) App1: lock 200 -> ok
  (4) App2: lock 200 -> waiting
[Diagram: record 101 is in one table (PK 100, 101, 102, ...), record 200 in another (PK 200, 201, 202, ...).]
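The ordered-lock rule can be demonstrated with plain Python threads standing in for App1 and App2. The in-process `threading.Lock` objects are an assumption that replaces row locks on two different masters; the point is only that sorting the keys before locking removes the circular wait.

```python
import threading

locks = {101: threading.Lock(), 200: threading.Lock()}  # one lock per record
done = []

def update_records(name, record_ids):
    # Always acquire in ascending key order, whatever order the caller used.
    ordered = sorted(record_ids)
    for rid in ordered:
        locks[rid].acquire()
    try:
        done.append(name)  # stands in for the transaction body
    finally:
        for rid in reversed(ordered):
            locks[rid].release()

threads = [
    threading.Thread(target=update_records, args=("App1", [101, 200])),
    threading.Thread(target=update_records, args=("App2", [200, 101])),
]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)
print(sorted(done))
```

Without the `sorted` call, App2 would take 200 first and the two threads could block each other exactly as in the deadlock slide.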
Optimistic Lock
(1)
SELECT * FROM t
WHERE PK = 101 (no lock)
=> ver = 36786

(2)
UPDATE t SET Col1 = 100, ver = ver+1
WHERE PK = 101 AND ver = 36786

 PK      ver     Col1   ...
 100   25643
 101   36786
 102   14624
 ...
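The two statements above can be run end to end with sqlite3 as a stand-in for MySQL. The function name is hypothetical; the check on the affected row count is the part that tells the app whether someone else updated the record first.

```python
import sqlite3

def update_if_unchanged(conn, pk, new_value, expected_ver):
    """Return True if the update won, False if the version had moved on."""
    with conn:
        cur = conn.execute(
            "UPDATE t SET col1 = ?, ver = ver + 1 WHERE pk = ? AND ver = ?",
            (new_value, pk, expected_ver))
    # Zero affected rows means another writer bumped the version first.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, ver INTEGER, col1 INTEGER)")
conn.execute("INSERT INTO t VALUES (101, 36786, 0)")
conn.commit()

print(update_if_unchanged(conn, 101, 100, 36786))  # first writer
print(update_if_unchanged(conn, 101, 200, 36786))  # stale version
```

When the function returns False, the app raises an error or re-reads the row and retries; no lock is held between the SELECT and the UPDATE.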
Replication Delay - monitor
 MySQL replication is single-threaded
 High CUD (create/update/delete) qps is the bottleneck of a slave
    difficult to benchmark beforehand
 -> monitor the delay of the backup server
    backup is lower spec than a slave,
    so replication delay problems can be detected
    before they happen on a slave
Replication Delay - SSD
SSD is quite effective for slaves
  high IOPS means high throughput of the
  replication thread
  Multiple instances have multiple replication threads
SATA-SSD is a good choice for most cases
  cheap enough / storage capacity
  PCIe-SSD is too expensive now
  SAS-SSD can’t use full storage capacity
Finally...
DeNA needs
      Big Data lovers
We have been fighting “Big Data”
  MySQL/Hadoop/App/Network/etc...
  There are still many problems ;(
  We need more power
We look forward to you joining us
  Please contact riywo / DeNA staff :)
Thank you!

More Related Content

What's hot

Practical Testing of Ruby Core
Practical Testing of Ruby CorePractical Testing of Ruby Core
Practical Testing of Ruby CoreHiroshi SHIBATA
 
Roll Your Own API Management Platform with nginx and Lua
Roll Your Own API Management Platform with nginx and LuaRoll Your Own API Management Platform with nginx and Lua
Roll Your Own API Management Platform with nginx and LuaJon Moore
 
Build your own_map_by_yourself
Build your own_map_by_yourselfBuild your own_map_by_yourself
Build your own_map_by_yourselfMarc Huang
 
RestMQ - HTTP/Redis based Message Queue
RestMQ - HTTP/Redis based Message QueueRestMQ - HTTP/Redis based Message Queue
RestMQ - HTTP/Redis based Message QueueGleicon Moraes
 
Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Tim Bunce
 
Devinsampa nginx-scripting
Devinsampa nginx-scriptingDevinsampa nginx-scripting
Devinsampa nginx-scriptingTony Fabeen
 
Gdb basics for my sql db as (openfest 2017) final
Gdb basics for my sql db as (openfest 2017) finalGdb basics for my sql db as (openfest 2017) final
Gdb basics for my sql db as (openfest 2017) finalValeriy Kravchuk
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationMydbops
 
Lua tech talk
Lua tech talkLua tech talk
Lua tech talkLocaweb
 
Perl at SkyCon'12
Perl at SkyCon'12Perl at SkyCon'12
Perl at SkyCon'12Tim Bunce
 
Gazelle - Plack Handler for performance freaks #yokohamapm
Gazelle - Plack Handler for performance freaks #yokohamapmGazelle - Plack Handler for performance freaks #yokohamapm
Gazelle - Plack Handler for performance freaks #yokohamapmMasahiro Nagano
 
Ansible fest Presentation slides
Ansible fest Presentation slidesAnsible fest Presentation slides
Ansible fest Presentation slidesAaron Carey
 
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)Ontico
 

What's hot (20)

Practical Testing of Ruby Core
Practical Testing of Ruby CorePractical Testing of Ruby Core
Practical Testing of Ruby Core
 
What is nodejs
What is nodejsWhat is nodejs
What is nodejs
 
Roll Your Own API Management Platform with nginx and Lua
Roll Your Own API Management Platform with nginx and LuaRoll Your Own API Management Platform with nginx and Lua
Roll Your Own API Management Platform with nginx and Lua
 
Build your own_map_by_yourself
Build your own_map_by_yourselfBuild your own_map_by_yourself
Build your own_map_by_yourself
 
PostgreSQL and PL/Java
PostgreSQL and PL/JavaPostgreSQL and PL/Java
PostgreSQL and PL/Java
 
EC2
EC2EC2
EC2
 
RestMQ - HTTP/Redis based Message Queue
RestMQ - HTTP/Redis based Message QueueRestMQ - HTTP/Redis based Message Queue
RestMQ - HTTP/Redis based Message Queue
 
MySQLinsanity
MySQLinsanityMySQLinsanity
MySQLinsanity
 
Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Perl Memory Use - LPW2013
Perl Memory Use - LPW2013
 
Jmx capture
Jmx captureJmx capture
Jmx capture
 
Devinsampa nginx-scripting
Devinsampa nginx-scriptingDevinsampa nginx-scripting
Devinsampa nginx-scripting
 
Gdb basics for my sql db as (openfest 2017) final
Gdb basics for my sql db as (openfest 2017) finalGdb basics for my sql db as (openfest 2017) final
Gdb basics for my sql db as (openfest 2017) final
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL Administration
 
Lua tech talk
Lua tech talkLua tech talk
Lua tech talk
 
Query logging with proxysql
Query logging with proxysqlQuery logging with proxysql
Query logging with proxysql
 
Perl at SkyCon'12
Perl at SkyCon'12Perl at SkyCon'12
Perl at SkyCon'12
 
Gazelle - Plack Handler for performance freaks #yokohamapm
Gazelle - Plack Handler for performance freaks #yokohamapmGazelle - Plack Handler for performance freaks #yokohamapm
Gazelle - Plack Handler for performance freaks #yokohamapm
 
Ansible fest Presentation slides
Ansible fest Presentation slidesAnsible fest Presentation slides
Ansible fest Presentation slides
 
Top Node.js Metrics to Watch
Top Node.js Metrics to WatchTop Node.js Metrics to Watch
Top Node.js Metrics to Watch
 
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
 

Similar to "Mobage DBA Fight against Big Data" - NHN TE

How to migrate_to_sharding_with_spider
How to migrate_to_sharding_with_spiderHow to migrate_to_sharding_with_spider
How to migrate_to_sharding_with_spiderKentoku
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesYoshinori Matsunobu
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Matthew Lease
 
Solving performance problems in MySQL without denormalization
Solving performance problems in MySQL without denormalizationSolving performance problems in MySQL without denormalization
Solving performance problems in MySQL without denormalizationdmcfarlane
 
Akiban Technologies: Renormalize
Akiban Technologies: RenormalizeAkiban Technologies: Renormalize
Akiban Technologies: RenormalizeAriel Weil
 
Akiban Technologies: Renormalize
Akiban Technologies: RenormalizeAkiban Technologies: Renormalize
Akiban Technologies: RenormalizeAriel Weil
 
Practical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OS
Practical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OSPractical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OS
Practical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OSCuneyt Goksu
 
Introduction to MapReduce using Disco
Introduction to MapReduce using DiscoIntroduction to MapReduce using Disco
Introduction to MapReduce using DiscoJim Roepcke
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsServer Density
 
Advanced mysql replication for the masses
Advanced mysql replication for the massesAdvanced mysql replication for the masses
Advanced mysql replication for the massesGiuseppe Maxia
 
Simple Works Best
 Simple Works Best Simple Works Best
Simple Works BestEDB
 
Things you should know about Oracle truncate
Things you should know about Oracle truncateThings you should know about Oracle truncate
Things you should know about Oracle truncateKazuhiro Takahashi
 
Spil Games @ FOSDEM: Galera Replicator IRL
Spil Games @ FOSDEM: Galera Replicator IRLSpil Games @ FOSDEM: Galera Replicator IRL
Spil Games @ FOSDEM: Galera Replicator IRLspil-engineering
 
MySQL Replication Update -- Zendcon 2016
MySQL Replication Update -- Zendcon 2016MySQL Replication Update -- Zendcon 2016
MySQL Replication Update -- Zendcon 2016Dave Stokes
 
Spider HA 20100922(DTT#7)
Spider HA 20100922(DTT#7)Spider HA 20100922(DTT#7)
Spider HA 20100922(DTT#7)Kentoku
 
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...Flink Forward
 
Has MySQL grown up?
Has MySQL grown up?Has MySQL grown up?
Has MySQL grown up?Mark Stanton
 
Paper_Scalable database logging for multicores
Paper_Scalable database logging for multicoresPaper_Scalable database logging for multicores
Paper_Scalable database logging for multicoresHyo jeong Lee
 

Similar to "Mobage DBA Fight against Big Data" - NHN TE (20)

How to migrate_to_sharding_with_spider
How to migrate_to_sharding_with_spiderHow to migrate_to_sharding_with_spider
How to migrate_to_sharding_with_spider
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
 
Solving performance problems in MySQL without denormalization
Solving performance problems in MySQL without denormalizationSolving performance problems in MySQL without denormalization
Solving performance problems in MySQL without denormalization
 
Akiban Technologies: Renormalize
Akiban Technologies: RenormalizeAkiban Technologies: Renormalize
Akiban Technologies: Renormalize
 
Akiban Technologies: Renormalize
Akiban Technologies: RenormalizeAkiban Technologies: Renormalize
Akiban Technologies: Renormalize
 
Practical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OS
Practical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OSPractical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OS
Practical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OS
 
Introduction to MapReduce using Disco
Introduction to MapReduce using DiscoIntroduction to MapReduce using Disco
Introduction to MapReduce using Disco
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
Advanced mysql replication for the masses
Advanced mysql replication for the massesAdvanced mysql replication for the masses
Advanced mysql replication for the masses
 
Simple Works Best
 Simple Works Best Simple Works Best
Simple Works Best
 
Things you should know about Oracle truncate
Things you should know about Oracle truncateThings you should know about Oracle truncate
Things you should know about Oracle truncate
 
Spil Games @ FOSDEM: Galera Replicator IRL
Spil Games @ FOSDEM: Galera Replicator IRLSpil Games @ FOSDEM: Galera Replicator IRL
Spil Games @ FOSDEM: Galera Replicator IRL
 
MySQL Replication Update -- Zendcon 2016
MySQL Replication Update -- Zendcon 2016MySQL Replication Update -- Zendcon 2016
MySQL Replication Update -- Zendcon 2016
 
Spider HA 20100922(DTT#7)
Spider HA 20100922(DTT#7)Spider HA 20100922(DTT#7)
Spider HA 20100922(DTT#7)
 
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
 
Has MySQL grown up?
Has MySQL grown up?Has MySQL grown up?
Has MySQL grown up?
 
Hadoop tutorial
Hadoop tutorialHadoop tutorial
Hadoop tutorial
 
Paper_Scalable database logging for multicores
Paper_Scalable database logging for multicoresPaper_Scalable database logging for multicores
Paper_Scalable database logging for multicores
 

More from Ryosuke IWANAGA

"I want to use Fluentd" Fluentd Casual Talks LT
"I want to use Fluentd" Fluentd Casual Talks LT"I want to use Fluentd" Fluentd Casual Talks LT
"I want to use Fluentd" Fluentd Casual Talks LTRyosuke IWANAGA
 
English Casual 2012/05/10
English Casual 2012/05/10English Casual 2012/05/10
English Casual 2012/05/10Ryosuke IWANAGA
 
#bphbqpstudy2012 LT riywo
#bphbqpstudy2012 LT riywo#bphbqpstudy2012 LT riywo
#bphbqpstudy2012 LT riywoRyosuke IWANAGA
 
qpstudy#5 懇親会LT riywo
qpstudy#5 懇親会LT riywoqpstudy#5 懇親会LT riywo
qpstudy#5 懇親会LT riywoRyosuke IWANAGA
 
tcpdump & xtrabackup @ MySQL Casual Talks #1
tcpdump & xtrabackup @ MySQL Casual Talks #1tcpdump & xtrabackup @ MySQL Casual Talks #1
tcpdump & xtrabackup @ MySQL Casual Talks #1Ryosuke IWANAGA
 

More from Ryosuke IWANAGA (7)

"I want to use Fluentd" Fluentd Casual Talks LT
"I want to use Fluentd" Fluentd Casual Talks LT"I want to use Fluentd" Fluentd Casual Talks LT
"I want to use Fluentd" Fluentd Casual Talks LT
 
English Casual 2012/05/10
English Casual 2012/05/10English Casual 2012/05/10
English Casual 2012/05/10
 
20120127 LDeNA LT riywo
20120127 LDeNA LT riywo20120127 LDeNA LT riywo
20120127 LDeNA LT riywo
 
#bphbqpstudy2012 LT riywo
#bphbqpstudy2012 LT riywo#bphbqpstudy2012 LT riywo
#bphbqpstudy2012 LT riywo
 
qpstudy#5 懇親会LT riywo
qpstudy#5 懇親会LT riywoqpstudy#5 懇親会LT riywo
qpstudy#5 懇親会LT riywo
 
tcpdump & xtrabackup @ MySQL Casual Talks #1
tcpdump & xtrabackup @ MySQL Casual Talks #1tcpdump & xtrabackup @ MySQL Casual Talks #1
tcpdump & xtrabackup @ MySQL Casual Talks #1
 
Tsukuba.R#4
Tsukuba.R#4Tsukuba.R#4
Tsukuba.R#4
 

Recently uploaded

Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unraveling Multimodality with Large Language Models.pdf

"Mobage DBA Fight against Big Data" - NHN TE

  • 1. NHN Technology Conference: Mobage DBA Fight against the Big Data. Ryosuke IWANAGA a.k.a. riywo @ DeNA, 2012/05/19
  • 2. Self Introduction: Ryosuke IWANAGA a.k.a. riywo (りーお). DeNA (2009-): Mobage server-side Ops engineer / DBA / manager
  • 3. We have been fighting against “Big Data”! 2006-2008: Mobage grew to 600M PV/day. 2009-2010: started social games -> 2000M PV/day. 2011-: globalization.
  • 11. What is “Big Data”? Many servers? 500+ database servers. Large data size? 100+ TB of InnoDB data (including replicas). High traffic? 1M+ qps across all MySQL servers at peak time.
  • 13. Scale out - Replication. Master-slave architecture. Critical SELECTs go to the master: those used by many HTTP states and those used by UPDATE (with a lock). Non-critical SELECTs go to the slaves: those used only for showing information; easy to scale out.
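The routing rule on this slide can be sketched in a few lines. This is a minimal illustration, not DeNA's implementation: the class `ReplicatedDB`, the `critical` flag, and the host names are hypothetical, and a real application would hand the chosen host to its MySQL driver.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ReplicatedDB:
    """Hypothetical routing helper: one master plus any number of slaves."""
    master: str
    slaves: list = field(default_factory=list)

    def pick_host(self, sql: str, critical: bool = False) -> str:
        """Return the host that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        # Writes, and reads whose result feeds a later write, need the latest data.
        if verb != "SELECT" or critical or not self.slaves:
            return self.master
        # Display-only reads tolerate replication delay, so any slave will do.
        return random.choice(self.slaves)
```

For example, `ReplicatedDB("db1-master", ["db1-slave1"]).pick_host("SELECT * FROM t")` returns the slave, while the same call with `critical=True` returns the master.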
  • 14. [Diagram] App servers send INSERT/DELETE/UPDATE and critical SELECTs to the DB master; non-critical SELECTs go to the DB slaves, which are fed by replication and are scalable.
  • 15. Scale out - Sharding. At first: 1 database holding all tables. Sharding 1: divide by table. Sharding 2: divide by record (mapping table / hashing). Build the new DB series using replication before sharding.
  • 16. [Diagram] Table sharding, before: DB(1) (master and slaves) holds Table A and Table B (records 1, 2, 3, 4, 5, ...); a new series DB(2) (master and slaves) is built from DB(1) by replication. (no JOIN!)
  • 17. [Diagram] Table sharding, after: DB(1) (master and slaves) and DB(2) (master and slaves) each serve one of the tables to the app servers. (no JOIN!)
  • 18. [Diagram] Record sharding, before: DB(2) (master and slaves) is built from DB(1) by replication; both hold records 1, 2, 3, 4, 5, ... (range scans become difficult)
  • 19. [Diagram] Record sharding, after: app servers pick DB(1) or DB(2) for each record by hash or mapping. (range scans become difficult)
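The two ways of locating a record named on these slides, hashing and a mapping table, might look like the following. This is a sketch under stated assumptions: the series names `db1` and `db2`, the contents of `MAPPING`, and the choice of CRC32 as the hash are all hypothetical.

```python
import zlib

SHARDS = ["db1", "db2"]             # hypothetical DB series names
MAPPING = {101: "db1", 200: "db2"}  # hypothetical mapping table (record id -> series)


def shard_by_hash(record_id: int, shards=SHARDS) -> str:
    """Pick a series from the record id alone; no lookup is needed."""
    # crc32 is stable across processes and hosts, so every app server agrees.
    return shards[zlib.crc32(str(record_id).encode()) % len(shards)]


def shard_by_mapping(record_id: int, mapping=MAPPING) -> str:
    """Pick a series from an explicit table, which allows moving single records."""
    return mapping[record_id]
```

Hashing avoids an extra lookup but ties every record to the shard count, while a mapping table costs a lookup and lets individual records be relocated. Either way, records that are adjacent by key may sit on different servers, which is why range scans become difficult.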
  • 20. Scale out - AUTO_INCREMENT. AUTO_INCREMENT usually can't be used (same schema, different databases). Use a MyISAM sequence table: UPDATE seq SET id=LAST_INSERT_ID(id+1); SELECT LAST_INSERT_ID(). This does not affect other InnoDB transactions.
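The two statements on this slide can be wrapped in a small helper. This is a minimal sketch: `next_id` is a hypothetical name, and `cursor` is assumed to be a DB-API 2.0 cursor from any MySQL driver, connected to the database that holds the single-row `seq` table.

```python
def next_id(cursor, table: str = "seq") -> int:
    """Allocate one ID from a single-row sequence table."""
    # LAST_INSERT_ID(expr) stores expr as this connection's "last insert id".
    cursor.execute(f"UPDATE {table} SET id=LAST_INSERT_ID(id+1)")
    # Reading it back is per-connection, so other clients cannot interfere.
    cursor.execute("SELECT LAST_INSERT_ID()")
    return cursor.fetchone()[0]
```

Both statements have to run on the same connection, because `LAST_INSERT_ID()` is tracked per connection.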
  • 21. Scale back - Multiple instances. Reduce the number of scaled-out servers (server specs go up / a service gets smaller). MySQL replication doesn't support multiple masters, so run multiple mysqld instances on 1 server, each bound to a different virtual IP address. Nothing changes on the app servers.
  • 22. [Diagram] DB(1) master and DB(2) master; multiple-master replication into one new DB master.
  • 23. [Diagram] One new DB server runs two mysqld instances: DB(1) master on 192.168.10.1 (my.cnf: bind-address=xxx) and DB(2) master on 192.168.10.2 (my.cnf: bind-address=yyy).
  • 24. Data Availability - Backup. Use a non-service slave as the “backup” server (the same spec as the master) and take a daily logical backup. Easy to add a new slave: use the daily backup and the binary log. New-schema slave: ALTER before importing the data.
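  • 25. [Diagram] (1) At 3:00 AM the backup DB takes a mysqldump together with its binary log position. (2), (3) A slave can be added at any time from that dump, alongside the DB master and the existing DB slaves.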
  • 26. Data Availability - MHA. MHA (MySQL Master High Availability) changes the master when the master fails, prevents split brain, and does IP failover. Online schema change: change the schema of the slaves and the backup offline; backup (new schema) -> new master; old master -> new backup, then change its schema.
  • 28. Purge. Too many records degrade DB performance, especially range scans, and cause storage capacity problems. Purging unnecessary records is important (e.g. old messages, logs), usually by a timestamp column.
  • 29. Purge - Before MySQL 5.1. Purge records using DELETE. A range DELETE is too heavy (e.g. DELETE ... WHERE t <= ...), so delete by primary keys pre-selected from the backup (this needs an index on the time column). Adjust the speed while watching slave replication delay, and stop at peak time.
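A chunked purge of this kind can be sketched as follows. The example runs against SQLite from Python's standard library so that it is self-contained; the table name `messages`, the function `purge_by_pk`, and the `should_stop` hook (standing in for a check of slave replication delay or peak hours) are hypothetical.

```python
import sqlite3
import time


def purge_by_pk(conn, ids, chunk_size=500, pause=0.0, should_stop=lambda: False):
    """Delete pre-selected primary keys in small chunks; return rows deleted."""
    deleted = 0
    for start in range(0, len(ids), chunk_size):
        if should_stop():      # e.g. replication delay too high, or peak time
            break
        chunk = ids[start:start + chunk_size]
        marks = ",".join("?" * len(chunk))
        cur = conn.execute(f"DELETE FROM messages WHERE id IN ({marks})", chunk)
        conn.commit()          # short transactions keep locks brief
        deleted += cur.rowcount
        time.sleep(pause)      # throttle so replication can keep up
    return deleted
```

In the setup the slide describes, the list of `ids` would be selected on the backup server, so the expensive scan over the time column never touches the master.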
  • 30. Purge - After MySQL 5.1. RANGE partitioning is supported. DROP PARTITION ≒ DROP TABLE. Partition pruning reduces useless scanning. The timestamp column must be added to every UNIQUE INDEX.
  • 31. [Figure] Sample rows (id, date, ...): ids 101-105 and ids 213-217, with Unix timestamps such as 1337191650 in date. CREATE TABLE t ( id int not NULL, date int not NULL, ... PRIMARY KEY (id, date) ) ENGINE=InnoDB ... PARTITION BY RANGE(date) ( PARTITION p20120517 VALUES LESS THAN ..., PARTITION p20120518 VALUES LESS THAN ..., ... PARTITION over VALUES LESS THAN MAXVALUE )
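Daily partitions like `p20120517` are usually generated by a script rather than typed by hand. The sketch below builds the partition list for an integer Unix-timestamp column. The slide elides the actual bounds, so this assumes each partition ends at the following midnight in UTC; the function name `daily_partitions` is hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def daily_partitions(first_day: date, days: int, tz=timezone.utc) -> str:
    """Build a PARTITION list with one partition per day plus a catch-all."""
    lines = []
    for n in range(days):
        day = first_day + timedelta(days=n)
        # Assumption: a day's partition holds rows strictly before the next midnight.
        upper = datetime.combine(day + timedelta(days=1), datetime.min.time(), tzinfo=tz)
        bound = int(upper.timestamp())
        lines.append(f"PARTITION p{day:%Y%m%d} VALUES LESS THAN ({bound})")
    lines.append("PARTITION over VALUES LESS THAN MAXVALUE")
    return ",\n".join(lines)
```

Purging a day then becomes a single `ALTER TABLE t DROP PARTITION p20120517`, which is the cheap operation the previous slide compares to DROP TABLE.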
  • 33. SELECT - range scan. Range scans use an index; indexes are important for Big Data. An InnoDB index is a B+ tree. The primary key holds the data: searching by PK = accessing the data. For WHERE a = 1 AND b = 2 ORDER BY c, use KEY i1 (a, b, c). Covering index.
  • 34. [Diagram] SELECT * FROM t WHERE Col1 = 1 AND Col2 = 'b' AND Col3 >= 6. Index leaf block (Col1, Col2, Col3, PK): (1, a, 3, 10), (1, b, 2, 400), (1, b, 6, 103), (2, a, 1, 201), (2, c, 5, 9), ... Data leaf block (PK, Col1, Col2, ...): PK 100, 101, 102, 103, 104, ...
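The lookup in this diagram can be imitated with a sorted list standing in for the index leaf level. This is only an illustration of the access pattern (seek to the first matching entry, then walk forward), not of InnoDB internals; the tuples are the index entries from the slide, and `range_scan` is a hypothetical name.

```python
from bisect import bisect_left

# Index leaf entries from the slide, sorted by (Col1, Col2, Col3); PK is the payload.
INDEX = [(1, "a", 3, 10), (1, "b", 2, 400), (1, "b", 6, 103), (2, "a", 1, 201), (2, "c", 5, 9)]


def range_scan(index, col1, col2, col3_min):
    """Return PKs where Col1 = col1 AND Col2 = col2 AND Col3 >= col3_min."""
    pos = bisect_left(index, (col1, col2, col3_min))  # seek to the first candidate
    pks = []
    while pos < len(index) and index[pos][:2] == (col1, col2):
        pks.append(index[pos][3])                     # walk forward in index order
        pos += 1
    return pks
```

Each returned PK is then used to fetch the full row from the data leaf block; with a covering index that second step is unnecessary because every requested column is already in the index entry.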
  • 35. SELECT - Primary Key. Many queries use WHERE PK = ..., like a key-value store. HandlerSocket plugin (made by Higuchi): skips the SQL parse phase and uses the Handler interface directly; higher performance than memcached, without having to consider cache consistency! MySQL 5.6 added a memcached API.
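  • 36. [Diagram] SELECT PK,Col1 FROM t WHERE PK = 101 normally passes through the listener for libmysql and the SQL layer. The HandlerSocket plugin instead receives “P 0 db t PRIMARY PK,Col1” and “0 = 1 101” and calls the Handler interface directly. Table rows (PK, Col1, ...): (100, a), (101, b), (102, c), (103, d), ... http://engineer.dena.jp/2010/08/handlersocket-plugin-for-mysql.html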
  • 37. Too many UPDATEs. Shorten locking time. Avoid slow procedures inside transactions. Connect before locking to avoid waiting on SYN resending: conn A, conn B, then lock A, lock B. Move connections to other servers (e.g. HTTP API, memcached) out of transactions.
  • 38. UPDATE - distributed masters. With distributed masters (sharding), MySQL can't detect deadlocks across servers -> the app waits for innodb_lock_wait_timeout. Social games need locks on many records. Sort the lock order as much as possible. Optimistic lock (use a version column): raise an error when updating a record that has already been updated by others.
  • 39. [Diagram] Deadlock across distributed masters: (1) App1 locks 101: ok. (2) App2 locks 200: ok. (3) App1 locks 200: waiting. (4) App2 locks 101: waiting.
  • 40. [Diagram] Ordered lock: (1) App1 locks 101: ok. (2) App2 locks 101: waiting. (3) App1 locks 200: ok. (4) App2 locks 200: waiting.
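The ordered-lock idea can be shown with in-process locks standing in for row locks on different masters. This is a sketch only: `ROW_LOCKS` and `lock_in_order` are hypothetical, and in the real system each lock would be a row lock taken on the master that owns the record.

```python
import threading
from contextlib import ExitStack

# Hypothetical stand-ins for row locks held on different masters.
ROW_LOCKS = {101: threading.Lock(), 200: threading.Lock()}


def lock_in_order(keys, locks=ROW_LOCKS):
    """Acquire every lock in ascending key order; release them all on exit."""
    stack = ExitStack()
    try:
        for key in sorted(set(keys)):   # same global order for every caller
            stack.enter_context(locks[key])
    except BaseException:
        stack.close()                   # release whatever was already taken
        raise
    return stack
```

Used as `with lock_in_order([200, 101]): ...`, every caller takes 101 before 200, so the second caller simply waits at the first lock instead of holding one lock while waiting for the other.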
  • 41. [Diagram] Optimistic lock on table t (PK, ver, Col1, ...) with rows (100, 25643), (101, 36786), (102, 14624), ...: (1) SELECT * FROM t WHERE PK = 101 (no lock) => ver = 36786. (2) UPDATE t SET Col1 = 100, ver = ver+1 WHERE PK = 101 AND ver = 36786
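The two steps of this slide translate directly into code. The sketch below uses SQLite from Python's standard library so that it runs as-is; the exception `StaleVersion` and the function names are hypothetical, and the SQL mirrors the statements on the slide.

```python
import sqlite3


class StaleVersion(Exception):
    """Raised when another writer updated the row first."""


def read_version(conn, pk):
    # (1) Read without locking and remember the version that was seen.
    return conn.execute("SELECT ver FROM t WHERE PK = ?", (pk,)).fetchone()[0]


def update_if_unchanged(conn, pk, seen_ver, new_col1):
    # (2) Write only if the version is unchanged, bumping it in the same statement.
    cur = conn.execute(
        "UPDATE t SET Col1 = ?, ver = ver + 1 WHERE PK = ? AND ver = ?",
        (new_col1, pk, seen_ver),
    )
    conn.commit()
    if cur.rowcount != 1:
        raise StaleVersion(f"row {pk} changed since version {seen_ver}")
```

When `StaleVersion` is raised, the caller can re-read the row and retry, or report the conflict, which is the "raise error" behaviour the earlier slide describes.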
  • 42. Replication Delay - Monitoring. MySQL replication is single-threaded, so a high CUD (create/update/delete) qps is the bottleneck on slaves. This is difficult to benchmark beforehand -> monitor the delay of the backup server. The backup server is lower spec than the slaves, so replication delay problems can be detected before they happen on the slaves.
  • 43. Replication Delay - SSD. SSD is quite effective for slaves: high IOPS means high throughput for the replication thread. Multiple instances have multiple replication threads. SATA SSD is a good choice for most cases: cheap enough, with enough storage capacity. PCIe SSD is too expensive for now. SAS SSD can't use its full storage capacity.
  • 45. DeNA needs Big Data lovers. We have been fighting “Big Data”: MySQL/Hadoop/App/Network/etc... There are still many problems ;( We need more power. We look forward to you joining us. Please contact riywo / DeNA staff :)