How MySQL InnoDB works and how realestate.com.au configures their databases

Trent Hornibrook @mysqldbahelp
Inner workings of InnoDB
Transaction logs
  • Used for durability
  • Any changes that are made
    are written synchronously
  • Only ever read from at
    startup
  • Quick if you have a BBU (battery-backed RAID cache)
    and per-commit log flushing disabled
Tablespace
• Where all the table data is
• Read from in the foreground,
  and it's RANDOM I/O.
• Expensive! Want to avoid this!
Bufferpool
• In memory cache of the tablespace
Memory + BBU
High read IO
iostat output from a database box.
avg-cpu: %user    %nice %system %iowait     %steal   %idle
          15.91    0.00    1.63   20.80       0.00   61.65

Device:            rrqm/s   wrqm/s    r/s      w/s   rsec/s   wsec/s  avgrq-sz avgqu-sz   await   svctm  %util
sda                 19.00    14.00 895.00     7.00 70976.00   168.00     78.87     2.78    3.20    1.11 100.00
sda1                 0.00     0.00    0.00    0.00     0.00     0.00      0.00     0.00    0.00    0.00   0.00
sda2                 0.00     0.00    0.00    0.00     0.00     0.00      0.00     0.00    0.00    0.00   0.00
sda3                19.00    14.00 895.00     7.00 70976.00   168.00     78.87     2.78    3.20    1.11 100.00
sda4                 0.00     0.00    0.00    0.00     0.00     0.00      0.00     0.00    0.00    0.00   0.00
sda5                 0.00     0.00    0.00    0.00     0.00     0.00      0.00     0.00    0.00    0.00   0.00
dm-0                 0.00     0.00 912.00    21.00 70944.00   168.00     76.22     3.56    3.93    1.07 100.00

avg-cpu:   %user    %nice %system %iowait   %steal   %idle
           15.94     0.00    1.57   18.24     0.00   64.25

Device:            rrqm/s   wrqm/s    r/s      w/s   rsec/s   wsec/s  avgrq-sz avgqu-sz   await   svctm  %util
sda                 23.00    30.00 1071.00    7.00 83560.00   296.00     77.79     2.05     1.88    0.92   99.60
sda1                 0.00     0.00    0.00    0.00     0.00     0.00      0.00     0.00     0.00    0.00    0.00
sda2                 0.00    18.00    0.00    1.00     0.00   152.00    152.00     0.04    20.00   40.00    4.00
sda3                23.00    12.00 1071.00    6.00 83560.00   144.00     77.72     2.01     1.87    0.92   99.60
sda4                 0.00     0.00    0.00    0.00     0.00     0.00      0.00     0.00     0.00    0.00    0.00
sda5                 0.00     0.00    0.00    0.00     0.00     0.00      0.00     0.00     0.00    0.00    0.00
dm-0                 0.00     0.00 1094.00   18.00 83624.00   144.00     75.33     2.42     2.18    0.90   99.60
High read IO

Device:      r/s     w/s   await   svctm
sda      1071.00    7.00    1.88    0.92
sda1        0.00    0.00    0.00    0.00
sda2        0.00    1.00   20.00   40.00
sda3     1071.00    6.00    1.87    0.92
sda4        0.00    0.00    0.00    0.00
sda5        0.00    0.00    0.00    0.00
dm-0     1094.00   18.00    2.18    0.90
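
To put numbers on "high read IO", the sda figures from the second iostat sample can be turned into a read share and an average read size. This is only a small arithmetic sketch: the function names are made up for illustration, and it assumes iostat's rsec/s column counts 512-byte sectors.

```python
def read_share(reads_per_s, writes_per_s):
    """Fraction of I/O requests per second that are reads."""
    total = reads_per_s + writes_per_s
    return reads_per_s / total if total else 0.0


def avg_read_kib(read_sectors_per_s, reads_per_s, sector_bytes=512):
    """Average size of one read request in KiB."""
    if reads_per_s == 0:
        return 0.0
    return read_sectors_per_s * sector_bytes / reads_per_s / 1024


if __name__ == "__main__":
    # r/s, w/s and rsec/s for sda, copied from the second sample above.
    print(f"read share:    {read_share(1071.0, 7.0):.1%}")
    print(f"avg read size: {avg_read_kib(83560.0, 1071.0):.1f} KiB")
```

Almost every request on this box is a read, which is why the options on the next slide focus on keeping reads in memory.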
Dealing with IO problems
•   Can you increase the buffer pool?
•   Can you add additional memory?
•   Mount with nobarrier, noatime, nodiratime?
•   If you’re read-IO bound, can you move queries to slaves?
•   SSD or similar for the tablespace?
•   Schema refactoring?
•   If you’re write-IO bound, consider functional partitioning or
    sharding
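
One way to judge whether a bigger buffer pool would help is to look at how many logical page reads miss the pool and go to disk. The sketch below assumes you have already read the two MySQL status counters Innodb_buffer_pool_reads and Innodb_buffer_pool_read_requests (for example via SHOW GLOBAL STATUS). The function name and the sample numbers are made up for illustration.

```python
def buffer_pool_miss_ratio(disk_reads, read_requests):
    """Share of logical page reads that had to go to disk.

    disk_reads    -> value of Innodb_buffer_pool_reads
    read_requests -> value of Innodb_buffer_pool_read_requests
    """
    if read_requests == 0:
        return 0.0
    return disk_reads / read_requests


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    ratio = buffer_pool_miss_ratio(disk_reads=50_000, read_requests=1_000_000)
    print(f"buffer pool miss ratio: {ratio:.2%}")
```

Both counters are cumulative since server start, so comparing the difference between two readings taken some minutes apart gives a better picture of the current workload than the raw totals.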
Sometimes you could be CPU bound

procs -----------memory---------- ---swap-- -----io----   -system-- ----cpu----
 r b   swpd   free   buff cache    si   so    bi    bo   in     cs us sy id wa
 3 1 243980 181492 173472 129244     0    0 1599     40      0    0 13 1 67 19
 1 2 243980 181848 173480 129248     0    0 25636    68   4933 13578 15 1 65 18
 2 2 243980 181848 173488 129264     0    0 33268    80   3724 12734 16 1 63 19
 1 5 243980 181340 173492 129280     0    0 28756    84   3928 13184 15 1 57 26
 1 3 243980 181692 173508 129316     0    0 30820   280   4415 12656 16 1 52 31
 1 4 243980 182020 173520 129308     0    0 26536 1404    7238 13976 20 2 50 28
 4 5 243980 181408 173528 129332     0    0 26064    68   7860 16334 28 2 41 29
 3 4 243980 181528 173532 129340     0    0 33248    96   7567 18472 26 2 40 32
 3 4 243980 181320 173532 129396     0    0 30532    52   8437 19065 16 2 50 32
 1 4 243980 181452 173552 129400     0    0 28688   136   7709 17505 16 1 54 29


Average 15% user CPU and 55% idle time?
How can I be CPU bound?
CPU bound server
host:~# mpstat -P ALL 1

08:45:49   CPU   %user   %nice   %sys %iowait   %irq   %soft   %steal   %idle    intr/s
08:45:50   all   16.87    0.00   1.20   10.77   0.12    0.60     0.00   70.45   5999.00
08:45:50     0    4.67    0.00   0.93    4.67   0.00    0.93     0.00   88.79    750.00
08:45:50     1    3.67    0.00   3.67   22.94   0.00    0.92     0.00   68.81    749.00
08:45:50     2    5.56    0.00   1.85    9.26   0.00    0.93     0.00   82.41    749.00
08:45:50     3   97.06    0.00   0.00    0.00   0.98    0.00     0.00    1.96    750.00
08:45:50     4    8.08    0.00   2.02   15.15   0.00    1.01     0.00   73.74    749.00
08:45:50     5    3.70    0.00   0.93    8.33   0.00    0.00     0.00   87.04    750.00
08:45:50     6    9.80    0.00   0.98   17.65   0.00    0.00     0.00   71.57    750.00
08:45:50     7    3.92    0.00   0.00    8.82   0.00    0.98     0.00   86.27    750.00

08:45:50   CPU   %user   %nice   %sys %iowait   %irq   %soft   %steal   %idle    intr/s
08:45:51   all   15.96    0.00   0.97    9.62   0.12    0.49     0.00   72.84   5599.00
08:45:51     0    2.97    0.00   0.00   11.88   0.00    0.00     0.00   85.15    699.00
08:45:51     1    3.77    0.00   1.89   18.87   0.00    0.94     0.00   74.53    700.00
08:45:51     2    4.67    0.00   0.93    6.54   0.00    0.00     0.00   87.85    700.00
08:45:51     3   98.99    0.00   0.00    0.00   0.00    1.01     0.00    0.00    700.00
08:45:51     4    6.93    0.00   1.98   12.87   0.00    0.99     0.00   77.23    700.00
08:45:51     5    3.81    0.00   0.00   11.43   0.00    0.95     0.00   83.81    700.00
08:45:51     6    5.94    0.00   0.99    9.90   0.99    0.99     0.00   81.19    700.00
08:45:51     7    3.88    0.00   0.97    5.83   0.00    0.00     0.00   89.32    700.00
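
The per-CPU rows answer the question from the previous slide: CPU 3 sits at 97-99% user with almost no idle time while the other seven cores are mostly idle, so work pinned to a single core is CPU bound even though the box-wide average looks relaxed. A minimal sketch of spotting this automatically, assuming the 11-column mpstat layout shown above (the function name and the 5% idle threshold are arbitrary choices for illustration):

```python
SAMPLE = """\
08:45:49   CPU   %user   %nice   %sys %iowait   %irq   %soft   %steal   %idle    intr/s
08:45:50   all   16.87    0.00   1.20   10.77   0.12    0.60     0.00   70.45   5999.00
08:45:50     0    4.67    0.00   0.93    4.67   0.00    0.93     0.00   88.79    750.00
08:45:50     1    3.67    0.00   3.67   22.94   0.00    0.92     0.00   68.81    749.00
08:45:50     2    5.56    0.00   1.85    9.26   0.00    0.93     0.00   82.41    749.00
08:45:50     3   97.06    0.00   0.00    0.00   0.98    0.00     0.00    1.96    750.00
08:45:50     4    8.08    0.00   2.02   15.15   0.00    1.01     0.00   73.74    749.00
"""


def busy_cores(mpstat_text, idle_threshold=5.0):
    """Return (cpu_number, user_percent) for cores with %idle below the threshold."""
    busy = []
    for line in mpstat_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and the "all" summary row.
        if len(fields) != 11 or not fields[1].isdigit():
            continue
        user, idle = float(fields[2]), float(fields[9])
        if idle < idle_threshold:
            busy.append((int(fields[1]), user))
    return busy


if __name__ == "__main__":
    for cpu, user in busy_cores(SAMPLE):
        print(f"CPU {cpu} is saturated: {user:.2f}% user")
```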
Dealing with CPU problems
• Upgrade your box?
• Running MySQL 5.5 (Percona 5.5)?
• Deadline scheduler
How REA does databases
•   CentOS 6 / RHEL 6
•   Percona Server 5.5 for anything performant
•   Dell R710 for anything massively big
•   Dell M710 for anything big
•   Lots of RAM - working set in buffer pool
•   SSD for our R710 servers as well as 15k RPM 2.5"
    drives (for logs)
How REA does databases
• As much memory as possible in the box
• 80%+ of memory allocated to the buffer pool
  (innodb_buffer_pool_size)
• BBU + innodb_flush_log_at_trx_commit=0
• InnoDB all the way (except for fulltext search)
• NO NO NO QUERY CACHE!*
• No real tuning on session buffers
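
As a rough illustration of the settings on this slide, the sketch below derives a buffer pool size from the machine's RAM and prints a my.cnf fragment. It is not REA's actual configuration: the 80% figure is taken from the slide, the 64 GiB box is a made-up example, and disabling the query cache through query_cache_type and query_cache_size is an assumption about how "no query cache" would be expressed in MySQL 5.5.

```python
def render_mycnf(total_ram_gib, pool_percent=80):
    """Return a [mysqld] fragment sized for a box with total_ram_gib of RAM."""
    total_mib = total_ram_gib * 1024
    pool_mib = total_mib * pool_percent // 100  # integer MiB, rounded down
    lines = [
        "[mysqld]",
        f"innodb_buffer_pool_size = {pool_mib}M",
        # Write and flush the log about once per second instead of at each commit.
        "innodb_flush_log_at_trx_commit = 0",
        # Assumption: this is how the query cache is switched off.
        "query_cache_type = 0",
        "query_cache_size = 0",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_mycnf(64), end="")
```

With innodb_flush_log_at_trx_commit=0 a mysqld crash can lose roughly the last second of committed transactions, so the setting trades some durability for write speed.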
How REA does databases
• innodb_flush_method=O_DIRECT
• Deadline IO scheduler
• Swappiness = 0 (vm.swappiness)
• XFS filesystem - nobarrier, noatime, nodiratime
  (ext4 may be quicker on SSD though)
• Block-aligned filesystems
• InnoDB file per table (innodb_file_per_table)
• Tablespace on SSD / transaction logs on disk
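
Two of the OS-level settings above can be checked from a script. The sketch below assumes a Linux host where the active scheduler is the bracketed entry in /sys/block/<device>/queue/scheduler and swappiness is exposed at /proc/sys/vm/swappiness. The function names and the default device "sda" are illustrative only.

```python
import re
from pathlib import Path


def active_scheduler(sysfs_text):
    """Extract the bracketed (active) scheduler from the sysfs file contents."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def check_host(device="sda"):
    """Collect the current I/O scheduler and swappiness, where available."""
    report = {}
    scheduler_file = Path(f"/sys/block/{device}/queue/scheduler")
    swappiness_file = Path("/proc/sys/vm/swappiness")
    if scheduler_file.exists():
        report["scheduler"] = active_scheduler(scheduler_file.read_text())
    if swappiness_file.exists():
        report["swappiness"] = int(swappiness_file.read_text().strip())
    return report


if __name__ == "__main__":
    print(check_host())
```

On the slide's target setup the expected result would be a scheduler of "deadline" and a swappiness of 0. Newer kernels may report different scheduler names.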
Questions?



Trent Hornibrook @mysqldbahelp
Extra stuff
Our schema changes
Our standard topology
Don’t replicate changes (SET SQL_LOG_BIN=0)
Caveats
• Use statement based replication
• Can only do some changes online (add
  columns / indexes / new tables)
• Don’t rename tables / drop columns
• Requires application ‘support’
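
The slides do not show how the changes are actually applied, so the following is only a guess at the general shape: wrap a DDL statement so it runs on one server without being written to the binary log, and refuse the kinds of change the caveats rule out. The function name, the table "listings" and the column "suburb" are hypothetical.

```python
import re

# Changes the caveats above say not to make this way.
# Limitation: "ALTER TABLE t DROP col" without the COLUMN keyword is not caught.
_BLOCKED = [
    re.compile(r"\bRENAME\b", re.IGNORECASE),
    re.compile(r"\bDROP\s+COLUMN\b", re.IGNORECASE),
]


def local_only_script(ddl):
    """Return the statements to run one DDL change without replicating it."""
    for pattern in _BLOCKED:
        if pattern.search(ddl):
            raise ValueError(f"change not allowed online: {ddl!r}")
    statement = ddl.strip().rstrip(";")
    return [
        "SET SQL_LOG_BIN=0;",   # session-level: stop binary logging for this session
        statement + ";",
        "SET SQL_LOG_BIN=1;",   # restore binary logging for the session
    ]


if __name__ == "__main__":
    for line in local_only_script(
        "ALTER TABLE listings ADD COLUMN suburb VARCHAR(64)"
    ):
        print(line)
```

Because the change is not replicated, the same script has to be run on each server in the topology in turn.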
Schemabot
• In-house tooling to execute the schema
  changes
• Performs the long, tedious procedural task
• Calls out to Nagios upon failure
• Need to hook it into Active Record

realestate and MySQL devops melbourne
