Howmysqlworks

About me
• Before REA I worked for a hosting company for 4
years
• Learnt a lot about MySQL during this period
• I started as an OPS engineer at REA and did some
database stuff here
• I havent done much MySQL for two years
!
• In this period we’ve had
• MySQL 5.6 come out
• RDS MySQL improvements
!
!

Today
• High level overview of MySQL
• Look inside MySQL server
• Talk through how InnoDB performs queries
• Schema changes and online schema changes
!
!

Background
• Second biggest relational database in the world
• Anyone know what the biggest is?
!
• Used by many of the big web shops
• Facebook for all feed data
• Etsy for user & data
• Twitter for tweet persistence
• …
!
• Sun bought MySQL AB in 2008, people freaked out
• Oracle bought Sun in 2010, people freaked out
• Big improvements over the past few years (5.5 / 5.6)
!

How it works
mysqld
mysql client tcp:3306
ﬁlesocket
ServerClient
libmysql
mysql lib
ORM
host$ mysql
host$ mysqld_safe!
host$ mysqld

Big features
• Pluggable storage engines is massive ﬂexibility
• You can use a different storage engine per table
!
!
• Common storage engines
• InnoDB - ACID compliant store
• MyISAM - one of the original storage engines -
avoid
• Lots and lots of others
!

Inside mysqld
mysqld
storage engine(s)
MyISAM / InnoDB / Memory / etc
tcp:3306
ﬁlesocket
Getting the query
and responding
bits and
bytes stuff
API

Inside mysqld
mysqld
storage engine(s)
tcp:3306
ﬁlesocket
API

Inside mysqld
mysqld
storage engine(s)
tcp:3306
ﬁlesocket
Parse query and
convert to SE
API calls.
API

Inside mysqld
mysqld
Storage engines(s)
tcp:3306
ﬁlesocket
API
query cache
binary log
join / sort buffers
Binary Log for replication
Buffers: Kinda how much
batching to storage API
to save cost

The bad parts - binary log
mysqld
tcp:3306
ﬁlesocket
query cache
binary log
join / sort buffers
Because you can mix and
match storage engines,
replicating state has to be
done at the lowest
common denominator
Innodb
Table 1
MyISAM
Table 2

The bad parts - binary log
mysqld
Innodb
Table 1
tcp:3306
ﬁlesocket
query cache
binary log
join / sort buffers
Highly concurrent InnoDB
needs its queries,
serialized down
to replicate to slave
MyISAM
Table 2

Compare to other DBs
DB server
tcp:x
ﬁlesocket?
performance
knobs
transaction
log

The bad parts - query cache
mysqld
Innodb
Table 1
tcp:3306
ﬁlesocket
query cache
binary log
join / sort buffers
Can only work on an
entire table level &
uses LRU
MyISAM
Table 2

The bad parts - query cache
mysqld
Innodb
Table 1
tcp:3306
ﬁlesocket
query cache
binary log
join / sort buffers
All ‘update/insert/delete’
blocks all transactions
to ensure correct state.
It actually slows !
things down
MyISAM
Table 2

Inside mysqld
mysqld
storage engine(s)
tcp:3306
ﬁlesocket
API
Where 10%* of !
the work is done
Where 90% !
of the work !
is done
*Finger in the air calculation

Summary
• Turn off query cache! It is hurting you!
• Binary log is a necessary evil for replication
• Dont play with the performance knobs (sort buffer /
join buffer / etc)
• Turning the 10% - Focus on the 90%
• Use InnoDB for all the tables (unless they’re system
tables)
!
!
• So…lets look at the 90%
!
!

InnoDB
mysqld
InnoDB
tcp:3306
ﬁlesocket
API

Unknown gems
mysqld
InnoDB
tcp:3306
ﬁlesocket
API
handlersocket/
memcache
Innodb

Unknown gems
mysqld
InnoDB
tcp:3306
ﬁlesocket
API
handlersocket/
memcache
Innodb
Memcache API
that talks directly
to InnoDB
!
Removes the SQL
layer and is just K/V
!
Google: handlersocket
memcache InnoDB

Key parts of InnoDB
Innodb
API

Key parts of InnoDB
Table space
(whats on disk)
redo log
Innodb
bufferpool
memory!
ib_log1!
ib_log2!
sometable.ibd!
ib_data!

Key parts of InnoDB
Table space
redo log
bufferpool
• Each block is a page
• Just think of it as a row with some extra stuff like
version number
• InnoDB is ACID meaning each connection has to
have its own view of the world!
• Extra metadata at the top about where everything is

Key parts of InnoDB
Table space
redo log
bufferpool
• Black is the base metadata

What is bufferpool
Table space
redo log
bufferpool
• In memory version of hot tablespace pages

What is redo log
Table space
redo log
bufferpool
• An append only log ﬁle of changes

What is table space
Table space
redo log
bufferpool
• The raw ﬁles - but it may not be always up-to-date

Lets run some queries
Table space
redo log
bufferpool
• First - lets ﬁll up our database with data

Table space
redo log
bufferpool
• mysql> select * from red;

Table space
redo log
bufferpool
mysqld parses the query and sends !
API calls to InnoDB

Table space
redo log
bufferpool
Is red in bufferpool?

Table space
redo log
bufferpool
Load red into buffer pool
perform random !
read disk activity!
At this point, the time to respond!
is dependent on disk speed!

Table space
redo log
bufferpool
Respond to API calls to mysqld

Table space
redo log
bufferpool
Parse results !
second pass sort if necessary !
return to client

Table space
redo log
bufferpool
• mysql> select * from red; #again

Table space
redo log
bufferpool
Is red in bufferpool? - yes

Table space
redo log
bufferpool
• mysql> select * from brown;
What if bufferpool is full?

Table space
redo log
bufferpool
• mysql> select * from blue;
Assume metadata knows page requirements!
LRU on the bufferpool to ﬁnd space and evict

Table space
redo log
bufferpool
• mysql> select * from blue;
!
Load in blue as before and return

Table space
redo log
bufferpool
• mysql> update blue set A=B where ID=1;
!
Is page in bufferpool? yes

Table space
redo log
bufferpool
!
Update the page in bufferpool and increment !
page version number

Table space
redo log
bufferpool
!
Write the new page to the redo log!
what tablespace is up to!

Table space
redo log
bufferpool
!
Update metadata telling it that this is the !
new state of the page!
perform sequential !
write disk activity to redo log!
This is pretty fast!

Table space
redo log
bufferpool
!
Send the OK back to mysqld!

Table space
redo log
bufferpool
!
At sometime in the near future, do a semi sequential !
parse changes and update the tablespace
semi sequential write!
Not that bad !
!

Table space
redo log
bufferpool
• mysql> insert into yellow values
(1,2,3,4);
!
Is space in bufferpool?!

Table space
redo log
bufferpool
(1,2,3,4);
!
Is adjacent pages in bufferpool?!
Random read !
(but not too big)

Table space
redo log
bufferpool
(1,2,3,4);
!
Send OK back to client!

Key take aways
Table space
redo log
bufferpool
• Cold databases are slow!
• bufferpool is empty and everything requires random disk
reads
• The bigger the buffer pool you have, the quicker you will be!
(NUMBER 1 factor on performance!)!
• Redo log is the second slowest part
• we can tune this
innodb_flush_logs_at_trx_commit=1
innodb_buffer_pool_size= ~80% memory!

Schema changes
• Everyone hates them
• They are not transactional
• They are blocking (well most)
• How do they work?
!

Remember this?
mysqld
Storage engines(s)
tcp:3306
ﬁlesocket
API
query cache
binary log
join / sort buffers

Lets add a few more
Storage engines(s)
tcp:3306
ﬁlesocket
API
table meta data
binary log
Table triggers
functions / procedures

mysqld
tcp:3306
ﬁlesocket
mytbl
Schema changes
mytbl meta

mytbl
mysql> CREATE TABLE mytbl  
(a INT, b CHAR (20), INDEX (a)) ENGINE=InnoDB;
Schema changes

mytbl
Schema changes
mysql> ALTER TABLE mytbl  
ADD COLUMN age INT;

mytbl
Schema changes
ADD COLUMN age INT;
!
Wait for any open transactions to close and then !
lock access to the table!
block

mytbl
Schema changes
ADD COLUMN age INT;
!
Create a new ‘hidden’ table with the new schema!
block
.mytbl

mytbl
Schema changes
ADD COLUMN age INT;
!
Do an internal ‘mysqldump’ / ‘mysqlrestore’!
block
.mytbl

mytbl
Schema changes
ADD COLUMN age INT;
!
Do a switch-a-roo!
block
.mytbl

Schema changes
ADD COLUMN age INT;
!
Unblock and send OK back to client!
mytbl

Online schema changes
• Some changes can be done without ‘locking’
• Column additions
• Index additions
!
• Any removal cannot be done online
• Unless you do it yourself
!

mysqld
mytbl
Online schema changes 5.1/5.5
mytbl meta

mysqld
mytbl
mytbl meta
client
mysql> ALTER…
pt-online-schema-change
soundclouds lhm
etc

mysqld
mytbl
mytbl meta
client
mysql> ALTER… .mytbl meta
.mytbl
!
Creates another
table with the
updated
schema in the
client!

mysqld
mytbl
mytbl meta
client
.mytbl
!
Creates a set of
triggers on the
main table!
triggers

mysqld
mytbl
mytbl meta
client
.mytbl
!
selects the
entire table (in
batches) and
inserts into the
new table!
!
!
This can take a
long time!
triggers

mysqld
mytbl
mytbl meta
client
.mytbl
triggers
client
mysql> select
!
Whilst the new
table is being
built, selects
still go to the
old table!

mytbl
mytbl meta
client
.mytbl
triggers
client
mysql> UPDATE
!
Updates cause
the trigger to
ﬁre which
updates the
new table being
built. (Inserts
the row if its
still hasn't been
copied)!
!
Deletes happen
the same way!

mytbl
mytbl meta
client
.mytbl
triggers
client
mysql> INSERT
!
Inserts also
happen the
same way

mytbl
mytbl meta
client
.mytbl
triggers
!
Eventually the
table
population
ﬁnishes and a
‘RENAME table’
action is
performed
(which is quick)

mytbl
mytbl meta
client
.mytbl
triggers
!
Eventually the
table
population
ﬁnishes. This is
the only
‘blocking’ time
block block

mytbl
mytbl meta
client
mysql> ALTER… !
The schema
change tool
ﬁnishes

Online schema changes in 5.6
• All that ‘switch-a-roo’ logic is now within mysqld itself
!

Summary about schema changes
• You ‘double your IO’ when you’re online schema
changes!
• Binary log is outside of this - so watch out for your
slaves table locking if you don't use the right ﬂags
!
!

Howmysqlworks

More Related Content

What's hot

Viewers also liked

Similar to Howmysqlworks

Recently uploaded

Howmysqlworks