Data Tiering: Squeezing Scale out of MySQL (LRUG Presentation 2014-01-13)

data tiering
Squeezing scale out of MySQL

Julien, VP Engineering at HouseTrip
github.com/mezis

disclaimer
IANADBA* 
I’m not a database administrator


Load balancer (ELB)
3k rpm
30x10 web workers (Passenger/Rack)
6x20 job workers (DJ)
85k qpm
Memcache, 4x7GB
MySQL, 400GB, 5x replica
MongoDB, 120GB, 2x replica
S3, TBs


predictable traffic
~25% searches


search =
[destination, start date, end date] 
↓ 
[ [property, price], … ]
[diagram: property, availability and rate tables; columns shown: destination_id, property_id, rate_id, start_date, end_date, price]


search → big bulky join
booking → long transaction + business logic + many small R/W


peak traffic 7pm - 10pm
write queries
read queries
write IO ⛅️
cpu load ⛅️
memory ☀️

contention
noun (kənˈtɛnʃən)
1. a struggling between opponents
2. competition for limited resources


slow reads ?
poor use of indices 
during large write transactions

http://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-transaction-management.html 
http://dev.mysql.com/doc/refman/5.5/en/glossary.html#glos_covering_index


slow writes ?
load+locking on rollback segments

http://dev.mysql.com/doc/refman/5.1/en/innodb-multi-versioning.html 
http://dev.mysql.com/doc/refman/5.5/en/glossary.html#glos_rollback_segment


digging deeper
SHOW ENGINE INNODB STATUS
---TRANSACTION 72C, ACTIVE 755 sec 
4 lock struct(s), …, 3 row lock(s), undo log entries 12 
TABLE LOCK … 
RECORD LOCKS … 
RECORD LOCKS … locks rec but not gap 
RECORD LOCKS … lock_mode X locks gap before rec

http://www.mysqlperformanceblog.com/2012/03/27/innodbs-gap-locks/ 
http://dev.mysql.com/doc/refman/5.1/en/innodb-monitors.html


horizontal scaling
“throw money at it”
→ does not work
→ ops + maintenance cost


fine-tune the DB engine
“bring in the experts”
→ only works short-term


vertical scaling
“throw more money at it”
→ bigger database instances 
~ +4k$/mo
→ only 2 cartridges in that gun


put it in a service
de-normalise the data
use that noSQL thing
webscale


good solution ?
→ live in 1-2 weeks
→ buys 6-12 months

http://xkcd.com/1205/


frame tearing
(not tiering)


frame tearing
caused by “simple buffering”
render scene in buffer

draw to screen from buffer


frame(buffer) tiering
aka “double buffering”
render scene in buffer

draw to screen from buffer


data tiering
separate read and write tables
[diagram: the app reads from availabilities_front; availabilities (read/write) is copied to availabilities_back; front and back are then swapped]

http://en.wikipedia.org/wiki/File:Comparison_double_triple_buffering.svg


tables
availabilities 
availabilities_0 
availabilities_1
data_tiering_switches 
data_tiering_sync_logs

using tiered tables
DataTiering::Switch.new.active_scope_for(Availability)
equivalent to one of
Availability.scoped(:from => 'availabilities_0')
Availability.scoped(:from => 'availabilities_1')
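
How a switch might resolve the table to read from can be sketched in plain Ruby. This is a minimal illustration under assumptions: TierSwitch and its method names are hypothetical and are not the gem's API; the only detail taken from the slides is the _0 / _1 table suffix.

```ruby
# Hypothetical sketch, not the gem's implementation.
# Keeps, per base table, the index (0 or 1) of the copy that serves reads.
class TierSwitch
  def initialize(pointers = {})
    # base table name => index of the table currently serving reads
    @pointers = pointers
  end

  # Physical table that search queries should read from.
  def active_table_for(base)
    "#{base}_#{@pointers.fetch(base, 0)}"
  end

  # Physical table that the next sync should write into.
  def inactive_table_for(base)
    "#{base}_#{1 - @pointers.fetch(base, 0)}"
  end
end

switch = TierSwitch.new({ "availabilities" => 1 })
puts switch.active_table_for("availabilities")   # table to read from
puts switch.inactive_table_for("availabilities") # table to sync into
```

Readers never touch the inactive copy, which is what keeps the bulky search joins away from the rows being written.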


syncing
DataTiering::Sync.sync_and_switch!
regularly* scheduled task 
(every 5 min for us, takes ~ 60s)
* depends on acceptable staleness
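
A rough way to reason about staleness, as a sketch only. It assumes that a sync captures changes made up to about the moment it starts and that readers see them only once the switch happens at the end of the run. Under that assumption, a change made just after a sync starts waits for one full interval plus one sync duration. The function name is hypothetical.

```ruby
# Assumption: a change that just misses a sync is picked up by the next run,
# and becomes visible to readers when that run finishes and switches.
def worst_case_staleness(interval_s, sync_duration_s)
  interval_s + sync_duration_s
end

# With the figures from the slide: every 5 min, ~60 s per run.
puts worst_case_staleness(5 * 60, 60) # seconds
```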

syncing: schema
lazily (no migrations):
- create missing /.*_[01]/ tables
- compare schemas with SHOW CREATE TABLE
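
One way the schema comparison could work, sketched in plain Ruby on the text that SHOW CREATE TABLE returns. The helper name and the normalisation rules (ignore the table name and the AUTO_INCREMENT counter) are assumptions made for illustration, not the gem's code; real schemas with named constraints would need more care.

```ruby
# Hypothetical helper: true when two SHOW CREATE TABLE outputs differ only
# by table name and by the table-level AUTO_INCREMENT counter.
def same_schema?(base_ddl, tier_ddl)
  normalise = lambda do |ddl|
    ddl.sub(/\ACREATE TABLE `[^`]+`/, "CREATE TABLE `t`")
       .sub(/ AUTO_INCREMENT=\d+/, "")
       .strip
  end
  normalise.call(base_ddl) == normalise.call(tier_ddl)
end

base = <<~SQL
  CREATE TABLE `availabilities` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `price` int(11) DEFAULT NULL,
    PRIMARY KEY (`id`)
  ) ENGINE=InnoDB AUTO_INCREMENT=42 DEFAULT CHARSET=utf8
SQL

# Same structure, different name and counter.
tier = base.sub("`availabilities`", "`availabilities_0`")
           .sub("AUTO_INCREMENT=42", "AUTO_INCREMENT=7")
# A tiered copy that predates the price column.
stale = tier.sub("  `price` int(11) DEFAULT NULL,\n", "")

puts same_schema?(base, tier)
puts same_schema?(base, stale)
```

When the comparison fails, the tiered table would be dropped and rebuilt with a bulk sync rather than migrated.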

syncing: bulk
- run TRUNCATE TABLE then INSERT INTO … SELECT FROM
- too slow at runtime (only for setup / after migrations)

syncing: deltas
- deletions:
SELECT id … LEFT JOIN …
DELETE … WHERE id IN …

- insertions & updates:
REPLACE INTO … row_touched_at > X

- remember last sync in data_tiering_sync_logs

- row_touched_at “magic” TIMESTAMP column
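
The delta logic can be illustrated with an in-memory stand-in, where each table is a Hash of id => row. This is a sketch of the idea only: delta_sync! is a hypothetical name, and the real sync runs as SQL statements against MySQL rather than in Ruby.

```ruby
# In-memory model of a delta sync from the source table to one tiered copy.
def delta_sync!(source, target, last_sync_at)
  # Deletions: ids still in the target but gone from the source
  # (the SELECT ... LEFT JOIN then DELETE step).
  (target.keys - source.keys).each { |id| target.delete(id) }

  # Insertions and updates: rows touched since the last sync
  # (the REPLACE INTO ... row_touched_at > X step).
  source.each do |id, row|
    target[id] = row.dup if row[:row_touched_at] > last_sync_at
  end
  target
end

last_sync = Time.at(1_000)
source = {
  1 => { price: 100, row_touched_at: last_sync - 600 }, # untouched
  2 => { price: 250, row_touched_at: last_sync + 10 },  # updated
  4 => { price: 80,  row_touched_at: last_sync + 20 }   # inserted
}
target = {
  1 => { price: 100, row_touched_at: last_sync - 600 },
  2 => { price: 200, row_touched_at: last_sync - 600 },
  3 => { price: 90,  row_touched_at: last_sync - 600 }  # deleted upstream
}

delta_sync!(source, target, last_sync)
p target.keys.sort
p target[2][:price]
```

Each tiered copy needs its own "last sync" timestamp, since the two copies are refreshed on alternate runs.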


swapping
- renaming tables is not transactional
- atomically change a pointer instead (and cache it)
→ data_tiering_switches
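
A sketch of the "flip a pointer and cache it" idea in plain Ruby. Everything here is hypothetical: the store Hash stands in for a row in data_tiering_switches, and CachedSwitch, flip! and the TTL value are illustrative names and choices, not the gem's.

```ruby
# Readers cache the pointer for a short time instead of querying it on
# every request; the flip itself is a single-value write.
class CachedSwitch
  def initialize(store, ttl_s, clock = -> { Time.now.to_f })
    @store = store
    @ttl_s = ttl_s
    @clock = clock
    @cached_index = nil
    @cached_at = nil
  end

  # Index (0 or 1) of the table serving reads, re-read once the TTL expires.
  def active_index
    now = @clock.call
    if @cached_at.nil? || now - @cached_at >= @ttl_s
      @cached_index = @store[:active_index]
      @cached_at = now
    end
    @cached_index
  end
end

# Swap front and back by flipping the pointer.
def flip!(store)
  store[:active_index] = 1 - store[:active_index]
end

now = 0.0
store = { active_index: 0 }
switch = CachedSwitch.new(store, 10, -> { now })

puts switch.active_index # pointer read and cached
flip!(store)
puts switch.active_index # cached value still served
now = 11.0
puts switch.active_index # cache expired, new pointer picked up
```

One consequence of caching: a reader may keep using the retired table for up to the TTL, so the next sync should not start rewriting that table before the caches have expired.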

gem by
@kratob 
@hubb 
@mconnell 
@danielgrieve


outcome
average DB time unchanged
nominal at peak traffic
deadlocks
timeouts 
faster reads

so long, data tiering !
you served us well

keep calm
♡
refactor
@mezis_fr
http://dev.housetrip.com/
