We run a busy installation with high levels of activity and architectural changes. Over the years we have developed techniques and mastered tools to help us maintain high levels of reliability and availability.
Here are some of the things we use on a day-to-day basis, and you probably could too.
1. Using MySQL in a web-scale environment
David Landgren (@dlandgren, david@landgren.net)
Percona Live Europe Amsterdam
September 21-23, 2015
2.
3. Some facts and figures
Video hosting platform
10 years old
300 billion+ video views per month
400 million+ unique visitors per month
34th largest site world wide (comScore)
Largest European site with global reach
5. 12 clusters
Limited use of horizontal partitioning
Percona 5.6 used (nearly) everywhere
InnoDB used (nearly) everywhere
No foreign keys or referential integrity
6.
7. Primary cluster
2 master servers for writes
25 slaves for reads
Additional slaves for special purposes (backups, devs)
Vertical partitioning, no sharding
No query cache
8.
9. Managing the farm
Apache ZooKeeper / zkfarmer
Slaves announce their presence in the farm
PHP framework is automatically aware (on next request)
https://github.com/rs/zkfarmer
10.
11. Performance
Memcache deflects 10x more queries
Nearly all reads on PK
Organic growth in "slow" queries
count(*) still hurts us
Blindly kill long running queries
12.
13. All your base are optimized
How do we avoid fragmentation?
Perform a rolling OPTIMIZE TABLE <foo> across the schema we care about
Take an ordinary slave out of production, copy the latest backup over, and send it back into production
Perform regular backups on a dedicated slave (Percona XtraBackup)
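The rolling optimize mentioned above could be sketched roughly as follows. This is only an illustration, not the tooling used on the platform: the `drain`, `execute` and `restore` callables are hypothetical hooks standing in for whatever takes a slave out of the farm, runs SQL on it, and puts it back. The use of NO_WRITE_TO_BINLOG is also an assumption, so that each run stays local to the slave being worked on.

```python
from typing import Callable, Iterable, List


def optimize_statements(schema: str, tables: Iterable[str]) -> List[str]:
    """Build one OPTIMIZE TABLE statement per table in the schema.

    NO_WRITE_TO_BINLOG (an assumption of this sketch) keeps the
    statement out of the binary log of the server it runs on.
    """
    return [
        f"OPTIMIZE NO_WRITE_TO_BINLOG TABLE `{schema}`.`{table}`"
        for table in tables
    ]


def rolling_optimize(
    slaves: Iterable[str],
    statements: List[str],
    drain: Callable[[str], None],         # hypothetical: remove slave from the farm
    execute: Callable[[str, str], None],  # hypothetical: run one statement on a slave
    restore: Callable[[str], None],       # hypothetical: put slave back in the farm
) -> None:
    """Optimize one slave at a time so the rest keep serving reads."""
    for slave in slaves:
        drain(slave)
        try:
            for statement in statements:
                execute(slave, statement)
        finally:
            # Always return the slave to the farm, even if a statement fails.
            restore(slave)
```

Working through the slaves one at a time keeps read capacity close to nominal while any single slave is busy rebuilding its tables.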
15. Spread the love^Wload
Slaves have a nominal weight (probability to be selected for a given HTTP connection)

  lag   weight   p
    0       50   0.25
    0       50   0.25
    0       50   0.25
    0       50   0.25
        ∑ = 200
16. Spread the load
Reduce production pressure on slaves that are not keeping up with the replication from master

  lag   weight   p
    0       50   0.3125
    0       50   0.3125
    0       50   0.3125
  120       10   0.0625
        ∑ = 160
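The arithmetic behind these figures is each slave's weight divided by the sum of all weights. A minimal sketch of that calculation and of a weighted pick follows. The slave names and function names are invented for illustration, and the rule that turns a lag of 120 into a weight of 10 is not given on the slides, so the weights are taken as inputs here.

```python
import random
from typing import Dict, Optional


def selection_probabilities(weights: Dict[str, float]) -> Dict[str, float]:
    """Return p = weight / sum(weights) for every slave."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one slave must have a positive weight")
    return {name: weight / total for name, weight in weights.items()}


def pick_slave(weights: Dict[str, float],
               rng: Optional[random.Random] = None) -> str:
    """Pick one slave at random, in proportion to its weight."""
    chooser = rng or random
    names = list(weights)
    return chooser.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    # Hypothetical names; weights as on the slide with one lagging slave.
    farm = {"db1": 50, "db2": 50, "db3": 50, "db4": 10}
    print(selection_probabilities(farm))
    print(pick_slave(farm))
```

Because the probabilities are renormalised over the current weights, lowering one slave's weight automatically shifts its share of connections onto the others.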
17. Slow queries
Percona has some interesting parameters for managing slow logs
§ log_slow_filter
§ log_slow_rate_limit
pt-query-digest <logfile>
Read the results regularly
18. Worst. Query. Ever.
SELECT foo.foo_id AS id
FROM foo
INNER JOIN (
    SELECT foo.foo_id
    FROM foo
    INNER JOIN bar_quux bar_gonk
        ON (bar_gonk.label = 'splak' AND bar_gonk.value = 'skroom')
    INNER JOIN foo_quux foo_gonk
        ON (foo_gonk.label = 'fwask' AND foo_gonk.value = 'florf')
    INNER JOIN bar_has_quux is_bar_gonk
        ON (is_bar_gonk.bar_quux_id = bar_gonk.bar_quux_id
            AND is_bar_gonk.bar_id = foo.bar_id)
    INNER JOIN foo_has_quux is_foo_gonk
        ON (is_foo_gonk.foo_quux_id = foo_gonk.foo_quux_id
            AND is_foo_gonk.foo_id = foo.foo_id)
    UNION
    SELECT baz.foo_id
    FROM baz
    WHERE baz.value > 0
) X
    ON (X.foo_id = foo.foo_id)
LEFT JOIN foo_has_fliff zwot
    ON (zwot.foo_id = foo.foo_id AND zwot.foo_fliff_id = 246)
INNER JOIN foo_quux bleep
    ON (bleep.label = 'blort-allow' AND bleep.value = 'thlip')
LEFT JOIN foo_quux bap
    ON (bap.label = 'blort' AND bap.value = 'ba')
LEFT JOIN foo_has_quux blif
    ON (blif.foo_id = foo.foo_id AND blif.foo_quux_id = bap.foo_quux_id)
LEFT JOIN foo_has_quux fweep
    ON (fweep.foo_id = foo.foo_id AND fweep.foo_quux_id = bleep.foo_quux_id)
WHERE foo.a = 1
  AND foo.b = 0
  AND foo.c = 0
  AND foo.d = 0
  AND foo.e = 0
  AND foo.f = 0
  AND foo.g = 1
  AND foo.bar_id = 12345
  AND zwot.foo_fliff_id IS NULL
  AND NOT (blif.foo_id IS NULL XOR fweep.foo_id IS NULL)
ORDER BY foo.tink DESC
22. REALLY slow queries
Save your platform, kill them
github.com/dailymotion/mysql-genocide
Graph the effects
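As a rough illustration of killing long-running queries, the sketch below turns rows shaped like SHOW PROCESSLIST output (columns Id, User, Command, Time) into KILL statements. It does not reproduce the logic of the tool linked above. The `max_seconds` threshold and the choice to spare the replication threads (user "system user") are assumptions made for this example.

```python
from typing import Iterable, List, Mapping


def kill_statements(processlist: Iterable[Mapping[str, object]],
                    max_seconds: int) -> List[str]:
    """Return a KILL statement for every client query running too long.

    Each row is expected to carry the SHOW PROCESSLIST columns
    Id, User, Command and Time (seconds in the current state).
    """
    statements = []
    for row in processlist:
        if row["User"] == "system user":
            continue  # assumption: never touch replication threads
        if row["Command"] != "Query":
            continue  # sleeping or idle connections are left alone
        if int(row["Time"]) <= max_seconds:
            continue  # still within the allowed budget
        statements.append(f"KILL {int(row['Id'])}")
    return statements


if __name__ == "__main__":
    # Hypothetical snapshot of a process list.
    snapshot = [
        {"Id": 11, "User": "web", "Command": "Query", "Time": 3},
        {"Id": 12, "User": "web", "Command": "Query", "Time": 95},
        {"Id": 13, "User": "web", "Command": "Sleep", "Time": 400},
        {"Id": 14, "User": "system user", "Command": "Connect", "Time": 900},
    ]
    for statement in kill_statements(snapshot, max_seconds=60):
        print(statement)
```

Keeping the selection as a pure function over the process list makes it easy to log or graph what would be killed before wiring it to a live connection.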
23.
24. Watching technical debt
information_schema.table_statistics
SELECT t.table_name, s.rows_read
FROM information_schema.tables t
LEFT JOIN information_schema.table_statistics s
    ON (t.table_schema = s.table_schema AND t.table_name = s.table_name)
WHERE t.table_schema = 'myschema'
  AND coalesce(rows_read, 0) = 0
ORDER BY 1;
25. Watching technical debt
Unused indices
SELECT DISTINCT t.table_name, t.index_name, s.rows_read
FROM statistics t
LEFT JOIN index_statistics s
    USING (table_schema, table_name, index_name)
WHERE t.table_schema = 'myschema'
  AND coalesce(s.rows_read, 0) = 0
ORDER BY 1, 2;