This is part 1 of a webinar trilogy on MySQL query tuning, in which we look at the query tuning process and the tools that help with it. The trilogy covers topics such as SQL tuning, indexing, the optimizer and how to leverage EXPLAIN to gain insight into execution plans. Part 1: Query tuning process and tools.
AGENDA
• Query tuning process
- Build
- Collect
- Analyze
- Tune
- Test
• Tools
- tcpdump
- pt-query-digest
SPEAKER
Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard.
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
1. Copyright 2016 Severalnines AB
Your host & some logistics
I'm Jean-Jérôme from the Severalnines Team and I'm your host for today's webinar!
Feel free to ask any questions in the Questions section of this application or via the Chat box.
You can also contact me directly via the chat box or via email: jj@severalnines.com during or after the webinar.
The process
• A short guide to query performance review:
- Build test environment
- Collect your data
- Process your data
- Analyze your data
- Tune SQL and schemas
- Test changes
- Apply changes on production systems
The process - build test environment
• You need a deterministic test environment to make sure you can measure the impact of the changes
• The environment should mirror production as closely as possible, to make the results more relevant
• Hardware makes a difference, especially I/O - you’ll see different performance if your I/O subsystem is slow
• The dataset makes an even more significant difference - you should test your new queries against the production dataset
• If you need to obfuscate your data, your results may differ from those on production systems
The process - build test environment
• The best way to build a test system is to grab a backup of your production systems
• Restore it on a host which matches the hardware configuration of production
• You are all set
• Ideally, you have such a QA/staging/dev environment up and running all the time and use it to test your new code and SQL
• If not, build it at least a couple of times per year to run such a review
• If you are using ClusterControl and Galera Cluster, take advantage of the cloning feature
The process - collect your data
• Native method - slow query log
- Collects the most important data about the query: execution time, lock time, rows sent and rows examined
The process - collect your data
• It can collect even more data on Percona Server and MariaDB
- Even more data with log_slow_verbosity='full,profiling_use_getrusage,profiling'
The process - collect your data
• Another method - tcpdump
- Capture TCP traffic containing MySQL queries
- Gives limited information about the query: execution time and query size
• Main advantage - it does not have to run on the MySQL host, which reduces the impact on the database. It can run instead on:
- The proxy
- The application host
The process - analyze your data
• Thousands of lines in the log - too much to go through manually
• You can build your own tool to aggregate data from the slow query log
• Or you can use the industry standard - pt-query-digest:
- wget http://percona.com/get/pt-query-digest && chmod u+x ./pt-query-digest
• Run pt-query-digest against the slow query log or tcpdump output
- But not on the database host - it can be CPU and memory intensive
• It’ll provide you with a summary of the traffic along with detailed information about each and every query
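To make the "build your own tool" option concrete, here is a minimal Python sketch of such an aggregator. It assumes the standard slow query log layout (a `# Query_time:` header followed by the statement) and uses a deliberately crude normalization. The names `fingerprint` and `aggregate` are hypothetical, and this is not how pt-query-digest is implemented.

```python
import re
from collections import defaultdict

# Matches the timing header that precedes each logged statement.
HEADER_RE = re.compile(r"^# Query_time: (?P<query_time>[\d.]+)")


def fingerprint(sql):
    """Crude normalization: replace literals so similar queries group together."""
    sql = sql.strip().rstrip(";").lower()
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)   # quoted strings
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)  # standalone numbers
    return re.sub(r"\s+", " ", sql)


def aggregate(log_text):
    """Return [(fingerprint, count, total_seconds)], slowest total first."""
    totals = defaultdict(lambda: [0, 0.0])  # fingerprint -> [count, seconds]
    query_time, statement = None, []
    # The trailing "#" sentinel flushes the last entry in the log.
    for line in log_text.splitlines() + ["#"]:
        if line.startswith("#"):
            # A new header block ends the statement collected so far.
            if query_time is not None and statement:
                entry = totals[fingerprint(" ".join(statement))]
                entry[0] += 1
                entry[1] += query_time
                query_time, statement = None, []
            match = HEADER_RE.match(line)
            if match:
                query_time = float(match.group("query_time"))
        else:
            stripped = line.strip()
            # Bookkeeping lines written by the server are not part of the query.
            skip = stripped.lower().startswith(("set timestamp=", "use "))
            if query_time is not None and stripped and not skip:
                statement.append(stripped)
    return sorted(
        ((fp, count, total) for fp, (count, total) in totals.items()),
        key=lambda row: row[2],
        reverse=True,
    )
```

Passing the contents of a slow log to `aggregate` returns the query shapes ordered by total execution time. The sketch reads the whole log into memory and ignores everything except `Query_time`, which is fine for illustration but not for a large production log.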
The process - tune SQL and schemas
• Use EXPLAIN to check the query execution plan
• Rewrite queries as needed
• Test new indexes, remove obsolete ones
• Verify that the new query execution plan actually performs better than the old one
• Leverage Performance Schema for a deep dive into where query execution time goes
The process - apply changes on production systems
• Make sure you know what impact your changes will have
- If you ran detailed tests in the test environment, you should be good
• Execute your changes
- Directly
- Using pt-online-schema-change
• Make sure you monitor the system after the change
• Have a rollback plan
- A list of ALTERs which will revert your changes can be really handy when things go awry
• Make sure you monitor how things unfold for at least a couple of days - some processes may be executed on a weekly basis (or even less often)
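One way to keep such a list of reverting ALTERs is to generate it together with the forward changes. The Python sketch below does this for index changes only. The function name, the tuple layout and the table and index names used with it are hypothetical, and identifiers are not quoted or validated.

```python
def forward_and_rollback(changes):
    """Build matching lists of forward and rollback ALTER statements.

    Each change is (action, table, index, columns). For "drop", the columns
    describe the existing index so that it can be recreated on rollback.
    """
    forward, rollback = [], []
    for action, table, index, columns in changes:
        add = "ALTER TABLE {} ADD INDEX {} ({});".format(
            table, index, ", ".join(columns)
        )
        drop = "ALTER TABLE {} DROP INDEX {};".format(table, index)
        if action == "add":
            forward.append(add)
            rollback.append(drop)
        elif action == "drop":
            forward.append(drop)
            rollback.append(add)
        else:
            raise ValueError("unknown action: " + action)
    # Undo in reverse order: the last change applied is the first reverted.
    return forward, rollback[::-1]
```

For example, a plan that adds `idx_customer` and drops `idx_legacy` on an `orders` table yields a rollback list that first recreates `idx_legacy` and then drops `idx_customer`. Recording the column list of every index you drop is the important part, since that definition is no longer available from the server once the change has been applied.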
The tools - tcpdump
• Collects a snapshot of the TCP traffic
• When you capture MySQL traffic, it contains enough data to calculate:
- Query execution time
- Query size
• Not much detail, but enough to pinpoint slow queries
- You can always check details later, via EXPLAIN
• tcpdump -s 65535 -x -nn -q -tttt -i any port 3306 > mysql.tcp.txt
- Run it on the MySQL host
- Run it on the application server
- Run it on the proxy node
• Great flexibility - you can easily avoid most of the overhead caused by the tool
The tools - pt-query-digest
• An excellent tool designed to aggregate data from the slow query log
- It can work with (among others) the slow log and tcpdump output
• Presents the data in a readable report
- Summary of all of the queries
- Each query presented in detail (if there’s enough data for it, that is)
- EXPLAIN statement ready to copy and paste
• The report can be modified to the user’s liking
- Sort queries by numerous attributes
- Aggregate queries differently
• The ability to store data in the database helps to build a solid review process
- You can also compare old query performance data with the current data to confirm performance has not changed
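The comparison of old and current performance data can be sketched in Python as follows. This is an illustration only: it assumes the data has already been reduced to a mapping from query fingerprint to `(count, total_seconds)`, and the function name `find_regressions` and the default threshold of 1.5 are hypothetical choices, not part of pt-query-digest.

```python
def find_regressions(baseline, current, threshold=1.5):
    """Return (fingerprint, ratio) for queries whose average time grew.

    Both arguments map a query fingerprint to (count, total_seconds).
    A query is reported when its current average execution time is more
    than `threshold` times its baseline average.
    """
    regressions = []
    for fp, (count, total) in current.items():
        # Queries without a usable baseline cannot be compared.
        if fp not in baseline or count == 0:
            continue
        base_count, base_total = baseline[fp]
        if base_count == 0 or base_total == 0:
            continue
        ratio = (total / count) / (base_total / base_count)
        if ratio > threshold:
            regressions.append((fp, round(ratio, 2)))
    # Worst regression first.
    return sorted(regressions, key=lambda item: item[1], reverse=True)
```

Queries that appear only in the current data are skipped here. In a real review they deserve a separate look, because a brand new query has no baseline to regress from.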
The tools - pt-query-digest
• pt-query-digest --limit=100% /var/lib/mysql/slow.log > ptqd1.out
• Report starts with a summary:
- How many queries?
- How many distinct queries?
The tools - pt-query-digest
• Large amount of summarized data
- Helps to understand query patterns
- Helps to understand how the workload has changed
The tools - pt-query-digest
• Queries in the report are sorted - by default, by total query execution time
• Each query has a Query ID assigned - a hash of the query digest
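The idea behind the Query ID can be illustrated with a short Python sketch: normalize the statement so that literals no longer matter, then hash the result. The normalization below is a simplified, hypothetical stand-in for the fingerprinting pt-query-digest performs, and the use of MD5 and the `0x` prefix are assumptions, so the IDs it produces are not expected to match the ones in a real report.

```python
import hashlib
import re


def normalize(sql):
    """Simplified stand-in for a query digest: literals become placeholders."""
    sql = sql.strip().rstrip(";").lower()
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)   # quoted strings
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)  # standalone numbers
    return re.sub(r"\s+", " ", sql)


def query_id(sql):
    """Hash the normalized text so every variant of a query gets one ID."""
    digest = hashlib.md5(normalize(sql).encode("utf-8")).hexdigest()
    return "0x" + digest.upper()
```

Two statements that differ only in their literal values, letter case or spacing get the same ID, which is what makes it possible to track one query shape across reports.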
The tools - pt-query-digest
• Extensive information about every query, with extra data for Percona Server and MariaDB
The tools - pt-query-digest
• Detailed InnoDB stats:
- All the data: I/O operations, lock waits, pages accessed
The tools - pt-query-digest
• Data from tcpdump can’t be as detailed as data from the slow log
• It can still be used to generate important statistics
• You can always analyze such a query manually later
The tools - ClusterControl Query Monitor
• ClusterControl provides a continuous view into query performance
• Query Monitor presents query performance data based on Performance Schema (preferred) or the slow query log (if Performance Schema is not available)
• May not be suitable for a “grand” review, but will help significantly on a day-to-day basis
• The Query Histogram section lists queries with variable execution time - something definitely worth looking into