Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014

775 views

Published on

An introduction to MySQL query tuning for those who are new to the subject. This is not a dark art but a skill that can be developed.

Published in: Technology
  • Be the first to comment

MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014

  1. 1. MMyySSQQLL QQuueerryy TTuunniinngg FFoorr tthhee SSqquueeeemmiisshh David.Stokes@Oracle.com @stoker
  2. 2. Copyright © 2014 Dave Stokes 2 MySQL Optimizer ● The MySQL Cost Based Optimizer wants to optimize each query, every time. Query plans can not be locked in as with Oracle. ● Cost is/was based on disk I/O ● FusionIO, hybrid drives changing cost paradigm – Cost when cost = 0?!? ● Checks Syntax ● Guesstimates costs, looks for cheapest way to do things ● May not be the best way ● How good are your statistics? When was your last ANALYSE TABLE? – Need to run to update stats
  3. 3. Logging – find slow queries Log slow queries (time configurable) and queries w/o indexes. MySQL 5.7 allows levels of verbosity. PERFORMANCE_SCHEMA, INFORMATION_SCEMA has tons of useful data. Copyright © 2014 Dave Stokes 3
  4. 4. Chapter 8 of the MySQL Manual ● All about optimization ● EXPLAIN MySQL 5.6.3 and later permits explainable statements for EXPLAIN are SELECT, DELETE, INSERT, REPLACE, and UPDATE. Before MySQL 5.6.3, SELECT is the only explainable statement. ● Many developers have been trained in their language of choice, JavaScript, JSON/REST, etc. but very, very few have any SQL training. And many try to use ORMs to make up for their rotten SQL skills or try to force NoSQL technologies for relational data. Copyright © 2014 Dave Stokes 4
  5. 5. Copyright © 2014 Dave Stokes 5 Please note
  6. 6. Copyright © 2014 Dave Stokes 6 Example 1 This is the ASCII style output of explain versus VE
  7. 7. From the Manual – If you want details 8.8.4 Estimating Query Performance ● In most cases, you can estimate query performance by counting disk seeks. For small tables, you can usually find a row in one disk seek (because the index is probably cached). For bigger tables, you can estimate that, using B-tree indexes, you need this many seeks to find a row: log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1. ● In MySQL, an index block is usually 1,024 bytes and the data pointer is usually four bytes. For a 500,000-row table with a key value length of three bytes (the size of MEDIUMINT), the formula indicates log(500,000)/log(1024/3*2/(3+4)) + 1 = 4 seeks. ● This index would require storage of about 500,000 * 7 * 3/2 = 5.2MB (assuming a typical index buffer fill ratio of 2/3), so you probably have much of the index in memory and so need only one or two calls to read data to find the row. ● For writes, however, you need four seek requests to find where to place a new index value and normally two seeks to update the index and write the row. Copyright © 2014 Dave Stokes 7
  8. 8. Previous slide in simpler terms ● Use smallest data type possible. No BIGINT for age or customer id numbers – Are you really going to live to 18,446,744,073,709,551,615? – Use PROCEDUE ANALYSE to find optimal size/type ● Wasted space is moved from disk to memory, memoery to buffer, buffer to network, network to buffer, and/or buffer to disk – Reading off disk 100,000 slower than memory Copyright © 2014 Dave Stokes 8
  9. 9. From the Manual – If you want details 8.8.4 Estimating Query Performance ● In most cases, you can estimate query performance by counting disk seeks. For small tables, you can usually find a row in one disk seek (because the index is probably cached). For bigger tables, you can estimate that, using B-tree indexes, you need this many seeks to find a row: log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1. ● In MySQL, an index block is usually 1,024 bytes and the data pointer is usually four bytes. For a 500,000-row table with a key value length of three bytes (the size of MEDIUMINT), the formula indicates log(500,000)/log(1024/3*2/(3+4)) + 1 = 4 seeks. ● This index would require storage of about 500,000 * 7 * 3/2 = 5.2MB (assuming a typical index buffer fill ratio of 2/3), so you probably have much of the index in memory and so need only one or two calls to read data to find the row. ● For writes, however, you need four seek requests to find where to place a new index value and normally two seeks to update the index and write the row. Copyright © 2014 Dave Stokes 9
  10. 10. Chapter 8 is a treasure trove of information. Copyright © 2014 Dave Stokes 10
  11. 11. The dreaded B-tree index Copyright © 2014 Dave Stokes 11 ● Docs.Oracle.com
  12. 12. Why your query stinks! SELECT CONCAT(customer.last_name, ', ', customer.first_name) AS customer, address.phone, film.title FROM rental INNER JOIN customer ON rental.customer_id = customer.customer_id INNER JOIN address ON customer.address_id = Copyright © 2014 Dave Stokes 12 address.address_id INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id INNER JOIN film ON inventory.film_id = film.film_id WHERE rental.return_date IS NULL AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE() LIMIT 5;
  13. 13. Copyright © 2014 Dave Stokes 13 NULL ● Usually used to do designate no value or a lack of data. – Useful for many applications – Screws up indexes as you need to also test values for null. Multiple nulls can kill performance
  14. 14. FULL TABLE SCAN – Try to avoid ● FULL TABLE SCAN means each and every row in your table had to be read. If all the table is not in memory, it is made so. Copyright © 2014 Dave Stokes 14 – Not all ways bad ● Sometimes you have to go through all rows – Use indexes to find exact record(s) needed
  15. 15. Still have a full table scan Copyright © 2014 Dave Stokes 15
  16. 16. Note numbers for rows! Cost!! Estimate that it will take 2,000 reads to get desired data Copyright © 2014 Dave Stokes 16
  17. 17. Copyright © 2014 Dave Stokes 17 Simpler Example SELECT City.name, Country.Name FROM City JOIN Country ON (City.CountryCode = Country.Code) LIMIT 5;
  18. 18. But I ONLY want FIVE ROWS!! SELECT City.name, Country.Name FROM City JOIN Country ON (City.CountryCode = Country.Code) LIMIT 5; Copyright © 2014 Dave Stokes 18 Still needs to do the join BEFORE grabbing just the first FIVE rows.
  19. 19. Add stuff to WHERE to narrow search SELECT City.name, Country.Name FROM City JOIN Country ON (City.CountryCode = Country.Code) WHERE City.Population > 500000 AND District = 'Texas' LIMIT 5; Should really cut down amount of data read, right? Copyright © 2014 Dave Stokes 19
  20. 20. Copyright © 2014 Dave Stokes 20 Not Really
  21. 21. Copyright © 2014 Dave Stokes 21 Another example ● SELECT * FROM City WHERE Population > 400000
  22. 22. Tossing an index at a problem ● CREATE INDEX CityPopIndex ON City (Population) Copyright © 2014 Dave Stokes 22 ● SELECT * FROM City WHERE Population > 400000 Net gain zero
  23. 23. ● Optimizer did not use our new index!! Copyright © 2014 Dave Stokes 23
  24. 24. The best-laid schemes o’ mice an’ men Gang aft agley ● SELECT * FROM City USE INDEX (CityPopindex) WHERE Population > 400000 ● This is a case where even forcing the index does not help as the Population entries are not unique enough for the optimizer to take advantage of it. Copyright © 2014 Dave Stokes 24
  25. 25. Copyright © 2014 Dave Stokes 25 Another example ● SELECT Name FROM City WHERE CountryCode = 'USA' ● 274 rows in world.City have USA as the CountryCode, better than 4K total records in a FULL TABLE SCAN
  26. 26. Copyright © 2014 Dave Stokes 26 Using a Join ● SELECT City.Name, Country.name FROM City Join Country ON (Country.code = City.CountryCode) WHERE Country.Population > 4000000 AND City.CountryCode='USA'
  27. 27. Copyright © 2014 Dave Stokes 27 Indexes ● An index on City, State, and Zip can be used to look up – City, State, Zip – City, State – City ● Drags Zip along on City, State and saves a read into the data
  28. 28. Add a table for zipcodes SHOW CREATE TABLE zipcodesG Copyright © 2014 Dave Stokes 28 Table: zipcodes Create Table: CREATE TABLE `zipcodes` ( `zip` varchar(5) NOT NULL, `type` char(10) DEFAULT NULL, `primary_city` varchar(25) DEFAULT NULL, `accceptable_cities` varchar(50) DEFAULT NULL, `unacceptable_city` varchar(50) DEFAULT NULL, `state` char(2) DEFAULT NULL, `county` varchar(30) DEFAULT NULL, `tz` char(50) DEFAULT NULL, `area_code` varchar(20) DEFAULT NULL, `latitude` decimal(5,0) DEFAULT NULL, `longitude` decimal(5,0) DEFAULT NULL, `world_region` char(4) DEFAULT NULL, `country` char(2) DEFAULT NULL, `decomissioned` int(2) DEFAULT NULL, `est_population` int(10) unsigned DEFAULT NULL, `notes` char(20) DEFAULT NULL, PRIMARY KEY (`zip`), ) ENGINE=InnoDB DEFAULT CHARSET=latin1
  29. 29. Lookup up by city name ● SELECT primary_city, state, zip, accceptable_cities FROM zipcodes WHERE primary_city='Roanoke' Copyright © 2014 Dave Stokes 29
  30. 30. Copyright © 2014 Dave Stokes 30 Two factors SELECT primary_city, state, zip, accceptable_cities FROM zipcodes WHERE primary_city='Roanoke' AND state='TX'
  31. 31. What is we specify all three items? ● SELECT primary_city, state, zip, accceptable_cities FROM zipcodes WHERE primary_city='Roanoke' AND state='TX' AND zip='76262' Optimizer is smart enough to figure that PRIMARY key gives all we need the cheapest Copyright © 2014 Dave Stokes 31
  32. 32. Sometimes indexes help a little ● CREATE INDEX cityindex ON zipcodes (primary_city); An index is added on primary_city to facilitate searching by city name Copyright © 2014 Dave Stokes 32
  33. 33. Yet another example SELECT primary_city, state, area_code FROM zipcodes USE INDEX (primary) WHERE primary_city='Orlando' AND state='FL' ● Default is a full table scan ● CREATE INDEX citystatearea ON zipcodes (primary_city,state,area_code); Copyright © 2014 Dave Stokes 33
  34. 34. More complex query SELECT CONCAT(c.city, _utf8',', cy.country) AS store, CONCAT(m.first_name, _utf8' ', m.last_name) AS manager, SUM(p.amount) AS total_sales FROM payment AS p INNER JOIN rental AS r ON p.rental_id = r.rental_id INNER JOIN inventory AS i ON r.inventory_id = i.inventory_id INNER JOIN store AS s ON i.store_id = s.store_id INNER JOIN address AS a ON s.address_id = a.address_id INNER JOIN city AS c ON a.city_id = c.city_id INNER JOIN country AS cy ON c.country_id = cy.country_id INNER JOIN staff AS m ON s.manager_staff_id = m.staff_id GROUP BY s.store_id ORDER BY cy.country, c.city; Copyright © 2014 Dave Stokes 34
  35. 35. Workbench – Mouse Over Blocks ● MySQL Workbench can provide information on queries – Mouse over blocks for details. In this case the query does not require a filesort (slow) to order the query. – SELECT Name FROM City WHERE Population > 100000 or District='Florida' Order by Population limit 10; Note: Full INDEX scan not Full TABLE scan Copyright © 2014 Dave Stokes 35
  36. 36. Joins – think sets on rows of data Copyright © 2014 Dave Stokes 36
  37. 37. Copyright © 2014 Dave Stokes 37 Books ● Effective MySQL Optimizing SQL Statements – Ronald Bradford ● High Performance MySQL – Schwartz, Zaitsev, Tkachenko
  38. 38. Copyright © 2014 Dave Stokes 38 Other resources ● MySQL Manual ● Planet.MySQL.Com ● Forums.MySQL.com – Forums for Performance, Optimizer & Parser, and many more ● MySQL Central @ OpenWorld – Part of Oracle Open World – End of September
  39. 39. Copyright © 2014 Dave Stokes 39 Q/A ●David.Stokes@Oracle.com ●@stoker ●Slideshare.net/davestokes

×