Optimizing MySQL Queries

CTO Dr. Aris Zakinthinos explains a hands-on approach to optimizing MySQL queries.

Transcript

  • 1. HTTP://ACHIEVERS.COM/TECH
    HTTP://MEETUP.COM/ACHIEVERSTECH
  • 2. Goals
    • To show you how to use MySQL Explain
    • More importantly, how to think about what MySQL is doing when it is processing a query
  • 3. INDEXING REFRESHER
  • 4. What’s an Index?
    • An index is a data structure that improves the speed of data retrieval operations
  • 5. Types of Indexes
    • Unique/Index
      – B+ Tree indexes
      – Hash indexes
    • Spatial
      – R-Tree indexes
    • Full-text indexes
  • 6. Column Index Type Definitions
    • Column Index
      – An index on a single column
    • Compound Index
      – An index on multiple columns
    • Covering Index
      – “Covers” all columns in a query
    • Partial Index
      – A subset of a column for the index
        • E.g. only the first 10 characters of a person’s name
  • 7. Compound Index
    • CREATE TABLE test (
        id INT NOT NULL,
        last_name CHAR(30) NOT NULL,
        first_name CHAR(30) NOT NULL,
        PRIMARY KEY (id),
        INDEX name (last_name, first_name)
      );
    • The name index is an index over the last_name and first_name columns
  • 8. What does that mean?
    • Queries like:
      – SELECT * FROM test WHERE first_name='Aris' AND last_name='Zakinthinos';
      – SELECT * FROM test WHERE last_name='York';
      – Will use the index
    • But a query like:
      – SELECT * FROM test WHERE first_name='Zak';
      – Will not
  • 9. How Compound Indexes are Used
    • If you have an index on (col1, col2, col3)
    • This index will be used on queries for (col1), (col1, col2) and (col1, col2, col3)
      – Notice that the leftmost prefix must exist
    • Only the col1 part of the index will be used for queries for (col1, col3)
    • This index will not be used for queries for (col2), (col3) and (col2, col3)
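
The leftmost-prefix rule on slide 9 can be checked directly with EXPLAIN. The following is a minimal sketch: the table `prefix_demo`, its `payload` column and the index name `idx_c1_c2_c3` are hypothetical names introduced here for illustration, not part of the talk.

```sql
-- Hypothetical table used only to illustrate the leftmost-prefix rule.
-- The payload column is not in the index, so SELECT * cannot be answered
-- from the index alone.
CREATE TABLE prefix_demo (
  id      INT NOT NULL,
  col1    INT NOT NULL,
  col2    INT NOT NULL,
  col3    INT NOT NULL,
  payload CHAR(30) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_c1_c2_c3 (col1, col2, col3)
);

-- Leftmost prefix present: the index is a candidate for all three lookups.
EXPLAIN SELECT * FROM prefix_demo WHERE col1 = 1;
EXPLAIN SELECT * FROM prefix_demo WHERE col1 = 1 AND col2 = 2;
EXPLAIN SELECT * FROM prefix_demo WHERE col1 = 1 AND col2 = 2 AND col3 = 3;

-- Gap in the prefix: only the col1 part can be used for the lookup.
EXPLAIN SELECT * FROM prefix_demo WHERE col1 = 1 AND col3 = 3;

-- Leftmost column missing: the index cannot be used for a lookup.
EXPLAIN SELECT * FROM prefix_demo WHERE col2 = 2;
EXPLAIN SELECT * FROM prefix_demo WHERE col3 = 3;
EXPLAIN SELECT * FROM prefix_demo WHERE col2 = 2 AND col3 = 3;
```

Compare the key and key_len columns across the statements. On an empty or very small table the optimizer may choose differently, so load representative data before drawing conclusions.
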
  • 10. One Thing to Keep in Mind
    • Remember that the index stores things in sorted order, so an n-field compound index is the equivalent of sorting the data on n fields
    • For example, for a 2-column index:
        Table rows        Index entries
        COL1  COL2
        A     4           (A,4)
        Z     3           (A,5)
        A     5           (Z,1)
        Z     1           (Z,3)
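
The four rows on slide 10 can be reproduced as a small sketch. The table `sort_demo` and index `idx_col1_col2` are hypothetical names used only for this illustration.

```sql
-- Hypothetical table mirroring the four rows from the slide.
CREATE TABLE sort_demo (
  col1 CHAR(1) NOT NULL,
  col2 INT NOT NULL,
  INDEX idx_col1_col2 (col1, col2)
);

INSERT INTO sort_demo (col1, col2)
VALUES ('A', 4), ('Z', 3), ('A', 5), ('Z', 1);

-- Sorting on both fields gives the same order as the index entries:
-- (A,4), (A,5), (Z,1), (Z,3)
SELECT col1, col2 FROM sort_demo ORDER BY col1, col2;
```
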
  • 11. Pro Tip
    • MySQL (with InnoDB) always adds the primary key to the end of your secondary indexes
      – You never have to add it to the end of your compound key
  • 12. Can you have too many indexes?
    • YES!
    • They take up space
      – You want all your indexes to fit in memory
    • They make inserts/deletes slower
      – Remember that you need to insert/delete into/from each index
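
To see how much space the indexes take, one option is to query `information_schema`. This is a sketch that assumes the schema is named `sakila`; substitute your own schema name. The reported sizes are approximate.

```sql
-- Approximate data and index size per table, in MB, for one schema.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024, 2) AS data_mb,
       ROUND(index_length / 1024 / 1024, 2) AS index_mb
FROM information_schema.TABLES
WHERE table_schema = 'sakila'
ORDER BY index_length DESC;

-- The InnoDB buffer pool size (in bytes), for comparison with the totals above.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```
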
  • 13. INTRODUCING EXPLAIN
  • 14. What does Explain do?
    • Shows MySQL’s query execution plan
  • 15. What does that mean?
    • How many tables are used
    • How tables are joined
    • How data is looked up
    • Possible and actual index use
    • Length of index used
    • Approximate number of rows examined
  • 16. Why should I care?
    • Ultimately, less server load leads to a better user experience
      – Could be the difference between usable and bankrupt
    • Impress your friends
    • Get better jobs
  • 17. Which queries should I examine?
    • Every single one!
    • If you have never used EXPLAIN:
      – Start by looking at the items in the slow query log
      – Or, execute SHOW FULL PROCESSLIST every once in a while and grab a query that you see very often
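
Both starting points from slide 17 can be sketched as follows. The one-second threshold is an example value chosen here, not a recommendation from the talk, and changing global variables requires administrative privileges.

```sql
-- Log statements that run longer than 1 second (example threshold).
-- The new global long_query_time applies to connections opened afterwards.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Where the slow query log is being written.
SHOW VARIABLES LIKE 'slow_query_log_file';

-- Snapshot of what the server is running right now, with full statement text.
SHOW FULL PROCESSLIST;
```
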
  • 18. Pro Tip
    • Using EXPLAIN during QA is better than in production
    • It saves you from having to say:
      – “It’s not a problem; we call it the coffee break feature.”
  • 19. Can I run it on any query?
    • Before MySQL 5.6, it only worked on SELECT queries
      – As of MySQL 5.6.3 it also accepts DELETE, INSERT, REPLACE and UPDATE
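
A short sketch of both situations, using the sakila `film` table. The UPDATE shown is an arbitrary example chosen here; EXPLAIN reports its plan without modifying any rows.

```sql
-- MySQL 5.6.3 and later: EXPLAIN accepts the data-modifying statement directly.
EXPLAIN UPDATE film SET rental_rate = 0.99 WHERE release_year = 2006;

-- Older versions: explain a SELECT with the same WHERE clause instead.
EXPLAIN SELECT * FROM film WHERE release_year = 2006;
```

The SELECT rewrite is only an approximation of how the row-finding part of the UPDATE would run.
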
  • 20. Anything else I should know?
    • It doesn’t execute your query but MAY execute portions of it. CAUTION!
      – Nested subqueries are executed
  • 21. USING EXPLAIN
  • 22. Test Database
    • We will be using MySQL’s sakila test database, available at: http://dev.MySQL.com/doc/index-other.html
    • It is a sample database for a DVD rental store
  • 23. How do you use Explain?
    • To get MySQL’s execution plan, simply put the word ‘EXPLAIN’ in front of your select statement:
        EXPLAIN SELECT * FROM film f
          JOIN film_actor fa ON fa.film_id = f.film_id
          JOIN actor a ON a.actor_id = fa.actor_id;
  • 24. Explain Output
    • Using a pretty GUI:
  • 25. Explain Output
    • On the command line it is easier to read if you use \G at the end of the line:
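
For the command-line case, the statement from slide 23 would be entered in the `mysql` client like this:

```sql
-- Ending the statement with \G instead of ; makes the mysql client print
-- each output column on its own line, one block per row.
EXPLAIN SELECT * FROM film f
  JOIN film_actor fa ON fa.film_id = f.film_id
  JOIN actor a ON a.actor_id = fa.actor_id\G
```
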
  • 26. SO WHAT DOES IT ALL MEAN?
  • 27. Column Definitions: id
    • A sequential ID identifying the select in the query.
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 28. Column Definitions: select_type
    • The type of select you are doing. Common values:
    • SIMPLE – Simple SELECT (not using UNION or subqueries)
    • PRIMARY – Outermost SELECT
    • UNION – Second or later SELECT in a UNION
    • SUBQUERY – First SELECT in a subquery
    • DEPENDENT SUBQUERY – First SELECT in a subquery, dependent on the outer query
    • DERIVED – Derived table select (subquery in FROM clause)
  • 29. Column Definitions: table
    • The table or alias this row refers to.
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 30. Column Definitions: type
    • The join type. Lots more to come on this.
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 31. Column Definitions: possible_keys
    • The possible indexes that could be used.
    • If NULL then no appropriate index was found.
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 32. Column Definitions: key
    • The index that was used.
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 33. Column Definitions: key_len
    • The number of bytes MySQL uses from the index.
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 34. Column Definitions: ref
    • The columns (or constants) that are compared against the index.
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 35. Column Definitions: rows
    • The approximate number of rows examined.
    • Note: this is just a guide.
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 36. Column Definitions: Extra
    • Information on how the tables are joined.
    • Lots more to come on this!
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 37. JOIN TYPES
  • 38. Join Type – system/const
    • At most one row returned
      – E.g., … WHERE id=1
      – Constant can be propagated through query
      – Index must be either PRIMARY or UNIQUE
    • EXPLAIN SELECT * FROM rental WHERE rental_id = 10;
  • 39. Join Type – eq_ref
    • Index lookup returns exactly one row
      – E.g., WHERE a.id=b.id
      – Requires unique index on all parts of the key used by the join
    • EXPLAIN SELECT * FROM customer c
        JOIN address a ON c.address_id = a.address_id;
  • 40. Join Type – ref
    • Index lookup that can return more than one row
      – Used when
        • Either leftmost part of a unique key is used
        • Or a non-unique or non-null key is used
    • EXPLAIN SELECT * FROM rental WHERE rental_id IN (10,11,12)
        AND rental_date = '2006-02-01';
  • 41. Join Type – ref_or_null
    • Similar to ref but allows for null values or null conditions
      – Essentially an extra pass to look for nulls
    • EXPLAIN SELECT * FROM film WHERE
        release_year = 2006 OR release_year IS NULL;
  • 42. Join Type – index_merge
    • Uses 2 separate indexes
      – Extra field shows more info
      – Can be one of:
        • sort_union – OR condition on non-primary key fields
        • union – OR condition using constants or ranges on primary key fields
        • intersection – AND condition with constants or range conditions on primary key fields
    • EXPLAIN SELECT * FROM rental WHERE rental_id IN (10,11,12)
        OR rental_date = '2006-02-01';
  • 43. Join Type – range
    • Access method for a range value in the where clause (<, <=, >, >=, LIKE, BETWEEN or IN())
      – Performs a partial index scan
      – Lots of optimizations for this type of query
    • EXPLAIN SELECT * FROM rental WHERE rental_date
        BETWEEN '2006-01-01' AND '2006-07-01';
  • 44. Join Type – index
    • Does an index scan
      – You are doing a full scan of every record in the index
      – Better than ‘ALL’ but still requires a LOT of resources
      – Note: This is not the same as ‘Using index’ in Extra
    • EXPLAIN SELECT rental_date FROM rental;
  • 45. Join Type – ALL
    • Full table scan
      – It will look at every record in the table
      – Unless you want the whole table, it should be avoided for all but the smallest of tables
    • EXPLAIN SELECT * FROM rental;
  • 46. Join Type Summary
    • From best to worst
      – system/const
      – eq_ref
      – ref
      – ref_or_null
      – index_merge
      – range
      – index
      – ALL
  • 47. A MORE COMPLICATED EXAMPLE
  • 48. A More Complicated Query
    • All identical id values are part of the same select
    • This query has a bunch of UNIONs
    • You can also see all of the join types used by this query
  • 49. BACK TO OUR QUERY
  • 50. Execution Order
    • First thing to notice is that the order of execution is not the same as the order in the query
    • MySQL will rearrange your query to do what it thinks is optimal
    • You will often disagree!
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 51. Query Processing
    • The above can be read as:
        for (each row in table a [actor]) {
          for (each row in fa [film_actor] matching a.actor_id) {
            for (the row in f [film] matching fa.film_id) {
              Add row to output
            }
          }
        }
    • EXPLAIN SELECT * FROM film f
        JOIN film_actor fa ON fa.film_id = f.film_id
        JOIN actor a ON a.actor_id = fa.actor_id;
  • 52. Query Processing In General
        for (each row in outer table matching where clause) {
          …
          for (each row in inner table matching key value and where clause) {
            Add to output
          }
        }
  • 53. rows is a critical performance indicator
    • Total approximate rows is the PRODUCT of the rows values of all lines with the same id
    • It is only an estimate
    • This query estimated a total of 68,640 rows
    • Actual rows examined: 2,969,639
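
One way to compare the EXPLAIN estimate with what actually happens is to read the session's handler counters after running the query. This is a sketch of that technique, assuming a 5.x-era server and an account allowed to run FLUSH STATUS; it is not necessarily how the figure on slide 53 was obtained.

```sql
-- Reset the current session's status counters.
FLUSH STATUS;

-- Run the query being investigated (the talk's three-table join is used here).
SELECT * FROM film f
  JOIN film_actor fa ON fa.film_id = f.film_id
  JOIN actor a ON a.actor_id = fa.actor_id;

-- The Handler_read_* counters show how many row reads the storage engine performed.
SHOW SESSION STATUS LIKE 'Handler_read%';
```
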
  • 54. Remember: it is an estimate
    • Remember this query:
    • Here is the rest:
    • This query executes in less than 200 ms because there are LIMIT clauses that only return a small number of rows.
  • 55. key_len
    • This will tell you how much of the index it will use
      – Really only useful if you have compound keys
    • For a complete list of type to byte mapping, see:
      – http://dev.mysql.com/doc/refman/5.6/en/storage-requirements.html
    • What’s missing from that page:
      – VARCHAR(n) will always use a 2-byte length field
      – If a column can be NULL it takes one extra byte in the index
    • Warning! Multibyte characters make byte != character
      – utf8 is counted as 3 bytes per character
  • 56. key_len example
    • EXPLAIN SELECT film_id, title FROM film
        WHERE description LIKE 'A Epic%' AND release_year=2006 AND language_id = 1;
    • CREATE TABLE `film` (
        ...
        `description` text,
        `release_year` year(4) DEFAULT NULL,
        `language_id` tinyint(3) unsigned NOT NULL,
        ...
        KEY `idx_compound_key` (`language_id`,`release_year`,`description`(20))
      ) ENGINE=InnoDB AUTO_INCREMENT=1001 DEFAULT CHARSET=utf8;
    • tinyint – 1 byte
    • year – 1 byte + 1 byte possible NULL
    • description – 20*3 bytes + 2 byte length + 1 byte possible NULL
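
Adding up the per-column figures from slide 56 gives the key_len one would expect if all three parts of `idx_compound_key` are used. The arithmetic can be written as a plain SELECT; the alias `expected_key_len` is a name chosen here for illustration.

```sql
-- language_id:     TINYINT NOT NULL     -> 1 byte
-- release_year:    YEAR, nullable       -> 1 byte + 1 NULL byte
-- description(20): TEXT prefix in utf8  -> 20 * 3 bytes + 2 length bytes + 1 NULL byte
SELECT 1 + (1 + 1) + (20 * 3 + 2 + 1) AS expected_key_len;  -- 66
```

A smaller key_len in the EXPLAIN output would suggest that only the leading part of the compound key is being used.
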
  • 57. EXTRA COLUMN
  • 58. Pay Close Attention
    • This column will give you a good sense of what is going to happen during execution
  • 59. Extra Column – The Bad
    • Using temporary
      – During execution, a temporary table was required
      – Not horrible if the temporary table is small since it will sit in memory
  • 60. Extra Column – The Bad
    • Using filesort
      – Sorting was needed rather than using an index
      – Not horrible if the result set is small
  • 61. Extra Column – The Good
    • Using index
      – The query was satisfied using only an index
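
A sketch of when ‘Using index’ can and cannot appear. The table `cover_demo` is hypothetical; it copies the `test` table from slide 7 and adds an `email` column (also hypothetical) that is not part of the index.

```sql
-- Hypothetical table: the slide 7 layout plus one column outside the index.
CREATE TABLE cover_demo (
  id         INT NOT NULL,
  last_name  CHAR(30) NOT NULL,
  first_name CHAR(30) NOT NULL,
  email      CHAR(60) NOT NULL,
  PRIMARY KEY (id),
  INDEX name (last_name, first_name)
);

-- Every referenced column is in the name index: a candidate for 'Using index'.
EXPLAIN SELECT last_name, first_name FROM cover_demo WHERE last_name = 'York';

-- email is not in the index, so the rows themselves must be read.
EXPLAIN SELECT last_name, email FROM cover_demo WHERE last_name = 'York';
```
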
  • 62. Extra Column – The Whatever
    • Using where
      – A where clause is used to restrict rows
      – Unless you also see ‘Using index’, this means that MySQL had to read the row from the database to apply the where clause
  • 63. THINGS TO LOOK OUT FOR
  • 64. Signs That Your Query Stinks
    • No index is used – NULL in key column
    • Large number of rows estimated
    • Using temporary
    • Using filesort
    • Using derived tables – DERIVED in select_type column
    • Joining on derived tables
    • Having dependent subqueries on a large result set
  • 65. Examples:
    • EXPLAIN SELECT * FROM film f WHERE release_year = 2006;
    • ALTER TABLE `sakila`.`film`
        ADD INDEX `idx_release_year` (`release_year` ASC);
  • 66. Examples:
    • EXPLAIN SELECT rating, COUNT(*) FROM film
        WHERE rental_rate < 1.00
        GROUP BY rating;
    • This is a terrible query.
    • A full table scan that creates a temporary table and then sorts it.
    • Why?
  • 67. Grouping
    • To execute this query, MySQL will build a temporary table with all the rows that have rental_rate < 1.00
    • It will then sort them by rating
    • It will then count all of the items that have the same rating
    • To make this query fast you need all of the ratings to be processed in order.
      – That is, you have to avoid the build and sort step
  • 68. Adding an Index
    • ALTER TABLE `sakila`.`film` ADD INDEX `idx_rating`
        (`rating` ASC, `rental_rate` ASC);
    • EXPLAIN SELECT rating, COUNT(*) FROM film
        WHERE rental_rate < 1.00
        GROUP BY rating;
    • This works because MySQL can process all the rows with the same ‘rating’ sequentially.
    • The second part of the index allows it to check the where clause from the index directly.
  • 69. Optimizing GROUP BY
    • All columns used in the GROUP BY must come from the same index and the index must store the keys in the order specified in the GROUP BY
    • If this isn’t true you will see a “Using filesort” and/or “Using temporary”
  • 70. Optimizing ORDER BY
    • An index can be used with ORDER BY even if the index doesn’t match the ORDER BY columns exactly as long as the “missing” keys are constants in the WHERE clause
      – The order of the keys must match the order of the ORDER BY clause
    • If this isn’t true you will see a “Using filesort” and/or “Using temporary”
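
The “missing keys are constants” rule from slide 70 can be sketched with the `test` table from slide 7, which has INDEX name (last_name, first_name). The literal values are arbitrary examples.

```sql
-- last_name is fixed to a constant, so the matching index entries are already
-- ordered by first_name and no separate sort should be needed.
EXPLAIN SELECT * FROM test WHERE last_name = 'York' ORDER BY first_name;

-- No constant on last_name: first_name alone is not a leftmost prefix of the
-- index, so 'Using filesort' is expected in the Extra column.
EXPLAIN SELECT * FROM test ORDER BY first_name;
```
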
  • 71. Both GROUP BY and ORDER BY
    • EXPLAIN SELECT rating, COUNT(*) AS c FROM film
        WHERE rental_rate < 1.00
        GROUP BY rating
        ORDER BY c;
    • There is no way to avoid the ‘Using temporary’ and the ‘Using filesort’
    • You are asking MySQL to first sort by rating to process the GROUP BY, which it does using an index.
    • Then you are asking it to take that result and re-sort it on a computed field.
    • An ORDER BY column list which is in a different order than the GROUP BY list will cause problems.
  • 72. What if I can’t make it better?
    • Never give up!
      – There is always a solution. You might not want to do it, but there is always a solution.
    • You might need to:
      – Denormalize your data to allow you to construct a better compound index
      – Cache outside the database
      – Pull some of the processing into your application
      – Break up the query into smaller, faster chunks
  • 73. FINAL THOUGHTS
  • 74. Test with Production Data
    • Do not optimize with a subset of your data
    • MySQL uses table statistics to determine its execution plan
    • If you optimize with test data your real-world performance might be completely different
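
The table statistics mentioned on slide 74 can be refreshed and inspected with two standard statements. The sketch below uses the sakila `film` table as an example.

```sql
-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE film;

-- The Cardinality column shows the estimated number of distinct values per index.
SHOW INDEX FROM film;
```

Comparing the Cardinality values between a test copy and production is one quick way to see how different the optimizer's view of the two data sets is.
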
  • 75. PRACTICE!
    • Like everything, you need to do it to fully understand it
    • Don’t expect to be a MySQL ninja overnight
    • It takes hard work but the benefits are worth it
      – 10 second searches optimized to 100 milliseconds