Goldilocks and the Three MySQL Queries

Optimizing MySQL queries using explain or the optimizer tracer can greatly increase the speed of retrieving or storing data.

Published in: Technology

  1. Goldilocks and the Three Queries: MySQL's EXPLAIN Explained. Dave Stokes, MySQL Community Manager, North America. David.Stokes@Oracle.Com
  2. Please Read: The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  3. Simple Introduction: EXPLAIN & EXPLAIN EXTENDED are tools to help optimize queries. As tools, they are only as good as the craftspeople using them. There is more to this subject than can be covered here in a single presentation. But hopefully this session will start you out on the right path for using EXPLAIN.
  4. Why worry about the optimizer? (1) Client sends statement to server. (2) Server checks the query cache to see if it has already run the statement. If so, it retrieves the stored result and sends it back to the client. (3) Statement is parsed, preprocessed and optimized to make a Query Execution Plan (QEP). (4) The query execution engine sends the QEP to the storage engine API. (5) Results are sent to the client.
  5. Once upon a time ... There was a PHP programmer named Goldilocks who wanted to get the phone number of her friend Little Red Riding Hood in Networking. She found an old, dusty piece of code in the enchanted programmers' library. Inside the code was a special chant to get all the names and phone numbers of the employees of Grimm-Fayre-Tails Corp. And so, Goldi tried that special chant! SELECT name, phone FROM employees;
  6. Oh-No! But the special chant kept running, and running, and running. Eventually Goldi control-C-ed when she realized that Grimm had hired many, many folks, after hearing that the company had 10^10 employees in the database.
  7. A second chant: Goldi did some searching in the library and learned she could add to the chant to look only for her friend Red. SELECT name, phone FROM employees WHERE name LIKE 'Red%'; Goldi crossed her fingers, held her breath, and let 'er rip.
  8. What she got (name, phone): Redford 1234; Redmund 2323; Redlegs 1234; Red Sox 1914; Redding 9021. But this was not what Goldilocks needed. So she asked a kindly old Java Owl for help.
  9. The Owl's chant: Ah, you want the nickname field! He re-crafted her chant. SELECT first, nick, last, phone, `group` FROM employees WHERE nick LIKE '%red%';
  10. Still too much data … but better: Betty, Big Red, Lopez, 4321, Accounting; Ethel, Little Red, Riding-Hoode, 127.0.0.1, Networks; Agatha, Red Herring, Christie, 007, Public Relations; Johnny, Reds Catcher, Bench, 421, Gaming.
  11. We can tune the query better, cried the Owl. SELECT first, nick, name, phone, `group` FROM employees WHERE nick LIKE 'Red%' AND `group` = 'Networking'; But Goldi was too busy after she got the data she needed to listen.
  12. The preceding were obviously flawed queries. But how do you check if queries are running efficiently? What does the query the MySQL server runs really look like (the dreaded Query Execution Plan)? What is cost-based optimization? How can you make queries faster?
  13. EXPLAIN & EXPLAIN EXTENDED: EXPLAIN [EXTENDED | PARTITIONS] { SELECT statement | DELETE statement | INSERT statement | REPLACE statement | UPDATE statement } or EXPLAIN tbl_name (same as DESCRIBE tbl_name)
  14. What is being EXPLAINed: Prepending EXPLAIN to a statement* asks the optimizer how it would plan to execute that statement at the lowest cost, measured in disk page seeks** (and sometimes it guesses wrong). What it can tell you: where to add INDEXes to speed row access; how to check JOIN order. And Optimizer Tracing (more later) has been recently introduced! * SELECT, DELETE, INSERT, REPLACE & UPDATE as of 5.6; only SELECT in 5.5 and earlier. ** The optimizer does not know whether a page is in memory or on disk (that is the storage engine's problem, not the optimizer's); see MySQL Manual 7.8.3.
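
The idea on slide 14 can be sketched against the sample world database the talk uses. This is a minimal illustration, not the presenter's exact statements; it assumes the stock world schema, where City.ID is the primary key and City.Name has no index.

```sql
-- Assumes the sample `world` database is loaded (names per the stock schema).
USE world;

-- No index on Name (assumed), so expect type = ALL and a rows
-- estimate close to the size of the table.
EXPLAIN SELECT Name, Population FROM City WHERE Name = 'Dallas';

-- Lookup by primary key: expect type = const and rows = 1.
EXPLAIN SELECT Name, Population FROM City WHERE ID = 999;
```

Comparing the type, key, and rows columns of the two plans is usually the quickest way to see whether an index is being used.
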
  15. The Columns: id = which SELECT; select_type = the SELECT type; table = output row table; type = JOIN type; possible_keys = potential indexes; key = actual index used; key_len = length of actual index; ref = columns used against index; rows = estimate of rows; Extra = additional info.
  16. A first look at EXPLAIN ... using the World database: the query will read all 4079 rows, which is all the rows in this table.
  17. EXPLAIN EXTENDED -> query plan. Filtered: estimated % of rows filtered by condition. The query as seen by the server (kind of, sort of, close).
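
A hedged sketch of how the "query as seen by the server" is typically retrieved: on the 5.5/5.6-era servers this talk targets, running SHOW WARNINGS right after EXPLAIN EXTENDED prints the statement as the optimizer rewrote it. In later MySQL releases the EXTENDED keyword was deprecated and then removed, and plain EXPLAIN supplies the same information.

```sql
-- Assumes the sample `world` database and a server that still accepts EXTENDED.
EXPLAIN EXTENDED SELECT Name FROM City WHERE ID = 999;

-- The Note-level message holds the rewritten statement.
SHOW WARNINGS;
```
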
  18. Add in a WHERE clause
  19. Time for a quick review of indexes. Advantages: go right to the desired row(s) instead of reading ALL rows; smaller than the whole table (read from disk faster); can carry other data, such as with compound indexes. Disadvantages: overhead on CRUD operations*; not used on full table scans. * May need to run ANALYZE TABLE to update statistics such as cardinality to help the optimizer make better choices.
  20. Quiz: Why read 4079 rows when only five are needed?
  21. Information in the type column: ALL = full table scan (to be avoided when possible); CONST = WHERE ID=1; EQ_REF = WHERE a.ID = b.ID (uses indexes, 1 row returned); REF = WHERE state='CA' (multiple rows for key values); REF_OR_NULL = WHERE ID IS NULL (extra lookup needed for NULL); INDEX_MERGE = WHERE ID = 10 OR state = 'CA'; RANGE = WHERE x IN (10,20,30); INDEX = full index scan (usually faster when index file < data file); UNIQUE_SUBQUERY; INDEX_SUBQUERY; SYSTEM = table with 1 row or in-memory table.
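
A few of the access types listed on slide 21 can be provoked with the world sample schema. These queries are illustrative assumptions rather than the ones shown in the talk, and the optimizer may choose a different type depending on version, statistics, and which indexes exist (an index on City.CountryCode is assumed).

```sql
-- const: primary key compared with a constant.
EXPLAIN SELECT * FROM City WHERE ID = 1;

-- eq_ref (likely, on the Country side): join on Country's primary key.
EXPLAIN SELECT City.Name, Country.Name
FROM City JOIN Country ON Country.Code = City.CountryCode;

-- ref: equality on a non-unique index (assumes an index on CountryCode).
EXPLAIN SELECT Name FROM City WHERE CountryCode = 'USA';

-- range: a list of values on an indexed column.
EXPLAIN SELECT Name FROM City WHERE ID IN (10, 20, 30);

-- ALL: no usable index on Population in the stock schema.
EXPLAIN SELECT Name FROM City WHERE Population > 1000000;
```
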
  22. Full table scans vs. index: So let's create a copy of the World.City table that has no indexes. The optimizer estimates that 4,279 rows would have to be read to find the desired record, about 5% more than the actual row count: the table has only 4,079 rows.
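
One way to reproduce the experiment on slide 22, as a sketch: CREATE TABLE ... SELECT copies rows but does not create any indexes on the new table, so the copy has no primary key. The table name City_noidx is hypothetical, and the rows estimate will vary by storage engine and statistics.

```sql
-- Copy of City without any indexes (hypothetical table name).
CREATE TABLE City_noidx AS SELECT * FROM City;

-- Expect type = ALL on the copy ...
EXPLAIN SELECT Name FROM City_noidx WHERE ID = 999;

-- ... versus type = const on the original, which has a primary key on ID.
EXPLAIN SELECT Name FROM City WHERE ID = 999;

DROP TABLE City_noidx;
```
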
  23. How does NULL change things? Taking NOT NULL away from the ID field (plus the previous index) increases the estimated rows read to 4296! Roughly 5.3% more rows than are actually in the file. Running ANALYZE TABLE reduces the count to 3816, which is still more than 1.
  24. Both of the following return 1 row
  25. EXPLAIN PARTITIONS: Add 12 hash partitions to City
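
A sketch of the partitioning step on slide 25, under stated assumptions: it works on a copy (the hypothetical City_part) because the InnoDB version of the sample City table carries a foreign key, which partitioned tables do not support. EXPLAIN PARTITIONS is the 5.5/5.6-era syntax; later releases show the partitions column in plain EXPLAIN.

```sql
-- Work on a copy so the original table and its foreign key are untouched.
CREATE TABLE City_part AS SELECT * FROM City;
ALTER TABLE City_part ADD PRIMARY KEY (ID);

-- Every unique key must include the partitioning column; ID satisfies that.
ALTER TABLE City_part PARTITION BY HASH (ID) PARTITIONS 12;

-- The partitions column should list only the partition that can hold ID 999.
EXPLAIN PARTITIONS SELECT Name FROM City_part WHERE ID = 999;
```
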
  26. Some parts of your query may be hidden!!
  27. Latin1 versus UTF8: Create a copy of the City table but with the UTF8 character set replacing Latin1. The key_len of the three-character key grows from three bytes to nine. That is more data to read and more to compare, which is noticeably slower.
  28. INDEX Length: If we add a new index on CountryCode with a length of 2 bytes, does it work as well as the original 3 bytes?
  29. Forcing use of the new shorter index ... still generates a guesstimate that 39 rows must be read. In some cases there is performance to be gained in using shorter indexes.
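
The prefix-index experiment on slides 28 and 29 might look like the following. The index name cc_short and the lookup value are illustrative assumptions, and the existing full-length index is assumed to be named CountryCode.

```sql
-- Index only the first 2 characters of the 3-character code.
CREATE INDEX cc_short ON City (CountryCode(2));

-- Force the short index and check key_len and rows.
EXPLAIN SELECT Name FROM City FORCE INDEX (cc_short)
WHERE CountryCode = 'USA';

-- Compare with the full-length index (assumed name: CountryCode).
EXPLAIN SELECT Name FROM City FORCE INDEX (CountryCode)
WHERE CountryCode = 'USA';

DROP INDEX cc_short ON City;
```

A shorter prefix makes the index smaller but less selective, so the rows estimate can rise for values that share a prefix.
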
  30. Subqueries: These are run as part of EXPLAIN execution and may cause significant overhead, so be careful when testing. Note here that #1 is not using an index. And that is why we recommend rewriting subqueries as joins.
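
As a sketch of the rewrite recommended on slide 30 (the slide's actual query is not in the transcript, so this pair is an assumed example): because Country.Code is a primary key, the IN subquery and the join below return the same cities.

```sql
-- Subquery form.
EXPLAIN SELECT Name FROM City
WHERE CountryCode IN (SELECT Code FROM Country WHERE Continent = 'Europe');

-- Equivalent join form; Code is unique, so no duplicate rows are introduced.
EXPLAIN SELECT City.Name
FROM City JOIN Country ON Country.Code = City.CountryCode
WHERE Country.Continent = 'Europe';
```

Newer optimizers may plan both forms similarly, so it is worth comparing the two EXPLAIN outputs on the server version actually in use.
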
  31. Example of covering indexing: In this case, adding an index reduces the reads from 239 to 42. Can we do better for this query?
  32. Index on both Continent and GovernmentForm: With both Continent and GovernmentForm indexed together, we go from 42 rows read to 19. "Using index" means the data is retrieved from the index, not the table (good). "Using index condition" means evaluation is pushed down to the storage engine. This can reduce storage engine reads of the table and server reads of the storage engine (not bad).
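
A minimal sketch of the compound index from slide 32. The index name cont_gov and the filter values are assumptions; the talk's exact query is not in the transcript, so the row counts it quotes are not expected to match.

```sql
-- Compound index on both columns of Country (hypothetical index name).
CREATE INDEX cont_gov ON Country (Continent, GovernmentForm);

-- Both filter columns and both selected columns live in the index,
-- so Extra should include "Using index" (a covering index).
EXPLAIN SELECT Continent, GovernmentForm
FROM Country
WHERE Continent = 'Asia' AND GovernmentForm = 'Republic';
```
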
  33. Extra: USING INDEX = getting data from the index rather than the table. USING FILESORT = sorting was needed rather than using an index; uses the file system (slow). ORDER BY can use indexes. USING TEMPORARY = a temp table was created; see tmp_table_size and max_heap_table_size. USING WHERE = filter outside storage engine. USING JOIN BUFFER = means no index used.
  34. Things can get messy!
  35. straight_join forces order of tables
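
To illustrate slide 35, here is an assumed pair of queries on the world schema. With SELECT STRAIGHT_JOIN, tables are joined in the order they appear in the FROM clause instead of the order the optimizer would pick.

```sql
-- Optimizer is free to choose the join order.
EXPLAIN SELECT City.Name AS city, Country.Name AS country
FROM City JOIN Country ON Country.Code = City.CountryCode
WHERE Country.Continent = 'Europe';

-- Join order is forced: City first, then Country.
EXPLAIN SELECT STRAIGHT_JOIN City.Name AS city, Country.Name AS country
FROM City JOIN Country ON Country.Code = City.CountryCode
WHERE Country.Continent = 'Europe';
```

The order of rows in the two EXPLAIN outputs shows which table is read first in each plan.
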
  36. Index Hints: Use only as a last resort; shifts in data can make this the long way around. index_hint: USE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list]) | IGNORE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list) | FORCE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list). See http://dev.mysql.com/doc/refman/5.6/en/index-hints.html
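
The hint syntax on slide 36 applied to the world schema, as a sketch. It assumes City has an index named CountryCode; if the index has a different name on your copy, the statements fail with an error naming the missing key.

```sql
-- Suggest an index; the optimizer may still prefer a table scan.
EXPLAIN SELECT Name FROM City USE INDEX (CountryCode)
WHERE CountryCode = 'USA';

-- Hide an index from the optimizer.
EXPLAIN SELECT Name FROM City IGNORE INDEX (CountryCode)
WHERE CountryCode = 'USA';

-- Treat a table scan as very expensive so the named index is used if possible.
EXPLAIN SELECT Name FROM City FORCE INDEX (PRIMARY)
WHERE ID BETWEEN 1 AND 100;
```
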
  37. Controlling the Optimizer: mysql> SELECT @@optimizer_switch\G *************************** 1. row *************************** @@optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off. You can turn on or off certain optimizer settings for GLOBAL or SESSION. See MySQL Manual 7.8.4.2 and know your mileage may vary.
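
A hedged example of toggling one of the flags shown on slide 37 for the current session only. The test query is an assumption, chosen because a LIKE prefix on an indexed column is a common case for index condition pushdown; whether the plan actually changes depends on the server version and the indexes present.

```sql
-- Inspect the current flags.
SELECT @@optimizer_switch;

-- Turn one optimization off for this session and look at the plan.
SET SESSION optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT Name FROM City WHERE CountryCode LIKE 'U%';

-- Restore the flag to its default and compare the Extra column.
SET SESSION optimizer_switch = 'index_condition_pushdown=default';
EXPLAIN SELECT Name FROM City WHERE CountryCode LIKE 'U%';
```
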
  38. Things to watch: mysqladmin -r -i 10 extended-status. Slow_queries = number in last period; Select_scan = full table scans; Select_full_join = joins that needed full scans to complete; Created_tmp_disk_tables = temporary tables created on disk; Key_read_requests/Key_write_requests = read/write weighting of application, may need to modify application.
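
The same counters can also be read from inside a client session. This is a sketch of an alternative to the mysqladmin call on slide 38; note that these values are cumulative since server start, whereas mysqladmin -r reports the change per interval.

```sql
-- Server-wide counters named on the slide.
SHOW GLOBAL STATUS
WHERE Variable_name IN (
  'Slow_queries',
  'Select_scan',
  'Select_full_join',
  'Created_tmp_disk_tables',
  'Key_read_requests',
  'Key_write_requests'
);
```
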
  39. Optimizer Tracing (5.6.3 onward): SET optimizer_trace="enabled=on"; SELECT Name FROM City WHERE ID=999; SELECT trace INTO DUMPFILE '/tmp/foo' FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE; Shows more logic than EXPLAIN. The output shows much deeper detail on how the optimizer chooses to process a query. This level of detail is well past the level for this presentation.
  40. Sample from the trace, but no clues on optimizing for Joe Average DBA
  41. Final Thoughts: 1. READ chapter 7 of the MySQL Manual. 2. Run ANALYZE TABLE periodically. 3. Minimize disk I/O.
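
For the second point on slide 41, a minimal sketch on the world sample tables: ANALYZE TABLE refreshes the key distribution statistics the optimizer relies on, and SHOW INDEX exposes the resulting Cardinality estimates.

```sql
-- Refresh index statistics for the tables used in this talk.
ANALYZE TABLE City, Country;

-- The Cardinality column shows the estimated number of distinct values per index.
SHOW INDEX FROM City;
```
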
  42. Q&A
  43. David.Stokes@Oracle.Com
