San diegophp


Presentation given to San Diego PHP on August 2nd, 2012, on MySQL query tuning and programming best practices.

Published in: Technology

  1. San Diego PHP, Aug 2
     Goldilocks And The Three Queries: MySQL's EXPLAIN Explained
     Dave Stokes, MySQL Community Manager, North America
     David.Stokes@Oracle.Com
  2. Please Read
     The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
  3. Simple Introduction
     EXPLAIN and EXPLAIN EXTENDED are tools to help optimize queries. As tools, they are only as good as the craftspeople using them. There is more to this subject than can be covered in a single presentation, but hopefully this session will start you out on the right path for using EXPLAIN.
  4. Why worry about the optimizer?
     • Client sends statement to server.
     • Server checks the query cache to see if it has already run the statement. If so, it retrieves the stored result and sends it back to the client.
     • Statement is parsed, preprocessed and optimized to make a Query Execution Plan (QEP).
     • The query execution engine sends the QEP to the storage engine API.
     • Results are sent to the client.
  5. Once upon a time ...
     There was a PHP programmer named Goldilocks who wanted to get the phone number of her friend Little Red Riding Hood in Networking. She found an old, dusty piece of code in the enchanted programmers' library. Inside the code was a special chant to get all the names and phone numbers of the employees of Grimm-Fayre-Tails Corp. And so, Goldi tried that special chant!
         SELECT name, phone FROM employees;
  6. Oh-No!
     But the special chant kept running, and running, and running. Eventually Goldi hit Control-C: after hearing that the company had 10^10 employees in the database, she realized that Grimm had hired many, many folks.
  7. A second chant
     Goldi did some searching in the library and learned she could add to the chant to look only for her friend Red.
         SELECT name, phone FROM employees WHERE name LIKE 'Red%';
     Goldi crossed her fingers, held her breath, and let 'er rip.
  8. What she got
         Name      phone
         Redford   1234
         Redmund   2323
         Redlegs   1234
         Red Sox   1914
         Redding   9021
     ● But this was not what Goldilocks needed. So she asked a kindly old Database Owl for help.
  9. The Owl's chant
     "Ah, you want the nickname field!" He re-crafted her chant. (GROUP is a reserved word in MySQL, so the column name is quoted with backticks.)
         SELECT first, nick, last, phone, `group` FROM employees WHERE nick LIKE '%red%';
  10. Still too much data … but better
          Betty, Big Red, Lopez, 4321, Accounting
          Ethel, Little Red, Riding-Hoode,, Networks
          Agatha, Red Herring, Christie, 007, Public Relations
          Johnny, Reds Catcher, Bench, 421, Gaming
  11. "We can tune the query better," cried the Owl.
          SELECT first, nick, name, phone, `group` FROM employees WHERE nick LIKE 'Red%' AND `group` = 'Networking';
      But Goldi was too busy to listen after she got the data she needed.
  12. The preceding were obviously flawed queries
      • But how do you check if queries are running efficiently?
      • What does the query the MySQL server runs really look like? (The dreaded Query Execution Plan.) What is cost-based optimization?
      • How can you make queries faster?
  13. EXPLAIN & EXPLAIN EXTENDED
          EXPLAIN [EXTENDED | PARTITIONS]
            { SELECT statement | DELETE statement | INSERT statement | REPLACE statement | UPDATE statement }
      Or
          EXPLAIN tbl_name   (same as DESCRIBE tbl_name)
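      The two forms above can be sketched as follows. This is a minimal example, assuming the `world` sample database is loaded (its `City` table appears later in the talk); the WHERE value is illustrative only.

      ```sql
      -- Statement form: returns one row of plan output per table in the query.
      EXPLAIN SELECT Name FROM City WHERE CountryCode = 'USA';

      -- Table form: these two are equivalent and list the table's columns,
      -- not a query plan.
      EXPLAIN City;
      DESCRIBE City;
      ```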
  14. What is being EXPLAINed
      Prepending EXPLAIN to a statement* asks the optimizer how it would plan to execute that statement at lowest cost (measured in disk page seeks**). Sometimes it guesses wrong.
      What it can tell you:
      • Where to add INDEXes to speed row access
      • Check JOIN order
      And Optimizer Tracing (more later) has been recently introduced!
      * SELECT, DELETE, INSERT, REPLACE & UPDATE as of 5.6; only SELECT in 5.5 and previous.
      ** Does not know if a page is in memory or on disk (the storage engine's problem, not the optimizer's); see MySQL Manual 7.8.3.
  15. The Columns
          id              Which SELECT
          select_type     The SELECT type
          table           Output row table
          type            JOIN type
          possible_keys   Potential indexes
          key             Actual index used
          key_len         Length of actual index
          ref             Columns used against index
          rows            Estimate of rows
          Extra           Additional info
  16. A first look at EXPLAIN
      Will read all 4079 rows: all the rows in this table.
  17. EXPLAIN EXTENDED -> query plan
      • Filtered: estimated % of rows filtered by condition
      • The query as seen by the server (kind of, sort of, close)
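      A sketch of how the "query as seen by server" is usually retrieved, assuming the `world.City` sample table and a MySQL 5.x server. Later releases dropped the EXTENDED keyword and show the `filtered` column with plain EXPLAIN.

      ```sql
      -- Adds the filtered column to the normal EXPLAIN output.
      EXPLAIN EXTENDED SELECT Name FROM City WHERE CountryCode = 'USA';

      -- Run immediately afterwards: the Message column holds the statement
      -- roughly as the optimizer rewrote it.
      SHOW WARNINGS;
      ```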
  18. Add in a WHERE clause
  19. Time for a quick review of indexes
      Advantages
      ● Go right to desired row(s) instead of reading ALL ROWS
      ● Smaller than whole table (read from disk faster)
      ● Can carry other data with compound indexes
      Disadvantages
      ● Overhead* (CRUD)
      ● Not used on full table scans
      * May need to run ANALYZE TABLE to update statistics such as cardinality to help the optimizer make better choices
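      A minimal sketch of the maintenance steps mentioned above, assuming the `world.City` sample table. The index name `idx_city_district` is hypothetical.

      ```sql
      -- Add a secondary index (the name is made up for this example).
      CREATE INDEX idx_city_district ON City (District);

      -- Refresh the index statistics the optimizer relies on.
      ANALYZE TABLE City;

      -- The Cardinality column shows the estimated number of distinct values.
      SHOW INDEX FROM City;
      ```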
  20. Quiz: Why read 4079 rows when only five are needed?
  21. Information in the type Column
      ALL: full table scan (to be avoided when possible)
      CONST: WHERE ID=1
      EQ_REF: WHERE a.ID = b.ID (uses indexes, 1 row returned)
      REF: WHERE state='CA' (multiple rows for key values)
      REF_OR_NULL: WHERE ID = 10 OR ID IS NULL (extra lookup needed for NULL)
      INDEX_MERGE: WHERE ID = 10 OR state = 'CA'
      RANGE: WHERE x IN (10,20,30)
      INDEX: full index scan (usually faster when index file < data file)
      UNIQUE_SUBQUERY:
      INDEX_SUBQUERY:
      SYSTEM: table with 1 row or in-memory table
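      A few of these access types can be tried against the `world` sample database. This sketch assumes `City` has a primary key on `ID` and a non-unique index on `CountryCode`, and that `Country` has a primary key on `Code`. The type named in each comment is what such a schema typically produces, not a guarantee; the optimizer may choose differently depending on data and statistics.

      ```sql
      -- Primary-key equality against a constant: typically const.
      EXPLAIN SELECT * FROM City WHERE ID = 1;

      -- Equality on a non-unique index: typically ref.
      EXPLAIN SELECT * FROM City WHERE CountryCode = 'USA';

      -- IN list on an indexed column: typically range.
      EXPLAIN SELECT * FROM City WHERE ID IN (10, 20, 30);

      -- Filter on a column with no index: typically ALL.
      EXPLAIN SELECT * FROM City WHERE Population > 1000000;

      -- Join on the other table's primary key: Country is typically eq_ref.
      EXPLAIN SELECT City.Name, Country.Name
      FROM City JOIN Country ON City.CountryCode = Country.Code;
      ```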
  22. Full table scans vs. Index
      So let's create a copy of the World.City table that has no indexes. The optimizer estimates that it would require 4,279 rows to be read to find the desired record, about 5% more than the actual row count: the table has only 4,079 rows.
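      One way to build such an index-free copy, as a sketch. The table name `City_noidx` is hypothetical, and the row estimate you see will depend on the storage engine and its statistics.

      ```sql
      -- CREATE TABLE ... AS SELECT copies the rows but none of the indexes.
      CREATE TABLE City_noidx AS SELECT * FROM City;

      -- With no index available, the plan has to scan the whole table.
      EXPLAIN SELECT * FROM City_noidx WHERE ID = 999;
      ```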
  23. How does NULL change things?
      Taking NOT NULL away from the ID field (plus the previous index) increases the estimated rows read to 4296! That is roughly 5.3% more rows than are actually in the file. Running ANALYZE TABLE reduces the count to 3816, which is still > 1.
  24. Both of the following return 1 row
  25. EXPLAIN PARTITIONS
      Add 12 hash partitions to City
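      A sketch of what this could look like, done on a copy so the sample table stays untouched. The table name `City_part` is hypothetical, and hashing on `ID` assumes `ID` is the table's only unique key. `EXPLAIN PARTITIONS` is the 5.x syntax used in this talk.

      ```sql
      -- CREATE TABLE ... LIKE copies columns and indexes (not foreign keys).
      CREATE TABLE City_part LIKE City;
      INSERT INTO City_part SELECT * FROM City;

      -- Spread the rows over 12 hash partitions keyed on ID.
      ALTER TABLE City_part PARTITION BY HASH (ID) PARTITIONS 12;

      -- The partitions column lists which partitions would be read.
      EXPLAIN PARTITIONS SELECT Name FROM City_part WHERE ID = 999;
      ```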
  26. Some parts of your query may be hidden!!
  27. Latin1 versus UTF8
      Create a copy of the City table but with the UTF8 character set replacing Latin1. The key_len of the three-character key grows from three bytes to nine. That is more data to read and more to compare, which is noticeably slower.
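      A possible way to reproduce the comparison, assuming the original `City` table is latin1 with an index on `CountryCode`, as in the talk. The table name `City_utf8` is hypothetical.

      ```sql
      -- Copy the structure, switch the copy to utf8, then copy the rows.
      CREATE TABLE City_utf8 LIKE City;
      ALTER TABLE City_utf8 CONVERT TO CHARACTER SET utf8;
      INSERT INTO City_utf8 SELECT * FROM City;

      -- Compare the key_len column of these two plans.
      EXPLAIN SELECT Name FROM City WHERE CountryCode = 'USA';
      EXPLAIN SELECT Name FROM City_utf8 WHERE CountryCode = 'USA';
      ```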
  28. INDEX Length
      If a new index is created on CountryCode with a length of 2 characters, does it work as well as the original 3 chars?
  29. Forcing use of new shorter index ...
      Still generates a guesstimate that 39 rows must be read. In some cases there is performance to be gained in using shorter indexes.
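      The shorter index and the forced plan might look like this. The index name `cc2` and the WHERE value are assumptions; the slide's own statement is not in the transcript.

      ```sql
      -- Prefix index covering only the first 2 characters of CountryCode.
      CREATE INDEX cc2 ON City (CountryCode(2));

      -- FORCE INDEX makes the optimizer use cc2 instead of the full-length index.
      EXPLAIN SELECT Name FROM City FORCE INDEX (cc2) WHERE CountryCode = 'USA';
      ```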
  30. Subqueries
      Subqueries are run as part of EXPLAIN execution and may cause significant overhead, so be careful when testing. Note here that #1 is not using an index. That is why we recommend rewriting subqueries as joins.
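      As an illustration of that rewrite (not the query from the slide, which is not in the transcript), here is an IN subquery against the `world` sample tables and an equivalent join. The two return the same cities because `Country.Code` is a primary key, so the join cannot duplicate rows.

      ```sql
      -- Subquery form.
      EXPLAIN SELECT Name FROM City
      WHERE CountryCode IN (SELECT Code FROM Country WHERE Continent = 'Europe');

      -- Join form of the same question.
      EXPLAIN SELECT City.Name
      FROM City JOIN Country ON City.CountryCode = Country.Code
      WHERE Country.Continent = 'Europe';
      ```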
  31. EXAMPLE of covering Indexing
      In this case, adding an index reduces the reads from 239 to 42. Can we do better for this query?
  32. Index on both Continent and Government Form
      With both Continent and GovernmentForm indexed together, we go from 42 rows read to 19.
      • "Using index" means the data is retrieved from the index, not the table (good).
      • "Using index condition" means the evaluation is pushed down to the storage engine. This can reduce storage engine reads of the table and server reads of the storage engine (not bad).
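      A sketch of a compound index of this kind on the `world.Country` sample table. The index name `idx_cont_gov` and the literal values are assumptions, since the slide's exact query is not in the transcript.

      ```sql
      -- One index over both columns, Continent first.
      CREATE INDEX idx_cont_gov ON Country (Continent, GovernmentForm);

      -- Selects only indexed columns, so the index alone can answer it;
      -- check the Extra column for "Using index".
      EXPLAIN SELECT Continent, GovernmentForm FROM Country
      WHERE Continent = 'Asia' AND GovernmentForm = 'Republic';
      ```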
  33. Extra ***
      USING INDEX: getting data from the index rather than the table
      USING FILESORT: sorting was needed rather than using an index. Uses file system (slow). ORDER BY can use indexes.
      USING TEMPORARY: a temp table was created; see tmp_table_size and max_heap_table_size
      USING WHERE: filter outside storage engine
      USING JOIN BUFFER: means no index used
  34. Avoiding file sort
      Select from attendees where Id = 2 order by Name
      Add index sss (ID,Name)
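      A runnable sketch of this fix. The `attendees` schema and rows below are hypothetical (the slide does not show them), and the selected column is an assumption.

      ```sql
      -- Hypothetical table and data, just enough to produce a plan.
      CREATE TABLE attendees (ID INT NOT NULL, Name VARCHAR(64) NOT NULL);
      INSERT INTO attendees VALUES (1, 'Ann'), (2, 'Bob'), (2, 'Al'), (3, 'Cy');

      -- Before: no index, so expect a full scan plus "Using filesort" in Extra.
      EXPLAIN SELECT Name FROM attendees WHERE ID = 2 ORDER BY Name;

      -- Index from the slide: rows with the same ID are stored sorted by Name.
      ALTER TABLE attendees ADD INDEX sss (ID, Name);

      -- After: the sort can be satisfied by reading the index in order.
      EXPLAIN SELECT Name FROM attendees WHERE ID = 2 ORDER BY Name;
      ```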
  35. Things can get messy!
  36. STRAIGHT_JOIN forces order of tables
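      For illustration, using the `world` sample tables (the slide's own query is not in the transcript): with STRAIGHT_JOIN the tables are read in the order they are written, here `Country` first.

      ```sql
      -- Let the optimizer pick the join order.
      EXPLAIN SELECT City.Name, Country.Name
      FROM Country JOIN City ON City.CountryCode = Country.Code;

      -- Force Country to be read before City.
      EXPLAIN SELECT STRAIGHT_JOIN City.Name, Country.Name
      FROM Country JOIN City ON City.CountryCode = Country.Code;
      ```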
  37. Index Hints
      Use only as a last resort: shifts in data can make this the long way around.
          index_hint:
              USE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
            | IGNORE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
            | FORCE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
      hints.html
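      Two of the hint forms in use, as a sketch. It assumes `world.City` has an index named `CountryCode` on the column of the same name; `PRIMARY` is MySQL's name for the primary key index.

      ```sql
      -- Suggest an index; the optimizer may still prefer a table scan.
      EXPLAIN SELECT Name FROM City USE INDEX (CountryCode)
      WHERE CountryCode = 'USA';

      -- Take the primary key out of consideration for this statement.
      EXPLAIN SELECT Name FROM City IGNORE INDEX (PRIMARY)
      WHERE ID = 999;
      ```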
  38. Controlling the Optimizer
          mysql> SELECT @@optimizer_switch\G
          *************************** 1. row ***************************
          @@optimizer_switch: index_merge=on,index_merge_union=on,
                              index_merge_sort_union=on,
                              index_merge_intersection=on,
                              engine_condition_pushdown=on,
                              index_condition_pushdown=on,
                              mrr=on,mrr_cost_based=on,
                              block_nested_loop=on,batched_key_access=off
      You can turn on or off certain optimizer settings for GLOBAL or SESSION. See the MySQL Manual and know your mileage may vary.
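      A short sketch of flipping one flag for the current connection only. The choice of `index_merge` is arbitrary, and `\G` is a `mysql` command-line client terminator.

      ```sql
      -- Turn one optimization off for this session; other flags are unchanged.
      SET SESSION optimizer_switch = 'index_merge=off';

      -- Confirm the change.
      SELECT @@optimizer_switch\G

      -- Put every flag back to its default value.
      SET SESSION optimizer_switch = 'default';
      ```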
  39. Things to watch
          mysqladmin -r -i 10 extended-status
      Slow_queries: number in last period
      Select_scan: full table scans
      Select_full_join: full scans to complete
      Created_tmp_disk_tables: filesorts
      Key_read_requests/Key_write_requests: read/write weighting of application, may need to modify application
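      The same counters can also be read from inside a client session, sketched below. Unlike `mysqladmin -r`, which prints the change between samples, these values are cumulative since the server started.

      ```sql
      SHOW GLOBAL STATUS WHERE Variable_name IN
        ('Slow_queries', 'Select_scan', 'Select_full_join',
         'Created_tmp_disk_tables', 'Key_read_requests', 'Key_write_requests');
      ```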
  40. Optimizer Tracing (5.6.3 onward)
          SET optimizer_trace="enabled=on";
          SELECT Name FROM City WHERE ID=999;
          SELECT trace INTO DUMPFILE '/tmp/foo' FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
      Shows more logic than EXPLAIN. The output shows much deeper detail on how the optimizer chooses to process a query. This level of detail is well past the level for this presentation.
  41. Sample from the trace, but no clues on optimizing for Joe Average DBA
  42. Final Thoughts on EXPLAIN
      1. READ chapter 7 of the MySQL Manual
      2. Add index on columns in WHERE clause
      3. Run ANALYZE TABLE periodically
      4. Adjust buffer pool size, minimize disk I/O
  43. Common Programming Problems
      1. SCRUB Your Data
  44. 2. Check return codes
          // $link is the connection returned by mysqli_connect()
          $result = mysqli_query($link, "SELECT Id FROM City WHERE Id = $city_id");
          if ($result) {
              // Code where query executed
          } else {
              // Code when query did not execute
          }
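  45. 3. Use Prepared Statements
          // $mysqli is an open mysqli connection object
          if ($stmt = $mysqli->prepare("INSERT INTO FOO VALUES (?,?,?)")) {
              // Parameters are bound by reference, so they can be set afterwards
              $stmt->bind_param('ssd', $first, $last, $age);
              $first = 'Joe';
              $last = 'Jones';
              $age = 22;
              $stmt->execute();
              if (!mysqli_stmt_affected_rows($stmt)) {
                  // PROBLEM
              }
          }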
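      The same idea can be tried directly in a `mysql` client session with server-side prepared statements. This is a sketch only: it assumes a table `FOO` with three compatible columns already exists, and the statement name `ins` and the user variables are made up for the example.

      ```sql
      -- Prepare once; the ? markers are placeholders, never spliced-in text.
      PREPARE ins FROM 'INSERT INTO FOO VALUES (?, ?, ?)';

      -- Supply the values through user variables.
      SET @first = 'Joe', @last = 'Jones', @age = 22;
      EXECUTE ins USING @first, @last, @age;

      -- Release the statement when finished.
      DEALLOCATE PREPARE ins;
      ```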
  46. 4. Be careful reporting problems
          mysqli_query($link, "DROP TABLE");
          $result = mysqli_stmt_execute($stmt);
          if (!$result) {
              // Did NOT EXECUTE
              printf("Error: %s.\n", mysqli_stmt_error($stmt));
          }
      Can give hackers clues!
  47. 5. Ask for what you need for speed
      SLOW!
          SELECT * FROM foo WHERE id = $id
      FASTER!!
          SELECT Name, Phone, Customer_id FROM foo WHERE id=$id
  48. Q&A