DPC Tutorial
Presentation given with Ray.

Published in: Technology
Slide notes
  • Depending on the context, good computer performance may involve one or more of the following:
    - Short response time
    - High throughput (rate of processing work)
    - Low utilization of computing resource(s)
    - High availability
    - High bandwidth / short data transmission time
    http://en.wikipedia.org/wiki/Performance_%28Computer%29
  • So the first thing you need to do is define exactly what you want to improve. We will obviously be working on improving the performance of the database, but because these areas overlap a lot, we will be talking about more than just the MySQL server along the way.
  • Some are easier to use than others. A common tool in the MySQL world is sysbench. I have also used mysqlslap or Apache Bench for quick load testing.
  • Generally speaking, MySQL plays better on Linux; however, there has been a big push on improving how it works on Windows, so if that is your preferred system there is no reason not to use it. There is no real difference between distros, but you want to use the latest version from mysql.com, not what is in the distro repositories. There have been problems with distros placing files in odd places, adding my.cnf files, or altering scripts (e.g. Debian altering the start script to check all tables, which can take a lot of time if you have a large number of tables: http://www.mysqlperformanceblog.com/2009/01/28/the-perils-of-innodb-with-debian-and-startup-scripts/)
  • CAVEAT: these settings are for a dedicated DB server.
    - cfq (completely fair queuing) should not be used for a database.
    - The noop scheduler is the simplest I/O scheduler for the Linux kernel, based on a FIFO queue concept (it may reorder but does not wait).
    - The deadline scheduler attempts to guarantee a start service time for each request.
    http://www.cyberciti.biz/faq/linux-change-io-scheduler-for-harddisk/
    http://dom.as/2008/02/05/linux-io-schedulers/
  • Generally speaking, with database servers you want as much RAM as you can get, so the data fits into it. If you can't fit all the data into RAM, you want fast disks, since you will be reading data in and out of them as needed.
    - RAID 10, battery-backed write cache.
    - Since MySQL is a single-process, multi-threaded server, the general advice is to prefer faster CPUs rather than more of them. This is changing, however, from 5.1 + the InnoDB plugin and 5.5 onwards.
  • There are a couple of basic Linux commands that you want to become familiar with (in terms of performance).
    - top helps you watch how things are going at a pretty general level: CPU, load, memory.
    - iostat is good for watching your I/O: r/s, w/s, await (how long it took to respond to requests), svctm (how long the request actually took), %util (utilization).
    - vmstat shows how memory, swap, processes, and CPU are being used (amongst other things).
    If you are not already familiar with these commands, you need to learn them. dstat is a Python script that combines top, iostat, and vmstat (apt-get/yum install dstat).
  • Setting the path so the general log *can* be turned on, but having it disabled at startup, can be a huge win in development/troubleshooting time.
    - Slow logging should be on, always, along with long_query_time - which now supports microseconds.
    - The binary log should always be on IMO, and with no file extension.
    - log_queries_not_using_indexes should only be used together with min_examined_row_limit.
  • "Once" in that there is only one default key buffer, but there can be many named key caches.
    - Be careful with per-session sizes: a 2M join_buffer with 1k connections is 2GB.
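The per-session arithmetic in that warning can be checked on the command line; the 2 MiB buffer and 1,000 connections are the note's own example figures:

```shell
# Worst case for a per-session buffer: size x max connections.
# A 2 MiB join_buffer_size with 1000 connections:
awk 'BEGIN { printf "%.1f GiB\n", (2 * 1024 * 1024 * 1000) / (1024 ^ 3) }'
```

This is an upper bound, since per-session buffers are only allocated when a query actually needs them.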
  • expire_logs_days - make sure you have a backup before the logs are expired.
  • *_buffer_size - noted here, but should be controlled on a per-session basis.
    - SHOW GLOBAL STATUS LIKE 'Opened_tables'
    - thread_cache_size can be very helpful for PHP apps, nearly useless for pool managers.
  • This is no longer the default engine; as of 5.5, InnoDB is the default.
  • If a concurrent_insert of 2 is used, then OPTIMIZE TABLE should be run more frequently.
  • With larger buffer pool sizes, the reallocation of the space can take a while.
    - innodb_flush_log_at_trx_commit - only set to 0 or 2 if you have a battery-backed write cache (this removes ACID compliance, since it moves durability from the DB to the hardware).
  • Doublewrite buffer - leave it on unless you can control that with your battery-backed write cache unit.
  • GROUP BY / ORDER BY will create tmp tables.
    - Remember: if max_heap_table_size < tmp_table_size, then tmp tables are still limited to max_heap_table_size.
  • We are basically at max table_cache, so bump it up some.
  • Knowing about your environment is key. Here the numbers are skewed because of the DBCP request on a Java component restart.
  • Increase when using GROUP BY / ORDER BY.
  • Efficient resource usage:
    - Less I/O, since we are working with indexes rather than all the data - potentially fitting all the relevant data in RAM.
    - Less CPU - again, because we are using indexes.
    - Since things are happening faster, we are holding any locks for less time.
    All this allows us to potentially have more concurrency with the same response times.
  • Entries are not added until after they are executed and all locks are released.
    - Time to acquire locks is not counted in execution time.
    - long_query_time can be microseconds as of 5.1.21.
  • log_slow_queries is deprecated.
  • The default is file.
    - Microsecond resolution for long_query_time is only supported when logging to a file; it is ignored when logging to tables.
    - Logging to a table uses significantly more server overhead. However, the table option does provide you with the power of the SQL query language - be careful, though: there are limitations on what you can and can't do with the table. If you are thinking of using the table format, be sure to look over what can and cannot be done in the manual[1]. It also uses the CSV engine, so you can open the data in anything that can read that format.
    [1] http://dev.mysql.com/doc/refman/5.5/en/log-destinations.html
  • NONE, if present, overrides everything else.
    - The default is FILE if no value is given.
    - Multiple values can be given, separated by a comma.
    - If no log file name and path is given, it defaults to host_name-slow.log in the data directory.
  • You can turn on the slow query log dynamically - no restart needed. It is possible to turn it on during a time when you have repeated problems, for example.
  • This is the output of the slow query log in its raw form - I just opened up the .log file and looked at what was inside. Note the time, the query itself, the Query_time, the Lock_time, and Rows_examined.
  • Groups queries that are similar except for the particular values of number and string data. It "abstracts" these values to N and 'S' when displaying summary output.
  • General summary information from the file. After this comes information on the specific queries.
  • By default mysqldumpslow sorts by average query time. You can also sort by lock time, rows sent, or the count. We are going to work on the query with the highest count.
  • In order to tune a query we need a number of things. Each of them provides information to help you out.
    - CREATE TABLE - structure of the table: indexes available, datatypes, storage engine.
    - TABLE STATUS - number of rows, storage engine, collation, create options, comments.
    - SHOW INDEXES - key name, sequence in the index, collation, cardinality, index type.
    - EXPLAIN - the execution plan. We will go into greater detail on this, since its information tells you how MySQL will handle the query. Used with the other information, you can then work to optimize the query.
  • EXPLAIN EXTENDED works with SHOW WARNINGS. UPDATE/DELETE statements can also be rewritten as SELECTs that work with EXPLAIN.
  • A derived table is a subquery in the FROM clause that creates a temporary table.
  • SIMPLE - normal
    PRIMARY - outer SELECT
    DERIVED - subquery in the FROM clause that makes a temp table
    UNION - self-explanatory
    SUBQUERY - not in a FROM clause
    (UNION and SUBQUERY can be DEPENDENT if they are linked to the outer query)
  • Point 2 - regardless of what order they are in within the query. Point 3 - there are no indexes on derived tables.
    NOTE: it may be better to explicitly make a temp table and index it than to use a derived table.
  • system/const - only one row, from a MEMORY table or PRIMARY index
    eq_ref - index lookup with one row returned
    ref - similar to eq_ref but can return more than one row
    ref_or_null - similar to ref but allows null values or conditions
    index_merge - allows you to use two indexes on one table
    range - for range conditions (<, >, <=, >=, LIKE, IN, BETWEEN)
    index - full index scan
    ALL - full table scan
  • If you have a list of possible keys but the key chosen is NULL, it may be that you have hit the threshold where it is better to do the sequential scan rather than the random I/O for the amount of data to be read.
  • A VARCHAR(32) can be seen as 96 bytes because a multi-byte character set such as utf8 can use up to 3 bytes per character.
  • Each engine updates its own statistics, with varying levels of accuracy.
  • filesort - does not mean it has to be on disk; it can be in RAM.
  • Ambiguous table aliases; ugly as sin; keywords not in uppercase; theta join instead of ANSI; a subquery; ORDER BY rand(); DISTINCT in the inner query - is it really needed?
  • For the sample query.
  • Steps to optimize the query.
  • We have dependent subqueries... and to confuse the situation even more, we do not know if it is a correlated subquery, since we use the same table aliases inside and outside it. This is bad!!!
  • We are moving the subquery from the WHERE clause to the FROM clause. With the subquery in the WHERE clause we potentially cause the subquery to be run for each matching record (Rows_examined in the slow query log was ~4.5 million... OUCH). By moving it to the FROM clause, we will only run the subquery once. I am also removing the ORDER BY rand(), since it would have to generate random numbers for ~4.5 million rows (keep in mind collisions).
  • But now we are using the speaker_id column for JOINing and querying on.
  • But now we are using the speaker_id column for JOINing and querying on. We need to make sure it is indexed.
  • Looking at the CREATE TABLE for talk_speaker, the speaker_id column does not seem to be indexed, so we need to add an index for it.
  • Looking at the CREATE TABLE for talk_speaker, the speaker_id column does not seem to be indexed, so we need to add an index for it. We do that with an ALTER statement. Again, we can handle this a couple of ways: we can just add a speaker_id index, or we can see if a composite index of (speaker_id, talk_id) would be more beneficial. This would have to be tested to see which index helps us more.
  • Demo time!
  • Transcript

    • 1. Optimizing MySQL Essentials
    • 2. About Me• Ligaya Turmelle• Senior Technical Support Engineer• MySQL Now (AKA Oracle)• ~3 years
    • 3. Agenda• Basics• Hardware and OS• Server • variables• Queries • slow query log and EXPLAIN • example query
    • 4. Basics - 1+1 = 3
    • 5. • Definition of Performance: Characterized by the amount of useful work accomplished by a computer system compared to the time and resources used.
    • 6. This can cover a lot of different areas (potentially at the same time) in a LAMP stack• Application• Network• Hardware• Database
    • 7. Learn how to benchmark• There are lots of benchmarking tools out there.• Use what is appropriate to measure what you are trying to improve
    • 8. OS and Hardware
    • 9. OS• Linux vs. Windows• Linux vs. Linux
    • 10. Linux• swappiness = 0• IO scheduler • HDD = deadline • SSD or you have a good RAID controller = noop• You can use ext3 - but not the best
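A sketch of how these settings are commonly applied on Linux (the device name sda is a placeholder; settings made this way do not survive a reboot):

```shell
# Keep the DB from being swapped out
sysctl -w vm.swappiness=0

# The active I/O scheduler is shown in brackets
cat /sys/block/sda/queue/scheduler

# HDD: deadline; SSD or good RAID controller: noop
echo deadline > /sys/block/sda/queue/scheduler
```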
    • 11. Hardware• RAM• RAM• RAM• Disks• CPU• And did I mention RAM
    • 12. Basic Linux commands• top• iostat• vmstat• dstat
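Typical invocations of the commands on this slide (the 5-second interval is just a common choice):

```shell
top           # per-process CPU, load average, memory
iostat -x 5   # extended per-device I/O stats every 5s: r/s, w/s, await, svctm, %util
vmstat 5      # memory, swap, processes, CPU every 5s
dstat         # combines the above (apt-get/yum install dstat)
```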
    • 13. Server Settings So what should I tweak?
    • 14. Knobs, Levers and Buttons• mysqld• MyISAM• InnoDB• Others
    • 15. MySQLD - what knobs to turn?
    • 16. MySQLD Many think that yum install / apt-get install and they’re done.• No, we’re just beginning! • Logging • Memory Usage: • Global • Session • Character Sets
    • 17. MySQLD ( logging )• log_bin = <filename>• general_log = 0 general_log_file = <filename>• slow_query_log = 1 slow_query_log_file = <filename> long_query_time = 1• log_queries_not_using_indexes min_examined_row_limit = 100
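Collected as a my.cnf fragment; the file names are placeholders, and the comments reflect the speaker notes (general log path set but disabled, binary log with no file extension):

```ini
[mysqld]
log_bin                       = mysql-bin     # no file extension
general_log                   = 0             # path configured, enable only when needed
general_log_file              = general.log
slow_query_log                = 1
slow_query_log_file           = slow.log
long_query_time               = 1             # supports microseconds, e.g. 0.5
log_queries_not_using_indexes = 1
min_examined_row_limit        = 100
```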
    • 18. MySQLD ( memory usage )• Memory is allocated... • Globally - server wide, usually allocated only at server startup and exist only “once” • Session - each connection or event • Both - In these cases the global variable value ( join_buffer_size ) is used if a new value is not set during the session.
    • 19. MySQLD ( memory / global )• connect_timeout ( 10 )• expire_logs_days ( 7 )• query_cache_type = 0 ( yes, 0! )• wait_timeout = 3• skip_name_resolve = 1• performance_schema = 1
    • 20. MySQLD ( memory / global )• max_heap_table_size / tmp_table_size• table_open_cache• thread_cache_size• read_buffer_size / read_rnd_buffer_size• sort_buffer_size• join_buffer_size ( the buffer sizes are also configurable per session )
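Since the buffers here are also configurable per session, the usual pattern is a small global value raised only for the session that needs it; the 16M figure below is just an illustration:

```sql
-- Raise the sort buffer for one expensive report query only
SET SESSION sort_buffer_size = 16 * 1024 * 1024;
-- ... run the big ORDER BY / GROUP BY query ...
SET SESSION sort_buffer_size = DEFAULT;
```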
    • 21. MyISAM - what levers to pull
    • 22. MyISAM
    • 23. MyISAM• Switch to using InnoDB
    • 24. MyISAM• Switch to using InnoDB • Not an option?
    • 25. MyISAM• Switch to using InnoDB • Not an option?• Switch to using InnoDB
    • 26. MyISAM• Switch to using InnoDB • Not an option?• Switch to using InnoDB • Still not an option?
    • 27. MyISAM• Switch to using InnoDB • Not an option?• Switch to using InnoDB • Still not an option?• The BLACKHOLE storage engine is really fast
    • 28. MyISAM• Switch to using InnoDB • Not an option?• Switch to using InnoDB • Still not an option?• The BLACKHOLE storage engine is really fast• Still stuck?
    • 29. MyISAM• switch to mongodb, it’s webscale! http://nosql.mypopescu.com/post/1016320617/mongodb-is-web-scale
    • 30. MyISAM• concurrent_insert • 0 - Never, 1 - Hole free, 2 - Holes• key_buffer_size • Key_reads / Key_read_requests • Key_writes / Key_write_requests
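The Key_reads / Key_read_requests ratio on this slide is a cache miss rate; with made-up counter values it can be checked like this (a common rule of thumb is to keep it well under 1%):

```shell
# key cache miss ratio = Key_reads / Key_read_requests
# illustrative counters: 6000 misses out of 4,000,000 requests
awk 'BEGIN { printf "%.4f\n", 6000 / 4000000 }'
```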
    • 31. InnoDB - what buttons to press
    • 32. InnoDB Many options, only a few covered• innodb_buffer_pool_size - caches both indexes and data - upwards of 85% of available RAM• innodb_flush_log_at_trx_commit - 0 - written and flushed to disk once per second - 1 - written and flushed to disk at each transaction commit - 2 - written to disk at each commit, flushed once a second
    • 33. InnoDB• innodb_log_buffer_size - Amount of data written to ib_logfile - The larger, the less disk I/O• innodb_log_file_size - The larger the file, the more InnoDB can do - < 5.5 the larger, the longer crash recovery time• innodb_doublewrite - data integrity checking, costs IOPS• innodb_file_per_table - data and indexes written to their own .ibd file
    • 34. Examine, Test and Measure “Dr. Watson I Presume”
    • 35. MySQLD ( quick check )Quick Check - mysql-tuning-primerhttps://launchpad.net/mysql-tuning-primer
    • 36. MySQLD ( in depth 1/11 ) Getting in depth - SHOW GLOBAL STATUS
+--------------------------+------------+
| Variable_name            | Value      |
+--------------------------+------------+
| Com_alter_table          | 184        |
| Com_begin                | 103281460  |
| Com_change_db            | 631        |
| Com_commit               | 669825867  |
| Com_create_index         | 0          |
| Com_create_table         | 354        |
| Com_delete               | 1235248406 |
| Com_delete_multi         | 353434291  |
| Com_drop_table           | 359        |
| Com_flush                | 3          |
| Com_insert               | 2725147239 |
| Com_insert_select        | 9          |
| Com_kill                 | 3642       |
| Com_load                 | 15         |
| Com_lock_tables          | 18         |
| Com_replace              | 37372      |
| Com_replace_select       | 0          |
| Com_rollback             | 148312594  |
| Com_select               | 7164448573 |
| Com_set_option           | 1877614377 |
| Com_show_collations      | 130652029  |
| Com_show_fields          | 694        |
| Com_show_grants          | 36         |
| Com_show_processlist     | 33184      |
| Com_show_slave_hosts     | 1          |
| Com_show_variables       | 130885618  |
| Com_show_warnings        | 99619      |
| Com_truncate             | 11         |
| Com_unlock_tables        | 18         |
| Com_update               | 1348570903 |
| Com_update_multi         | 8          |
+--------------------------+------------+
    • 37. MySQLD ( in depth 2/11 ) Values without a timeframe are meaningless. 10:21:08 rderoo@mysql09:mysql [1169]> SHOW GLOBAL STATUS LIKE 'Uptime';+---------------+----------+| Variable_name | Value |+---------------+----------+| Uptime | 12973903 |+---------------+----------+1 row in set (0.00 sec) That’s about 150 days. :)
    • 38. MySQLD ( in depth 3/11 ) By using mysqladmin we can observe changes over short periods of time:$ mysqladmin -u rderoo -p -c 2 -i 10 -r extended-status > review.txt Output is the same as SHOW GLOBAL STATUS, with two copies appearing in review.txt: the first has values since the last server restart / FLUSH STATUS; the second is the delta between the first and 10 seconds later.
    • 39. MySQLD ( in depth 4/11 ) Temporary Tables: How big?17:54:33 rderoo@mysql09:event [1185]> SHOW GLOBAL VARIABLES LIKE '%_table_size';+---------------------+----------+| Variable_name | Value |+---------------------+----------+| max_heap_table_size | 67108864 || tmp_table_size | 67108864 |+---------------------+----------+2 rows in set (0.00 sec) How many?17:55:09 rderoo@mysql09:event [1186]> SHOW GLOBAL STATUS LIKE '%_tmp%tables';+-------------------------+-----------+| Variable_name | Value |+-------------------------+-----------+| Created_tmp_disk_tables | 156 || Created_tmp_tables | 278190736 |+-------------------------+-----------+2 rows in set (0.00 sec)
    • 40. MySQLD ( in depth 5/11 ) The max temporary table size is 64MB. In 150 days 278 million temp tables were created, with only 156 going to disk. This is well tuned!
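With the counters from the previous slide, the disk-spill rate works out like this:

```shell
# Created_tmp_disk_tables / Created_tmp_tables, as a percentage
awk 'BEGIN { printf "%.6f%%\n", (156 / 278190736) * 100 }'
```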
    • 41. MySQLD ( in depth 6/11 ) open_table_cache| Open_tables | 4094 || Opened_tables | 12639 || Uptime | 12980297 | • This yields a rate of ~85 table opens per day, a bit high...20:52:22 rderoo@mysql09:mysql [1190]> SHOW GLOBAL VARIABLES LIKE 'table_cache';+---------------+-------+| Variable_name | Value |+---------------+-------+| table_cache | 4096 |+---------------+-------+ • Increasing the table_cache would be advisable.
    • 42. MySQLD ( in depth 7/11 ) thread_cache_size • To determine the effectiveness of the thread_cache as a hit ratio use the following formula: 100-((Threads_created/Connections)*100)| Connections | 131877233 || Threads_created | 20014624 |100 - ( ( 20014624 / 131877233 ) * 100 ) = 84.8% • Increasing the thread_cache_size would seem to be advisable.
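The slide's formula, checked with its own numbers:

```shell
# thread cache hit ratio = 100 - ((Threads_created / Connections) * 100)
awk 'BEGIN { printf "%.1f%%\n", 100 - (20014624 / 131877233) * 100 }'
```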
    • 43. MySQLD ( in depth 8/11 ) read_buffer_size • The effectiveness can be found by examining the Select_scan :| Select_scan | 304864263 | • This should be set to a small value globally and increased on a per session basis • Increase when performing full table scans
    • 44. MySQLD ( in depth 9/11)read_rnd_buffer_size• Used when reading sorted results after a key sort has taken place• This should be set to a small value globally and increased on a per session basis• Usually should be set ( per session ) no higher than 8M
    • 45. MySQLD ( in depth 10/11) sort_buffer_size • The effectiveness can be found by examining the Sort_merge_passes :| Sort_merge_passes | 7 | • This should be set to a small value globally and increased on a per session basis • Increase GROUP BY, ORDER BY, SELECT DISTINCT and UNION DISTINCT performance
    • 46. MySQLD ( in depth 11/11 ) join_buffer_size • The effectiveness can be found by examining the Select_full_join :| Select_full_join | 5369 | • A crutch used when proper indexing is not in place • Will help in cases of full table scans • This should be set to a small value globally and increased on a per session basis • Increase GROUP BY and ORDER BY performance
    • 47. Tuning Queries Where the magic happens
    • 48. Tuning your queries is the single biggest performance improvement you can make to your server• efficient resource usage• results returned quicker - Per QUERY!
    • 49. Where do I start?
    • 50. slow query log
    • 51. Slow query log• A log of all the “slow” queries• Define “slow” • took longer than long_query_time to run AND at least min_examined_row_limit rows are looked at • log_queries_not_using_indexes • log_slow_admin_statements
    • 52. Turning it on• Option is slow_query_log • no argument or value of 1 - enabled • argument of 0 - disabled (default)
    • 53. Output Types• Use log_output • write to a file • write to a table in the mysql database
    • 54. slow log at startup• Place the slow_query_log and optional log_output options in my.cnf• log_output acceptable values • TABLE, FILE, NONE• Can also use slow_query_log_file to specify a log file name
    • 55. slow log at runtime• All the below global variables can be changed at run time • log_output • slow_query_log • slow_query_log_file SET GLOBAL slow_query_log = 1;
    • 56. /usr/sbin/mysqld, Version: 5.0.67-0ubuntu6.1-log ((Ubuntu)). started with:Tcp port: 3306 Unix socket: /var/run/mysqld/mysqld.sockTime Id Command Argument# Time: 110328 13:15:05# User@Host: homer[homer] @ localhost []# Query_time: 3 Lock_time: 0 Rows_sent: 0 Rows_examined: 4452238use confs;select distinct u.ID as user_id, t.event_id, u.username, u.full_name from user u, talks t, talk_speaker ts where u.ID <> 6198 and u.ID = ts.speaker_id and t.ID = ts.talk_id and t.event_id in ( select distinct t.event_id from talk_speaker ts, talks t where ts.speaker_id = 6198 and t.ID = ts.talk_id ) order by rand() limit 15;
    • 57. slow query log EXPLAINed
    • 58. There are multiple ways of getting information out of the slow query log.• just open it up to see the raw data• usually more helpful to summarize/aggregate the raw data
    • 59. Tools for aggregating the slow log data• mk-query-digest • third party • part of the maatkit utilities• mysqldumpslow • comes with MySQL in the bin dir
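Typical invocations of both tools (the log path is illustrative):

```shell
# top 10 queries sorted by count, as used on the next slides
mysqldumpslow -s c -t 10 /var/log/mysql/mysql-slow.log

# maatkit's digest, ranked by total response time
mk-query-digest /var/log/mysql/mysql-slow.log
```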
    • 60. Macintosh-7:joindin-for-lig ligaya$ perl -laF ~/mysql_installs/maatkit/mk-query-digest mysql-slow.log# 23s user time, 240ms system time, 17.53M rss, 86.51M vsz# Current date: Mon May 16 11:08:24 2011# Hostname: Macintosh-7.local# Files: mysql-slow.log# Overall: 73.72k total, 14 unique, 0.02 QPS, 0.04x concurrency __________# Time range: 2011-03-28 13:15:05 to 2011-05-13 02:33:52# Attribute total min max avg 95% stddev median# ============ ======= ======= ======= ======= ======= ======= =======# Exec time 160130s 0 180s 2s 3s 819ms 2s# Lock time 11s 0 1s 149us 0 12ms 0# Rows sent 387.64k 0 255 5.38 14.52 6.86 0# Rows examine 338.66G 0 5.35M 4.70M 4.93M 387.82k 4.70M# Query size 30.19M 17 8.35k 429.47 420.77 29.33 420.77# Profile# Rank Query ID Response time Calls R/Call Apdx V/M Item# ==== ================== ================= ===== ======= ==== ===== =====# 1 0x2AAA5EC420D7E2A2 159878.0000 99.8% 73682 2.1698 0.50 0.12 SELECTuser talks talk_speaker talks# 3 0x0A2584C03C8614A9 50.0000 0.0% 16 3.1250 0.44 0.87 SELECTwp_comments# MISC 0xMISC 202.0000 0.1% 20 10.1000 NS 0.0 <12ITEMS>
    • 61. Count: 1 Time=180.00s (180s) Lock=0.00s (0s) Rows=1.0 (1), XXXXX[xxxxx]@localhost SELECT SLEEP(N)Count: 16 Time=3.12s (50s) Lock=0.00s (0s) Rows=0.9 (14), XXXXX[xxxxx]@localhost SELECT comment_date_gmt FROM wp_comments WHERE comment_author_IP = S ORcomment_author_email = S ORDER BY comment_date DESC LIMIT NCount: 5 Time=2.80s (14s) Lock=0.00s (0s) Rows=0.0 (0), XXXXX[xxxxx]@localhost SELECT comment_ID FROM wp_comments WHERE comment_post_ID = S AND ( comment_author= S OR comment_author_email = S ) AND comment_content = S LIMIT NCount: 73682 Time=2.17s (159867s) Lock=0.00s (11s) Rows=5.4 (396223),XXXXX[xxxxx]@localhost select distinct u.ID as user_id, t.event_id, u.username, u.full_name from user u, talks t, talk_speaker ts where u.ID <> N and u.ID = ts.speaker_id and t.ID = ts.talk_id and t.event_id in ( select distinct t.event_id from talk_speaker ts, talks t where ts.speaker_id = N and t.ID = ts.talk_id ) order by rand() limit N
    • 62. Tuning a query• Tables • SHOW CREATE TABLE `talks`; • SHOW TABLE STATUS LIKE ‘talks’;• Indexes • SHOW INDEXES FROM ‘talks’;• EXPLAIN
    • 63. EXPLAIN
    • 64. EXPLAIN Basics• Syntax: EXPLAIN [EXTENDED] SELECT select_options• Displays information from the optimizer about the query execution plan• Works only with SELECT statements
    • 65. EXPLAIN Output• Each row provides information on one table• Output columns: id, select_type, table, type, possible_keys, key, key_len, ref, rows, filtered (new in 5.1), Extra
    • 66. EXPLAIN Output
    • 67. id• Select identifier• Only if there are subqueries, derived tables or unions is this incremented• Number reflects the order that the SELECT/FROM was done in
    • 68. select_type• Type of SELECT • SIMPLE • PRIMARY • DERIVED • UNION • SUBQUERY
    • 69. table• The table (or alias) that is used• Read down - these are the order the tables are JOINed• DERIVED tables are noted and numbered
    • 70. type• Access type• Various types: preference access type BEST system/const eq_ref ref ref_or_null index_merge range index WORST ALL
    • 71. possible_keys• List of all possible indexes (or NULL) that optimizer considered using
    • 72. key• Index used for the query or NULL• Look at key and possible_keys to consider your index strategy
    • 73. key_len• Shows the number of bytes MySQL will use from the index• Possible that only part of the index will be used NOTE: a character set may use more than one byte per character
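For example, with utf8 in MySQL (up to 3 bytes per character), an index on a VARCHAR(32) column is sized for the worst case, matching the 96-byte figure in the speaker notes (ignoring the length/NULL prefix bytes):

```shell
# 32 characters x 3 bytes per utf8 character
awk 'BEGIN { print 32 * 3 " bytes" }'
```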
    • 74. ref• Very different from the access type ‘ref’• Shows which columns or a constant within the index will be used for the access
    • 75. rows• Number of rows MySQL expects to find based on statistics • statistics can be updated with ANALYZE TABLE
    • 76. filtered• New in 5.1.12• Estimated number of rows filtered by the table condition• Shows up if you use EXPLAIN EXTENDED
    • 77. Extra• All the other stuff • Using index - using a covering index • Using filesort - sorting in a temporary table • Using temporary - a temporary table was made. • Using where - filtering outside the storage engine.
    • 78. Query Tuning Example
    • 79. Query we are working with:
EXPLAIN select distinct u.ID as user_id, t.event_id, u.username, u.full_name
from user u, talks t, talk_speaker ts
where u.ID <> 11945 and
      u.ID = ts.speaker_id and
      t.ID = ts.talk_id and
      t.event_id in ( select distinct t.event_id
                      from talk_speaker ts, talks t
                      where ts.speaker_id = 11945 and
                            t.ID = ts.talk_id )
order by rand()
limit 15G
    • 80. Full EXPLAIN plan
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t
         type: index
possible_keys: PRIMARY
          key: idx_event
      key_len: 9
          ref: NULL
         rows: 3119
        Extra: Using where; Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: ts
         type: ref
possible_keys: talk_id
          key: talk_id
      key_len: 5
          ref: joindin.t.ID
         rows: 1
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: PRIMARY
        table: u
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: joindin.ts.speaker_id
         rows: 1
        Extra:
    • 81. Full EXPLAIN plan (con’t)
*************************** 4. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: t
         type: ref
possible_keys: PRIMARY,idx_event
          key: idx_event
      key_len: 5
          ref: func
         rows: 24
        Extra: Using where; Using index; Using temporary
*************************** 5. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: ts
         type: ref
possible_keys: talk_id
          key: talk_id
      key_len: 5
          ref: joindin.t.ID
         rows: 1
        Extra: Using where
5 rows in set (0.00 sec)
    • 82. What was wrong with that query plan?
    • 83. Hint: the subquery in the WHERE clause. The two EXPLAIN rows with select_type DEPENDENT SUBQUERY (rows 4 and 5, tables t and ts) correspond to the t.event_id in ( select distinct t.event_id from talk_speaker ts, talks t where ts.speaker_id = 11945 and t.ID = ts.talk_id ) part of the query.
    • 84. Solution 1:
SELECT DISTINCT u.ID as user_id, t.event_id, u.username, u.full_name
FROM user u
  JOIN talk_speaker ts ON u.ID = ts.speaker_id
  JOIN talks t ON t.ID = ts.talk_id
  JOIN -- find the events the speaker has talks in
    ( SELECT t1.event_id
      FROM talk_speaker ts1
      JOIN talks t1 ON t1.ID = ts1.talk_id
      WHERE ts1.speaker_id = 11945 ) as e
    ON t.event_id = e.event_id
WHERE u.ID <> 11945
LIMIT 15G
    • 87. Solution (con’t) - reminder:
CREATE TABLE `talk_speaker` (
  `talk_id` int(11) DEFAULT NULL,
  `speaker_name` varchar(200) DEFAULT NULL,
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `speaker_id` int(11) DEFAULT NULL,
  `status` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`ID`),
  KEY `talk_id` (`talk_id`) USING BTREE
) ENGINE=MyISAM AUTO_INCREMENT=3763 DEFAULT CHARSET=utf8;
    • 88. Solution (con’t) - 2:
ALTER TABLE `talk_speaker` ADD INDEX (speaker_id);
-- OR
ALTER TABLE `talk_speaker` ADD INDEX (speaker_id, talk_id);
    • 89. More?
    • 90. Q&A
    • 91. Thank youhttp://joind.in/3215
