Explain that explain

Transcript

  • 1. Explain that “Explain”
    The Road to Understanding
    “You Should Not Fight with the Database, the Database is your Friend”
    Put together by Fabrizio Parrella
    Parts are quoted, copied, or referenced from:
    http://devzone.zend.com/1436/the-zendcon-sessions-episode-17-sql-query-tuning-the-legend-of-drunken-query-master/
    http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
  • 2. get to know your friend
    ➲ Recognize the strengths and also the weaknesses of your database
    ➲ No database is perfect -- deal with it, you're not perfect either
    ➲ Think of both big things and small things
      BIG: Architecture, surrounding servers, caching
      SMALL: SQL coding, join rewrites, server config
  • 3. becoming friends
    ➲ Understand storage engine abilities and weaknesses
    ➲ Understand how the query cache and the important buffers work
    ➲ Understand the optimizer's limitations
    ➲ Understand what should and should not be done at the application level
    ➲ If you understand the above, you'll start to see the database as a friend and not an enemy
  • 4. the schema
    ➲ Basic foundation of performance
    ➲ Everything else depends on it
    ➲ Choose your data types wisely
    ➲ “Divide et Impera” the schema through partitioning
    A divide and conquer (D&C) algorithm works by recursively breaking down a problem into two or more sub-problems of the same (or related) type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.
    http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm
  • 5. size does matter !!
    smaller, smaller, SMALLER
    The more records you can fit into a single page of memory/disk, the faster your seeks and scans will be
    ➲ Do you really need that BIGINT?
    ➲ Use INT UNSIGNED for IPv4 addresses
    ➲ Use VARCHAR carefully
      Converted to CHAR when used in a temporary table
    ➲ Use TEXT sparingly
      Consider separate tables
    ➲ Use BLOBs very sparingly
      Use the filesystem for what it was intended
  • 6. real life example... handling IPv4 addresses
      CREATE TABLE Sessions (
        session_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
        ip_address INT UNSIGNED NOT NULL, -- Compare to CHAR(15)...
        session_data TEXT NOT NULL,
        PRIMARY KEY (session_id),
        INDEX (ip_address)
      ) ENGINE=InnoDB;

      -- Insert a new dummy record
      INSERT INTO Sessions VALUES
        (NULL, INET_ATON('192.168.0.2'), 'some session data');

      SELECT
        session_id,
        ip_address AS ip_raw,
        INET_NTOA(ip_address) AS ip,
        session_data
      FROM Sessions
      WHERE
        ip_address BETWEEN INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255');

      +------------+------------+-------------+-------------------+
      | session_id | ip_raw     | ip          | session_data      |
      +------------+------------+-------------+-------------------+
      |          1 | 3232235522 | 192.168.0.2 | some session data |
      +------------+------------+-------------+-------------------+
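The integer shown as ip_raw on this slide can be reproduced outside the database. The following is a minimal Python sketch, assuming full dotted-quad input only; the helper names `inet_aton` and `inet_ntoa` are borrowed from the MySQL functions for readability and are not part of any MySQL client library.

```python
import ipaddress


def inet_aton(dotted: str) -> int:
    """Convert a dotted-quad IPv4 string to its 32-bit integer form."""
    return int(ipaddress.IPv4Address(dotted))


def inet_ntoa(number: int) -> str:
    """Convert a 32-bit integer back to a dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(number))


ip_raw = inet_aton("192.168.0.2")
print(ip_raw)             # 3232235522, the ip_raw value on the slide
print(inet_ntoa(ip_raw))  # 192.168.0.2

# The BETWEEN filter becomes a plain integer range comparison.
low, high = inet_aton("192.168.0.1"), inet_aton("192.168.0.255")
print(low <= ip_raw <= high)  # True
```

Four bytes per address, compared as integers, is what makes the range query above index-friendly, whereas CHAR(15) values sort as text.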
  • 7. SETs and ENUMs
    ➲ Often a sign of poor schema design
    ➲ Changing the definition will most likely require a full rebuild of the table
    ➲ Search functions like FIND_IN_SET() are inefficient compared to an index operation on a join
  • 8. normalization, taking it too far
    Date, Date?
    http://thedailywtf.com/forums/thread/75982.aspx
  • 9. vertical partitioning
    ➲ Never mix frequently and infrequently accessed fields in a single table
    ➲ Splitting tables allows main records to consume the buffer pages without the extra data taking up space in memory
    ➲ Do you need FULLTEXT on your text columns (pre 5.6.4)?

      -- Before: one wide table
      CREATE TABLE Users (
        user_id INT NOT NULL AUTO_INCREMENT,
        email VARCHAR(80) NOT NULL,
        display_name VARCHAR(50) NOT NULL,
        password CHAR(41) NOT NULL,
        first_name VARCHAR(25) NOT NULL,
        last_name VARCHAR(25) NOT NULL,
        address VARCHAR(80) NOT NULL,
        city VARCHAR(30) NOT NULL,
        province CHAR(2) NOT NULL,
        postcode CHAR(7) NOT NULL,
        interests TEXT NULL,
        bio TEXT NULL,
        signature TEXT NULL,
        skills TEXT NULL,
        PRIMARY KEY (user_id),
        UNIQUE INDEX (email)
      ) ENGINE=InnoDB;

      -- After: frequently accessed fields stay in Users
      CREATE TABLE Users (
        user_id INT NOT NULL AUTO_INCREMENT,
        email VARCHAR(80) NOT NULL,
        display_name VARCHAR(50) NOT NULL,
        password CHAR(41) NOT NULL,
        PRIMARY KEY (user_id),
        UNIQUE INDEX (email)
      ) ENGINE=InnoDB;

      -- After: infrequently accessed fields move to UserExtra
      CREATE TABLE UserExtra (
        user_id INT NOT NULL,
        first_name VARCHAR(25) NOT NULL,
        last_name VARCHAR(25) NOT NULL,
        address VARCHAR(80) NOT NULL,
        city VARCHAR(30) NOT NULL,
        province CHAR(2) NOT NULL,
        postcode CHAR(7) NOT NULL,
        interests TEXT NULL,
        bio TEXT NULL,
        signature TEXT NULL,
        skills TEXT NULL,
        PRIMARY KEY (user_id),
        FULLTEXT KEY (interests, skills)
      ) ENGINE=MyISAM;
  • 10. understand MySQL query cache
    ➲ You must understand your application's read/write patterns
    ➲ Internal query cache design is a compromise between CPU usage and read performance
    ➲ Stores the MYSQL_RESULT of a SELECT along with a hash of the SELECT SQL statement
    ➲ Any modification to any table involved in the SELECT invalidates the stored result
    ➲ Write applications to be aware of the query cache
      Use SELECT SQL_NO_CACHE
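The two behaviours described on this slide (results keyed by a hash of the statement text, and invalidation on any write to an involved table) can be illustrated with a toy model. This is a sketch in Python only; `ToyQueryCache` and its methods are hypothetical names, and MySQL's real implementation is considerably more involved.

```python
import hashlib


class ToyQueryCache:
    """Toy model: results keyed by a hash of the exact SQL text."""

    def __init__(self):
        self._entries = {}  # sql hash -> (tables involved, cached result)

    @staticmethod
    def _key(sql):
        return hashlib.sha1(sql.encode("utf-8")).hexdigest()

    def get(self, sql):
        entry = self._entries.get(self._key(sql))
        return None if entry is None else entry[1]

    def put(self, sql, tables, result):
        self._entries[self._key(sql)] = (frozenset(tables), result)

    def invalidate(self, table):
        """Drop every cached result that involves the modified table."""
        self._entries = {
            key: entry for key, entry in self._entries.items()
            if table not in entry[0]
        }


cache = ToyQueryCache()
cache.put("SELECT * FROM orders", ["orders"], [(1,)])

hit = cache.get("SELECT * FROM orders")        # same text: cache hit
miss = cache.get("select * from orders")       # different text: miss
cache.invalidate("orders")                     # any write to orders
after_write = cache.get("SELECT * FROM orders")
print(hit, miss, after_write)
```

Because the key is derived from the statement text, two statements that differ only in letter case or spacing occupy separate entries, which is one more reason to write SQL consistently.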
  • 11. coding like a master
    ➲ Be consistent (for crying out loud)
    ➲ Use ANSI SQL coding style (vs. Theta)
    ➲ Stop thinking in terms of iterators, for loops, while loops, etc.
    ➲ Instead, think in terms of sets
    ➲ Break complex SQL statements (or business requests) into smaller, manageable chunks
  • 12. Consistency, consistency, CONSISTENCY !!
    ➲ Tabs and Spacing
    ➲ Upper and Lower Case
    ➲ Keywords, function names
    ➲ Aliases
    ➲ Consider your teammates
    ➲ Like your code, SQL is meant to be read, not written
    Nothing pisses off the query master like inconsistent SQL code!

      SELECT
        a.first_name,
        a.last_name,
        COUNT(*) as num_rentals
      FROM actor a
      INNER JOIN film f ON a.actor_id = f.actor_id
      GROUP BY a.actor_id
      ORDER BY
        num_rentals DESC,
        a.last_name,
        a.first_name
      LIMIT 10;

    vs.

      select first_name, a.last_name,
      count(*) AS num_rentals
      FROM actor a join film on a.actor_id = film.actor_id
      group by a.actor_id order by
      num_rentals DESC, a.last_name, a.first_name
      LIMIT 10;
  • 13. guidelines
    ➲ Beware of join hints
      “force index” can get “out of date”
    ➲ Just because it can be done in a single SQL statement doesn't mean it should
    ➲ ALWAYS test and benchmark your solution
  • 14. ANSI vs. THETA
    ANSI STYLE
    Explicitly declare JOIN conditions using the ON clause

      SELECT
        a.first_name,
        a.last_name,
        COUNT(*) as num_rentals
      FROM actor a
      INNER JOIN film_actor fa ON a.actor_id = fa.actor_id
      INNER JOIN film f ON fa.film_id = f.film_id
      INNER JOIN inventory i ON f.film_id = i.film_id
      INNER JOIN rental r ON r.inventory_id = i.inventory_id
      GROUP BY a.actor_id
      ORDER BY
        num_rentals DESC,
        a.last_name,
        a.first_name
      LIMIT 10;

    THETA STYLE
    Implicitly declare JOIN conditions in the WHERE clause

      SELECT
        a.first_name,
        a.last_name,
        COUNT(*) as num_rentals
      FROM
        actor a,
        film f,
        film_actor fa,
        inventory i,
        rental r
      WHERE
        a.actor_id = fa.actor_id
        AND fa.film_id = f.film_id
        AND f.film_id = i.film_id
        AND r.inventory_id = i.inventory_id
      GROUP BY a.actor_id
      ORDER BY
        num_rentals DESC,
        a.last_name,
        a.first_name
      LIMIT 10;
  • 15. why ANSI style kicks THETA style's A55
    ➲ MySQL THETA style only supports INNER and CROSS joins
      But MySQL ANSI style supports INNER, CROSS, LEFT, RIGHT, and NATURAL joins
      Mixing and matching both styles can lead to hard-to-read SQL code
    ➲ It is extremely easy to miss a join condition with THETA style
      Especially when joining many tables
      Forgetting a join condition will produce a cartesian product (NOT GOOD !!!)
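The cartesian-product risk is easy to reproduce on a tiny data set. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere); the three tables and their rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    INSERT INTO actor VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO film VALUES (10, 'X'), (20, 'Y');
    INSERT INTO film_actor VALUES (1, 10), (2, 10), (3, 20);
""")

# Theta style with every join condition present.
complete = conn.execute("""
    SELECT COUNT(*)
    FROM actor a, film_actor fa, film f
    WHERE a.actor_id = fa.actor_id
      AND fa.film_id = f.film_id
""").fetchone()[0]

# Same query with the film condition forgotten: every matched
# actor row is paired with every film row.
forgotten = conn.execute("""
    SELECT COUNT(*)
    FROM actor a, film_actor fa, film f
    WHERE a.actor_id = fa.actor_id
""").fetchone()[0]

print(complete, forgotten)  # 3 6
```

With ANSI style the same omission is a syntax-level gap (a JOIN without its ON clause) that is much easier to spot in review.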
  • 16. WITHOUT THE STRENGTH OF EXPLAIN YOU WILL GET LOST IN THE FIELDS OF MISUNDERSTANDING
    how to test our SQL
  • 17. EXPLAIN the basics
    ➲ Provides the execution plan chosen by the MySQL optimizer
    ➲ Simply prepend the word EXPLAIN to your SELECT statement
    ➲ Each row represents a set of information for each table used in the SELECT
  • 18. EXPLAIN the columns
    ➲ select_type - type of “set” the data in this row contains (SIMPLE, DERIVED, SUBQUERY, etc.)
    ➲ table - alias (or full table name if no alias) of the table or derived table from which the data in this set comes
    ➲ type - “access strategy” used to grab the data in this set (ALL, CONST, REF, etc.)
    ➲ possible_keys - keys available to the optimizer for the query
    ➲ key - key chosen by the optimizer
    ➲ key_len - number of bytes used from the key
    ➲ ref - shows the column used in join relations
    ➲ rows - estimate of the number of rows in this set
    ➲ Extra - information the optimizer chooses to give you
  • 19. EXPLAIN the output
      EXPLAIN
      SELECT
        a.first_name,
        a.last_name,
        COUNT(*) as num_rentals
      FROM film f
      INNER JOIN film_category fc ON f.film_id = fc.film_id
      INNER JOIN category c ON fc.category_id = c.category_id
      WHERE f.title LIKE 'T%'\G
      *************************** 1. row ***************************
      select_type: SIMPLE
      table: c
      type: ALL
      possible_keys: PRIMARY
      key: NULL
      key_len: NULL
      ref: NULL
      rows: 16
      Extra:
      *************************** 2. row ***************************
      select_type: SIMPLE
      table: fc
      type: ref
      possible_keys: PRIMARY, fk_film_category_category
      key: fk_film_category_category
      key_len: 1
      ref: c.category_id
      rows: 1
      Extra: Using index
      *************************** 3. row ***************************
      select_type: SIMPLE
      table: f
      type: eq_ref
      possible_keys: PRIMARY, idx_title
      key: PRIMARY
      key_len: 2
      ref: fc.film_id
      rows: 1
      Extra: Using where
    Callouts on the slide:
      rows: estimated row count
      possible_keys and key: available indexes and the chosen one
      Extra: Using index means a covering index was used
  • 20. EXPLAIN a real world example
      CREATE TABLE `attendees` (
        `attendee_id` int(11) NOT NULL,
        `lastname` varchar(50) NOT NULL,
        `conference_id` int(11) NOT NULL,
        `registration_status` tinyint(4) NOT NULL,
        PRIMARY KEY (`attendee_id`)
      ) ENGINE=InnoDB;

      CREATE TABLE `conferences` (
        `conference_id` int(11) NOT NULL,
        `location_id` int(11) NOT NULL,
        `topic_id` int(11) NOT NULL,
        `date` date NOT NULL,
        PRIMARY KEY (`conference_id`)
      ) ENGINE=InnoDB;

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        conference_id = 123
        AND registration_status > 0
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: attendees
      possible_keys: NULL
      key: NULL
      rows: 14052
    ➲ The three most important columns returned by EXPLAIN
      possible_keys: all possible indexes which MySQL could have used
        Based on a series of very quick lookups and calculations
      key: chosen key
      rows: estimate of the scanned rows
  • 21. EXPLAIN a real world example
    (attendees and conferences tables as defined on slide 20)
      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        conference_id = 123
        AND registration_status > 0
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: attendees
      possible_keys: NULL
      key: NULL
      rows: 14052
    ➲ Interpreting the result:
      No suitable indexes for this query
      MySQL has to do a full scan of the table
      Full table scans are almost always the slowest
      Full table scans are usually an indication that an index is needed
  • 22. EXPLAIN a real world example
    (attendees and conferences tables as defined on slide 20)
      ALTER TABLE attendees
        ADD INDEX conf (conference_id),
        ADD INDEX reg (registration_status);

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        conference_id = 123
        AND registration_status > 0
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: attendees
      possible_keys: conf, reg
      key: conf
      rows: 331
    ➲ MySQL has two indexes to choose from
    ➲ “reg” is not “sufficiently unique”
      The spread of the values can also be a factor (e.g. when 99% of rows contain the same value)
    ➲ Index “uniqueness” is called cardinality
    ➲ There is room for a performance increase
  • 23. EXPLAIN a real world example
    (attendees and conferences tables as defined on slide 20)
      ALTER TABLE attendees
        ADD INDEX reg_conf_index (registration_status, conference_id);

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        conference_id = 123
        AND registration_status > 0
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: attendees
      possible_keys: reg, conf, reg_conf_index
      key: reg_conf_index
      rows: 204
    ➲ “reg_conf_index” is a much better choice
    ➲ Other keys are still available, just not as effective
  • 24. EXPLAIN a real world example
    (attendees and conferences tables as defined on slide 20)
      ALTER TABLE attendees
        DROP INDEX reg,
        DROP INDEX conf;

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        registration_status = 2
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: attendees
      possible_keys: reg_conf_index
      key: reg_conf_index
      rows: 372
    ➲ It seems that even without the “reg” index everything works as expected
  • 25. EXPLAIN a real world example
    (attendees and conferences tables as defined on slide 20)
      ALTER TABLE attendees
        DROP INDEX reg,
        DROP INDEX conf;

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        conference_id = 123
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: attendees
      possible_keys: NULL
      key: NULL
      rows: 14502
    ➲ Without the “conf” index we are back at square one
    ➲ The order in which the fields are defined in a composite index affects whether it is available to a query
    ➲ Potential workaround
      SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0;
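The leftmost-column rule can be observed with any B-tree based planner. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; this is an assumption made for portability, and SQLite's planner and output format differ from MySQL's. The helper `plan` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendees (
        attendee_id INTEGER PRIMARY KEY,
        lastname TEXT NOT NULL,
        conference_id INTEGER NOT NULL,
        registration_status INTEGER NOT NULL
    );
    CREATE INDEX reg_conf_index
        ON attendees (registration_status, conference_id);
""")


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail


# Filter on the leading column of the index: the index is usable.
leading = plan("SELECT * FROM attendees WHERE registration_status = 2")

# Filter on the second column only: the index cannot be used for lookup.
trailing = plan("SELECT * FROM attendees WHERE conference_id = 123")

# The slide's workaround: mention the leading column as well.
workaround = plan(
    "SELECT * FROM attendees "
    "WHERE conference_id = 123 AND registration_status > 0"
)

print(leading)
print(trailing)
print(workaround)
```

The first plan names reg_conf_index while the second falls back to scanning the table, mirroring slides 24 and 25.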
  • 26. EXPLAIN a real world example
    (attendees and conferences tables as defined on slide 20)
      ALTER TABLE attendees
        ADD INDEX lastname (lastname);

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        lastname LIKE 'parr%'
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: attendees
      possible_keys: lastname
      key: lastname
      rows: 234
    ➲ Great, MySQL is using the index on “lastname”, which is good
  • 27. EXPLAIN a real world example
    (attendees and conferences tables as defined on slide 20)
      ALTER TABLE attendees
        ADD INDEX lastname (lastname);

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        lastname LIKE '%arr%'
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: attendees
      possible_keys: NULL
      key: NULL
      rows: 14052
    ➲ MySQL doesn't even try to use an index!
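Why a leading wildcard defeats the index can be seen by treating an index as a sorted list. The Python sketch below is an analogy rather than MySQL's implementation: a prefix maps to one contiguous slice that binary search can find, while a fragment in the middle of the string can sit anywhere, so every entry has to be inspected. The names list is made up.

```python
import bisect

# A sorted list stands in for the B-tree index on lastname.
index = sorted(
    ["parker", "parr", "parrella", "parrish", "carr", "smith", "starr"]
)


def prefix_search(sorted_names, prefix):
    """LIKE 'prefix%': two binary searches bound one contiguous slice."""
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = bisect.bisect_left(sorted_names, prefix + "\uffff")
    return sorted_names[lo:hi]


def infix_search(names, fragment):
    """LIKE '%fragment%': no ordering helps, every entry is checked."""
    return [name for name in names if fragment in name]


print(prefix_search(index, "parr"))  # ['parr', 'parrella', 'parrish']
print(infix_search(index, "arr"))
# ['carr', 'parr', 'parrella', 'parrish', 'starr']
```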
  • 28. EXPLAIN a real world example (pre MySQL 5.1)
    (attendees and conferences tables as defined on slide 20)
      ALTER TABLE conferences
        ADD INDEX location_id (location_id),
        ADD INDEX topic_id (topic_id);

      EXPLAIN
      SELECT *
      FROM conferences
      WHERE
        location_id = 2
        OR topic_id IN (4,6,1)
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: conferences
      possible_keys: location_id, topic_id
      key: NULL
      rows: 5043
    ➲ MySQL doesn't use an index because of the OR
    ➲ MySQL performs a full table scan
    ➲ Workaround: use “UNION”
    ➲ Workaround: add a composite INDEX
      ALTER TABLE conferences
        ADD INDEX location_topic (location_id, topic_id);
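The UNION workaround relies on the two forms returning the same rows. The sketch below checks that on a handful of made-up rows, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for portability); the index names `idx_location` and `idx_topic` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE conferences (
        conference_id INTEGER PRIMARY KEY,
        location_id INTEGER NOT NULL,
        topic_id INTEGER NOT NULL
    );
    CREATE INDEX idx_location ON conferences (location_id);
    CREATE INDEX idx_topic ON conferences (topic_id);
    INSERT INTO conferences VALUES
        (1, 2, 4), (2, 2, 9), (3, 5, 6), (4, 5, 7), (5, 3, 1);
""")

# Original form: one statement with OR across two indexed columns.
with_or = conn.execute("""
    SELECT conference_id FROM conferences
    WHERE location_id = 2 OR topic_id IN (4, 6, 1)
    ORDER BY conference_id
""").fetchall()

# Rewritten form: each branch can use its own index; UNION removes
# rows matched by both branches (conference 1 here).
with_union = conn.execute("""
    SELECT conference_id FROM conferences WHERE location_id = 2
    UNION
    SELECT conference_id FROM conferences WHERE topic_id IN (4, 6, 1)
    ORDER BY conference_id
""").fetchall()

print(with_or)     # [(1,), (2,), (3,), (5,)]
print(with_union)  # [(1,), (2,), (3,), (5,)]
```

UNION ALL would be cheaper but would return conference 1 twice, so plain UNION is the form that matches the OR semantics.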
  • 29. EXPLAIN a real world example
    (attendees and conferences tables as defined on slide 20)
      EXPLAIN
      SELECT *
      FROM conferences c
      INNER JOIN attendees a USING (conference_id)
      WHERE
        c.location_id = 2
        AND c.topic_id IN (4,6,1)
        AND a.registration_status > 1
      -- Let's only show the important parts for now
      *************************** 1. row ***************************
      table: c
      possible_keys: conference_topic
      key: conference_topic
      rows: 15
      *************************** 2. row ***************************
      table: a
      possible_keys: NULL
      key: NULL
      rows: 14502
    ➲ Looks like we need an index on “conference_id” on attendees
    ➲ How many total rows are estimated?
      15 x 14502 = 217,530
  • 30. EXPLAIN the “type”
  • 31. EXPLAIN the type
    ➲ CONST: SELECT * FROM table WHERE field = 'value';
      The field needs to be indexed with a unique non-nullable key
      If non-unique or nullable the type will be “ref”
      It refers to when a table with a single row is referenced in the SELECT
      Can be propagated across multiple joined columns:

      EXPLAIN
      SELECT r.*
      FROM rental r
      INNER JOIN customer c ON r.customer_id = c.customer_id
      WHERE r.rental_id = 13\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: r
      type: const
      possible_keys: PRIMARY,idx_fk_customer_id
      key: PRIMARY
      key_len: 4
      ref: const
      rows: 1
      Extra:
      *************************** 2. row ***************************
      id: 1
      select_type: SIMPLE
      table: c
      type: const
      possible_keys: PRIMARY
      key: PRIMARY
      key_len: 2
      ref: const /* Here is where the propagation occurs... */
      rows: 1
      Extra:
      2 rows in set (0.00 sec)
  • 32. EXPLAIN the type
    ➲ RANGE: SELECT * FROM table WHERE field BETWEEN 'value' AND 'value';
      The field needs to be indexed
      If too many records are estimated, it won't be used

      EXPLAIN
      SELECT *
      FROM rental
      WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: rental
      type: range
      possible_keys: rental_date
      key: rental_date
      key_len: 8
      ref: NULL
      rows: 364
      Extra: Using where
      1 row in set (0.00 sec)
  • 33. EXPLAIN the type
    ➲ ALL: SELECT * FROM table WHERE field BETWEEN 'value' AND 'far away from starting value';
      No WHERE condition (duh)
      No index on the field in the WHERE condition
      Poor selectivity on the indexed field
      Too many records meet the WHERE condition
      SEEK: jumps into random places to fetch the data and repeats for each piece of data needed
      SCAN: jumps to the start and sequentially reads the data
      For large amounts of data, SCAN operations tend to be more efficient than multiple SEEK operations
      Using SELECT * FROM

      EXPLAIN
      SELECT *
      FROM rental
      WHERE rental_date BETWEEN '2001-01-14' AND '2012-12-31'\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: rental
      type: ALL
      possible_keys: rental_date /* large range forces a full scan */
      key: NULL
      key_len: NULL
      ref: NULL
      rows: 16298
      Extra: Using where
      1 row in set (0.00 sec)
  • 34. EXPLAIN the type
    ➲ INDEX_MERGE: SELECT * FROM table WHERE field = 'value' AND field1 = 'value';
      Introduced with the optimizer in MySQL 5.0
      Allows the optimizer to use more than one index to satisfy a join condition
      Prior to MySQL 5.0, only one index could be used
      In case of OR conditions, MySQL < 5.0 would use a full table scan

      EXPLAIN
      SELECT *
      FROM rental
      WHERE
        rental_id IN (10,11,12)
        OR rental_date = '2006-02-01'\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: rental
      type: index_merge
      possible_keys: PRIMARY,rental_date
      key: rental_date,PRIMARY
      key_len: 8,4
      ref: NULL
      rows: 4
      Extra: Using sort_union(rental_date,PRIMARY); Using where
      1 row in set (0.02 sec)
  • 35. EXPLAIN the “Extra”
  • 36. EXPLAIN the Extra
    ➲ “Extra” shows additional operations invoked to get your result set
    ➲ Some common values are (more are discussed in the MySQL manual):
      Using where
      Using temporary table
      Using filesort
      Using index

      EXPLAIN
      SELECT *
      FROM rental
      WHERE
        rental_id IN (10,11,12)
        OR rental_date = '2006-02-01'\G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: rental
      type: index_merge
      possible_keys: PRIMARY,rental_date
      key: rental_date,PRIMARY
      key_len: 8,4
      ref: NULL
      rows: 4
      Extra: Using sort_union(rental_date,PRIMARY); Using where
      1 row in set (0.02 sec)
  • 37. EXPLAIN the Extra
    ➲ Using filesort: AVOID
    ➲ Avoid because
      Doesn't use an index for the sort
      Involves a full scan of the rows to be sorted
      Uses a generic algorithm (one fits all)
      May use the filesystem for large sorts (BAD !!)
      Gets slower with more data
    ➲ It's not all that bad
      Sometimes unavoidable - ORDER BY RAND()
      Acceptable provided you get to your result as quickly as possible, and keep it predictably small

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        conference_id = 123
      ORDER BY lastname
      *************************** 1. row ***************************
      table: attendees
      possible_keys: conference_id
      key: conference_id
      rows: 331
      Extra: Using filesort
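One common way to make the extra sort step disappear is an index that already delivers rows in the requested order. The sketch below demonstrates the effect with Python's built-in sqlite3 module as a stand-in for MySQL; this is an assumption, and SQLite reports the extra sort as "USE TEMP B-TREE FOR ORDER BY" rather than "Using filesort". The index names and the `plan_with` helper are hypothetical and do not come from the slides.

```python
import sqlite3

SCHEMA = """
    CREATE TABLE attendees (
        attendee_id INTEGER PRIMARY KEY,
        lastname TEXT NOT NULL,
        conference_id INTEGER NOT NULL
    );
"""
QUERY = "SELECT * FROM attendees WHERE conference_id = 123 ORDER BY lastname"


def plan_with(index_sql):
    """Build a fresh database with one index and return the query plan."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA + index_sql)
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    return " | ".join(row[-1] for row in rows)


# Index on the filter column only: rows are found, then sorted separately.
single = plan_with("CREATE INDEX conf ON attendees (conference_id);")

# Index on (filter column, sort column): rows come out already ordered.
composite = plan_with(
    "CREATE INDEX conf_lastname ON attendees (conference_id, lastname);"
)

print(single)
print(composite)
```

Whether the wider index is worth its write cost depends on how often the sorted query runs, so it should be benchmarked like any other change.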
  • 38. EXPLAIN the Extra
    ➲ Using index: GOOD
    ➲ Celebrate because
      MySQL got your results just by consulting the index
      MySQL didn't need to look at the table to get the results (opening the table is expensive)
      Fastest way to get your data
    ➲ Particularly useful...
      When you are interested in a single piece of data or an id
      When you are interested in COUNT(), SUM(), AVG(), etc. of a field

      EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
      *************************** 1. row ***************************
      table: attendees
      possible_keys: conference_id
      key: conference_id
      rows: 331
      Extra:
    Nothing is actually wrong with this query, it just could be quicker

      ALTER TABLE attendees ADD INDEX conf_age (conference_id, age);

      EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
      *************************** 1. row ***************************
      table: attendees
      possible_keys: conference_id, conf_age
      key: conf_age
      rows: 331
      Extra: Using index
    Aside from caching, this is the fastest way to get your data
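The covering-index effect can be reproduced with Python's built-in sqlite3 module, used here as a stand-in for MySQL (an assumption for portability). SQLite labels the case "USING COVERING INDEX" where MySQL prints "Using index". The `plan_with` helper and the single-column index name `conf` are hypothetical.

```python
import sqlite3

SCHEMA = """
    CREATE TABLE attendees (
        attendee_id INTEGER PRIMARY KEY,
        lastname TEXT NOT NULL,
        conference_id INTEGER NOT NULL,
        age INTEGER NOT NULL
    );
"""
QUERY = "SELECT AVG(age) FROM attendees WHERE conference_id = 123"


def plan_with(index_sql):
    """Build a fresh database with one index and return the query plan."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA + index_sql)
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    return " | ".join(row[-1] for row in rows)


# The index finds the rows, but age must still be read from the table.
before = plan_with("CREATE INDEX conf ON attendees (conference_id);")

# The index holds both columns the query touches: no table access needed.
after = plan_with("CREATE INDEX conf_age ON attendees (conference_id, age);")

print(before)
print(after)
```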
  • 39. INDEXES... your schema's phone book
    ➲ Speed up SELECTs, but slow down modifications
    ➲ Make sure you have indexes on columns used in WHERE, ON, and GROUP BY clauses
    ➲ Always ensure that JOIN conditions are indexed AND have identical data types
    ➲ Good keys:
      Selectivity: % of distinct values = distinct values / number of rows
        unique or primary keys are always 1
      Low selectivity:
        Maybe you can put it in a multi-column index
        Prefix? Suffix? It depends on your application
  • 40. indexed columns and functions don't mix
      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        LEFT(lastname, 2) = 'Pa'
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: film
      type: ALL
      possible_keys: NULL
      key: NULL
      key_len: NULL
      ref: NULL
      rows: 951
      Extra: Using where
    A full table scan is used because a function (LEFT) is operating on the lastname column.
    Let's fix this...

      EXPLAIN
      SELECT *
      FROM attendees
      WHERE
        lastname LIKE 'Pa%'
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: film
      type: range
      possible_keys: idx_title
      key: idx_title
      key_len: 767
      ref: NULL
      rows: 15
      Extra: Using where
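The same contrast can be reproduced with Python's built-in sqlite3 module, used as a stand-in for MySQL (an assumption for portability). Two substitutions are made because of SQLite specifics: `substr()` replaces `LEFT()`, and an explicit range replaces `LIKE 'Pa%'`, since SQLite's LIKE optimisation depends on collation settings. The index name `lastname_idx` and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendees (
        attendee_id INTEGER PRIMARY KEY,
        lastname TEXT NOT NULL,
        conference_id INTEGER NOT NULL
    );
    CREATE INDEX lastname_idx ON attendees (lastname);
""")


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Column wrapped in a function: the index on lastname cannot be searched.
wrapped = plan(
    "SELECT * FROM attendees WHERE substr(lastname, 1, 2) = 'Pa'"
)

# Bare column compared against constants: an index range search is possible.
bare = plan(
    "SELECT * FROM attendees WHERE lastname >= 'Pa' AND lastname < 'Pb'"
)

print(wrapped)
print(bare)
```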
  • 41. let's fix multiple issues with a SELECT query
      SELECT * FROM orders WHERE TO_DAYS(CURRENT_DATE()) - TO_DAYS(order_created) <= 7;
    First, we are operating on an indexed column (order_created) with a function - let's fix that:
      SELECT * FROM orders WHERE order_created >= CURRENT_DATE() - INTERVAL 7 DAY;
    Even though we removed the function from the indexed column in the WHERE expression, we still have a non-deterministic function (CURRENT_DATE()) in the statement, which prevents this query from being placed in the query cache - let's fix that:
      SELECT * FROM orders WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
    We replaced the function with a constant, however we are specifying SELECT * instead of the actual fields that we need.
    What if there is a TEXT field in the table that we don't need to see? Having it included in the result means a larger result set, which may not fit in the query cache and may force a disk-based temporary table - let's fix that:
      SELECT
        order_id,
        customer_id,
        order_total,
        date_created
      FROM orders
      WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
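The constant date in the last two statements has to come from somewhere, typically the application. A minimal Python sketch of that step follows; the function name `recent_orders_query` is hypothetical, and the `%s` placeholder assumes a MySQL driver that uses the "format" parameter style.

```python
from datetime import date, timedelta


def recent_orders_query(today, days=7):
    """Build the query with the cutoff date computed on the client.

    The statement sent to the server then contains a plain date
    constant instead of CURRENT_DATE().
    """
    cutoff = today - timedelta(days=days)
    sql = (
        "SELECT order_id, customer_id, order_total, date_created "
        "FROM orders "
        "WHERE order_created >= %s"
    )
    return sql, (cutoff.isoformat(),)


sql, params = recent_orders_query(date(2013, 1, 13))
print(sql)
print(params)  # ('2013-01-06',)
```

Since the cutoff only changes once per day, every request made on the same day produces the same statement and parameters.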
  • 42. good indexes vs. bad indexes
    Don't forget that MySQL string indexes allow only 1000 bytes (333 characters using UTF-8).
    Let's say you have 11,000,000 records in a table called “USERS” with the following fields:
    ➲ user, firstname, lastname, gender, email, age, country_id
    Our application performs searches on the following fields:
    ➲ user
    ➲ firstname, lastname, gender
    ➲ email
    It is obvious to create indexes on user and email, especially if they are unique, but what about the other fields?
    ➲ “gender” can be M or F, so selectivity is very low: 2/11,000,000 ≈ 0.
      Best would be to remove the index on gender if you have it
    ➲ “firstname”/“lastname” depend on the uniqueness of the values stored.
      Use SELECT DISTINCT to calculate the selectivity
      If it is above 15%, keep it
      Below 15%, you might want to create a composite INDEX
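The selectivity check and the 15% rule of thumb can be scripted. The sketch below uses Python's built-in sqlite3 module with 100 made-up rows as a stand-in for the USERS table (an assumption); `column_selectivity` and `advice` are hypothetical helpers, and treating exactly 15% as "keep" is an arbitrary choice the slide does not specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE users ("user" TEXT, firstname TEXT, gender TEXT)')
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(f"user{i}", f"name{i % 40}", "MF"[i % 2]) for i in range(100)],
)


def column_selectivity(column):
    """distinct values / number of rows, as defined on slide 39."""
    # Column names come from the fixed tuple below, never from user input.
    distinct, total = conn.execute(
        f'SELECT COUNT(DISTINCT "{column}"), COUNT(*) FROM users'
    ).fetchone()
    return distinct / total


def advice(sel, threshold=0.15):
    """Apply the slide's 15% rule of thumb."""
    if sel >= threshold:
        return "keep its own index"
    return "consider a composite index"


for column in ("user", "firstname", "gender"):
    sel = column_selectivity(column)
    print(column, sel, advice(sel))
```

On a table with millions of rows the COUNT(DISTINCT ...) itself is a full scan, so it is something to run occasionally rather than on every request.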
  • 43. removing crappy or redundant indexes
      SELECT
        t.TABLE_SCHEMA AS `db`,
        t.TABLE_NAME AS `table`,
        s.INDEX_NAME AS `index name`,
        s.COLUMN_NAME AS `field name`,
        s.SEQ_IN_INDEX `seq in index`,
        s2.max_columns AS `# cols`,
        s.CARDINALITY AS `card`,
        t.TABLE_ROWS AS `est rows`,
        ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %`
      FROM INFORMATION_SCHEMA.STATISTICS s
      INNER JOIN INFORMATION_SCHEMA.TABLES t
        ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
        AND s.TABLE_NAME = t.TABLE_NAME
      INNER JOIN (
        SELECT
          TABLE_SCHEMA,
          TABLE_NAME,
          INDEX_NAME,
          MAX(SEQ_IN_INDEX) AS max_columns
        FROM INFORMATION_SCHEMA.STATISTICS
        WHERE TABLE_SCHEMA != 'mysql'
        GROUP BY
          TABLE_SCHEMA,
          TABLE_NAME,
          INDEX_NAME
      ) AS s2
        ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA
        AND s.TABLE_NAME = s2.TABLE_NAME
        AND s.INDEX_NAME = s2.INDEX_NAME
      WHERE
        t.TABLE_SCHEMA != 'mysql' /* Filter out the mysql system DB */
        AND t.TABLE_ROWS > 10 /* Only tables with some rows */
        AND s.CARDINALITY IS NOT NULL /* Need at least one non-NULL value in the field */
        AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* unique indexes are perfect anyway */
      ORDER BY
        `sel %`, /* DESC for best non-unique indexes */
        s.TABLE_SCHEMA,
        s.TABLE_NAME
      LIMIT 100