Explain that explain
    Explain that explain: Presentation Transcript

    • Explain that “Explain”: The Road to Understanding
      “You Should Not Fight with the Database, the Database is your Friend”
      Put together by Fabrizio Parrella
      Parts are quoted, copied, or referenced from:
      http://devzone.zend.com/1436/the-zendcon-sessions-episode-17-sql-query-tuning-the-legend-of-drunken-query-master/
      http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
    • get to know your friend
      ➲ Recognize the strengths and also the weaknesses of your database
      ➲ No database is perfect -- deal with it, you're not perfect either
      ➲ Think of both big things and small things
        BIG: Architecture, surrounding servers, caching
        SMALL: SQL coding, join rewrites, server config
    • becoming friends
      ➲ Understand storage engine abilities and weaknesses
      ➲ Understand how the query cache and important buffers work
      ➲ Understand the optimizer's limitations
      ➲ Understand what should and should not be done at the application level
      ➲ If you understand the above, you'll start to see the database as a friend and not an enemy
    • the schema
      ➲ Basic foundation of performance
      ➲ Everything else depends on it
      ➲ Choose your data types wisely
      ➲ “Divide et Impera” the schema through partitioning
      A divide and conquer (D&C) algorithm works by recursively breaking down a problem into two or more sub-problems of the same (or related) type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.
      http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm
    • size does matter !!
      smaller, smaller, SMALLER
      The more records you can fit into a single page of memory/disk, the faster your seeks and scans will be
      ➲ Do you really need that BIGINT?
      ➲ Use INT UNSIGNED for IPv4 addresses
      ➲ Use VARCHAR carefully
        Converted to CHAR when used in a temporary table
      ➲ Use TEXT sparingly
        Consider separate tables
      ➲ Use BLOBs very sparingly
        Use the filesystem for what it was intended
    • real life example... handling IPv4 addresses
        CREATE TABLE Sessions (
          session_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
          ip_address INT UNSIGNED NOT NULL, -- Compare to CHAR(15)...
          session_data TEXT NOT NULL,
          PRIMARY KEY (session_id),
          INDEX (ip_address)
        ) ENGINE=InnoDB;

        -- Insert a new dummy record
        INSERT INTO Sessions VALUES (NULL, INET_ATON('192.168.0.2'), 'some session data');

        SELECT
          session_id,
          ip_address AS ip_raw,
          INET_NTOA(ip_address) AS ip,
          session_data
        FROM Sessions
        WHERE ip_address BETWEEN INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255');

        +------------+------------+-------------+-------------------+
        | session_id | ip_raw     | ip          | session_data      |
        +------------+------------+-------------+-------------------+
        |          1 | 3232235522 | 192.168.0.2 | some session data |
        +------------+------------+-------------+-------------------+
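The integer form used above can be reproduced outside the database. The following is a minimal Python sketch, assuming the standard `ipaddress` module; the helper names `inet_aton` and `inet_ntoa` are chosen here only to mirror the MySQL functions and are not part of the original slides.

```python
import ipaddress


def inet_aton(dotted: str) -> int:
    """Return the 32-bit unsigned integer for a dotted-quad IPv4 string."""
    return int(ipaddress.IPv4Address(dotted))


def inet_ntoa(value: int) -> str:
    """Return the dotted-quad string for a 32-bit unsigned integer."""
    return str(ipaddress.IPv4Address(value))


ip_raw = inet_aton("192.168.0.2")
print(ip_raw, inet_ntoa(ip_raw))  # 3232235522 192.168.0.2

# A subnet check becomes a plain integer range comparison,
# which is what the BETWEEN clause on the slide relies on.
low, high = inet_aton("192.168.0.1"), inet_aton("192.168.0.255")
in_range = low <= ip_raw <= high
```

Because the value fits in 4 bytes, the column and its index stay much smaller than a CHAR(15) equivalent.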
    • SETs and ENUMs
      ➲ Often a sign of poor schema design
      ➲ Changing the definition will most likely require a full rebuild of the table
      ➲ Search functions like FIND_IN_SET() are inefficient compared to index operations on a join
    • normalization, taking it too far
      [Figure labels: “Date”, “Date ?”]
      http://thedailywtf.com/forums/thread/75982.aspx
    • vertical partitioning
      ➲ Never mix frequently and infrequently accessed fields in a single table
      ➲ Splitting tables allows main records to consume the buffer pages without the extra data taking up space in memory
      ➲ Do you need FULLTEXT on your text columns (pre 5.6.4)?

      Single wide table:
        CREATE TABLE Users (
          user_id INT NOT NULL AUTO_INCREMENT,
          email VARCHAR(80) NOT NULL,
          display_name VARCHAR(50) NOT NULL,
          password CHAR(41) NOT NULL,
          first_name VARCHAR(25) NOT NULL,
          last_name VARCHAR(25) NOT NULL,
          address VARCHAR(80) NOT NULL,
          city VARCHAR(30) NOT NULL,
          province CHAR(2) NOT NULL,
          postcode CHAR(7) NOT NULL,
          interests TEXT NULL,
          bio TEXT NULL,
          signature TEXT NULL,
          skills TEXT NULL,
          PRIMARY KEY (user_id),
          UNIQUE INDEX (email)
        ) ENGINE=InnoDB;

      Split into two tables:
        CREATE TABLE Users (
          user_id INT NOT NULL AUTO_INCREMENT,
          email VARCHAR(80) NOT NULL,
          display_name VARCHAR(50) NOT NULL,
          password CHAR(41) NOT NULL,
          PRIMARY KEY (user_id),
          UNIQUE INDEX (email)
        ) ENGINE=InnoDB;

        CREATE TABLE UserExtra (
          user_id INT NOT NULL,
          first_name VARCHAR(25) NOT NULL,
          last_name VARCHAR(25) NOT NULL,
          address VARCHAR(80) NOT NULL,
          city VARCHAR(30) NOT NULL,
          province CHAR(2) NOT NULL,
          postcode CHAR(7) NOT NULL,
          interests TEXT NULL,
          bio TEXT NULL,
          signature TEXT NULL,
          skills TEXT NULL,
          PRIMARY KEY (user_id),
          FULLTEXT KEY (interests, skills)
        ) ENGINE=MyISAM;
    • understand MySQL query cache
      ➲ You must understand your application's read/write patterns
      ➲ Internal query cache design is a compromise between CPU usage and read performance
      ➲ Stores the MYSQL_RESULT of a SELECT along with a hash of the SELECT SQL statement
      ➲ Any modification to any table involved in the SELECT invalidates the stored result
      ➲ Write applications to be aware of the query cache
        Use SELECT SQL_NO_CACHE
    • coding like a master
      ➲ Be consistent (for crying out loud)
      ➲ Use ANSI SQL coding style (vs. Theta)
      ➲ Stop thinking in terms of iterators, for loops, while loops, etc.
      ➲ Instead, think in terms of sets
      ➲ Break complex SQL statements (or business requests) into smaller, manageable chunks
    • Consistency, consistency, CONSISTENCY !!
      ➲ Tabs and Spacing
      ➲ Upper and Lower Case
      ➲ Keywords, function names
      ➲ Aliases
      ➲ Consider your teammates
      ➲ Like your code, SQL is meant to be read, not written
      Nothing pisses off the query master like inconsistent SQL code!

        SELECT
          a.first_name,
          a.last_name,
          COUNT(*) as num_rentals
        FROM actor a
          INNER JOIN film f ON a.actor_id = f.actor_id
        GROUP BY a.actor_id
        ORDER BY
          num_rentals DESC,
          a.last_name,
          a.first_name
        LIMIT 10;

      vs.

        select first_name, a.last_name,
        count(*) AS num_rentals
        FROM actor a join film on a.actor_id = film.actor_id
        group by a.actor_id order by
        num_rentals DESC, a.last_name, a.first_name
        LIMIT 10;
    • guidelines
      ➲ Beware of join hints
        “force index” can get “out of date”
      ➲ Just because it can be done in a single SQL statement doesn't mean it should
      ➲ ALWAYS test and benchmark your solution
    • ANSI vs. THETA
      ANSI STYLE
      Explicitly declare JOIN conditions using the ON clause
        SELECT
          a.first_name,
          a.last_name,
          COUNT(*) as num_rentals
        FROM actor a
          INNER JOIN film_actor fa ON a.actor_id = fa.actor_id
          INNER JOIN film f ON fa.film_id = f.film_id
          INNER JOIN inventory i ON f.film_id = i.film_id
          INNER JOIN rental r ON r.inventory_id = i.inventory_id
        GROUP BY a.actor_id
        ORDER BY
          num_rentals DESC,
          a.last_name,
          a.first_name
        LIMIT 10;

      THETA STYLE
      Implicitly declare JOIN conditions in the WHERE clause
        SELECT
          a.first_name,
          a.last_name,
          COUNT(*) as num_rentals
        FROM
          actor a,
          film f,
          film_actor fa,
          inventory i,
          rental r
        WHERE
          a.actor_id = fa.actor_id
          AND fa.film_id = f.film_id
          AND f.film_id = i.film_id
          AND r.inventory_id = i.inventory_id
        GROUP BY a.actor_id
        ORDER BY
          num_rentals DESC,
          a.last_name,
          a.first_name
        LIMIT 10;
    • why ANSI style kicks THETA style's A55
      ➲ MySQL THETA style only supports INNER and CROSS joins
        But MySQL ANSI style supports INNER, CROSS, LEFT, RIGHT, and NATURAL joins
        Mixing and matching both styles can lead to hard-to-read SQL code
      ➲ It is extremely easy to miss a join condition with THETA style
        Especially when joining many tables
        Forgetting a join condition will produce a cartesian product (NOT GOOD !!!)
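To make the cartesian product risk concrete, here is a small runnable sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL (an assumption made purely so the example runs without a server), and the tiny `actor` / `film_actor` rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
INSERT INTO actor VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO film_actor VALUES (1, 10), (1, 11), (2, 10), (3, 12);
""")

# ANSI style: the join condition sits next to the join, so it is hard to omit.
joined = conn.execute(
    "SELECT COUNT(*) FROM actor a "
    "INNER JOIN film_actor fa ON a.actor_id = fa.actor_id"
).fetchone()[0]

# THETA style with the WHERE condition forgotten: every actor row is
# paired with every film_actor row.
forgotten = conn.execute(
    "SELECT COUNT(*) FROM actor a, film_actor fa"
).fetchone()[0]

print(joined, forgotten)  # 4 12
```

With 3 actors and 4 link rows the damage is small, but the row count grows as the product of the table sizes.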
    • how to test our SQL
      WITHOUT THE STRENGTH OF EXPLAIN YOU WILL GET LOST IN THE FIELDS OF MISUNDERSTANDING
    • EXPLAIN the basics
      ➲ Provides the execution plan chosen by the MySQL optimizer
      ➲ Simply prepend the word EXPLAIN to your SELECT statement
      ➲ Each row represents a set of information for each table used in the SELECT
    • EXPLAIN the columns
      ➲ select_type - type of “set” the data in this row contains (SIMPLE, DERIVED, SUBQUERY, etc.)
      ➲ table - alias (or full table name if no alias) of the table or derived table from which the data in this set comes
      ➲ type - “access strategy” used to grab the data in this set (ALL, CONST, REF, etc.)
      ➲ possible_keys - keys available to the optimizer for the query
      ➲ key - key chosen by the optimizer
      ➲ key_len - number of bytes used from the key
      ➲ ref - shows the column used in join relations
      ➲ rows - estimate of the number of rows in this set
      ➲ Extra - information the optimizer chooses to give you
    • EXPLAIN the output
        EXPLAIN
        SELECT
          f.film_id,
          f.title,
          c.name
        FROM film f
          INNER JOIN film_category fc ON f.film_id = fc.film_id
          INNER JOIN category c ON fc.category_id = c.category_id
        WHERE f.title LIKE 'T%'\G
        *************************** 1. row ***************************
        select_type: SIMPLE
        table: c
        type: ALL
        possible_keys: PRIMARY
        key: NULL
        key_len: NULL
        ref: NULL
        rows: 16
        Extra:
        *************************** 2. row ***************************
        select_type: SIMPLE
        table: fc
        type: ref
        possible_keys: PRIMARY, fk_film_category_category
        key: fk_film_category_category
        key_len: 1
        ref: c.category_id
        rows: 1
        Extra: Using index
        *************************** 3. row ***************************
        select_type: SIMPLE
        table: f
        type: eq_ref
        possible_keys: PRIMARY, idx_title
        key: PRIMARY
        key_len: 2
        ref: fc.film_id
        rows: 1
        Extra: Using where
      Callouts on the slide:
        rows: estimated row count
        possible_keys / key: available indexes and the chosen one
        Extra: Using index means a covering index was used
    • EXPLAIN a real world example
      ➲ The three most important columns returned by EXPLAIN
        possible_keys: all possible indexes which MySQL could have used, based on a series of very quick lookups and calculations
        key: chosen key
        rows: estimate of the scanned rows

        CREATE TABLE `conferences` (
          `conference_id` int(11) NOT NULL,
          `location_id` int(11) NOT NULL,
          `topic_id` int(11) NOT NULL,
          `date` date NOT NULL,
          PRIMARY KEY (`conference_id`)
        ) ENGINE=InnoDB;

        CREATE TABLE `attendees` (
          `attendee_id` int(11) NOT NULL,
          `lastname` varchar(50) NOT NULL,
          `conference_id` int(11) NOT NULL,
          `registration_status` tinyint(4) NOT NULL,
          PRIMARY KEY (`attendee_id`)
        ) ENGINE=InnoDB;

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          conference_id = 123
          AND registration_status > 0

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: attendees
        possible_keys: NULL
        key: NULL
        rows: 14052
    • EXPLAIN a real world example
      ➲ Interpreting the result:
        No suitable indexes for this query
        MySQL has to do a full scan of the table
        Full table scans are almost always the slowest
        Full table scans are usually an indication that an index is needed

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          conference_id = 123
          AND registration_status > 0

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: attendees
        possible_keys: NULL
        key: NULL
        rows: 14052
    • EXPLAIN a real world example
      ➲ MySQL has two indexes to choose from
      ➲ “reg” is not “sufficiently unique”
        the spread of the values can also be a factor (e.g. when 99% of rows contain the same value)
      ➲ Index “uniqueness” is called cardinality
      ➲ There is room for a performance increase

        ALTER TABLE attendees
          ADD INDEX conf (conference_id),
          ADD INDEX reg (registration_status);

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          conference_id = 123
          AND registration_status > 0

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: attendees
        possible_keys: conf, reg
        key: conf
        rows: 331
    • EXPLAIN a real world example
      ➲ “reg_conf_index” is a much better choice
      ➲ Other keys are still available, just not as effective

        ALTER TABLE attendees
          ADD INDEX reg_conf_index (registration_status, conference_id);

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          conference_id = 123
          AND registration_status > 0

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: attendees
        possible_keys: reg, conf, reg_conf_index
        key: reg_conf_index
        rows: 204
    • EXPLAIN a real world example
      ➲ It seems that even without the “reg” index everything is working just as expected

        ALTER TABLE attendees
          DROP INDEX reg,
          DROP INDEX conf;

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          registration_status = 2

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: attendees
        possible_keys: reg_conf_index
        key: reg_conf_index
        rows: 372
    • EXPLAIN a real world example
      ➲ Without the “conf” index we are back to square one
      ➲ The order in which the fields are defined in a composite index affects whether it is available to a query
      ➲ Potential workaround
        SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0;

        ALTER TABLE attendees
          DROP INDEX reg,
          DROP INDEX conf;

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          conference_id = 123

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: attendees
        possible_keys: NULL
        key: NULL
        rows: 14502
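The column-order effect can be observed in a few lines. This is a sketch under a loud assumption: it uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module instead of MySQL's `EXPLAIN`, so the plan text differs from the slides, but the leftmost-prefix behaviour it illustrates is the same idea. The `plan` helper is a name introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attendees (
    attendee_id INTEGER PRIMARY KEY,
    lastname TEXT NOT NULL,
    conference_id INTEGER NOT NULL,
    registration_status INTEGER NOT NULL
);
CREATE INDEX reg_conf_index
    ON attendees (registration_status, conference_id);
""")


def plan(sql: str) -> str:
    """Return the planner's description of how the query will be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Filters on the leading column: the composite index is usable.
leading = plan("SELECT * FROM attendees WHERE registration_status = 2")

# Filters only on the second column: the index cannot be used for lookup.
trailing = plan("SELECT * FROM attendees WHERE conference_id = 123")

# Filters on both columns: the index is usable again.
both = plan(
    "SELECT * FROM attendees "
    "WHERE conference_id = 123 AND registration_status = 2"
)

for label, text in (("leading", leading), ("trailing", trailing), ("both", both)):
    print(label, "->", text)
```

Supplying a condition on the leading column, as the workaround on the slide does, is what makes the composite index available again.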
    • EXPLAIN a real world example
      ➲ Great, MySQL is using the index on “lastname”, which is good

        ALTER TABLE attendees
          ADD INDEX lastname (lastname);

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          lastname LIKE 'parr%'

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: attendees
        possible_keys: lastname
        key: lastname
        rows: 234
    • EXPLAIN a real world example
      ➲ With a leading wildcard, MySQL doesn't even try to use an index!

        ALTER TABLE attendees
          ADD INDEX lastname (lastname);

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          lastname LIKE '%arr%'

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: attendees
        possible_keys: NULL
        key: NULL
        rows: 14052
    • EXPLAIN a real world example (pre MySQL 5.1)
      ➲ MySQL doesn't use an index because of the OR
      ➲ MySQL performs a full table scan
      ➲ Workaround: use “UNION”
      ➲ Workaround: add a composite INDEX
        ALTER TABLE conferences
          ADD INDEX location_topic (location_id, topic_id);

        ALTER TABLE conferences
          ADD INDEX location_id (location_id),
          ADD INDEX topic_id (topic_id);

        EXPLAIN
        SELECT *
        FROM conferences
        WHERE
          location_id = 2
          OR topic_id IN (4,6,1)

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: conferences
        possible_keys: location_id, topic_id
        key: NULL
        rows: 5043
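The UNION workaround relies on the two forms returning the same rows. A minimal check of that equivalence follows, again assuming SQLite via Python's `sqlite3` as a stand-in for MySQL, with made-up sample rows in `conferences`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE conferences (
    conference_id INTEGER PRIMARY KEY,
    location_id INTEGER NOT NULL,
    topic_id INTEGER NOT NULL
);
INSERT INTO conferences VALUES
    (1, 2, 4), (2, 2, 9), (3, 5, 6), (4, 7, 8), (5, 3, 1);
""")

# Original form: one statement with an OR across two different columns.
with_or = sorted(conn.execute(
    "SELECT conference_id FROM conferences "
    "WHERE location_id = 2 OR topic_id IN (4, 6, 1)"
).fetchall())

# Rewritten form: each branch filters on a single column, so each branch
# can use its own single-column index; UNION removes the duplicates.
with_union = sorted(conn.execute(
    "SELECT conference_id FROM conferences WHERE location_id = 2 "
    "UNION "
    "SELECT conference_id FROM conferences WHERE topic_id IN (4, 6, 1)"
).fetchall())

print(with_or == with_union)  # True
```

Note that UNION (not UNION ALL) is needed here, because row 1 matches both branches.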
    • EXPLAIN a real world example
      ➲ Looks like we need an index on “conference_id” on attendees
      ➲ How many total rows are estimated? 15 x 14502

        EXPLAIN
        SELECT *
        FROM conferences c
          INNER JOIN attendees a USING (conference_id)
        WHERE
          c.location_id = 2
          AND c.topic_id IN (4,6,1)
          AND a.registration_status > 1

        (Let's only show the important parts for now)
        *************************** 1. row ***************************
        table: c
        possible_keys: conference_topic
        key: conference_topic
        rows: 15
        *************************** 2. row ***************************
        table: a
        possible_keys: NULL
        key: NULL
        rows: 14502
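The "15 x 14502" on the slide is the usual rough reading of a multi-row EXPLAIN: multiplying the `rows` estimates gives an idea of how many row combinations the join may have to examine. A tiny sketch of that arithmetic, with the dictionary name chosen here for illustration:

```python
import math

# rows estimates taken from the EXPLAIN output on the slide, keyed by alias
rows_per_table = {"c": 15, "a": 14502}

# Rough upper bound on the row combinations examined by the nested-loop join
estimated_combinations = math.prod(rows_per_table.values())
print(estimated_combinations)  # 217530
```

These are optimizer estimates, not measured counts, so the product is only a guide for comparing plans.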
    • EXPLAIN the “type”
    • EXPLAIN the type
      ➲ CONST: SELECT * FROM table WHERE field = 'value';
        The field needs to be indexed with a unique non-nullable key
        If non-unique or nullable the type will be “ref”
        It refers to when a table with a single row is referenced in the SELECT
        Can be propagated across multiple joined columns:

        EXPLAIN
        SELECT r.*
        FROM rental r
          INNER JOIN customer c ON r.customer_id = c.customer_id
        WHERE r.rental_id = 13\G
        *************************** 1. row ***************************
        id: 1
        select_type: SIMPLE
        table: r
        type: const
        possible_keys: PRIMARY,idx_fk_customer_id
        key: PRIMARY
        key_len: 4
        ref: const
        rows: 1
        Extra:
        *************************** 2. row ***************************
        id: 1
        select_type: SIMPLE
        table: c
        type: const
        possible_keys: PRIMARY
        key: PRIMARY
        key_len: 2
        ref: const /* Here is where the propagation occurs... */
        rows: 1
        Extra:
        2 rows in set (0.00 sec)
    • EXPLAIN the type
      ➲ RANGE: SELECT * FROM table WHERE field BETWEEN 'value' AND 'value';
        The field needs to be indexed
        If too many records are estimated, it won't be used

        EXPLAIN
        SELECT *
        FROM rental
        WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G
        *************************** 1. row ***************************
        id: 1
        select_type: SIMPLE
        table: rental
        type: range
        possible_keys: rental_date
        key: rental_date
        key_len: 8
        ref: NULL
        rows: 364
        Extra: Using where
        1 row in set (0.00 sec)
    • EXPLAIN the type
      ➲ ALL: SELECT * FROM table WHERE field BETWEEN 'value' AND 'far away from starting value';
        No WHERE condition (duh)
        No index on the field in the WHERE condition
        Poor selectivity on the indexed field
        Too many records meet the WHERE condition
        Using SELECT * FROM
        SEEK: jumps to random places to fetch the data and repeats for each piece of data needed
        SCAN: jumps to the start and sequentially reads the data
        For large amounts of data, SCAN operations tend to be more efficient than multiple SEEK operations

        EXPLAIN
        SELECT *
        FROM rental
        WHERE rental_date BETWEEN '2001-01-14' AND '2012-12-31'\G
        *************************** 1. row ***************************
        id: 1
        select_type: SIMPLE
        table: rental
        type: ALL
        possible_keys: rental_date /* large range forces a full scan */
        key: NULL
        key_len: NULL
        ref: NULL
        rows: 16298
        Extra: Using where
        1 row in set (0.00 sec)
    • EXPLAIN the type
      ➲ INDEX_MERGE: SELECT * FROM table WHERE field = 'value' AND field1 = 'value';
        Introduced with the optimizer in MySQL 5.0
        Allows the optimizer to use more than one index to satisfy a join condition
        Prior to MySQL 5.0, only one index could be used
        In case of OR conditions, MySQL < 5.0 would use a full table scan

        EXPLAIN
        SELECT *
        FROM rental
        WHERE
          rental_id IN (10,11,12)
          OR rental_date = '2006-02-01'\G
        *************************** 1. row ***************************
        id: 1
        select_type: SIMPLE
        table: rental
        type: index_merge
        possible_keys: PRIMARY,rental_date
        key: rental_date,PRIMARY
        key_len: 8,4
        ref: NULL
        rows: 4
        Extra: Using sort_union(rental_date,PRIMARY); Using where
        1 row in set (0.02 sec)
    • EXPLAIN the “Extra”
    • EXPLAIN the Extra
      ➲ “Extra” shows additional operations invoked to get your result set
      ➲ Some common values are (more are discussed in the MySQL manual):
        Using where
        Using temporary table
        Using filesort
        Using index

        EXPLAIN
        SELECT *
        FROM rental
        WHERE
          rental_id IN (10,11,12)
          OR rental_date = '2006-02-01'\G
        *************************** 1. row ***************************
        id: 1
        select_type: SIMPLE
        table: rental
        type: index_merge
        possible_keys: PRIMARY,rental_date
        key: rental_date,PRIMARY
        key_len: 8,4
        ref: NULL
        rows: 4
        Extra: Using sort_union(rental_date,PRIMARY); Using where
        1 row in set (0.02 sec)
    • EXPLAIN the Extra
      ➲ Using filesort: AVOID
      ➲ Avoid because
        Doesn't use an index
        Involves a full scan
        Uses a generic algorithm (one fits all)
        Uses filesystem (BAD !!)
        Gets slower with more data
      ➲ It's not all that bad
        Sometimes unavoidable - ORDER BY RAND()
        Acceptable provided you get to your result as quickly as possible, and keep it predictably small

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          conference_id = 123
        ORDER BY lastname
        *************************** 1. row ***************************
        table: attendees
        possible_keys: conference_id
        key: conference_id
        rows: 331
        Extra: Using filesort
    • EXPLAIN the Extra
      ➲ Using index: GOOD
      ➲ Celebrate because
        MySQL got your results just by consulting the index
        MySQL didn't need to look at the table to get the results (opening a table is expensive)
        Fastest way to get your data
      ➲ Particularly useful...
        When you are interested in a single piece of data or an id
        When you are interested in COUNT(), SUM(), AVG(), etc. of a field

        EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
        *************************** 1. row ***************************
        table: attendees
        possible_keys: conference_id
        key: conference_id
        rows: 331
        Extra:
      Nothing is actually wrong with this query, it just could be quicker

        ALTER TABLE attendees ADD INDEX conf_age (conference_id, age);

        EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
        *************************** 1. row ***************************
        table: attendees
        possible_keys: conference_id, conf_age
        key: conf_age
        rows: 331
        Extra: Using index
      Aside from caching, this is the fastest way to get your data
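The before/after effect of a covering index can be reproduced locally. The sketch below assumes SQLite through Python's `sqlite3` module rather than MySQL, so the wording differs: SQLite reports "USING COVERING INDEX" where MySQL prints "Using index" in Extra. The index name `conf` and the `plan` helper are illustrative choices, not from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attendees (
    attendee_id INTEGER PRIMARY KEY,
    conference_id INTEGER NOT NULL,
    age INTEGER NOT NULL
);
CREATE INDEX conf ON attendees (conference_id);
""")

QUERY = "SELECT AVG(age) FROM attendees WHERE conference_id = 123"


def plan(sql: str) -> str:
    """Return the planner's description of how the query will be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# With an index on conference_id only, the table must still be read for age.
before = plan(QUERY)

# Replace it with an index that contains every column the query touches.
conn.executescript("""
DROP INDEX conf;
CREATE INDEX conf_age ON attendees (conference_id, age);
""")
after = plan(QUERY)

print("before:", before)
print("after: ", after)
```

The single-column index is dropped first only to keep the planner's choice unambiguous in this demo; on the slide both indexes coexist.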
    • INDEXES... your schema's phone book
      ➲ Speed up SELECTs, but slow down modifications
      ➲ Make sure you have indexes on columns used in WHERE, ON, and GROUP BY clauses
      ➲ Always ensure that JOIN conditions are indexed AND have identical data types
      ➲ Good keys:
        Selectivity: % of distinct values = distinct values / number of rows
        Unique or primary keys always have a selectivity of 1
        Low selectivity: maybe you can put it in a multi-column index
        Prefix? Suffix? It depends on your application
    • indexed columns and functions don't mix
      A full table scan is used because a function (LEFT) is operating on the lastname column.

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          LEFT(lastname, 2) = 'Pa'
        *************************** 1. row ***************************
        id: 1
        select_type: SIMPLE
        table: film
        type: ALL
        possible_keys: NULL
        key: NULL
        key_len: NULL
        ref: NULL
        rows: 951
        Extra: Using where

      Let's fix this...

        EXPLAIN
        SELECT *
        FROM attendees
        WHERE
          lastname LIKE 'Pa%'
        *************************** 1. row ***************************
        id: 1
        select_type: SIMPLE
        table: film
        type: range
        possible_keys: idx_title
        key: idx_title
        key_len: 767
        ref: NULL
        rows: 15
        Extra: Using where
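The same contrast can be reproduced with a quick local experiment. This sketch makes two assumptions that are not in the slides: it runs on SQLite via Python's `sqlite3` rather than MySQL, and it expresses the prefix match as an explicit range (`>= 'Pa' AND < 'Pb'`) because SQLite only optimizes LIKE under extra collation conditions. The index name `idx_lastname` is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attendees (
    attendee_id INTEGER PRIMARY KEY,
    lastname TEXT NOT NULL,
    conference_id INTEGER NOT NULL
);
CREATE INDEX idx_lastname ON attendees (lastname);
""")


def plan(sql: str) -> str:
    """Return the planner's description of how the query will be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# The column is wrapped in a function: the index on lastname cannot be used.
wrapped = plan(
    "SELECT * FROM attendees WHERE substr(lastname, 1, 2) = 'Pa'"
)

# The bare column is compared against constants: an index range scan works.
bare = plan(
    "SELECT * FROM attendees WHERE lastname >= 'Pa' AND lastname < 'Pb'"
)

print("wrapped:", wrapped)
print("bare:   ", bare)
```

The general rule is the same in both engines: keep the indexed column bare on one side of the comparison and move the computation to the constant side.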
    • let's fix multiple issues with a SELECT query
        SELECT * FROM orders WHERE TO_DAYS(CURRENT_DATE()) - TO_DAYS(order_created) <= 7;
      First, we are operating on an indexed column (order_created) with a function - let's fix that:
        SELECT * FROM orders WHERE order_created >= CURRENT_DATE() - INTERVAL 7 DAY;
      Even though we removed the function from the column in the WHERE expression, we still have a non-deterministic function in the statement, which prevents this query from being placed in the query cache - let's fix that:
        SELECT * FROM orders WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
      We replaced the function with a constant; however, we are specifying SELECT * rather than the actual fields that we need.
      What if there is a TEXT field in the table that we don't need to see? Having it included in the result means a larger result set, which may not fit in the query cache and may force a disk-based temporary table - let's fix that:
        SELECT
          order_id,
          customer_id,
          order_total,
          date_created
        FROM orders
        WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
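One way to get the constant date into the statement is to compute it in the application and pass it as a parameter. The following is a minimal sketch, assuming a Python client; the function name `recent_orders_query`, the selected column list, and the `%s` placeholder style (the convention used by common MySQL drivers) are assumptions for illustration.

```python
from datetime import date, timedelta


def recent_orders_query(today: date, days: int = 7):
    """Build a query whose date bound is a constant computed by the caller.

    Returns the SQL text and its parameter tuple; no database is contacted.
    """
    cutoff = today - timedelta(days=days)
    sql = (
        "SELECT order_id, customer_id, order_total, order_created "
        "FROM orders "
        "WHERE order_created >= %s"
    )
    return sql, (cutoff.isoformat(),)


sql, params = recent_orders_query(date(2013, 1, 13))
print(sql)
print(params)  # ('2013-01-06',)
```

Every request issued on the same day then sends identical SQL with an identical constant, and the indexed column stays free of function calls.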
    • good indexes vs. bad indexes
      Don't forget that MySQL string indexes allow only 1000 bytes (333 characters using UTF-8).
      Let's say you have 11,000,000 records in a table called “USERS” with the following fields:
      ➲ user, firstname, lastname, gender, email, age, country_id
      Our application performs searches on the following fields:
      ➲ user
      ➲ firstname, lastname, gender
      ➲ email
      It is obvious to create indexes on user and email, especially if they are unique, but what about the other fields?
      ➲ “gender” can be M or F, so selectivity is very low: 2/11,000,000 ≈ 0.
        Best would be to remove the index on gender if you have it
      ➲ “firstname”/“lastname” depend on the uniqueness of the values stored.
        Use SELECT DISTINCT to calculate the selectivity
        If it is above 15%, keep it
        Below 15%, you might want to create a composite INDEX
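The selectivity rule of thumb above is simple enough to sketch as a helper. The function names below are illustrative; the 15% threshold is the one quoted on the slide, and the distinct counts would in practice come from a `SELECT COUNT(DISTINCT column)` query.

```python
def selectivity(distinct_values: int, total_rows: int) -> float:
    """Return distinct values / number of rows, a value between 0 and 1."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return distinct_values / total_rows


def worth_single_column_index(sel: float, threshold: float = 0.15) -> bool:
    """Apply the slide's rule: keep a standalone index above the threshold."""
    return sel > threshold


TOTAL_ROWS = 11_000_000

gender = selectivity(2, TOTAL_ROWS)            # two values: M or F
email = selectivity(TOTAL_ROWS, TOTAL_ROWS)    # unique column

print(f"gender: {gender:.8f}", worth_single_column_index(gender))
print(f"email:  {email:.8f}", worth_single_column_index(email))
```

A column that fails the test on its own, such as gender, can still be useful as a later column of a composite index.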
    • removing crappy or redundant indexes
        SELECT
          t.TABLE_SCHEMA AS `db`,
          t.TABLE_NAME AS `table`,
          s.INDEX_NAME AS `index name`,
          s.COLUMN_NAME AS `field name`,
          s.SEQ_IN_INDEX `seq in index`,
          s2.max_columns AS `# cols`,
          s.CARDINALITY AS `card`,
          t.TABLE_ROWS AS `est rows`,
          ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %`
        FROM
          INFORMATION_SCHEMA.STATISTICS s
          INNER JOIN INFORMATION_SCHEMA.TABLES t
            ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
            AND s.TABLE_NAME = t.TABLE_NAME
          INNER JOIN (
            SELECT
              TABLE_SCHEMA,
              TABLE_NAME,
              INDEX_NAME,
              MAX(SEQ_IN_INDEX) AS max_columns
            FROM INFORMATION_SCHEMA.STATISTICS
            WHERE TABLE_SCHEMA != 'mysql'
            GROUP BY
              TABLE_SCHEMA,
              TABLE_NAME,
              INDEX_NAME
          ) AS s2
            ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA
            AND s.TABLE_NAME = s2.TABLE_NAME
            AND s.INDEX_NAME = s2.INDEX_NAME
        WHERE
          t.TABLE_SCHEMA != 'mysql' /* Filter out the mysql system DB */
          AND t.TABLE_ROWS > 10 /* Only tables with some rows */
          AND s.CARDINALITY IS NOT NULL /* Need at least one non-NULL value in the field */
          AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* unique indexes are perfect anyway */
        ORDER BY
          `sel %`, /* DESC for best non-unique indexes */
          s.TABLE_SCHEMA,
          s.TABLE_NAME
        LIMIT 100