Indexing the MySQL Index: Key to performance tuning
Indexing the MySQL Index: Guide to Performance Enhancement Presented by – Sonali Minocha OSSCube
Who Am I? Chief Technology Officer (MySQL) with OSSCubeMySQL Consulting, Implementation & TrainingMySQL Certified DBA & Cluster DBA
What is Index? A mechanism to locate and A database index is a data access data within astructure that improves the database. An index may speed of data retrieval quote one or more columns operations on a database and be a means of enforcing table. uniqueness on their values.
More about Index• Speedy data retrieval. • SPEED of SELECTs• Rapid random look ups.• Efficient for Reporting, OLAP, read-intensive applications •However it is expensive for – Slows down writes – heavy write applications (OLTP) be careful – More disk space used
PropertiesIndex can be created on: Index only contains key-• One or more columns. fields according to which table is arranged. Index may quote one or more columns and be aIndex may be unique or means of enforcing non-unique. uniqueness of their values.
Cont.In this table if we have to search for employee whose name isRony then code will look like :For each row in table if row = Rony then results.append[row] Else movenextSo we checking each now for condition.
HOW DATABASE INDEXES WORK ?• Lets assume we have a table of data like this:
Type Of Indexes ConcatenatedColumn Index Covering Index Index Clustered/Non- Partial Index clustered Index
Column Index Only those query will Index on a single be optimized which column satisfy your criteria. Eg: By adding an index to SELECT employeeid, theemployeeid, firstnam query is optimized to e only look at records FROM Employee that satisfy your WHERE criteria. employeeid = 001
Concatenated IndexIndex on multiple Use appropriate index. columns. : SELECT employeeid, lastname FROM Employee WHERE employeeid = 002 AND lastname = ‘Felix’;
Covering Index The benefit of a covering index is that the lookup of the various B-Tree index pagesCovers all columns in a query. necessarily satisfies the query, and no additional data page lookups are necessary. SELECT employeeid FROM Employee WHERE employeeid = 001
Partial Index Subset of a column for the index. Use on CHAR, VARCHAR,TEXT etc. Creating a partial index may greatly reduce the size of the index, and minimize the additional data lookups required. Create table t ( name char(255) , INDEX ( name(15) ) ); Eg:-SELECT employeeid, firstname, lastname FROM Employee WHERE lastname like ‘A%’ We should add an index to lastname to improve performance.
Clustered vs. Non-clustered Describes whether the data records are stored on disk in a sorted orderMyISAM - non clustered. InnoDB - Clustered. Secondary indexes built upon the clustering key
Primary Index is added to all secondary index.Because the data resides within the leaf nodes of index, more space in memory needed to search through same amount of records
How it can be faster?If we create HASH TABLE. The key ofhash table would be based onempnameand the values would bepointer to the database row.This is Hash Index: • Hash index are good for equality searches. • Hash index are not good for index searches.So what should be the solution for Range Searches?B-Tree
30 0X775800 Age Location of the data B-Tree/ Binary tree: Stores data in ordered way. Nodes in B-Tree contains a index field and a pointer to a Allows data row. logarithmic • So like in above Each node takes It allows faster Single diskselections, inser example if we up one disk range searches. operation. tions and create an index on block. deletion. age the node of B- tree will look like
B-Tree 003 006Diagram 001 002 004 005 008 007 EMPLOYEE ID FIRSTNAME LASTNAME AGE SALARY GENDER 001 Ashish Kataria 25 10000 M 002 Rony Felix 28 20000 M 003 Namita Misra 24 10000 F 004 Ankur Aeran 30 25000 M 005 Priyanka Jain 30 20000 F 006 Pradeep Pandey 31 30000 M 007 Pankaj Gupta 25 12000 M 008 Ankit Garg 30 15000 M
R-TreeMySQL supports any other type of index called Spatial Index. Spatial Index arecreated the way other index are created. Only extended keyword is usedSPATIAL.
Fulltext IndexesAbility to search for text.Only available in MyISAM.Can be created for a TEXT, CHAR or VARCHAR.Important points of fulltext Search: • Searches are not case sensitive. • Short words are ignored, the default minimum length is 4 character. • ft_min_word_len • ft_max_word_lenWords called stopwords are ignored: • ft_stopword_file= If a word is present in more than 50% of the rows it will have a weight of zero. This has advantageon large data sets.
Hash, B-Tree, R-Tree uses different strategy to speed data retrieval time. The best algorithm is pickedup depending on data expected and supportedalgorithm.
Query is using Index or Not? With EXPLAIN the query isQuery Execution Plan sent all the way to the (EXPLAIN) optimizer, but not to the storage engine Secrets of Best MySQL Optimization Practice
mysql> explain select * from citylistG id: 1select_type: SIMPLE table: citylist type: ALLpossible_keys: NULL key: NULLkey_len: NULL ref: NULL rows: 4079 Extra:1 row in set (0.01 sec)
Selectivity• Selectivity of a column is the ratio between number of distinctvalues and number of total values.•Primary Key has selectivity 1. eg: Employee table has 10,000 users with fields employeeid ,email ,firstname ,lastname ,salary ,genderOur application searches for following fields: employeeid first ,lastname ,gender email So employeeid, email, firstname and lastname can be candiates for indexes.
Since employee id is unique its selectivity will be equal to the primary key selectivity.In case of gender it will have two values M ,F selectivity = 2/10,000 = .00002 If we drop this index , it will be more beneficial. Index on firstname and lastname selectivity is a function of name you are searching.Selectivity above than 15% is a good index.
# /* SQL script to grab the # SQL script to grab the worst performing indexes worst performing # in the whole server indexes in the whole server # */ # SELECT # t.TABLE_SCHEMA AS `db` # , t.TABLE_NAME AS `table` # , s.INDEX_NAME AS `inde name` # , s.COLUMN_NAME AS `field name` # , s.SEQ_IN_INDEX `seq in index` # , s2.max_columns AS `# cols` # , s.CARDINALITY AS `card` # , t.TABLE_ROWS AS `est rows` # , ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %` # FROM INFORMATION_SCHEMA.STATISTICS s # INNER JOIN INFORMATION_SCHEMA.TABLES t # ON s.TABLE_SCHEMA = t.TABLE_SCHEMA # AND s.TABLE_NAME = t.TABLE_NAME
# INNER JOIN ( # SELECT # TABLE_SCHEMA # , TABLE_NAME # , INDEX_NAME # , MAX(SEQ_IN_INDEX) AS max_columns # FROM INFORMATION_SCHEMA.STATISTICS # WHERE TABLE_SCHEMA != mysql # GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME # ) AS s2 # ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA # AND s.TABLE_NAME = s2.TABLE_NAME # AND s.INDEX_NAME = s2.INDEX_NAME # WHERE t.TABLE_SCHEMA != mysql /* Filter out the mysql system DB */ # AND t.TABLE_ROWS> 10 /* Only tables with some rows */ # AND s.CARDINALITY IS NOT NULL /* Need at least one non-NULL value in the field */ # AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* Selectivity < 1.0 b/c unique indexes are perfect anyway */ # ORDER BY `sel %`, s.TABLE_SCHEMA, s.TABLE_NAME /* Switch to `sel %` DESC for best non-unique indexes */
Where to add indexWHERE clauses ( on which column data is filtered)• Good distribution and selectivity in field values• BAD IDEA to index gender or columns like statusIndex join columnsTry to create as many Covering Index as possibleGROUP BY clauses• Field order is important.
Avoid Redundant IndexesExample:Key(a)key(a,b) Key(a(10));Key(a)andKey(a(10) is redundant because they are prefix of Key(A,B)Redundantx may be usefulA – integer columnB – varchar(255)Key(A) will be faster than using Key(A,B).Index on short columns are more faster however if index on longer column is created that can be beneficial as covered index.
Key Caches (MyISAM)• For tables are used more often Key Cache can be used to optimize read of those tableshot_cache.key_buffer_size = 128K• Assign tables to caches CACHE INDEX table1, TO hot_cache; CACHE INDEX table2 TO cold_cache;
• Preload your indexes for maximum efficiency• LOAD INDEX INTO CACHE table1;• Use IGNORE LEAVES
Case where Index will not be usedFunctions on indexed fields.WHERE TO_DAYS(dateofjoining) –TO_DAYS(Now()) <= 7 (doesn’t use index)WHERE dateofjoing >= DATE_SUB(NOW(), INTER VAL 7 DAY) (uses index)
Select * from employee where name like ‘%s’;If we use left() function used on index column.
Choosing IndexesIndex columns that you use for searching, Consider columnsorting or grouping, not Index Short Values. selectivity. columns you only display as output.Index prefixes of string Take advantage of Dont over Index. values. leftmost prefixes. Match Index types to Use the slow-query log the type of to identify queries that comparisions you may be performing perform. badly.
Keep data types as small as possible for what you need Dont use BIGINT unless required The smaller your data types, the more records will fit into the index blocks. The more records fit in each block, the fewer reads are needed to find your records.
Common indexing mistakes Using CREATE Misusing aNot using an Index. INDEX. composite Index. Appending the Using an primary key to an expression on a index on an column. InnoDB table.