MySQL users commonly ask: Here's my table, what indexes do I need? Why aren't my indexes helping me? Don't indexes cause overhead? This talk gives you some practical answers, with a step by step method for finding the queries you need to optimize, and choosing the best indexes for them.
Advanced MySQL Query Tuning - talk at Percona Live and MySQL Meetup tour.
Tuning Queries and Schema/Indexes can significantly increase performance of your application and decrease response times.
This year I will cover new MySQL 5.6 and 5.7 algorithms that has been designed to improve query performance and simply tuning.
Topics:
1. Group by and order by optimizations
2. MySQL temporary tables and filesort
3. Using covered indexes to optimize your queries
4. Loose and tight index scan in MySQL
5. Using summary tables to optimize your reporting queries
6. New MySQL 5.6 and 5.7 Optimizer features and improvements
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013Jaime Crespo
Tutorial delivered at Percona MySQL Conference Live London 2013.
It doesn't matter what new SSD technologies appear, or what are the latest breakthroughs in flushing algorithms: the number one cause for MySQL applications being slow is poor execution plan of SQL queries. While the latest GA version provided a huge amount of transparent optimizations -specially for JOINS and subqueries- it is still the developer's responsibility to take advantage of all new MySQL 5.6 features.
In this tutorial we will propose the attendants a sample PHP application with bad response time. Through practical examples, we will suggest step-by-step strategies to improve its performance, including:
* Checking MySQL & InnoDB configuration
* Internal (performance_schema) and external tools for profiling (pt-query-digest)
* New EXPLAIN tools
* Simple and multiple column indexing
* Covering index technique
* Index condition pushdown
* Batch key access
* Subquery optimization
How to use histograms to get better performanceMariaDB plc
Sergei Petrunia and Varun Gupta, software engineers MariaDB, show how histograms can be used to improve query performance. They begin by introducing histrograms and explaining why they’re needed by the query optimizer. Next, they discuss how to determine whether or not histrograms are needed, and if so, how to determine which tables and columns they should be applied. Finally, they cover best practices and recent improvements to histograms.
Adrian Hardy's slides from PHPNW08
Once you have your query returning the correct results, speed becomes an important factor. Speed can either be an issue from the outset, or can creep in as your dataset grows. Understanding the EXPLAIN command is essential to helping you solve and even anticipate slow queries.
Associated video: http://blip.tv/file/1791781
Let's get into several common types of queries that developers struggle with, showing SQL solutions, and then analyze them for optimal efficiency. I'll cover Exclusion Join, Random Selection, Greatest-Per-Group, Dynamic Pivot, and Relational Division.
MySQL users commonly ask: Here's my table, what indexes do I need? Why aren't my indexes helping me? Don't indexes cause overhead? This talk gives you some practical answers, with a step by step method for finding the queries you need to optimize, and choosing the best indexes for them.
Advanced MySQL Query Tuning - talk at Percona Live and MySQL Meetup tour.
Tuning Queries and Schema/Indexes can significantly increase performance of your application and decrease response times.
This year I will cover new MySQL 5.6 and 5.7 algorithms that has been designed to improve query performance and simply tuning.
Topics:
1. Group by and order by optimizations
2. MySQL temporary tables and filesort
3. Using covered indexes to optimize your queries
4. Loose and tight index scan in MySQL
5. Using summary tables to optimize your reporting queries
6. New MySQL 5.6 and 5.7 Optimizer features and improvements
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013Jaime Crespo
Tutorial delivered at Percona MySQL Conference Live London 2013.
It doesn't matter what new SSD technologies appear, or what are the latest breakthroughs in flushing algorithms: the number one cause for MySQL applications being slow is poor execution plan of SQL queries. While the latest GA version provided a huge amount of transparent optimizations -specially for JOINS and subqueries- it is still the developer's responsibility to take advantage of all new MySQL 5.6 features.
In this tutorial we will propose the attendants a sample PHP application with bad response time. Through practical examples, we will suggest step-by-step strategies to improve its performance, including:
* Checking MySQL & InnoDB configuration
* Internal (performance_schema) and external tools for profiling (pt-query-digest)
* New EXPLAIN tools
* Simple and multiple column indexing
* Covering index technique
* Index condition pushdown
* Batch key access
* Subquery optimization
How to use histograms to get better performanceMariaDB plc
Sergei Petrunia and Varun Gupta, software engineers MariaDB, show how histograms can be used to improve query performance. They begin by introducing histrograms and explaining why they’re needed by the query optimizer. Next, they discuss how to determine whether or not histrograms are needed, and if so, how to determine which tables and columns they should be applied. Finally, they cover best practices and recent improvements to histograms.
Adrian Hardy's slides from PHPNW08
Once you have your query returning the correct results, speed becomes an important factor. Speed can either be an issue from the outset, or can creep in as your dataset grows. Understanding the EXPLAIN command is essential to helping you solve and even anticipate slow queries.
Associated video: http://blip.tv/file/1791781
Let's get into several common types of queries that developers struggle with, showing SQL solutions, and then analyze them for optimal efficiency. I'll cover Exclusion Join, Random Selection, Greatest-Per-Group, Dynamic Pivot, and Relational Division.
The JSON data type and functions that support it comprise one of the most interesting features introduced in MySQL 5.7 for application developers. But no feature is a Golden Hammer. We need to apply a little expertise to get the best of it, and avoid misusing it. I’ll show practical examples that work well with JSON, and other scenarios where conventional columns would perform better. Questions addressed in this presentation: How much space does JSON data use, compared to conventional data? What is the performance of querying JSON vs. conventional data? How do I create indexes for JSON data? What kind of data is best to store in JSON? How do I get the best of both worlds?
How to Analyze and Tune MySQL Queries for Better Performanceoysteing
Tutorial at Oracle Open World 2015:
Performance of SQL queries plays a big role in application performance. If some queries execute slowly, these queries or the database schema may need tuning. This tutorial covers query processing, optimization methods, and how the MySQL optimizer chooses a specific plan to execute SQL. See demonstrations on how to use tools such as EXPLAIN (including the JSON-based variant), optimizer trace, and performance schema to analyze query plans. See how the Visual Explain functionality in MySQL Workbench helps you to visualize these plans. Based on the analysis, the tutorial covers how to take the next steps for performance tuning. It might mean forcing a particular index, changing the schema, or modifying configuration parameters.
MySQL Indexing : Improving Query Performance Using Index (Covering Index)Hemant Kumar Singh
Query performance can be enhanced by a major factor if Database Indexed are used properly. The main aim of this slide was to explain the benefits of Covering Index, but ended up writing everything I knew.
Here is the summary of what I have covered in this slide:-
1. What affects Database performance
2. What is Database Index
3. Types Of Database Index
4. Column Index
5. Composite Index
6. Covering Index
7. Indexing Guidelines
It would be interesting to know these as well -
Best practices for Indexing in Database(RDBMS)
Best practices for Indexing in MySQL
Best practices for Indexing in PostgreSQL
Best practices for Database Modeling
Best practices for SQL Query Construction
Performance impact of Indexing on Query Performance
Performance impact of Indexing on INSERT Queries
Consolidate all these knowledge and you should be happy to see the overall performance gain in your SQL Query and hence overall application will run faster.
When does InnoDB lock a row? Multiple rows? Why would it lock a gap? How do transactions affect these scenarios? Locking is one of the more opaque features of MySQL, but it’s very important for both developers and DBA’s to understand if they want their applications to work with high performance and concurrency. This is a creative presentation to illustrate the scenarios for locking in InnoDB and make these scenarios easier to visualize. I'll cover: key locks, table locks, gap locks, shared locks, exclusive locks, intention locks, insert locks, auto-inc locks, and also conditions for deadlocks.
In this session, we looked at five things you might not know about the Oracle Database or might have forgotten. For each topic, I explained the functionality and demonstrated the benefits using real-world examples. The topics covered apply to anyone running Oracle Database 11g and up, including Standard Edition, with only a few minor exceptions.
Have you ever wondered about the internal structure of indexes in postgres? What's the difference between a btree and a gin index? How are indexes searches etc. ? This talk is about this.
Connection Pooling in PostgreSQL using pgbouncer Sameer Kumar
The presentation was presented at 5th Postgres User Group, Singapore.
It explain how to setup pgbouncer and also shows a few demonstration graphs comparing the advantages/gains in performance when using pgbouncer instead of direct connections to PostgreSQL database.
Geek Sync I Consolidating Indexes in SQL ServerIDERA Software
You can watch the replay for this Geek Sync webcast in the IDERA Resource Center: http://ow.ly/OzvF50A5r07
Watch this Geek Sync with special guest Microsoft Certified Master Kendra Little to learn why duplicate indexes can be a big problem for your performance and maintenance. Kendra will show you three different index consolidation scenarios, and how you can measure the pros and cons of different consolidation options.
The JSON data type and functions that support it comprise one of the most interesting features introduced in MySQL 5.7 for application developers. But no feature is a Golden Hammer. We need to apply a little expertise to get the best of it, and avoid misusing it. I’ll show practical examples that work well with JSON, and other scenarios where conventional columns would perform better. Questions addressed in this presentation: How much space does JSON data use, compared to conventional data? What is the performance of querying JSON vs. conventional data? How do I create indexes for JSON data? What kind of data is best to store in JSON? How do I get the best of both worlds?
How to Analyze and Tune MySQL Queries for Better Performanceoysteing
Tutorial at Oracle Open World 2015:
Performance of SQL queries plays a big role in application performance. If some queries execute slowly, these queries or the database schema may need tuning. This tutorial covers query processing, optimization methods, and how the MySQL optimizer chooses a specific plan to execute SQL. See demonstrations on how to use tools such as EXPLAIN (including the JSON-based variant), optimizer trace, and performance schema to analyze query plans. See how the Visual Explain functionality in MySQL Workbench helps you to visualize these plans. Based on the analysis, the tutorial covers how to take the next steps for performance tuning. It might mean forcing a particular index, changing the schema, or modifying configuration parameters.
MySQL Indexing : Improving Query Performance Using Index (Covering Index)Hemant Kumar Singh
Query performance can be enhanced by a major factor if Database Indexed are used properly. The main aim of this slide was to explain the benefits of Covering Index, but ended up writing everything I knew.
Here is the summary of what I have covered in this slide:-
1. What affects Database performance
2. What is Database Index
3. Types Of Database Index
4. Column Index
5. Composite Index
6. Covering Index
7. Indexing Guidelines
It would be interesting to know these as well -
Best practices for Indexing in Database(RDBMS)
Best practices for Indexing in MySQL
Best practices for Indexing in PostgreSQL
Best practices for Database Modeling
Best practices for SQL Query Construction
Performance impact of Indexing on Query Performance
Performance impact of Indexing on INSERT Queries
Consolidate all these knowledge and you should be happy to see the overall performance gain in your SQL Query and hence overall application will run faster.
When does InnoDB lock a row? Multiple rows? Why would it lock a gap? How do transactions affect these scenarios? Locking is one of the more opaque features of MySQL, but it’s very important for both developers and DBA’s to understand if they want their applications to work with high performance and concurrency. This is a creative presentation to illustrate the scenarios for locking in InnoDB and make these scenarios easier to visualize. I'll cover: key locks, table locks, gap locks, shared locks, exclusive locks, intention locks, insert locks, auto-inc locks, and also conditions for deadlocks.
In this session, we looked at five things you might not know about the Oracle Database or might have forgotten. For each topic, I explained the functionality and demonstrated the benefits using real-world examples. The topics covered apply to anyone running Oracle Database 11g and up, including Standard Edition, with only a few minor exceptions.
Have you ever wondered about the internal structure of indexes in postgres? What's the difference between a btree and a gin index? How are indexes searches etc. ? This talk is about this.
Connection Pooling in PostgreSQL using pgbouncer Sameer Kumar
The presentation was presented at 5th Postgres User Group, Singapore.
It explain how to setup pgbouncer and also shows a few demonstration graphs comparing the advantages/gains in performance when using pgbouncer instead of direct connections to PostgreSQL database.
Geek Sync I Consolidating Indexes in SQL ServerIDERA Software
You can watch the replay for this Geek Sync webcast in the IDERA Resource Center: http://ow.ly/OzvF50A5r07
Watch this Geek Sync with special guest Microsoft Certified Master Kendra Little to learn why duplicate indexes can be a big problem for your performance and maintenance. Kendra will show you three different index consolidation scenarios, and how you can measure the pros and cons of different consolidation options.
This presentation can help you to apply partioning when appropriate, and to avoid problems when using it. The oneliner is: Simple Works Best. The illustrating demos are on Postgres12 (maybe -13 by the time of presenting) and show some of the problems and solutions that Partitioning can provide. Some of this “experience” is quite old and the demo runs near-identical on Oracle…
These problems are the same on any database.
Talk given for the #phpbenelux user group, March 27th in Gent (BE), with the goal of convincing developers that are used to build php/mysql apps to broaden their horizon when adding search to their site. Be sure to also have a look at the notes for the slides; they explain some of the screenshots, etc.
An accompanying blog post about this subject can be found at http://www.jurriaanpersyn.com/archives/2013/11/18/introduction-to-elasticsearch/
An overview and discussion on indexing data in Redis to facilitate fast and efficient data retrieval. Presented on September 22nd, 2014 to the Redis Tel Aviv Meetup.
Introduction to TokuDB v7.5 and Read Free ReplicationTim Callaghan
TokuDB v7.5 introduced Read Free Replication, allowing MySQL slaves to run with virtually no read IO. This presentation discusses how Fractal Tree indexes work, what they enable in TokuDB, and they allow TokuDB to uniquely offer this replication innovation.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is our journey of how we used MongoDB to combined traditional batch approaches with streaming technologies to provide continues alerting capabilities from real-time data streams.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
The Internet of Things (IoT) is a revolutionary concept that connects everyday objects and devices to the internet, enabling them to communicate, collect, and exchange data. Imagine a world where your refrigerator notifies you when you’re running low on groceries, or streetlights adjust their brightness based on traffic patterns – that’s the power of IoT. In essence, IoT transforms ordinary objects into smart, interconnected devices, creating a network of endless possibilities.
Here is a blog on the role of electrical and electronics engineers in IOT. Let's dig in!!!!
For more such content visit: https://nttftrg.com/
4. Engines covered
•InnoDB / XtraDB
•MyISAM
•PARTITIONing
•Not covered:
•NDB Cluster
•MEMORY
•FULLTEXT, Sphinx, GIS
•Except where noted, comments apply to both InnoDB/XtraDB and MyISAM
5. Yahoo! Confidential
Index ~= Table
•Each index is stored separately
•Index is very much like a table
•BTree structure
•InnoDB leaf: cols of PRIMARY KEY
•MyISAM leaf: row num or offset into data
("Leaf": a bottom node in a BTree)
6. Yahoo! Confidential
BTrees
•Efficient for keyed row lookup
•Efficient for “range” scan by key
•RoT ("Rule of Thumb): Fan-out of about 100 (1M rows = 3 levels)
•Best all-around index type
http://en.wikipedia.org/wiki/B-tree
http://upload.wikimedia.org/wikipedia/commons/thumb/6/65/B-tree.svg/500px-B-tree.svg.png
http://upload.wikimedia.org/wikipedia/commons/thumb/6/65/B-tree.svg/500px-B-tree.svg.png
7. Yahoo! Confidential
Index Attributes
•Diff Engines have diff attributes
•Limited combinations (unlike other vendors) of
•clustering (InnoDB PK)
•unique
•method (Btree)
8. KEY vs INDEX vs …
•KEY == INDEX
•UNIQUE is an INDEX
•PRIMARY KEY is UNIQUE
•At most 1 per table
•InnoDB must have one
•Secondary Key = any key but PRIMARY KEY
•FOREIGN KEY implicitly creates a KEY
9. Yahoo! Confidential
“Clustering”
•Consecutive things are ‘adjacent’ on disk ☺ , therefore efficient in disk I/O
•“locality of reference” (etc)
•Index scans are clustered
•But note: For InnoDB PK you have to step over rest of data
•Table scan of InnoDB PK – clustered by PK (only)
10. “Table Scan”, “Index Scan”
•What – go through whole data/index
•Efficient because of way BTree works
•Slow for big tables
•When – if more than 10-30% otherwise
•Usually “good” if picked by optimizer
•EXPLAIN says “ALL”
Yahoo! Confidential
11. Range / BETWEEN
A "range" scan is a "table" scan, but for less than the whole table
•Flavors of “range” scan
•a BETWEEN 123 AND 456
•a > 123
•Sometimes: IN (…)
Yahoo! Confidential
12. Common Mistakes
•“I indexed every column” – usually not useful.
•User does not understand “compound indexes”
•INDEX(a), INDEX(a, b) – redundant
•PRIMARY KEY(id), INDEX(id) – redundant
13. Size RoT
•1K rows & fit in RAM: rarely performance problems
•1M rows: Need to improve datatypes & indexes
•1B rows: Pull out all stops! Add on Summary tables, SSDs, etc.
15. The question
Q: "When was Andrew Johnson president of the US?” Table `Presidents`: +-----+------------+-----------+-----------+ | seq | last | first | term | +-----+------------+-----------+-----------+ | 1 | Washington | George | 1789-1797 | | 2 | Adams | John | 1797-1801 | ... | 7 | Jackson | Andrew | 1829-1837 | ... | 17 | Johnson | Andrew | 1865-1869 | ... | 36 | Johnson | Lyndon B. | 1963-1969 | ...
16. The question – in SQL
SELECT term FROM Presidents WHERE last = 'Johnson' AND first = 'Andrew';
What INDEX(es) would be best for that question?
18. No Indexes
The interesting rows in EXPLAIN:
type: ALL <-- Implies table scan
key: NULL <-- Implies that no index is useful, hence table scan
rows: 44 <-- That's about how many rows in the table, so table scan
Not good.
19. INDEX(first), INDEX(last)
Two separate indexes
MySQL rarely uses more than one index
Optimizer will study each index, decide that 2 rows come from each, and pick one.
EXPLAIN:
key: last
key_len: 92 ← VARCHAR(30) utf8: 2+3*30
rows: 2 ← two “Johnson”
20. INDEX(first), INDEX(last) (cont.)
What’s it doing?
1.With INDEX(last), it finds the Johnsons
2.Get the PK from index (InnoDB): [17,36]
3.Reach into data (2 BTree probes)
4.Use “AND first=…” to filter
5.Deliver answer (1865-1869)
21. Index Merge Intersect
(“Index Merge” is rarely used)
1.INDEX(last) → [7,17]
2.INDEX(first) → [17, 36]
3.“AND” the lists → [17]
4.BTree into data for the row
5.Deliver answer
type: index_merge
possible_keys: first, last
key: first, last
key_len: 92,92
rows: 1
Extra: Using intersect(first,last); Using where
22. INDEX(last, first)
1.Index BTree to the one row: [17]
2.PK BTree for data
3.Deliver answer
key_len: 184 ← length of both fields
ref: const, const ← WHERE had constants
rows: 1 ← Goodie
23. INDEX(last, first, term)
1.Index BTree using last & first; get to leaf
2.Leaf has the answer – Finished!
key_len: 184 ← length of both fields
ref: const, const ← WHERE had constants
rows: 1 ← Goodie
Extra: Using where; Using index ← Note
“Covering” index – “Using index”
24. Variants
•Reorder ANDs in WHERE – no diff
•Reorder cols in INDEX – big diff
•Extra fields on end of index – mostly harmless
•Redundancy: INDEX(a) + INDEX(a,b) – DROP shorter
•“Prefix” INDEX(last(5)) – rarely helps; can hurt
25. Variants – examples
INDEX(last, first)
•WHERE last=... – good
•WHERE last=... AND first=… – good
•WHERE first=... AND last=… – good
•WHERE first=... – index useless
INDEX(last) (applied above):
good, so-so, so-so, useless
26. Cookbook
SELECT → the optimal compound INDEX to make.
1.all fields in WHERE that are “= const” (any order)
2.One more field (no skipping!):
1. WHERE Range (BETWEEN, >, …)
2. GROUP BY
3. ORDER BY
27. Cookbook – IN
IN ( SELECT ... ) – Very poor opt. (until 5.6)
IN ( 1,2,... ) – Works somewhat like “=“.
29. Yahoo! Confidential
PRIMARY KEY
•By definition: UNIQUE & NOT NULL
•InnoDB PK:
•Leaf contains the data row
•So... Lookup by PK goes straight to row
•So... Range scans by PK are efficient
•(PK needed for ACID)
•MyISAM PK:
•Identical structure to secondary index
30. Yahoo! Confidential
Secondary Indexes
•BTree
•Leaf item points to data row
•InnoDB: pointer is copy of PRIMARY KEY
•MyISAM: pointer is offset to row
31. Yahoo! Confidential
"Using Index"
When a SELECT references only the fields in a Secondary index, only the secondary index need be touched. This is a performance bonus.
32. What should be the PK?
Plan A: A “natural”, such as a unique name; possibly compound
Plan B: An artificial INT AUTO_INCREMENT
Plan C: No PK – generally not good
Plan D: UUID/GUID/MD5 – inefficient due to randomness
33. AUTO_INCREMENT?
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
•Better than no key – eg, for maintenance
•Useful when “natural key” is bulky and lots of secondary keys; else unnecessary
•Note: each InnoDB secondary key includes the PK columns. (Bulky PK → bulky secondary keys)
34. Size of InnoDB PK
Each InnoDB secondary key includes the PK columns.
•Bulky PK → bulky secondary keys
•"Using index" may kick in – because you have the PK fields implicitly in the Secondary key
35. No PK?
InnoDB must have a PK:
1.User-provided (best)
2.First UNIQUE NOT NULL key (sloppy)
3.Hidden, inaccessible 6-byte integer (you are better off with your own A_I)
"Trust me, have a PK."
36. Redundant Index
PRIMARY KEY (id),
INDEX (id, x),
UNIQUE (id, y)
Since the PK is “clustered” in InnoDB, the other two indexes are almost totally useless. Exception: If the index is “covering”.
INDEX (x, id) – a different case
37. Compound PK - Relationship
CREATE TABLE Relationship (
foo_id INT …,
bar_id INT …,
PRIMARY KEY (foo_id, bar_id),
INDEX (bar_id, foo_id) –- if going both directions
) ENGINE=InnoDB;
39. Yahoo! Confidential
Normalizing (Mapping) table
Goal: Normalization – id ↔ value
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
name VARCHAR(255),
PRIMARY KEY (id),
UNIQUE (name)
In MyISAM add these to “cover”: INDEX(id,name), INDEX(name,id)
40. Normalizing BIG
id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
md5 BINARY(16/22/32) NOT NULL,
stuff TEXT/BLOB NOT NULL,
PRIMARY KEY (id),
UNIQUE (md5)
INSERT INTO tbl (md5, stuff) VALUES($m,$s)
ON DUPLICATE KEY UPDATE id=LAST_INDERT_ID(id);
$id = SELECT LAST_INSERT_ID();
Caveat: Dups burn ids.
41. Avoid Burn
1.UPDATE ... JOIN ... WHERE id IS NULL -- Get the ids (old) – Avoids Burn
2.INSERT IGNORE ... SELECT DISTINCT ... -- New rows (if any)
3.UPDATE ... JOIN ... WHERE id IS NULL -- Get the ids (old or new) – multi-thread is ok.
42. WHERE lat … AND lng …
•Two fields being range tested
•Plan A: INDEX(lat), INDEX(lng) – let optimizer pick
•Plan B: Complex subqueries / UNIONs beyond scope
•Plan C: Akiban
•Plan D: Partition on Latitude; PK starts with Longitude: http://mysql.rjweb.org/doc.php/latlng
Yahoo! Confidential
43. Yahoo! Confidential
Index on MD5 / GUID
•VERY RANDOM! Therefore,
•Once the index is bigger than can fit in RAM cache, you will be thrashing on disk
•What to do??
•Normalize
•Some other key
•PARTITION by date may help INSERTs
•http://mysql.rjweb.org/doc.php/uuid (type-1 only)
44. Yahoo! Confidential
Key-Value
•Flexible, expandable
•Clumsy, inefficient
•http://mysql.rjweb.org/doc.php/eav
•Horror story about RDF...
•Indexes cannot make up for the clumsiness
45. Yahoo! Confidential
ORDER BY RAND()
•No built-in optimizations
•Will read all rows, sort by RAND(), deliver the LIMIT
•http://mysql.rjweb.org/doc.php/random
46. Pagination
•ORDER BY … LIMIT 40,10 – Indexing won't be efficient
•→ Keep track of “left off”
•WHERE x > $leftoff ORDER BY … LIMIT 10
•LIMIT 11 – to know if there are more
•http://mysql.rjweb.org/doc.php/pagination
47. Latest 10 Articles
•Potentially long list
•of articles, items, comments, etc;
•you want the "latest"
But
•JOIN getting in the way, and
•INDEXes are not working for you
Then build an helper table with a useful index:
http://mysql.rjweb.org/doc.php/lists
48. LIMIT rows & get total count
•SELECT SQL_CALC_FOUND_ROWS … LIMIT 10
•SELECT FOUND_ROWS()
•If INDEX can be used, this is not “too” bad.
•Avoids a second SELECT
Yahoo! Confidential
49. ORDER BY x LIMIT 5
•Only if you get to the point of using x in the INDEX is the LIMIT going to be optimized.
•Otherwise it will
1.Collect all possible rows – costly
2.Sort by x – costly
3.Deliver first 5
Yahoo! Confidential
50. “It’s not using my index!”
SELECT … FROM tbl WHERE x=3;
INDEX (x)
•Case: few rows have x=3 – will use INDEX.
•Case: 10-30% match – might use INDEX
•Case: most rows match – will do table scan
The % depends on the phase of the moon
51. Getting ORDERed rows
Plan A: Gather the rows, filter via WHERE, deal with GROUP BY & DISTINCT, then sort (“filesort”).
Plan B: Use an INDEX to fetch the rows in the ‘correct’ order. (If GROUP BY is used, it must match the ORDER BY.)
The optimizer has trouble picking between them.
52. INDEX(a,b) vs (b,a)
INDEX (a, b) vs INDEX (b, a)
WHERE a=1 AND b=2 – both work equally well
WHERE a=1 AND b>2 – first is better
WHERE a>1 AND b>2 – each stops after 1st col
WHERE b=2 – 2nd only
WHERE b>2 – 2nd only
53. Compound “>”
•[assuming] INDEX(hr, min)
•WHERE (hr, min) >= (7,45) -- poorly optimized
•WHERE hr >= 7 AND min >= 45 – wrong
•WHERE (hr = 7 AND min >= 45) OR (hr > 7) – slow because of OR
•WHERE hr >= 7 AND (hr > 7 OR min >= 45) – better; [only needs INDEX(hr)]
•Use TIME instead of two fields! – even better
Yahoo! Confidential
54. UNION [ ALL | DISTINCT ]
•UNION defaults to UNION DISTINCT; maybe UNION ALL will do? (Avoids dedupping pass)
•Best practice: Explicitly state ALL or DISTINCT
Yahoo! Confidential
55. DISTINCT vs GROUP BY
•SELECT DISTINCT … GROUP BY → redundant
•To dedup the rows: SELECT DISTINCT
•To do aggregates: GSELECT GROUP BY
Yahoo! Confidential
56. OR --> UNION
•OR does not optimize well
•UNION may do better
SELECT ... WHERE a=1 OR b='x'
-->
SELECT ... WHERE a=1
UNION DISTINCT
SELECT ... WHERE b='x'
Yahoo! Confidential
58. EXPLAIN SELECT …
To see if your INDEX is useful
http://dev.mysql.com/doc/refman/5.5/en/explain-output.html
59. Yahoo! Confidential
EXPLAIN
•Run EXPLAIN SELECT ... to find out how MySQL might perform the query today.
•Caveat: Actual query may pick diff plan
•Explain says which key it will use; SHOW CREATE TABLE shows the INDEXes
•If using compound key, look at byte len to deduce how many fields are used.
<#>
60. Yahoo! Confidential
EXPLAIN – “using index”
•EXPLAIN says “using index”
•Benefit: Don’t need to hit data ☺
•How to achieve: All fields used are in one index
•InnoDB: Remember that PK field(s) are in secondary indexes
•Tip: Sometimes useful to add fields to index:
•SELECT a,b FROM t WHERE c=1
•SELECT b FROM t WHERE c=1 ORDER BY a
•SELECT b FROM t WHERE c=1 GROUP BY a
•INDEX (c,a,b)
61. EXPLAIN EXTENDED
EXPLAIN EXTENDED SELECT …;
SHOW WARNINGS;
The first gives an extra column.
The second details how the optimizer reformulated the SELECT. LEFT JOIN→JOIN and other xforms.
62. Yahoo! Confidential
EXPLAIN – filesort
•Filesort: ☹ But it is just a symptom.
•A messy query will gather rows, write to temp, sort for group/order, deliver
•Gathering includes all needed columns
•Write to tmp:
•Maybe MEMORY, maybe MyISAM
•Maybe hits disk, maybe not -- can't tell easily
63. “filesort”
These might need filesort:
•DISTINCT
•GROUP BY
•ORDER BY
•UNION DISTINCT
Possible to need multiple filsorts (but no clue)
Yahoo! Confidential
64. Yahoo! Confidential
“Using Temporary”
•if
•no BLOB, TEXT, VARCHAR > 512, FULLTEXT, etc (MEMORY doesn’t handle them)
•estimated data < max_heap_table_size
•others
•then “filesort” is done using the MEMORY engine (no disk)
•VARCHAR(n) becomes CHAR(n) for MEMORY
•utf8 takes 3n bytes
•else MyISAM is used
65. EXPLAIN PARTITIONS SELECT
Check whether the “partition pruning” actually pruned.
The “first” partition is always included when the partition key is DATE or DATETIME. This is to deal with invalid dates like 20120500.
Tip: Artificial, empty, “first” partition.
66. INDEX cost
•An INDEX is a BTree.
•Smaller than data (usually)
•New entry added during INSERT (always up to date)
•UPDATE of indexed col -- juggle index entry
•Benefit to SELECT far outweighs cost of INSERT (usually)
68. Yahoo! Confidential
Add-an-Index-Cure (not)
•Normal learning curve:
•Stage 1: Learn to build table
•Stage 2: Learn to add index
•Stage 3: Indexes are a panacea, so go wild adding indexes
•Don’t go wild. Every index you add costs something in
•Disk space
•INSERT/UPDATE time
69. Yahoo! Confidential
OR → UNION
•INDEX(a), INDEX(b) != INDEX(a, b)
•Newer versions sometimes use two indexes
•WHERE a=1 OR b=2 =>
(SELECT ... WHERE a=1)
UNION
(SELECT ... WHERE b=2)
70. Subqueries – Inefficient
Generally, subqueries are less efficient than the equivalent JOIN.
Subquery with GROUP BY or LIMIT may be efficient
5.6 and MariaDB 5.5 do an excellent job of making most subqueries perform well
Yahoo! Confidential
71. Subquery Types
SELECT a, (SELECT …) AS b FROM …;
RoT: Turn into JOIN if no agg/limit
RoT: Leave as subq. if aggregation
SELECT … FROM ( SELECT … );
Handy for GROUP BY or LIMIT
SELECT … WHERE x IN ( SELECT … );
SELECT … FROM ( SELECT … ) a
JOIN ( SELECT … ) b ON …;
Usually very inefficient – do JOIN instead (Fixed in 5.6 and MariaDB 5.5)
Yahoo! Confidential
72. Subquery – example of utility
•You are SELECTing bulky stuff (eg TEXT/BLOB)
•WHERE clause could be entirely indexed, but is messy (JOIN, multiple ranges, ORs, etc)
•→ SELECT a.text, … FROM tbl a
JOIN ( SELECT id FROM tbl WHERE …) b
ON a.id = b.id;
•Why? Smaller “index scan” than “table scan”
Yahoo! Confidential
73. Extra filesort
•“ORDER BY NULL” – Eh? “I don’t care what order”
•GROUP BY may sort automatically
•ORDER BY NULL skips extra sort if GROUP BY did not sort
•Non-standardNo
Yahoo! Confidential
74. Yahoo! Confidential
USE, FORCE ("hints")
•SELECT ... FROM foo USE INDEX(x)
•RoT: Rarely needed
•Sometimes ANALYZE TABLE fixes the ‘problem’ instead, by recalculating the “statistics”.
•RoT: Inconsistent cardinality → FORCE is a mistake.
•STRAIGHT_JOIN forces order of table usage (use sparingly)
76. •VARCHAR (utf8: 3x, utf8mb4: 4x) → VARBINARY (1x)
•INT is 4 bytes → SMALLINT is 2 bytes, etc
•DATETIME → TIMESTAMP (8*:4)
•DATETIME → DATE (8*:3)
•Normalize (id instead of string)
•VARCHAR → ENUM (N:1)
Yahoo! Confidential
Field Sizes
77. Yahoo! Confidential
Smaller → Cacheable → Faster
•Fatter fields → fatter indexes → more disk space → poorer caching → more I/O → poorer performance
•INT is better than a VARCHAR for a url
•But this may mean adding a mapping table
78. WHERE fcn(col) = ‘const’
•No functions!
•WHERE <fcn>(<indexed col>) = …
•WHERE lcase(name) = ‘foo’
•Add extra column; index `name`
•Hehe – in this example lcase is unnecessary if using COLLATE *_ci !
Yahoo! Confidential
79. Date Range
•WHERE dt BETWEEN ‘2009-02-27’ AND ‘2009-03-02’ →
•“Midnight problem”
WHERE dt >= ‘2009-02-27’
AND dt < ‘2009-02-27’ + INTERVAL 4 DAY
•WHERE YEAR(dt) = ‘2009’ →
•Function precludes index usage
WHERE dt >= ‘2009-01-01’
AND dt < ‘2009-01-01’ + INTERVAL 1 YEAR
80. WHERE utf8 = latin1
•Mixed character set tests (or mixed collation tests) tend not to use INDEX
oDeclare VARCHAR fields consistently
DD
•WHERE foo = _utf8 'abcd'
Yahoo! Confidential
81. Don’t index sex
•gender CHAR(1) CHARSET ascii
•INDEX(gender)
•Don’t bother!
•WHERE gender = ‘F’ – if it occurs > 10%, index will not be used
82. Prefix Index
•INDEX(a(10)) – Prefixing usually bad
•May fail to use index when it should
•May not use subsequent fields
•Must check data anyway
•Etc.
•UNIQUE(a(10)) constrains the first 10 chars to be unique – probably not what you wanted!
•May be useful for TEXT/BLOB
Yahoo! Confidential
83. VARCHAR – VARBINARY
•Collation takes some effort
•UTF8 may need 3x the space (utf8mb4: 4x)
•CHAR, TEXT – collated (case folding, etc)
•BINARY, BLOB – simply compare the bytes
•Hence… MD5s, postal codes, IP addresses, etc, should be BINARY or VARBINARY
Yahoo! Confidential
84. IP Address
•VARBINARY(39)
•Avoids unnecessary collation
•Big enough for Ipv6
•BINARY(16)
•Smaller
•Sortable, Range-scannable
•http://mysql.rjweb.org/doc.php/ipranges
Yahoo! Confidential
90. PARTITION Keys
•Either:
•No UNIQUE or PRIMARY KEY, or
•All Partition-by fields must be in all UNIQUE/PRIMARY KEYs
•(Even if artificially added to AI)
•RoT: Partition fields should not be first in keys
•Sorta like getting two-dimensional index - - first is partition 'pruning', then PK.
Yahoo! Confidential
91. PARTITION Use Cases
•Possible use cases
•Time series
•DROP PARTITION much better than DELETE
•“two” clustered indexes
•random index and most of effort spent in last partition
Yahoo! Confidential
92. PARTITION RoTs
Rules of Thumb
•Reconsider PARTITION – often no benefit
•Don't partition if under 1M rows
•BY RANGE only
•No SUBPARTITIONs
http://mysql.rjweb.org/doc.php/ricksrots#partitioning
93. PARTITION Pruning
•Uses WHERE to pick some partition(s)
•Sort of like having an extra dimension
•Don't need to pick partition (cannot until 5.6)
•Each "partition" is like a table
Yahoo! Confidential
95. MyISAM vs InnoDB Keys
InnoDB PK is “clustered” with the data
•PK lookup finds row
•Secondary indexes use PK to find data
MyISAM PK is just like secondary indexes
•All indexes (in .MYI) point to data (in .MYD) via row number or byte offset
http://mysql.rjweb.org/doc.php/myisam2innodb
96. Yahoo! Confidential
Caching
•MyISAM: 1KB BTree index blocks are cached in “key buffer”
•key_buffer_size
•Recently lifted 4GB limit
•InnoDB: 16KB BTree index and data blocks are cached in buffer pool
•innodb_buffer_pool_size
•The 16K is settable (rare cases)
•MyISAM has “delayed key write” – probably rarely useful, especially with RAID & BBWC
97. Yahoo! Confidential
4G in MyISAM
•The “pointer” in MyISAM indexes is fixed at N bytes.
•Old versions defaulted to 4 bytes (4G)
•5.1 default: 6 bytes (256T)
•Fixed/Dynamic
•Fixed length rows (no varchar, etc): Pointer is row number
•Dynamic: Pointer is byte offset
•Override/Fix: CREATE/ALTER TABLE ... MAX_ROWS = ...
•Alter is slow
99. Impact on INSERT / DELETE
•Write operations need to update indexes – sooner or later
•Performance
•INSERT at end = hot spot there
•Random key = disk thrashing
•Minimize number of indexes, especially random
Yahoo! Confidential
100. WHERE name LIKE ‘Rick%’
•WHERE name LIKE ‘Rick%’
•INDEX (name) – “range”
•WHERE name LIKE ‘%James’
•won’t use index
Yahoo! Confidential
101. WHERE a=1 GROUP BY b
• WHERE a=1 GROUP BY b
WHERE a=1 ORDER BY b
WHERE a=1 GROUP BY b ORDER BY b
•INDEX(a, b) – nice for those
•WHERE a=1 GROUP BY b ORDER BY c
•INDEX(a, b, c) – no better than (a,b)
Yahoo! Confidential
102. WHERE a > 9 ORDER BY a
•WHERE a > 9 ORDER BY a
•INDEX (a) – will catch both the WHERE and the ORDER BY ☺
•WHERE b=1 AND a > 9 ORDER BY a
•INDEX (b, a)
Yahoo! Confidential
103. Yahoo! Confidential
GROUP BY, ORDER BY
•if there is a compound key such that
•WHERE is satisfied, and
•there are more fields in the key,
•then, MySQL will attempt to use more fields in the index for GROUP BY and/or ORDER BY
•GROUP BY aa ORDER BY bb → extra “filesort”
104. Yahoo! Confidential
ORDER BY, LIMIT
•If you get all the way through the ORDER BY, still using the index, and you have LIMIT, then the LIMIT is done efficiently.
•If not, it has to gather all the data, sort it, finally deliver what LIMIT says.
•This is the “Classic Meltdown Query”.
105. GROUP+ORDER+LIMIT
•Efficient:
•WHERE a=1 GROUP BY b INDEX(a,b)
•WHERE a=1 ORDER BY b LIMIT 9 INDEX(a,b)
•GROUP BY b ORDER BY c INDEX(b,c)
•Inefficient:
•WHERE x.a=1 AND y.c=2 GROUP/ORDER/LIMIT
•(because of 2 tables)
Yahoo! Confidential
106. Yahoo! Confidential
Index Types (BTree, etc)
•BTree
•most used, most general
•Hash
•MEMORY Engine only
•useless for range scan
•Fulltext
•Pretty good for “word” searches in text
•GIS (Spatial) (2D)
•No bit, etc.
107. FULLTEXT index
•“Words”
•Stoplist excludes common English words
•Min length defaults to 4
•Natural
•IN BOOLEAN MODE
•Trumps other INDEXes
•Serious competitors: Lucene, Sphinx
•MyISAM only until 5.6.4
oMultiple diffs in InnoDB FT
Yahoo! Confidential
108. AUTO_INCREMENT index
•AI field must be first in some index
•Need not be UNIQUE or PRIMARY
•Can be compound (esp. for PARTITION)
•Could explicitly add dup id (unless ...)
•(MyISAM has special case for 2nd field)
109. RoTs
Rules of Thumb
•100 I/Os / sec (500/sec for SSD)
•RAID striping (1,5,6,10) – divide time by striping factor
•RAID write cache – writes are “instantaneous” but not sustainable in the long haul
•Cached fetch is 10x faster than uncached
•Query Cache is useless (in heavy writes)
110. Yahoo! Confidential
Low cardinality, Not equal
•WHERE deleted = 0
•WHERE archived != 1
•These are likely to be poorly performing queries. Characteristics:
•Poor cardinality
•Boolean
•!=
•Workarounds
•Move deleted/hidden/etc rows into another table
•Juggle compound index order (rarely works)
•"Cardinality", by itself, is rarely of note
111. Not NOT
•Rarely uses INDEX:
•NOT LIKE
•NOT IN
•NOT (expression)
•<>
•NOT EXISTS ( SELECT * … ) – essentially a LEFT JOIN; often efficient
Yahoo! Confidential
112. Replication
•SBR
•Replays query
•Slave could be using different Engine and/or Indexes
•RBR
•PK important
113. Index Limits
•Index width – 767B per column
•Index width – 3072B total
•Number of indexes – more than you should have
•Disk size – terabytes
114. Location
•InnoDB, file_per_table – .ibd file
•InnoDB, old – ibdata1
•MyISAM – .MYI
•PARTITION – each partition looks like a separate table
115. ALTER TABLE
1.copy data to tmp
2.rebuild indexes (on the fly, or separately)
3.RENAME into place
Even ALTERs that should not require the copy do so. (few exceptions)
RoT: Do all changes in a single ALTER. (some PARTITION exceptions)
5.6 fixes most of this
116. Tunables
•InnoDB indexes share caching with data in innodb_buffer_pool_size – recommend 70% of available RAM
•MyISAM indexes, not data, live in key_buffer_size – recommend 20% of available RAM
•log_queries_not_using_indexes – don’t bother