Slow query? Add an index or two! But things are suddenly even slower! Indexes are great tools to speed up data lookups, but they add overhead to every write. Histograms don’t have that overhead but are not suited to every case. And how you lock rows also affects performance. So what do you do to speed up queries smartly?
2. Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes
only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing decisions. The development, release,
timing, and pricing of any features or functionality described for Oracle’s products may change and
remains at the sole discretion of Oracle Corporation.
Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and
prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed
discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and
Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q
under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website
at http://www.oracle.com/investor. All information in this presentation is current as of September 2019 and
Oracle undertakes no duty to update any statement in light of new information or future events.
5. What Is This Session About?
Nobody ever complains that the database is too fast!
Speeding up queries is not a ‘dark art’
But understanding how to speed up queries is often treated
as magic
So we will be looking at the proper use of indexes,
histograms, locking options, and some other ways to speed
queries up.
7. Normalize Your Data (not covered today)
● You cannot build a skyscraper on sand
● Third normal form or better
○ Use JSON for ‘stub’ table data, avoiding
repeated unneeded index/table dives
● Think about how you will use your data
8. The Optimizer
● Consider the optimizer the brain
and nervous system of the
system.
● Query optimization is a feature
of many relational database
management systems.
● The query optimizer attempts to
determine the most efficient
way to execute a given query
by considering the possible
query plans. -- Wikipedia
9. The query optimizer evaluates the options
● Like Google Maps (or a similar app), the
optimizer wants to get your data the
cheapest way possible (the fewest
expensive disk reads).
● Like a GPS, the cost is built on historical
statistics. And these statistics can
change while the optimizer is working. So
like a traffic jam, washed out road, or other
traffic problem, the optimizer may be
making poor decisions.
● The final determination from the
optimizer is called the query plan.
● MySQL wants to optimize each query
every time it sees it – there is no
locking down the query plan like Oracle
(cough cough).
You will see how to obtain a query plan
later in this presentation.
10. But now for something completely different
The tools for looking at queries
● EXPLAIN
○ EXPLAIN FORMAT=
■ JSON
■ TREE
○ ANALYZE
● VISUAL EXPLAIN
12. EXPLAIN Example
EXPLAIN is used to obtain a query execution plan (an explanation of how MySQL would
execute a query) and should be considered an ESTIMATE as it does not run the query.
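A minimal sketch of two of the output formats listed on the tools slide, assuming the world sample database used elsewhere in this deck is installed. Neither statement runs the query; FORMAT=JSON adds the optimizer's cost estimates to the plan.

```sql
-- Assumes the world sample schema (with its City table) is loaded.
USE world;

-- Traditional tabular plan: one output row per table in the query.
EXPLAIN SELECT Name FROM City WHERE CountryCode = 'GBR';

-- The same plan as a JSON document, including a cost_info section.
EXPLAIN FORMAT=JSON SELECT Name FROM City WHERE CountryCode = 'GBR';
```

In the mysql command-line client, ending a statement with \G instead of a semicolon prints each column on its own line, which is how the examples in these slides are displayed.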
14. EXPLAIN FORMAT=TREE
EXPLAIN FORMAT=TREE SELECT * FROM City
JOIN Country ON (City.Population = Country.Population);
-> Inner hash join (city.Population = country.Population) (cost=100154.05 rows=100093)
    -> Table scan on City (cost=0.30 rows=4188)
    -> Hash
        -> Table scan on Country (cost=29.90 rows=239)
15. EXPLAIN ANALYZE – MySQL 8.0.18
EXPLAIN ANALYZE SELECT * FROM City WHERE CountryCode = 'GBR'\G
*************************** 1. row ***************************
EXPLAIN: -> Index lookup on City using CountryCode (CountryCode='GBR')
(cost=80.76 rows=81) (actual time=0.153..0.178 rows=81 loops=1)
1 row in set (0.0008 sec)
MySQL 8.0.18 introduced EXPLAIN ANALYZE, which runs the query and produces
EXPLAIN output along with timing and additional, iterator-based, information about
how the optimizer's expectations matched the actual execution.
18. Indexes
A database index is a data structure that improves the speed of data retrieval
operations on a database table at the cost of additional writes and storage space
to maintain the index data structure.
Indexes are used to quickly locate data without having to search every row in a
database table every time a database table is accessed.
Indexes can be created using one or more columns of a database table, providing
the basis for both rapid random lookups and efficient access of ordered records.
-- https://en.wikipedia.org/wiki/Database_index
19. Indexes
An index is a copy of selected columns of data from a table, called a database key
or simply key, that can be searched very efficiently that also includes a low-level
disk block address or direct link to the complete row of data it was copied from.
Some databases extend the power of indexing by letting developers create
indexes on functions or expressions. For example, an index could be created on
upper(last_name), which would only store the upper-case versions of the
last_name field in the index.
-- https://en.wikipedia.org/wiki/Database_index
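MySQL supports this kind of index as a "functional key part" from 8.0.13 onward. The following is a minimal sketch, assuming the sakila sample database's customer table; the index name idx_upper_last_name is made up for illustration.

```sql
-- Functional key part: note the double parentheses around the expression.
-- idx_upper_last_name is a hypothetical name.
CREATE INDEX idx_upper_last_name ON customer ((UPPER(last_name)));

-- The optimizer can only consider the index when the query uses
-- the same expression that was indexed.
EXPLAIN SELECT customer_id, last_name
FROM customer
WHERE UPPER(last_name) = 'SMITH';
```

Like any other index, this one has to be maintained on every write to last_name, so it is worth checking with EXPLAIN that it is actually chosen.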
20. Indexes
Another option sometimes supported is the use of partial indices, where index
entries are created only for those records that satisfy some conditional expression.
A further aspect of flexibility is to permit indexing on user-defined functions, as
well as expressions formed from an assortment of built-in functions.
-- https://en.wikipedia.org/wiki/Database_index
21. Think of an Index as a Table
With Shortcuts to another
table!
And the more tables you have to read to get to the
data, the slower things run!
(Now for a very bad example with humans)
22. MySQL has Two Types of Index Structures
● B-Tree is a self-balancing tree data structure that maintains sorted data and
allows searches, sequential access, insertions, and deletions.
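● Hash indexes are fast for exact-match lookups (= or <=>) but cannot be used
for range scans or sorted access. Hash joins, likewise, are more efficient than
nested loop joins, except when the probe side of the join is very small, but
can only be used to compute equijoins.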
23. NULL
● NULL is used to designate a
LACK of data.
False 0
True 1
Don’t know NULL
24. Why Avoid NULL
● Imagine you have a nice column
of numbers with high cardinality
and then some rows with
NULLs as values. And your
optimizer has to pick through
the well ordered data and then
muck through the NULLs.
● https://dev.mysql.com/doc/refman/8.0/en/problems-with-null.html
25. Indexing NULL values
really drives down the
performance of
INDEXes -
Cardinal Values and ‘junk drawer’ of nulls
27. Before Invisible Indexes
1. Doubt usefulness of index
2. Check using EXPLAIN
3. Remove that Index
4. Rerun EXPLAIN
5. Get phone/text/screams from power user about slow query
6. Suddenly realize that the index in question may have had no use for you but
the rest of the planet seems to need that dang index!
7. Take seconds/minutes/hours/days/weeks rebuilding that index
28. After Invisible Indexes
1. Doubt usefulness of index
2. Check using EXPLAIN
3. Make index invisible – optimizer cannot see that index!
4. Rerun EXPLAIN
5. Get phone/text/screams from power user about slow query
6. Make index visible
7. Blame problem on { network | JavaScript | GDPR | Slack | Cloud }
29. How to use INVISIBLE INDEX
ALTER TABLE t1 ALTER INDEX i_idx INVISIBLE;
ALTER TABLE t1 ALTER INDEX i_idx VISIBLE;
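A short sketch of how one might check and test an invisible index, reusing the t1 table and i_idx index from the slide; the column name i in the test query is an assumption made for illustration.

```sql
-- Which indexes on t1 are currently hidden from the optimizer?
SELECT DISTINCT INDEX_NAME, IS_VISIBLE
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 't1';

-- Let this session alone consider invisible indexes, so plans can be
-- compared without affecting any other connection.
SET SESSION optimizer_switch = 'use_invisible_indexes=on';
EXPLAIN SELECT * FROM t1 WHERE i = 42;   -- column i is hypothetical
SET SESSION optimizer_switch = 'use_invisible_indexes=off';
```

An invisible index is still maintained on every write, so making it invisible tests whether queries need it but does not remove its overhead.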
30. Histograms
What is a histogram?
It is not a gluten free, keto friendly biscuit.
32. Histograms
What is a histogram?
Wikipedia declares a histogram is an accurate representation of the distribution of
numerical data. For RDBMS, a histogram is an approximation of the data
distribution within a specific column.
So in MySQL, histograms help the optimizer to find the most efficient Query Plan
to fetch that data.
33. Histograms
What is a histogram?
A histogram is a distribution of data into logical buckets.
There are two types:
Singleton or Equi-Height
Maximum number of buckets is 1024
34. Histogram
Histogram statistics are useful primarily for non-indexed columns. Adding an
index to a column for which histogram statistics are applicable might also help
the optimizer make row estimates.
The tradeoffs are:
An index must be updated when table data is modified.
A histogram is created or updated only on demand, so it adds no
overhead when table data is modified. On the other hand, the statistics become
progressively more out of date when table modifications occur, until the next time
they are updated.
35. The Optimizer
Occasionally the query optimizer fails to find the most efficient plan and ends up
spending a lot more time executing the query than necessary.
Without better information, the optimizer assumes that the data is evenly distributed
in the column. This can be the old ‘assume’ makes an ‘ass out of you and me’ joke
brought to life.
The main reason for a poor plan is often that the optimizer doesn’t have enough
knowledge about the data it is about to query:
● How many rows are there in each table?
● How many distinct values are there in each column?
● How is the data distributed in each column?
36. Frequency Histogram
Each distinct value has its own
bucket.
Because this is a frequency
histogram, the endpoint number is
the cumulative frequency of
endpoints.
For 52793, the endpoint number 6
indicates that the value appears 5
times (6 - 1).
For 52794, the endpoint number 8
indicates that the value appears 2
times (8 - 6).
37. Frequency Histogram
The optimizer calculates their
cardinality based on endpoint
numbers.
For example, the optimizer
calculates the cardinality (C) of
value 52799 using the following
formula, where the number of rows
in the table is 23:
C = 23 * ( 9 / 23 )
38. Creating and removing Histograms
ANALYZE TABLE t UPDATE HISTOGRAM ON c1, c2, c3 WITH 10 BUCKETS;
ANALYZE TABLE t UPDATE HISTOGRAM ON c1, c3 WITH 10 BUCKETS;
ANALYZE TABLE t DROP HISTOGRAM ON c2;
Note the first statement creates three different histograms on c1, c2, and c3.
39. Singleton histograms: The buckets contain two values:
● Value 1: The value for the bucket. The type depends on the column data type.
● Value 2: A double representing the cumulative frequency for the value. For
example, .25 and .75 indicate that 25% and 75% of the values in the column
are less than or equal to the bucket value.
40. Equi-height histograms: buckets contain four values:
● Values 1, 2: The lower and upper inclusive values for the bucket. The type
depends on the column data type.
● Value 3: A double representing the cumulative frequency for the value. For
example, .25 and .75 indicate that 25% and 75% of the values in the column
are less than or equal to the bucket upper value.
● Value 4: The number of distinct values in the range from the bucket lower
value to its upper value.
41. Information about Histograms
mysql> SELECT TABLE_NAME, COLUMN_NAME,
HISTOGRAM->>'$."data-type"' AS 'data-type',
JSON_LENGTH(HISTOGRAM->>'$."buckets"') AS 'bucket-count'
FROM INFORMATION_SCHEMA.COLUMN_STATISTICS;
+-----------------+-------------+-----------+--------------+
| TABLE_NAME | COLUMN_NAME | data-type | bucket-count |
+-----------------+-------------+-----------+--------------+
| country | Population | int | 226 |
| city | Population | int | 1024 |
| countrylan | Language | string | 457 |
+-----------------+-------------+-----------+--------------+
42. Where Histograms Shine
create table h1 (id int unsigned auto_increment,
x int unsigned, primary key(id));
insert into h1 (x) values (1),(1),(2),(2),(2),(3),(3),(3),(3);
select x, count(x) from h1 group by x;
+---+----------+
| x | count(x) |
+---+----------+
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
+---+----------+
3 rows in set (0.0011 sec)
43. Where Histograms Shine
EXPLAIN SELECT * FROM h1 WHERE x > 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: h1
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 9
filtered: 33.32999801635742 – With no statistics, the optimizer guesses that about 1/3 of the rows match
Extra: Using where
1 row in set, 1 warning (0.0007 sec)
44. Where Histograms Shine
ANALYZE TABLE h1 UPDATE HISTOGRAM ON x WITH 3 BUCKETS\G
*************************** 1. row ***************************
Table: demox.h1
Op: histogram
Msg_type: status
Msg_text: Histogram statistics created for column 'x'.
1 row in set (0.0819 sec)
45. Where Histograms Shine
EXPLAIN SELECT * FROM h1 WHERE x > 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: h1
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 9
filtered: 100 – all rows!!!
Extra: Using where
1 row in set, 1 warning (0.0007 sec)
46. With EXPLAIN ANALYZE
EXPLAIN ANALYZE SELECT * FROM h1 WHERE x > 0\G
*************************** 1. row ***************************
EXPLAIN: -> Filter: (h1.x > 0) (cost=1.15 rows=9) (actual time=0.027..0.034 rows=9 loops=1)
    -> Table scan on h1 (cost=1.15 rows=9) (actual time=0.025..0.030 rows=9 loops=1)
47. Where Histograms Shine
SELECT * FROM information_schema.column_statistics WHERE table_name = 'h1'\G
*************************** 1. row ***************************
SCHEMA_NAME: demox
TABLE_NAME: h1
COLUMN_NAME: x
HISTOGRAM: {"buckets":
[[1, 0.2222222222222222], [2, 0.5555555555555556], [3, 1.0]],
"data-type": "int", "null-values": 0.0, "collation-id": 8,
"last-updated": "2020-01-29 11:25:36.121891",
"sampling-rate": 1.0,
"histogram-type": "singleton",
"number-of-buckets-specified": 3}
1 row in set (0.0012 sec)
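The bucket values above are cumulative frequencies, so the share of rows for each value is the difference between neighbouring buckets. The following is a sketch of how that could be read out of the stored histogram, assuming the demox.h1 table from this example and a MySQL version with JSON_TABLE and window functions (8.0.4 or later); the aliases are made up for illustration.

```sql
-- Unpack the singleton histogram for demox.h1.x into one row per bucket.
-- bucket_value, cumulative_freq, and value_freq are illustrative aliases.
SELECT jt.bucket_value,
       jt.cumulative_freq,
       -- Per-value share = this bucket's cumulative frequency
       -- minus the previous bucket's (0 for the first bucket).
       jt.cumulative_freq
         - LAG(jt.cumulative_freq, 1, 0) OVER (ORDER BY jt.cumulative_freq)
         AS value_freq
FROM INFORMATION_SCHEMA.COLUMN_STATISTICS AS cs,
     JSON_TABLE(cs.HISTOGRAM->'$.buckets', '$[*]'
       COLUMNS (bucket_value    INT    PATH '$[0]',
                cumulative_freq DOUBLE PATH '$[1]')) AS jt
WHERE cs.SCHEMA_NAME = 'demox'
  AND cs.TABLE_NAME  = 'h1'
  AND cs.COLUMN_NAME = 'x';
```

For the histogram shown here, the per-value shares should come out near 2/9, 3/9, and 4/9, which matches the GROUP BY counts of 2, 3, and 4 out of 9 rows.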
48. Think of a Histogram as
ordered buckets of your
data
But not exactly dynamic
49. Performance is not just Indexes and Histograms
● There are many other ‘tweaks’ that can be made to speed things up
● Use EXPLAIN to see what your query is doing
○ Are there file sorts, full table scans, temporary tables, etc.?
○ Does the join order look right?
○ Are the buffers and caches big enough?
○ Do you have enough memory?
○ Are disk and I/O speeds sufficient?
50. Locking Options
MySQL 8.0 added two locking options:
● NOWAIT
○ A locking read that uses NOWAIT never waits to acquire a row lock. The query executes
immediately, failing with an error if a requested row is locked.
● SKIP LOCKED
○ A locking read that uses SKIP LOCKED never waits to acquire a row lock. The query executes
immediately, removing locked rows from the result set.
51. Locking Examples – Buying concert tickets
START TRANSACTION;
SELECT seat_no, row_no, cost
FROM seats s
JOIN seat_rows sr USING ( row_no )
WHERE seat_no IN ( 3,4 ) AND sr.row_no IN ( 5,6 )
AND booked = 'NO'
FOR UPDATE OF s SKIP LOCKED;
Let’s shop for tickets in rows 5 or 6 and seats 3 & 4 but we skip any locked
records!
52. LOCK NOWAIT
START TRANSACTION;
SELECT seat_no
FROM seats JOIN seat_rows USING ( row_no )
WHERE seat_no IN (3,4) AND seat_rows.row_no IN (12)
AND booked = 'NO'
FOR UPDATE OF seats SKIP LOCKED
FOR SHARE OF seat_rows NOWAIT;
Without NOWAIT, this query would have waited for innodb_lock_wait_timeout
(default: 50) seconds while attempting to acquire the shared lock on seat_rows.
With NOWAIT, an error is returned immediately:
ERROR 3572 (HY000): Do not wait for lock.
54. Resource groups – setting and using
CREATE RESOURCE GROUP Batch
TYPE = USER VCPU = 2-3 -- assumes a system with at least 4 CPUs
THREAD_PRIORITY = 10;
SET RESOURCE GROUP Batch;
or
INSERT /*+ RESOURCE_GROUP(Batch) */ INTO t2 VALUES(2);
55. Optimizer Hints
SELECT /*+ JOIN_ORDER(t1, t2) */ ... FROM t1, t2;
Optimizer hints can be specified within individual statements. Because the
optimizer hints apply on a per-statement basis, they provide finer control over
statement execution plans than can be achieved using optimizer_switch.
For example, you can enable an optimization for one table in a statement and
disable the optimization for a different table.
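As a sketch of the per-table control described above, the following enables batched key access (BKA) for one table and disables it for another within a single statement. It assumes the world sample database; the aliases co, ci, and cl are made up for illustration, and whether the plan actually changes depends on the data and server settings.

```sql
-- Hints name the table aliases used in the query (co, ci, cl are
-- illustrative). BKA is allowed for ci and switched off for cl.
EXPLAIN
SELECT /*+ BKA(ci) NO_BKA(cl) */
       co.Name, ci.Name, cl.Language
FROM Country AS co
JOIN City AS ci            ON ci.CountryCode = co.Code
JOIN CountryLanguage AS cl ON cl.CountryCode = co.Code
WHERE co.Continent = 'Europe';
```

A hint that the optimizer cannot apply is not an error; it is reported as a warning, so running SHOW WARNINGS after the EXPLAIN is a reasonable habit.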
56. Partitioning
● In MySQL 8.0, partitioning support is provided by the InnoDB and NDB
storage engines.
● Partitioning enables you to distribute portions of individual tables across a file
system according to rules which you can set largely as needed. In effect,
different portions of a table are stored as separate tables in different
locations.
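A minimal sketch of range partitioning on InnoDB. The table sales_by_year and all of its columns are hypothetical, invented only to illustrate the syntax.

```sql
-- Hypothetical table: one partition per year, plus a catch-all.
-- The partitioning column must be part of every unique key,
-- hence the composite primary key.
CREATE TABLE sales_by_year (
  id      INT UNSIGNED  NOT NULL,
  sold_on DATE          NOT NULL,
  amount  DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id, sold_on)
)
PARTITION BY RANGE (YEAR(sold_on)) (
  PARTITION p2018 VALUES LESS THAN (2019),
  PARTITION p2019 VALUES LESS THAN (2020),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- The partitions column of EXPLAIN lists which partitions the
-- optimizer expects to read for this date range.
EXPLAIN SELECT * FROM sales_by_year
WHERE sold_on BETWEEN '2019-01-01' AND '2019-12-31';
```

The benefit comes when a query's filter lets the server skip whole partitions; queries that do not filter on the partitioning column still touch every partition.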
57. Multi-value indexes
● You can now have more than one index entry pointing at the same row!
○ Very useful for JSON arrays
mysql> SELECT 3 MEMBER OF('[1, 3, 5, 7, "Moe"]');
+--------------------------------------+
| 3 MEMBER OF('[1, 3, 5, 7, "Moe"]') |
+--------------------------------------+
| 1 |
+--------------------------------------+
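A sketch of a multi-valued index over a JSON array, modelled on the example in the MySQL manual and assuming MySQL 8.0.17 or later. The table name customers_mv and its sample rows are made up for illustration.

```sql
-- Hypothetical table; each element of the zipcode array gets its
-- own entry in the zips index.
CREATE TABLE customers_mv (
  id       INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  custinfo JSON,
  INDEX zips ( (CAST(custinfo->'$.zipcode' AS UNSIGNED ARRAY)) )
);

INSERT INTO customers_mv (custinfo) VALUES
  ('{"user": "Jack", "zipcode": [94582, 94536]}'),
  ('{"user": "Jill", "zipcode": [94568, 94507, 94582]}');

-- MEMBER OF is one of the operators that can use the index.
EXPLAIN SELECT * FROM customers_mv
WHERE 94582 MEMBER OF (custinfo->'$.zipcode');
```

JSON_CONTAINS() and JSON_OVERLAPS() can also use a multi-valued index when they are applied to the same JSON path.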
58. EXPLAIN Explained
This is all in Chapter 8 of the MySQL 8.0 manual in
greater detail, so if we run out of time…
59. Cost Based Optimizer
The way the optimizer makes its
decisions on how to execute queries
is based on a methodology called
cost-based optimization. A
simplification of this process is as
follows:
● Assign a cost to each operation.
● Evaluate how many operations
each possible plan would take.
● Sum up the total.
● Choose the plan with the lowest
overall cost.
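The per-operation costs and the total for a chosen plan can be inspected directly. This is a sketch, assuming the world sample database and an account allowed to read the mysql system schema.

```sql
-- Cost constants the optimizer assigns to individual operations.
-- A NULL cost_value means the built-in default is in effect.
SELECT cost_name, cost_value FROM mysql.server_cost;
SELECT cost_name, cost_value FROM mysql.engine_cost;

-- Run a query, then ask what total cost the optimizer computed
-- for the plan it chose.
SELECT Name FROM world.City WHERE CountryCode = 'GBR';
SHOW STATUS LIKE 'Last_query_cost';
```

Last_query_cost is a unitless estimate, useful mainly for comparing alternative ways of writing the same query.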
60. The Query
EXPLAIN SELECT City.name, Country.Name
FROM City
JOIN Country ON
(City.CountryCode = Country.Code)
WHERE Country.Code = 'GBR'\G
61. The Query – What we want
EXPLAIN SELECT City.name, Country.Name
FROM City
JOIN Country ON (City.CountryCode =
Country.Code)
WHERE Country.Code = 'GBR'\G
62. The Query – Where we want it from
EXPLAIN SELECT City.name, Country.Name
FROM City
JOIN Country ON
(City.CountryCode = Country.Code)
WHERE Country.Code = 'GBR'\G
63. The Query – How the two tables relate
EXPLAIN SELECT City.name, Country.Name
FROM City
JOIN Country ON
(City.CountryCode = Country.Code)
WHERE Country.Code = 'GBR'\G
64. The Query – And any filters
EXPLAIN SELECT City.name, Country.Name
FROM City
JOIN Country ON
(City.CountryCode = Country.Code)
WHERE Country.Code = 'GBR'\G
66. The Actual Query Plan
select
`world`.`city`.`Name` AS `name`,
'United Kingdom' AS `Name`
from `world`.`city`
join `world`.`country` where
(`world`.`city`.`CountryCode` = 'GBR')
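A rewritten statement like the one above can be obtained by running SHOW WARNINGS immediately after EXPLAIN. A minimal sketch, assuming the world sample database:

```sql
-- EXPLAIN does not run the query, but it leaves a Note (code 1003)
-- holding the statement as the optimizer rewrote it.
EXPLAIN SELECT City.Name, Country.Name
FROM City
JOIN Country ON (City.CountryCode = Country.Code)
WHERE Country.Code = 'GBR';

-- Must be the very next statement in the same session.
SHOW WARNINGS;
```

The rewritten text is meant for reading, not for re-execution; it may contain markers that are not valid SQL.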
68. Optimizer substituted Country.name
explain format=tree SELECT City.name, Country.Name FROM City
JOIN Country ON (City.CountryCode = Country.Code) WHERE
Country.Code = 'GBR'\G
*************************** 1. row ***************************
EXPLAIN: -> Index lookup on City using CountryCode
(CountryCode='GBR') (cost=26.85 rows=81)
So the optimizer ‘knows’ just to grab the ‘GBR’ records from the City table and does
not need to read the Country table at all.
69. Optimizer substituted Country.name
Explain ANALYZE SELECT City.name, Country.Name FROM City JOIN
Country ON (City.CountryCode = Country.Code) WHERE Country.Code
= 'GBR'\G
*************************** 1. row ***************************
EXPLAIN: -> Index lookup on City using CountryCode
(CountryCode='GBR') (cost=26.85 rows=81)
(actual time=0.123..0.138 rows=81 loops=1)
1 row in set (0.0009 sec)
Explain ANALYZE runs the query and the estimate is pretty good!
70.
EXPLAIN SELECT City.name, Country.Name
FROM City
JOIN Country ON (City.CountryCode = Country.Code) WHERE Country.Code = 'GBR'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: Country
partitions: NULL
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: const – This is a CONSTANT
rows: 1
filtered: 100
Extra: NULL
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: City
partitions: NULL
type: ref
possible_keys: CountryCode
key: CountryCode
key_len: 3
ref: const
rows: 81 – There are 81 records that match in the index
filtered: 100
Extra: NULL
2 rows in set, 1 warning (0.0013 sec)
72. Some General Rules
● Look to index the columns used in the WHERE clause
● Maybe index ORDER BY columns (test)
● JOIN on columns of like type and size
○ e.g. no VARCHAR(32) to DECIMAL matches
73. Slightly More Complex
SELECT CONCAT(customer.last_name,', ', customer.first_name) AS customer,
address.phone,
film.title FROM rental
INNER JOIN customer
ON rental.customer_id = customer.customer_id
INNER JOIN address
ON customer.address_id = address.address_id
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
INNER JOIN film
ON inventory.film_id = film.film_id
WHERE rental.return_date IS NULL
AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE()
ORDER BY title LIMIT 5;
78. Back to the old tabular EXPLAIN, cleaned up
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
| table     | type   | possible_keys                          | key     | key_len | ref                        | rows  | filtered | Extra                                        |
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
| rental    | ALL    | idx_fk_inventory_id,idx_fk_customer_id | NULL    | NULL    | NULL                       | 16008 | 10       | Using where; Using temporary; Using filesort |
| customer  | eq_ref | PRIMARY,idx_fk_address_id              | PRIMARY | 2       | sakila.rental.customer_id  | 1     | 100      | NULL                                         |
| address   | eq_ref | PRIMARY                                | PRIMARY | 2       | sakila.customer.address_id | 1     | 100      | NULL                                         |
| inventory | eq_ref | PRIMARY,idx_fk_film_id                 | PRIMARY | 3       | sakila.rental.inventory_id | 1     | 100      | NULL                                         |
| film      | eq_ref | PRIMARY                                | PRIMARY | 2       | sakila.inventory.film_id   | 1     | 100      | Using where                                  |
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
79. First glance – what indexes are being used?
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
| table     | type   | possible_keys                          | key     | key_len | ref                        | rows  | filtered | Extra                                        |
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
| rental    | ALL    | idx_fk_inventory_id,idx_fk_customer_id | NULL    | NULL    | NULL                       | 16008 | 10       | Using where; Using temporary; Using filesort |
| customer  | eq_ref | PRIMARY,idx_fk_address_id              | PRIMARY | 2       | sakila.rental.customer_id  | 1     | 100      | NULL                                         |
| address   | eq_ref | PRIMARY                                | PRIMARY | 2       | sakila.customer.address_id | 1     | 100      | NULL                                         |
| inventory | eq_ref | PRIMARY,idx_fk_film_id                 | PRIMARY | 3       | sakila.rental.inventory_id | 1     | 100      | NULL                                         |
| film      | eq_ref | PRIMARY                                | PRIMARY | 2       | sakila.inventory.film_id   | 1     | 100      | Using where                                  |
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
Any time the type column shows ALL, the entire file/table must be read, which is slow.
If it has to read all the rows, is that because no usable indexes are available?
Or does it have to read the whole table because the query is processing the entire table?
80. Look back at our query and determine why it reads all the records
SELECT CONCAT(customer.last_name,', ', customer.first_name) AS customer,
address.phone,
film.title FROM rental
INNER JOIN customer
ON rental.customer_id = customer.customer_id
INNER JOIN address
ON customer.address_id = address.address_id
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
INNER JOIN film
ON inventory.film_id = film.film_id
WHERE rental.return_date IS NULL
AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE()
ORDER BY title LIMIT 5;
81. Quiz: Why so slow if it only has to return FIVE RECORDS?!?!?!?!
SELECT CONCAT(customer.last_name,', ', customer.first_name) AS customer,
address.phone,
film.title FROM rental
INNER JOIN customer
ON rental.customer_id = customer.customer_id
INNER JOIN address
ON customer.address_id = address.address_id
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
INNER JOIN film
ON inventory.film_id = film.film_id
WHERE rental.return_date IS NULL
AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE()
ORDER BY title LIMIT 5;
82. Second glance – what EXTRAS are being used?
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
| table     | type   | possible_keys                          | key     | key_len | ref                        | rows  | filtered | Extra                                        |
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
| rental    | ALL    | idx_fk_inventory_id,idx_fk_customer_id | NULL    | NULL    | NULL                       | 16008 | 10       | Using where; Using temporary; Using filesort |
| customer  | eq_ref | PRIMARY,idx_fk_address_id              | PRIMARY | 2       | sakila.rental.customer_id  | 1     | 100      | NULL                                         |
| address   | eq_ref | PRIMARY                                | PRIMARY | 2       | sakila.customer.address_id | 1     | 100      | NULL                                         |
| inventory | eq_ref | PRIMARY,idx_fk_film_id                 | PRIMARY | 3       | sakila.rental.inventory_id | 1     | 100      | NULL                                         |
| film      | eq_ref | PRIMARY                                | PRIMARY | 2       | sakila.inventory.film_id   | 1     | 100      | Using where                                  |
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
The Extra value "Using where; Using temporary; Using filesort" tells us that the output
has to be sorted (because of the ORDER BY clause) and that a temporary
table had to be used.
83. MySQL 8.0 Temporary Table much faster
● Previously temporary tables were size limited and when they reached that
limit:
○ The processing was halted
○ The data was copied to InnoDB
○ The processing continued with the InnoDB copy of the data
○ And the above was slow
84. Second glance – what EXTRAS are being used?
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
| table     | type   | possible_keys                          | key     | key_len | ref                        | rows  | filtered | Extra                                        |
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
| rental    | ALL    | idx_fk_inventory_id,idx_fk_customer_id | NULL    | NULL    | NULL                       | 16008 | 10       | Using where; Using temporary; Using filesort |
| customer  | eq_ref | PRIMARY,idx_fk_address_id              | PRIMARY | 2       | sakila.rental.customer_id  | 1     | 100      | NULL                                         |
| address   | eq_ref | PRIMARY                                | PRIMARY | 2       | sakila.customer.address_id | 1     | 100      | NULL                                         |
| inventory | eq_ref | PRIMARY,idx_fk_film_id                 | PRIMARY | 3       | sakila.rental.inventory_id | 1     | 100      | NULL                                         |
| film      | eq_ref | PRIMARY                                | PRIMARY | 2       | sakila.inventory.film_id   | 1     | 100      | Using where                                  |
+-----------+--------+----------------------------------------+---------+---------+----------------------------+-------+----------+----------------------------------------------+
The Extra value "Using where; Using temporary; Using filesort" tells us that the output
has to be sorted (because of the ORDER BY clause) and that a temporary
table had to be used.
Sometimes you can avoid a filesort if the ORDER BY column is already sorted, for example by an index.
85. Is there something unindexed that needs to be indexed?
● Or something we can use to build a histogram
86. SHOW INDEX FROM rental
● PRIMARY
● rental_date – rental_date, inventory_id, customer_id
● idx_fk_inventory_id – inventory_id, customer_id
● idx_fk_staff_id
87. SHOW INDEX FROM film
● PRIMARY
● idx_title
● idx_fk_language_id
● idx_fk_original_language_id
88. A Functional Index? Or Generated Column?
WHERE rental.return_date IS NULL
AND rental_date + INTERVAL film.rental_duration DAY < CURRENT_DATE()
ORDER BY title LIMIT 5;
Could this date arithmetic be reduced to a functional index/generated column?
AND return_date < CURRENT_DATE()
film.rental_duration + rental_date
Would we want to store this in the record with the rental data?
89. Yes, let us add that new column
● ALTER TABLE operations can be expensive
● How do we seed data
● Do we use a GENERATED COLUMN? Can we?
● Do we add a stub table?
○ Extra index/table dives
○ How much code do we have to change?
○ Other considerations
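One way these questions might play out, as a sketch only. A generated column can reference only columns of its own table, so it cannot pull rental_duration from film; the alternative sketched here is an ordinary column seeded with a one-off UPDATE. The column name due_date and the index name idx_return_due are hypothetical, and the sakila sample database is assumed.

```sql
-- Hypothetical column holding the precomputed due date.
ALTER TABLE rental ADD COLUMN due_date DATETIME NULL;

-- Seed it from the existing data (a one-off, potentially slow update
-- that touches every row in rental).
UPDATE rental
  JOIN inventory USING (inventory_id)
  JOIN film      USING (film_id)
SET rental.due_date = rental.rental_date
                      + INTERVAL film.rental_duration DAY;

-- return_date first (matched with IS NULL), then a range on due_date.
CREATE INDEX idx_return_due ON rental (return_date, due_date);

-- The overdue filter now involves the rental table alone.
EXPLAIN SELECT rental_id
FROM rental
WHERE return_date IS NULL
  AND due_date < CURRENT_DATE();
```

The trade-off is the one raised in the list above: new rentals need due_date filled in by the application or a trigger, and the column goes stale if a film's rental_duration ever changes.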
90. Where Else To Look
● MySQL Manual
● Forums.MySQL.com
● MySQLcommunity.slack.com
91. New Books
● Can be pre-ordered
● The author’s other works are
outstanding and well written
92. Great Books
● Getting very dated
● Make sure you get 3rd edition
● Can be found in used book
stores
93. My Book
● A guide to using JSON and
MySQL
● Usually on Sale at Amazon
94. Thank you
● David.Stokes@Oracle.com
● @Stoker
● PHP Related blog https://elephantdolphin.blogspot.com/
● MySQL Latest Blogs https://planet.mysql.com/
● Where to ask questions https://forums.mysql.com/
● Slack https://mysqlcommunity.slack.com