RivieraJUG - MySQL Indexes and Histograms

Frédéric Descamps
Community Manager
Oracle MySQL
RivieraJUG - September 2022
MySQL Indexes and Histograms
How They Really Speed Up Your Queries

Who am I ?
about.me/lefred
Copyright @ 2022 Oracle and/or its affiliates.
2

@lefred
MySQL Evangelist
using MySQL since version 3.20
devops believer
living in
h ps://lefred.be
Frédéric Descamps
3

What Is This Session About ?
4

Speed and Performance !
Nobody ever complains that the database is too fast !
5

It's often the reverse... blame the database !
5

Speeding up queries is not a 'dark art'... but understanding how to speed up queries is often
treated as magic.
5

Speeding up queries is not a 'dark art'... but understanding how to speed up queries is often
treated as magic.
So we will be looking at the proper use of indexes, histograms and see how to keep the rigth
balance for your workload.
5

No coverage today of:
System Con guration
OS
MySQL
Hardware
Networking/Cloud
This is a dry subject !
Do not try to absorb all the content at once, get the slides
(h ps://slideshare.net/lefred.descamps)
6

The North Star
Query Respone Time
7

Daniel Nichter, E cient MySQL
Performance - Best Practices and
Techniques, O'Reilly, 2021
Query Response time
Query response time is the only metric anyone truly cares about [...] because query
response time is the only metric we experience. When a query takes 7.5 seconds to execute,
we experience 7.5 seconds of impatience. That same query might examine a million rows,
but we don't experience a million rows examined. Our time is precious.(*)
8

Today's goal is to...
reduce the Query Response Time
9

Bad Queries
Finding the Ugly Duckling
10

What are bad queries ?
We can de ne bad queries in two di erent categories:
11

queries called way to often
queries that are way too slow
full table scan
using lesort
using temporary tables
We can de ne bad queries in two di erent categories:
11

12

If there could be only one ?
If you should optimize only one query, the best candidate should be the query that
consumes the most of the execution time (seen as latency in PFS, but usually called
"response time").
13

"response time").
sys Schema contains all the necessary info to nd that ugly Duckling:
13

"response time").
SELECT
SELECT schema_name
schema_name,
, format_pico_time
format_pico_time(
(total_latency
total_latency)
) tot_lat
tot_lat,
,
exec_count
exec_count,
, format_pico_time
format_pico_time(
(total_latency
total_latency/
/exec_count
exec_count)
) latency_per_call
latency_per_call,
,
query_sample_text
query_sample_text
FROM
FROM sys
sys.
.x$statements_with_runtimes_in_95th_percentile
x$statements_with_runtimes_in_95th_percentile AS
AS t1
t1
JOIN
JOIN performance_schema
performance_schema.
.events_statements_summary_by_digest
events_statements_summary_by_digest AS
AS t2
t2
ON
ON t2
t2.
.digest
digest=
=t1
t1.
.digest
digest
WHERE
WHERE schema_name
schema_name NOT
NOT in
in (
('performance_schema'
'performance_schema',
, 'sys'
'sys')
)
ORDER
ORDER BY
BY (
(total_latency
total_latency/
/exec_count
exec_count)
) desc
desc LIMIT
LIMIT 1
1G
G
13

"response time").
SELECT
SELECT schema_name
schema_name,
, format_pico_time
format_pico_time(
(total_latency
total_latency)
) tot_lat
tot_lat,
,
exec_count
exec_count,
, format_pico_time
format_pico_time(
(total_latency
total_latency/
/exec_count
exec_count)
) latency_per_call
latency_per_call,
,
query_sample_text
query_sample_text
FROM
FROM sys
sys.
.x$statements_with_runtimes_in_95th_percentile
x$statements_with_runtimes_in_95th_percentile AS
AS t1
t1
JOIN
JOIN performance_schema
performance_schema.
.events_statements_summary_by_digest
events_statements_summary_by_digest AS
AS t2
t2
ON
ON t2
t2.
.digest
digest=
=t1
t1.
.digest
digest
WHERE
WHERE schema_name
schema_name NOT
NOT in
in (
('performance_schema'
'performance_schema',
, 'sys'
'sys')
)
ORDER
ORDER BY
BY (
(total_latency
total_latency/
/exec_count
exec_count)
) desc
desc LIMIT
LIMIT 1
1G
G
*************************** 1. row ***************************
schema_name: piday
tot_lat: 4.29 h
exec_count: 5
latency_per_call: 51.51 min
query_sample_text: select a.device_id, max(a.value) as `max temp`,
min(a.value) as `min temp`, avg(a.value) as `avg temp`,
max(b.value) as `max humidity`, min(b.value) as `min humidity`,
avg(b.value) as `avg humidity`
from temperature_history a
join humidity_history b on b.device_id=a.device_id
where date(a.time_stamp) = date(now())
and date(b.time_stamp)=date(now()) group by device_id
13

More info about Queries
Sys Schema contains all the required information in these tables :
statements_with_full_table_scans
statements_with_runtimes_in_95th_percentile
statements_with_sorting
statements_with_temp_tables
14

And since MySQL 8.0 you can join the table
performance_schema.events_statements_summary_by_digest to have a sample
you can use.
14

And since MySQL 8.0 you can join the table
performance_schema.events_statements_summary_by_digest to have a sample
you can use.
We will check the meaning of this tables in some slides... be patient ;)
14

15

The Optimizer
some theory
16

The MySQL Optimizer
Consider the Optimizer the brain and nervous system of MySQL
Query optimization is a feature of many Relational Database Management Systems
The query optimizer a empts to determine the most e ective way to execute a given
query by considering the possible query plans (h ps://en.wikipedia.org/wiki/Query_optimization)
17

The MySQL Optimizer - estimation
One of the hardest problems in query optimization is to accurately estimate the costs of
alternative query plans.
These costs are the result of a mathematical model of query execution costs that relies
heavily on estimates of the cardinality, or number of tuple, owing through each edge in
a query plan.
18

The MySQL Optimizer - evaluation of the options
The Optimizer wants to get your data the cheapest way possible.
Like a route planner, the cost is built on historical statistics. And these statistics can
change while the optimizer is working. So like a tra c jam, washed out road, or other
tra c problem, the optimizer may be making poor decisions for the present situation...
but this is very rare !
The nal determination from the optimizer is called the Query Execution Plan (or QEP, or
Query Plan).
MySQL wants to optimize each query every time it sees it (there is no locking down the
query plan like Oracle).
19

120
if your query has ve joins the optimizer may
have to evaluate 120 di erent options
5!
(5 * 4 * 3 * 2 * 1)
20

The Query Execution Plan
EXPLAIN: DBAs companion
21

EXPLAIN is the command used to obtain
the Query Execution Plan for a query
including information about how tables are
joined and in which order, which indexes are
used and estimation of rows, ...
EXPLAIN Syntax
22

EXPLAIN Example
23

EXPLAIN Example
this is an ESTIMATION on how MySQL would run the query
as it is not executed !
23

system: the table contains excatly 1 row
const: at most 1 row is matched for the
table
eq_ref: the table is the right-hand table
in a join where the condition is on a PK
or not null unique key.
ref: the table is ltered by a nonunique
secondary index.
ref_or_null: the same as ref but the
ltered column may also be NULL.
index_merge: the Optimizer chooses a
combination of two or more indexes to
resolve a lter that includes an OR or
AND between columns in di erent
indexes.
fulltext: use of a full text index to lter
the table.
range: this is used when an index is used
to look up several values either in
sequence or in groups.
EXPLAIN - Access type
24

index: the Optimizer chosen to perform
a full index scan.
ALL: full table scan !!
EXPLAIN - Access type (2)
Get much more info an examples in Chapter 20, Analyzing Queries from Jesper Wisborg
Krogh's book: MySQL 8 Query Performance Tuning, Apress, 2020.
25

EXPLAIN Example (with a JOIN)
26

Visual EXPLAIN Example from MySQL Workbench
27

EXPLAIN FORMAT=TREE Example
28

EXPLAIN FORMAT=JSON Example
29

EXPLAIN FORMAT=JSON Example
this is the most detailed estimation !
29

can we know the
real numbers ?
30

Estimated cost
Actual execution statistics
Time to return rst row
Time to return all rows
Number of rows returned
Number of loops
EXPLAIN ANALYZE
31

Estimated cost
Actual execution statistics
Time to return rst row
Time to return all rows
Number of rows returned
Number of loops
Instruments and executes
the query
EXPLAIN ANALYZE
31

EXPLAIN ANALYZE Example
32

33

time is expressed in milliseconds
33

34

35

36

More on EXPLAIN
h ps://dev.mysql.com/doc/refman/8.0/en/explain.html
h ps://dev.mysql.com/doc/refman/8.0/en/explain-output.html
h ps://dev.mysql.com/doc/refman/8.0/en/explain-extended.html
h ps://dev.mysql.com/doc/workbench/en/wb-performance-explain.html
37

Indexes
nd rows with speci c column values quickly
38

Indexes
A database index is a data structure that improves the speed of data retrieval operations on
a database table at the cost of additional writes and storage space to maintain the index
data structure.
Indexes are used to quickly locate data without having to search every row in a database
table every time a database table is accessed.
Indexes can be created using one or more columns of a database table, providing the basis
for both rapid random lookups and e cient access of ordered records.
(h ps://en.wikipedia.org/wiki/Database_index)
39

MySQL supports multiple kind of indexes:
primary key / clustered index
secondary index
full-text index
spatial index
Indexes in MySQL
40

pre x of a column
mutli-column
unique
covering
functional
multi-value
Indexes in MySQL (2)
MySQL supports 2 types of indexes:
BTREE
HASH
and Indexes can have some "properties":
41

Clustered Indexes
Each InnoDB table has a special index called the clustered index that stores row data.
Typically, the clustered index is synonymous with the primary key.
42

Clustered Indexes
Each InnoDB table has a special index called the clustered index that stores row data.
Typically, the clustered index is synonymous with the primary key.
Let's check now an example on how we usually mentally represent a table and an index.
42

Mental representation of table and an index
Table 1
Indexed Column column 1 column 2 column x column n
Index
43

Mental representation of table and an index
Table 1
Index Indexed Column column 1 column 2 column x column n
5001
5001 a b 2022-03-14 NULL
44

Mental representation of a table and an index
Table 1
5001
3
3
5001
a
a
c
b 2022-03-14
2022-03-15
NULL
NULL
45

Table 1
5001
3
3
6
a
a
c
b 2022-03-14
2022-03-15
NULL
NULL
6
5001 be
f 2022-03-15 1
46

Table 1
5001
3
3
6
a
a
c
b 2022-03-14
2022-03-15
NULL
NULL
6
27 be
f 2022-03-15 1
27
5001 it
d 2022-03-16 101
47

Table 1
5001
3
3
6
a
a
c
b 2022-03-14
2022-03-15
NULL
NULL
6
27
be
f 2022-03-15 1
27
12
it
d 2022-03-16 101
12
5001 uk
e 2022-03-22 NULL
48

Table 1
5001
3
3
6
a
a
c
b 2022-03-14
2022-03-15
NULL
NULL
6
27
be
f 2022-03-15 1
27
12
it
d 2022-03-16 101
12
5001 uk
e 2022-03-22 NULL
49

InnoDB representation of a table and PK
Table 1
PRIMARY KEY column 1 column 2 column x column n
5001 a b 2022-03-14 NULL
insert into table1 values
(5001,'a','b','2022-03-14', NULL);
50

Table 1
5001 a b 2022-03-14 NULL
(3,'c','a','2022-03-15', NULL);
3 a
c 2022-03-15 NULL
51

6 be
f 2022-03-15 1
Table 1
5001 a b 2022-03-14 NULL
(6,'f','be','2022-03-15', 1);
3 a
c 2022-03-15 NULL
52

6 be
f 2022-03-15 1
Table 1
5001 a b 2022-03-14 NULL
(27,'d','it','2022-03-16', 101);
3 a
c 2022-03-15 NULL
27 it
d 2022-03-16 101
53

6 be
f 2022-03-15 1
Table 1
5001 a b 2022-03-14 NULL
(12,'e','uk','2022-03-22', NULL);
3 a
c 2022-03-15 NULL
27 it
d 2022-03-16 101
12 uk
e 2022-03-22 NULL
54

Table 1
5001 a b 2022-03-14 NULL
3 a
c 2022-03-15 NULL
6 be
f 2022-03-15 1
27 it
d 2022-03-16 101
12 uk
e 2022-03-22 NULL
55

Table 1
5001 a b 2022-03-14 NULL
3 a
c 2022-03-15 NULL
6 be
f 2022-03-15 1
27 it
d 2022-03-16 101
12 uk
e 2022-03-22 NULL
This is the clustered index representation: stored by order of Primary Key
55

InnoDB Primary Key
InnoDB stores data in table spaces.
And so far, we know that records are stored and sorted using the clustered index.
The Primary Key is a key for the index that uniquely de ned for a row, should be
immutable.
InnoDB needs a PRIMARY KEY
No NULL values are allowed
Monotonically increasing
use UID_To_BIN() if you must use UUIDs, otherwise avoid them
56

InnoDB Primary Key (2)
What we don't know is that all secondary indexes also contain the primary key as the right-
most column in the index (even if this is not exposed). That means when a secondary index
is used to retrieve a record, two indexes are used: rst the secondary one pointing to the
primary key that will be used to nally retrieve the record.
57

When no primary key is de ned, the rst unique not null key is used. And if none is
available, InnoDB will create an hidden primary key (6 bytes).
57

The problem with such key is that you don’t have any control on it and worse, this value is
global to all tables without primary keys and can be a contention problem if you perform
multiple simultaneous writes on such tables (dict_sys->mutex).
57

The problem with such key is that you don’t have any control on it and worse, this value is
global to all tables without primary keys and can be a contention problem if you perform
multiple simultaneous writes on such tables (dict_sys->mutex).
If you use HA solutions, Primary Keys are mandatory !
57

Primary Keys impact how the values are inserted and the size of the secondary indexes. A
non sequential PK can lead to many random IOPS.
58

Also, it's more and more common to use
applications that generate complete
random primary keys...that means if the
Primary Key is not sequential, InnoDB will
have to heavily re-balance all the pages on
inserts.
Primary Keys impact how the values are inserted and the size of the secondary indexes. A
non sequential PK can lead to many random IOPS.
58

If we compare the same load (inserts) when using an auto_increment integer as Primary
Key, we can see that only the latest pages are recently touched:
Generated with h ps://github.com/jeremycole/innodb_ruby from @jeremycole
59

< >
My legacy application didn't define any Primary Key, adding one
(auto_increment) breaks the application ! What can I do ?
60

< >
My legacy application didn't define any Primary Key, adding one
(auto_increment) breaks the application ! What can I do ?
Easy, just create a new invisible column and define it as Primary Key !
60

select
select *
* from
from actors
actors;
;
+
+----------------+-----+
----------------+-----+
|
| name
name |
| age
age |
|
+
+----------------+-----+
----------------+-----+
|
| Al Pacino
Al Pacino |
| 80
80 |
|
|
| Robert De Niro
Robert De Niro |
| 77
77 |
|
|
| Joe Pesci
Joe Pesci |
| 78
78 |
|
|
| Sharon Stone
Sharon Stone |
| 63
63 |
|
|
| Diane Keaton
Diane Keaton |
| 75
75 |
|
|
| Talia Shire
Talia Shire |
| 74
74 |
|
+
+----------------+-----+
----------------+-----+
Invisible column as Primary Key
61

select
select *
* from
from actors
actors;
;
+
+----------------+-----+
----------------+-----+
|
| name
name |
| age
age |
|
+
+----------------+-----+
----------------+-----+
|
| Al Pacino
Al Pacino |
| 80
80 |
|
|
| Robert De Niro
Robert De Niro |
| 77
77 |
|
|
| Joe Pesci
Joe Pesci |
| 78
78 |
|
|
| Sharon Stone
Sharon Stone |
| 63
63 |
|
|
| Diane Keaton
Diane Keaton |
| 75
75 |
|
|
| Talia Shire
Talia Shire |
| 74
74 |
|
+
+----------------+-----+
----------------+-----+
Do we have a Primary Key ?
Invisible column as Primary Key
61

Invisible column as Primary Key (2)
Let's nd out by listing all tables where the clustered index was generated (internal hidden
key):
select
select i
i.
.table_id
table_id,
, t
t.
.name
name
from
from information_schema
information_schema.
.innodb_indexes i
innodb_indexes i
join
join information_schema
information_schema.
.innodb_tables t
innodb_tables t on
on (
(i
i.
.table_id
table_id =
= t
t.
.table_id
table_id)
)
where
where i
i.
.name
name=
='GEN_CLUST_INDEX'
'GEN_CLUST_INDEX';
;
+
+----------+------------------+
----------+------------------+
|
| table_id
table_id |
| name
name |
|
+
+----------+------------------+
----------+------------------+
|
| 1293
1293 |
| hollywood
hollywood/
/actors
actors |
|
+
+----------+------------------+
----------+------------------+
1
1 row
row in
in set
set (
(0.0211
0.0211 sec
sec)
)
62

We can verify with the table's de nition:
show
show create
create table
table actorsG
actorsG
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
Table
Table: actors
: actors
Create
Create Table
Table:
: CREATE
CREATE TABLE
TABLE `
`actors
actors`
` (
(
`
`name
name`
` varchar
varchar(
(20
20)
) DEFAULT
DEFAULT NULL
NULL,
,
`
`age
age`
` int
int unsigned
unsigned DEFAULT
DEFAULT NULL
NULL
)
) ENGINE
ENGINE=
=InnoDB
InnoDB DEFAULT
DEFAULT CHARSET
CHARSET=
=utf8mb4
utf8mb4 COLLATE
COLLATE=
=utf8mb4_0900_ai_ci
utf8mb4_0900_ai_ci
63

We can verify with the table's de nition:
show
show create
create table
table actorsG
actorsG
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
Table
Table: actors
: actors
Create
Create Table
Table:
: CREATE
CREATE TABLE
TABLE `
`actors
actors`
` (
(
`
`name
name`
` varchar
varchar(
(20
20)
) DEFAULT
DEFAULT NULL
NULL,
,
`
`age
age`
` int
int unsigned
unsigned DEFAULT
DEFAULT NULL
NULL
)
) ENGINE
ENGINE=
=InnoDB
InnoDB DEFAULT
DEFAULT CHARSET
CHARSET=
=utf8mb4
utf8mb4 COLLATE
COLLATE=
=utf8mb4_0900_ai_ci
utf8mb4_0900_ai_ci
Now let's add a hidden column as primary key:
alter
alter table
table actors
actors add
add id
id int
int unsigned
unsigned auto_increment
auto_increment primary
primary key
key invisible
invisible first
first;
;
63

select
select *
* from
from actors
actors;
;
+
+----------------+-----+
----------------+-----+
|
| name
name |
| age
age |
|
+
+----------------+-----+
----------------+-----+
|
| Al Pacino
Al Pacino |
| 80
80 |
|
|
| Robert De Niro
Robert De Niro |
| 77
77 |
|
|
| Joe Pesci
Joe Pesci |
| 78
78 |
|
|
| Sharon Stone
Sharon Stone |
| 63
63 |
|
|
| Diane Keaton
Diane Keaton |
| 75
75 |
|
|
| Talia Shire
Talia Shire |
| 74
74 |
|
+
+----------------+-----+
----------------+-----+
We can now test again our application's queries:
64

select
select *
* from
from actors
actors;
;
+
+----------------+-----+
----------------+-----+
|
| name
name |
| age
age |
|
+
+----------------+-----+
----------------+-----+
|
| Al Pacino
Al Pacino |
| 80
80 |
|
|
| Robert De Niro
Robert De Niro |
| 77
77 |
|
|
| Joe Pesci
Joe Pesci |
| 78
78 |
|
|
| Sharon Stone
Sharon Stone |
| 63
63 |
|
|
| Diane Keaton
Diane Keaton |
| 75
75 |
|
|
| Talia Shire
Talia Shire |
| 74
74 |
|
+
+----------------+-----+
----------------+-----+
show
show create
create table
table actorsG
actorsG
*
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
Table
Table: actors
: actors
Create
Create Table
Table:
: CREATE
CREATE TABLE
TABLE `
àctors
actors`
` (
(
`
ìd
id`
` int
int unsigned
unsigned NOT
NOT NULL
NULL
AUTO_INCREMENT
AUTO_INCREMENT /*!80023 INVISIBLE */
/*!80023 INVISIBLE */,
,
`
`name
name`
` varchar
varchar(
(20
20)
) DEFAULT
DEFAULT NULL
NULL,
,
`
àge
age`
` int
int unsigned
unsigned DEFAULT
DEFAULT NULL
NULL,
,
PRIMARY
PRIMARY KEY
KEY (
(`
ìd
id`
`)
)
)
) ENGINE
ENGINE=
=InnoDB
InnoDB AUTO_INCREMENT
AUTO_INCREMENT=
=7
7 DEFAULT
DEFAULT
CHARSET
CHARSET=
=utf8mb4
utf8mb4
COLLATE
COLLATE=
=utf8mb4_0900_ai_ci
utf8mb4_0900_ai_ci
We can now test again our application's queries:
64

< >
Great ! But what about inserts without specifying the columns ?
65

< >
insert
insert into
into actors
actors values
values (
('James Caan'
'James Caan',
, 81
81)
);
;
Query OK
Query OK,
, 1
1 row
row affected
affected (
(0.0248
0.0248 sec
sec)
)
Great ! But what about inserts without specifying the columns ?
65

select
select id
id,
, a
a.
.*
* from
from actors a
actors a;
;
+
+----+----------------+-----+
----+----------------+-----+
|
| id
id |
| name
name |
| age
age |
|
+
+----+----------------+-----+
----+----------------+-----+
|
| 1
1 |
| Al Pacino
Al Pacino |
| 80
80 |
|
|
| 2
2 |
| Robert De Niro
Robert De Niro |
| 77
77 |
|
|
| 3
3 |
| Joe Pesci
Joe Pesci |
| 78
78 |
|
|
| 4
4 |
| Sharon Stone
Sharon Stone |
| 63
63 |
|
|
| 5
5 |
| Diane Keaton
Diane Keaton |
| 75
75 |
|
|
| 6
6 |
| Talia Shire
Talia Shire |
| 74
74 |
|
|
| 7
7 |
| James Caan
James Caan |
| 81
81 |
|
+
+----+----------------+-----+
----+----------------+-----+
But if needed we have access to that PK id:
66

select
select id
id,
, a
a.
.*
* from
from actors a
actors a;
;
+
+----+----------------+-----+
----+----------------+-----+
|
| id
id |
| name
name |
| age
age |
|
+
+----+----------------+-----+
----+----------------+-----+
|
| 1
1 |
| Al Pacino
Al Pacino |
| 80
80 |
|
|
| 2
2 |
| Robert De Niro
Robert De Niro |
| 77
77 |
|
|
| 3
3 |
| Joe Pesci
Joe Pesci |
| 78
78 |
|
|
| 4
4 |
| Sharon Stone
Sharon Stone |
| 63
63 |
|
|
| 5
5 |
| Diane Keaton
Diane Keaton |
| 75
75 |
|
|
| 6
6 |
| Talia Shire
Talia Shire |
| 74
74 |
|
|
| 7
7 |
| James Caan
James Caan |
| 81
81 |
|
+
+----+----------------+-----+
----+----------------+-----+
And this id is sequential, used as clustered
index to store the data and externalized for
replication !
But if needed we have access to that PK id:
66

Since MySQL 8.0.30 you can also enable GIPK mode !
GIPK mode is controlled by the sql_generate_invisible_primary_key server
system variable.
When MySQL is running in GIPK mode, a primary key is added to a table by the server, the
column and key name is always my_row_id.
67

InnoDB Secondary Key
Indexes other than the clustered index are known as secondary indexes.
Remember that in InnoDB, each record in a secondary index contains the primary key
columns for the row (right most), as well as the columns speci ed for the secondary
index.
InnoDB uses this primary key value to search for the row in the clustered index.
If the Primary Key is long, the secondary indexes use more space. It's advantageous to
have a short Primary Key.
68

Indexing on a pre x of a column
create
create index
index part_of_name
part_of_name on
on city
city (
(name
name(
(10
10)
))
);
;
Only the rst 10 characters are indexed in this example and this can save space/speed.
69

Indexing on a pre x of a column (2)
Let's compare the between this pre x index and an index using the full column:
select
select database_name
database_name,
, table_name
table_name,
, index_name
index_name,
,
stat_value
stat_value *
* @
@@innodb_page_size
@innodb_page_size as
as size_in_bytes
size_in_bytes
from
from mysql
mysql.
.innodb_index_stats
innodb_index_stats
where
where stat_name
stat_name =
= 'size'
'size' and
and database_name
database_name=
='world'
'world' and
and table_name
table_name=
='city'
'city'
and
and index_name
index_name like
like '%name%'
'%name%'
order
order by
by size_in_bytes
size_in_bytes desc
desc;
;
+
+---------------+------------+--------------+---------------+
---------------+------------+--------------+---------------+
|
| database_name
database_name |
| table_name
table_name |
| index_name
index_name |
| size_in_bytes
size_in_bytes |
|
+
+---------------+------------+--------------+---------------+
---------------+------------+--------------+---------------+
|
| world
world |
| city
city |
| name_idx
name_idx |
| 212992
212992 |
|
|
| world
world |
| city
city |
| part_of_name
part_of_name |
| 114688
114688 |
|
+
+---------------+------------+--------------+---------------+
---------------+------------+--------------+---------------+
70

71

72

We see that both indexes on name are candidates and the partial one got the preference.
72

Index key_len
What does that 40 mean ?
73

Index key_len
What does that 40 mean ?
The key_len column indicates the length of the key that MySQL decided to use.
73

Index key_len (2)
< >
Oh... Okay... but why 40 ? It doesn't make any sense, does it ?
74

Index key_len (2)
< >
Oh... Okay... but why 40 ? It doesn't make any sense, does it ?
In fact, we indexed the first 10 characters of the 'name' column... but this uses
utf8mb4 charset: 1 character is encoded in up to 4 bytes
10 x 4 bytes = 40 bytes per record in the index
74

Multi-column Index
It's also possible to index multiple columns in one single index:
create
create index
index first_last_idx
first_last_idx on
on employees
employees (
(first_name
first_name,
, last_name
last_name)
);
;
75

Multi-column Index
create
create index
first_last_idx on
on employees
employees (
(first_name
first_name,
, last_name
last_name)
);
;
This index will be work on ( rst_name, lastname) and ( rst_name) but not on
(last_name).
Put highest cardinality eld rst !
75

Multi-column Index
create
create index
first_last_idx on
on employees
employees (
(first_name
first_name,
, last_name
last_name)
);
;
This index will be work on ( rst_name, lastname) and ( rst_name) but not on
(last_name).
Put highest cardinality eld rst !
Indexes are parsed from left to right
75

Multi-column Index Example
The value of key_len enables you to determine how many parts of a multiple-part key
MySQL actually uses.
76

Multi-column Index Example
The value of key_len enables you to determine how many parts of a multiple-part key
MySQL actually uses.
show create table employeesG
*************************** 1. row ***************************
Table: employees
Create Table: CREATE TABLE èmployees` (
èmp_no` int NOT NULL,
`birth_date` date NOT NULL,
`first_name` varchar(14) NOT NULL,
`last_name` varchar(16) NOT NULL,
`gender` enum('M','F') NOT NULL,
`hire_date` date NOT NULL,
PRIMARY KEY (èmp_no`),
KEY `first_last_idx` (`first_name`,`last_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
select (14*4)+2+(16*4)+2;
+-------------------+
| (14*4)+2+(16*4)+2 | (+2 --> VARCHAR's length is coded on 2 bytes)
+-------------------+
| 124 |
+-------------------+
76

Multi-column Index Example (2)
77

14 x 4 + 2 = 58
77

78

the left-most part of the index cannot be used
--> the index is not used
78

select
select emp_no
emp_no,
, first_name
first_name,
, last_name
last_name,
,
hire_date
hire_date
from
from employees
employees
where
where last_name
last_name like
like 'de%'
'de%' limit
limit 10
10;
;
select
select emp_no
emp_no,
, first_name
first_name,
, last_name
last_name,
,
hire_date
hire_date
from
from employees
employees
where
where last_name
last_name like
like 'de%'
'de%'
order
order by
by first_name
first_name limit
limit 10
10;
;
Multi-column Index: challenge
What do you think about these two statements:
[A] none uses the index
[B] the left-one uses the index
[C] the right-one uses the index
79

Multi-column Index: hashing values
If you need to perform search of the exact value (not using like or range) of multiple large
columns, some times it could be more e cient to use a hash function and index its result:
alter
alter table
table employees
employees add
add column
column hash_bin_names
hash_bin_names binary
binary(
(16
16)
)
generated always
generated always as
as (
(unhex
unhex(
(md5
md5(
(concat
concat(
(first_name
first_name,
, last_name
last_name)
))
))
))
) virtual
virtual,
,
add
add key
key hash_bin_idx
hash_bin_idx(
(hash_bin_names
hash_bin_names)
);
;
80

Multi-column Index: hashing values
And now let's rewrite the query and exame the QEP:
explain
explain select
select emp_no
emp_no,
, first_name
first_name,
, last_name
last_name,
, hire_date
hire_date
from
from employees
employees
where
where hash_bin_names
hash_bin_names=
=unhex
unhex(
(md5
md5(
('AamodDeville'
'AamodDeville')
))
)
and
and first_name
first_name=
='Aamod'
'Aamod' and
and last_name
last_name like
like 'Deville'
'Deville' order
order by
by first_name
first_name limit
limit 10
10G
G
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
id:
id: 1
1
select_type:
select_type: SIMPLE
SIMPLE
table
table: employees
: employees
partitions:
partitions: NULL
NULL
type
type: ref
: ref
possible_keys: first_last_idx
possible_keys: first_last_idx,
,hash_bin_idx
hash_bin_idx
key
key: hash_bin_idx
: hash_bin_idx
key_len:
key_len: 17
17
ref: const
ref: const
rows
rows:
: 1
1
filtered:
filtered: 5
5
Extra:
Extra: Using
Using where
where
81

Functional Indexes
MySQL supports functional key parts that index expression values rather than column or
column pre x values.
Use of functional key parts enables indexing of values not stored directly in the table.
82

Functional Indexes
MySQL supports functional key parts that index expression values rather than column or
column pre x values.
Use of functional key parts enables indexing of values not stored directly in the table.
Let's suppose we want to retriev all employees that were hired in March:
select
select first_name
first_name,
, hire_date
hire_date from
from employees
employees where
where month
month(
(hire_date
hire_date)
)=
=3
3;
;
82

Functional Indexes (2)
Get the Query Execution Plan:
explain
explain select
select first_name
first_name,
, hire_date
hire_date from
from employees
employees where
where month
month(
(hire_date
hire_date)
)=
=3
3G
G
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
id:
id: 1
1
select_type:
select_type: SIMPLE
SIMPLE
table
table: employees
: employees
partitions:
partitions: NULL
NULL
type
type:
: ALL
ALL
possible_keys:
possible_keys: NULL
NULL
key
key:
: NULL
NULL
key_len:
key_len: NULL
NULL
ref:
ref: NULL
NULL
rows
rows:
: 299379
299379
filtered:
filtered: 100
100
Extra:
Extra: Using
Using where
where
83

Get the Query Execution Plan:
explain
explain select
select first_name
first_name,
, hire_date
hire_date from
from employees
employees where
where month
month(
(hire_date
hire_date)
)=
=3
3G
G
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
id:
id: 1
1
select_type:
select_type: SIMPLE
SIMPLE
table
table: employees
: employees
partitions:
partitions: NULL
NULL
type
type:
: ALL
ALL
possible_keys:
possible_keys: NULL
NULL
key
key:
: NULL
NULL
key_len:
key_len: NULL
NULL
ref:
ref: NULL
NULL
rows
rows:
: 299379
299379
filtered:
filtered: 100
100
Extra:
Extra: Using
Using where
where
FULL TABLE SCAN !
83

create
create index
index month_hire_idx
month_hire_idx
on
on employees
employees (
((
(month
month(
(hire_date
hire_date)
))
))
);
; please mind the
please mind the (
((
(.
..
..
.)
))
) notation
notation
84

create
create index
index month_hire_idx
month_hire_idx
on
on employees
employees (
((
(month
month(
(hire_date
hire_date)
))
))
);
; please mind the
please mind the (
((
(.
..
..
.)
))
) notation
notation
explain
explain select
select first_name
first_name,
, hire_date
hire_date from
from employees
employees where
where month
month(
(hire_date
hire_date)
)=
=3
3G
G
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
id:
id: 1
1
select_type:
select_type: SIMPLE
SIMPLE
table
table: employees
: employees
partitions:
partitions: NULL
NULL
type
type: ref
: ref
possible_keys: month_hire_idx
possible_keys: month_hire_idx
key
key: month_hire_idx
: month_hire_idx
key_len:
key_len: 5
5
ref: const
ref: const
rows
rows:
: 51684
51684
filtered:
filtered: 100
100
Extra:
Extra: NULL
NULL
84

Please Keep in Mind...
If there is a choice between multiple indexes, MySQL normally uses the index that nds the
smallest number of rows (the most selective index).
MySQL can use indexes on columns more e ciently if they are declared as the same type
and size.
85

NULL is used to designate a LACK of data:
False 0
True 1
Don't know NULL
NULL
Indexing NULL values really drives down the performances of indexes.
86

Before Invisible Indexes
. doubt usefulness of index
. remove that index
. get phone/text/screams from power user about slow
performance
. the rest of the planet seems to need that dang index !
. recreate the index... and it can take a looooooong time
After Invisible Indexes
. doubt usefulness of index
. make index invisible - optimizer can not see it!
. get phone/text/screams from power user about slow
performance
. make index visible again
. blame a problem on { network | hardware | cloud | a colleague}
Invisible Indexes
MySQL o ers the possibility to hide indexes from the optimizer.
This feature is very useful for testing the relevance of indexes before deleting them.
And very useful for the operations team.
87

How to use INVISIBLE INDEX
alter
alter table
table employees
employees alter
alter index
index first_last_idx invisible
first_last_idx invisible;
;
alter
alter table
table employees
employees alter
alter index
index first_last_idx visible
first_last_idx visible;
;
88

How to use INVISIBLE INDEX
alter
alter table
table employees
employees alter
alter index
index first_last_idx invisible
first_last_idx invisible;
;
alter
alter table
table employees
employees alter
alter index
index first_last_idx visible
first_last_idx visible;
;
List all invisible indexes:
select
select table_name
table_name,
, index_name
index_name,
, is_visible
is_visible
from
information_schema.
.statistics
statistics
where
where is_visible
is_visible=
='no'
'no' group
group by
by table_name
table_name,
, index_name
index_name;
;
+
+------------+----------------+------------+
------------+----------------+------------+
|
| TABLE_NAME
TABLE_NAME |
| INDEX_NAME
INDEX_NAME |
| IS_VISIBLE
IS_VISIBLE |
|
+
+------------+----------------+------------+
------------+----------------+------------+
|
| employees
employees |
| first_last_idx
first_last_idx |
| NO
NO |
|
+
+------------+----------------+------------+
------------+----------------+------------+
88

Unused Indexes
We learned that updating indexes that are not used can be expensive and increase the iops.
Using sys Schema and innodb_index_stats it's possible to identify those unused
indexes:
select
database_name,
, table_name
table_name,
, t1
t1.
.index_name
index_name,
,
format_bytes
format_bytes(
(stat_value
stat_value *
* @
@@innodb_page_size
@innodb_page_size)
) size
size
from
from mysql
mysql.
.innodb_index_stats t1
innodb_index_stats t1
join
join sys
sys.
.schema_unused_indexes t2
schema_unused_indexes t2 on
on object_schema
object_schema=
=database_name
database_name
and
and object_name
object_name=
=table_name
table_name and
and t2
t2.
.index_name
index_name=
=t1
t1.
.index_name
index_name
where
where stat_name
stat_name=
='size'
'size' order
order by
by stat_value
stat_value desc
desc;
;
89

Unused Indexes
We learned that updating indexes that are not used can be expensive and increase the iops.
Using sys Schema and innodb_index_stats it's possible to identify those unused
indexes:
select
database_name,
, table_name
table_name,
, t1
t1.
.index_name
index_name,
,
format_bytes
format_bytes(
(stat_value
stat_value *
* @
@@innodb_page_size
@innodb_page_size)
) size
size
from
from mysql
mysql.
join
join sys
sys.
on object_schema
object_schema=
=database_name
database_name
and
and object_name
object_name=
=table_name
table_name and
and t2
t2.
.index_name
index_name=
=t1
t1.
.index_name
index_name
where
where stat_name
stat_name=
='size'
'size' order
order by
by stat_value
stat_value desc
desc;
;
select
database_name,
, table_name
table_name,
, t1
t1.
.index_name
index_name,
,
format_bytes
format_bytes(
(stat_value
stat_value *
* @
@@innodb_page_size
@innodb_page_size)
) size
size
from
from mysql
mysql.
join
join sys
sys.
on object_schema
object_schema=
=database_name
database_name
and
and object_name
object_name=
=table_name
table_name and
and t2
t2.
.index_name
index_name=
=t1
t1.
.index_name
index_name
where
where stat_name
stat_name=
='size'
'size' and
and database_name
database_name=
="employees"
"employees" order
order by
by stat_value
stat_value desc
desc;
;
+
+---------------+--------------+---------------------+-----------+
---------------+--------------+---------------------+-----------+
|
| database_name
database_name |
| table_name
table_name |
| index_name
index_name |
| size
size |
|
+
+---------------+--------------+---------------------+-----------+
---------------+--------------+---------------------+-----------+
|
| employees
employees |
| employees
employees |
| hash_bin_names2
hash_bin_names2 |
| 9.52
9.52 MiB
MiB |
|
|
| employees
employees |
| employees
employees |
| month_year_hire_idx
month_year_hire_idx |
| 6.52
6.52 MiB
MiB |
|
|
| employees
employees |
| dept_emp
dept_emp |
| dept_no
dept_no |
| 5.52
5.52 MiB
MiB |
|
|
| employees
employees |
| dept_manager
dept_manager |
| dept_no
dept_no |
| 16.00
16.00 KiB
KiB |
|
+
+---------------+--------------+---------------------+-----------+
---------------+--------------+---------------------+-----------+
4
4 rows
rows in
in set
set (
(0.0252
0.0252 sec
sec)
)
89

Duplicate Indexes
And this is the same behaviour for duplicate indexes. There is no reason to keep
maintaining them:
select
select t2
t2.
.*
*,
, format_bytes
format_bytes(
(stat_value
stat_value *
* @
@@innodb_page_size
@innodb_page_size)
) size
size
from
from mysql
mysql.
join
join sys
sys.
.schema_redundant_indexes t2
schema_redundant_indexes t2
on
on table_schema
table_schema=
=database_name
database_name and
and t2
t2.
.table_name
table_name=
=t1
t1.
.table_name
table_name
and
and t2
t2.
.redundant_index_name
redundant_index_name=
=t1
t1.
.index_name
index_name
where
where stat_name
stat_name=
='size'
'size' order
order by
by stat_value
stat_value desc
descG
G
90

Duplicate Indexes
And this is the same behaviour for duplicate indexes. There is no reason to keep
maintaining them:
select
select t2
t2.
.*
*,
, format_bytes
format_bytes(
(stat_value
stat_value *
* @
@@innodb_page_size
@innodb_page_size)
) size
size
from
from mysql
mysql.
join
join sys
sys.
.schema_redundant_indexes t2
schema_redundant_indexes t2
on
on table_schema
table_schema=
=database_name
database_name and
and t2
t2.
.table_name
table_name=
=t1
t1.
.table_name
table_name
and
and t2
t2.
.redundant_index_name
redundant_index_name=
=t1
t1.
.index_name
index_name
where
where stat_name
stat_name=
='size'
'size' order
order by
by stat_value
stat_value desc
descG
G
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
table_schema: world
table_schema: world
table_name: city
table_name: city
redundant_index_name: part_of_name
redundant_index_name: part_of_name
redundant_index_columns: Name
redundant_index_columns: Name
redundant_index_non_unique:
redundant_index_non_unique: 1
1
dominant_index_name: name_idx
dominant_index_name: name_idx
dominant_index_columns: Name
dominant_index_columns: Name
dominant_index_non_unique:
dominant_index_non_unique: 1
1
subpart_exists:
subpart_exists: 1
1
sql_drop_index:
sql_drop_index: ALTER
ALTER TABLE
TABLE `
`world
world`
`.
.`
`city
city`
` DROP
DROP INDEX
INDEX `
`part_of_name
part_of_name`
`
size:
size: 112.00
112.00 KiB
KiB
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 2.
2. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
table_schema: world
table_schema: world
table_name: countrylanguage
table_name: countrylanguage
redundant_index_name: CountryCode
redundant_index_name: CountryCode
redundant_index_columns: CountryCode
redundant_index_columns: CountryCode
redundant_index_non_unique:
redundant_index_non_unique: 1
1
dominant_index_name:
dominant_index_name: PRIMARY
PRIMARY
dominant_index_columns: CountryCode
dominant_index_columns: CountryCode,
,Language
Language
dominant_index_non_unique:
dominant_index_non_unique: 0
0
subpart_exists:
subpart_exists: 0
0
sql_drop_index:
sql_drop_index: ALTER
ALTER TABLE
TABLE `
`world
world`
`.
.`
`countrylanguage
countrylanguage`
` DROP
DROP INDEX
INDEX `
`CountryCode
CountryCode`
`
size:
size: 64.00
64.00 KiB
KiB
2
2 rows
rows in
in set
set (
(0.0330
0.0330 sec
sec)
)
90

Don't forget !
Do not take recommendations at face value, check before deleting an index.
Do not delete an index immediately, but rst set it as INVISIBLE for some time. Once in a
while this index might be used, like for a monthly report.
91

Don't forget !
Do not take recommendations at face value, check before deleting an index.
Do not delete an index immediately, but rst set it as INVISIBLE for some time. Once in a
while this index might be used, like for a monthly report.
But when I add or remove an Index, can I estimate the time left ?
91

ALTER Progression
select
select stmt
stmt.
.thread_id
thread_id,
, stmt
stmt.
.sql_text
sql_text,
, stage
stage.
.event_name
event_name as
as state
state,
,
stage
stage.
.work_completed
work_completed,
, stage
stage.
.work_estimated
work_estimated,
,
lpad
lpad(
(concat
concat(
(round
round(
(100
100*
*stage
stage.
.work_completed
work_completed/
/stage
stage.
.work_estimated
work_estimated,
, 2
2)
),
,"%"
"%")
),
,10
10,
," "
" ")
)
as
as completed_at
completed_at,
,
lpad
lpad(
(format_pico_time
format_pico_time(
(stmt
stmt.
.timer_wait
timer_wait)
),
, 10
10,
, " "
" ")
) as
as started_ago
started_ago,
,
lpad
lpad(
(format_pico_time
format_pico_time(
(stmt
stmt.
.timer_wait
timer_wait/
/round
round(
(100
100*
*stage
stage.
.work_completed
work_completed/
/stage
stage.
.work_estimated
work_estimated,
,2
2)
)*
*100
100)
),
,
10
10,
, " "
" ")
) as
as estimated_full_time
estimated_full_time,
,
lpad
lpad(
(format_pico_time
format_pico_time(
((
(stmt
stmt.
.timer_wait
timer_wait/
/round
round(
(100
100*
*stage
stage.
.work_completed
work_completed/
/stage
stage.
.work_estimated
work_estimated,
,2
2)
)*
*100
100)
)
-
-stmt
stmt.
.timer_wait
timer_wait)
),
, 10
10,
, " "
" ")
) as
as estimated_remaining_time
estimated_remaining_time,
,
current_allocated memory
current_allocated memory
from
from performance_schema
performance_schema.
.events_statements_current stmt
events_statements_current stmt
inner
inner join
join sys
sys.
.memory_by_thread_by_current_bytes mt
memory_by_thread_by_current_bytes mt
on
on mt
mt.
.thread_id
thread_id =
= stmt
stmt.
.thread_id
thread_id
inner
inner join
join performance_schema
performance_schema.
.events_stages_current stage
events_stages_current stage
on
on stage
stage.
.thread_id
thread_id =
= stmt
stmt.
.thread_idG
thread_idG
92

ALTER Progression - example
93

Index Creation is slow
< >
Creating indexes is a very slow operation even on my powerfull server with
multiple cores ! Anything I can do ?
94

Index Creation is slow
< >
Creating indexes is a very slow operation even on my powerfull server with
multiple cores ! Anything I can do ?
Since MySQL 8.0.27, you have the possibility to control the maximum of
parallel threads InnoDB can use to create seconday indexes !
94

Parallel Index Creation
The amount of parallel threads used by InnoDB is controlled by innodb_ddl_threads.
This new variable is coupled with another new variable: innodb_ddl_buffer_size.
If you have fast storage and multiple CPU cores, tuning these variables can speed up
secondary index creation.
95

Parallel Index Creation - example
SQL
SQL>
> alter
alter table
table booking
booking
add
add index
index idx_2
idx_2(
(flight_id
flight_id,
, seat
seat,
, passenger_id
passenger_id)
);
;
Query OK
Query OK,
, 0
0 rows
rows affected
affected (
(9
9 min
min 0.6838
0.6838 sec
sec)
)
96

SQL
SQL>
> alter
alter table
table booking
booking
add
add index
index idx_2
idx_2(
(flight_id
flight_id,
, seat
seat,
, passenger_id
passenger_id)
);
;
Query OK
Query OK,
, 0
0 rows
rows affected
affected (
(9
9 min
min 0.6838
0.6838 sec
sec)
)
The default se ings are:
innodb_ddl_threads = 4
innodb_ddl_buffer_size = 1048576
innodb_parallel_read_threads = 4
96

SQL
SQL>
> alter
alter table
table booking
booking
add
add index
index idx_2
idx_2(
(flight_id
flight_id,
, seat
seat,
, passenger_id
passenger_id)
);
;
Query OK
Query OK,
, 0
0 rows
rows affected
affected (
(9
9 min
min 0.6838
0.6838 sec
sec)
)
The default se ings are:
innodb_ddl_threads = 4
innodb_ddl_buffer_size = 1048576
innodb_parallel_read_threads = 4
The innodb_ddl_buffer_size is shared between all innodb_ddl_threads de ned. If
you increase the amount of threads, I recommend that you also increase the bu er size.
96

Parallel Index Creation - example (2)
To nd the best values for these variables, let's have a look at the amount of CPU cores:
SQL
SQL>
> select
select count
count from
information_schema.
.INNODB_METRICS
INNODB_METRICS
where
where name
name =
= 'cpu_n'
'cpu_n';
;
+
+-------+
-------+
|
| count
count |
|
+
+-------+
-------+
|
| 16
16 |
|
+
+-------+
-------+
We have then 16 cores to share. As my machine as plenty of memory, I will allocate 1GB for
the InnoDB DDL bu er.
97

SQL
SQL>
> SET
SET innodb_ddl_threads
innodb_ddl_threads =
= 8
8;
;
SQL
SQL>
> SET
SET innodb_parallel_read_threads
innodb_parallel_read_threads =
= 8
8;
;
SQL
SQL>
> SET
SET innodb_ddl_buffer_size
innodb_ddl_buffer_size =
= 1048576000
1048576000;
;
98

SQL
SQL>
> SET
SET innodb_ddl_threads
innodb_ddl_threads =
= 8
8;
;
SQL
SQL>
> SET
SET innodb_parallel_read_threads
innodb_parallel_read_threads =
= 8
8;
;
SQL
SQL>
> SET
SET innodb_ddl_buffer_size
innodb_ddl_buffer_size =
= 1048576000
1048576000;
;
We can now retry the same index creation as previously:
SQL
SQL>
> alter
alter table
table booking
booking add
add index
index idx_2
idx_2(
(flight_id
flight_id,
, seat
seat,
, passenger_id
passenger_id)
);
;
Query OK
Query OK,
, 0
0 rows
rows affected
affected (
(3
3 min
min 9.1862
9.1862 sec
sec)
)
98

I recommend to make tests to de ne the optimal se ings for your database, your hardware
and data.
For example, I got the best result se ing the bu er size to 2GB and both ddl threads and
parallel read threads to 4.
It took 2 min 43 sec, much be er than the initial 9 minutes !
For more information, go to h ps://lefred.be/content/mysql-8-0-innodb-parallel-threads-
for-online-ddl-operations/
99

Histograms
help the Optimizer to make the right decision
100

Histograms
What is a histogram?
Wikipedia declares a histogram is an accurate representation of the distribution of
numerical data. For RDBMS, a histogram is an approximation of the data distribution within
a speci c column.
So in MySQL, histograms help the optimizer to nd the most e cient Query Plan to fetch
that data.
101

Histograms in MySQL
MySQL provides:
statement histograms
optimizer histograms
102

Histograms in MySQL
MySQL provides:
statement histograms
optimizer histograms
the second category is what we need to focus on today !
102

Statements Histograms
This is an example of query response time distribution for a statement:
103

Global Statements Histograms
If you want a global overview of all statements:
SELECT
SELECT CONCAT
CONCAT(
('<'
'<',
,ROUND
ROUND(
(BUCKET_TIMER_HIGH
BUCKET_TIMER_HIGH/
/1000000
1000000,
,2
2)
),
,
' microsec (<'
' microsec (<',
,ROUND
ROUND(
(BUCKET_TIMER_HIGH
BUCKET_TIMER_HIGH/
/1000000000
1000000000,
,2
2)
) ,
,'ms)'
'ms)')
) QRT
QRT,
,
CONCAT
CONCAT(
(RPAD
RPAD(
(''
'',
,ROUND
ROUND(
(BUCKET_QUANTILE
BUCKET_QUANTILE*
*100
100)
),
,'*'
'*')
),
,
ROUND
ROUND(
(BUCKET_QUANTILE
BUCKET_QUANTILE*
*100
100,
,2
2)
),
,"%"
"%")
) bar
bar
FROM
FROM events_statements_histogram_global
events_statements_histogram_global WHERE
WHERE count_bucket
count_bucket>
>0
0;
;
104

Optimizer Histograms in MySQL
A histogram is a distribution of data into logical buckets
There are two types of histograms:
singleton
equi-height
The maximum number of buckets is 1024.
105

Optimizer Histograms in MySQL (2)
So to help the optimizer to nd the most e cient Query Plan, histograms can be created. As
we know, a histogram is an approximation of the data distribution within a speci c column.
Histograms are useful for columns NOT being candidate to have indexes.
A histogram is created or updated only on demand, so it adds no overhead when table data
is modi ed. On the other hand, the statistics become progressively more out of date when
table modi cations occur, until the next time they are updated.
106

So to help the optimizer to nd the most e cient Query Plan, histograms can be created. As
we know, a histogram is an approximation of the data distribution within a speci c column.
Histograms are useful for columns NOT being candidate to have indexes.
A histogram is created or updated only on demand, so it adds no overhead when table data
is modi ed. On the other hand, the statistics become progressively more out of date when
table modi cations occur, until the next time they are updated.
As an example is be er than 1000 words, let's see how we can create and bene t from
histograms.
(*)Example from MySQL 8 Query Performance Tuning, Jesper Wisborg Krogh, Apress, 2020
106

SELECT
SELECT film_id
film_id,
, title
title,
, length
length,
,
GROUP_CONCAT
GROUP_CONCAT(
(
CONCAT_WS
CONCAT_WS(
(' '
' ',
, first_name
first_name,
, last_name
last_name)
)
)
) AS
AS actors
actors
FROM
FROM sakila
sakila.
.film
film
INNER
INNER JOIN
JOIN sakila
sakila.
.film_actor
film_actor USING
USING (
(film_id
film_id)
)
INNER
INNER JOIN
JOIN sakila
sakila.
.actor
actor USING
USING (
(actor_id
actor_id)
)
WHERE
WHERE length
length <
< 55
55 AND
AND first_name
first_name =
= 'Groucho'
'Groucho'
GROUP
GROUP BY
BY film_id
film_id;
;
Consider this query:
107

SELECT
SELECT film_id
film_id,
, title
title,
, length
length,
,
GROUP_CONCAT
GROUP_CONCAT(
(
CONCAT_WS
CONCAT_WS(
(' '
' ',
, first_name
first_name,
, last_name
last_name)
)
)
) AS
AS actors
actors
FROM
FROM sakila
sakila.
.film
film
INNER
INNER JOIN
JOIN sakila
sakila.
.film_actor
film_actor USING
USING (
(film_id
film_id)
)
INNER
INNER JOIN
JOIN sakila
sakila.
.actor
actor USING
USING (
(actor_id
actor_id)
)
WHERE
WHERE length
length <
< 55
55 AND
AND first_name
first_name =
= 'Groucho'
'Groucho'
GROUP
GROUP BY
BY film_id
film_id;
;
It returns 6 rows:
6 rows in set (0.0143 sec)
Consider this query:
107

Let's have a look at the Query Execution Plan:
108

Now we will create an histogram on the length column of the lm table:
ANALYZE
ANALYZE TABLE
TABLE sakila
sakila.
.film
film
UPDATE
UPDATE HISTOGRAM
HISTOGRAM ON
ON length
length
WITH
WITH 256
256 BUCKETSG
BUCKETSG
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
Table
Table: sakila
: sakila.
.film
film
Op: histogram
Op: histogram
Msg_type:
Msg_type: status
status
Msg_text: Histogram
Msg_text: Histogram statistics
statistics created
created for
for column
column 'length'
'length'.
.
109

Let's have a new look at the Query Execution Plan:
110

For info, the query is now faster:
111

You can also retrieve some information about the histograms:
SELECT
SELECT SCHEMA_NAME
SCHEMA_NAME,
, TABLE_NAME
TABLE_NAME,
, COLUMN_NAME
COLUMN_NAME,
,
HISTOGRAM
HISTOGRAM-
->
>'$."data-type"'
'$."data-type"' 'data-type'
'data-type',
,
HISTOGRAM
HISTOGRAM-
->
>'$."last-updated"'
'$."last-updated"' 'last-updated'
'last-updated',
,
HISTOGRAM
HISTOGRAM-
->
>'$."histogram-type"'
'$."histogram-type"' 'histogram-type'
'histogram-type',
,
HISTOGRAM
HISTOGRAM-
->
>'$."number-of-buckets-specified"'
'$."number-of-buckets-specified"' 'specified-buckets'
'specified-buckets'
FROM
FROM information_schema
information_schema.
.column_statistics
column_statistics
WHERE
WHERE COLUMN_NAME
COLUMN_NAME =
= 'length'
'length';
;
+-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+
| SCHEMA_NAME | TABLE_NAME | COLUMN_NAME | data-type | last-updated | histogram-type | specified-buckets |
+-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+
| sakila | film | length | "int" | "2021-02-24 09:59:13.046631" | "singleton" | 256 |
+-------------+------------+-------------+-----------+------------------------------+----------------+-------------------+
112

select
select v
v value
value,
, concat
concat(
(round
round(
(c
c*
*100
100,
,1
1)
),
,'%'
'%')
) cumulfreq
cumulfreq,
,
concat
concat(
(round
round(
((
(c
c -
- LAG
LAG(
(c
c,
, 1
1,
, 0
0)
)
over
over(
()
))
) *
* 100
100,
,1
1)
),
, '%'
'%')
) freq
freq
from
information_schema.
.column_statistics
column_statistics,
,
JSON_TABLE
JSON_TABLE(
(histogram
histogram-
->
>'$.buckets'
'$.buckets',
,
'$[*]'
'$[*]' COLUMNS
COLUMNS(
(v
v VARCHAR
VARCHAR(
(60
60)
)
PATH
PATH '$[0]'
'$[0]',
, c
c double
double PATH
PATH '$[1]'
'$[1]')
))
) hist
hist
where
where schema_name
schema_name =
= 'sakila'
'sakila'
and
and table_name
table_name =
= 'film'
'film' and
and column_name
column_name =
= 'length'
'length';
;
+-------+-----------+------+
| value | cumulfreq | freq |
+-------+-----------+------+
| 46 | 0.5% | 0.5% |
| 47 | 1.2% | 0.7% |
| 48 | 2.3% | 1.1% |
| 49 | 2.8% | 0.5% |
| 50 | 3.7% | 0.9% |
| 51 | 4.4% | 0.7% |
| 52 | 5.1% | 0.7% |
| 53 | 6% | 0.9% |
| 54 | 6.6% | 0.6% |
| 55 | 6.8% | 0.2% |
| 56 | 7.3% | 0.5% |
...
| 185 | 100% | 1% |
+-------+-----------+------+
Optimizer Histograms (7)
113

Two types of Histograms
Equi-height: One bucket represents a range of values. This type of histograms will be
created when distinct values in the column are greater than the number of buckets
speci ed in the ANALYZE TABLE syntax. Think A-G H-L M-T U-Z.
Singleton: One bucket represents one single value in the column, it is the most accurate and
will be created when the number of distinct values in the column is less than or equal to the
number of buckets speci ed in the ANALYZE TABLE statement.
114

Optimizer Histograms Syntax
ANALYZE
ANALYZE TABLE
TABLE t
t UPDATE
UPDATE HISTOGRAM
HISTOGRAM ON
ON c1
c1,
, c2
c2,
, c3
c3 WITH
WITH 10
10 BUCKETS
BUCKETS;
;
ANALYZE
ANALYZE TABLE
TABLE t
t UPDATE
UPDATE HISTOGRAM
HISTOGRAM ON
ON c1
c1,
, c3
c3 WITH
WITH 10
10 BUCKETS
BUCKETS;
;
ANALYZE
ANALYZE TABLE
TABLE t
t DROP
DROP HISTOGRAM
HISTOGRAM ON
ON c2
c2;
;
Note that the rst statement creates three di erent histograms on c1, c2 and c3 as an
histogram is created per columns
115

Optimizer Histograms Syntax
ANALYZE
ANALYZE TABLE
TABLE t
t UPDATE
UPDATE HISTOGRAM
HISTOGRAM ON
ON c1
c1,
, c2
c2,
, c3
c3 WITH
WITH 10
10 BUCKETS
BUCKETS;
;
ANALYZE
ANALYZE TABLE
TABLE t
t UPDATE
UPDATE HISTOGRAM
HISTOGRAM ON
ON c1
c1,
, c3
c3 WITH
WITH 10
10 BUCKETS
BUCKETS;
;
ANALYZE
ANALYZE TABLE
TABLE t
t DROP
DROP HISTOGRAM
HISTOGRAM ON
ON c2
c2;
;
Note that the rst statement creates three di erent histograms on c1, c2 and c3 as an
histogram is created per columns
Histograms can be created for almost any data type.
If a type is not supported you will get:
The
The column
column 'doc'
'doc' has an unsupported
has an unsupported data
data type
type.
.
115

Optimizer Histograms Syntax (2)
I've created histograms for some columns in di erent tables in the world database.
Information_Schema can be used to retrieve the info related to these histograms:
select
select table_name
table_name,
, column_name
column_name,
, histogram
histogram-
->>
>>'$."data-type"'
'$."data-type"' AS
AS 'data-type'
'data-type',
,
json_length
json_length(
(histogram
histogram-
->>
>>'$."buckets"'
'$."buckets"')
) AS
AS 'bucket-count'
'bucket-count'
from
information_schema.
.column_statistics
column_statistics;
;
+
+------------+-------------+-----------+--------------+
------------+-------------+-----------+--------------+
|
| TABLE_NAME
TABLE_NAME |
| COLUMN_NAME
COLUMN_NAME |
| data
data-
-type
type |
| bucket
bucket-
-count
count |
|
+
+------------+-------------+-----------+--------------+
------------+-------------+-----------+--------------+
|
| country
country |
| Population
Population |
| int
int |
| 226
226 |
|
|
| city
city |
| Population
Population |
| int
int |
| 1024
1024 |
|
|
| countrylan
countrylan |
| Language
Language |
| string
string |
| 457
457 |
|
+
+------------+-------------+-----------+--------------+
------------+-------------+-----------+--------------+
116

Where Histograms Shine
create
create table
table h1
h1 (
(id
id int
int unsigned
unsigned auto_increment
auto_increment,
,
x
x int
int unsigned
unsigned,
, primary
primary key
key(
(id
id)
))
);
;
insert
insert into
into h1
h1 (
(x
x)
) values
values (
(1
1)
),
,(
(1
1)
),
,(
(2
2)
),
,(
(2
2)
),
,(
(2
2)
),
,(
(3
3)
),
,(
(3
3)
),
,(
(3
3)
),
,(
(3
3)
);
;
select
select x
x,
, count
count(
(x
x)
) from
from h1
h1 group
group by
by x
x;
;
+
+---+----------+
---+----------+
|
| x
x |
| count
count(
(x
x)
) |
|
+
+---+----------+
---+----------+
|
| 1
1 |
| 2
2 |
|
|
| 2
2 |
| 3
3 |
|
|
| 3
3 |
| 4
4 |
|
+
+---+----------+
---+----------+
3
3 rows
rows in
in set
set (
(0.0234
0.0234 sec
sec)
)
117

explain
explain select
select *
* from
from h1
h1 where
where x
x >
> 0
0G
G
*
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
id:
id: 1
1
select_type:
select_type: SIMPLE
SIMPLE
table
table: h1
: h1
partitions:
partitions: NULL
NULL
type
type:
: ALL
ALL
possible_keys:
possible_keys: NULL
NULL
key
key:
: NULL
NULL
key_len:
key_len: NULL
NULL
ref:
ref: NULL
NULL
rows
rows:
: 9
9
filtered:
filtered: 33.33
33.33
Extra:
Extra: Using
Using where
where
The ltered column indicates an estimated percentage of table rows
that will be ltered by the table condition. The maximum value is 100,
which means no ltering of rows occurred.
Values decreasing from 100 indicate increasing amounts of ltering.
rows shows the estimated number of rows examined and rows ×
ltered shows the number of rows the opimizer plans to retrieve.
In this example 9 rows x 33% = 3 rows
The Optimizer estimates about 1/3 of the
data will go match the 'X > 0' condition
Where Histograms Shine (2)
118

analyze
analyze table
table h1
h1 update
update histogram
histogram on
on x
x with
with 3
3 bucketsG
bucketsG
*
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
Table
Table: test
: test.
.h1
h1
Op: histogram
Op: histogram
Msg_type:
Msg_type: status
status
Msg_text: Histogram
Msg_text: Histogram statistics
statistics created
created for
for column
column 'x'
'x'.
.
1
1 row
row in
in set
set (
(0.0287
0.0287 sec
sec)
)
119

explain
explain select
select *
* from
from h1
h1 where
where x
x >
> 0
0G
G
*
**
**
**
**
**
**
**
* 1.
1. row
row *
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
**
*
id:
id: 1
1
select_type:
select_type: SIMPLE
SIMPLE
table
table: h1
: h1
partitions:
partitions: NULL
NULL
type
type:
: ALL
ALL
possible_keys:
possible_keys: NULL
NULL
key
key:
: NULL
NULL
key_len:
key_len: NULL
NULL
ref:
ref: NULL
NULL
rows
rows:
: 9
9
filtered:
filtered: 100
100
Extra:
Extra: Using
Using where
where
all rows !
120

Without Histogram:
id:
id: 1
1
select_type:
select_type: SIMPLE
SIMPLE
table
table: h1
: h1
partitions:
partitions: NULL
NULL
type
type:
: ALL
ALL
possible_keys:
possible_keys: NULL
NULL
key
key:
: NULL
NULL
key_len:
key_len: NULL
NULL
ref:
ref: NULL
NULL
rows
rows:
: 9
9
filtered:
filtered: 33.33
33.33
Extra:
Extra: Using
Using where
where
With Histogram:
id:
id: 1
1
select_type:
select_type: SIMPLE
SIMPLE
table
table: h1
: h1
partitions:
partitions: NULL
NULL
type
type:
: ALL
ALL
possible_keys:
possible_keys: NULL
NULL
key
key:
: NULL
NULL
key_len:
key_len: NULL
NULL
ref:
ref: NULL
NULL
rows
rows:
: 9
9
filtered:
filtered: 44.44
44.44
Extra:
Extra: Using
Using where
where
Let's change the condition to x > 2:
121

select
select round
round(
(9
9*
*0.3333
0.3333)
) 'without histogram'
'without histogram',
, round
round(
(9
9*
*0.4444
0.4444)
) 'with histogram'
'with histogram';
;
+
+-------------------+----------------+
-------------------+----------------+
|
| without histogram
without histogram |
| with
with histogram
histogram |
|
+
+-------------------+----------------+
-------------------+----------------+
|
| 3
3 |
| 4
4 |
|
+
+-------------------+----------------+
-------------------+----------------+
select
select count
count(
(*
*)
) from
from h1
h1 where
where x
x >
> 2
2;
;
+
+----------+
----------+
|
| count
count(
(*
*)
) |
|
+
+----------+
----------+
|
| 4
4 |
|
+
+----------+
----------+
122

Histogram vs Index
Why you might consider a histogram instead of an index:
Maintaining an index has a cost. If you have an index, every INSERT/UPDATE/DELETE
causes the index to be updated. This is not free, and will have an impact on your
performance. A histogram on the other hand is created once and never updated unless
you explicitly ask for it. It will thus not hurt your write performance.
If you have an index, the optimizer will need to use it to do what we call "index dives" to
estimate the number of records in a given range. This might become too costly if you
have for instance very long IN-lists in your query. Histogram statistics are much cheaper
in this case, and might thus be more suitable.
123

MySQL HeatWave
I need more performance, much more !
124

MySQL HeatWave
MySQL is also available in Oracle Cloud Infrastructure (OCI) as a managed service.
HeatWave is a massively, high performance, in-memory query accelerator for OCI MySQL
Database Service that accelerates MySQL performance by orders of magnitude for
analytics and mixed workload.
HeatWave can be enabled on demand.
125

When queries are still too slow
Some times, the data is to heavy and the indexes are not manageable or don't t in
memory... in that case it's very complicate to perform query optimization.
This is especially true for Analytics queries.
126

HeatWave Example - data from PiDay
SQL
SQL >
> select
select *
* from
from (
(
select
select date
date(
(time_stamp
time_stamp)
) as
as `
`day
day`
`,
, device_id
device_id,
, count
count(
(*
*)
) as
as `
`tot
tot`
`,
,
max
max(
(value
value)
) as
as `
`max hum
max hum`
`,
, min
min(
(value
value)
) as
as `
`min hum
min hum`
`,
, avg
avg(
(value
value)
) as
as `
`avg hum
avg hum`
`
from
from humidity_history
humidity_history group
group by
by device_id
device_id,
, day
day)
) a
a
natural
natural join
join (
(
select
select date
date(
(time_stamp
time_stamp)
) as
as `
`day
day`
`,
, device_id
device_id,
, count
count(
(*
*)
) as
as `
`tot
tot`
`,
,
max
max(
(value
value)
) as
as `
`max temp
max temp`
`,
, min
min(
(value
value)
) as
as `
`min temp
min temp`
`,
, avg
avg(
(value
value)
) as
as `
`avg temp
avg temp`
`
from
from temperature_history
temperature_history group
group by
by device_id
device_id,
, day
day)
) b
b order
order by
by day
day,
, device_id
device_id;
;
+
+------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
|
| day
day |
| device_id
device_id |
| tot
tot |
| max hum
max hum |
| min hum
min hum |
| avg hum
avg hum |
| max
max temp
temp |
| min
min temp
temp |
| avg
avg temp
temp |
|
+
+------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
|
| 2022
2022-
-03
03-
-09
09 |
| 00006227543
00006227543c0000000000000002
c0000000000000002 |
| 14534
14534 |
| 65.00
65.00 |
| 55.00
55.00 |
| 60.009273
60.009273 |
| 29.99
29.99 |
| 20.00
20.00 |
| 22.597118
22.597118 |
|
|
| 2022
2022-
-03
03-
-09
09 |
| 00006227543
00006227543c0000000000000003
c0000000000000003 |
| 31605
31605 |
| 800.21
800.21 |
| 1.00
1.00 |
| 8.570861
8.570861 |
| 814.36
814.36 |
| 0.00
0.00 |
| 5.079733
5.079733 |
|
|
| 2022
2022-
-03
03-
-09
09 |
| 00006227543
00006227543c0000000000000004
c0000000000000004 |
| 31284
31284 |
| 279.32
279.32 |
| 30.00
30.00 |
| 35.294440
35.294440 |
| 288.44
288.44 |
| 10.00
10.00 |
| 12.797445
12.797445 |
|
|
| 2022
2022-
-03
03-
-10
10 |
| 00006227543
00006227543c0000000000000001
c0000000000000001 |
| 114906
114906 |
| 50.00
50.00 |
| 40.00
40.00 |
| 45.001613
45.001613 |
| 14.00
14.00 |
| 9.00
9.00 |
| 11.499796
11.499796 |
|
|
| 2022
2022-
-03
03-
-10
10 |
| 00006227543
00006227543c0000000000000002
c0000000000000002 |
| 100913
100913 |
| 65.00
65.00 |
| 55.00
55.00 |
| 59.999105
59.999105 |
| 25.00
25.00 |
| 20.00
20.00 |
| 22.501319
22.501319 |
|
|
| 2022
2022-
-03
03-
-10
10 |
| 00006227543
00006227543c0000000000000003
c0000000000000003 |
| 101465
101465 |
| 11.00
11.00 |
| 1.00
1.00 |
| 5.998472
5.998472 |
| 5.00
5.00 |
| 0.00
0.00 |
| 2.501763
2.501763 |
|
|
| 2022
2022-
-03
03-
-10
10 |
| 00006227543
00006227543c0000000000000004
c0000000000000004 |
| 101044
101044 |
| 40.00
40.00 |
| 30.00
30.00 |
| 34.991012
34.991012 |
| 15.00
15.00 |
| 10.00
10.00 |
| 12.496505
12.496505 |
|
+
+------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
7
7 rows
rows in
in set
set (
(1.2717
1.2717 sec
sec)
)
127

Same Query after having enabled and loaded data to HeatWave
SQL
SQL >
> select
select *
* from
from (
(
select
select date
date(
(time_stamp
time_stamp)
) as
as `
`day
day`
`,
, device_id
device_id,
, count
count(
(*
*)
) as
as `
`tot
tot`
`,
,
max
max(
(value
value)
) as
as `
`max hum
max hum`
`,
, min
min(
(value
value)
) as
as `
`min hum
min hum`
`,
, avg
avg(
(value
value)
) as
as `
`avg hum
avg hum`
`
from
from humidity_history
humidity_history group
group by
by device_id
device_id,
, day
day)
) a
a
natural
natural join
join (
(
select
select date
date(
(time_stamp
time_stamp)
) as
as `
`day
day`
`,
, device_id
device_id,
, count
count(
(*
*)
) as
as `
`tot
tot`
`,
,
max
max(
(value
value)
) as
as `
`max temp
max temp`
`,
, min
min(
(value
value)
) as
as `
`min temp
min temp`
`,
, avg
avg(
(value
value)
) as
as `
`avg temp
avg temp`
`
from
from temperature_history
temperature_history group
group by
by device_id
device_id,
, day
day)
) b
b order
order by
by day
day,
, device_id
device_id;
;
+
+------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
|
| day
day |
| device_id
device_id |
| tot
tot |
| max hum
max hum |
| min hum
min hum |
| avg hum
avg hum |
| max
max temp
temp |
| min
min temp
temp |
| avg
avg temp
temp |
|
+
+------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
|
| 2022
2022-
-03
03-
-09
09 |
| 00006227543
00006227543c0000000000000002
c0000000000000002 |
| 14534
14534 |
| 65.00
65.00 |
| 55.00
55.00 |
| 60.009272
60.009272 |
| 29.99
29.99 |
| 20.00
20.00 |
| 22.597117
22.597117 |
|
|
| 2022
2022-
-03
03-
-09
09 |
| 00006227543
00006227543c0000000000000003
c0000000000000003 |
| 31605
31605 |
| 800.21
800.21 |
| 1.00
1.00 |
| 8.570860
8.570860 |
| 814.36
814.36 |
| 0.00
0.00 |
| 5.079732
5.079732 |
|
|
| 2022
2022-
-03
03-
-09
09 |
| 00006227543
00006227543c0000000000000004
c0000000000000004 |
| 31284
31284 |
| 279.32
279.32 |
| 30.00
30.00 |
| 35.294440
35.294440 |
| 288.44
288.44 |
| 10.00
10.00 |
| 12.797445
12.797445 |
|
|
| 2022
2022-
-03
03-
-10
10 |
| 00006227543
00006227543c0000000000000001
c0000000000000001 |
| 115609
115609 |
| 50.00
50.00 |
| 40.00
40.00 |
| 45.001736
45.001736 |
| 14.00
14.00 |
| 9.00
9.00 |
| 11.499157
11.499157 |
|
|
| 2022
2022-
-03
03-
-10
10 |
| 00006227543
00006227543c0000000000000002
c0000000000000002 |
| 100913
100913 |
| 65.00
65.00 |
| 55.00
55.00 |
| 59.999104
59.999104 |
| 25.00
25.00 |
| 20.00
20.00 |
| 22.501318
22.501318 |
|
|
| 2022
2022-
-03
03-
-10
10 |
| 00006227543
00006227543c0000000000000003
c0000000000000003 |
| 101465
101465 |
| 11.00
11.00 |
| 1.00
1.00 |
| 5.998472
5.998472 |
| 5.00
5.00 |
| 0.00
0.00 |
| 2.501762
2.501762 |
|
|
| 2022
2022-
-03
03-
-10
10 |
| 00006227543
00006227543c0000000000000004
c0000000000000004 |
| 101044
101044 |
| 40.00
40.00 |
| 30.00
30.00 |
| 34.991011
34.991011 |
| 15.00
15.00 |
| 10.00
10.00 |
| 12.496504
12.496504 |
|
+
+------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
------------+------------------------------+--------+---------+---------+-----------+----------+----------+-----------+
7
7 rows
rows in
in set
set (
(0.1267
0.1267 sec
sec)
)
128

0.1267 sec VS 1.2717 sec
10x faster... but that is only 1 day of data...
Now lets increase the data to 14 days: +11G of data:
+
+----------+----------+------------+
----------+----------+------------+
|
| DATA
DATA |
| INDEXES
INDEXES |
| TOTAL SIZE
TOTAL SIZE |
|
+
+----------+----------+------------+
----------+----------+------------+
|
| 8.66
8.66 GiB
GiB |
| 2.93
2.93 GiB
GiB |
| 11.59
11.59 GiB
GiB |
|
+
+----------+----------+------------+
----------+----------+------------+
129

Without HeatWave:
44
44 rows
rows in
in set
set (
(10
10 min
min 14.1022
14.1022 sec
sec)
)
With HeatWave:
45
45 rows
rows in
in set
set (
(1.6051
1.6051 sec
sec)
)
14 days of data (11GB)
69M ROWS
383x faster !
130

HeatWave - loading data
alter
alter table
table temperature_history secondary_engine
temperature_history secondary_engine=
=rapid
rapid;
;
Query OK
Query OK,
, 0
0 rows
rows affected
affected (
(0.0257
0.0257 sec
sec)
)
alter
alter table
table temperature_history secondary_load
temperature_history secondary_load;
;
Query OK
Query OK,
, 0
0 rows
rows affected
affected (
(17.3070
17.3070 sec
sec)
)
4.6GB of data for this table (whitout indexes) loaded to HeatWave
131

HeatWave
132

MySQL Workload
knowing your system
133

MySQL Workload
Read or Write Intensive ?
134

MySQL Workload
You should know ! And many people just have an idea (and usually wrong...).
134

MySQL Shell with
plugins can help !
MySQL Workload
You should know ! And many people just have an idea (and usually wrong...).
134

MySQL Workload (2)
Details are also available:
135

MySQL Workload Changes
What can cause a (one-time) performance drop ?
136

MySQL Workload Changes
What can cause a (one-time) performance drop ?
Backup (physical, logical), Clone
Filesystem snapshot
External process consuming resources (IOPS, CPU...)
Maintenance operation (ALTER, Changes in partitions, Archiving data..)
Hardware problem or maintenance
Di erent statistics, di erent Query Execution Plan
136

Save the QEP for your queries
I also recommend to save the Query Execution Plan of some of your queries in a table to
compare them:
137

Save the QEP for your queries (2)
138

Recommended Lectures
books
139

140

Try MySQL HeatWave for Free:
h ps://www.oracle.com/mysql/heatwave/
141

MySQL Shell is now included in Visual Studio Code:
142

143

Share your ❤ to MySQL
#mysql
Join our slack channel!
bit.ly/mysql-slack
144

Questions ?
145

Resources & Credits
h ps://dev.mysql.com/doc/refman/8.0/en/execution-plan-information.html
h ps://github.com/lefred/mysqlshell-plugins
h ps://dev.mysql.com/doc/heatwave/en/
h ps://www.mysql.com/cloud/
146

RivieraJUG - MySQL Indexes and Histograms

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to RivieraJUG - MySQL Indexes and Histograms

Similar to RivieraJUG - MySQL Indexes and Histograms (20)

More from Frederic Descamps

More from Frederic Descamps (20)

Recently uploaded

Recently uploaded (20)

RivieraJUG - MySQL Indexes and Histograms