Billion Goods in Few Categories:
how Histograms Save a Life?
November 7, 2018
Sveta Smirnova
•The Case
•The Cardinality: Two Levels
•ANALYZE TABLE Limitations
•Solutions in Percona Server 5.7
•Histograms
•Conclusion
Table of Contents
2
• MySQL Support engineer
• Author of
• MySQL Troubleshooting
• JSON UDF functions
• FILTER clause for MySQL
• Speaker
• Percona Live, OOW, Fosdem,
DevConf, HighLoad...
Sveta Smirnova
3
• Hardware
• Wise options
• Optimized queries
• Brain
Everything can be Resolved!
4
• This talk is about
• How I spent last two years
• Resolving the same issue
• For different customers
• The task was to speed up the query
Not Everything
5
• Specific data distribution
• Access on different fields
• ON clause
• WHERE clause
• GROUP BY
• ORDER BY
• Index cannot be used effectively
Not All Queries Can Be Optimized
6
• Topic based on real Support cases
• Couple of them are still in progress
Disclaimer
7
• Topic based on real Support cases
• All examples are 100% fake
• They are created such that
• No customer can be identified
• Everything is generated
Table names
Column names
Data
• The use case itself is fictional
Disclaimer
7
• Topic based on real Support cases
• All examples are 100% fake
• All examples are simplified
• Only the columns required to show the issue
• Everything extra removed
• Real tables usually store much more data
Disclaimer
7
• Topic based on real Support cases
• All examples are 100% fake
• All examples are simplified
• All disasters happened with version 5.7
Disclaimer
7
The Case
• categories
• Less than 20 rows
• goods
• More than 1M rows
• 20 unique cat_id values
• Many other fields
Price
Date: added, last updated, etc.
Characteristics
Store
...
Two tables
9
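For reference, a minimal sketch of the two tables; the column names and types here are assumptions chosen to match the query on the next slide, not the real customer schema:

CREATE TABLE categories (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(64)
) ENGINE=InnoDB;

CREATE TABLE goods (
  id         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  cat_id     INT NOT NULL,        -- ~20 distinct values across 1M+ rows
  price      DECIMAL(10,2),
  date_added DATETIME,
  -- many other fields omitted
  KEY cat_id_2 (cat_id)           -- secondary index that shows up in the EXPLAIN output later
) ENGINE=InnoDB;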
select *
from
goods
join
categories
on
(categories.id=goods.cat_id)
where
date_added between '2018-07-01' and '2018-08-01'
and
cat_id in (16,11)
and
price >= 1000 and price <= 10000 [ and ... ]
[ GROUP BY ... [ORDER BY ... [ LIMIT ...]]]
;
JOIN
10
• Select from the Small Table
• For each cat_id select from the large table
• Filter the result on date_added[ and price[...]]
• Slow with many items in the category
Option 1: Select from the Small Table First
11
• Filter rows by date_added[ and price[...]]
• Get cat_id values
• Retrieve rows from the small table
• Slow if the number of rows filtered by
date_added is larger than the number of
goods in the selected categories
Option 2: Select from the Large Table First
12
• CREATE INDEX index_everything
(cat_id, date_added[, price[, ...]])
• It resolves the issue
• But not in all cases
What if use Combined Indexes?
13
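Spelled out for the sketch schema above (a hedged example; the exact column list depends on the real query):

ALTER TABLE goods
  ADD INDEX index_everything (cat_id, date_added, price);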
• Maintenance cost
• Slower INSERT/UPDATE/DELETE
• Disk space
• Tables may have wrong cardinality
The Problem
14
The Cardinality: Two Levels
• Optimizer
• Engine
• TokuDB
• InnoDB
• Any
MySQL Is a Layered Architecture
16
• Stores statistics on disk
• mysql.innodb_table_stats
• mysql.innodb_index_stats
InnoDB: Overview
17
• Stores statistics on disk
• Returns statistics to Optimizer
• In ha_innobase::info
• handler/ha_innodb.cc
• When the table is opened
• flag = HA_STATUS_CONST
• Reads data from disk
• Stores it in memory
InnoDB: Overview
17
• Stores statistics on disk
• Returns statistics to Optimizer
• In ha_innobase::info
• handler/ha_innodb.cc
• When the table is opened
• Subsequent table accesses
• flag = HA_STATUS_VARIABLE
• Statistics from memory
• Up-to-date Primary Key data
InnoDB: Overview
17
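The persisted numbers can be inspected directly; for example, for the goods table used in this talk:

SELECT table_name, n_rows, clustered_index_size
  FROM mysql.innodb_table_stats
 WHERE table_name = 'goods';

SELECT index_name, stat_name, stat_value, sample_size
  FROM mysql.innodb_index_stats
 WHERE table_name = 'goods';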
• Table created with option STATS_AUTO_RECALC=0
• Before ANALYZE TABLE
mysql> show index from test\G
...
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: f1
Seq_in_index: 1
Column_name: f1
Collation: A
Cardinality: 64
...
InnoDB: Flow
18
• Table created with option STATS_AUTO_RECALC=0
• After ANALYZE TABLE
mysql> show index from test\G
...
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: f1
Seq_in_index: 1
Column_name: f1
Collation: A
Cardinality: 2
...
InnoDB: Flow
18
• Table created with option STATS_AUTO_RECALC=0
• After inserting rows
mysql> show index from test\G
...
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: f1
Seq_in_index: 1
Column_name: f1
Collation: A
Cardinality: 16
...
InnoDB: Flow
18
• Table created with option STATS_AUTO_RECALC=0
• After restart
mysql> show index from test\G
...
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: f1
Seq_in_index: 1
Column_name: f1
Collation: A
Cardinality: 2
...
InnoDB: Flow
18
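A hedged sketch of how the flow above can be reproduced (table and column names are illustrative):

CREATE TABLE test (f1 INT, KEY (f1)) ENGINE=InnoDB STATS_AUTO_RECALC=0;
INSERT INTO test VALUES (1),(2);      -- two distinct values
INSERT INTO test SELECT * FROM test;  -- repeat to grow the table; statistics are not recalculated
SHOW INDEX FROM test\G                -- Cardinality drifts with the in-memory row count
ANALYZE TABLE test;                   -- resamples the index and persists the statistics
SHOW INDEX FROM test\G                -- Cardinality: 2 again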
• Takes data from the engine
• Class ha_statistics
• sql/handler.h
• Does not have Cardinality field at all
• Uses formula to calculate Cardinality
Optimizer: Overview
19
• n_rows: number of rows in the table
• Naturally up to date
• Constantly changing!
• rec_per_key: number of duplicates per key
• Calculated by InnoDB at ANALYZE time
• rec_per_key = n_rows / unique values
• Does not change!
• Cardinality = n_rows / rec_per_key
Optimizer: Formula
20
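The formula can be checked against the persisted numbers; a rough illustration, assuming the goods table and its cat_id_2 index from the later example:

SELECT t.n_rows,
       i.stat_value                         AS n_unique,     -- unique values seen by sampling
       t.n_rows / i.stat_value              AS rec_per_key,  -- duplicates per key value
       t.n_rows / (t.n_rows / i.stat_value) AS cardinality   -- what SHOW INDEX reports
  FROM mysql.innodb_table_stats t
  JOIN mysql.innodb_index_stats i USING (database_name, table_name)
 WHERE t.table_name = 'goods'
   AND i.index_name = 'cat_id_2'
   AND i.stat_name  = 'n_diff_pfx01';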
• Engine stores persistent statistics
TokuDB InnoDB
Storage Files Tables
Statistics As Calculated As Calculated
Row Count Persistent Only in Memory
• Optimizer calculates Cardinality every time
it accesses engine statistics
• Weak user control
Persistent Statistics Are Not Persistent
21
ANALYZE TABLE Limitations
• Counts number of pages in the table
• Takes STATS_SAMPLE_PAGES sample pages
• Counts the number of unique values in the
secondary index in these pages
• Divides the number of pages in the table by
the number of sample pages and multiplies
the result by the number of unique values
How ANALYZE TABLE Works with InnoDB?
23
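The sample size is controlled per table by the STATS_SAMPLE_PAGES option, or server-wide by innodb_stats_persistent_sample_pages (default 20); for example:

SET GLOBAL innodb_stats_persistent_sample_pages = 128;  -- default for tables without their own setting
ALTER TABLE goods STATS_SAMPLE_PAGES = 5000;            -- per-table override, used later in this talk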
• Number of pages in the table: 20,000
• STATS_SAMPLE_PAGES: 20 (default)
• Unique values in the secondary index:
• In sample pages: 10
• In the table: 11
• Cardinality: 20,000 * 10 / 20 = 10,000
Example
24
• Number of pages in the table: 20,000
• STATS_SAMPLE_PAGES: 5,000
• Unique values in the secondary index:
• In sample pages: 10
• In the table: 11
• Cardinality: 20,000 * 10 / 5,000 = 40
Example 2
25
• Time consuming
mysql> select count(*) from goods;
+----------+
| count(*) |
+----------+
| 80303000 |
+----------+
1 row in set (35.95 sec)
Use Larger STATS_SAMPLE_PAGES?
26
• Time consuming
• With default STATS_SAMPLE_PAGES
mysql> analyze table goods;
+------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+------------+---------+----------+----------+
| test.goods | analyze | status | OK |
+------------+---------+----------+----------+
1 row in set (0.32 sec)
Use Larger STATS_SAMPLE_PAGES?
26
• Time consuming
• With bigger number
mysql> alter table goods STATS_SAMPLE_PAGES=5000;
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> analyze table goods;
+------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+------------+---------+----------+----------+
| test.goods | analyze | status | OK |
+------------+---------+----------+----------+
1 row in set (27.13 sec)
Use Larger STATS_SAMPLE_PAGES?
26
• Time consuming
• With bigger number
• 27.13/0.32 = 85 times slower!
Use Larger STATS_SAMPLE_PAGES?
26
User Manual claims it does not
During the analysis, the table is locked
with a read lock for InnoDB and MyISAM.
Does ANALYZE TABLE Block Reads?
27
User Manual claims it does not
• But!
Does ANALYZE TABLE Block Reads?
27
User Manual claims it does not
Sometimes it blocks all subsequent queries
+------+-------------------------+---------------------------------+
| Time | State | Info |
+------+-------------------------+---------------------------------+
| 32 | Writing to net | select * from t where c > '%0%' |
| 12 | Waiting for table flush | select * from test.t where i=1 |
| 12 | Waiting for table flush | select * from test.t where i=2 |
| 12 | Waiting for table flush | select * from test.t where i=3 |
| 11 | Waiting for table flush | select * from test.t where i=7 |
| 10 | Waiting for table flush | select * from test.t where i=11 |
...
Does ANALYZE TABLE Block Reads?
27
Is not a solution
Simply Increasing STATS_SAMPLE_PAGES
28
Solutions in Percona Server 5.7
Considered as a bug
• jira.percona.com/browse/PS-2503
• lp:1704195
• bugs.mysql.com/87065
Blocking ANALYZE TABLE
30
Considered as a bug
• Fixed in Percona Server
5.6.38-83.0/5.7.20-18
Blocking ANALYZE TABLE
30
• Before the fix
• Opens table statistics
Concurrent DML allowed
• Updates table statistics
Concurrent DML allowed
• Update finished
• Invalidates entry in table definition cache
Concurrent DML forbidden
• Invalidates query cache
Concurrent DML forbidden
Non-Blocking ANALYZE TABLE
31
• After the fix
• Opens table statistics
Concurrent DML allowed
• Updates table statistics
Concurrent DML allowed
• Update finished
• Invalidates entry in table definition cache
Concurrent DML forbidden
• Invalidates query cache
Concurrent DML forbidden
Non-Blocking ANALYZE TABLE
31
• InnoDB stores its statistics
mysql.innodb_index_stats
• This table is writable
• Updating it, followed by FLUSH TABLE,
allows faking any statistics
• Hack
• Not documented
• Not recommended
• Can stop working any time
Without the Fix: Manual Update
32
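A hedged sketch of the hack for the goods example (again: undocumented, unsupported, and it may stop working at any time):

UPDATE mysql.innodb_index_stats
   SET stat_value = 20                 -- pretend the index has 20 distinct values
 WHERE database_name = 'test'          -- illustrative schema name
   AND table_name    = 'goods'
   AND index_name    = 'cat_id_2'
   AND stat_name     = 'n_diff_pfx01';
FLUSH TABLE goods;                     -- force the server to re-read the faked statistics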
• With the Percona fix for blocking ANALYZE
TABLE we can use a large value for
STATS_SAMPLE_PAGES
• Does not help when
• Index cannot be used
• Data distribution in the index varies a lot
• Manual update allows fixing statistics
• Not recommended
• Can stop working any time
5.7: Resume
33
Histograms
• Optimizer Column Statistics
• Engine-independent
• No fancy calculations
• Knows about data distribution
What are the Histograms?
35
[Bar chart: ten buckets, counts from 0 to 800]
Number of Values in Each Bucket
36
[Chart: ten buckets, cumulative frequency from 0 to 1]
Data in the Histogram
37
• Accurate statistics
• Truly persistent
• No extra calculations on access
• Optimizer knows about data distribution
• Without touching the table!
How Histograms are Helpful?
38
• Example data
mysql> create table example(f1 int) engine=innodb;
mysql> insert into example values(1),(1),(1),(2),(3);
mysql> select f1, count(f1) from example group by f1;
+------+-----------+
| f1 | count(f1) |
+------+-----------+
| 1 | 3 |
| 2 | 1 |
| 3 | 1 |
+------+-----------+
3 rows in set (0.00 sec)
Filtered Rows
39
• Without a histogram
mysql> explain select * from example where f1 > 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
Filtered Rows
39
• Without a histogram
mysql> explain select * from example where f1 > 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
Filtered Rows
39
• Without a histogram
mysql> explain select * from example where f1 > 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
Filtered Rows
39
• Without a histogram
mysql> explain select * from example where f1 > 3\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
Filtered Rows
39
• With the histogram
mysql> analyze table example update histogram on f1 with 3 buckets;
+-----------------+-----------+----------+------------------------------+
| Table | Op | Msg_type | Msg_text |
+-----------------+-----------+----------+------------------------------+
| hist_ex.example | histogram | status | Histogram statistics created
for column 'f1'. |
+-----------------+-----------+----------+------------------------------+
1 row in set (0.03 sec)
Filtered Rows
39
• With the histogram
mysql> select * from information_schema.column_statistics
-> where table_name='example'\G
*************************** 1. row ***************************
SCHEMA_NAME: hist_ex
TABLE_NAME: example
COLUMN_NAME: f1
HISTOGRAM:
"buckets": [[1, 0.6], [2, 0.8], [3, 1.0]],
"data-type": "int", "null-values": 0.0, "collation-id": 8,
"last-updated": "2018-11-07 09:07:19.791470",
"sampling-rate": 1.0, "histogram-type": "singleton",
"number-of-buckets-specified": 3
1 row in set (0.00 sec)
Filtered Rows
39
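Each bucket of this singleton histogram stores a value and its cumulative frequency, so the estimates in the EXPLAIN outputs that follow can be derived by hand:

-- Cumulative frequencies above: 1 -> 0.6, 2 -> 0.8, 3 -> 1.0
-- f1 > 0 : 1.0 - 0.0 = 1.00 -> filtered: 100.00 (all 5 rows)
-- f1 > 1 : 1.0 - 0.6 = 0.40 -> filtered: 40.00  (2 rows)
-- f1 > 2 : 1.0 - 0.8 = 0.20 -> filtered: 20.00  (1 row)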
• With the histogram
mysql> explain select * from example where f1 > 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 100.00 -- all rows
Extra: Using where
Filtered Rows
39
• With the histogram
mysql> explain select * from example where f1 > 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 40.00 -- 2 rows
Extra: Using where
Filtered Rows
39
• With the histogram
mysql> explain select * from example where f1 > 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 20.00 -- one row
Extra: Using where
Filtered Rows
39
• With the histogram
mysql> explain select * from example where f1 > 3\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 20.00 -- one row
Extra: Using where
Filtered Rows
39
• EXPLAIN without histograms
mysql> explain select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01' -- Large range
-> order by goods.cat_id
-> limit 10\G -- We ask for 10 rows only!
Example
40
• EXPLAIN without histograms
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: categories -- Small table first
partitions: NULL
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 20
filtered: 70.00
Extra: Using where; Using index;
Using temporary; Using filesort
Example
40
• EXPLAIN without histograms
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: goods -- Large table
partitions: NULL
type: ref
possible_keys: cat_id_2
key: cat_id_2
key_len: 5
ref: orig.categories.id
rows: 51827
filtered: 11.11 -- Default value
Extra: Using where
2 rows in set, 1 warning (0.01 sec)
Example
40
• Execution time without histograms
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10;
ab9f9bb7bc4f357712ec34f067eda364 -
10 rows in set (56.47 sec)
Example
40
• Engine statistics without histograms
mysql> show status like 'Handler%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
...
| Handler_read_next | 964718 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 10 |
| Handler_read_rnd_next | 951671 |
...
| Handler_write | 951670 |
+----------------------------+--------+
18 rows in set (0.01 sec)
Example
40
• Now lets add the histogram
mysql> analyze table goods update histogram on date_added;
+------------+-----------+----------+------------------------------+
| Table | Op | Msg_type | Msg_text |
+------------+-----------+----------+------------------------------+
| orig.goods | histogram | status | Histogram statistics created
for column 'date_added'. |
+------------+-----------+----------+------------------------------+
1 row in set (2.01 sec)
Example
40
• EXPLAIN with the histogram
mysql> explain select goods.* from goods
-> join categories
-> on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10\G
Example
40
• EXPLAIN with the histogram
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: goods -- Large table first
partitions: NULL
type: index
possible_keys: cat_id_2
key: cat_id_2
key_len: 5
ref: NULL
rows: 10 -- Same as we asked
filtered: 98.70 -- True numbers
Extra: Using where
Example
40
• EXPLAIN with the histogram
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: categories -- Small table
partitions: NULL
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: orig.goods.cat_id
rows: 1
filtered: 100.00
Extra: Using index
2 rows in set, 1 warning (0.01 sec)
Example
40
• Execution time with the histogram
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10;
eeb005fae0dd3441c5c380e1d87fee84 -
10 rows in set (0.00 sec) -- 56 times faster!
Example
40
• Engine statistics with the histogram
mysql> show status like 'Handler%';
+----------------------------+-------++----------------------------+-------+
| Variable_name | Value || Variable_name | Value |
+----------------------------+-------++----------------------------+-------+
| Handler_commit | 1 || Handler_read_prev | 0 |
| Handler_delete | 0 || Handler_read_rnd | 0 |
| Handler_discover | 0 || Handler_read_rnd_next | 0 |
| Handler_external_lock | 4 || Handler_rollback | 0 |
| Handler_mrr_init | 0 || Handler_savepoint | 0 |
| Handler_prepare | 0 || Handler_savepoint_rollback | 0 |
| Handler_read_first | 1 || Handler_update | 0 |
| Handler_read_key | 3 || Handler_write | 0 |
| Handler_read_last | 0 |+----------------------------+-------+
| Handler_read_next | 9 |18 rows in set (0.00 sec)
Example
40
• Data distribution is uniform
• Range optimization can be used
• Full table scan is fast
When Histograms are not Helpful?
41
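In those cases the histogram can simply be dropped again; the statement mirrors the one that created it:

ANALYZE TABLE goods DROP HISTOGRAM ON date_added;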
• Backward index scan
• Better Statistics Persistence in InnoDB
• MySQL bug #80178
• MySQL bug #84654
• Better PRIMARY key access
Other Improvements in 8.0
42
Conclusion
• Index statistics collected by the engine
• Optimizer calculates Cardinality each time
it accesses statistics
• Indexes do not always improve performance
• Histograms can help
Still a new feature
Conclusion
44
MySQL User Reference Manual
Blog by Erik Froseth
Blog by Frederic Descamps
Talk by Oystein Grovlen @Fosdem
Talk by Sergei Petrunia @PerconaLive
Talk by Sergei Golubchik @HighLoad++
More information
45
Rate My Session!
46
http://www.slideshare.net/SvetaSmirnova
https://twitter.com/svetsmirnova
https://github.com/svetasmirnova
Thank you!
47
How TokuDB Updates Statistics
• Stores key statistics on disk and in memory
• tablename_status_id.tokudb
TokuDB: Overview
49
• Stores key statistics on disk and in memory
• Stores row count on disk and in memory
• tablename_main_id.tokudb
• tablename_key_keyname_id.tokudb
TokuDB: Overview
49
• Stores key statistics on disk and in memory
• Stores row count on disk and in memory
• Returns statistics to Optimizer
• In ha_tokudb::info (handler/ha_tokudb.cc)
TokuDB: Overview
49
• Stored on disk
• Updated during ANALYZE
• Background ANALYZE
• Explicitly called
• Not updated when tokudb_auto_analyze=0
TokuDB: Key Statistics
50
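A hedged sketch of controlling the background analysis mentioned above; tokudb_auto_analyze is the percentage of changed rows that triggers it, and 0 disables it:

SET GLOBAL tokudb_auto_analyze = 30;   -- re-analyze after roughly 30% of the rows change
SHOW VARIABLES LIKE 'tokudb_analyze%'; -- related knobs: background, mode, throttle, time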
• Updated in TOKUDB_SHARE::update_cardinality_counts
• Stored in tokudb::set_card_in_status
• In standard ANALYZE
• standard_t::on_run
TokuDB Key Statistics: Code
51
• Updated in TOKUDB_SHARE::update_cardinality_counts
• Stored in tokudb::set_card_in_status
• Retrieved in tokudb::get_card_from_status
• When the table is opened
• In ha_tokudb::initialize_share
TokuDB Key Statistics: Code
51
• Updated in TOKUDB_SHARE::update_cardinality_counts
• Stored in tokudb::set_card_in_status
• Retrieved in tokudb::get_card_from_status
• Used in TOKUDB_SHARE::set_cardinality_counts_in_table
for (uint32_t j = 0; j < key->actual_key_parts; j++) {
...
assert_always(next_key_part < _rec_per_keys);
ulong val = _rec_per_key[next_key_part++];
val = (val * tokudb::sysvars::cardinality_scale_percent) / 100;
TokuDB Key Statistics: Code
51
• Stored on disk
• Updated
• Each time the table is updated
• When ha_tokudb::info is called
TokuDB Logical Rows Count
52
mysql> create table test(
-> id int not null auto_increment primary key,
-> f1 int,
-> ts timestamp,
-> key(f1)
-> ) engine=tokudb;
Query OK, 0 rows affected (0.10 sec)
mysql> insert into test (f1, ts) values(1, NOW()), (2, NOW());
Query OK, 2 rows affected (0.03 sec)
Records: 2 Duplicates: 0 Warnings: 0
...
mysql> insert into test (f1, ts) select f1, NOW() from test;
Query OK, 32 rows affected (0.01 sec)
Records: 32 Duplicates: 0 Warnings: 0
TokuDB Test Case
53
mysql> select count(distinct id), count(distinct f1) from test;
+--------------------+--------------------+
| count(distinct id) | count(distinct f1) |
+--------------------+--------------------+
| 64 | 2 |
+--------------------+--------------------+
1 row in set (0.01 sec)
TokuDB Test Case
53
• SHOW INDEX
mysql> show index from test\G
*************************** 1. row ***************************
Table: test
Non_unique: 0
Key_name: PRIMARY
Column_name: id
Cardinality: 64
...
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: f1
Column_name: f1
Cardinality: 64
TokuDB: After First run
54
• Number of rows
$ ../bin/tokuftdump --header --nodata var/mysqld.1/data/test/test_key_f1_145_1_1
ft:
layout_version=29
layout_version_original=29
layout_version_read_from_disk=29
build_id=0
build_id_original=0
time_of_creation= 1537709029 Sun Sep 23 16:23:49 2018
time_of_last_modification=1537709100 Sun Sep 23 16:25:00 2018
...
estimated numrows=64
estimated numbytes=640
logical row count=64
TokuDB: After First run
54
• Index Statistics
Thread 44 "mysqld" hit Breakpoint 1, TOKUDB_SHARE::set_cardinality_counts_in_tab
(this=0x7fd86da54020, table=0x7fd86d90b020)
at /home/sveta/src/percona-server/storage/tokudb/ha_tokudb.cc:400
400 if (val == 0 || _rows == 0 ||
(gdb) p key->name
$21 = 0x7fd86d879999 "f1"
(gdb) p val
$22 = 0
TokuDB: After First run
54
• Cardinality = 64 / 0 = 64
TokuDB: After First run
54
mysql> analyze table test;
+-----------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-----------+---------+----------+----------+
| test.test | analyze | status | OK |
+-----------+---------+----------+----------+
1 row in set (0.01 sec)
TokuDB: ANALYZE TABLE
55
• SHOW INDEX
mysql> show index from test\G
*************************** 1. row ***************************
Table: test
Non_unique: 0
Key_name: PRIMARY
Column_name: id
Cardinality: 64
...
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: f1
Column_name: f1
Cardinality: 2
TokuDB: After ANALYZE TABLE
56
• Number of rows
$ ../bin/tokuftdump --header --nodata var/mysqld.1/data/test/test_key_f1_145_1_1
ft:
layout_version=29
layout_version_original=29
layout_version_read_from_disk=29
build_id=0
build_id_original=0
time_of_creation= 1537709029 Sun Sep 23 16:23:49 2018
time_of_last_modification=1537709100 Sun Sep 23 16:25:00 2018
...
estimated numrows=64
estimated numbytes=640
logical row count=64
TokuDB: After ANALYZE TABLE
56
• Index Statistics
Thread 44 "mysqld" hit Breakpoint 1, TOKUDB_SHARE::set_cardinality_counts_in_tab
(this=0x7fd86da54020, table=0x7fd86d90b020)
at /home/sveta/src/percona-server/storage/tokudb/ha_tokudb.cc:400
400 if (val == 0 || _rows == 0 ||
(gdb) p key->name
$26 = 0x7fd86d879999 "f1"
(gdb) p val
$27 = 32
TokuDB: After ANALYZE TABLE
56
• Cardinality = 64 / 32 = 2
TokuDB: After ANALYZE TABLE
56
mysql> insert into test (f1, ts) select f1, NOW() from test;
Query OK, 64 rows affected (0.01 sec)
Records: 64 Duplicates: 0 Warnings: 0
mysql> insert into test (f1, ts) select f1, NOW() from test;
Query OK, 128 rows affected (0.01 sec)
Records: 128 Duplicates: 0 Warnings: 0
mysql> insert into test (f1, ts) select f1, NOW() from test;
Query OK, 256 rows affected (0.02 sec)
Records: 256 Duplicates: 0 Warnings: 0
mysql> select count(distinct id), count(distinct f1) from test;
+--------------------+--------------------+
| count(distinct id) | count(distinct f1) |
+--------------------+--------------------+
| 512 | 2 |
+--------------------+--------------------+
1 row in set (0.01 sec)
TokuDB: Let’s Insert More Data
57
• SHOW INDEX
mysql> show index from test\G
*************************** 1. row ***************************
Table: test
Non_unique: 0
Key_name: PRIMARY
Column_name: id
Cardinality: 512
...
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: f1
Column_name: f1
Cardinality: 16
TokuDB: After INSERT
58
• Number of rows
$ ../bin/tokuftdump --header --nodata var/mysqld.1/data/test/test_key_f1_145_1_1
ft:
layout_version=29
layout_version_original=29
layout_version_read_from_disk=29
build_id=0
build_id_original=0
time_of_creation= 1537709029 Sun Sep 23 16:23:49 2018
time_of_last_modification=1537709880 Sun Sep 23 16:38:00 2018
...
estimated numrows=512
estimated numbytes=5120
logical row count=512
TokuDB: After INSERT
58
• Index Statistics
Thread 44 "mysqld" hit Breakpoint 1, TOKUDB_SHARE::set_cardinality_counts_in_tab
(this=0x7fd86da54020, table=0x7fd86d90b020)
at /home/sveta/src/percona-server/storage/tokudb/ha_tokudb.cc:400
400 if (val == 0 || _rows == 0 ||
(gdb) p key->name
$30 = 0x7fd86d879999 "f1"
(gdb) p val
$31 = 32
TokuDB: After INSERT
58
• Cardinality = 512 / 32 = 16
TokuDB: After INSERT
58
• SHOW INDEX
mysql> show index from test\G
*************************** 1. row ***************************
Table: test
Non_unique: 0
Key_name: PRIMARY
Column_name: id
Cardinality: 512
...
*************************** 2. row ***************************
Table: test
Non_unique: 1
Key_name: f1
Column_name: f1
Cardinality: 16
TokuDB: After Restart
59
• Index Statistics
Thread 44 "mysqld" hit Breakpoint 1, TOKUDB_SHARE::set_cardinality_counts_in_tab
(this=0x7fd4e67ea020, table=0x7fd4e6765c20)
at /home/sveta/src/percona-server/storage/tokudb/ha_tokudb.cc:400
400 if (val == 0 || _rows == 0 ||
(gdb) p key->name
$3 = 0x7fd4e66d7599 "f1"
(gdb) p val
$4 = 32
TokuDB: After Restart
59
• Cardinality = 512 / 32 = 16
TokuDB: After Restart
59
• Cardinality = 512 / 32 = 16
• Same!
TokuDB: After Restart
59
• Index statistics updated only when ANALYZE
TABLE is running
• Logical row count updated each time the
number of rows changes
• Cardinality is based on both numbers
• It is expected that the cardinality is not the same
• After updates
• Even when ANALYZE TABLE was never run
TokuDB: Conclusion
60
More Related Content

What's hot

MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in ActionSveta Smirnova
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sSveta Smirnova
 
Performance Schema in Action: demo
Performance Schema in Action: demoPerformance Schema in Action: demo
Performance Schema in Action: demoSveta Smirnova
 
Preparse Query Rewrite Plugins
Preparse Query Rewrite PluginsPreparse Query Rewrite Plugins
Preparse Query Rewrite PluginsSveta Smirnova
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in ActionSveta Smirnova
 
Introduction into MySQL Query Tuning
Introduction into MySQL Query TuningIntroduction into MySQL Query Tuning
Introduction into MySQL Query TuningSveta Smirnova
 
0888 learning-mysql
0888 learning-mysql0888 learning-mysql
0888 learning-mysqlsabir18
 
MySQL Query tuning 101
MySQL Query tuning 101MySQL Query tuning 101
MySQL Query tuning 101Sveta Smirnova
 
Troubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-onsTroubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-onsSveta Smirnova
 
Why Use EXPLAIN FORMAT=JSON?
 Why Use EXPLAIN FORMAT=JSON?  Why Use EXPLAIN FORMAT=JSON?
Why Use EXPLAIN FORMAT=JSON? Sveta Smirnova
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZENorvald Ryeng
 
Optimizer overviewoow2014
Optimizer overviewoow2014Optimizer overviewoow2014
Optimizer overviewoow2014Mysql User Camp
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQLGeorgi Sotirov
 
Basic MySQL Troubleshooting for Oracle Database Administrators
Basic MySQL Troubleshooting for Oracle Database AdministratorsBasic MySQL Troubleshooting for Oracle Database Administrators
Basic MySQL Troubleshooting for Oracle Database AdministratorsSveta Smirnova
 
Understanding Query Execution
Understanding Query ExecutionUnderstanding Query Execution
Understanding Query Executionwebhostingguy
 
Highload Perf Tuning
Highload Perf TuningHighload Perf Tuning
Highload Perf TuningHighLoad2009
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 MinutesSveta Smirnova
 
How to migrate from MySQL to MariaDB without tears
How to migrate from MySQL to MariaDB without tearsHow to migrate from MySQL to MariaDB without tears
How to migrate from MySQL to MariaDB without tearsSveta Smirnova
 
New features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in actionNew features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in actionSveta Smirnova
 
Performance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingPerformance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingSveta Smirnova
 

What's hot (20)

MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]s
 
Performance Schema in Action: demo
Performance Schema in Action: demoPerformance Schema in Action: demo
Performance Schema in Action: demo
 
Preparse Query Rewrite Plugins
Preparse Query Rewrite PluginsPreparse Query Rewrite Plugins
Preparse Query Rewrite Plugins
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
 
Introduction into MySQL Query Tuning
Introduction into MySQL Query TuningIntroduction into MySQL Query Tuning
Introduction into MySQL Query Tuning
 
0888 learning-mysql
0888 learning-mysql0888 learning-mysql
0888 learning-mysql
 
MySQL Query tuning 101
MySQL Query tuning 101MySQL Query tuning 101
MySQL Query tuning 101
 
Troubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-onsTroubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-ons
 
Why Use EXPLAIN FORMAT=JSON?
 Why Use EXPLAIN FORMAT=JSON?  Why Use EXPLAIN FORMAT=JSON?
Why Use EXPLAIN FORMAT=JSON?
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZE
 
Optimizer overviewoow2014
Optimizer overviewoow2014Optimizer overviewoow2014
Optimizer overviewoow2014
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQL
 
Basic MySQL Troubleshooting for Oracle Database Administrators
Basic MySQL Troubleshooting for Oracle Database AdministratorsBasic MySQL Troubleshooting for Oracle Database Administrators
Basic MySQL Troubleshooting for Oracle Database Administrators
 
Understanding Query Execution
Understanding Query ExecutionUnderstanding Query Execution
Understanding Query Execution
 
Highload Perf Tuning
Highload Perf TuningHighload Perf Tuning
Highload Perf Tuning
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 Minutes
 
How to migrate from MySQL to MariaDB without tears
How to migrate from MySQL to MariaDB without tearsHow to migrate from MySQL to MariaDB without tears
How to migrate from MySQL to MariaDB without tears
 
New features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in actionNew features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in action
 
Performance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingPerformance Schema for MySQL Troubleshooting
Performance Schema for MySQL Troubleshooting
 

Similar to Billion Goods in Few Categories: how Histograms Save a Life?

MariaDB 10 Tutorial - 13.11.11 - Percona Live London
MariaDB 10 Tutorial - 13.11.11 - Percona Live LondonMariaDB 10 Tutorial - 13.11.11 - Percona Live London
MariaDB 10 Tutorial - 13.11.11 - Percona Live LondonIvan Zoratti
 
MySQL Query Optimization (Basics)
MySQL Query Optimization (Basics)MySQL Query Optimization (Basics)
MySQL Query Optimization (Basics)Karthik .P.R
 
Advanced MariaDB features that developers love.pdf
Advanced MariaDB features that developers love.pdfAdvanced MariaDB features that developers love.pdf
Advanced MariaDB features that developers love.pdfFederico Razzoli
 
Index the obvious and not so obvious
Index the obvious and not so obviousIndex the obvious and not so obvious
Index the obvious and not so obviousHarry Zheng
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentationMichael Keane
 
5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...
5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...
5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...Kangaroot
 
10x improvement-mysql-100419105218-phpapp02
10x improvement-mysql-100419105218-phpapp0210x improvement-mysql-100419105218-phpapp02
10x improvement-mysql-100419105218-phpapp02promethius
 
10x Performance Improvements
10x Performance Improvements10x Performance Improvements
10x Performance ImprovementsRonald Bradford
 
Deep Dive into DynamoDB
Deep Dive into DynamoDBDeep Dive into DynamoDB
Deep Dive into DynamoDBAWS Germany
 
PL/SQL New and Advanced Features for Extreme Performance
PL/SQL New and Advanced Features for Extreme PerformancePL/SQL New and Advanced Features for Extreme Performance
PL/SQL New and Advanced Features for Extreme PerformanceZohar Elkayam
 
MySQL performance monitoring using Statsd and Graphite (PLUK2013)
MySQL performance monitoring using Statsd and Graphite (PLUK2013)MySQL performance monitoring using Statsd and Graphite (PLUK2013)
MySQL performance monitoring using Statsd and Graphite (PLUK2013)spil-engineering
 
MariaDB for Developers and Operators (DevOps)
MariaDB for Developers and Operators (DevOps)MariaDB for Developers and Operators (DevOps)
MariaDB for Developers and Operators (DevOps)Colin Charles
 
My Query is slow, now what?
My Query is slow, now what?My Query is slow, now what?
My Query is slow, now what?Gianluca Sartori
 
Hundreds of queries in the time of one - Gianmario Spacagna
Hundreds of queries in the time of one - Gianmario SpacagnaHundreds of queries in the time of one - Gianmario Spacagna
Hundreds of queries in the time of one - Gianmario SpacagnaSpark Summit
 
MySQL Query Optimisation 101
MySQL Query Optimisation 101MySQL Query Optimisation 101
MySQL Query Optimisation 101Federico Razzoli
 
Tuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseTuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseAnil Gupta
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sqldeathsubte
 

Similar to Billion Goods in Few Categories: how Histograms Save a Life? (20)

MariaDB 10 Tutorial - 13.11.11 - Percona Live London
MariaDB 10 Tutorial - 13.11.11 - Percona Live LondonMariaDB 10 Tutorial - 13.11.11 - Percona Live London
MariaDB 10 Tutorial - 13.11.11 - Percona Live London
 
MySQL Query Optimization (Basics)
MySQL Query Optimization (Basics)MySQL Query Optimization (Basics)
MySQL Query Optimization (Basics)
 
Advanced MariaDB features that developers love.pdf
Advanced MariaDB features that developers love.pdfAdvanced MariaDB features that developers love.pdf
Advanced MariaDB features that developers love.pdf
 
Index the obvious and not so obvious
Index the obvious and not so obviousIndex the obvious and not so obvious
Index the obvious and not so obvious
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentation
 
Mssql
MssqlMssql
Mssql
 
5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...
5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...
5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...
 
10x improvement-mysql-100419105218-phpapp02
10x improvement-mysql-100419105218-phpapp0210x improvement-mysql-100419105218-phpapp02
10x improvement-mysql-100419105218-phpapp02
 
10x Performance Improvements
10x Performance Improvements10x Performance Improvements
10x Performance Improvements
 
Deep Dive into DynamoDB
Deep Dive into DynamoDBDeep Dive into DynamoDB
Deep Dive into DynamoDB
 
PL/SQL New and Advanced Features for Extreme Performance
PL/SQL New and Advanced Features for Extreme PerformancePL/SQL New and Advanced Features for Extreme Performance
PL/SQL New and Advanced Features for Extreme Performance
 
MySQL performance monitoring using Statsd and Graphite (PLUK2013)
MySQL performance monitoring using Statsd and Graphite (PLUK2013)MySQL performance monitoring using Statsd and Graphite (PLUK2013)
MySQL performance monitoring using Statsd and Graphite (PLUK2013)
 
MariaDB for Developers and Operators (DevOps)
MariaDB for Developers and Operators (DevOps)MariaDB for Developers and Operators (DevOps)
MariaDB for Developers and Operators (DevOps)
 
My Query is slow, now what?
My Query is slow, now what?My Query is slow, now what?
My Query is slow, now what?
 
Hundreds of queries in the time of one - Gianmario Spacagna
Hundreds of queries in the time of one - Gianmario SpacagnaHundreds of queries in the time of one - Gianmario Spacagna
Hundreds of queries in the time of one - Gianmario Spacagna
 
MySQL Query Optimisation 101
MySQL Query Optimisation 101MySQL Query Optimisation 101
MySQL Query Optimisation 101
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Tuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseTuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBase
 
Star schema my sql
Star schema   my sqlStar schema   my sql
Star schema my sql
 
Sql server
Sql serverSql server
Sql server
 

More from Sveta Smirnova

MySQL 2024: Зачем переходить на MySQL 8, если в 5.х всё устраивает?
MySQL 2024: Зачем переходить на MySQL 8, если в 5.х всё устраивает?MySQL 2024: Зачем переходить на MySQL 8, если в 5.х всё устраивает?
MySQL 2024: Зачем переходить на MySQL 8, если в 5.х всё устраивает?Sveta Smirnova
 
Database in Kubernetes: Diagnostics and Monitoring
Database in Kubernetes: Diagnostics and MonitoringDatabase in Kubernetes: Diagnostics and Monitoring
Database in Kubernetes: Diagnostics and MonitoringSveta Smirnova
 
MySQL Database Monitoring: Must, Good and Nice to Have
MySQL Database Monitoring: Must, Good and Nice to HaveMySQL Database Monitoring: Must, Good and Nice to Have
MySQL Database Monitoring: Must, Good and Nice to HaveSveta Smirnova
 
MySQL Cookbook: Recipes for Developers
MySQL Cookbook: Recipes for DevelopersMySQL Cookbook: Recipes for Developers
MySQL Cookbook: Recipes for DevelopersSveta Smirnova
 
MySQL Performance for DevOps
MySQL Performance for DevOpsMySQL Performance for DevOps
MySQL Performance for DevOpsSveta Smirnova
 
MySQL Test Framework для поддержки клиентов и верификации багов
MySQL Test Framework для поддержки клиентов и верификации баговMySQL Test Framework для поддержки клиентов и верификации багов
MySQL Test Framework для поддержки клиентов и верификации баговSveta Smirnova
 
MySQL Cookbook: Recipes for Your Business
MySQL Cookbook: Recipes for Your BusinessMySQL Cookbook: Recipes for Your Business
MySQL Cookbook: Recipes for Your BusinessSveta Smirnova
 
Производительность MySQL для DevOps
 Производительность MySQL для DevOps Производительность MySQL для DevOps
Производительность MySQL для DevOpsSveta Smirnova
 
MySQL Performance for DevOps
MySQL Performance for DevOpsMySQL Performance for DevOps
MySQL Performance for DevOpsSveta Smirnova
 
How to Avoid Pitfalls in Schema Upgrade with Percona XtraDB Cluster
How to Avoid Pitfalls in Schema Upgrade with Percona XtraDB ClusterHow to Avoid Pitfalls in Schema Upgrade with Percona XtraDB Cluster
How to Avoid Pitfalls in Schema Upgrade with Percona XtraDB ClusterSveta Smirnova
 
How Safe is Asynchronous Master-Master Setup?
How Safe is Asynchronous Master-Master Setup?How Safe is Asynchronous Master-Master Setup?
How Safe is Asynchronous Master-Master Setup?Sveta Smirnova
 
Современному хайлоду - современные решения: MySQL 8.0 и улучшения Percona
Современному хайлоду - современные решения: MySQL 8.0 и улучшения PerconaСовременному хайлоду - современные решения: MySQL 8.0 и улучшения Percona
Современному хайлоду - современные решения: MySQL 8.0 и улучшения PerconaSveta Smirnova
 
How to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraHow to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraSveta Smirnova
 
How Safe is Asynchronous Master-Master Setup?
 How Safe is Asynchronous Master-Master Setup? How Safe is Asynchronous Master-Master Setup?
How Safe is Asynchronous Master-Master Setup?Sveta Smirnova
 
Что нужно знать о трёх топовых фичах MySQL
Что нужно знать  о трёх топовых фичах  MySQLЧто нужно знать  о трёх топовых фичах  MySQL
Что нужно знать о трёх топовых фичах MySQLSveta Smirnova
 
Why MySQL Replication Fails, and How to Get it Back
Why MySQL Replication Fails, and How to Get it BackWhy MySQL Replication Fails, and How to Get it Back
Why MySQL Replication Fails, and How to Get it BackSveta Smirnova
 

More from Sveta Smirnova (16)

MySQL 2024: Зачем переходить на MySQL 8, если в 5.х всё устраивает?
MySQL 2024: Зачем переходить на MySQL 8, если в 5.х всё устраивает?MySQL 2024: Зачем переходить на MySQL 8, если в 5.х всё устраивает?
MySQL 2024: Зачем переходить на MySQL 8, если в 5.х всё устраивает?
 
Database in Kubernetes: Diagnostics and Monitoring
Database in Kubernetes: Diagnostics and MonitoringDatabase in Kubernetes: Diagnostics and Monitoring
Database in Kubernetes: Diagnostics and Monitoring
 
MySQL Database Monitoring: Must, Good and Nice to Have
MySQL Database Monitoring: Must, Good and Nice to HaveMySQL Database Monitoring: Must, Good and Nice to Have
MySQL Database Monitoring: Must, Good and Nice to Have
 
MySQL Cookbook: Recipes for Developers
MySQL Cookbook: Recipes for DevelopersMySQL Cookbook: Recipes for Developers
MySQL Cookbook: Recipes for Developers
 
MySQL Performance for DevOps
MySQL Performance for DevOpsMySQL Performance for DevOps
MySQL Performance for DevOps
 
MySQL Test Framework для поддержки клиентов и верификации багов
MySQL Test Framework для поддержки клиентов и верификации баговMySQL Test Framework для поддержки клиентов и верификации багов
MySQL Test Framework для поддержки клиентов и верификации багов
 
MySQL Cookbook: Recipes for Your Business
MySQL Cookbook: Recipes for Your BusinessMySQL Cookbook: Recipes for Your Business
MySQL Cookbook: Recipes for Your Business
 
Производительность MySQL для DevOps
 Производительность MySQL для DevOps Производительность MySQL для DevOps
Производительность MySQL для DevOps
 
MySQL Performance for DevOps
MySQL Performance for DevOpsMySQL Performance for DevOps
MySQL Performance for DevOps
 
How to Avoid Pitfalls in Schema Upgrade with Percona XtraDB Cluster
How to Avoid Pitfalls in Schema Upgrade with Percona XtraDB ClusterHow to Avoid Pitfalls in Schema Upgrade with Percona XtraDB Cluster
How to Avoid Pitfalls in Schema Upgrade with Percona XtraDB Cluster
 
How Safe is Asynchronous Master-Master Setup?
How Safe is Asynchronous Master-Master Setup?How Safe is Asynchronous Master-Master Setup?
How Safe is Asynchronous Master-Master Setup?
 
Современному хайлоду - современные решения: MySQL 8.0 и улучшения Percona
Современному хайлоду - современные решения: MySQL 8.0 и улучшения PerconaСовременному хайлоду - современные решения: MySQL 8.0 и улучшения Percona
Современному хайлоду - современные решения: MySQL 8.0 и улучшения Percona
 
How to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraHow to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with Galera
 
How Safe is Asynchronous Master-Master Setup?
 How Safe is Asynchronous Master-Master Setup? How Safe is Asynchronous Master-Master Setup?
How Safe is Asynchronous Master-Master Setup?
 
Что нужно знать о трёх топовых фичах MySQL
Что нужно знать  о трёх топовых фичах  MySQLЧто нужно знать  о трёх топовых фичах  MySQL
Что нужно знать о трёх топовых фичах MySQL
 
Why MySQL Replication Fails, and How to Get it Back
Why MySQL Replication Fails, and How to Get it BackWhy MySQL Replication Fails, and How to Get it Back
Why MySQL Replication Fails, and How to Get it Back
 

Recently uploaded

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Billion Goods in Few Categories: how Histograms Save a Life?

  • 1. Billion Goods in Few Categories: how Histograms Save a Life? November 7, 2018 Sveta Smirnova
  • 2. •The Case •The Cardinality: Two Levels •ANALYZE TABLE Limitations •Solutions in Percona Server 5.7 •Histograms •Conclusion Table of Contents 2
  • 3. • MySQL Support engineer • Author of • MySQL Troubleshooting • JSON UDF functions • FILTER clause for MySQL • Speaker • Percona Live, OOW, Fosdem, DevConf, HighLoad... Sveta Smirnova 3
  • 4. • Hardware • Wise options • Optimized queries • Brain Everything can be Resolved! 4
  • 5. • This talk is about • How I spent last two years • Resolving the same issue • For different customers Not Everything 5
  • 6. • This talk is about • How I spent last two years • Resolving the same issue • For different customers • Task was to speed up the query Not Everything 5
  • 7. • Specific data distribution • Access on different fields • ON clause • WHERE clause • GROUP BY • ORDER BY • Index cannot be used effectively Not All the Queries can be Optimized 6
  • 8. • Topic based on real Support cases • Couple of them are still in progress Disclaimer 7
  • 9. • Topic based on real Support cases • All examples are 100% fake • They are created such that • No customer can be identified • Everything is generated: Table names Column names Data • The use case itself is fictional Disclaimer 7
  • 10. • Topic based on real Support cases • All examples are 100% fake • All examples are simplified • Only the columns required to show the issue • Everything extra removed • Real tables usually store much more data Disclaimer 7
  • 11. • Topic based on real Support cases • All examples are 100% fake • All examples are simplified • All disasters happened with version 5.7 Disclaimer 7
  • 13. • categories • Less than 20 rows Two tables 9
  • 14. • categories • Less than 20 rows • goods • More than 1M rows • 20 unique cat_id values • Many other fields Price Date: added, last updated, etc. Characteristics Store ... Two tables 9
  • 15. select * from goods join categories on (categories.id=goods.cat_id) where date_added between ’2018-07-01’ and ’2018-08-01’ and cat_id in (16,11) and price >= 1000 and price <= 10000 [ and ... ] [ GROUP BY ... [ORDER BY ... [ LIMIT ...]]] ; JOIN 10
  • 16. • Select from the Small Table Option 1: Select from the Small Table First 11
  • 17. • Select from the Small Table • For each cat_id select from the large table Option 1: Select from the Small Table First 11
  • 18. • Select from the Small Table • For each cat_id select from the large table • Filter result on date_added[ and price[...]] Option 1: Select from the Small Table First 11
  • 19. • Select from the Small Table • For each cat_id select from the large table • Filter result on date_added[ and price[...]] • Slow with many items in the category Option 1: Select from the Small Table First 11
  • 20. • Filter rows by date_added[ and price[...]] Option 2: Select from the Large Table First 12
  • 21. • Filter rows by date_added[ and price[...]] • Get cat_id values Option 2: Select from the Large Table First 12
  • 22. • Filter rows by date_added[ and price[...]] • Get cat_id values • Retrieve rows from the small table Option 2: Select from the Large Table First 12
  • 23. • Filter rows by date_added[ and price[...]] • Get cat_id values • Retrieve rows from the small table • Slow if the number of rows filtered by date_added is larger than the number of goods in the selected categories Option 2: Select from the Large Table First 12
  • 24. • CREATE INDEX index_everything (cat_id, date_added[, price[, ...]]) • It resolves the issue What if we use Combined Indexes? 13
  • 25. • CREATE INDEX index_everything (cat_id, date_added[, price[, ...]]) • It resolves the issue • But not in all cases What if we use Combined Indexes? 13
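For reference, such a combined index could be created with a statement along these lines. This is only a sketch using the illustrative goods table and column names from this deck; the exact column list depends on the query being optimized:

ALTER TABLE goods ADD INDEX index_everything (cat_id, date_added, price);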
  • 26. • Maintenance cost • Slower INSERT/UPDATE/DELETE • Disk space The Problem 14
  • 27. • Maintenance cost • Slower INSERT/UPDATE/DELETE • Disk space • Tables may have wrong cardinality The Problem 14
  • 29. • Optimizer • Engine • TokuDB • InnoDB • Any MySQL is Layered Architecture 16
  • 30. • Stores statistics on disk • mysql.innodb_table_stats • mysql.innodb_index_stats (see the query sketch after this overview) InnoDB: Overview 17
  • 31. • Stores statistics on disk • Returns statistics to Optimizer InnoDB: Overview 17
  • 32. • Stores statistics on disk • Returns statistics to Optimizer • In ha_innobase::info • handler/ha_innodb.cc InnoDB: Overview 17
  • 33. • Stores statistics on disk • Returns statistics to Optimizer • In ha_innobase::info • handler/ha_innodb.cc • When the table is opened • flag = HA_STATUS_CONST • Reads data from disk • Stores it in memory InnoDB: Overview 17
  • 34. • Stores statistics on disk • Returns statistics to Optimizer • In ha_innobase::info • handler/ha_innodb.cc • When the table is opened • Subsequent table accesses • flag = HA_STATUS_VARIABLE • Statistics from memory • Up-to-date Primary Key data InnoDB: Overview 17
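The persistent statistics tables mentioned in this overview can be inspected directly. A minimal sketch, assuming the goods table from this deck lives in the test schema:

SELECT * FROM mysql.innodb_table_stats WHERE database_name = 'test' AND table_name = 'goods';
SELECT * FROM mysql.innodb_index_stats WHERE database_name = 'test' AND table_name = 'goods';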
  • 35. • Table created with option STATS_AUTO_RECALC = 0 • Before ANALYZE TABLE mysql> show index from test\G ... *************************** 2. row *************************** Table: test Non_unique: 1 Key_name: f1 Seq_in_index: 1 Column_name: f1 Collation: A Cardinality: 64 ... InnoDB: Flow 18
  • 36. • Table created with option STATS_AUTO_RECALC = 0 • After ANALYZE TABLE mysql> show index from test\G ... *************************** 2. row *************************** Table: test Non_unique: 1 Key_name: f1 Seq_in_index: 1 Column_name: f1 Collation: A Cardinality: 2 ... InnoDB: Flow 18
  • 37. • Table created with option STATS_AUTO_RECALC = 0 • After inserting rows mysql> show index from test\G ... *************************** 2. row *************************** Table: test Non_unique: 1 Key_name: f1 Seq_in_index: 1 Column_name: f1 Collation: A Cardinality: 16 ... InnoDB: Flow 18
  • 38. • Table created with option STATS_AUTO_RECALC = 0 • After restart mysql> show index from test\G ... *************************** 2. row *************************** Table: test Non_unique: 1 Key_name: f1 Seq_in_index: 1 Column_name: f1 Collation: A Cardinality: 2 ... InnoDB: Flow 18
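A hedged sketch of a table definition that reproduces the flow above; the column layout is invented for illustration, only the STATS_AUTO_RECALC option matters here:

CREATE TABLE test (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  f1 INT,
  KEY (f1)
) ENGINE=InnoDB STATS_AUTO_RECALC=0;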
  • 39. • Takes data from the engine Optimizer: Overview 19
  • 40. • Takes data from the engine • Class ha_statistics • sql/handler.h Optimizer: Overview 19
  • 41. • Takes data from the engine • Class ha_statistics • sql/handler.h • Does not have a Cardinality field at all Optimizer: Overview 19
  • 42. • Takes data from the engine • Class ha_statistics • sql/handler.h • Does not have a Cardinality field at all • Uses a formula to calculate Cardinality Optimizer: Overview 19
  • 43. • n_rows: number of rows in the table • Naturally up to date • Constantly changing! Optimizer: Formula 20
  • 44. • n_rows: number of rows in the table • Naturally up to date • Constantly changing! • rec_per_key: number of duplicates per key • Calculated by InnoDB at the time of ANALYZE • rec_per_key = n_rows / unique values • Does not change! Optimizer: Formula 20
  • 45. • n_rows: number of rows in the table • Naturally up to date • Constantly changing! • rec_per_key: number of duplicates per key • Calculated by InnoDB at the time of ANALYZE • rec_per_key = n_rows / unique values • Does not change! • Cardinality = n_rows / rec_per_key Optimizer: Formula 20
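A worked example of the formula with invented numbers: if ANALYZE ran when goods had 100,000 rows and 20 distinct cat_id values, rec_per_key was stored as 100,000 / 20 = 5,000. When the table later grows to 1,000,000 rows without a new ANALYZE, rec_per_key stays at 5,000, so the reported Cardinality becomes 1,000,000 / 5,000 = 200 rather than the true 20.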
  • 46. • Engine stores persistent statistics (TokuDB vs InnoDB: Storage - Files vs Tables; Statistics - As Calculated in both; Row Count - Persistent vs Only in Memory) Persistent Statistics Are Not Persistent 21
  • 47. • Engine stores persistent statistics (TokuDB vs InnoDB: Storage - Files vs Tables; Statistics - As Calculated in both; Row Count - Persistent vs Only in Memory) • Optimizer calculates Cardinality every time it accesses engine statistics Persistent Statistics Are Not Persistent 21
  • 48. • Engine stores persistent statistics (TokuDB vs InnoDB: Storage - Files vs Tables; Statistics - As Calculated in both; Row Count - Persistent vs Only in Memory) • Optimizer calculates Cardinality every time it accesses engine statistics • Weak user control Persistent Statistics Are Not Persistent 21
  • 50. • Counts number of pages in the table How ANALYZE TABLE Works with InnoDB? 23
  • 51. • Counts number of pages in the table • Takes STATS_SAMPLE_PAGES How ANALYZE TABLE Works with InnoDB? 23
  • 52. • Counts number of pages in the table • Takes STATS_SAMPLE_PAGES • Counts number of unique values in the secondary index in these pages How ANALYZE TABLE Works with InnoDB? 23
  • 53. • Counts number of pages in the table • Takes STATS_SAMPLE_PAGES • Counts number of unique values in the secondary index in these pages • Divides the number of pages in the table by the number of sample pages and multiplies the result by the number of unique values How ANALYZE TABLE Works with InnoDB? 23
  • 54. • Number of pages in the table: 20,000 • STATS_SAMPLE_PAGES: 20 (default) • Unique values in the secondary index: • In sample pages: 10 • In the table: 11 Example 24
  • 55. • Number of pages in the table: 20,000 • STATS_SAMPLE_PAGES: 20 (default) • Unique values in the secondary index: • In sample pages: 10 • In the table: 11 • Cardinality: 20,000 * 10 / 20 = 10,000 Example 24
  • 56. • Number of pages in the table: 20,000 • STATS_SAMPLE_PAGES: 5,000 • Unique values in the secondary index: • In sample pages: 10 • In the table: 11 • Cardinality: 20,000 * 10 / 5,000 = 40 Example 2 25
  • 57. • Time consuming mysql> select count(*) from goods; +----------+ | count(*) | +----------+ | 80303000 | +----------+ 1 row in set (35.95 sec) Use Larger STATS_SAMPLE_PAGES? 26
  • 58. • Time consuming • With the default STATS_SAMPLE_PAGES mysql> analyze table goods; +------------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +------------+---------+----------+----------+ | test.goods | analyze | status | OK | +------------+---------+----------+----------+ 1 row in set (0.32 sec) Use Larger STATS_SAMPLE_PAGES? 26
  • 59. • Time consuming • With a bigger number mysql> alter table goods STATS_SAMPLE_PAGES=5000; Query OK, 0 rows affected (0.04 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> analyze table goods; +------------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +------------+---------+----------+----------+ | test.goods | analyze | status | OK | +------------+---------+----------+----------+ 1 row in set (27.13 sec) Use Larger STATS_SAMPLE_PAGES? 26
  • 60. • Time consuming • With a bigger number • 27.13/0.32 = 85 times slower! Use Larger STATS_SAMPLE_PAGES? 26
  • 61. User Manual claims it does not During the analysis, the table is locked with a read lock for InnoDB and MyISAM. Does ANALYZE TABLE Block Reads? 27
  • 62. User Manual claims it does not • But! Does ANALYZE TABLE Block Reads? 27
  • 63. User Manual claims it does not Sometimes it blocks all subsequent queries +------+-------------------------+---------------------------------+ | Time | State | Info | +------+-------------------------+---------------------------------+ | 32 | Writing to net | select * from t where c > ’%0%’ | | 12 | Waiting for table flush | select * from test.t where i=1 | | 12 | Waiting for table flush | select * from test.t where i=2 | | 12 | Waiting for table flush | select * from test.t where i=3 | | 11 | Waiting for table flush | select * from test.t where i=7 | | 10 | Waiting for table flush | select * from test.t where i=11 | ... Does ANALYZE TABLE Block Reads? 27
  • 64. Is not a solution Simply Increasing STATS_SAMPLE_PAGES 28
  • 65. Solutions in Percona Server 5.7
  • 66. Considered as a bug • jira.percona.com/browse/PS-2503 • lp:1704195 • bugs.mysql.com/87065 Blocking ANALYZE TABLE 30
  • 67. Considered as a bug • Fixed in Percona Server 5.6.38-83.0/5.7.20-18 Blocking ANALYZE TABLE 30
  • 68. • Before the fix • Opens table statistics Concurrent DML allowed • Updates table statistics Concurrent DML allowed • Update finished • Invalidates entry in table definition cache Concurrent DML forbidden • Invalidates query cache Concurrent DML forbidden Non-Blocking ANALYZE TABLE 31
  • 69. • After the fix • Opens table statistics Concurrent DML allowed • Updates table statistics Concurrent DML allowed • Update finished • Invalidates entry in table definition cache Concurrent DML forbidden • Invalidates query cache Concurrent DML forbidden Non-Blocking ANALYZE TABLE 31
  • 70. • InnoDB stores its statistics in mysql.innodb_index_stats Without the Fix: Manual Update 32
  • 71. • InnoDB stores its statistics in mysql.innodb_index_stats • This table is writable Without the Fix: Manual Update 32
  • 72. • InnoDB stores its statistics in mysql.innodb_index_stats • This table is writable • Updating it, followed by FLUSH TABLE, allows you to fake any statistics Without the Fix: Manual Update 32
  • 73. • InnoDB stores its statistics in mysql.innodb_index_stats • This table is writable • Updating it, followed by FLUSH TABLE, allows you to fake any statistics • Hack • Not documented • Not recommended • Can stop working at any time Without the Fix: Manual Update 32
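A minimal sketch of that hack, assuming the goods table in the test schema and its cat_id_2 index; the stat_value of 20 is invented purely for illustration and, as the slide warns, none of this is documented or recommended:

UPDATE mysql.innodb_index_stats
   SET stat_value = 20
 WHERE database_name = 'test'
   AND table_name = 'goods'
   AND index_name = 'cat_id_2'
   AND stat_name = 'n_diff_pfx01';
FLUSH TABLES goods;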
  • 74. • With the Percona fix for blocking ANALYZE TABLE we can use a large value for STATS_SAMPLE_PAGES • Does not help when • Index cannot be used • Data distribution in the index varies a lot 5.7: Resume 33
  • 75. • With the Percona fix for blocking ANALYZE TABLE we can use a large value for STATS_SAMPLE_PAGES • Does not help when • Index cannot be used • Data distribution in the index varies a lot • Manual update allows you to fix the statistics • Not recommended • Can stop working at any time 5.7: Resume 33
  • 77. • Optimizer Column Statistics • Engine-independent • No fancy calculations • Knows about data distribution What are the Histograms? 35
  • 78. [Bar chart of the number of values in each of 10 histogram buckets] Number of Values in Each Bucket 36
  • 79. [Chart of the cumulative fraction (0 to 1) stored for each bucket] Data in the Histogram 37
  • 80. • Accurate statistics • Truly persistent • No extra calculations on access • Optimizer knows about data distribution • Without touching the table! How Histograms are Helpful? 38
  • 81. • Example data mysql> create table example(f1 int) engine=innodb; mysql> insert into example values(1),(1),(1),(2),(3); mysql> select f1, count(f1) from example group by f1; +------+-----------+ | f1 | count(f1) | +------+-----------+ | 1 | 3 | | 2 | 1 | | 3 | 1 | +------+-----------+ 3 rows in set (0.00 sec) Filtered Rows 39
  • 82. • Without a histogram mysql> explain select * from example where f1 > 0\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where Filtered Rows 39
  • 83. • Without a histogram mysql> explain select * from example where f1 > 1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where Filtered Rows 39
  • 84. • Without a histogram mysql> explain select * from example where f1 > 2\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where Filtered Rows 39
  • 85. • Without a histogram mysql> explain select * from example where f1 > 3\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where Filtered Rows 39
  • 86. • With the histogram mysql> analyze table example update histogram on f1 with 3 buckets; +-----------------+-----------+----------+------------------------------+ | Table | Op | Msg_type | Msg_text | +-----------------+-----------+----------+------------------------------+ | hist_ex.example | histogram | status | Histogram statistics created for column ’f1’. | +-----------------+-----------+----------+------------------------------+ 1 row in set (0.03 sec) Filtered Rows 39
  • 87. • With the histogram mysql> select * from information_schema.column_statistics -> where table_name=’example’\G *************************** 1. row *************************** SCHEMA_NAME: hist_ex TABLE_NAME: example COLUMN_NAME: f1 HISTOGRAM: "buckets": [[1, 0.6], [2, 0.8], [3, 1.0]], "data-type": "int", "null-values": 0.0, "collation-id": 8, "last-updated": "2018-11-07 09:07:19.791470", "sampling-rate": 1.0, "histogram-type": "singleton", "number-of-buckets-specified": 3 1 row in set (0.00 sec) Filtered Rows 39
  • 88. • With the histogram mysql> explain select * from example where f1 > 0\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 100.00 -- all rows Extra: Using where Filtered Rows 39
  • 89. • With the histogram mysql> explain select * from example where f1 > 1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 40.00 -- 2 rows Extra: Using where Filtered Rows 39
  • 90. • With the histogram mysql> explain select * from example where f1 > 2\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 -- one row Extra: Using where Filtered Rows 39
  • 91. • With the histogram mysql> explain select * from example where f1 > 3\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 -- one row Extra: Using where Filtered Rows 39
  • 92. • EXPLAIN without histograms mysql> explain select goods.* from goods -> join categories on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between ’2000-01-01’ and ’2001-01-01’ -- Large range -> order by goods.cat_id -> limit 10\G -- We ask for 10 rows only! Example 40
  • 93. • EXPLAIN without histograms *************************** 1. row *************************** id: 1 select_type: SIMPLE table: categories -- Small table first partitions: NULL type: index possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: NULL rows: 20 filtered: 70.00 Extra: Using where; Using index; Using temporary; Using filesort Example 40
  • 94. • EXPLAIN without histograms *************************** 2. row *************************** id: 1 select_type: SIMPLE table: goods -- Large table partitions: NULL type: ref possible_keys: cat_id_2 key: cat_id_2 key_len: 5 ref: orig.categories.id rows: 51827 filtered: 11.11 -- Default value Extra: Using where 2 rows in set, 1 warning (0.01 sec) Example 40
  • 95. • Execution time without histograms mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select goods.* from goods -> join categories on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between ’2000-01-01’ and ’2001-01-01’ -> order by goods.cat_id -> limit 10; ab9f9bb7bc4f357712ec34f067eda364 - 10 rows in set (56.47 sec) Example 40
  • 96. • Engine statistics without histograms mysql> show status like ’Handler%’; +----------------------------+--------+ | Variable_name | Value | +----------------------------+--------+ ... | Handler_read_next | 964718 | | Handler_read_prev | 0 | | Handler_read_rnd | 10 | | Handler_read_rnd_next | 951671 | ... | Handler_write | 951670 | +----------------------------+--------+ 18 rows in set (0.01 sec) Example 40
  • 97. • Now lets add the histogram mysql> analyze table goods update histogram on date_added; +------------+-----------+----------+------------------------------+ | Table | Op | Msg_type | Msg_text | +------------+-----------+----------+------------------------------+ | orig.goods | histogram | status | Histogram statistics created for column ’date_added’. | +------------+-----------+----------+------------------------------+ 1 row in set (2.01 sec) Example 40
  • 98. • EXPLAIN with the histogram mysql> explain select goods.* from goods -> join categories -> on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between ’2000-01-01’ and ’2001-01-01’ -> order by goods.cat_id -> limit 10\G Example 40
  • 99. • EXPLAIN with the histogram *************************** 1. row *************************** id: 1 select_type: SIMPLE table: goods -- Large table first partitions: NULL type: index possible_keys: cat_id_2 key: cat_id_2 key_len: 5 ref: NULL rows: 10 -- Same as we asked filtered: 98.70 -- True numbers Extra: Using where Example 40
  • 100. • EXPLAIN with the histogram *************************** 2. row *************************** id: 1 select_type: SIMPLE table: categories -- Small table partitions: NULL type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: orig.goods.cat_id rows: 1 filtered: 100.00 Extra: Using index 2 rows in set, 1 warning (0.01 sec) Example 40
  • 101. • Execution time with the histogram mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select goods.* from goods -> join categories on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between ’2000-01-01’ and ’2001-01-01’ -> order by goods.cat_id -> limit 10; eeb005fae0dd3441c5c380e1d87fee84 - 10 rows in set (0.00 sec) -- 56 times faster! Example 40
  • 102. • Engine statistics with the histogram mysql> show status like ’Handler%’; +----------------------------+-------++----------------------------+-------+ | Variable_name | Value || Variable_name | Value | +----------------------------+-------++----------------------------+-------+ | Handler_commit | 1 || Handler_read_prev | 0 | | Handler_delete | 0 || Handler_read_rnd | 0 | | Handler_discover | 0 || Handler_read_rnd_next | 0 | | Handler_external_lock | 4 || Handler_rollback | 0 | | Handler_mrr_init | 0 || Handler_savepoint | 0 | | Handler_prepare | 0 || Handler_savepoint_rollback | 0 | | Handler_read_first | 1 || Handler_update | 0 | | Handler_read_key | 3 || Handler_write | 0 | | Handler_read_last | 0 |+----------------------------+-------+ | Handler_read_next | 9 |18 rows in set (0.00 sec) Example 40
  • 103. • Data distribution is uniform • Range optimization can be used • Full table scan is fast When Histograms are not Helpful? 41
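In those cases a histogram can simply be removed, or rebuilt after the data changes. A small sketch in MySQL 8.0 syntax, reusing the goods table and date_added column from the earlier example (100 buckets is the server default and is shown only for illustration):

ANALYZE TABLE goods DROP HISTOGRAM ON date_added;
ANALYZE TABLE goods UPDATE HISTOGRAM ON date_added WITH 100 BUCKETS;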
  • 104. Backward index scan • Better Statistics Persistence in InnoDB • MySQL bug #80178 • MySQL bug #84654 • Better PRIMARY key access Other Improvements in 8.0 42
  • 106. • Index statistics are collected by the engine • Optimizer calculates Cardinality each time it accesses the engine statistics • Indexes do not always improve performance • Histograms can help • Still a new feature Conclusion 44
  • 107. • MySQL User Reference Manual • Blog by Erik Froseth • Blog by Frederic Descamps • Talk by Oystein Grovlen @Fosdem • Talk by Sergei Petrunia @PerconaLive • Talk by Sergei Golubchik @HighLoad++ More information 45
  • 110. How TokuDB Updates Statistics
  • 111. • Stores key statistics on disk and in memory • tablename_status_id.tokudb TokuDB: Overview 49
  • 112. • Stores key statistics on disk and in memory • Stores row count on disk and in memory • tablename_main_id.tokudb • tablename_key_keyname_id.tokudb TokuDB: Overview 49
  • 113. • Stores key statistics on disk and in memory • Stores row count on disk and in memory • Returns statistics to Optimizer TokuDB: Overview 49
  • 114. • Stores key statistics on disk and in memory • Stores row count on disk and in memory • Returns statistics to Optimizer • In ha_tokudb::info (handler/ha_tokudb.cc) TokuDB: Overview 49
  • 115. • Stored on disk TokuDB: Key Statistics 50
  • 116. • Stored on disk • Updated during ANALYZE • Background ANALYZE • Explicitly called TokuDB: Key Statistics 50
  • 117. • Stored on disk • Updated during ANALYZE • Background ANALYZE • Explicitly called • Not updated when tokudb_auto_analyze=0 TokuDB: Key Statistics 50
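As a small sketch of flipping that switch (tokudb_auto_analyze is the server variable named on the slide; treat the exact value semantics as an assumption to verify for your Percona Server version):

SET GLOBAL tokudb_auto_analyze = 0;  -- 0 disables the automatic background ANALYZE described above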
  • 118. • Updated in TOKUDB_SHARE::update_cardinality_counts TokuDB Key Statistics: Code 51
  • 119. • Updated in TOKUDB_SHARE::update_cardinality_counts • Stored in tokudb::set_card_in_status • In standard ANALYZE • standard_t::on_run TokuDB Key Statistics: Code 51
  • 120. • Updated in TOKUDB_SHARE::update_cardinality_counts • Stored in tokudb::set_card_in_status • Retrieved in tokudb::get_card_from_status • When the table is opened • In ha_tokudb::initialize_share TokuDB Key Statistics: Code 51
  • 121. • Updated in TOKUDB_SHARE::update_cardinality_counts • Stored in tokudb::set_card_in_status • Retrieved in tokudb::get_card_from_status • Used in TOKUDB_SHARE::set_cardinality_counts_in_table for (uint32_t j = 0; j < key->actual_key_parts; j++) { ... assert_always(next_key_part < _rec_per_keys); ulong val = _rec_per_key[next_key_part++]; val = (val * tokudb::sysvars::cardinality_scale_percent) / 100; TokuDB Key Statistics: Code 51
  • 122. • Stored on disk • Updated • Each time the table is updated • When ha_tokudb::info is called TokuDB Logical Rows Count 52
  • 123. mysql> create table test( -> id int not null auto_increment primary key, -> f1 int, -> ts timestamp, -> key(f1) -> ) engine=tokudb; Query OK, 0 rows affected (0.10 sec) mysql> insert into test (f1, ts) values(1, NOW()), (2, NOW()); Query OK, 2 rows affected (0.03 sec) Records: 2 Duplicates: 0 Warnings: 0 ... mysql> insert into test (f1, ts) select f1, NOW() from test; Query OK, 32 rows affected (0.01 sec) Records: 32 Duplicates: 0 Warnings: 0 TokuDB Test Case 53
  • 124. mysql> select count(distinct id), count(distinct f1) from test; +--------------------+--------------------+ | count(distinct id) | count(distinct f1) | +--------------------+--------------------+ | 64 | 2 | +--------------------+--------------------+ 1 row in set (0.01 sec) TokuDB Test Case 53
  • 125. • SHOW INDEX mysql> show index from test\G *************************** 1. row *************************** Table: test Non_unique: 0 Key_name: PRIMARY Column_name: id Cardinality: 64 ... *************************** 2. row *************************** Table: test Non_unique: 1 Key_name: f1 Column_name: f1 Cardinality: 64 TokuDB: After First run 54
  • 126. • Number of rows $ ../bin/tokuftdump --header --nodata var/mysqld.1/data/test/test_key_f1_145_1_1 ft: layout_version=29 layout_version_original=29 layout_version_read_from_disk=29 build_id=0 build_id_original=0 time_of_creation= 1537709029 Sun Sep 23 16:23:49 2018 time_of_last_modification=1537709100 Sun Sep 23 16:25:00 2018 ... estimated numrows=64 estimated numbytes=640 logical row count=64 TokuDB: After First run 54
  • 127. • Index Statistics Thread 44 "mysqld" hit Breakpoint 1, TOKUDB_SHARE::set_cardinality_counts_in_tab (this=0x7fd86da54020, table=0x7fd86d90b020) at /home/sveta/src/percona-server/storage/tokudb/ha_tokudb.cc:400 400 if (val == 0 || _rows == 0 || (gdb) p key->name $21 = 0x7fd86d879999 "f1" (gdb) p val $22 = 0 TokuDB: After First run 54
  • 128. • Cardinality = 64 / 0 = 64 TokuDB: After First run 54
  • 129. mysql> analyze table test; +-----------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +-----------+---------+----------+----------+ | test.test | analyze | status | OK | +-----------+---------+----------+----------+ 1 row in set (0.01 sec) TokuDB: ANALYZE TABLE 55
  • 130. • SHOW INDEX mysql> show index from test\G *************************** 1. row *************************** Table: test Non_unique: 0 Key_name: PRIMARY Column_name: id Cardinality: 64 ... *************************** 2. row *************************** Table: test Non_unique: 1 Key_name: f1 Column_name: f1 Cardinality: 2 TokuDB: After ANALYZE TABLE 56
  • 131. • Number of rows $ ../bin/tokuftdump --header --nodata var/mysqld.1/data/test/test_key_f1_145_1_1 ft: layout_version=29 layout_version_original=29 layout_version_read_from_disk=29 build_id=0 build_id_original=0 time_of_creation= 1537709029 Sun Sep 23 16:23:49 2018 time_of_last_modification=1537709100 Sun Sep 23 16:25:00 2018 ... estimated numrows=64 estimated numbytes=640 logical row count=64 TokuDB: After ANALYZE TABLE 56
  • 132. • Index Statistics Thread 44 "mysqld" hit Breakpoint 1, TOKUDB_SHARE::set_cardinality_counts_in_tab (this=0x7fd86da54020, table=0x7fd86d90b020) at /home/sveta/src/percona-server/storage/tokudb/ha_tokudb.cc:400 400 if (val == 0 || _rows == 0 || (gdb) p key->name $26 = 0x7fd86d879999 "f1" (gdb) p val $27 = 32 TokuDB: After ANALYZE TABLE 56
  • 133. • Cardinality = 64 / 32 = 2 TokuDB: After ANALYZE TABLE 56
  • 134. mysql> insert into test (f1, ts) select f1, NOW() from test; Query OK, 64 rows affected (0.01 sec) Records: 64 Duplicates: 0 Warnings: 0 mysql> insert into test (f1, ts) select f1, NOW() from test; Query OK, 128 rows affected (0.01 sec) Records: 128 Duplicates: 0 Warnings: 0 mysql> insert into test (f1, ts) select f1, NOW() from test; Query OK, 256 rows affected (0.02 sec) Records: 256 Duplicates: 0 Warnings: 0 mysql> select count(distinct id), count(distinct f1) from test; +--------------------+--------------------+ | count(distinct id) | count(distinct f1) | +--------------------+--------------------+ | 512 | 2 | +--------------------+--------------------+ 1 row in set (0.01 sec) TokuDB: Let’s Insert More Data 57
  • 135. • SHOW INDEX mysql> show index from test\G *************************** 1. row *************************** Table: test Non_unique: 0 Key_name: PRIMARY Column_name: id Cardinality: 512 ... *************************** 2. row *************************** Table: test Non_unique: 1 Key_name: f1 Column_name: f1 Cardinality: 16 TokuDB: After INSERT 58
  • 136. • Number of rows $ ../bin/tokuftdump --header --nodata var/mysqld.1/data/test/test_key_f1_145_1_1 ft: layout_version=29 layout_version_original=29 layout_version_read_from_disk=29 build_id=0 build_id_original=0 time_of_creation= 1537709029 Sun Sep 23 16:23:49 2018 time_of_last_modification=1537709880 Sun Sep 23 16:38:00 2018 ... estimated numrows=512 estimated numbytes=5120 logical row count=512 TokuDB: After INSERT 58
  • 137. • Index Statistics Thread 44 "mysqld" hit Breakpoint 1, TOKUDB_SHARE::set_cardinality_counts_in_tab (this=0x7fd86da54020, table=0x7fd86d90b020) at /home/sveta/src/percona-server/storage/tokudb/ha_tokudb.cc:400 400 if (val == 0 || _rows == 0 || (gdb) p key->name $30 = 0x7fd86d879999 "f1" (gdb) p val $31 = 32 TokuDB: After INSERT 58
  • 138. • Cardinality = 512 / 32 = 16 TokuDB: After INSERT 58
  • 139. • SHOW INDEX mysql> show index from test\G *************************** 1. row *************************** Table: test Non_unique: 0 Key_name: PRIMARY Column_name: id Cardinality: 512 ... *************************** 2. row *************************** Table: test Non_unique: 1 Key_name: f1 Column_name: f1 Cardinality: 16 TokuDB: After Restart 59
  • 140. • Index Statistics Thread 44 "mysqld" hit Breakpoint 1, TOKUDB_SHARE::set_cardinality_counts_in_tab (this=0x7fd4e67ea020, table=0x7fd4e6765c20) at /home/sveta/src/percona-server/storage/tokudb/ha_tokudb.cc:400 400 if (val == 0 || _rows == 0 || (gdb) p key->name $3 = 0x7fd4e66d7599 "f1" (gdb) p val $4 = 32 TokuDB: After Restart 59
  • 141. • Cardinality = 512 / 32 = 16 TokuDB: After Restart 59
  • 142. • Cardinality = 512 / 32 = 16 • Same! TokuDB: After Restart 59
  • 143. • Index statistics updated only when ANALYZE TABLE is running TokuDB: Conclusion 60
  • 144. • Index statistics updated only when ANALYZE TABLE is running • Logical row count updated each time the number of rows changes TokuDB: Conclusion 60
  • 145. • Index statistics updated only when ANALYZE TABLE is running • Logical row count updated each time the number of rows changes • Cardinality based on both numbers TokuDB: Conclusion 60
  • 146. • Index statistics updated only when ANALYZE TABLE is running • Logical row count updated each time the number of rows changes • Cardinality based on both numbers • It is expected that the cardinality is not the same • After updates • Even when ANALYZE TABLE was never run TokuDB: Conclusion 60