Query Optimizer in MariaDB 10.4

Sergey Petrunya
Sergey PetrunyaQuery Optimizer Developer at MariaDB Corporation Ab
Query Optimizer
in MariaDB 10.4
Sergei Petrunia,
Query Optimizer Developer
MariaDB Corporation
2019 MariaDB Developers Unconference
New York
New Optimizer features in MariaDB 10.4
● Optimizer trace
● Sampling for histogram collection
● Rowid filtering
● New default settings
● Condition Pushdown into IN-subqueries
● Condition Pushdown from HAVING into WHERE
New default settings
New default settings for statistics
-histogram_size=0
+histogram_size=254
-histogram_type=SINGLE_PREC_HB
+histogram_type=SINGLE_PREC_HB
-optimizer_use_condition_selectivity=1
+optimizer_use_condition_selectivity=4
1 Use selectivity of predicates as in MariaDB 5.5.
2 Use selectivity of all range predicates supported by indexes.
3 Use selectivity of all range predicates estimated without histogram.
4 Use selectivity of all range predicates estimated with histogram.
● Do use condition selectivity
● Make use of EITS statistics (incl. Histograms if they are available)
-use_stat_tables=NEVER
+use_stat_tables=PREFERABLY_FOR_QUERIES
– But don’t collect stats unless explicitly told to do so
● Do build histograms when collecting EITS statistics
New default settings
-eq_range_index_dive_limit=10
+eq_range_index_dive_limit=200
● Join buffer will auto-size itself
-optimize_join_buffer_size=OFF
+optimize_join_buffer_size=ON
● Use index statistics (cardinality) instead of records_in_range for large IN-lists
– Just following MySQL here
– (can use ANALYZE for statements to see the size)
Sampling for histograms
Histograms in MariaDB
● Introduced in MariaDB 10.0
– Manual command to collect, ANALYZE … PERSISTENT FOR …
– Optimizer settings to use them
– Histogram is collected from ALL table data
●
Other statistics (avg_frequency, avg_length), too.
● Results
– A few users
– Histogram collection is expensive
●
Cost of full table scan + full index scans, and even more than that
Histograms in MariaDB 10.4
● MariaDB 10.4
– “Bernoulli sampling” - roll the dice for each row
– Controlled with @@analyze_sample_percentage
●
100 (the default) – “use all data”
●
0 – (recommended) – “Determine sample ratio automatically”
● MySQL 8.0 also added histograms
– Uses Bernoulli sampling
– @@histogram_generation_max_mem_size=20MB.
● PostgreSQL has genuine random-jump sampling
– default_statistics_target
Histogram collection performance
● See MDEV-17886, (TODO: Vicentiu’s data?)
● Both MariaDB and MySQL: ANALYZE for columns is as fast as full table scan.
ANALYZE TABLE PERSISTENT FOR COLUMNS (...) INDEXES ();
● “Persistent for ALL” will also scan indexes
ANALYZE TABLE PERSISTENT FOR ALL;
● PostgreSQL is much faster with genuine sampling
– Vicentiu’s has a task in progress for this.
Histogram precision
● MariaDB histograms are very compact
– min/max column values, then 1-byte or 2-byte bounds (SINGLE|DOUBLE_PREC_HB)
– 255 bytes per histogram => 128 or 255 buckets max.
● MySQL
– Histogram is stored as JSON, bounds are stored as values
– 100 Buckets by default, max is 1024
●
In our tests, more buckets help in some cases
● PostgreSQL
– Histogram bounds stored as values
– 100 buckets by default, up to 10K allowed
● Testing is still in progress :-(, the obtained data varies.
Problem with correlated conditions
● Possible selectivities
– MIN(1/n, 1/m)
– (1/n) * (1/m)
– 0
select ...
from order_items
where shipdate='2015-12-15' AND item_name='christmas light'
'swimsuit'
Problem with correlated conditions
● PostgreSQL: Multi-variate statistics
– Detects functional dependencies, col1=F(col2)
– Only used for equality predicates
– Also #DISTINCT(a,b)
● MariaDB: MDEV-11107: Use table check constraints in optimizer
– Stalled?
select ...
from order_items
where shipdate='2015-12-15' AND item_name='christmas light'
'swimsuit'
Histograms: conclusions
● 10.4
– Sampling makes ANALYZE TABLE … PERSISTENT FOR COLUMNS
run at full-table-scan speed.
– @@analyze_sample_rows
● Further directions
– Do real sampling (in progress)
– More space for the histograms (?)
– Handle correlations (how?)
Optimizer trace
Optimizer trace
● Available in MySQL since MySQL 5.6
mysql> set optimizer_trace=1;
mysql> <query>;
mysql> select * from
-> information_schema.optimizer_trace;
{
"steps": [
{
"join_preparation": {
"select#": 1,
"steps": [
{
"expanded_query": "/* select#1 */ select `t1`.`col1` AS `col1`,`t1`.`col2`
AS `col2` from `t1` where (`t1`.`col1` < 4)"
}
]
}
},
{
"join_optimization": {
"select#": 1,
"steps": [
{
"condition_processing": {
"condition": "WHERE",
"original_condition": "(`t1`.`col1` < 4)",
"steps": [
{
"transformation": "equality_propagation",
"resulting_condition": "(`t1`.`col1` < 4)"
},
{
"transformation": "constant_propagation",
"resulting_condition": "(`t1`.`col1` < 4)"
},
{
"transformation": "trivial_condition_removal",
"resulting_condition": "(`t1`.`col1` < 4)"
}
]
}
},
{
● Now, similar feature in MariaDB
The goal is to understand the optimizer
● “Why was query plan X not chosen”
– Subquery was not converted into semi-join
●
This would exceed MAX_TABLES
– Subquery materialization was not used
●
Different collations
– Ref acess was not used
●
Incompatible collations
● What changed between the two hosts/versions
– diff trace_from_host1 trace_from_host2
Developer point of view
● The trace is always compiled in
● RAII-objects to start/end writing a trace
● Disabled trace added ~1-2% overhead
● Intend to add more tracing
– Expect to get more output
Rowid filtering
What is PK-filter: in details
SELECT *
FROM orders JOIN lineitem ON o_orderkey=l_orderkey
WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-06-30' AND
o_totalprice between 200000 and 230000;
Filter for lineitem table built with condition
l_shipdate BETWEEN '1997-01-01' AND '1997-06-30':
is a container that contains primary keys of rows from lineitem which
l_shipdate value satisfy the above condition.
What is PK-filter: in details
SELECT *
FROM orders JOIN lineitem ON o_orderkey=l_orderkey
WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-06-30' AND
o_totalprice between 200000 and 230000;
Filter for lineitem table built with condition
l_shipdate BETWEEN '1997-01-01' AND '1997-06-30':
is a container that contains primary keys of rows from lineitem which
l_shipdate value satisfy the above condition.
What is PK-filter: in details
SELECT *
FROM orders JOIN lineitem ON o_orderkey=l_orderkey
WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-02-01' AND
o_totalprice > 200000;
1. There is index i_l_shipdate on
lineitem(l_shipdate)
What is PK-filter: in details
2.
SELECT *
FROM orders JOIN lineitem ON o_orderkey=l_orderkey
WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-06-30' AND
o_totalprice between 200000 and 230000;
Condition pushdown...
SELECT ...
FROM t1
WHERE (a < 2) AND
a IN
(
SELECT c
FROM t2
WHERE … AND (c < 2)
GROUP BY c
);
How condition pushdown is made
SELECT ...
FROM t1
WHERE (a < 2) AND
a IN
(
SELECT c
FROM t2
WHERE ...
GROUP BY c
);
Thanks!
1 of 25

Recommended

Optimizer Trace Walkthrough by
Optimizer Trace WalkthroughOptimizer Trace Walkthrough
Optimizer Trace WalkthroughSergey Petrunya
287 views42 slides
Lessons for the optimizer from running the TPC-DS benchmark by
Lessons for the optimizer from running the TPC-DS benchmarkLessons for the optimizer from running the TPC-DS benchmark
Lessons for the optimizer from running the TPC-DS benchmarkSergey Petrunya
725 views28 slides
Optimizer features in recent releases of other databases by
Optimizer features in recent releases of other databasesOptimizer features in recent releases of other databases
Optimizer features in recent releases of other databasesSergey Petrunya
215 views19 slides
ANALYZE for Statements - MariaDB's hidden gem by
ANALYZE for Statements - MariaDB's hidden gemANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gemSergey Petrunya
293 views30 slides
Histograms in MariaDB, MySQL and PostgreSQL by
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLSergey Petrunya
1.6K views44 slides
Using histograms to get better performance by
Using histograms to get better performanceUsing histograms to get better performance
Using histograms to get better performanceSergey Petrunya
324 views41 slides

More Related Content

What's hot

MySQL 8.0 EXPLAIN ANALYZE by
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZENorvald Ryeng
684 views20 slides
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013 by
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013Sergey Petrunya
6.6K views102 slides
MariaDB: Engine Independent Table Statistics, including histograms by
MariaDB: Engine Independent Table Statistics, including histogramsMariaDB: Engine Independent Table Statistics, including histograms
MariaDB: Engine Independent Table Statistics, including histogramsSergey Petrunya
871 views16 slides
MariaDB 10.0 Query Optimizer by
MariaDB 10.0 Query OptimizerMariaDB 10.0 Query Optimizer
MariaDB 10.0 Query OptimizerSergey Petrunya
3K views33 slides
MySQL performance tuning by
MySQL performance tuningMySQL performance tuning
MySQL performance tuningAnurag Srivastava
251 views46 slides
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016) by
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)Sergey Petrunya
976 views60 slides

What's hot(19)

MySQL 8.0 EXPLAIN ANALYZE by Norvald Ryeng
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZE
Norvald Ryeng684 views
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013 by Sergey Petrunya
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
Sergey Petrunya6.6K views
MariaDB: Engine Independent Table Statistics, including histograms by Sergey Petrunya
MariaDB: Engine Independent Table Statistics, including histogramsMariaDB: Engine Independent Table Statistics, including histograms
MariaDB: Engine Independent Table Statistics, including histograms
Sergey Petrunya871 views
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016) by Sergey Petrunya
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
Sergey Petrunya976 views
MariaDB 10.3 Optimizer - where does it stand by Sergey Petrunya
MariaDB 10.3 Optimizer - where does it standMariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it stand
Sergey Petrunya860 views
New features-in-mariadb-and-mysql-optimizers by Sergey Petrunya
New features-in-mariadb-and-mysql-optimizersNew features-in-mariadb-and-mysql-optimizers
New features-in-mariadb-and-mysql-optimizers
Sergey Petrunya1.6K views
ANALYZE for executable statements - a new way to do optimizer troubleshooting... by Sergey Petrunya
ANALYZE for executable statements - a new way to do optimizer troubleshooting...ANALYZE for executable statements - a new way to do optimizer troubleshooting...
ANALYZE for executable statements - a new way to do optimizer troubleshooting...
Sergey Petrunya1.8K views
Mysqlconf2013 mariadb-cassandra-interoperability by Sergey Petrunya
Mysqlconf2013 mariadb-cassandra-interoperabilityMysqlconf2013 mariadb-cassandra-interoperability
Mysqlconf2013 mariadb-cassandra-interoperability
Sergey Petrunya3.6K views
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013 by Jaime Crespo
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Jaime Crespo20.3K views
M|18 Understanding the Query Optimizer by MariaDB plc
M|18 Understanding the Query OptimizerM|18 Understanding the Query Optimizer
M|18 Understanding the Query Optimizer
MariaDB plc442 views
Adaptive Query Optimization in 12c by Anju Garg
Adaptive Query Optimization in 12cAdaptive Query Optimization in 12c
Adaptive Query Optimization in 12c
Anju Garg2.2K views
Introduction into MySQL Query Tuning for Dev[Op]s by Sveta Smirnova
Introduction into MySQL Query Tuning for Dev[Op]sIntroduction into MySQL Query Tuning for Dev[Op]s
Introduction into MySQL Query Tuning for Dev[Op]s
Sveta Smirnova161 views
Histograms: Pre-12c and now by Anju Garg
Histograms: Pre-12c and nowHistograms: Pre-12c and now
Histograms: Pre-12c and now
Anju Garg1.2K views
Histograms : Pre-12c and Now by Anju Garg
Histograms : Pre-12c and NowHistograms : Pre-12c and Now
Histograms : Pre-12c and Now
Anju Garg2.8K views
0888 learning-mysql by sabir18
0888 learning-mysql0888 learning-mysql
0888 learning-mysql
sabir1887 views

Similar to Query Optimizer in MariaDB 10.4

PostgreSQL 9.5 - Major Features by
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesInMobi Technology
1.5K views48 slides
Part2 Best Practices for Managing Optimizer Statistics by
Part2 Best Practices for Managing Optimizer StatisticsPart2 Best Practices for Managing Optimizer Statistics
Part2 Best Practices for Managing Optimizer StatisticsMaria Colgan
6K views107 slides
Shaping Optimizer's Search Space by
Shaping Optimizer's Search SpaceShaping Optimizer's Search Space
Shaping Optimizer's Search SpaceGerger
452 views61 slides
Histogram-in-Parallel-universe-of-MySQL-and-MariaDB by
Histogram-in-Parallel-universe-of-MySQL-and-MariaDBHistogram-in-Parallel-universe-of-MySQL-and-MariaDB
Histogram-in-Parallel-universe-of-MySQL-and-MariaDBMydbops
190 views41 slides
MariaDB ColumnStore by
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStoreMariaDB plc
489 views31 slides
What is new in PostgreSQL 14? by
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?Mydbops
652 views101 slides

Similar to Query Optimizer in MariaDB 10.4(20)

Part2 Best Practices for Managing Optimizer Statistics by Maria Colgan
Part2 Best Practices for Managing Optimizer StatisticsPart2 Best Practices for Managing Optimizer Statistics
Part2 Best Practices for Managing Optimizer Statistics
Maria Colgan6K views
Shaping Optimizer's Search Space by Gerger
Shaping Optimizer's Search SpaceShaping Optimizer's Search Space
Shaping Optimizer's Search Space
Gerger452 views
Histogram-in-Parallel-universe-of-MySQL-and-MariaDB by Mydbops
Histogram-in-Parallel-universe-of-MySQL-and-MariaDBHistogram-in-Parallel-universe-of-MySQL-and-MariaDB
Histogram-in-Parallel-universe-of-MySQL-and-MariaDB
Mydbops190 views
MariaDB ColumnStore by MariaDB plc
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStore
MariaDB plc489 views
What is new in PostgreSQL 14? by Mydbops
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?
Mydbops652 views
Migration to ClickHouse. Practical guide, by Alexander Zaitsev by Altinity Ltd
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Altinity Ltd9.4K views
MariaDB Optimizer by JongJin Lee
MariaDB OptimizerMariaDB Optimizer
MariaDB Optimizer
JongJin Lee2.8K views
MySQL Time Machine by replicating into HBase - Slides from Percona Live Amste... by Boško Devetak
MySQL Time Machine by replicating into HBase - Slides from Percona Live Amste...MySQL Time Machine by replicating into HBase - Slides from Percona Live Amste...
MySQL Time Machine by replicating into HBase - Slides from Percona Live Amste...
Boško Devetak387 views
What's new in MariaDB Platform X3 by MariaDB plc
What's new in MariaDB Platform X3What's new in MariaDB Platform X3
What's new in MariaDB Platform X3
MariaDB plc558 views
NoCOUG_201411_Patel_Managing_a_Large_OLTP_Database by Paresh Patel
NoCOUG_201411_Patel_Managing_a_Large_OLTP_DatabaseNoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
NoCOUG_201411_Patel_Managing_a_Large_OLTP_Database
Paresh Patel306 views
Getting the most out of MariaDB MaxScale by MariaDB plc
Getting the most out of MariaDB MaxScaleGetting the most out of MariaDB MaxScale
Getting the most out of MariaDB MaxScale
MariaDB plc1.9K views
Fortify aws aurora_proxy by Marco Tusa
Fortify aws aurora_proxyFortify aws aurora_proxy
Fortify aws aurora_proxy
Marco Tusa105 views
POLARDB for MySQL - Parallel Query by oysteing
POLARDB for MySQL - Parallel QueryPOLARDB for MySQL - Parallel Query
POLARDB for MySQL - Parallel Query
oysteing785 views
MySQL Indexing : Improving Query Performance Using Index (Covering Index) by Hemant Kumar Singh
MySQL Indexing : Improving Query Performance Using Index (Covering Index)MySQL Indexing : Improving Query Performance Using Index (Covering Index)
MySQL Indexing : Improving Query Performance Using Index (Covering Index)
Enabling presto to handle massive scale at lightning speed by Shubham Tagra
Enabling presto to handle massive scale at lightning speedEnabling presto to handle massive scale at lightning speed
Enabling presto to handle massive scale at lightning speed
Shubham Tagra285 views
2010 12 mysql_clusteroverview by Dimas Prasetyo
2010 12 mysql_clusteroverview2010 12 mysql_clusteroverview
2010 12 mysql_clusteroverview
Dimas Prasetyo1.1K views
10 Reasons to Start Your Analytics Project with PostgreSQL by Satoshi Nagayasu
10 Reasons to Start Your Analytics Project with PostgreSQL10 Reasons to Start Your Analytics Project with PostgreSQL
10 Reasons to Start Your Analytics Project with PostgreSQL
Satoshi Nagayasu4.5K views
Cost-Based Optimizer in Apache Spark 2.2 by Databricks
Cost-Based Optimizer in Apache Spark 2.2 Cost-Based Optimizer in Apache Spark 2.2
Cost-Based Optimizer in Apache Spark 2.2
Databricks5.5K views

More from Sergey Petrunya

New optimizer features in MariaDB releases before 10.12 by
New optimizer features in MariaDB releases before 10.12New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12Sergey Petrunya
112 views27 slides
MariaDB's join optimizer: how it works and current fixes by
MariaDB's join optimizer: how it works and current fixesMariaDB's join optimizer: how it works and current fixes
MariaDB's join optimizer: how it works and current fixesSergey Petrunya
807 views32 slides
Improved histograms in MariaDB 10.8 by
Improved histograms in MariaDB 10.8Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8Sergey Petrunya
54 views27 slides
JSON Support in MariaDB: News, non-news and the bigger picture by
JSON Support in MariaDB: News, non-news and the bigger pictureJSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger pictureSergey Petrunya
2K views33 slides
MariaDB 10.4 - что нового by
MariaDB 10.4 - что новогоMariaDB 10.4 - что нового
MariaDB 10.4 - что новогоSergey Petrunya
679 views45 slides
MyRocks in MariaDB | M18 by
MyRocks in MariaDB | M18MyRocks in MariaDB | M18
MyRocks in MariaDB | M18Sergey Petrunya
299 views61 slides

More from Sergey Petrunya(17)

New optimizer features in MariaDB releases before 10.12 by Sergey Petrunya
New optimizer features in MariaDB releases before 10.12New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12
Sergey Petrunya112 views
MariaDB's join optimizer: how it works and current fixes by Sergey Petrunya
MariaDB's join optimizer: how it works and current fixesMariaDB's join optimizer: how it works and current fixes
MariaDB's join optimizer: how it works and current fixes
Sergey Petrunya807 views
Improved histograms in MariaDB 10.8 by Sergey Petrunya
Improved histograms in MariaDB 10.8Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8
Sergey Petrunya54 views
JSON Support in MariaDB: News, non-news and the bigger picture by Sergey Petrunya
JSON Support in MariaDB: News, non-news and the bigger pictureJSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger picture
Sergey Petrunya2K views
MariaDB 10.4 - что нового by Sergey Petrunya
MariaDB 10.4 - что новогоMariaDB 10.4 - что нового
MariaDB 10.4 - что нового
Sergey Petrunya679 views
New Query Optimizer features in MariaDB 10.3 by Sergey Petrunya
New Query Optimizer features in MariaDB 10.3New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3
Sergey Petrunya691 views
Common Table Expressions in MariaDB 10.2 by Sergey Petrunya
Common Table Expressions in MariaDB 10.2Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2
Sergey Petrunya1.5K views
MyRocks in MariaDB: why and how by Sergey Petrunya
MyRocks in MariaDB: why and howMyRocks in MariaDB: why and how
MyRocks in MariaDB: why and how
Sergey Petrunya1.3K views
Эволюция репликации в MySQL и MariaDB by Sergey Petrunya
Эволюция репликации в MySQL и MariaDBЭволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDB
Sergey Petrunya1K views
MariaDB 10.1 - что нового. by Sergey Petrunya
MariaDB 10.1 - что нового.MariaDB 10.1 - что нового.
MariaDB 10.1 - что нового.
Sergey Petrunya355 views
MyRocks: табличный движок для MySQL на основе RocksDB by Sergey Petrunya
MyRocks: табличный движок для MySQL на основе RocksDBMyRocks: табличный движок для MySQL на основе RocksDB
MyRocks: табличный движок для MySQL на основе RocksDB
Sergey Petrunya1.4K views
MariaDB: ANALYZE for statements (lightning talk) by Sergey Petrunya
MariaDB:  ANALYZE for statements (lightning talk)MariaDB:  ANALYZE for statements (lightning talk)
MariaDB: ANALYZE for statements (lightning talk)
Sergey Petrunya982 views
How mysql handles ORDER BY, GROUP BY, and DISTINCT by Sergey Petrunya
How mysql handles ORDER BY, GROUP BY, and DISTINCTHow mysql handles ORDER BY, GROUP BY, and DISTINCT
How mysql handles ORDER BY, GROUP BY, and DISTINCT
Sergey Petrunya4.8K views

Recently uploaded

Advanced API Mocking Techniques by
Advanced API Mocking TechniquesAdvanced API Mocking Techniques
Advanced API Mocking TechniquesDimpy Adhikary
19 views11 slides
HarshithAkkapelli_Presentation.pdf by
HarshithAkkapelli_Presentation.pdfHarshithAkkapelli_Presentation.pdf
HarshithAkkapelli_Presentation.pdfharshithakkapelli
11 views16 slides
ict act 1.pptx by
ict act 1.pptxict act 1.pptx
ict act 1.pptxsanjaniarun08
13 views17 slides
Unleash The Monkeys by
Unleash The MonkeysUnleash The Monkeys
Unleash The MonkeysJacob Duijzer
7 views28 slides
SUGCON ANZ Presentation V2.1 Final.pptx by
SUGCON ANZ Presentation V2.1 Final.pptxSUGCON ANZ Presentation V2.1 Final.pptx
SUGCON ANZ Presentation V2.1 Final.pptxJack Spektor
22 views34 slides
Agile 101 by
Agile 101Agile 101
Agile 101John Valentino
7 views20 slides

Recently uploaded(20)

Advanced API Mocking Techniques by Dimpy Adhikary
Advanced API Mocking TechniquesAdvanced API Mocking Techniques
Advanced API Mocking Techniques
Dimpy Adhikary19 views
SUGCON ANZ Presentation V2.1 Final.pptx by Jack Spektor
SUGCON ANZ Presentation V2.1 Final.pptxSUGCON ANZ Presentation V2.1 Final.pptx
SUGCON ANZ Presentation V2.1 Final.pptx
Jack Spektor22 views
DSD-INT 2023 The Danube Hazardous Substances Model - Kovacs by Deltares
DSD-INT 2023 The Danube Hazardous Substances Model - KovacsDSD-INT 2023 The Danube Hazardous Substances Model - Kovacs
DSD-INT 2023 The Danube Hazardous Substances Model - Kovacs
Deltares8 views
Dev-HRE-Ops - Addressing the _Last Mile DevOps Challenge_ in Highly Regulated... by TomHalpin9
Dev-HRE-Ops - Addressing the _Last Mile DevOps Challenge_ in Highly Regulated...Dev-HRE-Ops - Addressing the _Last Mile DevOps Challenge_ in Highly Regulated...
Dev-HRE-Ops - Addressing the _Last Mile DevOps Challenge_ in Highly Regulated...
TomHalpin95 views
Fleet Management Software in India by Fleetable
Fleet Management Software in India Fleet Management Software in India
Fleet Management Software in India
Fleetable11 views
MariaDB stored procedures and why they should be improved by Federico Razzoli
MariaDB stored procedures and why they should be improvedMariaDB stored procedures and why they should be improved
MariaDB stored procedures and why they should be improved
DSD-INT 2023 3D hydrodynamic modelling of microplastic transport in lakes - J... by Deltares
DSD-INT 2023 3D hydrodynamic modelling of microplastic transport in lakes - J...DSD-INT 2023 3D hydrodynamic modelling of microplastic transport in lakes - J...
DSD-INT 2023 3D hydrodynamic modelling of microplastic transport in lakes - J...
Deltares9 views
Software evolution understanding: Automatic extraction of software identifier... by Ra'Fat Al-Msie'deen
Software evolution understanding: Automatic extraction of software identifier...Software evolution understanding: Automatic extraction of software identifier...
Software evolution understanding: Automatic extraction of software identifier...
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - Dols by Deltares
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - DolsDSD-INT 2023 European Digital Twin Ocean and Delft3D FM - Dols
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - Dols
Deltares7 views
Dapr Unleashed: Accelerating Microservice Development by Miroslav Janeski
Dapr Unleashed: Accelerating Microservice DevelopmentDapr Unleashed: Accelerating Microservice Development
Dapr Unleashed: Accelerating Microservice Development
Miroslav Janeski10 views
20231129 - Platform @ localhost 2023 - Application-driven infrastructure with... by sparkfabrik
20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...
20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...
sparkfabrik5 views
DSD-INT 2023 Delft3D FM Suite 2024.01 1D2D - Beta testing programme - Geertsema by Deltares
DSD-INT 2023 Delft3D FM Suite 2024.01 1D2D - Beta testing programme - GeertsemaDSD-INT 2023 Delft3D FM Suite 2024.01 1D2D - Beta testing programme - Geertsema
DSD-INT 2023 Delft3D FM Suite 2024.01 1D2D - Beta testing programme - Geertsema
Deltares17 views
FIMA 2023 Neo4j & FS - Entity Resolution.pptx by Neo4j
FIMA 2023 Neo4j & FS - Entity Resolution.pptxFIMA 2023 Neo4j & FS - Entity Resolution.pptx
FIMA 2023 Neo4j & FS - Entity Resolution.pptx
Neo4j6 views

Query Optimizer in MariaDB 10.4

  • 1. Query Optimizer in MariaDB 10.4 Sergei Petrunia, Query Optimizer Developer MariaDB Corporation 2019 MariaDB Developers Unconference New York
  • 2. New Optimizer features in MariaDB 10.4 ● Optimizer trace ● Sampling for histogram collection ● Rowid filtering ● New default settings ● Condition Pushdown into IN-subqueries ● Condition Pushdown from HAVING into WHERE
  • 4. New default settings for statistics -histogram_size=0 +histogram_size=254 -histogram_type=SINGLE_PREC_HB +histogram_type=SINGLE_PREC_HB -optimizer_use_condition_selectivity=1 +optimizer_use_condition_selectivity=4 1 Use selectivity of predicates as in MariaDB 5.5. 2 Use selectivity of all range predicates supported by indexes. 3 Use selectivity of all range predicates estimated without histogram. 4 Use selectivity of all range predicates estimated with histogram. ● Do use condition selectivity ● Make use of EITS statistics (incl. Histograms if they are available) -use_stat_tables=NEVER +use_stat_tables=PREFERABLY_FOR_QUERIES – But don’t collect stats unless explicitly told to do so ● Do build histograms when collecting EITS statistics
  • 5. New default settings -eq_range_index_dive_limit=10 +eq_range_index_dive_limit=200 ● Join buffer will auto-size itself -optimize_join_buffer_size=OFF +optimize_join_buffer_size=ON ● Use index statistics (cardinality) instead of records_in_range for large IN-lists – Just following MySQL here – (can use ANALYZE for statements to see the size)
  • 7. Histograms in MariaDB ● Introduced in MariaDB 10.0 – Manual command to collect, ANALYZE … PERSISTENT FOR … – Optimizer settings to use them – Histogram is collected from ALL table data ● Other statistics (avg_frequency, avg_length), too. ● Results – A few users – Histogram collection is expensive ● Cost of full table scan + full index scans, and even more than that
  • 8. Histograms in MariaDB 10.4 ● MariaDB 10.4 – “Bernoulli sampling” - roll the dice for each row – Controlled with @@analyze_sample_percentage ● 100 (the default) – “use all data” ● 0 – (recommended) – “Determine sample ratio automatically” ● MySQL 8.0 also added histograms – Uses Bernoulli sampling – @@histogram_generation_max_mem_size=20MB. ● PostgreSQL has genuine random-jump sampling – default_statistics_target
  • 9. Histogram collection performance ● See MDEV-17886, (TODO: Vicentiu’s data?) ● Both MariaDB and MySQL: ANALYZE for columns is as fast as full table scan. ANALYZE TABLE PERSISTENT FOR COLUMNS (...) INDEXES (); ● “Persistent for ALL” will also scan indexes ANALYZE TABLE PERSISTENT FOR ALL; ● PostgreSQL is much faster with genuine sampling – Vicentiu’s has a task in progress for this.
  • 10. Histogram precision ● MariaDB histograms are very compact – min/max column values, then 1-byte or 2-byte bounds (SINGLE|DOUBLE_PREC_HB) – 255 bytes per histogram => 128 or 255 buckets max. ● MySQL – Histogram is stored as JSON, bounds are stored as values – 100 Buckets by default, max is 1024 ● In our tests, more buckets help in some cases ● PostgreSQL – Histogram bounds stored as values – 100 buckets by default, up to 10K allowed ● Testing is still in progress :-(, the obtained data varies.
  • 11. Problem with correlated conditions ● Possible selectivities – MIN(1/n, 1/m) – (1/n) * (1/m) – 0 select ... from order_items where shipdate='2015-12-15' AND item_name='christmas light' 'swimsuit'
  • 12. Problem with correlated conditions ● PostgreSQL: Multi-variate statistics – Detects functional dependencies, col1=F(col2) – Only used for equality predicates – Also #DISTINCT(a,b) ● MariaDB: MDEV-11107: Use table check constraints in optimizer – Stalled? select ... from order_items where shipdate='2015-12-15' AND item_name='christmas light' 'swimsuit'
  • 13. Histograms: conclusions ● 10.4 – Sampling makes ANALYZE TABLE … PERSISTENT FOR COLUMNS run at full-table-scan speed. – @@analyze_sample_rows ● Further directions – Do real sampling (in progress) – More space for the histograms (?) – Handle correlations (how?)
  • 15. Optimizer trace ● Available in MySQL since MySQL 5.6 mysql> set optimizer_trace=1; mysql> <query>; mysql> select * from -> information_schema.optimizer_trace; { "steps": [ { "join_preparation": { "select#": 1, "steps": [ { "expanded_query": "/* select#1 */ select `t1`.`col1` AS `col1`,`t1`.`col2` AS `col2` from `t1` where (`t1`.`col1` < 4)" } ] } }, { "join_optimization": { "select#": 1, "steps": [ { "condition_processing": { "condition": "WHERE", "original_condition": "(`t1`.`col1` < 4)", "steps": [ { "transformation": "equality_propagation", "resulting_condition": "(`t1`.`col1` < 4)" }, { "transformation": "constant_propagation", "resulting_condition": "(`t1`.`col1` < 4)" }, { "transformation": "trivial_condition_removal", "resulting_condition": "(`t1`.`col1` < 4)" } ] } }, { ● Now, similar feature in MariaDB
  • 16. The goal is to understand the optimizer ● “Why was query plan X not chosen” – Subquery was not converted into semi-join ● This would exceed MAX_TABLES – Subquery materialization was not used ● Different collations – Ref acess was not used ● Incompatible collations ● What changed between the two hosts/versions – diff trace_from_host1 trace_from_host2
  • 17. Developer point of view ● The trace is always compiled in ● RAII-objects to start/end writing a trace ● Disabled trace added ~1-2% overhead ● Intend to add more tracing – Expect to get more output
  • 19. What is PK-filter: in details SELECT * FROM orders JOIN lineitem ON o_orderkey=l_orderkey WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-06-30' AND o_totalprice between 200000 and 230000; Filter for lineitem table built with condition l_shipdate BETWEEN '1997-01-01' AND '1997-06-30': is a container that contains primary keys of rows from lineitem which l_shipdate value satisfy the above condition.
  • 20. What is PK-filter: in details SELECT * FROM orders JOIN lineitem ON o_orderkey=l_orderkey WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-06-30' AND o_totalprice between 200000 and 230000; Filter for lineitem table built with condition l_shipdate BETWEEN '1997-01-01' AND '1997-06-30': is a container that contains primary keys of rows from lineitem which l_shipdate value satisfy the above condition.
  • 21. What is PK-filter: in details SELECT * FROM orders JOIN lineitem ON o_orderkey=l_orderkey WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-02-01' AND o_totalprice > 200000; 1. There is index i_l_shipdate on lineitem(l_shipdate)
  • 22. What is PK-filter: in details 2. SELECT * FROM orders JOIN lineitem ON o_orderkey=l_orderkey WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-06-30' AND o_totalprice between 200000 and 230000;
  • 24. SELECT ... FROM t1 WHERE (a < 2) AND a IN ( SELECT c FROM t2 WHERE … AND (c < 2) GROUP BY c ); How condition pushdown is made SELECT ... FROM t1 WHERE (a < 2) AND a IN ( SELECT c FROM t2 WHERE ... GROUP BY c );