SlideShare a Scribd company logo
Sergei Petrunia
MariaDB devroom
FOSDEM 2021
Join Optimizer
1. How it works
2. What we’re working on to improve it
Optimizer Call
July 2022
Sergei Petrunia
MariaDB
2
Join order search
●
Total number of possible join orders
for N-table join is:
N * (N-1) * (N-2) *… = N!
●
Join orders are built left-to-right
●
Cannot enumerate all possible join
orders.
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
3
Pruning
●
Enumerate promising join orders
first.
●
Do not explore join orders that are
apparently worse.
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
4
Pruning # 1: by cost
●
Cost of the current_prefix is
already higher than total cost of
best plan.
– Adding tables will make it even
higher
– No point to try.
●
This pruning is always done (no
switch)
●
Optimizer trace: pruned_by_cost
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
5
pruned_by_cost weaknesses
●
A really expensive table at the
end of the join order.
●
Any prefix that doesn’t include it is
relatively cheap
– Even if its comparably worse:
–
–
●
=> No pruning.
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t4
t3
t3
t4
t1
t3
t1
t4
t4
t3
t3
t1
t4
t1
t3
t4
t1
t1 t4
t2 t4
6
Pruning # 2: by heuristic
●
Adding a table tX to a join prefix
– Adds read_time (time to read tX)
– Produces record_count row
combinations to be joined with further
tables (aka “join suffix”).
– Both have an effect on the total cost:
●
read_time is time spent right now.
●
record_count will affect cost of join
suffix.
– We don’t know the “exchange ratio”
because we don’t know the costs of
“join suffix”.
t0
t1
t2
t3
incoming_record_count
record_count_t1
record_count_t2
record_count_t3
read_time_t1
7
The idea behind the heuristic
– … we don’t know the “exchange ratio”
because we don’t know the costs of “join
suffix”
●
Also the suffixes are different!
– Let’s assume the suffixes have similar costs.
– Then, if
●
read_time_t1 < read_time_t2, AND
●
record_count_t1 < record_count_t2
– Then t1 “is better” than t2.
– Can prune away t2.
t0
t1
t2
t3
incoming_record_count
record_count_t1
record_count_t2
record_count_t3
read_time_t1
8
Applying heuristic pruning
●
Do it locally in each join prefix
●
First, consider more promising
tables first.
●
Less-promising tables second
– And try to prune them away.
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
9
Pruning # 2: by heuristic
●
A Model Table (yes, I’ve just invented this term):
– Lowest read_time AND record_count seen so
far
– Either
●
record_count < 2.0, or
●
there are no possible "key dependencies" on
tables not in the prefix
– A “possible key dependency” is an eqality in form:
tbl.keyXpartY=expr(tables_no_in_prefix)
●
^^ this is a “heuristic” to apply the heuristic.
●
Prune away tables that have both worse read_time
and record_count than the Model Table.
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
10
How one can see heuristic pruning
●
@@optimizer_prune_level
– 0 – not enabled.
– 1 – enabled (the default)
●
Optimizer trace: grep for
“pruned_by_heuristic”
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
11
Greedy Search
12
Greedy search
●
Consider only prefixes of limited size
– Based on that, pick the first table
– Repeat
●
@@optimizer_search_depth
– Default: 62
(both MySQL and MariaDB)
– 0 – “pick depth automatically”
●
Why is this not default yet?
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
13
MDEV-28073
(fixed in 10.6)
14
MDEV-28073: patch #1: “edge tables”
●
If the suffix t1-t4-t3 uses only eq_ref or similar:
– It is [nearly] the best
– Don’t enumerate other table combinations.
●
They can’t be much better.
●
Optimizer trace: pruned_by_hanging_leaf
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
commit b729896d00e022f6205399376c0cc107e1ee0704
Author: Monty <monty@mariadb.org>
Date: Tue May 10 11:47:20 2022 +0300
MDEV-28073 Query performance degradation in newer MariaDB versions when
using many tables
The issue was that best_extension_by_limited_search() had to go through
too many plans with the same cost as there where many EQ_REF tables.
Fixed by shortcutting EQ_REF (AND REF) when the result only contains one
row. This got the optimization time down from hours to sub seconds.
t0
15
MDEV-28073: patch #2: key_dependent
select ...
from
person, car_rides, bicycle_rides
where
person.name=car_rides.rider and
person.name=bicycle_rides.rider and
...
car_rides bicycle_rides
person
●
Remember the “heuristics to apply the heuristic” a few slides above:
– there are no possible "key dependencies" on tables not in the prefix
It can be false due to multi-equalities:
●
person.name=bicycle_rides.riders is a “possible key dependency”.
●
But we already have person.name from car_rides.rider (the equality is “bound”)
– Trying join orders with bicycle_rides before person won’t produce a better plan.
●
Solution: adjust the heuristic: there are no possible key_dependencies on tables not in
the prefix that are not already bound.
name
16
MDEV-28073: patch #3: table order de-scrambling
●
The optimizer should try good tables first
●
Implemented by taking tables off the unused portion of
join->best_ref array.
– Initially it’s ordered (“promising” tables first)
– But due to bug eventually gets out of order
●
Plan searches that enumerate many options could
suffer from poor pruning towards the end.
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
Author: Michael Widenius <monty@mariadb.org>
Date: Sun May 15 15:46:29 2022 +0300
greedy_search() and best_extension_by_limited_search() scrambled table order
best_extension_by_limited_search() assumes that tables should be sorted
according to size to be able to quickly disregard bad plans. However the
current usage of swap_variables() will change the table order to a not
sorted one for the next recursive call. This breaks the assumtion and
causes performance issues when using many tables (we have to examine
many more plans).
t0
17
MDEV-28852
(MariaDB 10.10)
18
Local table pre-sorting
19
In which order do we try the tables?
●
Current:
– join_tab_cmp() orders all tables by their
JOIN_TAB::found_records
(records after table’s condition is checked)
– The same ordering is used everywhere
– This *ignores* the join prefix and efficien
table read plans we can use
– e.g. here, ignores the prefix of t1:
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
t1
t1
t2
t3
t4
20
In which order do we try the tables?
●
First, evaluate possible table accesses for {t2,t3,t4}.
●
Sort them by #found_rows
●
Then try extending join orders
– Do all kinds of pruning while doing this
t1
t1
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
commit 0762dd9283185c72c6955f44fc4d862a0a928569
Author: Monty <monty@mariadb.org>
Date: Tue May 31 17:36:32 2022 +0300
Improve pruning in greedy_search by sorting tables during search
MDEV-28073 Slow query performance in MariaDB when using many tables
The faster we can find a good query plan, the more options we have for
finding and pruning (ignoring) bad plans.
This patch adds sorting of plans to best_extension_by_limited_search().
21
Improving the pruning
(MDEV-28929)
22
Remember: the idea behind the heuristic
– … we don’t know the “exchange ratio”
because we don’t know the costs of “join
suffix”
●
Also the suffixes are different!
– Let’s assume the suffixes have similar costs.
– Then, if
●
read_time_t1 < read_time_t2, AND
●
record_count_t1 < record_count_t2
– Then t1 “is better” than t2.
– Can prune away t2.
–
t0
t1
t2
t3
incoming_record_count
record_count_t1
record_count_t2
record_count_t3
read_time_t1
23
Let’s plot the tables
Let’s plot
records_read
read_time
t1
24
Let’s plot the tables
records_read
read_time
t1
Better than t1
Worse than t1
25
Let’s plot the tables
●
“Typical” situation: plan with
higher cost produce more #rows.
– A lot of opportunities to do
pruning
●
The optimizer orders plans by
records_read
– (that is, goes left-to-right)
– Pick the first plan as “Model”,
prune those that are worse.
records_read
read_time
t1
t0
t2
t3
t4
26
When pruning doesn’t work
records_read
read_time
t1
t0
t2
t3
t4
●
“Bad” situation:
– plans with high cost produce
few rows
– And vice versa
●
Can’t do pruning.
27
When pruning could work but doesn’t
records_read
read_time
t1
t2
t3
t4
t0
●
Walking left-to-right, the optimizer
picks t0 as Model table.
●
And then can’t prune away any
other table.
Tables that are worse than t0 are here
28
How to do as much pruning as possible?
records_read
read_time
●
Pick a minimal set of Model
tables that allow to prune away
the rest?
●
Complexity seems to be at least
N^2.
●
Some approximate algorithm?
– Use the table with min_cost
– Use the table with
min_records_read
●
Have a patch with some
approximate implementation
29
eq_ref chaining
30
Motivation
●
Tables with “attributes” that are joined using Primary Key
select * from base_table, attr1, attr2, ... attrN
where
attr1.pk = base_table.pk and
attr2.pk = base_table.pk and
...
attrN.pk = base_table.pk
●
Lots of nearly-identical query plans: There are factorial(n_attributes) permutations
– Have the same or very close cost
●
=> Can’t do pruning
●
The fix with “Edge tables” aka pruned_by_hanging_leaf helps but only if the
attributes are at the end of the join order.
31
eq_ref chaining
●
The idea: if we see a eq_ref access, try
considering only eq_refs as long as we can.
●
MySQL 5.7 has a similar optimization
– TODO: describe the differences.
t1
t1
t2
t2
t3
t4
t3
t4
t2
t4
t2
t3
t4
t3
t4
t2
t3
t2
t1
t3
t4
t3
t4
t1
t4
t1
t3
t4
t3
t4
t1
t3
t1
t3
t4
commit 5abb6bff6cfb5cb5d87520f1e32e9b41db46bd7b
Author: Monty <monty@mariadb.org>
Date: Thu Jun 2 19:47:23 2022 +0300
Added EQ_REF chaining to the greedy_optimizer
MDEV-28073 Slow query performance in MariaDB when using many table
The idea is to prefer and chain EQ_REF tables (tables that uses an
unique key to find a row) when searching for the best table combination.
This significantly reduces row combinations that has to be examined.
This is optimization is enabled when setting optimizer_prune_level=2 (default)
32
Thanks for your attention!

More Related Content

What's hot

Window functions in MySQL 8.0
Window functions in MySQL 8.0Window functions in MySQL 8.0
Window functions in MySQL 8.0
Mydbops
 
Using Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query PerformanceUsing Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query Performance
oysteing
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Altinity Ltd
 
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Jaime Crespo
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
Sveta Smirnova
 
Histograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQL
Sergey Petrunya
 
MySQL Query And Index Tuning
MySQL Query And Index TuningMySQL Query And Index Tuning
MySQL Query And Index Tuning
Manikanda kumar
 
MySQL Optimizer Overview
MySQL Optimizer OverviewMySQL Optimizer Overview
MySQL Optimizer Overview
Olav Sandstå
 
MySQL 8.0 Optimizer Guide
MySQL 8.0 Optimizer GuideMySQL 8.0 Optimizer Guide
MySQL 8.0 Optimizer Guide
Morgan Tocker
 
How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15
oysteing
 
How to use histograms to get better performance
How to use histograms to get better performanceHow to use histograms to get better performance
How to use histograms to get better performance
MariaDB plc
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
oysteing
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
Marco Tusa
 
Oracle db performance tuning
Oracle db performance tuningOracle db performance tuning
Oracle db performance tuning
Simon Huang
 
MySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsMySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 Tips
OSSCube
 
Chasing the optimizer
Chasing the optimizerChasing the optimizer
Chasing the optimizer
Mauro Pagano
 
PostgreSQL: Joining 1 million tables
PostgreSQL: Joining 1 million tablesPostgreSQL: Joining 1 million tables
PostgreSQL: Joining 1 million tables
Hans-Jürgen Schönig
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
oysteing
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
Altinity Ltd
 
A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN
Riyaj Shamsudeen
 

What's hot (20)

Window functions in MySQL 8.0
Window functions in MySQL 8.0Window functions in MySQL 8.0
Window functions in MySQL 8.0
 
Using Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query PerformanceUsing Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query Performance
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
Query Optimization with MySQL 5.6: Old and New Tricks - Percona Live London 2013
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
 
Histograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQLHistograms in MariaDB, MySQL and PostgreSQL
Histograms in MariaDB, MySQL and PostgreSQL
 
MySQL Query And Index Tuning
MySQL Query And Index TuningMySQL Query And Index Tuning
MySQL Query And Index Tuning
 
MySQL Optimizer Overview
MySQL Optimizer OverviewMySQL Optimizer Overview
MySQL Optimizer Overview
 
MySQL 8.0 Optimizer Guide
MySQL 8.0 Optimizer GuideMySQL 8.0 Optimizer Guide
MySQL 8.0 Optimizer Guide
 
How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15How to analyze and tune sql queries for better performance percona15
How to analyze and tune sql queries for better performance percona15
 
How to use histograms to get better performance
How to use histograms to get better performanceHow to use histograms to get better performance
How to use histograms to get better performance
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
 
Oracle db performance tuning
Oracle db performance tuningOracle db performance tuning
Oracle db performance tuning
 
MySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsMySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 Tips
 
Chasing the optimizer
Chasing the optimizerChasing the optimizer
Chasing the optimizer
 
PostgreSQL: Joining 1 million tables
PostgreSQL: Joining 1 million tablesPostgreSQL: Joining 1 million tables
PostgreSQL: Joining 1 million tables
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
 
A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN
 

Similar to MariaDB's join optimizer: how it works and current fixes

Algorithim lec1.pptx
Algorithim lec1.pptxAlgorithim lec1.pptx
Algorithim lec1.pptx
rediet43
 
Advanced query optimization
Advanced query optimizationAdvanced query optimization
Advanced query optimization
MYXPLAIN
 
Basics in algorithms and data structure
Basics in algorithms and data structure Basics in algorithms and data structure
Basics in algorithms and data structure
Eman magdy
 
PostgreSQL query planner's internals
PostgreSQL query planner's internalsPostgreSQL query planner's internals
PostgreSQL query planner's internals
Alexey Ermakov
 
Query Compilation in Impala
Query Compilation in ImpalaQuery Compilation in Impala
Query Compilation in Impala
Cloudera, Inc.
 
Merge sort analysis and its real time applications
Merge sort analysis and its real time applicationsMerge sort analysis and its real time applications
Merge sort analysis and its real time applications
yazad dumasia
 
MySQL Query Optimisation 101
MySQL Query Optimisation 101MySQL Query Optimisation 101
MySQL Query Optimisation 101
Federico Razzoli
 
Data Structures 6
Data Structures 6Data Structures 6
Data Structures 6
Dr.Umadevi V
 
Flowshop scheduling
Flowshop schedulingFlowshop scheduling
Flowshop scheduling
Kunal Goswami
 
Press the link to see the book from my google drive.https.docx
Press the link to see the book from my google drive.https.docxPress the link to see the book from my google drive.https.docx
Press the link to see the book from my google drive.https.docx
ChantellPantoja184
 
5 Cool Things About PLSQL
5 Cool Things About PLSQL5 Cool Things About PLSQL
5 Cool Things About PLSQL
Connor McDonald
 
11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...
Alexander Decker
 
11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...
Alexander Decker
 
Machines constrained flow shop scheduling processing time, setup time each as...
Machines constrained flow shop scheduling processing time, setup time each as...Machines constrained flow shop scheduling processing time, setup time each as...
Machines constrained flow shop scheduling processing time, setup time each as...
Alexander Decker
 
Electrical Engineering Exam Help
Electrical Engineering Exam HelpElectrical Engineering Exam Help
Electrical Engineering Exam Help
Live Exam Helper
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
Akshay Dagar
 
Rtl design optimizations and tradeoffs
Rtl design optimizations and tradeoffsRtl design optimizations and tradeoffs
Rtl design optimizations and tradeoffs
Grace Abraham
 
Oracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansOracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive Plans
Franck Pachot
 
merge sort help in language C with algorithms
merge sort help in language C with algorithmsmerge sort help in language C with algorithms
merge sort help in language C with algorithms
tahamou4
 
How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012
Connor McDonald
 

Similar to MariaDB's join optimizer: how it works and current fixes (20)

Algorithim lec1.pptx
Algorithim lec1.pptxAlgorithim lec1.pptx
Algorithim lec1.pptx
 
Advanced query optimization
Advanced query optimizationAdvanced query optimization
Advanced query optimization
 
Basics in algorithms and data structure
Basics in algorithms and data structure Basics in algorithms and data structure
Basics in algorithms and data structure
 
PostgreSQL query planner's internals
PostgreSQL query planner's internalsPostgreSQL query planner's internals
PostgreSQL query planner's internals
 
Query Compilation in Impala
Query Compilation in ImpalaQuery Compilation in Impala
Query Compilation in Impala
 
Merge sort analysis and its real time applications
Merge sort analysis and its real time applicationsMerge sort analysis and its real time applications
Merge sort analysis and its real time applications
 
MySQL Query Optimisation 101
MySQL Query Optimisation 101MySQL Query Optimisation 101
MySQL Query Optimisation 101
 
Data Structures 6
Data Structures 6Data Structures 6
Data Structures 6
 
Flowshop scheduling
Flowshop schedulingFlowshop scheduling
Flowshop scheduling
 
Press the link to see the book from my google drive.https.docx
Press the link to see the book from my google drive.https.docxPress the link to see the book from my google drive.https.docx
Press the link to see the book from my google drive.https.docx
 
5 Cool Things About PLSQL
5 Cool Things About PLSQL5 Cool Things About PLSQL
5 Cool Things About PLSQL
 
11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...
 
11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...11.machines constrained flow shop scheduling processing time, setup time each...
11.machines constrained flow shop scheduling processing time, setup time each...
 
Machines constrained flow shop scheduling processing time, setup time each as...
Machines constrained flow shop scheduling processing time, setup time each as...Machines constrained flow shop scheduling processing time, setup time each as...
Machines constrained flow shop scheduling processing time, setup time each as...
 
Electrical Engineering Exam Help
Electrical Engineering Exam HelpElectrical Engineering Exam Help
Electrical Engineering Exam Help
 
Algorithm analysis
Algorithm analysisAlgorithm analysis
Algorithm analysis
 
Rtl design optimizations and tradeoffs
Rtl design optimizations and tradeoffsRtl design optimizations and tradeoffs
Rtl design optimizations and tradeoffs
 
Oracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive PlansOracle Join Methods and 12c Adaptive Plans
Oracle Join Methods and 12c Adaptive Plans
 
merge sort help in language C with algorithms
merge sort help in language C with algorithmsmerge sort help in language C with algorithms
merge sort help in language C with algorithms
 
How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012
 

More from Sergey Petrunya

New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12
Sergey Petrunya
 
Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8
Sergey Petrunya
 
Improving MariaDB’s Query Optimizer with better selectivity estimates
Improving MariaDB’s Query Optimizer with better selectivity estimatesImproving MariaDB’s Query Optimizer with better selectivity estimates
Improving MariaDB’s Query Optimizer with better selectivity estimates
Sergey Petrunya
 
JSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger pictureJSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger picture
Sergey Petrunya
 
Optimizer Trace Walkthrough
Optimizer Trace WalkthroughOptimizer Trace Walkthrough
Optimizer Trace Walkthrough
Sergey Petrunya
 
ANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gemANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gem
Sergey Petrunya
 
Optimizer features in recent releases of other databases
Optimizer features in recent releases of other databasesOptimizer features in recent releases of other databases
Optimizer features in recent releases of other databases
Sergey Petrunya
 
MariaDB 10.4 - что нового
MariaDB 10.4 - что новогоMariaDB 10.4 - что нового
MariaDB 10.4 - что нового
Sergey Petrunya
 
Using histograms to get better performance
Using histograms to get better performanceUsing histograms to get better performance
Using histograms to get better performance
Sergey Petrunya
 
MariaDB Optimizer - further down the rabbit hole
MariaDB Optimizer - further down the rabbit holeMariaDB Optimizer - further down the rabbit hole
MariaDB Optimizer - further down the rabbit hole
Sergey Petrunya
 
Query Optimizer in MariaDB 10.4
Query Optimizer in MariaDB 10.4Query Optimizer in MariaDB 10.4
Query Optimizer in MariaDB 10.4
Sergey Petrunya
 
Lessons for the optimizer from running the TPC-DS benchmark
Lessons for the optimizer from running the TPC-DS benchmarkLessons for the optimizer from running the TPC-DS benchmark
Lessons for the optimizer from running the TPC-DS benchmark
Sergey Petrunya
 
MariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it standMariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it stand
Sergey Petrunya
 
MyRocks in MariaDB | M18
MyRocks in MariaDB | M18MyRocks in MariaDB | M18
MyRocks in MariaDB | M18
Sergey Petrunya
 
New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3
Sergey Petrunya
 
MyRocks in MariaDB
MyRocks in MariaDBMyRocks in MariaDB
MyRocks in MariaDB
Sergey Petrunya
 
Say Hello to MyRocks
Say Hello to MyRocksSay Hello to MyRocks
Say Hello to MyRocks
Sergey Petrunya
 
Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2
Sergey Petrunya
 
MyRocks in MariaDB: why and how
MyRocks in MariaDB: why and howMyRocks in MariaDB: why and how
MyRocks in MariaDB: why and how
Sergey Petrunya
 
Эволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDBЭволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDB
Sergey Petrunya
 

More from Sergey Petrunya (20)

New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12New optimizer features in MariaDB releases before 10.12
New optimizer features in MariaDB releases before 10.12
 
Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8Improved histograms in MariaDB 10.8
Improved histograms in MariaDB 10.8
 
Improving MariaDB’s Query Optimizer with better selectivity estimates
Improving MariaDB’s Query Optimizer with better selectivity estimatesImproving MariaDB’s Query Optimizer with better selectivity estimates
Improving MariaDB’s Query Optimizer with better selectivity estimates
 
JSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger pictureJSON Support in MariaDB: News, non-news and the bigger picture
JSON Support in MariaDB: News, non-news and the bigger picture
 
Optimizer Trace Walkthrough
Optimizer Trace WalkthroughOptimizer Trace Walkthrough
Optimizer Trace Walkthrough
 
ANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gemANALYZE for Statements - MariaDB's hidden gem
ANALYZE for Statements - MariaDB's hidden gem
 
Optimizer features in recent releases of other databases
Optimizer features in recent releases of other databasesOptimizer features in recent releases of other databases
Optimizer features in recent releases of other databases
 
MariaDB 10.4 - что нового
MariaDB 10.4 - что новогоMariaDB 10.4 - что нового
MariaDB 10.4 - что нового
 
Using histograms to get better performance
Using histograms to get better performanceUsing histograms to get better performance
Using histograms to get better performance
 
MariaDB Optimizer - further down the rabbit hole
MariaDB Optimizer - further down the rabbit holeMariaDB Optimizer - further down the rabbit hole
MariaDB Optimizer - further down the rabbit hole
 
Query Optimizer in MariaDB 10.4
Query Optimizer in MariaDB 10.4Query Optimizer in MariaDB 10.4
Query Optimizer in MariaDB 10.4
 
Lessons for the optimizer from running the TPC-DS benchmark
Lessons for the optimizer from running the TPC-DS benchmarkLessons for the optimizer from running the TPC-DS benchmark
Lessons for the optimizer from running the TPC-DS benchmark
 
MariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it standMariaDB 10.3 Optimizer - where does it stand
MariaDB 10.3 Optimizer - where does it stand
 
MyRocks in MariaDB | M18
MyRocks in MariaDB | M18MyRocks in MariaDB | M18
MyRocks in MariaDB | M18
 
New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3New Query Optimizer features in MariaDB 10.3
New Query Optimizer features in MariaDB 10.3
 
MyRocks in MariaDB
MyRocks in MariaDBMyRocks in MariaDB
MyRocks in MariaDB
 
Say Hello to MyRocks
Say Hello to MyRocksSay Hello to MyRocks
Say Hello to MyRocks
 
Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2Common Table Expressions in MariaDB 10.2
Common Table Expressions in MariaDB 10.2
 
MyRocks in MariaDB: why and how
MyRocks in MariaDB: why and howMyRocks in MariaDB: why and how
MyRocks in MariaDB: why and how
 
Эволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDBЭволюция репликации в MySQL и MariaDB
Эволюция репликации в MySQL и MariaDB
 

Recently uploaded

Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Julian Hyde
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
Rakesh Kumar R
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
XfilesPro
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
VALiNTRY360
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Neo4j
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
Yara Milbes
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
UI5con 2024 - Bring Your Own Design System
UI5con 2024 - Bring Your Own Design SystemUI5con 2024 - Bring Your Own Design System
UI5con 2024 - Bring Your Own Design System
Peter Muessig
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
Remote DBA Services
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
ICS
 
Odoo ERP Vs. Traditional ERP Systems – A Comparative Analysis
Odoo ERP Vs. Traditional ERP Systems – A Comparative AnalysisOdoo ERP Vs. Traditional ERP Systems – A Comparative Analysis
Odoo ERP Vs. Traditional ERP Systems – A Comparative Analysis
Envertis Software Solutions
 

Recently uploaded (20)

Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
UI5con 2024 - Bring Your Own Design System
UI5con 2024 - Bring Your Own Design SystemUI5con 2024 - Bring Your Own Design System
UI5con 2024 - Bring Your Own Design System
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 
Odoo ERP Vs. Traditional ERP Systems – A Comparative Analysis
Odoo ERP Vs. Traditional ERP Systems – A Comparative AnalysisOdoo ERP Vs. Traditional ERP Systems – A Comparative Analysis
Odoo ERP Vs. Traditional ERP Systems – A Comparative Analysis
 

MariaDB's join optimizer: how it works and current fixes

  • 1. Sergei Petrunia MariaDB devroom FOSDEM 2021 Join Optimizer 1. How it works 2. What we’re working on to improve it Optimizer Call July 2022 Sergei Petrunia MariaDB
  • 2. 2 Join order search ● Total number of possible join orders for N-table join is: N * (N-1) * (N-2) *… = N! ● Join orders are built left-to-right ● Cannot enumerate all possible join orders. t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4
  • 3. 3 Pruning ● Enumerate promising join orders first. ● Do not explore join orders that are apparently worse. t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4
  • 4. 4 Pruning # 1: by cost ● Cost of the current_prefix is already higher than total cost of best plan. – Adding tables will make it even higher – No point to try. ● This pruning is always done (no switch) ● Optimizer trace: pruned_by_cost t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4
  • 5. 5 pruned_by_cost weaknesses ● A really expensive table at the end of the join order. ● Any prefix that doesn’t include it is relatively cheap – Even if its comparably worse: – – ● => No pruning. t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t4 t3 t3 t4 t1 t3 t1 t4 t4 t3 t3 t1 t4 t1 t3 t4 t1 t1 t4 t2 t4
  • 6. 6 Pruning # 2: by heuristic ● Adding a table tX to a join prefix – Adds read_time (time to read tX) – Produces record_count row combinations to be joined with further tables (aka “join suffix”). – Both have an effect on the total cost: ● read_time is time spent right now. ● record_count will affect cost of join suffix. – We don’t know the “exchange ratio” because we don’t know the costs of “join suffix”. t0 t1 t2 t3 incoming_record_count record_count_t1 record_count_t2 record_count_t3 read_time_t1
  • 7. 7 The idea behind the heuristic – … we don’t know the “exchange ratio” because we don’t know the costs of “join suffix” ● Also the suffixes are different! – Let’s assume the suffixes have similar costs. – Then, if ● read_time_t1 < read_time_t2, AND ● record_count_t1 < record_count_t2 – Then t1 “is better” than t2. – Can prune away t2. t0 t1 t2 t3 incoming_record_count record_count_t1 record_count_t2 record_count_t3 read_time_t1
  • 8. 8 Applying heuristic pruning ● Do it locally in each join prefix ● First, consider more promising tables first. ● Less-promising tables second – And try to prune them away. t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4
  • 9. 9 Pruning # 2: by heuristic ● A Model Table (yes, I’ve just invented this term): – Lowest read_time AND record_count seen so far – Either ● record_count < 2.0, or ● there are no possible "key dependencies" on tables not in the prefix – A “possible key dependency” is an eqality in form: tbl.keyXpartY=expr(tables_no_in_prefix) ● ^^ this is a “heuristic” to apply the heuristic. ● Prune away tables that have both worse read_time and record_count than the Model Table. t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4
  • 10. 10 How one can see heuristic pruning ● @@optimizer_prune_level – 0 – not enabled. – 1 – enabled (the default) ● Optimizer trace: grep for “pruned_by_heuristic” t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4
  • 12. 12 Greedy search ● Consider only prefixes of limited size – Based on that, pick the first table – Repeat ● @@optimizer_search_depth – Default: 62 (both MySQL and MariaDB) – 0 – “pick depth automatically” ● Why is this not default yet? t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4
  • 14. 14 MDEV-28073: patch #1: “edge tables” ● If the suffix t1-t4-t3 uses only eq_ref or similar: – It is [nearly] the best – Don’t enumerate other table combinations. ● They can’t be much better. ● Optimizer trace: pruned_by_hanging_leaf t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 commit b729896d00e022f6205399376c0cc107e1ee0704 Author: Monty <monty@mariadb.org> Date: Tue May 10 11:47:20 2022 +0300 MDEV-28073 Query performance degradation in newer MariaDB versions when using many tables The issue was that best_extension_by_limited_search() had to go through too many plans with the same cost as there where many EQ_REF tables. Fixed by shortcutting EQ_REF (AND REF) when the result only contains one row. This got the optimization time down from hours to sub seconds. t0
  • 15. 15 MDEV-28073: patch #2: key_dependent select ... from person, car_rides, bicycle_rides where person.name=car_rides.rider and person.name=bicycle_rides.rider and ... car_rides bicycle_rides person ● Remember the “heuristics to apply the heuristic” a few slides above: – there are no possible "key dependencies" on tables not in the prefix It can be false due to multi-equalities: ● person.name=bicycle_rides.riders is a “possible key dependency”. ● But we already have person.name from car_rides.rider (the equality is “bound”) – Trying join orders with bicycle_rides before person won’t produce a better plan. ● Solution: adjust the heuristic: there are no possible key_dependencies on tables not in the prefix that are not already bound. name
  • 16. 16 MDEV-28073: patch #3: table order de-scrambling ● The optimizer should try good tables first ● Implemented by taking tables off the unused portion of join->best_ref array. – Initially it’s ordered (“promising” tables first) – But due to bug eventually gets out of order ● Plan searches that enumerate many options could suffer from poor pruning towards the end. t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 Author: Michael Widenius <monty@mariadb.org> Date: Sun May 15 15:46:29 2022 +0300 greedy_search() and best_extension_by_limited_search() scrambled table order best_extension_by_limited_search() assumes that tables should be sorted according to size to be able to quickly disregard bad plans. However the current usage of swap_variables() will change the table order to a not sorted one for the next recursive call. This breaks the assumtion and causes performance issues when using many tables (we have to examine many more plans). t0
  • 19. 19 In which order do we try the tables? ● Current: – join_tab_cmp() orders all tables by their JOIN_TAB::found_records (records after table’s condition is checked) – The same ordering is used everywhere – This *ignores* the join prefix and efficien table read plans we can use – e.g. here, ignores the prefix of t1: t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4 t1 t1 t2 t3 t4
  • 20. 20 In which order do we try the tables? ● First, evaluate possible table accesses for {t2,t3,t4}. ● Sort them by #found_rows ● Then try extending join orders – Do all kinds of pruning while doing this t1 t1 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 commit 0762dd9283185c72c6955f44fc4d862a0a928569 Author: Monty <monty@mariadb.org> Date: Tue May 31 17:36:32 2022 +0300 Improve pruning in greedy_search by sorting tables during search MDEV-28073 Slow query performance in MariaDB when using many tables The faster we can find a good query plan, the more options we have for finding and pruning (ignoring) bad plans. This patch adds sorting of plans to best_extension_by_limited_search().
  • 22. 22 Remember: the idea behind the heuristic – … we don’t know the “exchange ratio” because we don’t know the costs of “join suffix” ● Also the suffixes are different! – Let’s assume the suffixes have similar costs. – Then, if ● read_time_t1 < read_time_t2, AND ● record_count_t1 < record_count_t2 – Then t1 “is better” than t2. – Can prune away t2. – t0 t1 t2 t3 incoming_record_count record_count_t1 record_count_t2 record_count_t3 read_time_t1
  • 23. 23 Let’s plot the tables Let’s plot records_read read_time t1
  • 24. 24 Let’s plot the tables records_read read_time t1 Better than t1 Worse than t1
  • 25. 25 Let’s plot the tables ● “Typical” situation: plan with higher cost produce more #rows. – A lot of opportunities to do pruning ● The optimizer orders plans by records_read – (that is, goes left-to-right) – Pick the first plan as “Model”, prune those that are worse. records_read read_time t1 t0 t2 t3 t4
  • 26. 26 When pruning doesn’t work records_read read_time t1 t0 t2 t3 t4 ● “Bad” situation: – plans with high cost produce few rows – And vice versa ● Can’t do pruning.
  • 27. 27 When pruning could work but doesn’t records_read read_time t1 t2 t3 t4 t0 ● Walking left-to-right, the optimizer picks t0 as Model table. ● And then can’t prune away any other table. Tables that are worse than t0 are here
  • 28. 28 How to do as much pruning as possible? records_read read_time ● Pick a minimal set of Model tables that allow to prune away the rest? ● Complexity seems to be at least N^2. ● Some approximate algorithm? – Use the table with min_cost – Use the table with min_records_read ● Have a patch with some approximate implementation
  • 30. 30 Motivation ● Tables with “attributes” that are joined using Primary Key select * from base_table, attr1, attr2, ... attrN where attr1.pk = base_table.pk and attr2.pk = base_table.pk and ... attrN.pk = base_table.pk ● Lots of nearly-identical query plans: There are factorial(n_attributes) permutations – Have the same or very close cost ● => Can’t do pruning ● The fix with “Edge tables” aka pruned_by_hanging_leaf helps but only if the attributes are at the end of the join order.
  • 31. 31 eq_ref chaining ● The idea: if we see a eq_ref access, try considering only eq_refs as long as we can. ● MySQL 5.7 has a similar optimization – TODO: describe the differences. t1 t1 t2 t2 t3 t4 t3 t4 t2 t4 t2 t3 t4 t3 t4 t2 t3 t2 t1 t3 t4 t3 t4 t1 t4 t1 t3 t4 t3 t4 t1 t3 t1 t3 t4 commit 5abb6bff6cfb5cb5d87520f1e32e9b41db46bd7b Author: Monty <monty@mariadb.org> Date: Thu Jun 2 19:47:23 2022 +0300 Added EQ_REF chaining to the greedy_optimizer MDEV-28073 Slow query performance in MariaDB when using many table The idea is to prefer and chain EQ_REF tables (tables that uses an unique key to find a row) when searching for the best table combination. This significantly reduces row combinations that has to be examined. This is optimization is enabled when setting optimizer_prune_level=2 (default)
  • 32. 32 Thanks for your attention!