Sergei Petrunia
MariaDB devroom
FOSDEM 2021
New optimizer features
in MariaDB releases before 10.12
MariaDB Fest 2022
Sergei Petrunia
MariaDB developer
select *
from optimizer_features
where
release_date is not null and
release_date > date_sub(now(), interval 1 year)
2
Optimizer features in MariaDB 10.8
MariaDB 10.8: Stable since May, 2022
Optimizer features:
●
JSON_HB histogram type
– Covered in “Improved histograms in MariaDB 10.8” talk at FOSDEM 2022.
●
Reverse-ordered index support
●
JSON produced by the optimizer is now
– Valid :-)
– Processible
3
Optimizer produces JSON documents
●
Optimizer produces JSON
– EXPLAIN FORMAT=JSON …
– ANALYZE FORMAT=JSON …
– Optimizer Trace
●
Reasons for using JSON
– Human-readable, well-known syntax
– Can be processed
●
e.g. extract interesting parts of trace
4
JSON data structures
●
Array:
[ elem1, elem2, …]
●
Object:
{ "name1":elem1, "name2":elem2, … }
●
What if there are duplicate names?
●
This is still valid JSON: { "name1":foo, "name1":bar, … }
●
Possible gotchas in processing
– Some tools work
– Some tools will ignore “foo”
– Some will concatenate (merge): “foobar”
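To make the duplicate-name gotcha concrete, here is a minimal sketch using Python's standard json module, which is one of the tools that silently keeps only the last value. The helper names (DuplicateName, reject_duplicates, has_duplicate_names) are made up for this illustration.

```python
import json

doc = '{"name1": "foo", "name1": "bar"}'

# Python's json module accepts the document and keeps the last value,
# so "foo" is silently ignored.
plain = json.loads(doc)


class DuplicateName(ValueError):
    """Raised by the hook below when an object repeats a member name."""


def reject_duplicates(pairs):
    """object_pairs_hook: build a dict, but fail on a repeated name."""
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise DuplicateName(name)
        seen[name] = value
    return seen


def has_duplicate_names(text):
    """Return True if any object in the JSON text repeats a member name."""
    try:
        json.loads(text, object_pairs_hook=reject_duplicates)
    except DuplicateName:
        return True
    return False
```

A check like has_duplicate_names() can be run over saved EXPLAIN or trace output to find documents that a given tool would mangle.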
5
Most common: EXPLAIN FORMAT=JSON
Before 10.8 (duplicate "table" names):
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "table": {
      "table_name": "t1",
      "access_type": "ALL",
      "rows": 10,
      "filtered": 100
    },
    "table": {
      "table_name": "t2",
      "access_type": "ref",
      "possible_keys": ["a"],
      "key": "a",
      ...
Starting from 10.8 (tables are elements of the "nested_loop" array):
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {
        "table": {
          "table_name": "t1",
          "access_type": "ALL",
          "rows": 10,
          "filtered": 100
        }
      },
      {
        "table": {
          "table_name": "t2",
          "access_type": "ref",
          "possible_keys": ["a"],
          ...
6
Now JSON is processible
●
Before 10.8:
– EXPLAIN/ANALYZE had duplicate names
– Optimizer trace also had that
●
Starting from 10.8:
– All JSON docs are processible
– Can develop tools
●
EXPLAIN/ANALYZE visualizer (highlight the heaviest parts of the query)
●
Optimizer trace: show details about the chosen query plan
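As a rough idea of what such tooling could start from, the sketch below walks a 10.8-style EXPLAIN FORMAT=JSON document and lists each table with its access type. The sample document is the two-table example from the earlier slide, cut off where the slide elides the rest; iter_tables is a hypothetical helper, and real plans contain more node types than this walker knows about.

```python
import json

# The "nested_loop" form of the example plan; t2 is truncated here
# at the point where the slide elides the remaining members.
EXPLAIN_JSON = """
{
  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {"table": {"table_name": "t1", "access_type": "ALL",
                 "rows": 10, "filtered": 100}},
      {"table": {"table_name": "t2", "access_type": "ref",
                 "possible_keys": ["a"], "key": "a"}}
    ]
  }
}
"""


def iter_tables(node):
    """Yield every "table" object found anywhere in the document."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == "table" and isinstance(value, dict):
                yield value
            # Keep descending: tables can be nested inside other nodes.
            yield from iter_tables(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_tables(item)


tables = [
    (t.get("table_name"), t.get("access_type"))
    for t in iter_tables(json.loads(EXPLAIN_JSON))
]
```

With the pre-10.8 form, the same json.loads() call would have returned only one of the two "table" members, which is why this kind of processing depends on the fix.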
7
MariaDB 10.9
8
MariaDB 10.9
MariaDB 10.9: Stable since August, 2022
Optimizer features:
●
SHOW EXPLAIN FORMAT=JSON
●
SHOW ANALYZE [FORMAT=JSON]
●
EXPLAIN FOR CONNECTION
9
Remember SHOW EXPLAIN?
Lets one see the EXPLAIN of a running statement:
Connection 1:
-- Run a long query
select ... from ...
Connection 2:
-- Find the target's connection_id N
select * from information_schema.processlist;
-- See the query plan it is using
show explain for N;
10
Better SHOW EXPLAIN in 10.9
●
Support for SHOW EXPLAIN FORMAT=JSON
– One could only get tabular SHOW EXPLAIN before
●
Due to implementation details
– Now one can get the JSON form, too. It has more data.
●
Compatibility with MySQL 8:
– MySQL’s syntax:
EXPLAIN [FORMAT=JSON] FOR CONNECTION n
– Now, MariaDB supports it, too.
11
Remember ANALYZE?
●
ANALYZE SELECT … shows EXPLAIN with execution data
●
Has FORMAT=JSON form
●
Very useful for optimizer troubleshooting
– IF you can wait for the query to finish...
"query_block": {
"select_id": 1,
"r_loops": 1,
"r_total_time_ms": 0.947815181,
"nested_loop": [
{
"table": {
"table_name": "t1",
"access_type": "ALL",
"r_loops": 1,
"rows": 10,
"r_rows": 10,
"r_table_time_ms": 0.804191,
"r_other_time_ms": 0.1051569,
"filtered": 100,
"r_filtered": 20,
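One way to use this output is to compare each estimate with its r_ counterpart. The sketch below does that for a single table object, using the t1 values shown above; estimate_gaps is a hypothetical helper, not part of any MariaDB tool.

```python
def estimate_gaps(table):
    """Return {estimate_name: (estimate, actual)} for pairs that differ."""
    gaps = {}
    for est_name, act_name in (("rows", "r_rows"), ("filtered", "r_filtered")):
        if est_name in table and act_name in table:
            if table[est_name] != table[act_name]:
                gaps[est_name] = (table[est_name], table[act_name])
    return gaps


# Values copied from the t1 entry in the ANALYZE FORMAT=JSON output above.
t1 = {
    "table_name": "t1",
    "access_type": "ALL",
    "r_loops": 1,
    "rows": 10,
    "r_rows": 10,
    "filtered": 100,
    "r_filtered": 20,
}

gaps = estimate_gaps(t1)
```

Here the row estimate matches, but the optimizer expected the condition to keep 100% of the rows while only 20% passed.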
12
Problem with ANALYZE
●
Query takes forever to finish
●
Query has constructs with large potential for optimization errors
– Derived tables
– Many-table joins
– Dependent subqueries
●
What is the query doing?
– How long will it take?
13
SHOW ANALYZE
●
Shows r_ members describing the execution so far
●
Can check r_rows vs rows, r_filtered vs filtered.
– r_total_time_ms will be present if the target is collecting timings (is itself running an ANALYZE command)
●
r_loops=1 – the first iteration
●
Low r_loops: last iteration is incomplete
show analyze format=json for <thread_id>;
{
  "r_query_time_in_progress_ms": 7801,
  "query_optimization": {
    "r_total_time_ms": 0.402132271
  },
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "nested_loop": [
      {
        "table": {
          "table_name": "one_m",
          "access_type": "index",
          "key": "PRIMARY",
          "key_length": "4",
          "used_key_parts": ["a"],
          "r_loops": 1,
          "rows": 1000000,
          "r_rows": 69422,
          "filtered": 100,
          "r_filtered": 100,
          "using_index": true
        }
      }
    ]
  }
}
14
SHOW ANALYZE takeaways
●
Get ANALYZE [FORMAT=JSON] output for a running query
●
Useful for troubleshooting long-running queries
●
Run it repeatedly to see the progress
– Get a clue about the result ETA.
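A back-of-the-envelope way to turn one SHOW ANALYZE snapshot into an ETA is sketched below. It assumes the first table is still in its first loop (r_loops=1), that the optimizer's rows estimate for it is roughly right, and that time grows linearly with rows read; none of this is guaranteed, and the function names are invented for the example.

```python
def first_table_progress(doc):
    """Fraction of the first table read so far, or None if unknown."""
    table = doc["query_block"]["nested_loop"][0]["table"]
    # Only meaningful while the first (and only) scan is still running.
    if table.get("r_loops") != 1 or not table.get("rows"):
        return None
    return min(table["r_rows"] / table["rows"], 1.0)


def estimate_total_ms(doc):
    """Naive linear extrapolation of total run time in milliseconds."""
    fraction = first_table_progress(doc)
    if not fraction:
        return None
    return doc["r_query_time_in_progress_ms"] / fraction


# Relevant members of the SHOW ANALYZE FORMAT=JSON example shown earlier.
snapshot = {
    "r_query_time_in_progress_ms": 7801,
    "query_block": {
        "select_id": 1,
        "r_loops": 1,
        "nested_loop": [
            {"table": {"table_name": "one_m", "r_loops": 1,
                       "rows": 1000000, "r_rows": 69422}}
        ],
    },
}

fraction = first_table_progress(snapshot)
total_ms = estimate_total_ms(snapshot)
```

For this snapshot the scan is about 7% done after roughly 7.8 seconds, which points to a total run time on the order of two minutes. Taking a second snapshot later is a cheap way to check whether the linear assumption holds.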
15
MariaDB 10.10
16
MariaDB 10.10
MariaDB 10.10: RC since August, 2022
Optimizer features:
●
Table elimination works for derived tables (MDEV-26278)
●
Improved optimization for many-table joins (MDEV-28852)
17
Remember Table Elimination?
●
A normalized schema, optional attributes in separate tables
create table user (
  user_id int primary key,
  user_name varchar(32),
  ...
);
create table shipping_address (
  user_id int primary key,
  address text
);
●
A view to present the attribute(s)
create view user_info as
select U.user_id, user_name, address
from
  user U
  left join shipping_address ADDR on U.user_id=ADDR.user_id;
18
Remember Table Elimination (2) ?
create view user_info as
select U.user_id, user_name, address
from
  user U
  left join shipping_address ADDR on U.user_id=ADDR.user_id;
select user_name, address from user_info where user_id=1234;
+------+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | U | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | ADDR | const | PRIMARY | PRIMARY | 4 | const | 1 | |
+------+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
select user_name from user_info where user_id=1234;
+------+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | U | const | PRIMARY | PRIMARY | 4 | const | 1 | |
+------+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
●
Table ADDR was eliminated from the query plan
19
Summary attributes and derived tables
create table user (
  user_id int primary key,
  user_name varchar(32),
  ...
);
create table orders (
  order_id int primary key,
  user_id int,
  amount double,
  key(user_id)
);
●
A view to show users with order totals:
create view user_info2 as
select U.user_id, user_name, ORD_TOTAL
from
  user U
  left join (select
               user_id, sum(amount) as ORD_TOTAL
             from orders
             group by user_id
            ) ORD_TOTALS
  on ORD_TOTALS.user_id=U.user_id;
20
Table elimination for derived tables
The join inside the user_info2 view:
  user U
  left join (select
               user_id, sum(amount) as ORD_TOTAL
             from orders
             group by user_id
            ) ORD_TOTALS
  on ORD_TOTALS.user_id=U.user_id;
select user_name, ORD_TOTAL from user_info2 where user_id in (1,2);
+------+-----------------+------------+-------+---------------+---------+---------+----------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-----------------+------------+-------+---------------+---------+---------+----------------+------+-------------+
| 1 | PRIMARY | u | range | PRIMARY | PRIMARY | 4 | NULL | 2 | Using where |
| 1 | PRIMARY | <derived3> | ref | key0 | key0 | 5 | test.u.user_id | 4 | |
| 3 | LATERAL DERIVED | orders | ref | user_id | user_id | 5 | test.u.user_id | 49 | |
+------+-----------------+------------+-------+---------------+---------+---------+----------------+------+-------------+
select user_name from user_info2 where user_id in (1,2);
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | PRIMARY | u | range | PRIMARY | PRIMARY | 4 | NULL | 2 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
●
ORD_TOTALS was eliminated from the query plan
21
Summary
●
MariaDB has Table Elimination optimization
●
Applicable for inner sides of OUTER JOINs.
– Can eliminate unused tables
– Unused table must be joined with a unique key.
●
Starting from MariaDB 10.10:
– also applicable for derived tables with GROUP BY.
– Use case: derived tables providing summaries.
22
Optimization for many-table joins
23
Join order search
●
Total number of possible join orders is:
N * (N-1) * (N-2) *… = N!
●
Join orders are built left-to-right
●
Cannot enumerate all possible join orders.
– Prune away prefixes that:
●
Are already too expensive, or
●
Do not look promising.
[Figure: search tree of join order prefixes over t1, t2, t3, t4 for the query below]
select ... from t1, t2, t3, t4 where ...
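The idea of building join orders left-to-right and abandoning prefixes that are already too expensive can be shown with a toy search. The cost model and every number below are invented for illustration; this is not MariaDB's algorithm, and the "does not look promising" heuristics behind optimizer_prune_level are not modeled at all.

```python
import math

# Hypothetical per-table numbers: "read" is rows read per incoming row,
# "fanout" is rows produced per incoming row.
TABLES = {
    "t1": {"read": 1000, "fanout": 1000},
    "t2": {"read": 10, "fanout": 1},
    "t3": {"read": 5, "fanout": 1},
    "t4": {"read": 100, "fanout": 2},
}


def search_join_order(tables):
    """Depth-first search over join prefixes with cost-based pruning."""
    best = {"cost": math.inf, "order": None}
    stats = {"pruned": 0}

    def extend(prefix, cost, incoming):
        # A prefix that already costs as much as the best complete plan
        # cannot lead to a better plan, so drop the whole subtree.
        if cost >= best["cost"]:
            stats["pruned"] += 1
            return
        if len(prefix) == len(tables):
            best["cost"], best["order"] = cost, list(prefix)
            return
        for name, t in tables.items():
            if name in prefix:
                continue
            extend(prefix + [name],
                   cost + incoming * t["read"],
                   incoming * t["fanout"])

    extend([], 0, 1)
    return best["order"], best["cost"], stats["pruned"]


order, cost, pruned = search_join_order(TABLES)
total_orders = math.factorial(len(TABLES))  # N! complete join orders
```

Even with four tables the search skips part of the tree; the savings matter far more as N grows, because N! grows much faster than any fixed time budget.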
24
Join order search
●
optimizer_prune_level=
– 0 – Don’t prune
– 1 – Standard pruning (the default)
– 2 – New. Extra pruning
●
optimizer_extra_pruning_depth=N
– If search depth >=N, enable extra pruning
select ... from t1, t2, t3, …, tN where ...
25
MariaDB 10.11
MariaDB 10.11: alpha since Sept, 2022
●
MDEV-20609: Optimization for queries reading from
INFORMATION_SCHEMA.{PROC,PARAMETERS}
– “Don’t parse the stored routine if it doesn’t match the WHERE”
– Java connector queries these tables.
●
MDEV-28926: Make ANALYZE FORMAT=JSON show time spent
in the query optimizer. ...
"query_optimization": {
"r_total_time_ms": 0.402132271
},
26
Summary
Optimizer Trace improvements
10.8
●
JSON_HB histograms
●
Optimizer produces valid, processible JSON
10.9
●
SHOW EXPLAIN FORMAT=JSON
●
SHOW ANALYZE [FORMAT=JSON]
10.10
●
Table elimination for derived tables
●
Improved optimization for many-table joins
10.11
●
Optimization for INFORMATION_SCHEMA.{PROC,PARAMETERS}
10.12
●
New cost model
●
Fixed selectivity computations
27
Thanks for your attention!
