ANALYZE for executable statements - a new way to do optimizer troubleshooting in MariaDB 10.1
1. A new way to do optimizer troubleshooting in MariaDB 10.1
ANALYZE for executable statements
Sergei Petrunia, MariaDB
Percona Live Santa Clara
April 2015
19. 19
EXPLAIN FORMAT=JSON
MySQL 5.6 introduced EXPLAIN FORMAT=JSON
• Good! It shows more info (http://s.petrunia.net/blog/?p=83)
• But it has bugs
Bug#69567: EXPLAIN FORMAT=JSON lists subquery in optimized_away_subqueries, but it is run
Bug#69795: EXPLAIN FORMAT=JSON doesn't show Using filesort for UNION
Bug#74462: EXPLAIN FORMAT=JSON produces ordering_operation when no ordering takes place
Bug#74661: EXPLAIN FORMAT=JSON says two temptables are used, execution shows just one
Bug#74744: EXPLAIN FORMAT=JSON produces duplicates_removal where there is none
[no bug#]: EXPLAIN FORMAT=JSON shows the same subquery as two different subqueries
…
• And we were not happy with output
– Even MySQL Workbench choked on it (http://s.petrunia.net/blog/?p=93)
– “JSON format” != “print tabular EXPLAIN in JSON”
19
INSERT:EXPLAINFORMAT=JSON
20. 20
EXPLAIN FORMAT=JSON in MariaDB 10.1
Improved over MySQL 5.6
• Attached conditions printout is more readable
– No ridiculous overquoting
– Subqueries are not printed in full
• JSON pretty-printer is smarter
• Index Merge output is JSON-ish, shows used_key_parts
• Range checked for each record output is JSON-ish, shows more info
• “Full scan on NULL key” prints JSON, not “index map: 0xNNN”
• Query plans for “Using Join buffer” show more details
• …
• !Alas, some ORDER/GROUP BY problems remain*
20
INSERT:EXPLAINFORMAT=JSON
21. 21
ANALYZE FORMAT=JSON
21
• Works like ANALYZE
• Produces EXPLAIN FORMAT=JSON like output
– with more data.
EXPLAIN
FORMAT=JSON
+ ANALYZE = ANALYZE FORMAT=JSON
22. 22
ANALYZE FORMAT=JSON basics
• Consider an example
22
analyze select *
from
lineitem, orders
where
o_orderkey=l_orderkey and
o_orderdate between '1990-01-01' and '1998-12-06' and
l_extendedprice > 1000000
+--+-----------+--------+----+-------------+-------+-------+-----------------+-------+----------+--------+----------+-----------+
|id|select_type|table |type|possible_keys|key |key_len|ref |rows |r_rows |filtered|r_filtered|Extra |
+--+-----------+--------+----+-------------+-------+-------+-----------------+-------+----------+--------+----------+-----------+
|1 |SIMPLE |orders |ALL |PRIMARY,i_...|NULL |NULL |NULL |1498194|1500000.00| 50.00 |100.00 |Using where|
|1 |SIMPLE |lineitem|ref |PRIMARY,i_...|PRIMARY|4 |orders.o_orderkey|1 |4.00 | 100.00 |0.00 |Using where|
+--+-----------+--------+----+-------------+-------+-------+-----------------+-------+----------+--------+----------+-----------+
27. 27
ANALYZE and subqueries summary
27
• query_block.r_loops
number of times the subquery executed
• query_block.r_total_time_ms
– total time spent
– includes tables, children subqueries
• Again: can instantly see the most expensive subquery.
28. 28
ANALYZE and join buffer
28
• Join buffer optimization
– Reads rows into buffer, then sorts
– EXPLAIN somewhat misleading
– @@join_buffer_size?
+--+-----------+-----+----+-------------+----+-------+----+----+-----------------------------------------------+
|id|select_type|table|type|possible_keys|key |key_len|ref |rows|Extra |
+--+-----------+-----+----+-------------+----+-------+----+----+-----------------------------------------------+
|1 |SIMPLE |t2 |ALL |NULL |NULL|NULL |NULL|820 |Using where |
|1 |SIMPLE |t1 |ALL |NULL |NULL|NULL |NULL|889 |Using where; Using join buffer (flat, BNL join)|
+--+-----------+-----+----+-------------+----+-------+----+----+-----------------------------------------------+
select * from t1, t2 where t1.col1<100 and t2.col1<100 and t1.col2=t2.col2
30. 30
ORDER/GROUP BY optimization
30
• “Late” choice if/how do sorting/grouping
– Different execution paths for EXPLAIN and SELECT
– They do not match :-)
• A lot of problems:
Bug#69795: EXPLAIN FORMAT=JSON doesn't show Using filesort for UNION
Bug#74462: EXPLAIN FORMAT=JSON produces ordering_operation when no ordering takes
place
Bug#74661: EXPLAIN FORMAT=JSON says two temptables are used, execution shows just one
Bug#74744: EXPLAIN FORMAT=JSON produces duplicates_removal where there is none
Bug#76679: EXPLAIN incorrectly shows Distinct for tables using join buffer
…?
• MySQL 5.6: filesort/priority_queue continues the pattern
– Not visible in EXPLAIN.
31. 31
ORDER/GROUP BY optimization
31
ANALYZE FORMAT=JSON
• Tracks how the query executed
– Whether sorting was done (and at which stage)
– Whether join result was buffered in a temp.table
– Whether duplicate removal was done
• => It's a way to know how what really was executed.
32. 32
ANALYZE and filesort: example #1
32
• Consider an example: raise priority for 10 earliest orders
update orders
set o_shippriority=o_shippriority+1
where
o_clerk='Clerk#000000001'
order by
o_shipDATE
limit 10;
+--+-----------+------+-----+--------------------+--------------------+-------+----+----+---------------------------+
|id|select_type|table |type |possible_keys |key |key_len|ref |rows|Extra |
+--+-----------+------+-----+--------------------+--------------------+-------+----+----+---------------------------+
|1 |SIMPLE |orders|range|i_o_order_clerk_date|i_o_order_clerk_date|16 |NULL|1466|Using where; Using filesort|
+--+-----------+------+-----+--------------------+--------------------+-------+----+----+---------------------------+
• Let's run ANALYZE
– (CAUTION: ANALYZE UPDATE will do the updates!)
34. 34
ANALYZE and filesort: example #2
34
Now, delete these orders
delete from orders
where
o_clerk='Clerk#000000001'
order by
o_shipDATE
limit 10;
+--+-----------+------+-----+--------------------+--------------------+-------+----+----+---------------------------+
|id|select_type|table |type |possible_keys |key |key_len|ref |rows|Extra |
+--+-----------+------+-----+--------------------+--------------------+-------+----+----+---------------------------+
|1 |SIMPLE |orders|range|i_o_order_clerk_date|i_o_order_clerk_date|16 |NULL|1466|Using where; Using filesort|
+--+-----------+------+-----+--------------------+--------------------+-------+----+----+---------------------------+
EXPLAIN is the same as in UPDATE
36. 36
ANALYZE and “range checked for each record”
36
• Optimization for non-equality joins
• Example:
orders with nearby shipdate and nearby order date
select * from orders A, orders B
where
A.o_clerk='Clerk#000000001' and
B.o_orderdate between DATE_SUB(A.o_orderdate, interval 1 day) and
DATE_ADD(A.o_orderdate, interval 1 day)
and
B.o_shipdate between DATE_SUB(A.o_shipdate, interval 1 day) and
DATE_ADD(A.o_shipdate, interval 1 day)
37. 37
ANALYZE and “range checked for each record”
37
select * from orders A, orders B
where
A.o_clerk='Clerk#000000001' and
B.o_orderdate between DATE_SUB(A.o_orderdate, interval 1 day) and
DATE_ADD(A.o_orderdate, interval 1 day)
and
B.o_shipdate between DATE_SUB(A.o_shipdate, interval 1 day) and
DATE_ADD(A.o_shipdate, interval 1 day)
+--+-----------+-----+----+------------------------+----------+-------+-...
|id|select_type|table|type|possible_keys |key |key_len|
+--+-----------+-----+----+------------------------+----------+-------+-...
|1 |SIMPLE |A |ref |i_o_order_clerk_date |i_o_clerk |16 |
|1 |SIMPLE |B |ALL |i_o_orderdate,o_shipDATE|NULL |NULL |
+--+-----------+-----+----+------------------------+----------+-------+-...
..-+-----+-------+-----------------------------------------------+
|ref |rows |Extra |
..-+-----+-------+-----------------------------------------------+
|const|1466 |Using index condition |
|NULL |1499649|Range checked for each record (index map: 0x22)|
..-+-----+-------+-----------------------------------------------+
39. 39
Final bits
39
• Target version: MariaDB 10.1
• Current status: BETA
– Good enough for joins
– Will add the missing bits.
• log_slow_verbosity=explain prints ANALYZE.
40. 40
Conclusions
40
• MariaDB 10.1 adds new commands
– ANALYZE statement
– ANALYZE FORMAT=JSON statement
• Show details about query execution
• Help in diagnosing the optimizer.