
MySQL Optimizer Overview


Presented at Percona Live Data Performance Conference 2016



  1. MySQL Optimizer Overview. Olav Sandstå, Senior Principal Engineer, MySQL Optimizer Team, Oracle. April 19, 2016. Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
  2. Program Agenda
       1. Logical transformations
       2. Cost-based optimizations
       3. Analyzing access methods
       4. Join optimizer
       5. Plan refinements
       6. Subquery optimizations
       7. Query execution plan
  3. MySQL Optimizer
     Inputs to the optimizer: the query, statistics (from the storage engines), and table/index info (from the data dictionary).
       SELECT a, b FROM t1 JOIN t2 ON t1.a = t2.b JOIN t3 ON t2.b = t3.c WHERE t2.d > 20 AND t2.d < 30;
     [Diagram: query plan joining t2, t3 and t1, using table scan, range scan and ref access as access methods.]
  4. MySQL Architecture
     [Diagram: SQL query ➜ Parser ➜ Resolver (semantic check, name resolution) ➜ Optimizer (logical transformations; cost-based optimizer: join order and access methods; plan refinement) ➜ query execution plan ➜ Query execution ➜ query result. Storage engines: InnoDB, MyISAM.]
  5. MySQL Optimizer Characteristics
     • Produces the query plan that uses the least resources
       – IO and CPU
     • Optimizes a single query
       – No inter-query optimizations
     • Produces a left-deep linear query execution plan
     [Diagram: left-deep join tree over t1, t2, t3 and t4, with table scan, range scan and ref access as access methods.]
  6. Optimizer Overview: Main phases
     • Logical transformations: negation elimination; equality and constant propagation; evaluation of constant expressions; conversion of outer to inner join; subquery transformation
     • Prepare for cost-based optimization: ref access analysis; range access analysis; estimation of condition fan out; constant table detection
     • Cost-based optimizer: access method selection; join order
     • Plan refinement: table condition pushdown; access method adjustments; sort avoidance; index condition pushdown
  7. Program Agenda, section 1: Logical transformations
  8. Logical Transformations
     • Logical transformations of query conditions:
       – Negation elimination
       – Equality propagation
       – Evaluation of constant expressions
       – Removal of trivial conditions
     • Conversion of outer to inner join
     • Merging of views and derived tables
     • Subquery transformations
     Result: a simpler query to optimize and execute, prepared for later optimizations.
  9. Example: Logical Transformations
     Original query:
       SELECT * FROM t1 JOIN t2 WHERE t1.a = t2.a AND t2.a = 9 AND (NOT (t1.a > 10 OR t2.b > 3) OR (t1.b = t2.b + 7 AND t2.b = 5));
     After negation elimination:
       t1.a = t2.a AND t2.a = 9 AND (t1.a <= 10 AND t2.b <= 3 OR (t1.b = t2.b + 7 AND t2.b = 5));
     After equality/constant propagation:
       t1.a = 9 AND t2.a = 9 AND (9 <= 10 AND t2.b <= 3 OR (t1.b = 5 + 7 AND t2.b = 5));
     After evaluation of constant expressions (9 <= 10 is TRUE):
       t1.a = 9 AND t2.a = 9 AND (9 <= 10 AND t2.b <= 3 OR (t1.b = 12 AND t2.b = 5));
     After trivial condition removal:
       t1.a = 9 AND t2.a = 9 AND (t2.b <= 3 OR (t1.b = 12 AND t2.b = 5));
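One way to convince yourself that the rewritten condition on slide 9 selects the same rows as the original is a brute-force comparison. The sketch below is not how the optimizer works; it simply evaluates both predicates over a small grid of integers. It assumes non-NULL integer columns (SQL's three-valued logic is ignored), and the function names are made up for illustration.

```python
from itertools import product

def original(t1a, t1b, t2a, t2b):
    # WHERE clause as written in the original query on slide 9
    return t1a == t2a and t2a == 9 and (
        not (t1a > 10 or t2b > 3) or (t1b == t2b + 7 and t2b == 5))

def transformed(t1a, t1b, t2a, t2b):
    # Condition after all four logical transformations
    return t1a == 9 and t2a == 9 and (
        t2b <= 3 or (t1b == 12 and t2b == 5))

# Compare both predicates for every combination of values in 0..14
values = range(0, 15)
mismatches = [row for row in product(values, repeat=4)
              if original(*row) != transformed(*row)]
print(len(mismatches))
```

An empty `mismatches` list means the two predicates agree on every tested combination; it is a sanity check over that grid, not a proof.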
  10. Program Agenda, section 2: Cost-based optimizations
  11. Cost-based Query Optimization
      General idea:
      • Assign cost to operations
      • Assign cost to partial or alternative plans
      • Search for the plan with the lowest cost
      [Diagram: join plan over t2, t3 and t1 with table scan, range scan and ref access.]
  12. Cost-based Query Optimizations
      The main cost-based optimizations:
      • Index and access method
        – Table scan
        – Index scan
        – Range scan
        – Index lookup (ref access)
      • Join order
      • Join buffering strategy
      • Subquery strategy
      [Diagram: join plan over t2, t3 and t1 with table scan, range scan and ref access.]
  13. Optimizer Cost Model
      [Diagram: for an operation such as a range scan on t1 or a JOIN, the cost model produces a cost estimate and a row estimate.]
      • Cost formulas: access methods, join, subquery
      • Cost constants: CPU, IO
      • Metadata: record and index size, index information, uniqueness
      • Statistics: table size, cardinality, range estimates
      • Cost model configuration (new in MySQL 5.7)
  14. Cost Estimates
      • The cost for executing a query
      • Cost unit:
        – “read a random data page from disk”
      • Main cost factors:
        – IO cost: #pages read from table; #pages read from index
        – CPU cost: evaluating query conditions; comparing keys/records; sorting keys
      • Main cost constants (new in MySQL 5.7: configurable):
          Cost                                      Default value
          Reading a random disk page                1.0
          Reading a data page from memory buffer    1.0
          Evaluating query condition                0.2
          Comparing key/record                      0.1
  15. Cost Model Examples
      Table scan:
      • IO cost: #pages in table * IO_BLOCK_READ_COST
      • CPU cost: #records * ROW_EVALUATE_COST
      Range scan (on secondary index):
      • IO cost: #records_in_range * IO_BLOCK_READ_COST
      • CPU cost: #records_in_range * ROW_EVALUATE_COST (evaluate range condition) + #records_in_range * ROW_EVALUATE_COST (evaluate WHERE condition)
      Example query:
        SELECT * FROM t1 WHERE a BETWEEN 20 AND 23
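The formulas on slide 15 can be tried out with a few lines of arithmetic. The sketch below plugs in the default constants from slide 14 (1.0 for a page read, 0.2 for evaluating a condition); the table size of 100 pages and 10,000 records and the range of 50 records are invented numbers, and the function names are illustrative rather than anything from the MySQL source.

```python
# Default cost constants as listed on slide 14
IO_BLOCK_READ_COST = 1.0
ROW_EVALUATE_COST = 0.2

def table_scan_cost(pages, records):
    # IO: one read per page; CPU: one condition evaluation per record
    return pages * IO_BLOCK_READ_COST + records * ROW_EVALUATE_COST

def range_scan_cost(records_in_range):
    # IO: slide 15 charges one page read per record in the range
    io_cost = records_in_range * IO_BLOCK_READ_COST
    # CPU: evaluate the range condition and the WHERE condition per record
    cpu_cost = 2 * records_in_range * ROW_EVALUATE_COST
    return io_cost + cpu_cost

# Hypothetical table: 100 pages, 10,000 records, 50 records in the range
print(table_scan_cost(100, 10_000))
print(range_scan_cost(50))
```

With these made-up numbers the range scan is far cheaper, which is the kind of comparison the optimizer makes when it picks an access method.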
  16. Program Agenda, section 3: Analyzing access methods
  17. Selecting Access Method
      Finding the optimal method to read data from the storage engine
      • For each table, find the best access method:
        1. Check if the access method is useful
        2. Estimate the cost of using the access method
        3. Select the cheapest to be used
      • Choice of access method is cost based
      Main access methods:
      • Table scan
      • Index scan
      • Index lookup (ref access)
      • Range scan
      • Index merge
      • Loose index scan
  18. Index Lookup (Ref Access)
      • Read all records with a given key value using an index
      • Examples:
          SELECT * FROM t1 WHERE t1.key = 7;
          SELECT * FROM t1, t2 WHERE t1.key = t2.key;
      • “eq_ref”:
        – Reading from a unique index, at most one record returned
      • “ref”:
        – Reading from a non-unique index or a prefix of an index, possibly multiple records returned
        – The record estimate is based on the cardinality number from index statistics
  19. Ref Access Analysis
      • Determine which indexes can be used for index lookup in a join
          SELECT city.name as capital, language.name FROM city JOIN country ON city.country_id = country.country_id JOIN language ON country.country_id = language.country_id WHERE city.city_id = country.capital;
      Tables in the example:
      • country (country_id, capital)
      • city (country_id, city_id, name)
      • language (country_id, name)
  20. Range Optimizer
      • Goal: find the “minimal” ranges to read for each index
      • Example:
          SELECT * FROM t1 WHERE (key1 > 10 AND key1 < 20) AND key2 > 30
      • Range scan using INDEX(key1): 10 < key1 < 20
      • Range scan using INDEX(key2): key2 > 30
  21. Range Optimizer, cont.
      • Range optimizer selects the “useful” parts of the WHERE condition:
        – Conditions comparing a column value with a constant: key > 3, key = 4, key IS NULL, key BETWEEN 4 AND 6, key LIKE "abc%", key IN (10,12,..)
        – Nested AND/OR conditions are supported
      • Result: list of disjoint ranges that need to be read from the index
      • Cost estimate based on the number of records in each range:
        – Record estimate is found by asking the storage engine (“index dives”)
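The idea of reducing nested AND/OR conditions to a list of disjoint ranges can be sketched with plain interval arithmetic. This is a simplification and not the range optimizer's actual data structure: it assumes an integer key and closed intervals `(low, high)`, and the helper names `union` and `intersect` are made up for the example.

```python
def union(ranges):
    """OR: merge overlapping closed intervals into a sorted disjoint list."""
    merged = []
    for low, high in sorted(ranges):
        if merged and low <= merged[-1][1]:
            # Overlaps the previous interval: extend it
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged

def intersect(left, right):
    """AND: keep only the parts covered by both interval lists."""
    parts = []
    for low1, high1 in left:
        for low2, high2 in right:
            low, high = max(low1, low2), min(high1, high2)
            if low <= high:
                parts.append((low, high))
    return union(parts)

# key BETWEEN 4 AND 6 OR key IN (10, 12) OR key = 5
ored = union([(4, 6), (10, 10), (12, 12), (5, 5)])
# ... AND key > 3 AND key <= 11, i.e. the closed integer interval 4..11
ranges = intersect(ored, [(4, 11)])
print(ored, ranges)
```

Each resulting interval corresponds to one range that would be read from the index; the optimizer would then ask the storage engine for a record estimate per range.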
  22. Range Access for Multi-part Index: example table with multi-part index
      • Table columns: pk, a, b, c, d
      • INDEX idx (a, b, c);
      • Logical storage layout of index: entries are ordered by a (values 10, 11, 12, 13), then by b within each a (values 1 to 5), then by c.
  23. Range Access for Multi-part Index, cont.
      • Equality on 1st index part?
        – Can add condition on 2nd index part to range condition
      • Example:
          SELECT * from t1 WHERE a IN (10,11,13) AND (b=2 OR b=4)
      • Resulting range scan: b = 2 and b = 4 under each of a = 10, a = 11 and a = 13 (six ranges)
  24. Range Access for Multi-part Index, cont.
      • Non-equality on 1st index part:
        – Can NOT add condition on 2nd index part in range condition
      • Example:
          SELECT * from t1 WHERE a > 10 AND a < 13 AND (b=2 OR b=4)
      • Resulting range scan: a > 10 AND a < 13
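The difference between slides 23 and 24 shows up when you count how many index entries each range scan has to read. The sketch below models the index as a sorted list of `(a, b)` pairs with the values from the slides (a in 10..13, b in 1..5); the third index part `c` is left out for brevity, and the variable names are illustrative.

```python
from itertools import product

# Index on (a, b): entries sorted by a, then b, as in the slide layout
index = sorted(product([10, 11, 12, 13], [1, 2, 3, 4, 5]))

# Slide 23: equality on a, so b can be part of the range condition.
# a IN (10, 11, 13) AND (b = 2 OR b = 4) becomes six point ranges.
point_ranges = set(product([10, 11, 13], [2, 4]))
read_with_equality = [entry for entry in index if entry in point_ranges]

# Slide 24: non-equality on a, so only "a > 10 AND a < 13" defines the range.
read_with_range = [entry for entry in index if 10 < entry[0] < 13]
# The condition on b is applied afterwards, to every entry that was read.
matching = [entry for entry in read_with_range if entry[1] in (2, 4)]

print(len(read_with_equality), len(read_with_range), matching)
```

In the equality case every entry read is a match; in the non-equality case the scan reads all entries for a = 11 and a = 12 and discards most of them afterwards.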
  25. Index Merge
      • Use multiple indexes on the same table
      • Implemented index merge strategies:
        – Index Merge Union: OR conditions between different indexes
        – Index Merge Intersect: AND conditions between different indexes
        – Index Merge Sort-Union: OR conditions where condition is a range
      • Example:
          SELECT * FROM t1 WHERE a=10 OR b=10
        INDEX(a) is read for a = 10, INDEX(b) is read for b = 10, and the result is the union of the two.
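Conceptually, index merge reads row identifiers from each index separately and then combines them. The sketch below models this with Python sets; the four rows and the dictionary-based "indexes" are invented for illustration and say nothing about how the storage engine implements it.

```python
# Hypothetical table: primary key -> (a, b)
rows = {1: (10, 3), 2: (7, 10), 3: (10, 10), 4: (5, 5)}

# Two single-column "indexes": column value -> set of primary keys
index_a, index_b = {}, {}
for pk, (a, b) in rows.items():
    index_a.setdefault(a, set()).add(pk)
    index_b.setdefault(b, set()).add(pk)

# Index Merge Union: WHERE a = 10 OR b = 10
union_ids = index_a.get(10, set()) | index_b.get(10, set())

# Index Merge Intersect: WHERE a = 10 AND b = 10
intersect_ids = index_a.get(10, set()) & index_b.get(10, set())

print(sorted(union_ids), sorted(intersect_ids))
```

Using sets also removes the duplicate that would otherwise appear for a row matching both conditions.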
  26. Loose Index Scan
      • Optimization for GROUP BY and DISTINCT:
          SELECT a, b FROM t1 GROUP BY a, b;
          SELECT DISTINCT a, b FROM t1;
          SELECT a, MIN(b) FROM t1 GROUP BY a;
      • GROUP BY/DISTINCT must be on the prefix of a multipart index
      [Diagram: index layout with a = 10, 11, 12, 13 and b = 1 to 5 under each value of a, followed by c.]
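For `SELECT a, MIN(b) FROM t1 GROUP BY a`, the point of a loose index scan is to read one entry per group and then jump to the next group instead of walking every entry. A minimal sketch, assuming the index is a sorted list of `(a, b)` pairs and using binary search as a stand-in for the index seek; the function name is made up.

```python
from bisect import bisect_right
from itertools import product

# Index on (a, b), sorted: a in 10..13, b in 1..5 as in the slide layout
index = sorted(product([10, 11, 12, 13], [1, 2, 3, 4, 5]))

def loose_scan_min_b(entries):
    """Return (a, MIN(b)) per group, reading one index entry per group."""
    result = []
    position = 0
    while position < len(entries):
        a, b = entries[position]
        # The first entry of each group holds the smallest b for that a
        result.append((a, b))
        # Seek past all remaining entries with the same a
        position = bisect_right(entries, (a, float("inf")))
    return result

groups = loose_scan_min_b(index)
print(groups, len(groups), "of", len(index), "entries read")
```

This only works because the grouping column is the leading part of the index, which matches the prefix requirement stated on the slide.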
  27. Program Agenda, section 4: Join optimizer
  28. Join Optimizer
      • Goal:
        – Given a JOIN of N tables, find the best JOIN ordering (N! possible plans)
      • “Greedy search strategy”:
        – Start with all 1-table plans
        – Expand each plan with remaining tables, depth-first
        – If “cost of partial plan” > “cost of best plan”: “prune” the plan
        – Heuristic pruning: prune less promising partial plans
      [Diagram: search tree of join orders over t1, t2, t3 and t4.]
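The depth-first search with cost-based pruning described on slide 28 can be sketched in a few lines. The cost model here is a deliberately crude stand-in, not MySQL's: the first table is scanned in full, every later table is read by ref access with a fixed number of matching rows per incoming row, and cost is simply the number of rows read. All row counts and the function name are invented for the example, and heuristic pruning is not modeled.

```python
def best_join_order(scan_rows, ref_rows):
    """Depth-first search over join orders, pruning on the best cost so far."""
    best = {"cost": float("inf"), "order": None}

    def extend(order, cost, prefix_rows):
        if cost >= best["cost"]:
            return  # prune: this partial plan cannot beat the best full plan
        remaining = [t for t in scan_rows if t not in order]
        if not remaining:
            best["cost"], best["order"] = cost, order
            return
        for table in remaining:
            if not order:
                rows_read = scan_rows[table]               # table scan of first table
            else:
                rows_read = prefix_rows * ref_rows[table]  # ref access per prefix row
            extend(order + (table,), cost + rows_read, rows_read)

    extend((), 0, 0)
    return best["order"], best["cost"]

# Hypothetical statistics: rows per table, and rows matched per ref lookup
scan_rows = {"city": 4000, "country": 240, "language": 980}
ref_rows = {"city": 1, "country": 1, "language": 4}
print(best_join_order(scan_rows, ref_rows))
```

With these numbers the search settles on starting from the smallest table, and several partial plans are abandoned as soon as their accumulated cost reaches the best complete plan found so far.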
  29. Join Optimizer Illustrated
        SELECT city.name as capital, language.name FROM city JOIN country ON city.country_id = country.country_id JOIN language ON country.country_id = language.country_id WHERE city.city_id = country.capital;
      [Diagram: search tree from “start” over the join orders of language, country and city; the plans shown have cost=26568, cost=32568, cost=627, cost=1245 and cost=862.]
  30. Record and Cost Estimates for JOIN: condition filter effect (new in MySQL 5.7)
      • t_x JOIN t_{x+1}
      • records(t_{x+1}) = records(t_x) * condition_filter_effect * records_per_key
      • records(t_x): number of records read from t_x
      • condition_filter_effect: records passing the table conditions on t_x
      • records_per_key: cardinality statistics for the index used for ref access on t_{x+1}
  31. How to Calculate Condition Filter Effect, step 1 (new in MySQL 5.7)
      A condition contributes to the condition filter effect for a table only if:
      – It references a field in the table
      – It is not used by the access method
      – It depends on an available value:
        • employee.name = "John" will always contribute to the filter on employee
        • employee.first_office_id <> office.id depends on the JOIN order
      Example:
        SELECT office_name FROM office JOIN employee WHERE office.id = employee.office_id AND employee.name = "John" AND employee.first_office_id <> office.id;
  32. How to Calculate Condition Filter Effect, step 2
      Filter estimate based on what is available:
      1. Range estimate
      2. Index statistics
      3. Guesstimate:
           Condition         Filter estimate
           =                 0.1
           <=, <, >, >=      1/3
           BETWEEN           1/9
           NOT <op>          1 – SEL(<op>)
           AND               P(A and B) = P(A) * P(B)
           OR                P(A or B) = P(A) + P(B) – P(A and B)
           …                 …
      Example:
        SELECT * FROM office JOIN employee ON office.id = employee.office_id WHERE office_name = "San Francisco" AND employee.name = "John" AND age > 21 AND hire_date BETWEEN "2014-01-01" AND "2014-06-01";
  33. Calculating Condition Filter Effect for Tables: Example
        SELECT * FROM office JOIN employee ON office.id = employee.office_id WHERE office_name = "San Francisco" AND employee.name = "John" AND age > 21 AND hire_date BETWEEN "2014-01-01" AND "2014-06-01";
      Estimates per condition:
      • office_name = "San Francisco": 0.03 (index)
      • employee.name = "John": 0.1 (guesstimate)
      • age > 21: 0.89 (range)
      • hire_date BETWEEN "2014-01-01" AND "2014-06-01": 0.11 (guesstimate)
      Condition filter effect for tables:
      – office: 0.03
      – employee: 0.1 * 0.11 * 0.89
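The combination rules from slide 32 and the join estimate from slide 30 amount to a little probability arithmetic. The sketch below applies them to the numbers on slide 33; the helper names are invented, the OR rule assumes independent conditions as the AND rule on the slide does, and the 1,000 prefix records and 20 records per key in the last call are made-up inputs.

```python
def sel_and(*selectivities):
    # P(A and B) = P(A) * P(B), assuming independent conditions
    result = 1.0
    for s in selectivities:
        result *= s
    return result

def sel_or(a, b):
    # P(A or B) = P(A) + P(B) - P(A and B)
    return a + b - sel_and(a, b)

def sel_not(a):
    # NOT <op>: 1 - SEL(<op>)
    return 1.0 - a

def joined_records(prefix_records, condition_filter_effect, records_per_key):
    # records(t_{x+1}) = records(t_x) * condition_filter_effect * records_per_key
    return prefix_records * condition_filter_effect * records_per_key

# Slide 33: name = "John" (0.1), hire_date BETWEEN (0.11), age > 21 (0.89)
employee_filter = sel_and(0.1, 0.11, 0.89)
office_filter = 0.03

print(employee_filter, office_filter)
# Hypothetical inputs: 1,000 office rows read, 20 employees per office_id
print(joined_records(1000, office_filter, 20))
```

The employee filter comes out a little under 1%, which illustrates how quickly independent guesstimates multiply down to a small fraction.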
  34. Program Agenda, section 5: Plan refinements
  35. Finalizing Query Plan
      Main optimizations:
      • Assigning query conditions to tables
        – Evaluate conditions as early as possible in join order
      • ORDER BY optimization: avoid sorting
        – Change to different index
        – Read in descending order
      • Change to a cheaper access method
        – Example: Use range scan instead of table scan or ref access
      • Index Condition Pushdown
  36. ORDER BY Optimizations
      • General solution: “File sort”
        – Store query result in temporary table before sorting
        – If data volume is large, may need to sort in several passes with intermediate storage on disk
      • Optimizations:
        – Switch to use an index that provides the result in sorted order
        – For “LIMIT n” queries, maintain a priority queue of the n top items in memory instead of file sort
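The "LIMIT n" optimization keeps only the n best candidates in memory while streaming through the rows, rather than sorting everything. A minimal sketch of that idea using Python's `heapq` as the priority queue; it sorts plain integers in ascending order, and the function name is made up for the example.

```python
import heapq

def top_n(values, n):
    """Return the n smallest values in order, holding at most n in memory."""
    if n <= 0:
        return []
    heap = []  # max-heap of the current n smallest values, stored negated
    for value in values:
        if len(heap) < n:
            heapq.heappush(heap, -value)
        elif value < -heap[0]:
            # Smaller than the largest value kept so far: replace it
            heapq.heapreplace(heap, -value)
    return sorted(-item for item in heap)

# Equivalent in spirit to: SELECT v FROM t ORDER BY v LIMIT 2
print(top_n([7, 3, 9, 1, 5], 2))
```

Memory use is bounded by n regardless of how many rows are scanned, which is why this avoids the multi-pass sort with intermediate storage on disk.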
  37. Index Condition Pushdown
      • Pushes conditions that can be evaluated on the index down to the storage engine
        – Works only on indexed columns
      • Goal: evaluate conditions without having to access the actual record
        – Reduces the number of disk/block accesses
        – Reduces CPU usage
      [Diagram: the MySQL server passes query conditions to the storage engine, which holds the index and the table data.]
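The saving from index condition pushdown is the number of full-row fetches avoided. The sketch below imitates that with a hypothetical table indexed on `(zip, last_name)` and a query for `zip = 95054` with a last name ending in "son"; the table, its contents and the `scan` function are all invented for illustration.

```python
# Hypothetical table: primary key -> full row
table = {
    1: {"zip": 95054, "last_name": "Olson"},
    2: {"zip": 95054, "last_name": "Smith"},
    3: {"zip": 95054, "last_name": "Hanson"},
    4: {"zip": 10001, "last_name": "Jackson"},
    5: {"zip": 95054, "last_name": "Lee"},
}
# Secondary index on (zip, last_name); each entry also carries the primary key
index = sorted((row["zip"], row["last_name"], pk) for pk, row in table.items())

def scan(pushdown):
    """Return (matching primary keys, number of full rows fetched)."""
    fetched, result = 0, []
    for zip_code, last_name, pk in index:
        if zip_code != 95054:
            continue  # outside the range read from the index
        if pushdown and not last_name.endswith("son"):
            continue  # condition checked on the index entry, row never fetched
        row = table[pk]  # full-row fetch
        fetched += 1
        if row["last_name"].endswith("son"):
            result.append(pk)
    return sorted(result), fetched

print(scan(pushdown=False), scan(pushdown=True))
```

Both calls return the same rows; only the number of full-row fetches differs.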
  38. Program Agenda, section 6: Subquery optimizations
  39. Overview of Subquery Optimizations
      Subquery category:
      • IN (SELECT …)
      • NOT IN (SELECT …)
      • FROM (SELECT …)
      • <CompOp> ALL/ANY (SELECT ..)
      • EXISTS/other
      Strategy:
      • Semi-join
      • Materialization
      • IN ➜ EXISTS
      • Merged
      • Materialized
      • MAX/MIN re-write
      • Execute subquery
      Slide annotation: New in MySQL 5.7
  40. Optimization of IN Subqueries: A. Semi-join Transformation
      1. Transform IN (and =ANY) subquery to semi-join:
           SELECT * FROM t1 WHERE query_where AND outer_expr IN (SELECT inner_expr FROM t2 WHERE cond2)
         becomes
           SELECT * FROM t1 SEMIJOIN t2 ON outer_expr = inner_expr WHERE query_where AND cond2
      2. Apply transformations/strategies for avoiding/removing duplicates: Table pullout, Duplicate Weedout, First Match, LooseScan, Semi-join materialization
      3. Optimize using the cost-based JOIN optimizer
  41. Optimization of IN Subqueries, cont.: B. Subquery Materialization
      • Only for non-correlated subqueries
      • Execute subquery once
        – Store result in temporary table with unique index (removes duplicates)
      • Outer query does lookup in temporary table
          SELECT title FROM film WHERE film_id IN (SELECT film_id FROM actor WHERE name="Bullock")
      [Diagram: the subquery is materialized into an indexed temporary table, which the outer query then uses for lookups.]
  42. Optimization of IN Subqueries, cont.: C. IN ➜ EXISTS transformation
      • Convert IN subquery to EXISTS subquery by “pushing down” the IN-equality to the subquery:
          SELECT title FROM film WHERE film_id IN (SELECT film_id FROM actor WHERE name="Bullock")
        becomes
          SELECT title FROM film WHERE EXISTS (SELECT 1 FROM actor WHERE name="Bullock" AND film.film_id = actor.film_id)
      • Benefit: subquery will evaluate fewer records
      • Note: special handling if pushed down expressions can be NULL
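Strategies B and C produce the same rows by different routes, which can be imitated on in-memory data. In the sketch below a Python set plays the role of the temporary table with a unique index, and `any(...)` plays the role of the correlated EXISTS subquery. The sample rows and function names are made up, and NULL handling, which the slide flags as a special case, is ignored.

```python
# Hypothetical sample data
film = [(1, "Film A"), (2, "Film B"), (3, "Film C")]  # (film_id, title)
actor = [(1, "Bullock"), (2, "Bullock"), (2, "Clooney"), (3, "Weaver")]  # (film_id, name)

def in_by_materialization():
    # B: run the subquery once and store distinct film_ids (the set removes duplicates)
    materialized = {film_id for film_id, name in actor if name == "Bullock"}
    # The outer query does one lookup per film
    return [title for film_id, title in film if film_id in materialized]

def in_by_exists():
    # C: the IN-equality is pushed into the subquery, which runs once per film
    return [
        title
        for film_id, title in film
        if any(name == "Bullock" and actor_film_id == film_id
               for actor_film_id, name in actor)
    ]

print(in_by_materialization(), in_by_exists())
```

Which route is cheaper depends on the data: materialization pays once for building the lookup structure, while the EXISTS form pays per outer row but can stop at the first match.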
  43. Program Agenda, section 7: Query execution plan
  44. Understanding the Query Plan: EXPLAIN
      • Use EXPLAIN to print the final query plan:
          EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.a = t2.a WHERE b > 10 AND c > 10;
          +----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
          | id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
          +----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
          |  1 | SIMPLE      | t1    | range | PRIMARY,idx1  | idx1 | 4       | NULL |   12 |    33.33 | Using index condition |
          |  2 | SIMPLE      | t2    | ref   | idx2          | idx2 | 4       | t1.a |    1 |   100.00 | NULL                  |
          +----+-------------+-------+-------+---------------+------+---------+------+------+----------+-----------------------+
      • Explain for a running query (new in MySQL 5.7):
          EXPLAIN FOR CONNECTION connection_id;
  45. Understanding the Query Plan: Structured EXPLAIN (added in MySQL 5.7)
      • JSON format: EXPLAIN FORMAT=JSON SELECT …
      • Contains more information:
        – Used index parts
        – Pushed index conditions
        – Cost estimates
        – Data estimates
      Example:
        EXPLAIN FORMAT=JSON SELECT * FROM t1 WHERE b > 10 AND c > 10;
        {
          "query_block": {
            "select_id": 1,
            "cost_info": { "query_cost": "17.81" },
            "table": {
              "table_name": "t1",
              "access_type": "range",
              "possible_keys": [ "idx1" ],
              "key": "idx1",
              "used_key_parts": [ "b" ],
              "key_length": "4",
              "rows_examined_per_scan": 12,
              "rows_produced_per_join": 3,
              "filtered": "33.33",
              "index_condition": "(`test`.`t1`.`b` > 10)",
              "cost_info": { "read_cost": "17.01", "eval_cost": "0.80", "prefix_cost": "17.81", "data_read_per_join": "63" },
              ………
              "attached_condition": "(`test`.`t1`.`c` > 10)"
            }
          }
        }
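Because the structured EXPLAIN output is plain JSON, it is easy to post-process. The sketch below parses a trimmed copy of the slide's output with Python's standard `json` module and checks how the cost fields relate. The document string is hand-copied from the slide with the elided part and a few fields left out, and the cost values are strings, so they are converted with `Decimal` to avoid floating-point noise.

```python
import json
from decimal import Decimal

# Trimmed copy of the EXPLAIN FORMAT=JSON output shown on slide 45
explain_json = """
{
  "query_block": {
    "select_id": 1,
    "cost_info": { "query_cost": "17.81" },
    "table": {
      "table_name": "t1",
      "access_type": "range",
      "key": "idx1",
      "used_key_parts": [ "b" ],
      "rows_examined_per_scan": 12,
      "rows_produced_per_join": 3,
      "filtered": "33.33",
      "cost_info": {
        "read_cost": "17.01",
        "eval_cost": "0.80",
        "prefix_cost": "17.81",
        "data_read_per_join": "63"
      }
    }
  }
}
"""

plan = json.loads(explain_json)
table = plan["query_block"]["table"]
cost = table["cost_info"]

# In this example, read cost plus evaluation cost adds up to the prefix cost
total = Decimal(cost["read_cost"]) + Decimal(cost["eval_cost"])
print(table["table_name"], table["access_type"], table["key"], total)
```

For this single-table query the sum also equals the `query_cost` reported for the whole query block.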
  46. Understanding the Query Plan: Visual Explain in MySQL Workbench
  47. Optimizer Trace: understand HOW a query is optimized
      • Trace of the main steps and decisions made by the optimizer
          SET optimizer_trace="enabled=on";
          SELECT * FROM t1 WHERE a > 10;
          SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
      Excerpt from the trace:
          "table": "`t1`",
          "range_analysis": {
            "table_scan": { "rows": 54, "cost": 13.9 },
            "best_covering_index_scan": { "index": "idx", "cost": 11.903, "chosen": true },
            "analyzing_range_alternatives": {
              "range_scan_alternatives": [
                { "index": "idx", "ranges": [ "10 < a" ], "rowid_ordered": false, "using_mrr": false, "index_only": true, "rows": 12, "cost": 3.4314, "chosen": true }
  48. Influencing the Optimizer
      When the optimizer does not do what you want:
      • Add indexes
      • Use hints (new hint syntax and new hints in MySQL 5.7):
        – Index hints: USE INDEX, FORCE INDEX, IGNORE INDEX
        – Join order: STRAIGHT_JOIN
        – Subquery strategy: /*+ SEMIJOIN(FirstMatch) */
        – Join buffer strategy: /*+ BKA(table1) */
      • Adjust optimizer_switch flags:
        – set optimizer_switch="condition_fanout_filter=OFF"
      • Ask a question in the MySQL optimizer forum
  49. Summary
      • Query transformations
      • Selecting data access method
      • Join optimizer
      • Subquery optimizations
      • Plan refinements
      Questions?
      Related session: “Optimizer: What's New in 5.7 and Sneak Peek at 5.8”, Thursday at 11:00, Ballroom C
