Understanding Firebird optimizer, by Dmitry Yemanov (in English)

Published in: Technology

Transcript

  • Understanding Firebird optimizer. Dmitry Yemanov, [email_address], Firebird Project
  • Optimizer Keypoints
      • Allow the data to be retrieved in the most efficient way possible
          • Analyze the existing statistical information
          • Inject additional predicates
          • Order operations by priority
          • Try different join permutations
      • Strategies
          • Rule-based (heuristics)
          • Cost-based (statistics)
          • Mixed
  • Optimizer Algorithm
      • Preparation
          • Expand views
          • Separate predicates: «base», «parent», «missing»
          • Distribute equalities
          • Generate index mappings
      • Main stage
          • Calculate cost for different join orders
          • Choose the best index coverage for the given join order
          • Ensure early predicate evaluation
          • Decide between index navigation and sorting
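The "distribute equalities" step can be illustrated with a small sketch. It assumes the step means deriving equalities implied by transitivity (if A = B and B = C, then A = C), which gives the optimizer more join conditions to match against indices. The function name `distribute_equalities` and the union-find approach are illustrative assumptions, not Firebird's actual implementation.

```python
from itertools import combinations


def distribute_equalities(equalities):
    """Return equalities implied by transitivity but not given explicitly.

    `equalities` is an iterable of (left, right) column-name pairs.
    Hypothetical helper, for illustration only.
    """
    parent = {}

    def find(x):
        # Follow parent links up to the representative of x's class.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    given = set()
    for left, right in equalities:
        given.add(frozenset((left, right)))
        parent[find(left)] = find(right)  # merge the two classes

    # Group columns by equivalence class.
    classes = {}
    for col in list(parent):
        classes.setdefault(find(col), []).append(col)

    # Every pair inside a class is equal; report the pairs not given.
    implied = []
    for members in classes.values():
        for a, b in combinations(sorted(members), 2):
            if frozenset((a, b)) not in given:
                implied.append((a, b))
    return sorted(implied)


print(distribute_equalities([("T1.ID", "T2.ID"), ("T2.ID", "T3.ID")]))
```

For the two join conditions in the usage line, the only implied equality is T1.ID = T3.ID, which would let T1 and T3 be joined directly.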
  • Rule-based Approach
      • Heuristic assumptions
          • Indexed retrieval is better than a full table scan
          • Loop join (indexed) is better than a merge join
          • Index b-tree has three levels of depth
          • A compound index is better than several simple ones
      • Drawbacks
          • Indices may not be a good fit for some operations
          • Not ready for «ad hoc» queries
  • Cost-based Approach
      • Key ideas
          • Every operation has an associated cost value
          • Cost is calculated using the statistical information
          • Cost is aggregated from the bottom up in the access path
      • Drawbacks
          • Complex implementation
          • Slower optimization process
          • Requires up-to-date statistics
  • Basic Terms
      • Selectivity
          • Represents a fraction of rows from a row set
          • Value range is 0.0 to 1.0
      • Cardinality
          • Represents the number of rows in a row set
          • Base cardinality is the number of rows in a base table
      • Cost
          • Represents the computational complexity of the retrieval
          • Is a function of the estimated cardinalities
          • Depends linearly on the number of logical reads (page fetches)
  • Cost Measurement
      • Full table scan
          • cost = <base cardinality>
      • Unique index scan + table scan
          • cost = <b-tree level> + 1
      • Range index scan + table scan
          • cost = <b-tree level> + N + <selectivity> * <base cardinality>
          • N is the number of leaf pages to be scanned and thus depends on the average key length
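The three cost formulas on this slide can be written down directly. The sketch below is a plain transcription of them; the function and parameter names, and the sample numbers in the usage lines, are assumptions made for illustration.

```python
def full_scan_cost(base_cardinality):
    # cost = <base cardinality>
    return base_cardinality


def unique_index_cost(btree_level):
    # cost = <b-tree level> + 1 (walk the b-tree, then fetch one record)
    return btree_level + 1


def range_index_cost(btree_level, leaf_pages, selectivity, base_cardinality):
    # cost = <b-tree level> + N + <selectivity> * <base cardinality>
    # leaf_pages is N, the number of index leaf pages to be scanned.
    return btree_level + leaf_pages + selectivity * base_cardinality


# Hypothetical table: 1000 rows, 3-level b-tree, 40 leaf pages in the index.
print(full_scan_cost(1000))
print(unique_index_cost(3))
print(range_index_cost(3, 10, 0.25, 1000))  # selective range
print(range_index_cost(3, 40, 1.0, 1000))   # range that matches every row
```

With these sample numbers the selective range scan is far cheaper than the full scan, while the range that matches every row costs more than the full scan because the index pages are read in addition to the records.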
  • Cost Aggregation
      SELECT * FROM T1 JOIN T2 ON T1.PK = T2.FK
      WHERE T1.VAL < 100
      ORDER BY T1.RANK

      PLAN SORT ( JOIN ( T1 NATURAL, T2 INDEX (FK) ) )

      Table T1: base cardinality = 1000
      Table T2: base cardinality = 5000
      Index FK: selectivity = 0.001

      Operation       Cost   Cardinality
      Final Row Set   5000   2500
      Sort            5000   2500
      Loop Join       4000   2500
      Filter          1000   500
      Full Scan       1000   1000
      Index Scan      7      5
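The cardinalities in this example can be re-derived from the stated statistics, as sketched below. The helper name is hypothetical, and the filter selectivity of 0.5 is only implied by the slide (500 of 1000 rows). The cost figures are taken from the slide as given and are not re-derived here.

```python
def estimate_loop_join_rows(outer_rows, inner_base_rows, inner_index_selectivity):
    """Estimate rows produced by an indexed loop join.

    Each outer row probes the inner index, which is expected to return
    inner_index_selectivity * inner_base_rows rows per probe.
    Returns (join_rows, rows_per_probe).
    """
    rows_per_probe = inner_index_selectivity * inner_base_rows
    return outer_rows * rows_per_probe, rows_per_probe


# Figures from the slide.
T1_ROWS = 1000
T2_ROWS = 5000
FK_SELECTIVITY = 0.001
FILTER_ROWS = 500  # rows of T1 expected to pass T1.VAL < 100

# Selectivity implied for the filter (not stated explicitly on the slide).
filter_selectivity = FILTER_ROWS / T1_ROWS

join_rows, rows_per_probe = estimate_loop_join_rows(
    FILTER_ROWS, T2_ROWS, FK_SELECTIVITY
)
print(filter_selectivity, round(rows_per_probe), round(join_rows))
```

The index scan cardinality (5) and the loop join cardinality (2500) match the values shown for those operations on the slide.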
  • Statistics
      • What is it?
          • Information that describes data amounts and the distribution of values at different levels (table / index / column)
      • Where is it located?
          • Stored in the database
          • Calculated «on the fly»
      • How is it updated?
          • By user request (SET STATISTICS)
          • On index creation / activation
          • On database restore
  • Core Statistics
      • Base cardinality (number of rows in a table)
          • For small tables: number of used record slots on data pages
          • For large tables: estimated from the number of data pages and the average record length
          • Estimated at runtime using a page scan
      • Index selectivity
          • 1 / number of distinct keys
          • Maintained per segment: (A), (A, B), (A, B, C)
          • Uniform value distribution is assumed
          • Stored on the index root page, visible through RDB$INDICES and RDB$INDEX_SEGMENTS
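A minimal sketch of per-segment selectivity follows, assuming "per segment" means one value for each leading prefix of a compound key, computed as 1 / number of distinct prefix values. The function name and the sample keys are hypothetical; a real engine would derive these figures while scanning the index rather than from an in-memory list.

```python
def segment_selectivities(keys, num_segments):
    """Return 1 / distinct-count for each leading prefix of a compound key.

    `keys` is a list of tuples, one per index entry, e.g. (A, B, C).
    Result index 0 is for (A), index 1 for (A, B), and so on.
    """
    if not keys:
        raise ValueError("selectivity is undefined for an empty index")
    result = []
    for n in range(1, num_segments + 1):
        distinct_prefixes = {key[:n] for key in keys}
        result.append(1.0 / len(distinct_prefixes))
    return result


# Hypothetical index on (A, B, C) with four entries.
keys = [(1, "a", 10), (1, "a", 20), (1, "b", 10), (2, "a", 10)]
print(segment_selectivities(keys, 3))
```

In this sample, (A) has 2 distinct values, (A, B) has 3 and (A, B, C) has 4, so the selectivity improves (gets smaller) as more segments are matched.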
  • Advanced Statistics
      • Table level
          • Average page fill factor
          • Average record length
      • Index level
          • B-tree depth
          • Average key length
          • Clustering factor
      • Column level
          • Number of NULLs
          • Value distribution histograms
  • Clustering Factor
      [Figure: two diagrams mapping Index Keys 1 to 5 to data pages. Bad clustering factor: the keys point to scattered data pages (12, 25, 28, 57, 44). Good clustering factor: the keys point to adjacent data pages (12, 13, 14).]
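One common way to quantify the idea in this figure is to walk the index in key order and count how often the data page changes. The sketch below uses that definition as an assumption; the slides do not say how Firebird computes the value. The page numbers come from the figure, but the assignment of five keys to the three "good" pages is made up for illustration.

```python
def page_switches(pages_in_key_order):
    """Count how often consecutive index entries point to a different data page."""
    switches = 0
    for prev, cur in zip(pages_in_key_order, pages_in_key_order[1:]):
        if cur != prev:
            switches += 1
    return switches


scattered = [12, 25, 28, 57, 44]  # every key lives on a different page
clustered = [12, 12, 13, 13, 14]  # neighbouring keys share pages (assumed layout)

print(page_switches(scattered), page_switches(clustered))
```

Fewer page switches mean that reading rows in index order touches fewer distinct pages, which is what makes index navigation competitive with an external sort.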
  • Decisions Based on Statistics
      • Full table scan vs indexed retrieval
          • A high selectivity value (close to 1.0, meaning most rows match) suggests a full table scan
      • Order of streams in loop joins
          • Calculate costs of different stream permutations and choose the cheapest one
      • Loop join vs merge join
          • Calculate costs of different stream permutations
      • Index navigation vs external sorting
          • Depends on the clustering factor
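Choosing the order of streams by comparing permutations can be sketched as a brute-force search. The cost model below is a toy assumption (full scan of the first stream, one index probe per intermediate row for each later stream), and all names and numbers are hypothetical; it is not the cost function Firebird uses.

```python
from itertools import permutations


def loop_join_cost(order, base_rows, rows_per_probe, probe_cost):
    """Toy cost of a nested loop join for one stream order."""
    first = order[0]
    cost = base_rows[first]  # full scan of the leading stream
    rows = base_rows[first]
    for stream in order[1:]:
        cost += rows * probe_cost[stream]  # one index probe per row so far
        rows *= rows_per_probe[stream]     # rows flowing to the next stream
    return cost


def cheapest_order(streams, base_rows, rows_per_probe, probe_cost):
    """Try every permutation of streams and return the cheapest one."""
    return min(
        permutations(streams),
        key=lambda order: loop_join_cost(order, base_rows, rows_per_probe, probe_cost),
    )


# Hypothetical statistics for two tables.
base_rows = {"T1": 1000, "T2": 5000}
rows_per_probe = {"T1": 1, "T2": 5}  # rows returned per index probe
probe_cost = {"T1": 4, "T2": 7}      # cost of one index probe

print(cheapest_order(["T1", "T2"], base_rows, rows_per_probe, probe_cost))
```

Here starting from T1 costs 1000 + 1000 * 7 = 8000, while starting from T2 costs 5000 + 5000 * 4 = 25000, so the search picks T1 first. The number of permutations grows factorially with the number of streams, which is one reason a real optimizer has to prune the search.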
  • Decisions Based on Statistics (cont'd)
      • Which indices to use
          • Compare index selectivities and index scan costs
          • Estimate how many indices would work best
          • Consider segment operations for compound indices
          • Special handling of different comparisons
          • Calculate selectivities for AND / OR operations
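Combining selectivities for AND / OR can be sketched with the textbook formulas that treat the two predicates as statistically independent. This is an assumption made for illustration; the slides do not state which formulas Firebird applies.

```python
def and_selectivity(s1, s2):
    # Both predicates must hold: multiply the fractions.
    return s1 * s2


def or_selectivity(s1, s2):
    # Either predicate may hold: add the fractions, then subtract
    # the overlap so rows matching both are not counted twice.
    return s1 + s2 - s1 * s2


# Hypothetical predicates matching 50% and 25% of the rows.
print(and_selectivity(0.5, 0.25), or_selectivity(0.5, 0.25))
```

Both results stay within the 0.0 to 1.0 range defined for selectivity, with AND never exceeding the smaller input and OR never falling below the larger one.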
  • Thank you!
