The Query Engine: The Life of a Read

Published in: Technology, Business

  1. Query Planning. Hari Khalsa, Query Therapist
  2. What you’re in for… (1) Query system overview. (2) Walk through an example query. (3) Query planning, revisited. (4) Plan ranking.
  3. Querying is a two-step process. (1) Planning: what’s the best way to execute this query? (2) Execution: traverse the indexes, get the results.
  4. TAKE AN EXAMPLE: let’s look at the simplest case.
  5. What’s in a .find()? .find({_id: {$gt: 7}}, {_id: 1}).sort({_id: 1})
  6. What’s in a .find()? .find({_id: {$gt: 7}}, {_id: 1}).sort({_id: 1}) Predicate: what to search for.
  7. What’s in a .find()? .find({_id: {$gt: 7}}, {_id: 1}).sort({_id: 1}) Predicate. Projection: turns {_id: 3, x: 5} into {_id: 3}.
  8. What’s in a .find()? .find({_id: {$gt: 7}}, {_id: 1}).sort({_id: 1}) Predicate. Projection. Sort: by _id, ascending.
  9. What’s in a .find()? .find({_id: {$gt: 7}}, {_id: 1}).sort({_id: 1}) Documents: { _id: 2, x: 3 } { _id: 3, x: 3 } { _id: 5, x: 5 } { _id: 8, x: 7 } { _id: 9, x: 4 }
  10. What’s in a .find()? .find({_id: {$gt: 7}}, {_id: 1}).sort({_id: 1}) Documents: { _id: 2, x: 3 } { _id: 3, x: 3 } { _id: 5, x: 5 } { _id: 8, x: 7 } { _id: 9, x: 4 } Results: { _id: 8 } { _id: 9 }
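The three parts of the example query can be mimicked in plain JavaScript over an in-memory array. This is only a sketch of what the predicate, projection, and sort each contribute; `findIdGreaterThan` is a hypothetical helper, not a MongoDB API, and the server does not evaluate queries this way.

```javascript
// In-memory stand-in for the five documents shown on the slides.
const docs = [
  { _id: 2, x: 3 }, { _id: 3, x: 3 }, { _id: 5, x: 5 },
  { _id: 8, x: 7 }, { _id: 9, x: 4 },
];

// Hypothetical helper (not a MongoDB API): applies the three parts in turn.
function findIdGreaterThan(collection, lowerBound) {
  return collection
    .filter((doc) => doc._id > lowerBound) // predicate: {_id: {$gt: 7}}
    .map((doc) => ({ _id: doc._id }))      // projection: {_id: 1}
    .sort((a, b) => a._id - b._id);        // sort: {_id: 1}, ascending
}

const results = findIdGreaterThan(docs, 7);
console.log(results); // [ { _id: 8 }, { _id: 9 } ]
```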
  11. QUERY PLANNING: selecting indexes, generating access plans, etc.
  12. Query planning has three stages. (1) Index Selection. (2) Access: plan generation. (3) Analysis: sort, project, etc.
  13. There are three index “types”: Geo, Text, B-Tree. Predicates are matched to indexes.
  14. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); Predicate: which indexes are useful?
  15. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); Predicate: the _id index is the relevant index.
  16. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); Predicate: perform an index scan on the { _id: 1 } index with bounds 7 < _id <= infinity.
  17. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); Predicate: perform an index scan on the { _id: 1 } index with bounds 7 < _id <= infinity. The predicate is fully covered by the index!
  18. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); Predicate: what if multiple indexes could work?
  19. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); Projection, sort: what other processing is needed?
  20. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); What other processing is needed? Index scan on {_id: 1} => results are already sorted by {_id: 1}! (Projection remains.)
  21. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); What other processing is needed? Index on {_id: 1} & only need _id field => no need to fetch documents from disk!
  22. .find({ _id: { $gt: 7 }}, { _id: 1 }).sort({ _id: 1 }); The actual query solution tree: IXSCAN => PROJ => RESULTS
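The solution tree for the example (an index scan feeding a projection) can be sketched with two small generator stages. The sorted array standing in for the { _id: 1 } index and the names `ixscan` and `proj` are assumptions made for illustration, not the server's internals. Because the keys come out in index order and carry the only field requested, no separate sort or document fetch appears.

```javascript
// Toy { _id: 1 } index: just the sorted keys. (A real B-tree index also
// stores where each document lives; that is omitted here.)
const idIndex = [2, 3, 5, 8, 9];

// IXSCAN: walk keys in index order, keeping those inside the bounds (low, high].
function* ixscan(index, low, high) {
  for (const key of index) {
    if (key > high) break;             // keys are sorted: nothing later can match
    if (key > low) yield { _id: key }; // exclusive lower bound, as in 7 < _id
  }
}

// PROJ: keep only the requested fields of each incoming entry.
function* proj(source, fields) {
  for (const entry of source) {
    const out = {};
    for (const field of fields) out[field] = entry[field];
    yield out;
  }
}

// IXSCAN => PROJ => RESULTS, with bounds 7 < _id <= infinity.
const planResults = [...proj(ixscan(idIndex, 7, Infinity), ["_id"])];
console.log(planResults); // [ { _id: 8 }, { _id: 9 } ]
```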
  23. ACCESS: IT’S COMPLICATED. Index Selection; Access: plan generation; Analysis: sort, project, etc.
  24. Query stages can produce data, or consume and produce data.
  25. Predicate leaves are data-providing stages. Predicate leaf: IXSCAN, TEXT, GEONEAR.
  26. AND and OR stages consume and produce data. (Diagram: AND and OR nodes over predicate leaves, PL.)
  27. An “OR” query is indexed if both of its children are indexed.
  28. OR[x:1, z:1] with indices {x:1}, {z:1}
  29. OR[x:1, z:1] with indices {x:1}, {z:1}: OR (union) of IXSCAN {x:1} from [1, 1] and IXSCAN {z:1} from [1, 1].
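The OR plan above (a union of two index scans) might look like this in miniature. The sample documents, the `[key, _id]` pair layout, and the helper names are all assumptions chosen for the sketch; the point is only that each child is answered from its own index and the OR stage removes duplicates.

```javascript
// Made-up sample data for the sketch.
const orDocs = [
  { _id: 1, x: 1, z: 0 },
  { _id: 2, x: 0, z: 1 },
  { _id: 3, x: 1, z: 1 },
  { _id: 4, x: 0, z: 0 },
];

// Toy single-field index: [key, _id] pairs sorted by key, then by _id.
function buildIndex(collection, field) {
  return collection
    .map((doc) => [doc[field], doc._id])
    .sort((a, b) => a[0] - b[0] || a[1] - b[1]);
}

// IXSCAN over inclusive bounds [low, high]; returns the matching _ids.
function ixscanIds(index, low, high) {
  return index
    .filter(([key]) => key >= low && key <= high)
    .map(([, id]) => id);
}

// OR stage: union of its children's output, dropping duplicate _ids.
function orStage(...children) {
  return [...new Set(children.flat())];
}

const orIds = orStage(
  ixscanIds(buildIndex(orDocs, "x"), 1, 1), // IXSCAN {x:1} from [1, 1]
  ixscanIds(buildIndex(orDocs, "z"), 1, 1)  // IXSCAN {z:1} from [1, 1]
);
console.log(orIds); // [ 1, 3, 2 ]
```

Document 3 matches both children but appears once in the output, which is why the OR stage is a union rather than a plain concatenation.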
  30. An “AND” query is indexed if at least one of its children is indexed.
  31. AND[x:1, y:1] with index {x:1}
  32. AND[x:1, y:1] with index {x:1}: IXSCAN {x:1} from [1, 1]
  33. AND[x:1, y:1] with index {x:1}: FETCH [ filter = {y:1} ] over IXSCAN {x:1} from [1, 1]
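The AND plan above needs only one indexed child: the index narrows the candidates, and the remaining predicate is applied as a filter after fetching each document. A minimal sketch under assumed data follows; `scanPoint` and `fetchStage` are hypothetical names, and a `Map` stands in for the {x:1} index.

```javascript
// Made-up sample data for the sketch.
const andDocs = [
  { _id: 1, x: 1, y: 1 },
  { _id: 2, x: 1, y: 2 },
  { _id: 3, x: 0, y: 1 },
];

// Only x is indexed: a toy map from x value to the _ids holding that value.
const xIndex = new Map();
for (const doc of andDocs) {
  if (!xIndex.has(doc.x)) xIndex.set(doc.x, []);
  xIndex.get(doc.x).push(doc._id);
}

// IXSCAN {x:1} from [value, value]: a point lookup in the toy index.
function scanPoint(index, value) {
  return index.get(value) || [];
}

// FETCH [filter]: load each document by _id, then apply the residual predicate.
function fetchStage(ids, collection, filter) {
  const byId = new Map(collection.map((doc) => [doc._id, doc]));
  return ids.map((id) => byId.get(id)).filter(filter);
}

const andResults = fetchStage(
  scanPoint(xIndex, 1),   // candidates with x == 1: _ids 1 and 2
  andDocs,
  (doc) => doc.y === 1    // filter = {y:1}, checked on the fetched document
);
console.log(andResults); // [ { _id: 1, x: 1, y: 1 } ]
```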
  34. What about index intersection?
  35. RANKING PLANS: choosing the best plan for the query.
  36. Run the plans, see which is best.
  37. Run the plans, see which is best. What do you mean, best?
  38. Run the plans, see which is best. What do you mean, best? Time is not a reliable metric.
  39. One “work” unit = look at one index key (IXSCAN), or look at one more doc (COLLSCAN).
  40. Pick the plan with the highest ratio of results produced to works executed.
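The ranking rule above can be illustrated with a toy scorer. Everything specific here is an assumption for the sketch: the candidate plans, their per-unit outcomes, and the idea of a fixed trial budget of work units are invented, and the real ranker may weigh other factors.

```javascript
// Each candidate plan is modelled as a list of work units. `true` means that
// unit produced a result; `false` means it looked at an index key or document
// that did not match. Names and numbers are made up for illustration.
const candidatePlans = [
  { name: "IXSCAN {_id: 1}", units: [true, true, true, false] },
  { name: "COLLSCAN", units: [false, false, false, true, true, false, false, true] },
];

// Score = results produced / works executed, over a trial of at most `budget` units.
function scorePlan(plan, budget) {
  const trial = plan.units.slice(0, budget);
  const resultsProduced = trial.filter(Boolean).length;
  return trial.length === 0 ? 0 : resultsProduced / trial.length;
}

// Keep whichever plan has the highest score; ties go to the earlier plan.
function pickBestPlan(plans, budget) {
  return plans.reduce((best, plan) =>
    scorePlan(plan, budget) > scorePlan(best, budget) ? plan : best
  );
}

const winner = pickBestPlan(candidatePlans, 8);
console.log(winner.name, scorePlan(winner, 8)); // IXSCAN {_id: 1} 0.75
```

Counting work units instead of elapsed time keeps the comparison independent of machine load, which matches the point that time is not a reliable metric.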
  41. THAT’S THE PLAN. Index Selection; Access: plan generation; Analysis: sort, project, etc.
  42. What does the future hold? Stats! Better index intersection.
  43. THANKS FOR LISTENING!
