Miguel Angel Nieto
miguel.nieto@mongodb.com
Technical Services Engineer, MongoDB
Query Planner
The Question
● I worked for 6 years as MySQL Technical Support
Engineer.
● A large percentage of cases from customers were related
to bad query plans/wrong index selection.
● Query Planning is a complex piece of code with many
knobs that can be tuned.
● When I started working at MongoDB I found that the
number of cases on that topic was very very (very) low. So
I asked myself:
Why?
Plan selection in other databases
● Traditional databases use a statistics approach to choose
the best plan:
○ The information about data distribution is not
accurate.
○ It is estimated by reading data with random dives in
the index tree (MySQL).
○ When some prerequisites are met (like number of
modified rows) statistics are automatically
recalculated.
Plan selection in MongoDB
● MongoDB uses a empirical method:
○ If there is no cached plan, then all viable execution plans,
based on the available indexes, are created.
○ MongoDB runs the query multiple times, one for each query
plan and benchmarks them. It chooses the one that
provides the best performance.
○ Once done, the plan is cached.
■ Future queries with the same shape will re-use this
plan rather than re-running the candidate plans.
■ For each such query the performance of the cached
plan is evaluated. If the plan's performance decreases
beyond a given threshold, it is evicted from the cache
and the candidate test phase runs again. This is known
as re-planning (SERVER-15225)
Benchmarking the plans
● All possible plans are executed in round-robin fashion.
● It gathers execution metrics and then provide a score to each
plan.
● Sort the plans by score and choose the best one.
Execution Metrics (I)
● These are the metrics:
works
advanced
needTime
isEOF
Execution Metrics (II)
● Number of works:
■ The planner asks each plan for the next document, via a
call to work().
■ If the plan can supply a document, it responds with
'advanced'. Otherwise, the plan responds with
'needsTime'.
● If all documents have been retrieved, then isEOF = 1.
Early stop of query execution
● The query could be expensive, so there are limits to early
stop the execution. Execution stop if:
○ The maximum number of works has been reached.
○ The requested number of documents has been
retrieved (advanced).
○ We get isEOF (the resultSet has no more documents).
works
work()
isEOF advanced
Break
Number of work() calls before stopping
● internalQueryPlanEvaluationWorks = 10000
For large collections we take a fraction of the number of
documents:
● internalQueryPlanEvaluationCollFraction = 0.3
Then, get the maximum value.
internalQueryPlanEvaluationWorks
internalQueryPlanEvaluationCollFraction numRecords
works
Number of documents to retrieve before
stopping
● internalQueryPlanEvaluationMaxResults = 101
● query.getQueryRequest().getNToReturn()
○ Used in the old OP_QUERY protocol.
○ Drivers set 'ntoreturn' to min('batchSize', 'limit') in
order to fake the lack of 'limit' or 'batchSize'
mechanism in the protocol.
● query.getQueryRequest().getLimit()
○ Used in OP_QUERY protocol from 3.2 onwards.
getNToReturn
advanced getNToReturn internalQueryPlanEvaluationMaxResults
getLimit
advanced getLimit internalQueryPlanEvaluationMaxResults
advanced internalQueryPlanEvaluationMaxResults
advanced
Pick the best plan, count the scores
● baseScore = 1
● Productivity = queryResults / workUnits
● TieBreak (very small number) = min(1.0 / (10 * workUnits), 1e-4)
● noFetchBonus (covered index) = TieBreak or 0
● noSortBonus (blocking sort) = TieBreak or 0
● noIxisectBonus (avoiding index intersection) = TieBreak or 0
● tieBreakers = noFetchBonus + noSortBonus +
noIxisectBonus
● eofBonus (if during plan execution all possible documents are retrieved) = 0 | 1
Replanning: Automatic Plan Cache Eviction
● The stored data keep changing, it could possible that the
cached plan is not the best one anymore.
● While the cached plan is being used, MongoDB re-runs
the trial period for that plan and keeps a count of the
work() function calls.
● If the new trial period takes more than 10 times as many
works() as the original trial period, it evicts the plan from
the cache and re-tests all candidate plans to pick a new
winner.
● internalQueryCacheEvictionRatio = 10
maxWorksBeforeReplan internalQueryCacheEvictionRatio cachedWorks
currentWorks maxWorksBeforeReplan
replan()
Plans are not always cached
● In the following situations, the execution plan is not
cached:
○ Collection scan without sort()
○ hint()
○ min()
○ max()
○ explain()
○ Tailable cursors (they don’t use indexes)
○ snapshot()
○ A single viable plan
Query Planner Troubleshoot Example (I)
● We check all query shapes:
listQueryShapes
Query Planner Troubleshoot Example (II)
● Get the execution plan for that query:
getPlansByQuery
solution
score
works
isEOF
Query Planner Troubleshoot Example (III)
● Remove the query plan for a particular query:
clearPlansByQuery
"plans": [ ]
Query Planner Troubleshoot Example (IV)
● Remove all query plans on a particular collection:
clear
Query Planner Troubleshoot
● There are Plan Cache methods that can be used for
troubleshooting:
https://docs.mongodb.com/manual/reference/method/js-plan-cache/
● Check all query shapes:
○ db.collection.getPlanCache().listQueryShapes()
● Get the plan for a particular query:
○ db.collection.getPlanCache().getPlansByQuery(
<query>, <projection>, <sort> )
● Clean the plans for a particular query:
○ db.collection.getPlanCache().clearPlansByQuery()
● Clean all plans:
○ db.collection.getPlanCache().clear()
Thanks!

Query planner

  • 1.
    Miguel Angel Nieto miguel.nieto@mongodb.com TechnicalServices Engineer, MongoDB Query Planner
  • 2.
    The Question ● Iworked for 6 years as MySQL Technical Support Engineer. ● A large percentage of cases from customers were related to bad query plans/wrong index selection. ● Query Planning is a complex piece of code with many knobs that can be tuned. ● When I started working at MongoDB I found that the number of cases on that topic was very very (very) low. So I asked myself: Why?
  • 3.
    Plan selection inother databases ● Traditional databases use a statistics approach to choose the best plan: ○ The information about data distribution is not accurate. ○ It is estimated by reading data with random dives in the index tree (MySQL). ○ When some prerequisites are met (like number of modified rows) statistics are automatically recalculated.
  • 4.
    Plan selection inMongoDB ● MongoDB uses a empirical method: ○ If there is no cached plan, then all viable execution plans, based on the available indexes, are created. ○ MongoDB runs the query multiple times, one for each query plan and benchmarks them. It chooses the one that provides the best performance. ○ Once done, the plan is cached. ■ Future queries with the same shape will re-use this plan rather than re-running the candidate plans. ■ For each such query the performance of the cached plan is evaluated. If the plan's performance decreases beyond a given threshold, it is evicted from the cache and the candidate test phase runs again. This is known as re-planning (SERVER-15225)
  • 5.
    Benchmarking the plans ●All possible plans are executed in round-robin fashion. ● It gathers execution metrics and then provide a score to each plan. ● Sort the plans by score and choose the best one.
  • 6.
    Execution Metrics (I) ●These are the metrics: works advanced needTime isEOF
  • 7.
    Execution Metrics (II) ●Number of works: ■ The planner asks each plan for the next document, via a call to work(). ■ If the plan can supply a document, it responds with 'advanced'. Otherwise, the plan responds with 'needsTime'. ● If all documents have been retrieved, then isEOF = 1.
  • 8.
    Early stop ofquery execution ● The query could be expensive, so there are limits to early stop the execution. Execution stop if: ○ The maximum number of works has been reached. ○ The requested number of documents has been retrieved (advanced). ○ We get isEOF (the resultSet has no more documents). works work() isEOF advanced Break
  • 9.
    Number of work()calls before stopping ● internalQueryPlanEvaluationWorks = 10000 For large collections we take a fraction of the number of documents: ● internalQueryPlanEvaluationCollFraction = 0.3 Then, get the maximum value. internalQueryPlanEvaluationWorks internalQueryPlanEvaluationCollFraction numRecords works
  • 10.
    Number of documentsto retrieve before stopping ● internalQueryPlanEvaluationMaxResults = 101 ● query.getQueryRequest().getNToReturn() ○ Used in the old OP_QUERY protocol. ○ Drivers set 'ntoreturn' to min('batchSize', 'limit') in order to fake the lack of 'limit' or 'batchSize' mechanism in the protocol. ● query.getQueryRequest().getLimit() ○ Used in OP_QUERY protocol from 3.2 onwards. getNToReturn advanced getNToReturn internalQueryPlanEvaluationMaxResults getLimit advanced getLimit internalQueryPlanEvaluationMaxResults advanced internalQueryPlanEvaluationMaxResults advanced
  • 11.
    Pick the bestplan, count the scores ● baseScore = 1 ● Productivity = queryResults / workUnits ● TieBreak (very small number) = min(1.0 / (10 * workUnits), 1e-4) ● noFetchBonus (covered index) = TieBreak or 0 ● noSortBonus (blocking sort) = TieBreak or 0 ● noIxisectBonus (avoiding index intersection) = TieBreak or 0 ● tieBreakers = noFetchBonus + noSortBonus + noIxisectBonus ● eofBonus (if during plan execution all possible documents are retrieved) = 0 | 1
  • 12.
    Replanning: Automatic PlanCache Eviction ● The stored data keep changing, it could possible that the cached plan is not the best one anymore. ● While the cached plan is being used, MongoDB re-runs the trial period for that plan and keeps a count of the work() function calls. ● If the new trial period takes more than 10 times as many works() as the original trial period, it evicts the plan from the cache and re-tests all candidate plans to pick a new winner. ● internalQueryCacheEvictionRatio = 10 maxWorksBeforeReplan internalQueryCacheEvictionRatio cachedWorks currentWorks maxWorksBeforeReplan replan()
  • 13.
    Plans are notalways cached ● In the following situations, the execution plan is not cached: ○ Collection scan without sort() ○ hint() ○ min() ○ max() ○ explain() ○ Tailable cursors (they don’t use indexes) ○ snapshot() ○ A single viable plan
  • 14.
    Query Planner TroubleshootExample (I) ● We check all query shapes: listQueryShapes
  • 15.
    Query Planner TroubleshootExample (II) ● Get the execution plan for that query: getPlansByQuery solution score works isEOF
  • 16.
    Query Planner TroubleshootExample (III) ● Remove the query plan for a particular query: clearPlansByQuery "plans": [ ]
  • 17.
    Query Planner TroubleshootExample (IV) ● Remove all query plans on a particular collection: clear
  • 18.
    Query Planner Troubleshoot ●There are Plan Cache methods that can be used for troubleshooting: https://docs.mongodb.com/manual/reference/method/js-plan-cache/ ● Check all query shapes: ○ db.collection.getPlanCache().listQueryShapes() ● Get the plan for a particular query: ○ db.collection.getPlanCache().getPlansByQuery( <query>, <projection>, <sort> ) ● Clean the plans for a particular query: ○ db.collection.getPlanCache().clearPlansByQuery() ● Clean all plans: ○ db.collection.getPlanCache().clear()
  • 19.