SlideShare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.
SlideShare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.
Successfully reported this slideshow.
Activate your 14 day free trial to unlock unlimited reading.
2.
The Question
● I worked for 6 years as MySQL Technical Support
Engineer.
● A large percentage of cases from customers were related
to bad query plans/wrong index selection.
● Query Planning is a complex piece of code with many
knobs that can be tuned.
● When I started working at MongoDB I found that the
number of cases on that topic was very very (very) low. So
I asked myself:
Why?
3.
Plan selection in other databases
● Traditional databases use a statistics approach to choose
the best plan:
○ The information about data distribution is not
accurate.
○ It is estimated by reading data with random dives in
the index tree (MySQL).
○ When some prerequisites are met (like number of
modified rows) statistics are automatically
recalculated.
4.
Plan selection in MongoDB
● MongoDB uses a empirical method:
○ If there is no cached plan, then all viable execution plans,
based on the available indexes, are created.
○ MongoDB runs the query multiple times, one for each query
plan and benchmarks them. It chooses the one that
provides the best performance.
○ Once done, the plan is cached.
■ Future queries with the same shape will re-use this
plan rather than re-running the candidate plans.
■ For each such query the performance of the cached
plan is evaluated. If the plan's performance decreases
beyond a given threshold, it is evicted from the cache
and the candidate test phase runs again. This is known
as re-planning (SERVER-15225)
5.
Benchmarking the plans
● All possible plans are executed in round-robin fashion.
● It gathers execution metrics and then provide a score to each
plan.
● Sort the plans by score and choose the best one.
6.
Execution Metrics (I)
● These are the metrics:
works
advanced
needTime
isEOF
7.
Execution Metrics (II)
● Number of works:
■ The planner asks each plan for the next document, via a
call to work().
■ If the plan can supply a document, it responds with
'advanced'. Otherwise, the plan responds with
'needsTime'.
● If all documents have been retrieved, then isEOF = 1.
8.
Early stop of query execution
● The query could be expensive, so there are limits to early
stop the execution. Execution stop if:
○ The maximum number of works has been reached.
○ The requested number of documents has been
retrieved (advanced).
○ We get isEOF (the resultSet has no more documents).
works
work()
isEOF advanced
Break
9.
Number of work() calls before stopping
● internalQueryPlanEvaluationWorks = 10000
For large collections we take a fraction of the number of
documents:
● internalQueryPlanEvaluationCollFraction = 0.3
Then, get the maximum value.
internalQueryPlanEvaluationWorks
internalQueryPlanEvaluationCollFraction numRecords
works
10.
Number of documents to retrieve before
stopping
● internalQueryPlanEvaluationMaxResults = 101
● query.getQueryRequest().getNToReturn()
○ Used in the old OP_QUERY protocol.
○ Drivers set 'ntoreturn' to min('batchSize', 'limit') in
order to fake the lack of 'limit' or 'batchSize'
mechanism in the protocol.
● query.getQueryRequest().getLimit()
○ Used in OP_QUERY protocol from 3.2 onwards.
getNToReturn
advanced getNToReturn internalQueryPlanEvaluationMaxResults
getLimit
advanced getLimit internalQueryPlanEvaluationMaxResults
advanced internalQueryPlanEvaluationMaxResults
advanced
11.
Pick the best plan, count the scores
● baseScore = 1
● Productivity = queryResults / workUnits
● TieBreak (very small number) = min(1.0 / (10 * workUnits), 1e-4)
● noFetchBonus (covered index) = TieBreak or 0
● noSortBonus (blocking sort) = TieBreak or 0
● noIxisectBonus (avoiding index intersection) = TieBreak or 0
● tieBreakers = noFetchBonus + noSortBonus +
noIxisectBonus
● eofBonus (if during plan execution all possible documents are retrieved) = 0 | 1
12.
Replanning: Automatic Plan Cache Eviction
● The stored data keep changing, it could possible that the
cached plan is not the best one anymore.
● While the cached plan is being used, MongoDB re-runs
the trial period for that plan and keeps a count of the
work() function calls.
● If the new trial period takes more than 10 times as many
works() as the original trial period, it evicts the plan from
the cache and re-tests all candidate plans to pick a new
winner.
● internalQueryCacheEvictionRatio = 10
maxWorksBeforeReplan internalQueryCacheEvictionRatio cachedWorks
currentWorks maxWorksBeforeReplan
replan()
13.
Plans are not always cached
● In the following situations, the execution plan is not
cached:
○ Collection scan without sort()
○ hint()
○ min()
○ max()
○ explain()
○ Tailable cursors (they don’t use indexes)
○ snapshot()
○ A single viable plan
14.
Query Planner Troubleshoot Example (I)
● We check all query shapes:
listQueryShapes
15.
Query Planner Troubleshoot Example (II)
● Get the execution plan for that query:
getPlansByQuery
solution
score
works
isEOF
16.
Query Planner Troubleshoot Example (III)
● Remove the query plan for a particular query:
clearPlansByQuery
"plans": [ ]
17.
Query Planner Troubleshoot Example (IV)
● Remove all query plans on a particular collection:
clear
18.
Query Planner Troubleshoot
● There are Plan Cache methods that can be used for
troubleshooting:
https://docs.mongodb.com/manual/reference/method/js-plan-cache/
● Check all query shapes:
○ db.collection.getPlanCache().listQueryShapes()
● Get the plan for a particular query:
○ db.collection.getPlanCache().getPlansByQuery(
<query>, <projection>, <sort> )
● Clean the plans for a particular query:
○ db.collection.getPlanCache().clearPlansByQuery()
● Clean all plans:
○ db.collection.getPlanCache().clear()