MongoDB Indexing and Query Optimizer Details
Antoine Girbal
Mongo FR, March 23, 2011
What will we cover?
A full understanding of these details is not required to use mongo, but this knowledge can be helpful when making optimizations.
We’ll discuss functionality of Mongo 1.8 (for our purposes pretty similar to 1.6 and almost identical to 1.7 edge).
Much of the material will be presented through examples.
Diagrams are to aid understanding – some details will be left out.
Btree (conceptual diagram): keys 1 2 3 4 5 6 7 8 9; key 6 references the document {_id:4,x:6}
Find One Document
Index {x:1}
Find One Document (diagram): keys 1 2 3 4 5 6 7 8 9; searching for 6 reaches {_id:4,x:6}
Find One Document
> db.c.find( {x:6} ).limit( 1 ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 1, "nscannedObjects" : 1, "n" : 1,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ 6, 6 ] ] }
}
Uses a btree cursor to find the object. Index ranges are around a single value.
Find One Document (diagram): keys 1 2 3 4 5 6 6 6 9; searching for 6 reaches {_id:4,x:6}. Now we have duplicate x values.
Equality Match
Index {x:1}
Several documents to be returned
Equality Match (diagram): keys 1 2 3 4 5 6 6 6 9; searching for 6 reaches {_id:4,x:6}, {_id:5,x:6} and {_id:1,x:6}
Equality Match
> db.c.find( {x:6} ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 3, "nscannedObjects" : 3, "n" : 3,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ 6, 6 ] ] }
}
Full Document Matcher
Index {x:1}
Object content needs to be checked
Full Document Matcher (diagram): keys 1 2 3 4 5 6 6 6 9; searching for 6 reaches {y:4,x:6}, {y:5,x:6} and {y:1,x:6}
Full Document Matcher
> db.c.find( {x:6,y:1} ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 3, "nscannedObjects" : 3, "n" : 1,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ 6, 6 ] ] }
}
Documents for all matching index keys are scanned, but only one document matched on non-index keys.
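The gap between nscanned and n above can be reproduced with a toy model. This is a minimal sketch in plain JavaScript, not MongoDB's implementation: the index on {x:1} is modelled as a sorted list of [key, document] pairs, and `findWithMatcher` is a hypothetical helper name.

```javascript
// Toy model of an index on {x:1}: sorted [key, document] pairs.
// The documents mirror the diagram; everything else is an assumption.
const docs = [
  { x: 1 }, { x: 2 }, { x: 3 }, { x: 4 }, { x: 5 },
  { x: 6, y: 4 }, { x: 6, y: 5 }, { x: 6, y: 1 }, { x: 9 },
];
const indexX = docs.map((doc) => [doc.x, doc]).sort((a, b) => a[0] - b[0]);

// Walk the keys inside [lo, hi], then check the remaining criteria
// against each full document.
function findWithMatcher(index, lo, hi, matcher) {
  let nscanned = 0;
  const results = [];
  for (const [key, doc] of index) {
    if (key < lo) continue;
    if (key > hi) break;
    nscanned += 1; // every key inside the bounds is visited
    if (matcher(doc)) results.push(doc);
  }
  return { nscanned, n: results.length, results };
}

// Equivalent of find( {x:6,y:1} ) with only x indexed.
const explained = findWithMatcher(indexX, 6, 6, (doc) => doc.y === 1);
console.log(explained.nscanned, explained.n); // prints "3 1"
```

The index narrows the scan to the three keys equal to 6; the y criterion can only be checked on the documents themselves.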
Range Match
Index {x:1}
Range Match (diagram): keys 1 2 3 4 5 6 7 8 9; searching for 4 <= x <= 7
Range Match
> db.c.find( {x:{$gte:4,$lte:7}} ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 4, "nscannedObjects" : 4, "n" : 4,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ 4, 7 ] ] }
}
Exclusive Range Match
Index {x:1}
The index range is the same as for the inclusive range match, but the boundary keys are neither scanned nor returned.
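To make the difference concrete, here is a small sketch over the nine keys from the range-match diagram. It assumes a query of the form {x:{$gt:4,$lt:7}} (the slide's own query is not in this transcript), and `rangeScan` is a hypothetical helper, not a MongoDB function.

```javascript
// Sorted index keys, as in the range-match diagram.
const keys = [1, 2, 3, 4, 5, 6, 7, 8, 9];

// Return the keys visited for the range lo..hi, with or without the bounds.
function rangeScan(sortedKeys, lo, hi, inclusive) {
  return sortedKeys.filter((k) =>
    inclusive ? k >= lo && k <= hi : k > lo && k < hi
  );
}

const inclusiveKeys = rangeScan(keys, 4, 7, true);  // $gte:4, $lte:7
const exclusiveKeys = rangeScan(keys, 4, 7, false); // $gt:4, $lt:7 (assumed query)
console.log(inclusiveKeys, exclusiveKeys);
```

The same bounds [4, 7] give four keys when inclusive and two (5 and 6) when exclusive.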
Multikeys
Index {x:1}
Documents contain lists with several values, like [8,9].
Multikeys (diagram): keys 1 2 3 4 5 6 7 8 9; searching for x > 7 reaches {_id:4,x:[8,9]} through both key 8 and key 9
Multikeys
> db.c.find( {x:{$gt:7}} ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 2, "nscannedObjects" : 2, "n" : 1,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : true, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ 7, 1.7976931348623157e+308 ] ] }
}
All keys in the valid range are scanned, but the matcher rejects duplicate documents, making n == 1.
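A rough sketch of why nscanned is 2 while n is 1: a multikey index holds one entry per array element, so the same document is reached twice. The collection contents and the helper names `buildMultikeyIndex` and `findGreaterThan` are assumptions made for illustration.

```javascript
// Hypothetical collection; only {_id:4,x:[8,9]} comes from the slide.
const collection = [
  { _id: 1, x: 3 },
  { _id: 4, x: [8, 9] },
];

// One index entry per array element (or one entry for a scalar value).
function buildMultikeyIndex(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const v of values) entries.push({ key: v, id: doc._id });
  }
  return entries.sort((a, b) => a.key - b.key);
}

// Scan keys strictly greater than the bound and return each document once.
function findGreaterThan(entries, bound) {
  let nscanned = 0;
  const seen = new Set();
  for (const entry of entries) {
    if (entry.key <= bound) continue;
    nscanned += 1;
    seen.add(entry.id); // a document reached via a second key is not returned again
  }
  return { nscanned, n: seen.size };
}

const multikeyResult = findGreaterThan(buildMultikeyIndex(collection, "x"), 7);
console.log(multikeyResult); // { nscanned: 2, n: 1 }
```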
Range Types
db.c.find( {x:{$gt:4}} )
db.c.find( {x:{$ne:4}} )
Range Types
db.c.find( {x:/^a/} )
"indexBounds" : { "x" : [ [ "a", "b" ], [ /^a/, /^a/ ] ] }
2 ranges of 2 different types are scanned: string and regex.
Range Types
db.c.find( {x:/a/} )
"indexBounds" : { "x" : [ [ "", { } ], [ /a/, /a/ ] ] }
Here the index only helps to restrict the type, which is not efficient in practice.
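The string range for an anchored regex can be derived from its literal prefix. The sketch below is a simplification that only handles patterns made of `^` followed by plain letters and digits; `stringBoundsForRegex` is a hypothetical helper and does not reflect how MongoDB parses regular expressions.

```javascript
// Derive [lower, upper) string bounds from a pure literal prefix such as /^a/.
// Returns null when there is no usable prefix, meaning all strings must be scanned.
function stringBoundsForRegex(regex) {
  const m = /^\^([A-Za-z0-9]+)$/.exec(regex.source);
  if (!m) return null;
  const prefix = m[1];
  const last = prefix.charCodeAt(prefix.length - 1);
  // The upper bound is the prefix with its last character incremented.
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return [prefix, upper];
}

console.log(stringBoundsForRegex(/^a/)); // [ 'a', 'b' ]
console.log(stringBoundsForRegex(/a/));  // null
```

With /^a/ the scan is limited to strings from "a" up to "b"; with /a/ no prefix exists, which matches the slide's remark that the index only restricts the type.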
Set Match
Index {x:1}
Set Match (diagram): keys 1 2 3 4 5 6 7 8 9; searching for 3 and 6
Set Match
> db.c.find( {x:{$in:[3,6]}} ).explain()
{
    "cursor" : "BtreeCursor x_1 multi",
    "nscanned" : 3, "nscannedObjects" : 2, "n" : 2,
    "millis" : 8, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ 3, 3 ], [ 6, 6 ] ] }
}
Why is nscanned 3? This is an algorithmic detail: when there are disjoint ranges for a key, nscanned may be higher than the number of matching keys.
All Match
Index {x:1}
All Match (diagram): keys 1 2 3 4 5 6 7 8 9; searching for 3 reaches {_id:4,x:[3,6]}
All Match
> db.c.find( {x:{$all:[3,6]}} ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 1, "nscannedObjects" : 1, "n" : 1,
    "millis" : 0, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : true, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ 3, 3 ] ] }
}
The first entry in the $all match array is always used for index bounds. Note this may not be the least numerous indexed value in the $all array.
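The consequence of using the first $all entry for the bounds can be seen in a small simulation. The collection below is hypothetical apart from {_id:4,x:[3,6]}, and `findAll` is an illustrative stand-in for the index scan plus matcher.

```javascript
// Hypothetical collection: the value 6 is more common than the value 3.
const allDocs = [
  { _id: 4, x: [3, 6] },
  { _id: 2, x: [6, 8] },
  { _id: 3, x: [6] },
];

// Use the first wanted value as the index bound, then let the matcher
// verify that every wanted value is present in the document.
function findAll(docs, field, wanted) {
  const first = wanted[0];
  let nscanned = 0;
  const results = [];
  for (const doc of docs) {
    if (!doc[field].includes(first)) continue; // stand-in for the range [first, first]
    nscanned += 1;
    if (wanted.every((v) => doc[field].includes(v))) results.push(doc);
  }
  return { nscanned, n: results.length };
}

const boundOn3 = findAll(allDocs, "x", [3, 6]);
const boundOn6 = findAll(allDocs, "x", [6, 3]);
console.log(boundOn3, boundOn6);
```

Both orderings return the same single document, but in this sketch listing the rarer value first scans one key instead of three.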
Limit
Index {x:1}
Limit (diagram): keys 1 2 3 4 5 6 7 8 9; searching for x < 6; the documents in range have y values 3, 1, 3, 3, 3
Limit
> db.c.find( {x:{$lt:6},y:3} ).limit( 3 ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 4, "nscannedObjects" : 4, "n" : 3,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : true, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ -1.7976931348623157e+308, 6 ] ] }
}
Scan until three matches are found, then stop.
Skip
Index {x:1}
Skip (diagram): keys 1 2 3 4 5 6 7 8 9; searching for x < 6; the documents in range have y values 3, 1, 3, 3, 3
Skip
> db.c.find( {x:{$lt:6},y:3} ).skip( 3 ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 5, "nscannedObjects" : 5, "n" : 1,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : true, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ -1.7976931348623157e+308, 6 ] ] }
}
All skipped documents are scanned.
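The limit and skip numbers can be reproduced with one loop. This sketch assumes the y values from the diagram belong to x = 1 through 5 in that order, which is consistent with the explain output but not stated on the slide; `scan` is a hypothetical helper.

```javascript
// Documents in index order on x; y values for x >= 6 are not needed here.
const rows = [
  { x: 1, y: 3 }, { x: 2, y: 1 }, { x: 3, y: 3 },
  { x: 4, y: 3 }, { x: 5, y: 3 }, { x: 6 }, { x: 7 },
];

// Simulate find( {x:{$lt:6},y:3} ) with optional skip and limit.
function scan(docs, { skip = 0, limit = Infinity } = {}) {
  let nscanned = 0;
  let skipped = 0;
  const out = [];
  for (const doc of docs) {
    if (doc.x >= 6) break;            // upper index bound for {$lt:6}
    nscanned += 1;
    if (doc.y !== 3) continue;        // matcher for the non-indexed field
    if (skipped < skip) { skipped += 1; continue; } // skipped docs are still scanned
    out.push(doc);
    if (out.length >= limit) break;   // stop as soon as the limit is reached
  }
  return { nscanned, n: out.length };
}

const limited = scan(rows, { limit: 3 });
const skippedResult = scan(rows, { skip: 3 });
console.log(limited, skippedResult);
```

With limit 3 the scan stops after four keys; with skip 3 all five keys in range are scanned to return the one remaining match.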
Sort
Index {x:1}
Sorting along index key uses index btree ordering
Sort (diagram): keys 1 2 3 4 5 6 7 8 9; searching for x < 6; the documents in range have y values 3, 1, 3, 3, 3
Sort
> db.c.find( {x:{$lt:6},y:3} ).sort( {x:1} ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 5, "nscannedObjects" : 5, "n" : 4,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : true, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ -1.7976931348623157e+308, 6 ] ] }
}
Find uses the btree cursor to easily sort data.
Sort
Index {x:1}
Sorting on a non-indexed key requires a scan and order step.
Sort (diagram): keys 1 2 3 4 5 6 7 8 9; searching for x < 6; the documents in range have y values 3, 1, 3, 3, 3
Sort
> db.c.find( {x:{$lt:6},y:3} ).sort( {y:1} ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 5, "nscannedObjects" : 5, "n" : 4,
    "scanAndOrder" : true,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : true, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ -1.7976931348623157e+308, 6 ] ] }
}
Results are sorted on the fly to match the requested order. The scanAndOrder field is only printed when its value is true.
Sort and scanAndOrder
With scanAndOrder, sorting is performed in memory and the memory footprint is constrained by the limit spec if present.
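One way to picture how a limit bounds the memory used by an in-memory sort is a buffer that never retains more than `limit` documents. This is a minimal sketch of that idea, not MongoDB's actual sort code; `scanAndOrder` here is a hypothetical function and the sample documents are assumptions.

```javascript
// Sort documents by sortKey in memory, retaining at most `limit` of them.
function scanAndOrder(docs, sortKey, limit = Infinity) {
  const buffer = [];
  let peak = 0; // largest number of documents retained between steps
  for (const doc of docs) {
    // Insert the document at its sorted position (after equal keys).
    let i = buffer.length;
    while (i > 0 && buffer[i - 1][sortKey] > doc[sortKey]) i -= 1;
    buffer.splice(i, 0, doc);
    // Anything beyond the limit can never be part of the result.
    if (buffer.length > limit) buffer.pop();
    peak = Math.max(peak, buffer.length);
  }
  return { results: buffer, peak };
}

const scanned = [
  { x: 1, y: 3 }, { x: 2, y: 1 }, { x: 3, y: 3 }, { x: 4, y: 3 }, { x: 5, y: 3 },
];
const unlimited = scanAndOrder(scanned, "y");
const topTwo = scanAndOrder(scanned, "y", 2);
console.log(unlimited.peak, topTwo.peak); // prints "5 2"
```

Without a limit every scanned document is held until the sort finishes; with a limit of 2 the buffer stays at two documents.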
Count
With some operators the full document must be checked. Some of these cases:

$size
Covered Indexes
Index {x:1}
_id would be returned by default, but it is not in the index, so we need to exclude it to return only indexed fields.
Covered Indexes
> db.c.find( {x:6}, {x:1,_id:0} ).explain()
{
    "cursor" : "BtreeCursor x_1",
    "nscanned" : 1, "nscannedObjects" : 1, "n" : 1,
    "millis" : 0, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : true,
    "indexBounds" : { "x" : [ [ 6, 6 ] ] }
}
indexOnly is true, and isMultiKey must be false. Currently we set isMultiKey to true the first time we save a doc where the field is a multikey array.
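The two conditions above (only indexed fields returned, index not multikey) can be written as a small check. This is a simplified sketch: `isCovered` is a hypothetical helper, it only looks at the projection, and it ignores whether the query criteria themselves are indexed.

```javascript
// Decide whether a projection could be answered from the index alone.
function isCovered(indexKeys, projection, isMultiKey) {
  if (isMultiKey) return false; // a multikey index cannot cover the query
  // Fields explicitly included by the projection.
  const returned = Object.keys(projection).filter((f) => projection[f]);
  // _id is returned by default unless it is explicitly excluded.
  const idExcluded = projection._id === 0 || projection._id === false;
  if (!idExcluded && !returned.includes("_id")) returned.push("_id");
  return returned.length > 0 && returned.every((f) => indexKeys.includes(f));
}

console.log(isCovered(["x"], { x: 1, _id: 0 }, false)); // true
console.log(isCovered(["x"], { x: 1 }, false));         // false: _id is not in the index
console.log(isCovered(["x"], { x: 1, _id: 0 }, true));  // false: multikey index
```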
Two Equality Bounds (diagram): compound index keys (x y): 1 b, 3 d, 4 g, 5 c, 5 d, 5 f, 6 c, 7 a, 9 b; searching for 5 c
Two Equality Bounds
> db.c.find( {x:5,y:'c'} ).explain()
{
    "cursor" : "BtreeCursor x_1_y_1",
    "nscanned" : 1, "nscannedObjects" : 1, "n" : 1,
    "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : false,
    "indexBounds" : { "x" : [ [ 5, 5 ] ], "y" : [ [ "c", "c" ] ] }
}
2 ranges are applied to narrow down the data to scan.
Two Set Bounds (diagram): compound index keys (x y): 1 b, 3 d, 4 g, 5 c, 5 d, 5 f, 6 c, 7 a, 9 f; searching for 5 c, 5 f, 9 c and 9 f
Two Set Bounds
> db.c.find( {x:{$in:[5,9]},y:{$in:['c','f']}} ).explain()
{
    "cursor" : "BtreeCursor x_1_y_1 multi",
    "nscanned" : 5, "nscannedObjects" : 3, "n" : 3,
    "millis" : 0, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : false,
    ...
    "indexBounds" : { "x" : [ [ 5, 5 ], [ 9, 9 ] ], "y" : [ [ "c", "c" ], [ "f", "f" ] ] }
}
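The bounds in this explain output amount to the cross product of the two sets. The sketch below lists those target points and checks them against the compound keys from the diagram; `pointBounds` is a hypothetical helper, and the sketch does not attempt to reproduce the nscanned value of 5.

```javascript
// Compound index keys (x, y) in index order, as in the diagram.
const compoundKeys = [
  [1, "b"], [3, "d"], [4, "g"], [5, "c"], [5, "d"],
  [5, "f"], [6, "c"], [7, "a"], [9, "f"],
];

// Every combination of an x value and a y value is a point to look up.
function pointBounds(xs, ys) {
  const points = [];
  for (const x of xs) for (const y of ys) points.push([x, y]);
  return points;
}

const targets = pointBounds([5, 9], ["c", "f"]); // 5 c, 5 f, 9 c, 9 f
const matched = compoundKeys.filter(([x, y]) =>
  targets.some(([tx, ty]) => tx === x && ty === y)
);
console.log(targets.length, matched.length); // prints "4 3"
```

Four points are searched and three exist in the index, which agrees with n being 3.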
$or runs one find per clause, sequentially (two finds for two clauses).
It must not return the same document twice, so it checks whether each document already satisfies a previous clause.
Disjoint $or Criteria (diagram): documents (x y): 1 b, 3 d, 4 g, 5 c, 5 d, 6 a, 7 e, 7 g, 9 f; searching for x = 5 on one index and y = d on the other
Disjoint $or Criteria
> db.c.find( {$or:[{x:5},{y:'d'}]} ).explain()
{
    "clauses" : [
        {
            "cursor" : "BtreeCursor x_1",
            "nscanned" : 2, "nscannedObjects" : 2, "n" : 2,
            "millis" : 0, "nYields" : 0, "nChunkSkips" : 0,
            "isMultiKey" : false, "indexOnly" : false,
            "indexBounds" : { "x" : [ [ 5, 5 ] ] }
        },
        {
            "cursor" : "BtreeCursor y_1",
            "nscanned" : 2, "nscannedObjects" : 2, "n" : 1,
            "millis" : 1, "nYields" : 0, "nChunkSkips" : 0,
            "isMultiKey" : false, "indexOnly" : false,
            "indexBounds" : { "y" : [ [ "d", "d" ] ] }
        }
    ],
    "nscanned" : 4, "nscannedObjects" : 4, "n" : 3,
    "millis" : 1
}
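The per-clause numbers can be reproduced with a short simulation of the duplicate check. The x and y values come from the diagram, the _id values are invented, and `findOr` is a hypothetical helper in which a filter stands in for each index lookup.

```javascript
// Documents from the diagram; _id values are assumptions.
const orDocs = [
  { _id: 1, x: 1, y: "b" }, { _id: 2, x: 3, y: "d" }, { _id: 3, x: 4, y: "g" },
  { _id: 4, x: 5, y: "c" }, { _id: 5, x: 5, y: "d" }, { _id: 6, x: 6, y: "a" },
  { _id: 7, x: 7, y: "e" }, { _id: 8, x: 7, y: "g" }, { _id: 9, x: 9, y: "f" },
];

// Run the clauses one after another, skipping documents that an
// earlier clause has already returned.
function findOr(docs, clauses) {
  const results = [];
  const perClause = [];
  clauses.forEach((clause, i) => {
    let nscanned = 0;
    let n = 0;
    for (const doc of docs) {
      if (!clause(doc)) continue; // stand-in for an index lookup on this clause
      nscanned += 1;
      if (clauses.slice(0, i).some((earlier) => earlier(doc))) continue;
      n += 1;
      results.push(doc);
    }
    perClause.push({ nscanned, n });
  });
  return { perClause, n: results.length };
}

const orResult = findOr(orDocs, [(d) => d.x === 5, (d) => d.y === "d"]);
console.log(orResult.perClause, orResult.n);
```

The second clause scans two documents but returns only one, because {x:5,y:'d'} was already returned by the first clause.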
Index {x:1} (no index on y)
Unindexed $or Clause
> db.c.find( {$or:[{x:5},{y:'d'}]} ).explain()
{
    "cursor" : "BasicCursor",
    "nscanned" : 9, "nscannedObjects" : 9, "n" : 3,
    "millis" : 0, "nYields" : 0, "nChunkSkips" : 0,
    "isMultiKey" : false, "indexOnly" : false,
    "indexBounds" : { }
}
Since y is not indexed, we must do a full collection scan to match y:'d'. Since a full scan is required, we don't use the index on x to match x:5.
  • 103. Automatic Index Selection (Query Optimizer)
All fields with index-useful constraints are indexed
If there are fewer distinct values in 2 < x < 7 than in 'b' < y < 'f', then {x:1,y:1} is chosen (rule of thumb).
Cost of scanAndOrder vs ordered index
Cost of loading full document vs just index key
Cost of scanning adjacent btree keys vs non adjacent keys/documents
Candidate plans are run in an interleaved fashion.
Plans are kept in a priority queue ordered by nscanned. We always continue progress on the plan with the lowest nscanned.
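The interleaving rule can be sketched as a loop that always advances the plan with the lowest nscanned. This is a toy model under loud assumptions: each plan's total work (`keysToScan`) is given up front, whereas a real plan only discovers it by running; the plan names and numbers are hypothetical; and a linear search stands in for the priority queue.

```javascript
// Race plans against each other, one scanned key at a time.
function racePlans(plans) {
  const state = plans.map((p) => ({ ...p, nscanned: 0 }));
  for (;;) {
    // Pick the plan with the lowest nscanned; ties go to the earlier plan.
    const next = state.reduce((best, p) => (p.nscanned < best.nscanned ? p : best));
    next.nscanned += 1;
    if (next.nscanned >= next.keysToScan) {
      // The first plan to finish wins; the others are abandoned.
      return { winner: next.index, nscanned: next.nscanned, state };
    }
  }
}

const race = racePlans([
  { index: "x_1", keysToScan: 2000 }, // hypothetical cost
  { index: "y_1", keysToScan: 500 },  // hypothetical cost
]);
console.log(race.winner, race.nscanned); // prints "y_1 500"
```

Because progress is always made on the plan with the lowest nscanned, the losing plan in this sketch is never advanced beyond the winner's final count.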
We only allow plans to compete in the initial query. In getMore, we continue reading from the index cursor established by the initial query.
{Pattern: {x:'gt bound', y:'lt bound'}, Index: {y:1}, nscanned: 500}
Indexes added / removed
Thanks!
Feature Requests: jira.mongodb.org
Support: groups.google.com/group/mongodb-user