Indices, Query Optimizer, Performance Tuning. Aaron Staple, aaron@10gen.com
What is an index? A set of references to your documents, efficiently ordered by key
Documents: {x:0.5,y:0.5} {x:2,y:0.5} {x:5,y:2} {x:-4,y:10} {x:3,y:'f'}
Index {x:1} orders the references by x; index {y:1} orders the same references by y.
How is an index stored? B-tree {x:2} {x:3} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 x>=5 x<0 {x:-4} {x:1}
What if I have multiple indices? Document { a:3, b:'x', c:[1,2,3] } with indices {a:1}, {b:1}, {c:1}, {d:1} produces index keys {a:3}; {b:'x'}; {c:1}, {c:2}, {c:3}; {d:null}
How does a simple query work? Tree traversal {x:2} {x:3} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 x>=5 x<0 {x:-4} {x:1}
Simple document lookup
db.c.findOne( {_id:2} ), using index {_id:1}
db.c.find( {x:2} ), using index {x:1}
db.c.find( {x:{$in:[2,3]}} ), using index {x:1}
db.c.find( {'x.a':1} ), using index {'x.a':1}
  Matches {_id:1,x:{a:1}}
db.c.find( {x:{a:1}} ), using index {x:1}
  Matches {_id:1,x:{a:1}}, but not {_id:2,x:{a:1,b:2}}
QUESTION: What about db.c.find( {$where:"this.x == this.y"} ), using index {x:1}?
Indices cannot be used for $where type queries, but if there are non-where elements in the query then indices can be used for the non-where elements.
How does a range query work? Tree traversal + scan: find({x:{$gte:3,$lte:5}}) {x:2} {x:3} {x:4} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 {x:6} x>=5 x<0 {x:-4} {x:1}
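The "traversal + scan" idea can be sketched in plain JavaScript. This is only an illustration under simplifying assumptions: a sorted array stands in for the leaf level of the B-tree, and `lowerBound` and `rangeScan` are hypothetical helper names, not MongoDB functions.

```javascript
// Sorted keys stand in for the leaf level of a B-tree on {x:1}.
const keys = [-4, 0.5, 1, 2, 3, 4, 5, 6];

// "Traversal": binary search for the index of the first key >= target.
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// "Scan": walk forward from the lower bound until a key exceeds the upper bound.
function rangeScan(sortedKeys, gte, lte) {
  const out = [];
  for (let i = lowerBound(sortedKeys, gte); i < sortedKeys.length && sortedKeys[i] <= lte; i++) {
    out.push(sortedKeys[i]);
  }
  return out;
}

// Equivalent in spirit to find({x:{$gte:3,$lte:5}}).
console.log(rangeScan(keys, 3, 5)); // [ 3, 4, 5 ]
```

The cost is one logarithmic search plus a scan proportional to the number of keys inside the range, which is why narrow ranges are cheap.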
Document range scan
db.c.find( {x:{$gt:2}} ), using index {x:1}
db.c.find( {x:{$gt:2,$lt:5}} ), using index {x:1}
db.c.find( {x:/^a/} ), using index {x:1}
QUESTION: What about db.c.find( {x:/a/} ), using index {x:1}?
The letter 'a' can appear anywhere in a matching string, so lexicographic ordering on strings won't help.  However, we can use the index to restrict the scan to documents where x is a string (i.e. not a number) or x is the regular expression /a/.
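Why an anchored regex such as /^a/ can use the index while /a/ cannot is easier to see when the prefix is turned into a key range. The sketch below is an assumption-laden illustration, not MongoDB's implementation: `prefixBounds` is a hypothetical helper, and it ignores the edge case where the last character cannot be incremented.

```javascript
// Hypothetical helper: turn a literal prefix (as in /^a/) into a half-open key range.
function prefixBounds(prefix) {
  if (prefix.length === 0) throw new Error("an empty prefix matches every string");
  const last = prefix.charCodeAt(prefix.length - 1);
  // Increment the final character to get the first string past the prefix range.
  // Assumption: the last code unit is not 0xFFFF; real code must handle that case.
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { gte: prefix, lt: upper };
}

const strings = ["apple", "avocado", "banana", "cherry", "grape"]; // already sorted
const { gte, lt } = prefixBounds("a");

// Every string starting with "a" is contiguous in sorted order: "a" <= s < "b".
const prefixMatches = strings.filter((s) => s >= gte && s < lt);
console.log(gte, lt, prefixMatches); // a b [ 'apple', 'avocado' ]
```

Note that "banana" and "grape" match /a/ but sit outside this range, which is why the unanchored pattern cannot be narrowed to a contiguous slice of the index.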
Other operations
db.c.count( {x:2} ) using index {x:1}
db.c.distinct( {x:2} ) using index {x:1}
db.c.update( {x:2}, {x:3} ) using index {x:1}
db.c.remove( {x:2} ) using index {x:1}
QUESTION: What about db.c.update( {x:2}, {$inc:{x:3}} ), using index {x:1}?
Older versions of MongoDB didn't support modifiers on indexed fields, but this is now supported.
Missing fields
db.c.find( {x:null} ), using index {x:1}
  Matches {_id:5}
  Matches {_id:5,x:null}
QUESTION: What about db.c.find( {x:{$exists:true}} ), using index {x:1}?
The index is not currently used, though we will fix this in MongoDB 1.6.
Array matching
All the following match {_id:6,x:[2,10]} and use index {x:1}
db.c.find( {x:2} )
db.c.find( {x:10} )
db.c.find( {x:{$gt:5}} )
db.c.find( {x:[2,10]} )
db.c.find( {x:{$in:[2,5]}} )
QUESTION: What about db.c.find( {x:{$all:[2,10]}} )?
The index will be used to look up all documents matching {x:2}.
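One way to picture array matching is that an array field contributes one index entry per element. The following sketch assumes that model; `indexEntries` is a hypothetical name, and whole-array equality such as {x:[2,10]} is deliberately left out.

```javascript
// Hypothetical sketch: one index entry per array element (or a single entry for a scalar).
function indexEntries(doc, field) {
  const value = doc[field];
  const elements = Array.isArray(value) ? value : [value];
  return elements.map((key) => ({ key, id: doc._id }));
}

const doc = { _id: 6, x: [2, 10] };
const entries = indexEntries(doc, "x");

// The document matches a query if any one of its entries satisfies the predicate.
const matchesEq2 = entries.some((e) => e.key === 2);              // find({x:2})
const matchesGt5 = entries.some((e) => e.key > 5);                // find({x:{$gt:5}})
const matchesIn = entries.some((e) => [2, 5].includes(e.key));    // find({x:{$in:[2,5]}})

console.log(entries.length, matchesEq2, matchesGt5, matchesIn); // 2 true true true
```

Under this model an $all query can start from the entries for one element (here 2) and then check the remaining elements on each candidate document.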
What is a compound index? {x:2,y:3} {x:1,y:5} {x:2,y:9} {x:3,y:1} {x:1,y:1}
How are bounds determined for a compound index? find( {x:{$gte:2,$lte:4},y:6} ) {x:3,y:1} {x:2,y:6} {x:3,y:7} {x:3.5,y:6} {x:2,y:3} {x:4,y:6} {x:1,y:5} {x:5,y:6} {x:1,y:1}
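The bounds for this query can be worked through by hand. The sketch below is an illustration only: it sorts the slide's documents the way an index {x:1,y:1} would order them and assumes the scan covers everything between the two compound bounds, as the deck describes for this era of MongoDB.

```javascript
// Documents from the slide, treated as (x, y) index entries.
const docs = [
  { x: 3, y: 1 }, { x: 2, y: 6 }, { x: 3, y: 7 }, { x: 3.5, y: 6 }, { x: 2, y: 3 },
  { x: 4, y: 6 }, { x: 1, y: 5 }, { x: 5, y: 6 }, { x: 1, y: 1 },
];

// Order by x, then y: the ordering a compound index {x:1,y:1} maintains.
const compare = (a, b) => a.x - b.x || a.y - b.y;
const index = [...docs].sort(compare);

// Bounds for find({x:{$gte:2,$lte:4}, y:6}).
const lower = { x: 2, y: 6 };
const upper = { x: 4, y: 6 };

// Every entry between the bounds is scanned; only entries with y == 6 match.
const scanned = index.filter((e) => compare(e, lower) >= 0 && compare(e, upper) <= 0);
const matched = scanned.filter((e) => e.y === 6);

console.log(scanned.length, matched.length); // 5 3
```

Five entries are scanned to return three, because {x:3,y:1} and {x:3,y:7} fall between the bounds without satisfying y:6.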
How does an ordered range query work? Simple range scan if index already ensures desired ordering: find( {x:2} ).sort( {y:1} ) {x:2,y:3} {x:1,y:5} {x:2,y:9} {x:3,y:1} {x:1,y:1}
How does an ordered range query work? Otherwise, in-memory sort of matching documents: find( {x:2} ).sort( {y:1} ) {x:2,y:3} {x:2,y:9} {x:1,y:5} {x:2,y:3} {x:2,y:9} … {x:3,y:1} {x:1}
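The difference between the two cases can be sketched with sorted arrays standing in for the indexes. This is an illustration with made-up document order, not server code; it relies on JavaScript's stable `Array.prototype.sort` (ES2019 and later).

```javascript
const docs = [{ x: 2, y: 9 }, { x: 1, y: 5 }, { x: 2, y: 3 }, { x: 3, y: 1 }, { x: 1, y: 1 }];
const isSortedByY = (arr) => arr.every((d, i) => i === 0 || arr[i - 1].y <= d.y);

// Index {x:1,y:1}: entries with equal x are adjacent and already ordered by y.
const compoundIndex = [...docs].sort((a, b) => a.x - b.x || a.y - b.y);
const fromCompound = compoundIndex.filter((d) => d.x === 2);

// Index {x:1}: entries with equal x are adjacent, but their y order is not guaranteed.
const singleIndex = [...docs].sort((a, b) => a.x - b.x);
const fromSingle = singleIndex.filter((d) => d.x === 2);

// Without the compound index, the matches must be sorted in memory.
const sortedInMemory = [...fromSingle].sort((a, b) => a.y - b.y);

console.log(isSortedByY(fromCompound), isSortedByY(fromSingle), isSortedByY(sortedInMemory));
// true false true
```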
Document ordering
db.c.find( {} ).sort( {x:1} ), using index {x:1}
db.c.find( {} ).sort( {x:-1} ), using index {x:1}
db.c.find( {x:{$gt:4}} ).sort( {x:-1} ), using index {x:1}
db.c.find( {} ).sort( {'x.a':1} ), using index {'x.a':1}
QUESTION: What about db.c.find( {y:1} ).sort( {x:1} ), using index {x:1}?
The index will be used to ensure ordering, provided there is no better index.
Compound indices and ordering
db.c.find( {x:10,y:20} ), using index {x:1,y:1}
db.c.find( {x:10,y:20} ), using index {x:1,y:-1}
db.c.find( {x:{$in:[10,20]},y:20} ), using index {x:1,y:1}
db.c.find().sort( {x:1,y:1} ), using index {x:1,y:1}
db.c.find().sort( {x:-1,y:1} ), using index {x:1,y:-1}
db.c.find( {x:10} ).sort( {y:1} ), using index {x:1,y:1}
QUESTION: What about db.c.find( {y:10} ).sort( {x:1} ), using index {x:1,y:1}?
The index will be used to ensure ordering, provided no better index is available.
What if we negate a query? find({x:{$ne:2}}) {x:2} {x:1} {x:2} {x:3} {x:1}
When indices are less helpful
db.c.find( {x:{$ne:1}} )
db.c.find( {x:{$mod:[10,1]}} )
  Uses index {x:1} to scan numbers only
db.c.find( {x:{$not:/a/}} )
db.c.find( {x:{$gte:0,$lte:10},y:5} ) using index {x:1,y:1}
  Currently must scan all elements from {x:0,y:5} to {x:10,y:5}, but some improvements may be possible
db.c.find( {$where:'this.x == 5'} )
QUESTION: What about db.c.find( {x:{$not:/^a/}} ), using index {x:1}?
The index is not used currently, but will be used in MongoDB 1.6.
How is an index chosen? find( {x:2,y:3} ) {x:2,y:1} {y:3,x:1} {x:2,y:3} {x:2,y:9} {y:3,x:2} {y:9,x:2} {x:1,y:3} {y:1,x:2} {x:1} {y:1} √ {x:2,y:3} {x:2,y:1} {x:2,y:9} {y:3,x:2} {y:3,x:1}
Query pattern matching Very simple algorithm, few complaints so far find({x:1}) find({x:2}) find({x:100}) find({x:{$gt:4}}) find({x:{$gte:6}}) find({x:1,y:2}) find({x:{$gt:4,$lte:10}}) find({x:{$gte:6,$lte:400}}) find({x:1}).sort({y:1})
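The grouping on this slide suggests that queries are compared by shape rather than by value. A rough sketch of that idea follows; `fieldShape` and `queryPattern` are hypothetical names, and the classification rules are an assumption made for illustration rather than the optimizer's actual algorithm.

```javascript
// Hypothetical sketch: classify each field's constraint, ignoring the actual values.
function fieldShape(condition) {
  const isOperatorDoc =
    condition !== null && typeof condition === "object" && !Array.isArray(condition) &&
    Object.keys(condition).some((k) => k.startsWith("$"));
  if (!isOperatorDoc) return "equality";
  const hasLower = "$gt" in condition || "$gte" in condition;
  const hasUpper = "$lt" in condition || "$lte" in condition;
  if (hasLower && hasUpper) return "both bounds";
  if (hasLower) return "lower bound";
  if (hasUpper) return "upper bound";
  return "other";
}

// Two queries with the same pattern string could share a remembered index choice.
function queryPattern(query, sort = {}) {
  const fields = Object.keys(query).sort().map((f) => `${f}:${fieldShape(query[f])}`);
  const order = Object.keys(sort).map((f) => `${f}:${sort[f]}`);
  return `find[${fields.join(",")}] sort[${order.join(",")}]`;
}

console.log(queryPattern({ x: 1 }) === queryPattern({ x: 100 }));                 // same pattern
console.log(queryPattern({ x: { $gt: 4 } }) === queryPattern({ x: { $gte: 6 } })); // same pattern
console.log(queryPattern({ x: 1 }) === queryPattern({ x: 1 }, { y: 1 }));          // different
```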
Query optimizer
In charge of picking which index to use for a query/count/update/delete/etc
Usually it does a good job, but if you know what you're doing you can override it
db.c.find( {x:2,y:3} ).hint( {y:1} )
  Use index {y:1} and avoid trying {x:1}
As your data changes, different indices may be chosen.  Ordering requirements should be made explicit using sort().
QUESTION: How can you force a full collection scan instead of using indices?
db.c.find( {x:2,y:3} ).hint( {$natural:1} ) to bypass indices
Geospatial indices
db.c.find( {a:[50,50]} ) using index {a:'2d'}
db.c.find( {a:{$near:[50,50]}} ) using index {a:'2d'}
  Results are sorted closest to farthest
db.c.find( {a:{$within:{$box:[[40,40],[60,60]]}}} ) using index {a:'2d'}
db.c.find( {a:{$within:{$center:[[50,50],10]}}} ) using index {a:'2d'}
db.c.find( {a:{$near:[50,50]},b:2} ) using index {a:'2d',b:1}
QUESTION: Most queries can be performed with or without an index.  Is this true of geospatial queries?
No.  A geospatial query requires an index.
How does an insert work? Tree traversal and insert, split if necessary {x:3.5} {x:2} {x:3} {x:4} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 {x:6} x>=5 x<0 {x:-4} {x:1}
What if my keys are increasing? You’ll always insert on the right {x:2} {x:3} {x:4} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 {x:6} x>=5 x<0 {x:7} {x:-4} {x:8} {x:1} {x:9}
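A small simulation shows why increasing keys always land on the right-hand edge. A sorted array stands in for the tree's key order here, and `insertPosition` is a hypothetical helper, so this is a sketch of the ordering argument rather than of B-tree node splitting.

```javascript
// Position at which a key would be inserted to keep the array sorted.
function insertPosition(sortedKeys, key) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] <= key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

const sortedKeys = [];
const appendedAtRight = [];
for (const key of [1, 2, 3, 4, 5]) {               // monotonically increasing keys
  const pos = insertPosition(sortedKeys, key);
  appendedAtRight.push(pos === sortedKeys.length); // true when the key goes at the right edge
  sortedKeys.splice(pos, 0, key);
}

console.log(appendedAtRight.every(Boolean));   // true
console.log(insertPosition(sortedKeys, 2.5));  // 2: a non-increasing key lands mid-structure
```

Because only the rightmost part of the structure is touched, a workload with increasing keys (timestamps, for example) needs just that region of the index to stay hot in memory.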
Why is RAM important? RAM is basically used as an LRU (least recently used) disk cache. Whole index in RAM vs. portion of index in RAM
Creating an index
{_id:1} index created automatically
  For non-capped collections
db.c.ensureIndex( {x:1} )
  Can create an index at any time, even when you already have plenty of data in your collection
  Creating an index will block MongoDB unless you specify background index creation
db.c.ensureIndex( {x:1}, {background:true} )
  Background index creation still impacts performance; run it at off-peak times if you're concerned
QUESTION: Can an index be removed during background creation?
Not at this time.
Unique key constraints
db.c.ensureIndex( {x:1}, {unique:true} )
  Don't allow {_id:10,x:2} and {_id:11,x:2}
  Don't allow {_id:12} and {_id:13} (both match {x:null})
What if duplicates exist before index is created?
  Normally index creation fails and the index is removed
db.c.ensureIndex( {x:1}, {unique:true,dropDups:true} )
QUESTION: In dropDups mode, which duplicates will be removed?
The first document according to the collection's "natural order" will be preserved.
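The rule that two documents lacking the field collide under a unique index follows from treating a missing field as the key null. Here is a minimal sketch of that check; `uniqueKey` and `uniqueViolations` are hypothetical names and the code only models the rule described on the slide.

```javascript
// Hypothetical sketch: a missing field is indexed under the key null.
function uniqueKey(doc, field) {
  return field in doc ? doc[field] : null;
}

// Returns [firstId, laterId] pairs that would violate a unique index on `field`.
function uniqueViolations(docs, field) {
  const seen = new Map(); // key -> _id of the first document holding it
  const violations = [];
  for (const doc of docs) {
    const key = uniqueKey(doc, field);
    if (seen.has(key)) violations.push([seen.get(key), doc._id]);
    else seen.set(key, doc._id);
  }
  return violations;
}

const docs = [{ _id: 10, x: 2 }, { _id: 11, x: 2 }, { _id: 12 }, { _id: 13 }];
console.log(uniqueViolations(docs, "x")); // [ [ 10, 11 ], [ 12, 13 ] ]
```

In each pair the first _id is the document that arrived first in iteration order, which mirrors the "first in natural order is preserved" behaviour described for dropDups.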
Cleaning up an index
db.system.indexes.find( {ns:'db.c'} )
db.c.dropIndex( {x:1} )
db.c.dropIndexes()
db.c.reIndex()
  Rebuilds all indices, removing index cruft that has built up over large numbers of updates and deletes.  Index cruft will not exist in MongoDB 1.6, so this command will be deprecated.
QUESTION: Why would you want to drop an index?
See next slide…
Limits and tradeoffs
Max 40 indices per collection
Logically equivalent indices are not prevented (e.g. {x:1} and {x:-1})
Indices can improve speed of queries, but make inserts slower
A more specific index {a:1,b:1,c:1} can be more helpful than a less specific index {a:1}, but the more specific index will be larger, thus harder to fit in RAM
QUESTION: Do indices make updates slower?  How about deletes?
It depends: finding your document might be faster, but if any indexed fields are changed the indices must be updated.
Mongod log output
query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 }  nreturned:1 157ms
query test.$cmd ntoreturn:1 command: { count: "c", query: { type: 0.0, i: { $gt: 99000.0 } }, fields: {} } reslen:64 256ms
query:{ query: {}, orderby: { i: 1.0 } } ...
query test.c ntoreturn:0 exception  1378ms ...
User Exception 10128:too much key data for sort() with no index.  add an index or specify a smaller limit
query test.c ntoreturn:0 reslen:4783 nscanned:100501 { query: { type: 500.0 }, orderby: { i: 1.0 } }  nreturned:101 390ms
Occasionally you may see a slow operation as a result of disk activity or mongo cleaning things up; some messages about slow ops are spurious
Keep this in mind when running the same op a massive number of times, and it appears slow very rarely
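When skimming logs like these, the useful signal is usually the ratio of documents scanned to documents returned. The sketch below extracts those counters from a line in the format shown on the slide; `parseQueryLog` is a hypothetical helper and the regular expressions assume exactly that format.

```javascript
// Hypothetical helper: pull the counters out of a mongod query log line
// in the format shown on the slide.
function parseQueryLog(line) {
  const num = (re) => {
    const m = line.match(re);
    return m ? Number(m[1]) : null;
  };
  return {
    nscanned: num(/nscanned:(\d+)/),
    nreturned: num(/nreturned:(\d+)/),
    millis: num(/(\d+)ms\s*$/),
  };
}

const line =
  "query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 }  nreturned:1 157ms";
const stats = parseQueryLog(line);

// A large nscanned relative to nreturned suggests a missing or unhelpful index.
const ratio = stats.nscanned / Math.max(stats.nreturned, 1);
console.log(stats, ratio);
```

For the first log line on the slide, 100000 documents are scanned to return one, which is the signature of a full collection scan.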
Profiling
Record same info as with log messages, but in a database collection
> db.system.profile.find()
{"ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)" , "info" : "query test.$cmd ntoreturn:1 reslen:66 nscanned:0  query: { profile: 2 }  nreturned:1 bytes:50" , "millis" : 0}
...
> db.system.profile.find( { info: /test.foo/ } )
> db.system.profile.find( { millis : { $gt : 5 } } )
> db.system.profile.find().sort({$natural:-1})
Enable explicitly using levels (0:off, 1:slow ops (>100ms), 2:all ops)
> db.setProfilingLevel(2);
{"was" : 0 , "ok" : 1}
> db.getProfilingLevel()
2
> db.setProfilingLevel( 1 , 10 ); // slow means > 10ms
Profiling impacts performance, but not severely
Query explain
> db.c.find( {x:1000,y:0} ).explain()
{
	"cursor" : "BtreeCursor x_1",
	"indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ],
	"nscanned" : 10,
	"nscannedObjects" : 10,
	"n" : 10,
	"millis" : 0,
	"oldPlan" : {
		"cursor" : "BtreeCursor x_1",
		"indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ]
	},
	"allPlans" : [
		{ "cursor" : "BtreeCursor x_1", "indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ] },
		{ "cursor" : "BtreeCursor y_1", "indexBounds" : [ [ { "y" : 0 }, { "y" : 0 } ] ] },
		{ "cursor" : "BasicCursor", "indexBounds" : [ ] }
	]
}
Example 1
> db.c.findOne( {i:99999} )
{ "_id" : ObjectId("4bb962dddfdcf5761c1ec6a3"), "i" : 99999 }
query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 }  nreturned:1 157ms
> db.c.find( {i:99999} ).limit(1).explain()
{
	"cursor" : "BasicCursor",
	"indexBounds" : [ ],
	"nscanned" : 100000,
	"nscannedObjects" : 100000,
	"n" : 1,
	"millis" : 161,
	"allPlans" : [ { "cursor" : "BasicCursor", "indexBounds" : [ ] } ]
}
> db.c.ensureIndex( {i:1} ); // fix: index the queried field
> for( i = 0; i < 100000; ++i ) { db.c.save( {i:i} ); } // how the collection was populated
Example 2
> db.c.count( {type:0,i:{$gt:99000}} )
499
query test.$cmd ntoreturn:1 command: { count: "c", query: { type: 0.0, i: { $gt: 99000.0 } }, fields: {} } reslen:64 256ms
> db.c.find( {type:0,i:{$gt:99000}} ).limit(1).explain()
{
	"cursor" : "BtreeCursor type_1",
	"indexBounds" : [ [ { "type" : 0 }, { "type" : 0 } ] ],
	"nscanned" : 49502,
	"nscannedObjects" : 49502,
	"n" : 1,
	"millis" : 349,
...
> db.c.ensureIndex( {type:1,i:1} ); // fix: compound index covering both fields
> for( i = 0; i < 100000; ++i ) { db.c.save( {type:i%2,i:i} ); } // how the collection was populated
Example 3
> db.c.find().sort( {i:1} )
error: {
	"$err" : "too much key data for sort() with no index.  add an index or specify a smaller limit"
}
> db.c.find().sort( {i:1} ).explain()
JS Error: uncaught exception: error: {
	"$err" : "too much key data for sort() with no index.  add an index or specify a smaller limit"
}
> db.c.ensureIndex( {i:1} ); // fix: index the sort key
> db.c.find().sort( {i:1} ).limit( 1000 ); //alternatively
> for( i = 0; i < 1000000; ++i ) { db.c.save( {i:i} ); } // how the collection was populated
Example 4
> db.c.find( {type:500} ).sort( {i:1} )
{ "_id" : ObjectId("4bba4904dfdcf5761c2f917e"), "i" : 500, "type" : 500 }
{ "_id" : ObjectId("4bba4904dfdcf5761c2f9566"), "i" : 1500, "type" : 500 }
...
query test.c ntoreturn:0 reslen:4783 nscanned:100501 { query: { type: 500.0 }, orderby: { i: 1.0 } }  nreturned:101 390ms
> db.c.find( {type:500} ).sort( {i:1} ).explain()
{
	"cursor" : "BtreeCursor i_1",
	"indexBounds" : [ [ { "i" : { "$minElement" : 1 } }, { "i" : { "$maxElement" : 1 } } ] ],
	"nscanned" : 1000000,
	"nscannedObjects" : 1000000,
	"n" : 1000,
	"millis" : 5388,
...
> db.c.ensureIndex( {type:1,i:1} ); // fix: compound index that filters on type and orders by i
> for( i = 0; i < 1000000; ++i ) { db.c.save( {i:i,type:i%1000} ); } // how the collection was populated
Questions?
Get involved: www.mongodb.org (downloads, user group, chat room)
Follow @mongodb
Upcoming events: www.mongodb.org/display/DOCS/Events
  SF MongoDB office hours, Mondays 4-6pm at Epicenter Café
  SF MongoDB meetup, May 17 at Engine Yard
Commercial support: www.10gen.com
jobs@10gen.com

 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 

MongoDB's index and query optimizer

  • 1. Indices, Query Optimizer, Performance Tuning. Aaron Staple, aaron@10gen.com
  • 2. What is an index? A set of references to your documents, efficiently ordered by key {x:0.5,y:0.5} {x:2,y:0.5} {x:5,y:2} {x:-4,y:10} {x:3,y:'f'}
  • 3. What is an index? A set of references to your documents, efficiently ordered by key {x:1} {x:0.5,y:0.5} {x:2,y:0.5} {x:5,y:2} {x:-4,y:10} {x:3,y:'f'}
  • 4. What is an index? A set of references to your documents, efficiently ordered by key {y:1} {x:0.5,y:0.5} {x:2,y:0.5} {x:5,y:2} {x:-4,y:10} {x:3,y:'f'}
  • 5. How is an index stored? B-tree {x:2} {x:3} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 x>=5 x<0 {x:-4} {x:1}
  • 6. What if I have multiple indices? {c:1} {a:3} {c:2} {c:3} {b:'x'} {d:null} { a:3, b:'x', c:[1,2,3] } {a:1} {c:1} {b:1} {d:1}
  • 7. How does a simple query work? Tree traversal {x:2} {x:3} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 x>=5 x<0 {x:-4} {x:1}
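The tree traversal on this slide can be approximated with a binary search over a sorted list of keys. The sketch below is not MongoDB's B-tree; it is a simplified stand-in that assumes a flat sorted array, and the `ref` values and the names `indexX`, `lowerBound` and `lookup` are hypothetical.

```javascript
// Hypothetical stand-in for the index {x:1}: entries sorted by key,
// each holding a reference (here a made-up document id).
const indexX = [
  { key: -4, ref: 4 },
  { key: 0.5, ref: 1 },
  { key: 2, ref: 2 },
  { key: 3, ref: 5 },
  { key: 5, ref: 3 },
];

// Binary search: position of the first entry whose key is >= target.
function lowerBound(entries, target) {
  let lo = 0;
  let hi = entries.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (entries[mid].key < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Equality lookup, as in find({x:2}): seek to the key, then collect matches.
function lookup(entries, target) {
  const refs = [];
  for (let i = lowerBound(entries, target); i < entries.length && entries[i].key === target; i++) {
    refs.push(entries[i].ref);
  }
  return refs;
}

console.log(lookup(indexX, 2)); // [ 2 ]
console.log(lookup(indexX, 4)); // []
```

A real B-tree does the same narrowing one node at a time, which is why the lookup cost grows with the depth of the tree rather than with the number of documents.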
  • 8. Simple document lookup db.c.findOne( {_id:2} ), using index {_id:1} db.c.find( {x:2} ), using index {x:1} db.c.find( {x:{$in:[2,3]}} ), using index {x:1} db.c.find( {'x.a':1} ), using index {'x.a':1} Matches {_id:1,x:{a:1}} db.c.find( {x:{a:1}} ), using index {x:1} Matches {_id:1,x:{a:1}}, but not {_id:2,x:{a:1,b:2}} QUESTION: What about db.c.find( {$where:"this.x == this.y"} ), using index {x:1}? Indices cannot be used for $where queries, but if the query also contains non-$where elements, indices can be used for those elements.
  • 9. How does a range query work? Tree traversal + scan: find({x:{$gte:3,$lte:5}}) {x:2} {x:3} {x:4} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 {x:6} x>=5 x<0 {x:-4} {x:1}
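The "traversal + scan" idea can be sketched as: seek to the lower bound, then walk forward until the upper bound is passed. This is an illustration over a flat sorted array using the key values from the slide, not MongoDB's implementation; `keys`, `lowerBound` and `rangeScan` are hypothetical names.

```javascript
// Keys of the index {x:1} as drawn on the slide, in sorted order.
const keys = [-4, 0.5, 1, 2, 3, 4, 5, 6];

// Position of the first key >= target (the "tree traversal" step).
function lowerBound(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// find({x:{$gte:min,$lte:max}}): seek once, then scan adjacent keys.
function rangeScan(sorted, min, max) {
  const out = [];
  for (let i = lowerBound(sorted, min); i < sorted.length && sorted[i] <= max; i++) {
    out.push(sorted[i]);
  }
  return out;
}

console.log(rangeScan(keys, 3, 5)); // [ 3, 4, 5 ]
```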
  • 10. Document range scan db.c.find( {x:{$gt:2}} ), using index {x:1} db.c.find( {x:{$gt:2,$lt:5}} ), using index {x:1} db.c.find( {x:/^a/} ), using index {x:1} QUESTION: What about db.c.find( {x:/a/} ), using index {x:1}? The letter ‘a’ can appear anywhere in a matching string, so lexicographic ordering on strings won’t help. However, we can use the index to restrict the scan to the range of documents where x is a string (i.e., not a number) or x is the regular expression /a/.
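One way to see why an anchored regular expression such as /^a/ can use the index while /a/ cannot: a prefix corresponds to a contiguous range of sorted strings. The sketch below assumes plain code-unit string ordering and a non-empty prefix; `prefixBounds` and the sample strings are hypothetical, and a `filter` stands in for the seek-and-scan a real index would do.

```javascript
// Turn a literal prefix into a half-open string range [min, max).
// Assumes the last character is not the maximum code unit.
function prefixBounds(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { min: prefix, max: upper };
}

// Sample string keys, already in sorted (code-unit) order.
const sortedStrings = ["Zed", "aardvark", "apple", "banana", "cab"];

const { min, max } = prefixBounds("a");
const hits = sortedStrings.filter((s) => s >= min && s < max);

console.log(min, max); // a b
console.log(hits); // [ 'aardvark', 'apple' ]

// For the unanchored /a/, matches are scattered, so every string is tested.
const unanchored = sortedStrings.filter((s) => /a/.test(s));
console.log(unanchored.length); // 4
```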
  • 11. Other operations db.c.count( {x:2} ) using index {x:1} db.c.distinct( {x:2} ) using index {x:1} db.c.update( {x:2}, {x:3} ) using index {x:1} db.c.remove( {x:2} ) using index {x:1} QUESTION: What about db.c.update( {x:2}, {$inc:{x:3}} ), using index {x:1}? Older versions of MongoDB didn’t support modifiers on indexed fields, but this is now supported.
  • 12. Missing fields db.c.find( {x:null} ), using index {x:1} Matches {_id:5} Matches {_id:5,x:null} QUESTION: What about db.c.find( {x:{$exists:true}} ), using index {x:1}? The index is not currently used, though we will fix this in MongoDB 1.6.
  • 13. Array matching All the following match {_id:6,x:[2,10]} and use index {x:1} db.c.find( {x:2} ) db.c.find( {x:10} ) db.c.find( {x:{$gt:5}} ) db.c.find( {x:[2,10]} ) db.c.find( {x:{$in:[2,5]}} ) QUESTION: What about db.c.find( {x:{$all:[2,10]}} )? The index will be used to look up all documents matching {x:2}.
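Array matching works because an array value contributes one index entry per element. A minimal sketch of that idea follows, assuming numeric keys only; `buildIndex` and the second document are hypothetical additions for illustration.

```javascript
// Build a sorted list of {key, ref} entries for one field.
// An array value yields one entry per element.
function buildIndex(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push({ key, ref: doc._id });
  }
  return entries.sort((a, b) => a.key - b.key);
}

const docs = [
  { _id: 6, x: [2, 10] }, // the document from the slide
  { _id: 7, x: 5 },       // hypothetical extra document
];
const index = buildIndex(docs, "x");

console.log(index.map((e) => `${e.key}->${e.ref}`).join(" ")); // 2->6 5->7 10->6

// find({x:{$gt:5}}): a document may appear under several keys, so de-duplicate.
const gt5 = [...new Set(index.filter((e) => e.key > 5).map((e) => e.ref))];
console.log(gt5); // [ 6 ]
```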
  • 14. What is a compound index? {x:2,y:3} {x:1,y:5} {x:2,y:9} {x:3,y:1} {x:1,y:1}
  • 15. How are bounds determined for a compound index? find( {x:{$gte:2,$lte:4},y:6} ) {x:3,y:1} {x:2,y:6} {x:3,y:7} {x:3.5,y:6} {x:2,y:3} {x:4,y:6} {x:1,y:5} {x:5,y:6} {x:1,y:1}
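The bounds for this query can be sketched with a compound comparator: keys are ordered by x, then by y, so the scan runs from {x:2,y:6} to {x:4,y:6} and entries in between with a different y are still visited. This is an illustration using the keys from the slide; `compare`, `lo`, `hi` and the counters are hypothetical names, and the linear skip stands in for a real seek.

```javascript
// Order compound keys by x first, then by y.
const compare = (a, b) => a.x - b.x || a.y - b.y;

const index = [
  { x: 3, y: 1 }, { x: 2, y: 6 }, { x: 3, y: 7 }, { x: 3.5, y: 6 }, { x: 2, y: 3 },
  { x: 4, y: 6 }, { x: 1, y: 5 }, { x: 5, y: 6 }, { x: 1, y: 1 },
].sort(compare);

// find({x:{$gte:2,$lte:4},y:6}) scans between these two compound keys.
const lo = { x: 2, y: 6 };
const hi = { x: 4, y: 6 };

let nscanned = 0;
const matches = [];
for (const key of index) {
  if (compare(key, lo) < 0) continue; // before the lower bound (a real index seeks here)
  if (compare(key, hi) > 0) break;    // past the upper bound
  nscanned++;
  if (key.y === 6) matches.push(key);
}

console.log(nscanned, matches.length); // 5 3
```

The gap between entries scanned (5) and entries matched (3) is the cost of a range on the leading field followed by an equality on the second.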
  • 16. How does an ordered range query work? Simple range scan if index already ensures desired ordering: find( {x:2} ).sort( {y:1} ) {x:2,y:3} {x:1,y:5} {x:2,y:9} {x:3,y:1} {x:1,y:1}
  • 17. How does an ordered range query work? Otherwise, in-memory sort of matching documents: find( {x:2} ).sort( {y:1} ) {x:2,y:3} {x:2,y:9} {x:1,y:5} {x:2,y:3} {x:2,y:9} … {x:3,y:1} {x:1}
  • 18. Document ordering db.c.find( {} ).sort( {x:1} ), using index {x:1} db.c.find( {} ).sort( {x:-1} ), using index {x:1} db.c.find( {x:{$gt:4}} ).sort( {x:-1} ), using index {x:1} db.c.find( {} ).sort( {'x.a':1} ), using index {'x.a':1} QUESTION: What about db.c.find( {y:1} ).sort( {x:1} ), using index {x:1}? The index will be used to ensure ordering, provided there is no better index.
  • 19. Compound indices and ordering db.c.find( {x:10,y:20} ), using index {x:1,y:1} db.c.find( {x:10,y:20} ), using index {x:1,y:-1} db.c.find( {x:{$in:[10,20]},y:20} ), using index {x:1,y:1} db.c.find().sort( {x:1,y:1} ), using index {x:1,y:1} db.c.find().sort( {x:-1,y:1} ), using index {x:1,y:-1} db.c.find( {x:10} ).sort( {y:1} ), using index {x:1,y:1} QUESTION: What about db.c.find( {y:10} ).sort( {x:1} ), using index {x:1,y:1}? The index will be used to ensure ordering, provided no better index is available.
  • 20. What if we negate a query? find({x:{$ne:2}}) {x:2} {x:1} {x:2} {x:3} {x:1}
  • 21. When indices are less helpful db.c.find( {x:{$ne:1}} ) db.c.find( {x:{$mod:[10,1]}} ) Uses index {x:1} to scan numbers only db.c.find( {x:{$not:/a/}} ) db.c.find( {x:{$gte:0,$lte:10},y:5} ) using index {x:1,y:1} Currently must scan all elements from {x:0,y:5} to {x:10,y:5}, but some improvements may be possible db.c.find( {$where:'this.x == 5'} ) QUESTION: What about db.c.find( {x:{$not:/^a/}} ), using index {x:1}? The index is not used currently, but will be used in MongoDB 1.6.
  • 22. How is an index chosen? find( {x:2,y:3} ) {x:2,y:1} {y:3,x:1} {x:2,y:3} {x:2,y:9} {y:3,x:2} {y:9,x:2} {x:1,y:3} {y:1,x:2} {x:1} {y:1} √ {x:2,y:3} {x:2,y:1} {x:2,y:9} {y:3,x:2} {y:3,x:1}
  • 23. Query pattern matching Very simple algorithm, few complaints so far find({x:1}) find({x:2}) find({x:100}) find({x:{$gt:4}}) find({x:{$gte:6}}) find({x:1,y:2}) find({x:{$gt:4,$lte:10}}) find({x:{$gte:6,$lte:400}}) find({x:1}).sort({y:1})
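The slide only lists example queries, so the grouping rule below is a guess inferred from those examples rather than MongoDB's actual algorithm: queries are assumed to share a pattern when each field has the same kind of constraint (equality, lower bound, upper bound, or both) and the same sort. `fieldPattern` and `queryPattern` are hypothetical names.

```javascript
// Classify one field's condition by the kind of bound it imposes.
function fieldPattern(cond) {
  if (cond === null || typeof cond !== "object") return "equality";
  const lower = "$gt" in cond || "$gte" in cond;
  const upper = "$lt" in cond || "$lte" in cond;
  if (lower && upper) return "bounded";
  if (lower) return "lower bound";
  if (upper) return "upper bound";
  return "equality";
}

// Reduce a query and sort spec to a string that ignores the literal values.
function queryPattern(query, sort = {}) {
  const fields = Object.keys(query).sort().map((f) => `${f}:${fieldPattern(query[f])}`);
  const order = Object.keys(sort).map((f) => `${f}:${sort[f]}`);
  return `${fields.join(",")}|${order.join(",")}`;
}

console.log(queryPattern({ x: 1 }) === queryPattern({ x: 100 })); // true
console.log(queryPattern({ x: { $gt: 4 } }) === queryPattern({ x: { $gte: 6 } })); // true
console.log(queryPattern({ x: 1 }) === queryPattern({ x: 1 }, { y: 1 })); // false
```

Under this assumption, a plan that worked well for find({x:1}) could be reused for find({x:100}) without re-evaluating every candidate index.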
  • 24. Query optimizer In charge of picking which index to use for a query/count/update/delete/etc Usually it does a good job, but if you know what you’re doing you can override it db.c.find( {x:2,y:3} ).hint( {y:1} ) Use index {y:1} and avoid trying {x:1} As your data changes, different indices may be chosen. Ordering requirements should be made explicit using sort(). QUESTION: How can you force a full collection scan instead of using indices? db.c.find( {x:2,y:3} ).hint( {$natural:1} ) to bypass indices
  • 25. Geospatial indices db.c.find( {a:[50,50]} ) using index {a:'2d'} db.c.find( {a:{$near:[50,50]}} ) using index {a:'2d'} Results are sorted closest to farthest db.c.find( {a:{$within:{$box:[[40,40],[60,60]]}}} ) using index {a:'2d'} db.c.find( {a:{$within:{$center:[[50,50],10]}}} ) using index {a:'2d'} db.c.find( {a:{$near:[50,50]},b:2} ) using index {a:'2d',b:1} QUESTION: Most queries can be performed with or without an index. Is this true of geospatial queries? No. A geospatial query requires an index.
  • 26. How does an insert work? Tree traversal and insert, split if necessary {x:3.5} {x:2} {x:3} {x:4} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 {x:6} x>=5 x<0 {x:-4} {x:1}
  • 27. What if my keys are increasing? You’ll always insert on the right {x:2} {x:3} {x:4} 3<=x<4 4<=x<5 {x:0.5} 2<=x<5 {x:5} 0<=x<1 {x:6} x>=5 x<0 {x:7} {x:-4} {x:8} {x:1} {x:9}
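The "always insert on the right" behaviour can be checked with a small sorted-insert sketch. A flat array stands in for the B-tree here, and `insertSorted` is a hypothetical helper.

```javascript
// Insert key into a sorted array and return the position it landed at.
function insertSorted(keys, key) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] <= key) lo = mid + 1;
    else hi = mid;
  }
  keys.splice(lo, 0, key);
  return lo;
}

const keys = [];
const positions = [];
for (let x = 1; x <= 5; x++) positions.push(insertSorted(keys, x));

// With increasing keys, every insert lands at the current end.
console.log(positions); // [ 0, 1, 2, 3, 4 ]

// A key that is not the largest lands in the middle instead.
console.log(insertSorted(keys, 2.5)); // 2
```

With monotonically increasing keys only the right-hand edge of the structure is touched by inserts, which ties in with the RAM discussion on the next slide.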
  • 28. Why is RAM important? RAM is basically used as an LRU disk cache Whole index in RAM Portion of index in RAM
  • 29. Creating an index {_id:1} index created automatically For non-capped collections db.c.ensureIndex( {x:1} ) Can create an index at any time, even when you already have plenty of data in your collection Creating an index will block MongoDB unless you specify background index creation db.c.ensureIndex( {x:1}, {background:true} ) Background index creation still impacts performance; run it at non-peak times if you’re concerned QUESTION: Can an index be removed during background creation? Not at this time.
  • 30. Unique key constraints db.c.ensureIndex( {x:1}, {unique:true} ) Don’t allow {_id:10,x:2} and {_id:11,x:2} Don’t allow {_id:12} and {_id:13} (both match {x:null}) What if duplicates exist before index is created? Normally index creation fails and the index is removed db.c.ensureIndex( {x:1}, {unique:true,dropDups:true} ) QUESTION: In dropDups mode, which duplicates will be removed? The first document according to the collection’s “natural order” will be preserved.
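The dropDups rule described here (keep the first document in natural order, treat a missing field as null) can be sketched as follows. This is a simulation over an in-memory array, assuming scalar key values; `dropDups` as a function and the array order standing in for natural order are assumptions for illustration.

```javascript
// Keep only the first document seen for each key value.
// Array order stands in for the collection's natural order.
function dropDups(docs, field) {
  const seen = new Set();
  const kept = [];
  for (const doc of docs) {
    const key = field in doc ? doc[field] : null; // a missing field indexes as null
    if (seen.has(key)) continue; // later duplicate: dropped
    seen.add(key);
    kept.push(doc);
  }
  return kept;
}

const docs = [
  { _id: 10, x: 2 },
  { _id: 11, x: 2 },
  { _id: 12 },
  { _id: 13 },
];

console.log(dropDups(docs, "x").map((d) => d._id)); // [ 10, 12 ]
```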
  • 31. Cleaning up an index db.system.indexes.find( {ns:'db.c'} ) db.c.dropIndex( {x:1} ) db.c.dropIndexes() db.c.reIndex() Rebuilds all indices, removing index cruft that has built up over large numbers of updates and deletes. Index cruft will not exist in MongoDB 1.6, so this command will be deprecated. QUESTION: Why would you want to drop an index? See next slide…
  • 32. Limits and tradeoffs Max 40 indices per collection. Logically equivalent indices are not prevented (e.g., {x:1} and {x:-1}). Indices can improve the speed of queries, but make inserts slower. A more specific index {a:1,b:1,c:1} can be more helpful than a less specific index {a:1}, but the more specific index will be larger and thus harder to fit in RAM. QUESTION: Do indices make updates slower? How about deletes? It depends: finding your document might be faster, but if any indexed fields are changed the indices must be updated.
  • 33. Mongod log output query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 } nreturned:1 157ms query test.$cmd ntoreturn:1 command: { count: "c", query: { type: 0.0, i: { $gt: 99000.0 } }, fields: {} } reslen:64 256ms query:{ query: {}, orderby: { i: 1.0 } } ... query test.c ntoreturn:0 exception 1378ms ... User Exception 10128:too much key data for sort() with no index. add an index or specify a smaller limit query test.c ntoreturn:0 reslen:4783 nscanned:100501 { query: { type: 500.0 }, orderby: { i: 1.0 } } nreturned:101 390ms Occasionally may see a slow operation as a result of disk activity or mongo cleaning things up – some messages about slow ops are spurious Keep this in mind when running the same op a massive number of times, and it appears slow very rarely
  • 34. Profiling Record same info as with log messages, but in a database collection > db.system.profile.find() {"ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)" , "info" : "query test.$cmd ntoreturn:1 reslen:66 nscanned:0 <br>query: { profile: 2 } nreturned:1 bytes:50" , "millis" : 0}... > db.system.profile.find( { info: /test.foo/ } ) > db.system.profile.find( { millis : { $gt : 5 } } ) > db.system.profile.find().sort({$natural:-1}) Enable explicitly using levels (0:off, 1:slow ops (>100ms), 2:all ops) > db.setProfilingLevel(2); {"was" : 0 , "ok" : 1} > db.getProfilingLevel() 2 > db.setProfilingLevel( 1 , 10 ); // slow means > 10ms Profiling impacts performance, but not severely
  • 35. Query explain > db.c.find( {x:1000,y:0} ).explain() { "cursor" : "BtreeCursor x_1", "indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ], "nscanned" : 10, "nscannedObjects" : 10, "n" : 10, "millis" : 0, "oldPlan" : { "cursor" : "BtreeCursor x_1", "indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ] }, "allPlans" : [ { "cursor" : "BtreeCursor x_1", "indexBounds" : [ [ { "x" : 1000 }, { "x" : 1000 } ] ] }, { "cursor" : "BtreeCursor y_1", "indexBounds" : [ [ { "y" : 0 }, { "y" : 0 } ] ] }, { "cursor" : "BasicCursor", "indexBounds" : [ ] } ] }
  • 36. Example 1 > db.c.findOne( {i:99999} ) { "_id" : ObjectId("4bb962dddfdcf5761c1ec6a3"), "i" : 99999 } query test.c ntoreturn:1 reslen:69 nscanned:100000 { i: 99999.0 } nreturned:1 157ms > db.c.find( {i:99999} ).limit(1).explain() { "cursor" : "BasicCursor", "indexBounds" : [ ], "nscanned" : 100000, "nscannedObjects" : 100000, "n" : 1, "millis" : 161, "allPlans" : [ { "cursor" : "BasicCursor", "indexBounds" : [ ] } ] } > db.c.ensureIndex( {i:1} ); > for( i = 0; i < 100000; ++i ) { db.c.save( {i:i} ); }
  • 37. Example 2 > db.c.count( {type:0,i:{$gt:99000}} ) 499 query test.$cmd ntoreturn:1 command: { count: "c", query: { type: 0.0, i: { $gt: 99000.0 } }, fields: {} } reslen:64 256ms > db.c.find( {type:0,i:{$gt:99000}} ).limit(1).explain() { "cursor" : "BtreeCursor type_1", "indexBounds" : [ [ { "type" : 0 }, { "type" : 0 } ] ], "nscanned" : 49502, "nscannedObjects" : 49502, "n" : 1, "millis" : 349, ... > db.c.ensureIndex( {type:1,i:1} ); > for( i = 0; i < 100000; ++i ) { db.c.save( {type:i%2,i:i} ); }
  • 38. Example 3 > db.c.find().sort( {i:1} ) error: { "$err" : "too much key data for sort() with no index. add an index or specify a smaller limit" } > db.c.find().sort( {i:1} ).explain() JS Error: uncaught exception: error: { "$err" : "too much key data for sort() with no index. add an index or specify a smaller limit" } > db.c.ensureIndex( {i:1} ); > db.c.find().sort( {i:1} ).limit( 1000 ); //alternatively > for( i = 0; i < 1000000; ++i ) { db.c.save( {i:i} ); }
  • 39. Example 4 > db.c.find( {type:500} ).sort( {i:1} ) { "_id" : ObjectId("4bba4904dfdcf5761c2f917e"), "i" : 500, "type" : 500 } { "_id" : ObjectId("4bba4904dfdcf5761c2f9566"), "i" : 1500, "type" : 500 } ... query test.c ntoreturn:0 reslen:4783 nscanned:100501 { query: { type: 500.0 }, orderby: { i: 1.0 } } nreturned:101 390ms > db.c.find( {type:500} ).sort( {i:1} ).explain() { "cursor" : "BtreeCursor i_1", "indexBounds" : [ [ { "i" : { "$minElement" : 1 } }, { "i" : { "$maxElement" : 1 } } ] ], "nscanned" : 1000000, "nscannedObjects" : 1000000, "n" : 1000, "millis" : 5388, ... > db.c.ensureIndex( {type:1,i:1} ); > for( i = 0; i < 1000000; ++i ) { db.c.save( {i:i,type:i%1000} ); }
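The difference between the two plans in Example 4 can be reproduced in miniature. The sketch below is a scaled-down simulation (10,000 documents and 100 types instead of the slide's 1,000,000 and 1,000), with sorted arrays standing in for the indices {i:1} and {type:1,i:1}; all names are hypothetical.

```javascript
const N = 10000;
const docs = [];
for (let i = 0; i < N; i++) docs.push({ i, type: i % 100 });

// Plan A: walk index {i:1} in order (docs are already in i order)
// and filter on type. Ordering is free, but every entry is scanned.
let scannedA = 0;
const resultA = [];
for (const doc of docs) {
  scannedA++;
  if (doc.type === 50) resultA.push(doc.i);
}

// Plan B: index {type:1,i:1}. Entries for type 50 are contiguous
// and already ordered by i.
const compound = [...docs].sort((a, b) => a.type - b.type || a.i - b.i);
let scannedB = 0;
const resultB = [];
for (const doc of compound) {
  if (doc.type < 50) continue; // a real index would seek straight to the bound
  if (doc.type > 50) break;
  scannedB++;
  resultB.push(doc.i);
}

console.log(scannedA, scannedB, resultA.length); // 10000 100 100
```

Both plans return the same documents in the same order; only the number of index entries visited differs, which is what the nscanned figure in the explain output reflects.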
  • 40. Questions? Get involved www.mongodb.org Downloads, user group, chat room Follow @mongodb Upcoming events www.mongodb.org/display/DOCS/Events SF MongoDB office hours Mondays 4-6pm at Epicenter Café SF MongoDB meetup May 17 at Engine Yard Commercial support www.10gen.com jobs@10gen.com