Indexing and Query Optimizer (Mongo Austin)

Mathias Stearn's presentation at Mongo Austin

  1. Indexing, Query Optimization, the Query Optimizer — MongoAustin

     Mathias Stearn, 10gen Inc.
     mathias@10gen.com, @mathias_mongo
     February 15, 2011

     MongoDB – Indexing and Query Optimiz(ation—er) — MongoAustin

  2. Indexing Basics

       • Indexes are tree-structured sets of references to your documents.
       • The query planner can employ indexes to efficiently enumerate and sort matching documents.

  3. However, indexing strikes people as a gray art

       • As is the case with relational systems, schema design and indexing go hand in hand...
       • ...but you also need to know about your actual (not just predicted) query patterns.

  4. Some indexing generalities

       • A collection may have at most 64 indexes.
       • A query may only use 1 index (except for disjuncts of $or queries).
       • Indexes entail additional work on inserts, updates, deletes.

  5. Creating Indexes

     The _id attribute is always indexed. Additional indexes can be created with ensureIndex():

         // Create an index on the user attribute
         db.collection.ensureIndex({ user : 1 })

         // Create a compound index on
         // the user and email attributes
         db.collection.ensureIndex({ user : 1, email : 1 })

         // Create an index on the tags attribute,
         // will index all values in list
         db.collection.ensureIndex({ tags : 1 })

         // Create a unique index on the user attribute
         db.collection.ensureIndex({user:1}, {unique:true})

         // Create an index in the background.
         db.collection.ensureIndex({user:1}, {background:true})

  6. Index maintenance

         // Drops an index on x
         db.collection.dropIndex({x:1})

         // Drops all indexes except _id
         db.collection.dropIndexes()

         // Rebuild and compact indexes
         db.collection.reIndex()

  7. Indexes are smart about data types and structures

       • Indexes on attributes whose values are of different types in different documents can speed up queries by skipping documents where the relevant attribute isn’t of the appropriate type.
       • Indexes on attributes whose values are lists will index each element, speeding up queries that look into these attributes. (You really want to do this for querying on tags.)

  8. When can indexes be used?

     In short, if you can envision how the index might get used, it probably is. These will all use an index on x:

         db.collection.find( { x : 1 } )
         db.collection.find( { x : { $in : [1,2,3] } } )
         db.collection.find( { x : { $gt : 1 } } )
         db.collection.find( { x : /^a/ } )
         db.collection.count( { x : 2 } )
         db.collection.distinct( "x", { x : 2 } )
         db.collection.find().sort( { x : 1 } )

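
One way to envision why the anchored pattern /^a/ can use an index on x: a fixed prefix corresponds to one contiguous range of a sorted key list, whereas an unanchored pattern has no such range. The plain-JavaScript sketch below only models that idea and is not MongoDB's implementation; sortedKeys, lowerBound, prefixRange and containsScan are hypothetical names chosen for illustration.

```javascript
// Toy model of an index on x: the key values kept in sorted order.
// Illustration only; this is not how MongoDB stores or walks its B-trees.
const sortedKeys = ["aardvark", "apple", "avocado", "banana", "cabbage", "papaya"];

// Binary search: first position whose key is >= target.
function lowerBound(keys, target) {
  let lo = 0, hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < target) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// A prefix match such as /^a/ maps to one contiguous range of sorted keys.
function prefixRange(keys, prefix) {
  const start = lowerBound(keys, prefix);
  let end = start;
  while (end < keys.length && keys[end].startsWith(prefix)) end++;
  // Count the one extra key that was inspected to detect the end of the range.
  const examined = (end - start) + (end < keys.length ? 1 : 0);
  return { matches: keys.slice(start, end), examined };
}

// A substring match such as /a/ has no such range: every key is examined.
function containsScan(keys, fragment) {
  return { matches: keys.filter(k => k.includes(fragment)), examined: keys.length };
}

const anchored = prefixRange(sortedKeys, "a");
const unanchored = containsScan(sortedKeys, "a");
console.log(anchored.examined, unanchored.examined);
```

The anchored lookup stops as soon as it leaves the range of keys starting with the prefix, while the substring lookup has to look at every key.
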
  9. Trickier cases where indexes can be used

         db.collection.find({ x : 1 }).sort({ y : 1 })

     will use an index on y for sorting, if there’s no index on x. (For this sort of case, use a compound index on both x and y, in that order.)

         db.collection.update( { x : 2 } , { x : 3 } )

     will use and update an index on x.

  10. Some array examples

      The following queries will use an index on x, and will match documents whose x attribute is the array [2,10]:

          db.collection.find({ x : 2 })
          db.collection.find({ x : 10 })
          db.collection.find({ x : { $gt : 5 } })
          db.collection.find({ x : [2,10] })
          db.collection.find({ x : { $in : [2,5] } })

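
The array cases above follow from indexing each array element separately. A minimal plain-JavaScript sketch of that idea, assuming a toy index that is just a sorted list of (key, _id) entries; buildIndex and lookup are hypothetical helpers, and lookup walks every entry for simplicity where a real index would seek:

```javascript
// Toy collection: x is sometimes an array, sometimes a single value.
const docs = [
  { _id: 1, x: [2, 10] },
  { _id: 2, x: 7 },
  { _id: 3, x: [1, 3] },
];

// Build one index entry per array element (or one entry for a scalar).
function buildIndex(documents, field) {
  const entries = [];
  for (const doc of documents) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push({ key, id: doc._id });
  }
  return entries.sort((a, b) => a.key - b.key);
}

// Return the distinct _ids having at least one entry that satisfies the predicate.
function lookup(index, predicate) {
  const ids = new Set();
  for (const entry of index) if (predicate(entry.key)) ids.add(entry.id);
  return [...ids].sort((a, b) => a - b);
}

const xIndex = buildIndex(docs, "x");
const eq2 = lookup(xIndex, k => k === 2);             // like { x : 2 }
const eq10 = lookup(xIndex, k => k === 10);           // like { x : 10 }
const gt5 = lookup(xIndex, k => k > 5);               // like { x : { $gt : 5 } }
const in25 = lookup(xIndex, k => [2, 5].includes(k)); // like { x : { $in : [2,5] } }
console.log(eq2, eq10, gt5, in25);
```

The document with x equal to [2,10] is found by each of these lookups because it contributes one entry for 2 and another for 10.
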
  11. Geospatial indexes

      Geospatial indexes are a sort of special case; the operators that can take advantage of them can only be used if the relevant indexes have been created. Some examples:

          // Finds a document with this point for a.
          db.collection.find({ a : [50, 50] })

          // Sorts results by distance.
          db.collection.find({ a : { $near : [50, 50] } })

          db.collection.find({ a : { $within : { $box : [[40,40],[60,60]] } } })
          db.collection.find({ a : { $within : { $center : [[50,50],10] } } })

  12. When indexes cannot be used

        • Many sorts of negations, e.g., $ne, $not.
        • Tricky arithmetic, e.g., $mod.
        • Most regular expressions (e.g., /a/).
        • Expressions in $where clauses don’t take advantage of indexes. Of course, $where clauses are mostly for complex queries that often can’t be indexed anyway, e.g., "where a > b". (If these cases matter to you, you can precompute the match, store it as an additional attribute, index that attribute, and skip the $where clause entirely.)
        • map/reduce can’t take advantage of indexes (the mapping function is opaque to the query optimizer).

      As a rule, if you can’t imagine how an index might be used, it probably can’t!

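
The precomputation suggested for the $where case can be sketched as follows. The field name aGreaterThanB and the helper withPrecomputedFlag are hypothetical, and the shell calls appear only as comments because they need a running server.

```javascript
// Hypothetical helper: compute the comparison once, before the document is saved,
// and store the outcome as an ordinary attribute.
function withPrecomputedFlag(doc) {
  return { ...doc, aGreaterThanB: doc.a > doc.b };
}

const prepared = [
  { a: 5, b: 3 },
  { a: 1, b: 4 },
  { a: 9, b: 9 },
].map(withPrecomputedFlag);

// In the mongo shell the stored flag could then be indexed and queried directly:
//   db.collection.ensureIndex({ aGreaterThanB : 1 })
//   db.collection.find({ aGreaterThanB : true })
// in place of the unindexed form:
//   db.collection.find({ $where : "this.a > this.b" })

// Local stand-in for the indexed query, so the sketch runs without a server.
const matching = prepared.filter(doc => doc.aGreaterThanB);
console.log(matching);
```

The flag has to be recomputed whenever a or b changes, which is the price paid for being able to index it.
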
  13. Never forget about compound indexes

      Whenever you’re querying on multiple attributes, whether as part of the selector document or in a sort(), compound indexes can be used.

  14. Schema/index relationships

      Sometimes the question isn’t “given the shape of these documents, how do I index them?”, but “how might I shape the data so I can take advantage of indexing?”

          // Consider a schema that uses a list of
          // attribute/value pairs:
          db.c.insert({
              product : "SuperDooHickey",
              manufacturer : "Foo Enterprises",
              catalog : [
                  { stock : 50, modtime : "2010-09-02" },
                  { price : 29.95, modtime : "2010-06-14" }
              ]
          });
          db.c.ensureIndex({ catalog : 1 });

          // All attribute queries can use one index.
          db.c.find( { catalog : { stock : { $gt : 0 } } } )

  15. Index sizes

      Of course, indexes take up space. For many interesting databases, real query performance will depend on index sizes, so it’s useful to see these numbers.

        • db.collection.stats() shows indexSizes, the size of each index in the collection.
        • db.stats() includes the total size of all indexes in the database.

  16. explain()

      It’s useful to be able to ensure that your query is doing what you want it to do. For this, we have explain(). Query plans that use an index have cursor type BtreeCursor.

          db.collection.find({x:{$gt:5}}).explain()
          {
              "cursor" : "BtreeCursor x_1",
              ...
              "nscanned" : 100,
              ...
              "n" : 100,
              "millis" : 0,
              ...
          }

  17. explain(), continued

      If the query plan doesn’t use the index, the cursor type will be BasicCursor.

          db.collection.find({x:{$gt:5}}).explain()
          {
              "cursor" : "BasicCursor",
              ...
              "nscanned" : 12345,
              ...
              "n" : 100,
              "millis" : 4,
              ...
          }

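
The gap between nscanned and n in the two explain() outputs can be reproduced with a toy model. This is a simplified sketch in plain JavaScript, not MongoDB's accounting; explainCollectionScan and explainIndexScan are hypothetical functions that only borrow the cursor labels from the output above.

```javascript
// 1000 toy documents whose x values are 0..999 in shuffled (natural) order.
const collection = [];
for (let i = 0; i < 1000; i++) collection.push({ _id: i, x: (i * 7) % 1000 });

// Plan without an index: every document is examined.
function explainCollectionScan(docs, min) {
  let nscanned = 0, n = 0;
  for (const doc of docs) {
    nscanned++;
    if (doc.x > min) n++;
  }
  return { cursor: "BasicCursor", nscanned, n };
}

// Plan with an index on x: seek to the first key > min, then read only that range.
function explainIndexScan(sortedXs, min) {
  let lo = 0, hi = sortedXs.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sortedXs[mid] <= min) lo = mid + 1; else hi = mid;
  }
  const n = sortedXs.length - lo;
  return { cursor: "BtreeCursor x_1", nscanned: n, n };
}

const sortedXs = collection.map(d => d.x).sort((a, b) => a - b);
const withoutIndex = explainCollectionScan(collection, 989);
const withIndex = explainIndexScan(sortedXs, 989);
console.log(withoutIndex, withIndex);
```

Both plans report the same n, but only the indexed plan keeps nscanned equal to it; a large ratio of nscanned to n is the usual sign that an index is missing or unsuitable.
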
  18. Really, compound indexes are important

      Try this at home:

        1. Create a collection with a few tens of thousands of documents having two attributes (let’s call them a and b).
        2. Create a compound index on {a : 1, b : 1}.
        3. Do a db.collection.find({a : constant}).sort({b : 1}).explain().
        4. Note the explain result’s millis.
        5. Drop the compound index.
        6. Create another compound index with the attributes reversed. (This will be a suboptimal compound index.)
        7. Explain the above query again.
        8. The suboptimal index should produce a slower explain result.

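
For readers without a server at hand, the experiment above can be approximated in plain JavaScript. This is a rough sketch under the assumption that a compound index behaves like the documents sorted by its fields in order; it counts entries examined instead of measuring millis, and buildCompoundIndex, scanAB and scanBA are hypothetical names.

```javascript
// Toy data: 20,000 documents with two attributes, a and b.
const rows = [];
for (let i = 0; i < 20000; i++) rows.push({ a: i % 100, b: (i * 37) % 20000 });

// Model a compound index as the documents sorted by (first, second).
function buildCompoundIndex(docs, first, second) {
  return docs.slice().sort((p, q) => p[first] - q[first] || p[second] - q[second]);
}

// Answer find({a : constant}).sort({b : 1}) from an index ordered by (a, b):
// seek to the first entry with that a, then read the contiguous run,
// which is already ordered by b.
function scanAB(index, constant) {
  let lo = 0, hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].a < constant) lo = mid + 1; else hi = mid;
  }
  const results = [];
  let nscanned = 0;
  for (let i = lo; i < index.length && index[i].a === constant; i++) {
    nscanned++;
    results.push(index[i]);
  }
  return { nscanned, results };
}

// Answer the same query from an index ordered by (b, a): the order by b is
// right, but entries with the wanted a are scattered, so every entry is read.
function scanBA(index, constant) {
  const results = [];
  let nscanned = 0;
  for (const entry of index) {
    nscanned++;
    if (entry.a === constant) results.push(entry);
  }
  return { nscanned, results };
}

const good = scanAB(buildCompoundIndex(rows, "a", "b"), 42);
const poor = scanBA(buildCompoundIndex(rows, "b", "a"), 42);
console.log(good.nscanned, poor.nscanned);
```

Both orders return the same documents in the same order; the difference is how many index entries have to be touched to get them.
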
  19. The DB Profiler

      MongoDB includes a database profiler that, when enabled, records the timing measurements and result counts in a collection within the database.

          // Enable the profiler on this database.
          > db.setProfilingLevel(1, 100)
          { "was" : 0, "slowms" : 100, "ok" : 1 }
          > db.foo.find({a: { $mod : [3, 0] } });
          ...
          // See the profiler info.
          > db.system.profile.find()
          { "ts" : "Thu Nov 18 2010 06:46:16 GMT-0500 (EST)",
            "info" : "query test.$cmd ntoreturn:1 command: { count: "foo", query: { a: { $mod: [ 3.0, 0.0 ] } }, fields: {} } reslen:64 406ms",
            "millis" : 406 }

  20. Query Optimizer

        • MongoDB’s query optimizer is empirical, not cost-based.
        • To test query plans, it tries several in parallel, and records the plan that finishes fastest.
        • If a plan’s performance changes over time (e.g., as data changes), the database will reoptimize (i.e., retry all possible plans).

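
The "try several plans and record the winner" idea can be illustrated with a small race between candidate plans. This is a loose sketch, not the server's algorithm: the plans are interleaved one step at a time to stand in for running in parallel, and collectionScanPlan, indexPlan, race, planCache and runQuery are all hypothetical names.

```javascript
// Toy collection of 500 documents with x in 0..49.
const data = Array.from({ length: 500 }, (_, i) => ({ _id: i, x: i % 50 }));

// Toy index on x: key -> list of matching _ids.
const byX = new Map();
for (const doc of data) {
  if (!byX.has(doc.x)) byX.set(doc.x, []);
  byX.get(doc.x).push(doc._id);
}

// Each candidate plan is a generator that yields once per unit of work
// and returns its result when it has finished.
function* collectionScanPlan(target) {
  const out = [];
  for (const doc of data) {
    if (doc.x === target) out.push(doc._id);
    yield; // one document examined
  }
  return out;
}

function* indexPlan(target) {
  const out = [];
  for (const id of byX.get(target) || []) {
    out.push(id);
    yield; // one index entry examined
  }
  return out;
}

// Advance every candidate one step per round; the first to finish wins.
function race(candidates, target) {
  const running = candidates.map(c => ({ name: c.name, it: c.plan(target) }));
  for (;;) {
    for (const r of running) {
      const step = r.it.next();
      if (step.done) return { winner: r.name, result: step.value };
    }
  }
}

// Remember the winner so later queries of the same shape can reuse it.
const planCache = new Map();
function runQuery(target) {
  const outcome = race(
    [{ name: "collection scan", plan: collectionScanPlan },
     { name: "index on x", plan: indexPlan }],
    target
  );
  planCache.set("equality on x", outcome.winner);
  return outcome;
}

const outcome = runQuery(7);
console.log(outcome.winner, outcome.result.length);
```

Reoptimizing, in this picture, would amount to discarding the cached entry and running the race again.
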
  21. Hinting the query plan

      Sometimes, you might want to force the query plan. For this, we have hint().

          // Force the use of an index on attribute x:
          db.collection.find({x: 1, ...}).hint({x:1})

          // Force indexes to be avoided!
          db.collection.find({x: 1, ...}).hint({$natural:1})

  22. Going forward

        • www.mongodb.org: downloads, docs, community
        • mongodb-user@googlegroups.com: mailing list
        • #mongodb on irc.freenode.net
        • try.mongodb.org: web-based shell
        • 10gen is hiring. Email jobs@10gen.com.
        • 10gen offers support, training, and advising services for MongoDB.
