Indexing and Query Optimization Webinar
 

MongoDB supports a wide range of indexing options to enable fast querying of your data. In this talk we'll cover how indexing works, the various indexing options, and the use cases where each might be useful.

Presentation Transcript

  • Indexing and Query Optimization, Kevin Matulef, Thursday, September 6, 2012
  • What’s in store
      • What are indexes?
      • Picking the right indexes
      • Creating indexes in MongoDB
      • Troubleshooting
  • Indexes are the single biggest tunable performance factor in MongoDB.
  • Absent or suboptimal indexes are the most common avoidable MongoDB performance problem.
  • So what problem do indexes solve?
  • How do you find a chicken recipe?
      • An unindexed cookbook might be quite a page turner.
      • Probably not what you want, though.
  • I know, I’ll use an index!
  • Let’s imagine a simple index
      ingredient   page
      aardvark     790
      ...          ...
      beef         190, 191, 205, ...
      ...          ...
      chicken      182, 199, 200, ...
      chorizo      497, ...
      ...          ...
      zucchini     673, 986, ...
  • How do you find a quick chicken recipe?
  • Let’s imagine a compound index
      ingredient   cooking time   page
      ...          ...            ...
      chicken      15 min         182, 200
      chicken      25 min         199
      chicken      30 min         289, 316, 320
      chicken      45 min         290, 291, 354
      ...          ...            ...
  • Consider the ordering of index keys
      Aardvark, 20 min
      Chicken, 15 min
      Chicken, 25 min
      Chicken, 30 min
      Chicken, 45 min
      Zucchini, 45 min
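
The key ordering on this slide can be reproduced with a small sketch. It is only an illustration of how compound keys compare (first field, then second field on ties), not MongoDB's actual implementation; the array `keys` and the function `compareKeys` are names made up for this example.

```javascript
// (ingredient, cooking time in minutes) keys from the slide, deliberately shuffled.
const keys = [
  ["zucchini", 45], ["chicken", 30], ["aardvark", 20],
  ["chicken", 15], ["chicken", 45], ["chicken", 25],
];

// Compare on the first field; fall back to the second field only on a tie.
function compareKeys(a, b) {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  return a[1] - b[1];
}

const ordered = [...keys].sort(compareKeys);
const orderedLabels = ordered.map(([ingredient, minutes]) => `${ingredient}, ${minutes} min`);
console.log(orderedLabels.join(" | "));
```

Because all chicken entries end up adjacent and sorted by time, "chicken recipes under 20 minutes" corresponds to one contiguous slice of the index.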
  • How about a low-calorie chicken recipe?
  • Let’s imagine a 2nd compound index
      ingredient   calories   page
      ...          ...        ...
      chicken      250        199, 316
      chicken      300        289, 291
      chicken      425        320
      ...          ...        ...
  • How about a quick, low-calorie recipe?
  • Let’s imagine a last compound index
      calories   cooking time   page
      ...        ...            ...
      250        25 min         199
      250        30 min         316
      300        25 min         289
      300        45 min         291
      425        30 min         320
      ...        ...            ...
    How do you find dishes from 250 to 300 calories that cook from 30 to 40 minutes?
  • Consider the ordering of index keys
      250 cal, 25 min
      250 cal, 30 min
      300 cal, 25 min
      300 cal, 45 min
      425 cal, 30 min
    How do you find dishes from 250 to 300 calories that cook from 30 to 40 minutes? The 4 index entries in the 250 to 300 calorie range will be scanned, but only 1 (250 cal, 30 min) will match!
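
The "4 scanned, 1 match" count can be checked with a toy scan over the slide's five index entries. This is a simplified sketch, assuming the index is just a sorted array; `rangeScan` is a hypothetical helper, and a real B-tree would seek to the start of the range rather than step through earlier entries.

```javascript
// Index entries from the slide, in index order: [calories, cooking time in minutes, page].
const index = [
  [250, 25, 199],
  [250, 30, 316],
  [300, 25, 289],
  [300, 45, 291],
  [425, 30, 320],
];

// Walk the contiguous calorie range, then filter each scanned entry on cooking time.
function rangeScan(entries, calMin, calMax, timeMin, timeMax) {
  let scanned = 0;
  const pages = [];
  for (const [calories, minutes, page] of entries) {
    if (calories < calMin) continue; // before the range (a B-tree would seek past these)
    if (calories > calMax) break;    // past the end of the range: stop scanning
    scanned += 1;
    if (minutes >= timeMin && minutes <= timeMax) pages.push(page);
  }
  return { scanned, pages };
}

const result = rangeScan(index, 250, 300, 30, 40);
console.log(result); // { scanned: 4, pages: [ 316 ] }
```

The range on the leading field bounds the scan; the range on the second field can only filter entries that were already visited.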
  • Range queries using an index on A, B
      • A is a range :)
      • A is constant, B is a range :)
      • A is constant, order by B :)
      • A is range, B is constant/range :|
      • B is constant/range, A unspecified :(
  • It’s really that straightforward.
  • B-Trees (Bayer & McCreight ’72)
      Queries, inserts, deletes: O(log n)
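
To get a feel for what O(log n) means for lookups, here is a binary search over a sorted array that counts its probes. This is an analogy only, not a B-tree: a B-tree gets the same logarithmic behavior from wide nodes and also supports cheap inserts and deletes, which a plain array does not. The names `lowerBound` and `sortedKeys` are made up for this sketch.

```javascript
// Lower-bound binary search: index of the first key >= target, plus the probe count.
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  let probes = 0;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    probes += 1;
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return { position: lo, probes };
}

// One million sorted keys: 0, 2, 4, ...
const sortedKeys = Array.from({ length: 1000000 }, (_, i) => i * 2);
const found = lowerBound(sortedKeys, 1234568);
console.log(found.position, found.probes);
```

With one million keys the search needs at most 20 probes, versus up to a million comparisons for an unindexed scan.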
  • All this is relevant to MongoDB.
      • MongoDB’s indexes are B-Trees, which are designed for range queries.
      • Generally, the best index for your queries is going to be a compound index.
      • Every additional index slows down inserts & removes, and may slow updates.
  • On to MongoDB!
  • Declaring Indexes
      • db.foo.ensureIndex( { username : 1 } )
      • db.foo.ensureIndex( { username : 1, created_at : -1 } )
  • And managing them....
      > db.system.indexes.find() // db.foo.getIndexes()
      { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" }
      { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" }
      > db.foo.dropIndex( { username : 1 } )
      { "nIndexesWas" : 2 , "ok" : 1 }
  • Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.
      • “_id” index is automatic (except capped collections before 2.2)
      • Each query can use just 1 index (except $or queries).
      • The maximum index key size is 1024 bytes.
  • Indexes get used where you’d expect
      • db.foo.find({x : 42})
      • db.foo.find({x : {$in : [42,52]}})
      • db.foo.find({x : {$lt : 42}})
      • update, findAndModify that select on x
      • count, distinct
      • $match in aggregation
      • left-anchored regexp, e.g. /^Kev/
  • But indexes aren’t always helpful
      • Most negations: $not, $nin, $ne
      • Some corner cases: $mod, $where
      • Matching most regular expressions, e.g. /a/ or /foo/i
  • Advanced Options
  • Arrays: the powerful “multiKey” index
      { title : "Chicken Noodle Soup", ingredients : ["chicken", "noodles"] }
      > db.foo.ensureIndex( { ingredients : 1 } )
      ingredients   page
      chicken       42
      ...           ...
      noodles       42
      ...           ...
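
The table on this slide (one index entry per array element, each pointing back at the same document) can be mimicked in a few lines. This is a toy model under stated assumptions: `recipes`, `page`, and `buildMultikeyIndex` are hypothetical names, and the second recipe is invented to show two documents sharing a key.

```javascript
// Hypothetical documents; `page` stands in for the document's location.
const recipes = [
  { page: 42, title: "Chicken Noodle Soup", ingredients: ["chicken", "noodles"] },
  { page: 57, title: "Chicken Salad", ingredients: ["chicken", "lettuce"] },
];

// Build a toy multikey index: one entry per array element, each pointing at its document.
function buildMultikeyIndex(docs, field) {
  const entries = new Map();
  for (const doc of docs) {
    for (const value of doc[field]) {
      if (!entries.has(value)) entries.set(value, []);
      entries.get(value).push(doc.page);
    }
  }
  // Keep the keys in sorted order, as an index would.
  return new Map([...entries].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)));
}

const ingredientIndex = buildMultikeyIndex(recipes, "ingredients");
console.log([...ingredientIndex.keys()]);    // [ 'chicken', 'lettuce', 'noodles' ]
console.log(ingredientIndex.get("chicken")); // [ 42, 57 ]
```

A document with many array elements produces many index entries, which is worth keeping in mind when indexing large arrays.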
  • Unique Indexes
      • db.foo.ensureIndex( { email : 1 } , {unique : true} )
      > db.foo.insert({email : "matulef@10gen.com"})
      > db.foo.insert({email : "matulef@10gen.com"})
      E11000 duplicate key error ...
  • Sparse Indexes
      • db.foo.ensureIndex( { email : 1 } , {sparse : true} )
      No index entries for docs without “email” field
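
The behavior of the `unique` and `sparse` options can be sketched with a toy in-memory index. `ToyIndex` is a hypothetical class written for this illustration, not part of any MongoDB API; it assumes a non-sparse index stores a missing field under a null key, and it throws a plain Error where MongoDB reports E11000.

```javascript
// Toy single-field index; `unique` and `sparse` mirror the option names on the slides.
class ToyIndex {
  constructor(field, { unique = false, sparse = false } = {}) {
    this.field = field;
    this.unique = unique;
    this.sparse = sparse;
    this.entries = new Map(); // key -> array of document ids
  }

  insert(doc) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, this.field);
    if (!hasField && this.sparse) return;          // sparse: no entry for docs lacking the field
    const key = hasField ? doc[this.field] : null; // otherwise a missing field is keyed as null
    const ids = this.entries.get(key) || [];
    if (this.unique && ids.length > 0) {
      throw new Error(`duplicate key: ${key}`);
    }
    ids.push(doc.id);
    this.entries.set(key, ids);
  }
}

const uniqueIndex = new ToyIndex("email", { unique: true });
uniqueIndex.insert({ id: 1, email: "matulef@10gen.com" });
let duplicateRejected = false;
try {
  uniqueIndex.insert({ id: 2, email: "matulef@10gen.com" });
} catch (err) {
  duplicateRejected = true;
}

const sparseIndex = new ToyIndex("email", { sparse: true });
sparseIndex.insert({ id: 3, name: "no email here" });
console.log(duplicateRejected, sparseIndex.entries.size); // true 0
```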
  • Geospatial Indexes
      { name: "10gen Office", lat_long: [ 52.5184, 13.387 ] }
      > db.locations.ensureIndex( { lat_long : "2d" } )
      > db.locations.find( { lat_long: {$near: [52.53, 13.4] } } )
  • Troubleshooting
  • The Query Optimizer
      • For each “type” of query, MongoDB periodically tries all useful indexes.
      • Aborts the other plans as soon as one plan wins.
      • Winning plan is temporarily cached.
  • Which plan wins? Explain!
      > db.foo.find( { t: { $lt : 40 } } ).explain( )
      {
        "cursor" : "BtreeCursor t_1" ,
        "n" : 42,
        "nscannedObjects" : 42,
        "nscanned" : 42,
        ...
        "millis" : 0,
        ...
      }
      Pay attention to the ratio n/nscanned!
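
A small helper makes the n/nscanned advice concrete. This is a sketch that works on a plain object shaped like the explain output on the slide; `scanEfficiency` is a hypothetical name, the field names `n` and `nscanned` are the ones shown above (other MongoDB versions may report different fields), and the second sample simply reuses the counts from the cookbook example.

```javascript
// Fraction of scanned index entries that were actually returned (1 is ideal).
function scanEfficiency(explainOutput) {
  const { n, nscanned } = explainOutput;
  if (nscanned === 0) return 1; // nothing scanned, nothing wasted
  return n / nscanned;
}

// Values from the slide: every scanned entry was a match.
const fromSlide = { cursor: "BtreeCursor t_1", n: 42, nscannedObjects: 42, nscanned: 42, millis: 0 };
// Made-up sample using the cookbook counts: 4 entries scanned, 1 match.
const wasteful = { n: 1, nscanned: 4 };

console.log(scanEfficiency(fromSlide)); // 1
console.log(scanEfficiency(wasteful));  // 0.25
```

A ratio well below 1 suggests the index bounds are too loose for the query, as in the calories and cooking time example earlier.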
  • Think you know better? Give us a hint
      > db.foo.find( { t: { $lt : 40 } } ).hint( { _id : 1 } )
  • Recording slow queries
      > db.setProfilingLevel( n , slowms )   // slowms defaults to 100 ms
        n=0 profiler off
        n=1 record queries longer than slowms
        n=2 record all queries
      > db.system.profile.find()
  • Operational Tips
  • Background index builds
      db.foo.ensureIndex( { user : 1 } , { background : true } )
      Caveats:
      • still resource-intensive
      • will build in foreground on secondaries
  • Minimizing impact on Replica Sets
      for (s in secondaries)
          s.restartAsStandalone()
          s.buildIndex()
          s.restartAsReplSetMember()
          s.waitForCatchup()
      p.stepDown()
      p.restartAsStandalone()
      p.buildIndex()
      p.restartAsReplSetMember()
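
The order of operations in the pseudocode above can be written out as a small planner that only produces a checklist; it does not talk to any server. `rollingIndexBuildPlan` and the member names are hypothetical, and each step string paraphrases one call from the slide rather than naming a real command.

```javascript
// Returns the ordered steps for a rolling index build: secondaries first, primary last.
function rollingIndexBuildPlan(primary, secondaries) {
  const steps = [];
  for (const s of secondaries) {
    steps.push(`${s}: restart as standalone`);
    steps.push(`${s}: build index`);
    steps.push(`${s}: restart as replica set member`);
    steps.push(`${s}: wait for catch-up`);
  }
  steps.push(`${primary}: step down`);
  steps.push(`${primary}: restart as standalone`);
  steps.push(`${primary}: build index`);
  steps.push(`${primary}: restart as replica set member`);
  return steps;
}

const plan = rollingIndexBuildPlan("p", ["s1", "s2"]);
plan.forEach((step, i) => console.log(`${i + 1}. ${step}`));
```

Each secondary is taken out and returned one at a time, so the set keeps serving traffic, and the primary is only touched after stepping down.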
  • Absent or suboptimal indexes are the most common avoidable MongoDB performance problem... ...so take some time and get your indexes right!
  • Thanks! (and thanks to Richard Kreuter for the slides)