Indexing and Query Optimization
Paul Pedersen
Monday, October 15, 12
  1. Indexing and Query Optimization (Paul Pedersen)
  2. What’s in store
      • What are indexes?
      • Picking the right indexes
      • Creating indexes in MongoDB
      • Troubleshooting
  3. Indexes are the single biggest tunable performance factor in MongoDB.
  4. Absent or suboptimal indexes are the most common avoidable MongoDB performance problem.
  5. So what problem do indexes solve?
  7. How do you find a chicken recipe?
      • An unindexed cookbook might be quite a page turner.
      • Probably not what you want, though.
  8. I know, I’ll use an index!
  10. Let’s imagine a simple index

      ingredient   page
      aardvark     790
      ...          ...
      beef         190, 191, 205, ...
      ...          ...
      chicken      182, 199, 200, ...
      chorizo      497, ...
      ...          ...
      zucchini     673, 986, ...

  11. How do you find a quick chicken recipe?
  12. Let’s imagine a compound index

      ingredient   cooking time   page
      ...          ...            ...
      chicken      15 min         182, 200
      chicken      25 min         199
      chicken      30 min         289, 316, 320
      chicken      45 min         290, 291, 354
      ...          ...            ...

  13. Consider the ordering of index keys

      Aardvark, 20 min
      Chicken, 15 min
      Chicken, 25 min
      Chicken, 30 min
      Chicken, 45 min
      Zucchini, 45 min
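
The ordering on this slide can be reproduced with an ordinary field-by-field comparison. The following is a minimal sketch in plain JavaScript, not MongoDB's actual key comparison; the array layout and the name `compareCompound` are assumptions made for illustration.

```javascript
// Compound keys from the cookbook example: [ingredient, cookingMinutes].
const keys = [
  ["zucchini", 45],
  ["chicken", 30],
  ["aardvark", 20],
  ["chicken", 15],
  ["chicken", 45],
  ["chicken", 25],
];

// Compare field by field: the first field decides,
// the second field only breaks ties within the same ingredient.
function compareCompound(a, b) {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  return a[1] - b[1];
}

// Copy before sorting so the input order is left untouched.
const ordered = [...keys].sort(compareCompound);
console.log(ordered.map(([ingredient, minutes]) => `${ingredient}, ${minutes} min`));
```

Because all chicken entries end up adjacent and sorted by time, "chicken, at most 25 minutes" corresponds to one contiguous run of keys.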
  14. How about a low-calorie chicken recipe?
  15. Let’s imagine a 2nd compound index

      ingredient   calories   page
      ...          ...        ...
      chicken      250        199, 316
      chicken      300        289, 291
      chicken      425        320
      ...          ...        ...

  16. How about a quick, low-calorie recipe?
  17. Let’s imagine a last compound index

      calories   cooking time   page
      ...        ...            ...
      250        25 min         199
      250        30 min         316
      300        25 min         289
      300        45 min         291
      425        30 min         320
      ...        ...            ...

      How do you find dishes from 250 to 300 calories that cook from 30 to 40 minutes?
  18. Consider the ordering of index keys

      250 cal, 25 min
      250 cal, 30 min
      300 cal, 25 min
      300 cal, 45 min
      425 cal, 30 min

      How do you find dishes from 250 to 300 calories that cook from 30 to 40 minutes?
      4 index entries will be scanned, but only 1 will match!
  19. Range queries using an index on A, B
      • A is a range
      • A is constant, B is a range
      • A is constant, order by B
      • A is range, B is constant/range
      • B is constant/range, A unspecified
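
The "4 scanned, 1 matched" count from slide 18 can be checked with a small simulation. This is a sketch under stated assumptions: the index is modelled as a sorted array, the function name `rangeScan` is hypothetical, and a real B-tree would seek to the first qualifying key instead of stepping over earlier ones.

```javascript
// Index entries on (calories, cooking minutes), already in index order.
const index = [
  { calories: 250, minutes: 25, page: 199 },
  { calories: 250, minutes: 30, page: 316 },
  { calories: 300, minutes: 25, page: 289 },
  { calories: 300, minutes: 45, page: 291 },
  { calories: 425, minutes: 30, page: 320 },
];

// Walk the contiguous run whose leading key lies in [calMin, calMax],
// then filter that run on the second key.
function rangeScan(entries, calMin, calMax, minMin, minMax) {
  let scanned = 0;
  const pages = [];
  for (const e of entries) {
    if (e.calories < calMin) continue; // a real B-tree would seek here, not step
    if (e.calories > calMax) break;    // past the range on the leading key: stop
    scanned += 1;
    if (e.minutes >= minMin && e.minutes <= minMax) pages.push(e.page);
  }
  return { scanned, pages };
}

const result = rangeScan(index, 250, 300, 30, 40);
console.log(result); // scanned: 4, pages: [316]
```

The range on the leading key fixes which run of entries is visited; the condition on the second key can only discard entries inside that run, which is why a range on A followed by a range on B does extra work.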
  20. It’s really that straightforward.
  23. B-Trees (Bayer & McCreight ’72)
      Queries, inserts, deletes: O(log n)
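
To get a feel for what O(log n) means in practice, the sketch below counts the keys inspected while searching one million sorted keys. It uses binary search over a sorted array as a stand-in; it is not a B-tree implementation, but the logarithmic growth it shows is the property the slide refers to. The name `searchSorted` is hypothetical.

```javascript
// Binary search over a sorted array, counting how many keys are inspected.
function searchSorted(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length - 1;
  let steps = 0;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    steps += 1;
    if (sortedKeys[mid] === target) return { found: true, steps };
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: false, steps };
}

// One million even numbers: 0, 2, 4, ...
const sortedKeys = Array.from({ length: 1000000 }, (_, i) => i * 2);

console.log(searchSorted(sortedKeys, 2674)); // present
console.log(searchSorted(sortedKeys, 7));    // absent (odd)
```

For one million keys the search never inspects more than 20 of them, whereas an unindexed scan may have to look at all of them.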
  24. All this is relevant to MongoDB.
      • MongoDB’s indexes are B-Trees, which are designed for range queries.
      • Generally, the best index for your queries is going to be a compound index.
      • Every additional index slows down inserts & removes, and may slow updates.
  25. On to MongoDB!
  27. Declaring Indexes
      • db.foo.ensureIndex( { username : 1 } )
      • db.foo.ensureIndex( { username : 1, created_at : -1 } )
  29. And managing them...
      > db.system.indexes.find()   // or: db.foo.getIndexes()
      { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" }
      { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" }
      > db.foo.dropIndex( { username : 1 } )
      { "nIndexesWas" : 2, "ok" : 1 }
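
On later MongoDB releases `ensureIndex` has been superseded by `createIndex`, and `getIndexes()` is the more portable way to list indexes. The sketch below shows the same declarations in that style. The function `declareAndInspect` and the in-memory `fakeCollection` are hypothetical; the fake exists only so the example runs without a server, and it mimics just the index naming visible in the slide output.

```javascript
// `coll` is any object exposing the shell's collection methods
// (for example db.foo in the mongo shell). Nothing here connects to a server.
function declareAndInspect(coll) {
  coll.createIndex({ username: 1 });                 // single field, ascending
  coll.createIndex({ username: 1, created_at: -1 }); // compound, newest first per user
  return coll.getIndexes().map((ix) => ix.name);
}

// Hypothetical in-memory stand-in for a collection handle.
function fakeCollection() {
  const indexes = [{ key: { _id: 1 }, name: "_id_" }];
  return {
    createIndex(key) {
      // Mimic the naming seen on the slide: field_direction, joined by "_".
      const name = Object.entries(key).map(([f, d]) => `${f}_${d}`).join("_");
      if (!indexes.some((ix) => ix.name === name)) indexes.push({ key, name });
      return name;
    },
    getIndexes() {
      return indexes.slice();
    },
  };
}

console.log(declareAndInspect(fakeCollection()));
```

In a real shell session the call would simply be `declareAndInspect(db.foo)`.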
  33. Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.
      • The “_id” index is automatic (except for capped collections before 2.2).
      • A query can use only one index (except $or queries).
      • The maximum index key size is 1024 bytes.
  34. Indexes get used where you’d expect
      • db.foo.find({x : 42})
      • db.foo.find({x : {$in : [42,52]}})
      • db.foo.find({x : {$lt : 42}})
      • update, findAndModify that select on x
      • count, distinct
      • $match in aggregation
      • left-anchored regexp, e.g. /^Kev/
  35. But indexes aren’t always helpful
      • Most negations: $not, $nin, $ne
      • Some corner cases: $mod, $where
      • Matching most regular expressions, e.g. /a/ or /foo/i
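
Why a left-anchored expression such as /^Kev/ can use an index while /ev/ cannot: a literal prefix corresponds to one contiguous key range. The sketch below illustrates this with plain JavaScript string comparison; it is not how MongoDB derives its index bounds internally. The names `prefixBounds`, `prefixScan` and `fullScan` are hypothetical, and the sample names beyond the "Kev" prefix are made up.

```javascript
// A literal prefix matches exactly the strings in [prefix, upper), where
// upper is the prefix with its last character bumped by one code unit.
// Assumes a non-empty prefix whose last character is not U+FFFF.
function prefixBounds(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper };
}

// Anchored: only the keys inside the range need to be examined.
function prefixScan(sortedKeys, prefix) {
  const { lower, upper } = prefixBounds(prefix);
  const run = sortedKeys.filter((k) => k >= lower && k < upper); // contiguous in sorted order
  return { examined: run.length, matches: run };
}

// Unanchored: every key has to be tested against the expression.
function fullScan(sortedKeys, regex) {
  return { examined: sortedKeys.length, matches: sortedKeys.filter((k) => regex.test(k)) };
}

const names = ["Karl", "Kevin", "Kevlar", "Kew", "Steve", "Trevor"].sort();
console.log(prefixBounds("Kev"));     // { lower: "Kev", upper: "Kew" }
console.log(prefixScan(names, "Kev")); // examines 2 keys
console.log(fullScan(names, /ev/));    // examines all 6 keys
```

A case-insensitive expression such as /^kev/i breaks this mapping, because "Kev", "kev" and "KEV" do not sit next to each other in key order.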
  36. Advanced Options
  37. Arrays: the powerful “multiKey” index
      { title : "Chicken Noodle Soup",
        ingredients : ["chicken", "noodles"] }
      > db.foo.ensureIndex( { ingredients : 1 } )

      ingredients   page
      chicken       42
      ...           ...
      noodles       42
      ...           ...
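
The table on this slide can be derived mechanically: one index entry per array element, each pointing back to the same document. The following is a rough sketch of that idea, not MongoDB's storage format; `buildMultikeyIndex`, the `page` field and the second recipe are hypothetical additions for illustration.

```javascript
// Build one index entry per distinct array element of `field`.
function buildMultikeyIndex(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value]; // scalars yield one entry
    for (const key of new Set(keys)) entries.push({ key, page: doc.page });
  }
  // Keep entries in key order, then by page, like the cookbook index.
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : a.page - b.page));
}

const recipes = [
  { page: 42, title: "Chicken Noodle Soup", ingredients: ["chicken", "noodles"] },
  { page: 57, title: "Roast Chicken", ingredients: ["chicken"] }, // hypothetical second recipe
];

const multikey = buildMultikeyIndex(recipes, "ingredients");
console.log(multikey.map((e) => `${e.key} -> ${e.page}`));
```

A lookup for "noodles" then lands on page 42 without the query needing to know that `ingredients` is an array.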
  38. Unique Indexes
      • db.foo.ensureIndex( { email : 1 } , { unique : true } )
      > db.foo.insert({email : "matulef@10gen.com"})
      > db.foo.insert({email : "matulef@10gen.com"})
      E11000 duplicate key error ...
  39. Sparse Indexes
      • db.foo.ensureIndex( { email : 1 } , { sparse : true } )
      No index entries for docs without “email” field
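
The two options interact: in a non-sparse index, a document without the field is indexed under a null key, so a unique index accepts only one such document, while a sparse unique index skips them entirely. The sketch below simulates that behaviour in memory. The class `FieldIndex` is hypothetical, and the thrown message only imitates the E11000 error shown on the slide.

```javascript
// Toy single-field index supporting the `unique` and `sparse` options.
class FieldIndex {
  constructor(field, { unique = false, sparse = false } = {}) {
    this.field = field;
    this.unique = unique;
    this.sparse = sparse;
    this.entries = []; // [{ key, doc }]
  }

  insert(doc) {
    const has = Object.prototype.hasOwnProperty.call(doc, this.field);
    if (!has && this.sparse) return;          // sparse: no entry at all
    const key = has ? doc[this.field] : null; // non-sparse: missing field indexed as null
    if (this.unique && this.entries.some((e) => e.key === key)) {
      throw new Error(`E11000 duplicate key error (simulated): ${this.field} = ${key}`);
    }
    this.entries.push({ key, doc });
  }
}

const uniqueEmail = new FieldIndex("email", { unique: true });
uniqueEmail.insert({ email: "matulef@10gen.com" });
try {
  uniqueEmail.insert({ email: "matulef@10gen.com" });
} catch (err) {
  console.log(err.message);
}

const sparseUnique = new FieldIndex("email", { unique: true, sparse: true });
sparseUnique.insert({ name: "no email 1" });
sparseUnique.insert({ name: "no email 2" }); // accepted: neither document is indexed
console.log(sparseUnique.entries.length);
```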
  40. Geospatial Indexes
      { name : "10gen Office",
        lat_long : [ 52.5184, 13.387 ] }
      > db.locations.ensureIndex( { lat_long : "2d" } )
      > db.locations.find( { lat_long : { $near : [ 52.53, 13.4 ] } } )
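
A "2d" index treats coordinates as points on a flat plane, and $near returns the closest documents first. The sketch below reproduces only that ordering with a brute-force sort; it does not model the index structure. The function `nearest` and every location other than the 10gen office from the slide are hypothetical.

```javascript
// Sort points by planar (Euclidean) distance from `center`, nearest first.
function nearest(points, field, center, limit = points.length) {
  const dist = (p) => Math.hypot(p[field][0] - center[0], p[field][1] - center[1]);
  return [...points].sort((a, b) => dist(a) - dist(b)).slice(0, limit);
}

const locations = [
  { name: "Far Cafe", lat_long: [48.137, 11.575] },    // hypothetical
  { name: "10gen Office", lat_long: [52.5184, 13.387] }, // from the slide
  { name: "Nearby Kiosk", lat_long: [52.5, 13.45] },   // hypothetical
];

console.log(nearest(locations, "lat_long", [52.53, 13.4]).map((p) => p.name));
```

The index makes the same answer available without computing the distance to every document.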
  41. Troubleshooting
  42. The Query Optimizer
      • For each “type” of query, MongoDB periodically tries all useful indexes.
      • Aborts as soon as one plan wins.
      • Winning plan is temporarily cached.
  44. Which plan wins? Explain!
      > db.foo.find( { t : { $lt : 40 } } ).explain()
      {
        "cursor" : "BtreeCursor t_1",
        "n" : 42,
        "nscannedObjects" : 42,
        "nscanned" : 42,
        ...
        "millis" : 0,
        ...
      }
      Pay attention to the ratio n/nscanned!
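
The ratio the slide points at is easy to compute from an explain document. Below is a small sketch using the field names shown on the slide (`n`, `nscanned`); later MongoDB versions report comparable counters under different names, so treat the field names as specific to this output format. The helper `indexSelectivity` and the second explain document are hypothetical.

```javascript
// Ratio of returned documents to index entries or documents scanned.
// Close to 1: the index does nearly all the filtering. Close to 0: mostly wasted work.
function indexSelectivity(explain) {
  if (explain.nscanned === 0) return 1; // nothing scanned, nothing wasted
  return explain.n / explain.nscanned;
}

// Values from the slide.
const fromSlide = { cursor: "BtreeCursor t_1", n: 42, nscannedObjects: 42, nscanned: 42, millis: 0 };

// Hypothetical unindexed query returning the same 42 documents.
const tableScan = { cursor: "BasicCursor", n: 42, nscannedObjects: 100000, nscanned: 100000, millis: 85 };

console.log(indexSelectivity(fromSlide));
console.log(indexSelectivity(tableScan));
```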
  45. Think you know better? Give us a hint
      > db.foo.find( { t : { $lt : 40 } } ).hint( { _id : 1 } )
  46. Recording slow queries
      > db.setProfilingLevel( n, slowms )   // slowms in milliseconds, default 100
      n=0 profiler off
      n=1 record queries longer than slowms
      n=2 record all queries
      > db.system.profile.find()
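
The three profiling levels amount to a simple rule for which operations get written to `system.profile`. The sketch below states that rule as a function so it can be checked without a server; `shouldRecord` is a hypothetical name, and the shell calls in the comments are shown for orientation only and are not executed.

```javascript
// In the shell, the equivalent setup would look like:
//   db.setProfilingLevel(1, 100)                        // level 1, slowms = 100
//   db.system.profile.find({ millis: { $gt: 200 } })    // inspect recorded operations

// Decide whether an operation taking `opMillis` is recorded at a given level.
function shouldRecord(level, slowms, opMillis) {
  if (level === 2) return true;              // record everything
  if (level === 1) return opMillis > slowms; // record only slow operations
  return false;                              // level 0: profiler off
}

const sampleDurations = [3, 99, 101, 2500]; // hypothetical operation times in ms
for (const level of [0, 1, 2]) {
  const recorded = sampleDurations.filter((ms) => shouldRecord(level, 100, ms));
  console.log(`level ${level}:`, recorded);
}
```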
  47. Operational Tips
  48. Background index builds
      db.foo.ensureIndex( { user : 1 } , { background : true } )
      Caveats:
      • still resource-intensive
      • will build in foreground on secondaries
  49. Minimizing impact on Replica Sets
      for (s in secondaries)
          s.restartAsStandalone()
          s.buildIndex()
          s.restartAsReplSetMember()
          s.waitForCatchup()
      p.stepDown()
      p.restartAsStandalone()
      p.buildIndex()
      p.restartAsReplSetMember()
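
The pseudocode on this slide describes an operator procedure, not a real API. The sketch below turns it into a dry run that only records the order of steps, which can be useful for reviewing a runbook. Everything here is hypothetical: `makeMember` and `rollingIndexBuild` are made-up names, and each method merely logs what an operator (or automation) would do.

```javascript
// Hypothetical member object: every method just logs the step it stands for.
function makeMember(name, log) {
  const step = (action) => () => log.push(`${name}: ${action}`);
  return {
    name,
    restartAsStandalone: step("restart as standalone"),
    buildIndex: step("build index"),
    restartAsReplSetMember: step("rejoin replica set"),
    waitForCatchup: step("wait for catch-up"),
    stepDown: step("step down"),
  };
}

// Same order as the slide: secondaries one at a time, primary last.
function rollingIndexBuild(primary, secondaries) {
  for (const s of secondaries) { // for...of: for...in would iterate array keys in JavaScript
    s.restartAsStandalone();
    s.buildIndex();
    s.restartAsReplSetMember();
    s.waitForCatchup();
  }
  primary.stepDown(); // hand over before taking the old primary out
  primary.restartAsStandalone();
  primary.buildIndex();
  primary.restartAsReplSetMember();
}

const log = [];
rollingIndexBuild(makeMember("p", log), [makeMember("s1", log), makeMember("s2", log)]);
console.log(log.join("\n"));
```

Only one member is out of the set at any time, so the set keeps serving traffic throughout the build.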
  50. Absent or suboptimal indexes are the most common avoidable MongoDB performance problem...
      ...so take some time and get your indexes right!
  51. Thanks!