#MongoDBSydney
Indexing and Query Optimisation
Stephen Steneker (stennie@10gen.com)
Support Engineer, 10gen Australia
  • A '2d' index is a geohash on top of the B-tree. It allows you to search for documents 'near' a latitude/longitude position. Bounds queries are also possible using $within. TODO: Google maps image, or something similar. Kristine to provide.
  • The index must be on a BSON date field. Documents are removed after expireAfterSeconds seconds. The reaper thread runs every 60 seconds. TODO: Hourglass image, or something similar. Kristine to provide.
  • Indexes are a really powerful feature of MongoDB; however, there are some limitations. Understanding these limitations is an important part of using MongoDB correctly. Queries can only use one index, with the exception of $or queries. If an index key exceeds 1K, the document is silently left out of the index.
  • cursor – the type of cursor used. BasicCursor means no index was used. TODO: Use a real example here instead of made up numbers…
    n – the number of documents that match the query
    nscannedObjects – the number of documents that had to be scanned
    nscanned – the number of items (index entries or documents) examined
    millis – how long the query took
    The ratio of n to nscanned should be as close to 1 as possible.
  • 2008 Melbourne Cup
  • Transcript of "Indexing and Query Optimisation"

    1. #MongoDBSydney
        Indexing and Query Optimisation
        Stephen Steneker (stennie@10gen.com)
        Support Engineer, 10gen Australia
    2. Agenda
        • What are indexes?
        • Why do I need them?
        • Working with indexes in MongoDB
        • Optimise your queries
        • Avoiding common mistakes
    3. What are indexes?
    4. What are indexes?
        Imagine you’re looking for a recipe in a cookbook ordered by recipe name. Looking up a recipe by name is quick and easy.
    5. What are indexes?
        • How would you find a recipe using chicken?
        • How about a 250-350 calorie recipe using chicken?
    6. [Image placeholder: cookbook]
        Consult the index!
    7. Linked List
        1   2   3   4   5   6   7
    8. Finding 7 in Linked List
        1   2   3   4   5   6   7
    9. Finding 7 in Tree
                  4
              2       6
            1   3   5   7
    10. Indexes in MongoDB are B-trees
    11. Queries, inserts and deletes: O(log(n)) time
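To make the linked list versus tree comparison concrete, here is a minimal sketch that counts comparisons when looking for 7 among the keys 1 to 7. It uses a binary search over a sorted array as a stand-in for walking a balanced tree; a real B-tree node holds many keys, so this illustrates the O(log(n)) idea and is not MongoDB's implementation. The function names are made up for this example.

```javascript
// Walk the keys one by one, as you would in a linked list.
function linearScan(sorted, target) {
  let comparisons = 0;
  for (const value of sorted) {
    comparisons++;
    if (value === target) return { found: true, comparisons };
  }
  return { found: false, comparisons };
}

// Halve the search range at each step, as a balanced tree lookup does.
// (Stand-in for a B-tree descent; an assumption made for illustration.)
function binarySearch(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  let comparisons = 0;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    comparisons++;
    if (sorted[mid] === target) return { found: true, comparisons };
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: false, comparisons };
}

const keys = [1, 2, 3, 4, 5, 6, 7];
console.log(linearScan(keys, 7).comparisons);   // 7
console.log(binarySearch(keys, 7).comparisons); // 3 (visits 4, then 6, then 7)
```

The three steps of the binary search visit the same values as the path 4, 6, 7 in the tree on slide 9.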
    12. Indexes are the single biggest tuneable performance factor in MongoDB
    13. Absent or suboptimal indexes are the most common avoidable MongoDB performance problem.
    14. Why do I need indexes?
        A brief story
    15. Working with Indexes in MongoDB
    16. How do I create indexes?
        // Create an index if one does not exist
        db.recipes.createIndex({ main_ingredient: 1 })
        // The client remembers the index and raises no errors
        db.recipes.ensureIndex({ main_ingredient: 1 })
        * 1 means ascending, -1 descending
    17. What can be indexed?
        // Multiple fields (compound key indexes)
        db.recipes.ensureIndex({ main_ingredient: 1, calories: -1 })
        // Arrays of values (multikey indexes)
        { name: 'Chicken Noodle Soup', ingredients: ['chicken', 'noodles'] }
        db.recipes.ensureIndex({ ingredients: 1 })
    18. What can be indexed?
        // Subdocuments
        { name: 'Pavlova', contributor: { name: 'Ima Aussie', id: 'ima123' } }
        db.recipes.ensureIndex({ 'contributor.id': 1 })
        db.recipes.ensureIndex({ 'contributor': 1 })
    19. How do I manage indexes?
        // List a collection's indexes
        db.recipes.getIndexes()
        db.recipes.getIndexKeys()
        // Drop a specific index
        db.recipes.dropIndex({ ingredients: 1 })
        // Drop all indexes and recreate them
        db.recipes.reIndex()
        // Default (unique) index on _id
    20. Background Index Builds
        // Index creation is a blocking operation that can take a long time
        // Background creation yields to other operations
        db.recipes.ensureIndex( { ingredients: 1 }, { background: true } )
    21. Options
        • Uniqueness constraints (unique, dropDups)
        • Sparse Indexes
        • Geospatial (2d) Indexes
        • TTL Collections (expireAfterSeconds)
    22. Uniqueness Constraints
        // Only one recipe can have a given value for name
        db.recipes.ensureIndex( { name: 1 }, { unique: true } )
        // Force the index onto a collection with duplicate recipe names by dropping the duplicates
        db.recipes.ensureIndex( { name: 1 }, { unique: true, dropDups: true } )
        * dropDups is probably never what you want
    23. Sparse Indexes
        // Only documents with field calories will be indexed
        db.recipes.ensureIndex( { calories: -1 }, { sparse: true } )
        // Allow multiple documents to not have calories field
        db.recipes.ensureIndex( { name: 1, calories: -1 }, { unique: true, sparse: true } )
        * Missing fields are stored as null(s) in the index
    24. Geospatial Indexes
        // Add longitude, latitude coordinates
        { name: '10gen Sydney', loc: [ 151.21037, -33.88456 ] }
        // Index the coordinates
        db.locations.ensureIndex( { loc: '2d' } )
        // Query for locations near a particular coordinate
        db.locations.find({ loc: { $near: [ 151.21, -33.88 ] } })
    25. TTL Collections
        // Documents must have a BSON UTC Date field
        { submitted_date: ISODate('2012-11-09T11:44:07.211Z'), … }
        // Documents are removed after expireAfterSeconds seconds
        db.recipes.ensureIndex( { submitted_date: 1 }, { expireAfterSeconds: 3600 } )
    26. Limitations
        • Collections cannot have more than 64 indexes.
        • Index keys cannot be larger than 1024 bytes (1K).
        • The name of an index, including the namespace, must be < 128 characters.
        • Queries can only use 1 index*
        • Indexes have storage requirements, and impact the performance of writes.
        • In-memory sort (no index) is limited to 32 MB of return data.
    27. Optimise Your Queries
    28. Profiling Slow Ops
        db.setProfilingLevel( n, slowms )   // slowms defaults to 100 ms
        n=0 profiler off
        n=1 record operations longer than slowms
        n=2 record all operations
        db.system.profile.find()
        * The profile collection is a capped collection, and fixed in size
    29. The Explain Plan (Pre Index)
        db.recipes.find( { calories: { $lt: 40 } } ).explain()
        {
            "cursor" : "BasicCursor",
            "n" : 42,
            "nscannedObjects" : 12345,
            "nscanned" : 12345,
            ...
            "millis" : 356,
            ...
        }
        * explain() doesn’t use cached plans; it re-evaluates the plans and resets the cache
    30. The Explain Plan (Post Index)
        db.recipes.find( { calories: { $lt: 40 } } ).explain()
        {
            "cursor" : "BtreeCursor calories_-1",
            "n" : 42,
            "nscannedObjects" : 42,
            "nscanned" : 42,
            ...
            "millis" : 0,
            ...
        }
        * explain() doesn’t use cached plans; it re-evaluates the plans and resets the cache
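One way to read the two explain plans side by side is to compute the ratio of n to nscanned, which should be as close to 1 as possible. The sketch below is plain JavaScript operating on objects shaped like the explain output shown on the slides; the helper names `scanEfficiency` and `usedIndex` are hypothetical and not part of the MongoDB shell.

```javascript
// Ratio of documents returned (n) to items examined (nscanned).
// 1 means every examined item was a match; values near 0 mean wasted work.
function scanEfficiency(explain) {
  if (explain.nscanned === 0) return 1; // nothing examined, nothing wasted
  return explain.n / explain.nscanned;
}

// On the slides, a "BasicCursor" means no index was used.
function usedIndex(explain) {
  return explain.cursor !== "BasicCursor";
}

// Numbers copied from the pre-index and post-index slides.
const preIndex = { cursor: "BasicCursor", n: 42, nscannedObjects: 12345, nscanned: 12345, millis: 356 };
const postIndex = { cursor: "BtreeCursor calories_-1", n: 42, nscannedObjects: 42, nscanned: 42, millis: 0 };

console.log(usedIndex(preIndex), scanEfficiency(preIndex).toFixed(4));   // false 0.0034
console.log(usedIndex(postIndex), scanEfficiency(postIndex).toFixed(4)); // true 1.0000
```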
    31. The Query Optimiser
    32. The Query Optimiser
        • For each "type" of query, MongoDB periodically tries all useful indexes
        • Aborts the rest as soon as one plan wins
        • The winning plan is temporarily cached for each "type" of query
    33. Manually Select Index to Use
        // Tell the database what index to use
        db.recipes.find({ calories: { $lt: 1000 } }).hint({ _id: 1 })
        // Tell the database to NOT use an index
        db.recipes.find({ calories: { $lt: 1000 } }).hint({ $natural: 1 })
    34. Use Indexes to Sort Query Results
        // Given the following index
        db.collection.ensureIndex({ a: 1, b: 1, c: 1, d: 1 })
        // The following query and sort operations can use the index
        db.collection.find().sort({ a: 1 })
        db.collection.find().sort({ a: 1, b: 1 })
        db.collection.find({ a: 4 }).sort({ a: 1, b: 1 })
        db.collection.find({ b: 5 }).sort({ a: 1, b: 1 })
    35. Indexes that won’t work for sorting query results
        // Given the following index
        db.collection.ensureIndex({ a: 1, b: 1, c: 1, d: 1 })
        // These cannot sort using the index
        db.collection.find().sort({ b: 1 })
        db.collection.find({ b: 5 }).sort({ b: 1 })
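The pattern in the two sorting slides is that the sort keys must form a prefix of the index keys. A rough sketch of that rule in plain JavaScript follows; `sortCanUseIndex` is a hypothetical helper, and it deliberately ignores the query filter, so it is a simplification of what the server actually does. It also accepts a fully reversed sort, since an index can be walked in either direction.

```javascript
// Simplified check: can an index with spec `indexSpec` provide the
// order requested by `sortSpec`? Assumes the filter plays no role.
function sortCanUseIndex(indexSpec, sortSpec) {
  const indexKeys = Object.keys(indexSpec);
  const sortKeys = Object.keys(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  // The sort fields must be the leading index fields, in the same order.
  const isPrefix = sortKeys.every((key, i) => key === indexKeys[i]);
  if (!isPrefix) return false;

  // Directions must all match the index, or all be reversed.
  const forward = sortKeys.every((key) => sortSpec[key] === indexSpec[key]);
  const backward = sortKeys.every((key) => sortSpec[key] === -indexSpec[key]);
  return forward || backward;
}

const index = { a: 1, b: 1, c: 1, d: 1 };
console.log(sortCanUseIndex(index, { a: 1 }));        // true
console.log(sortCanUseIndex(index, { a: 1, b: 1 }));  // true
console.log(sortCanUseIndex(index, { b: 1 }));        // false
console.log(sortCanUseIndex(index, { a: 1, b: -1 })); // false
```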
    36. Index Covered Queries
        // MongoDB can return data from just the index
        db.recipes.ensureIndex({ main_ingredient: 1, name: 1 })
        // Return only the name field
        db.recipes.find( { main_ingredient: 'chicken' }, { _id: 0, name: 1 } )
        // indexOnly will be true in the explain plan
        db.recipes.find( { main_ingredient: 'chicken' }, { _id: 0, name: 1 } ).explain()
        {
            "indexOnly": true,
        }
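The condition for a covered query can be sketched as a simple field check: every field the query filters on and every field it returns must be in the index, which is why the slide excludes _id. The helper `isCoveredQuery` below is hypothetical and only looks at top-level field names; it does not model operators such as $or or multikey indexes.

```javascript
// Simplified check for whether an index alone can answer a query.
function isCoveredQuery(indexSpec, query, projection) {
  const indexed = new Set(Object.keys(indexSpec));

  // Fields explicitly included in the projection.
  const explicit = Object.keys(projection).filter((f) => projection[f] === 1);
  // Without explicit inclusions whole documents are returned.
  if (explicit.length === 0) return false;

  const returned = new Set(explicit);
  // _id comes back by default unless excluded with { _id: 0 }.
  if (projection._id !== 0) returned.add("_id");

  const needed = [...Object.keys(query), ...returned];
  return needed.every((field) => indexed.has(field));
}

const index = { main_ingredient: 1, name: 1 };
const query = { main_ingredient: "chicken" };
console.log(isCoveredQuery(index, query, { _id: 0, name: 1 })); // true
console.log(isCoveredQuery(index, query, { name: 1 }));         // false, _id is not in the index
```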
    37. Absent or suboptimal indexes are the most common avoidable MongoDB performance problem.
    38. Avoiding Common Mistakes
    39. Trying to Use Multiple Indexes
        // MongoDB can only use one index for a query
        db.collection.ensureIndex({ a: 1 })
        db.collection.ensureIndex({ b: 1 })
        // Only one of the above indexes is used
        db.collection.find({ a: 3, b: 4 })
    40. Compound Key Mistakes
        // Compound key indexes are very effective
        db.collection.ensureIndex({ a: 1, b: 1, c: 1 })
        // But only if the query is a prefix of the index
        // This query can't effectively use the index
        db.collection.find({ c: 2 })
        // …but this query can
        db.collection.find({ a: 3, b: 5 })
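The prefix rule for compound keys can be expressed as: walk the index fields in order and stop at the first one the query does not mention. The sketch below uses a hypothetical helper, `usableIndexPrefix`, written in plain JavaScript; it only models the rule on the slide and does not reproduce the server's planner.

```javascript
// Return the leading index fields that the query constrains.
// An empty result means the query cannot effectively use the index.
function usableIndexPrefix(indexSpec, query) {
  const prefix = [];
  for (const field of Object.keys(indexSpec)) {
    if (!(field in query)) break; // the prefix ends at the first gap
    prefix.push(field);
  }
  return prefix;
}

const index = { a: 1, b: 1, c: 1 };
console.log(usableIndexPrefix(index, { c: 2 }));       // []
console.log(usableIndexPrefix(index, { a: 3, b: 5 })); // [ 'a', 'b' ]
console.log(usableIndexPrefix(index, { a: 3, c: 2 })); // [ 'a' ]
```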
    41. Low Selectivity Indexes
        db.collection.distinct('status')
        [ 'new', 'processed' ]
        db.collection.ensureIndex({ status: 1 })
        // Low selectivity indexes provide little benefit
        db.collection.find({ status: 'new' })
        // Better
        db.collection.ensureIndex({ status: 1, created_at: -1 })
        db.collection.find({ status: 'new' }).sort({ created_at: -1 })
    42. Regular Expressions
        db.users.ensureIndex({ username: 1 })
        // Left anchored regex queries can use the index
        db.users.find({ username: /^joe smith/ })
        // But not generic regexes
        db.users.find({ username: /smith/ })
        // Or case insensitive queries
        db.users.find({ username: /Joe/i })
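A left anchored regex can use an index because a fixed prefix translates into a contiguous range of sorted keys. The sketch below illustrates this with a sorted array of usernames standing in for the index; the sample usernames and all function names are made up for the example, and it assumes a non-empty prefix whose last character is not the maximum code unit.

```javascript
// Smallest string that sorts after every string starting with `prefix`.
function prefixUpperBound(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  return prefix.slice(0, -1) + String.fromCharCode(last + 1);
}

// Index of the first element >= key in a sorted array (binary search).
function lowerBound(sorted, key) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (sorted[mid] < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Equivalent of /^prefix/ answered by a range scan instead of a full scan.
function findByPrefix(sortedKeys, prefix) {
  const start = lowerBound(sortedKeys, prefix);
  const end = lowerBound(sortedKeys, prefixUpperBound(prefix));
  return sortedKeys.slice(start, end);
}

const usernames = ["alice", "joe smith", "joe smithers", "joe stone", "john smith", "zoe"].sort();

console.log(findByPrefix(usernames, "joe smith"));
// [ 'joe smith', 'joe smithers' ]

// An unanchored regex has no such range, so every key must be tested.
console.log(usernames.filter((name) => /smith/.test(name)));
// [ 'joe smith', 'joe smithers', 'john smith' ]
```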
    43. Negation
        // Indexes aren't helpful with negations
        db.things.ensureIndex({ x: 1 })
        // e.g. "not equal" queries
        db.things.find({ x: { $ne: 3 } })
        // …or "not in" queries
        db.things.find({ x: { $nin: [ 2, 3, 4 ] } })
        // …or the $not operator
        db.people.find({ name: { $not: 'John Doe' } })
    44. Choosing the right indexes is one of the most important things you can do as a MongoDB developer so take the time to get your indexes right!
    45. #MongoDBSydney
        Thank you
        Stephen Steneker (stennie@10gen.com)
        Support Engineer, 10gen