SlideShare a Scribd company logo
1 of 51
Download to read offline
Indexing	
  and	
  Query	
  Optimization
                                        Kevin	
  Matulef
                                   September	
  6,	
  2012




Thursday, September 6, 12
What’s in store

      • What are indexes?

      • Picking the right indexes.

      • Creating indexes in MongoDB

      • Troubleshooting




Thursday, September 6, 12
Indexes are the single biggest
         tunable performance factor
                in MongoDB.




Thursday, September 6, 12
Absent or suboptimal indexes are
      the most common avoidable
    MongoDB performance problem.




Thursday, September 6, 12
So what problem do indexes solve?




Thursday, September 6, 12
Thursday, September 6, 12
How do you find a chicken recipe?


      • An unindexed cookbook might be quite a
          page turner.

      • Probably not what you want, though.




Thursday, September 6, 12
I know, I’ll use an index!




Thursday, September 6, 12
Thursday, September 6, 12
Let’s imagine a simple index
                            ingredient             page

                            aardvark                790

                                ...                   ...

                              beef       190,	
  191,	
  205,	
  ...

                                ...                   ...

                             chicken     182,	
  199,	
  200,	
  ...	
  

                             chorizo             497,	
  ...

                                ...                   ...

                             zucchini        673,	
  986,	
  ...




Thursday, September 6, 12
How do you find a quick chicken recipe?




Thursday, September 6, 12
Let’s imagine a compound index
                            ingredient   cooking	
  time        page

                                ...            ...                 ...

                             chicken        15	
  min         182,	
  200

                             chicken        25	
  min            199

                             chicken        30	
  min      289,316,320

                             chicken        45	
  min      290,	
  291,	
  354

                                ...            ...                 ...




Thursday, September 6, 12
Consider the ordering of index keys




  Aardvark,	
  20	
  min    Chicken,	
  15	
  min                      Zuchinni,	
  45	
  min
                                   Chicken,	
  25	
  min
                                        Chicken,	
  30	
  min
                                               Chicken,	
  45	
  min




Thursday, September 6, 12
How about a low-calorie chicken recipe?




Thursday, September 6, 12
Let’s imagine a 2nd compound index

                            ingredient   calories     page


                                ...         ...         ...


                             chicken       250      199,	
  316


                             chicken       300      289,291


                             chicken       425         320


                                ...         ...         ...




Thursday, September 6, 12
How about a quick, low-calorie recipe?




Thursday, September 6, 12
Let’s imagine a last compound index
                            calories   cooking	
  time   page

                               ...           ...          ...

                              250         25	
  min      199

                              250         30	
  min      316

                              300         25	
  min      289

                              300         45	
  min      291
                              425         30	
  min      320

                               ...           ...          ...

          How do you find dishes from 250 to 300 calories that cook from
                               30 to 40 minutes?



Thursday, September 6, 12
Consider the ordering of index keys


              250	
  cal,   250	
  cal,   300	
  cal,   300	
  cal,   425	
  cal,
              25	
  min     30	
  min     25	
  min     45	
  min     30	
  min




          How do you find dishes from 250 to 300 calories that cook from
                               30 to 40 minutes?

                  4 index entries will be scanned, but only 1 will match!




Thursday, September 6, 12
Range queries using an index on A, B
      • A is a range J

      • A is constant, B is a range J

      • A is constant, order by B J

      • A is range, B is constant/range   K

      • B is constant/range, A unspecified L


Thursday, September 6, 12
It’s really that straightforward.




Thursday, September 6, 12
B-Trees                (Bayer & McCreight ’72)




Thursday, September 6, 12
B-Trees                (Bayer & McCreight ’72)




                                              13




Thursday, September 6, 12
B-Trees                (Bayer & McCreight ’72)




                                                 13

            Queries,	
  Inserts,	
  Deletes:	
  O(log	
  n)


Thursday, September 6, 12
All this is relevant to MongoDB.
      • MongoDB’s indexes are B-Trees, which are
          designed for range queries.

      • Generally, the best index for your queries is
          going to be a compound index.

      • Every additional index slows down inserts &
          removes, and may slow updates.




Thursday, September 6, 12
On to MongoDB!




Thursday, September 6, 12
Declaring Indexes
      • db.foo.ensureIndex( { username : 1 } )




Thursday, September 6, 12
Declaring Indexes
      • db.foo.ensureIndex( { username : 1 } )


      • db.foo.ensureIndex( { username : 1, created_at : -1 } )




Thursday, September 6, 12
And managing them....
          > db.system.indexes.find() //db.foo.getIndexes()

           { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" }
           { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" }




Thursday, September 6, 12
And managing them....
          > db.system.indexes.find() //db.foo.getIndexes()

           { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" }
           { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" }



          > db.foo.dropIndex( { username : 1} )

            { "nIndexesWas" : 2 , "ok" : 1 }




Thursday, September 6, 12
Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.




Thursday, September 6, 12
Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.

      • “_id” index is automatic
                  (except capped collections before 2.2)




Thursday, September 6, 12
Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.

      • “_id” index is automatic
                  (except capped collections before 2.2)


      • All queries can use just 1 index
                  (except $or queries).




Thursday, September 6, 12
Key info about MongoDB’s indexes
      • A collection may have at most 64 indexes.

      • “_id” index is automatic
                  (except capped collections before 2.2)


      • All queries can use just 1 index
                  (except $or queries).


      • The maximum index key size is 1024 bytes.



Thursday, September 6, 12
Indexes get used where you’d expect
           • db.foo.find({x : 42})
           • db.foo.find({x : {$in : [42,52]}})
           • db.foo.find({x : {$lt : 42})
           • update, findAndModify that select on x,
           • count, distinct,
           • $match in aggregation
           • left-anchored regexp, e.g. /^Kev/




Thursday, September 6, 12
But indexes aren’t always helpful
      • Most negations: $not, $nin, $ne


      • Some corner cases: $mod, $where


      • Matching most regular expressions, e.g. /a/
          or /foo/i




Thursday, September 6, 12
Advanced Options




Thursday, September 6, 12
Arrays: the powerful “multiKey” index
           { title : “Chicken Noodle Soup”,
             ingredients : [“chicken”, “noodles”] }

           >	
  db.foo.ensureIndex(	
  {	
  ingredients	
  :	
  1	
  }	
  )

                             ingredients                        page
                                chicken                          42
                                   ...                            ...
                               noodles                           42
                                   ...                            ...




Thursday, September 6, 12
Unique Indexes
     • db.foo.ensureIndex( { email : 1 } , {unique : true} )



          > db.foo.insert({email : “matulef@10gen.com”})
          > db.foo.insert({email : “matulef@10gen.com”})
             E11000 duplicate key error ...




Thursday, September 6, 12
Sparse Indexes
     • db.foo.ensureIndex( { email : 1 } , {sparse : true} )


                  No index entries for docs without “email” field




Thursday, September 6, 12
Geospatial Indexes
          { name: "10gen Office",
            lat_long: [ 52.5184, 13.387 ] }

          > db.foo.ensureIndex( { lat_long : “2d” } )

          > db.locations.find( { lat_long: {$near: [52.53, 13.4] } } )




Thursday, September 6, 12
Troubleshooting




Thursday, September 6, 12
The Query Optimizer
      • For each “type” of query, mongoDB
          periodically tries all useful indexes.

      • Aborts as soon as one plan wins.

      • Winning plan is temporarily cached.




Thursday, September 6, 12
Which plan wins? Explain!
      > db.foo.find( { t: { $lt : 40 } } ).explain( )
      {
        "cursor" : "BtreeCursor t_1" ,
        "n" : 42,
        “nscannedObjects: 42
        "nscanned" : 42,
         ...
        "millis" : 0,
         ...
      }




Thursday, September 6, 12
Which plan wins? Explain!
      > db.foo.find( { t: { $lt : 40 } } ).explain( )
      {
        "cursor" : "BtreeCursor t_1" ,
        "n" : 42,                          Pay attention to the
        “nscannedObjects: 42
        "nscanned" : 42,                    ratio	
  n/nscanned!
         ...
        "millis" : 0,
         ...
      }




Thursday, September 6, 12
Think you know better? Give us a hint
      > db.foo.find( { t: { $lt : 40 } } ).hint( { _id : 1} )




Thursday, September 6, 12
Recording slow queries
      > db.setProfilingLevel( n , slowms=100ms )

      n=0 profiler off
      n=1 record queries longer than slowms
      n=2 record all queries

      > db.system.profile.find()




Thursday, September 6, 12
Operational Tips




Thursday, September 6, 12
Background index builds
          db.foo.ensureIndex( { user : 1 } , { background : true } )

          Caveats:
           • still resource-intensive
           • will build in foreground on secondaries




Thursday, September 6, 12
Minimizing impact on Replica Sets
          for (s in secondaries)
              s.restartAsStandalone()
              s.buildIndex()
              s.restartAsReplSetMember()
              s.waitForCatchup()

          p.stepDown()
          p.restartAsStandalone()
          p.buildIndex()
          p.restartAsReplSetMember()




Thursday, September 6, 12
Absent or suboptimal indexes are
     the most common avoidable
   MongoDB performance problem...

    ...so take some time and get your
               indexes right!


Thursday, September 6, 12
Thanks!

                (and thanks to Richard Kreuter for the slides)




Thursday, September 6, 12

More Related Content

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Recently uploaded

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 

Indexing and Query Optimization Webinar

  • 1. Indexing  and  Query  Optimization Kevin  Matulef September  6,  2012 Thursday, September 6, 12
  • 2. What’s in store • What are indexes? • Picking the right indexes. • Creating indexes in MongoDB • Troubleshooting Thursday, September 6, 12
  • 3. Indexes are the single biggest tunable performance factor in MongoDB. Thursday, September 6, 12
  • 4. Absent or suboptimal indexes are the most common avoidable MongoDB performance problem. Thursday, September 6, 12
  • 5. So what problem do indexes solve? Thursday, September 6, 12
  • 7. How do you find a chicken recipe? • An unindexed cookbook might be quite a page turner. • Probably not what you want, though. Thursday, September 6, 12
  • 8. I know, I’ll use an index! Thursday, September 6, 12
  • 10. Let’s imagine a simple index ingredient page aardvark 790 ... ... beef 190,  191,  205,  ... ... ... chicken 182,  199,  200,  ...   chorizo 497,  ... ... ... zucchini 673,  986,  ... Thursday, September 6, 12
  • 11. How do you find a quick chicken recipe? Thursday, September 6, 12
  • 12. Let’s imagine a compound index ingredient cooking  time page ... ... ... chicken 15  min 182,  200 chicken 25  min 199 chicken 30  min 289,316,320 chicken 45  min 290,  291,  354 ... ... ... Thursday, September 6, 12
  • 13. Consider the ordering of index keys Aardvark,  20  min Chicken,  15  min Zuchinni,  45  min Chicken,  25  min Chicken,  30  min Chicken,  45  min Thursday, September 6, 12
  • 14. How about a low-calorie chicken recipe? Thursday, September 6, 12
  • 15. Let’s imagine a 2nd compound index ingredient calories page ... ... ... chicken 250 199,  316 chicken 300 289,291 chicken 425 320 ... ... ... Thursday, September 6, 12
  • 16. How about a quick, low-calorie recipe? Thursday, September 6, 12
  • 17. Let’s imagine a last compound index calories cooking  time page ... ... ... 250 25  min 199 250 30  min 316 300 25  min 289 300 45  min 291 425 30  min 320 ... ... ... How do you find dishes from 250 to 300 calories that cook from 30 to 40 minutes? Thursday, September 6, 12
  • 18. Consider the ordering of index keys 250  cal, 250  cal, 300  cal, 300  cal, 425  cal, 25  min 30  min 25  min 45  min 30  min How do you find dishes from 250 to 300 calories that cook from 30 to 40 minutes? 4 index entries will be scanned, but only 1 will match! Thursday, September 6, 12
  • 19. Range queries using an index on A, B • A is a range J • A is constant, B is a range J • A is constant, order by B J • A is range, B is constant/range K • B is constant/range, A unspecified L Thursday, September 6, 12
  • 20. It’s really that straightforward. Thursday, September 6, 12
  • 21. B-Trees (Bayer & McCreight ’72) Thursday, September 6, 12
  • 22. B-Trees (Bayer & McCreight ’72) 13 Thursday, September 6, 12
  • 23. B-Trees (Bayer & McCreight ’72) 13 Queries,  Inserts,  Deletes:  O(log  n) Thursday, September 6, 12
  • 24. All this is relevant to MongoDB. • MongoDB’s indexes are B-Trees, which are designed for range queries. • Generally, the best index for your queries is going to be a compound index. • Every additional index slows down inserts & removes, and may slow updates. Thursday, September 6, 12
  • 25. On to MongoDB! Thursday, September 6, 12
  • 26. Declaring Indexes • db.foo.ensureIndex( { username : 1 } ) Thursday, September 6, 12
  • 27. Declaring Indexes • db.foo.ensureIndex( { username : 1 } ) • db.foo.ensureIndex( { username : 1, created_at : -1 } ) Thursday, September 6, 12
  • 28. And managing them.... > db.system.indexes.find() //db.foo.getIndexes() { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" } { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" } Thursday, September 6, 12
  • 29. And managing them.... > db.system.indexes.find() //db.foo.getIndexes() { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.foo", "name" : "_id_" } { "v" : 1, "key" : { "username" : 1 }, "ns" : "test.foo", "name" : "username_1" } > db.foo.dropIndex( { username : 1} ) { "nIndexesWas" : 2 , "ok" : 1 } Thursday, September 6, 12
  • 30. Key info about MongoDB’s indexes • A collection may have at most 64 indexes. Thursday, September 6, 12
  • 31. Key info about MongoDB’s indexes • A collection may have at most 64 indexes. • “_id” index is automatic (except capped collections before 2.2) Thursday, September 6, 12
  • 32. Key info about MongoDB’s indexes • A collection may have at most 64 indexes. • “_id” index is automatic (except capped collections before 2.2) • All queries can use just 1 index (except $or queries). Thursday, September 6, 12
  • 33. Key info about MongoDB’s indexes • A collection may have at most 64 indexes. • “_id” index is automatic (except capped collections before 2.2) • All queries can use just 1 index (except $or queries). • The maximum index key size is 1024 bytes. Thursday, September 6, 12
  • 34. Indexes get used where you’d expect • db.foo.find({x : 42}) • db.foo.find({x : {$in : [42,52]}}) • db.foo.find({x : {$lt : 42}) • update, findAndModify that select on x, • count, distinct, • $match in aggregation • left-anchored regexp, e.g. /^Kev/ Thursday, September 6, 12
  • 35. But indexes aren’t always helpful • Most negations: $not, $nin, $ne • Some corner cases: $mod, $where • Matching most regular expressions, e.g. /a/ or /foo/i Thursday, September 6, 12
  • 37. Arrays: the powerful “multiKey” index { title : “Chicken Noodle Soup”, ingredients : [“chicken”, “noodles”] } >  db.foo.ensureIndex(  {  ingredients  :  1  }  ) ingredients page chicken 42 ... ... noodles 42 ... ... Thursday, September 6, 12
  • 38. Unique Indexes • db.foo.ensureIndex( { email : 1 } , {unique : true} ) > db.foo.insert({email : “matulef@10gen.com”}) > db.foo.insert({email : “matulef@10gen.com”}) E11000 duplicate key error ... Thursday, September 6, 12
  • 39. Sparse Indexes • db.foo.ensureIndex( { email : 1 } , {sparse : true} ) No index entries for docs without “email” field Thursday, September 6, 12
  • 40. Geospatial Indexes { name: "10gen Office", lat_long: [ 52.5184, 13.387 ] } > db.foo.ensureIndex( { lat_long : “2d” } ) > db.locations.find( { lat_long: {$near: [52.53, 13.4] } } ) Thursday, September 6, 12
  • 42. The Query Optimizer • For each “type” of query, mongoDB periodically tries all useful indexes. • Aborts as soon as one plan wins. • Winning plan is temporarily cached. Thursday, September 6, 12
  • 43. Which plan wins? Explain! > db.foo.find( { t: { $lt : 40 } } ).explain( ) { "cursor" : "BtreeCursor t_1" , "n" : 42, “nscannedObjects: 42 "nscanned" : 42, ... "millis" : 0, ... } Thursday, September 6, 12
  • 44. Which plan wins? Explain! > db.foo.find( { t: { $lt : 40 } } ).explain( ) { "cursor" : "BtreeCursor t_1" , "n" : 42, Pay attention to the “nscannedObjects: 42 "nscanned" : 42, ratio  n/nscanned! ... "millis" : 0, ... } Thursday, September 6, 12
  • 45. Think you know better? Give us a hint > db.foo.find( { t: { $lt : 40 } } ).hint( { _id : 1} ) Thursday, September 6, 12
  • 46. Recording slow queries > db.setProfilingLevel( n , slowms=100ms ) n=0 profiler off n=1 record queries longer than slowms n=2 record all queries > db.system.profile.find() Thursday, September 6, 12
  • 48. Background index builds db.foo.ensureIndex( { user : 1 } , { background : true } ) Caveats: • still resource-intensive • will build in foreground on secondaries Thursday, September 6, 12
  • 49. Minimizing impact on Replica Sets for (s in secondaries) s.restartAsStandalone() s.buildIndex() s.restartAsReplSetMember() s.waitForCatchup() p.stepDown() p.restartAsStandalone() p.buildIndex() p.restartAsReplSetMember() Thursday, September 6, 12
  • 50. Absent or suboptimal indexes are the most common avoidable MongoDB performance problem... ...so take some time and get your indexes right! Thursday, September 6, 12
  • 51. Thanks! (and thanks to Richard Kreuter for the slides) Thursday, September 6, 12