Query Optimization in
MongoDB
TehranDB
MongoDB
MongoDB is a cross-platform, document-oriented database that provides high performance,
high availability, and easy scalability. MongoDB works on the concepts of collections and documents.
Database
Database is a physical container for collections. Each database gets its own set of files on the
file system. A single MongoDB server typically has multiple databases.
Collection
Collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A
collection exists within a single database. Collections do not enforce a schema. Documents
within a collection can have different fields. Typically, all documents in a collection are of similar
or related purpose.
Document
A document is a set of key-value pairs. Documents have dynamic schema. Dynamic schema
means that documents in the same collection do not need to have the same set of fields or
structure, and common fields in a collection's documents may hold different types of data.
The following table shows the relationship of RDBMS terminology with MongoDB.
Advantages of MongoDB over RDBMS
• Schema less − MongoDB is a document database in which one collection
holds different documents. Number of fields, content and size of the document can differ
from one document to another.
• Structure of a single object is clear.
• No complex joins.
• Deep query-ability. MongoDB supports dynamic queries on documents using a
document-based query language that's nearly as powerful as SQL.
• Tuning.
• Ease of scale-out − MongoDB is easy to scale.
• Conversion/mapping of application objects to database objects not needed.
• Uses internal memory for storing the (windowed) working set, enabling faster
access of data.
Where to Use MongoDB?
• Big Data
• Content Management and Delivery
• Mobile and Social Infrastructure
• User Data Management
Analyze Your Queries
Like many databases, MongoDB provides an explain facility that reports statistics
about the performance of a query. You can add explain('executionStats') to a query:
db.user.find(
{ country: 'AU', city: 'Melbourne' }
).explain('executionStats');
Explain
or append it to the collection:
db.user.explain('executionStats').find(
{ country: 'AU', city: 'Melbourne' }
);
This returns a large JSON result, but there are four primary values to examine:
queryPlanner.winningPlan.stage :
1. COLLSCAN : Indicates a collection scan.
2. IXSCAN : Indicates index use.
executionStats.nReturned :
The number of documents returned.
executionStats.totalDocsExamined :
The number of documents scanned to find the result.
executionStats.totalKeysExamined :
The number of index entries scanned. 0 indicates that the query is not using an index.
If the number of documents examined greatly exceeds the number returned, the query
may not be efficient. In the worst cases, MongoDB might have to scan every document
in the collection.
Explain Result Example 1
{
"queryPlanner" : {
"plannerVersion" : 1,
...
"winningPlan" : {
"stage" : "COLLSCAN",
...
}
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 3,
"executionTimeMillis" : 0,
"totalKeysExamined" : 0,
"totalDocsExamined" : 10,
"executionStages" : {
"stage" : "COLLSCAN",
...
},
...
},
...
}
Explain Result Example 2
{
"queryPlanner" : {
"plannerVersion" : 1,
...
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"quantity" : 1
},
...
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 3,
"executionTimeMillis" : 0,
"totalKeysExamined" : 3,
"totalDocsExamined" : 3,
"executionStages" : {
...
},
...
},
...
}
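The two example results above can also be compared in code. The following is a minimal sketch in plain JavaScript, not a MongoDB API: summarizeExplain is a hypothetical helper, and the input objects copy only the fields shown in the two examples. It assumes each stage has at most one inputStage, which is a simplification of real plans.

```javascript
// Hypothetical helper (not part of MongoDB): summarizes an explain() result.
function summarizeExplain(explain) {
  // Collect stage names by following inputStage links in the winning plan.
  const stages = [];
  let node = explain.queryPlanner.winningPlan;
  while (node) {
    stages.push(node.stage);
    node = node.inputStage;
  }
  const stats = explain.executionStats;
  return {
    stages,
    usesIndex: stages.includes("IXSCAN"),
    // Documents examined per document returned; a value near 1 is ideal.
    docsPerResult:
      stats.nReturned === 0 ? null : stats.totalDocsExamined / stats.nReturned,
  };
}

// Values copied from Explain Result Example 1 and Example 2.
const example1 = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN" } },
  executionStats: { nReturned: 3, totalKeysExamined: 0, totalDocsExamined: 10 },
};
const example2 = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } },
  },
  executionStats: { nReturned: 3, totalKeysExamined: 3, totalDocsExamined: 3 },
};

const summary1 = summarizeExplain(example1);
const summary2 = summarizeExplain(example2);
console.log(summary1.usesIndex, summary2.usesIndex); // prints "false true"
```

In Example 1 the server examines 10 documents to return 3, while in Example 2 it examines exactly the 3 documents it returns.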
Add Appropriate Indexes
Index Types
1. Single Field
2. Compound Index
3. Multikey Index
4. Geospatial Index
5. Text Index
6. Hashed Indexes
Simple Index
MongoDB supports the creation of user-defined ascending/descending indexes on a
single field of a document.
Default _id Index
MongoDB creates a unique index on the _id field during the creation of a collection. The
_id index prevents clients from inserting two documents with the same value for the _id
field. You cannot drop this index on the _id field.
Create an Index
The following example creates a single key descending index on the name field:
db.collection.createIndex( { name: -1 } )
Compound Index
MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes.
The order of fields listed in a compound index has significance.
For instance, if a compound index consists of { userid: 1, score: -1 }, the index sorts first
by userid and then, within each userid value, sorts by score.
db.collection.createIndex( { userid: 1, score: -1 } )
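To make the ordering concrete, here is a small sketch in plain JavaScript that arranges some made-up entries the way a { userid: 1, score: -1 } index would. It only illustrates the sort order and does not model how MongoDB stores an index.

```javascript
// Made-up sample entries for illustration only.
const entries = [
  { userid: "b", score: 10 },
  { userid: "a", score: 5 },
  { userid: "a", score: 9 },
  { userid: "b", score: 30 },
];

// Compare as { userid: 1, score: -1 } would:
// userid ascending first, then score descending within each userid.
function compareByIndexOrder(x, y) {
  if (x.userid !== y.userid) return x.userid < y.userid ? -1 : 1;
  return y.score - x.score;
}

const ordered = [...entries].sort(compareByIndexOrder);
console.log(ordered.map((e) => `${e.userid}:${e.score}`).join(" "));
// prints "a:9 a:5 b:30 b:10"
```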
Multikey Index
MongoDB uses multikey indexes to index the content stored in arrays. If you index a
field that holds an array value, MongoDB creates separate index entries for every
element of the array. These multikey indexes allow queries to select documents that
contain arrays by matching on element or elements of the arrays. MongoDB
automatically determines whether to create a multikey index if the indexed field contains
an array value; you do not need to explicitly specify the multikey type.
db.collection.createIndex( { "addr.zip": 1 } )
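The idea of one index entry per array element can be sketched in plain JavaScript. The documents below are made up, and buildMultikeyEntries is a hypothetical helper that only models the mapping from element value to documents, not MongoDB's actual index structure.

```javascript
// Made-up documents: "addr" holds an array of subdocuments with a zip field.
const docs = [
  { _id: 1, addr: [{ zip: "10036" }, { zip: "94301" }] },
  { _id: 2, addr: [{ zip: "10036" }] },
  { _id: 3, addr: [{ zip: "60601" }] },
];

// Hypothetical helper: build one entry per array element, zip -> list of _id values.
function buildMultikeyEntries(documents) {
  const index = new Map();
  for (const doc of documents) {
    for (const element of doc.addr) {
      if (!index.has(element.zip)) index.set(element.zip, []);
      index.get(element.zip).push(doc._id);
    }
  }
  return index;
}

const zipIndex = buildMultikeyEntries(docs);
console.log(zipIndex.get("10036")); // prints [ 1, 2 ]
console.log(zipIndex.size); // prints 3
```

Document 1 contributes two entries because its array has two elements, which is why a lookup on either zip value finds it.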
Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two
special indexes: 2d indexes that use planar geometry when returning results and
2dsphere indexes that use spherical geometry to return results.
Text Indexes
MongoDB provides a text index type that supports searching for string content in a
collection. These text indexes do not store language-specific stop words (e.g. “the”, “a”,
“or”) and stem the words in a collection to only store root words.
Hashed Indexes
To support hash based sharding, MongoDB provides a hashed index type, which
indexes the hash of the value of a field. These indexes have a more random distribution
of values along their range, but only support equality matches and cannot support
range-based queries.
Index Properties
Unique Indexes
The unique property for an index causes MongoDB to reject duplicate values for the indexed field.
Other than the unique constraint, unique indexes are functionally interchangeable with other
MongoDB indexes.
db.members.createIndex( { "user_id": 1 }, { unique: true } )
Partial Indexes
New in version 3.2.
Partial indexes only index the documents in a collection that meet a specified filter expression. By
indexing a subset of the documents in a collection, partial indexes have lower storage requirements
and reduced performance costs for index creation and maintenance.
Partial indexes offer a superset of the functionality of sparse indexes and should be preferred over
sparse indexes.
db.restaurants.createIndex( { cuisine: 1, name: 1 }, { partialFilterExpression: { rating: { $gt: 5 } } } )
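As a rough illustration of which documents end up in such an index, the sketch below applies the same condition as the partialFilterExpression above to a few made-up restaurant documents. It is plain JavaScript and only models index membership, under the assumption that rating is always a number.

```javascript
// Made-up documents for illustration only.
const restaurants = [
  { name: "A", cuisine: "Thai", rating: 7 },
  { name: "B", cuisine: "Thai", rating: 4 },
  { name: "C", cuisine: "Greek", rating: 5 },
  { name: "D", cuisine: "Greek", rating: 9 },
];

// Mirrors partialFilterExpression: { rating: { $gt: 5 } }.
const matchesFilter = (doc) => doc.rating > 5;

// Only documents that pass the filter get an index entry.
const indexedNames = restaurants.filter(matchesFilter).map((doc) => doc.name);
console.log(indexedNames); // prints [ 'A', 'D' ]
```

Restaurant C is left out because $gt is strict. A query generally needs to include a condition that falls within the filter (for example rating greater than 5) before MongoDB will consider the partial index.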
Sparse Indexes
The sparse property of an index ensures that the index only contains entries for documents that
have the indexed field. The index skips documents that do not have the indexed field.
You can combine the sparse index option with the unique index option to reject documents that
have duplicate values for a field but ignore documents that do not have the indexed key.
If a sparse index would result in an incomplete result set for queries and sort operations,
MongoDB will not use that index unless a hint() explicitly specifies the index.
db.addresses.createIndex( { "xmpp_id": 1 }, { sparse: true } )
TTL Indexes
TTL indexes are special indexes that MongoDB can use to automatically remove documents
from a collection after a certain amount of time. This is ideal for certain types of information like
machine generated event data, logs, and session information that only need to persist in a
database for a finite amount of time.
db.eventlog.createIndex( { "lastModifiedDate": 1 }, { expireAfterSeconds: 3600 } )
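The expiry rule can be sketched as a date comparison. The function below is a hypothetical plain-JavaScript model, not MongoDB code: it assumes the indexed field holds a single Date and treats any other value as never expiring.

```javascript
// Hypothetical model of the TTL rule:
// a document is eligible for removal once indexed date + expireAfterSeconds <= now.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // non-date values are not expired
  return value.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

// Made-up timestamps, checked against a one-hour TTL as in the index above.
const now = new Date("2024-01-01T12:00:00Z");
const fresh = { lastModifiedDate: new Date("2024-01-01T11:30:00Z") };
const stale = { lastModifiedDate: new Date("2024-01-01T10:59:00Z") };

console.log(isExpired(fresh, "lastModifiedDate", 3600, now)); // prints false
console.log(isExpired(stale, "lastModifiedDate", 3600, now)); // prints true
```

Removal is done by a background task that runs periodically, so an expired document may remain in the collection for a short time after it becomes eligible.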
Index Intersection
MongoDB can use the intersection of multiple indexes to fulfill queries. In general, each
index intersection involves two indexes; however, MongoDB can employ multiple/nested
index intersections to resolve a query.
To illustrate index intersection, consider a collection orders that has the following
indexes:
{ qty: 1 }
{ item: 1 }
MongoDB can use the intersection of the two indexes to support the following query:
db.orders.find( { item: "abc123", qty: { $gt: 15 } } )
To determine if MongoDB used index intersection, run explain(); the results of explain()
will include either an AND_SORTED stage or an AND_HASH stage.
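Conceptually, each index yields a set of candidate documents and the query result is the overlap of those sets. The sketch below models this in plain JavaScript with made-up orders. It illustrates the set logic only and says nothing about how the AND_SORTED or AND_HASH stages are implemented.

```javascript
// Made-up documents for illustration only.
const orders = [
  { _id: 1, item: "abc123", qty: 20 },
  { _id: 2, item: "abc123", qty: 10 },
  { _id: 3, item: "xyz789", qty: 30 },
  { _id: 4, item: "abc123", qty: 16 },
];

// Candidate _id values that each single-field index would yield on its own.
const fromItemIndex = new Set(
  orders.filter((o) => o.item === "abc123").map((o) => o._id)
);
const fromQtyIndex = new Set(
  orders.filter((o) => o.qty > 15).map((o) => o._id)
);

// The intersection satisfies both conditions of the query.
const intersection = [...fromItemIndex].filter((id) => fromQtyIndex.has(id));
console.log(intersection); // prints [ 1, 4 ]
```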
Index Prefix Intersection
With index intersection, MongoDB can use an intersection of either the entire index or
the index prefix. An index prefix is a subset of a compound index, consisting of one or
more keys starting from the beginning of the index.
Consider a collection orders with the following indexes:
{ qty: 1 }
{ status: 1, ord_date: -1 }
To fulfill the following query which specifies a condition on both the qty field and the
status field, MongoDB can use the intersection of the two indexes:
db.orders.find( { qty: { $gt: 10 } , status: "A" } )
Index Intersection and Compound Indexes
Index intersection does not eliminate the need for creating compound indexes. However, because both the list
order (i.e. the order in which the keys are listed in the index) and the sort order (i.e. ascending or descending),
matter in compound indexes, a compound index may not support a query condition that does not include the
index prefix keys or that specifies a different sort order.
For example, if a collection orders has the following compound index, with the status field listed before the
ord_date field:
{ status: 1, ord_date: -1 }
The compound index can support the following queries:
db.orders.find( { status: { $in: ["A", "P" ] } } )
db.orders.find({ord_date: { $gt: new Date("2014-02-01") }, status: {$in:[ "P", "A" ] }})
But not the following two queries:
db.orders.find( { ord_date: { $gt: new Date("2014-02-01") } } )
db.orders.find( { } ).sort( { ord_date: 1 } )
However, if the collection has two separate indexes:
{ status: 1 }
{ ord_date: -1 }
The two indexes can, either individually or through index intersection, support all four aforementioned queries.
The choice between creating compound indexes that support your queries or relying on index intersection
depends on the specifics of your system.
Index Intersection and Sort
Index intersection does not apply when the sort() operation requires an index completely separate from the
query predicate.
For example, the orders collection has the following indexes:
{ qty: 1 }
{ status: 1, ord_date: -1 }
{ status: 1 }
{ ord_date: -1 }
MongoDB cannot use index intersection for the following query with sort:
db.orders.find( { qty: { $gt: 10 } } ).sort( { status: 1 } )
That is, MongoDB does not use the { qty: 1 } index for the query, and the separate { status: 1 } or the { status:
1, ord_date: -1 } index for the sort.
However, MongoDB can use index intersection for the following query with sort since the index { status: 1,
ord_date: -1 } can fulfill part of the query predicate.
db.orders.find( { qty: { $gt: 10 } , status: "A" } ).sort( { ord_date: -1 } )
Compound Indexes Prefix
Index prefixes are the beginning subsets of indexed fields. For example, consider the
following compound index:
{ "item": 1, "location": 1, "stock": 1 }
The index has the following index prefixes:
• { item: 1 }
• { item: 1, location: 1 }
For a compound index, MongoDB can use the index to support queries on the index
prefixes. As such, MongoDB can use the index for queries on the following fields:
• the item field,
• the item field and the location field,
• the item field and the location field and the stock field.
MongoDB can also use the index to support a query on item and stock fields since item
field corresponds to a prefix. However, the index would not be as efficient in supporting
the query as would be an index on only item and stock.
However, MongoDB cannot use the index to support queries that include the following
fields since without the item field, none of the listed fields correspond to a prefix index:
• the location field,
• the stock field
• the location and stock fields.
If you have a collection that has both a compound index and an index on its prefix (e.g. {
a: 1, b: 1 } and { a: 1 }), if neither index has a sparse or unique constraint, then you can
remove the index on the prefix (e.g. { a: 1 }). MongoDB will use the compound index in
all of the situations that it would have used the prefix index.
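The prefix rules above can be expressed as a small check. This is a simplified sketch in plain JavaScript with hypothetical helper names. It only tests whether a query includes the leading index key and ignores sorts, covered queries, and the planner's cost decisions.

```javascript
const indexKeys = ["item", "location", "stock"];

// Hypothetical helper: list the index prefixes (leading subsets shorter than the full index).
function indexPrefixes(keys) {
  const prefixes = [];
  for (let n = 1; n < keys.length; n++) prefixes.push(keys.slice(0, n));
  return prefixes;
}

// Hypothetical helper: a query can use the index only if it includes the first key.
function canUseIndex(keys, queryFields) {
  return queryFields.includes(keys[0]);
}

console.log(indexPrefixes(indexKeys));
// prints [ [ 'item' ], [ 'item', 'location' ] ]
console.log(canUseIndex(indexKeys, ["item", "stock"])); // prints true
console.log(canUseIndex(indexKeys, ["location", "stock"])); // prints false
```

The item and stock query passes the check because item is the leading key, although, as noted above, the index is less efficient for it than an index on only item and stock would be.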
Optimizing MongoDB Compound Indexes
To create the best index for complex MongoDB queries that combine equality
tests, sorts, and range filters, and to choose the best order for fields in a compound
index, you must consider index cardinality and selectivity.
Index Cardinality
Index cardinality refers to how many distinct values a field can hold. The field
sex has only two possible values, so it has very low cardinality. Other fields such as
names, usernames, phone numbers, and emails have a nearly unique value for
every document in the collection, which is considered high cardinality.
• Greater Cardinality
The greater the cardinality of a field, the more helpful an index will be, because indexes
narrow the search space to a much smaller set.
Suppose you have an index on sex and you are looking for men named John. You would only
narrow down the result space by approximately 50% if you indexed by sex first.
Conversely, if you indexed by name, you would immediately narrow down the result set
to the small fraction of users named John, and then refer to those documents to
check the gender.
Selectivity
Also, you want to use indexes selectively and write queries that limit the number of
possible documents with the indexed field. To keep it simple, consider the following
collection. If your index is { name: 1 } and you run the query { name: "John", sex: "male" },
MongoDB has to scan only 1 document, because the index allowed it to be selective.
If your index is { sex: 1 } and you run the query { sex: "male", name: "John" },
MongoDB has to scan 4 documents.
{_id:ObjectId(),name:"John",sex:"male"}
{_id:ObjectId(),name:"Rich",sex:"male"}
{_id:ObjectId(),name:"Mose",sex:"male"}
{_id:ObjectId(),name:"Sami",sex:"male"}
{_id:ObjectId(),name:"Cari",sex:"female"}
{_id:ObjectId(),name:"Mary",sex:"female"}
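The two counts quoted for this collection can be reproduced with a short sketch in plain JavaScript. docsExamined is a hypothetical helper that models a single-field index as "only documents whose indexed field equals the queried value are examined". The _id fields are omitted because they play no role here.

```javascript
// The six documents from the collection above, without _id.
const users = [
  { name: "John", sex: "male" },
  { name: "Rich", sex: "male" },
  { name: "Mose", sex: "male" },
  { name: "Sami", sex: "male" },
  { name: "Cari", sex: "female" },
  { name: "Mary", sex: "female" },
];

// Hypothetical helper: count documents that a single-field index on
// indexedField would hand over for further checking.
function docsExamined(docs, indexedField, query) {
  return docs.filter((doc) => doc[indexedField] === query[indexedField]).length;
}

const query = { name: "John", sex: "male" };
console.log(docsExamined(users, "name", query)); // prints 1
console.log(docsExamined(users, "sex", query)); // prints 4
```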
Equality, Range Query, And Sort
Method
So here’s my method for creating a compound index for a query combining equality tests, sort fields, and range
filters:
Equality Tests
• Add all equality-tested fields to the compound index, in any order
Sort Fields (ascending / descending only matters if there are multiple sort fields)
• Add sort fields to the index in the same order and direction as your query’s sort
Range Filters
• First, add the range filter for the field with the lowest cardinality (fewest distinct values in the collection)
• Then the next lowest-cardinality range filter, and so on to the highest-cardinality
You can omit some equality-test fields or range-filter fields if they are not selective, to decrease the index
size—a rule of thumb is, if the field doesn’t filter out at least 90% of the possible documents in your collection,
it’s probably better to omit it from the index. Remember that if you have several indexes on a collection, you
may need to hint Mongo to use the right index.
That’s it! For complex queries on several fields, there’s a heap of possible indexes to consider. If you use this
method you’ll narrow your choices radically and go straight to a good index.
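The ordering rules of this method can be written down as a small function. This is a sketch in plain JavaScript under stated assumptions: buildCompoundIndex is a hypothetical helper, the votes field is invented for illustration, and the cardinality numbers are made-up estimates that you would have to supply yourself.

```javascript
// Hypothetical helper: assemble an index spec as equality fields, then sort fields,
// then range fields from lowest to highest cardinality.
function buildCompoundIndex({ equality, sort, ranges }) {
  const spec = {};
  // Equality-tested fields first, in any order.
  for (const field of equality) spec[field] = 1;
  // Sort fields next, in the same order and direction as the query's sort.
  for (const [field, direction] of sort) spec[field] = direction;
  // Range fields last, lowest cardinality first.
  const byCardinality = [...ranges].sort((a, b) => a.cardinality - b.cardinality);
  for (const { field } of byCardinality) spec[field] = 1;
  return spec;
}

const spec = buildCompoundIndex({
  equality: ["anonymous"],
  sort: [["rating", -1]],
  ranges: [
    { field: "timestamp", cardinality: 1000 }, // assumed estimate
    { field: "votes", cardinality: 50 }, // hypothetical field and estimate
  ],
});
console.log(Object.keys(spec).join(", "));
// prints "anonymous, rating, votes, timestamp"
```

The resulting object could be passed to createIndex. The sketch does not apply the rule of thumb about dropping fields that are not selective, which remains a judgment call.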
❖ Example:
db.comments.find( { timestamp: { $gte: 2, $lte: 4 }, anonymous: false } ).sort( { rating: -1 } ).hint( { anonymous: 1, rating: 1, timestamp: 1 } ).explain()
{ timestamp: 1, anonymous: false, rating: 3 }
{ timestamp: 2, anonymous: false, rating: 5 }
{ timestamp: 3, anonymous: true, rating: 1 }
{ timestamp: 4, anonymous: false, rating: 2 }
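To see how the hinted index serves this query, the sketch below replays it over the four documents in plain JavaScript. It is a simplified model, not the server's algorithm: it assumes the equality test bounds the scan, the range filter is checked on each scanned entry, and the descending sort is obtained by walking the index backwards.

```javascript
// The four sample documents from the example above.
const comments = [
  { timestamp: 1, anonymous: false, rating: 3 },
  { timestamp: 2, anonymous: false, rating: 5 },
  { timestamp: 3, anonymous: true, rating: 1 },
  { timestamp: 4, anonymous: false, rating: 2 },
];

// Arrange entries in the order of { anonymous: 1, rating: 1, timestamp: 1 }.
const indexOrder = [...comments].sort(
  (a, b) =>
    Number(a.anonymous) - Number(b.anonymous) ||
    a.rating - b.rating ||
    a.timestamp - b.timestamp
);

// The equality test limits the scan to entries with anonymous === false.
const scanned = indexOrder.filter((c) => c.anonymous === false);

// The range filter on timestamp is checked for each scanned entry.
const matched = scanned.filter((c) => c.timestamp >= 2 && c.timestamp <= 4);

// Reading the index range backwards gives rating descending without a separate sort.
const results = [...matched].reverse();

console.log(scanned.length, matched.length); // prints "3 2"
console.log(results.map((c) => c.rating)); // prints [ 5, 2 ]
```

In this model three index entries are scanned to return two documents, and the results come out already ordered by rating.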

From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 

Query Optimization in MongoDB

  • 2. MongoDB
    MongoDB is a cross-platform, document-oriented database that provides high performance, high availability, and easy scalability. MongoDB works on the concepts of collections and documents.
    Database: A database is a physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases.
    Collection: A collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema, so documents within a collection can have different fields. Typically, all documents in a collection serve a similar or related purpose.
    Document: A document is a set of key-value pairs. Documents have a dynamic schema, which means that documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data.
    The following table shows the relationship of RDBMS terminology with MongoDB.
  • 3. Advantages of MongoDB over RDBMS
    • Schemaless: MongoDB is a document database in which one collection holds different documents. The number of fields, content, and size of a document can differ from one document to another.
    • The structure of a single object is clear.
    • No complex joins.
    • Deep query-ability: MongoDB supports dynamic queries on documents using a document-based query language that is nearly as powerful as SQL.
    • Tuning.
    • Ease of scale-out: MongoDB is easy to scale.
    • No conversion or mapping of application objects to database objects is needed.
    • Uses internal memory to store the (windowed) working set, enabling faster access to data.
  • 4. Where to Use MongoDB?
    • Big Data
    • Content Management and Delivery
    • Mobile and Social Infrastructure
    • User Data Management
  • 5. Analyze Your Queries
    Like many databases, MongoDB provides an explain facility that reports statistics about the performance of a query. You can append explain('executionStats') to a query:
      db.user.find( { country: 'AU', city: 'Melbourne' } ).explain('executionStats');
  • 6. Explain
    Alternatively, call explain() on the collection and chain the query after it:
      db.user.explain('executionStats').find( { country: 'AU', city: 'Melbourne' } );
  • 7. Explain
    This returns a large JSON result, but there are four primary values to examine:
    • queryPlanner.winningPlan.stage: the stage the winning plan uses.
        1. COLLSCAN: indicates a collection scan.
        2. IXSCAN: indicates index use.
    • executionStats.nReturned: the number of documents returned.
    • executionStats.totalDocsExamined: the number of documents scanned to find the result.
    • executionStats.totalKeysExamined: the number of index entries scanned. A value of 0 indicates that the query is not using an index.
    If the number of documents examined greatly exceeds the number returned, the query may not be efficient. In the worst case, MongoDB might have to scan every document in the collection.
  • 8. Explain Result Example 1
      {
         "queryPlanner" : {
            "plannerVersion" : 1,
            ...
            "winningPlan" : {
               "stage" : "COLLSCAN",
               ...
            }
         },
         "executionStats" : {
            "executionSuccess" : true,
            "nReturned" : 3,
            "executionTimeMillis" : 0,
            "totalKeysExamined" : 0,
            "totalDocsExamined" : 10,
            "executionStages" : {
               "stage" : "COLLSCAN",
               ...
            },
            ...
         },
         ...
      }
  • 9. Explain Result Example 2
      {
         "queryPlanner" : {
            "plannerVersion" : 1,
            ...
            "winningPlan" : {
               "stage" : "FETCH",
               "inputStage" : {
                  "stage" : "IXSCAN",
                  "keyPattern" : { "quantity" : 1 },
                  ...
               }
            },
            "rejectedPlans" : [ ]
         },
         "executionStats" : {
            "executionSuccess" : true,
            "nReturned" : 3,
            "executionTimeMillis" : 0,
            "totalKeysExamined" : 3,
            "totalDocsExamined" : 3,
            "executionStages" : { ... },
            ...
         },
         ...
      }
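The values highlighted in the explain slides can be pulled out of a result programmatically. The following is a minimal sketch in plain JavaScript, not part of the original deck: `collectStages` and `summarizeExplain` are hypothetical helper names, and the two sample objects are abbreviated copies of Example 1 and Example 2. It assumes the plan tree nests through `inputStage` or `inputStages`, as in the examples.

```javascript
// Collect every stage name in a plan tree. Stages nest through
// `inputStage` (one child) or `inputStages` (several children).
function collectStages(plan) {
  if (!plan) return [];
  const children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  return [plan.stage, ...children.flatMap(collectStages)];
}

// Hypothetical helper: condense an explain('executionStats') result
// into the handful of values discussed on the Explain slides.
function summarizeExplain(result) {
  const stats = result.executionStats;
  const stages = collectStages(result.queryPlanner.winningPlan);
  return {
    stages,
    usesIndex: stages.includes("IXSCAN"),
    collectionScan: stages.includes("COLLSCAN"),
    nReturned: stats.nReturned,
    totalKeysExamined: stats.totalKeysExamined,
    // Documents examined per document returned; a large value
    // suggests the query is doing a lot of wasted work.
    docsPerResult: stats.nReturned > 0
      ? stats.totalDocsExamined / stats.nReturned
      : stats.totalDocsExamined,
  };
}

// Abbreviated copies of the two example results.
const example1 = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN" } },
  executionStats: { nReturned: 3, totalKeysExamined: 0, totalDocsExamined: 10 },
};
const example2 = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", keyPattern: { quantity: 1 } },
    },
  },
  executionStats: { nReturned: 3, totalKeysExamined: 3, totalDocsExamined: 3 },
};

console.log(summarizeExplain(example1));
console.log(summarizeExplain(example2));
```

In the mongo shell, the same helper could be fed a live result, for instance `summarizeExplain(db.user.find({ country: 'AU' }).explain('executionStats'))`.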
  • 10. Add Appropriate Indexes
    Index Types
      1. Single Field Index
      2. Compound Index
      3. Multikey Index
      4. Geospatial Index
      5. Text Index
      6. Hashed Index
  • 11. Simple Index
    MongoDB supports the creation of user-defined ascending/descending indexes on a single field of a document.
    Default _id Index: MongoDB creates a unique index on the _id field during the creation of a collection. The _id index prevents clients from inserting two documents with the same value for the _id field. You cannot drop this index on the _id field.
    Create an Index: The following example creates a single-key descending index on the name field:
      db.collection.createIndex( { name: -1 } )
  • 12. Compound Index
    MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes. The order of the fields listed in a compound index is significant. For instance, if a compound index consists of { userid: 1, score: -1 }, the index sorts first by userid (ascending) and then, within each userid value, by score (descending).
      db.collection.createIndex( { userid: 1, score: -1 } )
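To make the ordering concrete, here is a small plain-JavaScript sketch (not from the original deck) that sorts a few made-up records the way the key pattern { userid: 1, score: -1 } orders its entries. The sample values and the name `compareByIndexOrder` are illustrative assumptions.

```javascript
// Made-up sample records for illustration only.
const scores = [
  { userid: "a", score: 5 },
  { userid: "b", score: 10 },
  { userid: "a", score: 9 },
  { userid: "b", score: 30 },
  { userid: "a", score: 7 },
];

// Comparator mirroring the key pattern { userid: 1, score: -1 }:
// userid ascending first, then score descending within each userid.
function compareByIndexOrder(a, b) {
  if (a.userid !== b.userid) return a.userid < b.userid ? -1 : 1;
  return b.score - a.score;
}

const ordered = [...scores].sort(compareByIndexOrder);
console.log(ordered.map((d) => `${d.userid}:${d.score}`).join(" "));
```

All entries for one userid sit next to each other, already ordered by score, which is why a query that filters on userid and sorts by score descending can read them straight from such an index.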
  • 13. Multikey Index
    MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on one or more elements of the arrays. MongoDB automatically determines whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify the multikey type. Note that a dotted field name must be quoted:
      db.collection.createIndex( { "addr.zip": 1 } )
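The "one entry per array element" behaviour can be modelled in a few lines of plain JavaScript. This is a simplified sketch of the idea, not MongoDB's actual index structure; `valuesAt`, `multikeyEntries`, and the sample documents are hypothetical.

```javascript
// Resolve a dotted path such as "addr.zip", fanning out over arrays
// met along the way so each array element yields its own value.
function valuesAt(value, segments) {
  if (Array.isArray(value)) return value.flatMap((v) => valuesAt(v, segments));
  if (segments.length === 0) return [value];
  if (value === null || typeof value !== "object") return [];
  return valuesAt(value[segments[0]], segments.slice(1));
}

// Build a simplified multikey "index": one { key, _id } entry per
// array element, kept sorted by key.
function multikeyEntries(docs, path) {
  const entries = [];
  for (const doc of docs) {
    for (const key of valuesAt(doc, path.split("."))) {
      entries.push({ key, _id: doc._id });
    }
  }
  return entries.sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a._id - b._id
  );
}

// Made-up documents: addr is an array of embedded documents.
const docs = [
  { _id: 1, addr: [{ zip: "10036" }, { zip: "94301" }] },
  { _id: 2, addr: [{ zip: "10036" }] },
];

const entries = multikeyEntries(docs, "addr.zip");
console.log(entries);

// An equality lookup on one element finds every document whose array contains it.
const idsFor = (zip) => entries.filter((e) => e.key === zip).map((e) => e._id);
console.log(idsFor("10036"));
```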
  • 14. Geospatial Index
    To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes, which use planar geometry when returning results, and 2dsphere indexes, which use spherical geometry.
  • 15. Text Indexes
    MongoDB provides a text index type that supports searching for string content in a collection. Text indexes do not store language-specific stop words (e.g. "the", "a", "or"), and they stem the words in a collection so that only root words are stored.
  • 16. Hashed Indexes
    To support hash-based sharding, MongoDB provides a hashed index type, which indexes the hash of the value of a field. These indexes have a more random distribution of values along their range, but they only support equality matches and cannot support range-based queries.
  • 17. Index Properties
    Unique Indexes
    The unique property for an index causes MongoDB to reject duplicate values for the indexed field. Other than the unique constraint, unique indexes are functionally interchangeable with other MongoDB indexes.
      db.members.createIndex( { "user_id": 1 }, { unique: true } )
    Partial Indexes (new in version 3.2)
    Partial indexes only index the documents in a collection that meet a specified filter expression. By indexing a subset of the documents in a collection, partial indexes have lower storage requirements and reduced performance costs for index creation and maintenance. Partial indexes offer a superset of the functionality of sparse indexes and should be preferred over sparse indexes.
      db.restaurants.createIndex( { cuisine: 1, name: 1 }, { partialFilterExpression: { rating: { $gt: 5 } } } )
  • 18. Index Properties
    Sparse Indexes
    The sparse property of an index ensures that the index only contains entries for documents that have the indexed field. The index skips documents that do not have the indexed field. You can combine the sparse index option with the unique index option to reject documents that have duplicate values for a field but ignore documents that do not have the indexed key. If a sparse index would result in an incomplete result set for queries and sort operations, MongoDB will not use that index unless a hint() explicitly specifies the index.
      db.addresses.createIndex( { "xmpp_id": 1 }, { sparse: true } )
    TTL Indexes
    TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time. This is ideal for information such as machine-generated event data, logs, and session information that only needs to persist in a database for a finite amount of time.
      db.eventlog.createIndex( { "lastModifiedDate": 1 }, { expireAfterSeconds: 3600 } )
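The TTL rule itself is simple arithmetic: a document becomes eligible for removal once the indexed date plus expireAfterSeconds has passed. The plain-JavaScript sketch below models only that eligibility check, under the assumption that documents whose indexed field is not a Date never expire; `isExpired` and the sample documents are hypothetical, and the actual deletion in MongoDB is done by a background task, so removal can lag behind the computed time.

```javascript
// Return true once `expireAfterSeconds` have elapsed since the date
// stored in `field`. Documents without a Date in that field are kept.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false;
  return value.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

// Made-up event log entries, checked against a fixed "now".
const now = new Date("2024-01-01T12:00:00Z");
const eventlog = [
  { _id: 1, lastModifiedDate: new Date("2024-01-01T11:30:00Z") }, // 30 minutes old
  { _id: 2, lastModifiedDate: new Date("2024-01-01T10:59:00Z") }, // 61 minutes old
  { _id: 3, lastModifiedDate: "not a date" },                     // never expires
];

const expiredIds = eventlog
  .filter((doc) => isExpired(doc, "lastModifiedDate", 3600, now))
  .map((doc) => doc._id);
console.log(expiredIds);
```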
  • 19. Index Intersection
    MongoDB can use the intersection of multiple indexes to fulfill queries. In general, each index intersection involves two indexes; however, MongoDB can employ multiple/nested index intersections to resolve a query.
    To illustrate index intersection, consider a collection orders that has the following indexes:
      { qty: 1 }
      { item: 1 }
    MongoDB can use the intersection of the two indexes to support the following query:
      db.orders.find( { item: "abc123", qty: { $gt: 15 } } )
    To determine whether MongoDB used index intersection, run explain(); the results will include either an AND_SORTED stage or an AND_HASH stage.
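Checking for an AND_SORTED or AND_HASH stage by eye is tedious in a large explain result, so the check can be scripted. The sketch below is plain JavaScript with hypothetical helper names (`collectPlanStages`, `describeIntersection`), and the sample plan is a hand-written illustration of what an intersection plan for the orders query might look like, not captured output.

```javascript
// Flatten a plan tree into a list of stage objects. Stages nest
// through `inputStage` (one child) or `inputStages` (several).
function collectPlanStages(plan) {
  if (!plan) return [];
  const children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  return [plan, ...children.flatMap(collectPlanStages)];
}

// Report whether the plan intersects indexes and which key patterns it scans.
function describeIntersection(winningPlan) {
  const stages = collectPlanStages(winningPlan);
  return {
    intersected: stages.some(
      (s) => s.stage === "AND_SORTED" || s.stage === "AND_HASH"
    ),
    indexes: stages.filter((s) => s.stage === "IXSCAN").map((s) => s.keyPattern),
  };
}

// Hand-written illustration of an intersection plan (an assumption).
const samplePlan = {
  stage: "FETCH",
  inputStage: {
    stage: "AND_SORTED",
    inputStages: [
      { stage: "IXSCAN", keyPattern: { item: 1 } },
      { stage: "IXSCAN", keyPattern: { qty: 1 } },
    ],
  },
};

console.log(describeIntersection(samplePlan));
```

In the mongo shell the helper could be applied to `db.orders.find(...).explain().queryPlanner.winningPlan`.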
  • 20. Index Prefix Intersection
    With index intersection, MongoDB can use an intersection of either the entire index or the index prefix. An index prefix is a subset of a compound index, consisting of one or more keys starting from the beginning of the index.
    Consider a collection orders with the following indexes:
      { qty: 1 }
      { status: 1, ord_date: -1 }
    To fulfill the following query, which specifies a condition on both the qty field and the status field, MongoDB can use the intersection of the two indexes:
      db.orders.find( { qty: { $gt: 10 }, status: "A" } )
  • 21. Index Intersection and Compound Indexes
    Index intersection does not eliminate the need for creating compound indexes. However, because both the list order (i.e. the order in which the keys are listed in the index) and the sort order (i.e. ascending or descending) matter in compound indexes, a compound index may not support a query condition that does not include the index prefix keys or that specifies a different sort order.
    For example, if a collection orders has the following compound index, with the status field listed before the ord_date field:
      { status: 1, ord_date: -1 }
    The compound index can support the following queries:
      db.orders.find( { status: { $in: [ "A", "P" ] } } )
      db.orders.find( { ord_date: { $gt: new Date("2014-02-01") }, status: { $in: [ "P", "A" ] } } )
    But not the following two queries:
      db.orders.find( { ord_date: { $gt: new Date("2014-02-01") } } )
      db.orders.find( { } ).sort( { ord_date: 1 } )
    However, if the collection has two separate indexes:
      { status: 1 }
      { ord_date: -1 }
    The two indexes can, either individually or through index intersection, support all four of these queries.
    The choice between creating compound indexes that support your queries and relying on index intersection depends on the specifics of your system.
  • 22. Index Intersection and Sort
    Index intersection does not apply when the sort() operation requires an index completely separate from the query predicate.
    For example, the orders collection has the following indexes:
      { qty: 1 }
      { status: 1, ord_date: -1 }
      { status: 1 }
      { ord_date: -1 }
    MongoDB cannot use index intersection for the following query with sort:
      db.orders.find( { qty: { $gt: 10 } } ).sort( { status: 1 } )
    That is, MongoDB does not use the { qty: 1 } index for the query and the separate { status: 1 } or { status: 1, ord_date: -1 } index for the sort.
    However, MongoDB can use index intersection for the following query with sort, since the index { status: 1, ord_date: -1 } can fulfill part of the query predicate:
      db.orders.find( { qty: { $gt: 10 }, status: "A" } ).sort( { ord_date: -1 } )
  • 23. Compound Indexes Prefix
    Index prefixes are the beginning subsets of indexed fields. For example, consider the following compound index:
      { "item": 1, "location": 1, "stock": 1 }
    The index has the following index prefixes:
    • { item: 1 }
    • { item: 1, location: 1 }
    For a compound index, MongoDB can use the index to support queries on the index prefixes. As such, MongoDB can use the index for queries on the following fields:
    • the item field,
    • the item field and the location field,
    • the item field and the location field and the stock field.
  • 24. Compound Indexes Prefix
    MongoDB can also use the index to support a query on the item and stock fields, since the item field corresponds to a prefix. However, the index would not be as efficient in supporting that query as an index on only item and stock would be.
    MongoDB cannot use the index to support queries on the following fields, because without the item field none of them corresponds to an index prefix:
    • the location field,
    • the stock field,
    • the location and stock fields.
    If a collection has both a compound index and an index on its prefix (e.g. { a: 1, b: 1 } and { a: 1 }), and neither index has a sparse or unique constraint, then you can remove the index on the prefix (e.g. { a: 1 }). MongoDB will use the compound index in all of the situations in which it would have used the prefix index.
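The prefix rule from the two slides above can be expressed as a short check. This is a plain-JavaScript sketch with hypothetical helper names (`indexPrefixes`, `usablePrefixLength`); it models only the "leading fields" rule described here and ignores everything else the query planner considers.

```javascript
// List the proper prefixes of a compound index key pattern.
function indexPrefixes(keyPattern) {
  const fields = Object.keys(keyPattern);
  return fields.slice(0, -1).map((_, i) => fields.slice(0, i + 1));
}

// Count how many leading index fields the query constrains.
// 0 means the query fields do not match any prefix of the index.
function usablePrefixLength(keyPattern, queryFields) {
  const wanted = new Set(queryFields);
  let length = 0;
  for (const field of Object.keys(keyPattern)) {
    if (!wanted.has(field)) break;
    length += 1;
  }
  return length;
}

const index = { item: 1, location: 1, stock: 1 };

console.log(indexPrefixes(index));
console.log(usablePrefixLength(index, ["item", "location", "stock"])); // whole index
console.log(usablePrefixLength(index, ["item", "stock"]));             // only the item prefix
console.log(usablePrefixLength(index, ["location", "stock"]));         // no prefix
```

The item-and-stock case returns a prefix length of 1, which matches the remark that the index helps with item but is less efficient than a dedicated index on item and stock.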
  • 25. Optimizing MongoDB Compound Indexes
    To create the best index for complex MongoDB queries that combine equality tests, sorts, and range filters, and to choose the best order for the fields in a compound index, you must consider index cardinality and selectivity.
  • 26. Index Cardinality
    Index cardinality refers to how many possible values there are for a field. The field sex has only two possible values, so it has very low cardinality. Other fields such as names, usernames, phone numbers, and emails have a nearly unique value for every document in the collection, which is considered high cardinality.
    • Greater Cardinality
      The greater the cardinality of a field, the more helpful an index will be, because indexes narrow the search space to a much smaller set.
      Suppose you are looking for men named John. If you indexed by sex first, you would only narrow down the result space by approximately 50%. Conversely, if you indexed by name, you would immediately narrow down the result set to the small fraction of users named John, and then refer to those documents to check the sex field.
  • 27. Selectivity
    You also want to use indexes selectively and write queries that limit the number of possible documents with the indexed field. To keep it simple, consider the following collection:
      { _id: ObjectId(), name: "John", sex: "male" }
      { _id: ObjectId(), name: "Rich", sex: "male" }
      { _id: ObjectId(), name: "Mose", sex: "male" }
      { _id: ObjectId(), name: "Sami", sex: "male" }
      { _id: ObjectId(), name: "Cari", sex: "female" }
      { _id: ObjectId(), name: "Mary", sex: "female" }
    If your index is { name: 1 } and you run the query { name: "John", sex: "male" }, MongoDB has to scan 1 document, because the index allowed it to be selective.
    If your index is { sex: 1 } and you run the query { sex: "male", name: "John" }, MongoDB has to scan 4 documents.
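The 1-versus-4 count can be reproduced with a toy model in plain JavaScript. The sketch assumes a single-field index narrows the candidates to documents matching that one field, after which each candidate is fetched and checked against the rest of the query; `docsExamined` is a hypothetical name and the `_id` values are omitted for brevity.

```javascript
// The six documents from the selectivity slide (without _id).
const people = [
  { name: "John", sex: "male" },
  { name: "Rich", sex: "male" },
  { name: "Mose", sex: "male" },
  { name: "Sami", sex: "male" },
  { name: "Cari", sex: "female" },
  { name: "Mary", sex: "female" },
];

// Toy model: the index on `indexedField` selects the candidates; every
// candidate document is then examined against the full query.
function docsExamined(docs, indexedField, query) {
  const candidates = docs.filter((d) => d[indexedField] === query[indexedField]);
  const matches = candidates.filter((d) =>
    Object.entries(query).every(([field, value]) => d[field] === value)
  );
  return { examined: candidates.length, returned: matches.length };
}

const query = { name: "John", sex: "male" };
console.log("index { name: 1 }:", docsExamined(people, "name", query));
console.log("index { sex: 1 }: ", docsExamined(people, "sex", query));
```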
  • 28. Equality, Range Query, And Sort
    Method
    So here's my method for creating a compound index for a query combining equality tests, sort fields, and range filters:
    Equality Tests
    • Add all equality-tested fields to the compound index, in any order.
    Sort Fields (ascending / descending only matters if there are multiple sort fields)
    • Add sort fields to the index in the same order and direction as your query's sort.
    Range Filters
    • First, add the range filter for the field with the lowest cardinality (fewest distinct values in the collection).
    • Then add the next lowest-cardinality range filter, and so on up to the highest-cardinality.
    You can omit some equality-test fields or range-filter fields if they are not selective, to decrease the index size. A rule of thumb: if the field doesn't filter out at least 90% of the possible documents in your collection, it's probably better to omit it from the index.
    Remember that if you have several indexes on a collection, you may need to hint MongoDB to use the right index.
    That's it! For complex queries on several fields, there's a heap of possible indexes to consider. If you use this method you'll narrow your choices radically and go straight to a good index.
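The method above is mechanical enough to write down as a function. The following plain-JavaScript sketch is one possible reading of it, not an official tool: `buildCompoundIndex` is a hypothetical name, and the cardinality numbers passed in are made-up inputs that you would have to estimate for your own collection.

```javascript
// Assemble a compound index key pattern in the order described above:
// equality fields, then sort fields, then range fields by rising cardinality.
function buildCompoundIndex({ equality = [], sort = {}, ranges = [] }) {
  const index = {};
  // 1. Equality-tested fields, in any order.
  for (const field of equality) index[field] = 1;
  // 2. Sort fields, in the same order and direction as the query's sort.
  for (const [field, direction] of Object.entries(sort)) index[field] = direction;
  // 3. Range-filter fields, lowest cardinality first.
  const byCardinality = [...ranges].sort((a, b) => a.cardinality - b.cardinality);
  for (const { field } of byCardinality) {
    if (!(field in index)) index[field] = 1;
  }
  return index;
}

// A query with an equality test on anonymous, a sort on rating,
// and a range filter on timestamp. The cardinality is an assumed value.
const keyPattern = buildCompoundIndex({
  equality: ["anonymous"],
  sort: { rating: -1 },
  ranges: [{ field: "timestamp", cardinality: 4 }],
});
console.log(keyPattern);
```

The resulting key pattern could be passed to `db.collection.createIndex(...)`. The sketch does not apply the 90% rule of thumb; dropping unselective fields is left to the caller.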
  • 29. Equality, Range Query, And Sort
    ❖ Example:
      db.comments.find(
        { timestamp: { $gte: 2, $lte: 4 }, anonymous: false } …
      ).sort( { rating: -1 } ).hint( { anonymous: 1, rating: 1, timestamp: 1 } ).explain()
    Documents:
      { timestamp: 1, anonymous: false, rating: 3 }
      { timestamp: 2, anonymous: false, rating: 5 }
      { timestamp: 3, anonymous: true, rating: 1 }
      { timestamp: 4, anonymous: false, rating: 2 }