Indexes are data structures that store a subset of data to allow for efficient retrieval. MongoDB stores indexes using a b-tree format. There are several types of indexes including simple, compound, multikey, full-text, and geospatial indexes. Indexes improve performance by enabling efficient retrieval, sorting, and filtering of documents. Indexes are created using the createIndex command and their usage can be checked using explain plans.
As your data grows, the need to establish proper indexes becomes critical to performance. MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application?
In this talk we’ll cover how indexing works, the various indexing options, and use cases where each can be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale.
MongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing (MongoDB)
Query performance can either be a constant headache or the unsung hero of an application. MongoDB provides extremely powerful querying capabilities when used properly. I will share the most common mistakes observed and some tips and tricks for avoiding them.
Presented by Tom Schreiber, Senior Consulting Engineer, MongoDB
MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application? In this talk we’ll cover how indexing works, the various indexing options, and cover use cases where each might be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale. We'll show you the tools and techniques for diagnosing and tuning the performance of your MongoDB deployment. Whether you're running into problems or just want to optimize your performance, these skills will be useful.
MongoDB World 2019: The Sights (and Smells) of a Bad Query (MongoDB)
“Why is MongoDB so slow?” you may ask yourself on occasion. You’ve created indexes, you’ve learned how to use the aggregation pipeline. What the heck? Could it be your queries? This talk will outline what tools are at your disposal (both in MongoDB Atlas and in MongoDB server) to identify inefficient queries.
Video available here: http://vivu.tv/portal/archive.jsp?flow=783-586-4282&id=1270584002677
We all know that MongoDB is one of the most flexible and feature-rich databases available. In this webinar we'll discuss how you can leverage this feature set and maintain high performance with your project's massive data sets and high loads. We'll cover how indexes can be designed to optimize the performance of MongoDB. We'll also discuss tips for diagnosing and fixing performance issues should they arise.
MongoDB World 2019: Tips and Tricks++ for Querying and Indexing MongoDB (MongoDB)
Query performance can either be a constant headache or the unsung hero of an application. MongoDB provides extremely powerful querying capabilities when used properly. As a senior member of the support team, I will share the most common mistakes observed and some tips and tricks for avoiding them.
In this presentation, we are going to discuss how Elasticsearch handles various operations like insert, update, and delete. We will also cover what an inverted index is and how segment merging works.
Having the right indexes in place is crucial to performance in MongoDB. In this talk, we’ll explain how indexes work and the various indexing options. We’ll talk about the tools available to optimize your queries and avoid common pitfalls. Throughout the session, we’ll reference real-world examples to demonstrate the importance of proper indexing.
MongoDB .local Toronto 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pi... (MongoDB)
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
Find out which is faster, SQL or NoSQL, for traditional reporting tasks. Discover how you can optimise MongoDB aggregation pipelines and how to push complex computation down to the database.
Data Definition Language (DDL), Data Manipulation Language (DML), Transaction Control Language (TCL), Data Control Language (DCL), SQL Constraints
This presentation was presented at Percona Live UK.
Although a DBMS hides the internal mechanics of indexing, you need to know how indexes work to be able to create efficient ones. This talk will help you understand the data structure used to store indexes and how it applies to InnoDB. By the end of the talk you will know how to use cost analysis to choose correct index definitions and how to create indexes that work efficiently with InnoDB.
Talk given for the #phpbenelux user group, March 27th in Gent (BE), with the goal of convincing developers that are used to build php/mysql apps to broaden their horizon when adding search to their site. Be sure to also have a look at the notes for the slides; they explain some of the screenshots, etc.
An accompanying blog post about this subject can be found at http://www.jurriaanpersyn.com/archives/2013/11/18/introduction-to-elasticsearch/
This presentation deals with the advanced features of SQL, comprising Arithmetic Calculations, Analytical Functions, PIVOT, etc. Presented by Alphalogic Inc: https://www.alphalogicinc.com/
An index is a database object that can be created on one or more columns (up to a 16-column combination). When the index is created, it reads the column(s) and forms a data structure that minimizes the number of data comparisons. An index improves the performance of data retrieval but adds overhead to data modifications such as inserts, deletes, and updates, so the trade-off depends on how much data retrieval the table serves versus how many DML (Insert, Delete and Update) operations it receives.
I inherited a MongoDB database server with 60 collections and 100 or so indexes.
The business users are complaining about slow report completion times. What can I do to improve performance?
MongoDB Days UK: Indexing and Performance Tuning (MongoDB)
Presented by Tom Schreiber, Senior Consulting Engineer, MongoDB
Experience level: Beginner
MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application? In this talk we’ll cover how indexing works, the various indexing options, and cover use cases where each might be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale. We'll show you the tools and techniques for diagnosing and tuning the performance of your MongoDB deployment. Whether you're running into problems or just want to optimize your performance, these skills will be useful.
Webinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based Sharding (MongoDB)
In version 2.4, MongoDB introduces hash-based sharding, allowing the user to shard based on a randomized shard key to spread documents evenly across a cluster. Hash-based sharding is an alternative to range-based sharding, making it easier to manage your growing cluster. In this talk, we'll provide an overview of this new feature and discuss the pros and cons of a hash-based vs. range-based approach.
Learn about the various approaches to sharding your data with MongoDB. This presentation will help you answer questions such as when to shard and how to choose a shard key.
Optimizing MongoDB: Lessons Learned at Localytics (andrew311)
Tips, tricks, and gotchas learned at Localytics for optimizing MongoDB installs. Includes information about document design, indexes, fragmentation, migration, AWS EC2/EBS, and more.
Query analyzing
Introduction to indexes
Indexes in MongoDB
Managing indexes in MongoDB
Using indexes to sort query results
When should I use indexes?
When should we avoid using indexes?
New Indexing and Aggregation Pipeline Capabilities in MongoDB 4.2 (Antonios Giannopoulos)
MongoDB 4.2 goes GA soon, delivering some amazing new features in multiple areas. In this talk, we will focus on the new capabilities of the aggregation framework. We are going to cover the new operators and expressions. At the same time, we will explore how update commands can now use aggregation framework operators. We are also going to present aggregation framework improvements, focusing on on-demand materialized views. Finally, we are going to explore the wildcard indexes introduced in MongoDB 4.2 and how they change the way we design documents and build queries/aggregations. We will also make a reference to the new index build system.
4. What are indexes?
Indexes are special data-structures that store a subset of
your data in an easily traversable format.
MongoDB stores indexes in a b-tree format which allows for
efficient access to the index content.
Proper index use makes a system run
optimally. Improper index use can bring a system to a
grinding halt.
5. What are indexes?
Indexes are stored in a format similar to the
following if there were an index on Origin:
[ABE] -> 0xa193b48c
[ABE] -> 0x8e8b242a
[ABE] -> 0x0928cdc1
…
[DEN] -> 0x24aa4ecd
[DEN] -> 0x87396a3c
[DEN] -> 0x9392ab2f
…
[LAX] -> 0x89ccede0
…
7. The _id index
• The _id index is automatically created and cannot be removed.
• This is the same as a primary key in traditional RDBMS.
• Default value is a 12-byte ObjectId:
• 4-byte time stamp
• 3-byte machine id
• 2-byte process id
• 3-byte counter
8. Simple index
• A simple index is an index on a single key
• This is similar to a book’s index where you look
up a word to find the pages it’s referenced on.
9. Compound index
• A compound index is created over two or more
fields in a document
• This is similar to a phone book where you can
find the phone number of a person given their
first and last names.
10. Multikey index
• A multikey index is an index that’s created on a
field that contains an array.
• If used in a compound index, only a single field
in a given document can be an array.
• You will get one entry in the index for every item
in the array for the given document. This means
if you have an array with 100 items, that
document will have 100 index entries.
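The one-entry-per-array-element behavior can be sketched in plain JavaScript (a simulation of the idea, not MongoDB internals; the field and document values are made up):

```javascript
// Simulate how a multikey index expands an array field into index entries.
// Each entry pairs an indexed value with a pointer back to its document.
function multikeyEntries(docs, field) {
  return docs.flatMap(function (doc) {
    var value = doc[field];
    var values = Array.isArray(value) ? value : [value];
    return values.map(function (v) {
      return { key: v, docId: doc._id };
    });
  });
}

var docs = [
  { _id: 1, tags: ["red", "green", "blue"] }, // array field -> 3 entries
  { _id: 2, tags: "plain" }                   // scalar field -> 1 entry
];

var entries = multikeyEntries(docs, "tags");
// A 100-element array would likewise produce 100 entries for that document.
console.log(entries.length); // 4
```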
11. Full-text index
• This is an index over a text based field, similar to
how Google indexes web pages.
12. Geo-spatial index
• A geo-spatial index will allow you to determine
distance from a given point.
• Works on both planar and spherical geometries.
13. Hashed indexes
• A hashed index is used in hash based sharding,
and allows for a more randomized distribution.
• Hashed indexes cannot contain compound keys
or be unique.
• The same key can be indexed in both a
hashed and a non-hashed index. The non-
hashed version will allow for range based
queries.
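A mongosh sketch of that pattern, assuming a `users` collection with a `userId` field (both names are made up; requires a running MongoDB):

```javascript
// Hashed index: usable as a hashed shard key for even distribution.
db.users.createIndex({ "userId": "hashed" })

// Separate ascending index on the same field: supports range queries,
// which the hashed index cannot.
db.users.createIndex({ "userId": 1 })

// Equality can use the hashed index; a range like this needs the ascending one.
db.users.find({ "userId": { "$gte": 1000, "$lt": 2000 } })
```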
15. Unique
• The unique property allows for only a single
value for the indexed field, or combination of
fields for a compound index
db.collection.createIndex({"email": 1}, {"unique": true})
• A unique index can only have a single null or
missing field value for all documents in the
collection.
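For example (a mongosh sketch against a made-up `people` collection; requires a running MongoDB), a second document missing the indexed field violates uniqueness, since a missing field indexes as null:

```javascript
db.people.createIndex({ "email": 1 }, { "unique": true })

db.people.insert({ "name": "A" })  // indexed email value is null - OK
db.people.insert({ "name": "B" })  // second null value - duplicate key error
```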
16. Sparse
• The sparse property allows you to index only
documents that contain a value for the given
field.
db.collection.createIndex({"kids": 1}, {"sparse": true})
• A sparse index will not be used if it would result in
an incomplete result set, unless specifically
hinted.
db.collection.find({"kids": {"$gte": 5}})
17. TTL
• The TTL property allows for the automatic removal of
documents after a given time period.
db.collection.createIndex({"accessTime": 1}, {"expireAfterSeconds": 1200})
• The indexed field should contain an ISODate() value. If
any other type is used the document will not be removed.
• The TTL removal process runs once every 60 seconds so
you might see the document even though the time has
expired.
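A working sketch of the above (the collection name is made up; note that `expireAfterSeconds` must be a number, and the indexed field a date; requires a running MongoDB):

```javascript
// Documents become eligible for removal 1200 seconds after accessTime.
db.sessions.createIndex({ "accessTime": 1 }, { "expireAfterSeconds": 1200 })

// Removed roughly 20 minutes from now (the TTL monitor runs every
// 60 seconds, so removal is not immediate).
db.sessions.insert({ "user": "alice", "accessTime": new Date() })

// A non-date value in the indexed field means the document never expires.
db.sessions.insert({ "user": "bob", "accessTime": "not a date" })
```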
18. Partial
• The partial property allows you to index a subset
of your data.
db.collection.createIndex({"movie": 1, "reviews": 1},
{"partialFilterExpression": {"rating": {"$gte": 4}}})
• The index will not be used if it would provide an
incomplete result set (similar to the sparse
index).
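A sketch of that rule, assuming the partial index above filters on rating >= 4 (mongosh fragments; require a running MongoDB):

```javascript
// Can use the partial index: the predicate is a subset of rating >= 4.
db.collection.find({ "movie": "Jaws", "rating": { "$gte": 4 } })

// Cannot use it: rating >= 2 includes documents the index doesn't cover,
// so MongoDB picks another plan to avoid an incomplete result set.
db.collection.find({ "movie": "Jaws", "rating": { "$gte": 2 } })
```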
20. Why use indexes?
• Efficiently retrieving document matches
• Equality matching
• Inequality or range matching
• Sorting
• Lack of a usable index will cause MongoDB to
scan the entire collection.
22. Before creating indexes
• Think about the queries you will be running and try to
create as few indexes as possible to support those
queries. Similar query patterns could use the same
(or very similar) indexes.
• Think about the data that you will query and put your
highly selective fields first in the index if possible.
• Check your current indexes before creating new
ones. MongoDB will allow you to create indexes with
the same fields in different orders.
23. Simple indexes
• When creating a simple index, the sort order,
ascending (1) or descending (-1), of the values
doesn’t matter much, since MongoDB can walk
the index forwards and backwards.
• Simple index creation:
db.flights.createIndex({"Origin": 1})
24. Compound indexes
• When creating a compound index, the sort order, ascending (1) or
descending (-1), of the values starts to matter, especially if the index is used
to sort on multiple keys.
• When creating compound indexes you want to add keys to the index in the
following key order:
• Equality matches
• Sort fields
• Inequality matches
• A compound index will also help any queries based on the
leftmost prefix of its keys.
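A sketch of that key order using the flight data from earlier slides (DepDelay is an assumed field name; requires a running MongoDB): equality field first, then the sort field, then the range field:

```javascript
// Equality (Origin), then sort (ArrDelay), then range (DepDelay).
db.flights.createIndex({ "Origin": 1, "ArrDelay": 1, "DepDelay": 1 })

// This query can use the index for the match, the sort, and the range.
db.flights.find({ "Origin": "DEN", "DepDelay": { "$gt": 30 } })
          .sort({ "ArrDelay": 1 })
```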
26. Compound indexes
• An index created as follows:
db.flights.createIndex({"Origin": 1, "Dest": -1})
Could be used with either of the following queries as well
since MongoDB can walk the index either way:
db.flights.find().sort({"Origin": 1, "Dest": -1})
db.flights.find().sort({"Origin": -1, "Dest": 1})
27. Full-text indexes
• Full-text index creation:
db.messages.createIndex({"body": "text"})
• To search using the index, finding any of the words:
db.messages.find({"$text": {"$search": "some text"}})
• To search using the index, finding a phrase:
db.messages.find({"$text": {"$search": "\"some text\""}})
28. Covering indexes
• Covering indexes are indexes that will answer a
query without going back to the data. For example:
db.flights.createIndex({"Origin": 1, "Dest": 1, "ArrDelay": 1, "UniqueCarrier": 1})
• The following query would be covered as all fields
are in the index:
db.flights.find({"Origin": "DEN", "Dest": "JFK"},
{"UniqueCarrier": 1, "ArrDelay": 1, "_id": 0}).sort({"ArrDelay": -1})
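You can confirm a query is covered with explain (a mongosh sketch; exact output fields vary by server version): a covered plan fetches no documents:

```javascript
var plan = db.flights.find(
  { "Origin": "DEN", "Dest": "JFK" },
  { "UniqueCarrier": 1, "ArrDelay": 1, "_id": 0 }
).sort({ "ArrDelay": -1 }).explain("executionStats")

// For a covered query, expect totalDocsExamined to be 0:
// the answer comes entirely from the index.
plan.executionStats.totalDocsExamined
```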
29. Indexing nested
fields/documents
• Let’s say you have documents with nested documents in them like the
following:
db.locations.findOne()
{
  "_id": ObjectId(…),
  …,
  "location": {
    "state": "Colorado",
    "city": "Lyons"
  }
}
31. Indexing nested
fields/documents
• You can also index embedded documents
db.locations.createIndex({"location": 1})
• If you do this the query must match the document exactly
(keys in the same order). That means that this will return the
document:
db.locations.find({"location": {"state": "Colorado", "city": "Lyons"}})
• But this won’t:
db.locations.find({"location": {"city": "Lyons", "state": "Colorado"}})
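That order sensitivity is usually avoided by indexing the nested fields directly with dot notation (a mongosh sketch using the locations example above):

```javascript
// Index individual nested fields rather than the whole subdocument.
db.locations.createIndex({ "location.state": 1, "location.city": 1 })

// Dot-notation queries match regardless of key order inside the subdocument.
db.locations.find({ "location.state": "Colorado", "location.city": "Lyons" })
```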
32. Index Intersection
• Index intersection is when MongoDB uses two or more
indexes to satisfy a query.
• Given the following two indexes:
db.orders.createIndex({"qty": 1})
db.orders.createIndex({"item": 1})
• Index intersection means a query such as the following
could use both indexes in parallel with the results being
merged together to satisfy the query:
db.orders.find({"item": "ABC123", "qty": {"$gte": 15}})
33. Indexing arrays
• You can index fields that contain arrays as well.
• Compound indexes however can only have a single field that is an array in a given document. If
a document has two indexed fields that are arrays, you will get an error.
db.arrtest.createIndex({"a": 1, "b": 1})
db.arrtest.insert({"b": [1,2,3], "a": [1,2,3]})
cannot index parallel arrays [b] [a]
WriteResult({
"nInserted": 0,
"writeError": {
"code": 10088,
"errmsg": "cannot index parallel arrays [b] [a]"
}
})
34. Index Intersection
• Index intersection is when MongoDB uses two or more
indexes to satisfy a query.
• Given the following two indexes:
db.orders.createIndex({"qty": 1})
db.orders.createIndex({"item": 1})
• Index intersection means a query such as the following
could in theory use both indexes in parallel with the results
being merged together to satisfy the query:
db.orders.find({"item": "ABC123", "qty": {"$gte": 15}})
35. Removing indexes
• The command to remove indexes is similar to the
one to create the index.
db.flights.dropIndex({"Origin": 1, "Dest": -1})
37. View all indexes in a
database
• To view all indexes in a database use the
following command:
db.system.indexes.find()
• For each index you’ll see the fields the index was
created with, the name of the index and the
namespace (db.collection) that the index was
built on.
38. View indexes for a given
collection
• To view all indexes for a given collection use the
following command:
db.collection.getIndexes()
• This returns the same information as the
previous command, but is limited to the given
collection.
39. View index sizes
• To view the size of all indexes in a collection:
db.collection.stats()
• You will see the size of all indexes and the size
of each individual index in the results. The sizes
are in bytes.
40. How to see if an index is
used
• If you want to see if an index is used, append the
.explain() operator to your query
db.flights.find({"Origin": "DEN"}).explain()
• The explain operator has three levels of verbosity:
• queryPlanner - this is the default, and it returns the winning query plan
• executionStats - adds execution stats for the plan
• allPlansExecution - adds stats for the other candidate plans
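A sketch of reading the output (field names follow the standard explain format; values vary by data and server version; requires a running MongoDB):

```javascript
var result = db.flights.find({ "Origin": "DEN" }).explain("executionStats")

// IXSCAN in the winning plan means an index was used;
// COLLSCAN means the whole collection was scanned.
result.queryPlanner.winningPlan

// With executionStats, nReturned close to totalDocsExamined
// indicates a selective, efficient index.
result.executionStats.nReturned
result.executionStats.totalDocsExamined
```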
41. Notes on indexes.
• When creating an index you need to know your
data and the queries that will run against it.
• Don’t build indexes in isolation!
• While indexes can improve performance, be
careful to not over index as every index gets
updated every time you write to the collection.
43. End Notes
• User group discounts
• Manning publications: www.manning.com
• Code ‘ug367’ to save 36% off order
• APress publications: www.appress.com
• Code ‘UserGroup’ to save 10% off order
• O’Reilly publication: www.oreilly.com
• Still waiting to get information
45. End Notes
• MongoDB World
• When: June 28th and 29th
• Where: NYC
• Save 25% by using code ‘DDuncan’
Editor's Notes
The indexes do not have to store the field names as all fields are the same for each entry. After the values you will have a pointer back to the data portion of the file.
_id index is a primary key. Default value is a 12-byte ObjectId that has as its first 4 bytes a timestamp of when the document was entered into the collection, then a 3-byte machine id, a 2-byte process id, and a 3-byte counter starting from a random value. You can, however, override this as long as the values you enter are unique. Automatically created and cannot be removed.
Simple index is similar to a book where you look up a word and find page numbers.
Compound index is similar to a phone book where you can find the phone number of a person if you know their first and last names.
Multikey indexes are indexes over columns that have an array. There will be an entry for each item in the array and there can only be a single array column indexed in a given index.
Full-text indexes are similar to what Google does when searching for words on a web site.
Geo-spatial indexes allow you to determine map proximity similar to Google maps find restaurants around this location. Can use both 2d for planar geometry and 2dsphere for spherical geometries.
Hashed indexes are used in hash-based sharding, which allows for a more random distribution. Can only do equality searches against this type of index, unless you also index the field in its non-hashed form (a separate {"field": 1} index alongside {"field": "hashed"}). Cannot be a compound or unique index.
Unique indexes can only store a single document per value for the key being indexed (or per combination of values for a compound index). The collection can contain only a single document that has null for the indexed field or that is missing the field entirely; it cannot have one of each. db.coll.ensureIndex({"a": 1}, {"unique": true}).
Sparse indexes only index the documents in which the field actually exists. Missing fields are not indexed, but fields whose value is null are. db.coll.ensureIndex({"a": 1}, {"sparse": true}). Use db.coll.find().hint({"a": 1}) to see what the index contains. In 2.6 and earlier, this could result in queries returning incorrect data.
TTL indexes automatically remove documents from a collection after a given time. The indexed field needs to be a Date object. db.coll.ensureIndex({"field1": 1}, {"expireAfterSeconds": 300}). If you put any other value in the indexed field, the document will never expire.
Partial indexes allow you to add a filter to the index so only matching documents are indexed. This allows a smaller storage footprint than a regular index over the same field. These should be preferred over sparse indexes. db.coll.createIndex({"a": 1}, {"partialFilterExpression": {"a": {"$gt": 5}}}). Mongo will not use this index if it would return an incomplete result set. The query must contain the filter expression or a modified version that returns a smaller subset of the documents covered by the index.
I’ve never had a case where index intersection worked, at least not when running an explain() on the query.