Indexes are data structures that store a subset of data to allow for efficient retrieval. MongoDB stores indexes using a b-tree format. There are several types of indexes including simple, compound, multikey, full-text, and geospatial indexes. Indexes improve performance by enabling efficient retrieval, sorting, and filtering of documents. Indexes are created using the createIndex command and their usage can be checked using explain plans.
As your data grows, the need to establish proper indexes becomes critical to performance. MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application?
In this talk we’ll cover how indexing works, the various indexing options, and use cases where each can be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale.
MongoDB .local Toronto 2019: Tips and Tricks for Effective Indexing (MongoDB)
Query performance can either be a constant headache or the unsung hero of an application. MongoDB provides extremely powerful querying capabilities when used properly. I will share the most common mistakes observed and some tips and tricks for avoiding them.
Presented by Tom Schreiber, Senior Consulting Engineer, MongoDB
MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application? In this talk we’ll cover how indexing works, the various indexing options, and cover use cases where each might be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale. We'll show you the tools and techniques for diagnosing and tuning the performance of your MongoDB deployment. Whether you're running into problems or just want to optimize your performance, these skills will be useful.
MongoDB World 2019: The Sights (and Smells) of a Bad Query (MongoDB)
“Why is MongoDB so slow?” you may ask yourself on occasion. You’ve created indexes, you’ve learned how to use the aggregation pipeline. What the heck? Could it be your queries? This talk will outline what tools are at your disposal (both in MongoDB Atlas and in MongoDB server) to identify inefficient queries.
Video available here: http://vivu.tv/portal/archive.jsp?flow=783-586-4282&id=1270584002677
We all know that MongoDB is one of the most flexible and feature-rich databases available. In this webinar we'll discuss how you can leverage this feature set and maintain high performance with your project's massive data sets and high loads. We'll cover how indexes can be designed to optimize the performance of MongoDB. We'll also discuss tips for diagnosing and fixing performance issues should they arise.
MongoDB World 2019: Tips and Tricks++ for Querying and Indexing MongoDB (MongoDB)
Query performance can either be a constant headache or the unsung hero of an application. MongoDB provides extremely powerful querying capabilities when used properly. As a senior member of the support team, I will share the most common mistakes observed and some tips and tricks for avoiding them.
In this presentation, we are going to discuss how Elasticsearch handles various operations like insert, update, and delete. We will also cover what an inverted index is and how segment merging works.
Having the right indexes in place is crucial to performance in MongoDB. In this talk, we’ll explain how indexes work and the various indexing options. We’ll talk about the tools available to optimize your queries and avoid common pitfalls. Throughout the session, we’ll reference real-world examples to demonstrate the importance of proper indexing.
MongoDB .local Toronto 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pi... (MongoDB)
Aggregation pipeline has been able to power your analysis of data since version 2.2. In 4.2 we added more power and now you can use it for more powerful queries, updates, and outputting your data to existing collections. Come hear how you can do everything with the pipeline, including single-view, ETL, data roll-ups and materialized views.
Find out which is faster, SQL or NoSQL, for traditional reporting tasks. Discover how you can optimise MongoDB aggregation pipelines and how to push complex computation down to the database.
Data Definition Language (DDL), Data Manipulation Language (DML), Transaction Control Language (TCL), Data Control Language (DCL), SQL Constraints
This presentation was presented at Percona Live UK.
Although a DBMS hides the internal mechanics of indexing, you need to know how indexes work to be able to create efficient ones. This talk will help you understand the data structure used to store indexes and how it applies to InnoDB. By the end of the talk you will know how to use cost analysis to choose correct index definitions and how to create indexes that work efficiently with InnoDB.
Talk given for the #phpbenelux user group, March 27th in Gent (BE), with the goal of convincing developers that are used to build php/mysql apps to broaden their horizon when adding search to their site. Be sure to also have a look at the notes for the slides; they explain some of the screenshots, etc.
An accompanying blog post about this subject can be found at http://www.jurriaanpersyn.com/archives/2013/11/18/introduction-to-elasticsearch/
This presentation deals with the advanced features of SQL, comprising Arithmetic Calculations, Analytical Functions, PIVOT, etc. Presented by Alphalogic Inc: https://www.alphalogicinc.com/
An index is a database object that can be created on one or more columns (up to a 16-column combination). When the index is created, it reads the column(s) and forms a data structure that minimizes the number of data comparisons. An index improves the performance of data retrieval but adds overhead to data modifications such as inserts, deletes, and updates, so the trade-off depends on how much data retrieval the table serves versus how many DML (Insert, Delete and Update) operations it receives.
I inherited a MongoDB database server with 60 collections and 100 or so indexes.
The business users are complaining about slow report completion times. What can I do to improve performance?
MongoDB Days UK: Indexing and Performance Tuning (MongoDB)
Presented by Tom Schreiber, Senior Consulting Engineer, MongoDB
Experience level: Beginner
MongoDB supports a wide range of indexing options to enable fast querying of your data, but what are the right strategies for your application? In this talk we’ll cover how indexing works, the various indexing options, and cover use cases where each might be useful. We'll dive into common pitfalls using real-world examples to ensure that you're ready for scale. We'll show you the tools and techniques for diagnosing and tuning the performance of your MongoDB deployment. Whether you're running into problems or just want to optimize your performance, these skills will be useful.
Webinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based Sharding (MongoDB)
In version 2.4, MongoDB introduces hash-based sharding, allowing the user to shard based on a randomized shard key to spread documents evenly across a cluster. Hash-based sharding is an alternative to range-based sharding, making it easier to manage your growing cluster. In this talk, we'll provide an overview of this new feature and discuss the pros and cons of a hash-based vs. range-based approach.
Learn about the various approaches to sharding your data with MongoDB. This presentation will help you answer questions such as when to shard and how to choose a shard key.
Optimizing MongoDB: Lessons Learned at Localytics (andrew311)
Tips, tricks, and gotchas learned at Localytics for optimizing MongoDB installs. Includes information about document design, indexes, fragmentation, migration, AWS EC2/EBS, and more.
Query analyzing
Introduction to indexes
Indexes in MongoDB
Managing indexes in MongoDB
Using indexes to sort query results
When should I use indexes?
When should we avoid using indexes?
New Indexing and Aggregation Pipeline Capabilities in MongoDB 4.2 (Antonios Giannopoulos)
MongoDB 4.2 goes GA soon, delivering some amazing new features in multiple areas. In this talk, we will focus on the new capabilities of the aggregation framework. We are going to cover the new operators and expressions. At the same time, we will explore how update commands can now use aggregation framework operators. We are also going to present aggregation framework improvements, focusing on on-demand materialized views. Finally, we are going to explore the wildcard indexes introduced in MongoDB 4.2 and how they change the way we design documents and build queries/aggregations. We will also make a reference to the new index build system.
4. What are indexes?
Indexes are special data-structures that store a subset of
your data in an easily traversable format.
MongoDB stores indexes in a b-tree format which allows for
efficient access to the index content.
Proper index use makes a system run
optimally. Improper index use can bring a system to a
grinding halt.
5. What are indexes?
Indexes are stored in a format similar to the
following if there were an index on Origin:
[ABE] -> 0xa193b48c
[ABE] -> 0x8e8b242a
[ABE] -> 0x0928cdc1
…
[DEN] -> 0x24aa4ecd
[DEN] -> 0x87396a3c
[DEN] -> 0x9392ab2f
…
[LAX] -> 0x89ccede0
…
7. The _id index
• The _id index is automatically created and cannot be removed.
• This is the same as a primary key in traditional RDBMS.
• Default value is a 12-byte ObjectId:
• 4-byte time stamp
• 3-byte machine id
• 2-byte process id
• 3-byte counter
8. Simple index
• A simple index is an index on a single key
• This is similar to a book’s index where you look
up a word to find the pages it’s referenced on.
9. Compound index
• A compound index is created over two or more
fields in a document
• This is similar to a phone book where you can
find the phone number of a person given their
first and last names.
10. Multikey index
• A multikey index is an index that’s created on a
field that contains an array.
• If used in a compound index, only a single field
in a given document can be an array.
• You will get one entry in the index for every item
in the array for the given document. This means
if you have an array with 100 items, that
document will have 100 index entries.
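The one-entry-per-array-element behavior can be sketched in plain JavaScript (a simulation of the idea, not MongoDB internals; the field and document values are made up):

```javascript
// Simulate how a multikey index expands an array field into index entries.
// Each entry pairs an indexed value with a pointer back to its document.
function multikeyEntries(docs, field) {
  return docs.flatMap(function (doc) {
    var value = doc[field];
    var values = Array.isArray(value) ? value : [value];
    return values.map(function (v) {
      return { key: v, docId: doc._id };
    });
  });
}

var docs = [
  { _id: 1, tags: ["red", "green", "blue"] }, // array field -> 3 entries
  { _id: 2, tags: "plain" }                   // scalar field -> 1 entry
];

var entries = multikeyEntries(docs, "tags");
// A 100-element array would likewise produce 100 entries for that document.
console.log(entries.length); // 4
```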
11. Full-text index
• This is an index over a text based field, similar to
how Google indexes web pages.
12. Geo-spatial index
• A geo-spatial index will allow you to determine
distance from a given point.
• Works on both planar and spherical geometries.
13. Hashed indexes
• A hashed index is used in hash based sharding,
and allows for a more randomized distribution.
• Hashed indexes cannot contain compound keys
or be unique.
• The same key can be indexed in both a
hashed and a non-hashed index. The non-
hashed version will allow for range based
queries.
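A mongosh sketch of that pattern, assuming a `users` collection with a `userId` field (both names are made up; requires a running MongoDB):

```javascript
// Hashed index: usable as a hashed shard key for even distribution.
db.users.createIndex({ "userId": "hashed" })

// Separate ascending index on the same field: supports range queries,
// which the hashed index cannot.
db.users.createIndex({ "userId": 1 })

// Equality can use the hashed index; a range like this needs the ascending one.
db.users.find({ "userId": { "$gte": 1000, "$lt": 2000 } })
```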
15. Unique
• The unique property allows for only a single
value for the indexed field, or combination of
fields for a compound index
db.collection.createIndex({"email": 1}, {"unique": true})
• A unique index can only have a single null or
missing field value for all documents in the
collection.
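For example (a mongosh sketch against a made-up `people` collection; requires a running MongoDB), a second document missing the indexed field violates uniqueness, since a missing field indexes as null:

```javascript
db.people.createIndex({ "email": 1 }, { "unique": true })

db.people.insert({ "name": "A" })  // indexed email value is null - OK
db.people.insert({ "name": "B" })  // second null value - duplicate key error
```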
16. Sparse
• The sparse property allows you to index only
documents that contain a value for the given
field.
db.collection.createIndex({"kids": 1}, {"sparse": true})
• A sparse index will not be used if it would result in
an incomplete result set, unless specifically
hinted.
db.collection.find({"kids": {"$gte": 5}})
17. TTL
• The TTL property allows for the automatic removal of
documents after a given time period.
db.collection.createIndex({"accessTime": 1}, {"expireAfterSeconds": 1200})
• The indexed field should contain an ISODate() value. If
any other type is used the document will not be removed.
• The TTL removal process runs once every 60 seconds so
you might see the document even though the time has
expired.
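A working sketch of the above (the collection name is made up; note that `expireAfterSeconds` must be a number, and the indexed field a date; requires a running MongoDB):

```javascript
// Documents become eligible for removal 1200 seconds after accessTime.
db.sessions.createIndex({ "accessTime": 1 }, { "expireAfterSeconds": 1200 })

// Removed roughly 20 minutes from now (the TTL monitor runs every
// 60 seconds, so removal is not immediate).
db.sessions.insert({ "user": "alice", "accessTime": new Date() })

// A non-date value in the indexed field means the document never expires.
db.sessions.insert({ "user": "bob", "accessTime": "not a date" })
```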
18. Partial
• The partial property allows you to index a subset
of your data.
db.collection.createIndex({"movie": 1, "reviews": 1},
{"partialFilterExpression": {"rating": {"$gte": 4}}})
• The index will not be used if it would provide an
incomplete result set (similar to the sparse
index).
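A sketch of that rule, assuming the partial index above filters on rating >= 4 (mongosh fragments; require a running MongoDB):

```javascript
// Can use the partial index: the predicate is a subset of rating >= 4.
db.collection.find({ "movie": "Jaws", "rating": { "$gte": 4 } })

// Cannot use it: rating >= 2 includes documents the index doesn't cover,
// so MongoDB picks another plan to avoid an incomplete result set.
db.collection.find({ "movie": "Jaws", "rating": { "$gte": 2 } })
```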
20. Why use indexes?
• Efficiently retrieving document matches
• Equality matching
• Inequality or range matching
• Sorting
• Lack of a usable index will cause MongoDB to
scan the entire collection.
22. Before creating indexes
• Think about the queries you will be running and try to
create as few indexes as possible to support those
queries. Similar query patterns could use the same
(or very similar) indexes.
• Think about the data that you will query and put your
highly selective fields first in the index if possible.
• Check your current indexes before creating new
ones. MongoDB will allow you to create indexes with
the same fields in different orders.
23. Simple indexes
• When creating a simple index, the sort order,
ascending (1) or descending (-1), of the values
doesn’t matter much, since MongoDB can walk
the index forwards and backwards.
• Simple index creation:
db.flights.createIndex({"Origin": 1})
24. Compound indexes
• When creating a compound index, the sort order, ascending (1) or
descending (-1), of the values starts to matter, especially if the index is used
to sort on multiple keys.
• When creating compound indexes you want to add keys to the index in the
following key order:
• Equality matches
• Sort fields
• Inequality matches
• A compound index will also help any queries based on the
leftmost prefix of its keys.
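A sketch of that key order using the flight data from earlier slides (DepDelay is an assumed field name; requires a running MongoDB): equality field first, then the sort field, then the range field:

```javascript
// Equality (Origin), then sort (ArrDelay), then range (DepDelay).
db.flights.createIndex({ "Origin": 1, "ArrDelay": 1, "DepDelay": 1 })

// This query can use the index for the match, the sort, and the range.
db.flights.find({ "Origin": "DEN", "DepDelay": { "$gt": 30 } })
          .sort({ "ArrDelay": 1 })
```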
26. Compound indexes
• An index created as follows:
db.flights.createIndex({"Origin": 1, "Dest": -1})
Could be used with either of the following queries as well
since MongoDB can walk the index either way:
db.flights.find().sort({"Origin": 1, "Dest": -1})
db.flights.find().sort({"Origin": -1, "Dest": 1})
27. Full-text indexes
• Full-text index creation:
db.messages.createIndex({"body": "text"})
• To search using the index, finding any of the words:
db.messages.find({"$text": {"$search": "some text"}})
• To search using the index, finding a phrase:
db.messages.find({"$text": {"$search": "\"some text\""}})
28. Covering indexes
• Covering indexes are indexes that will answer a
query without going back to the data. For example:
db.flights.createIndex({"Origin": 1, "Dest": 1, "ArrDelay": 1, "UniqueCarrier": 1})
• The following query would be covered as all fields
are in the index:
db.flights.find({"Origin": "DEN", "Dest": "JFK"},
{"UniqueCarrier": 1, "ArrDelay": 1, "_id": 0}).sort({"ArrDelay": -1})
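You can confirm a query is covered with explain (a mongosh sketch; exact output fields vary by server version): a covered plan fetches no documents:

```javascript
var plan = db.flights.find(
  { "Origin": "DEN", "Dest": "JFK" },
  { "UniqueCarrier": 1, "ArrDelay": 1, "_id": 0 }
).sort({ "ArrDelay": -1 }).explain("executionStats")

// For a covered query, expect totalDocsExamined to be 0:
// the answer comes entirely from the index.
plan.executionStats.totalDocsExamined
```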
29. Indexing nested
fields/documents
• Let’s say you have documents with nested documents in them like the
following:
db.locations.findOne()
{
  "_id": ObjectId(…),
  …,
  "location": {
    "state": "Colorado",
    "city": "Lyons"
  }
}
31. Indexing nested
fields/documents
• You can also index embedded documents
db.locations.createIndex({"location": 1})
• If you do this the query must match the document exactly
(keys in the same order). That means that this will return the
document:
db.locations.find({"location": {"state": "Colorado", "city": "Lyons"}})
• But this won’t:
db.locations.find({"location": {"city": "Lyons", "state": "Colorado"}})
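That order sensitivity is usually avoided by indexing the nested fields directly with dot notation (a mongosh sketch using the locations example above):

```javascript
// Index individual nested fields rather than the whole subdocument.
db.locations.createIndex({ "location.state": 1, "location.city": 1 })

// Dot-notation queries match regardless of key order inside the subdocument.
db.locations.find({ "location.state": "Colorado", "location.city": "Lyons" })
```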
32. Index Intersection
• Index intersection is when MongoDB uses two or more
indexes to satisfy a query.
• Given the following two indexes:
db.orders.createIndex({"qty": 1})
db.orders.createIndex({"item": 1})
• Index intersection means a query such as the following
could use both indexes in parallel with the results being
merged together to satisfy the query:
db.orders.find({"item": "ABC123", "qty": {"$gte": 15}})
33. Indexing arrays
• You can index fields that contain arrays as well.
• Compound indexes however can only have a single field that is an array in a given document. If
a document has two indexed fields that are arrays, you will get an error.
db.arrtest.createIndex({"a": 1, "b": 1})
db.arrtest.insert({"b": [1,2,3], "a": [1,2,3]})
cannot index parallel arrays [b] [a]
WriteResult({
"nInserted": 0,
"writeError": {
"code": 10088,
"errmsg": "cannot index parallel arrays [b] [a]"
}
})
34. Index Intersection
• Index intersection is when MongoDB uses two or more
indexes to satisfy a query.
• Given the following two indexes:
db.orders.createIndex({"qty": 1})
db.orders.createIndex({"item": 1})
• Index intersection means a query such as the following
could in theory use both indexes in parallel with the results
being merged together to satisfy the query:
db.orders.find({"item": "ABC123", "qty": {"$gte": 15}})
35. Removing indexes
• The command to remove indexes is similar to the
one to create the index.
db.flights.dropIndex({"Origin": 1, "Dest": -1})
37. View all indexes in a
database
• To view all indexes in a database use the
following command:
db.system.indexes.find()
• For each index you’ll see the fields the index was
created with, the name of the index and the
namespace (db.collection) that the index was
built on.
38. View indexes for a given
collection
• To view all indexes for a given collection use the
following command:
db.collection.getIndexes()
• This returns the same information as the
previous command, but is limited to the given
collection.
39. View index sizes
• To view the size of all indexes in a collection:
db.collection.stats()
• You will see the size of all indexes and the size
of each individual index in the results. The sizes
are in bytes.
40. How to see if an index is
used
• If you want to see if an index is used, append the
.explain() operator to your query
db.flights.find({"Origin": "DEN"}).explain()
• The explain operator has three levels of verbosity:
• queryPlanner - this is the default, and it returns the winning query plan
• executionStats - adds execution stats for the plan
• allPlansExecution - adds stats for the other candidate plans
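A sketch of reading the output (field names follow the standard explain format; values vary by data and server version; requires a running MongoDB):

```javascript
var result = db.flights.find({ "Origin": "DEN" }).explain("executionStats")

// IXSCAN in the winning plan means an index was used;
// COLLSCAN means the whole collection was scanned.
result.queryPlanner.winningPlan

// With executionStats, nReturned close to totalDocsExamined
// indicates a selective, efficient index.
result.executionStats.nReturned
result.executionStats.totalDocsExamined
```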
41. Notes on indexes.
• When creating an index you need to know your
data and the queries that will run against it.
• Don’t build indexes in isolation!
• While indexes can improve performance, be
careful to not over index as every index gets
updated every time you write to the collection.
43. End Notes
• User group discounts
• Manning publications: www.manning.com
• Code ‘ug367’ to save 36% off order
• APress publications: www.appress.com
• Code ‘UserGroup’ to save 10% off order
• O’Reilly publication: www.oreilly.com
• Still waiting to get information
45. End Notes
• MongoDB World
• When: June 28th and 29th
• Where: NYC
• Save 25% by using code ‘DDuncan’
Editor's Notes
The indexes do not have to store the field names as all fields are the same for each entry. After the values you will have a pointer back to the data portion of the file.
_id index is a primary key. Default value is a 12-byte ObjectId that has as its first 4 bytes a timestamp of when the document was entered into the collection, then a 3-byte machine id, a 2-byte process id, and a 3-byte counter starting from a random value. You can, however, override this as long as the values you enter are unique. Automatically created and cannot be removed.
Simple index is similar to a book where you look up a word and find page numbers.
Compound index is similar to a phone book where you can find the phone number of a person if you know their first and last names.
Multikey indexes are indexes over columns that have an array. There will be an entry for each item in the array and there can only be a single array column indexed in a given index.
Full-text indexes are similar to what Google does when searching for words on a web site.
Geo-spatial indexes allow you to determine map proximity similar to Google maps find restaurants around this location. Can use both 2d for planar geometry and 2dsphere for spherical geometries.
Hashed indexes are used in hash-based sharding, which allows for a more random distribution. Can only do equality searches against this type of index, unless you also index the field in its non-hashed form (a separate {"field": 1} index alongside {"field": "hashed"}). Cannot be a compound or unique index.
Unique indexes can only store a single document per value for the key being indexed (or per combination of values for a compound index). The collection can contain only a single document that has null for the indexed field or that is missing the field entirely; it cannot have one of each. db.coll.ensureIndex({"a": 1}, {"unique": true}).
Sparse indexes only index the documents in which the field actually exists. Missing fields are not indexed, but fields whose value is null are. db.coll.ensureIndex({"a": 1}, {"sparse": true}). Use db.coll.find().hint({"a": 1}) to see what the index contains. In 2.6 and earlier, this could result in queries returning incorrect data.
TTL indexes automatically remove documents from a collection after a given time. The indexed field needs to be a Date object. db.coll.ensureIndex({"field1": 1}, {"expireAfterSeconds": 300}). If you put any other value in the indexed field, the document will never expire.
Partial indexes allow you to add a filter to the index so only matching documents are indexed. This allows a smaller storage footprint than a regular index over the same field. These should be preferred over sparse indexes. db.coll.createIndex({"a": 1}, {"partialFilterExpression": {"a": {"$gt": 5}}}). Mongo will not use this index if it would return an incomplete result set. The query must contain the filter expression or a modified version that returns a smaller subset of the documents covered by the index.
I’ve never had a case where index intersection worked, at least not when running an explain() on the query.