3. 2
Agenda
• MongoDB index basics
• Indexing tips and tricks
• Dex automation
• Dex details and demo
• Extras
4. 3
Some notable MongoDB fundamentals
• Good performance starts with indexes
– you create them; they don’t just happen
• Each query uses at most one index
– so index accordingly
• The query optimizer is empirical
– every so often (~1k writes) MongoDB runs a race between
query plans. The first query plan to complete wins.
– query plans are also re-run after certain changes to a
collection (such as adding an index).
5. 4
Proper indexing is critical
• Indexes can improve query performance by 2 to 3
orders of magnitude
– 1000ms query down to <1ms!
• Bad queries don’t just get in their own way, they get
in the way of other things too:
– write lock, queued operations, page faults
• Bad indexing → Memory Apocalypse
– without warning, large portions of your working data topple
out of memory and must be page-faulted back
7. 6
explain() will reveal a scanAndOrder
• scanAndOrder is almost always bad!
• If MongoDB is re-ordering to satisfy a sort
clause, explain() includes: { scanAndOrder: true }
• MongoDB sorts documents in-memory! (very
expensive!)
– without an index, large result sets are rejected with an error
9. 8
An index is a b-tree that maps a sequence of
key values to a list of document pointers
{ "name": 1, "class": 1 }
*
name ->   “Ben”                 “Eric”
class ->  “Fighter” “Noble”     “Engineer” “Wizard”
the order of the keys really matters!
10. 9
Index key order determines how the b-tree
is constructed
This ordering of keys influences how:
• applicable an index is to a given query
– a query that doesn't include the first field(s) in the index
cannot use the index
• quickly the scope of possible results is pruned
– here is where your data's cardinality weighs in
• documents are sorted in result sets
– did I mention scanAndOrder was bad?
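The prefix rule above can be sketched in a few lines of plain JavaScript. This is a simplified illustration, not MongoDB's planner: `usablePrefixLength` is a hypothetical helper that counts how many leading index fields a query constrains, and the sample queries are made up.

```javascript
// Hypothetical helper, not a MongoDB API: counts how many leading fields of
// the index the query constrains. Zero means the query skips the first field,
// so (per the rule above) the index cannot be used.
function usablePrefixLength(indexSpec, query) {
  let n = 0;
  for (const field of Object.keys(indexSpec)) {
    if (!(field in query)) break; // stop at the first index field the query omits
    n += 1;
  }
  return n;
}

const index = { name: 1, class: 1 };
console.log(usablePrefixLength(index, { name: "Ben" }));                 // 1
console.log(usablePrefixLength(index, { name: "Ben", class: "Noble" })); // 2
console.log(usablePrefixLength(index, { class: "Noble" }));              // 0
```

The sketch relies on JavaScript preserving insertion order for ordinary string keys, which is also why the order written in an index spec is meaningful.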
11. 10
Ordering is tricky and especially important
with range operators
The order of fields in an index should be the:
① fields on which you will query for exact values
② fields on which you will sort
③ fields on which you will query for a range of values
($in, $gt, $lt, etc.)
Article explaining this topic in detail:
bit.ly/mongoindex
12. 11
Put the range field value last in your index
diagram at bit.ly/mongoindex
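To see why the range field goes last, here is a toy model in plain JavaScript (no MongoDB involved). A sorted array stands in for the b-tree, the documents are made up, and the query is assumed to be: name equals "Ben", age below 30, sorted by height.

```javascript
// Toy model only: a sorted array stands in for the b-tree, and the
// documents below are invented for illustration.
const docs = [
  { name: "Ben", age: 25, height: 180 },
  { name: "Ben", age: 41, height: 165 },
  { name: "Ben", age: 19, height: 172 },
  { name: "Eric", age: 22, height: 190 },
  { name: "Ben", age: 28, height: 160 },
];

// Build the "index": documents ordered by the given fields, left to right.
function buildIndex(fields) {
  return [...docs].sort((a, b) => {
    for (const f of fields) {
      if (a[f] < b[f]) return -1;
      if (a[f] > b[f]) return 1;
    }
    return 0;
  });
}

// Walk the index in order and keep the matches. A real b-tree would jump
// straight to the "Ben" entries; the toy simply filters the whole array.
function scan(index) {
  return index.filter((d) => d.name === "Ben" && d.age < 30);
}

// True when the heights come out already in ascending order.
function isSortedByHeight(rows) {
  return rows.every((d, i) => i === 0 || rows[i - 1].height <= d.height);
}

const rangeLast = scan(buildIndex(["name", "height", "age"]));
const rangeBeforeSort = scan(buildIndex(["name", "age", "height"]));

// Heights 160, 172, 180: already in sort order, no extra sort needed.
console.log(rangeLast.map((d) => d.height), isSortedByHeight(rangeLast));
// Heights 172, 180, 160: index order is by age, so a re-sort is required.
console.log(rangeBeforeSort.map((d) => d.height), isSortedByHeight(rangeBeforeSort));
```

With the range field ahead of the sort field, the matches come back in age order and have to be re-sorted in memory, which is the situation scanAndOrder reports.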
17. 16
Be warned if you...
• Use a variety of query patterns
• Give the app user control over queries
• Use MongoDB like a relational database
• Have many indexes in each collection
18. 17
Don’t die the death of a thousand cuts
• The most expensive queries are not always the
slowest queries.
– 50 queries * 20 ms == 1 s
That’s 1 second other queries can't use!
• Profile your queries and check the <100ms range for
a high volume of expensive but relatively fast
queries
• Remember... bad queries don't just get into their
own way!
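One way to spot the "thousand cuts" is to rank queries by total time rather than by single-run time. The sketch below assumes simplified, made-up profile entries: `millis` mirrors the profiler field of the same name, while `shape` is a hypothetical label for the query pattern that you would have to derive yourself.

```javascript
// Hypothetical, simplified profile entries; real system.profile documents
// carry more fields. `shape` is an invented label for the query pattern.
const entries = [
  ...Array.from({ length: 50 }, () => ({ shape: "find by class", millis: 20 })),
  { shape: "report by guild", millis: 400 },
];

// Sum the time spent per query shape and rank by total, not by single run.
function totalMillisByShape(profileEntries) {
  const totals = new Map();
  for (const { shape, millis } of profileEntries) {
    const t = totals.get(shape) || { shape, count: 0, totalMillis: 0 };
    t.count += 1;
    t.totalMillis += millis;
    totals.set(shape, t);
  }
  return [...totals.values()].sort((a, b) => b.totalMillis - a.totalMillis);
}

const ranked = totalMillisByShape(entries);
console.log(ranked);
```

Here the 20 ms query ranks first with 1000 ms in total, ahead of the single 400 ms query that a slowest-first listing would have put on top.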
19. 18
Identify the problematic queries
• Search the log file
– logs any query over 100ms
• Use the database profiler
① Turn it on
db.setProfilingLevel(1) logs slow queries
db.setProfilingLevel(2) logs all queries (helpful but noisy)
② Find the slow queries
db.system.profile.find().sort({millis: -1})
db.system.profile.find({ns: "mongoquest.adventurers"})
db.system.profile.find({op: {$in: ["query", "update", "command"]}})
③ Cleanup
db.setProfilingLevel(0)
db.system.profile.drop()
20. 19
Here’s a hint() if you have too many
indexes
• The query optimizer might choose a suboptimal
index
– It’s empirical, so it is vulnerable to poor conditions at query
time, especially in high-page-fault environments
• Hint your queries to the better index
– db.adventurers.find(…).hint({"myIndex": 1})
22. 21
How Dex Works
① Dex iterates over the input (log or profile collection)
② A LogParser or ProfileParser extracts queries from each line of input.
③ Dex passes the query to a QueryAnalyzer.
④ The QueryAnalyzer compares the query to existing indexes (from left to right)
⑤ If an index meeting Dex's criteria does not already exist, Dex suggests the best index for that query
23. 22
The Heart of Dex
Dex understands that order of fields in an index
should be:
① Equivalency checks {a:1}
② Sorts .sort({b: 1})
③ Range checks {c: {$in: [1, 2]}}
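The ordering rule can be sketched as a small function. This is an illustration of the rule on this slide, not Dex's actual source code: `suggestIndex` is a hypothetical name, and it assumes that any condition written with a `$` operator counts as a range check, as the slide does for `$in`.

```javascript
// Illustrative only: this is not Dex's implementation. It sorts fields into
// the three groups named on the slide and concatenates them in that order.
function suggestIndex(query, sort = {}) {
  const equality = [];
  const range = [];
  for (const [field, cond] of Object.entries(query)) {
    // Assumption: a sub-document with any $operator key is a range check.
    const isOperatorDoc =
      cond !== null &&
      typeof cond === "object" &&
      !Array.isArray(cond) &&
      Object.keys(cond).some((k) => k.startsWith("$"));
    (isOperatorDoc ? range : equality).push(field);
  }
  const index = {};
  for (const f of equality) index[f] = 1;                   // 1. equivalency checks
  for (const [f, dir] of Object.entries(sort)) {
    if (!(f in index)) index[f] = dir;                      // 2. sorts
  }
  for (const f of range) if (!(f in index)) index[f] = 1;   // 3. range checks
  return index;
}

console.log(suggestIndex({ c: { $in: [1, 2] }, a: 1 }, { b: 1 }));
// { a: 1, b: 1, c: 1 }
```

The order in which fields appear in the query does not matter; only their role does. A simplification worth noting: an explicit `$eq` would be filed under range checks here.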
24. 23
Using Dex is easy
Install using pip:
> sudo pip install dex
Usage: dex [<options>] uri
> dex -f my/mongod/data/path/mongodb.log
mongodb://myUser:myPass@myHost:12345/myDb
> dex -p mongodb://myUser:myPass@myHost:12345/myDb
27. 26
The namespace filter (-n)
> dex -f my/mongod/data/path/mongodb.log
-n "myFirstDb.collectionOne"
mongodb://myUser:myPass@myHost:12345/myFirstDb
> dex -f my/mongod/data/path/mongodb.log
-n "*.collectionOne"
mongodb://myUser:myPass@myHost:12345/admin
> dex -f my/mongod/data/path/mongodb.log
-n "myFirstDb.*" -n "mySecondDb.*"
mongodb://myUser:myPass@myHost:12345/admin
Note: authenticate against the admin db to run against more than one db!
28. 27
Watch mode (-w)
For when you want current results, not prior results.
> dex -w -f my/mongod/data/path/mongodb.log
mongodb://myUser:myPass@myHost:12345/myFirstDb
> dex -w -p -n "dbname.*"
mongodb://myUser:myPass@myHost:12345/admin
31. 30
Future plans for Dex
Dev/Testing now:
– Aggregation framework, geospatial queries, map/reduce
– min/max/average nscanned and nreturned
– scanAndOrder true/false
Soon:
• Renovation of internals
• Improved index recommendations
– set-wise optimization of index fields
• minimize the number of indexes required to cover all of your
queries
– order-wise optimization of index fields
• measure cardinality for key ordering
34. 33
Questions?
Thank you and good luck out there!
eric@mongolab.com
www.github.com/mongolab/dex
http://mongolab.org
http://blog.mongolab.com/2012/06/introducing-dex-the-index-bot/
http://blog.mongolab.com/2012/07/remote-dex/
http://blog.mongolab.com/2012/06/cardinal-ins/
http://blog.mongolab.com/2013/04/thinking-about-arrays-in-mongodb/
Editor's Notes
The query is: find all people named X whose age is less than N, sorted by height. Use a composite index with the field order (name, height, age).