Automated Slow Query Analysis: Dex the Index Robot


Published in: Technology
  • The query is: find all people named X whose age is less than N, sorted by height. Composite index with field order (name, height, age).
  • Thank you

    1. Eric Sedor, Index Automation and Dex, June 2013
    2. Agenda
       • MongoDB index basics
       • Indexing tips and tricks
       • Dex automation
       • Dex details and demo
       • Extras
    3. Some notable MongoDB fundamentals
       • Good performance starts with indexes
         – you create them; they don’t just happen
       • Each query uses at most one index
         – so index accordingly
       • The query optimizer is empirical
         – every so often (~1k writes) MongoDB runs a race between query plans. The first query plan to complete wins.
         – query plans are also re-run after certain changes to a collection (such as adding an index).
    4. Proper indexing is critical
       • Indexes can improve query performance by 2 to 3 orders of magnitude
         – 1000ms query down to <1ms!
       • Bad queries don’t just get in their own way, they get in the way of other things too:
         – write lock, queued operations, page faults
       • Bad indexing → Memory Apocalypse
         – without warning, large portions of your working data topple out of memory and must be page-faulted back
    5. Five key commands
       db.adventurers.find({"name": "Eric", "class": "Wizard"}).explain()
       db.adventurers.getIndexKeys()
       db.adventurers.getIndexes()
       db.adventurers.ensureIndex({"name": 1, "class": 1}, {"background": true})
       db.adventurers.dropIndex({"name": 1, "class": 1})
    6. explain() will reveal a scanAndOrder
       • scanAndOrder is almost always bad!
       • If MongoDB is re-ordering to satisfy a sort clause, explain() includes: { scanAndOrder: true }
       • MongoDB sorts documents in-memory! (very expensive!)
         – without an index, large result sets are rejected with an error
    7. Know Thy B-Tree
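    8. An index is a b-tree that maps a sequence of key values to a list of document pointers
       [Diagram: b-tree for the index { "name": 1, "class": 1 }. The name level holds “Ben” and “Eric”; under “Ben” the class level holds “Fighter” and “Noble”; under “Eric” it holds “Engineer” and “Wizard”.]
       The order of the keys really matters!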
    9. Index key order determines how the b-tree is constructed
       This ordering of keys influences how:
       • applicable an index is to a given query
         – a query that doesn’t include the first field(s) in the index cannot use the index
       • quickly the scope of possible results is pruned
         – here is where your data’s cardinality weighs in
       • documents are sorted in result sets
         – did I mention scanAndOrder was bad?
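To make the effect of key order concrete, here is a minimal sketch in plain JavaScript (no MongoDB connection needed). It is not how MongoDB stores an index; it only mimics the ordering a compound index imposes. The helper names `compareEntries` and `canUseIndex` are hypothetical, and `canUseIndex` encodes only the simplified rule from this slide: a query that omits the first index field cannot use the index.

```javascript
// Hypothetical helper: order two entries field by field, following the index key order.
function compareEntries(a, b, keyOrder) {
  for (const field of keyOrder) {
    if (a[field] < b[field]) return -1;
    if (a[field] > b[field]) return 1;
  }
  return 0;
}

// Sample data borrowed from the b-tree slide.
const adventurers = [
  { name: "Eric", class: "Wizard" },
  { name: "Ben", class: "Noble" },
  { name: "Eric", class: "Engineer" },
  { name: "Ben", class: "Fighter" },
];

// The same documents, ordered as { name: 1, class: 1 } and as { class: 1, name: 1 } would order them.
const byNameThenClass = [...adventurers].sort((a, b) => compareEntries(a, b, ["name", "class"]));
const byClassThenName = [...adventurers].sort((a, b) => compareEntries(a, b, ["class", "name"]));

// Simplified applicability rule (an assumption for illustration):
// the query must include the first field of the index.
function canUseIndex(queryFields, keyOrder) {
  return queryFields.includes(keyOrder[0]);
}

console.log(byNameThenClass.map((d) => `${d.name}/${d.class}`));
console.log(byClassThenName.map((d) => `${d.class}/${d.name}`));
console.log(canUseIndex(["class"], ["name", "class"])); // false: "name" is missing from the query
```

With `name` first, all of one person's entries sit next to each other, so a query on `name` can jump straight to them; with `class` first, the same entries are scattered.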
    10. Ordering is tricky and especially important with range operators
        The order of fields in an index should be:
        ① fields on which you will query for exact values
        ② fields on which you will sort
        ③ fields on which you will query for a range of values ($in, $gt, $lt, etc.)
        Article explaining this topic in
    11. Put the range field value last in your index
        diagram at
    12. Put the range field value last in your index
        diagram at
    13. Put the range field value last in your index
        diagram at
    14. Put the range field value last in your index
        diagram at
    15. Slow Hell (like normal hell only slower)
        What do we do?
    16. Be warned if you...
        • Use a variety of query patterns
        • Give the app user control over queries
        • Use MongoDB like a relational database
        • Have many indexes in each collection
    17. Don’t die the death of a thousand cuts
        • The most expensive queries are not always the slowest queries.
          – 50 queries * 20 ms == 1 s
            That’s 1 second other queries can’t use!
        • Profile your queries and check the <100ms range for a high volume of expensive but relatively fast queries
        • Remember... bad queries don’t just get in their own way!
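The arithmetic on this slide can be sketched as a small aggregation: group queries by pattern and rank the patterns by total time rather than by the time of a single query. This is an illustration only. The entry shape `{ pattern, millis }` and the function name `totalTimeByPattern` are assumptions, not the profiler's real document format.

```javascript
// Hypothetical input: one entry per executed query, with a normalized pattern and its duration.
function totalTimeByPattern(entries) {
  const totals = new Map();
  for (const { pattern, millis } of entries) {
    const t = totals.get(pattern) || { pattern, count: 0, totalMillis: 0 };
    t.count += 1;
    t.totalMillis += millis;
    totals.set(pattern, t);
  }
  // Most expensive pattern overall comes first.
  return [...totals.values()].sort((a, b) => b.totalMillis - a.totalMillis);
}

// 50 "fast" 20 ms queries versus a single 400 ms query.
const entries = [
  ...Array.from({ length: 50 }, () => ({ pattern: '{"name": "<name>"}', millis: 20 })),
  { pattern: '{"class": "<class>"}', millis: 400 },
];

const ranked = totalTimeByPattern(entries);
console.log(ranked);
```

Here the 20 ms pattern ranks first with 1000 ms in total, even though no single execution would cross a 100 ms slow-query threshold.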
    18. Identify the problematic queries
        • Search the log file
          – logs any query over 100ms
        • Use the database profiler
          ① Turn it on
             db.setProfilingLevel(1)   logs slow queries
             db.setProfilingLevel(2)   logs all queries (helpful but noisy)
          ② Find the slow queries
             db.system.profile.find().sort({millis: -1})
             db.system.profile.find({ns: "mongoquest.adventurers"})
             db.system.profile.find({op: {$in: ["query", "update", "command"]}})
          ③ Clean up
             db.setProfilingLevel(0)
             db.system.profile.drop()
    19. Here’s a hint() if you have too many indexes
        • The query optimizer might choose a suboptimal index
          – It’s empirical, so it is vulnerable to poor conditions at query time, especially in high-page-fault environments
        • Hint your queries to the better index
          – db.adventurers.find(…).hint({"myIndex": 1})
    20. Introducing...
    21. How Dex Works
        ① Dex iterates over the input (log or profile collection)
        ② A LogParser or ProfileParser extracts queries from each line of input.
        ③ Dex passes the query to a QueryAnalyzer.
        ④ The QueryAnalyzer compares the query to existing indexes (from left to right)
        ⑤ If an index meeting Dex’s criteria does not already exist, Dex suggests the best index for that query
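Step ④ can be sketched as a left-to-right walk over each existing index. The slides do not spell out Dex's actual criteria, so the rule below is an assumption: an index counts as "full" when its leading keys cover every queried field, and "partial" when only some leading keys match. The function name `classifyIndexes` is hypothetical; the result field names mirror the `fullIndexes`, `partialIndexes` and `needsRecommendation` fields shown in the "Dex's guts" slide.

```javascript
// Assumed rule, not Dex's real implementation: walk each index's keys from the left
// and stop at the first key the query does not use.
function classifyIndexes(queryFields, indexes) {
  const wanted = new Set(queryFields);
  const fullIndexes = [];
  const partialIndexes = [];
  for (const index of indexes) {
    let matched = 0;
    // Object.keys preserves insertion order for ordinary (non-numeric) field names.
    for (const key of Object.keys(index)) {
      if (!wanted.has(key)) break;
      matched += 1;
    }
    if (matched === wanted.size) fullIndexes.push(index);
    else if (matched > 0) partialIndexes.push(index);
  }
  return { fullIndexes, partialIndexes, needsRecommendation: fullIndexes.length === 0 };
}

const analysis = classifyIndexes(["name", "class"], [{ name: 1 }, { age: 1 }]);
console.log(analysis);
```

For a query on `name` and `class` with only `{ name: 1 }` and `{ age: 1 }` in place, the sketch reports one partial index and flags that a recommendation is needed.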
    22. The Heart of Dex
        Dex understands that the order of fields in an index should be:
        ① Equivalency checks {a: 1}
        ② Sorts .sort({b: 1})
        ③ Range checks {c: {$in: [1, 2]}}
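The ordering rule above can be sketched as a small function that builds an index key pattern from a query and a sort. This is a simplified illustration, not Dex's code: `recommendIndex`, `isRange` and the operator list in `RANGE_OPERATORS` are assumptions made for the example.

```javascript
// Assumed set of operators treated as "range" checks; the slides list $in, $gt, $lt, "etc."
const RANGE_OPERATORS = new Set(["$in", "$gt", "$gte", "$lt", "$lte"]);

function isRange(condition) {
  return (
    condition !== null &&
    typeof condition === "object" &&
    !Array.isArray(condition) &&
    Object.keys(condition).some((k) => RANGE_OPERATORS.has(k))
  );
}

// Build an index key pattern: equality fields, then sort fields, then range fields.
function recommendIndex(query, sort = {}) {
  const index = {};
  const rangeFields = [];
  for (const [field, condition] of Object.entries(query)) {
    if (isRange(condition)) rangeFields.push(field);
    else index[field] = 1; // ① equivalency checks first
  }
  for (const [field, direction] of Object.entries(sort)) {
    if (!(field in index)) index[field] = direction; // ② sort fields next
  }
  for (const field of rangeFields) {
    if (!(field in index)) index[field] = 1; // ③ range fields last
  }
  return index;
}

// People named "Eric" younger than 30, sorted by height.
const recommended = recommendIndex({ name: "Eric", age: { $lt: 30 } }, { height: 1 });
console.log(recommended);
```

For the query from the slide notes (name equals X, age less than N, sorted by height) the sketch yields the key order name, height, age.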
    23. Using Dex is easy
        Install using pip:
        > sudo pip install dex
        Usage: dex [<options>] uri
        > dex -f my/mongod/data/path/mongodb.log mongodb://myUser:myPass@myHost:12345/myDb
        > dex -p mongodb://myUser:myPass@myHost:12345/myDb
    24. Demo
    25. Example of Dex’s output (use -v for a shell command!)
        runStats: {
          linesRecommended: 76,
          linesProcessed: 76,
          linesPassed: 93
        },
        results: [
          {
            index: {"name": 1},
            totalTimeMillis: 410041,
            namespace: mongoquest.adventurers,
            queryCount: 2161,
            avgTimeMillis: 189,
            queries: [{"q": {"name": "<name>"}}]
          },
          ...
    26. The namespace filter (-n)
        > dex -f my/mongod/data/path/mongodb.log -n "myFirstDb.collectionOne" mongodb://myUser:myPass@myHost:12345/myFirstDb
        > dex -f my/mongod/data/path/mongodb.log -n "*.collectionOne" mongodb://myUser:myPass@myHost:12345/admin
        > dex -f my/mongod/data/path/mongodb.log -n "myFirstDb.*" -n "mySecondDb.*" mongodb://myUser:myPass@myHost:12345/admin
        Note the auth to the admin db to run against more than one db!
    27. Watch mode (-w)
        For when you want current results, not prior results.
        > dex -w -f my/mongod/data/path/mongodb.log mongodb://myUser:myPass@myHost:12345/myFirstDb
        > dex -w -p -n "dbname.*" mongodb://myUser:myPass@myHost:12345/admin
    28. SlowMS (-s/--slowms)
        Focus on longer-running queries
        > dex -w -f my/mongod/data/path/mongodb.log mongodb://myUser:myPass@myHost:12345/myFirstDb -s 1000
        > dex -w -p mongodb://myUser:myPass@myHost:12345/admin --slowms 5000
    29. Dex’s guts
        {
          parsed: ...,
          namespace: db.adventurers,
          queryAnalysis: {
            analyzedFields: [
              {fieldName: name, fieldType: EQUIV},
              {fieldName: class, fieldType: EQUIV}
            ],
            fieldCount: N,
            supported: true|false
          },
          indexAnalysis: {
            fullIndexes: [],
            partialIndexes: [{name: 1}],
            needsRecommendation: true|false
          },
          recommendation: {
            namespace: mongoquest.adventurers,
            index: {name: 1, class: 1},
            shellCommand: db.ensureIndex...
          }
        }
    30. Future plans for Dex
        Dev/Testing now:
        – Aggregation framework, geospatial queries, map/reduce
        – min/max/average nscanned and nreturned
        – scanAndOrder true/false
        Soon:
        • Renovation of internals
        • Improved index recommendations
          – set-wise optimization of index fields
            • minimize the number of indexes required to cover all of your queries
          – order-wise optimization of index fields
            • measure cardinality for key ordering
    32. PS
        We’re hiring!
    33. Questions?
        Thank you and good luck out there!