Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Mythbusting: Understanding How We Measure the Performance of MongoDB

677 views

Published on

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

Mythbusting: Understanding How We Measure the Performance of MongoDB

  1. 1. Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Mythbusting: Understanding How We Measure the Performance of MongoDB
  2. 2. Before we start… • We are going to look a lot at – C++ kernel code – Java benchmarks – JavaScript tests • And lots of charts…
  3. 3. Measuring "Performance" https://www.youtube.com/watch?v=7wm-pZp_mi0
  4. 4. Benchmarking • Some common traps • Performance measurement & diagnosis • What's next
  5. 5. Part One Some Common Traps
  6. 6. The Milk Train Doesn't Stop Here Anymore Tennessee Williams "We all live in a house on fire, no fire department to call; no way out, just the upstairs window to look out of while the fire burns the house down with us trapped, locked in it."
  7. 7. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  8. 8. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  9. 9. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  10. 10. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  11. 11. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  12. 12. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); So that looks ok, right?
  13. 13. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management?
  14. 14. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Thread contention on nextInt()? Object creation and GC management?
  15. 15. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Time to synthesize data? Object creation and GC management? Thread contention on nextInt()?
  16. 16. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management? Thread contention on addAndGet()? Thread contention on nextInt()? Time to synthesize data?
  17. 17. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management? Clock resolution? Thread contention on nextInt()? Time to synthesize data? Thread contention on addAndGet()?
  18. 18. // Pre Create the Object outside the Loop BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert]; for (int i=0; i < documentsPerInsert; i++) { BasicDBObject doc = new BasicDBObject(); String cVal = "…"; doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i] = doc; } Solution: Pre-Create the objects Pre-create non varying data outside the timing loop Alternative • Pre-create the data in a file; load from file
  19. 19. // Use ThreadLocalRandom generator or an instance of java.util.Random per thread java.util.concurrent.ThreadLocalRandom rand; for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", nextInt(rand, numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain count outside the loop globalInserts.addAndGet(documentsPerInsert * roundNum); Solution: Remove contention Remove contention nextInt() by making Thread local
  20. 20. // Use ThreadLocalRandom generator or an instance of java.util.Random per thread java.util.concurrent.ThreadLocalRandom rand; for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", nextInt(rand, numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain count outside the loop globalInserts.addAndGet(documentsPerInsert * roundNum); Solution: Remove contention Remove contention on addAndGet() Remove contention nextInt() by making Thread local
  21. 21. long startTime = System.currentTimeMillis(); … long endTime = System.currentTimeMillis(); long startTime = System.nanoTime(); … long endTime = System.nanoTime() - startTime; Solution: Timer resolution "resolution is at least as good as that of currentTimeMillis()" "granularity of the value depends on the underlying operating system and may be larger" Source • http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
  22. 22. General Principal #1 Know what you are measuring
  23. 23. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  24. 24. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  25. 25. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  26. 26. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  27. 27. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); So that looks ok, right?
  28. 28. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes
  29. 29. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes Unrestricted predicate?
  30. 30. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes Measuring • Time to parse & execute query • Time to retrieve all document But also • Cost of shipping ~4MB data through network stack Unrestricted predicate?
  31. 31. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Return fixed range
  32. 32. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Only project _id Return fixed range
  33. 33. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Only project _id Only 46k transferred through network stack Return fixed range
  34. 34. General Principal #2 Measure only what you need to measure
  35. 35. Part Two Performance measurement & diagnosis
  36. 36. The Physical Principles of the Quantum Theory (1930) Werner Heisenberg "Every experiment destroys some of the knowledge of the system which was obtained by previous experiments."
  37. 37. Broad categories • Micro Benchmarks • Workloads
  38. 38. Micro benchmarks: mongo-perf
  39. 39. mongo-perf: goals • Measure – commands • Configure – Single mongod, ReplSet size (1 -> n), Sharding – Single vs. Multiple DB – O/S • Characterize – Throughput by thread count • Compare
  40. 40. What do you get? Better
  41. 41. What do you get? Measured improvement between rc0 and rc2 Better
  42. 42. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  43. 43. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  44. 44. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  45. 45. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  46. 46. Code Change
  47. 47. Workloads • "public" workloads – YCSB – Sysbench • "real world" simulations – Inbox fan in/out – Message Stores – Content Management
  48. 48. Example: Bulk Load Performance 16m Documents Better 55% degradation 2.6.0-rc1 vs 2.4.10
  49. 49. Ouch… where's the tree in the woods? • 2.4.10 -> 2.6.0 – 4495 git commits
  50. 50. git-bisect • Bisect between good/bad hashes • git-bisect nominates a new githash – Build against githash – Re-run test – Confirm if this githash is good/bad • Rinse and repeat
  51. 51. Code Change - Bad Githash
  52. 52. Code Change - Fix
  53. 53. Bulk Load Performance - Fix Better 11% improvement 2.6.1 vs 2.4.10
  54. 54. The problem with measurement • Observability – What can you observe on the system? • Effect – What effects can an observation cause?
  55. 55. mtools
  56. 56. mtools • MongoDB log file analysis – Filter logs for operations, events – Response time, lock durations – Plot • https://github.com/rueckstiess/mtools
  57. 57. Response Times > 100ms Bulk Insert 2.6.0-rc0 Ops/Sec Time
  58. 58. Response Times > 100ms Bulk Insert 2.6.0-rc0 vs. 2.6.0-rc2 Floor raised
  59. 59. Code Change – Yielding Policy
  60. 60. Code Change
  61. 61. Response Times Bulk Insert 2.6.0 vs 2.6.1 Ceiling similar, lower floor resulting in 40% improvement in throughput
  62. 62. Secondary effects of Yield policy change Write lock time reduced Order of magnitude reduction of write lock duration
  63. 63. > db.serverStatus() Yes – will cause a read lock to be acquired > db.serverStatus({recordStats:0}) No – lock is not acquired > mongostat Yes - until SERVER-14008 resolved, uses db.serverStatus() Unexpected side effects of measurement?
  64. 64. CPU sampling • Get an impression of – Call Graphs – CPU time spent on node and called nodes
  65. 65. > sudo apt-get install google-perftools > sudo apt-get install libunwind7-dev > scons --use-cpu-profiler mongod Setup & building with google- profiler
  66. 66. > mongodb –dbpath <…> Note: Do not use –fork > mongo > use admin > db.runCommand({_cpuProfilerStart: {profileFilename: 'foo.prof'}}) Execute some commands that you want to profile > db.runCommand({_cpuProfilerStop: 1}) Start the profiling
  67. 67. Sample start vs. end of workload
  68. 68. Sample start vs. end of workload
  69. 69. Code change
  70. 70. Public Benchmarks – Not all forks are the same… • YCSB – https://github.com/achille/YCSB • sysbench-mongodb – https://github.com/mdcallag/sysbench-mongodb
  71. 71. Part Three And next?
  72. 72. Beavis & Butthead "The future sucks. Change it." "I'm way cool Beavis, but I cannot change the future."
  73. 73. What we are working on • mongo-perf – UI refactor – Adding more micro benchmarks • Workloads – Adding external benchmarks – Creating benchmarks for common use cases • Inbox fan in/out • Analytical dashboards • Stream / Feeds • Customers, Partners & Community
  74. 74. Here's how you can help change the future! • Got a great workload? Great benchmark? • Want to donate it? • alvin@mongodb.com
  75. 75. Don't be that benchmark… #1 Know what you are measuring #2 Measure only what you need to measure
  76. 76. alvin@mongodb.com Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Thank You

×