Senior Director of Performance Engineering, MongoDB
Alvin Richards
#MongoDBWorld
Mythbusting: Understanding
How We Measure...
Before we start…
• We are going to look a lot at
– C++ kernel code
– Java benchmarks
– JavaScript tests
• And lots of char...
Measuring "Performance"
https://www.youtube.com/watch?v=7wm-pZp_mi0
Benchmarking
• Some common traps
• Performance measurement & diagnosis
• What's next
Part One
Some Common Traps
The Milk Train Doesn't Stop Here Anymore
Tennessee Williams
"We all live in a house on fire, no fire department
to call; n...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
long startTime = System.currentTimeMillis();
for (int roundNum = 0; roundNum < numRounds; roundNum++) {
for (int i = 0; i ...
// Pre Create the Object outside the Loop
BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert];
for (int i=0; i <...
// Use ThreadLocalRandom generator or an instance of java.util.Random per thread
java.util.concurrent.ThreadLocalRandom ra...
// Use ThreadLocalRandom generator or an instance of java.util.Random per thread
java.util.concurrent.ThreadLocalRandom ra...
long startTime = System.currentTimeMillis();
…
long endTime = System.currentTimeMillis();
long startTime = System.nanoTime...
General Principal #1
Know what you are
measuring
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("...
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("...
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("...
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("...
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("...
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("...
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("...
BasicDBObject doc = new BasicDBObject();
doc.put("v", str); // str is a 2k string
for (int i=0; i < 1000; i++) {
doc.put("...
BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20));
Bas...
BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20));
Bas...
BasicDBObject predicate = new BasicDBObject();
predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20));
Bas...
General Principal #2
Measure only what you
need to measure
Part Two
Performance
measurement &
diagnosis
The Physical Principles of the Quantum Theory (1930)
Werner Heisenberg
"Every experiment destroys some of the
knowledge of...
Broad categories
• Micro Benchmarks
• Workloads
Micro benchmarks: mongo-perf
mongo-perf: goals
• Measure
– commands
• Configure
– Single mongod, ReplSet size (1 -> n), Sharding
– Single vs. Multiple ...
What do you get?
Better
What do you get?
Measured
improvement
between rc0 and
rc2
Better
tests.push( { name: "Commands.CountsIntIDRange",
pre: function( collection ) {
collection.drop();
for ( var i = 0; i < 100...
tests.push( { name: "Commands.CountsIntIDRange",
pre: function( collection ) {
collection.drop();
for ( var i = 0; i < 100...
tests.push( { name: "Commands.CountsIntIDRange",
pre: function( collection ) {
collection.drop();
for ( var i = 0; i < 100...
tests.push( { name: "Commands.CountsIntIDRange",
pre: function( collection ) {
collection.drop();
for ( var i = 0; i < 100...
Code Change
Workloads
• "public" workloads
– YCSB
– Sysbench
• "real world" simulations
– Inbox fan in/out
– Message Stores
– Content ...
Example: Bulk Load Performance
16m Documents
Better
55% degradation
2.6.0-rc1 vs 2.4.10
Ouch… where's the tree in the
woods?
• 2.4.10 -> 2.6.0
– 4495 git commits
git-bisect
• Bisect between good/bad hashes
• git-bisect nominates a new githash
– Build against githash
– Re-run test
– C...
Code Change - Bad Githash
Code Change - Fix
Bulk Load Performance - Fix
Better
11% improvement
2.6.1 vs 2.4.10
The problem with measurement
• Observability
– What can you observe on the system?
• Effect
– What effects can an observat...
mtools
mtools
• MongoDB log file analysis
– Filter logs for operations, events
– Response time, lock durations
– Plot
• https://g...
Response Times > 100ms
Bulk Insert 2.6.0-rc0 Ops/Sec
Time
Response Times > 100ms
Bulk Insert 2.6.0-rc0 vs. 2.6.0-rc2
Floor raised
Code Change – Yielding Policy
Code Change
Response Times
Bulk Insert 2.6.0 vs 2.6.1
Ceiling similar, lower floor
resulting in 40%
improvement in throughput
Secondary effects of Yield policy change
Write lock time reduced
Order of magnitude reduction
of write lock duration
> db.serverStatus()
Yes – will cause a read lock to be acquired
> db.serverStatus({recordStats:0})
No – lock is not acquir...
CPU sampling
• Get an impression of
– Call Graphs
– CPU time spent on node and called nodes
> sudo apt-get install google-perftools
> sudo apt-get install libunwind7-dev
> scons --use-cpu-profiler mongod
Setup & bu...
> mongodb –dbpath <…>
Note: Do not use –fork
> mongo
> use admin
> db.runCommand({_cpuProfilerStart: {profileFilename: 'fo...
Sample start vs. end of workload
Sample start vs. end of workload
Code change
Public Benchmarks – Not all forks are
the same…
• YCSB
– https://github.com/achille/YCSB
• sysbench-mongodb
– https://gith...
Part Three
And next?
Beavis & Butthead
"The future sucks. Change it."
"I'm way cool Beavis, but I cannot change the
future."
What we are working on
• mongo-perf
– UI refactor
– Adding more micro benchmarks
• Workloads
– Adding external benchmarks
...
Here's how you can help change the
future!
• Got a great workload? Great benchmark?
• Want to donate it?
• alvin@mongodb.c...
Don't be that benchmark…
#1 Know what you are measuring
#2 Measure only what you need to
measure
alvin@mongodb.com
Senior Director of Performance Engineering, MongoDB
Alvin Richards
#MongoDBWorld
Thank You
Upcoming SlideShare
Loading in …5
×

Mythbusting: Understanding How We Measure the Performance of MongoDB

572 views
420 views

Published on

Published in: Technology, Business
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
572
On SlideShare
0
From Embeds
0
Number of Embeds
13
Actions
Shares
0
Downloads
5
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide
  • Per Java7 documentation

    http://docs.oracle.com/javase/7/docs/api/java/util/Random.html

    "Instances of java.util.Random are threadsafe. However, the concurrent use of the same java.util.Random instance across threads may encounter contention and consequent poor performance. Consider instead using ThreadLocalRandom in multithreaded designs."
  • Per Java7 documentation

    http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html

    "The specifications of these methods enable implementations to employ efficient machine-level atomic instructions that are available on contemporary processors. However on some platforms, support may entail some form of internal locking."
  • Per Java7 documentation

    http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#currentTimeMillis()

    "Returns the current time in milliseconds. Note that while the unit of time of the return value is a millisecond, the granularity of the value depends on the underlying operating system and may be larger. For example, many operating systems measure time in units of tens of milliseconds."
  • Per Jav7 documentation

    http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadLocalRandom.html

    "A random number generator isolated to the current thread…Use of ThreadLocalRandom is particularly appropriate when multiple tasks (for example, each a ForkJoinTask) use random numbers in parallel in thread pools."
  • Per Java7 documentation

    http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime()

    "This method provides nanosecond precision, but not necessarily nanosecond resolution (that is, how frequently the value changes) - no guarantees are made except that the resolution is at least as good as that of currentTimeMillis()."
  • Githash

    https://github.com/mongodb/mongo/commit/d1dc7cf2b213d77103658ccd2ea4816b33a27f6a#diff-7ba76fe024c203ca35087f3b93395acc
  • Githash

    https://github.com/mongodb/mongo/commit/00f7aeaa25f98de5e66f0759d5b102951a247526#diff-fa99d4a7f4e8efac0787f30c60814eaf
  • Githash

    https://github.com/mongodb/mongo/commit/68d42de9a958688acbf659dfb651fb699e9d7394#diff-fa99d4a7f4e8efac0787f30c60814eaf
  • Githash

    https://github.com/mongodb/mongo/commit/00f7aeaa25f98de5e66f0759d5b102951a247526#diff-fa99d4a7f4e8efac0787f30c60814eaf
  • Githash

    https://github.com/mongodb/mongo/commit/68d42de9a958688acbf659dfb651fb699e9d7394#diff-fa99d4a7f4e8efac0787f30c60814eaf
  • Githash

    https://github.com/mongodb/mongo/commit/8d43b5cb9949c16452cb8d949c89d94cab9c8bad#diff-264fb70c85a638c671570970f3752bf3
  • Mythbusting: Understanding How We Measure the Performance of MongoDB

    1. 1. Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Mythbusting: Understanding How We Measure the Performance of MongoDB
    2. 2. Before we start… • We are going to look a lot at – C++ kernel code – Java benchmarks – JavaScript tests • And lots of charts…
    3. 3. Measuring "Performance" https://www.youtube.com/watch?v=7wm-pZp_mi0
    4. 4. Benchmarking • Some common traps • Performance measurement & diagnosis • What's next
    5. 5. Part One Some Common Traps
    6. 6. The Milk Train Doesn't Stop Here Anymore Tennessee Williams "We all live in a house on fire, no fire department to call; no way out, just the upstairs window to look out of while the fire burns the house down with us trapped, locked in it."
    7. 7. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
    8. 8. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
    9. 9. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
    10. 10. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
    11. 11. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
    12. 12. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); So that looks ok, right?
    13. 13. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management?
    14. 14. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Thread contention on nextInt()? Object creation and GC management?
    15. 15. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Time to synthesize data? Object creation and GC management? Thread contention on nextInt()?
    16. 16. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management? Thread contention on addAndGet()? Thread contention on nextInt()? Time to synthesize data?
    17. 17. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "…" doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management? Clock resolution? Thread contention on nextInt()? Time to synthesize data? Thread contention on addAndGet()?
    18. 18. // Pre Create the Object outside the Loop BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert]; for (int i=0; i < documentsPerInsert; i++) { BasicDBObject doc = new BasicDBObject(); String cVal = "…"; doc.put("c",cVal); String padVal = "…"; doc.put("pad",padVal); aDocs[i] = doc; } Solution: Pre-Create the objects Pre-create non varying data outside the timing loop Alternative • Pre-create the data in a file; load from file
    19. 19. // Use ThreadLocalRandom generator or an instance of java.util.Random per thread java.util.concurrent.ThreadLocalRandom rand; for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", nextInt(rand, numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain count outside the loop globalInserts.addAndGet(documentsPerInsert * roundNum); Solution: Remove contention Remove contention nextInt() by making Thread local
    20. 20. // Use ThreadLocalRandom generator or an instance of java.util.Random per thread java.util.concurrent.ThreadLocalRandom rand; for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", nextInt(rand, numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain count outside the loop globalInserts.addAndGet(documentsPerInsert * roundNum); Solution: Remove contention Remove contention on addAndGet() Remove contention nextInt() by making Thread local
    21. 21. long startTime = System.currentTimeMillis(); … long endTime = System.currentTimeMillis(); long startTime = System.nanoTime(); … long endTime = System.nanoTime() - startTime; Solution: Timer resolution "resolution is at least as good as that of currentTimeMillis()" "granularity of the value depends on the underlying operating system and may be larger" Source • http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
    22. 22. General Principal #1 Know what you are measuring
    23. 23. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
    24. 24. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
    25. 25. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
    26. 26. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
    27. 27. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); So that looks ok, right?
    28. 28. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes
    29. 29. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes Unrestricted predicate?
    30. 30. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes Measuring • Time to parse & execute query • Time to retrieve all document But also • Cost of shipping ~4MB data through network stack Unrestricted predicate?
    31. 31. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Return fixed range
    32. 32. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Only project _id Return fixed range
    33. 33. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Only project _id Only 46k transferred through network stack Return fixed range
    34. 34. General Principal #2 Measure only what you need to measure
    35. 35. Part Two Performance measurement & diagnosis
    36. 36. The Physical Principles of the Quantum Theory (1930) Werner Heisenberg "Every experiment destroys some of the knowledge of the system which was obtained by previous experiments."
    37. 37. Broad categories • Micro Benchmarks • Workloads
    38. 38. Micro benchmarks: mongo-perf
    39. 39. mongo-perf: goals • Measure – commands • Configure – Single mongod, ReplSet size (1 -> n), Sharding – Single vs. Multiple DB – O/S • Characterize – Throughput by thread count • Compare
    40. 40. What do you get? Better
    41. 41. What do you get? Measured improvement between rc0 and rc2 Better
    42. 42. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
    43. 43. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
    44. 44. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
    45. 45. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
    46. 46. Code Change
    47. 47. Workloads • "public" workloads – YCSB – Sysbench • "real world" simulations – Inbox fan in/out – Message Stores – Content Management
    48. 48. Example: Bulk Load Performance 16m Documents Better 55% degradation 2.6.0-rc1 vs 2.4.10
    49. 49. Ouch… where's the tree in the woods? • 2.4.10 -> 2.6.0 – 4495 git commits
    50. 50. git-bisect • Bisect between good/bad hashes • git-bisect nominates a new githash – Build against githash – Re-run test – Confirm if this githash is good/bad • Rinse and repeat
    51. 51. Code Change - Bad Githash
    52. 52. Code Change - Fix
    53. 53. Bulk Load Performance - Fix Better 11% improvement 2.6.1 vs 2.4.10
    54. 54. The problem with measurement • Observability – What can you observe on the system? • Effect – What effects can an observation cause?
    55. 55. mtools
    56. 56. mtools • MongoDB log file analysis – Filter logs for operations, events – Response time, lock durations – Plot • https://github.com/rueckstiess/mtools
    57. 57. Response Times > 100ms Bulk Insert 2.6.0-rc0 Ops/Sec Time
    58. 58. Response Times > 100ms Bulk Insert 2.6.0-rc0 vs. 2.6.0-rc2 Floor raised
    59. 59. Code Change – Yielding Policy
    60. 60. Code Change
    61. 61. Response Times Bulk Insert 2.6.0 vs 2.6.1 Ceiling similar, lower floor resulting in 40% improvement in throughput
    62. 62. Secondary effects of Yield policy change Write lock time reduced Order of magnitude reduction of write lock duration
    63. 63. > db.serverStatus() Yes – will cause a read lock to be acquired > db.serverStatus({recordStats:0}) No – lock is not acquired > mongostat Yes - until SERVER-14008 resolved, uses db.serverStatus() Unexpected side effects of measurement?
    64. 64. CPU sampling • Get an impression of – Call Graphs – CPU time spent on node and called nodes
    65. 65. > sudo apt-get install google-perftools > sudo apt-get install libunwind7-dev > scons --use-cpu-profiler mongod Setup & building with google- profiler
    66. 66. > mongodb –dbpath <…> Note: Do not use –fork > mongo > use admin > db.runCommand({_cpuProfilerStart: {profileFilename: 'foo.prof'}}) Execute some commands that you want to profile > db.runCommand({_cpuProfilerStop: 1}) Start the profiling
    67. 67. Sample start vs. end of workload
    68. 68. Sample start vs. end of workload
    69. 69. Code change
    70. 70. Public Benchmarks – Not all forks are the same… • YCSB – https://github.com/achille/YCSB • sysbench-mongodb – https://github.com/mdcallag/sysbench-mongodb
    71. 71. Part Three And next?
    72. 72. Beavis & Butthead "The future sucks. Change it." "I'm way cool Beavis, but I cannot change the future."
    73. 73. What we are working on • mongo-perf – UI refactor – Adding more micro benchmarks • Workloads – Adding external benchmarks – Creating benchmarks for common use cases • Inbox fan in/out • Analytical dashboards • Stream / Feeds • Customers, Partners & Community
    74. 74. Here's how you can help change the future! • Got a great workload? Great benchmark? • Want to donate it? • alvin@mongodb.com
    75. 75. Don't be that benchmark… #1 Know what you are measuring #2 Measure only what you need to measure
    76. 76. alvin@mongodb.com Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Thank You

    ×