MongoDB: Optimising for Performance, Scale & Analytics

MongoDB is easy to download and run locally, but deploying it to production takes more thought and understanding. At scale, schema design, indexes and query patterns really matter. So do the data structure on disk, sharding, replication and data centre awareness. This talk examines these factors in the context of analytics, and more generally, to help you optimise MongoDB for any scale.

Presented at MongoDB Days London 2013 by David Mytton.


MongoDB: Optimising for Performance, Scale & Analytics

  1. 1. Optimising for performance, scale, analytics
  2. 2. ~3333 write ops/s 0.07 - 0.05 ms response
  3. 3. David Mytton. Woop Japan!
  4. 4. MongoDB at Server Density
  5. 5. MongoDB at Server Density • 27 nodes
  6. 6. MongoDB at Server Density • 27 nodes • June 2009 - +3yrs
  7. 7. MongoDB at Server Density • 27 nodes • June 2009 - +3yrs • MySQL -> MongoDB
  8. 8. MongoDB at Server Density • 27 nodes • June 2009 - +3yrs • MySQL -> MongoDB • 17TB data per month
  9. 9. MongoDB at Server Density: queues, primary data store, time series
  10. 10. Why?
  11. 11. Why? • Replication
  12. 12. Why? • Replication • Official drivers
  13. 13. Why? • Replication • Official drivers • Easy deployment
  14. 14. Why? • Replication • Official drivers • Easy deployment • Fast out of the box (sort of)
  15. 15. Fast out of the box? (Photo: dannychoo.com)
  16. 16. ~3333 write ops/s 0.07 - 0.05 ms response
  17. 17. Fast out of the box? • Softlayer cloud (1 core, 8GB) (Photo: dannychoo.com)
  18. 18. Fast out of the box? • Softlayer cloud (1 core, 8GB) • Local instance storage (Photo: dannychoo.com)
  19. 19. Fast out of the box? • Softlayer cloud (1 core, 8GB) • Local instance storage • Ubuntu 10/12.04 LTS (Photo: dannychoo.com)
  20. 20. Fast out of the box? • Journaling (Photo: dannychoo.com)
  21. 21. Fast out of the box? • Journaling • Replication (Photo: dannychoo.com)
  22. 22. Fast out of the box? (Picture is unrelated! Mmm, ice cream.)
  23. 23. Fast out of the box? • Fast network (Picture is unrelated! Mmm, ice cream.)
  24. 24. Fast out of the box? • Fast network • Working set in RAM (Picture is unrelated! Mmm, ice cream.)
  25. 25. mongos> db.metrics_20120508_15_1m.stats()
          {
            "sharded" : true,
            "flags" : 1,
            "ns" : "metrics.metrics_20120508_15_1m",
            "count" : 2752934,
            "numExtents" : 46,
            "size" : 746837640,
            "storageSize" : 823717888,
            "totalIndexSize" : 517581680,
            "indexSizes" : { "_id_" : 130358144, "a_1_i_1" : 155711920, "a_1_i_1_m_1_t_1" : 231511616 },
            "avgObjSize" : 271.2878841265355,
            "nindexes" : 3,
            "nchunks" : 61,
  26. 26. "size" : 746837640, "totalIndexSize" : 517581680. Indexes = 493MB, Data = 712MB
  27. 27. "size" : 746837640, "totalIndexSize" : 517581680. Indexes = 493MB, Data = 712MB
  28. 28. "size" : 746837640, "totalIndexSize" : 517581680. Indexes = 493MB, Data = 712MB, Total = 1205MB
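
The arithmetic on slides 26 to 28 can be reproduced from the stats() output. The following is a minimal sketch, assuming the 8GB node mentioned on slide 17; the helper names `to_mb` and `fits_in_ram` are hypothetical and not part of any MongoDB driver. The slides appear to round each figure down to whole megabytes, so the sketch does the same.

```python
# Figures copied from the db.metrics_20120508_15_1m.stats() output on slide 25.
stats = {
    "size": 746837640,            # bytes of document data
    "totalIndexSize": 517581680,  # bytes across all indexes
    "indexSizes": {
        "_id_": 130358144,
        "a_1_i_1": 155711920,
        "a_1_i_1_m_1_t_1": 231511616,
    },
}

MB = 1024 * 1024


def to_mb(num_bytes):
    """Convert bytes to whole megabytes, rounding down."""
    return num_bytes // MB


def fits_in_ram(collection_stats, ram_bytes):
    """Hypothetical helper: True if data plus indexes fit in the given RAM."""
    needed = collection_stats["size"] + collection_stats["totalIndexSize"]
    return needed <= ram_bytes


index_mb = to_mb(stats["totalIndexSize"])
data_mb = to_mb(stats["size"])
total_mb = index_mb + data_mb

print(f"Indexes = {index_mb}MB, Data = {data_mb}MB, Total = {total_mb}MB")
print("Fits in 8GB:", fits_in_ram(stats, 8 * 1024 * MB))
```

This covers a single collection only. On a real node every other collection and index competes for the same memory, so the check would need to sum over all of them.
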
  29. 29. Where should it go?
          What      Should it be in memory?
          Indexes   Always
          Data      If you can
          (Photo: http://www.flickr.com/photos/comedynose/4388430444/)
  30. 30. How you’ll know: 1) Slow queries
          Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: { c: "android/setDeviceToken", a: 1466, u: "blah", ua: "Server Density Android" } 51926ms
          (Photo: www.flickr.com/photos/tonivc/2283676770/)
  31. 31. How you’ll know: 2) Timeouts: cursor timed out (20000 ms)
  32. 32. How you’ll know: 3) Disk i/o spikes (Photo: www.flickr.com/photos/daddo83/3406962115/)
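
Slow queries like the one on slide 30 can be picked out of the mongod log mechanically. The following is a rough sketch, assuming the older log format shown on the slide, where the duration in milliseconds ends the line. The function name `parse_slow_op` and the 100 ms default threshold are choices made for this illustration.

```python
import re

# Log line copied from slide 30 (older mongod log format, duration at the end).
LINE = ('Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: '
        '{ c: "android/setDeviceToken", a: 1466, u: "blah", '
        'ua: "Server Density Android" } 51926ms')

# Connection id, operation, namespace, anything, then "<n>ms" at the end.
SLOW_OP = re.compile(
    r"\[(?P<conn>\w+)\] (?P<op>\w+) (?P<ns>\S+) .* (?P<ms>\d+)ms$"
)


def parse_slow_op(line, threshold_ms=100):
    """Return a dict for operations at or above threshold_ms, else None."""
    match = SLOW_OP.search(line)
    if match is None:
        return None
    duration = int(match.group("ms"))
    if duration < threshold_ms:
        return None
    return {
        "conn": match.group("conn"),
        "op": match.group("op"),
        "ns": match.group("ns"),
        "ms": duration,
    }


slow = parse_slow_op(LINE)
print(slow)
```

An update taking nearly 52 seconds on a small document is a strong hint that the working set no longer fits in RAM or that the query is not using an index.
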
  33. 33. Fast out of the box? • Fast network • Working set in RAM (Picture is unrelated! Mmm, ice cream.)
  34. 34. Fast out of the box? • Fast network • Working set in RAM • Fast disks (optional) (Picture is unrelated! Mmm, ice cream.)
  35. 35. Growing documents is bad
  36. 36. Updates in place: db.my_collection.update( { _id : ... }, { $inc : { y : 2 } } )
  37. 37. BSON
  38. 38. BSON
  39. 39. Growing documents is bad: > db.coll.stats() { "ns" : "...", ..., "paddingFactor" : 1, ..., "ok" : 1 }
  40. 40. Scaling writes • Global DB lock
  41. 41. Scaling writes • Global DB lock • Concurrency
  42. 42. http://blog.pythonisito.com/2011/12/mongodbs-write-lock.html
  43. 43. http://blog.pythonisito.com/2011/12/mongodbs-write-lock.html
  44. 44. http://bit.ly/mongolock
  45. 45. Fast out of the box? • Fast network • Working set in RAM • Fast disks (optional) • Sharding (optional) (Picture is unrelated! Mmm, ice cream.)
  46. 46. Scaling writes • Global DB lock • Concurrency • Sharding
  47. 47. Scaling writes
  48. 48. Scaling writes • Collection location
  49. 49. Scaling writes • Collection location • Pre-split / moveChunk
  50. 50. Scaling writes • Collection location • Pre-split / moveChunk • Hashing (v2.4)
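
Why hashing helps write scaling can be shown with a toy model. The sketch below assumes a four-shard cluster and keys that only ever increase, such as timestamps or counters. It uses md5 plus a modulo as a stand-in for a hashed shard key; MongoDB's own hashed sharding assigns ranges of hash values to chunks rather than taking a modulo, so this illustrates the distribution effect only.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed cluster size for the illustration


def range_shard(key, boundaries):
    """Range partitioning: pick the first range whose upper bound exceeds key."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)  # everything at or above the last boundary


def hashed_shard(key, num_shards=NUM_SHARDS):
    """Stand-in for a hashed shard key: md5 the key, then bucket the hash."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Chunks pre-split on keys seen so far: [0,1000), [1000,2000), [2000,3000), [3000,...)
boundaries = [1000, 2000, 3000]

# New writes use ever-increasing keys.
new_keys = range(3000, 4000)

by_range = Counter(range_shard(k, boundaries) for k in new_keys)
by_hash = Counter(hashed_shard(k) for k in new_keys)

print("range :", dict(by_range))
print("hashed:", dict(sorted(by_hash.items())))
```

With range partitioning every new write goes to the shard holding the top range, so one node takes the whole write load. Pre-splitting and moveChunk address the same problem by hand; hashing the key spreads the writes without manual intervention.
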
  51. 51. Failover • Replica sets
  52. 52. Failover • Replica sets • Master/slave
  53. 53. Failover • Replica sets • Master/slave • Min 3 nodes
  54. 54. Failover • Replica sets • Master/slave • Min 3 nodes • Automatic failover
  55. 55. rs.status()
          {
            "_id" : 1,
            "name" : "rs3b:27018",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 1886098,
            "optime" : {
              "t" : 1291252178000,
              "i" : 13
            },
            "optimeDate" : ISODate("2010-12-02T01:09:38Z"),
            "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z")
          },
          (Photo: www.ex-astris-scientia.org/inconsistencies/ent_vs_tng.htm; yes it’s a replicator from Star Trek)
  56. 56. rs.status(): 1) myState
          Value   Meaning
          0       Starting up (phase 1)
          1       Primary
          2       Secondary
          3       Recovering
          4       Fatal error
          5       Starting up (phase 2)
          6       Unknown state
          7       Arbiter
          8       Down
  57. 57. rs.status(): 2) Optime: "optimeDate" : ISODate("2010-12-02T01:09:38Z") (Photo: www.flickr.com/photos/robbie73/4244846566/)
  58. 58. rs.status(): 3) Heartbeat: "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z") (Photo: www.flickr.com/photos/drawblindfaith/3400981091/)
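
The three things to watch in rs.status() (state, optime and heartbeat) can be combined into one health check. This is a sketch only: `check_member` is a hypothetical helper, the 60 second lag and 30 second heartbeat thresholds are assumptions rather than MongoDB defaults, and the ISODate values are represented as Python datetimes.

```python
from datetime import datetime, timedelta, timezone

# State codes as listed on slide 56.
STATES = {
    0: "Starting up (phase 1)",
    1: "Primary",
    2: "Secondary",
    3: "Recovering",
    4: "Fatal error",
    5: "Starting up (phase 2)",
    6: "Unknown state",
    7: "Arbiter",
    8: "Down",
}


def check_member(member, primary_optime, now, max_lag_s=60, max_silence_s=30):
    """Return a list of problems for one rs.status() member entry."""
    problems = []
    state = member.get("state", 6)
    # Only primary, secondary and arbiter count as healthy steady states.
    if member.get("health") != 1 or state not in (1, 2, 7):
        problems.append("state: " + STATES.get(state, "Unknown state"))
    # Arbiters hold no data, so they may have no optimeDate.
    if "optimeDate" in member:
        lag = (primary_optime - member["optimeDate"]).total_seconds()
        if lag > max_lag_s:
            problems.append(f"replication lag {lag:.0f}s")
    if "lastHeartbeat" in member:
        silence = (now - member["lastHeartbeat"]).total_seconds()
        if silence > max_silence_s:
            problems.append(f"no heartbeat for {silence:.0f}s")
    return problems


# Member entry from slide 55.
ts = datetime(2010, 12, 2, 1, 9, 38, tzinfo=timezone.utc)
member = {"_id": 1, "name": "rs3b:27018", "health": 1, "state": 2,
          "stateStr": "SECONDARY", "optimeDate": ts, "lastHeartbeat": ts}

# The same member, five minutes behind the primary.
lagging = dict(member, optimeDate=ts - timedelta(minutes=5))

print(check_member(member, primary_optime=ts, now=ts))
print(check_member(lagging, primary_optime=ts, now=ts))
```
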
  59. 59. Scaling reads
  60. 60. Scaling reads • Replica slaves
  61. 61. Scaling reads • Replica slaves • Consistency
  62. 62. Scaling reads • Replica slaves • Consistency • w flag / tags
  63. 63. WriteConcern: changed Nov 27 2012
  64. 64. WriteConcern • Safe by default
          >>> from pymongo import MongoClient
          >>> connection = MongoClient()
  65. 65. WriteConcern • Safe by default
          >>> from pymongo import MongoClient
          >>> connection = MongoClient(w=int/str)
          Value   Meaning
          0       Unsafe
          1       Primary
          2       Primary + x1 secondary
          3       Primary + x2 secondaries
  66. 66. Tags
          { _id : "someSet",
            members : [
              {_id : 0, host : "A", tags : {"dc": "ny"}},
              {_id : 1, host : "B", tags : {"dc": "ny"}},
              {_id : 2, host : "C", tags : {"dc": "sf"}},
              {_id : 3, host : "D", tags : {"dc": "sf"}},
              {_id : 4, host : "E", tags : {"dc": "cloud"}}
            ],
            settings : { getLastErrorModes : { veryImportant : {"dc" : 3}, sortOfImportant : {"dc" : 2} } } }
          > db.foo.insert({x:1})
          > db.runCommand({getLastError : 1, w : "veryImportant"})
  67. 67. Tags: veryImportant is satisfied by (A or B) + (C or D) + E
          { _id : "someSet",
            members : [
              {_id : 0, host : "A", tags : {"dc": "ny"}},
              {_id : 1, host : "B", tags : {"dc": "ny"}},
              {_id : 2, host : "C", tags : {"dc": "sf"}},
              {_id : 3, host : "D", tags : {"dc": "sf"}},
              {_id : 4, host : "E", tags : {"dc": "cloud"}}
            ],
            settings : { getLastErrorModes : { veryImportant : {"dc" : 3}, sortOfImportant : {"dc" : 2} } } }
          > db.foo.insert({x:1})
          > db.runCommand({getLastError : 1, w : "veryImportant"})
  68. 68. Tags: sortOfImportant is satisfied by (A + C) or (D + E) ...
          { _id : "someSet",
            members : [
              {_id : 0, host : "A", tags : {"dc": "ny"}},
              {_id : 1, host : "B", tags : {"dc": "ny"}},
              {_id : 2, host : "C", tags : {"dc": "sf"}},
              {_id : 3, host : "D", tags : {"dc": "sf"}},
              {_id : 4, host : "E", tags : {"dc": "cloud"}}
            ],
            settings : { getLastErrorModes : { veryImportant : {"dc" : 3}, sortOfImportant : {"dc" : 2} } } }
          > db.foo.insert({x:1})
          > db.runCommand({getLastError : 1, w : "sortOfImportant"})
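
The tag rules on slides 66 to 68 come down to counting distinct tag values among the members that have acknowledged a write. The sketch below models that rule in plain Python. It does not talk to a server, and `mode_satisfied` is a hypothetical name used only for this illustration.

```python
# Replica set members and modes as configured on slides 66 to 68.
MEMBERS = {
    "A": {"dc": "ny"},
    "B": {"dc": "ny"},
    "C": {"dc": "sf"},
    "D": {"dc": "sf"},
    "E": {"dc": "cloud"},
}
MODES = {
    "veryImportant": {"dc": 3},
    "sortOfImportant": {"dc": 2},
}


def mode_satisfied(mode, acked_hosts):
    """True once the acknowledging hosts cover enough distinct tag values.

    {"dc": 3} means members with 3 different "dc" values hold the write.
    """
    for tag, required in MODES[mode].items():
        values = {MEMBERS[h][tag] for h in acked_hosts if tag in MEMBERS[h]}
        if len(values) < required:
            return False
    return True


# One host per data centre: ny, sf and cloud.
print(mode_satisfied("veryImportant", {"A", "C", "E"}))
# Three hosts but only two data centres: ny and sf.
print(mode_satisfied("veryImportant", {"A", "B", "C"}))
# Two data centres is enough for the weaker mode.
print(mode_satisfied("sortOfImportant", {"A", "C"}))
```

The second call shows why the count applies to tag values rather than hosts: three acknowledgements from two data centres do not protect against losing one of them.
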
  69. 69. WriteConcern • Safe by default • J
  70. 70. WriteConcern • Safe by default • J • fsync
  71. 71. Bottlenecks • EC2
  72. 72. Bottlenecks • EC2 • Local storage
  73. 73. Bottlenecks • EC2 • Local storage • EBS: RAID10 4-8 volumes
  74. 74. Bottlenecks • EC2 • Local storage • EBS: RAID10 4-8 volumes • i/o: rand but not sequential
  75. 75. http://www.slideshare.net/jrosoff/mongodb-on-ec2-and-ebs
  76. 76. http://bit.ly/ec2mongodb
  77. 77. Bottlenecks • CPU • Index building
  78. 78. Tips: rand() • _id
  79. 79. Tips: rand() • _id • Field names
  80. 80. Tips: rand() • _id • Field names • Covered indexes
  81. 81. Tips: rand() • _id • Field names • Covered indexes • Collections / databases
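
The field names tip follows from how BSON is laid out: every document carries its own copy of each field name, so long names are paid for once per document. The sketch below estimates the saving from shortening names. The long names `accountId` and `itemId` are hypothetical; the slides only show one-letter fields such as `a` and `i` in the index names. The document count is taken from slide 25.

```python
def bytes_saved(renames, doc_count):
    """Estimate bytes saved by shortening field names.

    BSON writes each field name into every document, so the saving per
    document is the number of bytes removed from the names. Assumes
    every document contains every renamed field.
    """
    per_doc = sum(
        len(old.encode("utf-8")) - len(new.encode("utf-8"))
        for old, new in renames.items()
    )
    return per_doc * doc_count


# Hypothetical long names mapped to one-letter names.
renames = {"accountId": "a", "itemId": "i"}
doc_count = 2752934  # "count" from the stats() output on slide 25

saved = bytes_saved(renames, doc_count)
print(f"{saved} bytes, about {saved // (1024 * 1024)}MB")
```

The estimate covers document data only. Index entries store key values rather than field names, so index size is not affected by renaming.
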
  82. 82. mongostat
  83. 83. mongostat: Locks/Queues
  84. 84. mongostat: Diagnostics
  85. 85. Current operations: db.currentOp();
          {
            "opid" : "shard1:299939199",
            "active" : true,
            "lockType" : "write",
            "waitingForLock" : false,
            "secs_running" : 15419,
            "op" : "remove",
            "ns" : "sd.metrics",
            "query" : {
              "accId" : 1391,
              "tA" : {
                "$lte" : ISODate("2010-11-24T19:53:00Z")
              }
            },
            "client" : "10.121.12.228:44426",
            "desc" : "conn"
          },
          (Photo: www.flickr.com/photos/jeffhester/2784666811/)
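
Output like the db.currentOp() entry above can be filtered for operations that have been holding a lock for a long time. The sketch below works on a plain dictionary shaped like that output, with the entries in an `inprog` array. The helper name `long_running` and the thresholds are assumptions, and the field names follow the slide; other server versions may report different fields.

```python
def long_running(current_op, min_secs=60, lock_type=None):
    """Filter db.currentOp() output for active operations running too long."""
    found = []
    for op in current_op.get("inprog", []):
        if not op.get("active"):
            continue
        if op.get("secs_running", 0) < min_secs:
            continue
        if lock_type is not None and op.get("lockType") != lock_type:
            continue
        found.append(op)
    return found


# The operation from slide 85, wrapped in an "inprog" array.
# The date in the query is kept as a plain string here.
sample = {"inprog": [{
    "opid": "shard1:299939199", "active": True, "lockType": "write",
    "waitingForLock": False, "secs_running": 15419, "op": "remove",
    "ns": "sd.metrics",
    "query": {"accId": 1391, "tA": {"$lte": "2010-11-24T19:53:00Z"}},
    "client": "10.121.12.228:44426", "desc": "conn",
}]}

for op in long_running(sample, min_secs=300, lock_type="write"):
    print(op["opid"], op["op"], op["ns"], f'{op["secs_running"]}s')
```

The remove on the slide had been running for more than four hours under a write lock, which is the kind of operation this filter is meant to surface.
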
  86. 86. Monitoring tools: run yourself (Ganglia)
  87. 87. Monitoring tools: Server Density
  88. 88. Fast out of the box? • Fast network • Working set in RAM (Picture is unrelated! Mmm, ice cream.)
  89. 89. Fast out of the box? • Fast network • Working set in RAM • bit.ly/benchrun (Picture is unrelated! Mmm, ice cream.)
  90. 90. www.serverdensity.com/mdb. Woop Japan!
  91. 91. David Mytton, @davidmytton, david@serverdensity.com, www.serverdensity.com. Woop Japan!
