MongoDB Tokyo - Monitoring and Queueing

Presentation given by @davidmytton at Mongo Tokyo meetup 15th Nov 2011.



  1. MongoDB Queueing & Monitoring
  2. • Server Density • 26 nodes • 6 replica sets • Primary datastore = 15 nodes
  3. • Server Density • +7TB / mth • +1bn docs / mth • 2-5k inserts/s @ 3ms. We use MongoDB as our primary data store but also as a queueing system. So I'm going to talk first about how we built the queuing functionality into Mongo, and then more generally about what you need to keep an eye on when monitoring MongoDB in production.
  4. Queuing: Uses (www.flickr.com/photos/triplexpresso/496995086/)
  5. Queuing: Uses • Background processing (www.flickr.com/photos/triplexpresso/496995086/)
  6. Queuing: Uses • Background processing • Sending notifications (www.flickr.com/photos/triplexpresso/496995086/)
  7. Queuing: Uses • Background processing • Sending notifications • Event streaming (www.flickr.com/photos/triplexpresso/496995086/) All asynchronous.
  8. Queuing: Features
  9. Queuing: Features • Consumers
  10. Queuing: Features • Consumers • Atomic
  11. Queuing: Features • Consumers • Atomic • Speed
  12. Queuing: Features • Consumers • Atomic • Speed • GC
  13. Queuing: Features • Consumers
  14. Queuing: Features • Consumers. MongoDB: Mongo Wire Protocol; RabbitMQ: AMQP. If you're building a queue, this is what you connect with: RabbitMQ speaks AMQP, MongoDB speaks the Mongo Wire Protocol.
  15. Queuing: Features • Atomic (en.wikipedia.org/wiki/State_of_matter)
  16. Queuing: Features • Atomic. MongoDB: findAndModify; RabbitMQ: consume/ack. (en.wikipedia.org/wiki/State_of_matter)
  17. Queuing: Features • Speed
  18. Queuing: Features • GC
  19. Queuing: Features • GC. MongoDB: ☹ (nothing built in); RabbitMQ: consume/ack.
  20. Implementation • Consumers. Two things we need to implement: consumers and GC.
  21. Implementation • Consumers: db.runCommand( { findAndModify : <collection>, <options> } ). The findAndModify command takes 2 parameters: the collection and the options.
  22. Implementation • Consumers: query: filter (WHERE) { query: { inProg: false } }. Specify the query just like any normal query against Mongo. The very first document that matches this will be returned. Since we're building a queuing system, we're using a field called inProg, so we're asking it to give us documents where this is false, i.e. the processing of that document isn't in progress.
  23. Implementation • Consumers: update: modifier object { update: { $set: { inProg: true, start: new Date() } } }. An atomic update.
  24. Implementation • Consumers: sort: selects the first one on multi-match { sort: { added: -1 } }. We can also sort, e.g. on a timestamp, so you can return the oldest documents first (note { added: -1 } sorts descending, i.e. newest first; use { added: 1 } for oldest first), or you could build a priority system to return more important documents first.
  25. Implementation • Consumers: remove: true = deletes on return; new: true = returns the modified object; fields: return specific fields; upsert: true = create the object if !exists()
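The consumer step described above can be sketched in Python. This is an in-memory stand-in for the findAndModify semantics (atomically pick the oldest unclaimed document and mark it in progress); the field names mirror the slides, but the helper itself is hypothetical and not part of any driver.

```python
from datetime import datetime

def claim_next_job(queue, sort_key="added"):
    """Simulates the consumer's findAndModify step against an in-memory
    list of dicts: find a document with inProg == False, oldest first,
    and mark it as being processed. Against a real server this would be
    db.runCommand({findAndModify: ..., query: {inProg: false},
    update: {$set: {inProg: true, start: new Date()}}, sort: {added: 1}}),
    where the server guarantees the find-and-update is atomic."""
    candidates = [doc for doc in queue if not doc["inProg"]]
    if not candidates:
        return None  # nothing to process
    job = min(candidates, key=lambda d: d[sort_key])  # oldest first
    job["inProg"] = True           # $set: {inProg: true}
    job["start"] = datetime.now()  # $set: {start: new Date()}
    return job
```

In the in-memory version atomicity is trivial; the point of using findAndModify on a real server is that two consumers can never claim the same document.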
  26. Implementation • GC
  27. Implementation • GC: now = datetime.datetime.now(); difference = datetime.timedelta(seconds=10); timeout = now - difference; queue.find({"inProg": True, "start": {"$lte": timeout}})
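The query above finds jobs that were claimed but never finished. A minimal sketch of the full GC pass, again against an in-memory stand-in; the requeue_stale name and the reset step are my own additions, since the slide only shows the find:

```python
import datetime

def requeue_stale(queue, timeout_secs=10, now=None):
    """GC pass: any document still marked inProg whose start time is older
    than the timeout is assumed to belong to a dead consumer. Mirrors
    queue.find({"inProg": True, "start": {"$lte": timeout}}) and then
    resets the matches so another consumer can pick them up."""
    now = now or datetime.datetime.now()
    cutoff = now - datetime.timedelta(seconds=timeout_secs)
    stale = [doc for doc in queue if doc["inProg"] and doc["start"] <= cutoff]
    for doc in stale:
        doc["inProg"] = False   # return the job to the queue
        doc.pop("start", None)  # clear the claim timestamp
    return stale
```

You would run a pass like this periodically, with the timeout set to comfortably exceed your longest expected job.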
  28. Stick with RabbitMQ?
  29. Stick with RabbitMQ? QoS
  30. Stick with RabbitMQ? QoS, AMQP
  31. Stick with RabbitMQ? QoS, AMQP, Throttling
  32. It's a little different, but not entirely new. The problem is that MongoDB is fairly new, and whilst it's still just another database running on a server, there are things that are new and unusual. This means that some old assumptions are still valid, but others aren't. You don't have to approach it as a completely new thing, but it is a little different. There are disadvantages to this, but one advantage is that you can use it for novel tasks, like queuing.
  33. Keep it in RAM. Obviously. (www.flickr.com/photos/comedynose/4388430444/) The first and most obvious thing to note is that keeping everything in RAM is faster. But what does that actually mean, and how do you know when something is in RAM?
  34. How do you know? > db.stats() { "collections": 3, "objects": 379970142, "avgObjSize": 146.4554114991488, "dataSize": 55648683504 (51GB), "storageSize": 61795435008, "numExtents": 64, "indexes": 1, "indexSize": 21354514128 (19GB), "fileSize": 100816388096, "ok": 1 } The easiest way is to check the database size. The MongoDB console provides an easy way to look at the data and index sizes, and the output is in bytes.
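The db.stats() numbers feed directly into the RAM question the next slide asks. A small helper sketch; the function name and the simple thresholds are my own, and in practice the OS, connections and other processes need memory too:

```python
def ram_plan(stats, ram_bytes):
    """Given a db.stats() document (sizes in bytes), report whether the
    indexes fit in RAM (they always should) and whether the data fits
    too (nice to have). Deliberately simplified: ignores memory needed
    by the OS and by MongoDB itself beyond data and indexes."""
    index_size = stats["indexSize"]
    data_size = stats["dataSize"]
    return {
        "indexes_fit": index_size <= ram_bytes,
        "data_fits_too": index_size + data_size <= ram_bytes,
    }
```

With the slide's numbers (19GB of indexes, 51GB of data), a 32GB box holds the indexes but not the full data set.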
  35. Where should it go? What? / Should it be in memory? Indexes: Always. Data: If you can. (www.flickr.com/photos/comedynose/4388430444/) In every case, having something in memory is going to be faster than not. However, that's not always feasible if you have massive data sets. Instead, you want to make sure you always have enough RAM to store all the indexes, which is what the db.stats() output is for. And if you can, have space for the data too. MongoDB is smart about its memory management, so it will keep commonly accessed data in RAM where possible.
  36. How you'll know 1) Slow queries: Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: { c: "android/setDeviceToken", a: 1466, u: "blah", ua: "Server Density Android" } 51926ms (www.flickr.com/photos/tonivc/2283676770/) Although not the only reason, a slow query does indicate insufficient memory. It might be that you've not got the most optimal indexes for a query, but if indexes are being used and it's still slow, it could be a disk i/o bottleneck because the data isn't in RAM. Doing an explain on the query will show you which indexes it is using.
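If you want to catch lines like that automatically, the duration mongod appends (here 51926ms) is easy to pull out. A hypothetical parser sketch; the regex is mine and only aims at lines shaped like the example above:

```python
import re

# Matches "... [connNNNN] <op> <namespace> ... <N>ms" at end of line.
SLOW_RE = re.compile(
    r"\[(?P<conn>\w+)\]\s+(?P<op>\w+)\s+(?P<ns>[\w.]+)\s+.*?(?P<ms>\d+)ms\s*$"
)

def parse_slow_line(line):
    """Extract operation, namespace and duration (ms) from a mongod
    slow-query log line; returns None for lines that don't match."""
    m = SLOW_RE.search(line)
    if not m:
        return None
    return {"op": m.group("op"), "ns": m.group("ns"), "ms": int(m.group("ms"))}
```

A script tailing the log could feed matches into whatever alerting you already run.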
  37. How you'll know 2) Timeouts: cursor timed out (20000 ms). These slow queries will obviously cause a slowdown in your app, but they may also cause timeouts. In the PHP driver a cursor will time out after 20,000ms by default, although this is configurable.
  38. How you'll know 3) Disk i/o spikes (www.flickr.com/photos/daddo83/3406962115/) You'll see write spikes because MongoDB syncs data to disk periodically, but if you're seeing read spikes then that can indicate MongoDB is having to read the data files rather than accessing data from memory. Be careful though, because this won't distinguish between data and indexes, or even other server activity. Read spikes can also occur even if you have little or no read activity, if the mongod is part of a cluster where the slaves are reading from the oplog.
  39. Watch your storage 1) Pre-alloc. It sounds obvious, but our statistics show that people run out of disk space suddenly, even though there is a predictable increase over time. Remember that MongoDB pre-allocates files before the space is used, so you'll see your storage being used up in 2GB increments (once you go past the smaller initial data file sizes).
  40. Watch your storage 2) Sharding maxSize. When adding a new shard you can specify the maximum amount of data you want to store on that shard. This isn't a hard limit and is instead used as a guide. MongoDB will try to keep the data balanced across all your shards so that it meets this setting, but it may not. MongoDB doesn't currently look at actual disk levels and assumes available capacity is the same across all nodes. As such, it's advisable to set this to around 70% of the total available disk space.
  41. Watch your storage 3) Logging: --quiet; db.runCommand("logRotate"); killall -SIGUSR1 mongod. Logging is verbose by default, so you'll want to use the quiet option to ensure only important things are output. And assuming you're logging to a log file, you will want to periodically rotate it via the MongoDB console so that it doesn't get too big. You can also do a killall -SIGUSR1 on all your mongod processes from the shell, which will cause a log rotation (because of the SIGUSR1 signal). This is useful if you want to script log rotation or put it in a cron job.
  42. Watch your storage 4) Journaling: david@rs2b ~: ls -alh /mongodbdata/journal/ total 538M; -rw------- 1 david david 538M Mar 20 17:00 j._862; -rw------- 1 david david 88 Mar 20 17:00 lsn. Mongo should rotate the journal files often, but you need to remember that they will take up some space too, and as new files are allocated and old ones deleted, you may see your disk usage spiking up and down.
  43. db.serverStatus(). The server status command provides a lot of different statistics that can help you, like this map of traffic in central Tokyo.
  44. db.serverStatus() 1) Used connections (www.flickr.com/photos/armchaircaver/2061231069/) Every connection to the database has an overhead. You want to reduce this number by using persistent connections through the drivers.
  45. db.serverStatus() 2) Available connections. Every server has its limits. If you run out of available connections then you'll have a problem, which will look like this in the logs.
  46. Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files (repeated many times) Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1b") failed: No address associated with hostname (and the same for every other replica set member) Fri Nov 19 17:24:32 [conn2268] checkmaster: rs2b:27018 { setName: "set2", ismaster: false, secondary: true, hosts: [ "rs2b:27018", "rs2d:27018", "rs2c:27018", "rs2a:27018" ], arbiters: [ "rs2arbiter:27018" ], primary: "rs2a:27018", maxBsonObjectSize: 8388608, ok: 1.0 } MessagingPort say send() errno:9 Bad file descriptor (NONE) Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2d:27018 Fri Nov 19 17:24:34 [conn2343] reconnect rs2d:27018 failed (and so on for rs2c and rs2a). We've recently had this problem, and it manifests itself by the logs filling up all available disk space instantly, and in some cases completely crashing the server.
  47. connPoolStats: > db.runCommand("connPoolStats") { "hosts": { "config1:27019": { "available": 2, "created": 6 }, "set1/rs1a:27018,rs1b:27018": { "available": 1, "created": 249 }, ... }, "totalAvailable": 5, "totalCreated": 1002, "numDBClientConnection": 3490, "numAScopedConnection": 3 } connPoolStats allows you to see the connection pools that a mongos has set up to connect to the different members of the replica set shards. This is useful to correlate against open file descriptors, so you can see if there are suddenly a large number of connections, or if there are a low number of available connections across your entire cluster.
  48. db.serverStatus() 3) Index counters: "indexCounters": { "btree": { "accesses": 15180175, "hits": 15178725, "misses": 1450, "resets": 0, "missRatio": 0.00009551932 } } The miss ratio is what you're looking at here. If you're seeing a lot of index misses then you need to look at your queries to see if they're making optimal use of the indexes you've created. You should consider adding new indexes and seeing if your queries run faster as a result. You can use the explain syntax to see which indexes queries are hitting, plus the total execution time, so you can benchmark them before and after.
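The missRatio is just misses over accesses, so it is easy to recompute and alert on from the raw counters. A trivial helper sketch; the function name is my own, not part of MongoDB:

```python
def index_miss_ratio(index_counters):
    """missRatio = misses / accesses from the indexCounters btree block
    of db.serverStatus(). A consistently high value suggests queries
    aren't making good use of the indexes."""
    accesses = index_counters["accesses"]
    if accesses == 0:
        return 0.0  # no accesses yet, nothing to report
    return index_counters["misses"] / accesses
```

With the slide's counters this reproduces the reported missRatio of roughly 0.0001, i.e. well under one miss per thousand accesses.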
  49. db.serverStatus() 4) Op counters (www.flickr.com/photos/cosmic_bandita/2395369614/) The op counters - inserts, updates, deletes and queries - are fun to look at, especially if the numbers are high. But you have to be careful that these are not just vanity metrics. There are some things you can use them for, though. If you have a high number of inserts and updates, i.e. writes, then you may want to look at your fsync time setting. By default this will flush to disk every 60 seconds, but if you're doing thousands of writes per second you might want to do this sooner for durability. Of course, you can also ensure the write happens from within the driver. Queries can show whether you need to offload reads to your slaves, which can be done through the drivers, so that you're spreading the load across your servers and only writing to the master. Deletes can also cause concurrency problems if you're doing a large number of them and the database keeps having to yield.
  50. db.serverStatus() 5) Background flushing. Picture is unrelated! Mmm, ice cream. The server status output allows you to see the last time data was flushed to disk, and how long that took. This is useful to see if you're causing high disk load, but also so you can monitor how often data is being written. Remember that until it is synced to disk, you could experience data loss in the event of a crash or power outage.
  51. db.serverStatus() 6) Dur. If you have journaling enabled then serverStatus will also show some stats, such as how many commits have occurred, the amount of data written and how long various operations have taken. This can be useful for seeing how much overhead durability adds to servers. We've found no noticeable difference when enabling journaling, and that's on servers processing billions of operations.
  52. rs.status() { "_id": 1, "name": "rs3b:27018", "health": 1, "state": 2, "stateStr": "SECONDARY", "uptime": 1886098, "optime": { "t": 1291252178000, "i": 13 }, "optimeDate": ISODate("2010-12-02T01:09:38Z"), "lastHeartbeat": ISODate("2010-12-02T01:09:38Z") } (www.ex-astris-scientia.org/inconsistencies/ent_vs_tng.htm - yes, it's a replicator from Star Trek) If you're running a replica set then you can use the rs.status() command to get information about the whole replica set, on any set member. This gives you a few stats about the current member, as well as a full list of every member in the set.
  53. rs.status() 1) myState. Value / Meaning: 0 = Starting up (phase 1); 1 = Primary; 2 = Secondary; 3 = Recovering; 4 = Fatal error; 5 = Starting up (phase 2); 6 = Unknown state; 7 = Arbiter; 8 = Down. (en.wikipedia.org/wiki/State_of_matter) The first value is myState, which shows you the status of the server you executed the command on. However, it's also used in the list of members the command provides, so you can see the state of any member in the replica set, as that member sees it. This is useful to understand why members might be marked down because other members can't see them.
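The myState table maps directly onto a lookup, which is handy when scripting checks against rs.status() output. The mapping follows the slide; the helper itself is hypothetical:

```python
# State codes as listed on the slide (MongoDB of this era).
RS_STATES = {
    0: "Starting up (phase 1)",
    1: "Primary",
    2: "Secondary",
    3: "Recovering",
    4: "Fatal error",
    5: "Starting up (phase 2)",
    6: "Unknown state",
    7: "Arbiter",
    8: "Down",
}

def state_name(my_state):
    """Translate an rs.status() myState code into a readable label."""
    return RS_STATES.get(my_state, "Unknown state")
```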
  54. rs.status() 2) Optime: "optimeDate": ISODate("2010-12-02T01:09:38Z") (www.flickr.com/photos/robbie73/4244846566/) Replica set members that are not the master will be secondary, which means they'll act as a slave, staying up to date with the master. The optimeDate allows you to see whether a member is behind on the replication sync. The timestamp is the last applied log item, so if the member is up to date, it'll be very close to the current actual time on the server.
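Replication lag is then just the gap between optimeDate and the current time. A sketch; the function name is mine, and it assumes both timestamps are UTC:

```python
from datetime import datetime, timezone

def replication_lag_secs(optime_date, now=None):
    """Seconds between a member's last applied oplog entry (optimeDate
    from rs.status()) and now. Near zero means the secondary is keeping
    up with the master."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - optime_date).total_seconds()
```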
  55. rs.status() 3) Heartbeat: "lastHeartbeat": ISODate("2010-12-02T01:09:38Z") (www.flickr.com/photos/drawblindfaith/3400981091/) The whole idea behind replica sets is that they automate failover in the event of failure somewhere. This is done by a regular heartbeat that all members send out to all other members. The status output shows you the last time that particular member was contacted from the current member. In the event of a network partition it may be that some members can't communicate with each other, and when there is an error you'll see it in this section too.
  56. mongostat. The mongostat tool is included as part of the standard MongoDB download and gives you a quick, real-time snapshot of the current state of your servers.
  57. mongostat 1) faults. Picture is unrelated! Snowmobile in Norway. The faults column shows you the number of Linux page faults per second. This is when Mongo accesses something that is mapped to the virtual address space but not in physical memory, i.e. it results in a read from disk. High values here indicate you may not have enough RAM to store all necessary data, and disk accesses may start to become the bottleneck.
  58. mongostat 2) locked (www.flickr.com/photos/bbusschots/4541573665/) The next column is locked, which shows the % of time spent in the global write lock. When this is happening, no other queries will complete until the lock is given up or the lock owner yields. This is indicative of a large, global operation like a remove() or dropping a collection, and can result in slow performance.
  59. mongostat 3) index miss (www.flickr.com/photos/gareandkitty/276471187/) Index miss is like what we saw in the server status output, except that instead of an aggregate total, you can see queries hitting (or missing) the index in real time. This is useful if you're debugging specific queries in development or need to track down a server that is performing badly.
  60. mongostat 4) queues. When MongoDB gets too many queries to handle in real time, it queues them up. This is represented in mongostat by the read and write queue columns. When these start to increase you will see slowdowns in executing queries, as they have to wait to run through the queue. You can alleviate this by stopping any more queries until the queue has dissipated. Queues will tend to spike if you're doing a lot of write operations alongside other write-heavy ops, such as large ranged removes. The second column is the active reads and writes.
  61. mongostat 5) Diagnostics. The last three columns show the total number of connections per server, the replica set they belong to and the status of that server. This is useful if you need to quickly see which server is the master in a replica set.
  62. Current operations: db.currentOp(); { "opid": "shard1:299939199", "active": true, "lockType": "write", "waitingForLock": false, "secs_running": 15419, "op": "remove", "ns": "sd.metrics", "query": { "accId": 1391, "tA": { "$lte": ISODate("2010-11-24T19:53:00Z") } }, "client": "10.121.12.228:44426", "desc": "conn" } (www.flickr.com/photos/jeffhester/2784666811/) The db.currentOp() function will give you a full list of every operation currently in progress. In this case there's a long-running remove which has been active for over 4 hours. You can see that it's targeted at shard 1 and the query is based on an account ID and a timestamp. It's part of our retention scripts to remove older metrics data. This is useful because you can track down long-running queries which might be hurting performance, and kill them off using the opid.
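Picking out the ops worth killing can be scripted from that output. A sketch: on a live system db.currentOp() returns a document with an "inprog" array, and each opid can be passed to db.killOp(); the helper and the one-hour threshold here are my own assumptions:

```python
def long_running_opids(inprog, threshold_secs=3600):
    """From the list of in-progress operations (the "inprog" array of
    db.currentOp()), return the opids of active ops that have been
    running longer than the threshold. Each opid could then be handed
    to db.killOp() if you decide the op should die."""
    return [op["opid"] for op in inprog
            if op.get("active") and op.get("secs_running", 0) > threshold_secs]
```

You would want a whitelist for known long jobs (like the retention remove above) before killing anything automatically.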
  63. Monitoring tools: Server Density
  64. Monitoring tools: www.mongomonitor.com
  65. Recap
  66. Recap: Keep it in RAM
  67. Recap: Keep it in RAM; Watch your storage
  68. Recap: Keep it in RAM; Watch your storage; db.serverStatus(); rs.status()
  69. David Mytton • @davidmytton • david@boxedice.com • www.mongomonitor.com • Woop Japan!
