Monitoring MongoDB (MongoSV)
Presentation by David Mytton about monitoring MongoDB at the MongoSV conference 3rd Dec 2010.

A full blog series covering everything in this presentation is at http://blog.boxedice.com/mongodb-monitoring/


  1. MongoDB Health Tips (How to spot a landslide)
  2. It’s a little different, but not entirely new.
  3. Keep it in RAM. Obviously.
     Image: www.flickr.com/photos/comedynose/4388430444/
  4. How do you know?
       > db.stats()
       {
           "collections" : 3,
           "objects" : 379970142,
           "avgObjSize" : 146.4554114991488,
           "dataSize" : 55648683504,        (51GB)
           "storageSize" : 61795435008,
           "numExtents" : 64,
           "indexes" : 1,
           "indexSize" : 21354514128,       (19GB)
           "fileSize" : 100816388096,
           "ok" : 1
       }
     Image: http://www.flickr.com/photos/comedynose/4388430444/
  5. Where should it go?
       What?      Should it be in memory?
       Indexes    Always
       Data       If you can
     Image: http://www.flickr.com/photos/comedynose/4388430444/
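The rule on slides 4 and 5 (indexes always in memory, data if you can) can be sketched as a quick check against the db.stats() figures. This is a rough illustration only: `memoryFit` is a hypothetical helper, not a MongoDB API, and the 32 GiB of RAM is an assumed value that does not come from the slides.

```javascript
// Sizes in bytes, copied from the db.stats() output on slide 4.
const stats = { dataSize: 55648683504, indexSize: 21354514128 };

// Hypothetical helper (not part of MongoDB): compare sizes against RAM.
function memoryFit(stats, ramBytes) {
  return {
    // Slide 5: indexes should always be in memory.
    indexesFit: stats.indexSize <= ramBytes,
    // Slide 5: data too, if you can.
    dataAndIndexesFit: stats.indexSize + stats.dataSize <= ramBytes,
  };
}

const GIB = 1024 ** 3;
const assumedRamBytes = 32 * GIB; // assumption: a server with 32 GiB of RAM

console.log(memoryFit(stats, assumedRamBytes));
```

With these numbers the indexes fit and the full data set does not. A real check would also leave headroom for the operating system and other processes.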
  6. How you’ll know: 1) Slow queries
       Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: { c: "android/setDeviceToken", a: 1466, u: "blah", ua: "Server Density Android" } 51926ms
     Image: www.flickr.com/photos/tonivc/2283676770/
  7. How you’ll know: 2) Timeouts
       cursor timed out (20000 ms)
  8. How you’ll know: 3) Disk i/o spikes
     Image: www.flickr.com/photos/daddo83/3406962115/
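Slow queries like the one on slide 6 can be picked out of the log by reading the trailing duration. The sketch below assumes the 2010-era log line format shown on that slide; `parseSlowOp` and the 1000 ms threshold are illustrative choices, not anything taken from the talk.

```javascript
// Log line copied from slide 6.
const line =
  'Thu Oct 14 17:01:11 [conn7410] update sd.apiLog query: { c: "android/setDeviceToken", a: 1466, u: "blah", ua: "Server Density Android" } 51926ms';

// Hypothetical parser: extracts connection, operation, namespace and
// duration from a line shaped like the one above. Returns null otherwise.
function parseSlowOp(logLine) {
  const m = /\[(conn\d+)\] (\w+) (\S+) .* (\d+)ms$/.exec(logLine);
  if (!m) return null;
  return { conn: m[1], op: m[2], ns: m[3], millis: Number(m[4]) };
}

const assumedThresholdMs = 1000; // assumption: what counts as "slow" here
const parsed = parseSlowOp(line);
if (parsed && parsed.millis >= assumedThresholdMs) {
  console.log(`slow ${parsed.op} on ${parsed.ns}: ${parsed.millis} ms`);
}
```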
  9. Watch your storage: 1) Pre-alloc
  10. Watch your storage: 2) Sharding maxSize
  11. Watch your storage: 3) Logging
       --quiet
       db.runCommand("logRotate");
       killall -SIGUSR1 mongod
  12. db.serverStatus()
  13. db.serverStatus(): 1) Used connections
     Image: www.flickr.com/photos/armchaircaver/2061231069/
  14. db.serverStatus(): 2) Available connections
  15. Log output:
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [mongosMain] Listener: accept() returns -1 errno:24 Too many open files
       Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1b") failed: No address associated with hostname
       Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1d") failed: No address associated with hostname
       Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs1c") failed: No address associated with hostname
       Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2b") failed: No address associated with hostname
       Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2d") failed: No address associated with hostname
       Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2c") failed: No address associated with hostname
       Fri Nov 19 17:24:32 [conn2335] getaddrinfo("rs2a") failed: No address associated with hostname
       Fri Nov 19 17:24:32 [conn2268] checkmaster: rs2b:27018 { setName: "set2", ismaster: false, secondary: true, hosts: [ "rs2b:27018", "rs2d:27018", "rs2c:27018", "rs2a:27018" ], arbiters:[ "rs2arbiter:27018" ], primary: "rs2a:27018", maxBsonObjectSize: 8388608, ok: 1.0 }
       MessagingPort say send() errno:9 Bad file descriptor (NONE)
       Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2d:27018 socket exception
       Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE)
       Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2c:27018 socket exception
       Fri Nov 19 17:24:32 [conn2268] MessagingPort say send() errno:9 Bad file descriptor (NONE)
       Fri Nov 19 17:24:32 [conn2268] checkmaster: caught exception rs2a:27018 socket exception
       Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1a") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1b") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1d") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2330] getaddrinfo("rs1c") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2b") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2d") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2c") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2327] getaddrinfo("rs2a") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2b") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2d") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2c") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2126] getaddrinfo("rs2a") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1b") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1d") failed: No address associated with hostname
       Fri Nov 19 17:24:33 [conn2343] getaddrinfo("rs1c") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1b") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1d") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs1c") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2b") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2d") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2c") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2332] getaddrinfo("rs2a") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2d") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2c") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2343] getaddrinfo("rs2a") failed: No address associated with hostname
       Fri Nov 19 17:24:34 [conn2343] trying reconnect to rs2d:27018
  16. connPoolStats
       > db.runCommand("connPoolStats")
       {
           "hosts" : {
               "config1:27019" : {
                   "available" : 2,
                   "created" : 6
               },
               "set1/rs1a:27018,rs1b:27018" : {
                   "available" : 1,
                   "created" : 249
               },
               ...
           },
           "totalAvailable" : 5,
           "totalCreated" : 1002,
           "numDBClientConnection" : 3490,
           "numAScopedConnection" : 3,
       }
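Slides 13 to 16 are about catching connection exhaustion before the "Too many open files" errors on slide 15 appear. One way to watch for it is to turn the `connections` section of db.serverStatus() into a usage fraction. In this sketch the `current` and `available` field names follow the serverStatus output, while the sample numbers, the `connectionUsage` helper and the 80% alert level are assumptions for illustration.

```javascript
// Hypothetical helper: fraction of the connection limit currently in use,
// given the "connections" section of db.serverStatus().
function connectionUsage(connections) {
  const total = connections.current + connections.available;
  return total === 0 ? 0 : connections.current / total;
}

// Made-up sample values, not taken from the slides.
const sampleConnections = { current: 750, available: 250 };
const assumedAlertLevel = 0.8; // assumption: alert above 80% usage

const usage = connectionUsage(sampleConnections);
console.log(`connections in use: ${(usage * 100).toFixed(1)}%`);
if (usage > assumedAlertLevel) {
  console.log("warning: close to the connection limit");
}
```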
  17. db.serverStatus(): 3) Index counters
       "indexCounters" : {
           "btree" : {
               "accesses" : 15180175,
               "hits" : 15178725,
               "misses" : 1450,
               "resets" : 0,
               "missRatio" : 0.00009551932
           }
       },
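The `missRatio` on slide 17 is the number of misses divided by the number of accesses. A small sketch using the slide's own figures; `missRatio` here is a hypothetical helper that recomputes the value rather than reading it from the server.

```javascript
// Counter values copied from the indexCounters output on slide 17.
const btree = { accesses: 15180175, hits: 15178725, misses: 1450 };

// Hypothetical helper: recompute the miss ratio from the raw counters.
function missRatio(counters) {
  return counters.accesses === 0 ? 0 : counters.misses / counters.accesses;
}

console.log(missRatio(btree)); // close to the 0.00009551932 shown on the slide
```

A ratio that climbs over time suggests the indexes no longer fit in memory.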
  18. db.serverStatus(): 4) Op counters
     Image: www.flickr.com/photos/cosmic_bandita/2395369614/
  19. db.serverStatus(): 5) Background flushing
     Picture is unrelated! Mmm, ice cream.
  20. rs.status()
       {
           "_id" : 1,
           "name" : "rs3b:27018",
           "health" : 1,
           "state" : 2,
           "stateStr" : "SECONDARY",
           "uptime" : 1886098,
           "optime" : {
               "t" : 1291252178000,
               "i" : 13
           },
           "optimeDate" : ISODate("2010-12-02T01:09:38Z"),
           "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z")
       },
     Image: www.ex-astris-scientia.org/inconsistencies/ent_vs_tng.htm (yes it’s a replicator from Star Trek)
  21. rs.status(): 1) myState
       Value    Meaning
       0        Starting up (phase 1)
       1        Primary
       2        Secondary
       3        Recovering
       4        Fatal error
       5        Starting up (phase 2)
       6        Unknown state
       7        Arbiter
       8        Down
     Image: en.wikipedia.org/wiki/State_of_matter
  22. rs.status(): 2) Optime
       "optimeDate" : ISODate("2010-12-02T01:09:38Z")
     Image: www.flickr.com/photos/robbie73/4244846566/
  23. rs.status(): 3) Heartbeat
       "lastHeartbeat" : ISODate("2010-12-02T01:09:38Z")
     Image: www.flickr.com/photos/drawblindfaith/3400981091/
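The state table on slide 21 and the `optimeDate` on slide 22 can be combined into a simple member check: translate the numeric state and compare the member's optime with the primary's to estimate replication lag. In this sketch the secondary's values come from slide 20, while the primary's optime and the `lagSeconds` helper are assumptions for illustration.

```javascript
// State codes and meanings copied from the table on slide 21.
const STATE_MEANINGS = {
  0: "Starting up (phase 1)",
  1: "Primary",
  2: "Secondary",
  3: "Recovering",
  4: "Fatal error",
  5: "Starting up (phase 2)",
  6: "Unknown state",
  7: "Arbiter",
  8: "Down",
};

// Hypothetical helper: seconds by which a member trails the primary.
function lagSeconds(primaryOptimeDate, memberOptimeDate) {
  return (primaryOptimeDate.getTime() - memberOptimeDate.getTime()) / 1000;
}

// Member values from slide 20.
const secondary = {
  name: "rs3b:27018",
  state: 2,
  optimeDate: new Date("2010-12-02T01:09:38Z"),
};
// Assumption: the primary's latest optime, not shown on the slides.
const assumedPrimaryOptime = new Date("2010-12-02T01:09:41Z");

console.log(
  `${secondary.name}: ${STATE_MEANINGS[secondary.state]}, ` +
    `lag ${lagSeconds(assumedPrimaryOptime, secondary.optimeDate)}s`
);
```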
  24. mongostat
  25. mongostat: 1) faults
     Image: www.flickr.com/photos/mshades/294203396/
  26. mongostat: 2) locked
     Image: www.flickr.com/photos/bbusschots/4541573665/
  27. mongostat: 3) index miss
     Image: www.flickr.com/photos/gareandkitty/276471187/
  28. mongostat: 4) queues
  29. mongostat: 5) Diagnostics
  30. Current operations
       db.currentOp();
       {
           "opid" : "shard1:299939199",
           "active" : true,
           "lockType" : "write",
           "waitingForLock" : false,
           "secs_running" : 15419,
           "op" : "remove",
           "ns" : "sd.metrics",
           "query" : {
               "accId" : 1391,
               "tA" : {
                   "$lte" : ISODate("2010-11-24T19:53:00Z")
               }
           },
           "client" : "10.121.12.228:44426",
           "desc" : "conn"
       },
     Image: www.flickr.com/photos/jeffhester/2784666811/
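The operation on slide 30 has been running for 15419 seconds. A filter over the `inprog` array returned by db.currentOp() makes such operations easy to spot. The sketch below runs on a plain object shaped like that output: the first entry uses fields from the slide, the second entry is made up, and `longRunningOps` and the 60 second cutoff are assumptions.

```javascript
// Hypothetical helper: active operations running for at least minSeconds,
// given a document shaped like the result of db.currentOp().
function longRunningOps(currentOp, minSeconds) {
  return currentOp.inprog.filter(
    (op) => op.active && op.secs_running >= minSeconds
  );
}

const sampleCurrentOp = {
  inprog: [
    // Fields copied from slide 30.
    { opid: "shard1:299939199", active: true, secs_running: 15419, op: "remove", ns: "sd.metrics" },
    // Made-up short operation for contrast.
    { opid: "shard1:1", active: true, secs_running: 0, op: "query", ns: "sd.metrics" },
  ],
};

const assumedCutoffSeconds = 60; // assumption: what counts as long-running
for (const op of longRunningOps(sampleCurrentOp, assumedCutoffSeconds)) {
  console.log(`${op.opid}: ${op.op} on ${op.ns} for ${op.secs_running}s`);
}
```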
  31. Monitoring tools: Run yourself: Ganglia
  32. Monitoring tools: Hosted monitoring
  33. Monitoring tools: Server Density
  34. Monitoring tools: www.mongomonitor.com
  35. Recap
  36. Recap: Keep it in RAM
  37. Recap: Keep it in RAM; Watch your storage
  38. Recap: Keep it in RAM; Watch your storage; db.serverStatus(); rs.status()
  39. mongostat
  40. David Mytton, @davidmytton, david@boxedice.com, www.mongomonitor.com. Woop Japan!