Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

How to monitor MongoDB

40,184 views

Published on

David Mytton is a MongoDB master and the founder of Server Density. In this presentation David delves deeper into what's discussed in our how to monitor MongoDB tutorial (https://blog.serverdensity.com/monitor-mongodb/), with the aim of taking you through:

Key MongoDB metrics to monitor.
Non-critical MongoDB metrics to monitor.
Alerts to set for MongoDB on production.
Tools for monitoring MongoDB.

Published in: Technology

How to monitor MongoDB

  1. 1. How to monitor: MongoDB David Mytton Hangout on Air - Sept 2014 https://blog.serverdensity.com/monitor-mongodb/
  2. 2. David Mytton
  3. 3. Server Density Architecture
  4. 4. Server Density Architecture ● ~100 servers - Ubuntu 12.04
  5. 5. Server Density Architecture ● ~100 servers - Ubuntu 12.04 ● 50:50 virtual/dedicated
  6. 6. Server Density Architecture ● ~100 servers - Ubuntu 12.04 ● 50:50 virtual/dedicated ● 200TB/m processed data
  7. 7. Server Density Architecture ● ~100 servers - Ubuntu 12.04 ● 50:50 virtual/dedicated ● 200TB/m processed data ● Nginx, Python, MongoDB
  8. 8. Server Density Architecture ● ~100 servers - Ubuntu 12.04 ● 50:50 virtual/dedicated ● 200TB/m processed data ● Nginx, Python, MongoDB ● Softlayer > 1TB RAM, 5TB SSDs
  9. 9. Key metrics ● Oplog replication lag ● Replica state ● Lock % ● Disk i/o % utilization
  10. 10. Oplog replication lag ● Replica sets: master/slave
  11. 11. Oplog replication lag ● Replica sets: master/slave ● Async i.e. eventually consistent
  12. 12. Oplog replication lag ● Replica sets: master/slave ● Async i.e. eventually consistent ● Write concern
  13. 13. Oplog replication lag
  14. 14. Oplog replication lag https://blog.serverdensity.com/mongodb-benchmarks/
  15. 15. Oplog replication lag ● Replica sets: master/slave ● Async i.e. eventually consistent ● Write concern ● Falling behind
  16. 16. Reasons for repl falling behind ● Network problems
  17. 17. Reasons for repl falling behind ● Network problems ● Hardware problems
  18. 18. Reasons for repl falling behind ● Network problems ● Hardware problems ● Shard chunk migrations
  19. 19. Reasons for repl falling behind ● Network problems ● Hardware problems ● Shard chunk migrations ● MongoDB bugs
  20. 20. Replica state ● Primary / secondary
  21. 21. Replica state ● Primary / secondary ● Alert on state change
  22. 22. Lock % ● Database locking (2.6)
  23. 23. Lock % ● Database locking (2.6) ● Sometimes a problem:
  24. 24. Lock % ● Database locking (2.6) ● Sometimes a problem: ● Nearing 100%
  25. 25. Lock % ● Database locking (2.6) ● Sometimes a problem: ● Nearing 100% ● Constantly high
  26. 26. Lock % ● Database locking (2.6) ● Sometimes a problem: ● Nearing 100% ● Constantly high ● Slows replication
  27. 27. Disk i/o % utilization ● Hardware limits
  28. 28. Disk i/o % utilization ● Hardware limits ● Nearing 100%
  29. 29. Disk i/o % utilization ● Hardware limits ● Nearing 100% ● Constantly high
  30. 30. Disk i/o % utilization ● Hardware limits ● Nearing 100% ● Constantly high ● Spinning -> SSD
  31. 31. Disk i/o % utilization https://blog.serverdensity.com/mongodb-performance-ssds-vs-spindle-sas-drives/
  32. 32. Disk i/o % utilization https://blog.serverdensity.com/mongodb-benchmarks/
  33. 33. Disk i/o % utilization ● Hardware limits ● Nearing 100% ● Constantly high ● Spinning -> SSD ● Slow queries, hangs, slow repl
  34. 34. Non-critical metrics to watch ● Memory usage
  35. 35. Non-critical metrics to watch ● Memory usage ● Page faults
  36. 36. Non-critical metrics to watch ● Memory usage ● Page faults ● Connections
  37. 37. Non-critical metrics to watch ● Memory usage ● Page faults ● Connections ● Shard chunk distribution
  38. 38. Non-critical metrics to watch
  39. 39. Monitoring tools ● mongostat ● mongotop ● rs.status() ● sh.status()
  40. 40. rs.status()
  41. 41. sh.status()
  42. 42. Server Density
  43. 43. MMS
  44. 44. Summary ● Critical alerts on key metrics
  45. 45. Key metrics ● Oplog replication lag ● Replica state ● Lock % ● Disk i/o % utilization
  46. 46. Summary ● Critical alerts on key metrics
  47. 47. Summary ● Critical alerts on key metrics ● Watch non-critical
  48. 48. Summary ● Critical alerts on key metrics ● Watch non-critical ● Manual tools for real time
  49. 49. Summary ● Critical alerts on key metrics ● Watch non-critical ● Manual tools for real time ● Set up a monitoring product
  50. 50. Useful resources ● http://docs.mongodb.org/manual/administration/monitoring/ ● https://blog.serverdensity.com/monitor-mongodb ● https://blog.serverdensity.com
  51. 51. どもありがとうございます @davidmytton david@serverdensity.com blog.serverdensity.com

×