MongoDB Operational Best Practices (mongosf2012)
Learn about MongoDB best practices through examples from the field.

Transcript

  • 1. Operational Best Practices: Tales from the field
  • 2. The Plan
      ● Review support cases
          ○ Taken from real issues
          ○ Names/IPs/dates changed to protect identities
      ● Analyze reported issues
      ● Distill best practices
      ● Summarize takeaways
      ● Repeat...
  • 3. Scenario 1
      ● Fire, it is on fire!
      ● Users notice response times of 1-3 sec
      ● App logs show timeouts
      ● Server logs show socket exceptions
  • 4. Scenario 1 - Diagnostics
      ● Logs
      ● Understanding the timeouts
          ○ Client read timeout set
          ○ Connection closed/discarded
          ○ Symptom, not cause
      ● Server connection exceptions
          ○ Match timing of client timeouts
          ○ Symptom, not cause
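Matching the timing of server exceptions to client timeouts can be done mechanically once both logs are reduced to timestamps. The following is a minimal sketch in Python, not the tooling used in the support case: the function name `correlate`, the 2-second window, and the sample timestamps are all assumptions made for illustration.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def correlate(client_timeouts, server_exceptions, window=timedelta(seconds=2)):
    """Pair each client timeout with the server exceptions within `window` of it.

    Both arguments are lists of datetime objects; server_exceptions must be sorted.
    """
    pairs = []
    for t in client_timeouts:
        # Index of the first server event at or after (t - window).
        i = bisect_left(server_exceptions, t - window)
        matches = []
        while i < len(server_exceptions) and server_exceptions[i] <= t + window:
            matches.append(server_exceptions[i])
            i += 1
        pairs.append((t, matches))
    return pairs


# Hypothetical timestamps, not taken from the real logs.
base = datetime(2012, 1, 1, 12, 0, 0)
client = [base + timedelta(seconds=5), base + timedelta(seconds=60)]
server = [base + timedelta(seconds=4), base + timedelta(seconds=6),
          base + timedelta(seconds=30)]
result = correlate(client, server)
```

A timeout with no nearby server exception (the second one here) is worth a separate look, since it may point at the network or the client rather than the server.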
  • 5. Scenario 1 - Monitoring
      Graphs speak a thousand words
  • 6. Scenario 1 - Takeaways
      ● Monitor logs
          ○ Alert, escalate
          ○ Correlate
      ● Disk
          ○ Monitor
          ○ Moved to RAID (10)
      ● Instrument/monitor the app
      ● Know your application and its (write) characteristics
  • 7. Scenario 2
      ● Alerts warn that server is running hot
      ● Random (small) slowdowns
      ● Increased traffic/queries
  • 8. Scenario 2 - Symptoms
      High CPU use
      Similar query pattern
  • 9. Scenario 2 - Diagnostics
      ● Turn on DB profiling
      ● Look at logs
      Identify the query patterns that take longest or occur most frequently, and run explain on them
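Finding the patterns that take longest or occur most often is a grouping exercise over profiler entries. The sketch below is in Python and works on plain dictionaries. The keys `ns`, `query`, and `millis` are assumed to mirror what the database profiler records, and the exact layout varies by server version. The namespaces and field names in the sample data are hypothetical.

```python
from collections import defaultdict


def summarize_profile(entries):
    """Group entries by (namespace, sorted query field names) and rank by total time."""
    stats = defaultdict(lambda: {"count": 0, "total_millis": 0, "max_millis": 0})
    for e in entries:
        # Two queries share a "shape" if they filter on the same fields.
        shape = (e["ns"], tuple(sorted(e.get("query", {}).keys())))
        s = stats[shape]
        s["count"] += 1
        s["total_millis"] += e["millis"]
        s["max_millis"] = max(s["max_millis"], e["millis"])
    # Most expensive patterns first, by total time spent.
    return sorted(stats.items(), key=lambda kv: kv[1]["total_millis"], reverse=True)


# Hypothetical profiler-style entries.
entries = [
    {"ns": "app.scenario2", "query": {"user": 1, "status": "a"}, "millis": 99},
    {"ns": "app.scenario2", "query": {"status": "b", "user": 2}, "millis": 120},
    {"ns": "app.users", "query": {"_id": 7}, "millis": 1},
]
ranked = summarize_profile(entries)
```

The top-ranked shapes are the candidates to run explain on.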
  • 10. Scenario 2 - Explain
      db.scenario2.find({...}).sort({...}).explain()
      {
        "cursor" : "BtreeCursor ABC",
        "nscanned" : 160677,
        "nscannedObjects" : 12015,
        "n" : 55,
        "millis" : 99,
        "scanAndOrder" : true,
        "indexBounds" : {...}
      }
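The explain output above has two red flags: `scanAndOrder` is true, and 160677 index entries were scanned to return 55 documents. A rough Python sketch of checking for these automatically follows. The function name `explain_warnings` and the ratio threshold of 10 are assumptions chosen for illustration, and the field names are those of the legacy explain format shown on the slide.

```python
def explain_warnings(plan, max_scan_ratio=10):
    """Return human-readable warnings for a legacy-format explain document."""
    warnings = []
    n = plan.get("n", 0)
    nscanned = plan.get("nscanned", 0)
    if plan.get("scanAndOrder"):
        warnings.append("scanAndOrder: results were sorted in memory, not via index order")
    # The threshold is arbitrary; tune it for your workload.
    if nscanned > max_scan_ratio * max(n, 1):
        warnings.append("scanned %d index entries to return %d documents" % (nscanned, n))
    if plan.get("cursor") == "BasicCursor":
        warnings.append("BasicCursor: no index used (collection scan)")
    return warnings


# Values copied from the slide; the elided indexBounds are left out.
plan = {
    "cursor": "BtreeCursor ABC",
    "nscanned": 160677,
    "nscannedObjects": 12015,
    "n": 55,
    "millis": 99,
    "scanAndOrder": True,
}
warnings = explain_warnings(plan)
```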
  • 11. Scenario 2 - Diagnostics
      ● Create a compound index
          ○ Used for criteria and sort
          ○ Reduced CPU dramatically
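The slides do not show which fields the compound index covered, so the following Python sketch uses hypothetical field names (`account_id`, `status`, `created_at`). It builds a key specification that covers both the query criteria and the sort. Placing the equality fields ahead of the sort fields is a common guideline and an assumption here, not something the slides state.

```python
ASCENDING, DESCENDING = 1, -1


def compound_index_keys(equality_fields, sort_spec):
    """Build an ordered list of (field, direction) pairs: criteria first, then sort."""
    keys = [(f, ASCENDING) for f in equality_fields]
    seen = set(equality_fields)
    for field, direction in sort_spec:
        # Skip sort fields that are already covered by the criteria.
        if field not in seen:
            keys.append((field, direction))
            seen.add(field)
    return keys


# Hypothetical fields; the real query and sort are elided in the slides.
keys = compound_index_keys(["account_id", "status"], [("created_at", DESCENDING)])
```

A list of (field, direction) pairs like this is the form that PyMongo's `create_index` accepts.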
  • 12. Scenario 2 - Takeaways
      ● Performance test/analyze system behavior
      ● Load test before deployment
      ● Alert on abnormal states
      ● High CPU is a sign of poorly indexed queries
      ● Rolling upgrade for indexes
  • 13. Scenario 3
      ● General slowdown on login
      ● High disk utilization
  • 14. Scenario 3 - Diagnostics
      iostat
      Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s avgrq-sz avgqu-sz    await   svctm  %util
      sdp       0.00   0.00 0.50 0.00  27.86   0.00    56.00   149.58 20320.00 2010.00 100.00
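The iostat row above shows a saturated device: 100% utilization and an await of over 20 seconds. A small Python sketch of turning such a row into alerts follows. The column list matches the header on the slide, while the 90% and 100 ms thresholds and the function names are assumptions for illustration.

```python
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]


def parse_iostat_line(line):
    """Split one extended iostat device row into (device, {column: value})."""
    parts = line.split()
    device, values = parts[0], [float(v) for v in parts[1:]]
    if len(values) != len(COLUMNS):
        raise ValueError("expected %d columns, got %d" % (len(COLUMNS), len(values)))
    return device, dict(zip(COLUMNS, values))


def disk_alerts(stats, util_limit=90.0, await_limit_ms=100.0):
    """Return alert strings when utilization or await exceed the (assumed) limits."""
    alerts = []
    if stats["%util"] >= util_limit:
        alerts.append("utilization %.0f%%" % stats["%util"])
    if stats["await"] >= await_limit_ms:
        alerts.append("await %.0f ms" % stats["await"])
    return alerts


# Device row copied from the slide.
line = "sdp 0.00 0.00 0.50 0.00 27.86 0.00 56.00 149.58 20320.00 2010.00 100.00"
device, stats = parse_iostat_line(line)
alerts = disk_alerts(stats)
```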
  • 15. Scenario 3
      $ blockdev --report
      RO    RA   SSZ   BSZ   StartSec            Size   Device
      rw  8096   512  4048          0   1099494850560   /dev/sdp
      Huge read-ahead of 4MB
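The 4MB figure follows from the RA column, which blockdev reports in 512-byte sectors. The arithmetic is sketched below in Python. The 4 KiB "typical read" used for the amplification estimate is a hypothetical value, not one from the slides.

```python
SECTOR_BYTES = 512  # blockdev reports read-ahead (RA) in 512-byte sectors


def readahead_bytes(ra_sectors):
    """Convert a blockdev RA value (in sectors) to bytes."""
    return ra_sectors * SECTOR_BYTES


ra = readahead_bytes(8096)    # RA value from the slide
ra_mib = ra / (1024 * 1024)   # 3.953125 MiB, i.e. roughly 4 MB

# Hypothetical: if the application needs only 4 KiB per random read,
# each read may pull in about this many times more data than needed.
amplification = ra // 4096
```

With random access patterns, most of that extra data is unlikely to be used, which is consistent with the disk saturation seen in iostat.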
  • 16. Scenario 3 - Takeaways
      ● Pay attention to disk configurations
      ● Load testing would have found this early
      ● MongoDB depends on the OS a lot
      ● Connect the dots from disproportionate effects
  • 17. Best Practices Learned
      ● System provisioning
          ○ Capacity
          ○ Performance
          ○ Scale
          ○ Configuration
      ● Logs
          ○ Review
          ○ Alert
          ○ Rotate and collect (per cluster)
  • 18. Best Practices Learned
      ● Query/Index Analysis
          ○ Database Profiler
          ○ Run explain periodically (sampled)
          ○ Instrument code, generate metrics
      ● Plan/test rollouts
          ○ Rolling upgrade for Replica Set
          ○ Generate indexes on secondaries first
          ○ Name services, use redirection
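The "secondaries first" rule for index builds can be written down as an ordering over replica set members. The Python sketch below only plans the order; it does not talk to a database. The host names, the member dictionaries, and the step labels are hypothetical, and the step-down of the primary is an assumption about how the rolling procedure finishes.

```python
def rolling_order(members):
    """Plan a rolling index build: every secondary first, the primary last.

    `members` is a list of dicts with 'host' and 'state' ('PRIMARY' or 'SECONDARY').
    """
    secondaries = [m["host"] for m in members if m["state"] == "SECONDARY"]
    primaries = [m["host"] for m in members if m["state"] == "PRIMARY"]
    steps = [("build index on secondary", h) for h in secondaries]
    for h in primaries:
        # The primary goes last, after handing off its role.
        steps.append(("step down, then build index", h))
    return steps


# Hypothetical three-member replica set.
members = [
    {"host": "db1.example.net", "state": "PRIMARY"},
    {"host": "db2.example.net", "state": "SECONDARY"},
    {"host": "db3.example.net", "state": "SECONDARY"},
]
steps = rolling_order(members)
```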
  • 19. Thanks, more refs
      Please take a look at http://mongodb.org (docs)
      ● Ask on mongodb-user group
      ● Use MMS or historic monitoring
          ○ Watch for trends
          ○ Create alerts
          ○ Forecast capacity for provisioning
      ● logrotate unix command
      ● Monitor disk - munin or the like
      ● iostat, dstat, vmstat, free, netstat
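Forecasting capacity from historic monitoring data can start with a straight-line fit. The Python sketch below is one simple way to do that and is not a feature of MMS. The function names, the daily disk usage samples, and the 200 GB capacity are all hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days, used_gb, capacity_gb):
    """Days after the last sample until the fitted trend reaches capacity."""
    slope, intercept = linear_fit(days, used_gb)
    if slope <= 0:
        return None  # usage is flat or shrinking; no forecast
    return (capacity_gb - intercept) / slope - days[-1]


# Hypothetical daily samples: disk usage grows by 10 GB per day.
days = [0, 1, 2, 3, 4]
used_gb = [100, 110, 120, 130, 140]
remaining = days_until_full(days, used_gb, capacity_gb=200)
```

Real usage is rarely this linear, so treat the result as a prompt to look at the trend graph rather than as a deadline.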
  • 20. Questions