Webinar - Approaching 1 billion documents with MongoDB


Presentation given via webinar on 5th May 2010 by David Mytton on approaching 1 billion documents in MongoDB.

Audio recording at http://bit.ly/mongo1bndocs

  • You say to keep indexes in memory. You have a 241GB index size, but I don't see any slide showing a system, or even a shard, with half that amount of memory.

Transcript

  • 1. Approaching 1 Billion Documents in MongoDB. David Mytton, david@boxedice.com / www.mytton.net
  • 2. Server Density: Monitoring, Processing, Database, UI. www.serverdensity.com
  • 3. db.stats(): Documents 981,289,332; Collections 47,962; Indexes 39,684; Data size 369GB; Index size 241GB. As of 25th Apr 2010.
  • 4. 10 months. Why we moved: http://bit.ly/mysqltomongo
  • 5. Initial Setup: replication from Master (DC1, 8GB RAM) to Slave (DC2, 8GB RAM).
  • 6. Vertical Scaling: replication from Master (DC1, 72GB RAM) to Slave (DC2, 8GB RAM).
  • 7. Tip #1: Keep your indexes in memory at all times. Check with db.stats().
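Tip #1 comes down to comparing the index size reported by db.stats() with the RAM on the server. Below is a minimal sketch in plain JavaScript (the language of the mongo shell), assuming a stats object with an `indexSize` field in bytes, as db.stats() returns by default. The function name `indexesFitInMemory`, the `headroom` parameter and the 10GB partition are hypothetical; the 241GB and 72GB figures come from slides 3 and 6.

```javascript
// Hypothetical helper: does the reported index size fit in RAM?
// `stats` is shaped like the output of db.stats() (sizes in bytes).
// `headroom` is an assumed fraction of RAM left free for the OS and data.
function indexesFitInMemory(stats, ramBytes, headroom) {
  if (headroom === undefined) headroom = 0.2;
  var usable = ramBytes * (1 - headroom);
  return stats.indexSize <= usable;
}

var GB = 1024 * 1024 * 1024;

// Figures from the slides: 241GB of indexes in total, 72GB RAM on the master.
var totalStats = { indexSize: 241 * GB };
var fitsOnOneBox = indexesFitInMemory(totalStats, 72 * GB);

// A hypothetical single partition: 10GB of indexes on a 16GB server.
var partitionStats = { indexSize: 10 * GB };
var fitsOnPartition = indexesFitInMemory(partitionStats, 16 * GB);

console.log(fitsOnOneBox, fitsOnPartition); // false true
```

In the mongo shell the same function could be fed `db.stats()` directly; the headroom value is a judgment call rather than a MongoDB setting.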
  • 8. Manual Partitioning: two replicated sets. Master A (DC1, 16GB RAM) replicates to Slave A (DC2, 16GB RAM); Master B (DC1, 16GB RAM) replicates to Slave B (DC2, 16GB RAM).
  • 9. Database vs collections. Many databases = many data files (small, but they quickly get large). Many collections = watch the namespace limit.
  • 10. Namespaces = number of collections + number of indexes.
  • 11. Tip #2: Monitor the 24,000 namespace limit.
  • 12. Using Server Density
  • 13. Console: db.system.namespaces.count()
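The namespace arithmetic from slides 10 and 11 can be sketched as a small check. This is an illustration in plain JavaScript, assuming a stats object with `collections` and `indexes` counts as db.stats() reports them. The helper name `namespaceUsage`, the 80% warning threshold and the sample counts are hypothetical; the 24,000 limit is the figure quoted on the slides and applies per database.

```javascript
// Namespaces = number of collections + number of indexes (slide 10).
// 24,000 is the per-database limit quoted on the slides.
var NAMESPACE_LIMIT = 24000;

// Hypothetical helper: summarise namespace usage for one database.
// `stats` is shaped like db.stats() output (`collections`, `indexes`).
function namespaceUsage(stats, limit, warnAt) {
  if (limit === undefined) limit = NAMESPACE_LIMIT;
  if (warnAt === undefined) warnAt = 0.8; // assumed alert threshold
  var used = stats.collections + stats.indexes;
  return {
    used: used,
    remaining: limit - used,
    warn: used >= limit * warnAt
  };
}

// Hypothetical database with 9,000 collections and 11,000 indexes.
var usage = namespaceUsage({ collections: 9000, indexes: 11000 });
console.log(usage.used, usage.remaining, usage.warn); // 20000 4000 true
```

The count from db.system.namespaces.count() is the direct measurement; the sum above is the approximation the slides describe.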
  • 14. Replica Pairs = Failover. Replica Pair A: Master A (DC1, 16GB RAM) and Slave A (DC2, 16GB RAM). Replica Pair B: Master B (DC1, 16GB RAM) and Slave B (DC2, 16GB RAM).
  • 15. Tip #3: Pre-provision your oplog files.
  • 16. A shell script to generate 75GB oplog files: for i in {0..40}; do echo $i; head -c 2146435072 /dev/zero > local.$i; done
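The sizes in the oplog script are easy to sanity-check with a little arithmetic. The sketch below uses the values from the slide's loop: each file is 2146435072 bytes (2047 MiB), and `{0..40}` is inclusive at both ends, so it produces 41 files, which works out to roughly 82 GiB rather than 75GB. Which figure was intended is not stated in the slides. The helper names `totalBytes` and `filesNeeded` are hypothetical.

```javascript
// Values taken from the slide's shell loop.
var FILE_BYTES = 2146435072; // size passed to `head -c` (2047 MiB)
var GIB = 1024 * 1024 * 1024;

// Total bytes written by a loop over {first..last} (bounds inclusive).
function totalBytes(first, last, fileBytes) {
  return (last - first + 1) * fileBytes;
}

// Smallest number of files whose combined size reaches targetBytes.
function filesNeeded(targetBytes, fileBytes) {
  return Math.ceil(targetBytes / fileBytes);
}

var asWritten = totalBytes(0, 40, FILE_BYTES);        // 41 files
var asWrittenGiB = asWritten / GIB;
var filesFor75GiB = filesNeeded(75 * GIB, FILE_BYTES);

console.log(asWritten, asWrittenGiB.toFixed(2), filesFor75GiB);
// 88003837952 81.96 38
```

Under these assumptions, a 75 GiB target would need 38 files, that is a loop over `{0..37}`.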
  • 17. Tip #4: Expect slower performance during initial replica sync.
  • 18. Tip #5: You can rotate your log files from the console.
  • 19. Rotating your log files: db.runCommand("logRotate")
  • 20. Tip #6: Index creation blocks by default. Use background indexing if necessary. MongoDB Manual: http://bit.ly/mongobgindex
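As an illustration of Tip #6, the sketch below builds a command document that requests a background index build. It is not taken from the slides: at the time of the talk the usual shell call was `db.collection.ensureIndex(keys, { background: true })`, while the `createIndexes` command used here arrived in later MongoDB releases, and from MongoDB 4.2 onward the `background` option is ignored. The helper names and the collection and field names (`metrics`, `accountId`, `ts`) are hypothetical.

```javascript
// Default index name convention: each field and its direction joined by "_",
// e.g. { accountId: 1, ts: -1 } -> "accountId_1_ts_-1".
function defaultIndexName(keys) {
  return Object.keys(keys).map(function (field) {
    return field + "_" + keys[field];
  }).join("_");
}

// Hypothetical helper: build a createIndexes command document that asks
// for a background build instead of the default blocking build.
function backgroundIndexCommand(collection, keys) {
  return {
    createIndexes: collection,
    indexes: [{ key: keys, name: defaultIndexName(keys), background: true }]
  };
}

// "metrics", "accountId" and "ts" are made-up names for illustration.
var cmd = backgroundIndexCommand("metrics", { accountId: 1, ts: -1 });
console.log(JSON.stringify(cmd));
// In the mongo shell this document could be passed to db.runCommand(cmd).
```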
  • 21. Tip #7: Increase your OS file descriptor limit + use persistent connections.
  • 22. Too many open files! In /etc/security/limits.conf (columns: user, type, limit): mongo hard nofile 10000; mongo soft nofile 10000. In /etc/ssh/sshd_config: UsePAM yes
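A quick way to confirm that both limits.conf entries from slide 22 are in place is to parse the file and look for a hard and a soft `nofile` line for the user. The sketch below does this on an inline string rather than reading /etc/security/limits.conf, and assumes the usual four-column layout (domain, type, item, value). The function names are hypothetical, and wildcard domains, the `-` type and `unlimited` values are deliberately not handled.

```javascript
// Parse limits.conf-style lines: <domain> <type> <item> <value>.
// Blank lines and lines starting with "#" are skipped.
function parseLimits(text) {
  return text.split("\n")
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line !== "" && line.charAt(0) !== "#"; })
    .map(function (line) {
      var parts = line.split(/\s+/);
      return { domain: parts[0], type: parts[1], item: parts[2], value: parts[3] };
    });
}

// Hypothetical check: does `user` have both hard and soft nofile limits
// of at least `minimum`? Non-numeric values such as "unlimited" count as
// failing in this sketch.
function hasNofileLimits(entries, user, minimum) {
  function ok(type) {
    return entries.some(function (e) {
      return e.domain === user && e.type === type &&
             e.item === "nofile" && Number(e.value) >= minimum;
    });
  }
  return ok("hard") && ok("soft");
}

// The two entries from the slide, with a comment line added.
var conf = "# /etc/security/limits.conf\n" +
           "mongo hard nofile 10000\n" +
           "mongo soft nofile 10000\n";
var entries = parseLimits(conf);
console.log(hasNofileLimits(entries, "mongo", 10000)); // true
```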
  • 23. Space is not reused.
  • 24. Tip #8: 10gen commercial support is worth paying for.
  • 25. Summary: 1. Keep indexes in memory. 2. Monitor the 24k namespace limit. 3. Pre-provision oplog files. 4. Expect slower performance on replica sync. 5. Rotate logs from the console. 6. Index creation blocks by default. 7. OS file descriptor limit + persistent connections. 8. Commercial support is worth it.