Keeping the Lights On with MongoDB

A presentation by Tony Tam at the MongoSV conference in Silicon Valley, hosted by MongoDB creator 10gen.

Published in: Technology
Transcript of "Keeping the Lights On with MongoDB"

  1. Mongo SV
     - Keeping the lights on with MongoDB
     - Tony Tam
     - 12/3/2010
  2. Presentation Overview
     - Data >>> code
     - Treat it appropriately
     - Manage and maintain Mongo
     - Mongo is young (and robust!)
     - Performance and Features
     - The right hooks exist
  3. Who is Wordnik
     - Wordnik is:
     - The world’s largest English Language reference
     - ~10M words!
     - Mapping every word, based on real data
     - (free) API to add word information, everywhere
  4. Wordnik’s MongoDB Deployment
     - Over 12 Months with Mongo
     - Corpus/UGC/Structured Data/Statistics
     - Master/Slave
     - ~3TB data
     - ~12B records
     - We love Mongo’s performance
     - Read more: http://blog.wordnik.com/12-months-with-mongodb
  5. Engineering + IT Ops
     - First, Guiding Principles
     - Know your data
     - Don’t rely on IT magic
     - Equal Importance in WebApps / SaaS
     - Hold hands and be friends
     - If you can’t manage it, don’t deploy it
  6. Admins: Be Prepared
     - ok, this sucks.
  7. How?
     - Replicate!
     - Is that enough?
     - Well, not if your company is on the line
     - Snapshot
     - Every minute???
     - Export often
     - Really???
  8. Then What?
     - Yes, Mongo can do Incremental
     - Use the mongo slave mechanism
     - It’s exposed
     - It’s supported
     - It’s very easy
     - It’s extremely fast
     - How?
     - Snapshot your data
     - Stream write ops to disk
     - Repeat
  9. Better than Free
     - Take our tools: they work!!!
     - SnapshotUtil
       - Selectively snapshot in BSON
       - Index info too!
     - IncrementalBackupUtil
       - Tail the oplog, stream to disk
       - Only the collections you want!
       - Compress & rotate
     - RestoreUtil
       - Recover your snapshots
       - Apply indexes yourself
     - ReplayUtil
       - Apply your Incremental backups
  10. What if Scenarios
     - One collection gets corrupted?
       - Restore it
       - Apply all operations to it
     - “My top developer dropped a collection!”
       - Restore just that one
       - Apply operations to it up to that point in time
     - “We got hacked!”
       - Restore it all
       - Apply operations up to that point in time
  11. What else is possible?
     - Replication
     - Why not use built-in?
     - Control, of course
     - Same logic as Incremental + Replay
     - Add some filters and it gets interesting
  12. Hot Datacenter
     - Create incremental backups
     - Compress
     - Push to DC in batch
     - Apply to master
     - [Diagram labels: Primary Datacenter, Hot Datacenter, Incremental Backup Files, Master, Master, Replay Util, SCP]
  13. Dev Environment
     - Developers need production-ish data
     - Anonymize while replicating to dev server
  14. Multiple Upstream Masters
     - Aggregate to single collection
     - Target can be a master!
     - [Diagram labels: Master A, Master B, Master C, db.page_views, db.page_views]
  15. Unblock MapReduce
     - Map Reduce can lock up your server
     - Replicate source data to another mongod
     - Replicate results back to master
     - [Diagram labels: Master, MR Server, db.source_data, db.summary_data]
  16. Mesh Mode
     - Write to Multiple Masters
     - Filter by “Server Identifier”
     - Shell example:
         > db.documents.find().limit(2)
         {"_id":99887,"src":2,"title":"favorite.png","fsid":33774}
         {"_id":128773,"src":1,"title":"select.png","fsid":837743}
     - [Diagram labels: db.documents, documents.src != 1, Master 1, Master 2, db.documents, documents.src != 2]
  17. What’s Next
     - Multi-Master in Wordnik Production
     - Multiple Datacenter Presence
     - More data => more challenges
  18. Try it out
     - http://blog.wordnik.com/mongoutils
     - Questions?
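
The slides describe replaying captured write operations with filters: only chosen collections, only up to a point in time, and (in mesh mode) only operations from other servers. The following is a minimal sketch of that idea in plain Python. It is not Wordnik's ReplayUtil. The `replay` function, the in-memory `store`, the `app.` database name, and the integer `ts` values are all assumptions made for illustration; real oplog timestamps are BSON Timestamps, and real update entries may carry modifiers rather than whole documents. Only the entry field names (`ts`, `op`, `ns`, `o`, `o2`) follow MongoDB's oplog layout.

```python
from typing import Callable, Dict, Iterable, Optional, Set

Op = dict                                  # one oplog-style entry
Store = Dict[str, Dict[object, dict]]      # namespace -> {_id: document}


def replay(ops: Iterable[Op], store: Store,
           until_ts: Optional[int] = None,
           namespaces: Optional[Set[str]] = None,
           keep: Optional[Callable[[Op], bool]] = None) -> int:
    """Apply ops in order to an in-memory store; return how many were applied."""
    applied = 0
    for op in ops:
        # Point-in-time cutoff. Assumes ops are sorted by ts (hypothetical ints).
        if until_ts is not None and op["ts"] > until_ts:
            break
        # "Only the collections you want": skip other namespaces.
        if namespaces is not None and op["ns"] not in namespaces:
            continue
        # Arbitrary extra filter, e.g. a server-identifier check for mesh mode.
        if keep is not None and not keep(op):
            continue
        coll = store.setdefault(op["ns"], {})
        if op["op"] == "i":                # insert
            coll[op["o"]["_id"]] = dict(op["o"])
        elif op["op"] == "u":              # update; simplified to full replacement
            coll[op["o2"]["_id"]] = dict(op["o"])
        elif op["op"] == "d":              # delete
            coll.pop(op["o"]["_id"], None)
        else:
            continue                       # ignore no-ops and commands
        applied += 1
    return applied


# Sample ops modeled on the documents shown on the Mesh Mode slide.
ops = [
    {"ts": 1, "op": "i", "ns": "app.documents",
     "o": {"_id": 99887, "src": 2, "title": "favorite.png", "fsid": 33774}},
    {"ts": 2, "op": "i", "ns": "app.documents",
     "o": {"_id": 128773, "src": 1, "title": "select.png", "fsid": 837743}},
    {"ts": 3, "op": "i", "ns": "app.page_views", "o": {"_id": 1, "src": 1}},
    {"ts": 4, "op": "d", "ns": "app.documents", "o": {"_id": 99887}},  # the "accident"
]

# Restore one collection up to just before the accidental delete at ts 4.
restored: Store = {}
applied = replay(ops, restored, until_ts=3, namespaces={"app.documents"})

# Mesh mode, assuming master 1 tags its own writes with src == 1 and
# therefore applies only operations that originated elsewhere.
master1: Store = {}
mesh_applied = replay(ops[:2], master1, keep=lambda op: op["o"].get("src") != 1)
```

In this sketch the point-in-time restore stops before the delete, so both documents survive, and the mesh filter keeps a master from re-applying its own writes. A real implementation would read entries from the oplog or from the incremental backup files and write them to a mongod instead of a dictionary.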