Keeping the Lights On with MongoDB
 


A presentation by Tony Tam at the MongoSV conference in Silicon Valley, hosted by MongoDB creator 10gen.

    Keeping the Lights On with MongoDB: Presentation Transcript

    • Mongo SV
      Keeping the lights on with MongoDB
      Tony Tam
      12/3/2010
    • Presentation Overview
      Data >>> code
      Treat it appropriately
      Manage and maintain Mongo
      Mongo is young (and robust!)
      Performance and Features
      The right hooks exist
    • Who is Wordnik
      Wordnik is:
      The world’s largest English Language reference
      ~10M words!
      Mapping every word, based on real data
      (Free) API to add word information, everywhere
    • Wordnik’s MongoDB Deployment
      Over 12 Months with Mongo
      Corpus/UGC/Structured Data/Statistics
      Master/Slave
      ~3TB data
      ~12B records
      We love Mongo’s performance
      Read more:
      http://blog.wordnik.com/12-months-with-mongodb
    • Engineering + IT Ops
      First, Guiding Principles
      Know your data
      Don’t rely on IT magic
      Equal Importance in WebApps / SaaS
      Hold hands and be friends
      If you can’t manage it, don’t deploy it
    • Admins: Be Prepared
      ok, this sucks.
    • How?
      Replicate!
      Is that enough?
      Well, not if your company is on the line
      Snapshot
      Every minute???
      Export often
      Really???
    • Then What?
      Yes, Mongo can do Incremental
      Use the mongo slave mechanism
      It’s exposed
      It’s supported
      It’s very easy
      It’s extremely fast
      How?
      Snapshot your data
      Stream write ops to disk
      Repeat
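The "stream write ops to disk" step above can be sketched as follows. This is a minimal in-memory illustration, not Wordnik's tool: the function name `stream_oplog`, the `app.*` namespaces, and the integer timestamps are assumptions made for the example (a real oplog uses BSON timestamps), while the `ts`, `op`, `ns`, `o`, and `o2` keys mirror the fields of MongoDB oplog entries.

```python
import io
import json


def stream_oplog(entries, out, namespaces, last_ts=0):
    """Append oplog-style entries newer than last_ts to `out`, one JSON object per line.

    Only entries whose "ns" is in `namespaces` are written. Returns the newest
    timestamp seen, so the next run can resume from it. Assumes `entries` is
    ordered by "ts".
    """
    for entry in entries:
        if entry["ts"] <= last_ts:
            continue  # already handled by a previous run
        last_ts = entry["ts"]
        if entry["ns"] in namespaces:
            out.write(json.dumps(entry, sort_keys=True) + "\n")
    return last_ts


# Simplified stand-ins for oplog entries: "i" = insert, "u" = update.
oplog = [
    {"ts": 1, "op": "i", "ns": "app.words", "o": {"_id": 1, "word": "lamp"}},
    {"ts": 2, "op": "i", "ns": "app.logs", "o": {"_id": 7, "msg": "noise"}},
    {"ts": 3, "op": "u", "ns": "app.words", "o2": {"_id": 1},
     "o": {"_id": 1, "word": "light"}},
]

backup = io.StringIO()  # a real tool would append to a file on disk
checkpoint = stream_oplog(oplog, backup, {"app.words"})
```

Calling `stream_oplog` again with `last_ts=checkpoint` writes nothing new, which is what makes the snapshot, stream, repeat loop safe to run continuously.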
    • Better than Free
      Take our tools. They work!
      SnapshotUtil
      Selectively snapshot in BSON
      Index info too!
      IncrementalBackupUtil
      Tail the oplog, stream to disk
      Only the collections you want!
      Compress & rotate
      RestoreUtil
      Recover your snapshots
      Apply indexes yourself
      ReplayUtil
      Apply your Incremental backups
    • What-if Scenarios
      One collection gets corrupted?
      Restore it
      Apply all operations to it
      “My top developer dropped a collection!”
      Restore just that one
      Apply operations to it until that point in time (POT)
      “We got hacked!”
      Restore it all
      Apply operations until that point in time (POT)
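The restore scenarios above all follow one pattern: start from a snapshot, then replay recorded operations up to a chosen timestamp. A minimal sketch, assuming a collection modeled as a dict keyed by `_id` and updates simplified to full-document replacement; `apply_op` and `restore_to_point` are hypothetical names, not the API of RestoreUtil or ReplayUtil, and the "bad" operation here is a plain delete rather than a collection drop.

```python
import copy


def apply_op(collection, entry):
    """Apply one simplified oplog entry to a dict keyed by _id."""
    if entry["op"] == "i":
        collection[entry["o"]["_id"]] = entry["o"]
    elif entry["op"] == "u":
        # Simplification: treat "o" as a full replacement document.
        collection[entry["o2"]["_id"]] = entry["o"]
    elif entry["op"] == "d":
        collection.pop(entry["o"]["_id"], None)


def restore_to_point(snapshot, ops, ns, until_ts):
    """Copy the snapshot, then replay ops for `ns` up to until_ts (inclusive)."""
    collection = copy.deepcopy(snapshot)
    for entry in ops:
        if entry["ns"] == ns and entry["ts"] <= until_ts:
            apply_op(collection, entry)
    return collection


snapshot = {1: {"_id": 1, "word": "lamp"}}
ops = [
    {"ts": 10, "op": "i", "ns": "app.words", "o": {"_id": 2, "word": "bulb"}},
    {"ts": 11, "op": "u", "ns": "app.words", "o2": {"_id": 1},
     "o": {"_id": 1, "word": "light"}},
    {"ts": 12, "op": "d", "ns": "app.words", "o": {"_id": 2}},  # the unwanted operation
]

# Stop just before the unwanted operation at ts=12.
restored = restore_to_point(snapshot, ops, "app.words", until_ts=11)
```

Filtering on `ns` is what allows restoring just one collection while leaving the rest of the database untouched.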
    • What else is possible?
      Replication
      Why not use built-in?
      Control, of course
      Same logic as Incremental + Replay
      Add some filters and it gets interesting
    • Hot Datacenter
      Create incremental backups
      Compress
      Push to DC in batch
      Apply to master
      [Diagram: Primary Datacenter (Master) -> Incremental Backup Files -> SCP -> Replay Util -> Hot Datacenter (Master)]
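The compress and batch steps on this slide might look like the sketch below. It assumes operations are serialized as JSON lines and gzipped per batch; `make_batches` and `read_batch` are hypothetical helpers, and the transfer itself (SCP in the talk) is left out.

```python
import gzip
import json


def make_batches(entries, batch_size):
    """Split ops into fixed-size batches and gzip each one as JSON lines."""
    batches = []
    for start in range(0, len(entries), batch_size):
        chunk = entries[start:start + batch_size]
        payload = "\n".join(json.dumps(e, sort_keys=True) for e in chunk)
        batches.append(gzip.compress(payload.encode("utf-8")))
    return batches


def read_batch(blob):
    """Inverse of one batch: decompress and parse back into op dicts."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]


ops = [{"ts": n, "op": "i", "ns": "app.words", "o": {"_id": n}} for n in range(5)]
batches = make_batches(ops, batch_size=2)

# On the receiving side, batches are unpacked in order before being replayed.
replayed = [entry for blob in batches for entry in read_batch(blob)]
```

Keeping batches in timestamp order matters, since the receiving master has to apply them in the order they were written.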
    • Dev Environment
      Developers need production-ish data
      Anonymize while replicating to dev server
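Anonymizing while replicating amounts to transforming each operation's document before it is applied to the dev server. A minimal sketch, assuming the sensitive fields are named `email` and `name` (hypothetical) and that a truncated SHA-256 digest is an acceptable placeholder value:

```python
import hashlib

SENSITIVE_FIELDS = {"email", "name"}  # hypothetical field names


def anonymize(doc):
    """Return a copy of doc with sensitive values replaced by a stable hash."""
    cleaned = {}
    for key, value in doc.items():
        if key in SENSITIVE_FIELDS:
            digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
            cleaned[key] = digest[:12]
        else:
            cleaned[key] = value
    return cleaned


def anonymize_op(entry):
    """Anonymize the document payload of one simplified oplog entry."""
    out = dict(entry)
    if entry["op"] in ("i", "u"):
        out["o"] = anonymize(entry["o"])
    return out


op = {"ts": 1, "op": "i", "ns": "app.users",
      "o": {"_id": 5, "email": "a@example.com", "plan": "free"}}
dev_op = anonymize_op(op)
```

A stable hash keeps equal values equal across documents, so joins and uniqueness checks still behave in dev. An unsalted hash is not strong protection for guessable values, so a production setup would likely want a keyed hash or generated fake data instead.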
    • Multiple Upstream Masters
      Aggregate to single collection
      Target can be a master!
      [Diagram: Master A, Master B, and Master C each feed db.page_views into a single target db.page_views]
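Aggregating several upstream masters into one collection can be sketched as a timestamp-ordered merge of their operation streams. Two choices here are assumptions for illustration rather than anything stated in the talk: the `stats` database name, and keying the target by `(master, _id)` so that documents with the same `_id` on different masters do not overwrite each other.

```python
import heapq


def merge_streams(streams):
    """Merge per-master op streams (each sorted by ts) into one ts-ordered stream."""
    tagged = (
        [(entry["ts"], name, entry) for entry in ops]
        for name, ops in sorted(streams.items())
    )
    # Order by timestamp, breaking ties by master name for a stable result.
    for ts, name, entry in heapq.merge(*tagged, key=lambda t: (t[0], t[1])):
        yield name, entry


def aggregate(streams):
    """Apply inserts from all masters to a single target collection."""
    target = {}
    for name, entry in merge_streams(streams):
        if entry["op"] == "i":
            target[(name, entry["o"]["_id"])] = entry["o"]
    return target


streams = {
    "A": [{"ts": 1, "op": "i", "ns": "stats.page_views", "o": {"_id": 1, "page": "/home"}}],
    "B": [{"ts": 2, "op": "i", "ns": "stats.page_views", "o": {"_id": 1, "page": "/about"}}],
    "C": [{"ts": 1, "op": "i", "ns": "stats.page_views", "o": {"_id": 9, "page": "/api"}}],
}
order = [name for name, _ in merge_streams(streams)]
page_views = aggregate(streams)
```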
    • Unblock MapReduce
      Map Reduce can lock up your server
      Replicate source data to another mongod
      Replicate results back to master
      [Diagram: Master replicates db.source_data to the MR Server; the MR Server replicates db.summary_data back to Master]
    • Mesh Mode
      Write to Multiple Masters
      Filter by “Server Identifier”
      > db.documents.find().limit(2)
      {"_id":99887,"src":2,"title":"favorite.png","fsid":33774}
      {"_id":128773,"src":1,"title":"select.png","fsid":837743}
      [Diagram: Master 1 (db.documents, filter documents.src != 1) and Master 2 (db.documents, filter documents.src != 2) replicate to each other]
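The server-identifier filter can be sketched as below, using the two documents from the slide. One reading of the `src != N` filters, assumed here, is that each master applies only operations that originated elsewhere, so its own writes do not loop back to it. `ops_to_apply` and the `files` database name are hypothetical.

```python
def ops_to_apply(incoming, local_id):
    """Keep only ops whose document did not originate on this server."""
    return [e for e in incoming if e["o"].get("src") != local_id]


incoming = [
    {"op": "i", "ns": "files.documents",
     "o": {"_id": 99887, "src": 2, "title": "favorite.png", "fsid": 33774}},
    {"op": "i", "ns": "files.documents",
     "o": {"_id": 128773, "src": 1, "title": "select.png", "fsid": 837743}},
]

for_master_1 = ops_to_apply(incoming, local_id=1)  # documents.src != 1
for_master_2 = ops_to_apply(incoming, local_id=2)  # documents.src != 2
```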
    • What’s Next
      Multi-Master in Wordnik Production
      Multiple Datacenter Presence
      More data => more challenges
    • Try it out
      http://blog.wordnik.com/mongoutils
      Questions?