Walking the Walk: Developing the MongoDB Backup Service with MongoDB


  • A presentation on how we built the MongoDB Backup Service in house. Not a sales pitch.
  • Backup Agent = external program, similar to the MMS Agent. Written in Go. Ingestion = RESTful web services, responsible for all agent communication (configuration and ingestion). Daemons = actual processing.
  • “Evolving Schema with MongoDB.” Complexity crept in: complex rules, graceful recovery, accurate representation of data, additional state. “More bells and whistles.”
  • Schema talk: how the application will use the data. Start simple and evolve.
  • Explain what a capped collection is (a circular buffer). Explain what the oplog is.
  • 500GB -> 1TB oplog. Timeline: why this was chosen, how long it lasted, why and when it stopped working (can’t shard, single point of contention, lose dynamic schema, customers intertwined). Start with something. TTL indexes did not exist.
  • MongoDB 2.2 coincided with this stage of development. Added in 2.2: Time To Live (TTL) indexes and DB-level locking. TTL plus dynamic schema, and we can shard. TTL window: hard to manage at first, made easier later.
  • Worst case: no worse than a tar.gz of the files.
  • Be prepared to discuss ZFS and why it wasn’t chosen.
  • Risk: corruption. Big idea: de-duping -> save only the stuff that changed.
  • Multiple slides with details: what went wrong and why. E.g. the indexes we used, read vs. write; disk I/O was the bottleneck.

    1. Walking the Walk: Developing the MongoDB Backup Service with MongoDB. Steve Briskin, Engineer, Cloud Team, 10gen
    2. Agenda • Intro: The Project • How the backup service was built – Keeping State – Storage of Oplog Documents – De-duped Snapshot Storage • Q&A
    3. The Project • Started in December 2011 – 1 person • 3 Engineers + PM & Manager by June 2012 • Private Beta – September 2012 • Limited Release – April 2013 • 6 Engineers (and hiring) + PM & Manager – now • Agile principles
    4. Data Flow [diagram] • Components: customer replica sets (1–4) and a sharded cluster, the Backup Agent, Backup Ingestion (at 10gen), Backup Daemon(s), Main DB, Block Store, and reconstructed replica sets (RS1–RS4) • Steps: 1. Configuration, 2. Initial Sync, 3. OpLog Data, 4. Save Sync/Oplog Data, 5. Reconstruct Replica Set, 6. Persist Snapshot, 7. Retrieve Snapshot, 8. SCP Data Files
    5. How We Built It (Iteratively)
    6. Keeping State – First Version • One document per replica set being backed up
       {
         _id : ObjectId("5194ecde036446e958b9df9b"),
         groupId : "Customer Group",
         replicaSet : "ReplSet Name",
         broken : false,
         workingOn : "Initial Sync",
         numOplogs : NumberInt(100),
         head : Timestamp(1370982242, 1),
         lastOplog : Timestamp(1370982243, 1),
         lastSnapshot : Timestamp(1370981940, 1),
         machine : "backup1.10gen.com"
       }
    7. Keeping State – Current Version • More fields, nested documents. Still no joins.
       {
         _id : ObjectId("5194ecde036446e958b9df9b"),
         groupId : "Customer Group",
         replicaSet : "ReplSet Name",
         broken : false,
         workingOn : { … },
         head : { ts : Timestamp(1370982242, 1), hash : 49238479326510 },      // simple value -> nested document
         lastOplog : { ts : Timestamp(1370982243, 1), hash : 93408342387492 },
         numOplogs : NumberLong(9400),                                         // integer -> long
         oplogNamespace : "CustomerGroup.oplogs_ReplSetName",
         lastSnapshot : Timestamp(1370981940, 1),
         nextSnapshot : Timestamp(1371003540, 1),
         schedule : { reference : 13709812343, rules : [ { … }, { … } ] },     // complex, nested document
         machine : "backup1.10gen.com"
       }
    8. Imitating a Secondary: Capturing and Storing the Oplog
    9. Capture Oplog • Use the replication oplog to capture activity • The oplog is a capped collection – local.oplog.rs – and we can tail capped collections • Strategy: – Tail the oplog – Read 10 MB of data – Compress and send to 10gen
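The batch-and-compress half of that strategy can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual code: the real agent is written in Go and reads entries from a tailable cursor on local.oplog.rs, while here `batch_and_compress`, the JSON stand-in for BSON serialization, and the entry shapes are all hypothetical.

```python
import json
import zlib
from typing import Iterable, Iterator

BATCH_LIMIT = 10 * 1024 * 1024  # ship roughly 10 MB of oplog entries at a time

def batch_and_compress(oplog_entries: Iterable[dict],
                       limit: int = BATCH_LIMIT) -> Iterator[bytes]:
    """Group tailed oplog entries into ~`limit`-byte batches and
    compress each batch before sending it to the ingestion service."""
    batch, size = [], 0
    for entry in oplog_entries:
        raw = json.dumps(entry).encode()  # stand-in for BSON serialization
        batch.append(raw)
        size += len(raw)
        if size >= limit:
            yield zlib.compress(b"\n".join(batch))
            batch, size = [], 0
    if batch:  # flush the final partial batch
        yield zlib.compress(b"\n".join(batch))
```

In the real agent the input would be an endless tailable cursor rather than a finite iterable, so the loop never terminates and the final flush is driven by a timeout instead.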
    10. Store Oplog – First Version • Single capped collection • Pros: – Easy • Cons: – Doesn’t scale! – Customers will have an impact on each other
    11. Store Oplog – Good Version • DB per customer and collection per replica set • TTL index for cleanup • Pros: – Logical and physical separation of customer data – Can scale quickly and easily – Configurable by end user
    12. Storing the Snapshots
    13. Storage – First Version • Archive and compress MongoDB data files • Scatter archives across machines – Pros: • Fast and easy – Cons: • No redundancy, hard to scale, wastes space [diagram: Machine 1 holds Snapshot_1.tar.gz and Snapshot_4.tar.gz; Machine 2 holds Snapshot_2.tar.gz and Snapshot_5.tar.gz; Machine 3 holds Snapshot_3.tar.gz and Snapshot_6.tar.gz]
    14. Goal 1: De-Duplicated Storage • Observation: – Data change is low and localized – Data is compressible • Huge benefits in de-duplicating. For a 100GB snapshot: worst case (0% de-dupe, no compression) still stores 100GB; best case (100% de-dupe, 10x compression) stores ~0GB; typical case (90% de-dupe, 3x compression) stores ~3GB (10GB after de-dupe, ~3GB after compression)
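The typical-case arithmetic is worth a quick sanity check (the variable names are illustrative, and the 90%/3x figures are the slide's own typical-case numbers):

```python
snapshot_gb = 100
dedupe = 0.90        # 90% of blocks unchanged since the last snapshot
compression = 3.0    # ~3x compression on the remaining blocks

# Only the changed 10% is stored, then compressed ~3x.
stored_gb = snapshot_gb * (1 - dedupe) / compression  # ~3.3 GB of the original 100 GB
```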
    15. Goal 2: Redundancy and Scalability • Require high availability & redundancy – MongoDB replication! • Require ability to scale – MongoDB sharding!
    16. Block Store • db_file.0 is split into blocks, each stored under its hash: – SHA-256 hash = "de23425..", data = BinData[……] – SHA-256 hash = "3af37..", data = BinData[……] – SHA-256 hash = "e721ac..", data = BinData[……]
    17. Block Store • File reference
    18. Block Store Internals
        Files collection (a file document references its blocks by SHA-256 hash):
        {
          _id : ObjectId("5194ece0036446e958b9dfa1"),
          filename : "db_file.0",
          size : NumberLong(786432),
          blocks : [ { hash : "de2f256064….", size : 96 }, { hash : "47a9834f23….", size : 32121 }, … ]
        }
        Blocks collection (the _id is the block's SHA-256 hash):
        {
          _id : "de2f256064a0af797747c2b9755dcb9f3df0de4f489eac731c23ae9ca9cc31",
          bytes : BinData(0, "H4sIAAAAAAAAAO3BAQEAAACAkP6v7ggKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAauuOl9cAAAEA"),
          zippedSize : 96,
          size : 65536
        }
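The write path for those two collections can be modeled with the standard library. This is a sketch, not the service's code: plain dicts stand in for the Files and Blocks collections, zlib for the compression, and the 64 KB block size mirrors the `size : 65536` in the example document; `store_file` and its parameters are hypothetical names.

```python
import hashlib
import zlib

BLOCK_SIZE = 64 * 1024  # 64 KB blocks, as in the example document above

def store_file(data: bytes, blocks: dict, files: dict, filename: str) -> None:
    """Split `data` into fixed-size blocks keyed by SHA-256 hash.
    A block already present in `blocks` is only referenced, never
    re-stored -- this is where de-duplication happens."""
    refs = []
    for off in range(0, len(data), BLOCK_SIZE):
        chunk = data[off:off + BLOCK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in blocks:  # de-dupe: store each unique block once
            blocks[digest] = {"bytes": zlib.compress(chunk), "size": len(chunk)}
        refs.append({"hash": digest, "size": len(chunk)})
    files[filename] = {"size": len(data), "blocks": refs}
```

Storing a second, mostly identical snapshot then adds file documents but almost no new blocks, which is exactly the de-duplication win from Goal 1.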
    19. Putting the file back together • For each file: – For each block: • Retrieve block • Uncompress
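That restore loop, as a self-contained sketch under the same assumptions (dicts stand in for the Files and Blocks collections; `restore_file` is an illustrative name):

```python
import zlib

def restore_file(file_doc: dict, blocks: dict) -> bytes:
    """Rebuild a file from its block references: look up each block
    by hash, uncompress it, and concatenate the chunks in order."""
    out = bytearray()
    for ref in file_doc["blocks"]:
        chunk = zlib.decompress(blocks[ref["hash"]]["bytes"])
        assert len(chunk) == ref["size"]  # sanity-check against the reference
        out.extend(chunk)
    return bytes(out)
```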
    20. Block Store Garbage Collection • 1st attempt: – Reference counting – Slow and non-parallelizable • 2nd attempt: – Mark and sweep – Parallelizable – Requires more space
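The mark-and-sweep approach can be sketched over the same dict model (again hypothetical names, with dicts standing in for the collections): mark every block hash referenced by a live file document, then sweep away the unmarked blocks.

```python
def sweep_unreferenced(files: dict, blocks: dict) -> int:
    """Mark-and-sweep GC over the block store.
    Returns the number of blocks removed."""
    marked = set()
    for doc in files.values():        # mark phase: parallelizable per file
        for ref in doc["blocks"]:
            marked.add(ref["hash"])
    garbage = [h for h in blocks if h not in marked]
    for h in garbage:                 # sweep phase: drop unmarked blocks
        del blocks[h]
    return len(garbage)
```

The extra space cost the slide mentions is visible here: the `marked` set must hold every live hash before any block can be deleted, whereas reference counting needs no such working set but serializes every increment and decrement.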
    21. Q&A