Walking the Walk: Developing the MongoDB Backup Service with MongoDB

Speaker Notes

  • A presentation on how we built the MongoDB Backup Service in house. Not a sales pitch.
  • Backup Agent = external program, similar to the MMS Agent. Written in Go. Ingestion = RESTful web services, responsible for all agent communication (configuration and ingestion). Daemons = actual processing.
  • "Evolving Schema with MongoDB." Complexity crept in: complex rules, graceful recovery, accurate representation of data, additional state. "More bells and whistles."
  • Schema talk: how the application will use the data. Start simple and evolve.
  • Explain what a capped collection is (a circular buffer). Explain what the oplog is.
  • 500 GB -> 1 TB oplog. Timeline: why this was chosen, how long it lasted, why and when it stopped working (can't shard, single point of contention, lose dynamic schema, customers intertwined). Start with something. TTL indexes did not exist yet.
  • MongoDB 2.2 coincided with this stage of development. 2.2 added TTL (Time To Live) indexes and DB-level locking; we get TTL and dynamic schema, and we can shard. The TTL window was hard to manage at first, made easier later.
  • Worst case: no worse than a tar.gz of the data files.
  • Be prepared to discuss ZFS and why it wasn't chosen.
  • Risk: corruption. Big idea: de-duping -> save only the stuff that changed.
  • Multiple slides with details: what went wrong and why. E.g. the indexes we used, read vs. write, disk I/O was the bottleneck.

Transcript

  • 1. Walking the Walk: Developing the MongoDB Backup Service With MongoDB. Steve Briskin, Engineer, Cloud Team, 10gen
  • 2. Agenda
    • Intro: The Project
    • How the backup service was built
      – Keeping State
      – Storage of Oplog Documents
      – De-duped Snapshot Storage
    • Q&A
  • 3. The Project
    • Started in December 2011 – 1 person
    • 3 engineers + PM & manager by June 2012
    • Private beta – September 2012
    • Limited release – April 2013
    • 6 engineers (and hiring) + PM & manager – now
    • Agile principles
  • 4. Data Flow (diagram). On the customer side: Replica Sets 1–4, each with a Backup Agent. At 10gen: Backup Ingestion, Backup Daemon(s), Main DB, Block Store (a sharded cluster), and reconstructed replica sets RS1–RS4. Flow: 1. Configuration; 2. Initial Sync; 3. Oplog Data; 4. Save Sync/Oplog Data; 5. Reconstruct Replica Set; 6. Persist Snapshot; 7. Retrieve Snapshot; 8. SCP Data Files
  • 5. How We Built It (Iteratively)
  • 6. Keeping State – First Version
    • One document per replica set being backed up:
    {
      _id : ObjectId("5194ecde036446e958b9df9b"),
      groupId : "Customer Group",
      replicaSet : "ReplSet Name",
      broken : false,
      workingOn : "Initial Sync",
      numOplogs : NumberInt(100),
      head : Timestamp(1370982242, 1),
      lastOplog : Timestamp(1370982243, 1),
      lastSnapshot : Timestamp(1370981940, 1),
      machine : "backup1.10gen.com"
    }
  • 7. Keeping State – Current Version
    • More fields, nested documents. Still no joins.
    {
      _id : ObjectId("5194ecde036446e958b9df9b"),
      groupId : "Customer Group",
      replicaSet : "ReplSet Name",
      broken : false,
      workingOn : { … },                                            // simple value -> nested document
      head : { ts : Timestamp(1370982242, 1), hash : 49238479326510 },
      lastOplog : { ts : Timestamp(1370982243, 1), hash : 93408342387492 },
      numOplogs : NumberLong(9400),                                 // integer -> long
      oplogNamespace : "CustomerGroup.oplogs_ReplSetName",
      lastSnapshot : Timestamp(1370981940, 1),
      nextSnapshot : Timestamp(1371003540, 1),
      schedule : { reference : 13709812343, rules : [ { … }, { … } ] },  // complex, nested document
      machine : "backup1.10gen.com"
    }
  • 8. Imitating a Secondary: Capturing and Storing the Oplog
  • 9. Capture Oplog
    • Use the replication oplog to capture activity
    • The oplog is a capped collection – local.oplog.rs
      – We can tail capped collections
    • Strategy
      – Tail the oplog
      – Read 10 MB of data
      – Compress and send to 10gen
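The tail/batch/compress strategy above can be sketched as follows. This is a toy Python illustration only (the real agent is written in Go and ships BSON oplog entries, per the speaker notes); `batch_and_compress`, JSON serialization, and zlib are illustrative stand-ins for the actual wire format.

```python
import json
import zlib

def batch_and_compress(entries, max_batch_bytes=10 * 1024 * 1024):
    """Group oplog-like entries into batches of at most max_batch_bytes of
    serialized data, then zlib-compress each batch before shipping it."""
    batches, current, current_size = [], [], 0
    for entry in entries:
        doc = json.dumps(entry)
        if current and current_size + len(doc) > max_batch_bytes:
            batches.append(zlib.compress("\n".join(current).encode()))
            current, current_size = [], 0
        current.append(doc)
        current_size += len(doc)
    if current:  # flush the final partial batch
        batches.append(zlib.compress("\n".join(current).encode()))
    return batches

def decompress_batch(blob):
    """Inverse of one batch: decompress and parse entries back out."""
    return [json.loads(line) for line in zlib.decompress(blob).decode().split("\n")]
```

In the real service the 10 MB cap bounds both memory use in the agent and the size of a single ingestion request; a tailable cursor on `local.oplog.rs` feeds `entries` continuously.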
  • 10. Store Oplog – First Version
    • Single capped collection
    • Pros
      – Easy
    • Cons
      – Doesn't scale!
      – Customers will have an impact on each other
  • 11. Store Oplog – Good Version
    • DB per customer and collection per replica set
    • TTL index for cleanup
    • Pros
      – Logical and physical separation of customer data
      – Can scale quickly and easily
      – Configurable by the end user
  • 12. Storing the Snapshots
  • 13. Storage – First Version
    • Archive and compress MongoDB data files
    • Scatter archives across machines
      – Pros
        • Fast and easy
      – Cons
        • No redundancy, hard to scale, wastes space
    (Diagram: Machine 1 holds Snapshot_1.tar.gz and Snapshot_4.tar.gz; Machine 2 holds Snapshot_2.tar.gz and Snapshot_5.tar.gz; Machine 3 holds Snapshot_3.tar.gz and Snapshot_6.tar.gz)
  • 14. Goal 1: De-Duplicated Storage
    • Observation
      – Data change is low and localized
      – Data is compressible
    • Huge benefits in de-duplicating. For a 100 GB data set:
      – Worst case: 0% de-dupe, no compression -> 100 GB stored
      – Typical case: 90% de-dupe, 3x compression -> about 3 GB stored per snapshot
      – Best case: 100% de-dupe, 10x compression -> almost nothing stored per snapshot
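The slide's figures follow from simple arithmetic: only the non-deduplicated fraction of the data is stored, and compression divides that further. A one-line sketch (the function name is mine, not from the talk):

```python
def stored_size_gb(total_gb, dedupe_fraction, compression_ratio):
    """Bytes actually stored for a snapshot: the fraction of data that did
    NOT de-dupe, divided by the compression ratio."""
    return total_gb * (1 - dedupe_fraction) / compression_ratio
```

For the typical case on the slide, 100 GB at 90% de-dupe and 3x compression stores 100 x 0.1 / 3, roughly 3.3 GB per snapshot.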
  • 15. Goal 2: Redundancy and Scalability
    • Require high availability & redundancy
      – MongoDB replication!
    • Require the ability to scale
      – MongoDB sharding!
  • 16. Block Store (diagram): db_file.0 is split into blocks, each stored under its SHA-256 hash, e.g. { SHA-256 Hash = "de23425..", Data = BinData[……] }, { SHA-256 Hash = "3af37..", Data = BinData[……] }, { SHA-256 Hash = "e721ac..", Data = BinData[……] }
  • 17. Block Store (diagram)
    • File reference: a file document points to its blocks by hash
  • 18. Block Store Internals
    • Files collection (the `hash` values are SHA-256 hashes):
    {
      _id : ObjectId("5194ece0036446e958b9dfa1"),
      filename : "db_file.0",
      size : NumberLong(786432),
      blocks : [
        { hash : "de2f256064….", size : 96 },
        { hash : "47a9834f23….", size : 32121 },
        ….
      ]
    }
    • Blocks collection (the `_id` is the block's SHA-256 hash):
    {
      _id : "de2f256064a0af797747c2b9755dcb9f3df0de4f489eac731c23ae9ca9cc31",
      bytes : BinData(0, "H4sIAAAAAAAAAO3BAQEAAACAkP6v7ggKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAauuOl9cAAAEA"),
      zippedSize : 96,
      size : 65536
    }
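The write path implied by these two collections can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the service code: `block_store` is a plain dict standing in for the Blocks collection, the returned manifest stands in for a Files document, and the 64 KB block size mirrors the 65536-byte block in the example above.

```python
import hashlib
import zlib

BLOCK_SIZE = 64 * 1024  # 64 KB blocks, matching the slide's 65536-byte example

def store_file(block_store, filename, data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks, key each block by its SHA-256
    hash, and only store blocks not already present (de-duplication)."""
    blocks = []
    for off in range(0, len(data), block_size):
        chunk = data[off:off + block_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in block_store:  # de-dupe: skip blocks we already hold
            block_store[digest] = {
                "bytes": zlib.compress(chunk),  # stored compressed
                "size": len(chunk),
            }
        blocks.append({"hash": digest, "size": len(chunk)})
    return {"filename": filename, "size": len(data), "blocks": blocks}
```

Content addressing is what makes de-duplication free here: two snapshots of a mostly unchanged file share every block whose bytes did not change, because identical bytes hash to the same `_id`.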
  • 19. Putting the file back together
    • For each file
      – For each block
        • Retrieve the block
        • Uncompress
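The restore loop above is the mirror image of the write path: walk the file's block list in order, fetch each block by hash, and uncompress. A minimal sketch, assuming the same dict-as-Blocks-collection and manifest-as-Files-document shapes used for illustration:

```python
import hashlib
import zlib

def restore_file(block_store, manifest):
    """Reassemble a file: for each block hash in the manifest, retrieve the
    compressed block, uncompress it, and concatenate in order."""
    out = bytearray()
    for block in manifest["blocks"]:
        out += zlib.decompress(block_store[block["hash"]]["bytes"])
    return bytes(out)
```

Note that a block referenced twice in a manifest is stored only once but expanded twice on restore, which is exactly how de-duplicated snapshots reinflate to full size.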
  • 20. Block Store Garbage Collection
    • 1st attempt
      – Reference counting
      – Slow and non-parallelizable
    • 2nd attempt
      – Mark and sweep
      – Parallelizable
      – Requires more space
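The mark-and-sweep approach can be sketched against the same toy structures: mark every block hash referenced by any live file manifest, then sweep away unreferenced blocks. A simplified single-process illustration (the talk's point is that the mark phase parallelizes per file, unlike maintaining reference counts on every write and delete):

```python
def sweep_blocks(block_store, manifests):
    """Mark-and-sweep GC for the block store: mark every hash referenced by
    any file manifest, then delete (sweep) unreferenced blocks."""
    marked = set()
    for manifest in manifests:           # mark phase: parallelizable per file
        for block in manifest["blocks"]:
            marked.add(block["hash"])
    for digest in list(block_store):     # sweep phase: drop unmarked blocks
        if digest not in marked:
            del block_store[digest]
    return marked
```

The "requires more space" trade-off on the slide comes from having to hold the full mark set, and from garbage blocks lingering until the next sweep rather than being freed the moment their last reference disappears.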
  • 21. Q&A