Keeping data-safe-webinar-2010-11-01

Published in: Technology

Transcript

  • 1. Keeping your data safe Richard M Kreuter 10gen Inc. richard@10gen.com November 1, 2010 Keeping your data safe — webinar
  • 2. Aspects of data safety
      • Replication
      • Cross-data-center replication
      • Application-controlled replication
      • Backup
      • Disaster recovery
  • 3. Replication
      • MongoDB supports automatic replication (data mirroring).
      • Recommended for failover, durability, and backups (essentially all deployments).
      • Works well over wide area networks.
      • Also good for horizontal read scaling: clients can conditionally read from any of a number of slaves.
  • 4. Replication Overview
      • MongoDB’s replication is similar to that of many other databases.
      • Writes are accepted only by a Primary-mode (master, writable) mongod.
      • Writes are recorded in a normalized format in the operation log (oplog).
      • Secondary-mode (slave, read-only) mongods periodically query the oplog and apply the operations they find.
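The flow on this slide can be illustrated with a toy model in plain JavaScript. This is only a sketch and not MongoDB's implementation: the `Primary` and `Secondary` classes, the `pull` and `oplogSince` methods, the integer timestamps, and the `{ts, op, key, value}` entry shape are all hypothetical names chosen for the example.

```javascript
// Toy model of oplog-based replication. All names here are hypothetical.
class Primary {
  constructor() {
    this.data = new Map();
    this.oplog = []; // ordered list of {ts, op, key, value}
    this.clock = 0;  // stand-in for a real operation timestamp
  }

  // Only the primary accepts writes; each write is also logged.
  write(key, value) {
    this.data.set(key, value);
    this.clock += 1;
    this.oplog.push({ ts: this.clock, op: "set", key, value });
  }

  // Entries newer than the given timestamp, oldest first.
  oplogSince(ts) {
    return this.oplog.filter((entry) => entry.ts > ts);
  }
}

class Secondary {
  constructor() {
    this.data = new Map();
    this.lastApplied = 0;
  }

  // One polling round: fetch new oplog entries and apply them in order.
  pull(primary) {
    for (const entry of primary.oplogSince(this.lastApplied)) {
      this.data.set(entry.key, entry.value);
      this.lastApplied = entry.ts;
    }
  }
}

const primary = new Primary();
const secondary = new Secondary();
primary.write("a", 1);
primary.write("a", 2);
secondary.pull(primary);
console.log(secondary.data.get("a"), secondary.lastApplied); // prints "2 2"
```

Between polling rounds the secondary lags the primary, which is why a read from a secondary may return slightly stale data.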
  • 5. Replica set replication
      • [Diagram. Labels: Master (write server) with three Slaves (read replicas); then Old Master, two Slaves (read replicas), and New master.]
  • 6. Replica Set Failover and Invariants
      • Replicating mongods track replica set membership.
      • If secondaries can’t see the master, but can see a majority of replica set votes, an election is induced.
      • Election selects exactly one most-recently-written node for primary.
      • A primary steps down to secondary when it can’t see a majority of replica set votes.
      • On set reintegration, unreplicated data on old primaries is rolled back to offline storage (e.g., for manual intervention).
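The majority rule on this slide can be sketched as a small plain-JavaScript model. It is a simplification under stated assumptions, not the real election protocol: `hasMajority`, `electPrimary`, and the `{name, votes, lastWrite}` member shape are hypothetical, and breaking ties by name is an assumption added only to make the result deterministic.

```javascript
// Toy model of the majority rule. Each member is {name, votes, lastWrite}.
// All function and field names are hypothetical.

// True when the visible members hold a strict majority of all votes.
function hasMajority(visibleMembers, allMembers) {
  const total = allMembers.reduce((sum, m) => sum + m.votes, 0);
  const visible = visibleMembers.reduce((sum, m) => sum + m.votes, 0);
  return visible * 2 > total;
}

// Returns the name of the member to elect, or null if no election can happen.
function electPrimary(visibleMembers, allMembers) {
  if (!hasMajority(visibleMembers, allMembers)) return null;
  // Prefer the most recently written member; ties broken by name (assumption).
  const sorted = [...visibleMembers].sort(
    (x, y) => y.lastWrite - x.lastWrite || x.name.localeCompare(y.name)
  );
  return sorted[0].name;
}

const a = { name: "A", votes: 1, lastWrite: 10 }; // old primary, now partitioned
const b = { name: "B", votes: 1, lastWrite: 12 };
const c = { name: "C", votes: 1, lastWrite: 11 };
const all = [a, b, c];

console.log(electPrimary([b, c], all)); // prints "B"
console.log(hasMajority([a], all));     // prints "false": A would step down
```

Because only one side of a partition can hold a strict majority, at most one side can elect a primary, which is the invariant the slide describes.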
  • 7. getLastError()
      • Data manipulation operations are “fire and forget” by default; that is, they return immediately, and don’t wait for any server process.
      • The database command getLastError() is the interface for forcing operation synchrony:
            db.getLastError() // returns null for "no error",
                              // otherwise, a document containing
                              // an error message
  • 8. getLastError() and write replication
      • When running in a replicated configuration, getLastError() can also force data writes to replicating slaves:
            // write to 4 servers, timeout after 3 seconds
            db.getLastError({w: 4, wtimeout: 3000})
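To make the meaning of `w` and `wtimeout` concrete, here is a minimal plain-JavaScript model of the waiting rule. It does not talk to a server: `waitForReplication` and its return shape are hypothetical, the acknowledgement times are made-up inputs, and the model assumes that `w` counts the primary itself.

```javascript
// Toy model of the w / wtimeout rule. All names are hypothetical.
// ackTimesMs: for each server (primary included), how many milliseconds
// after the write that server has the data. Assumes w >= 1.
function waitForReplication(ackTimesMs, w, wtimeout) {
  const sorted = [...ackTimesMs].sort((x, y) => x - y);
  // The write is satisfied once the w-th fastest server has it.
  if (sorted.length >= w && sorted[w - 1] <= wtimeout) {
    return { ok: true, waitedMs: sorted[w - 1] };
  }
  return { ok: false, timedOut: true };
}

// Four servers, the slowest catches up after 900 ms: satisfied within 3 s.
console.log(waitForReplication([0, 40, 120, 900], 4, 3000).ok);  // prints "true"
// The slowest server needs 5 s: the 3 s timeout expires first.
console.log(waitForReplication([0, 40, 120, 5000], 4, 3000).ok); // prints "false"
```

A timeout in this model says only that the write was not confirmed on `w` servers in time; it does not say the write was lost.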
  • 9. getLastError() and drivers, deployments
      • All officially-supported MongoDB drivers have a SafeMode feature that implicitly invokes getLastError() after insert, update, and delete operations. This way, application programmers can control write replication separately from data manipulation logic.
      • Replica Sets support a getLastErrorDefaults setting, which is used whenever a client calls getLastError() without parameters. This way, application architects and operations staff can design a system whose write replication can be configured independently of application code, if desired.
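The defaulting rule for getLastErrorDefaults can be sketched in plain JavaScript as follows. The `effectiveOptions` function is hypothetical and only models the rule stated on the slide; the `settings.getLastErrorDefaults` placement in the sample configuration reflects my understanding of the replica set configuration document and should be checked against the documentation for your server version.

```javascript
// Hypothetical fragment of a replica set configuration (placement assumed).
const replicaSetConfig = {
  settings: { getLastErrorDefaults: { w: 2, wtimeout: 3000 } },
};

// Model of the rule: explicit parameters win, otherwise the set-wide
// defaults apply. The function name is hypothetical.
function effectiveOptions(callOptions, defaults) {
  const hasParams =
    callOptions !== undefined &&
    callOptions !== null &&
    Object.keys(callOptions).length > 0;
  return hasParams ? callOptions : defaults;
}

const defaults = replicaSetConfig.settings.getLastErrorDefaults;
console.log(effectiveOptions(undefined, defaults).w); // prints "2"
console.log(effectiveOptions({ w: 4, wtimeout: 3000 }, defaults).w); // prints "4"
```

Keeping the defaults in the set configuration lets operations staff tighten or relax replication guarantees without redeploying application code.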
  • 10. Backup strategies
      • MongoDB tools (mongoexport, mongodump)
      • More generic tools (fs snapshots, file copying commands)
      • Storage device features (SAN, EBS snapshots)
  • 11. MongoDB tools
      • MongoDB comes with a couple of pairs of tools for backups:
      • mongodump & mongorestore: produce/consume BSON dumps of database content. Good for making compact backups. Note that indexes are reconstructed on mongorestore.
      • mongoexport & mongoimport: produce/consume JSON/CSV text files of database content. Intended more for cross-software transfers (e.g., transferring data between MongoDB and a spreadsheet program), but can be used for backup/recovery.
  • 12. Backing up database files
      • MongoDB’s data files (under the --dbpath argument) can be backed up using any technique available for files:
      • File System/Volume Manager snapshots: some OSes’ file systems (ZFS, XFS, etc.) and some Volume Managers (e.g., LVM) support point-in-time snapshotting. These snapshots can serve as backups.
      • Plain ol’ file copying: you can just copy the database’s files around.
  • 13. Storage-layer backups
      • Some storage devices have snapshotting features; you can use these snapshots as backups.
      • Commercial SANs often have point-in-time block-level snapshotting.
      • Amazon’s EBS supports snapshotting (but they recommend unmounting the EBS volumes to quiesce the data).
  • 14. Locking the database for backups
      • All backup strategies can, in principle, be performed on a live (a.k.a. “hot”) database, but with varying levels of efficacy.
      • To ensure a clean backup, it’s recommended that you lock the database for the duration of your backup procedure.
            > use admin
            switched to db admin
            > db.runCommand({fsync:1,lock:1})
            // now use mongodump/snapshotting/etc., and then
            > db.$cmd.sys.unlock.findOne();
      • In general, this procedure is best performed on replicating secondaries, which don’t accept writes.
  • 15. Disaster Recovery
      • The general solution for recovering a failed server is as follows:
          1. Repair/replace any failed hardware or operating system layers (e.g., replace disks, provision new hosts or virtual machines, etc.).
          2. If step 1 completes quickly enough and its data directory is trustworthy (e.g., if the mongod was cleanly shut down, say, after a UPS-induced system halt), bring the mongod online and it will attempt to replay the replica set’s primary’s oplog.
          3. If the data directory is suspect, you can move it aside or delete it, and then
              1. Either bring up the mongod with an empty data directory, in which case it will clone the primary’s databases ...
              2. ... or else seed the mongod’s data directory with a recent snapshot or mongodump backup.
          4. The mongod will attempt to replay all the primary’s oplog records.
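The decision procedure on this slide can be sketched as a small plain-JavaScript function. It is only an illustration: `recoveryPlan` and its three boolean inputs are hypothetical, and treating a slow repair the same way as a suspect data directory is an assumption, since the slide does not spell out that case.

```javascript
// Sketch of the recovery decision tree. Names and inputs are hypothetical.
// Returns the ordered steps to take after hardware/OS repair (step 1).
function recoveryPlan({ repairedQuickly, dataDirTrustworthy, backupAvailable }) {
  // Step 2: existing data can be reused, so only the oplog gap is replayed.
  if (repairedQuickly && dataDirTrustworthy) {
    return ["start mongod on the existing data directory", "replay the primary's oplog"];
  }
  // Step 3: otherwise discard the data directory (assumption: this also
  // covers the case where the repair took too long).
  const steps = ["move aside or delete the data directory"];
  steps.push(
    backupAvailable
      ? "seed the data directory from a recent snapshot or mongodump backup"
      : "start mongod with an empty data directory to clone the primary"
  );
  // Step 4: in every case the mongod finishes by replaying the oplog.
  steps.push("replay the primary's oplog");
  return steps;
}

console.log(
  recoveryPlan({ repairedQuickly: false, dataDirTrustworthy: true, backupAvailable: true })
);
```

Writing the procedure down this way makes it easy to check that every branch ends with an oplog replay, which is the step that can fail if the oplog has rolled over.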
  • 16. Some aspects of disaster recovery
      • Cloning the primary can impose notable load on the primary, so it’s probably preferable to initialize a new secondary from a snapshot or a database dump.
      • If you operate in multiple data centers, it’s advisable to try to keep snapshots/database backups “nearby” in data center space to avoid having to transfer large amounts of data during disaster recovery events. For example, you might make periodic snapshots/backups of a secondary in each of your data centers, and use these for initializing new secondaries.
      • It can occur that the primary’s oplog “rolls over” before a recovering secondary catches up. See http://www.mongodb.org/display/DOCS/Halted+Replication for more details.
      • In general, avoiding a disaster is better than recovering from one. Employ monitoring tools!
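The oplog “roll over” problem can be illustrated with a toy capped log in plain JavaScript. This is a model, not MongoDB's code: `CappedOplog`, `canCatchUp`, and the integer timestamps are hypothetical, and the rule that a secondary must still find its last applied entry in the primary's oplog is an assumption of this sketch.

```javascript
// Toy fixed-size oplog that discards its oldest entry when full.
// All names are hypothetical.
class CappedOplog {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = []; // timestamps, oldest first
  }

  append(ts) {
    this.entries.push(ts);
    if (this.entries.length > this.capacity) {
      this.entries.shift(); // the oldest entry rolls off
    }
  }

  oldest() {
    return this.entries[0];
  }
}

// In this model a secondary can resume only if the entry it last applied
// is still present in the primary's oplog.
function canCatchUp(lastAppliedTs, oplog) {
  return oplog.entries.length > 0 && oplog.oldest() <= lastAppliedTs;
}

const oplog = new CappedOplog(3);
for (let ts = 1; ts <= 5; ts += 1) oplog.append(ts); // entries are now 3, 4, 5

console.log(canCatchUp(4, oplog)); // prints "true"
console.log(canCatchUp(2, oplog)); // prints "false": the oplog has rolled over
```

A secondary in the second situation has to be reinitialized from a snapshot, a dump, or a fresh clone, which is why nearby backups and monitoring of replication lag matter.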
