Keeping data-safe-webinar-2010-11-01

Keeping your data safe

Richard M Kreuter
10gen Inc.
richard@10gen.com

November 1, 2010

Keeping your data safe — webinar

Aspects of data safety

Replication
Cross-data-center replication
Application-controlled replication
Backup
Disaster recovery


Replication

MongoDB supports automatic replication (data mirroring)
Recommended for failover, durability, backups (essentially all
deployments).
Works well over wide area networks.
Also good for horizontal read scaling: clients can conditionally
read from any of a number of slaves.


Replication Overview

MongoDB’s replication is similar to that of many other databases.
Writes are accepted only by a Primary-mode (master,
writable) mongod.
Writes are recorded in a normalized format in the operation
log (the oplog).
Secondary-mode (slave, read-only) mongods periodically query
the oplog and apply operations.
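To make the flow above concrete, here is a minimal in-memory sketch of oplog-style replication. It is an illustration only, not MongoDB's implementation: the helper names (makePrimary, makeSecondary, write, sync) and the integer timestamps are assumptions made for this example.

```javascript
// Simplified model: a primary with an oplog, and a secondary that polls it.
// All names here are hypothetical; real mongods do far more than this.
function makePrimary() {
  return { data: {}, oplog: [] };
}

function makeSecondary() {
  return { data: {}, lastApplied: 0 };
}

// The primary applies the write locally and records it in its oplog.
function write(primary, key, value) {
  primary.data[key] = value;
  primary.oplog.push({ ts: primary.oplog.length + 1, key: key, value: value });
}

// The secondary asks for entries newer than the last one it applied,
// then applies them in order. Returns how many operations were applied.
function sync(secondary, primary) {
  const pending = primary.oplog.filter(function (op) {
    return op.ts > secondary.lastApplied;
  });
  pending.forEach(function (op) {
    secondary.data[op.key] = op.value;
    secondary.lastApplied = op.ts;
  });
  return pending.length;
}

const primary = makePrimary();
const secondary = makeSecondary();
write(primary, "a", 1);
write(primary, "b", 2);
sync(secondary, primary); // secondary now holds the same two keys
```

Calling sync() again without new writes applies nothing, which is the "periodically query" behavior in miniature.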


Replica set replication

[Figure: two diagrams. First: a master (write server) with three slaves (read replicas). Second, after failover: the old master, two slaves (read replicas), and a new master.]


Replica Set Failover and Invariants

Replicating mongods track replica set membership.
If secondaries can’t see the master, but can see a majority of
replica set votes, an election is induced.
Election selects exactly one most-recently-written node for
primary.
A primary steps down to secondary when it can’t see a
majority of replica set votes.
On set reintegration, unreplicated data on old primaries is
rolled back to offline storage (e.g., for manual intervention).
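The majority rules above can be sketched as a small decision function. This is a simplified model under stated assumptions: the node shape ({ isPrimary, canSeePrimary, visibleVotes }) and the function names are hypothetical, visibleVotes is assumed to include the node's own vote, and the election itself is not modeled.

```javascript
// True when the visible votes form a strict majority of the set's votes.
function seesMajority(visibleVotes, totalVotes) {
  return visibleVotes > totalVotes / 2;
}

// Decide a node's next action according to the rules on the slide.
// `node` is a hypothetical shape: { isPrimary, canSeePrimary, visibleVotes }.
function nextAction(node, totalVotes) {
  const majority = seesMajority(node.visibleVotes, totalVotes);
  if (node.isPrimary) {
    // A primary that loses sight of a majority steps down.
    return majority ? "stay primary" : "step down";
  }
  if (!node.canSeePrimary && majority) {
    // Secondaries that lose the master but still see a majority elect a new one.
    return "call election";
  }
  return "stay secondary";
}
```

Under these assumptions, a secondary that sees 2 of 3 votes and no primary calls an election, while with 2 of 4 votes it does not, because half is not a majority.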


getLastError()

Data manipulation operations are “fire and forget” by default; that
is, they return immediately, without waiting for a response from the server.
The database command getLastError() is the interface for
making an operation synchronous:

db.getLastError() // returns null for "no error",
// otherwise, a document containing
// an error message


getLastError() and write replication

When running in a replicated configuration, getLastError() can
also wait until a write has reached replicating slaves:

// write to 4 servers, timeout after 3 seconds
db.getLastError({w: 4, wtimeout: 3000})
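As a rough illustration of what the w parameter asks for, the sketch below checks whether an operation has reached at least w servers. It is a model, not driver or server code: the function name replicatedTo, the positions array (each server's latest applied operation timestamp), and the choice to count the primary as one of the servers are assumptions for this example. The wtimeout behavior is not modeled.

```javascript
// Returns true when at least `w` servers have applied the operation.
// `positions` is a hypothetical list of each server's latest applied
// operation timestamp; in this sketch the primary is one of the entries.
function replicatedTo(opTs, positions, w) {
  const count = positions.filter(function (ts) {
    return ts >= opTs;
  }).length;
  return count >= w;
}

// Example: operation 5 has reached two of four servers so far.
const positions = [5, 5, 4, 3];
const okForTwo = replicatedTo(5, positions, 2);  // enough servers for w: 2
const okForFour = replicatedTo(5, positions, 4); // not yet enough for w: 4
```

A call with w: 4 would keep waiting in this situation until the remaining servers catch up or the timeout expires.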


getLastError() and drivers, deployments

All officially supported MongoDB drivers have a SafeMode feature
that implicitly invokes getLastError() after insert, update, and
delete operations. This way, application programmers have
control over write replication separately from data manipulation
logic.
Replica Sets support a getLastErrorDefaults setting, whose values are
used whenever a client calls getLastError() without parameters.
This way, application architects and operations staff can design a
system whose write replication can be configured independently of
application code, if desired.
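The fallback to getLastErrorDefaults can be pictured with the sketch below. The settings object and its values are made up for illustration, and effectiveOptions is a hypothetical helper, not a MongoDB or driver function; it only mirrors the rule stated above (defaults apply when the client passes no parameters).

```javascript
// Hypothetical replica set settings; the values are for illustration only.
const replicaSetSettings = {
  getLastErrorDefaults: { w: 2, wtimeout: 3000 }
};

// Pick the options a getLastError() call would effectively run with:
// the client's own parameters if it passed any, otherwise the set's defaults.
function effectiveOptions(clientOptions, settings) {
  const hasParams = clientOptions && Object.keys(clientOptions).length > 0;
  if (hasParams) {
    return clientOptions;
  }
  return settings.getLastErrorDefaults || {};
}

const fromDefaults = effectiveOptions({}, replicaSetSettings);      // uses the defaults
const fromClient = effectiveOptions({ w: 4 }, replicaSetSettings);  // client value wins
```

Operations staff can then change the defaults in one place without touching application code.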


Backup strategies

MongoDB tools (mongoexport, mongodump)
More generic tools (fs snapshots, file copying commands)
Storage device features (SAN, EBS snapshots)


MongoDB tools

MongoDB comes with two pairs of tools for backups:
mongodump & mongorestore: produce/consume BSON
dumps of database content. Good for making compact
backups. Note that indexes are reconstructed on
mongorestore.
mongoexport & mongoimport: produce/consume
JSON/CSV text files of database content. Intended more for
cross-software transfers (e.g., transferring data between
MongoDB and a spreadsheet program), but can be used for
backup/recovery.


Backing up database files

MongoDB’s data files (under the --dbpath argument) can be
backed up using any technique available for files:
File System/Volume Manager snapshots — some OSes’ file
systems (ZFS, XFS, etc.) and some Volume Managers (e.g.,
LVM) support point-in-time snapshotting. These snapshots
can serve as backups.
Plain ol’ file copying — you can just copy the database’s files
around.


Storage-layer backups

Some storage devices have snapshotting features; you can use
these snapshots as backups.
Commercial SANs often have point-in-time block-level
snapshotting.
Amazon’s EBS supports snapshotting (but they recommend
unmounting the EBS volumes to quiesce the data).


Locking the database for backups

All backup strategies can, in principle, be performed on a live
(a.k.a. “hot”) database, but with varying levels of efficacy. To
ensure a clean backup, it’s recommended that you lock the
database for the duration of your backup procedure.

> use admin
switched to db admin
> db.runCommand({fsync:1,lock:1})
// now use mongodump/snapshotting/etc., and then
> db.$cmd.sys.unlock.findOne();

In general, this procedure is best performed on replicating
secondaries, which don’t accept writes.
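One risk with the lock/backup/unlock sequence is forgetting to unlock when the backup step fails. The sketch below wraps the two shell calls shown above in a try/finally. The wrapper withLockedDatabase, the doBackup callback, and the makeFakeAdminDb stand-in are all hypothetical names introduced for this example; the stand-in only records calls so the sketch can run without a server.

```javascript
// Lock the database, run the backup step, and always unlock afterwards.
// `adminDb` is assumed to be a handle on the admin database, as in the
// shell session above; `doBackup` is a hypothetical callback that would
// run mongodump, take a snapshot, etc.
function withLockedDatabase(adminDb, doBackup) {
  adminDb.runCommand({ fsync: 1, lock: 1 }); // flush to disk and block writes
  try {
    return doBackup();
  } finally {
    adminDb.$cmd.sys.unlock.findOne(); // release the lock even on failure
  }
}

// Stand-in for a real connection; it only records which calls were made.
function makeFakeAdminDb(log) {
  return {
    runCommand: function (cmd) { log.push("lock"); return { ok: 1 }; },
    $cmd: { sys: { unlock: { findOne: function () { log.push("unlock"); return { ok: 1 }; } } } }
  };
}

const calls = [];
withLockedDatabase(makeFakeAdminDb(calls), function () {
  calls.push("backup");
});
```

With the stand-in, the recorded order is lock, backup, unlock; if the callback throws, unlock still runs before the error propagates.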


Disaster Recovery

The general solution for recovering a failed server is as follows:
1 Repair/replace any failed hardware or operating system layers
  (e.g., replace disks, provision new hosts or virtual machines,
  etc.)
2 If step 1 completes quickly enough and its data directory is
  trustworthy (e.g., if the mongod was cleanly shut down, say,
  after a UPS-induced system halt), bring the mongod online
  and it will attempt to replay the replica set’s primary’s oplog.
3 If the data directory is suspect, you can move it aside or
  delete it, and then
    1 Either bring up the mongod with an empty data directory, in
      which case it will clone the primary’s databases ...
    2 ... or else seed the mongod’s data directory with a recent
      snapshot or mongodump backup.
4 The mongod will attempt to replay all the primary’s oplog
  records.
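The procedure above can be written down as a small decision function, which may be handy as a checklist generator in a runbook. This is only a sketch: recoveryPlan and its input fields (repairedQuickly, dataDirTrustworthy, haveRecentBackup) are hypothetical names, and treating a slow repair the same way as a suspect data directory is an assumption of this example, not something the steps above state.

```javascript
// Build the list of recovery steps for a failed server.
// `state` is a hypothetical shape:
//   { repairedQuickly, dataDirTrustworthy, haveRecentBackup }
function recoveryPlan(state) {
  const steps = ["repair or replace failed hardware or operating system layers"];
  if (state.repairedQuickly && state.dataDirTrustworthy) {
    steps.push("bring the mongod online with its existing data directory");
  } else {
    // Assumption: a slow repair is handled like a suspect data directory.
    steps.push("move aside or delete the data directory");
    if (state.haveRecentBackup) {
      steps.push("seed the data directory from a recent snapshot or mongodump backup");
    } else {
      steps.push("start the mongod with an empty data directory (clones the primary)");
    }
  }
  steps.push("let the mongod replay the primary's oplog");
  return steps;
}

const plan = recoveryPlan({
  repairedQuickly: true,
  dataDirTrustworthy: false,
  haveRecentBackup: true
});
```

Every plan starts with the repair step and ends with oplog replay; only the middle steps vary with the state of the data directory.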

Some aspects of disaster recovery

Cloning the primary can impose notable load on the primary,
so it’s probably preferable to initialize a new secondary from a
snapshot or a database dump.
If you operate in multiple data centers, it’s advisable to try to
keep snapshots/database backups “nearby” in data center
space to avoid having to transfer large amounts of data during
disaster recovery events. For example, you might make
periodic snapshots/backups of a secondary in each of your
data centers, and use these for initializing new secondaries.
It can occur that the primary’s oplog “rolls over” before a
recovering secondary catches up. See
http://www.mongodb.org/display/DOCS/Halted+Replication
for more details.
In general, avoiding a disaster is better than recovering from
one. Employ monitoring tools!
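The oplog "roll over" mentioned above can be illustrated with a tiny capped log. This is a simplified model with consecutive integer timestamps; makeCappedOplog, append, and canCatchUp are hypothetical helpers for this example, not MongoDB functions.

```javascript
// A fixed-capacity oplog: once full, the oldest entry is dropped for each new one.
function makeCappedOplog(capacity) {
  return { capacity: capacity, entries: [], nextTs: 1 };
}

function append(oplog, op) {
  oplog.entries.push({ ts: oplog.nextTs, op: op });
  oplog.nextTs += 1;
  if (oplog.entries.length > oplog.capacity) {
    oplog.entries.shift(); // the oplog "rolls over"
  }
}

// A secondary can resume only if the entry right after the last one it
// applied is still present in the primary's oplog.
function canCatchUp(oplog, lastAppliedTs) {
  if (oplog.entries.length === 0) {
    return true; // nothing to replay
  }
  return oplog.entries[0].ts <= lastAppliedTs + 1;
}

const oplog = makeCappedOplog(3);
["op1", "op2", "op3", "op4", "op5"].forEach(function (op) {
  append(oplog, op);
});
// The oplog now holds only the entries with ts 3, 4 and 5.
const recentSecondary = canCatchUp(oplog, 2); // next needed entry (ts 3) is still there
const staleSecondary = canCatchUp(oplog, 1);  // needs ts 2, which has rolled off
```

In this model a secondary that falls too far behind has to be reinitialized from a snapshot, dump, or full clone, which is one reason to monitor replication lag.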
