
Replication and replica sets



My talk on replication from MongoSeattle December 1st, 2011

Published in: Technology, Sports


  1. Chris Westin, Software Engineer, 10gen. © Copyright 2010 10gen Inc.
  2. What is Replication for?
     • High availability
       • If a node fails, another node can step in
       • Extra copies of data for recovery
     • Scaling reads
       • Applications with high read rates can read from replicas
  3. What Does Replication Look Like?
     • Replica Set
       • A set of mongod servers
       • Minimum of 3
       • Can use “arbiters”
       • Consensus election of a “primary”
       • All writes go to primary
       • “Secondaries” replicate from primary
  4. Configuring a Replica Set
     • Start mongod processes with --replSet <name>
     • Then:
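The configuration step that followed "Then:" on the slide is not captured in this transcript. As a rough sketch of what initiating a three-member set might look like, assuming a set name of `rs0` and placeholder host names (none of these values come from the talk), one could build a configuration document and pass it to the `rs.initiate()` shell helper listed on the next slide:

```javascript
// Hypothetical replica set configuration: the set name "rs0" and the
// host names are placeholders, not values from the talk.
var initialCfg = {
  _id: "rs0", // must match the name passed to --replSet
  members: [
    { _id: 0, host: "db1.example.net:27017" },
    { _id: 1, host: "db2.example.net:27017" },
    { _id: 2, host: "db3.example.net:27017" }
  ]
};

// rs is defined only inside the mongo shell; the guard lets this
// snippet load elsewhere without error.
if (typeof rs !== "undefined") {
  rs.initiate(initialCfg);
}
```

The call would be run once, from a shell connected to one of the members that was started with `--replSet rs0`.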
  5. Managing a Replica Set
     • rs.conf()
       • Shell helper: get current configuration
     • rs.initiate([<cfg>]);
       • Shell helper: initiate replica set
     • rs.add("hostname:<port>")
       • Shell helper: add a new member
     • rs.reconfig(<cfg>)
       • Shell helper: reconfigure a replica set
     • rs.remove("hostname:<port>")
       • Shell helper: remove a member
  6. Some Administrative Commands
     • rs.status()
       • Reports status of the replica set from one node’s point of view
     • rs.stepDown(<secs>)
       • Request the primary to step down
     • rs.freeze(<secs>)
       • Prevents any changes to the current replica set configuration (primary/secondary status)
       • Use during backups
  7. How Does it Work?
     • Change operations are written to the oplog
       • Changes are described in an idempotent form
       • They are safe to apply more than once!
       • The oplog is in the “local” database
     • Secondaries periodically query the primary’s oplog and apply what they find
     • Change timestamps are in local server time
       • Keep time skew at a minimum using NTP to avoid pauses during failover
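To illustrate why an idempotent form makes oplog entries safe to apply more than once, here is a small simulation. It is only a sketch: the entry shape and the helpers `toOplogEntry` and `applyEntry` are invented for illustration and are not MongoDB's actual oplog format. The idea is that a request such as "add 5 to qty" is recorded as its outcome, "set qty to 15", so replaying the entry leaves the document unchanged.

```javascript
// Simplified illustration, not MongoDB's actual oplog format.
// Record an increment as the resulting value rather than as a delta.
function toOplogEntry(doc, field, increment) {
  var set = {};
  set[field] = (doc[field] || 0) + increment;
  return { op: "u", match: { _id: doc._id }, set: set };
}

// Apply an entry to a copy of the document by overwriting the fields it sets.
function applyEntry(doc, entry) {
  var copy = {};
  var key;
  for (key in doc) { copy[key] = doc[key]; }
  for (key in entry.set) { copy[key] = entry.set[key]; }
  return copy;
}

var secondaryDoc = { _id: 1, qty: 10 };
var entry = toOplogEntry(secondaryDoc, "qty", 5);

var appliedOnce = applyEntry(secondaryDoc, entry);
var appliedTwice = applyEntry(appliedOnce, entry);
// appliedOnce.qty and appliedTwice.qty are both 15
```

Had the entry stored the delta instead, a second application would have produced 20.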
  8. A Few Words About the Oplog
     • The oplog is a capped collection
       • Must have enough space to allow new secondaries to catch up after copying from a primary
       • Must have enough space to cope with any applicable slaveDelay
       • The required oplog size depends on the level of activity
       • If necessary, the oplog can be resized
       • Or, use the first-time mongod startup option --oplogSize <MB> to choose size of replication log
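The sizing considerations above come down to simple arithmetic. The following is a back-of-the-envelope sketch only: it assumes a steady oplog write rate, and the function names, the safety factor, and the sample numbers are all made up for illustration.

```javascript
// How many hours of changes an oplog of the given size can hold,
// assuming the oplog grows at a steady rate.
function oplogWindowHours(oplogSizeMB, oplogMBPerHour) {
  if (oplogMBPerHour <= 0) {
    throw new Error("oplog write rate must be positive");
  }
  return oplogSizeMB / oplogMBPerHour;
}

// Rough size needed to cover an initial copy plus any configured
// slaveDelay, padded by a safety factor for bursts of activity.
function requiredOplogMB(oplogMBPerHour, catchUpHours, slaveDelayHours, safetyFactor) {
  return Math.ceil(oplogMBPerHour * (catchUpHours + slaveDelayHours) * safetyFactor);
}

var windowHours = oplogWindowHours(2048, 128);   // 2048 / 128 = 16 hours
var neededMB = requiredOplogMB(128, 6, 1, 2);    // 128 * (6 + 1) * 2 = 1792 MB
```

A value computed this way could then be passed as `--oplogSize <MB>` on first startup.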
  9. Adding More Replicas
     • You can add more replicas after your initial setup
       • Add an empty server
       • This will slowly copy documents and then apply any necessary oplog to look like the primary
       • Add a new server based on a recent backup
       • Begins applying oplog records as if the replica had temporarily been cut off from the primary
  10. Failover
      • Replica set members monitor each other via heartbeats (every 2 seconds)
      • If the primary can’t be reached, a new one is elected
        • The secondary with the most up-to-date oplog is chosen
        • If, after election, a secondary has changes not on the new primary, those are undone, and moved aside (changes saved to a BSON file)
        • If you require a guarantee, ensure data is written to a majority of the replica set
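The two ideas on this slide, picking the most up-to-date secondary and writing to a majority, can be sketched in a few lines. This is a deliberately simplified model: the real election protocol involves more than comparing a single number, and the `optime` field and function names here are illustrative assumptions.

```javascript
// Smallest number of members that forms a majority of the set.
function majority(memberCount) {
  return Math.floor(memberCount / 2) + 1;
}

// A write acknowledged by a majority survives the failure of any minority.
function isMajorityAcknowledged(memberCount, acknowledgements) {
  return acknowledgements >= majority(memberCount);
}

// Pick the secondary whose last applied operation is the most recent.
// optime is modeled as a plain number for simplicity.
function mostUpToDate(secondaries) {
  return secondaries.reduce(function (best, member) {
    return member.optime > best.optime ? member : best;
  });
}

var candidates = [
  { host: "db2.example.net:27017", optime: 1041 },
  { host: "db3.example.net:27017", optime: 1057 }
];
var winner = mostUpToDate(candidates); // the db3 entry, since 1057 > 1041
```

In this model, a write acknowledged by two of three members is already on whichever member wins, so it would not be rolled back.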
  11. Priority
      • Optional parameter to replica set member configuration
      • All other things being equal, the highest priority member wins the election for primary
        • Changes in secondaries’ relative lag, i.e., catching up to primary, can trigger an election
      • Zero priority: can never become primary
        • Use for remote DR, delayed slaves, backups, analytics sources
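As a sketch of how priorities might appear in a member list, the example below marks one member as a zero-priority, delayed disaster-recovery copy. The host names and values are placeholders; `priority` and `slaveDelay` (in seconds) are member configuration fields, and a member without an explicit priority is treated here as having the default of 1. The helper `eligibleForPrimary` is hypothetical and only models the "zero priority can never become primary" rule.

```javascript
// Placeholder member list; hosts and values are illustrative only.
var prioritizedMembers = [
  { _id: 0, host: "db1.example.net:27017", priority: 2 },
  { _id: 1, host: "db2.example.net:27017" },                              // default priority
  { _id: 2, host: "dr.example.net:27017", priority: 0, slaveDelay: 3600 } // delayed DR copy
];

// Members with priority 0 are never candidates for primary.
function eligibleForPrimary(members) {
  return members.filter(function (member) {
    var priority = member.priority === undefined ? 1 : member.priority;
    return priority > 0;
  });
}

var eligibleHosts = eligibleForPrimary(prioritizedMembers).map(function (member) {
  return member.host;
});
// eligibleHosts contains db1 and db2 but not the DR member
```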
  12. For Applications
      • getLastError( { w : … } )
        • Application blocks until changes are written to the specified number of servers
        • Defaults can be set in the replica set’s configuration
      • “Safe mode” for critical writes: setWriteConcern()
        • Another way to force writes to a number of servers
      • Drivers support “slaveOk” for sending queries to a secondary
        • This is for scaling reads
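A minimal sketch of the command document behind `getLastError`, assuming a write should reach two servers and give up waiting after five seconds. The helper name `getLastErrorCommand` and the specific values are illustrative; the command is sent only when the snippet runs inside the mongo shell, where `db` is defined.

```javascript
// Build the command document; the command name must be the first key.
function getLastErrorCommand(w, wtimeoutMs) {
  var cmd = { getlasterror: 1, w: w };
  if (wtimeoutMs !== undefined) {
    cmd.wtimeout = wtimeoutMs; // milliseconds to wait for replication
  }
  return cmd;
}

var waitForTwo = getLastErrorCommand(2, 5000);

// db exists only in the mongo shell; issue the command right after a write.
if (typeof db !== "undefined" && typeof db.runCommand === "function") {
  printjson(db.runCommand(waitForTwo));
}
```

Without a `wtimeout`, the call can block indefinitely if too few servers are reachable, so setting one is usually worth considering.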
  13. Replication and Sharding
      • Each shard is its own replica set
      • Drivers use a mongos process to route queries to the appropriate shard(s)
      • Configuration servers maintain the shard key range metadata
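To make the routing idea concrete, here is a toy model of shard key range metadata and the lookup a router performs. It is a sketch under invented assumptions: the range table, shard names, and both functions are hypothetical and do not reflect how mongos or the configuration servers store their data.

```javascript
// Hypothetical range metadata: each range covers [min, max) on the shard key.
var ranges = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardC" }
];

// Route a single shard key value to the shard that owns it.
function shardFor(key) {
  for (var i = 0; i < ranges.length; i++) {
    if (key >= ranges[i].min && key < ranges[i].max) {
      return ranges[i].shard;
    }
  }
  return null;
}

// Route a range query (low <= key < high) to every shard it overlaps.
function shardsForRange(low, high) {
  return ranges
    .filter(function (range) { return range.min < high && range.max > low; })
    .map(function (range) { return range.shard; });
}

var single = shardFor(1000);              // "shardB"
var several = shardsForRange(900, 1100);  // ["shardA", "shardB"]
```

Each shard name here would stand for a whole replica set, so a routed write still goes to that set's primary.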
  14. Replication and Sharding
  15. Data Center Awareness
      • Tag nodes in replica set configuration
        • Apply hierarchical labels to replica set members
        • Define getLastError() modes
        • Can require number of nodes writes must go to
        • Can require locations of nodes writes must go to
        • Combinations
        • Available in 1.9.1
  16. Tagging Example
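The example shown on this slide is not captured in the transcript. The sketch below is a stand-in, not the speaker's example: it assumes two data centers labeled with a `dc` tag and a custom mode named `multiDC` that requires a write to reach members in two distinct data centers. The host names, tag values, and the helper `satisfiesMode` are illustrative; `tags` and `settings.getLastErrorModes` are the configuration fields used for tagging.

```javascript
// Placeholder configuration with per-member tags and one custom mode.
var taggedCfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "ny1.example.net:27017", tags: { dc: "ny" } },
    { _id: 1, host: "ny2.example.net:27017", tags: { dc: "ny" } },
    { _id: 2, host: "sf1.example.net:27017", tags: { dc: "sf" } }
  ],
  settings: {
    // "multiDC": the write must reach 2 distinct values of the "dc" tag
    getLastErrorModes: { multiDC: { dc: 2 } }
  }
};

// Model of the rule: for every tag in the mode, count the distinct tag
// values among the acknowledging members and compare with the requirement.
function satisfiesMode(mode, acknowledgingMembers) {
  for (var tag in mode) {
    var seen = {};
    var distinct = 0;
    acknowledgingMembers.forEach(function (member) {
      var value = member.tags && member.tags[tag];
      if (value !== undefined && !seen[value]) {
        seen[value] = true;
        distinct += 1;
      }
    });
    if (distinct < mode[tag]) {
      return false;
    }
  }
  return true;
}

var multiDC = taggedCfg.settings.getLastErrorModes.multiDC;
var nyOnly = satisfiesMode(multiDC, [taggedCfg.members[0], taggedCfg.members[1]]);  // false
var nyAndSf = satisfiesMode(multiDC, [taggedCfg.members[0], taggedCfg.members[2]]); // true
```

An application would then name the mode in its write check, for example `getLastError` with `w: "multiDC"`.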
  17. Documentation
      • ca+Sets
        • Index of documents on replication topics