Replication MongoDB Days 2013

Slides used in MongoDB Days 2013 presentations in North America


  1. #mongodbdays chicago
     Introduction to Replication and Replica Sets
     J. Randall Hunt, Hackathoner, MongoDB
  2. Notes to the presenter
     •  Themes for this presentation:
        –  Balance between cost and redundancy.
        –  Cover the many scenarios which replication solves, and why.
        –  The secret sauce of avoiding downtime and data loss.
     •  If time is short, skip the "Behind the Curtain" section.
     •  If time is very short, skip Operational Considerations as well.
  3. Why Replication?
     •  How many have faced node failures?
     •  How many have been woken up in the night to perform a failover?
     •  How many have experienced issues due to network latency?
     •  Different uses for data
        –  Normal processing
        –  Simple analytics
  4. Agenda
     •  Replica Set Lifecycle
     •  Developing with Replica Sets
     •  Operational Considerations
  5. Replica Set Lifecycle
  6. Replica Set – Creation
     [diagram: three fresh nodes, Node 1, Node 2, and Node 3, not yet initialized]
  7. Replica Set – Initialize
     [diagram: Node 3 becomes Primary; Node 1 and Node 2 become Secondaries; heartbeats run between all members and the Primary replicates to each Secondary]
  8. Replica Set – Failure
     [diagram: the Primary (Node 3) goes down; Secondaries Node 1 and Node 2 notice the missed heartbeats and hold a primary election]
  9. Replica Set – Failover
     [diagram: Node 2 is elected Primary; Node 1 continues as a Secondary, replicating from it; Node 3 is still down]
  10. Replica Set – Recovery
      [diagram: Node 3 comes back, enters recovery, and resyncs from the new Primary (Node 2); Node 1 keeps replicating as a Secondary]
  11. Replica Set – Recovered
      [diagram: Node 3 has caught up and rejoined as a Secondary; Node 2 remains Primary, replicating to both Secondaries]
  12. Replica Set Roles & Configuration
  13. Replica Set Roles
      [diagram: Node 1 is a Secondary, Node 2 is an Arbiter, Node 3 is the Primary; heartbeats run between all members and the Primary replicates to the Secondary]
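      The slides show the arbiter role but not its configuration. A minimal
      sketch, with an illustrative host name; an arbiter votes in elections
      but holds no data:

        > rs.addArb("palpatine.deathstar.mil:27017")
        // or, equivalently, as a member entry in the config document:
        { _id: 2, host: 'palpatine.deathstar.mil', arbiterOnly: true }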
  14. Configuration Options
      > conf = {
          _id: "empire",
          members: [
            { _id: 0, host: 'vader.deathstar.mil' },
            { _id: 1, host: 'tarkin.deathstar.mil' },
            { _id: 2, host: 'palpatine.deathstar.mil' }
          ]
        }
      > rs.initiate(conf)
  15. Configuration Options
      > conf = {
          _id: "empire",
          members: [
            // Primary DC: higher priorities keep the primary here
            { _id: 0, host: 'vader.deathstar.mil', priority: 3 },
            { _id: 1, host: 'tarkin.deathstar.mil', priority: 2 },
            { _id: 2, host: 'palpatine.deathstar.mil' }
          ]
        }
      > rs.initiate(conf)
  16. Configuration Options
      > conf = {
          _id: "empire",
          members: [
            { _id: 0, host: 'vader.deathstar.mil', priority: 3 },
            { _id: 1, host: 'tarkin.deathstar.mil', priority: 2 },
            { _id: 2, host: 'palpatine.deathstar.mil' },
            // Analytics node: hidden from clients (hidden members need priority 0)
            { _id: 3, host: 'marajade.empire.mil', priority: 0, hidden: true }
          ]
        }
      > rs.initiate(conf)
  17. Configuration Options
      > conf = {
          _id: "myset",
          members: [
            { _id: 0, host: 'vader.deathstar.mil', priority: 3 },
            { _id: 1, host: 'tarkin.deathstar.mil', priority: 2 },
            { _id: 2, host: 'palpatine.deathstar.mil' },
            { _id: 3, host: 'marajade.empire.mil', priority: 0, hidden: true },
            // Backup node: hidden, applies writes 3600 seconds (1 hour) late
            { _id: 4, host: 'thrawn.empire.mil', priority: 0, hidden: true, slaveDelay: 3600 }
          ]
        }
      > rs.initiate(conf)
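      A running set's configuration can also be changed after the fact with
      rs.reconfig(). A minimal sketch against the set above (the priority
      change is illustrative):

        > cfg = rs.conf()              // fetch the current configuration
        > cfg.members[1].priority = 1  // demote tarkin below vader
        > rs.reconfig(cfg)             // apply; may trigger an election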
  18. Developing with Replica Sets
  19. Strong Consistency
      [diagram: the client application's driver sends both writes and reads to the Primary; the Secondaries replicate in the background]
  20. Delayed Consistency
      [diagram: the driver sends writes to the Primary but reads from the Secondaries, which may lag behind the Primary]
  21. Write Concern
      •  Network acknowledgement
      •  Wait for error
      •  Wait for journal sync
      •  Wait for replication
  22. Unacknowledged
      [diagram: the driver sends the write and the Primary applies it in memory; no acknowledgement is returned]
  23. MongoDB Acknowledged (wait for error)
      [diagram: the driver sends the write, the Primary applies it in memory, and getLastError reports whether it succeeded]
  24. Wait for Journal Sync (j: true)
      [diagram: the Primary applies the write in memory and writes it to the journal before getLastError returns]
  25. Wait for Replication (w: 2)
      [diagram: the Primary applies the write in memory and replicates it to a Secondary before getLastError returns]
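      A minimal shell sketch of these levels, using getLastError as it worked
      in this era (the collection name is illustrative; modern drivers wrap
      the same options in a write-concern document):

        > db.starships.insert({ name: "Executor" })    // by itself: network ack only
        > db.runCommand({ getLastError: 1 })           // wait for error
        > db.runCommand({ getLastError: 1, j: true })  // wait for journal sync
        > db.runCommand({ getLastError: 1, w: 2, wtimeout: 5000 })  // wait for replication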
  26. Tagging
      •  Control where data is written to, and read from
      •  Each member can have one or more tags
         –  tags: {dc: "ny"}
         –  tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}
      •  Replica set defines rules for write concerns
      •  Rules can change without changing app code
  27. Tagging Example
      {
        _id : "mySet",
        members : [
          { _id : 0, host : "frodo",   tags : {"dc": "ny"} },
          { _id : 1, host : "sam",     tags : {"dc": "ny"} },
          { _id : 2, host : "merry",   tags : {"dc": "sf"} },
          { _id : 3, host : "pippin",  tags : {"dc": "sf"} },
          { _id : 4, host : "gandalf", tags : {"dc": "cloud"} }
        ],
        settings : {
          getLastErrorModes : {
            allDCs  : {"dc" : 3},
            someDCs : {"dc" : 2}
          }
        }
      }
      > db.blogs.insert({...})
      > db.runCommand({getLastError : 1, w : "someDCs"})
  28. Wait for Replication (Tagging)
      [diagram: with w: "allDCs", the Primary (SF) applies the write in memory and replicates it to the Secondary (NY) and the Secondary (Cloud) before getLastError returns]
  29. Read Preference Modes
      •  5 modes
         –  primary (only), the default
         –  primaryPreferred
         –  secondary
         –  secondaryPreferred
         –  nearest
      When more than one node is eligible, the closest node is used for reads (all modes but primary).
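      A minimal sketch of picking a mode from the shell (drivers expose the
      same five modes through their own APIs):

        > db.getMongo().setReadPref("secondaryPreferred")  // connection-wide
        > db.blogs.find().readPref("nearest")              // per query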
  30. Tagged Read Preference
      •  Custom read preferences
      •  Control where you read from by (node) tags
         –  E.g. { "disk": "ssd", "use": "reporting" }
      •  Use in conjunction with standard read preferences
         –  Except primary
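      A minimal sketch combining a mode with tag sets (the tags assume the
      members were configured with matching tags, as in the tagging example):

        > db.getMongo().setReadPref("secondaryPreferred", [
            { "disk": "ssd", "use": "reporting" },  // try members tagged like this first
            {}                                      // then fall back to any eligible member
          ])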
  31. Operational Considerations
  32. Maintenance and Upgrade
      •  No downtime
      •  Rolling upgrade/maintenance
         –  Start with the Secondaries
         –  Primary last
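      Once the Secondaries are done, the Primary is asked to step down so it
      can be maintained as a Secondary. A minimal sketch:

        > rs.stepDown(60)  // primary yields for 60 seconds; the set elects a new primary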
  33. Replica Set – 1 Data Center
      [diagram: Member 1, Member 2, and Member 3 in a single datacenter]
      •  Single datacenter
      •  Single switch & power
      •  Points of failure:
         –  Power
         –  Network
         –  Data center
         –  Two-node failure
      •  Automatic recovery from a single-node crash
  34. Replica Set – 2 Data Centers
      [diagram: Member 1 and Member 2 in Datacenter 1, Member 3 in Datacenter 2]
      •  Multi data center
      •  DR node for safety
      •  Can't do a durable multi-data-center write safely, since there is only one node in the remote DC
  35. Replica Set – 3 Data Centers
      [diagram: Member 1 and Member 2 in Datacenter 1, Member 3 and Member 4 in Datacenter 2, Member 5 in Datacenter 3]
      •  Three data centers
      •  Can survive the loss of a full data center
      •  Can do w: { dc: 2 } to guarantee a write in 2 data centers (with tags)
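      A minimal sketch of the tag rule behind that guarantee, assuming every
      member carries a dc tag as in the tagging example (the mode name is
      illustrative):

        settings : {
          getLastErrorModes : { twoDCs : { "dc" : 2 } }  // ack from 2 distinct dc values
        }
        > db.runCommand({ getLastError : 1, w : "twoDCs" })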
  36. Behind the Curtain
  37. Implementation details
      •  Heartbeat every 2 seconds
         –  Times out in 10 seconds
      •  Local DB (not replicated)
         –  system.replset
         –  oplog.rs
            •  Capped collection
            •  Idempotent version of each operation stored
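      The oplog can be inspected directly from the shell. A minimal sketch
      (field values will differ on any real deployment):

        > use local
        > db.oplog.rs.find().sort({ $natural : -1 }).limit(1)  // newest entry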
  38. Op(erations) Log is idempotent
      > db.replsettest.insert({_id: 1, value: 1})
      { "ts" : Timestamp(1350539727000, 1), "h" : NumberLong("6375186941486301201"),
        "op" : "i", "ns" : "test.replsettest", "o" : { "_id" : 1, "value" : 1 } }
      > db.replsettest.update({_id: 1}, {$inc: {value: 10}})
      { "ts" : Timestamp(1350539786000, 1), "h" : NumberLong("5484673652472424968"),
        "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 },
        "o" : { "$set" : { "value" : 11 } } }
      Note how the $inc is stored as a $set of the resulting value, so replaying the entry gives the same result.
  39. A single operation can have many entries
      > db.replsettest.update({}, {$set: {name: "foo"}}, false, true)
      { "ts" : Timestamp(1350540395000, 1), "h" : NumberLong("-4727576249368135876"),
        "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 2 },
        "o" : { "$set" : { "name" : "foo" } } }
      { "ts" : Timestamp(1350540395000, 2), "h" : NumberLong("-7292949613259260138"),
        "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 3 },
        "o" : { "$set" : { "name" : "foo" } } }
      { "ts" : Timestamp(1350540395000, 3), "h" : NumberLong("-1888768148831990635"),
        "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 },
        "o" : { "$set" : { "name" : "foo" } } }
      The multi-update touched three documents, so it produced one oplog entry per document.
  40. Recent improvements
      •  Read preference support with sharding
         –  Drivers too
      •  Improved replication over WAN/high-latency networks
      •  rs.syncFrom command
      •  buildIndexes setting
      •  replIndexPrefetch setting
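      A minimal sketch of rs.syncFrom, run on the member whose sync source
      should change (the host name is illustrative):

        > rs.syncFrom("palpatine.deathstar.mil:27017")  // replicate from this member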
  41. Just Use It
      •  Use replica sets
      •  Easy to set up
         –  Try it on a single machine
      •  Check the docs page for replica set tutorials
         –  http://docs.mongodb.org/manual/replication/#tutorials
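      A minimal single-machine sketch, assuming three mongod processes are
      already running locally on ports 27017 through 27019 (set name and
      ports are illustrative):

        > conf = {
            _id: "testset",
            members: [
              { _id: 0, host: "localhost:27017" },
              { _id: 1, host: "localhost:27018" },
              { _id: 2, host: "localhost:27019" }
            ]
          }
        > rs.initiate(conf)
        > rs.status()  // watch the members come up and elect a primary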
  42. Questions?
  43. #mongodbdays seattle
      Thank You
      J. Randall Hunt, Hackathoner, MongoDB
      @jrhunt, randall@mongodb.com
