Replication Internals: The Life of a Write


Transcript

  • 1. Andy Schwerin Lead Engineer, MongoDB
  • 2. • Goals of Replication • Replication Architecture • A representative write
  • 3. • High availability for processing reads and writes – Automatic leader election • Support many network topologies – Tag sets • Accessible consistency model – Ordered operation log • Client can trade latency for durability – Write concern
  • 4. { ts: 4, op: "i", ns: "d.c", o: { _id: 10, name: "john" } } OPLOG …
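The oplog entry on slide 4 describes an insert (`op: "i"`) of one document into namespace `d.c`. A minimal sketch of how such an entry could be applied is shown below. It assumes a plain in-memory dictionary in place of a real database, and the function name `apply_oplog_entry` is hypothetical, not a MongoDB API.

```python
# Hypothetical in-memory stand-in for a database: namespace -> {_id: document}.
def apply_oplog_entry(collections, entry):
    """Apply one insert entry shaped like the one on slide 4."""
    if entry["op"] != "i":
        raise ValueError(f"unsupported op in this sketch: {entry['op']!r}")
    docs = collections.setdefault(entry["ns"], {})
    doc = entry["o"]
    # Documents are keyed by _id, so replaying the entry leaves the same state.
    docs[doc["_id"]] = dict(doc)
    return collections


entry = {"ts": 4, "op": "i", "ns": "d.c", "o": {"_id": 10, "name": "john"}}
collections = {}
apply_oplog_entry(collections, entry)
apply_oplog_entry(collections, entry)  # replaying is harmless in this sketch
```

Keying by `_id` is a choice made for this sketch so that applying the same entry twice has no extra effect.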
  • 5. PRIMARY OPLOG 4 SECONDARY OPLOG 8 9 SECONDARY OPLOG 4 5 When a secondary oplog is not a prefix of the primary oplog…
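The prefix condition on slide 5 can be checked mechanically. The sketch below treats each oplog as a list of timestamps; the timestamp values and both function names are illustrative assumptions, not values taken from the slide's diagram.

```python
def is_prefix(secondary_ts, primary_ts):
    """True when the secondary's entries are exactly the first entries of the primary's."""
    return secondary_ts == primary_ts[:len(secondary_ts)]


def shared_prefix_length(secondary_ts, primary_ts):
    """Number of leading entries the two oplogs have in common."""
    n = 0
    for s, p in zip(secondary_ts, primary_ts):
        if s != p:
            break
        n += 1
    return n


# Hypothetical timestamps for illustration only.
primary = [1, 2, 3, 4, 5]
behind = [1, 2, 3]           # a prefix: this secondary is merely lagging
diverged = [1, 2, 3, 8, 9]   # not a prefix: it holds entries the primary lacks
```

A lagging secondary only needs to fetch more entries, while a diverged one first has to deal with the entries past the shared prefix.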
  • 6. w:?
  • 7. w:1 Could lose write when primary disappears, without notification.
  • 8. w:majority Over half of nodes must fail to lose the write. And, an outside operator must intervene before new writes are
  • 9. w:all All nodes have the write before primary responds. But, cannot complete writes if any nodes are
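The three write concerns on slides 7 to 9 differ only in how many nodes must hold the write before the primary responds. A minimal sketch of that arithmetic follows; `required_acks` and `is_satisfied` are hypothetical helper names, and `"all"` is handled the way the slide describes it rather than as a documented server option.

```python
def required_acks(w, num_nodes):
    """How many nodes must have the write before the client is acknowledged."""
    if w == "majority":
        return num_nodes // 2 + 1   # strictly more than half
    if w == "all":
        return num_nodes
    if isinstance(w, int) and 1 <= w <= num_nodes:
        return w
    raise ValueError(f"unsupported write concern in this sketch: {w!r}")


def is_satisfied(w, num_nodes, acked):
    """True once enough nodes have confirmed the write."""
    return acked >= required_acks(w, num_nodes)
```

With three nodes, `w: 1` needs only the primary, `w: "majority"` needs two nodes, and `w: "all"` needs all three, which is why a single unavailable node blocks it.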
  • 10. OPLOG d.c OPLOG P TS:6 S1 TS:6 S2 TS:2 d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) 1. Fetch oplog entries 2. Apply to collections 3. Write to local oplog 4. Notify primary 5. Repeat
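The five steps on slide 10 can be sketched as one pass of a loop. Everything here is an in-memory simplification: the `Node` class, `sync_once`, and the `progress` dictionary are hypothetical names, and the real secondary runs these stages concurrently rather than in a single function.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.oplog = []         # ordered list of oplog entries
        self.collections = {}   # namespace -> {_id: document}

    def last_ts(self):
        return self.oplog[-1]["ts"] if self.oplog else 0


def sync_once(primary, secondary, progress):
    """One pass of steps 1-4; the caller repeats it (step 5)."""
    # 1. Fetch oplog entries newer than the last one we hold.
    since = secondary.last_ts()
    batch = [e for e in primary.oplog if e["ts"] > since]
    # 2. Apply to collections.
    for e in batch:
        secondary.collections.setdefault(e["ns"], {})[e["o"]["_id"]] = dict(e["o"])
    # 3. Write to local oplog.
    secondary.oplog.extend(batch)
    # 4. Notify primary of how far this secondary has progressed.
    progress[secondary.name] = secondary.last_ts()
    return len(batch)


def make_entry(ts):
    # Hypothetical entries; only the shape follows slide 4.
    return {"ts": ts, "op": "i", "ns": "d.c", "o": {"_id": ts, "name": f"doc{ts}"}}


p, s2 = Node("P"), Node("S2")
p.oplog = [make_entry(2), make_entry(4), make_entry(6)]
s2.oplog = [make_entry(2)]
progress = {}
fetched = sync_once(p, s2, progress)
```

After one pass in this example, `S2` has caught up from TS:2 to TS:6 and the primary's progress table records that.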
  • 11. OPLOG OBSERVER BATCH BATCH PREFETCH APPLIER BATCH x.y d.c d.c OPLOG d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) P TS:6 S1 TS:6 S2 TS:2
  • 12. OPLOG d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) d.c { ts: 4, op: "i", ns: "d.c", o: { _id: 10, name: "john" } } P TS:4 S1 TS:2 S2 TS:2
  • 13. OPLOG d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) OBSERVER BATCH d.c OPLOG P TS:6 S1 TS:2 S2 TS:2
  • 14. OPLOG d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) BATCH d.c OPLOG OBSERVER P TS:6 S1 TS:2 S2 TS:2
  • 15. OBSERVER BATCH BATCH PREFETCH OPLOG • Split batch into arbitrary work units • Assign work to prefetch threads • Entries processed in any order • All while admitting readers Allow readers
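The prefetch stage on slide 15 splits a batch into arbitrary units and hands them to threads, with no ordering requirement. A minimal sketch follows, assuming that "prefetching" an entry just means recording which document it touches. `prefetch_unit` is a hypothetical stand-in and does no real I/O.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_units(batch, unit_size):
    """Cut the batch into arbitrary fixed-size work units."""
    return [batch[i:i + unit_size] for i in range(0, len(batch), unit_size)]


def prefetch_unit(unit):
    # Hypothetical stand-in: report the (namespace, _id) keys this unit touches.
    return {(e["ns"], e["o"]["_id"]) for e in unit}


def prefetch(batch, unit_size=2, threads=4):
    """Process units on a thread pool; the result does not depend on order."""
    warmed = set()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for keys in pool.map(prefetch_unit, split_into_units(batch, unit_size)):
            warmed |= keys
    return warmed


# Hypothetical batch of five insert entries.
batch = [{"ts": t, "op": "i", "ns": "d.c", "o": {"_id": t}} for t in range(1, 6)]
warmed = prefetch(batch)
```

Because the result is a set, it is the same whichever unit finishes first, which mirrors the slide's point that entries may be processed in any order at this stage.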
  • 16. OBSERVER BATCH BATCH PREFETCH OPLOG BATCH x.y d.c APPLIER • Assign entries to workers by target collection • Disable schema constraints Allow readers
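Assigning entries to workers by target collection, as on slide 16, amounts to grouping the batch by namespace. The sketch below assumes one work list per namespace; `assign_by_collection` is a hypothetical name, and the example entries are invented.

```python
from collections import defaultdict


def assign_by_collection(batch):
    """Group entries by namespace, keeping their original relative order."""
    work = defaultdict(list)
    for entry in batch:
        work[entry["ns"]].append(entry)
    return dict(work)


# Hypothetical batch touching the two collections named on the slide.
batch = [
    {"ts": 3, "op": "i", "ns": "x.y", "o": {"_id": 1}},
    {"ts": 4, "op": "i", "ns": "d.c", "o": {"_id": 10, "name": "john"}},
    {"ts": 5, "op": "i", "ns": "x.y", "o": {"_id": 2}},
]
work = assign_by_collection(batch)
```

Keeping all entries for one collection in a single list lets a worker apply them in timestamp order without coordinating with the other workers.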
  • 17. OBSERVER BATCH BATCH PREFETCH OPLOG BATCH x.y d.c APPLIER • Concurrency control excludes readers • Oplog entries applied in timestamp order Exclude readers
  • 18. OBSERVER BATCH BATCH PREFETCH OPLOG BATCH x.y d.c APPLIER • Concurrency control excludes readers • Oplog entries applied in timestamp order Exclude readers
  • 19. OBSERVER BATCH BATCH PREFETCH OPLOG BATCH x.y d.c APPLIER • Concurrency control excludes readers • Oplog entries applied in timestamp order Exclude readers
  • 20. OBSERVER BATCH BATCH PREFETCH APPLIER BATCH x.y d.c OPLOG • Readmit readers • Move entries from batch to oplog • Begin processing next batch Allow readers
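Slides 17 to 20 describe the apply phase: readers are excluded, entries are applied in timestamp order, readers are readmitted, and the batch is moved to the oplog. A minimal sketch follows, assuming a single `threading.Lock` as a stand-in for the server's concurrency control. The `Secondary` class and its methods are hypothetical.

```python
import threading


class Secondary:
    def __init__(self):
        self.lock = threading.Lock()  # stand-in for real concurrency control
        self.collections = {}         # namespace -> {_id: document}
        self.oplog = []

    def read(self, ns, _id):
        # Readers wait here while a batch is being applied.
        with self.lock:
            return self.collections.get(ns, {}).get(_id)

    def apply_batch(self, batch):
        ordered = sorted(batch, key=lambda e: e["ts"])
        with self.lock:               # exclude readers
            for e in ordered:         # apply in timestamp order
                self.collections.setdefault(e["ns"], {})[e["o"]["_id"]] = dict(e["o"])
        # Readers are readmitted; now move the entries from batch to oplog.
        self.oplog.extend(ordered)
        batch.clear()


node = Secondary()
batch = [
    {"ts": 6, "op": "i", "ns": "d.c", "o": {"_id": 10, "name": "john"}},
    {"ts": 5, "op": "i", "ns": "x.y", "o": {"_id": 1, "name": "ann"}},
]
node.apply_batch(batch)
```

The order of the last two steps follows the bullet order on slide 20; a plain mutex is the simplest way to show the exclusion, although it also serializes readers against each other.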
  • 21. OPLOG OBSERVER BATCH BATCH PREFETCH APPLIER BATCH x.y d.c d.c OPLOG Allow readers P TS:6 S1 TS:6 S2 TS:2
  • 22. OPLOG d.c P TS:6 S1 TS:6 S2 TS:2 • Consults list of waiting clients • Looks for those waiting for ts:6 or earlier on S1 • Sends acknowledgement!
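The final step on slide 22, where the primary consults its list of waiting clients, can be sketched as a scan over pending write concerns. The data shapes and the name `acknowledge` are assumptions for illustration, and only numeric `w` values are handled.

```python
def acknowledge(waiters, progress, primary_ts):
    """Return clients whose write concern is now met and drop them from waiters.

    waiters:  list of {"client": str, "ts": int, "w": int}
    progress: secondary name -> last oplog timestamp it reported
    """
    acked, still_waiting = [], []
    for waiter in waiters:
        # Count the primary itself plus every secondary at or past the write.
        have = 1 if primary_ts >= waiter["ts"] else 0
        have += sum(1 for ts in progress.values() if ts >= waiter["ts"])
        (acked if have >= waiter["w"] else still_waiting).append(waiter)
    waiters[:] = still_waiting
    return [waiter["client"] for waiter in acked]


# State from the slide: P at TS:6, S1 has just reported TS:6, S2 is at TS:2.
waiters = [
    {"client": "A", "ts": 6, "w": 2},   # the insert with w:2
    {"client": "B", "ts": 6, "w": 3},   # hypothetical client wanting all three nodes
]
done = acknowledge(waiters, {"S1": 6, "S2": 2}, primary_ts=6)
```

In this example client A is acknowledged as soon as S1 reaches TS:6, while client B keeps waiting for S2.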