Replication Internals: The Life of a Write
 

Presentation Transcript

  • Andy Schwerin, Lead Engineer, MongoDB
  • Goals of Replication; Replication Architecture; A representative write
  • Goals of replication:
      • High availability for processing reads and writes – automatic leader election
      • Support for many network topologies – tag sets
      • Accessible consistency model – ordered operation log
      • Clients can trade latency for durability – write concern
  • OPLOG entry: { ts: 4, op: "i", ns: "d.c", o: { _id: 10, name: "john" } } …
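To make the shape of an oplog entry concrete, here is a minimal sketch of applying the insert entry from the slide to an in-memory stand-in for a database. The `collections` dictionary and the `apply_entry` helper are hypothetical names for illustration, not MongoDB APIs, and the sketch handles only the `"i"` (insert) op shown on the slide.

```python
# Hypothetical in-memory stand-in for a database: namespace -> {_id: document}.
collections = {}

def apply_entry(entry, collections):
    """Apply one oplog entry of the shape shown on the slide (inserts only)."""
    if entry["op"] != "i":
        raise ValueError(f"unsupported op in this sketch: {entry['op']!r}")
    docs = collections.setdefault(entry["ns"], {})
    doc = entry["o"]
    # Assumption of this sketch: an insert is stored as an upsert keyed by _id,
    # so applying the same entry twice leaves the same state.
    docs[doc["_id"]] = dict(doc)

entry = {"ts": 4, "op": "i", "ns": "d.c", "o": {"_id": 10, "name": "john"}}
apply_entry(entry, collections)
apply_entry(entry, collections)  # replaying the entry changes nothing
```

Keying by `_id` is a design choice of the sketch that keeps replay harmless; it is not a claim about how the server stores documents.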
  • When a secondary oplog is not a prefix of the primary oplog… (Diagram labels: PRIMARY OPLOG 4; SECONDARY OPLOG 8 9; SECONDARY OPLOG 4 5.)
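The prefix condition on this slide can be checked mechanically. The sketch below only detects the condition and finds where two logs stop agreeing; the slide trails off before saying what happens next, so no recovery step is shown. Entries are reduced to bare timestamps, the timestamp values are made up for illustration, and `common_prefix_len` and `is_prefix` are hypothetical helper names.

```python
def common_prefix_len(a, b):
    """Number of leading entries on which the two logs agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def is_prefix(secondary, primary):
    """True when every entry in the secondary log matches the start of the primary log."""
    return common_prefix_len(secondary, primary) == len(secondary)

# Illustrative timestamps only; these are not the values from the slide diagram.
primary = [1, 2, 3, 4]
behind = [1, 2, 3]        # a prefix: the secondary is merely lagging
diverged = [1, 2, 8, 9]   # not a prefix: the logs agree only on the first two entries
```

With these inputs, `is_prefix(behind, primary)` holds while `is_prefix(diverged, primary)` does not, and `common_prefix_len(diverged, primary)` gives the last point of agreement.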
  • w:?
  • w:1: The write can be lost, without notification, when the primary disappears.
  • w:majority: More than half of the nodes must fail for the write to be lost. And an outside operator must intervene before new writes are…
  • w:all: All nodes have the write before the primary responds. But writes cannot complete if any nodes are…
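The three write-concern slides differ in how many nodes must hold a write before the client hears back. As a small sketch of that arithmetic, the hypothetical helper `acks_required` maps the slide's notation (`1`, `"majority"`, `"all"`) to a node count for a replica set of `n_nodes` members; it is not a driver or server function.

```python
def acks_required(w, n_nodes):
    """Nodes (primary included) that must hold the write before the client is acknowledged."""
    if w == "majority":
        return n_nodes // 2 + 1   # strictly more than half
    if w == "all":
        return n_nodes
    if isinstance(w, int) and 1 <= w <= n_nodes:
        return w
    raise ValueError(f"unsupported write concern for {n_nodes} nodes: {w!r}")

three_node = {w: acks_required(w, 3) for w in (1, "majority", "all")}
```

For a three-node set this gives 1, 2, and 3 respectively; for a four-node set, `"majority"` needs 3.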
  • Diagram: client issues d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}); OPLOG, d.c; P TS:6, S1 TS:6, S2 TS:2.
      1. Fetch oplog entries
      2. Apply to collections
      3. Write to local oplog
      4. Notify primary
      5. Repeat
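The five-step loop on this slide can be sketched in a few lines. This is a single-threaded toy under stated assumptions: `Node`, `sync_once`, and the `progress` map are hypothetical names, timestamps are plain integers, fetching is a list scan rather than a network call, and only insert entries are handled.

```python
class Node:
    """Toy replica-set member: an oplog plus in-memory collections."""
    def __init__(self, name):
        self.name = name
        self.oplog = []         # entries in timestamp order
        self.collections = {}   # namespace -> {_id: document}
        self.progress = {}      # used on the primary: secondary name -> last ts reported

    def last_ts(self):
        return self.oplog[-1]["ts"] if self.oplog else 0

def sync_once(secondary, primary, batch_size=100):
    """Run steps 1-4 once; return how many entries were replicated."""
    # 1. Fetch oplog entries newer than what this secondary already has.
    newer = [e for e in primary.oplog if e["ts"] > secondary.last_ts()]
    batch = newer[:batch_size]
    # 2. Apply to collections (insert entries only in this sketch).
    for entry in batch:
        docs = secondary.collections.setdefault(entry["ns"], {})
        docs[entry["o"]["_id"]] = dict(entry["o"])
    # 3. Write to the local oplog.
    secondary.oplog.extend(batch)
    # 4. Notify the primary of the new position.
    if batch:
        primary.progress[secondary.name] = secondary.last_ts()
    return len(batch)

primary = Node("P")
primary.oplog = [
    {"ts": t, "op": "i", "ns": "d.c", "o": {"_id": t, "name": f"user{t}"}}
    for t in range(1, 7)
]
s1 = Node("S1")
while sync_once(s1, primary, batch_size=4):   # 5. Repeat until caught up.
    pass
```

After the loop, `s1` has reached timestamp 6 and the primary has recorded that position for `S1`, matching the `P TS:6, S1 TS:6` state in the diagram.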
  • Diagram labels: OPLOG, OBSERVER, BATCH, PREFETCH, APPLIER; collections x.y and d.c; client issues d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}); P TS:6, S1 TS:6, S2 TS:2.
  • Diagram: client issues d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}); collection d.c; OPLOG entry { ts: 4, op: "i", ns: "d.c", o: { _id: 10, name: "john" } }; P TS:4, S1 TS:2, S2 TS:2.
  • Diagram: client issues d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}); labels OPLOG, OBSERVER, BATCH, d.c; P TS:6, S1 TS:2, S2 TS:2.
  • Diagram labels: OBSERVER, BATCH, PREFETCH, OPLOG; state: allow readers.
      • Split batch into arbitrary work units
      • Assign work to prefetch threads
      • Entries processed in any order
      • All while admitting readers
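The prefetch bullets describe order-independent, parallel work. The sketch below illustrates that pattern with Python's `concurrent.futures.ThreadPoolExecutor`. It assumes that "prefetch" means warming a cache with the documents a batch is about to touch, which the slide does not spell out; `split_into_units`, `prefetch`, `storage`, and `cache` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_units(batch, n_units):
    """Split a batch into work units; the grouping is arbitrary (round-robin here)."""
    return [batch[i::n_units] for i in range(n_units)]

def prefetch(batch, storage, n_threads=4):
    """Warm a cache for every document the batch touches, in no particular order."""
    cache = {}

    def warm(unit):
        for entry in unit:
            key = (entry["ns"], entry["o"]["_id"])
            cache[key] = storage.get(key)   # None when the document does not exist yet

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(warm, split_into_units(batch, n_threads)))  # wait for all units
    return cache

batch = [{"ts": t, "op": "i", "ns": "d.c", "o": {"_id": t}} for t in range(1, 11)]
storage = {("d.c", 3): {"_id": 3, "name": "old"}}
cache = prefetch(batch, storage)
```

Because the work only reads from `storage`, the order in which units run does not affect the result, which is what lets this stage run while readers are admitted.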
  • Diagram labels: OBSERVER, BATCH, PREFETCH, OPLOG, APPLIER; collections x.y and d.c; state: allow readers.
      • Assign entries to workers by target collection
      • Disable schema constraints
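Assigning entries to workers by target collection can be sketched as a stable hash of the namespace. This is one plausible scheme rather than the server's actual one: `assign_by_collection` is a hypothetical helper, and `zlib.crc32` is used only because it is deterministic across runs.

```python
from zlib import crc32

def assign_by_collection(batch, n_workers):
    """Route each entry to a worker queue chosen by its namespace."""
    queues = [[] for _ in range(n_workers)]
    for entry in batch:   # the batch is already in timestamp order
        worker = crc32(entry["ns"].encode()) % n_workers
        queues[worker].append(entry)
    return queues

batch = [
    {"ts": 3, "ns": "x.y"},
    {"ts": 4, "ns": "d.c"},
    {"ts": 5, "ns": "x.y"},
    {"ts": 6, "ns": "d.c"},
]
queues = assign_by_collection(batch, n_workers=2)
```

Since every entry for a given collection lands in the same queue, and queues are filled in batch order, each collection still sees its own entries in timestamp order even though different collections may be applied in parallel.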
  • Diagram labels: OBSERVER, BATCH, PREFETCH, OPLOG, APPLIER; collections x.y and d.c; state: exclude readers.
      • Concurrency control excludes readers
      • Oplog entries applied in timestamp order
  • Diagram labels: OBSERVER, BATCH, PREFETCH, APPLIER, OPLOG; collections x.y and d.c; state: allow readers.
      • Readmit readers
      • Move entries from batch to oplog
      • Begin processing next batch
  • Diagram labels: OPLOG, OBSERVER, BATCH, PREFETCH, APPLIER; collections x.y and d.c; state: allow readers; P TS:6, S1 TS:6, S2 TS:2.
  • Diagram: OPLOG, d.c; P TS:6, S1 TS:6, S2 TS:2.
      • Consults list of waiting clients
      • Looks for those waiting for ts:6 or earlier on S1
      • Sends acknowledgement!
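The final step, matching a secondary's progress report against the list of waiting clients, can be sketched as follows. `AckTracker` and its methods are hypothetical names, timestamps are plain integers, and the sketch assumes the node doing the bookkeeping already holds the write and therefore counts as one copy.

```python
class AckTracker:
    """Toy bookkeeping for clients waiting on a write concern."""
    def __init__(self):
        self.waiting = []    # (client, ts of the write, w)
        self.progress = {}   # secondary name -> latest ts it has reported

    def wait_for(self, client, ts, w):
        self.waiting.append((client, ts, w))

    def on_progress(self, secondary, ts):
        """Record a progress report; return the clients that can now be acknowledged."""
        self.progress[secondary] = ts
        acked, still_waiting = [], []
        for client, want_ts, w in self.waiting:
            # Assumption: this node has the write already, so start the count at 1.
            copies = 1 + sum(1 for t in self.progress.values() if t >= want_ts)
            target = acked if copies >= w else still_waiting
            target.append((client, want_ts, w))
        self.waiting = still_waiting
        return [client for client, _, _ in acked]

tracker = AckTracker()
tracker.wait_for("client-a", ts=6, w=2)
first = tracker.on_progress("S2", 2)    # S2 is still at ts 2: nobody is released
second = tracker.on_progress("S1", 6)   # S1 reaches ts 6: the w:2 waiter is released
```

With `w: 2`, the report from S1 at ts 6 is enough to release the client even though S2 is still at ts 2, which mirrors the `P TS:6, S1 TS:6, S2 TS:2` state on the slide.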