Replication Internals: The Life of a Write

Published in: Technology

Transcript

  • 1. Andy Schwerin, Lead Engineer, MongoDB
  • 2. • Goals of Replication • Replication Architecture • A representative write
  • 3. • High availability for processing reads and writes – Automatic leader election • Support many network topologies – Tag sets • Accessible consistency model – Ordered operation log • Client can trade latency for durability – Write concern
  • 4. OPLOG: { ts: 4, op: "i", ns: "d.c", o: { _id: 10, name: "john" } } …
  • 5. PRIMARY OPLOG 4 SECONDARY OPLOG 8 9 SECONDARY OPLOG 4 5 When a secondary oplog is not a prefix of the primary oplog…
  • 6. w:?
  • 7. w:1 Could lose write when primary disappears, without notification.
  • 8. w:majority Over half of nodes must fail to lose the write. And, an outside operator must intervene before new writes are
  • 9. w:all All nodes have the write before primary responds. But, cannot complete writes if any nodes are
  • 10. OPLOG d.c OPLOG P TS:6 S1 TS:6 S2 TS:2 d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) 1. Fetch oplog entries 2. Apply to collections 3. Write to local oplog 4. Notify primary 5. Repeat
  • 11. OPLOG OBSERVER BATCH BATCH PREFETCH APPLIER BATCH x.y d.c d.c OPLOG d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) P TS:6 S1 TS:6 S2 TS:2
  • 12. OPLOG d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) d.c { ts: 4, op: "i", ns: "d.c", o: { _id: 10, name: "john" } } P TS:4 S1 TS:2 S2 TS:2
  • 13. OPLOG d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) OBSERVER BATCH d.c OPLOG P TS:6 S1 TS:2 S2 TS:2
  • 14. OPLOG d.c.insert({_id: 10, name: 'john'}, {wC: {w: 2}}) BATCH d.c OPLOG OBSERVER P TS:6 S1 TS:2 S2 TS:2
  • 15. OBSERVER BATCH BATCH PREFETCH OPLOG • Split batch into arbitrary work units • Assign work to prefetch threads • Entries processed in any order • All while admitting readers Allow readers
  • 16. OBSERVER BATCH BATCH PREFETCH OPLOG BATCH x.y d.c APPLIER • Assign entries to workers by target collection • Disable schema constraints Allow readers
  • 17. OBSERVER BATCH BATCH PREFETCH OPLOG BATCH x.y d.c APPLIER • Concurrency control excludes readers • Oplog entries applied in timestamp order Exclude readers
  • 18. OBSERVER BATCH BATCH PREFETCH OPLOG BATCH x.y d.c APPLIER • Concurrency control excludes readers • Oplog entries applied in timestamp order Exclude readers
  • 19. OBSERVER BATCH BATCH PREFETCH OPLOG BATCH x.y d.c APPLIER • Concurrency control excludes readers • Oplog entries applied in timestamp order Exclude readers
  • 20. OBSERVER BATCH BATCH PREFETCH APPLIER BATCH x.y d.c OPLOG • Readmit readers • Move entries from batch to oplog • Begin processing next batch Allow readers
  • 21. OPLOG OBSERVER BATCH BATCH PREFETCH APPLIER BATCH x.y d.c d.c OPLOG Allow readers P TS:6 S1 TS:6 S2 TS:2
  • 22. OPLOG d.c P TS:6 S1 TS:6 S2 TS:2 • Consults list of waiting clients • Looks for those waiting for ts:6 or earlier on S1 • Sends acknowledgement!
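
The write followed through slides 10 to 22 can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not MongoDB's implementation: the `Node`, `Primary`, and `Secondary` classes and the `insert`, `sync_from`, `notify`, and `_acknowledge` methods are hypothetical names, timestamps are plain integers, only inserts are modelled, and the batching, prefetch, and reader-exclusion stages are left out. What it does keep is the secondary loop (fetch, apply, write to local oplog, notify primary) and the primary's check of waiting clients against the timestamps the secondaries report.

```python
class Node:
    """A replica set member: an ordered oplog plus the collections it describes."""

    def __init__(self, name):
        self.name = name
        self.oplog = []          # entries in ascending ts order
        self.collections = {}    # ns -> {_id: document}

    def last_ts(self):
        return self.oplog[-1]["ts"] if self.oplog else 0

    def apply(self, entry):
        # Only inserts ("i") are modelled in this sketch.
        if entry["op"] == "i":
            docs = self.collections.setdefault(entry["ns"], {})
            docs[entry["o"]["_id"]] = entry["o"]
        self.oplog.append(entry)


class Primary(Node):
    def __init__(self, name):
        super().__init__(name)
        self.progress = {}   # secondary name -> newest ts it has reported
        self.waiting = []    # (ts, w, client) tuples not yet acknowledged
        self.acked = []      # clients acknowledged so far, in order

    def insert(self, ns, doc, w, client):
        entry = {"ts": self.last_ts() + 1, "op": "i", "ns": ns, "o": doc}
        self.apply(entry)
        self.waiting.append((entry["ts"], w, client))
        self._acknowledge()
        return entry["ts"]

    def notify(self, secondary, ts):
        self.progress[secondary] = ts
        self._acknowledge()

    def _acknowledge(self):
        still_waiting = []
        for ts, w, client in self.waiting:
            # The primary itself counts as one node holding the write.
            holders = 1 + sum(1 for seen in self.progress.values() if seen >= ts)
            if holders >= w:
                self.acked.append(client)
            else:
                still_waiting.append((ts, w, client))
        self.waiting = still_waiting


class Secondary(Node):
    def sync_from(self, primary):
        start = self.last_ts()
        batch = [e for e in primary.oplog if e["ts"] > start]  # 1. fetch entries
        for entry in batch:
            self.apply(entry)              # 2-3. apply, then write to local oplog
        primary.notify(self.name, self.last_ts())              # 4. notify primary


primary, s1, s2 = Primary("P"), Secondary("S1"), Secondary("S2")
primary.insert("d.c", {"_id": 10, "name": "john"}, w=2, client="client-1")
print(primary.acked)   # the w:2 write is still waiting on a secondary
s1.sync_from(primary)
print(primary.acked)   # S1 has reported the write, so the client is acknowledged
```

With `w=1` the same `insert` call would be acknowledged immediately, since the primary alone satisfies the count; that is the case slide 7 warns can lose the write if the primary disappears.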