A Detailed Look at How Causal Consistency Is Implemented in MongoDB / Mikhail Tyulenev (MongoDB)

HighLoad++ 2017

Mumbai Hall, November 8, 18:00

Abstract:
http://www.highload.ru/2017/abstracts/2836.html

With eventually consistent distributed databases, there is no guarantee that a read returns the results of the most recent data changes when the read and the write are performed on different nodes. This limits the throughput of the system. Support for causal consistency removes this limitation, which improves scalability without requiring changes to application code.
...


  1. Implementation of Cluster-wide Causal Consistency in
  2. Outline
     - What is causal consistency
     - Academic view of causal consistency
     - MongoDB architecture
     - Causal consistency building blocks
     - Making causal consistency secure
     - Making causal consistency fast
     - Making causal consistency reliable
     - Causal consistency for end users
  3. Client-side properties of causal consistency
     - Read your writes
     - Writes follow reads
     - Monotonic reads
     - Monotonic writes
  4. Implementing with a non-causally-consistent system
     - Single-server systems are causally consistent
     - Read and write from the same node
     - Add application logic to handle the scenarios that have to be causally consistent
  5. Ordering of Events in Distributed System
     [Diagram labels: Process P, Process Q, Process R; events p1, q1, q2, q3, r1, r2; <C1> = 10, <C2> = 10, <C3> = 0; q2: <C2> = 11; r1: <C3> = 11; r2: <C3> = 12]
  6. Server-side causal consistency
     Causal consistency is a partial order of events in a distributed system. If an event A causes another event B, then causal consistency assures that every other process in the system observes event A before observing event B. If an event A is not causally related to an event B, then the two events are concurrent.
  7. System's Architecture
  8. Gossiping clusterTime
     ClusterTime: {uint64} Timestamp(1495470881, 6)
  9. Ticking clusterTime
     [Diagram labels: {ts: 6} ..., {ts: 5} ..., {ts: 11} ...; insert({x:1}, clusterTime: 4); <clusterTime>: 6; <Wall clock>: 11]
  10. Reporting operationTime
     [Diagram labels: insert({x:1}); {ts: 11} ..., {ts: 5} ..., {ts: 12} ...; <clusterTime>: 11; <Wall clock>: 11; reply {ok:1}, {operationTime: 12}]
  11. Waiting for afterClusterTime
     [Diagram labels: find({x:1}, afterClusterTime: {10}, clusterTime: {15}); {ts: 6} ..., {ts: 5} ..., {ts: 11} ...; reply {x:1}, {operationTime: {11}}]
  12. Breaking clusterTime
     [Diagram labels: {ts: Timestamp(1495470881, 6), term: 1}, ...; {ts: Timestamp(1495470881, 5), term: 1}, ...; {Timestamp(0xFFFFFFFF, 0xFFFFFFFF)} ...; insert({x:1}, clusterTime: {0xFFFFFFFF, 0xFFFFFFFE}); LogicalClock:clusterTime = Timestamp(0xFFFFFFFF, 0xFFFFFFFE)]
  13. Protecting clusterTime
     "$clusterTime" : { "clusterTime" : Timestamp(1495470881, 5), "signature" : { "hash" : BinData(0,"7olYjQCLtnfORsI9IAhdsftESR4="), "keyId" : NumberLong("6422998367101517844") } }
  14. Protecting against operator errors
     [Diagram labels: {ts: 6} ..., {ts: 5} ...; insert({x:1}, clusterTime: 100,000); <clusterTime>: 6; <Wall clock>: 11; reply {Error}, {operationTime: 6}]
  15. Signing a range of clusterTime
     find({x:1}, clusterTime: <val>, signature: <hash>)
     <timeRange> = <val> | 0x0000'0000'0000'FFFF
     cache: { <timeRange>: <hash> }
  16. Use dummy signatures
     - When auth is off
     - When a user has the advanceClusterTime privilege
  17. How end users see it
     let session = db.getMongo().startSession({causalConsistency: true})
     db = session.getDatabase(db.getName());
  18. [Sequence diagram labels: startSession(); update({name: "misha", checking: 100}); {ok:1} operationTime: 15; find({name: "misha"}) afterClusterTime: 15; {checking:100}]
  19. Misha Tyulenev misha@mongodb.com
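
The clusterTime behaviour on slides 9, 10, and 14 can be sketched as a small logical clock. This is a minimal sketch under stated assumptions, not MongoDB's implementation: plain integers stand in for Timestamp values, and the class name `LogicalClock`, its methods, and the `maxDrift` limit are hypothetical names chosen for illustration. It assumes a tick takes the larger of the wall clock and clusterTime + 1, which is consistent with the numbers shown on the slides.

```javascript
// Toy logical clock; integers stand in for Timestamp(seconds, counter).
class LogicalClock {
  constructor(clusterTime, maxDrift) {
    this.clusterTime = clusterTime;
    this.maxDrift = maxDrift; // hypothetical limit on how far ahead of the wall clock a received time may be
  }

  // Gossip: adopt a clusterTime received in a message if it is newer than ours.
  advance(received, wallClock) {
    if (received > wallClock + this.maxDrift) {
      // Slide 14: a value far in the future is rejected and the clock is left unchanged.
      throw new RangeError("clusterTime too far ahead of wall clock");
    }
    if (received > this.clusterTime) {
      this.clusterTime = received;
    }
    return this.clusterTime;
  }

  // Tick: called only when a write is appended to the oplog.
  tick(wallClock) {
    this.clusterTime = Math.max(this.clusterTime + 1, wallClock);
    return this.clusterTime;
  }
}

// Slide 9: clusterTime 6, wall clock 11, incoming request carries clusterTime 4.
const clock = new LogicalClock(6, 1000);
clock.advance(4, 11);        // stale value, the clock stays at 6
console.log(clock.tick(11)); // prints 11
// Slide 10: clusterTime 11, wall clock 11, the next write gets operationTime 12.
console.log(clock.tick(11)); // prints 12
```

Ticking only on oplog writes keeps the clock tied to durable events, while gossiping lets routers and clients carry the latest known value between nodes.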

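The signing scheme on slides 13 and 15 can be illustrated with a keyed hash over a rounded-up time range, so that one cached signature covers many consecutive clusterTime values. This is a hedged sketch: the slides do not name the hash algorithm, so HMAC-SHA1 is an assumption (it matches the 20-byte hash length on slide 13), and `ClusterTimeSigner`, its methods, and the `hmacCalls` counter are hypothetical names, not MongoDB internals.

```javascript
const crypto = require("crypto");

// Hypothetical signer for clusterTime values (BigInt stands in for the uint64).
class ClusterTimeSigner {
  constructor(key) {
    this.key = key;
    this.cache = new Map(); // <timeRange> -> <hash>, as on slide 15
    this.hmacCalls = 0;     // counts real HMAC computations, for illustration only
  }

  // <timeRange> = <val> | 0x0000'0000'0000'FFFF
  static timeRange(clusterTime) {
    return BigInt(clusterTime) | 0xFFFFn;
  }

  // Sign the range that contains clusterTime, reusing a cached hash when possible.
  sign(clusterTime) {
    const range = ClusterTimeSigner.timeRange(clusterTime).toString();
    let hash = this.cache.get(range);
    if (hash === undefined) {
      this.hmacCalls += 1;
      hash = crypto.createHmac("sha1", this.key).update(range).digest("base64");
      this.cache.set(range, hash);
    }
    return hash;
  }

  // Check a signature supplied by an untrusted client.
  verify(clusterTime, signature) {
    const expected = Buffer.from(this.sign(clusterTime), "base64");
    const actual = Buffer.from(String(signature), "base64");
    return expected.length === actual.length &&
      crypto.timingSafeEqual(expected, actual);
  }
}
```

The trade-off in this sketch is that any value inside a signed range verifies with the same hash, which is what makes the cache effective.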
Editor's Notes

  • Even if a read request goes to the primary, it is not guaranteed to read its own writes; for example, read concern level "majority" may delay visibility of the write
  • Add a logical clock object to each cluster node (routers, storage, clients)
    Every client tracks the greatest operationTime inside a causally consistent session
  • clusterTime is incremented only on a write to the oplog (storage)
    (clusterTime + election term) is the primary key in the oplog collection
  • Every command returns an operationTime: the greatest clusterTime stored with an oplog entry at the time the command finishes its execution
  • Every request includes the afterClusterTime
    A storage node waits for the oplog to replicate the entry with clusterTime >= afterClusterTime
  • Clients have to participate, but we don't trust the clients
    There is a maximum time value after which the primary can no longer perform a write
    So we want to be sure that all cluster times received from clients come from a trusted source
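
The notes on operationTime and afterClusterTime can be sketched as a toy session and storage node. This is an illustration under stated assumptions, not the server's code: `StorageNode`, `Session`, and all method names are hypothetical, the oplog is an in-memory array, timestamps are plain integers, and the caller supplies the write timestamp instead of the primary's clock.

```javascript
// Toy storage node: the oplog is an array of { ts, doc } entries in ts order.
class StorageNode {
  constructor() {
    this.oplog = [];
    this.pending = []; // reads waiting for replication to catch up
  }

  lastApplied() {
    return this.oplog.length ? this.oplog[this.oplog.length - 1].ts : 0;
  }

  // Apply a local write or a replicated entry, then wake reads that can now proceed.
  apply(entry) {
    this.oplog.push(entry);
    const ready = this.pending.filter((p) => p.after <= entry.ts);
    this.pending = this.pending.filter((p) => p.after > entry.ts);
    ready.forEach((p) => p.run());
  }

  // Answer only once an entry with ts >= afterClusterTime is in the local oplog.
  find(afterClusterTime, callback) {
    const run = () => callback({
      docs: this.oplog.map((e) => e.doc),
      operationTime: this.lastApplied(),
    });
    if (this.lastApplied() >= afterClusterTime) {
      run();
    } else {
      this.pending.push({ after: afterClusterTime, run });
    }
  }
}

// Toy causally consistent session: tracks the greatest operationTime it has seen.
class Session {
  constructor() {
    this.operationTime = 0;
  }

  observe(reply) {
    this.operationTime = Math.max(this.operationTime, reply.operationTime);
    return reply;
  }

  write(primary, ts, doc) {
    primary.apply({ ts, doc });
    return this.observe({ ok: 1, operationTime: ts });
  }

  // Every read carries the session's operationTime as afterClusterTime.
  read(node, callback) {
    node.find(this.operationTime, (reply) => callback(this.observe(reply)));
  }
}
```

With these classes, a write acknowledged at operationTime 15 followed by a read on a lagging secondary leaves the read pending until the secondary applies the entry with ts 15, which mirrors the exchange on slide 18.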
