Cap in depth



  1. The CAP theorem in depth
     EEDC 34330: Execution Environments for Distributed Computing
     Erasmus Mundus Distributed Computing - EMDC
     Homework: Final Project
     Group number: EEDC-7.1
     Group member: Ioanna Tsalouchidou –
  2. Contents
     • The Theorem
     • The CAP theorem's growing impact
     • CAP twelve years later: how the “rules” have changed
     • Perspectives on the CAP theorem
     • Consistency tradeoffs in modern distributed database system design
     • CAP and cloud data management
     • Overcoming CAP with consistent soft-state replication
     • Conclusions
  3. The Theorem
     • Consistency: each server gives the correct response to each request.
     • Availability: each request eventually receives a response.
     • Partition tolerance: a property of the underlying system, not of the service. Servers may be partitioned into groups that cannot communicate with each other.
  4. The Theorem
     “There is a fundamental tradeoff between consistency, availability and network partition tolerance” (Eric Brewer)
     “The impossibility of guaranteeing both safety and liveness in an unreliable system”
     “Fast, Cheap, Good - Pick Any Two” (J. Noel Chiappa)
  5. The Theorem
  6. The Theorem
     In practice, the CAP tradeoff surfaces during a timeout. A decision must then be made:
     • Cancel the operation, and thus decrease availability
     • Continue the operation, and risk inconsistency
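The timeout decision on slide 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Replica`, `OnTimeout`, and `write` are hypothetical names that do not come from the slides, and the timeout is modelled by a simple `peer_reachable` flag rather than a real network call.

```python
from enum import Enum


class OnTimeout(Enum):
    CANCEL = "cancel"      # favor consistency, give up availability
    CONTINUE = "continue"  # favor availability, risk inconsistency


class Replica:
    """Hypothetical single-value replica used only for this sketch."""

    def __init__(self, value=0):
        self.value = value


def write(local, peer, peer_reachable, new_value, policy):
    """Attempt a replicated write; peer_reachable=False models a timeout."""
    if peer_reachable:
        # No partition suspected: update both copies.
        local.value = new_value
        peer.value = new_value
        return "ok"
    if policy is OnTimeout.CANCEL:
        # Refuse the write, so both copies stay equal.
        return "unavailable"
    # Accept the write locally; the two copies now disagree.
    local.value = new_value
    return "ok-but-diverged"


a, b = Replica(), Replica()
print(write(a, b, False, 5, OnTimeout.CANCEL))    # the write is refused
print(write(a, b, False, 5, OnTimeout.CONTINUE))  # a and b now differ
```

With `CANCEL` the client sees a failure but the replicas never diverge; with `CONTINUE` the client is served and the divergence has to be repaired once the partition heals.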
  7. The CAP theorem's growing impact
     The big data challenge:
     • Handling exponential growth in Web data
     • Relational DBMSs with ACID properties do not scale well
     • Alternative solutions → NoSQL databases
     • Non-relational and distributed databases
  8. The CAP theorem's growing impact
     NoSQL databases:
     • Flexible schema
     • Scale horizontally
     • Do NOT support ACID properties
     • Store and replicate data in distributed systems
     • Achieve scalability and reliability
  9. The CAP theorem's growing impact
     ACID: Atomicity, Consistency, Isolation, Durability
     vs.
     CAP: Consistency, Availability, Partition tolerance
  10. The CAP theorem's growing impact
      Within a datacenter:
      • Network failures are rare
      • No tradeoff between consistency and availability
      Cloud providers:
      • Maintain multiple datacenters
      • Datacenters are geographically separated
      • The consistency/availability tradeoff appears
  11. CAP: twelve years later
      How the “rules” have changed:
      “Any networked shared-data system can have only two of three desirable properties. However, by explicitly handling partitions, designers can optimize consistency and availability, thereby achieving some tradeoff of all three.” (Eric Brewer)
  12. CAP: twelve years later
      Use and abuse of the CAP theorem:
      • “2 of 3” oversimplifies the tensions among the properties.
      • CAP prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare.
      Modern CAP:
      • Maximize the combination of consistency and availability when possible.
      • Plan for operation during a partition.
      • Plan for recovery after the partition.
      • CAP goes beyond its limitations.
  13. CAP: twelve years later
      Managing partitions:
      • Detect the start of the partition.
      • Partition mode → limited operations.
      • Partition recovery.
  14. CAP: twelve years later
      During partition mode:
      • Which operations are limited depends on the invariants that must be maintained.
      Recovery:
      • Both sides should become consistent.
      • Compensate for the mistakes made during the partition.
      Compensation:
      • Tracking and limitation of partition-mode operations.
      • Knowledge of the invariants violated.
      • Last writer wins.
      • Still an open problem.
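The "last writer wins" rule mentioned on slide 14 can be illustrated with a small merge function. This is a minimal sketch, not the method of any particular system: the `{key: (timestamp, value)}` layout and the name `merge_last_writer_wins` are assumptions made for the example, and the timestamps are assumed to be comparable across both sides of the partition.

```python
def merge_last_writer_wins(side_a, side_b):
    """Merge two {key: (timestamp, value)} maps after a partition heals.

    Returns the merged state plus the list of discarded writes, so that a
    later compensation step knows what was lost. Ties keep side_a's entry.
    """
    merged = dict(side_a)
    discarded = []
    for key, entry_b in side_b.items():
        entry_a = merged.get(key)
        if entry_a is None:
            merged[key] = entry_b              # only side B wrote this key
        elif entry_a == entry_b:
            continue                           # both sides already agree
        elif entry_b[0] > entry_a[0]:
            discarded.append((key, entry_a))   # B's write is newer
            merged[key] = entry_b
        else:
            discarded.append((key, entry_b))   # A's write is newer (or tie)
    return merged, discarded


side_a = {"x": (1, "a1"), "y": (5, "a2")}
side_b = {"x": (3, "b1"), "y": (2, "b2"), "z": (4, "b3")}
state, lost = merge_last_writer_wins(side_a, side_b)
print(state)
print(lost)
```

Returning the discarded writes reflects the slide's point that recovery needs to track partition-mode operations: simply overwriting the older value restores consistency but silently drops an update.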
  15. Perspectives on the CAP theorem
      “The CAP theorem is one example of a more general tradeoff between safety and liveness” (Gilbert and Lynch)
  16. Perspectives on the CAP theorem
      • Safety property → the property holds at every point in every execution: consistency.
      • Liveness property → if the execution continues long enough, something desirable eventually happens: availability.
      • CAP → no protocol implementing an atomic read/write register can guarantee both safety and liveness in a partition-prone system.
  17. Perspectives on the CAP theorem
      Agreement:
      • Fault-tolerant agreement is impossible in an asynchronous system.
      Requirements for consensus:
      • Agreement: all processes output the same value (safety).
      • Validity: every output value was provided as the input of some process (safety).
      • Termination: all processes must eventually output a value (liveness).
      Consensus:
      • Safety and liveness cannot both be guaranteed if the system is potentially faulty.
  18. Perspectives on the CAP theorem
      Safety/liveness tradeoff for consensus: under which circumstances can we have both?
      Network synchrony
      • Wholly synchronous network → the tradeoff is wholly avoided
      • Cynthia Dwork → eventual synchrony
      • Tushar Chandra → failure detectors
      Consistency
      • What is the maximum achievable level of consistency?
      • Soma Chaudhuri → set agreement
      • 1-set agreement means consensus, no crash failure
      • With t failures, ⌊t/k⌋ + 1 rounds are needed to achieve k-set agreement.
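The round bound on slide 18 is easy to evaluate for concrete numbers. The helper below is a small sketch that reads the slide's "[t/k] + 1" as floor(t/k) + 1; the function name is an assumption made for the example.

```python
def rounds_for_k_set_agreement(t, k):
    """Rounds given by the floor(t/k) + 1 bound for k-set agreement.

    t is the number of failures tolerated, k the number of distinct
    output values allowed (k = 1 is ordinary consensus).
    """
    if t < 0 or k < 1:
        raise ValueError("need t >= 0 and k >= 1")
    return t // k + 1


for t, k in [(3, 1), (4, 2), (4, 3)]:
    print(t, k, rounds_for_k_set_agreement(t, k))
```

For k = 1 the formula gives t + 1 rounds, and allowing more distinct outputs (larger k) lowers the number of rounds, which is the sense in which weakening consistency buys back liveness.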
  19. Perspectives on the CAP theorem
      Practical implications: over an unreliable system you can choose to sacrifice
      • Availability
      • Consistency
      • Moderate approach: sacrifice each dynamically
        – Respond well to most user requests
        – Provide consistency when it is necessary
  20. Perspectives on the CAP theorem
      Best-effort availability
      – Most common approach
      – Guarantees consistency, regardless of network behavior
      – Suitable when communication is typically reliable
      – Example: servers in the same datacenter, where partitions are rare
  21. Perspectives on the CAP theorem
      Best-effort consistency
      – Sometimes unavailability is not an option
      – Inconsistency is not a major problem
      – Examples: web caches, services with image and video content
      – Best effort to serve up-to-date data
      – No assurance that all users get the same content
      – No strict requirement for strong consistency
  22. Perspectives on the CAP theorem
      Balancing consistency and availability
      • Neither strong consistency nor continual availability.
      • Applications specify the level of continuous consistency.
      • Airline reservation system
        – Many free seats → sacrifice consistency
        – A few seats left → sacrifice availability
      • Data may be inconsistent when consistency is not needed.
      • The system may be unavailable when a major network partition happens.
      → Increase the system's robustness to network disruption before sacrificing availability.
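The airline reservation example on slide 22 can be sketched as a simple policy function. This is an illustration only: the function name, the return labels, and the `low_water_mark` threshold of 10 seats are assumptions chosen for the example, not values from the slides.

```python
def handle_booking(seats_free, partitioned, low_water_mark=10):
    """Decide how to treat a booking request for one flight.

    seats_free is this replica's (possibly stale) view of remaining seats;
    partitioned says whether the other replicas are currently unreachable.
    """
    if seats_free <= 0:
        return "sold-out"
    if not partitioned:
        return "book"           # normal operation, replicas coordinate
    if seats_free > low_water_mark:
        # Many seats left: overbooking is unlikely, so stay available
        # and tolerate a temporarily inconsistent seat count.
        return "book-locally"
    # Few seats left: an inconsistent count could oversell the flight,
    # so refuse the request until the partition heals.
    return "reject"


print(handle_booking(120, partitioned=True))
print(handle_booking(3, partitioned=True))
```

The same replica therefore behaves as an availability-first system for most of the booking period and as a consistency-first system only when the invariant "never sell more seats than exist" is actually at risk.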
  23. Tradeoffs in modern distributed db
      “The CAP theorem's impact on modern DDBS is more limited than is often perceived” (Daniel J. Abadi)
  24. Tradeoffs in modern distributed db
      “It is wrong to assume that DDBSs that reduce consistency in the absence of any partitions are doing so due to CAP-based decision-making”
  25. Tradeoffs in modern distributed db
      Consistency/latency tradeoff:
      • Availability ~ latency
      • An unavailable system is one with extreme latency
      • The tradeoff exists even without network partitions
      • If a system runs long enough, at least one component will fail
      • Highly available systems need to replicate data
      The occurrence of failure causes CAP tradeoffs; the possibility of failure results in the consistency/latency tradeoff.
  26. Tradeoffs in modern distributed db
      Data replication: as soon as a DDBS replicates data, a tradeoff between consistency and latency arises.
      Replication alternatives:
      • Data updates are sent to all replicas at the same time.
      • Data updates are sent first to an agreed master node.
      • Data updates are sent to a single arbitrary node.
      Each implementation comes with a consistency/latency tradeoff.
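A rough way to see why each replication alternative on slide 26 carries a latency cost is to compare when a write can be acknowledged. The model below is a deliberately simplified sketch: the strategy names, the `rtt_ms` round-trip table, and the assumption that only round-trip time matters (no queuing, no agreement protocol overhead) are all made up for illustration.

```python
def write_ack_latency(strategy, rtt_ms, master="r1"):
    """Estimate client-visible write latency in ms under a toy model.

    rtt_ms maps replica name -> round-trip time from the client.
    """
    if strategy == "all_sync":
        # Wait for every replica: the slowest one sets the latency,
        # but all copies agree once the write is acknowledged.
        return max(rtt_ms.values())
    if strategy == "master_async":
        # Acknowledge once the agreed master has the write; other
        # replicas catch up later and may briefly serve stale data.
        return rtt_ms[master]
    if strategy == "any_node_async":
        # Acknowledge at the nearest replica; fastest, but concurrent
        # writes at different nodes can conflict.
        return min(rtt_ms.values())
    raise ValueError(f"unknown strategy: {strategy}")


rtt = {"r1": 20, "r2": 80, "r3": 5}
for s in ("all_sync", "master_async", "any_node_async"):
    print(s, write_ack_latency(s, rtt))
```

Under this model the strategies that acknowledge sooner are exactly the ones that leave replicas temporarily out of step, which is the consistency/latency tradeoff the slide describes.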
  27. Tradeoffs in modern distributed db
      PACELC: if there is a Partition, how does the system trade off Availability and Consistency; Else, when the system is running normally in the absence of partitions, how does the system trade off Latency and Consistency?
  28. CAP and cloud data management
      Web applications must:
      • Scale on demand
      • Serve requests with low latency
      • Sustain high throughput
      • Be highly available
      • Run at minimum operational cost
  29. CAP and cloud data management
      • Coordinating all updates through a master has performance and availability implications.
      • PNUTS → automatically migrates the master to be close to the writers.
      • The impact on performance and availability is insignificant for Yahoo's applications, because user access patterns are localized.
  30. Overcoming CAP: replication
      • Stronger consistency inside the datacenter
      • Low latency
      • Scalability
      • No sacrifice of consistency
  31. Overcoming CAP: replication
      First-tier cloud services:
      • New consistency model for data replication.
      • Combines agreement on update ordering with amnesia freedom.
      → Surprising levels of scalability and performance.
  32. Overcoming CAP: replication
      The ISIS system
      • Supports virtually synchronous process groups
      • Reliable multicast with various ordering options
        – Send primitive is FIFO-ordered
        – Ordered primitive guarantees total order
      • Barrier primitive Flush → amnesia freedom
        – Delays until prior unstable multicasts have reached their destinations
      • Virtually synchronous version of Paxos: SafeSend
        – In-memory durability
        – On-disk durability
  33. Overcoming CAP: replication
  34. Conclusions
      • CAP → “2 of 3” in unreliable systems; do not blindly sacrifice consistency or availability when partitions exist.
      • Safety/liveness
      • Failures → CAP tradeoffs; possibility of failure → consistency/latency tradeoff.
      • Replication
      • PACELC
  35. References
      [1] Guest Editors' Introduction: The CAP Theorem's Growing Impact -
      [2] Pushing the CAP: Strategies for Consistency and Availability -
      [3] Perspectives on the CAP Theorem -
      [4] Consistency Tradeoffs in Modern Distributed Database System Design: CAP is Only Part of the Story -
      [5] CAP and Cloud Data Management -
  36. References
      [6] Overcoming CAP with Consistent Soft-State Replication - tp=&arnumber=6112739
      [7]
      [8]
      [9] problems-with-partition-tolerance/
      [10]