Life after CAP
CAP conjecture [reminder]
• Can only have two of:
– Consistency
– Availability
– Partition-tolerance

• Examples
– Databases, 2PC, centralized algo (C & A)
– Distributed databases, majority protocols (C & P)
– DNS, Bayou (A & P)
CAP theorem
• Formalization by Gilbert & Lynch
• What does impossible mean?
– There exists an execution that violates one of C, A, or P
– It is not possible to guarantee that an algorithm has all three at all times
• Shard data with different CAP tradeoffs
• Detect partitions and weaken consistency
Partition-tolerance & availability
• What is partition-tolerance?
– Consistency and Availability are provided by algo
– Partitions are external events (scheduler/oracle)
• Partition-tolerance is really a failure model
• Partition-tolerance is equivalent to tolerating message omissions

• In the CAP theorem
– Proof rests on partitions that never heal
– Datacenters can guarantee recovery of partitions!
• Can guarantee that conflict resolution eventually happens
How do we ensure consistency
• Main technique to be consistent
– Quorum principle
– Example: Majority quorums
• Always write to and read from a majority of nodes
• At least one node knows most recent value
[Figure: nine nodes, majority(9) = 5; WRITE(v) and READ v each contact a majority of the nodes]
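The majority rule above can be sketched as a toy simulation. This is only an illustration, not the talk's algorithm: the `Replica` class, the caller-supplied version numbers (a single writer is assumed), and the random choice of quorum members are all assumptions made for the example.

```python
import random

def majority(n: int) -> int:
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1

class Replica:
    """Hypothetical replica holding one versioned value."""
    def __init__(self):
        self.version = 0
        self.value = None

def write(replicas, value, version, rng):
    """Store (version, value) on a randomly chosen majority of replicas."""
    for r in rng.sample(replicas, majority(len(replicas))):
        if version > r.version:
            r.version, r.value = version, value

def read(replicas, rng):
    """Query a random majority and return the value with the highest version."""
    quorum = rng.sample(replicas, majority(len(replicas)))
    return max(quorum, key=lambda r: r.version).value

rng = random.Random(0)
replicas = [Replica() for _ in range(9)]   # N = 9, as in the slide
write(replicas, "v1", 1, rng)
write(replicas, "v2", 2, rng)
# Every read quorum overlaps the quorum that stored version 2.
results = {read(replicas, rng) for _ in range(100)}
```

Because any two sets of 5 out of 9 nodes share at least one node, every read in this sketch sees the latest completed write.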
Quorum Principle
• Majority Quorum
– Pro: tolerates up to ⌈N/2⌉ − 1 crashes
– Con: Have to read/write ⌊N/2⌋ + 1 values

• Read/write quorums (Dynamo, ZooKeeper, Chain Repl)
– Read R nodes, write W nodes, s.t. R + W > N (W > N/2)
– Pro: adjust performance of reads/writes
– Con: availability can suffer

• Maekawa Quorum
– Arrange nodes in an M×M grid
– Write to row+col, read cols (always overlap)
– Pro: Only need to read/write O(√N) nodes
– Con: Tolerate at most O(√N) crashes (reconfiguration)

[Figure: 3×3 grid of nodes P1 to P9]
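The read/write condition and the grid construction on this slide can be checked mechanically. The sketch below is a reading of the slide, not a reference implementation: it assumes a write quorum is one full row plus one full column and a read quorum is one full column, and all function names are hypothetical.

```python
def rw_quorums_valid(n: int, r: int, w: int) -> bool:
    """True when every read quorum meets every write quorum (R + W > N)
    and any two write quorums meet (W > N/2)."""
    return r + w > n and 2 * w > n

def grid_write_quorum(m: int, row: int, col: int) -> set:
    """One full row plus one full column of an m x m grid (2m - 1 nodes)."""
    return {(row, c) for c in range(m)} | {(r, col) for r in range(m)}

def grid_read_quorum(m: int, col: int) -> set:
    """One full column of the grid (m nodes)."""
    return {(r, col) for r in range(m)}

def grid_quorums_always_overlap(m: int) -> bool:
    """Brute-force check that every read column meets every row+column write."""
    return all(
        grid_read_quorum(m, read_col) & grid_write_quorum(m, row, col)
        for read_col in range(m)
        for row in range(m)
        for col in range(m)
    )
```

The overlap holds because the written row crosses every column, so a read of any column touches at least one node that saw the write.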
Probabilistic Quorums
• Quorum size α√N (α > 1) intersects with probability at least 1 − exp(−α²)
– Example: N=16 nodes, quorum size 7, intersects 95%, tolerates 9 failures
– Maekawa: N=16 nodes, quorum size 7, intersects 100%, tolerates 4 failures

– Pro: Small quorums, high fault-tolerance
– Con: Could fail to intersect, N usually large
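The 95% figure in the example can be reproduced with a few lines of arithmetic. The sketch assumes the slide's formula is meant as 1 − exp(−α²) with α = q/√N, and, as a separate assumption, compares it with the exact probability that two quorums drawn uniformly at random share a node.

```python
import math

def intersection_lower_bound(n: int, q: int) -> float:
    """Bound 1 - exp(-alpha^2) for quorum size q = alpha * sqrt(n)."""
    alpha = q / math.sqrt(n)
    return 1.0 - math.exp(-alpha ** 2)

def exact_intersection_probability(n: int, q: int) -> float:
    """Probability that two size-q subsets of n nodes, each chosen
    uniformly at random, have at least one node in common."""
    if 2 * q > n:
        return 1.0  # quorums larger than half must overlap
    return 1.0 - math.comb(n - q, q) / math.comb(n, q)

bound = intersection_lower_bound(16, 7)        # alpha = 1.75
exact = exact_intersection_probability(16, 7)
```

For N=16 and quorum size 7 the bound comes out at roughly 0.95, matching the slide, while the exact value for uniformly random quorums is higher.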
Quorums and CAP
• With quorums we can get
– C & P: partition can make quorum unavailable
– C & A: no-partition ensures availability and atomicity

• Decision faced when failing to get a quorum [Brewer’11]
– Sacrifice availability by waiting for merger
– Sacrifice atomicity by ignoring the quorum

• Can we get CAP for weaker consistency?
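The choice described above can be made concrete with a small sketch. The names `read_value`, `Unavailable`, and the `"CP"` / `"AP"` policy strings are hypothetical, and replies are assumed to be `(version, value)` pairs from whichever replicas were reachable.

```python
class Unavailable(Exception):
    """Raised when the chosen policy refuses to answer without a quorum."""

def read_value(replies, n, policy):
    """Return (value, had_quorum) from the replies of reachable replicas.

    policy "CP": give up availability, refuse to answer without a majority.
    policy "AP": give up atomicity, answer from whatever is reachable.
    """
    have_quorum = len(replies) >= n // 2 + 1
    if not replies or (policy == "CP" and not have_quorum):
        raise Unavailable("no quorum reachable; wait for the partition to heal")
    _version, value = max(replies, key=lambda reply: reply[0])
    return value, have_quorum
```

In the `"AP"` branch the returned flag is `False` when no majority answered, so the caller at least knows the value may be stale.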
What does atomicity really mean?
[Figure: timelines of P1, P2, P3 with two reads (R) and writes W(5) and W(6), each operation spanning from invocation to response]

• Linearization Points
– Read ops appear as if they happened instantaneously at all nodes at some time between invocation and response

– Write ops appear as if they happened instantaneously at all nodes at some time between invocation and response
Definition of Atomicity
[Figure: reads R:6 (P1) and R:5 (P2), writes W(5) and W(6); labelled atomic]
Definition of Atomicity
[Figure: reads R:6 (P1) and R:6 (P2), writes W(5) and W(6); labelled atomic]

[Figure: reads R:5 (P1) and R:6 (P2), writes W(5) and W(6); labelled not atomic]
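For histories as small as the ones in these figures, atomicity can be checked by brute force: try every total order, keep those that respect real time, and see whether any is a legal register history. The invocation and response times below are invented for illustration, since the figures' exact timings are not recoverable from the transcript; `Op` and `is_atomic` are hypothetical names.

```python
from itertools import permutations
from typing import NamedTuple

class Op(NamedTuple):
    proc: str
    kind: str      # "W" or "R"
    value: int
    start: float   # invocation time
    end: float     # response time

def is_legal_register_order(order, initial=None):
    """Each read must return the most recently written value."""
    current = initial
    for op in order:
        if op.kind == "W":
            current = op.value
        elif op.value != current:
            return False
    return True

def is_atomic(history, initial=None):
    """True if some total order respects real time and is a legal register run."""
    for order in permutations(history):
        respects_real_time = all(
            not (later.end < earlier.start)   # 'later' must not finish before 'earlier' begins
            for i, earlier in enumerate(order)
            for later in order[i + 1:]
        )
        if respects_real_time and is_legal_register_order(order, initial):
            return True
    return False

# Assumed timings: W(6) overlaps both reads; P2's read finishes before P1's starts.
writes = [Op("P3", "W", 5, 0, 1), Op("P3", "W", 6, 2, 8)]
atomic_history = writes + [Op("P2", "R", 6, 3, 4), Op("P1", "R", 6, 5, 6)]
non_atomic_history = writes + [Op("P2", "R", 6, 3, 4), Op("P1", "R", 5, 5, 6)]
```

The search is factorial in the number of operations, so this is only meant for hand-sized examples.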
Atomicity too strong?
[Figure: reads R:5 (P1) and R:6 (P2), writes W(5) and W(6); labelled not atomic]

• Linearization points too strong?
– Why not just have R:5 appear atomically right after W(5)?
– Lamport: “If P2’s operator phones P1 and tells her ‘I just read 6’”, then P1 reading 5 afterwards contradicts what she was just told
Atomicity too strong?
[Figure: reads R:5 (P1) and R:6 (P2), writes W(5) and W(6); labelled not atomic, but sequentially consistent]

• Sequential consistency
– Weaker than atomicity
– Sequential consistency removes this “real-time” requirement
– Any global ordering OK as long as it respects local ordering
– Does Gilbert’s proof fall apart for sequential consistency?

• Causal memory
– Weaker than sequential
– No need to have global view, each process different view
– Local, read/writes immediately return to caller
– CAP theorem does not apply to causal memory

[Figure: P1 performs W(0) then R:1; P2 performs W(1) then R:0; labelled causally consistent]
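The difference between the two examples on this slide can be checked by enumerating interleavings. The sketch below tests sequential consistency only: it looks for one global order that keeps each process's program order and makes every read return the latest write. It ignores real time entirely, the function names are hypothetical, and the assignment of operations to processes follows the figure labels as read here.

```python
def interleavings(programs):
    """Yield every merge of the per-process op lists that keeps each list's order."""
    if all(not p for p in programs):
        yield []
        return
    for i, p in enumerate(programs):
        if p:
            rest = programs[:i] + [p[1:]] + programs[i + 1:]
            for tail in interleavings(rest):
                yield [p[0]] + tail

def is_sequentially_consistent(programs, initial=None):
    """True if some interleaving is a legal single-register history."""
    for order in interleavings(programs):
        current, legal = initial, True
        for kind, value in order:
            if kind == "W":
                current = value
            elif value != current:
                legal = False
                break
        if legal:
            return True
    return False

# P1 reads 5, P2 reads 6, P3 writes 5 then 6 (the "not atomic" execution).
reads_5_and_6 = [[("R", 5)], [("R", 6)], [("W", 5), ("W", 6)]]
# P1: W(0) then R:1; P2: W(1) then R:0 (the causal-memory figure).
crossed_reads = [[("W", 0), ("R", 1)], [("W", 1), ("R", 0)]]
```

The first history has a legal order (W(5), R:5, W(6), R:6), so it is sequentially consistent even though it is not atomic. The second has none: P1 needs W(0) before W(1) and P2 needs the opposite, which is only acceptable when each process may keep its own view, as in causal memory.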
Going really weak
• Eventual consistency
– When the network is not partitioned, all nodes eventually have the same value
– I.e. don’t be “consistent” at all times, but only after partitions heal!

• Based on a powerful technique: gossiping
– Periodically exchange “logs” with one random node
– Exchange must be constant-sized packets
– Set reconciliation, Merkle trees, etc.
– Use (clock, node_id) to break ties of events in log

• Properties of gossiping
– All nodes will have the same value in O(log N) gossip rounds, with high probability
– No positive-feedback cycles that congest the network
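A toy simulation shows the two ingredients working together: random pairwise exchange and tie-breaking on (clock, node_id). It is a sketch under simplifying assumptions, not the talk's protocol: each node holds a single entry rather than a log, both partners adopt the winning entry (push-pull), and at least two nodes are assumed. `merge` and `gossip_until_converged` are hypothetical names.

```python
import random

def merge(a, b):
    """Last-writer-wins: keep the entry with the larger (clock, node_id) stamp.
    An entry has the form ((clock, node_id), value)."""
    return max(a, b, key=lambda entry: entry[0])

def gossip_until_converged(states, rng, max_rounds=1000):
    """Each round, every node exchanges its entry with one random other node.
    Returns the number of rounds until all nodes hold the same entry."""
    n = len(states)
    for rounds in range(1, max_rounds + 1):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1                      # random peer other than i
            states[i] = states[j] = merge(states[i], states[j])
        if all(s == states[0] for s in states):
            return rounds
    raise RuntimeError("did not converge within max_rounds")

rng = random.Random(1)
# All clocks are equal, so node_id alone decides the winner.
states = [((1, node_id), f"v{node_id}") for node_id in range(64)]
rounds = gossip_until_converged(states, rng)
```

With equal clocks the entry from the highest node_id wins everywhere, which is the role the (clock, node_id) pair plays in breaking ties.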
BASE
• Catch-all for any consistency model C’ that enables C’-A-P
– Eventual consistency
– PRAM consistency
– Causal consistency

• Main ingredients
– Stale data
– Soft state (regenerable state)
– Approximate answers
Summary
• No need to ensure CAP at all times
– Switch between algorithms or satisfy subset at different times

• Weaken consistency model
– Choose weaker consistency:
• Causal memory (relatively strong) works around CAP

– Only be consistent when network isn’t partitioned:
• Eventual consistency (very weak) works around CAP

• Weaken partition-tolerance
– Some environments never partition, e.g. datacenters
– Tolerate unavailability in small quorums
– Some environments have recovery guarantees (partitions heal within X hours); perform conflict resolution afterwards
Related Work (ignored in talk)
• PRAM consistency (Pipelined RAM)
– Weaker than causal and non-blocking

• Eventual Linearizability (PODC’10)
– Becomes atomic after quiescent periods

• Gossiping & set reconciliation
– Lots of related work

CAP theorem by Ali Ghodsi

Editor's Notes

  1. Failed ops appear as completed at every node, XOR never occurred at any node