SlideShare a Scribd company logo
1 of 50
CAP Theorem
CSE 40822-Cloud Computing-Fall 2014
Prof. Dong Wang
CAP Theorem
• Conjectured by Prof. Eric Brewer at
PODC (Principle of Distributed
Computing) 2000 keynote talk
• Described the trade-offs involved in
distributed system
• It is impossible for a web service to
provide following three guarantees at
the same time:
• Consistency
• Availability
• Partition-tolerance
CAP Theorem
• Consistency:
– All nodes should see the same data at the same time
• Availability:
– Node failures do not prevent survivors from
continuing to operate
• Partition-tolerance:
– The system continues to operate despite network
partitions
• A distributed system can satisfy any two of these
guarantees at the same time but not all three
CAP Theorem
C A
P
CAP Theorem
• A simple example:
Hotel Booking: are we double-booking the
same room?
Bob Dong
CAP Theorem
• A simple example:
Hotel Booking: are we double-booking the
same room?
Bob Dong
CAP Theorem
• A simple example:
Hotel Booking: are we double-booking the
same room?
Bob Dong
CAP Theorem: Proof
• 2002: Proven by research
conducted by Nancy Lynch
and Seth Gilbert at MIT
Gilbert, Seth, and Nancy Lynch. "Brewer's
conjecture and the feasibility of consistent,
available, partition-tolerant web services."
ACM SIGACT News 33.2 (2002): 51-59.
CAP Theorem: Proof
• A simple proof using two nodes:
A B
CAP Theorem: Proof
• A simple proof using two nodes:
A B
Not Consistent!
Respond to client
CAP Theorem: Proof
• A simple proof using two nodes:
A B
Not Available!
Wait to be updated
CAP Theorem: Proof
• A simple proof using two nodes:
A B
Not Partition
Tolerant!
A gets updated from B
Why this is important?
• The future of databases is distributed (Big
Data Trend, etc.)
• CAP theorem describes the trade-offs
involved in distributed systems
• A proper understanding of CAP theorem is
essential to making decisions about the future
of distributed database design
• Misunderstanding can lead to erroneous or
inappropriate design choices
Problem for Relational Database to Scale
• The Relational Database is built on the
principle of ACID (Atomicity, Consistency,
Isolation, Durability)
• It implies that a truly distributed relational
database should have availability, consistency
and partition tolerance.
• Which unfortunately is impossible …
Revisit CAP Theorem
C A
P
• Of the following three
guarantees potentially offered
a by distributed systems:
• Consistency
• Availability
• Partition tolerance
• Pick two
• This suggests there are three
kinds of distributed systems:
• CP
• AP
• CA
Any problems?
A popular misconception: 2 out 3
• How about CA?
• Can a distributed system
(with unreliable network)
really be not tolerant of
partitions?
C A
A few witnesses
• Coda Hale, Yammer software engineer:
– “Of the CAP theorem’s Consistency, Availability,
and Partition Tolerance, Partition Tolerance is
mandatory in distributed systems. You cannot
not choose it.”
http://codahale.com/you-cant-sacrifice-partition-tolerance/
A few witnesses
• Werner Vogels, Amazon CTO
– “An important observation is that in larger
distributed-scale systems, network partitions are a
given; therefore, consistency and availability
cannot be achieved at the same time.”
http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
A few witnesses
• Daneil Abadi, Co-founder of Hadapt
– So in reality, there are only two types of systems
... I.e., if there is a partition, does the system give
up availability or consistency?
http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html
CAP Theorem 12 year later
• Prof. Eric Brewer: father of CAP
theorem
• “The “2 of 3” formulation was
always misleading because it
tended to oversimplify the
tensions among properties. ...
• CAP prohibits only a tiny part of
the design space: perfect
availability and consistency in the
presence of partitions, which are
rare.”
http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
Consistency or Availability
C A
P
• Consistency and Availability is
not “binary” decision
• AP systems relax consistency
in favor of availability – but
are not inconsistent
• CP systems sacrifice
availability for consistency-
but are not unavailable
• This suggests both AP and CP
systems can offer a degree of
consistency, and availability,
as well as partition tolerance
AP: Best Effort Consistency
• Example:
– Web Caching
– DNS
• Trait:
– Optimistic
– Expiration/Time-to-live
– Conflict resolution
CP: Best Effort Availability
• Example:
– Majority protocols
– Distributed Locking (Google Chubby Lock service)
• Trait:
– Pessimistic locking
– Make minority partition unavailable
Types of Consistency
• Strong Consistency
– After the update completes, any subsequent access
will return the same updated value.
• Weak Consistency
– It is not guaranteed that subsequent accesses will
return the updated value.
• Eventual Consistency
– Specific form of weak consistency
– It is guaranteed that if no new updates are made to
object, eventually all accesses will return the last
updated value (e.g., propagate updates to replicas in a
lazy fashion)
Eventual Consistency Variations
• Causal consistency
– Processes that have causal relationship will see
consistent data
• Read-your-write consistency
– A process always accesses the data item after it’s
update operation and never sees an older value
• Session consistency
– As long as session exists, system guarantees read-
your-write consistency
– Guarantees do not overlap sessions
Eventual Consistency Variations
• Monotonic read consistency
– If a process has seen a particular value of data item,
any subsequent processes will never return any
previous values
• Monotonic write consistency
– The system guarantees to serialize the writes by the
same process
• In practice
– A number of these properties can be combined
– Monotonic reads and read-your-writes are most
desirable
Eventual Consistency
- A Facebook Example
• Bob finds an interesting story and shares with
Alice by posting on her Facebook wall
• Bob asks Alice to check it out
• Alice logs in her account, checks her Facebook
wall but finds:
- Nothing is there!
Eventual Consistency
- A Facebook Example
• Bob tells Alice to wait a bit and check out later
• Alice waits for a minute or so and checks back:
- She finds the story Bob shared with her!
Eventual Consistency
- A Facebook Example
• Reason: it is possible because Facebook uses
an eventual consistent model
• Why Facebook chooses eventual consistent
model over the strong consistent one?
– Facebook has more than 1 billion active users
– It is non-trivial to efficiently and reliably store the
huge amount of data generated at any given time
– Eventual consistent model offers the option to
reduce the load and improve availability
Eventual Consistency
- A Dropbox Example
• Dropbox enabled immediate consistency via
synchronization in many cases.
• However, what happens in case of a network
partition?
Eventual Consistency
- A Dropbox Example
• Let’s do a simple experiment here:
– Open a file in your drop box
– Disable your network connection (e.g., WiFi, 4G)
– Try to edit the file in the drop box: can you do
that?
– Re-enable your network connection: what
happens to your dropbox folder?
Eventual Consistency
- A Dropbox Example
• Dropbox embraces eventual consistency:
– Immediate consistency is impossible in case of a
network partition
– Users will feel bad if their word documents freeze
each time they hit Ctrl+S , simply due to the large
latency to update all devices across WAN
– Dropbox is oriented to personal syncing, not on
collaboration, so it is not a real limitation.
Eventual Consistency
- An ATM Example
• In design of automated teller machine (ATM):
– Strong consistency appear to be a nature choice
– However, in practice, A beats C
– Higher availability means higher revenue
– ATM will allow you to withdraw money even if the
machine is partitioned from the network
– However, it puts a limit on the amount of withdraw
(e.g., $200)
– The bank might also charge you a fee when a
overdraft happens
Dynamic Tradeoff between C and A
• An airline reservation system:
– When most of seats are available: it is ok to rely
on somewhat out-of-date data, availability is more
critical
– When the plane is close to be filled: it needs more
accurate data to ensure the plane is not
overbooked, consistency is more critical
• Neither strong consistency nor guaranteed
availability, but it may significantly increase
the tolerance of network disruption
Heterogeneity: Segmenting C and A
• No single uniform requirement
– Some aspects require strong consistency
– Others require high availability
• Segment the system into different components
– Each provides different types of guarantees
• Overall guarantees neither consistency nor
availability
– Each part of the service gets exactly what it needs
• Can be partitioned along different dimensions
Discussion
• In an e-commercial system (e.g., Amazon, e-Bay,
etc), what are the trade-offs between consistency
and availability you can think of? What is your
strategy?
• Hint -> Things you might want to consider:
– Different types of data (e.g., shopping cart, billing,
product, etc.)
– Different types of operations (e.g., query, purchase,
etc.)
– Different types of services (e.g., distributed lock, DNS,
etc.)
– Different groups of users (e.g., users in different
geographic areas, etc.)
Partitioning Examples
• Data Partitioning
• Operational Partitioning
• Functional Partitioning
• User Partitioning
• Hierarchical Partitioning
Partitioning Examples
Data Partitioning
• Different data may require different consistency
and availability
• Example:
• Shopping cart: high availability, responsive, can
sometimes suffer anomalies
• Product information need to be available, slight
variation in inventory is sufferable
• Checkout, billing, shipping records must be consistent
Partitioning Examples
Operational Partitioning
• Each operation may require different balance
between consistency and availability
• Example:
• Reads: high availability; e.g.., “query”
• Writes: high consistency, lock when writing; e.g.,
“purchase”
Partitioning Examples
Functional Partitioning
• System consists of sub-services
• Different sub-services provide different
balances
• Example: A comprehensive distributed system
– Distributed lock service (e.g., Chubby) :
• Strong consistency
– DNS service:
• High availability
Partitioning Examples
User Partitioning
• Try to keep related data close together to
assure better performance
• Example: Craglist
– Might want to divide its service into several data
centers, e.g., east coast and west coast
• Users get high performance (e.g., high availability and
good consistency) if they query servers closet to them
• Poorer performance if a New York user query Craglist in
San Francisco
Partitioning Examples
Hierarchical Partitioning
• Large global service with local “extensions”
• Different location in hierarchy may use
different consistency
• Example:
– Local servers (better connected) guarantee more
consistency and availability
– Global servers has more partition and relax one of
the requirement
What if there are no partitions?
• Tradeoff between Consistency and Latency:
• Caused by the possibility of failure in
distributed systems
– High availability -> replicate data -> consistency
problem
• Basic idea:
– Availability and latency are arguably the same
thing: unavailable -> extreme high latency
– Achieving different levels of
consistency/availability takes different amount of
time
CAP -> PACELC
• A more complete description of the space of
potential tradeoffs for distributed system:
– If there is a partition (P), how does the system
trade off availability and consistency (A and C);
else (E), when the system is running normally in
the absence of partitions, how does the system
trade off latency (L) and consistency (C)?
Abadi, Daniel J. "Consistency tradeoffs in modern distributed database
system design." Computer-IEEE Computer Magazine 45.2 (2012): 37.
PACELC
C A C L
Partitioned Normal
Examples
• PA/EL Systems: Give up both Cs for availability and lower
latency
– Dynamo, Cassandra, Riak
• PC/EC Systems: Refuse to give up consistency and pay
the cost of availability and latency
– BigTable, Hbase, VoltDB/H-Store
• PA/EC Systems: Give up consistency when a partition
happens and keep consistency in normal operations
– MongoDB
• PC/EL System: Keep consistency if a partition occurs but
gives up consistency for latency in normal operations
– Yahoo! PNUTS
Contact: Prof. Dong Wang: dwang5@nd.edu
http://www3.nd.edu/~dwang5/teach/spring15/spring15_wang_flyer.pdf
CSE 40437/60437, Spring 2015:
Social Sensing & Cyber-Physical Systems
Cyber
Physical
Systems
Embedded Computing
Systems
Green Navigation
Systems
Zero-Energy Buildings
Smart Grids
Body Area Networks
Social Sensing
Contact: Prof. Dong Wang: dwang5@nd.edu
http://www3.nd.edu/~dwang5/teach/spring15/spring15_wang_flyer.pdf
CSE 40437/60437, Spring 2015:
Social Sensing & Cyber-Physical Systems
Cyber
Physical
Systems
Embedded Computing
Systems
Green Navigation
Systems
Zero-Energy Buildings
Smart Grids
Body Area Networks
Social Sensing
Analytics
Data
News and Public
Sources
Events
Decision
Support
Boston Bombing
Hurricane Sandy
Egypt unrest
Stock Prediction
(Money)
Traffic Monitoring
(Time)
Disaster Response
(Lives)
Geo Tagging (Smart
City)
Contact: Prof. Dong Wang: dwang5@nd.edu
http://www3.nd.edu/~dwang5/teach/spring15/spring15_wang_flyer.pdf
Applications
Sensors
People
CSE 40437/60437, Spring 2015:
Social Sensing & Cyber-Physical Systems
Thank you!

More Related Content

Similar to cse40822-CAP.pptx

Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInLinkedIn
 
System design fundamentals CAP.pdf
System design fundamentals CAP.pdfSystem design fundamentals CAP.pdf
System design fundamentals CAP.pdfUsmanAhmed269749
 
Beyond Strong Consistency
Beyond Strong ConsistencyBeyond Strong Consistency
Beyond Strong Consistencyjsinglet
 
Ebay架构原则
Ebay架构原则Ebay架构原则
Ebay架构原则yiditushe
 
designing distributed scalable and reliable systems
designing distributed scalable and reliable systemsdesigning distributed scalable and reliable systems
designing distributed scalable and reliable systemsMauro Servienti
 
Cloud Ready Apps
Cloud Ready AppsCloud Ready Apps
Cloud Ready AppsDotitude
 
Big Data Day LA 2015 - Lessons Learned Designing Data Ingest Systems
Big Data Day LA 2015 - Lessons Learned Designing Data Ingest SystemsBig Data Day LA 2015 - Lessons Learned Designing Data Ingest Systems
Big Data Day LA 2015 - Lessons Learned Designing Data Ingest Systemsaaamase
 
Docker-N-Beyond
Docker-N-BeyondDocker-N-Beyond
Docker-N-Beyondsantosh007
 
Container Attached Storage with OpenEBS - CNCF Paris Meetup
Container Attached Storage with OpenEBS - CNCF Paris MeetupContainer Attached Storage with OpenEBS - CNCF Paris Meetup
Container Attached Storage with OpenEBS - CNCF Paris MeetupMayaData Inc
 
Storage Systems For Scalable systems
Storage Systems For Scalable systemsStorage Systems For Scalable systems
Storage Systems For Scalable systemselliando dias
 
E Bay Best Practices For Scaling Websites
E Bay Best Practices For Scaling WebsitesE Bay Best Practices For Scaling Websites
E Bay Best Practices For Scaling WebsitesGeorge Ang
 
Qcon best practices for scaling websites
Qcon best practices for scaling websitesQcon best practices for scaling websites
Qcon best practices for scaling websitesyouzitang
 
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...confluent
 
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...Bob Pusateri
 
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...Marina Peregud
 

Similar to cse40822-CAP.pptx (20)

Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
 
System design fundamentals CAP.pdf
System design fundamentals CAP.pdfSystem design fundamentals CAP.pdf
System design fundamentals CAP.pdf
 
ds7_con.ppt
ds7_con.pptds7_con.ppt
ds7_con.ppt
 
Beyond Strong Consistency
Beyond Strong ConsistencyBeyond Strong Consistency
Beyond Strong Consistency
 
Ebay架构原则
Ebay架构原则Ebay架构原则
Ebay架构原则
 
Hbase hive pig
Hbase hive pigHbase hive pig
Hbase hive pig
 
designing distributed scalable and reliable systems
designing distributed scalable and reliable systemsdesigning distributed scalable and reliable systems
designing distributed scalable and reliable systems
 
Cloud Ready Apps
Cloud Ready AppsCloud Ready Apps
Cloud Ready Apps
 
Big Data Day LA 2015 - Lessons Learned Designing Data Ingest Systems
Big Data Day LA 2015 - Lessons Learned Designing Data Ingest SystemsBig Data Day LA 2015 - Lessons Learned Designing Data Ingest Systems
Big Data Day LA 2015 - Lessons Learned Designing Data Ingest Systems
 
Docker-N-Beyond
Docker-N-BeyondDocker-N-Beyond
Docker-N-Beyond
 
Container Attached Storage with OpenEBS - CNCF Paris Meetup
Container Attached Storage with OpenEBS - CNCF Paris MeetupContainer Attached Storage with OpenEBS - CNCF Paris Meetup
Container Attached Storage with OpenEBS - CNCF Paris Meetup
 
Storage Systems For Scalable systems
Storage Systems For Scalable systemsStorage Systems For Scalable systems
Storage Systems For Scalable systems
 
No sql (not only sql)
No sql                 (not only sql)No sql                 (not only sql)
No sql (not only sql)
 
E Bay Best Practices For Scaling Websites
E Bay Best Practices For Scaling WebsitesE Bay Best Practices For Scaling Websites
E Bay Best Practices For Scaling Websites
 
Qcon best practices for scaling websites
Qcon best practices for scaling websitesQcon best practices for scaling websites
Qcon best practices for scaling websites
 
Hbase hivepig
Hbase hivepigHbase hivepig
Hbase hivepig
 
Cap in depth
Cap in depthCap in depth
Cap in depth
 
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...
 
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...
 
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
Антон Бойко "Разделяй и властвуй — набор практик для построения масштабируемо...
 

Recently uploaded

dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 

Recently uploaded (20)

dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 

cse40822-CAP.pptx

  • 1. CAP Theorem CSE 40822-Cloud Computing-Fall 2014 Prof. Dong Wang
  • 2. CAP Theorem • Conjectured by Prof. Eric Brewer at PODC (Principle of Distributed Computing) 2000 keynote talk • Described the trade-offs involved in distributed system • It is impossible for a web service to provide following three guarantees at the same time: • Consistency • Availability • Partition-tolerance
  • 3. CAP Theorem • Consistency: – All nodes should see the same data at the same time • Availability: – Node failures do not prevent survivors from continuing to operate • Partition-tolerance: – The system continues to operate despite network partitions • A distributed system can satisfy any two of these guarantees at the same time but not all three
  • 5. CAP Theorem • A simple example: Hotel Booking: are we double-booking the same room? Bob Dong
  • 6. CAP Theorem • A simple example: Hotel Booking: are we double-booking the same room? Bob Dong
  • 7. CAP Theorem • A simple example: Hotel Booking: are we double-booking the same room? Bob Dong
  • 8. CAP Theorem: Proof • 2002: Proven by research conducted by Nancy Lynch and Seth Gilbert at MIT Gilbert, Seth, and Nancy Lynch. "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services." ACM SIGACT News 33.2 (2002): 51-59.
  • 9. CAP Theorem: Proof • A simple proof using two nodes: A B
  • 10. CAP Theorem: Proof • A simple proof using two nodes: A B Not Consistent! Respond to client
  • 11. CAP Theorem: Proof • A simple proof using two nodes: A B Not Available! Wait to be updated
  • 12. CAP Theorem: Proof • A simple proof using two nodes: A B Not Partition Tolerant! A gets updated from B
  • 13. Why this is important? • The future of databases is distributed (Big Data Trend, etc.) • CAP theorem describes the trade-offs involved in distributed systems • A proper understanding of CAP theorem is essential to making decisions about the future of distributed database design • Misunderstanding can lead to erroneous or inappropriate design choices
  • 14. Problem for Relational Database to Scale • The Relational Database is built on the principle of ACID (Atomicity, Consistency, Isolation, Durability) • It implies that a truly distributed relational database should have availability, consistency and partition tolerance. • Which unfortunately is impossible …
  • 15. Revisit CAP Theorem C A P • Of the following three guarantees potentially offered a by distributed systems: • Consistency • Availability • Partition tolerance • Pick two • This suggests there are three kinds of distributed systems: • CP • AP • CA Any problems?
  • 16. A popular misconception: 2 out 3 • How about CA? • Can a distributed system (with unreliable network) really be not tolerant of partitions? C A
  • 17. A few witnesses • Coda Hale, Yammer software engineer: – “Of the CAP theorem’s Consistency, Availability, and Partition Tolerance, Partition Tolerance is mandatory in distributed systems. You cannot not choose it.” http://codahale.com/you-cant-sacrifice-partition-tolerance/
  • 18. A few witnesses • Werner Vogels, Amazon CTO – “An important observation is that in larger distributed-scale systems, network partitions are a given; therefore, consistency and availability cannot be achieved at the same time.” http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
  • 19. A few witnesses • Daneil Abadi, Co-founder of Hadapt – So in reality, there are only two types of systems ... I.e., if there is a partition, does the system give up availability or consistency? http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html
  • 20. CAP Theorem 12 year later • Prof. Eric Brewer: father of CAP theorem • “The “2 of 3” formulation was always misleading because it tended to oversimplify the tensions among properties. ... • CAP prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare.” http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
  • 21. Consistency or Availability C A P • Consistency and Availability is not “binary” decision • AP systems relax consistency in favor of availability – but are not inconsistent • CP systems sacrifice availability for consistency- but are not unavailable • This suggests both AP and CP systems can offer a degree of consistency, and availability, as well as partition tolerance
  • 22. AP: Best Effort Consistency • Example: – Web Caching – DNS • Trait: – Optimistic – Expiration/Time-to-live – Conflict resolution
  • 23. CP: Best Effort Availability • Example: – Majority protocols – Distributed Locking (Google Chubby Lock service) • Trait: – Pessimistic locking – Make minority partition unavailable
  • 24. Types of Consistency • Strong Consistency – After the update completes, any subsequent access will return the same updated value. • Weak Consistency – It is not guaranteed that subsequent accesses will return the updated value. • Eventual Consistency – Specific form of weak consistency – It is guaranteed that if no new updates are made to object, eventually all accesses will return the last updated value (e.g., propagate updates to replicas in a lazy fashion)
  • 25. Eventual Consistency Variations • Causal consistency – Processes that have causal relationship will see consistent data • Read-your-write consistency – A process always accesses the data item after it’s update operation and never sees an older value • Session consistency – As long as session exists, system guarantees read- your-write consistency – Guarantees do not overlap sessions
  • 26. Eventual Consistency Variations • Monotonic read consistency – If a process has seen a particular value of data item, any subsequent processes will never return any previous values • Monotonic write consistency – The system guarantees to serialize the writes by the same process • In practice – A number of these properties can be combined – Monotonic reads and read-your-writes are most desirable
  • 27. Eventual Consistency - A Facebook Example • Bob finds an interesting story and shares with Alice by posting on her Facebook wall • Bob asks Alice to check it out • Alice logs in her account, checks her Facebook wall but finds: - Nothing is there!
  • 28. Eventual Consistency - A Facebook Example • Bob tells Alice to wait a bit and check out later • Alice waits for a minute or so and checks back: - She finds the story Bob shared with her!
  • 29. Eventual Consistency - A Facebook Example • Reason: it is possible because Facebook uses an eventual consistent model • Why Facebook chooses eventual consistent model over the strong consistent one? – Facebook has more than 1 billion active users – It is non-trivial to efficiently and reliably store the huge amount of data generated at any given time – Eventual consistent model offers the option to reduce the load and improve availability
  • 30. Eventual Consistency - A Dropbox Example • Dropbox enabled immediate consistency via synchronization in many cases. • However, what happens in case of a network partition?
  • 31. Eventual Consistency - A Dropbox Example • Let’s do a simple experiment here: – Open a file in your drop box – Disable your network connection (e.g., WiFi, 4G) – Try to edit the file in the drop box: can you do that? – Re-enable your network connection: what happens to your dropbox folder?
  • 32. Eventual Consistency - A Dropbox Example • Dropbox embraces eventual consistency: – Immediate consistency is impossible in case of a network partition – Users will feel bad if their word documents freeze each time they hit Ctrl+S , simply due to the large latency to update all devices across WAN – Dropbox is oriented to personal syncing, not on collaboration, so it is not a real limitation.
  • 33. Eventual Consistency - An ATM Example • In design of automated teller machine (ATM): – Strong consistency appear to be a nature choice – However, in practice, A beats C – Higher availability means higher revenue – ATM will allow you to withdraw money even if the machine is partitioned from the network – However, it puts a limit on the amount of withdraw (e.g., $200) – The bank might also charge you a fee when a overdraft happens
  • 34. Dynamic Tradeoff between C and A • An airline reservation system: – When most of seats are available: it is ok to rely on somewhat out-of-date data, availability is more critical – When the plane is close to be filled: it needs more accurate data to ensure the plane is not overbooked, consistency is more critical • Neither strong consistency nor guaranteed availability, but it may significantly increase the tolerance of network disruption
  • 35. Heterogeneity: Segmenting C and A • No single uniform requirement – Some aspects require strong consistency – Others require high availability • Segment the system into different components – Each provides different types of guarantees • Overall guarantees neither consistency nor availability – Each part of the service gets exactly what it needs • Can be partitioned along different dimensions
  • 36. Discussion • In an e-commercial system (e.g., Amazon, e-Bay, etc), what are the trade-offs between consistency and availability you can think of? What is your strategy? • Hint -> Things you might want to consider: – Different types of data (e.g., shopping cart, billing, product, etc.) – Different types of operations (e.g., query, purchase, etc.) – Different types of services (e.g., distributed lock, DNS, etc.) – Different groups of users (e.g., users in different geographic areas, etc.)
  • 37. Partitioning Examples • Data Partitioning • Operational Partitioning • Functional Partitioning • User Partitioning • Hierarchical Partitioning
  • 38. Partitioning Examples Data Partitioning • Different data may require different consistency and availability • Example: • Shopping cart: high availability, responsive, can sometimes suffer anomalies • Product information need to be available, slight variation in inventory is sufferable • Checkout, billing, shipping records must be consistent
  • 39. Partitioning Examples Operational Partitioning • Each operation may require different balance between consistency and availability • Example: • Reads: high availability; e.g.., “query” • Writes: high consistency, lock when writing; e.g., “purchase”
  • 40. Partitioning Examples Functional Partitioning • System consists of sub-services • Different sub-services provide different balances • Example: A comprehensive distributed system – Distributed lock service (e.g., Chubby) : • Strong consistency – DNS service: • High availability
  • 41. Partitioning Examples User Partitioning • Try to keep related data close together to assure better performance • Example: Craglist – Might want to divide its service into several data centers, e.g., east coast and west coast • Users get high performance (e.g., high availability and good consistency) if they query servers closet to them • Poorer performance if a New York user query Craglist in San Francisco
  • 42. Partitioning Examples Hierarchical Partitioning • Large global service with local “extensions” • Different location in hierarchy may use different consistency • Example: – Local servers (better connected) guarantee more consistency and availability – Global servers has more partition and relax one of the requirement
  • 43. What if there are no partitions? • Tradeoff between Consistency and Latency: • Caused by the possibility of failure in distributed systems – High availability -> replicate data -> consistency problem • Basic idea: – Availability and latency are arguably the same thing: unavailable -> extreme high latency – Achieving different levels of consistency/availability takes different amount of time
  • 44. CAP -> PACELC • A more complete description of the space of potential tradeoffs for distributed system: – If there is a partition (P), how does the system trade off availability and consistency (A and C); else (E), when the system is running normally in the absence of partitions, how does the system trade off latency (L) and consistency (C)? Abadi, Daniel J. "Consistency tradeoffs in modern distributed database system design." Computer-IEEE Computer Magazine 45.2 (2012): 37.
  • 45. PACELC C A C L Partitioned Normal
  • 46. Examples • PA/EL Systems: Give up both Cs for availability and lower latency – Dynamo, Cassandra, Riak • PC/EC Systems: Refuse to give up consistency and pay the cost of availability and latency – BigTable, Hbase, VoltDB/H-Store • PA/EC Systems: Give up consistency when a partition happens and keep consistency in normal operations – MongoDB • PC/EL System: Keep consistency if a partition occurs but gives up consistency for latency in normal operations – Yahoo! PNUTS
  • 47. Contact: Prof. Dong Wang: dwang5@nd.edu http://www3.nd.edu/~dwang5/teach/spring15/spring15_wang_flyer.pdf CSE 40437/60437, Spring 2015: Social Sensing & Cyber-Physical Systems Cyber Physical Systems Embedded Computing Systems Green Navigation Systems Zero-Energy Buildings Smart Grids Body Area Networks Social Sensing
  • 48. Contact: Prof. Dong Wang: dwang5@nd.edu http://www3.nd.edu/~dwang5/teach/spring15/spring15_wang_flyer.pdf CSE 40437/60437, Spring 2015: Social Sensing & Cyber-Physical Systems Cyber Physical Systems Embedded Computing Systems Green Navigation Systems Zero-Energy Buildings Smart Grids Body Area Networks Social Sensing
  • 49. Analytics Data News and Public Sources Events Decision Support Boston Bombing Hurricane Sandy Egypt unrest Stock Prediction (Money) Traffic Monitoring (Time) Disaster Response (Lives) Geo Tagging (Smart City) Contact: Prof. Dong Wang: dwang5@nd.edu http://www3.nd.edu/~dwang5/teach/spring15/spring15_wang_flyer.pdf Applications Sensors People CSE 40437/60437, Spring 2015: Social Sensing & Cyber-Physical Systems