CAP, PACELC, and Determinism
Presentation Transcript

  • CAP, PACELC, and Determinism
    • Daniel Abadi, Yale University
    • January 18th, 2011
    • Twitter: @daniel_abadi
    • Blog: dbmsmusings.blogspot.com
  • MIT Theorem
  • CAP Theorem
  • Intuition Behind Proof (diagram: DB 1 and DB 2 connected by a network)
  • Intuition Behind Proof (diagram: DB 1 and DB 2 separated by a network partition)
  • Problems with CAP
    • Not as elegant as the MIT theorem
      • There are not three different choices!
      • CA and CP are indistinguishable
    • Used as an excuse to not bother with consistency
      • “Availability is really important to me, so CAP says I have to get rid of consistency”
  • You can have A and C when there are no partitions! (diagram: DB 1 and DB 2 connected by a network)
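The intuition behind the proof can be sketched as a toy two-replica store. This is a minimal illustration, not code from the talk: the `TwoNodeStore` class, the `prefer` setting, and the `Unavailable` exception are hypothetical names, and the "network" is just a boolean flag. It shows that both properties hold while the network is up, and that a choice is forced only once a partition occurs.

```python
class Unavailable(Exception):
    """Raised when a write is refused in order to keep the replicas consistent."""


class TwoNodeStore:
    """Toy model of DB 1 and DB 2 joined by a network (hypothetical, for illustration)."""

    def __init__(self, prefer):
        if prefer not in ("consistency", "availability"):
            raise ValueError("prefer must be 'consistency' or 'availability'")
        self.prefer = prefer
        self.partitioned = False
        self.db = [{}, {}]  # db[0] stands for DB 1, db[1] for DB 2

    def write(self, node, key, value):
        if not self.partitioned:
            # No partition: both replicas are updated, so the store is
            # consistent and available at the same time.
            self.db[0][key] = value
            self.db[1][key] = value
            return
        if self.prefer == "consistency":
            # Partition, consistency chosen: refuse the write rather than diverge.
            raise Unavailable("cannot reach the other replica")
        # Partition, availability chosen: accept locally; replicas may now disagree.
        self.db[node][key] = value

    def read(self, node, key):
        return self.db[node].get(key)
```

With `prefer="availability"`, a write to DB 1 during a partition succeeds but DB 2 keeps serving the old value; with `prefer="consistency"`, the same write raises `Unavailable`.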
  • Source of Confusion
    • Asymmetry of CAP properties
      • Some are properties of the system in general
      • Some are properties of the system only when there is a partition
  • Another Problem to Fix
    • There are other costs to consistency (besides availability in the face of partitions)
      • Overhead of synchronization schemes
        • Deterministic execution model significantly reduces the cost (see Thomson and Abadi in VLDB 2010)
      • Latency
        • If workload is geographically partitionable
          • Latency is not so bad
        • Otherwise
          • No way to get around at least one round-trip message
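The latency point can be made concrete with a small piece of arithmetic. The sketch below is a toy model under stated assumptions, not a measurement: the function name `min_commit_latency_ms` and its parameters are hypothetical, and it only encodes the lower bound from the slide, namely that a consistent commit which must reach a remote replica pays at least one network round trip on top of local work.

```python
def min_commit_latency_ms(local_ms, rtt_ms, needs_remote_round_trip):
    """Lower bound on commit latency in this toy model.

    local_ms: time spent on local processing (assumed input).
    rtt_ms: network round-trip time to the remote replica (assumed input).
    needs_remote_round_trip: False when the workload is geographically
        partitionable and the transaction touches only nearby data.
    """
    if local_ms < 0 or rtt_ms < 0:
        raise ValueError("latencies must be non-negative")
    return local_ms + (rtt_ms if needs_remote_round_trip else 0.0)
```

For example, with an assumed 1 ms of local work and an assumed 80 ms wide-area round trip, the bound is 81 ms when the remote replica must be consulted and 1 ms when it need not be.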
  • A Cut at Fixing Both Problems
    • PACELC
      • In the case of a partition (P), does the system choose availability (A) or consistency (C)?
      • Else (E), does the system choose latency (L) or consistency (C)?
  • Examples
    • PA/EL
      • Dynamo, SimpleDB, Cassandra, Riptano, CouchDB, Cloudant
    • PC/EC
      • ACID compliant database systems
    • PA/EC
      • GenieDB
      • See CIDR paper from Wada, Fekete, et al.
        • Indicates that Google App Engine data store (eventually consistent option) falls under this category
    • PC/EL: Existence is debatable, see next slide
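One way to keep the taxonomy straight is to treat a PACELC label as a pair of choices. The sketch below is only an illustration: the `Pacelc` class, the `EXAMPLES` table, and the `systems_in` helper are hypothetical names, and the table copies a few of the classifications listed on this slide rather than adding new ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pacelc:
    on_partition: str  # "A" (availability) or "C" (consistency)
    otherwise: str     # "L" (latency) or "C" (consistency)

    def __post_init__(self):
        if self.on_partition not in ("A", "C"):
            raise ValueError("on_partition must be 'A' or 'C'")
        if self.otherwise not in ("L", "C"):
            raise ValueError("otherwise must be 'L' or 'C'")

    def __str__(self):
        return f"P{self.on_partition}/E{self.otherwise}"


# A subset of the examples from the slide, keyed by system name.
EXAMPLES = {
    "Dynamo": Pacelc("A", "L"),
    "Cassandra": Pacelc("A", "L"),
    "CouchDB": Pacelc("A", "L"),
    "GenieDB": Pacelc("A", "C"),
}


def systems_in(category):
    """Return the example systems whose label equals category, e.g. 'PA/EL'."""
    return sorted(name for name, label in EXAMPLES.items() if str(label) == category)
```

Calling `systems_in("PA/EC")` returns only GenieDB from this table, and `str(Pacelc("C", "C"))` gives the `PC/EC` label used for ACID-compliant systems.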
  • Does PC/EL exist?
    • Strengthening consistency when there is a partition doesn’t seem to make sense
    • I think about it the following way
      • Some EL systems have stronger consistency guarantees than eventual consistency
        • Monotonic reads consistency
        • Timeline consistency
        • Read-your-writes consistency
        • Non-atomic consistency
      • CAP presents a problem for these guarantees
      • Hence, partitioning causes some loss of availability
      • Examples: PNUTS/Sherpa (original option), MongoDB
    • Dan Weinreb makes a case that this is too confusing: http://danweinreb.org/blog/improving-the-pacelc-taxonomy
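The argument that these stronger guarantees cost availability under a partition can be sketched with version numbers. This is a simplified illustration under stated assumptions: the `Session` class and `StaleReplica` exception are hypothetical, and each replica is assumed to expose a single monotonically increasing version for the data being read. A client that insists on read-your-writes and monotonic reads has to reject a replica that is behind what it has already written or seen, so if a partition leaves only a stale replica reachable, the read cannot be served.

```python
class StaleReplica(Exception):
    """Raised when the reachable replica is too old to honor the session guarantees."""


class Session:
    """Tracks what one client has written and read (hypothetical sketch)."""

    def __init__(self):
        self.last_written = 0  # highest version this client has written
        self.last_read = 0     # highest version this client has observed

    def note_write(self, version):
        self.last_written = max(self.last_written, version)

    def read(self, replica_version, value):
        # Read-your-writes: the replica must include this client's own writes.
        # Monotonic reads: the replica must not be older than an earlier read.
        required = max(self.last_written, self.last_read)
        if replica_version < required:
            raise StaleReplica(
                f"replica at version {replica_version}, session needs {required}"
            )
        self.last_read = replica_version
        return value
```

A system that only promises eventual consistency could return the stale value instead of raising, which is why it can stay available where this session cannot.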
  • A Case for P*/EC
    • Increased push for horizontally scalable transactional database systems
      • Reasons:
        • Cloud computing
        • Distributed applications
        • Desire to deploy applications on cheap, commodity hardware
    • Vast majority of currently available horizontally scalable systems are P*/EL
      • Developed by engineers at Google, Facebook, Yahoo, Amazon, etc.
      • These engineers can handle reduced consistency, but it’s really hard, and there needs to be an option for the rest of us
    • Also
      • Distributed concurrency control and commit protocols are expensive
      • Once consistency is gone, atomicity usually goes next ---> NoSQL
  • Key Problems to Overcome
    • High availability is critical, so replication must be a first-class citizen
      • Today’s systems generally act, then replicate
        • Complicates semantics of sending read queries to replicas
        • Need confirmation from replica before commit (increased latency) if you want durability and high availability
        • In-progress transactions must be aborted upon a master failure
      • Want system that replicates then acts
    • Distributed concurrency control and commit are expensive, want to get rid of them both
  • Key Idea
    • Instead of weakening ACID, strengthen it!
      • Guaranteeing equivalence to SOME serial order makes active replication difficult
        • Running the same set of xacts on two different replicas might cause replicas to diverge
    • Disallow any nondeterministic behavior
    • Disallow aborts caused by DBMS
      • Disallow deadlock
      • Distributed commit much easier if you don’t have to worry about aborts
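The divergence problem is easy to reproduce in a few lines. The sketch below is a toy example, not the mechanism from the VLDB 2010 paper: the transactions `add_ten` and `double` and the `run` helper are hypothetical. Each replica executes the same two transactions in a valid serial order, yet the two replicas end in different states because they picked different orders. Fixing one agreed order removes the divergence.

```python
def add_ten(state):
    state["x"] += 10


def double(state):
    state["x"] *= 2


def run(txns, initial):
    """Execute transactions one after another on a private copy of the state."""
    state = dict(initial)
    for txn in txns:
        txn(state)
    return state


batch = [add_ten, double]

# Traditional guarantee: equivalence to SOME serial order. Both orders qualify.
replica_a = run(batch, {"x": 1})            # add_ten then double
replica_b = run(reversed(batch), {"x": 1})  # double then add_ten
diverged = replica_a != replica_b

# Deterministic guarantee: every replica follows the same agreed order.
replica_c = run(batch, {"x": 1})
replica_d = run(batch, {"x": 1})
converged = replica_c == replica_d
```

Here `replica_a` ends with `x` equal to 22 and `replica_b` with `x` equal to 12, while `replica_c` and `replica_d` agree.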
  • Consequences of Determinism
    • Replicas produce the same output, given the same input
      • Facilitates active replication
    • Only initial input needs to be logged; state at failure can be reconstructed from this input log (or from a replica)
    • Active distributed xacts not aborted upon node failure
      • Greatly reduces (or eliminates) cost of distributed commit
        • Don’t have to worry about nodes failing during commit protocol
        • Don’t have to worry about effects of transaction making it to disk before promising to commit transaction
        • Just need one message from any node that potentially can deterministically abort the xact
        • This message can be sent in the middle of the xact, as soon as the node knows it will commit
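The claim that only the initial input needs to be logged can be sketched as replay of an input log. This is a minimal illustration with assumed names: `DeterministicReplica`, `replay`, and the tuple format `(op, key, amount)` are hypothetical, and the two operations stand in for arbitrary deterministic transaction logic.

```python
class DeterministicReplica:
    """Applies logged inputs in order; the same log always yields the same state."""

    def __init__(self):
        self.state = {}

    def apply(self, txn):
        op, key, amount = txn
        if op == "set":
            self.state[key] = amount
        elif op == "add":
            self.state[key] = self.state.get(key, 0) + amount
        else:
            raise ValueError(f"unknown operation: {op}")


def replay(log):
    """Rebuild a replica from scratch using nothing but the input log."""
    replica = DeterministicReplica()
    for txn in log:
        replica.apply(txn)
    return replica


# The log records inputs only, not the effects of each transaction.
input_log = [("set", "a", 5), ("add", "a", 3), ("add", "b", 1)]

primary = replay(input_log)
recovered = replay(input_log)  # e.g. a node restarting after a failure
```

Because `apply` has no source of nondeterminism, `recovered.state` matches `primary.state`, which is also what makes active replication straightforward.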
  • So how can this be done?
    • Serialize transactions
      • Slow
    • Partition data across cores, serialize transactions in each partition
      • My work with Stonebraker, Madden, and Harizopoulos shows that this works well if workload is partitionable (H-Store project)
      • H-Store project became VoltDB
    • What about non-partitionable workloads?
      • See Thomson and Abadi VLDB 2010 paper: “The case for determinism in database systems”
      • db.cs.yale.edu/determinism-vldb10.pdf
    • Bottom line:
      • Cost of consistency is less than people think
      • Latency is still an issue over a WAN unless workload is geographically partitionable
      • VLDB 2010 paper makes P*/EC systems more viable
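The partition-and-serialize approach from this slide can be sketched as follows. This is not H-Store or VoltDB code: the `PartitionedStore` class, the `partition_of` helper, and the restriction to single-key transactions are assumptions made to keep the example short. Each partition drains its own queue one transaction at a time, so no locking is needed inside a partition, and a real system could run the partitions on separate cores.

```python
import zlib


def partition_of(key, num_partitions):
    # Use a stable hash so every run and every replica maps a key the same way.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class PartitionedStore:
    """Toy store: data is split across partitions, each executed serially."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.data = [{} for _ in range(num_partitions)]
        self.queues = [[] for _ in range(num_partitions)]

    def submit(self, key, fn):
        """Queue a single-key transaction fn(old_value) -> new_value."""
        self.queues[partition_of(key, self.num_partitions)].append((key, fn))

    def run_all(self):
        # Partitions are independent; within one partition, transactions run
        # strictly in submission order.
        for p, queue in enumerate(self.queues):
            for key, fn in queue:
                self.data[p][key] = fn(self.data[p].get(key, 0))
            queue.clear()

    def get(self, key):
        return self.data[partition_of(key, self.num_partitions)].get(key)
```

This only covers the partitionable case. A transaction touching keys in several partitions does not fit this sketch, which is the gap the slide points to the VLDB 2010 paper to address.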