cap
cap theorem


• Eric Brewer (ex-Inktomi)
• Proved by Lynch and Gilbert
cap theorem
It is impossible in the asynchronous network
    model to implement a read/write object
    that guarantees the following properties:

               - Availability
  - Atomic consistency in all fair executions

        Or: If the network is broken,
         your database won’t work
AP vs CP

• Real choices are
 • Available - Partition tolerant (AP)
 • Consistent - Partition tolerant (CP)
AP
• Multiple Nodes participate in writes
• System will be Eventually Consistent
 • Storage System guarantees if there are no
    new updates, all reads will eventually
    return the same, last updated value
Examples:
- DNS
- Async replication
- MongoDB with Slave-OK
- Memcache
eventual consistency
[Diagram: a Master above two Slaves, each Slave above a Client]

Assuming updates 1,2,3,4,5
Client will expect 1,2,2,2,3,4,5,5,5
eventual consistency
[Diagram: a Master above two Slaves, each Slave above a Client]

However, we could get this: 1,2,2,4,2,5
eventual consistency

• Monotonic read consistency
• Pin client to certain slave / app server
 • Failover still fails
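
One way to picture monotonic read consistency without pinning: the client remembers the newest version it has seen and refuses anything older. This is only a sketch. `Replica`, `MonotonicClient` and the integer version numbers are hypothetical names and assumptions, not part of any particular database.

```python
class Replica:
    """A slave that may lag behind the master (hypothetical model)."""

    def __init__(self, version, value):
        self.version = version  # assumed: a monotonically increasing update number
        self.value = value


class MonotonicClient:
    """Rejects any read that is older than the newest version already seen."""

    def __init__(self):
        self.last_seen = 0

    def read(self, replica):
        if replica.version < self.last_seen:
            return None  # stale replica: the caller should retry elsewhere
        self.last_seen = replica.version
        return replica.value


if __name__ == "__main__":
    client = MonotonicClient()
    # The problematic read sequence from the slide: 1,2,2,4,2,5
    replicas = [Replica(v, v) for v in [1, 2, 2, 4, 2, 5]]
    print([client.read(r) for r in replicas])
```

With the sequence from the previous slide, the stale read of 2 after 4 is rejected instead of being shown to the user. The client still needs some replica that is at least as fresh as what it last saw, which is why failover remains a problem.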
multi master
Dynamo model

R - number of servers to read from
W - number of servers that must acknowledge a write
N - Replication Factor

R + W > N has nice properties
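
The "nice property" is that every read set shares at least one server with every write set, so a read always touches a server that saw the latest write. A small sketch, checking the inequality against brute-force enumeration; the function names are made up for illustration.

```python
from itertools import combinations


def quorums_overlap(r, w, n):
    """The rule from the slide: read and write sets must intersect."""
    return r + w > n


def overlap_by_enumeration(r, w, n):
    """Check every possible read set against every possible write set."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


if __name__ == "__main__":
    for r, w, n in [(1, 1, 5), (2, 1, 2), (1, 2, 2), (2, 2, 3)]:
        print(r, w, n, quorums_overlap(r, w, n), overlap_by_enumeration(r, w, n))
```

The enumeration is only practical for small N, but it shows that the inequality and the overlap property say the same thing.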
multi master
Example 1             Example 2
R + W <= N            R + W > N
R=1                   R=2    or    R=1
W=1                   W=1          W=2
N=5                   N=2          N=2
Possibly Stale Data   ‘Consistent’ Data
Higher Availability
R + W > N
      If R + W > N you can’t both
     have fast local reads and writes
network partitions
trivial network
    partition
network write
        possibilities
• deny all writes
 • read fully consistent data
• allow writes on one side
 • allow reads on other side (stale)
• allow writes on both sides
 • give up consistency
multiple writer strategies
 •   Last one wins

     •   vector clocks

 •   Insert

     •   insert often means:

         •   if (!exist(x)) set(x)

         •   exist is hard to implement in eventually
             consistent systems
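
"Last one wins" needs a notion of "last". Vector clocks can say whether one write really came after another, or whether the two were concurrent and need some other resolution. A minimal sketch, assuming a clock is a plain dict from node id to counter; `compare` is a hypothetical helper, not a library API.

```python
def compare(a, b):
    """Compare two vector clocks given as dicts {node_id: counter}.

    Returns "before", "after", "equal" or "concurrent".
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither write saw the other: a real conflict


if __name__ == "__main__":
    print(compare({"n1": 1}, {"n1": 2}))                    # ordered writes
    print(compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 2}))  # conflicting writes
```

Only the "before" and "after" cases can be resolved safely by keeping the later write. The "concurrent" case is where a merge strategy is needed.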
delete
op1: set joe, age 40
op2: delete joe
op3: set joe, age 41

- consider switching 2 and 3
- tombstone: remember the delete, then apply
  last op wins
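
A sketch of the tombstone idea: a delete is stored as a marker with its own timestamp, so operations arriving out of order still converge. The `LwwStore` class and the integer timestamps are assumptions for illustration; a real system also has to break ties and eventually garbage-collect tombstones.

```python
TOMBSTONE = object()  # marker meaning "this key was deleted"


class LwwStore:
    """Last-op-wins store that remembers deletes as tombstones."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value or TOMBSTONE)

    def _apply(self, ts, key, value):
        current = self.data.get(key)
        # Keep the operation with the highest timestamp (ties ignored here).
        if current is None or ts > current[0]:
            self.data[key] = (ts, value)

    def set(self, ts, key, value):
        self._apply(ts, key, value)

    def delete(self, ts, key):
        self._apply(ts, key, TOMBSTONE)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None or entry[1] is TOMBSTONE:
            return None
        return entry[1]


if __name__ == "__main__":
    store = LwwStore()
    store.set(1, "joe", 40)     # op1
    store.set(3, "joe", 41)     # op3 arrives before op2
    store.delete(2, "joe")      # op2 arrives late and is ignored
    print(store.get("joe"))
```

Without the tombstone, a late-arriving op1 after op2 would resurrect joe, because the store would have no record that a newer delete had happened.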
multiple writer strategies
•   programmatic merge

    •   store ops instead of state

    •   replay operations

        •   did I get the last one ?


•   Commutative operations

    •   conflict free

    •   anything that’s foldable
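
"Foldable" can be read as: the merge function can be folded over replica states in any order and still give the same answer. A sketch using a grow-only counter, where each node only increments its own slot and merge takes the element-wise maximum. The dict layout and the names `merge` and `value` are illustrative assumptions.

```python
from functools import reduce


def merge(a, b):
    """Element-wise max of two per-node counters.

    Commutative, associative and idempotent, so merge order does not matter.
    """
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def value(counter):
    """Total count across all nodes."""
    return sum(counter.values())


if __name__ == "__main__":
    replicas = [{"a": 2}, {"b": 3}, {"a": 1, "b": 3, "c": 1}]
    forward = reduce(merge, replicas, {})
    backward = reduce(merge, reversed(replicas), {})
    print(forward == backward, value(forward))
```

Because the merge is also idempotent, receiving the same replica state twice does no harm, which sidesteps the "did I get the last one?" question for this kind of data.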
CP

• Sometimes we need global state
• Unique constraints
• User registration
• ACL changes
Finally

uptime(CP + average developer)
>=
uptime(AP + average developer)

Where uptime means the system is up and non-buggy

Thoughts on consistency models
