Thoughts on Transaction and Consistency Models

4,976 views

Published on

Thoughts on Transaction and Consistency Models

Published in: Technology
0 Comments
8 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
4,976
On SlideShare
0
From Embeds
0
Number of Embeds
1,763
Actions
Shares
0
Downloads
142
Comments
0
Likes
8
Embeds 0
No embeds

No notes for slide

Thoughts on Transaction and Consistency Models

  1. 1. Roger  Bodamerroger@10gen.com @rogerb
  2. 2. Thoughts on Transaction and Consistency Models
  3. 3. RDBMS(Oracle,  MySQL)
  4. 4. RDBMS (Oracle,  MySQL)New Gen. OLAP(vertica,  aster,   greenplum)
  5. 5. RDBMS (Oracle,  MySQL)New Gen. Non-relational OLAP Operational(vertica,  aster,   Stores greenplum) (“NoSQL”)
  6. 6. The database world is changingDocument Datastores, Key Value, Graph databases
  7. 7. The database world is changing Transactional model
  8. 8. The database world is changing Full Acid
  9. 9. The database world is changing
  10. 10. • memcachedscalability  &  performance • key/value • RDBMS depth  of  functionality
  11. 11. CAPIt is impossible in the asynchronous network model toimplement a read/write data object that guarantees thefollowing properties:• Availability• Atomic consistency in all fair executions (including thosein which messages are lost).
  12. 12. CAP It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: • Availability • Atomic consistency in all fair executions (including those in which messages are lost).Or: If the network is broken, your database won’t workBut we get to define “won’t work”
  13. 13. Consistency Models - CAP• Choices are Available-P (AP) or Consistent-P (CP)• Write Availability, not Read Availability, is the Main Question• It’s not all about CAP Scale, Reduced latency: •Multi data center •Speed •Even load distribution
  14. 14. Examples of Eventually Consistent Systems• Eventual Consistency: •“The storage system guarantees that if no new updates are made to the object, eventually all accesses will return the last updated value”•Examples: • DNS • Async replication (RDBMS, MongoDB) • Memcached (TTL cache)
  15. 15. Eventual Consistency
  16. 16. Eventual ConsistencyRead(x)  :  1,  2,  2,  4,  4,  4,  4  …
  17. 17. Could we get this?Read(x)  :  1,  2,  1,  4,  2,  4,  4,  4  …
  18. 18. Monotonic read consistencyPrevent  seeing  writes  out  of  order • Appserver and slave on same box • Appserver only reads from Slave • Eventually consistent • Guaranteed to see reads in order
  19. 19. Monotonic read consistencyPrevent  seeing  writes  out  of  order • Appserver and slave on same box • Appserver only reads from Slave • Eventually consistent • Guaranteed to see reads in order • Failover ?
  20. 20. RYOW Consistency • Read Your Own Primary WritesRead  /    Has  Written Replication • Eventually consistent for readers Secondary • Immediately consistent for writers Read
  21. 21. Amazon Dynamo•R: # of servers to read from•W: # servers to get response from•N: Replication factor • R+W>N has nice properties
  22. 22. ExampleExample  1R  +  W  <=  NR  =  1W  =  1N  =  5Possibly  Stale  dataHigher  availability
  23. 23. ExampleExample  1 Example  2R  +  W  <=  N R  +  W  >  NR  =  1 R  =  2                                          R  =  1W  =  1 W  =  1                                        W  =  2N  =  5 N  =  2                                        N  =  2Possibly  Stale  data Consistent  dataHigher  availability
  24. 24. R+W>N If R+W > N, we can’t have both fast local reads and writes at the same time if all the data centers are equal peers
  25. 25. Consistency levels• Strong: W + R > N• Weak / Eventual : W + R <= N• Optimized Read: R=1, W=N• Optimized Write: W=1, R=N
  26. 26. Multi Datacenter Strategies• DR • Failover• Single Region, Multi Datacenter • Useful for Strong or Eventual Consistency• Local Reads, Remote Writes • All writes to master on WAN, EC on Slaves• Intelligent Homing • Master copy close to user location
  27. 27. Network Partitions
  28. 28. Network Write Possibilities•Deny all writes • Still Read fully consistent data • Give up write Availability
  29. 29. Network Write Possibilities•Deny all writes • Still Read fully consistent data • Give up write Availability•Allow writes on one side • Failover • Possible to allow reads on other side
  30. 30. Network Write Possibilities•Deny all writes • Still Read fully consistent data • Give up write Availability•Allow writes on one side • Failover • Possible to allow reads on other side•Allow writes on both sides • Available • Give up consistency
  31. 31. Multiple Writer Strategies• Last one wins • Use Vector clocks to decide latest• Insert Inserts often really means if ( !exists(x)) then set(x) exists is hard to implement in eventually consistent systems
  32. 32. Multiple Writer Strategies• Delete op1: set( { _id : joe, age : 40 } } op2: delete( { _id : joe } ) op3: set( { _id : joe, age : 33 } )• Consider switching 2 and 3• Tombstone: Remember delete, and apply last-operation-wins• Update update users set age=40 where _id=’joe’However: op1: update users set age=40 where _id=joe op2: update users set state=ca where _id=joe
  33. 33. Multiple Writer Strategies• Programatic Merge • Store operations, instead of state • Replay operations • Did you get the last operation ? • Not immediate•Communtative operations • Conflict free? • Fold-able • Example: add, increment, decrement
  34. 34. Trivial Network Partitions
  35. 35. MongoDB Options
  36. 36. MongoDB Transaction Support• Single Master / Sharded• MongoDB Supports Atomic Operations on Single Documents • But not Rollback• $ operators • $set, $unset, $inc, $push, $pushall, $pull, $pullall• Update if current • find followed by update• Find and modify
  37. 37. MongoDB PrimaryRead  /  Write
  38. 38. MongoDB PrimaryRead  /  Write Replication Secondary Replication Secondary  for  Backup
  39. 39. MongoDB PrimaryRead  /  Write Replication Secondary Read Replication Secondary  for  Backup
  40. 40. MongoDB Replicaset PrimaryRead  /  Write Replication Secondary Read Replication Secondary  for  Backup
  41. 41. Write Scalability: Shardingread key  range   key  range   key  range   0  ..  30 31  ..  60 61  ..  100 ReplicaSet  1 ReplicaSet  2 ReplicaSet  3 Primary Primary Primary Secondary Secondary Secondary Secondary Secondary Secondary write
  42. 42. Sometimes we need global state / more consistency•Unique key constraints •User registration•ACL changes
  43. 43. Could it be the case that…uptime( CP + average developer ) >=uptime( AP + average developer )where uptime:= system is up and non-buggy?
  44. 44. Thank You :-) @rogerb
  45. 45. download at mongodb.org conferences,  appearances,  and  meetups http://www.10gen.com/events Facebook                    |                  Twitter                  |                  LinkedInhttp://bit.ly/mongoU   @mongodb http://linkd.in/joinmongo

×