A Critique of Snapshot Isolation
Daniel Gómez Ferro, Maysam Yabandeh
Yahoo! Research
Barcelona, Spain

A Critique of Snapshot Isolation: EuroSys 2012


  1. A Critique of Snapshot Isolation. Daniel Gómez Ferro, Maysam Yabandeh. Yahoo! Research, Barcelona, Spain
  2. A Critique!? (12/04/2012)
     Transactions: atomic set of operations on the database
     [Figure: a transaction running from Ts to Tc against the database]
  3. A Critique!?
     Isolation level: the behavior under concurrency
     [Figure: a transaction running from Ts to Tc against the database]
  4. A Critique!?
     Ideally serializability: a serial order for txns
     But …
  5. A Critique!?
     Ideally serializability: a serial order for txns
     • Not everybody is perfect → non-serializable
     • How non-serializable?
     • ANSI SQL Standard Anomalies
       – P1: Dirty Read
       – P2: Fuzzy Read
       – P3: Phantom
  6. A Critique!?
     Snapshot Isolation (SI)
     Write-write conflict detection
     Anomalies
     A Critique of ANSI SQL Isolation Levels. Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O'Neil, and Patrick O'Neil, 1995 (cited by 492)
     • Defines new anomalies
       – P0 (Dirty Write)
       – P4 (Lost Update)
     Image credits: http://www.freepik.com/, http://living-by-chance.blogspot.com/
  7. A Critique!?
     • A Critique of ANSI SQL Isolation Levels
     • Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O'Neil, and Patrick O'Neil
     • P0 (Dirty Write): Transaction T1 modifies a data item. Another transaction T2 then further modifies that data item before T1 performs a COMMIT or ROLLBACK. If T1 or T2 then performs a ROLLBACK, it is unclear what the correct data value should be. The broad interpretation of this is:
     • P0: w1[x]...w2[x]...((c1 or a1) and (c2 or a2) in any order)
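The broad P0 pattern can be checked mechanically on a history written in the slide's notation. The following is a minimal sketch, not taken from the talk: the function name `has_dirty_write` and the space-separated string encoding of histories are assumptions made for illustration.

```python
import re

def has_dirty_write(history: str) -> bool:
    """Return True if the history matches the broad P0 pattern:
    w1[x] ... w2[x] occurring before T1 commits or aborts."""
    # Each operation is (kind, txn, item); item is '' for commit/abort.
    ops = re.findall(r"([rwca])(\d+)(?:\[(\w+)\])?", history)
    active_writers = {}  # item -> txns that wrote it and have not finished yet
    for kind, txn, item in ops:
        if kind in ("c", "a"):
            # The transaction finished; its writes can no longer be "dirty".
            for writers in active_writers.values():
                writers.discard(txn)
        elif kind == "w":
            writers = active_writers.setdefault(item, set())
            if writers - {txn}:
                return True  # another unfinished txn already wrote this item
            writers.add(txn)
    return False

print(has_dirty_write("w1[x] w2[x] c1 c2"))
print(has_dirty_write("w1[x] c1 w2[x] c2"))
```

The first history interleaves two uncommitted writes to `x` and is flagged; the second is serial and is not.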
  8. A Critique!?
     • A Critique of ANSI SQL Isolation Levels
     • Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O'Neil, and Patrick O'Neil
     • Remark 3: ANSI SQL isolation should be modified to require P0 [write-write checking] for all isolation levels.
  9. A Critique!?
     • A Critique of ANSI SQL Isolation Levels
     • Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O'Neil, and Patrick O'Neil
     • Remark 3: ANSI SQL isolation should be modified to require P0 [write-write checking] for all isolation levels.
     • Write-write conflict prevention (WW)
     • Explore alternatives …
  10. Agenda
      1. Snapshot Isolation
         – Write-Write Conflict Prevention (WW)
         – Read-Write Conflict Prevention (RW)
      2. WW vs. RW in Omid
      3. Evaluation: WW vs. RW
  11. Snapshot Isolation (SI)
      Used in: Oracle, PostgreSQL, Percolator
      1) Read only from values committed before Ts
      [Figure: timeline from Ts to Tc with versions Xold, Xnew, Xconcurrent]
      The key to high concurrency: read-only transactions never abort and never block the others
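The read rule of SI (a transaction sees the latest value committed before its start timestamp Ts) can be sketched with a multi-version row. This is an illustrative sketch only; the class `VersionedRow` and its methods are hypothetical names, not part of Omid or HBase.

```python
import bisect

class VersionedRow:
    """One row that keeps every committed version, ordered by commit timestamp."""

    def __init__(self):
        self.commit_ts = []  # sorted commit timestamps
        self.values = []     # values[i] was committed at commit_ts[i]

    def install(self, tc, value):
        """Record a value committed at timestamp tc."""
        i = bisect.bisect_left(self.commit_ts, tc)
        self.commit_ts.insert(i, tc)
        self.values.insert(i, value)

    def snapshot_read(self, ts):
        """Return the latest value committed strictly before ts, or None."""
        i = bisect.bisect_left(self.commit_ts, ts)  # number of versions with commit_ts < ts
        return self.values[i - 1] if i > 0 else None

row = VersionedRow()
row.install(5, "Xold")          # committed before the reader started
row.install(12, "Xconcurrent")  # committed while the reader was running
print(row.snapshot_read(10))    # a reader with Ts = 10 ignores the concurrent commit
```

Because a reader never waits for or invalidates a writer in this scheme, read-only transactions have no reason to abort or block.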
  12. Snapshot Isolation (SI)
      2) Prevent write-write conflicts (WW)
      E.g., txn_new and txn_concurrent modify row r
      [Figure: timeline from Ts to Tc with versions Xold, Xnew, Xconcurrent]
  13. Why WW?
      • WW lock-based implementation is
        – Straightforward: lock on writes
        – Low overhead: no lock for read rows
      • Implementation with optimistic concurrency control [H. Kung and J. Robinson, 1981], [S. Elnikety, et al., 2005], [F. Junqueira, et al., 2011]
        – ?
  14. Is WW necessary?
      • w2[x] w1[x] c2 c1 → w2[x] c2 w1[x] c1
      • Blind write: No
        – w1[y]
      • Update write: Yes
        – r1[x] w1[x]
      • Others: ?
      WW is necessary when it implies RW
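One way to read the blind-write case: if neither transaction reads `x`, the concurrent history leaves the database in the same state as the serial one, so aborting on the write-write conflict gains nothing. The sketch below illustrates that reading under the assumption that writes are buffered and installed at commit time; the function `final_state` and the `values` mapping are hypothetical.

```python
import re

def final_state(history, values):
    """Replay a history of writes and commits, installing each transaction's
    buffered writes when it commits. `values` maps (txn, item) to the value written."""
    buffered = {}  # txn -> {item: value} not yet committed
    db = {}
    for kind, txn, item in re.findall(r"([wc])(\d+)(?:\[(\w+)\])?", history):
        if kind == "w":
            buffered.setdefault(txn, {})[item] = values[(txn, item)]
        else:  # commit: make the buffered writes visible
            db.update(buffered.pop(txn, {}))
    return db

values = {("1", "x"): "from T1", ("2", "x"): "from T2"}
concurrent = final_state("w2[x] w1[x] c2 c1", values)
serial = final_state("w2[x] c2 w1[x] c1", values)
print(concurrent == serial)
```

With an update write (`r1[x] w1[x]`) the argument no longer holds, because T1's new value may depend on a value of `x` that T2 has since overwritten.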
  15. Write-Write Conflict
      Write-write conflict prevention:
      • sufficient? NO: no serializability
      • necessary? NO: different write snapshots
      • helpful? YES: if it implies a read-write conflict
      • efficient? YES: no lock for reads, no abort/blocking for read-only
      Read-write conflict prevention: why not RW?
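  16. Read-Write Conflict
      • sufficient?
        – Write-write conflict prevention: NO, no serializability
        – Read-write conflict prevention: YES, serializability
      • necessary?
        – Write-write conflict prevention: NO, different write snapshots
        – Read-write conflict prevention: NO, unnecessary aborts
      • helpful?
        – Write-write conflict prevention: YES, if it implies a read-write conflict
        – Read-write conflict prevention: YES, serializability
      • efficient?
        – Write-write conflict prevention: YES, no lock for reads, no abort/blocking for read-only
        – Read-write conflict prevention: ?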
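The "sufficient?" row can be illustrated with the classic write-skew scenario: two concurrent transactions read the same rows and each writes a different one. The sketch below is an assumption-laden model, not the authors' code: `decide` is a hypothetical helper that processes commit requests in order and tracks the last commit timestamp per row.

```python
def decide(policy, txns):
    """txns: list of (start_ts, read_set, write_set) in commit-request order.
    policy is "WW" (check the write set) or "RW" (check the read set).
    Returns one bool per transaction: True if it commits, False if it aborts."""
    last_commit = {}                         # row -> commit timestamp of its last writer
    clock = max(ts for ts, _, _ in txns)     # commit timestamps follow all start timestamps
    results = []
    for ts, reads, writes in txns:
        checked = writes if policy == "WW" else reads
        # Read-only transactions (empty write set) are never aborted.
        if writes and any(last_commit.get(row, 0) > ts for row in checked):
            results.append(False)
            continue
        clock += 1
        for row in writes:
            last_commit[row] = clock
        results.append(True)
    return results

# Write skew: both read x and y, T1 writes x, T2 writes y.
write_skew = [(1, {"x", "y"}, {"x"}), (2, {"x", "y"}, {"y"})]
print(decide("WW", write_skew))  # both commit: the non-serializable outcome is admitted
print(decide("RW", write_skew))  # the second transaction aborts
```

The same helper also shows the "necessary?" row: for two concurrent blind writes to one row, the WW policy aborts the second transaction while the RW policy lets both commit.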
  17. Why not RW?
      • RW lock-based implementation is
        – Straightforward: lock on reads (not for read-only)
        – High overhead: lock op for each read and write
      • Implementation with optimistic concurrency control [H. Kung and J. Robinson, 1981], [S. Elnikety, et al., 2005], [F. Junqueira, et al., 2011]
        – ?
      How about Omid?
  18. OMID
      • Open source: https://github.com/yahoo/omid/
      • Use a centralized server to commit
        – Called Status Oracle (SO)
        – Allows a lock-free implementation
      • To commit:
        – SO receives set of write row ids
        – SO prevents write-write conflicts
        – SO updates its internal list with the write set
      [Figure: HBase clients, HBase region servers, and the Status Oracle (SO)]
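The three commit steps listed on this slide can be sketched as a small in-memory class. This is a simplified illustration under stated assumptions, not Omid's actual implementation: the class name `StatusOracleWW`, the single logical clock, and the `last_commit` dictionary are all hypothetical.

```python
class StatusOracleWW:
    """Toy status oracle that prevents write-write conflicts."""

    def __init__(self):
        self.clock = 0
        self.last_commit = {}  # row id -> commit timestamp of the last writer

    def begin(self):
        """Hand out a start timestamp Ts."""
        self.clock += 1
        return self.clock

    def commit(self, ts, write_set):
        """Return a commit timestamp Tc, or None if the transaction must abort."""
        # Abort if any written row was committed by a concurrent transaction (after ts).
        for row in write_set:
            if self.last_commit.get(row, 0) > ts:
                return None
        self.clock += 1
        tc = self.clock
        # Update the internal list with the write set.
        for row in write_set:
            self.last_commit[row] = tc
        return tc

so = StatusOracleWW()
t_new, t_concurrent = so.begin(), so.begin()
print(so.commit(t_concurrent, {"r"}))  # first committer wins
print(so.commit(t_new, {"r"}))         # None: concurrent write to row r
```

No locks are taken on rows; the decision is made in one step at commit time, which is what allows progress even if a client fails mid-transaction.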
  19. OMID
      • Open source: https://github.com/yahoo/omid/
      • Use a centralized server to commit
        – Called Status Oracle (SO)
        – Allows a lock-free implementation
      • To commit:
        – SO receives set of read & write row ids
        – SO prevents read-write conflicts
        – SO updates its internal list with the write set
      [Figure: HBase clients, HBase region servers, and the Status Oracle (SO)]
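For the read-write variant, the commit request carries the read set as well, the check runs over the rows read, and only the write set is recorded. The sketch below is again a hypothetical simplification (class name `StatusOracleRW`, one logical clock); returning the start timestamp for read-only transactions is an assumption made to keep the example short.

```python
class StatusOracleRW:
    """Toy status oracle that prevents read-write conflicts."""

    def __init__(self):
        self.clock = 0
        self.last_commit = {}  # row id -> commit timestamp of the last writer

    def begin(self):
        """Hand out a start timestamp Ts."""
        self.clock += 1
        return self.clock

    def commit(self, ts, read_set, write_set):
        """Return a commit timestamp, or None if the transaction must abort."""
        if not write_set:
            return ts  # read-only transactions never abort (assumed: no new timestamp needed)
        # Abort if any row that was read has been overwritten by a concurrent commit.
        for row in read_set:
            if self.last_commit.get(row, 0) > ts:
                return None
        self.clock += 1
        tc = self.clock
        # Only the write set updates the oracle's state.
        for row in write_set:
            self.last_commit[row] = tc
        return tc

so = StatusOracleRW()
t1, t2 = so.begin(), so.begin()
print(so.commit(t1, {"x", "y"}, {"x"}))
print(so.commit(t2, {"x", "y"}, {"y"}))  # None: t2 read x, which t1 overwrote
```

Compared with the write-write check, the loop runs over reads rather than writes, so the per-commit work grows with the number of rows read.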
  20. SO Overhead
      • SO maintains the transactional data in memory
      • WW prevents write-write conflicts + update for writes
        – Mem op for each write
      • RW prevents read-write conflicts + update for writes
        – Mem op for each read and write
      • Reads + writes > writes
      • ⇒ cost(RW) > cost(WW)
  21. SO Overhead
      • How much is the cost of processing a commit (CMT)?
        1. RPC costs (per txn)
        2. Mem ops (per data)
      [Figure: RPC + mem ops cost breakdown for large txns and small txns]
  22. SO Overhead
      • How much is the cost of processing a commit (CMT)?
        1. RPC costs (per txn)
        2. Mem ops (per data)
      [Figure: RPC + mem ops cost breakdown for large txns and small txns]
      • Small txns are typical of OLTP workloads
      • If SO is not saturated, overhead does not change much
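The two cost components can be put into a back-of-the-envelope model: a fixed RPC cost per transaction plus one memory operation per row the oracle touches. The constants below are made-up placeholders chosen only to show the shape of the argument; they are not measurements from the talk.

```python
def commit_cost(reads, writes, policy, rpc_cost=100.0, mem_op_cost=1.0):
    """Cost of one commit at the status oracle, in arbitrary units.
    WW touches each written row; RW touches each read and each written row."""
    rows_touched = writes if policy == "WW" else reads + writes
    return rpc_cost + mem_op_cost * rows_touched

def rw_overhead(reads, writes, **costs):
    """Ratio cost(RW) / cost(WW) for a transaction of the given size."""
    return commit_cost(reads, writes, "RW", **costs) / commit_cost(reads, writes, "WW", **costs)

for reads, writes in [(8, 8), (64, 64)]:
    print(reads, writes, round(rw_overhead(reads, writes), 3))
```

When the per-transaction RPC cost dominates, as assumed here for small transactions, the ratio stays close to 1; it grows as transactions get larger and memory operations take a bigger share.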
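  23. Read-Write Conflict
      • sufficient?
        – Write-write conflict prevention: NO, no serializability
        – Read-write conflict prevention: YES, serializability
      • necessary?
        – Write-write conflict prevention: NO, different write snapshots
        – Read-write conflict prevention: NO, unnecessary aborts
      • helpful?
        – Write-write conflict prevention: YES, if it implies a read-write conflict
        – Read-write conflict prevention: YES, serializability
      • efficient?
        – Write-write conflict prevention: YES, no lock for reads, no abort/blocking for read-only
        – Read-write conflict prevention: YES, in a lock-free system, no abort/blocking for read-only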
  24. Agenda
      1. Snapshot Isolation
         – Write-Write Conflict Prevention (WW)
         – Read-Write Conflict Prevention (RW)
      2. WW vs. RW in Omid
      3. Evaluation on Omid
  25. Why OMID?
      1. Lock-free: progress even with failed clients
      2. Negligible overhead
      3. No change in the data store
      4. A centralized scheme: status oracle (SO)
         – Adding one dual-core machine
         – Scalable to
           • 60,000 TPS (10 writes)
           • 120+ KTPS (small transactions)
           • 1000 clients
  26. Evaluation
      • RW provides serializability
      • How much are we paying for it? RW vs. WW
        1. Overhead on the status oracle?
        2. Concurrency level?
  27. SO Overhead: 8 read / 8 write [figure]
  28. SO Overhead: 64 read / 64 write [figure]
  29. Concurrency?
      • Depends on the workload
        – We do not expect much difference
        – Neither aborts read-only transactions
          • 80-90% of transactional traffic
      • Real workload: needs a real-world application on top of HBase that uses the transactional system
      • We evaluate on a synthetic workload
  30. Synthetic Workload
      • YCSB
      • 0-20 rows per transaction
      • Mixed workload
        – 50% read-only
        – 50% complex
          • Half get, half put
      • 20 dual-core servers for HBase+HDFS
      • DB large enough not to fit in memory
      • Zipfian distribution: hot regions
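A workload with the shape described on this slide can be approximated in a few lines. The sketch below is a stand-in written for illustration and does not reproduce YCSB's own generator: `make_workload`, the table size, the skew parameter, and the use of `random.choices` with weights proportional to 1/rank are all assumptions.

```python
import itertools
import random

def make_workload(n_rows=1000, skew=1.0, seed=0):
    """Return a function that generates (read_set, write_set) pairs."""
    rng = random.Random(seed)
    # Zipf-like popularity: row of rank k is chosen with weight 1 / k**skew.
    cum_weights = list(itertools.accumulate(
        1.0 / (rank ** skew) for rank in range(1, n_rows + 1)))

    def next_txn():
        size = rng.randint(0, 20)  # 0-20 rows per transaction
        rows = rng.choices(range(n_rows), cum_weights=cum_weights, k=size)
        if rng.random() < 0.5:     # 50% read-only
            return set(rows), set()
        reads, writes = set(), set()
        for row in rows:           # complex: each operation is a get or a put
            (reads if rng.random() < 0.5 else writes).add(row)
        return reads, writes

    return next_txn

next_txn = make_workload()
reads, writes = next_txn()
print(len(reads), len(writes))
```

Feeding such transactions to a WW checker and an RW checker side by side gives a rough feel for how abort rates compare on skewed keys, though the absolute numbers depend entirely on the assumed parameters.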
  31. Zipfian Distribution [figure]
  32. Summary
      • RW overhead: in Omid, comparable to WW
      • RW abort rate:
        – No evidence as to which is better
        – In synthetic workload: comparable to WW
      WW + X → Serializability
      WW + RW → Serializability
      1. [H. Kung and J. Robinson, 1981], [A. Adya, et al., 1999]
      2. [D. Dice, et al., 2006], [M. Cahill, et al., 2009], [M. Bornea, et al., 2011], …
  33. Questions?
      https://github.com/yahoo/omid/ (readWrite branch)
      https://github.com/yahoo/omid/wiki/readWrite
      https://github.com/maysamyabandeh/omid
      omid-project@googlegroups.com
