Transactions and Concurrency Control

Research Scientist at Data61, CSIRO
Aug. 18, 2018

  1. Transactions & Concurrency Control
     CS4262 Distributed Systems
     Dilum Bandara, Dilum.Bandara@uom.lk
     Some slides adapted from U Uthaiyashankar (WSO2) and Rajkumar Buyya’s
  2. Outline
     - Transactions
     - Properties
     - Classification
     - Transaction implementation
     - Concurrency control
  3. Transactions
     - Transactions protect a shared resource (e.g., shared data) against simultaneous access by several concurrent processes
     - Transactions can do much more…
       - They allow a process to access & modify multiple data items as a single atomic operation
       - All-or-nothing property
  4. Transactions in DBMS
     - A unit of work performed within a DBMS against a database
     - Treated in a coherent & reliable way, independent of other transactions
     - 2 main purposes
       - To provide reliable units of work that allow correct recovery from failures & keep a database consistent even in cases of (complete or partial) system failure
       - To provide isolation between programs accessing a database concurrently
  5. Goal of Transactions
     - To ensure objects managed by a server remain in a consistent state when accessed by multiple transactions & in the presence of server failures
     - Transactions deal with process crash failures & communication omission failures
     - Transactions don’t deal with Byzantine failures
  6. Transaction Model
     - Recoverable objects
       - Objects that can be recovered after their server crashes
     - Failure atomicity
       - Effects of a transaction are atomic even when the server crashes
     - Transactions require coordination among
       - A client program
       - Some recoverable objects
       - A coordinator
  7. Transaction Model (Cont.)
     - Each transaction is created & managed by a coordinator, which implements the coordinator interface
     - Coordinator gives each transaction an identifier (TID)
     - Clients invoke transaction coordinator interface methods to perform a transaction
     - A transaction may be aborted by the client or the server
  8. Transaction Coordinator Interface
     - openTransaction() → trans
       - Starts a new transaction & delivers a unique TID, trans
       - trans will be used in the other operations of the transaction
     - closeTransaction(trans) → (commit, abort)
       - Ends a transaction
       - commit: indicates that the transaction has committed
       - abort: indicates that it has aborted
     - abortTransaction(trans)
       - Aborts the transaction
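The coordinator interface on slide 8 can be sketched in a few lines. This is only an illustrative, in-memory toy: the `Coordinator` class, the `Outcome` enum, and the snake_case method names are assumptions made for this example and are not taken from the slides or from any real library.

```python
import itertools
from enum import Enum


class Outcome(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Coordinator:
    """Toy in-memory coordinator; method names mirror the slide's interface."""

    def __init__(self):
        self._next_tid = itertools.count(1)
        self._active = set()  # TIDs of transactions still in progress

    def open_transaction(self):
        # Start a new transaction and hand back a unique TID.
        tid = next(self._next_tid)
        self._active.add(tid)
        return tid

    def close_transaction(self, tid):
        # Commit if the transaction is still active; otherwise report ABORT
        # (for example, because it was aborted earlier).
        if tid in self._active:
            self._active.remove(tid)
            return Outcome.COMMIT
        return Outcome.ABORT

    def abort_transaction(self, tid):
        # Abort on request from the client or the server.
        self._active.discard(tid)


coordinator = Coordinator()
t1 = coordinator.open_transaction()
t2 = coordinator.open_transaction()
coordinator.abort_transaction(t2)
result_t1 = coordinator.close_transaction(t1)  # still active, so it commits
result_t2 = coordinator.close_transaction(t2)  # already aborted
```

A real coordinator would also have to make the commit decision durable and involve the recoverable objects; the sketch only shows the shape of the three calls.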
  9. Properties of a Transaction
     - Proposed by Härder & Reuter in 1983
     - ACID properties
       - Atomic
       - Consistent
       - Isolated
       - Durable
  10. Atomic Property
      - Process all of a transaction or none of it
      - To the outside world, a transaction happens indivisibly
      - While a transaction happens, other processes can’t see any intermediate states
      - Example
        - A file is 100 bytes long when a transaction starts to append to it
        - If other processes read the file during the transaction, they should see & access only the 100 bytes
        - If the transaction commits successfully, the file grows instantaneously to its new size
  11. Consistent Property
      - Data on all systems reflects the same state
      - A transaction doesn’t violate system invariants
      - Example
        - In a banking system, a key invariant is the law of conservation of money
        - After any internal transfer, the amount of money in the bank must be the same as it was before the transfer
        - During internal transfer operations, this invariant may be violated
        - But this violation isn’t visible outside the transaction
  12. Isolated Property
      - Concurrent transactions don’t interact or interfere with one another
      - i.e., transactions are serializable
      - Example
        - If 2+ transactions are running at the same time, to each of them & to other processes, the final result appears as though all transactions ran sequentially (in some system-dependent order)
  13. Durable Property
      - Effects of a completed transaction are persistent
      - Once a transaction commits, its changes are permanent
      - No failure after the commit can undo the results or cause them to be lost
  14. Ensuring Transaction Properties
      - To support “failure atomicity” & “durability”, objects must be recoverable
      - To support “isolation” & “consistency”, the server needs to synchronize operations
        - Perform transactions serially (sequentially)
          - Generally, an unacceptable solution
        - Allow transactions to execute concurrently, if they would have the same effect as a serial execution
          - Known as serializable or serially equivalent transactions
  15. Classification of Transactions
      - Flat transactions
      - Nested transactions
      - Distributed transactions
  16. Flat Transactions
      - A series of operations
      - Don’t allow partial results to be committed or aborted
      - e.g., booking flights on multiple airlines from start to destination
  17. Nested Transactions
      - Constructed from a number of sub-transactions
      - Top-level transaction may fork off children that run in parallel to one another
      - In case of failure, results of sub-transactions that already committed must be undone
      - e.g., booking flights on multiple airlines from start to destination
  18. Distributed Transactions
      - Logically, a flat, indivisible transaction that operates on distributed data
      - In contrast, a nested transaction is logically decomposed into a hierarchy of sub-transactions
      Source: Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007
  19. Achieving Transaction Properties in Distributed Transactions
      - To achieve atomicity & durability
        - Distributed commit protocols
      - To achieve consistency & isolation
        - Concurrency control algorithms (distributed locking protocols)
      - We need separate distributed algorithms to handle “locking of data” & “committing entire transaction”
  20. Transaction Implementation
      - Consider transactions on a file system
      - If each process executing a transaction just updates the file it uses in place, then transactions will not be atomic
      - 2 common implementation techniques
        - Private workspace
        - Write-ahead log
  21. Private Workspace
      - Provide private (local) copies
      - All updates are made on private copies
      - Update the original (global) copy while committing
      - Issues
        - Cost of copying everything
      - Possible solution
        - Keep a pointer to the relevant file block
        - When writing, copy only the file block instead of the entire file
        - Update the original when closing the file
      - Example: Andrew File System
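The block-level private workspace on slide 21 might look roughly like the following sketch. It assumes a "file" is just a Python list of blocks; the `PrivateWorkspace` class and its method names are hypothetical and exist only for this illustration.

```python
class PrivateWorkspace:
    """Block-level private copies over a shared 'file' (a list of blocks)."""

    def __init__(self, shared_blocks):
        self._shared = shared_blocks
        self._private = {}  # block index -> privately modified copy

    def read(self, index):
        # Unmodified blocks are read through to the shared file: no copy made.
        return self._private.get(index, self._shared[index])

    def write(self, index, data):
        # Only the block being written gets a private copy.
        self._private[index] = data

    def commit(self):
        # Install the private blocks into the shared (global) file.
        for index, data in self._private.items():
            self._shared[index] = data
        self._private.clear()

    def abort(self):
        # Discard private copies; the shared file was never touched.
        self._private.clear()


shared_file = ["b0", "b1", "b2"]

ws = PrivateWorkspace(shared_file)
ws.write(1, "b1-new")
seen_inside = ws.read(1)       # the transaction sees its own update
seen_outside = shared_file[1]  # other processes still see the old block
ws.commit()
after_commit = list(shared_file)

ws2 = PrivateWorkspace(shared_file)
ws2.write(0, "junk")
ws2.abort()
after_abort = list(shared_file)
```

Note that `commit` here installs blocks one by one, so it is not itself crash-atomic; a real implementation would need an atomic switch-over or a log for that step.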
  22. Example – Private Workspace Implementation
      Source: Andrew S. Tanenbaum, Distributed Operating Systems
  23. Write-Ahead Log
      - Files are modified in place
      - Before any change, a record is written to a log indicating
        - Which transaction is making the change
        - Which file & block is being changed
        - What the old & new values are
      - Change is made to the file only after the log is written successfully
      - Example: log-structured file system
      - If the transaction succeeds & is committed, a commit record is written to the log
      - If the transaction is aborted, the log is used to revert the file back to its original state (rollback)
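As a rough illustration of the write-ahead log on slide 23, the sketch below keeps the log as an in-memory list; a real log would be forced to stable storage before the file is changed. The `WriteAheadLog` class and its record layout are assumptions made for this example.

```python
class WriteAheadLog:
    """In-memory stand-in for a write-ahead log over a list of blocks."""

    def __init__(self, blocks):
        self.blocks = blocks  # the "file", modified in place
        self.log = []         # append-only list of records

    def write(self, tid, index, new_value):
        # Log first: who changes which block, plus the old and new values.
        self.log.append(("update", tid, index, self.blocks[index], new_value))
        # Only then change the file in place.
        self.blocks[index] = new_value

    def commit(self, tid):
        self.log.append(("commit", tid))

    def rollback(self, tid):
        # Undo this transaction's updates in reverse order using old values.
        for record in reversed(self.log):
            if record[0] == "update" and record[1] == tid:
                _, _, index, old_value, _ = record
                self.blocks[index] = old_value
        self.log.append(("abort", tid))


wal = WriteAheadLog([0, 0, 0])
wal.write("T1", 0, 5)
wal.write("T1", 1, 7)
wal.commit("T1")
wal.write("T2", 0, 9)
wal.write("T2", 0, 11)
wal.rollback("T2")  # restores block 0 to T1's committed value
final_blocks = list(wal.blocks)
```

Undoing in reverse order matters when a transaction writes the same block more than once: the oldest before-value is restored last.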
  24. Write-Ahead Log Implementation
  25. Concurrency Control
      - Goal
        - Allow simultaneous execution of transactions
        - While keeping each transaction isolated & leaving the data items being manipulated in a consistent state
      - Consistency
        - Achieved by giving transactions access to data items in a specific order
        - Final result would be the same as if all transactions were run sequentially
  26. Achieving Concurrency Control
      - 3 different managers
        1. Data manager
        2. Scheduler
        3. Transaction manager
  27. Achieving Concurrency Control (Cont.)
      - Data manager
        - Performs the actual read/write on data
        - Not concerned about transactions
      - Scheduler
        - Responsible for properly controlling concurrency
        - Determines which transaction is allowed to pass a read/write operation to the data manager & at which time
        - Schedules individual reads/writes such that transaction isolation & consistency properties are met
      - Transaction manager
        - Guarantees transaction atomicity
        - Processes transaction primitives by transforming them into scheduling requests
  28. Distributed Concurrency Control
      - Each site has a scheduler & data manager
      - Each transaction is handled by a transaction manager
      - Transaction manager communicates with schedulers at different sites
      - Scheduler may also communicate with remote data managers
  29. Concurrency Control
      - Schedules
      - Concurrency control issues
      - Serial equivalence
      - Issues in aborting transactions
  30. Schedules – Example
      - Schedule: system-dependent order of actions
      - Final result looks as though all transactions ran sequentially
  31. Concurrent Transaction Issues
      - Lost update problem
        - Transactions should not read a “stale” value & use it in computing a new value
      - Inconsistent retrievals problem
        - Update transactions shouldn’t interfere with retrievals
      - Illustrated in the context of banking examples…
  32. “Lost Update” Problem
      - 3 bank accounts (A, B, C) with balances
        - A = $100, B = $200, C = $300
      - Transaction T
        - Transfers from account A to B, for a 10% increase in B’s balance
        - B = $200 + ($200*10%) = $220
        - A = $100 – ($200*10%) = $80
      - Transaction U
        - Transfers from C to B, for a 10% increase in B’s balance
        - B = $220 + ($220*10%) = $242
        - C = $300 – ($220*10%) = $278
      - What might happen if transactions T & U are executed concurrently?
  33. “Lost Update” Problem (Cont.)
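The lost update can be reproduced with the balances from slide 32. The sketch below compares the serial order (T then U) with one bad interleaving in which both transactions read B = $200 before either writes; the function names are made up for this example, and the interleaving is one plausible schedule, not necessarily the one drawn on the original slide.

```python
START = {"A": 100, "B": 200, "C": 300}


def run_serial(start):
    acc = dict(start)
    t_amount = acc["B"] // 10  # T reads B = 200, so 10% is 20
    acc["B"] += t_amount
    acc["A"] -= t_amount
    u_amount = acc["B"] // 10  # U reads B = 220, so 10% is 22
    acc["B"] += u_amount
    acc["C"] -= u_amount
    return acc


def run_interleaved(start):
    acc = dict(start)
    t_read = acc["B"]                  # T reads B = 200
    u_read = acc["B"]                  # U reads B = 200 (stale once T writes)
    acc["B"] = u_read + u_read // 10   # U writes B = 220
    acc["C"] -= u_read // 10
    acc["B"] = t_read + t_read // 10   # T overwrites with 220: U's update is lost
    acc["A"] -= t_read // 10
    return acc


serial_result = run_serial(START)
interleaved_result = run_interleaved(START)
```

In the interleaved run the three balances sum to $580 instead of $600, so the conservation-of-money invariant from slide 11 is broken as well.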
  34. “Inconsistent Retrievals” Problem
      - 2 bank accounts (A, B) with balances
        - A = $200
        - B = $200
      - Transaction V
        - Transfers $100 from account A to B
        - A = $200 - $100 = $100
        - B = $200 + $100 = $300
      - Transaction W
        - Computes the sum of all account balances (branch total)
        - Total = $100 + $300 +…… = $400 + ……
      - What might happen if transactions V & W are executed concurrently?
  35. “Inconsistent Retrievals” Problem (Cont.)
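A small worked example of the inconsistent retrieval, using the balances from slide 34 and only accounts A and B. The function name is an assumption for this sketch; the schedule has W compute its sum after V has withdrawn from A but before V has deposited into B.

```python
def branch_total_interleaved(start):
    acc = dict(start)
    acc["A"] -= 100              # V withdraws $100 from A
    total = acc["A"] + acc["B"]  # W sums A and B in the middle of the transfer
    acc["B"] += 100              # V deposits $100 into B
    return total, acc


total_seen_by_w, final_accounts = branch_total_interleaved({"A": 200, "B": 200})
correct_total = sum(final_accounts.values())
```

W reports $300 even though the accounts hold $400 both before and after the transfer; the $100 in transit was invisible to it.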
  36. Serial Equivalence
      - Serial equivalence prevents the occurrence of lost updates & inconsistent retrievals
      - Thus, serial equivalence is used as a criterion in concurrency control algorithms
      - 2 transactions are serially equivalent if
        - All pairs of conflicting operations of the 2 transactions are executed in the same order at all of the objects that both transactions access
  37. Serial Equivalence (Cont.)
      - Consider 2 objects i & j, & 2 transactions T & U
      - Serially equivalent orderings require one of the following 2 conditions
        - T accesses i before U, and T accesses j before U
        - U accesses i before T, and U accesses j before T
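The two-transaction rule on slides 36 and 37 can be checked mechanically. The sketch below assumes a schedule is a list of `(transaction, operation, object)` tuples and that two operations conflict when they come from different transactions, touch the same object, and at least one is a write. The function name and tuple encoding are assumptions for this example, and the check is only sufficient for two transactions (with more, the ordering graph would also have to be acyclic).

```python
def serially_equivalent(schedule):
    """schedule: list of (transaction, 'r' or 'w', object) in execution order."""
    orders = set()
    for pos, (t1, op1, obj1) in enumerate(schedule):
        for t2, op2, obj2 in schedule[pos + 1:]:
            conflicting = t1 != t2 and obj1 == obj2 and "w" in (op1, op2)
            if conflicting:
                orders.add((t1, t2))  # t1's operation came first
    # Equivalent to a serial order only if no pair is ordered both ways.
    return not any((b, a) in orders for a, b in orders)


# Lost-update interleaving: both read B, then U writes, then T writes.
lost_update = [("T", "r", "B"), ("U", "r", "B"), ("U", "w", "B"), ("T", "w", "B")]

# T finishes with B before U touches it.
t_then_u = [("T", "r", "B"), ("T", "w", "B"), ("U", "r", "B"), ("U", "w", "B")]

# T accesses i before U, but U accesses j before T.
mixed_order = [("T", "w", "i"), ("U", "w", "i"), ("U", "w", "j"), ("T", "w", "j")]

lost_update_ok = serially_equivalent(lost_update)
t_then_u_ok = serially_equivalent(t_then_u)
mixed_order_ok = serially_equivalent(mixed_order)
```

Only `t_then_u` passes: in the other two schedules some conflicting pair is ordered T before U while another is ordered U before T.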
  38. “Lost Update” Problem
  39. Serially Equivalent Interleaving
  40. “Inconsistent Retrievals” Problem
  41. Serially Equivalent Interleaving
  42. Issues in Aborting Transactions
      - Servers must record the effects of all committed transactions & none of the aborted transactions
      - 2 well-known problems
        - Dirty reads
        - Premature writes
      - Both of these can occur in the presence of serially equivalent executions of transactions
  43. Example – Dirty Read
  44. Example – Premature Writes
      - Due to interaction between write operations on the same object belonging to different transactions
      - If U aborts & T commits, balance should be $105
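The premature-write problem on slide 44 can be illustrated with before-images, i.e., the value saved just before each write so that an abort can restore it. The slide's figure is not available here, so the starting balance of $100 and U's write of $110 are assumptions chosen to match the stated $105; the helper name is also hypothetical.

```python
def abort_with_before_image(before_image):
    # Recovery by before-image: restore the value saved when the write was made.
    return before_image


balance = 100       # assumed starting balance (not given on the slide)

t_before = balance  # T records its before-image: 100
balance = 105       # T writes $105
u_before = balance  # U records its before-image: 105, T's uncommitted value
balance = 110       # U writes $110 (assumed value)

# Case on the slide: U aborts and T commits, so the balance should be $105.
u_aborts_t_commits = abort_with_before_image(u_before)

# Problem case: T aborts, then U aborts. Both writes should vanish ($100),
# but U's before-image still holds T's aborted value.
after_t_abort = abort_with_before_image(t_before)
after_both_abort = abort_with_before_image(u_before)
```

The last line ends at $105 although neither transaction committed, which is why writes are usually delayed until earlier transactions that wrote the same object have committed or aborted.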
  45. Concurrency Control Algorithms
      - 2-Phase Locking (2PL)
        - Get all locks, then release them
        - No more lock acquisition once a lock is released
      - Timestamp ordering
        - Lamport’s timestamps
      - Optimistic concurrency
        - Go ahead & make the changes; if there is a problem, handle it later
      - How to do these concurrently on a distributed system?
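For the 2PL bullet on slide 45, a single-process sketch using Python's `threading.Lock` is shown below. It is not a distributed locking protocol: the `TwoPhaseTransaction` class, the lock table, and the `transfer` helper are assumptions for this example, and locks are taken in sorted name order purely to avoid deadlock in the demo.

```python
import threading


class TwoPhaseTransaction:
    """2PL sketch: acquire locks while growing, then release them all."""

    def __init__(self, lock_table):
        self._lock_table = lock_table  # object name -> threading.Lock
        self._held = []
        self._shrinking = False

    def lock(self, name):
        # Growing phase only: no acquisition once any lock has been released.
        if self._shrinking:
            raise RuntimeError("2PL violated: lock requested after a release")
        lock = self._lock_table[name]
        lock.acquire()
        self._held.append(lock)

    def release_all(self):
        # Shrinking phase: from here on, lock() is refused.
        self._shrinking = True
        while self._held:
            self._held.pop().release()


locks = {"A": threading.Lock(), "B": threading.Lock()}
accounts = {"A": 100, "B": 200}


def transfer(src, dst, amount):
    txn = TwoPhaseTransaction(locks)
    for name in sorted((src, dst)):  # fixed lock order avoids deadlock here
        txn.lock(name)
    accounts[src] -= amount
    accounts[dst] += amount
    txn.release_all()


threads = [threading.Thread(target=transfer, args=("A", "B", 1)) for _ in range(50)]
threads += [threading.Thread(target=transfer, args=("B", "A", 1)) for _ in range(50)]
for th in threads:
    th.start()
for th in threads:
    th.join()
```

Because every transfer holds both account locks for its whole read-modify-write, the 100 concurrent transfers cancel out exactly and no update is lost.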

Editor's Notes

  1. Both T and U read the balance of B (200); T overwrites the update by U without seeing it
  2. W computes the sum before V updates account B