Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Transactions and Concurrency Control

158 views

Published on

Transactions and Concurrency Control in distributed systems. Transaction properties, classification, and transaction implementation. Flat, Nested, and Distributed transactions. Inconsistent Retrievals, Lost Update, Dirty Read, and Premature Writes Problem

Published in: Engineering
  • Be the first to comment

Transactions and Concurrency Control

  1. 1. Transactions & Concurrency Control CS4262 Distributed Systems Dilum Bandara Dilum.Bandara@uom.lk Some slides adapted from U Uthaiyashankar (WSO2) and Rajkumar Buyya’s
  2. 2. Outline  Transactions  Properties  Classification  Transaction implementation  Concurrency control 2
  3. 3. Transactions  Transactions protect a shared resource (e.g., shared data) against simultaneous access by several concurrent processes  Transactions can do much more…  They allow a process to access & modify multiple data items as a single atomic operation  All-or-nothing property 3
  4. 4. Transactions in DBMS  A unit of work performed within a DBMS against a database  Treated in a coherent & reliable way independent of other transactions  2 main purposes  To provide reliable units of work that allow correct recovery from failures & keep a database consistent even in cases of (complete or partial) system failure  To provide isolation between programs accessing a database concurrently 4
  5. 5. Goal of Transactions  To ensure objects managed by a server remain in a consistent state when accessed by multiple transactions in the presence of server failure  Transactions deal with process crash failures & communication omission failures  Transactions don’t deal with Byzantine failure 5
  6. 6. Transaction Model  Recoverable objects  Objects that can be recovered after their server crashes  Failure atomicity  Effects of the transaction are atomic even when server crashes  Transactions require coordination among  A client program  Some recoverable objects  A coordinator 6
  7. 7. Transaction Model (Cont.)  Each transaction is created & managed by a coordinator which implements the coordinator interface  Coordinator gives each transaction an identifier (TID)  Clients invoke transaction coordinator interface methods to perform a transaction  A transaction may be aborted by client or server 7
  8. 8. Transaction Coordinator Interface  openTransaction() → trans  Starts a new transaction & delivers a unique TID trans  trans will be used in other operations in the transaction  closeTransaction(trans) → (commit, abort)  Ends a transaction  commit – indicates that transaction has committed  abort – Indicates that it has aborted  abortTransaction(trans)  Aborts the transaction 8
  9. 9. Properties of a Transaction  Proposed by Harder & Reuter in 1983  ACID Property  Atomic  Consistent  Isolated  Durable 9
  10. 10. Atomic Property  Process all of a transaction or none of it  To outside world, transaction happens indivisibly  While a transaction happens, other processes can’t see any intermediate states  Example  A file is 100 bytes long when a transaction starts to append to it  If other processes read the file during transaction, they should see & access only the 100 bytes  If transaction commits successfully, file grows instantaneously to its new size 10
  11. 11. Consistent Property  Data on all systems reflects the same state  Transaction doesn’t violate system invariants  Example  In a banking system, a key invariant is the law of conservation of money  After any internal transfer, amount of money in bank must be same as it was before transfer  During internal transfer operations, this invariant may be violated  But this violation isn’t visible outside the transaction 11
  12. 12. Isolated Property  Transactions don’t interact/interfere with one another  Concurrent transactions don’t interfere with each other  i.e., transactions are serializable  Example  If 2+ transactions are running at the same time, to each of them & to other processes, final result appears as though all transactions ran sequentially (in some system-dependent order) 12
  13. 13. Durable Property  Effects of a completed transaction are persistent  Once a transaction commits, changes are permanent  No failure after the commit can undo results or cause them to be lost 13
  14. 14. Ensuring Transaction Properties  To support “failure atomicity” & “durability”, objects must be recoverable  To support “isolation” & “consistency” server needs to synchronize operations  Perform transactions serially (sequentially)  Generally, an unacceptable solution  Allow transactions to execute concurrently, if they would have the same effect as a serial execution  Known as serializable or serially equivalent transactions 14
  15. 15. Classification of Transactions  Flat transactions  Nested transactions  Distributed transactions 15
  16. 16. Flat Transactions  A series of operations  Don’t allow partial results to be committed or aborted  e.g., booking multiple air lines from start to destination 16
  17. 17. Nested Transactions  Constructed from a number of sub-transactions  Top-level transaction may fork off children that run in parallel to one another  Incase of failure results of some sub-transaction that committed must be undone  e.g., booking multiple air lines from start to destination 17
  18. 18. Distributed Transactions  Logically, a flat, indivisible transaction that operates on distributed data  In contrast, a nested transaction is logically decomposed into a hierarchy of sub-transactions 18 Source: Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007
  19. 19. Achieving Transaction Properties in Distributed Transactions  To achieve atomicity & durability  Distributed commit protocols  To achieve consistency & isolation  Concurrency control algorithms (distributed locking protocols)  We need separate distributed algorithms to handle “locking of data” & “committing entire transaction” 19
  20. 20. Transaction Implementation  Consider transactions on a file system  If each process executing a transaction just updates the file it uses in place, then transactions will not be atomic  2 common implementation techniques  Private workspace  Write-ahead log 20
  21. 21. Private Workspace  Provide private (local) copies  All updates on private copies  Update the original (global) copy while committing  Issues  Cost of copying everything  Possible solution  Keep a pointer to relevant file block  When writing copy file block, instead of entire file  Update original when closing file  Examples – Andrew file systems 21
  22. 22. Example – Private Workspace Implementation 22 Source: Andrew S. Tanenbaum , Distributed Operating Systems
  23. 23. Write-Ahead Log  Files are modified in-place  Before any changes, a record is written in a log indicating  Which transaction is making the change  Which file & block is being changed  What old & new values are  Change is made to file only after log is written successfully  Example – log structured file system  If transaction succeeds & is committed, a commit record is written in log  If transaction is aborted, log is used to revert file back to its original state – rollback 23
  24. 24. Writeahead Log Implementation 24
  25. 25. Concurrency Control  Goal  Allow simultaneous execution of transactions  While enabling each transaction to be isolated & data items being manipulated are left in a consistent state  Consistency  Achieved by giving transactions access to data items in a specific order  Final result would be the same as if all transactions were run sequentially 25
  26. 26. Achieving Concurrency Control  3 different managers 1. Data manager 2. Scheduler 3. Transaction manager 26
  27. 27. Achieving Concurrency Control (Cont.)  Data manager  Performs actual read/write on data  Not concerned about transaction  Scheduler  Responsible for properly controlling concurrency  Determines which transaction is allowed to pass a read/write operation to data manager & at which time  Schedules individual read/write s.t. transaction isolation & consistency properties are met  Transaction manager  Guarantees transaction atomicity  Processes transaction primitives by transforming them into scheduling requests 27
  28. 28.  Each site has a scheduler & data manager  Each transaction is handled by a transaction manager  Transaction manager communicates with schedulers at different sites  Scheduler may also communicate with remote data managers Distributed Concurrency Control 28
  29. 29. Concurrency Control  Schedules  Concurrency control issues  Serial equivalence  Issues in aborting transactions 29
  30. 30. Schedules – Example  Schedule – system dependent order of actions  Final result looks as though all transactions ran sequentially 30
  31. 31. Concurrent Transaction Issues  Lost update problem  Transactions should not read a “stale” value & use it in computing a new value  Inconsistent retrievals problem  Update transactions shouldn’t interfere with retrievals  Illustrated in the context of banking examples… 31
  32. 32. “Lost Update” Problem  3 bank accounts (A, B, C) with balances,  A = $100, B = $200, C = $300  Transaction T  Transfers from account A to B, for 10% increase in the B’s balance  B = $200 + ($200*10%) = $220  A = $100 – ($200*10%) = $80  Transaction U  Transfers from C to B, for 10% increase in B’s balance  B = $220 + ($220*10%) = $242  C = $300 – ($220*10%) = $278  What might happen if transactions T & U are executed concurrently? 32
  33. 33. “Lost Update” Problem (Cont.) 33
  34. 34. “Inconsistent Retrievals” Problem  2 bank accounts (A, B) with balances  A = $200  B = $200  Transaction V  Transfers $100 of account A to B  A = $200 - $100 = $100  B = $200 + $100 = $300  Transaction W  Computes sum of all account balances (branch total)  Total = $100 + $300 +…… = $400 + ……  What might happen if transactions V & W are executed concurrently? 34
  35. 35. “Inconsistent Retrievals” Problem (Cont.) 35
  36. 36. Serial Equivalence  Serial equivalence prevents occurrence of lost updates & inconsistent retrievals  Thus, serial equivalence is used as a criterion in concurrency control algorithms  2 transactions are serially equivalent if  All pairs of conflicting operations of 2 transactions are executed in same order at all of the objects that both transactions access 36
  37. 37. Serial Equivalence (Cont.)  Consider 2 objects i & j, & 2 transactions T & U  Serially equivalent orderings require one of the following 2 conditions  T accesses i before U, and T accesses j before U  U accesses i before T, and U accesses j before T 37
  38. 38. “Lost Update” Problem 38
  39. 39. Serially Equivalent Interleaving 39
  40. 40. “Inconsistent Retrievals” Problem 40
  41. 41. Serially Equivalent Interleaving 41
  42. 42. Issues in Aborting Transactions  Servers must record effects of all committed transactions & none of aborted transactions  2 well-known problems  Dirty reads  Premature writes  Both of these can occur in the presence of serially equivalent executions of transactions 42
  43. 43. Example – Dirty Read 43
  44. 44. Example – Premature Writes  Due to interaction between write operations on the same object belonging to different transactions  If U aborts & T commits, balance should be $105 44
  45. 45. Concurrency Control Algorithms  2-Phase Locking (2PL)  Get all locks, then release them  No more lock acquisition when a lock is release  Timestamp ordering  Lamport’s time stamp  Optimistic concurrency  Go & do the changes, if there is a problem let’s handle them later  How to do these concurrently on a distributed system? 45

×