CS 542 Database Management Systems: Concurrency Control; Commit in Distributed Systems. J Singh, April 11, 2011
Today’s Meeting
Concurrency Control: Intention Locks; Index Locking; Optimistic CC (Validation, Timestamp Ordering); Multi-version CC
Commit in Distributed Databases: Two Phase Commit; Paxos Algorithm
Concluding thoughts
References (aside from textbook):
Concurrency Control and Recovery in Database Systems, Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman, Microsoft Research.
Concurrency Control: Methods, Performance, and Analysis, Alexander Thomasian, ACM Computing Surveys, March 1998.
Paxos Commit, Gray & Lamport, Microsoft Research TechFest, 2004.
OLTP Through the Looking Glass, and What We Found There, Harizopoulos et al., Proc. ACM SIGMOD, 2008.
The End of an Architectural Era, Stonebraker et al., Proc. VLDB, 2007.
Scheduler Architecture for CC Scheduler has two parts Accepts read/write requests from transactions Assures serialization Keeps track of active and pending transactions  Controls commit, abort, delay Today’s lecture discusses Part 2 functionality
The Lock Table A relation that associates database elements with locking information about that element Implemented as a hash table Size is proportional to the number of lock elements, not to the size of the entire database DB element A Lock information for A
Scheduler Priority Logic When a transaction releases a lock that other transactions are waiting for, what policy to use? First-Come-First-Served:  Grant the lock to the longest waiting request.  No starvation (waiting forever for lock) Priority to Shared Locks:  Grant all S locks waiting, then one X lock.  Grant X lock if no others waiting Priority to Upgrading:  If there is a U lock waiting to upgrade to an X  lock, grant that first. Each has its advantages and disadvantages Configurable for a database instance
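The lock table and the First-Come-First-Served policy can be sketched together. This is a minimal illustration, not any particular DBMS's implementation: the class name `LockTable`, its methods, and the restriction to S and X modes are assumptions made for the example.

```python
from collections import deque


class LockTable:
    """Hash table from a database element to its lock information.

    Only elements that are currently locked (or waited on) have an entry,
    so the table grows with the number of locked elements, not the database.
    """

    def __init__(self):
        self.entries = {}

    @staticmethod
    def _compatible(entry, mode):
        # A request is compatible if nobody holds the element,
        # or if it is a shared request and every holder is shared.
        if not entry["holders"]:
            return True
        return mode == "S" and all(m == "S" for m in entry["holders"].values())

    def acquire(self, txn, elem, mode):
        """Return True if granted now, False if the request was queued."""
        entry = self.entries.setdefault(elem, {"holders": {}, "waiting": deque()})
        # FCFS: never jump ahead of a request that is already waiting.
        if not entry["waiting"] and self._compatible(entry, mode):
            entry["holders"][txn] = mode
            return True
        entry["waiting"].append((txn, mode))
        return False

    def release(self, txn, elem):
        """Release txn's lock and grant waiting requests in arrival order."""
        entry = self.entries[elem]
        entry["holders"].pop(txn, None)
        granted = []
        while entry["waiting"] and self._compatible(entry, entry["waiting"][0][1]):
            waiter, mode = entry["waiting"].popleft()
            entry["holders"][waiter] = mode
            granted.append(waiter)
        if not entry["holders"] and not entry["waiting"]:
            del self.entries[elem]  # keep the table proportional to locked elements
        return granted
```

Because the queue is strictly first-in-first-out, the longest-waiting request is always served next, which is what rules out starvation. The other two policies would change only the order in which `release` drains the queue.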
Motivation for intention locks Besides scanning through the table, if we need to modify a few tuples. What kind of lock to put on the table? Have to be X (if we only have S or X). But, blocks all other read requests!
Intention Locks Allow intention locks IS, IX. Before S locking an item, must IS lock the root. Before X locking an item, must IX lock the root. Should make sure: If Ti S locks a node, no Tj can X lock an ancestor; achieved if S conflicts with IX. If Tj X locks a node, no Ti can S or X lock an ancestor; achieved if X conflicts with IS and IX.
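Allowed Lock Sharings (rows: lock held by another transaction; columns: lock requested; "yes" means the two can be held together)
Holder    IS    IX    S     SIX   X
IS        yes   yes   yes   yes   no
IX        yes   yes   no    no    no
S         yes   no    yes   no    no
SIX       yes   no    no    no    no
X         no    no    no    no    no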
Multiple Granularity Lock Protocol Each txn starts from the root of the hierarchy. To get a lock on any node, must hold an intention lock on its parent node! E.g. to get S lock on a node, must hold IS or IX on parent. E.g. to get X lock on a node, must hold IX or SIX on parent. Full rules: to get S or IS on a node, hold IS or IX on its parent; to get X, IX, or SIX on a node, hold IX or SIX on its parent. Must release locks in bottom-up order.
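The compatibility matrix and the parent-lock rule can be written down as two lookup tables. This is a sketch under the standard IS/IX/S/SIX/X definitions; the names `COMPATIBLE`, `REQUIRED_ON_PARENT`, `can_grant`, and `may_request` are made up for the example.

```python
# COMPATIBLE[held] is the set of modes another transaction may be granted
# on the same node while `held` is in place.
COMPATIBLE = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}

# Modes of which the requester must hold at least one on the parent node
# before asking for the given mode on a child (as stated on the slide).
REQUIRED_ON_PARENT = {
    "IS": {"IS", "IX"},
    "S": {"IS", "IX"},
    "IX": {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
    "X": {"IX", "SIX"},
}


def can_grant(requested, held_by_others):
    """True if `requested` conflicts with none of the locks other txns hold."""
    return all(requested in COMPATIBLE[held] for held in held_by_others)


def may_request(requested, own_modes_on_parent):
    """True if the requester's own locks on the parent permit the request."""
    return bool(REQUIRED_ON_PARENT[requested] & set(own_modes_on_parent))
```

For instance, `can_grant("IX", ["IS"])` is true, which is why a reader of one tuple and a writer of another can share the relation, while `can_grant("S", ["IX"])` is false.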
Example 1 T1 needs a shared lock on t2: T1(IS) on R1, then T1(S) on t2. T2 needs a shared lock on R1: T2(S) on R1. (Figure: relation R1 with tuples t1, t2, t3, t4.)
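Example 2 T1 holds IS on R1 and S on a tuple. T2 needs an exclusive lock on t4, so it first takes IX on R1. IS and IX are compatible: no conflict. (Figure: relation R1 with tuples t1, t2, t3, t4; lock labels T1(IS), T2(IX), T2(IX), T1(S).)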
Examples 3, 4, 5 T1 scans R, and updates a few tuples: T1 gets an SIX lock on R, and occasionally upgrades to X on the tuples. T2 uses an index to read only part of R: T2 gets an IS lock on R, and repeatedly gets an S lock on tuples of R. T3 reads all of R: T3 gets an S lock on R. OR, T3 could behave like T2; can use lock escalation as it goes.
Insert and Delete Transactions
T1: SELECT MAX(Price) WHERE Rating = 1; SELECT MAX(Price) WHERE Rating = 2;
T2: INSERT <Apple, Arkansas Black, 1, 96>; DELETE WHERE Rating = 2 AND Price = (SELECT MAX(Price) WHERE Rating = 2);
Execution: T1 locks all records w/Rating=1 and gets 80. T2 inserts <Arkansas Black, 96>. T2 deletes <Fuji, 75>. T1 locks all records w/Rating=2 and gets 65.
Results:
From T1: 80, 65
Actual: 96, 65
T1 then T2: 80, 75
T2 then T1: 96, 65
T1's result (80, 65) matches neither serial order, so the interleaved execution is not serializable.
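The anomaly can be replayed on a tiny in-memory table. This is only a sketch: the slide names just the Fuji and Arkansas Black rows, so the other two rows (`Gala`, `Empire`) and the three-column layout are hypothetical, chosen to reproduce the prices on the slide.

```python
# (variety, rating, price); Gala and Empire are hypothetical filler rows.
ROWS = [("Gala", 1, 80), ("Fuji", 2, 75), ("Empire", 2, 65)]


def max_price(table, rating):
    """SELECT MAX(Price) WHERE Rating = rating."""
    return max(price for _, r, price in table if r == rating)


def run_t2(table):
    """T2: insert Arkansas Black, then delete the top-priced Rating = 2 row."""
    table = table + [("Arkansas Black", 1, 96)]
    top = max_price(table, 2)
    return [row for row in table if not (row[1] == 2 and row[2] == top)]


def run_t1(table):
    """T1 run without interference: both maxima from the same table state."""
    return (max_price(table, 1), max_price(table, 2))


def run_interleaved(table):
    """The schedule on the slide: T1's first query, all of T2, T1's second query."""
    first = max_price(table, 1)
    table = run_t2(table)
    second = max_price(table, 2)
    return (first, second)


observed = run_interleaved(ROWS)   # what T1 sees in the interleaved schedule
t1_then_t2 = run_t1(ROWS)          # serial order T1, T2
t2_then_t1 = run_t1(run_t2(ROWS))  # serial order T2, T1
```

Locking every existing Rating = 1 tuple does nothing to stop the insert of a new one, which is why the observed pair differs from both serial results.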
Did Insert/Delete expose a flaw in 2PL? The flaw was in the assumption that by locking all tuples, T1 had locked the set! We needed to lock the set. Would we bottleneck on the relation if the workload were insert- and delete-heavy? There is another way to solve the problem: lock at the index (if one exists). Since B+ trees are not 100% full, we can maintain multiple locks in different sections of the tree. (Figure: an index, with the lock placed on the index entry r=1.)
Index Locking (p1) Higher levels of the tree only direct searches for leaf pages. For inserts, a node on a path from root to modified leaf must be locked (in X mode, of course), only if a split can propagate up to it from the modified leaf.  (Similar point holds w.r.t. deletes.) We can exploit these observations to design efficient locking protocols that guarantee serializability even though they violate 2PL.
Index Locking (p2) Search:  Start at root and go down; repeatedly, S lock child then unlock parent. Insert/Delete: Start at root and go down, obtaining X locks as needed.  Once child is locked, check if it is safe: If child is safe, release all locks on ancestors. Safe node:  Node such that changes will not propagate up beyond this node. Inserts:  Node is not full. Deletes:  Node is not half-empty.
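The insert/delete descent can be sketched without building a real B+ tree: given the nodes on the root-to-leaf path, compute which X locks are still held when the leaf is reached. The `Node` fields, the node names, and the minimum-occupancy rule (`capacity // 2`) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    keys: int      # entries currently stored in the node
    capacity: int  # maximum number of entries the node can hold


def is_safe(node, op):
    """A safe node absorbs the change, so nothing propagates above it."""
    if op == "insert":
        return node.keys < node.capacity       # not full: no split
    if op == "delete":
        return node.keys > node.capacity // 2  # above minimum: no merge
    raise ValueError(f"unknown operation: {op}")


def locks_held_after_descent(path, op):
    """Walk root to leaf, X locking each node; on reaching a safe node,
    release every lock held on its ancestors."""
    held = []
    for node in path:
        held.append(node.name)   # lock the child while still holding ancestors
        if is_safe(node, op):
            held = [node.name]   # changes cannot propagate past this node
    return held
```

With a full leaf under a full parent, the locks pile up all the way to the nearest safe ancestor; with a leaf that has room, only the leaf stays locked.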
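Example Where to lock? 1) Delete 38* 2) Insert 45* 3) Insert 25* (Figure: B+ tree with ROOT node A, internal nodes B, C, F, separator keys 20, 35, 23, 38, 44, and leaf nodes D, E, G, H, I holding the entries 20*, 22*, 23*, 24*, 35*, 36*, 38*, 41*, 44*.)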
Optimistic CC Locking is a conservative approach in which conflicts are prevented. Disadvantages: Lock management overhead. Deadlock detection/resolution. Not discussed in CS-542 lectures, expecting that you are familiar with it If conflicts are rare, we may be able to gain performance by not locking, and instead checking for conflicts before txns commit. Two approaches Kung-Robinson Model Divides every transaction into three phases: read, validate, write Makes commit/abort decision based on what’s being read and written Timestamp Ordering Algorithms Clever use of timestamps to determine which operations are conflict-free and which must be aborted
Kung-Robinson Model Key idea: Let transactions work in isolation. Validate reads and writes when ready to commit. Make validation atomic. Validated ≡ Committed. Transactions have three phases: READ: txns read from the database, make changes to private copies of objects. VALIDATE: Check if schedule so far is serializable. WRITE: Make local copies of changes public. (Figure: old ROOT and new ROOT, with the new one pointing at the modified objects.)
Validation Test conditions that are sufficient to ensure that no conflict occurred. Each txn is assigned a numeric id. Just use a timestamp. Transaction ids assigned at end of READ phase, just before validation begins.  ReadSet(Ti):  Set of objects read by txn Ti. WriteSet(Ti):  Set of objects modified by Ti. Validation is atomic Done in a critical section
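Validation Tests For each Ti that validated before Tj, one of the following tests must hold:
Test 1: FIN(Ti) < START(Tj). Situation: Ti completes all three phases (R, V, W) before Tj begins.
Test 2: FIN(Ti) < VAL(Tj) AND WriteSet(Ti) ∩ ReadSet(Tj) is empty. Situation: Ti completes its write phase before Tj validates.
Test 3: VAL(Ti) < VAL(Tj) AND WriteSet(Ti) ∩ ReadSet(Tj) is empty AND WriteSet(Ti) ∩ WriteSet(Tj) is empty. Situation: Ti validates first, but the write phases of Ti and Tj may overlap.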
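The three tests translate almost directly into code. This is a sketch, not a full optimistic scheduler: the `Txn` record and function names are invented for the example, and a transaction that has not yet finished its write phase is modeled with `fin = float("inf")`.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    start: float                 # START: beginning of the read phase
    val: float                   # VAL: beginning of validation
    fin: float = float("inf")    # FIN: end of write phase (inf if unfinished)
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def passes_against(ti, tj):
    """True if Tj passes at least one of the three tests against Ti."""
    no_wr = not (ti.write_set & tj.read_set)
    no_ww = not (ti.write_set & tj.write_set)
    if ti.fin < tj.start:                       # Test 1
        return True
    if ti.fin < tj.val and no_wr:               # Test 2
        return True
    if ti.val < tj.val and no_wr and no_ww:     # Test 3
        return True
    return False


def validate(tj, earlier):
    """Tj validates only if it passes against every earlier-validated txn."""
    return all(passes_against(ti, tj) for ti in earlier)
```

In a real system this check runs inside the critical section mentioned on the Validation slide; a transaction that fails is restarted.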
Overheads in Kung-Robinson CC Must record read/write activity in ReadSet and WriteSet per txn. Must create and destroy these sets as needed. Must check for conflicts during validation, and must make validated writes “global”. Critical section can reduce concurrency. Scheme for making writes global can reduce clustering of objects. Optimistic CC restarts transactions that fail validation. Work done so far is wasted; requires clean-up.
Timestamp Ordering CC Main idea: Put a timestamp on the last read and write action on every object. Use this timestamp to detect if a transaction attempts an illegal operation. Abort the offending transaction if it does. Algorithm: Give each object a read-timestamp (RTS) and a write-timestamp (WTS). Give each txn a timestamp (TS) when it begins. If action ai of txn Ti conflicts with action aj of txn Tj, and TS(Ti) < TS(Tj), then ai must occur before aj. Otherwise, restart the violating txn.
Rules for Timestamp-Based Scheduling Algorithm setup: RT(X), the read time of X, is the highest timestamp of a transaction that has read X. WT(X), the write time of X, is the highest timestamp of a transaction that has written X. C(X), the commit bit for X, is true if and only if the most recent transaction to write X has already committed. Scheduler receives a request from T to operate on X. The request is realizable under some conditions and not under others.
Physically Unrealizable: Read too late A transaction U that started after transaction T wrote a value for X before T reads X. In other words, if TS(T) < WT(X), then the read is physically unrealizable, and T must be rolled back. (Timeline: T starts, U starts, U writes X, T reads X.)
Physically Unrealizable: Write too late A transaction U that started after T read X before T got a chance to write X. In other words, if TS(T) < RT(X), then the write is physically unrealizable, and T must be rolled back. (Timeline: T starts, U starts, U reads X, T writes X.)
Dirty Read After T reads the value of X written by U, U could abort. In other words, if TS(T) ≥ WT(X), then the read is physically realizable, but the value in X may not be committed yet. If C(X) is true, then the previous writer of X is committed, all is good. If C(X) is false, we must delay T. (Timeline: U starts, T starts, U writes X, T reads X, U aborts.)
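Write after Write T tries to write X after a later transaction (U) has written it, i.e. TS(T) ≥ RT(X) but TS(T) < WT(X): the write is physically realizable, but there is already a later value in X. OK to ignore the write by T because it will get overwritten anyway. Except if U aborts, and the new value of T is lost forever. Solve this problem by introducing the concept of a “tentative write”: if C(X) is false, delay T. (Timeline: T starts, U starts, U writes X, T writes X, T commits, U aborts.)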
Rules for Timestamps-based Scheduling Scheduler receives a request to commit T.  It must find all the database elements X written by T and set C(X)=true.  If any transactions are waiting for X to be committed, these transactions are allowed to proceed. Scheduler receives a request to abort T or decides to rollback T,  Any transaction that was waiting on an element X that T wrote must repeat its attempt to read or write.
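The read, write, and commit rules can be collected into one small scheduler for a single element X. This is a sketch under the usual textbook formulation: the class name, method names, and the returned strings are assumptions, and restoring the previous value and write time when a writer aborts is left out.

```python
class Element:
    """Timestamp bookkeeping for one database element X."""

    def __init__(self):
        self.rt = 0            # RT(X): highest timestamp that has read X
        self.wt = 0            # WT(X): highest timestamp that has written X
        self.committed = True  # C(X): has the latest writer committed?

    def read(self, ts):
        if ts < self.wt:
            return "rollback"  # read too late: a later txn already wrote X
        # Assumption beyond the slides: a txn may read its own tentative write.
        if not self.committed and ts != self.wt:
            return "delay"     # possible dirty read: wait for the writer
        self.rt = max(self.rt, ts)
        return "grant"

    def write(self, ts):
        if ts < self.rt:
            return "rollback"  # write too late: a later txn already read X
        if ts < self.wt:
            # A later value is already in X: skip the write if that value
            # is committed, otherwise wait to see whether its writer aborts.
            return "ignore" if self.committed else "delay"
        self.wt = ts
        self.committed = False  # tentative until the writer commits
        return "grant"

    def commit_writer(self):
        """The most recent writer committed: set C(X) = true."""
        self.committed = True
```

A request that comes back as `"delay"` is retried after the writer commits or aborts, which matches the rule that waiting transactions "must repeat their attempt to read or write".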
Multiversion Timestamps Multiversion schemes keep old versions of data item to increase concurrency. Each successful write results in the creation of a new version of the data item written. Use timestamps to label versions. When a read(X) operation is issued, select an appropriate version of X based on the timestamp of the transaction, and return the value of the selected version.
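Version selection can be sketched as follows, assuming that the "appropriate" version is the one with the largest write timestamp not exceeding the reader's timestamp (the usual multiversion timestamp-ordering choice). The class `VersionedItem` is hypothetical, timestamps are assumed non-negative, and the check that rejects a write arriving after a later reader has already read an older version is omitted.

```python
import bisect


class VersionedItem:
    """Keeps every version of one data item, ordered by write timestamp."""

    def __init__(self, initial):
        self.write_ts = [0]       # timestamp of the initial version
        self.values = [initial]

    def write(self, ts, value):
        """Each successful write creates a new version labeled with ts."""
        i = bisect.bisect_right(self.write_ts, ts)
        self.write_ts.insert(i, ts)
        self.values.insert(i, value)

    def read(self, ts):
        """Return the latest version written at or before timestamp ts."""
        i = bisect.bisect_right(self.write_ts, ts) - 1
        return self.values[i]
```

A reader with timestamp 15 is served the version written at 10 even if a version from timestamp 20 already exists, so it neither waits nor aborts.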
Timestamps vs Locking Generally, timestamping performs better than locking in situations where: Most transactions are read-only. It is rare that concurrent transactions will try to read and write the same element. This is generally the case for Web applications. In high-conflict situations, locking performs better than timestamps.
Practical Use 2-Phase Locks (or variants) Used by most relational databases Multi-level granularity Support for table, page and tuple-level locks Used by most relational databases Multi-version concurrency control Oracle 8 forward: Divide transactions into read-only and read-write Read-only transactions use multi-version concurrency and never wait Read-write transactions use 2PL Postgres, others as well, offer some level of MVCC
Distributed Commit Motivation FruitCo has Its main Sales office in Oregon Farms and Warehouse are in Washington Finance is in Utah All three sites have local data centers with their own systems When an order is placed, the Sales system must send the billing information to Utah and shipping information to Washington. When an order is placed, all three databases must be updated, or none should be.
Two Phase Commit The Basic Idea
Two-Phase Commit (2PC) Phase 1: The TM gets the RMs ready to write the results into the database. Phase 2: Everybody writes the results into the database. TM (Transaction Manager): The process at the site where the transaction originates and which controls the execution. RM (Resource Manager): The process at each of the other sites that participate in executing the transaction. Global Commit Rule: The TM aborts a transaction if and only if at least one RM votes to abort it. The TM commits a transaction if and only if all of the RMs vote to commit it.
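Centralized 2PC Phase 1: the coordinator (C) asks every participant (P) "ready?"; each participant answers yes/no. Phase 2: the coordinator sends commit/abort; each participant answers committed/aborted.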
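State Transitions in 2PC (each transition is labeled input / output)
TM: INITIAL, on Commit command / send Prepare, goes to WAIT. WAIT, on any Vote-abort / send Global-abort, goes to ABORT. WAIT, on Vote-commit from all / send Global-commit, goes to COMMIT.
RMs: INITIAL, on Prepare / send Vote-commit, goes to READY. INITIAL, on Prepare / send Vote-abort, goes to ABORT. READY, on Global-abort / send Ack, goes to ABORT. READY, on Global-commit / send Ack, goes to COMMIT.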
Timeouts at the TM (an RM or the network is not responding)… Timeout in INITIAL: Who cares. Timeout in WAIT: Cannot unilaterally commit. Can unilaterally abort. Timeout in ABORT or COMMIT: Stay blocked and wait for the acks.
Timeouts at an RM (the TM is not responding)… Timeout in INITIAL: TM must have failed in INITIAL state. Unilaterally abort. Timeout in READY: Stay blocked.
When TM Recovers… Failure in INITIAL: Start the commit process upon recovery. Failure in WAIT: Restart the commit process upon recovery. Failure in ABORT or COMMIT: Nothing special if all the acks have been received. Otherwise the termination protocol is involved.
When an RM Recovers… Failure in INITIAL: Unilaterally abort upon recovery. Failure in READY: The TM has been informed about the local decision. Treat as timeout in READY state and invoke the termination protocol. Failure in ABORT or COMMIT: Nothing special needs to be done.
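2PC Protocol Actions
TM, in INITIAL: write begin_commit in log; send PREPARE to the RMs; enter WAIT.
RM, in INITIAL, on PREPARE: Ready to commit? If no: write abort in log; send VOTE-ABORT; enter ABORT. If yes: write ready in log; send VOTE-COMMIT; enter READY.
TM, in WAIT, once the votes arrive: Any No? If yes: write abort in log; send GLOBAL-ABORT; enter ABORT. If no: write commit in log; send GLOBAL-COMMIT; enter COMMIT.
RM, in READY: Type of msg? Abort: write abort in log; send ACK; enter ABORT. Commit: write commit in log; send ACK; enter COMMIT.
TM, on receiving the ACKs: write end_of_transaction in log.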
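The failure-free path through these actions can be simulated in a few lines. This is only a sketch: messages are replaced by direct function logic, the function name and return shape are invented, and timeouts, crashes, and recovery are not modeled. The site names in the usage note come from the FruitCo example.

```python
def two_phase_commit(votes):
    """Run one failure-free 2PC round.

    votes maps each RM name to True (ready to commit) or False.
    Returns (decision, tm_log, rm_logs), where the logs list the records
    each process would force to its log, in order.
    """
    tm_log = ["begin_commit"]          # TM logs, then sends PREPARE
    rm_logs = {}

    # Phase 1: every RM votes. An RM that votes abort aborts unilaterally.
    for rm, ready in votes.items():
        rm_logs[rm] = ["ready"] if ready else ["abort"]

    # Global commit rule: commit iff all RMs voted to commit.
    decision = "commit" if all(votes.values()) else "abort"
    tm_log.append(decision)

    # Phase 2: RMs in READY apply the global decision and ACK.
    for rm, ready in votes.items():
        if ready:
            rm_logs[rm].append(decision)

    tm_log.append("end_of_transaction")  # written after all ACKs arrive
    return decision, tm_log, rm_logs
```

Calling `two_phase_commit({"Sales": True, "Warehouse": True, "Finance": False})` yields an abort: Finance logs only `abort`, while Sales and Warehouse log `ready` followed by `abort`.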
Two-phase commit commentary Two-phase commit protocol limitation: it is a blocking protocol.  The failure of the TM can cause the protocol to block until the TM is repaired.  If the TM fails right after every RM has sent a Prepared message, then the other RMs have no way of knowing whether the TM committed or aborted. RMs will block resource processes while waiting for a message from the TM.  A TM will also block resources while waiting for replies from RMs. A TM can also block indefinitely if no acknowledgement is received from the RM.  “Federated” two-phase commit protocols, aka three-phase protocols, have been proposed but are still unproven. Paxos Consensus Algorithm.  Consensus on Transaction Commit, Jim Gray and Leslie Lamport, Microsoft Research, 2005, MSR-TR-2003-96
Fault-Tolerant Two Phase Commit If the 2PC Transaction Manager (TM) fails, the transaction blocks. Solution: Add a “spare” transaction manager (non-blocking commit, 3-phase commit). (Figure: RequestCommit, Prepare, and Prepared messages flowing among the client, two TMs, and the RMs.)
Fault-Tolerant Two Phase Commit If the 2PC Transaction Manager (TM) fails, the transaction blocks. Solution: Add a “spare” transaction manager (non-blocking commit, 3-phase commit). (Figure: the same message flow with several TMs, in which both commit and abort messages reach the RMs.) Inconsistent! Now what? The complexity is a mess. But… What if…?
Fault Tolerant 2PC  Several workarounds proposed in database community: Often called "3-phase" or "non-blocking" commit. None with complete algorithm and correctness proof.
Consensus box Clients propose values (Propose X, Propose W); every client learns the same result (W Chosen). Consensus: collects proposed values; picks one proposed value; remembers it forever.
Consensus for Commit: The Obvious Approach Get consensus on TM’s decision. TM just learns consensus value. TM is “stateless”. (Figure, message flow: client sends Request Commit to TM; TM sends Prepare to the RMs; RMs answer Prepared; TM sends Propose Prepared to the consensus box; consensus box answers Prepared Chosen; TM sends Commit to the RMs.)
Consensus for Commit: The Paxos Commit Approach Get consensus on each RM’s choice. TM just combines consensus values. TM is “stateless”. (Figure, message flow: client sends Request Commit to TM; TM sends Prepare to the RMs; each RM sends Propose RMi Prepared to its own consensus box; the consensus boxes report RM1 Prepared Chosen and RM2 Prepared Chosen to the TM; TM sends Commit to the RMs.)
The Obvious Approach vs Paxos Commit The Obvious Approach: Prepare, Prepared, Propose Prepared, Prepared Chosen, Commit. Paxos Commit: Prepare, Propose RM1 Prepared / Propose RM2 Prepared, RM1 Prepared Chosen / RM2 Prepared Chosen, Commit. Paxos Commit has one fewer message delay.
Consensus in Action The normal (failure-free) case: the RM sends Propose RM Prepared to each acceptor in the consensus box; each acceptor sends Vote RM Prepared to the TMs; the TM learns RM Prepared Chosen. Two message delays. Can optimize.
Consensus in Action If a majority of acceptors are working, a TM can always learn what was chosen, or get Aborted chosen if nothing has been chosen yet. (Figure: RM, consensus box of acceptors, and TMs.)
The Complete Algorithm Subtle. More weird cases than most people imagine. Proved correct.
Paxos Commit in a Nutshell N RMs. 2F+1 acceptors (~2F+1 TMs). If F+1 acceptors see all RMs prepared, then the transaction is committed. Cost: 2F(N+1) + 3N + 1 messages, 5 message delays, 2 stable-write delays. (Figure, message flow among Client, TM, RM 1…N, and Acceptors 0…2F: request commit, prepare, prepared, all prepared, commit.)
Paxos Commit Evaluation
                       Two-Phase Commit               Paxos Commit
Messages               3N+1                           3N + 2F(N+1) + 1
Stable writes          N+1                            N+2F+1
Message delays         4                              5
Stable-write delays    2                              2
Fault tolerance        Availability is compromised    Tolerates F faults
Paxos ≡ 2PC for F = 0. Chubby has F=2 (5 acceptors).
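The cost formulas are easy to tabulate for concrete values of N (number of RMs) and F (faults tolerated). The function names below are made up for this sketch; the formulas are the ones on the slide.

```python
def two_pc_cost(n):
    """Costs of Two-Phase Commit with n RMs."""
    return {"messages": 3 * n + 1, "stable_writes": n + 1}


def paxos_commit_cost(n, f):
    """Costs of Paxos Commit with n RMs, tolerating f faults."""
    return {
        "acceptors": 2 * f + 1,
        "messages": 3 * n + 2 * f * (n + 1) + 1,
        "stable_writes": n + 2 * f + 1,
    }
```

For N = 3 and F = 2 (five acceptors, as in the Chubby remark) this gives 26 messages and 8 stable writes, against 10 messages and 4 stable writes for 2PC. With F = 0 both counts coincide with 2PC.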
OLTP Through the Looking Glass (p1) Took out components of a DBMS and measured the performance impact. Workload: TPC-C Benchmark. Quote: “Overall, we identify overheads and optimizations that explain a total difference of about a factor of 20x in raw performance. … Substantial time is spent in logging, latching, locking, B-tree, and buffer management.”
OLTP Through the Looking Glass (p2) Areas for research that may yield dividends. Concurrency Control: Look for applications where it can be turned off; some sort of optimistic concurrency control. Multi-core Support: Latching (inter-thread communication) remains a significant bottleneck. Cache-conscious B-Trees. Replication Management: Loss of transactional consistency if log shipping; recovery is not instantaneous; maintaining transactional consistency. Weak Consistency: Starbucks doesn’t need two phase commit; how to achieve eventual consistency without transactional consistency.
End of an Era? A critique of the “one size fits all” assumption in DBMS. The Relational Model is not necessarily the answer: It was excellent for data processing. Not a natural fit for data warehouses, Web-oriented search, real-time analytics, and semi-structured data, i.e., Semantic Web. SQL is not the answer: Coupling between modern programming languages and SQL is “ugly beyond belief”. Programming languages have evolved (Pascal, C/C++, Java, and the little languages: Python, Perl, PHP, Ruby) while SQL has remained static.

More Related Content

What's hot

Concurrency control
Concurrency controlConcurrency control
Concurrency controlkansel85
 
Transaction concurrency control
Transaction concurrency controlTransaction concurrency control
Transaction concurrency controlAnand Grewal
 
Svetlin Nakov - Database Transactions
Svetlin Nakov - Database TransactionsSvetlin Nakov - Database Transactions
Svetlin Nakov - Database TransactionsSvetlin Nakov
 
Dbms ii mca-ch9-transaction-processing-2013
Dbms ii mca-ch9-transaction-processing-2013Dbms ii mca-ch9-transaction-processing-2013
Dbms ii mca-ch9-transaction-processing-2013Prosanta Ghosh
 
Chapter 12 transactions and concurrency control
Chapter 12 transactions and concurrency controlChapter 12 transactions and concurrency control
Chapter 12 transactions and concurrency controlAbDul ThaYyal
 
Dbms sixth chapter_part-1_2011
Dbms sixth chapter_part-1_2011Dbms sixth chapter_part-1_2011
Dbms sixth chapter_part-1_2011sumit_study
 
Databases: Concurrency Control
Databases: Concurrency ControlDatabases: Concurrency Control
Databases: Concurrency ControlDamian T. Gordon
 
17. Recovery System in DBMS
17. Recovery System in DBMS17. Recovery System in DBMS
17. Recovery System in DBMSkoolkampus
 
4. concurrency control
4. concurrency control4. concurrency control
4. concurrency controlAbDul ThaYyal
 
Optimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed SystemsOptimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed Systemsmridul mishra
 
management of distributed transactions
management of distributed transactionsmanagement of distributed transactions
management of distributed transactionsNilu Desai
 
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...Gyanmanjari Institute Of Technology
 
Concurrency (Distributed computing)
Concurrency (Distributed computing)Concurrency (Distributed computing)
Concurrency (Distributed computing)Sri Prasanna
 

What's hot (20)

Concurrency control
Concurrency controlConcurrency control
Concurrency control
 
Concurrency Control
Concurrency ControlConcurrency Control
Concurrency Control
 
Distributed Transaction
Distributed TransactionDistributed Transaction
Distributed Transaction
 
Transaction concurrency control
Transaction concurrency controlTransaction concurrency control
Transaction concurrency control
 
Svetlin Nakov - Database Transactions
Svetlin Nakov - Database TransactionsSvetlin Nakov - Database Transactions
Svetlin Nakov - Database Transactions
 
Dbms ii mca-ch9-transaction-processing-2013
Dbms ii mca-ch9-transaction-processing-2013Dbms ii mca-ch9-transaction-processing-2013
Dbms ii mca-ch9-transaction-processing-2013
 
Ch15
Ch15Ch15
Ch15
 
Concurrency
ConcurrencyConcurrency
Concurrency
 
Chapter 12 transactions and concurrency control
Chapter 12 transactions and concurrency controlChapter 12 transactions and concurrency control
Chapter 12 transactions and concurrency control
 
Dbms sixth chapter_part-1_2011
Dbms sixth chapter_part-1_2011Dbms sixth chapter_part-1_2011
Dbms sixth chapter_part-1_2011
 
Databases: Concurrency Control
Databases: Concurrency ControlDatabases: Concurrency Control
Databases: Concurrency Control
 
Dbms
DbmsDbms
Dbms
 
17. Recovery System in DBMS
17. Recovery System in DBMS17. Recovery System in DBMS
17. Recovery System in DBMS
 
4. concurrency control
4. concurrency control4. concurrency control
4. concurrency control
 
24904 lecture11
24904 lecture1124904 lecture11
24904 lecture11
 
Optimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed SystemsOptimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed Systems
 
management of distributed transactions
management of distributed transactionsmanagement of distributed transactions
management of distributed transactions
 
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
 
CS 542 -- Concurrency Control, Distributed Commit

  • 1. CS 542 Database Management Systems: Concurrency Control; Commit in Distributed Systems. J Singh, April 11, 2011
  • 2. Today’s Meeting. Concurrency Control: Intention Locks; Index Locking; Optimistic CC; Validation; Timestamp Ordering; Multi-version CC. Commit in Distributed Databases: Two Phase Commit; Paxos Algorithm. Concluding thoughts. References (aside from textbook): Concurrency Control and Recovery in Database Systems, Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman, Microsoft Research; Concurrency Control: Methods, Performance, and Analysis, Alexander Thomasian, ACM Computing Surveys, March 1998; Paxos Commit, Gray & Lamport, Microsoft Research TechFest, 2004; OLTP Through the Looking Glass, and What We Found There, Harizopoulos et al., Proc. ACM SIGMOD, 2008; The End of an Architectural Era, Stonebraker et al., Proc. VLDB, 2007.
  • 3. Scheduler Architecture for CC. The scheduler has two parts: (1) accepts read/write requests from transactions; (2) assures serialization: keeps track of active and pending transactions and controls commit, abort, and delay. Today’s lecture discusses Part 2 functionality.
  • 4. The Lock Table: A relation that associates each database element with the locking information about that element. Implemented as a hash table. Its size is proportional to the number of locked elements, not to the size of the entire database. [Figure: a hash-table entry mapping DB element A to the lock information for A.]
  • 5. Scheduler Priority Logic: When a transaction releases a lock that other transactions are waiting for, which policy should be used? First-Come-First-Served: grant the lock to the longest-waiting request; no starvation (waiting forever for a lock). Priority to Shared Locks: grant all waiting S locks, then one X lock; grant an X lock only if no others are waiting. Priority to Upgrading: if there is a U lock waiting to upgrade to an X lock, grant that first. Each policy has its advantages and disadvantages; the choice is configurable for a database instance.
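The lock table and the first-come-first-served policy above can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the names `LockInfo`, `LockTable`, `acquire`, and `release` are hypothetical, only S and X modes are modeled, and a Python dict stands in for the hash table.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LockInfo:
    """Lock information for one database element (hypothetical layout)."""
    mode: str = ""                                  # "S" or "X" while held
    holders: set = field(default_factory=set)       # txn ids holding the lock
    waiters: deque = field(default_factory=deque)   # queued (txn, mode) requests


class LockTable:
    def __init__(self):
        self.table = {}  # hash table: element -> LockInfo

    def acquire(self, txn, element, mode):
        """Return True if granted now, False if the request was queued."""
        info = self.table.setdefault(element, LockInfo())
        if not info.holders:
            info.mode = mode
            info.holders.add(txn)
            return True
        # Share an S lock only if nobody is already waiting (keeps FCFS fair).
        if mode == "S" and info.mode == "S" and not info.waiters:
            info.holders.add(txn)
            return True
        info.waiters.append((txn, mode))
        return False

    def release(self, txn, element):
        """Release txn's lock, then grant waiters first-come-first-served."""
        info = self.table[element]
        info.holders.discard(txn)
        if info.holders:
            return
        info.mode = ""
        while info.waiters:
            next_txn, next_mode = info.waiters[0]
            # Stop once a grant would conflict with a lock just handed out.
            if info.holders and (next_mode == "X" or info.mode == "X"):
                break
            info.waiters.popleft()
            info.mode = next_mode
            info.holders.add(next_txn)
        if not info.holders and not info.waiters:
            del self.table[element]  # only locked elements occupy the table
```

Entries are created on first request and deleted when the last lock is released, which is why the table grows with the number of locked elements rather than with the database.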
  • 7. Motivation for intention locks: Suppose that, besides scanning through the table, we need to modify a few tuples. What kind of lock should we put on the table? It has to be X (if we only have S and X). But that blocks all other read requests!
  • 8. Intention Locks: Allow intention locks IS and IX. Before S locking an item, a txn must IS lock the root. Before X locking an item, a txn must IX lock the root. We should make sure that: if Ti S locks a node, no Tj can X lock an ancestor (achieved if S conflicts with IX); if Tj X locks a node, no Ti can S or X lock an ancestor (achieved if X conflicts with IS and IX).
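  • 9. Allowed Lock Sharings (rows: lock holder; columns: lock requester; "yes" means the request can be granted)
        Holder   IS    IX    S     SIX   X
        IS       yes   yes   yes   yes   no
        IX       yes   yes   no    no    no
        S        yes   no    yes   no    no
        SIX      yes   no    no    no    no
        X        no    no    no    no    no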
  • 10. Multiple Granularity Lock Protocol: Each txn starts from the root of the hierarchy. To get a lock on any node, it must hold an intention lock on that node's parent. E.g., to get an S lock on a node, it must hold IS or IX on the parent; to get an X lock on a node, it must hold IX or SIX on the parent. Locks must be released in bottom-up order.
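The conflict rules and the parent-lock rule above can be sketched as two lookup tables. The names `COMPATIBLE`, `REQUIRED_ON_PARENT`, `can_grant`, and `parent_mode_ok` are hypothetical. The slides state the parent rule only for S and X requests; the entries for IS, IX, and SIX follow the usual textbook convention and are an assumption here.

```python
# For each held mode, the set of requested modes that may be granted alongside it
# (the standard multiple-granularity compatibility matrix).
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

# Modes the txn must already hold on the parent before requesting a child lock.
# S and X come from the slides; IS, IX, SIX are assumed from the textbook rule.
REQUIRED_ON_PARENT = {
    "IS":  {"IS", "IX"},
    "S":   {"IS", "IX"},
    "IX":  {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
    "X":   {"IX", "SIX"},
}


def can_grant(requested, held_modes):
    """True if `requested` is compatible with every lock other txns hold."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


def parent_mode_ok(requested, parent_mode):
    """True if holding `parent_mode` on the parent permits `requested` on a child."""
    return parent_mode in REQUIRED_ON_PARENT[requested]
```

For instance, `can_grant("S", ["IX"])` is false, which is the S-conflicts-with-IX rule that keeps a reader of a whole relation from overlapping with a writer of one of its tuples.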
  • 11. Example 1: T1 needs a shared lock on t2; T2 needs a shared lock on R1. [Figure: relation R1 with tuples t1, t2, t3, t4; T1 holds IS on R1 and S on t2; T2 holds S on R1.]
  • 13. Examples 3, 4, 5. T1 scans R and updates a few tuples: T1 gets an SIX lock on R and occasionally upgrades to X on the tuples. T2 uses an index to read only part of R: T2 gets an IS lock on R and repeatedly gets an S lock on tuples of R. T3 reads all of R: T3 gets an S lock on R. Alternatively, T3 could behave like T2 and use lock escalation as it goes.
  • 17. T1 then T2: 80, 75
  • 20. Did Insert/Delete expose a flaw in 2PL? The flaw was in the assumption that by locking all tuples, T1 had locked the set. We needed to lock the set itself. Would we bottleneck on the relation if the workload were insert- and delete-heavy? There is another way to solve the problem: lock at the index (if one exists). Since B+ trees are not 100% full, we can maintain multiple locks in different sections of the tree. [Figure: an index with a lock placed on the entry for r=1.]
  • 21. Index Locking (p1): Higher levels of the tree only direct searches to leaf pages. For inserts, a node on the path from the root to the modified leaf must be locked (in X mode, of course) only if a split can propagate up to it from the modified leaf. (A similar point holds for deletes.) We can exploit these observations to design efficient locking protocols that guarantee serializability even though they violate 2PL.
  • 22. Index Locking (p2). Search: start at the root and go down; repeatedly, S lock the child, then unlock the parent. Insert/Delete: start at the root and go down, obtaining X locks as needed. Once a child is locked, check whether it is safe; if it is, release all locks on its ancestors. Safe node: a node such that changes will not propagate up beyond it. For inserts: the node is not full. For deletes: the node is not half-empty.
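The insert/delete descent above can be sketched as follows. This is a simplified illustration that only tracks which nodes would still be X locked when the leaf is reached; it does not perform real locking or modify the tree. The names `Node`, `is_safe`, `child_for`, and `nodes_to_keep_locked` are hypothetical, and "not half-empty" is read as "holds more than `capacity // 2` keys", which is an assumption about the occupancy rule.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf


def is_safe(node, op, capacity):
    """A node is safe if the change cannot propagate above it."""
    if op == "insert":
        return len(node.keys) < capacity       # not full, so no split
    return len(node.keys) > capacity // 2      # above minimum, so no merge


def child_for(node, key):
    """Pick the child subtree that covers `key`."""
    i = 0
    while i < len(node.keys) and key >= node.keys[i]:
        i += 1
    return node.children[i]


def nodes_to_keep_locked(root, key, op, capacity):
    """Walk root to leaf; return the nodes still X locked at the leaf."""
    held = []
    node = root
    while True:
        held.append(node)                      # X lock this node
        if is_safe(node, op, capacity):
            held = [node]                      # safe: release all ancestors
        if not node.children:
            return held
        node = child_for(node, key)
```

With a full leaf, an insert keeps the lock on the nearest safe ancestor as well as the leaf, because a split may need to push a key up into it.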
  • 23. Example: Where to lock for (1) Delete 38*, (2) Insert 45*, (3) Insert 25*? [Figure: B+ tree rooted at ROOT with nodes A through I; node keys 20, 35, 23, 38, 44; leaf entries 20*, 22*, 23*, 24*, 35*, 36*, 38*, 41*, 44*.]
  • 25. Optimistic CC: Locking is a conservative approach in which conflicts are prevented. Disadvantages: lock management overhead; deadlock detection and resolution (not discussed in the CS 542 lectures, on the expectation that you are familiar with it). If conflicts are rare, we may be able to gain performance by not locking and instead checking for conflicts before txns commit. Two approaches: (1) the Kung-Robinson model divides every transaction into three phases (read, validate, write) and makes the commit/abort decision based on what is being read and written; (2) timestamp ordering algorithms make clever use of timestamps to determine which operations are conflict-free and which must be aborted.
  • 26. Kung-Robinson Model Key idea: let transactions work in isolation; validate reads and writes when ready to commit; make validation atomic (Validated ≡ Committed). Transactions have three phases: READ: txns read from the database, make changes to private copies of objects. VALIDATE: Check if schedule so far is serializable. WRITE: Make local copies of changes public. (Figure: ROOT, with old and new copies of the modified objects.)
  • 27. Validation Test conditions that are sufficient to ensure that no conflict occurred. Each txn is assigned a numeric id; just use a timestamp. Transaction ids are assigned at the end of the READ phase, just before validation begins. ReadSet(Ti): Set of objects read by txn Ti. WriteSet(Ti): Set of objects modified by Ti. Validation is atomic: it is done in a critical section.
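  • 28. Validation Tests For every pair Ti, Tj where Ti is validated before Tj, one of the following tests must hold. Test 1: FIN(Ti) < START(Tj). Test 2: FIN(Ti) < VAL(Tj) AND WriteSet(Ti) ∩ ReadSet(Tj) is empty. Test 3: VAL(Ti) < VAL(Tj) AND WriteSet(Ti) ∩ ReadSet(Tj) is empty AND WriteSet(Ti) ∩ WriteSet(Tj) is empty. (Situation diagrams: relative positions of the R, V, W phases of Ti and Tj for each test.)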
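The three tests can be written down directly. The sketch below is illustrative only: `Txn` and `validates` are hypothetical names, times are plain integers, and a transaction that has not finished its write phase is assumed to have `fin` set to infinity.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    start: int                 # START(T): beginning of the read phase
    val: int                   # VAL(T): time of validation
    fin: float = float("inf")  # FIN(T): end of the write phase (inf if unfinished)
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def validates(ti: Txn, tj: Txn) -> bool:
    """True if Tj passes validation against Ti, where Ti validated first."""
    if ti.fin < tj.start:                          # Test 1: no overlap at all
        return True
    no_rw = ti.write_set.isdisjoint(tj.read_set)
    if ti.fin < tj.val and no_rw:                  # Test 2
        return True
    no_ww = ti.write_set.isdisjoint(tj.write_set)
    return ti.val < tj.val and no_rw and no_ww     # Test 3


t1 = Txn(start=1, val=3, fin=4, write_set={"x"})
t2 = Txn(start=2, val=6, read_set={"x"})
print(validates(t1, t2))  # Tj read an object that Ti wrote while they overlapped
```

A transaction that fails against any earlier-validated transaction would be restarted, as the overheads slide notes.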
  • 29. Overheads in Kung-Robinson CC Must record read/write activity in ReadSet and WriteSet per txn. Must create and destroy these sets as needed. Must check for conflicts during validation, and must make validated writes “global”. Critical section can reduce concurrency. Scheme for making writes global can reduce clustering of objects. Optimistic CC restarts transactions that fail validation. Work done so far is wasted; requires clean-up.
  • 31. Timestamp Ordering CC Main idea: Put a timestamp on the last read and write action on every object. Use this timestamp to detect if a transaction attempts an illegal operation. Abort the offending transaction if it does. Algorithm: Give each object a read-timestamp (RTS) and a write-timestamp (WTS). Give each txn a timestamp (TS) when it begins. If action ai of txn Ti conflicts with action aj of txn Tj, and TS(Ti) < TS(Tj), then ai must occur before aj. Otherwise, restart the violating txn.
  • 32. Rules for Timestamps-Based scheduling Algorithm setup: RT(X): the read time of X, the highest timestamp of a transaction that has read X. WT(X): the write time of X, the highest timestamp of a transaction that has written X. C(X): the commit bit for X, which is true if and only if the most recent transaction to write X has already committed. The scheduler receives a request from T to operate on X. The request is realizable under some conditions and not under others.
  • 33. Physically Unrealizable: Read too late. A transaction U that started after transaction T wrote a value for X before T reads X. In other words, if TS(T) < WT(X), then the read is physically unrealizable, and T must be rolled back. Timeline: T start, U start, U writes X, T reads X.
  • 34. Physically Unrealizable: Write too late. A transaction U that started after T read X before T got a chance to write X. In other words, if TS(T) < RT(X), then the write is physically unrealizable, and T must be rolled back. Timeline: T start, U start, U reads X, T writes X.
  • 35. Dirty Read After T reads the value of X written by U, U could abort. In other words, if TS(T) ≥ WT(X), then the read is physically realizable, but the value of X may not be committed yet. If C(X) is true, then the previous writer of X is committed, all is good. If C(X) is false, we must delay T. Timeline: U start, T start, U writes X, T reads X, U aborts.
  • 36. Write after Write T tries to write X after a later transaction (U) has written it, i.e., TS(T) ≥ RT(X) but TS(T) < WT(X): the write is physically realizable, but there is already a later value in X. OK to ignore the write by T because it will get overwritten anyway, except if U aborts: then the new value of T is lost forever. Solve this problem by introducing the concept of a “tentative write”. Timeline: T start, U start, U writes X, T writes X, T commit, U abort.
  • 37. Rules for Timestamps-based Scheduling When the scheduler receives a request to commit T, it must find all the database elements X written by T and set C(X)=true. If any transactions are waiting for X to be committed, these transactions are allowed to proceed. When the scheduler receives a request to abort T or decides to roll back T, any transaction that was waiting on an element X that T wrote must repeat its attempt to read or write.
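The read, write and commit rules can be collected into one small sketch. It assumes the usual textbook formulation of the four cases (read too late, write too late, dirty read, write after write); `Element`, `on_read`, `on_write`, `on_commit` and the returned strings are hypothetical names, and queuing of delayed transactions is left out.

```python
from dataclasses import dataclass


@dataclass
class Element:
    rt: int = 0             # RT(X): highest timestamp of a reader
    wt: int = 0             # WT(X): highest timestamp of a writer
    committed: bool = True  # C(X): has the latest writer committed?


def on_read(ts: int, x: Element) -> str:
    if ts < x.wt:
        return "rollback"   # read too late: a later txn already wrote X
    if not x.committed:
        return "delay"      # would be a dirty read; wait for the writer
    x.rt = max(x.rt, ts)
    return "grant"


def on_write(ts: int, x: Element) -> str:
    if ts < x.rt:
        return "rollback"   # write too late: a later txn already read X
    if ts < x.wt:
        # Write after write: skip the write only if the later writer committed.
        return "ignore" if x.committed else "delay"
    x.wt = ts
    x.committed = False     # tentative until the writer commits
    return "grant"


def on_commit(written) -> None:
    for x in written:
        x.committed = True  # waiting transactions may now proceed


x = Element()
print(on_write(10, x), on_read(5, x), on_read(12, x))
```

The example shows a write by a transaction with timestamp 10 being granted, an older reader being rolled back, and a younger reader being delayed until the writer commits.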
  • 39. Multiversion Timestamps Multiversion schemes keep old versions of data items to increase concurrency. Each successful write results in the creation of a new version of the data item written. Timestamps are used to label versions. When a read(X) operation is issued, select an appropriate version of X based on the timestamp of the transaction, and return the value of the selected version.
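Version selection can be sketched as a lookup for the newest version written at or before the reader's timestamp. This is a simplified illustration: `VersionedItem` is a hypothetical class, timestamps are assumed to be non-negative integers, and the checks a full multiversion protocol performs before accepting a write are omitted.

```python
import bisect


class VersionedItem:
    def __init__(self, initial):
        self.write_ts = [0]      # timestamps of the versions, kept sorted
        self.values = [initial]  # values[i] was written at write_ts[i]

    def write(self, ts: int, value) -> None:
        """Each successful write creates a new version labelled with ts."""
        i = bisect.bisect_right(self.write_ts, ts)
        self.write_ts.insert(i, ts)
        self.values.insert(i, value)

    def read(self, ts: int):
        """Return the newest version whose write timestamp is <= ts."""
        if ts < 0:
            raise ValueError("timestamps are assumed to be non-negative")
        i = bisect.bisect_right(self.write_ts, ts) - 1
        return self.values[i]


item = VersionedItem("a")
item.write(10, "b")
item.write(20, "c")
print(item.read(15))
```

A reader with timestamp 15 sees the version written at 10 and never waits for the writer at 20, which is the source of the extra concurrency.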
  • 40. Timestamps vs Locking Generally, timestamping performs better than locking in situations where: most transactions are read-only; it is rare that concurrent transactions will try to read and write the same element. This is generally the case for Web applications. In high-conflict situations, locking performs better than timestamps.
  • 41. Practical Use 2-Phase Locks (or variants): used by most relational databases. Multi-level granularity (support for table, page and tuple-level locks): used by most relational databases. Multi-version concurrency control: Oracle 8 forward divides transactions into read-only and read-write; read-only transactions use multi-version concurrency and never wait; read-write transactions use 2PL. Postgres, others as well, offer some level of MVCC.
  • 43. Distributed Commit Motivation FruitCo has its main Sales office in Oregon; Farms and Warehouse are in Washington; Finance is in Utah. All three sites have local data centers with their own systems. When an order is placed, the Sales system must send the billing information to Utah and shipping information to Washington. When an order is placed, all three databases must be updated, or none should be.
  • 44. Two Phase Commit The Basic Idea
  • 45. Two-Phase Commit (2PC) Phase 1: The TM gets the RMs ready to write the results into the database. Phase 2: Everybody writes the results into the database. TM (Transaction Manager): The process at the site where the transaction originates and which controls the execution. RM (Resource Manager): The process at the other sites that participate in executing the transaction. Global Commit Rule: The TM aborts a transaction if and only if at least one RM votes to abort it. The TM commits a transaction if and only if all of the RMs vote to commit it.
  • 46. Centralized 2PC (Figure: Phase 1: the coordinator C asks every participant P “ready?” and each P answers yes/no. Phase 2: C sends commit/abort to every P and each P answers committed/aborted.)
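  • 47. State Transitions in 2PC (transitions shown as input / output). TM: INITIAL → WAIT on Commit command / Prepare; WAIT → ABORT on Vote-abort / Global-abort; WAIT → COMMIT on Vote-commit (all) / Global-commit. RMs: INITIAL → READY on Prepare / Vote-commit; INITIAL → ABORT on Prepare / Vote-abort; READY → ABORT on Global-abort / Ack; READY → COMMIT on Global-commit / Ack.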
  • 48. When TM Fails… Timeout in INITIAL: who cares. Timeout in WAIT: cannot unilaterally commit; can unilaterally abort. Timeout in ABORT or COMMIT: stay blocked and wait for the acks. (Figure: TM state diagram with states INITIAL, WAIT, ABORT, COMMIT.)
  • 49. When an RM Fails… Timeout in INITIAL: TM must have failed in INITIAL state; unilaterally abort. Timeout in READY: stay blocked. (Figure: RM state diagram with states INITIAL, READY, ABORT, COMMIT.)
  • 50. When TM Recovers… Failure in INITIAL: start the commit process upon recovery. Failure in WAIT: restart the commit process upon recovery. Failure in ABORT or COMMIT: nothing special if all the acks have been received; otherwise the termination protocol is involved. (Figure: TM state diagram with states INITIAL, WAIT, ABORT, COMMIT.)
  • 51. When an RM Recovers… Failure in INITIAL: unilaterally abort upon recovery. Failure in READY: the TM has been informed about the local decision; treat as timeout in READY state and invoke the termination protocol. Failure in ABORT or COMMIT: nothing special needs to be done. (Figure: RM state diagram with states INITIAL, READY, ABORT, COMMIT.)
  • 52. 2PC Protocol Actions TM, INITIAL: write begin_commit in log, send PREPARE, go to WAIT. RM, INITIAL: on PREPARE, ready to commit? If no: write abort in log, send VOTE-ABORT, go to ABORT. If yes: write ready in log, send VOTE-COMMIT, go to READY. TM, WAIT: any No vote? If yes: write abort in log, send GLOBAL-ABORT, go to ABORT. If no: write commit in log, send GLOBAL-COMMIT, go to COMMIT. RM, READY: type of msg? Abort: write abort in log, send ACK, go to ABORT. Commit: write commit in log, send ACK, go to COMMIT. TM, on ACKs: write end_of_transaction in log.
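The failure-free path of the protocol, including the log records named on the slide, can be sketched in a single process. This is only an illustration: `two_phase_commit` is a hypothetical function, message passing is replaced by direct calls, and timeouts, crashes and recovery are not modelled.

```python
def two_phase_commit(votes: dict):
    """votes maps an RM name to True (vote-commit) or False (vote-abort).

    Returns the global decision plus the TM log and per-RM logs.
    """
    tm_log = ["begin_commit"]          # TM logs, then sends PREPARE
    rm_logs = {}

    # Phase 1: each RM logs its vote before replying.
    for rm, ready in votes.items():
        rm_logs[rm] = ["ready"] if ready else ["abort"]

    # Global commit rule: commit only if every RM voted to commit.
    decision = "commit" if all(votes.values()) else "abort"
    tm_log.append(decision)            # TM logs the global decision

    # Phase 2: RMs in READY log the decision and acknowledge.
    for rm, ready in votes.items():
        if ready:                      # RMs that voted abort are already done
            rm_logs[rm].append(decision)

    tm_log.append("end_of_transaction")  # written once the acks are in
    return decision, tm_log, rm_logs


print(two_phase_commit({"utah": True, "washington": False})[0])
```

With one vote-abort the global decision is abort; the RM that voted to commit logs `ready` and then `abort`, while the other RM only ever logs `abort`.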
  • 53. Two-phase commit commentary Two-phase commit protocol limitation: it is a blocking protocol. The failure of the TM can cause the protocol to block until the TM is repaired. If the TM fails right after every RM has sent a Prepared message, then the RMs have no way of knowing whether the TM committed or aborted. RMs will block resources while waiting for a message from the TM. A TM will also block resources while waiting for replies from RMs. A TM can also block indefinitely if no acknowledgement is received from an RM. “Federated” two-phase commit protocols, aka three-phase protocols, have been proposed but are still unproven. Alternative: the Paxos Consensus Algorithm. See Consensus on Transaction Commit, Jim Gray and Leslie Lamport, Microsoft Research, 2005, MSR-TR-2003-96.
  • 55. Fault-Tolerant Two Phase Commit If the 2PC Transaction Manager (TM) fails, the transaction blocks. Solution: add a “spare” transaction manager (non-blocking commit, 3-phase commit). (Figure: message flow among client, TM and RMs: RequestCommit, Prepare, Prepared.)
  • 56. Fault-Tolerant Two Phase Commit If the 2PC Transaction Manager (TM) fails, the transaction blocks. Solution: add a “spare” transaction manager (non-blocking commit, 3-phase commit). (Figure: with more than one TM, after RequestCommit, Prepare and Prepared, the RMs receive both commit and abort. Inconsistent! Now what?) The complexity is a mess. But… What if….?
  • 57. Fault Tolerant 2PC Several workarounds proposed in database community: Often called "3-phase" or "non-blocking" commit. None with complete algorithm and correctness proof.
  • 58. Consensus: collects proposed values; picks one proposed value; remembers it forever. (Figure: clients propose X and W to a consensus box; every client learns that W was chosen.)
  • 59. Consensus for Commit – The Obvious Approach Get consensus on TM’s decision. TM just learns consensus value. TM is “stateless”. (Figure: message flow among client, TM, RMs and a consensus box: RequestCommit, Prepare, Prepared, Propose Prepared, Prepared Chosen, Commit.)
  • 60. Consensus for Commit – The Paxos Commit Approach Get consensus on each RM’s choice. TM just combines consensus values. TM is “stateless”. (Figure: message flow among client, TM, RMs and one consensus box per RM: RequestCommit, Prepare, Propose RM1 Prepared and Propose RM2 Prepared, RM1 Prepared Chosen and RM2 Prepared Chosen, Commit.)
  • 61. The Obvious Approach: Prepare, Prepared, Propose Prepared, Prepared Chosen, Commit. Paxos Commit: Prepare, Propose RM1 Prepared and Propose RM2 Prepared, RM1 Prepared Chosen and RM2 Prepared Chosen, Commit. Paxos Commit has one fewer message delay.
  • 62. Consensus in Action The normal (failure-free) case: two message delays; can optimize. (Figure: the RM sends Propose RM Prepared to each acceptor in the consensus box; each acceptor sends Vote RM Prepared to the TMs; result: RM Prepared Chosen.)
  • 63. Consensus in Action TM can always learn what was chosen, or get Aborted chosen if nothing has been chosen yet, provided a majority of acceptors are working. (Figure: RM, consensus box with three acceptors, TMs.)
  • 64. The Complete Algorithm Subtle. More weird cases than most people imagine. Proved correct.
  • 65. PaxosCommit in a Nutshell N RMs; 2F+1 acceptors (~2F+1 TMs). If F+1 acceptors see all RMs prepared, then transaction committed. 2F(N+1) + 3N + 1 messages; 5 message delays; 2 stable write delays. (Figure: message flow among Client, TM, RM1…N and Acceptors 0…2F: request commit, prepare, prepared, all prepared, commit.)
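The commit condition stated on this slide can be checked with a few lines. The sketch covers only that summary rule, not the full Paxos Commit algorithm (ballots, leader election and the abort path are left out); `paxos_commit_outcome` and its arguments are hypothetical names.

```python
def paxos_commit_outcome(acceptor_views, rms: set, f: int) -> str:
    """acceptor_views[i] is the set of RMs that acceptor i has seen Prepared.

    With 2F+1 acceptors, the transaction is committed once at least F+1
    of them have seen every RM prepared.
    """
    if len(acceptor_views) != 2 * f + 1:
        raise ValueError("expected 2F+1 acceptors")
    complete = sum(1 for view in acceptor_views if rms <= view)
    return "committed" if complete >= f + 1 else "undecided"


rms = {"rm1", "rm2"}
print(paxos_commit_outcome([{"rm1", "rm2"}, {"rm1", "rm2"}, set()], rms, f=1))
```

With F = 1 there are three acceptors, so the outcome is decided even though one acceptor has seen nothing, which is how the protocol tolerates F failures.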
  • 69. OLTP Through the Looking Glass (p2): Areas for Research that may yield dividends. Concurrency Control: look for applications where it can be turned off; some sort of optimistic concurrency control. Multi-core Support: latching (inter-thread communication) remains a significant bottleneck; cache-conscious B-Trees. Replication Management: loss of transactional consistency if log shipping; recovery is not instantaneous; maintaining transactional consistency. Weak Consistency: Starbucks doesn’t need two phase commit; how to achieve eventual consistency without transactional consistency.
  • 71. What’s so fun about databases? From our January 13 Lecture… Traditional database courses talked about Employee records Bank records Now we talk about Web search Data mining The collective intelligence of tweets Scientific and medical databases From a personal viewpoint, I have enjoyed learning this material with you Thank you.
  • 72. About CS 542 CS 542 will Build on database concepts you already know Provide you tools for separating hype from reality Help you develop skills in evaluating the tradeoffs involved in using and/or creating a database CS 542 may Train you to read technical journals and apply them CS 542 will not Cover the intricacies of SQL programming Spend much effort in Dynamic SQL Stored Procedures Interfaces with application programming languages Connectors, e.g., JDBC, ODBC From our January 13 Lecture…
  • 73. Thanks Contact Information: President, Early Stage IT – a cloud-based consulting firm Email: J [dot] Singh [at] EarlyStageIT [dot] com Phone: 978-760-2055 Co-chair of Software and Services SIG at TiE-Boston Founder, SQLnix.org, a local resource for NoSQL databases My WPI email will be good through the summer.