3. Concurrency
Concurrency control is the problem of
synchronizing concurrent transactions (i.e.,
ordering the operations of concurrent
transactions) such that the following two
properties are achieved:
– the consistency of the DB is maintained
– the maximum degree of concurrency of
operations is achieved
Obviously, the serial execution of a set of
transactions achieves consistency, if each
single transaction is consistent
4. Why Concurrency Control is
needed
The Lost Update Problem.
This occurs when two transactions that access
the same database items have their operations
interleaved in a way that makes the value of
some database item incorrect.
The Temporary Update (or Dirty Read)
Problem.
This occurs when one transaction updates a
database item and then the transaction fails for
some reason. The updated item is accessed
by another transaction before it is changed
back to its original value.
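The Lost Update Problem can be sketched in a few lines. This is an illustrative toy, with plain Python variables standing in for database items, not any real DBMS API:

```python
# Toy illustration of the Lost Update Problem: two "transactions"
# both read the same balance before either writes, so one update
# is silently overwritten. All names here are illustrative.

def lost_update_demo():
    balance = 100             # shared database item
    t1_local = balance        # T1 reads balance (100)
    t2_local = balance        # T2 reads balance (100), before T1 writes
    balance = t1_local + 50   # T1 writes 150
    balance = t2_local - 30   # T2 writes 70 -- T1's +50 is lost
    return balance
```

A serial execution would yield 100 + 50 - 30 = 120, but this interleaving yields 70.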
5. Why Concurrency Control is
needed
The Incorrect Summary Problem.
If one transaction is calculating an
aggregate summary function on a number
of records while other transactions are
updating some of these records, the
aggregate function may calculate some
values before they are updated and others
after they are updated.
9. Concurrency Control
A database must provide a mechanism that will
ensure that all possible schedules are
either conflict or view serializable, and
are recoverable and preferably cascadeless
A policy in which only one transaction can execute
at a time generates serial schedules, but provides
a poor degree of concurrency
Testing a schedule for serializability after it has
executed is a little too late!
Goal – to develop concurrency control protocols
that will assure serializability.
10. Concurrency Control vs. Serializability
Tests
Concurrency-control protocols allow concurrent
schedules, but ensure that the schedules are
conflict/view serializable, and are recoverable and
cascadeless.
Concurrency control protocols generally do not
examine the precedence graph as it is being
created
Instead, a protocol imposes a discipline that avoids
nonserializable schedules.
Different concurrency control protocols provide
different tradeoffs between the amount of
concurrency they allow and the amount of
overhead that they incur.
Tests for serializability help us understand why a
concurrency control protocol is correct.
11. Classification of Concurrency Control
For locking-based methods, locking granularity,
i.e. the size of the portion of the database to
which a lock is applied, is an important issue.
This portion is called the locking unit (lu).
12. Concurrency Control Techniques
Categories of concurrency techniques
1. Locking techniques
2. Timestamp ordering techniques
3. Multi-version concurrency control techniques
4. Optimistic concurrency control techniques
13. Lock modes
read lock (rl), also called shared lock
write lock (wl), also called exclusive lock
Compatibility of lock modes
exclusive (X) mode. Data item can be both
read as well as written. X-lock is requested
using lock-X instruction.
shared (S) mode. Data item can only be
read. S-lock is requested using lock-S
instruction.
Lock requests are made to concurrency-
control manager. Transaction can proceed
only after request is granted.
Locking-based Concurrency Control Algorithms
14. Lock-compatibility matrix
A transaction may be granted a lock on an item if the
requested lock is compatible with locks already held on the
item by other transactions
Any number of transactions can hold shared locks on an
item,
but if any transaction holds an exclusive lock on the item, no
other transaction may hold any lock on that item.
If a lock cannot be granted, the requesting transaction is
made to wait till all incompatible locks held by other
transactions have been released. The lock is then granted.
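This grant rule can be sketched with the S/X compatibility matrix as a lookup table. A minimal illustrative sketch (the function and table names are assumptions, not a real DBMS API):

```python
# S/X lock-compatibility check: a request is granted only if it is
# compatible with every lock already held by *other* transactions.

COMPATIBLE = {                  # (requested, held) -> compatible?
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested_mode, held_modes):
    """held_modes: modes currently held by other transactions."""
    return all(COMPATIBLE[(requested_mode, h)] for h in held_modes)
```

For example, a second S lock on an item already S-locked is granted, while any request on an X-locked item must wait.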
Locking-based Concurrency Control Algorithms
15. Lock-Based Protocols …
Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
Locking as above is not sufficient to guarantee serializability
— if A and B get updated in-between the read of A and B, the
displayed sum would be wrong.
A locking protocol is a set of rules followed by all
transactions while requesting and releasing locks. Locking
protocols restrict the set of possible schedules.
18. Locking alone doesn't ensure serializability!
Initial: x = 50 y = 20
Result of above scheduling:
x = 102 y = 39
Result of schedule {T1,T2}:
x = 102 y = 38
Result of schedule {T2,T1}:
x = 101 y = 39
The basic LM (lock manager) algorithm may generate
non-serializable schedules!
19. The Two-Phase Locking
Protocol
This is a protocol which ensures conflict-serializable
schedules.
Phase 1: Growing Phase
transaction may obtain locks
transaction may not release locks
Phase 2: Shrinking Phase
transaction may release locks
transaction may not obtain locks
The protocol assures serializability. It can be proved
that the transactions can be serialized in the order of
their lock points (i.e. the point where a transaction
acquired its final lock).
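The two-phase rule can be sketched as a guard on lock and unlock calls. This is an illustrative toy, not a full lock manager; the class name is an assumption:

```python
# Two-phase rule: once a transaction releases any lock (entering the
# shrinking phase), it may never acquire another one.

class TwoPhaseTxn:
    def __init__(self):
        self.locks = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after unlock")
        self.locks.add(item)     # growing phase: acquiring is allowed

    def unlock(self, item):
        self.shrinking = True    # growing phase is over
        self.locks.discard(item)
```

Any lock request after the first unlock is rejected, which is exactly what rules out the earlier non-serializable interleavings.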
21. The Two-Phase Locking
Protocol …
Properties of the 2PL protocol
– Generates conflict-serializable schedules
– But schedules may cause cascading aborts
If a transaction aborts after it releases a lock, it may cause
other transactions that have accessed the unlocked data
item to abort as well
1. Basic 2PL
All lock operations before the first unlock operation
2. Conservative 2PL or static 2PL
Lock all the items it accesses before the transaction begins
execution.
1. Deadlock-free protocol
2. The read-set and write-set of the transaction must be
known in advance
22. The Two-Phase Locking Protocol …
Strict 2PL locking protocol
– Holds the exclusive locks till the end of the
transaction
Rigorous 2PL
– Holds all locks till the end of the transaction
– Cascading aborts are avoided
Conversion of locks
Upgrade the lock – upgrade from shared to
exclusive.
Downgrade the lock – downgrade from exclusive
to shared.
24. Pitfalls of Lock-Based
Protocols
Consider the partial schedule
Neither T3 nor T4 can make progress — executing lock-
S(B) causes T4 to wait for T3 to release its lock on B, while
executing lock-X(A) causes T3 to wait for T4 to release its
lock on A.
Such a situation is called a deadlock.
To handle a deadlock one of T3 or T4 must be rolled back
and its locks released.
25. Multiple Granularity
Allow data items to be of various sizes and define a
hierarchy of data granularities, where the small
granularities are nested within larger ones
Can be represented graphically as a tree (but don't
confuse with tree-locking protocol)
When a transaction locks a node in the tree explicitly, it
implicitly locks all the node's descendents in the same
mode.
Granularity of locking (level in tree where locking is
done)
fine granularity (lower in tree): high concurrency,
high locking overhead
coarse granularity (higher in tree): low locking
overhead, low concurrency
27. Intention Lock Modes
In addition to S and X lock modes, there are three
additional lock modes with multiple granularity:
intention-shared (IS): indicates explicit locking at a
lower level of the tree but only with shared locks.
intention-exclusive (IX): indicates explicit locking at
a lower level with exclusive or shared locks
shared and intention-exclusive (SIX): the subtree
rooted by that node is locked explicitly in shared mode
and explicit locking is being done at a lower level with
exclusive-mode locks.
intention locks allow a higher level node to be locked in
S or X mode without having to check all descendent
nodes.
28. Compatibility Matrix with Intention Lock
Modes
The compatibility matrix for all lock modes is:

        IS    IX    S     SIX   X
  IS    yes   yes   yes   yes   no
  IX    yes   yes   no    no    no
  S     yes   no    yes   no    no
  SIX   yes   no    no    no    no
  X     no    no    no    no    no
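The multiple-granularity compatibility matrix can be encoded as a lookup table. An illustrative sketch (names are assumptions; mode order is IS, IX, S, SIX, X):

```python
# Multiple-granularity lock compatibility: True means the two modes
# may be held on the same node by different transactions.

MODES = ["IS", "IX", "S", "SIX", "X"]
MATRIX = [
    # IS     IX     S      SIX    X
    [True,  True,  True,  True,  False],  # IS
    [True,  True,  False, False, False],  # IX
    [True,  False, True,  False, False],  # S
    [True,  False, False, False, False],  # SIX
    [False, False, False, False, False],  # X
]

def compatible(a, b):
    return MATRIX[MODES.index(a)][MODES.index(b)]
```

Note the matrix is symmetric, and X is incompatible with everything, as stated above.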
29. Multiple Granularity Locking
Scheme
Transaction Ti can lock a node Q, using the following rules:
1. The lock compatibility matrix must be observed.
2. The root of the tree must be locked first, and may be locked in
any mode.
3. A node Q can be locked by Ti in S or IS mode only if the
parent of Q is currently locked by Ti in either IX or IS mode.
4. A node Q can be locked by Ti in X, SIX, or IX mode only if the
parent of Q is currently locked by Ti in either IX or SIX mode.
5. Ti can lock a node only if it has not previously unlocked any
node (that is, Ti is two-phase).
6. Ti can unlock a node Q only if none of the children of Q are
currently locked by Ti.
Observe that locks are acquired in root-to-leaf order, whereas they
are released in leaf-to-root order.
31. Timestamp-based Concurrency Control
Algorithms
Lock method maintains serializability
by mutual exclusion.
Timestamp method maintains
serializability by assigning a unique
timestamp to every transaction and
executing transactions accordingly.
32. What is a timestamp?
An identifier for transaction
Used to permit ordering
Monotonicity
– timestamps generated by the same
TM monotonically increase in value.
33. Basic TO Algorithm
Each transaction is issued a timestamp when it enters
the system.
If an old transaction Ti has time-stamp TS(Ti), a new
transaction Tj is assigned time-stamp TS(Tj) such that
TS(Ti) <TS(Tj).
The protocol manages concurrent execution such that
the time-stamps determine the serializability order.
The protocol maintains two timestamps for each data item Q:
W-timestamp(Q) or WTS(Q) is the largest time-stamp of
any transaction that executed write(Q) successfully.
R-timestamp(Q) or RTS(Q) is the largest time-stamp of
any transaction that executed read(Q) successfully.
34. Suppose a transaction Ti issues a read(Q)
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read
a value of Q that was already overwritten.
Hence, the read operation is rejected, and
Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then
the read operation is executed, and
RTS(Q) = MAXIMUM( RTS(Q), TS(Ti) ).
Basic TO Algorithm …
35. Suppose that transaction Ti issues write(Q).
1. If TS(Ti) < R-timestamp(Q), then the value of Q that
Ti is producing was needed previously, and Ti assumed
it would never be produced.
Hence, the write operation is rejected, and Ti is
rolled back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to
write an obsolete value of Q.
Hence, this write operation is rejected, and Ti is
rolled back.
3. Otherwise, the write operation is executed, and
W-timestamp(Q) is set to TS(Ti).
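The read and write rules can be sketched together. This is an illustrative sketch; `q` here is a hypothetical dict holding RTS(Q) and WTS(Q):

```python
# Basic timestamp-ordering checks for read(Q) / write(Q).
# Each function returns True if the operation proceeds, or False
# if the transaction must be rolled back.

def to_read(ts, q):
    if ts < q["wts"]:                 # Q was already overwritten
        return False                  # reject: roll Ti back
    q["rts"] = max(q["rts"], ts)
    return True

def to_write(ts, q):
    if ts < q["rts"] or ts < q["wts"]:
        return False                  # reject: roll Ti back
    q["wts"] = ts
    return True
```

In a real scheduler a rejected transaction is restarted with a fresh timestamp; the sketch only shows the accept/reject decision.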
Basic TO Algorithm …
36. Timestamp Ordering …
Schedules generated by the basic TO protocol
have the following properties:
Serializable
Since transactions never wait (but are rejected),
the schedules are deadlock-free
The price to pay for deadlock-free schedules is
the potential restart of a transaction several times
37. Example Use of the Protocol
A partial schedule for several data items for transactions
with timestamps 1, 2, 3, 4, 5
T1 T2 T3 T4 T5
read(Y)
read(X)
read(Y)
write(Y)
write(Z)
read(Z)
read(X)
read(X)
write(Z)
abort
write(Y)
write(Z)
38. Correctness of Timestamp-Ordering
Protocol
ADVANTAGES:
The timestamp-ordering protocol guarantees serializability since all the
arcs in the precedence graph are of the form:
    (transaction with smaller timestamp) → (transaction with larger timestamp)
Thus, there will be no cycles in the precedence graph.
The timestamp protocol ensures freedom from deadlock as no transaction
ever waits.
DISADVANTAGES:
But the schedule may not be cascade-free, and may not even be
recoverable.
39. Thomas’ Write Rule
Modified version of the timestamp-ordering protocol in which
obsolete write operations may be ignored under certain
circumstances.
When Ti attempts to write data item Q,
if TS(Ti) < WTS(Q), then
this write operation can be ignored.
Otherwise this protocol is the same as the timestamp-ordering
protocol.
Thomas' Write Rule allows greater potential concurrency.
Allows some view-serializable schedules that are not
conflict-serializable.
40. Thomas' Write Rule
Suppose that transaction Ti issues write(Q).
1. If TS(Ti) < R-timestamp(Q), then
Hence, the write operation is rejected, and Ti is
rolled back.
2. If TS(Ti) < W-timestamp(Q), then
Hence, this write operation is ignored.
3. Otherwise, the write operation is executed, and
W-timestamp(Q) is set to TS(Ti).
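A sketch of this modified write rule, with `q` a hypothetical dict holding R-timestamp(Q) and W-timestamp(Q):

```python
# Thomas' Write Rule: a write with ts < RTS(Q) is still rejected,
# but a write with ts < WTS(Q) is merely ignored (it is obsolete)
# instead of rolling the transaction back.

def thomas_write(ts, q):
    """Return 'reject', 'ignore', or 'apply'."""
    if ts < q["rts"]:
        return "reject"        # a later transaction already read Q
    if ts < q["wts"]:
        return "ignore"        # obsolete write: skip, don't abort
    q["wts"] = ts
    return "apply"
```

The "ignore" branch is exactly where this rule admits some view-serializable schedules that strict timestamp ordering would abort.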
42. Optimistic Concurrency Control
Algorithms
Pessimistic algorithms assume conflicts happen
quite often.
Optimistic algorithms delay the validation phase
until just before the write phase.
43. Concepts
Execution of transaction Ti is done in three
phases.
1. Read and execution phase: Transaction Ti
writes only to temporary local variables
2. Validation phase: Transaction Ti performs a
‘validation test’ to determine if local variables
can be written without violating serializability.
3. Write phase: If Ti is validated, the updates
are applied to the database; otherwise, Ti is
rolled back.
44. Validation-Based Protocol
(Cont.)
Each transaction Ti has 3 timestamps
Start(Ti) : the time when Ti started its execution
Validation(Ti): the time when Ti entered its validation
phase
Finish(Ti) : the time when Ti finished its write phase
Serializability order is determined by timestamp given
at validation time, to increase concurrency.
Thus TS(Ti) is given the value of Validation(Ti).
This protocol is useful and gives a greater degree of
concurrency if the probability of conflicts is low,
because the serializability order is not pre-decided,
and relatively few transactions will have to be rolled
back.
45. Validation Test for
Transaction Tj
If for all Ti with TS (Ti) < TS (Tj) one of the following
conditions holds:
finish(Ti) < start(Tj)
start(Tj) < finish(Ti) < validation(Tj) and the set of data
items written by Ti does not intersect with the set of data
items read by Tj.
then validation succeeds and Tj can be committed. Otherwise,
validation fails and Tj is aborted.
Justification: Either the first condition is satisfied, and there is
no overlapped execution, or the second condition is satisfied
and
the writes of Tj do not affect reads of Ti since they occur
after Ti has finished its reads.
the writes of Ti do not affect reads of Tj since Tj does not
read any item written by Ti.
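The test can be sketched as follows. Transaction records (start, validation, finish times and read/write sets) are assumed to be tracked by the scheduler; the field names are illustrative:

```python
# Validation test for Tj against every older Ti (TS(Ti) < TS(Tj)).
# Returns True if validation succeeds and Tj may commit.

def validate(tj, older):
    for ti in older:
        if ti["finish"] < tj["start"]:
            continue                     # condition 1: no overlap at all
        if (ti["finish"] < tj["validation"]
                and not (ti["write_set"] & tj["read_set"])):
            continue                     # condition 2: overlap, but Ti's
                                         # writes miss Tj's reads
        return False                     # validation fails: abort Tj
    return True
```

If neither condition holds for some older Ti, Tj is aborted and restarted.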
46. Validation of Ti – Rule 1
If all Tk with ts(Tk) < ts(Ti) have completed
before Ti has started,
then the validation succeeds.
This is a serial execution order.
47. Validation of Ti – Rule 2
If there is any transaction Tk, such that
ts(Tk) < ts(Ti)
Tk is in its write phase
Ti is in its read phase, and
WS(Tk) ∩ RS(Ti) = ∅
then the validation succeeds.
- None of the data items updated by Tk are read by Ti;
- Updates of Ti will not be overwritten by the updates of Tk.
The read and write phases overlap, but Ti
does not read data items written by Tk.
48. Validation of Ti – Rule 3
If there is any transaction Tk, such that
ts(Tk) < ts(Ti)
Tk completes its read phase before Ti completes
its read phase
WS(Tk) ∩ RS(Ti) = ∅, and
WS(Tk) ∩ WS(Ti) = ∅
then the validation succeeds.
- Updates of Tk will not affect the read phase or write phase of Ti.
The transactions overlap, but do not
access any common data items.
51. Deadlock Management
Deadlock: A set of transactions is in a deadlock
situation if the transactions are waiting for each
other in a cycle.
A deadlock requires an outside intervention to
take place.
Any locking-based concurrency control
algorithm may result in a deadlock, since there
is mutual exclusive access to data items and
transactions may wait for a lock
Some TO-based algorithms that require the
waiting of transactions may also cause
deadlocks
52. Deadlock : Example
Consider the following two transactions:
T1: write (X) T2: write(Y)
write(Y) write(X)
Schedule with deadlock
T1 T2
lock-X on X
write (X)
lock-X on Y
write (Y)
wait for lock-X on X
wait for lock-X on Y
53. Deadlock Management …
There are four principal methods for dealing with the
deadlock problem:
Ignore
➠ Let the application programmer deal with it, or restart the
system
Prevention
➠ Guaranteeing that deadlocks can never occur in the first
place. Check transaction when it is initiated. Requires no run
time support.
Avoidance
➠ Detecting potential deadlocks in advance and taking action
to insure that deadlock will not occur. Requires run time
support.
Detection and Recovery
➠ Allowing deadlocks to form and then finding and breaking
them
54. Deadlock prevention
Deadlock prevention: Guarantee that deadlocks never
occur
– Check transaction when it is initiated, and start it only
if all required resources are available.
– All resources which may be needed by a transaction
must be predeclared
Advantages
– No transaction rollback or restart is involved
– Requires no run-time support
Disadvantages
– Reduced concurrency due to pre-allocation
– Evaluating whether an allocation is safe leads to
added overhead
– Difficult to determine in advance the required
resources
55. Deadlock Avoidance
Transactions are not required to request
resources a priori.
Transactions are allowed to proceed unless a
requested resource is unavailable.
In case of conflict, transactions may be allowed to
wait for a fixed time interval.
Order either the data items or the sites and
always request locks in that order.
More attractive than prevention in a database
environment.
56. Deadlock Prevention…
If Ti requests a lock on a data item which is already locked by Tj
WAIT-DIE Rule:
IF TS(Ti) < TS(Tj)
then Ti ( Ti older than Tj ) is permitted to wait.
else Ti ( Ti younger than Tj ) is aborted (dies) and restarted with the same
timestamp.
➠ non-preemptive: Ti never preempts Tj.
➠ older transaction waits.
WOUND-WAIT Rule:
IF TS(Ti) < TS(Tj)
then (Ti is older than Tj) Tj is aborted (Ti wound Tj) and restarted with the
same timestamp.
else (Ti younger than Tj) Ti is permitted to wait
➠ preemptive: Ti preempts Tj if it is younger.
➠ Younger transaction waits.
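Both rules reduce to a single timestamp comparison when Ti requests a lock held by Tj. A minimal sketch (the string return values are illustrative labels, not a real API):

```python
# Ti requests a lock held by Tj; smaller timestamp = older transaction.

def wait_die(ts_i, ts_j):
    # Older transaction waits; younger one dies (is aborted).
    return "wait" if ts_i < ts_j else "abort Ti"

def wound_wait(ts_i, ts_j):
    # Older transaction wounds (aborts) the holder; younger one waits.
    return "abort Tj" if ts_i < ts_j else "wait"
```

In both schemes the aborted transaction restarts with its original timestamp, so it eventually becomes the oldest and cannot starve.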
57. Wait-Die vs. Wound-Wait: Example
Timeline (time →): Tx starts at 1, Ty starts at 2,
Ty locks item A at 3 (TyLa), Tx attempts to lock item A at 4 (TxLa).
Tx starts at time 1 so TS(Tx) = 1
Ty starts at time 2 so TS(Ty) = 2
Ty has already locked item A and now Tx attempts to lock
item A.
Under Wait-Die: TS(Tx) < TS(Ty) so Tx will wait for Ty to release
the lock.
In other words, since Tx started first, it gets the privilege of waiting.
Under Wound Wait: TS(Tx) < TS(Ty) so Ty will be aborted.
In other words, since Tx started first, it has the privilege of kicking Ty
out of the way by aborting it. So now Tx can proceed.
58. Deadlock Prevention Schemes
The following schemes use transaction timestamps for the sake
of deadlock prevention alone:
Wait-die Scheme (Non-preemptive)
older transaction may wait for younger one to release data
item
younger transactions never wait for older ones; they are
rolled back instead
a transaction may die several times before acquiring
the needed data item
Wound-wait Scheme (Preemptive)
older transaction wounds (forces rollback) of younger
transaction instead of waiting for it
younger transactions may wait for older ones
may be fewer rollbacks than wait-die scheme
Restarted transactions are not assigned new timestamps;
this avoids starvation, but may cause unnecessary rollbacks
59. Deadlock Detection
A Wait-for Graph (WFG) is a useful tool to
identify deadlocks
– The nodes represent transactions
– An edge from Ti to Tj indicates that Ti is
waiting for Tj
– If the WFG has a cycle, we have a deadlock
situation
60. Deadlock Detection
Deadlocks can be described as a wait-for graph,
which consists of a pair G = (V,E),
V is a set of vertices (all the transactions in the
system)
E is a set of edges; each element is an ordered pair
Ti → Tj.
If Ti → Tj is in E, then there is a directed edge from
Ti to Tj, implying that Ti is waiting for Tj to release a
data item.
When Ti requests a data item currently being held
by Tj, the edge Ti → Tj is inserted into the wait-for
graph. This edge is removed only when Tj is no
longer holding a data item needed by Ti.
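Deadlock detection then reduces to finding a cycle in this graph. A depth-first-search sketch (the dict-of-sets graph representation is an assumption):

```python
# Cycle detection in a wait-for graph given as
# {Ti: set of Tj that Ti is waiting for}.

def has_deadlock(wfg):
    visited, on_stack = set(), set()

    def dfs(t):
        visited.add(t)
        on_stack.add(t)                  # t is on the current DFS path
        for u in wfg.get(t, ()):
            if u in on_stack or (u not in visited and dfs(u)):
                return True              # back edge found: a cycle
        on_stack.discard(t)
        return False

    return any(dfs(t) for t in wfg if t not in visited)
```

For example, T1 waiting for T2 while T2 waits for T1 yields a cycle, hence a deadlock.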
62. Deadlock Detection …
Advantages
– Allows maximal concurrency
– The most popular and best-studied method
Disadvantages
– Considerable amount of work might be
undone
63. Recovery from Deadlock
Choose a victim: A transaction which can be
aborted and that causes least loss of work
performed can be chosen as a victim.
Abort: The victim which was chosen is
aborted.
64. Deadlock Recovery
When a deadlock is detected:
Some transaction will have to be rolled back (made a
victim) to break the deadlock.
Select as victim the transaction that will incur
minimum cost, e.g. one that has:
performed less computations
used less data items
will cause less cascading rollbacks
Rollback -- determine how far to roll back transaction
Total rollback: Abort the transaction and then restart it.
Partial rollback: roll back transaction only as far as
necessary to break deadlock.
Starvation happens if the same transaction is always
chosen as victim. Including the number of rollbacks in the
cost factor helps avoid this.
66. Multiversion Schemes
Multiversion schemes keep old versions of data item
to increase concurrency.
Multiversion Timestamp Ordering
Multiversion Two-Phase Locking
Each successful write results in the creation of a new
version of the data item written.
Use timestamps to label versions.
When a read(Q) operation is issued, select an
appropriate version of Q based on the timestamp of
the transaction, and return the value of the selected
version.
reads never have to wait as an appropriate version is
returned immediately.
67. Multiversion Timestamp
Ordering
Each data item Q has a sequence of versions
<Q1, Q2,...., Qm>. Each version Qk contains three
data fields:
Content -- the value of version Qk.
W-timestamp(Qk) -- timestamp of the transaction that
created (wrote) version Qk
R-timestamp(Qk) -- largest timestamp of a
transaction that successfully read version Qk
when a transaction Ti creates a new version Qk of
Q, Qk's W-timestamp and R-timestamp are
initialized to TS(Ti).
R-timestamp of Qk is updated whenever a
transaction Tj reads Qk, and TS(Tj) > R-
timestamp(Qk).
68. Multiversion Timestamp Ordering
(Cont)
The multiversion timestamp scheme presented next ensures
serializability.
Suppose that transaction Ti issues a read(Q) or write(Q) operation.
Let Qk denote the version of Q whose write timestamp is the largest
write timestamp less than or equal to TS(Ti).
1. If transaction Ti issues a read(Q), then the value returned is the
content of version Qk.
2. If transaction Ti issues a write(Q), and if TS(Ti) < R-
timestamp(Qk), then transaction Ti is rolled
back. Otherwise, if TS(Ti) = W-timestamp(Qk), the contents of Qk
are overwritten, otherwise a new version of Q is created.
Reads always succeed; a write by Ti is rejected if some other
transaction Tj that (in the serialization order defined by the timestamp
values) should read Ti's write, has already read a version created by
a transaction older than Ti.
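The version-selection step for reads can be sketched as follows, with versions represented as hypothetical `(W-timestamp, value)` pairs:

```python
# MVTO read: a read by Ti returns the version of Q with the largest
# W-timestamp less than or equal to TS(Ti) -- so reads never wait.

def mvto_read(ts, versions):
    eligible = [(wts, val) for wts, val in versions if wts <= ts]
    return max(eligible)[1]   # value of the version with largest such wts
```

A transaction with timestamp 5 reading an item with versions written at 0, 3 and 7 sees the version written at 3.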
69. Multiversion Two-Phase
Locking
Differentiates between read-only transactions and
update transactions
Update transactions acquire read and write locks,
and hold all locks up to the end of the transaction.
That is, update transactions follow rigorous two-
phase locking.
Each successful write results in the creation of a new
version of the data item written.
each version of a data item has a single timestamp
whose value is obtained from a counter ts-counter that
is incremented during commit processing.
Read-only transactions are assigned a timestamp
by reading the current value of ts-counter before
they start execution; they follow the multiversion
timestamp-ordering protocol for performing reads.
70. Multiversion Two-Phase Locking
(Cont.)
When an update transaction wants to read a data
item, it obtains a shared lock on it, and reads the
latest version.
When it wants to write an item, it obtains an X lock on it;
it then creates a new version of the item and sets
this version's timestamp to ∞ (to be replaced during
commit processing).
When update transaction Ti completes, commit
processing occurs:
Ti sets timestamp on the versions it has created to ts-
counter + 1
Ti increments ts-counter by 1
Read-only transactions that start after Ti increments
ts-counter will see the values updated by Ti.
Read-only transactions that start before Ti
increments the
ts-counter will see the value before the updates by
Ti.
Only serializable schedules are produced.