Deepa Rani
Transaction Processing
What is transaction?
Introduction to Transaction Processing
• Single-User System: At most one user at a time can use the system.
• Multiuser System: Many users can access the system concurrently.
• Concurrency
• Interleaved processing: concurrent execution of processes is interleaved in a
single CPU
• Parallel processing: processes are concurrently executed in multiple CPUs.
• A transaction is a logical unit of database processing that includes one or more access operations (read: retrieval; write: insert, update, or delete).
• A transaction (a set of operations) may be specified stand-alone in a high-level language such as SQL and submitted interactively, or it may be embedded within a program.
• Transaction boundaries: Begin transaction and End transaction.
• An application program may contain several transactions separated by the Begin and End transaction boundaries.
What is a transaction?
• A transaction is a sequence of operations performed as a single logical unit of work.
• In SQL terms, a transaction is a logical unit of work that contains one or more SQL statements.
• Example of a transaction: transfer Rs. 50 from Account A to Account B. The six operations below work as a single logical unit.
1. read (A)
2. A = A - 50
3. write (A)
4. read (B)
5. B = B + 50
6. write (B)
• A transaction represents a logical unit of database processing that must be completed in its entirety to ensure correctness.
• A collection of operations that forms a single logical unit of work is called a transaction.
• A transaction is a unit of program execution that accesses and possibly updates various data items.
• Two main issues to deal with:
  • Failures of various kinds, such as hardware failures and system crashes.
  • Concurrent execution of multiple transactions.
SIMPLE MODEL OF A DATABASE (for purposes of discussing transactions):
• A database is a collection of named data items.
• Granularity of data: a field, a record, or a whole disk block (the concepts are independent of granularity).
• Basic operations are read and write:
– read_item(X): Reads a database item named X into a program variable. To simplify our notation, we assume that the program variable is also named X.
– write_item(X): Writes the value of program variable X into the database item named X.
READ AND WRITE OPERATIONS:
• The basic unit of data transfer from the disk to the computer main memory is one block. In general, a data item (what is read or written) will be a field of some record in the database, although it may be a larger unit such as a record or even a whole block.
• The read_item(X) command includes the following steps:
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
3. Copy item X from the buffer to the program variable named X.
READ AND WRITE OPERATIONS (cont.):
• The write_item(X) command includes the following steps:
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
3. Copy item X from the program variable named X into its correct location in the buffer.
4. Store the updated block from the buffer back to disk (either immediately or at some later point in time).
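The steps of read_item(X) and write_item(X) can be sketched with a toy in-memory model. This is only an illustration: the names DISK, BUFFER, ITEM_TO_BLOCK and the flush parameter are hypothetical, and a real DBMS buffer manager is far more involved.

```python
# Toy model (all names are assumptions made for this sketch).
DISK = {0: {"X": 80, "Y": 20}}       # block address -> items stored in that block
ITEM_TO_BLOCK = {"X": 0, "Y": 0}     # lookup used for step 1
BUFFER = {}                          # main-memory copies of disk blocks
program_vars = {}                    # program variables, named like the items


def fetch_block(item):
    """Steps 1-2: find the block address; copy the block into the buffer if absent."""
    addr = ITEM_TO_BLOCK[item]
    if addr not in BUFFER:
        BUFFER[addr] = dict(DISK[addr])
    return addr


def read_item(item):
    addr = fetch_block(item)
    program_vars[item] = BUFFER[addr][item]    # step 3: buffer -> program variable


def write_item(item, flush=True):
    addr = fetch_block(item)
    BUFFER[addr][item] = program_vars[item]    # step 3: program variable -> buffer
    if flush:                                  # step 4: immediately, or at a later time
        DISK[addr] = dict(BUFFER[addr])


read_item("X")
program_vars["X"] -= 5
write_item("X", flush=False)                   # the change is only in the buffer
value_on_disk_before_flush = DISK[0]["X"]
write_item("X")                                # now the block is stored back to disk
value_on_disk_after_flush = DISK[0]["X"]
```

Deferring step 4 is what makes recovery necessary: a crash between the buffer update and the disk write would lose the change.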
ACID properties of transaction
• Atomicity (a transaction executes either 0% or 100%)
• Consistency (the database must remain in a consistent state after any transaction)
• Isolation (intermediate transaction results must be hidden from other concurrently executing transactions)
• Durability (once a transaction has completed successfully, the changes it has made to the database must be permanent)
ACID properties of transaction (Atomicity)
Atomicity requirement
• A transaction must be treated as an atomic unit: either all of its operations are executed or none are. It executes either 0% or 100%.
• For example, consider a transaction to transfer Rs. 50 from account A to account B. If Rs. 50 is deducted from account A, then it must be added to account B.
1. read (A)
2. A = A - 50
3. write (A)
4. read (B)
5. B = B + 50
6. write (B)
• If the transaction fails after step 3 and before step 6, money will be "lost", leading to an inconsistent database state.
• The failure could be due to software or hardware.
• The system should ensure that updates of a partially executed transaction are not reflected in the database.
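The all-or-nothing behaviour can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not how the slides' system works: the account table, the transfer helper and the fail_midway flag are assumptions made for illustration. Used as a context manager, a sqlite3 connection commits if the block finishes and rolls back if an exception escapes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 500)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one transaction (hypothetical helper)."""
    try:
        with conn:  # commit on success, rollback if an exception escapes the block
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            if fail_midway:
                raise RuntimeError("simulated failure after step 3")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
    except RuntimeError:
        pass  # the partial update has already been rolled back


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(conn, "A", "B", 50, fail_midway=True)
after_failure = balances(conn)    # 0%: nothing changed
transfer(conn, "A", "B", 50)
after_success = balances(conn)    # 100%: both updates applied
```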
ACID properties of transaction (Consistency)
• The database must remain in a consistent state after any transaction: if it was consistent before the transaction executed, it must be consistent afterwards as well.
• A transaction, when it starts to execute, must see a consistent database.
• During transaction execution the database may be temporarily inconsistent.
• When the transaction completes successfully, the database must be consistent again.
• In our example, the total of A and B must be the same before and after the execution of the transaction:
Before: A = 500, B = 500, A + B = 1000
After: A = 450, B = 550, A + B = 1000
ACID properties of transaction (Isolation)
• Changes made by a transaction are not visible to any other transaction until it has committed.
• Intermediate transaction results must be hidden from other concurrently executing transactions.
• In our example, once the transaction starts at step 1, its results should not be accessed by any other transaction until the last step (step 6) has completed.
• Isolation is enforced by the concurrency control component of the database system.
• Isolation requirement: if, between steps 3 and 6, another transaction T2 is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be).
T1                    T2
1. read(A)
2. A := A - 50
3. write(A)
                      read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
• Isolation can be ensured trivially by running transactions serially, that is, one after the other.
• However, executing multiple transactions concurrently has significant benefits, as we will see later.
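The interleaving above can be replayed by hand. The sketch below assumes the starting balances A = B = 500 used in the other ACID examples; the dictionary db stands in for the database.

```python
db = {"A": 500, "B": 500}

# T1, steps 1-3
a = db["A"]
a = a - 50
db["A"] = a

# T2 runs here, between steps 3 and 6 of T1
seen_by_t2 = db["A"] + db["B"]    # 450 + 500: Rs. 50 appears to be missing

# T1, steps 4-6
b = db["B"]
b = b + 50
db["B"] = b

final_sum = db["A"] + db["B"]     # consistent again once T1 has finished
```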
ACID properties of transaction (Durability)
• After a transaction completes successfully, the changes it has made to the database persist permanently, even if there are system failures.
• Once our transaction has completed its last step (step 6) and committed, its result must be stored permanently. It must not be lost if the system fails.
• In our example, starting from A = 500, B = 500, the values A = 450, B = 550 must be stored permanently in the database.
Transaction State Diagram (State Transition Diagram)
[Figure: state transition diagram with the states Active, Partially Committed, Committed, Failed, and Aborted]
• Active
  • This is the initial state.
  • The transaction stays in this state while it is executing.
• Partially Committed
  • When a transaction executes its final operation (instruction), it is said to be in a partially committed state.
• Failed
  • The transaction enters this state when it is discovered that normal execution can no longer proceed.
  • Once a transaction cannot be completed, any changes that it made must be undone by rolling it back.
• Committed
  • The transaction enters this state after it completes successfully (after the commit).
  • A committed transaction cannot be aborted or rolled back.
• Aborted
  • The state after the transaction has been rolled back and the database has been restored to its state prior to the start of the transaction.
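The states can be written as a small state machine. The transition table below is an assumption: it follows the usual textbook diagram (Active to Partially Committed or Failed, Partially Committed to Committed or Failed, Failed to Aborted), since the arrows of the original figure are not reproduced in the text. The names State, ALLOWED and move are hypothetical.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Assumed transitions, taken from the standard textbook diagram.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),   # terminal: cannot be aborted or rolled back
    State.ABORTED: set(),     # terminal
}


def move(current, target):
    """Return the new state, or raise ValueError if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = State.ACTIVE
state = move(state, State.PARTIALLY_COMMITTED)   # final operation executed
state = move(state, State.COMMITTED)             # commit succeeded
```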
Concurrent Execution
What is concurrency?
• In a multi-user database system, many users can use the database at the same time, and the database must stay in a consistent state while they do.
• Nowadays we have multiprogramming systems, which execute processes concurrently by interleaving them.
• So, concurrency is the ability of a database to allow multiple users to access data at the same time.
• Concurrency is allowed for the following reasons:
  • Improved throughput and resource utilization.
  • Reduced waiting time.
• Concurrent execution may produce incorrect results and lead to data inconsistency. To avoid this inconsistency we need a concurrency control mechanism.
• Four problems due to concurrency:
  • Lost update problem
  • Dirty read problem
  • Incorrect summary (retrieval) problem
  • Unrepeatable read problem
Lost update problem
• This problem arises when two transactions T1 and T2 both read the same data item and then update it: the effect of the first update is overwritten by the second update.
• How to avoid: transaction T2 must not update the data item X until transaction T1 has committed its update of X.
Initial value: X = 100
T1                   Time   T2
---                  t0     ---
Read X               t1     ---
---                  t2     Read X
Update X (X = 75)    t3     ---
---                  t4     Update X (X = 50)
---                  t5     ---
Suppose that transactions T1 and T2 are submitted at approximately the same time, and that their operations are interleaved as shown in Figure (a). The final value of item X is then incorrect, because T2 reads the value of X before T1 changes it in the database, and hence the updated value resulting from T1 is lost.
For example, if X = 80 at the start (originally there were 80 reservations on the flight), N = 5 (T1 transfers 5 seat reservations from the flight corresponding to X to the flight corresponding to Y), and M = 4 (T2 reserves 4 seats on X), the final result should be X = 79. However, in the interleaving of operations shown in Figure (a), it is X = 84, because the update in T1 that removed the five seats from X was lost.
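The arithmetic of the flight example can be checked with a short simulation. The starting value of Y (20) and the function name final_x are assumptions made for this sketch; X = 80, N = 5 and M = 4 come from the example.

```python
def final_x(interleaved):
    """Run T1 (move N seats from X to Y) and T2 (reserve M seats on X)."""
    db = {"X": 80, "Y": 20}    # Y = 20 is an assumed starting value
    n, m = 5, 4
    if interleaved:
        x1 = db["X"]           # T1: read_item(X)
        x1 -= n                # T1: X := X - N
        x2 = db["X"]           # T2: read_item(X), before T1 has written X
        x2 += m                # T2: X := X + M
        db["X"] = x1           # T1: write_item(X)
        y1 = db["Y"]           # T1: read_item(Y)
        db["X"] = x2           # T2: write_item(X) overwrites T1's update
        db["Y"] = y1 + n       # T1: write_item(Y)
    else:
        db["X"] -= n           # T1 runs to completion first
        db["Y"] += n
        db["X"] += m           # then T2
    return db["X"]


lost_update_result = final_x(interleaved=True)
serial_result = final_x(interleaved=False)
```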
Dirty read problem
• A dirty read arises when one transaction updates an item and then fails for some reason, and the updated item is retrieved by another transaction before it is changed back to its original value.
• How to avoid: transaction T1 must not read the data item X until transaction T2 has committed its update of X.
Initial value: X = 100
T1        Time   T2
---       t0     ---
---       t1     Update X (X = 50)
Read X    t2     ---
---       t3     Rollback
---       t4     ---
This problem occurs when one transaction updates a database item and then the transaction fails for some reason. Meanwhile, the updated item is accessed (read) by another transaction before it is changed back (or rolled back) to its original value.
Figure (b) shows an example where T1 updates item X and then fails before completion, so the system must roll back X to its original value. Before it can do so, however, transaction T2 reads the temporary value of X, which will not be recorded permanently in the database because of the failure of T1. The value of item X that is read by T2 is called dirty data, because it has been created by a transaction that has not yet completed and committed; hence, this problem is also known as the dirty read problem.
Incorrect retrieval problem
• The incorrect retrieval problem arises when one transaction retrieves data to use in some operation, but before it can use this data another transaction updates that data and commits.
• This change is hidden from the first transaction, which continues to use the previously retrieved data. This problem is also known as the inconsistent analysis problem.
• How to avoid: transaction T2 must not read or update the data item X until transaction T1 has committed.
Initial balances: A = 200, B = 250, C = 150
T1                                 Time   T2
Read (A): Sum ← 200                t1     ---
Read (B): Sum ← Sum + 250 = 450    t2     ---
---                                t3     Read (C)
---                                t4     Update (C): 150 - 50 = 100
---                                t5     Read (A)
---                                t6     Update (A): 200 + 50 = 250
---                                t7     COMMIT
Read (C): Sum ← Sum + 100 = 550    t8     ---
• T1 reports a sum of 550, although the three balances total 600 both before and after T2 runs.
If one transaction is calculating an aggregate summary function on a number of database items while other transactions are updating some of these items, the aggregate function may calculate some values before they are updated and others after they are updated.
For example, suppose that a transaction T3 is calculating the total number of reservations on all the flights; meanwhile, transaction T1 is executing. If the interleaving of operations shown in Figure (c) occurs, the result of T3 will be off by an amount N, because T3 reads the value of X after N seats have been subtracted from it but reads the value of Y before those N seats have been added to it.
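The same effect can be reproduced with the balance example (A = 200, B = 250, C = 150), where a summing transaction is interleaved with a transfer of 50 from C to A. The variable names are chosen for this sketch only.

```python
db = {"A": 200, "B": 250, "C": 150}

# T1 starts summing
reported_sum = 0
reported_sum += db["A"]          # reads A before the transfer: 200
reported_sum += db["B"]          # 200 + 250 = 450

# T2 transfers 50 from C to A and commits
db["C"] -= 50
db["A"] += 50

# T1 continues
reported_sum += db["C"]          # reads C after the transfer: 450 + 100 = 550

true_total = sum(db.values())    # the actual total is unchanged by the transfer
```

T1's result is off by exactly the transferred amount, the same pattern as the N seats in the flight example.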
Unrepeatable Read
• This problem occurs when a transaction T reads the same item twice and the item is changed by another transaction T′ between the two reads.
• Hence, T receives different values for its two reads of the same item.
• For example, during an airline reservation transaction, a customer inquires about seat availability on several flights. When the customer decides on a particular flight, the transaction reads the number of seats on that flight a second time before completing the reservation, and it may end up reading a different value for the item.
Schedule
What is schedule?
• A schedule groups a set of transactions together and executes their operations in a predefined order.
• A schedule is the chronological (sequential) order in which instructions are executed in a system.
• A schedule is required in a database because transactions that execute in parallel may affect one another's results.
• That is, if one transaction updates values that another transaction is accessing, then the relative order of their operations changes the result of the other transaction.
• Hence a schedule is created to execute the transactions.
• A schedule must preserve the order in which the instructions appear in each individual transaction.
• A transaction that successfully completes its execution has a commit instruction as its last statement.
  • By default, a transaction is assumed to execute a commit instruction as its last step.
• A transaction that fails to complete its execution successfully has an abort instruction as its last statement.
• Formal definition:
  • A schedule (or history) S of n transactions T1, T2, ..., Tn is an ordering of the operations of the transactions, subject to the constraint that, for each transaction Ti that participates in S, the operations of Ti in S must appear in the same order in which they occur in Ti. Note, however, that operations from other transactions Tj can be interleaved with the operations of Ti in S.
Types of schedule
Schedule 1
• Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B.
• A serial schedule in which T1 is followed by T2 (initial values A = B = 1000):
T1              T2                Execution
Read (A)                          Read (1000)
A = A - 50                        A = 1000 - 50
Write (A)                         Write (950)
Read (B)                          Read (1000)
B = B + 50                        B = 1000 + 50
Write (B)                         Write (1050)
Commit                            Commit
                Read (A)          Read (950)
                temp = A * 0.1    temp = 950 * 0.1
                A = A - temp      A = 950 - 95
                Write (A)         Write (855)
                Read (B)          Read (1050)
                B = B + temp      B = 1050 + 95
                Write (B)         Write (1145)
                Commit            Commit
Schedule 2
• A serial schedule in which T2 is followed by T1 (initial values A = B = 1000):
T1              T2                Execution
                Read (A)          Read (1000)
                temp = A * 0.1    temp = 1000 * 0.1
                A = A - temp      A = 1000 - 100
                Write (A)         Write (900)
                Read (B)          Read (1000)
                B = B + temp      B = 1000 + 100
                Write (B)         Write (1100)
                Commit            Commit
Read (A)                          Read (900)
A = A - 50                        A = 900 - 50
Write (A)                         Write (850)
Read (B)                          Read (1100)
B = B + 50                        B = 1100 + 50
Write (B)                         Write (1150)
Commit                            Commit
Serial Schedule
• A serial schedule is a schedule in which no transaction starts until the running transaction has ended; each transaction is executed completely before the next one starts.
• Transactions are executed one after the other, in a serial manner, which is where the name comes from.
• Executing transactions serially always leaves the database in a consistent state (provided each transaction on its own preserves consistency).
• If there are n transactions in a schedule, the number of possible serial schedules is n!.
• Schedule 1 and Schedule 2 above are examples of serial schedules.
• Disadvantages:
  • They limit concurrency.
  • CPU time may be wasted when the active transaction has to wait.
  • Transactions queued behind a long transaction may have to wait a long time.
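Both serial orders of T1 and T2 can be executed in a few lines to reproduce the numbers of Schedule 1 and Schedule 2. This is a sketch: the functions t1, t2 and run are illustrative names, and integer division by 10 is used in place of multiplying by 0.1 to keep the arithmetic exact.

```python
from itertools import permutations


def t1(db):
    """Transfer 50 from A to B."""
    db["A"] -= 50
    db["B"] += 50


def t2(db):
    """Transfer 10% of the balance of A to B."""
    temp = db["A"] // 10    # 10% of A, as an exact integer
    db["A"] -= temp
    db["B"] += temp


def run(order):
    """Execute the transactions one after the other on A = B = 1000."""
    db = {"A": 1000, "B": 1000}
    for txn in order:
        txn(db)
    return db


schedule_1 = run([t1, t2])                    # T1 followed by T2
schedule_2 = run([t2, t1])                    # T2 followed by T1
serial_orders = list(permutations([t1, t2]))  # n! serial schedules for n = 2
```

The two serial schedules end in different states, yet both are consistent: A + B is 2000 in each.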
Non-serial Schedule (Interleaved Schedule)
• A schedule that interleaves the execution of different transactions.
• This means that a second transaction is started before the first one has ended, and execution can switch back and forth between the transactions.
• There are many possible orders in which the system can execute the individual operations of the transactions.
• Non-serial schedules are further divided into serializable and non-serializable schedules.
Serializable Schedule
• A schedule is serializable if it is equivalent to a serial schedule of the same n
transactions.
• In serial schedules, only one transaction is allowed to execute at a time i.e. no
concurrency is allowed.
• Whereas in serializable schedules, multiple transactions can execute
simultaneously i.e. concurrency is allowed.
• Types (forms) of serializable schedule
• Conflict serializability
• View serializability
Conflicting Instructions
• Instructions li and lj of transactions Ti and Tj respectively conflict if and only if there exists some item Q accessed by both li and lj, and at least one of these instructions writes Q.
1. li = read(Q), lj = read(Q). li and lj don't conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.
• Intuitively, a conflict between li and lj forces a (logical) temporal order between them.
• If li and lj are consecutive in a schedule and they do not conflict, their results would remain the same even if they had been interchanged in the schedule.
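The four cases reduce to a one-line predicate. In this sketch an operation is a small record with hypothetical field names txn, kind ("r" or "w") and item.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str     # transaction name, e.g. "Ti"
    kind: str    # "r" for read, "w" for write
    item: str    # data item, e.g. "Q"


def conflicts(a, b):
    """Two operations conflict if they belong to different transactions,
    access the same item, and at least one of them is a write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)


case_1 = conflicts(Op("Ti", "r", "Q"), Op("Tj", "r", "Q"))   # read/read
case_2 = conflicts(Op("Ti", "r", "Q"), Op("Tj", "w", "Q"))   # read/write
case_3 = conflicts(Op("Ti", "w", "Q"), Op("Tj", "r", "Q"))   # write/read
case_4 = conflicts(Op("Ti", "w", "Q"), Op("Tj", "w", "Q"))   # write/write
```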
Conflict serializability
• If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent.
• A schedule S is conflict serializable if it is conflict equivalent to a serial schedule. In other words, if a given schedule can be converted into a serial schedule by swapping its non-conflicting operations, it is a conflict serializable schedule.
Conflict serializability (Example)
• Schedule 3 can be transformed into Schedule 6, a serial schedule where T2 follows T1, by a series of swaps of non-conflicting instructions.
• Therefore Schedule 3 is conflict serializable.
Schedule 3
T1              T2
Read (A)
A = A - 50
Write (A)
                Read (A)
                temp = A * 0.1
                A = A - temp
                Write (A)
Read (B)
B = B + 50
Write (B)
Commit          Read (B)
                B = B + temp
                Write (B)
                Commit
Schedule 6
T1              T2
Read (A)
A = A - 50
Write (A)
Read (B)
B = B + 50
Write (B)
Commit
                Read (A)
                temp = A * 0.1
                A = A - temp
                Write (A)
                Read (B)
                B = B + temp
                Write (B)
                Commit
• Example of a schedule that is not conflict serializable:
• We are unable to swap instructions in the above schedule to obtain either the
serial schedule <T1, T2>, or the serial schedule <T2, T1>.
T1 T2
Read (A)
Read (A)
Write (A)
Testing for Serializability
• Consider some schedule of a set of transactions T1, T2, ..., Tn.
• Precedence graph: a directed graph whose vertices are the transactions (names).
• We draw an arc from Ti to Tj if the two transactions conflict and Ti accessed the data item on which the conflict arose earlier.
• We may label the arc with the item that was accessed.
• Example 1: [Figure: precedence graph with arcs labelled x and y]
Algorithm to test conflict serializability of a schedule S
(1) For each transaction Ti participating in S, create a node labelled Ti in the precedence graph.
(2) For each case in S where Tj executes read_item(X) after Ti executes write_item(X), create an edge (Ti → Tj) in the precedence graph.
(3) For each case in S where Tj executes write_item(X) after Ti executes read_item(X), create an edge (Ti → Tj) in the precedence graph.
(4) For each case in S where Tj executes write_item(X) after Ti executes write_item(X), create an edge (Ti → Tj) in the precedence graph.
(5) The schedule S is conflict serializable if and only if the precedence graph has no cycle.
• If there is no cycle in the precedence graph, we can construct a serial schedule S' that is conflict equivalent to schedule S.
• We may run a topological sort on the graph to obtain the serial schedule: a linear order consistent with the partial order of the graph.
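The algorithm can be sketched with the standard-library graphlib module (Python 3.9 or later). The function names parse, precedence_graph and equivalent_serial_order are hypothetical; schedules are written in the r1(X) / w2(Y) shorthand, and two example schedules in that shorthand are used as input.

```python
import re
from graphlib import CycleError, TopologicalSorter


def parse(schedule):
    """Turn 'r2(Z), w2(Y); ...' into a list of (kind, transaction, item) tuples."""
    return [(kind, "T" + num, item)
            for kind, num, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]


def precedence_graph(ops):
    """Steps 2-4: add an edge Ti -> Tj for every conflicting pair where Ti comes first."""
    edges = set()
    for i, (kind_i, ti, item_i) in enumerate(ops):
        for kind_j, tj, item_j in ops[i + 1:]:
            if ti != tj and item_i == item_j and "w" in (kind_i, kind_j):
                edges.add((ti, tj))
    return edges


def equivalent_serial_order(schedule):
    """Step 5: return a serial order if the graph is acyclic, otherwise None."""
    ops = parse(schedule)
    predecessors = {txn: set() for _, txn, _ in ops}    # step 1: one node per transaction
    for ti, tj in precedence_graph(ops):
        predecessors[tj].add(ti)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


S1 = ("r2(Z), r2(Y), w2(Y); r3(Y), r3(Z); r1(X), w1(X); w3(Y), w3(Z); "
      "r2(X); r1(Y), w1(Y); w2(X);")
S2 = ("r3(Y), r3(Z); r1(X), w1(X); w3(Y), w3(Z); r2(Z); r1(Y), w1(Y); "
      "r2(Y), w2(Y), r2(X), w2(X);")

order_s1 = equivalent_serial_order(S1)    # None: the graph contains a cycle
order_s2 = equivalent_serial_order(S2)    # a serial order consistent with every edge
```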
Testing for Serializability
Example:
S1: r2(Z), r2(Y), w2(Y); r3(Y), r3(Z); r1(X), w1(X); w3(Y), w3(Z); r2(X); r1(Y), w1(Y); w2(X);
Precedence graph of S1: T1 → T2 (X), T2 → T1 (Y), T2 → T3 (Y, Z), T3 → T1 (Y)
We can't have an equivalent serial schedule because there is a cycle between T1 and T2 and another cycle between T1, T2, and T3.
S2: r3(Y), r3(Z); r1(X), w1(X); w3(Y), w3(Z); r2(Z); r1(Y), w1(Y); r2(Y), w2(Y), r2(X), w2(X);
Precedence graph of S2: T1 → T2 (X, Y), T3 → T2 (Y, Z), T3 → T1 (Y)
The graph has no cycle. Equivalent serial schedule: T3 → T1 → T2
View Serializability
• Let S1 and S2 be two schedules with the same set of transactions. S1 and S2 are view equivalent if the following three conditions are satisfied for each data item Q:
  • Initial Read
  • Updated Read
  • Final Write
• If a schedule is view equivalent to a serial schedule, then it is said to be view serializable.
Initial Read
• If in schedule S1 transaction Ti reads the initial value of Q, then in schedule S2 transaction Ti must also read the initial value of Q.
• Schedules S1 and S2 below are view equivalent, because the initial read in S1 is done by T1 and in S2 it is also done by T1.
• Schedules S1 and S3 below are not view equivalent, because the initial read in S1 is done by T1 and in S3 it is done by T2.
S1
T1          T2
Read (A)
            Write (A)
S2
T1          T2
Read (A)
            Write (A)
S3
T1          T2
            Read (A)
Write (A)
Updated Read
• If in schedule S1 transaction Ti executes read(Q), and that value was produced by transaction Tj (if any), then in schedule S2 transaction Ti must also read the value of Q that was produced by transaction Tj.
• Schedules S1 and S2 below are view equivalent, because in S1 T3 reads the value of A written by T2, and in S2 T3 also reads the value of A written by T2.
• Schedules S1 and S3 below are not view equivalent, because in S1 T3 reads the value of A written by T2, whereas in S3 T3 reads the value of A written by T1.
S1
T1          T2          T3
Write (A)
            Write (A)
                        Read (A)
S2
T1          T2          T3
Write (A)
            Write (A)
                        Read (A)
S3
T1          T2          T3
            Write (A)
Write (A)
                        Read (A)
Final Write
• If Ti performs the final write on a data item in S1, then Ti must also perform the final write on that data item in S2.
• Schedules S1 and S2 below are view equivalent, because the final write in S1 is done by T3 and in S2 it is also done by T3.
• Schedules S1 and S3 below are not view equivalent, because the final write in S1 is done by T3 and in S3 it is done by T1.
S1
T1 T2 T3
Write (A)
Read (A)
Write (A)
S3
T1 T2 T3
Write (A)
Write (A)
Read (A)
S2
T1 T2 T3
Write (A)
Read (A)
Write (A)
View serializable example
Non-Serial Schedule (S1)
T1          T2
Read (A)
Write (A)
            Read (A)
            Write (A)
Read (B)
Write (B)
            Read (B)
            Write (B)
Serial Schedule (S2)
T1          T2
Read (A)
Write (A)
Read (B)
Write (B)
            Read (A)
            Write (A)
            Read (B)
            Write (B)
Initial Read
• In schedule S1, transaction T1 is the first to read data item A. In S2, T1 is also the first to read A.
• In schedule S1, transaction T1 is the first to read data item B. In S2, the first read of B is also performed by T1.
• The initial read condition is satisfied for both schedules.
Updated Read
• In schedule S1, transaction T2 reads the value of A written by T1. In S2, T2 also reads A after it has been written by T1.
• In schedule S1, transaction T2 reads the value of B written by T1. In S2, T2 also reads B after it has been written by T1.
• The updated read condition is also satisfied for both schedules.
Final Write
• In schedule S1, the final write on A is done by transaction T2. In S2, T2 also performs the final write on A.
• In schedule S1, the final write on B is done by transaction T2. In S2, the final write on B is also done by T2.
• The final write condition is also satisfied for both schedules.
Conclusion
• All three conditions for view equivalence are satisfied in this example, so S1 and S2 are view equivalent.
• Since S2 is a serial schedule of the same transactions, S1 is a view serializable schedule.
View serializable
Relationship between view and conflict equivalence:
• The two are same under constrained write assumption which assumes that if T writes X, it is constrained by the
value of X it read; i.e., new X = f(old X).
• Conflict serializability is stricter than view serializability.
• With unconstrained write (or blind write), a schedule that is view serializable is not necessarily conflict
serializable.
• Any conflict serializable schedule is also view serializable, but not vice versa.
[Figure: conflict serializable schedules shown as a subset of view serializable schedules]
Blind write
Consider the following schedule of three transactions T1: r1(A), w1(A); T2: w2(A); and T3: w3(A):
Schedule S1: r1(A); w2(A); w1(A); w3(A); c1; c2; c3;
• With 3 transactions, the total number of possible serial schedules is 3! = 6.
• In S1, the operations w2(A) and w3(A) are blind writes, since T2 and T3 do not read the value of A. [A blind write does not depend on the old value of A. It just updates the value.]
• S1 is view serializable, since it is view equivalent to the serial schedule T1, T2, T3.
• However, S1 is not conflict serializable, since it is not conflict equivalent to any serial schedule.
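The claims about the blind-write schedule r1(A); w2(A); w1(A); w3(A) can be checked by brute force over all serial orders. This is a sketch under stated assumptions: operations are (kind, transaction, item) tuples, commits are omitted, the helper names are hypothetical, and trying every permutation is only practical for a handful of transactions.

```python
from itertools import permutations


def view_profile(ops):
    """Summarize what view equivalence compares: who reads from whom, and the final writers."""
    last_writer = {}     # item -> transaction that wrote it most recently
    position = {}        # transaction -> number of its operations seen so far
    reads_from = set()
    for kind, txn, item in ops:
        position[txn] = position.get(txn, 0) + 1
        if kind == "r":
            # A source of None means the read sees the initial value.
            reads_from.add((txn, position[txn], item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_equivalent(s1, s2):
    return view_profile(s1) == view_profile(s2)


def serial(ops, order):
    """The serial schedule that runs the transactions of `ops` in the given order."""
    return [op for txn in order for op in ops if op[1] == txn]


def view_serial_orders(ops):
    txns = sorted({txn for _, txn, _ in ops})
    return [order for order in permutations(txns)
            if view_equivalent(ops, serial(ops, order))]


S1 = [("r", "T1", "A"), ("w", "T2", "A"), ("w", "T1", "A"), ("w", "T3", "A")]

orders = view_serial_orders(S1)
conflict_edges = {(a[1], b[1])
                  for i, a in enumerate(S1) for b in S1[i + 1:]
                  if a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0])}
has_t1_t2_cycle = ("T1", "T2") in conflict_edges and ("T2", "T1") in conflict_edges
```

Only one of the six serial orders is view equivalent to the schedule, and the precedence edges between T1 and T2 run in both directions, which rules out conflict serializability.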
Equivalent Schedule
• Two schedules S1 and S2 are said to be equivalent if they produce the same result after execution, that is, the same final database state.
• Types of equivalent schedule:
1) Result equivalent schedules: two schedules are result equivalent if they produce the same final database state for some initial value of the data.
2) Conflict equivalent schedules: two schedules are conflict equivalent if every pair of conflicting operations is executed in the same order in both schedules.
3) View equivalent schedules: two schedules S and S' are view equivalent if the following three conditions are met for each data item (say A).
  1) Initial read: if transaction Ti reads the initial value of A in schedule S, then Ti also reads the initial value of A in schedule S'.
  2) Final write: if transaction Ti performs the final write of A in schedule S, then Ti also performs the final write of A in schedule S'.
  3) Update read (producer-consumer): if transaction Ti executes read(A) in schedule S and that value was produced by transaction Tj, then Ti must also read the value of A produced by Tj in schedule S'.
Equivalent Schedule
Schedule-1 (A=B=1000)
T1 T2
Read (A)
A = A - 50
Write (A)
Read (A)
temp = A * 0.1
A = A - temp
Write (A)
Read (B)
B = B + 50
Write (B)
Commit
Read (B)
B = B + temp
Write (B)
Commit
Schedule-2 (A=B=1000)
T1 T2
Read (A)
A = A - 50
Write (A)
Read (B)
B = B + temp
Write (B)
Commit
Read (A)
temp = A * 0.1
A = A - temp
Write (A)
Read (B)
B = B + 50
Write (B)
Commit
Both schedules are equivalent: in both schedules the sum "A + B" is preserved.
• Let T1 and T2 be the transactions defined previously.
• The following schedule is not a serial schedule, but it is equivalent to Schedule 1.
In Schedules 1 and 3, the sum A + B is preserved.
[Figure: Schedule 1 and Schedule 3 shown side by side]
Classification of Schedules Based on Recoverability
Non-Serializable Schedule
• A non-serial schedule that is not serializable is called a non-serializable schedule.
• Non-serializable schedules may or may not be consistent, and may or may not be recoverable.
Irrecoverable Schedule
• If, in a schedule, a transaction performs a dirty read from an uncommitted transaction and commits before the transaction from which it read the value, then the schedule is known as an irrecoverable schedule.
• Example (T2 reads a value written by the uncommitted transaction T1):
  • T2 performs a dirty read operation.
  • T2 commits before T1.
  • T1 fails later and rolls back.
  • The value that T2 read is now incorrect.
  • T2 cannot recover, since it has already committed.
Recoverability
Schedules classified on recoverability:
• Recoverable schedule:
  • One where no committed transaction needs to be rolled back.
  • A schedule S is recoverable if no transaction T in S commits until all transactions T' that have written an item that T reads have committed.
• Cascadeless schedule:
  • One where every transaction reads only items that were written by committed transactions.
• Schedules requiring cascaded rollback:
  • A schedule in which uncommitted transactions that read an item from a failed transaction must be rolled back.
• Strict schedule:
  • A schedule in which a transaction can neither read nor write an item X until the last transaction that wrote X has committed.
Recoverable Schedule
• A schedule is recoverable if:
• each transaction commits only after all transactions from which it has read
has committed.
• In other words, if some transaction Tj is reading value updated or written by some
other transaction Ti, then the commit of Tj must occur after the commit of Ti.
The transaction Tk must commit only after the transactions T1, T2 and T3 have
committed
Recoverable Vs Non recoverable Schedule
In a recoverable schedule, a transaction Tk can commit only if:
every transaction whose uncommitted (dirty) data Tk has read (= used) has already committed
Cascading Schedule
• If, in a schedule, the failure of one transaction causes several other dependent transactions to roll back
or abort, then the schedule is called a cascading schedule; the effect is known as cascading rollback or
cascading abort.
• It wastes CPU time, because the work already done by the dependent transactions is discarded.
Cascadeless Schedule
• If, in a schedule, a transaction is not allowed to read a data item until the last transaction that has
written it has committed or aborted, then the schedule is called a cascadeless schedule.
• In other words,
• Cascadeless schedule allows only committed read operations.
• Therefore, it avoids cascading roll back and thus saves CPU time.
NOTE-
•Cascadeless schedule allows only committed read operations.
•However, it allows uncommitted write operations.
Strict Schedule
• If, in a schedule, a transaction is allowed neither to read nor to write a data item until the last transaction
that has written it has committed or aborted, then the schedule is called a strict schedule.
• Every strict schedule is both recoverable and cascadeless.
• In other words,
• Strict schedule allows only committed read and write operations.
• Clearly, strict schedule implements more restrictions than cascadeless schedule.
Recoverability
• Strict schedules are more strict than cascadeless schedules.
• All strict schedules are cascadeless schedules.
• Not all cascadeless schedules are strict schedules.
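The three definitions above can be checked mechanically. The following is a minimal sketch, assuming a schedule is given as a list of (transaction, operation, item) tuples with the operations 'r', 'w', 'c' (commit) and 'a' (abort); the function name classify and this encoding are illustrative choices, not part of the slides.

```python
def classify(schedule):
    """Return (recoverable, cascadeless, strict) for a schedule.

    schedule: list of (txn, op, item) where op is 'r', 'w', 'c' or 'a'.
    This encoding is an assumption made for the sketch.
    """
    last_writer = {}   # item -> transaction that wrote it most recently
    finished = set()   # transactions that have committed or aborted
    committed = set()
    dirty_reads = {}   # txn -> writers whose uncommitted data it has read
    recoverable = cascadeless = strict = True

    for txn, op, item in schedule:
        if op in ("r", "w"):
            writer = last_writer.get(item)
            dirty = (writer is not None and writer != txn
                     and writer not in finished)
            if dirty:
                strict = False            # touched an uncommitted write
                if op == "r":
                    cascadeless = False   # dirty read
                    dirty_reads.setdefault(txn, set()).add(writer)
            if op == "w":
                last_writer[item] = txn
        elif op == "c":
            # Recoverable: every writer we read from committed before us.
            if any(w not in committed for w in dirty_reads.get(txn, ())):
                recoverable = False
            committed.add(txn)
            finished.add(txn)
        elif op == "a":
            finished.add(txn)
    return recoverable, cascadeless, strict


# T2 reads T1's uncommitted write and commits first: irrecoverable.
irrecoverable = [("T1", "w", "A"), ("T2", "r", "A"),
                 ("T2", "c", None), ("T1", "a", None)]
# Same dirty read, but T1 commits first: recoverable, not cascadeless.
cascading = [("T1", "w", "A"), ("T2", "r", "A"),
             ("T1", "c", None), ("T2", "c", None)]
# No dirty read, but an uncommitted value is overwritten: not strict.
blind_write = [("T1", "w", "A"), ("T2", "w", "A"),
               ("T1", "c", None), ("T2", "c", None)]
print(classify(irrecoverable), classify(cascading), classify(blind_write))
```

The sketch treats a writer that has already aborted as finished, which simplifies how an undone write is handled; a fuller checker would track the value restored by the rollback.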
Concurrency Control
1 Purpose of Concurrency Control
• To enforce Isolation (through mutual exclusion) among conflicting transactions.
• To preserve database consistency through consistency preserving execution of
transactions.
• To resolve read-write and write-write conflicts.
Example: In a concurrent execution environment, if T1 conflicts with T2 over a data item
A, then the concurrency control mechanism decides whether T1 or T2 gets access to A, and whether the
other transaction is rolled back or waits.
2 Introduction
Most of these techniques ensure serializability of schedules by using a concurrency
control protocol, i.e. a set of rules that guarantees serializability.
Concurrency Control
Classification of Techniques
• Lock: Locking the data items to prevent multiple transactions from accessing
the items concurrently; a number of locking protocols have been proposed.
• Use of Timestamp: A timestamp is a unique identifier for each transaction,
generated by the system. Conflicting operations are executed in timestamp order.
• Multiversion: Multiversion concurrency control uses multiple versions of a
data item.
• Optimistic Concurrency Control: Based on the concept of validation or
certification of a transaction after it executes its operations; these are sometimes
called optimistic protocols.
Locking Technique
• Lock: A variable associated with a data item that describes the status of the
item with respect to possible operations that can be applied to it.
• Generally, there is one for each data item in the database.
What is Lock?
• A lock is a variable associated with data item to control concurrent access to
that data item.
[Figure: a lock variable (0 or 1) attached to a database item]
Locking is a strategy that is used to prevent such concurrent access of data.
Lock-based Protocol
• Data items can be locked in two modes :
• Shared (S) mode: When we take this lock we can just read the item but cannot write.
• Exclusive (X) mode: When we take this lock we can read as well as write the item.
• Lock-compatibility matrix
• A transaction may be granted a lock on an item if the requested lock is compatible with locks
already held on the item by other transactions.
• If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held
by other transactions have been released. The lock is then granted.
• Any number of transactions can hold shared locks on an item, but if any transaction holds an
exclusive lock on the item, no other transaction can hold any lock on the item.
Lock-compatibility matrix (rows: T1, columns: T2)
                 Shared lock            Exclusive lock
Shared lock      Yes (compatible)       No (not compatible)
Exclusive lock   No (not compatible)    No (not compatible)
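The compatibility matrix translates directly into a lookup table. This is a minimal sketch; the names COMPATIBLE and can_grant are hypothetical, and the modes are written 'S' and 'X' as in the slides.

```python
# (mode already held, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_modes):
    """A lock is granted only if it is compatible with every lock
    that other transactions already hold on the item."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)


print(can_grant("S", ["S", "S"]))  # many readers may share an item
print(can_grant("X", ["S"]))       # a writer must wait for readers
```

If can_grant returns False, the requesting transaction waits until the incompatible locks are released, as described above.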
Lock-based Protocol
• A lock associated with an item X, lock(X), has three possible states:
• “read-locked”, “write-locked”, and “unlocked”.
• A read locked item is also called shared-locked, because other transactions are
allowed to read the item.
• A write-locked item is called exclusive-locked, because a single transaction
exclusively holds the lock on the item.
Lock-based Protocol
Rule for read/write locks:
• A txn T must issue the operation read-lock (X) or write-lock (X) before any
read-item (X) operation is performed in T.
• A txn T must issue the operation write-lock (X) before any write-item (X)
operation is performed in T.
• A txn T must issue the operation unlock (X) after all read-item (X) and write-
item (X) operations are completed in T.
• A txn T will not issue an unlock (X) operation unless it already holds a read
(shared) lock or write (exclusive) lock on item X.
Lock-based Protocol
• This locking protocol divides the execution of a transaction into three phases:
1. When the transaction starts executing, it creates a list of the data items on which it needs locks
and requests all of those locks from the system.
2. Once the transaction has acquired all its locks and needs no others, it
keeps executing its operations.
3. As soon as the transaction releases its first lock, the third phase starts. In this phase the
transaction cannot request any new lock; it can only release the locks it holds.
[Figure: timeline from transaction begin to transaction end, showing the lock acquisition phase, transaction execution, and the lock releasing phase]
Two phase Locking Protocol
• This protocol works in two phases,
1. Growing Phase
• In this phase a transaction obtains locks, but cannot release any lock.
• The point at which a transaction acquires its final lock is called the lock point.
2. Shrinking Phase
• In this phase a transaction can release locks, but cannot obtain any lock.
• The transaction enters the shrinking phase as soon as it releases the first lock after
crossing the lock point.
[Figure: timeline from transaction begin to transaction end, showing the growing phase followed by the shrinking phase]
Two phase Locking Protocol
• With lock conversion (if lock conversion is allowed):
1. Upgrading of locks (from read-locked to write-locked) must be done during the growing/expanding
phase.
2. Downgrading of locks (from write-locked to read-locked) must be done in the shrinking phase.
Hence, a read-lock (X) operation that downgrades an already held write lock on X can appear only in
the shrinking phase.
• Requirement: For a transaction these two phases must be mutually exclusive; that is, the unlocking
phase must not start during the locking phase, and the locking phase must not begin again during the unlocking phase.
• Claims:
1. If every transaction in a schedule follows the two-phase locking protocol, the schedule is guaranteed to
be serializable, so schedules no longer need to be tested for serializability.
2. If the locking mechanism enforces the two-phase locking rules, it in effect enforces serializability of
schedules.
Two phase Locking Protocol
T1                  T2
read_lock (Y);      read_lock (X);
read_item (Y);      read_item (X);
unlock (Y);         unlock (X);
write_lock (X);     write_lock (Y);
read_item (X);      read_item (Y);
X:=X+Y;             Y:=X+Y;
write_item (X);     write_item (Y);
unlock (X);         unlock (Y);
Result
Initial values: X=20; Y=30
Result of serial execution T1 followed by T2: X=50, Y=80.
Result of serial execution T2 followed by T1: X=70, Y=50.
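The two serial results in the table can be reproduced with a few lines of arithmetic. The sketch below models only the updates X := X + Y and Y := X + Y, not the locking; the function names t1, t2 and run_serial are hypothetical.

```python
def t1(db):
    # T1: X := X + Y
    db["X"] = db["X"] + db["Y"]


def t2(db):
    # T2: Y := X + Y
    db["Y"] = db["X"] + db["Y"]


def run_serial(order, initial):
    """Run the transactions one after another on a copy of the database."""
    db = dict(initial)
    for txn in order:
        txn(db)
    return db


initial = {"X": 20, "Y": 30}
print(run_serial([t1, t2], initial))  # T1 followed by T2
print(run_serial([t2, t1], initial))  # T2 followed by T1
```

A serializable schedule of T1 and T2 must produce one of these two outcomes.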
• Transactions T1 and T2 do not follow the two-phase locking protocol,
because
• the write_lock(X) operation follows the unlock(Y) operation in T1, and
similarly the write_lock(Y) operation follows the unlock(X) operation in
T2.
• If we enforce two-phase locking, the transactions can be rewritten as T1’
and T2’.
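The two-phase rule is easy to test for a single transaction: no lock request may appear after the first unlock. The following is a minimal sketch, assuming a transaction is a list of (operation, item) pairs; follows_2pl and the list names are hypothetical, and T1_PRIME is T1 with write_lock(X) moved ahead of unlock(Y).

```python
LOCK_OPS = {"read_lock", "write_lock"}


def follows_2pl(ops):
    """True if no lock is requested after the first unlock."""
    shrinking = False
    for name, _item in ops:
        if name == "unlock":
            shrinking = True          # shrinking phase has begun
        elif name in LOCK_OPS and shrinking:
            return False              # lock requested after an unlock
    return True


T1 = [("read_lock", "Y"), ("read_item", "Y"), ("unlock", "Y"),
      ("write_lock", "X"), ("read_item", "X"), ("write_item", "X"),
      ("unlock", "X")]

T1_PRIME = [("read_lock", "Y"), ("read_item", "Y"), ("write_lock", "X"),
            ("unlock", "Y"), ("read_item", "X"), ("write_item", "X"),
            ("unlock", "X")]

print(follows_2pl(T1), follows_2pl(T1_PRIME))
```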
Two phase Locking Protocol
T1’                 T2’
read_lock (Y);      read_lock (X);
read_item (Y);      read_item (X);
write_lock (X);     write_lock (Y);
unlock (Y);         unlock (X);
read_item (X);      read_item (Y);
X:=X+Y;             Y:=X+Y;
write_item (X);     write_item (Y);
unlock (X);         unlock (Y);
T1’ and T2’ follow the two-phase policy, but they are subject to deadlock, which must be dealt with.
T1’ will issue its write_lock(X) before it unlocks item Y; consequently,
when T2’ issues its read_lock(X), it is forced to wait until T1’ releases the lock by
issuing an unlock (X) in the schedule.
However, this can lead to deadlock.
Two phase Locking Protocol
T1                  T2                  T3
Lock_X(A);
read_item (A);
write_item (A);
Lock_S(B);
read_item (B);
Unlock (A);
                    Lock_X(A);
                    read_item (A);
                    write_item (A);
                    unlock (A);
                                        Lock_S(A);
                                        read_item (A);
Fail / Rollback: if T1 fails at this point, the rollback cascades to T2 and T3.
• Cascading rollback may occur under the two-phase locking protocol.
• Deadlock may occur.
• Cascading rollback can be avoided by a modification of two-phase locking.
Two phase Locking Protocol
Deadlock in 2PL
Let's understand deadlock in a two-phase locking protocol with
the help of an example.
•In the schedule above, in step 3, T1 cannot acquire the exclusive lock on R2, because transaction T2
already obtained an exclusive lock on R2 in step 1.
•Likewise, in step 3, T2 cannot acquire the exclusive lock on R1, because T1 already holds the
exclusive lock on R1.
Two phase Locking Protocol
Two-phase locking may limit the amount of concurrency
• T must lock the additional item Y before it needs it so that it can release X.
Penalty to other transactions:
• Another transaction seeking to access X may be forced to wait, even though T is
done with X.
OR
• If Y is locked earlier than it is needed, another transaction seeking to access Y is forced to
wait even though T is not using Y yet.
Types of Two-phase Locking:
1. Basic two phase locking
2. Conservative two phase locking
3. Strict two phase locking
4. Rigorous two phase locking
Conservative 2PL Protocol
• Requires a transaction to lock all the items it accesses before the transaction begins execution, by predeclaring its
read-set and write-set.
• Note: The read set of a transaction is the set of all items that the transaction reads, and the write set is the set of
all items that the transaction writes.
• If any of the predeclared items cannot be locked, the transaction does not lock any item; instead, it waits until all
the items are available for locking.
OR
The transaction must acquire all its locks before it starts and release all of them after it commits.
• Avoids deadlock, and also avoids cascading rollback when the locks are held until commit.
• Difficult to use in practice because the read-set and write-set are hard to predeclare.
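The all-or-nothing acquisition step of conservative 2PL can be sketched as follows. This is a simplified illustration that treats every lock as exclusive; the lock table is a plain dictionary and the names try_lock_all and release_all are hypothetical.

```python
def try_lock_all(lock_table, txn, items):
    """Lock every predeclared item, or none of them.

    lock_table maps item -> transaction holding it (exclusive locks only,
    a simplification made for this sketch).
    """
    if any(item in lock_table for item in items):
        return False                  # something is taken: lock nothing, wait
    for item in items:
        lock_table[item] = txn
    return True


def release_all(lock_table, txn):
    """Release every lock held by txn (e.g. after it commits)."""
    for item in [i for i, holder in lock_table.items() if holder == txn]:
        del lock_table[item]


locks = {}
print(try_lock_all(locks, "T1", {"X", "Y"}))  # T1 gets both items
print(try_lock_all(locks, "T2", {"Y", "Z"}))  # T2 gets nothing, not even Z
release_all(locks, "T1")
print(try_lock_all(locks, "T2", {"Y", "Z"}))  # now T2 can proceed
```

Because a waiting transaction holds no locks at all, it can never be part of a wait-for cycle, which is why this variant is deadlock-free.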
Strict 2PL Protocol
• The most popular variation of 2PL is strict 2PL, which guarantees strict schedules.
• In this variation, a transaction T does not release any of its exclusive locks until after it commits or aborts.
OR
A transaction must hold all its exclusive locks till it commits/aborts.
• Not a deadlock-free protocol.
• Practical to enforce, and desirable because it guarantees recoverability.
Rigorous 2PL Protocol
• A transaction T does not release any of its locks (exclusive or shared) until after it commits or aborts.
• Behaves similarly to strict 2PL but is more restrictive; it is easier to implement, since all locks are held till
commit.
All 2PL Protocol
• Basic: All locking operations (read lock or write lock) precede the first unlock operation in the transaction.
• Conservative: Prevents deadlock by locking all desired data items before transaction begins execution.
• Strict: All exclusive locks taken by a transaction must be held until the transaction commits or aborts.
• Rigorous: All (shared or exclusive) locks are held until the transaction commits or aborts.
Limitation of 2PL
Use of locks can cause two additional problems: deadlock and starvation.
Example 2PL Protocol
Which of the following schedules follow strict or rigorous two-phase locking?
a)                  b)                  c)
Lock_S(A)           Lock_S(A)           Lock_S(A)
read_item(A)        read_item(A)        read_item(A)
Lock_X(B)           Lock_X(B)           Lock_X(B)
read_item (B)       Unlock (A)          read_item (B)
Unlock (A)          read_item (B)       write_item (B)
write_item (B)      write_item (B)      Commit
Unlock (B)          Commit              Unlock (A)
                    Unlock (B)          Unlock (B)
Answer:
Basic 2PL           Basic 2PL           Basic 2PL
                    Strict 2PL          Strict 2PL
                                        Rigorous 2PL
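The answers to this exercise can be derived with a small checker. The sketch below assumes a transaction is a list of (operation, item) pairs using the operation names lock_s, lock_x, unlock, commit and abort; classify_2pl and the names SCHEDULE_A to SCHEDULE_C are hypothetical.

```python
def classify_2pl(ops):
    """Return the set of 2PL variants ('basic', 'strict', 'rigorous')
    that the operation sequence satisfies."""
    lock_mode = {}     # item -> 'S' or 'X'
    unlocks = []       # (position, mode of the lock being released)
    end_pos = None     # position of commit/abort, if present
    seen_unlock = False
    for pos, (name, item) in enumerate(ops):
        if name in ("lock_s", "lock_x"):
            if seen_unlock:
                return set()      # lock after unlock: not two-phase at all
            lock_mode[item] = "X" if name == "lock_x" else "S"
        elif name == "unlock":
            seen_unlock = True
            unlocks.append((pos, lock_mode[item]))
        elif name in ("commit", "abort"):
            end_pos = pos

    def after_end(pos):
        return end_pos is not None and pos > end_pos

    variants = {"basic"}
    # Strict: exclusive locks are released only after commit/abort.
    if all(after_end(p) for p, mode in unlocks if mode == "X"):
        variants.add("strict")
    # Rigorous: all locks are released only after commit/abort.
    if all(after_end(p) for p, _mode in unlocks):
        variants.add("rigorous")
    return variants


SCHEDULE_A = [("lock_s", "A"), ("read", "A"), ("lock_x", "B"), ("read", "B"),
              ("unlock", "A"), ("write", "B"), ("unlock", "B")]
SCHEDULE_B = [("lock_s", "A"), ("read", "A"), ("lock_x", "B"), ("unlock", "A"),
              ("read", "B"), ("write", "B"), ("commit", None), ("unlock", "B")]
SCHEDULE_C = [("lock_s", "A"), ("read", "A"), ("lock_x", "B"), ("read", "B"),
              ("write", "B"), ("commit", None), ("unlock", "A"), ("unlock", "B")]

for schedule in (SCHEDULE_A, SCHEDULE_B, SCHEDULE_C):
    print(sorted(classify_2pl(schedule)))
```

The sketch treats locks that are never explicitly released as held until the end of the transaction.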
Deadlock
What is deadlock?
• Consider the following two transactions:
• A deadlock is a situation in which two or more transactions are waiting for one
another to give up locks.
T1                               T2
Lock-X (A)   [granted for A]     Lock-X (B)   [granted for B]
Write (A)                        Write (B)
Lock-X (B)   [waiting for B]     Lock-X (A)   [waiting for A]
Write (B)                        Write (A)
Deadlock Detection
• A simple way to detect deadlock is with the help of wait-for graph.
• One node is created in the wait-for graph for each transaction that is currently
executing.
• Whenever a transaction Ti is waiting to lock an item X that is currently locked
by a transaction Tj, a directed edge from Ti to Tj (Ti→Tj) is created in the wait-
for graph.
• When Tj releases the lock(s) on the items that Ti was waiting for, the directed
edge is dropped from the wait-for graph.
• We have a state of deadlock if and only if the wait-for graph has a cycle.
• Then each transaction involved in the cycle is said to be deadlocked.
Deadlock Detection
• Transaction A is waiting for transactions B and C.
• Transaction C is waiting for transaction B.
• Transaction B is waiting for transaction D.
• This wait-for graph has no cycle, so there is no deadlock
state.
• Suppose now that transaction D is requesting an item
held by C. Then the edge D → C is added to the wait-for
graph.
• Now this graph contains the cycle.
• B → D → C → B
• It means that transactions B, D and C are all
deadlocked.
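Deadlock detection on a wait-for graph amounts to searching for a cycle. The following is a minimal sketch using depth-first search; the graph is a dictionary from each waiting transaction to the transactions it waits for, and find_cycle is a hypothetical name.

```python
def find_cycle(wait_for):
    """Return a list of nodes forming a cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:                      # back edge: cycle found
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(wait_for):
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# A waits for B and C, C waits for B, B waits for D: no cycle.
graph = {"A": ["B", "C"], "C": ["B"], "B": ["D"], "D": []}
# D now requests an item held by C, adding the edge D -> C.
graph_with_request = {**graph, "D": ["C"]}

print(find_cycle(graph))
print(find_cycle(graph_with_request))
```

For the second graph the search reports the cycle B, D, C, B, which matches the deadlocked transactions identified above.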
[Figure: wait-for graph with nodes A, B, C and D]
Deadlock Recovery
• When a deadlock is detected, the system must recover from the
deadlock.
• The most common solution is to roll back one or more
transactions to break the deadlock.
• Choosing which transaction to abort is known as victim selection.
• In this wait-for graph transactions B, D and C are deadlocked.
• To remove the deadlock, one of these three transactions (B, D, C) must be rolled back.
• We should roll back the transaction(s) that will incur the minimum
cost.
• When a deadlock is detected, the choice of which transaction to
abort can be made using the following criteria:
• The transaction that holds the fewest locks
• The transaction that has done the least work
• The transaction that is farthest from completion
[Figure: wait-for graph with the cycle B → D → C → B]
Deadlock Prevention
• Deadlock prevention protocols ensure that the system will never enter a deadlock state.
• Some prevention strategies:
• Require that each transaction locks all its data items before it begins execution (pre-declaration).
• Impose a partial ordering on all data items and require that a transaction lock data
items only in the order specified by that partial order.
Deadlock Prevention
• The following schemes use transaction timestamps for the sake of deadlock prevention alone.
1. Wait-die scheme (non-preemptive)
• If an older transaction requests a resource held by a younger transaction, the older
transaction is allowed to wait until the resource is available.
• If a younger transaction requests a resource held by an older transaction, the younger
transaction is killed (rolled back).
2. Wound-wait scheme (preemptive)
• If an older transaction requests a resource held by a younger transaction, the older
transaction forces the younger transaction to be killed (rolled back) and to release the resource.
• If a younger transaction requests a resource held by an older transaction, the younger
transaction is allowed to wait until the older transaction releases it.
                                 Wait-die    Wound-wait
O needs a resource held by Y     O waits     Y dies
Y needs a resource held by O     Y dies      Y waits
(O = older transaction, Y = younger transaction)
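Both schemes reduce to a comparison of two timestamps. This is a minimal sketch, assuming a smaller timestamp means an older transaction; the function names and the returned strings are hypothetical.

```python
def wait_die(ts_requester, ts_holder):
    """Non-preemptive: an older requester waits, a younger one dies."""
    if ts_requester < ts_holder:      # requester is older
        return "requester waits"
    return "requester dies"


def wound_wait(ts_requester, ts_holder):
    """Preemptive: an older requester wounds the holder, a younger one waits."""
    if ts_requester < ts_holder:      # requester is older
        return "holder dies"
    return "requester waits"


OLD, YOUNG = 1, 2   # illustrative timestamps
print(wait_die(OLD, YOUNG), "|", wound_wait(OLD, YOUNG))
print(wait_die(YOUNG, OLD), "|", wound_wait(YOUNG, OLD))
```

In both schemes the transaction that is rolled back is always the younger one, so the oldest transaction in the system is never aborted by these rules.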
3. Timeout-Based Schemes
• A transaction waits for a lock only for a specified amount of time. After that, the wait
times out and the transaction is rolled back, so a deadlock cannot persist indefinitely.
• Simple to implement, but it is difficult to determine a good value for the timeout interval.
Deadlock Prevention
Starvation
• A transaction is starved if it cannot proceed for an indefinite period of time while other transactions in
the system continue normally.
• This may occur if the waiting scheme for locked items is unfair, giving priority to some transaction over
others.
• Starvation can also occur in the algorithms for dealing with deadlock.
• It occurs if an algorithm selects the same transaction as victim repeatedly, thus causing it to abort and
never finish execution.
Techniques for Preventing Starvation:
1. First come first serve
2. Allow some transactions to have priority over others and go with aging technique.
3. Give priority to transactions that have been aborted repeatedly.
Note: The wait-die and wound-wait scheme avoid starvation.
Timestamp based Concurrency Control
• This protocol uses either the system time or a logical counter as a time-stamp.
• Every transaction has a time-stamp associated with it, and the ordering is determined
by the age of the transaction.
• A transaction ‘T1’ created at clock time 0002 would be older than all transactions
that come after it.
• For example, any transaction ‘T2’ entering the system at 0004 is two seconds younger
than transaction ‘T1’, and priority is given to the older one.
• In addition, every data item is given the latest read and write time-stamps. This lets the
system know when the last read and write operations were performed on the data item.
• It is the responsibility of the protocol to ensure that conflicting pairs of operations are
executed according to the timestamp values of the transactions.
• Time-stamp of Transaction Ti is denoted as TS(Ti).
• Read time-stamp of data-item X is denoted by R-timestamp(X).
• Write time-stamp of data-item X is denoted by W-timestamp(X).
Timestamp based Concurrency Control
• Timestamp ordering protocol works as follows:
• If a transaction Ti issues read(X) operation:
• If TS(Ti) < W-timestamp(X)
• Operation rejected and Ti rolled back.
• If TS(Ti) >= W-timestamp(X)
• Operation executed; R-timestamp(X) is set to max(R-timestamp(X), TS(Ti)).
• If a transaction Ti issues write(X) operation:
• If TS(Ti) < R-timestamp(X)
• Operation rejected and Ti rolled back.
• If TS(Ti) < W-timestamp(X)
• Operation rejected and Ti rolled back.
• Otherwise, operation executed; W-timestamp(X) is set to TS(Ti).
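The read and write rules of the timestamp ordering protocol can be sketched as two functions. The Item class and the function names are hypothetical; the updates to the read and write timestamps after a successful operation follow the standard textbook form of the basic protocol.

```python
class Item:
    """A data item with its read and write timestamps."""

    def __init__(self, value=None):
        self.value = value
        self.r_ts = 0   # R-timestamp(X): largest TS that has read X
        self.w_ts = 0   # W-timestamp(X): TS of the last writer of X


def read(ts, item):
    """Return True if the read is allowed, False if Ti must roll back."""
    if ts < item.w_ts:                 # X was overwritten by a younger txn
        return False
    item.r_ts = max(item.r_ts, ts)
    return True


def write(ts, item, value):
    """Return True if the write is allowed, False if Ti must roll back."""
    if ts < item.r_ts or ts < item.w_ts:
        return False                   # a younger txn already read/wrote X
    item.value = value
    item.w_ts = ts
    return True


x = Item()
print(write(5, x, "a"))   # allowed: nothing newer has touched X
print(read(3, x))         # rejected: TS 3 is older than the write by TS 5
print(read(7, x))         # allowed: R-timestamp becomes 7
print(write(6, x, "b"))   # rejected: TS 7 has already read X
```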
Recovery Techniques
Database Recovery
• There are many situations in which a transaction may not reach a commit or abort
point.
• Operating system crash
• DBMS crash
• System might lose power (power failure)
• Disk may fail or other hardware may fail (disk/hardware failure)
• Human error
• In any of above situations, data in the database may become inconsistent or lost.
• For example, if a transaction has completed 30 out of 40 write instructions to the
database when the DBMS crashes, then the database may be in an inconsistent state
as only part of the transaction’s work was completed.
• Database recovery is the process of restoring the database and the data to a
consistent state.
• This may include restoring lost data up to the point of the event (e.g. system crash).
Log-based Recovery
• The log is a sequence of log records, which maintains information about
update activities on the database.
• A log is kept on stable storage (e.g. HDD).
• Log contains
• Start of transaction
• Transaction-id
• Record-id
• Type of operation (insert, update, delete)
• Old value, new value
• End of transaction that is committed or aborted.
Log-based Recovery
• When transaction Ti starts, it registers itself by writing a record <Ti start> to
the log.
• Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is
the value of X before the write (the old value), and V2 is the value to be written
to X (the new value).
• When Ti finishes its last statement, the log record <Ti commit> is written.
• Undo of a log record <Ti, X, V1, V2> writes the old value V1 to X
• Redo of a log record <Ti, X, V1, V2> writes the new value V2 to X
• Types of log based recovery method
• Immediate database modification
• Deferred database modification
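Undo and redo can be illustrated with a log of <Ti, X, V1, V2> records. The sketch below assumes immediate database modification and a crash before T2 commits; the tuple encoding of the log and the function name recover are hypothetical, and the balances are illustrative sample values.

```python
def recover(db, log):
    """Restore a consistent state after a crash.

    log records: ('start', T), ('write', T, X, old, new), ('commit', T).
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo pass: scan the log backwards and restore the old value (V1)
    # written by every transaction that never committed.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, old, _new = rec
            db[item] = old

    # Redo pass: scan forwards and reapply the new value (V2)
    # written by every committed transaction.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _old, new = rec
            db[item] = new
    return db


log = [
    ("start", "T1"),
    ("write", "T1", "A", 500, 400),
    ("write", "T1", "B", 600, 700),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "C", 700, 500),   # crash happens before <T2 commit>
]
on_disk_at_crash = {"A": 400, "B": 700, "C": 500}
print(recover(on_disk_at_crash, log))
```

T1 is redone because its commit record is in the log, while the write by T2 is undone, restoring C to its old value.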
Immediate v/s Deferred database modification
Immediate database modification: Updates (changes) to the database are applied immediately as they occur, without
waiting to reach the commit point.
Deferred database modification: Updates (changes) to the database are deferred (postponed) until the transaction
commits.
Immediate v/s Deferred database modification
T1                  T2
Read (A)            Read (C)
A = A - 100         C = C - 200
Write (A)           Write (C)
Read (B)            Commit
B = B + 100
Write (B)
Commit
Initial values (both methods): A=500, B=600, C=700

Immediate database modification (log, then database state):
1. After T1 has written A and B:
<T1 start>
<T1, A, 500, 400>
<T1, B, 600, 700>
Database: A=400,B=700,C=700
2. After T1 has committed and T2 has written C:
<T1 start>
<T1, A, 500, 400>
<T1, B, 600, 700>
<T1, Commit>
<T2 start>
<T2, C, 700, 500>
Database: A=400,B=700,C=500
3. After T2 has committed:
<T1 start>
<T1, A, 500, 400>
<T1, B, 600, 700>
<T1, Commit>
<T2 start>
<T2, C, 700, 500>
<T2, Commit>
Database: A=400,B=700,C=500

Deferred database modification (log, then database state):
1. After T1 has written A and B:
<T1 start>
<T1, A, 400>
<T1, B, 700>
Database: A=500,B=600,C=700
2. After T1 has committed and T2 has written C:
<T1 start>
<T1, A, 400>
<T1, B, 700>
<T1, Commit>
<T2 start>
<T2, C, 500>
Database: A=400,B=700,C=700
3. After T2 has committed:
<T1 start>
<T1, A, 400>
<T1, B, 700>
<T1, Commit>
<T2 start>
<T2, C, 500>
<T2, Commit>
Database: A=400,B=700,C=500
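The database states shown for deferred modification can be reproduced by replaying only the writes of committed transactions. This is a minimal sketch; the tuple encoding of the log records and the name replay_deferred are hypothetical.

```python
def replay_deferred(db, log):
    """Apply the logged new values of committed transactions only.

    log records: ('start', T), ('write', T, X, new), ('commit', T).
    Uncommitted writes are simply ignored, so no undo is ever needed.
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    result = dict(db)
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, new = rec
            result[item] = new
    return result


INITIAL = {"A": 500, "B": 600, "C": 700}
LOG = [
    ("start", "T1"),
    ("write", "T1", "A", 400),
    ("write", "T1", "B", 700),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "C", 500),
    ("commit", "T2"),
]

print(replay_deferred(INITIAL, LOG[:3]))  # T1 not yet committed
print(replay_deferred(INITIAL, LOG[:6]))  # T1 committed, T2 not yet
print(replay_deferred(INITIAL, LOG))      # both committed
```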
Immediate v/s Deferred database modification
Immediate database modification:
• Updates (changes) to the database are applied immediately as they occur, without waiting to reach the commit point.
• If the transaction is not committed, we need to undo its updates and restart the transaction.
• If the transaction is committed, there is no need to redo its updates.
• Both undo and redo operations are performed.
Deferred database modification:
• Updates (changes) to the database are deferred (postponed) until the transaction commits.
• If the transaction is not committed, no undo operation is needed. Just restart the transaction.
• If the transaction is committed, we need to redo its updates.
• Only the redo operation is performed.
Problems with Deferred & Immediate Updates (Checkpoint)
• Searching the entire log is time consuming.
• Immediate database modification
• When a transaction fails, the log file is used to undo the updates of the transaction.
• Deferred database modification
• When a transaction commits, the log file is used to redo the updates of the transaction.
• To reduce the time spent searching the entire log, we can use checkpoints.
• A checkpoint is a point which specifies that all operations executed before it were done correctly
and stored safely (updated safely in the database).
• At this point, all the buffers are forcibly written (flushed) to secondary storage
(the database).
• Checkpoints are scheduled at predetermined time intervals.
• It is used to limit:
• Size of transaction log file
• Amount of searching
How the checkpoint works when failure occurs
• At failure time:
• Ignore transaction T1, as it committed before the checkpoint.
• Redo transactions T2 and T3, as they were active after the checkpoint and committed before the failure.
• Undo transaction T4, as it was active after the checkpoint and had not committed before the failure.
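The three rules above depend only on when a transaction committed relative to the checkpoint and the failure. The following sketch assumes numeric times; the values used for the checkpoint, the failure and the commits are hypothetical and are chosen only to match the ordering described for T1 to T4.

```python
def recovery_action(commit_time, checkpoint, failure):
    """Decide what recovery does with one transaction.

    commit_time is None if the transaction had not committed
    when the failure occurred.
    """
    if commit_time is not None and commit_time <= checkpoint:
        return "ignore"   # its updates were already saved at the checkpoint
    if commit_time is not None and commit_time <= failure:
        return "redo"     # committed after the checkpoint, before the failure
    return "undo"         # still active at the failure


CHECKPOINT, FAILURE = 10, 20          # hypothetical times
commit_times = {"T1": 5, "T2": 15, "T3": 18, "T4": None}

for name, commit_time in commit_times.items():
    print(name, recovery_action(commit_time, CHECKPOINT, FAILURE))
```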
[Figure: timeline of transactions T1 to T4 against the checkpoint time TC and the failure time Tf]
How the checkpoint works when failure occurs
[Figure: timeline of transactions T1 to T6 against the checkpoint time TC and the failure time Tf]
Exercise: Name the transactions that committed before the checkpoint.
Exercise: Name the transactions that need a redo operation.
Exercise: Name the transactions that need an undo operation.
Page table structure
[Figure: a page table with entries 1 to n, each pointing to a page on disk]
The database is partitioned into fixed-length blocks referred to as pages.
The page table has n entries, one for each database page, and each entry
contains a pointer to a page on disk.
Shadow paging technique
• Shadow paging is an alternative to log-based recovery.
• This scheme is useful if transactions execute serially.
• It maintains two page tables during the lifetime of a transaction
• current page table
• shadow page table
• Shadow page table is stored on non-volatile storage.
• When a transaction starts, both page tables are identical. Only the current
page table is updated when data items are changed during execution of
the transaction.
• Shadow page table is never modified during execution of transaction.
Shadow paging technique
• Two pages - page 2 & 5 - are affected by a transaction and copied to new physical
pages. The current page table points to these pages.
• The shadow page table continues to point to old pages which are not changed by
the transaction. So, this table and pages are used for undoing the transaction.
[Figure: the current page table and the shadow page table, each with entries for pages 1 to 5. The shadow page table points to Page 2 (old) and Page 5 (old); the current page table points to Page 2 (new) and Page 5 (new).]
• Whenever any page is updated for the first time:
1. A copy of this page is made onto an unused page
2. The current page table is then made to point to the copy
3. The update is performed on the copy
Shadow paging technique
• When a transaction starts, both page tables are identical.
• The shadow page table is never changed over the duration of the transaction.
• The current page table will be changed when a transaction performs a write
operation.
• All input and output operations use the current page table.
• Whenever any page is about to be written for the first time
• A copy of this page is made onto an unused page
• The current page table is then made to point to the copy
• The update is performed on the copy
• When the transaction completes (commits), the current page table, which reflects all the modifications
made by the transaction, becomes the new shadow page table.
• When the transaction fails, the current page table is discarded and the shadow page table is used to
restore the state before the transaction.
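The copy-on-first-write behaviour of shadow paging can be sketched in memory. This is an illustration only: the class ShadowPagedDB and its methods are hypothetical, the "disk" is a Python list, page numbers are 0-based, and old pages are never reclaimed.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        self.disk = list(pages)                  # physical pages
        self.shadow = list(range(len(pages)))    # logical page -> physical slot
        self.current = list(self.shadow)

    def begin(self):
        # Both page tables are identical when a transaction starts.
        self.current = list(self.shadow)

    def read(self, page):
        # All reads and writes go through the current page table.
        return self.disk[self.current[page]]

    def write(self, page, value):
        if self.current[page] == self.shadow[page]:
            # First write to this page: copy it onto an unused page and
            # make the current page table point to the copy.
            self.disk.append(self.disk[self.shadow[page]])
            self.current[page] = len(self.disk) - 1
        self.disk[self.current[page]] = value    # update the copy only

    def commit(self):
        # The current page table becomes the new shadow page table.
        self.shadow = list(self.current)

    def abort(self):
        # Discard the current page table; the old pages were never touched.
        self.current = list(self.shadow)


db = ShadowPagedDB(["p1", "p2", "p3", "p4", "p5"])
db.begin()
db.write(1, "p2-new")      # logical page 2
db.write(4, "p5-new")      # logical page 5
print(db.read(1), db.read(4))
db.abort()
print(db.read(1), db.read(4))
```

After the abort the reads return the old contents again, because the shadow page table still points to the unmodified pages.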
Presentation on Transaction Processing in DBMS
  • 3.
    Introduction to TransactionProcessing • Single-User System: At most one user at a time can use the system. • Multiuser System: Many users can access the system concurrently. • Concurrency • Interleaved processing: concurrent execution of processes is interleaved in a single CPU • Parallel processing: processes are concurrently executed in multiple CPUs.
  • 4.
    Introduction to TransactionProcessing  A Transaction: logical unit of database processing that includes one or more access operations (read -retrieval, write - insert or update, delete).  A transaction (set of operations) may be stand-alone specified in a high level language like SQL submitted interactively, or may be embedded within a program.  Transaction boundaries: Begin and End transaction.  An application program may contain several transactions separated by the Begin and End transaction boundaries.
  • 5.
    What is transaction? •A transaction is a sequence of operations performed as a single logical unit of work. • A transaction is a logical unit of work that contains one or more SQL statements. • Example of transaction: Transaction Operations Works as a single logical unit read (A) A = A – 50 write (A) read (B) B = B + 50 write (B) Want to transfer Rs. 50 from Account-A to Account-B
  • 6.
    What is transaction? •A transaction is used to represent a logical unit of database processing that must be completed to ensure correctness. • Collection of operations that form a single logical unit of work are called transactions • A transaction is a unit of program execution that accesses and possibly updates various data items. • Two main issues to deal with: • Failure of various kinds; such as hardware failure or system crashes. • Concurrent execution of multiple transactions
  • 7.
    Introduction to TransactionProcessing SIMPLE MODEL OF A DATABASE (for purposes of discussing transactions):  A database - collection of named data items  Granularity of data - a field, a record , or a whole disk block (Concepts are independent of granularity)  Basic operations are read and write – read_item(X): Reads a database item named X into a program variable. To simplify our notation, we assume that the program variable is also named X. – write_item(X): Writes the value of program variable X into the database item named X.
  • 8.
    Introduction to TransactionProcessing READ AND WRITE OPERATIONS:  Basic unit of data transfer from the disk to the computer main memory is one block. In general, a data item (what is read or written) will be the field of some record in the database, although it may be a larger unit such as a record or even a whole block.  read_item(X) command includes the following steps: 1. Find the address of the disk block that contains item X. 2. Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer). 3. Copy item X from the buffer to the program variable named X.
  • 9.
    Introduction to TransactionProcessing READ AND WRITE OPERATIONS (cont.):  write_item(X) command includes the following steps: 1. Find the address of the disk block that contains item X. 2. Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer). 3. Copy item X from the program variable named X into its correct location in the buffer. 4. Store the updated block from the buffer back to disk (either immediately or at some later point in time).
  • 10.
    ACID properties oftransaction
  • 11.
    ACID properties oftransaction • Atomicity (Either transaction execute 0% or 100%) • Consistency (Database must remain in a consistent state after any transaction) • Isolation (Intermediate transaction results must be hidden from other concurrently executed transactions) • Durability (Once a transaction completed successfully, the changes it has made into the database should be permanent)
  • 12.
    ACID properties oftransaction (Atomicity) Atomicity requirement • If the transaction fails after step 3 and before step 6, money will be “lost” leading to an inconsistent database state. • Failure could be due to software or hardware. • The system should ensure that updates of a partially executed transaction are not reflected in the database. • This property states that a transaction must be treated as an atomic unit, that is, either all of its operations are executed or none. • Either transaction execute 0% or 100%. • For example, consider a transaction to transfer Rs. 50 from account A to account B. • In this transaction, if Rs. 50 is deducted from account A then it must be added to account B. read (A) A = A – 50 write (A) read (B) B = B + 50 write (B) 0% 100% FAIL
  • 13.
    ACID properties of transaction (Consistency)
    • The database must remain in a consistent state after any transaction.
    • If the database was in a consistent state before the execution of a transaction, it must remain consistent after the execution of the transaction as well.
    • A transaction, when starting to execute, must see a consistent database.
    • During transaction execution the database may be temporarily inconsistent.
    • When the transaction completes successfully the database must be consistent.
    • In our example, the total of A and B must remain the same before and after the execution of the transaction.
    • Before: A = 500, B = 500, A + B = 1000. Transaction: read (A), A = A - 50, write (A), read (B), B = B + 50, write (B). After: A = 450, B = 550, A + B = 1000.
  • 14.
    ACID properties of transaction (Isolation)
    • Changes made by a transaction are not visible to any other transaction until it has committed.
    • Intermediate transaction results must be hidden from other concurrently executed transactions.
    • In our example, once the transaction starts at the first step (step 1), its results should not be accessed by any other transaction until the last step (step 6) is completed.
    • Isolation is enforced by the concurrency control component of the database system.
    • Transaction: Start Transaction, read (A), A = A - 50, write (A), read (B), B = B + 50, write (B).
  • 15.
    ACID properties of transaction (Isolation)
    • Isolation requirement: if, between steps 3 and 6, another transaction T2 is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be).
    T1                T2
    1. read(A)
    2. A := A - 50
    3. write(A)
                      read(A), read(B), print(A+B)
    4. read(B)
    5. B := B + 50
    6. write(B)
    • Isolation can be ensured trivially by running transactions serially, that is, one after the other.
    • However, executing multiple transactions concurrently has significant benefits, as we will see later.
  • 16.
    ACID properties of transaction (Durability)
    • After a transaction completes successfully, the changes it has made to the database persist (are permanent), even if there are system failures.
    • Once our transaction has completed up to the last step (step 6), its result must be stored permanently. It must not be lost if the system fails.
    • Before: A = 500, B = 500. Transaction: read (A), A = A - 50, write (A), read (B), B = B + 50, write (B). After: A = 450, B = 550. These values must be stored permanently in the database.
  • 17.
    Transaction State Diagram State Transition Diagram
  • 18.
    Transaction State Diagram
    State Transition Diagram (figure): the states are Active, Partially Committed, Failed, Committed, and Aborted. A transaction starts in Active, moves to Partially Committed when it executes its final operation (End), and from there to Committed (Commit). From Active or Partially Committed it moves to Failed when normal execution can no longer proceed, and from Failed to Aborted once it has been rolled back. Example transaction: read (A), A = A - 50, write (A), read (B), B = B + 50, write (B), Commit.
  • 19.
    Transaction State Diagram
    State Transition Diagram
    • Active
      • This is the initial state.
      • The transaction stays in this state while it is executing.
    • Partially Committed
      • When a transaction executes its final operation (instruction), it is said to be in a partially committed state.
    • Failed
      • The transaction enters this state when it is discovered that normal execution can no longer proceed.
      • Once a transaction cannot be completed, any changes that it made must be undone by rolling it back.
    • Committed
      • The transaction enters this state after successful completion (after the transaction commits).
      • We cannot abort or roll back a committed transaction.
    • Aborted
      • The state after the transaction has been rolled back and the database has been restored to its state prior to the start of the transaction.
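The state diagram can be written down as a small transition table. The sketch below is one possible encoding; the names State, ALLOWED and move are hypothetical, and the transitions are the ones described in the bullets above.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    COMMITTED = "committed"
    ABORTED = "aborted"


# Allowed transitions; Committed and Aborted are terminal states.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),
    State.ABORTED: set(),
}


def move(current, target):
    """Return the new state, or raise if the diagram has no such arrow."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

For instance, move(State.COMMITTED, State.ABORTED) raises ValueError, which mirrors the rule that a committed transaction cannot be aborted or rolled back.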
  • 20.
  • 21.
    What is concurrency?
    • In a multi-user database system, many users can use the database at the same time.
    • Even so, the database must remain in a consistent state.
    • Nowadays we have multiprogramming systems, which also result in concurrent execution of processes by interleaving.
    • So, concurrency is the ability of a database to allow multiple (more than one) users to access data at the same time.
  • 22.
    What is concurrency?
    • Concurrency is allowed for the following reasons:
      • Improved throughput and resource utilization.
      • Reduced waiting times.
    • Concurrent execution may produce incorrect results and lead to data inconsistency. To avoid this inconsistency we need some concurrency control mechanism.
    • Four problems due to concurrency:
      • Lost update problem
      • Dirty read problem
      • Incorrect summary (retrieval) problem
      • Unrepeatable read problem
  • 23.
    Lost update problem
    • This problem arises when two transactions T1 and T2 both read the same data item and then update it: the effect of the first update is overwritten by the second update.
    • How to avoid: transaction T2 must not update the data item X until transaction T1 has committed its update to X.
    • Example (initially X = 100):
    Time   T1                   T2
    t0     ---                  ---
    t1     Read X               ---
    t2     ---                  Read X
    t3     Update X (X = 75)    ---
    t4     ---                  Update X (X = 50)
    t5     ---                  ---
  • 24.
    Lost update problem
    Suppose that transactions T1 and T2 are submitted at approximately the same time, and suppose that their operations are interleaved as shown in Figure (a); then the final value of item X is incorrect because T2 reads the value of X before T1 changes it in the database, and hence the updated value resulting from T1 is lost. For example, if X = 80 at the start (originally there were 80 reservations on the flight), N = 5 (T1 transfers 5 seat reservations from the flight corresponding to X to the flight corresponding to Y), and M = 4 (T2 reserves 4 seats on X), the final result should be X = 79. However, in the interleaving of operations shown in Figure (a), it is X = 84 because the update in T1 that removed the five seats from X was lost.
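The arithmetic of this example can be replayed step by step. The sketch below hard-codes one interleaving in which T2 reads X before T1 writes it; the function names and the initial value of Y are assumptions (the slide does not give Y), while X = 80, N = 5 and M = 4 come from the text.

```python
def interleaved(x=80, y=50, n=5, m=4):
    """Replay an interleaving in which T1's update of X is lost."""
    db = {"X": x, "Y": y}              # y=50 is an assumed starting value
    t1_x = db["X"]                     # T1: read_item(X)
    t1_x -= n                          # T1: X := X - N
    t2_x = db["X"]                     # T2: read_item(X), still the old value
    t2_x += m                          # T2: X := X + M
    db["X"] = t1_x                     # T1: write_item(X)
    t1_y = db["Y"]                     # T1: read_item(Y)
    db["X"] = t2_x                     # T2: write_item(X) overwrites T1's update
    t1_y += n                          # T1: Y := Y + N
    db["Y"] = t1_y                     # T1: write_item(Y)
    return db


def serial(x=80, y=50, n=5, m=4):
    """Run T1 completely, then T2."""
    db = {"X": x, "Y": y}
    db["X"] -= n
    db["Y"] += n
    db["X"] += m
    return db
```

The interleaved run ends with X = 84, while the serial run gives the expected X = 79.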
  • 25.
    Dirty read problem
    • A dirty read arises when one transaction updates an item and then fails for some reason, and the updated item is retrieved by another transaction before it is changed back to its original value.
    • How to avoid: transaction T1 must not read the data item X until transaction T2 has committed its update to X.
    • Example (initially X = 100):
    Time   T1        T2
    t0     ---       ---
    t1     ---       Update X (X = 50)
    t2     Read X    ---
    t3     ---       Rollback
    t4     ---       ---
  • 26.
    Dirty read problem
    This problem occurs when one transaction updates a database item and then the transaction fails for some reason. Meanwhile, the updated item is accessed (read) by another transaction before it is changed back (or rolled back) to its original value. Figure (b) shows an example where T1 updates item X and then fails before completion, so the system must roll back X to its original value. Before it can do so, however, transaction T2 reads the temporary value of X, which will not be recorded permanently in the database because of the failure of T1. The value of item X that is read by T2 is called dirty data because it has been created by a transaction that has not completed and committed yet; hence, this problem is also known as the dirty read problem.
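A tiny replay of the dirty read, under the assumption that rollback simply restores a saved before-image. The function name dirty_read_demo is hypothetical; the values X = 100 and X = 50 are borrowed from the timeline example, and the roles follow the paragraph above (T1 writes and fails, T2 reads).

```python
def dirty_read_demo():
    db = {"X": 100}
    before_image = db["X"]             # kept so that T1 can be rolled back
    db["X"] = 50                       # T1: update X (not yet committed)
    seen_by_t2 = db["X"]               # T2: reads the temporary (dirty) value
    db["X"] = before_image             # T1 fails: X is rolled back to 100
    return seen_by_t2, db["X"]
```

T2 ends up holding 50, a value that was never part of any committed database state, while the database itself is back at 100.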
  • 27.
    Incorrect retrieval problem
    • The inconsistent retrieval problem arises when one transaction retrieves data to use in some operation, but before it can use this data another transaction updates that data and commits.
    • The change is hidden from the first transaction, which continues to use the previously retrieved data. This problem is also known as the inconsistent analysis problem.
    • How to avoid: transaction T2 must not read or update a data item until transaction T1, which is using it, has committed.
    • Example (initial balances A = 200, B = 250, C = 150):
    Time   T1                                  T2
    t1     Read (A), Sum ← 200                 ---
    t2     Read (B), Sum ← Sum + 250 = 450     ---
    t3     ---                                 Read (C)
    t4     ---                                 Update (C): 150 ← 150 - 50 = 100
    t5     ---                                 Read (A)
    t6     ---                                 Update (A): 200 ← 200 + 50 = 250
    t7     ---                                 COMMIT
    t8     Read (C), Sum ← Sum + 100 = 550     ---
    • T1 computes Sum = 550, although the balances total 600 both before and after T2.
  • 28.
    Incorrect retrieval problem
    If one transaction is calculating an aggregate summary function on a number of database items while other transactions are updating some of these items, the aggregate function may calculate some values before they are updated and others after they are updated. For example, suppose that a transaction T3 is calculating the total number of reservations on all the flights; meanwhile, transaction T1 is executing. If the interleaving of operations shown in Figure (c) occurs, the result of T3 will be off by an amount N because T3 reads the value of X after N seats have been subtracted from it but reads the value of Y before those N seats have been added to it.
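The same effect can be reproduced with the account balances from the timeline example (A = 200, B = 250, C = 150). This is only a replay of that one interleaving; the function name incorrect_summary is an assumption.

```python
def incorrect_summary():
    db = {"A": 200, "B": 250, "C": 150}
    total = db["A"]                    # summing transaction reads A (old value)
    total += db["B"]                   # ... and B
    db["C"] -= 50                      # updating transaction: C = 150 - 50
    db["A"] += 50                      # updating transaction: A = 200 + 50, then commit
    total += db["C"]                   # summing transaction reads C (new value)
    return total, sum(db.values())
```

The first element of the result is the sum the reader computed (550); the second is the true total of the balances (600).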
  • 29.
    Unrepeatable Read
    • This problem occurs when a transaction T reads the same item twice and the item is changed by another transaction T′ between the two reads.
    • Hence, T receives different values for its two reads of the same item. For example, during an airline reservation transaction, a customer inquires about seat availability on several flights. When the customer decides on a particular flight, the transaction reads the number of seats on that flight a second time before completing the reservation, and it may end up reading a different value for the item.
  • 30.
  • 31.
    What is schedule?
    • A schedule is a process of grouping transactions together and executing them in a predefined order.
    • A schedule is the chronological (sequential) order in which instructions are executed in a system.
    • A schedule is required in a database because, when transactions execute concurrently, they may affect each other's results.
    • That is, if one transaction is updating values that another transaction is accessing, then the order of the two transactions changes the result of the other transaction.
    • Hence a schedule is created to execute the transactions.
  • 32.
    What is schedule?
    • A schedule must preserve the order in which the instructions appear in each individual transaction.
    • A transaction that successfully completes its execution will have a commit instruction as its last statement.
      • By default, a transaction is assumed to execute a commit instruction as its last step.
    • A transaction that fails to complete its execution successfully will have an abort instruction as its last statement.
    • Formal definition:
      • A schedule (or history) S of n transactions T1, T2, ..., Tn is an ordering of the operations of the transactions subject to the constraint that, for each transaction Ti that participates in S, the operations of Ti in S must appear in the same order in which they occur in Ti. Note, however, that operations from other transactions Tj can be interleaved with the operations of Ti in S.
  • 33.
  • 34.
    Schedule 1  LetT1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B.  A serial schedule in which T1 is followed by T2 :
  • 35.
    Schedule 1 (initially A = B = 1000)
    T1            T2                Execution
    Read (A)                        Read (1000)
    A = A - 50                      A = 1000 - 50
    Write (A)                       Write (950)
    Read (B)                        Read (1000)
    B = B + 50                      B = 1000 + 50
    Write (B)                       Write (1050)
    Commit                          Commit
                  Read (A)          Read (950)
                  temp = A * 0.1    temp = 950 * 0.1
                  A = A - temp      A = 950 - 95
                  Write (A)         Write (855)
                  Read (B)          Read (1050)
                  B = B + temp      B = 1050 + 95
                  Write (B)         Write (1145)
                  Commit            Commit
  • 36.
    Schedule 2
    • A serial schedule where T2 is followed by T1
  • 37.
    Schedule 2 (initially A = B = 1000)
    T1            T2                Execution
                  Read (A)          Read (1000)
                  temp = A * 0.1    temp = 1000 * 0.1
                  A = A - temp      A = 1000 - 100
                  Write (A)         Write (900)
                  Read (B)          Read (1000)
                  B = B + temp      B = 1000 + 100
                  Write (B)         Write (1100)
                  Commit            Commit
    Read (A)                        Read (900)
    A = A - 50                      A = 900 - 50
    Write (A)                       Write (850)
    Read (B)                        Read (1100)
    B = B + 50                      B = 1100 + 50
    Write (B)                       Write (1150)
    Commit                          Commit
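Both serial schedules can be replayed directly. In this sketch t1 and t2 are hypothetical function names for the two transactions, and 10% is computed with integer division so that the arithmetic stays exact (an assumption that holds for the balances used here).

```python
def t1(db):
    db["A"] -= 50                      # transfer 50 from A to B
    db["B"] += 50


def t2(db):
    temp = db["A"] // 10               # 10% of the balance of A
    db["A"] -= temp
    db["B"] += temp


def run_serial(order):
    db = {"A": 1000, "B": 1000}
    for txn in order:
        txn(db)                        # each transaction runs to completion
    return db
```

run_serial([t1, t2]) reproduces Schedule 1 (A = 855, B = 1145) and run_serial([t2, t1]) reproduces Schedule 2 (A = 850, B = 1150). The two final states differ, but each is consistent because A + B = 2000 in both.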
  • 38.
    Serial Schedule
    • A serial schedule is a schedule in which no transaction starts until the running transaction has ended.
    • A serial schedule is a schedule in which one transaction is executed completely before another transaction starts.
    • Transactions are executed one after the other.
    • This type of schedule is called a serial schedule because transactions are executed in a serial manner.
    • Executing transactions serially always preserves a consistent state.
    • If there are n transactions in a schedule, the number of possible serial schedules is n!.
    • Disadvantages:
      • They limit concurrency.
      • CPU time may be wasted when the active transaction has to wait.
      • They may lead to long waiting times for transactions that are queued behind a long transaction.
  • 39.
    Serial Schedule T1 T2 Read(A) A = A - 50 Write (A) Read (B) B = B + 50 Write (B) Commit Read (A) temp = A * 0.1 A = A - temp Write (A) Read (B) B = B + temp Write (B) Commit Serial Schedule T1 T2 Read (A) A = A - 50 Write (A) Read (B) B = B + 50 Write (B) Commit Read (A) temp = A * 0.1 A = A - temp Write (A) Read (B) B = B + temp Write (B) Commit Example of Serial Schedule
  • 40.
    Non-serial Schedule (Interleaved Schedule)
    • A schedule that interleaves the execution of different transactions.
    • That is, the second transaction starts before the first one has ended, and execution can switch back and forth between the transactions.
    • There are many possible orders in which the system can execute the individual operations of the transactions.
  • 41.
    Non-serial Schedule (Interleaved Schedule)
    • Non-serial schedules can be further divided into serializable and non-serializable schedules.
  • 42.
    Serializable Schedule
    • A schedule is serializable if it is equivalent to a serial schedule of the same n transactions.
    • In serial schedules, only one transaction is allowed to execute at a time, i.e. no concurrency is allowed.
    • In serializable schedules, by contrast, multiple transactions can execute simultaneously, i.e. concurrency is allowed.
    • Types (forms) of serializability:
      • Conflict serializability
      • View serializability
  • 43.
    Conflicting Instructions
    • Instructions li and lj of transactions Ti and Tj respectively conflict if and only if there exists some item Q accessed by both li and lj, and at least one of these instructions wrote Q.
      1. li = read(Q), lj = read(Q). li and lj don't conflict.
      2. li = read(Q), lj = write(Q). They conflict.
      3. li = write(Q), lj = read(Q). They conflict.
      4. li = write(Q), lj = write(Q). They conflict.
    • Intuitively, a conflict between li and lj forces a (logical) temporal order between them.
    • If li and lj are consecutive in a schedule and they do not conflict, their results would remain the same even if they had been interchanged in the schedule.
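The conflict rule is a one-line predicate. In this sketch an instruction is a tuple (transaction, action, item) with action "read" or "write"; that representation and the name conflicts are assumptions made for the example.

```python
def conflicts(op_i, op_j):
    """True if the two instructions conflict in the sense defined above."""
    (txn_i, action_i, item_i), (txn_j, action_j, item_j) = op_i, op_j
    return (
        txn_i != txn_j                         # instructions of different transactions
        and item_i == item_j                   # both access the same item Q
        and "write" in (action_i, action_j)    # at least one of them writes Q
    )
```

Only the read/read case on the same item returns False; instructions on different items never conflict.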
  • 44.
    Conflicting instructions
    • Let li and lj be two instructions of transactions Ti and Tj respectively.
      1. li = read(Q), lj = read(Q): li and lj don't conflict.
      2. li = read(Q), lj = write(Q): li and lj conflict.
      3. li = write(Q), lj = read(Q): li and lj conflict.
      4. li = write(Q), lj = write(Q): li and lj conflict.
  • 45.
    Conflict serializability
    • If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent.
    • We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule.
    • Equivalently, if a given schedule can be converted into a serial schedule by swapping its non-conflicting operations, it is called a conflict serializable schedule.
  • 46.
    Conflict serializability (Example)
    • Schedule 3 can be transformed into Schedule 6, a serial schedule where T2 follows T1, by a series of swaps of non-conflicting instructions.
    • Therefore Schedule 3 is conflict serializable.
  • 47.
    Conflict serializability (Example)
    Schedule 3
    T1            T2
    Read (A)
    A = A - 50
    Write (A)
                  Read (A)
                  temp = A * 0.1
                  A = A - temp
                  Write (A)
    Read (B)
    B = B + 50
    Write (B)
    Commit
                  Read (B)
                  B = B + temp
                  Write (B)
                  Commit
    Schedule 6
    T1            T2
    Read (A)
    A = A - 50
    Write (A)
    Read (B)
    B = B + 50
    Write (B)
    Commit
                  Read (A)
                  temp = A * 0.1
                  A = A - temp
                  Write (A)
                  Read (B)
                  B = B + temp
                  Write (B)
                  Commit
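Conflict equivalence of Schedule 3 and Schedule 6 can be checked mechanically. Instead of performing the swaps, the sketch below uses an equivalent criterion: two schedules over the same operations are conflict equivalent when every pair of conflicting operations appears in the same order in both. Operations are written as (transaction, action, item) tuples with action "r" or "w"; the sketch assumes each such tuple occurs at most once per schedule, and all names are hypothetical.

```python
def conflicting_pairs(schedule):
    """Set of ordered pairs (earlier_op, later_op) that conflict."""
    pairs = set()
    for i, (ti, ai, qi) in enumerate(schedule):
        for tj, aj, qj in schedule[i + 1:]:
            if ti != tj and qi == qj and "w" in (ai, aj):
                pairs.add(((ti, ai, qi), (tj, aj, qj)))
    return pairs


def conflict_equivalent(s1, s2):
    same_operations = sorted(s1) == sorted(s2)
    return same_operations and conflicting_pairs(s1) == conflicting_pairs(s2)


# Read/write skeletons of Schedule 3 (interleaved) and Schedule 6 (serial T1, T2).
SCHEDULE_3 = [(1, "r", "A"), (1, "w", "A"), (2, "r", "A"), (2, "w", "A"),
              (1, "r", "B"), (1, "w", "B"), (2, "r", "B"), (2, "w", "B")]
SCHEDULE_6 = [(1, "r", "A"), (1, "w", "A"), (1, "r", "B"), (1, "w", "B"),
              (2, "r", "A"), (2, "w", "A"), (2, "r", "B"), (2, "w", "B")]
```

conflict_equivalent(SCHEDULE_3, SCHEDULE_6) is True, whereas comparing Schedule 3 with the serial order T2, T1 gives False because every conflicting pair is reversed.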
  • 48.
    Conflict serializability (Example)
    • Example of a schedule that is not conflict serializable: T1 T2 Read (A) Read (A) Write (A)
    • We are unable to swap instructions in this schedule to obtain either the serial schedule <T1, T2> or the serial schedule <T2, T1>.
  • 49.
  • 50.
    Testing for Serializability
    • Consider some schedule of a set of transactions T1, T2, ..., Tn.
    • Precedence graph: a directed graph where the vertices are the transactions (names).
    • We draw an arc from Ti to Tj if the two transactions conflict, and Ti accessed the data item on which the conflict arose earlier.
    • We may label the arc by the item that was accessed.
    • Example 1 (figure): a precedence graph whose arcs are labelled x and y.
  • 51.
    Testing for Serializability
    Algorithm to test conflict serializability of a schedule S
    (1) For each transaction Ti participating in S, create a node labelled Ti.
    (2) For each case in S where Tj executes read_item(X) after Ti executes write_item(X), create an edge (Ti → Tj) in the graph.
    (3) For each case in S where Tj executes write_item(X) after Ti executes read_item(X), create an edge (Ti → Tj) in the precedence graph.
    (4) For each case in S where Tj executes write_item(X) after Ti executes write_item(X), create an edge (Ti → Tj) in the graph.
    (5) The schedule S is serializable if and only if the graph has no cycle.
    • If there is no cycle in the precedence graph, we can construct a serial schedule S' which is conflict equivalent to schedule S.
    • We may run a topological sort on the graph to obtain the serial schedule. This is a linear order consistent with the partial order of the graph.
  • 52.
    Testing for Serializability
    Example: S1: r2(Z), r2(Y), w2(Y); r3(Y), r3(Z); r1(X), w1(X); w3(Y), w3(Z); r2(X); r1(Y), w1(Y); w2(X);
    • Precedence graph edges: T1 → T2 (X), T2 → T1 (Y), T2 → T3 (Y, Z), T3 → T1 (Y).
    • There is no equivalent serial schedule because there is a cycle between T1 and T2 and another cycle between T1, T2, and T3.
  • 53.
    Testing for Serializability
    S2: r3(Y), r3(Z); r1(X), w1(X); w3(Y), w3(Z); r2(Z); r1(Y), w1(Y); r2(Y), w2(Y), r2(X), w2(X);
    • Precedence graph edges: T1 → T2 (X, Y), T3 → T2 (Y, Z), T3 → T1 (Y).
    • The graph has no cycle; the equivalent serial schedule is T3 → T1 → T2.
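The precedence-graph test can be sketched with the standard-library graphlib module (Python 3.9 or later). The helper names parse, precedence_graph and serial_order are hypothetical; the two schedule strings are S1 and S2 from the examples above.

```python
import re
from graphlib import CycleError, TopologicalSorter


def parse(text):
    """Turn 'r2(Z), w2(Y); ...' into [(2, 'r', 'Z'), (2, 'w', 'Y'), ...]."""
    return [(int(t), a, q) for a, t, q in re.findall(r"([rw])(\d+)\((\w+)\)", text)]


def precedence_graph(schedule):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (ti, ai, qi) in enumerate(schedule):
        for tj, aj, qj in schedule[i + 1:]:
            if ti != tj and qi == qj and "w" in (ai, aj):
                edges.add((ti, tj))
    return edges


def serial_order(schedule):
    """Equivalent serial order from a topological sort, or None if cyclic."""
    predecessors = {t: set() for t, _, _ in schedule}
    for ti, tj in precedence_graph(schedule):
        predecessors[tj].add(ti)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


S1 = "r2(Z), r2(Y), w2(Y); r3(Y), r3(Z); r1(X), w1(X); w3(Y), w3(Z); r2(X); r1(Y), w1(Y); w2(X);"
S2 = "r3(Y), r3(Z); r1(X), w1(X); w3(Y), w3(Z); r2(Z); r1(Y), w1(Y); r2(Y), w2(Y), r2(X), w2(X);"
```

serial_order(parse(S1)) returns None because the graph is cyclic, and serial_order(parse(S2)) returns [3, 1, 2], i.e. the serial schedule T3, T1, T2.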
  • 54.
  • 55.
    View serializability
    • Let S1 and S2 be two schedules with the same set of transactions. S1 and S2 are view equivalent if the following three conditions are satisfied for each data item Q:
      • Initial Read
      • Updated Read
      • Final Write
    • If a schedule is view equivalent to a serial schedule, then the given schedule is said to be view serializable.
  • 56.
    Initial Read
    • If in schedule S1 transaction Ti reads the initial value of Q, then in schedule S2 transaction Ti must also read the initial value of Q.
    • Schedules S1 and S3 are not view equivalent because the initial read operation in S1 is done by T1 and in S3 it is done by T2.
    • Schedules S1 and S2 are view equivalent because the initial read operation in S1 is done by T1 and in S2 it is also done by T1.
    S1 T1 T2 Read (A) Write (A) S3 T1 T2 Write (A) Read (A) S2 T1 T2 Read (A) Write (A)
  • 57.
    Update Read
    • If in schedule S1 transaction Ti executes read(Q), and that value was produced by transaction Tj (if any), then in schedule S2 transaction Ti must also read the value of Q that was produced by transaction Tj.
    • Schedules S1 and S3 are not view equivalent because in S1, T3 reads the value of A written by T2, whereas in S3, T3 reads the value of A written by T1.
    • Schedules S1 and S2 are view equivalent because in S1, T3 reads the value of A written by T2, and in S2, T3 also reads the value of A written by T2.
    S1 T1 T2 T3 Write (A) Write (A) Read (A) S3 T1 T2 T3 Write (A) Write (A) Read (A) S2 T1 T2 T3 Write (A) Write (A) Read (A)
  • 58.
    Final Write
    • If Ti performs the final write on a data item in S1, then it must also perform the final write on that data item in S2.
    • Schedules S1 and S3 are not view equivalent because the final write operation in S1 is done by T3, whereas in S3 it is done by T1.
    • Schedules S1 and S2 are view equivalent because the final write operation in S1 is done by T3 and in S2 it is also done by T3.
    S1 T1 T2 T3 Write (A) Read (A) Write (A) S3 T1 T2 T3 Write (A) Write (A) Read (A) S2 T1 T2 T3 Write (A) Read (A) Write (A)
  • 59.
    View serializable example (Initial Read)
    • In schedule S1, transaction T1 first reads the data item A. In S2, transaction T1 also first reads the data item A.
    • In schedule S1, transaction T1 first reads the data item B. In S2, the first read operation on B is also performed by T1.
    • The initial read condition is satisfied for both schedules.
    • Non-serial schedule (S1): T1: Read (A), Write (A); T2: Read (A), Write (A); T1: Read (B), Write (B); T2: Read (B), Write (B).
    • Serial schedule (S2): T1: Read (A), Write (A), Read (B), Write (B); T2: Read (A), Write (A), Read (B), Write (B).
  • 60.
    View serializable example (Update Read)
    • In schedule S1, transaction T2 reads the value of A written by T1. In S2, the same transaction T2 reads A after it is written by T1.
    • In schedule S1, transaction T2 reads the value of B written by T1. In S2, the same transaction T2 reads the value of B after it is updated by T1.
    • The updated read condition is also satisfied for both schedules.
    • Non-serial schedule (S1): T1: Read (A), Write (A); T2: Read (A), Write (A); T1: Read (B), Write (B); T2: Read (B), Write (B).
    • Serial schedule (S2): T1: Read (A), Write (A), Read (B), Write (B); T2: Read (A), Write (A), Read (B), Write (B).
  • 61.
    View serializable example (Final Write)
    • In schedule S1, the final write operation on A is done by transaction T2. In S2, transaction T2 also performs the final write on A.
    • In schedule S1, the final write operation on B is done by transaction T2. In schedule S2, the final write on B is also done by T2.
    • The final write condition is also satisfied for both schedules.
    • Non-serial schedule (S1): T1: Read (A), Write (A); T2: Read (A), Write (A); T1: Read (B), Write (B); T2: Read (B), Write (B).
    • Serial schedule (S2): T1: Read (A), Write (A), Read (B), Write (B); T2: Read (A), Write (A), Read (B), Write (B).
  • 62.
    View serializable example
    • Since all three conditions for view equivalence are satisfied in this example, S1 and S2 are view equivalent.
    • Also, since S2 is a serial schedule of the same transactions as S1, we can say that S1 is a view serializable schedule.
    • Non-serial schedule (S1): T1: Read (A), Write (A); T2: Read (A), Write (A); T1: Read (B), Write (B); T2: Read (B), Write (B).
    • Serial schedule (S2): T1: Read (A), Write (A), Read (B), Write (B); T2: Read (A), Write (A), Read (B), Write (B).
  • 63.
    View serializable
    Relationship between view and conflict equivalence:
    • The two are the same under the constrained write assumption, which assumes that if T writes X, it is constrained by the value of X it read; i.e., new X = f(old X).
    • Conflict serializability is stricter than view serializability.
    • With unconstrained writes (blind writes), a schedule that is view serializable is not necessarily conflict serializable.
    • Any conflict serializable schedule is also view serializable, but not vice versa. (Figure: the set of conflict serializable schedules lies inside the set of view serializable schedules; the difference is due to blind writes.)
    • Consider the following schedule of three transactions T1: r1(X), w1(X); T2: w2(X); and T3: w3(X):
      Schedule Sa: r1(X); w2(X); w1(X); w3(X); c1; c2; c3;
  • 64.
    View serializable
    Consider the following schedule of three transactions T1: r1(A), w1(A); T2: w2(A); and T3: w3(A):
    Schedule S1: r1(A); w2(A); w1(A); w3(A); c1; c2; c3;
    • With 3 transactions, the total number of possible serial schedules is 3! = 6.
    • In S1, the operations w2(A) and w3(A) are blind writes, since T2 and T3 do not read the value of A. [A blind write does not depend on the old value of A; it just updates the value.]
    • S1 is view serializable, since it is view equivalent to the serial schedule T1, T2, T3.
    • However, S1 is not conflict serializable, since it is not conflict equivalent to any serial schedule.
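View serializability of this blind-write schedule can be checked by brute force over the n! serial orders. The sketch below summarises a schedule by its reads-from relation and its final writers, which together capture the initial read, updated read and final write conditions. The names are hypothetical, commits are omitted, and writer 0 stands for the initial value (an assumption that requires transaction numbers to start at 1).

```python
from itertools import permutations


def view_profile(schedule):
    """Return (reads_from, final_writer) for a list of (txn, action, item)."""
    last_writer = {}                   # item -> transaction that last wrote it
    reads_from = []                    # (reader, item, writer); writer 0 = initial value
    for txn, action, item in schedule:
        if action == "r":
            reads_from.append((txn, item, last_writer.get(item, 0)))
        else:
            last_writer[item] = txn
    return sorted(reads_from), last_writer


def view_serializable(schedule):
    """Return a view-equivalent serial order, or None if there is none."""
    target = view_profile(schedule)
    txns = sorted({t for t, _, _ in schedule})
    for order in permutations(txns):
        serial = [op for t in order for op in schedule if op[0] == t]
        if view_profile(serial) == target:
            return order
    return None


# Schedule S1 from the slide: r1(A); w2(A); w1(A); w3(A)
S1 = [(1, "r", "A"), (2, "w", "A"), (1, "w", "A"), (3, "w", "A")]
```

view_serializable(S1) returns (1, 2, 3), matching the serial schedule T1, T2, T3 named above, even though S1 is not conflict serializable.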
  • 65.
  • 66.
    Equivalent Schedule
    • If two schedules produce the same result after execution, they are said to be equivalent schedules.
    • In other words, two schedules S1 and S2 are said to be equivalent if they produce the same final database state.
    • Types of equivalent schedule:
  • 67.
    Equivalent Schedule
    1) Result equivalent schedule: Two schedules are said to be result equivalent if they produce the same final database state for some initial value of data.
    2) Conflict equivalent schedule: Two schedules are said to be conflict equivalent if the conflicting operations in both schedules are executed in the same order.
    3) View equivalent schedule: Two schedules S and S’ are said to be view equivalent if the following three conditions are met for each data item (say A).
       1) Initial read: If transaction Ti reads the initial value of A in schedule S, then in schedule S’ transaction Ti also reads the initial value of A.
       2) Final write: If transaction Ti performs the final write of A in schedule S, then in schedule S’ transaction Ti also performs the final write of A.
       3) Update read (producer-consumer): If transaction Ti executes read(A) in schedule S and that value was produced by transaction Tj, then in schedule S’ Ti must also read the value of A that was produced by Tj.
  • 68.
    Equivalent Schedule
    Schedule-1 (A = B = 1000)
    T1            T2
    Read (A)
    A = A - 50
    Write (A)
                  Read (A)
                  temp = A * 0.1
                  A = A - temp
                  Write (A)
    Read (B)
    B = B + 50
    Write (B)
    Commit
                  Read (B)
                  B = B + temp
                  Write (B)
                  Commit
    Schedule-2 (A = B = 1000)
    T1            T2
    Read (A)
    A = A - 50
    Write (A)
    Read (B)
    B = B + 50
    Write (B)
    Commit
                  Read (A)
                  temp = A * 0.1
                  A = A - temp
                  Write (A)
                  Read (B)
                  B = B + temp
                  Write (B)
                  Commit
    • Both schedules are equivalent.
    • In both schedules the sum A + B is preserved.
  • 69.
    Schedule 3
    • Let T1 and T2 be the transactions defined previously.
    • The following schedule is not a serial schedule, but it is equivalent to Schedule 1. In Schedules 1 and 3, the sum A + B is preserved.
    (Figure: Schedule 1 and Schedule 3 shown side by side.)
  • 70.
    Classification of Schedule based on Recoverability
  • 71.
    Non-Serializable Schedule
    • A non-serial schedule which is not serializable is called a non-serializable schedule.
    • Characteristics: non-serializable schedules
      • may or may not be consistent
      • may or may not be recoverable
    Irrecoverable Schedules
    • If, in a schedule, a transaction performs a dirty read operation from an uncommitted transaction and commits before the transaction from which it has read the value, then such a schedule is known as an irrecoverable schedule.
  • 72.
    Irrecoverable Schedule
    Example: Here,
    • T2 performs a dirty read operation.
    • T2 commits before T1.
    • T1 fails later and rolls back.
    • The value that T2 read is now incorrect.
    • T2 cannot recover since it has already committed.
  • 73.
    Recoverability
    Schedules classified on recoverability:
    • Recoverable schedule:
      • One where no committed transaction needs to be rolled back.
      • A schedule S is recoverable if no transaction T in S commits until all transactions T’ that have written an item that T reads have committed.
    • Cascadeless schedule:
      • One where every transaction reads only the items that are written by committed transactions.
    • Schedules requiring cascaded rollback:
      • A schedule in which uncommitted transactions that read an item from a failed transaction must be rolled back.
    • Strict schedules:
      • A schedule in which a transaction can neither read nor write an item X until the last transaction that wrote X has committed.
  • 74.
    Recoverable Schedule
    • A schedule is recoverable if each transaction commits only after all transactions from which it has read have committed.
    • In other words, if some transaction Tj reads a value updated or written by some other transaction Ti, then the commit of Tj must occur after the commit of Ti.
    • (Figure: transaction Tk must commit only after the transactions T1, T2 and T3 have committed.)
  • 75.
    Recoverable Vs Nonrecoverable Schedule In a recoverable schedule, a transaction Tk can only commit if: transaction Tk has not read (= used) dirty data
  • 76.
    Cascading Schedule
    • If, in a schedule, the failure of one transaction causes several other dependent transactions to roll back or abort, then such a schedule is called a cascading schedule (also cascading rollback or cascading abort).
    • It leads to wasted CPU time.
  • 77.
    Cascadeless Schedule
    • If, in a schedule, a transaction is not allowed to read a data item until the last transaction that has written it has committed or aborted, then such a schedule is called a cascadeless schedule.
    • In other words, a cascadeless schedule allows only committed read operations.
    • Therefore, it avoids cascading rollback and thus saves CPU time.
    NOTE: a cascadeless schedule still allows uncommitted write operations.
  • 78.
    Strict Schedule
    • If, in a schedule, a transaction is allowed neither to read nor to write a data item until the last transaction that has written it has committed or aborted, then such a schedule is called a strict schedule.
    • Every strict schedule is both recoverable and cascadeless.
    • In other words, a strict schedule allows only committed read and write operations.
    • Clearly, a strict schedule imposes more restrictions than a cascadeless schedule.
  • 79.
    Recoverability
    • Strict schedules are more restrictive than cascadeless schedules.
    • All strict schedules are cascadeless schedules.
    • Not all cascadeless schedules are strict schedules.
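The recoverability classes can be told apart with a single pass over a schedule. The sketch below is a simplified classifier: operations are (transaction, action, item) tuples with action "r", "w" or "c" (commit, item None), aborts are not modelled, and a transaction that never commits is treated as committing after the end of the schedule. The function name classify and the sample schedules in the usage note are hypothetical.

```python
def classify(schedule):
    """Return 'strict', 'cascadeless', 'recoverable' or 'irrecoverable'."""
    end = len(schedule)
    commit_at = {t: i for i, (t, a, _) in enumerate(schedule) if a == "c"}
    recoverable = cascadeless = strict = True
    last_writer = {}                               # item -> last transaction that wrote it
    for i, (txn, action, item) in enumerate(schedule):
        if action == "c":
            continue
        writer = last_writer.get(item)
        if writer is not None and writer != txn:
            uncommitted = commit_at.get(writer, end) > i
            if uncommitted:
                strict = False                     # read or write of uncommitted data
                if action == "r":
                    cascadeless = False            # dirty read
                    if commit_at.get(txn, end) < commit_at.get(writer, end):
                        recoverable = False        # reader commits before the writer
        if action == "w":
            last_writer[item] = txn
    if strict:
        return "strict"
    if cascadeless:
        return "cascadeless"
    return "recoverable" if recoverable else "irrecoverable"
```

For example, w1(X) r2(X) c2 c1 is classified as irrecoverable, w1(X) r2(X) c1 c2 as recoverable (but cascading), w1(X) w2(X) c1 c2 as cascadeless, and w1(X) c1 r2(X) c2 as strict.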
  • 80.
  • 81.
    Concurrency Control
    1 Purpose of Concurrency Control
    • To enforce isolation (through mutual exclusion) among conflicting transactions.
    • To preserve database consistency through consistency-preserving execution of transactions.
    • To resolve read-write and write-write conflicts.
    • Example: in a concurrent execution environment, if T1 conflicts with T2 over a data item A, then the concurrency control mechanism decides whether T1 or T2 should get A and whether the other transaction is rolled back or waits.
    2 Introduction
    • Most of these techniques ensure serializability of schedules by using a concurrency control protocol (i.e. a set of rules) that guarantees serializability.
  • 82.
    Concurrency Control
    Classification of Techniques
    • Locking: locking the data items to prevent multiple transactions from accessing the items concurrently; a number of locking protocols have been proposed.
    • Use of timestamps: a timestamp is a unique identifier for each transaction, generated by the system. Timestamp ordering uses these identifiers to order the execution of conflicting operations.
    • Multiversion: multiversion concurrency control uses multiple versions of a data item.
    • Optimistic concurrency control: based on the concept of validation or certification of a transaction after it executes its operations; these are sometimes called optimistic protocols.
  • 83.
    Locking Technique
    • Lock: a variable associated with a data item that describes the status of the item with respect to possible operations that can be applied to it.
    • Generally, there is one lock for each data item in the database.
  • 84.
    What is Lock?
    • A lock is a variable associated with a data item to control concurrent access to that data item.
    • Locking is a strategy used to prevent concurrent access to data.
    • (Figure: a database item with a lock variable that takes the value 0 or 1.)
  • 85.
    Lock-based Protocol
    • Data items can be locked in two modes:
      • Shared (S) mode: with this lock we can only read the item, not write it.
      • Exclusive (X) mode: with this lock we can read as well as write the item.
    • Lock-compatibility matrix (rows: lock held by T1; columns: lock requested by T2):
                        Shared lock            Exclusive lock
      Shared lock       Yes (compatible)       No (not compatible)
      Exclusive lock    No (not compatible)    No (not compatible)
    • A transaction may be granted a lock on an item if the requested lock is compatible with the locks already held on the item by other transactions.
    • If a lock cannot be granted, the requesting transaction is made to wait until all incompatible locks held by other transactions have been released. The lock is then granted.
    • Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive lock on the item, no other transaction can hold any lock on the item.
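The compatibility matrix translates directly into a lookup table. In this sketch "S" and "X" denote shared and exclusive mode; the names COMPATIBLE and can_grant are assumptions made for the example.

```python
# (mode already held by another transaction, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_by_others):
    """Grant only if the request is compatible with every lock held by others."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)
```

can_grant("S", ["S", "S"]) is True, while can_grant("X", ["S"]) and can_grant("S", ["X"]) are both False, so the requesting transaction would have to wait.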
  • 86.
    Lock-based Protocol
    • A lock associated with an item X, lock(X), has three possible states: read-locked, write-locked, and unlocked.
    • The corresponding operations are read-lock(X), write-lock(X), and unlock(X).
    • A read-locked item is also called shared-locked, because other transactions are allowed to read the item.
    • A write-locked item is called exclusive-locked, because a single transaction exclusively holds the lock on the item.
  • 87.
    Lock-based Protocol
    Rules for read/write locks:
    • A transaction T must issue read-lock(X) or write-lock(X) before any read-item(X) operation is performed in T.
    • A transaction T must issue write-lock(X) before any write-item(X) operation is performed in T.
    • A transaction T must issue unlock(X) after all read-item(X) and write-item(X) operations are completed in T.
    • A transaction T will not issue unlock(X) unless it already holds a read (shared) lock or a write (exclusive) lock on item X.
  • 88.
    Lock-based Protocol
    • This locking protocol divides the execution of a transaction into three parts:
      1. When the transaction starts executing, it creates a list of the data items on which it needs locks and requests all of those locks from the system.
      2. Once the transaction has acquired all its locks and needs no others, it keeps executing its operations.
      3. As soon as the transaction releases its first lock, the third part starts. From then on the transaction cannot request any lock; it can only release the locks it has acquired.
    [Figure: timeline from transaction begin to transaction end, showing the lock acquisition phase, transaction execution, and the lock releasing phase]
  • 89.
    Two-Phase Locking Protocol
    • This protocol works in two phases:
      1. Growing phase
         • In this phase a transaction obtains locks but cannot release any lock.
         • The point at which a transaction acquires its final lock is called the lock point.
      2. Shrinking phase
         • In this phase a transaction can release locks but cannot obtain any lock.
         • The transaction enters the shrinking phase as soon as it releases its first lock after crossing the lock point.
    [Figure: timeline from transaction begin to transaction end, showing the growing phase followed by the shrinking phase]
  • 90.
    Two-Phase Locking Protocol
    • With lock conversion (if lock conversion is allowed):
      1. Upgrading of locks (from read-locked to write-locked) must be done during the growing (expanding) phase.
      2. Downgrading of locks (from write-locked to read-locked) must be done in the shrinking phase. Hence, a read-lock(X) operation that downgrades an already held write lock on X can appear only in the shrinking phase.
    • Requirement: For a transaction, the two phases must be mutually exclusive; that is, the unlocking phase must not start during the locking phase, and the locking phase must not begin again during the unlocking phase.
    • Claims:
      1. If every transaction in a schedule follows the two-phase locking protocol, the schedule is guaranteed to be serializable, so schedules no longer need to be tested for serializability.
      2. If the locking mechanism enforces the two-phase locking rules, it in effect enforces serializability of schedules.
  • 91.
    Two-Phase Locking Protocol

      T1                  T2                  Result
      read_lock(Y);       read_lock(X);       Initial values: X=20, Y=30
      read_item(Y);       read_item(X);
      unlock(Y);          unlock(X);          Result of serial execution T1 followed by T2:
      write_lock(X);      write_lock(Y);      X=50, Y=80
      read_item(X);       read_item(Y);
      X := X + Y;         Y := X + Y;         Result of serial execution T2 followed by T1:
      write_item(X);      write_item(Y);      X=70, Y=50
      unlock(X);          unlock(Y);

    • Transactions T1 and T2 do not follow the two-phase locking protocol, because the write_lock(X) operation follows the unlock(Y) operation in T1, and similarly the write_lock(Y) operation follows the unlock(X) operation in T2.
    • If we enforce two-phase locking, the transactions can be rewritten as T1' and T2'.
    • The unlocking phase must not start during the locking phase, and the locking phase must not begin again during the unlocking phase.
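The two serial results quoted on this slide can be checked with a few lines of Python. This is an illustrative sketch with made-up function names. The interleaving at the end is one schedule that the non-two-phase locking of T1 and T2 would permit; it is added here as an assumption to show a result that matches neither serial order.

```python
def t1(db):
    # T1: X := X + Y
    db["X"] = db["X"] + db["Y"]


def t2(db):
    # T2: Y := X + Y
    db["Y"] = db["X"] + db["Y"]


def run_serial(order):
    db = {"X": 20, "Y": 30}  # initial values from the slide
    for txn in order:
        txn(db)
    return db


def run_interleaved():
    # T1 reads Y and unlocks it, T2 then runs completely, then T1 finishes.
    db = {"X": 20, "Y": 30}
    y_seen_by_t1 = db["Y"]             # T1: read_item(Y); unlock(Y)
    t2(db)                             # T2 runs to completion
    db["X"] = db["X"] + y_seen_by_t1   # T1: X := X + Y, using its stale Y
    return db


print(run_serial([t1, t2]))   # {'X': 50, 'Y': 80}
print(run_serial([t2, t1]))   # {'X': 70, 'Y': 50}
print(run_interleaved())      # {'X': 50, 'Y': 50}
```

The interleaved run leaves X=50, Y=50, which neither serial order produces, so that schedule is not serializable.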
  • 92.
    Two-Phase Locking Protocol

      T1'                 T2'
      read_lock(Y);       read_lock(X);
      read_item(Y);       read_item(X);
      write_lock(X);      write_lock(Y);
      unlock(Y);          unlock(X);
      read_item(X);       read_item(Y);
      X := X + Y;         Y := X + Y;
      write_item(X);      write_item(Y);
      unlock(X);          unlock(Y);

    • T1' and T2' follow the two-phase policy, but they are subject to deadlock, which must be dealt with.
    • T1' issues its write_lock(X) before it unlocks item Y; consequently, when T2' issues its read_lock(X), it is forced to wait until T1' releases the lock by issuing unlock(X). However, this can lead to deadlock.
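Whether a single transaction is two-phase can be checked mechanically: no lock request may appear after the first unlock. The sketch below is a hypothetical helper, not from the slides, applied to T1 from the previous slide and T1' from this one.

```python
def follows_2pl(ops):
    """ops is a list of (action, item) pairs in program order.
    Returns True if no lock operation follows any unlock operation."""
    unlock_seen = False
    for action, _item in ops:
        if action == "unlock":
            unlock_seen = True
        elif action in ("read_lock", "write_lock") and unlock_seen:
            return False  # growing after shrinking has started
    return True


T1 = [
    ("read_lock", "Y"), ("read_item", "Y"), ("unlock", "Y"),
    ("write_lock", "X"), ("read_item", "X"), ("write_item", "X"),
    ("unlock", "X"),
]
T1_PRIME = [
    ("read_lock", "Y"), ("read_item", "Y"), ("write_lock", "X"),
    ("unlock", "Y"), ("read_item", "X"), ("write_item", "X"),
    ("unlock", "X"),
]

print(follows_2pl(T1))        # False: write_lock(X) follows unlock(Y)
print(follows_2pl(T1_PRIME))  # True
```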
  • 93.
    Two-Phase Locking Protocol

      T1                  T2                  T3
      Lock_X(A);
      read_item(A);
      write_item(A);
      Lock_S(B);
      read_item(B);
      Unlock(A);
                          Lock_X(A);
                          read_item(A);
                          write_item(A);
                          unlock(A);
                                              Lock_S(A);
                                              read_item(A);
      Fail (rollback)

    • Cascading rollback may occur under the two-phase locking protocol: if T1 fails at this point, T2 and T3, which have read values derived from T1's write, must be rolled back as well.
    • Deadlock may occur.
    • Cascading rollback can be avoided by a modification of two-phase locking.
  • 94.
  • 95.
    Two-Phase Locking Protocol
    Deadlock in 2PL
    Let's understand deadlock in the two-phase locking protocol with the help of an example.
    [Figure: schedule in which T1 and T2 request exclusive locks on R1 and R2]
    • In step 3, T1 cannot obtain the exclusive lock on R2, because transaction T2 already obtained the exclusive lock on R2 in step 1.
    • Also in step 3, T2 cannot obtain the exclusive lock on R1, because T1 already holds the exclusive lock on R1.
  • 96.
    Two-Phase Locking Protocol
    Two-phase locking may limit the amount of concurrency:
    • A transaction T must lock an additional item Y before it needs it, so that it can release X.
    Penalty to other transactions:
    • Another transaction seeking to access X may be forced to wait, even though T is done with X.
    OR
    • If Y is locked earlier than it is needed, another transaction seeking to access Y is forced to wait even though T is not using Y yet.
    Types of two-phase locking:
      1. Basic two-phase locking
      2. Conservative two-phase locking
      3. Strict two-phase locking
      4. Rigorous two-phase locking
  • 97.
    Conservative 2PL Protocol
    • Requires a transaction to lock all the items it accesses before the transaction begins execution, by predeclaring its read-set and write-set.
    • Note: The read-set of a transaction is the set of all items that the transaction reads, and the write-set is the set of all items that the transaction writes.
    • If any of the predeclared items cannot be locked, the transaction does not lock any item; instead, it waits until all the items are available for locking.
    OR
    • The transaction acquires all its locks before it starts and releases all of them after it commits.
    • Conservative 2PL is deadlock-free; the variant that holds all locks until commit also avoids cascading rollback.
    • It is difficult to use in practice because the read-set and write-set are hard to predeclare.
  • 98.
    Strict 2PL Protocol
    • The most popular variation of 2PL is strict 2PL, which guarantees strict schedules.
    • In this variation, a transaction T does not release any of its exclusive locks until after it commits or aborts.
    OR
    • A transaction must hold all its exclusive locks until it commits or aborts.
    • It is not a deadlock-free protocol.
    • It is practical to enforce and desirable because it ensures recoverability.
  • 99.
    Rigorous 2PL Protocol
    • A transaction T does not release any of its locks (exclusive or shared) until after it commits or aborts.
    • It behaves like strict 2PL but is more restrictive; it is easier to implement, since all locks are held until commit.
  • 100.
    All 2PL Protocols
    • Basic: All locking operations (read lock or write lock) precede the first unlock operation in the transaction.
    • Conservative: Prevents deadlock by locking all required data items before the transaction begins execution.
    • Strict: All exclusive locks taken by a transaction must be held until the transaction commits or aborts.
    • Rigorous: All locks (shared or exclusive) are held until the transaction commits or aborts.
    Limitation of 2PL
    • The use of locks can cause two additional problems: deadlock and starvation.
  • 101.
    Example 2PL Protocol
    Which of the following schedules are strict two-phase or rigorous?

      a)                  b)                  c)
      Lock_S(A)           Lock_S(A)           Lock_S(A)
      read_item(A)        read_item(A)        read_item(A)
      Lock_X(B)           Lock_X(B)           Lock_X(B)
      read_item(B)        Unlock(A)           read_item(B)
      Unlock(A)           read_item(B)        write_item(B)
      write_item(B)       write_item(B)       Commit
      Unlock(B)           Commit              Unlock(A)
                          Unlock(B)           Unlock(B)

    Answers:
      a) Basic 2PL
      b) Basic 2PL, Strict 2PL
      c) Basic 2PL, Strict 2PL, Rigorous 2PL
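The answers to this exercise can be reproduced with a small classifier. This is a sketch under stated assumptions: the operation names and the function `classify_2pl` are hypothetical, and a schedule that shows no commit (as in case a) is treated as unlocking before commit.

```python
def classify_2pl(ops):
    """ops is a list of (action, item) pairs; item is None for commit/abort.
    Returns the 2PL variants the sequence satisfies."""
    mode = {}            # item -> "S" or "X" for locks currently held
    unlock_seen = False
    finished = False     # becomes True at commit or abort
    basic = strict = rigorous = True
    for action, item in ops:
        if action in ("lock_s", "lock_x"):
            if unlock_seen:
                basic = False            # lock after an unlock: not two-phase
            mode[item] = "X" if action == "lock_x" else "S"
        elif action == "unlock":
            unlock_seen = True
            if not finished:
                rigorous = False         # some lock released before commit
                if mode.get(item) == "X":
                    strict = False       # exclusive lock released before commit
            mode.pop(item, None)
        elif action in ("commit", "abort"):
            finished = True
    labels = []
    if basic:
        labels.append("basic")
        if strict:
            labels.append("strict")
        if rigorous:
            labels.append("rigorous")
    return labels


A = [("lock_s", "A"), ("read", "A"), ("lock_x", "B"), ("read", "B"),
     ("unlock", "A"), ("write", "B"), ("unlock", "B")]
B = [("lock_s", "A"), ("read", "A"), ("lock_x", "B"), ("unlock", "A"),
     ("read", "B"), ("write", "B"), ("commit", None), ("unlock", "B")]
C = [("lock_s", "A"), ("read", "A"), ("lock_x", "B"), ("read", "B"),
     ("write", "B"), ("commit", None), ("unlock", "A"), ("unlock", "B")]

for name, ops in (("a", A), ("b", B), ("c", C)):
    print(name, classify_2pl(ops))
```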
  • 102.
  • 103.
    What is deadlock?
    • A deadlock is a situation in which two or more transactions are waiting for one another to give up locks.
    • Consider the following two transactions:

      T1                  T2
      Lock-X(A)           Lock-X(B)
      Write(A)            Write(B)
      Lock-X(B)           Lock-X(A)
      Write(B)            Write(A)

    • T1 is granted the lock on A and waits for B; T2 is granted the lock on B and waits for A.
  • 104.
    Deadlock Detection
    • A simple way to detect deadlock is with the help of a wait-for graph.
    • One node is created in the wait-for graph for each transaction that is currently executing.
    • Whenever a transaction Ti is waiting to lock an item X that is currently locked by a transaction Tj, a directed edge from Ti to Tj (Ti→Tj) is created in the wait-for graph.
    • When Tj releases the lock(s) on the items that Ti was waiting for, the directed edge is dropped from the wait-for graph.
    • We have a state of deadlock if and only if the wait-for graph has a cycle.
    • Each transaction involved in the cycle is then said to be deadlocked.
  • 105.
    Deadlock Detection
    • Transaction A is waiting for transactions B and C.
    • Transaction C is waiting for transaction B.
    • Transaction B is waiting for transaction D.
    • This wait-for graph has no cycle, so there is no deadlock state.
    • Suppose now that transaction D requests an item held by C. Then the edge D → C is added to the wait-for graph.
    • Now the graph contains the cycle B → D → C → B.
    • It means that transactions B, D and C are all deadlocked.
    [Figure: wait-for graph over transactions A, B, C and D, with the cycle marked DEADLOCK]
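Detecting a deadlock therefore reduces to finding a cycle in a directed graph. The sketch below uses a depth-first search over the example graph from this slide; the function name `find_cycle` and the dictionary representation are assumptions made for illustration.

```python
def find_cycle(wait_for):
    """wait_for maps each transaction to the transactions it waits for.
    Returns one cycle as a list of nodes (first node repeated at the end),
    or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:                       # back edge closes a cycle
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(wait_for):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


graph = {"A": ["B", "C"], "C": ["B"], "B": ["D"], "D": []}
print(find_cycle(graph))           # None: no deadlock

graph["D"].append("C")             # D now requests an item held by C
print(find_cycle(graph))           # ['B', 'D', 'C', 'B']
```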
  • 106.
    Deadlock Recovery
    • When a deadlock is detected, the system must recover from it.
    • The most common solution is to roll back one or more transactions to break the deadlock.
    • Choosing which transaction to abort is known as victim selection.
    • In the wait-for graph of the previous example, transactions B, D and C are deadlocked.
    • To remove the deadlock, one of these three transactions (B, D, C) must be rolled back.
    • We should roll back the transaction that incurs the minimum cost.
    • When a deadlock is detected, the choice of which transaction to abort can be made using the following criteria:
      • The transaction that holds the fewest locks
      • The transaction that has done the least work
      • The transaction that is farthest from completion
  • 107.
    Deadlock Prevention
    • Deadlock prevention protocols ensure that the system will never enter a deadlock state.
    • Some prevention strategies:
      • Require that each transaction locks all its data items before it begins execution (predeclaration).
      • Impose a partial ordering on all data items and require that a transaction lock data items only in the order specified by that partial order.
  • 108.
    Deadlock Prevention
    • The following schemes use transaction timestamps for the sake of deadlock prevention alone.
      1. Wait-die scheme (non-preemptive)
         • If an older transaction requests a resource held by a younger transaction, the older transaction is allowed to wait until the resource is available.
         • If a younger transaction requests a resource held by an older transaction, the younger transaction is killed (rolled back).
      2. Wound-wait scheme (preemptive)
         • If an older transaction requests a resource held by a younger transaction, the older transaction forces the younger transaction to abort and release the resource.
         • If a younger transaction requests a resource held by an older transaction, the younger transaction is allowed to wait until the older transaction releases it.

                                              Wait-die       Wound-wait
      O needs a resource held by Y            O waits        Y dies
      Y needs a resource held by O            Y dies         Y waits

      (O = older transaction, Y = younger transaction)
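Both schemes come down to one timestamp comparison between the requesting transaction and the lock holder. A minimal sketch, assuming that a smaller timestamp means an older transaction; the function names and return strings are hypothetical.

```python
def wait_die(ts_requester, ts_holder):
    """Non-preemptive: an older requester waits, a younger requester dies."""
    if ts_requester < ts_holder:        # requester is older
        return "requester waits"
    return "requester is rolled back"


def wound_wait(ts_requester, ts_holder):
    """Preemptive: an older requester wounds the holder, a younger one waits."""
    if ts_requester < ts_holder:        # requester is older
        return "holder is rolled back"
    return "requester waits"


OLD, YOUNG = 2, 4   # illustrative timestamps; smaller means older
print(wait_die(OLD, YOUNG), "|", wound_wait(OLD, YOUNG))
print(wait_die(YOUNG, OLD), "|", wound_wait(YOUNG, OLD))
```

In both schemes the transaction that is rolled back is always the younger one, which is why neither can produce a cycle of waiting transactions.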
  • 109.
    Deadlock Prevention
      3. Timeout-based schemes
         • A transaction waits for a lock only for a specified amount of time. After that, the wait times out and the transaction is rolled back, so a deadlock cannot persist.
         • Simple to implement, but it is difficult to determine a good value for the timeout interval.
  • 110.
    Starvation
    • A transaction is starved if it cannot proceed for an indefinite period of time while other transactions in the system continue normally.
    • This may occur if the waiting scheme for locked items is unfair, giving priority to some transactions over others.
    • Starvation can also occur in the algorithms for dealing with deadlock.
    • It occurs if an algorithm selects the same transaction as victim repeatedly, causing it to abort again and again and never finish execution.
    Techniques for preventing starvation:
      1. First come, first served.
      2. Allow some transactions to have priority over others, combined with an aging technique.
      3. Give higher priority to transactions that have been aborted repeatedly.
    Note: The wait-die and wound-wait schemes avoid starvation.
  • 111.
    Timestamp based Concurrency Control
    • This protocol uses either the system time or a logical counter as a timestamp.
    • Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction.
    • A transaction T1 created at clock time 0002 is older than all transactions that come after it.
    • For example, a transaction T2 entering the system at 0004 is two seconds younger than T1, and priority is given to the older one.
    • In addition, every data item carries its latest read timestamp and write timestamp. This lets the system know when the last read and write operations were performed on the data item.
    • The protocol is responsible for ensuring that conflicting pairs of operations are executed according to the timestamp values of the transactions.
    • The timestamp of transaction Ti is denoted TS(Ti).
    • The read timestamp of data item X is denoted R-timestamp(X).
    • The write timestamp of data item X is denoted W-timestamp(X).
  • 112.
    Timestamp based Concurrency Control
    • The timestamp ordering protocol works as follows:
      • If a transaction Ti issues a read(X) operation:
        • If TS(Ti) < W-timestamp(X): the operation is rejected and Ti is rolled back.
        • If TS(Ti) >= W-timestamp(X): the operation is executed, and R-timestamp(X) is set to the larger of R-timestamp(X) and TS(Ti).
      • If a transaction Ti issues a write(X) operation:
        • If TS(Ti) < R-timestamp(X): the operation is rejected and Ti is rolled back.
        • If TS(Ti) < W-timestamp(X): the operation is rejected and Ti is rolled back.
        • Otherwise, the operation is executed and W-timestamp(X) is set to TS(Ti).
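The read and write rules translate into two short checks. The sketch below is illustrative only: the class and function names are hypothetical, and the updates to R-timestamp(X) and W-timestamp(X) after a successful operation follow the usual textbook formulation of basic timestamp ordering.

```python
class DataItem:
    """A data item X with its read and write timestamps."""
    def __init__(self):
        self.r_ts = 0   # R-timestamp(X)
        self.w_ts = 0   # W-timestamp(X)


def to_read(item, ts):
    """Ti with timestamp ts issues read(X). False means Ti is rolled back."""
    if ts < item.w_ts:
        return False                      # X was already written by a younger txn
    item.r_ts = max(item.r_ts, ts)
    return True


def to_write(item, ts):
    """Ti with timestamp ts issues write(X). False means Ti is rolled back."""
    if ts < item.r_ts or ts < item.w_ts:
        return False                      # a younger txn already read or wrote X
    item.w_ts = ts
    return True


x = DataItem()
print(to_write(x, 4))   # True:  T2 (TS=4) writes X
print(to_read(x, 2))    # False: T1 (TS=2) is older than the last writer
print(to_read(x, 6))    # True:  R-timestamp(X) becomes 6
print(to_write(x, 5))   # False: 5 < R-timestamp(X)
```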
  • 113.
  • 114.
    Database Recovery
    • There are many situations in which a transaction may not reach a commit or abort point:
      • Operating system crash
      • DBMS crash
      • Power failure
      • Disk failure or other hardware failure
      • Human error
    • In any of the above situations, data in the database may become inconsistent or be lost.
    • For example, if a transaction has completed 30 out of 40 write instructions to the database when the DBMS crashes, the database may be in an inconsistent state, because only part of the transaction's work was completed.
    • Database recovery is the process of restoring the database and its data to a consistent state.
    • This may include restoring lost data up to the point of the event (e.g. a system crash).
  • 115.
    Log-based Recovery
    • The log is a sequence of log records that maintains information about update activities on the database.
    • The log is kept on stable storage (e.g. a hard disk).
    • The log contains:
      • Start of transaction
      • Transaction-id
      • Record-id
      • Type of operation (insert, update, delete)
      • Old value, new value
      • End of transaction (committed or aborted)
  • 116.
    Log-based Recovery
    • When transaction Ti starts, it registers itself by writing a record <Ti start> to the log.
    • Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is the value of X before the write (the old value) and V2 is the value to be written to X (the new value).
    • When Ti finishes its last statement, the log record <Ti commit> is written.
    • Undo of a log record <Ti, X, V1, V2> writes the old value V1 to X.
    • Redo of a log record <Ti, X, V1, V2> writes the new value V2 to X.
    • Types of log-based recovery method:
      • Immediate database modification
      • Deferred database modification
  • 117.
    Immediate v/s Deferred database modification
    • Immediate database modification: Updates (changes) to the database are applied immediately as they occur, without waiting to reach the commit point.
    • Deferred database modification: Updates (changes) to the database are deferred (postponed) until the transaction commits.
  • 118.
    Immediate v/s Deferred database modification
    Example transactions (initial values A=500, B=600, C=700):
      T1: Read(A); A = A - 100; Write(A); Read(B); B = B + 100; Write(B); Commit
      T2: Read(C); C = C - 200; Write(C); Commit

    Immediate database modification (log records hold the old and the new value):
      Records appended to the log                              Database afterwards
      <T1 start> <T1, A, 500, 400> <T1, B, 600, 700>           A=400, B=700, C=700
      <T1, Commit> <T2 start> <T2, C, 700, 500>                A=400, B=700, C=500
      <T2, Commit>                                             A=400, B=700, C=500

    Deferred database modification (log records hold only the new value):
      Records appended to the log                              Database afterwards
      <T1 start> <T1, A, 400> <T1, B, 700>                     A=500, B=600, C=700
      <T1, Commit> <T2 start> <T2, C, 500>                     A=400, B=700, C=700
      <T2, Commit>                                             A=400, B=700, C=500
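The undo and redo operations defined earlier can be applied to this example log. The following is a rough sketch for the immediate-modification case, assuming a crash after T2 has written C but before it commits; the tuple layout of the log records and the function `recover` are hypothetical.

```python
def recover(log, db):
    """Undo writes of uncommitted transactions, then redo writes of
    committed ones. Write records are ("write", txn, item, old, new)."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo: scan the log backwards and restore old values.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _txn, item, old, _new = rec
            db[item] = old

    # Redo: scan the log forwards and reapply new values.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _txn, item, _old, new = rec
            db[item] = new
    return db


log = [
    ("start", "T1"),
    ("write", "T1", "A", 500, 400),
    ("write", "T1", "B", 600, 700),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "C", 700, 500),   # T2 has not committed at the crash
]
db_at_crash = {"A": 400, "B": 700, "C": 500}
print(recover(log, dict(db_at_crash)))   # {'A': 400, 'B': 700, 'C': 700}
```

Under deferred modification the undo pass would be unnecessary, because an uncommitted transaction has not yet changed the database.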
  • 119.
    Immediate v/s Deferred database modification
    • When updates are applied
      • Immediate: Updates (changes) to the database are applied immediately as they occur, without waiting to reach the commit point.
      • Deferred: Updates (changes) to the database are deferred (postponed) until the transaction commits.
    • If the transaction has not committed
      • Immediate: The updates must be undone, and the transaction is restarted.
      • Deferred: No undo operation is needed; the transaction is simply restarted.
    • If the transaction has committed
      • Immediate: The updates have already been applied as they occurred; they are redone from the log only if they may not have reached the disk before the failure.
      • Deferred: The updates of the transaction must be redone from the log.
    • Operations performed
      • Immediate: Both undo and redo.
      • Deferred: Only redo.
  • 120.
    Problems with Deferred & Immediate Updates (Checkpoint)
    • Searching the entire log is time consuming.
      • Immediate database modification: when a transaction fails, the log file is used to undo the updates of the transaction.
      • Deferred database modification: when a transaction commits, the log file is used to redo the updates of the transaction.
    • To reduce the time spent searching the entire log, we can use a checkpoint.
    • A checkpoint is a point which specifies that all operations executed before it were done correctly and stored safely (updated safely in the database).
    • At this point, all the buffers are forcibly written to secondary storage (the database).
    • Checkpoints are scheduled at predetermined time intervals.
    • A checkpoint is used to limit:
      • The size of the transaction log file
      • The amount of searching
  • 122.
    How the checkpoint works when failure occurs
    • At failure time:
      • Ignore transaction T1, as it has already committed before the checkpoint.
      • Redo transactions T2 and T3, as they are active after the checkpoint and have committed before the failure.
      • Undo transaction T4, as it is active after the checkpoint and has not committed.
    [Figure: timeline of transactions T1 to T4 with checkpoint time TC and failure time Tf]
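The three rules can be expressed as a small classification function. This is a sketch under stated assumptions: the figure's exact times are not available in the text, so the start and commit times below are made-up values chosen to match the description of T1 to T4, and the function name is hypothetical.

```python
def recovery_plan(transactions, checkpoint, failure):
    """transactions maps a name to (start_time, commit_time); commit_time is
    None if the transaction had not committed when the failure occurred."""
    plan = {}
    for name, (_start, commit) in transactions.items():
        if commit is not None and commit <= checkpoint:
            plan[name] = "ignore"   # committed before the checkpoint
        elif commit is not None and commit <= failure:
            plan[name] = "redo"     # committed between checkpoint and failure
        else:
            plan[name] = "undo"     # still active at the failure
    return plan


# Illustrative times only: checkpoint TC = 10, failure Tf = 20.
transactions = {
    "T1": (1, 5),
    "T2": (3, 14),
    "T3": (12, 18),
    "T4": (15, None),
}
print(recovery_plan(transactions, checkpoint=10, failure=20))
```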
  • 123.
    How the checkpoint works when failure occurs
    [Figure: timeline of transactions T1 to T6 with checkpoint time TC and failure time Tf]
    Exercise: Name the transactions that have already committed before the checkpoint.
    Exercise: Name the transactions that will be redone.
    Exercise: Name the transactions that will be undone.
  • 124.
    Page table structure
    • The database is partitioned into fixed-length blocks referred to as pages.
    • The page table has n entries, one for each database page, and each entry contains a pointer to a page on disk.
    [Figure: a page table with entries 1 to n, each pointing to one of the pages on disk]
  • 125.
    Shadow paging technique
    • Shadow paging is an alternative to log-based recovery.
    • This scheme is useful if transactions execute serially.
    • It maintains two page tables during the lifetime of a transaction:
      • the current page table
      • the shadow page table
    • The shadow page table is stored on non-volatile storage.
    • When a transaction starts, both page tables are identical. Only the current page table is updated for the data items changed during execution of the transaction.
    • The shadow page table is never modified during execution of the transaction.
  • 126.
    Shadow paging technique
    • Two pages, page 2 and page 5, are affected by a transaction and copied to new physical pages. The current page table points to these new pages.
    • The shadow page table continues to point to the old pages, which are not changed by the transaction. So this table and these pages are used for undoing the transaction.
    • Whenever any page is updated for the first time:
      1. A copy of this page is made onto an unused page.
      2. The current page table is then made to point to the copy.
      3. The update is performed on the copy.
    [Figure: pages 1 to 5 on disk; the shadow page table points to page 2 (old) and page 5 (old), while the current page table points to page 2 (new) and page 5 (new)]
  • 127.
    Shadow paging technique
    • The shadow page table is never changed over the duration of the transaction.
    • The current page table is changed when a transaction performs a write operation.
    • All input and output operations use the current page table.
    • When the transaction completes, the current page table, which reflects all the modifications made by the transaction, becomes the new shadow page table.
    • When the transaction fails, the current page table is discarded and restored from the shadow page table.
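The copy-on-first-write behaviour and the commit and abort steps can be sketched with two lists acting as page tables. This is an in-memory illustration, not a durable implementation; the class `ShadowPagedStore` and its methods are hypothetical names.

```python
class ShadowPagedStore:
    def __init__(self, pages):
        self.disk = dict(enumerate(pages))        # physical page id -> contents
        self.next_free = len(pages)               # next unused physical page id
        self.shadow = list(range(len(pages)))     # logical page -> physical page
        self.current = list(self.shadow)          # identical at transaction start
        self.copied = set()                       # logical pages already copied

    def read(self, n):
        return self.disk[self.current[n]]

    def write(self, n, value):
        if n not in self.copied:
            # First write to this page: copy it to an unused page and
            # repoint the current page table at the copy.
            self.disk[self.next_free] = self.disk[self.current[n]]
            self.current[n] = self.next_free
            self.next_free += 1
            self.copied.add(n)
        self.disk[self.current[n]] = value        # update the copy only

    def commit(self):
        for n in self.copied:
            del self.disk[self.shadow[n]]         # old page is no longer needed
        self.shadow = list(self.current)          # current becomes the new shadow
        self.copied.clear()

    def abort(self):
        for n in self.copied:
            del self.disk[self.current[n]]        # discard the modified copies
        self.current = list(self.shadow)          # fall back to the shadow table
        self.copied.clear()


store = ShadowPagedStore(["p1", "p2", "p3", "p4", "p5"])
store.write(1, "p2-new")      # logical index 1 is page 2
store.write(4, "p5-new")      # logical index 4 is page 5
store.abort()
print(store.read(1), store.read(4))   # p2 p5
```

In a real system the switch at commit would be a single atomic write of the page-table pointer to non-volatile storage; that detail is outside this sketch.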