Dr. C.V. Suresh Babu
Transactions and Recovery
A transaction is an action, or a
series of actions, carried out
by a single user or an
application program, which
reads or updates the contents
of a database.
• Any action that reads from and/or writes to a
database may consist of
– Simple SELECT statement to generate a list of table
contents
– A series of related UPDATE statements to change the
values of attributes in various tables
– A series of INSERT statements to add rows to one or
more tables
– A combination of SELECT, UPDATE, and INSERT
statements
• A logical unit of work that must be either entirely
completed or aborted
• Successful transaction changes the database from
one consistent state to another
– One in which all data integrity constraints are satisfied
• Most real-world database transactions are formed
by two or more database requests
– The equivalent of a single SQL statement in an
application program or transaction
• Not all transactions update the database
• SQL code represents a transaction because the database
was accessed
• Improper or incomplete transactions can have a
devastating effect on database integrity
– Some DBMSs provide means by which user can
define enforceable constraints based on business
rules
– Other integrity rules are enforced automatically by
the DBMS when table structures are properly
defined, thereby letting the DBMS validate some
transactions
Figure 9.2
• For example, a transaction may involve
– The creation of a new invoice
– Insertion of a row in the LINE table
– Decreasing the quantity on hand by 1
– Updating the customer balance
– Creating a new account transaction row
• If the system fails between the first and last step,
the database will no longer be in a consistent
state
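The all-or-nothing behaviour described above can be sketched in a few lines of Python. This is a hypothetical in-memory store with illustrative step names, not a real DBMS: a snapshot is taken before the steps run, and restored if any step fails.

```python
import copy

def run_transaction(state, steps):
    """Apply every step or none: snapshot first, restore on any failure."""
    snapshot = copy.deepcopy(state)
    try:
        for step in steps:
            step(state)
        return True               # all steps applied: "commit"
    except Exception:
        state.clear()
        state.update(snapshot)    # undo every step: "rollback"
        return False

def take_one_from_stock(state):
    state["qty_on_hand"] -= 1     # e.g. decreasing the quantity on hand by 1

def simulated_crash(state):
    raise RuntimeError("system fails before the last step")

db = {"qty_on_hand": 10, "customer_balance": 0}
ok = run_transaction(db, [take_one_from_stock, simulated_crash])
# ok is False and db is unchanged: no half-finished invoice survives
```

Because the failure occurred between the first and last step, the rollback leaves the database in its original consistent state.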
• Atomicity
– Transactions are atomic –
they don’t have parts
(conceptually)
– can’t be executed partially; it
should not be detectable that
they interleave with another
transaction
• Consistency
– Transactions take the
database from one consistent
state into another
– In the middle of a transaction
the database might not be
consistent
• Isolation
– The effects of a transaction
are not visible to other
transactions until it has
completed
– From outside the transaction
has either happened or not
– To me this actually sounds
like a consequence of
atomicity…
• Durability
– Once a transaction has
completed, its changes are
made permanent
– Even if the system crashes,
the effects of a transaction
must remain in place
Transactions and Recovery
• Transfer Rs. 500 from account
A to account B
Read(A)
A = A - 500
Write(A)
Read(B)
B = B + 500
Write(B)
Atomicity - shouldn’t take money
from A without giving it to B
Consistency - money isn’t lost or
gained
Isolation - other queries shouldn’t
see A or B change until
completion
Durability - the money does not
go back to A after the
transaction commits
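The transfer can be traced directly in Python; the account balances and the names `accounts`, `total_before`, `total_after` are illustrative, but the six operations mirror the slide exactly.

```python
accounts = {"A": 1000, "B": 200}
total_before = sum(accounts.values())

a = accounts["A"]       # Read(A)
a = a - 500             # A = A - 500
accounts["A"] = a       # Write(A)
b = accounts["B"]       # Read(B)
b = b + 500             # B = B + 500
accounts["B"] = b       # Write(B)

total_after = sum(accounts.values())
# consistency: total_after equals total_before, money neither lost nor gained
```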
Transactions and Recovery
The Transaction Manager
• The transaction manager
enforces the ACID
properties
– It schedules the operations of
transactions
– COMMIT and ROLLBACK are
used to ensure atomicity
– Locks or timestamps are used
to ensure consistency and
isolation for concurrent
transactions (next lectures)
– A log is kept to ensure
durability in the event of
system failure (this lecture)
Transactions and Recovery
COMMIT and ROLLBACK
• COMMIT signals the
successful end of a
transaction
– Any changes made by the
transaction should be saved
– These changes are now
visible to other transactions
• ROLLBACK signals the
unsuccessful end of a
transaction
– Any changes made by the
transaction should be undone
– It is now as if the transaction
never existed
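Python's built-in `sqlite3` module shows COMMIT and ROLLBACK at work. This minimal sketch uses an in-memory database; the table and account names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 1000)")
conn.commit()                       # COMMIT: the inserted row is now permanent

conn.execute("UPDATE account SET balance = balance - 500 WHERE name = 'A'")
conn.rollback()                     # ROLLBACK: the update is undone

balance = conn.execute(
    "SELECT balance FROM account WHERE name = 'A'").fetchone()[0]
# balance is still 1000: it is as if the update never existed
```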
Transactions and Recovery
Recovery
• Transactions should be
durable, but we cannot
prevent all sorts of failures:
– System crashes
– Power failures
– Disk crashes
– User mistakes
– Sabotage
– Natural disasters
• Prevention is better than
cure
– Reliable OS
– Security
– UPS and surge protectors
– RAID arrays
• Can’t protect against
everything though
Transactions and Recovery
The Transaction Log
• The transaction log records
the details of all
transactions
– Any changes the transaction
makes to the database
– How to undo these changes
– When transactions complete
and how
• The log is stored on disk,
not in memory
– If the system crashes it is
preserved
• Write-ahead log rule
– The entry in the log must be
made before COMMIT
processing can complete
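A toy illustration of the write-ahead rule, with a Python list standing in for the on-disk log (all names here are hypothetical): the log record, carrying the old and new values needed for undo and redo, is appended before the data item is changed.

```python
log = []                 # stands in for the on-disk transaction log
data = {"X": 100}        # stands in for the database

def logged_write(tid, key, new_value):
    """Write-ahead rule: append the log record, THEN change the data."""
    log.append((tid, key, data[key], new_value))   # (txn, item, old, new)
    data[key] = new_value

logged_write("T1", "X", 150)
log.append(("T1", "COMMIT"))   # COMMIT completes only after its log entry
```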
Transactions and Recovery
System Failures
• A system failure means all
running transactions are
affected
– Software crashes
– Power failures
• The physical media (disks)
are not damaged
• At various times a DBMS
takes a checkpoint
– All committed transactions
are written to disk
– A record is made (on disk) of
the transactions that are
currently running
Transactions and Recovery
Types of Transactions
[Timeline figure: transactions T1–T5 relative to the last checkpoint and the system failure. T1 commits before the checkpoint; T2 and T4 commit after it; T3 and T5 are still running at the failure.]
Without Concurrency Control, problems may occur
with concurrent transactions:
• Lost Update Problem.
Occurs when two transactions update the same data
item, but both read the same original value before
update (Figure 21.3(a), next slide)
• The Temporary Update (or Dirty Read) Problem.
This occurs when one transaction T1 updates a
database item X, which is accessed (read) by
another transaction T2; then T1 fails for some reason
(Figure 21.3(b)); X was (read) by T2 before its value
is changed back (rolled back or UNDONE) after T1
fails
Concurrency control
• The Incorrect Summary Problem.
One transaction is calculating an aggregate summary
function on a number of records (for example, sum
(total) of all bank account balances) while other
transactions are updating some of these records (for
example, transferring a large amount between two
accounts, see Figure 21.3(c)); the aggregate function
may read some values before they are updated and
others after they are updated.
Concurrency control
• The Unrepeatable Read Problem.
A transaction T1 may read an item (say, available
seats on a flight); later, T1 may read the same item
again and get a different value because another
transaction T2 has updated the item (reserved seats
on the flight) between the two reads by T1
Concurrency control
Causes of transaction failure:
1. A computer failure (system crash): A hardware or
software error occurs during transaction execution. If
the hardware crashes, the contents of the computer’s
internal main memory may be lost.
2. A transaction or system error : Some operation in the
transaction may cause it to fail, such as integer overflow
or division by zero. Transaction failure may also occur
because of erroneous parameter values or because of
a logical programming error. In addition, the user may
interrupt the transaction during its execution.
Recovery
3. Local errors or exception conditions detected by
the transaction:
- certain conditions necessitate cancellation of the
transaction. For example, data for the transaction may
not be found. A condition, such as insufficient account
balance in a banking database, may cause a
transaction, such as a fund withdrawal, to be canceled
- a programmed abort causes the transaction to fail.
4. Concurrency control enforcement: The concurrency
control method may decide to abort the transaction, to
be restarted later, because it violates serializability or
because several transactions are in a state of
deadlock
Recovery
5. Disk failure: Some disk blocks may lose their data
because of a read or write malfunction or because of
a disk read/write head crash. This kind of failure and
item 6 are more severe than items 1 through 4.
6. Physical problems and catastrophes: This refers
to an endless list of problems that includes power or
air-conditioning failure, fire, theft, sabotage,
overwriting disks or tapes by mistake, and mounting
of a wrong tape by the operator.
Recovery
Transaction and System Concepts
A transaction is an atomic unit of work that is either
completed in its entirety or not done at all. A
transaction passes through several states.
Transaction states:
• Active state (executing read, write operations)
• Partially committed state (ended but waiting for
system checks to determine success or failure)
• Committed state (transaction succeeded)
• Failed state (transaction failed, must be rolled back)
• Terminated State (transaction leaves system)
Transactions and Recovery
System Recovery
• Any transaction that was
running at the time of
failure needs to be undone
and restarted
• Any transactions that
committed since the last
checkpoint need to be
redone
• Transactions of type T1 need
no recovery
• Transactions of type T3 or T5
need to be undone and
restarted
• Transactions of type T2 or T4
need to be redone
Transactions and Recovery
Transaction Recovery
UNDO and REDO: lists of transactions
UNDO = all transactions running at the last checkpoint
REDO = empty
For each entry in the log, starting at the last checkpoint
If a BEGIN TRANSACTION entry is found for T
Add T to UNDO
If a COMMIT entry is found for T
Move T from UNDO to REDO
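The algorithm above translates almost line for line into Python. The function name and the (kind, transaction) log-entry encoding are assumptions for illustration; the scenario fed to it is the T1–T5 example from these slides.

```python
def recovery_lists(active_at_checkpoint, log_entries):
    """Build UNDO/REDO lists: start with UNDO = transactions running at
    the last checkpoint and REDO empty, then scan the log forward."""
    undo = list(active_at_checkpoint)
    redo = []
    for kind, t in log_entries:
        if kind == "BEGIN":          # BEGIN TRANSACTION entry for t
            undo.append(t)
        elif kind == "COMMIT":       # COMMIT entry for t
            undo.remove(t)
            redo.append(t)
    return undo, redo

# T2, T3 active at the checkpoint; then T4 begins, T5 begins,
# T2 commits, T4 commits; then the system fails
undo, redo = recovery_lists(
    ["T2", "T3"],
    [("BEGIN", "T4"), ("BEGIN", "T5"), ("COMMIT", "T2"), ("COMMIT", "T4")],
)
# undo == ["T3", "T5"] (undo and restart), redo == ["T2", "T4"] (redo)
```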
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
Last Checkpoint: active transactions are T2, T3
UNDO: T2, T3
REDO: (empty)
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
T4 begins: add T4 to UNDO
UNDO: T2, T3, T4
REDO: (empty)
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
T5 begins: add T5 to UNDO
UNDO: T2, T3, T4, T5
REDO: (empty)
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
T2 commits: move T2 to REDO
UNDO: T3, T4, T5
REDO: T2
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
T4 commits: move T4 to REDO
UNDO: T3, T5
REDO: T2, T4
Transactions and Recovery
Forwards and Backwards
• Backwards recovery
– We need to undo some
transactions
– Working backwards through
the log we undo any
operation by a transaction on
the UNDO list
– This returns the database to a
consistent state
• Forwards recovery
– Some transactions need to be
redone
– Working forwards through
the log we redo any operation
by a transaction on the REDO
list
– This brings the database up to
date
Transactions and Recovery
Concurrency
• Large databases are used by
many people
– Many transactions to be run
on the database
– It is desirable to let them run
at the same time as each
other
– Need to preserve isolation
• If we don’t allow for
concurrency then
transactions are run
sequentially
– Have a queue of transactions
– Long transactions (eg
backups) will make others
wait for long periods
Transactions and Recovery
Concurrency Problems
• In order to run transactions
concurrently we interleave
their operations
• Each transaction gets a
share of the computing
time
• This leads to several sorts of
problems
– Lost updates
– Uncommitted updates
– Incorrect analysis
• All arise because isolation is
broken
Transactions and Recovery
Lost Update
• T1 and T2 read X, both
modify it, then both write it
out
– The net effect of T1 and T2
should be no change on X
– Only T2’s change is seen,
however, so the final value of
X has increased by 5
T1              T2
Read(X)
X = X - 5
                Read(X)
                X = X + 5
Write(X)
                Write(X)
COMMIT
                COMMIT
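The interleaving can be simulated with plain variables standing in for each transaction's private workspace (`t1_x`, `t2_x` are illustrative names):

```python
X = 100
t1_x = X            # T1: Read(X)
t1_x = t1_x - 5     # T1: X = X - 5
t2_x = X            # T2: Read(X) -- still sees the original 100
t2_x = t2_x + 5     # T2: X = X + 5
X = t1_x            # T1: Write(X) -> 95
X = t2_x            # T2: Write(X) -> 105, overwriting T1's update
# net effect should have been no change; instead X increased by 5
# because T1's update was lost
```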
Transactions and Recovery
Uncommitted Update
• T2 sees the change to X
made by T1, but T1 is rolled
back
– The change made by T1 is
undone on rollback
– It should be as if that change
never happened
T1              T2
Read(X)
X = X - 5
Write(X)
                Read(X)
                X = X + 5
                Write(X)
ROLLBACK
                COMMIT
Transactions and Recovery
Inconsistent analysis
• T1 doesn’t change the sum
of X and Y, but T2 sees a
change
– T1 consists of two parts –
take 5 from X and then add 5
to Y
– T2 sees the effect of the first,
but not the second
T1              T2
Read(X)
X = X - 5
Write(X)
                Read(X)
                Read(Y)
                Sum = X + Y
Read(Y)
Y = Y + 5
Write(Y)
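The same simulation style shows the broken analysis: T1 preserves the invariant X + Y, but T2 reads between T1's two halves (`sum_seen` is an illustrative name):

```python
X, Y = 50, 50
X = X - 5          # T1, first half: Read(X), X = X - 5, Write(X)
sum_seen = X + Y   # T2 runs here: Read(X), Read(Y), Sum = X + Y
Y = Y + 5          # T1, second half: Read(Y), Y = Y + 5, Write(Y)
# T1 left X + Y unchanged at 100, yet T2 observed a sum of 95
```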
Concurrency Control with Locking Methods
• Lock
– Guarantees exclusive use of a data item to a current
transaction
• T2 does not have access to a data item that is currently being
used by T1
• Transaction acquires a lock prior to data access; the lock is
released when the transaction is complete
– Required to prevent another transaction from reading
inconsistent data
• Lock manager
– Responsible for assigning and policing the locks used by
the transactions
Lock Granularity
• Indicates the level of lock use
• Locking can take place at the following levels:
– Database-level lock
• Entire database is locked
– Table-level lock
• Entire table is locked
– Page-level lock
• Entire disk page is locked
– Row-level lock
• Allows concurrent transactions to access different rows of the same
table, even if the rows are located on the same page
– Field-level lock
• Allows concurrent transactions to access the same row, as long as they
require the use of different fields (attributes) within that row
A Database-Level Locking Sequence
 Good for batch processing but unsuitable for online multi-user DBMSs
 T1 and T2 cannot access the same database concurrently even if they use
different tables
Table-Level Lock
 T1 and T2 can access the same database concurrently as long as they use different tables
 Can cause bottlenecks when many transactions are trying to access the same table (even if
the transactions want to access different parts of the table and would not interfere with
each other)
 Not suitable for multi-user DBMSs
Page-Level Lock
 An entire disk page is locked (a table can span several pages and each page can
contain several rows of one or more tables)
 Most frequently used multi-user DBMS locking method
Row-Level Lock
 Concurrent transactions can access different rows of the same table even if the
rows are located on the same page
 Improves data availability but with high overhead (each row has a lock that must
be read and written to)
Field-Level Lock
 Allows concurrent transactions to access the
same row as long as they require the use of
different fields (attributes) within that row
 Most flexible lock but requires an extremely high
level of overhead
Binary Locks
 Has only two states: locked (1) or unlocked (0)
 Eliminates “Lost Update” problem – the lock is not released until the write
statement is completed
 Cannot use PROD_QOH until it has been properly updated
 Considered too restrictive to yield optimal concurrency conditions as it locks even for two
READs when no update is being done
Shared/Exclusive Locks
• Exclusive lock
– Access is specifically reserved for the transaction that locked the object
– Must be used when the potential for conflict exists – when a transaction wants to
update a data item and no locks are currently held on that data item by another
transaction
– Granted if and only if no other locks are held on the data item
• Shared lock
– Concurrent transactions are granted Read access on the basis of a common lock
– Issued when a transaction wants to read data and no exclusive lock is held on that data
item
• Multiple transactions can each have a shared lock on the same data item if they are all
just reading it
• Mutual Exclusive Rule
– Only one transaction at a time can own an exclusive lock on the same object
Shared/Exclusive Locks
• Increased overhead
– The type of lock held must be known before a lock can be granted
– Three lock operations exist:
• READ_LOCK to check the type of lock
• WRITE_LOCK to issue the lock
• UNLOCK to release the lock
– A lock can be upgraded from shared to exclusive and downgraded from
exclusive to shared
• Two possible major problems may occur
– The resulting transaction schedule may not be serializable
– The schedule may create deadlocks
Two-Phase Locking to Ensure Serializability
• Defines how transactions acquire and
relinquish locks
• Guarantees serializability, but it does not
prevent deadlocks
– Growing phase, in which a transaction acquires all
the required locks without unlocking any data
– Shrinking phase, in which a transaction releases all
locks and cannot obtain any new lock
Two-Phase Locking to Ensure Serializability
• Governed by the following rules:
– Two transactions cannot have conflicting locks
– No unlock operation can precede a lock operation
in the same transaction
– No data are affected until all locks are obtained—
that is, until the transaction is in its locked point
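A small checker for the two-phase property makes the rules concrete. The (operation, item) encoding is hypothetical; the test is simply that no lock is acquired after any lock has been released.

```python
def is_two_phase(ops):
    """True iff no LOCK follows an UNLOCK (growing phase, then shrinking)."""
    shrinking = False
    for op, _item in ops:
        if op == "UNLOCK":
            shrinking = True          # the shrinking phase has begun
        elif op == "LOCK" and shrinking:
            return False              # lock acquired after an unlock: not 2PL
    return True

ok_2pl = is_two_phase([("LOCK", "X"), ("LOCK", "Y"),
                       ("UNLOCK", "X"), ("UNLOCK", "Y")])
bad_2pl = is_two_phase([("LOCK", "X"), ("UNLOCK", "X"), ("LOCK", "Y")])
# ok_2pl is True; bad_2pl is False (an unlock preceded a lock)
```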
Two-Phase Locking Protocol
Deadlocks
• Condition that occurs when two transactions wait for each
other to unlock data
– T1 needs data items X and Y; T2 needs Y and X
– Each gets its first piece of data but then waits to get the second
(which the other transaction is already holding): the deadly embrace
• Possible only if one of the transactions wants to obtain an
exclusive lock on a data item
– No deadlock condition can exist among shared locks
• Control through
– Prevention
– Detection
– Avoidance
How a Deadlock Condition Is Created
Concurrency Control with Time Stamping
Methods
• Assigns a global unique time stamp to each transaction
– All database operations within the same transaction must have the
same time stamp
• Produces an explicit order in which transactions are submitted
to the DBMS
• Uniqueness
– Ensures that no equal time stamp values can exist
• Monotonicity
– Ensures that time stamp values always increase
• Disadvantage
– Each value stored in the database requires two additional time stamp
fields – last read, last update
Wait/Die and Wound/Wait Schemes
• Wait/die
– Older transaction waits and the younger is rolled back and rescheduled
• Wound/wait
– Older transaction rolls back the younger transaction and reschedules it
• In the situation where a transaction requests multiple locks,
each lock request has an associated time-out value. If the lock
is not granted before the time-out expires, the transaction is
rolled back
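Both schemes reduce to a timestamp comparison. Assuming smaller timestamps mean older transactions (the usual convention), a sketch with hypothetical function names:

```python
def wait_die(requester_ts, holder_ts):
    """Wait/die: an older requester waits; a younger one dies
    (is rolled back and rescheduled with its original timestamp)."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Wound/wait: an older requester wounds the holder (rolls it back
    and reschedules it); a younger requester waits."""
    return "wound" if requester_ts < holder_ts else "wait"
```

In both schemes it is always the younger transaction that is rolled back, so no transaction can be starved forever.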
Wait/Die and Wound/Wait
Concurrency Control Schemes
Concurrency Control with Optimistic Methods
• Optimistic approach
– Based on the assumption that the majority of database
operations do not conflict
– Does not require locking or time stamping techniques
– Transaction is executed without restrictions until it is committed
– Acceptable for mostly read or query database systems that
require very few update transactions
– Phases are read, validation, and write
Concurrency Control with Optimistic Methods
• Phases are read, validation, and write
– Read phase – transaction reads the database, executes the needed
computations and makes the updates to a private copy of the database
values.
• All update operations of the transaction are recorded in a temporary
update file which is not accessed by the remaining transactions
– Validation phase – transaction is validated to ensure that the changes
made will not affect the integrity and consistency of the database
• If the validation test is positive, the transaction goes to the writing
phase. If negative, the transaction is restarted and the changes
discarded
– Writing phase – the changes are permanently applied to the database
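A minimal sketch of the validation and write phases, assuming a hypothetical versioned store that maps each key to a (value, version) pair; the read phase records the versions it saw, and validation succeeds only if they are unchanged.

```python
db = {"X": (10, 0)}            # key -> (value, version)

def validate_and_write(db, read_versions, writes):
    """Validation phase, then write phase, of an optimistic transaction."""
    for key, ver in read_versions.items():     # validation phase
        if db[key][1] != ver:
            return False                       # conflict: restart, discard
    for key, val in writes.items():            # write phase
        db[key] = (val, db[key][1] + 1)        # apply permanently
    return True

seen = {"X": db["X"][1]}       # read phase: remember version 0 of X
db["X"] = (99, 1)              # meanwhile another transaction updates X
ok = validate_and_write(db, seen, {"X": 20})
# ok is False: validation failed, the private update is discarded
# and the transaction must be restarted
```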
Schedules
• A schedule is a list of actions from a set of
transactions
– A well-formed schedule is one where the actions of a
particular transaction T are in the same order as they
appear in T
• For example
– [R_T1(a), W_T1(a), R_T2(b), W_T2(b), R_T1(c), W_T1(c)] is a well-
formed schedule
– [R_T1(c), W_T1(c), R_T2(b), W_T2(b), R_T1(a), W_T1(a)] is not a well-
formed schedule
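The well-formedness test is just a projection check: restrict the schedule to one transaction's actions and compare against that transaction's own order. This sketch encodes the example schedules above; the representation is hypothetical.

```python
def is_well_formed(schedule, transactions):
    """schedule: list of (txn, action); transactions: txn -> action list.
    Well-formed iff each transaction's actions appear in its own order."""
    projection = {t: [a for (txn, a) in schedule if txn == t]
                  for t in transactions}
    return projection == transactions

t1 = ["R(a)", "W(a)", "R(c)", "W(c)"]
t2 = ["R(b)", "W(b)"]
good = [("T1", "R(a)"), ("T1", "W(a)"), ("T2", "R(b)"), ("T2", "W(b)"),
        ("T1", "R(c)"), ("T1", "W(c)")]
bad = [("T1", "R(c)"), ("T1", "W(c)"), ("T2", "R(b)"), ("T2", "W(b)"),
       ("T1", "R(a)"), ("T1", "W(a)")]

good_ok = is_well_formed(good, {"T1": t1, "T2": t2})   # True
bad_ok = is_well_formed(bad, {"T1": t1, "T2": t2})     # False: T1 reversed
```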
Schedules cont.
• A complete schedule is one that contains an
abort or commit action for every transaction
that occurs in the schedule
• A serial schedule is one where the actions of
different transactions are not interleaved
Serialisability
• A serialisable schedule is a schedule whose
effect on any consistent database instance is
identical to that of some complete serial
schedule
• NOTE:
– All different results assumed to be acceptable
– It’s more complicated when we have transactions
that abort
– We’ll assume that all ‘side-effects’ of a transaction
are written to the database
Anomalies with interleaved
execution
• Two actions on the same data object conflict if
at least one of them is a write
• We’ll now consider three ways in which a
schedule involving two consistency-preserving
transactions can leave a consistent database
inconsistent
WR conflicts
• Transaction T2 reads a database object that
has been modified by T1 which has not
committed
T1: R(a), W(a),                           R(b), W(b), C
T2:               R(a), W(a), R(b), W(b), C

T1 debits €100 from a and later credits €100 to b; in between, T2 reads
a and b and adds 6% interest. T2's read of a is a "dirty read": it sees
T1's uncommitted debit.
RW conflicts
• Transaction T2 could change the value of an object
that has been read by a transaction T1, while T1 is
still in progress
T1: R(a),                   R(a), W(a), C
T2:         R(a), W(a), C

"Unrepeatable read": T1's second read of a returns a different value,
because T2 wrote a between T1's two reads.

T1: R(a),       W(a), C          (reads a = 5, writes 5 + 1 = 6)
T2:       R(a),       W(a), C    (reads a = 5, writes 5 - 1 = 4)

Both read the original 5, so a ends up as 4.
WW conflicts
• Transaction T2 could overwrite the value of an
object which has already been modified by T1,
while T1 is still in progress
T1: [W(Britney), W(gmb)]   "Set both salaries at £1m"
T2: [W(gmb), W(Britney)]   "Set both salaries at $1m"
• But interleaved as:
T1: W(Britney),                          W(gmb)
T2:              W(gmb), W(Britney)

gmb gets £1m and Britney gets $1m: a "blind write" anomaly, since
neither serial order produces this mixed outcome.
Serialisability and aborts
• Things are more complicated when
transactions can abort
T1: R(a), W(a),                          Abort
T2:             R(a), W(a), R(b), W(b), C

T1 deducts €100 from a; T2 then adds 6% interest to a and b and
commits. When T1 aborts, we can't undo T2: it's committed.
Strict two-phase locking
• DBMS enforces the following locking protocol:
– Each transaction must obtain an S (shared) lock before
reading, and an X (exclusive) lock before writing
– All locks held by a transaction are released when the
transaction completes
– If a transaction holds an X lock on an object, no other
transaction can get a lock (S or X) on that object
• Strict 2PL allows only serialisable schedules
More refined locks
• Some updates that seem at first sight to require a
write (X) lock, can be given something weaker
– Example: Consider a seat count object in a flights database
– There are two transactions that wish to book a flight – get
X lock on seat count
– Does it matter in what order they decrement the count?
• They are commutative actions!
• Do they need a write lock?
Aborting
• If a transaction Ti is aborted, then all actions must be
undone
– Also, if Tj reads object last written by Ti, then Tj must be
aborted!
• Most systems avoid cascading aborts by releasing
locks only at commit time (strict protocols)
– If Ti writes an object, then Tj can only read this after Ti
finishes
• In order to undo changes, the DBMS maintains a log
which records every write
The log
• The following facts are recorded in the log
– “Ti writes an object”: store new and old values
– “Ti commits/aborts”: store just a record
• Log records are chained together by
transaction id, so it’s easy to undo a specific
transaction
• Log is often duplexed and archived on stable
storage (it’s important!)
Connection to Normalization
• The more redundancy in a database, the more
locking is required for (update) transactions.
– Extreme case: so much redundancy that all update
transactions are forced to execute serially.
• In general, less redundancy allows for greater
concurrency and greater transaction throughput.
!!! This is what normalization is all about !!!
The Fundamental Tradeoff of Database
Performance Tuning
• De-normalized data can often result in faster
query response
• Normalized data leads to better transaction
throughput
What is more important in your database --- query response
or transaction throughput? The answer will vary.
What do the extreme ends of the spectrum look like?
Yes, indexing data can speed up queries, but this just proves
the point: an index IS redundant data. General rule of thumb:
indexing will slow down (update) transactions!
Summary
You should now understand:
• Transactions and the ACID properties
• Schedules and serialisable schedules
• Potential anomalies with interleaving
• Strict 2-phase locking
• Problems with transactions that can abort
• Logs

Tranasaction management

  • 1.
  • 2.
    Transactions and Recovery Atransaction is an action, or a series of actions, carried out by a single user or an application program, which reads or updates the contents of a database.
  • 3.
    • Any actionthat reads from and/or writes to a database may consist of – Simple SELECT statement to generate a list of table contents – A series of related UPDATE statements to change the values of attributes in various tables – A series of INSERT statements to add rows to one or more tables – A combination of SELECT, UPDATE, and INSERT statements
  • 4.
    • A logicalunit of work that must be either entirely completed or aborted • Successful transaction changes the database from one consistent state to another – One in which all data integrity constraints are satisfied • Most real-world database transactions are formed by two or more database requests – The equivalent of a single SQL statement in an application program or transaction
  • 5.
    • Not alltransactions update the database • SQL code represents a transaction because database was accessed • Improper or incomplete transactions can have a devastating effect on database integrity – Some DBMSs provide means by which user can define enforceable constraints based on business rules – Other integrity rules are enforced automatically by the DBMS when table structures are properly defined, thereby letting the DBMS validate some transactions
  • 6.
    Figure 9.2 • Forexample, a transaction may involve – The creation of a new invoice – Insertion of an row in the LINE table – Decreasing the quantity on hand by 1 – Updating the customer balance – Creating a new account transaction row • If the system fails between the first and last step, the database will no longer be in a consistent state
  • 7.
  • 8.
    • Atomicity – Transactionsare atomic – they don’t have parts (conceptually) – can’t be executed partially; it should not be detectable that they interleave with another transaction • Consistency – Transactions take the database from one consistent state into another – In the middle of a transaction the database might not be consistent • Isolation – The effects of a transaction are not visible to other transactions until it has completed – From outside the transaction has either happened or not – To me this actually sounds like a consequence of atomicity… • Durability – Once a transaction has completed, its changes are made permanent – Even if the system crashes, the effects of a transaction must remain in place
  • 9.
    Transactions and Recovery •Transfer Rs. 500 from account A to account B Read(A) A = A - 50 Write(A) Read(B) B = B+50 Write(B) Atomicity - shouldn’t take money from A without giving it to B Consistency - money isn’t lost or gained Isolation - other queries shouldn’t see A or B change until completion Durability - the money does not go back to A transaction
  • 10.
    Transactions and Recovery TheTransaction Manager • The transaction manager enforces the ACID properties – It schedules the operations of transactions – COMMIT and ROLLBACK are used to ensure atomicity – Locks or timestamps are used to ensure consistency and isolation for concurrent transactions (next lectures) – A log is kept to ensure durability in the event of system failure (this lecture)
  • 11.
    Transactions and Recovery COMMITand ROLLBACK • COMMIT signals the successful end of a transaction – Any changes made by the transaction should be saved – These changes are now visible to other transactions • ROLLBACK signals the unsuccessful end of a transaction – Any changes made by the transaction should be undone – It is now as if the transaction never existed
  • 12.
    Transactions and Recovery Recovery •Transactions should be durable, but we cannot prevent all sorts of failures: – System crashes – Power failures – Disk crashes – User mistakes – Sabotage – Natural disasters • Prevention is better than cure – Reliable OS – Security – UPS and surge protectors – RAID arrays • Can’t protect against everything though
  • 13.
    Transactions and Recovery TheTransaction Log • The transaction log records the details of all transactions – Any changes the transaction makes to the database – How to undo these changes – When transactions complete and how • The log is stored on disk, not in memory – If the system crashes it is preserved • Write ahead log rule – The entry in the log must be made before COMMIT processing can complete
  • 14.
    Transactions and Recovery SystemFailures • A system failure means all running transactions are affected – Software crashes – Power failures • The physical media (disks) are not damaged • At various times a DBMS takes a checkpoint – All committed transactions are written to disk – A record is made (on disk) of the transactions that are currently running
  • 15.
    Transactions and Recovery Typesof Transactions Last Checkpoint System Failure T1 T2 T3 T4 T5
  • 16.
    Without Concurrency Control,problems may occur with concurrent transactions: • Lost Update Problem. Occurs when two transactions update the same data item, but both read the same original value before update (Figure 21.3(a), next slide) • The Temporary Update (or Dirty Read) Problem. This occurs when one transaction T1 updates a database item X, which is accessed (read) by another transaction T2; then T1 fails for some reason (Figure 21.3(b)); X was (read) by T2 before its value is changed back (rolled back or UNDONE) after T1 fails Concurrency control
  • 18.
    • The IncorrectSummary Problem . One transaction is calculating an aggregate summary function on a number of records (for example, sum (total) of all bank account balances) while other transactions are updating some of these records (for example, transferring a large amount between two accounts, see Figure 21.3(c)); the aggregate function may read some values before they are updated and others after they are updated. Cont….. Concurrency control
  • 20.
    • The UnrepeatableRead Problem . A transaction T1 may read an item (say, available seats on a flight); later, T1 may read the same item again and get a different value because another transaction T2 has updated the item (reserved seats on the flight) between the two reads by T1 Cont….. Concurrency control
  • 21.
    Causes of transactionfailure: 1. A computer failure (system crash): A hardware or software error occurs during transaction execution. If the hardware crashes, the contents of the computer’s internal main memory may be lost. 2. A transaction or system error : Some operation in the transaction may cause it to fail, such as integer overflow or division by zero. Transaction failure may also occur because of erroneous parameter values or because of a logical programming error. In addition, the user may interrupt the transaction during its execution. Recovery
  • 22.
    3. Local errorsor exception conditions detected by the transaction: - certain conditions necessitate cancellation of the transaction. For example, data for the transaction may not be found. A condition, such as insufficient account balance in a banking database, may cause a transaction, such as a fund withdrawal, to be canceled - a programmed abort causes the transaction to fail. 4. Concurrency control enforcement: The concurrency control method may decide to abort the transaction, to be restarted later, because it violates serializability or because several transactions are in a state of deadlock Cont….. Recovery
  • 23.
    5. Disk failure:Some disk blocks may lose their data because of a read or write malfunction or because of a disk read/write head crash. This kind of failure and item 6 are more severe than items 1 through 4. 6. Physical problems and catastrophes: This refers to an endless list of problems that includes power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, and mounting of a wrong tape by the operator. Cont….. Recovery
  • 24.
    Transaction and SystemConcepts A transaction is an atomic unit of work that is either completed in its entirety or not done at all. A transaction passes through several states. Transaction states: • Active state (executing read, write operations) • Partially committed state (ended but waiting for system checks to determine success or failure) • Committed state (transaction succeeded) • Failed state (transaction failed, must be rolled back) • Terminated State (transaction leaves system)
Transactions and Recovery: System Recovery
• Any transaction that was running at the time of failure needs to be undone and restarted
• Any transaction that committed since the last checkpoint needs to be redone
• Transactions of type T1 need no recovery
• Transactions of type T3 or T5 need to be undone and restarted
• Transactions of type T2 or T4 need to be redone
Transactions and Recovery: Transaction Recovery
UNDO and REDO: lists of transactions
• UNDO = all transactions running at the last checkpoint
• REDO = empty
• For each entry in the log, starting at the last checkpoint:
– If a BEGIN TRANSACTION entry is found for T, add T to UNDO
– If a COMMIT entry is found for T, move T from UNDO to REDO
Transactions and Recovery: Transaction Recovery (figure: timeline of T1–T5 between the last checkpoint and the failure)
• At the last checkpoint, active transactions are T2 and T3 → UNDO: T2, T3; REDO: (empty)
• T4 begins: add T4 to UNDO → UNDO: T2, T3, T4; REDO: (empty)
• T5 begins: add T5 to UNDO → UNDO: T2, T3, T4, T5; REDO: (empty)
• T2 commits: move T2 to REDO → UNDO: T3, T4, T5; REDO: T2
• T4 commits: move T4 to REDO → UNDO: T3, T5; REDO: T2, T4
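The UNDO/REDO list construction can be sketched in a few lines. This is an illustrative implementation, not from the slides; the log-entry names ("BEGIN", "COMMIT") and the checkpoint representation are assumptions.

```python
def recovery_lists(active_at_checkpoint, log_after_checkpoint):
    """Return (undo, redo) sets by scanning the log forward
    from the last checkpoint, as the algorithm above describes."""
    undo = set(active_at_checkpoint)   # running at the last checkpoint
    redo = set()
    for action, txn in log_after_checkpoint:
        if action == "BEGIN":
            undo.add(txn)              # started after the checkpoint
        elif action == "COMMIT":
            undo.discard(txn)          # committed since the checkpoint:
            redo.add(txn)              # must be redone, not undone
    return undo, redo

# The T1..T5 walkthrough: T2, T3 active at the checkpoint,
# then T4 and T5 begin, then T2 and T4 commit before the failure.
undo, redo = recovery_lists(
    {"T2", "T3"},
    [("BEGIN", "T4"), ("BEGIN", "T5"),
     ("COMMIT", "T2"), ("COMMIT", "T4")])
# undo == {"T3", "T5"}, redo == {"T2", "T4"}
```

T1, which committed before the checkpoint, never appears in either list: it needs no recovery.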
Transactions and Recovery: Forwards and Backwards
• Backwards recovery
– We need to undo some transactions
– Working backwards through the log, we undo any operation by a transaction on the UNDO list
– This returns the database to a consistent state
• Forwards recovery
– Some transactions need to be redone
– Working forwards through the log, we redo any operation by a transaction on the REDO list
– This brings the database up to date
Transactions and Recovery: Concurrency
• Large databases are used by many people
– Many transactions are to be run on the database
– It is desirable to let them run at the same time as each other
– Need to preserve isolation
• If we don't allow for concurrency, then transactions are run sequentially
– Have a queue of transactions
– Long transactions (e.g. backups) will make others wait for long periods
Transactions and Recovery: Concurrency Problems
• In order to run transactions concurrently we interleave their operations
• Each transaction gets a share of the computing time
• This leads to several sorts of problems
– Lost updates
– Uncommitted updates
– Incorrect analysis
• All arise because isolation is broken
Transactions and Recovery: Lost Update
• T1 and T2 read X, both modify it, then both write it out
– The net effect of T1 and T2 should be no change to X
– Only T2's change is seen, however, so the final value of X has increased by 5
• Interleaved schedule:
– T1: Read(X); X = X - 5
– T2: Read(X); X = X + 5
– T1: Write(X)
– T2: Write(X)
– T1: COMMIT; T2: COMMIT
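A tiny linear simulation (illustrative, not from the slides; the starting value 100 is arbitrary) shows why the update is lost: both transactions read X before either writes, so T2's write clobbers T1's.

```python
X = 100
t1_x = X        # T1: Read(X)
t2_x = X        # T2: Read(X), before T1 has written
X = t1_x - 5    # T1: Write(X) -> 95
X = t2_x + 5    # T2: Write(X) -> 105; T1's subtraction is lost
# X is now 105, not the correct net-zero result of 100
```

Had T2 read X after T1's write, it would have computed 95 + 5 = 100 and the net effect would be zero, as intended.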
Transactions and Recovery: Uncommitted Update
• T2 sees the change to X made by T1, but T1 is rolled back
– The change made by T1 is undone on rollback
– It should be as if that change never happened
• Interleaved schedule:
– T1: Read(X); X = X - 5; Write(X)
– T2: Read(X); X = X + 5; Write(X)
– T1: ROLLBACK
– T2: COMMIT
Transactions and Recovery: Inconsistent Analysis
• T1 doesn't change the sum of X and Y, but T2 sees a change
– T1 consists of two parts: take 5 from X and then add 5 to Y
– T2 sees the effect of the first, but not the second
• Interleaved schedule:
– T1: Read(X); X = X - 5; Write(X)
– T2: Read(X); Read(Y); Sum = X + Y
– T1: Read(Y); Y = Y + 5; Write(Y)
Concurrency Control with Locking Methods
• Lock
– Guarantees exclusive use of a data item to a current transaction
• T2 does not have access to a data item that is currently being used by T1
• A transaction acquires a lock prior to data access; the lock is released when the transaction is complete
– Required to prevent another transaction from reading inconsistent data
• Lock manager
– Responsible for assigning and policing the locks used by the transactions
Lock Granularity
• Indicates the level of lock use
• Locking can take place at the following levels:
– Database-level lock
• The entire database is locked
– Table-level lock
• The entire table is locked
– Page-level lock
• An entire disk page is locked
– Row-level lock
• Allows concurrent transactions to access different rows of the same table, even if the rows are located on the same page
– Field-level lock
• Allows concurrent transactions to access the same row, as long as they require the use of different fields (attributes) within that row
A Database-Level Locking Sequence
• Good for batch processing but unsuitable for online multi-user DBMSs
• T1 and T2 cannot access the same database concurrently, even if they use different tables
Table-Level Lock
• T1 and T2 can access the same database concurrently as long as they use different tables
• Can cause bottlenecks when many transactions are trying to access the same table (even if the transactions want to access different parts of the table and would not interfere with each other)
• Not suitable for multi-user DBMSs
Page-Level Lock
• An entire disk page is locked (a table can span several pages, and each page can contain several rows of one or more tables)
• Most frequently used multi-user DBMS locking method
Row-Level Lock
• Concurrent transactions can access different rows of the same table, even if the rows are located on the same page
• Improves data availability but with high overhead (each row has a lock that must be read and written to)
Field-Level Lock
• Allows concurrent transactions to access the same row as long as they require the use of different fields within that row
• Most flexible lock but requires an extremely high level of overhead
Binary Locks
• Has only two states: locked (1) or unlocked (0)
• Eliminates the "lost update" problem: the lock is not released until the write statement is completed
• Cannot use PROD_QOH until it has been properly updated
• Considered too restrictive to yield optimal concurrency conditions, as it locks even for two READs when no update is being done
Shared/Exclusive Locks
• Exclusive lock
– Access is specifically reserved for the transaction that locked the object
– Must be used when the potential for conflict exists: when a transaction wants to update a data item and no locks are currently held on that data item by another transaction
– Granted if and only if no other locks are held on the data item
• Shared lock
– Concurrent transactions are granted Read access on the basis of a common lock
– Issued when a transaction wants to read data and no exclusive lock is held on that data item
• Multiple transactions can each have a shared lock on the same data item if they are all just reading it
• Mutual exclusive rule
– Only one transaction at a time can own an exclusive lock on the same object
Shared/Exclusive Locks
• Increased overhead
– The type of lock held must be known before a lock can be granted
– Three lock operations exist:
• READ_LOCK to check the type of lock
• WRITE_LOCK to issue the lock
• UNLOCK to release the lock
– A lock can be upgraded from shared to exclusive and downgraded from exclusive to shared
• Two possible major problems may occur
– The resulting transaction schedule may not be serializable
– The schedule may create deadlocks
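The compatibility rules for shared and exclusive locks can be condensed into one decision function. This is a minimal sketch with an assumed API, not a real DBMS lock manager: a shared lock is granted only when no exclusive lock is held, and an exclusive lock only when the item is completely free.

```python
def can_grant(requested, held):
    """requested: 'S' or 'X'; held: set of lock modes currently
    on the data item. Returns True if the request can be granted."""
    if requested == "S":
        return "X" not in held    # readers coexist with readers
    if requested == "X":
        return len(held) == 0     # a writer needs the item free
    raise ValueError(f"unknown lock mode: {requested}")

assert can_grant("S", {"S"})      # many transactions may read
assert not can_grant("S", {"X"})  # a reader waits for a writer
assert not can_grant("X", {"S"})  # a writer waits for readers
assert can_grant("X", set())      # exclusive lock on a free item
```

Lock upgrade (shared to exclusive) then amounts to requesting "X" while being the only holder of an "S" lock on the item.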
Two-Phase Locking to Ensure Serializability
• Defines how transactions acquire and relinquish locks
• Guarantees serializability, but does not prevent deadlocks
– Growing phase: a transaction acquires all the required locks without unlocking any data
– Shrinking phase: a transaction releases all locks and cannot obtain any new lock
Two-Phase Locking to Ensure Serializability
• Governed by the following rules:
– Two transactions cannot have conflicting locks
– No unlock operation can precede a lock operation in the same transaction
– No data are affected until all locks are obtained, that is, until the transaction is at its locked point
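The core two-phase rule (no lock after the first unlock) can be enforced with a single flag. This is an illustrative sketch, not a real DBMS implementation; the class and method names are invented for the example.

```python
class TwoPhaseTxn:
    """Tracks a transaction's locks and rejects any lock request
    made after the transaction has entered its shrinking phase."""
    def __init__(self):
        self.locks = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            # Rule: no unlock operation can precede a lock operation
            raise RuntimeError("2PL violation: lock after unlock")
        self.locks.add(item)      # growing phase

    def unlock(self, item):
        self.shrinking = True     # first unlock ends the growing phase
        self.locks.discard(item)

t = TwoPhaseTxn()
t.lock("X")
t.lock("Y")       # growing phase: acquire everything needed
t.unlock("X")     # shrinking phase begins at the locked point
# t.lock("Z") would now raise RuntimeError
```

A strict variant (covered later in the deck) simply holds every lock until commit, so the shrinking phase collapses into a single release at the end.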
Deadlocks
• A condition that occurs when two transactions wait for each other to unlock data
– T1 needs data items X and Y; T2 needs Y and X
– Each gets its first piece of data but then waits to get the second (which the other transaction is already holding): a "deadly embrace"
• Possible only if one of the transactions wants to obtain an exclusive lock on a data item
– No deadlock condition can exist among shared locks
• Controlled through
– Prevention
– Detection
– Avoidance
How a Deadlock Condition Is Created
Concurrency Control with Time Stamping Methods
• Assigns a globally unique time stamp to each transaction
– All database operations within the same transaction must have the same time stamp
• Produces an explicit order in which transactions are submitted to the DBMS
• Uniqueness
– Ensures that no equal time stamp values can exist
• Monotonicity
– Ensures that time stamp values always increase
• Disadvantage
– Each value stored in the database requires two additional time stamp fields: last read, last update
Wait/Die and Wound/Wait Schemes
• Wait/die
– The older transaction waits, and the younger is rolled back and rescheduled
• Wound/wait
– The older transaction rolls back the younger transaction and reschedules it
• In the situation where a transaction requests multiple locks, each lock request has an associated time-out value; if the lock is not granted before the time-out expires, the transaction is rolled back
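The two schemes differ only in what happens when the requester is older than the lock holder. A compact sketch (assuming smaller time stamp = older transaction; the function names are illustrative):

```python
def wait_die(req_ts, holder_ts):
    # Wait/die: an older requester waits; a younger requester
    # "dies" (is rolled back and rescheduled with its old stamp).
    return "wait" if req_ts < holder_ts else "die"

def wound_wait(req_ts, holder_ts):
    # Wound/wait: an older requester "wounds" (rolls back) the
    # younger holder; a younger requester simply waits.
    return "wound" if req_ts < holder_ts else "wait"

assert wait_die(1, 2) == "wait"      # older T(1) waits for T(2)
assert wait_die(2, 1) == "die"       # younger T(2) is rolled back
assert wound_wait(1, 2) == "wound"   # older T(1) preempts T(2)
assert wound_wait(2, 1) == "wait"    # younger T(2) waits
```

In both schemes, only the younger transaction is ever rolled back, so the oldest transaction always eventually runs; that is what prevents deadlock.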
Concurrency Control with Optimistic Methods
• Optimistic approach
– Based on the assumption that the majority of database operations do not conflict
– Does not require locking or time stamping techniques
– A transaction is executed without restrictions until it is committed
– Acceptable for mostly-read or query database systems that require very few update transactions
– Phases are read, validation, and write
Concurrency Control with Optimistic Methods
• Phases are read, validation, and write
– Read phase: the transaction reads the database, executes the needed computations, and makes the updates to a private copy of the database values
• All update operations of the transaction are recorded in a temporary update file, which is not accessed by the remaining transactions
– Validation phase: the transaction is validated to ensure that the changes made will not affect the integrity and consistency of the database
• If the validation test is positive, the transaction goes to the writing phase; if negative, the transaction is restarted and the changes are discarded
– Writing phase: the changes are permanently applied to the database
Schedules
• A schedule is a list of actions from a set of transactions
– A well-formed schedule is one where the actions of a particular transaction T are in the same order as they appear in T
• For example
– [RT1(a), WT1(a), RT2(b), WT2(b), RT1(c), WT1(c)] is a well-formed schedule
– [RT1(c), WT1(c), RT2(b), WT2(b), RT1(a), WT1(a)] is not a well-formed schedule
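Well-formedness is easy to check mechanically: walk the schedule and verify each transaction's actions appear in the order that transaction defines. A sketch (illustrative representation; actions are strings, and T1 = R(a), W(a), R(c), W(c) as in the example above):

```python
def is_well_formed(schedule, txns):
    """schedule: list of (txn_id, action) pairs.
    txns: dict mapping txn_id to that transaction's action list.
    True iff every transaction's actions occur in its own order."""
    pos = {t: 0 for t in txns}            # next expected action index
    for txn, action in schedule:
        if pos[txn] >= len(txns[txn]) or txns[txn][pos[txn]] != action:
            return False                  # out of order (or extra)
        pos[txn] += 1
    return all(pos[t] == len(txns[t]) for t in txns)

T1 = ["R(a)", "W(a)", "R(c)", "W(c)"]
T2 = ["R(b)", "W(b)"]
ok  = [("T1", "R(a)"), ("T1", "W(a)"), ("T2", "R(b)"),
       ("T2", "W(b)"), ("T1", "R(c)"), ("T1", "W(c)")]
bad = [("T1", "R(c)"), ("T1", "W(c)"), ("T2", "R(b)"),
       ("T2", "W(b)"), ("T1", "R(a)"), ("T1", "W(a)")]
# ok is well-formed; bad swaps T1's (a) and (c) actions, so it is not
```

Note the check says nothing about correctness of the interleaving itself; a well-formed schedule can still exhibit the lost-update and dirty-read anomalies above.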
Schedules (cont.)
• A complete schedule is one that contains an abort or commit action for every transaction that occurs in the schedule
• A serial schedule is one where the actions of different transactions are not interleaved
Serialisability
• A serialisable schedule is a schedule whose effect on any consistent database instance is identical to that of some complete serial schedule
• NOTE:
– All different results are assumed to be acceptable
– It's more complicated when we have transactions that abort
– We'll assume that all 'side effects' of a transaction are written to the database
Anomalies with Interleaved Execution
• Two actions on the same data object conflict if at least one of them is a write
• We'll now consider three ways in which a schedule involving two consistency-preserving transactions can leave a consistent database inconsistent
WR Conflicts
• Transaction T2 reads a database object that has been modified by T1, which has not committed
– T1: R(a), W(a), R(b), W(b), C (debit €100 from a, credit €100 to b)
– T2: R(a), W(a), R(b), W(b), C (read a and b and add 6% interest)
– If T2 runs between T1's two updates, it reads the debited a but the un-credited b: a "dirty read"
RW Conflicts
• Transaction T2 could change the value of an object that has been read by a transaction T1, while T1 is still in progress
– T1: R(a), R(a), W(a), C with T2: R(a), W(a), C interleaved between T1's reads
– Example: a is 5; each transaction reads a as 5; one writes 5 + 1 = 6, the other writes 5 - 1 = 4, and a ends up as 4
– T1's second read can see a different value from its first: an "unrepeatable read"
WW Conflicts
• Transaction T2 could overwrite the value of an object which has already been modified by T1, while T1 is still in progress
– T1: [W(Britney), W(gmb)] "set both salaries at £1m"
– T2: [W(gmb), W(Britney)] "set both salaries at $1m"
• But interleaved as T1: W(Britney); T2: W(gmb); T1: W(gmb); T2: W(Britney)
– gmb gets £1m and Britney gets $1m: an inconsistent result from "blind writes"
Serialisability and Aborts
• Things are more complicated when transactions can abort
– T1: R(a), W(a), Abort (deduct €100 from a)
– T2: R(a), W(a), R(b), W(b), C (add 6% interest to a and b)
– If T2 read a after T1's write, we can't undo T2: it's already committed
Strict Two-Phase Locking
• The DBMS enforces the following locking protocol:
– Each transaction must obtain an S (shared) lock before reading, and an X (exclusive) lock before writing
– All locks held by a transaction are released when the transaction completes
– If a transaction holds an X lock on an object, no other transaction can get a lock (S or X) on that object
• Strict 2PL allows only serialisable schedules
More Refined Locks
• Some updates that seem at first sight to require a write (X) lock can be given something weaker
– Example: consider a seat-count object in a flights database
– Two transactions that wish to book a flight each get an X lock on the seat count
– Does it matter in what order they decrement the count?
• They are commutative actions!
• Do they need a write lock?
Aborting
• If a transaction Ti is aborted, then all its actions must be undone
– Also, if Tj reads an object last written by Ti, then Tj must be aborted!
• Most systems avoid cascading aborts by releasing locks only at commit time (strict protocols)
– If Ti writes an object, then Tj can only read it after Ti finishes
• In order to undo changes, the DBMS maintains a log which records every write
The Log
• The following facts are recorded in the log
– "Ti writes an object": store the new and old values
– "Ti commits/aborts": store just a record
• Log records are chained together by transaction id, so it's easy to undo a specific transaction
• The log is often duplexed and archived on stable storage (it's important!)
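Because each write record stores both the old and the new value, undoing an aborted transaction is a backwards walk over the log, restoring old values. A minimal sketch (the tuple layout of a log record is an illustrative assumption):

```python
# Log records: (txn, op, object, old_value, new_value)
log = [("T1", "write", "a", 100, 95),
       ("T2", "write", "b", 50, 60),
       ("T1", "write", "b", 60, 55)]
db = {"a": 95, "b": 55}   # state after all three writes

def undo(txn, log, db):
    """Walk the log backwards, restoring the old value for each
    write by the aborted transaction; other writes are untouched."""
    for t, op, obj, old, new in reversed(log):
        if t == txn and op == "write":
            db[obj] = old
    return db

undo("T1", log, db)
# db == {"a": 100, "b": 60}: T1's writes reversed, T2's kept
```

The backwards order matters when a transaction wrote the same object twice: the earliest old value must win, which reversing the log guarantees.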
Connection to Normalization
• The more redundancy in a database, the more locking is required for (update) transactions
– Extreme case: so much redundancy that all update transactions are forced to execute serially
• In general, less redundancy allows for greater concurrency and greater transaction throughput
!!! This is what normalization is all about !!!
The Fundamental Tradeoff of Database Performance Tuning
• De-normalized data can often result in faster query response
• Normalized data leads to better transaction throughput
• What is more important in your database: query response or transaction throughput? The answer will vary. What do the extreme ends of the spectrum look like?
• Yes, indexing data can speed up queries, but this just proves the point: an index IS redundant data
• General rule of thumb: indexing will slow down transactions!
Summary
You should now understand:
• Transactions and the ACID properties
• Schedules and serialisable schedules
• Potential anomalies with interleaving
• Strict 2-phase locking
• Problems with transactions that can abort
• Logs