Dr. C.V. Suresh Babu
Transactions and Recovery
A transaction is an action, or a
series of actions, carried out
by a single user or an
application program, which
reads or updates the contents
of a database.
• Any action that reads from and/or writes to a
database may consist of
– Simple SELECT statement to generate a list of table
contents
– A series of related UPDATE statements to change the
values of attributes in various tables
– A series of INSERT statements to add rows to one or
more tables
– A combination of SELECT, UPDATE, and INSERT
statements
• A logical unit of work that must be either entirely
completed or aborted
• Successful transaction changes the database from
one consistent state to another
– One in which all data integrity constraints are satisfied
• Most real-world database transactions are formed
by two or more database requests
– The equivalent of a single SQL statement in an
application program or transaction
• Not all transactions update the database
• SQL code represents a transaction because the database
was accessed
• Improper or incomplete transactions can have a
devastating effect on database integrity
– Some DBMSs provide means by which user can
define enforceable constraints based on business
rules
– Other integrity rules are enforced automatically by
the DBMS when table structures are properly
defined, thereby letting the DBMS validate some
transactions
Figure 9.2
• For example, a transaction may involve
– The creation of a new invoice
– Insertion of a row in the LINE table
– Decreasing the quantity on hand by 1
– Updating the customer balance
– Creating a new account transaction row
• If the system fails between the first and last step,
the database will no longer be in a consistent
state
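The all-or-nothing behaviour described above can be sketched in a few lines of Python. This is a hypothetical in-memory store with illustrative step names, not a real DBMS: a snapshot is taken before the steps run, and restored if any step fails.

```python
import copy

def run_transaction(state, steps):
    """Apply every step or none: snapshot first, restore on any failure."""
    snapshot = copy.deepcopy(state)
    try:
        for step in steps:
            step(state)
        return True               # all steps applied: "commit"
    except Exception:
        state.clear()
        state.update(snapshot)    # undo every step: "rollback"
        return False

def take_one_from_stock(state):
    state["qty_on_hand"] -= 1     # e.g. decreasing the quantity on hand by 1

def simulated_crash(state):
    raise RuntimeError("system fails before the last step")

db = {"qty_on_hand": 10, "customer_balance": 0}
ok = run_transaction(db, [take_one_from_stock, simulated_crash])
# ok is False and db is unchanged: no half-finished invoice survives
```

Because the failure occurred between the first and last step, the rollback leaves the database in its original consistent state.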
• Atomicity
– Transactions are atomic –
they don’t have parts
(conceptually)
– can’t be executed partially; it
should not be detectable that
they interleave with another
transaction
• Consistency
– Transactions take the
database from one consistent
state into another
– In the middle of a transaction
the database might not be
consistent
• Isolation
– The effects of a transaction
are not visible to other
transactions until it has
completed
– From outside the transaction
has either happened or not
– To me this actually sounds
like a consequence of
atomicity…
• Durability
– Once a transaction has
completed, its changes are
made permanent
– Even if the system crashes,
the effects of a transaction
must remain in place
Transactions and Recovery
• Transfer Rs. 500 from account
A to account B
Read(A)
A = A - 500
Write(A)
Read(B)
B = B + 500
Write(B)
Atomicity - shouldn’t take money
from A without giving it to B
Consistency - money isn’t lost or
gained
Isolation - other queries shouldn’t
see A or B change until
completion
Durability - the money does not
go back to A after the
transaction commits
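The transfer can be traced directly in Python; the account balances and the names `accounts`, `total_before`, `total_after` are illustrative, but the six operations mirror the slide exactly.

```python
accounts = {"A": 1000, "B": 200}
total_before = sum(accounts.values())

a = accounts["A"]       # Read(A)
a = a - 500             # A = A - 500
accounts["A"] = a       # Write(A)
b = accounts["B"]       # Read(B)
b = b + 500             # B = B + 500
accounts["B"] = b       # Write(B)

total_after = sum(accounts.values())
# consistency: total_after equals total_before, money neither lost nor gained
```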
Transactions and Recovery
The Transaction Manager
• The transaction manager
enforces the ACID
properties
– It schedules the operations of
transactions
– COMMIT and ROLLBACK are
used to ensure atomicity
– Locks or timestamps are used
to ensure consistency and
isolation for concurrent
transactions (next lectures)
– A log is kept to ensure
durability in the event of
system failure (this lecture)
Transactions and Recovery
COMMIT and ROLLBACK
• COMMIT signals the
successful end of a
transaction
– Any changes made by the
transaction should be saved
– These changes are now
visible to other transactions
• ROLLBACK signals the
unsuccessful end of a
transaction
– Any changes made by the
transaction should be undone
– It is now as if the transaction
never existed
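Python's built-in `sqlite3` module shows COMMIT and ROLLBACK at work. This minimal sketch uses an in-memory database; the table and account names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 1000)")
conn.commit()                       # COMMIT: the inserted row is now permanent

conn.execute("UPDATE account SET balance = balance - 500 WHERE name = 'A'")
conn.rollback()                     # ROLLBACK: the update is undone

balance = conn.execute(
    "SELECT balance FROM account WHERE name = 'A'").fetchone()[0]
# balance is still 1000: it is as if the update never existed
```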
Transactions and Recovery
Recovery
• Transactions should be
durable, but we cannot
prevent all sorts of failures:
– System crashes
– Power failures
– Disk crashes
– User mistakes
– Sabotage
– Natural disasters
• Prevention is better than
cure
– Reliable OS
– Security
– UPS and surge protectors
– RAID arrays
• Can’t protect against
everything though
Transactions and Recovery
The Transaction Log
• The transaction log records
the details of all
transactions
– Any changes the transaction
makes to the database
– How to undo these changes
– When transactions complete
and how
• The log is stored on disk,
not in memory
– If the system crashes it is
preserved
• Write-ahead log rule
– The entry in the log must be
made before COMMIT
processing can complete
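A toy illustration of the write-ahead rule, with a Python list standing in for the on-disk log (all names here are hypothetical): the log record, carrying the old and new values needed for undo and redo, is appended before the data item is changed.

```python
log = []                 # stands in for the on-disk transaction log
data = {"X": 100}        # stands in for the database

def logged_write(tid, key, new_value):
    """Write-ahead rule: append the log record, THEN change the data."""
    log.append((tid, key, data[key], new_value))   # (txn, item, old, new)
    data[key] = new_value

logged_write("T1", "X", 150)
log.append(("T1", "COMMIT"))   # COMMIT completes only after its log entry
```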
Transactions and Recovery
System Failures
• A system failure means all
running transactions are
affected
– Software crashes
– Power failures
• The physical media (disks)
are not damaged
• At various times a DBMS
takes a checkpoint
– All committed transactions
are written to disk
– A record is made (on disk) of
the transactions that are
currently running
Transactions and Recovery
Types of Transactions
[Timeline figure: transactions T1–T5 relative to the last checkpoint and the system failure. T1 commits before the checkpoint; T2 and T4 commit after it; T3 and T5 are still running at the failure.]
Without Concurrency Control, problems may occur
with concurrent transactions:
• Lost Update Problem.
Occurs when two transactions update the same data
item, but both read the same original value before
update (Figure 21.3(a), next slide)
• The Temporary Update (or Dirty Read) Problem.
This occurs when one transaction T1 updates a
database item X, which is accessed (read) by
another transaction T2; then T1 fails for some reason
(Figure 21.3(b)); X was (read) by T2 before its value
is changed back (rolled back or UNDONE) after T1
fails
Concurrency control
• The Incorrect Summary Problem.
One transaction is calculating an aggregate summary
function on a number of records (for example, sum
(total) of all bank account balances) while other
transactions are updating some of these records (for
example, transferring a large amount between two
accounts, see Figure 21.3(c)); the aggregate function
may read some values before they are updated and
others after they are updated.
Concurrency control
• The Unrepeatable Read Problem.
A transaction T1 may read an item (say, available
seats on a flight); later, T1 may read the same item
again and get a different value because another
transaction T2 has updated the item (reserved seats
on the flight) between the two reads by T1
Concurrency control
Causes of transaction failure:
1. A computer failure (system crash): A hardware or
software error occurs during transaction execution. If
the hardware crashes, the contents of the computer’s
internal main memory may be lost.
2. A transaction or system error : Some operation in the
transaction may cause it to fail, such as integer overflow
or division by zero. Transaction failure may also occur
because of erroneous parameter values or because of
a logical programming error. In addition, the user may
interrupt the transaction during its execution.
Recovery
3. Local errors or exception conditions detected by
the transaction:
- certain conditions necessitate cancellation of the
transaction. For example, data for the transaction may
not be found. A condition, such as insufficient account
balance in a banking database, may cause a
transaction, such as a fund withdrawal, to be canceled
- a programmed abort causes the transaction to fail.
4. Concurrency control enforcement: The concurrency
control method may decide to abort the transaction, to
be restarted later, because it violates serializability or
because several transactions are in a state of
deadlock
Recovery
5. Disk failure: Some disk blocks may lose their data
because of a read or write malfunction or because of
a disk read/write head crash. This kind of failure and
item 6 are more severe than items 1 through 4.
6. Physical problems and catastrophes: This refers
to an endless list of problems that includes power or
air-conditioning failure, fire, theft, sabotage,
overwriting disks or tapes by mistake, and mounting
of a wrong tape by the operator.
Recovery
Transaction and System Concepts
A transaction is an atomic unit of work that is either
completed in its entirety or not done at all. A
transaction passes through several states.
Transaction states:
• Active state (executing read, write operations)
• Partially committed state (ended but waiting for
system checks to determine success or failure)
• Committed state (transaction succeeded)
• Failed state (transaction failed, must be rolled back)
• Terminated State (transaction leaves system)
Transactions and Recovery
System Recovery
• Any transaction that was
running at the time of
failure needs to be undone
and restarted
• Any transactions that
committed since the last
checkpoint need to be
redone
• Transactions of type T1 need
no recovery
• Transactions of type T3 or T5
need to be undone and
restarted
• Transactions of type T2 or T4
need to be redone
Transactions and Recovery
Transaction Recovery
UNDO and REDO: lists of transactions
UNDO = all transactions running at the last checkpoint
REDO = empty
For each entry in the log, starting at the last checkpoint
If a BEGIN TRANSACTION entry is found for T
Add T to UNDO
If a COMMIT entry is found for T
Move T from UNDO to REDO
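The algorithm above translates almost line for line into Python. The function name and the (kind, transaction) log-entry encoding are assumptions for illustration; the scenario fed to it is the T1–T5 example from these slides.

```python
def recovery_lists(active_at_checkpoint, log_entries):
    """Build UNDO/REDO lists: start with UNDO = transactions running at
    the last checkpoint and REDO empty, then scan the log forward."""
    undo = list(active_at_checkpoint)
    redo = []
    for kind, t in log_entries:
        if kind == "BEGIN":          # BEGIN TRANSACTION entry for t
            undo.append(t)
        elif kind == "COMMIT":       # COMMIT entry for t
            undo.remove(t)
            redo.append(t)
    return undo, redo

# T2, T3 active at the checkpoint; then T4 begins, T5 begins,
# T2 commits, T4 commits; then the system fails
undo, redo = recovery_lists(
    ["T2", "T3"],
    [("BEGIN", "T4"), ("BEGIN", "T5"), ("COMMIT", "T2"), ("COMMIT", "T4")],
)
# undo == ["T3", "T5"] (undo and restart), redo == ["T2", "T4"] (redo)
```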
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
Last Checkpoint: active transactions are T2, T3
UNDO: T2, T3
REDO: (empty)
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
T4 begins: add T4 to UNDO
UNDO: T2, T3, T4
REDO: (empty)
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
T5 begins: add T5 to UNDO
UNDO: T2, T3, T4, T5
REDO: (empty)
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
T2 commits: move T2 to REDO
UNDO: T3, T4, T5
REDO: T2
Transactions and Recovery
Transaction Recovery
[Timeline figure: transactions T1–T5 between the last checkpoint and the failure]
T4 commits: move T4 to REDO
UNDO: T3, T5
REDO: T2, T4
Transactions and Recovery
Forwards and Backwards
• Backwards recovery
– We need to undo some
transactions
– Working backwards through
the log we undo any
operation by a transaction on
the UNDO list
– This returns the database to a
consistent state
• Forwards recovery
– Some transactions need to be
redone
– Working forwards through
the log we redo any operation
by a transaction on the REDO
list
– This brings the database up to
date
Transactions and Recovery
Concurrency
• Large databases are used by
many people
– Many transactions to be run
on the database
– It is desirable to let them run
at the same time as each
other
– Need to preserve isolation
• If we don’t allow for
concurrency then
transactions are run
sequentially
– Have a queue of transactions
– Long transactions (eg
backups) will make others
wait for long periods
Transactions and Recovery
Concurrency Problems
• In order to run transactions
concurrently we interleave
their operations
• Each transaction gets a
share of the computing
time
• This leads to several sorts of
problems
– Lost updates
– Uncommitted updates
– Incorrect analysis
• All arise because isolation is
broken
Transactions and Recovery
Lost Update
• T1 and T2 read X, both
modify it, then both write it
out
– The net effect of T1 and T2
should be no change on X
– Only T2’s change is seen,
however, so the final value of
X has increased by 5
T1              T2
Read(X)
X = X - 5
                Read(X)
                X = X + 5
Write(X)
                Write(X)
COMMIT
                COMMIT
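The interleaving can be simulated with plain variables standing in for each transaction's private workspace (`t1_x`, `t2_x` are illustrative names):

```python
X = 100
t1_x = X            # T1: Read(X)
t1_x = t1_x - 5     # T1: X = X - 5
t2_x = X            # T2: Read(X) -- still sees the original 100
t2_x = t2_x + 5     # T2: X = X + 5
X = t1_x            # T1: Write(X) -> 95
X = t2_x            # T2: Write(X) -> 105, overwriting T1's update
# net effect should have been no change; instead X increased by 5
# because T1's update was lost
```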
Transactions and Recovery
Uncommitted Update
• T2 sees the change to X
made by T1, but T1 is rolled
back
– The change made by T1 is
undone on rollback
– It should be as if that change
never happened
T1              T2
Read(X)
X = X - 5
Write(X)
                Read(X)
                X = X + 5
                Write(X)
ROLLBACK
                COMMIT
Transactions and Recovery
Inconsistent analysis
• T1 doesn’t change the sum
of X and Y, but T2 sees a
change
– T1 consists of two parts –
take 5 from X and then add 5
to Y
– T2 sees the effect of the first,
but not the second
T1              T2
Read(X)
X = X - 5
Write(X)
                Read(X)
                Read(Y)
                Sum = X + Y
Read(Y)
Y = Y + 5
Write(Y)
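The same simulation style shows the broken analysis: T1 preserves the invariant X + Y, but T2 reads between T1's two halves (`sum_seen` is an illustrative name):

```python
X, Y = 50, 50
X = X - 5          # T1, first half: Read(X), X = X - 5, Write(X)
sum_seen = X + Y   # T2 runs here: Read(X), Read(Y), Sum = X + Y
Y = Y + 5          # T1, second half: Read(Y), Y = Y + 5, Write(Y)
# T1 left X + Y unchanged at 100, yet T2 observed a sum of 95
```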
Concurrency Control with Locking Methods
• Lock
– Guarantees exclusive use of a data item to a current
transaction
• T2 does not have access to a data item that is currently being
used by T1
• Transaction acquires a lock prior to data access; the lock is
released when the transaction is complete
– Required to prevent another transaction from reading
inconsistent data
• Lock manager
– Responsible for assigning and policing the locks used by
the transactions
Lock Granularity
• Indicates the level of lock use
• Locking can take place at the following levels:
– Database-level lock
• Entire database is locked
– Table-level lock
• Entire table is locked
– Page-level lock
• Entire disk page is locked
– Row-level lock
• Allows concurrent transactions to access different rows of the same
table, even if the rows are located on the same page
– Field-level lock
• Allows concurrent transactions to access the same row, as long as they
require the use of different fields (attributes) within that row
A Database-Level Locking Sequence
 Good for batch processing but unsuitable for online multi-user DBMSs
 T1 and T2 cannot access the same database concurrently even if they use
different tables
Table-Level Lock
 T1 and T2 can access the same database concurrently as long as they use different tables
 Can cause bottlenecks when many transactions are trying to access the same table (even if
the transactions want to access different parts of the table and would not interfere with
each other)
 Not suitable for multi-user DBMSs
Page-Level Lock
 An entire disk page is locked (a table can span several pages and each page can
contain several rows of one or more tables)
 Most frequently used multi-user DBMS locking method
Row-Level Lock
 Concurrent transactions can access different rows of the same table even if the
rows are located on the same page
 Improves data availability but with high overhead (each row has a lock that must
be read and written to)
Field-Level Lock
 Allows concurrent transactions to access the
same row as long as they require the use of
different fields (attributes) within that row
 Most flexible lock but requires an extremely high
level of overhead
Binary Locks
 Has only two states: locked (1) or unlocked (0)
 Eliminates “Lost Update” problem – the lock is not released until the write
statement is completed
 Cannot use PROD_QOH until it has been properly updated
 Considered too restrictive to yield optimal concurrency conditions as it locks even for two
READs when no update is being done
Shared/Exclusive Locks
• Exclusive lock
– Access is specifically reserved for the transaction that locked the object
– Must be used when the potential for conflict exists – when a transaction wants to
update a data item and no locks are currently held on that data item by another
transaction
– Granted if and only if no other locks are held on the data item
• Shared lock
– Concurrent transactions are granted Read access on the basis of a common lock
– Issued when a transaction wants to read data and no exclusive lock is held on that data
item
• Multiple transactions can each have a shared lock on the same data item if they are all
just reading it
• Mutual Exclusive Rule
– Only one transaction at a time can own an exclusive lock on the same object
Shared/Exclusive Locks
• Increased overhead
– The type of lock held must be known before a lock can be granted
– Three lock operations exist:
• READ_LOCK to check the type of lock
• WRITE_LOCK to issue the lock
• UNLOCK to release the lock
– A lock can be upgraded from shared to exclusive and downgraded from
exclusive to shared
• Two possible major problems may occur
– The resulting transaction schedule may not be serializable
– The schedule may create deadlocks
Two-Phase Locking to Ensure Serializability
• Defines how transactions acquire and
relinquish locks
• Guarantees serializability, but it does not
prevent deadlocks
– Growing phase, in which a transaction acquires all
the required locks without unlocking any data
– Shrinking phase, in which a transaction releases all
locks and cannot obtain any new lock
Two-Phase Locking to Ensure Serializability
• Governed by the following rules:
– Two transactions cannot have conflicting locks
– No unlock operation can precede a lock operation
in the same transaction
– No data are affected until all locks are obtained—
that is, until the transaction is in its locked point
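A small checker for the two-phase property makes the rules concrete. The (operation, item) encoding is hypothetical; the test is simply that no lock is acquired after any lock has been released.

```python
def is_two_phase(ops):
    """True iff no LOCK follows an UNLOCK (growing phase, then shrinking)."""
    shrinking = False
    for op, _item in ops:
        if op == "UNLOCK":
            shrinking = True          # the shrinking phase has begun
        elif op == "LOCK" and shrinking:
            return False              # lock acquired after an unlock: not 2PL
    return True

ok_2pl = is_two_phase([("LOCK", "X"), ("LOCK", "Y"),
                       ("UNLOCK", "X"), ("UNLOCK", "Y")])
bad_2pl = is_two_phase([("LOCK", "X"), ("UNLOCK", "X"), ("LOCK", "Y")])
# ok_2pl is True; bad_2pl is False (an unlock preceded a lock)
```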
Two-Phase Locking Protocol
Deadlocks
• Condition that occurs when two transactions wait for each
other to unlock data
– T1 needs data items X and Y; T2 needs Y and X
– Each gets its first piece of data but then waits to get the second
(which the other transaction is already holding): the deadly embrace
• Possible only if one of the transactions wants to obtain an
exclusive lock on a data item
– No deadlock condition can exist among shared locks
• Control through
– Prevention
– Detection
– Avoidance
How a Deadlock Condition Is Created
Concurrency Control with Time Stamping
Methods
• Assigns a global unique time stamp to each transaction
– All database operations within the same transaction must have the
same time stamp
• Produces an explicit order in which transactions are submitted
to the DBMS
• Uniqueness
– Ensures that no equal time stamp values can exist
• Monotonicity
– Ensures that time stamp values always increase
• Disadvantage
– Each value stored in the database requires two additional time stamp
fields – last read, last update
Wait/Die and Wound/Wait Schemes
• Wait/die
– Older transaction waits and the younger is rolled back and rescheduled
• Wound/wait
– Older transaction rolls back the younger transaction and reschedules it
• In the situation where a transaction requests multiple locks,
each lock request has an associated time-out value. If the lock
is not granted before the time-out expires, the transaction is
rolled back
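Both schemes reduce to a timestamp comparison. Assuming smaller timestamps mean older transactions (the usual convention), a sketch with hypothetical function names:

```python
def wait_die(requester_ts, holder_ts):
    """Wait/die: an older requester waits; a younger one dies
    (is rolled back and rescheduled with its original timestamp)."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Wound/wait: an older requester wounds the holder (rolls it back
    and reschedules it); a younger requester waits."""
    return "wound" if requester_ts < holder_ts else "wait"
```

In both schemes it is always the younger transaction that is rolled back, so no transaction can be starved forever.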
Wait/Die and Wound/Wait
Concurrency Control Schemes
Concurrency Control with Optimistic Methods
• Optimistic approach
– Based on the assumption that the majority of database
operations do not conflict
– Does not require locking or time stamping techniques
– Transaction is executed without restrictions until it is committed
– Acceptable for mostly read or query database systems that
require very few update transactions
– Phases are read, validation, and write
Concurrency Control with Optimistic Methods
• Phases are read, validation, and write
– Read phase – transaction reads the database, executes the needed
computations and makes the updates to a private copy of the database
values.
• All update operations of the transaction are recorded in a temporary
update file which is not accessed by the remaining transactions
– Validation phase – transaction is validated to ensure that the changes
made will not affect the integrity and consistency of the database
• If the validation test is positive, the transaction goes to the writing
phase. If negative, the transaction is restarted and the changes
discarded
– Writing phase – the changes are permanently applied to the database
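A minimal sketch of the validation and write phases, assuming a hypothetical versioned store that maps each key to a (value, version) pair; the read phase records the versions it saw, and validation succeeds only if they are unchanged.

```python
db = {"X": (10, 0)}            # key -> (value, version)

def validate_and_write(db, read_versions, writes):
    """Validation phase, then write phase, of an optimistic transaction."""
    for key, ver in read_versions.items():     # validation phase
        if db[key][1] != ver:
            return False                       # conflict: restart, discard
    for key, val in writes.items():            # write phase
        db[key] = (val, db[key][1] + 1)        # apply permanently
    return True

seen = {"X": db["X"][1]}       # read phase: remember version 0 of X
db["X"] = (99, 1)              # meanwhile another transaction updates X
ok = validate_and_write(db, seen, {"X": 20})
# ok is False: validation failed, the private update is discarded
# and the transaction must be restarted
```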
Schedules
• A schedule is a list of actions from a set of
transactions
– A well-formed schedule is one where the actions of a
particular transaction T are in the same order as they
appear in T
• For example
– [R_T1(a), W_T1(a), R_T2(b), W_T2(b), R_T1(c), W_T1(c)] is a well-
formed schedule
– [R_T1(c), W_T1(c), R_T2(b), W_T2(b), R_T1(a), W_T1(a)] is not a well-
formed schedule
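The well-formedness test is just a projection check: restrict the schedule to one transaction's actions and compare against that transaction's own order. This sketch encodes the example schedules above; the representation is hypothetical.

```python
def is_well_formed(schedule, transactions):
    """schedule: list of (txn, action); transactions: txn -> action list.
    Well-formed iff each transaction's actions appear in its own order."""
    projection = {t: [a for (txn, a) in schedule if txn == t]
                  for t in transactions}
    return projection == transactions

t1 = ["R(a)", "W(a)", "R(c)", "W(c)"]
t2 = ["R(b)", "W(b)"]
good = [("T1", "R(a)"), ("T1", "W(a)"), ("T2", "R(b)"), ("T2", "W(b)"),
        ("T1", "R(c)"), ("T1", "W(c)")]
bad = [("T1", "R(c)"), ("T1", "W(c)"), ("T2", "R(b)"), ("T2", "W(b)"),
       ("T1", "R(a)"), ("T1", "W(a)")]

good_ok = is_well_formed(good, {"T1": t1, "T2": t2})   # True
bad_ok = is_well_formed(bad, {"T1": t1, "T2": t2})     # False: T1 reversed
```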
Schedules cont.
• A complete schedule is one that contains an
abort or commit action for every transaction
that occurs in the schedule
• A serial schedule is one where the actions of
different transactions are not interleaved
Serialisability
• A serialisable schedule is a schedule whose
effect on any consistent database instance is
identical to that of some complete serial
schedule
• NOTE:
– All different results assumed to be acceptable
– It’s more complicated when we have transactions
that abort
– We’ll assume that all ‘side-effects’ of a transaction
are written to the database
Anomalies with interleaved
execution
• Two actions on the same data object conflict if
at least one of them is a write
• We’ll now consider three ways in which a
schedule involving two consistency-preserving
transactions can leave a consistent database
inconsistent
WR conflicts
• Transaction T2 reads a database object that
has been modified by T1 which has not
committed
T1: R(a), W(a),                           R(b), W(b), C
T2:               R(a), W(a), R(b), W(b), C

T1 debits €100 from a and later credits €100 to b; in between, T2 reads
a and b and adds 6% interest. T2's read of a is a "dirty read": it sees
T1's uncommitted debit.
RW conflicts
• Transaction T2 could change the value of an object
that has been read by a transaction T1, while T1 is
still in progress
T1: R(a),                   R(a), W(a), C
T2:         R(a), W(a), C

"Unrepeatable read": T1's second read of a returns a different value,
because T2 wrote a between T1's two reads.

T1: R(a),       W(a), C          (reads a = 5, writes 5 + 1 = 6)
T2:       R(a),       W(a), C    (reads a = 5, writes 5 - 1 = 4)

Both read the original 5, so a ends up as 4.
WW conflicts
• Transaction T2 could overwrite the value of an
object which has already been modified by T1,
while T1 is still in progress
T1: [W(Britney), W(gmb)]   "Set both salaries at £1m"
T2: [W(gmb), W(Britney)]   "Set both salaries at $1m"
• But interleaved as:
T1: W(Britney),                          W(gmb)
T2:              W(gmb), W(Britney)

gmb gets £1m and Britney gets $1m: a "blind write" anomaly, since
neither serial order produces this mixed outcome.
Serialisability and aborts
• Things are more complicated when
transactions can abort
T1: R(a), W(a),                          Abort
T2:             R(a), W(a), R(b), W(b), C

T1 deducts €100 from a; T2 then adds 6% interest to a and b and
commits. When T1 aborts, we can't undo T2: it's committed.
Strict two-phase locking
• DBMS enforces the following locking protocol:
– Each transaction must obtain an S (shared) lock before
reading, and an X (exclusive) lock before writing
– All locks held by a transaction are released when the
transaction completes
– If a transaction holds an X lock on an object, no other
transaction can get a lock (S or X) on that object
• Strict 2PL allows only serialisable schedules
More refined locks
• Some updates that seem at first sight to require a
write (X) lock, can be given something weaker
– Example: Consider a seat count object in a flights database
– There are two transactions that wish to book a flight – get
X lock on seat count
– Does it matter in what order they decrement the count?
• They are commutative actions!
• Do they need a write lock?
Aborting
• If a transaction Ti is aborted, then all actions must be
undone
– Also, if Tj reads object last written by Ti, then Tj must be
aborted!
• Most systems avoid cascading aborts by releasing
locks only at commit time (strict protocols)
– If Ti writes an object, then Tj can only read this after Ti
finishes
• In order to undo changes, the DBMS maintains a log
which records every write
The log
• The following facts are recorded in the log
– “Ti writes an object”: store new and old values
– “Ti commits/aborts”: store just a record
• Log records are chained together by
transaction id, so it’s easy to undo a specific
transaction
• Log is often duplexed and archived on stable
storage (it’s important!)
Connection to Normalization
• The more redundancy in a database, the more
locking is required for (update) transactions.
– Extreme case: so much redundancy that all update
transactions are forced to execute serially.
• In general, less redundancy allows for greater
concurrency and greater transaction throughput.
!!! This is what normalization is all about !!!
The Fundamental Tradeoff of Database
Performance Tuning
• De-normalized data can often result in faster
query response
• Normalized data leads to better transaction
throughput
What is more important in your database --- query response
or transaction throughput? The answer will vary.
What do the extreme ends of the spectrum look like?
Yes, indexing data can speed up queries, but this just proves
the point: an index IS redundant data. General rule of thumb:
indexing will slow down (update) transactions!
Summary
You should now understand:
• Transactions and the ACID properties
• Schedules and serialisable schedules
• Potential anomalies with interleaving
• Strict 2-phase locking
• Problems with transactions that can abort
• Logs

Tranasaction management

  • 1.
  • 2.
    Transactions and Recovery Atransaction is an action, or a series of actions, carried out by a single user or an application program, which reads or updates the contents of a database.
  • 3.
    • Any actionthat reads from and/or writes to a database may consist of – Simple SELECT statement to generate a list of table contents – A series of related UPDATE statements to change the values of attributes in various tables – A series of INSERT statements to add rows to one or more tables – A combination of SELECT, UPDATE, and INSERT statements
  • 4.
    • A logicalunit of work that must be either entirely completed or aborted • Successful transaction changes the database from one consistent state to another – One in which all data integrity constraints are satisfied • Most real-world database transactions are formed by two or more database requests – The equivalent of a single SQL statement in an application program or transaction
  • 5.
    • Not alltransactions update the database • SQL code represents a transaction because database was accessed • Improper or incomplete transactions can have a devastating effect on database integrity – Some DBMSs provide means by which user can define enforceable constraints based on business rules – Other integrity rules are enforced automatically by the DBMS when table structures are properly defined, thereby letting the DBMS validate some transactions
  • 6.
    Figure 9.2 • Forexample, a transaction may involve – The creation of a new invoice – Insertion of an row in the LINE table – Decreasing the quantity on hand by 1 – Updating the customer balance – Creating a new account transaction row • If the system fails between the first and last step, the database will no longer be in a consistent state
  • 7.
  • 8.
    • Atomicity – Transactionsare atomic – they don’t have parts (conceptually) – can’t be executed partially; it should not be detectable that they interleave with another transaction • Consistency – Transactions take the database from one consistent state into another – In the middle of a transaction the database might not be consistent • Isolation – The effects of a transaction are not visible to other transactions until it has completed – From outside the transaction has either happened or not – To me this actually sounds like a consequence of atomicity… • Durability – Once a transaction has completed, its changes are made permanent – Even if the system crashes, the effects of a transaction must remain in place
  • 9.
    Transactions and Recovery •Transfer Rs. 500 from account A to account B Read(A) A = A - 50 Write(A) Read(B) B = B+50 Write(B) Atomicity - shouldn’t take money from A without giving it to B Consistency - money isn’t lost or gained Isolation - other queries shouldn’t see A or B change until completion Durability - the money does not go back to A transaction
  • 10.
    Transactions and Recovery TheTransaction Manager • The transaction manager enforces the ACID properties – It schedules the operations of transactions – COMMIT and ROLLBACK are used to ensure atomicity – Locks or timestamps are used to ensure consistency and isolation for concurrent transactions (next lectures) – A log is kept to ensure durability in the event of system failure (this lecture)
  • 11.
    Transactions and Recovery COMMITand ROLLBACK • COMMIT signals the successful end of a transaction – Any changes made by the transaction should be saved – These changes are now visible to other transactions • ROLLBACK signals the unsuccessful end of a transaction – Any changes made by the transaction should be undone – It is now as if the transaction never existed
  • 12.
    Transactions and Recovery Recovery •Transactions should be durable, but we cannot prevent all sorts of failures: – System crashes – Power failures – Disk crashes – User mistakes – Sabotage – Natural disasters • Prevention is better than cure – Reliable OS – Security – UPS and surge protectors – RAID arrays • Can’t protect against everything though
  • 13.
    Transactions and Recovery TheTransaction Log • The transaction log records the details of all transactions – Any changes the transaction makes to the database – How to undo these changes – When transactions complete and how • The log is stored on disk, not in memory – If the system crashes it is preserved • Write ahead log rule – The entry in the log must be made before COMMIT processing can complete
  • 14.
    Transactions and Recovery SystemFailures • A system failure means all running transactions are affected – Software crashes – Power failures • The physical media (disks) are not damaged • At various times a DBMS takes a checkpoint – All committed transactions are written to disk – A record is made (on disk) of the transactions that are currently running
  • 15.
    Transactions and Recovery Typesof Transactions Last Checkpoint System Failure T1 T2 T3 T4 T5
  • 16.
    Without Concurrency Control,problems may occur with concurrent transactions: • Lost Update Problem. Occurs when two transactions update the same data item, but both read the same original value before update (Figure 21.3(a), next slide) • The Temporary Update (or Dirty Read) Problem. This occurs when one transaction T1 updates a database item X, which is accessed (read) by another transaction T2; then T1 fails for some reason (Figure 21.3(b)); X was (read) by T2 before its value is changed back (rolled back or UNDONE) after T1 fails Concurrency control
  • 18.
    • The IncorrectSummary Problem . One transaction is calculating an aggregate summary function on a number of records (for example, sum (total) of all bank account balances) while other transactions are updating some of these records (for example, transferring a large amount between two accounts, see Figure 21.3(c)); the aggregate function may read some values before they are updated and others after they are updated. Cont….. Concurrency control
  • 20.
    • The UnrepeatableRead Problem . A transaction T1 may read an item (say, available seats on a flight); later, T1 may read the same item again and get a different value because another transaction T2 has updated the item (reserved seats on the flight) between the two reads by T1 Cont….. Concurrency control
  • 21.
    Causes of transactionfailure: 1. A computer failure (system crash): A hardware or software error occurs during transaction execution. If the hardware crashes, the contents of the computer’s internal main memory may be lost. 2. A transaction or system error : Some operation in the transaction may cause it to fail, such as integer overflow or division by zero. Transaction failure may also occur because of erroneous parameter values or because of a logical programming error. In addition, the user may interrupt the transaction during its execution. Recovery
  • 22.
    3. Local errorsor exception conditions detected by the transaction: - certain conditions necessitate cancellation of the transaction. For example, data for the transaction may not be found. A condition, such as insufficient account balance in a banking database, may cause a transaction, such as a fund withdrawal, to be canceled - a programmed abort causes the transaction to fail. 4. Concurrency control enforcement: The concurrency control method may decide to abort the transaction, to be restarted later, because it violates serializability or because several transactions are in a state of deadlock Cont….. Recovery
  • 23.
    5. Disk failure:Some disk blocks may lose their data because of a read or write malfunction or because of a disk read/write head crash. This kind of failure and item 6 are more severe than items 1 through 4. 6. Physical problems and catastrophes: This refers to an endless list of problems that includes power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, and mounting of a wrong tape by the operator. Cont….. Recovery
  • 24.
    Transaction and SystemConcepts A transaction is an atomic unit of work that is either completed in its entirety or not done at all. A transaction passes through several states. Transaction states: • Active state (executing read, write operations) • Partially committed state (ended but waiting for system checks to determine success or failure) • Committed state (transaction succeeded) • Failed state (transaction failed, must be rolled back) • Terminated State (transaction leaves system)
Transactions and Recovery: System Recovery
• Any transaction that was running at the time of failure needs to be undone and restarted
• Any transaction that committed since the last checkpoint needs to be redone
• Transactions of type T1 need no recovery
• Transactions of type T3 or T5 need to be undone and restarted
• Transactions of type T2 or T4 need to be redone
Transactions and Recovery: Transaction Recovery
UNDO and REDO: lists of transactions
• UNDO = all transactions running at the last checkpoint
• REDO = empty
• For each entry in the log, starting at the last checkpoint:
– If a BEGIN TRANSACTION entry is found for T, add T to UNDO
– If a COMMIT entry is found for T, move T from UNDO to REDO
Transactions and Recovery: Transaction Recovery (figure: timeline of T1–T5 between the last checkpoint and the failure)
• At the last checkpoint, active transactions are T2 and T3 → UNDO: T2, T3; REDO: (empty)
• T4 begins: add T4 to UNDO → UNDO: T2, T3, T4; REDO: (empty)
• T5 begins: add T5 to UNDO → UNDO: T2, T3, T4, T5; REDO: (empty)
• T2 commits: move T2 to REDO → UNDO: T3, T4, T5; REDO: T2
• T4 commits: move T4 to REDO → UNDO: T3, T5; REDO: T2, T4
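The UNDO/REDO list construction can be sketched in a few lines. This is an illustrative implementation, not from the slides; the log-entry names ("BEGIN", "COMMIT") and the checkpoint representation are assumptions.

```python
def recovery_lists(active_at_checkpoint, log_after_checkpoint):
    """Return (undo, redo) sets by scanning the log forward
    from the last checkpoint, as the algorithm above describes."""
    undo = set(active_at_checkpoint)   # running at the last checkpoint
    redo = set()
    for action, txn in log_after_checkpoint:
        if action == "BEGIN":
            undo.add(txn)              # started after the checkpoint
        elif action == "COMMIT":
            undo.discard(txn)          # committed since the checkpoint:
            redo.add(txn)              # must be redone, not undone
    return undo, redo

# The T1..T5 walkthrough: T2, T3 active at the checkpoint,
# then T4 and T5 begin, then T2 and T4 commit before the failure.
undo, redo = recovery_lists(
    {"T2", "T3"},
    [("BEGIN", "T4"), ("BEGIN", "T5"),
     ("COMMIT", "T2"), ("COMMIT", "T4")])
# undo == {"T3", "T5"}, redo == {"T2", "T4"}
```

T1, which committed before the checkpoint, never appears in either list: it needs no recovery.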
Transactions and Recovery: Forwards and Backwards
• Backwards recovery
– We need to undo some transactions
– Working backwards through the log, we undo any operation by a transaction on the UNDO list
– This returns the database to a consistent state
• Forwards recovery
– Some transactions need to be redone
– Working forwards through the log, we redo any operation by a transaction on the REDO list
– This brings the database up to date
Transactions and Recovery: Concurrency
• Large databases are used by many people
– Many transactions are to be run on the database
– It is desirable to let them run at the same time as each other
– Need to preserve isolation
• If we don't allow for concurrency, then transactions are run sequentially
– Have a queue of transactions
– Long transactions (e.g. backups) will make others wait for long periods
Transactions and Recovery: Concurrency Problems
• In order to run transactions concurrently we interleave their operations
• Each transaction gets a share of the computing time
• This leads to several sorts of problems
– Lost updates
– Uncommitted updates
– Incorrect analysis
• All arise because isolation is broken
Transactions and Recovery: Lost Update
• T1 and T2 read X, both modify it, then both write it out
– The net effect of T1 and T2 should be no change to X
– Only T2's change is seen, however, so the final value of X has increased by 5
• Interleaved schedule:
– T1: Read(X); X = X - 5
– T2: Read(X); X = X + 5
– T1: Write(X)
– T2: Write(X)
– T1: COMMIT; T2: COMMIT
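A tiny linear simulation (illustrative, not from the slides; the starting value 100 is arbitrary) shows why the update is lost: both transactions read X before either writes, so T2's write clobbers T1's.

```python
X = 100
t1_x = X        # T1: Read(X)
t2_x = X        # T2: Read(X), before T1 has written
X = t1_x - 5    # T1: Write(X) -> 95
X = t2_x + 5    # T2: Write(X) -> 105; T1's subtraction is lost
# X is now 105, not the correct net-zero result of 100
```

Had T2 read X after T1's write, it would have computed 95 + 5 = 100 and the net effect would be zero, as intended.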
Transactions and Recovery: Uncommitted Update
• T2 sees the change to X made by T1, but T1 is rolled back
– The change made by T1 is undone on rollback
– It should be as if that change never happened
• Interleaved schedule:
– T1: Read(X); X = X - 5; Write(X)
– T2: Read(X); X = X + 5; Write(X)
– T1: ROLLBACK
– T2: COMMIT
Transactions and Recovery: Inconsistent Analysis
• T1 doesn't change the sum of X and Y, but T2 sees a change
– T1 consists of two parts: take 5 from X and then add 5 to Y
– T2 sees the effect of the first, but not the second
• Interleaved schedule:
– T1: Read(X); X = X - 5; Write(X)
– T2: Read(X); Read(Y); Sum = X + Y
– T1: Read(Y); Y = Y + 5; Write(Y)
Concurrency Control with Locking Methods
• Lock
– Guarantees exclusive use of a data item to a current transaction
• T2 does not have access to a data item that is currently being used by T1
• A transaction acquires a lock prior to data access; the lock is released when the transaction is complete
– Required to prevent another transaction from reading inconsistent data
• Lock manager
– Responsible for assigning and policing the locks used by the transactions
Lock Granularity
• Indicates the level of lock use
• Locking can take place at the following levels:
– Database-level lock
• The entire database is locked
– Table-level lock
• The entire table is locked
– Page-level lock
• An entire disk page is locked
– Row-level lock
• Allows concurrent transactions to access different rows of the same table, even if the rows are located on the same page
– Field-level lock
• Allows concurrent transactions to access the same row, as long as they require the use of different fields (attributes) within that row
A Database-Level Locking Sequence
• Good for batch processing but unsuitable for online multi-user DBMSs
• T1 and T2 cannot access the same database concurrently, even if they use different tables
Table-Level Lock
• T1 and T2 can access the same database concurrently as long as they use different tables
• Can cause bottlenecks when many transactions are trying to access the same table (even if the transactions want to access different parts of the table and would not interfere with each other)
• Not suitable for multi-user DBMSs
Page-Level Lock
• An entire disk page is locked (a table can span several pages, and each page can contain several rows of one or more tables)
• Most frequently used multi-user DBMS locking method
Row-Level Lock
• Concurrent transactions can access different rows of the same table, even if the rows are located on the same page
• Improves data availability but with high overhead (each row has a lock that must be read and written to)
Field-Level Lock
• Allows concurrent transactions to access the same row as long as they require the use of different fields within that row
• Most flexible lock but requires an extremely high level of overhead
Binary Locks
• Has only two states: locked (1) or unlocked (0)
• Eliminates the "lost update" problem: the lock is not released until the write statement is completed
• Cannot use PROD_QOH until it has been properly updated
• Considered too restrictive to yield optimal concurrency conditions, as it locks even for two READs when no update is being done
Shared/Exclusive Locks
• Exclusive lock
– Access is specifically reserved for the transaction that locked the object
– Must be used when the potential for conflict exists: when a transaction wants to update a data item and no locks are currently held on that data item by another transaction
– Granted if and only if no other locks are held on the data item
• Shared lock
– Concurrent transactions are granted Read access on the basis of a common lock
– Issued when a transaction wants to read data and no exclusive lock is held on that data item
• Multiple transactions can each have a shared lock on the same data item if they are all just reading it
• Mutual exclusive rule
– Only one transaction at a time can own an exclusive lock on the same object
Shared/Exclusive Locks
• Increased overhead
– The type of lock held must be known before a lock can be granted
– Three lock operations exist:
• READ_LOCK to check the type of lock
• WRITE_LOCK to issue the lock
• UNLOCK to release the lock
– A lock can be upgraded from shared to exclusive and downgraded from exclusive to shared
• Two possible major problems may occur
– The resulting transaction schedule may not be serializable
– The schedule may create deadlocks
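The compatibility rules for shared and exclusive locks can be condensed into one decision function. This is a minimal sketch with an assumed API, not a real DBMS lock manager: a shared lock is granted only when no exclusive lock is held, and an exclusive lock only when the item is completely free.

```python
def can_grant(requested, held):
    """requested: 'S' or 'X'; held: set of lock modes currently
    on the data item. Returns True if the request can be granted."""
    if requested == "S":
        return "X" not in held    # readers coexist with readers
    if requested == "X":
        return len(held) == 0     # a writer needs the item free
    raise ValueError(f"unknown lock mode: {requested}")

assert can_grant("S", {"S"})      # many transactions may read
assert not can_grant("S", {"X"})  # a reader waits for a writer
assert not can_grant("X", {"S"})  # a writer waits for readers
assert can_grant("X", set())      # exclusive lock on a free item
```

Lock upgrade (shared to exclusive) then amounts to requesting "X" while being the only holder of an "S" lock on the item.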
Two-Phase Locking to Ensure Serializability
• Defines how transactions acquire and relinquish locks
• Guarantees serializability, but does not prevent deadlocks
– Growing phase: a transaction acquires all the required locks without unlocking any data
– Shrinking phase: a transaction releases all locks and cannot obtain any new lock
Two-Phase Locking to Ensure Serializability
• Governed by the following rules:
– Two transactions cannot have conflicting locks
– No unlock operation can precede a lock operation in the same transaction
– No data are affected until all locks are obtained, that is, until the transaction is at its locked point
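The core two-phase rule (no lock after the first unlock) can be enforced with a single flag. This is an illustrative sketch, not a real DBMS implementation; the class and method names are invented for the example.

```python
class TwoPhaseTxn:
    """Tracks a transaction's locks and rejects any lock request
    made after the transaction has entered its shrinking phase."""
    def __init__(self):
        self.locks = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            # Rule: no unlock operation can precede a lock operation
            raise RuntimeError("2PL violation: lock after unlock")
        self.locks.add(item)      # growing phase

    def unlock(self, item):
        self.shrinking = True     # first unlock ends the growing phase
        self.locks.discard(item)

t = TwoPhaseTxn()
t.lock("X")
t.lock("Y")       # growing phase: acquire everything needed
t.unlock("X")     # shrinking phase begins at the locked point
# t.lock("Z") would now raise RuntimeError
```

A strict variant (covered later in the deck) simply holds every lock until commit, so the shrinking phase collapses into a single release at the end.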
Deadlocks
• A condition that occurs when two transactions wait for each other to unlock data
– T1 needs data items X and Y; T2 needs Y and X
– Each gets its first piece of data but then waits to get the second (which the other transaction is already holding): a "deadly embrace"
• Possible only if one of the transactions wants to obtain an exclusive lock on a data item
– No deadlock condition can exist among shared locks
• Controlled through
– Prevention
– Detection
– Avoidance
How a Deadlock Condition Is Created
Concurrency Control with Time Stamping Methods
• Assigns a globally unique time stamp to each transaction
– All database operations within the same transaction must have the same time stamp
• Produces an explicit order in which transactions are submitted to the DBMS
• Uniqueness
– Ensures that no equal time stamp values can exist
• Monotonicity
– Ensures that time stamp values always increase
• Disadvantage
– Each value stored in the database requires two additional time stamp fields: last read, last update
Wait/Die and Wound/Wait Schemes
• Wait/die
– The older transaction waits, and the younger is rolled back and rescheduled
• Wound/wait
– The older transaction rolls back the younger transaction and reschedules it
• In the situation where a transaction requests multiple locks, each lock request has an associated time-out value; if the lock is not granted before the time-out expires, the transaction is rolled back
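The two schemes differ only in what happens when the requester is older than the lock holder. A compact sketch (assuming smaller time stamp = older transaction; the function names are illustrative):

```python
def wait_die(req_ts, holder_ts):
    # Wait/die: an older requester waits; a younger requester
    # "dies" (is rolled back and rescheduled with its old stamp).
    return "wait" if req_ts < holder_ts else "die"

def wound_wait(req_ts, holder_ts):
    # Wound/wait: an older requester "wounds" (rolls back) the
    # younger holder; a younger requester simply waits.
    return "wound" if req_ts < holder_ts else "wait"

assert wait_die(1, 2) == "wait"      # older T(1) waits for T(2)
assert wait_die(2, 1) == "die"       # younger T(2) is rolled back
assert wound_wait(1, 2) == "wound"   # older T(1) preempts T(2)
assert wound_wait(2, 1) == "wait"    # younger T(2) waits
```

In both schemes, only the younger transaction is ever rolled back, so the oldest transaction always eventually runs; that is what prevents deadlock.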
Concurrency Control with Optimistic Methods
• Optimistic approach
– Based on the assumption that the majority of database operations do not conflict
– Does not require locking or time stamping techniques
– A transaction is executed without restrictions until it is committed
– Acceptable for mostly-read or query database systems that require very few update transactions
– Phases are read, validation, and write
Concurrency Control with Optimistic Methods
• Phases are read, validation, and write
– Read phase: the transaction reads the database, executes the needed computations, and makes the updates to a private copy of the database values
• All update operations of the transaction are recorded in a temporary update file, which is not accessed by the remaining transactions
– Validation phase: the transaction is validated to ensure that the changes made will not affect the integrity and consistency of the database
• If the validation test is positive, the transaction goes to the writing phase; if negative, the transaction is restarted and the changes are discarded
– Writing phase: the changes are permanently applied to the database
Schedules
• A schedule is a list of actions from a set of transactions
– A well-formed schedule is one where the actions of a particular transaction T are in the same order as they appear in T
• For example
– [RT1(a), WT1(a), RT2(b), WT2(b), RT1(c), WT1(c)] is a well-formed schedule
– [RT1(c), WT1(c), RT2(b), WT2(b), RT1(a), WT1(a)] is not a well-formed schedule
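Well-formedness is easy to check mechanically: walk the schedule and verify each transaction's actions appear in the order that transaction defines. A sketch (illustrative representation; actions are strings, and T1 = R(a), W(a), R(c), W(c) as in the example above):

```python
def is_well_formed(schedule, txns):
    """schedule: list of (txn_id, action) pairs.
    txns: dict mapping txn_id to that transaction's action list.
    True iff every transaction's actions occur in its own order."""
    pos = {t: 0 for t in txns}            # next expected action index
    for txn, action in schedule:
        if pos[txn] >= len(txns[txn]) or txns[txn][pos[txn]] != action:
            return False                  # out of order (or extra)
        pos[txn] += 1
    return all(pos[t] == len(txns[t]) for t in txns)

T1 = ["R(a)", "W(a)", "R(c)", "W(c)"]
T2 = ["R(b)", "W(b)"]
ok  = [("T1", "R(a)"), ("T1", "W(a)"), ("T2", "R(b)"),
       ("T2", "W(b)"), ("T1", "R(c)"), ("T1", "W(c)")]
bad = [("T1", "R(c)"), ("T1", "W(c)"), ("T2", "R(b)"),
       ("T2", "W(b)"), ("T1", "R(a)"), ("T1", "W(a)")]
# ok is well-formed; bad swaps T1's (a) and (c) actions, so it is not
```

Note the check says nothing about correctness of the interleaving itself; a well-formed schedule can still exhibit the lost-update and dirty-read anomalies above.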
Schedules (cont.)
• A complete schedule is one that contains an abort or commit action for every transaction that occurs in the schedule
• A serial schedule is one where the actions of different transactions are not interleaved
Serialisability
• A serialisable schedule is a schedule whose effect on any consistent database instance is identical to that of some complete serial schedule
• NOTE:
– All different results are assumed to be acceptable
– It's more complicated when we have transactions that abort
– We'll assume that all 'side effects' of a transaction are written to the database
Anomalies with Interleaved Execution
• Two actions on the same data object conflict if at least one of them is a write
• We'll now consider three ways in which a schedule involving two consistency-preserving transactions can leave a consistent database inconsistent
WR Conflicts
• Transaction T2 reads a database object that has been modified by T1, which has not committed
– T1: R(a), W(a), R(b), W(b), C (debit €100 from a, credit €100 to b)
– T2: R(a), W(a), R(b), W(b), C (read a and b and add 6% interest)
– If T2 runs between T1's two updates, it reads the debited a but the un-credited b: a "dirty read"
RW Conflicts
• Transaction T2 could change the value of an object that has been read by a transaction T1, while T1 is still in progress
– T1: R(a), R(a), W(a), C with T2: R(a), W(a), C interleaved between T1's reads
– Example: a is 5; each transaction reads a as 5; one writes 5 + 1 = 6, the other writes 5 - 1 = 4, and a ends up as 4
– T1's second read can see a different value from its first: an "unrepeatable read"
WW Conflicts
• Transaction T2 could overwrite the value of an object which has already been modified by T1, while T1 is still in progress
– T1: [W(Britney), W(gmb)] "set both salaries at £1m"
– T2: [W(gmb), W(Britney)] "set both salaries at $1m"
• But interleaved as T1: W(Britney); T2: W(gmb); T1: W(gmb); T2: W(Britney)
– gmb gets £1m and Britney gets $1m: an inconsistent result from "blind writes"
Serialisability and Aborts
• Things are more complicated when transactions can abort
– T1: R(a), W(a), Abort (deduct €100 from a)
– T2: R(a), W(a), R(b), W(b), C (add 6% interest to a and b)
– If T2 read a after T1's write, we can't undo T2: it's already committed
Strict Two-Phase Locking
• The DBMS enforces the following locking protocol:
– Each transaction must obtain an S (shared) lock before reading, and an X (exclusive) lock before writing
– All locks held by a transaction are released when the transaction completes
– If a transaction holds an X lock on an object, no other transaction can get a lock (S or X) on that object
• Strict 2PL allows only serialisable schedules
More Refined Locks
• Some updates that seem at first sight to require a write (X) lock can be given something weaker
– Example: consider a seat-count object in a flights database
– Two transactions that wish to book a flight each get an X lock on the seat count
– Does it matter in what order they decrement the count?
• They are commutative actions!
• Do they need a write lock?
Aborting
• If a transaction Ti is aborted, then all its actions must be undone
– Also, if Tj reads an object last written by Ti, then Tj must be aborted!
• Most systems avoid cascading aborts by releasing locks only at commit time (strict protocols)
– If Ti writes an object, then Tj can only read it after Ti finishes
• In order to undo changes, the DBMS maintains a log which records every write
The Log
• The following facts are recorded in the log
– "Ti writes an object": store the new and old values
– "Ti commits/aborts": store just a record
• Log records are chained together by transaction id, so it's easy to undo a specific transaction
• The log is often duplexed and archived on stable storage (it's important!)
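Because each write record stores both the old and the new value, undoing an aborted transaction is a backwards walk over the log, restoring old values. A minimal sketch (the tuple layout of a log record is an illustrative assumption):

```python
# Log records: (txn, op, object, old_value, new_value)
log = [("T1", "write", "a", 100, 95),
       ("T2", "write", "b", 50, 60),
       ("T1", "write", "b", 60, 55)]
db = {"a": 95, "b": 55}   # state after all three writes

def undo(txn, log, db):
    """Walk the log backwards, restoring the old value for each
    write by the aborted transaction; other writes are untouched."""
    for t, op, obj, old, new in reversed(log):
        if t == txn and op == "write":
            db[obj] = old
    return db

undo("T1", log, db)
# db == {"a": 100, "b": 60}: T1's writes reversed, T2's kept
```

The backwards order matters when a transaction wrote the same object twice: the earliest old value must win, which reversing the log guarantees.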
Connection to Normalization
• The more redundancy in a database, the more locking is required for (update) transactions
– Extreme case: so much redundancy that all update transactions are forced to execute serially
• In general, less redundancy allows for greater concurrency and greater transaction throughput
!!! This is what normalization is all about !!!
The Fundamental Tradeoff of Database Performance Tuning
• De-normalized data can often result in faster query response
• Normalized data leads to better transaction throughput
• What is more important in your database: query response or transaction throughput? The answer will vary. What do the extreme ends of the spectrum look like?
• Yes, indexing data can speed up queries, but this just proves the point: an index IS redundant data
• General rule of thumb: indexing will slow down transactions!
Summary
You should now understand:
• Transactions and the ACID properties
• Schedules and serialisable schedules
• Potential anomalies with interleaving
• Strict 2-phase locking
• Problems with transactions that can abort
• Logs