Data Access Pattern Classification
Scheme for avoiding Lost Updates in
Fritz Laux1, Martti Laiho2
Dpt. of Informatics, Reutlingen University, Alteburgstraße 150, 72762 Reutlingen,
Germany, Tel: +49 7121 271 4019, Fax: +49 7121 271 90 4019,
E-mail: Friedrich.Laux @ Reutlingen-University.DE
Dpt. of Business Information Technology, Haaga-Helia University of Applied
Sciences, Ratapihantie 13, 00520 Helsinki, Finland
Tel: +358 9 2296 5228, E-mail: martti.laiho @ haaga-helia.fi
Modern DBMS systems eliminate the classical lost update problem by protecting the updates of an SQL transaction up to the end of the transaction's scope. However, lost updates can still occur in the scope of a user transaction if it does not map one-to-one to an SQL transaction. In spite of the server-side concurrency control that the DBMS applies to single SQL transactions, the system cannot stop an SQL transaction from writing over committed updates written by other transactions. If the data written is based on a stale (outdated) copy of the data, the SQL transaction becomes guilty of the lost update problem. This can be the case, for example, when a user transaction contains a series of SQL transactions.
The usual way to avoid blocking of resources and to prevent lost updates during a user transaction is to divide its data access phases into a series of short SQL transactions (free of any user intervention) and to use so-called "optimistic locking" at the client side, verifying that updating transactions do not write over the updates committed by concurrent transactions of others.
In this paper we identify possible data update patterns for proper application of the row version verification (RVV) discipline. This classification provides us with two update patterns that guarantee the correct use of RVV for avoiding the lost update problem. Efficient ways to implement row version indicators based on server-side version stamping, used for version verification, are discussed and compared with the implementations in modern database systems. As examples we show implementations of these patterns using the mainstream database systems Oracle, DB2, and SQL Server.
Keywords: data access, concurrency control, lost update, row version verification.
1. Lost Update Problem in Transaction Scope
A typical fault in multi-user file-based systems without proper concurrency control is the Lost Update Problem, i.e. a record x updated by some process A is overwritten by some other concurrent process B, as in the following simplified schedule:
rA(x), rB(x), wA(x), wB(x)
Properly administrated databases provide reliable storage services for the data of information systems without losing any data. However, it is the responsibility of application developers to use these reliable services, such as transactions and concurrency control, properly. We assume that the reader is familiar with the concepts of SQL transactions, with the ISO SQL standard isolation levels for tuning the concurrency control of transactions, and with the principles of database locks. Instead of the concurrency control theories presented in database textbooks, we are interested in the implementations in today's mainstream DBMS systems and in what application developers need to understand about reliable access to databases. Table 1 summarises the SQL isolation levels and the concurrency control implementations in modern mainstream DBMS systems (6, 6, 6, 6).
The table compares, at cursor scope (Read Only, Optimistic with values, Optimistic with row change timestamp, Cursor Stability) and at transaction scope (Read Uncommitted, Read Committed, Snapshot, Repeatable Read, Serializable), the ISO SQL isolation levels with the locking scheme concurrency control (LSCC) implementations of SQL Server 2005 and DB2 V9.5 and the multi-versioning concurrency control (MVCC) implementations of SQL Server 2005 (with snapshot isolation allowed, providing "Snapshot Read Committed" and "Serializable") and Oracle 11g (with FOR UPDATE X-locking of rows).
Table 1: Summary of Isolation Levels and Concurrency Control Implementations
Updates made in a transaction are protected up to the end of the transaction against overwriting by other concurrent transactions, using either locking scheme concurrency control (LSCC) or multi-versioning concurrency control (MVCC), which are the modern implementations of so-called "optimistic concurrency control systems". An MVCC system never blocks readers, but at the price that readers may get stale data.
Within the scope of a single SQL transaction the schedule presented above is not possible in these DBMS systems, so we do not have the Lost Update Problem for our own updates, but we also need to consider series of SQL transactions in their application context, i.e. in the scope of user transactions.
2. Lost Update Problem in Application Context
Let us first consider the following problematic scenario of SQL transactions of two concurrent processes A and B updating the balance of the same account in Fig. 1.
Figure 1: A Lost Update scenario by a SELECT - UPDATE transaction (A)
The withdrawal of 200 € made by the transaction of B will be overwritten by A; in other words, the update made by B in step 5 is lost in step 7, when the transaction of A overwrites the updated value with the value 900 €, which is based on stale data, i.e. the outdated value of the balance from step 3. If the transactions of A and B serialized properly, the correct balance after these transactions would be 700 €, but there is nothing the DBMS can do to protect the update of step 5, since the party guilty of this lost update problem is the programmer of process A, who has ordered a wrong isolation level from the DBMS. READ COMMITTED, which for performance reasons is the default transaction isolation level of most RDBMS systems, does not protect data read by a transaction from becoming outdated right after the value has been read. A proper isolation level on LSCC systems would be REPEATABLE READ or SERIALIZABLE, which protects the values read in the transaction from becoming outdated during the transaction by holding shared locks on these rows up to the end of the transaction. The isolation service of the DBMS guarantees that the transaction will either get the ordered isolation or, in case of a serialization conflict, be rejected by the DBMS. The means of this service and the transactional outcome for the very same application code can differ between DBMS systems, and even between different table structures. Usually a transaction rejected due to a serialization conflict should be retried by the application, but we will discuss this later.
The erroneous scenario above would be the same if process A commits its transaction of steps 1 and 3 (let us call it transaction A1) in step 4 and continues (for example after some user interaction) with another transaction A2 for steps 7-8. In this case no isolation level can help: transaction A2 will make a blind write (based on stale data, insensitive to the current value) over the balance value updated by transaction B.
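The scenario of Fig. 1 can be condensed into a runnable sketch. Here SQLite stands in for the DBMS, and the table and column names (Accounts, acctId, balance) are illustrative, not taken from the figure:

```python
import sqlite3

# Single-process simulation of the Fig. 1 schedule.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Accounts (acctId INTEGER PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO Accounts VALUES (100, 1000)")
db.commit()

# Step 3: transaction A reads the balance (1000).
(balance_a,) = db.execute(
    "SELECT balance FROM Accounts WHERE acctId = 100").fetchone()

# Step 5: transaction B withdraws 200 and commits (balance is now 800).
db.execute("UPDATE Accounts SET balance = balance - 200 WHERE acctId = 100")
db.commit()

# Step 7: transaction A blindly writes 1000 - 100 = 900 based on its stale copy.
db.execute("UPDATE Accounts SET balance = ? WHERE acctId = 100",
           (balance_a - 100,))
db.commit()

(final,) = db.execute("SELECT balance FROM Accounts WHERE acctId = 100").fetchone()
print(final)  # 900 -- B's withdrawal is lost; the serialized result would be 700
```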
3. A Taxonomy of Updates Avoiding Lost Updates
The blind write of the update transaction A2 in steps 7-8 (resulting in the lost update of transaction B) could have been avoided by any of the following types of updates:
Type 0: There is no risk of a lost update if A2 in step 7 had used the form of the update which is sensitive to the current value, as B uses in step 5:
UPDATE Accounts  -- assumed table name
SET balance = balance - 100
WHERE acctId = :id;
Type 1: After transaction A1 has first read the original row version data in step 3, transaction A2 verifies in step 7, with an additional comparison expression in the WHERE clause of the UPDATE command, that the current row version in the database is still the same as it was when the process previously accessed the account row, for example
UPDATE Accounts  -- assumed table name
SET balance = :newBalance
WHERE acctId = :id AND
(rowVersion = :old_rowVersion);
The comparison expression can be a single comparison predicate, as in the example above, where rowVersion is a column (or a pseudo-column provided by the DBMS) reflecting any changes made in the contents of the row, and :old_rowVersion is a host variable containing the value of the column when the process previously read the contents of the row. In the case that more than one column is involved, the comparison expression can be built of version comparisons of all columns used, based on the 3-valued logic of SQL.
Since a Type 1 update transaction does not explicitly read data, there is no need to set the isolation level. The result of the concurrency control services is the same for LSCC and MVCC based DBMSs. The result of the update depends on the RVV predicate, and the application code needs to read the indicator of updated rows from the DBMS to verify the result.
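A minimal sketch of a Type 1 update, using SQLite and an illustrative schema with a technical rv column (here stamped client-side in the SET clause): the application checks the affected-row count to detect stale data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Accounts "
           "(acctId INTEGER PRIMARY KEY, balance INTEGER, rv INTEGER)")
db.execute("INSERT INTO Accounts VALUES (100, 1000, 1)")
db.commit()

def type1_update(conn, acct_id, new_balance, old_rv):
    """Write new_balance only if the row version is still old_rv."""
    cur = conn.execute(
        "UPDATE Accounts SET balance = ?, rv = rv + 1 "
        "WHERE acctId = ? AND rv = ?",
        (new_balance, acct_id, old_rv))
    conn.commit()
    return cur.rowcount == 1  # 1 row -> success, 0 rows -> stale data

# Both A and B read the row when rv was 1; B updates first (rv becomes 2).
ok_b = type1_update(db, 100, 800, 1)
# A's update, based on the now stale version 1, is rejected instead of lost.
ok_a = type1_update(db, 100, 900, 1)
print(ok_b, ok_a)  # True False
```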
Type 2: (re-SELECT .. UPDATE) is a variant of Type 1 in which transaction A2 first reads the current row version data from the database into some host variable
SELECT rowVersion INTO :current_rowVersion
FROM Accounts  -- assumed table name
WHERE acctId = :id;
and then applies the conditional update
if (current_rowVersion = old_rowVersion) then
UPDATE Accounts
SET balance = :newBalance
WHERE acctId = :id;
In this case it is necessary to make sure that no other transaction has changed the row between the SELECT and the UPDATE. For this purpose we need to apply a strong enough isolation level (REPEATABLE READ, SNAPSHOT, or SERIALIZABLE) or explicit row-level locking, such as Oracle's FOR UPDATE clause in the SELECT command.
Since the isolation level implementations of LSCC and MVCC based DBMSs differ, the result of the concurrency services can differ: in LSCC based systems the first writer of the row, or the first reader using the REPEATABLE READ or SERIALIZABLE isolation level, will usually win, whereas in MVCC based systems the first writer wins the concurrency competition.
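A Type 2 transaction can be sketched as follows. Since SQLite offers neither REPEATABLE READ nor FOR UPDATE, BEGIN IMMEDIATE serves here as a stand-in for the required locking, and the schema is illustrative:

```python
import sqlite3

db = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
db.execute("CREATE TABLE Accounts "
           "(acctId INTEGER PRIMARY KEY, balance INTEGER, rv INTEGER)")
db.execute("INSERT INTO Accounts VALUES (100, 1000, 1)")

def type2_update(conn, acct_id, new_balance, old_rv):
    """Re-SELECT the row version, then UPDATE only if it is unchanged."""
    conn.execute("BEGIN IMMEDIATE")          # lock out concurrent writers
    (current_rv,) = conn.execute(
        "SELECT rv FROM Accounts WHERE acctId = ?", (acct_id,)).fetchone()
    if current_rv != old_rv:                 # someone changed the row meanwhile
        conn.execute("ROLLBACK")
        return False
    conn.execute("UPDATE Accounts SET balance = ?, rv = rv + 1 "
                 "WHERE acctId = ?", (new_balance, acct_id))
    conn.execute("COMMIT")
    return True

first = type2_update(db, 100, 900, 1)
second = type2_update(db, 100, 800, 1)   # stale: rv is already 2
print(first, second)  # True False
```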
4. RVV Discipline and Server-Side Stamping Solutions
Type 1 and Type 2 updates do not require any locking before transaction A2, and the update method is generally known as "optimistic locking" 6, but we prefer to call it the Row Version Verification (RVV) discipline. There are multiple options for row version verification, including comparison of the original contents of all or some relevant subset of the columns of the row, a checksum of these, a technical SQL column, or some technical pseudo-column maintained by the DBMS.
A general solution for row version management is to include a technical row version column and to use a row-level trigger that automatically increases the value of the column rv every time the row is updated. We call the use of a trigger or of a technical pseudo-column "server-side stamping", which no application can bypass, as opposed to client-side stamping using the SET clause within the UPDATE command, a discipline that all applications would have to follow in that case. Row-level triggers are affordable, but have a performance cost of some percent in execution time on Oracle and DB2, whereas SQL Server does not even support row-level triggers.
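Server-side stamping with a row-level trigger can be sketched in SQLite syntax (Oracle or DB2 would instead use a BEFORE UPDATE row trigger assigning the new value of rv); the schema names are illustrative:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Accounts "
           "(acctId INTEGER PRIMARY KEY, balance INTEGER, rv INTEGER DEFAULT 1)")
# Row-level trigger: every update of balance bumps rv, and no
# application can bypass it.
db.execute("""
    CREATE TRIGGER stamp_rv AFTER UPDATE OF balance ON Accounts
    FOR EACH ROW
    BEGIN
        UPDATE Accounts SET rv = OLD.rv + 1 WHERE acctId = NEW.acctId;
    END""")
db.execute("INSERT INTO Accounts (acctId, balance) VALUES (100, 1000)")
db.commit()

db.execute("UPDATE Accounts SET balance = 800 WHERE acctId = 100")
db.commit()
print(db.execute("SELECT rv FROM Accounts WHERE acctId = 100").fetchone()[0])  # 2
```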
Timestamps are typically mentioned in database literature as a means of differentiating the update versions of a row, but our tests 6 prove that, for example, on a 32-bit Windows workstation with a single processor, Oracle 11g can generate up to 115 updates having the very same timestamp, and almost the same problem applies to the DATETIME of SQL Server 2005 and the TIMESTAMP of DB2 LUW 9, with the exception of the new ROW CHANGE TIMESTAMP option in DB2 9.5, which generates unique timestamp values for every update of the same row having such a technical TIMESTAMP column.
The native TIMESTAMP data type of SQL Server is not a timestamp but a technical column which can be used to monitor the order of all row updates inside a database; we prefer to use its synonym name ROWVERSION. This provides the most effective server-side stamping method in SQL Server, although as a side effect it generates an extra U-lock, which would result in a deadlock in the example of Fig. 1.
In version 10 and later, Oracle provides a new pseudo-column ORA_ROWSCN for the rows of every table created with the ROWDEPENDENCIES option 6. It shows the SCN number of the last committed transaction which updated the row. This provides the most effective server-side stamping method for RVV in Oracle databases, although as a harmful side effect row locking turns its value to NULL.
In our "RVV Paper" 6 we have presented SQL view solutions for mapping the contents of these technical row version columns into the BIGINT data type for row version verification (RVV) at the client side.
5. Data Accesses of a Typical Use Case
A typical multi-tier architecture today makes use of the Model-View-Controller (MVC) pattern, where the Model part (M) is responsible for accessing the database (data access). The roles of the View and Controller tiers are not in the scope of our paper; we focus on the topics of the Data Access tier. Reliable data access requires SQL transactions, and various transaction patterns can be used depending on the phases of the use cases.
Fig. 2 presents a generic use case (user transaction), for example maintenance of product inventory as part of an order entry system, or maintenance of customer data.
Figure 2: Use case, MVC implementation and the Data Access Patterns
The numbers in Fig. 2 identify the phases in the use case scenario: the user first picks the proper object from a search list (phases 1 and 3), the object data is then presented to the user on a form (phase 5), and after updating the data on the form the user presses some "save" button and the changed data is updated in the database. We will now focus on the implementation of this scenario as SQL transactions on the Model tier:
Phase 2 is implemented as a READ ONLY transaction which reads some relevant attributes of objects using selection criteria set by the user in phase 1, and returns the result set in phase 3 to the View tier for the final selection of the proper object. In the case of MVCC systems any default isolation level will do here, since MVCC systems don't block readers. In the case of LSCC the Read Uncommitted isolation level is enough, since we don't want to block concurrent transactions and it is enough to know that the listed objects exist. If the characteristics of the selected object have changed by phase 5, the user can return to make a new selection.
Phase 4 is implemented as a READ ONLY transaction using a "singleton SELECT" fetching all relevant attributes of the selected object to be presented to the user in phase 5 on the View tier. Obviously this transaction is not allowed to read uncommitted data. For minimal blocking of concurrent transactions we do not keep the object locked in the database; instead, phase 4 starts the client-side scope of concurrency as seen by the application, saving the original data (or whatever will be needed, for example just the id and the row version data) in a cache of the Model tier for the row version verification in phase 6.
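Phase 4's caching of the version data can be sketched as follows, with an illustrative Products schema and a plain dictionary standing in for the Model-tier cache:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Products "
           "(prodId INTEGER PRIMARY KEY, name TEXT, rv INTEGER)")
db.execute("INSERT INTO Products VALUES (1, 'widget', 7)")
db.commit()

model_cache = {}   # Model-tier cache: prodId -> row version seen by the user

def fetch_for_edit(conn, prod_id):
    """Phase 4: read the object without keeping any lock on it."""
    row = conn.execute(
        "SELECT prodId, name, rv FROM Products WHERE prodId = ?",
        (prod_id,)).fetchone()
    conn.commit()                      # end the READ ONLY transaction
    model_cache[prod_id] = row[2]      # remember only the row version
    return row

row = fetch_for_edit(db, 1)
print(model_cache)  # {1: 7}
```

In phase 6 the cached version would be passed as :old_rowVersion to a Type 1 or Type 2 update.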
Phase 5 in our scenario stands for the user interface phases on the View tier, which may take an unpredictable time to complete. It may also require additional READ ONLY lookup database transactions in the Model tier, but to keep the picture simple we have skipped presenting them.
Phase 6 is a typical case of an updating OLTP transaction. It gets the updated data from the phase 5 View tier. Since some concurrent transactions may have updated the data of the object in the database during the user's "thinking time" (and potential coffee break), the current data of the object has to be compared with the original data in the cache so as not to lose those updates by blind overwriting. Either a Type 1 or a Type 2 update will do. In the case of a Type 1 update, using an UPDATE statement with an instant row version verifying (RVV) predicate, the isolation level has no meaning and the results of the services provided by LSCC and MVCC systems are the same. In this case we also need to check whether the UPDATE statement really affected the row.
For Type 2 (SELECT .. UPDATE) we need to set at least the isolation level REPEATABLE READ, which in the case of LSCC systems guarantees that the row version read by the SELECT command is protected by an S-lock, so that unless we happen to become the victim of an accidental deadlock we will manage to do the UPDATE part. Here LSCC systems provide the better service, since in the case of MVCC systems we need to set the isolation level to SERIALIZABLE and we will lose the competition to any concurrent update.
If the row version verification fails, i.e. the object has been changed in the meantime, then the user shall be notified (in phase 8, which is not shown in the figure) of the outdated data, and control shall return to phase 3 (or in some cases perhaps to phase 1) so that the user can refresh the object data from the database for a new update attempt.
Whenever the update transaction requires the use of multiple SQL commands, which is the case in a Type 2 update (SELECT .. UPDATE), or if the data of the object to be updated is actually stored in multiple tables, then it is possible that the transaction will fail due to a concurrency conflict, for example a deadlock. Usually concurrency conflicts can be solved by applying the Retryer transaction pattern 6, but as we need to avoid lost updates of concurrent transactions this may not always be the proper solution.
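The Retryer pattern can be sketched as a small wrapper that retries only on concurrency errors and gives up after a few attempts; the error type, the backoff, and the stub transaction are illustrative. An RVV failure (an ordinary return value here, not an exception) is deliberately not retried, since redoing a stale update would just re-create the lost update risk.

```python
import random
import sqlite3
import time

def run_with_retry(txn, max_attempts=3):
    """Retry txn on a concurrency error; re-raise after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except sqlite3.OperationalError:      # e.g. "database is locked"
            if attempt == max_attempts:
                raise
            time.sleep(0.01 * attempt * random.random())  # small backoff

# Stub transaction that fails twice with a lock error, then commits.
calls = {"n": 0}
def flaky_txn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise sqlite3.OperationalError("database is locked")
    return "committed"

result = run_with_retry(flaky_txn)
print(result)  # committed
```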
At the end of a successful phase 6 we should refresh the object version data in the Model cache in case the user continues processing the object data.
A committed transaction cannot be rolled back, but database textbooks discuss possible compensating transactions 6, which by reverse update statements restore the object data to the original state of phase 4 (for which we would need a copy of the original data). This is presented as step 7 in Fig. 2. It is not guaranteed to succeed, since concurrent transactions may already have changed the situation, and then what about the "lost update problem" of those concurrent updates? Sometimes compensation may be possible apart from the database transactions, based on pure business rules: for example, we may cancel a hotel or ticket reservation based on our own reservation number, a kind of business-level locking.
The concurrency control of a DBMS treats SQL transactions without their application context, and this is the typical scope of database textbooks in teaching transaction programming. We see the need to expand this scope to the application level, to the typical user transactions which form the context of SQL transactions. Even if the widely accepted application design patterns of the GoF 6 do not even mention database transactions, we can identify and build practical data access patterns to be applied in teaching data access technologies and in application development using modern DBMS systems, for the benefit of database professionals and the industry.
Modern application architectures have introduced new practices and needs which have outdated some practices of earlier SQL programming, such as holdable cursors in case we use connection pooling. So, for example, we cannot apply the optimistic locking services of cursors across the transaction series inside our user transaction. New programming paradigms of the Java world and .NET also provide new possibilities and challenges. We will discuss these in our presentation "Data Access using RVV Discipline and Persistence Middleware" at eRA-3.
 J. Gray, A. Reuter, "Transaction Processing: Concepts and Techniques", Morgan Kaufmann, 1993
 G. Weikum, G. Vossen, "Transactional Information Systems", Morgan Kaufmann Publishers, 2002
 E. Gamma et al, "Design Patterns, Elements of Reusable Object-Oriented Software", Addison-Wesley, 1995
 C. Nock, "Data Access Patterns", Addison-Wesley, 2004
 M. Laiho, F. Laux, "On Row Version Verifying (RVV) Data Access Discipline for avoiding Lost Updates"
 Oracle, "SQL Language Reference 11g Release 1 (11.1)", B28286-01, July 2007
 Microsoft, "SQL Server 2005 Books Online", http://msdn.microsoft.com/en-gb/library/ms130214.aspx
 IBM, "DB2 Version 9.5 for Linux, UNIX, and Windows, Windows, SQL Reference, Volume 1", March