Logging Last Resource Optimization for Distributed Transactions in Oracle WebLogic Server

Tom Barnes, Adam Messinger, Paul Parkinson, Amit Ganesh, German Shegalov, Saraswathy Narayan, Srinivas Kareenhalli

Oracle Corporation
500 Oracle Parkway
Redwood Shores, CA 94065, USA
{firstname.lastname}@oracle.com

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
EDBT 2010, March 22-26, 2010, Lausanne, Switzerland.
Copyright 2010 ACM 978-1-60558-945-9/10/0003 ...$10.00

ABSTRACT
State-of-the-art OLTP systems execute distributed transactions using the XA-2PC protocol, a presumed-abort variant of the Two-Phase Commit (2PC) protocol. While the XA specification provides for the Read-Only and 1PC optimizations of 2PC, it does not deal with another important optimization, coined Nested 2PC. In this paper, we describe the Logging Last Resource (LLR) optimization in Oracle WebLogic Server (WLS). It adapts and improves the Nested 2PC optimization for the Java Enterprise Edition (JEE) environment. It reduces the number of forced (synchronous) writes and the number of exchanged messages when executing distributed transactions that span multiple transactional resources, including a SQL database integrated as a JDBC datasource. This optimization has been validated in SPECjAppServer2004 (a standard industry benchmark for JEE) and a variety of internal benchmarks. LLR has been successfully deployed by high-profile customers in mission-critical, high-performance applications.

1. INTRODUCTION
A transaction (a sequence of operations delimited by commit or rollback calls) is a so-called ACID contract between a client application and a transactional resource such as a database or a messaging system that guarantees: 1) Atomicity: the effects of aborted transactions (hit by a failure prior to commit or explicitly aborted by the user via rollback) are erased. 2) Consistency: transactions violating consistency constraints are automatically rejected/aborted. 3) Isolation: from the application perspective, transactions are executed one at a time even in typical multi-user deployments. 4) Durability (Persistence): state modifications by committed transactions survive subsequent system failures (i.e., they are redone when necessary) [13].

Transactions are usually implemented using a sequential recovery log containing undo and redo information, concluded by a commit record, stored in persistent memory such as a hard disk. A transaction is considered committed by recovery when its commit record is present in the log. When a transaction involves several resources (participants, also denoted as agents in the literature) with separate recovery logs (regardless of whether local or remote), the commit process has to be coordinated in order to prevent inconsistent subtransaction outcomes. A dedicated resource or one of the participants is chosen to coordinate the transaction. To this end, the transaction coordinator (TC) executes a client commit request using a 2PC protocol. Several presumed-nothing, presumed-abort, and presumed-commit variants of 2PC are known in the literature [2, 3, 4, 5, 13]. We briefly outline the Presumed-Abort 2PC (PA2PC) because it has been chosen to implement the XA standard [12] that is predominant in today's OLTP world.

1.1 Presumed-Abort 2PC
Voting (Prepare) Phase: The TC sends a prepare message to every participant. When a participant determines that its subtransaction can be committed, it makes the subtransaction recoverable and replies with a positive vote (ACK). Subtransaction recoverability is achieved by creating a special prepared log record and forcing the transaction log to disk. Since the transaction is not committed yet, no locks are released, so as to ensure the chosen isolation level. When the participant determines that the transaction is not committable for whatever reason, the transaction is aborted; nothing has to be force-logged (presumed abort); a negative vote (NACK) is returned.

Commit Phase: When every participant has returned an ACK, the TC force-logs the committed record for the current transaction and notifies all participants using commit messages. Upon receiving a commit message, participants force-log the commit decision, release the locks (acquired for transaction isolation), and send a commit status to the TC. Upon collecting all commit status messages from the participants, the TC can discard the transaction information (i.e., release log records for garbage collection).

Abort Phase: When at least one participant sends a NACK, the TC sends rollback messages without forcing the log (presumed abort) and does not have to wait for rollback status messages.
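To make the message and logging pattern of these three phases concrete, the following is a minimal coordinator-side sketch in Java. The Participant and TransactionLog types and their methods are hypothetical stand-ins for the TC's messaging and logging machinery; this illustrates PA2PC in general and is not taken from any particular transaction manager.

import java.util.List;

// Illustrative coordinator-side view of Presumed-Abort 2PC. Participant and
// TransactionLog are hypothetical stand-ins, not a real TM API.
interface Participant {
  boolean prepare(String txId);   // force-logs a prepared record; true = ACK, false = NACK
  void commit(String txId);       // force-logs the decision and releases locks
  void rollback(String txId);     // presumed abort: nothing is force-logged
}

interface TransactionLog {
  void forceWrite(String record); // synchronous write to stable storage
}

final class Pa2pcCoordinator {
  private final TransactionLog log;

  Pa2pcCoordinator(TransactionLog log) { this.log = log; }

  /** Returns true iff the transaction commits; false means it was aborted. */
  boolean commit(String txId, List<Participant> participants) {
    // Voting phase: one prepare round trip and one forced write per participant.
    for (Participant p : participants) {
      if (!p.prepare(txId)) {                        // a NACK aborts the transaction
        participants.forEach(q -> q.rollback(txId)); // no forced write at the TC
        return false;
      }
    }
    // Commit phase: force-log the global decision, then notify the participants,
    // each of which force-logs the commit record before acknowledging.
    log.forceWrite("COMMIT " + txId);
    participants.forEach(p -> p.commit(txId));
    return true;
  }
}

Counting the forced writes in this sketch (one prepared record per participant, one commit record at the TC, and one commit record per participant) yields the 2n+1 figure derived next.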
The complexity of a failure-free run of a PA2PC transaction on n participants accumulates to 2n+1 forced writes and 4n messages. Since the commit/rollback status messages are needed only for garbage collection at the coordinator site, they can be sent asynchronously, e.g., as a batch, or they can be piggybacked on the vote messages of the next transaction instance. Thus, the communication cost is reduced to 3n messages and, more importantly, to just one synchronous round trip, as seen above. If one of the participants takes the role of the coordinator, we save one set of messages at the coordinator site, and we save one forced write since we do not need to log the transaction commit twice at the coordinator site. Thus, the overhead of a PA2PC commit can be further reduced to 2n forced writes and 3(n-1) messages [13].

1.2 Related Work
There are several well-known optimizations of the 2PC protocol. The following two optimizations are described in the XA specification and should be implemented by the vendors:

Read-Only Optimization: A participant may vote "read-only", indicating that no updates have been made on this participant on behalf of the current transaction. Thus, this participant should be skipped during the second phase.

One-Phase Commit (1PC): The TC should detect situations in which only one participant is enlisted in a distributed transaction and skip the vote phase.

Nested 2PC: Another interesting optimization that is not part of the XA standard was originally described by Jim Gray in [2]. It is also known as Tree or Recursive 2PC. It assumes that we deal with a fixed linear topology of participants P1, …, Pn. There is no fixed TC. Instead, the role of the TC is propagated left-to-right during the vote phase and right-to-left during the commit phase. Clients call commit_2pc on P1, which subsequently generates a prepare message to itself. Each Pi (indicated by the variable i in the pseudocode) executes the simplified logic outlined in Figure 1 when processing a 2PC message from the source Pfrom. A failure-free execution of an instance of the Nested 2PC Optimization costs 2n forced writes and 2(n-1) messages.

void process(TwoPCMessage msg) {
  switch (msg) {
  case PREPARE:
    if (i == n) {                       // rightmost participant decides
      if (executeCommit() == OK) {
        sendCommitTo(msg.from);
      } else {
        executeAbort();
        sendAbortTo(msg.from);
      }
    } else {                            // propagate prepare to the right
      if (executePrepare() == OK) {
        sendPrepareTo(i+1);
      } else {
        executeAbort();
        sendAbortTo(i-1);
        sendAbortTo(i+1);
      }
    }
    break;
  case ABORT:
    if (i != n) {
      sendAbortTo(msg.from == i-1 ? i+1 : i-1);
    }
    executeAbort();
    break;
  case COMMIT:
    if (i != 1) {                       // propagate commit to the left
      sendCommitTo(i-1);
    }
    executeCommit();
    break;
  } // switch
}

// a) normal operation
//      ->prepare->    ->prepare->    ->prepare->
//    P1             P2             P3             P4
//      <--commit<--   <--commit<--   <--commit<--
//
// b) failure on P3
//      ->prepare->    ->prepare->    -->abort-->
//    P1             P2             P3             P4
//      <--abort<--    <--abort<--

Figure 1: C-style pseudocode of Nested 2PC, and its effect in a commit (a) and an abort (b) case.

1.3 Java Transaction API
The Java Transaction API (JTA) is a Java "translation" of the XA specification; it defines standard Java interfaces between a TC and the parties involved in a distributed transaction system: the transactional resources (e.g., messaging and database systems), the application server, and the transactional client applications [11]. A TC implements the JTA interface javax.transaction.TransactionManager. Conversely, in order to participate in 2PC transactions, participant resources implement the JTA interface javax.transaction.xa.XAResource. In order to support a hierarchical 2PC, a TC also implements javax.transaction.xa.XAResource so that it can be enlisted by third-party TC's. From this point on, we will use the JTA jargon only: we say transaction manager (TM) for a 2PC TC, and (XA)Resource for a 2PC participant.

Every transaction has a unique id that is encoded into a structure named Xid (along with the id of the TM) representing a subtransaction, or a branch in the XA jargon. The isSameRM method allows the TM to compare the resource it is about to enlist in the current transaction against every resource it has already enlisted, in order to reduce the complexity of 2PC during commit processing and potentially enable the 1PC Optimization. The TM makes sure that the work done on behalf of a transaction is delimited with calls to start(xid) and end(xid), respectively, which adds at least 4n messages per transaction since these calls are typically blocking. When a client requests a commit of a transaction, the TM first calls prepare on every enlisted resource and, based on the returned statuses, in the second phase it calls either commit or rollback. After a crash, the TM contacts all registered XAResource's and executes a (misnamed) method recover to obtain the Xid list of prepared but not committed, so-called in-doubt, transactions. The TM then checks every Xid belonging to it (there can be other TM's Xid's) against its log and (re)issues the commit call if the commit entry is present in the log; otherwise abort is called.
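The XAResource interactions described in Section 1.3 can be summarized in a short sketch of a transaction manager's commit path. This is a simplified illustration of the standard javax.transaction.xa interfaces only: the creation of the Xid, the earlier start(xid) enlistment calls, and realistic error handling are omitted, and it is not the WLS TM implementation.

import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Simplified TM-side commit path over the standard JTA XAResource API.
// The resources are assumed to have been enlisted earlier with
// xares.start(xid, XAResource.TMNOFLAGS).
final class SimpleXaCommit {
  static void commit(Xid xid, List<XAResource> enlisted) throws XAException {
    // Delimit the branch work on every resource.
    for (XAResource xares : enlisted) {
      xares.end(xid, XAResource.TMSUCCESS);
    }
    // Voting phase: a rollback vote surfaces as an XAException; a resource
    // voting XA_RDONLY is finished and can be skipped in the second phase.
    List<XAResource> toCommit = new ArrayList<>();
    try {
      for (XAResource xares : enlisted) {
        if (xares.prepare(xid) == XAResource.XA_OK) {
          toCommit.add(xares);
        }
      }
    } catch (XAException nack) {
      rollback(xid, enlisted);            // presumed abort
      return;
    }
    // Second phase: commit every prepared branch (onePhase = false).
    for (XAResource xares : toCommit) {
      xares.commit(xid, false);
    }
  }

  static void rollback(Xid xid, List<XAResource> enlisted) {
    for (XAResource xares : enlisted) {
      try {
        xares.rollback(xid);
      } catch (XAException ignored) {
        // a real TM would retry or log the failure; ignored here for brevity
      }
    }
  }
}

After a crash, the same interface's recover method (called with the TMSTARTRSCAN flag) is what yields the list of in-doubt Xid's mentioned above.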
1.4 Contribution
Our contribution is an implementation of the LLR optimization through JDBC datasources in WLS [8], an adaptation of the Nested 2PC optimization to JEE. The rest of the paper is organized as follows: Section 2 outlines the relevant WLS components, Section 3 discusses the design considerations of the LLR optimization, Section 4 describes the implementation details of normal operation and failure recovery, Section 5 highlights the performance achieved in a standard JEE benchmark, and finally Section 6 concludes the paper.

2. ORACLE WEBLOGIC SERVER
As part of Oracle Fusion Middleware 11g, a family of standards-based middleware products, WLS [7] is a Java application server that provides, inter alia, a complete implementation of the JEE5 specification. JEE5 defines containers that manage user application components such as servlets (for web-based applications), Enterprise Java Beans (EJB), Message Driven Beans (MDB), and application clients (outside the server container), as well as services that the containers provide to user application components, such as Java Database Connectivity (JDBC) for accessing SQL databases, Java Messaging Service (JMS) for reliable asynchronous point-to-point and publish-subscribe communication, and JTA for coordinating 2PC transactions. User applications typically access container services by looking up the relevant API entry points in a directory implementing the Java Naming and Directory Interface (JNDI) provided by the application server. Specifications and documentation about JEE5 are linked from the online reference [10].

In WLS, users register the JMS and JDBC resources needed by their application components by means of corresponding deployment modules. JMS modules configure JMS client objects such as (potentially XA-enabled) connection factories and message destinations (queues or topics). A JDBC module provides the definition of a SQL datasource, a collection of parameters such as the name of a (potentially XA-capable) JDBC driver and the JDBC URL. JDBC datasources provide database access and database connection management. Each datasource defines a pool of database connections. User applications reserve a database connection from the datasource connection pool by looking up the datasource (javax.sql.DataSource) on the JNDI context and calling getConnection() on it. When finished with the connection, the application should call close() on it as early as possible, which returns the database connection to the pool for further reuse. Using even an ideal XA JDBC driver is in general more expensive, even in the single-resource 1PC transaction case: the extra round-trip XAResource methods start and end have to be called regardless of the number of resources participating in the transaction.

WLS ships with an integrated JMS provider (i.e., a JMS server) that can use a local filesystem or a JDBC datasource as the underlying message persistence mechanism.

The WLS JTA implementation consists of a distributed service that runs one TM instance per WLS server. A transaction may span multiple WLS servers (e.g., when an application invokes remote EJB's), in which case the TM's form a coordinator tree with the global coordinator being the root node. Each TM maintains persistent state for 2PC transactions in its transaction log for crash recovery purposes.

All WLS services on a server instance that require persistence, including the JMS servers and the TM, can be configured to use a single transactional Persistent Store that enables them to appear as a single resource in the 2PC. The Persistent Store is either a File Store, a file-based implementation optimized for fast writes, or a JDBC Store utilizing a SQL database. Customers with high performance requirements use File Stores.
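For reference, the datasource access pattern described earlier in this section (JNDI lookup, getConnection, close) looks as follows in application code. This is a generic JNDI/JDBC sketch; the JNDI name, table, and SQL statement are made-up examples rather than anything mandated by WLS.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Generic container-managed datasource usage: look up the pool on JNDI,
// borrow a connection, use it, and return it to the pool via close().
final class OrderDao {
  void recordOrder(int orderId, String item) throws NamingException, SQLException {
    DataSource ds = (DataSource) new InitialContext().lookup("jdbc/OrderDS"); // hypothetical JNDI name
    try (Connection con = ds.getConnection();                 // reserve a pooled connection
         PreparedStatement ps = con.prepareStatement(
             "INSERT INTO orders (id, item) VALUES (?, ?)")) { // made-up table
      ps.setInt(1, orderId);
      ps.setString(2, item);
      ps.executeUpdate();
    } // close() returns the connection to the pool; the JTA TM handles the commit
  }
}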
3. LLR DESIGN CONSIDERATIONS
In contrast to the Nested 2PC Optimization, we do not want to invoke prepare sequentially in a nested manner because it increases the overall latency to approximately (n-1)(rtt+dst), where rtt is the roundtrip time between two resources and dst is an average hard-disk sync time. This part of the Nested 2PC Optimization saves messages and was targeted, three decades ago, at expensive thin network links and frequent rollbacks, which we no longer experience today in the broadband age. However, we still need to single out one resource that we want to commit directly without calling prepare. The TM calls prepare asynchronously on n-1 resources. If the TM has received n-1 ACK's, it invokes commit on the nth resource (the Last Resource), and the success of this call is used as the global decision about whether to call commit on the n-1 resources prepared during the voting phase as well.

The following points suggest choosing a JDBC datasource as the LLR:

• In a typical scenario, an application in WLS accesses a local JMS server and remote database servers, and such a (sub)transaction is coordinated by the local TM. Passing 2PC messages to a local resource has the negligible cost of a local method invocation in Java because we use collocation. Therefore, a JDBC datasource connection is a perfect candidate to be selected as the Last Resource.

• In contrast to JMS messages, database data structures such as data records and primary and secondary indices are subject to frequent updates. Committing a database transaction in one phase, without waiting for the completion of the voting phase, greatly reduces the duration of database locks. This potentially more than doubles the database transaction throughput. Therefore, even when the application accesses a remote JMS server (and the collocation argument no longer holds), the user will want to use a JDBC datasource as the LLR.

• When any component involved in a PA2PC transaction fails during the commit phase, the progress of this PA2PC instance (transaction) blocks until the failed component is restarted. Reducing the number of components minimizes the risk of running components being blocked by failed ones. The LLR optimization enables the TM to eliminate the local Persistent Store resource by logging the global transaction outcome directly to the LLR. Hence, applications that do not involve local JEE services using the local Persistent Store (other than the JTA TM) benefit even more from the LLR optimization.

• 2PC provides only Atomicity and Durability in the ACID contract; it does not provide global concurrency control (serialization, isolation) for global transactions [13]. Even worse, the order in which the changes made in different branches within the same global transaction become visible is not defined. LLR adds some intra-transaction determinism by defining the partial order: updates to the LLR branch are visible before updates to other branches. An application that makes an update to a JDBC datasource that is an LLR and sends a notification about this particular update to a JMS destination in the same transaction is guaranteed that the JDBC update is visible at the time the notification is received, as sketched in the example below.
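The following sketch shows such a transaction from the application's point of view: a database update through the datasource and a JMS notification about it, both inside one container-managed JTA transaction (e.g., an EJB business method). The JNDI names, destination, and SQL are hypothetical; whether the datasource is deployed as XA or LLR does not change this code, but with an LLR datasource the consumer of the message can rely on the database update already being visible.

import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// One JTA transaction: update a row through the (LLR or XA) datasource and
// publish a JMS notification about it. All JNDI names and SQL are made up.
final class PriceUpdater {
  void updateAndNotify(int itemId, double newPrice)
      throws NamingException, SQLException, JMSException {
    InitialContext ctx = new InitialContext();
    DataSource ds = (DataSource) ctx.lookup("jdbc/PricingDS");          // hypothetical LLR datasource
    ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/XACF");  // hypothetical XA-enabled factory
    Queue queue = (Queue) ctx.lookup("jms/PriceChanges");               // hypothetical destination

    try (java.sql.Connection dbCon = ds.getConnection();
         PreparedStatement ps = dbCon.prepareStatement(
             "UPDATE items SET price = ? WHERE id = ?")) {              // made-up table
      ps.setDouble(1, newPrice);
      ps.setInt(2, itemId);
      ps.executeUpdate();                                               // the LLR branch of the global transaction

      Connection jmsCon = cf.createConnection();
      try {
        Session session = jmsCon.createSession(true, Session.SESSION_TRANSACTED);
        MessageProducer producer = session.createProducer(queue);
        producer.send(session.createTextMessage("price-changed:" + itemId)); // an XA branch
      } finally {
        jmsCon.close();
      }
    }
    // The container-managed JTA transaction commits both branches. With LLR,
    // the database update commits first and is therefore already visible by
    // the time the notification can be consumed.
  }
}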
The decision to use a JDBC resource as the last transaction-outcome decider poses another challenge. The Nested 2PC Optimization builds upon the fact that every resource in the chain is capable of being a coordinator, i.e., of persisting the transaction outcome and continuing the protocol execution after a crash. However, by design, there is no mechanism in a vanilla (XA) JDBC driver to remember transaction outcomes. The database is obliged by the ACID contract to preserve the updates of a committed transaction, but it will discard the transaction information from the log, as we explained above for PA2PC.

The solution to this problem is to maintain an extra LLR table in the schema connected to by the Last Resource datasource. When the TM performs the LLR Optimization, instead of force-logging the transaction commit to the local Persistent Store, it inserts an entry for the transaction id (XID) into the LLR table, and only then does the TM invoke commit on the Last Resource. This way we make sure that committing the Last Resource and logging the global transaction outcome is an atomic operation (hence the name Logging Last Resource); see commit in Figure 2. Therefore, the TM will be able to query the LLR table to determine the real transaction outcome even if the TM or the database fail, once both operate normally again. Persisting the transaction id as part of the transaction has also been used for recoverable ODBC sessions in a non-XA setting [1].

Once the TM has committed the LLR transaction, it issues commit requests to the prepared XA resources asynchronously. Receiving all outstanding commit request replies terminates the XA-2PC protocol. The WLS TM lazily garbage-collects the log entries of terminated 2PC instances in the LLR table as part of subsequent transactions.

Now that we realize that we do not utilize the XAResource functionality to deal with the Last Resource, we can safely replace the XA JDBC driver in the LLR datasource definition by a faster non-XA version of the JDBC driver, which also eliminates the expensive roundtrip calls to the XAResource methods start and end.

update(XAResource xares, Transaction t, …)
  1. if xares is the LLR, accept only if t.llr is null,
     and set t.llr = xares.
  2. if xares is not the LLR and t.enlisted does not
     contain xares:
       a. t.enlisted.add(xares)
       b. xares.start(t.xid)
  3. Perform updates via xares' connection.

commit(Transaction t)
  1. for each xares in t.enlisted submit concurrent
     tasks { xares.end(t.xid); xares.prepare(t.xid); }
  2. wait for the prepare tasks to finish
  3. if t.llr is not null:
       a. t.llr.insert(t.xid)
       b. t.llr.commit(); hence the commit is testable
          by looking up the xid in t.llr's table.
  4. if t.llr is null, force-log the commit record
     t.xid to the local Persistent Store
  5. for each xares in t.enlisted submit concurrent
     tasks { xares.commit(t.xid); }

On any error prior to commit's step 3.b, the following is
called; otherwise the LLR table is consulted.
error(Transaction t)
  1. for each xares in t.enlisted submit concurrent
     tasks { xares.end(t.xid); xares.rollback(t.xid); }
  2. if t.llr is not null, t.llr.rollback()
  3. throw an exception to interrupt the caller

recover(TransactionManager tm)
  1. add committed xid's from the local Persistent
     Store to tm.committedXidSet.
  2. add committed xid's from the LLR tables to
     tm.committedXidSet.

When recreating JMS and non-LLR XA JDBC resources:
recover(XAResource xares)
  1. unresolvedXidSet = xares.recover()
  2. for each xid in unresolvedXidSet:
       a. if xid is in tm.committedXidSet,
          xares.commit(xid)
       b. if xid is not in tm.committedXidSet,
          xares.rollback(xid)

Figure 2: Simplified pseudocode of LLR processing.

4. LLR PROCESSING DETAILS
On datasource deployment or during server boot, LLR datasources load or create a table on the database from which the datasource pools database connections. The table is created in the original datasource schema. If the database table cannot be created or loaded, then the server boot will fail.

Within a global transaction, the first connection obtained from an LLR datasource transparently reserves an internal JDBC connection on the transaction's coordinator. All subsequent transaction operations on any connections obtained from a same-named datasource on any server are routed to this same single internal JDBC connection.

Figure 2 sketches the LLR-modified processing of JTA transactions in WLS. During normal operation, using the LLR resource saves us at least one blocking call, start(xid), when the LLR is being enlisted in the transaction (update), and one blocking call, end(xid), when the LLR is being committed. Instead of calling prepare on the LLR, we issue an insert statement to record the transaction xid in the LLR table; this call, however, does not incur log forcing on the LLR. In contrast to Nested 2PC, all non-LLR resources execute 2PC concurrently. However, the commit of the LLR is not started until the prepare phase of all non-LLR resources completes.

4.1 LLR Table
Each WLS instance maintains an LLR table on the database to which a JDBC LLR datasource pools database connections. These tables are used for storing 2PC commit records. If multiple LLR datasources are deployed on the same WLS instance and connect to the same database instance and database schema, they will share the same LLR table. Unless configured explicitly, LLR table names are automatically generated as WL_LLR_<serverName>.
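To make the atomicity argument concrete, the essence of the LLR commit step (step 3 of commit in Figure 2) is to write the commit record and commit the transaction's own database work in one local database transaction. The sketch below shows this with plain JDBC; the table name, column, and xid serialization are simplified assumptions, not the actual WLS schema or record format.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of the LLR commit step: the 2PC commit record is inserted into the
// LLR table and committed together with the transaction's own updates on the
// same non-XA connection (auto-commit disabled), so logging the outcome and
// committing the Last Resource is one atomic database commit. The table and
// column names are illustrative only.
final class LlrCommitStep {
  /** Returns normally iff the global transaction is decided as committed. */
  static void commitLastResource(Connection llrCon, byte[] xidBytes) throws SQLException {
    try (PreparedStatement ps = llrCon.prepareStatement(
        "INSERT INTO WL_LLR_EXAMPLE_SERVER (XID_REC) VALUES (?)")) {  // hypothetical table
      ps.setBytes(1, xidBytes);                                       // simplified commit record
      ps.executeUpdate();
    }
    // One local commit makes both the application's updates and the commit
    // record durable; if it fails, the record is absent and the global
    // transaction is treated as rolled back.
    llrCon.commit();
  }
}

After a crash, the presence or absence of this row is exactly what the recovery steps in Sections 4.2 and 4.3 consult.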
4.2 Failure and Recovery Processing for LLR
The WLS TM processes transaction failures in the following way. For 2PC errors that occur before the LLR commit is attempted, the TM immediately throws an exception. For 2PC errors that occur during the LLR commit: if the record is written, the TM commits the transaction; if the record is not written, the TM rolls back the transaction; if it is unknown whether the record is written, the TM throws an ambiguous-commit-failure exception and attempts to complete the transaction every five seconds until the transaction abandon timeout; if the transaction is still incomplete at that point, the TM logs an abandoned-transaction message.

During server boot, the TM will use the LLR resource to read the transaction 2PC records from the database and then use the recovered information to commit any subtransaction on any participating non-LLR XA resources.

If the JDBC connection in an LLR resource fails during a 2PC record insert, the TM rolls back the transaction. If the JDBC connection in an LLR resource fails during the commit of the LLR transaction, the result depends on whether the transaction is 1PC (i.e., the LLR resource is the only participant) or 2PC. For a 1PC transaction, the transaction will be committed, rolled back, or block waiting for the resolution of the LLR transaction; the outcome of the transaction is atomic and is determined by the underlying database alone. For a 2PC transaction, the outcome is determined based on the LLR table.

4.3 LLR Transaction Recovery
During server startup, the TM of each WLS server must recover the incomplete transactions coordinated by that server, including LLR transactions. Each server will attempt to read the transaction records from the LLR database tables of each LLR datasource. If the server cannot access the LLR database tables or if the recovery fails, the server will not start.

5. PERFORMANCE EVALUATION
LLR datasources have been used for the recent world-record results achieved by Oracle Fusion Middleware in the SPECjAppServer2004 benchmark [6]. SPECjAppServer2004 is a benchmark of JEE-based application servers. It is an end-to-end application which exercises all major JEE technologies implemented by compliant application servers: the web container, including servlets and JSP's; the EJB container; EJB 2.0 Container Managed Persistence; JMS and Message Driven Beans; Transaction Management; and JDBC [9]. Figure 3 depicts the way that these components relate to each other.

Figure 3: SPECjAppServer2004 worldwide distributed business [9] (diagram not reproduced).

The SPECjAppServer2004 benchmark's research workload, coined EAStress2004, is used to evaluate the performance of LLR in this paper. SPECjAppServer is a trademark of the Standard Performance Evaluation Corp. (SPEC). The SPECjAppServer2004 / EAStress2004 results or findings in this paper have not been reviewed or accepted by SPEC; therefore no comparison nor performance inference can be made against any published SPEC result [9].

The five SPECjAppServer2004 domains are: Dealer, Manufacturing, Supplier, Customer, and Corporate. The Dealer domain encompasses the user-interface components of a Web-based application used to access the services provided by the other domains (i.e., Manufacturing, Supplier, Customer, and Corporate), which can be considered "business domains". There are producer-consumer relationships between the domains in the company, as well as to outside suppliers and customers.

The hardware used in our runs is depicted in Figure 4, where each box represents a computer node. Java components such as the benchmark drivers, the supplier emulator, and WLS are run by the Oracle JRockit JVM R27.6. The benchmark application accesses the database from WLS using Oracle JDBC Driver 11g. XA datasources backed by oracle.jdbc.xa.client.OracleXADataSource and non-XA datasources backed by oracle.jdbc.OracleDriver are used for the XA and LLR runs, respectively.

Figure 4: EAStress2004 setup.
- Driver1 (3 Dealer Agents, 3 Manufacturing Agents): CPU: two quad-core 3.7 GHz x86_64, 2 MB L2 cache; RAM: 8 GB
- Driver2 (3 Dealer Agents, 3 Manufacturing Agents): CPU: two quad-core 2.7 GHz x86_64, 2 MB L2 cache; RAM: 16 GB
- External Supplier Emulator: CPU: two quad-core 2.7 GHz x86_64, 2 MB L2 cache; RAM: 16 GB
- System Under Test (SUT):
  - Middle Tier (MT), Oracle WebLogic Server 10.3: CPU: two quad-core 2.83 GHz x86_64, 6 MB L2 cache; RAM: 16 GB
  - Database Tier (DB), Oracle Database 11g EE: CPU: four quad-core 2.7 GHz x86_64, 2 MB L2 cache; RAM: 64 GB
The EAStress2004 workload is run with a 10-minute ramp-up and a 60-minute steady state at Injection Rate (IR) 700 for both the XA and LLR datasources. IR refers to the rate at which business transaction requests from the Dealer application in the Dealer Domain are injected into the system under test (SUT). Table 1 summarizes the most important results. Average transaction response times benefit dramatically from the LLR Optimization.

Table 1: Summary of EAStress2004 v1.08 results for IR 700 (response time in seconds)

          Purchase   Manage   Browse   Manufacturing
  XA        4.20      2.40     5.40        3.75
  LLR       1.50      1.20     1.90        3.00
  Delta     2.8x      2.0x     2.8x        1.25x

In a run with an hour-long steady state, we observe the following with XA datasources: DB CPU utilization increases by 80%, and MT CPU goes up by 6%. The round trips to the DB increase by around 50%. The I/O writes on the MT increase by 80%, with I/O service times remaining stable but disk utilization increasing by 30%. Figure 5 depicts a resource utilization chart for the runs with and without LLR.

Figure 5: Resource utilization in EAStress2004: MT CPU, DB CPU, I/O requests per second on the MT (hundreds), and round trips to the DB (millions) for the LLR and XA runs (chart not reproduced).

6. CONCLUSION
We introduced the LLR Optimization in WLS for XA-2PC involving JDBC datasource connections. LLR provides significant optimizations of, and features beyond, the original Nested 2PC in the context of JEE, which are unique to Oracle WLS: a failure during LLR resource recovery fails the boot process; LLR is cluster-ready, i.e., in a cluster only the TC node hosts the LLR transaction's physical JDBC connection, and requests submitted to identically named LLR datasources on other cluster members within the same transaction are rerouted through the coordinator; and LLR log entries are automatically garbage-collected. We are able to improve the JTA performance in WLS in a transparent fashion such that the user code does not require any changes. Solely the deployment descriptors or server-wide datasource definitions need to be redefined as LLR-enabled. Users need to make sure that their application deployments do not enlist more than one LLR datasource per global transaction. The LLR datasource connection is automatically selected for logging the global transaction outcome instead of the local Persistent Store. A successful commit of the LLR datasource commits the global transaction. Saving one phase on a remote datasource often more than doubles the transaction throughput. This feature is widely used by our customers in mission-critical applications with high performance requirements and has been validated by deploying it in recognized industry benchmarks [6].

7. REFERENCES
[1] Barga, R., D. Lomet, T. Baby, and S. Agrawal: Persistent Client-Server Database Sessions. In Proceedings of the 7th Int'l Conference on Extending Database Technology (EDBT), Konstanz, Germany, Mar 2000: 462-477.
[2] Gray, J.: Notes on Data Base Operating Systems. In Advanced Course: Operating Systems, Springer, 1978: 393-481.
[3] Lampson, B. and D. Lomet: A New Presumed Commit Optimization for Two Phase Commit. In Proceedings of the 19th Int'l Conference on Very Large Data Bases, Dublin, Ireland, Aug 1993, Morgan Kaufmann, San Francisco, CA, USA: 630-640.
[4] Mohan, C. and B. Lindsay: Efficient Commit Protocols for the Tree of Processes Model of Distributed Transactions. In Proceedings of the 2nd ACM SIGACT/SIGOPS Symposium on Principles of Distributed Computing, Montreal, Quebec, Canada, Aug 1983: 76-88.
[5] Mohan, C., B. Lindsay, and R. Obermarck: Transaction Management in the R* Distributed Database Management System. In ACM Transactions on Database Systems, 11(4), Dec 1986: 378-396.
[6] Oracle Corp.: Oracle Benchmark Results. http://www.oracle.com/solutions/performance_scalability/benchmark_results.html
[7] Oracle Corp.: Oracle Fusion Middleware. http://www.oracle.com/products/middleware/index.html
[8] Oracle Corp.: Programming WebLogic JTA: Logging Last Resource Transaction Optimization. http://download.oracle.com/docs/cd/E12840_01/wls/docs103/jta/llr.html
[9] SPEC: SPECjAppServer2004. http://www.spec.org/jAppServer2004/
[10] Sun Microsystems: Java EE at a Glance. http://java.sun.com/javaee/
[11] Sun Microsystems: Java Transaction API Specification 1.0.1. http://java.sun.com/javaee/technologies/jta/
[12] The Open Group: Distributed Transaction Processing: The XA Specification. X/Open Company Ltd., UK, 1991.
[13] Weikum, G. and G. Vossen: Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery. Morgan Kaufmann, San Francisco, CA, USA, 2001.

Oracle Clusterware and Private Network Considerations - Practical Performance...Guenadi JILEVSKI
 
Oow2007 performance
Oow2007 performanceOow2007 performance
Oow2007 performanceRicky Zhu
 
XPDS14 - RT-Xen: Real-Time Virtualization in Xen - Sisu Xi, Washington Univer...
XPDS14 - RT-Xen: Real-Time Virtualization in Xen - Sisu Xi, Washington Univer...XPDS14 - RT-Xen: Real-Time Virtualization in Xen - Sisu Xi, Washington Univer...
XPDS14 - RT-Xen: Real-Time Virtualization in Xen - Sisu Xi, Washington Univer...The Linux Foundation
 

Similar to Logging Last Resource Optimization for Distributed Transactions in Oracle… (20)

13 tm adv
13 tm adv13 tm adv
13 tm adv
 
Sedna XML Database: Executor Internals
Sedna XML Database: Executor InternalsSedna XML Database: Executor Internals
Sedna XML Database: Executor Internals
 
RTOS implementation
RTOS implementationRTOS implementation
RTOS implementation
 
Topic2 Understanding Middleware
Topic2 Understanding MiddlewareTopic2 Understanding Middleware
Topic2 Understanding Middleware
 
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGYDistributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
 
Load balancing and failover options
Load balancing and failover optionsLoad balancing and failover options
Load balancing and failover options
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
 
RTOS Material hfffffffffffffffffffffffffffffffffffff
RTOS Material hfffffffffffffffffffffffffffffffffffffRTOS Material hfffffffffffffffffffffffffffffffffffff
RTOS Material hfffffffffffffffffffffffffffffffffffff
 
PostgreSQL Portland Performance Practice Project - Database Test 2 Background
PostgreSQL Portland Performance Practice Project - Database Test 2 BackgroundPostgreSQL Portland Performance Practice Project - Database Test 2 Background
PostgreSQL Portland Performance Practice Project - Database Test 2 Background
 
Postgres clusters
Postgres clustersPostgres clusters
Postgres clusters
 
Network and TCP performance relationship workshop
Network and TCP performance relationship workshopNetwork and TCP performance relationship workshop
Network and TCP performance relationship workshop
 
Software architecture for data applications
Software architecture for data applicationsSoftware architecture for data applications
Software architecture for data applications
 
dos.ppt.pptx
dos.ppt.pptxdos.ppt.pptx
dos.ppt.pptx
 
Linux process management
Linux process managementLinux process management
Linux process management
 
Oracle Clusterware and Private Network Considerations - Practical Performance...
Oracle Clusterware and Private Network Considerations - Practical Performance...Oracle Clusterware and Private Network Considerations - Practical Performance...
Oracle Clusterware and Private Network Considerations - Practical Performance...
 
Oow2007 performance
Oow2007 performanceOow2007 performance
Oow2007 performance
 
Rac questions
Rac questionsRac questions
Rac questions
 
Rtos
RtosRtos
Rtos
 
XPDS14 - RT-Xen: Real-Time Virtualization in Xen - Sisu Xi, Washington Univer...
XPDS14 - RT-Xen: Real-Time Virtualization in Xen - Sisu Xi, Washington Univer...XPDS14 - RT-Xen: Real-Time Virtualization in Xen - Sisu Xi, Washington Univer...
XPDS14 - RT-Xen: Real-Time Virtualization in Xen - Sisu Xi, Washington Univer...
 
Data (1)
Data (1)Data (1)
Data (1)
 

More from Gera Shegalov

#SlimScalding - Less Memory is More Capacity
#SlimScalding - Less Memory is More Capacity#SlimScalding - Less Memory is More Capacity
#SlimScalding - Less Memory is More CapacityGera Shegalov
 
The Role of Database Systems in the Era of Big Data
The Role  of Database Systems  in the Era of Big DataThe Role  of Database Systems  in the Era of Big Data
The Role of Database Systems in the Era of Big DataGera Shegalov
 
Hadoop 2 @ Twitter, Elephant Scale
Hadoop 2 @ Twitter, Elephant Scale Hadoop 2 @ Twitter, Elephant Scale
Hadoop 2 @ Twitter, Elephant Scale Gera Shegalov
 
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...Gera Shegalov
 
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...Gera Shegalov
 
Apache Drill @ PJUG, Jan 15, 2013
Apache Drill @ PJUG, Jan 15, 2013Apache Drill @ PJUG, Jan 15, 2013
Apache Drill @ PJUG, Jan 15, 2013Gera Shegalov
 
Transaction Timestamping in Temporal Databases
Transaction Timestamping in Temporal DatabasesTransaction Timestamping in Temporal Databases
Transaction Timestamping in Temporal DatabasesGera Shegalov
 
Unstoppable Stateful PHP Web Services
Unstoppable Stateful PHP Web ServicesUnstoppable Stateful PHP Web Services
Unstoppable Stateful PHP Web ServicesGera Shegalov
 
Formal Verification of Transactional Interaction Contract
Formal Verification of Transactional Interaction ContractFormal Verification of Transactional Interaction Contract
Formal Verification of Transactional Interaction ContractGera Shegalov
 
Formal Verification of Web Service Interaction Contracts
Formal Verification of Web Service Interaction ContractsFormal Verification of Web Service Interaction Contracts
Formal Verification of Web Service Interaction ContractsGera Shegalov
 
CTL Model Checking in Database Cloud
CTL Model Checking in Database CloudCTL Model Checking in Database Cloud
CTL Model Checking in Database CloudGera Shegalov
 

More from Gera Shegalov (11)

#SlimScalding - Less Memory is More Capacity
#SlimScalding - Less Memory is More Capacity#SlimScalding - Less Memory is More Capacity
#SlimScalding - Less Memory is More Capacity
 
The Role of Database Systems in the Era of Big Data
The Role  of Database Systems  in the Era of Big DataThe Role  of Database Systems  in the Era of Big Data
The Role of Database Systems in the Era of Big Data
 
Hadoop 2 @ Twitter, Elephant Scale
Hadoop 2 @ Twitter, Elephant Scale Hadoop 2 @ Twitter, Elephant Scale
Hadoop 2 @ Twitter, Elephant Scale
 
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
 
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
Integrated Data, Message, and Process Recovery for Failure Masking in Web Ser...
 
Apache Drill @ PJUG, Jan 15, 2013
Apache Drill @ PJUG, Jan 15, 2013Apache Drill @ PJUG, Jan 15, 2013
Apache Drill @ PJUG, Jan 15, 2013
 
Transaction Timestamping in Temporal Databases
Transaction Timestamping in Temporal DatabasesTransaction Timestamping in Temporal Databases
Transaction Timestamping in Temporal Databases
 
Unstoppable Stateful PHP Web Services
Unstoppable Stateful PHP Web ServicesUnstoppable Stateful PHP Web Services
Unstoppable Stateful PHP Web Services
 
Formal Verification of Transactional Interaction Contract
Formal Verification of Transactional Interaction ContractFormal Verification of Transactional Interaction Contract
Formal Verification of Transactional Interaction Contract
 
Formal Verification of Web Service Interaction Contracts
Formal Verification of Web Service Interaction ContractsFormal Verification of Web Service Interaction Contracts
Formal Verification of Web Service Interaction Contracts
 
CTL Model Checking in Database Cloud
CTL Model Checking in Database CloudCTL Model Checking in Database Cloud
CTL Model Checking in Database Cloud
 

Logging Last Resource Optimization for Distributed Transactions in Oracle…

EDBT 2010, March 22-26, 2010, Lausanne, Switzerland. Copyright 2010 ACM 978-1-60558-945-9/10/0003.

1. INTRODUCTION
A transaction (a sequence of operations delimited by commit or rollback calls) is a so-called ACID contract between a client application and a transactional resource such as a database or a messaging system that guarantees: 1) Atomicity: the effects of aborted transactions (hit by a failure prior to commit or explicitly aborted by the user via rollback) are erased. 2) Consistency: transactions violating consistency constraints are automatically rejected/aborted. 3) Isolation: from the application perspective, transactions are executed one at a time even in typical multi-user deployments. 4) Durability (Persistence): state modifications made by committed transactions survive subsequent system failures (i.e., they are redone when necessary) [13].

Transactions are usually implemented using a sequential recovery log containing undo and redo information concluded by a commit record stored in persistent memory such as a hard disk. A transaction is considered committed by recovery when its commit record is present in the log.

1.1 Presumed-Abort 2PC
Voting (Prepare) Phase: The TC sends a prepare message to every participant. A participant that determines its subtransaction can be committed makes the subtransaction recoverable and replies with a positive vote (ACK). Subtransaction recoverability is achieved by creating a special prepared log record and forcing the transaction log to disk. Since the transaction is not committed yet, no locks are released, so that the chosen isolation level is preserved. When the participant determines that the transaction is not committable for whatever reason, the transaction is aborted; nothing has to be force-logged (presumed abort), and a negative vote (NACK) is returned.

Commit Phase: When every participant has returned an ACK, the TC force-logs the committed record for the current transaction and notifies all participants using commit messages. Upon receiving a commit message, participants force-log the commit decision, release the locks acquired for transaction isolation, and send a commit status to the TC. Upon collecting all commit status messages from the participants, the TC can discard the transaction information (i.e., release its log records for garbage collection).

Abort Phase: When at least one participant sends a NACK, the TC sends rollback messages without forcing the log (presumed abort) and does not have to wait for rollback status messages.

The complexity of a failure-free run of a PA2PC transaction on n participants accumulates to 2n+1 forced writes and 4n messages. Since commit/rollback status messages are needed only for garbage collection at the coordinator site, they can be sent asynchronously, e.g., as a batch, or they can be piggybacked on the vote messages of the next transaction instance. Thus, the communication cost is reduced to 3n messages and, more importantly, to just one synchronous round trip. If one of the participants takes the role of the coordinator, we save one set of messages at the coordinator site, and we save one forced write since we do not need to log the transaction commit twice at the coordinator site. Thus, the overhead of a PA2PC commit can be further reduced to 2n forced writes and 3(n-1) messages [13].
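As a concrete illustration of the phases above, the following minimal Java sketch models the coordinator side of PA2PC against a generic participant abstraction. The Participant interface and the forceLogCommitRecord placeholder are assumptions made for this example only; they are not APIs from the paper or from WLS.

import java.util.ArrayList;
import java.util.List;

// Hypothetical participant abstraction: one vote/commit/rollback endpoint per resource.
interface Participant {
    boolean prepare();   // force-logs a prepared record and votes ACK (true) or NACK (false)
    void commit();       // force-logs the commit decision and releases locks
    void rollback();     // presumed abort: no decision record was force-logged
}

final class PresumedAbortCoordinator {
    private final List<Participant> participants;

    PresumedAbortCoordinator(List<Participant> participants) {
        this.participants = participants;
    }

    /** Returns true if the transaction committed, false if it was rolled back. */
    boolean commitTransaction() {
        List<Participant> prepared = new ArrayList<>();
        // Voting phase: n prepare messages; each ACK costs the participant one forced write.
        for (Participant p : participants) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // Presumed abort: no forced write at the coordinator and no status wait;
                // only participants that already voted ACK receive a rollback message.
                prepared.forEach(Participant::rollback);
                return false;
            }
        }
        // Commit phase: one forced write at the coordinator, then n commit messages.
        forceLogCommitRecord();
        participants.forEach(Participant::commit);
        // Commit status messages only enable log garbage collection at the coordinator;
        // they could be batched or piggybacked on the next transaction's vote messages.
        return true;
    }

    private void forceLogCommitRecord() {
        // Placeholder for a synchronous write of the commit record to the coordinator log.
    }
}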
1.2 Related Work
There are several well-known optimizations of the 2PC protocol. The following two optimizations are described in the XA spec and should be implemented by the vendors:

Read-Only Optimization: A participant may vote "read-only", indicating that no updates have been made on this participant on behalf of the current transaction. This participant is then skipped during the second phase.

One-Phase Commit (1PC): The TC should detect situations in which only one participant is enlisted in a distributed transaction and skip the vote phase.

Nested 2PC: Another interesting optimization that is not part of the XA standard was originally described by Jim Gray in [2]. It is also known as Tree or Recursive 2PC. It assumes a fixed linear topology of participants P1, ..., Pn. There is no fixed TC; instead, the role of the TC is propagated left-to-right during the vote phase and right-to-left during the commit phase. Clients call commit_2pc on P1, which subsequently generates a prepare message to itself. Each Pi (indicated by the variable i in the pseudocode) executes the simplified logic outlined in Figure 1 when processing a 2PC message from the source Pfrom. A failure-free execution of an instance of the Nested 2PC Optimization costs 2n forced writes and 2(n-1) messages.

void process(TwoPCMessage msg) {
  switch (msg) {
    case PREPARE:
      if (i == n) {
        if (executeCommit() == OK) {
          sendCommitTo(msg.from);
        } else {
          executeAbort();
          sendAbortTo(msg.from);
        }
      } else {
        if (executePrepare() == OK) {
          sendPrepareTo(i+1);
        } else {
          executeAbort();
          sendAbortTo(i-1);
          sendAbortTo(i+1);
        }
      }
      break;
    case ABORT:
      if (i != n) {
        sendAbortTo(msg.from == i-1 ? i+1 : i-1);
      }
      executeAbort();
      break;
    case COMMIT:
      if (i != 1) {
        sendCommitTo(i-1);
      }
      executeCommit();
      break;
  } // switch
}

// a) normal operation
//    ->prepare->    ->prepare->    ->prepare->
// P1             P2             P3             P4
//    <--commit<-    <--commit<-    <--commit<-
//
// b) failure on P3
//    ->prepare->    ->prepare->    -->abort-->
// P1             P2             P3             P4
//    <--abort<--    <--abort<--

Figure 1: C-style pseudocode of Nested 2PC, and its effect in a commit (a) and an abort (b) case.

1.3 Java Transaction API
The Java Transaction API (JTA) is a Java "translation" of the XA specification; it defines standard Java interfaces between a TC and the parties involved in a distributed transaction system: the transactional resources (e.g., messaging and database systems), the application server, and the transactional client applications [11]. A TC implements the JTA interface javax.transaction.TransactionManager. Conversely, in order to participate in 2PC transactions, participant resources implement the JTA interface javax.transaction.xa.XAResource. In order to support a hierarchical 2PC, a TC also implements javax.transaction.xa.XAResource so that it can be enlisted by third-party TC's. From this point on, we will use the JTA jargon only: we say transaction manager (TM) for a 2PC TC, and (XA)Resource for a 2PC participant.

Every transaction has a unique id that is encoded, along with the id of the TM, into a structure named Xid, which represents a subtransaction, or a branch in the XA jargon. The isSameRM method allows the TM to compare the resource it is about to enlist in the current transaction against every resource it has already enlisted, in order to reduce the complexity of 2PC during commit processing and potentially enable the 1PC Optimization. The TM makes sure that the work done on behalf of a transaction is delimited with calls to start(xid) and end(xid), respectively, which adds at least 4n messages per transaction since these calls are typically blocking. When a client requests a commit of a transaction, the TM first calls prepare on every enlisted resource and, based on the returned statuses, in the second phase it calls either commit or rollback.

After a crash, the TM contacts all registered XAResource's and executes a (misnamed) method recover to obtain the Xid list of prepared but not committed, so-called in-doubt transactions. The TM then checks every Xid belonging to it (there can be other TM's Xid's) against its log and (re)issues the commit call if the commit entry is present in the log; otherwise rollback is called.
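The JTA contract just described can be made concrete with a small sketch of a transaction manager's commit path over javax.transaction.xa.XAResource, including the Read-Only and 1PC optimizations from Section 1.2. This is an illustrative simplification (sequential voting, no decision logging, no timeouts), not the WLS TM.

import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Simplified, sequential rendering of a TM's commit path over enlisted XAResource's.
final class SimpleTwoPhaseCommit {

    // Commits the branch identified by xid on all enlisted resources.
    static void commit(Xid xid, List<XAResource> enlisted) throws XAException {
        // 1PC Optimization: a single participant can be committed without a vote phase.
        if (enlisted.size() == 1) {
            XAResource only = enlisted.get(0);
            only.end(xid, XAResource.TMSUCCESS);
            only.commit(xid, true /* onePhase */);
            return;
        }

        // Voting phase: resources that vote XA_RDONLY are skipped in the second phase.
        List<XAResource> voters = new ArrayList<>();
        try {
            for (XAResource res : enlisted) {
                res.end(xid, XAResource.TMSUCCESS);
                if (res.prepare(xid) == XAResource.XA_OK) {
                    voters.add(res);   // XA_RDONLY voters need no commit message
                }
            }
        } catch (XAException nack) {
            // Presumed abort: roll back every already-prepared resource and give up.
            for (XAResource res : voters) {
                res.rollback(xid);
            }
            throw nack;
        }

        // A real TM would force-log the commit decision here before the second phase.
        for (XAResource res : voters) {
            res.commit(xid, false);
        }
    }
}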
1.4 Contribution
Our contribution is an implementation of the LLR optimization through JDBC datasources in WLS [8], an adaptation of the Nested 2PC optimization to the JEE. The rest of the paper is organized as follows: Section 2 outlines the relevant WLS components, Section 3 discusses the design considerations of the LLR optimization, Section 4 describes the implementation details of normal operation and failure recovery, Section 5 highlights the performance achieved in a standard JEE benchmark, and finally Section 6 concludes the paper.
2. ORACLE WEBLOGIC SERVER
As part of Oracle Fusion Middleware 11g, a family of standards-based middleware products, WLS [7] is a Java application server that provides, inter alia, a complete implementation of the JEE5 specification. JEE5 defines containers that manage user application components such as servlets (for web-based applications), Enterprise Java Beans (EJB), Message Driven Beans (MDB), and application clients (outside the server container), as well as services that the containers provide to user application components, such as Java Database Connectivity (JDBC) for accessing SQL databases, the Java Message Service (JMS) for reliable asynchronous point-to-point and publish-subscribe communication, and JTA for coordinating 2PC transactions. User applications typically access container services by looking up the relevant API entry points in a directory implementing the Java Naming and Directory Interface (JNDI) provided by the application server. Specifications and documentation about JEE5 are linked from the online reference [10].

In WLS, users register the JMS and JDBC resources needed by their application components by means of corresponding deployment modules. JMS modules configure JMS client objects such as (potentially XA-enabled) connection factories and message destinations (queues or topics). A JDBC module provides the definition of a SQL datasource, a collection of parameters such as the name of a (potentially XA-capable) JDBC driver and the JDBC URL. JDBC datasources provide database access and database connection management. Each datasource defines a pool of database connections. User applications reserve a database connection from the datasource connection pool by looking up the datasource (javax.sql.DataSource) on the JNDI context and calling getConnection() on it. When finished with the connection, the application should call close() on it as early as possible, which returns the database connection to the pool for further reuse. Using even an ideal XA JDBC driver is in general more expensive than a non-XA driver, even in the single-resource 1PC transaction case: the extra round-trip XAResource methods start and end have to be called regardless of the number of resources participating in the transaction.

WLS ships with an integrated JMS provider (i.e., a JMS server) that can use a local filesystem or a JDBC datasource as the underlying message persistence mechanism.

The WLS JTA implementation consists of a distributed service that runs one TM instance per WLS server. A transaction may span multiple WLS servers (e.g., when an application invokes remote EJB's), in which case the TM's form a coordinator tree with the global coordinator being the root node. Each TM maintains persistent state for 2PC transactions in its transaction log for crash recovery purposes.

All WLS services on a server instance that require persistence, including the JMS servers and the TM, can be configured to use a single transactional Persistent Store that enables them to appear as a single resource in the 2PC. The Persistent Store is either a File Store, a file-based implementation optimized for fast writes, or a JDBC Store utilizing a SQL database. Customers with high performance requirements use File Stores.
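For readers less familiar with the JNDI/JDBC pattern described above, a minimal client-side usage sketch follows. The JNDI name "jdbc/OrdersDS" and the orders table are made-up examples, not names used by the paper or the benchmark.

import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.naming.InitialContext;
import javax.sql.DataSource;

// Reserves a pooled connection from a WLS-managed datasource and returns it promptly.
final class DatasourceClient {

    static void recordOrder(String orderId) throws Exception {
        InitialContext jndi = new InitialContext();
        // The datasource was registered by a JDBC deployment module under this (example) name.
        DataSource ds = (DataSource) jndi.lookup("jdbc/OrdersDS");

        // try-with-resources calls close() as early as possible,
        // returning the connection to the datasource pool for reuse.
        try (Connection con = ds.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO orders (order_id) VALUES (?)")) {
            ps.setString(1, orderId);
            ps.executeUpdate();
        }
    }
}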
3. LLR DESIGN CONSIDERATIONS
In contrast to the Nested 2PC Optimization, we do not want to invoke prepare sequentially in a nested manner, because doing so increases the overall latency to approximately (n-1)(rtt+dst), where rtt is the roundtrip time between two resources and dst is an average hard-disk sync time. This part of the Nested 2PC Optimization saves messages and was targeted, three decades ago, at expensive thin network links and frequent rollbacks, which we no longer experience today in the broadband age. However, we still need to single out one resource that we want to commit directly without calling prepare. The TM calls prepare asynchronously on n-1 resources. If the TM has received n-1 ACK's, it invokes commit on the nth resource (the Last Resource), and the success of this call is used as the global decision about whether to call commit on the n-1 resources prepared during the voting phase as well.

The following points suggest choosing a JDBC datasource as an LLR:

• In a typical scenario, an application in WLS accesses a local JMS server and remote database servers, and such a (sub)transaction is coordinated by the local TM. Passing 2PC messages to a local resource has the negligible cost of a local method invocation in Java because we use collocation. Therefore, a JDBC datasource connection is a perfect candidate to be selected as the Last Resource.

• In contrast to JMS messages, database data structures such as data records and primary and secondary indices are subject to frequent updates. Committing a database transaction in one phase, without waiting for the completion of the voting phase, greatly reduces the duration of database locks. This potentially more than doubles the database transaction throughput. Therefore, even when the application accesses a remote JMS server (and the collocation argument no longer holds), the user will want to use a JDBC datasource as an LLR.

• When any component involved in a PA2PC transaction fails during the commit phase, the progress of this PA2PC instance (transaction) blocks until the failed component is restarted. Reducing the number of components minimizes the risk of running components being blocked by failed ones. The LLR optimization enables the TM to eliminate the local Persistent Store resource by logging the global transaction outcome directly to the LLR. Hence, applications that do not involve local JEE services using the local Persistent Store (other than the JTA TM) benefit even more from the LLR optimization.

• 2PC provides only Atomicity and Durability in the ACID contract; it does not provide global concurrency control (serialization, isolation) for global transactions [13]. Even worse, the order in which the changes made in different branches within the same global transaction become visible is not defined. LLR adds some intra-transaction determinism by defining a partial order: updates to the LLR branch are visible before updates to other branches. An application that makes an update to a JDBC datasource that is an LLR and sends a notification about this particular update to a JMS destination in the same transaction is guaranteed that the JDBC update is visible at the time the notification is received.

The decision to use a JDBC resource as the decider of the final transaction outcome poses another challenge. The Nested 2PC Optimization builds upon the fact that every resource in the chain is capable of being a coordinator, i.e., of persisting the transaction outcome and continuing the protocol execution after a crash. However, by design, there is no mechanism in a vanilla (XA) JDBC driver to remember transaction outcomes. The database is obliged by the ACID contract to preserve the updates of a committed transaction, but it will discard the transaction information from its log, as we explained above for PA2PC.

The solution to this problem is to maintain an extra LLR table in the schema connected by the Last Resource datasource. When the TM performs the LLR Optimization, instead of force-logging the transaction commit to the local Persistent Store, it inserts an entry for the transaction id (XID) into the LLR table and only then invokes commit on the Last Resource. This way we make sure that the commit of the Last Resource and the logging of the global transaction outcome form an atomic operation (hence the name Logging Last Resource); see commit in Figure 2. Therefore, the TM is able to query the LLR table to determine the real transaction outcome even if the TM or the database fail, once both operate normally again. Persisting the transaction id as part of the transaction has also been used for recoverable ODBC sessions in a non-XA setting [1].

Once the TM has committed the LLR transaction, it issues commit requests to the prepared XA resources asynchronously. Receiving all outstanding commit request replies terminates the XA-2PC protocol. The WLS TM lazily garbage-collects log entries of terminated 2PC instances in the LLR table as part of subsequent transactions.

Now that we realize that we do not utilize the XAResource functionality to deal with the Last Resource, we can safely replace the XA JDBC driver in the LLR datasource definition by a faster non-XA version of the JDBC driver, which also eliminates expensive roundtrip calls to the XAResource methods start and end.
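The following sketch shows, in plain JDBC and XA terms, the commit decision this design leads to (and that Figure 2 in Section 4 specifies precisely): the non-LLR branches are prepared concurrently, and the xid record plus the LLR commit travel in one local database transaction. The table and column names and the helper class are illustrative assumptions; the actual WLS table is the auto-generated WL_LLR_<serverName> described in Section 4.1, and the real TM is considerably more involved.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Sketch of the LLR commit path: prepare non-LLR branches concurrently, then decide
// the global outcome by committing the LLR connection together with the xid record.
final class LlrCommitSketch {

    // Assumes autoCommit was disabled when llrConnection was first used in the transaction,
    // so the application's SQL updates and the xid record share one local transaction.
    static void commit(Xid xid, Connection llrConnection, List<XAResource> others)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, others.size()));
        try {
            // Phase 1 on the non-LLR resources only, issued concurrently.
            List<Future<?>> votes = new ArrayList<>();
            for (XAResource res : others) {
                votes.add(pool.submit(() -> {
                    res.end(xid, XAResource.TMSUCCESS);
                    res.prepare(xid);
                    return null;
                }));
            }
            for (Future<?> vote : votes) {
                vote.get();   // an exception here means a NACK: the caller rolls everything back
            }

            // Record the global decision and commit the LLR branch in one local transaction.
            // Table and column names are illustrative, not the WLS-generated layout.
            try (PreparedStatement ps = llrConnection.prepareStatement(
                    "INSERT INTO WL_LLR_EXAMPLE (xid) VALUES (?)")) {
                ps.setBytes(1, xid.getGlobalTransactionId());
                ps.executeUpdate();
            }
            llrConnection.commit();   // success here IS the global commit decision

            // Phase 2 on the prepared non-LLR resources (could also run concurrently).
            for (XAResource res : others) {
                res.commit(xid, false);
            }
        } finally {
            pool.shutdown();
        }
    }
}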
4. LLR PROCESSING DETAILS
On datasource deployment or during server boot, LLR datasources load or create a table on the database from which the datasource pools database connections. The table is created in the original datasource schema. If the database table cannot be created or loaded, then the server boot fails.

Within a global transaction, the first connection obtained from an LLR datasource transparently reserves an internal JDBC connection on the transaction's coordinator. All subsequent transaction operations on any connections obtained from a same-named datasource on any server are routed to this same single internal JDBC connection.

Figure 2 sketches the LLR-modified processing of JTA transactions in WLS. During normal operation, using the LLR resource saves us at least one blocking call, start(xid), when the LLR is being enlisted in the transaction (update), and one blocking call, end(xid), when the LLR is being committed. Instead of calling prepare on the LLR, we execute an insert statement that records the transaction xid in the LLR table; unlike prepare, this call does not incur log forcing on the LLR. In contrast to Nested 2PC, all non-LLR resources execute 2PC concurrently. However, the commit of the LLR is not started until the prepare phase of all non-LLR resources completes.

update(XAResource xares, Transaction t, ...)
1. if xares is LLR, accept only if t.llr is null and set t.llr = xares.
2. if xares is not LLR and t.enlisted doesn't contain xares
   a. t.enlisted.add(xares)
   b. xares.start(t.xid)
3. perform updates via xares' connection

commit(Transaction t)
1. for each xares in t.enlisted submit concurrent tasks
   { xares.end(t.xid); xares.prepare(xid); }
2. wait for prepare tasks to finish
3. if t.llr is not null
   a. t.llr.insert(t.xid)
   b. t.llr.commit(), hence commit is testable by looking up xid in the t.llr's table.
4. if t.llr is null, force-log commit record t.xid to the local Persistent Store
5. for each xares in t.enlisted submit concurrent tasks { xares.commit(xid); }

On any error prior to commit's step 3.b, the following is called; otherwise the LLR table is consulted.
error(Transaction t)
1. for each xares in t.enlisted submit concurrent tasks
   { xares.end(t.xid); xares.rollback(xid); }
2. if t.llr is not null, t.llr.rollback()
3. throw exception to interrupt the caller

recover(TransactionManager tm)
1. add committed xid's from the local Persistent Store to tm.committedXidSet.
2. add committed xid's from the LLR tables to tm.committedXidSet.

The following is executed when recreating JMS and non-LLR XA JDBC resources.
recover(XAResource xares)
1. unresolvedXidSet = xares.recover()
2. for each xid in unresolvedXidSet
   a. if xid is in tm.committedXidSet, xares.commit(xid)
   b. if xid is not in tm.committedXidSet, xares.rollback(xid)

Figure 2: Simplified pseudocode of LLR processing.
4.1 LLR Table
Each WLS instance maintains an LLR table on the database to which a JDBC LLR datasource pools database connections. These tables are used for storing 2PC commit records. If multiple LLR datasources are deployed on the same WLS instance and connect to the same database instance and database schema, they share the same LLR table. Unless configured explicitly, LLR table names are automatically generated as WL_LLR_<serverName>.

4.2 Failure and Recovery Processing for LLR
The WLS TM processes transaction failures in the following way. For 2PC errors that occur before the LLR commit is attempted, the TM immediately throws an exception. For 2PC errors that occur during the LLR commit: if the record is written, the TM commits the transaction; if the record is not written, the TM rolls back the transaction; if it is unknown whether the record is written, the TM throws an ambiguous-commit-failure exception and attempts to complete the transaction every five seconds until the transaction abandon timeout; if the transaction is still incomplete, the TM logs an abandoned-transaction message.

During server boot, the TM uses the LLR resource to read the transaction 2PC records from the database and then uses the recovered information to commit any subtransaction on any participating non-LLR XA resources.

If the JDBC connection in an LLR resource fails during a 2PC record insert, the TM rolls back the transaction. If the JDBC connection in an LLR resource fails during the commit of the LLR transaction, the result depends on whether the transaction is 1PC (i.e., the LLR resource is the only participant) or 2PC. For a 1PC transaction, the transaction will be committed, rolled back, or will block waiting for the resolution of the LLR transaction; the outcome of the transaction is atomic and is determined by the database alone. For a 2PC transaction, the outcome is determined based on the LLR table.

4.3 LLR Transaction Recovery
During server startup, the TM of each WLS instance must recover incomplete transactions coordinated by that server, including LLR transactions. Each server attempts to read the transaction records from the LLR database tables of each LLR datasource. If the server cannot access the LLR database tables or if the recovery fails, the server will not start.
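To tie Sections 4.2 and 4.3 together, here is a minimal sketch of the recovery-time resolution of in-doubt branches, assuming the committed xids can be read back from the LLR table. The SQL, the table and column names, and the xid encoding are illustrative assumptions; a full implementation would also merge xids recorded in the local Persistent Store, as in Figure 2.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Sketch of LLR-aware crash recovery: rebuild the set of committed transaction ids from
// the LLR table, then resolve every in-doubt branch reported by an XAResource against it.
final class LlrRecoverySketch {

    // Reads committed xids back from the LLR table; table and column names are illustrative.
    static Set<String> readCommittedXids(Connection llrConnection) throws Exception {
        Set<String> committed = new HashSet<>();
        try (Statement st = llrConnection.createStatement();
             ResultSet rs = st.executeQuery("SELECT xid FROM WL_LLR_EXAMPLE")) {
            while (rs.next()) {
                committed.add(Base64.getEncoder().encodeToString(rs.getBytes(1)));
            }
        }
        return committed;
    }

    // Resolves in-doubt branches on one recreated resource (JMS store, non-LLR XA JDBC, ...).
    static void resolve(XAResource xares, Set<String> committedXids) throws XAException {
        Xid[] inDoubt = xares.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        for (Xid xid : inDoubt) {
            if (committedXids.contains(encode(xid))) {
                xares.commit(xid, false);   // commit record found: replay the commit
            } else {
                xares.rollback(xid);        // presumed abort otherwise
            }
        }
    }

    private static String encode(Xid xid) {
        // A real TM would also compare the format id and the branch qualifier.
        return Base64.getEncoder().encodeToString(xid.getGlobalTransactionId());
    }
}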
5. PERFORMANCE EVALUATION
LLR datasources have been used for the recent world-record results achieved by Oracle Fusion Middleware in the SPECjAppServer2004 benchmark [6]. SPECjAppServer2004 is a benchmark of JEE-based application servers. It is an end-to-end application which exercises all major JEE technologies implemented by compliant application servers: the web container, including servlets and JSP's; the EJB container; EJB 2.0 Container Managed Persistence; JMS and Message Driven Beans; Transaction Management; and JDBC [9]. Figure 3 depicts the way these components relate to each other.

Figure 3: SPECjAppServer2004 worldwide distributed business [9].

The SPECjAppServer2004 benchmark's research workload, coined EAStress2004, is used to evaluate the performance of LLR in this paper. SPECjAppServer is a trademark of the Standard Performance Evaluation Corp. (SPEC). The SPECjAppServer2004 / EAStress2004 results or findings in this paper have not been reviewed or accepted by SPEC; therefore, no comparison nor performance inference can be made against any published SPEC result [9].

The five SPECjAppServer2004 domains are Dealer, Manufacturing, Supplier, Customer, and Corporate. The Dealer domain encompasses the user interface components of a Web-based application used to access the services provided by the other domains (i.e., Manufacturing, Supplier, Customer, and Corporate), which can be considered "business domains". There are producer-consumer relationships between domains in the company and to outside suppliers and customers as well.

The hardware used in our runs is depicted in Figure 4, where each box represents a computer node. Java components such as the benchmark drivers, the supplier emulator, and WLS are run by the Oracle JRockit JVM R27.6. The benchmark application accesses the database from WLS using the Oracle JDBC Driver 11g. XA datasources backed by oracle.jdbc.xa.client.OracleXADataSource and non-XA datasources backed by oracle.jdbc.OracleDriver are used for the XA and LLR runs, respectively.

Figure 4: EAStress2004 setup. Each box is a computer node:
- Driver1 (3 Dealer Agents, 3 Manufacturing Agents): two quad-core 3.7 GHz x86_64 CPUs, 2 MB L2 cache, 8 GB RAM
- Driver2 (3 Dealer Agents, 3 Manufacturing Agents): two quad-core 2.7 GHz x86_64 CPUs, 2 MB L2 cache, 16 GB RAM
- External Supplier Emulator: two quad-core 2.7 GHz x86_64 CPUs, 2 MB L2 cache, 16 GB RAM
- System Under Test (SUT), Middle Tier (MT) running Oracle WebLogic Server 10.3: two quad-core 2.83 GHz x86_64 CPUs, 6 MB L2 cache, 16 GB RAM
- System Under Test (SUT), Database Tier (DB) running Oracle Database 11g EE: four quad-core 2.7 GHz x86_64 CPUs, 2 MB L2 cache, 64 GB RAM
The EAStress2004 workload is run with a 10-minute ramp-up and a 60-minute steady state at Injection Rate (IR) 700 for both the XA and LLR datasources. IR refers to the rate at which business transaction requests from the Dealer application in the Dealer Domain are injected into the system under test (SUT). Table 1 summarizes the most important results. Average transaction response times benefit dramatically from the LLR Optimization.

Table 1: Summary of EAStress2004 v1.08 results for IR 700 (response time in seconds)

        Purchase   Manage   Browse   Manufacturing
XA      4.20       2.40     5.40     3.75
LLR     1.50       1.20     1.90     3.00
Delta   2.8x       2x       2.8x     1.25x

In a run with an hour-long steady state, we observe that with XA datasources, DB CPU utilization increases by 80% and MT CPU goes up by 6%. The round trips to the DB increase by around 50%. The I/O writes on the MT increase by 80%, with I/O service times remaining stable but disk utilization increasing by 30%. Figure 5 depicts a resource utilization chart for the runs with and without LLR.

Figure 5: Resource utilization in EAStress2004 (bar chart comparing the XA and LLR runs on MT CPU, DB CPU, I/O requests per second on the MT in hundreds, and round trips to the DB in millions).

6. CONCLUSION
We introduced the LLR Optimization in WLS for XA-2PC involving JDBC datasource connections. LLR provides significant optimizations of, and features beyond, the original Nested 2PC in the context of JEE, which are unique to Oracle WLS: a failure during LLR resource recovery fails the boot process; LLR is cluster-ready: in a cluster, only the TC node hosts the LLR transaction's physical JDBC connection, and requests submitted to identically named LLR datasources on other cluster members within the same transaction are rerouted through the coordinator; and LLR log entries are automatically garbage-collected. We are able to improve JTA performance in WLS in a transparent fashion such that the user code does not require any changes. Only the deployment descriptors or server-wide datasource definitions need to be redefined as LLR-enabled. Users need to make sure that their application deployments do not enlist more than one LLR datasource per global transaction. The LLR datasource connection is automatically selected for logging the global transaction outcome instead of the local Persistent Store. A successful commit of the LLR datasource commits the global transaction. Saving one phase on a remote datasource often more than doubles the transaction throughput. This feature is widely used by our customers in mission-critical applications with high performance requirements and has been validated by deploying it in recognized industry benchmarks [6].

7. REFERENCES
[1] Barga, R., Lomet, D., Baby, T., and Agrawal, S.: Persistent Client-Server Database Sessions. In Proceedings of the 7th Int'l Conference on Extending Database Technology (EDBT), Konstanz, Germany, Mar 2000: 462-477.
[2] Gray, J.: Notes on Data Base Operating Systems. In Advanced Course: Operating Systems, Springer, 1978: 393-481.
[3] Lampson, B. and Lomet, D.: A New Presumed Commit Optimization for Two Phase Commit. In Proceedings of the 19th Int'l Conference on Very Large Data Bases, Dublin, Ireland, Aug 1993, Morgan Kaufmann, San Francisco, CA, USA: 630-640.
[4] Mohan, C. and Lindsay, B.: Efficient Commit Protocols for the Tree of Processes Model of Distributed Transactions. In Proceedings of the 2nd ACM SIGACT/SIGOPS Symposium on Principles of Distributed Computing, Montreal, Quebec, Canada, Aug 1983: 76-88.
[5] Mohan, C., Lindsay, B., and Obermarck, R.: Transaction Management in the R* Distributed Database Management System. ACM Transactions on Database Systems, 11(4), Dec 1986: 378-396.
[6] Oracle Corp.: Oracle Benchmark Results. http://www.oracle.com/solutions/performance_scalability/benchmark_results.html
[7] Oracle Corp.: Oracle Fusion Middleware. http://www.oracle.com/products/middleware/index.html
[8] Oracle Corp.: Programming WebLogic JTA: Logging Last Resource Transaction Optimization. http://download.oracle.com/docs/cd/E12840_01/wls/docs103/jta/llr.html
[9] SPEC: SPECjAppServer2004. http://www.spec.org/jAppServer2004/
[10] Sun Microsystems: Java EE at a Glance. http://java.sun.com/javaee/
[11] Sun Microsystems: Java Transaction API Specification 1.0.1. http://java.sun.com/javaee/technologies/jta/
[12] The Open Group: Distributed Transaction Processing: The XA Specification. X/Open Company Ltd., UK, 1991.
[13] Weikum, G. and Vossen, G.: Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery. Morgan Kaufmann, San Francisco, CA, USA, 2001.