Formal Verification of Web Service Interaction Contracts

  • We use the state-and-activity chart language to formally specify the interaction contracts. This language is provided by Statemate, a leading tool for the specification of reactive systems. The specification process begins with an activity chart providing the functional view of the system. Internal activities are represented by solid-line boxes. Dashed-line boxes specify external activities: an execution environment and external applications. The arrows represent the data flow. Labels indicate which data or events are concerned. In this concrete scenario we specify an activity ensuring that a message is passed from one CIC component to another according to the CIC rules in a failure-prone environment that nondeterministically supplies failure events (crashes and link outages). All the application needs to know is that it should activate the "sender trigger" and await an occurrence of the event "message processed". This is important, please remember it. The system administrator specifies the timeout values suitable for the given application along with some other options. The manager may stop the specification process at this stage. Activities are hierarchical and allow for step-wise refinement. The next employee will say that the behavior of the cic activity is actually controlled by a so-called control activity cic_sc (sc stands for statechart), depicted as a green rounded box, and that it has two further sub-activities, cic_sender and cic_receiver, which exchange the messages and notifications as I have described informally before. The behaviors of these sub-activities are defined by the corresponding control activities.
  • The CIC can be informally described as follows. By sending a message to a different component, the CIC sender commits its state. Usually, it forces the log to disk to make its state and the message recoverable. The sender deterministically tags its message with a unique id, a message sequence number (MSN). The sender keeps sending the message periodically until it gets a stable notification from the receiver. It keeps the message because the receiver may request it again after a failure. The sender is released from all of its obligations when it gets an installed notification from the receiver. The CIC receiver eliminates message duplicates based on the MSN. It persists an interaction before sending a stable notification to the sender. Normally this is done by logging the message header and forcing the log. After a failure, the receiver requests the original message from the sender when its log contains only the message header. The receiver ensures its autonomous recovery by forcing the complete message to disk or creating an installation point before sending an installed notification to the sender.
  • In the end, we learned that we need to compromise between the realism of the models and their verifiability. A web service model using integer expressions to generate timeouts periodically, as would happen in a real system, could not be verified. We succeeded after replacing the integer-based timeouts with nondeterministic 1-bit timeouts, which is a more general case. However, no engineering tricks helped to obtain any results for a multi-user model or for the liveness of the single-user model.
  • We performed measurements to evaluate the overhead of the interaction contracts in a 3-tier application structured like an eBay-like auction service. The front-end server manages private user settings that are accessed simultaneously without contention. The backend server manages the current highest bids for auction items, which are accessed concurrently. The load was generated by a synthetic load generator, Apache JMeter, from 5 different machines.
  • The run-time overhead of EOS-PHP is on average about 100% in terms of both the elapsed time and the frontend CPU time; the backend CPU overhead is 33% to 44%. At this price we support failure masking, which radically simplifies the development process and provides a correct and highly available service to customers.
  • I implemented the committed and external interaction contracts for PHP-based Web services. PHP is a scripting language that is embedded into ordinary HTML pages. PHP is interpreted by the Zend engine, which has a great variety of modules extending the capabilities of the PHP language. With PHP we can manage the application state across multiple HTTP requests using the Session module. There are a number of options for invoking remote Web services to build a complex multi-tier application. In my work I concentrated on the CURL module. A reply message of a PHP script is normally an HTML page that is displayed by the browser.
  • Our prototype implements exactly-once semantics. It delivers the recovery guarantees to the end user by implementing the external and the committed interaction contracts for Internet Explorer. On the PHP side we can recover concurrent requests accessing shared objects. We can recover calls to the nondeterministic functions time, curl_exec, and the random number generator rand. We really do support n tiers for any n with any fanout in the call structure. We have enhanced the performance of the original PHP implementation with regard to disk I/O and refined the concurrency control with shared/exclusive latches. For instance, it is now possible to access the session data read-only.
  • Before we start with the verification of the ICs we need some additional definitions. A finite-state computational system, e.g. a Statemate specification, can be represented as a Kripke structure. It contains a finite state transition graph whose nodes are labeled with the atomic propositions that hold in that node. In a software system these atomic propositions would refer to individual memory bits. If we unwind the state transition diagram we obtain a computation tree with potentially infinite branches.
  • A computation tree over the set of atomic propositions P can be characterized by the temporal logic called CTL. Its syntax is inductively defined as shown on this slide. The temporal aspects of the execution paths originating in a given state can be characterized by the path quantifiers Exists and All combined with the temporal modalities Next and Until, Finally, and Globally. The modality Finally means that some property holds eventually. Globally means that a property holds in every state of a path.
  • Explicit model checking is a rather simple recursive algorithm with quadratic run-time. There are heuristic solutions using ordered binary decision diagrams, as in Statemate's symbolic model checker. Other model checkers use SAT solvers.
  • To provide recovery guarantees, all Pcoms such as client and server components need to be equipped with logging and recovery capabilities. Unlike database systems, we neither want nor need to enable undo. Components are piecewise deterministic: they execute deterministically between two consecutive nondeterministic events such as incoming messages from other components or reading the system clock. So, logging of nondeterministic events turns piecewise-deterministic components into truly deterministic ones. We can recreate a Pcom's state and messages by simply replaying the log from some initial state. To accelerate the deterministic replay, the component needs to truncate the log on a regular basis. Before doing this it has to dump its current state to disk. We call such state dumps "installation points". Our failure model includes crashes of the sending and receiving components as well as network failures causing message losses. Such transient failures are due to nondeterministic so-called Heisenbugs, which are hard to reproduce and therefore hard to eliminate. We do not consider malicious manipulations, called commission failures. And we do not deal with the corruption of stable storage, as this can be avoided by sufficient replication.

Transcript

  • 1.
      • SCC WIP Session 3, Honolulu, HI, USA, July 9, 2008
      • German Shegalov (ex-MPII, Oracle, USA)
      • Gerhard Weikum (MPI Informatik, Germany)
    Formal Verification of Web Service Interaction Contracts funded by
  • 2. E-Business Scenario
    • Error page: "Your server command (process id #20) has been terminated. Re-run your command (severity 13) in /opt/www/your-reliable-eshop.biz/mb_1300_db.mb1"
    • place your order!
  • 3. Problem Statement
    • Non-idempotence (Math 1.0)
      • f^n(x) ≠ f(x), n > 1
    • Non-idempotence (Web 2.0, ERP, etc.)
      • "Request timeout" ≠ "request failure"
      • "Request send" ≠ "request resend"
      • Anecdotal evidence: "Don't click more than once!"
        • 8 health insurance IDs for a 3-member family
        • Order one, get many ... pay for many
  • 4. Transaction recovery is idempotent. However, …
    • [Sequence diagram along a timeline between Web Client, Web Application Server, and Database Server]
    • Message labels: Purchase Request, Order Confirmation, ACK, Start Transaction, SQL Request, SQL Response, SQL Request, SQL Response, Commit Transaction, ACK, Transaction Restart, Purchase Request Resubmission
    • Non-idempotent execution!
  • 5. Real-World n-Tier Application
    • [Figure: Client, Web Server, Expedia App Server, Sabre App Server, Amadeus App Server (Expedia, Sabre Server, Amadeus), DB 1, DB 2, DB 3, DB 4]
  • 6. IC Framework
    • Components and Guarantees
      • Persistent (Pcom): Persistent, testable state & messages
      • External (Xcom) (e.g., humans): No recovery
      • Transactional (Tcom): Persistence and testability on commit
    • Interaction Contracts
      • Xcom & Pcom = External IC (XIC)
      • Pcom & Pcom = Committed IC (CIC)
      • Tcom & Pcom = Transacted IC (TIC)
    • Failure model: transient failures, e.g., Heisenbugs
    • Exactly-Once Semantics
      • Forget rollbacks: exactly-once execution is guaranteed
  • 7. Pcom Design
    • Redo Log & Recovery Managers
    • Piecewise determinism + Logging = Full Determinism
    • Unique message id for duplicate elimination
    • Deterministic replay recovers a Pcom's state and messages
    • Installation Points speed up replay
    • [Figure: PCom1 and PCom2]
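The equation "piecewise determinism + logging = full determinism" can be illustrated with a minimal Python sketch. It is not the EOS implementation: the class `Pcom`, its in-memory list standing in for the redo log, and the toy state update are all assumptions made for illustration. Incoming messages are assumed to be re-delivered during recovery; only local nondeterministic values (a random number and a clock reading) are logged.

```python
import random
import time


class Pcom:
    """Toy persistent component (hypothetical): logs nondeterministic values for replay."""

    def __init__(self, log=None):
        self.replaying = log is not None  # True while recovering from an existing log
        self.log = list(log or [])        # in-memory stand-in for the forced redo log
        self.cursor = 0                   # next log entry to consume during replay
        self.state = 0
        self.last_stamp = None

    def nondeterministic(self, produce):
        """During replay return the logged value; otherwise produce and log a fresh one."""
        if self.replaying and self.cursor < len(self.log):
            value = self.log[self.cursor]
        else:
            self.replaying = False
            value = produce()
            self.log.append(value)
        self.cursor += 1
        return value

    def handle(self, message):
        """Deterministic given the message and the logged nondeterministic values."""
        noise = self.nondeterministic(lambda: random.randint(0, 9))
        self.last_stamp = self.nondeterministic(time.time)
        self.state = (self.state * 31 + len(message) + noise) % 1_000_003
        return self.state


# Normal operation, then "crash" and recover by replaying the same messages.
original = Pcom()
for msg in ["a", "bb", "ccc"]:
    original.handle(msg)

recovered = Pcom(log=original.log)
for msg in ["a", "bb", "ccc"]:
    recovered.handle(msg)
```

After the replay, `recovered` holds the same state as `original`. An installation point would correspond to saving `state` and truncating `log`, which this sketch leaves out.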
  • 8. Committed IC Sender
    • [Statechart CIC_SNDR_SC; the source and target states of the transitions are not recoverable from the transcript]
    • States: SENDING, STABLE_S, INSTALLED_S, RECOVERY, MSG_LOOKUP, PREPARE_PERSISTENCE
    • Transition labels:
      • SNDR_TRIGGER [SNDR_LAST_LOGGED=='']/ SNDR_ND
      • SNDR_ND/ SEND_MSG
      • SNDR_MSG_TM and not (STABLE_OK or INSTALLED_OK)/ SEND_MSG
      • MSG_RECOVERED_TM/ SEND_MSG
      • GET_MSG_OK
      • STABLE_OK
      • SNDR_STABLE_TM and not (INSTALLED_OK or GET_MSG_OK)/ IS_INSTALLED
      • INSTALLED_OK/ SNDR_LAST_LOGGED:='INSTALLED'
      • [SNDR_LAST_LOGGED=='INSTALLED']
      • SNDR_CRASH
    • * EVENT_OK = EVENT ∧ ¬LINK_OUTAGE; _TM means TIMEOUT
  • 9. Committed IC Receiver
    • [Statechart CIC_RCVR_SC; the source and target states of the transitions are not recoverable from the transcript]
    • States: MSG_RECEIVED, MSG_PROCESSED, STABLE_R, INSTALLED_R, RECOVERY, MSG_RECOVERY
    • Transition labels:
      • SEND_MSG_OK [RCVR_LAST_LOGGED=='']
      • ( RCVR_STABLE_TM or RCVR_ND [MSG_ORDER_MATTERS] ) [not ICIC and RCVR_LAST_LOGGED=='']/ RCVR_LAST_LOGGED:='STABLE'; STABLE
      • [ICIC]/ RCVR_LAST_LOGGED:='INSTALLED'; INSTALLED
      • MSG_EXEC_TM/ RECEIVED
      • RCVR_INSTALL_TM/ RCVR_LAST_LOGGED:='INSTALLED'; INSTALLED
      • SEND_MSG or IS_INSTALLED/ STABLE
      • SEND_MSG or IS_INSTALLED/ INSTALLED
      • [RCVR_LAST_LOGGED=='INSTALLED']
      • [RCVR_LAST_LOGGED=='STABLE']/ GET_MSG
      • not SEND_MSG_OK and GET_MSG_TM/ GET_MSG
      • SEND_MSG_OK
      • RCVR_CRASH
    • * EVENT_OK = EVENT ∧ ¬LINK_OUTAGE; _TM means TIMEOUT
  • 10. CIC Verification
    • Safety: a value is logged at most once
      • For all log values v ∈ { 'stable', 'installed' }
      • AG( written(log) ∧ log = v → AX AG ¬( written(log) ∧ log = v ) )
    • Liveness: CIC terminates
      • for timeouts < 30 steps
      • F<n: eventually, after at most n steps
      • AF<500 AG ¬failures → AF<700 CIC installed
    • Together: exactly once!
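The safety property can be made concrete on a single execution path. The following Python sketch checks "each value is logged at most once" for one trace; it is only an illustration, since the model checker establishes the property over all paths of the computation tree. The trace encoding as `(written, log)` pairs and the function name are assumptions.

```python
def logged_at_most_once(trace, values=("stable", "installed")):
    """Check one path: no value v is written to the log in two different steps.

    trace is a sequence of (written, log) pairs, one per step, where `written`
    says whether the log was written in that step and `log` is its value.
    """
    seen = set()
    for written, log in trace:
        if written and log in values:
            if log in seen:       # v was already written earlier: violation
                return False
            seen.add(log)
    return True


good = [(False, ""), (True, "stable"), (False, "stable"), (True, "installed")]
bad = [(True, "stable"), (False, "stable"), (True, "stable")]
```

Here `good` satisfies the property, while `bad` writes 'stable' twice and violates it.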
  • 11. IC's & Web Service
    • The web server's reply commits the app servers' reply order
    • AG( websrvr_rep:send_msg → ⋀_{i=1,2} ( appsrvr_i:rcvr_log = 'stable' ∨ appsrvr_i:rcvr_log = 'installed' ) )
    • [Activity chart] Activities: CUSTOMER, USER1_REQ @USER1_SC, BROWSER_INPUT <XIC_I_AC, BROWSER_OUTPUT <XIC_O_AC, WEBSRVR_REQ <CIC_AC, WEBSRVR_REP <CIC_AC, APPSRVR1_REQ <CIC_AC, APPSRVR1_REP <CIC_AC, APPSRVR2_REQ <CIC_AC, APPSRVR2_REP <CIC_AC, XACT_UPDATE <TIC_AC
    • Events: HTML_PROMPT, BUTTON_CLICKED, CLICK_CAPTURED, HTML_REPLY, WEBSRVR_REQ_RCVD, APPSRVR1_REQ_RCVD, APPSRVR2_REQ_RCVD, APPSRVR1_REP_RCVD, APPSRVR2_REP_RCVD, WEBSRVR_REP_RCVD, XACT_COMMITTED
    • LOCAL_FAILURES: BROWSER_CRASH, XACT_{USER, INTERNAL}_ABORT, BROWSER_WEBSRVR_LINK_OUTAGE
    • GLOBAL_FAILURES: WEBSERVER_CRASH, APPSERVER{1;2}_CRASH, DBSRVR_CRASH, WEB_APP{1,2}_LINK_OUTAGE, APP1_DB_LINK_OUTAGE
  • 12. Summary
    • Generic IC framework specification
      • STATEMATE: Statecharts
    • Formal verification at IC and app level
      • STATEMATE: Model Checking
    • IC implementation for PHP & Internet Explorer
      • EOS
    • Rigorous recovery guarantees based on the formally verified models
  • 13. EOS Demo
    • [Figure: USER 1, Frontend Server, Backend Server, connected via B2C_LINK and B2B_LINK]
  • 14. Thank You!
      • German Shegalov <german.shegalov@acm.org>
      • Gerhard Weikum <weikum@mpi-inf.mpg.de>
    ?
  • 15. Transaction Recovery
    • At-most-once semantics
    • Recovery: Redo All, Undo Uncommitted
      • LSN ≤ PageLSN: skip redo
      • LSN > PageLSN: skip undo
    • Example: transfer €100 from account 1 to account 2
    • BEGIN TRANSACTION
    • /* LSN = 1: log undo and redo */
      • UPDATE Accounts SET balance = balance - 100 WHERE Number = 1
    • /* LSN = 2: log undo and redo */
      • UPDATE Accounts SET balance = balance + 100 WHERE Number = 2
    • /* LSN = 3: log commit; force to disk (~10^5 times slower) */
    • COMMIT TRANSACTION
    • Accounts before (LSN=0): Number 1, Balance 1000; Number 2, Balance 2000
    • Accounts after (LSN=3): Number 1, Balance 900; Number 2, Balance 2100
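Why redo is idempotent can be seen in a small Python sketch of the PageLSN test. It is a simplification under stated assumptions: each account lives on its own page, log records carry a plain balance delta, and the names `Page`, `LogRecord`, and `redo` are hypothetical. Undo and the commit record are left out.

```python
from dataclasses import dataclass


@dataclass
class Page:
    balance: int
    page_lsn: int = 0   # LSN of the last update applied to this page


@dataclass(frozen=True)
class LogRecord:
    lsn: int
    account: int
    delta: int


def redo(pages, log):
    """Replay the log; updates already reflected on a page are skipped."""
    for rec in log:
        page = pages[rec.account]
        if rec.lsn <= page.page_lsn:
            continue                 # page already contains this update: skip redo
        page.balance += rec.delta
        page.page_lsn = rec.lsn      # remember that the update has been applied


pages = {1: Page(balance=1000), 2: Page(balance=2000)}
log = [LogRecord(lsn=1, account=1, delta=-100),
       LogRecord(lsn=2, account=2, delta=+100)]

redo(pages, log)
redo(pages, log)   # a second recovery pass changes nothing
```

Running recovery twice leaves the balances at 900 and 2100, which is the sense in which transaction recovery is idempotent.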
  • 16. Statecharts [Harel'87, UML'97]
    • Step-wise refinement
    • [Figure: statechart with states INIT, S1, S2, S3, END and transitions labeled E[C]/A, E23/A23, [OK], [!OK]]
  • 17. 2PC Message Sequence
    • [Sequence diagram along a timeline between Coordinator and DB i]
    • Labels: force-log begin, prepare, force-log prepared, yes, force-log commit, commit, force-log commit, ack, force-log end
  • 18. PA-2PC Coordinator
  • 19. PA-2PC Cohort
  • 20. External IC
  • 21. Committed IC Monitor
    • Statechart = Behavioral View
      • Finite State Automaton (FSA) +
      • Nesting + Orthogonal substates +
      • E[C]/A transitions: on Event while Condition
        • Leave source, enter target, execute Action
        • E.g., A = E' means generate event E'
      • Configuration = set of entered states
      • Execution context = variable valuation
        • Step i: conf_i × ctxt_i → conf_i+1 × ctxt_i+1
    • [Statechart CIC_SC with substates SNDR_S and RCVR_S (SENDING, RECEIVING); transitions: (not SNDR_CRASH) [not active(CIC_SNDR_AC)]/ start!(CIC_SNDR_AC) and (not RCVR_CRASH) [not active(CIC_RCVR_AC)]/ start!(CIC_RCVR_AC)]
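The step semantics of E[C]/A transitions can be sketched in a few lines of Python. This is a flat simplification that ignores nesting, orthogonal substates, and conflicting transitions; it is not the Statemate semantics in full. The function `step`, the tuple layout, and the source and target states chosen for the two sample transitions are assumptions, since the transcript does not preserve which states the sender's transitions connect.

```python
def step(conf, ctxt, events, transitions):
    """One step: (conf, ctxt) -> (conf', ctxt', generated events).

    Each transition is (source, event, condition, action, target). It fires if
    its source is in the configuration, its event occurred, and its condition
    holds in the current context.
    """
    new_conf, new_ctxt, generated = set(conf), dict(ctxt), set()
    for source, event, condition, action, target in transitions:
        if source in conf and event in events and condition(ctxt):
            new_conf.discard(source)        # leave source
            new_conf.add(target)            # enter target
            generated |= action(new_ctxt)   # execute action, may emit events
    return new_conf, new_ctxt, generated


def always(ctxt):
    return True


def no_action(ctxt):
    return set()


def log_installed(ctxt):
    ctxt["SNDR_LAST_LOGGED"] = "INSTALLED"
    return set()


# Assumed wiring of two sender transitions, for illustration only.
TRANSITIONS = [
    ("SENDING", "STABLE_OK", always, no_action, "STABLE_S"),
    ("STABLE_S", "INSTALLED_OK", always, log_installed, "INSTALLED_S"),
]

conf, ctxt, _ = step({"SENDING"}, {"SNDR_LAST_LOGGED": ""}, {"STABLE_OK"}, TRANSITIONS)
conf, ctxt, _ = step(conf, ctxt, {"INSTALLED_OK"}, TRANSITIONS)
```

After the two steps the configuration is `{"INSTALLED_S"}` and the context records `SNDR_LAST_LOGGED` as `'INSTALLED'`.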
  • 22. Committed IC Activities
    • Activitychart = Functional View
    • [Activity chart CIC_AC, controlled by @CIC_SC: sub-activities CIC_SNDR_AC (@CIC_SNDR_SC) and CIC_RCVR_AC (@CIC_RCVR_SC) exchange SEND_MSG, GET_MSG, STABLE, INSTALLED; FAILURE_PRONE_ENVIRONMENT supplies SNDR_CRASH, RCVR_CRASH, LINK_OUTAGE; EXTERNAL_APP_LOGIC provides SNDR_TRIGGER and awaits MSG_PROCESSED; SYSTEM_ADMINISTRATOR sets TIMEOUTS and ICIC]
  • 23. CIC's Informal Design
    • CIC sender (Pcom) obligations
      • Persist state before send
      • Tag message with a MSN
      • Resend on timeout until stable ack
      • Resend on receiver's "get msg"
      • Forget interaction on installed ack
    • CIC receiver (Pcom) obligations
      • Eliminates duplicates using MSNs
      • Persists interaction before stable ack
      • "gets msg" if msg is not in log after failure
      • Ensures autonomous recovery before installed ack
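The obligations listed above can be sketched as two small Python classes. This is a toy model, not the verified specification: dictionaries stand in for forced logs, timeouts and crashes are not modeled, and all class and method names are assumptions.

```python
class Sender:
    """CIC sender sketch: keeps each message until the installed ack arrives."""

    def __init__(self):
        self.next_msn = 0
        self.pending = {}                 # msn -> payload (stand-in for the forced log)

    def send(self, payload):
        msn = self.next_msn               # deterministic message sequence number
        self.next_msn += 1
        self.pending[msn] = payload       # persist before sending
        return (msn, payload)

    def resend(self, msn):
        """Used on timeout or on the receiver's 'get msg' request."""
        return (msn, self.pending[msn])

    def on_installed(self, msn):
        self.pending.pop(msn, None)       # released from all obligations


class Receiver:
    """CIC receiver sketch: eliminates duplicates by MSN."""

    def __init__(self):
        self.log = {}                     # msn -> 'stable' or 'installed'
        self.processed = []

    def receive(self, message):
        msn, payload = message
        if msn in self.log:
            return self.log[msn]          # duplicate: acknowledge again, do not re-execute
        self.log[msn] = "stable"          # persist the interaction before the stable ack
        self.processed.append(payload)
        return "stable"

    def install(self, msn):
        self.log[msn] = "installed"       # e.g. after an installation point
        return "installed"


sender, receiver = Sender(), Receiver()
msg = sender.send("bid 42")
receiver.receive(msg)
receiver.receive(sender.resend(0))        # resend after a lost stable ack
if receiver.install(0) == "installed":
    sender.on_installed(0)
```

Although the message is delivered twice, `receiver.processed` contains the payload once, and the sender forgets the interaction after the installed ack.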
  • 24. Verification Run-Times
    • Columns: property, specification type, OBDD size, verification time
    • IC-level safety
      • Integer timeout: OBDD size ~10^4, ~5 seconds
      • Nondeterministic timeout: OBDD size ~10^3, ~1 second
    • IC-level liveness
      • Integer timeout: OBDD size ~10^6, ~10 hours
      • Nondeterministic timeout: OBDD size ~10^5, ~10 hours
    • 1-user WS safety
      • Integer timeout: OBDD size ~10^7, not terminated
      • Nondeterministic timeout: OBDD size ~10^6, ~10 hours
  • 25. Experiment Setup
    • [Figure: Web Client, Frontend Server (P4 3 GHz, 1 GB), Backend Server (P4 3 GHz, 1 GB); requests POST (ICIC) action=increment and POST (ICIC) action=increment b2b=true; shared count 1234 → 1235 with reply 1235; private count 2 → 3; HTML reply showing "Private Count: 3" and "Shared Count: 1235"]
    • eBay-like auction service
    • User settings at frontend (private)
    • Auction items at backend (shared)
    • 5 concurrent end users, synthetic load
  • 26. Run-Time Overhead
    • Values per session of 1 step / 5 steps / 10 steps
    • PHP elapsed time [sec]: 0.1560 / 0.7900 / 1.6100
    • EOS-PHP elapsed time [sec]: 0.3140 / 1.6850 / 3.1000
    • Overhead (elapsed time) [%]: 101% / 113% / 93%
    • PHP frontend CPU time [sec]: 0.0390 / 0.2708 / 0.5727
    • EOS-PHP frontend CPU time [sec]: 0.0815 / 0.6000 / 1.1545
    • Overhead (frontend CPU) [%]: 109% / 122% / 102%
    • PHP backend CPU time [sec]: 0.0090 / 0.0550 / 0.1200
    • EOS-PHP backend CPU time [sec]: 0.0130 / 0.0750 / 0.1600
    • Overhead (backend CPU) [%]: 44% / 36% / 33%
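The overhead percentages on this slide follow from the timing rows as (EOS-PHP − PHP) / PHP. A short Python check, using the elapsed-time figures from the slide (the function name is an assumption):

```python
def overhead_percent(baseline, measured):
    """Relative overhead of `measured` over `baseline`, rounded to whole percent."""
    return round((measured - baseline) / baseline * 100)


# (PHP, EOS-PHP) elapsed times in seconds for sessions of 1, 5, and 10 steps
elapsed = [(0.1560, 0.3140), (0.7900, 1.6850), (1.6100, 3.1000)]
elapsed_overhead = [overhead_percent(php, eos) for php, eos in elapsed]
print(elapsed_overhead)  # [101, 113, 93]
```

The same formula reproduces the frontend and backend CPU rows, for example (0.0130 − 0.0090) / 0.0090 ≈ 44% for the backend at 1 step.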
  • 27. PHP and Zend Engine
    • [Figure: Web Clients and three servers, each running the Zend Engine with the Session and CURL modules]
    • <html>
    • <?php
    • session_start();
    • // count the calls in this session ($_SESSION replaces the removed $HTTP_SESSION_VARS)
    • $_SESSION["count"] = ($_SESSION["count"] ?? 0) + 1;
    • printf("Script called %d times", $_SESSION["count"]);
    • $ch = curl_init("http://eos-php.net/b2b.php");
    • curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the reply instead of printing it
    • $b2b_reply = curl_exec($ch);
    • printf("Other server reports: %s ", $b2b_reply);
    • curl_close($ch);
    • ?>
    • </html>
    • <html>
      • Script called 5 times
      • Other server reports: Script called 1000 times
    • </html>
  • 28. EOS
    • Exactly-once semantics with
      • Transparent browser recovery
      • Concurrent accesses to shared data
      • Nondeterministic functions: time, curl_exec, rand
      • Any n in n-tier, any fanout
      • Failure masking: no changes to application code, neither to the PHP scripts nor to the browser
    • Performance enhancements (side effects)
      • Log structured data access (sequential I/O)
      • LRU buffers for state and log data
      • Latches (Shared/Exclusive)
      • session_start(bool $read_only)
  • 29. Transacted IC Activities
    • Activitychart = Functional View
    • [Activity chart TIC_AC, controlled by @TIC_SC: sub-activities XACT_CLIENT_AC (@XACT_CLIENT_SC) and XACT_SERVER_AC (@XACT_SERVER_SC); data and events SQL_REQ, SQL_REP, COMMIT, COMMITTED, USER_ABORT, ABORTED; FAILURE_PRONE_ENVIRONMENT with XACT_CLIENT_CRASH, XACT_SERVER_CRASH, LINK_OUTAGE; EXTERNAL_APP_LOGIC with XACT_TRIGGER, XACT_COMMITTED, XACT_ABORTED; SYSTEM_ADMINISTRATOR with TIMEOUTS]
  • 30. Transactional IC Server
  • 31. Transactional IC Client
  • 32. Execution Abstraction
    • Kripke structure K = (S, R, L) over P
      • P is a finite set of atomic propositions
      • Software: P is the union of all memory bits
      • S is a finite set of states
      • R ⊆ S × S: state transitions
      • L: S × P → {true, false}: valuation
      • Nondeterminism vs. determinism: computation tree vs. sequence
    • [Figure: state transition graph with nodes labeled by propositions p, q ∈ P]
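A Kripke structure and its unwinding into a computation tree can be written down directly. The Python sketch below uses a made-up three-state example; the class name, the state names `s0` to `s2`, and the labeling are assumptions, not taken from the slide's figure.

```python
from dataclasses import dataclass


@dataclass
class Kripke:
    states: set        # S
    transitions: set   # R, a set of (s, r) pairs
    labels: dict       # L, given as state -> set of propositions that hold there

    def successors(self, s):
        return sorted(r for (q, r) in self.transitions if q == s)

    def holds(self, s, p):
        """Valuation L(s, p)."""
        return p in self.labels[s]


def paths(k, start, depth):
    """Unwind the transition graph: all paths with `depth` transitions from `start`."""
    if depth == 0:
        return [[start]]
    return [[start] + rest
            for r in k.successors(start)
            for rest in paths(k, r, depth - 1)]


k = Kripke(
    states={"s0", "s1", "s2"},
    transitions={("s0", "s1"), ("s0", "s2"), ("s1", "s0"), ("s2", "s2")},
    labels={"s0": {"p", "q"}, "s1": {"p"}, "s2": {"q"}},
)
tree_level_2 = paths(k, "s0", 2)
```

State `s0` has two successors, so the unwinding branches there; with a deterministic system every state would have one successor and the tree would collapse into a single sequence.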
  • 33. Computation Tree Logic
    • Basic Syntax
      • Atomic propositions: P ⊆ CTL(P)
      • If p, q ∈ CTL(P), then so are
        • Propositional logic formulas (¬p, p ∨ q, etc.)
        • Path quantifiers Exists, All + modalities neXt, Until
        • EX p
        • { E, A } ( p U q )
    • Derived Syntax
        • AX p ≡ ¬(EX ¬p)
        • AF p (Finally) ≡ A(true U p)
        • EF p ≡ E(true U p)
        • AG p (Globally) ≡ ¬(E(true U ¬p))
        • EG p ≡ ¬(A(true U ¬p))
  • 34. Explicit Model Checking
    • For K = (S, R, L) over P, s ∈ S, f ∈ CTL(P)
      • s ⊨ f, f ∈ P ⇔ L(s, f) = true
      • s ⊨ f, f = ¬f1 ⇔ s ⊭ f1
      • s ⊨ f, f = f1 ∨ f2 ⇔ s ⊨ f1 or s ⊨ f2
      • s ⊨ f, f = EX f1 ⇔ ∃(s, r) ∈ R with r ⊨ f1
      • s ⊨ f, f = E(f1 U f2)
        • if s is already checked then false, else mark s as checked
        • if s ⊨ f2 then true
        • if s ⊨ f1 and ∃(s, r) ∈ R with r ⊨ f then true
      • s ⊨ f, f = A(f1 U f2)
        • if s is already checked then false, else mark s as checked
        • if s ⊨ f2 then true
        • if s ⊨ f1 and ∀(s, r) ∈ R: r ⊨ f then true
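An explicit-state CTL checker fits in a short Python function. Note the swapped-in technique: instead of the recursive marking scheme on the slide, this sketch uses the standard fixpoint (labeling) formulation, which computes the set of all states satisfying a formula. Formulas are nested tuples; the encoding, the function name `sat`, and the example structure are assumptions, and the transition relation is assumed to be total.

```python
def sat(states, transitions, labels, f):
    """Return the set of states satisfying the CTL formula f."""
    succ = {s: {r for (q, r) in transitions if q == s} for s in states}

    def rec(g):
        return sat(states, transitions, labels, g)

    op = f[0]
    if op == "true":
        return set(states)
    if op == "ap":                       # atomic proposition
        return {s for s in states if f[1] in labels[s]}
    if op == "not":
        return set(states) - rec(f[1])
    if op == "or":
        return rec(f[1]) | rec(f[2])
    if op == "EX":
        target = rec(f[1])
        return {s for s in states if succ[s] & target}
    if op in ("EU", "AU"):               # least fixpoint, starting from the f2-states
        hold1, result = rec(f[1]), rec(f[2])
        changed = True
        while changed:
            changed = False
            for s in hold1 - result:
                if op == "EU":
                    ok = bool(succ[s] & result)                 # some successor
                else:
                    ok = bool(succ[s]) and succ[s] <= result    # all successors
                if ok:
                    result.add(s)
                    changed = True
        return result
    raise ValueError(f"unknown operator: {op}")


S = {"s0", "s1", "s2"}
R = {("s0", "s1"), ("s0", "s2"), ("s1", "s0"), ("s2", "s2")}
L = {"s0": {"p", "q"}, "s1": {"p"}, "s2": {"q"}}

p, q = ("ap", "p"), ("ap", "q")
# AG q, written via the derived syntax as not E(true U not q)
ag_q = sat(S, R, L, ("not", ("EU", ("true",), ("not", q))))
```

In this example only `s2` satisfies AG q, because every other state can reach `s1`, where q does not hold.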
  • 35. TIC Verification
    • At-Most-Once (Safety): AG( server_last_logged = 'committed' → AG(¬any(sql_req)) )
    • At-Least-Once (Liveness): AF<500 (AG ¬(failures)) → AF<700 ( AG( client_last_logged = 'committed' ∧ srvr_last_logged = 'committed' ) )
    • Consequence: Exactly Once
  • 36. TIC Design
    • Tcom
      • Traditional Redo & Undo Log
      • Faithful Reply
        • Persists commit state
        • Persists commit reply message
        • Resends commit reply on a second request
        • No commit reply logged ->aborted
      • Commit request duplicate elimination.
    • Pcom
      • Log-forcing before commit
      • Periodically resends commit request