CTL Model Checking in Database Cloud







German Shegalov
Oracle Corp., 500 Oracle Parkway, Redwood Shores, CA 94065, USA
german.shegalov@oracle.com

Abstract: Modern software systems such as OS, RDBMS, JVM, etc., have reached enormous complexity by any metric, ranging from the number of lines of code to the program state explosion due to concurrency. Standard quality assurance methods do not yield strong correctness guarantees because even 100% code coverage, while a desirable metric, is not equivalent to state/execution path coverage. Whereas model checking provides rigorous correctness proofs, its computational complexity is often prohibitive for real-world systems. With advances in distributed computing frameworks such as MapReduce, and the affordability of large computer clusters (e.g., offered as an on-demand Cloud service), steadily larger systems can be verified using model checking. In this paper, we envision database vendors competing to achieve the highest possible degree of verification using massive scalability features. To this end, we show a way of implementing a CTL model checker as an SQL application that a database system will "tune" for the cloud.

I. INTRODUCTION

In this paper, we demonstrate a relatively simple way to turn a relational database system into a powerful verification tool. We show only a very basic technique of implementing a model checker inside a database system. The idea is to encourage database vendors to compete not only on performance-oriented benchmarks but on the merits of objective software quality as well. An interesting side effect is that the database system will be able to verify its own concurrency control and recovery protocols.

Several steps are required on the way towards software verification. First, the source code has to be converted into some abstract state transition model using, e.g., the ETL functionality. This is already a complex problem because finite abstractions for things like recursion and heap allocations need to be found. In this paper, we assume that a technology such as the one in the Spin model checker is used to this end. Then each component architect formulates safety and liveness properties in temporal logics that can be verified by the system as the final step. These tasks by themselves should already embody a substantial stress test of the database system itself. Often, the source code is generated from state diagrams for protocols, or from grammars, and these specifications can be used directly instead.

In this paper, we focus on implementing a Model Checking Engine as an SQL application. The idea here is to use SQL as a scalability vehicle for massively parallel model checking. Dialects of SQL, the lingua franca of most relational databases, already form a very powerful language that has found interesting uses beyond the traditional OLTP and OLAP scopes, e.g., to solve puzzles [7]. And when SQL's expressiveness is not sufficient, we can resort to an efficient database-system-integrated virtual machine that will close the gap for any Turing-computable problem. The task of a model checker is to verify whether the system under test satisfies a property posed as a formula in a temporal logic such as CTL.

II. CTL BACKGROUND

Model checking is a formal method of software/hardware verification: an automated way of providing mathematical proofs [1]. In this paper, we deal with Computation Tree Logic (CTL) [3] model checking.

Along with the traditional Boolean operators, CTL defines the existential path quantifier E and the universal path quantifier A for the paths originating in some state s. Temporal aspects are expressed using the unary modalities neXt (refers to a successor state), Globally (all reachable states satisfy the formula), Finally (a reachable state satisfies the formula), and the binary modality Until (the left-hand formula is valid at least until a state is reached where the right-hand formula holds). The unary modalities are usually the most relevant in practice.

The set of CTL formulae over a finite set of atomic propositions P, denoted as CTL(P), is formally defined as follows using structural induction:

    p ∈ P implies p ∈ CTL(P)
    {p, q} ⊆ CTL(P) implies {¬p, p ∧ q, EX p, E (p U q), A (p U q)} ⊆ CTL(P)

Given the basic formulae defined above, the following shorthand syntax is provided as equivalent to formulae in the basic set:

    p ∨ q ≡ ¬(¬p ∧ ¬q)
    AX p ≡ ¬EX ¬p
    AF p ≡ A (true U p)
    EF p ≡ E (true U p)
    AG p ≡ ¬E (true U ¬p)
    EG p ≡ ¬A (true U ¬p)

CTL presumes that a computing system is represented as a Kripke structure K = (S, R, L), where S is the finite set of states, R ⊆ S × S is the state transition relation with (s, t) ∈ R if t is an immediate successor of s, and L: S × P → {true, false} is the valuation function telling which atomic propositions hold in which state. A path is a potentially infinite sequence of successive states. In our toy example of Figure 1, AX P0, EG P0, EF P1, and AG (P0 ∨ P1) are true in S0.

    S0: P0        S1: P0, P1        S2: P1

Fig 1 A sample state-transition diagram with initial state S0 and two atomic formulae P0 and P1.
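To make the CTL definitions of Section II concrete, the following is a minimal sketch in Python (not the paper's SQL) of the explicit, set-based evaluation of the basic formulae and their shorthands on the toy structure of Fig 1. The transitions S0→S1, S1→S0, S1→S2 are read off the transition relation of the paper's relational example; all function and variable names are hypothetical. As an assumption matching the paper's later PL/SQL routine, a state is added to A (p U q) only if it has at least one successor and all of its successors are already in the result.

```python
# Toy Kripke structure of Fig 1 (transitions assumed from the paper's
# transition relation: S0->S1, S1->S0, S1->S2).
STATES = frozenset({0, 1, 2})
R = {(0, 1), (1, 0), (1, 2)}
LABELS = {0: {"P0"}, 1: {"P0", "P1"}, 2: {"P1"}}
TRUE = set(STATES)  # the formula "true" holds in every state


def atom(name):
    """States in which the atomic proposition holds."""
    return {s for s in STATES if name in LABELS[s]}


def neg(x):
    return set(STATES) - x


def conj(x, y):
    return x & y


def ex(x):
    """EX: states with at least one successor in x."""
    return {s for (s, t) in R if t in x}


def eu(p, q):
    """E (p U q): start from q, add p-predecessors until nothing changes."""
    result = set(q)
    while True:
        new = (ex(result) & p) - result
        if not new:
            return result
        result |= new


def au(p, q):
    """A (p U q): add p-states whose successors are all already in result."""
    result = set(q)
    while True:
        new = set()
        for s in p - result:
            succ = {t for (src, t) in R if src == s}
            if succ and succ <= result:
                new.add(s)
        if not new:
            return result
        result |= new


# Shorthands, defined exactly as the equivalences in the text.
def disj(x, y): return neg(conj(neg(x), neg(y)))
def ax(x): return neg(ex(neg(x)))
def ef(x): return eu(TRUE, x)
def ag(x): return neg(eu(TRUE, neg(x)))
def eg(x): return neg(au(TRUE, neg(x)))


P0, P1 = atom("P0"), atom("P1")
checks = {
    "AX P0": ax(P0),
    "EG P0": eg(P0),
    "EF P1": ef(P1),
    "AG (P0 v P1)": ag(disj(P0, P1)),
}
for name, sat in checks.items():
    print(name, "holds in S0:", 0 in sat)
```

Under these assumptions, all four formulae named in the text are reported as holding in S0. Each function returns the full set of satisfying states rather than a single truth value, which is the usual shape of the explicit algorithm.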
III. KRIPKE SCHEMA

In this section we present a way of representing the Kripke structure using database relations. The state transition diagram of Figure 1 translates to the instance of the Kripke schema outlined in Fig 2. As we incorporated ids into the names in this example, we focus solely on the non-trivial relations valuation and transition. A negation not P is implied when the atomic proposition P is not shown in the state and, analogously, when there is no corresponding entry in the valuation relation. The tuple (null, 0) in the transition relation specifies that the state with s_id = 0 is the initial state in this state transition system.

    create table state (
      s_id number primary key,
      s_nm varchar2(10)
    );
    insert into state values(0, 'S_0');
    insert into state values(1, 'S_1');
    insert into state values(2, 'S_2');

    create table atomic (
      a_id number primary key,
      a_nm varchar2(10)
    );
    insert into atomic values(0, 'P_0');
    insert into atomic values(1, 'P_1');

    create table valuation (
      s_id number references state(s_id),
      a_id number references atomic(a_id)
    );
    insert into valuation values(0, 0);
    insert into valuation values(1, 0);
    insert into valuation values(1, 1);
    insert into valuation values(2, 1);

    create table transition (
      src_id number references state(s_id),
      tgt_id number references state(s_id)
    );
    -- (null, 0) marks state 0 as the initial state
    insert into transition values(null, 0);
    insert into transition values(0, 1);
    insert into transition values(1, 0);
    insert into transition values(1, 2);

    valuation
    s_id  a_id
    0     0
    1     0
    1     1
    2     1

    transition
    src_id  tgt_id
    null    0
    0       1
    1       0
    1       2

Fig 2 A sample relational representation of Kripke structure of Fig 1.

IV. MODEL CHECKER AS AN SQL APPLICATION

In this section we translate basic CTL formulae into executable SQL queries as an implementation of the basic explicit model checking algorithm [1]. For a p ∈ CTL(P) and s ∈ S, let sql(state_id, p) denote an SQL statement that returns TRUE for the attribute RESULT if s satisfies p, or FALSE otherwise. The SQL statements given below are written in Oracle 11gR2's SQL [6], as close as possible to ANSI SQL, and are just meant to give the reader a flavor of the idea; we claim neither their particular elegance nor efficiency. A placeholder of the form <sql(state_id, p)> stands for the text of the query constructed for the subformula p.

The algorithm for constructing an SQL representation sql(state_id, p) of CTL is given using structural induction over CTL(P).

A. sql(state_id, FALSE)

This formula cannot be satisfied when the state with state_id exists. The query returns NULL if no such state exists.

    select
      case count(*)
        when 0 then NULL
        else FALSE
      end as result
    from state
    where state.s_id = state_id;

B. sql(state_id, TRUE)

This formula is always satisfied when the state with state_id exists.

    select
      case count(*)
        when 0 then NULL
        else TRUE
      end as result
    from state
    where state.s_id = state_id;

C. sql(state_id, atomic_id)

An atomic propositional formula is satisfied when there is a tuple (state_id, atomic_id) in the relation valuation.

    select
      case count(*)
        when 0 then FALSE  -- no valuation tuple: the proposition does not hold
        else TRUE
      end as result
    from valuation
    where valuation.a_id = atomic_id
      and valuation.s_id = state_id;

D. sql(state_id, ¬p)

The negation of p is satisfied when p is false.

    with subq as (
      <sql(state_id, p)>
    )
    select
      case subq.result
        when TRUE then FALSE
        else TRUE
      end as result
    from subq;

E. sql(state_id, p ∧ q)

The conjunction is satisfied when both p and q are satisfied.

    with subq_p as (
      <sql(state_id, p)>
    ), subq_q as (
      <sql(state_id, q)>
    )
    select
      case count(*)
        when 0 then FALSE
        else TRUE
      end as result
    from subq_p natural join subq_q
    where result = TRUE;

F. sql(state_id, EX p)

This formula is satisfied when the state state_id is in the set of predecessors of states satisfying p.

    select
      case count(*)
        when 0 then FALSE
        else TRUE
      end as result
    from transition t
    where t.src_id = state_id
      and TRUE = (<sql(t.tgt_id, p)>);

G. sql(state_id, E (p U q))

This formula is satisfied when state_id is in the set of states satisfying q, or state_id is reachable through recursive reverse traversal from the set of states already known to satisfy the formula. In each recursive step we add states that satisfy p. Since the state transition diagram may be cyclic, we use the cycle detection clause.

    with subq_EpUq (rs) as (
      -- base case: all states satisfying q
      select s_id as rs
      from state
      where TRUE = (<sql(s_id, q)>)
      union all
      -- recursive step: predecessors that satisfy p
      select t.src_id as rs
      from subq_EpUq
      join transition t
        on (
          subq_EpUq.rs = t.tgt_id
          and TRUE = (<sql(t.src_id, p)>)
        )
    )
    cycle rs set is_cycle to 'y' default 'n'
    select
      case count(*)
        when 0 then FALSE
        else TRUE
      end as result
    from subq_EpUq
    where rs = state_id;

H. sql(state_id, A (p U q))

This formula is computed similarly to the existentially quantified formula above, with the difference that in every recursive step we make sure not to add states that have at least one successor that is not in the result set of the previous step. Hence, each step needs more than one reference to the result set computed in the previous recursion step: one for the predecessor computation and one for the check whether all successors of the predecessor are already in the previous set. Therefore, this formula cannot be computed with plain recursive SQL as above. Instead we develop a PL/SQL stored function and use a temporary table to achieve the desired behavior.

    drop table temp;
    create global temporary table temp (
      rs number primary key
    )
    on commit preserve rows;

    create or replace function ApUq
    return varchar2
    as
      pragma autonomous_transaction;
      counter number;
      newstates number;
    begin
      -- seed the result set with all states satisfying q
      insert into temp (
        select s_id from state where TRUE = (<sql(s_id, q)>)
      );
      commit; -- autonomous transaction
      select count(*) into counter
      from temp where temp.rs = state_id;
      if counter <> 0 then
        return TRUE;
      end if;
      loop
        newstates := 0;
        -- candidates: predecessors of the current result set that satisfy p
        for r1 in (
          select t1.src_id
          from temp
          join transition t1 on (
            temp.rs = t1.tgt_id
            and TRUE = (<sql(t1.src_id, p)>)
          )
        ) loop
          -- count successors of the candidate outside the result set
          select count(*) into counter
          from transition t2
          where t2.src_id = r1.src_id
            and t2.tgt_id not in (
              select * from temp tt3
            );
          if counter = 0 then
            if r1.src_id = state_id then
              return TRUE;
            else
              begin
                insert into temp values (r1.src_id);
                commit; -- autonomous transaction
                newstates := newstates + 1;
              exception
                when dup_val_on_index then
                  dbms_output.put_line('ignored duplicate');
              end;
            end if;
          end if;
        end loop;
        -- fixpoint reached without including state_id
        if newstates = 0 then
          return FALSE;
        end if;
      end loop;
    end;

    select ApUq() as result from dual;

This sample implementation can be further optimized at different levels. From the model checking perspective, the basic explicit algorithm is known to be outperformed by symbolic model checking [4] using OBDD-encoded Boolean functions [5]. From the database perspective, we would start by looking at horizontal scalability features such as Parallel Pipelined Table Functions (PTF) in the case of Oracle [6], or similar techniques such as MapReduce [2], depending on the vendor's functionality. As this section shows, the queries implementing a composite CTL formula might consist of many subqueries that can be run in parallel. Many existential queries will benefit from the ability to stream the query hits early, before the whole result set is formed, as can be done with PTF.

V. BENCHMARK PROPOSALS

In terms of self-verification, it might be difficult to devise a vendor-independent metric for the model checking benchmark. One such metric could be the percentage of the source code verified given a set of CTL propositions that apply to all products.

Fortunately, it is much easier to design an apples-to-apples benchmark if the verified system is a third-party product. We suggest that a substantial open-source project at the scale of Linux or MySQL be used as the system under verification.

As an example of the properties we want to verify, consider two-phase locking (2PL), where there are distinct lock acquisition and release phases for a transaction. With the events of lock acquisition and release by a transaction t encoded as t_acq and t_rel, respectively, we can state that once t has released a lock it never acquires one again:

    AG(¬t_rel ∨ AX(AG ¬t_acq))

We envision the following benchmark metrics:
  • The fraction of the source code verified
  • The fraction of the formulae verified
  • The monetary cost of the setup needed for verification
  • The amount of energy spent per verification per source code line

VI. CONCLUSION

This paper advocates spending recent scalability gains in modern computing on finding rare and corner-case bugs in database systems to improve their quality by means of fully automated model checking. The fact that model checking can be implemented using the database system itself also presents an interesting test case in terms of traditional software testing. Further, we show a sample implementation of the basic explicit model checking algorithm using a combination of Oracle 11.2 SQL and PL/SQL. Then we point out a couple of optimization areas where the vendors can work on excelling in this benchmark. Last but not least, we suggest several benchmark metrics.

REFERENCES

[1] Clarke, E., Schlingloff, B.: Model Checking, in Handbook of Automated Reasoning, Volume 2, Elsevier and MIT, 1635-1790 (2001)
[2] Dean, J., Ghemawat, S.: Symposium on Operating System Design and Implementation (OSDI), San Francisco, CA (2004)
[3] Emerson, E.: Temporal and Modal Logic, in Handbook of Theoretical Computer Science, Volume B: Formal Models and Semantics, Elsevier and MIT, 995-1072 (1990)
[4] McMillan, K.: Symbolic Model Checking, Kluwer, Norwell, MA (1993)
[5] Meinel, C., Theobald T.: Algorithms and Data Structures in VLSI Design: OBDD Foundations and Applications, Springer, Heidelberg (1998)
[6] Oracle Corp.: Oracle Database SQL Language Reference 11g Release 2 (11.2), http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/toc.htm
[7] Sheffer, A.: Oracle RDBMS 11gR2 – Solving a Sudoku using Recursive Subquery Factoring, http://technology.amis.nl/blog/6404/oracle-rdbms-11gr2-solving-a-sudoku-using-recursive-subquery-factoring
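As a runnable complement to the SQL templates of Section IV, the sketch below loads the Kripke schema of Fig 2 into an in-memory SQLite database from Python and evaluates EX p and E (p U q) for atomic p and q. It is an illustration under stated assumptions, not the paper's Oracle implementation: SQLite stands in for Oracle 11gR2, column types are simplified, the recursive query uses `union` (which discards duplicate rows and so terminates on cyclic graphs) in place of Oracle's `cycle` clause, and the helper names `states_ex` and `states_e_until` are hypothetical. The queries return the whole set of satisfying states instead of a TRUE/FALSE result for one state_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table state (s_id integer primary key, s_nm text);
insert into state values (0, 'S_0'), (1, 'S_1'), (2, 'S_2');

create table atomic (a_id integer primary key, a_nm text);
insert into atomic values (0, 'P_0'), (1, 'P_1');

create table valuation (
  s_id integer references state(s_id),
  a_id integer references atomic(a_id)
);
insert into valuation values (0, 0), (1, 0), (1, 1), (2, 1);

create table transition (
  src_id integer references state(s_id),
  tgt_id integer references state(s_id)
);
-- (null, 0) marks state 0 as the initial state
insert into transition values (null, 0), (0, 1), (1, 0), (1, 2);
""")

# EX p: predecessors of states in which atomic proposition :p holds.
EX_QUERY = """
select distinct t.src_id
from transition t
where t.src_id is not null
  and exists (select 1 from valuation v
              where v.s_id = t.tgt_id and v.a_id = :p)
order by t.src_id
"""

# E (p U q): seed with q-states, then walk transitions backwards through
# p-states. "union" (not "union all") drops duplicates, so cycles terminate.
EU_QUERY = """
with recursive eu(rs) as (
    select v.s_id from valuation v where v.a_id = :q
  union
    select t.src_id
    from eu join transition t on eu.rs = t.tgt_id
    where exists (select 1 from valuation v
                  where v.s_id = t.src_id and v.a_id = :p)
)
select rs from eu order by rs
"""


def states_ex(p_id):
    """Ids of states satisfying EX p for the atomic proposition p_id."""
    return [row[0] for row in conn.execute(EX_QUERY, {"p": p_id})]


def states_e_until(p_id, q_id):
    """Ids of states satisfying E (p U q) for atomic p_id and q_id."""
    rows = conn.execute(EU_QUERY, {"p": p_id, "q": q_id})
    return [row[0] for row in rows]


print("EX P_1:", states_ex(1))
print("E (P_0 U P_1):", states_e_until(0, 1))
print("E (P_1 U P_0):", states_e_until(1, 0))
```

The initial-state marker (null, 0) never contributes a state: its source is null, so the `exists` test on the valuation relation fails for it. Handling arbitrary nested subformulae would require generating the inner queries recursively, as the `<sql(...)>` placeholders in the templates suggest.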