
RDB - Repairable Database Systems


Published in: Technology

RDB: Repairable Database Systems
Alexey Smirnov and Tzi-cker Chiueh
Experimental Computer Systems Lab, Department of Computer Science, SUNY Stony Brook

Motivation

Suppose you are a DBA and you have just noticed that your database was compromised 24 hours ago. How would you repair the database?

Currently, the only way to do this is to restore a database backup and recommit all benign transactions manually.

Challenges:
(1) How to tell which transactions are benign and which are malicious? Identifying the initial set of malicious transactions is not enough, because the initial damage can spread over the database through subsequent benign transactions.
(2) The amount of data can be huge and the repair process is very error-prone. There is a need for a way to automate it.

Ideally, an intrusion-resilient DBMS should be able to:
  • Track inter-transaction dependencies;
  • Perform a selective transaction rollback.

We propose an implementation framework called RDB that can render an off-the-shelf DBMS intrusion resilient without modifying its internals. RDB has two major components: a tracking subsystem, which runs at run time, and a recovery subsystem, which runs offline.

RDB inserts a proxy JDBC driver between the DB server and a client that transparently intercepts all queries and results. The proxy can be either on the client side or on the server side.

Definition of Transaction Dependency

A read set of an SQL statement S is the set of rows fetched by this statement.

We will say that statement S2 depends on statement S1 if at least one row from the read set of S2 was modified by S1.

We will say that transaction T2 depends on transaction T1 if at least one statement of T2 depends on a statement from T1.

This definition is prone to both false positives and false negatives. Example of a false positive dependency:

A1   A2   A3
100  5    5
200  5    6
300  1    7

T1: SET A2=5 WHERE A1<250
T2: SELECT A3 WHERE A3>3

T2 reads rows that T1 modified, so a dependency on T1 is recorded, even though T2 reads only column A3, which T1 did not change.

Also, in general it is impossible to determine all transaction dependencies by looking only at the traffic between a client and the DB server, because part of the logic may be inside the application itself.

Transaction Dependency Tracking

To track dependencies successfully, the tracking mechanism should be able to intercept both read and write actions performed by the database server. Possible ways of interception are:
  • Database triggers (online): cannot add a trigger for SELECT statements;
  • Database log analysis (offline): read operations are not logged; its big advantage is that it adds no run-time overhead;
  • Tracking proxy (online): a small program sitting between a client and a server that intercepts all SQL statements sent by the client and results sent back by the server.

RDB uses both the second and the third approaches (log analysis and the tracking proxy) to implement dependency tracking.

The following changes are made to the database at the time of its creation:
  • A new field tr_id is added to each table. It contains the ID of the last transaction that modified a particular row;
  • Table trans_dep(tr_id:INTEGER, dep_ids:VARCHAR): stores IDs of transactions that depend on transaction tr_id;
  • Table annot(tr_id:INTEGER, descr:VARCHAR): stores annotations for transaction tr_id.

The proxy uses its own transaction IDs because there is no standard way to access the internal transaction ID of a database.

SQL Statements Rewriting

As a part of the dependency tracking mechanism, the SQL proxy rewrites certain classes of SQL statements coming from a client.

Original statement:
  SELECT t1.a1,…,tk.ank FROM t1,…,tk WHERE c
Modified statement(s):
  SELECT t1.a1,…,tk.ank, t1.trid,…,tk.trid FROM t1,…,tk WHERE c

Original statement:
  SELECT SUM(t.a) FROM t WHERE c GROUP BY t.b
Modified statement(s):
  SELECT t.trid FROM t WHERE c
  SELECT SUM(t.a) FROM t WHERE c GROUP BY t.b

Original statement:
  UPDATE t SET a1=v1,…,an=vn WHERE c
Modified statement(s):
  UPDATE t SET a1=v1,…,an=vn, trid=curTrID WHERE c

Original statement:
  INSERT INTO t(a1,…,an) VALUES (v1,…,vn)
Modified statement(s):
  INSERT INTO t(a1,…,an,trid) VALUES (v1,…,vn,curTrID)

Original statement:
  COMMIT
Modified statement(s):
  INSERT INTO trans_dep(curTrID,…)
  COMMIT

Database Repair

The database is repaired by compensating malicious transactions. When using RDB, the repair process consists of the following steps:
  • Database log analysis to reconstruct complete dependency information and generate compensating transactions;
  • Dependency graph visualization;
  • Repairing the database by committing compensating transactions.

Different DBMSs provide different facilities for log analysis. We have studied three database servers: Oracle 9.2.0, PostgreSQL 7.2.2, and Sybase ASE 12.5. All of them ultimately provide enough information to generate compensating transactions.

For visualization we used GraphViz, free graph-drawing software from AT&T. The application allows the user to select an initial set of malicious transactions and computes its transitive closure. The result can then be refined by the user to build the final set of transactions to be compensated.

Performance Results

We used the TPC-C benchmark to evaluate the run-time overhead of the JDBC proxy. The size of the test database was about 4GB.

We varied the following parameters:
  • Transaction mix (read intensive and read/write intensive);
  • Connection type (local or over a network);
  • Total footprint size W (effect of database cache).

Our results suggest that the overhead of the proxy is between 6% and 13% for a typical load.

Contact Information

ECSL Lab at SUNY Stony Brook:
E-mail: {alexey,chiueh}
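The row-level dependency definition and its false-positive example can be illustrated with a minimal Python sketch. It is not the RDB implementation: the helper names (matching_rows, statement_depends, transaction_depends) are hypothetical, and identifying rows by their A1 value is an assumption made only for this illustration.

```python
# Table from the poster's false-positive example. Identifying each row by its
# A1 value is an assumption of this sketch, not something RDB prescribes.
rows = [
    {"A1": 100, "A2": 5, "A3": 5},
    {"A1": 200, "A2": 5, "A3": 6},
    {"A1": 300, "A2": 1, "A3": 7},
]


def matching_rows(table, predicate):
    """Return the identifiers of the rows selected by a WHERE-like predicate."""
    return {row["A1"] for row in table if predicate(row)}


def statement_depends(read_set, write_set):
    """S2 depends on S1 if S2 fetched at least one row that S1 modified."""
    return bool(read_set & write_set)


def transaction_depends(t2_read_sets, t1_write_sets):
    """T2 depends on T1 if any statement of T2 depends on a statement of T1."""
    return any(
        statement_depends(read_set, write_set)
        for read_set in t2_read_sets
        for write_set in t1_write_sets
    )


# T1: SET A2=5 WHERE A1<250 (modifies rows 100 and 200)
t1_writes = [matching_rows(rows, lambda r: r["A1"] < 250)]
# T2: SELECT A3 WHERE A3>3 (fetches all three rows)
t2_reads = [matching_rows(rows, lambda r: r["A3"] > 3)]

print(transaction_depends(t2_reads, t1_writes))  # True
```

The check reports a dependency because the row sets overlap, even though T2 reads only column A3 and T1 wrote only column A2. This is the false positive that tracking at row granularity produces.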
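The repair step in which the user selects an initial set of malicious transactions and the tool computes its transitive closure can be sketched as a breadth-first traversal of the dependency graph. This is a sketch under stated assumptions, not the authors' code: the function name affected_transactions is hypothetical, and the dictionary stands in for the contents of trans_dep, mapping each transaction ID to the IDs of the transactions that depend on it.

```python
from collections import deque


def affected_transactions(dependents, initial_malicious):
    """Return the initial set plus every transaction that transitively depends on it.

    `dependents` maps a transaction ID to the IDs of transactions that depend
    on it (a hypothetical in-memory stand-in for the trans_dep table).
    """
    affected = set(initial_malicious)
    queue = deque(initial_malicious)
    while queue:
        tr_id = queue.popleft()
        for dependent in dependents.get(tr_id, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Made-up dependency data, for illustration only.
trans_dep = {1: [2, 3], 2: [4], 3: [], 4: [], 5: [6], 6: []}

print(sorted(affected_transactions(trans_dep, {1})))  # [1, 2, 3, 4]
```

Transactions 5 and 6 are unaffected, so they would not need compensation. In the workflow the poster describes, the user can still refine the computed set before compensating transactions are committed.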