2. Breach Reporting
- Report accesses to unauthorized data
- Typically, access controls restrict access to sensitive data
- Unfortunately, access controls are difficult to configure
- Misconfigurations allow unauthorized data to be accessed
3. Goal
- Goal: Given a DB, the operations executed on the DB, an incorrect (old) policy, and a correct (new) policy, find all queries that disclosed unauthorized data.
4. Example
- Medical records are stored in hospital databases
- Security and privacy of patient records are important
- Patient data is sensitive (e.g., disease, medication)
- An access control policy restricts access to medical records
- When a misconfiguration occurs, patient information is inappropriately accessed
5. New Legal Requirements for Reporting Medical Data Breaches
- Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009, USA
  - Expanded security and privacy protections
  - Monetary fines for disclosure of patient data
  - Covered entities (e.g., hospitals) must report breaches
- New mechanisms are needed to report breaches
6. Outline
- Motivation
- Finding Queries That Disclose Unauthorized Data
- Framework Components
- Improving Misconfiguration Response Performance
- Evaluation
7. Finding Queries That Disclose Unauthorized Data
- What does it mean for a query to be suspicious?
- How do we find these suspicious queries?
- Straw-man approaches:
  - Database auditing techniques
  - Annotation/provenance techniques
8. Database Auditing Techniques [Agrawal '04]
- Applicable to logs containing only queries (no updates)
- During normal execution, record the SQL text of all operations
- At audit time:
  - The auditor specifies the sensitive data
  - Retrieve the queries that used that sensitive data
(Figure: sensitive data in the Patients table is matched against the SQL operation log to identify suspicious queries)
9. Misconfiguration as an Audit Problem
- For misconfigurations, the sensitive data is: data in the DB accessible under the incorrect (old) policy, but no longer accessible under the corrected (new) policy
- Consider the policies:
  - Old policy (Patients): age < 30
  - New policy (Patients): age < 18
(Figure: sensitive data in the Patients table is matched against the SQL operation log to identify suspicious queries)
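Under this formulation, the sensitive data is simply the set difference between what the two policies expose. A minimal sketch as a Python simulation (the table contents and column names are hypothetical, and this is an illustration of the definition, not the paper's implementation):

```python
# Modeling the sensitive data created by a policy misconfiguration:
# rows accessible under the incorrect (old) policy but NOT under the
# corrected (new) policy are exactly the unauthorized data.

patients = [
    {"name": "Alice", "age": 12, "disease": "asthma"},
    {"name": "Bob",   "age": 25, "disease": "flu"},
    {"name": "Carol", "age": 45, "disease": "diabetes"},
]

old_policy = lambda row: row["age"] < 30   # incorrect policy
new_policy = lambda row: row["age"] < 18   # corrected policy

# Sensitive data: visible under the old policy, hidden under the new one.
sensitive = [r for r in patients if old_policy(r) and not new_policy(r)]

print(sensitive)  # only Bob (age 25) falls in the unauthorized range
```

For the example policies above, this is every patient with 18 <= age < 30.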
10-12. Limitations: Data Modifications
- Old policy (Patients): age < 30
- New policy (Patients): age < 18
- Temp: no restrictions
- Sensitive rows are copied from the Patients table into the unrestricted Temp table; a later query reads them from Temp
- The query on Temp is judged not suspicious, so the audit misses the propagation of information!
(Figure: sensitive data flowing from the Patients table into the Temp table)
13. Annotation/Provenance Techniques
- During normal execution:
  - Record the SQL text of all operations
  - Record the dependencies between rows (e.g., Bhagwat '04)
- At audit time:
  - The auditor specifies the sensitive data
  - Retrieve the queries that use data derived from sensitive data
14. Example (cont.)
- The query is suspicious: it accesses a row that depends on sensitive data
(Figure: sensitive data propagating from the Patients table to the Temp table, where it is read by the suspicious query)
15. Annotation/Provenance Techniques
- Track the derivation of data
- Solve the 'copy' problem
- Limitations:
  - Empty results: there are no annotations to analyze
  - More generally, the lack of a row in a result can disclose information
16. Previous Work Is Not Applicable
- Data modification operations
  - Explicit flow of information (copy)
  - Implicit flow of information (e.g., UPDATE Patients SET age = 999 WHERE disease = 'flu')
- Empty results
- Information learned from multiple queries
17. Our Solution: The Misconfiguration Response (MR) Query
- Conceptually, replay the log under the new policy
- Compare query results between the old and new policies
- Return the queries that disclosed unauthorized data
18. Misconfiguration Response Query
- Observation: unauthorized (sensitive) data did not contribute to the result of a query if the query's result is the same when the log is completely replayed under both the incorrect and the correct policy
- If the results differ, no guarantees can be made about the data disclosed
19. Misconfiguration Response Query
- Cleanly addresses the previously discussed limitations
- Replaying the log and comparing results captures:
  - Differences in the information learned between the policies
  - Data modifications
  - Empty results and missing rows
20. Misconfiguration Response Query
- Naïve algorithm:
  1. Copy the database as of the time of the misconfiguration
  2. Replay the log of operations under the new policy
  3. For each query, execute it on the old and new DBs; if the query's result is not the same under both policies, mark it as suspicious
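The naïve algorithm can be sketched as a toy in-memory simulation. Here read-only queries are modeled as functions over the visible rows, and all names are illustrative; the real system replays the log against the DBMS itself, including updates:

```python
# Toy sketch of the naive MR-query algorithm: replay each logged query
# under both policies and flag queries whose results differ.

def visible(table, policy):
    """Rows of the table that a policy exposes."""
    return [r for r in table if policy(r)]

def naive_mr_query(table, log, old_policy, new_policy):
    suspicious = []
    for name, query in log:
        old_result = query(visible(table, old_policy))
        new_result = query(visible(table, new_policy))
        if old_result != new_result:   # results differ -> disclosure possible
            suspicious.append(name)
    return suspicious

patients = [
    {"name": "Alice", "age": 12, "disease": "asthma"},
    {"name": "Bob",   "age": 25, "disease": "flu"},
]
log = [
    ("q1", lambda rows: sorted(r["name"] for r in rows)),                   # all names
    ("q2", lambda rows: sorted(r["name"] for r in rows if r["age"] < 10)),  # young patients only
]
old_policy = lambda r: r["age"] < 30
new_policy = lambda r: r["age"] < 18

print(naive_mr_query(patients, log, old_policy, new_policy))  # ['q1']
```

q1 sees Bob only under the old policy, so its results differ; q2 returns the empty set under both policies, so it is not suspicious.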
21. Outline
- Motivation
- Finding Queries That Disclose Unauthorized Data
- Framework Components
- Improving Misconfiguration Response Performance
- Evaluation
22. Framework Components
- Components should integrate easily into an existing DBMS
- Row-level access control
  - Rewrites operations with an added selection condition
  - Restricts users to a subset of rows
  - E.g., Oracle Fine Grained Access Control
- Operation log
  - Stores all DB events, e.g., the (username, SQL text) of each operation executed on the DB
  - Separate from the recovery log
  - Available in Oracle, SQL Server, and DB2
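One way row-level access control is commonly realized is by rewriting the query so the protected table is replaced by a policy-filtered subquery, in the spirit of Oracle Fine Grained Access Control. A simplified string-level sketch (real systems rewrite the parsed query, not raw SQL text; the query and predicate below are hypothetical):

```python
# Sketch of row-level access control via query rewriting: inject the
# policy predicate as an added selection condition around the table.

def rewrite(sql, table, predicate):
    """Wrap the protected table in a policy-filtered subquery."""
    secured = "(SELECT * FROM {} WHERE {}) AS {}".format(table, predicate, table)
    return sql.replace(table, secured, 1)

q = "SELECT name FROM Patients WHERE disease = 'flu'"
print(rewrite(q, "Patients", "age < 18"))
# SELECT name FROM (SELECT * FROM Patients WHERE age < 18) AS Patients WHERE disease = 'flu'
```

The rewritten query can only ever see rows the policy exposes, which is what lets the framework replay the log under either policy.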
23. Framework Components
- Temporal (historical) databases
  - Recreate the database state that existed at a previous time
  - One possible implementation [Jensen '91]: backlog tables
    - Append-only (inserts and deletes)
    - Additional metadata is stored to reconstruct a database state
(Figure: the Patients table at time = 2, reconstructed from the Patients backlog table)
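The backlog idea can be sketched as follows: each entry records a timestamped insert or delete, and replaying entries up to time t reconstructs the state at t (field names and contents are illustrative, not the [Jensen '91] schema):

```python
# Sketch of reconstructing a past database state from an append-only
# backlog table of timestamped inserts and deletes.

def state_at(backlog, t):
    """Replay backlog entries (sorted by time) up to and including t."""
    rows = []
    for time, op, row in backlog:
        if time > t:
            break
        if op == "insert":
            rows.append(row)
        elif op == "delete":
            rows.remove(row)
    return rows

backlog = [
    (1, "insert", {"name": "Alice", "age": 12}),
    (2, "insert", {"name": "Bob", "age": 25}),
    (3, "delete", {"name": "Alice", "age": 12}),
]

print(state_at(backlog, 2))  # both rows present at time = 2
print(state_at(backlog, 3))  # only Bob remains after the delete
```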
24. Performance Considerations
- The naïve approach can be costly:
  - Copies large amounts of data
  - Replays the entire log
  - Executes each query twice (once on the old DB and once on the new DB)
25. Outline
- Motivation
- Finding Queries That Disclose Unauthorized Data
- Framework Components
- Improving Misconfiguration Response Performance
  - Static Pruning
  - Delta Tables
  - Partial and Simultaneous Re-execution
- Evaluation
26. Static Pruning (Queries Only)
- Guarantee that the query never accesses unauthorized data
- Method:
  - Analyze the SQL text of (i) the policies and (ii) the query
  - Data-independent analysis
- Example:
  - Old policy (Patients): age < 30
  - New policy (Patients): age < 18
  - A query whose predicate excludes the difference between the policies is prunable
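The paper uses a constraint solver over the SQL text for this check; the sketch below shrinks the idea down to single-attribute predicates of the form "age < c" to show the core test: a query is prunable only if its predicate can never select a row in the difference between the two policies. This is a drastically simplified, hypothetical stand-in for the solver:

```python
# Simplified static-pruning check for policies and queries that are all
# upper bounds on one attribute, "age < bound". The unauthorized region
# is new_upper <= age < old_upper; a query is safe if it stays below it.

def prunable(query_upper, old_upper, new_upper):
    sensitive_lo = min(old_upper, new_upper)  # start of the unauthorized range
    return query_upper <= sensitive_lo

OLD, NEW = 30, 18              # old: age < 30, new: age < 18
print(prunable(10, OLD, NEW))  # True: 'age < 10' never touches ages 18-29
print(prunable(25, OLD, NEW))  # False: 'age < 25' can see unauthorized rows
```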
27-28. Pruning With Data Modifications
- The static approach is applicable only to logs that contain just queries
- Example:
  - Old policy (Patients): age < 30
  - New policy (Patients): age < 18
  - An update can create data that is not initially contained in the difference between the policies, so a query over that data should not be pruned
(Figure: sensitive data in the old and new Patients tables)
29. Handling Updates
- Delta tables: store the differences between the old and new backlog tables
  - The set of rows added under the old policy but not the new policy (delta minus)
  - The set of rows added under the new policy but not the old policy (delta plus)
(Figure: the old and new Patients backlog tables and the resulting Patients delta-minus backlog table)
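As a sketch, the two delta tables are set differences between the rows maintained under each policy (in the actual system they are maintained over the backlog tables; row contents here are illustrative):

```python
# Sketch of computing delta tables: delta minus holds rows present only
# under the old policy, delta plus holds rows present only under the new.

def delta_tables(old_rows, new_rows):
    key = lambda r: tuple(sorted(r.items()))
    old_keys = {key(r) for r in old_rows}
    new_keys = {key(r) for r in new_rows}
    delta_minus = [r for r in old_rows if key(r) not in new_keys]  # old only
    delta_plus = [r for r in new_rows if key(r) not in old_keys]   # new only
    return delta_minus, delta_plus

old_backlog = [{"name": "Alice", "age": 12}, {"name": "Bob", "age": 25}]
new_backlog = [{"name": "Alice", "age": 12}]

minus, plus = delta_tables(old_backlog, new_backlog)
print(minus)  # Bob was added only under the old policy
print(plus)   # nothing was added only under the new policy
```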
30. Pruning With Data Modifications
- Can prune if:
  - The static pruning condition is satisfied, and
  - All rows from the delta tables are filtered out by the query
- Example (cont.): a row of the Patients delta-minus backlog table is not filtered by the query, so the query is not prunable
31. Pruning With Data Modifications
- Can prune if:
  - The static pruning condition is satisfied, and
  - All rows from the delta tables are filtered out by the query
- This is no longer a static pruning condition, but the delta tables are typically smaller than the full tables
- There is no longer a need to copy the database: the old DB and the delta tables can be used to create the new DB
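The combined condition can be sketched as follows (names and example rows are illustrative; in the real system the filter test runs against the delta backlog tables):

```python
# Sketch of the extended pruning condition for logs with updates: an
# operation is prunable only when the static condition holds AND the
# query's predicate filters out every row of the delta tables.

def prunable_with_updates(static_ok, query_pred, delta_minus, delta_plus):
    touches_delta = any(query_pred(r) for r in delta_minus + delta_plus)
    return static_ok and not touches_delta

delta_minus = [{"name": "Bob", "age": 25}]
delta_plus = []

# Query 'age < 10' filters out every delta row -> still prunable.
print(prunable_with_updates(True, lambda r: r["age"] < 10, delta_minus, delta_plus))  # True
# Query 'age < 30' selects a delta row -> not prunable.
print(prunable_with_updates(True, lambda r: r["age"] < 30, delta_minus, delta_plus))  # False
```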
32. Re-Execution
- When an operation cannot be pruned, re-execute it to test whether it is suspicious
- Executing a query on both the old and new DBs wastes work; e.g., the old and new queries may join the same rows
- Can we improve re-execution performance?
33. Simultaneous Re-Execution
- Observation: same operation, different data (old vs. new)
- Can the shared computation be done simultaneously?
  - Combine the data from the old and new databases
  - Flags track the origin of each row (new, old, or common)
  - The flags are used to ensure correctness
34. Partial & Simultaneous Re-Execution
- A query is not suspicious if only common rows appear in its result
- Partial re-execution: stop mid-execution if only common rows exist on a cut in the query plan
(Figure: a cut in the query plan used for partial re-execution)
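The flag mechanism can be sketched as follows: the old and new inputs are merged into one tagged input, the operation runs once over it, and the result is suspicious exactly when some surviving row originates from only one policy. A toy Python model with illustrative names (the real system does this in SQL inside the query plan):

```python
# Sketch of simultaneous re-execution: combine old and new rows into one
# input, tagging each row with its origin ('old', 'new', or 'common').

def tag_rows(old_rows, new_rows):
    key = lambda r: tuple(sorted(r.items()))
    old_keys = {key(r) for r in old_rows}
    new_keys = {key(r) for r in new_rows}
    tagged = []
    for r in old_rows:
        tagged.append((r, "common" if key(r) in new_keys else "old"))
    for r in new_rows:
        if key(r) not in old_keys:
            tagged.append((r, "new"))
    return tagged

def suspicious(tagged_result):
    # Suspicious iff any result row originates from only one policy.
    return any(flag != "common" for _, flag in tagged_result)

old_db = [{"name": "Alice", "age": 12}, {"name": "Bob", "age": 25}]
new_db = [{"name": "Alice", "age": 12}]

tagged = tag_rows(old_db, new_db)

# Query selecting age < 18: only the common row survives.
result = [(r, f) for r, f in tagged if r["age"] < 18]
print(suspicious(result))  # False: only common rows in the result

# Query selecting everything: Bob appears only under the old policy.
print(suspicious(tagged))  # True
```

Partial re-execution corresponds to applying the same only-common test at an intermediate cut and stopping early when it already holds.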
35. Outline
- Motivation
- Finding Queries That Disclose Unauthorized Data
- Framework Components
- Improving Misconfiguration Response Performance
- Evaluation
36. Implementation
- Implemented on top of PostgreSQL
- A constraint solver is used for static pruning
- Goal: understand how the workload and data affect performance
- Synthetic data and workload; parameters:
  - Operation selectivity
  - Select-to-update ratio
  - Misconfiguration size
  - Number of operations
37. Static Pruning Results (Queries Only)
- Fewer operations are re-executed
- Higher selectivity and a larger misconfiguration reduce the number of queries pruned
- Setup: 500 queries, 1% selectivity, 250 K rows
38. Performance With Updates (Small Misconfiguration)
- Simultaneous re-execution improves on the naïve method
- Setup: 500 operations, 0.9 select-to-update ratio, 250 K rows, 1% selectivity
39. Performance With Updates (Small Misconfiguration)
- Pruning improves performance for the common cases
- Setup: 500 operations, 0.9 select-to-update ratio, 250 K rows, 1% selectivity
40. Summary of Additional Results
- For large misconfigurations, the naïve approach can be better:
  - The cost of tracking the differences between the old and new DBs is high
  - Pruning is not effective
- Future work: an optimizer to choose the MR-query method given the parameters
41. Summary: PolicyReplay
- Policy misconfigurations are a security concern
- Existing approaches are not able to find all breaches
- Presented the misconfiguration response (MR) query
- Optimizations to improve performance
44-46. Annotation Limitation
- Old policy (Patients): age < 30
- New policy (Patients): age < 18
- Temp: no restrictions
- The user learns that Bob has the flu (sensitive data), then deletes the rows in Temp with the same disease
- A later query's empty result reveals that someone in the Patients table has the flu, yet there are no annotations in the empty result to analyze!
(Figure: sensitive data flowing from the Patients table into the Temp table)
Editor's Notes
- RBAC. A reactive mechanism is needed to find breaches in any relational database.
- The goal of the PolicyReplay system is as follows. Log with updates (data modifications).
- Hospital databases store patient medical records. The data is sensitive; for example, it lists the medications a person is prescribed. Because of this sensitive nature, access controls are used to restrict access so that each user can only access a specific subset of the database.
- Data that is accessed, acquired, or disclosed as a result of such a breach.
- Builds on previous work; a different problem.
- Model the misconfiguration problem as an audit problem. Pediatrics doctor example.
- The next logical approach, for a log with updates, is to track the dependencies between data in the database.
- Added metadata to track annotation dependencies.
- The lack of a row in a result can be used to learn information (e.g., via MINUS).
- Addresses different problems. How can we find those queries that accessed unauthorized data?
- Only sees data visible under the correct policy. Did not contribute to the query result.
- Deals with updates: different rows are added under the old and new policies. Empty result: if the result is empty under one policy but not the other, the query is marked as suspicious.
- Integrates easily with an existing DB.
- Need to go back in time to the old DB state to compare query results.
- Optimizations to improve performance.
- If there are no updates, the only sensitive data is in the difference between the policies. If the query filters out all rows that are accessible under one policy but not the other, then prune.
- Static pruning is not appropriate when the log contains updates (updates create data not initially contained in the difference between the policies). We want to expand the pruning condition to manage logs with updates.
- Intuition: none of the data selected differs between the two policies, thus all the data is common.
- Intuition: none of the data selected differs between the two policies, thus all the data is common. While this condition is no longer static, for many common cases the size of these tables is smaller than the full tables.
- Done in SQL. Consideration for join correctness: common rows are only joined once.
- Tune parameters to determine under which workloads the optimizations improve MR-query efficiency; the goal is not to produce a realistic workload.
- X-axis: size of the misconfiguration, as the percentage of the database impacted. Y-axis: time in seconds to evaluate the MR-query. Static: runs the naïve method but prunes queries. Selectivity: randomly selected using a uniform distribution. 40 MB/s of read per query.
- Ratio: 0.9; 250 K rows initially; 1% selectivity. Order of magnitude.
- Ratio: 0.9; 250 K rows initially; 1% selectivity. Order of magnitude.
- Need an optimizer to choose between the approaches given the workload/misconfiguration.
- Logs with updates. Replay the log and compare query results.
- No annotations in the result to analyze. Other weaknesses, for example when the policy is too restrictive.