All-Solution Satisfiability Modulo Theories: applications, algorithms and benchmarks
1. All-Solution Satisfiability Modulo Theories:
applications, algorithms and benchmarks
Quoc-Sang Phan and Pasquale Malacaria
Queen Mary University of London
ARES: August 25, 2015
2. Content
Satisfiability Modulo Theories (SMT): a decision problem for
logical formulas over first-order theories
All-SMT: the problem of finding all solutions of an SMT
problem with respect to a set of Boolean variables
An All-SMT solver can benefit several application domains:
Bounded Model Checking, Automated Test Generation,
Reliability analysis, and Quantitative Information Flow. We
concentrate here on Quantitative Information Flow (QIF).
We propose algorithms to build an All-SMT solver on top of
an existing SMT solver, and implement them in a prototype
tool called aZ3.
3. Information Flow
[Figure: program P maps a secret input (H) and a public input (L) to a public output (O). Two cases are shown: non-interference (√), where O reveals nothing about H, and information leaked (?), where O depends on H.]
4. Non-interference is unachievable
int check(int H, int L){
    int O;
    if (L == H)
        O = ACCEPT;
    else
        O = REJECT;
    return O;
}
Figure: Password check
Leakage = secrecy before observing − secrecy after observing:
$\Delta_E(X_H) = E(X_H) - E(X_H \mid X_O)$
where $X_H$ is the secret, $X_O$ is the output, and $E$ is an "entropy" function.
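As a worked instance of the leakage formula, the sketch below computes E(XH) − E(XH|XO) for the password check, taking E to be Shannon entropy. The uniform 4-bit secret, the fixed guess, and the encoding ACCEPT = 1, REJECT = 0 are assumptions made for illustration; none of them is given on the slide.

```python
import math
from collections import Counter

def shannon_entropy(probabilities):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def check(h, l):
    """Password check: ACCEPT (assumed 1) iff the guess equals the secret."""
    return 1 if l == h else 0

def leakage(secrets, guess):
    """E(X_H) - E(X_H | X_O) for a uniform secret and a fixed public input."""
    n = len(secrets)
    prior = shannon_entropy([1 / n] * n)
    # Secrets that produce the same output are indistinguishable to the observer.
    groups = Counter(check(h, guess) for h in secrets)
    # Conditional entropy: the secret stays uniform inside each group.
    posterior = sum((size / n) * math.log2(size) for size in groups.values())
    return prior - posterior

# Assumption: 16 equally likely secrets and one fixed guess.
bits = leakage(range(16), guess=3)
```

Under these assumptions a single guess leaks well under one bit, which is consistent with a program that has only two possible outputs.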
5. Motivations for Quantitative Information Flow
1. Information leakage is unavoidable: authentication systems, for example, must by design leak some information about passwords.
2. Provided the leakage is small, however, it is usually not a problem.
3. Measuring leakage therefore allows a security assessment of a program.
4. This work provides new, fast algorithms to measure "how much" a program leaks.
6. Quantifying Information Leaks
Theorem: Channel Capacity for deterministic systems
$\Delta_E(X_H) \le \log_2(|O|)$
The bound:
holds for Shannon entropy and Rényi's min-entropy;
holds for all possible distributions of $X_H$;
is the basis of state-of-the-art techniques for Quantitative Information Flow analysis.
Based on the above we have:
Definition
Quantitative Information Flow (QIF) is the problem of counting N,
the number of possible outputs of a given program P.
7. An example
base = 8;
if (H < 16 and H >= 0) then
    O = base + H
else
    O = base
Figure: Data sanitization program
Here O is in [8..23], which contains 16 values, so leakage ≤ log2(16) = 4 bits. The possible bit
configurations of O range from 0...01000 to 0...010111.
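A brute-force sketch of this count, assuming H ranges over a small signed interval (the interval is an illustrative choice, not from the slide): it enumerates H, collects the distinct values of O, and takes log2 of their number as the leakage bound.

```python
import math

def sanitize(h, base=8):
    """Data sanitization program: only secrets in [0, 16) influence O."""
    return base + h if 0 <= h < 16 else base

# Assumption: H is drawn from [-64, 64); any range covering [0, 16) gives the same set.
outputs = {sanitize(h) for h in range(-64, 64)}
n_outputs = len(outputs)
leakage_bound_bits = math.log2(n_outputs)
```

Enumeration is only feasible for tiny input spaces, which is why the approach counts models of a formula instead of running the program on every input.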
8. From programs to formulas
The first step in our approach is to understand how programs are
translated into formulas: using Static Single Assignment (SSA) form, a
program P is translated into a conjunctive formula ϕP.
9. Quantifying as Counting
[Figure: a function f maps H and L to O; the adversary tries to infer H from L and O.]
O is stored as a bit vector b1b2 . . . bM.
Assume we have a first-order formula ϕP such that:
ϕP contains a set of Boolean variables VI := {p1, p2, ..., pM}
pi = ⊤ if and only if bi = 1, and pi = ⊥ if and only if bi = 0
Counting outputs of P ≡ Counting models of ϕP w.r.t. VI
10. QIF analysis using an All-SMT solver
Program transformation
L = 8;
if (H < 16)
    O = H + L;
else
    O = L;
(L1 = 8) ∧
(G0 = H0 < 16) ∧
(O1 = H0 + L1) ∧
(O2 = L1) ∧
(O3 = G0?O1 : O2)
Figure: A simple program P encoded into a first-order formula ϕP
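To make the encoding concrete, the sketch below checks that the SSA assignment derived from an execution of P satisfies ϕP and that O3 equals the program's output. It uses Python's unbounded integers in place of bit vectors, which is a simplifying assumption, and all function names are hypothetical.

```python
def program(h):
    """Direct execution of the simple program P."""
    l = 8
    if h < 16:
        o = h + l
    else:
        o = l
    return o

def ssa_assignment(h0):
    """Values of the SSA variables for one execution with secret h0."""
    l1 = 8
    g0 = h0 < 16
    o1 = h0 + l1
    o2 = l1
    o3 = o1 if g0 else o2
    return {"H0": h0, "L1": l1, "G0": g0, "O1": o1, "O2": o2, "O3": o3}

def phi_p(m):
    """The conjunctive formula phi_P evaluated on an assignment m."""
    return (
        m["L1"] == 8
        and m["G0"] == (m["H0"] < 16)
        and m["O1"] == m["H0"] + m["L1"]
        and m["O2"] == m["L1"]
        and m["O3"] == (m["O1"] if m["G0"] else m["O2"])
    )

# Assumption: a small non-negative range of secrets is enough to exercise both branches.
models = [ssa_assignment(h) for h in range(32)]
all_satisfy = all(phi_p(m) for m in models)
outputs_agree = all(m["O3"] == program(m["H0"]) for m in models)
```

Each conjunct constrains exactly one SSA variable, so an assignment that disagrees with the program on O3 falsifies the last conjunct.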
Formula instrumentation to build the set VI :
(assert (= (= #b1 ((_ extract 0 0) O3)) p1))
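The same instrumentation can be generated for every bit of O. The helper below is a hypothetical sketch that emits one SMT-LIB assertion per bit, assuming bit i of the output is tied to p(i+1) as in the assertion for p1, and that each pi is introduced with declare-const.

```python
def instrument(output_var, width):
    """Emit SMT-LIB lines tying Boolean p(i+1) to bit i of output_var.

    Assumptions: output_var is a bit vector of at least `width` bits, and the
    p1..pM naming follows the slide; the helper itself is illustrative.
    """
    lines = []
    for bit in range(width):
        p = f"p{bit + 1}"
        lines.append(f"(declare-const {p} Bool)")
        # p is true exactly when the extracted one-bit slice equals #b1.
        lines.append(
            f"(assert (= (= #b1 ((_ extract {bit} {bit}) {output_var})) {p}))"
        )
    return lines

smt_lines = instrument("O3", 5)
```

Appending these lines to the encoding of ϕP yields the set VI over which models are counted.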
11. QIF analysis using an All-SMT solver
We introduce two algorithms for All-SMT solving.
Both use APIs provided by an SMT solver.
Blocking clause
After finding a model
$\mu = l_0 \wedge l_1 \wedge \dots \wedge l_m \wedge \dots$
add the clause
$\mathit{block} = \neg l_0 \vee \neg l_1 \vee \dots \vee \neg l_m$
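The loop can be sketched as follows. This is a minimal illustration, not the aZ3 implementation: toy_solve is a hypothetical brute-force stand-in for the SMT solver call, the program is the simple example with O = H + 8 when H < 16, and the 5-bit output width and the 6-bit unsigned range of H are assumptions.

```python
WIDTH = 5  # assumption: 5 bits cover every output of the example (up to 23)

def program(h):
    return h + 8 if h < 16 else 8

def toy_solve(blocking_clauses, domain=range(64)):
    """Stand-in for an SMT solver: return a model satisfying all blocking clauses, or None."""
    for h in domain:
        o = program(h)
        model = {f"p{i + 1}": bool((o >> i) & 1) for i in range(WIDTH)}
        model["H"] = h
        # A clause (not l0 or ... or not lm) holds when the model
        # disagrees with at least one of the blocked literals.
        if all(any(model[var] != val for var, val in clause)
               for clause in blocking_clauses):
            return model
    return None

def all_smt_blocking(important):
    """Enumerate all assignments to the important variables via blocking clauses."""
    blocking, solutions = [], []
    while (model := toy_solve(blocking)) is not None:
        # Project the model onto V_I and block exactly that projection.
        projection = [(var, model[var]) for var in important]
        solutions.append(dict(projection))
        blocking.append(projection)
    return solutions

V_I = [f"p{i + 1}" for i in range(WIDTH)]
solutions = all_smt_blocking(V_I)
```

Only the literals over VI are blocked, so different values of H that produce the same output are counted once; the list of blocking clauses grows by one per solution found.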
13. Blocking Clause all-SMT
The blocking-clause method is straightforward and simple to
implement.
However, adding a large number of blocking clauses requires a
large amount of memory, and the extra clauses slow down the Boolean
Constraint Propagation procedure of the underlying solver.
To address these inefficiencies we introduce an alternative method
that uses depth-first search (DFS) to avoid re-discovering solutions.
14. QIF analysis using an All-SMT solver
Use APIs provided by an SMT solver
Depth-first search
Two components:
A DPLL-like procedure to enumerate truth assignments.
The SMT solver, used to check the consistency of the truth
assignments.
16. Depth First Search all-SMT
The method choose_literal chooses the next state to explore
from VI in a DFS manner. Even though there are $2^N$ possible states,
efficient pruning avoids exponential blow-up for programs that
"don't leak too much", i.e.
the depth-first search All-SMT algorithm is linear in |O|.
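A minimal sketch of the DFS idea, under stated assumptions: consistent is a hypothetical brute-force stand-in for the SMT consistency check, choose_literal here simply picks the next unassigned bit in order, and the program, the 8-bit output width and the range of H are illustrative choices rather than details from the slides.

```python
WIDTH = 8  # assumption: number of Boolean variables in V_I

def program(h):
    return h + 8 if h < 16 else 8

def consistent(partial, stats, domain=range(64)):
    """Stand-in for the SMT check: can the partial assignment be extended to a model?"""
    stats["solver_calls"] += 1
    return any(
        all(((program(h) >> bit) & 1) == int(val) for bit, val in partial.items())
        for h in domain
    )

def choose_literal(partial):
    """Hypothetical selection rule: the lowest bit not yet assigned."""
    return len(partial)

def all_smt_dfs(partial, stats):
    """Depth-first enumeration of all consistent assignments to V_I."""
    if not consistent(partial, stats):
        return []  # prune: no output agrees with this prefix
    if len(partial) == WIDTH:
        return [dict(partial)]
    bit = choose_literal(partial)
    solutions = []
    for value in (False, True):
        solutions += all_smt_dfs({**partial, bit: value}, stats)
    return solutions

stats = {"solver_calls": 0}
solutions = all_smt_dfs({}, stats)
```

Every consistent prefix leads to at least one solution, so the number of consistency checks is bounded by roughly twice the number of variables times the number of outputs, rather than by the size of the full assignment tree. No blocking clauses are stored.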
18. Implementation
Tools selected:
Model Checking: CBMC (ANSI C)
Symbolic Execution: Symbolic PathFinder (Java bytecode)
Program transformation: CBMC
SMT solver: Z3
Benchmarks include:
Vulnerabilities in Linux kernel
Anonymity protocols
A Tax program from the European project HATS (Java)
Assumptions: all programs have bounded loops, no recursion.
21. Conclusions
[Figure: overview relating program P, program transformation, ϕP, All-SMT, formal methods, DPLL(T) and QIF.]
Two approaches:
1. Use formal methods to mimic DPLL(T):
   QIF analysis using Model Checking;
   QIF analysis using Symbolic Execution.
2. Generate ϕP, then use DPLL(T):
   generate ϕP using program transformation;
   extend an SMT solver for All-SMT.