Quantifying Information Leaks using Reliability Analysis
1. INFORMATION-FLOW LEAKS
QUANTIFYING LEAKS USING RELIABILITY ANALYSIS
CONCLUSION
Q. Sang Phan∗, Pasquale Malacaria∗, Corina S. Păsăreanu†,
and Marcelo d’Amorim‡
∗Queen Mary University of London, UK
†Carnegie Mellon Silicon Valley and NASA Ames, USA
‡Federal University of Pernambuco, Brazil
July 23, 2014
Information Flow
Quantitative Information Flow
Information Flow
[Figure: program P takes a secret input (H) and a public input (L) and produces a public output (O). Panel labels: "Non-interference" and "Information leaked".]
Information Flow
What violates non-interference?
Information flow from variable H to variable O
Direct flow (explicit flow)
O = H - 10;
Indirect flow (implicit flow)
if (H > 3) O = 3; else O = 100;
Approaches to non-interference:
Type systems: suffer from false positives, e.g. O = H - H; is rejected although it leaks nothing.
Taint analysis: suffers from false positives and false negatives.
Self-composition: precise (but more expensive).
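The difference between these cases can be checked by brute force on a small domain. The C sketch below is only an illustration, not one of the analyses listed above: the 16-value domain, the `program_fn` type, and the function names are assumptions introduced here. It compares the output for every secret against the output for the first secret, which is equivalent to comparing every pair of secrets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical small input domain so that exhaustive checking is feasible. */
enum { DOMAIN = 16 };

/* A program under test maps (secret H, public L) to a public output O. */
typedef int (*program_fn)(int h, int l);

/* Explicit flow: the output is computed directly from the secret. */
int explicit_flow(int h, int l) { (void)l; return h - 10; }

/* Implicit flow: the secret only steers the control flow. */
int implicit_flow(int h, int l) { (void)l; return (h > 3) ? 3 : 100; }

/* No flow, although O is syntactically derived from H. */
int no_flow(int h, int l) { (void)l; return h - h; }

/* Returns true when, for every public input, all secrets in the domain
   produce the same output, i.e. the program is non-interfering there. */
bool is_noninterfering(program_fn p) {
    assert(p != NULL);
    for (int l = 0; l < DOMAIN; l++) {
        int reference = p(0, l);          /* output for the first secret */
        for (int h = 1; h < DOMAIN; h++) {
            if (p(h, l) != reference) {
                return false;             /* two secrets are distinguishable */
            }
        }
    }
    return true;
}
```

Called from a `main` function, `is_noninterfering(no_flow)` holds while the explicit and implicit variants fail, which is the distinction a purely syntactic type system misses for `O = H - H;`.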
Information Flow
Non-interference is often unachievable.
int check(int H, int L) {
    int O;
    if (L == H)
        O = ACCEPT;   /* the guess L matches the secret H */
    else
        O = REJECT;   /* the guess L does not match */
    return O;
}
Password check
Non-interference: Does it leak information?
Quantitative Information Flow: “How much” does it leak?
Quantitative Information Flow
[Figure: channel f with inputs H and L and output O; the adversary tries to infer H from L and O.]
Leaks = Secrecy before observing - Secrecy after observing
Formal definition
$X_H$, $X_L$, $X_O$: distributions of H, L, O.
$E$ (entropy): function measuring secrecy.
$\Delta_E(X_H) = E(X_H) - E(X_H \mid X_L = l, X_O)$
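To make the definition concrete, here is a worked instance for the password check `check(H, L)` shown earlier. The setting is an assumption made for illustration and is not stated in the slides: $E$ is Shannon entropy, $X_H$ is uniform over $k$-bit values, and the adversary submits one fixed guess $l$.

```latex
\begin{align*}
E(X_H) &= \log_2 2^k = k \\
\Pr[O = \mathrm{ACCEPT}] &= 2^{-k}, \qquad \Pr[O = \mathrm{REJECT}] = 1 - 2^{-k} \\
E(X_H \mid X_L = l, X_O) &= 2^{-k} \cdot 0 + (1 - 2^{-k}) \log_2(2^k - 1) \\
\Delta_E(X_H) &= k - (1 - 2^{-k}) \log_2(2^k - 1)
\end{align*}
```

For $k = 32$ this is roughly $7.8 \times 10^{-9}$ bits: a single failed guess rules out only one candidate, so almost all of the secrecy remains.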
Quantitative Information Flow
$\Delta_E(X_H) = E(X_H) - E(X_H \mid X_L = l, X_O)$
Theorem of Channel Capacity
$\Delta_E(X_H) \le \log_2(|O|)$
The bound has been proved for the cases:
E is Shannon entropy (Malacaria and Chen 2008)
E is Rényi’s min-entropy (Smith 2009)
It holds for all possible distributions of $X_H$.
It is the basis of state-of-the-art techniques for Quantitative Information Flow analysis.
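As a minimal sketch of how the bound can be read for a concrete program, the C fragment below enumerates the password check over a small domain, counts the distinct outputs $|O|$, and rounds $\log_2 |O|$ up to whole bits. The 8-bit domain, the `ACCEPT`/`REJECT` encodings, and the helpers `count_outputs` and `ceil_log2` are assumptions for illustration; exhaustive enumeration is a stand-in for the techniques discussed in these slides and does not scale to real input domains.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encodings; the slides leave ACCEPT and REJECT undefined. */
enum { REJECT = 0, ACCEPT = 1 };

/* Hypothetical 8-bit secret domain so that enumeration is feasible. */
enum { DOMAIN = 256 };

/* Password check from the slides. */
int check(int H, int L) {
    int O;
    if (L == H)
        O = ACCEPT;
    else
        O = REJECT;
    return O;
}

/* Number of distinct outputs of check over all secrets, for a fixed public input L. */
int count_outputs(int L) {
    int seen[DOMAIN];   /* distinct outputs observed so far */
    int n = 0;
    for (int H = 0; H < DOMAIN; H++) {
        int o = check(H, L);
        bool known = false;
        for (int i = 0; i < n; i++) {
            if (seen[i] == o) known = true;
        }
        if (!known) seen[n++] = o;
    }
    return n;
}

/* Smallest b with 2^b >= n, i.e. an integer upper bound on log2(n). */
int ceil_log2(int n) {
    assert(n >= 1 && n <= (1 << 30));
    int bits = 0;
    while ((1 << bits) < n) bits++;
    return bits;
}
```

For any guess inside the domain, `count_outputs` returns 2, so the password check leaks at most `ceil_log2(2)`, that is 1 bit, per run.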
Symbolic PathFinder
Labeling Procedure
Quantifying Procedure
Preliminary Evaluation
State of the art
What can’t be avoided:
Input: program P, inputs classified as H and L
(Output: P leaks at most k bits)
What do users have to do?
(Heusser and Malacaria 2010): write a driver following a template.
(Meng and Smith 2011), (Meng and Smith 2013): manually transform the program into bit-vector predicates.
(Klebanov 2012): provide hypotheses, loop invariants, etc. for the interactive theorem prover.
...
This work: automated.
Preliminaries
$P = (\Sigma, I, F, T)$
A symbolic path $\rho$ of $P$: $\rho = \sigma_0 \sigma_1 \ldots \sigma_n$
$\sigma_0 \in I$; $\sigma_n \in F$; $(\sigma_i, \sigma_{i+1}) \in T$ for all $i \in \{0, \ldots, n-1\}$
Semantics of $P$: the set $R$ of all symbolic paths $\rho_i$
Define the functions:
$\mathit{init}(\rho) = \sigma_0$; $\mathit{fin}(\rho) = \sigma_n$
$\#\mathit{in}(\rho)$: the number of inputs that follow path $\rho$.
$\#\mathit{out}(\rho)$: the number of outputs produced along path $\rho$.
Denote by $X|_y$ the value of the variable $X$ at the symbolic state $y$ (i.e. $y : X \to X|_y$)
Symbolic PathFinder
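Take symbols as inputs instead of concrete data.
Build a path condition $pc_i \equiv c_i(\alpha, \beta)$ for each symbolic path $\rho_i$.
Execute program $P$ with $H = \alpha$ and $L = \beta$:
$$O = \begin{cases}
f_1(\alpha, \beta) & \text{if } c_1(\alpha, \beta) \\
f_2(\alpha, \beta) & \text{if } c_2(\alpha, \beta) \\
\ldots & \ldots \\
f_m(\alpha, \beta) & \text{if } c_m(\alpha, \beta)
\end{cases}$$
For the symbolic path $\rho_i$ with final state $\sigma_i \in F$:
$O|_{\sigma_i} = f_i(\alpha, \beta)$
Define a function:
$\mathit{path}(\rho_i) = c_i(\alpha, \beta)$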
Illustrative Example
int sanityCheck(int H) {
    int base = 8, O;
    if (H < 16)
        O = base + H;
    else
        O = base;
    return O;
}
Sanity check
Running symbolic execution on the program with $H = \alpha$ yields two symbolic paths:
$\rho_1$: $O|_{\mathit{fin}(\rho_1)} = \alpha + 8$, and $c_1(\alpha) \equiv \alpha < 16$
$\rho_2$: $O|_{\mathit{fin}(\rho_2)} = 8$, and $c_2(\alpha) \equiv \neg(\alpha < 16)$
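The two paths can be written down as (condition, output) pairs and compared with concrete runs. The C sketch below does this by enumeration and also counts #in and #out for each path as defined in the Preliminaries. The `sym_path` struct, the helper names, and the domain 0..31 for H are assumptions for illustration; Symbolic PathFinder derives the paths symbolically rather than by enumeration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical domain: H ranges over 0..31. */
enum { DOMAIN = 32 };

/* Program from the slides. */
int sanityCheck(int H) {
    int base = 8, O;
    if (H < 16)
        O = base + H;
    else
        O = base;
    return O;
}

/* One symbolic path: a path condition c(alpha) and an output expression f(alpha). */
typedef struct {
    bool (*cond)(int alpha);
    int  (*out)(int alpha);
} sym_path;

static bool c1(int a) { return a < 16; }
static int  f1(int a) { return a + 8; }
static bool c2(int a) { return !(a < 16); }
static int  f2(int a) { (void)a; return 8; }

const sym_path RHO1 = { c1, f1 };
const sym_path RHO2 = { c2, f2 };

/* #in: number of inputs in the domain that satisfy the path condition. */
int count_in(sym_path p) {
    assert(p.cond && p.out);
    int n = 0;
    for (int a = 0; a < DOMAIN; a++) {
        if (p.cond(a)) n++;
    }
    return n;
}

/* #out: number of distinct outputs produced along the path. */
int count_out(sym_path p) {
    assert(p.cond && p.out);
    int seen[DOMAIN];
    int n = 0;
    for (int a = 0; a < DOMAIN; a++) {
        if (!p.cond(a)) continue;
        int o = p.out(a);
        bool known = false;
        for (int i = 0; i < n; i++) {
            if (seen[i] == o) known = true;
        }
        if (!known) seen[n++] = o;
    }
    return n;
}

/* Every input takes exactly one path, and that path predicts the concrete output. */
bool paths_match_program(void) {
    for (int a = 0; a < DOMAIN; a++) {
        int taken = (int)c1(a) + (int)c2(a);
        int predicted = c1(a) ? f1(a) : f2(a);
        if (taken != 1 || predicted != sanityCheck(a)) return false;
    }
    return true;
}
```

On this domain both paths receive 16 inputs, but the first produces 16 distinct outputs while the second produces only one.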
Labeling Procedure
Self-composition
$P'$: copy of $P$ with all variables renamed: $H, L, O \to H', L', O'$
The following Hoare triple guarantees non-interference:
$\{L = L'\}\; P; P' \;\{O = O'\}$
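Suppose we run symbolic execution on $P; P'$ with
$H = \alpha$; $H' = \alpha_1$; $L = L' = \beta$
The symbolic semantics of $P$ and $P'$ are $R$ and $R'$.
Fine-grained self-composition by symbolic execution:
$\forall \rho \in R, \rho' \in R'.\ \mathit{path}(\rho) \land \mathit{path}(\rho') \to O|_{\mathit{fin}(\rho)} = O'|_{\mathit{fin}(\rho')}$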
Illustrative Example
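$\rho_1$: $O|_{\mathit{fin}(\rho_1)} = \alpha + 8$, and $c_1(\alpha) \equiv \alpha < 16$
$\rho_2$: $O|_{\mathit{fin}(\rho_2)} = 8$, and $c_2(\alpha) \equiv \neg(\alpha < 16)$
Program $P'$ also has two symbolic paths, $\rho'_1$ and $\rho'_2$. There are 3 possible combinations:
$(\rho_1, \rho'_1)$: $\alpha < 16 \land \alpha_1 < 16 \to \alpha + 8 = \alpha_1 + 8$: INVALID
$(\rho_2, \rho'_2)$: $\neg(\alpha < 16) \land \neg(\alpha_1 < 16) \to 8 = 8$: VALID
$(\rho_1, \rho'_2)$: $\alpha < 16 \land \neg(\alpha_1 < 16) \to \alpha + 8 = 8$: INVALID
$\Rightarrow$ $\rho_1$ is a direct flow, $\rho_2$ is an indirect flow.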
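The three validity results can be reproduced by testing each implication on every pair of secrets in a small domain. The C sketch below is a brute-force stand-in for the solver query a symbolic executor would issue; the `sym_path` struct, the function `pair_is_valid`, and the domain 0..31 are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical domain: both copies of the secret range over 0..31. */
enum { DOMAIN = 32 };

/* One symbolic path: a path condition and an output expression over the secret. */
typedef struct {
    bool (*cond)(int alpha);
    int  (*out)(int alpha);
} sym_path;

/* The two paths of sanityCheck; the renamed copy P' has the same two paths. */
static bool c1(int a) { return a < 16; }
static int  f1(int a) { return a + 8; }
static bool c2(int a) { return !(a < 16); }
static int  f2(int a) { (void)a; return 8; }

const sym_path RHO1 = { c1, f1 };
const sym_path RHO2 = { c2, f2 };

/* Checks, for all alpha and alpha1 in the domain, that
   p.cond(alpha) && q.cond(alpha1) implies p.out(alpha) == q.out(alpha1).
   Here p is a path of P and q is a path of the renamed copy P'. */
bool pair_is_valid(sym_path p, sym_path q) {
    assert(p.cond && p.out && q.cond && q.out);
    for (int a = 0; a < DOMAIN; a++) {
        for (int a1 = 0; a1 < DOMAIN; a1++) {
            if (p.cond(a) && q.cond(a1) && p.out(a) != q.out(a1)) {
                return false;   /* counterexample: outputs differ */
            }
        }
    }
    return true;
}
```

A counterexample for the first pair is `alpha = 0, alpha1 = 1` (outputs 8 and 9); for the mixed pair it is `alpha = 1, alpha1 = 16` (outputs 9 and 8).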
Quantifying Procedure
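$CC(P) \le \log_2\bigl(\sum \#\mathit{out}(\rho_c) + \sum \#\mathit{out}(\rho_i) + \sum \#\mathit{out}(\rho_d)\bigr)$
$\sum \#\mathit{out}(\rho_c) = 1$.
$\sum \#\mathit{out}(\rho_i)$ is the number of indirect paths $\rho_i$.
Only $\sum \#\mathit{out}(\rho_d)$ needs to be computed.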
Reliability Analysis in Symbolic PathFinder. Filieri, Păsăreanu and Visser. ICSE 2013.
Compute $\#\mathit{in}(\rho)$ for each $\rho$.
Program as a function (each input yields exactly one output):
$\#\mathit{out}(\rho_d) \le \#\mathit{in}(\rho_d)$
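As a worked instance, the bound can be evaluated for the sanityCheck example, where $\rho_1$ was labeled direct and $\rho_2$ indirect. The assumption that H is a 32-bit two's-complement integer is made here for illustration and is not stated in the slides; the term $\sum \#\mathit{out}(\rho_c) = 1$ is taken from the formula as given.

```latex
\begin{align*}
\#\mathit{in}(\rho_1) &= \bigl|\{\alpha : -2^{31} \le \alpha < 16\}\bigr| = 2^{31} + 16 \\
\textstyle\sum \#\mathit{out}(\rho_d) &\le \#\mathit{in}(\rho_1) = 2^{31} + 16 \\
\textstyle\sum \#\mathit{out}(\rho_i) &= 1, \qquad \textstyle\sum \#\mathit{out}(\rho_c) = 1 \\
CC(P) &\le \log_2\bigl(1 + 1 + 2^{31} + 16\bigr) = \log_2\bigl(2^{31} + 18\bigr) \approx 31 \text{ bits}
\end{align*}
```

Under this assumption the program leaks at most about 31 of the 32 secret bits, which matches the intuition that every secret below 16 is revealed exactly.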