Quantifying Information Leaks using Reliability Analysis
Quoc-Sang Phan∗
Pasquale Malacaria∗
Corina S. Păsăreanu†
Marcelo d'Amorim‡
∗ Queen Mary University of London
† Carnegie Mellon Silicon Valley and NASA Ames
‡ Federal University of Pernambuco
Information Flow
[Figure: two diagrams of Program P with secret input H, public input L, and public output O. Panel labels: "Non-interference" (√) and "Information leaked" (?).]
Non-interference is often unachievable
Example: a password-checking program:
if(H == L) O = ACCEPT; else O = REJECT;
The program violates non-interference. Is it secure?
Non-interference: Does it leak information?
Quantitative Information Flow: “How much” does it leak?
→ Measure the leaks using information-theoretic metrics.
Quantitative Information Flow
Assume that the password is a 4-digit PIN and that the attacker has no prior
knowledge: there are 10000 possible values (0 . . . 9999) for H.
The secret therefore holds log2(10000) ≈ 13.29 bits.
The probability of guessing the password correctly: 1/10000
The probability of being rejected: 9999/10000
Leakage after one try, measured in Shannon entropy:
$$\sum_i p_i \log_2\frac{1}{p_i} = \frac{1}{10000}\log_2(10000) + \frac{9999}{10000}\log_2\frac{10000}{9999} \approx 0.00147 \text{ bits}$$
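The arithmetic on this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming a uniform prior over the 10000 PINs and a single guess; the helper name `shannon_entropy` is introduced here for illustration only. For a deterministic program, the Shannon leakage equals the entropy of the output distribution, which is what the code computes.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits: sum of p * log2(1/p) over non-zero p."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

n_pins = 10_000                      # 4-digit PIN: values 0..9999
secret_bits = math.log2(n_pins)      # size of the secret in bits

p_accept = 1 / n_pins                # the single guess is correct
p_reject = 1 - p_accept              # the single guess is wrong

# Entropy of the observable output (ACCEPT / REJECT) after one try.
leak_bits = shannon_entropy([p_accept, p_reject])

print(f"secret: {secret_bits:.2f} bits")              # secret: 13.29 bits
print(f"leak after one try: {leak_bits:.5f} bits")    # leak after one try: 0.00147 bits
```

One failed guess therefore removes only a tiny fraction of the roughly 13.29 bits of uncertainty.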
Formalisation
[Figure: the program, drawn as a function f, takes H and L and produces O. The adversary tries to infer H from L and O.]
Leaks = Secrecy before observing - Secrecy after observing
Definition:
$X_H$, $X_L$, $X_O$: random variables representing the distributions of H, L, O.
E (entropy): function measuring secrecy.
$\Delta E(X_H) = E(X_H) - E(X_H \mid X_L = l, X_O)$
Theorem of channel capacity:
$\Delta E(X_H) \le \log_2(|O|)$
This bound:
has been proved for the cases where E is Shannon entropy and Rényi's min-entropy;
holds for any distribution of $X_H$;
is the basis of state-of-the-art techniques for Quantitative Information Flow.
$\log_2(|O|)$ is the channel capacity of program P, denoted by CC(P).
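To make the bound concrete, the sketch below counts the distinct outputs of the password checker by brute-force enumeration and takes the base-2 logarithm. Enumeration is used here only because the example is tiny; it is an illustrative stand-in, not how the tool works. The function names are hypothetical.

```python
import math

def check_password(h, l):
    """The password checker from the earlier slide."""
    return "ACCEPT" if h == l else "REJECT"

def channel_capacity_bits(program, secrets, public_input):
    """log2 of the number of distinct outputs for a fixed public input."""
    outputs = {program(h, public_input) for h in secrets}
    return math.log2(len(outputs))

# Two observable outputs (ACCEPT, REJECT), so the capacity is 1 bit.
capacity = channel_capacity_bits(check_password, range(10_000), 1234)
print(capacity)  # 1.0
```

The Shannon leakage of 0.00147 bits computed for one guess sits well below this 1-bit capacity, as the theorem requires.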
QILURA
[Figure: QILURA tool chain. Components: Program, Symbolic PathFinder, Labeling Procedure (Z3, Omega), Quantifying Procedure (Latte). Intermediate result: input labels. Final output: k bits.]
Symbolic PathFinder
Take symbols as inputs instead of concrete data.
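Build path condition $pc_i \equiv c_i(\alpha, \beta)$ for each symbolic path $\rho_i$.
Execute program P with H = α and L = β:
$$O = \begin{cases} f_1(\alpha, \beta) & \text{if } c_1(\alpha, \beta) \\ f_2(\alpha, \beta) & \text{if } c_2(\alpha, \beta) \\ \dots & \dots \\ f_m(\alpha, \beta) & \text{if } c_m(\alpha, \beta) \end{cases}$$
For the symbolic path $\rho_i$ with final state $\sigma_i \in F$: $O|_{\sigma_i} = f_i(\alpha, \beta)$.
Define a function: $path(\rho_i) = c_i(\alpha, \beta)$.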
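The case-split view of a program can be illustrated with a small hand-written summary. This sketch does not use Symbolic PathFinder or its API. It only models each symbolic path as a pair of a condition c_i and an output function f_i, written out manually for the password checker. `SymbolicPath`, `PATHS`, and `run_summary` are hypothetical names.

```python
from typing import Callable, List, NamedTuple

class SymbolicPath(NamedTuple):
    condition: Callable[[int, int], bool]   # c_i(alpha, beta)
    output: Callable[[int, int], str]       # f_i(alpha, beta)

# Hand-written summary of: if (H == L) O = ACCEPT; else O = REJECT;
PATHS: List[SymbolicPath] = [
    SymbolicPath(lambda a, b: a == b, lambda a, b: "ACCEPT"),
    SymbolicPath(lambda a, b: a != b, lambda a, b: "REJECT"),
]

def run_summary(paths, alpha, beta):
    """Evaluate O by selecting the unique path whose condition holds."""
    matching = [p for p in paths if p.condition(alpha, beta)]
    # Path conditions of a deterministic program partition the input space.
    assert len(matching) == 1, "exactly one path condition must hold"
    return matching[0].output(alpha, beta)

print(run_summary(PATHS, 1234, 1234))  # ACCEPT
print(run_summary(PATHS, 1234, 42))    # REJECT
```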
Labelling Procedure
Self-composition
P′: copy of P with all variables renamed: H, L, O → H′, L′, O′
The following Hoare triple guarantees non-interference:
{L = L′} P; P′ {O = O′}
Suppose we run Symbolic Execution on P; P′ with
H = α; H′ = α1; L = L′ = β
The symbolic semantics of P and P′ are R and R′, respectively.
Fine-grained Self-composition by Symbolic Execution
∀ρ ∈ R, ρ′ ∈ R′. path(ρ) ∧ path(ρ′) → O|fin(ρ) = O′|fin(ρ′)
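The pairwise condition above can be checked by hand on toy programs. The following sketch is an illustration under strong assumptions: paths are hand-written (condition, output) pairs, and exhaustive enumeration over a small integer domain stands in for the constraint solving a real tool would need. All names are hypothetical.

```python
from itertools import product

# Hypothetical path summaries: (condition c_i, output f_i) over (alpha, beta).
PASSWORD_PATHS = [
    (lambda a, b: a == b, lambda a, b: "ACCEPT"),
    (lambda a, b: a != b, lambda a, b: "REJECT"),
]
CONSTANT_PATHS = [
    (lambda a, b: b >= 0, lambda a, b: b + 1),  # output ignores the secret
]

def interfering_pairs(paths, domain):
    """Return index pairs (i, j) of paths for which some alpha, alpha1, beta
    satisfy both path conditions but yield different outputs."""
    bad = set()
    for (i, (c, f)), (j, (c2, f2)) in product(enumerate(paths), repeat=2):
        # Same public input beta for both copies, independent secrets.
        for alpha, alpha1, beta in product(domain, repeat=3):
            if (c(alpha, beta) and c2(alpha1, beta)
                    and f(alpha, beta) != f2(alpha1, beta)):
                bad.add((i, j))
                break
    return bad

print(sorted(interfering_pairs(PASSWORD_PATHS, range(4))))  # [(0, 1), (1, 0)]
print(interfering_pairs(CONSTANT_PATHS, range(4)))          # set()
```

An empty result means the condition holds for every pair of paths on the enumerated domain. The password checker fails it because its ACCEPT and REJECT paths can both be taken under the same public input.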
Quantifying Procedure
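$CC(P) \le \log_2\left(\sum \#out(\rho_c) + \sum \#out(\rho_i) + \sum \#out(\rho_d)\right)$
$\sum \#out(\rho_c) = 1$.
$\sum \#out(\rho_i)$ is the number of indirect paths $\rho_i$.
$\sum \#out(\rho_d)$:
$\#out(\rho_d) \le \#in(\rho_d)$, consequently $\sum \#out(\rho_d) \le \sum \#in(\rho_d)$.
Compute $\#in(\rho_d)$ using the Reliability Analysis engine.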
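Once the per-path counts are available, the bound is a one-line computation. The sketch below assumes the counts are already known. In the tool they would come from the Reliability Analysis engine, which is not modelled here. The function name and the numbers in the example are made up for illustration.

```python
import math

def capacity_upper_bound(num_indirect_paths, direct_path_input_counts):
    """Upper bound on CC(P) in bits, following the inequality on the slide."""
    clean_outputs = 1                                  # sum of #out(rho_c) = 1
    indirect_outputs = num_indirect_paths              # one output per indirect path
    direct_outputs_bound = sum(direct_path_input_counts)  # #out(rho_d) <= #in(rho_d)
    return math.log2(clean_outputs + indirect_outputs + direct_outputs_bound)

# Illustrative numbers only: 2 indirect paths, one direct path with 5 inputs.
print(capacity_upper_bound(2, [5]))  # 3.0
```

Because each direct path's output count is replaced by its input count, the result is an over-approximation of the capacity rather than an exact value.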
Preliminary Evaluation
Case Study         | jpf-qif Capacity | jpf-qif Time | QILURA Upper Bound | QILURA Time | BitPattern Upper Bound | BitPattern Time
No Flow            | 0                | 2.304        | 0                  | 0.790       | -                      | -
Sanity check 1     | 4                | 45.324       | 4.09               | 1.066       | 4                      | 0.036
Sanity check 2     | 4                | 35.346       | 4.09               | 1.049       | 4.59                   | 0.203
Implicit Flow      | 2.81             | 0.897        | 3                  | 0.796       | 3                      | 0.011
Electronic Purse   | 2                | 1.169        | 2.32               | 0.854       | 2                      | 0.157
Ten random outputs | 3.32             | 1.050        | 3.32               | 0.814       | 18.645                 | 0.224
Conclusions
QILURA: a fully automated tool to quantify leaks in Java bytecode.
Two-step analysis:
Fine-grained self-composition to label paths.
Reliability Analysis engine to quantify inputs in each path.
Download:
https://github.com/qif/jpf-qilura
http://www.eecs.qmul.ac.uk/~qsp30/
q.phan@qmul.ac.uk