1. 7/21/2014
1
Faculty of Engineering and Computer Science
Concordia Institute for Information Systems Engineering
Cloud Traffic Security
Wen Ming Liu
INSE 6620 July 23, 2014
Agenda
2
Cloud Applications
Side-Channel Attacks
Challenges and Solutions
Ceiling Padding
4. 7/21/2014
4
7
Side-Channel Attack on Encrypted Traffic
Internet
Client ServerEncrypted Traffic
User Input Observed Directional Packet Sizes
a: 801→, ←54, ←509, 60→
00: 812→, ←54, ←505, 60→,
813→, ←54, ←507, 60→
b-byte s-byte
Network packets’ sizes and directions
between user and a popular search engine
By acting as a normal user and
eavesdropping traffic with sniffer pro 4.7.5.
Collected in May 2012
Indicator of the
input itself
Fixed pattern: identified input string
8
Updated Patterns Dec 2013
Internet
Client ServerEncrypted Traffic
User Input Observed Directional Packet Sizes
a: 590→, 67→, ←60, ←60, ←728, 60→
00: 590→, 67→, ←60, ←60, ←698, 60→,
590→, 68→, ←60, ←60, ←717, 60→
b-byte s-byte
Patterns may change over time, but attacks
will still work in similar ways
Patterns may be different for different Web
applications, but they are always there
Indicator of
input itself
Fixed pattern: identified input string
5. 7/21/2014
5
9
To Make Things Worse
The “Autocomplete” feature allows adversaries to combine the packets
corresponding to multiple keystrokes
Web applications are highly interactive
User Input Observed Directional Packet Sizes
a: 590→, 67→, ←60, ←60, ←728, 60→
00: 590→, 67→, ←60, ←60, ←698, 60→,
590→, 68→, ←60, ←60, ←717, 60→
b-byte s-byte
10
Longer Inputs, More Unique the Patterns
S value for each character entered as:
a b c d e f g
509 504 502 516 499 504 502
h i j k l m n
509 492 517 499 501 503 488
o p q r s t
509 525 494 498 488 494
u v w x y z
503 522 516 491 502 501
First keystroke: Second keystroke:
First
Keystroke
Second Keystroke
a b c d
a 509 487 493 501 497
b 504 516 488 482 481
c 502 501 488 473 477
d 516 543 478 509 499
Unique s value 12 out of 1616 out of 16
The unique patterns leak out users’
private information: the input string
In reality, it may
take more than two
keystrokes to
uniquely identify
an input string.
6. 7/21/2014
6
Side-Channel Leaks
To protect the information in critical applications against
network sniffing, a common practice is to encrypt their
network traffic. However, as discovered in the research,
serious information leaks are still a reality.
Even though the communications generated during these
state transitions are protected by HTTPS, their observable
attributes can still give away the information about the user’s
selection.
The eavesdropper cannot see the contents, but can observe:
number of packets, timing/size of each packet.
11
Slides 11-27 are partially based on: S. Chen, R. Wang, X. Wang, and K. Zhang. Side-channel leaks in
web applications: A reality today, a challenge tomorrow. In IEEE Symposium on Security and Privacy’10,
pages 191–206, 2010.
Main Findings
Analysis of the side-channel weakness in web applications.
Several high-profile and really popular web applications actually
disclose surprisingly detailed user’s sensitive information
Personal health data, family income, investment details, search queries
The root causes of the side-channel information leaks are the
fundamental characteristics in today’s web applications.
Stateful communication, low entropy input and significant traffic
distinctions
In-depth study on the challenges in mitigating the threat.
Evaluate the effectiveness and the overhead of common mitigation
techniques such as packet padding.
Show that effective solutions to the side-channel problem have to be
application-specific, relying on an in-depth understanding of the
application being protected.
This suggests the necessity of a significant improvement of the
current practice for developing web applications.
12
7. 7/21/2014
7
Fundamental Characteristics of Web Applications
The root causes are some fundamental characteristics
in today’s web applications :
Low entropy inputs for better interactions.
Small input space
Autosuggestion, auto-complete
Stateful communications.
Transitions to next states depend both on the current state and
on its input.
Although information for each transition may be insignificant, their
combination can be really powerful.
Significant traffic distinctions.
The chance of two different user actions having the same traffic
pattern is really small.
Such distinctions often come from the objected updated by client-
server data exchanges.
13
Significant traffic distinctions
14
8. 7/21/2014
8
Basic of Wi-Fi Encryption Schemes:
WEP: susceptible to key-recovery attacks
WPA: TKIP (RC4)
WPA2: CCMP (128-bit AES block cipher in counter
mode)
The ciphertext fully preserves the size of its plaintext!
15
Scenario: search using encrypted Wi-Fi WPA/WPA2.
Example: user types “list” on a WPA2 laptop.
Consequence: Anybody on the street knows our search queries.
Attacker’s effort: linear, not exponential.
821
910
822
931
823
995
824
1007
16
9. 7/21/2014
9
OnlineHealthA
(“A” denoting a pseudonym)
A web application by one of the most reputable
companies of online services
Illness/medication/surgery information is leaked
out, as well as the type of doctor being queried.
Vulnerable designs
Entering health records
By typing – auto suggestion
By mouse selecting – a tree-structure organization of
elements
Finding a doctor
Using a dropdown list item as the search input
17
tabs
Entering health records: no matter
keyboard typing or mouse selection,
attacker has a 2000× ambiguity
reduction power.
Find-A-Doctor: attacker can
uniquely identify the specialty.
Attacker’s power
10. 7/21/2014
10
OnlineTaxA
It is the online version of one of the most widely
used applications for the U.S. tax preparation.
Design: a tax-preparation wizard
Tailor the conversation based on user’s previous input.
The forms that you work on tell a lot about your
family
Filing status
Number of children
Paid big medical bill
The adjusted gross income (AGI)
19
Entry page of
Deductions &
Credits
Summary of
Deductions &
Credits
Full credit
Not eligible
Partial credit
All transitions have unique traffic patterns.
Consult the IRS instruction:
$1000 for each child
Phase-out starting from $110,000. For every $1000 income, lose $50
credit.
$0
$110000 $150000
Not eligibleFull credit Partial credit
(two children scenario)
child credit state machine
20
11. 7/21/2014
11
Entry page of
Deductions &
Credits
Summary of
Deductions &
Credits
Full credit
Not eligible
Partial credit
Even worse, most decision procedures for credits/deductions
have asymmetric paths.
Eligible – more questions
Not eligible – no more question
Enter your
paid interest
$0
$115000 $145000
Not eligibleFull credit Partial credit
Student-loan-interest credit
21
Disabled Credit
$24999
Retirement Savings
$53000
IRA Contribution
$85000 $105000
College Expense $116000
$115000Student Loan Interest $145000
First-time Homebuyer credit $150000 $170000
Earned Income Credit
$41646
Child credit *
$110000
Adoption expense $174730 $214780
$130000 or $150000 or $170000 …
$0
A subset of identifiable AGI thresholds
We are not tax experts.
OnlineTaxA can find more than 350 credits/deductions.
22
12. 7/21/2014
12
A major financial institution in the U.S.
Which funds you invest?
• No secret.
• Each price history curve is a
GIF image from MarketWatch.
• Everybody in the world can
obtain the images from
MarketWatch.
• Just compare the image sizes!
OnlineInvestA
Your investment allocation
• Given only the size of the pie chart,
can we recover it?
• Challenge: hundreds of pie-charts
collide on a same size.
23
Inference based on the evolution of
the pie-chart size in 4-or-5 days
The financial institution updates the pie chart every day
after the market is closed.
The mutual fund prices are public knowledge.
≅ 800 charts ≅ 80 charts ≅ 8 charts 1 chart
Sizeofday1
Sizeofday2;
Pricesoftheday
Sizeofday3;
Pricesoftheday
Sizeofday4;
Pricesoftheday
≅80000charts
24
13. 7/21/2014
13
Challenging to Mitigate the Vulnerabilities
25
Traffic differences are everywhere. Which ones result in
serious data leaks?
Need to analyze the application semantics, the availability of
domain knowledge, etc.
Hard.
Is there a vulnerability-agnostic defense to fix the
vulnerabilities without finding them?
Obviously, padding is a must-do strategy.
We found that even for the discussed apps, the defense policies
have to be case-by-case.
Why challenging?
26
14. 7/21/2014
14
See if problem can be solved without analyzing
individual application
Application-agnostic manner: Padding
Rounding:
Random padding:
Average overhead:
Given , reduction power being
calculated after padding
Universal Mitigation Policies
27
Any Problem?
Agenda
28
Cloud Applications
Side-Channel Attacks
Challenges and Solutions
Ceiling Padding
15. 7/21/2014
15
29
Trivial problem?
30
Don’t Forget the Cost
(Prefix) char s Value Rounding (ΔΔΔΔ)
64 160 256
(c) c 473 512 480 512
(c) d 477 512 480 512
(d) b 478 512 480 512
(d) d 499 512 640 512
(a) c 501 512 640 512
(b) a 516 576 640 768
Padding Overhead (%) 6.5% 14.1% 13.0%
1 S. Chen, R. Wang, X. Wang, and K. Zhang. Side-channel leaks in web applications: A reality today, a
challenge tomorrow. In IEEE Symposium on Security and Privacy’10, pages 191–206, 2010.
No guarantee of better privacy at a higher cost
↑ ⇏ privacy ↑
↑ ⇏ overhead ↑
To make all inputs indistinguishable by rounding will result in
a 21074% overhead for a well-known online tax system 1
16. 7/21/2014
16
Two Conflicting Goals
31
To prevent side-channel attacks, we face two
seemingly conflicting goals,
Privacy protection:
Reduce the differences in packet sizes
Cost:
Minimize the overhead (communication and processing…)
The similar goals may allow us to borrow existing
expertise on privacy-preserving data publishing(PPDP).
How?
Grouping and Breaking
Slides 31-45 are partially based on: W. M. Liu, L. Wang, K. Ren, P. Cheng, M. Debbabi, S. Zhu, “PPTP:
Privacy-Preserving Traffic Padding in Web-Based Applications,” IEEE Trans. on Dependable and
Secure Computing (TDSC),
Solution: Ceiling Padding
32
473 477 478 (c) c
477 477 478 (c) d
478 499 478 (d) b
499 499 516 (d) d
501 516 516 (a) c
516 516 516 (b) a
S Value Padding (Prefix)
charOption 1 Option 2
Quasi-ID Function 1 Function 2 Sensitive
AttributeGeneralization
PPTP:
Padding group
PPDP:
anonymized group
PPTP goals:
Privacy
Cost
PPDP goals:
Privacy
Data utility
So we can apply existing techniques in data publication to achieve ceiling padding
However, there are a few difference, and hence challenges...
Ceiling padding: pad every packet to the maximum size in the group
17. 7/21/2014
17
33
PPTP Components
Internet
Interaction:
action a:
Atomic user input that triggers traffic
A keystroke, a mouse click …
action-sequence ܽԦ :
A sequence of actions with complete input info
Consecutive keystrokes…
action-set Ai:
Collection of all ith actions in a set of action-seq
Observation:
flow-vector v:
A sequence of flows (directional packet sizes)
Triggered an action
vector-sequence ݒԦ :
A sequence of flow-vectors
Triggered by an equal-length action-sequence
vector-set Vi:
Collection of all ith vectors in a set of vector-seq
Vector-Action Set VAi:
Pairs of ith actions and corresponding ith flow-vectors
User Input Observed Directional Packet Sizes
a: 801→, ←54, ←509, 60→
00: 812→, ←54, ←505, 60→,
813→, ←54, ←507, 60→
34
Privacy and Cost
k-indistinguishability: Given a vector-action set VA
Padding group:
any S⊆VA satisfying all the pairs in S have identical flow-vectors and no S’ ⊃S can
satisfy this property
We say VA satisfies k-indistinguishability (k is an integer) if the cardinality
of every padding group is no less than k
Goal of privacy protection:
Upon observing any flow-vector in the traffic, the eavesdropper cannot
determine which action in the table (vector-action set) has triggered this flow-
vector.
l-diversity:
Address the cases that:
No all inputs should be treated equally in padding (for example, some statistical
information regarding the likelihood of different inputs may be publicly known).
18. 7/21/2014
18
35
Privacy and Cost
Vector-distance:
Given two equal-length flow-vectors v1 and v2, vector-distance is the
total number of bytes different in the flows: ݏ݅݀ݒ ݒଵ, ݒଶ =
∑ (|ݏଵ − ݏଶ|)
௩భ
ୀଵ .
Padding cost:
Given a vector-set V, the padding cost is the sum of the vector-
distances between each flow-vector in V and its countpart after padding.
Processing cost:
Given a vector-set V, the processing cost is the number of flows in V
which corresponding packets should be padded.
Agenda
36
Cloud Applications
Side-Channel Attacks
Challenges and Solutions
Ceiling Padding
19. 7/21/2014
19
Challenge 1
37
473 477 478 (c) c
477 477 478 (c) d
478 499 478 (d) b
499 499 516 (d) d
501 516 516 (a) c
516 516 516 (b) a
S Value Padding (Prefix)
charOption 1 Option 2
Differences and challenges:
Data utility measures & padding cost
Traffic padding: cost of option 1 is worse than that of option 2
Data publication: utility of function 1 is better than that of option 2
38
Challenge 2
Recall that adversaries may combine multiple keystokes
Example:
One obvious, but invalid solution:
Pad every keystroke (separately)
Another obvious, but invalid solution:
Pad on the whole string!
First
Keystroke
Second Keystroke
a b c d
a 487 493 501 497
b 516 488 482 481
c 501 488 473 477
d 543 478 509 499
First
Keystroke
Second Keystroke
a b c d
a 487 493 501 497
b 516 488 482 481
c 501 488 473 477
d 543 478 509 499
First
Keystroke
Second Keystroke
a b c d
a … … 501 …
b … 488 … …
c 501 488 … …
d … … … …
First
Keystroke
Second Keystroke
a b c d
a 509 … … 501 …
b 504 … 488 … …
c 502 501 488 … …
d 516 … … … …
First
Keystroke
Second Keystroke
a b c d
a 516 … … 501 …
b 504 … 488 … …
c 504 501 488 … …
d 516 … … … …
Strings 1st keystroke 2nd keystroke
ac 509 501
ca 502 501
ad 509 497
dd 516 499
… … …
20. 7/21/2014
20
39
PPTP - Overview of Algorithms
Intention:
To demonstrate the existence of abundant possibilities in approaching PPTP
issue, and not to design an exhaustive list of solutions.
Design three algorithms for partitioning inputs into padding groups.
Main difference: the algorithms handle in increasingly complicated cases .
Computational complexity:
svsdSimple algorithm: Ο ݈݊݊݃
svmdGreedy algorithm: Ο(݊ ⨯ ݊2) (worse case), Ο(݊ ⨯ ݈݊)݊݃ (average case)
mvmdGreedy algorithm:Ο(݊ ⨯ ݊2) (worse case), Ο(݊ ⨯ ݈݊)݊݃ (average case)
40
Experiment Settings
Collect data from two real-world web applications:
A popular search engine
(users’ search keyword needs to be protected)
Collect flow-vectors for query suggestion widget for all possible combinations of four letters
by crafting requests to simulate the normal AJAX connection request.
Authoritative drug information system from national institute
(user’s possible health information needs to be protected)
Collect vector-action set for all the drug information by mouse-selecting following the
application’s three-level tree-hierarchical navigation.
21. 7/21/2014
21
41
Overhead - Padding Cost
The padding cost against k:
To compare to rounding, Δ=512 (engineB) and Δ=5120 (drugB) which achieves only 5-indistinguishility.
Our algorithms have less padding cost in both cases.
Observe that our algorithms are superior specially when the number of flow-vectors is larger.
42
Overhead – Execution Time
Generate n-size flow data by synthesizing n/|VA| copies of engineB and drugB.
The computation time of mvmdGreedy increases slowly with n.
Practically efficient (1.2s for 2.7m flow-vectors),
Require slightly more overhead than rounding when it is applied to a single Δ value.
The computational time of mvmdGreedy against privacy property k
A tighter upper bound: Ο(݊ ⨯ ݊ ⨯ 2݇ ⨯ λ) (worse case), Ο(݊ ⨯ ݊ ⨯ log (2݇ ⨯ λ)) (average case)
The computation time increases slowly with k for engineB, and decreases slowly for drugB.
22. 7/21/2014
22
43
Overhead – Processing Cost
An application can choose to incorporate the padding at different stage of
processing a request, however, we must minimize the number of packets to be
padded.
Pad the flow-vectors on the fly,
Modify the original data beforehand.
The processing cost against k:
Rounding must pad each flow-vector regardless of the k’s and the applications, while our
algorithms have much less cost for engineB and slightly less for drugB.
Extension
44
Adapt l-diversity to address cases that:
No all inputs should be treated equally in padding.
Model:
Catch the information about the inequality.
Algorithms:
Need additional constraints on partition.
23. 7/21/2014
23
45
Experiments
Collect data from two real-world web applications:
Another popular search engine
(users’ search keyword needs to be protected)
Authoritative patent information system from national institute
(company’s patent interest needs to be protected)
46
Challenge 3: Ceiling Padding Defeated
Condition s Value Rounding (ΔΔΔΔ) Ceiling
Padding112 144 176
Cancer 360 448 432 528 360
Cervicitis 290 336 432 352 360
Cold 290 336 432 352 290
Cough 290 336 432 352 290
Padding Overhead (%) 18.4% 40.5% 28.8% 5.7%
2-indistinguishability
Observation: a patient received a 360-byte packet after login
Cancer? Cervicitis? ⇒ 50% , 50%
Extra knowledge: this patient is a male
Cancer? Cervicitis? ⇒ 100% , 0%
Facts (Ceiling Padding):
Slides 46-58 are partially based on: W. M. Liu, L. Wang, K. Ren, M. Debbabi,, “Background Knowledge-
Resistant Traffic Padding for Preserving User Privacy in Web-Based Applications,” Proc. The 5th IEEE
International Conference and on Cloud Computing Technology and Science (IEEE CloudCom 2013),
24. 7/21/2014
24
47
Solution: Add Randomness
Condition s Value
Cancer 360
Cervicitis 290
Cold 290
Cough 290
Cancerous Person
360
360
360
Random Ceiling Padding
Instead of deterministically forming padding groups, the server will randomly
(at uniform, in this example) selects one out of the other three conditions
(together with the real condition) to form a padding group for ceiling padding.
Always receive a
360-byte packet
Condition s Value
Cancer 360
Cervicitis 290
Cold 290
Cough 290
Cervicitis Patient
360
290
66.7%: 290-byte packet
33.3%: 360-byte packet
290
48
Better Privacy Protection
Diseases s Value
Cancer 360
Cervicitis 290
Cold 290
Cough 290
Can tolerate adversaries’ extra knowledge
Suppose an adversary knows a patient is male and he saw s = 360
Patient
Has
Server
Selects
Cancer Cervicitis
Cancer Cold
Cancer Cough
Cold Cancer
Cough Cancer
The adversary
now can only
be 60%, instead
of 100%, sure
that patient has
Cancer.
Cost is not necessarily worse
In this example, these two methods actually lead to exactly the same
expected padding and processing costs
25. 7/21/2014
25
Analysis
49
Scenario:
Algorithm: randomness drawn from uniform distribution
Data: action-sequence and flow-vector are of length one
Analysis of privacy preservation:
Lemma 6.1
Analysis of costs:
Lemma 6.2, Lemma 6.3
Model, Scheme and Experiment
50
Model:
component, privacy, method, cost
Scheme:
Main idea:
Server randomly selects members to form the group.
Different choices of random distribution lead to different algorithms.
Two instantiations of scheme:
Bounded uniform distribution; Normal distribution
Computation complexity: O(k)
Experiment:
Data from real-world web applications
Low overheads and high uncertainty
26. 7/21/2014
26
51
Privacy Properties
k-Indistinguishability:
For any flow-vector, at least k different actions can trigger it.
Given vector-action set VA, padding algorithm M, range Range(M,VA)
∀ ݒ ∈ ܴܽ݊݃݁ ,ܯ ܸܣ , ܽ: Pr (ܯିଵ ݒ = ܽ > 0 ∧ ܽ ∈ ܣ ≥ ݇
Model the privacy requirement of a traffic padding from two perspectives
Uncertainty:
Apply the concept of entropy in information theory to quantify an
adversary’s uncertainty about the real action performed by a user.
Given vector-action sequence ܸ,ܣ padding algorithm M
߮ ,ݒ ܸ,ܣ ܯ = − Pr (ܯିଵ
ݒ = ܽ logଶ Pr (ܯିଵ
ݒ = ܽ
∈
߶ ܸ,ܣ ܯ = ߮(,ݒ ܸ,ܣ )ܯ × Pr (ܯ ܣ = )ݒ
௩∈ோ(ெ,)
Φ ܸ,ܣ ܯ = ෑ ߶(ܸ,ܣ )ܯ
∈
52
Privacy Properties (cont.)
δ-uncertain k-Indistinguishability:
An algorithm M give δ-uncertain k-Indistinguishability for a
vector-action sequence ܸܣ if
M w.r.t. any ܸܣ ∈ ܸܣ satisfies k-indistinguishability, and
The uncertainty of M w.r.t. ܸܣ is not less than δ.
27. 7/21/2014
27
53
Padding Method
Ceiling padding [15][16]:
Inspired by PPDP: grouping and breaking
Dominant-vector of a padding group:
Size of each group is not less than k, and
every flow-vector in a group is padded to dominant-vector of that group.
Achieves k-
indistinguishability, but
not sufficient if the
adversary possess
prior knowledge.
Random ceiling padding method:
A mechanism M: when responding to an action a (per each user request),
It randomly selects k-1 other actions, and
Pads the flow-vector of action a to be dominant-vector of transient group
(those k actions).
Randomness:
Randomly selects members of transient group from certain candidates
based on certain distributions.
To reduce the cost, change the probability of an action being selected as
a member of transient group.
54
Cost Metrics
Expected processing cost:
How many flow-vectors need to be padded
ݏܿݎ ܸ,ܣ ܯ =
∑ ∑ ୰ ெ()ஷ௩(ೌ,ೡ)∈ೇಲೇಲ∈ೇಲ
∑ ,௩ : ,௩ ∈
ೇಲ∈ೇಲ
Expected padding cost:
The proportion of packet size increases compared to original flow-vectors
Given vector-action sequence ܸ,ܣ padding algorithm M,
(ܽ, :)ݒ
ݏܿ ܽ, ܸ,ܣ ܯ = Pr ܯ ܽ = ݒᇱ × ݒᇱ − ݒ
௩ᇱ∈ோ(ெ,) ଵ
ܸ:ܣ
ݏܿ ܸ,ܣ ܯ = (,ܽ(ݏܿ ܸ,ܣ ))ܯ
,௩ ∈
ܸ:ܣ ݏܿ ܸ,ܣ ܯ = (,ܣܸ(ݏܿ ))ܯ
∈
28. 7/21/2014
28
55
Overview of Scheme
Main idea:
To response a user input, server randomly selects members to form the group.
Different choices of random distribution lead to different algorithms.
Goal:
The privacy properties need to be ensured.
The costs of achieving such privacy protection should be minimized.
Two stage scheme:
Stage 1: derive randomness parameters, one-time, optimization problem;
Stage 2: form transient group, real-time.
56
Scheme (cont.)
Computational complexity: O(k)
Stage 1: pre-calculated only once
Stage 2: select k-1 random actions without duplicate, O(k).
Discussion on privacy:
The adversary cannot collect vector-action set even acting as normal user,
ܸܣ = 100, ݇ = 20, ݉ݎ݂݅݊ݑ ⟹ )݊݅ݐܿܽ ݊ܽ( ݏݑݎ݃ ݐ݊݁݅ݏ݊ܽݎݐ =
99
19
≈ 2
Approximate the distribution is hard: all users share one random process.
Discussion on costs:
Deterministically incomparable with those of ceiling padding.
29. 7/21/2014
29
57
Instantiations of Scheme
Bounded uniform distribution:
ct: cardinality of candidate actions
cl: number of larger candidates
Scheme can be realized in many different ways.
Choose group members from different subsets of candidates and based on
different distributions, in order to reduce costs.
Normal distribution:
µ: mean
σ: standard deviation
0 n
i
action
cl
ct
0 n
i
action
largest smallest largest smallest
probability
[max(0,min(i-cl, |VA|-ct)), min(max(0,min(i-cl, |VA|-ct))+ct, |VA|)]
(return)
58
Uncertainty and Costs v.s. k
The padding and processing costs of all algorithms increase
with k, while TUNI and NORM have less than those of SVMD.
Our algorithms have much larger uncertainty for Drug and
slightly larger for Engine.