INSE 6620 (Cloud Computing Security and Privacy)
Attacks on Cloud
Prof. Lingyu Wang
Outline
Co-Residence Attack
Power Attack
Ristenpart et al., Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute
Clouds; Xu et al., Power Attack: An Increasing Threat to Data Centers
The Threat of Multi-Tenancy
In traditional systems, security goal is usually
to “keep bad guys out”
Clouds bring new threats with multi-tenancy:
Multiple independent users share the same physical
infrastructure
So, an attacker can legitimately be in the same
physical machine as the target
The bad guys are next to you...
What Would the Bad Guys do?
Step 1: Find out where the
target is located
Step 2: Try to be co-located with
the target in the same (physical)
machine
Step 2.1: Verify it’s achieved
Step 3: Gather information about
the target once co-located
“Hey, You, Get Off of My Cloud”
Influential: cited by 872 papers as of July 2014
(Google Scholar)
Media coverage:
MIT Technology Review, Network World, Network World (2), Computer World,
Data Center Knowledge, IT Business Edge, Cloudsecurity.org, Infoworld
Attack launched against a commercially available
“real” cloud (Amazon EC2)
Claims up to 40% success in co-residence with the
target VM
First work showing concrete threats in the cloud
Approach Overview
Map the cloud infrastructure to estimate where the
target is located – cartography
Launch probe VMs trying to be co-resident with
target VMs
Use various heuristics to verify co-residence of
two VMs
Exploit cross-VM side-channel leakage to gather
information about the target
[Slide diagram: the classic attack phases – footprinting,
port scanning, discovering vulnerabilities, initial
exploitation, privilege escalation]
Threat Model
Attacker model
Cloud infrastructure provider is trustworthy
Cloud insiders are trustworthy
Attacker is a malicious non-provider-affiliated third
party who can legitimately use cloud provider's
service
Victim model
Victims are other cloud users that have sensitive
information
The Amazon EC2
Xen hypervisor
Domain0 (Dom0) is used to manage guest images,
physical resource provisioning, and access control
rights
Dom0 routes packets and reports itself as a first
hop
[Slide diagram: Xen hypervisor hosting Dom0, Guest1,
and Guest2]
Users may choose to create instances in
2 regions (United States and Europe)
3 availability zones (for fault tolerance)
5 Linux instance types m1.small, c1.medium,
m1.large, m1.xlarge, c1.xlarge
IP Addresses of Instances
An instance may have a public IP
75.101.210.100, which, from outside the cloud,
maps to an external DNS name
ec2-75-101-210-100.compute-1.amazonaws.com
And an internal IP and DNS name
10.252.146.52
domU-12-31-38-00-8D-C6.compute-1.internal
Within the cloud, both domain names resolve
to the internal IP
75.101.210.100 -> ec2-75-101-210-100.compute-
1.amazonaws.com -> 10.252.146.52
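A minimal sketch of how this split view can be observed programmatically; the hostname is the slide's example and may no longer resolve:

    import socket

    # The slide's example external DNS name. Queried from outside EC2 it
    # resolves to the public IP (75.101.210.100); queried from inside an EC2
    # instance, the very same name resolves to the internal IP (10.252.146.52).
    name = "ec2-75-101-210-100.compute-1.amazonaws.com"
    print(socket.gethostbyname(name))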
Network Probing
nmap: TCP connect probes (3-way handshake)
hping: TCP SYN traceroutes
both nmap/hping targeting ports 80 and 443
wget: retrieve web pages up to 1024B
Internal probing from an instance to another
Legitimate w.r.t. Amazon policies
External probing from outside EC2
Not illegal (port scanning itself has been held legal)
Only targeting ports 80/443 – where services are
already running – limiting the ethical implications
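As an illustration, a TCP connect probe (what nmap does for a connect scan) needs nothing more than the standard socket library; the target IP below is the slide's example, used as a placeholder:

    import socket

    def tcp_connect_probe(ip, port, timeout=2.0):
        """Return True if a full 3-way handshake succeeds on ip:port."""
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return True
        except OSError:
            return False

    for port in (80, 443):   # only ports with services running, as in the paper
        print(port, tcp_connect_probe("10.252.146.52", port))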
Step 1: Mapping the Cloud
Hypothesis:
The Amazon EC2 internal IP address space is
cleanly partitioned between availability zones
(likely to make it easy to manage separate network
connectivity for these zones)
Instance types within these zones also show
considerable regularity
Moreover, different accounts exhibit similar
placement.
Mapping the Cloud
20 instances for each of the 15 zone/type pairs, total 300
Plot of internal IPs against zones
Result: Different availability zones correspond to different
statically defined internal IP address ranges.
Mapping the Cloud
20 instances of each type, from another account, zone 3
Plot of internal IPs in Zone 3 against instance types
Result: Instances of the same type correspond loosely to
similar IP address ranges.
Derive IP Address Allocation Rules
Heuristics to label /24 prefixes with both
availability zone and instance type:
All IPs from a /16 are from the same availability zone
A /24 inherits any included sampled instance type.
If multiple instance types, then it is ambiguous
A /24 containing a Dom0 IP address only contains
Dom0 IP addresses. We associate to this /24
the type of the Dom0's associated instances
All /24's between two consecutive Dom0 /24's
inherit the former's associated type.
E.g., 10.250.8.0/24 contained Dom0 IPs whose associated
m1.small instances were in 10.250.9.0/24 and 10.250.10.0/24
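A minimal sketch of the /24 labeling heuristics above; the sampled observations are invented for illustration:

    def prefix24(ip):
        """The /24 prefix of a dotted-quad IPv4 address: '10.250.9.12' -> '10.250.9'."""
        return ".".join(ip.split(".")[:3])

    def label_prefixes(sampled):
        """sampled: iterable of (internal_ip, instance_type) observations."""
        seen = {}
        for ip, itype in sampled:
            seen.setdefault(prefix24(ip), set()).add(itype)
        # a /24 inherits any sampled instance type; multiple types -> ambiguous
        return {p: types.pop() if len(types) == 1 else "ambiguous"
                for p, types in seen.items()}

    print(label_prefixes([("10.250.9.12", "m1.small"),
                          ("10.250.9.200", "m1.small"),
                          ("10.251.1.3", "c1.medium"),
                          ("10.251.1.77", "m1.large")]))
    # -> {'10.250.9': 'm1.small', '10.251.1': 'ambiguous'}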
Mapping 6057 EC2 Servers
Preventing Cloud Cartography
Why prevent it?
Make following attacks harder
Hiding infrastructure/amount of users
What makes mapping easier?
Static local IPs – changing which may complicate
management
External-to-internal IP mapping – preventing it can
only slow down mapping (timing and traceroute are
still possible)
Step 2: Determine Co-residence
Network-based co-resident checks: instances
are likely co-resident if they have:
matching Dom0 IP address
Dom0: 1st hop from this instance, or last hop to victim
small packet round-trip times
Needs a “warm-up” – 1st probe discarded
numerically close internal IP addresses (e.g., within
7)
8 m1.small instances on one machine
Step 2: Determine Co-residence
Verified via a hard-disk-based covert channel
All “instances” are in zone 3
Effective false positive rate of ZERO
Go with a simpler test (sketched below):
Close enough internal IPs? If yes, then traceroute. A
single hop (Dom0) in between? If yes, the test passes
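A sketch of that simpler test, assuming a Linux traceroute binary is on the path; the two-hop parsing is deliberately simplified:

    import subprocess
    from ipaddress import ip_address

    def close_ips(a, b, window=7):
        """Internal IPs numerically within `window` of each other (the heuristic above)."""
        return abs(int(ip_address(a)) - int(ip_address(b))) <= window

    def single_hop(target_ip):
        """True if a traceroute to the target shows exactly one intermediate hop (Dom0)."""
        out = subprocess.run(["traceroute", "-n", "-m", "3", target_ip],
                             capture_output=True, text=True).stdout
        hops = [line for line in out.splitlines()[1:] if line.strip()]
        return len(hops) == 2   # Dom0 plus the target itself

    def likely_coresident(my_internal_ip, target_internal_ip):
        return (close_ips(my_internal_ip, target_internal_ip)
                and single_hop(target_internal_ip))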
Step 3: Exploiting VM Placement
Facts about Amazon placement
Same account never has instances on same
machine (so 8 instances will be placed on 8
machines)
Sequential locality (A stops then B starts, A and B
likely co-resident)
Parallel locality (A and B under different accounts
run at roughly the same time are likely co-resident)
Machines with fewer instances are more likely to
receive new instances (load balancing)
m1.xlarge and c1.xlarge have their own machines
Step 3: Exploiting VM Placement
Strategy 1: Brute-forcing placement
141 out of 1686: a success rate of 8.4%
Strategy 2: Abusing Placement Locality
Attacker instance-flooding right after the target
instances are launched – exploiting parallel locality
Observing instance disappearing/reappearing
Triggering the creation of new instances (elasticity)
40% success rate
(flooding launched 5 minutes after the targets)
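An illustrative sketch of the instance-flooding strategy; it uses today's boto3 API as a stand-in for the 2009-era EC2 tools, and the AMI ID is a placeholder (m1.small is itself a retired instance type):

    import time
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    def flood(n, ami="ami-placeholder"):
        """Launch n probe instances of the same type as the target."""
        resp = ec2.run_instances(ImageId=ami, InstanceType="m1.small",
                                 MinCount=n, MaxCount=n)
        return [i["InstanceId"] for i in resp["Instances"]]

    # wait until shortly after the target instances launch, then flood;
    # run the co-residence test of the previous slides from each probe
    # and terminate the misses to bound cost
    time.sleep(300)   # the paper's ~5-minute delay
    probes = flood(20)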
Step 3: Exploiting VM Placement
“Window” for parallel
locality is quite large
Evidence of sequential
locality
(Each instance is killed
immediately after
probing)
Step 4: Information Leakage
Co-residency affords the ability to:
Launch denial-of-service attacks
Estimate the victim's workload
Extract cryptographic keys via side channels
Mitigations
Co-residence checks:
Prevent identification of dom0/hypervisor
VM placement:
Allow users to control/exclusively use machines
Side channel leaks:
Many methods exist
Limitations: impractical (overhead), application-
specific, or insufficient protection
Also, all of them require knowing all possible
channels in advance
Amazon's response
Amazon downplays report highlighting
vulnerabilities in its cloud service
"The side channel techniques presented are based
on testing results from a carefully controlled lab
environment with configurations that do not match
the actual Amazon EC2 environment."
"As the researchers point out, there are a number
of factors that would make such an attack
significantly more difficult in practice."
http://www.techworld.com.au/article/324189/amazon_downplays
_report_highlighting_vulnerabilities_its_cloud_service
Outline
Co-Residence Attack
Power Attack
Background
The number of servers in data centers surged
from 24 million in 2008 to 35 million in 2012
Power consumption increased by 56%
Very expensive to upgrade existing power
infrastructures
How to add more servers at lower cost?
Power Attack
Solution: Oversubscription
Place more servers than can be supported by the
power infrastructure
Assumption: not all servers will reach peak
consumption (nameplate power ratings) at the
same time
Leaves data center vulnerable to power attack:
Malicious workload that can generate power spikes
on multiple servers/racks/whole data center
Launched as a regular user
Causing DoS to both providers and clients by
triggering the circuit breakers (CBs)
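Toy arithmetic (all numbers made up) showing why oversubscription creates this exposure: the breaker is sized for typical draw, not for the sum of nameplate ratings:

    rack_cb_limit_w = 8000    # hypothetical breaker rating for one rack
    nameplate_w     = 300     # per-server peak (nameplate) power
    typical_w       = 180     # per-server draw under normal load
    servers         = 40      # oversubscribed: 40 * 300 W > 8000 W

    print("normal draw:", servers * typical_w, "W")    # 7200 W -> fine
    print("all at peak:", servers * nameplate_w, "W")  # 12000 W -> CB trips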
Power Distribution in Data Centers
Three tiers:
60-400 kV → transformer → 10-20 kV → switchgear →
400-600 V → UPS/PDUs (power distribution units) →
racks
Circuit breakers at switchgear, PDU, and rack-level
branch circuits
Oversubscription
Google’s analysis
Workload traces collected from real data centers:
search, webmail, and MapReduce
Peak power reaches 96% of rated capacity at rack
level, but 72% at data center level
Oversubscription would allow adding 38% more
servers
A big assumption:
Workloads never reach peak consumption
Benign workloads – maybe; malicious ones – no!
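The 38% figure follows directly from the 72% utilization number; a one-line check:

    # If aggregate peak draw is only 72% of rated capacity, the same capacity
    # supports 1/0.72 ~ 1.39x as many servers.
    print(f"{1 / 0.72 - 1:.0%} more servers")   # ~39%, matching the reported 38%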
Threat Model
Target can be rack, PDU, or data center
Running public services, e.g., IaaS, PaaS, SaaS
Power oversubscription
Power consumption is monitored/managed at rack
level (machine level is too expensive)
Adversary
Hackers, competitors, cyber crime/cyber warfare
Regular user with sufficient resources (large number
of accounts, workload) and a mapping of the cloud
Our focus: How to generate power spikes?
Under IaaS, PaaS, SaaS?
Power Attack in PaaS
PaaS: Attacker can run any chosen applications
Load balancing
Load (utilization) balancing ≠ power balancing
Attack in two stages (see the sketch below)
Utilization reaches 100% (e.g., CPU)
Fine-tune workload to further increase power
consumption (remember utilization ≠ power)
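A sketch of the second stage, assuming numpy is available: once CPU utilization is already pinned at 100%, switch to FLOP-dense work, which draws more power per cycle than an integer spin loop:

    import numpy as np

    def stage2_power_heavy(n=2048, iters=50):
        """Dense matmul keeps the FPU/SIMD units saturated at the same 100% CPU."""
        a = np.random.rand(n, n)
        b = np.random.rand(n, n)
        for _ in range(iters):
            a = (a @ b) / n   # normalize so values stay finite
        return a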
Single Server Test
Goal: find out how workloads affect power
SPEC CPU2006 benchmark suite
Results:
Different workloads have very different power cost
Same CPU (100%) and memory usage ≠ same power
e.g., benchmark 462 vs. 465, 462 vs. 456
Single Server Test
HPL benchmark
Multiple parameters to adjust
Adjust block size NB (how problem is solved)
Results:
Same workload, same CPU and memory
Different power cost under different parameters
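One way to reproduce this kind of measurement on a single Linux/Intel machine is to bracket each parameter setting with readings of the RAPL energy counter; the sysfs path varies by machine, and run_hpl is an assumed helper:

    import time

    RAPL = "/sys/class/powercap/intel-rapl:0/energy_uj"   # package energy, microjoules

    def avg_watts(workload, *args):
        """Average package power over one run (ignores counter wraparound)."""
        with open(RAPL) as f:
            e0 = int(f.read())
        t0 = time.time()
        workload(*args)
        with open(RAPL) as f:
            e1 = int(f.read())
        return (e1 - e0) / 1e6 / (time.time() - t0)

    # e.g., compare avg_watts(run_hpl, nb) for nb in (96, 128, 192, 256)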
Rack Level Test
Results:
Similar to single machine
Attack:
Increasing workload to reach the utilization cap
Further increasing power cost by changing
workload/tuning parameters
Damage Assessment
Overheating
One CPU was overheated, resulting in system failure
CB tripped
in a room with 16 servers
out of which only 4 servers were under attack
It will only get worse in the real world
When memory/I/O devices are attacked as well
With better “power proportionality”
(60% of power consumed when idle in this case)
Power Attack in IaaS
IaaS: More control using VMs; more exposure
Attack vectors:
Parasite attack – attack from inside
Run applications from VMs
Launch DoS attacks on such VMs (more power cost than
normal workload)
Exploit routine operations
Live migration of VMs
Launch parasite attack during migration
Evaluation - Parasite Attack
Parasite attack
Co-resident with victim
Run intensive workload
Launch DoS attacks on such VMs
Results
Normal load: 180 W
Intensive load: 200 W
DoS: 230 W (peak 245 W)
An increase of ~30%
(smurf attack: ICMP broadcast flooding)
Evaluation – VM Migration
Results
During migration, both source and dest. experience
power spikes (memory copy, NICs, CPUs)
Intra-rack and inter-rack (migrating to same server)
Power Attack in SaaS
SaaS: Limited control
Attack vectors: specially crafted requests
Trigger large numbers of cache misses
Floating point operations
Floating point unit (FPU) is more power hungry than the
arithmetic logic unit (ALU)
Divisions rather than additions/multiplications
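A sketch of what such request handlers might look like, in the spirit of the modified RUBiS benchmark on the next slide; all names and sizes are illustrative:

    import random

    BIG = bytearray(64 * 1024 * 1024)   # 64 MB, far larger than the last-level cache

    def coupon_discount(price, rate, rounds=10_000):
        """Division-heavy floating-point path: the FPU draws more power than the ALU."""
        total = 0.0
        for _ in range(rounds):
            total += price / (1.0 + rate)
        return total

    def random_browse(steps=100_000):
        """Random strides defeat the cache and keep the memory bus busy."""
        s = 0
        for _ in range(steps):
            s += BIG[random.randrange(len(BIG))]
        return s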
Power Attack in SaaS
RUBiS online shopping benchmark
Modified to support:
Floating point operations (discount coupons)
Cache misses (continuously browsing at random)
Result: power spikes of 30-40%
Data Center Level Simulations
Based on configurations of Google data center
in Lenoir, NC, USA
“Original” workload based on traces of the data
center and “Attack” includes HPC workloads
Attacking the “peak”, “medium”, and “valley”
regions of the trace
Results
One PDU: 22 min attack trips PDU-level CB
Multi-PDU: 4 attacks, all trip CBs
First 3 attacks recovered due to load balancing
Last attack causes DoS (only 53% of requests
processed) during hours 58-69
Results
DC level attack
possible
Larger scale attacks
require more
resources
Mitigation
Power capping: limit peak consumption
Challenges: a 2-minute sampling window is long enough
for attacks; actually reducing consumption takes even
longer (12 mins) – see the sketch at the end
Server consolidation (shutdown if not in use)
Better power proportionality – more aggressive
oversubscription – more vulnerable
Challenges:
Need to save power
Difficult to monitor power
Difficult to distinguish between users and attacks
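A toy simulation (all numbers invented) of the sampling-window problem mentioned under power capping: a short spike trips the breaker while the capper, averaging over its window, never sees a violation:

    WINDOW_S   = 120     # capper sampling window (2 minutes)
    CB_LIMIT_W = 8000    # breaker threshold
    BASE_W, SPIKE_W, SPIKE_S = 6000, 12000, 30   # 30-second attack spike

    window_avg = (SPIKE_W * SPIKE_S + BASE_W * (WINDOW_S - SPIKE_S)) / WINDOW_S
    print(f"capper sees {window_avg:.0f} W averaged over the window "
          f"({'under' if window_avg < CB_LIMIT_W else 'over'} the {CB_LIMIT_W} W limit)")
    print(f"breaker sees {SPIKE_W} W sustained for {SPIKE_S} s and trips")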