EuroSec2011 (EuroSys2011 workshop) Slide of "Memory Deduplication as a Threat to the Guest OS" by Kuniyasu Suzaki
http://www.iseclab.org/eurosec-2011/program.html
1. Memory Deduplication
as a Threat to the Guest OS
Kuniyasu Suzaki, Kengo Iijima,
Toshiki Yagi, Cyrille Artho
Research Center for Information Security
EuroSec 2011 at Salzburg, April 10
2. Background 1/2
• IaaS (Infrastructure as a Service) type cloud
computing hosts many virtual machines.
– Examples: Amazon EC2, Rackspace, ...
– Physical resources (Memory/Storage) are very important.
• Guest OS images are provided by IaaS vendors.
– Most of their contents are the same.
• Deduplication is used to share identical contents
and reduce the consumption of physical
resources.
3. Background 2/2
• Unfortunately, with memory deduplication the write access time
differs between non-deduplicated pages and
deduplicated pages because of Copy-On-Write.
• An attacker can use this phenomenon for a memory
disclosure attack.
• Our paper presents a real attack on a guest OS
(Windows and Linux).
5. Memory Deduplication 1/2
• Memory deduplication is a technique to share pages that have
the same contents.
– Mainly used for virtual machines.
– Very effective when the same guest OS runs on several virtual
machines.
• Many virtual machine monitors include deduplication
• Implementations may differ
[Figure: the guest pseudo memory of VM1, VM2, and VM(n) is mapped onto the real physical memory.]
6. Memory Deduplication 2/2
• Content-aware deduplication
– Transparent Page Sharing on Disco [SOSP97]
– Satori on Xen [USENIX09]
• Periodic memory-scan deduplication
– Content-Based Page Sharing on VMware ESX [OSDI02]
– Difference Engine on Xen [OSDI08]
– KSM (Kernel Samepage Merging) [LinuxSymp09]
• General-purpose memory deduplication for Linux.
• Used mainly for KVM.
• Our paper uses KSM with KVM virtual machine.
7. KSM: Kernel Samepage Merging
• KSM has 3 states for pages.
– Volatile: contents change frequently (not a candidate)
– Unshared: candidate pages for deduplication
– Shared: deduplicated pages
• Pages are scanned periodically (default: every 20 msec)
– Not all pages are scanned at one time.
– The maximum is 25% of the available memory.
– The time until a page is deduplicated depends on the situation.
• Write accesses to deduplicated pages are handled
by Copy-On-Write.
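The page states above can be observed on a Linux host through the KSM sysfs counters. The following is a minimal Python sketch, assuming the standard /sys/kernel/mm/ksm directory. The helper name read_ksm_stats and its ksm_dir parameter are hypothetical, and the 20 msec interval on this slide presumably corresponds to sleep_millisecs.

```python
from pathlib import Path

# File names from the Linux KSM sysfs interface; the helper is illustrative.
KSM_FILES = ("run", "sleep_millisecs", "pages_to_scan",
             "pages_shared", "pages_sharing",
             "pages_unshared", "pages_volatile")

def read_ksm_stats(ksm_dir="/sys/kernel/mm/ksm"):
    """Return {name: int or None}; None means the file could not be read."""
    stats = {}
    for name in KSM_FILES:
        try:
            stats[name] = int((Path(ksm_dir) / name).read_text().strip())
        except (OSError, ValueError):
            stats[name] = None  # KSM not available, or the file is missing
    return stats

if __name__ == "__main__":
    for name, value in read_ksm_stats().items():
        print(f"{name}: {value}")
```

These counters are visible only on the host, so they help when reproducing the experiment but are not available to an attacker inside a guest VM.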
8. Copy-On-Write (COW)
• When a write access is issued to a deduplicated page, a
fresh copy is created, which accepts the write access.
• Write access time differs between deduplicated
and non-deduplicated pages because of the copying.
• This is a kind of covert channel created by COW.
[Figure: in guest pseudo memory, VM1 (victim) and VM2 (attacker) share a page in real physical memory. After a write access the page is re-created, which causes the access time difference.]
9. Memory Disclosure Attack
• An attacker can use the write access time difference to guess
memory contents on a victim’s VM.
Basic idea of the memory disclosure attack
• Allocate pages with the same contents in the memory of the
attacker’s VM.
• Wait for them to be deduplicated.
• Issue write accesses to the pages. If the pages
are deduplicated, the write access time is longer
than normal.
• This is used to detect a process or an open file on the victim’s VM.
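The three steps of the basic idea can be sketched as follows. This is an illustrative Python outline, not the authors' implementation: the function name probe and its parameters are hypothetical, and interpreter overhead is likely to blur the microsecond-scale difference, so a real measurement would be written in C.

```python
import mmap
import time

PAGE = 4096  # page size used throughout the slides

def probe(candidate: bytes, wait_seconds: float):
    """Return per-page write times (ns) for a page-aligned copy of candidate."""
    n_pages = len(candidate) // PAGE            # a short last page is ignored
    if n_pages == 0:
        return []
    buf = mmap.mmap(-1, n_pages * PAGE)         # anonymous mapping, page-aligned
    buf[:] = candidate[:n_pages * PAGE]         # step 1: same contents as the target
    time.sleep(wait_seconds)                    # step 2: wait for the scanner
    times = []
    for i in range(n_pages):                    # step 3: one-byte write per page
        off = i * PAGE
        value = buf[off] ^ 0xFF
        start = time.perf_counter_ns()
        buf[off] = value                        # slower if the page was shared (COW)
        times.append(time.perf_counter_ns() - start)
    buf.close()
    return times
```

Each page can be probed only once per wait, because the first write breaks the sharing and the page has to be merged again before another measurement.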
10. Implementation detail 1/2
(1) Get the memory image of the victim VM
– To detect a process
• An executable binary file is used, because a loader (e.g., ld-linux.so)
loads most parts of the binary file without change.
• The last page, if shorter than 4KB, is not a target of
matching, because the rest of that 4KB page may hold arbitrary
contents in memory.
– To detect an opened file
• A normal file is used. It is stored in the page cache without change.
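The page selection rule above (match full 4KB pages, skip a short last page) can be sketched in a few lines of Python. The helper name full_pages is hypothetical.

```python
PAGE = 4096  # matching unit from the slides

def full_pages(image: bytes):
    """Return the complete 4KB pages of image, dropping a short last page."""
    n = len(image) // PAGE
    return [image[i * PAGE:(i + 1) * PAGE] for i in range(n)]
```

For example, a 365,308-byte file yields 89 full pages, which agrees with the "89*4KB" figure given for the Apache2 ELF file in the experiments.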
11. Implementation detail 2/2
(2) Compare time difference
• Write one byte to each target page and measure the access
time.
• Compare the time with access times measured beforehand
– for zero-cleared pages (deduplicated) and random-data pages (non-deduplicated).
• Length of waiting time
– Depends on the memory size for memory-scan deduplication.
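One simple way to compare a measurement against the two baselines is a midpoint threshold. The slides do not state the decision rule that was used, so the rule and the function name looks_deduplicated below are assumptions made only for illustration.

```python
def looks_deduplicated(target_us, zero_us, random_us):
    """True if the target time is closer to the deduplicated (zero) baseline."""
    threshold = (zero_us + random_us) / 2.0  # hypothetical midpoint rule
    return target_us > threshold

# Averages (us) reported for Apache2 in the experiments:
print(looks_deduplicated(0.53, zero_us=6.45, random_us=0.37))  # before invocation: False
print(looks_deduplicated(7.56, zero_us=6.24, random_us=0.40))  # after invocation: True
```

With the same rule, the IE6 average before invocation (5.68 us against baselines of 6.59 and 0.27) would already be classified as deduplicated, which is the kind of false positive discussed later.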
12. Problems on implementation
• The memory disclosure attack on memory deduplication
looks simple, but its implementation has some problems.
– 4KB alignment problem
• Caused by memory allocation strategy
– Self-reflection problem
• Caused by memory management of cache and heap
– Run-time modification problem
• Caused by ASLR, swap-out, etc.
13. 4KB alignment problem
• On victim’s VM
– A binary file is loaded with 4KB alignment.
– An opened file is cached with 4KB alignment.
• On attacker’s VM
– The attacker loads the matching contents into heap memory.
– The contents MUST be loaded at a 4KB-aligned address.
• The attacker can use posix_memalign() to get an aligned memory region.
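The alignment requirement can be checked with a short Python sketch. It uses an anonymous mmap, which the operating system returns page-aligned, as a stand-in for posix_memalign() in C. The helper names aligned_buffer and address_of are hypothetical.

```python
import ctypes
import mmap

def aligned_buffer(size):
    """Return an anonymous, page-aligned, writable buffer of at least size bytes."""
    pages = max(-(-size // mmap.PAGESIZE), 1)   # round up to whole pages
    return mmap.mmap(-1, pages * mmap.PAGESIZE)

def address_of(buf):
    """Return the start address of a writable buffer, to verify its alignment."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))

if __name__ == "__main__":
    buf = aligned_buffer(365308)
    print(address_of(buf) % 4096 == 0)
```

If the buffer started at an arbitrary offset, every 4KB page of the copy would straddle two physical pages and no page would match the victim's contents exactly.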
14. Self-reflection problem 1/2
• The contents of a file opened by the attacker’s program are stored
in the page cache by the Linux kernel.
• The attacker loads the matching contents into heap memory.
• The heap contents are deduplicated with the page
cache contents on the same VM! [self-reflection problem]
[Figure: attacker’s program memory on the attacker’s VM]
– open() on the matching target file: the contents are stored in the page cache, which is not controlled by the attacker’s program.
– posix_memalign(); read(): the contents are stored in heap memory.
– The same contents in the heap and the page cache are deduplicated within the attacker’s own memory.
– gettimeofday(); write data to heap; gettimeofday(): the access time is delayed by this self memory deduplication.
15. Self-reflection problem 2/2
• The images on the heap and in the page cache must be different.
• We gzipped the target file and expand it at run time.
– The contents in the page cache are the gzipped image.
[Figure: attacker’s program memory on the attacker’s VM]
– gzopen() on the gzipped matching target file: the gzipped contents are stored in the page cache, which is not controlled by the attacker’s program.
– posix_memalign(); gzread(): the expanded contents are stored in heap memory and differ from the page cache contents.
– gettimeofday(); write data to heap; gettimeofday(): the heap pages can only be deduplicated with memory on another VM.
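The gzip workaround can be sketched in Python with the standard gzip module in place of zlib's gzopen() and gzread(). The function name load_from_gzip and its parameter are hypothetical, and the sketch assumes the uncompressed file has never been read on the attacker's VM.

```python
import gzip
import mmap

def load_from_gzip(gz_path):
    """Expand gz_path into a page-aligned buffer; return (buffer, length)."""
    with gzip.open(gz_path, "rb") as f:     # page cache holds only compressed bytes
        plain = f.read()
    pages = max(-(-len(plain) // mmap.PAGESIZE), 1)
    buf = mmap.mmap(-1, pages * mmap.PAGESIZE)  # anonymous mapping, page-aligned
    buf[:len(plain)] = plain                # expanded image exists only in process memory
    return buf, len(plain)
```

Because the compressed bytes in the page cache differ from the expanded pages, a slow write to the buffer is no longer explained by sharing inside the attacker's own VM.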
16. Run time modification problem
• Some parts of code are modified at run time, which causes:
– False negatives
• ASLR (Address Space Layout Randomization)
– Modern kernels have this security function (Linux 2.6.12 and later,
Windows Vista and later).
– Most text parts are not changed. Even if the pages are scattered, the unit is
4KB, so the memory disclosure attack is not affected by ASLR.
• Swap-out
• Self-modifying code
– Not used by normal applications.
– False positives
• Pre-load mechanism
• Common 4KB code
17. Summary of memory disclosure attack
• The memory disclosure attack is exact matching of 4KB-aligned
pages.
• Limitation
– The attacker can learn
• the existence of a process.
• an opened or downloaded file.
– The attacker cannot learn
• the termination of a process, because the page cache is not cleared.
• which virtual machine has the contents.
– Noise causes false negatives and false positives.
• This includes the timing of deduplication.
18. Experiments
• The memory disclosure attack runs on KSM with
KVM (0.12.4). The host Linux kernel is 2.6.33.5.
• Victim’s VM
– Guest OS: Debian GNU/Linux and Windows XP
– Target applications
• apache2 and sshd on Debian GNU/Linux
• Internet Explorer 6 and Firefox on Windows XP
• Attacker’s VM
– Guest OS: Debian GNU/Linux
– The attacker is assumed to know when the application is invoked.
– Wait 5 minutes after the target application is invoked.
19. Attack on Apache2 and sshd on Linux
Apache2 ELF file: 365,308B (89*4KB)
Average (us)         zero    random   Apache2
Before invocation    6.45    0.37     0.53
After invocation     6.24    0.40     7.56
sshd ELF file: 438,852B (107*4KB)
Average (us)         zero    random   sshd
Before invocation    6.60    0.55     0.45
After invocation     6.51    0.42     7.50
[Plots: before invocation and after invocation. Annotation: some pages are non-deduplicated. Timing? Swap-out? Code modification?]
20. Attack on Firefox and IE6 on Windows XP
Firefox binary file: 912,334B (222*4KB)
Average (us)         zero    random   Firefox
Before invocation    6.45    0.43     1.92
After invocation     6.49    0.37     7.68
[Plots: before invocation and after invocation. Annotation: some pages are deduplicated. Common 4KB code? Pre-load?]
IE6 binary file: 93,184B (22*4KB)
Average (us)         zero    random   IE6
Before invocation    6.59    0.27     5.68
After invocation     6.32    0.68     7.00
[Plots: before invocation and after invocation. Annotation: IE6 has a special optimization?]
21. Detect downloaded file
• The memory disclosure attack can also be applied to
find an opened file on a victim’s VM.
• We confirmed that the “Google logo” file can be
detected if page caching is enabled in Firefox.
• This disclosure attack is dangerous, because it detects
that a home page has been viewed, even if the network is
encrypted by TLS/SSL.
22. Countermeasures
• Satori [USENIX ATC’09] has a mechanism to refuse
memory deduplication for each page.
– Applications have to control their own memory.
• Overshadow [ASPLOS’08] and SP3 [VEE’08] encrypt
memory.
– Unfortunately, this ruins the effect of memory deduplication.
• We plan to develop memory deduplication for read-only
pages.
– This requires monitoring the page table of the guest OS.
23. Discussion
• A security function (memory sanitization) on the guest OS
helps the memory disclosure attack,
– because it reveals the termination of a process.
• The memory disclosure attack can be used for security
education.
– Detect hidden processes.
– Data lifetime is longer than we expect [USENIX Security’05
and ATC’06].
• The memory disclosure attack does not violate any SLA
on cloud computing. It just measures access time.
24. Related Work
• Cross-VM side channel attack, “Hey, you, get off of
my cloud” [CCS’09]
– Monitors the behavior of a physically shared cache of multi-cores
(hyper-threads)
• Cold-boot attack [USENIX Security’08]
– A kind of physical side channel attack, which freezes physical
memory and scans the data.
25. Conclusion
• We presented a memory disclosure attack on memory
deduplication, which uses the write access time difference caused by
Copy-On-Write.
• Experiments detected
– the existence of apache2 and sshd on Linux,
– IE6 and Firefox on Windows XP,
– a downloaded file in the Firefox browser.
• Countermeasures and possible applications were discussed.