EuroSec2011 Slide "Memory Deduplication as a Threat to the Guest OS" by Kuniyasu Suzaki
Memory Deduplication as a Threat to the Guest OS
Kuniyasu Suzaki, Kengo Iijima, Toshiki Yagi, Cyrille Artho
Research Center for Information Security
EuroSec 2011 at Salzburg, April 10
Background 1/2
• IaaS (Infrastructure as a Service) cloud computing hosts many virtual machines.
  – Examples: Amazon EC2, Rackspace, ...
  – Physical resources (memory and storage) are very important.
• Guest OS images are provided by IaaS vendors.
  – Most of their contents are the same.
• Deduplication is used to share identical contents and reduce the consumption of physical resources.
Background 2/2
• Unfortunately, memory deduplication causes an access time difference between deduplicated and non-deduplicated pages, due to Copy-On-Write.
• An attacker can use this phenomenon for a memory disclosure attack.
• Our paper presents a real attack on a guest OS (Windows and Linux).
Memory Deduplication 1/2
• Memory deduplication is a technique for sharing pages that have the same contents.
  – Mainly used for virtual machines.
  – Very effective when the same guest OS runs on several virtual machines.
• Many virtual machine monitors include deduplication.
• Implementations may differ.
[Figure: guest pseudo memory of VM1, VM2, ..., VM(n) mapped onto real physical memory]
Memory Deduplication 2/2
• Content-aware deduplication
  – Transparent Page Sharing on Disco [SOSP97]
  – Satori on Xen [USENIX09]
• Periodic memory-scan deduplication
  – Content-Based Page Sharing on VMware ESX [OSDI02]
  – Difference Engine on Xen [OSDI08]
  – KSM (Kernel Samepage Merging) [LinuxSymp09]
    • General-purpose memory deduplication for Linux.
    • Used mainly for KVM.
• Our paper uses KSM with KVM virtual machines.
KSM: Kernel Samepage Merging
• KSM has 3 states for pages.
  – Volatile: contents change frequently (not a candidate for deduplication)
  – Unshared: candidate pages for deduplication
  – Shared: deduplicated pages
• Pages are scanned periodically (default interval: 20 msec).
  – Not all pages are scanned at once.
  – The maximum is 25% of the available memory.
  – The time until a page is deduplicated depends on the situation.
• Write accesses to deduplicated pages are handled by Copy-On-Write.
Copy-On-Write (COW)
• When a write access is issued to a deduplicated page, a fresh copy is created, which accepts the write.
• The copying causes a write access time difference between deduplicated and non-deduplicated pages.
• COW therefore creates a kind of covert channel.
[Figure: VM1 (victim) and VM2 (attacker) share a page in real physical memory; a write access from the attacker re-creates the page, which causes the access time difference]
Memory Disclosure Attack
• An attacker can use the write access time difference to guess memory contents on a victim’s VM.
Basic idea for the memory disclosure attack
  • Allocate pages with the same contents in the memory of the attacker’s VM.
  • Wait for them to be deduplicated.
  • Issue write accesses to the pages. If the pages have been deduplicated, the write access time is longer than normal.
• Used to detect a process or an open file on the victim’s VM.
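The write-and-time step of the basic idea can be sketched in C as follows. This is a minimal sketch, not the authors' code: the helper names `time_one_write_ns` and `time_page_writes` and the constant `PAGE_BYTES` are hypothetical, and it uses `clock_gettime(CLOCK_MONOTONIC)` in place of the `gettimeofday()` calls shown on the slides. It assumes the caller has already filled `pages` with the candidate contents and waited for the host to scan them.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define PAGE_BYTES 4096u  /* assumed page size; hypothetical name */

/* Nanoseconds taken by a single one-byte write to the start of `page`. */
static int64_t time_one_write_ns(volatile unsigned char *page)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    page[0] = (unsigned char)(page[0] + 1); /* a shared page would be copied here */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (int64_t)(t1.tv_nsec - t0.tv_nsec);
}

/* Time one write per page; write_ns must have room for npages entries. */
static void time_page_writes(unsigned char *pages, size_t npages,
                             int64_t *write_ns)
{
    assert(pages != NULL && write_ns != NULL);
    for (size_t i = 0; i < npages; i++)
        write_ns[i] = time_one_write_ns(pages + i * PAGE_BYTES);
}
```

Each page can be probed only once per waiting period, because the write itself breaks the sharing and the page must be merged again before the next measurement.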
Implementation detail 1/2
(1) Get memory image of victim VM
  – To detect a process
    • An executable binary file is used, because a loader (e.g. ld-linux.so) loads most parts of the binary file without change.
    • The last page, which is shorter than 4KB, is not a matching target, because the rest of that 4KB page may hold arbitrary memory contents.
  – To detect an opened file
    • A normal file is used. It is kept in the page cache without change.
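The rule that only complete 4KB pages are matched reduces to integer division of the file size. The sketch below uses hypothetical helper names (`matchable_pages`, `tail_bytes`); applied to the file sizes reported in the experiments, it gives the same page counts as the slides (for example, 365,308B yields 89 pages).

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES 4096u  /* assumed page size; hypothetical name */

static_assert((PAGE_BYTES & (PAGE_BYTES - 1u)) == 0u,
              "page size must be a power of two");

/* Number of complete 4KB pages in a file; only these are matched. */
static size_t matchable_pages(size_t file_size)
{
    return file_size / PAGE_BYTES;
}

/* Bytes in the trailing partial page, which is skipped (0 if none). */
static size_t tail_bytes(size_t file_size)
{
    return file_size % PAGE_BYTES;
}
```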
Implementation detail 2/2
(2) Compare time difference
• Write one byte to each target page and measure the access time.
• Compare the time with access times measured beforehand for
  – zero-cleared pages (deduplicated) and random-data pages (non-deduplicated).
• Length of waiting time
  – For memory-scan deduplication, it depends on the size of memory.
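One way to turn the comparison into a decision is to place a threshold between the two baselines. The slides do not say how the threshold is chosen, so the midpoint rule below is an assumption, and `struct baseline` and `looks_deduplicated` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Baseline write times in microseconds, measured on the attacker's VM. */
struct baseline {
    double dedup_us;     /* zero-cleared pages: expected to be merged   */
    double nondedup_us;  /* random-data pages: expected to stay private */
};

/* Assumed rule: a write slower than the midpoint of the two baselines
 * is treated as a write to a deduplicated page. */
static bool looks_deduplicated(const struct baseline *b, double write_us)
{
    assert(b->dedup_us > b->nondedup_us);
    double threshold = (b->dedup_us + b->nondedup_us) / 2.0;
    return write_us > threshold;
}
```

With the Apache2 averages from the experiments, the value before invocation (0.53 us against baselines of 6.45 and 0.37) falls below the midpoint, and the value after invocation (7.56 us against 6.24 and 0.40) falls above it.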
Problems on implementation
• The memory disclosure attack on memory deduplication looks simple, but there are some problems.
  – 4KB alignment problem
    • Caused by memory allocation strategy
  – Self-reflection problem
    • Caused by memory management of cache and heap
  – Run time modification problem
    • Caused by ASLR, swap-out, etc.
4KB alignment problem
• On victim’s VM
  – A binary file is loaded with 4KB alignment.
  – An opened file is cached with 4KB alignment.
• On attacker’s VM
  – The attacker loads the matching contents into heap memory.
  – The contents MUST be loaded with 4KB alignment.
    • The attacker can use posix_memalign() to get an aligned memory region.
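A minimal sketch of the aligned allocation, assuming a 4KB page size. The helper name `alloc_aligned_pages` is hypothetical; `posix_memalign()` is the call named on the slide.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 4096u  /* assumed page size; hypothetical name */

/* Allocate `npages` zero-filled pages starting on a 4KB boundary.
 * Returns NULL on failure; release the buffer with free(). */
static unsigned char *alloc_aligned_pages(size_t npages)
{
    void *buf = NULL;
    if (posix_memalign(&buf, PAGE_BYTES, npages * PAGE_BYTES) != 0)
        return NULL;
    assert((uintptr_t)buf % PAGE_BYTES == 0);
    memset(buf, 0, npages * PAGE_BYTES);
    return buf;
}
```

The candidate contents would then be copied into this buffer page by page, so that each 4KB unit lines up with a page frame as it does on the victim's VM.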
Self-reflection problem 1/2
• The contents of a file opened by the attacker’s program are stored in page cache memory by the Linux kernel.
• The attacker loads the matching contents into heap memory.
• The heap memory contents are deduplicated with the page cache contents on the same VM! [self-reflection problem]
[Figure: attacker’s program and memory on the attacker’s VM. open() stores the matching target file in the page cache, which the attacker’s program does not control; posix_memalign() and read() store the same contents on the heap; the two copies are deduplicated within the attacker’s own memory, so the write to the heap, timed between two gettimeofday() calls, is delayed by self memory deduplication]
Self-reflection problem 2/2
• The images on the heap and in the page cache must be different.
• We gzipped the target file and expanded it at run time.
  – The page cache then holds the gzipped image.
[Figure: attacker’s program and memory on the attacker’s VM. gzopen() stores the gzipped matching target file in the page cache, which the attacker’s program does not control; posix_memalign() and gzread() store the expanded contents on the heap; the two copies differ, so the heap pages can only be deduplicated with memory on another VM, and the write to the heap is timed between two gettimeofday() calls]
Run time modification problem
• Some parts of code are modified at run time, and cause
  – False negatives
    • ASLR (Address Space Layout Randomization)
      – Modern kernels have this security function (since Linux 2.6.12 and Windows Vista).
      – Most text parts are not changed. Even if the pages are scattered, the unit is 4KB, so the memory disclosure attack is not affected by ASLR.
    • Swap-out
    • Self-modifying code
      – Not used by normal applications.
  – False positives
    • Pre-load mechanism
    • Common 4KB code
Summary of memory disclosure attack
• The memory disclosure attack is 4KB-aligned exact matching.
• Limitations
  – The attacker can learn
    • the existence of a process.
    • that a file was opened or downloaded.
  – The attacker cannot learn
    • the termination of a process, because the page cache is not cleared.
    • which virtual machine holds the contents.
  – Noise causes false negatives and false positives.
    • This includes the timing of deduplication.
Experiments
• The memory disclosure attack runs on KSM with KVM (0.12.4). The host Linux kernel is 184.108.40.206.
• Victim’s VM
  – Guest OS: Debian GNU/Linux and Windows XP
  – Target applications
    • apache2 and sshd on Debian GNU/Linux
    • Internet Explorer 6 and Firefox on Windows XP
• Attacker’s VM
  – Guest OS: Debian GNU/Linux
  – Assumed to know when the target application is invoked
  – Waits 5 minutes after the target application is invoked
Attack for Apache2 and sshd on Linux
Apache2 ELF file: 365,308B (89 * 4KB)
Average (us)         zero    random   Apache2
Before invocation    6.45    0.37     0.53
After invocation     6.24    0.40     7.56
sshd ELF file: 438,852B (107 * 4KB)
Average (us)         zero    random   sshd
Before invocation    6.60    0.55     0.45
After invocation     6.51    0.42     7.50
[Figures: write access times before and after invocation for Apache2 and sshd. Annotation: Non-deduplicated pages. Timing? Swap-out? Code modification?]
Attack for Firefox and IE6 on Windows XP
Firefox binary file: 912,334B (222 * 4KB)
Average (us)         zero    random   Firefox
Before invocation    6.45    0.43     1.92
After invocation     6.49    0.37     7.68
IE6 binary file: 93,184B (22 * 4KB)
Average (us)         zero    random   IE6
Before invocation    6.59    0.27     5.68
After invocation     6.32    0.68     7.00
[Figures: write access times before and after invocation for Firefox and IE6. Annotations: Deduplicated pages. Common 4KB code? Pre-load? IE6 has special optimization?]
Detect downloaded file
• The memory disclosure attack can also be applied to find an opened file on a victim’s VM.
• We confirmed that the “Google logo” file can be detected if page caching is enabled in Firefox.
• This disclosure attack is dangerous because it detects that a home page was viewed, even if the network traffic is encrypted by TLS/SSL.
Countermeasure
• Satori [USENIX ATC’09] has a mechanism to refuse memory deduplication for each page.
  – Applications have to control their own memory.
• Overshadow [ASPLOS’08] and SP3 [VEE’08] encrypt memory.
  – Unfortunately, this ruins the effect of memory deduplication.
• We plan to develop memory deduplication for read-only pages.
  – This requires monitoring the page tables of the guest OS.
Discussion
• A security function (memory sanitization) on the guest OS helps the memory disclosure attack,
  – because it reveals the termination of a process.
• The memory disclosure attack can be used for security education.
  – Detect a hidden process.
  – Data lifetime is longer than we expect [USENIX Security’05 and ATC’06].
• The memory disclosure attack does not violate any SLA on cloud computing. It just measures access time.
Related Work
• Cross-VM side channel attack, “Hey, you, get off of my cloud” [CCS’09]
  – Monitors the behavior of a physically shared cache on multi-cores (hyper threads).
• Cold-boot attack [USENIX Security’08]
  – A kind of physical side channel attack, which freezes physical memory and scans the data.
Conclusion
• We presented a memory disclosure attack on memory deduplication, which uses the write access time difference caused by Copy-On-Write.
• Experiments detected
  – the existence of apache2 and sshd on Linux,
  – IE6 and Firefox on Windows XP,
  – a downloaded file in the Firefox browser.
• Countermeasures and applicable uses were discussed.