Edgar Barbosa
COSEINC Advanced Malware Labs
SyScan’07
Speaker info
Edgar Barbosa
Security researcher
Currently employed at COSEINC
Experience with reverse engineering of the Windows kernel
and the x86/x64 CPU architecture
Published some articles at rootkit.com
Participated in the creation of BluePill, a hardware
virtualization based rootkit
Content
Part I
How do hardware virtualization rootkits (HVR) work?
Part II
How to detect HVR?
Detection of virtualization rootkits
Hardware virtualization rootkits
Intel and AMD developed virtualization extensions to the
x86 architecture - VT-x and SVM.
There are 2 famous hardware virtualization based rootkits:
Vitriol, created by Dino Dai Zovi – uses Intel VT-x
Bluepill, designed by Joanna Rutkowska – uses AMD SVM
Source code not public
We will focus on the Bluepill rootkit in this presentation, but
the concepts and methods are very similar on the Intel
platform.
Bluepill
Designed by Joanna Rutkowska
Intellectual property of COSEINC
Uses AMD Secure Virtual Machine (SVM) extensions
Runs in 64-bit mode
Supports multicore systems
AMD SVM
SVM stands for “Secure Virtual Machine”
It’s a CPU extension to support Virtual Machine Monitors
(VMMs), a.k.a. hypervisors.
8 new instructions:
VMRUN
VMSAVE
VMLOAD
VMMCALL
CLGI
STGI
SKINIT
INVLPGA
Initialization of a SVM rootkit
Before any SVM instruction can be used, EFER.SVME
must be set to 1.
Trying to execute an SVM instruction with SVME equal to 0
results in a #UD (invalid opcode) exception.
Allocate and initialize the VMCB structure.
The VMCB (Virtual Machine Control Block) address must be
4KB-aligned.
The VMCB describes a virtual machine to be executed.
It contains:
Instructions or events in the guest to be intercepted
Control bits
Guest processor state (general registers, RIP, CR registers, …)
Initialization of a SVM rootkit
After VMCB initialization, set the VM_HSAVE_PA MSR.
This is the physical address where the VMRUN instruction
saves host processor state information.
Then execute the VMRUN instruction with the RAX register value
equal to the physical address of the VMCB.
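The order of these initialization steps can be sketched as a small user-mode model in C. This is only an illustration under stated assumptions: `struct fake_cpu` and every function name are hypothetical, nothing here touches real MSRs, and the SVME bit position (bit 12) and the VM_HSAVE_PA index (0xC0010117) are taken from the AMD manuals rather than from the slides. The model also assumes the host save area must be 4KB-aligned, like the VMCB.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_EFER        0xC0000080u   /* EFER index, as stated later in the deck */
#define MSR_VM_HSAVE_PA 0xC0010117u   /* assumption: index from the AMD manual */
#define EFER_SVME       (1ULL << 12)  /* assumption: SVME is bit 12 of EFER */

/* Hypothetical stand-in for the SVM-relevant state of one CPU. */
struct fake_cpu {
    uint64_t efer;
    uint64_t vm_hsave_pa;
};

static bool is_4k_aligned(uint64_t pa) { return (pa & 0xFFFu) == 0; }

/* Step 1: set EFER.SVME so that SVM instructions stop raising #UD. */
static void enable_svm(struct fake_cpu *cpu) { cpu->efer |= EFER_SVME; }

/* Step 2: record where VMRUN will save the host state. */
static bool set_host_save_area(struct fake_cpu *cpu, uint64_t pa) {
    if (!is_4k_aligned(pa))
        return false;
    cpu->vm_hsave_pa = pa;
    return true;
}

/* Step 3: conditions this model requires before VMRUN with RAX = vmcb_pa. */
static bool vmrun_preconditions_ok(const struct fake_cpu *cpu, uint64_t vmcb_pa) {
    return (cpu->efer & EFER_SVME) != 0
        && cpu->vm_hsave_pa != 0
        && is_4k_aligned(vmcb_pa);
}
```

In a real hypervisor the three steps are WRMSR to EFER, WRMSR to VM_HSAVE_PA, and VMRUN, all executed at CPL 0.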
Initialization of a SVM rootkit
VMRUN instruction
Available only at CPL-0
The CPU enters a new processor mode: guest mode
In guest mode the behavior of some instructions changes
to facilitate virtualization
Performs consistency checks on the host and guest state
Saves the host processor state
Loads the guest processor state configured in the VMCB
The CPU now runs in guest mode until an intercept occurs
#VMEXIT
When an intercept triggers, the processor performs a #VMEXIT
On #VMEXIT the processor:
Disables interrupts
Clears all intercepts
Sets the host CPL to 0
Disables all breakpoints
Checks the reloaded host state for consistency
The reason for the #VMEXIT is saved in the EXITCODE field
of the VMCB structure, with further details in the EXITINFO fields
The Bluepill interception handler routine is then executed
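What an interception handler does with the saved exit reason can be sketched as a dispatch on the exit code. This is a hypothetical illustration, not Bluepill's handler: the action names are invented, and the numeric exit codes are assumptions taken from the AMD manual (exceptions are reported as 0x40 plus the vector number, so #GP, vector 13, arrives as 0x4D).

```c
#include <stdint.h>

/* Assumed exit codes from the AMD manual; verify against your manual revision. */
#define VMEXIT_EXCP_BASE 0x40ULL  /* 0x40 + exception vector */
#define VMEXIT_RDTSC     0x6EULL
#define VMEXIT_MSR       0x7CULL
#define VMEXIT_VMSAVE    0x83ULL

/* Hypothetical actions a stealth hypervisor might take. */
enum action {
    ACT_RESUME,             /* nothing to hide, resume the guest */
    ACT_EMULATE_MSR,        /* e.g. show EFER with SVME cleared */
    ACT_EMULATE_TSC,        /* return a doctored time stamp */
    ACT_INSPECT_EXCEPTION,  /* look at the faulting guest instruction */
    ACT_INJECT_UD           /* pretend SVM instructions do not exist */
};

static enum action handle_vmexit(uint64_t exitcode) {
    /* Exception intercepts occupy 32 consecutive codes, one per vector. */
    if (exitcode >= VMEXIT_EXCP_BASE && exitcode < VMEXIT_EXCP_BASE + 32)
        return ACT_INSPECT_EXCEPTION;

    switch (exitcode) {
    case VMEXIT_MSR:    return ACT_EMULATE_MSR;
    case VMEXIT_RDTSC:  return ACT_EMULATE_TSC;
    case VMEXIT_VMSAVE: return ACT_INJECT_UD;
    default:            return ACT_RESUME;
    }
}
```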
Bluepill hypervisor
Detection of virtualization rootkits
“Undetectable” rootkits
Popek and Goldberg VMM properties:
Efficiency
Resource control
Equivalence
Equivalence “implies that any program executing on a virtual machine must
behave in a manner identical to the way it would have behaved when
running directly on the native hardware” [1]
SVM/VT-x rootkits are only theoretically ‘undetectable’
However, the equivalence principle is not fully respected in the hardware
virtualization extensions
There are computer resources over which the hypervisor does not have full control:
TLB (partially)
Branch prediction
SMP processing
Timing attacks
The most obvious attack against hardware virtualization
rootkits is a timing attack.
We measure the execution time of an instruction that is
probably intercepted and compare the value against a
trusted baseline.
But the AMD and Intel hardware virtualization extensions can
intercept any internal source of timing:
RDTSC
RDMSR
I/O ports
Hardware virtualization even supports a TSC offset value that is
added to every TSC value the guest reads; a negative offset
subtracts the cycles spent inside the hypervisor.
This is why local timing attacks fail.
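A small arithmetic sketch shows why the TSC offset defeats a local measurement. The function names are hypothetical, and the model assumes the hypervisor knows exactly how many cycles its intercept cost and lowers the offset by that amount.

```c
#include <stdint.h>

/* What the guest sees when it reads the TSC: host value plus a signed offset. */
static uint64_t guest_tsc(uint64_t host_tsc, int64_t tsc_offset) {
    return host_tsc + (uint64_t)tsc_offset;   /* wraps modulo 2^64 */
}

/* After an intercept that cost exit_cost cycles, lower the offset to hide it. */
static int64_t hide_exit_cost(int64_t tsc_offset, uint64_t exit_cost) {
    return tsc_offset - (int64_t)exit_cost;
}
```

For example, if an instruction costs 50 cycles natively and the intercept adds 4096 cycles, the host TSC advances by 4146, but after `hide_exit_cost` the guest still measures a delta of 50.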
Detection methods
Methods:
TLB
Branch prediction
Counter-based clock
#GP exceptions
DMA-based attacks will not be discussed because of the new
IOMMU.
TLB
A Translation Lookaside Buffer (TLB) is a CPU cache that is
used to improve the speed of virtual address translation.
Detailed TLB information can be obtained with the CPUID
instruction. It returns information like the number of entries of
each TLB, the type and the associativity of the cache.
Each line in the TLB stores information like:
Tag, used to compare with the virtual address
Physical address, the result of the VA translation
Page attributes
If the translation is not stored in the cache (cache miss), the
system must execute the ‘table-walk’ procedure. This is an
expensive operation in clock cycles.
TLB
The TLB has a limited number of entries.
The contents of each line are not accessible by software.
However, we can fill the TLB by accessing several pages.
The idea is to fill all the TLB entries and measure the time
to access these cached pages. Now we execute a
privileged instruction that must be intercepted by a
hypervisor. If there is a hypervisor running on the system,
it will evict some TLB entries. After executing the
privileged instruction we measure the time to access the
previously cached pages. If they take more time to be
accessed, there is a hypervisor running.
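The comparison step of this probe can be sketched as follows. The sketch covers only the decision logic: it assumes the per-page access times (in cycles) were already collected before and after the privileged instruction, and both `slack` and `min_evicted` are hypothetical tuning parameters that would have to be calibrated on the target CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Count pages whose access became slower by more than `slack` cycles. */
static size_t count_slower(const uint64_t *before, const uint64_t *after,
                           size_t n, uint64_t slack) {
    size_t slower = 0;
    for (size_t i = 0; i < n; i++)
        if (after[i] > before[i] + slack)
            slower++;
    return slower;
}

/* Report a suspected hypervisor if enough TLB entries appear to be evicted. */
static int tlb_probe_suspicious(const uint64_t *before, const uint64_t *after,
                                size_t n, uint64_t slack, size_t min_evicted) {
    return count_slower(before, after, n, slack) >= min_evicted;
}
```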
TLB
The idea of using the TLB to detect a hypervisor was first published
by Peter Ferrie [2]. However, in the second version of his paper
[3], Ferrie states that the TLB method does not work on
AMD-based hypervisors because they can direct the hardware not to
flush the TLB when a hypervisor event occurs.
Ferrie suggests using the CPUID instruction in the TLB
method. But Bluepill doesn’t need to intercept the CPUID
instruction. Another instruction could be used instead:
rdmsr EFER, which Bluepill must intercept.
It is still possible to use the TLB method to detect Bluepill even
if the hypervisor controls the TLB flush! How?
TLB
TLB entries are tagged with ASID (Address Space Identifier) bits to
distinguish between host and guest address spaces.
ASID #0 is assigned to the VMM and #1..#63 to guests.
TLB_CONTROL field:
The VMM can control the TLB flush operations by setting the
TLB_CONTROL field in the VMCB. If set to 1, the VMRUN
instruction will flush the entire TLB (all ASIDs).
Even with an ASID-tagged TLB, we can evict all lines in the TLB. The
number of TLB entries is limited, so lines will be evicted if necessary.
The Opteron primary TLB has only 40 entries [4].
The AMD optimization manual suggests avoiding
TLB_CONTROL = 1 to flush the guest TLB. Instead, it is best to
assign a new ASID to the guest!
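Why ASID tagging does not protect the guest's entries can be illustrated with a toy TLB model. This is a deliberately simplified sketch: a fully associative TLB with round-robin replacement is an assumption for illustration only, and real replacement policies and associativity differ. The 40-entry size follows the Opteron figure quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 40   /* primary TLB size quoted for the Opteron */

struct tlb_entry { bool valid; uint8_t asid; uint64_t vpn; };
struct toy_tlb   { struct tlb_entry e[TLB_ENTRIES]; size_t next; };

/* A translation hits only if both the ASID tag and the page number match. */
static bool tlb_lookup(const struct toy_tlb *t, uint8_t asid, uint64_t vpn) {
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        if (t->e[i].valid && t->e[i].asid == asid && t->e[i].vpn == vpn)
            return true;
    return false;
}

/* Access a page; on a miss, install it over the next slot (round-robin). */
static bool tlb_touch(struct toy_tlb *t, uint8_t asid, uint64_t vpn) {
    if (tlb_lookup(t, asid, vpn))
        return true;
    t->e[t->next] = (struct tlb_entry){ true, asid, vpn };
    t->next = (t->next + 1) % TLB_ENTRIES;
    return false;
}

/* How many of the n pages starting at first_vpn are still cached for asid. */
static size_t count_resident(const struct toy_tlb *t, uint8_t asid,
                             uint64_t first_vpn, size_t n) {
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (tlb_lookup(t, asid, first_vpn + i))
            hits++;
    return hits;
}
```

In this model, a guest (ASID 1) that fills all 40 entries loses five of them as soon as the host (ASID 0) touches five pages of its own, even though no flush was requested.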
Branch prediction
Studies have shown that the behavior of branch instructions is
highly predictable [5]
The execution trace history of a branch instruction can be used to
predict its future behavior.
If a branch is predicted to be taken and this prediction turns out
to be incorrect, there is a huge performance penalty because the
whole pipeline must be flushed.
There are a lot of branch prediction schemes. Explaining these
schemes is out of the scope of this presentation.
There are some very good references about this subject [5]
The branch prediction unit uses a small cache to store the history of
branch instruction execution.
Branch prediction
There is another buffer to store the target address of the branch,
the BTB (Branch Target Buffer )
How can we use the branch prediction unit (BPU) to detect
hypervisor code?
Using the prediction rules of static and dynamic predictors, we
can fill the entries of the branch history tables and measure the
time to execute our code. Now the detector executes a privileged
instruction that will be intercepted if there is a hypervisor running.
The hypervisor code will affect the branch history tables. We
now execute the ‘branch test code’ again without the privileged
instruction and measure the time. If the execution of the
privileged instruction was intercepted, the measured times will be
different.
Branch prediction
The Branch Prediction Unit was successfully used to obtain a
512-bit encryption key by using a Branch Prediction Analysis
(BPA) attack [6]. This attack is based on some interesting
features of the BPU:
The execution history cache is accessed using just a few
low-order bits from the branch instruction address. Two different
addresses can use the same history. This is called Branch Aliasing
or Branch Interference.
The cache is shared between all threads.
The spy thread was running simultaneously with the decryption
thread. Since the two threads were using the same branch
prediction cache (branch aliasing), the spy thread could
determine which branches the decryption thread had taken.
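Branch aliasing, and the way foreign code perturbs a trained history, can be shown with a toy branch history table. This is a textbook-style sketch and not a model of any real AMD or Intel predictor: the 16-entry table, the 2-bit saturating counters and the indexing by the four low-order address bits are all assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define BHT_BITS 4
#define BHT_SIZE (1u << BHT_BITS)

/* One 2-bit saturating counter (0..3) per entry; >= 2 predicts "taken". */
struct toy_bht { uint8_t ctr[BHT_SIZE]; };

/* Only the low-order address bits select the entry, so addresses can alias. */
static unsigned bht_index(uint64_t branch_addr) {
    return (unsigned)(branch_addr & (BHT_SIZE - 1));
}

static bool bht_predict(const struct toy_bht *b, uint64_t addr) {
    return b->ctr[bht_index(addr)] >= 2;
}

/* Train the counter with the real outcome of the branch. */
static void bht_update(struct toy_bht *b, uint64_t addr, bool taken) {
    uint8_t *c = &b->ctr[bht_index(addr)];
    if (taken) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
}
```

If the detector trains a branch at address 0x1003 as taken, and other code then executes a not-taken branch at the aliasing address 0x9993, the detector's branch is mispredicted on its next run, which shows up as extra cycles.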
Branch prediction
It is not possible to use the branch aliasing effect to detect
virtualization rootkits because we do not know the
virtual address where the rootkit code is being executed.
To use our detection method we must know internal details of
the implementation of the branch prediction unit. However, this
information is not easy to find in the CPU manuals.
We can use some very clever benchmarks developed by Milena
Milenkovic, Aleksandar Milenkovic and Jeffrey Kulick [7].
These benchmarks give us very specific details of the BPU,
like the number of entries and associativity of the BTB, the bits
of the branch address that are used as the set index, and
the local branch history length.
Counter based detection
Bluepill can intercept attempts to read the TSC MSR or any
other clock.
However, we can create our own clock by using a simple
counter algorithm.
Using a multicore system, we can create two threads running
concurrently.
Thread #2 can measure how long Thread #1 takes to execute
some routine or instruction.
A synchronization protocol between the threads is necessary
to signal when to start and when to finish the counter increment
operation.
SMP uses shared memory, which can be used for the
synchronization protocol.
Counter based detection
How does this detection method really work?
Basic idea:
Thread #1 will set the ‘start counter’ synchronization signal
and wait for the reply of thread #2.
Thread #1 will execute an instruction that Bluepill must
intercept ( rdmsr EFER ), while thread #2 starts to increment
the counter at the same time.
After the intercepted instruction, thread #1 sets the ‘stop
counter’ synchronization signal and thread #2 stops the
counting process.
Now we check the value of the counter
Counter based detection
Using simple counter code like:
while( stopCounter == 0 )   /* stopCounter must be volatile or atomic */
    counter++;
The results of a ‘rdmsr EFER’ test are:
Counter = 016h for a clean system.
Counter = 02000h for a system running an HVR
The counter code can be optimized using the AMD
optimization manual rules.
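The start/stop protocol described on the previous slides can be sketched with POSIX threads and C11 atomics. This is a user-mode illustration under several assumptions: all names are hypothetical, the threads are not pinned to different cores (a real detector would have to guarantee that), `busy_probe` is only a placeholder for the intercepted instruction (rdmsr EFER needs CPL 0), and the threshold passed to `looks_intercepted` has to be calibrated on a known-clean machine.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

struct counter_clock {
    atomic_int start;    /* thread #1 -> thread #2: get ready */
    atomic_int running;  /* thread #2 -> thread #1: I am about to count */
    atomic_int stop;     /* thread #1 -> thread #2: stop counting */
    uint64_t counter;    /* written by thread #2, read after the join */
};

/* Thread #2: the software clock. */
static void *counter_thread(void *arg) {
    struct counter_clock *c = arg;
    while (!atomic_load(&c->start)) { }
    atomic_store(&c->running, 1);
    uint64_t n = 0;
    while (!atomic_load(&c->stop))
        n++;
    c->counter = n;
    return NULL;
}

/* Thread #1: returns how many ticks thread #2 counted while probe() ran. */
static uint64_t time_with_counter(void (*probe)(void)) {
    struct counter_clock c;
    pthread_t t;
    atomic_init(&c.start, 0);
    atomic_init(&c.running, 0);
    atomic_init(&c.stop, 0);
    c.counter = 0;
    if (pthread_create(&t, NULL, counter_thread, &c) != 0)
        return 0;
    atomic_store(&c.start, 1);
    while (!atomic_load(&c.running)) { }   /* wait for the reply of thread #2 */
    probe();                               /* the instruction under test */
    atomic_store(&c.stop, 1);
    pthread_join(t, NULL);
    return c.counter;
}

/* Placeholder workload; a kernel-mode detector would execute rdmsr EFER here. */
static void busy_probe(void) {
    for (volatile int i = 0; i < 100000; i++) { }
}

/* Compare a measurement against a threshold calibrated on a clean system. */
static int looks_intercepted(uint64_t ticks, uint64_t threshold) {
    return ticks > threshold;
}
```

With the figures from the slide, a threshold anywhere between 016h and 02000h separates the clean system from the one running an HVR; `time_with_counter(busy_probe)` shows how a measurement would be taken.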
Counter based detection
To avoid detection, Bluepill must stop the thread #2 counter as
soon as it intercepts any event.
However, just the #VMEXIT control transfer takes
around 1000h clock cycles!
The CPU#2 Bluepill hypervisor is in ‘sleep mode’ while the
counter runs, and even if CPU#1 sends an IPI (inter-processor
interrupt) to CPU#2, it will take even more time.
The CPU#1 hypervisor doesn’t have access to the CPU#2
register context.
It is too late to change any thread scheduling quantum value
Counter based detection
What if our counter code is interrupted by some external
interrupt, like the clock, at the start of the counter process?
It is good to avoid interrupts in our counter code, but not really
necessary
We can’t guarantee that the counter code will not be
interrupted
The methods for disabling interrupts are interceptable by the rootkit:
Temporarily disable the APIC (interceptable)
CLI instruction (interceptable)
PUSHF and POPF instructions (interceptable)
Solution:
We can run the detection code several times. All we need is an
anomalous counter value.
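One possible way to combine several runs, which is a design choice of this sketch and not something stated in the slides, is to keep the smallest counter value. The assumption is that a stray interrupt can only inflate individual samples, whereas an intercept inflates every sample, so the minimum stays large only when a hypervisor is present. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest counter value over n runs; returns 0 when there are no samples. */
static uint64_t min_sample(const uint64_t *samples, size_t n) {
    if (n == 0)
        return 0;
    uint64_t best = samples[0];
    for (size_t i = 1; i < n; i++)
        if (samples[i] < best)
            best = samples[i];
    return best;
}
```

For example, the clean-system samples 016h, 0900h, 018h (one run disturbed by an interrupt) reduce to 016h, while samples from an intercepted instruction stay near 02000h.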
Counter based detection
Is there a way for the rootkit to detect this detection
method?
It is very difficult. We can implement several different
synchronization routines and algorithms to make sure that
the threads are running concurrently.
There is no time for the rootkit to unload itself to avoid
detection after the intercept.
BP in hibernation-mode
One interesting idea that has been discussed is the possibility of
Bluepill unloading itself while an attack is being executed
and reloading itself after the attack finishes. [8]
That’s a weird idea, because if we know that the rootkit is
unloaded, we can load our own detector hypervisor and wait
for any code trying to get access to SVM resources! Remember that
Bluepill is supposed to be undetectable even if the source is
published.
However, the unload idea can be cleverly used against the next
detection idea. It is interesting to present this attack to see
how virtualization rootkits can use this ‘unload’ trick.
#GP detection
EFER (Extended Feature Enable Register) is a model
specific register (MSR)
It can be accessed by the RDMSR and WRMSR instructions.
The EFER MSR index is 0xC0000080
Before using the AMD SVM extensions, it is necessary to
set the EFER.SVME bit to 1.
Bluepill intercepts all attempts to read or write
EFER.
Is there a way to know the value of the SVME bit without
being intercepted?
VMSAVE instruction
The VMSAVE instruction stores a subset of the processor state into
the VMCB specified by the physical address in the RAX register.
This is a Secure Virtual Machine Instruction.
This instruction generates a #UD exception if SVM is not enabled.
Pseudo code:
VMSAVE and EFER
What happens if we execute the VMSAVE instruction with RAX
containing an invalid physical address, with and without SVM enabled?
If EFER.SVME = 0, the system generates a #UD
exception!
If EFER.SVME = 1, the system generates a #GP
exception!
The VMSAVE instruction microcode is able to read the real
value of the EFER.SVME bit without being intercepted!
We can use the VMSAVE instruction to detect an HVR.
VMSAVE is not the only SVM instruction that can be
used for detection. Take a look at the AMD manuals.
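The two cases above amount to a small decision function. The sketch below only models the behavior described on this slide (the SVME check comes before the address check); the enum and function names are hypothetical, and other fault conditions of VMSAVE, such as the privilege level check, are left out.

```c
#include <stdbool.h>

enum fault { FAULT_NONE, FAULT_UD, FAULT_GP };

/* Which exception VMSAVE raises, given the real EFER.SVME and the RAX check. */
static enum fault vmsave_fault(bool svme, bool rax_valid) {
    if (!svme)
        return FAULT_UD;    /* SVM disabled: invalid opcode */
    if (!rax_valid)
        return FAULT_GP;    /* SVM enabled, bad physical address */
    return FAULT_NONE;
}

/* Detector side: a #GP on a deliberately invalid RAX means SVME is really 1. */
static bool svm_enabled_from_fault(enum fault observed) {
    return observed == FAULT_GP;
}
```

A detector that reads EFER.SVME as 0 through RDMSR but observes #GP from this probe has found an inconsistency that points to a hypervisor hiding the bit.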
Counter-attack I
If the rootkit sets the VMCB to intercept the VMSAVE
instruction, it will not detect the attack, because the system
generates the exception before the VMSAVE intercept is taken.
The rootkit can set the VMCB to intercept #GP
exceptions!
After a #GP exception intercept, the rootkit must verify whether
the guest RIP is pointing to a VMSAVE instruction!
If the VMSAVE instruction is being executed and the guest
EFER.SVME = 0, it can now inject a #UD exception into the
guest.
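The rootkit-side check can be sketched as a byte comparison at the guest RIP. This is a hypothetical illustration: the function names are invented, the VMSAVE encoding 0F 01 DB is taken from the AMD manual rather than from the slides, instruction prefixes are ignored, and reading guest memory at RIP is reduced to a plain byte buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of VMSAVE: 0F 01 DB (prefixes are not handled here). */
static bool is_vmsave(const uint8_t *code, size_t len) {
    return len >= 3 && code[0] == 0x0F && code[1] == 0x01 && code[2] == 0xDB;
}

enum inject { INJECT_GP, INJECT_UD };

/* On a #GP intercept: choose the exception to reflect back into the guest. */
static enum inject on_gp_intercept(const uint8_t *code_at_rip, size_t len,
                                   bool guest_sees_svme) {
    /* The guest believes SVM is off, so VMSAVE must appear to raise #UD. */
    if (is_vmsave(code_at_rip, len) && !guest_sees_svme)
        return INJECT_UD;
    return INJECT_GP;       /* any other #GP is passed through unchanged */
}
```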
#GP attack II
The detection code must not allow the rootkit to detect the
VMSAVE instruction at the guest RIP.
We can use the Translation Lookaside Buffer to hide our
detection code.
Just before executing the VMSAVE instruction, the
detector will change the PTE of the detection code to make it
point to a fake page that contains other code which
generates a #GP exception.
When the rootkit intercepts the #GP, it will not be able to look
at the real code at RIP because the PTE does not point to the real
detection code page. The real address of the detection code is
inside the I-TLB, and there is no way to access the contents of
the TLB cache.
Counter-attack II
The rootkit is not able to find the real detector page because it
is cached in the I-TLB. But it can detect whether the PTE address of the
RIP is fake.
How?
The rootkit will write a 0xCC (INT3) opcode at the RIP address
and restart guest execution at the same RIP.
If the system generates a #BP exception, the page is not fake.
If the system generates a #GP again, the page is fake.
If the rootkit detects such an attack, it cannot know which
exception must be injected into the guest, because the
hidden code can be any instruction able to generate a #GP
exception. If it injects a #UD exception it will be easily
detected!
Counter-attack II
What can the rootkit do now?
It knows that an exception must be generated.
It hooks the guest exception handlers.
Next, it unloads the hypervisor and executes the intercepted
instruction again.
In this case, the instruction will generate the correct exception,
which will be caught by the hooked exception handlers.
Now, the exception handler just needs to load the hypervisor
again!
Due to the #GP attack, every virtualization rootkit must
configure the VMCB to intercept #GP exceptions.
CPU bugs
Is it possible to use CPU bugs to detect an HVR?
Yes, but it is not a reliable way to detect rootkits.
I found that executing the Address-Size Prefix (0x67)
together with the VMSAVE instruction is apparently
able to freeze systems running hypervisors!
A detector which freezes the system is not very useful
outside of lab environments.
Credits
All the cool crypto research papers using CPU
microarchitecture based attacks.
Alexander Tereshkin, for the creation of the
counter-attacks against the #GP exception method to detect
Bluepill.
References
[1] J. Smith and R. Nair. Virtual Machines: Versatile Platforms for Systems and Processes. Morgan Kaufmann, 2005.
[2] http://pferrie.tripod.com/papers/attacks.pdf
[3] http://pferrie.tripod.com/papers/attacks2.pdf
[4] http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html
[5] J. Shen and M. Lipasti. Modern Processor Design: Fundamentals of Superscalar Processors. McGraw-Hill, 2005.
[6] O. Acıiçmez, Ç. Koç and J. Seifert. On the Power of Simple Branch Prediction Analysis. http://eprint.iacr.org/2006/351.pdf
[7] M. Milenkovic, A. Milenkovic and J. Kulick. Demystifying Intel Branch Predictors. http://www.ece.wisc.edu/~wddd/2002/final/milenkovic.pdf
[8] http://blogs.zdnet.com/Ou/?p=297
Questions?
Thank you for your time!

Detecting hardware virtualization rootkits

  • 1. Edgar Barbosa COSEINC Advanced Malware Labs SyScan’07
  • 2. Speaker info: Edgar Barbosa. Security researcher, currently employed at COSEINC. Experience with reverse engineering of the Windows kernel and the x86/x64 CPU architecture. Published some articles at rootkit.com. Participated in the creation of BluePill, a hardware virtualization based rootkit.
  • 3. Content: Part I: How do hardware virtualization rootkits (HVR) work? Part II: How to detect HVR?
  • 5. Hardware virtualization rootkits: Intel and AMD developed virtualization extensions to the x86 architecture, VT-x and SVM respectively. There are two famous hardware virtualization based rootkits: Vitriol, created by Dino Dai Zovi, which uses Intel VT-x, and Bluepill, designed by Joanna Rutkowska, which uses AMD SVM. Source code not public. We will focus on the Bluepill rootkit in this presentation, but the concepts and methods are very similar on the Intel platform.
  • 6. Bluepill: Designed by Joanna Rutkowska. Intellectual property of COSEINC. Uses AMD Secure Virtual Machine (SVM) extensions. Runs in 64-bit mode. Supports multicore systems.
  • 7. AMD SVM: SVM stands for “Secure Virtual Machine”. It is a CPU extension to support Virtual Machine Monitors (VMMs), a.k.a. hypervisors. 8 new instructions: VMRUN, VMSAVE, VMLOAD, VMMCALL, CLGI, STGI, SKINIT, INVLPGA.
  • 8. Initialization of an SVM rootkit: Before any SVM instruction can be used, EFER.SVME must be set to 1. Trying to execute an SVM instruction with SVME equal to 0 results in a #UD (invalid opcode) exception. Allocate and initialize the VMCB structure. The VMCB (Virtual Machine Control Block) address must be 4KB-aligned. The VMCB describes a virtual machine to be executed. It contains: instructions or events in the guest to be intercepted; control bits; guest processor state (general registers, RIP, CR registers, …).
  • 9. Initialization of an SVM rootkit: After VMCB initialization, set the VM_HSAVE_PA MSR. This is the physical address where the VMRUN instruction saves host processor state information. Then execute the VMRUN instruction with the RAX register holding the physical address of the VMCB.
  • 10. Initialization of an SVM rootkit
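The preconditions listed in the initialization slides can be sketched as plain bit and alignment checks. This is a minimal user-mode sketch, not rootkit code: the privileged RDMSR/WRMSR/VMRUN steps are left out, and the function names are hypothetical. The EFER index comes from the slides; the SVME bit position (bit 12), the VM_HSAVE_PA MSR number, and the requirement that the host save area is also 4KB-aligned are taken from the AMD manuals and should be treated as assumptions here.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_EFER        0xC0000080u   /* EFER index, as given in the slides */
#define MSR_VM_HSAVE_PA 0xC0010117u   /* host save area MSR (assumed from the AMD manual) */
#define EFER_SVME       (1ULL << 12)  /* SVM enable bit in EFER (assumed from the AMD manual) */
#define VMCB_ALIGN      4096ULL       /* the VMCB must be 4KB-aligned */

/* True when the SVME bit is set in a raw EFER value. */
bool efer_svme_enabled(uint64_t efer) { return (efer & EFER_SVME) != 0; }

/* EFER value that would be written back to enable SVM. */
uint64_t efer_enable_svme(uint64_t efer) { return efer | EFER_SVME; }

/* True when a physical address satisfies the 4KB alignment rule. */
bool is_4k_aligned(uint64_t pa) { return (pa & (VMCB_ALIGN - 1)) == 0; }

/* Mirrors the order in the slides: SVME set, VMCB aligned, host save
 * area configured. Only if all hold would VMRUN be attempted. */
bool vmrun_preconditions_ok(uint64_t efer, uint64_t vmcb_pa, uint64_t hsave_pa)
{
    return efer_svme_enabled(efer)
        && is_4k_aligned(vmcb_pa)
        && hsave_pa != 0 && is_4k_aligned(hsave_pa);
}
```

In a real driver these checks would surround the actual MSR accesses; here they only make the ordering and the constants explicit.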
  • 11. VMRUN instruction: Available only at CPL 0. The CPU enters a new processor mode: guest mode. In guest mode the behavior of some instructions changes to facilitate virtualization. VMRUN performs consistency checks on the host and guest state, saves the host processor state, and loads the guest processor state configured in the VMCB. The CPU then runs in guest mode until an intercept occurs.
  • 12. #VMEXIT: When an intercept triggers, the processor performs a #VMEXIT. On #VMEXIT the processor: disables interrupts; clears all intercepts; sets the host CPL to 0; disables all breakpoints; checks the reloaded host state for consistency. The reason for the #VMEXIT is saved in the EXITCODE field of the VMCB structure, with additional detail in the EXITINFO fields. Control then passes to the Bluepill interception handler routine.
  • 15. “Undetectable” rootkits: Popek and Goldberg VMM properties: efficiency, resource control, equivalence. Equivalence “implies that any program executing on a virtual machine must behave in a manner identical to the way it would have behaved when running directly on the native hardware” [1]. SVM/VT-x rootkits are only theoretically ‘undetectable’, because the equivalence principle is not fully respected in the hardware virtualization extensions. There are computer resources that the hypervisor does not fully control: the TLB (partially), branch prediction, SMP processing.
  • 16. Timing attacks: The most obvious attack against hardware virtualization rootkits is a timing attack. We measure the execution time of some instruction that is probably intercepted and compare the value against a trusted baseline. But the AMD and Intel hardware virtualization extensions support intercepting any internal source of timing: RDTSC, RDMSR, I/O ports. Hardware virtualization even supports a TSC offset value that is applied to every TSC access attempt. This is the reason that local timing attacks fail.
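Why a local timing check fails can be shown with a small model. This is a sketch under stated assumptions: the guest-visible TSC is modelled as the host TSC plus a signed, hypervisor-controlled offset, and all function names and cycle values are illustrative rather than taken from real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value the guest sees when it reads the TSC: the host TSC adjusted by
 * a hypervisor-controlled signed offset (a model, not a real TSC read). */
uint64_t guest_visible_tsc(uint64_t host_tsc, int64_t tsc_offset)
{
    return host_tsc + (uint64_t)tsc_offset;
}

/* Duration the guest measures for an instruction that really took
 * real_cycles, when the hypervisor hides hidden_cycles of its own
 * overhead by lowering the offset before the second TSC read. */
uint64_t measured_cycles(uint64_t host_start, uint64_t real_cycles,
                         uint64_t hidden_cycles)
{
    uint64_t t0 = guest_visible_tsc(host_start, 0);
    uint64_t t1 = guest_visible_tsc(host_start + real_cycles,
                                    -(int64_t)hidden_cycles);
    return t1 - t0;
}

/* Local timing check against a trusted baseline. */
bool timing_looks_intercepted(uint64_t measured, uint64_t baseline,
                              uint64_t tolerance)
{
    return measured > baseline + tolerance;
}
```

With no hidden cycles the intercept overhead shows up in the measurement; once the hypervisor subtracts its own overhead through the offset, the measurement collapses back to the baseline and the check passes.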
  • 17. Detection methods: TLB, branch prediction, counter-based clock, #GP exceptions. DMA-based attacks will not be discussed due to the new IOMMU unit.
  • 18. TLB: A Translation Lookaside Buffer (TLB) is a CPU cache that is used to improve the speed of virtual address translation. Detailed TLB information can be obtained with the CPUID instruction, which returns information such as the number of entries of each TLB, the type, and the associativity of the cache. Each line in the TLB stores information such as: the tag, used to compare with the virtual address; the physical address, the result of the VA translation; page attributes. If the translation is not stored in the cache (a cache miss), the system must execute the ‘table-walk’ procedure, which is expensive in clock cycles.
  • 19. TLB: The TLB has a limited number of entries. The contents of each line are not accessible by software. However, we can fill the TLB by accessing several pages. The idea is to fill all the TLB entries and measure the time to access these cached pages. Next we execute a privileged instruction that must be intercepted by a hypervisor. If there is a hypervisor running on the system, it will evict some TLB entries. After executing the privileged instruction we measure the time to access the previously cached pages again. If the accesses take more time, there is a hypervisor running.
  • 20. TLB: The idea of using the TLB to detect a hypervisor was first published by Peter Ferrie [2]. However, in the second version of his paper [3], Ferrie states that the TLB method does not work on AMD-based hypervisors because they can direct the hardware not to flush the TLB when a hypervisor event occurs. Ferrie suggests using the CPUID instruction in the TLB method, but Bluepill doesn’t need to intercept the CPUID instruction. Another instruction could be used instead: rdmsr EFER, which Bluepill must intercept. It is still possible to use the TLB method to detect Bluepill even if the hypervisor controls the TLB flush! How?
  • 21. TLB: TLB entries are tagged with ASID (Address Space Identifier) bits to distinguish between different host and/or guest address spaces. ASID #0 is assigned to the VMM and #1..#63 to guests. TLB_CONTROL field: the VMM can control the TLB flush operations by setting the TLB_CONTROL field in the VMCB. If it is set to 1, the VMRUN instruction will flush the entire TLB (all ASIDs). Even with an ASID-tagged TLB, we can evict all lines in the TLB. The number of TLB entries is limited, so new translations evict existing lines when necessary. The Opteron primary TLB has only 40 entries [4]. The AMD optimization manual suggests avoiding TLB_CONTROL = 1 to flush the guest TLB. Instead, it is best to assign a new ASID to the guest!
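The argument that ASID tagging does not protect a full TLB from eviction can be illustrated with a toy simulation. This is a sketch under several assumptions: a fully associative 40-entry TLB (the size quoted in the slides), an LRU replacement policy, and a hypervisor handler that touches a handful of its own pages under ASID 0. Real replacement policies and TLB organizations differ, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLB_ENTRIES 40   /* Opteron primary TLB size quoted in the slides */

typedef struct {
    bool     valid;
    uint8_t  asid;       /* 0 = VMM, 1..63 = guests */
    uint64_t vpn;        /* virtual page number (the tag) */
    uint64_t last_use;   /* LRU timestamp (assumed policy) */
} tlb_line;

typedef struct {
    tlb_line line[TLB_ENTRIES];
    uint64_t tick;
} tlb_model;

/* Looks up (asid, vpn) and returns true on a hit. On a miss the
 * translation is inserted, replacing an invalid line if one exists,
 * otherwise the least recently used line, whatever its ASID. */
bool tlb_access(tlb_model *t, uint8_t asid, uint64_t vpn)
{
    size_t victim = 0;
    t->tick++;
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        tlb_line *l = &t->line[i];
        if (l->valid && l->asid == asid && l->vpn == vpn) {
            l->last_use = t->tick;
            return true;
        }
    }
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (!t->line[i].valid) { victim = i; break; }
        if (t->line[i].last_use < t->line[victim].last_use) victim = i;
    }
    t->line[victim] = (tlb_line){ .valid = true, .asid = asid,
                                  .vpn = vpn, .last_use = t->tick };
    return false;
}

/* Detector flow from the slides: fill the TLB from the guest (ASID 1),
 * let an intercept handler touch hypervisor_pages pages under ASID 0,
 * then count how many of the guest's pages now miss. */
unsigned guest_misses_after_probe(unsigned hypervisor_pages)
{
    tlb_model t;
    unsigned misses = 0;
    memset(&t, 0, sizeof t);
    for (uint64_t vpn = 0; vpn < TLB_ENTRIES; vpn++)
        tlb_access(&t, 1, vpn);
    for (uint64_t p = 0; p < hypervisor_pages; p++)
        tlb_access(&t, 0, 0x1000 + p);
    for (uint64_t vpn = 0; vpn < TLB_ENTRIES; vpn++)
        if (!tlb_access(&t, 1, vpn)) misses++;
    return misses;
}
```

Without a hypervisor the second pass over the guest pages hits every time; once the handler has inserted its own translations, some guest lines are gone and the extra misses are what the timing measurement would pick up.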
  • 22. Branch prediction: Studies have shown that the behavior of branch instructions is highly predictable [5]. The execution trace history of a branch instruction can be used to predict its future behavior. If a branch is predicted to be taken and this prediction turns out to be incorrect, there is a huge performance penalty because the whole pipeline must be flushed. There are a lot of branch prediction schemes. Explaining these schemes is outside the scope of this presentation; there are some very good references on the subject [5]. The branch prediction unit uses a small cache to store the history of branch instruction execution.
  • 23. Branch prediction: There is another buffer that stores the target address of the branch, the BTB (Branch Target Buffer). How can we use the branch prediction unit (BPU) to detect hypervisor code? Using the prediction rules of static and dynamic predictors, we can fill the entries of the branch history tables and measure the time to execute our code. Next the detector executes a privileged instruction that will be intercepted if there is a hypervisor running. The hypervisor code will affect the branch history tables. We then execute the ‘branch test code’ again without the privileged instruction and measure the time. If the execution of the privileged instruction was intercepted, the measured times will be different.
  • 24. Branch prediction: The Branch Prediction Unit was successfully used to obtain a 512-bit encryption key by using a Branch Prediction Analysis (BPA) attack [6]. This attack is based on some interesting features of the BPU: The execution history cache is accessed using just a few low-order bits of the branch instruction address. Two different addresses can use the same history. This is called Branch Aliasing or Branch Interference. The cache is shared between all threads. The spy thread was running simultaneously with the decryption thread. Since the two threads were using the same branch prediction cache (branch aliasing), the spy thread could determine which branches the decryption thread had taken.
  • 26. Branch prediction: It is not possible to use the Branch Aliasing effect to detect virtualization rootkits because we do not know the virtual address where the rootkit code is being executed. To use our detection method we must know internal details of the implementation of the branch prediction unit. However, this information is not easy to find in the CPU manuals. We can use some very clever benchmarks developed by Milena Milenkovic, Aleksandar Milenkovic and Jeffrey Kulick [7]. These benchmarks give us very specific details of the BPU, like the number of entries and associativity of the BTB, the bits of the branch address that are used as the set index, and the local branch history length.
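The fill, probe, and re-measure flow for the branch predictor can be sketched with a toy model. Everything here is an assumption made for illustration: a 16-entry table of 2-bit saturating counters indexed by low-order address bits, and a hypervisor handler whose branches are each seen not-taken twice. Real predictors are considerably more complex, which is exactly why the slides point to the benchmarks in [7]. Because the test code trains every table entry, the unknown address of the hypervisor code does not matter in this model.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BHT_ENTRIES 16   /* hypothetical table size */

typedef struct { uint8_t ctr[BHT_ENTRIES]; } bht_model;

/* Index by a few low-order address bits, so different addresses alias. */
static unsigned bht_index(uint64_t addr)
{
    return (unsigned)((addr >> 2) & (BHT_ENTRIES - 1));
}

/* Executes one branch: returns true if it was mispredicted, then
 * updates the 2-bit saturating counter (0..1 not taken, 2..3 taken). */
bool bht_branch(bht_model *b, uint64_t addr, bool taken)
{
    uint8_t *c = &b->ctr[bht_index(addr)];
    bool predicted_taken = *c >= 2;
    if (taken && *c < 3) (*c)++;
    if (!taken && *c > 0) (*c)--;
    return predicted_taken != taken;
}

/* 'Branch test code': one always-taken branch per table entry.
 * The misprediction count stands in for the measured time. */
unsigned run_test_code(bht_model *b)
{
    unsigned mispredicted = 0;
    for (unsigned i = 0; i < BHT_ENTRIES; i++)
        if (bht_branch(b, 0x1000 + 4u * i, true)) mispredicted++;
    return mispredicted;
}

/* Train the table, let a hypothetical intercept handler execute
 * hv_branches branches (each not taken twice), then re-run the test. */
unsigned mispredictions_after_probe(unsigned hv_branches)
{
    bht_model b;
    memset(&b, 0, sizeof b);
    for (int pass = 0; pass < 4; pass++)
        run_test_code(&b);
    for (unsigned i = 0; i < hv_branches; i++) {
        bht_branch(&b, 0x4000 + 4u * i, false);
        bht_branch(&b, 0x4000 + 4u * i, false);
    }
    return run_test_code(&b);
}
```

In this model a clean run produces no mispredictions in the second pass, while handler branches that push counters below the taken threshold show up as extra mispredictions, which a real detector would observe as a longer execution time.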
  • 27. Counter based detection: Bluepill can intercept attempts to read the TSC MSR or any other clock. However, we can create our own clock by using a simple counter algorithm. On a multicore system, we can create two threads running concurrently. Thread #2 can measure how long thread #1 takes to execute some routine or instruction. A synchronization protocol between the threads is necessary to signal when to start and when to finish the counter increment operation. SMP uses shared memory, which can be used for the synchronization protocol.
  • 28. Counter based detection: How does this detection method really work? Basic idea: Thread #1 sets the ‘start counter’ synchronization signal and waits for the reply of thread #2. Thread #1 executes an instruction that Bluepill must intercept (rdmsr EFER), while thread #2 starts to increment the counter at the same time. After the intercepted instruction, thread #1 sets the ‘stop counter’ synchronization signal and thread #2 stops the counting process. Now we check the value of the counter.
  • 30. Counter based detection: Using a simple counter code like: while( stopCounter == 0 ) counter++; the results of a ‘rdmsr EFER’ test are: Counter = 016h for a clean system. Counter = 02000h for a system running an HVR. The counter code can be optimized using the AMD optimization manual rules.
  • 31. Counter based detection: To avoid detection, Bluepill must stop the thread #2 counter as soon as it intercepts any event. However, just the #VMEXIT control transfer takes around 1000h clock cycles! The CPU#2 Bluepill hypervisor is in ‘sleep mode’ while the counter runs, and even if CPU#1 sends an IPI (inter-processor interrupt) to CPU#2, it will take even more time. The CPU#1 hypervisor doesn’t have access to the CPU#2 register context. It is too late to change any thread schedule quantum value.
  • 32. Counter based detection: What if our counter code is interrupted by some external interrupt, like the clock, at the start of the counting process? It is good to avoid interrupts in our counter code, but not really necessary. We can’t guarantee that the counter code will not be interrupted, because the methods for disabling interrupts are interceptable by the rootkit: temporarily disabling the APIC (interceptable), the CLI instruction (interceptable), the PUSHF and POPF instructions (interceptable). Solution: we can run the detection code several times. All we need is a single anomalous counter value.
  • 33. Counter based detection: Is there a way for the rootkit to detect this detection method? It is very difficult. We can implement several different synchronization routines and algorithms to make sure that the threads are running concurrently. There is no time for the rootkit to unload itself to avoid detection after the intercept.
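The two-thread counter clock described in these slides can be sketched in user mode with POSIX threads and C11 atomics. This is an illustration of the synchronization protocol only: the real detector would run in the kernel, pin the threads to different cores, and use `rdmsr EFER` as the probe. The probe functions below are hypothetical stand-ins, and the `start`/`ready`/`stop` flag names are assumptions.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*probe_fn)(void);

typedef struct {
    atomic_int start;    /* thread #1: 'start counter' signal */
    atomic_int ready;    /* thread #2: reply, counting begins now */
    atomic_int stop;     /* thread #1: 'stop counter' signal */
    uint64_t   counter;  /* written by thread #2, read after join */
} counter_clock;

/* Thread #2: wait for the start signal, reply, then count until stopped. */
static void *counter_thread(void *arg)
{
    counter_clock *c = arg;
    uint64_t n = 0;
    while (!atomic_load(&c->start)) { }
    atomic_store(&c->ready, 1);
    while (!atomic_load_explicit(&c->stop, memory_order_relaxed))
        n++;
    c->counter = n;
    return NULL;
}

/* Thread #1: returns the counter value accumulated while probe() ran,
 * or UINT64_MAX if the counting thread could not be created. */
uint64_t measure_with_counter(probe_fn probe)
{
    counter_clock c;
    pthread_t t;

    atomic_init(&c.start, 0);
    atomic_init(&c.ready, 0);
    atomic_init(&c.stop, 0);
    c.counter = 0;
    if (pthread_create(&t, NULL, counter_thread, &c) != 0)
        return UINT64_MAX;

    atomic_store(&c.start, 1);
    while (!atomic_load(&c.ready)) { }   /* wait for thread #2's reply */
    probe();                             /* instruction under test */
    atomic_store(&c.stop, 1);
    pthread_join(t, NULL);
    return c.counter;
}

/* Hypothetical probes: a cheap one, and a slow one standing in for an
 * instruction whose intercept costs thousands of cycles. */
void probe_nop(void) { }
void probe_slow(void) { for (volatile unsigned i = 0; i < 1000000u; i++) { } }
```

As the slides suggest, a detector would call `measure_with_counter` several times and look for any abnormally large value, since individual runs can be disturbed by interrupts or scheduling.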
  • 34. BP in hibernation-mode: One interesting idea that has been discussed is the possibility of Bluepill unloading itself while some attack is being executed and reloading itself after the attack finishes [8]. That’s a weird idea, because if we know that the rootkit is unloaded, we can load our own detector hypervisor and wait for any code trying to get access to SVM resources! Remember that Bluepill is supposed to be undetectable even if the source is published. However, the unload idea can be cleverly used against the next detection idea. It is interesting to present this attack to show how virtualization rootkits can use this ‘unload’ trick.
  • 35. #GP detection: EFER (Extended Feature Enable Register) is a model-specific register (MSR). It can be accessed by the RDMSR and WRMSR instructions. The EFER MSR index is 0xC0000080. Before using the AMD SVM extensions, it is necessary to set the EFER.SVME bit to 1. Bluepill intercepts all attempts to read or write EFER. Is there a way to know the value of the SVME bit without being intercepted?
  • 36. VMSAVE instruction: The VMSAVE instruction stores a subset of the processor state into the VMCB specified by the physical address in the RAX register. This is a Secure Virtual Machine instruction. It generates a #UD exception if SVM is not enabled. Pseudo code:
  • 37. VMSAVE and EFER: What happens if we execute the VMSAVE instruction with RAX containing an invalid physical address? If EFER.SVME = 0, the system generates a #UD exception! If EFER.SVME = 1, the system generates a #GP exception! The VMSAVE instruction microcode is able to read the real value of the EFER.SVME bit without being intercepted! We can use the VMSAVE instruction to detect an HVBR. VMSAVE is not the only SVM instruction that can be used for detection. Take a look at the AMD manuals.
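The decision logic behind the VMSAVE probe can be written down as a small classifier. This sketch covers only the interpretation of the result: executing VMSAVE with an invalid RAX and catching the exception requires kernel-mode code that is not shown. The function and type names are hypothetical; the vector numbers (#UD = 6, #GP = 13) are the standard x86 values.

```c
#include <stdbool.h>

enum x86_exception { EXC_UD = 6, EXC_GP = 13 };

typedef enum { SVME_CLEAR, SVME_SET, SVME_UNKNOWN } svme_state;

/* Maps the exception raised by VMSAVE with an invalid physical
 * address in RAX to the real state of EFER.SVME. */
svme_state svme_from_vmsave_fault(int vector)
{
    switch (vector) {
    case EXC_UD: return SVME_CLEAR;   /* SVM disabled: invalid opcode */
    case EXC_GP: return SVME_SET;     /* SVM enabled: bad address faults */
    default:     return SVME_UNKNOWN;
    }
}

/* A hypervisor is suspected when an (interceptable) RDMSR of EFER
 * reports SVME = 0 but the VMSAVE probe shows that SVME is really 1. */
bool hypervisor_suspected(bool rdmsr_reports_svme, int vmsave_fault_vector)
{
    return svme_from_vmsave_fault(vmsave_fault_vector) == SVME_SET
        && !rdmsr_reports_svme;
}
```

The check compares two views of the same bit: the one the rootkit can fake through the RDMSR intercept, and the one the instruction itself reveals through its exception.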
  • 38. Counter-attack - I: If the rootkit sets the VMCB to intercept the VMSAVE instruction, it will not detect the attack, because the system will generate an exception before executing VMSAVE. The rootkit can instead set the VMCB to intercept #GP exceptions! After a #GP exception intercept, the rootkit must verify whether the guest RIP is pointing to a VMSAVE instruction! If a VMSAVE instruction is being executed and the guest EFER.SVME = 0, it can now inject a #UD exception into the guest.
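The rootkit-side check described in this counter-attack can be sketched as a byte match at the guest RIP followed by a choice of exception to inject. This is a simplified illustration: it assumes the VMSAVE encoding 0F 01 DB from the AMD manual, ignores instruction prefixes entirely, and uses hypothetical function names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { EXC_UD = 6, EXC_GP = 13 };

/* True when the bytes at the guest RIP encode VMSAVE (0F 01 DB).
 * Prefixes are not handled in this simplified matcher. */
bool rip_points_to_vmsave(const uint8_t *code, size_t len)
{
    return len >= 3 && code[0] == 0x0F && code[1] == 0x01 && code[2] == 0xDB;
}

/* After intercepting a #GP: if the faulting instruction is VMSAVE and
 * the guest believes SVME = 0, inject #UD to keep up the illusion;
 * otherwise pass the original #GP through. */
int exception_to_inject(const uint8_t *guest_code, size_t len, bool guest_svme)
{
    if (rip_points_to_vmsave(guest_code, len) && !guest_svme)
        return EXC_UD;
    return EXC_GP;
}
```

The matcher depends on being able to read the real instruction bytes at RIP, which is the assumption the next slide sets out to break.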
  • 39. #GP attack - II: The detection code must not allow the rootkit to detect the VMSAVE instruction at the guest RIP. We can use the Translation Lookaside Buffer to hide our detection code. Immediately before executing the VMSAVE instruction, the detector changes the PTE of the detection code to make it point to a fake page containing other code that generates a #GP exception. When the rootkit intercepts the #GP, it will not be able to look at the real code at RIP because the PTE no longer points to the real detection code page. The real address of the detection code is inside the I-TLB, and there is no way to access the contents of the TLB cache.
  • 40. Counter-attack II: The rootkit is not able to find the real detector page because it is cached in the I-TLB. But it can detect whether the page that the PTE maps at RIP is fake. How? The rootkit writes a 0xCC (int 3) opcode at the RIP address and restarts guest execution at the same RIP. If the system generates a #BP exception, the page is not fake. If the system generates a #GP again, the page is fake. If the rootkit detects such an attack, it can’t know which exception must be injected into the guest, because the hidden code can be any instruction able to generate a #GP exception. If it injects a #UD exception it will be easily detected!
  • 41. Counter-attack II: What can the rootkit do now? It knows that an exception must be generated. It hooks the guest exception handlers. Next, it unloads the hypervisor and executes the intercepted instruction again. In this case, the instruction will generate the correct exception, which will be caught by the hooked exception handlers. Now the exception handler just needs to load the hypervisor again! Due to the #GP attack, every virtualization rootkit must configure the VMCB to intercept #GP exceptions.
  • 42. CPU bugs: Is it possible to use CPU bugs to detect an HVBR? Yes, but it is not a reliable way to detect rootkits. I found that executing the Address-Size Prefix (0x67) together with the VMSAVE instruction is apparently able to freeze systems running hypervisors! A detector which freezes the system is not very useful outside of lab environments.
  • 43. Credits: All the cool crypto research papers using CPU microarchitecture based attacks. Alexander Tereshkin, for the creation of the counter-attacks against the #GP exception method to detect Bluepill.
  • 44. References: [1] J. Smith and R. Nair. Virtual Machines: Versatile Platforms for Systems and Processes. Morgan Kaufmann, 2005. [2] http://pferrie.tripod.com/papers/attacks.pdf [3] http://pferrie.tripod.com/papers/attacks2.pdf [4] http://www.chip-architect.com/news/2003_09_21_Detailed_Architecture_of_AMDs_64bit_Core.html [5] J. Shen and M. Lipasti. Modern Processor Design: Fundamentals of Superscalar Processors. McGraw-Hill, 2005. [6] O. Acıiçmez, Ç. Koç and J. Seifert. On the Power of Simple Branch Prediction Analysis. http://eprint.iacr.org/2006/351.pdf [7] M. Milenkovic, A. Milenkovic and J. Kulick. Demystifying Intel Branch Predictors. http://www.ece.wisc.edu/~wddd/2002/final/milenkovic.pdf [8] http://blogs.zdnet.com/Ou/?p=297