MoRE Shadow Walker: The Progression of TLB-Splitting on x86 by Jacob Torrey
This talk covers how the x86 translation lookaside buffer (TLB) can be misused to hide code, how the evolution of the Intel x86 architecture rendered earlier techniques obsolete, and which new techniques make TLB-splitting possible on modern hardware. After the requisite background, the talk moves to the new research: the author's method for splitting a TLB on Core i-series and newer processors, and how it can again be used for defensive purposes (MoRE code-injection detection) and offensive purposes (EPT Shadow Walker rootkit). The talk is very high-level but aims to convey the complexity of the hardware and the attack vectors that exist at the lowest levels of an organization's IT infrastructure.
1. 153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.com
Turbo Talk
Jacob Torrey
@JacobTorrey
MORE SHADOW WALKER: THE PROGRESSION OF TLB-SPLITTING ON X86
2.
• The overwhelming complexity of modern
computer systems creates software-level
security challenges that stem from
hardware-level designs
• Many of these challenges hamper detection of and
protection from threats to your organization
Thesis
3.
• Intel x86 provides the OS with a method to abstract
its view of memory: virtual memory / paging
Background
Virtual Memory
4.
• Every memory access requires several memory bus
transactions to perform page translation
– This is slow!
Background
Page Translation
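The cost of a page translation can be sketched by splitting a virtual address into the indices used at each level of the page tables. This is a minimal illustration that assumes 4-level 64-bit paging with 4 KiB pages (9 index bits per level, 12 offset bits); the function and constant names are made up for the example and do not come from the talk.

```python
# Assumption: 4-level x86-64 paging with 4 KiB pages.
PAGE_SHIFT = 12                       # low 12 bits are the offset within the page
INDEX_BITS = 9                        # each table has 512 entries
LEVELS = ("PML4", "PDPT", "PD", "PT")  # walked from top to bottom


def split_virtual_address(vaddr):
    """Return ({level name: table index}, page offset) for a virtual address."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    indices = {}
    for depth, name in enumerate(LEVELS):
        # The top level uses the highest 9-bit field, the last level the lowest.
        shift = PAGE_SHIFT + INDEX_BITS * (len(LEVELS) - 1 - depth)
        indices[name] = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
    return indices, offset


# Without any caching, one access needs one table read per level
# plus the access to the data itself.
ACCESSES_PER_UNCACHED_LOAD = len(LEVELS) + 1
```

Under these assumptions, every uncached load turns into five memory accesses, which is the slowdown the slide refers to.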
5.
• The solution to this problem is to cache previous
translations in a buffer called the Translation Lookaside
Buffer (TLB)
Background
Translation Lookaside Buffer
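The caching idea can be sketched as a small lookup table in front of the page walk. This is a toy model, not a description of real TLB hardware: the `Tlb` class, its flat `page_table` dictionary, and the `walks` counter are all hypothetical names introduced for illustration.

```python
class Tlb:
    """Toy TLB: caches virtual page number -> physical frame number."""

    def __init__(self, page_table):
        self.page_table = page_table  # stands in for the in-memory page tables
        self.entries = {}             # cached translations
        self.walks = 0                # number of (slow) page walks performed

    def translate(self, vaddr):
        vpn, offset = vaddr >> 12, vaddr & 0xFFF
        if vpn not in self.entries:
            # Miss: pay for the page walk once, then remember the result.
            self.walks += 1
            self.entries[vpn] = self.page_table[vpn]
        return (self.entries[vpn] << 12) | offset
```

Repeated accesses to the same page hit the cache, so only the first one pays for the walk.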
6.
• The CPU’s TLB is used to cache memory page
translations to increase performance.
• TLB splitting means de-synchronizing the instruction and data
entries of a CPU’s Translation Lookaside Buffer (TLB)
(e.g. Shadow Walker or PaX).
Background
TLB Splitting
7.
• In pre-Nehalem CPUs, the D-TLB and I-TLB
were completely separate:
Background
Intel TLB
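With separate instruction and data TLBs, a split can be pictured as two caches that hold different translations for the same virtual page. The sketch below is a toy model under that assumption; `SplitTlbCpu` and `split_page` are hypothetical names, and the way a real tool gets the entries into each TLB is not modelled here.

```python
class SplitTlbCpu:
    """Toy CPU with independent instruction and data TLBs."""

    def __init__(self, memory):
        self.memory = memory  # physical frame number -> frame contents
        self.itlb = {}        # used for instruction fetches
        self.dtlb = {}        # used for data reads

    def fetch(self, vpn):
        return self.memory[self.itlb[vpn]]

    def read(self, vpn):
        return self.memory[self.dtlb[vpn]]


def split_page(cpu, vpn, code_pfn, data_pfn):
    """De-synchronize the two TLBs for one virtual page."""
    cpu.itlb[vpn] = code_pfn
    cpu.dtlb[vpn] = data_pfn
```

After `split_page`, a data read of the page and an instruction fetch from the same address return the contents of two different frames.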
8.
• A CPU’s view of memory is dependent on
how memory is being accessed
• Anti-virus software scanning memory sees one
version of memory, while executing that memory
yields different results
• Demonstrates the difference between perceived
hardware and actual hardware
What does this mean?
9.
• Intel releases the Nehalem architecture
(1st generation Core i-series)
• Adds a second-level cache for the TLB: a
shared TLB, or S-TLB
• Previous TLB-splitting tools do not work
after this major architecture change
– They hang in an endless loop as the S-TLB
merges entries
– Not enough permission granularity
End of an Era
Intel breaks TLB-splitting
10.
• Starting with Nehalem, Intel introduced the
shared TLB (S-TLB):
Background
Intel S-TLB
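Why a shared second level defeats the older approach can be pictured with a toy model. The sketch assumes that a first-level miss is refilled from the S-TLB before any page walk happens; that is a simplification chosen for illustration, not a documented account of Intel's microarchitecture, and `SharedTlbCpu` and `walk_pfn` are hypothetical names.

```python
class SharedTlbCpu:
    """Toy CPU: separate first-level TLBs backed by one shared S-TLB."""

    def __init__(self):
        self.itlb = {}
        self.dtlb = {}
        self.stlb = {}  # shared by instruction and data lookups

    def _lookup(self, first_level, vpn, walk_pfn):
        if vpn not in first_level:
            if vpn not in self.stlb:
                # Only a miss in both levels triggers a page walk;
                # walk_pfn is what the page tables point to right now.
                self.stlb[vpn] = walk_pfn
            first_level[vpn] = self.stlb[vpn]
        return first_level[vpn]

    def fetch(self, vpn, walk_pfn):
        return self._lookup(self.itlb, vpn, walk_pfn)

    def read(self, vpn, walk_pfn):
        return self._lookup(self.dtlb, vpn, walk_pfn)
```

In this model, whichever translation reaches the S-TLB first is handed to both first-level TLBs, so switching the page tables between a fetch and a read no longer produces two different views.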
11.
• Now the hardware is implemented the way it
is used: a full von Neumann machine
model with the same view of memory for
data and code
• End of the story?
What now?
12.
• We believe that the same TLB
de-synchronization used by Shadow Walker
can be used to automatically separate
data references from code in existing
applications, enabling real-time trust
measurements
MoRE
Hypothesis
13.
• The DARPA CFT MoRE program sought
to determine whether TLB splitting could be
used to detect application subjugation even if
an executable’s data and code are mixed
MoRE
Goal
14.
• Built a custom VMX hypervisor with EPT and
VPID support that could monitor process
creation
• Used new CPU capabilities in Nehalem+ CPUs
to “re-break” assumptions. Uses virtualization
capabilities to re-split the TLB, which was
previously thought to be impossible on modern CPUs
MoRE
Design
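One way a hypervisor could regain a split view is through EPT permissions, which carry separate read, write, and execute bits. The sketch below is a toy simulation under that assumption: a guest page starts execute-only, and a simulated EPT violation swaps in a different frame for data accesses. The class and method names are hypothetical, and this is not taken from the MoRE source; the repository is the reference for the actual design.

```python
from dataclasses import dataclass


@dataclass
class EptEntry:
    pfn: int
    read: bool
    write: bool
    execute: bool


class EptSplitter:
    """Toy model of one guest page backed by separate code and data frames."""

    def __init__(self, code_pfn, data_pfn):
        self.code_pfn = code_pfn
        self.data_pfn = data_pfn
        # Start execute-only so that any data access traps to the hypervisor.
        self.entry = EptEntry(code_pfn, read=False, write=False, execute=True)
        self.violations = 0

    def access(self, kind):
        """kind is 'fetch', 'read' or 'write'; returns the frame that is used."""
        allowed = {"fetch": self.entry.execute,
                   "read": self.entry.read,
                   "write": self.entry.write}[kind]
        if not allowed:
            self.violations += 1
            self._handle_violation(kind)
        return self.entry.pfn

    def _handle_violation(self, kind):
        # Hypothetical handler: map the frame matching the access type.
        if kind == "fetch":
            self.entry = EptEntry(self.code_pfn, False, False, True)
        else:
            self.entry = EptEntry(self.data_pfn, True, True, False)
```

In this model each switch between executing and reading the page costs one violation, which is the kind of overhead a real design would need to keep small.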
15.
• Even with the prototype nature of MoRE,
the performance hit was <2%
• Could perform periodic measurements of an
application and the MoRE system (designed
to be measurable) very rapidly – re-verifying
trust every <1/10th of a second!
• Required no modification of the application,
no recompilation, and no source code
MoRE
Results
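A periodic trust measurement can be sketched as hashing an application's code pages and comparing the digest against a known-good baseline. This is only an illustration: the choice of SHA-256, the function names, and the idea of passing pages as byte strings are assumptions, not details from the talk.

```python
import hashlib


def measure(code_pages):
    """Return one digest covering all code pages, in order."""
    digest = hashlib.sha256()
    for page in code_pages:
        digest.update(page)
    return digest.hexdigest()


def is_trusted(code_pages, baseline):
    """True if the current code pages still match the baseline measurement."""
    return measure(code_pages) == baseline
```

Because the split keeps code pages free of the application's own data writes, a measurement like this can be repeated on a timer, and any mismatch points to modified code.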
16.
• TLB-splitting is just a technique – it is
clear that it can be used for both offense
and defense
• MoRE Shadow Walker is a modification to
MoRE that allows memory hiding even
from ring 0 code
– PatchGuard?
• Can split on arbitrary pages on Nehalem
and newer CPUs
MoRE Shadow Walker
Swinging back to the offensive
17.
• The immense complexity of the Intel x86
ISA enables huge architectural
modifications to be effected through
software
– Ex: Turing-complete MMU
• Even as the architecture evolves, so too do the
techniques to misuse it
– Ex: NX bit
Conclusion
18.
• The code for a simple TLB-splitting VMM
(for Windows 7) can be found in AIS’s
GitHub repository:
– http://github.com/ainfosec/MoRE
• Released at Black Hat USA
The code
19.
• @grsecurity & PaX team for helping make
Linux more secure
• @jamierbutler for helping provide
guidance on the CFP submission
• @dotMudge and @DARPA for taking
MoRE from proposal to implementation
• @ainfosec for letting me speak about this
very exciting research area all over the
world
Shout outs