Summary of Linux kernel
Security Protections and its
associated attacks
Shubham Dubey
About me
Shubham Dubey
Security Researcher @ Microsoft
nixhacker.com
/in/shubham0d
AGENDA
• Introduction to kernel configuration and Memory mapping
• Kernel Self protection techniques – Software
• Kernel Self protection/mitigation techniques - Hardware
• Bonus kernel security projects
• Plan: cover the maximum number of protections rather than going in
depth on any single one
CONFIGURATION support for linux kernel
• The Linux kernel can be configured at compile time using configuration
parameters that enable or disable features.
• They usually follow the CONFIG_* naming convention.
• Once the kernel is built, these parameters cannot be modified.
• To check the status of a parameter in the running kernel:
• zcat /proc/config.gz | grep CONFIG_DEBUG_RODATA
• grep CONFIG_DEBUG_RODATA /boot/config-`uname -r`
• Each security protection discussed here has its own specific configuration parameter.
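The checks above can be wrapped in a small helper. A minimal sketch (POSIX sh, names are my own): the classifier only looks at config text on stdin, so it works on `zcat /proc/config.gz` output or a `/boot/config-*` file alike.

```shell
#!/bin/sh
# Sketch: classify a CONFIG_* parameter from kernel config text on stdin.
# Example: zcat /proc/config.gz | config_status CONFIG_STRICT_KERNEL_RWX
config_status() {
    # $1: parameter name, e.g. CONFIG_STRICT_KERNEL_RWX
    line=$(grep "^$1=" || true)          # reads the config text from stdin
    case "$line" in
        "$1=y") echo "built-in" ;;       # compiled into the kernel image
        "$1=m") echo "module"   ;;       # built as a loadable module
        "")     echo "not set"  ;;       # disabled or absent
        *)      echo "${line#*=}" ;;     # string/number-valued parameter
    esac
}
```

On a running system the same function can consume `/boot/config-$(uname -r)` via redirection.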
Linux kernel Memory layout
Linux kernel Memory layout – Low Memory
• This is linearly (1:1) mapped memory.
• The Linux kernel image resides here.
• Usually 896MB on 32-bit architectures; on 64-bit the direct map can
cover far more (terabytes of RAM).
• The virtual address where lowmem is mapped is defined by
PAGE_OFFSET
• Memory allocated by kmalloc() resides in lowmem and it is physically
contiguous.
Linux kernel Memory layout – High Memory
• This is an arbitrary mapped memory.
• Mostly introduced for 32-bit systems: due to the virtual address
space limitation, not everything can be mapped into lowmem all the time.
• The virtual base address where high memory is defined is high_memory.
• There are multiple types of mappings in the highmem area:
• Multi-page permanent mappings (vmalloc, ioremap)
• Temporary 1 page mappings (atomic_kmap)
• Permanent 1 page mappings (kmap, fix-mapped linear addresses)
Strict memory
permission
Strict kernel memory permissions
2006
CONFIG_DEBUG_RODATA
CONFIG_ARM_KERNMEMPERMS
2016
CONFIG_DEBUG_RODATA
CONFIG_DEBUG_ALIGN_RODATA
2017
CONFIG_STRICT_KERNEL_RWX
CONFIG_STRICT_MODULE_RWX
Makes kernel and module rodata read-only: non-writable and non-executable.
Makes kernel and module text executable but non-writable (W^X).
This protects against the rare chance that attackers might find and use ROP gadgets that exist in the rodata
section.
Adds an additional section-aligned split of rodata from kernel text so it can be made explicitly non-executable. This padding may waste memory space to gain the additional protection.
Major changes in memory permission
2006 (v2.6)
Introduced CONFIG_DEBUG_RODATA
link
2006(v2.6)
Added file_operation structs
link
2006(v2.6)
Included kernel_params structure
link
2006(v2.6)
Included kallsyms data
link
2009(v2.6)
Made text section writable for hooks
link
2010(v2.6)
Added NX protection
link
2010(v2.6)
Included module permission
CONFIG_DEBUG_SET_MODULE_RONX
link
2014(v3.19)
Introduced for ARM architecture
link
Major changes in memory permission -
contd
2015 (v4.0)
Introduced for ARM64
link
2016(v4.6)
Introduced
DEBUG_ALIGN_RODATA
link
2016(v4.7)
Added fault_info table
link
2017(v4.11)
Renaming to STRICT_*_RWX
link
2017(v4.14)
Introduced for PPC32
link
2020(v5.7)
Removed
DEBUG_ALIGN_RODATA
link
2020(v5.8)
Refuse to load modules that
don’t enforce W^X
link
Limitation of Strict memory permission
• A kernel module/component can modify page permissions using helper
functions available in the Linux kernel.
• One such function is set_memory_rw, part of the set_memory_*
function family.
• It is not exported for direct use, but can still be called manually (e.g. by resolving its address).
Randomization
at Kernel space
Requirement of KASLR - Background
• An attacker can use a kernel vulnerability to insert malicious code into the
kernel address space by various means and redirect the kernel's execution
to that code.
• One method used to get root privilege:
commit_creds(prepare_creds());
• These attacks rely on knowing where symbols of interest live in the kernel's
address space.
• Those locations change between kernel versions and distribution builds, but
are known (or can be figured out) for a particular kernel.
• ASLR disrupts that process and adds another layer of difficulty to an attack.
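Why these leaks matter is easy to see from how simply a symbol address is recovered. A hypothetical sketch of the lookup an exploit performs against /proc/kallsyms-format data (the addresses in the test are made up):

```shell
#!/bin/sh
# Sketch: resolve a kernel symbol from kallsyms-format text on stdin.
# Lines look like: "<address> <type> <name>". With KASLR the address
# changes every boot; with kptr_restrict an unprivileged read shows zeros.
ksym_addr() {
    awk -v sym="$1" '$3 == sym { print $1; exit }'
}
```

Real use would be `ksym_addr commit_creds < /proc/kallsyms` as a privileged user.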
Kernel Address Randomization Timeline
2006 (v2.6)
Introduced
CONFIG_RELOCATABLE
link
2013(v3.10)
Introduced KASLR for x86/64
link
2013(v3.14)
Introduced
CONFIG_RANDOMIZE_BASE
link
2014(v3.15)
Randomization for modules
link
2016(v4.6)
Introduced for ARM64
link
2016(v4.7)
Introduced for MIPS
link
2016(v4.8)
Randomize kernel memory
range
link
2021(v5.13)
Randomization of kernel stack
link
Consequences of KASLR in linux kernel
• The kernel previously used to be at the very start of lowmem, but now any
placement relative to memory ranges is possible.
• The kernel can be separated from the lowmem area. In theory, KASLR can put the
kernel anywhere in the range [16M, MAXMEM) on 64-bit, and [16M,
KERNEL_IMAGE_SIZE) on 32-bit.
• Load addresses of modules are randomized in the kernel to make KASLR
effective for modules.
• Both physical and virtual addresses are randomized.
• Other kernel memory regions are also randomized, like the physical mapping,
vmalloc and vmemmap regions, using CONFIG_RANDOMIZE_MEMORY.
• Later, the randomized kernel stack offset (randstack) feature was introduced link
CONFIG_RELOCATABLE
• This builds a kernel image that retains relocation information, enabling a
kernel binary to be loaded and run from a physical address different from
the one it was compiled for.
• This involves processing the dynamic relocations in the image in the early
stages of booting.
• Works by building the kernel as a Position Independent Executable (PIE),
which retains all relocation metadata required to relocate the kernel binary
at runtime to a different virtual address.
• Runtime relocation is possible since relocation metadata are embedded
into the kernel.
• Can read more about the internals here
CONFIG_RANDOMIZE_BASE
• Depends on CONFIG_RELOCATABLE
• With CONFIG_RANDOMIZE_BASE set, it randomizes the address at
which the kernel is decompressed at boot.
• It deters exploit attempts relying on knowledge of the location of
kernel internals.
• Entropy is generated using the RDRAND instruction; if it is not supported
by the CPU, RDTSC is used.
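Whether the boot stub can use RDRAND is visible in the CPU feature flags. A small sketch that parses /proc/cpuinfo-style text (the flags string in the test is illustrative):

```shell
#!/bin/sh
# Sketch: test for a CPU feature flag in /proc/cpuinfo-format text on stdin.
# Real use:  cpu_has_flag rdrand < /proc/cpuinfo
cpu_has_flag() {
    grep '^flags' | head -n 1 | tr ' ' '\n' | grep -qx "$1"
}
```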
CONFIG_RANDOMIZE_MEMORY
• Introduced in 2016 kernel v4.8
• When enabled the direct mapping of all physical memory,
vmalloc/ioremap space and virtual memory map are randomized.
• Works by randomizing base address of each sections.
• Order is preserved but their base offset differ.
• This makes exploits relying on predictable memory locations less
reliable.
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT – Background
• Introduced in 2021 kernel v5.13.
• This feature is based on the original idea from GRSecurity PaX’s
RANDKSTACK feature.
• Linux assigns two pages of kernel stack to every task. This stack is
used whenever the task enters the kernel (system call, device
interrupt, CPU exception, etc).
• By the time the task returns to userland, the kernel stack pointer
will be back at the point of the initial entry to the kernel thread stack.
• This means that a userland-originating attack against a kernel bug
would always find itself at the same place on the task's kernel stack.
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT – Overview
• This feature aims to make various stack-based attacks that rely on
deterministic stack structure harder.
• The goal of the randomize_kstack_offset feature is to add a random offset
after pt_regs has been pushed to the stack and before the rest of
the thread stack is used during the transition.
• If the stack offset is randomized on each system call, it is harder for an
attacker to reliably land in any particular place on the thread stack.
• Even if an address is exposed, the stack offset will change on the next
syscall.
PAX RANDSTACK vs randomize_kstack_offset
KASLR limitation and Bypasses - Infoleaks
• The most common way kernel vulnerabilities are exploited with KASLR on
is via info leaks: either a pointer leak (a pointer to a struct or to a
heap/stack area) or a content leak.
• Raw kernel pointers were frequently printed to the kernel debug log.
• Bugs which trigger a kernel oops can be used to leak kernel pointers.
• Leaks can happen due to uninitialized stack variables. Reference
• Examples: CVE-2019-10639 (remote kernel pointer leak), CVE-2017-14954
KASLR limitation and Bypasses
• Low entropy - there are only so many locations the kernel can fit in,
so an attacker could guess without too much trouble.
• Arbitrary read/write - CVE-2017-18344
• Heap spraying using msgsnd()->msg_msg struct - CVE-2021-26708,
CVE-2021-43267, CVE-2021-22555
• Hardware attacks and side channels – BlindSide attack
• Each vulnerability exploit has its own story.
More on kernel address leaks protection
kptr_restrict - This indicates whether restrictions are placed on exposing
kernel addresses via /proc and other interfaces.
• When kptr_restrict is set to (1), kernel pointers printed using the %pK
format specifier will be replaced with 0’s unless the user has CAP_SYSLOG.
dmesg_restrict - This indicates whether unprivileged users are prevented
from using dmesg to view messages from the kernel's log buffer.
• When dmesg_restrict is set, users must have CAP_SYSLOG to use dmesg.
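The kptr_restrict values can be summarized in a small helper; a sketch (value 2, not mentioned above, hides pointers from everyone, including root):

```shell
#!/bin/sh
# Sketch: interpret the kernel.kptr_restrict sysctl value.
# Real use:  kptr_meaning "$(cat /proc/sys/kernel/kptr_restrict)"
kptr_meaning() {
    case "$1" in
        0) echo "no restriction" ;;             # %pK prints real addresses
        1) echo "hidden without CAP_SYSLOG" ;;  # zeros for unprivileged users
        2) echo "always hidden" ;;              # zeros regardless of privilege
        *) echo "unknown" ;;
    esac
}
```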
Kernel Stack
protection
Canary-based protection in stack
• This feature puts a canary value on the stack just before the return
address and validates the value before returning.
• Stack-based buffer overflows (which need to overwrite the return address)
also overwrite the canary, which gets detected and neutralized via a
kernel panic.
• CONFIG_CC_STACKPROTECTOR – only puts canaries at the start of
vulnerable functions (e.g. those with character buffers). Equivalent GCC flag: -fstack-protector
• CONFIG_CC_STACKPROTECTOR_STRONG - adds the canary for a
wider set of functions, i.e. more functions end up with a
canary. Equivalent GCC flag: -fstack-protector-strong
Canary-based protection in stack
CONFIG_HAVE_CC_STACKPROTECTOR
CONFIG_CC_HAS_STACKPROTECTOR_NONE
CONFIG_CC_HAS_SANE_STACKPROTECTOR
where the "CC_" versions are about internal compiler infrastructure.
Canary-based protection limitations
• Issue with canaries: if a stack overflow is detected at all on a
production system, it is often well after the actual event, after an
unknown amount of damage has been done.
• Function-recursion-based exploitation is still possible.
Virtual mapped stack – CONFIG_VMAP_STACK
• Earlier, the stack lived in directly-mapped kernel memory, so it had to be
physically contiguous.
• VMAP_STACK enables support for kernel stacks in non-physically-contiguous
memory (the vmalloc area).
• This adds a guard page to each thread stack area.
• Guard pages are pages mapped with no access permissions in the page tables;
touching one causes a page-fault (#PF) exception.
• This feature causes reliable faults when the stack overflows, so the kernel
can produce a usable stack trace and respond to the overflow.
Bypassing VMAP guard pages – Stack hopping
• Background – the kernel places one guard page at the start and
end of each stack area.
• A thread that wanders off the bottom of its stack into the guard page
will be rewarded with a segmentation-fault signal.
• The fundamental problem with the guard page is that it is too small.
• There are a number of ways in which the stack can be expanded by
more than one page at a time.
Stack hopping - contd
Stack expansion working:
• If the stack pointer (esp on i386, rsp on x86_64) reaches the start of the
stack and there are unmapped memory pages below,
• then a page-fault exception is raised and caught by the handler;
• the page-fault handler transparently expands the stack of the process,
• or it terminates the thread/process if the stack expansion
fails (THREAD_SIZE is reached).
Unfortunately, this stack expansion mechanism is implicit and fragile: it relies
on page-fault exceptions, but if another memory region is mapped directly
below the stack, then the stack-pointer can move from the stack into the
other memory region without raising a page-fault.
Stack hopping – Steps overview
• "Clashing" the stack with another memory region: allocate memory
until the stack reaches another memory region.
• "Jumping" over the stack guard page: move the stack pointer from the
stack into the other memory region, without accessing the stack
guard page.
• "Smashing" the stack, or the other memory region: i.e. overwrite the
stack with the other memory region.
[Diagram: kernel thread 1 stack, 4KB guard page, thread 2 stack/memory]
Stack hopping – Illustration
• Step 1: Allocate memory until the start of the stack reaches the end
of another memory region
• Through megabytes of arguments passed to the thread function.
• Through recursive function calls. Project Zero reference
• Step 2: Consume the unused stack memory that
separates the stack pointer from the start of the
stack.
[Diagram: guard page, thread 2 stack, thread 1 filling up its stack]
Stack hopping – Illustration
• Step 3: Jump over the stack guard page, into another memory region
• Move the stack pointer from the stack into the memory region that
clashed with the stack, without accessing the guard page. This can be
done using large memory allocations:
• it must be larger than the guard page;
• it must end in the stack, below the guard page;
• it must start in the memory region above the stack guard page;
• it must not be fully written to (a full write would access
the guard page and raise a page-fault exception).
• Step 4: Either smash the stack with another
memory region, or smash another memory region
with the stack.
[Diagram: guard page, filled thread 1 stack, thread 1's allocation overlapping thread 2's region, which thread 2 starts filling]
Kernel Page table isolation -
CONFIG_PAGE_TABLE_ISOLATION
• Introduced in 2017 (v4.15) as a countermeasure to the famous Meltdown
attack.
• Earlier, the whole kernel page table used to be mapped into every
user-space process's memory.
• To mitigate Meltdown-like side channels, Linux creates an independent
set of page tables for use only when running userspace applications.
• The userspace page tables contain only a minimal amount of kernel
data: only what is needed to enter/exit the kernel, such as the
entry/exit functions and the interrupt descriptor table (IDT).
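Whether KPTI is active on a given system is exported under /sys/devices/system/cpu/vulnerabilities/. A sketch that classifies the contents of one such file (the sample strings in the test are typical values, not taken from the slides):

```shell
#!/bin/sh
# Sketch: classify a /sys/devices/system/cpu/vulnerabilities/* file.
# Real use:  mitigation_of < /sys/devices/system/cpu/vulnerabilities/meltdown
mitigation_of() {
    read -r line
    case "$line" in
        "Not affected")  echo "not-affected" ;;
        Mitigation:*)    echo "${line#Mitigation: }" ;;  # e.g. "PTI"
        Vulnerable*)     echo "vulnerable" ;;
        *)               echo "$line" ;;
    esac
}
```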
Kernel page table isolation
ret2usr exploitation
• ret2usr (return-to-user) is based on the fact that code in kernel mode
can execute code in user mode.
• Even with a protection like KPTI, attackers can still execute shellcode
with kernel rights by hijacking a privileged execution path in kernel
mode and redirecting it to user space.
• In a ret2usr attack, kernel data is overwritten with user-space
addresses, typically after exploitation of memory corruption bugs in
kernel code.
• Example associated CVE: CVE-2017-7308
ret2usr exploitation – source Blackhat 2014
CONFIG_RETPOLINE
• Introduced in 2018 (v4.15) as a countermeasure to the famous Spectre attack.
• It guards against kernel-to-user data leaks by avoiding speculative indirect
branches.
• Spectre background:
• Modern CPUs have a branch predictor to optimize their performance.
• It works by referencing the Branch Target Buffer (BTB), a store of key (PC)
to value (target PC) pairs.
• But its size limitation causes BTB collisions that lead to a new side-channel attack.
• Using this primitive, an attacker can inject an indirect branch target into
the BTB, and consequently run some code in a speculative context. This can
leak sensitive data across boundaries (e.g. between VMs, processes,
kernel/user mode).
CONFIG_RETPOLINE - Contd
• In simple terms, it replaces all indirect jmps and calls with
return-instruction-based trampolines.
CONFIG_MODULE_SIG – Module signing
• Introduced in 2012 (v3.7); when set, allows loading only signed modules
with a valid key.
• The kernel module signing facility cryptographically signs modules during
installation and then checks the signature upon loading the module.
• This increases kernel security by disallowing the loading of
unsigned and malicious modules.
• It uses RSA public-key encryption and hashes up to SHA-512.
• A private key is used to generate a signature and the corresponding public
key is used to check it.
• Under normal conditions, the kernel build will automatically generate a
new keypair using openssl if one does not exist in the file
cert/signing_key.pem. More details here
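Whether a module on disk carries a signature shows up in its modinfo output, which gains signer/sig_key/sig_hashalgo fields when signed. A sketch that checks modinfo-format text (the sample fields in the test are illustrative, not from a real module):

```shell
#!/bin/sh
# Sketch: detect signature fields in `modinfo <module>`-style output on stdin.
# Real use:  modinfo e1000 | module_is_signed && echo signed
module_is_signed() {
    grep -Eq '^(signer|sig_key|sig_hashalgo):'
}
```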
Hardware
assisted
protection
SMEP/SMAP – Supervisor mode
execution/access prevention
• SMEP prevents the CPU in kernel mode from jumping to an executable
page that has the user flag set in the PTE.
• This prevents the kernel from executing user-space code accidentally
or maliciously; for example, it prevents the kernel from jumping to
specially prepared user-mode shellcode (ret2usr).
• Can be enabled via CR4.SMEP (bit 20) and CR4.SMAP (bit 21)
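The two CR4 bits can be decoded from a raw register value (as seen in a kernel debugger or crash dump). A sketch using shell arithmetic; the helper name is my own:

```shell
#!/bin/sh
# Sketch: decode the SMEP (bit 20) and SMAP (bit 21) bits of a CR4 value.
cr4_bits() {
    v=$(( $1 ))                       # accepts hex with a 0x prefix
    echo "smep=$(( (v >> 20) & 1 )) smap=$(( (v >> 21) & 1 ))"
}
```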
SMEP/SMAP – Contd.
• SMAP extends the protection of SMEP to reads and writes.
• SMAP can be temporarily disabled for explicit memory accesses by
setting the EFLAGS.AC (Alignment Check) flag.
• ARM has an equivalent of SMEP, named PXN (Privileged Execute-Never);
the SMAP equivalent is PAN (Privileged Access Never).
• A standalone bundle, kGuard (cross-platform), can be used in case
hardware support is not present.
• It injects CFAs (control-flow assertions) that perform a small runtime check
before every branch to verify that the target address is located in kernel
space or loaded from kernel-mapped memory.
Bypassing SMEP/SMAP – ret2dir
• Return to direct-mapped memory.
• Each address allocated in userspace also has a physical address
directly mapped in kernel address space (called address aliasing)
if it falls in lowmem.
• To bypass SMEP/SMAP, the attacker can provide the kernel synonym
address rather than the userspace-mapped address during ret2usr
exploitation.
• The kernel will execute it without any issues.
Ret2dir – contd.
Ret2dir – process overview
• Step 1: Allocate memory in userspace that will get mapped into the
lowmem of kernel space.
• This can be done by allocating big chunks of memory from different processes,
filling up highmem; this forces the kernel to allocate memory from lowmem.
• Step 2: Guess the kernel-space address in lowmem for the directly
mapped memory.
• /proc/<pid>/pagemap can be used to get the page frame number, i.e. the offset in lowmem.
• Knowledge of lowmem_base will help determine the base.
• Step 3: Use a ret2usr vulnerability to execute the shellcode. (Disclaimer:
the memory area needs to have WX permissions set in the kernel mapping.)
CONFIG_X86_KERNEL_IBT - Indirect branch
tracking
• Merged in March 2022 (v5.18)
• If an attacker can corrupt a variable that is used for indirect branches,
they may be able to redirect the kernel's execution flow to an
arbitrary location.
• Exploit techniques like return-oriented programming and
jump-oriented programming depend on this kind of redirection.
• IBT’s purpose is to prevent an attacker from causing an indirect
branch (a function call via a pointer variable, for example) to go to an
unintended place.
CONFIG_X86_KERNEL_IBT - Indirect branch
tracking – Compiler version
• It works by trying to ensure that the target of every indirect branch is, in
fact, intended to be reached that way.
• In the Linux implementation, indirect branches go through a "jump table",
ensuring that the target is not only meant to be reached by indirect
branches, but that the prototype of the called function matches what the
caller is expecting.
• Whenever an indirect function call is made, control goes to a special function called
__cfi_check().
• It verifies that the target address is, indeed, an address within the expected jump
table, extracts the real function address from the table, and jumps to that address.
• If the target address is not within the jump table, the default action is to
assume that an attack is in progress and immediately panic the system.
CONFIG_X86_KERNEL_IBT - Indirect branch
tracking – Intel CET version
• If IBT is enabled, the CPU will ensure that every indirect branch lands
on a special instruction (endbr32 or endbr64). If anything else is
found, the processor will raise a control-protection (#CP) exception.
• The processor implements a state machine that tracks indirect JMP
and CALL instructions.
• When one of these instructions is seen, the state machine moves from the IDLE
to the WAIT_FOR_ENDBRANCH state.
• In the WAIT_FOR_ENDBRANCH state, the next instruction in the program stream
must be an ENDBRANCH.
• If an ENDBRANCH is not seen, the processor causes a control-protection fault
(#CP); otherwise the state machine moves back to the IDLE state.
Intel CET – Indirect Branch Tracking illustration
CONFIG_ARM64_MTE – Memory tagging
extension
• Merged in Linux kernel v5.10, this mechanism enables the automated
detection of a wide range of memory-safety issues (user-space only).
• Context: arm64 only uses 48 bits for addressing; the remaining bits
support the "top byte ignore" feature, which allows software to store
arbitrary data in the uppermost byte of a virtual address.
• MTE allows the storage of a four-bit "key" value in bits 59-56 of a
virtual address that is associated with one or more 16-byte ranges of
memory.
• When a pointer is dereferenced, the key stored in the pointer itself is
compared to the one associated with the memory the pointer references;
if the two do not match, a trap may be raised.
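Extracting the tag from a tagged pointer is a simple bit-field operation. A sketch over a raw 64-bit pointer value, taking bits 59-56 as described above (the example pointers in the test are made up):

```shell
#!/bin/sh
# Sketch: extract the 4-bit MTE tag stored in bits 59-56 of a pointer value.
mte_tag() {
    printf '%x\n' $(( ($1 >> 56) & 0xf ))
}
```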
Memory tagging extension - contd
• Each memory granule has a tag (aka color)
• Every pointer has a tag
• On allocation, both memory and pointer get a matching random tag
• On pointer dereference, pointer tag must match memory tag
CONFIG_AMD_MEM_ENCRYPT - AMD Secure
Memory Encryption
• SME is a hardware feature present on AMD CPUs allowing system RAM
to be encrypted and decrypted (mostly) transparently by the CPU,
with a little help from the kernel to transition to/from encrypted
RAM.
• Such RAM should be more secure against various physical attacks like
RAM access via the memory bus and should make the radio signature
of memory bus traffic harder to intercept (and decrypt) as well.
• It works by marking individual pages of memory as encrypted using
the standard x86 page tables. A page that is marked encrypted will
be automatically decrypted when read from DRAM and encrypted
when written to DRAM.
SME - contd
AMD SME -contd
• Support for SME can be determined
through the CPUID instruction.
CPUID function 0x8000001f[eax]
reports whether SME is supported.
• If support for SME is present,
MSR 0xc0010010 (MSR_K8_SYSCFG)
bit 23 can be used to determine if
memory encryption is enabled.
• CPUID 0x8000001f[ebx] bits [5:0] give the page-table bit number used
to activate memory encryption (the C-bit).
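These fields can be decoded from the raw register values. A sketch, assuming (per the AMD64 manual and Linux's MSR_AMD64_SYSCFG definitions) that the enable bit is SYSCFG[23] and that CPUID 0x8000001f EBX[5:0] gives the C-bit position; the sample values in the test are illustrative:

```shell
#!/bin/sh
# Sketch: decode SME information from raw register values.
# $1: CPUID 0x8000001f EBX value -> bits [5:0] = C-bit page-table position
# $2: SYSCFG MSR (0xc0010010) value -> bit 23 = memory encryption enable
sme_info() {
    ebx=$(( $1 )); msr=$(( $2 ))
    echo "cbit=$(( ebx & 0x3f )) enabled=$(( (msr >> 23) & 1 ))"
}
```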
Linux kernel
security
projects
Honorable mentions
grSecurity and PAX project
• Grsecurity ( www.grsecurity.net) is the only drop-in Linux kernel
replacement offering high-performance, state-of-the-art exploit
prevention against both known and unknown threats.
• PaX is a separate project that is included in Grsecurity as part of its
security strategy. The PaX project researches various defences against
the exploitation of software bugs (e.g., buffer overflows and user-
supplied format string bugs).
• PaX does not focus on finding and fixing bugs, but rather on the
prevention and containment of exploit techniques.
grSecurity and PAX project
Linux Kernel Runtime Guard(LKRG)
• It’s an independent project, an equivalent of Windows PatchGuard.
• It performs runtime integrity checking of the Linux kernel and detection
of security vulnerability exploits against the kernel.
• LKRG is a kernel module, so it can be built for and loaded on top of a
wide range of mainline and distro kernels, without needing to patch
them.
• It uses kprobes for hooking various Linux kernel APIs.
• The amount of protection provided is based on a profile set by the user.
Allowed values are 0 (log and accept), 1 (selective), 2 (strict), and 3
(paranoid).
Linux Kernel Runtime Guard(LKRG) - Features
• Exploit detection
• Tracking processes important data structures and metadata, pointers and capability,
namespace, cred struct modification.
• Ptrace access, Keyring access
• SeLinux state modification
• Checking kernel modules integrity
• Module removed from module list or KOBJ.
• Gathers information about loaded kernel modules and tries to protect them via calculating
hashes from their core_text section.
• Kernel Components validation
• SMEP, MSRs, pint, kint, umh and Profiles validation
• Periodically check critical system hashes using timer
• (Un)Hide itself from the module system activity components
LSM – Linux security module framework
• The LSM kernel patch provides a general kernel framework to support
security modules.
• By itself, the framework does not provide any additional security; it
merely provides the infrastructure to support security modules
• The LSM kernel patch adds security fields to kernel data structures
and inserts calls to hook functions at critical points in the kernel code
to manage the security fields and to perform access control.
• It also adds functions for registering and unregistering security
modules, and adds a general security system call to support new
system calls for security-aware applications.
Kernel lockdown
• Introduced in 2019 (v5.4), kernel lockdown is a Linux security module built on LSM.
• The Kernel Lockdown feature is designed to prevent both direct and
indirect access to a running kernel image
• Attempting to protect against unauthorized modification of the kernel image and
• Prevent access to security and cryptographic data located in kernel memory.
• If a prohibited or restricted feature is accessed or used, the kernel will emit
a message that looks like:
Lockdown: X: Y is restricted, see man kernel_lockdown.7
where X indicates the process name and Y indicates what is restricted.
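The current state is exposed via /sys/kernel/security/lockdown, which brackets the active mode, e.g. `none [integrity] confidentiality`. A parsing sketch:

```shell
#!/bin/sh
# Sketch: extract the active mode from /sys/kernel/security/lockdown output.
# Real use:  lockdown_mode < /sys/kernel/security/lockdown
lockdown_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}
```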
Thank you
Uncovered Protections
• Retbleed mitigation:
• CONFIG_CPU_IBPB_ENTRY
• CONFIG_CPU_UNRET_ENTRY
• CONFIG_RETPOLINE
• CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
• Spectre and Meltdown - ARM
• CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
• CONFIG_UNMAP_KERNEL_AT_EL0
• CONFIG_ARM64_PTR_AUTH_KERNEL
• CONFIG_ARM64_BTI
• CONFIG_ARM64_EPAN
Uncovered Protections
• CONFIG_DEBUG_STACKOVERFLOW
• KFENCE
• KASAN/KMSAN/KCSAN
• CONFIG_DEBUG_KMEMLEAK
• refcount_t API
• FORTIFY_SOURCE
• PAX RAP
• L1TF mitigation - PTE inversion
Q & A

Summary of linux kernel security protections

  • 1.
    Summary of Linuxkernel Security Protections and it’s associated attacks Shubham Dubey
  • 2.
    About me Shubham Dubey SecurityResearcher @ Microsoft nixhacker.com /in/shubham0d
  • 3.
    AGENDA • Introduction tokernel configuration and Memory mapping • Kernel Self protection techniques – Software • Kernel Self protection/mitigation techniques - Hardware • Bonus kernel security projects • Planning: To cover maximum number of protections rather than going in depth of one
  • 4.
    CONFIGURATION support forlinux kernel • Linux kernel can be configured during compilation using configure parameters. Used to enable/disable a feature • Usually have naming convention of CONFIG_*. • Once kernel is built, these parameters cannot be modified. • To check the status of a parameter in running kernel • zcat /proc/config.gz | grep CONFIG_DEBUG_RODATA • grep CONFIG_DEBUG_RODATA /boot/config-`uname -r` - configuration parameter specific to a security protection.
  • 5.
  • 6.
    Linux kernel Memorylayout – Low Memory • This is a linear(1to1) map memory. • Linux kernel image resides here. • Usually, 896MB in 32 bits architecture and around 5096 max in 64 bits. • The virtual address where lowmem is mapped is defined by PAGE_OFFSET • Memory allocated by kmalloc() resides in lowmem and it is physically contiguous.
  • 7.
    Linux kernel Memorylayout – High Memory • This is an arbitrary mapped memory. • Mostly introduced for 32 bit system since, in 32 bit due to virtual address space limitation, not everything can be mapped to lowmem all the time. • The virtual base address where high memory is defined is high_memory. • There are multiple types of mappings in the highmem area: • Multi-page permanent mappings (vmalloc, ioremap) • Temporary 1 page mappings (atomic_kmap) • Permanent 1 page mappings (kmap, fix-mapped linear addresses)
  • 8.
  • 9.
    Strict kernel memorypermissions 2006 CONFIG_DEBUG_RODATA CONFIG_ARM_KERNMEMPERMS 2016 CONFIG_DEBUG_RODATA CONFIG_DEBUG_ALIGN_RODATA 2017 CONFIG_STRICT_KERNEL_RWX CONFIG_STRICT_MODULE_RWX Makes kernel rodata and kernel module – R^XW. Makes kernel text and kernel module text RX^W. This protect against rare chance that attackers might find and use ROP gadgets that exist in the rodata section. Adds an additional section-aligned split of rodata from kernel text so it can be made explicitly non- executable. This padding may waste memory space to gain the additional protection.
  • 10.
    Major changes inmemory permission 2006 (v2.6) Introduced CONFIG_DEBUG_RODATA link 2006(v2.6) Added file_operation structs link 2006(v2.6) Included kernel_params structure link 2006(v2.6) Included kallsyms data link 2009(v2.6) Made text section writable for hooks link 2010(v2.6) Added NX protection link 2010(v2.6) Included module permission CONFIG_DEBUG_SET_MODULE_RONX link 2014(v3.19) Introduced for ARM architecture link
  • 11.
    Major changes inmemory permission - contd 2015 (v4.0) Introduced for ARM64 link 2016(v4.6) Introduced DEBUG_ALIGN_RODATA link 2016(v4.7) Added fault_info table link 2017(v4.11) Renaming to STRICT_*_RWX link 2017(v4.14) Introduced for PPC32 link 2020(v5.7) Removed DEBUG_ALIGN_RODATA link 2020(v5.8) Refuse loading module that don’t enforce W^X link
  • 12.
    Limitation of Strictmemory permission • A kernel module/component can modify the page permission using some default functions available in linux kernel. • One of such function is set_memory_rw part of set_memory_* function set. • It’s not exported to use directly. But can be called manually.
  • 13.
  • 14.
    Requirement of KASLR- Background • The attacker can use kernel vulnerability to insert malicious code into the kernel address space by various means and redirect the kernel's execution to that that code. • One method used to get root privilege: commit_creds(prepare_creds()); • These attacks rely on knowing where symbols of interest live in the kernel's address space. • Those locations change between kernel versions and distribution build, but are known (or can be figured out) for a particular kernel. • ASLR disrupts that process and adds another layer of difficulty to an attack.
  • 15.
    Kernel Address RandomizationTimeline 2006 (v2.6) Introduced CONFIG_RELOCATABLE link 2013(v3.10) Introduced KASLR for x86/64 link 2013(v3.14) Introduced CONFIG_RANDOMIZE_BASE link 2014(v3.15) Randomization for modules link 2016(v4.6) Introduced for ARM64 link 2016(v4.7) Introduced for MIPS link 2016(v4.8) Randomize kernel memory range link 2021(v5.13) Randomization of kernel stack link
  • 16.
    Consequences of KASLRin linux kernel • The kernel previously used to be at very start of lowmem, but now any placement relative to memory ranges is possible. • Kernel can be separate from lowmem area. In theory, KASLR can put the kernel anywhere in the range of [16M, MAXMEM) on 64-bit, and [16M, KERNEL_IMAGE_SIZE) on 32-bit. • Load address of modules are randomized in the kernel to make KASLR effective for modules. • Both physical and virtual addresses are randomized. • Other Kernel memory regions are also randomized like physical mapping, vmalloc and vmemmap regions using CONFIG_RANDOMIZE_MEMORY. • Introduced randstack feature link
  • 17.
    CONFIG_RELOCATABLE • This builds a kernel image that retains relocation information, to enable loading and running a kernel binary from a different physical address than the one it was compiled for. • This involves processing the dynamic relocations in the image in the early stages of booting. • Works by building the kernel as a Position Independent Executable (PIE), which retains all relocation metadata required to relocate the kernel binary at runtime to a different virtual address. • Runtime relocation is possible since the relocation metadata is embedded into the kernel. • Can read more about the internals here
  • 18.
    CONFIG_RANDOMIZE_BASE • Depends on CONFIG_RELOCATABLE • With CONFIG_RANDOMIZE_BASE set, the address at which the kernel is decompressed at boot is randomized. • It deters exploit attempts relying on knowledge of the location of kernel internals. • Entropy is generated using the RDRAND instruction. If it is not supported by the CPU, RDTSC is used instead.
  • 19.
    CONFIG_RANDOMIZE_MEMORY • Introduced in 2016, kernel v4.8 • When enabled, the direct mapping of all physical memory, the vmalloc/ioremap space and the virtual memory map are randomized. • Works by randomizing the base address of each section. • Order is preserved but the base offsets differ. • This makes exploits relying on predictable memory locations less reliable.
  • 20.
    CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT – Background • Introduced in 2021, kernel v5.13. • This feature is based on the original idea from grsecurity PaX's RANDKSTACK feature. • Linux assigns two pages of kernel stack to every task. This stack is used whenever the task enters the kernel (system call, device interrupt, CPU exception, etc.). • By the time the task returns to userland, the kernel stack pointer will be back at the point of the initial entry to the kernel thread stack. • This means that a userland-originating attack against a kernel bug would find itself always at the same place on the task's kernel stack.
  • 21.
    CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT – Overview • This feature aims to make various stack-based attacks that rely on a deterministic stack structure harder. • The goal of the randomize_kstack_offset feature is to add a random offset after pt_regs has been pushed to the stack and before the rest of the thread stack is used during the transition. • If the stack offset is randomized on each system call, it is harder for an attacker to reliably land in any particular place on the thread stack. • Even if an address is exposed, the stack offset will change on the next syscall.
  • 22.
    PaX RANDSTACK vs randomize_kstack_offset
  • 23.
    KASLR limitations and bypasses – Infoleaks • The most common way kernel vulnerabilities are exploited with KASLR on is via info leaks. A leak can be of pointers (to a struct or a heap/stack area) or of content. • Raw kernel pointers were frequently printed to the kernel debug log • Bugs which trigger a kernel oops can be used to leak kernel pointers • Leaks can happen due to uninitialized stack variables. Reference • Examples – CVE-2019-10639 (remote kernel pointer leak), CVE-2017-14954
  • 24.
    KASLR limitations and bypasses • Low entropy – there are only so many locations the kernel can fit in. This means an attacker could guess without too much trouble. • Arbitrary read/write – CVE-2017-18344 • Heap spraying using the msgsnd()->msg_msg struct – CVE-2021-26708, CVE-2021-43267, CVE-2021-22555 • Hardware attacks and side channels – BlindSide attack • Each vulnerability exploit has its own story.
  • 25.
    More on kernel address leak protection kptr_restrict – This indicates whether restrictions are placed on exposing kernel addresses via /proc and other interfaces. • When kptr_restrict is set to (1), kernel pointers printed using the %pK format specifier will be replaced with 0s unless the user has CAP_SYSLOG. dmesg_restrict – This indicates whether unprivileged users are prevented from using dmesg to view messages from the kernel's log buffer. • When dmesg_restrict is set, users must have CAP_SYSLOG to use dmesg.
  • 26.
  • 27.
    Canary-based protection in the stack • This feature puts a canary value on the stack just before the return address and validates the value before returning. • Stack-based buffer overflows (that need to overwrite the return address) also overwrite the canary, which gets detected and neutralized via a kernel panic. • CONFIG_CC_STACKPROTECTOR – Only puts canaries in critical functions. Equivalent GCC flag: -fstack-protector • CONFIG_CC_STACKPROTECTOR_STRONG – Adds the canary for a wider set of functions, i.e. with this, more functions end up with a canary. Equivalent GCC flag: -fstack-protector-strong
  • 28.
    Canary-based protection in the stack • CONFIG_HAVE_CC_STACKPROTECTOR • CONFIG_CC_HAS_STACKPROTECTOR_NONE • CONFIG_CC_HAS_SANE_STACKPROTECTOR where the "CC_" versions are about internal compiler infrastructure.
  • 29.
    Canary-based protection limitations • Issue with canaries: if a stack overflow is detected at all on a production system, it is often well after the actual event, and after an unknown amount of damage has been done. • Function-recursion-based exploitation is still possible.
  • 30.
    Virtually mapped stack – CONFIG_VMAP_STACK • Earlier, the stack lived in directly mapped kernel memory, so it had to be physically contiguous. • VMAP_STACK enables support for the kernel stack to live in non-physically-contiguous memory (the vmalloc area). • This adds a guard page to each thread stack area. • Guard pages are pages made inaccessible in the page table; touching one causes a page-fault exception. • This feature causes reliable faults when the stack overflows, so the kernel can print a usable stack trace and respond to the overflow.
  • 31.
    Bypassing VMAP guard pages – Stack hopping • Background – The kernel places one guard page at the start and end of each stack area. • A thread that wanders off the bottom of a stack into the guard page will be rewarded with a segmentation-fault signal. • The fundamental problem with the guard page is that it is too small. • There are a number of ways in which the stack can be expanded by more than one page at a time.
  • 32.
    Stack hopping – contd Stack expansion working: • if the stack pointer (esp on i386, rsp on x86-64) reaches the start of the stack and there are unmapped memory pages below • then a page-fault exception is raised and caught by the handler • the page-fault handler transparently expands the stack of the process • or it terminates the thread/process if the stack expansion fails (THREAD_SIZE is reached) Unfortunately, this stack expansion mechanism is implicit and fragile: it relies on page-fault exceptions, but if another memory region is mapped directly below the stack, the stack pointer can move from the stack into the other memory region without raising a page fault.
  • 33.
    Stack hopping – Steps overview • "Clashing" the stack with another memory region: allocate memory until the stack reaches another memory region. • "Jumping" over the stack guard-page: move the stack pointer from the stack into the other memory region, without accessing the stack guard-page. • "Smashing" the stack, or the other memory region: i.e. overwrite one with the other. [Diagram: Thread 1 stack | Guard page (4 KB) | Thread 2 stack/memory]
  • 34.
    Stack hopping – Illustration • Step 1: Allocate memory until the start of the stack reaches the end of another memory region • Through megabytes of arguments passed to the thread function. • Through recursive function calls. Project Zero reference [Diagram: Guard page | Thread 2 stack | Fill up the stack]
  • 35.
    Stack hopping – Illustration • Step 1: Allocate memory until the start of the stack reaches the end of another memory region • Through megabytes of arguments passed to the thread function. • Through recursive function calls. Reference • Step 2: Consume the unused stack memory that separates the stack pointer from the start of the stack. [Diagram: Guard page | Thread 2 stack | Fill up the stack]
  • 36.
    Stack hopping – Illustration • Step 3: Jump over the stack guard-page, into another memory region • Move the stack pointer from the stack into the memory region that clashed with the stack, without accessing the guard-page. Can be done using large memory allocations: • it must be larger than the guard-page; • it must end in the stack, below the guard-page; • it must start in the memory region above the stack guard-page; • it must not be fully written to (a full write would access the guard-page and raise a page-fault exception). [Diagram: Guard page | Thread 2 stack | Fill up the stack | Th1 stack allocation]
  • 37.
    Stack hopping – Illustration • Step 3: Jump over the stack guard-page, into another memory region • Move the stack pointer from the stack into the memory region that clashed with the stack, without accessing the guard-page. • it must be larger than the guard-page; • it must end in the stack, below the guard-page; • it must start in the memory region above the stack guard-page; • it must not be fully written to (a full write would access the guard-page and raise a page-fault exception). • Step 4: Either smash the stack with another memory region, or smash another memory region with the stack. [Diagram: Guard page | Fill up the stack | Th1 stack allocation | Th2 starts filling]
  • 38.
    Kernel page table isolation – CONFIG_PAGE_TABLE_ISOLATION • Introduced in 2017 (v4.15) as a countermeasure to the famous Meltdown attack. • Earlier, the whole kernel page table used to be mapped into user-space process memory. • To mitigate Meltdown-like side channels, Linux creates an independent set of page tables for use only when running user-space applications. • The user-space page tables contain only a minimal amount of kernel data: only what is needed to enter/exit the kernel, such as the entry/exit functions and the interrupt descriptor table (IDT).
  • 39.
  • 40.
    ret2usr exploitation • ret2usr (return-to-user) is based on the fact that code in kernel mode can execute code in user mode. • Even with protections like KPTI, attackers can easily execute shellcode with kernel rights by hijacking a privileged execution path in kernel mode and redirecting it to user space. • In a ret2usr attack, kernel data is overwritten with user-space addresses, typically after exploitation of memory corruption bugs in kernel code. • Example associated CVE: CVE-2017-7308
  • 41.
    ret2usr exploitation – source: Black Hat 2014
  • 42.
    CONFIG_RETPOLINE • Introduced in 2018 (v4.15) as a countermeasure to the famous Spectre attack. • It guards against kernel-to-user data leaks by avoiding speculative indirect branches. • Spectre background: • Modern CPUs have a branch predictor to optimize their performance. • It works by referencing the Branch Target Buffer (BTB), a storage for key (PC) – value (target PC) pairs. • Its limited size causes BTB collisions, which lead to a new side-channel attack. • Using this primitive, an attacker can inject an indirect branch target into the BTB and consequently run code in a speculative context, which can leak sensitive data across boundaries (e.g. between VMs, processes, kernel/user mode).
  • 43.
    CONFIG_RETPOLINE – contd • In simple terms, it replaces all indirect jmp and call instructions with a construct built on return instructions.
  • 44.
    CONFIG_MODULE_SIG – Module signing • Introduced in 2012 (v3.7); when set, it allows loading only signed modules with a valid key. • The kernel module signing facility cryptographically signs modules during installation and then checks the signature upon loading the module. • This increases kernel security by disallowing the loading of unsigned or maliciously modified modules. • It uses RSA public-key cryptography and hashes up to SHA-512. • A private key is used to generate a signature and the corresponding public key is used to check it. • Under normal conditions, the kernel build will automatically generate a new keypair using openssl if one does not exist in the file certs/signing_key.pem. More details here
  • 45.
  • 46.
    SMEP/SMAP – Supervisor mode execution/access prevention • SMEP prevents the CPU in kernel mode from jumping to an executable page that has the user flag set in the PTE. • This prevents the kernel from executing user-space code accidentally or maliciously, e.g. it prevents the kernel from jumping to specially prepared user-mode shellcode (ret2usr). • Can be enabled via CR4.SMEP (bit 20) and CR4.SMAP (bit 21)
  • 47.
    SMEP/SMAP – contd • SMAP extends the protection of SMEP to reads and writes. • SMAP can be temporarily disabled for explicit memory accesses by setting the EFLAGS.AC (Alignment Check) flag. • ARM has an equivalent of SMEP, named PXN (Privileged Execute-Never). • A standalone solution, kGuard (cross-platform), can be used in case hardware support is not present. • It injects control-flow assertions (CFAs) that perform a small runtime check before every branch to verify that the target address is located in kernel space or loaded from kernel-mapped memory.
  • 48.
    Bypassing SMEP/SMAP – ret2dir • Return to direct-mapped memory • Each address allocated in userspace has a physical address that is directly mapped into the kernel address space as well (called address aliasing), if it is in lowmem. • To bypass SMEP/SMAP, an attacker can provide the kernel synonym address rather than the user-space mapped address during ret2usr exploitation. • The kernel will execute it without any issues.
  • 49.
  • 50.
    Ret2dir – Process overview • Step 1: Allocate memory in userspace that will get mapped into the lowmem of kernel space • This can be done by allocating big chunks of memory from different processes and filling up highmem, which forces the kernel to allocate memory from lowmem. • Step 2: Guess the kernel-space address in lowmem for the directly mapped memory • /proc/<pid>/pagemap can be used to get the page frame number. • Knowledge of the lowmem base will help determine the address. • Step 3: Use a ret2usr vulnerability to execute the shellcode. (Disclaimer: the memory area needs to have the WX permission set in the kernel mapping)
  • 51.
    CONFIG_X86_KERNEL_IBT – Indirect branch tracking • Merged in March 2022 (v5.18) • If an attacker can corrupt a variable that is used for indirect branches, they may be able to redirect the kernel's execution flow to an arbitrary location. • Exploit techniques like return-oriented programming and jump-oriented programming depend on this kind of redirection. • IBT's purpose is to prevent an attacker from causing an indirect branch (a function call via a pointer variable, for example) to go to an unintended place.
  • 52.
    CONFIG_X86_KERNEL_IBT – Indirect branch tracking – Compiler version • It works by trying to ensure that the target of every indirect branch is, in fact, intended to be reached that way. • In the compiler-based implementation, an indirect branch goes through a "jump table", ensuring that the target is not only meant to be reached by indirect branches, but also that the prototype of the called function matches what the caller is expecting. • Whenever an indirect function call is made, control goes to a special function called __cfi_check(). • It will verify that the target address is, indeed, an address within the expected jump table, extract the real function address from the table, and jump to that address. • If the target address is not within the jump table, the default action is instead to assume that an attack is in progress and immediately panic the system.
  • 53.
    CONFIG_X86_KERNEL_IBT – Indirect branch tracking – Intel CET version • If IBT is enabled, the CPU will ensure that every indirect branch lands on a special instruction (endbr32 or endbr64). If anything else is found, the processor will raise a control-protection (#CP) exception. • The processor implements a state machine that tracks indirect JMP and CALL instructions. • When one of these instructions is seen, the state machine moves from the IDLE to the WAIT_FOR_ENDBRANCH state. • In the WAIT_FOR_ENDBRANCH state, the next instruction in the program stream must be an ENDBRANCH. • If an ENDBRANCH is not seen, the processor raises a control-protection fault (#CP); otherwise the state machine moves back to the IDLE state.
  • 54.
    Intel CET – Indirect Branch Tracking illustration
  • 55.
    CONFIG_ARM64_MTE – Memory tagging extension • Merged in Linux kernel v5.10, this mechanism enables the automated detection of a wide range of memory-safety issues (user-space only). • Context: arm64 only uses 48 bits for addressing; the remaining bits support the "top byte ignore" feature, which allows software to store arbitrary data in the uppermost byte of a virtual address. • MTE allows the storage of a four-bit "key" value in bits 59-56 of a virtual address that is associated with one or more 16-byte ranges of memory. • When a pointer is dereferenced, the key stored in the pointer itself is compared to that associated with the memory the pointer references; if the two do not match, a trap may be raised.
  • 56.
    Memory tagging extension – contd • Each memory granule has a tag (aka color) • Every pointer has a tag • On allocation, both memory and pointer get a matching random tag
  • 57.
    Memory tagging extension – contd • Each memory granule has a tag (aka color) • Every pointer has a tag • On allocation, both memory and pointer get a matching random tag • On pointer dereference, pointer tag must match memory tag
  • 58.
    Memory tagging extension – contd • Each memory granule has a tag (aka color) • Every pointer has a tag • On allocation, both memory and pointer get a matching random tag • On pointer dereference, pointer tag must match memory tag
  • 59.
    CONFIG_AMD_MEM_ENCRYPT – AMD Secure Memory Encryption • SME is a hardware feature present on AMD CPUs allowing system RAM to be encrypted and decrypted (mostly) transparently by the CPU, with a little help from the kernel to transition to/from encrypted RAM. • Such RAM should be more secure against various physical attacks, like RAM access via the memory bus, and should make the radio signature of memory bus traffic harder to intercept (and decrypt) as well. • It works by marking individual pages of memory as encrypted using the standard x86 page tables. A page that is marked encrypted will be automatically decrypted when read from DRAM and encrypted when written to DRAM.
  • 60.
  • 61.
    AMD SME – contd • Support for SME can be determined through the CPUID instruction. CPUID function 0x8000001f[eax] reports if SME is supported. • If support for SME is present, MSR 0xc0010010 (MSR_K8_SYSCFG) bit[21] can be used to determine if memory encryption is enabled. • CPUID 0x8000001f[ebx] bits[5:0] give the page-table bit position used to activate memory encryption (the C-bit).
  • 62.
  • 63.
    grsecurity and the PaX project • grsecurity (www.grsecurity.net) is the only drop-in Linux kernel replacement offering high-performance, state-of-the-art exploit prevention against both known and unknown threats. • PaX is a separate project that is included in grsecurity as part of its security strategy. The PaX project researches various defences against the exploitation of software bugs (e.g., buffer overflows and user-supplied format string bugs). • PaX does not focus on finding and fixing the bugs, but rather on the prevention and containment of exploit techniques.
  • 64.
  • 65.
    Linux Kernel Runtime Guard (LKRG) • It's an independent project, an equivalent of Windows PatchGuard. • Performs runtime integrity checking of the Linux kernel and detection of security vulnerability exploits against the kernel. • LKRG is a kernel module, so it can be built for and loaded on top of a wide range of mainline and distro kernels, without needing to patch them. • It uses kprobes for hooking various Linux kernel APIs. • The amount of protection provided is based on the profile set by the user. Allowed values are 0 (log and accept), 1 (selective), 2 (strict), and 3 (paranoid).
  • 66.
    Linux Kernel Runtime Guard (LKRG) – Features • Exploit detection • Tracking processes' important data structures and metadata: pointer and capability, namespace, and cred struct modification. • ptrace access, keyring access • SELinux state modification • Checking kernel module integrity • Module removed from the module list or KOBJ. • Gathers information about loaded kernel modules and tries to protect them by calculating hashes of their core text section. • Kernel component validation • SMEP, MSRs, pint, kint, umh and profile validation • Periodically checks critical system hashes using a timer • Can (un)hide itself from the module system activity components
  • 67.
    LSM – Linuxsecurity module framework • The LSM kernel patch provides a general kernel framework to support security modules. • By itself, the framework does not provide any additional security; it merely provides the infrastructure to support security modules • The LSM kernel patch adds security fields to kernel data structures and inserts calls to hook functions at critical points in the kernel code to manage the security fields and to perform access control. • It also adds functions for registering and unregistering security modules, and adds a general security system call to support new system calls for security-aware applications.
  • 68.
    Kernel lockdown • Introduced in 2019 (v5.4), a Linux kernel security module that uses the LSM framework. • The kernel lockdown feature is designed to prevent both direct and indirect access to a running kernel image, • attempting to protect against unauthorized modification of the kernel image and • preventing access to security and cryptographic data located in kernel memory. • If a prohibited or restricted feature is accessed or used, the kernel will emit a message that looks like: Lockdown: X: Y is restricted, see man kernel_lockdown.7 where X indicates the process name and Y indicates what is restricted.
  • 70.
  • 72.
    Uncovered protections • Retbleed mitigation: • CONFIG_CPU_IBPB_ENTRY • CONFIG_CPU_UNRET_ENTRY • CONFIG_RETPOLINE • CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS • Spectre and Meltdown – ARM: • CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY • CONFIG_UNMAP_KERNEL_AT_EL0 • CONFIG_ARM64_PTR_AUTH_KERNEL • CONFIG_ARM64_BTI • CONFIG_ARM64_EPAN
  • 73.
    Uncovered protections • CONFIG_DEBUG_STACKOVERFLOW • KFENCE • KASAN/KMSAN/KCSAN • CONFIG_DEBUG_KMEMLEAK • refcount_t API • FORTIFY_SOURCE • PaX RAP • L1TF mitigation – PTE inversion
  • 74.

Editor's Notes

  • #6 [sudo] password for parallels: [21548.411904] Hello world. [21548.411906] Page offset value is 0xffff898740000000. [21548.411908] PHYS_OFFSET value is 0x0 or 1000000. [21548.411909] Task size is 0x7ffffffff000. [21548.411909] High memory address is 0xffff898890000000 and has physical address of 0x150000000. [21548.411910] Vmalloc start address is 0xffffa89600000000. [21548.411911] Physical address where kernel is loaded is 0x1000000. [21548.411911] Module memory start address is 0xffffffffc0000000 and ends at 0xffffffffff000000. [21548.411912] Data segment address is 0xffffffffc08e707b. [21548.411912] Stack location is 0xffffa89606d03b8c.
  • #8 https://linux-kernel-labs.github.io/refs/heads/master/lectures/address-space.html
  • #25 Source - https://github.com/bcoles/kasld
  • #35 Reference: https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
  • #39 https://github.com/torvalds/linux/commit/5aa90a84589282b87666f92b6c3c917c8080a9bf https://github.com/torvalds/linux/blob/master/Documentation/x86/pti.rst
  • #40 Can be bypassed using swapgs attack
  • #41 https://www.blackhat.com/docs/eu-14/materials/eu-14-Kemerlis-Ret2dir-Deconstructing-Kernel-Isolation-wp.pdf
  • #42 https://www.blackhat.com/docs/eu-14/materials/eu-14-Kemerlis-Ret2dir-Deconstructing-Kernel-Isolation-wp.pdf
  • #44 https://support.google.com/faqs/answer/7625886
  • #59 Source : https://docs.google.com/presentation/d/1IpICtHR1T3oHka858cx1dSNRu2XcT79-RCRPgzCuiRk/edit
  • #62 AES engine https://github.com/torvalds/linux/blob/master/Documentation/x86/amd-memory-encryption.rst