
Jaime Peñalba - Kernel exploitation. ¿El octavo arte? [rooted2019]


  1. 1. Kernel Exploitation ¿El octavo arte? RootedCON 2019 Jaime Peñalba Estébanez @NighterMan
  2. 2. Congratulations
  3. 3. Jaime Peñalba Estébanez @NighterMan jpenalba@member.fsf.org HATER at the interwebz whoami
  4. 4. whoami ● King of small things ● Collaborator on El Hormiguero ● Best magician of 2016 ● Winner of Club de la Comedia 2001
  5. 5. Jaime Peñalba Estébanez @NighterMan jpenalba@member.fsf.org ● Linux kernel researcher at COSEINC ● I used to be cool ● Now I waste my life in GDB whoami
  6. 6. Unrelated Disclaimer Please, do not use docker for security purposes. This applies to docker, lxc, or any container system based on namespaces and cgroups. Do not think of it as a security container or a valid security mechanism. It's just a chroot on steroids.
  7. 7. I'm not a big fan of exploitation talks: ● They require an audience with a deep understanding of the target ● They are boring and hard to follow ● If you get distracted, you lose This should be a trilogy (at least): ● A talk about kernel basics ● A talk about exploitation ● A talk about memory management ● A talk about developing tools and sanitizers ● A talk about how to find bugs I'm going to try to make a mix: a high-level explanation of a real-life bug What is this all about?
  8. 8. We will try to cover the following topics: ● Introduction to basic kernel concepts ● Basic kernel exploitation ● Some kernel mitigations ● Kernel memory management ● Dynamic memory management ● Real world exploit? All this will be explained at a very high level, as each topic could be a whole talk by itself. Tragedy is in the air…. What is this all about?
  9. 9. Battle Royale Ed. RootedCON
  10. 10. SOME KERNEL BASICS
  11. 11. The kernel is a computer program that is the core of a computer's operating system, with complete control over everything in the system. On most systems, it is one of the first programs loaded on start-up (after the bootloader). It handles the rest of start-up as well as input/output requests from software, translating them into data-processing instructions for the central processing unit. It handles memory and peripherals like keyboards, monitors, printers, and speakers. The critical code of the kernel is usually loaded into a separate area of memory, which is protected from access by application programs or other, less critical parts of the operating system. The kernel performs its tasks, such as running processes, managing hardware devices such as the hard disk, and handling interrupts, in this protected kernel space. In contrast, everything a user does is in user space: writing text in a text editor, running programs in a GUI, etc. This separation prevents user data and kernel data from interfering with each other and causing instability and slowness, as well as preventing malfunctioning application programs from crashing the entire operating system. The kernel's interface is a low-level abstraction layer. When a process makes requests of the kernel, it is called a system call. Kernel designs differ in how they manage these system calls and resources. A monolithic kernel runs all the operating system instructions in the same address space for speed. A microkernel runs most processes in user space, for modularity. Source: https://en.wikipedia.org/wiki/Kernel_(operating_system) What is a kernel
  12. 12. Source: N. Murray, N. Horman, Understanding Virtual Memory What is a kernel
  13. 13. Virtual Memory Upper Canonical: from 0xFFFF FFFF FFFF FFFF down to 0xFFFF 8000 0000 0000 Non-Canonical hole in between Lower Canonical: from 0x0000 7FFF FFFF FFFF down to 0x0000 0000 0000 0000 The 17 top bits (the unimplemented bits plus the most significant implemented bit, bit 47) must all match: bit 47 is sign-extended upwards, leaving 47 bits used for addressing
  14. 14. Virtual Memory Virtual memory map with 4-level page tables ● 47 bits address space + 1 bit sign ● 2^47 = 128TB ● 256TB of mappable memory Virtual memory map with 5-level page tables ● 56 bits address space + 1 bit sign ● 2^56 = 64PB ● 128PB of mappable memory
  15. 15. Virtual Memory A basic memory map looks like this:
  16. 16. Virtual Memory Upper Canonical Non-Canonical Lower Canonical 0xFFFFFFFF FFFFFFFF 0xFFFF8000 00000000 0x00000000 00000000 0x00007FFF FFFFFFFF Kernel Space virtual mem shared between all processes User Space virtual mem different per mm TASK_SIZE
  17. 17. Virtual Memory
  18. 18. Virtual Memory Upper Canonical Non-Canonical Process 234 0xFFFFFFFF FFFFFFFF 0xFFFF8000 00000000 0x00000000 00000000 0x00007FFF FFFFFFFF Kernel Space virtual mem shared between all processes User Space virtual mem different per mm TASK_SIZE Process 312Process 1453Process 457
  19. 19. BASIC KERNEL EXPLOITATION
  20. 20. In a kernel release far, far away...
  21. 21. NULL Pointer Dereference Let’s suppose we have the following code running in the kernel as a miscdevice at “/proc/vulnerable”. 0x0000000 0x0000000+4
  22. 22. NULL Pointer Dereference This userspace program triggers the vulnerable code in the kernel using an ioctl. It should trigger a NULL pointer dereference and produce a kernel panic / system crash.
  23. 23. NULL Pointer Dereference What if we mmap addr 0x0 on user space?
  24. 24. NULL Pointer Dereference struct my_struct *tmp; tmp->get_counter(); struct my_struct { int counter; int (*get_counter) (); }; void escalate_privs() { /* Do hacker things */ return; } 0xFFFFFFFFFFFFFFFF 0x0000000000000000 Kernel Space User Space TASK_SIZE
  25. 25. Congratulations! You traveled back to 1999 and are now OSEE certified
  26. 26. MITIGATIONS
  27. 27. Linux Kernel Defence Map 1 Source: https://github.com/a13xp0p0v/linux-kernel-defence-map
  28. 28. Linux Kernel Defence Map 2 Source: https://github.com/a13xp0p0v/linux-kernel-defence-map
  29. 29. Linux Kernel Defence Map 3 Source: https://github.com/a13xp0p0v/linux-kernel-defence-map
  30. 30. Linux Kernel Defence Map 4 Source: https://github.com/a13xp0p0v/linux-kernel-defence-map
  31. 31. MitiGator raising the bar
  32. 32. SMAP/SMEP SMAP ● Supervisor Mode Access Prevention ● Cannot access/dereference any pages that are userspace (U=1) while the CPU is running in privileged mode (CPL=0) ● Uses the AC flag in the EFLAGS register ● Two new instructions: CLAC/STAC ● 21st bit in the CR4 register SMEP ● Supervisor Mode Execution Prevention ● Cannot execute code from any pages that are userspace (U=1) while the CPU is running in privileged mode (CPL=0) ● 20th bit in the CR4 register ● Older/more common than SMAP These are implemented at the hardware level. It's the CPU itself that enforces the protection (like the NX bit).
  33. 33. SMAP/SMEP struct my_struct *tmp; tmp->get_counter(); struct my_struct { int counter; int (*get_counter) (); }; void escalate_privs() { /* Do hacker things */ return; } 0xFFFFFFFFFFFFFFFF 0x0000000000000000 Kernel Space User Space TASK_SIZE SMEP SMAP
  34. 34. KPTI (KAISER) Kernel Page Table Isolation was designed to try to stop leaks caused by side channel attacks such as Meltdown or Spectre. ● Two sets of page tables are maintained: one used while running in privileged mode and another for unprivileged mode. ● Userspace PTEs hide most of the kernel mappings, reducing them to a minimum: kernel code for handling syscalls, interrupts, etc. ● Kernel PTEs mark the whole user space area as non-executable. This turns out to be something like a software implementation of SMEP. ● Impact on performance from 5% to 30%, mostly due to TLB flushing. It can be improved on CPUs supporting PCID. Source: https://es.wikipedia.org/wiki/Aislamiento_de_tablas_de_páginas_del_núcleo
  35. 35. KPTI (KAISER) Kernel space User space Kernel space User space Kernel space User space No KPTI KPTI Kernel PTEs Userspace PTEs
  36. 36. KPTI (KAISER) struct my_struct *tmp; tmp->get_counter(); struct my_struct { int counter; int (*get_counter) (); }; void escalate_privs() { /* Do hacker things */ return; } 0xFFFFFFFFFFFFFFFF 0x0000000000000000 Kernel Space User Space TASK_SIZE NX
  37. 37. DYNAMIC MEMORY MANAGEMENT
  38. 38. KMEM CACHES kmalloc() is the normal method of allocating memory for objects smaller than page size in the kernel. These objects are stored in different caches based on their size, or on the cache requested.
  39. 39. KMEM CACHES object object object object object object object object One full SLAB at kmalloc-192 cache contains 20 chunks Full SLAB can be one or more pages One chunk is 192 bytes in size
  40. 40. Allocators The Linux kernel has implemented different allocators over the years. Nowadays we have 3 different allocators available. You must choose one of them at compilation time: ● SLAB: The first one ● SLUB: Smaller memory footprint and less locking ● SLOB: Aimed at embedded systems Each allocator has its own way to track free and allocated objects, per-CPU local caches, etc...
  41. 41. SLAB Allocator example
  42. 42. SLAB Allocator example Source: Understanding the Linux Virtual Memory Manager / Mel Gorman.
  43. 43. TCP KMEM_CACHE (SLAB Allocator)
  44. 44. ENOTIME
  45. 45. MEMORY MANAGEMENT
  46. 46. ENOTIME
  47. 47. REAL WORLD EXPLOITATION
  48. 48. Our bug I won't be giving any details on the bug itself, so you will have to believe me… It's a race condition, and the code displayed is just an example...
  49. 49. Our exploit Our base bug is a race condition which causes corruption of a circular doubly linked list. An outline of the exploitation looks like this: 1. Use the race condition to add the same item twice into the list 2. Free the item, which removes it from the list just once 3. The linked list still contains one of the entries, but the item is freed 4. We now have a use-after-free 5. Repeat steps 1 to 4 to produce a second UAF 6. Use both UAFs in the list to achieve a write-what-where primitive So somehow we turned a race condition into a write-what-where.
  50. 50. Doubly linked list
  51. 51. Doubly linked list
  52. 52. Doubly linked list Source: Understanding the Linux Kernel 3rd edition. Daniel P. Bovet and Marco Cesati.
  53. 53. Empty list next prev list_head An empty doubly linked list just points to itself for the next and prev fields.
  54. 54. Adding an item
  55. 55. Adding one item list.next list.prev next prev list_head vuln_subsys (1)
  56. 56. Adding another item list.next list.prev list.next list.prev next prev list_head vuln_subsys (2) vuln_subsys (1)
  57. 57. Removing an item
  58. 58. Removing the 2nd item list.next list.prev next prev list_head vuln_subsys (1)
  59. 59. Abusing the race condition What would happen if we call add_vuln_subsys() at the same time from two different processes with the same vuln_subsys *entry ?
  60. 60. Abusing the race condition list.next list.prev next prev list_head vuln_subsys (1) Adding the same entry twice. First add:
  61. 61. Abusing the race condition Adding the same entry twice. Second add: list.next list.prev next prev list_head vuln_subsys (1)
  62. 62. Abusing the race condition After the double add, we free the vuln_subsys *entry, which removes the entry from the list and frees the object.
  63. 63. Abusing the race condition list.next list.prev next prev list_head vuln_subsys (1) 0xdead000000200200
  64. 64. Abusing the race condition If you remember the free_vuln_subsys() function, it does not just remove the item from the list, it also frees the object. So we now have a use-after-free.
  65. 65. Abusing the race condition
  66. 66. Abusing the race condition list.next list.prev next prev list_head vuln_subsys (1) 0xdead000000200200 The chunk is now free
  67. 67. Abusing the race condition Usually from here you would try to allocate over the recently freed memory chunk with data under your control, then trigger some code path which makes use of that freed object, and try to take control of the system. A common way to take control is overwriting a function pointer in the structure/object, if there is one. In this case there was just a work queue, which we could redirect to a fake work queue under our control. Sadly that queue is only triggered after some special events, so we need to find another way….
  68. 68. UAF / Heap Spraying In order to exploit bugs related to dynamic memory management, such as use-after-frees, you usually want to find a mechanism that meets the following criteria: ● It lets you allocate and deallocate memory at will. ● It lets you control the allocation size, to be able to hit different caches. ● It has high limits, since you will often find that spraying is required. ● It won't add any data before or after your allocation I tend to use two ways to do such a thing.
  69. 69. UAF / Heap Spraying System V message queues – msgsnd() We can create a new message queue, send messages into the queue to allocate, and consume messages from the queue to deallocate. ● The first 48 bytes of the allocation will be used for the msg_msg structure, and we do not fully control that struct ● There are limits on the max number of message queues which can be created, and also on the max number of messages queued in each queue ● There are no restrictions on NULL bytes or anything else ● We can use mtype to mark messages
  70. 70. UAF / Heap Spraying Key Management – add_key() keyctl() We can create new keys to allocate, and delete keys to deallocate. We can use the key name or the key data to store data. I normally use the key name. ● No data is added before our buffer, but it appends a NULL byte at the end ● Very limited by the key quota: around 200 keys or 20000 bytes ● Maximum allocation is around 4093 bytes ● Key names cannot be equal (we need to change at least 1 byte) ● NULL bytes are not allowed
  71. 71. WARNING Things start to go bonkers from now on.
  72. 72. Abusing the UAF Looks like there is no obvious way to quickly exploit this…. (You have to believe me) So… What if we add another entry into the doubly linked list?
  73. 73. Abusing the UAF list.next list.prev next prev list_head vuln_subsys (1) The chunk is now free list.next list.prev vuln_subsys (2)
  74. 74. Abusing the UAF By adding a second entry into the linked list, we will overwrite the first entry's list.prev field. In this case the list.prev field is at offset 176. The data written will be the address of the previous object in the list plus 168 bytes, which is the offset where the list_head field is held within the object. So now we have a new exploit primitive: an 8-byte write of uncontrolled data at offset 176.
  75. 75. Abusing the UAF So now that we know we can write 8 bytes of uncontrolled data at offset 176, we have to search for the following: 1. A structure which will be allocated in the kmalloc-192 cache 2. With a pointer or something like it at offset 176, which we will be overwriting 3. Which can be allocated and deallocated at will Finding such a case is not easy, as the size-192 cache covers objects from 128 to 192 bytes, and writing at offset 176 leaves us just with objects/structures which are 184 to 192 bytes in size.
  76. 76. Abusing the UAF Several ideas to look for candidates: ● Search for structures between 184 and 192 bytes in size which contain a pointer at offset 176 and are part of the kernel core ● Allocate a 176-byte string, and overwrite the null terminator with our primitive (more crazy shit) ● Exercise the kernel and examine the kmalloc-192 cache. Developing some scripts is required...
  77. 77. Abusing the UAF
  78. 78. Abusing the UAF
  79. 79. Abusing the UAF Ended up developing some scripts to do the following tasks: ● Exercise the kernel as much as we can ● Extract the stack traces of all the allocations in the kmalloc-192 cache ● Sum all the allocations and remove duplicates Finally, we manually review all the unique stack traces and look for useful ones...
  80. 80. Abusing the UAF
  81. 81. Abusing the UAF
  82. 82. Abusing the UAF After a lot of research, we found a suitable candidate. It involves using IPC System V shared memory segments. This meets all the criteria previously defined: ● A structure which will be allocated in the kmalloc-192 cache ● With a pointer or something like it at offset 176, which we will be overwriting ● Which can be allocated and deallocated at will
  83. 83. Abusing the UAF
  84. 84. Abusing the UAF struct ipc_rcu + struct shmid_kernel: 64 bytes + 120 bytes = 184 bytes struct ipc_rcu + shmid_kernel.mlock_user: 64 bytes + offset 112 = offset 176
  85. 85. Abusing the UAF
  86. 86. Abusing the UAF list.next list.prev next prev list_head vuln_subsys (1) 0xdead000000200200 The chunk is now free
  87. 87. Abusing the UAF shm_cprid shm_lprid *mlock_user next prev list_head vuln_subsys (1) shmid_kernel struct ipc_rcu (64 bytes) struct shmid_kernel (120 bytes) int seg = shmget(key++, 4096, IPC_CREAT | 0600);
  88. 88. Abusing the UAF shm_cprid shm_lprid *mlock_user next prev list_head vuln_subsys (1) shmid_kernel list.next list.prev vuln_subsys (2) Pointer overwritten
  89. 89. Abusing the UAF again shm_cprid shm_lprid *mlock_user next prev list_head vuln_subsys (1) shmid_kernel list.next list.prev vuln_subsys (2) 0xdead000000200200
  90. 90. Abusing the UAF again shm_cprid shm_lprid *mlock_user next prev list_head vuln_subsys (1) shmid_kernel __count processes files ... inotify_devs vuln_subsys (2) fake mlock_user alloc_msgsnd((char *)&fake_mlock_user, 140);
  91. 91. Abusing the race condition If we take a look at the mlock_user field, it points to a struct user_struct which we will store at the second UAF object. It is 104 bytes in length. But mlock_user will point to offset 168 inside a 192-byte chunk. If we do the math: 168 + 104 = 272. So when reading this structure we will be reading past the 192-byte chunk, into the next chunk….
  92. 92. shm_cprid shm_lprid *mlock_user vuln_subsys (1) shmid_kernel __count processes files ... inotify_devs vuln_subsys (2) fake mlock_user First 168 bytes are not used 192 bytes chunk Allocated at `kmalloc-192` cache Only 24 bytes Left on this chunk Contiguous chunk at `kmalloc-192` cache Some other object in the slab Remaining data overflows into next chunk in the slab Offset 0 Offset 168 Offset 192 Offset 0
  93. 93. Each step is a new problem...
  94. 94. Don't worry, there are more problems... Allocating over the freed chunks might look easy at first, but it is not. Memory allocation is not a deterministic process: there are many other tasks in the kernel, processes in the system, interrupts, etc… which will be competing with you for allocations and changing the layout of the kmem cache. There are per-CPU local object caches for fast/lockless allocation and deallocation paths, so binding to a CPU is required. Your success can vary depending on memory fragmentation, memory pressure, system load, etc… Your task is to turn something non-deterministic and unreliable by nature into something more or less reliable…
  95. 95. Don’t worry, there are more problems... Our fake mlock_user structure has been allocated here. When reading past the first 24 bytes of the structure, we will start reading into this next chunk.
  96. 96. Don’t worry, there are more problems... Original structure declaration Structure declaration aligned to offset 168 for kmalloc-192 cache New structure start offset (168)
  97. 97. kmalloc-192 slab One full SLAB at the kmalloc-192 cache contains 20 chunks One chunk is 192 bytes in size The goal of our heap massage is to control a whole slab with our custom data, so whenever the read overflow happens, the data being read will be under our control. Or at least it will probably be under our control…..
  98. 98. kmalloc-192 partial slabs ALLOCATED ALLOCATED ALLOCATED ALLOCATED FREE ALLOCATED ALLOCATED ALLOCATED ALLOCATED FREE ALLOCATED ALLOCATED ALLOCATED ALLOCATED FREE ALLOCATED ALLOCATED ALLOCATED FREE FREE ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED FREE FREE FREE FREE ALLOCATED ALLOCATED Partial slabs First we will fill all those gaps in order to force the creation of a new whole slab, which is the one we will be trying to take control of. We can read /proc/slabinfo, if available, to find how many partial slabs and objects exist. Use System V message queues to fill the gaps: msgsnd()
  99. 99. kmalloc-192 filled slabs Next we will fill two new slabs containing our “unaligned” fake user_struct data. To do the allocation we can use the key management facility. With this method we cannot use NULL bytes, and we are limited by the key quota. ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED FILL FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL FILL FILL FILL ALLOCATED ALLOCATED Full slabs
  100. 100. kmalloc-192 filled slabs ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED FILL FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL FILL FILL FILL ALLOCATED ALLOCATED Fullslabs alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key Allocate two full slabs (40 items) with fake user_struct using `alloc_key_name()` method alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key Newslabs
  101. 101. kmalloc-192 create gaps ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED FILL FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL FILL FILL FILL ALLOCATED ALLOCATED Fullslabs alloc_key alloc_key HOLE alloc_key HOLE alloc_key HOLE alloc_key Create some gaps into the slabs by freeing keys alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key Newslabs
  102. 102. kmalloc-192 create gaps ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED FILL FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL FILL FILL FILL ALLOCATED ALLOCATED Fullslabs alloc_key alloc_key vuln_subsys alloc_key HOLE alloc_key HOLE alloc_key Allocate a new vuln_subsys (2) alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key Newslabs
  103. 103. kmalloc-192 create gaps ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED FILL FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL FILL FILL FILL ALLOCATED ALLOCATED Fullslabs alloc_key alloc_key vuln_subsys alloc_key HOLE alloc_key HOLE alloc_key Trigger the race and free the vuln_subsys (2) to produce an UAF alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key Newslabs
  104. 104. kmalloc-192 create gaps ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL ALLOCATED ALLOCATED ALLOCATED FILL FILL ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED ALLOCATED FILL FILL FILL FILL ALLOCATED ALLOCATED Fullslabs alloc_key alloc_key vuln_subsys msgsnd alloc_key HOLE alloc_key HOLE alloc_key Allocate over the UAF another fake mlock_user Using msgsnd() method alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key alloc_key Newslabs
  105. 105. kmalloc-192 create gaps First chunk allocated using alloc_msgbuf and containing a null byte Second chunk containing the overflowed data, allocated using alloc_key_name The structure will start being read from offset 168 The read will continue from offset 0 on the next chunk
  106. 106. write-what-where So we have a shmid_kernel structure pointing to a user_struct structure fully controlled by us. What can be done with it? Let's take another look at the user_struct structure:
  107. 107. write-what-where This piece of code can be triggered when destroying the shared memory segment. But we have to meet several conditions: ● The shared memory segment must have been locked; otherwise the user_struct structure is ignored. ● The __count field in the user_struct structure must be set to 1. This way the free_user() function will be called when the reference counter is decreased to 0.
  108. 108. privilege escalation Now that we have a proper write-what-where, we can write 8 bytes of data anywhere in memory. There are many things we can overwrite in order to escalate privileges. Among the most common is overwriting a file_operations structure. There are tons of them, they are easily located in memory as they are defined as globals, and they contain tons of function pointers. I personally do not use this method. But let's suppose I do….
  109. 109. privilege escalation Our target will be the file operations for “/dev/ptmx”. We will overwrite the llseek function pointer, as it points to no_llseek in the ptmx_fops structure, and nobody should ever call it...
  110. 110. privilege escalation If there were no protections in place, we could simply point tty_fops.llseek at a userspace address with our code. But it is not 1999 any more… So what we are going to do is point it at a ROP gadget which will pivot the stack. Let's take a look at the llseek() function prototype: When calling llseek() from userspace we control the offset and origin arguments, which will be in rsi and rdx respectively. So gadgets such as the following should do the job: loff_t no_llseek(struct file *file, loff_t offset, int origin) mov rsp, rsi; ret; xchg rsi, rsp; ret;
  111. 111. privilege escalation Triggering the stack pivot should be as easy as running this code from userspace: The kernel stack will then point to 0xdeadbeefdeadbeef, where we have our full ROP chain to disable SMAP/SMEP or mark a kernel page from the direct mapping range as executable. Sample ROP chain:
  112. 112. privilege escalation Although our fake stack and the escalation code are part of our exploit, which is running in userspace, we reference them using addresses from kernel space. There are some ways to find where our userspace memory is mapped into the direct mapping of kernel space. This direct mapping is marked as non-executable; that's why we use the ROP chain to mark the page containing our code as executable prior to jumping to it.
  113. 113. privilege escalation
  114. 114. Cleanup To finish up the exploitation it is necessary to clean up any mess and keep the kernel stable, otherwise all your work will be worthless…. ● Fix any kmem caches we might have corrupted ● Restore any pointers overwritten ● Restore the stack so the kernel will properly return to userspace ● Fix any other mess you may have made….
  115. 115. DEMO
  116. 116. THE END Jaime Peñalba Estébanez @NighterMan jpenalba@member.fsf.org
