BSDTW17: Brooks Davis: Is it time to replace mmap?
“Memory” Throughout Unix History
● PDP-11 c. 1970
Process Address Space
● 0x0 - 0x7ff..f
● code, data, bss
● Null pointer trapping, Stack, Heap
● Physical address space
○ Copy on write
Process Address Space (cont.)
● Break -> … <- SP
● Break: highest address in address space that’s in use
● Single-threaded programs have their stack space magically managed
UNIX and BSD
● (combed through combined Unix history git repo)
● How was the break set?
○ PDP-7? 1970
○ 1972 V1: sysbreak syscall
○ 1972 V2: break syscall
○ 1973 V3: break syscall and docs
■ break sets the system’s idea of the highest location used by the program to addr.
Locations greater than addr and below the stack pointer are not swapped and are thus
liable to unexpected modifications.
○ 1974 V4: sbrk syscall, now with protection
■ manpage: memory violation will occur
○ 1975 V5: brk() introduced
○ 1983 4.2BSD: references to mmap()
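As a rough illustration of the break-based interface traced above, the sketch below moves the break up and back down again with sbrk(). It assumes a system that still provides sbrk() (Linux with glibc does; the aside at the end of these notes mentions platforms that do not). The helper name grow_break is made up for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* asks glibc to declare sbrk() */
#endif
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: raise the break by `bytes`, report how far it
 * actually moved, then lower it again. Returns -1 if the kernel refuses. */
long grow_break(long bytes)
{
    assert(bytes >= 0);

    /* sbrk(n) moves the break by n bytes and returns the previous break. */
    void *old_brk = sbrk((intptr_t)bytes);
    if (old_brk == (void *)-1)
        return -1;

    /* sbrk(0) only reports the current break. */
    void *new_brk = sbrk(0);
    long moved = (long)((char *)new_brk - (char *)old_brk);

    /* Hand the memory back so the caller's heap is unchanged. */
    sbrk(-(intptr_t)bytes);
    return moved;
}
```

Calling grow_break(4096) from a single-threaded program should report that the break moved by 4096 bytes. Nothing else may touch the break between the two sbrk() calls, which hints at why the interface fits badly with threads.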
Heap fragmentation (Problem with sbrk)
Memory Sharing
● read-only mappings share physical memory, and potentially TLB entries (though generally not)
Dynamic Linking
● use case where sharing is crucial (e.g. libc size vs. Unix tool)
Multi-threaded programs
● multiple stacks
● brk, sbrk interface problematic for stack allocation (stack interleaved with heap)
● no guard pages
● want stack to still be at the bottom of the address space plus guard pages
4.2BSD memory interfaces
● mmap()
○ allocate addr space
○ alter backing mappings
● mremap()
○ relocate or extend a mapping; no BSD implements it, Linux has an implementation
● munmap()
○ remove backing
● mprotect()
○ alter page protections
● madvise()
○ gives the kernel hints about memory usage (will need soon, so page it in; won’t need again; free contents after read)
● mincore()
○ query backing status
● sbrk()
○ extend or reduce “break”
● sstk()
○ extend or reduce stack.
● 4.2BSD: only sbrk() was implemented. All other interfaces were empty functions.
Back to History
● 1990 4.3-Reno: mmap() implemented with VM from Mach
○ VM implementation came from Mach but the mmap() interface was a BSD invention
● 2003 OpenBSD 3.3: implements W^X (a page is writable xor executable, never both)
W^X and JITs
● Map PROT_WRITE then remove PROT_WRITE and add PROT_EXEC
● the page-protection API cannot express “most pages should never become executable”
Back to History
● 2010 CHERI Project
○ pointers with bounds and permissions
■ strong monotonicity guarantees: read capability cannot turn into executable capability
○ Want W^X for pointers (in addition to pages)
○ API changes required:
■ make mprotect() return a pointer?
■ some other interface?
mmap() functionality issues
● conflates address reservation and mapping
○ lack of boundaries between reservations leads to bugs: e.g. Stack Clash
● Lack of expressiveness
○ no portable way to express alignment (e.g. on a superpage boundary)
○ no way to express maximum permission
mmap() API issues
● Too many arguments
● Too many failure modes
○ FreeBSD 11: 19 documented errors (15 use the same error code, EINVAL)
Other mmap() issues
● No support for mapping more pages than requested
○ can’t round up to superpage size
○ CHERI bounds compression requires rounding for very large allocations
● No concept of address space ownership
○ math errors mean changing the wrong region
RFC: cmmap (1/3)
● int cmreserve(cm_t *handlep, size_t length, vaddr_t hint, int prot, cmreq_t *cmr);
○ reserve a region (optionally mapping)
● int cmgetptr(cm_t handle, void **ptrp);
○ get pointer to region
RFC: cmmap (2/3)
● int cmap(cm_t handle, cmreq_t *cmr);
○ replace (part of) a region’s mappings
● int cmclose(cm_t handle);
○ close a handle, freeing memory
● int cmrestrict(cm_t handle, XX ops, XX *oops);
○ restrict the set of operations on a handle
● int cmstat(cm_t handle, size_t index, struct cm_stat *cs);
○ return data on a series of submaps
○ the cm_t handle should not be a file descriptor; otherwise all mappings would be enumerable, which would break ASLR
● cmadvise(), …
○ operate within a region
Map request objects
● cm_request_t is like pthread_attr_t
● Accessor functions
● Goal: useful defaults
CHERI extensions
● int cmgetcap(cm_t cookie, void **ptrp, perm_t perms)
○ get capability pointer
● int cmandperm()
○ reduce permissions
Should We Replace mmap()?
● Yes, No
Q&A
● ...
Aside: FreeBSD 11 shipped ARM64 and RISC-V without sbrk(); pretty much only Emacs broke as a result.