Practical, Transparent Operating System Support for Superpages
Authors: Juan Navarro, Sitaram Iyer, Peter Druschel, Alan Cox
Presented by: Nadeeshani Hewage

Basics & Terminology
• TLB?
  o Caches virtual-to-physical address mappings from the page tables
  o Faster access
• TLB coverage?
  o Amount of memory accessible through cached mappings, without TLB misses
• Superpage
  o Memory page of a larger size than a base page (ordinary memory page)
  o Multiple sizes
  o Single entry in TLB per superpage
• Memory objects
  o Occupy some contiguous space in virtual memory
  o E.g.: memory-mapped files, code, data
• TLB coverage, as a fraction of main memory, has been decreasing as memory sizes grow over the years
• Motivation: increase TLB coverage without enlarging the TLB

Constraints for Superpages
• The page size must be supported by the processor.
• A superpage must be contiguous.
• Its starting address in both the physical and the virtual address space must be a multiple of its size.
• A single TLB entry per superpage, meaning a single set of protection attributes and a single dirty bit for the whole superpage.
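The constraints above can be sketched as a single check. This is a minimal illustration, not the paper's code: the function name `can_promote`, its parameters, and the representation of frames and protections as plain lists are all assumptions, and the page sizes are the Alpha sizes listed on the platform slide.

```python
KB = 1024
BASE_PAGE = 8 * KB                                  # Alpha base page size
SUPERPAGE_SIZES = (64 * KB, 512 * KB, 4 * KB * KB)  # 64 KB, 512 KB, 4 MB

def can_promote(vaddr, frames, protections, size):
    """Hypothetical check: could these base pages share one TLB entry?

    frames:      physical frame numbers, one per base page, in virtual order
    protections: protection attributes, one per base page
    """
    # Constraint 1: the size must be one the processor supports.
    if size not in SUPERPAGE_SIZES:
        return False
    npages = size // BASE_PAGE
    if len(frames) != npages or len(protections) != npages:
        return False
    # Constraint 3: virtual and physical start must be multiples of the size.
    if vaddr % size != 0 or (frames[0] * BASE_PAGE) % size != 0:
        return False
    # Constraint 2: the physical frames must be contiguous.
    contiguous = all(f == frames[0] + i for i, f in enumerate(frames))
    # Constraint 4: one TLB entry means one set of protection attributes.
    uniform = len(set(protections)) == 1
    return contiguous and uniform
```

For example, `can_promote(0x10000, list(range(16, 24)), ["rw"] * 8, 64 * KB)` passes every check, while starting the same run at frame 17 fails the physical alignment test.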

Considerations
• Allocation
  o Allocating physical frames for the memory object.
  o If the OS plans to create a superpage, the allocation has to satisfy the contiguity requirement.
  o Relocation-based allocation
    • Creates contiguity
    • Physically copies allocated page frames once it is decided that superpages are likely to be beneficial
      o Software-managed TLB + TLB miss handler
      o Counter per superpage, incremented at each TLB miss
  o Reservation-based allocation
    • Preserves contiguity
    • Makes allocation decisions at page-fault time
    • Decides the preferred size of the allocation with an algorithm
      o E.g.: as proposed by Talluri & Hill
• Fragmentation control
  o Release contiguous chunks of inactive memory from previous allocations
  o Preempt an existing, partially used reservation
• Promotion
  o A set of base pages from a potential superpage that satisfies the constraints is promoted to a superpage.
• Demotion
  o Reduce the size of a superpage, to either
    • A smaller superpage
    • Base pages
• Eviction
  o Under memory pressure, an inactive superpage is evicted from physical memory.

Preferred Superpage Size Policy
• Decision is made early in process execution
• Looks only at the attributes of the memory object
• Decided size
  o Too large: the decision can be overridden by preempting the initial reservation
  o Too small: cannot be reverted
• Policy: pick the maximum superpage size that can be used for an object
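One way to read "the maximum superpage size that can be used for an object" is sketched below. The containment rule used here (the aligned region around the faulting address must lie entirely inside the object) is an assumption made for illustration, as are the function name `preferred_size` and its parameters; the slide itself only states the policy in words.

```python
KB = 1024
# Alpha page sizes from the platform slide, base page included.
PAGE_SIZES = (8 * KB, 64 * KB, 512 * KB, 4 * KB * KB)

def preferred_size(obj_start, obj_end, fault_addr, sizes=PAGE_SIZES):
    """Hypothetical policy: largest aligned extent around fault_addr that
    stays inside the memory object [obj_start, obj_end).

    Returns (size, extent_start).
    """
    if not (obj_start <= fault_addr < obj_end):
        raise ValueError("fault address lies outside the memory object")
    for size in sorted(sizes, reverse=True):
        start = fault_addr - (fault_addr % size)   # align down to the size
        if start >= obj_start and start + size <= obj_end:
            return size, start
    raise ValueError("object is not aligned to the base page size")

# A 1 MB object: a fault at 600 KB gets a 512 KB reservation at offset 512 KB.
size, start = preferred_size(0, 1024 * KB, 600 * KB)
```

Picking the largest size first matches the slide's asymmetry: a reservation that turns out too large can later be preempted, while one that is too small cannot be grown.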

Preempting Reservations
• Applies when free physical memory is scarce or excessively fragmented
• Preempt frames that are reserved but unused
• Choice between
  o Refusing the allocation, i.e. reserving a smaller extent than desired for the new allocation
  o Preempting an existing reservation that has enough unallocated frames
• Policy: whenever possible, preempt and give the space to the new allocation

Incremental Promotions, Demotions
• Promotions
  o Multiple superpage sizes exist
  o A superpage is created as soon as a region of any supported size becomes fully populated within a reservation
  o A smaller superpage is later promoted to the next larger superpage size
• Demotions
  o When the page daemon targets a base page of a superpage for eviction, the superpage is demoted
  o Demote to the next smaller superpage size

Paging Out Dirty Superpages
• The OS keeps only a single dirty bit for the whole superpage.
• Problem: even if only one base page was modified, the full superpage has to be written to disk (high I/O cost)
• Practice: demote clean superpages whenever a process attempts to write to them; repromote later, once all base pages have been dirtied.
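The I/O cost in question can be made concrete with a little arithmetic. The sketch below assumes the Alpha sizes from the platform slide (8 KB base pages, 4 MB largest superpage); the helper `writeback_bytes` is hypothetical and only models how many bytes must be written, not real paging code.

```python
KB = 1024
BASE_PAGE = 8 * KB
LARGEST_SUPERPAGE = 4 * KB * KB  # 4 MB

def writeback_bytes(superpage_size, dirty_base_pages, demoted):
    """Bytes written to disk when the region is paged out (illustrative).

    demoted=False: one dirty bit covers the superpage, so any write
                   forces the whole superpage to be written back.
    demoted=True:  dirty bits are per base page, so only the pages
                   actually modified are written back.
    """
    if dirty_base_pages == 0:
        return 0
    if demoted:
        return dirty_base_pages * BASE_PAGE
    return superpage_size

whole = writeback_bytes(LARGEST_SUPERPAGE, 1, demoted=False)  # 4 MB
single = writeback_bytes(LARGEST_SUPERPAGE, 1, demoted=True)  # 8 KB
amplification = whole // single                               # 512x
```

With a single modified base page, keeping the superpage intact writes 512 times more data than the demoted case, which is the motivation for demoting on the first write.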

Reservation Lists
• Keep track of reserved page extents that are not fully populated
• One list per supported page size
• Each list is kept sorted by the time of the extent's most recent page frame allocation
• Policy: preempt the extent whose most recent allocation occurred least recently among all reservations in the list (the head)
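A reservation list with this ordering behaves like an LRU queue keyed on allocation time. The following is a minimal sketch using Python's `collections.OrderedDict`; the class name `ReservationList` and its methods `touch`, `retire`, and `preempt` are hypothetical names chosen for illustration, not identifiers from the FreeBSD implementation.

```python
from collections import OrderedDict

class ReservationList:
    """Illustrative list of partially populated extents for one page size."""

    def __init__(self):
        # Insertion order doubles as "time of most recent allocation":
        # the head is the extent that was allocated from least recently.
        self._extents = OrderedDict()

    def touch(self, extent):
        """Record a page frame allocation inside `extent` (move to tail)."""
        self._extents.pop(extent, None)
        self._extents[extent] = None

    def retire(self, extent):
        """Drop an extent that became fully populated."""
        self._extents.pop(extent, None)

    def preempt(self):
        """Remove and return the head, or None if the list is empty."""
        if not self._extents:
            return None
        extent, _ = self._extents.popitem(last=False)
        return extent

# One list per supported superpage size (Alpha sizes, in bytes).
lists = {size: ReservationList() for size in (65536, 524288, 4194304)}
```

If extents A, B, and C are touched in that order and A is touched again, `preempt()` returns B first, since B is now the extent whose latest allocation is oldest.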

Population Map
• Keeps track of allocated base pages within each memory object
• Queried on every page fault, so it should support efficient lookups
• Radix tree
  o Each level corresponds to a page size
  o Root: maximum superpage size supported
  o Each non-leaf node stores
    • Number of superpage-sized virtual regions at the next lower level with a population of at least one (somepop)
    • Number of superpage-sized virtual regions at the next lower level that are fully populated (fullpop)
• Used for various decisions
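The somepop/fullpop bookkeeping can be sketched as a small radix tree. This is a simplified illustration under stated assumptions: a fanout of 8 (each Alpha page size is 8 times the next smaller one), a hypothetical `Node` class, and a `populate` method whose return value (the level of the largest region that just became full) is a convenience invented here to show how the map could drive incremental promotion.

```python
FANOUT = 8  # 8 KB -> 64 KB -> 512 KB -> 4 MB: each size is 8x the previous

class Node:
    """Radix-tree node covering FANOUT**level base pages (level >= 1)."""

    def __init__(self, level):
        self.level = level
        self.somepop = 0    # children with at least one populated base page
        self.fullpop = 0    # children that are fully populated
        self.children = {}  # slot -> Node, or True for a base page

    def populate(self, idx):
        """Mark base page `idx` (relative to this node) as populated.

        Returns the level of the largest region that just became fully
        populated (1 = 64 KB, 2 = 512 KB, 3 = 4 MB), or 0 if none did.
        """
        span = FANOUT ** (self.level - 1)  # base pages per child
        slot, rest = divmod(idx, span)
        if self.level == 1:                # children are base pages
            if slot in self.children:
                return 0                   # already populated
            self.children[slot] = True
            self.somepop += 1
            self.fullpop += 1
            return 1 if self.fullpop == FANOUT else 0
        child = self.children.get(slot)
        if child is None:
            child = self.children[slot] = Node(self.level - 1)
        was_empty = child.somepop == 0
        filled = child.populate(rest)
        if was_empty:
            self.somepop += 1
        if filled == self.level - 1:       # the child itself just became full
            self.fullpop += 1
            if self.fullpop == FANOUT:
                return self.level
        return filled

# A 512 KB region (level 2) holds 64 base pages.
root = Node(2)
results = [root.populate(i) for i in range(8)]  # fill the first 64 KB region
```

After the eighth page fault, `results[-1]` is 1, signalling that a 64 KB promotion is possible, and the root reports `somepop == 1` and `fullpop == 1`.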

Wired Page Clustering
• Pages used for the OS's internal data structures (in the FreeBSD implementation) are wired, i.e. never evicted
• At system boot time, they are clustered together in physical memory
• As the kernel gradually allocates more memory, wired pages get scattered, which causes fragmentation
• To avoid this fragmentation, identify pages that are about to be wired (internal use) and cluster them in pools of contiguous physical memory

Platform & Workload
• Platform
  o FreeBSD 4.3 kernel, as a loadable module
  o Alpha 21264 processor at 500 MHz
  o Four page sizes: 8KB base pages; 64KB, 512KB and 4MB superpages
  o 512MB RAM
  o 64KB data and 64KB instruction L1 caches, virtually indexed and 2-way associative
  o Fully associative TLB with 128 entries for data and 128 for instructions
• Workload
  o FFTW: the Fastest Fourier Transform in the West, with a 200x200x200 matrix as input
  o Matrix: a non-blocked matrix transposition of a 1000x1000 matrix
  o Web: the thttpd web server servicing 50000 requests selected from an access log of the CS departmental web server at Rice University. The working set size of this trace is 238MB, while its data set is 3.6GB.
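The platform numbers make the TLB coverage argument easy to quantify. The short calculation below uses only the figures on this slide (128 data TLB entries and the four Alpha page sizes); the helper name `tlb_coverage` is an illustrative choice.

```python
KB = 1024
MB = 1024 * KB
TLB_ENTRIES = 128  # data TLB entries on the Alpha 21264 test machine

PAGE_SIZES = {"8KB": 8 * KB, "64KB": 64 * KB, "512KB": 512 * KB, "4MB": 4 * MB}

def tlb_coverage(entries, page_size):
    """Memory reachable without a TLB miss if every entry maps `page_size`."""
    return entries * page_size

coverage_mb = {name: tlb_coverage(TLB_ENTRIES, size) // MB
               for name, size in PAGE_SIZES.items()}
# coverage_mb == {'8KB': 1, '64KB': 8, '512KB': 64, '4MB': 512}
```

With base pages alone, the data TLB covers 1MB, far below the 238MB working set of the Web workload; if every entry mapped a 4MB superpage, coverage would reach 512MB, the size of the machine's RAM.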

Best-Case Benefits
• When free memory is plentiful and non-fragmented
  o Speedup (mean elapsed runtime of 3 runs after an initial warmup)
  o Percentage reduction in TLB misses

Evaluation (ctd.)
• Fragment the system by running the web server and feeding it requests
• Run FFTW on the fragmented system 4 times in sequence
• Goal: see how quickly the system recovers just enough contiguous memory to build superpages
• 2 contiguity restoration techniques
  o Cache scheme
    • Treats all cached pages as available and coalesces them into the allocator
    • 0 superpages
  o Daemon scheme
    • Contiguity-aware page replacement + wired page clustering
    • Gets 20/60, 35/60, 38/60, 40/60 superpages

• A superpage implementation alone is not enough: it needs an effective fragmentation control mechanism to deliver good results.
• With superpages and effective fragmentation control, the evaluation shows promising results.