Operating Systems
        CMPSCI 377
Dynamic Memory Management
                     Emery Berger
  University of Massachus...
Dynamic Memory Management
    How the heap manager is implemented


        malloc, free
    

        new, delete
    ...
Memory Management
    Ideal memory manager:


        Fast
    

              Raw time, asymptotic runtime, locality
  ...
Memory Manager Functions
    Not just malloc/free


        realloc
    

              Change size of object, copying o...
Fragmentation
    Intuitively, fragmentation stems from


    “breaking” up heap into unusable spaces
        More fragme...
Classical Algorithms
    First-fit


        find first chunk of desired size
    




        UNIVERSITY OF MASSACHUSET...
Classical Algorithms
    Best-fit


        find chunk that fits best
    

              Minimizes wasted space
       ...
Classical Algorithms
    Worst-fit


        find chunk that fits worst
    

        then split object
    




    Re...
Implementation Techniques
    Freelists


        Linked lists of objects in same size class
    

              Range o...
Implementation Techniques
    Segregated size classes


        Use free lists, but never coalesce or split
    

    Ch...
Implementation Techniques
    Big Bag of Pages (BiBOP)


        Page or pages (multiples of 4K)
    

        Usually s...
Runtime Analysis
    Key components


        Cost of malloc (best, worst, average)
    

        Cost of free
    

  ...
Space Bounds
    Fragmentation worst-case for “optimal”:


    O(log M/m)
        M = largest object size
    

        ...
Performance Issues
    We’ll talk about scalability later


    Reliability, too


    But: general-purpose allocator of...
Custom Memory Allocation
    Programmers replace                                    Very common
                         ...
Drawbacks of Custom Allocators

    Avoiding system allocator:


        More code to maintain & debug
    

        Can...
(I) Per-Class Allocators

    Recycle freed objects from a free list


    a = new Class1;                    Class1
    ...
(II) Custom Patterns
             Tailor-made to fit allocation patterns
         

                 Example: 197.parser ...
(III) Regions


    Separate areas, deletion only en masse


    regioncreate(r)                 r
    regionmalloc(r, sz...
Custom Allocators Are Faster…

                                Runtime - Custom Allocator Benchmarks

                    ...
Not So Fast…

                                        Runtime - Custom Allocator Benchmarks
                              ...
The Lea Allocator (DLmalloc 2.7.0)
    Mature public-domain general-purpose

    allocator
    Optimized for common alloc...
Space Consumption: Mixed Results

                                      Space - Custom Allocator Benchmarks

             ...
Custom Allocators?
    Generally not worth the trouble:


    use good general-purpose allocator
        Avoids risky sof...
Problems with Unsafe Languages
       C, C++: pervasive applications, but the languages are memory unsafe
       Numerous opport...
Soundness for “Erroneous” Programs

        Normally: memory errors ⇒ ? …
    

        Consider infinite-heap allocator:...
Probabilistic Memory Safety

   Approximate infinite heap with M-heaps (e.g., M=2)

       DieHard: fully-randomized M-h...
Implementation Choices

       Conventional, freelist-based heaps
   

            Hard to randomize, protect from errors...
Randomized Heap Layout
00000001           1010 10             metadata
size = 2^(i+3)      2^(i+4)  2^(i+5)

                ...
Randomized Allocation
00000001         1010 10             metadata
size = 2^(i+3)    2^(i+4)  2^(i+5)

                     ...
Randomized Allocation
00010001         1010 10             metadata
size = 2^(i+3)    2^(i+4)  2^(i+5)

                     ...
Randomized Deallocation
00010001         1010 10             metadata
size = 2^(i+3)    2^(i+4)  2^(i+5)

                   ...
Randomized Deallocation
00010001         1010 10             metadata
size = 2^(i+3)    2^(i+4)  2^(i+5)

                   ...
Randomized Deallocation
00000001         1010 10             metadata
size = 2^(i+3)    2^(i+4)  2^(i+5)

                   ...
Randomized Heaps & Reliability
                  object size = 2^(i+3)                                object size = 2^(i+4)
 ...
DieHard software architecture


                                            replica1
                               seed1
...
DieHard Results

        Analytical results (pictures!)
    

            Buffer overflows
        

            Uniniti...
Analytical Results: Buffer Overflows

     Model overflow as write of live data
 

         Heap half full (max occupancy...
Analytical Results: Buffer Overflows

     Model overflow as write of live data
 

         Heap half full (max occupancy...
Analytical Results: Buffer Overflows

     Model overflow: random write of live
 

     data
         Heap half full (max...
Analytical Results: Buffer Overflows

     Replicas: Increase odds of avoiding
 

     overflow in at least one replica
 ...
Analytical Results: Buffer Overflows

     Replicas: Increase odds of avoiding
 

     overflow in at least one replica
 ...
Analytical Results: Buffer Overflows

     Replicas: Increase odds of avoiding
 

     overflow in at least one replica
 ...
Analytical Results: Buffer Overflows


    F = free space


    H = heap size

    N = # objects

    worth of
    over...
Empirical Results: Runtime




       UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
Empirical Results: Error Avoidance
       Injected faults:
   

           Dangling pointers (@50%, 10 allocations)
     ...
The End




Operating Systems - Dynamic Memory Management
From the Operating Systems course (CMPSCI 377) at UMass Amherst, Fall 2007.
  1. Operating Systems (CMPSCI 377)
       Dynamic Memory Management
       Emery Berger
       University of Massachusetts Amherst • Department of Computer Science
  2. Dynamic Memory Management
       How the heap manager is implemented
          malloc, free
          new, delete
  3. Memory Management
       Ideal memory manager:
          Fast
             Raw time, asymptotic runtime, locality
          Memory efficient
             Low fragmentation
       With multicore & multiprocessors:
          Scalable to multiple processors
       New issues:
          Secure from attack
          Reliable in face of errors
  4. Memory Manager Functions
       Not just malloc/free
          realloc
             Change size of object, copying old contents
             ptr = realloc (ptr, 10);
             But: realloc (ptr, 0) = ?
             How about: realloc (NULL, 16) ?
          Other fun
             calloc
             memalign
       Needs ability to locate size & object start
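The two questions on this slide have different answers in standard C: realloc(NULL, n) behaves like malloc(n), while the result of realloc(ptr, 0) is not pinned down and varies between implementations, so portable code avoids it. The sketch below shows one cautious way to wrap realloc. The helper name resize_buffer is hypothetical and not part of the lecture. It also avoids the leak hidden in the slide's `ptr = realloc(ptr, 10)` idiom, which loses the old block if realloc fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grow or shrink *buf to new_size bytes.
 * Returns 0 on success, -1 on failure (in which case *buf is untouched). */
int resize_buffer(char **buf, size_t new_size) {
    assert(buf != NULL && new_size > 0);   /* size 0 is deliberately rejected */
    char *tmp = realloc(*buf, new_size);   /* realloc(NULL, n) acts like malloc(n) */
    if (tmp == NULL)
        return -1;                         /* old block is still valid and owned by caller */
    *buf = tmp;
    return 0;
}
```

Starting from a NULL pointer, the first call allocates and later calls resize while preserving the old contents up to the smaller of the two sizes.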
  5. Fragmentation
       Intuitively, fragmentation stems from “breaking” up heap into unusable spaces
          More fragmentation = worse utilization of memory
       External fragmentation
          Wasted space outside allocated objects
       Internal fragmentation
          Wasted space inside an object
  6. Classical Algorithms
       First-fit
          find first chunk of at least the desired size
  7. Classical Algorithms
       Best-fit
          find chunk that fits best
             Minimizes wasted space
  8. Classical Algorithms
       Worst-fit
          find chunk that fits worst
          then split object
       Reclaim space: coalesce free adjacent objects into one big object
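The three policies differ only in which adequate chunk they choose, which a single scan over a free list can illustrate. This is a minimal sketch: the chunk record, the policy enum, and pick_chunk are hypothetical names, and splitting and coalescing are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-chunk record; real allocators embed this in the chunk. */
typedef struct chunk {
    size_t size;
    struct chunk *next;
} chunk;

typedef enum { FIRST_FIT, BEST_FIT, WORST_FIT } policy;

/* Return the chunk the given policy would pick for a request, or NULL. */
chunk *pick_chunk(chunk *free_list, size_t request, policy p) {
    assert(request > 0);
    chunk *chosen = NULL;
    for (chunk *c = free_list; c != NULL; c = c->next) {
        if (c->size < request)
            continue;                       /* too small under every policy */
        if (p == FIRST_FIT)
            return c;                       /* first adequate chunk wins */
        if (chosen == NULL ||
            (p == BEST_FIT && c->size < chosen->size) ||
            (p == WORST_FIT && c->size > chosen->size))
            chosen = c;                     /* smallest (best) or largest (worst) so far */
    }
    return chosen;
}
```

For a free list of sizes 32, 16, 64 and a 12-byte request, first-fit returns the 32-byte chunk, best-fit the 16-byte chunk, and worst-fit the 64-byte chunk. First-fit can stop early; the other two always scan the whole list.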
  9. Implementation Techniques
       Freelists
          Linked lists of objects in same size class
             Range of object sizes
          First-fit, best-fit in this context
  10. Implementation Techniques
       Segregated size classes
          Use free lists, but never coalesce or split
       Choice of size classes
          Exact
          Powers-of-two
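With power-of-two classes, each request is rounded up to the next power of two, which makes the class lookup cheap at the cost of internal fragmentation. A minimal sketch follows. The 8-byte minimum class (MIN_SHIFT) and the function names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_SHIFT 3u   /* assumed smallest class: 2^3 = 8 bytes */

/* Index of the smallest power-of-two class that holds sz bytes. */
unsigned size_class(size_t sz) {
    assert(sz > 0 && sz <= SIZE_MAX / 2);  /* keeps the shift below from overflowing */
    unsigned cls = 0;
    size_t cap = (size_t)1 << MIN_SHIFT;
    while (cap < sz) {
        cap <<= 1;
        cls++;
    }
    return cls;
}

/* Object size, in bytes, served by a given class. */
size_t class_bytes(unsigned cls) {
    return (size_t)1 << (cls + MIN_SHIFT);
}
```

A 100-byte request lands in the 128-byte class, so 28 bytes are lost to internal fragmentation. Exact size classes avoid that waste but need many more free lists.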
  11. Implementation Techniques
       Big Bag of Pages (BiBOP)
          Page or pages (multiples of 4K)
          Usually segregated size classes
          Header contains metadata
             Locate with bitmasking
          Limits external fragmentation
          Can be very fast
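"Locate with bitmasking" works because every object lives inside a page-aligned block whose header sits at the block's start, so clearing the low bits of any object address yields the header address. The sketch below assumes single 4K pages and a hypothetical page_header layout. Multi-page blocks would need a larger alignment.

```c
#include <assert.h>
#include <stdint.h>

#define BIBOP_PAGE_SIZE 4096u          /* assumed: one 4K page per block */

/* Hypothetical per-page header stored at the start of each page. */
typedef struct {
    uint32_t object_size;              /* size class of every object on this page */
} page_header;

/* Clear the low 12 bits of an object pointer to reach its page header. */
page_header *header_of(void *obj) {
    uintptr_t addr = (uintptr_t)obj;
    return (page_header *)(addr & ~(uintptr_t)(BIBOP_PAGE_SIZE - 1));
}
```

This is how free and realloc can find an object's size without a per-object header: header_of(ptr)->object_size is a mask and one load.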
  12. Runtime Analysis
       Key components
          Cost of malloc (best, worst, average)
          Cost of free
          Cost of size lookup (for realloc & free)
       Examine for first-fit, best-fit, segregated (with BiBOP)
  13. Space Bounds
       Fragmentation worst-case for “optimal”: O(log(M/m))
          M = largest object size
          m = smallest object size
       Best-fit = O(M * m) !
       Goal: perform well for typical programs
       Considerations:
          Internal fragmentation
          External fragmentation
          Headers (metadata)
  14. Performance Issues
       We’ll talk about scalability later
       Reliability, too
       But: general-purpose allocator often seen as too slow
  15. Custom Memory Allocation
       Programmers replace new/delete, bypassing system allocator
          Reduce runtime – often
          Expand functionality – sometimes
          Reduce space – rarely
       Very common practice
          Apache, gcc, lcc, STL, database servers…
          Language-level support in C++
       Widely recommended
          “Use custom allocators”
  16. Drawbacks of Custom Allocators
       Avoiding system allocator:
          More code to maintain & debug
          Can’t use memory debuggers
          Not modular or robust:
             Mix memory from custom and general-purpose allocators → crash!
       Increased burden on programmers
       Are custom allocators really a win?
  17. (I) Per-Class Allocators
       Recycle freed objects from a free list
          a = new Class1;
          b = new Class1;
          c = new Class1;
          delete a;
          delete b;
          delete c;
          a = new Class1;
          b = new Class1;
          c = new Class1;
       [Figure: Class1 free list holding a, b, c]
       + Fast
          Linked list operations
       + Simple
          Identical semantics
          C++ language support
       - Possibly space-inefficient
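The slide shows the C++ form (class-specific new/delete). The same recycling idea can be sketched in C with a per-type free list. The type class1 and the functions class1_new and class1_delete are hypothetical stand-ins, not code from the lecture.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object type standing in for the slide's Class1. */
typedef struct {
    int id;
    double payload[4];
} class1;

/* A freed object's storage is reused to hold the free-list link. */
typedef union node {
    class1 obj;
    union node *next;
} node;

static node *class1_free_list = NULL;

class1 *class1_new(void) {
    node *n = class1_free_list;
    if (n != NULL)
        class1_free_list = n->next;       /* pop a recycled object */
    else
        n = malloc(sizeof *n);            /* list empty: fall back to malloc */
    return n ? &n->obj : NULL;
}

void class1_delete(class1 *p) {
    assert(p != NULL);
    node *n = (node *)p;                  /* a union and its members share an address */
    n->next = class1_free_list;           /* push; memory is never returned to malloc */
    class1_free_list = n;
}
```

Both operations are a couple of pointer moves, which is the "fast" on the slide. The "possibly space-inefficient" drawback is visible too: memory on this list can only ever be reused for class1 objects.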
  18. (II) Custom Patterns
       Tailor-made to fit allocation patterns
          Example: 197.parser (natural language parser)
       [Figure: char[MEMORY_LIMIT] array holding a, b, c, then d, with end_of_array moving after each call]
          a = xalloc(8);
          b = xalloc(16);
          c = xalloc(8);
          xfree(b);
          xfree(c);
          d = xalloc(8);
       + Fast
          Pointer-bumping allocation
       - Brittle
          - Fixed memory size
          - Requires stack-like lifetimes
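The call sequence on this slide can be traced with a small stack-like allocator. The names xalloc, xfree, MEMORY_LIMIT, and end_of_array come from the slide. The header layout and the reclaim loop are assumptions made for this sketch (written for C11) and are not 197.parser's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define MEMORY_LIMIT 4096              /* assumed arena size */

/* Hypothetical block header placed in front of every object. */
typedef struct header {
    struct header *prev;               /* block allocated just before this one */
    int freed;                         /* set by xfree */
} header;

static _Alignas(max_align_t) unsigned char memory[MEMORY_LIMIT];
static size_t end_of_array = 0;        /* bump pointer: first unused byte */
static header *top = NULL;             /* most recently allocated block still on the stack */

static size_t round_up(size_t n) {
    size_t a = _Alignof(max_align_t);
    return (n + a - 1) / a * a;
}

void *xalloc(size_t size) {
    if (size > MEMORY_LIMIT)
        return NULL;
    size_t need = round_up(sizeof(header)) + round_up(size);
    if (need > MEMORY_LIMIT - end_of_array)
        return NULL;                   /* fixed memory size: no fallback */
    header *h = (header *)(memory + end_of_array);
    h->prev = top;
    h->freed = 0;
    top = h;
    end_of_array += need;              /* pointer bump */
    return (unsigned char *)h + round_up(sizeof(header));
}

void xfree(void *p) {
    assert(p != NULL);
    header *h = (header *)((unsigned char *)p - round_up(sizeof(header)));
    h->freed = 1;
    /* Space is reclaimed only while the topmost block is free. */
    while (top != NULL && top->freed) {
        end_of_array = (size_t)((unsigned char *)top - memory);
        top = top->prev;
    }
}
```

In the slide's sequence, xfree(b) reclaims nothing because c still sits above b. xfree(c) then pops both, and d lands where b was. A program that frees in a non-stack order keeps the bump pointer high, which is why the scheme is brittle.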
  19. (III) Regions
       Separate areas, deletion only en masse
          regioncreate(r)
          regionmalloc(r, sz)
          regiondelete(r)
       + Fast
          Pointer-bumping allocation
          Deletion of chunks
       + Convenient
          One call frees all memory
       - Risky
          - Dangling references
          - Too much space
       Increasingly popular custom allocator
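A region can be sketched as a list of chunks with a bump pointer into the newest one. The three function names follow the slide. Their signatures, the chunk size, and the refusal of oversized requests are assumptions of this C11 sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_BYTES 4096               /* assumed chunk size */

typedef struct chunk {
    struct chunk *next;
    size_t used;                       /* bytes handed out from data[] */
    _Alignas(max_align_t) unsigned char data[CHUNK_BYTES];
} chunk;

typedef struct {
    chunk *chunks;                     /* newest chunk first */
} region;

void regioncreate(region *r) {
    assert(r != NULL);
    r->chunks = NULL;
}

void *regionmalloc(region *r, size_t sz) {
    size_t a = _Alignof(max_align_t);
    if (sz == 0 || sz > CHUNK_BYTES)
        return NULL;                   /* sketch: no oversized objects */
    sz = (sz + a - 1) / a * a;         /* keep every object aligned */
    if (r->chunks == NULL || CHUNK_BYTES - r->chunks->used < sz) {
        chunk *c = malloc(sizeof *c);  /* start a fresh chunk */
        if (c == NULL)
            return NULL;
        c->next = r->chunks;
        c->used = 0;
        r->chunks = c;
    }
    void *p = r->chunks->data + r->chunks->used;
    r->chunks->used += sz;             /* pointer bump */
    return p;
}

void regiondelete(region *r) {
    while (r->chunks != NULL) {        /* one call frees every object */
        chunk *next = r->chunks->next;
        free(r->chunks);
        r->chunks = next;
    }
}
```

There is no per-object free, which explains both drawbacks on the slide: any pointer kept past regiondelete dangles, and a long-lived region holds on to dead objects until the whole region goes away.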
  20. Custom Allocators Are Faster…
       [Figure: Runtime - Custom Allocator Benchmarks. Normalized runtime of Custom vs. Win32, grouped into non-regions and regions; benchmarks: 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle]
       As good as and sometimes much faster than Win32
  21. Not So Fast…
       [Figure: Runtime - Custom Allocator Benchmarks. Normalized runtime of Custom vs. Win32 vs. DLmalloc, grouped into non-regions and regions; benchmarks: 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle]
       DLmalloc: as fast or faster for most benchmarks
  22. The Lea Allocator (DLmalloc 2.7.0)
       Mature public-domain general-purpose allocator
       Optimized for common allocation patterns
          Per-size quicklists ≈ per-class allocation
       Deferred coalescing (combining adjacent free objects)
       Highly-optimized fastpath
       Space-efficient
  23. Space Consumption: Mixed Results
       [Figure: Space - Custom Allocator Benchmarks. Normalized space of Custom vs. DLmalloc, grouped into non-regions and regions; benchmarks: 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle]
  24. Custom Allocators?
       Generally not worth the trouble: use good general-purpose allocator
          Avoids risky software engineering errors
  25. Problems with Unsafe Languages
       C, C++: pervasive applications, but the languages are memory unsafe
       Numerous opportunities for security vulnerabilities, errors
          Double free
          Invalid free
          Uninitialized reads
          Dangling pointers
          Buffer overflows (stack & heap)
  26. Soundness for “Erroneous” Programs
       Normally: memory errors ⇒ ? …
       Consider infinite-heap allocator:
          All news fresh; ignore delete
             No dangling pointers, invalid frees, double frees
          Every object infinitely large
             No buffer overflows, data overwrites
       Transparent to correct program
       “Erroneous” programs sound
  27. Probabilistic Memory Safety
       Approximate infinite heap with M-heaps (e.g., M=2)
       DieHard: fully-randomized M-heap
          Increases odds of benign errors
          Probabilistic memory safety
             i.e., P(no error) ≥ n
          Errors independent across heaps
             E(users with no error) ≥ n * |users|
       Efficient implementation…?
  28. Implementation Choices
       Conventional, freelist-based heaps
          Hard to randomize, protect from errors
             Double frees, heap corruption
       What about bitmaps? [Wilson90]
          – Catastrophic fragmentation
             Each small object likely to occupy one page
       [Figure: small objects scattered one per page]
  29. Randomized Heap Layout
       [Figure: metadata bitmaps 00000001, 1010, 10 above heap regions for object sizes 2^(i+3), 2^(i+4), 2^(i+5)]
       Bitmap-based, segregated size classes
          Bit represents one object of given size
             i.e., one bit = 2^(i+3) bytes, etc.
       Prevents fragmentation
  30. Randomized Allocation
       [Figure: metadata bitmaps 00000001, 1010, 10 above heap regions for object sizes 2^(i+3), 2^(i+4), 2^(i+5)]
       malloc(8):
          compute size class = ceil(log2 sz) – 3
          randomly probe bitmap for zero-bit (free)
       Fast: runtime O(1)
          M=2 – E[# of probes] ≤ 2
  31. Randomized Allocation
       [Figure: same layout after the allocation; the first bitmap is now 00010001]
       malloc(8):
          compute size class = ceil(log2 sz) – 3
          randomly probe bitmap for zero-bit (free)
       Fast: runtime O(1)
          M=2 – E[# of probes] ≤ 2
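The probing step can be sketched for a single size class as follows. This is an illustration under stated assumptions and not DieHard's implementation: the slot count, the bool array standing in for the bitmap, and the use of rand() are all simplifications. The region refuses allocations once it is 1/M full, so each probe hits a free slot with probability above 1 − 1/M, which is where the bound of 2 expected probes for M = 2 comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS 64                       /* assumed objects per size class */
#define M 2                            /* region is kept at most 1/M full */

typedef struct {
    bool in_use[SLOTS];                /* one flag per object; stands in for the bitmap */
    size_t live;                       /* number of flags currently set */
} size_class_region;

/* Returns a random free slot index, or -1 once the region is 1/M full. */
int random_alloc(size_class_region *r) {
    assert(r != NULL);
    if (r->live >= SLOTS / M)
        return -1;
    for (;;) {
        int i = rand() % SLOTS;        /* rand() stands in for a better RNG */
        if (!r->in_use[i]) {           /* zero bit: slot is free */
            r->in_use[i] = true;
            r->live++;
            return i;
        }
    }
}
```

The returned index times the class's object size gives the object's offset within that class's part of the heap.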
  32. Randomized Deallocation
       [Figure: metadata bitmaps 00010001, 1010, 10 above heap regions for object sizes 2^(i+3), 2^(i+4), 2^(i+5)]
       free(ptr):
          Ensure object valid – aligned to right address
          Ensure allocated – bit set
          Resets bit
       Prevents invalid frees, double frees
  34. Randomized Deallocation
       [Figure: same layout after the free; the first bitmap is back to 00000001]
       free(ptr):
          Ensure object valid – aligned to right address
          Ensure allocated – bit set
          Resets bit
       Prevents invalid frees, double frees
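The three checks in free(ptr) can be sketched for one size class like this. The 8-byte object size, the slot count, the static heap array, and the name checked_free are assumptions for illustration. A real allocator would first work out which size class the pointer belongs to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OBJ_SIZE 8                     /* assumed size class: 8-byte objects */
#define SLOTS 16

static unsigned char heap[SLOTS * OBJ_SIZE];
static bool allocated[SLOTS];          /* stands in for the allocation bitmap */

/* Returns true if ptr was a live object and has now been freed. */
bool checked_free(void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)heap;
    if (p < base || p >= base + sizeof heap)
        return false;                  /* outside this region: invalid free */
    size_t off = (size_t)(p - base);
    if (off % OBJ_SIZE != 0)
        return false;                  /* not at an object boundary: invalid free */
    if (!allocated[off / OBJ_SIZE])
        return false;                  /* bit already clear: double free */
    allocated[off / OBJ_SIZE] = false; /* reset the bit */
    return true;
}
```

Because the metadata lives in a separate bitmap and is only consulted, never trusted from inside the object, a bad free is simply ignored. A freelist allocator that writes links into freed objects has no such protection.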
  35. Randomized Heaps & Reliability
       [Figure: two randomized layouts of the same objects, in regions for object sizes 2^(i+3), 2^(i+4), …]
          My Mozilla: “malignant” overflow
          Your Mozilla: “benign” overflow
       Objects randomly spread across heap
       Different run = different heap
          Errors across heaps independent
  36. DieHard software architecture
       [Figure: input is broadcast to replica1 (seed1), replica2 (seed2), replica3 (seed3); the replicas execute as separate processes and a vote on their results produces the output]
       Replication-based fault-tolerance
          Requires randomization: errors independent
  37. DieHard Results
       Analytical results (pictures!)
          Buffer overflows
          Uninitialized reads
          Dangling pointer errors (the best)
       Empirical results
          Runtime overhead
          Error avoidance
             Injected faults & actual applications
  38. Analytical Results: Buffer Overflows
       Model overflow as write of live data
          Heap half full (max occupancy)
  40. Analytical Results: Buffer Overflows
       Model overflow: random write of live data
          Heap half full (max occupancy)
  41. Analytical Results: Buffer Overflows
       Replicas: Increase odds of avoiding overflow in at least one replica
       [Figure: replicas]
  43. Analytical Results: Buffer Overflows
       Replicas: Increase odds of avoiding overflow in at least one replica
       [Figure: replicas]
          P(Overflow in all replicas) = (½)^3 = 1/8
          P(No overflow in ≥ 1 replica) = 1 − (½)^3 = 7/8
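The arithmetic on this slide generalizes directly, given the independence across replicas that randomization is meant to provide. A minimal sketch, where the function name and parameters are hypothetical: with per-replica overflow probability p and k replicas, the chance that at least one replica escapes is 1 − p^k.

```c
#include <assert.h>

/* Probability that at least one of k independent replicas avoids the
 * overflow, given that each one is hit with probability p_overflow. */
double p_some_replica_ok(double p_overflow, int k) {
    assert(p_overflow >= 0.0 && p_overflow <= 1.0 && k >= 1);
    double p_all = 1.0;
    for (int i = 0; i < k; i++)
        p_all *= p_overflow;           /* all k replicas hit: p^k */
    return 1.0 - p_all;
}
```

With p_overflow = 0.5 (heap half full) and k = 3, this reproduces the slide's 7/8.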
  44. Analytical Results: Buffer Overflows
       F = free space
       H = heap size
       N = # objects worth of overflow
       k = replicas
       Overflow one object
  45. Empirical Results: Runtime
  46. Empirical Results: Error Avoidance
       Injected faults:
          Dangling pointers (@50%, 10 allocations)
             glibc: crashes; DieHard: 9/10 correct
          Overflows (@1%, 4 bytes over)
             glibc: crashes 9/10, inf loop; DieHard: 10/10 correct
       Real faults:
          Avoids Squid web cache overflow
             Crashes BDW & glibc
          Avoids dangling pointer error in Mozilla
             DoS in glibc & Windows
  47. The End