Dynamic memory allocation and fragmentation

  • This is the general concept of what a memory allocator does, or at least what it is supposed to do. The allocator chooses where in memory to place each block. It has access to all the machine's memory and can request more memory from the operating system if necessary.
  • All basic memory allocators have a three-level design which helps them do their job efficiently: the mechanism implements a policy, which is motivated by a strategy. ("An allocator algorithm should be regarded as the mechanism that implements a placement policy, which is motivated by a strategy for minimizing fragmentation.")
  • Policies mainly use two techniques to satisfy incoming requests: splitting large blocks and coalescing (merging) small ones.
  • Fragmentation is the inability to reuse memory that is free. Most programs need memory that can grow ("dynamic memory") during execution. Internal fragmentation is often accepted to prevent external fragmentation, which is considered the more serious problem.
  • The most common causes of fragmentation are isolated deaths and time-varying program behavior. An allocator that can predict the deaths of objects can exploit that information to reduce fragmentation. A program could, for example, free small blocks and request large ones instead. If possible, the allocator should try to exploit such patterns, or at least not let them undermine its strategy.
  • The inability to reuse memory depends not only on the number and sizes of holes, but on the future behavior of the program and the future responses of the allocator itself. Unfortunately, there is no way to predict general program behavior. If there are 100 free blocks of size 10 and 200 of size 20, is the memory fragmented? If all future requests are of size 10, no; if they are all of size 30, that is a problem. Even a request stream for exactly 100 blocks of size 10 and 200 of size 20 depends on the order in which the requests arrive (best fit would cope). Everything depends on the moment-by-moment decisions of where to place blocks, so good placement policies are important.
  • Real programs do not generally behave randomly; that is why no single allocator policy works for all cases. The most common behavioral patterns are: ramps, data that grows slowly over the whole execution of a program; peaks, faster-growing volumes of objects that are discarded before the end of execution; and plateaus, data structures that are built quickly and used for long periods of time.
  • This shows memory usage for the GNU C compiler compiling the largest file of its own source code (combine.c). We can clearly see the peak behavior.
  • This shows memory usage for the Grobner program. The program does a lot of data collection and rewriting, resulting in a typical ramp pattern (including some plateaus). A ramp or plateau profile has a very convenient property: if short-term fragmentation can be avoided, long-term fragmentation is not a problem, because nothing is freed until the end of the program, so reuse is not an issue.
  • This shows memory usage for a run of Espresso, an optimizer for programmable logic array designs. It shows that typical program behavior is hard (if not impossible) to predict.
  • The most commonly used mechanisms are sequential fits, essentially just a single linear list or array of all free blocks. Boundary tags are stored in a special footer added to each block; it holds the block size and various flags, as well as pointers to neighboring blocks, which allows easy traversal. Boundary tags and a doubly linked list together make coalescing simple and very fast. Sequential fits do not scale well in terms of time costs: as the number of free blocks grows, the time to search the list may become excessively long.
  • General policy rules for sequential fits: decide how the list should be ordered, and define a splitting threshold.
  • Another method is segregated free lists, which add another dimension to sequential fits, making searching faster. The lists are indexed by size classes, which group blocks of similar size together.
  • Simple segregated lists allocate only objects of a single size, or a small range of sizes. This makes allocation fast, but may lead to high external fragmentation, as unused parts of blocks cannot be reused for other object sizes. With segregated fits we gain the ability to split larger blocks when smaller ones are requested. Exact lists: a separate list for each possible block size. Strict size classes with rounding: uses size classes (for example, powers of two); requested sizes are rounded up to match one of the sizes in the series, and when splitting, the resulting block sizes must be present in the lists. Size classes with range lists: works like the previous scheme, except that each list may contain blocks of slightly different sizes.
  • One implementation of segregated fits using size classes with rounding is the buddy system. It supports limited but efficient splitting and coalescing. When a large block is split in two, each part becomes the unique buddy of the other; a split block can only be merged with its unique buddy.
  • Each level in the tree is a separate free list for a single size class. When an allocated node is freed, its buddy's status (free or allocated) is checked; if and only if the buddy is also free can the two be merged into a larger block (and so on up the tree).
  • Binary buddies are the simplest implementation of the buddy system. All buddy sizes are powers of two and the buddies in each pair are of equal size. Internal fragmentation is high, because every object size must be rounded up to the nearest class and the size gap between classes is fairly large. ("Equal block sizes make address computation simple, because all buddies are aligned on a power-of-two boundary. Systems based on closer size classes may be similarly efficient if lookup tables are used to perform the size-class mappings.")
  • Fibonacci buddies base all size classes on the Fibonacci series to try to reduce internal fragmentation. Numbers in the Fibonacci series have the property that they can only be split into two uneven parts, which can be a disadvantage when allocating many equal-sized blocks.
  • This is an example of how blocks in the Fibonacci buddy system are split.
  • Another variant of the buddy system is weighted buddies. Size classes are the powers of two, and between each pair is a size three times a power of two. This gives more closely spaced size classes, reducing internal fragmentation even further. Weighted buddies have two different splitting rules, depending on whether a size is a power of two or three times a power of two.
  • Sizes of the form 2^x can only be split in half; sizes of the form 2^x·3 can be split in half or unevenly into two parts, leaving one block at 1/3 of the original size and the other at 2/3.
  • The last variation of the buddy system we discuss is the double buddy system. Unlike the other implementations, it uses two distinct size series with different size classes: one list uses power-of-two sizes, the other uses power-of-two spacing offset by some number x.
  • In this example we offset the second list by 3, giving the same block sizes as in the weighted buddy system. The difference lies in the splitting and merging rules: free space is not shared between the two lists, so when splitting a block of size 6 we cannot… Requested sizes are rounded up to the nearest size class in either series. This reduces internal fragmentation by about 50%.
  • Now we would like to discuss some common enhancements that can be used with all basic allocator mechanisms. One of them is deferred coalescing, or delayed merging. It stores recently freed blocks of predefined sizes in an array of quick lists and does not merge them as soon as they are released. This saves time when allocating a large number of similar-sized blocks. Deferred coalescing schemes vary in several ways: how often they coalesce items from the quick lists, which items are coalesced, in what order items are chosen for coalescing, and the order in which items are allocated from the quick lists (LIFO, FIFO).
  • Deferred reuse is a method to optimize memory allocation and reduce fragmentation. It gives the neighbors of a freed block more time to die by delaying the freed block's reuse, thus increasing the possibility that they can be merged.
  • 1. Newly allocated objects will be placed in holes left by old objects that have died, mixing objects created by different phases (which may die at different times). 2. A long run of small requests can prevent later large requests from being satisfied. 3. Scattered holes in the heap are "most of the time" not a problem, as long as they are filled before a peak is reached.
  • Deferred coalescing may have significant effects on fragmentation: by changing the allocator's decisions as to which blocks of memory hold which objects, used memory can become scattered. The block remaining after a split, being of a different size, is less likely to be useful if the program allocates many objects of the same size. In the double buddy system, the added list doubles the number of available size classes and reduces the size gap between them.

    1. Dynamic memory allocation and fragmentation
       Seminar on Network and Operating Systems, Group II
    2. Schedule
       • Today (Monday):
         • General memory allocation mechanisms
         • The Buddy System
       • Thursday:
         • General Object Caching
         • Slabs
    3. What is an allocator and what must it do?
    4. Memory Allocator
       • Keeps track of memory in use and free memory
       • Must be fast and waste little memory
       • Services memory requests it receives
       • Prevents the forming of memory "holes"
       • "For any possible allocation algorithm, there will always be a program behavior that forces it into severe fragmentation."
    5. The three levels of an allocator
       • Strategies
         • Try to find regularities in incoming memory requests
       • Policies
         • Decide where and how to place blocks in memory (selected by the strategy)
       • Mechanisms
         • The algorithms that implement the policy
    6. Policy techniques
       • Use splitting and coalescing to satisfy incoming requests
         • Split large blocks for small requests
         • Coalesce small blocks for larger requests
    7. Fragmentation, why is it a problem?
    8. Fragmentation
       • Fragmentation is the inability to reuse memory that is free
       • External fragmentation occurs when enough free memory is available but isn't contiguous
         • Many small holes
       • Internal fragmentation arises when a large enough block is allocated but it is bigger than needed
         • Blocks are usually split to prevent internal fragmentation
    9. What causes fragmentation?
       • Isolated deaths
         • When adjacent objects do not die at the same time
       • Time-varying program behavior
         • Memory requests change unexpectedly
    10. Why traditional approaches don't work
       • Program behavior is not predictable in general
       • The ability to reuse memory depends on the future interaction between the program and the allocator
         • 100 blocks of size 10 and 200 of size 20?
    11. How do we avoid fragmentation?
       "A single death is a tragedy. A million deaths is a statistic." -Joseph Stalin
    12. Understanding program behavior
       • Common behavioral patterns
         • Ramps
           • Data structures that are accumulated over time
         • Peaks
           • Memory used in bursty patterns, usually while building up temporary data structures
         • Plateaus
           • Data structures built quickly and used for long periods of time
    13. Memory usage in the GNU C Compiler [figure: KBytes in use over allocation time in megabytes]
    14. Memory usage in the Grobner program [figure: KBytes in use over allocation time in megabytes]
    15. Memory usage in Espresso PLA Optimizer [figure: KBytes in use over allocation time in megabytes]
    16. Mechanisms
       • Most common mechanisms used
         • Sequential fits
         • Segregated free lists
           • Buddy system
         • Bitmap fits
         • Index fits
    17. Sequential fits
       • Based on a single linear list
         • Stores all free memory blocks
         • Usually circularly or doubly linked
         • Most use the boundary tag technique
       • Most common mechanisms use this method
    18. Sequential fits
       • Best fit, first fit, worst fit
       • Next fit
         • Uses a roving pointer for allocation
       • Optimal fit
         • "Samples" the list first to find a good enough fit
       • Half fit
         • Splits blocks of twice the requested size
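The placement policies on slide 18 differ only in which free block they pick from the list. A minimal sketch of first fit, the simplest of them; an array of free-block sizes stands in for the linked free list, and all names are illustrative:

```c
#include <stddef.h>

/* First fit: scan the free list front to back and take the first
   block large enough for the request. */
int first_fit_index(const size_t *free_sizes, int n, size_t want)
{
    for (int i = 0; i < n; i++)
        if (free_sizes[i] >= want)
            return i;      /* block i would be (split and) allocated */
    return -1;             /* no block fits: ask the OS for more memory */
}
```

Best fit would instead scan the whole list and keep the smallest block that still fits; next fit would resume scanning from where the previous search stopped.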
    19. Segregated free lists
       • Use arrays of lists which hold free blocks of a particular size
       • Use size classes for indexing purposes
         • Usually sizes that are powers of two
       • Requested sizes are rounded up to the nearest available size
       Size classes: 2 4 8 16 32 64 128
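The rounding step on this slide amounts to mapping each request onto the next class in the 2, 4, 8, 16, … series. A sketch (the function name is illustrative):

```c
#include <stddef.h>

/* Round a request up to the nearest power-of-two size class,
   as in the 2, 4, 8, 16, ... series on the slide. */
size_t size_class(size_t request)
{
    size_t c = 2;          /* smallest class in the series */
    while (c < request)
        c <<= 1;           /* next power of two */
    return c;
}
```

The gap between the returned class and the requested size is exactly the internal fragmentation this scheme accepts.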
    20. Segregated free lists
       • Simple segregated list
         • No splitting of free blocks
         • Subject to severe external fragmentation
       • Segregated fit
         • Splits larger blocks if there is no free block in the appropriate free list
         • Uses first fit or next fit to find a free block
         • Three types: exact lists, strict size classes with rounding, or size classes with range lists
    21. Buddy system
       • A special case of segregated fit
         • Supports limited splitting and coalescing
         • Separate free list for each allowable size
         • Simple block address computation
       • A free block can only be merged with its unique buddy
         • Only whole, entirely free blocks can be merged
    22. Buddy system [figure: a free 16 MB block; a 3 MB request arrives]
    23. Buddy system [figure: the 16 MB block is split into two 8 MB buddies]
    24. Buddy system [figure: an 8 MB buddy is split into two free 4 MB buddies]
    25. Buddy system [figure: one 4 MB buddy is allocated to satisfy the 3 MB request]
    26. Binary buddies
       • Simplest implementation
         • All buddy sizes are powers of two
         • Each block is divided into two equal parts
       • Internal fragmentation very high
         • Expected 28%, in practice usually higher
       • (Demonstration applet)
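The "simple block address computation" promised on slide 21 follows from the power-of-two alignment: every block of size 2^k starts at a multiple of 2^k, so flipping the size bit of a block's offset yields its unique buddy. A sketch with an illustrative name:

```c
#include <stddef.h>

/* Offset (from the heap base) of the unique buddy of a
   power-of-two-sized block at the given offset. XOR with the
   size toggles between the left and right buddy of a pair. */
size_t buddy_offset(size_t offset, size_t size)
{
    return offset ^ size;
}
```

At free time the allocator computes this offset, checks whether the block there is also free and of the same size, and if so merges the pair, repeating one level up.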
    27. Fibonacci buddies
       • Size classes based on the Fibonacci series
         • More closely spaced set of size classes
         • Reduces internal fragmentation
         • Blocks can only be split into sizes that are also in the series
       • Uneven block sizes a disadvantage?
         • When allocating many equal-sized blocks
    28. Fibonacci buddies
       Size series: 2 3 5 8 13 21 34 55 …
       Splitting blocks: 13 → 5 + 8, 21 → 8 + 13
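The split rule above can be sketched as a table lookup: a block whose size is Fibonacci number F(i) splits into buddies of sizes F(i-1) and F(i-2). The array and function name are illustrative:

```c
#include <stddef.h>

/* The slide's size series: each entry is the sum of the two before it. */
static const size_t fib_sizes[] = { 2, 3, 5, 8, 13, 21, 34, 55, 89 };
enum { NFIB = sizeof fib_sizes / sizeof fib_sizes[0] };

/* Larger part of the (unique) uneven split of a Fibonacci-sized
   block; the smaller part is size minus the result. Returns 0 if
   the size is not in the series or is too small to split. */
size_t fib_split_larger(size_t size)
{
    for (int i = 2; i < NFIB; i++)
        if (fib_sizes[i] == size)
            return fib_sizes[i - 1];
    return 0;
}
```

Because the two halves are always unequal, a program that wants many blocks of one size keeps producing leftover pieces of a different size, which is the disadvantage slide 27 raises.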
    29. Weighted buddies
       • Size classes are powers of two
         • Between each pair is a size three times a power of two
       • Two different splitting methods
         • 2^x sizes can be split in half
         • 2^x·3 sizes can be split in half or unevenly into two sizes
    30. Weighted buddies
       Size series: 2 (2^1), 3 (2^0·3), 4 (2^2), 6 (2^1·3), 8 (2^3), 12 (2^2·3), 16 (2^4), 24 (2^3·3), …
       Splitting of 2^x·3 sizes: 6 → 3 + 3, or 6 → 2 + 4
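The two splitting rules can be sketched as a validity check over a proposed split. This follows the rules as stated on slide 29; the helper and function names are illustrative:

```c
#include <stddef.h>

static int is_pow2(size_t n)  { return n != 0 && (n & (n - 1)) == 0; }
static int is_3pow2(size_t n) { return n % 3 == 0 && is_pow2(n / 3); }

/* Is (a, b) a legal weighted-buddy split of a block of this size?
   2^x blocks split only in half; 2^x*3 blocks split in half or
   unevenly into 1/3 + 2/3 of the original size. */
int weighted_split_ok(size_t size, size_t a, size_t b)
{
    if (a + b != size)
        return 0;
    if (is_pow2(size))
        return a == b;
    if (is_3pow2(size))
        return a == b || a == size / 3 || b == size / 3;
    return 0;
}
```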
    31. Double buddies
       • Use 2 different binary buddy series
         • One list uses power-of-two sizes
         • The other uses power-of-two spacing, offset by x
       • Splitting rules
         • Blocks can only be split in half
         • Split blocks stay in the same series
    32. Double buddies
       Size series: 2 4 8 16 32 64 128 … (2^1, 2^2, 2^3, 2^4, 2^5, 2^6, 2^7, …)
       Offset series: 3 6 12 24 48 96 192 … (3·2^0, 3·2^1, 3·2^2, 3·2^3, 3·2^4, 3·2^5, 3·2^6, …)
       Splitting within a series: 6 → 3 + 3
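Rounding a request to the nearest class in either series can be sketched by walking the two series in step and returning whichever class first covers the request (the function name is illustrative):

```c
#include <stddef.h>

/* Round a request up to the nearest size class in either
   double-buddy series: powers of two (2, 4, 8, ...) or three
   times a power of two (3, 6, 12, ...). */
size_t double_buddy_class(size_t request)
{
    size_t pow2 = 2, pow2x3 = 3;
    for (;;) {
        size_t c = pow2 < pow2x3 ? pow2 : pow2x3;   /* next class up */
        if (c >= request)
            return c;
        if (pow2 < pow2x3) pow2 <<= 1; else pow2x3 <<= 1;
    }
}
```

Interleaving the two series roughly halves the gap between adjacent classes compared to binary buddies, which is where the "about 50%" reduction in internal fragmentation comes from.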
    33. Deferred coalescing
       • Blocks are not merged as soon as they are freed
       • Uses quick lists or subpools
         • Arrays of free lists, one for each size class that is to be deferred
         • Blocks larger than those defined to be deferred are returned to the general allocator
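A toy version of the quick-list idea, with one small LIFO cache per deferred size class. The cached sizes, depths, and all names here are assumptions made for the sketch, not part of the slides:

```c
#include <stddef.h>

#define QL_CLASSES 4          /* cache sizes 8, 16, 24, 32 (assumed) */
#define QL_DEPTH   8          /* at most 8 deferred blocks per class */

static void *ql_cache[QL_CLASSES][QL_DEPTH];
static int   ql_count[QL_CLASSES];

/* Map a size to its quick-list index, or -1 if it is not deferred
   (such blocks go straight to the general allocator). */
static int ql_class(size_t size)
{
    if (size >= 8 && size <= 32 && size % 8 == 0)
        return (int)(size / 8) - 1;
    return -1;
}

/* Defer a freed block instead of coalescing it. Returns 1 if the
   block was cached, 0 if the caller must use the general allocator. */
int ql_free(void *p, size_t size)
{
    int c = ql_class(size);
    if (c < 0 || ql_count[c] == QL_DEPTH)
        return 0;
    ql_cache[c][ql_count[c]++] = p;
    return 1;
}

/* Satisfy a request from the quick lists; NULL means a cache miss. */
void *ql_alloc(size_t size)
{
    int c = ql_class(size);
    if (c < 0 || ql_count[c] == 0)
        return NULL;
    return ql_cache[c][--ql_count[c]];   /* LIFO reuse order */
}
```

A real implementation must also decide when to drain the caches back to the general allocator for coalescing; the policy dimensions listed in the notes (how often, which items, in what order) are exactly those draining choices.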
    34. Deferred reuse
       • Recently freed blocks are not immediately reused
         • Older free blocks are used instead of newly freed ones
       • Compacts long-lived memory blocks
         • Can cause increased fragmentation if only short-lived blocks are requested
    35. Discussion
    36. Questions?
       • Why can deferred reuse cause increased fragmentation if only short-lived blocks are requested?
       • How can the order in which requests arrive affect memory fragmentation?
       • Why is fragmentation at peaks more important than at intervening points?
    37. Questions?
       • When would deferred coalescing be likely to cause more fragmentation?
       • What is a possible disadvantage when splitting blocks using the Fibonacci buddy system?
       • In the double buddy system, why does the added size-class list reduce internal fragmentation by about 50%?
