1. Memory Management <ul><li>Jordan University of Science & Technology </li></ul><ul><li>CPE 746 Embedded Real-Time Systems </li></ul><ul><li>Prepared By: Salam Al-Mandil & Hala Obaidat </li></ul><ul><li>Supervised By: Dr. Lo’ai Tawalbeh </li></ul>
  2. Outline <ul><li>Introduction </li></ul><ul><li>Common Memory Types </li></ul><ul><li>Composing Memory </li></ul><ul><li>Memory Hierarchy </li></ul><ul><li>Caches </li></ul><ul><li>Application Memory Management </li></ul><ul><li>Static memory management </li></ul><ul><li>Dynamic memory management </li></ul><ul><li>Memory Allocation </li></ul><ul><li>The problem of fragmentation </li></ul><ul><li>Memory Protection </li></ul><ul><li>Recycling techniques </li></ul>
  3. Introduction <ul><li>An embedded system is a special-purpose computer system designed to perform one or a few dedicated functions, sometimes with real-time computing constraints. </li></ul><ul><li>An embedded system is part of a larger system. </li></ul><ul><li>Embedded systems often have small memory and are required to run for a long time, so memory management is a major concern when developing real-time applications. </li></ul>
  4. Common Memory Types
5. RAM <ul><li>DRAM: </li></ul><ul><li>Volatile memory. </li></ul><ul><li>Address lines are multiplexed: the first half of the address is sent first and latched by RAS; the second half is sent later and latched by CAS. </li></ul><ul><li>A capacitor and a single transistor per bit => higher capacity. </li></ul><ul><li>Requires periodic refreshing every 10-100 ms => dynamic. </li></ul><ul><li>Cheaper per bit => lower cost. </li></ul><ul><li>Slower => used for main memory. </li></ul>
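The address multiplexing above can be sketched in a few lines. The chip geometry (a 16 x 8 chip with 4 rows x 4 columns of supercells, as in the next two slides) and the function name are illustrative assumptions:

```python
def split_dram_address(addr, col_bits=2):
    """Split a flat supercell address into the row sent during the RAS
    phase and the column sent during the CAS phase.
    col_bits=2 matches a hypothetical 16 x 8 chip (4 rows x 4 cols)."""
    row = addr >> col_bits               # high-order half, sent first (RAS)
    col = addr & ((1 << col_bits) - 1)   # low-order half, sent second (CAS)
    return row, col

# Supercell (2,1) on a 4x4 grid sits at flat address 2*4 + 1 = 9.
print(split_dram_address(9))   # -> (2, 1)
```

Multiplexing halves the number of address pins at the cost of a two-step access, which is part of why DRAM is slower than SRAM.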
6. Reading DRAM Super Cell (2,1) <ul><li>Step 1(a): The row access strobe ( RAS ) selects row 2. </li></ul><ul><li>Step 1(b): Row 2 is copied from the DRAM array to the internal row buffer. </li></ul>[Figure: a 16 x 8 DRAM chip (4 rows x 4 columns of supercells); the memory controller drives RAS = 2 on the 2-bit addr lines.]
  7. Reading DRAM Super Cell (2,1) <ul><li>Step 2(a): The column access strobe ( CAS ) selects column 1. </li></ul><ul><li>Step 2(b): Supercell (2,1) is copied from the internal row buffer to the 8-bit data lines, and eventually back to the CPU. </li></ul>[Figure: the same 16 x 8 DRAM chip; the memory controller drives CAS = 1 and supercell (2,1) is returned to the CPU.]
8. RAM <ul><li>SRAM: </li></ul><ul><li>Volatile memory. </li></ul><ul><li>Six transistors per bit => lower capacity. </li></ul><ul><li>No refreshing required => faster & lower power consumption. </li></ul><ul><li>More expensive per bit => higher cost. </li></ul><ul><li>Faster => used in caches. </li></ul>
  9. Some Memory Types <ul><li>ROM: </li></ul><ul><li>Non-volatile memory. </li></ul><ul><li>Can be read from but not written to, by a processor in an embedded system. </li></ul><ul><li>Traditionally written to, “programmed”, before being inserted into the embedded system. </li></ul><ul><li>Stores constant data needed by the system. </li></ul><ul><li>Horizontal lines = words, vertical lines = data. </li></ul><ul><li>Some embedded systems work without RAM, exclusively on ROM, because their programs and data are rarely changed. </li></ul>
  10. Some Memory Types <ul><li>Flash Memory: </li></ul><ul><li>Non-volatile memory. </li></ul><ul><li>Can be electrically erased & reprogrammed. </li></ul><ul><li>Used in memory cards and USB flash drives. </li></ul><ul><li>It is erased and programmed in large blocks at once, rather than one word at a time. </li></ul><ul><li>Examples of applications include PDAs and laptop computers, digital audio players, digital cameras and mobile phones. </li></ul>
11. Memory-type comparison:

| Type | Volatile? | Writeable? | Erase Size | Max Erase Cycles | Cost (per Byte) | Speed |
|---|---|---|---|---|---|---|
| SRAM | Yes | Yes | Byte | Unlimited | Expensive | Fast |
| DRAM | Yes | Yes | Byte | Unlimited | Moderate | Moderate |
| Masked ROM | No | No | n/a | n/a | Inexpensive | Fast |
| PROM | No | Once, with a device programmer | n/a | n/a | Moderate | Fast |
| EPROM | No | Yes, with a device programmer | Entire chip | Limited (consult datasheet) | Moderate | Fast |
| EEPROM | No | Yes | Byte | Limited (consult datasheet) | Expensive | Fast to read, slow to erase/write |
| Flash | No | Yes | Sector | Limited (consult datasheet) | Moderate | Fast to read, slow to erase/write |
| NVRAM | No | Yes | Byte | Unlimited | Expensive (SRAM + battery) | Fast |
12. Composing Memory <ul><li>When the available memory is larger than needed, simply ignore the unneeded high-order address bits and higher data lines. </li></ul><ul><li>When the available memory is smaller, compose several smaller memories into one larger memory: </li></ul><ul><li>Connect side-by-side. </li></ul><ul><li>Connect top to bottom. </li></ul><ul><li>Combine techniques. </li></ul>
13. Connect side-by-side <ul><li>To increase the width of words: several 2^m x n ROMs share the same address and enable lines, and their data outputs are concatenated to form one 2^m x 3n ROM (outputs Q0 … Q3n-1). </li></ul>[Figure: three 2^m x n ROMs side by side, composing a 2^m x 3n ROM.]
  14. Connect top to bottom <ul><li>To increase the number of words: two 2^m x n ROMs form one 2^(m+1) x n ROM; the extra high-order address bit drives a 1 x 2 decoder that enables one of the two chips, while the remaining address bits go to both. </li></ul>[Figure: two stacked 2^m x n ROMs with a 1 x 2 decoder, composing a 2^(m+1) x n ROM.]
  15. Combine techniques <ul><ul><li>To increase both the number and the width of words: decode high-order address bits to select a row of chips, and concatenate the outputs of the chips within that row. </li></ul></ul>[Figure: a grid of memory chips with shared address and enable lines and concatenated outputs.]
16. Memory Hierarchy <ul><li>Is an approach for organizing memory and storage systems. </li></ul><ul><li>A memory hierarchy is organized into several levels – each smaller, faster, & more expensive per byte than the next lower level. </li></ul><ul><li>For each k, the faster, smaller device at level k serves as a cache for the larger, slower device at level k+1. </li></ul><ul><li>Programs tend to access the data at level k more often than they access the data at level k+1. </li></ul>
17. An Example Memory Hierarchy (smaller, faster, and costlier per byte toward the top; larger, slower, and cheaper per byte toward the bottom):
L0: registers – hold words retrieved from the L1 cache.
L1: on-chip L1 cache (SRAM) – holds cache lines retrieved from the L2 cache.
L2: off-chip L2 cache (SRAM) – holds cache lines retrieved from main memory.
L3: main memory (DRAM) – holds disk blocks retrieved from local disks.
L4: local secondary storage (local disks) – holds files retrieved from disks on remote network servers.
L5: remote secondary storage (distributed file systems, Web servers).
18. Caches <ul><li>Cache: The first level(s) of the memory hierarchy encountered once the address leaves the CPU. </li></ul><ul><li>The term is generally used whenever buffering is employed to reuse commonly occurring items, such as webpage caches, file caches, & name caches. </li></ul>
19. Caching in a Memory Hierarchy <ul><li>Level k+1: the larger, slower, cheaper storage device at level k+1 is partitioned into blocks (numbered 0-15 in the figure). </li></ul><ul><li>Level k: the smaller, faster, more expensive device at level k caches a subset of those blocks (e.g., blocks 3, 8, 9, and 14). </li></ul><ul><li>Data is copied between levels in block-sized transfer units. </li></ul>
  20. General Caching Concepts <ul><li>Program needs object d, which is stored in some block b. </li></ul><ul><li>Cache hit </li></ul><ul><ul><li>Program finds b in the cache at level k. E.g., block 14. </li></ul></ul><ul><li>Cache miss </li></ul><ul><ul><li>b is not at level k, so the level k cache must fetch it from level k+1. E.g., block 12. </li></ul></ul><ul><ul><li>If the level k cache is full, then some current block must be replaced (evicted). Which one is the “victim”? We’ll see later. </li></ul></ul>
21. Cache Placement <ul><li>There are 3 categories of cache organization: </li></ul><ul><li>Direct-mapped. </li></ul><ul><li>Fully-associative. </li></ul><ul><li>Set-associative. </li></ul>
22. Direct-Mapped <ul><li>The block can appear in one place only. </li></ul><ul><li>Fastest & simplest organization but highest miss rate due to contention. </li></ul><ul><li>Mapping is usually: Block address % Number of blocks in cache. </li></ul>[Figure: address split into Tag | Index | Offset; the index selects one (valid, tag, data) entry and the stored tag is compared against the address tag.]
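The modulo mapping can be shown with a small sketch (the function name and cache size are hypothetical):

```python
def direct_mapped_slot(block_addr, num_blocks):
    """Direct-mapped placement: each block maps to exactly one cache
    index; the tag distinguishes which block currently occupies it."""
    index = block_addr % num_blocks
    tag = block_addr // num_blocks
    return index, tag

# Blocks 5 and 13 contend for the same slot in an 8-block cache:
print(direct_mapped_slot(5, 8))    # -> (5, 0)
print(direct_mapped_slot(13, 8))   # -> (5, 1)
```

Both blocks land on index 5, so a program alternating between them misses on every access – the contention mentioned above.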
23. Fully-Associative <ul><li>The block can appear anywhere in the cache. Slowest organization but lowest miss rate. </li></ul>[Figure: address split into Tag | Offset; the address tag is compared against every (valid, tag, data) entry in parallel.]
  24. Set-Associative <ul><li>The block can appear anywhere within a single set (n-way set associative). </li></ul><ul><li>The set number is usually: Block address % Number of sets in the cache. </li></ul>[Figure: address split into Tag | Index | Offset; the index selects a set and the tag is compared against the n entries of that set in parallel.]
25. Cache Replacement <ul><li>In a direct-mapped cache, only one block is checked for a hit, & only that block can be replaced. </li></ul><ul><li>For set-associative or fully-associative caches, the evicted block is chosen using one of three strategies: </li></ul><ul><li>Random. </li></ul><ul><li>LRU. </li></ul><ul><li>FIFO. </li></ul>
26. Cache Replacement <ul><li>As associativity increases, LRU becomes harder & more expensive to implement, so LRU is approximated. </li></ul><ul><li>LRU & random perform almost equally for larger caches, but LRU outperforms the others for small caches. </li></ul>
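A minimal simulation of true LRU in a fully-associative cache makes the hit/miss behavior concrete (illustrative only; as noted above, real hardware approximates LRU):

```python
from collections import OrderedDict

def simulate_lru(capacity, refs):
    """Count hits/misses of a fully-associative cache of `capacity`
    blocks with true LRU replacement over a sequence of block refs."""
    cache = OrderedDict()
    hits = misses = 0
    for b in refs:
        if b in cache:
            hits += 1
            cache.move_to_end(b)           # mark as most recently used
        else:
            misses += 1
            if len(cache) == capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[b] = True
    return hits, misses

# Block references echo the earlier caching example (blocks 12 and 14):
print(simulate_lru(4, [0, 8, 0, 9, 14, 8, 12, 14]))   # -> (3, 5)
```

When block 12 misses on a full cache, block 0 (least recently used) is the victim, so the later re-reference to 14 still hits.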
27. Write Policies <ul><li>Write Back: the information is only written to the block in the cache. </li></ul><ul><li>Write Through: the information is written both to the block in the cache & to the block in lower levels. </li></ul>
  28. Reducing the Miss Rate <ul><li>Larger Block Sizes & Caches. </li></ul><ul><li>Higher Associativity. </li></ul>
29. Application Memory Management <ul><li>Allocation: allocating portions of memory to programs at their request. </li></ul><ul><li>Recycling: freeing memory for reuse when it is no longer needed. </li></ul>
  30. Memory Management <ul><li>In many embedded systems, the kernel and application programs execute in the same space, i.e., there is no memory protection. </li></ul><ul><li>Embedded operating systems therefore make a large effort to reduce their memory footprint. </li></ul>
  31. Memory Management <ul><li>An RTOS keeps its memory footprint small by including only the functionality necessary for an application. </li></ul><ul><li>We have two kinds of memory management: </li></ul><ul><li>Static </li></ul><ul><li>Dynamic </li></ul>
32. Static memory management <ul><li>provides tasks with temporary data space. </li></ul><ul><li>The system’s free memory is divided into a pool of fixed-sized memory blocks. </li></ul><ul><li>When a task finishes using a memory block it must return it to the pool. </li></ul>
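The fixed-size block pool described above can be sketched as follows (the class and method names are hypothetical, and block counts are made up):

```python
class FixedBlockPool:
    """Static-allocation sketch: free memory is carved into fixed-size
    blocks at startup; tasks borrow a block and must return it."""
    def __init__(self, block_size, num_blocks):
        self.block_size = block_size
        self.free = [bytearray(block_size) for _ in range(num_blocks)]

    def alloc(self):
        # O(1), no searching, no fragmentation within the pool.
        return self.free.pop() if self.free else None   # None = exhausted

    def release(self, block):
        self.free.append(block)   # task returns the block to the pool

pool = FixedBlockPool(block_size=32, num_blocks=2)
a, b = pool.alloc(), pool.alloc()
print(pool.alloc())        # -> None (pool exhausted)
pool.release(a)
print(pool.alloc() is a)   # -> True (the returned block is recycled)
```

Allocation and release are constant-time, which is why fixed-block pools suit real-time systems: the worst case is known up front.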
33. <ul><li>Another way to provide temporary space for tasks is via priorities: </li></ul><ul><li>A high priority pool : sized for the worst-case memory demand of the system. </li></ul><ul><li>A low priority pool : given the remaining free memory. </li></ul>Static memory management
  34. Dynamic memory management <ul><li>employs memory swapping, overlays, multiprogramming with a fixed number of tasks (MFT), multiprogramming with a variable number of tasks (MVT) and demand paging. </li></ul><ul><li>Overlays allow programs larger than the available memory to be executed by partitioning the code and swapping the parts between disk and memory. </li></ul>
  35. <ul><li>MFT: a fixed number of equal-sized code partitions are in memory at the same time. </li></ul><ul><li>MVT: like MFT, except that the size of each partition depends on the needs of the program. </li></ul><ul><li>Demand paging : uses fixed-size pages that reside in non-contiguous memory, unlike the partitions in MFT and MVT. </li></ul>Dynamic memory management
36. Memory Allocation <ul><li>is the process of assigning blocks of memory on request. </li></ul><ul><li>Memory for user processes is divided into multiple partitions of varying sizes. </li></ul><ul><li>Hole : a block of available memory. </li></ul>
  37. Static memory allocation <ul><li>means that all memory is allocated to each process or thread when the system starts up, so you never have to ask for memory while a process is executing. This is costly, since each process must reserve its worst-case memory demand up front. </li></ul><ul><li>The advantage of this in embedded systems is that the whole class of memory-related bugs (leaks, allocation failures, and dangling pointers) simply does not exist. </li></ul>
  38. Dynamic Storage-Allocation <ul><li>How to satisfy a request of size n from a list of free holes. This means that during runtime, a process asks the system for a memory block of a certain size to hold a certain data structure. </li></ul><ul><li>Some RTOSs support a timeout function on a memory request: you ask the OS for memory, and the request fails if it cannot be satisfied within a prescribed time limit. </li></ul>
39. Dynamic Storage-Allocation Schemes <ul><li>First-fit : Allocate the first hole that is big enough, so it is fast. </li></ul><ul><li>Best-fit : Allocate the smallest hole that is big enough; must search the entire list, unless it is ordered by size. </li></ul><ul><li>Buddy : divides memory into power-of-two partitions to try to satisfy a memory request as suitably as possible. </li></ul>
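First-fit and best-fit over a list of free holes can be sketched as follows (the hole sizes are made-up):

```python
def first_fit(holes, n):
    """Return the index of the first hole >= n, or None."""
    for i, h in enumerate(holes):
        if h >= n:
            return i
    return None

def best_fit(holes, n):
    """Return the index of the smallest hole >= n (scans the whole list)."""
    best = None
    for i, h in enumerate(holes):
        if h >= n and (best is None or h < holes[best]):
            best = i
    return best

holes = [200, 50, 120, 90]      # free-hole sizes in KB (hypothetical)
print(first_fit(holes, 80))     # -> 0 (200K hole; leaves a 120K remainder)
print(best_fit(holes, 80))      # -> 3 (90K hole; leaves only a 10K remainder)
```

First-fit stops at the first candidate; best-fit pays a full scan to waste less of each hole, which is the trade-off the slide describes.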
40. Buddy memory allocation <ul><li>allocates memory in powers of 2. </li></ul><ul><li>it only allocates blocks of certain sizes. </li></ul><ul><li>has many free lists, one for each permitted size. </li></ul>
  41. How buddy works <ul><li>If memory is to be allocated: </li></ul><ul><li>1. Look for a memory slot of a suitable size (the minimal 2^k block that is larger than the requested memory). </li></ul><ul><ul><li>If one is found, it is allocated to the program. </li></ul></ul><ul><ul><li>If not, the system tries to make a suitable memory slot by doing the following: </li></ul></ul><ul><ul><ul><li>Split a free memory slot larger than the requested memory size into halves. </li></ul></ul></ul><ul><ul><ul><li>If the lower limit is reached, allocate that amount of memory. </li></ul></ul></ul><ul><ul><ul><li>Go back to step 1 (look for a memory slot of a suitable size). </li></ul></ul></ul><ul><ul><ul><li>Repeat until a suitable memory slot is found. </li></ul></ul></ul>
42. How buddy works <ul><li>If memory is to be freed: </li></ul><ul><li>1. Free the block of memory. </li></ul><ul><li>2. Look at the neighboring block – is it free too? </li></ul><ul><li>3. If it is, combine the two and go back to step 2; repeat until either the upper limit is reached (all memory is freed) or a non-free neighbor block is encountered. </li></ul>
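The allocation and freeing steps above can be sketched as a toy buddy allocator. This is a sketch under simplifying assumptions (the class name and sizes are hypothetical; a real implementation tracks allocated sizes and bounds the search):

```python
class BuddyAllocator:
    """Buddy-system sketch: block sizes are powers of two; on free, a
    block merges with its buddy (offset ^ size) whenever both are free."""
    def __init__(self, total, min_block):
        self.min_block = min_block
        self.free = {total: [0]}          # size -> list of free offsets

    def alloc(self, n):
        size = self.min_block
        while size < n:
            size *= 2                     # round request up to a power of 2
        s = size
        while s not in self.free or not self.free[s]:
            s *= 2                        # find a larger free slot to split
        addr = self.free[s].pop()
        while s > size:                   # split down, freeing upper halves
            s //= 2
            self.free.setdefault(s, []).append(addr + s)
        return addr, size

    def free_block(self, addr, size):
        buddy = addr ^ size
        if buddy in self.free.get(size, []):
            self.free[size].remove(buddy)              # coalesce with buddy
            self.free_block(min(addr, buddy), size * 2)
        else:
            self.free.setdefault(size, []).append(addr)

b = BuddyAllocator(total=1024, min_block=64)
a, sz = b.alloc(64)    # splits 1024 -> 512 -> 256 -> 128 -> 64
print((a, sz))         # -> (0, 64)
b.free_block(a, sz)    # buddies merge all the way back to one 1024K block
print(b.free[1024])    # -> [0]
```

Note the buddy address is just `offset ^ size`: the two halves of any split differ only in that one bit, which is what makes merging cheap.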
43. Example: buddy system. A, B, C, and D are allocation requests; memory starts and ends as a single 1024K block:
t=0: 1024K
t=1: A-64K | 64K | 128K | 256K | 512K (A allocated: 1024K split down to 64K)
t=2: A-64K | 64K | B-128K | 256K | 512K (B allocated)
t=3: A-64K | C-64K | B-128K | 256K | 512K (C allocated)
t=4: A-64K | C-64K | B-128K | D-128K | 128K | 512K (D allocated: the 256K block is split)
t=5: A-64K | 64K | B-128K | D-128K | 128K | 512K (C freed)
t=6: 128K | B-128K | D-128K | 128K | 512K (A freed; the two 64K buddies merge)
t=7: 256K | D-128K | 128K | 512K (B freed; merges into 256K)
t=8: 1024K (D freed; everything merges back)
44. The problem of fragmentation <ul><li>Neither first fit nor best fit is clearly better than the other in terms of storage utilization, but first fit is generally faster. </li></ul><ul><li>All the previous schemes have external fragmentation. </li></ul><ul><li>The buddy memory system has little external fragmentation. </li></ul>
  45. <ul><li>External Fragmentation – total memory space exists to satisfy a request, but it is not contiguous. </li></ul><ul><li>Internal Fragmentation – allocated memory may be slightly larger than the requested memory; this size difference is memory internal to a partition that is not being used. </li></ul>Fragmentation
46. Example: Internal Fragmentation
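A worked sketch of internal fragmentation under a fixed block size (the request and block sizes are made-up):

```python
def internal_fragmentation(request, block_size):
    """Bytes wasted when a request is rounded up to whole fixed-size blocks."""
    blocks = -(-request // block_size)        # ceiling division
    return blocks * block_size - request

# A 100-byte request served from 64-byte blocks occupies 2 blocks
# (128 bytes); the 28 unused bytes are internal fragmentation.
print(internal_fragmentation(100, 64))   # -> 28
```

The waste is internal to the allocated partition, so no other request can use it until the block is freed.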
47. Memory Protection <ul><li>It may not be acceptable for a hardware failure to corrupt data in memory, so use of a hardware protection mechanism is recommended. </li></ul><ul><li>This hardware protection mechanism can be found in the processor or MMU. </li></ul><ul><li>MMUs also enable address translation, which is often not needed in real-time systems because cross-compilers generate position-independent code (PIC). </li></ul>
48. Hardware Memory Protection
49. Recycling techniques <ul><li>There are many ways for automatic memory managers to determine what memory is no longer required. </li></ul><ul><li>Garbage collection relies on determining which blocks are not pointed to by any program variables. </li></ul>
  50. Recycling techniques <ul><li>Tracing collectors : automatic memory managers that follow pointers to determine which blocks of memory are reachable from program variables. </li></ul><ul><li>Reference counts : a count of how many references (that is, pointers) there are to a particular memory block from other blocks. </li></ul>
51. Example: Tracing collectors <ul><li>Mark-sweep collection: </li></ul><ul><li>Phase 1: all blocks that can be reached by the program are marked. </li></ul><ul><li>Phase 2: the collector sweeps all allocated memory, searching for blocks that have not been marked. If it finds any, it returns them to the allocator for reuse. </li></ul>
52. Mark-sweep collection <ul><li>The drawbacks : </li></ul><ul><li>It must scan the entire memory in use before any memory can be freed. </li></ul><ul><li>It must run to completion or, if interrupted, start again from scratch. </li></ul>
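The two phases can be sketched over a toy heap (modeling the heap as a dict mapping each block id to the ids it points to is an illustrative assumption):

```python
def mark_sweep(heap, roots):
    """Mark-sweep sketch: heap maps block id -> list of ids it points to;
    roots are the ids held directly in program variables."""
    marked = set()
    stack = list(roots)
    while stack:                      # phase 1: mark everything reachable
        b = stack.pop()
        if b not in marked:
            marked.add(b)
            stack.extend(heap[b])
    garbage = set(heap) - marked      # phase 2: sweep unmarked blocks
    for b in garbage:
        del heap[b]                   # return them to the allocator
    return garbage

heap = {'a': ['b'], 'b': [], 'c': ['d'], 'd': ['c']}   # c<->d is an unreachable cycle
print(sorted(mark_sweep(heap, roots=['a'])))   # -> ['c', 'd']
```

Note that the unreachable cycle c<->d is collected, which simple reference counting (next slide) cannot do.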
53. Example: Reference counts <ul><li>Simple reference counting : </li></ul><ul><li>A reference count is kept for each object. </li></ul><ul><li>The count is incremented for each new reference, and decremented if a reference is overwritten or if the referring object is recycled. </li></ul><ul><li>If a reference count falls to zero, the object is no longer required and can be recycled. </li></ul><ul><li>It is hard to implement efficiently because of the cost of updating the counts. </li></ul>
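Simple reference counting can be sketched as follows (the class and method names are hypothetical):

```python
class RefCounted:
    """Reference-counting sketch: the count tracks how many pointers
    refer to the object; at zero it is recycled immediately."""
    def __init__(self, name):
        self.name, self.count, self.recycled = name, 0, False

    def add_ref(self):
        self.count += 1          # a new reference was created

    def release(self):
        self.count -= 1          # a reference was overwritten or dropped
        if self.count == 0:
            self.recycled = True # recycle as soon as the count hits zero

obj = RefCounted('buffer')
obj.add_ref(); obj.add_ref()     # two pointers refer to obj
obj.release()
print(obj.recycled)              # -> False (one reference remains)
obj.release()
print(obj.recycled)              # -> True (count reached zero)
```

Unlike mark-sweep, the object is reclaimed the moment its count reaches zero, but every pointer copy or overwrite pays the count-update cost noted above.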
54. References: <ul><li>http://www.memorymanagement.org/articles/recycle.html </li></ul><ul><li>http://www.dedicated-systems.com </li></ul><ul><li>http://www.Wikipedia.org </li></ul><ul><li>http://www.cs.utexas.edu </li></ul><ul><li>http://www.netrino.com </li></ul><ul><li>S. Baskiyar and N. Meghanathan, “A Survey of Contemporary Real-Time Operating Systems.” </li></ul>
