Chapter 4: Memory Management
Part 1: Mechanisms for Managing Memory
Chapter 4 2
CS 1550, cs.pitt.edu
(originally modified by Ethan)
Memory management
 Basic memory management
 Swapping
 Virtual memory
 Page replacement algorithms
 Modeling page replacement algorithms
 Design issues for paging systems
 Implementation issues
 Segmentation
In an ideal world…
 The ideal world has memory that is
 Very large
 Very fast
 Non-volatile (doesn’t go away when power is turned off)
 The real world has memory that is:
 Very large
 Very fast
 Affordable!
Pick any two…
 Memory management goal: make the real world look
as much like the ideal world as possible
Memory hierarchy
 What is the memory hierarchy?
 Different levels of memory
 Some are small & fast
 Others are large & slow
 What levels are usually included?
 Cache: small amount of fast, expensive memory
 L1 (level 1) cache: usually on the CPU chip
 L2 & L3 cache: off-chip, made of SRAM
 Main memory: medium-speed, medium price memory (DRAM)
 Disk: many gigabytes of slow, cheap, non-volatile storage
 Memory manager handles the memory hierarchy
Basic memory management
 Components include
 Operating system (perhaps with device drivers)
 Single process
 Goal: lay these out in memory
 Memory protection may not be an issue (only one program)
 Flexibility may still be useful (allow OS changes, etc.)
 No swapping or paging
[Figure: three single-process layouts: (a) OS in RAM at the bottom, user program above it; (b) user program in RAM with the OS in ROM at the top of memory (0xFFFF); (c) device drivers in ROM at the top, user program below, OS in RAM at the bottom]
Fixed partitions: multiple programs
 Fixed memory partitions
 Divide memory into fixed spaces
 Assign a process to a space when it’s free
 Mechanisms
 Separate input queues for each partition
 Single input queue: better ability to optimize CPU usage
[Figure: memory with the OS at the bottom and Partitions 1–4 above it, boundaries at 0, 100K, 500K, 600K, 700K, and 900K; drawn twice, once with a separate input queue per partition and once with a single shared input queue]
How many programs is enough?
 Several memory partitions (fixed or variable size)
 Lots of processes wanting to use the CPU
 Tradeoff
 More processes utilize the CPU better
 Fewer processes use less memory (cheaper!)
 How many processes do we need to keep the CPU
fully utilized?
 This will help determine how much memory we need
 Is this still relevant with memory costing less than $1/GB?
Modeling multiprogramming
 More I/O wait means less
processor utilization
 At 20% I/O wait, 3–4
processes fully utilize CPU
 At 80% I/O wait, even 10
processes aren’t enough
 This means that the OS
should have more
processes if they’re I/O
bound
 More processes =>
memory management &
protection more
important!
[Figure: CPU utilization (0 to 1.0) vs. degree of multiprogramming (0 to 10), with curves for 80%, 50%, and 20% I/O wait]
Multiprogrammed system performance
 Arrival and work requirements of 4 jobs
 CPU utilization for 1–4 jobs with 80% I/O wait
 Sequence of events as jobs arrive and finish
 Numbers show amount of CPU time jobs get in each interval
 More processes => better utilization, less time per process
Job   Arrival time   CPU needed
 1       10:00           4
 2       10:10           3
 3       10:15           2
 4       10:20           2

Number of jobs    1     2     3     4
CPU idle         0.80  0.64  0.51  0.41
CPU busy         0.20  0.36  0.49  0.59
CPU/process      0.20  0.18  0.16  0.15

[Figure: timeline of CPU time given to jobs 1–4 over the intervals marked at 0, 10, 15, 20, 22, 27.6, 28.2, and 31.7]
Memory and multiprogramming
 Memory needs two things for multiprogramming
 Relocation
 Protection
 The OS cannot be certain where a program will be
loaded in memory
 Variables and procedures can’t use absolute locations in
memory
 Several ways to guarantee this
 The OS must keep processes’ memory separate
 Protect a process from other processes reading or modifying its memory
 Protect a process from modifying its own memory in undesirable ways (such as writing to program code)
Base and limit registers
 Special CPU registers: base &
limit
 Access to the registers limited to
system mode
 Registers contain
 Base: start of the process’s
memory partition
 Limit: length of the process’s
memory partition
 Address generation
 Physical address: location in
actual memory
 Logical address: location from
the process’s point of view
 Physical address = base + logical
address
 Logical address larger than limit
=> error
[Figure: a process partition inside physical memory (0 to 0xFFFF), with Base = 0x9000 marking its start and Limit = 0x2000 its length. Example: logical address 0x1204 maps to physical address 0x1204 + 0x9000 = 0xa204]
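The base-and-limit calculation can be sketched in a few lines (illustrative Python, not any real CPU; the function name and error behavior are assumptions):

```python
# Sketch of base-and-limit translation: the MMU adds the base register to
# every logical address and faults if the address is at or past the limit.

def translate(logical, base, limit):
    """Return the physical address, or raise if outside the partition."""
    if logical >= limit:
        raise MemoryError("logical address exceeds limit")
    return base + logical

# Slide example: base = 0x9000, limit = 0x2000
print(hex(translate(0x1204, base=0x9000, limit=0x2000)))  # 0xa204
```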
Swapping
 Memory allocation changes as
 Processes come into memory
 Processes leave memory
 Swapped to disk
 Complete execution
 Gray regions are unused memory
[Figure: seven snapshots of memory over time: A arrives; B and C arrive above it; A is swapped out; D arrives; B leaves; A is swapped back in at a different location; gray regions are unused memory]
Swapping: leaving room to grow
 Need to allow for programs
to grow
 Allocate more memory for
data
 Larger stack
 Handled by allocating more
space than is necessary at
the start
 Inefficient: wastes memory
that’s not currently in use
 What if the process requests
too much memory?
[Figure: OS at the bottom of memory; above it, processes A and B, each with code, data, and stack segments and unused room left so the process can grow]
Tracking memory usage: bitmaps
 Keep track of free / allocated memory regions with a bitmap
 One bit in map corresponds to a fixed-size region of memory
 Bitmap is a constant size for a given amount of memory regardless of how
much is allocated at a particular time
 Chunk size determines efficiency
 At 1 bit per 4KB chunk, we need just 256 bits (32 bytes) per MB of memory
 For smaller chunks, we need more memory for the bitmap
 Can be difficult to find large contiguous free areas in bitmap
[Figure: 32 memory chunks holding regions A, B, C, and D with free gaps between them; the corresponding bitmap is 11111100 00111000 01111111 11111000]
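A bitmap allocator finds space by scanning for a run of clear bits; a minimal sketch (Python, with 1 = allocated and 0 = free as an assumed convention):

```python
# Find the first run of n free chunks in a bitmap (0 = free, 1 = allocated).
# The linear scan is why large contiguous areas are slow to find.

def find_free_run(bitmap, n):
    """Return the index where a run of n free chunks starts, or -1."""
    run = 0
    for i, bit in enumerate(bitmap):
        run = run + 1 if bit == 0 else 0
        if run == n:
            return i - n + 1
    return -1

bits = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
print(find_free_run(bits, 4))   # 6: four free chunks start at index 6
print(find_free_run(bits, 5))   # -1: no run of five free chunks
```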
Tracking memory usage: linked lists
 Keep track of free / allocated memory regions with a linked list
 Each entry in the list corresponds to a contiguous region of memory
 Entry can indicate either allocated or free (and, optionally, owning process)
 May have separate lists for free and allocated areas
 Efficient if chunks are large
 Fixed-size representation for each region
 More regions => more space needed for free lists
[Figure: the same memory regions as a linked list of (tag, start, length) entries: (A, 0, 6) (-, 6, 4) (B, 10, 3) (-, 13, 4) (C, 17, 9) (D, 26, 3) (-, 29, 3)]
Allocating memory
 Search through region list to find a large enough space
 Suppose there are several choices: which one to use?
 First fit: the first suitable hole on the list
 Next fit: the first suitable after the previously allocated hole
 Best fit: the smallest hole that is larger than the desired region (wastes least
space?)
 Worst fit: the largest available hole (leaves largest fragment)
 Option: maintain separate queues for different-size holes
Example: free holes as (start, length): (6, 5) (19, 14) (52, 25) (102, 30) (135, 16) (202, 10) (302, 20) (350, 30) (411, 19) (510, 3)
 Allocate 20 blocks first fit: takes the 25-block hole, leaving 5
 Allocate 12 blocks next fit: leaving 18
 Allocate 13 blocks best fit: takes the 14-block hole, leaving 1
 Allocate 15 blocks worst fit: takes a 30-block hole, leaving 15
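The placement strategies can be sketched over a hole list of (start, length) pairs (illustrative Python; each function only picks a hole and does not update the list, a simplification over the worked example):

```python
# Free-hole list as (start, length) pairs, matching the example above.
holes = [(6, 5), (19, 14), (52, 25), (102, 30), (135, 16),
         (202, 10), (302, 20), (350, 30), (411, 19), (510, 3)]

def first_fit(holes, n):
    # First hole big enough, scanning from the start of the list
    return next(h for h in holes if h[1] >= n)

def best_fit(holes, n):
    # Smallest hole that still fits (wastes the least space?)
    return min((h for h in holes if h[1] >= n), key=lambda h: h[1])

def worst_fit(holes, n):
    # Largest hole (leaves the largest fragment)
    return max(holes, key=lambda h: h[1])

print(first_fit(holes, 20))  # (52, 25): leaves 5
print(best_fit(holes, 13))   # (19, 14): leaves 1
print(worst_fit(holes, 15))  # (102, 30): leaves 15
```

Next fit is omitted here because it needs to remember where the previous search stopped.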
Freeing memory
 Allocation structures must be updated when memory is freed
 Easy with bitmaps: just set the appropriate bits in the bitmap
 Linked lists: modify adjacent elements as needed
 Merge adjacent free regions into a single region
 May involve merging two regions with the just-freed area
[Figure: four cases when freeing X: both neighbors allocated (A X B: no merge); a free neighbor on one side only (merge X with it); free space on both sides (all three become one free region)]
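The merge cases above can be sketched as list surgery (illustrative Python; regions as (start, length, tag) with 'A' allocated and 'F' free is an assumed representation):

```python
# Free region i, then merge it with free neighbors (the four cases above).

def free_region(regions, i):
    start, length, _ = regions[i]
    regions[i] = (start, length, 'F')
    if i + 1 < len(regions) and regions[i + 1][2] == 'F':
        # merge with the free region to the right
        regions[i] = (start, length + regions[i + 1][1], 'F')
        del regions[i + 1]
    if i > 0 and regions[i - 1][2] == 'F':
        # merge with the free region to the left
        s, l, _ = regions[i - 1]
        regions[i - 1] = (s, l + regions[i][1], 'F')
        del regions[i]
    return regions

regs = [(0, 4, 'F'), (4, 2, 'A'), (6, 3, 'F')]
print(free_region(regs, 1))  # [(0, 9, 'F')]: X merged with both neighbors
```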
Limitations of swapping
 Problems with swapping
 Process must fit into physical memory (impossible to run
larger processes)
 Memory becomes fragmented
 External fragmentation: lots of small free areas
 Compaction needed to reassemble larger free areas
 Processes are either in memory or on disk: half and half
doesn’t do any good
 Overlays solved the first problem
 Bring in pieces of the process over time (typically data)
 Still doesn’t solve the problem of fragmentation or
partially resident processes
Virtual memory
 Basic idea: allow the OS to hand out more memory
than exists on the system
 Keep recently used stuff in physical memory
 Move less recently used stuff to disk
 Keep all of this hidden from processes
 Processes still see an address space from 0 – max address
 Movement of information to and from disk handled by the
OS without process help
 Virtual memory (VM) especially helpful in
multiprogrammed system
 CPU schedules process B while process A waits for its
memory to be retrieved from disk
Virtual and physical addresses
 Program uses virtual
addresses
 Addresses local to the process
 Hardware translates virtual
address to physical address
 Translation done by the
Memory Management Unit
 Usually on the same chip as
the CPU
 Only physical addresses leave
the CPU/MMU chip
 Physical memory indexed
by physical addresses
[Figure: the CPU chip contains both the CPU and the MMU; virtual addresses pass from CPU to MMU, and only physical addresses go out on the bus to memory and the disk controller]
Paging and page tables
 Virtual addresses mapped to
physical addresses
 Unit of mapping is called a page
 All addresses in the same virtual
page are in the same physical
page
 Page table entry (PTE) contains
translation for a single page
 Table translates virtual page
number to physical page number
 Not all virtual memory has a
physical page
 Not every physical page need be
used
 Example:
 64 KB virtual memory
 32 KB physical memory
[Figure: 64 KB virtual address space (pages 0–4K through 60–64K) mapped onto 32 KB of physical memory (frames 0–4K through 28–32K); seven virtual pages map to frames 7, 4, 0, 3, 1, 5, and 6, and the rest are unmapped (-)]
What’s in a page table entry?
 Each entry in the page table contains
 Valid bit: set if this logical page number has a corresponding physical frame
in memory
 If not valid, remainder of PTE is irrelevant
 Page frame number: page in physical memory
 Referenced bit: set if data on the page has been accessed
 Dirty (modified) bit: set if data on the page has been modified
 Protection information
[Figure: PTE layout: page frame number alongside the V (valid), R (referenced), and D (dirty) bits and the protection field]
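A PTE is just packed bits; one possible layout (an assumption — real layouts vary by architecture) in Python:

```python
# Pack a page table entry: flag bits in the low bits, frame number above.
# The protection field is omitted to keep the sketch small.
VALID, REF, DIRTY = 1 << 0, 1 << 1, 1 << 2
FRAME_SHIFT = 3   # frame number starts above the three flag bits

def make_pte(frame, valid=True, referenced=False, dirty=False):
    pte = frame << FRAME_SHIFT
    if valid:
        pte |= VALID
    if referenced:
        pte |= REF
    if dirty:
        pte |= DIRTY
    return pte

pte = make_pte(0x123, dirty=True)
print(hex(pte))                     # 0x91d: frame 0x123 with V and D set
print(pte >> FRAME_SHIFT == 0x123)  # True
```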
Example: 4 KB (4096-byte) pages with 32-bit logical addresses: 2^d = 4096 gives d = 12 offset bits, leaving 32 - 12 = 20 bits of page number (p).
Mapping logical => physical address
 Split address from CPU into
two pieces
 Page number (p)
 Page offset (d)
 Page number
 Index into page table
 Page table contains base
address of page in physical
memory
 Page offset
 Added to base address to get
actual physical memory
address
 Page size = 2^d bytes
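For the 4 KB / 32-bit example above, the split and recombination are simple bit operations (a minimal Python sketch):

```python
# 4 KB pages: low 12 bits are the offset d, high 20 bits the page number p.
PAGE_SHIFT = 12                      # 2**12 = 4096
OFFSET_MASK = (1 << PAGE_SHIFT) - 1  # 0xFFF

def split(addr):
    """Logical address -> (page number, offset)."""
    return addr >> PAGE_SHIFT, addr & OFFSET_MASK

def combine(frame, offset):
    """(physical frame, offset) -> physical address."""
    return (frame << PAGE_SHIFT) | offset

p, d = split(0x12345678)
print(hex(p), hex(d))      # 0x12345 0x678
print(hex(combine(p, d)))  # 0x12345678
```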
[Figure: address translation architecture: the CPU emits page number p and offset d; p indexes the page table to fetch frame number f; (f, d) addresses physical memory]
Address translation architecture
[Figure: logical memories and page tables for two processes: P0 (pages 0–4) and P1 (pages 0–1) map through their page tables into a 10-frame physical memory, which holds P0's and P1's pages interleaved with free frames]
[Figure: two-level lookup: entries in the 1st level page table point to 2nd level page tables, whose entries in turn point to pages in main memory]
Two-level page tables
 Problem: page tables can be too
large
 2^32 bytes in 4 KB pages need 1 million PTEs
 Solution: use multi-level page
tables
 “Page size” in first page table is
large (megabytes)
 PTE marked invalid in first page
table needs no 2nd level page
table
 1st level page table has pointers
to 2nd level page tables
 2nd level page table has actual
physical page numbers in it
More on two-level page tables
 Tradeoffs between 1st and 2nd level page table sizes
 Total number of bits indexing 1st and 2nd level table is
constant for a given page size and logical address length
 Tradeoff between number of bits indexing 1st and number
indexing 2nd level tables
 More bits in 1st level: fine granularity at 2nd level
 Fewer bits in 1st level: maybe less wasted space?
 All addresses in table are physical addresses
 Protection bits kept in 2nd level table
[Logical address layout: p1 = 10 bits | p2 = 9 bits | offset = 13 bits]
Two-level paging: example
 System characteristics
 8 KB pages
 32-bit logical address divided into 13 bit page offset, 19 bit page number
 Page number divided into:
 10 bit 1st level index (p1)
 9 bit 2nd level index (p2)
 Logical address looks like this:
 p1 is an index into the 1st level page table
 p2 is an index into the 2nd level page table pointed to by p1
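The 10/9/13 split is again just shifting and masking (a minimal Python sketch of the field extraction):

```python
# 32-bit logical address with 8 KB pages: 10-bit p1, 9-bit p2, 13-bit offset.

def split_2level(addr):
    offset = addr & ((1 << 13) - 1)     # low 13 bits
    p2 = (addr >> 13) & ((1 << 9) - 1)  # next 9 bits
    p1 = addr >> 22                     # top 10 bits
    return p1, p2, offset

print(split_2level(0xFFFFFFFF))  # (1023, 511, 8191): all three fields maxed
print(split_2level(1 << 13))     # (0, 1, 0): 2nd level index 1, offset 0
```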
[Figure: two-level translation: p1 indexes the 1st level page table (located via the page table base); p2 indexes the selected 2nd level page table to get a 19-bit frame number, which is combined with the 13-bit offset to form the physical address]
Implementing page tables in hardware
 Page table resides in main (physical) memory
 CPU uses special registers for paging
 Page table base register (PTBR) points to the page table
 Page table length register (PTLR) contains length of page table:
restricts maximum legal logical address
 Translating an address requires two memory accesses
 First access reads page table entry (PTE)
 Second access reads the data / instruction from memory
 Reduce number of memory accesses
 Can’t avoid second access (we need the value from memory)
 Eliminate first access by keeping a hardware cache (called a
translation lookaside buffer or TLB) of recently used page table
entries
[Figure: example TLB holding logical page numbers (8, 2, 3, 12, 29, 22, 7, plus an unused entry) alongside their physical frame numbers]
Translation Lookaside Buffer (TLB)
 Search the TLB for the desired
logical page number
 Search entries in parallel
 Use standard cache techniques
 If desired logical page number is
found, get frame number from
TLB
 If desired logical page number
isn’t found
 Get frame number from page
table in memory
 Replace an entry in the TLB
with the logical & physical page
numbers from this reference
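The hit/miss/replace logic can be sketched with a tiny TLB (illustrative Python; a real TLB searches all entries in parallel in hardware, and the FIFO replacement here is an assumed policy):

```python
from collections import OrderedDict

class TLB:
    """Toy TLB: maps logical page -> frame, FIFO replacement on miss."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, page, page_table):
        if page in self.entries:               # TLB hit
            return self.entries[page], True
        frame = page_table[page]               # miss: walk the page table
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the oldest entry
        self.entries[page] = frame
        return frame, False

tlb = TLB(capacity=2)
page_table = {0: 7, 1: 4, 2: 9}
print(tlb.lookup(0, page_table))  # (7, False): miss, entry loaded into TLB
print(tlb.lookup(0, page_table))  # (7, True): hit
```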
Handling TLB misses
 If PTE isn’t found in TLB, OS needs to do the lookup in the
page table
 Lookup can be done in hardware or software
 Hardware TLB replacement
 CPU hardware does page table lookup
 Can be faster than software
 Less flexible than software, and more complex hardware
 Software TLB replacement
 OS gets TLB exception
 Exception handler does page table lookup & places the result into the
TLB
 Program continues after return from exception
 Larger TLB (lower miss rate) can make this feasible
How long do memory accesses take?
 Assume the following times:
 TLB lookup time = a (often zero - overlapped in CPU)
 Memory access time = m
 Hit ratio (h) is percentage of time that a logical page number
is found in the TLB
 Larger TLB usually means higher h
 TLB structure can affect h as well
 Effective access time (an average) is calculated as:
 EAT = (m + a)h + (m + m + a)(1-h)
 EAT = a + (2 - h)m
 Interpretation
 Reference always requires TLB lookup, 1 memory access
 TLB misses also require an additional memory reference
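Plugging numbers into the EAT formula (Python; the m, a, and h values below are made up for illustration):

```python
# Effective access time: hits cost one memory access plus the TLB lookup,
# misses cost an extra memory access for the page table walk.

def eat(m, a, h):
    return (m + a) * h + (2 * m + a) * (1 - h)   # == a + (2 - h) * m

m, a, h = 100, 0, 0.75   # 100 ns memory, free TLB lookup, 75% hit ratio
print(eat(m, a, h))      # 150.0 ns, matching a + (2 - h)m = 1.25 * 100
```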
Inverted page table
 Reduce page table size further: keep one entry for
each frame in memory
 PTE contains
 Virtual address pointing to this frame
 Information about the process that owns this page
 Search page table by
 Hashing the virtual page number and process ID
 Starting at the entry corresponding to the hash result
 Search until either the entry is found or a limit is reached
 Page frame number is index of PTE
 Improve performance by using more advanced
hashing algorithms
[Figure: inverted page table lookup: hash the (process ID, 19-bit page number p) pair; search the table for a matching (pid, p) entry; the matching entry's index k is the page frame number, combined with the 13-bit offset to form the physical address]
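The hash-then-search lookup can be sketched as follows (illustrative Python using linear probing; the hash function and collision scheme are assumptions, and the slides suggest real systems use more advanced hashing):

```python
# One entry per physical frame; each slot holds (pid, virtual page) or None.
NFRAMES = 8
table = [None] * NFRAMES

def lookup(pid, page):
    """Return the frame holding (pid, page), probing from the hash slot."""
    start = hash((pid, page)) % NFRAMES
    for i in range(NFRAMES):
        frame = (start + i) % NFRAMES
        if table[frame] == (pid, page):
            return frame              # frame number is the entry's index
    raise KeyError("page fault")      # searched the whole table: not resident

# Place a page at its hash slot, then find it again.
table[hash((3, 42)) % NFRAMES] = (3, 42)
print(lookup(3, 42) == hash((3, 42)) % NFRAMES)  # True
```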
