Cache Memory:
Cache – A safe place for storing or hiding things, like a desk for
books borrowed from the library
The simplest way to assign a location of main memory data in
the cache is to assign the cache location based on the address
of the word in memory. This cache structure is called direct
mapped, since each memory location is mapped directly to
exactly one location in the cache.
For example, almost all direct-mapped caches use this
mapping to find a block:
(Block address) modulo (Number of blocks in the cache)
Cache Memory:
If the number of entries in the cache is a power of 2, then the
modulo can be computed simply by using the low-order
log2(cache size in blocks) bits of the address.
Thus, an 8-block cache uses the three lowest bits of the block
address (8 = 2^3).
Similarly, a 4-block cache uses the two lowest bits as the index,
i.e. 00, 01, 10, and 11
Cache Memory:
(Block address) modulo (Number of blocks in the cache)
For example, the memory addresses 1 (binary 00001) and
29 (binary 11101) map to locations 1 (binary 001) and 5
(binary 101) in a direct-mapped cache of eight words.
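The mapping rule above can be sketched in a few lines of Python (the cache size here is assumed for illustration):

```python
NUM_BLOCKS = 8  # 8 = 2^3, so the index is the 3 lowest bits of the block address

def cache_index(block_address, num_blocks=NUM_BLOCKS):
    # (Block address) modulo (Number of blocks in the cache)
    return block_address % num_blocks

def cache_index_low_bits(block_address, num_blocks=NUM_BLOCKS):
    # Same result via the low-order log2(num_blocks) bits (power of 2 only)
    return block_address & (num_blocks - 1)

# the two forms agree for every block address
assert all(cache_index(a) == cache_index_low_bits(a) for a in range(64))
print(cache_index(1), cache_index(29))  # 1 5
```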
Cache Memory:
Tag - A field in a table used for a memory hierarchy that
contains the address information required to identify whether
the associated block in the hierarchy corresponds to a
requested word.
Cache Memory:
Valid Bit – A field in the tables of a memory hierarchy that
indicates whether the associated block in the hierarchy contains
valid data: 1 if valid data is present, 0 otherwise
Cache Memory:
Accessing a cache:
Cache Hit – The requested data is present in the cache
Cache Miss – The requested data is not present in the cache
and must be fetched from main memory
Initially the cache is empty, so any data requested is not
available in the cache and every access is a cache miss
Cache Memory:
A sequence of nine memory references to an empty
eight-block cache
Cache Memory:
The address in the memory is divided into
Tag field – compared with tag field of the cache
Cache index – used to select the block in cache
Cache Memory:
For 32-bit addresses:
The cache size is 2^n blocks, so n bits are used for the index
The block size is 2^m words (2^(m+2) bytes), so m bits are used
for the word within the block, and two bits are used for the byte
part of the address
The size of the tag field is therefore 32 − (n + m + 2) bits
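As a worked sketch of this breakdown (the cache and block sizes here are assumed, not from the slides):

```python
# Field widths of a 32-bit address for an assumed cache configuration.
ADDRESS_BITS = 32
n = 10                 # 2^10 = 1024 blocks   -> 10 index bits
m = 2                  # 2^2  = 4 words/block -> 2 word-offset bits
BYTE_OFFSET_BITS = 2   # 4 bytes per word     -> 2 byte-offset bits

tag_bits = ADDRESS_BITS - (n + m + BYTE_OFFSET_BITS)
print(tag_bits)  # 18
```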
Cache Memory:
When the CPU tries to read from memory, the address will be
sent to a cache controller.
• The lowest k bits of the address will index a block in the
cache.
• If the block is valid and the tag matches the upper (m − k)
bits of the m-bit address, then that data will be sent to the
CPU.
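A hypothetical sketch of this lookup, with the class and field names invented for illustration:

```python
# Toy direct-mapped cache: valid bit + tag compare on every read.
class DirectMappedCache:
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.valid = [False] * num_blocks
        self.tag = [None] * num_blocks
        self.data = [None] * num_blocks

    def read(self, block_address):
        index = block_address % self.num_blocks  # low-order bits select the block
        tag = block_address // self.num_blocks   # remaining upper bits
        if self.valid[index] and self.tag[index] == tag:
            return ("hit", self.data[index])
        # miss: fetch from "memory", fill the entry, turn the valid bit on
        self.valid[index] = True
        self.tag[index] = tag
        self.data[index] = f"mem[{block_address}]"
        return ("miss", self.data[index])

cache = DirectMappedCache(8)
print(cache.read(22)[0])  # miss (cold cache)
print(cache.read(22)[0])  # hit
print(cache.read(30)[0])  # miss: 30 % 8 == 22 % 8, so block 22 is evicted
```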
Cache Memory:
Handling Cache Misses:
Steps to be taken on an instruction cache miss:
1. Send the original PC value (current PC – 4) to the memory.
2. Instruct main memory to perform a read and wait for the
memory to complete its access.
3. Write the cache entry, putting the data from memory in the
data portion of the entry, writing the upper bits of the address
(from the ALU) into the tag field, and turning the valid bit on.
4. Restart the instruction execution at the first step, which will
refetch the instruction, this time finding it in the cache.
Cache Memory:
Handling Writes:
The cache and main memory are said to be inconsistent if
they contain different data for the same location, e.g. a store
instruction may write new data to a cache block without
updating main memory
Write-through:
A scheme in which writes always update both the cache and
the next lower level of the memory hierarchy, ensuring that
data is always consistent between the two.
Cache Memory:
Handling Writes:
With a write-through scheme, data written into the cache must
also be written into main memory, which decreases
performance because the main-memory write takes many more
processor clock cycles.
The solution is a Write buffer:
Cache Memory:
Handling Writes:
Write buffer:
• A queue that holds data while the data is waiting to be
written to memory.
• After writing the data into the cache and into the write
buffer, the processor can continue execution.
• When a write to main memory completes, the entry in the
write buffer is freed.
• If the write buffer is full when the processor reaches a
write, the processor must stall until there is an empty
position in the write buffer.
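The write-buffer behavior described in these bullets can be sketched as follows (the capacity and method names are assumed):

```python
from collections import deque

# Toy write buffer: the processor stalls only when the buffer is full.
class WriteBuffer:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.queue = deque()

    def write(self, address, value):
        if len(self.queue) == self.capacity:
            return "stall"        # processor must wait for an empty position
        self.queue.append((address, value))
        return "continue"         # processor keeps executing

    def drain_one(self):
        # a completed main-memory write frees one entry
        if self.queue:
            self.queue.popleft()

buf = WriteBuffer(capacity=2)
print(buf.write(0x10, 1))  # continue
print(buf.write(0x14, 2))  # continue
print(buf.write(0x18, 3))  # stall (buffer full)
buf.drain_one()            # one write to main memory completes
print(buf.write(0x18, 3))  # continue
```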
Cache Memory:
Handling Writes:
Write-back:
An alternative to a write-through scheme is a scheme called
write-back.
A scheme that handles writes by updating values only in the
block in the cache, then writing the modified block to the
lower level of the hierarchy when the block is replaced.
Advantage: performance increases because blocks are written to
main memory only when they are replaced, not on every store
Cache Memory:
Split Cache:
A scheme in which a level of a memory hierarchy is composed
of two independent caches that operate in parallel with each
other, one handling instructions and one handling data
Measuring and Improving Cache Performance:
Ways to improve cache performance:
Reducing the miss rate by reducing the probability that two
different memory blocks will contend for the same cache
location.
Reducing the miss penalty by adding an additional level to the
hierarchy (multilevel caching)
Measuring and Improving Cache Performance:
CPU time can be divided into
the clock cycles that the CPU spends executing the program
the clock cycles that the CPU spends waiting for the memory
system.
CPU time = (CPU execution clock cycles + Memory-stall clock
cycles) * Clock cycle time
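A small worked example of this formula, with cycle counts and clock rate assumed for illustration:

```python
# CPU time = (CPU execution clock cycles + Memory-stall clock cycles) * Clock cycle time
cpu_execution_cycles = 2_000_000
memory_stall_cycles = 500_000
clock_cycle_time = 0.5e-9  # 0.5 ns per cycle -> 2 GHz clock

cpu_time = (cpu_execution_cycles + memory_stall_cycles) * clock_cycle_time
print(cpu_time)  # about 1.25e-03 seconds (1.25 ms)
```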
Measuring and Improving Cache Performance:
Memory-stall clock cycles can be defined as the sum of the
stall cycles coming from reads plus those coming from writes
Memory-stall clock cycles = (Read-stall cycles + Write-stall
cycles)
The read-stall cycles can be defined in terms of the number of
read accesses per program, the read miss rate, and the miss
penalty in clock cycles for a read:
Read-stall cycles = (Reads/Program) × Read miss rate × Read miss penalty
Measuring and Improving Cache Performance:
Write-stall cycles:
For a write-through scheme, we have two sources of stalls:
write misses, which usually require that we fetch the block
before continuing the write, and
write buffer stalls, which occur when the write buffer is full
when a write occurs.
Measuring and Improving Cache Performance:
In most write-through cache organizations, the read and write
miss penalties are the same (the time to fetch the block from
memory).
If we assume that the write buffer stalls are negligible, we can
combine the reads and writes by using a single miss rate and
miss penalty:
Memory-stall clock cycles = (Memory accesses/Program) × Miss rate × Miss penalty
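Assuming the standard combined form, memory-stall cycles = memory accesses × miss rate × miss penalty, a quick worked example with assumed figures:

```python
# Combined read/write stalls using a single miss rate (all values assumed).
memory_accesses = 1_000_000
miss_rate = 0.05      # 5% of accesses miss
miss_penalty = 100    # clock cycles to fetch a block from memory

memory_stall_cycles = memory_accesses * miss_rate * miss_penalty
print(round(memory_stall_cycles))  # 5000000
```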
Schemes for Reducing Cache Misses:
Cache misses can be reduced by the choice of mapping scheme:
• Direct Mapped Cache
• Set Associative Cache
• Fully Mapped (Fully Associative) Cache
Direct Mapped Cache
Location = Block No. in main memory (MM) % No. of blocks in cache memory (CM)
Set Associative Cache
A cache that has a fixed number of locations (at least two)
where each block can be placed.
A set-associative cache with n locations for a block is called
an n-way set-associative cache.
Set Associative Cache
• 2-way set associative – each set holds 2 blocks
• 4-way set associative – each set holds 4 blocks
• 8-way set associative – each set holds 8 blocks
• E.g. in a 2-way set-associative cache with 4 blocks, there are
two sets (set 0 and set 1), each holding two blocks, for a total
of 4 blocks.
Location = Block No. in MM % No. of sets
The least recently used (LRU) block in a set is replaced when a new block must be brought in
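A minimal sketch of a 2-way set-associative cache with LRU replacement (sizes and names assumed):

```python
from collections import OrderedDict

# Toy set-associative cache: Location = Block No. % No. of sets, LRU eviction.
class SetAssociativeCache:
    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # one LRU-ordered dict of resident tags per set
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block_address):
        index = block_address % self.num_sets
        tag = block_address // self.num_sets
        s = self.sets[index]
        if tag in s:
            s.move_to_end(tag)         # mark as most recently used
            return "hit"
        if len(s) == self.ways:
            s.popitem(last=False)      # evict the least recently used block
        s[tag] = True
        return "miss"

cache = SetAssociativeCache(num_sets=2, ways=2)
print([cache.access(b) for b in (0, 2, 0, 4, 0)])
# ['miss', 'miss', 'hit', 'miss', 'hit'] -> LRU keeps the recently used block 0
```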
Fully Mapped (Fully Associative)
• A block can be placed in any free location in the cache
memory
The least recently used block is replaced when no free location is available
Schemes for Reducing Cache Misses:
Multilevel Caches:
Two miss rates:
Global miss rate – The fraction of references that miss in all
levels of a multilevel cache.
Local miss rate – The fraction of references to one level of a
cache that miss; used in multilevel hierarchies.
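With assumed rates, the global miss rate of a two-level cache is the product of the local miss rates:

```python
# Relating local and global miss rates in a two-level cache (rates assumed).
l1_local_miss_rate = 0.05   # 5% of all references miss in L1
l2_local_miss_rate = 0.20   # 20% of the references that reach L2 miss there

# Global miss rate: fraction of ALL references that miss in both levels.
global_miss_rate = l1_local_miss_rate * l2_local_miss_rate
print(round(global_miss_rate, 4))  # 0.01 -> 1% of references go to main memory
```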
Virtual Memory:
A technique that uses main memory as a “cache” for
secondary storage.
Physical address - An address in main memory.
Program’s own address space - A separate range of memory
locations accessible only to the program
Virtual memory implements the translation of a program’s
address space to physical addresses.
Virtual Memory:
Need for Virtual Memory:
To allow efficient and safe sharing of memory among multiple
programs, such as for the memory needed by multiple virtual
machines for cloud computing
to remove the programming burden of a small, limited
amount of main memory, i.e. when the executing program
needs more memory than main memory provides.
Overlays – a program larger than main memory is divided into
pieces, and the pieces that are mutually exclusive are
identified so they can occupy the same memory in turn
Virtual Memory:
Protection:
A set of mechanisms for ensuring that multiple processes
sharing the processor, memory, or I/O devices cannot
interfere, intentionally or unintentionally, with one another
by reading or writing each other’s data.
These mechanisms also isolate the operating system from a
user process.
Virtual Memory:
Page – A block in virtual memory
Page Fault – An event that occurs when a requested page is
not in main memory, i.e. a miss in virtual memory
Virtual Address - An address that corresponds to a location in
virtual space and is translated by address mapping to a
physical address when memory is accessed.
Address mapping (address translation) – Translation of a
virtual address into a physical address by mapping the pages of
virtual memory into main memory, i.e. a virtual address is
mapped to an address used to access memory
Virtual Memory:
Relocation – A technique that maps the virtual addresses
used by a program to different physical addresses before the
addresses are used to access memory.
Virtual Memory:
The virtual address is broken into
•A virtual page number and
•A page offset
The page-fault frequency can be reduced by optimizing page
replacement, since any page can be chosen for replacement
when a page fault occurs.
Clever and flexible replacement schemes reduce the page-fault
rate
Virtual Memory:
Translation of Virtual Page Number to Physical Page Number:
The physical page number constitutes the upper portion of the
physical address and the page offset the lower portion. The
number of bits in the page offset determines the page size
Virtual Memory:
Page Table:
The structure used by the virtual memory system that contains
the virtual-to-physical address translations.
The table, which is stored in memory, is typically indexed by
the virtual page number.
Each entry in the table contains the physical page number for
that virtual page if the page is currently in memory.
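A toy page-table translation, assuming 4 KiB pages and hypothetical table contents:

```python
# Split a virtual address into VPN + offset, then look up the page table.
PAGE_OFFSET_BITS = 12                # 2^12 = 4096-byte pages (assumed)
page_table = {0: 7, 1: 3, 5: 12}     # VPN -> physical page number, if resident

def translate(virtual_address):
    vpn = virtual_address >> PAGE_OFFSET_BITS
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    if vpn not in page_table:
        raise LookupError("page fault")  # page not in main memory
    return (page_table[vpn] << PAGE_OFFSET_BITS) | offset

print(hex(translate(0x5ABC)))  # 0xcabc  (VPN 5 -> PPN 12, offset kept)
```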
Virtual Memory:
Page Table:
Virtual Memory:
Valid/invalid bit – set to v (or 1) if the page is in main memory;
set to i (or 0) if the page is not in main memory
Virtual Memory:
Swap Space:
The space on the disk reserved for the full virtual memory
space of a process.
Reference bit / Use bit:
A field that is set whenever a page is accessed; it is used to
implement LRU or other replacement schemes when choosing
which page to replace
Virtual Memory:
Dirty Bit:
Used to track whether a page has been written since it was
read into memory.
It is set when any word in the page is written.
If the operating system chooses to replace the page, the dirty
bit indicates whether the page needs to be written out before
its location in memory can be given to another page.
Hence, a modified page is often called a dirty page.
Translation-Lookaside Buffer:
Since page tables are stored in main memory, every memory
access effectively takes two accesses: one to the page table for
the translation and another for the data itself.
Instead of accessing the page table every time, a special cache
that keeps track of recently used translations is maintained,
called the Translation-Lookaside Buffer (TLB), or translation
cache.
e.g. a piece of paper to record the location of set of books in
the catalog rather than searching the entire catalog.
Translation-Lookaside Buffer:
Locality of reference:
Locality of reference states that, instead of loading the entire
process into main memory, the OS can load only those pages
that are frequently accessed by the CPU, and along with them,
only the page-table entries corresponding to those pages.
The TLB follows the concept of locality of reference: it contains
entries only for the pages that are frequently accessed by the
CPU.
Translation-Lookaside Buffer:
•A translation lookaside buffer (TLB) caches the virtual
to physical page number translation for recent accesses
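A toy sketch of a TLB sitting in front of the page table (all contents hypothetical):

```python
# TLB as a small cache of recently used VPN -> PPN translations.
tlb = {}
page_table = {0: 7, 1: 3, 5: 12}   # VPN -> physical page number

def lookup(vpn):
    if vpn in tlb:
        return ("TLB hit", tlb[vpn])     # translation found without memory access
    ppn = page_table[vpn]                # slower: extra access to the page table
    tlb[vpn] = ppn                       # cache the translation for next time
    return ("TLB miss", ppn)

print(lookup(5))  # ('TLB miss', 12)
print(lookup(5))  # ('TLB hit', 12)
```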
Translation-Lookaside Buffer:
If the probability of a TLB hit is P (the TLB hit rate), then the
probability of a TLB miss (the TLB miss rate) is (1 − P).
Therefore, the effective access time can be defined as:
EAT = P(t + m) + (1 − P)(t + k·m + m)
where P → TLB hit rate,
t → time taken to access the TLB,
m → time taken to access main memory,
k = 1 if single-level paging has been implemented.
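Plugging assumed timings into the EAT formula:

```python
# EAT = P(t + m) + (1 - P)(t + k*m + m), with all values assumed.
P = 0.95   # TLB hit rate
t = 10     # ns, time to access the TLB
m = 100    # ns, time to access main memory
k = 1      # single-level paging

eat = P * (t + m) + (1 - P) * (t + k * m + m)
print(round(eat, 2))  # 115.0 ns: 0.95*110 + 0.05*210
```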
Interrupts:
While executing a program, if an I/O device is not ready, the
processor often wastes time repeatedly checking whether the
device is ready, and waits if it is not.
To eliminate this continuous checking and waiting, an I/O
device can send a signal to the processor indicating that it is
ready; this signal is called an interrupt.
Instead of waiting for the devices to get ready, the processor
can execute any other operation.
Interrupts:
e.g. consider a task that performs computations and displays
the results once every ten seconds.
The task involves two routines, COMPUTE and DISPLAY.
COMPUTE – performs the computations
DISPLAY – displays the results every ten seconds
A timer counts the seconds. The processor normally executes
the COMPUTE routine; upon receiving an interrupt from the
timer it executes the DISPLAY routine and then resumes
COMPUTE.
Interrupts:
Interrupt Service Routine (ISR):
The routine that is executed immediately in response to an
interrupt request is called interrupt service routine.
Interrupts:
When an interrupt is received while executing instruction i, the
PC value (pointing to instruction i+1) is immediately stored in
temporary storage.
After the interrupt is serviced, program execution resumes
from the PC value saved in temporary storage.
An interrupt sent by an I/O device is acknowledged by the
processor by means of an interrupt-acknowledge signal.
Interrupts:
There are two kinds of interrupt handling: one that saves all
the register contents before transferring control to the
interrupt service routine, and one that does not.
An alternative approach is to provide a duplicate set of
processor registers for use by the interrupt service routine,
which eliminates the need to save and restore register values.
These duplicate registers are called shadow registers.
Interrupts:
The processor has a status register (PS) providing information
about the current state of operation.
Interrupt Enable (IE) of the status register is used for enabling
/ disabling interrupts.
When IE=1, the processor accepts and processes the
interrupt from any I/O devices
When IE=0, the processor will just ignore the interrupt and
continue with normal execution.
Interrupts:
Steps in handling an interrupt request from a device:
Interrupts:
Handling interrupts from multiple devices:
The problems may be
Solution:
When a device raises an interrupt request, it sets a bit called
IRQ to 1
The device found first with IRQ = 1 is serviced first
Polling - all interrupts are serviced by branching to the same
service program. This program then checks each device to see
whether it is the one generating the interrupt. The order of
checking is determined by the priorities that have been set: the
device with the highest priority is checked first, and the
remaining devices are checked in descending order of priority.
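A sketch of priority polling with a hypothetical device list (highest priority first):

```python
# Each device exposes an IRQ bit; poll in descending priority order.
devices = [
    {"name": "disk",     "irq": 0},
    {"name": "keyboard", "irq": 1},
    {"name": "timer",    "irq": 1},
]

def poll(devices):
    # service the first (highest-priority) device whose IRQ bit is set
    for dev in devices:
        if dev["irq"] == 1:
            return dev["name"]
    return None  # no device is requesting an interrupt

print(poll(devices))  # keyboard: highest-priority device with IRQ = 1
```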
An alternative scheme is using vectored interrupt.
Interrupts:
Vectored Interrupts:
Devices that generate interrupts each have an interrupt service
routine to service their interrupts.
Interrupt vector – stores the address of the interrupt service
routine, i.e. the address used to locate and invoke the routine.

More Related Content

Similar to Cache.pptx

cache memory and types of cache memory,
cache memory  and types of cache memory,cache memory  and types of cache memory,
cache memory and types of cache memory,
ashima967262
 

Similar to Cache.pptx (20)

Memory management
Memory managementMemory management
Memory management
 
Computer architecture cache memory
Computer architecture cache memoryComputer architecture cache memory
Computer architecture cache memory
 
computerarchitecturecachememory-170927134432.pdf
computerarchitecturecachememory-170927134432.pdfcomputerarchitecturecachememory-170927134432.pdf
computerarchitecturecachememory-170927134432.pdf
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
cache memory and types of cache memory,
cache memory  and types of cache memory,cache memory  and types of cache memory,
cache memory and types of cache memory,
 
Bab 4
Bab 4Bab 4
Bab 4
 
Lecture2
Lecture2Lecture2
Lecture2
 
IS 139 Lecture 7
IS 139 Lecture 7IS 139 Lecture 7
IS 139 Lecture 7
 
cache memory.ppt
cache memory.pptcache memory.ppt
cache memory.ppt
 
cache memory.ppt
cache memory.pptcache memory.ppt
cache memory.ppt
 
Computer Architecture | Computer Fundamental and Organization
Computer Architecture | Computer Fundamental and OrganizationComputer Architecture | Computer Fundamental and Organization
Computer Architecture | Computer Fundamental and Organization
 
Computer System Architecture Lecture Note 8.1 primary Memory
Computer System Architecture Lecture Note 8.1 primary MemoryComputer System Architecture Lecture Note 8.1 primary Memory
Computer System Architecture Lecture Note 8.1 primary Memory
 
Chapter 8 - Main Memory
Chapter 8 - Main MemoryChapter 8 - Main Memory
Chapter 8 - Main Memory
 
Cache memory
Cache memoryCache memory
Cache memory
 
Cache memory
Cache memoryCache memory
Cache memory
 
Ch8
Ch8Ch8
Ch8
 
Lecture 25
Lecture 25Lecture 25
Lecture 25
 
Unit-4 swapping.pptx
Unit-4 swapping.pptxUnit-4 swapping.pptx
Unit-4 swapping.pptx
 
Memory Organization
Memory OrganizationMemory Organization
Memory Organization
 
Memory Organization.pdf
Memory Organization.pdfMemory Organization.pdf
Memory Organization.pdf
 

Recently uploaded

Introduction to Robotics in Mechanical Engineering.pptx
Introduction to Robotics in Mechanical Engineering.pptxIntroduction to Robotics in Mechanical Engineering.pptx
Introduction to Robotics in Mechanical Engineering.pptx
hublikarsn
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
Epec Engineered Technologies
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ssuser89054b
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
Kamal Acharya
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
mphochane1998
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Kandungan 087776558899
 
Query optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsQuery optimization and processing for advanced database systems
Query optimization and processing for advanced database systems
meharikiros2
 

Recently uploaded (20)

Introduction to Robotics in Mechanical Engineering.pptx
Introduction to Robotics in Mechanical Engineering.pptxIntroduction to Robotics in Mechanical Engineering.pptx
Introduction to Robotics in Mechanical Engineering.pptx
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
Computer Graphics Introduction To Curves
Computer Graphics Introduction To CurvesComputer Graphics Introduction To Curves
Computer Graphics Introduction To Curves
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdf
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
Introduction to Geographic Information Systems
Introduction to Geographic Information SystemsIntroduction to Geographic Information Systems
Introduction to Geographic Information Systems
 
Linux Systems Programming: Inter Process Communication (IPC) using Pipes
Linux Systems Programming: Inter Process Communication (IPC) using PipesLinux Systems Programming: Inter Process Communication (IPC) using Pipes
Linux Systems Programming: Inter Process Communication (IPC) using Pipes
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
8086 Microprocessor Architecture: 16-bit microprocessor
8086 Microprocessor Architecture: 16-bit microprocessor8086 Microprocessor Architecture: 16-bit microprocessor
8086 Microprocessor Architecture: 16-bit microprocessor
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
Query optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsQuery optimization and processing for advanced database systems
Query optimization and processing for advanced database systems
 

Cache.pptx

  • 1. Cache Memory: Cache – A safe place for storing or hiding things like desk for books from library The simplest way to assign a location of main memory data in the cache is to assign the cache location based on the address of the word in memory. This cache structure is called direct mapped, since each memory location is mapped directly to exactly one location in the cache. For example, almost all direct-mapped caches use this mapping to find a block: (Block address) modulo (Number of blocks in the cache)
  • 2. Cache Memory: If the number of entries in the cache is a power of 2, then modulo can be computed simply by using the low-order log2 (cache size in blocks) bits of the address. Thus, an 8-block cache uses the three lowest bits (8=23) of the block address. Similarly 4-block cache uses two bits as block address i.e. 00, 01, 10 and 11
  • 3. Cache Memory: (Block address) modulo (Number of blocks in the cache) For example, the memory addresses between 1ten (00001two) and 29ten (11101two) map to locations 1ten (001two) and 5ten (101two) in a direct-mapped cache of eight words.
  • 4. Cache Memory: Tag - A field in a table used for a memory hierarchy that contains the address information required to identify whether the associated block in the hierarchy corresponds to a requested word.
  • 5. Cache Memory: Valid Bit – A field in the tables of a memory hierarchy that indicates that the associated block in the hierarchy contains valid data. If data present, it is 1 otherwise 0
  • 6. Cache Memory: Accessing a cache: Cache Hit – The requested data is present in the cache Cache Miss – The requested data is not present in the cache and have to access from main memory Initially the cache is empty and any data requested is not available in cache and it is cache miss
  • 7. Cache Memory: A sequence of nine memory references to an empty eight-block cache
  • 9. Cache Memory: The address in the memory is divided into Tag field – compared with tag field of the cache Cache index – used to select the block in cache
  • 10. Cache Memory: 32-bit addresses The cache size is 2n blocks, so n bits are used for the index The block size is 2m words (2m+2 bytes), so m bits are used for the word within the block, and two bits are used for the byte part of the address the size of the tag field is
  • 11. Cache Memory: When the CPU tries to read from memory, the address will be sent to a cache controller. •The lowest k bits of the address will index a block in the cache. •If the block is valid and the tag matches the upper (m - k) bits of the m-bit address, then that data will be sent to the CPU.
  • 13. Cache Memory: Handling Cache Misses: Steps to be taken on an instruction cache miss: 1. Send the original PC value (current PC – 4) to the memory. 2. Instruct main memory to perform a read and wait for the memory to complete its access. 3. Write the cache entry, putting the data from memory in the data portion of the entry, writing the upper bits of the address (from the ALU) into the tag field, and turning the valid bit on. 4. Restart the instruction execution at the first step, which will refetch the instruction, this time finding it in the cache.
  • 14. Cache Memory: Handling Writes: The cache and main memory are said to be inconsistent if both contains different data for a same location e.g. a store instruction may writes new data to cache block which may not be updated in main memory Write-through: A scheme in which writes always update both the cache and the next lower level of the memory hierarchy, ensuring that data is always consistent between the two.
  • 15. Cache Memory: Handling Writes: With write-through scheme, the data when fetched and write into the cache also requires written back into the main memory, which decreases the performance by taking more clock cycles of processor. The solution is Write buffer:
  • 16. Cache Memory: Handling Writes: Write buffer: • A queue that holds data while the data is waiting to be written to memory. • After writing the data into the cache and into the write buffer, the processor can continue execution. • When a write to main memory completes, the entry in the write buffer is freed. • If the write buffer is full when the processor reaches a write, the processor must stall until there is an empty position in the write buffer.
  • 17. Cache Memory: Handling Writes: Write-back: An alternative to a write-through scheme is a scheme called write-back. A scheme that handles writes by updating values only to the block in the cache, then writing the modified block to the lower level of the hierarchy when the block is replaced. Advantage: Performance increases by reducing the writes into main memory every time
  • 18. Cache Memory: Split Cache: A scheme in which a level of a memory hierarchy is composed of two independent caches that operates in parallel with each other, with one handling instructions and one handling data
  • 19. Measuring and Improving Cache Performance: Ways to improve cache performance: Reducing the miss rate by reducing the probability that two different memory blocks will contend for the same cache location. Reduces the miss penalty by adding an additional level to the Hierarchy (multilevel caching)
  • 20. Measuring and Improving Cache Performance: CPU time can be divided into the clock cycles that the CPU spends executing the program the clock cycles that the CPU spends waiting for the memory system. CPU time = (CPU execution clock cycles + Memory-stall clock cycles) * Clock cycle time
  • 21. Measuring and Improving Cache Performance: Memory-stall clock cycles can be defined as the sum of the stall cycles coming from reads plus those coming from writes Memory-stall clock cycles = (Read-stall cycles + Write-stall cycles) The read-stall cycles can be defined in terms of the number of read accesses per program, the miss penalty in clock cycles for a read, and the read miss rate.
  • 22. Measuring and Improving Cache Performance: Write-stall cycles: For a write-through scheme, we have two sources of stalls: write misses, which usually require that we fetch the block before continuing the write and write buffer stalls, which occur when the write buffer is full when a write occurs.
  • 23. Measuring and Improving Cache Performance: In most write-through cache organizations, the read and write miss penalties are the same (the time to fetch the block from memory). If we assume that the write buffer stalls are negligible, we can combine the reads and writes by using a single miss rate and the miss penalty:
  • 24. Schemes for Reducing Cache Misses: The cache misses can be reduced by means of • Direct Mapped Cache • Set Associative Cache • Fully Mapped Cache
  • 25. Direct Mapped Cache Location = Block No. in MM % No. of blocks in CM
  • 27. Set Associative Cache A cache that has a fixed number of locations (at least two) where each block can be placed. A set-associative cache with n locations for a block is called an n-way set-associative cache.
  • 28. Set Associative Cache •2 set associative – each block is repeated 2 times •4 set associative – each block is repeated 4 times •8 set associative – each block is repeated 8 times •E.g. in 2 set associative type, if no. of block in CM is 4 means it contains two 0 block and two 1 block and totally of 4 blocks. Location = Block No. in MM % No. of sets Least Recently used blocks will be replaced if any need
  • 30. Fully Associative Cache •A block can be placed in any free location in the cache memory •The least recently used block is replaced when space is needed
  • 31. Schemes for Reducing Cache Misses: Multilevel Caches: Two miss rates Global miss rate – The fraction of references that miss in all levels of a multilevel cache. Local miss rate – The fraction of references to one level of a cache that miss; used in multilevel hierarchies.
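The difference between the two rates is easiest to see with numbers; the reference and miss counts below are assumed for illustration.

```python
# Assumed figures: of 1000 references, 40 miss in L1; of those 40
# L1 misses that reach L2, 8 also miss in L2.
refs, l1_misses, l2_misses = 1000, 40, 8

local_l2 = l2_misses / l1_misses   # fraction of L2's own accesses that miss
global_miss = l2_misses / refs     # fraction of all references missing every level

print(local_l2, global_miss)  # 0.2 0.008
```

Note the local L2 miss rate (20%) looks alarmingly high, while the global miss rate (0.8%) shows L2 is doing its job: it only ever sees the references L1 already failed on.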
  • 32. Virtual Memory: A technique that uses main memory as a “cache” for secondary storage. Physical address - An address in main memory. Program’s own address space - A separate range of memory locations accessible only to the program Virtual memory implements the translation of a program’s address space to physical addresses.
  • 33. Virtual Memory: Need for Virtual Memory: To allow efficient and safe sharing of memory among multiple programs, such as for the memory needed by multiple virtual machines for cloud computing, and to remove the programming burdens of a small, limited amount of main memory, i.e. when the executing program is larger than main memory. Overlays – a program larger than main memory was divided into pieces, and the mutually exclusive pieces were identified so that they could share the same region of memory
  • 34. Virtual Memory: Protection: A set of mechanisms for ensuring that multiple processes sharing the processor, memory, or I/O devices cannot interfere, intentionally or unintentionally, with one another by reading or writing each other’s data. These mechanisms also isolate the operating system from a user process.
  • 35. Virtual Memory: Page – A block in the virtual memory Page Fault – An event that occurs when an accessed page is not present in main memory, i.e. a miss in virtual memory Virtual Address - An address that corresponds to a location in virtual space and is translated by address mapping to a physical address when memory is accessed. Address mapping or address translation – Translation of a virtual address into a physical address by mapping the pages of virtual memory into main memory, i.e. a virtual address is mapped to an address used to access memory
  • 36. Virtual Memory: Relocation – A technique that maps the virtual addresses used by a program to different physical addresses before the addresses are used to access memory.
  • 37. Virtual Memory: The virtual address is broken into •A virtual page number and •A page offset Page fault frequency can be reduced by optimizing page replacement, since any page can be replaced when a page fault occurs. Clever and flexible replacement schemes reduce the page fault rate
  • 38. Virtual Memory: Translation of Virtual Page Number to Physical Page Number: The physical page number forms the upper portion of the physical address and the page offset the lower portion. The number of bits in the page offset determines the page size
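Splitting a virtual address into page number and offset is a pair of bit operations. A minimal sketch, assuming 4 KiB pages (12 offset bits); the page size is an assumption for the example, not stated in the slides.

```python
PAGE_OFFSET_BITS = 12  # assumed 4 KiB pages; offset width fixes the page size

def split_virtual_address(va):
    vpn = va >> PAGE_OFFSET_BITS                  # upper bits: virtual page no.
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)   # lower bits: page offset
    return vpn, offset

# 0x12345 splits into VPN 0x12 and offset 0x345.
print(split_virtual_address(0x12345))  # (18, 837)
```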
  • 39. Virtual Memory: Page Table: Used by the virtual memory system, containing the virtual to physical address translations in a virtual memory system. The table, which is stored in memory, is typically indexed by the virtual page number Each entry in the table contains the physical page number for that virtual page if the page is currently in memory.
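The page-table lookup described above can be sketched as follows; the table contents and page size are invented for the example, and a missing entry (valid bit = 0) is modelled by raising an exception to stand in for a page fault.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch

# Toy page table indexed by virtual page number; None marks a page
# that is not currently in memory (valid bit = 0).
page_table = {0: 7, 1: 3, 2: None}

def translate(va):
    vpn, offset = divmod(va, PAGE_SIZE)
    ppn = page_table.get(vpn)
    if ppn is None:
        raise LookupError("page fault: page %d not in memory" % vpn)
    # Physical address = physical page number || same page offset.
    return ppn * PAGE_SIZE + offset

print(hex(translate(0x1010)))  # VPN 1 -> PPN 3, so 0x1010 -> 0x3010
```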
  • 41. Virtual Memory: Valid/invalid bit – set as v or 1 if the page is in main memory; set as i or 0 if the page is not in main memory
  • 42. Virtual Memory: Swap Space: The space on the disk reserved for the full virtual memory space of a process. Reference bit / Use bit: A field that is set whenever a page is accessed and that is used to implement LRU or other replacement schemes. Used to replace the pages in virtual memory
  • 43. Virtual Memory: Dirty Bit: Used to track whether a page has been written since it was read into memory. Set when any word in a page is written. If the operating system chooses to replace the page, the dirty bit indicates whether the page needs to be written out before its location in memory can be given to another page. Hence, a modified page is often called a dirty page.
  • 44. Translation-Lookaside Buffer: Since page tables are stored in main memory, every memory access logically takes two accesses: one to obtain the physical address from the page table and another to access the data. Instead of accessing the page table on every reference, a special cache that keeps track of recently used translations is maintained, called the Translation-Lookaside Buffer (TLB), otherwise called a translation cache; e.g. keeping a piece of paper that records the location of a set of books in the catalog rather than searching the entire catalog each time.
  • 45. Translation-Lookaside Buffer: Locality of reference: Instead of loading the entire process into main memory, the OS can load only those pages that are frequently accessed by the CPU, along with only the page table entries corresponding to those pages. The TLB exploits this locality of reference: it contains only the entries for the pages frequently accessed by the CPU.
  • 46. Translation-Lookaside Buffer: •A translation lookaside buffer (TLB) caches the virtual to physical page number translation for recent accesses
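The TLB behaviour above can be sketched as a tiny fully associative translation cache with LRU replacement; the class, capacity, and page-table contents are invented for the example.

```python
from collections import OrderedDict

class TLB:
    """Tiny fully associative TLB caching VPN -> PPN translations (sketch)."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()   # insertion order tracks recency

    def lookup(self, vpn, page_table):
        if vpn in self.entries:              # TLB hit: page table not touched
            self.entries.move_to_end(vpn)
            return self.entries[vpn], "hit"
        ppn = page_table[vpn]                # TLB miss: walk the page table
        if len(self.entries) == self.capacity:
            self.entries.popitem(last=False) # evict least recently used entry
        self.entries[vpn] = ppn
        return ppn, "miss"

tlb = TLB(capacity=2)
pt = {0: 5, 1: 9}
print(tlb.lookup(0, pt))  # (5, 'miss') - first access walks the page table
print(tlb.lookup(0, pt))  # (5, 'hit')  - repeat access is served by the TLB
```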
  • 48. Translation-Lookaside Buffer: If the probability of a TLB hit is P (the TLB hit rate), then the probability of a TLB miss (the TLB miss rate) is (1 - P). Therefore, the effective access time can be defined as EAT = P (t + m) + (1 - P) (t + k.m + m) where P → TLB hit rate, t → time taken to access the TLB, m → time taken to access main memory, and k = 1 if single-level paging has been implemented.
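Plugging assumed numbers into the EAT formula makes the cost of TLB misses concrete; the timings below are illustrative, not from the slides.

```python
def effective_access_time(p, t, m, k=1):
    # EAT = P(t + m) + (1 - P)(t + k*m + m)
    # hit path: TLB + memory; miss path: TLB + k page-table levels + memory
    return p * (t + m) + (1 - p) * (t + k * m + m)

# 90% TLB hit rate, 10 ns TLB, 100 ns memory, single-level paging:
# 0.9*110 + 0.1*210 = 120 ns effective access time.
print(effective_access_time(0.9, 10, 100))  # 120.0
```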
  • 49. Interrupts: While executing operations, if an I/O device is not ready, the processor often wastes time checking whether the device is ready and waiting if it is not. To eliminate this continuous checking and waiting, an I/O device can send a signal to the processor indicating that it is ready, which is called an interrupt. Instead of waiting for the device to become ready, the processor can execute other operations.
  • 50. Interrupts: e.g. consider a task of computing calculations and displaying the results once every ten seconds. The task involves two routines, COMPUTE and DISPLAY. COMPUTE – computes the calculations DISPLAY – displays the results every ten seconds. A timer counts the seconds. The processor normally executes the COMPUTE routine; upon receiving an interrupt from the timer it executes the DISPLAY routine and then resumes COMPUTE.
  • 51. Interrupts: Interrupt Service Routine (ISR): The routine that is executed immediately in response to an interrupt request is called interrupt service routine.
  • 52. Interrupts: On receiving the interrupt, here at instruction i, the PC value (i+1) is immediately stored in temporary storage. After servicing the interrupt, program execution resumes from the PC value saved in the temporary storage. An interrupt sent by any I/O device to the processor is acknowledged by the processor by means of an Interrupt Acknowledge signal.
  • 53. Interrupts: There may be two kinds of interrupts: one that saves all the register contents before transferring to the interrupt routine, and another that does not save the contents. One approach is to provide a duplicate set of processor registers used by the interrupt service routine, which eliminates the need to save and restore register values. These duplicate registers are called shadow registers.
  • 54. Interrupts: The processor has a status register (PS) providing information about the current state of operation. The Interrupt Enable (IE) bit of the status register is used for enabling/disabling interrupts. When IE=1, the processor accepts and processes interrupts from any I/O device. When IE=0, the processor just ignores interrupts and continues with normal execution.
  • 55. Interrupts: Steps in handling an interrupt request from a device:
  • 56. Interrupts: Handling interrupts from multiple devices: The problems may be
  • 57. Handling interrupts from multiple devices: Solution: When a device raises an interrupt request, it sets a bit called IRQ to 1. The first device found with its bit set to 1 is serviced first. Polling - all interrupts are serviced by branching to the same service program. This program then checks each device to see if it is the one generating the interrupt. The order of checking is determined by the priority that has been set: the device having the highest priority is checked first, and the remaining devices are checked in descending order of priority. An alternative scheme is to use vectored interrupts.
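The polling scheme above amounts to scanning IRQ bits in priority order; a minimal sketch with invented device names:

```python
def poll(irq_bits, priority_order):
    """Return the highest-priority device with its IRQ bit set, else None."""
    for device in priority_order:      # highest priority checked first
        if irq_bits.get(device):       # IRQ bit was set to 1 by the device
            return device
    return None                        # no pending interrupt

# Both disk and keyboard are pending; disk outranks keyboard, so it wins.
pending = {"timer": 0, "disk": 1, "keyboard": 1}
print(poll(pending, ["timer", "disk", "keyboard"]))  # disk
```

The drawback this illustrates is that every poll walks the device list even when only the lowest-priority device is pending, which is what motivates the vectored-interrupt alternative.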
  • 58. Interrupts: Vectored Interrupts: Each interrupt-generating device has its own interrupt service routine to service the interrupt. Interrupt vector – stores the starting address of the device's interrupt service routine.