3. CACHE MEMORY
• A cache is a hardware or software component that stores data so that future requests for that data can be served faster.
• Cache memory is used to achieve higher CPU performance by allowing the CPU to access data at a faster speed.
4. TYPES OF CACHE
• The L1, L2, and L3 caches are built to decrease the time the processor takes to access data. This time is called latency.
• L1 cache (2 KB–64 KB): the smallest in size; instructions are searched here first.
• L2 cache (256 KB–512 KB): comparatively larger; instructions not found in L1 are searched here.
• L3 cache (1 MB–8 MB): the largest of all; instructions not found in L2 are searched here.
5. CACHE SIMULATOR
This project simulates a write-through or write-back direct-mapped cache in C.
For a memory access pattern provided as a file, it computes the number of cache hits and misses as well as the number of main-memory reads and writes.
6. CACHE MAPPING
Cache mapping is divided into three categories:
• Direct Mapped
• Associative
• Set Associative
7. DIRECT MAPPED
The direct mapping technique is simple and inexpensive to implement.
When the CPU wants to access data from memory, it places an address on the address bus.
The index field of the CPU address is used to select a line in the cache.
The tag field of the CPU address is compared with the tag stored in the word read from the cache.
If the tag bits of the CPU address match the tag bits in the cache, there is a hit and the required data word is read from the cache.
9. ASSOCIATIVE CACHE
• This mapping scheme attempts to improve cache utilization.
• In associative cache mapping, data from any location in RAM can be stored in any location in the cache.
• When the processor requests an address, all tag fields in the cache are checked to determine whether the data is already in the cache.
• Each tag line requires circuitry to compare the desired address with the tag field.
• All tag fields are checked in parallel.
10. SET ASSOCIATIVE MAPPING
• Set associative mapping is a mixture of direct and associative mapping.
• The cache lines are grouped into sets.
• The number of lines in a set can vary from 2 to 16.
• A portion of the address is used to specify which set will hold an address.
• The data can be stored in any of the lines in the set.
11. CACHE HIT/MISS
Cache Hit
• A cache hit is a state in which data
requested for processing by a
component or application is found in
the cache memory.
• It is a faster means of delivering data
to the processor, as the cache already
contains the requested data.
Cache Miss
• A cache miss is a state in which the data
requested for processing by a
component or application is not found in
the cache memory.
• It causes execution delays by requiring
the program or application to fetch the
data from other cache levels or the main
memory.
12. WRITE POLICY
• The cache's write policy determines how it handles writes to memory locations
that are currently being held in cache.
• The two popular cache write policies (schemes) are:
1. Write-Through
2. Write-Back
13. WRITE-BACK CACHE
• When the system writes to a memory location that is currently held in cache,
it only writes the new information to the appropriate cache line. When the
cache line is eventually needed for some other memory address, the changed
data is "written back" to system memory.
• This type of cache provides better performance than a write-through cache,
because it saves on (time-consuming) write cycles to memory.
14. WRITE-THROUGH CACHE
• When the system writes to a memory location that is currently held in
cache, it writes the new information both to the appropriate cache line
and the memory location itself at the same time.
• This type of caching provides worse performance than write-back, but it is
simpler to implement and has the advantage of internal consistency,
because the cache is never out of sync with memory the way it is with
a write-back cache.
15. REPLACEMENT ALGORITHM
• Replacement algorithms are used when there is no available space in the cache in which to place new data.
• In this cache simulator, we have used FIFO and LRU Replacement Algorithms.
• Factors to consider
1. Read/write balance. (What percentage of accesses are reads vs. writes?)
2. Amount of cache.
3. Type of media behind the cache. (Are they slow SATA drives or fast SSDs?)
4. Hits vs. misses. (How often are things rewritten or reread?)
5. Average access size. (This goes into choosing the page size.)
6. How expensive reads and writes are.
16. FIRST IN FIRST OUT
• The FIFO algorithm selects for replacement
the item that has been in the cache for
the longest time.
• Using this algorithm, the cache behaves in
the same way as a FIFO queue: it evicts the
block that entered first, without regard to
how often or how many times it was
accessed before.
17. LEAST RECENTLY USED
• The LRU caching scheme removes the
least recently used item when the
cache is full and a new item is referenced
that is not in the cache.
• This algorithm requires keeping track of
what was used when, which is expensive if
one wants to make sure the algorithm
always discards the least recently used
item.
18. IMPLEMENTATION IN SIMULATOR
• FIFO using a queue: a FIFO cache uses queuing logic for its backing store,
expunging the elements at the front of the queue when a predetermined threshold is
exceeded. To implement the queue with an array, we keep track of two indices, front and
rear: we enqueue an item at the rear and dequeue an item from the front.
• LRU using a doubly linked list: to implement LRU, keep the doubly linked list in each
set sorted by the order in which the cache lines were referenced. This can be done by
removing a cache line from the list each time it is referenced and inserting it
back at the head. The cache line at the tail of the list is then always the one to evict.
19. SIMULATOR OUTPUT
• Total Hit Rate: the percentage of memory accesses (i.e., lines in the trace file)
that were hits. This number should be truncated to 4 decimal places of
precision.
• Total Run Time: The total run time of the program, assuming that the last
memory access was the last instruction of the program. The unit for this
should be cycles.
• Average Memory Access Latency: the average number of cycles needed to
complete a memory access, reported to 4 decimal places of precision.