Main memory is RAM. (The terms memory, buffer, and cache all refer to RAM.) For random access, RAM is roughly 11,000 times faster than secondary memory (a hard disk).
Main memory is as vital as the processor chip to a computer system. Fast systems have both a fast processor and a large, fast memory. Here is a list of some characteristics of main memory:
• Closely connected to the processor.
• Holds programs and data that the processor is actively working with.
• The processor interacts with it millions of times per second.
• Its contents are easily changed.
By contrast, secondary memory is used for long-term storage, and its contents are usually organized into files.
Main memory is the short-term memory of a computer. It retains data only for the period that a program is running, and that's it. Memory is used for the purpose of running programs.
Main memory satisfies the demands of caches
and serves as the I/O interface.
Performance measures of main memory emphasize both latency and bandwidth.
• Memory latency is the time delay required to obtain a specific item of data.
• Memory bandwidth is the rate at which data can be read or written per unit time (e.g., bytes per second).
• For a single memory module, peak bandwidth is normally about one access per cycle time.
• This rate can be improved by concurrent access (e.g., interleaved banks).
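As a rough sketch of the concurrent-access point above: splitting memory into independent banks lets accesses overlap, multiplying peak bandwidth. The 8-byte word and 60 ns cycle time below are illustrative assumptions, not figures from the text.

```python
# Sketch: how concurrent (interleaved) access raises effective bandwidth.
# Word size and cycle time are assumed, illustrative numbers.

def effective_bandwidth(bytes_per_access, cycle_time_ns, num_banks):
    """Peak bytes/ns when num_banks banks can service accesses concurrently."""
    single_bank = bytes_per_access / cycle_time_ns  # one bank, fully serialized
    return single_bank * num_banks                  # banks overlap their cycle times

one_bank = effective_bandwidth(8, 60, 1)   # 8-byte words, 60 ns cycle time
four_bank = effective_bandwidth(8, 60, 4)  # same memory split into 4 banks
print(one_bank, four_bank)                 # four banks quadruple peak bandwidth
```

The latency of any single access is unchanged; only the rate of independent accesses improves, which is exactly why banking helps bandwidth rather than latency.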
Main memory latency affects the cache miss penalty, the primary concern of the cache.
Main memory bandwidth is the primary concern of I/O and multiprocessors.
Although caches benefit from low-latency memory, it is generally easier to improve memory bandwidth with new organizations than it is to reduce latency, so cache designers increase block size to take advantage of the high memory bandwidth.
Memory latency is traditionally quoted using two measures:
• Access time is the time between when a read is requested and when the desired word arrives.
• Cycle time is the minimum time between requests to memory.
Cycle time is greater than access time because the memory needs the address lines to be stable between accesses.
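A small sketch of how the two measures combine when fetching a multi-word block, which also shows why larger blocks exploit bandwidth. All timing numbers are illustrative assumptions.

```python
# Sketch: access time vs. cycle time when fetching a block of words.
# Timings are assumed, illustrative values (cycle time > access time).

ACCESS_TIME_NS = 50   # delay until the first word arrives
CYCLE_TIME_NS = 70    # minimum time between successive requests
WORD_BYTES = 8

def block_fetch_time(block_bytes):
    """Time to fetch a block: the first word costs the access time;
    each further word costs one memory cycle."""
    words = block_bytes // WORD_BYTES
    return ACCESS_TIME_NS + (words - 1) * CYCLE_TIME_NS

# Doubling the block size less than doubles the fetch time, so the
# cost per byte falls -- the bandwidth advantage of larger blocks.
print(block_fetch_time(8))    # one word: 50 ns
print(block_fetch_time(64))   # eight words: 50 + 7*70 = 540 ns
```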
Memory Hierarchy of a Modern Computer System
• By taking advantage of the principle of locality:
– Present the user with as much memory as is available in the cheapest technology.
– Provide access at the speed offered by the fastest technology.
(Fig: memory hierarchy — speed (ns) grows from 10s to 100s and beyond moving down the hierarchy, while size (bytes) grows from Ks and Ms up to Gs.)
Static Random Access Memory (SRAM)
- used for cache.
Dynamic Random Access Memory (DRAM)
- used for main memory.
‘S’ stands for static.
No need to refresh, so access time is close to cycle time.
Uses 6 transistors per bit.
Bits stored as on/off switches
No charges to leak
More complex construction
Larger per bit
Transistor arrangement gives stable logic state
State 1: C1 high, C2 low — T1, T4 off; T2, T3 on.
State 0: C2 high, C1 low — T2, T3 off; T1, T4 on.
Address-line transistors T5 and T6 act as switches.
Write – apply the value to bit line B and its complement to the other bit line.
Read – the value is read on line B.
Bits stored as charge in capacitors
Need refreshing even when powered
Smaller per bit
Need refresh circuits
Cycle time is longer than the access time
The address line is active when a bit is read or written; the transistor switch closes so current flows.
Write: apply a voltage to the bit line (high for 1, low for 0), then signal the address line, which transfers the charge to the capacitor.
Read: the address line is selected and the transistor turns on; the charge from the capacitor is fed via the bit line to a sense amplifier, which compares it with a reference value to determine 0 or 1. The read drains the cell, so the capacitor charge must be restored.
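The destructive read described above can be captured in a toy model: sensing drains the capacitor, so the controller must write the sensed value back. This is a pedagogical sketch, not a circuit-accurate model; the class name and threshold are made up for illustration.

```python
# Toy model of a DRAM cell: the read is destructive (charge is drained
# into the sense amplifier), so the sensed value must be restored.

class DramCell:
    def __init__(self):
        self.charge = 0.0                 # capacitor charge, 1.0 = full

    def write(self, bit):
        self.charge = 1.0 if bit else 0.0

    def read(self, reference=0.5):
        sensed = 1 if self.charge > reference else 0  # sense amp compares
        self.charge = 0.0                             # reading drains the cell...
        self.write(sensed)                            # ...so restore the value
        return sensed

cell = DramCell()
cell.write(1)
print(cell.read(), cell.read())  # value survives repeated reads only
                                 # because each read rewrites it
```

A real cell also leaks charge over time, which is why the periodic refresh circuits mentioned above are needed even when no reads occur.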
Addresses are divided into 2 halves (memory as a 2D matrix):
RAS or Row Access Strobe
CAS or Column Access Strobe
Fig : Internal Organization of a DRAM
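The row/column split can be sketched as simple bit slicing: the high address bits select the row (sent with RAS) and the low bits select the column (sent with CAS). The 12/10-bit geometry below is an illustrative assumption.

```python
# Sketch: splitting an address into row and column for a DRAM
# organized as a 2D matrix. Bit widths are assumed, illustrative sizes.

ROW_BITS = 12   # 4096 rows    (row address, latched on RAS)
COL_BITS = 10   # 1024 columns (column address, latched on CAS)

def split_address(addr):
    col = addr & ((1 << COL_BITS) - 1)  # low bits select the column
    row = addr >> COL_BITS              # high bits select the row
    return row, col

row, col = split_address(0x3ABCD)
print(row, col)
```

Sending the two halves over the same pins in sequence is what lets a large DRAM get by with half the address pins.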
DRAM originally had an asynchronous interface to the memory controller, so every transfer involved overhead to synchronize with the controller. Adding a clock signal to the DRAM interface reduces this overhead; DRAMs with this optimization are called synchronous DRAMs (SDRAMs).
Double Data Rate (DDR) was a later development of SDRAM, used in PC memory beginning in 2000. DDR SDRAM internally performs double-width accesses at the clock rate, and uses a double-data-rate interface to transfer one half on each clock edge.
Further versions of DDR (2 data transfers/cycle):
- DDR2 (4 data transfers/cycle)
- DDR3 (8 data transfers/cycle)
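The transfers-per-cycle factors above translate directly into peak transfer rates. The sketch below assumes an 8-byte (64-bit) bus and a 200 MHz base clock purely for illustration.

```python
# Sketch: peak transfer rate of an SDRAM interface. The transfers-per-cycle
# factors mirror the list above (SDR=1, DDR=2, DDR2=4, DDR3=8); the 200 MHz
# clock and 8-byte bus width are assumed, illustrative values.

def peak_mb_per_s(bus_clock_mhz, transfers_per_cycle, bus_bytes=8):
    return bus_clock_mhz * transfers_per_cycle * bus_bytes

print(peak_mb_per_s(200, 1))  # SDR:  200 MHz x 1 x 8 B = 1600 MB/s
print(peak_mb_per_s(200, 2))  # DDR:  3200 MB/s
print(peak_mb_per_s(200, 8))  # DDR3-style prefetch: 12800 MB/s
```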
Review: 6 Basic Cache Optimizations
• Reducing hit time
1. Avoiding address translation during cache indexing
• Reducing Miss Penalty
2. Multilevel Caches
3. Giving priority to read misses over writes
• Reducing Miss Rate
4. Larger Block size (Compulsory misses)
5. Larger Cache size (Capacity misses)
6. Higher Associativity (Conflict misses)
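The three optimization targets above are tied together by the standard average memory access time relation, AMAT = hit time + miss rate × miss penalty. The numbers in this sketch are illustrative assumptions.

```python
# Sketch: average memory access time (AMAT) ties the three optimization
# targets together. All numbers are assumed, illustrative values.

def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    return hit_time_ns + miss_rate * miss_penalty_ns

base = amat(1.0, 0.05, 100)          # 1 ns hit, 5% miss rate, 100 ns penalty
bigger_cache = amat(1.2, 0.03, 100)  # larger cache: slower hit, fewer misses
print(base, bigger_cache)            # the trade-off: hit time vs. miss rate
```

This is why the optimizations are grouped by which term they attack: a larger cache raises hit time slightly but can still win overall by cutting the miss-rate term.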
11 Advanced Cache Optimizations
• Reducing hit time
1. Small and simple caches
2. Way prediction
3. Trace caches
• Increasing cache bandwidth
4. Pipelined caches
5. Multibanked caches
6. Nonblocking caches
• Reducing Miss Penalty
7. Critical word first
8. Merging write buffers
• Reducing Miss Rate
9. Compiler optimizations
• Reducing miss penalty or miss rate via parallelism
10. Hardware prefetching
11. Compiler prefetching
The advanced cache optimizations fall into the following categories:
• Reducing the hit time: small and simple caches, way prediction, and trace caches.
• Increasing cache bandwidth: pipelined caches, multibanked caches, and nonblocking caches.
• Reducing the miss penalty: critical word first and merging write buffers.
• Reducing the miss rate: compiler optimizations.
• Reducing the miss penalty or miss rate via parallelism: hardware prefetching and compiler prefetching.
We will conclude with a summary of the implementation complexity and the performance benefits of these techniques.
First Optimization: Small and Simple Caches to Reduce Hit Time
A time-consuming portion of a cache hit is using the index portion of the address to read the tag memory and then compare it to the address. Smaller hardware can be faster, so a small cache can help the hit time. It is also critical to keep an L2 cache small enough to fit on the same chip as the processor, to avoid the time penalty of going off chip.
The second suggestion is to keep the cache simple, such as using direct mapping. One benefit of direct-mapped caches is that the designer can overlap the tag check with the transmission of the data. This effectively reduces hit time.
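The index/tag split described above can be sketched as bit slicing. The geometry below (64-byte blocks, 256 sets) is an illustrative assumption.

```python
# Sketch: how a direct-mapped cache splits an address into tag, index,
# and block offset. The geometry is assumed: 64-byte blocks, 256 sets.

BLOCK_BITS = 6   # 64-byte blocks
INDEX_BITS = 8   # 256 sets -> a 16 KB direct-mapped cache

def decompose(addr):
    offset = addr & ((1 << BLOCK_BITS) - 1)                 # byte within block
    index = (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)  # selects the set
    tag = addr >> (BLOCK_BITS + INDEX_BITS)                 # compared on a hit
    return tag, index, offset

# With exactly one block per set, the data can be read out while the tag
# is still being compared -- the overlap the text describes.
tag, index, offset = decompose(0x12345)
print(tag, index, offset)
```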
Technique | Miss penalty | Miss rate | HW complexity | Comment
Compiler techniques to reduce cache misses | | + | 0 | Software is a challenge; some computers have compiler option
Hardware prefetching of instructions and data | + | + | 2 instr., 3 data | Many prefetch instructions; Opteron and Pentium 4 prefetch data
Compiler-controlled prefetching | + | + | 3 | Needs nonblocking cache; possible instruction overhead; in many CPUs
Figure 5.11: Summary of 11 advanced cache optimizations showing impact on cache performance and complexity. Generally, a technique helps only one factor; prefetching can reduce misses if done sufficiently early, and if not, it can reduce miss penalty. + means the technique improves the factor, – means it hurts that factor, and blank means it has no impact. The complexity measure is subjective, with 0 being the easiest and 3 being a challenge. The techniques to improve hit time, bandwidth, miss penalty, and miss rate generally affect the other components of the average memory access time equation as well as the complexity of the memory hierarchy.
Reference: Hennessy & Patterson, Computer Architecture: A Quantitative Approach.