Presented by:
Trupti Diwan
Shweta Ghate
Sapana Vasave
 Basics of memory
 Memory Technology
 Memory optimization
 Main memory is RAM (the terms memory, buffer,
and cache all refer to RAM), which is roughly
11,000 times faster than secondary memory
(hard disk) for random access.
 Main memory is as vital to a computer system as
the processor chip.
Fast systems have both a fast processor and a
large memory.
 Here is a list of some characteristics of
computer memory:
 Closely connected to the processor.
 Holds programs and data that the processor is
actively working with.
 Used for long-term storage.
 The processor interacts with it millions of times
per second.
 Its contents are easily changed.
 Usually its contents are organized into files.
 Main memory is the short-term memory of a
computer: it retains data only for the period
that a program is running.
 Memory is used for the purpose of running
programs.
 Main memory satisfies the demands of caches
and serves as the I/O interface.
 Performance measures of main memory
emphasize both latency and bandwidth
(Memory bandwidth is the number of bytes
read or written per unit time)
 Memory latency is the time delay required to
obtain a specific item of data
 Memory Bandwidth is the rate at which data can
be accessed (e.g. bits per second)
 Bandwidth is normally quoted per cycle time
(i.e., words delivered per memory cycle).
 This rate can be improved by concurrent access.
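As a rough sketch (all numbers hypothetical), bandwidth follows from word size and cycle time, and concurrent access to multiple banks raises the peak rate:

```python
# Hypothetical numbers: a memory bank with 8-byte words and a 20 ns cycle time.
word_bytes = 8
cycle_time_ns = 20

# Single-bank bandwidth: one word per cycle time.
single_bank_bw = word_bytes / (cycle_time_ns * 1e-9)  # bytes/second

# With 4 banks accessed concurrently (interleaving), peak bandwidth scales up.
banks = 4
interleaved_bw = single_bank_bw * banks

print(f"single bank      : {single_bank_bw / 1e6:.0f} MB/s")   # 400 MB/s
print(f"4-way interleaved: {interleaved_bw / 1e6:.0f} MB/s")   # 1600 MB/s
```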
 Main memory latency affects the cache miss
penalty,
 the primary concern of the cache.
 Main memory bandwidth is
 the primary concern of I/O and multiprocessors.
 Although caches benefit from low-latency
memory, it is generally easier to
improve memory bandwidth with new
organizations than it is to reduce latency.
 Cache designers therefore increase block size to
take advantage of the high memory bandwidth.
 Memory latency is traditionally quoted using
two measures:
 Access time
 Access time is the time between when a read is
requested and when the desired word arrives.
 Cycle time
 Cycle time is the minimum time between requests to
memory.
 Cycle time is greater than access time
because the memory needs the address lines
to be stable between accesses.
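A toy calculation (with hypothetical timing values) makes the distinction concrete: the first word arrives after the access time, but back-to-back requests are spaced by the cycle time, so sustained throughput is governed by the cycle time:

```python
# Hypothetical DRAM timing: access time 60 ns, cycle time 100 ns.
access_time_ns = 60
cycle_time_ns = 100  # cycle time > access time: address lines must stabilize

# Latency of one read is the access time...
first_word_ns = access_time_ns
# ...but requests can only be issued once per cycle time,
# so the sustained request rate is 1 / cycle time.
reads_per_us = 1000 / cycle_time_ns

print(first_word_ns, reads_per_us)  # 60 10.0
```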
Memory Hierarchy of a Modern Computer System
• By taking advantage of the principle of locality:
– Present the user with as much memory as is available in the
cheapest technology.
– Provide access at the speed offered by the fastest technology.
Fig: Memory hierarchy levels, from the processor (control, datapath, registers)
through the on-chip cache and second-level cache (SRAM), main memory (DRAM),
secondary storage (disk), and tertiary storage (tape). Speed ranges from
~1 ns at the registers to 10s of ms at disk and 10s of seconds at tape,
while capacity grows from 100s of bytes at the registers to terabytes at tape.
 Static Random Access Memory (SRAM)
- used for caches.
 Dynamic Random Access Memory (DRAM)
- used for main memory.
 ‘S’ stands for static.
 No need to refresh, so access time is close to
cycle time.
 Uses six transistors per bit.
 Bits stored as on/off switches.
 No charge to leak.
 More complex construction.
 Larger per bit.
 More expensive.
 Faster.
 Transistor arrangement gives a stable logic state.
 State 1
 C1 high, C2 low
 T1, T4 off; T2, T3 on
 State 0
 C2 high, C1 low
 T2, T3 off; T1, T4 on
 Address line transistors T5 and T6 act as switches.
 Write – apply the value to line B and its complement to line B̄.
 Read – the value is on line B.
 Bits stored as charge in capacitors.
 Charges leak.
 Needs refreshing even when powered.
 Simpler construction.
 Smaller per bit.
 Less expensive.
 Needs refresh circuits.
 Slower.
 Cycle time is longer than the access time.
 Address line active when bit read or written.
 Transistor switch closed (current flows).
 Write
 Voltage applied to bit line.
 High for 1, low for 0.
 Then signal address line.
 Transfers charge to capacitor.
 Read
 Address line selected.
 Transistor turns on.
 Charge from capacitor fed via bit line to sense amplifier.
 Compares with reference value to determine 0 or 1.
 Capacitor charge must be restored.
 Addresses divided into 2 halves (Memory as a 2D matrix):
 RAS or Row Access Strobe
 CAS or Column Access Strobe
Fig : Internal Organization of a DRAM
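The split can be sketched as follows; the row/column widths here are hypothetical, and real DRAMs multiplex the two halves over the same pins (row latched on RAS, then column on CAS):

```python
# Hypothetical 16-bit DRAM address split into an 8-bit row and an 8-bit column.
# The two halves share the same address pins: the row half is latched on RAS,
# then the column half on CAS.
ROW_BITS = 8
COL_BITS = 8

def split_address(addr: int) -> tuple[int, int]:
    """Return (row, column) as the DRAM would latch them."""
    col = addr & ((1 << COL_BITS) - 1)                 # low half, latched on CAS
    row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)   # high half, latched on RAS
    return row, col

print(split_address(0xABCD))  # (171, 205), i.e. row 0xAB, column 0xCD
```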
 DRAM originally had an asynchronous interface to
the memory controller, so every transfer involved
overhead to synchronize with the controller.
 Adding a clock signal to the DRAM interface
reduces this overhead; the optimization is
called
synchronous DRAM, i.e. SDRAM.
 Double data rate (DDR) was a later development
of SDRAM, used in PC memory beginning in 2000.
 DDR SDRAM internally performs double-width
accesses at the clock rate, and uses a double-data-
rate interface to transfer one half on each clock
edge.
 Further versions of DDR (2 data transfers/cycle):
-DDR2 (4 data transfers/cycle)
-DDR3 (8 data transfers/cycle)
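Taking the slide's transfers-per-cycle figures at face value, peak transfer rates can be sketched for a hypothetical 200 MHz bus with an 8-byte interface:

```python
# Peak transfer rate sketch (hypothetical 200 MHz memory bus, 8-byte interface).
bus_mhz = 200
bytes_per_transfer = 8

transfers_per_cycle = {"DDR": 2, "DDR2": 4, "DDR3": 8}  # per the slide

for gen, n in transfers_per_cycle.items():
    peak_mb_s = bus_mhz * 1e6 * n * bytes_per_transfer / 1e6
    print(f"{gen}: {peak_mb_s:.0f} MB/s")  # 3200, 6400, 12800 MB/s
```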
 Review: 6 Basic Cache Optimizations
 • Reducing hit time
 1. Address translation during cache indexing
 • Reducing miss penalty
 2. Multilevel caches
 3. Giving priority to read misses over write
misses
 • Reducing miss rate
 4. Larger block size (compulsory misses)
 5. Larger cache size (capacity misses)
 6. Higher associativity (conflict misses)
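All of these optimizations target the average memory access time, AMAT = hit time + miss rate × miss penalty. A small sketch with hypothetical numbers shows how reducing the miss rate or the miss penalty plays out:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: the quantity these optimizations target."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical baseline: 1-cycle hit, 5% miss rate, 100-cycle miss penalty.
base = amat(1, 0.05, 100)             # 6.0 cycles

# Halving the miss rate (e.g. a larger cache) vs halving the miss penalty
# (e.g. a second-level cache) gives the same improvement here:
better_rate    = amat(1, 0.025, 100)  # 3.5 cycles
better_penalty = amat(1, 0.05, 50)    # 3.5 cycles

print(base, better_rate, better_penalty)  # 6.0 3.5 3.5
```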
 Reducing hit time
 1. Small and simple caches
 2. Way prediction
 3. Trace caches
 • Increasing cache bandwidth
 4. Pipelined caches
 5. Multibanked caches
 6. Nonblocking caches
 • Reducing miss penalty
 7. Critical word first
 8. Merging write buffers
 • Reducing miss rate
 9. Compiler optimizations
 • Reducing miss penalty or miss rate via parallelism
 10. Hardware prefetching
 11. Compiler prefetching
 Advanced cache optimizations fall into the following categories:
 Reducing the hit time: small and simple caches, way prediction, and trace
caches
 Increasing cache bandwidth: pipelined caches, multibanked caches, and
nonblocking caches
 Reducing the miss penalty: critical word first and merging write buffers
 Reducing the miss rate: compiler optimizations
 Reducing the miss penalty or miss rate via parallelism: hardware prefetching
and compiler prefetching
 We will conclude with a summary of the implementation complexity and the
performance impact of these optimizations.
 First Optimization: Small and Simple Caches to Reduce
Hit Time
 A time-consuming portion of a cache hit is using the
index portion of the address to read the tag memory
and then compare it to the address. Smaller hardware
can be faster, so a small cache can help the hit time.
It is also critical to keep an L2 cache small enough
to fit on the same chip as the processor, to avoid
the time penalty of going off chip.
 The second suggestion is to keep the cache simple, such
as using direct mapping. One benefit of direct-mapped
caches is that the designer can overlap the tag check
with the transmission of the data. This effectively
reduces hit time.
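A minimal sketch of a direct-mapped lookup (hypothetical geometry) shows why the overlap works: the index alone selects the single candidate frame, so the data read need not wait for the tag comparison:

```python
# Direct-mapped cache lookup sketch (hypothetical geometry: 64-byte blocks,
# 256 sets). The index selects exactly one frame, so the data can be read out
# while the tag comparison proceeds in parallel.
BLOCK_BITS = 6   # 64-byte blocks
INDEX_BITS = 8   # 256 sets

tags = [None] * (1 << INDEX_BITS)   # tag store
data = [None] * (1 << INDEX_BITS)   # data store (one block per set)

def lookup(addr: int):
    index = (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)
    tag   = addr >> (BLOCK_BITS + INDEX_BITS)
    block = data[index]          # data read starts immediately...
    hit   = tags[index] == tag   # ...overlapped with the tag check
    return (block, True) if hit else (None, False)

# Fill one frame and probe it.
addr = 0x12345
idx  = (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)
tags[idx] = addr >> (BLOCK_BITS + INDEX_BITS)
data[idx] = b"block"
print(lookup(addr))  # (b'block', True)
```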
 Excerpt from the summary (technique: impact; complexity; comment):
 Compiler techniques to reduce cache misses: miss rate +;
complexity 0; software is a challenge, but some computers
have a compiler option.
 Hardware prefetching of instructions and data: miss penalty +,
miss rate +; complexity 2 (instr.), 3 (data); many prefetch
instructions; Opteron and Pentium 4 prefetch data.
 Compiler-controlled prefetching: miss penalty +, miss rate +;
complexity 3; needs nonblocking cache; possible instruction
overhead; in many CPUs.
 Figure 5.11: Summary of 11 advanced cache
optimizations showing impact on cache performance
and complexity.
 Although generally a technique helps only one factor,
prefetching can reduce misses if done sufficiently early;
if not, it can reduce the miss penalty. + means that the
technique improves the factor, – means it hurts that
factor, and blank means it has no impact. The complexity
measure is subjective, with 0 being the easiest and 3
being a challenge.
 The techniques to improve hit time, bandwidth, miss
penalty, and miss rate generally affect the other
components of the average memory access time equation
as well as the complexity of the memory hierarchy.
 Generally, no technique helps more than one category.
 Computer Architecture - A Quantitative
Approach
Unit I Memory technology and optimization