3. Computer Memory
Computer memory is the storage space in a
computer where the data to be processed and the
instructions required for processing are
stored.
The memory is divided into a large number of
small parts called cells.
Each location or cell has a unique address,
which ranges from zero to the memory size minus
one.
For example, if a computer has 64K words, then
this memory unit has 64 * 1024 = 65536
memory locations. The addresses of these
locations range from 0 to 65535.
3 3/11/2016 By:-Gourav Kottawar
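The 64K-word example can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and do not come from the slides.

```python
# Address range of a memory with 64K words (K = 1024).
K = 1024
memory_words = 64 * K                            # number of cells
first_address = 0
last_address = memory_words - 1                  # memory size minus one
address_bits = (memory_words - 1).bit_length()   # bits needed to name every cell

print(memory_words, first_address, last_address, address_bits)
```

The last value shows how many address lines such a memory would need to select any one cell.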
9. How Is the Hierarchy Managed?
Registers <-> Memory
–by compiler (programmer?)
Cache <-> Memory
–by the hardware
Memory <-> Disks
–by the hardware and operating system (virtual memory)
–by the programmer (files)
10. Memory Hierarchy Technology
Random-Access Memory
Random-access memory (RAM) comes in two
varieties—static and dynamic.
Static RAM (SRAM) is faster and significantly more
expensive than Dynamic RAM (DRAM).
SRAM is used for cache memories, both on and off
the CPU chip.
DRAM is used for the main memory plus the frame
buffer of a graphics system.
Typically, a desktop system will have no more than a
few megabytes of SRAM, but hundreds or thousands
of megabytes of DRAM.
12. Reading RAM
To read from this RAM, the controlling circuit must:
– Enable the chip by ensuring CS = 1.
– Select the read operation, by setting WR = 0.
– Send the desired address to the ADRS input.
– The contents of that address appear on OUT after
a short access delay.
Notice that the DATA input is unused for read
operations.
13. Writing RAM
To write to this RAM, you need to:
– Enable the chip by setting CS = 1.
– Select the write operation, by setting WR = 1.
– Send the desired address to the ADRS input.
– Send the word to store to the DATA input.
The output OUT is not needed for memory write
operations.
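The read and write procedures can be sketched as a small behavioural model. The signal names CS, WR, ADRS, DATA and OUT come from the slides; the class name, the `cycle` method and the assumption that every cell starts at 0 are illustrative only, and timing is not modelled.

```python
class RamChip:
    """Toy behavioural model of a RAM chip with CS, WR, ADRS, DATA and OUT."""

    def __init__(self, words):
        self.cells = [0] * words   # assumption: every cell starts at 0
        self.out = None            # value currently on the OUT lines

    def cycle(self, cs, wr, adrs, data=None):
        if cs != 1:                # chip not enabled: nothing happens
            return None
        if wr == 1:                # write: store DATA at ADRS, OUT is not used
            self.cells[adrs] = data
            return None
        self.out = self.cells[adrs]  # read: DATA is ignored
        return self.out


ram = RamChip(words=16)
ram.cycle(cs=1, wr=1, adrs=5, data=0b1010)   # write the word 1010 to address 5
value = ram.cycle(cs=1, wr=0, adrs=5)        # read it back
ignored = ram.cycle(cs=0, wr=0, adrs=5)      # chip disabled, no output
print(value, ignored)
```

A real chip would also need the address and data to be held stable for a minimum time, which this sketch leaves out.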
14. Static RAM
SRAM stores each bit in a bistable memory
cell. Each cell is implemented with a six-
transistor circuit.
This circuit has the property that it can stay
indefinitely in either of two different voltage
configurations, or states.
Any other state will be unstable—starting from
there, the circuit will quickly move toward one
of the stable states.
15. Dynamic RAM
Dynamic random-access memory (DRAM) is a
type of random-access memory that stores each
bit of data in a separate capacitor within an
integrated circuit.
The capacitor can be either charged or
discharged; these two states are taken to
represent the two values of a bit, conventionally
called 0 and 1.
Since even "non conducting" transistors always
leak a small amount, the capacitors will slowly
discharge, and the information eventually fades
unless the capacitor charge is refreshed
periodically.
Because of this refresh requirement, it is called
dynamic memory, as opposed to static memory such as SRAM.
16. Dynamic RAM
The main memory (the "RAM") in personal
computers is dynamic RAM (DRAM).
It is the RAM in desktops, laptops and workstation
computers as well as some of the RAM of video
game consoles.
The advantage of DRAM is its structural simplicity:
only one transistor and a capacitor are required per
bit, compared to four or six transistors in SRAM.
This allows DRAM to reach very high densities.
Unlike flash memory, DRAM is volatile memory since
it loses its data quickly when power is removed. The
transistors and capacitors used are extremely small;
billions can fit on a single memory chip.
18. Synchronous Dynamic Random-access Memory
Synchronous dynamic random-access memory (SDRAM) is DRAM
that is synchronized with the system bus.
Classic DRAM has an asynchronous interface, which
means that it responds as quickly as possible to
changes in control inputs.
SDRAM has a synchronous interface, meaning that
it waits for a clock signal before responding to
control inputs and is therefore synchronized with the
computer's system bus.
The clock is used to drive an internal finite state
machine that pipelines incoming commands.
The data storage area is divided into several banks,
allowing the chip to work on several memory access
commands at a time, interleaved among the
separate banks.
19. Generations of SDRAM
SDR SDRAM (Single Data Rate synchronous
DRAM)
This type of SDRAM is slower than the DDR
variants, because only one word of data is
transmitted per clock cycle (single data rate).
20. Generations of SDRAM
DDR(1) SDRAM
DDR (Double Data Rate) memory is the next generation
SDRAM.
Like SDR SDRAM, DDR is synchronous with the system clock.
The big difference between DDR and SDR SDRAM memory is
that DDR transfers data on both the rising and falling edges
of the clock signal.
SDR SDRAM only carries information on the rising edge of the
clock signal.
This allows a DDR module to transfer data
twice as fast as SDR SDRAM at the same clock frequency.
For example, with a 133 MHz clock, DDR memory reaches an
effective data rate of 266 MHz instead of 133 MHz.
DDR SDRAM also consumes less power, which makes it
ideal for notebook computers.
JESD79C is the JEDEC standard for DDR SDRAM
specifications.
22. DDR II
Normal DDR limitations at higher frequencies:
Signal integrity
Power Consumption
DDR2 addresses these challenges by:
Reducing the operating voltage from 2.5 V to 1.8 V
Reducing the core operating frequency
Core frequency = 1/2 the I/O frequency
Special New Features:
4-bit pre-fetch
On-die termination
Off-chip driver calibration
23. On-Die Termination
Instead of placing the necessary resistive
termination on the motherboard, the
termination is located inside the semiconductor
chips, a technique called On-Die Termination.
24. OCD (Off-Chip Driver Calibration)
One way to solve the problem is to use Off-
Chip Driver calibration (OCD calibration)
where both parts of the differential strobes are
calibrated against each other and against the
DQ signal.
Through this sort of calibration, the ramping
voltages are optimized for the buffer
impedances to reduce overshoot and
undershoot at the rising and falling edges.
Without OCD calibration, the DRAM has a
nominal output driver strength of 18 ohms
+30% and a pull-up and pull-down mismatch of
up to 4 ohms.
Using OCD calibration, a system can reduce
the pull-up and pull-down mismatch and target
25. Generations of SDRAM
DDR2 SDRAM
DDR2 SDRAM is a double data rate synchronous dynamic
random-access memory interface.
It superseded the original DDR SDRAM
specification, and has since been superseded
by DDR3 SDRAM.
DDR2 DIMMs are neither forward compatible with
DDR3 nor backward compatible with DDR.
DDR2 allows higher bus speed and requires lower
power by running the internal clock at half the speed
of the data bus.
Combined with double data rate transfers, this produces
a total of four data transfers per internal clock cycle.
26. Generations of SDRAM
DDR2 SDRAM
With data being transferred 64 bits at a time, DDR2
SDRAM gives a transfer rate of (memory clock rate) × 2
(for bus clock multiplier) × 2 (for dual rate) × 64 (number
of bits transferred) / 8 (number of bits/byte).
Thus with a memory clock frequency of 100 MHz, DDR2
SDRAM gives a maximum transfer rate of 3200 MB/s.
Since the DDR2 internal clock runs at half the DDR
external clock rate, DDR2 memory operating at the same
external data bus clock rate as DDR results in DDR2
being able to provide the same bandwidth but with
higher latency.
Alternatively, DDR2 memory operating at twice the
external data bus clock rate as DDR may provide twice
the bandwidth with the same latency. The best-rated
DDR2 memory modules are at least twice as fast as the
best-rated DDR memory modules.
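The transfer-rate formula above can be written out as a short calculation. This is a sketch only: the function names are hypothetical, and the DDR formula (two transfers per external bus clock, with no extra bus clock multiplier) is an assumption added for the comparison.

```python
def ddr2_peak_mb_per_s(memory_clock_mhz, bus_width_bits=64):
    """Peak DDR2 rate: memory clock x 2 (bus multiplier) x 2 (dual rate) x 64 / 8."""
    bus_clock_multiplier = 2       # I/O bus runs at twice the memory clock
    transfers_per_bus_clock = 2    # data moves on both clock edges
    bytes_per_transfer = bus_width_bits // 8
    return (memory_clock_mhz * bus_clock_multiplier
            * transfers_per_bus_clock * bytes_per_transfer)


def ddr_peak_mb_per_s(bus_clock_mhz, bus_width_bits=64):
    """Peak DDR rate for a given external bus clock (assumed formula)."""
    return bus_clock_mhz * 2 * (bus_width_bits // 8)


ddr2_rate = ddr2_peak_mb_per_s(100)   # 100 MHz memory clock, 200 MHz bus clock
ddr_rate = ddr_peak_mb_per_s(200)     # DDR at the same 200 MHz external bus clock
print(ddr2_rate, ddr_rate)
```

Both calls give the same figure, which matches the statement that DDR2 at the same external bus clock as DDR provides the same bandwidth.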
29. DDR3 SDRAM
DDR3 SDRAM, an abbreviation for double data rate type
three synchronous dynamic random access memory,
is a modern type of dynamic random access
memory (DRAM) with a high bandwidth interface,
and has been in use since 2007.
It is the higher-speed successor to DDR and DDR2
and the predecessor to DDR4 synchronous dynamic
random access memory (SDRAM) chips.
DDR3 SDRAM is
neither forward nor backward compatible with
30. DDR3 SDRAM
DDR3 can transfer data at twice the rate of DDR2 (eight
times the speed of its internal memory arrays),
enabling higher bandwidth.
With two transfers per cycle of a quadrupled clock
signal, a 64-bit wide DDR3 module may achieve a
transfer rate in megabytes per second (MB/s) of up to
64 times the memory clock speed in megahertz (MHz).
With data being transferred 64 bits at a time per
memory module, DDR3 SDRAM gives a transfer
rate of (memory clock rate) × 4 (for bus clock
multiplier) × 2 (for data rate) × 64 (number of bits
transferred) / 8 (number of bits/byte).
Thus with a memory clock frequency of 100 MHz,
DDR3 SDRAM gives a maximum transfer rate of 6400 MB/s.
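The DDR3 formula, and the "64 times the memory clock" rule of thumb, can be checked the same way. The function name below is hypothetical; the multipliers are the ones given on the slide.

```python
def ddr3_peak_mb_per_s(memory_clock_mhz, bus_width_bits=64):
    """Peak DDR3 rate: memory clock x 4 (bus multiplier) x 2 (data rate) x 64 / 8."""
    bus_clock_multiplier = 4       # bus clock is four times the memory clock
    transfers_per_bus_clock = 2    # two transfers per bus clock cycle
    return (memory_clock_mhz * bus_clock_multiplier
            * transfers_per_bus_clock * bus_width_bits // 8)


rate_at_100 = ddr3_peak_mb_per_s(100)
ratio = rate_at_100 // 100            # MB/s per MHz of memory clock
print(rate_at_100, ratio)
```

For a 64-bit module the ratio is the same at any memory clock, because the formula is linear in the clock rate.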
33. Rambus DRAM (RDRAM)
Goal
High Density, Low Cost, High Bandwidth DRAM
To achieve a high-bandwidth memory interface,
one can either:
make the interface to memory faster
make the interface to memory wider
Wider => More Chips or More Pins => More Cost
i.e., “wider is NOT necessarily better”
more chips also decrease reliability
34. Speeding up the interface
Speeding up the interface has many benefits
over widening the datapath
Fewer pins, fewer chips => less cost
higher reliability
Rambus DRAMs and SyncLink DRAMs use a 400
MHz bus based on Gunning Transceiver Logic
(GTL)
35. RDRAM was initially expected to become the
standard in PC memory, especially after Intel
agreed to license the Rambus technology for
use with its future chipsets.
Further, RDRAM was expected to become a
standard for VRAM. However, RDRAM got
embroiled in a standards war with an
alternative technology - DDR SDRAM, quickly
losing out on grounds of price, and, later on,
performance.
By around 2001, RDRAM was no longer
supported by any mainstream computing
36. ROM
Short for Read-Only Memory, ROM is a type of
"built-in" memory that is capable of holding data
and having that data read from the chip, but not
written to.
Unlike Random Access Memory (RAM), ROM
is non-volatile, which means it keeps its contents
regardless of whether it has power.
A diode is used at each cell that stores a 1.
A diode normally allows current to flow in only one
direction and has a certain threshold, known as the
forward breakover, that determines how much
current is required before the diode will pass it on.
In silicon-based items such as processors and
memory chips, the forward breakover voltage is
approximately 0.6 volts.
37. RAM vs. ROM
Definition
RAM: Random Access Memory or RAM is a form of data storage
that can be accessed randomly at any time, in any order and
from any physical location, allowing quick access and
manipulation.
ROM: Read-only memory or ROM is also a form of data storage,
but one that cannot be easily altered or reprogrammed. It stores
the instructions needed to start up the computer and keeps them
when the computer is switched off. They are hardwired.
Stands for
RAM: Random Access Memory
ROM: Read-only memory
Use
RAM: RAM allows the computer to read data quickly to run
applications. It allows reading and writing.
ROM: ROM stores the program required to initially boot the
computer. It only allows reading.
Volatility
RAM: RAM is volatile, i.e. its contents are lost when the
device is powered off.
ROM: ROM is non-volatile, i.e. its contents are retained even
when the device is powered off.
Types
RAM: The two main types of RAM are static RAM (SRAM) and
dynamic RAM (DRAM).
ROM: The types of ROM include PROM, EPROM and EEPROM.
38. ROM
A ROM chip can send a charge that is above the forward
breakover down the appropriate column with the selected row
grounded to connect at a specific cell.
If a diode is present at that cell, the charge will be
conducted through to the ground, and, under the binary
system, the cell will be read as being "on" (a value of 1).
If the cell's value is 0, there is no diode link at that
intersection to connect the column and row, so the charge on
the column does not get transferred to the row.
39. ROM
The way a ROM chip works necessitates the
programming of complete data when the chip is
created.
You cannot reprogram or rewrite a standard
ROM chip. If the data is incorrect or needs to be
updated, you have to throw the chip away and start over.
Creating the original template for a ROM chip is
often a laborious process.
Once the template is completed, the actual chips
can cost as little as a few cents each.
They use very little power, are extremely reliable
and, in the case of most small electronic devices,
contain all the necessary programming to control
the device.
40. Read Only Memory (ROM)
Types
There are four basic ROM types:
1. PROM - Programmable Read Only Memory
2. EPROM - Erasable Programmable Read Only
Memory
3. EEPROM - Electrically Erasable Programmable
Read Only Memory
4. Flash EEPROM memory
41. PROM
Creating ROM chips totally from scratch is time-
consuming and very expensive in small
quantities.
For this reason, developers created a type of
ROM known as programmable read-only memory
(PROM).
Blank PROM chips can be bought inexpensively
and coded by the user with a programmer.
42. PROM
PROM chips have a grid of columns and rows just as
ordinary ROMs do.
The difference is that every intersection of a column and row
in a PROM chip has a fuse connecting them.
A charge sent through a column will pass through the fuse in a
cell to a grounded row, indicating a value of 1.
Since all the cells have a fuse, the initial (blank) state of a
PROM chip is all 1s.
To change the value of a cell to 0, you use a programmer to
send a specific amount of current to the cell.
The higher voltage breaks the connection between the column
and row by burning out the fuse.
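The fuse grid of a PROM can be modelled in a few lines. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, and the electrical details (voltages, currents) are reduced to a boolean per fuse.

```python
class Prom:
    """Toy PROM: one fuse per row/column intersection, intact fuse reads as 1."""

    def __init__(self, rows, cols):
        # A blank chip has every fuse intact, so every cell reads 1.
        self.fuse = [[True] * cols for _ in range(rows)]

    def read(self, row, col):
        return 1 if self.fuse[row][col] else 0

    def program_row(self, row, bits):
        for col, bit in enumerate(bits):
            if bit == 0:
                self.fuse[row][col] = False   # burn the fuse: the cell becomes 0
            elif not self.fuse[row][col]:
                # A blown fuse cannot be restored, so a 0 can never become a 1.
                raise ValueError("fuse already blown at column %d" % col)


prom = Prom(rows=2, cols=4)
blank = [prom.read(0, c) for c in range(4)]
prom.program_row(0, [1, 0, 1, 0])
programmed = [prom.read(0, c) for c in range(4)]
print(blank, programmed)
```

Trying to program `[1, 1, 1, 1]` into the same row afterwards raises `ValueError`, which reflects the one-time nature of PROM programming.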
43. EPROM
Working with ROMs and PROMs can be a wasteful
business. Even though they are inexpensive per chip, the
cost can add up over time.
Erasable programmable read-only memory (EPROM)
addresses this issue.
EPROM chips can be rewritten many times. Erasing an
EPROM requires a special tool that emits a certain
frequency of ultraviolet (UV) light.
EPROMs are configured using an EPROM programmer
that provides voltage at specified levels depending on the
type of EPROM used.
The EPROM has a grid of columns and rows and the cell
at each intersection has two transistors.
The two transistors are separated from each other by a
thin oxide layer. One of the transistors is known as the
floating gate and the other as the control gate.
The floating gate's only link to the row (wordline) is through
the control gate. As long as this link is in place, the cell has
a value of 1. To change the value to 0 requires a process
called tunneling.
44. Tunneling is used to alter the placement of
electrons in the floating gate.
Tunneling discharges electrons that have
enough energy to pass through the insulating
oxide layer and accumulate on the gate electrode.
When the high voltage is removed, the electrons
are trapped on the electrode.
Because of the high insulation value of the silicon
oxide surrounding the gate, the stored charge
cannot readily leak away and the data can be
retained for decades.
An electrical charge, usually 10 to 13 volts, is
applied to the floating gate.
The charge comes from the column (bit line),
enters the floating gate and drains to a ground.
46. This charge causes the floating-gate transistor to
act like an electron gun.
The excited electrons are pushed through and
trapped on the other side of the thin oxide layer,
giving it a negative charge.
These negatively charged electrons act as a barrier
between the control gate and the floating gate.
A device called a cell sensor monitors the level of
the charge passing through the floating gate.
If the flow through the gate is greater than 50
percent of the charge, it has a value of 1.
When the charge passing through drops below the
50-percent threshold, the value changes to 0. A
blank EPROM has all of the gates fully open, giving
each cell a value of 1.
47. To rewrite an EPROM, you must erase it first.
To erase it, you must supply a level of energy strong
enough to break through the negative electrons
blocking the floating gate.
In a standard EPROM, this is best accomplished
with UV light at a wavelength of 253.7 nanometers
(2537 angstroms).
Because light of this wavelength will not penetrate
most plastics or glasses, each EPROM chip has a
quartz window on top of it.
The EPROM must be very close to the eraser's light
source, within an inch or two, to work properly.
An EPROM eraser is not selective; it will erase the
entire EPROM.
The EPROM must be removed from the device it is
in and placed under the UV light of the EPROM
eraser for several minutes.
An EPROM that is left under too long can become
48. EEPROMs and Flash Memory
Though EPROMs are a big step up from PROMs in
terms of reusability, they still require dedicated
equipment and a labor-intensive process to remove
and reinstall them each time a change is necessary.
Also, changes cannot be made incrementally to an
EPROM; the whole chip must be erased.
Electrically erasable programmable read-only memory
(EEPROM) chips remove the biggest drawbacks of
EPROMs.
In EEPROMs:
The chip does not have to be removed to be rewritten.
The entire chip does not have to be completely
erased to change a specific portion of it.
Changing the contents does not require additional
dedicated equipment.
49. EEPROMs and Flash Memory
Instead of using UV light, you can return the
electrons in the cells of an EEPROM to normal
with the localized application of an electric field to
each cell.
This erases the targeted cells of the EEPROM,
which can then be rewritten.
EEPROMs are changed 1 byte at a time, which
makes them versatile but slow. In fact, EEPROM
chips are too slow to use in many products that
make quick changes to the data stored on the
chip.
Manufacturers responded to this limitation with
Flash memory, a type of EEPROM that uses in-
circuit wiring to erase by applying an electrical
field to the entire chip or to predetermined
sections of the chip called blocks.
50. EEPROMs and Flash Memory
This erases the targeted area of the chip, which
can then be rewritten.
Flash memory works much faster than traditional
EEPROMs because instead of erasing one byte
at a time, it erases a block or the entire chip, and
then rewrites it.
The electrons in the cells of a Flash-memory chip
can be returned to normal ("1") by the application
of an electric field, a higher-voltage charge.
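The speed difference between byte-wise EEPROM and block-erased Flash can be illustrated by counting erase operations. This is a rough sketch, not a model of any real part: the class names, the 64-byte block size and the idea of counting erases as a stand-in for time are all assumptions.

```python
ERASED = 0xFF  # all bits 1, the "normal" state of an erased cell


class ByteErasableEeprom:
    def __init__(self, size):
        self.data = [ERASED] * size
        self.erase_operations = 0

    def write(self, address, value):
        self.data[address] = ERASED     # erase only this byte
        self.erase_operations += 1
        self.data[address] &= value     # programming can only clear bits to 0


class BlockErasableFlash:
    def __init__(self, size, block_size):
        self.data = [ERASED] * size
        self.block_size = block_size
        self.erase_operations = 0

    def rewrite_block(self, block, values):
        start = block * self.block_size
        self.data[start:start + self.block_size] = [ERASED] * self.block_size
        self.erase_operations += 1      # one erase covers the whole block
        for offset, value in enumerate(values):
            self.data[start + offset] &= value


new_contents = list(range(64))

eeprom = ByteErasableEeprom(size=256)
for address, value in enumerate(new_contents):
    eeprom.write(address, value)

flash = BlockErasableFlash(size=256, block_size=64)
flash.rewrite_block(0, new_contents)

print(eeprom.erase_operations, flash.erase_operations)
```

Both devices end up holding the same 64 bytes, but the Flash model needed a single erase where the EEPROM model needed one per byte.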
51. CACHE MEMORY
Locality of Reference
- References to memory in any given time
interval tend to be confined to a localized area
- This area contains a set of information, and
its membership changes gradually as time goes by
- Temporal Locality
e.g. reuse of information in loops
- Spatial Locality
e.g. related data items (arrays) are usually stored together;
instructions are executed sequentially
Cache
- The property of Locality of Reference is what makes
Cache memory systems work
- Cache is a fast, small-capacity memory that should hold the information
that is most likely to be accessed
[Figure: CPU <-> Cache memory <-> Main memory]
52. CACHE MEMORY
• Cache is the fastest memory in the hierarchy after the CPU registers
• Cache operation
- Keeps the most frequently accessed instructions and data in the fast cache
memory
- Cache is only a small fraction of the size of main memory
- Works on hits and misses
- All memory accesses are directed first to Cache
- If the word is in Cache, i.e. a hit: access Cache to provide it to the CPU
- If the word is not in Cache, i.e. a miss: bring in a block (or a line)
including that word to replace a block now in Cache
• Performance of Cache Memory System
- Hit ratio: the number of hits divided by the total CPU
references to memory (hits + misses)
• Characteristics
- Fast access
- Follows a mapping process: transformation of data from main
memory to cache memory
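The hit ratio can be measured on a small simulated cache. The sketch below is illustrative only: the function name is hypothetical, and the first-in-first-out replacement of whole blocks is an assumption, since the slides do not name a replacement policy.

```python
from collections import deque


def hit_ratio(addresses, cache_blocks, block_size):
    """Return hits / (hits + misses) for a sequence of word addresses."""
    cache = deque(maxlen=cache_blocks)  # oldest block drops out when full (assumed FIFO)
    hits = misses = 0
    for address in addresses:
        block = address // block_size   # a miss brings in the whole block
        if block in cache:
            hits += 1
        else:
            misses += 1
            cache.append(block)
    return hits / (hits + misses)


# Spatial locality: 64 consecutive words, fetched in blocks of 4 words.
sequential_ratio = hit_ratio(range(64), cache_blocks=4, block_size=4)

# Temporal locality: a loop that touches the same 4 words 16 times.
loop_ratio = hit_ratio([0, 1, 2, 3] * 16, cache_blocks=4, block_size=1)

print(sequential_ratio, loop_ratio)
```

In the sequential case only the first word of each block misses; in the loop case only the first pass misses.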
53. MEMORY AND CACHE MAPPING - ASSOCIATIVE MAPPING -
Types of Mapping
Associative mapping
Direct mapping
Set-associative mapping
Associative Mapping
- Most flexible
- Mapping Table is implemented in an associative memory
-> Fast, very Expensive
- Mapping Table
Stores both address and the content of the memory word
- Any block location in Cache can store any block in memory
CPU address (15 bits) -> Argument register
Address   Data   (octal)
01000     3450
02777     6710
22345     1234
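An associative lookup can be sketched with a dictionary keyed by the full 15-bit address. The three address/data pairs are the octal values from the slide's table; the function name is hypothetical, and a dictionary lookup only stands in for the parallel search that associative memory hardware performs.

```python
# Each entry stores the full address together with the data word (octal values).
associative_cache = {0o01000: 0o3450, 0o02777: 0o6710, 0o22345: 0o1234}


def associative_read(address):
    """Return (hit, data): the address is matched against every stored address."""
    if address in associative_cache:
        return True, associative_cache[address]
    return False, None


hit, data = associative_read(0o02777)
miss = associative_read(0o00000)
print(hit, oct(data), miss)
```

Because any entry can hold any address, no index field is needed, which is what makes this mapping the most flexible and the most expensive.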
54. MEMORY AND CACHE MAPPING - DIRECT MAPPING -
Addressing Relationships
15-bit memory address = Tag (6 bits) + Index (9 bits)
Main memory: 32K x 12, Address = 15 bits, Data = 12 bits, octal addresses 00000 to 77777
Cache memory: 512 x 12, Address = 9 bits, Data = 12 bits, octal addresses 000 to 777
Direct Mapping Cache Organization
Main memory (octal)
Memory address   Memory data
00000            1220
00777            2340
01000            3450
01777            4560
02000            5670
02777            6710
Cache memory (octal)
Index address   Tag   Data
000             00    1220
777             02    6710
- Each memory block has only one place to load in Cache
- Mapping Table is made of RAM
- An n-bit memory address consists of 2 parts: k bits of Index field and
n-k bits of Tag field
- The n-bit address is used to access main memory
and the k-bit Index is used to access the Cache
55. DIRECT MAPPING
Direct Mapping with block size of 8 words
Operation
- CPU generates a memory request with (TAG;INDEX)
- Access Cache using INDEX to obtain (tag; data)
- Compare TAG and tag
- If they match -> Hit
Provide Cache[INDEX](data) to CPU
- If they do not match -> Miss
M[tag;INDEX] <- Cache[INDEX](data)
Cache[INDEX] <- (TAG;M[TAG; INDEX])
CPU <- Cache[INDEX](data)
Address fields: Tag (6 bits) | Block (6 bits) | Word (3 bits); INDEX = Block + Word (9 bits)
Block      Index   Tag   Data   (octal)
Block 0    000     01    3450
           007     01    6578
Block 1    010
           017
Block 63   770     02
           777     02    6710
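The direct-mapping read operation can be sketched as follows. To stay short, this version uses one-word blocks and the 6-bit tag / 9-bit index split from the slides, and it omits the step that writes the old word back to memory on a miss. The memory contents are the octal values from the direct-mapping memory listing; the function and variable names are hypothetical.

```python
INDEX_BITS = 9                        # 512-word cache
INDEX_MASK = (1 << INDEX_BITS) - 1    # octal 777

main_memory = {0o00000: 0o1220, 0o00777: 0o2340, 0o01000: 0o3450,
               0o01777: 0o4560, 0o02000: 0o5670, 0o02777: 0o6710}
cache = {}                            # index -> (tag, data)


def read(address):
    """Return (hit, data) for a 15-bit address."""
    tag, index = address >> INDEX_BITS, address & INDEX_MASK
    entry = cache.get(index)
    if entry is not None and entry[0] == tag:
        return True, entry[1]         # stored tag matches: hit
    data = main_memory[address]       # miss: fetch the word from main memory
    cache[index] = (tag, data)        # it replaces whatever was at this index
    return False, data


results = [read(a) for a in (0o02777, 0o02777, 0o01777, 0o02777)]
print([(hit, oct(data)) for hit, data in results])
```

Addresses 02777 and 01777 share index 777, so they keep evicting each other; this is the main weakness of direct mapping.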
56. MEMORY AND CACHE MAPPING - SET ASSOCIATIVE MAPPING -
Set Associative Mapping Cache with set size of two
- Each memory block has a set of locations in the Cache to load
Index   Tag   Data   Tag   Data   (octal)
000     01    3450   02    5670
777     02    6710   00    2340
Operation
- CPU generates a memory request; the index value of the address is then
used to access the cache
- The tag field of the CPU address is then compared with both tags in the
cache to determine if a match occurs
- The comparison is done by associative search
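A two-way set-associative lookup can be sketched in the same style. The 9-bit index and the set size of two follow the slide; the first-in-first-out replacement within a set is an assumption, since the slide does not say which entry is replaced, and all names are hypothetical.

```python
INDEX_BITS = 9
INDEX_MASK = (1 << INDEX_BITS) - 1
SET_SIZE = 2                          # two (tag, data) pairs per index

main_memory = {0o00777: 0o2340, 0o01000: 0o3450, 0o01777: 0o4560,
               0o02000: 0o5670, 0o02777: 0o6710}
sets = {}                             # index -> list of (tag, data), oldest first


def read(address):
    """Return (hit, data) for a 15-bit address."""
    tag, index = address >> INDEX_BITS, address & INDEX_MASK
    ways = sets.setdefault(index, [])
    for stored_tag, data in ways:     # hardware compares both tags at once
        if stored_tag == tag:
            return True, data
    data = main_memory[address]       # miss: fetch from main memory
    if len(ways) == SET_SIZE:
        ways.pop(0)                   # assumed FIFO replacement within the set
    ways.append((tag, data))
    return False, data


sequence = (0o02777, 0o00777, 0o02777, 0o00777, 0o01777, 0o02777)
results = [read(a) for a in sequence]
print([(hit, oct(data)) for hit, data in results])
```

Unlike the direct-mapped case, 02777 and 00777 can both stay in set 777; a conflict only appears when a third address with the same index arrives.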
57. Writing to the cache
- For a read operation that hits in the cache, main memory is not involved.
- If the operation is a write, there are two ways that the system can proceed
1. Write through –
- simplest
- commonly used
- the procedure is to update main memory with every write operation,
with cache memory being updated in parallel
- main memory always contains the same data as cache
2. Write Back –
- only the cache location is updated during a write operation
- the location is marked by a flag so that later, when the word is removed from
the cache, it is copied into main memory
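The two write policies can be compared with a small model. This is a sketch under simple assumptions: one word per cache location, a dictionary as main memory, and hypothetical names (`Cache`, `write`, `evict`) that do not come from the slides.

```python
class Cache:
    def __init__(self, memory, write_back):
        self.memory = memory          # dict: address -> word
        self.write_back = write_back
        self.lines = {}               # address -> [word, flag]

    def write(self, address, word):
        if self.write_back:
            self.lines[address] = [word, True]   # set the flag, defer the memory update
        else:
            self.lines[address] = [word, False]
            self.memory[address] = word          # write-through: update memory now

    def evict(self, address):
        word, flagged = self.lines.pop(address)
        if flagged:
            self.memory[address] = word          # copy back when the word is removed


memory_wt = {0: 0}
write_through = Cache(memory_wt, write_back=False)
write_through.write(0, 7)

memory_wb = {0: 0}
write_back = Cache(memory_wb, write_back=True)
write_back.write(0, 7)
before_eviction = memory_wb[0]
write_back.evict(0)
after_eviction = memory_wb[0]

print(memory_wt[0], before_eviction, after_eviction)
```

With write-through, memory matches the cache after every write; with write-back, memory holds the old value until the flagged word is evicted.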
58. DIRECT MEMORY ACCESS
The transfer of data between a fast storage
device such as a magnetic disk and memory is
often limited by the speed of the CPU.
Removing the CPU from the path and letting the
peripheral device manage the memory buses
directly improves the speed of transfer.
This is called Direct Memory Access (DMA).
During a DMA transfer the CPU is idle and has no
control of the memory buses.
A DMA controller takes over the buses to manage
the transfer directly between the I/O device and
memory.
59. DIRECT MEMORY ACCESS
CPU bus signals for DMA transfer
Bus request – the DMA controller requests control of
the buses
Bus grant – control of the buses is
granted to the DMA controller
Burst Transfer – a block sequence
consisting of a number of memory words is
transferred in a continuous burst
61. DIRECT MEMORY ACCESS
DMA controller
The unit communicates with the CPU via the data bus and control lines
Registers in the DMA controller are selected by enabling DS and RS
When BG = 0
The CPU communicates with the DMA registers for Read or
Write
When BG = 1
The DMA controller communicates directly with memory through the address
bus and activates RD or WR
Has 3 registers
1. Address register –
contains the address
incremented after each word is transferred to
memory
2. Word count register –
holds the number of words to be transferred
decremented by one after each word transfer
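The roles of the address register and the word count register can be sketched with a small model of a burst transfer into memory. The class and method names are hypothetical, the address step of one word is an assumption, and bus arbitration (bus request and bus grant) is reduced to comments.

```python
class DmaController:
    def __init__(self):
        self.address_register = 0
        self.word_count_register = 0

    def setup(self, start_address, word_count):
        # Done by the CPU over the data bus while BG = 0.
        self.address_register = start_address
        self.word_count_register = word_count

    def burst_transfer(self, device_words, memory):
        # Runs while BG = 1: the controller drives the address bus itself.
        source = iter(device_words)
        while self.word_count_register > 0:
            memory[self.address_register] = next(source)   # one word to memory
            self.address_register += 1      # assumed step of one word
            self.word_count_register -= 1   # transfer ends when this reaches zero


memory = [0] * 8
dma = DmaController()
dma.setup(start_address=2, word_count=3)
dma.burst_transfer([11, 22, 33], memory)
print(memory, dma.address_register, dma.word_count_register)
```

When the word count reaches zero the transfer stops, and in a real system the controller would then release the buses back to the CPU.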