SlideShare a Scribd company logo
1 of 53
ā€¢ Memory Hierarchy
ā€¢ Main Memory
ā€¢ Associative Memory
ā€¢ Cache Memory
ā€¢ Virtual Memory
ā€¢ Memory Management Hardware
MEMORY ORGANIZATION
Memory
ā€¢ Ideally,
1. Fast
2. Large
3. Inexpensive
ā€¢ Is it possible to meet all 3 requirements simultaneously ?
Some Basic Concepts
ā€¢ What is the max. size of memory?
ā€¢ Address space
ā€“16-bit : 216
= 64K memory locations
ā€“32-bit : 232
= 4G memory locations
ā€“40-bit : 240
= 1 T memory locations
ā€¢ What is Byte addressable?
Introduction
ā€¢ Even a sophisticated processor may
perform well below an ordinary
processor:
ā€“Unless supported by matching
performance by the memory system.
ā€¢ The focus of this module:
ā€“Study how memory system
performance has been enhanced
through various innovations and
optimizations.
MEMORY HIERARCHY
memory
Memory Hierarchy is to obtain the highest possible
access speed while minimizing the total cost of the memory system
Memory Hierarchy
Magnetic
tapes
Magnetic
disks
I/O
processor
CPU
Main
Cache
memory
Auxiliary memory
Register
Cache
Main Memory
Magnetic Disk
Magnetic Tape
Increasing size
Increasing speed Increasing cost
Basic Concepts of Memory
M A R
M D R
M e m o r y
U p t o 2 a d d r e s s a b le
lo c a t io n s
k
w o r d le n g t h = n b it s
C U
k - b it a d d r e s s b u s
n - b it d a t a b u s
C o n t r o l lin e s
P r o c e s s o r
Connection of the memory to the processo
R / W , MFC , etc
Basic Concepts of Memory
ā€¢ Data transfer between memory & processor takes place through MAR &
MDR.
ā€¢ If MAR is of K-bit then memory unit contains 2K
addressable location.
[K number of address lines]
ā€¢ If MDR is of n-bit then memory cycle n bits of data transferred between
memory & processor. [n number of data lines]
ā€¢ Bus also includes control lines Read / Write & MFC for coordinating
data transfer.
ā€¢ Processor read operation
ļƒ  MARin , Read / Write line = 1 , READ , WMFC , MDRin
ā€¢ Processor write operationļƒ 
ļƒ  MDRin , MARin , MDRout , Read / Write line = 0 , WRITE , WMFC
ā€¢ Memory access is synchronized using a clock.
ā€¢ Memory Access Time ā€“ Time between start Read and MFC signal [Speed
of memory]
ā€¢ Memory Cycle Time ā€“ Minimum time delay between initiation of two
successive memory operations.[ ]
Basic Concepts of Memory
P r o c e s s o r
R e g is t e r s
C a c h e L 1
C a c h e L 2
M a in
m e m o r y
s e c o n d a r y
s t o r a g e
m e m o r y
in c r e a s in g
s iz e
in c r e a s in g
s p e e d
in c r e a s in g
c o s t p e r b it
SRAM
ADRAM
1. Fastest access is to the data
held in processor registers.
Registers are at the top of
the memory hierarchy.
2. Relatively small amount of
memory that can be
implemented on processor
chip. This is processor
cache.
3. Two levels of cache. Level 1
(L1) cache is on the
processor chip.
4. Level 2 (L2) cache is in
between main memory and
processor.
5. Next level is main memory,
implemented as SIMMs.
Much larger, but much
slower than cache memory.
6. Next level is magnetic disks.
Huge amount of
inexpensive storage.
7. Speed of memory access is
critical, the idea is to bring
instructions and data that
will be used in the near
future as close to the
processor as possible.
Basic Concepts of Memory
ā€¢ Random Access Memory ļƒ  any location can be accessed for read / write operation
in fixed amount of time .
ā€¢ Types of RAM :ļƒ 
1. Static memory / SRAM : Capable of retain states as long as power is
applied, volatile in nature.[High cost & speed]
2. Asynchronous DRAM : Dynamic RAM are less expensive but they do not
retain their state indefinitely. Widely used in computers.
3. Synchronous DRAM : Whose operation is directly Synchronized with a clock
signal.
4. Performance Parameter :- Bandwidth & Latency.
5. Bandwidth :-Number of bytes transfer in 1 unit of time.
6. Latency:- Amount of time takes to transferred a word of data to & from
memory.
ā€¢ Read Only Memory / ROM :ļƒ  location can be accessed for read operation only in
fixed amount of time . Capable of retain states called as, non-volatile in nature.
ā€¢ Programmable ROM : Allows data to be loaded by user.
ā€¢ Erasable PROM : Erased [ by UV ray ]Stored data to load new data.
ā€¢ Electrically EPROM : Erased by different voltages.
ā€¢ Memory uses semiconductor integrated circuit to increase performance.
ā€¢ To reduce memory cycle time ļƒ  Use Cache memory ļƒ A small SRAM physically very
closed to processor which works on locality of reference.
ā€¢ Virtual memory is used to increase the size of the physical memory.
Internal Organization of Memory Chips
Organization of bit cells in a memory chip
2 4
FF
circuit
SenseĀ /Ā Write
Address
decoder
FF
CS
cells
Memory
circuit
SenseĀ /Ā Write SenseĀ /Ā Write
circuit
DataĀ input /Ā outputĀ lines:
A 0
A 1
A 2
A 3
W0
W1
W15
7 1 0
WR /
7 1 0
b7 b1 b0
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
ā€¢
b bā€™
Internal Organization of Memory Chips
ā€¢ Memory cells are organized in an array [ Row & Column format ] where
each cell is capable of storing one bit of information.
ā€¢ Each row of cell contains memory word / data & all cells are connected to
a common word line, which is driven by address decoder on chip.
ā€¢ Cell in each column are connected to sense / write circuit by 2 bit lines.
ā€¢ sense / write circuits are connected to data I/O lines of chip.
ā€¢ READ Operation ļƒ  sense / write circuit Sense / Read information stored in
cells selected by a word line & transmit same information to o/p data line.
ā€¢ WRITE Operation ļƒ  sense / write circuit receive i/p information & store it
in the cell.
ā€¢ If a memory chip consist of 16 memory words of 8 bit each then it is
referred as 16 x 8 organization or 128 x 8 bit organization.
ā€¢ The data I/O of each sense / write circuit are connected to a single
bidirectional data line that can be connected to the data bus of a computer.
ā€¢ 2 control lines Read / Write [Specifies the required operation ] & Chip
Select (CS ) [ select a chip in a multichip memory ].
ā€¢ It can store 128 bits & 14 external connections like address, data & control
lines.
Internal Organization of Memory Chip
An Example
32 - to -1
O/P MUX
&
I/P DMUX
Data I/P & O/P
Internal Organization of Memory Chip
An Example
1k [ 1024 ] Memory Cell
ā€¢ Design a memory of 1k [ 1024 ] memory cells.
ā€¢ For 1k , we require 10 bits address line.
ā€¢ So 5 bits for rows & columns each to access address of the memory cell
represented in array.
ā€¢ A row address selects a row of 32 cells, all of which accessed in parallel.
ā€¢ According to the column address, only one of these cells is connected to
the external data line by the output MUX & input DMUX.
Static Memories
ā€¢ Circuits capable of retaining their state as long as power is applied Static
RAM (SRAM) (volatile ).
ā€¢ 2 inverters are cross connected to form a latch.
ā€¢ Latch is connected to 2 bit lines by transistors T1 & T2.
ā€¢ transistors T1 & T2 act as switches can be opened & closed under control
of word line.
ā€¢ For ground level transistors turned off (initial time cell is in state 1, X=1
& Y=0 ).
ā€¢ Read Operation :-
1. Word line activated to close
switches T1 & T2 .
2. Cell state either 1 or 0 & the
signal on bit line b and bā€™ are
always complements to each
other.
3. Sense / Write circuit set the end
value of bit line as output.
ā€¢ Write Operation :-
1. State of the cell is set by
placing the actual values on bit
line b & its complement on bā€™
and activating the word line.
[Sense / Write circuit ]
A Static RAM Cell
YX
WordĀ line
BitĀ lines
b
T2T1
bā€²
Asynchronous DRAM
ā€¢ SRAMā€™s are fast but very costly due to much number of transistors for
their cells.
ā€¢ So, less expensive cell, which also canā€™t retain their state indefinitely
turn into a memory as dynamic RAM [DRAM].
ā€¢ Data is stored in DRAM cell in form of charge on capacitor but only for
a period of tens of milliseconds.
An Example of DRAM
ā€¢ DRAM cell consist of a capacitor,
C , & a transistor, T .
ā€¢ To store information in cell,
transistor T is turn on, & provide
correct amount of voltage to bit
line.
ā€¢ After transistor turn off capacitor
begins to discharge.
ā€¢ So, Read operation must be
completed before capacitor
drops voltage below some
threshold value [ by sense
amplifier connected to bit line].
Single Transistor Dynamic memory
Design 16MB DRAM Chip
ā€¢ 2 M x 8 memory chip .
ā€¢ Cells are organized in the form of 4K x 4K .
ā€¢ 4096 cells in each row divided into 512 group of 8. Hence 512 byte data can be stored in each
row.
ā€¢ 12 [ 512 x 8 = 212
] bit address to select row & 9 [ 512 = 212
] bits to specify a group of 8 bits in the
selected row.
ā€¢ RAS [Row address strobe] & CAS [Column address strobe] will be crossed to find the proper bit
to read or write.
Column
CSSenseĀ /Ā Write
circuits
cellĀ arraylatch
address
Row
Column
latch
decoder
Row
decoderaddress
4096 512 8( )
R /W
A20 9Ā­ A8 0Ā­ā„
D0D7
RAS
CAS
x x
1. Each row can
store 512 bytes.
12 bits to select
a row, and 9 bits
to select a group
in a row. Total of
21 bits.
2. First apply the
row address,
RAS signal
latches the row
address. Then
apply the column
address, CAS
signal latches
the address.
3. Timing of the
memory unit is
controlled by a
specialized unit
which generates
RAS and CAS.
Fast Page Mode
ļ‚” Suppose if we want to access the consecutive bytes in
the selected row.
ļ‚” This can be done without having to reselect the row.
ļ‚§ Add a latch at the output of the sense circuits in each row.
ļ‚§ All the latches are loaded when the row is selected.
ļ‚§ Different column addresses can be applied to select and place different bytes on the
data lines.
ļ‚” Consecutive sequence of column addresses can be
applied under the control signal CAS, without reselecting
the row.
ļ‚§ Allows a block of data to be transferred at a much faster rate than random accesses.
ļ‚§ A small collection/group of bytes is usually referred to as a block.
ļ‚” This transfer capability is referred to as the fast page mode
feature.
Synchronous DRAM
R/ W
RAS
CAS
CS
Clock
Cell array
latch
address
Row
decoder
Row
decoder
Column Read / Write
circuits & latchescounter
address
Column
Row/Column
address
Data input
register
Data output
register
Data
Refresh
counter
Mode register
and
timing control
1. Operation is directly
synchronized with processor
clock signal.
2. The outputs of the sense
circuits are connected to a latch.
3. During a Read operation, the
contents of the cells in a row are
loaded onto the latches.
4. During a refresh operation, the
contents of the cells are
refreshed without changing the
contents of the latches.
5. Data held in the latches
correspond to the selected
columns are transferred
to the output.
6. For a burst mode of operation,
successive columns are
selected using column address
counter and clock.
7. CAS signal need not be
generated externally. A new data
is placed during raising edge of
the clock
Double-Data-Rate SDRAM
ā€¢ In addition to faster circuits, new organizational and operational features
make it possible to achieve high data rates during block transfers.
ā€¢ The key idea is to take advantage of the fact that a large number of bits are
accessed at the same time inside the chip when a row address is applied.
ā€¢ Various techniques are used to transfer these bits quickly to the pins of
the chip.
ā€¢ To make the best use of the available clock speed, data are transferred
externally on both the rising and falling edges of the clock. For this reason,
memories that use this technique are called double-data-rate SDRAMs
(DDR SDRAMs).
ā€¢ Several versions of DDR chips have been developed. The earliest version
is known as DDR. Later versions, called DDR2, DDR3, and DDR4, have
enhanced capabilities.
Structure of Larger Memory
Static memories
19Ā­bitĀ internalĀ chipĀ address
decoder
2Ā­bit
addresses
21Ā­bit
A 0
A 1
A19
Ā memoryĀ chip
A20
D31Ā­24
D7Ā­0
D23Ā­16
D 15Ā­8
512 K XĀ 8Ɨ
ChipĀ select
Ā memoryĀ chip
19Ā­bit
address
512 K 8Ɨ
8Ā­bitĀ data
input/output
Organization of 2M X 32 Memory Modules using
512 K x 8
Static Memory Chip
1. Implement a memory unit of 2M
words of 32 bits each.
2. Use 512K x 8 static memory
chips.
3. Each column consists of 4
chips.
4. Each chip implements one byte
position.
5. A chip is selected by setting its
chip select control line to 1.
6. Selected chip places its data on
the data output line, outputs of
other chips are in high
impedance state.
7. 21 bits to address a 32-bit word.
8. High order 2 bits are needed to
select the row, by activating the
four Chip Select signals.
9. 19 bits are used to access
specific byte locations inside
the selected chip.
Memory Controller
M e m o r y
M e m o r y
C o n t r o lle r
R o w / C o lu m n
a d d r e s sA d d r e s s
d a t a
P r o c e s s o r
C lo c k
R / W
C S
C A S
R A S
R / W
R e q u e s t
C lo c k
ā€¢ Memory address are divided into 2 parts.
ā€¢ High order address bit which select row in the cell array, are provided first & latched into memory chip
under control of RSA signal.
ā€¢ Low order address bit , which selects a column are provided on the same address & latched through CSA
signal.
ā€¢ However, a processor issues all address bits at the same time.
ā€¢ In order to achieve multiplexing, memory controller circuit is inserted between processor & memory.
ā€¢ Controller accepts a complete address & R/W signal from processor under control of REQUEST signal,
which indicates memory access operation is needed.
ā€¢ Controller forwards row & column address timing to have address multiplexing function.
ā€¢ Then R/W & CS are send to memory.
ā€¢ Data lines are directly connected between processor & memory.
Read-Only memory (ROM)
ā€¢ Many applications need memory devices to retain contents after the
power is turned off.
ā€“ For example, computer is turned on, the operating system
must be loaded from the disk into the memory.
ā€“ Store instructions which would load the OS from the disk.
ā€“ Need to store these instructions so that they will not be lost
after the power is turned off.
ā€“ We need to store the instructions into a non-volatile memory.
ā€¢ Non-volatile memory is read in the same manner as volatile memory.
ā€“ Separate writing process is needed to place information in
this memory.
ā€“ Normal operation involves only reading of data, this type
of memory is called Read-Only memory (ROM).
Read-Only memory (ROM)
ā€¢ Read-Only Memory:
ā€“ Data are written into a ROM when it is manufactured.
ā€¢ Programmable Read-Only Memory (PROM):
ā€“ Allow the data to be loaded by a user.
ā€“ Process of inserting the data is irreversible.
ā€“ Storing information specific to a user in a ROM is expensive.
ā€¢ Erasable Programmable Read-Only Memory (EPROM):
ā€“ Stored data to be erased and new data to be loaded.
ā€“ Flexibility, useful during the development phase of digital systems.
ā€“ Erasable, reprogrammable ROM.
ā€“ Erasure requires exposing the ROM to UV light.
ā€¢ Electrically Erasable Programmable Read-Only Memory (EEPROM):
ā€“ To erase the contents of EPROMs, they have to be exposed to ultraviolet light.
ā€“ Physically removed from the circuit.
ā€“ EEPROMs the contents can be stored and erased electrically
ā€¢ Flash memory:
ā€“ Has similar approach to EEPROM.
ā€“ Read contents of a single cell, but write contents of an entire block of cells.
ā€“ Higher capacity and low storage cost per bit.
ā€“ Power consumption of flash memory is very low, making it attractive for use in
equipment that is battery-driven.
ā€¢ Reduces the search time
efficiently
ā€¢ Address is replaced by
content of data called as
Content Addressable Memory
(CAM)
ā€¢ Called as Content based data.
ā€¢ Hardwired Requirement :ļƒ 
ā€“ It contains memory array &
logic for m words with n bits
per each word.
ā€“ Argument register (A) & Key
register (k) each have n bits.
ā€“ Match register (M) has m bits,
one for each word in memory.
ā€“ Each word in memory is
compared in parallel with the
content of argument register
and key register.
ā€“ If a match found for a word
which matches with the bits
of argument register & its
corresponding bits in the
match register then a search
for a data word is over.
Associative Memory
Cache Memory
ā€¢ Processor is much faster than the main memory.
ā€“ As a result, the processor has to spend much of its time waiting while
instructions and data are being fetched from the main memory.
ā€“ Major obstacle towards achieving good performance.
ā€¢ Speed of the main memory cannot be increased beyond a certain point.
ā€¢ Cache memory is an architectural arrangement which makes the main memory
appear faster to the processor than it really is.
ā€¢ Relatively small SRAM [ Having low access time ] memory located physically closer
to processor.
ā€¢ Cache memory is based on the property of computer programs known as
ā€œLOCALITY OF REFERENCEā€.
ā€¢ Analysis of programs indicates that many instructions in localized areas of a
program are executed repeatedly during some period of time, while the others are
accessed relatively less frequently.
ā€“ These instructions may be the ones in a loop, nested loop or few procedures
calling each other repeatedly.
Locality of Reference
ļƒ˜ The references to memory at any given time interval tend to be confined within
a localized areas.
ļƒ˜ This area contains a set of information and the membership changes gradually
as time goes by
ļƒ˜ Temporal Locality ļƒ 
ļƒ¼ Recently executed instruction is likely to be executed again very soon.
ļƒ¼ The information which will be used in near future is likely to be in use already( e.g. Reuse of
information in loops).
ļƒ˜ Spatial Locality ļƒ 
ļƒ¼ Instructions with addresses close to a recently instruction are likely to be executed soon.
ļƒ¼ If a word is accessed, adjacent (near) words are likely accessed soon (e.g.
Related data items (arrays) are usually stored together; instructions are executed
sequentially)
ļƒ˜ Cache is a fast small capacity memory that should hold those information
which are most likely to be accessed.
Cache Memory
Main memory
Cache memory
CPU
Cache Memory
ā€¢ Processor issues a Read request, a block of words is transferred from the main
memory to the cache, one word at a time.
ā€¢ Subsequent references to the data in this block of words are found in the cache.
ā€¢ At any given time, only some blocks in the main memory are held in the cache.
Which blocks in the main memory are in the cache is determined by a ā€œmapping
functionā€.
ā€¢ When the cache is full, and a block of words needs to be transferred from the
main memory, some block of words in the cache must be replaced. This is
determined by a ā€œreplacement algorithmā€.
Cache
Main
memoryProcessor
Cache Hit
ā€¢ Existence of a cache is transparent to the processor. The
processor issues Read and Write requests in the same manner.
ā€¢ If the data is in the cache it is called a Read or Write hit.
ā€¢ Read hit:ļƒ 
ā€“ The data is obtained from the cache.
ā€¢ Write hit:ļƒ 
ā€“ Cache has a replica of the contents of the main memory.
ā€“ Contents of the cache and the main memory may be updated
simultaneously. This is the write-through protocol.
ā€“ Update the contents of the cache, and mark it as updated by
setting a bit known as the dirty bit or modified bit. The
contents of the main memory are updated when this block is
replaced. This is write-back or copy-back protocol.
ā€¢ All the memory accesses are directed first to Cache
ā€¢ If the word is in Cache; Access cache to provide it to CPU ļƒ  CACHE
HIT
ā€¢ If the word is not in Cache; Bring a block (or a line) including that word
to replace a block now in Cache ļƒ  CACHE MISS
ā€¢ Hit Ratio - % of memory accesses satisfied by Cache memory system
ā€¢ Te: Effective memory access time in Cache memory system
ā€¢ Tc: Cache access time
ā€¢ Tm: Main memory access time
ā€¢ Te = h*Tc + (1 - h) [Tc+Tm]
ā€¢ Example:
Tc = 0.4 Āµs, Tm = 1.2Āµs, h = 85%
ā€¢ Te = 0.85*0.4 + (1 - 0.85) * 1.6 = 0.58Āµs
Performance Of Cache Memory
Cache Miss
ā€¢ If the data is not present in the cache, then a Read miss or Write miss
occurs.
ā€¢ Read miss:ļƒ 
ā€“ Block of words containing this requested word is transferred from the
memory.
ā€“ After the block is transferred, the desired word is forwarded to the
processor.
ā€“ The desired word may also be forwarded to the processor as soon as it
is transferred without waiting for the entire block to be transferred. This
is called load-through or early-restart.
ā€¢ Write-miss:ļƒ 
ā€“ Write-through protocol is used, then the contents of the main memory
are updated directly.
ā€“ If write-back protocol is used, the block containing the addressed word
is first brought into the cache. The desired word is overwritten with new
information.
Cache Coherence Problem
ā€¢ A bit called as ā€œvalid bitā€ is provided for each block.
ā€¢ If the block contains valid data, then the bit is set to 1, else it is 0.
ā€¢ Valid bits are set to 0, when the power is just turned on.
ā€¢ When a block is loaded into the cache for the first time, the valid bit is set to 1.
ā€¢ Data transfers between main memory and disk occur directly bypassing the cache.
ā€¢ When the data on a disk changes, the main memory block is also updated.
ā€¢ However, if the data is also resident in the cache, then the valid bit is set to 0.
ā€¢ What happens if the data in the disk and main memory changes and the write-back
protocol is being used?
ā€¢ In this case, the data in the cache may also have changed and is indicated by the
dirty bit.
ā€¢ The copies of the data in the cache, and the main memory are different. This is
called the cache coherence problem.
ā€¢ One option is to force a write-back before the main memory is updated from the
disk.
Cache Memory Mapping Function
ā€¢ Specification of correspondence between main
memory blocks and cache blocks
ā€¢ Mapping functions determine how memory blocks are
placed in the cache.
ā€¢ Three different types of mapping functions:
ā€“ Direct mapping
ā€“ Associative mapping
ā€“ Set-associative mapping
A simple processor example:
ā€“ Cache consisting of 128 blocks of 16 words each.
ā€“ Total size of cache is 2048 (2K) words.
ā€“ Main memory is addressable by a 16-bit address.
ā€“ Main memory has 64K words.
ā€“ Main memory has 4K blocks of 16 words each.
Direct mapping
Main
memory BlockĀ 0
BlockĀ 1
BlockĀ 127
BlockĀ 128
BlockĀ 129
BlockĀ 255
BlockĀ 256
BlockĀ 257
BlockĀ 4095
7 4
MainĀ memoryĀ address
Tag Block Word
5
tag
tag
tag
Cache
BlockĀ 0
BlockĀ 1
BlockĀ 127
ā€¢ Each memory block has only one place to load in
Cache memory.
ā€¢ Block j of the main memory maps to j modulo 128 of
the cache. 0 maps to 0, 129 maps to 1.
ā€¢ More than one memory block is mapped onto the
same position in the cache.
ā€¢ May lead to contention for cache blocks even if the
cache is not full.
ā€¢ Resolve the contention by allowing new block to
replace old block, leading to a trivial replacement
algorithm.
ā€¢ Memory address is divided into three fields:
ļƒ˜ Low order 4 bits determine one of the 16
words in a block.
ļƒ˜ When a new block is brought into the cache,
the next 7 bits determine which cache block
this new block is placed in.
ļƒ˜ High order 5 bits determine which of the
possible 32 blocks is currently present in the
cache. These are tag bits.
Simple to implement but not very flexible.
Direct mapping
Each memory block has only one place to load in Cache memory.
Operation
1.As execution proceeds, the 7-bit cache block field of each address
generated by the processor points to a particular block location in the cache.
2.The high-order 5 bits of the address are compared with the tag bits
associated with that cache location.
3.If they match, then the desired word is in that block of the cache.
4.If there is no match, then the block containing the required word must first
be read from the main memory and loaded into the cache.
5.The direct-mapping technique is easy to implement, but it is not very
flexible.
Associative mapping
1. Main memory block can be placed into any
cache position.
2. Memory address is divided into two fields:
ļƒ¼ Low order 4 bits identify the word within
a block.
ļƒ¼ High order 12 bits or tag bits identify a
memory block when it is resident in the
cache.
3. The tag bits of an address received from the
processor are compared to the tag bits of
each block of the cache to see if the desired
block is present. This is called the
associative-mapping technique.
4. Flexible, and uses cache space efficiently.
5. Replacement algorithms can be used to
replace an existing block in the cache when
the cache is full.
6. Cost is higher than direct-mapped cache
because of the need to search all 128 patterns
to determine whether a given block is in the
cache.
Main
memory BlockĀ 0
BlockĀ 1
BlockĀ 127
BlockĀ 128
BlockĀ 129
BlockĀ 255
BlockĀ 256
BlockĀ 257
BlockĀ 4095
4
MainĀ memoryĀ address
Tag Word
12
tag
tag
tag
Cache
BlockĀ 0
BlockĀ 1
BlockĀ 127
Set-associative mapping
1. Blocks of cache are grouped into sets.
2. Mapping function allows a block of the main
memory to reside in any block of a specific set.
3. Divide the cache into 64 sets, with two blocks per
set.
4. Memory block 0, 64, 128 etc. map to block 0, and
they can occupy either of the two positions.
5. Memory address is divided into three fields:
- 6 bit field determines the set number.
- High order 6 bit fields are compared to the
tag fields of the two blocks in a set.
6. Set-associative mapping combination of direct and
associative mapping.
7. Number of blocks per set is a design parameter.
- One extreme is to have all the blocks in one
set, requiring no set bits (fully associative
mapping).
- Other extreme is to have one block per set,
is the same as direct mapping.
Main
memory BlockĀ 0
BlockĀ 1
BlockĀ 63
BlockĀ 64
BlockĀ 65
BlockĀ 127
BlockĀ 128
BlockĀ 129
BlockĀ 4095
6 4
MainĀ memoryĀ address
Tag Set Word
6
tag
tag
tag
Cache
BlockĀ 1
BlockĀ 2
BlockĀ 126
BlockĀ 127
BlockĀ Ā 3
BlockĀ 0tag
tag
tag
Performance Considerations
ā€¢ A key design objective of a computer system is to
achieve the best possible performance at the lowest
possible cost.
ā€“ Price/performance ratio is a common measure of success.
ā€¢ Performance of a processor depends on:
ā€“ How fast machine instructions can be brought into the
processor for execution.
ā€“ How fast the instructions can be executed.
Memory Interleaving
ļ‚” Divides the memory system into a number of memory
modules.
ļ‚” Each module has its own address buffer register (ABR)
and data buffer register (DBR).
ļ‚” Arranges addressing so that successive words in the
address space are placed in different modules.
ļ‚” When requests for memory access involve
consecutive addresses, the access will be to different
modules.
ļ‚” Since parallel access to these modules is possible,
the average rate of fetching words from the Main
Memory can be increased.
Methods of address layouts
ļ‚” Consecutive words are placed in a
module.
ļ‚” High-order k bits of a memory address
determine the module.
ļ‚” Low-order m bits of a memory address
determine the word within a module.
ļ‚” When a block of words is transferred
from main memory to cache, only one
module is busy at a time.
mĀ bits
AddressĀ inĀ module MMĀ address
i
kĀ bits
Module Module Module
Module
DBRABR DBRABR ABR DBR
0 n 1Ā­ i
kĀ bits
0
ModuleModuleModule
Module MMĀ address
DBRABRABR DBRABR DBR
AddressĀ inĀ module
2
k
1Ā­
mĀ bits
ļ‚” Consecutive words are located in
consecutive modules.
ļ‚” Consecutive addresses can be
located in consecutive modules.
ļ‚” While transferring a block of data,
several memory modules can be kept
busy at the same time.
Hit Rate and Miss Penalty
ā€¢ Hit rate
ā€¢ Miss penalty
ā€¢ Hit rate can be improved by increasing block size, while
keeping cache size constant
ā€¢ Block sizes that are neither very small nor very large
give best results.
ā€¢ Miss penalty can be reduced if load-through approach is
used when loading new blocks into cache.
Caches on the processor chip
ā€¢ In high performance processors 2 levels of caches
are normally used.
ā€¢ Avg access time in a system with 2 levels of caches
is
T ave = h1c1+(1-h1)h2c2+(1-h1)(1-h2)M
VIRTUAL MEMORY
ve the programmer the illusion that the system has a very large memory
n though the computer actually has a relatively small main memory
Address Space(Logical) and Memory Space(Physical)
Address Mapping
Memory Mapping Table for Virtual Address ļƒ  Physical Address
virtual address
(logical address)
physical address
address space memory space
address generated by programs actual main memory address
Mapping
Virtual address
Virtual
address
register
Memory
mapping
table
Memory table
buffer register
Main memory
address
register
Main
memory
Main memory
buffer register
Physical
Address
ADDRESS MAPPING
Organization of memory Mapping
Table in a paged system
Address Space and Memory Space are each divided into fixed size group
of words called blocks or pages
1K words group
Page 0
Page 1
Page 2
Page 3
Page 4
Page 5
Page 6
Page 7
Block 3
Block 2
Block 1
Block 0
Address space
N = 8K = 213
Memory space
M = 4K = 212
0000
1001
1010
0011
0100
1101
1110
0111
1
Block 0
Block 1
Block 2
Block 3
MBR
0 1 0 1 0 1 0 1 0 0 1 1
1 0 1 0 1 0 1 0 1 0 0 1 1
Table
address
Presence
bit
Page no. Line number
Virtual address
Main memory
address register
Memory page table
Main memory
11
00
01
10
01
PAGE FAULT
Processor architecture should provide the ability
to restart any instruction after a page fault.
1. Trap to the OS
2. Save the user registers and program state
3. Determine that the interrupt was a page fault
4. Check that the page reference was legal and
determine the location of the page on the
backing store(disk)
5. Issue a read from the backing store to a free
frame
a. Wait in a queue for this device until serviced
b. Wait for the device seek and/or latency time
c. Begin the transfer of the page to a free frame
6. While waiting, the CPU may be allocated to
some other process
7. Interrupt from the backing store (I/O completed)
8. Save the registers and program state for the other user
9. Determine that the interrupt was from the backing store
10. Correct the page tables (the desired page is now in memory)
11. Wait for the CPU to be allocated to this process again
12. Restore the user registers, program state, and new page table, then
resume the interrupted instruction.
LOAD M
0
Reference1
OS
trap2
3 Page is on backing store
free frame
main memory
4
bring in
missing
page5
reset
page
table
6
restart
instruction
PAGE REPLACEMENT
Modified page fault service routine
Decision on which page to displace to make room for an incoming page
when no free frame is available
1. Find the location of the desired page on the backing store
2. Find a free frame
- If there is a free frame, use it
- Otherwise, use a page-replacement algorithm to select a victim frame
- Write the victim page to the backing store
3. Read the desired page into the (newly) free frame
4. Restart the user process
2
f 0 v i
f v
frame
valid/
invalid bit
page table
change to
invalid
4
reset page
table for
new page
victim
1
swap
out
victim
page
3
swap
desired
page in
backing store
physical memory
First-In-First-Out (FIFO) Algorithm
ā€¢ Replacement is depends upon the arrival time of a page to
memory.
ā€¢ A page is replaced when it is oldest (in the ascending order of
page arrival time to memory).
ā€¢ As it is a FIFO queue no need to record the arrival time of a
page & the page at the head of the queue is replaced.
ā€¢ Performance is not good always
ā€¢ When a active page is replaced to bring a new page, then a
page fault occurs immediately to retrieve the active page.
ā€¢ To get the active page back some other page has to be
replaced. Hence the page fault increases.
FIFO Page Replacement
H
I
T
TWOHITS TWOHITS
15 PAGE FAULTS
Problem with FIFO Algorithm
ā€¢ Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
ā€¢ 3 frames (3 pages can be in memory at a time per process)
ā€¢ 4 frames
ā€¢ Beladyā€™s Anomaly:ļƒ  more frames ā‡’ more page faults
1
2
3
1
2
3
4
1
2
5
3
4
9 page faults
1
2
3
1
2
3
5
1
2
4
5
10 page faults44 3
UNEXPECTED
Page fault
increases
Optimal Algorithm
ā€¢ To recover from beladyā€™s anomaly problem :ļƒ  Use Optimal
page replacement algorithm
ā€¢ Replace the page that will not be used for longest period of
time.
ā€¢ This guarantees lowest possible page fault rate for a fixed
number of frames.
ā€¢ Example :ļƒ 
ā€“ First we found 3 page faults to fill the frames.
ā€“ Then replace page 7 with page 2 because it will
not needed up to the 18th
place in reference
string.
ā€“ Finally there are 09 page faults.
ā€“ Hence it is better than FIFO algorithm (15 page
Faults).
Optimal Page Replacement
H
I
T
H
I
T TWOHITSTWOHITS THREE
HITS
TWOHITS
09 PAGE FAULTS
Difficulty with Optimal Algorithm
ā€¢ Replace page that will not be used for longest period of
time
ā€¢ 4 frames example
1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
ā€¢ Used for measuring how well your algorithm performs.
ā€¢ It always needs future knowledge of reference string.
1
2
3
4
6 page faults
4 5
Least Recently Used (LRU) Algorithm
ā€¢ LRU algorithm lies between FIFO & Optimal algorithm ( in
terms of page faults).
ā€¢ FIFO :ļƒ  time when page brought into memory.
ā€¢ OPTIMAL :ļƒ  time when a page will used.
ā€¢ LRU :ļƒ  Use the recent past as an approximation of near
future (so it cant be replaced), then we will
replace that page which has not been used for longest
period of time. Hence it is least recently used algorithm.
ā€¢ Example :ļƒ 
ā€“ Up to 5th
page fault it is same as optimal algorithm.
ā€“ When page 4 occur LRU chose page 2 for replacement.
ā€“ Here we find only 12 page faults.
LRU Page Replacement
H
IT
H
IT TWOHITS
H
IT
H
IT
TWOHITS
12 PAGE FAULTS
Least Recently Used (LRU) Algorithm
ā€¢ Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
ā€¢ Counter implementation
ā€“ Every page entry has a counter; every time page is
referenced through this entry, copy the clock into the
counter
ā€“ When a page needs to be changed, look at the
counters to determine which are to change
5
2
4
3
1
2
3
4
1
2
5
4
1
2
5
3
1
2
4
3

More Related Content

What's hot

8255 Programmable parallel I/O
8255 Programmable parallel I/O 8255 Programmable parallel I/O
8255 Programmable parallel I/O Muhammed Afsal Villan
Ā 
Computer registers
Computer registersComputer registers
Computer registersDeepikaT13
Ā 
8086 class notes-Y.N.M
8086 class notes-Y.N.M8086 class notes-Y.N.M
8086 class notes-Y.N.MDr.YNM
Ā 
Semiconductor Memories
Semiconductor MemoriesSemiconductor Memories
Semiconductor Memoriesmelisha monteiro
Ā 
8051 memory
8051 memory8051 memory
8051 memoryMayank Garg
Ā 
8051 Microcontroller I/O ports
8051 Microcontroller I/O ports8051 Microcontroller I/O ports
8051 Microcontroller I/O portsanishgoel
Ā 
INTERRUPTS OF 8086 MICROPROCESSOR
INTERRUPTS OF 8086 MICROPROCESSORINTERRUPTS OF 8086 MICROPROCESSOR
INTERRUPTS OF 8086 MICROPROCESSORGurudev joshi
Ā 
Interfacing of io device to 8085
Interfacing of io device to 8085Interfacing of io device to 8085
Interfacing of io device to 8085Nitin Ahire
Ā 
Memory organization (Computer architecture)
Memory organization (Computer architecture)Memory organization (Computer architecture)
Memory organization (Computer architecture)Sandesh Jonchhe
Ā 
Programmable Peripheral Interface 8255
 Programmable Peripheral Interface   8255 Programmable Peripheral Interface   8255
Programmable Peripheral Interface 8255Dr.P.Parandaman
Ā 
I/O system in intel 80386 microcomputer architecture
I/O system in intel 80386 microcomputer architectureI/O system in intel 80386 microcomputer architecture
I/O system in intel 80386 microcomputer architecturekavitha muneeshwaran
Ā 
8051 Microcontroller PPT's By Er. Swapnil Kaware
8051 Microcontroller PPT's By Er. Swapnil Kaware8051 Microcontroller PPT's By Er. Swapnil Kaware
8051 Microcontroller PPT's By Er. Swapnil KawareProf. Swapnil V. Kaware
Ā 
8237 dma controller
8237 dma controller8237 dma controller
8237 dma controllerTech_MX
Ā 
Microcontroller
MicrocontrollerMicrocontroller
MicrocontrollerMaha lakshmi
Ā 
Register Transfer Language,Bus and Memory Transfer
Register Transfer Language,Bus and Memory TransferRegister Transfer Language,Bus and Memory Transfer
Register Transfer Language,Bus and Memory Transferlavanya marichamy
Ā 
Microcontroller 8051
Microcontroller 8051Microcontroller 8051
Microcontroller 8051Sadiq Rahim
Ā 
Assembly Language Basics
Assembly Language BasicsAssembly Language Basics
Assembly Language BasicsEducation Front
Ā 
Programmable peripheral interface 8255
Programmable peripheral interface 8255Programmable peripheral interface 8255
Programmable peripheral interface 8255Marajulislam3
Ā 
Serial Communication
Serial CommunicationSerial Communication
Serial CommunicationUshaRani289
Ā 

What's hot (20)

8255 Programmable parallel I/O
8255 Programmable parallel I/O 8255 Programmable parallel I/O
8255 Programmable parallel I/O
Ā 
Computer registers
Computer registersComputer registers
Computer registers
Ā 
8086 class notes-Y.N.M
8086 class notes-Y.N.M8086 class notes-Y.N.M
8086 class notes-Y.N.M
Ā 
Direct Memory Access
Direct Memory AccessDirect Memory Access
Direct Memory Access
Ā 
Semiconductor Memories
Semiconductor MemoriesSemiconductor Memories
Semiconductor Memories
Ā 
8051 memory
8051 memory8051 memory
8051 memory
Ā 
8051 Microcontroller I/O ports
8051 Microcontroller I/O ports8051 Microcontroller I/O ports
8051 Microcontroller I/O ports
Ā 
INTERRUPTS OF 8086 MICROPROCESSOR
INTERRUPTS OF 8086 MICROPROCESSORINTERRUPTS OF 8086 MICROPROCESSOR
INTERRUPTS OF 8086 MICROPROCESSOR
Ā 
Interfacing of io device to 8085
Interfacing of io device to 8085Interfacing of io device to 8085
Interfacing of io device to 8085
Ā 
Memory organization (Computer architecture)
Memory organization (Computer architecture)Memory organization (Computer architecture)
Memory organization (Computer architecture)
Ā 
Programmable Peripheral Interface 8255
 Programmable Peripheral Interface   8255 Programmable Peripheral Interface   8255
Programmable Peripheral Interface 8255
Ā 
I/O system in intel 80386 microcomputer architecture
I/O system in intel 80386 microcomputer architectureI/O system in intel 80386 microcomputer architecture
I/O system in intel 80386 microcomputer architecture
Ā 
8051 Microcontroller PPT's By Er. Swapnil Kaware
8051 Microcontroller PPT's By Er. Swapnil Kaware8051 Microcontroller PPT's By Er. Swapnil Kaware
8051 Microcontroller PPT's By Er. Swapnil Kaware
Ā 
8237 dma controller
8237 dma controller8237 dma controller
8237 dma controller
Ā 
Microcontroller
MicrocontrollerMicrocontroller
Microcontroller
Ā 
Register Transfer Language,Bus and Memory Transfer
Register Transfer Language,Bus and Memory TransferRegister Transfer Language,Bus and Memory Transfer
Register Transfer Language,Bus and Memory Transfer
Ā 
Microcontroller 8051
Microcontroller 8051Microcontroller 8051
Microcontroller 8051
Ā 
Assembly Language Basics
Assembly Language BasicsAssembly Language Basics
Assembly Language Basics
Ā 
Programmable peripheral interface 8255
Programmable peripheral interface 8255Programmable peripheral interface 8255
Programmable peripheral interface 8255
Ā 
Serial Communication
Serial CommunicationSerial Communication
Serial Communication
Ā 

Similar to Computer Organisation and Architecture

sramanddram.ppt
sramanddram.pptsramanddram.ppt
sramanddram.pptAmalNath44
Ā 
Computer organization memory
Computer organization memoryComputer organization memory
Computer organization memoryDeepak John
Ā 
Unit- 4 Computer Oganization and Architecture
Unit- 4 Computer Oganization and ArchitectureUnit- 4 Computer Oganization and Architecture
Unit- 4 Computer Oganization and Architecturethangamsuresh3
Ā 
memory system notes.pptx
memory system notes.pptxmemory system notes.pptx
memory system notes.pptx1DA21CS173
Ā 
memeoryorganization PPT for organization of memories
memeoryorganization PPT for organization of memoriesmemeoryorganization PPT for organization of memories
memeoryorganization PPT for organization of memoriesGauravDaware2
Ā 
Chapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworldChapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworldPraveen Kumar
Ā 
Module4
Module4Module4
Module4cs19club
Ā 
Random Access Memory
Random Access Memory Random Access Memory
Random Access Memory rohitladdu
Ā 
Memory Hierarchy PPT of Computer Organization
Memory Hierarchy PPT of Computer OrganizationMemory Hierarchy PPT of Computer Organization
Memory Hierarchy PPT of Computer Organization2022002857mbit
Ā 
901320_Main Memory.ppt
901320_Main Memory.ppt901320_Main Memory.ppt
901320_Main Memory.pptChandinialla1
Ā 
Memory Organization
Memory OrganizationMemory Organization
Memory OrganizationMumthas Shaikh
Ā 
Memory (Computer Organization)
Memory (Computer Organization)Memory (Computer Organization)
Memory (Computer Organization)JyotiprakashMishra18
Ā 
COMPUTER ORGANIZATION NOTES Unit 5
COMPUTER ORGANIZATION NOTES Unit 5COMPUTER ORGANIZATION NOTES Unit 5
COMPUTER ORGANIZATION NOTES Unit 5Dr.MAYA NAYAK
Ā 
Memory organization
Memory organizationMemory organization
Memory organizationDhaval Bagal
Ā 

Similar to Computer Organisation and Architecture (20)

sramanddram.ppt
sramanddram.pptsramanddram.ppt
sramanddram.ppt
Ā 
Computer organization memory
Computer organization memoryComputer organization memory
Computer organization memory
Ā 
Unit- 4 Computer Oganization and Architecture
Unit- 4 Computer Oganization and ArchitectureUnit- 4 Computer Oganization and Architecture
Unit- 4 Computer Oganization and Architecture
Ā 
memory system notes.pptx
memory system notes.pptxmemory system notes.pptx
memory system notes.pptx
Ā 
memeoryorganization PPT for organization of memories
memeoryorganization PPT for organization of memoriesmemeoryorganization PPT for organization of memories
memeoryorganization PPT for organization of memories
Ā 
Chapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworldChapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworld
Ā 
Unit IV Memory.pptx
Unit IV  Memory.pptxUnit IV  Memory.pptx
Unit IV Memory.pptx
Ā 
ram.pdf
ram.pdfram.pdf
ram.pdf
Ā 
Module4
Module4Module4
Module4
Ā 
Random Access Memory
Random Access Memory Random Access Memory
Random Access Memory
Ā 
Sram pdf
Sram pdfSram pdf
Sram pdf
Ā 
Memory Hierarchy PPT of Computer Organization
Memory Hierarchy PPT of Computer OrganizationMemory Hierarchy PPT of Computer Organization
Memory Hierarchy PPT of Computer Organization
Ā 
COA (Unit_4.pptx)
COA (Unit_4.pptx)COA (Unit_4.pptx)
COA (Unit_4.pptx)
Ā 
901320_Main Memory.ppt
901320_Main Memory.ppt901320_Main Memory.ppt
901320_Main Memory.ppt
Ā 
RAM Design
RAM DesignRAM Design
RAM Design
Ā 
Memory elements 1
Memory elements  1Memory elements  1
Memory elements 1
Ā 
Memory Organization
Memory OrganizationMemory Organization
Memory Organization
Ā 
Memory (Computer Organization)
Memory (Computer Organization)Memory (Computer Organization)
Memory (Computer Organization)
Ā 
COMPUTER ORGANIZATION NOTES Unit 5
COMPUTER ORGANIZATION NOTES Unit 5COMPUTER ORGANIZATION NOTES Unit 5
COMPUTER ORGANIZATION NOTES Unit 5
Ā 
Memory organization
Memory organizationMemory organization
Memory organization
Ā 

More from Subhasis Dash

Operating System
Operating SystemOperating System
Operating SystemSubhasis Dash
Ā 
Operating System
Operating SystemOperating System
Operating SystemSubhasis Dash
Ā 
Operating System
Operating SystemOperating System
Operating SystemSubhasis Dash
Ā 
Operating System
Operating SystemOperating System
Operating SystemSubhasis Dash
Ā 
Operating System
Operating SystemOperating System
Operating SystemSubhasis Dash
Ā 
Operating System
Operating SystemOperating System
Operating SystemSubhasis Dash
Ā 
Operating System
Operating SystemOperating System
Operating SystemSubhasis Dash
Ā 
Operating System
Operating SystemOperating System
Operating SystemSubhasis Dash
Ā 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and ArchitectureSubhasis Dash
Ā 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and ArchitectureSubhasis Dash
Ā 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and ArchitectureSubhasis Dash
Ā 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer ArchitectureSubhasis Dash
Ā 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer ArchitectureSubhasis Dash
Ā 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer ArchitectureSubhasis Dash
Ā 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer ArchitectureSubhasis Dash
Ā 
Computer Organisation & Architecture (chapter 1)
Computer Organisation & Architecture (chapter 1) Computer Organisation & Architecture (chapter 1)
Computer Organisation & Architecture (chapter 1) Subhasis Dash
Ā 

More from Subhasis Dash (17)

Operating System
Operating SystemOperating System
Operating System
Ā 
Operating System
Operating SystemOperating System
Operating System
Ā 
Operating System
Operating SystemOperating System
Operating System
Ā 
Operating System
Operating SystemOperating System
Operating System
Ā 
Operating System
Operating SystemOperating System
Operating System
Ā 
Operating System
Operating SystemOperating System
Operating System
Ā 
Operating System
Operating SystemOperating System
Operating System
Ā 
Operating System
Operating SystemOperating System
Operating System
Ā 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and Architecture
Ā 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and Architecture
Ā 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and Architecture
Ā 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer Architecture
Ā 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer Architecture
Ā 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer Architecture
Ā 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer Architecture
Ā 
Computer Organisation & Architecture (chapter 1)
Computer Organisation & Architecture (chapter 1) Computer Organisation & Architecture (chapter 1)
Computer Organisation & Architecture (chapter 1)
Ā 
Motherboard
MotherboardMotherboard
Motherboard
Ā 

Recently uploaded

Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
Ā 
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...9953056974 Low Rate Call Girls In Saket, Delhi NCR
Ā 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
Ā 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
Ā 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
Ā 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
Ā 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
Ā 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
Ā 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
Ā 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
Ā 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
Ā 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
Ā 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
Ā 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
Ā 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
Ā 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
Ā 

Recently uploaded (20)

Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
Ā 
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
Ā 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
Ā 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Ā 
young call girls in Green ParkšŸ” 9953056974 šŸ” escort Service
young call girls in Green ParkšŸ” 9953056974 šŸ” escort Serviceyoung call girls in Green ParkšŸ” 9953056974 šŸ” escort Service
young call girls in Green ParkšŸ” 9953056974 šŸ” escort Service
Ā 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Ā 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
Ā 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
Ā 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
Ā 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
Ā 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
Ā 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
Ā 
ā˜… CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
ā˜… CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCRā˜… CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
ā˜… CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
Ā 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
Ā 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
Ā 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
Ā 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
Ā 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
Ā 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
Ā 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
Ā 

Computer Organisation and Architecture

  • 1. ā€¢ Memory Hierarchy ā€¢ Main Memory ā€¢ Associative Memory ā€¢ Cache Memory ā€¢ Virtual Memory ā€¢ Memory Management Hardware MEMORY ORGANIZATION
  • 2. Memory ā€¢ Ideally, 1. Fast 2. Large 3. Inexpensive ā€¢ Is it possible to meet all 3 requirements simultaneously ? Some Basic Concepts ā€¢ What is the max. size of memory? ā€¢ Address space ā€“16-bit : 216 = 64K memory locations ā€“32-bit : 232 = 4G memory locations ā€“40-bit : 240 = 1 T memory locations ā€¢ What is Byte addressable?
  • 3. Introduction ā€¢ Even a sophisticated processor may perform well below an ordinary processor: ā€“Unless supported by matching performance by the memory system. ā€¢ The focus of this module: ā€“Study how memory system performance has been enhanced through various innovations and optimizations.
  • 4. MEMORY HIERARCHY memory Memory Hierarchy is to obtain the highest possible access speed while minimizing the total cost of the memory system Memory Hierarchy Magnetic tapes Magnetic disks I/O processor CPU Main Cache memory Auxiliary memory Register Cache Main Memory Magnetic Disk Magnetic Tape Increasing size Increasing speed Increasing cost
  • 5. Basic Concepts of Memory M A R M D R M e m o r y U p t o 2 a d d r e s s a b le lo c a t io n s k w o r d le n g t h = n b it s C U k - b it a d d r e s s b u s n - b it d a t a b u s C o n t r o l lin e s P r o c e s s o r Connection of the memory to the processo R / W , MFC , etc
  • 6. Basic Concepts of Memory ā€¢ Data transfer between memory & processor takes place through MAR & MDR. ā€¢ If MAR is of K-bit then memory unit contains 2K addressable location. [K number of address lines] ā€¢ If MDR is of n-bit then memory cycle n bits of data transferred between memory & processor. [n number of data lines] ā€¢ Bus also includes control lines Read / Write & MFC for coordinating data transfer. ā€¢ Processor read operation ļƒ  MARin , Read / Write line = 1 , READ , WMFC , MDRin ā€¢ Processor write operationļƒ  ļƒ  MDRin , MARin , MDRout , Read / Write line = 0 , WRITE , WMFC ā€¢ Memory access is synchronized using a clock. ā€¢ Memory Access Time ā€“ Time between start Read and MFC signal [Speed of memory] ā€¢ Memory Cycle Time ā€“ Minimum time delay between initiation of two successive memory operations.[ ]
  • 7. Basic Concepts of Memory P r o c e s s o r R e g is t e r s C a c h e L 1 C a c h e L 2 M a in m e m o r y s e c o n d a r y s t o r a g e m e m o r y in c r e a s in g s iz e in c r e a s in g s p e e d in c r e a s in g c o s t p e r b it SRAM ADRAM 1. Fastest access is to the data held in processor registers. Registers are at the top of the memory hierarchy. 2. Relatively small amount of memory that can be implemented on processor chip. This is processor cache. 3. Two levels of cache. Level 1 (L1) cache is on the processor chip. 4. Level 2 (L2) cache is in between main memory and processor. 5. Next level is main memory, implemented as SIMMs. Much larger, but much slower than cache memory. 6. Next level is magnetic disks. Huge amount of inexpensive storage. 7. Speed of memory access is critical, the idea is to bring instructions and data that will be used in the near future as close to the processor as possible.
  • 8. Basic Concepts of Memory ā€¢ Random Access Memory ļƒ  any location can be accessed for read / write operation in fixed amount of time . ā€¢ Types of RAM :ļƒ  1. Static memory / SRAM : Capable of retain states as long as power is applied, volatile in nature.[High cost & speed] 2. Asynchronous DRAM : Dynamic RAM are less expensive but they do not retain their state indefinitely. Widely used in computers. 3. Synchronous DRAM : Whose operation is directly Synchronized with a clock signal. 4. Performance Parameter :- Bandwidth & Latency. 5. Bandwidth :-Number of bytes transfer in 1 unit of time. 6. Latency:- Amount of time takes to transferred a word of data to & from memory. ā€¢ Read Only Memory / ROM :ļƒ  location can be accessed for read operation only in fixed amount of time . Capable of retain states called as, non-volatile in nature. ā€¢ Programmable ROM : Allows data to be loaded by user. ā€¢ Erasable PROM : Erased [ by UV ray ]Stored data to load new data. ā€¢ Electrically EPROM : Erased by different voltages. ā€¢ Memory uses semiconductor integrated circuit to increase performance. ā€¢ To reduce memory cycle time ļƒ  Use Cache memory ļƒ A small SRAM physically very closed to processor which works on locality of reference. ā€¢ Virtual memory is used to increase the size of the physical memory.
  • 9. Internal Organization of Memory Chips Organization of bit cells in a memory chip 2 4 FF circuit SenseĀ /Ā Write Address decoder FF CS cells Memory circuit SenseĀ /Ā Write SenseĀ /Ā Write circuit DataĀ input /Ā outputĀ lines: A 0 A 1 A 2 A 3 W0 W1 W15 7 1 0 WR / 7 1 0 b7 b1 b0 ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ ā€¢ b bā€™
  • 10. Internal Organization of Memory Chips ā€¢ Memory cells are organized in an array [ Row & Column format ] where each cell is capable of storing one bit of information. ā€¢ Each row of cell contains memory word / data & all cells are connected to a common word line, which is driven by address decoder on chip. ā€¢ Cell in each column are connected to sense / write circuit by 2 bit lines. ā€¢ sense / write circuits are connected to data I/O lines of chip. ā€¢ READ Operation ļƒ  sense / write circuit Sense / Read information stored in cells selected by a word line & transmit same information to o/p data line. ā€¢ WRITE Operation ļƒ  sense / write circuit receive i/p information & store it in the cell. ā€¢ If a memory chip consist of 16 memory words of 8 bit each then it is referred as 16 x 8 organization or 128 x 8 bit organization. ā€¢ The data I/O of each sense / write circuit are connected to a single bidirectional data line that can be connected to the data bus of a computer. ā€¢ 2 control lines Read / Write [Specifies the required operation ] & Chip Select (CS ) [ select a chip in a multichip memory ]. ā€¢ It can store 128 bits & 14 external connections like address, data & control lines.
  • 11. Internal Organization of Memory Chip An Example 32 - to -1 O/P MUX & I/P DMUX Data I/P & O/P
  • 12. Internal Organization of Memory Chip An Example 1k [ 1024 ] Memory Cell ā€¢ Design a memory of 1k [ 1024 ] memory cells. ā€¢ For 1k , we require 10 bits address line. ā€¢ So 5 bits for rows & columns each to access address of the memory cell represented in array. ā€¢ A row address selects a row of 32 cells, all of which accessed in parallel. ā€¢ According to the column address, only one of these cells is connected to the external data line by the output MUX & input DMUX.
  • 13. Static Memories ā€¢ Circuits capable of retaining their state as long as power is applied Static RAM (SRAM) (volatile ). ā€¢ 2 inverters are cross connected to form a latch. ā€¢ Latch is connected to 2 bit lines by transistors T1 & T2. ā€¢ transistors T1 & T2 act as switches can be opened & closed under control of word line. ā€¢ For ground level transistors turned off (initial time cell is in state 1, X=1 & Y=0 ). ā€¢ Read Operation :- 1. Word line activated to close switches T1 & T2 . 2. Cell state either 1 or 0 & the signal on bit line b and bā€™ are always complements to each other. 3. Sense / Write circuit set the end value of bit line as output. ā€¢ Write Operation :- 1. State of the cell is set by placing the actual values on bit line b & its complement on bā€™ and activating the word line. [Sense / Write circuit ] A Static RAM Cell YX WordĀ line BitĀ lines b T2T1 bā€²
  • 14. Asynchronous DRAM ā€¢ SRAMā€™s are fast but very costly due to much number of transistors for their cells. ā€¢ So, less expensive cell, which also canā€™t retain their state indefinitely turn into a memory as dynamic RAM [DRAM]. ā€¢ Data is stored in DRAM cell in form of charge on capacitor but only for a period of tens of milliseconds. An Example of DRAM ā€¢ DRAM cell consist of a capacitor, C , & a transistor, T . ā€¢ To store information in cell, transistor T is turn on, & provide correct amount of voltage to bit line. ā€¢ After transistor turn off capacitor begins to discharge. ā€¢ So, Read operation must be completed before capacitor drops voltage below some threshold value [ by sense amplifier connected to bit line]. Single Transistor Dynamic memory
  • 15. Design 16MB DRAM Chip ā€¢ 2 M x 8 memory chip . ā€¢ Cells are organized in the form of 4K x 4K . ā€¢ 4096 cells in each row divided into 512 group of 8. Hence 512 byte data can be stored in each row. ā€¢ 12 [ 512 x 8 = 212 ] bit address to select row & 9 [ 512 = 212 ] bits to specify a group of 8 bits in the selected row. ā€¢ RAS [Row address strobe] & CAS [Column address strobe] will be crossed to find the proper bit to read or write. Column CSSenseĀ /Ā Write circuits cellĀ arraylatch address Row Column latch decoder Row decoderaddress 4096 512 8( ) R /W A20 9Ā­ A8 0Ā­ā„ D0D7 RAS CAS x x 1. Each row can store 512 bytes. 12 bits to select a row, and 9 bits to select a group in a row. Total of 21 bits. 2. First apply the row address, RAS signal latches the row address. Then apply the column address, CAS signal latches the address. 3. Timing of the memory unit is controlled by a specialized unit which generates RAS and CAS.
  • 16. Fast Page Mode ļ‚” Suppose if we want to access the consecutive bytes in the selected row. ļ‚” This can be done without having to reselect the row. ļ‚§ Add a latch at the output of the sense circuits in each row. ļ‚§ All the latches are loaded when the row is selected. ļ‚§ Different column addresses can be applied to select and place different bytes on the data lines. ļ‚” Consecutive sequence of column addresses can be applied under the control signal CAS, without reselecting the row. ļ‚§ Allows a block of data to be transferred at a much faster rate than random accesses. ļ‚§ A small collection/group of bytes is usually referred to as a block. ļ‚” This transfer capability is referred to as the fast page mode feature.
  • 17. Synchronous DRAM R/ W RAS CAS CS Clock Cell array latch address Row decoder Row decoder Column Read / Write circuits & latchescounter address Column Row/Column address Data input register Data output register Data Refresh counter Mode register and timing control 1. Operation is directly synchronized with processor clock signal. 2. The outputs of the sense circuits are connected to a latch. 3. During a Read operation, the contents of the cells in a row are loaded onto the latches. 4. During a refresh operation, the contents of the cells are refreshed without changing the contents of the latches. 5. Data held in the latches correspond to the selected columns are transferred to the output. 6. For a burst mode of operation, successive columns are selected using column address counter and clock. 7. CAS signal need not be generated externally. A new data is placed during raising edge of the clock
  • 18. Double-Data-Rate SDRAM ā€¢ In addition to faster circuits, new organizational and operational features make it possible to achieve high data rates during block transfers. ā€¢ The key idea is to take advantage of the fact that a large number of bits are accessed at the same time inside the chip when a row address is applied. ā€¢ Various techniques are used to transfer these bits quickly to the pins of the chip. ā€¢ To make the best use of the available clock speed, data are transferred externally on both the rising and falling edges of the clock. For this reason, memories that use this technique are called double-data-rate SDRAMs (DDR SDRAMs). ā€¢ Several versions of DDR chips have been developed. The earliest version is known as DDR. Later versions, called DDR2, DDR3, and DDR4, have enhanced capabilities.
  • 19. Structure of Larger Memory Static memories 19Ā­bitĀ internalĀ chipĀ address decoder 2Ā­bit addresses 21Ā­bit A 0 A 1 A19 Ā memoryĀ chip A20 D31Ā­24 D7Ā­0 D23Ā­16 D 15Ā­8 512 K XĀ 8Ɨ ChipĀ select Ā memoryĀ chip 19Ā­bit address 512 K 8Ɨ 8Ā­bitĀ data input/output Organization of 2M X 32 Memory Modules using 512 K x 8 Static Memory Chip 1. Implement a memory unit of 2M words of 32 bits each. 2. Use 512K x 8 static memory chips. 3. Each column consists of 4 chips. 4. Each chip implements one byte position. 5. A chip is selected by setting its chip select control line to 1. 6. Selected chip places its data on the data output line, outputs of other chips are in high impedance state. 7. 21 bits to address a 32-bit word. 8. High order 2 bits are needed to select the row, by activating the four Chip Select signals. 9. 19 bits are used to access specific byte locations inside the selected chip.
  • 20. Memory Controller M e m o r y M e m o r y C o n t r o lle r R o w / C o lu m n a d d r e s sA d d r e s s d a t a P r o c e s s o r C lo c k R / W C S C A S R A S R / W R e q u e s t C lo c k ā€¢ Memory address are divided into 2 parts. ā€¢ High order address bit which select row in the cell array, are provided first & latched into memory chip under control of RSA signal. ā€¢ Low order address bit , which selects a column are provided on the same address & latched through CSA signal. ā€¢ However, a processor issues all address bits at the same time. ā€¢ In order to achieve multiplexing, memory controller circuit is inserted between processor & memory. ā€¢ Controller accepts a complete address & R/W signal from processor under control of REQUEST signal, which indicates memory access operation is needed. ā€¢ Controller forwards row & column address timing to have address multiplexing function. ā€¢ Then R/W & CS are send to memory. ā€¢ Data lines are directly connected between processor & memory.
  • 21. Read-Only memory (ROM) ā€¢ Many applications need memory devices to retain contents after the power is turned off. ā€“ For example, computer is turned on, the operating system must be loaded from the disk into the memory. ā€“ Store instructions which would load the OS from the disk. ā€“ Need to store these instructions so that they will not be lost after the power is turned off. ā€“ We need to store the instructions into a non-volatile memory. ā€¢ Non-volatile memory is read in the same manner as volatile memory. ā€“ Separate writing process is needed to place information in this memory. ā€“ Normal operation involves only reading of data, this type of memory is called Read-Only memory (ROM).
  • 22. Read-Only memory (ROM) ā€¢ Read-Only Memory: ā€“ Data are written into a ROM when it is manufactured. ā€¢ Programmable Read-Only Memory (PROM): ā€“ Allow the data to be loaded by a user. ā€“ Process of inserting the data is irreversible. ā€“ Storing information specific to a user in a ROM is expensive. ā€¢ Erasable Programmable Read-Only Memory (EPROM): ā€“ Stored data to be erased and new data to be loaded. ā€“ Flexibility, useful during the development phase of digital systems. ā€“ Erasable, reprogrammable ROM. ā€“ Erasure requires exposing the ROM to UV light. ā€¢ Electrically Erasable Programmable Read-Only Memory (EEPROM): ā€“ To erase the contents of EPROMs, they have to be exposed to ultraviolet light. ā€“ Physically removed from the circuit. ā€“ EEPROMs the contents can be stored and erased electrically ā€¢ Flash memory: ā€“ Has similar approach to EEPROM. ā€“ Read contents of a single cell, but write contents of an entire block of cells. ā€“ Higher capacity and low storage cost per bit. ā€“ Power consumption of flash memory is very low, making it attractive for use in equipment that is battery-driven.
  • 23. ā€¢ Reduces the search time efficiently ā€¢ Address is replaced by content of data called as Content Addressable Memory (CAM) ā€¢ Called as Content based data. ā€¢ Hardwired Requirement :ļƒ  ā€“ It contains memory array & logic for m words with n bits per each word. ā€“ Argument register (A) & Key register (k) each have n bits. ā€“ Match register (M) has m bits, one for each word in memory. ā€“ Each word in memory is compared in parallel with the content of argument register and key register. ā€“ If a match found for a word which matches with the bits of argument register & its corresponding bits in the match register then a search for a data word is over. Associative Memory
  • 24. Cache Memory ā€¢ Processor is much faster than the main memory. ā€“ As a result, the processor has to spend much of its time waiting while instructions and data are being fetched from the main memory. ā€“ Major obstacle towards achieving good performance. ā€¢ Speed of the main memory cannot be increased beyond a certain point. ā€¢ Cache memory is an architectural arrangement which makes the main memory appear faster to the processor than it really is. ā€¢ Relatively small SRAM [ Having low access time ] memory located physically closer to processor. ā€¢ Cache memory is based on the property of computer programs known as ā€œLOCALITY OF REFERENCEā€. ā€¢ Analysis of programs indicates that many instructions in localized areas of a program are executed repeatedly during some period of time, while the others are accessed relatively less frequently. ā€“ These instructions may be the ones in a loop, nested loop or few procedures calling each other repeatedly.
  • 25. Locality of Reference ļƒ˜ The references to memory at any given time interval tend to be confined within a localized areas. ļƒ˜ This area contains a set of information and the membership changes gradually as time goes by ļƒ˜ Temporal Locality ļƒ  ļƒ¼ Recently executed instruction is likely to be executed again very soon. ļƒ¼ The information which will be used in near future is likely to be in use already( e.g. Reuse of information in loops). ļƒ˜ Spatial Locality ļƒ  ļƒ¼ Instructions with addresses close to a recently instruction are likely to be executed soon. ļƒ¼ If a word is accessed, adjacent (near) words are likely accessed soon (e.g. Related data items (arrays) are usually stored together; instructions are executed sequentially) ļƒ˜ Cache is a fast small capacity memory that should hold those information which are most likely to be accessed. Cache Memory Main memory Cache memory CPU
  • 26. Cache Memory ā€¢ Processor issues a Read request, a block of words is transferred from the main memory to the cache, one word at a time. ā€¢ Subsequent references to the data in this block of words are found in the cache. ā€¢ At any given time, only some blocks in the main memory are held in the cache. Which blocks in the main memory are in the cache is determined by a ā€œmapping functionā€. ā€¢ When the cache is full, and a block of words needs to be transferred from the main memory, some block of words in the cache must be replaced. This is determined by a ā€œreplacement algorithmā€. Cache Main memoryProcessor
  • 27. Cache Hit ā€¢ Existence of a cache is transparent to the processor. The processor issues Read and Write requests in the same manner. ā€¢ If the data is in the cache it is called a Read or Write hit. ā€¢ Read hit:ļƒ  ā€“ The data is obtained from the cache. ā€¢ Write hit:ļƒ  ā€“ Cache has a replica of the contents of the main memory. ā€“ Contents of the cache and the main memory may be updated simultaneously. This is the write-through protocol. ā€“ Update the contents of the cache, and mark it as updated by setting a bit known as the dirty bit or modified bit. The contents of the main memory are updated when this block is replaced. This is write-back or copy-back protocol.
  • 28. ā€¢ All the memory accesses are directed first to Cache ā€¢ If the word is in Cache; Access cache to provide it to CPU ļƒ  CACHE HIT ā€¢ If the word is not in Cache; Bring a block (or a line) including that word to replace a block now in Cache ļƒ  CACHE MISS ā€¢ Hit Ratio - % of memory accesses satisfied by Cache memory system ā€¢ Te: Effective memory access time in Cache memory system ā€¢ Tc: Cache access time ā€¢ Tm: Main memory access time ā€¢ Te = h*Tc + (1 - h) [Tc+Tm] ā€¢ Example: Tc = 0.4 Āµs, Tm = 1.2Āµs, h = 85% ā€¢ Te = 0.85*0.4 + (1 - 0.85) * 1.6 = 0.58Āµs Performance Of Cache Memory
  • 29. Cache Miss ā€¢ If the data is not present in the cache, then a Read miss or Write miss occurs. ā€¢ Read miss:ļƒ  ā€“ Block of words containing this requested word is transferred from the memory. ā€“ After the block is transferred, the desired word is forwarded to the processor. ā€“ The desired word may also be forwarded to the processor as soon as it is transferred without waiting for the entire block to be transferred. This is called load-through or early-restart. ā€¢ Write-miss:ļƒ  ā€“ Write-through protocol is used, then the contents of the main memory are updated directly. ā€“ If write-back protocol is used, the block containing the addressed word is first brought into the cache. The desired word is overwritten with new information.
  • 30. Cache Coherence Problem ā€¢ A bit called as ā€œvalid bitā€ is provided for each block. ā€¢ If the block contains valid data, then the bit is set to 1, else it is 0. ā€¢ Valid bits are set to 0, when the power is just turned on. ā€¢ When a block is loaded into the cache for the first time, the valid bit is set to 1. ā€¢ Data transfers between main memory and disk occur directly bypassing the cache. ā€¢ When the data on a disk changes, the main memory block is also updated. ā€¢ However, if the data is also resident in the cache, then the valid bit is set to 0. ā€¢ What happens if the data in the disk and main memory changes and the write-back protocol is being used? ā€¢ In this case, the data in the cache may also have changed and is indicated by the dirty bit. ā€¢ The copies of the data in the cache, and the main memory are different. This is called the cache coherence problem. ā€¢ One option is to force a write-back before the main memory is updated from the disk.
  • 31. Cache Memory Mapping Function ā€¢ Specification of correspondence between main memory blocks and cache blocks ā€¢ Mapping functions determine how memory blocks are placed in the cache. ā€¢ Three different types of mapping functions: ā€“ Direct mapping ā€“ Associative mapping ā€“ Set-associative mapping A simple processor example: ā€“ Cache consisting of 128 blocks of 16 words each. ā€“ Total size of cache is 2048 (2K) words. ā€“ Main memory is addressable by a 16-bit address. ā€“ Main memory has 64K words. ā€“ Main memory has 4K blocks of 16 words each.
  • 32. Direct mapping Main memory BlockĀ 0 BlockĀ 1 BlockĀ 127 BlockĀ 128 BlockĀ 129 BlockĀ 255 BlockĀ 256 BlockĀ 257 BlockĀ 4095 7 4 MainĀ memoryĀ address Tag Block Word 5 tag tag tag Cache BlockĀ 0 BlockĀ 1 BlockĀ 127 ā€¢ Each memory block has only one place to load in Cache memory. ā€¢ Block j of the main memory maps to j modulo 128 of the cache. 0 maps to 0, 129 maps to 1. ā€¢ More than one memory block is mapped onto the same position in the cache. ā€¢ May lead to contention for cache blocks even if the cache is not full. ā€¢ Resolve the contention by allowing new block to replace old block, leading to a trivial replacement algorithm. ā€¢ Memory address is divided into three fields: ļƒ˜ Low order 4 bits determine one of the 16 words in a block. ļƒ˜ When a new block is brought into the cache, the next 7 bits determine which cache block this new block is placed in. ļƒ˜ High order 5 bits determine which of the possible 32 blocks is currently present in the cache. These are tag bits. Simple to implement but not very flexible.
  • 33. Direct mapping Each memory block has only one place to load in Cache memory. Operation 1.As execution proceeds, the 7-bit cache block field of each address generated by the processor points to a particular block location in the cache. 2.The high-order 5 bits of the address are compared with the tag bits associated with that cache location. 3.If they match, then the desired word is in that block of the cache. 4.If there is no match, then the block containing the required word must first be read from the main memory and loaded into the cache. 5.The direct-mapping technique is easy to implement, but it is not very flexible.
  • 34. Associative mapping 1. Main memory block can be placed into any cache position. 2. Memory address is divided into two fields: ļƒ¼ Low order 4 bits identify the word within a block. ļƒ¼ High order 12 bits or tag bits identify a memory block when it is resident in the cache. 3. The tag bits of an address received from the processor are compared to the tag bits of each block of the cache to see if the desired block is present. This is called the associative-mapping technique. 4. Flexible, and uses cache space efficiently. 5. Replacement algorithms can be used to replace an existing block in the cache when the cache is full. 6. Cost is higher than direct-mapped cache because of the need to search all 128 patterns to determine whether a given block is in the cache. Main memory BlockĀ 0 BlockĀ 1 BlockĀ 127 BlockĀ 128 BlockĀ 129 BlockĀ 255 BlockĀ 256 BlockĀ 257 BlockĀ 4095 4 MainĀ memoryĀ address Tag Word 12 tag tag tag Cache BlockĀ 0 BlockĀ 1 BlockĀ 127
  • 35. Set-associative mapping 1. Blocks of cache are grouped into sets. 2. Mapping function allows a block of the main memory to reside in any block of a specific set. 3. Divide the cache into 64 sets, with two blocks per set. 4. Memory block 0, 64, 128 etc. map to block 0, and they can occupy either of the two positions. 5. Memory address is divided into three fields: - 6 bit field determines the set number. - High order 6 bit fields are compared to the tag fields of the two blocks in a set. 6. Set-associative mapping combination of direct and associative mapping. 7. Number of blocks per set is a design parameter. - One extreme is to have all the blocks in one set, requiring no set bits (fully associative mapping). - Other extreme is to have one block per set, is the same as direct mapping. Main memory BlockĀ 0 BlockĀ 1 BlockĀ 63 BlockĀ 64 BlockĀ 65 BlockĀ 127 BlockĀ 128 BlockĀ 129 BlockĀ 4095 6 4 MainĀ memoryĀ address Tag Set Word 6 tag tag tag Cache BlockĀ 1 BlockĀ 2 BlockĀ 126 BlockĀ 127 BlockĀ Ā 3 BlockĀ 0tag tag tag
  • 36. Performance Considerations ā€¢ A key design objective of a computer system is to achieve the best possible performance at the lowest possible cost. ā€“ Price/performance ratio is a common measure of success. ā€¢ Performance of a processor depends on: ā€“ How fast machine instructions can be brought into the processor for execution. ā€“ How fast the instructions can be executed.
  • 37. Memory Interleaving ļ‚” Divides the memory system into a number of memory modules. ļ‚” Each module has its own address buffer register (ABR) and data buffer register (DBR). ļ‚” Arranges addressing so that successive words in the address space are placed in different modules. ļ‚” When requests for memory access involve consecutive addresses, the access will be to different modules. ļ‚” Since parallel access to these modules is possible, the average rate of fetching words from the Main Memory can be increased.
  • 38. Methods of address layouts ļ‚” Consecutive words are placed in a module. ļ‚” High-order k bits of a memory address determine the module. ļ‚” Low-order m bits of a memory address determine the word within a module. ļ‚” When a block of words is transferred from main memory to cache, only one module is busy at a time. mĀ bits AddressĀ inĀ module MMĀ address i kĀ bits Module Module Module Module DBRABR DBRABR ABR DBR 0 n 1Ā­ i kĀ bits 0 ModuleModuleModule Module MMĀ address DBRABRABR DBRABR DBR AddressĀ inĀ module 2 k 1Ā­ mĀ bits ļ‚” Consecutive words are located in consecutive modules. ļ‚” Consecutive addresses can be located in consecutive modules. ļ‚” While transferring a block of data, several memory modules can be kept busy at the same time.
  • 39. Hit Rate and Miss Penalty ā€¢ Hit rate ā€¢ Miss penalty ā€¢ Hit rate can be improved by increasing block size, while keeping cache size constant ā€¢ Block sizes that are neither very small nor very large give best results. ā€¢ Miss penalty can be reduced if load-through approach is used when loading new blocks into cache.
  • 40. Caches on the processor chip ā€¢ In high performance processors 2 levels of caches are normally used. ā€¢ Avg access time in a system with 2 levels of caches is T ave = h1c1+(1-h1)h2c2+(1-h1)(1-h2)M
  • 41. VIRTUAL MEMORY ve the programmer the illusion that the system has a very large memory n though the computer actually has a relatively small main memory Address Space(Logical) and Memory Space(Physical) Address Mapping Memory Mapping Table for Virtual Address ļƒ  Physical Address virtual address (logical address) physical address address space memory space address generated by programs actual main memory address Mapping Virtual address Virtual address register Memory mapping table Memory table buffer register Main memory address register Main memory Main memory buffer register Physical Address
  • 42. ADDRESS MAPPING Organization of memory Mapping Table in a paged system Address Space and Memory Space are each divided into fixed size group of words called blocks or pages 1K words group Page 0 Page 1 Page 2 Page 3 Page 4 Page 5 Page 6 Page 7 Block 3 Block 2 Block 1 Block 0 Address space N = 8K = 213 Memory space M = 4K = 212 0000 1001 1010 0011 0100 1101 1110 0111 1 Block 0 Block 1 Block 2 Block 3 MBR 0 1 0 1 0 1 0 1 0 0 1 1 1 0 1 0 1 0 1 0 1 0 0 1 1 Table address Presence bit Page no. Line number Virtual address Main memory address register Memory page table Main memory 11 00 01 10 01
  • 43. PAGE FAULT Processor architecture should provide the ability to restart any instruction after a page fault. 1. Trap to the OS 2. Save the user registers and program state 3. Determine that the interrupt was a page fault 4. Check that the page reference was legal and determine the location of the page on the backing store(disk) 5. Issue a read from the backing store to a free frame a. Wait in a queue for this device until serviced b. Wait for the device seek and/or latency time c. Begin the transfer of the page to a free frame 6. While waiting, the CPU may be allocated to some other process 7. Interrupt from the backing store (I/O completed) 8. Save the registers and program state for the other user 9. Determine that the interrupt was from the backing store 10. Correct the page tables (the desired page is now in memory) 11. Wait for the CPU to be allocated to this process again 12. Restore the user registers, program state, and new page table, then resume the interrupted instruction. LOAD M 0 Reference1 OS trap2 3 Page is on backing store free frame main memory 4 bring in missing page5 reset page table 6 restart instruction
  • 44. PAGE REPLACEMENT Modified page fault service routine Decision on which page to displace to make room for an incoming page when no free frame is available 1. Find the location of the desired page on the backing store 2. Find a free frame - If there is a free frame, use it - Otherwise, use a page-replacement algorithm to select a victim frame - Write the victim page to the backing store 3. Read the desired page into the (newly) free frame 4. Restart the user process 2 f 0 v i f v frame valid/ invalid bit page table change to invalid 4 reset page table for new page victim 1 swap out victim page 3 swap desired page in backing store physical memory
  • 45. First-In-First-Out (FIFO) Algorithm ā€¢ Replacement is depends upon the arrival time of a page to memory. ā€¢ A page is replaced when it is oldest (in the ascending order of page arrival time to memory). ā€¢ As it is a FIFO queue no need to record the arrival time of a page & the page at the head of the queue is replaced. ā€¢ Performance is not good always ā€¢ When a active page is replaced to bring a new page, then a page fault occurs immediately to retrieve the active page. ā€¢ To get the active page back some other page has to be replaced. Hence the page fault increases.
  • 46. FIFO Page Replacement H I T TWOHITS TWOHITS 15 PAGE FAULTS
  • 47. Problem with FIFO Algorithm ā€¢ Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 ā€¢ 3 frames (3 pages can be in memory at a time per process) ā€¢ 4 frames ā€¢ Beladyā€™s Anomaly:ļƒ  more frames ā‡’ more page faults 1 2 3 1 2 3 4 1 2 5 3 4 9 page faults 1 2 3 1 2 3 5 1 2 4 5 10 page faults44 3 UNEXPECTED Page fault increases
  • 48. Optimal Algorithm ā€¢ To recover from beladyā€™s anomaly problem :ļƒ  Use Optimal page replacement algorithm ā€¢ Replace the page that will not be used for longest period of time. ā€¢ This guarantees lowest possible page fault rate for a fixed number of frames. ā€¢ Example :ļƒ  ā€“ First we found 3 page faults to fill the frames. ā€“ Then replace page 7 with page 2 because it will not needed up to the 18th place in reference string. ā€“ Finally there are 09 page faults. ā€“ Hence it is better than FIFO algorithm (15 page Faults).
  • 49. Optimal Page Replacement H I T H I T TWOHITSTWOHITS THREE HITS TWOHITS 09 PAGE FAULTS
  • 50. Difficulty with Optimal Algorithm ā€¢ Replace page that will not be used for longest period of time ā€¢ 4 frames example 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 ā€¢ Used for measuring how well your algorithm performs. ā€¢ It always needs future knowledge of reference string. 1 2 3 4 6 page faults 4 5
  • 51. Least Recently Used (LRU) Algorithm ā€¢ LRU algorithm lies between FIFO & Optimal algorithm ( in terms of page faults). ā€¢ FIFO :ļƒ  time when page brought into memory. ā€¢ OPTIMAL :ļƒ  time when a page will used. ā€¢ LRU :ļƒ  Use the recent past as an approximation of near future (so it cant be replaced), then we will replace that page which has not been used for longest period of time. Hence it is least recently used algorithm. ā€¢ Example :ļƒ  ā€“ Up to 5th page fault it is same as optimal algorithm. ā€“ When page 4 occur LRU chose page 2 for replacement. ā€“ Here we find only 12 page faults.
  • 52. LRU Page Replacement H IT H IT TWOHITS H IT H IT TWOHITS 12 PAGE FAULTS
  • 53. Least Recently Used (LRU) Algorithm ā€¢ Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 ā€¢ Counter implementation ā€“ Every page entry has a counter; every time page is referenced through this entry, copy the clock into the counter ā€“ When a page needs to be changed, look at the counters to determine which are to change 5 2 4 3 1 2 3 4 1 2 5 4 1 2 5 3 1 2 4 3