Digital Electronics and Computer Architecture (DECA)
CPU and its Organization (Unit-4):
(Complete Unit)
Central Processing Unit: Introduction, General Register Organization,
Operation of Control Unit, Control Word, Stack Organization and Instruction Format, Various
Addressing Modes, RISC and CISC Characteristics, Introduction to Parallel Processing,
Flynn’s Classification of Computers, Pipelining, Pipeline Hazards, Direct Memory Access
(DMA), DMA Transfer, Input-Output Processor (IOP)
B.E.-CSE 1st Sem.
Department of Interdisciplinary Courses in Engineering (DICE)
&
Department of Computer Science and Engineering
Central Processing Unit (CPU)
The CPU is the brain of the computer. All data-processing operations and all the
important functions of a computer are performed by the CPU.
The CPU consists of three major units:
1. Memory or Storage Unit
2. Control Unit
3. ALU (Arithmetic Logic Unit)
Central Processing Unit (CPU)
The part of the computer that performs the bulk of data-processing operations is
called the central processing unit (CPU).
The CPU is made up of three major parts: the register set, the control unit, and the
arithmetic logic unit (ALU).
The register set stores intermediate data used during the execution of instructions.
The ALU performs the required micro-operations for executing the instructions.
The control unit generates the sequence of various micro-operations, supervises
the transfer of information among the registers and instructs the ALU as to which
operation to perform.
Central Processing Unit (Cont..)
Memory or Storage Unit:
 The memory unit stores the data and instructions required for processing.
 It also stores the intermediate results of any calculation or task while it is in
progress.
 The final results of processing are held in the memory unit before they are
released to an output device for presentation to the user.
 All inputs and outputs pass through the memory unit.
Central Processing Unit (Cont..)
Control Unit:
 The control unit controls the transfer of data and instructions among the other
parts of the computer.
 It is responsible for managing all the units of the computer.
 Its main task is to obtain instructions and data from the memory unit, interpret
them, and direct the operation of the computer accordingly.
 It is responsible for communicating with input and output devices for the
transfer of data and results from memory.
 The control unit does not itself process or store data.
Central Processing Unit (Cont..)
ALU (Arithmetic Logic Unit) :
 Arithmetic Section
 Logic Section
MAJOR COMPONENTS OF CPU
• Storage Components
Registers
Flags
• Execution (Processing) Components
Arithmetic Logic Unit(ALU)
Arithmetic calculations, Logical computations, Shifts/Rotates
• Transfer Components
Bus
• Control Components
Control Unit
[Block diagram: Register File, ALU, Control Unit]
REGISTERS
• In Basic Computer, there is only one general purpose register, the
Accumulator (AC)
• In modern CPUs, there are many general purpose registers
• It is advantageous to have many registers
– Transfers between registers within the processor are relatively fast
GENERAL REGISTER
ORGANIZATION
[Figure: seven registers R1-R7 plus an external input feed two multiplexers
(selected by SELA and SELB) onto the A bus and B bus; the ALU (operation select
OPR) drives an output bus that a 3 x 8 decoder (SELD) routes back to the
destination register via its load line (7 lines); all registers are clocked.]
- In the basic computer, memory locations are needed to store addresses, operands,
pointers, and temporary results.
- Having to refer to memory for such values is time consuming, because memory
access is the most time-consuming operation in a computer.
- It is more convenient and more efficient to store these intermediate values in
processor registers.
OPERATION OF CONTROL UNIT
The control unit :
Directs the information flow through ALU by
- Selecting various Components in the system
- Selecting the Function of ALU
Example: R1 <- R2 + R3
[1] MUX A selector (SELA): BUS A <- R2
[2] MUX B selector (SELB): BUS B <- R3
[3] ALU operation selector (OPR): ALU to ADD
[4] Decoder destination selector (SELD): R1 <- Out Bus
Control Word
The control word has four fields:
SELA (3 bits) | SELB (3 bits) | SELD (3 bits) | OPR (5 bits)

Encoding of register selection fields:
Binary Code | SELA  | SELB  | SELD
000         | Input | Input | None
001         | R1    | R1    | R1
010         | R2    | R2    | R2
011         | R3    | R3    | R3
100         | R4    | R4    | R4
101         | R5    | R5    | R5
110         | R6    | R6    | R6
111         | R7    | R7    | R7
ALU CONTROL - Microoperations
Encoding of ALU operations (OPR):
Select | Operation      | Symbol
00000  | Transfer A     | TSFA
00001  | Increment A    | INCA
00010  | Add A + B      | ADD
00101  | Subtract A - B | SUB
00110  | Decrement A    | DECA
01000  | AND A and B    | AND
01010  | OR A or B      | OR
01100  | XOR A xor B    | XOR
01110  | Complement A   | COMA
10000  | Shift right A  | SHRA
11000  | Shift left A   | SHLA
Examples of ALU Microoperations
Microoperation  | SELA  | SELB | SELD | OPR  | Control Word
R1 <- R2 - R3   | R2    | R3   | R1   | SUB  | 010 011 001 00101
R4 <- R4 OR R5  | R4    | R5   | R4   | OR   | 100 101 100 01010
R6 <- R6 + 1    | R6    | -    | R6   | INCA | 110 000 110 00001
R7 <- R1        | R1    | -    | R7   | TSFA | 001 000 111 00000
Output <- R2    | R2    | -    | None | TSFA | 010 000 000 00000
Output <- Input | Input | -    | None | TSFA | 000 000 000 00000
R4 <- shl R4    | R4    | -    | R4   | SHLA | 100 000 100 11000
R5 <- 0         | R5    | R5   | R5   | XOR  | 101 101 101 01100
REGISTER STACK ORGANIZATION
Register stack (64-word register stack)
PUSH and POP operations (for insertion and deletion)
/* Initially, SP = 0, EMPTY = 1, FULL = 0 */
PUSH:                           POP:
SP <- SP + 1                    DR <- M[SP]
M[SP] <- DR                     SP <- SP - 1
If (SP = 0) then (FULL <- 1)    If (SP = 0) then (EMPTY <- 1)
EMPTY <- 0                      FULL <- 0
Stack
- Very useful feature
- Also efficient for arithmetic expression evaluation
- Storage which can be accessed in LIFO
- Pointer: SP
- Only PUSH and POP operations are applicable
[Figure: 64-word stack with addresses 0-63; items A, B, C occupy addresses 1-3
with the 6-bit stack pointer SP pointing to the top item C; a data register DR
and the FULL/EMPTY flags surround the stack.]
INSTRUCTION FORMAT
OP-code field - specifies the operation to be performed
Address field - designates memory address(es) or a processor register(s)
Mode field - determines how the address field is to be interpreted (to
get effective address or the operand)
• The number of address fields in the instruction format depends on the
internal organization of CPU
• The three most common CPU organizations:
Single accumulator organization:
ADD X /* AC <- AC + M[X] */
General register organization:
ADD R1, R2, R3 /* R1 <- R2 + R3 */
ADD R1, R2 /* R1 <- R1 + R2 */
MOV R1, R2 /* R1 <- R2 */
ADD R1, X /* R1 <- R1 + M[X] */
Stack organization:
PUSH X /* TOS <- M[X] */
ADD
• Instruction Fields
• Three-Address Instructions
Program to evaluate X = (A + B) * (C + D) :
ADD R1, A, B /* R1 <- M[A] + M[B] */
ADD R2, C, D /* R2 <- M[C] + M[D] */
MUL X, R1, R2 /* M[X] <- R1 * R2 */
- Results in short programs (Advantage)
- Instruction becomes long (many bits)
• Two-Address Instructions
Program to evaluate X = (A + B) * (C + D) :
MOV R1, A /* R1 <- M[A] */
ADD R1, B /* R1 <- R1 + M[B] */
MOV R2, C /* R2 <- M[C] */
ADD R2, D /* R2 <- R2 + M[D] */
MUL R1, R2 /* R1 <- R1 * R2 */
MOV X, R1 /* M[X] <- R1 */
ONE-ADDRESS INSTRUCTIONS (Program)
ZERO-ADDRESS INSTRUCTIONS (Program)
A stack-organized computer uses the zero-address instruction format; in this
type, operands are stored on a push-down stack.
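As a sketch of how a zero-address (stack) machine evaluates X = (A + B) * (C + D), the following Python model executes the stack program with assumed memory values (A=2, B=3, C=4, D=5):

```python
# Tiny zero-address (stack) machine evaluating X = (A + B) * (C + D).
# Memory values are assumed for illustration.
memory = {"A": 2, "B": 3, "C": 4, "D": 5}
stack = []

program = ["PUSH A", "PUSH B", "ADD",
           "PUSH C", "PUSH D", "ADD",
           "MUL", "POP X"]

for instr in program:
    op, *operand = instr.split()
    if op == "PUSH":                     # TOS <- M[addr]
        stack.append(memory[operand[0]])
    elif op == "POP":                    # M[addr] <- TOS
        memory[operand[0]] = stack.pop()
    elif op == "ADD":                    # TOS <- sum of top two items
        stack.append(stack.pop() + stack.pop())
    elif op == "MUL":                    # TOS <- product of top two items
        stack.append(stack.pop() * stack.pop())

# memory["X"] == (2 + 3) * (4 + 5) == 45
```

Note how ADD and MUL carry no address field at all: both operands are implicitly the top two stack items, which is exactly what makes the format "zero-address".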
Practice Questions
Q. What will be the control word for R4 <- R6 OR R7?
Q. Write the control word for the following micro-operation: Output <- Input.
Q. What contents will be left in stack after executing the
following instructions:
PUSH 10
POP
PUSH 5
PUSH 10
POP
PUSH 15
POP
Answer: 5
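The answer can be verified by tracing the sequence in Python (top of stack = last element of the list):

```python
# Trace of the practice question's PUSH/POP sequence.
stack = []
stack.append(10)   # PUSH 10
stack.pop()        # POP
stack.append(5)    # PUSH 5
stack.append(10)   # PUSH 10
stack.pop()        # POP
stack.append(15)   # PUSH 15
stack.pop()        # POP
# stack -> [5]
```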
ADDRESSING MODES
• Addressing modes:
* Specify a rule for interpreting or modifying the address field of the
instruction (before the operand is actually referenced)
* A variety of addressing modes exist:
- to give programming flexibility to the user
- to use the bits in the address field of the instruction efficiently
TYPES OF ADDRESSING MODES
• Implied Mode
The address of the operand is specified implicitly in the definition of the instruction
- No need to specify address in the instruction
- EA = AC, or EA = Stack[SP]
- Examples from Basic Computer
CLA, CME, INP
• Immediate Mode
Instead of specifying the address of the operand, operand itself is specified
- No need to specify address in the instruction
- However, operand itself needs to be specified
- Sometimes, require more bits than the address
- Fast to acquire an operand
TYPES OF ADDRESSING MODES
• Register Mode
Address specified in the instruction is the register address
- The designated operand needs to be in a register
- Shorter address than the memory address
- Saving address field in the instruction
- Faster to acquire an operand than the memory addressing
- EA = IR(R) (IR(R): Register field of IR)
• Register Indirect Mode
Instruction specifies a register which contains the memory address of the operand
- Saving instruction bits since register address
is shorter than the memory address
- Slower to acquire an operand than register addressing, since a memory
access is still needed
- EA = [IR(R)] ([x]: Content of x)
• Autoincrement or Autodecrement Mode
- When the address in the register is used to access memory, the value in the register
is incremented or decremented by 1 automatically
TYPES OF ADDRESSING MODES
• Direct Address Mode
Instruction specifies the memory address which can be used directly to access the
memory
- Faster than the other memory addressing modes
- Too many bits are needed to specify the address for a large physical memory space
- EA = IR(addr) (IR(addr): address field of IR)
• Indirect Addressing Mode
The address field of an instruction specifies the address of a memory location that
contains the address of the operand
- When the abbreviated address is used large physical memory can be addressed with a
relatively small number of bits
- Slow to acquire an operand because of an additional memory access
- EA = M[IR(address)]
TYPES OF ADDRESSING MODES
• Relative Addressing Modes
In this mode, the content of the program counter (PC) is added to the address part of
the instruction in order to obtain the effective address.
The address field of the instruction specifies a part of the address (an
abbreviated address), which is used along with a designated register to
calculate the address of the operand
- Address field of the instruction is short
- Large physical memory can be accessed with a small number of address bits
- EA = f(IR(address), R), R is sometimes implied
Three different relative addressing modes, depending on R:
• PC Relative Addressing Mode (R = PC)
- EA = PC + IR(address)
• Indexed Addressing Mode (R = IX, where IX: Index Register)
- EA = IX + IR(address)
• Base Register Addressing Mode
(R = BAR, where BAR: Base Address Register)
- EA = BAR + IR(address)
ADDRESSING MODES - EXAMPLE and Practice Question
Given: the two-word instruction "Load to AC" is stored at address 200 (mode
field), with its address field (= 500) at address 201; the next instruction is
at 202. At fetch time PC = 200, R1 = 400, XR = 100.
Memory contents: M[399] = 450, M[400] = 700, M[500] = 800, M[600] = 900,
M[702] = 325, M[800] = 300.

Addressing Mode   | Effective Address |                                  | Content of AC
Direct address    | 500               | /* AC <- M[500] */               | 800
Immediate operand | 201               | /* AC <- 500 */                  | 500
Indirect address  | 800               | /* AC <- M[M[500]] */            | 300
Relative address  | 702               | /* AC <- M[PC + 500] */          | 325
Indexed address   | 600               | /* AC <- M[XR + 500] */          | 900
Register          | -                 | /* AC <- R1 */                   | 400
Register indirect | 400               | /* AC <- M[R1] */                | 700
Autoincrement     | 400               | /* AC <- M[R1], R1 <- R1 + 1 */  | 700
Autodecrement     | 399               | /* R1 <- R1 - 1, AC <- M[R1] */  | 450

(For the relative mode, PC = 202 after fetching the two-word instruction, so
EA = 202 + 500 = 702.)
RISC and CISC
• Reduced Instruction Set Computer (RISC) –
The main idea is to make the hardware simpler by using an instruction set
composed of a few basic operations for loading, evaluating, and storing data:
a load instruction loads data, a store instruction stores it.
• Complex Instruction Set Computer (CISC) –
The main idea is that a single instruction performs all the loading, evaluating,
and storing; for example, a multiply instruction will load its data, evaluate
the product, and store the result, hence it is complex.
Continue..
Both approaches try to increase the CPU performance
•RISC: Reduce the cycles per instruction at the cost of the number of
instructions per program.
•CISC: The CISC approach attempts to minimize the number of instructions
per program but at the cost of an increase in the number of cycles per
instruction.
RISC processor is comparatively faster and
consumes less power.
Characteristic of RISC
Earlier, when programming was done in assembly language, there was a need for
instructions that did more work per instruction, because assembly programming
was tedious and error-prone; this led to the CISC architecture. With the rise
of high-level languages, the dependency on assembly reduced and the RISC
architecture prevailed.
Characteristic of RISC –
1. Simpler instructions, hence simple instruction decoding.
2. Instructions are one word in size.
3. An instruction takes a single clock cycle to execute.
4. More general-purpose registers.
5. Simple addressing modes.
6. Fewer data types.
7. Pipelining can be achieved.
Characteristic of CISC
Characteristic of CISC –
1. Complex instructions, hence complex instruction decoding.
2. Instructions are larger than one word in size.
3. An instruction may take more than a single clock cycle to execute.
4. Fewer general-purpose registers, since operations can be performed directly
in memory.
5. Complex addressing modes.
6. More data types.
CISC is a processor which is developed with a full (complex)
set of instructions aimed at providing the full processor
capacity in the most efficient manner.
Continue..
Example – Suppose we have to add two 8-bit numbers:
• CISC approach: There will be a single instruction, such as ADD, which
performs the whole task.
• RISC approach: Here the programmer first writes load instructions to bring
the data into registers, then applies the appropriate operation, and finally
stores the result in the desired location.
So the add operation is divided into parts (load, operate, store), due to which
RISC programs are longer and require more memory to store, but the processor
requires fewer transistors because the instructions are less complex.
Difference
RISC                                            | CISC
Focus on software                               | Focus on hardware
Uses only a hardwired control unit              | Uses both hardwired and microprogrammed control units
Transistors are used for more registers         | Transistors are used for storing complex instructions
Fixed-size instructions                         | Variable-size instructions
Arithmetic operations only register-to-register | REG-to-REG, REG-to-MEM, or MEM-to-MEM operations
Requires more registers                         | Requires fewer registers
Code size is large                              | Code size is small
An instruction executes in a single clock cycle | An instruction takes more than one clock cycle
An instruction fits in one word                 | Instructions can be larger than one word
Parallel Processing
• Parallel processing can be described as a class of techniques which enables the
system to achieve simultaneous data-processing tasks to increase the
computational speed of a computer system.
• The primary purpose of parallel processing is to speed up the computer processing
capability and increase its throughput, i.e. the amount of processing that can be
accomplished during a given interval of time.
• A parallel processing system is able to perform concurrent data processing to
achieve faster execution time. For example, while an instruction is being executed
in the ALU, the next instruction can be read from memory.
Parallel Processing
The following diagram shows one possible way of separating the execution unit into
eight functional units operating in parallel.
The operation performed in each functional unit is indicated in each block of the
diagram:
Parallel Processing
• The adder and integer
multiplier performs the
arithmetic operation with
integer numbers.
• The floating-point
operations are separated into
three circuits operating in
parallel.
• The logic, shift, and
increment operations can be
performed concurrently on
different data. All units are
independent of each other,
so one number can be
shifted while another
number is being
incremented.
Flynn's Classification of Computers
M.J. Flynn proposed a classification for the organization of a computer system by the
number of instructions and data items that are manipulated simultaneously.
• The sequence of instructions read from memory constitutes an instruction
stream.
• The operations performed on the data in the processor constitute a data stream.
• Parallel processing may occur in the instruction stream, in the data stream, or
both.
Flynn's classification divides computers into four major groups that are:
1. Single instruction stream, single data stream (SISD)
2. Single instruction stream, multiple data stream (SIMD)
3. Multiple instruction stream, single data stream (MISD)
4. Multiple instruction stream, multiple data stream (MIMD)
Flynn's Classification of Computers
SISD stands for 'Single Instruction and Single Data Stream'. It represents the
organization of a single computer containing a control unit, a processor unit, and a
memory unit.
Most conventional computers have SISD architecture like the traditional Von-
Neumann computers.
Where, CU = Control Unit,
PE = Processing Element,
M = Memory
Instructions are decoded by the
Control Unit and then the Control
Unit sends the instructions to the
processing units for execution.
Flynn's Classification of Computers - SIMD
SIMD stands for 'Single Instruction and
Multiple Data Stream'. It represents an
organization that includes many processing
units under the supervision of a common
control unit.
All processors receive the same instruction
from the control unit but operate on different
items of data. The shared memory unit must
contain multiple modules so that it can
communicate with all the processors
simultaneously.
SIMD is mainly dedicated to array
processing machines. However,
vector processors can also be seen as a
part of this group.
Flynn's Classification of Computers
MISD stands for 'Multiple Instruction and Single Data stream'.
MISD structure is only of theoretical interest since no practical system has been
constructed using this organization.
In MISD, multiple processing units operate on one single-data stream. Each
processing unit operates on the data independently via separate instruction stream.
Flynn's Classification of Computers
MIMD stands for 'Multiple Instruction and Multiple Data Stream'.
In this organization, all processors in a parallel computer can execute different
instructions and operate on various data at the same time.
In MIMD, each processor has a separate program and an instruction stream is
generated from each program.
Example:
Cray T90, Cray T3E, IBM-SP2
Pipelining
• The term Pipelining refers to a technique of decomposing a sequential process
into sub-operations, with each sub-operation being executed in a dedicated
segment that operates concurrently with all other segments.
• The most important characteristic of a pipeline technique is that several
computations can be in progress in distinct segments at the same time.
• The overlapping of computation is made possible by associating a register with
each segment in the pipeline. The registers provide isolation between each
segment so that each can operate on distinct data simultaneously.
• The structure of a pipeline organization can be represented simply by including an
input register for each segment followed by a combinational circuit.
Pipelining
Let us consider an example of combined
multiplication and addition operation to get
a better understanding of the pipeline
organization.
The combined multiplication and addition
operation is done with a stream of numbers
such as:
Ai* Bi + Ci for i = 1, 2, 3, ......., 7
The operation to be performed on the
numbers is decomposed into sub-operations
with each sub-operation to be implemented
in a segment within a pipeline.
Pipelining
The sub-operations performed in each
segment of the pipeline are defined as:
R1 <- Ai, R2 <- Bi         Input Ai and Bi
R3 <- R1 * R2, R4 <- Ci    Multiply and input Ci
R5 <- R3 + R4              Add Ci to the product
The following block diagram represents the
combined as well as the sub-operations
performed in each segment of the pipeline.
Pipelining
Ai* Bi + Ci for i = 1, 2, 3, ......., 7
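The three-segment pipeline above can be simulated clock by clock. This is a minimal Python sketch (sample data assumed) that mirrors the R1-R5 register transfers; on each clock, later segments consume the values their input registers held at the end of the previous clock:

```python
# Three-segment pipeline computing Ai * Bi + Ci (sample data assumed).
A = [1, 2, 3, 4, 5, 6, 7]
B = [2, 3, 4, 5, 6, 7, 8]
C = [3, 4, 5, 6, 7, 8, 9]

R1 = R2 = R3 = R4 = R5 = None   # segment registers (None = empty)
results = []

for clock in range(len(A) + 2):          # 7 operands + 2 cycles to drain
    # Segment 3: R5 <- R3 + R4 (uses values clocked in last cycle)
    if R3 is not None:
        R5 = R3 + R4
        results.append(R5)
    # Segment 2: R3 <- R1 * R2, R4 <- Ci (Ci for the pair loaded last cycle)
    if R1 is not None:
        R3, R4 = R1 * R2, C[clock - 1]
    else:
        R3 = R4 = None
    # Segment 1: R1 <- Ai, R2 <- Bi
    if clock < len(A):
        R1, R2 = A[clock], B[clock]
    else:
        R1 = R2 = None
```

The simulation finishes all seven operand pairs in nine clocks: one result emerges per clock once the pipeline is full, which is the point of the technique.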
Pipelining
In general, the pipeline organization is applicable for two areas of computer design
which includes:
1. Arithmetic Pipeline
2. Instruction Pipeline
Pipelining
Arithmetic Pipeline
• Arithmetic Pipelines are mostly used in high-speed computers.
• They are used to implement floating-point operations, multiplication of fixed-point
numbers, and similar computations encountered in scientific problems.
To understand the concepts of arithmetic pipeline in a more convenient way, let us consider
an example of a pipeline unit for floating-point addition and subtraction.
The inputs to the floating-point adder pipeline are two normalized floating-point
numbers defined as:
X = A * 10^a = 0.9504 * 10^3
Y = B * 10^b = 0.8200 * 10^2
where A and B are two fractions that represent the mantissas, and a and b are the
exponents (decimal numbers are used here for clarity).
Pipelining
Arithmetic Pipeline
The combined operation of floating-point addition and subtraction is divided into
four segments. Each segment contains the corresponding suboperation to be
performed in the given pipeline. The suboperations that are shown in the four
segments are:
1. Compare the exponents by subtraction.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.
Pipelining
Arithmetic Pipeline
X = 0.9504 * 10^3
Y = 0.8200 * 10^2
where A and B are the fractions that represent the mantissas and a and b are
the exponents.
1. Compare the exponents: 3 - 2 = 1.
2. Align the mantissas:
X = 0.9504 * 10^3
Y = 0.0820 * 10^3
3. Add the mantissas:
Z = 1.0324 * 10^3
4. Normalize the result:
Z = 0.10324 * 10^4
Pipelining
Instruction Pipeline
• Pipeline processing can occur not only in the data stream but in the instruction
stream as well.
• Most of the digital computers with complex instructions require instruction
pipeline to carry out operations like fetch, decode and execute instructions.
In general, the computer needs to process each instruction with the following
sequence of steps.
1. Fetch instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.
Pipelining
Instruction Pipeline
Each step is executed in a particular segment, and there are times when different
segments may take different times to operate on the incoming information.
Moreover, there are times when two or more segments may require memory
access at the same time, causing one segment to wait until another is finished with
the memory.
The organization of an instruction pipeline will be more efficient if the instruction
cycle is divided into segments of equal duration. One of the most common
examples of this type of organization is a Four-segment instruction pipeline.
Pipelining
Instruction Pipeline
• A four-segment instruction pipeline combines two or more of the steps above
into a single segment.
• For instance, decoding the instruction can be combined with the calculation
of the effective address into one segment.
The following block diagram (next slide) shows a typical example of a four-
segment instruction pipeline. The instruction cycle is completed in four segments.
Segment 1: The instruction fetch segment can be implemented using first in, first
out (FIFO) buffer.
Segment 2: The instruction fetched from memory is decoded in the second
segment, and eventually, the effective address is calculated in a separate
arithmetic circuit.
Segment 3: An operand from memory is fetched in the third segment.
Segment 4: The instructions are finally executed in the last segment of the
pipeline organization.
Pipelining
Instruction Pipeline
1. FI is the segment that fetches an instruction.
2. DA is the segment that decodes the instruction
and calculates the effective address.
3. FO is the segment that fetches the operand.
4. EX is the segment that executes the instruction.
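The timing of the four segments can be visualized with a small space-time table. This sketch assumes one clock per segment and no hazards; instruction i (1-based) occupies segment s at clock i + s:

```python
# Space-time table for a four-segment pipeline (FI, DA, FO, EX), no hazards.
SEGMENTS = ["FI", "DA", "FO", "EX"]
n_instr = 5
table = {}                      # (clock, segment name) -> instruction number
for i in range(1, n_instr + 1):
    for s, name in enumerate(SEGMENTS):
        table[(i + s, name)] = i

# k segments and n instructions complete in k + n - 1 clock cycles
total_cycles = len(SEGMENTS) + n_instr - 1    # 4 + 5 - 1 = 8
```

A non-pipelined processor would need 4 x 5 = 20 cycles for the same five instructions, which is where the speedup comes from.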
Pipelining Hazard
• Pipeline hazards are situations that prevent the next instruction in the
instruction stream from executing during its designated clock cycle.
• Any condition that causes a stall in the pipeline operations can be called a
hazard.
There are primarily three types of hazards:
i. Data Hazards
ii. Control Hazards or instruction Hazards
iii. Structural Hazards
Pipelining Hazard
i. Data Hazards:
• A data hazard is any condition in which either the source or the destination
operands of an instruction are not available at the time expected in the pipeline.
• As a result of which some operation has to be delayed and the pipeline stalls.
A data hazard arises whenever there are two instructions, one of which depends
on data produced by the other:
A = 3 + A
B = A * 4
In this sequence, the second instruction needs the value of A computed by the
first instruction; the second instruction is therefore said to depend on the
first.
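Such a dependence can be detected mechanically by intersecting the write set of an instruction with the read set of the one that follows (a minimal sketch of read-after-write detection):

```python
# The example above: I2 reads A, which I1 writes.
i1 = {"writes": {"A"}, "reads": {"A"}}   # A = 3 + A
i2 = {"writes": {"B"}, "reads": {"A"}}   # B = A * 4

# RAW (read-after-write) hazard: a later instruction reads a value that an
# earlier instruction has not yet written back.
raw_hazard = bool(i1["writes"] & i2["reads"])   # non-empty intersection
```

When `raw_hazard` is true, the pipeline must stall the second instruction (or forward the result) until the first has produced its value.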
Pipelining Hazard
ii. Structural Hazards:
• This situation arises mainly when two instructions require a given hardware
resource at the same time and hence for one of the instructions the pipeline
needs to be delayed.
• The most common case is when memory is accessed at the same time by two
instructions. One instruction may need to access the memory as part of the
Execute or Write back phase while other instruction is being fetched.
• If both instructions and data reside in the same memory, the two instructions
cannot proceed together, and one of them must be stalled until the other is
done with its memory access.
• Thus in general, sufficient hardware resources are needed for avoiding
structural hazards.
Pipelining Hazard
iii. Control hazards:
• The instruction fetch unit of the CPU is responsible for providing a stream of
instructions to the execution unit. The instructions fetched by the fetch unit are
in consecutive memory locations and they are executed.
• However, a problem arises when one of the instructions is a branch to some
other memory location. All the instructions fetched into the pipeline from
consecutive memory locations are then invalid and need to be removed (this is
called flushing the pipeline). This induces a stall until new instructions are
fetched from the memory address specified in the branch instruction.
• The time lost as a result is called the branch penalty. Often, dedicated
hardware is incorporated in the fetch unit to identify branch instructions and
compute branch addresses as soon as possible, reducing the resulting delay.
Direct Memory Access (DMA)
Direct Memory Access (DMA)
• The transfer of data between a fast storage device such as a magnetic disk and
memory is often limited by the speed of the CPU.
• Removing the CPU from the path and letting the I/O device manage the memory
buses directly improves the speed of transfer.
• This transfer technique is called direct memory access (DMA).
• During a DMA transfer, the CPU is idle and has no control of the memory buses.
• A DMA controller takes over the buses to manage the transfer directly between
the I/O device and memory.
Direct Memory Access (DMA)
• The bus request (BR) input is used by the DMA controller to ask the CPU to give
up control of the buses.
• When this input is active, the CPU completes the execution of the current
instruction and places the address bus, data bus, and read and write lines into a
disabled (high-impedance) state, i.e. the outputs are disconnected.
• The CPU then activates the bus grant (BG) output to inform the DMA controller
that the buses are disconnected and that it may take control of them to conduct
the memory transfer.
• When the DMA controller terminates the transfer, it disables the bus request
line, and the CPU resumes its normal operation.
DMA Controller
DMA Controller is a hardware device that allows I/O devices to directly access
memory with less participation of the processor.
Direct Memory Access (DMA) Transfer
• DMA controller has to share the bus with the processor to make the data transfer. The device
that holds the bus at a given time is called bus master.
• When a transfer from an I/O device to memory, or vice versa, has to be made, the
processor suspends the execution of the current program, saves its state (program
counter and registers) on the stack, and then sends a DMA select signal to the
DMA controller over the address bus.
• If the DMA controller is free, it requests the control of bus from the processor by raising the
bus request signal. Processor grants the bus to the controller by raising the bus grant signal,
now DMA controller is the bus master.
• The processor initiates the DMA controller by sending the memory addresses, number of
blocks of data to be transferred and direction of data transfer.
• After assigning the data-transfer task to the DMA controller, instead of waiting
idly until the transfer completes, the processor resumes the execution of the
program after restoring its state from the stack.
Direct Memory Access (DMA) Transfer
• The DMA controller now has full control of the buses and can interact directly
with memory and I/O devices independently of the CPU. It makes the data transfer
according to the control instructions received from the processor.
• After completion of data transfer, it disables the bus request signal and CPU disables the bus
grant signal thereby moving control of buses to the CPU.
• When an I/O device wants to initiate the transfer then it sends a DMA request signal to the
DMA controller, for which the controller acknowledges if it is free.
• Then the controller requests the bus from the processor by raising the bus
request signal. After receiving the bus grant signal, it transfers the data from
the device. For an n-channel DMA controller, n external devices can be connected.
Direct Memory Access (DMA) Transfer
Transfer Of Data in Computer By DMA Controller
Direct Memory Access (DMA) Transfer
The DMA transfers the data in three modes, which include the following:
a) Burst Mode: In this mode, the DMA controller hands the buses back to the CPU
only after completion of the whole data transfer. Meanwhile, if the CPU requires
the bus, it has to stay idle and wait for the transfer to finish.
b) Cycle Stealing Mode: In this mode, the DMA controller gives control of the
buses back to the CPU after the transfer of every byte. It continuously issues a
request for bus control, transfers one byte, and returns the bus. This way, the
CPU does not have to wait long if it needs the bus for a higher-priority task.
c) Transparent Mode: Here, the DMA controller transfers data only when the CPU is
executing an instruction that does not require the use of the buses.
Input Output Processors (IOP)
Input Output Processors (IOP)
• The DMA mode of data transfer reduces CPU’s overhead in handling
I/O operations. It also allows parallelism in CPU and I/O operations.
• Such parallelism is necessary to avoid wastage of valuable CPU time
while handling I/O devices whose speeds are much slower as
compared to CPU.
• The concept of DMA operation can be extended to relieve the CPU further from
getting involved with the execution of I/O operations.
• This gives rise to the development of a special-purpose processor called the
Input-Output Processor (IOP).
• The Input-Output Processor (IOP) is just like a CPU that handles the details of
I/O operations.
• It is equipped with more facilities than a typical DMA controller.
• The IOP can fetch and execute its own instructions, which are specifically
designed to characterize I/O transfers.
• In addition to I/O-related tasks, it can perform other processing tasks such as
arithmetic, logic, branching, and code translation.
Block Diagram of IOP (Input output Processor)
IOP
• The Input Output Processor is a specialized processor which loads and stores data
into memory along with the execution of I/O instructions.
• It acts as an interface between system and devices. It involves a sequence of events
to executing I/O operations and then store the results into the memory.
Advantages –
•The I/O devices can directly access the main memory without intervention by the
processor in I/O-processor-based systems.
•It is used to address the problems that arise in the direct memory access (DMA) method.
•Reduced processor workload: With an I/O processor, the main processor doesn’t
have to deal with I/O operations, allowing it to focus on other tasks. This results in more
efficient use of the processor’s resources and can lead to faster overall system
performance.
• Improved data transfer rates: Since the I/O processor can access memory directly,
data transfers between I/O devices and memory can be faster and more efficient than
with other methods.
• Scalability: I/O processor based systems can be designed to scale easily, allowing
for additional I/O processors to be added as needed. This can be particularly useful in
large-scale data centers or other environments where the number of I/O devices is
constantly changing.
71
Disadvantages of IOP
Disadvantages –
1.Cost: I/O processors can add significant cost to a system due to the additional
hardware and complexity required. This can be a barrier to adoption, especially for
smaller systems.
2.Increased complexity: The addition of an I/O processor can increase the overall
complexity of a system, making it more difficult to design, build, and maintain. This
can also make it harder to diagnose and troubleshoot issues.
3.Limited performance gains: While I/O processors can improve system
performance by offloading I/O tasks from the main processor, the gains may not be
significant in all cases. In some cases, the additional overhead of the I/O processor
may actually slow down the system.
4.Synchronization issues: With multiple processors accessing the same memory,
synchronization issues can arise, leading to potential data corruption or other
errors.
72
IOP –CPU Communication
73
IOP-CPU Communication
• There is a communication channel between the IOP and the CPU through which
they coordinate the tasks they perform.
• This channel defines which commands are executed by the IOP and which by the
CPU while programs are running.
• The CPU does not execute the I/O instructions itself; it only initiates the
operations, and the instructions are executed by the IOP.
• An I/O transfer is initiated by the CPU, and the IOP requests the CPU's attention
through an interrupt.
• The CPU starts the communication by giving a "test I/O path" instruction to the
IOP, and then the exchange begins.
• Whenever the CPU receives an interrupt from the IOP requesting access to
memory, it sends a test-path instruction to the IOP.
• The IOP executes it and checks its status. If the status returned to the CPU is OK,
the CPU gives a start instruction to the IOP, hands over some control, and goes
back to another (or the same) program; after that the IOP is able to access
memory for its own program.
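The handshake above can be sketched as two cooperating routines. This is a hedged illustration of the test-path / status / start sequence; the class name, status strings, and method names are all placeholders invented for the sketch.

```python
class IOP:
    """Toy IOP exposing the two operations the CPU uses in the handshake."""
    def __init__(self):
        self.busy = False

    def test_path(self):
        # CPU's "test I/O path" instruction: report whether the channel is usable.
        return "OK" if not self.busy else "BUSY"

    def start(self, io_program):
        # CPU's "start" instruction: run the I/O program to completion.
        self.busy = True
        self.executed = list(io_program)   # stand-in for real I/O work
        self.busy = False
        return "DONE"                      # a real IOP would interrupt the CPU here

def cpu_initiate_io(iop, io_program):
    """CPU side: test the path first, and start the IOP only if the status is OK."""
    if iop.test_path() != "OK":
        return "RETRY_LATER"               # channel busy; try again later
    return iop.start(io_program)           # CPU then resumes its own program

result = cpu_initiate_io(IOP(), ["read sector", "store buffer"])
```

Note the division of labour the slides describe: the CPU only tests and starts; everything between "start" and the completion interrupt happens inside the IOP.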
74
Thank You

Unit 4_DECA_Complete Digital Electronics.pptx

  • 1.
    1 Digital Electronics andComputer Architecture (DECA) CPU and its Organization (Unit-4): (Complete Unit) Central Processing Unit: Introduction, General Register Organization, Operation of Control Unit, Control Word, Stack Organization and Instruction Format, Various Addressing Modes, RISC and CISC Characteristics, Introduction to Parallel Processing, Flynn’s Classification of Computers, Pipelining, Pipeline Hazards, Direct Memory Access (DMA), DMA Transfer, Input-Output Processor (IOP) B.E.-CSE 1st Sem. Department of Interdisciplinary Courses in Engineering (DICE) & Department of Computer Science and Engineering
  • 2.
    2 Central Processing Unit(CPU) CPU is the brain of the computer. All types of data processing operations and all the important functions of a computer are performed by the CPU. The CPU consists of 3 major units, which are: 1. Memory or Storage Unit 2. Control Unit 3. ALU(Arithmetic Logic Unit) Central Processing Unit (CPU)
  • 3.
    3 The part ofthe computer that performs the bulk of data-processing operations is called the central processing unit (CPU). The CPU is made up of three major parts : Register set, Control Unit and Arithmetic Logical Unit (ALU). The register set stores intermediate data used during the execution of instructions. The ALU performs the required micro-operations for executing the instructions. The control unit generates the sequence of various micro-operations, supervises the transfer of information among the registers and instructs the ALU as to which operation to perform. Central Processing Unit (CPU)
  • 4.
  • 5.
    5 Central Processing Unit(Cont..) Memory or Storage Unit:  Data and instructions are stored in memory units which are required for processing.  It also stores the intermediate results of any calculation or task when they are in process.  The final results of processing are stored in the memory units before these results are released to an output device for giving the output to the user.  All sorts of inputs and outputs are transmitted through the memory unit.
  • 6.
    6 Central Processing Unit(Cont..) Control Unit:  Controlling of data and transfer of data and instructions is done by the control unit among other parts of the computer.  The control unit is responsible for managing all the units of the computer.  The main task of the control unit is to obtain the instructions or data which is input from the memory unit, interprets them, and then directs the operation of the computer according to that.  The control unit is responsible for communication with Input and output devices for the transfer of data or results from memory.  The control unit is not responsible for the processing of data or storing data.
  • 7.
    7 Central Processing Unit(Cont..) ALU (Arithmetic Logic Unit) :  Arithmetic Section  Logic Section
  • 8.
    MAJOR COMPONENTS OFCPU • Storage Components Registers Flags • Execution (Processing) Components Arithmetic Logic Unit(ALU) Arithmetic calculations, Logical computations, Shifts/Rotates • Transfer Components Bus • Control Components Control Unit Register File ALU Control Unit
  • 9.
    REGISTERS • In BasicComputer, there is only one general purpose register, the Accumulator (AC) • In modern CPUs, there are many general purpose registers • It is advantageous to have many registers – Transfer between registers within the processor are relatively fast
  • 10.
    GENERAL REGISTER ORGANIZATION MUX SELA {MUX } SELB ALU OPR R1 R2 R3 R4 R5 R6 R7 Input 3 x 8 decoder SELD Load (7 lines) Output A bus B bus Clock -In basic computer, memory locations are needed to store address, operands, pointers, temporary results. -Having to refer to memory locations for such applications is time consuming because memory access is most time consuming operation in computer. -It is more convenient and more efficient to store these intermediate values in processor register. (7 lines)
  • 11.
    OPERATION OF CONTROLUNIT The control unit : Directs the information flow through ALU by - Selecting various Components in the system - Selecting the Function of ALU Example: R1  R2 + R3 [1] MUX A selector (SELA): BUS A  R2 [2] MUX B selector (SELB): BUS B  R3 [3] ALU operation selector (OPR): ALU to ADD [4] Decoder destination selector (SELD): R1  Out Bus Control Word Encoding of register selection fields Binary Code SELA SELB SELD 000 Input Input None 001 R1 R1 R1 010 R2 R2 R2 011 R3 R3 R3 100 R4 R4 R4 101 R5 R5 R5 110 R6 R6 R6 111 R7 R7 R7 SELA SELB SELD OPR 3 3 3 5
  • 12.
    ALU CONTROL-Micro Operations Encodingof ALU operations OPR Select Operation Symbol 00000 Transfer A TSFA 00001 Increment A INCA 00010 ADD A + B ADD 00101 Subtract A - B SUB 00110 Decrement A DECA 01000 AND A and B AND 01010 OR A and B OR 01100 XOR A and B XOR 01110 Complement A COMA 10000 Shift right A SHR A 11000 Shift left A SHL A Examples of ALU Microoperations Symbolic Designation Microoperation SELA SELB SELD OPR Control Word R1 <- R2 - R3 R2 R3 R1 SUB 010 011 001 00101 R4 <- R4 OR R5 R4 R5 R4 OR 100 101 100 01010 R6 <- R6 + 1 R6 - R6 INCA 110 000 110 00001 R7 <- R1 R1 - R7 TSFA 001 000 111 00000 Output <- R2 R2 - None TSFA 010 000 000 00000 Output <- Input Input None TSFA 000 000 000 00000 R4 <- shl R4 R4 - R4 SHLA 100 000 100 11000 R5 <- 0 R5 R5 R5 XOR 101 101 101 01100
  • 13.
    REGISTER STACK ORGANIZATION RegisterStack (64 word register stack) Push, Pop operations ( for insertion and deletion operations) /* Initially, SP = 0, EMPTY = 1, FULL = 0 */ PUSH POP SP <- SP + 1 DR <-M[SP] M[SP] <- DR SP <- SP - 1 If (SP = 0) then (FULL <- 1) If (SP = 0) then (EMPTY <-1) EMPTY <- 0 FULL <- 0 Stack - Very useful feature - Also efficient for arithmetic expression evaluation - Storage which can be accessed in LIFO - Pointer: SP - Only PUSH and POP operations are applicable A B C 0 1 2 3 4 63 Address FULL EMPTY SP DR Flags Stack pointer stack 6 bits
  • 14.
    INSTRUCTION FORMAT OP-code field- specifies the operation to be performed Address field - designates memory address(es) or a processor register(s) Mode field - determines how the address field is to be interpreted (to get effective address or the operand) • The number of address fields in the instruction format depends on the internal organization of CPU • The three most common CPU organizations: Single accumulator organization: ADD X /* AC  AC + M[X] */ General register organization: ADD R1, R2, R3 /* R1  R2 + R3 */ ADD R1, R2 /* R1  R1 + R2 */ MOV R1, R2 /* R1  R2 */ ADD R1, X /* R1  R1 + M[X] */ Stack organization: PUSH X /* TOS  M[X] */ ADD • Instruction Fields
  • 15.
    • Three-Address Instructions Programto evaluate X = (A + B) * (C + D) : ADD R1, A, B /* R1 <- M[A] + M[B] */ ADD R2, C, D /* R2 <- M[C] + M[D] */ MUL X, R1, R2 /* M[X] <- R1 * R2 */ - Results in short programs (Advantage) - Instruction becomes long (many bits) • Two-Address Instructions Program to evaluate X = (A + B) * (C + D) : MOV R1, A /* R1 <- M[A] */ ADD R1, B /* R1 <- R1 + M[A] */ MOV R2, C /* R2  M[C] */ ADD R2, D /* R2  R2 + M[D] */ MUL R1, R2 /* R1  R1 * R2 */ MOV X, R1 /* M[X]  R1 */ THREE, AND TWO-ADDRESS INSTRUCTIONS
  • 16.
  • 17.
    Zero-ADDRESS INSTRUCTIONS (Program) StackOrganized computer uses zero- address type of instruction format and in this type operand stored in Push down stack.
  • 18.
    Practice Questions Q. Whatwill be the control word for R4R6 OR R7? Q. Write control word for the following micro- operation: Outputinput. Q. What contents will be left in stack after executing the following instructions: PUSH 10 POP PUSH 5 PUSH 10 POP PUSH 15 POP 18 Answer: 5
  • 19.
    ADDRESSING MODES • AddressingModes *Specifies a rule for interpreting or modifying the address field of the instruction (before the operand is actually referenced) * Variety of addressing modes - to give programming flexibility to the user - to use the bits in the address field of the instruction efficiently
  • 20.
    TYPES OF ADDRESSINGMODES • Implied Mode Address of the operands are specified implicitly in the definition of the instruction - No need to specify address in the instruction - EA = AC, or EA = Stack[SP] - Examples from Basic Computer CLA, CME, INP • Immediate Mode Instead of specifying the address of the operand, operand itself is specified - No need to specify address in the instruction - However, operand itself needs to be specified - Sometimes, require more bits than the address - Fast to acquire an operand
  • 21.
    TYPES OF ADDRESSINGMODES • Register Mode Address specified in the instruction is the register address - Designated operand need to be in a register - Shorter address than the memory address - Saving address field in the instruction - Faster to acquire an operand than the memory addressing - EA = IR(R) (IR(R): Register field of IR) • Register Indirect Mode Instruction specifies a register which contains the memory address of the operand - Saving instruction bits since register address is shorter than the memory address - Slower to acquire an operand than both the register addressing or memory addressing - EA = [IR(R)] ([x]: Content of x) • Autoincrement or Autodecrement Mode - When the address in the register is used to access memory, the value in the register is incremented or decremented by 1 automatically
  • 22.
    TYPES OF ADDRESSINGMODES Addressing Modes • Direct Address Mode Instruction specifies the memory address which can be used directly to access the memory - Faster than the other memory addressing modes - Too many bits are needed to specify the address for a large physical memory space - EA = IR(addr) (IR(addr): address field of IR) • Indirect Addressing Mode The address field of an instruction specifies the address of a memory location that contains the address of the operand - When the abbreviated address is used large physical memory can be addressed with a relatively small number of bits - Slow to acquire an operand because of an additional memory access - EA = M[IR(address)]
  • 23.
    TYPES OF ADDRESSINGMODES Addressing Modes • Relative Addressing Modes In this mode, the content of the program counter (PC) is added to the address part of the instruction in order to obtain the effective address. The Address fields of an instruction specifies the part of the address (abbreviated address) which can be used along with a designated register to calculate the address of the operand - Address field of the instruction is short - Large physical memory can be accessed with a small number of address bits - EA = f(IR(address), R), R is sometimes implied 3 different Relative Addressing Modes depending on R; * PC Relative Addressing Mode (R = PC) - EA = PC + IR(address) • Indexed Addressing Mode (R = IX, where IX: Index Register) - EA = IX + IR(address) • Base Register Addressing Mode (R = BAR, where BAR: Base Address Register) - EA = BAR + IR(address)
  • 24.
    ADDRESSING MODES -EXAMPLE and Practice Question Addressing Mode Effective Address Content of AC Direct address 500 /* AC  (500) */ 800 Immediate operand 201 /* AC  500 */ 500 Indirect address 800 /* AC  ((500)) */ 300 Relative address 702 /* AC  (PC+500) */ 325 Indexed address 600 /* AC  (RX+500) */ 900 Register - /* AC  R1 */ 400 Register indirect 400 /* AC  (R1) */ 700 Autoincrement 400 /* AC  (R1)+ */ 700 Autodecrement 399 /* AC  -(R) */ 450 Load to AC Mode Address = 500 Next instruction 200 201 202 399 400 450 700 500 800 600 900 702 325 800 300 Memory Address PC = 200 R1 = 400 XR = 100 AC
  • 25.
  • 26.
  • 27.
    RISC and CISC •Reduced Instruction Set Computer (RISC) – The main idea behind this is to make hardware simpler by using an instruction set composed of a few basic steps for loading, evaluating, and storing operations just like a load command will load data, a store command will store the data. • Complex Instruction Set Computer (CISC) – The main idea is that a single instruction will do all loading, evaluating, and storing operations just like a multiplication command will do stuff like loading data, evaluating, and storing it, hence it’s complex. 27
  • 28.
    Continue.. 28 Both approaches tryto increase the CPU performance •RISC: Reduce the cycles per instruction at the cost of the number of instructions per program. •CISC: The CISC approach attempts to minimize the number of instructions per program but at the cost of an increase in the number of cycles per instruction. RISC processor is comparatively faster and consumes less power.
  • 29.
    Characteristic of RISC 29 Earlierwhen programming was done using assembly language, a need was felt to make instruction do more tasks because programming in assembly was tedious and error-prone due to which CISC architecture evolved but with the uprise of high-level language dependency on assembly reduced RISC architecture prevailed. Characteristic of RISC – 1.Simpler instruction, hence simple instruction decoding. 2.Instruction comes in the size of one word. 3.Instruction takes a single clock cycle to get executed. 4.More general-purpose registers. 5.Simple Addressing Modes. 6.Fewer Data types. 7.A pipeline can be achieved.
  • 30.
    Characteristic of CISC 30 Characteristicof CISC – 1.Complex instruction, hence complex instruction decoding. 2.Instructions are larger than one-word size. 3.Instruction may take more than a single clock cycle to get executed. 4.Less number of general-purpose registers as operations get performed in memory itself. 5.Complex Addressing Modes. 6.More Data types. CISC is a processor which is developed with a full (complex) set of instructions aimed at providing the full processor capacity in the most efficient manner.
  • 31.
    Continue.. 31 Example – Supposewe have to add two 8-bit numbers: •CISC approach: There will be a single command or instruction for this like ADD which will perform the task. •RISC approach: Here programmer will write the first load command to load data in registers then it will use a suitable operator and then it will store the result in the desired location. So, add operation is divided into parts i.e. load, operate, store due to which RISC programs are longer and require more memory to get stored but require fewer transistors due to less complex command.
  • 32.
    Difference 32 RISC CISC Focus onsoftware Focus on hardware Uses only Hardwired control unit Uses both hardwired and microprogrammed control unit Transistors are used for more registers Transistors are used for storing complex Instructions Fixed sized instructions Variable sized instructions Can perform only Register to Register Arithmetic operations Can perform REG to REG or REG to MEM or MEM to MEM Requires more number of registers Requires less number of registers Code size is large Code size is small An instruction executed in a single clock cycle Instruction takes more than one clock cycle An instruction fit in one word Instructions are larger than the size of one word
  • 33.
  • 34.
    34 Parallel Processing • Parallelprocessing can be described as a class of techniques which enables the system to achieve simultaneous data-processing tasks to increase the computational speed of a computer system. • The primary purpose of parallel processing is to speed up the computer processing capability and increase its throughput, i.e. the amount of processing that can be accomplished during a given interval of time. • A parallel processing system is able to perform concurrent data processing to achieve faster execution time. For example, while an instruction is being executed in the ALU, the next instruction can be read from memory.
  • 35.
    35 Parallel Processing The followingdiagram shows one possible way of separating the execution unit into eight functional units operating in parallel. The operation performed in each functional unit is indicated in each block of the diagram:
  • 36.
    36 Parallel Processing • Theadder and integer multiplier performs the arithmetic operation with integer numbers. • The floating-point operations are separated into three circuits operating in parallel. • The logic, shift, and increment operations can be performed concurrently on different data. All units are independent of each other, so one number can be shifted while another number is being incremented.
  • 37.
    37 Flynn's Classification ofComputers M.J. Flynn proposed a classification for the organization of a computer system by the number of instructions and data items that are manipulated simultaneously. • The sequence of instructions read from memory constitutes an instruction stream. • The operations performed on the data in the processor constitute a data stream. • Parallel processing may occur in the instruction stream, in the data stream, or both. Flynn's classification divides computers into four major groups that are: 1. Single instruction stream, single data stream (SISD) 2. Single instruction stream, multiple data stream (SIMD) 3. Multiple instruction stream, single data stream (MISD) 4. Multiple instruction stream, multiple data stream (MIMD)
  • 38.
    38 Flynn's Classification ofComputers SISD stands for 'Single Instruction and Single Data Stream'. It represents the organization of a single computer containing a control unit, a processor unit, and a memory unit. Most conventional computers have SISD architecture like the traditional Von- Neumann computers. Where, CU = Control Unit, PE = Processing Element, M = Memory Instructions are decoded by the Control Unit and then the Control Unit sends the instructions to the processing units for execution.
  • 39.
    39 Flynn's Classification ofComputers - SIMD SIMD stands for 'Single Instruction and Multiple Data Stream'. It represents an organization that includes many processing units under the supervision of a common control unit. All processors receive the same instruction from the control unit but operate on different items of data. The shared memory unit must contain multiple modules so that it can communicate with all the processors simultaneously. SIMD is mainly dedicated to array processing machines. However, vector processors can also be seen as a part of this group.
  • 40.
    40 Flynn's Classification ofComputers MISD stands for 'Multiple Instruction and Single Data stream'. MISD structure is only of theoretical interest since no practical system has been constructed using this organization. In MISD, multiple processing units operate on one single-data stream. Each processing unit operates on the data independently via separate instruction stream.
  • 41.
    41 Flynn's Classification ofComputers MIMD stands for 'Multiple Instruction and Multiple Data Stream'. In this organization, all processors in a parallel computer can execute different instructions and operate on various data at the same time. In MIMD, each processor has a separate program and an instruction stream is generated from each program. Example: Cray T90, Cray T3E, IBM-SP2
  • 42.
  • 43.
    43 Pipelining • The termPipelining refers to a technique of decomposing a sequential process into sub-operations, with each sub-operation being executed in a dedicated segment that operates concurrently with all other segments. • The most important characteristic of a pipeline technique is that several computations can be in progress in distinct segments at the same time. • The overlapping of computation is made possible by associating a register with each segment in the pipeline. The registers provide isolation between each segment so that each can operate on distinct data simultaneously. • The structure of a pipeline organization can be represented simply by including an input register for each segment followed by a combinational circuit.
  • 44.
    44 Pipelining Let us consideran example of combined multiplication and addition operation to get a better understanding of the pipeline organization. The combined multiplication and addition operation is done with a stream of numbers such as: Ai* Bi + Ci for i = 1, 2, 3, ......., 7 The operation to be performed on the numbers is decomposed into sub-operations with each sub-operation to be implemented in a segment within a pipeline.
  • 45.
    45 Pipelining The sub-operations performedin each segment of the pipeline are defined as: R1 ← Ai, R2 ← Bi Input Ai, and Bi R3 ← R1 * R2, R4 ← Ci Multiply, and input Ci R5 ← R3 + R4 Add Ci to product The following block diagram represents the combined as well as the sub-operations performed in each segment of the pipeline.
  • 46.
    46 Pipelining Ai* Bi +Ci for i = 1, 2, 3, ......., 7
  • 47.
    47 Pipelining In general, thepipeline organization is applicable for two areas of computer design which includes: 1. Arithmetic Pipeline 2. Instruction Pipeline
  • 48.
    48 Pipelining Arithmetic Pipeline • ArithmeticPipelines are mostly used in high-speed computers. • They are used to implement floating-point operations, multiplication of fixed-point numbers, and similar computations encountered in scientific problems. To understand the concepts of arithmetic pipeline in a more convenient way, let us consider an example of a pipeline unit for floating-point addition and subtraction. The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers defined as: X=A*2^a=0.9504*10^3 Y=B*2^a=0.8200*10^2 Where A and B are two fractions that represent the mantissa and a and b are the exponents.
  • 49.
    49 Pipelining Arithmetic Pipeline The combinedoperation of floating-point addition and subtraction is divided into four segments. Each segment contains the corresponding suboperation to be performed in the given pipeline. The suboperations that are shown in the four segments are: 1. Compare the exponents by subtraction. 2. Align the mantissas. 3. Add or subtract the mantissas. 4. Normalize the result.
  • 50.
    50 Pipelining Arithmetic Pipeline X=0.9504*10^3 Y=0.8200*10^2 Where Aand B are two fractions that represent the mantissa and a and b are the exponents. X=0.9504*10^3 Y=0.0820*10^3 After Addition Z=1.0324*10^3 Normalize the result: Z=0.10324*10^4
  • 51.
    51 Pipelining Instruction Pipeline • Pipelineprocessing can occur not only in the data stream but in the instruction stream as well. • Most of the digital computers with complex instructions require instruction pipeline to carry out operations like fetch, decode and execute instructions. In general, the computer needs to process each instruction with the following sequence of steps. 1. Fetch instruction from memory. 2. Decode the instruction. 3. Calculate the effective address. 4. Fetch the operands from memory. 5. Execute the instruction. 6. Store the result in the proper place.
  • 52.
    52 Pipelining Instruction Pipeline Each stepis executed in a particular segment, and there are times when different segments may take different times to operate on the incoming information. Moreover, there are times when two or more segments may require memory access at the same time, causing one segment to wait until another is finished with the memory. The organization of an instruction pipeline will be more efficient if the instruction cycle is divided into segments of equal duration. One of the most common examples of this type of organization is a Four-segment instruction pipeline.
  • 53.
    53 Pipelining Instruction Pipeline • Afour-segment instruction pipeline combines two or more different segments and makes it as a single one. • For instance, the decoding of the instruction can be combined with the calculation of the effective address into one segment. The following block diagram (next slide) shows a typical example of a four- segment instruction pipeline. The instruction cycle is completed in four segments. Segment 1: The instruction fetch segment can be implemented using first in, first out (FIFO) buffer. Segment 2: The instruction fetched from memory is decoded in the second segment, and eventually, the effective address is calculated in a separate arithmetic circuit. Segment 3: An operand from memory is fetched in the third segment. Segment 4: The instructions are finally executed in the last segment of the pipeline organization.
  • 54.
    54 Pipelining Instruction Pipeline 1. FIis the segment that fetches an instruction. 2. DA is the segment that decodes the instruction and calculates the effective address. 3. FO is the segment that fetches the operand. 4. EX is the segment that execute the instruction.
  • 55.
    55 Pipelining Hazard • Pipelinehazards are situations that prevent the next instruction in the instruction stream from executing during its designated clock cycles. • Any condition that causes a stall in the pipeline operations can be called a hazard. There are primarily three types of hazards: i. Data Hazards ii. Control Hazards or instruction Hazards iii. Structural Hazards
  • 56.
    56 Pipelining Hazard i. DataHazards: • A data hazard is any condition in which either the source or the destination operands of an instruction are not available at the time expected in the pipeline. • As a result of which some operation has to be delayed and the pipeline stalls. Whenever there are two instructions one of which depends on the data obtained from the other. A=3+A B=A*4 For the above sequence, the second instruction needs the value of ‘A’ computed in the first instruction. Thus the second instruction is said to depend on the first.
  • 57.
    57 Pipelining Hazard ii. StructuralHazards: • This situation arises mainly when two instructions require a given hardware resource at the same time and hence for one of the instructions the pipeline needs to be delayed. • The most common case is when memory is accessed at the same time by two instructions. One instruction may need to access the memory as part of the Execute or Write back phase while other instruction is being fetched. • In this case if both the instructions and data reside in the same memory. Both the instructions can’t proceed together and one of them needs to be stalled till the other is done with the memory access part. • Thus in general, sufficient hardware resources are needed for avoiding structural hazards.
  • 58.
    58 Pipelining Hazard iii. Controlhazards: • The instruction fetch unit of the CPU is responsible for providing a stream of instructions to the execution unit. The instructions fetched by the fetch unit are in consecutive memory locations and they are executed. • However the problem arises when one of the instructions is a branching instruction to some other memory location. Thus all the instruction fetched in the pipeline from consecutive memory locations are invalid now and need to be removed(also called flushing of the pipeline).This induces a stall till new instructions are again fetched from the memory address specified in the branch instruction. • Thus the time lost as a result of this is called a branch penalty. Often dedicated hardware is incorporated in the fetch unit to identify branch instructions and compute branch addresses as soon as possible and reducing the resulting delay as a result.
  • 59.
  • 60.
    60 Direct Memory Access(DMA) • The transfer of data between a fast storage device such as magnetic disk and memory is often limited by the speed of the CPU. • Removing CPU from the path and letting the I/O devices manage the memory buses directly would improve the speed of transfer. • This transfer technique is called direct memory access (DMA). • During DMA transfer, the CPU is idle and has no control of the memory buses. • A DMA controller takes over the buses to manage the transfer directly the I/O device and memory.
  • 61.
    61 Direct Memory Access(DMA) • The bus request (BR) input is used by the DMA controller to request the CPU to provide the control of the buses. • When this input is active, CPU terminates the execution of current instruction and places address bus, data bus, and read and write lines into disabled mode (i. e. output is disconnected). • The CPU activates bus grant (BG) output to inform the external DMA that buses are in disconnected mode and give control of buses to conduct memory transfer. • When the DMA terminates the transfer, it disables the bus request line, and inform CPU to come in its normal operation.
  • 62.
    62 DMA Controller DMA Controlleris a hardware device that allows I/O devices to directly access memory with less participation of the processor.
  • 63.
    63 Direct Memory Access(DMA) Transfer • DMA controller has to share the bus with the processor to make the data transfer. The device that holds the bus at a given time is called bus master. • When a transfer from I/O device to the memory or vice verse has to be made, the processor stops the execution of the current program, increments the program counter, moves data over stack then sends a DMA select signal to DMA controller over the address bus. • If the DMA controller is free, it requests the control of bus from the processor by raising the bus request signal. Processor grants the bus to the controller by raising the bus grant signal, now DMA controller is the bus master. • The processor initiates the DMA controller by sending the memory addresses, number of blocks of data to be transferred and direction of data transfer. • After assigning the data transfer task to the DMA controller, instead of waiting ideally till completion of data transfer, the processor resumes the execution of the program after retrieving instructions from the stack. .
    64 Direct Memory Access (DMA) Transfer • The DMA controller now has full control of the buses and can interact directly with memory and the I/O devices, independently of the CPU. It makes the data transfer according to the control instructions received from the processor. • After completing the data transfer, it deactivates the bus request signal and the CPU deactivates the bus grant signal, thereby returning control of the buses to the CPU. • When an I/O device wants to initiate a transfer, it sends a DMA request signal to the DMA controller, which the controller acknowledges if it is free. • The controller then requests the bus from the processor by raising the bus request signal. After receiving the bus grant signal, it transfers the data from the device. With an n-channel DMA controller, n external devices can be connected.
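The programming step described above (the CPU hands the controller a start address, a word count and a direction, then gets out of the loop) can be sketched as follows. This is an illustrative model only; `dma_transfer` and its parameters are assumptions, not a real controller interface.

```python
# Hypothetical sketch of a programmed DMA transfer: the CPU supplies
# start_addr, count and direction, and the controller moves the data
# word by word without further CPU involvement.

def dma_transfer(memory, device_data, start_addr, count, to_memory=True):
    """Copy `count` words between `device_data` and `memory` starting
    at `start_addr`; return the number of words transferred."""
    for i in range(count):
        if to_memory:                         # I/O device -> memory
            memory[start_addr + i] = device_data[i]
        else:                                 # memory -> I/O device
            device_data[i] = memory[start_addr + i]
    return count

memory = [0] * 16
disk_block = [10, 20, 30, 40]
done = dma_transfer(memory, disk_block, start_addr=4, count=4)
# memory[4:8] now holds the disk block; the CPU was never in the copy loop
```

In real hardware the loop body is done by the controller's address and word-count registers (the address increments and the count decrements each cycle); the function above only mirrors that bookkeeping.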
    65 Direct Memory Access (DMA) Transfer Transfer of data in a computer by the DMA controller
    66 Direct Memory Access (DMA) Transfer The DMA controller transfers data in three modes: a) Burst Mode: In this mode, the DMA controller hands the buses back to the CPU only after the whole data transfer is complete. Meanwhile, if the CPU requires the bus, it has to stay idle and wait for the transfer to finish. b) Cycle Stealing Mode: In this mode, the DMA controller returns control of the buses to the CPU after the transfer of every byte. It repeatedly requests bus control, transfers one byte and returns the bus. Thus the CPU does not have to wait long if it needs the bus for a higher-priority task. c) Transparent Mode: Here, the DMA controller transfers data only while the CPU is executing instructions that do not require the use of the buses.
    68 Input-Output Processor (IOP) • The DMA mode of data transfer reduces the CPU's overhead in handling I/O operations. It also allows parallelism between CPU and I/O operations. • Such parallelism is necessary to avoid wasting valuable CPU time while handling I/O devices whose speeds are much slower than the CPU's. • The concept of DMA operation can be extended to relieve the CPU further from involvement in the execution of I/O operations. • This gives rise to a special-purpose processor called the Input-Output Processor (IOP). • The IOP is just like a CPU, except that it handles the details of I/O operations. • It is equipped with more facilities than a typical DMA controller. • The IOP can fetch and execute its own instructions, which are specifically designed to characterize I/O transfers. • In addition to I/O-related tasks, it can perform other processing tasks such as arithmetic, logic, branching and code translation.
    69 Block Diagram of IOP (Input-Output Processor)
    70 IOP • The Input-Output Processor is a specialized processor which loads and stores data into memory along with executing I/O instructions. • It acts as an interface between the system and its devices. It carries out a sequence of events to execute I/O operations and then stores the results into memory. Advantages: • The I/O devices can directly access main memory without intervention by the processor in IOP-based systems. • It addresses the problems that arise with the direct memory access method. • Reduced processor workload: with an I/O processor, the main processor does not have to deal with I/O operations, allowing it to focus on other tasks. This results in more efficient use of the processor's resources and can lead to faster overall system performance. • Improved data transfer rates: since the I/O processor can access memory directly, data transfers between I/O devices and memory can be faster and more efficient than with other methods. • Scalability: IOP-based systems can be designed to scale easily, allowing additional I/O processors to be added as needed. This can be particularly useful in large data centers or other environments where the number of I/O devices is constantly changing.
    71 Disadvantages of IOP 1. Cost: I/O processors can add significant cost to a system due to the additional hardware and complexity required. This can be a barrier to adoption, especially for smaller systems. 2. Increased complexity: the addition of an I/O processor increases the overall complexity of a system, making it more difficult to design, build and maintain. This can also make it harder to diagnose and troubleshoot issues. 3. Limited performance gains: while I/O processors can improve system performance by offloading I/O tasks from the main processor, the gains may not be significant in all cases; in some cases the additional overhead of the I/O processor may actually slow the system down. 4. Synchronization issues: with multiple processors accessing the same memory, synchronization problems can arise, leading to potential data corruption or other errors.
    73 IOP-CPU Communication • There is a communication channel between the IOP and the CPU through which they coordinate I/O tasks. • This channel carries the commands exchanged by the IOP and the CPU while running programs. • The CPU does not execute the I/O instructions itself; it only initiates the operations, and the instructions are executed by the IOP. • An I/O transfer is instructed by the CPU; the IOP signals the CPU through an interrupt. • The dialogue is started by the CPU, which issues a "test IOP path" instruction to the IOP, and then the communication begins. • Whenever the CPU gets an interrupt from the IOP to access memory, it sends a test-path instruction to the IOP. • The IOP executes it and checks the status; if the status returned to the CPU is OK, the CPU gives a start instruction to the IOP, hands it some control, and goes back to another (or the same) program; after that the IOP is able to access memory for its own program.
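The test-path/start dialogue above can be sketched as a short exchange. This is a hypothetical model (the `Iop` class, command strings and status words are assumptions for illustration): the CPU first tests the IOP path, examines the status word, and issues a start command only if the path is OK.

```python
# Illustrative sketch of the CPU-IOP dialogue described above.
# Command and status strings are assumptions, not a real instruction set.

class Iop:
    def __init__(self, healthy=True):
        self.healthy = healthy
        self.running = False

    def command(self, cmd):
        if cmd == "test path":
            # IOP checks its path and returns a status word to the CPU
            return "OK" if self.healthy else "ERROR"
        if cmd == "start":
            self.running = True          # IOP fetches its own I/O program
            return "STARTED"

def cpu_start_io(iop):
    status = iop.command("test path")    # CPU tests the IOP path first
    if status == "OK":
        iop.command("start")             # hand control of the I/O task over
    return status                        # CPU resumes its own program

iop = Iop()
assert cpu_start_io(iop) == "OK" and iop.running
```

The design point mirrored here is that the CPU's involvement ends at the status check: once "start" is issued, the IOP executes its own I/O program and only interrupts the CPU again on completion.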