SlideShare a Scribd company logo
1 of 98
Download to read offline
Processing Unit
Deepak John ,Department Computer Applications,
SJCET-Pala
Fundamental Concepts
 Processor fetches one instruction at a time and perform the
operation specified.
 Instructions are fetched from successive memory locations
until a branch or a jump instruction is encountered.
 Processor keeps track of the address of the memory
location containing the next instruction to be fetched using
Program Counter (PC).
 Instruction Register (IR)
Executing an Instruction
 Fetch the contents of the memory location pointed to by
the PC. The contents of this location are loaded into the
IR (fetch phase).
IR ← [[PC]]
 Assuming that the memory is byte addressable,
increment the contents of the PC by 4 (fetch phase).
PC ← [PC] + 4
 Carry out the actions specified by the instruction in the
IR (execution phase).
Processor Organization
Executing an Instruction
 Transfer a word of data from one processor register to
another or to the ALU.
 Perform an arithmetic or a logic operation and store the
result in a processor register.
 Fetch the contents of a given memory location and load
them into a processor register.
 Store a word of data from a processor register into a given
memory location.
Register Transfers
 For each register two control signals are used
1. To place the contents of that register on the bus
2. To load the data on the bus into register
 The input and output of register are connected to the bus via
switches controlled by the signals Rin and Rout
Performing an Arithmetic or Logic Operation
 The ALU is a combinational circuit that has no
internal storage.
 ALU gets the two operands from MUX and bus. The
result is temporarily stored in register Z.
 What is the sequence of operations to add the
contents of register R1 to those of R2 and store the
result in R3?
1. R1out, Yin
2. R2out, SelectY, Add, Zin
3. Zout, R3in
Fetching a Word from Memory
 Address into MAR; issue Read operation; data into MDR.
 Memory-Function-Completed (MFC)
Execution of Branch Instructions
 A branch instruction replaces the contents of PC with the
branch target address, which is usually obtained by adding
an offset X given in the branch instruction.
 The offset X is usually the difference between the branch
target address and the address immediately following the
branch instruction.
 Conditional branch
Step Action
1 PCou
t
, MAR
in
, Read
,
Select4,Add, Zi
n
2 Zou
t
, PCi
n
, Y
in
, WM
F
C
3 MDR
out
, IRi
n
4 Offset-field-of-IR
out
, Add
,
Z
in
5 Zou
t
, PCi
n
, En
d
Figure 7.7. Control sequence for an unconditional branch instruction.
Multiple-Bus Organization
Register file:all general purpose registers are
combined
Buses A and B:used to transfer the source operands
to the ALU
Step Action
1 PCout, R=B, MAR in, Read, IncPC
2 WMFC
3 MDRoutB, R=B, IRin
4 R4outA, R5outB, SelectA, Add, R6in, End
Figure 7.9. Control sequence for the instruction.
Add R4,R5,R6,for the three-bus organization
Hardwired Control
 To execute instructions, the processor must have some means of
generating the control signals needed in the proper sequence.
 Two categories: hardwired control and microprogrammed control
 Hardwired system can operate at high speed; but with little
flexibility.
Control Unit Organization
CLK
Clock
Control step
IR
encoder
Decoder/
Control signals
codes
counter
inputs
Condition
External
Detailed Block Description
Step Decoder :provides
seperate signal line for
each step or time slot in
the ccontrol sequence.
Instruction decoder :o/p
consists of a seperate
line for each machine
instruction.
For any instruction in IR,
one of the output lines
INS1 through INSm is
set to 1,all other lines are
set to 0
Input signals to the
encoder block are
combined to generate
the individual control
signals
Yin,Pcout,ADD,End etc
Generating Zin
 Zin = T1 + T6 • ADD + T4 • BR + …
T1
AddBranch
T4 T6
ADVANTAGES
 Hardwired Control Unit is fast because control signals are
generated by combinational circuits.
 The delay in generation of control signals depends upon the
number of gates.
DISADVANTAGES
 more complex will be the design of control unit.
 Modifications in control signal are very difficult. That means it
requires rearranging of wires in the hardware circuit.
 It is difficult to correct mistake in original design or adding new
feature in existing design of control unit.
A Complete Processor
Microprogrammed Control
 Control signals are generated by a program similar to
machine language programs.
 Control Word (CW):is a word whose individual bits represent
the various control signals; Each step of the instruction
execution is represented by a control word with all of the bits
corresponding to the control signals needed for the step set
to one.
 Microroutine:a sequence of CW’s corresponding to the
control sequence of a machine instruction
 microinstruction:individual control words in microroutine,
consists of:
One or more micro-operations to be executed.
Address of next microinstruction to be executed.
Control store: micro routines for
all instructions in the instruction
set of a computer are stored
 The previous organization cannot handle the situation when the
control unit is required to check the status of the condition
codes or external inputs to choose between alternative courses
of action.
 Use conditional branch microinstruction.
Address Microinstruction
0 PCout , MARin , Read,Select4,Add, Zin
1 Zout , PCin , Yin , WMFC
2 MDRout , IRin
3 Branchto startingaddress of appropriate microroutine
. ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... .. ... ... .. ... ..
25 If N=0, then branchto microinstruction 0
26 Offset-field-of-IRout , SelectY, Add, Zin
27 Zout , PCin , End
Figure 7.17. Microroutine for the instruction Branch<0.
Figure 7.18. Organization of the control unit to allow conditional branching in the
microprogram.
Control
store
Clock
generator
Starting and
branch address Condition
codes
inputs
External
CW
IR
mPC
Microinstructions
 A straightforward way to structure microinstructions is to
assign one bit position to each control signal.
 However, this is very inefficient.
 The length can be reduced: most signals are not needed
simultaneously, and many signals are mutually exclusive.
 All mutually exclusive signals are placed in the same group
in binary coding.
Partial Format for the Microinstructions
Further Improvement
 Vertical organization:
1. each micro instruction specify only a small number of control
functions.
2. Slower operating speed due to the need of more micro
instructions to perform the desired function.
3. less hardware is needed to handle the execution of
microinstructions.
 Horizontal organization:
1. each micro instruction specify many control signals.
2. its useful when higher operating speed is desired and the
machine structure allows parallel use of resources
Microprogram sequencing
Several operations execute with varying addressing modes
• For example, consider ADD r1, r2, r3; ADD (r1), r2,r3; ADD x(r1), r2,
r3; and ADD (r1)+, r2, r3
• two disadvantages:
Ø Having a separate microroutine for each machine instruction results in
a large total number of microinstructions and a large control store.
Ø Longer execution time because it takes more time to carry out the
required branches.
§ a separate microroutine for each combination of instruction would
produce considerable duplication of common parts.
§ Organize the micro program so that the micro routines share as many
common parts as possible.
Prefetching microinstructions
 Microprogrammed control leads to a slower operating speed
because of the time it take to fetch microinstructions from
control store.
 Avoid this by prefetching the next microinstruction while the
current one is executed.
 prefetching the microinstruction had some difficulties like the
status flags and the result of the current microinstruction are
needed to determine the address of next microinstruction.
Thus a straight forward prefetching occasionally prefetches
wrong microinstruction.
 In this case the fetch must be repeated with correct address.
emulation
 Allows us to replace obsolete equipment with more up
to date machines.
 If the replacement computer fully emulates the original
one then no software changes have have to be made to
run existing programs.
 Facilitates transition to new computer system with
minimal disruption.
ADVANTAGES
 The design of micro-program control unit is less complex
because micro-programs are implemented using software
routines.
 The micro-programmed control unit is more flexible because
design modifications, correction and enhancement is easily
possible.
 The new or modified instruction set of CPU can be easily
implemented by simply rewriting or modifying the contents of
control memory.
 The fault can be easily diagnosed in the micro-program
control unit using diagnostics tools by maintaining the
contents of flags, registers and counters.
DISADVANTAGES
 The micro-program control unit is slower than hardwired
control unit.
 That means to execute an instruction in micro-program
control unit requires more time.
 The micro-program control unit is expensive than hardwired
control unit in case of limited hardware resources.
 The design duration of micro-program control unit is more
than hardwired control unit for smaller CPU.
Arithmetic
Signed Numbers
 Left most bit is sign bit
 0 means positive
 1 means negative
 +18 = 00010010
 -18 = 10010010
•To compute representation of a negative number faster, find the
representation of its absolute value in n bits.
•invert all bits of this number and add 1 to it.
•For example, −10 in 8 bits
10 in 8 bits is 00001010.
Invert all bits to get: 11110101.
Add 1 to it to get: 11110110.
2’s complement number system
Addition/subtraction of signed numbers
si =
ci +1 =
13
7
+ Y
1
0
0
0
1
0
1
1
0
0
1
1
0
1
1
0
0
1
1
0
1
0
0
1
0
0
0
0
1
1
1
1
0
0
0
0
1
1
1
1
Example:
1
0= = 0
0
1 1
1
1 1 0 0
1
1 1 10
xi yi Carry-in ci Sumsi Carry-outci+1
X
Z
+ 6 0+
xi
yi
si
Carry-out
ci+1
Carry-in
ci
xi
yi
ci
xi
yi
ci
xi
yi
ci
xi
yi
ci xi yi ciÅ Å=+ + +
yi
ci
xi
ci
xi
yi+ +
At the ith stage:
Input:
ci is the carry-in
Output:
si is the sum
ci+1 carry-out to (i+1)st
state
Addition logic for a single stage
Full Adder (FA): Symbol for the complete circuit for a single stage of addition.
Full adder
(FA)
ci
ci 1+
s
i
Sum Carry
yi
xi
c
i
yi
xi
c
i
yi
x
i
xi
ci
yi
si
c
i 1+
n-bit adder
•Cascade n full adder (FA) blocks to form a n-bit adder.
•Carries propagate or ripple through this cascade, n-bit ripple carry adder.
FA c0
y1
x1
s1
FA
c 1
y0
x0
s0
FA
cn 1-
y n 1-
xn 1-
cn
sn 1-
Most significant bit
(MSB) position
Least significant bit
(LSB) position
Carry-in c0 into the LSB position provides a convenient way to perform subtraction.
K n-bit adder
K n-bit numbers can be added by cascading k n-bit adders.
n-bit c
0
yn
xn
s
n
cn
y0
xn 1-
s
0
ckn
s
k 1-( )n
x0
yn 1-
y2n 1-
x2n 1-
ykn 1-
s
n 1-
s
2n 1-
s
kn 1-
xkn 1-
adder
n-bit
adder
n-bit
adder
Each n-bit adder forms a block, so this is cascading of blocks.
Carries ripple or propagate through blocks, Blocked Ripple Carry Adder
n-bit subtractor
FA 1
y
1
x
1
s
1
FA
c
1
y
0
x
0
s
0
FA
c
n 1-
y
n 1-
x
n 1-
c
n
s
n 1-
Most significant bit
(MSB) position
Least significant bit
(LSB) position
•Recall X – Y is equivalent to adding 2’s complement of Y to X.
•2’s complement is equivalent to 1’s complement + 1.
•X – Y = X + Y + 1
•2’s complement of positive and negative numbers is computed
similarly.
n-bit adder/subtractor
•Add/sub control = 0, addition.
•Add/sub control = 1, subtraction.
Add/Sub
control
n-bit adder
x
n 1-
x
1
x
0
c
n
s
n 1- s
1
s
0
c
0
y
n 1-
y
1
y
0
Detecting overflows
 Overflows can only occur when the sign of the two operands
is the same.
 Overflow occurs if the sign of the result is different from the
sign of the operands.
 Recall that the MSB represents the sign.
 xn-1, yn-1, sn-1 represent the sign of operand x, operand y and result s
respectively.
 Circuit to detect overflow can be implemented by the
following logic expressions:
111111 −−−−−− += nnnnnn syxsyxOverflow
1−⊕= nn ccOverflow
Computing the add time
Consider 0th stage:
x0
y0
c0
c1
s0
FA
•c1 is available after 2 gate delays.
•s1 is available after 1 gate delay.
c
i
yi
xi
c
i
yi
x
i
xi
ci
yi
si
c
i 1+
Sum Carry
Computing the add time (contd..)
x0
y0
s2
FA
x0 y0x0
y0
s1
FA
c2
s0
FA
c1c3
c0
x0
y0
s3
FA
c4
Cascade of 4 Full Adders, or a 4-bit adder
•s0 available after 1 gate delays, c1 available after 2 gate delays.
•s1 available after 3 gate delays, c2 available after 4 gate delays.
•s2 available after 5 gate delays, c3 available after 6 gate delays.
•s3 available after 7 gate delays, c4 available after 8 gate delays.
For an n-bit adder, sn-1 is available after 2n-1 gate delays ,cn is
available after 2n gate delays.
Fast addition
Recall the equations:
iiiiiii
iiii
cycxyxc
cyxs
++=
⊕⊕=
+1
Second equation can be written as:
iiiiii cyxyxc )(1 ++=+
We can write:
iiiiii
iiii
yxPandyxGwhere
cPGc
+==
+=+1
•Gi is called generate function and Pi is called propagate function
•Gi and Pi are computed only from xi and yi and not ci, thus they can be
computed in one gate delay after X and Y are applied to the inputs of an n-
bit adder.
Carry lookahead
ci +1 = Gi + Pi ci
ci = Gi −1 + Pi −1ci −1
⇒ ci+1 = Gi + Pi (Gi −1 + Pi −1ci −1)
continuing
⇒ ci+1 = Gi + Pi (Gi −1 + Pi −1(Gi − 2 + Pi− 2ci −2 ))
until
ci+1 = Gi + PiGi −1 + Pi Pi−1Gi −2 + .. + Pi Pi −1..P1G0 + Pi Pi −1...P0c0
•All carries can be obtained 3 gate delays after X, Y and c0 are applied.
-One gate delay for Pi and Gi
-Two gate delays in the AND-OR circuit for ci+1
•All sums can be obtained 1 gate delay after the carries are computed.
•Independent of n, n-bit addition requires only 4 gate delays.
•This is called Carry Lookahead adder.
Carry-lookahead adder
Carry-lookahead logic
B cell B cell B cell B cell
s
3
P3
G3
c
3
P2
G2
c
2
s
2
G
1
c
1
P
1
s
1
G
0
c
0
P
0
s
0
.c4
x
1
y
1
x
3
y
3
x
2
y
2
x
0
y
0
G i
c
i
..
.
P i
s i
x i
y i
B cell
4-bit
carry-lookahead
adder
B-cell for a single stage
C1=G0+P0C0
C2=G1+P1G0+P1P0C0
C3=G2+P2G1+P2P1G0+P2P1P0C0
C4=G3+P3G2+P3P2G1+P3P2P1G0+P3P2P1P0C0
Blocked Carry-Lookahead adder
c4 = G3 + P3G2 + P3P2G1 + P3P2 P1G0 + P3P2 P1P0c0
Carry-out from a 4-bit block can be given as:
Rewrite this as:
P0
I
= P3P2 P1P0
G0
I
= G3 + P3G2 + P3P2G1 + P3P2P1G0
Subscript I denotes the blocked carry lookahead and identifies the block.
Cascade 4 4-bit adders, c16 can be expressed as:
c16 = G3
I
+ P3
I
G2
I
+ P3
I
P2
I
G1
I
+ P3
I
P2
I
P1
0
G0
I
+ P3
I
P2
I
P1
0
P0
0
c0
In order to add operands longer than 4 bits, we can cascade 4-
bit Carry-Lookahead adders. Cascade of Carry-Lookahead
adders is called Blocked Carry-Lookahead adder.
Blocked Carry-Lookahead adder
Carry-lookahead logic
4-bit adder 4-bit adder 4-bit adder 4-bit adder
s15-12
P3
IG3
I
c12
P2
IG2
I
c8
s11-8
G1
I
c4
P1
I
s7-4
G0
I
c0
P0
I
s3-0
c16
x15-12
y15-12
x11-8
y11-8
x7-4
y7-4
x3-0
y3-0
.
After xi, yi and c0 are applied as inputs:
- Gi and Pi for each stage are available after 1 gate delay.
- PI is available after 2 and GI after 3 gate delays.
- All carries are available after 5 gate delays.
- c16 is available after 5 gate delays.
- s15 which depends on c12 is available after 8 (5+3)gate delays (Recall that
for a 4-bit carry lookahead adder, the last sum bit is available 3 gate delays
after all inputs are available)
Multiplication
Multiplication of unsigned numbers
Product of 2 n-bit numbers is at most a 2n-bit number.
Unsigned multiplication can be viewed as addition of shifted versions of the
multiplicand.
Multiplication of unsigned numbers (contd..)
 Rules to implement multiplication are:
 If the 0th bit of multiplier is 1, then add the multiplicand with 0
 If the 0th bit of multiplier is 0 ,then current partial product is 0.
 If the ith bit (except 0th bit) of the multiplier is 1, shift the
multiplicand and add the shifted multiplicand to the current
value of the partial product.
 Hand over the partial product to the next stage
 If the ith bit of the multiplier is 0, shift the multiplicand and add
the current partial product to the 0.
 Value of the partial product at the start stage is 0.
Multiplication of unsigned numbers
ith multiplier bit
carry incarry out
jth multiplicand bit
Bit of incoming partial product (PPi)
Bit of outgoing partial product (PP(i+1))
FA
Typical multiplication cell
Combinational array multiplier
Multiplicand
m 3
m 2
m 1
m 00 0 0 0
q3
q 2
q 1
q0
0
p 2
p 1
p 0
0
0
0
p3
p 4
p 5
p 6
p 7
PP1
PP2
PP3
(PP0)
,
Product is: p7,p6,..p0
Multiplicand is shifted by displacing it through an array of adders.
 Combinatorial array multipliers are:
 Extremely inefficient.
 Have a high gate count for multiplying numbers of practical
size such as 32-bit or 64-bit numbers.
 Perform only one function, namely, unsigned integer
product.
 Improve gate efficiency by using a mixture of combinatorial
array techniques and sequential techniques requiring less
combinational logic.
Sequential Circuit Multiplier
q
n 1-
m
n 1-
n-bit
Adder
Multiplicand M
Control
sequencer
Multiplier Q
0
C
Shift right
Register A (initially 0)
Add/Noadd
control
a
n 1-
a
0
q
0
m
0
0
MUX
1 1 1 1
1 0 1 1
1 1 1 1
1 1 1 0
1 1 1 0
1 1 0 1
1 1 0 1
Initial configuration
Add
M
1 1 0 1
C
First cycle
Second cycle
Third cycle
Fourth cycle
No add
Shift
Shift
Add
Shift
Shift
Add
1 1 1 1
0
0
0
1
0
0
0
1
0
0 0 0 0
0 1 1 0
1 1 0 1
0 0 1 1
1 0 0 1
0 1 0 0
0 0 0 1
1 0 0 0
1 0 0 1
1 0 1 1
QA
Product
Signed Multiplication
Signed Multiplication
 Considering 2’s-complement signed operands,
 Extend the sign bit value of the multiplicand to the left as far as the
product will extend
 (-13)(+11)
1
0
11 11 1 1 0 0 1 1
110
110
1
0
1000111011
000000
1100111
00000000
110011111
13-( )
143-( )
11+( )
Signed Multiplication
 For a negative multiplier, a straightforward solution is
to form the 2’s-complement of both the multiplier and
the multiplicand and proceed as in the case of a
positive multiplier.
 This is possible because complementation of both
operands does not change the value or the sign of
the product.
 A technique that works equally well for both negative
and positive multipliers – Booth algorithm.
Booth Algorithm
Multiplier
Bit i Bit i 1-
V ersion of multiplicand
selected by bit
0
1
0
0
01
1 1
0 M
1+ M
1 M
0 M
Booth multiplier recoding table.
X
X
X
X
Booth multiplication with a negative multiplier.
01
0
1 1 1 1 0 1 1
0 0 0 0 0 0 0 0 0
00
0110
0 0 0 0 1 1 0
1100111
0 0 0 0 0 0
01000 11111
1
1
0 1 1 0 1
1 1 0 1 0 6-( )
13+( )
X
78-( )
+11- 1-
2's complement of
the multiplicand
Fast Multiplication
Bit-Pair Recoding of Multipliers
 Bit-pair recoding halves the maximum number of
summands (versions of the multiplicand).
1+1
(a) Example of bit-pair recoding derived from Booth recoding
0
000
1 1 0 1 0
Implied 0 to right of LSB
1
0
Sign extension
1
21 

1-
0000
1 1 1 1 1 0
0 0 0 0 11
1 1 1 1 10 0
0 0 0 0 0 0
0000 111111
0 1 1 0 1
0
1 010011111
1 1 1 1 0 0 1 1
0 0 0 0 0 0
1 1 1 0 1 1 0 0 1 0
0
1
0 0
1 0
1
0 0
0
0 1
0
0 1
10
0
010
0 1 1 0 1
11
1-
6-( )
13+( )
1+
78-( )
1- 2-
´
Figure 6.15. Multiplication requiring only n/2 summands.
i 1+ i 1
(b) Table of multiplicand selection decisions
selected at positioni
MultiplicandMultiplier bit-pair
i
0
0
1
1
1
0
1
0
1
1
1
1
0
0
0
1
1
0
0
1
0
0
1
Multiplier bit on the right
0 0 X M
1+
1
1+
0
1
2
2+




X M
X M
X M
X M
X M
X M
X M
Carry-Save Addition of Summands
 CSA speeds up the addition process.
P7 P6 P5 P4 P3 P2 P1 P0
P1 P0P5P7 P6
 Consider the addition of many summands, we can:
Ø Group the summands in threes and perform carry-save addition
on each of these groups in parallel to generate a set of S and C
vectors in one full-adder delay
Ø Group all of the S and C vectors into threes, and perform carry-
save addition on them, generating a further set of S and C
vectors in one more full-adder delay
Ø Continue with this process until there are only two vectors
remaining
Ø They can be added in a RCA or CLA to produce the desired
product
Figure 6.17. A multiplication example used to illustrate carry-save addition as shown in Figure 6.18.
100 1 11
100 1 11
100 1 11
11111 1
100 1 11 M
Q
A
B
C
D
E
F
(2,835)
X
(45)
(63)
100 1 11
100 1 11
100 1 11
000 1 11 111 0 00 Product
Figure 6.18. The multiplication example from Figure 6.17 performed using carry-save addition.
00000101 0 10
10010000 1 11 1
+
1000011 1
10010111 0 10 1
0110 1 10 0
00011010 0 00
10001011 1 0 1
110001 1 0
00111100
00110 1 10
11001 0 01
100 1 11
100 1 11
100 1 11
00110 1 10
11001 0 01
100 1 11
100 1 11
100 1 11
11111 1
100 1 11 M
Q
A
B
C
S
1
C
1
D
E
F
S
2
C
2
S1
C
1
S2
S
3
C3
C2
S4
C
4
Product
x
Integer Division
Longhand Division Steps
 Position the divisor appropriately with respect to the
dividend and performs a subtraction.
 If the remainder is zero or positive, a quotient bit of 1 is
determined, the remainder is extended by another bit of
the dividend, the divisor is repositioned, and another
subtraction is performed.
 If the remainder is negative, a quotient bit of 0 is
determined, the dividend is restored by adding back the
divisor, and the divisor is repositioned for another
subtraction.
001111
Division of Unsigned Binary Integers
1011
00001101
10010011
1011
001110
1011
1011
100
Quotient
Dividend
Remainder
Partial
Remainders
Divisor
Circuit Arrangement
Figure 6.21. Circuit arrangement for binary division.
qn-1
Divisor M
Control
Sequencer
Dividend Q
Shift left
N+1 bit
adder
q0
Add/Subtract
Quotient
Setting
A
m00 mn-1
a0an
an-1
Restoring Division
 Shift A and Q left one binary position
 Subtract M from A, and place the answer back in A
 If the sign of A is 1, set q0 to 0 and add M back to A
(restore A); otherwise, set q0 to 1
 Repeat these steps n times
Examples
10111
Figure 6.22. A restoring-division example.
1 1 1 1 1
01111
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
1 01
11
1 1
01
0001
Subtract
Shift
Restore
1 0000
1 0000
1 1
Initially
Subtract
Shift
10111
10000
11000
00000
Subtract
Shift
Restore
10111
01000
10000
1 1
QuotientRemainder
Shift
10111
1 0000
Subtract
Second cycle
First cycle
Third cycle
Fourth cycle
0
0
0
0
0
0
1
0
1
10000
1 1
1 0000
11111
Restore
q0Set
q0Set
q0Set
q0Set
Nonrestoring Division
 Avoid the need for restoring A after an
unsuccessful subtraction.
 Step 1: (Repeat n times)
Ø If the sign of A is 0, shift A and Q left one bit position and
subtract M from A; otherwise, shift A and Q left and add
M to A.
Ø Now, if the sign of A is 0, set q0 to 1; otherwise, set q0 to 0.
 Step2: If the sign of A is 1, add M to A
Examples
A nonrestoring-division example.
Add
Restore
remainder
Remainder
0 0 0 01
1 1 1 1 1
0 0 0 1 1
1
Quotient
0 0 1 01 1 1 1 1
0 0 0 01 1 1 1 1
Shift 0 0 0
11000
01111
Add
0 0 0 1 1
0 0 0 0 1 0 0 0
1 1 1 0 1
Shift
Subtract
Initially 0 0 0 0 0 1 0 0 0
1 1 1 0 0000
1 1 1 0 0
0 0 0 1 1
0 0 0Shift
Add
0 0 10 0 0 01
1 1 1 0 1
Shift
Subtract
0 0 0 110000
Fourth cycle
Third cycle
Second cycle
First cycle
q
0Set
q
0
Set
q
0
Set
q
0
Set
Floating-Point Numbers
and
Operations
Fractions
If b is a binary vector, then we have seen that it can be interpreted as an unsigned
integer by:
V(b) = b31.231 + b30.230 + bn-3.229 + .... + b1.21 + b0.20
This vector has an implicit binary point to its immediate right:
b31b30b29....................b1b0. implicit binary point
Suppose if the binary vector is interpreted with the implicit binary point is just left of
the sign bit:
implicit binary point .b31b30b29....................b1b0
The value of b is then given by:
V(b) = b31.2-1 + b30.2-2 + b29.2-3 + .... + b1.2-31 + b0.2-32
Range of fractions
The value of the unsigned binary fraction is:
V(b) = b31.2-1 + b30.2-2 + b29.2-3 + .... + b1.2-31 + b0.2-32
The range of the numbers represented in this format is:
In general for a n-bit binary fraction (a number with an assumed binary point at
the immediate left of the vector), then the range of values is:
9999999998.021)(0 32
≈−≤≤ −
bV
n
bV −
−≤≤ 21)(0
Scientific notation
•Previous representations have a fixed point. Either the point is to the immediate
right or it is to the immediate left. This is called Fixed point representation.
•Fixed point representation suffers from a drawback that the representation can only
represent a finite range (and quite small) range of numbers.
A more convenient representation is the scientific representation, where
the numbers are represented in the form:
x = m1.m2m3m4 × b±e
Components of these numbers are:
Mantissa (m), implied base (b), and exponent (e)
Significant digits
A number such as the following is said to have 7 significant digits
x = ±0.m1m2m3m4m5m6m7 × b±e
Fractions in the range 0.0 to 0.9999999 need about 24 bits of precision (in binary).
For example the binary fraction with 24 1’s:
111111111111111111111111 = 0.9999999404
Not every real number between 0 and 0.9999999404 can be represented by a 24-bit
fractional number.
The smallest non-zero number that can be represented is:
000000000000000000000001 = 5.96046 x 10-8
Every other non-zero number is constructed in increments of this value.
Sign and exponent digits
•In a 32-bit number, suppose we allocate 24 bits to represent a fractional
mantissa.
•Assume that the mantissa is represented in sign and magnitude format,
and we have allocated one bit to represent the sign.
•We allocate 7 bits to represent the exponent, and assume that the
exponent is represented as a 2’s complement integer.
•There are no bits allocated to represent the base, we assume that the
base is implied for now, that is the base is 2.
•Since a 7-bit 2’s complement number can represent values in the range
64 to 63, the range of numbers that can be represented is:
0.0000001 x 2-64 < = | x | <= 0.9999999 x 263
•In decimal representation this range is:
0.5421 x 10-20 < = | x | <= 9.2237 x 1018
A sample representation
Sign Exponent Fractional mantissa
bit
1 7 24
•24-bit mantissa with an implied binary point to the immediate left
•7-bit exponent in 2’s complement form, and implied base is 2.
Normalization
If the number is to be represented using only 7 significant mantissa digits,
the representation ignoring rounding is:
Consider the number: x = 0.0004056781 x 1012
x = 0.0004056 x 1012
If the number is shifted so that as many significant digits are brought into
7 available slots:
x = 0.4056781 x 109 = 0.0004056 x 1012
Exponent of x was decreased by 1 for every left shift of x.
A number which is brought into a form so that all of the available mantissa digits
are optimally used (this is different from all occupied which may not hold), is
called a normalized number.
Same methodology holds in the case of binary mantissas
0001101000(10110) x 28 = 1101000101(10) x 25
Normalization (contd..)
•A floating point number is in normalized form if the most significant 1 in the
mantissa is in the most significant bit of the mantissa.
•All normalized floating point numbers in this system will be of the form:
0.1xxxxx.......xx
Range of numbers representable in this system, if every number must be
normalized is:
0.5 x 2-64 <= | x | < 1 x 263
Normalization, overflow and underflow
The procedure for normalizing a floating point number is:
Do (until MSB of mantissa = = 1)
Shift the mantissa left (or right)
Decrement (increment) the exponent by 1
end do
Applying the normalization procedure to: .000111001110....0010 x 2-62
gives: .111001110........ x 2-65
But we cannot represent an exponent of –65, in trying to normalize the number we
have underflowed our representation.
Applying the normalization procedure to: 1.00111000............x 263
gives: 0.100111..............x 264
This overflows the representation.
Changing the implied base
So far we have assumed an implied base of 2, that is our floating point
numbers are of the form:
x = m 2e
If we choose an implied base of 16, then:
x = m 16e
Then:
y = (m.16) .16e-1 (m.24) .16e-1 = m . 16e = x
•Thus, every four left shifts of a binary mantissa results in a decrease of 1 in a base
16 exponent.
•Normalization in this case means shifting the mantissa until there is a 1 in the first
four bits of the mantissa.
Excess notation
•Rather than representing an exponent in 2’s complement form, it turns out to be
more beneficial to represent the exponent in excess notation.
•If 7 bits are allocated to the exponent, exponents can be represented in the range
of -64 to +63, that is:
-64 <= e <= 63
Exponent can also be represented using the following coding called as excess-64:
E’ = Etrue + 64
In general, excess-p coding is represented as:
E’ = Etrue + p
True exponent of -64 is represented as 0
0 is represented as 64
63 is represented as 127
This enables efficient comparison of the relative sizes of two floating point numbers.
IEEE notation
IEEE Floating Point notation is the standard representation in use. There
are two representations:
- Single precision.
- Double precision.
Both have an implied base of 2.
Single precision:
- 32 bits (23-bit mantissa, 8-bit exponent in excess-127 representation)
Double precision:
- 64 bits (52-bit mantissa, 11-bit exponent in excess-1023
representation)
Fractional mantissa, with an implied binary point at immediate left.
Sign Exponent Mantissa
1 8 or 11 23 or 52
Peculiarities of IEEE notation
•Floating point numbers have to be represented in a normalized form to
maximize the use of available mantissa digits.
•In a base-2 representation, this implies that the MSB of the mantissa is
always equal to 1.
•If every number is normalized, then the MSB of the mantissa is always 1.
We can do away without storing the MSB.
•IEEE notation assumes that all numbers are normalized so that the MSB of
the mantissa is a 1 and does not store this bit.
•So the real MSB of a number in the IEEE notation is either a 0 or a 1.
•The values of the numbers represented in the IEEE single precision
notation are of the form:
(+,-) 1.M x 2(E - 127)
•The hidden 1 forms the integer part of the mantissa.
•Note that excess-127 and excess-1023 (not excess-128 or excess-1024)
are used to represent the exponent.
Exponent field
In the IEEE representation, the exponent is in excess-127 (excess-1023) notation.
The actual exponents represented are:
-126 <= E <= 127 and -1022 <= E <= 1023
not
-127 <= E <= 128 and -1023 <= E <= 1024
This is because the IEEE uses the exponents -127 and 128 (and -1023 and 1024),
that is the actual values 0 and 255 to represent special conditions:
- Exact zero
- Infinity
Floating point arithmetic
Addition:
3.1415 x 108 + 1.19 x 106 = 3.1415 x 108 + 0.0119 x 108 = 3.1534 x 108
Multiplication:
3.1415 x 108 x 1.19 x 106 = (3.1415 x 1.19 ) x 10(8+6)
Division:
3.1415 x 108 / 1.19 x 106 = (3.1415 / 1.19 ) x 10(8-6)
Biased exponent problem:
If a true exponent e is represented in excess-p notation, that is as e+p.
Then consider what happens under multiplication:
a. 10(x + p) * b. 10(y + p) = (a.b). 10(x + p + y +p) = (a.b). 10(x +y + 2p)
Representing the result in excess-p notation implies that the exponent
should be x+y+p. Instead it is x+y+2p.
Biases should be handled in floating point arithmetic.
Floating point arithmetic: ADD/SUB rule
 Choose the number with the smaller exponent.
 Shift its mantissa right until the exponents of both the
numbers are equal.
 Add or subtract the mantissas.
 Determine the sign of the result.
 Normalize the result if necessary and truncate/round
to the number of mantissa bits.
Note: This does not consider the possibility of overflow/underflow.
Floating point arithmetic: MUL rule
 Add the exponents.
 Subtract the bias.
 Multiply the mantissas and determine the sign of the
result.
 Normalize the result (if necessary).
 Truncate/round the mantissa of the result.
Floating point arithmetic: DIV rule
 Subtract the exponents
 Add the bias.
 Divide the mantissas and determine the sign of the
result.
 Normalize the result if necessary.
 Truncate/round the mantissa of the result.
Note: Multiplication and division does not require alignment of the mantissas
the way addition and subtraction does.
Guard bits
While adding two floating point numbers with 24-bit mantissas, we shift
the mantissa of the number with the smaller exponent to the right until
the two exponents are equalized.
This implies that mantissa bits may be lost during the right shift (that is,
bits of precision may be shifted out of the mantissa being shifted).
To prevent this, floating point operations are implemented by keeping
guard bits, that is, extra bits of precision at the least significant end
of the mantissa.
The arithmetic on the mantissas is performed with these extra bits of
precision.
After an arithmetic operation, the guarded mantissas are:
- Normalized (if necessary)
- Converted back by a process called truncation/rounding to a 24-bit
mantissa.
Truncation/rounding
 Straight chopping:
 The guard bits (excess bits of precision) are dropped.
 Von Neumann rounding:
 If the guard bits are all 0, they are dropped.
 However, if any bit of the guard bit is a 1, then the LSB of the retained bit
is set to 1.
 Rounding:
 If there is a 1 in the MSB of the guard bit then a 1 is added to the LSB of
the retained bits.
Rounding
 Rounding is evidently the most accurate truncation
method.
 However,
 Rounding requires an addition operation.
 Rounding may require a renormalization, if the addition operation de-
normalizes the truncated number.
 IEEE uses the rounding method.
0.111111100000 rounds to 0.111111 + 0.000001
=1.000000 which must be renormalized to 0.100000
Thank You

More Related Content

What's hot

basic computer programming and micro programmed control
basic computer programming and micro programmed controlbasic computer programming and micro programmed control
basic computer programming and micro programmed controlRai University
 
Microprogram Control
Microprogram Control Microprogram Control
Microprogram Control Anuj Modi
 
Control unit design
Control unit designControl unit design
Control unit designDhaval Bagal
 
Microinstruction sequencing new
Microinstruction sequencing newMicroinstruction sequencing new
Microinstruction sequencing newMahesh Kumar Attri
 
Basic structure of computers by aniket bhute
Basic structure of computers by aniket bhuteBasic structure of computers by aniket bhute
Basic structure of computers by aniket bhuteAniket Bhute
 
Instruction cycle
Instruction cycleInstruction cycle
Instruction cycleKumar
 
Micro Programmed Control Unit
Micro Programmed Control UnitMicro Programmed Control Unit
Micro Programmed Control UnitKamal Acharya
 
Timing and control unit
Timing and control unitTiming and control unit
Timing and control unitDestro Destro
 
Unit 3 basic processing unit
Unit 3   basic processing unitUnit 3   basic processing unit
Unit 3 basic processing unitchidabdu
 
Control unit design(1)
Control unit design(1)Control unit design(1)
Control unit design(1)Nazir Ahmed
 
BLOCK DIAGRAM OF HARDWIRED CONTROL UNIT
BLOCK DIAGRAM OF HARDWIRED CONTROL UNITBLOCK DIAGRAM OF HARDWIRED CONTROL UNIT
BLOCK DIAGRAM OF HARDWIRED CONTROL UNITRahul Sharma
 

What's hot (20)

basic computer programming and micro programmed control
basic computer programming and micro programmed controlbasic computer programming and micro programmed control
basic computer programming and micro programmed control
 
control unit
control unitcontrol unit
control unit
 
Microprogram Control
Microprogram Control Microprogram Control
Microprogram Control
 
Hardwired control
Hardwired controlHardwired control
Hardwired control
 
Micro programmed control
Micro programmed controlMicro programmed control
Micro programmed control
 
Control unit design
Control unit designControl unit design
Control unit design
 
Micro programmed control
Micro programmed controlMicro programmed control
Micro programmed control
 
Microinstruction sequencing new
Microinstruction sequencing newMicroinstruction sequencing new
Microinstruction sequencing new
 
Basic structure of computers by aniket bhute
Basic structure of computers by aniket bhuteBasic structure of computers by aniket bhute
Basic structure of computers by aniket bhute
 
Instruction cycle
Instruction cycleInstruction cycle
Instruction cycle
 
Microprogrammed Control Unit
Microprogrammed Control UnitMicroprogrammed Control Unit
Microprogrammed Control Unit
 
Addressing sequencing
Addressing sequencingAddressing sequencing
Addressing sequencing
 
Micro Programmed Control Unit
Micro Programmed Control UnitMicro Programmed Control Unit
Micro Programmed Control Unit
 
Timing and control unit
Timing and control unitTiming and control unit
Timing and control unit
 
Unit 3 basic processing unit
Unit 3   basic processing unitUnit 3   basic processing unit
Unit 3 basic processing unit
 
CO By Rakesh Roshan
CO By Rakesh RoshanCO By Rakesh Roshan
CO By Rakesh Roshan
 
Ch7 official
Ch7 officialCh7 official
Ch7 official
 
Control unit design(1)
Control unit design(1)Control unit design(1)
Control unit design(1)
 
BLOCK DIAGRAM OF HARDWIRED CONTROL UNIT
BLOCK DIAGRAM OF HARDWIRED CONTROL UNITBLOCK DIAGRAM OF HARDWIRED CONTROL UNIT
BLOCK DIAGRAM OF HARDWIRED CONTROL UNIT
 
Lecture 21
Lecture 21Lecture 21
Lecture 21
 

Similar to Processing Unit Fundamentals

conrol_Unit_part_of_computer_architecture.pptx
conrol_Unit_part_of_computer_architecture.pptxconrol_Unit_part_of_computer_architecture.pptx
conrol_Unit_part_of_computer_architecture.pptxjbri1395
 
Basic_Processing_Unit.pdf
Basic_Processing_Unit.pdfBasic_Processing_Unit.pdf
Basic_Processing_Unit.pdfUmamaheswariV4
 
chapter3 - Basic Processing Unit.ppt
chapter3 - Basic Processing Unit.pptchapter3 - Basic Processing Unit.ppt
chapter3 - Basic Processing Unit.pptPoliceNiranjanReddy
 
Computer Organization
Computer OrganizationComputer Organization
Computer OrganizationAnish Goel
 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and ArchitectureSubhasis Dash
 
20IT204-COA- Lecture 17.pptx
20IT204-COA- Lecture 17.pptx20IT204-COA- Lecture 17.pptx
20IT204-COA- Lecture 17.pptxPerumalPitchandi
 
CO Unit 3.pdf (Important chapter of coa)
CO Unit 3.pdf (Important chapter of coa)CO Unit 3.pdf (Important chapter of coa)
CO Unit 3.pdf (Important chapter of coa)guptakrishns23
 
Realization of high performance run time loadable mips soft-core processor
Realization of high performance run time loadable mips soft-core processorRealization of high performance run time loadable mips soft-core processor
Realization of high performance run time loadable mips soft-core processoreSAT Publishing House
 
chapter3_CA.pptt nnnnnnnnnnnnnnnnnnnnnnn
chapter3_CA.pptt nnnnnnnnnnnnnnnnnnnnnnnchapter3_CA.pptt nnnnnnnnnnnnnnnnnnnnnnn
chapter3_CA.pptt nnnnnnnnnnnnnnnnnnnnnnnNineTo1
 
controlunit1a024a025-140520095202-phpapp01.pptx
controlunit1a024a025-140520095202-phpapp01.pptxcontrolunit1a024a025-140520095202-phpapp01.pptx
controlunit1a024a025-140520095202-phpapp01.pptxssuser3b0320
 
IRJET- Design of Low Power 32- Bit RISC Processor using Verilog HDL
IRJET-  	  Design of Low Power 32- Bit RISC Processor using Verilog HDLIRJET-  	  Design of Low Power 32- Bit RISC Processor using Verilog HDL
IRJET- Design of Low Power 32- Bit RISC Processor using Verilog HDLIRJET Journal
 
Design_amp_analysis_of_16_bit_RISC_processor_using_low_power_pipelining.pdf
Design_amp_analysis_of_16_bit_RISC_processor_using_low_power_pipelining.pdfDesign_amp_analysis_of_16_bit_RISC_processor_using_low_power_pipelining.pdf
Design_amp_analysis_of_16_bit_RISC_processor_using_low_power_pipelining.pdfssuser1e1bab
 

Similar to Processing Unit Fundamentals (20)

conrol_Unit_part_of_computer_architecture.pptx
conrol_Unit_part_of_computer_architecture.pptxconrol_Unit_part_of_computer_architecture.pptx
conrol_Unit_part_of_computer_architecture.pptx
 
Basic_Processing_Unit.pdf
Basic_Processing_Unit.pdfBasic_Processing_Unit.pdf
Basic_Processing_Unit.pdf
 
chapter3 - Basic Processing Unit.ppt
chapter3 - Basic Processing Unit.pptchapter3 - Basic Processing Unit.ppt
chapter3 - Basic Processing Unit.ppt
 
Computer Organization
Computer OrganizationComputer Organization
Computer Organization
 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and Architecture
 
20IT204-COA- Lecture 17.pptx
20IT204-COA- Lecture 17.pptx20IT204-COA- Lecture 17.pptx
20IT204-COA- Lecture 17.pptx
 
CO Unit 3.pdf (Important chapter of coa)
CO Unit 3.pdf (Important chapter of coa)CO Unit 3.pdf (Important chapter of coa)
CO Unit 3.pdf (Important chapter of coa)
 
Realization of high performance run time loadable mips soft-core processor
Realization of high performance run time loadable mips soft-core processorRealization of high performance run time loadable mips soft-core processor
Realization of high performance run time loadable mips soft-core processor
 
chapter3_CA.pptt nnnnnnnnnnnnnnnnnnnnnnn
chapter3_CA.pptt nnnnnnnnnnnnnnnnnnnnnnnchapter3_CA.pptt nnnnnnnnnnnnnnnnnnnnnnn
chapter3_CA.pptt nnnnnnnnnnnnnnnnnnnnnnn
 
controlunit1a024a025-140520095202-phpapp01.pptx
controlunit1a024a025-140520095202-phpapp01.pptxcontrolunit1a024a025-140520095202-phpapp01.pptx
controlunit1a024a025-140520095202-phpapp01.pptx
 
Control unit
Control unit Control unit
Control unit
 
IRJET- Design of Low Power 32- Bit RISC Processor using Verilog HDL
IRJET-  	  Design of Low Power 32- Bit RISC Processor using Verilog HDLIRJET-  	  Design of Low Power 32- Bit RISC Processor using Verilog HDL
IRJET- Design of Low Power 32- Bit RISC Processor using Verilog HDL
 
Unit 2
Unit 2Unit 2
Unit 2
 
Unit 2
Unit 2Unit 2
Unit 2
 
Unit 3 CO.pptx
Unit 3 CO.pptxUnit 3 CO.pptx
Unit 3 CO.pptx
 
Digital-Unit-III.ppt
Digital-Unit-III.pptDigital-Unit-III.ppt
Digital-Unit-III.ppt
 
4876238.ppt
4876238.ppt4876238.ppt
4876238.ppt
 
Dp&amp;co
Dp&amp;coDp&amp;co
Dp&amp;co
 
Control Memory
Control MemoryControl Memory
Control Memory
 
Design_amp_analysis_of_16_bit_RISC_processor_using_low_power_pipelining.pdf
Design_amp_analysis_of_16_bit_RISC_processor_using_low_power_pipelining.pdfDesign_amp_analysis_of_16_bit_RISC_processor_using_low_power_pipelining.pdf
Design_amp_analysis_of_16_bit_RISC_processor_using_low_power_pipelining.pdf
 

More from Deepak John

Network concepts and wi fi
Network concepts and wi fiNetwork concepts and wi fi
Network concepts and wi fiDeepak John
 
Web browser week5 presentation
Web browser week5 presentationWeb browser week5 presentation
Web browser week5 presentationDeepak John
 
Information management
Information managementInformation management
Information managementDeepak John
 
It security,malware,phishing,information theft
It security,malware,phishing,information theftIt security,malware,phishing,information theft
It security,malware,phishing,information theftDeepak John
 
Email,contacts and calendar
Email,contacts and calendarEmail,contacts and calendar
Email,contacts and calendarDeepak John
 
Module 2 instruction set
Module 2 instruction set Module 2 instruction set
Module 2 instruction set Deepak John
 
introduction to computers
 introduction to computers introduction to computers
introduction to computersDeepak John
 
Registers and counters
Registers and counters Registers and counters
Registers and counters Deepak John
 
Computer security module 4
Computer security module 4Computer security module 4
Computer security module 4Deepak John
 
Module 4 network and computer security
Module  4 network and computer securityModule  4 network and computer security
Module 4 network and computer securityDeepak John
 
Network and computer security-
Network and computer security-Network and computer security-
Network and computer security-Deepak John
 
Computer security module 3
Computer security module 3Computer security module 3
Computer security module 3Deepak John
 
Module 4 registers and counters
Module 4 registers and counters Module 4 registers and counters
Module 4 registers and counters Deepak John
 
Module 2 network and computer security
Module 2 network and computer securityModule 2 network and computer security
Module 2 network and computer securityDeepak John
 
Computer security module 2
Computer security module 2Computer security module 2
Computer security module 2Deepak John
 
Computer security module 1
Computer security module 1Computer security module 1
Computer security module 1Deepak John
 
Network and Computer security
Network and Computer securityNetwork and Computer security
Network and Computer securityDeepak John
 
Combinational and sequential logic
Combinational and sequential logicCombinational and sequential logic
Combinational and sequential logicDeepak John
 
Module 2 logic gates
Module 2  logic gatesModule 2  logic gates
Module 2 logic gatesDeepak John
 

More from Deepak John (20)

Network concepts and wi fi
Network concepts and wi fiNetwork concepts and wi fi
Network concepts and wi fi
 
Web browser week5 presentation
Web browser week5 presentationWeb browser week5 presentation
Web browser week5 presentation
 
Information management
Information managementInformation management
Information management
 
It security,malware,phishing,information theft
It security,malware,phishing,information theftIt security,malware,phishing,information theft
It security,malware,phishing,information theft
 
Email,contacts and calendar
Email,contacts and calendarEmail,contacts and calendar
Email,contacts and calendar
 
Module 1 8086
Module 1 8086Module 1 8086
Module 1 8086
 
Module 2 instruction set
Module 2 instruction set Module 2 instruction set
Module 2 instruction set
 
introduction to computers
 introduction to computers introduction to computers
introduction to computers
 
Registers and counters
Registers and counters Registers and counters
Registers and counters
 
Computer security module 4
Computer security module 4Computer security module 4
Computer security module 4
 
Module 4 network and computer security
Module  4 network and computer securityModule  4 network and computer security
Module 4 network and computer security
 
Network and computer security-
Network and computer security-Network and computer security-
Network and computer security-
 
Computer security module 3
Computer security module 3Computer security module 3
Computer security module 3
 
Module 4 registers and counters
Module 4 registers and counters Module 4 registers and counters
Module 4 registers and counters
 
Module 2 network and computer security
Module 2 network and computer securityModule 2 network and computer security
Module 2 network and computer security
 
Computer security module 2
Computer security module 2Computer security module 2
Computer security module 2
 
Computer security module 1
Computer security module 1Computer security module 1
Computer security module 1
 
Network and Computer security
Network and Computer securityNetwork and Computer security
Network and Computer security
 
Combinational and sequential logic
Combinational and sequential logicCombinational and sequential logic
Combinational and sequential logic
 
Module 2 logic gates
Module 2  logic gatesModule 2  logic gates
Module 2 logic gates
 

Recently uploaded

Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 

Recently uploaded (20)

Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 

Processing Unit Fundamentals

  • 1. Processing Unit Deepak John ,Department Computer Applications, SJCET-Pala
  • 2. Fundamental Concepts  Processor fetches one instruction at a time and perform the operation specified.  Instructions are fetched from successive memory locations until a branch or a jump instruction is encountered.  Processor keeps track of the address of the memory location containing the next instruction to be fetched using Program Counter (PC).  Instruction Register (IR)
  • 3. Executing an Instruction  Fetch the contents of the memory location pointed to by the PC. The contents of this location are loaded into the IR (fetch phase). IR ← [[PC]]  Assuming that the memory is byte addressable, increment the contents of the PC by 4 (fetch phase). PC ← [PC] + 4  Carry out the actions specified by the instruction in the IR (execution phase).
  • 5. Executing an Instruction  Transfer a word of data from one processor register to another or to the ALU.  Perform an arithmetic or a logic operation and store the result in a processor register.  Fetch the contents of a given memory location and load them into a processor register.  Store a word of data from a processor register into a given memory location. Register Transfers  For each register two control signals are used 1. To place the contents of that register on the bus 2. To load the data on the bus into register  The input and output of register are connected to the bus via switches controlled by the signals Rin and Rout
  • 6.
  • 7. Performing an Arithmetic or Logic Operation  The ALU is a combinational circuit that has no internal storage.  ALU gets the two operands from MUX and bus. The result is temporarily stored in register Z.  What is the sequence of operations to add the contents of register R1 to those of R2 and store the result in R3? 1. R1out, Yin 2. R2out, SelectY, Add, Zin 3. Zout, R3in
  • 8. Fetching a Word from Memory  Address into MAR; issue Read operation; data into MDR.  Memory-Function-Completed (MFC)
  • 9.
  • 10. Execution of Branch Instructions  A branch instruction replaces the contents of PC with the branch target address, which is usually obtained by adding an offset X given in the branch instruction.  The offset X is usually the difference between the branch target address and the address immediately following the branch instruction.  Conditional branch Step Action 1 PCou t , MAR in , Read , Select4,Add, Zi n 2 Zou t , PCi n , Y in , WM F C 3 MDR out , IRi n 4 Offset-field-of-IR out , Add , Z in 5 Zou t , PCi n , En d Figure 7.7. Control sequence for an unconditional branch instruction.
  • 11. Multiple-Bus Organization Register file:all general purpose registers are combined Buses A and B:used to transfer the source operands to the ALU Step Action 1 PCout, R=B, MAR in, Read, IncPC 2 WMFC 3 MDRoutB, R=B, IRin 4 R4outA, R5outB, SelectA, Add, R6in, End Figure 7.9. Control sequence for the instruction. Add R4,R5,R6,for the three-bus organization
  • 13.  To execute instructions, the processor must have some means of generating the control signals needed in the proper sequence.  Two categories: hardwired control and microprogrammed control  Hardwired system can operate at high speed; but with little flexibility. Control Unit Organization CLK Clock Control step IR encoder Decoder/ Control signals codes counter inputs Condition External
  • 14. Detailed Block Description Step Decoder :provides seperate signal line for each step or time slot in the ccontrol sequence. Instruction decoder :o/p consists of a seperate line for each machine instruction. For any instruction in IR, one of the output lines INS1 through INSm is set to 1,all other lines are set to 0 Input signals to the encoder block are combined to generate the individual control signals Yin,Pcout,ADD,End etc
  • 15. Generating Zin  Zin = T1 + T6 • ADD + T4 • BR + … T1 AddBranch T4 T6
  • 16. ADVANTAGES  Hardwired Control Unit is fast because control signals are generated by combinational circuits.  The delay in generation of control signals depends upon the number of gates. DISADVANTAGES  more complex will be the design of control unit.  Modifications in control signal are very difficult. That means it requires rearranging of wires in the hardware circuit.  It is difficult to correct mistake in original design or adding new feature in existing design of control unit.
  • 19.  Control signals are generated by a program similar to machine language programs.  Control Word (CW):is a word whose individual bits represent the various control signals; Each step of the instruction execution is represented by a control word with all of the bits corresponding to the control signals needed for the step set to one.  Microroutine:a sequence of CW’s corresponding to the control sequence of a machine instruction  microinstruction:individual control words in microroutine, consists of: One or more micro-operations to be executed. Address of next microinstruction to be executed.
  • 20.
  • 21. Control store: micro routines for all instructions in the instruction set of a computer are stored
  • 22.  The previous organization cannot handle the situation when the control unit is required to check the status of the condition codes or external inputs to choose between alternative courses of action.  Use conditional branch microinstruction. Address Microinstruction 0 PCout , MARin , Read,Select4,Add, Zin 1 Zout , PCin , Yin , WMFC 2 MDRout , IRin 3 Branchto startingaddress of appropriate microroutine . ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... .. ... ... .. ... .. 25 If N=0, then branchto microinstruction 0 26 Offset-field-of-IRout , SelectY, Add, Zin 27 Zout , PCin , End Figure 7.17. Microroutine for the instruction Branch<0.
  • 23. Figure 7.18. Organization of the control unit to allow conditional branching in the microprogram. Control store Clock generator Starting and branch address Condition codes inputs External CW IR mPC
  • 24. Microinstructions  A straightforward way to structure microinstructions is to assign one bit position to each control signal.  However, this is very inefficient.  The length can be reduced: most signals are not needed simultaneously, and many signals are mutually exclusive.  All mutually exclusive signals are placed in the same group in binary coding.
  • 25. Partial Format for the Microinstructions
  • 26. Further Improvement  Vertical organization: 1. each micro instruction specify only a small number of control functions. 2. Slower operating speed due to the need of more micro instructions to perform the desired function. 3. less hardware is needed to handle the execution of microinstructions.  Horizontal organization: 1. each micro instruction specify many control signals. 2. its useful when higher operating speed is desired and the machine structure allows parallel use of resources
  • 27. Microprogram sequencing Several operations execute with varying addressing modes • For example, consider ADD r1, r2, r3; ADD (r1), r2,r3; ADD x(r1), r2, r3; and ADD (r1)+, r2, r3 • two disadvantages: Ø Having a separate microroutine for each machine instruction results in a large total number of microinstructions and a large control store. Ø Longer execution time because it takes more time to carry out the required branches. § a separate microroutine for each combination of instruction would produce considerable duplication of common parts. § Organize the micro program so that the micro routines share as many common parts as possible.
  • 28. Prefetching microinstructions  Microprogrammed control leads to a slower operating speed because of the time it take to fetch microinstructions from control store.  Avoid this by prefetching the next microinstruction while the current one is executed.  prefetching the microinstruction had some difficulties like the status flags and the result of the current microinstruction are needed to determine the address of next microinstruction. Thus a straight forward prefetching occasionally prefetches wrong microinstruction.  In this case the fetch must be repeated with correct address.
  • 29. emulation  Allows us to replace obsolete equipment with more up to date machines.  If the replacement computer fully emulates the original one then no software changes have have to be made to run existing programs.  Facilitates transition to new computer system with minimal disruption.
  • 30. ADVANTAGES  The design of micro-program control unit is less complex because micro-programs are implemented using software routines.  The micro-programmed control unit is more flexible because design modifications, correction and enhancement is easily possible.  The new or modified instruction set of CPU can be easily implemented by simply rewriting or modifying the contents of control memory.  The fault can be easily diagnosed in the micro-program control unit using diagnostics tools by maintaining the contents of flags, registers and counters.
  • 31. DISADVANTAGES  The micro-program control unit is slower than hardwired control unit.  That means to execute an instruction in micro-program control unit requires more time.  The micro-program control unit is expensive than hardwired control unit in case of limited hardware resources.  The design duration of micro-program control unit is more than hardwired control unit for smaller CPU.
  • 32.
  • 34. Signed Numbers  Left most bit is sign bit  0 means positive  1 means negative  +18 = 00010010  -18 = 10010010 •To compute representation of a negative number faster, find the representation of its absolute value in n bits. •invert all bits of this number and add 1 to it. •For example, −10 in 8 bits 10 in 8 bits is 00001010. Invert all bits to get: 11110101. Add 1 to it to get: 11110110. 2’s complement number system
  • 35. Addition/subtraction of signed numbers si = ci +1 = 13 7 + Y 1 0 0 0 1 0 1 1 0 0 1 1 0 1 1 0 0 1 1 0 1 0 0 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 Example: 1 0= = 0 0 1 1 1 1 1 0 0 1 1 1 10 xi yi Carry-in ci Sumsi Carry-outci+1 X Z + 6 0+ xi yi si Carry-out ci+1 Carry-in ci xi yi ci xi yi ci xi yi ci xi yi ci xi yi ciÅ Å=+ + + yi ci xi ci xi yi+ + At the ith stage: Input: ci is the carry-in Output: si is the sum ci+1 carry-out to (i+1)st state
  • 36. Addition logic for a single stage Full Adder (FA): Symbol for the complete circuit for a single stage of addition. Full adder (FA) ci ci 1+ s i Sum Carry yi xi c i yi xi c i yi x i xi ci yi si c i 1+
  • 37. n-bit adder •Cascade n full adder (FA) blocks to form a n-bit adder. •Carries propagate or ripple through this cascade, n-bit ripple carry adder. FA c0 y1 x1 s1 FA c 1 y0 x0 s0 FA cn 1- y n 1- xn 1- cn sn 1- Most significant bit (MSB) position Least significant bit (LSB) position Carry-in c0 into the LSB position provides a convenient way to perform subtraction. K n-bit adder K n-bit numbers can be added by cascading k n-bit adders. n-bit c 0 yn xn s n cn y0 xn 1- s 0 ckn s k 1-( )n x0 yn 1- y2n 1- x2n 1- ykn 1- s n 1- s 2n 1- s kn 1- xkn 1- adder n-bit adder n-bit adder Each n-bit adder forms a block, so this is cascading of blocks. Carries ripple or propagate through blocks, Blocked Ripple Carry Adder
  • 38. n-bit subtractor FA 1 y 1 x 1 s 1 FA c 1 y 0 x 0 s 0 FA c n 1- y n 1- x n 1- c n s n 1- Most significant bit (MSB) position Least significant bit (LSB) position •Recall X – Y is equivalent to adding 2’s complement of Y to X. •2’s complement is equivalent to 1’s complement + 1. •X – Y = X + Y + 1 •2’s complement of positive and negative numbers is computed similarly.
  • 39. n-bit adder/subtractor •Add/sub control = 0, addition. •Add/sub control = 1, subtraction. Add/Sub control n-bit adder x n 1- x 1 x 0 c n s n 1- s 1 s 0 c 0 y n 1- y 1 y 0
  • 40. Detecting overflows  Overflows can only occur when the sign of the two operands is the same.  Overflow occurs if the sign of the result is different from the sign of the operands.  Recall that the MSB represents the sign.  xn-1, yn-1, sn-1 represent the sign of operand x, operand y and result s respectively.  Circuit to detect overflow can be implemented by the following logic expressions: 111111 −−−−−− += nnnnnn syxsyxOverflow 1−⊕= nn ccOverflow
  • 41. Computing the add time Consider 0th stage: x0 y0 c0 c1 s0 FA •c1 is available after 2 gate delays. •s1 is available after 1 gate delay. c i yi xi c i yi x i xi ci yi si c i 1+ Sum Carry
  • 42. Computing the add time (contd..) x0 y0 s2 FA x0 y0x0 y0 s1 FA c2 s0 FA c1c3 c0 x0 y0 s3 FA c4 Cascade of 4 Full Adders, or a 4-bit adder •s0 available after 1 gate delays, c1 available after 2 gate delays. •s1 available after 3 gate delays, c2 available after 4 gate delays. •s2 available after 5 gate delays, c3 available after 6 gate delays. •s3 available after 7 gate delays, c4 available after 8 gate delays. For an n-bit adder, sn-1 is available after 2n-1 gate delays ,cn is available after 2n gate delays.
  • 43. Fast addition Recall the equations: iiiiiii iiii cycxyxc cyxs ++= ⊕⊕= +1 Second equation can be written as: iiiiii cyxyxc )(1 ++=+ We can write: iiiiii iiii yxPandyxGwhere cPGc +== +=+1 •Gi is called generate function and Pi is called propagate function •Gi and Pi are computed only from xi and yi and not ci, thus they can be computed in one gate delay after X and Y are applied to the inputs of an n- bit adder.
  • 44. Carry lookahead ci +1 = Gi + Pi ci ci = Gi −1 + Pi −1ci −1 ⇒ ci+1 = Gi + Pi (Gi −1 + Pi −1ci −1) continuing ⇒ ci+1 = Gi + Pi (Gi −1 + Pi −1(Gi − 2 + Pi− 2ci −2 )) until ci+1 = Gi + PiGi −1 + Pi Pi−1Gi −2 + .. + Pi Pi −1..P1G0 + Pi Pi −1...P0c0 •All carries can be obtained 3 gate delays after X, Y and c0 are applied. -One gate delay for Pi and Gi -Two gate delays in the AND-OR circuit for ci+1 •All sums can be obtained 1 gate delay after the carries are computed. •Independent of n, n-bit addition requires only 4 gate delays. •This is called Carry Lookahead adder.
  • 45. Carry-lookahead adder Carry-lookahead logic B cell B cell B cell B cell s 3 P3 G3 c 3 P2 G2 c 2 s 2 G 1 c 1 P 1 s 1 G 0 c 0 P 0 s 0 .c4 x 1 y 1 x 3 y 3 x 2 y 2 x 0 y 0 G i c i .. . P i s i x i y i B cell 4-bit carry-lookahead adder B-cell for a single stage C1=G0+P0C0 C2=G1+P1G0+P1P0C0 C3=G2+P2G1+P2P1G0+P2P1P0C0 C4=G3+P3G2+P3P2G1+P3P2P1G0+P3P2P1P0C0
  • 46. Blocked Carry-Lookahead adder c4 = G3 + P3G2 + P3P2G1 + P3P2 P1G0 + P3P2 P1P0c0 Carry-out from a 4-bit block can be given as: Rewrite this as: P0 I = P3P2 P1P0 G0 I = G3 + P3G2 + P3P2G1 + P3P2P1G0 Subscript I denotes the blocked carry lookahead and identifies the block. Cascade 4 4-bit adders, c16 can be expressed as: c16 = G3 I + P3 I G2 I + P3 I P2 I G1 I + P3 I P2 I P1 0 G0 I + P3 I P2 I P1 0 P0 0 c0 In order to add operands longer than 4 bits, we can cascade 4- bit Carry-Lookahead adders. Cascade of Carry-Lookahead adders is called Blocked Carry-Lookahead adder.
  • 47. Blocked Carry-Lookahead adder Carry-lookahead logic 4-bit adder 4-bit adder 4-bit adder 4-bit adder s15-12 P3 IG3 I c12 P2 IG2 I c8 s11-8 G1 I c4 P1 I s7-4 G0 I c0 P0 I s3-0 c16 x15-12 y15-12 x11-8 y11-8 x7-4 y7-4 x3-0 y3-0 . After xi, yi and c0 are applied as inputs: - Gi and Pi for each stage are available after 1 gate delay. - PI is available after 2 and GI after 3 gate delays. - All carries are available after 5 gate delays. - c16 is available after 5 gate delays. - s15 which depends on c12 is available after 8 (5+3)gate delays (Recall that for a 4-bit carry lookahead adder, the last sum bit is available 3 gate delays after all inputs are available)
  • 49. Multiplication of unsigned numbers Product of 2 n-bit numbers is at most a 2n-bit number. Unsigned multiplication can be viewed as addition of shifted versions of the multiplicand.
  • 50. Multiplication of unsigned numbers (contd..)  Rules to implement multiplication are:  If the 0th bit of multiplier is 1, then add the multiplicand with 0  If the 0th bit of multiplier is 0 ,then current partial product is 0.  If the ith bit (except 0th bit) of the multiplier is 1, shift the multiplicand and add the shifted multiplicand to the current value of the partial product.  Hand over the partial product to the next stage  If the ith bit of the multiplier is 0, shift the multiplicand and add the current partial product to the 0.  Value of the partial product at the start stage is 0.
  • 51. Multiplication of unsigned numbers ith multiplier bit carry incarry out jth multiplicand bit Bit of incoming partial product (PPi) Bit of outgoing partial product (PP(i+1)) FA Typical multiplication cell
  • 52. Combinational array multiplier Multiplicand m 3 m 2 m 1 m 00 0 0 0 q3 q 2 q 1 q0 0 p 2 p 1 p 0 0 0 0 p3 p 4 p 5 p 6 p 7 PP1 PP2 PP3 (PP0) , Product is: p7,p6,..p0 Multiplicand is shifted by displacing it through an array of adders.
  • 53.  Combinatorial array multipliers are:  Extremely inefficient.  Have a high gate count for multiplying numbers of practical size such as 32-bit or 64-bit numbers.  Perform only one function, namely, unsigned integer product.  Improve gate efficiency by using a mixture of combinatorial array techniques and sequential techniques requiring less combinational logic.
  • 54. Sequential Circuit Multiplier q n 1- m n 1- n-bit Adder Multiplicand M Control sequencer Multiplier Q 0 C Shift right Register A (initially 0) Add/Noadd control a n 1- a 0 q 0 m 0 0 MUX
  • 55. 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 0 1 1 1 0 1 Initial configuration Add M 1 1 0 1 C First cycle Second cycle Third cycle Fourth cycle No add Shift Shift Add Shift Shift Add 1 1 1 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 1 0 1 1 0 1 0 0 1 1 1 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 1 0 0 1 1 0 1 1 QA Product
  • 57. Signed Multiplication  Considering 2’s-complement signed operands,  Extend the sign bit value of the multiplicand to the left as far as the product will extend  (-13)(+11) 1 0 11 11 1 1 0 0 1 1 110 110 1 0 1000111011 000000 1100111 00000000 110011111 13-( ) 143-( ) 11+( )
  • 58. Signed Multiplication  For a negative multiplier, a straightforward solution is to form the 2’s-complement of both the multiplier and the multiplicand and proceed as in the case of a positive multiplier.  This is possible because complementation of both operands does not change the value or the sign of the product.  A technique that works equally well for both negative and positive multipliers – Booth algorithm.
  • 59. Booth Algorithm Multiplier Bit i Bit i 1- V ersion of multiplicand selected by bit 0 1 0 0 01 1 1 0 M 1+ M 1 M 0 M Booth multiplier recoding table. X X X X Booth multiplication with a negative multiplier. 01 0 1 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 00 0110 0 0 0 0 1 1 0 1100111 0 0 0 0 0 0 01000 11111 1 1 0 1 1 0 1 1 1 0 1 0 6-( ) 13+( ) X 78-( ) +11- 1- 2's complement of the multiplicand
  • 61. Bit-Pair Recoding of Multipliers  Bit-pair recoding halves the maximum number of summands (versions of the multiplicand). 1+1 (a) Example of bit-pair recoding derived from Booth recoding 0 000 1 1 0 1 0 Implied 0 to right of LSB 1 0 Sign extension 1 21  
  • 62. 1- 0000 1 1 1 1 1 0 0 0 0 0 11 1 1 1 1 10 0 0 0 0 0 0 0 0000 111111 0 1 1 0 1 0 1 010011111 1 1 1 1 0 0 1 1 0 0 0 0 0 0 1 1 1 0 1 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 10 0 010 0 1 1 0 1 11 1- 6-( ) 13+( ) 1+ 78-( ) 1- 2- ´ Figure 6.15. Multiplication requiring only n/2 summands. i 1+ i 1 (b) Table of multiplicand selection decisions selected at positioni MultiplicandMultiplier bit-pair i 0 0 1 1 1 0 1 0 1 1 1 1 0 0 0 1 1 0 0 1 0 0 1 Multiplier bit on the right 0 0 X M 1+ 1 1+ 0 1 2 2+     X M X M X M X M X M X M X M
  • 63. Carry-Save Addition of Summands  CSA speeds up the addition process. P7 P6 P5 P4 P3 P2 P1 P0
  • 65.  Consider the addition of many summands, we can: Ø Group the summands in threes and perform carry-save addition on each of these groups in parallel to generate a set of S and C vectors in one full-adder delay Ø Group all of the S and C vectors into threes, and perform carry- save addition on them, generating a further set of S and C vectors in one more full-adder delay Ø Continue with this process until there are only two vectors remaining Ø They can be added in a RCA or CLA to produce the desired product
  • 66. Figure 6.17. A multiplication example used to illustrate carry-save addition as shown in Figure 6.18. 100 1 11 100 1 11 100 1 11 11111 1 100 1 11 M Q A B C D E F (2,835) X (45) (63) 100 1 11 100 1 11 100 1 11 000 1 11 111 0 00 Product
  • 67. Figure 6.18. The multiplication example from Figure 6.17 performed using carry-save addition. 00000101 0 10 10010000 1 11 1 + 1000011 1 10010111 0 10 1 0110 1 10 0 00011010 0 00 10001011 1 0 1 110001 1 0 00111100 00110 1 10 11001 0 01 100 1 11 100 1 11 100 1 11 00110 1 10 11001 0 01 100 1 11 100 1 11 100 1 11 11111 1 100 1 11 M Q A B C S 1 C 1 D E F S 2 C 2 S1 C 1 S2 S 3 C3 C2 S4 C 4 Product x
  • 69. Longhand Division Steps  Position the divisor appropriately with respect to the dividend and performs a subtraction.  If the remainder is zero or positive, a quotient bit of 1 is determined, the remainder is extended by another bit of the dividend, the divisor is repositioned, and another subtraction is performed.  If the remainder is negative, a quotient bit of 0 is determined, the dividend is restored by adding back the divisor, and the divisor is repositioned for another subtraction.
  • 70. 001111 Division of Unsigned Binary Integers 1011 00001101 10010011 1011 001110 1011 1011 100 Quotient Dividend Remainder Partial Remainders Divisor
  • 71. Circuit Arrangement Figure 6.21. Circuit arrangement for binary division. qn-1 Divisor M Control Sequencer Dividend Q Shift left N+1 bit adder q0 Add/Subtract Quotient Setting A m00 mn-1 a0an an-1
  • 72. Restoring Division  Shift A and Q left one binary position  Subtract M from A, and place the answer back in A  If the sign of A is 1, set q0 to 0 and add M back to A (restore A); otherwise, set q0 to 1  Repeat these steps n times
  • 73. Examples 10111 Figure 6.22. A restoring-division example. 1 1 1 1 1 01111 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 01 11 1 1 01 0001 Subtract Shift Restore 1 0000 1 0000 1 1 Initially Subtract Shift 10111 10000 11000 00000 Subtract Shift Restore 10111 01000 10000 1 1 QuotientRemainder Shift 10111 1 0000 Subtract Second cycle First cycle Third cycle Fourth cycle 0 0 0 0 0 0 1 0 1 10000 1 1 1 0000 11111 Restore q0Set q0Set q0Set q0Set
  • 74. Nonrestoring Division  Avoid the need for restoring A after an unsuccessful subtraction.  Step 1: (Repeat n times) Ø If the sign of A is 0, shift A and Q left one bit position and subtract M from A; otherwise, shift A and Q left and add M to A. Ø Now, if the sign of A is 0, set q0 to 1; otherwise, set q0 to 0.  Step2: If the sign of A is 1, add M to A
  • 75. Examples A nonrestoring-division example. Add Restore remainder Remainder 0 0 0 01 1 1 1 1 1 0 0 0 1 1 1 Quotient 0 0 1 01 1 1 1 1 0 0 0 01 1 1 1 1 Shift 0 0 0 11000 01111 Add 0 0 0 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 Shift Subtract Initially 0 0 0 0 0 1 0 0 0 1 1 1 0 0000 1 1 1 0 0 0 0 0 1 1 0 0 0Shift Add 0 0 10 0 0 01 1 1 1 0 1 Shift Subtract 0 0 0 110000 Fourth cycle Third cycle Second cycle First cycle q 0Set q 0 Set q 0 Set q 0 Set
  • 77. Fractions If b is a binary vector, then we have seen that it can be interpreted as an unsigned integer by: V(b) = b31.231 + b30.230 + bn-3.229 + .... + b1.21 + b0.20 This vector has an implicit binary point to its immediate right: b31b30b29....................b1b0. implicit binary point Suppose if the binary vector is interpreted with the implicit binary point is just left of the sign bit: implicit binary point .b31b30b29....................b1b0 The value of b is then given by: V(b) = b31.2-1 + b30.2-2 + b29.2-3 + .... + b1.2-31 + b0.2-32
  • 78. Range of fractions The value of the unsigned binary fraction is: V(b) = b31.2-1 + b30.2-2 + b29.2-3 + .... + b1.2-31 + b0.2-32 The range of the numbers represented in this format is: In general for a n-bit binary fraction (a number with an assumed binary point at the immediate left of the vector), then the range of values is: 9999999998.021)(0 32 ≈−≤≤ − bV n bV − −≤≤ 21)(0
  • 79. Scientific notation •Previous representations have a fixed point. Either the point is to the immediate right or it is to the immediate left. This is called Fixed point representation. •Fixed point representation suffers from a drawback that the representation can only represent a finite range (and quite small) range of numbers. A more convenient representation is the scientific representation, where the numbers are represented in the form: x = m1.m2m3m4 × b±e Components of these numbers are: Mantissa (m), implied base (b), and exponent (e)
  • 80. Significant digits A number such as the following is said to have 7 significant digits x = ±0.m1m2m3m4m5m6m7 × b±e Fractions in the range 0.0 to 0.9999999 need about 24 bits of precision (in binary). For example the binary fraction with 24 1’s: 111111111111111111111111 = 0.9999999404 Not every real number between 0 and 0.9999999404 can be represented by a 24-bit fractional number. The smallest non-zero number that can be represented is: 000000000000000000000001 = 5.96046 x 10-8 Every other non-zero number is constructed in increments of this value.
  • 81. Sign and exponent digits •In a 32-bit number, suppose we allocate 24 bits to represent a fractional mantissa. •Assume that the mantissa is represented in sign and magnitude format, and we have allocated one bit to represent the sign. •We allocate 7 bits to represent the exponent, and assume that the exponent is represented as a 2’s complement integer. •There are no bits allocated to represent the base, we assume that the base is implied for now, that is the base is 2. •Since a 7-bit 2’s complement number can represent values in the range 64 to 63, the range of numbers that can be represented is: 0.0000001 x 2-64 < = | x | <= 0.9999999 x 263 •In decimal representation this range is: 0.5421 x 10-20 < = | x | <= 9.2237 x 1018
  • 82. A sample representation Sign Exponent Fractional mantissa bit 1 7 24 •24-bit mantissa with an implied binary point to the immediate left •7-bit exponent in 2’s complement form, and implied base is 2.
  • 83. Normalization If the number is to be represented using only 7 significant mantissa digits, the representation ignoring rounding is: Consider the number: x = 0.0004056781 x 1012 x = 0.0004056 x 1012 If the number is shifted so that as many significant digits are brought into 7 available slots: x = 0.4056781 x 109 = 0.0004056 x 1012 Exponent of x was decreased by 1 for every left shift of x. A number which is brought into a form so that all of the available mantissa digits are optimally used (this is different from all occupied which may not hold), is called a normalized number. Same methodology holds in the case of binary mantissas 0001101000(10110) x 28 = 1101000101(10) x 25
  • 84. Normalization (contd..) •A floating point number is in normalized form if the most significant 1 in the mantissa is in the most significant bit of the mantissa. •All normalized floating point numbers in this system will be of the form: 0.1xxxxx.......xx Range of numbers representable in this system, if every number must be normalized is: 0.5 x 2-64 <= | x | < 1 x 263
  • 85. Normalization, overflow and underflow The procedure for normalizing a floating point number is: Do (until MSB of mantissa = = 1) Shift the mantissa left (or right) Decrement (increment) the exponent by 1 end do Applying the normalization procedure to: .000111001110....0010 x 2-62 gives: .111001110........ x 2-65 But we cannot represent an exponent of –65, in trying to normalize the number we have underflowed our representation. Applying the normalization procedure to: 1.00111000............x 263 gives: 0.100111..............x 264 This overflows the representation.
  • 86. Changing the implied base So far we have assumed an implied base of 2, that is our floating point numbers are of the form: x = m 2e If we choose an implied base of 16, then: x = m 16e Then: y = (m.16) .16e-1 (m.24) .16e-1 = m . 16e = x •Thus, every four left shifts of a binary mantissa results in a decrease of 1 in a base 16 exponent. •Normalization in this case means shifting the mantissa until there is a 1 in the first four bits of the mantissa.
  • 87. Excess notation •Rather than representing an exponent in 2’s complement form, it turns out to be more beneficial to represent the exponent in excess notation. •If 7 bits are allocated to the exponent, exponents can be represented in the range of -64 to +63, that is: -64 <= e <= 63 Exponent can also be represented using the following coding called as excess-64: E’ = Etrue + 64 In general, excess-p coding is represented as: E’ = Etrue + p True exponent of -64 is represented as 0 0 is represented as 64 63 is represented as 127 This enables efficient comparison of the relative sizes of two floating point numbers.
  • 88. IEEE notation IEEE Floating Point notation is the standard representation in use. There are two representations: - Single precision. - Double precision. Both have an implied base of 2. Single precision: - 32 bits (23-bit mantissa, 8-bit exponent in excess-127 representation) Double precision: - 64 bits (52-bit mantissa, 11-bit exponent in excess-1023 representation) Fractional mantissa, with an implied binary point at immediate left. Sign Exponent Mantissa 1 8 or 11 23 or 52
  • 89. Peculiarities of IEEE notation •Floating point numbers have to be represented in a normalized form to maximize the use of available mantissa digits. •In a base-2 representation, this implies that the MSB of the mantissa is always equal to 1. •If every number is normalized, then the MSB of the mantissa is always 1. We can do away without storing the MSB. •IEEE notation assumes that all numbers are normalized so that the MSB of the mantissa is a 1 and does not store this bit. •So the real MSB of a number in the IEEE notation is either a 0 or a 1. •The values of the numbers represented in the IEEE single precision notation are of the form: (+,-) 1.M x 2(E - 127) •The hidden 1 forms the integer part of the mantissa. •Note that excess-127 and excess-1023 (not excess-128 or excess-1024) are used to represent the exponent.
  • 90. Exponent field In the IEEE representation, the exponent is in excess-127 (excess-1023) notation. The actual exponents represented are: -126 <= E <= 127 and -1022 <= E <= 1023 not -127 <= E <= 128 and -1023 <= E <= 1024 This is because the IEEE uses the exponents -127 and 128 (and -1023 and 1024), that is the actual values 0 and 255 to represent special conditions: - Exact zero - Infinity
  • 91. Floating point arithmetic Addition: 3.1415 x 108 + 1.19 x 106 = 3.1415 x 108 + 0.0119 x 108 = 3.1534 x 108 Multiplication: 3.1415 x 108 x 1.19 x 106 = (3.1415 x 1.19 ) x 10(8+6) Division: 3.1415 x 108 / 1.19 x 106 = (3.1415 / 1.19 ) x 10(8-6) Biased exponent problem: If a true exponent e is represented in excess-p notation, that is as e+p. Then consider what happens under multiplication: a. 10(x + p) * b. 10(y + p) = (a.b). 10(x + p + y +p) = (a.b). 10(x +y + 2p) Representing the result in excess-p notation implies that the exponent should be x+y+p. Instead it is x+y+2p. Biases should be handled in floating point arithmetic.
  • 92. Floating point arithmetic: ADD/SUB rule  Choose the number with the smaller exponent.  Shift its mantissa right until the exponents of both the numbers are equal.  Add or subtract the mantissas.  Determine the sign of the result.  Normalize the result if necessary and truncate/round to the number of mantissa bits. Note: This does not consider the possibility of overflow/underflow.
  • 93. Floating point arithmetic: MUL rule  Add the exponents.  Subtract the bias.  Multiply the mantissas and determine the sign of the result.  Normalize the result (if necessary).  Truncate/round the mantissa of the result.
  • 94. Floating point arithmetic: DIV rule  Subtract the exponents  Add the bias.  Divide the mantissas and determine the sign of the result.  Normalize the result if necessary.  Truncate/round the mantissa of the result. Note: Multiplication and division does not require alignment of the mantissas the way addition and subtraction does.
  • 95. Guard bits While adding two floating point numbers with 24-bit mantissas, we shift the mantissa of the number with the smaller exponent to the right until the two exponents are equalized. This implies that mantissa bits may be lost during the right shift (that is, bits of precision may be shifted out of the mantissa being shifted). To prevent this, floating point operations are implemented by keeping guard bits, that is, extra bits of precision at the least significant end of the mantissa. The arithmetic on the mantissas is performed with these extra bits of precision. After an arithmetic operation, the guarded mantissas are: - Normalized (if necessary) - Converted back by a process called truncation/rounding to a 24-bit mantissa.
  • 96. Truncation/rounding  Straight chopping:  The guard bits (excess bits of precision) are dropped.  Von Neumann rounding:  If the guard bits are all 0, they are dropped.  However, if any bit of the guard bit is a 1, then the LSB of the retained bit is set to 1.  Rounding:  If there is a 1 in the MSB of the guard bit then a 1 is added to the LSB of the retained bits.
  • 97. Rounding  Rounding is evidently the most accurate truncation method.  However,  Rounding requires an addition operation.  Rounding may require a renormalization, if the addition operation de- normalizes the truncated number.  IEEE uses the rounding method. 0.111111100000 rounds to 0.111111 + 0.000001 =1.000000 which must be renormalized to 0.100000