Unit 3 Arithmetic building blocks and memory Design (1).pdf

Arithmetic Building Blocks Design
Unit III

Common datapath operators include adders, one/zero detectors,
comparators, counters, Boolean logic units, error-correcting code
blocks, shifters, and multipliers.
Adder
Addition forms the basis for many processing operations, from
counting to multiplication to filtering.
As a result, adder circuits that add two binary numbers are of
great interest to digital system designers.
Datapath Subsystem

From the truth table of the full adder, the inverting property follows:
The complement of the sum S is the same as the sum of the complemented inputs (i.e. S’ of the
first truth table is the same as S of a second truth table whose inputs are complemented).
The complement of the carry Cout is the same as the carry of the complemented inputs.
A B Ci S Cout
1 1 1 1 1
1 1 0 0 1
1 0 1 0 1
1 0 0 1 0
0 1 1 0 1
0 1 0 1 0
0 0 1 1 0
0 0 0 0 0
Generate (G)=AB
Delete (D)=A’B’
Propagate (P)=A XOR B
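The inverting property can be checked exhaustively. The sketch below is only an illustration: the function names are hypothetical, and the full adder is modeled behaviorally rather than at transistor level.

```python
from itertools import product

def full_adder(a, b, ci):
    """Behavioral one-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ ci
    cout = (a & b) | (ci & (a ^ b))
    return s, cout

def inverting_property_holds():
    """Check that complementing all inputs complements both outputs."""
    for a, b, ci in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, ci)
        s_inv, cout_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if (1 - s, 1 - cout) != (s_inv, cout_inv):
            return False
    return True

print(inverting_property_holds())
```

The check passes because the sum is a three-input XOR and the carry is a majority function, both of which are self-dual.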

An N-bit adder can be constructed by cascading N full-adder circuits in series.
This configuration is called a ripple-carry adder.
For some input patterns the carry ripples all the way from the LSB to the MSB.
The propagation delay of such a structure is defined as the worst-case delay over all possible input patterns.
The propagation delay for an N-bit word is
tadder=(N-1)tcarry + tsum
where tcarry and tsum are the propagation delays from carry-in to carry-out and to sum of a one-bit full adder.
The drawbacks of the ripple-carry adder are:
The propagation delay increases linearly with N (N ranges from 16 to 128 in wide
datapaths).
The delay is dominated by tcarry of the full-adder cell, so the cell should be optimized for a
fast carry path.
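As a minimal behavioral sketch of the ripple-carry structure and its delay expression (function names and the unit delays used in the example are assumptions, not values from the slides):

```python
def full_adder(a, b, ci):
    """Behavioral one-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ ci
    cout = (a & b) | (ci & (a ^ b))
    return s, cout

def ripple_carry_add(a, b, n, cin=0):
    """Add two n-bit unsigned integers one bit at a time, LSB first.

    Returns (sum modulo 2**n, carry_out).
    """
    carry = cin
    total = 0
    for i in range(n):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry

def ripple_delay(n, t_carry, t_sum):
    """Worst-case delay: tadder = (N-1)*tcarry + tsum."""
    return (n - 1) * t_carry + t_sum

# 1111 + 0001 makes the carry ripple through every bit position.
print(ripple_carry_add(0b1111, 0b0001, 4))
print(ripple_delay(16, t_carry=1, t_sum=1))
```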

A gate-level realization of these two functions is shown in the figure.
Note that instead of realizing the two functions independently, the
carry-out signal is used to generate the sum output, since the sum can
also be expressed (from the truth table) as
Sum=ABC+Cout’(A+B+C)
This implementation reduces the circuit complexity and hence saves
chip area.
Fig: Gate-level schematic of the one-bit full-adder circuit

Fig: Transistor-level schematic of the one-bit full-adder circuit.
In the above circuit the Cout of each full-adder cell passes through an
inverter, which increases the carry-propagation delay. The inverting
property can be used in the N-bit adder to avoid this, as shown in the figure below:

By inverting the inputs of the odd-numbered cells of the N-bit adder, the inverter in the
carry path can be eliminated. The input inversion adds only to the vertical delay, which is
very small compared with the carry-propagation (horizontal) delay.

Mirror Adder
The mirror adder circuit is shown in the figure. Here the PDN and PUN networks of the
gate are not dual.
This adder cell requires only 28 transistors (24 transistors without the inverters).
In this circuit the carry is determined at C0 when A=B=0 (C0’=1) or A=B=1 (C0’=0),
irrespective of Ci (C0’ is connected to either Vdd or Gnd). The incoming carry is propagated only
when A!=B (A XOR B = 1). This results in a considerable reduction in both area and delay.
Generate (G)=AB
Delete (D)=A’B’
Cout=AB+C(A+B)
Sum=ABC+Cout’(A+B+C)
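The two mirror-adder expressions can be checked against ordinary binary addition. This is a logic-level sketch with hypothetical function names; it says nothing about the transistor network itself.

```python
from itertools import product

def mirror_adder(a, b, c):
    """Evaluate Cout = AB + C(A+B) and Sum = ABC + Cout'(A+B+C)."""
    cout = (a & b) | (c & (a | b))
    s = (a & b & c) | ((1 - cout) & (a | b | c))
    return s, cout

def matches_arithmetic():
    """Compare both outputs with the bits of a + b + c for all 8 inputs."""
    return all(
        mirror_adder(a, b, c) == ((a + b + c) & 1, (a + b + c) >> 1)
        for a, b, c in product((0, 1), repeat=3)
    )

print(matches_arithmetic())
```

Reusing Cout’ inside the sum is what lets the sum gate share logic with the carry gate.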

A full adder can also be designed using transmission gates to form multiplexers and XORs, as
shown in the figure. Figure (a) shows the transistor-level schematic, which uses 24 transistors
and provides buffered outputs of the proper polarity with equal delay.
TG1 and TG3 turn ON for AB = 00 and 11; TG2 and TG4 turn ON for AB = 01 and
10. The design can be understood by parsing the transmission gate
structures into multiplexers and an "invertible inverter" XOR structure.
Fig: Transmission gate full adder
Transmission gate based adder
From the truth table as well as from figure (b):
When A=B: S=C, Cout=A
When A!=B: S=C’, Cout=C

PG Carry-Ripple Addition
The critical path of the carry-ripple adder passes from carry-in to
carry-out along the chain of majority gates. Because the P and G signals
have already stabilized by the time the carry arrives, they can be used
to simplify the majority function into an AND-OR gate:
Ci = Gi + Pi Ci-1
Because Ci=Gi:0, carry-ripple addition can now be viewed as the
extreme case of group PG logic in which a 1-bit group is combined with
an i-bit group to form an (i+1)-bit group. The figure shows a 4-bit
carry-ripple adder.

Fig: 4-bit carry-ripple adder using PG logic
Generate (G)=AB

The critical carry path now proceeds through a chain of AND-OR gates rather
than a chain of majority gates.
From the figure it can be seen that the carry-ripple adder critical path delay is
tripple = tpg + (N-1)tAO + txor
where tpg is the delay of the 1-bit propagate/generate gates,
tAO is the delay of the AND-OR gate, and
txor is the delay of the final sum XOR.
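A behavioral sketch of PG carry-ripple addition is given below. It assumes the usual AND-OR carry recurrence (carry out = G OR (P AND carry in)) and a final sum XOR; the function name is hypothetical.

```python
def pg_ripple_add(a, b, n, cin=0):
    """n-bit addition using per-bit generate/propagate and an AND-OR carry chain.

    Returns (sum modulo 2**n, carry_out).
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]    # Gi = Ai AND Bi
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # Pi = Ai XOR Bi
    carry = cin
    total = 0
    for i in range(n):
        total |= (p[i] ^ carry) << i       # final sum XOR
        carry = g[i] | (p[i] & carry)      # AND-OR gate on the critical path
    return total, carry

print(pg_ripple_add(0b1011, 0b0110, 4))
```

Only the AND-OR update sits on the carry path; the P and G lists are computed once, before the carry arrives.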

Manchester carry chain adder
In the static circuit (Fig. a), C0=Ci when Pi is true and Gi and Di are not true
(i.e. when A!=B, C0=Ci; when A=B, C0 is tied to either Vdd or Gnd).
The dynamic implementation (Fig. b) makes a further simplification possible: the TG is
replaced with an NMOS pass transistor, the node is precharged to Vdd while the clock is 0,
and the output is evaluated depending on Gi while the clock is 1.
The dynamic version of a cascaded 4-bit Manchester carry chain adder is shown in the
figure below:
Generate (G)=AB
Delete (D)=A’B’

In the above circuit:
When Ø=0, all intermediate nodes are precharged to Vdd (i.e. Couti=0),
When Ø=1, evaluation takes place depending on Pi, Ci and Gi
(i.e. a node is discharged when Gi=1; otherwise the node retains Vdd or follows Ci).
The worst-case delay of the carry chain is modeled by an RC network. For four stages it is given by:
tp = 0.69(R1C1 + (R1+R2)C2 + (R1+R2+R3)C3 + (R1+R2+R3+R4)C4)
When all Ci=C and Rj=R, this reduces to tp = 0.69(N(N+1)/2)RC
(N is the number of inputs plus 1, the extra one coming from the inverter transistor).
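The RC-ladder delay above can be evaluated numerically. The sketch below is only an illustration: the function name and the normalized values R = C = 1 are assumptions.

```python
def elmore_delay(resistances, capacitances):
    """Return 0.69 * sum over i of Ci * (R1 + ... + Ri) for an RC ladder."""
    total = 0.0
    r_sum = 0.0
    for r, c in zip(resistances, capacitances):
        r_sum += r            # resistance between the source and node i
        total += r_sum * c
    return 0.69 * total

# Four equal stages: the sum is 1 + 2 + 3 + 4 = 10 RC units.
print(elmore_delay([1.0] * 4, [1.0] * 4))
```

With equal R and C the sum grows as N(N+1)/2, i.e. quadratically in the chain length, which is why long Manchester chains are normally broken up with buffers.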

Carry-Bypass adder
Generate (Gi)=AiBi
Delete (Di)=Ai’Bi’
Propagate (Pi)=Ai XOR Bi
When all Pi=1, an incoming carry Ci,0=1 bypasses the complete adder chain and
causes an outgoing carry Co,3=1,
i.e. if (P0P1P2P3=1) then Co,3=Ci,0; otherwise a carry kill or generate has occurred in the chain.
This information can be used to speed up the operation of the adder, as shown in the figure
below.
Fig: Carry-bypass structure
When BP=P0P1P2P3=1, the incoming carry is forwarded immediately to the next
block through the bypass transistor, hence the name carry-bypass adder or carry-skip
adder. If this is not true, the carry is obtained in the normal way, by rippling through the block.
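A behavioral sketch of one 4-bit bypass block follows. The function name and the returned tuple layout are assumptions; the bypass multiplexer is modeled as a simple conditional.

```python
def bypass_block(a, b, cin, m=4):
    """One m-bit carry-bypass block.

    Returns (sum bits, carry_out, bp), where bp = P0 AND P1 AND ... AND P(m-1).
    """
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(m)]
    g = [(a >> i) & (b >> i) & 1 for i in range(m)]
    carry = cin
    total = 0
    for i in range(m):
        total |= (p[i] ^ carry) << i
        carry = g[i] | (p[i] & carry)    # rippled carry
    bp = int(all(p))
    cout = cin if bp else carry          # bypass multiplexer
    return total, cout, bp

print(bypass_block(0b0101, 0b1010, 1))   # every bit propagates: carry is bypassed
print(bypass_block(0b0011, 0b0001, 0))   # bit 0 generates: carry comes from the ripple path
```

Logically the bypassed and rippled carries always agree; the benefit is purely one of timing, since the bypassed carry does not wait for the ripple.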

Fig: Carry-bypass adder (N=16), built from four 4-bit blocks (bits 0-3, 4-7, 8-11 and 12-15), each with setup, carry-propagation and sum sections (the worst-case delay path is shaded in gray)
The delay of an N-bit adder can be calculated by dividing the adder into
(N/M) equal-length bypass stages, each of which contains M bits. The total
propagation time is given by
tadder = tsetup + Mtcarry + ((N/M)-1)tbypass + (M-1)tcarry + tsum
where tsetup is the time to form the propagate and generate signals, Mtcarry is the ripple
through the first block, tbypass is the delay of one bypass stage, and (M-1)tcarry + tsum is the
ripple and sum delay in the last block.

The figure shows the propagation delay of the ripple-carry adder and the carry-bypass
adder as a function of N.
From the graph it is clear that the ripple adder is faster for small values of N,
while the bypass adder has the lower propagation delay for large values of N.
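The crossover can be reproduced from the two delay expressions. The sketch below assumes M = 4 and unit values for every component delay; these numbers are illustrative only, so the exact crossover point will differ for a real cell library.

```python
def ripple_delay(n, t_carry=1.0, t_sum=1.0):
    """tadder = (N-1)*tcarry + tsum"""
    return (n - 1) * t_carry + t_sum

def bypass_delay(n, m=4, t_setup=1.0, t_carry=1.0, t_bypass=1.0, t_sum=1.0):
    """tadder = tsetup + M*tcarry + (N/M - 1)*tbypass + (M-1)*tcarry + tsum"""
    return t_setup + m * t_carry + (n / m - 1) * t_bypass + (m - 1) * t_carry + t_sum

for n in (4, 8, 16, 32, 64):
    print(n, ripple_delay(n), bypass_delay(n))
```

Both delays are linear in N, but the bypass adder's slope is tbypass/M per bit instead of tcarry per bit.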

Linear carry select adder
Fig: 4 bit carry select module
In a ripple-carry adder every cell has to wait for the incoming carry before it can
generate an outgoing carry. One way to overcome this is to evaluate the result for both
possible values of the carry input (i.e. 0 and 1) in advance.
Once the real value of the incoming carry is known, the correct result is selected
with a multiplexer. This takes extra hardware but improves the speed of operation.
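A behavioral sketch of one carry-select module is shown below. Function names are hypothetical; the two ripple chains would run in parallel in hardware, whereas here they are simply evaluated one after the other.

```python
def ripple_block(a, b, cin, m):
    """m-bit ripple addition; returns (sum bits, carry_out)."""
    carry, total = cin, 0
    for i in range(m):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry

def carry_select_block(a, b, cin, m=4):
    """Precompute the results for carry-in 0 and 1, then select with cin."""
    result0 = ripple_block(a, b, 0, m)   # "0-carry" chain
    result1 = ripple_block(a, b, 1, m)   # "1-carry" chain
    return result1 if cin else result0   # multiplexer

print(carry_select_block(0b1001, 0b0110, 1))
```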

A full carry-select adder is constructed by chaining a number of equal-length adder
stages. The worst-case propagation delay is:
tadder = tsetup + Mtcarry + (N/M)tmux + tsum
where tsetup, tsum and tmux are fixed delays,
N = total number of bits,
M = number of bits per stage,
tcarry = carry delay of a single-bit full-adder cell.

Consider the case of a 16-bit linear carry-select adder. Assume that the full-adder and mux
cells have identical propagation delays, equal to a normalized value of 1.
The worst-case arrival times of the signals at different nodes are shown in the figure below.
Consider the mux in the last adder stage. Its inputs are the two carry
chains of the block, which are ready at time 5, and the multiplexer signal from the previous stage, which arrives at time 8.
A major mismatch between the arrival times of the signals can be observed.
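The arrival times in this example can be tabulated with a short sketch. It assumes four 4-bit stages, unit delays for setup, carry, mux and sum, and a first-stage mux that also costs one unit; the function name is hypothetical.

```python
def linear_carry_select_arrivals(n=16, m=4, t_setup=1, t_carry=1, t_mux=1, t_sum=1):
    """Return (time the carry chains are ready, mux output times per stage, final sum time)."""
    chains_ready = t_setup + m * t_carry   # all blocks ripple in parallel
    mux_out = []
    t = chains_ready
    for _ in range(n // m):
        t += t_mux                         # the select signal ripples stage to stage
        mux_out.append(t)
    return chains_ready, mux_out, mux_out[-1] + t_sum

print(linear_carry_select_arrivals())
```

In the last stage the carry chains are ready at 5 while the select from the previous stage arrives at 8, so the chains sit idle for three time units. That slack is what the square-root arrangement exploits.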

The mismatch in arrival times can be overcome by progressively adding more
bits to the subsequent stages of the adder.
For example, the first stage is a 2-bit adder, the second stage a 3-bit adder, the third a
4-bit adder, and so on. This is called a square-root carry-select adder.
Square-root carry select adder
Fig: Square-root carry select adder configuration

Assume an N-bit adder contains P stages and the first stage adds M bits. One
additional bit is added in each subsequent stage, so that:
N = M + (M+1) + (M+2) + ... + (M+P-1) = MP + P(P-1)/2 = P^2/2 + P(M-1/2)
If M<<N (e.g. M=2 and N=64), the first term dominates and N is approximated
as N=P^2/2, i.e. P=sqrt(2N).
Fig: Propagation delay versus number of bits for the linear and square-root carry-select adders
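The stage count can be checked numerically. The sketch below assumes stage sizes M, M+1, ..., M+P-1, so that P stages cover M*P + P(P-1)/2 bits; function names are hypothetical.

```python
import math

def sqrt_select_bits(m, p):
    """Total bits covered by p stages of sizes m, m+1, ..., m+p-1."""
    return m * p + p * (p - 1) // 2

def stages_needed(n, m=2):
    """Smallest number of stages whose sizes add up to at least n bits."""
    p = 1
    while sqrt_select_bits(m, p) < n:
        p += 1
    return p

print(sqrt_select_bits(2, 4))           # stage sizes 2 + 3 + 4 + 5
print(stages_needed(64, 2))             # exact stage count for N = 64, M = 2
print(math.sqrt(2 * 64))                # estimate from N = P^2/2
```

The estimate sqrt(2N) slightly overshoots the exact count because it ignores the P(M-1/2) term.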

Carry-Lookahead Adder:
The carry-lookahead adder (CLA) is similar to the carry-bypass adder, but
computes group generate signals as well as group propagate signals to avoid
waiting for a ripple to determine if the first group generates a carry.
In general
The carry-out logic is
Generate (Gi)=AiBi
Delete (Di)=Ai’Bi’

A possible circuit implementation of the above equation for N=4 is shown in the
figure. The circuit exploits the self-duality and the recursive form of the carry-lookahead
equation to build a mirror structure.
Consider an example:
A = 0000
B = 1111
Pi = 1111
Gi = 0000
If Ci=0, then Cout’=1 and Cout=0.
If Ci=1, then Cout’=0 and Cout=1.
Generate (Gi)=AiBi
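The example can be reproduced with a small lookahead sketch. It assumes the standard flat expansion of the carry, Ck = Gk + Pk Gk-1 + ... + Pk...P0 Ci, evaluated directly from the P and G bits rather than by rippling; the function name is hypothetical.

```python
def cla_carries(a, b, cin, n=4):
    """Return (P bits, G bits, carries) with each carry computed as a flat sum of products."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    carries = []
    for k in range(n):
        c = g[k]
        prod = p[k]
        for j in range(k - 1, -1, -1):
            c |= prod & g[j]     # term Pk ... P(j+1) Gj
            prod &= p[j]
        c |= prod & cin          # term Pk ... P0 Ci
        carries.append(c)
    return p, g, carries

print(cla_carries(0b0000, 0b1111, 0))
print(cla_carries(0b0000, 0b1111, 1))
```

With A=0000 and B=1111 every bit propagates and none generates, so the carry out simply follows Ci.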

Prob1: Develop equations for the logical effort and parasitic delay with
respect to the C0 input of an n-stage Manchester carry chain computing C1…
Cn. Consider all of the internal diffusion capacitances when deriving the
parasitic delay. Use the transistor widths shown in Figure. 2 and assume the
Pi and Gi transistors of each stage share a single diffusion contact.
a) Calculate the carry chain length of the Manchester carry chain adder that
achieves the least delay.
Figure 2: Manchester carry chain

a) From the result obtained which Manchester carry chain length gives the least
delay for a long adder?
The number of stages is inversely proportional to n.

Array Multiplication
The array multiplier block diagram is shown in the figure. Each cell contains a 2-input AND
gate that forms a partial product and a full adder that adds the partial product into the
running sum.
The array multiplier circuit is based on a repeated addition and shifting procedure. Each
partial product is generated by multiplying the multiplicand by one multiplier
digit.
Fig: Array multiplier

                            a3    a2    a1    a0
                       x    b3    b2    b1    b0
                          a3b0  a2b0  a1b0  a0b0
                    a3b1  a2b1  a1b1  a0b1
              a3b2  a2b2  a1b2  a0b2
        a3b3  a2b3  a1b3  a0b3
    P7    P6    P5    P4    P3    P2    P1    P0
An N*n array multiplier requires N*n AND gates, n HAs and (N-2)*n
FAs. The multiplicand is multiplied by one bit of the multiplier to get
each partial product (PP). The bit positions of the multiplier bits determine how far each
partial product is shifted before it is added. The partial products are then added to obtain the
final product.
Illustration of 4-bit Array Multiplication
The partial products are shifted according to their bit positions and then added.
N-1 adders are required, where N is the number of multiplier bits.
The method is simple, but using ripple-carry adders in the array makes the delay high
and the area large.

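Fig: 4 x 4 array multiplier with inputs X3..X0 and Y3..Y0 and outputs Z7..Z0. Each of the three adder rows (for Y1, Y2 and Y3) has four cells: from X3 to X0 the first row is HA, FA, FA, HA and the other two rows are FA, FA, FA, HA.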
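The delay of an M x N array multiplier is approximately
tmult = [(M-1)+(N-2)]tcarry + (N-1)tsum + tand
where tand is the delay of the AND gate that forms a partial product.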
        1 0 1 0   (multiplicand)
      x 1 1 0 1   (multiplier)
        1 0 1 0
      0 0 0 0
    1 0 1 0
  1 0 1 0
1 0 0 0 0 0 1 0   (product)
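The worked example (1010 x 1101) can be reproduced with a shift-and-add sketch. It models only the arithmetic of the partial products, not the HA/FA cell array; the function name is hypothetical.

```python
def array_multiply(x, y, n=4):
    """Form one shifted partial product per multiplier bit, then add them.

    Returns (list of partial products, product).
    """
    partial_products = [(x if (y >> j) & 1 else 0) << j for j in range(n)]
    product = 0
    for pp in partial_products:
        product += pp
    return partial_products, product

pps, prod = array_multiply(0b1010, 0b1101)
print([format(pp, "08b") for pp in pps])
print(format(prod, "08b"))
```

Each partial product is the AND of the multiplicand with one multiplier bit, shifted left by that bit's position.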

Carry-Save Multiplier
Fig: Carry-save multiplier built from rows of HA and FA cells, with a vector-merging adder as the final row
A carry-save multiplier using 1-bit full adders as the basic unit cells is shown in the figure.
Each partial product is combined with the next one using subsequent rows of adders.
Here the carry bits are not added immediately within a row but are saved for
the next stage: the output carry bits are passed diagonally downwards. The cells in
the last row are organized as a regular carry-ripple adder (the vector-merging adder), where all
carries are connected in the usual way and ripple from the LSB to the MSB. The last row thus adds
the sum and carry outputs of the last carry-save row and provides the final output.
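The carry-save idea can be sketched at word level. Each step below is a row of independent full adders (a 3:2 compression) with no carry propagation inside the row; a single ordinary addition at the end plays the role of the vector-merging adder. Function names are hypothetical, and the sketch uses a full adder for every partial product rather than the exact HA/FA mix of the figure.

```python
def carry_save_step(x, y, z):
    """Bitwise full adders: returns (sum word, carry word shifted one place left)."""
    s = x ^ y ^ z
    c = ((x & y) | (y & z) | (x & z)) << 1
    return s, c

def carry_save_multiply(x, y, n=4):
    """Accumulate partial products in carry-save form, then merge once."""
    s, c = 0, 0
    for j in range(n):
        pp = (x << j) if (y >> j) & 1 else 0
        s, c = carry_save_step(s, c, pp)   # carries are saved, not rippled
    return s + c                           # vector-merging adder

print(carry_save_multiply(0b1010, 0b1101))
```

Because no row waits for a ripple, the critical path contains only one carry-propagating addition, the final one.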

Shifters
Shifting is another essential arithmetic operation. An n-bit shifter should
be able to shift incoming data by up to n - 1 places in the right-shift or left-shift
direction.
Two types of shifters are
– Barrel shifter
– Logarithmic shifter
The Barrel Shifter
Here the control lines are routed diagonally through the array. For a 4-bit
word with end-around shifting (rotation), a 1-bit shift right is equivalent to a 3-bit shift left, a 2-bit shift
right is equivalent to a 2-bit shift left, and so on.
Thus the capability to shift left or right by zero, one, two, or
three places can be achieved by designing a circuit that shifts right only (say) by zero,
one, two, or three places.
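The left/right equivalence holds for end-around shifts (rotations). The sketch below checks it for every 4-bit word; the function names are hypothetical.

```python
def rotate_right(word, k, n=4):
    """Rotate an n-bit word right by k places."""
    k %= n
    mask = (1 << n) - 1
    return ((word >> k) | (word << (n - k))) & mask

def rotate_left(word, k, n=4):
    """Rotate an n-bit word left by k places."""
    k %= n
    mask = (1 << n) - 1
    return ((word << k) | (word >> (n - k))) & mask

def right_equals_left(n=4):
    """A right rotation by k equals a left rotation by n - k, for all words and k."""
    return all(
        rotate_right(w, k, n) == rotate_left(w, n - k, n)
        for w in range(1 << n)
        for k in range(n)
    )

print(format(rotate_right(0b1001, 1), "04b"), format(rotate_left(0b1001, 3), "04b"))
print(right_equals_left())
```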

Fig: 4 x 4 barrel shifter.
In a barrel shifter:
• Area is dominated by wiring
• Propagation delay is theoretically constant, independent of the shifter size and the number of shifts
• A decoder is required to activate one of the 4 control lines (sh0, sh1, sh2 and sh3)

Logarithmic shifter
Logarithmic shifters are better suited for large shifts.
In a logarithmic shifter, no decoder is necessary.
A shifter with a maximum shift width of M consists of log2(M) stages, where the ith
stage either shifts the data by 2^i places or passes it unchanged.
The speed of the logarithmic shifter depends on the shift width in a logarithmic way, since an M-bit
shifter requires log2(M) stages.
The series connection of pass transistors slows the shifter down for larger shift values.
The advantage of the logarithmic shifter is that it is more effective for larger shift values in terms of
both area and speed.

Sh1  Sh2  No. of shifts
0    0    0
0    1    1
1    0    2
1    1    3