SlideShare a Scribd company logo
1 of 9
Download to read offline
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 1
Area Efficient and Reduced Pin Count Multipliers
Omar Nibouche o.nibouche@tu.edu.sa
College of Computers and Information Technology
Taif University
POB 888 Taif 21964, KSA
Abstract
Fully serial multipliers can play an important role in the implementation of DSP algorithms in
resource-limited chips such as FPGAs; offering area efficient architectures with a reduced pin
count and moderate throughput rates. In this paper two structures that implement the fully serial
multiplication operation are presented. One significant aspect of the new designs is that they are
systolic and require near communication links only. They are superior in speed and area usage to
similar architectures in the literature. The paper also present a new fully serial multiplier optimized
for area-time
2
efficiency with better performance than available architectures in the open
literature.
Keywords: Educed Pin Count, Serial Multiplication, Area-Time
2
.
1. INTRODUCTION
There is a crucial advantage offered by bit-serial processors over their parallel counterpart, which
lies in the very efficient use of chip area. They are particularly suitable for applications that require
slow to moderate speeds and in batch mode applications. By contrast, bit-parallel processors are
useful for fast speed systems, but at the expense of larger a area usage and thus they are more
expensive [1-2].
Traditional bit serial multiplier structures suffer from an inefficient generation of partial products,
which leads to hardware overuse and slow speed systems. In this paper, two structures for bit serial
multiplication are presented. The first structure, called structure I, is the first fully serial multiplier
reported in the literature with comparable performance - in terms of speed- to existing serial-parallel
multipliers. The second structure, termed structure II, requires an extra multiplexer in the clock path;
thus making it slower, but has the merit of reducing the latency of the multiplier.
The remainder of the paper is organised as follows: in section 2, the previous work in the literature is
reviewed, while section 3 describes the new structures for fully serial multiplication. Section 4 is
concerned with an optimisation of the multiplier of Structure I in terms of area-time
2
efficiency. A
comparison of performance is shown in section 5 and conclusions are given in section 6.
2. BIT SERIAL MULTIPLIERS: A REVIEW
One of the early bit serial multipliers was proposed by [3]. It generates the partial products in a
recursive fashion. Consider the multiplication of two n-bit positive numbers A and B as follows:
∑
−=
=
=
1
0
2
ni
i
i
iaA (1)
and
∑
−=
=
=
1
0
2
ni
i
i
ibB (2)
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 2
Let Pi represents the partial product computed after the i
th
bit is fed [3]. Pi is given by:
i
2
111 22 ba)AbB(aPP i
i
i-ii-i
i
i-i +++= (3)
where Ai and Bi represent the value of the operands A and B, respectively, and by considering only
bits from the Least Significant Bit (LSB) to the i
th
bit, that is,
Ai = A mod 2
i+1
(4)
and
Bi = B mod 2
i+1
(5)
with the initial values P-1 = A-1 = B-1 = 0.
The generation of the partial product, using equation (3), and their assignment to the multiplier cells
is shown in Table 1 below:
Cycle P Cell 1 Cell 2
1
2
3
4
5
6
7
8
a3
b3
Cell 3 Cell 4
a2
b3
+a3
b2
P0
a2b2
a1
b1
a0
b0
a1b2+a2b1
a0
b1
+a1
b0
a0
b2
+a2
b0
a1
b3
+a3
b1
a0
b3
+a3
b3
P1
P2
P3
P4
P5
P6
P7
TABLE 1: Multiplication Scheme of [3].
From the above table, each cell generates two new bit-products every cycle. To cope with this
constraint, The Basic Cell (BC) of the multiplier proposed by [3] is built around a 5 to 3 counter. The
counter is capable of accumulating five inputs of the same weight to a sum-bit (Sout) of the same
weight as the inputs, i.e.2
0
, 1
st
carry-bit (C
1
out) of a weight of 2
1
and 2
nd
carry-bit (C
2
out) of a weight of
2
2
. In particular, the sum-bit is calculated through a tree of EXOR gate to reduce the propagation
delay within the cell. The BC of the multiplier is shown in Figure 1. The multiplier uses n identical
cells to perform the multiplication of two n-bit numbers in 2n cycles, as it can be seen in Figure 2.
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 3
SoutC1
out C2
out
Cin
Sin
ai
bi
control
5to3 counter
Sout
Cout
Latch
FA
FA
HA
22
21
20
20
2020
20
20
(b) (a)
FIGURE 1 (a): 5 to 3 Counter Made of Two FAs and One HA (b). The BC of [3].
BC n BC 2 BC 1
ai
Si
0
control
bi
FIGURE 2: The n-bit Multiplier by [3].
In [4], modifications were carried out on the multiplication scheme of Table 1 to make a more efficient
use of the hardware. In fact, the multiplier proposed by [4] uses only half the number of cells required
by [3]. To achieve this, the partial products generated by the last n/2 cells in [3] were reallocated to
the first n/2 cells and rescheduled at cycle n+1. In this way, a full utilisation of cells 1 to n/2 can be
achieved, as it can be shown in Table 2.
Cycle P Cell 1 Cell 2
1
2
3
4
5
6
7
8
a3
b3
a2b3+a3b2
P0
a2
b2
a1
b1
a0b0
a1b2+a2b1
a0b1+a1b0
a0b2+a2b0
a1
b3
+a3
b1
a0
b3
+a3
b3
P1
P2
P3
P4
P5
P6
P7
TABLE 2: Multiplication Scheme by [4].
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 4
The multiplier was modified using little extra hardware: two n/2 shift registers are used to store the
n/2 most significant bits of the data words A and B and n/2 multiplexers. The BC of the modified
multiplier d is shown in Figure 3. The multiplier structure as proposed by [4] is shown in Figure 4.
This architecture has reduced the number of 5 to 3 counters by 50%, as well as reducing the number
of latches by 33%. The clock path equals the delay of a multiplexer, a gated counter and a latch.
1
0
5to3 counter
Sout
C1
out
C2
out
Cin
Sin
ai
bi
control
Cout Sout
Latch
FIGURE 3: The BC of [4].
BC N/2 BC 2 BC 1
1
0 ai
Si
0
control
1
0
n/2 Shift Register
n/2 Shift Register bi
FIGURE 4: The n-bit Multiplier by [4].
3. THE NEW FULLY SERIAL MULTIPLIERS
Although about 50% of the area used in [3] has been saved by [4], the throughput rate has not
been increased. On the contrary, it has decreased as a multiplexer was added to the structure
making the clock period of the multiplier in [4] equivalent to the delay of a multiplexer, an AND
gate, a 5 to 3 counter and a latch. To remedy this problem, two new structures are proposed.
3.1. Structure I
In order to reduce the clock path as described above, the multiplication algorithm has been modified.
It generates the bit-products associated with cells 2 to n at the (n+1)
th
cycle. The scheduling of the
tasks of the first cell is kept unchanged, but the latency of the multiplier is increased to n cycles. The
multiplication scheme is shown in Table 3 for 4-bit operands. The multiplication operation can be
divided into two parts, which can easily be done by rewriting the product of the two numbers, A and
B, in the following way:
ji
ni
i
nj
j
jibaBaB*A +
−=
=
−=
=
∑ ∑+= 2
1
1
1
0
0
(6)
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 5
To keep working under the constraints of sending one bit of each operand at a time, the term a0B
in equation (6) is generated by the first cell of the multiplier during the first n cycles. At this stage,
the other cells only propagate the partial products already generated by the first cell. At the
(n+1)
th
cycle, all the operand bits have been fed and the term ji
ni
i
nj
j
ji ba +
−=
=
−=
=
∑ ∑ 2
1
1
1
0
can be generated
during the last n cycles. The clock period is equivalent to the delay of a FA, an AND gate and a
latch, as shown in Figure 5. Therefore, Structure I achieves similar speed performances when
compared with serial-parallel multipliers.
3.2. Structure II
The 5 to 3 counters have been widely used in the literature [3-6]. Basically, such a counter
reduces 5 bits of the same weight to three bits: a result-bit of the same weight as the inputs (a
weight of 2
0
), a carry-bit, which has a weight of 2
1
, and a far carry-bit that has a weight of 2
2
. It is
clear that while the sum of the inputs is up to 5, the sum of the outputs is up to 6, and as such,
two representations of the outputs are excluded. Therefore, it is clearly more appropriate to
reduce the 5 inputs to 3 outputs: a result-bit of the same weight as the inputs, and 2 carry-bits of
twice the weight of the result-bit. For this purpose, a new cell has been developed by using two
FAs as shown in Figure 6. The first FA is used to accumulate two bit products with a carry feedback.
The second FA is used to generate a result-bit from the result of the first FA, the result-bit from the
adjacent cell and a carry feedback. The two carry-bits generated by the new cell are fed back and
accumulated with the bit-products of the next cycle. The sum-bit of the first FA is registered, and thus
making the clock period equivalent to the delay of a multiplexer, an AND gate, a FA and a latch. The
multiplier structure implements directly the algorithm shown in Table 2. The multiplier requires only
n/2 cells for the multiplication of two n-bit numbers. It is also modular and needs near communication
links only.
TABLE 3: Structure I Multiplication Scheme For 4-bit Operands.
Cycle P Cell 1 Cell 2 Cell 3 Cell 4
1 a0b0
2 a0b1
3 a0b2
4 P0 a0b3
5 P1 a3b0 a2b0 a1b0
6 P2 a3b1 a2b1 a1b1
7 P3 a3b2 a2b2 a1b2
8 P4 a3b3 a2b3 a1b3
9 P5
10 P6
11 P7
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 6
result
control
0 FA FA FA FA
1
0n Shift Register
ai
bi
control
Latch
00010001
FIGURE 5: Structure I bit-bit Serial Multiplier.
ai
bi
control
Sin
Sout
1
0
FA
FA
FIGURE 6: The Basic Cell of Structure II Fully Serial Multiplier
4. AREA-TIME2
EFFICIENT BIT-BIT SERIAL MULTIPLIER
In this section, a new multiplier structure, which is capable of multiplexing two multiplication
operations into Structure I is proposed. This has the merit of doubling the throughput rate at the
expense of extra hardware consisting of 2n multiplexers and n latches. By optimizing the
multiplier for area-time
2
efficiency, the problem of lost cycles is circumvented. The lost cycles are
the cycles needed for carry propagation once the generation of the partial-products is finished.
The best structure in the literature that can multiplex two multiplication operations into the same
multiplier was described in [7]. The algorithm presented in [7] is an improvement made on the
multiplication scheme of Table 2. Starting from this multiplication scheme, it reassigns and
reschedules the partial products generated by the (n/4+1)th cell and above starting at (n+n/4)th
cycle to the cells from 1 to n/4 at the (n+n/2)th cycle, respectively. This has the effect of freeing
n/4 most significant cells at the (n+n/4)th cycle and thereafter. The multiplication scheme adopted
by [7] to achieve this operation is shown in Table 5.
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 7
Cycle P Cell 1 Cell 2
1 P0
2 P1
3 P2
4 P3
5 P4
6 P5
7 P6
8 P7
a3b3
a2b3+a3b2
a2b2
a13+a3b1a0b3+a3b0
a1b2+a2b1
a1b1
a0b0
a0b1+a1b0
a0b2+a2b0
TABLE 5: Multiplication Scheme of [7].
Although the work presented in [7] can be applied to the multiplier of Structure II, it is easier to
optimize the multiplier of Structure I for area-time
2
efficiency. One can clearly observe that the
result from the first cell is not accumulated with the bit products of the other cells until the (n+1)
th
cycle. Therefore, instead of feeding the result of the first cell to its neighbouring cell, it is delayed
by n cycles using n latches before being accumulated with the rest of the partial products, as can
be seen in Figure 7. In this way, the first cell is used for n cycles to generate the bit-products of
the first multiplication operation, then is reinitialised during one cycle before being used to
generate the bit-products of the second multiplication operation for a duration of another n cycles
and so on. The remaining cells operate almost in the same fashion. The key point is that they
generate and accumulate the bit products of the first pair of operands only when the first cell has
finished producing its bit-products. This operation lasts for n cycles before the propagation of the
partial results is switched to the multiplexers, which allows the cells to generate and accumulate
the bit-product of the second pair of data. Consequently, two multiplication operations can be
multiplexed into this multiplier every 2n cycles. The proposed architecture is depicted in Figure 6.
5. COMPARISON OF PERFORMANCE
A performance comparison of the new proposed architectures with similar structures available in
the literature [3,4] in terms of area usage and the speed of the multipliers is presented in Table 6.
In terms of speed, the multiplier of Structure I has a clock period equivalent to the delay of a FA, an
AND gate and a latch, and as such operates at faster speeds. Furthermore, the multiplier of
Structure II has a clock period of a multiplexer, an AND gate, a FA and a latch which makes it faster
than the multiplier described in [3]. In terms of area usage, the improvements introduced in [4] on the
multiplier of [3] have resulted in saving half the total number of cells. In terms of FPGA area usage
and in the case of n-bit operands, Structure I is mapped into 5n/2 slices of a virtex-4 FPGA and
Structure II uses 2n slices. The multiplier given in [3] is mapped into 5n slices while the multiplier
described in [4] requires 2n slices. These results clearly show the advantages of the new structures
in terms of both speed and area usage.
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 8
FAFA FA
ai
control
bi
1
0
1
0
1
0
1
0
1
0
1
0
FA
n Shift Register
FIGURE 7: The New Area-Time
2
Efficient Multiplier.
Multiplier in [3] Multiplier in [4] Structure I Structure II
Basic Cell
counter + 2 AND
gates+
multiplexer + 6
latches
Counter + 2 AND
gates + 6 latches
+ multiplexer
FA + AND gate +
4 latches
2 Fas + 2 AND
gates +
multiplexer + 6
latches
n-bit multiplier area
usage
n BCs
n/2 BCs + n +
latches
n BCs + n latches
n/2 BCs + n
latches
Longest path
AND gate +
counter +
multiplexer +
latch
AND gate +
counter +
multiplexer +
latch
AND gate + FA +
latch
AND gate + FA +
multiplexer +
latch
Latency 1 cycle 1 cycle n cycles 2 cycles
n-bit multiplier area
usage in FPGA
Virxtex-4
5n slices 2n slices 5n/2 slices 2n slices
TABLE 6: Performance Comparison.
Multiplier in [7] New multiplier
Basic Cell
counter + 2 AND gates+
multiplexer + 6 latches
FA + AND gate + 4 latches
n-bit multiplier area
usage
3n/4 BCs + 2n latches
≈63n gates (100%)
n BCs + 4n latches + 2n
multiplexers ≈66n gates
(104%)
Longest path
AND gate + counter +
multiplexer + latch
AND gate + FA + latch
TABLE 8: Performance Comparison of Area-Time
2
Efficient Structures.
Table 8 shows a comparison of performance in terms of hardware usage and speed between the
new area-time
2
efficient multiplier and the multiplier described in [7]. An estimation of the area
usage for both structures made on the number of gates is also shown. It is assumed that the area
of the 5 to 3 counter is equal to that of two FAs and a Half Adder, as shown in Figure 1a, which
has the same behavior as the 5 to 3 counter. The area usages of both structures are almost
similar, but the BCs and the longest path have not been changed. It is worth pointing out the
reason behind the choice of the multiplication scheme of Table 4. One may comment that a
"parallel to serial converter" added to a bit serial-parallel multiplier transforms it to a fully serial
multiplier with identical features to those of Structure I multiplier. Had this approach been
adopted, once the multiplier is optimised for area-time
2
efficiency, an extra n multiplexers would
Omar Nibouche
International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 9
have been added to the multiplier. These multiplexers are to be added in the path of the data
making its clock path equal to the delay of a multiplexer, a gated FA and a latch; making it slower
than the multiplier derived from Structure I.
6. CONCLUSIONS
In this paper, new structures for reduced pin count multiplication architectures have been presented.
These multipliers are systolic and scalable, thus suitable for VLSI implementation. They are both
modular and need near communication links only. Structure I is the first bit-bit serial multiplier with
speed performances similar to existing serial-parallel multipliers. In Structure II, the basic cell has
been modified to a more appropriate 5 to 3 counter, thus increasing the throughput rate of the
multiplier. Structure I has been optimised for Area-Time
2
efficiency, which has resulted in doubling
the throughput rate.
7. REFERENCES
[1] K.K. Parhi, "VLSI Digital Signal Processing Systems: Design and Implementation", A Wiley-
Interscience Publication, 1999.
[2] A. Aggoun, A. Farwan, M.K. Ibrahim and A.S. Ashur, “Radix-2
n
Serial-Serial Multipliers”, IEE
proc. Circuits, Devices and Systems, vol. 151, issue 6, pp. 503-509, Dec. 2004.
[3] P. Ienne, and M. Viredaz, “A bit-serial multipliers and squarers”, IEEE Trans. Computer, 1994,
43, (12), pp.1445-1450.
[4] A. Aggoun, A.S. Ashur, and M.K. Ibrahim, “Area-Time efficient serial-serial multipliers”, IEEE
International Symp. On Circuits and Systems (ISCAS), pp.V-585-588, GENEVA, May 2000.
[5] N. Strader and V. Rhyne, “A canonical bit-sequential multiplier”, IEEE Trans. Computer, 1982,
vol. 31, pp. 691-626.
[6] J. Scanlon and W. Fuchs, “'High-performance bit-serial multiplication”, in Proc. IEEE ICCD'86,
Rye Brook, NY, Oct. 1986.
[7] A.S. Ashur, “New Efficient Multiplication Structure and their Applications”. Ph.D. thesis, Dept.
of Electrical and Electronic Eng., the University of Nottingham, 1996.

More Related Content

What's hot

FPGA Implementation of SubByte & Inverse SubByte for AES Algorithm
FPGA Implementation of SubByte & Inverse SubByte for AES AlgorithmFPGA Implementation of SubByte & Inverse SubByte for AES Algorithm
FPGA Implementation of SubByte & Inverse SubByte for AES Algorithmijsrd.com
 
Graph based transistor network generation method for supergate design
Graph based transistor network generation method for supergate designGraph based transistor network generation method for supergate design
Graph based transistor network generation method for supergate designjpstudcorner
 
A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast ...
A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast ...A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast ...
A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast ...Kumar Goud
 
Low Power Implementation of Booth’s Multiplier using Reversible Gates
Low Power Implementation of Booth’s Multiplier using Reversible GatesLow Power Implementation of Booth’s Multiplier using Reversible Gates
Low Power Implementation of Booth’s Multiplier using Reversible GatesIJMTST Journal
 
Power Optimization using Reversible Gates for Booth’s Multiplier
Power Optimization using Reversible Gates for Booth’s MultiplierPower Optimization using Reversible Gates for Booth’s Multiplier
Power Optimization using Reversible Gates for Booth’s MultiplierIJMTST Journal
 
High performance nb-ldpc decoder with reduction of message exchange
High performance nb-ldpc decoder with reduction of message exchange High performance nb-ldpc decoder with reduction of message exchange
High performance nb-ldpc decoder with reduction of message exchange Ieee Xpert
 
A high performance fir filter architecture for fixed and reconfigurable appli...
A high performance fir filter architecture for fixed and reconfigurable appli...A high performance fir filter architecture for fixed and reconfigurable appli...
A high performance fir filter architecture for fixed and reconfigurable appli...Ieee Xpert
 
A comparative study of different multiplier designs
A comparative study of different multiplier designsA comparative study of different multiplier designs
A comparative study of different multiplier designsHoopeer Hoopeer
 
An Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ Adder
An Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ AdderAn Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ Adder
An Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ AdderIJERA Editor
 
Low cost reversible signed comparator
Low cost reversible signed comparatorLow cost reversible signed comparator
Low cost reversible signed comparatorVLSICS Design
 
Iaetsd mac using compressor based multiplier and carry save adder
Iaetsd mac using compressor based multiplier and carry save adderIaetsd mac using compressor based multiplier and carry save adder
Iaetsd mac using compressor based multiplier and carry save adderIaetsd Iaetsd
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...Hsien-Hsin Sean Lee, Ph.D.
 
FPGA Implementation of Mix and Inverse Mix Column for AES Algorithm
FPGA Implementation of Mix and Inverse Mix Column for AES AlgorithmFPGA Implementation of Mix and Inverse Mix Column for AES Algorithm
FPGA Implementation of Mix and Inverse Mix Column for AES Algorithmijsrd.com
 
IRJET- Radix 8 Booth Encoded Interleaved Modular Multiplication
IRJET- Radix 8 Booth Encoded Interleaved Modular MultiplicationIRJET- Radix 8 Booth Encoded Interleaved Modular Multiplication
IRJET- Radix 8 Booth Encoded Interleaved Modular MultiplicationIRJET Journal
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 

What's hot (20)

FPGA Implementation of SubByte & Inverse SubByte for AES Algorithm
FPGA Implementation of SubByte & Inverse SubByte for AES AlgorithmFPGA Implementation of SubByte & Inverse SubByte for AES Algorithm
FPGA Implementation of SubByte & Inverse SubByte for AES Algorithm
 
Graph based transistor network generation method for supergate design
Graph based transistor network generation method for supergate designGraph based transistor network generation method for supergate design
Graph based transistor network generation method for supergate design
 
A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast ...
A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast ...A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast ...
A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast ...
 
Low Power Implementation of Booth’s Multiplier using Reversible Gates
Low Power Implementation of Booth’s Multiplier using Reversible GatesLow Power Implementation of Booth’s Multiplier using Reversible Gates
Low Power Implementation of Booth’s Multiplier using Reversible Gates
 
Power Optimization using Reversible Gates for Booth’s Multiplier
Power Optimization using Reversible Gates for Booth’s MultiplierPower Optimization using Reversible Gates for Booth’s Multiplier
Power Optimization using Reversible Gates for Booth’s Multiplier
 
Q045079298
Q045079298Q045079298
Q045079298
 
High performance nb-ldpc decoder with reduction of message exchange
High performance nb-ldpc decoder with reduction of message exchange High performance nb-ldpc decoder with reduction of message exchange
High performance nb-ldpc decoder with reduction of message exchange
 
A high performance fir filter architecture for fixed and reconfigurable appli...
A high performance fir filter architecture for fixed and reconfigurable appli...A high performance fir filter architecture for fixed and reconfigurable appli...
A high performance fir filter architecture for fixed and reconfigurable appli...
 
A comparative study of different multiplier designs
A comparative study of different multiplier designsA comparative study of different multiplier designs
A comparative study of different multiplier designs
 
D0161926
D0161926D0161926
D0161926
 
An Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ Adder
An Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ AdderAn Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ Adder
An Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ Adder
 
Hz3115131516
Hz3115131516Hz3115131516
Hz3115131516
 
Low cost reversible signed comparator
Low cost reversible signed comparatorLow cost reversible signed comparator
Low cost reversible signed comparator
 
Iaetsd mac using compressor based multiplier and carry save adder
Iaetsd mac using compressor based multiplier and carry save adderIaetsd mac using compressor based multiplier and carry save adder
Iaetsd mac using compressor based multiplier and carry save adder
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
Lec7 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Kar...
 
Bu34437441
Bu34437441Bu34437441
Bu34437441
 
FPGA Implementation of Mix and Inverse Mix Column for AES Algorithm
FPGA Implementation of Mix and Inverse Mix Column for AES AlgorithmFPGA Implementation of Mix and Inverse Mix Column for AES Algorithm
FPGA Implementation of Mix and Inverse Mix Column for AES Algorithm
 
IRJET- Radix 8 Booth Encoded Interleaved Modular Multiplication
IRJET- Radix 8 Booth Encoded Interleaved Modular MultiplicationIRJET- Radix 8 Booth Encoded Interleaved Modular Multiplication
IRJET- Radix 8 Booth Encoded Interleaved Modular Multiplication
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 

Viewers also liked

Aprendamos tecnologia de almacenamiento y CLOUD COMPUTING
Aprendamos tecnologia de almacenamiento y CLOUD COMPUTINGAprendamos tecnologia de almacenamiento y CLOUD COMPUTING
Aprendamos tecnologia de almacenamiento y CLOUD COMPUTINGleoaranibar
 
Tafsir e-rahmatul lil aalameen
Tafsir e-rahmatul lil aalameenTafsir e-rahmatul lil aalameen
Tafsir e-rahmatul lil aalameenAale Rasool Ahmad
 
Producción agrícola
Producción agrícolaProducción agrícola
Producción agrícolajosenovoa2000
 
Confident Gold Coast-bangalore5.com
Confident Gold Coast-bangalore5.comConfident Gold Coast-bangalore5.com
Confident Gold Coast-bangalore5.comBangalore Property
 
Mandating Reduced Transit Fares for Low-Income Residents
Mandating Reduced Transit Fares for Low-Income ResidentsMandating Reduced Transit Fares for Low-Income Residents
Mandating Reduced Transit Fares for Low-Income Residentsandrejohnson034
 
Diversity state of_infor_beliefs_rubric fall 2011
Diversity state of_infor_beliefs_rubric fall 2011Diversity state of_infor_beliefs_rubric fall 2011
Diversity state of_infor_beliefs_rubric fall 2011carolbillingcwi
 
Przykład zadania Analiza Mnożnikowa
Przykład zadania Analiza MnożnikowaPrzykład zadania Analiza Mnożnikowa
Przykład zadania Analiza MnożnikowaTomasz Jeziorski
 
Apostila curso de blaster
Apostila   curso de blasterApostila   curso de blaster
Apostila curso de blasterAmilton Castro
 
Golder - Site Remediation & LSRP Expertise
Golder - Site Remediation & LSRP ExpertiseGolder - Site Remediation & LSRP Expertise
Golder - Site Remediation & LSRP ExpertiseTom Glancey
 
Disney Sustainability
Disney SustainabilityDisney Sustainability
Disney SustainabilityDaniel Chang
 

Viewers also liked (13)

Aprendamos tecnologia de almacenamiento y CLOUD COMPUTING
Aprendamos tecnologia de almacenamiento y CLOUD COMPUTINGAprendamos tecnologia de almacenamiento y CLOUD COMPUTING
Aprendamos tecnologia de almacenamiento y CLOUD COMPUTING
 
Tafsir e-rahmatul lil aalameen
Tafsir e-rahmatul lil aalameenTafsir e-rahmatul lil aalameen
Tafsir e-rahmatul lil aalameen
 
Producción agrícola
Producción agrícolaProducción agrícola
Producción agrícola
 
Confident Gold Coast-bangalore5.com
Confident Gold Coast-bangalore5.comConfident Gold Coast-bangalore5.com
Confident Gold Coast-bangalore5.com
 
Mandating Reduced Transit Fares for Low-Income Residents
Mandating Reduced Transit Fares for Low-Income ResidentsMandating Reduced Transit Fares for Low-Income Residents
Mandating Reduced Transit Fares for Low-Income Residents
 
Diversity state of_infor_beliefs_rubric fall 2011
Diversity state of_infor_beliefs_rubric fall 2011Diversity state of_infor_beliefs_rubric fall 2011
Diversity state of_infor_beliefs_rubric fall 2011
 
The quopn-advantage
The quopn-advantageThe quopn-advantage
The quopn-advantage
 
Thirumeni.D-HR
Thirumeni.D-HRThirumeni.D-HR
Thirumeni.D-HR
 
Przykład zadania Analiza Mnożnikowa
Przykład zadania Analiza MnożnikowaPrzykład zadania Analiza Mnożnikowa
Przykład zadania Analiza Mnożnikowa
 
Model liniowy Holta
Model liniowy HoltaModel liniowy Holta
Model liniowy Holta
 
Apostila curso de blaster
Apostila   curso de blasterApostila   curso de blaster
Apostila curso de blaster
 
Golder - Site Remediation & LSRP Expertise
Golder - Site Remediation & LSRP ExpertiseGolder - Site Remediation & LSRP Expertise
Golder - Site Remediation & LSRP Expertise
 
Disney Sustainability
Disney SustainabilityDisney Sustainability
Disney Sustainability
 

Similar to Area Efficient and Reduced Pin Count Multipliers

Implementation of Low Power and Area Efficient Carry Select Adder
Implementation of Low Power and Area Efficient Carry Select AdderImplementation of Low Power and Area Efficient Carry Select Adder
Implementation of Low Power and Area Efficient Carry Select Adderinventionjournals
 
Compressor based approximate multiplier architectures for media processing ap...
Compressor based approximate multiplier architectures for media processing ap...Compressor based approximate multiplier architectures for media processing ap...
Compressor based approximate multiplier architectures for media processing ap...IJECEIAES
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Area Delay Power Efficient and Implementation of Modified Square-Root Carry S...
Area Delay Power Efficient and Implementation of Modified Square-Root Carry S...Area Delay Power Efficient and Implementation of Modified Square-Root Carry S...
Area Delay Power Efficient and Implementation of Modified Square-Root Carry S...IJTET Journal
 
Designing a Novel High Performance Four-to-Two Compressor Cell Based on CNTFE...
Designing a Novel High Performance Four-to-Two Compressor Cell Based on CNTFE...Designing a Novel High Performance Four-to-Two Compressor Cell Based on CNTFE...
Designing a Novel High Performance Four-to-Two Compressor Cell Based on CNTFE...IJECEIAES
 
Design and Verification of Area Efficient Carry Select Adder
Design and Verification of Area Efficient Carry Select AdderDesign and Verification of Area Efficient Carry Select Adder
Design and Verification of Area Efficient Carry Select Adderijsrd.com
 
A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier us...
A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier us...A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier us...
A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier us...IJERA Editor
 
Design and Implementation of Low-Power and Area-Efficient 64 bit CSLA using VHDL
Design and Implementation of Low-Power and Area-Efficient 64 bit CSLA using VHDLDesign and Implementation of Low-Power and Area-Efficient 64 bit CSLA using VHDL
Design and Implementation of Low-Power and Area-Efficient 64 bit CSLA using VHDLIJSRD
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD Editor
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) inventionjournals
 
Implementation of FinFET technology based low power 4×4 Wallace tree multipli...
Implementation of FinFET technology based low power 4×4 Wallace tree multipli...Implementation of FinFET technology based low power 4×4 Wallace tree multipli...
Implementation of FinFET technology based low power 4×4 Wallace tree multipli...TELKOMNIKA JOURNAL
 
Comparative Study of Low Power Low Area Bypass Multipliers for Signal Process...
Comparative Study of Low Power Low Area Bypass Multipliers for Signal Process...Comparative Study of Low Power Low Area Bypass Multipliers for Signal Process...
Comparative Study of Low Power Low Area Bypass Multipliers for Signal Process...IJERA Editor
 
Design and Implementation of Optimized 32-Bit Reversible Arithmetic Logic Unit
Design and Implementation of Optimized 32-Bit Reversible Arithmetic Logic UnitDesign and Implementation of Optimized 32-Bit Reversible Arithmetic Logic Unit
Design and Implementation of Optimized 32-Bit Reversible Arithmetic Logic Unitrahulmonikasharma
 
Feasible methodology for
Feasible methodology forFeasible methodology for
Feasible methodology forVLSICS Design
 
FPGA Implementation of High Speed Architecture of CSLA using D-Latches
FPGA Implementation of High Speed Architecture of CSLA using D-LatchesFPGA Implementation of High Speed Architecture of CSLA using D-Latches
FPGA Implementation of High Speed Architecture of CSLA using D-LatchesEditor IJMTER
 

Similar to Area Efficient and Reduced Pin Count Multipliers (20)

Implementation of Low Power and Area Efficient Carry Select Adder
Implementation of Low Power and Area Efficient Carry Select AdderImplementation of Low Power and Area Efficient Carry Select Adder
Implementation of Low Power and Area Efficient Carry Select Adder
 
Compressor based approximate multiplier architectures for media processing ap...
Compressor based approximate multiplier architectures for media processing ap...Compressor based approximate multiplier architectures for media processing ap...
Compressor based approximate multiplier architectures for media processing ap...
 
F010113644
F010113644F010113644
F010113644
 
Modified booth
Modified boothModified booth
Modified booth
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Area Delay Power Efficient and Implementation of Modified Square-Root Carry S...
Area Delay Power Efficient and Implementation of Modified Square-Root Carry S...Area Delay Power Efficient and Implementation of Modified Square-Root Carry S...
Area Delay Power Efficient and Implementation of Modified Square-Root Carry S...
 
Designing a Novel High Performance Four-to-Two Compressor Cell Based on CNTFE...
Designing a Novel High Performance Four-to-Two Compressor Cell Based on CNTFE...Designing a Novel High Performance Four-to-Two Compressor Cell Based on CNTFE...
Designing a Novel High Performance Four-to-Two Compressor Cell Based on CNTFE...
 
Design and Verification of Area Efficient Carry Select Adder
Design and Verification of Area Efficient Carry Select AdderDesign and Verification of Area Efficient Carry Select Adder
Design and Verification of Area Efficient Carry Select Adder
 
Survey on Prefix adders
Survey on Prefix addersSurvey on Prefix adders
Survey on Prefix adders
 
A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier us...
A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier us...A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier us...
A Novel Efficient VLSI Architecture for IEEE 754 Floating point multiplier us...
 
Design and Implementation of Low-Power and Area-Efficient 64 bit CSLA using VHDL
Design and Implementation of Low-Power and Area-Efficient 64 bit CSLA using VHDLDesign and Implementation of Low-Power and Area-Efficient 64 bit CSLA using VHDL
Design and Implementation of Low-Power and Area-Efficient 64 bit CSLA using VHDL
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
Implementation of FinFET technology based low power 4×4 Wallace tree multipli...
Implementation of FinFET technology based low power 4×4 Wallace tree multipli...Implementation of FinFET technology based low power 4×4 Wallace tree multipli...
Implementation of FinFET technology based low power 4×4 Wallace tree multipli...
 
Comparative Study of Low Power Low Area Bypass Multipliers for Signal Process...
Comparative Study of Low Power Low Area Bypass Multipliers for Signal Process...Comparative Study of Low Power Low Area Bypass Multipliers for Signal Process...
Comparative Study of Low Power Low Area Bypass Multipliers for Signal Process...
 
Iaetsd 128-bit area
Iaetsd 128-bit areaIaetsd 128-bit area
Iaetsd 128-bit area
 
Design and Implementation of Optimized 32-Bit Reversible Arithmetic Logic Unit
Design and Implementation of Optimized 32-Bit Reversible Arithmetic Logic UnitDesign and Implementation of Optimized 32-Bit Reversible Arithmetic Logic Unit
Design and Implementation of Optimized 32-Bit Reversible Arithmetic Logic Unit
 
SFQ MULTIPLIER
SFQ MULTIPLIERSFQ MULTIPLIER
SFQ MULTIPLIER
 
Feasible methodology for
Feasible methodology forFeasible methodology for
Feasible methodology for
 
FPGA Implementation of High Speed Architecture of CSLA using D-Latches
FPGA Implementation of High Speed Architecture of CSLA using D-LatchesFPGA Implementation of High Speed Architecture of CSLA using D-Latches
FPGA Implementation of High Speed Architecture of CSLA using D-Latches
 

Recently uploaded

Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 

Recently uploaded (20)

Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 

Area Efficient and Reduced Pin Count Multipliers

  • 1. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 1 Area Efficient and Reduced Pin Count Multipliers Omar Nibouche o.nibouche@tu.edu.sa College of Computers and Information Technology Taif University POB 888 Taif 21964, KSA Abstract Fully serial multipliers can play an important role in the implementation of DSP algorithms in resource-limited chips such as FPGAs; offering area efficient architectures with a reduced pin count and moderate throughput rates. In this paper two structures that implement the fully serial multiplication operation are presented. One significant aspect of the new designs is that they are systolic and require near communication links only. They are superior in speed and area usage to similar architectures in the literature. The paper also present a new fully serial multiplier optimized for area-time 2 efficiency with better performance than available architectures in the open literature. Keywords: Educed Pin Count, Serial Multiplication, Area-Time 2 . 1. INTRODUCTION There is a crucial advantage offered by bit-serial processors over their parallel counterpart, which lies in the very efficient use of chip area. They are particularly suitable for applications that require slow to moderate speeds and in batch mode applications. By contrast, bit-parallel processors are useful for fast speed systems, but at the expense of larger a area usage and thus they are more expensive [1-2]. Traditional bit serial multiplier structures suffer from an inefficient generation of partial products, which leads to hardware overuse and slow speed systems. In this paper, two structures for bit serial multiplication are presented. The first structure, called structure I, is the first fully serial multiplier reported in the literature with comparable performance - in terms of speed- to existing serial-parallel multipliers. The second structure, termed structure II, requires an extra multiplexer in the clock path; thus making it slower, but has the merit of reducing the latency of the multiplier. The remainder of the paper is organised as follows: in section 2, the previous work in the literature is reviewed, while section 3 describes the new structures for fully serial multiplication. Section 4 is concerned with an optimisation of the multiplier of Structure I in terms of area-time 2 efficiency. A comparison of performance is shown in section 5 and conclusions are given in section 6. 2. BIT SERIAL MULTIPLIERS: A REVIEW One of the early bit serial multipliers was proposed by [3]. It generates the partial products in a recursive fashion. Consider the multiplication of two n-bit positive numbers A and B as follows: ∑ −= = = 1 0 2 ni i i iaA (1) and ∑ −= = = 1 0 2 ni i i ibB (2)
  • 2. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 2 Let Pi represents the partial product computed after the i th bit is fed [3]. Pi is given by: i 2 111 22 ba)AbB(aPP i i i-ii-i i i-i +++= (3) where Ai and Bi represent the value of the operands A and B, respectively, and by considering only bits from the Least Significant Bit (LSB) to the i th bit, that is, Ai = A mod 2 i+1 (4) and Bi = B mod 2 i+1 (5) with the initial values P-1 = A-1 = B-1 = 0. The generation of the partial product, using equation (3), and their assignment to the multiplier cells is shown in Table 1 below: Cycle P Cell 1 Cell 2 1 2 3 4 5 6 7 8 a3 b3 Cell 3 Cell 4 a2 b3 +a3 b2 P0 a2b2 a1 b1 a0 b0 a1b2+a2b1 a0 b1 +a1 b0 a0 b2 +a2 b0 a1 b3 +a3 b1 a0 b3 +a3 b3 P1 P2 P3 P4 P5 P6 P7 TABLE 1: Multiplication Scheme of [3]. From the above table, each cell generates two new bit-products every cycle. To cope with this constraint, The Basic Cell (BC) of the multiplier proposed by [3] is built around a 5 to 3 counter. The counter is capable of accumulating five inputs of the same weight to a sum-bit (Sout) of the same weight as the inputs, i.e.2 0 , 1 st carry-bit (C 1 out) of a weight of 2 1 and 2 nd carry-bit (C 2 out) of a weight of 2 2 . In particular, the sum-bit is calculated through a tree of EXOR gate to reduce the propagation delay within the cell. The BC of the multiplier is shown in Figure 1. The multiplier uses n identical cells to perform the multiplication of two n-bit numbers in 2n cycles, as it can be seen in Figure 2.
  • 3. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 3 SoutC1 out C2 out Cin Sin ai bi control 5to3 counter Sout Cout Latch FA FA HA 22 21 20 20 2020 20 20 (b) (a) FIGURE 1 (a): 5 to 3 Counter Made of Two FAs and One HA (b). The BC of [3]. BC n BC 2 BC 1 ai Si 0 control bi FIGURE 2: The n-bit Multiplier by [3]. In [4], modifications were carried out on the multiplication scheme of Table 1 to make a more efficient use of the hardware. In fact, the multiplier proposed by [4] uses only half the number of cells required by [3]. To achieve this, the partial products generated by the last n/2 cells in [3] were reallocated to the first n/2 cells and rescheduled at cycle n+1. In this way, a full utilisation of cells 1 to n/2 can be achieved, as it can be shown in Table 2. Cycle P Cell 1 Cell 2 1 2 3 4 5 6 7 8 a3 b3 a2b3+a3b2 P0 a2 b2 a1 b1 a0b0 a1b2+a2b1 a0b1+a1b0 a0b2+a2b0 a1 b3 +a3 b1 a0 b3 +a3 b3 P1 P2 P3 P4 P5 P6 P7 TABLE 2: Multiplication Scheme by [4].
  • 4. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 4 The multiplier was modified using little extra hardware: two n/2 shift registers are used to store the n/2 most significant bits of the data words A and B and n/2 multiplexers. The BC of the modified multiplier d is shown in Figure 3. The multiplier structure as proposed by [4] is shown in Figure 4. This architecture has reduced the number of 5 to 3 counters by 50%, as well as reducing the number of latches by 33%. The clock path equals the delay of a multiplexer, a gated counter and a latch. 1 0 5to3 counter Sout C1 out C2 out Cin Sin ai bi control Cout Sout Latch FIGURE 3: The BC of [4]. BC N/2 BC 2 BC 1 1 0 ai Si 0 control 1 0 n/2 Shift Register n/2 Shift Register bi FIGURE 4: The n-bit Multiplier by [4]. 3. THE NEW FULLY SERIAL MULTIPLIERS Although about 50% of the area used in [3] has been saved by [4], the throughput rate has not been increased. On the contrary, it has decreased as a multiplexer was added to the structure making the clock period of the multiplier in [4] equivalent to the delay of a multiplexer, an AND gate, a 5 to 3 counter and a latch. To remedy this problem, two new structures are proposed. 3.1. Structure I In order to reduce the clock path as described above, the multiplication algorithm has been modified. It generates the bit-products associated with cells 2 to n at the (n+1) th cycle. The scheduling of the tasks of the first cell is kept unchanged, but the latency of the multiplier is increased to n cycles. The multiplication scheme is shown in Table 3 for 4-bit operands. The multiplication operation can be divided into two parts, which can easily be done by rewriting the product of the two numbers, A and B, in the following way: ji ni i nj j jibaBaB*A + −= = −= = ∑ ∑+= 2 1 1 1 0 0 (6)
  • 5. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 5 To keep working under the constraints of sending one bit of each operand at a time, the term a0B in equation (6) is generated by the first cell of the multiplier during the first n cycles. At this stage, the other cells only propagate the partial products already generated by the first cell. At the (n+1) th cycle, all the operand bits have been fed and the term ji ni i nj j ji ba + −= = −= = ∑ ∑ 2 1 1 1 0 can be generated during the last n cycles. The clock period is equivalent to the delay of a FA, an AND gate and a latch, as shown in Figure 5. Therefore, Structure I achieves similar speed performances when compared with serial-parallel multipliers. 3.2. Structure II The 5 to 3 counters have been widely used in the literature [3-6]. Basically, such a counter reduces 5 bits of the same weight to three bits: a result-bit of the same weight as the inputs (a weight of 2 0 ), a carry-bit, which has a weight of 2 1 , and a far carry-bit that has a weight of 2 2 . It is clear that while the sum of the inputs is up to 5, the sum of the outputs is up to 6, and as such, two representations of the outputs are excluded. Therefore, it is clearly more appropriate to reduce the 5 inputs to 3 outputs: a result-bit of the same weight as the inputs, and 2 carry-bits of twice the weight of the result-bit. For this purpose, a new cell has been developed by using two FAs as shown in Figure 6. The first FA is used to accumulate two bit products with a carry feedback. The second FA is used to generate a result-bit from the result of the first FA, the result-bit from the adjacent cell and a carry feedback. The two carry-bits generated by the new cell are fed back and accumulated with the bit-products of the next cycle. The sum-bit of the first FA is registered, and thus making the clock period equivalent to the delay of a multiplexer, an AND gate, a FA and a latch. The multiplier structure implements directly the algorithm shown in Table 2. The multiplier requires only n/2 cells for the multiplication of two n-bit numbers. It is also modular and needs near communication links only. TABLE 3: Structure I Multiplication Scheme For 4-bit Operands. Cycle P Cell 1 Cell 2 Cell 3 Cell 4 1 a0b0 2 a0b1 3 a0b2 4 P0 a0b3 5 P1 a3b0 a2b0 a1b0 6 P2 a3b1 a2b1 a1b1 7 P3 a3b2 a2b2 a1b2 8 P4 a3b3 a2b3 a1b3 9 P5 10 P6 11 P7
  • 6. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 6 result control 0 FA FA FA FA 1 0n Shift Register ai bi control Latch 00010001 FIGURE 5: Structure I bit-bit Serial Multiplier. ai bi control Sin Sout 1 0 FA FA FIGURE 6: The Basic Cell of Structure II Fully Serial Multiplier 4. AREA-TIME2 EFFICIENT BIT-BIT SERIAL MULTIPLIER In this section, a new multiplier structure, which is capable of multiplexing two multiplication operations into Structure I is proposed. This has the merit of doubling the throughput rate at the expense of extra hardware consisting of 2n multiplexers and n latches. By optimizing the multiplier for area-time 2 efficiency, the problem of lost cycles is circumvented. The lost cycles are the cycles needed for carry propagation once the generation of the partial-products is finished. The best structure in the literature that can multiplex two multiplication operations into the same multiplier was described in [7]. The algorithm presented in [7] is an improvement made on the multiplication scheme of Table 2. Starting from this multiplication scheme, it reassigns and reschedules the partial products generated by the (n/4+1)th cell and above starting at (n+n/4)th cycle to the cells from 1 to n/4 at the (n+n/2)th cycle, respectively. This has the effect of freeing n/4 most significant cells at the (n+n/4)th cycle and thereafter. The multiplication scheme adopted by [7] to achieve this operation is shown in Table 5.
  • 7. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 7 Cycle P Cell 1 Cell 2 1 P0 2 P1 3 P2 4 P3 5 P4 6 P5 7 P6 8 P7 a3b3 a2b3+a3b2 a2b2 a13+a3b1a0b3+a3b0 a1b2+a2b1 a1b1 a0b0 a0b1+a1b0 a0b2+a2b0 TABLE 5: Multiplication Scheme of [7]. Although the work presented in [7] can be applied to the multiplier of Structure II, it is easier to optimize the multiplier of Structure I for area-time 2 efficiency. One can clearly observe that the result from the first cell is not accumulated with the bit products of the other cells until the (n+1) th cycle. Therefore, instead of feeding the result of the first cell to its neighbouring cell, it is delayed by n cycles using n latches before being accumulated with the rest of the partial products, as can be seen in Figure 7. In this way, the first cell is used for n cycles to generate the bit-products of the first multiplication operation, then is reinitialised during one cycle before being used to generate the bit-products of the second multiplication operation for a duration of another n cycles and so on. The remaining cells operate almost in the same fashion. The key point is that they generate and accumulate the bit products of the first pair of operands only when the first cell has finished producing its bit-products. This operation lasts for n cycles before the propagation of the partial results is switched to the multiplexers, which allows the cells to generate and accumulate the bit-product of the second pair of data. Consequently, two multiplication operations can be multiplexed into this multiplier every 2n cycles. The proposed architecture is depicted in Figure 6. 5. COMPARISON OF PERFORMANCE A performance comparison of the new proposed architectures with similar structures available in the literature [3,4] in terms of area usage and the speed of the multipliers is presented in Table 6. In terms of speed, the multiplier of Structure I has a clock period equivalent to the delay of a FA, an AND gate and a latch, and as such operates at faster speeds. Furthermore, the multiplier of Structure II has a clock period of a multiplexer, an AND gate, a FA and a latch which makes it faster than the multiplier described in [3]. In terms of area usage, the improvements introduced in [4] on the multiplier of [3] have resulted in saving half the total number of cells. In terms of FPGA area usage and in the case of n-bit operands, Structure I is mapped into 5n/2 slices of a virtex-4 FPGA and Structure II uses 2n slices. The multiplier given in [3] is mapped into 5n slices while the multiplier described in [4] requires 2n slices. These results clearly show the advantages of the new structures in terms of both speed and area usage.
  • 8. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 8 FAFA FA ai control bi 1 0 1 0 1 0 1 0 1 0 1 0 FA n Shift Register FIGURE 7: The New Area-Time 2 Efficient Multiplier. Multiplier in [3] Multiplier in [4] Structure I Structure II Basic Cell counter + 2 AND gates+ multiplexer + 6 latches Counter + 2 AND gates + 6 latches + multiplexer FA + AND gate + 4 latches 2 Fas + 2 AND gates + multiplexer + 6 latches n-bit multiplier area usage n BCs n/2 BCs + n + latches n BCs + n latches n/2 BCs + n latches Longest path AND gate + counter + multiplexer + latch AND gate + counter + multiplexer + latch AND gate + FA + latch AND gate + FA + multiplexer + latch Latency 1 cycle 1 cycle n cycles 2 cycles n-bit multiplier area usage in FPGA Virxtex-4 5n slices 2n slices 5n/2 slices 2n slices TABLE 6: Performance Comparison. Multiplier in [7] New multiplier Basic Cell counter + 2 AND gates+ multiplexer + 6 latches FA + AND gate + 4 latches n-bit multiplier area usage 3n/4 BCs + 2n latches ≈63n gates (100%) n BCs + 4n latches + 2n multiplexers ≈66n gates (104%) Longest path AND gate + counter + multiplexer + latch AND gate + FA + latch TABLE 8: Performance Comparison of Area-Time 2 Efficient Structures. Table 8 shows a comparison of performance in terms of hardware usage and speed between the new area-time 2 efficient multiplier and the multiplier described in [7]. An estimation of the area usage for both structures made on the number of gates is also shown. It is assumed that the area of the 5 to 3 counter is equal to that of two FAs and a Half Adder, as shown in Figure 1a, which has the same behavior as the 5 to 3 counter. The area usages of both structures are almost similar, but the BCs and the longest path have not been changed. It is worth pointing out the reason behind the choice of the multiplication scheme of Table 4. One may comment that a "parallel to serial converter" added to a bit serial-parallel multiplier transforms it to a fully serial multiplier with identical features to those of Structure I multiplier. Had this approach been adopted, once the multiplier is optimised for area-time 2 efficiency, an extra n multiplexers would
  • 9. Omar Nibouche International Journal of Engineering (IJE), Volume (7) : Issue (1) : 2013 9 have been added to the multiplier. These multiplexers are to be added in the path of the data making its clock path equal to the delay of a multiplexer, a gated FA and a latch; making it slower than the multiplier derived from Structure I. 6. CONCLUSIONS In this paper, new structures for reduced pin count multiplication architectures have been presented. These multipliers are systolic and scalable, thus suitable for VLSI implementation. They are both modular and need near communication links only. Structure I is the first bit-bit serial multiplier with speed performances similar to existing serial-parallel multipliers. In Structure II, the basic cell has been modified to a more appropriate 5 to 3 counter, thus increasing the throughput rate of the multiplier. Structure I has been optimised for Area-Time 2 efficiency, which has resulted in doubling the throughput rate. 7. REFERENCES [1] K.K. Parhi, "VLSI Digital Signal Processing Systems: Design and Implementation", A Wiley- Interscience Publication, 1999. [2] A. Aggoun, A. Farwan, M.K. Ibrahim and A.S. Ashur, “Radix-2 n Serial-Serial Multipliers”, IEE proc. Circuits, Devices and Systems, vol. 151, issue 6, pp. 503-509, Dec. 2004. [3] P. Ienne, and M. Viredaz, “A bit-serial multipliers and squarers”, IEEE Trans. Computer, 1994, 43, (12), pp.1445-1450. [4] A. Aggoun, A.S. Ashur, and M.K. Ibrahim, “Area-Time efficient serial-serial multipliers”, IEEE International Symp. On Circuits and Systems (ISCAS), pp.V-585-588, GENEVA, May 2000. [5] N. Strader and V. Rhyne, “A canonical bit-sequential multiplier”, IEEE Trans. Computer, 1982, vol. 31, pp. 691-626. [6] J. Scanlon and W. Fuchs, “'High-performance bit-serial multiplication”, in Proc. IEEE ICCD'86, Rye Brook, NY, Oct. 1986. [7] A.S. Ashur, “New Efficient Multiplication Structure and their Applications”. Ph.D. thesis, Dept. of Electrical and Electronic Eng., the University of Nottingham, 1996.