IJSRD - International Journal for Scientific Research & Development| Vol. 1, Issue 5, 2013 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com 1188
Abstract— Truncated multiplication reduces part of the
power required by multipliers by computing only the
most-significant bits of the product. The most common approach
to truncation includes physical reduction of the partial
product matrix and a compensation for the reduced bits via
different hardware compensation subcircuits. However, this
results in fixed systems optimized for a given application at
design time. A novel approach to truncation is proposed,
where a full precision multiplier is implemented, but the
active section of the partial product matrix is selected
dynamically at run-time. This allows a trade-off between
power reduction and signal degradation that can be modified
at run time. Such an architecture brings together the power
reduction benefits from truncated multipliers and the
flexibility of reconfigurable and general purpose devices.
Efficient implementation of such a multiplier is presented in
a custom digital signal processor where the concept of
software compensation is introduced and analysed for
different applications. Experimental results and power
measurements are studied, including power measurements
from both post-synthesis simulations and a fabricated IC
implementation. This is the first system-level DSP core
using a fine-grain truncated multiplier. Results demonstrate
the effectiveness of the programmable truncated MAC
(PTMAC) in achieving power reduction, with minimum
impact on functionality for a number of applications.
Software compensation is also shown to be effective when
deploying truncated multipliers in a system.
I. INTRODUCTION
The rapid growth of portable communication and
computing devices and the advance of mobile multimedia
systems have made power consumption a critical optimization
target in the design of digital signal processing architectures.
Advances in very large scale integrated-circuit (VLSI)
technology have been one of the major driving forces for the
growth of digital signal processors (DSP), enabling the
implementation of complex algorithms in programmable
DSP architectures and fixed application-specific hardware.
Within DSP systems, multipliers are among the most
fundamental building blocks, and parallel implementations,
required for high-speed computation, represent the main
component in terms of power consumption of any hardware
dedicated to complex mathematical functions, such as
filtering, compression, or classification. The relationship
between the physical characteristics of the multiplier and its
resolution determines the efficiency and accuracy of the
system. The optimization of multipliers in terms of area,
power and timing has been extensively studied in the past.
A full or direct implementation of an N×N-bit
multiplication yields a 2N-bit product. In order to keep the
full accuracy of the system, a DSP architecture would need an
ever-growing bit width that would be impossible or
impractical to implement. To avoid this, results are usually
truncated or rounded to keep them within the limits
of the architecture bit width. In order to reduce the parallel
multiplier requirements in systems where an exact result is
not required, several techniques that reduce power
consumption at the expense of lower precision have been
presented in the literature. Such techniques skip the
implementation and/or disable parts of the partial product
matrix to trade energy spent by the computational process
for degradation of the output signal. These techniques fall
into two categories, namely word-length reduction and
truncated multipliers. Word-length reduction techniques
minimize switching activity at the expense of data precision
by input shifting or truncation, but since the reduction is
applied to the input operands, power reductions are obtained
at the cost of high levels of output noise. Truncated
multiplication structures based on the Baugh-Wooley [2] or
Booth [3] algorithms do not (by design) compute the lowest
sections of the partial product matrix. By doing so, a certain
error is introduced in the output while savings in power,
area, complexity, and timing are achieved. The majority of
previous publications focus on fixed-width
multipliers [4] that produce a fixed N-bit output, and their
gain-vs.-error ratio is set by hardware. Configurable
techniques result in flexible architectures which, while
losing the benefits of area optimization and
static power reduction, provide the multiplier with the
advantage of adaptability. This is desirable in systems with a
certain degree of programmability. The architecture
presented in this paper, a programmable truncated multiplier
(PTM), is a full-precision multiplier in which the
elements of the partial product matrix can be disabled
column-wise through an external control word.
Disabled columns cease to contribute to the dynamic power
consumption, achieving power reductions in the multiplier.
Benefits of PTMs include:
1. Real-time control of the power-SNR exchange in
applications where power modes are switchable at
run-time.
2. Flexibility in accuracy selection for programmable
devices such as DSP structures that would benefit
from different truncation levels for different
applications, and at different operating points of an
individual algorithm.
The paper is organized as follows. Section II presents a brief
background on multipliers and two's complement
multiplication.
The proposed work is presented in Section III. Section IV
gives the simulation results of the implemented multiplier.
Design and Implementation of a Programmable Truncated
Multiplier
Sethu Merin George
M.Tech, VLSI & ES, MLMCE, Kottayam
Design and Implementation of a Programmable Truncated Multiplier
(IJSRD/Vol. 1/Issue 5/2013/038)
Finally, the conclusion section summarizes the
operations performed and the results obtained.
II. BACKGROUND
A. Two’s Complement Multiplication
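An N-bit fractional multiplicand X and an N-bit fractional
multiplier Y will be considered as the inputs of the multiplier.
Let
$$X = -x_{N-1} + \sum_{i=0}^{N-2} x_i 2^{i-N+1}$$
and
$$Y = -y_{N-1} + \sum_{j=0}^{N-2} y_j 2^{j-N+1},$$
where $x_i, y_j \in \{0, 1\}$. Also, for the sake of
generality, their 2N-bit product $P_{Standard}$ will be maintained in
full precision as:
$$P_{Standard} = XY = x_{N-1}y_{N-1} + \sum_{i=0}^{N-2}\sum_{j=0}^{N-2} x_i y_j 2^{i+j-2N+2} - \sum_{i=0}^{N-2} x_i y_{N-1} 2^{i-N+1} - \sum_{j=0}^{N-2} x_{N-1} y_j 2^{j-N+1}.$$
Applying the Baugh-Wooley algorithm [2] to the partial
product matrix that generates $P_{Standard}$ results in a final
matrix of all-positive partial product bits such as
$$P_{Standard} = XY = x_{N-1}y_{N-1} + \sum_{i=0}^{N-2}\sum_{j=0}^{N-2} x_i y_j 2^{i+j-2N+2} + \sum_{i=0}^{N-2} \overline{x_i y_{N-1}}\, 2^{i-N+1} + \sum_{j=0}^{N-2} \overline{x_{N-1} y_j}\, 2^{j-N+1} + 2^{-N+2} - 2,$$
where the constant -2 is equivalent to a 1 added in the most
significant column, since the 2N-bit result is taken modulo 4.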
Once the partial product matrix of the parallel multiplier is
generated, partial product terms have to be combined into
the final product result. Tree structures are often preferred to
array implementations due to their shorter critical path.
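As a rough illustration of the Baugh-Wooley rearrangement described above, the following Python sketch builds the all-positive partial product matrix and checks it against the ordinary signed product. The function name `baugh_wooley_product`, the integer scaling of the operands (fractional values multiplied by 2^(N-1)) and the default N = 8 are assumptions made for this example, not part of the original design.

```python
def baugh_wooley_product(x: int, y: int, n: int = 8) -> int:
    """Multiply two n-bit two's complement integers using only
    non-negative partial product bits (Baugh-Wooley form).
    Returns the signed 2n-bit product."""
    mask = (1 << n) - 1
    xb = [((x & mask) >> i) & 1 for i in range(n)]  # bits of x, LSB first
    yb = [((y & mask) >> j) & 1 for j in range(n)]  # bits of y, LSB first

    total = 0
    for i in range(n):
        for j in range(n):
            bit = xb[i] & yb[j]
            # Terms that involve exactly one sign bit are complemented.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            total += bit << (i + j)

    # Constant correction bits: a 1 in column n and a 1 in column 2n-1.
    total += (1 << n) + (1 << (2 * n - 1))

    # Keep 2n bits and reinterpret them as a signed value.
    total &= (1 << (2 * n)) - 1
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

Dividing the returned integer by 2^(2N-2) gives the fractional product used in the equations of this section.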
B. Truncated Multipliers
Multiplication is a common requirement in DSP systems
and plays an important role in terms of power consumption.
In those systems where it is not necessary to compute the
exact least significant part of the product, truncated
multipliers achieve power, area and timing improvements by
skipping the implementation of part of the least significant
section of the partial product matrix [5]-[9]. Instead of
computing the full-precision output, the output results from
the sum of the most significant (N+h) columns (where 0 ≤ h
≤ N) plus an estimation of the unformed bits. Fig. 1 displays
a generic partial product matrix, where the partial product
matrix is split into two main regions: the least significant
part (LSP), which contains the N least significant columns
of the partial product matrix, and the most significant part
(MSP), which includes the most significant columns of partial
product terms. The LSP can also be split into two regions: LSP
major, formed by its h most significant columns, and LSP minor,
formed by its remaining N-h least significant columns.
The product resulting from a full-width multiplier, where the
partial product matrix is fully implemented with an exact 2N-bit
result, can be described as $P_{full} = S_{MSP} + S_{LSP}$, where $S_{MSP}$ is the
sum of the partial product bits belonging to the MSP and $S_{LSP}$ is
the sum of those belonging to the LSP. In many
applications, product values generated by fixed-width N×N-bit
multipliers are truncated or rounded back to the original
bit width in later stages of the algorithm flow. The N-bit
product after rounding is obtained as
$P_{fullround} = t_N(S_{MSP} + S_{LSP} + LSB/2)$, where LSB is the value of
one bit in the least significant column of the MSP, and $t_N(x)$
represents truncation of an operand x by discarding its
lowest bits. In this case, N bits are dropped to maintain the
original width of N bits. Truncation offers a way of
reducing the complexity of the multiplier unit by discarding
the lower parts of the partial product matrix. This results in
most of the error being generated in the lower-weighted bits
of the output, which are discarded when converting the output
back to the original bit width. By doing so, significant
savings in power and complexity can be achieved, at the
expense of signal degradation. The simplest scheme to
obtain a truncated multiplier consists of removing the lower
columns of the partial product matrix that form the LSP, and
is referred to in the literature as direct truncation
(D-Truncation). It results in the multiplier requirements being
almost halved in both area and power, at the price of large
errors with a strong negative bias introduced in the
multiplier output.
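The negative bias of direct truncation can be checked numerically. The sketch below is a simplified model that assumes unsigned N-bit operands (the paper itself works with two's complement) and uses hypothetical helper names; it sums only the partial products in columns N and above and averages the resulting error over all operand pairs.

```python
def d_truncated_product(x: int, y: int, n: int) -> int:
    """Sum only the partial products x_i * y_j in columns i + j >= n,
    i.e. drop the n least significant columns (the LSP)."""
    total = 0
    for i in range(n):
        for j in range(n):
            if i + j >= n and (x >> i) & 1 and (y >> j) & 1:
                total += 1 << (i + j)
    return total


def mean_truncation_error(n: int) -> float:
    """Average of (truncated - exact) over all unsigned operand pairs,
    expressed in units of the n-bit output LSB (weight 2**n)."""
    pairs = 1 << (2 * n)
    error_sum = sum(
        d_truncated_product(x, y, n) - x * y
        for x in range(1 << n)
        for y in range(1 << n)
    )
    return error_sum / pairs / (1 << n)
```

Because every dropped partial product bit is non-negative, the truncated sum can never exceed the exact product, so `mean_truncation_error` is negative for any n ≥ 1 in this model.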
In order to reduce both the magnitude and bias of the error
introduced by the truncation process, most of the references
available in the literature study fixed width multiplying
structures where the partial-product bits belonging to LSP
minor are not implemented. A compensation term is added
to compensate for the missing partial products, usually in
the form of a function of the input correction partial-
products f(IC), where the IC terms are the partial product
bits in the leftmost column of LSP minor. The constant
compensation method introduced in [5] was further
improved and explored for fixed-width multipliers in [4].
Variable correction structures calculate probabilistic
estimates of the sum of the elements of the LSP minor by
using partial-products in the leftmost column of LSP minor.
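A constant compensation term of the kind described above can be sketched as follows. This is a simplified model under stated assumptions, not the exact scheme of [5] or [4]: operands are unsigned, every partial product bit is assumed to be 1 with probability 1/4 (uniform, independent input bits), and the names `lsp_minor_expected_sum`, `compensated_product` and `mean_error` are hypothetical.

```python
def lsp_minor_expected_sum(n: int, h: int) -> float:
    """Expected value of the dropped LSP minor columns (columns 0 .. n-h-1).
    Column k holds k + 1 partial product bits of weight 2**k, and each
    bit is assumed to be 1 with probability 1/4."""
    return sum((k + 1) * (1 << k) for k in range(n - h)) / 4


def compensated_product(x: int, y: int, n: int, h: int) -> int:
    """Sum the MSP and LSP major columns (i + j >= n - h) and add a
    constant that estimates the unformed LSP minor bits."""
    total = 0
    for i in range(n):
        for j in range(n):
            if i + j >= n - h and (x >> i) & 1 and (y >> j) & 1:
                total += 1 << (i + j)
    constant = int(lsp_minor_expected_sum(n, h) + 0.5)  # round to nearest
    return total + constant


def mean_error(n: int, h: int) -> float:
    """Average of (compensated - exact) over all unsigned operand pairs,
    in units of the full-precision product LSB."""
    pairs = 1 << (2 * n)
    return sum(
        compensated_product(x, y, n, h) - x * y
        for x in range(1 << n)
        for y in range(1 << n)
    ) / pairs
```

In this model the residual mean error is only the rounding of the constant, at most half of one full-precision LSB, whereas the uncompensated sum is biased by the full expected value of the dropped columns.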
C. Reconfigurable Power Reduction
Fixed-width truncated multipliers minimize power
consumption and optimize silicon area and timing, at the
price of adding a truncation error, by not implementing the
LSP minor region of the partial product matrix. Being
purpose-specific and optimized in hardware, they do not
provide any flexibility to operate in different configuration
modes in systems where the power and performance goals
vary with time. Reconfigurable and programmable
multipliers are structures that allow a hardware multiplier
architecture to be configured to operate in different modes.
Such structures can be split into two main groups: i)
reconfigurable (or configurable) multipliers, which can
dynamically divide a large full-precision or fixed-width
multiplier into several smaller units working in parallel, and
ii) programmable multipliers, where parts of the partial
product matrix can be disabled at run-time by reducing or
eliminating switching activity. In [10], dynamic power
consumption reduction in a multiplier is achieved by using
arbitrary levels of bit precision on a full N×N-bit multiplier.
Here, reducing the input width reduces the
toggling in different parts of the multiplier,
effectively disabling them when left-shifting the input
operands.
III. PROPOSED WORKS
A. 8×8 Truncated Multiplier
Figure 1 shows the architecture of a full multiplier
that produces the 2N-bit product of two N-bit numbers.
Using this architecture, a truncated 8×8 multiplier is
implemented, which reduces the overall
area and power consumption, as is preferable for many
DSP applications. The circuit is coded in VHDL and
simulated using Xilinx ISE 13.3a to obtain the
corresponding results.
Fig. 1: Architecture of 8×8 standard multiplier
Fig. 2: Truncated 8×8 multiplier
B. The Programmable Truncated Multiplier
While many specific applications require the inputs and the
output of the multiplier to have the same bit width, general
purpose digital signal processors need flexibility to support
the generation of large output results in accumulators, where
the magnitude of the output is larger than that of the multiplier
inputs. They also need to deal correctly with small
operands that do not use the full input bit width. Taking as
a starting point a full 2N-bit multiplier, where the full partial
product matrix is computed in order to produce an exact
two's complement result, the architecture proposed in this
paper provides a method to adjust the active width of the
multiplier following a column-based strategy, thus allowing
a flexible truncation scheme capable of adapting the power
consumption to the requirements of the application.
Equation (1) shows the mathematical representation of the
partial product matrix of a programmable truncated
multiplier. Using external control signals, the active toggling
area can be defined in real time:
$$P_T = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} t(i+j)\, pp_t[i+j]\, 2^{i+j-2N+2} \qquad (1)$$
where $P_T$ indicates a truncated multiplication, t(i+j) is an
input vector of 2N-1 elements aligned with the output
product bits, and $pp_t[i+j]$ are the terms that form the partial
product matrix, with column weights $2^{i+j-2N+2}$ following the
fractional format of Section II. For a Baugh-Wooley implementation, the
$pp_t[i+j]$ terms can be obtained as indicated in (2):
$$pp_t[i+j] = \begin{cases} x_i y_j, & i, j < N-1 \ \text{or} \ i = j = N-1 \\ \overline{x_i y_{N-1}}, & i < N-1,\ j = N-1 \\ \overline{x_{N-1} y_j}, & i = N-1,\ j < N-1 \end{cases} \qquad (2)$$
where the $x_i$ and $y_j$ terms correspond to the bits in the inputs of
the multiplier, X and Y. No gating is applied to the constant
terms added by the Baugh-Wooley algorithm since, being
constant, these bits do not contribute to the dynamic
power consumption of the multiplier [39]. Fig. 2 illustrates
the concept of programmable truncated multiplication in an
8×8-bit two's complement multiplier. The column-wise
controllability of the active state of the partial product terms
is achieved by substituting the 2-input AND gates that form
the original partial product terms with 3-input AND gates,
assigning one of the inputs to the corresponding bit of t. Since
3-input AND gates require an area overhead of 20% over 2-input
ones, adding programmable truncation results in a maximum
area penalty of 20% of the partial product matrix.
Post-synthesis area reports indicate that the area overhead
due to the addition of programmable truncation in the whole
partial product matrix represents approximately 4% of the
multiplier area in the proposed 16-bit multiplier, increasing
the logic gate count from 886 to 905 equivalent logic gates.
The multiplier input t controls the functionality of the
programmable partial product cells that generate the final
result. Each of the bits within t is connected to the extra input
of the 3-input AND gates that form the corresponding
column in the partial product matrix. If t=0xFFFF in Fig. 2,
all partial products are generated with no truncation, and the
result will be equal to that obtained by a full-precision
multiplier. A value of t=0x7F00 will result in half the partial
product matrix being disabled, which is comparable to
results obtained by direct truncation for fixed-width
multipliers, and equivalent to those in [4] plus a negative
bias. The PTM (Programmable Truncated Multiplier)
structure, whose partial products are generated using the
Baugh-Wooley algorithm and whose truncation is performed using
the new technique, is coded in VHDL and simulated.
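The column gating described above can be modelled behaviourally in a few lines. The sketch below is a software model written for illustration, not the VHDL used in the paper; it assumes integer-scaled two's complement operands, that bit k of the control word t enables column k = i + j, that a disabled cell contributes 0, and that the Baugh-Wooley constants are always added. The name `ptm_multiply` is hypothetical.

```python
def ptm_multiply(x: int, y: int, t: int, n: int = 8) -> int:
    """Behavioural model of a programmable truncated multiplier.
    x, y: n-bit two's complement integers.
    t:    control word; bit k enables partial product column k."""
    mask = (1 << n) - 1
    total = 0
    for i in range(n):
        for j in range(n):
            if not (t >> (i + j)) & 1:
                continue  # column disabled: the gated cell outputs 0
            bit = ((x & mask) >> i) & ((y & mask) >> j) & 1
            # Baugh-Wooley: complement terms with exactly one sign bit.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            total += bit << (i + j)

    # Constant Baugh-Wooley bits are never gated.
    total += (1 << n) + (1 << (2 * n - 1))

    # Keep 2n bits and reinterpret them as a signed value.
    total &= (1 << (2 * n)) - 1
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

Under these assumptions, any t with bits 0 to 14 set reproduces the exact product for N = 8, while t = 0x7F00 keeps only columns 8 to 14, so the result can only be less than or equal to the exact product, which matches the negative bias mentioned in the text.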
Fig. 3: Partial product matrix for an 8-bit programmable
truncated multiplier.
IV. SIMULATION RESULTS
Fig. 4: 8×8 Truncated Multiplier Simulation Output
Fig. 5: Programmable Truncated Multiplier Output for
unsigned numbers
Fig. 6: Programmable Truncated Multiplier Output for
signed numbers
V. CONCLUSION
In this paper, a new architecture to perform truncated
multiplication has been presented and compared against
previous implementations. It has been shown that
column-wise implementation can ease the power-SNR trade-offs of
arithmetic units based on truncated multipliers with very
little overhead when part of a complete DSP system.
REFERENCES
[1] M. de la Guia Solaz and R. Conway, "A flexible low
power DSP with a programmable truncated multiplier,"
IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 59,
no. 11, Nov. 2012.
[2] C. Baugh and B. Wooley, "A two's complement
parallel array multiplication algorithm," IEEE Trans.
Comput., vol. C-22, no. 12, pp. 1045–1047, 1973.
[3] A. Booth, "A signed binary multiplication
technique," Quart. J. Mech. Appl. Math., vol. 4, pp.
236–240, 1951.
[4] N. Petra, D. De Caro, and A. Strollo, "Design of
fixed-width multipliers with minimum mean square
error," in Proc. 18th Eur. Conf. Circuit Theory
Design (ECCTD), Aug. 2007, pp. 464–467.
[5] M. Schulte and E. E. Swartzlander, Jr., "Truncated
multiplication with correction constant [for DSP]," in
Proc. Workshop VLSI Signal Process. VI, Oct. 1993,
pp. 388–396.
[6] N. Yoshida, E. Goto, and S. Ichikawa,
"Pseudorandom rounding for truncated multipliers,"
IEEE Trans. Comput., vol. 40, no. 9, pp. 1065–1067,
1991.
[7] N. Petra, D. De Caro, V. Garofalo, E. Napoli, and A.
Strollo, "Truncated binary multipliers with variable
correction and minimum mean square error," IEEE
Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 6, pp.
1312–1325, 2010.
[8] N. Petra, D. De Caro, V. Garofalo, E. Napoli, and A.
G. M. Strollo, "Design of fixed-width multipliers
with linear compensation function," IEEE Trans.
Circuits Syst. I, Reg. Papers, vol. 58, no. 5, pp.
947–960, May 2011.
[9] C.-H. Chang and R. K. Satzoda, "A low error and high
performance multiplexer-based truncated multiplier,"
IEEE Trans. Very Large Scale Integr. (VLSI) Syst.,
vol. 18, no. 12, pp. 1767–1771, Dec. 2010.
[10] I.-C. Wey and C.-C. Wang, "Low-error and
area-efficient fixed-width multiplier by using minor input
correction vector," in Proc. Int. Conf. Electron. Inf.
Eng. (ICEIE), Aug. 2010, vol. 1, pp. V1-118–V1-122.
[11] K. Han, B. Evans, and E. E. Swartzlander, Jr.,
"Low-power multipliers with data wordlength reduction," in
Conf. Rec. 39th Asilomar Conf. Signals, Syst.
Comput., Nov. 1, 2005, vol. 28, pp. 1615–1619.

Ontological Model of Educational Programs in Computer Science (Bachelor and M...
 
Understanding IoT Management for Smart Refrigerator
Understanding IoT Management for Smart RefrigeratorUnderstanding IoT Management for Smart Refrigerator
Understanding IoT Management for Smart Refrigerator
 
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...
 
A Review: Microwave Energy for materials processing
A Review: Microwave Energy for materials processingA Review: Microwave Energy for materials processing
A Review: Microwave Energy for materials processing
 
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web LogsWeb Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
 
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEMAPPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEM
 
Making model of dual axis solar tracking with Maximum Power Point Tracking
Making model of dual axis solar tracking with Maximum Power Point TrackingMaking model of dual axis solar tracking with Maximum Power Point Tracking
Making model of dual axis solar tracking with Maximum Power Point Tracking
 
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...
 
Study and Review on Various Current Comparators
Study and Review on Various Current ComparatorsStudy and Review on Various Current Comparators
Study and Review on Various Current Comparators
 
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...
 
Defending Reactive Jammers in WSN using a Trigger Identification Service.
Defending Reactive Jammers in WSN using a Trigger Identification Service.Defending Reactive Jammers in WSN using a Trigger Identification Service.
Defending Reactive Jammers in WSN using a Trigger Identification Service.
 

Recently uploaded

Laundry management system project report.pdf
Laundry management system project report.pdfLaundry management system project report.pdf
Laundry management system project report.pdf
Kamal Acharya
 
Hall booking system project report .pdf
Hall booking system project report  .pdfHall booking system project report  .pdf
Hall booking system project report .pdf
Kamal Acharya
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdf
Kamal Acharya
 

Recently uploaded (20)

Laundry management system project report.pdf
Laundry management system project report.pdfLaundry management system project report.pdf
Laundry management system project report.pdf
 
IT-601 Lecture Notes-UNIT-2.pdf Data Analysis
IT-601 Lecture Notes-UNIT-2.pdf Data AnalysisIT-601 Lecture Notes-UNIT-2.pdf Data Analysis
IT-601 Lecture Notes-UNIT-2.pdf Data Analysis
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
 
Hall booking system project report .pdf
Hall booking system project report  .pdfHall booking system project report  .pdf
Hall booking system project report .pdf
 
fluid mechanics gate notes . gate all pyqs answer
fluid mechanics gate notes . gate all pyqs answerfluid mechanics gate notes . gate all pyqs answer
fluid mechanics gate notes . gate all pyqs answer
 
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
 
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfA CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
 
Natalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in KrakówNatalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in Kraków
 
Peek implant persentation - Copy (1).pdf
Peek implant persentation - Copy (1).pdfPeek implant persentation - Copy (1).pdf
Peek implant persentation - Copy (1).pdf
 
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdfRESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
 
The Ultimate Guide to External Floating Roofs for Oil Storage Tanks.docx
The Ultimate Guide to External Floating Roofs for Oil Storage Tanks.docxThe Ultimate Guide to External Floating Roofs for Oil Storage Tanks.docx
The Ultimate Guide to External Floating Roofs for Oil Storage Tanks.docx
 
Top 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering ScientistTop 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering Scientist
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Arduino based vehicle speed tracker project
Arduino based vehicle speed tracker projectArduino based vehicle speed tracker project
Arduino based vehicle speed tracker project
 
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical EngineeringIntroduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
 
Toll tax management system project report..pdf
Toll tax management system project report..pdfToll tax management system project report..pdf
Toll tax management system project report..pdf
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdf
 
Scaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltageScaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltage
 
Pharmacy management system project report..pdf
Pharmacy management system project report..pdfPharmacy management system project report..pdf
Pharmacy management system project report..pdf
 

Design and Implementation of a Programmable Truncated Multiplier

  • 1. IJSRD - International Journal for Scientific Research & Development | Vol. 1, Issue 5, 2013 | ISSN (online): 2321-0613. All rights reserved by www.ijsrd.com

Abstract: Truncated multiplication reduces part of the power required by multipliers by computing only the most-significant bits of the product. The most common approach to truncation physically reduces the partial product matrix and compensates for the removed bits with hardware compensation subcircuits. However, this results in fixed systems optimized for a given application at design time. A novel approach to truncation is proposed, in which a full-precision multiplier is implemented but the active section of the partial product matrix is selected dynamically at run time. This allows power reduction to be traded off against signal degradation, and the trade-off can be modified at run time. Such an architecture brings together the power reduction benefits of truncated multipliers and the flexibility of reconfigurable and general-purpose devices. An efficient implementation of such a multiplier is presented in a custom digital signal processor, where the concept of software compensation is introduced and analysed for different applications. Experimental results and power measurements are studied, including power measurements from both post-synthesis simulations and a fabricated IC implementation. This is the first system-level DSP core using a fine-grain truncated multiplier. Results demonstrate the effectiveness of the programmable truncated MAC (PTMAC) in achieving power reduction with minimum impact on functionality for a number of applications. Software compensation is also shown to be effective when deploying truncated multipliers in a system.

I. INTRODUCTION

The rapid growth of portable communication and computing devices and the advance of mobile multimedia systems have made power consumption a critical optimization target in the design of digital signal processing architectures.
Advances in very-large-scale integration (VLSI) technology have been one of the major driving forces behind the growth of digital signal processors (DSPs), enabling the implementation of complex algorithms in programmable DSP architectures and in fixed application-specific hardware. Within DSP systems, multipliers are among the most fundamental building blocks. Parallel implementations, required for high-speed computation, are the main contributor to the power consumption of any hardware dedicated to complex mathematical functions such as filtering, compression, or classification. The relationship between the physical characteristics of the multiplier and its resolution determines the efficiency and accuracy of the system. The optimization of multipliers in terms of area, power and timing has been studied extensively.

A full (direct) implementation of an N×N-bit multiplication yields a 2N-bit product. To keep the full accuracy of the system, a DSP architecture would need an ever-growing bit width that would be impossible or impractical to implement. To avoid this, results are usually truncated or rounded to keep them within the bit width of the architecture. To reduce the requirements of parallel multipliers in systems where an exact result is not needed, several techniques that reduce power consumption at the expense of lower precision have been presented in the literature. Such techniques skip the implementation of parts of the partial product matrix, disable them, or both, trading the energy spent in the computation for degradation of the output signal. These techniques fall into two categories: word-length reduction and truncated multipliers. Word-length reduction techniques minimize switching activity at the expense of data precision by shifting or truncating the inputs. Because the reduction is applied to the input operands, the power savings come at the cost of high levels of output noise.
Truncated multiplication structures based on the Baugh-Wooley [2] or Booth [3] algorithms do not, by design, compute the lowest sections of the partial product matrix. This introduces a certain error in the output, while achieving savings in power, area, complexity, and timing. Most previous publications focus on fixed-width multipliers [4] that produce a fixed N-bit output and whose gain-versus-error ratio is set in hardware. Configurable techniques result in flexible architectures which, while giving up the benefits of area optimization and static power reduction, give the multiplier the advantage of adaptability. This is desirable in systems with a certain degree of programmability.

The architecture presented in this paper, a programmable truncated multiplier (PTM), is a full-precision multiplier in which the elements of the partial product matrix can be disabled column by column through an external control word. Disabled columns cease to contribute to the dynamic power consumption, which reduces the power drawn by the multiplier. Benefits of PTMs include:

1. Real-time control of the power-SNR trade-off in applications where power modes are switchable at run time.
2. Flexibility in accuracy selection for programmable devices such as DSP structures, which benefit from different truncation levels for different applications and at different operating points of an individual algorithm.

The paper is organized as follows. Section II presents a brief background on multipliers and two's complement multiplication. The proposed work is presented in Section III. Section IV gives the simulation results of the implemented multiplier.

Sethu Merin George, M.Tech, VLSI & ES, MLMCE, Kottayam
  • 2. Design and Implementation of a Programmable Truncated Multiplier (IJSRD/Vol. 1/Issue 5/2013/038)

Finally, the conclusion section summarizes the work performed and the results.

II. BACKGROUND

A. Two's Complement Multiplication

An N-bit fractional multiplicand X and an N-bit fractional multiplier Y are considered as the inputs of the multiplier. Let

$$X = -x_{N-1} + \sum_{i=0}^{N-2} x_i\, 2^{i-N+1} \quad \text{and} \quad Y = -y_{N-1} + \sum_{j=0}^{N-2} y_j\, 2^{j-N+1},$$

where $x_i, y_j \in \{0, 1\}$. For the sake of generality, their 2N-bit product $P_{Standard}$ is maintained in full precision as

$$P_{Standard} = XY = x_{N-1} y_{N-1} + \sum_{i=0}^{N-2} \sum_{j=0}^{N-2} x_i y_j\, 2^{i+j-2N+2} - \sum_{i=0}^{N-2} x_i y_{N-1}\, 2^{i-N+1} - \sum_{j=0}^{N-2} x_{N-1} y_j\, 2^{j-N+1}.$$

Applying the Baugh-Wooley algorithm [2] to the partial product matrix that generates $P_{Standard}$ results in a final matrix of all-positive partial product bits:

$$P_{Standard} = XY = x_{N-1} y_{N-1} + \sum_{i=0}^{N-2} \sum_{j=0}^{N-2} x_i y_j\, 2^{i+j-2N+2} + \sum_{i=0}^{N-2} \overline{x_i y_{N-1}}\, 2^{i-N+1} + \sum_{j=0}^{N-2} \overline{x_{N-1} y_j}\, 2^{j-N+1} - 2 + 2^{-N+2}.$$

Once the partial product matrix of the parallel multiplier is generated, the partial product terms have to be combined into the final product. Tree structures are often preferred to array implementations because of their shorter critical path.

B. Truncated Multipliers

Multiplication is a common requirement in DSP systems and plays an important role in their power consumption. In systems where it is not necessary to compute the exact least significant part of the product, truncated multipliers achieve power, area and timing improvements by not implementing part of the least significant section of the partial product matrix [5]-[9]. Instead of the full-precision output, the output is the sum of the most significant (N+h) columns (where 0 ≤ h ≤ N) plus an estimate of the unformed bits. Fig. 1 displays a generic partial product matrix split into two main regions: the least significant part (LSP), which contains the N least significant columns of the partial product matrix, and the most significant part (MSP), which contains the remaining most significant columns.
The LSP can also be split into two regions: LSP major, the h most significant columns of the LSP, and LSP minor, the N−h least significant columns of the partial product matrix. The product of a full-width multiplier, where the partial product matrix is fully implemented and gives an exact 2N-bit result, can be described as $P_{full} = S_{MSP} + S_{LSP}$, where $S_{MSP}$ is the sum of the partial product bits belonging to the MSP and $S_{LSP}$ is the sum of those belonging to the LSP. In many applications, the products generated by fixed-width N×N-bit multipliers are truncated or rounded back to the original bit width in later stages of the algorithm flow. The N-bit product after rounding is obtained as $P_{fullround} = t_N(S_{MSP} + S_{LSP} + LSB/2)$, where LSB is the value of one bit in the least significant column of the MSP, and $t_N(x)$ represents truncation of an operand x by discarding its N lowest bits. In this case N bits are dropped to maintain the original width of N bits.

Truncation reduces the complexity of the multiplier unit by discarding the lower parts of the partial product matrix. Most of the resulting error falls in the lower-weighted output bits, which are discarded anyway when the output is converted back to the original bit width. Significant savings in power and complexity can therefore be achieved at the expense of signal degradation. The simplest scheme removes the lower columns of the partial product matrix that form the LSP, and is referred to in the literature as direct truncation (D-Truncation). It almost halves the area and power requirements of the multiplier, at the price of large errors with a strong negative bias in the multiplier output.
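The negative bias of direct truncation can be checked with a small behavioural model. The sketch below is not the paper's VHDL: it assumes n-bit two's complement integers instead of fractional operands (the bit patterns are the same, only the scaling differs), and the names `bw_columns`, `bw_product` and `direct_truncation_errors` are hypothetical. It builds the Baugh-Wooley partial-product columns, adds them with the constant terms, and optionally leaves out the n LSP columns.

```python
# Behavioural sketch (assumption: integer operands, not the paper's VHDL).

def bw_columns(x, y, n):
    """Sum of partial-product bits in each of the 2n-1 columns (weight 2**k)."""
    cols = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            bit = ((x >> i) & 1) & ((y >> j) & 1)
            # Terms that pair exactly one sign bit are complemented.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            cols[i + j] += bit
    return cols


def bw_product(x, y, n, first_column=0):
    """Add the columns from first_column upwards; 0 gives the exact product."""
    # Baugh-Wooley constants 2**n - 2**(2n-1), written modulo 2**(2n).
    total = (1 << n) + (1 << (2 * n - 1))
    for k, c in enumerate(bw_columns(x, y, n)):
        if k >= first_column:
            total += c << k
    total &= (1 << (2 * n)) - 1            # keep 2n result bits
    sign = 1 << (2 * n - 1)
    return total - (sign << 1) if total & sign else total


def direct_truncation_errors(n):
    """Error caused by dropping the n LSP columns, for every input pair."""
    values = range(-(1 << (n - 1)), 1 << (n - 1))
    return [bw_product(x, y, n, first_column=n) - x * y
            for x in values for y in values]
```

With `first_column=0` the model reproduces `x * y`. With `first_column=n` every dropped bit has a non-negative weight, so no entry of `direct_truncation_errors(4)` is positive, which is the one-sided error that the compensation schemes discussed next try to correct.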
To reduce both the magnitude and the bias of the truncation error, most references in the literature study fixed-width multiplying structures in which the partial product bits belonging to LSP minor are not implemented. A compensation term is added for the missing partial products, usually in the form of a function f(IC) of the input correction partial products, where the IC terms are the partial product bits in the leftmost column of LSP minor. The constant compensation method introduced in [5] was further improved and explored for fixed-width multipliers in [4]. Variable correction structures calculate probabilistic estimates of the sum of the elements of LSP minor from the partial products in its leftmost column.

C. Reconfigurable Power Reduction

Fixed-width truncated multipliers minimize power consumption and optimize silicon area and timing, at the price of a truncation error, by not implementing the LSP minor region of the partial product matrix. Being purpose-specific and optimized in hardware, they provide no flexibility to operate in different configuration modes in systems where the power and performance goals vary with time. Reconfigurable and programmable multipliers are structures that allow a hardware multiplier architecture to be configured to operate in different modes. Such structures can be split into two main groups: i) reconfigurable (or configurable) multipliers, which can dynamically divide a large full-precision or fixed-width multiplier into several smaller units working in parallel, and ii) programmable multipliers, where parts of the partial product matrix can be disabled at run time by reducing or eliminating switching activity. In [10], dynamic power consumption in a multiplier is reduced by using arbitrary levels of bit precision on a full N×N-bit multiplier.
Here, reducing the input width reduces the toggling of different parts of the multiplier, effectively disabling them when the input operands are left-shifted.

III. PROPOSED WORKS

A. 8×8 Truncated Multiplier

Figure 1 shows the architecture of a full 2N-bit multiplier that can be used to multiply two N-bit numbers. Using this architecture, a truncated 8×8 multiplier is implemented. The truncation reduces the overall area and power consumption, which is desirable in many DSP applications. The circuit is coded in VHDL and simulated using Xilinx ISE 13.3a to obtain the corresponding results.
  • 3.

Fig. 1: Architecture of 8×8 standard multiplier

Fig. 2: Truncated 8×8 multiplier

B. The Programmable Truncated Multiplier

While many specific applications require the inputs and the output of the multiplier to have the same bit width, general-purpose digital signal processors need the flexibility to generate large results in accumulators, where the magnitude of the output is bigger than that of the multiplier inputs. They also need to deal correctly with small operands that do not use the full input bit width. Taking as a starting point a full 2N-bit multiplier, where the full partial product matrix is computed to produce an exact two's complement result, the architecture proposed in this paper provides a method to adjust the active width of the multiplier column by column. This gives a flexible truncation scheme capable of adapting the power consumption to the requirements of the application. Equation (1) gives the mathematical representation of the partial product matrix of a programmable truncated multiplier. Using external control signals, the active toggling area can be defined in real time:

$$P_T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} t_{(i+j)}\, pp_t[i+j]\, 2^{i+j-2N+2} \qquad (1)$$

where $P_T$ denotes the result of a truncated multiplication, $t_{(i+j)}$ is an input vector of 2N−1 elements aligned with the output product bits, $pp_t[i+j]$ are the terms that form the partial product matrix, and $2^{i+j-2N+2}$ is the weight of column i+j for fractional operands. For a Baugh-Wooley implementation, the $pp_t[i+j]$ terms can be obtained as indicated in (2):

$$pp_t[i+j] = \begin{cases} x_i y_j, & i, j < N-1 \ \text{or}\ i = j = N-1 \\ \overline{x_i y_j}, & i = N-1,\ j < N-1 \\ \overline{x_i y_j}, & i < N-1,\ j = N-1 \end{cases} \qquad (2)$$

where the $x_i$ and $y_j$ terms correspond to the bits of the multiplier inputs X and Y. No gating is applied to the constant terms added by the Baugh-Wooley algorithm because, being constant, these bits do not contribute to the dynamic power consumption of the multiplier [39]. Fig. 3 illustrates the concept of programmable truncated multiplication in an 8×8-bit two's complement multiplier.
Column-wise control of the active state of the partial product terms is achieved by replacing the 2-input AND gates that form the original partial product terms with 3-input AND gates, and assigning one of the inputs to the corresponding bit of t. Since a 3-input AND gate requires an area overhead of 20% over a 2-input one, adding programmable truncation results in a maximum area penalty of 20% of the partial product matrix. Post-synthesis area reports indicate that the area overhead of adding programmable truncation to the whole partial product matrix represents approximately 4% of the multiplier area in the proposed 16-bit multiplier, increasing the logic gate count from 886 to 905 equivalent logic gates.

The multiplier input t controls the functionality of the programmable partial product cells that generate the final result. Each of the bits of t is connected to the extra input of the 3-input AND gates that form the corresponding column of the partial product matrix. If t=0xFFFF in Fig. 3, all partial products are generated with no truncation, and the result is equal to that of a full-precision multiplier. A value of t=0x7F00 disables half of the partial product matrix, which is comparable to direct truncation for fixed-width multipliers, and equivalent to the results in [4] plus a negative bias. The PTM structure, whose partial products are generated using the Baugh-Wooley algorithm and whose truncation uses the new technique, is coded in VHDL and simulated.

Fig. 3: Partial product matrix for an 8-bit programmable truncated multiplier.
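The column gating described above can be illustrated with a bit-level behavioural model. This is a minimal sketch under stated assumptions, not the paper's VHDL: operands are treated as n-bit two's complement integers, a disabled cell is assumed to contribute 0, and the function name `ptm_multiply` is hypothetical. Bit k of the control word `t` plays the role of the third AND-gate input for every cell in column k, while the Baugh-Wooley constants are left ungated.

```python
# Behavioural sketch of a programmable truncated multiplier (assumptions:
# integer operands, disabled cells contribute 0, hypothetical function name).

def ptm_multiply(x, y, t, n=8):
    """Multiply two n-bit two's complement integers with column gating.

    Bit k of t enables column k (weight 2**k) of the Baugh-Wooley matrix.
    Only bits 0 .. 2n-2 of t are used; higher bits are ignored.
    """
    # Ungated Baugh-Wooley constants 2**n - 2**(2n-1), modulo 2**(2n).
    total = (1 << n) + (1 << (2 * n - 1))
    for i in range(n):
        for j in range(n):
            bit = ((x >> i) & 1) & ((y >> j) & 1)
            # Terms that pair exactly one sign bit are complemented.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            gate = (t >> (i + j)) & 1      # column enable from control word
            total += (bit & gate) << (i + j)
    total &= (1 << (2 * n)) - 1            # keep 2n result bits
    sign = 1 << (2 * n - 1)
    return total - (sign << 1) if total & sign else total
```

In this model `ptm_multiply(-37, 91, 0xFFFF)` returns the exact product -3367, and `ptm_multiply(-37, 91, 0x7F00)` returns a value at or below it, since the eight least significant columns are gated off. Keeping the control word as a plain bit mask mirrors the one-bit-per-column wiring in the text, so intermediate truncation levels such as 0x7FC0 need no extra logic.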
  • 4.

IV. SIMULATION RESULTS

Fig. 4: 8×8 Truncated Multiplier Simulation Output

Fig. 5: Programmable Truncated Multiplier Output for unsigned numbers

Fig. 6: Programmable Truncated Multiplier Output for signed numbers

V. CONCLUSION

In this paper, a new architecture to perform truncated multiplication has been presented and compared against previous implementations. It has been shown that a column-wise implementation can ease the power-SNR trade-offs of arithmetic units based on truncated multipliers, with very little overhead, when they are part of a complete DSP system.

REFERENCES

[1] M. de la Guia Solaz and R. Conway, "A flexible low power DSP with a programmable truncated multiplier," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 59, no. 11, Nov. 2012.
[2] C. Baugh and B. Wooley, "A two's complement parallel array multiplication algorithm," IEEE Trans. Comput., vol. C-22, no. 12, pp. 1045–1047, 1973.
[3] A. Booth, "A signed binary multiplication technique," Quart. J. Mech. Appl. Math., vol. 4, pp. 236–240, 1951.
[4] N. Petra, D. De Caro, and A. Strollo, "Design of fixed-width multipliers with minimum mean square error," in Proc. 18th Eur. Conf. Circuit Theory Design (ECCTD), Aug. 2007, pp. 464–467.
[5] M. Schulte and E. E. Swartzlander, Jr., "Truncated multiplication with correction constant [for DSP]," in Proc. Workshop VLSI Signal Process. VI, Oct. 1993, pp. 388–396.
[6] N. Yoshida, E. Goto, and S. Ichikawa, "Pseudorandom rounding for truncated multipliers," IEEE Trans. Comput., vol. 40, no. 9, pp. 1065–1067, 1991.
[7] N. Petra, D. De Caro, V. Garofalo, E. Napoli, and A. Strollo, "Truncated binary multipliers with variable correction and minimum mean square error," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 6, pp. 1312–1325, 2010.
[8] N. Petra, D. De Caro, V. Garofalo, E. Napoli, and A. G. M. Strollo, "Design of fixed-width multipliers with linear compensation function," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 58, no. 5, pp. 947–960, May 2011.
[9] C.-H. Chang and R. K. Satzoda, "A low error and high performance multiplexer-based truncated multiplier," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 12, pp. 1767–1771, Dec. 2010.
[10] I.-C. Wey and C.-C. Wang, "Low-error and area-efficient fixed-width multiplier by using minor input correction vector," in Proc. Int. Conf. Electron. Inf. Eng. (ICEIE), Aug. 2010, vol. 1, pp. V1-118–V1-122.
[11] K. Han, B. Evans, and E. E. Swartzlander, Jr., "Low-power multipliers with data wordlength reduction," in Conf. Rec. 39th Asilomar Conf. Signals, Syst. Comput., Nov. 1, 2005, vol. 28, pp. 1615–1619.