NXFEE INNOVATION
(SEMICONDUCTOR IP & PRODUCT DEVELOPMENT)
(ISO 9001:2015 Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry – 605004, India.
Buy Project Online: www.nxfee.com | Contact: +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
Approximate Sum-of-Products Designs Based on Distributed Arithmetic
Abstract:
Approximate circuits provide high performance and require low power. Sum-of-products
(SOP) units are key elements in many digital signal processing applications. In this brief,
three approximate SOP (ASOP) models based on distributed arithmetic are
proposed. They are designed for different levels of accuracy. The first ASOP model
achieves an improvement of up to 64% in area and 70% in power compared with the
conventional unit. The other two models provide improvements of 32% and 48% in area
and 54% and 58% in power, respectively, with a reduced error rate compared with the
first model. The third model achieves a mean relative error and normalized error distance
as low as 0.05% and 0.009%, respectively. The performance of the approximate units is
evaluated with a noisy image smoothing application, where the proposed models are capable of
achieving a higher peak signal-to-noise ratio than the existing state-of-the-art techniques. It
is shown that the proposed approximate models achieve higher processing accuracy than
existing works, together with significant improvements in power and performance.
Software Implementation:
• ModelSim
• Xilinx 14.2
Existing System:
Approximate computing provides an efficient solution for the design of power-efficient
digital systems. For applications such as multimedia and data processing, approximate
circuits are a promising alternative for reducing area and power in
digital systems that can tolerate some loss of precision. Although they are key components in
arithmetic circuits, sum-of-products (SOP) units have received less attention in terms of
approximate implementation. Distributed arithmetic is a very efficient means for
calculation of the inner products between vectors.
It implements multiplication through a series of table lookups and shift-and-accumulate
operations. Because the level of parallelism in the distributed arithmetic
structure is flexible, the area-speed tradeoff can be adjusted. Distributed arithmetic is a bit-serial
operation that computes the inner product of two vectors in parallel. It requires no
multipliers and provides an efficient mechanism for the SOP operation. Bit-parallel
versions of distributed arithmetic have been proposed. In this brief, three models of SOP
units based on parallel distributed arithmetic are proposed. Their scheme simply involves
truncation in the number of lookup tables, by eliminating the least significant part of the
distributed arithmetic operation. Multipliers have been extensively studied for
approximate implementation. Two models of approximate compressors with reduced
erroneous outputs have been proposed to accumulate the partial products of the Dadda tree multiplier.
The probability-based multiplier alters the partial products and reduces
the generated partial product tree based on their probability. The partial product perforation
(PPP) multiplier omits k partial products starting from the jth position, which in turn
reduces the number of adders used in the accumulation of partial products. In this brief,
novel ASOP designs are proposed using the efficient distributed arithmetic structure.
Approximation involves changes with respect to word length, number of lookup tables,
and number of elements in the final accumulator. Three models are proposed. The first model
provides significant power reduction with lower mean relative error (MRE) and
normalized error distance (NED).
The second and third models provide better accuracy at the cost of increased area and power
compared with the first model. In the proposed approximate structures, reductions in the number of
lookup tables, length of adders, and accumulator size are employed for approximation.
Compared to the exact SOP unit, the proposed models have reduced circuit complexity.
NED is an effective metric to quantify the approximation irrespective of the size of the
circuit. The traditional MRE metric is also used to evaluate the impact of approximation. Error
distance is the difference between the exact value and the approximate value, whereas
relative error is the error distance divided by the exact value. NED is calculated
by normalizing the error distance by the maximum possible exact output. MRE is calculated
as the mean of the relative errors over all possible values.
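The two metrics defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: NED is taken here as the mean error distance divided by the maximum exact output, inputs whose exact result is zero are skipped when averaging relative errors, and the 4-bit toy multiplier (`exact_mul`, `approx_mul`) is a hypothetical stand-in, not one of the ASOP models.

```python
from itertools import product


def error_metrics(exact_fn, approx_fn, inputs, max_exact):
    """Return (MRE, NED) for an approximate function over the given inputs."""
    ed_sum = 0.0      # accumulated error distance
    rel_sum = 0.0     # accumulated relative error
    count = 0
    rel_count = 0
    for args in inputs:
        exact = exact_fn(*args)
        approx = approx_fn(*args)
        ed = abs(exact - approx)          # error distance
        ed_sum += ed
        count += 1
        if exact != 0:                    # assumption: relative error undefined at 0
            rel_sum += ed / exact
            rel_count += 1
    mre = rel_sum / rel_count
    ned = (ed_sum / count) / max_exact    # assumption: mean ED normalized by max output
    return mre, ned


def exact_mul(a, b):
    return a * b


def approx_mul(a, b):
    # Hypothetical approximation: ignore the least significant bit of a.
    return (a & ~1) * b


all_inputs = list(product(range(16), range(16)))
mre, ned = error_metrics(exact_mul, approx_mul, all_inputs, 15 * 15)
print(f"MRE = {mre:.4f}, NED = {ned:.4f}")
```

Passing the exact and approximate units as functions keeps the metric code independent of the circuit being evaluated.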
Disadvantages:
• Low processing accuracy
• Poor performance
• Requires high power
Proposed System:
Proposed Approximate Sum-of-Products
In this brief, K is 3 and N is 16. For a conventional implementation of the SOP unit based on
parallel distributed arithmetic [4], three two-input 16-bit adders, one three-input 16-bit
adder, 16 lookup tables with eight cases, and a final accumulator with 16 elements are
required. In the approximation models, hardware requirements are considerably reduced.
Three ASOP models are proposed: ASOP1, ASOP2, and ASOP3.
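To make the conventional structure concrete, the following Python sketch models an exact SOP for K = 3 and N = 16 with distributed arithmetic. It is a behavioural illustration only, assuming unsigned inputs; the function names `build_lut` and `sop_da` are hypothetical. The eight lookup table cases hold 0, a_1, a_2, a_1 + a_2, a_3, a_1 + a_3, a_2 + a_3, and a_1 + a_2 + a_3, and bit n of b_1, b_2, and b_3 selects one case per table.

```python
K, N = 3, 16  # number of elements and word length used in the text


def build_lut(a):
    """Eight-case table: entry idx is the sum of the a_k whose select bit is 1."""
    return [sum(a[k] for k in range(K) if (idx >> k) & 1) for idx in range(8)]


def sop_da(a, b, n_bits=N):
    """Exact a_1*b_1 + a_2*b_2 + a_3*b_3 via table lookups (unsigned inputs assumed)."""
    lut = build_lut(a)
    acc = 0
    for n in range(n_bits):                      # one lookup table per bit position
        # Control signals b_1n, b_2n, b_3n select one of the eight cases.
        idx = sum(((b[k] >> n) & 1) << k for k in range(K))
        acc += lut[idx] << n                     # shift-and-accumulate
    return acc


a = [int("0011001000101110", 2), int("0001011000101011", 2), int("0010011001101000", 2)]
b = [int("0001001011101001", 2), int("0001101000101110", 2), int("0000101011101011", 2)]
print(sop_da(a, b) == sum(x * y for x, y in zip(a, b)))
```

No multiplication appears inside `sop_da`; the products are formed entirely from the three two-input sums, the one three-input sum, and the shifted accumulation.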
Proposed Approximate Sum-of-Products Model ASOP1
In approximate model 1, K is 3 and N is reduced. The m least significant bits of a_k
and b_k for k = 1, 2, and 3 are truncated. Implementations with m = 8, 6, and 4 bits are considered. For this
implementation, three two-input (16 − m)-bit adders, one three-input (16 − m)-bit adder, 16 −
m lookup tables with eight cases, and a final accumulator with 16 − m elements are required.
This considerably reduces the hardware utilization at all levels. The approximate
model with reduced elements is shown in Fig. 1. By implementing with limits m to N − 1,
the number of lookup tables reduces to 16 − m, and 16 − m elements are sent to the final
accumulator ((16 − m) × 18). It should be noted that in ASOP1, the number of input bits to
the adders is reduced, which further reduces the complexity of the accumulator
((16 − m) × (18 − m)), compared to [5].
Fig. 1. Approximate lookup table and corresponding ASOP (ASOP1) structure for K = 3 and N = 16.
Fig. 2. Approximate lookup table and corresponding ASOP (ASOP2) structure for K = 3 and N = 16.
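The effect of the ASOP1 truncation can be explored with a small behavioural model. The sketch below is one possible reading of the description, not the authors' hardware: it assumes unsigned inputs, drops the m least significant bits of every a_k and b_k, runs the table-lookup accumulation on the remaining 16 − m bits, and then restores the weight of the dropped bits with a left shift. The name `asop1` and the final rescaling step are assumptions made for illustration.

```python
def asop1(a, b, m, n_bits=16):
    """Behavioural sketch of ASOP1 for K = 3 (unsigned inputs assumed)."""
    a_t = [x >> m for x in a]   # (16 - m)-bit operands feed the adders
    b_t = [x >> m for x in b]   # (16 - m) control bits remain per input
    # Eight-case table built from the truncated operands.
    lut = [sum(a_t[k] for k in range(3) if (idx >> k) & 1) for idx in range(8)]
    acc = 0
    for n in range(n_bits - m):              # only 16 - m lookup tables remain
        idx = sum(((b_t[k] >> n) & 1) << k for k in range(3))
        acc += lut[idx] << n
    return acc << (2 * m)                    # assumption: rescale to the exact weight


a = [int("0011001000101110", 2), int("0001011000101011", 2), int("0010011001101000", 2)]
b = [int("0001001011101001", 2), int("0001101000101110", 2), int("0000101011101011", 2)]
exact = sum(x * y for x, y in zip(a, b))
for m in (4, 6, 8):
    approx = asop1(a, b, m)
    print(m, approx, (exact - approx) / exact)
```

Because bits are only discarded, this model never overestimates the exact result, and the relative error grows with m.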
Proposed Approximate Sum-of-Products Model ASOP2
ASOP2 is similar to ASOP1 with the addition of an m-bit leading one predictor. This
increases the accuracy and makes the model more suitable for DSP applications, as discussed
later in this section. In our method, leading one prediction of a_k and b_k for k = 1, 2, and 3
requires an OR operation on the most significant m bits of a_k and b_k for k = 1, 2, and 3, followed
by a priority encoder. The function of the OR gates can be given as a_mor = a_1m | a_2m | a_3m and
b_mor = b_1m | b_2m | b_3m, where km denotes the first m bits of the kth element, for m = 4, 6, or 8.
After the leading one prediction, the ASOP1 structure is used for the computation of
elements starting from the leading one position. Fig. 2 shows the approximate lookup table
and the corresponding ASOP (ASOP2) structure for K = 3 and N = 16.
For example, consider the input elements a_1 = "0011001000101110," a_2 =
"0001011000101011," a_3 = "0010011001101000," b_1 = "0001001011101001," b_2 =
"0001101000101110," and b_3 = "0000101011101011." For m = 4, a_mor = 0011, so the leading
one predictor finds zeros in the first two bit positions, "15" and "14," of a_1, a_2, and
a_3. The 12-bit (16 − m) fields from bit position "13" to "2" of a_1, a_2, and a_3
("110010001011," "010110001010," and "100110011010") are taken and fed to the
inputs of the lookup tables. For m = 4, b_mor = 0001, so the leading one predictor finds zeros
in the first three bit positions, "15," "14," and "13," of b_1, b_2, and b_3. The 12-bit (16 − m)
fields from bit position "12" to "1" of b_1, b_2, and b_3 ("100101110100,"
"110100010111," and "010101110101") are taken and fed as control signals of the lookup
tables. The overall structure of ASOP2 is given in Fig. 2, where LZA refers to the leading
zeros in a_mor and LZB refers to the leading zeros in b_mor. ASOP2 reduces the negative
effects of truncation, especially when there is information only in the least significant parts of
the inputs. In DSP applications, pixel values are highly correlated, and the numbers of
leading zeros of a_k and b_k for k = 1, 2, 3 have a high chance of being the same. Using OR
gates to combine the elements and a single leading one predictor afterward reduces the
hardware resources required.
Fig. 3. Least significant part of the ASOP (ASOP3) structure.
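The window selection in the ASOP2 example can be reproduced with the short Python sketch below. It models only the OR gates, the leading-zero count, and the extraction of the 16 − m bit fields; the function names `leading_zeros` and `select_window` are hypothetical, and the priority encoder is modelled with Python's `int.bit_length`.

```python
def leading_zeros(value, width):
    """Number of zero bits before the first 1 in a width-bit field."""
    return width - value.bit_length()


def select_window(words, m, n_bits=16):
    """Return (leading zeros, list of (n_bits - m)-bit fields) for three inputs."""
    top = [w >> (n_bits - m) for w in words]   # most significant m bits of each word
    merged = top[0] | top[1] | top[2]          # OR gates (a_mor or b_mor)
    lz = leading_zeros(merged, m)              # priority encoder output (LZA or LZB)
    width = n_bits - m
    shift = m - lz                             # bits dropped below the window
    mask = (1 << width) - 1
    return lz, [(w >> shift) & mask for w in words]


a = [int("0011001000101110", 2), int("0001011000101011", 2), int("0010011001101000", 2)]
b = [int("0001001011101001", 2), int("0001101000101110", 2), int("0000101011101011", 2)]

lza, a_fields = select_window(a, 4)
lzb, b_fields = select_window(b, 4)
print(lza, [format(w, "012b") for w in a_fields])
print(lzb, [format(w, "012b") for w in b_fields])
```

For the inputs of the example, the sketch yields two leading zeros for a and three for b, and the extracted 12-bit fields match the strings quoted in the text.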
Proposed Approximate Sum-of-Products Model ASOP3
In ASOP1, the m = 8, 6, or 4 least significant bits are truncated: m bits
are truncated from the 18-bit outputs of the lookup table contents, and the m control
signals b_1n, b_2n, and b_3n of the lookup tables for n = 0, 1, ..., m − 1 are also truncated. In
ASOP3, approximation is employed instead of truncation. Fig. 3 shows the least significant
part of the ASOP (ASOP3) structure. The lookup table output contents are divided into 18 − m
bits and m bits. The inputs b are divided into a group of 16 − m bits and a group of m bits. ASOP1 is used
for the first group of 16 − m bits. For the group of the least significant m bits of b_k for k = 1, 2, 3, the control
signals are grouped in pairs, so the m lookup tables are reduced to m/2 tables. The additional
hardware required for ASOP3 is given in Fig. 4. For example, consider the input
elements a_1 = "0011001000101110," a_2 = "0001011000101011," a_3 =
"0010011001101000," b_1 = "0001001011101001," b_2 = "0001101000101110," and b_3 =
"0000101011101011." For m = 4, a_23, a_13, a_12, and a_123 are calculated; then, except for
the least significant m bits, the other bits are given to the ASOP1 structure, and the 12-bit (16 − m) fields
starting from the most significant bit of b_1, b_2, and b_3 are taken and fed as control signals of the lookup
tables. For the least significant part, the least significant m bits of a_23, a_13, a_12, and
a_123 are used as inputs to the lookup tables. The number of lookup tables is reduced by
half by ORing each pair of control signals. In this scenario, for the lookup table of n = 1 | 0,
the control signals would be 111.
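The pairing of control signals in the least significant part of ASOP3 can be checked with the following Python sketch. It covers only the ORing step described above, not the full ASOP3 datapath; the function name `paired_controls` is hypothetical, and each result tuple is ordered as (b_1, b_2, b_3).

```python
def paired_controls(b, m):
    """OR control bits n+1 and n of each b_k for the m least significant bits.

    Returns m/2 tuples, one per merged lookup table, starting with the pair n = 1 | 0.
    """
    pairs = []
    for n in range(0, m, 2):
        pairs.append(tuple(((x >> (n + 1)) | (x >> n)) & 1 for x in b))
    return pairs


b = [int("0001001011101001", 2), int("0001101000101110", 2), int("0000101011101011", 2)]
controls = paired_controls(b, 4)
print(controls[0])   # control signals for the lookup table of n = 1 | 0
```

With m = 4, the four least significant lookup tables collapse into two, and the first pair reproduces the control signals 111 stated in the example.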
Advantages:
• Higher processing accuracy
• High performance
• Requires low power
References:
[1] J. Han and M. Orshansky, "Approximate computing: An emerging paradigm for energy-efficient
design," in Proc. IEEE ETS, May 2013, pp. 1–6.
[2] S. A. White, "Applications of distributed arithmetic to digital signal processing: A tutorial review,"
IEEE ASSP Mag., vol. 6, no. 3, pp. 4–19, Jul. 1989.
[3] L. Yuan, S. Sana, H. J. Pottinger, and V. S. Rao, "Distributed arithmetic implementation of
multivariable controllers for smart structural systems," Smart Mater. Struct., vol. 9, no. 4, p. 402, Jan.
2000.
[4] W. Li, J. B. Burr, and A. M. Peterson, "A fully parallel VLSI implementation of distributed
arithmetic," in Proc. IEEE Int. Symp. Circuits Syst., vol. 2, Jun. 1988, pp. 1511–1515.
[5] R. Amirtharajah and A. P. Chandrakasan, "A micropower programmable DSP using approximate
signal processing based on distributed arithmetic," IEEE J. Solid-State Circuits, vol. 39, no. 2, pp. 337–
347, Feb. 2004.
[6] A. Momeni, J. Han, P. Montuschi, and F. Lombardi, "Design and analysis of approximate
compressors for multiplication," IEEE Trans. Comput., vol. 64, no. 4, pp. 984–994, Apr. 2015.
[7] S. Venkatachalam and S.-B. Ko, "Design of power and area efficient approximate multipliers," IEEE
Trans. Very Large Scale Integr. (VLSI) Syst., vol. 25, no. 5, pp. 1782–1786, May 2017.
[8] G. Zervakis, K. Tsoumanis, S. Xydis, D. Soudris, and K. Pekmestzi, "Design-efficient approximate
multiplication circuits through partial product perforation," IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 24, no. 10, pp. 3105–3117, Oct. 2016.
[9] J. Liang, J. Han, and F. Lombardi, "New metrics for the reliability of approximate and probabilistic
adders," IEEE Trans. Comput., vol. 62, no. 9, pp. 1760–1771, Sep. 2013.
[10] J. Babaud, A. P. Witkin, M. Baudin, and R. O. Duda, "Uniqueness of the Gaussian kernel for scale-
space filtering," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-8, no. 1, pp. 26–33, Jan. 1986.

More Related Content

Similar to Approximate sum of-products designs based on distributed arithmetic

A high accuracy programmable pulse generator with a 10-ps timing resolution
A high accuracy programmable pulse generator with a 10-ps timing resolutionA high accuracy programmable pulse generator with a 10-ps timing resolution
A high accuracy programmable pulse generator with a 10-ps timing resolution
Nxfee Innovation
ย 
Vector processing aware advanced clock-gating techniques for low-power fused ...
Vector processing aware advanced clock-gating techniques for low-power fused ...Vector processing aware advanced clock-gating techniques for low-power fused ...
Vector processing aware advanced clock-gating techniques for low-power fused ...
Nxfee Innovation
ย 
A reconfigurable ldpc decoder optimized applications
A reconfigurable ldpc decoder optimized applicationsA reconfigurable ldpc decoder optimized applications
A reconfigurable ldpc decoder optimized applications
Nxfee Innovation
ย 

Similar to Approximate sum of-products designs based on distributed arithmetic (20)

Feedback based low-power soft-error-tolerant design for dual-modular redundancy
Feedback based low-power soft-error-tolerant design for dual-modular redundancyFeedback based low-power soft-error-tolerant design for dual-modular redundancy
Feedback based low-power soft-error-tolerant design for dual-modular redundancy
ย 
A high accuracy programmable pulse generator with a 10-ps timing resolution
A high accuracy programmable pulse generator with a 10-ps timing resolutionA high accuracy programmable pulse generator with a 10-ps timing resolution
A high accuracy programmable pulse generator with a 10-ps timing resolution
ย 
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
ย 
The implementation of the improved omp for aic reconstruction based on parall...
The implementation of the improved omp for aic reconstruction based on parall...The implementation of the improved omp for aic reconstruction based on parall...
The implementation of the improved omp for aic reconstruction based on parall...
ย 
Vector processing aware advanced clock-gating techniques for low-power fused ...
Vector processing aware advanced clock-gating techniques for low-power fused ...Vector processing aware advanced clock-gating techniques for low-power fused ...
Vector processing aware advanced clock-gating techniques for low-power fused ...
ย 
Efficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft coresEfficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft cores
ย 
IRJET- Design of 16 Bit Low Power Vedic Architecture using CSA & UTS
IRJET-  	  Design of 16 Bit Low Power Vedic Architecture using CSA & UTSIRJET-  	  Design of 16 Bit Low Power Vedic Architecture using CSA & UTS
IRJET- Design of 16 Bit Low Power Vedic Architecture using CSA & UTS
ย 
Low complexity methodology for complex square-root computation
Low complexity methodology for complex square-root computationLow complexity methodology for complex square-root computation
Low complexity methodology for complex square-root computation
ย 
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching schemeA 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
ย 
IRJET- Distribution Selection for Pump Manufacturing Companies
IRJET- Distribution Selection for Pump Manufacturing CompaniesIRJET- Distribution Selection for Pump Manufacturing Companies
IRJET- Distribution Selection for Pump Manufacturing Companies
ย 
Al04605265270
Al04605265270Al04605265270
Al04605265270
ย 
IRJET- Efficient Design of Radix Booth Multiplier
IRJET- Efficient Design of Radix Booth MultiplierIRJET- Efficient Design of Radix Booth Multiplier
IRJET- Efficient Design of Radix Booth Multiplier
ย 
DESIGN OF LOW POWER MULTIPLIER
DESIGN OF LOW POWER MULTIPLIERDESIGN OF LOW POWER MULTIPLIER
DESIGN OF LOW POWER MULTIPLIER
ย 
A reconfigurable ldpc decoder optimized applications
A reconfigurable ldpc decoder optimized applicationsA reconfigurable ldpc decoder optimized applications
A reconfigurable ldpc decoder optimized applications
ย 
Parallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationParallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix Multiplication
ย 
Optimization and implementation of parallel squarer
Optimization and implementation of parallel squarerOptimization and implementation of parallel squarer
Optimization and implementation of parallel squarer
ย 
IRJET- Image and Signal Filtering using Fir Filter Made using Approximate Hyb...
IRJET- Image and Signal Filtering using Fir Filter Made using Approximate Hyb...IRJET- Image and Signal Filtering using Fir Filter Made using Approximate Hyb...
IRJET- Image and Signal Filtering using Fir Filter Made using Approximate Hyb...
ย 
PARTIAL PRODUCT ARRAY HEIGHT REDUCTION USING RADIX-16 FOR 64-BIT BOOTH MULTI...
PARTIAL PRODUCT ARRAY HEIGHT REDUCTION USING RADIX-16 FOR 64-BIT BOOTH MULTI...PARTIAL PRODUCT ARRAY HEIGHT REDUCTION USING RADIX-16 FOR 64-BIT BOOTH MULTI...
PARTIAL PRODUCT ARRAY HEIGHT REDUCTION USING RADIX-16 FOR 64-BIT BOOTH MULTI...
ย 
Improvement of Process and Product Layout for Metro Coach using Craft Method...
Improvement of Process and  Product Layout for Metro Coach using Craft Method...Improvement of Process and  Product Layout for Metro Coach using Craft Method...
Improvement of Process and Product Layout for Metro Coach using Craft Method...
ย 
Improvement of Process and Product Layout for Metro Coach using Craft Method...
Improvement of Process and  Product Layout for Metro Coach using Craft Method...Improvement of Process and  Product Layout for Metro Coach using Craft Method...
Improvement of Process and Product Layout for Metro Coach using Craft Method...
ย 

More from Nxfee Innovation

An energy efficient programmable many core accelerator for personalized biome...
An energy efficient programmable many core accelerator for personalized biome...An energy efficient programmable many core accelerator for personalized biome...
An energy efficient programmable many core accelerator for personalized biome...
Nxfee Innovation
ย 

More from Nxfee Innovation (10)

VLSI IEEE Transaction 2018 - IEEE Transaction
VLSI IEEE Transaction 2018 - IEEE Transaction VLSI IEEE Transaction 2018 - IEEE Transaction
VLSI IEEE Transaction 2018 - IEEE Transaction
ย 
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
ย 
Securing the present block cipher against combined side channel analysis and ...
Securing the present block cipher against combined side channel analysis and ...Securing the present block cipher against combined side channel analysis and ...
Securing the present block cipher against combined side channel analysis and ...
ย 
Combating data leakage trojans in commercial and asic applications with time ...
Combating data leakage trojans in commercial and asic applications with time ...Combating data leakage trojans in commercial and asic applications with time ...
Combating data leakage trojans in commercial and asic applications with time ...
ย 
Analysis and design of cost effective, high-throughput ldpc decoders
Analysis and design of cost effective, high-throughput ldpc decodersAnalysis and design of cost effective, high-throughput ldpc decoders
Analysis and design of cost effective, high-throughput ldpc decoders
ย 
An energy efficient programmable many core accelerator for personalized biome...
An energy efficient programmable many core accelerator for personalized biome...An energy efficient programmable many core accelerator for personalized biome...
An energy efficient programmable many core accelerator for personalized biome...
ย 
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
ย 
A closed form expression for minimum operating voltage of cmos d flip-flop
A closed form expression for minimum operating voltage of cmos d flip-flopA closed form expression for minimum operating voltage of cmos d flip-flop
A closed form expression for minimum operating voltage of cmos d flip-flop
ย 
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
ย 
Nxfee Innovation Brochure
Nxfee Innovation BrochureNxfee Innovation Brochure
Nxfee Innovation Brochure
ย 

Recently uploaded

notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
MsecMca
ย 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
dollysharma2066
ย 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Christo Ananth
ย 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
ย 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
KreezheaRecto
ย 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
SUHANI PANDEY
ย 

Recently uploaded (20)

KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
ย 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
ย 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
ย 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
ย 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
ย 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
ย 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
ย 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
ย 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
ย 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
ย 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
ย 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
ย 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
ย 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
ย 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
ย 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
ย 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
ย 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ย 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
ย 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
ย 

Approximate sum of-products designs based on distributed arithmetic

  • 1. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherryโ€“ 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ Approximate Sum-of-Products Designs Based on Distributed Arithmetic Abstract: Approximate circuits provide high performance and require low power. Sum-of-products (SOP) units are key elements in many digital signal processing applications. In this brief, three approximate SOP (ASOP) models which are based on the distributed arithmetic are proposed. They are designed for different levels of accuracy. First model of ASOP achieves an improvement up to 64% on area and 70% on power, when compared with conventional unit. Other two models provide an improvement of 32% and 48% on area and 54% and 58% on power, respectively, with a reduced error rate compared with the first model. Third model achieves the mean relative error and normalized error distance as low as 0.05% and 0.009%, respectively. Performance of approximate units is evaluated with a noisy image smoothing application, where the proposed models are capable of achieving higher peak signal to-noise ratio than the existing state-of-the-art techniques. It is shown that the proposed approximate models achieve higher processing accuracy than existing works but with significant improvements in power and performance. Software Implementation: ๏‚ท Modelsim ๏‚ท Xilinx 14.2 Existing System: Approximate computing provides an efficient solution for the design of power efficient digital systems. For applications, such as multimedia and data processing, approximate circuits play an important role as a promising alternative for reducing area and power in digital systems that can tolerate some loss of precision. 
As one of the key components in arithmetic circuits, sum-of-products (SOP) units have received less attention in terms of approximate implementation.
  • 2. Distributed arithmetic is a very efficient means of calculating the inner product of two vectors. It implements multiplication through a series of table lookups and shift-and-accumulate operations. Because the level of parallelism in the distributed arithmetic structure is flexible, the area-speed tradeoff can be adjusted. Distributed arithmetic is a bit-serial operation that computes the inner product of two vectors in parallel. It requires no multiplication and has an efficient mechanism to perform the SOP operation. Bit-parallel versions of distributed arithmetic have also been proposed. In this brief, three models of SOP units based on parallel distributed arithmetic are proposed. Their scheme simply involves truncation in the number of lookup tables, by eliminating the least significant part of the distributed arithmetic operation.
Multipliers have been extensively studied for approximate implementation. Two models of approximate compressors with reduced erroneous outputs have been used to accumulate the partial products of the Dadda tree multiplier. The probability-based multiplier alters the partial products and reduces the generated partial product tree based on their probability. The partial product perforation (PPP) multiplier omits k partial products starting from the jth position, which in turn reduces the number of adders used in the accumulation of partial products. In this brief, novel ASOP designs are proposed using the efficient distributed arithmetic structure.
Approximation involves changes to the word length, the number of lookup tables, and the number of elements in the final accumulator. Three models are proposed. The first model provides significant power reduction with lower mean relative error (MRE) and normalized error distance (NED). The second and third models provide better accuracy at the cost of increased area and power compared with the first model. In the proposed approximate structures, reductions in the number of lookup tables, the length of the adders, and the accumulator size are employed for approximation. Compared to the exact SOP unit, the proposed models have reduced circuit complexity.
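The table-lookup and shift-and-accumulate scheme described above can be illustrated in software. The following is a minimal Python sketch, assuming unsigned operands and K = 3, N = 16; the names `build_lut` and `da_sop` are hypothetical, and the loop over bit positions only models what the parallel hardware does with one lookup table per bit.

```python
K, N = 3, 16  # three products and 16-bit operands (values used in this brief)


def build_lut(a):
    """Eight-case lookup table: entry c is the sum of the a_k selected by c.

    c is the 3-bit control word (b_1n, b_2n, b_3n), with b_1n as its MSB.
    Unsigned operands are an assumption of this sketch.
    """
    a1, a2, a3 = a
    a12, a13, a23 = a1 + a2, a1 + a3, a2 + a3  # three two-input adders
    a123 = a1 + a2 + a3                        # one three-input adder
    return [0, a3, a2, a23, a1, a13, a12, a123]


def da_sop(a, b):
    """Compute a1*b1 + a2*b2 + a3*b3 without any multiplication."""
    lut = build_lut(a)
    acc = 0
    for n in range(N):                # one lookup per bit position of b
        ctrl = 0
        for bk in b:                  # gather bit n of b1, b2, b3
            ctrl = (ctrl << 1) | ((bk >> n) & 1)
        acc += lut[ctrl] << n         # shift-and-accumulate
    return acc
```

Each lookup result is weighted by 2^n, so the accumulated value equals the ordinary sum of products for any unsigned 16-bit inputs.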
  • 3. NED is an effective metric to quantify the approximation irrespective of the size of the circuit. The traditional MRE metric is also used to evaluate the impact of approximation. Error distance is the difference between the exact value and the approximate value, whereas relative error is the error distance divided by the exact value. NED is calculated by normalizing the error distance by the maximum possible exact output. MRE is calculated as the mean of the relative errors over all possible values.
Disadvantages:
• Low processing accuracy
• Poor performance
• Requires high power
Proposed System:
Proposed approximate sum-of-products
In this brief, K is 3 and N is 16. A conventional implementation of the SOP unit based on parallel distributed arithmetic [4] requires three two-input 16-bit adders, one three-input 16-bit adder, 16 lookup tables with eight cases, and a final accumulator with 16 elements. In our approximation models, the hardware requirements are considerably reduced. Three ASOP models are proposed: ASOP1, ASOP2, and ASOP3.
Proposed Approximate Sum-of-Products Model ASOP1
In approximate model 1, K is 3 and N is reduced. The m least significant bits of a_k and b_k for k = 1, 2, and 3 are truncated. Versions with m = 8, 6, and 4 bits are implemented. This implementation requires three two-input (16 − m)-bit adders, one three-input (16 − m)-bit adder, 16 − m lookup tables with eight cases, and a final accumulator with 16 − m elements. This considerably reduces the hardware utilization at all levels. The approximate model with reduced elements is shown in Fig. 1.
  • 4. By implementing the distributed arithmetic summation with limits m to N − 1, the number of lookup tables reduces to 16 − m, and 16 − m elements are sent to the final accumulator ((16 − m) × 18). It should be noted that in ASOP1 the number of input bits to the adders is also reduced, which further reduces the complexity of the accumulator ((16 − m) × (18 − m)) compared to [5].
Fig. 1. Approximate lookup table and corresponding ASOP (ASOP1) structure for K = 3 and N = 16.
  • 5. Proposed Approximate Sum-of-Products Model ASOP2
ASOP2 is similar to ASOP1 with the addition of an m-bit leading one predictor. This increases the accuracy and makes the model more suitable for DSP applications, as discussed later in this section. In our method, leading one prediction of a_k and b_k for k = 1, 2, and 3 requires an OR operation over the most significant m bits of a_k and of b_k, followed by a priority encoder. The function of the OR gates can be given as a_mor = a_1m | a_2m | a_3m and b_mor = b_1m | b_2m | b_3m, where the subscript km denotes the first m bits of the kth element, for m = 4, 6, or 8. After the leading one prediction, the ASOP1 structure is used for the computation of elements starting from the leading one position.
Fig. 2. Approximate lookup table and corresponding ASOP (ASOP2) structure for K = 3 and N = 16.
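The ASOP1 truncation and the error metrics defined earlier (error distance, relative error, NED, and MRE) can be combined into a small behavioural model. The Python sketch below is illustrative only and rests on several assumptions that are not stated in this brief: operands are unsigned, the truncated result is rescaled by 2m bits so that it is comparable with the exact output, and the metrics are estimated over a random sample rather than over all possible inputs. All function names are hypothetical.

```python
import random

N, K = 16, 3
MAX_EXACT = K * (2**N - 1) ** 2  # largest exact output for unsigned operands


def exact_sop(a, b):
    """Reference sum of products."""
    return sum(x * y for x, y in zip(a, b))


def asop1(a, b, m):
    """Behavioural model of ASOP1: drop the m LSBs of every a_k and b_k.

    Rescaling by 2*m bits is a modelling assumption of this sketch.
    """
    return sum((x >> m) * (y >> m) for x, y in zip(a, b)) << (2 * m)


def error_metrics(approx_fn, samples):
    """Return (MRE, NED) estimated over the given (a, b) samples."""
    distances, relatives = [], []
    for a, b in samples:
        exact = exact_sop(a, b)
        ed = abs(exact - approx_fn(a, b))      # error distance
        distances.append(ed)
        if exact:                              # relative error undefined at 0
            relatives.append(ed / exact)
    ned = (sum(distances) / len(distances)) / MAX_EXACT
    mre = sum(relatives) / len(relatives) if relatives else 0.0
    return mre, ned


def random_samples(count, seed=0):
    """Random 16-bit operand vectors; a sample, not the full input space."""
    rng = random.Random(seed)

    def draw():
        return tuple(rng.randrange(2**N) for _ in range(K))

    return [(draw(), draw()) for _ in range(count)]


if __name__ == "__main__":
    samples = random_samples(1000)
    for m in (4, 6, 8):
        mre, ned = error_metrics(lambda a, b: asop1(a, b, m), samples)
        print(f"m={m}: MRE={mre:.6f} NED={ned:.6f}")
```

Because truncation can only lower each operand, the modelled output never exceeds the exact value, and both metrics grow as m increases. The figures printed by this sketch depend on the sample and on the assumptions above, so they are not expected to match the values reported in the abstract.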
  • 6. For example, consider the input elements a_1 = "0011001000101110," a_2 = "0001011000101011," a_3 = "0010011001101000," b_1 = "0001001011101001," b_2 = "0001101000101110," and b_3 = "0000101011101011." For m = 4, a_mor = 0011, so the leading one predictor finds zeros in the first two bit positions, 15 and 14, of a_1, a_2, and a_3. The 12-bit (16 − m) fields from bit position 13 down to 2 of a_1, a_2, and a_3 ("110010001011," "010110001010," and "100110011010") are taken and fed to the inputs of the lookup tables. For m = 4, b_mor = 0001, so the leading one predictor finds zeros in the first three bit positions, 15, 14, and 13, of b_1, b_2, and b_3. The 12-bit (16 − m) fields from bit position 12 down to 1 of b_1, b_2, and b_3 ("100101110100," "110100010111," and "010101110101") are taken and fed as control signals of the lookup tables.
Fig. 3. Least significant part of the ASOP (ASOP3) structure.
  • 7. The overall structure of ASOP2 is given in Fig. 2, where LZA refers to the leading zeros in a_mor and LZB refers to the leading zeros in b_mor. ASOP2 reduces the negative effects of truncation, especially when there is information only in the least significant parts of the inputs. In DSP applications, pixel values are highly correlated, and the number of leading zeros of a_k and b_k for k = 1, 2, 3 is likely to be the same. Combining the elements with OR gates and applying a single leading one predictor afterward reduces the hardware resources required.
Proposed Approximate Sum-of-Products Model ASOP3
In ASOP1, the least significant m = 8, 6, or 4 bits are truncated: m bits are truncated from the 18-bit outputs of the lookup table contents, and the m control signals b_1n, b_2n, and b_3n of the lookup tables for n = 0, 1, ..., m − 1 are also truncated. In ASOP3, approximation is employed instead of truncation. Fig. 3 shows the least significant part of the ASOP (ASOP3) structure. The lookup table output contents are divided into 18 − m bits and m bits. The inputs b are divided into a group of 16 − m bits and a group of m bits. ASOP1 is used for the first group of 16 − m bits. For the least significant m-bit group of b_k for k = 1, 2, 3, the control signals are grouped in pairs, so the m lookup tables are reduced to m/2 tables. The additional hardware required for ASOP3 is given in Fig. 3.
For example, consider the input elements a_1 = "0011001000101110," a_2 = "0001011000101011," a_3 = "0010011001101000," b_1 = "0001001011101001," b_2 = "0001101000101110," and b_3 = "0000101011101011." For m = 4, the sums a_23, a_13, a_12, and a_123 are calculated. All of their bits except the least significant m bits are given to the ASOP1 structure, and the 12-bit (16 − m) fields starting from the most significant bit of b_1, b_2, and b_3 are taken and fed as control signals of the lookup tables. For the least significant part of the calculation, the least significant m bits of a_23, a_13, a_12, and a_123 are used as inputs to the lookup tables. The number of lookup tables is halved by ORing each pair of control signals. In this scenario, for the lookup table of n = 1 | 0, the control signals would be 111.
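The two worked examples above (the ASOP2 leading one prediction and the ASOP3 pairing of control signals) can be checked with a short script. This is a Python sketch under stated assumptions: the helper names are hypothetical, operands are treated as unsigned 16-bit values, and the control signals are paired at adjacent bit positions (n and n + 1), which is how the "n = 1 | 0" case is read here.

```python
N = 16  # operand width used in this brief


def asop2_window(operands, m):
    """OR the top m bits of all operands, count the leading zeros of the
    result, and take a shared (N - m)-bit field below the skipped zeros."""
    top_or = 0
    for x in operands:
        top_or |= x >> (N - m)            # a_mor or b_mor
    lz = m - top_or.bit_length()          # priority-encoder result (LZA/LZB)
    shift = m - lz                        # low bits dropped below the field
    mask = (1 << (N - m)) - 1
    fields = [format((x >> shift) & mask, f"0{N - m}b") for x in operands]
    return format(top_or, f"0{m}b"), lz, fields


def asop3_paired_controls(b, m):
    """OR adjacent control bits of the m LSB positions of b_1, b_2, b_3,
    so that m lookup tables collapse into m // 2."""
    controls = []
    for n in range(0, m, 2):
        bits = [((x >> n) | (x >> (n + 1))) & 1 for x in b]
        controls.append("".join(str(bit) for bit in bits))
    return controls


# Operand values taken from the worked examples in the text.
a = (0b0011001000101110, 0b0001011000101011, 0b0010011001101000)
b = (0b0001001011101001, 0b0001101000101110, 0b0000101011101011)

if __name__ == "__main__":
    print(asop2_window(a, 4))          # a_mor, LZA, 12-bit fields of a_k
    print(asop2_window(b, 4))          # b_mor, LZB, 12-bit fields of b_k
    print(asop3_paired_controls(b, 4)) # first entry is the n = 1 | 0 table
```

For these inputs the script yields a_mor = 0011 with two leading zeros and b_mor = 0001 with three, and the extracted 12-bit fields match the strings quoted in the ASOP2 example. The first paired control word is 111, as stated in the ASOP3 example.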
  • 8. Advantages:
• Higher processing accuracy
• High performance
• Requires low power
References:
[1] J. Han and M. Orshansky, "Approximate computing: An emerging paradigm for energy-efficient design," in Proc. IEEE ETS, May 2013, pp. 1–6.
[2] S. A. White, "Applications of distributed arithmetic to digital signal processing: A tutorial review," IEEE ASSP Mag., vol. 6, no. 3, pp. 4–19, Jul. 1989.
[3] L. Yuan, S. Sana, H. J. Pottinger, and V. S. Rao, "Distributed arithmetic implementation of multivariable controllers for smart structural systems," Smart Mater. Struct., vol. 9, no. 4, p. 402, Jan. 2000.
[4] W. Li, J. B. Burr, and A. M. Peterson, "A fully parallel VLSI implementation of distributed arithmetic," in Proc. IEEE Int. Symp. Circuits Syst., vol. 2, Jun. 1988, pp. 1511–1515.
[5] R. Amirtharajah and A. P. Chandrakasan, "A micropower programmable DSP using approximate signal processing based on distributed arithmetic," IEEE J. Solid-State Circuits, vol. 39, no. 2, pp. 337–347, Feb. 2010.
[6] A. Momeni, J. Han, P. Montuschi, and F. Lombardi, "Design and analysis of approximate compressors for multiplication," IEEE Trans. Comput., vol. 64, no. 4, pp. 984–994, Apr. 2015.
[7] S. Venkatachalam and S.-B. Ko, "Design of power and area efficient approximate multipliers," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 25, no. 5, pp. 1782–1786, May 2017.
[8] G. Zervakis, K. Tsoumanis, S. Xydis, D. Soudris, and K. Pekmestzi, "Design-efficient approximate multiplication circuits through partial product perforation," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 24, no. 10, pp. 3105–3117, Oct. 2016.
  • 9. [9] J. Liang, J. Han, and F. Lombardi, "New metrics for the reliability of approximate and probabilistic adders," IEEE Trans. Comput., vol. 63, no. 9, pp. 1760–1771, Sep. 2013.
[10] J. Babaud, A. P. Witkin, M. Baudin, and R. O. Duda, "Uniqueness of the Gaussian kernel for scale-space filtering," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-8, no. 1, pp. 26–33, Jan. 1986.