ACEP: An Accuracy-Configurable Carry Estimating Parallel Adder
Winner of the best poster award at the 26th IEEE International Conference on High Performance Computing, Data, and Analytics 2019, held in Hyderabad, India (Dec 17-20, 2019).
Rajat Bhattacharjya, Vishesh Mishra, Saurabh Singh and Kaustav Goswami
Indian Institute of Information Technology Guwahati
MOTIVATION AND CONTRIBUTION
The motivation behind designing a new approximate binary adder is as follows:
Accuracy-Configuration: ACEP can be configured for different accuracy requirements by changing its block size.
Faster Computations: In ACEP, carry estimation and sum generation are performed in parallel in a non-blocking manner, which makes it much faster than conventional adders.
Application end-results: We choose three popular end-applications that commonly benefit from approximate methods. ACEP produces results that are 12.3% better than other state-of-the-art solutions. We also measure the speedup of the end applications, which shows ACEP to be 2.83x faster than the baseline case.
RELATED WORKS
Approximate adders can be broadly classified as follows:
Non-Configurable Adders: The approximate binary adders [5], [6], [7], [8] split the input operands into two segments. In these adders, the LSBs are computed approximately and the MSBs are computed accurately, so results are produced in less time than with a conventional ripple-carry adder.
Accuracy-Configurable Adders: A few accuracy-configurable adders have been proposed in the past:
RAP-CLA [9] splits the carry-lookahead circuit of a CLA into two parts and switches between them for approximate or accurate mode.
SARA [10], on the other hand, uses its carry prediction circuit to predict the carry of each block.
BCSA [11] uses a carry predict unit to predict the carry and a carry select unit to select whether to propagate the predicted carry or the actual carry.
ALGORITHM
blockDivider(A[], B[], Cin, k):
    Initialize: sum[k]
    Cout = carryEstimator(A[k-1], B[k-1], A[k-2], B[k-2])
    for i = 0 to k-1 in parallel do
        sum[i] = A[i] ^ B[i] ^ Cin
        Cin = (A[i] & B[i]) | ((A[i] ^ B[i]) & Cin)
    end for
    return (sum, Cout)
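The block-level pseudocode can be exercised with a short Python sketch. It is only an illustration under stated assumptions: the way blocks are chained (each block's estimated carry-out becomes the next block's carry-in), the name `acep_add`, and the defaults `n=8`, `k=4` are not taken from the poster. Python runs the blocks sequentially; in hardware they could operate concurrently, because each estimated carry depends only on two operand bit pairs.

```python
def carry_estimator(a_hi, b_hi, a_lo, b_lo):
    """Estimate a block's carry-out from its two most significant bit pairs.

    a_hi, b_hi are the operand bits at position k-1; a_lo, b_lo at position k-2.
    """
    return (a_hi & b_hi) | (a_lo & b_lo & a_hi) | (a_lo & b_lo & b_hi)


def block_divider(a_bits, b_bits, cin):
    """Add one k-bit block (bits given LSB first); return (sum_bits, estimated_cout)."""
    k = len(a_bits)
    cout = carry_estimator(a_bits[k - 1], b_bits[k - 1], a_bits[k - 2], b_bits[k - 2])
    sum_bits = []
    for i in range(k):
        sum_bits.append(a_bits[i] ^ b_bits[i] ^ cin)
        cin = (a_bits[i] & b_bits[i]) | ((a_bits[i] ^ b_bits[i]) & cin)
    return sum_bits, cout


def acep_add(a, b, n=8, k=4, cin=0):
    """Approximate n-bit addition with k-bit blocks (chaining is an assumption)."""
    if k < 2 or n % k != 0:
        raise ValueError("k must be at least 2 and must divide n")
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    result, carry = 0, cin
    for lo in range(0, n, k):
        # Assumption: the previous block's estimated carry-out is this block's carry-in.
        sum_bits, carry = block_divider(a_bits[lo:lo + k], b_bits[lo:lo + k], carry)
        for i, bit in enumerate(sum_bits):
            result |= bit << (lo + i)
    return result | (carry << n)  # the final carry-out is also an estimate
```

With these defaults, `acep_add(12, 4)` matches the exact sum of 16, while `acep_add(15, 1)` returns 0: the carry generated at bit 0 ripples through the whole block, and an estimator that looks only at the top two bit positions cannot see it.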
REFERENCES
[1] A. Raha and V. Raghunathan, "Towards full-system energy-accuracy tradeoffs: A case study of an approximate smart camera system," 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC), Austin, TX, 2017, pp. 1-6.
[2] H. Hoffmann, S. Misailovic, S. Sidiroglou, A. Agarwal, and M. Rinard, "Using code perforation to improve performance, reduce energy consumption, and respond to failures," 2009.
[3] E. Schkufza, R. Sharma, and A. Aiken, "Stochastic optimization of floating-point programs with tunable precision," Proc. 35th ACM SIGPLAN Conf. Programm. Lang. Design Implement., pp. 53-64, 2014.
[4] A. Sampson, J. Nelson, K. Strauss, and L. Ceze, "Approximate storage in solid-state memories," Proc. Int. Symp. Microarchit., pp. 25-36, 2013.
[5] N. Zhu, W. L. Goh, W. Zhang, K. S. Yeo, and Z. H. Kong, "Design of low-power high-speed truncation-error-tolerant adder and its application in digital signal processing," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, no. 8, pp. 1225-1229, 2010.
[6] N. Zhu, W. L. Goh, and K. S. Yeo, "An enhanced low-power high-speed adder for error-tolerant application," in Proceedings of the 2009 12th International Symposium on Integrated Circuits, 2009, pp. 69-72.
[7] H. R. Mahdiani, A. Ahmadi, S. M. Fakhraie, and C. Lucas, "Bio-Inspired Imprecise Computational Blocks for Efficient VLSI Implementation of Soft-Computing Applications," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 57, no. 4, pp. 850-862, 2010.
[8] V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy, "Low-power digital signal processing using approximate adders," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 32, no. 1, pp. 124-137, 2013.
[9] O. Akbari, M. Kamal, A. Afzali-Kusha, and M. Pedram, "RAP-CLA: A Reconfigurable Approximate Carry LookAhead Adder," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 65, no. 8, pp. 1089-1093, 2018.
[10] W. Xu, S. S. Sapatnekar, and J. Hu, "A simple yet efficient accuracy-configurable adder design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 26, no. 6, pp. 1112-1125, 2018.
[11] F. Ebrahimi-Azandaryani, O. Akbari, M. Kamal, A. Afzali-Kusha, and M. Pedram, "Block-based carry speculative approximate adder for energy-efficient applications," IEEE Transactions on Circuits and Systems II: Express Briefs, pp. 1-1, 2019.
INTRODUCTION
Approximate computing has garnered significant attention in recent years due to the massive data deluge and the need for fault-tolerant real-time computation.
Several techniques have been proposed which show that an approximate result with provable error bounds is often preferable to computing the correct result from scratch, which can take significantly longer.
APPROXIMATE ARITHMETIC CIRCUITS:
Approximation can be applied at various levels, such as:
Full-system approximation [1]
Software-level approximation [2]
Code-generation level [3]
Circuit-system level, such as primary memory and secondary storage devices [4]
Arithmetic circuits such as adders and multipliers form the fundamental building blocks of all such approximate systems.
ANALYSIS
Accuracy: We compare our design with other adders using standard error metrics as well as Gaussian smoothing, and find that our results show a significant accuracy improvement of over 80.7%.
Hardware Evaluation: ACEP shows significant power and area savings compared to other adders, in addition to an improvement in delay. This is mainly due to the carry-prediction logic it employs.
Fig. III: Normalized Execution Time Obtained in SPEC CPU2006 Benchmarks
RESULTS
ACEP Probability Function:
The probability that the result generated through ACEP has the least error:
Carry estimator logic:
$$C_{ki} = A_{ki-1} \cdot B_{ki-1} + A_{ki-2} \cdot B_{ki-2} \cdot A_{ki-1} + A_{ki-2} \cdot B_{ki-2} \cdot B_{ki-1}$$
where $i \in [1 : n/k]$.
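To get a feel for how often the carry estimator logic agrees with the true carry-out of a block, the sketch below enumerates every pair of k-bit operands. It is an empirical illustration only, not the poster's probability function; the helper names and the assumption of uniformly distributed operands are hypothetical.

```python
from itertools import product


def estimate_carry(a, b, k):
    """Carry estimator applied to the two most significant bits of k-bit blocks a, b."""
    a_hi, b_hi = (a >> (k - 1)) & 1, (b >> (k - 1)) & 1
    a_lo, b_lo = (a >> (k - 2)) & 1, (b >> (k - 2)) & 1
    return (a_hi & b_hi) | (a_lo & b_lo & a_hi) | (a_lo & b_lo & b_hi)


def exact_carry(a, b, k, cin=0):
    """True carry-out of adding two k-bit blocks with carry-in cin."""
    return ((a + b + cin) >> k) & 1


def estimator_hit_rate(k, cin=0):
    """Fraction of all k-bit operand pairs for which the estimate is correct."""
    pairs = product(range(1 << k), repeat=2)
    hits = sum(estimate_carry(a, b, k) == exact_carry(a, b, k, cin) for a, b in pairs)
    return hits / (1 << (2 * k))
```

For `k = 2` and a zero carry-in the estimator sees the whole block, so `estimator_hit_rate(2)` is 1.0. For larger blocks the estimate is wrong only when a carry from the lower bits reaches bit k-2 and changes the block's carry-out.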
CONCLUSION
In this paper, we present an accuracy-configurable adder (ACEP). Its major aspects are:
It estimates carry-outs in parallel, based on the two preceding significant bits.
It has significantly lower power consumption, delay, and area overhead than other state-of-the-art approximate adders.
Compared to the conventional ripple-carry adder, ACEP is 91.117% faster in the best-case scenario.
In the near future, we wish to investigate our approximate addition technique for signed and floating-point operations.
Fig. II: Gaussian Smoothing
Fig. I: Kth Block of ACEP
ACEP combines a divide-and-conquer approach with carry estimation. The blockDivider algorithm describes the block division process.
(a) Original image, (b) Original image with noise, (c) RAP-CLA, PSNR = 29.366 dB, SSIM = 0.7814, (d) SARA, PSNR = 26.79 dB, SSIM = 0.787, (e) BCSA, PSNR = 33.9 dB, SSIM = 0.9142, (f) BCSA with ERU, PSNR = 37.837 dB, SSIM = 0.9482, (g) ACEP, PSNR = 32.032 dB, SSIM = 0.9007
Fig. IV: Area Comparison
Fig. V: Delay Comparison
Fig. VI: Power Comparison
Fig. VII: Error Rate (%)
Fig. VIII: Mean Error Distance (log)
Fig. IX: Mean Relative Error Distance
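The error metrics named in these captions can be computed by exhaustive enumeration, as in the sketch below. It uses the commonly cited definitions of error rate (ER), mean error distance (MED), and mean relative error distance (MRED); the poster does not spell out its exact formulas, so these definitions, the choice to skip pairs whose exact sum is zero in MRED, and the stand-in adder `lower_or_adder` are all assumptions.

```python
from itertools import product


def error_metrics(approx_add, n):
    """Return (ER, MED, MRED) of approx_add over all pairs of n-bit operands."""
    pairs = list(product(range(1 << n), repeat=2))
    wrong, abs_err, rel_err, rel_count = 0, 0, 0.0, 0
    for a, b in pairs:
        exact = a + b
        dist = abs(approx_add(a, b) - exact)  # error distance for this input pair
        wrong += dist != 0
        abs_err += dist
        if exact != 0:  # assumption: relative error is undefined for a zero sum
            rel_err += dist / exact
            rel_count += 1
    return wrong / len(pairs), abs_err / len(pairs), rel_err / rel_count


def lower_or_adder(a, b):
    """Hypothetical stand-in adder: exact upper bits, bitwise OR on the two LSBs."""
    return (((a >> 2) + (b >> 2)) << 2) | ((a | b) & 0b11)


er, med, mred = error_metrics(lower_or_adder, 4)
```

Any approximate adder with the signature `f(a, b) -> int` can be passed in place of the stand-in.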