Abstract: Adders are known to have the frequently used in VLSI designs. In digital design we have half adder and full adder, main adders by using these adders we can implement ripple carry adders. RCA use to perform any number of addition. In this RCA is serial adder and it has commutation delay problem. If increase the ha&fa simultaneously delay also increase. That’s why we go for parallel adders(parallel prefix adders). IN the parallel prefix adder are ks adder(kogge-stone),sks adder(sparse kogge-stone),spaning tree and brentkung adder. These adders are designd and implemented on FPGA sparton3E kit. Simulated and synthesis by model sim6.4b, Xilinx ise10.1.
An Improved Optimization Techniques for Parallel Prefix Adder using FPGAIJMER
International Journal of Modern Engineering Research (IJMER) is Peer reviewed, online Journal. It serves as an international archival forum of scholarly research related to engineering and science education.
Design and Estimation of delay, power and area for Parallel prefix addersIJERA Editor
In Very Large Scale Integration (VLSI) designs, Parallel prefix adders (PPA) have the better delay performance. This paper investigates four types of PPA’s (Kogge Stone Adder (KSA), Spanning Tree Adder (STA), Brent Kung Adder (BKA) and Sparse Kogge Stone Adder (SKA)). Additionally Ripple Carry Adder (RCA), Carry Look ahead Adder (CLA) and Carry Skip Adder (CSA) are also investigated. These adders are implemented in verilog Hardware Description Language (HDL) using Xilinx Integrated Software Environment (ISE) 13.2 Design Suite. These designs are implemented in Xilinx Spartan 6 Field Programmable Gate Arrays (FPGA). Delay and area are measured using XPower analyzer and all these adder’s delay, power and area are investigated and compared finally
Implementation and Estimation of Delay, Power and Area for Parallel Prefix Ad...IJMTST Journal
Parallel Prefix Adders have been established as the most efficient circuits for binary addition. The binary adder is the critical element in most digital circuit designs including digital signal processors and microprocessor data path units. The final carry is generated ahead to the generation of the sum which leads extensive research focused on reduction in circuit complexity and power consumption of the adder. In VLSI implementation, parallel-prefix adders are known to have the best performance. This paper investigates four types of carry-tree adders (the Kogge-Stone, sparse Kogge-Stone, spanning tree, Brent kung Adder) and compare them to the simple Ripple Carry Adder and Carry Skip Adder. These designs of varied bit-widths are simulated using implemented on a Xilinx version Spartan 3E FPGA. These fast carry-chain carry-tree adders support the bit width up to 256. We report on the area requirements and reduction in circuit complexity for a variety of classical parallel prefix adder structures.
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORMVLSICS Design
This paper is devoted to the design of dual core crypto processor for executing both Prime field and binary field instructions. The proposed design is specifically optimized for Field programmable gate array (FPGA) platform. Combination of two different field (prime field GF(p) and Binary field GF(2m)) instructions execution is analysed.The design is implemented in Spartan 3E and virtex5. Both the performance results are compared. The implementation result shows the execution of parallelism using dual field instructions
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Implementation and Comparison of Efficient 16-Bit SQRT CSLA Using Parity Pres...IJERA Editor
In Very Large Scale Integration (VLSI) outlines, Carry Select Adder (CSLA) is one of the quickest adder utilized as a part of numerous data processing processors to perform quick number crunching capacities. In this paper we proposed the design of SQRT CSLA using parity preserving reversible gate (P2RG). Reversible logic is emerging field in today VLSI design. In conventional circuits, the logic gates such as AND gate, OR gate is irreversible in nature and computing with irreversible logic results in energy dissipation. This problem can be circumvented by using reversible logic. In ideal condition, the reversible logic gate produces zero power dissipation. The proposed design is efficient in terms of delay as compare to irreversible SQRT CSLA. The simulation is done using Xilinx.
An Improved Optimization Techniques for Parallel Prefix Adder using FPGAIJMER
International Journal of Modern Engineering Research (IJMER) is Peer reviewed, online Journal. It serves as an international archival forum of scholarly research related to engineering and science education.
Design and Estimation of delay, power and area for Parallel prefix addersIJERA Editor
In Very Large Scale Integration (VLSI) designs, Parallel prefix adders (PPA) have the better delay performance. This paper investigates four types of PPA’s (Kogge Stone Adder (KSA), Spanning Tree Adder (STA), Brent Kung Adder (BKA) and Sparse Kogge Stone Adder (SKA)). Additionally Ripple Carry Adder (RCA), Carry Look ahead Adder (CLA) and Carry Skip Adder (CSA) are also investigated. These adders are implemented in verilog Hardware Description Language (HDL) using Xilinx Integrated Software Environment (ISE) 13.2 Design Suite. These designs are implemented in Xilinx Spartan 6 Field Programmable Gate Arrays (FPGA). Delay and area are measured using XPower analyzer and all these adder’s delay, power and area are investigated and compared finally
Implementation and Estimation of Delay, Power and Area for Parallel Prefix Ad...IJMTST Journal
Parallel Prefix Adders have been established as the most efficient circuits for binary addition. The binary adder is the critical element in most digital circuit designs including digital signal processors and microprocessor data path units. The final carry is generated ahead to the generation of the sum which leads extensive research focused on reduction in circuit complexity and power consumption of the adder. In VLSI implementation, parallel-prefix adders are known to have the best performance. This paper investigates four types of carry-tree adders (the Kogge-Stone, sparse Kogge-Stone, spanning tree, Brent kung Adder) and compare them to the simple Ripple Carry Adder and Carry Skip Adder. These designs of varied bit-widths are simulated using implemented on a Xilinx version Spartan 3E FPGA. These fast carry-chain carry-tree adders support the bit width up to 256. We report on the area requirements and reduction in circuit complexity for a variety of classical parallel prefix adder structures.
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORMVLSICS Design
This paper is devoted to the design of dual core crypto processor for executing both Prime field and binary field instructions. The proposed design is specifically optimized for Field programmable gate array (FPGA) platform. Combination of two different field (prime field GF(p) and Binary field GF(2m)) instructions execution is analysed.The design is implemented in Spartan 3E and virtex5. Both the performance results are compared. The implementation result shows the execution of parallelism using dual field instructions
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Implementation and Comparison of Efficient 16-Bit SQRT CSLA Using Parity Pres...IJERA Editor
In Very Large Scale Integration (VLSI) outlines, Carry Select Adder (CSLA) is one of the quickest adder utilized as a part of numerous data processing processors to perform quick number crunching capacities. In this paper we proposed the design of SQRT CSLA using parity preserving reversible gate (P2RG). Reversible logic is emerging field in today VLSI design. In conventional circuits, the logic gates such as AND gate, OR gate is irreversible in nature and computing with irreversible logic results in energy dissipation. This problem can be circumvented by using reversible logic. In ideal condition, the reversible logic gate produces zero power dissipation. The proposed design is efficient in terms of delay as compare to irreversible SQRT CSLA. The simulation is done using Xilinx.
Implementation and Design of AES S-Box on FPGAIJRES Journal
The Advanced Encryption Standard can be programmed in software or built with pure hardware. However Field Programmable Gate Arrays (FPGAs) offer a quicker, more customizable solution. This research investigates the AES algorithm with regard to FPGA and the Very High Speed Integrated Circuit Hardware Description language (VHDL). Xilinx Design Suite 14.5 software is used for simulation and optimization of the synthesizable VHDL code. All the transformations of both Encryptions is simulated using an iterative design approach in order to minimize the hardware consumption. Virtex 6 Family devices are utilized for hardware evaluation. Advanced Encryption Standard (AES) is one of the most common symmetric encryption algorithms. The hardware complexity in AES is dominated by AES substitution box (S-box) which is considered as one of the most complicated and costly part of the system because it is the only non-linear structure. Theoretically, the design reduces the overall delay and efficiently for applications with high-speed performance. This approach is suitable for FPGA implementation in term of gate area. The hardware, total area and delay are presented.
Design of 32 bit Parallel Prefix Adders IOSR Journals
In this paper, we propose 32 bit Kogge-Stone, Brent-Kung, Ladner-Fischer parallel prefix adders. In
general N-bit adders like Ripple Carry Adders (slow adders compare to other adders), and Carry Look Ahead
adders (area consuming adders) are used in earlier days. But now the most Industries are using parallel prefix
adders because of their advantages compare to other adders. Parallel prefix adders are faster and area
efficient. Parallel prefix adder is a technique for increasing the speed in DSP processor while performing
addition. We simulate and synthesis different types of 32-bit prefix adders using Xilinx ISE 10.1i tool. By using
these synthesis results, we noted the performance parameters like number of LUTs and delay. We compare these three adders in terms of LUTs (represents area) and delay values.
HEVC 2D-DCT architectures comparison for FPGA and ASIC implementationsTELKOMNIKA JOURNAL
This paper compares ASIC and FPGA implementations of two commonly used architectures for
2-dimensional discrete cosine transform (DCT), the parallel and folded architectures. The DCT has been
designed for sizes 4x4, 8x8, and 16x16, and implemented on Silterra 180nm ASIC and Xilinx Kintex
Ultrascale FPGA. The objective is to determine suitable low energy architectures to be used as their
characteristics greatly differ in terms of cells usage, placement and routing methods on these platforms.
The parallel and folded DCT architectures for all three sizes have been designed using Verilog HDL,
including the basic serializer-deserializer input and output. Results show that for large size transform of
16x16, ASIC parallel architecture results in roughly 30% less energy compared to folded architecture. As
for FPGAs, folded architecture results in roughly 34% less energy compared to parallel architecture. In
terms of overall energy consumption between 180nm ASIC and Xilinx Ultrascale, ASIC implementation
results in about 58% less energy compared to the FPGA.
Design of High Speed 128 bit Parallel Prefix AddersIJERA Editor
In this paper, we propose 128-bit Kogge-Stone, Ladner-Fischer, Spanning tree parallel prefix adders and
compared with Ripple carry adder. In general N-bit adders like Ripple Carry Adders (slow adders compare to
other adders), and Carry Look Ahead adders (area consuming adders) are used in earlier days. But now the most
Industries are using parallel prefix adders because of their advantages compare to other adders. Parallel prefix
adders are faster and area efficient. Parallel prefix adder is a technique for increasing the speed in DSP processor
while performing addition. While when we want to design any 128-bit operating systems and processors we can
use these adders in place of regular adders. We simulate and synthesis different types of 128-bit prefix adders
using Xilinx ISE 12.3 tool. By using these synthesis results, we noted the performance parameters like number
of LUTs and delay. We compare these three adders in terms of LUTs (represents area) and delay values.
Parallel-prefix adders offer a highly efficient solution to the binary addition problem and are
well-suited for VLSI implementations. In this paper, a novel framework is introduced, which allows
the design of parallel-prefix Ling adders. The proposed approach saves one-logic level of
implementation compared to the parallel-prefix structures proposed for the traditional definition of
carry look ahead equations and reduces the fan out requirements of the design. Experimental results
reveal that the proposed adders achieve delay reductions of up to 14 percent when compared to the
fastest parallel-prefix architectures presented for the traditional definition of carry equations
An Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ AdderIJERA Editor
In this paper, we propse 16-bit sparse tree RSFQ adder (Rapid single flux quantam), kogge-stone adder, carry lookahead adder. In general N-bit adders like Ripple carry adder s(slow adders compare to other adders), and carry lookahead adders(area consuming adders) are used in earlier days. But now the most of industries are using parallel prefix adders because of their advantages compare to kogge-stone adder, carry lookahead adder, Our prefix sparse tree adders are faster and area efficient. Parallel prefix adder is a technique for increasing the speed in DSP processor while performing addition. We simulate and synthesis different types of 16-bit sparse tree RSFQ adders using Xilinx ISE10.1i tool, By using these synthesis results, We noted the performance parameters like number of LUT’s and delay. We compare these three adders interms of LUT’s represents area) and delay values.
Novel Tree Structure Based Conservative Reversible Binary Coded Decimal Adder...VIT-AP University
Reversible logic has been recognized as one of the most promising technique for the realization of the quantum circuit. In this paper, a cost effective conservative, reversible binary coded decimal
(BCD) adder is proposed for the quantum logic circuit. Towards the realization of BCD adder, few novel gates, such as Half-Adder/Subtraction (HAS-PP), Full-Adder/Subtraction (FAS-PP) and Overflow-detection (OD-PP) based on parity preserving logic are synthesized which incurs 7, 10 and
13 quantum cost respectively. Coupling these gates a novel tree-based methodology is proposed to
implement the required BCD Adder. Also, the BCD adder design has been optimized to achieve the optimum value of quantum cost. In addition, the proposed BCD circuit is extended to n-bit adder using replica based techniques. Experimental result establishes the novelty of the proposed logic,
which outperforms the conventional circuits in terms of logic synthesis and testability. The limitation
of detecting the missing gate and missing control point of the quantum circuit of overflow detection
is finally tackled this work by the proposed OD-PP with the application of the minimum test vector. In addition, reversible circuits of control inputs based testable master-slave D-FF is intended. The noted work on the testable sequential circuit presented here is to develop circuit using minimum test vectors and can find diverse application in the testing paradigm
FPGA Implementation of SubByte & Inverse SubByte for AES Algorithmijsrd.com
Advanced encryption standard was accepted as a Federal Information Processing Standard (FIPS) standard. In traditional look up table (LUT) approaches, the unbreakable delay is longer than the total delay of the rest of operations in each round. LUT approach consumes a large area. It is more efficient to apply composite field arithmetic in the SubBytes transformation of the AES algorithm. It not only reduces the complexity but also enables deep sub pipelining such that higher speed can be achieved. Isomorphic mapping can be employed to convert GF(28) to GF(22)2)2) ,so that multiplicative inverse can be easily obtained. SubBytes and InvSubBytes transformations are merged using composite field arithmetic. It is most important responsible for the implementation of low cost and high throughput AES architecture. As compared to the typical ROM based lookup table, the presented implementation is both capable of higher speeds since it can be pipelined and small in terms of area occupancy (137/1290 slices on a Spartan III XCS200-5FPGA).
Design and implementation of Closed Loop Control of Three Phase Interleaved P...IJMTST Journal
A single-phase, three-level, single-stage power-factor corrected AC/DC converter operated under closed
loop manner is presented. That operates with a single controller to regulate the output voltage and the input
inductor act as a boost inductor to have a single stage power factor correction with good output response. The
paper deals with a new single stage three level ac-dc converter which performs both power factor correction
and voltage regulation in a single stage. The proposed converter has two separate controllers, one for power
factor correction and the other for regulating the output voltage. A comprehensive review of the existing single
stage topologies has been carried out. Then the operating principle, control scheme and the design of the new
converter are presented. The proposed converter is having an input power factor close to unity and better
voltage regulation compared to the conventional ac-dc converter topologies. Proposed topology is evaluated
through Matlab/Simulink platform and simulation results are conferred.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Development (IJERD)IJERD Editor
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
Implementation and Design of AES S-Box on FPGAIJRES Journal
The Advanced Encryption Standard can be programmed in software or built with pure hardware. However Field Programmable Gate Arrays (FPGAs) offer a quicker, more customizable solution. This research investigates the AES algorithm with regard to FPGA and the Very High Speed Integrated Circuit Hardware Description language (VHDL). Xilinx Design Suite 14.5 software is used for simulation and optimization of the synthesizable VHDL code. All the transformations of both Encryptions is simulated using an iterative design approach in order to minimize the hardware consumption. Virtex 6 Family devices are utilized for hardware evaluation. Advanced Encryption Standard (AES) is one of the most common symmetric encryption algorithms. The hardware complexity in AES is dominated by AES substitution box (S-box) which is considered as one of the most complicated and costly part of the system because it is the only non-linear structure. Theoretically, the design reduces the overall delay and efficiently for applications with high-speed performance. This approach is suitable for FPGA implementation in term of gate area. The hardware, total area and delay are presented.
Design of 32 bit Parallel Prefix Adders IOSR Journals
In this paper, we propose 32 bit Kogge-Stone, Brent-Kung, Ladner-Fischer parallel prefix adders. In
general N-bit adders like Ripple Carry Adders (slow adders compare to other adders), and Carry Look Ahead
adders (area consuming adders) are used in earlier days. But now the most Industries are using parallel prefix
adders because of their advantages compare to other adders. Parallel prefix adders are faster and area
efficient. Parallel prefix adder is a technique for increasing the speed in DSP processor while performing
addition. We simulate and synthesis different types of 32-bit prefix adders using Xilinx ISE 10.1i tool. By using
these synthesis results, we noted the performance parameters like number of LUTs and delay. We compare these three adders in terms of LUTs (represents area) and delay values.
HEVC 2D-DCT architectures comparison for FPGA and ASIC implementationsTELKOMNIKA JOURNAL
This paper compares ASIC and FPGA implementations of two commonly used architectures for
2-dimensional discrete cosine transform (DCT), the parallel and folded architectures. The DCT has been
designed for sizes 4x4, 8x8, and 16x16, and implemented on Silterra 180nm ASIC and Xilinx Kintex
Ultrascale FPGA. The objective is to determine suitable low energy architectures to be used as their
characteristics greatly differ in terms of cells usage, placement and routing methods on these platforms.
The parallel and folded DCT architectures for all three sizes have been designed using Verilog HDL,
including the basic serializer-deserializer input and output. Results show that for large size transform of
16x16, ASIC parallel architecture results in roughly 30% less energy compared to folded architecture. As
for FPGAs, folded architecture results in roughly 34% less energy compared to parallel architecture. In
terms of overall energy consumption between 180nm ASIC and Xilinx Ultrascale, ASIC implementation
results in about 58% less energy compared to the FPGA.
Design of High Speed 128 bit Parallel Prefix AddersIJERA Editor
In this paper, we propose 128-bit Kogge-Stone, Ladner-Fischer, Spanning tree parallel prefix adders and
compared with Ripple carry adder. In general N-bit adders like Ripple Carry Adders (slow adders compare to
other adders), and Carry Look Ahead adders (area consuming adders) are used in earlier days. But now the most
Industries are using parallel prefix adders because of their advantages compare to other adders. Parallel prefix
adders are faster and area efficient. Parallel prefix adder is a technique for increasing the speed in DSP processor
while performing addition. While when we want to design any 128-bit operating systems and processors we can
use these adders in place of regular adders. We simulate and synthesis different types of 128-bit prefix adders
using Xilinx ISE 12.3 tool. By using these synthesis results, we noted the performance parameters like number
of LUTs and delay. We compare these three adders in terms of LUTs (represents area) and delay values.
Parallel-prefix adders offer a highly efficient solution to the binary addition problem and are
well-suited for VLSI implementations. In this paper, a novel framework is introduced, which allows
the design of parallel-prefix Ling adders. The proposed approach saves one-logic level of
implementation compared to the parallel-prefix structures proposed for the traditional definition of
carry look ahead equations and reduces the fan out requirements of the design. Experimental results
reveal that the proposed adders achieve delay reductions of up to 14 percent when compared to the
fastest parallel-prefix architectures presented for the traditional definition of carry equations
An Efficient High Speed Design of 16-Bit Sparse-Tree RSFQ AdderIJERA Editor
In this paper, we propse 16-bit sparse tree RSFQ adder (Rapid single flux quantam), kogge-stone adder, carry lookahead adder. In general N-bit adders like Ripple carry adder s(slow adders compare to other adders), and carry lookahead adders(area consuming adders) are used in earlier days. But now the most of industries are using parallel prefix adders because of their advantages compare to kogge-stone adder, carry lookahead adder, Our prefix sparse tree adders are faster and area efficient. Parallel prefix adder is a technique for increasing the speed in DSP processor while performing addition. We simulate and synthesis different types of 16-bit sparse tree RSFQ adders using Xilinx ISE10.1i tool, By using these synthesis results, We noted the performance parameters like number of LUT’s and delay. We compare these three adders interms of LUT’s represents area) and delay values.
Novel Tree Structure Based Conservative Reversible Binary Coded Decimal Adder...VIT-AP University
Reversible logic has been recognized as one of the most promising technique for the realization of the quantum circuit. In this paper, a cost effective conservative, reversible binary coded decimal
(BCD) adder is proposed for the quantum logic circuit. Towards the realization of BCD adder, few novel gates, such as Half-Adder/Subtraction (HAS-PP), Full-Adder/Subtraction (FAS-PP) and Overflow-detection (OD-PP) based on parity preserving logic are synthesized which incurs 7, 10 and
13 quantum cost respectively. Coupling these gates a novel tree-based methodology is proposed to
implement the required BCD Adder. Also, the BCD adder design has been optimized to achieve the optimum value of quantum cost. In addition, the proposed BCD circuit is extended to n-bit adder using replica based techniques. Experimental result establishes the novelty of the proposed logic,
which outperforms the conventional circuits in terms of logic synthesis and testability. The limitation
of detecting the missing gate and missing control point of the quantum circuit of overflow detection
is finally tackled this work by the proposed OD-PP with the application of the minimum test vector. In addition, reversible circuits of control inputs based testable master-slave D-FF is intended. The noted work on the testable sequential circuit presented here is to develop circuit using minimum test vectors and can find diverse application in the testing paradigm
FPGA Implementation of SubByte & Inverse SubByte for AES Algorithmijsrd.com
Advanced encryption standard was accepted as a Federal Information Processing Standard (FIPS) standard. In traditional look up table (LUT) approaches, the unbreakable delay is longer than the total delay of the rest of operations in each round. LUT approach consumes a large area. It is more efficient to apply composite field arithmetic in the SubBytes transformation of the AES algorithm. It not only reduces the complexity but also enables deep sub pipelining such that higher speed can be achieved. Isomorphic mapping can be employed to convert GF(28) to GF(22)2)2) ,so that multiplicative inverse can be easily obtained. SubBytes and InvSubBytes transformations are merged using composite field arithmetic. It is most important responsible for the implementation of low cost and high throughput AES architecture. As compared to the typical ROM based lookup table, the presented implementation is both capable of higher speeds since it can be pipelined and small in terms of area occupancy (137/1290 slices on a Spartan III XCS200-5FPGA).
Design and implementation of Closed Loop Control of Three Phase Interleaved P...IJMTST Journal
A single-phase, three-level, single-stage power-factor corrected AC/DC converter operated under closed
loop manner is presented. That operates with a single controller to regulate the output voltage and the input
inductor act as a boost inductor to have a single stage power factor correction with good output response. The
paper deals with a new single stage three level ac-dc converter which performs both power factor correction
and voltage regulation in a single stage. The proposed converter has two separate controllers, one for power
factor correction and the other for regulating the output voltage. A comprehensive review of the existing single
stage topologies has been carried out. Then the operating principle, control scheme and the design of the new
converter are presented. The proposed converter is having an input power factor close to unity and better
voltage regulation compared to the conventional ac-dc converter topologies. Proposed topology is evaluated
through Matlab/Simulink platform and simulation results are conferred.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Development (IJERD)IJERD Editor
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tr...IJMER
International Journal of Modern Engineering Research (IJMER) is Peer reviewed, online Journal. It serves as an international archival forum of scholarly research related to engineering and science education.
International Journal of Modern Engineering Research (IJMER) covers all the fields of engineering and science: Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Computer Engineering, Agricultural Engineering, Aerospace Engineering, Thermodynamics, Structural Engineering, Control Engineering, Robotics, Mechatronics, Fluid Mechanics, Nanotechnology, Simulators, Web-based Learning, Remote Laboratories, Engineering Design Methods, Education Research, Students' Satisfaction and Motivation, Global Projects, and Assessment…. And many more.
International Journal of Computational Engineering Research(IJCER)ijceronline
International Journal of Computational Engineering Research(IJCER) is an intentional online Journal in English monthly publishing journal. This Journal publish original research work that contributes significantly to further the scientific knowledge in engineering and Technology.
Fpga implementation of encryption and decryption algorithm based on aeseSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
EFFICIENT ABSOLUTE DIFFERENCE CIRCUIT FOR SAD COMPUTATION ON FPGAVLSICS Design
Video Compression is very essential to meet the technological demands such as low power, less memory
and fast transfer rate for different range of devices and for various multimedia applications. Video
compression is primarily achieved by Motion Estimation (ME) process in any video encoder which
contributes to significant compression gain.Sum of Absolute Difference (SAD) is used as distortion metric
in ME process.In this paper, efficient Absolute Difference(AD)circuit is proposed which uses Brent Kung
Adder(BKA) and a comparator based on modified 1’s complement principle and conditional sum adder
scheme. Results shows that proposed architecture reduces delay by 15% and number of slice LUTs by 42
% as compared to conventional architecture. Simulation and synthesis are done on Xilinx ISE 14.2 using
Virtex 7 FPGA.
EFFICIENT ABSOLUTE DIFFERENCE CIRCUIT FOR SAD COMPUTATION ON FPGAVLSICS Design
Video Compression is very essential to meet the technological demands such as low power, less memory and fast transfer rate for different range of devices and for various multimedia applications. Video compression is primarily achieved by Motion Estimation (ME) process in any video encoder which contributes to significant compression gain.Sum of Absolute Difference (SAD) is used as distortion metric in ME process.In this paper, efficient Absolute Difference(AD)circuit is proposed which uses Brent Kung Adder(BKA) and a comparator based on modified 1’s complement principle and conditional sum adder scheme. Results shows that proposed architecture reduces delay by 15% and number of slice LUTs by 42 % as compared to conventional architecture. Simulation and synthesis are done on Xilinx ISE 14.2 using Virtex 7 FPGA.
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMSIAEME Publication
An efficient Priority-Arbiter based Router is designed along with 2X2 and 3X3 mesh
topology based NOC architecture are designed. The Priority –Arbiter based Router
design includes Input registers, Priority arbiter, and XY- Routing algorithm. The
Priority-Arbiter based Router and NOC 2X2 and 3X3 Router designs are synthesized
and implemented using Xilinx ISE Tool and simulated using Modelsim6.5f. The
implementation is done by Artix-7 FPGA device, and the physically debugging of the
NOC 2X2 Router design is verified using Chipscope pro tool. The performance results
are analyzed in terms of the Area (Slices, LUT’s), Timing period, and Maximum
operating frequency. The comparison of the Priority-Arbiter based Router is made
concerning previous similar architecture with improvements.
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMSIAEME Publication
An efficient Priority-Arbiter based Router is designed along with 2X2 and 3X3 mesh
topology based NOC architecture are designed. The Priority –Arbiter based Router
design includes Input registers, Priority arbiter, and XY- Routing algorithm. The
Priority-Arbiter based Router and NOC 2X2 and 3X3 Router designs are synthesized
and implemented using Xilinx ISE Tool and simulated using Modelsim6.5f. The
implementation is done by Artix-7 FPGA device, and the physically debugging of the
NOC 2X2 Router design is verified using Chipscope pro tool. The performance results
are analyzed in terms of the Area (Slices, LUT’s), Timing period, and Maximum
operating frequency. The comparison of the Priority-Arbiter based Router is made
concerning previous similar architecture with improvements.
EFFICIENT ABSOLUTE DIFFERENCE CIRCUIT FOR SAD COMPUTATION ON FPGAVLSICS Design
Video Compression is very essential to meet the technological demands such as low power, less memory and fast transfer rate for different range of devices and for various multimedia applications. Video compression is primarily achieved by Motion Estimation (ME) process in any video encoder which contributes to significant compression gain.Sum of Absolute Difference (SAD) is used as distortion metric in ME process.In this paper, efficient Absolute Difference(AD)circuit is proposed which uses Brent Kung Adder(BKA) and a comparator based on modified 1’s complement principle and conditional sum adder scheme. Results shows that proposed architecture reduces delay by 15% and number of slice LUTs by 42% as compared to conventional architecture. Simulation and synthesis are done on Xilinx ISE 14.2 using Virtex 7 FPGA.
Braun’s Multiplier Implementation using FPGA with Bypassing Techniques.VLSICS Design
The developing an Application Specific Integrated Circuits (ASICs) will cost very high, the circuits should be proved and then it would be optimized before implementation. Multiplication which is the basic building block for several DSP processors, Image processing and many other. The Braun multipliers can easily be implemented using Field Programmable Gate Array (FPGA) devices. This research presented the comparative study of Spartan-3E, Virtex-4, Virtex-5 and Virtex-6 Low Power FPGA devices. The implementation of Braun multipliers and its bypassing techniques is done using Verilog HDL. We are proposing that adder block which we implemented our design (fast addition) and we compared the results of that so that our proposed method is effective when compare to the conventional design. There is the reduction in the resources like delay LUTs, number of slices used. Results are showed and it is verified using the Spartan-3E, Virtex-4 and Virtex-5 devices. The Virtex-5 FPGA has shown the good performance as compared to Spartan-3E and Virtex-4 FPGA devices.
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGAVLSICS Design
Adders form an almost obligatory component of every contemporary integrated circuit. The prerequisite of the adder is that it is primarily fast and secondarily efficient in terms of power consumption and chip area. Therefore, careful optimization of the adder is of the greatest importance. This optimization can be attained
in two levels; it can be circuit or logic optimization. In circuit optimization the size of transistors are manipulated, where as in logic optimization the Boolean equations are rearranged (or manipulated) to optimize speed, area and power consumption. This paper focuses the optimization of adder through technology independent mapping. The work presents 20 different logical construction of 1-bit adder cell in CMOS logic and its performance is analyzed in terms of transistor count, delay and power dissipation. These performance issues are analyzed through Tanner EDA with TSMC MOSIS 250nm technology. From this analysis the optimized equation is chosen to construct a full adder circuit in terms of multiplexer. This logic optimized multiplexer based adders are incorporated in selected existing adders like ripple carry
adder, carry look-ahead adder, carry skip adder, carry select adder, carry increment adder and carry save adder and its performance is analyzed in terms of area (slices used) and maximum combinational path delay as a function of size. The target FPGA device chosen for the implementation of these adders was Xilinx ISE 12.1 Spartan3E XC3S500-5FG320. Each adder type was implemented with bit sizes of: 8, 16, 32, 64 bits. This variety of sizes will provide with more insight about the performance of each adder in terms of area and delay as a function of size.
Design Of 64-Bit Parallel Prefix VLSI Adder For High Speed Arithmetic CircuitsIJRES Journal
Parallel prefix adder is a kind of process for speeding up the addition of the system of writing and calculating with numbers which use only two digits. Parallel prefix adders are also known as carry-tree adders and they are known to have the best performance in VLSI designs. Due to constraints on logic blog configurations a routing overhead, this performance advantage does not translate directly into FPGA implementations. Identifying the absolutely accurate area-delay tradeoff curve of the parallel prefix is an interesting problem that has received more attention in research because parallel prefix adder on the other hand represents a type of general adder structure that displays publically in flexible area-time tradeoffs for the design of adder. Many different types of parallel prefix adders are made to increase for optimizing area, fan out, speed and performance. For high speed performance tree like structure is must which helps in greater way. There are many different method used for designing parallel prefix adder based on their speed, size and performance. For area optimization we use Brent-Kung method. If our main purpose is to get the least timing then we have to use Kogg-Stone adder method.
MF-RALU: design of an efficient multi-functional reversible arithmetic and l...IJECEIAES
Most modern computer applications use reversible logic gates to solve power dissipation issues. This manuscript uses an efficient multi-functional reversible arithmetic and logical unit (MF-RALU) to perform 30 operations. The 32-bit MF-RALU includes arithmetic, logical, complement, shifters, multiplexers, different adders, and multipliers. The multi-bit reversible multiplexers are used to construct the MF-RALU structure. The Reduced instruction set computer (RISC) processor is designed to realize the functionality of the MF-RALU. The MF-RALU can perform its operation in a single clock cycle. The 1-bit RALU is developed and compared with existing approaches with improvements in performance metrics. The 32-bit reversible arithmetic units (RAUs) and reversible logical units (RLUs) are constructed using 1-bit RALU. The MF-RALU and RISC processor are synthesized individually in the Vivado environment using Verilog-HDL and implemented on Artix-7 field programmable gate array (FPGA). The MFRALU utilizes a <11% chip area and consumes 332 mW total power. The RISC processor utilizes a <3% chip area and works at 483 MHZ frequency by consuming 159 mW of total power on Artix-7 FPGA.
Similar to Design and implementation of Parallel Prefix Adders using FPGAs (20)
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
Design and implementation of Parallel Prefix Adders using FPGAs
1. IOSR Journal of Electronics and Communication Engineering (IOSR-JECE)
e-ISSN: 2278-2834,p- ISSN: 2278-8735. Volume 6, Issue 5 (Jul. - Aug. 2013), PP 41-48
www.iosrjournals.org
www.iosrjournals.org 41 | Page
Design and implementation of Parallel Prefix Adders using
FPGAs
M.Venkata Durga rama raju1
, A.Deepthi2
(1)
M.E, Department of Electronics & Communication Engineering
(2)
Asst.Professor, Department of Electronics & Communication Engineering
(1)(2)
MVSREC, Nadargul, Hyderabad.
Abstract: Adders are known to have the frequently used in VLSI designs. In digital design we have half adder
and full adder, main adders by using these adders we can implement ripple carry adders. RCA use to perform
any number of addition. In this RCA is serial adder and it has commutation delay problem. If increase the
ha&fa simultaneously delay also increase. That’s why we go for parallel adders(parallel prefix adders). IN the
parallel prefix adder are ks adder(kogge-stone),sks adder(sparse kogge-stone),spaning tree and brentkung
adder. These adders are designd and implemented on FPGA sparton3E kit. Simulated and synthesis by model
sim6.4b, Xilinx ise10.1.
I. Introduction
The binary adder is the critical element in most digital circuit designs including digital signal
processors (DSP) and microprocessor datapath units. As such, extensive research continues to be focused on
improving the power-delay performance of the adder. In VLSI implementations, parallel-prefix adders are
known to have the best performance. Reconfigurable logic such as Field Programmable Gate Arrays
(FPGAs) has been gaining in popularity in recent years because it offers improved performance in terms of
speed and power over DSP-based and microprocessor-based solutions for many practical designs involving
mobile DSP and telecommunications applications and a significant reduction in development time and cost
over Application Specific Integrated Circuit (ASIC) designs. The power advantage is especially important with
the growing popularity of mobile and portable electronics, which make extensive use of DSP functions.
However, because of the structure of the configurable logic and routing resources in FPGAs, parallel-prefix
adders will have a different performance than VLSI implementations. In particular, most modern FPGAs
employ a fast-carry chain which optimizes the carry path for the simple Ripple Carry
Adder (RCA).
In this paper, the practical issues involved in designing and implementing tree-based adders on FPGAs
are
This work was supported in part by NSF LSAMP and UT-System STARS awards. The FPGA ISE
synthesis software was supplied by the Xilinx University program.
described. An efficient testing strategy for evaluating the performance of these adders is discussed.
Several tree- based adder structures are implemented and characterized on a FPGA and compared with the
Ripple Carry Adder (RCA) and the Carry Skip Adder (CSA) . Finally, some conclusions and suggestions for
improving FPGA designs to enable better tree-based adder performance are given.
II. Carry-Tree Adder Designs
Parallel-prefix adders, also known as carry-tree adders, pre-compute the propagate and generate signals
[1]. These signals are variously combined using the fundamental carry operator (fco) [2].
(gL, pL) ο (gR, pR) = (gL + pL•gR, pL • pR) (1)
Due to associative property of the fco, these operators can be combined in different ways to form various adder
structures. For, example the four-bit carry-lookahead generator is given by:
c4 = (g4, p4) ο [ (g3, p3) ο [(g2, p2) ο (g1, p1)] ] (2)
A simple rearrangement of the order of operations allows parallel operation, resulting in a more efficient tree
Submitted Date 26 June 2013 Accepted Date: 01 July 2013
3. Design and implementation of Parallel Prefix Adders using FPGAs
www.iosrjournals.org 43 | Page
Fig. 2. Spanning Tree Carry Lookahead Adder (16 bit)
III. Related Work
The ripple carry adder with the carry-lookahead, carry-skip, and carry-select adders on the Xilinx 4000
series FPGAs. Only an optimized form of the carry-skip adder performed better than the ripple carry adder when
the adder operands were above 56 bits. A study of adders implemented on the Xilinx Virtex II yielded similar
results [9]. In [10], the authors considered several parallel prefix adders implemented on a Xilinx Virtex 5
FPGA. It is found that the simple RCA adder is superior to the parallel prefix designs because the RCA can take
advantage of the fast carry chain on the FPGA.
This study focuses on carry-tree adders implemented on a Xilinx Spartan 3E FPGA. The distinctive
contributions of this paper are two-fold. First, we consider tree-based adders and a hybrid form which combines
a tree structure with a ripple-carry design. The Kogge-Stone adder is chosen as a representative of the former
type and the sparse Kogge-Stone and spanning tree adder are representative of the latter category. Second, this
paper considers the practical issues involved in testing the adders and provides actual measurement data to
compare with simulation results. The previous works cited above all rely upon the synthesis reports from the
FPGA place and route software for their results. In addition to being able to compare the simulation data with
measured data using a high-speed logic analyzer, our results present a different perspective in terms of both
results and types of adders as those presented in [8-10].
IV. Method Of Study
The adders to be studied were designed with varied bit widths up to 128 bits and coded in VHDL. The
functionality of the designs were verified via simulation with ModelSim. The Xilinx ISE 12.2 software was used
to synthesize the designs onto the Spartan 3E FPGA. In order to effectively test for the critical delay, two steps
were taken. First, a memory block (labeled as ROM in the figure below) was instantiated on the FPGA using the
CoreGenerator to allow arbitrary patterns of inputs to be applied to the adder design. A multiplexer at each
adder output selects whether or not to include the adder in the measured results, as shown in Fig. 3. A switch on
the FPGA board was wired to the select pin of the multiplexers. This allows measurements to be made to
subtract out the delay due to the memory, the multiplexers, and interconnect (both external cabling and internal
routing).
Xing and Yu noted that delay models and cost analysis for designs developed for VLSI technology
do not map Second, the parallel prefix network was analyzed to directly to FPGA designs [8]. They
compared the design of
4. Design and implementation of Parallel Prefix Adders using FPGAs
www.iosrjournals.org 44 | Page
Fig. 3. Circuit used to test the adders adder
etermine if a specific pattern could be used to extract the worst case delay. Considering the structure of the
Generate-Propagate (GP) blocks (i.e., the BC and GC cells), we were able to develop the following scheme, by
considering the following subset of input values to the GP blocks.
Table I: Subset of (g, p) Relations Used for Testing
(gL, pL) (gR, pR) (gL + pL gR, pL pR)
(0,1) (0,1) (0,1)
(0,1) (1,0) (1,0)
(1,0) (0,1) (1,0)
(1,0) (1,0) (1,0)
If we arbitrarily assign the (g, p) ordered pairs the values (1, 0) = True and (0, 1) = False, then the table
is self-contained and forms an OR truth table. Furthermore, if both inputs to the GP block are False, then the
output is False; conversely, if both inputs are True, then the output is True. Hence, an input pattern that
alternates between generating the (g, p) pairs of (1, 0) and (0, 1) will force its GP pair block to alternate states.
Likewise, it is easily seen that the GP blocks being fed by its predecessors will also alternate states.
Therefore, this scheme will ensure that a worse case delay will be generated in the parallel prefix network since
every block will be active. In order to ensure this scheme works, the parallel prefix adders were synthesized
with the “Keep Hierarchy” design setting turned on (otherwise, the FPGA compiler attempts to reorganize the
logic assigned to each LUT). With this option turned on, it ensures that each GP block is mapped to one LUT,
preserving the basic parallel prefix structure, and ensuring that this test strategy is effective for determining the
critical delay. The designs were also synthesized for speed rather than area optimization.
The adders were tested with a Tektronix TLA7012 Logic Analyzer. The logic analyzer is equipped
with the 7BB4 module that provides a timing resolution of 20 ps under the MagniVu setting. This allows direct
measurement of the adder delays. The Spartan 3E development board is equipped with a soft touch-landing pad
which allows low capacitance connection directly to the logic analyzer. The test setup is depicted in the figure
below.
5. Design and implementation of Parallel Prefix Adders using FPGAs
www.iosrjournals.org 45 | Page
Fig. 4. Test setup showing the Logic Analyzer and Spartan 3E development board
Fig. 5. Screen shot of a delay measurement for a 64 bit adder using MagniVu timing (blue traces) on the TLA
7012.
V. Discussion of Results
The simulated adder delays obtained from the Xilinx ISE synthesis reports are shown in Fig. 6. The
simulation results for the carry skip adders are not included because the ISE software is not able to correctly
identify the critical path through the adder and hence does not report accurate estimates of the adder delay.
Observe that a semi-log plot is employed, so as expected the tree-adder delay plots as a straight line on
this graph. Somewhat surprising is the fact that the sparse Kogge-Stone adder has about the same delay as the
regular Kogge -Stone adder. Because the sparse Kogge Stone completes the summation process with a 4 bit
RCA, which are optimized via the fast carry chain, its performance is expected to be intermediate between the
regular Kogge-Stone adder and the RCA. The impact of the routing overhead would seem to be a likely cause.
However, according to the synthesis reports, the delay with the logic only makes the regular Kogge -Stone
slightly faster. This will need to be a topic of further investigation.
Fig. 6. Simulation results for the adder designs
Overall, when the delay due to routing overhead is removed, the tree adders are now closer to the
simple RCA design. The RCA adder exhibits the best delay with widths up to 64 bits when the routing delay is
excluded and out to 128 bits with the routing delay included.
Figures 7 and 8 depict the measured results using the TLA. A comparison between the tree adders and
the RCA is given in Figure 7. The basic trends are the same: the tree adders exhibit logarithmic delay
dependence on bit widths and the RCA has linear performance. An RCA as large as 160 bits wide was
synthesizable on the FPGA, while a Kogge-Stone adder up to 128 bits wide was implemented. The carry-skip
6. Design and implementation of Parallel Prefix Adders using FPGAs
www.iosrjournals.org 46 | Page
adders are compared with the Kogge-Stone adders and the RCA in Figure 8. Carry skip adders with a skip of
four and eight were implemented. The poor performance of the carry skip adders is attributable to the significant
routing overhead incurred by this structure.
Fig. 7. Measured results for the parallel-prefix adder designs compared with the RCA
Fig. 8. Measured results for the carry-skip adders compared to the RCA and Kogge-Stone adders
The actual measured data appears to be a bit smaller than what is predicted by the Xilinx ISE synthesis
reports. An analysis of these reports, which give a breakdown of delay due to logic and routing, would seem to
indicate that at adder widths approaching 256 bits and beyond, the Kogge-Stone adder will have superior
performance compared to the RCA. Based on the synthesis reports, the delay of the Kogge-Stone adder can be
predicted by the following equation:
tKS = (n+2) LUT + ρKS(n) (4)
where N = 2n
, the adder bit width, LUT is the delay through a lookup table (LUT), and ρKS(n) is the routing delay
of the
Kogge -Stone adder as a function of n. The delay of the RCA can be predicted as:
tRCA = (N – 2) MUX + τRCA (5)
where MUX is the mux delay associated with the fast-carry chain and τRCA is a fixed logic delay. There is
no routing delay assumed for the RCA due to the use of the fast-carry chain. For the Spartan 3E FPGA, the
synthesis reports give
7. Design and implementation of Parallel Prefix Adders using FPGAs
www.iosrjournals.org 47 | Page
the following values: LUT = 0.612 ns, MUX = 0.051 ns, and
τRCA = 1.715 ns. Even though MUX << LUT, it is expected
That the Kogge-Stone adder will eventually be faster than the RCA because N = 2n
, provided that ρKS(
n) grows relatively slower than (N – 2) MUX. Indeed, Table II predicts that the Kogge-Stone adder will have
superior performance at N = 256.
Table II: Delay Results for the Kogge-Stone Adders
N Synth. Route Route Delay Delay
Predict Delay Fitted tKS Trca
4 4.343 1.895 1.852 4.300 1.817
16 6.113 2.441 2.614 6.286 2.429
32 7.607 3.323 3.154 7.438 3.245
64 8.771 3.875 3.800 8.696 4.877
128 10.038 4.530 4.552 10.060 8.141
256 – – 5.410 11.530 14.669
(all delays given in ns)
The second and third columns represent the total predicted delay and the delay due to routing only for
the Kogge-Stone adder from the synthesis reports of the Xilinx ISE software. The fitted routing delay in column
four represents the predicted routing delay using a quadratic polynomial in n based on the N = 4 to 128 data.
This allows the N = 256 routing delay to be predicted with some degree of confidence as an actual Kogge-Stone
adder at this bit width was not synthesized. The final two columns give the predicted adder delays for the
Kogge-Stone and RCA using equations (4) and (5), respectively. The good match between the measured and
simulated data for the implemented Kogge-Stone adders and RCAs gives confidence that the predicted
superiority of the Kogge-Stone adder at the 256 bit
Width is accurate.
This differs from the results in [10], where the parallel-prefix adders, including the Kogge-Stone adder,
always exhibited inferior performance compared with the RCA (simulation results out to 256 bits were
reported). The work in [10] did use a different FPGA (Xilinx Virtex 5), which may account for some of the
differences.
The poor performance of some of the other implemented adders also deserves some comment. The
spanning tree adder is comparable in performance to the Kogge-Stone adder at 16 bits. However, the spanning
tree adder is significantly slower at higher bit widths, according to the simulation results, and slightly slower,
according to the measured data. The structure of the spanning tree adder results in an extra stage of logic for
some adder outputs compared to the Kogge-Stone. This fact coupled with the way the FPGA place and route
software arranges the adder is likely the reason for this significant increase in delay for higher order bit widths.
Similarly, the inferior performance of the carry-skip adders is due to the LUT delay and routing overhead
associated with each carry-skip logic structure. Even if the carry-skip logic could be implemented with the fast-
carry chain, this would just make it equivalent in speed to the RCA. Hence, the RCA delay represents the
theoretical lower limit for a carry-skip architecture on an FPGA.
VI. Summary And Future Work
Both measured and simulation results from this study have shown that parallel-prefix adders are not as
effective as the simple ripple-carry adder at low to moderate bit widths. This is not unexpected as the Xilinx
FPGA has a fast carry chain which optimizes the performance of the ripple carry adder. However, contrary to
other studies, we have indications that the carry-tree adders eventually surpass the performance of the linear
adder designs at high bit-widths, expected to be in the 128 to 256 bit range. This is important for large adders
used in precision arithmetic and cryptographic applications where the addition of numbers on the order of a
thousand bits is not uncommon. Because the adder is often the critical element which determines to a large part
the cycle time and power dissipation for many digital signal processing and cryptographical implementations, it
8. Design and implementation of Parallel Prefix Adders using FPGAs
www.iosrjournals.org 48 | Page
would be worthwhile for future FPGA designs to include an optimized carry path to enable tree-based adder
designs to be optimized for place and routing. This would improve their performance similar to what is found
for the RCA. We plan to explore possible FPGA architectures that could implement a “fast-tree chain” and
investigate the possible trade-offs involved. The built -in redundancy of the Kogge-Stone carry-tree structure
and its implications for fault tolerance in FPGA designs is being studied. The testability and possible fault
tolerant features of the spanning tree adder are also topics for future research.
References
[1] N. H. E. Weste and D. Harris, CMOS VLSI Design, 4th
edition, Pearson–Addison-Wesley, 2011.
[2] R. P. Brent and H. T. Kung, “A regular layout for parallel adders,” IEEE Trans. Comput., vol. C-31, pp. 260-264, 1982.
[3] D. Harris, “A Taxonomy of Parallel Prefix Networks,” in Proc. 37th Asilomar Conf. Signals Systems and Computers, pp. 2213–7,
2003.
[4] P. M. Kogge and H. S. Stone, “A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations,” IEEE
Trans. on Computers, Vol. C-22, No 8, August 1973.
[5] P. Ndai, S. Lu, D. Somesekhar, and K. Roy, “Fine-Grained Redundancy in Adders,” Int. Symp. on Quality Electronic Design, pp.
317-321, March 2007.
[6] T. Lynch and E. E. Swartzlander, “A Spanning Tree Carry Lookahead Adder,” IEEE Trans. on Computers, vol. 41, no. 8, pp. 931-
939, Aug. 1992.
[7] D. Gizopoulos, M. Psarakis, A. Paschalis, and Y. Zorian, “Easily Testable Cellular Carry Lookahead Adders,” Journal of Electronic
Testing: Theory and Applications 19, 285-298, 2003.
[8] S. Xing and W. W. H. Yu, “FPGA Adders: Performance Evaluation and Optimal Design,” IEEE Design & Test of Computers, vol.
15, no. 1, pp. 24-29, Jan. 1998.
[9] M. Bečvář and P. Štukjunger, “Fixed-Point Arithmetic in FPGA,” Acta Polytechnica, vol. 45, no. 2, pp. 67-72, 2005.
[10] K. Vitoroulis and A. J. Al-Khalili, “Performance of Parallel Prefix Adders Implemented with FPGA technology,” IEEE Northeast
Workshop on Circuits and Systems, pp. 498-501, Aug. 2007.