Adders form an almost obligatory component of every contemporary integrated circuit. The prerequisite of the adder is that it is primarily fast and secondarily efficient in terms of power consumption and chip area. Therefore, careful optimization of the adder is of the greatest importance. This optimization can be attained
in two levels; it can be circuit or logic optimization. In circuit optimization the size of transistors are manipulated, where as in logic optimization the Boolean equations are rearranged (or manipulated) to optimize speed, area and power consumption. This paper focuses the optimization of adder through technology independent mapping. The work presents 20 different logical construction of 1-bit adder cell in CMOS logic and its performance is analyzed in terms of transistor count, delay and power dissipation. These performance issues are analyzed through Tanner EDA with TSMC MOSIS 250nm technology. From this analysis the optimized equation is chosen to construct a full adder circuit in terms of multiplexer. This logic optimized multiplexer based adders are incorporated in selected existing adders like ripple carry
adder, carry look-ahead adder, carry skip adder, carry select adder, carry increment adder and carry save adder and its performance is analyzed in terms of area (slices used) and maximum combinational path delay as a function of size. The target FPGA device chosen for the implementation of these adders was Xilinx ISE 12.1 Spartan3E XC3S500-5FG320. Each adder type was implemented with bit sizes of: 8, 16, 32, 64 bits. This variety of sizes will provide with more insight about the performance of each adder in terms of area and delay as a function of size.
International Journal of Engineering Research and DevelopmentIJERD Editor
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
IRJET - High Speed Inexact Speculative Adder using Carry Look Ahead Adder...IRJET Journal
This document presents a novel architecture for a high-speed inexact speculative adder using carry lookahead adder and Brent Kung adder. The proposed adder splits the critical path into shorter paths using fine-grain pipelining to improve speed. It uses a carry lookahead adder and Brent Kung adder in 4-bit blocks to speculate the carry and calculate sums, with error compensation. The design is implemented using a 5-stage pipeline to further reduce delay. Implementation in MAGIC shows the proposed adder achieves higher speed through optimized pipelining of key components in the speculative and compensation blocks.
The document describes a proposed LP-HS logic style and its application in designing an ultra low power high speed multiplier accumulator (MAC) unit for digital signal processing applications. The LP-HS logic is derived from an existing constant delay logic style to reduce power and delay. A MAC unit is designed using both the constant delay logic and proposed LP-HS logic. Simulation results in 45nm, 32nm, 22nm, and 16nm CMOS technologies show the LP-HS logic MAC unit achieves up to 94% reduction in power delay product compared to the constant delay logic MAC unit. Therefore, the proposed LP-HS logic is concluded to be better suited for high performance, low power MAC unit design.
With advancement in CMOS technology, a lot of research has been done to develop various logic styles to improve the performance of logic circuits. D flip-flops (DFF) are fundamental building blocks in almost every sequential logic circuit. Hence, in sequential logic circuits, the overall performance of the circuit is affected by the performance of constituent DFFs. In recent years, the focus has been towards incorporating higher clock rates in a processor for better performance. To achieve high clock rates, fine granularity pipelining techniques are used, which implies that there are relatively a fewer levels of logic in each pipeline stage. A major consequence of this design trend is that the pipeline overhead has becoming more significant. The primary cause of pipeline overhead is the latency of the flip-flop or latch used to design the processor and the clock skew of the system. This calls out for the need of incorporating the logic functionality within the architecture of flip-flop. The new family of flip-flops are called Embedded Logic Flip Flops. In this Paper, we have reviewed various Flip-flop architectures which have been proposed so far. Our attempt is to do a qualitative analysis and comparison of the proposed Embedded logic flip-flop designs.
IRJET- A Survey on Reconstruct Structural Design of FPGAIRJET Journal
This document summarizes research on reconstructing the structural design of field-programmable gate arrays (FPGAs). It discusses how logic block complexity, interconnect structure, hardwired logic blocks, and data path implementation can impact the area and speed of FPGAs. The document analyzes past studies that examined how varying the number of lookup table (LUT) inputs, routing flexibility between blocks, and use of hardwired connections affected the routability and timing of mapped circuits. It also describes how FPGAs can be optimized for logic emulation applications by multiplexing logic units over time through memory-based architectures. In general, the document reviews FPGA architectural parameters and how researchers have iteratively improved designs through simulation and
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMSIAEME Publication
An efficient Priority-Arbiter based Router is designed along with 2X2 and 3X3 mesh
topology based NOC architecture are designed. The Priority –Arbiter based Router
design includes Input registers, Priority arbiter, and XY- Routing algorithm. The
Priority-Arbiter based Router and NOC 2X2 and 3X3 Router designs are synthesized
and implemented using Xilinx ISE Tool and simulated using Modelsim6.5f. The
implementation is done by Artix-7 FPGA device, and the physically debugging of the
NOC 2X2 Router design is verified using Chipscope pro tool. The performance results
are analyzed in terms of the Area (Slices, LUT’s), Timing period, and Maximum
operating frequency. The comparison of the Priority-Arbiter based Router is made
concerning previous similar architecture with improvements.
This document discusses system designing and modeling using field programmable gate arrays (FPGAs). It provides an overview of FPGA architecture, including logic blocks, interconnects, switch boxes, and input/output pads. Programming FPGAs involves using a hardware description language (HDL) like VHDL or Verilog to define the design, which is then synthesized, placed, and routed to the FPGA. Common applications of FPGAs include digital signal processing, image processing, cryptography, and ASIC prototyping. The document provides examples of FPGA components and programming.
FPGA BASED IMPLEMENTATION OF DELAY OPTIMISED DOUBLE PRECISION IEEE FLOATING-P...Somsubhra Ghosh
This document summarizes an algorithm for implementing a double precision floating point adder according to the IEEE 754 standard. The algorithm uses several optimization techniques to reduce latency, including separating the computation into two parallel paths based on the operands and operation, reducing the number of IEEE rounding modes, using a sign-magnitude representation for subtraction, and performing prefix addition of the significands. Analysis using the logical effort model estimates the delay of this optimized design is 30.6 FO4 delays, an improvement over prior designs.
International Journal of Engineering Research and DevelopmentIJERD Editor
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
IRJET - High Speed Inexact Speculative Adder using Carry Look Ahead Adder...IRJET Journal
This document presents a novel architecture for a high-speed inexact speculative adder using carry lookahead adder and Brent Kung adder. The proposed adder splits the critical path into shorter paths using fine-grain pipelining to improve speed. It uses a carry lookahead adder and Brent Kung adder in 4-bit blocks to speculate the carry and calculate sums, with error compensation. The design is implemented using a 5-stage pipeline to further reduce delay. Implementation in MAGIC shows the proposed adder achieves higher speed through optimized pipelining of key components in the speculative and compensation blocks.
The document describes a proposed LP-HS logic style and its application in designing an ultra low power high speed multiplier accumulator (MAC) unit for digital signal processing applications. The LP-HS logic is derived from an existing constant delay logic style to reduce power and delay. A MAC unit is designed using both the constant delay logic and proposed LP-HS logic. Simulation results in 45nm, 32nm, 22nm, and 16nm CMOS technologies show the LP-HS logic MAC unit achieves up to 94% reduction in power delay product compared to the constant delay logic MAC unit. Therefore, the proposed LP-HS logic is concluded to be better suited for high performance, low power MAC unit design.
With advancement in CMOS technology, a lot of research has been done to develop various logic styles to improve the performance of logic circuits. D flip-flops (DFF) are fundamental building blocks in almost every sequential logic circuit. Hence, in sequential logic circuits, the overall performance of the circuit is affected by the performance of constituent DFFs. In recent years, the focus has been towards incorporating higher clock rates in a processor for better performance. To achieve high clock rates, fine granularity pipelining techniques are used, which implies that there are relatively a fewer levels of logic in each pipeline stage. A major consequence of this design trend is that the pipeline overhead has becoming more significant. The primary cause of pipeline overhead is the latency of the flip-flop or latch used to design the processor and the clock skew of the system. This calls out for the need of incorporating the logic functionality within the architecture of flip-flop. The new family of flip-flops are called Embedded Logic Flip Flops. In this Paper, we have reviewed various Flip-flop architectures which have been proposed so far. Our attempt is to do a qualitative analysis and comparison of the proposed Embedded logic flip-flop designs.
IRJET- A Survey on Reconstruct Structural Design of FPGAIRJET Journal
This document summarizes research on reconstructing the structural design of field-programmable gate arrays (FPGAs). It discusses how logic block complexity, interconnect structure, hardwired logic blocks, and data path implementation can impact the area and speed of FPGAs. The document analyzes past studies that examined how varying the number of lookup table (LUT) inputs, routing flexibility between blocks, and use of hardwired connections affected the routability and timing of mapped circuits. It also describes how FPGAs can be optimized for logic emulation applications by multiplexing logic units over time through memory-based architectures. In general, the document reviews FPGA architectural parameters and how researchers have iteratively improved designs through simulation and
FPGA IMPLEMENTATION OF PRIORITYARBITER BASED ROUTER DESIGN FOR NOC SYSTEMSIAEME Publication
An efficient Priority-Arbiter based Router is designed along with 2X2 and 3X3 mesh
topology based NOC architecture are designed. The Priority –Arbiter based Router
design includes Input registers, Priority arbiter, and XY- Routing algorithm. The
Priority-Arbiter based Router and NOC 2X2 and 3X3 Router designs are synthesized
and implemented using Xilinx ISE Tool and simulated using Modelsim6.5f. The
implementation is done by Artix-7 FPGA device, and the physically debugging of the
NOC 2X2 Router design is verified using Chipscope pro tool. The performance results
are analyzed in terms of the Area (Slices, LUT’s), Timing period, and Maximum
operating frequency. The comparison of the Priority-Arbiter based Router is made
concerning previous similar architecture with improvements.
This document discusses system designing and modeling using field programmable gate arrays (FPGAs). It provides an overview of FPGA architecture, including logic blocks, interconnects, switch boxes, and input/output pads. Programming FPGAs involves using a hardware description language (HDL) like VHDL or Verilog to define the design, which is then synthesized, placed, and routed to the FPGA. Common applications of FPGAs include digital signal processing, image processing, cryptography, and ASIC prototyping. The document provides examples of FPGA components and programming.
FPGA BASED IMPLEMENTATION OF DELAY OPTIMISED DOUBLE PRECISION IEEE FLOATING-P...Somsubhra Ghosh
This document summarizes an algorithm for implementing a double precision floating point adder according to the IEEE 754 standard. The algorithm uses several optimization techniques to reduce latency, including separating the computation into two parallel paths based on the operands and operation, reducing the number of IEEE rounding modes, using a sign-magnitude representation for subtraction, and performing prefix addition of the significands. Analysis using the logical effort model estimates the delay of this optimized design is 30.6 FO4 delays, an improvement over prior designs.
The document discusses programmable logic arrays (PLAs) and their minimization and testing. It describes how PLAs can be used to implement combinational and sequential logic. PLA minimization techniques include removing redundant product terms and raising terms to optimize area usage. Folding is also described as a technique to minimize the PLA by allowing columns and rows to be shared, reducing the overall size. Column and row folding algorithms are discussed as well as their complexity.
FPGA Implementation of Viterbi Decoder using Hybrid Trace Back and Register E...ijsrd.com
Error correction is an integral part of any communication system and for this purpose, the convolution codes are widely used as forward error correction codes. For decoding of convolution codes, at the receiver end Viterbi Decoder is being employed. The parameters of Viterbi algorithm can be changed to suit a specific application. The high speed and small area are two important design parameters in today’s wireless technology. In this paper, a high speed feed forward viterbi decoder has been designed using hybrid track back and register exchange architecture and embedded BRAM of target FPGA. The proposed viterbi decoder has been designed with Matlab, simulated with Xilinx ISE 8.1i Tool, synthesized with Xilinx Synthesis Tool (XST), and implemented on Xilinx Virtex4 based xc4vlx15 FPGA device. The results show that the proposed design can operate at an estimated frequency of 107.7 MHz by consuming considerably less resources on target device to provide cost effective solution for wireless applications.
The signal processing algorithms can be implemented on hardware using various strategies such as DSP processors and ASIC. This PPT compares and contrasts the two methods.
SYNTHESIS OPTIMIZATION FOR FINITE STATE MACHINE DESIGN IN FPGASVLSICS Design
Synthesis optimization plays a vital role in modern FPGAs in order to achieve high performance, in terms of resource utilization and reducing time consuming test process. Cell-based design techniques, such as standard-cells and FPGAs, together with versatile hardware synthesis are rudiments for a high productivity in ASIC design. As the capacity of FPGAs increases, synthesis tools and efficient synthesis methods for targeted device become more significant to efficiently exploit the resources and logic capacity. The synthesis tool provides the selection of different constraint to optimize the circuit. This paper presents the design and synthesis optimization constraints in FPGA for Finite state machine. The primary goal of this sequential logic design is to optimize the speed and area by choosing the proper options available in the synthesis tool. More over the work focuses the design of FSM with more processes operates at a faster rate and the number of slices utilized in an FPGA is also reduced when compare to single process. The module functionality are described using Verilog HDL and performance issues like slice utilized, simulation time, percentage of logic utilization, level of logic are analyzed at 90 nm process technology using SPARTAN6 XC6SLX150 XILINX ISE12.1 tool.
This document summarizes a student's MASc research on developing an area-efficient FPGA architecture for datapath circuits. It proposes combining bus-based and bit-based routing to better utilize multibit computing elements. Simulation results show the multi-bit logic block approach reduces routing area by 14% compared to conventional FPGAs. Future work involves exploring directional single-driver wires which could further reduce area by 25% and delay by 9% on average. The student seeks feedback on modifications to the CAD flow needed to support the new architectural features.
FPGA based Efficient Interpolator design using DALUT Algorithmcscpconf
The document describes the design and implementation of an efficient interpolator for wireless communication systems using FPGA. It proposes a multiplier-less technique using distributed arithmetic look-up tables (DALUT) that replaces multiply-accumulate operations with LUT accesses. A 66th-order half-band polyphase FIR structure is implemented using the DALUT approach on Spartan-3E and Virtex2Pro FPGAs. Results show the proposed design achieves maximum frequencies of 92.859MHz on Virtex Pro and 61.6MHz on Spartan 3E while consuming fewer resources than a traditional MAC-based design.
Overview of signal integrity simulation for sfp+ interface serial links with ...Conference Papers
The document discusses signal integrity simulation for high-speed SFP+ serial links. It begins with an overview of the challenges of high-speed transmission and the need for simulation. It then covers:
1) The IBIS-AMI model structure and configuration options for simulating transmitters and receivers.
2) The simulation setup created in ADS to optimize equalization settings for a specific SFP+ interface design.
3) The results of simulations at 10.3125Gbps and 16Gbps, showing larger eye openings at the lower data rate.
4) Additional transmitter simulation results, indicating better eye openings when optimizing de-emphasis levels and voltage amplitudes.
EFFICIENT ABSOLUTE DIFFERENCE CIRCUIT FOR SAD COMPUTATION ON FPGAVLSICS Design
Video Compression is very essential to meet the technological demands such as low power, less memory and fast transfer rate for different range of devices and for various multimedia applications. Video compression is primarily achieved by Motion Estimation (ME) process in any video encoder which contributes to significant compression gain.Sum of Absolute Difference (SAD) is used as distortion metric in ME process.In this paper, efficient Absolute Difference(AD)circuit is proposed which uses Brent Kung Adder(BKA) and a comparator based on modified 1’s complement principle and conditional sum adder scheme. Results shows that proposed architecture reduces delay by 15% and number of slice LUTs by 42 % as compared to conventional architecture. Simulation and synthesis are done on Xilinx ISE 14.2 using Virtex 7 FPGA.
https://technoelectronics44.blogspot.com/
GDI TECHNOLOGY, here you get GDI implementation and design of GDI based gates AND, OR, XOR, and Adders like CLA, CIA, CSKA, performance analysis of CMOS And GDI
The document provides an overview of FPGA routing, which is an important step in the CAD process that connects logic blocks placed on the FPGA. It discusses the routing resources in Xilinx FPGAs including connection boxes, switch boxes, and wire segments. It also describes the FPGA routing model commonly used in academia, which simplifies the island-style architecture of commercial FPGAs. Efficient routing aims to minimize wiring area and critical path lengths to improve circuit performance.
International Journal of Computational Engineering Research(IJCER)ijceronline
This document summarizes a research paper on designing a high-speed arithmetic architecture for a parallel multiplier-accumulator based on the radix-2 modified Booth algorithm. It presents the design of a modified Booth multiplier using a carry lookahead adder for high accuracy with a fixed width. It also proposes a high-speed MAC architecture that improves performance by eliminating the accumulator and modifying the carry-save adder to add lower bits in advance, reducing the number of inputs to the final adder. The proposed CSA architecture can efficiently implement the operations of the new MAC arithmetic.
Hardware simulation for exponential blind equal throughput algorithm using sy...IJECEIAES
Scheduling mechanism is the process of allocating radio resources to User Equipment (UE) that transmits different flows at the same time. It is performed by the scheduling algorithm implemented in the Long Term Evolution base station, Evolved Node B. Normally, most of the proposed algorithms are not focusing on handling the real-time and non-real-time traffics simultaneously. Thus, UE with bad channel quality may starve due to no resources allocated for quite a long time. To solve the problems, Exponential Blind Equal Throughput (EXP-BET) algorithm is proposed. User with the highest priority metrics is allocated the resources firstly which is calculated using the EXP-BET metric equation. This study investigates the implementation of the EXP-BET scheduling algorithm on the FPGA platform. The metric equation of the EXP-BET is modelled and simulated using System Generator. This design has utilized only 10% of available resources on FPGA. Fixed numbers are used for all the input to the scheduler. The system verification is performed by simulating the hardware co-simulation for the metric value of the EXP-BET metric algorithm. The output from the hardware co-simulation showed that the metric values of EXP-BET produce similar results to the Simulink environment. Thus, the algorithm is ready for prototyping and Virtex-6 FPGA is chosen as the platform.
UNIT-III CASE STUDIES -FPGA & CPGA ARCHITECTURES APPLICATIONSDr.YNM
voltage circuits from the programming voltage.
This document discusses different types of programming technologies used in field programmable gate arrays (FPGAs). It describes SRAM-based programming technology, which is the most commonly used technology due to its re-programmability and use of standard CMOS processes. Flash programming technology and anti-fuse programming technology are also discussed. Each technology has advantages and disadvantages related to factors like area efficiency, volatility, re-programmability, and process requirements. The document provides detailed information on how each technology works at a circuit level.
This document presents a sequential quadratic programming (SQP) algorithm for sizing clock meshes to minimize area while meeting skew constraints. The algorithm uses adjoint sensitivity analysis and a compact gate model to efficiently compute sensitivities. It formulates and solves a quadratic programming subproblem at each iteration to determine wire width updates. Experimental results on ISCAS and ISPD benchmarks show up to 33% reduction in clock mesh area compared to initial designs. Future work will extend the approach to simultaneously size interconnects and buffers.
Design and analysis of optimized CORDIC based GMSK system on FPGA platform IJECEIAES
The gaussian minimum shift keying (GMSK) is one of the best suited digital modulation schemes in the global system for mobile communication (GSM) because of its constant envelop and spectral efficiency characteristics. Most of the conventional GMSK approaches failed to balance the digital modulation with efficient usage of spectrum. In this article, the hardware architecture of the optimized CORDIC-based GMSK system is designed, which includes GMSK Modulation with the channel and GMSK Demodulation. The modulation consists of non-return zero (NRZ) encoder, an integrator followed by Gaussian filtering and frequency modulation (FM). The GMSK demodulation consists of FM demodulator, followed by differentiation and NRZ decoder. The FM Modulation and demodulation use the optimized CORDIC model for an In-phase (I) and quadrature (Q) phase generation. The optimized CORDIC is designed by using quadrant mapping and pipelined structure to improve the hardware and computational complexity in GMSK systems. The GMSK system is designed on the Xilinx platform and implemented on Artix-7 and Spartan-3EFPGA. The hardware constraints like area, power, and timing utilization are summarized. The comparison of the optimized CORDIC model with similar CORDIC approaches is tabulated with improvements.
Comparison study of 8-PPM, 8-DPIM, and 8-RDH-PIM modulator FPGA hardware desi...journalBEEI
In this paper, a performance study of 8-Pulse-Position Modulation (PPM), 8-Digital Pulse Interval Modulation (DPIM), and 8-Reverse Dual Header-Pulse Interval Modulation (RDH-PIM) implementation in Verilog hardware design language is presented. The hardware design is chosen over software design since it could provide much more flexibility in term of transmission rate and reduce the workload of the processor in the complete system. Using 50 MHz clock as the reference data clock speeds, the transmission rate recorded are 11.11 Msymbol/second or 33.33 Mbps, 9.09 Msymbol/s or 27.27 Mbps, and 6.25 Msymbol/s or 18.75 Mbps for 8-RDH-PIM, 8-DPIM, and 8-PPM respectively. We conclude that 8-RDH-PIM modulator design provides better performance in term of bandwidth utilization and transmission rate as compared to 8-PPM and 8-DPIM.
This document summarizes a seminar on FPGA, CPLD, and VHDL programming basics. The seminar schedule includes sessions on FPGA technologies compared to previous programmable devices like CPLD, Microsemi FPGA devices and VHDL introduction. There is also an application example of using an FPGA for an Ethernet bus interface board and a discussion of current trends and technologies.
Reconfigurable CORDIC Low-Power Implementation of Complex Signal Processing f...Editor IJMTER
This document describes a proposed low-power CORDIC-based DCT architecture that prioritizes processing of low-frequency DCT coefficients over high-frequency coefficients to reduce power consumption with minimal image quality degradation. It uses a look-ahead CORDIC approach to allow varying the number of CORDIC iterations for different coefficients. Experimental results show the proposed architecture achieves 38.1% area and power savings compared to DA-based DCT, with comparable power to MCM-based DCT but using 100% less area and a minor 0.04dB quality loss.
Low Power-Area Design of Full Adder Using Self Resetting Logic with GDI Techn...VLSICS Design
Various electronic devices such as mobile phones, DSPs,ALU etc., are designed by using VLSI (Very
Large Scale Integration) technology. In VLSI dynamic CMOS logic circuits are concentrating on the Area
,reducing the power consumption and increasing the Speed by reducing the delay. ALU (Arithmetic Logic
Circuits) are designed by using adder, subtractors, multiplier, divider, etc.Various adder circuits designs
have been proposed over last few years with different logic styles. To reduce the power consumption
several parameters are to be taken into account, such as feedthrough, leakage power single-event upsets,
charge sharing by parasitic components while connecting source and drain of CMOS transistors There are
situations in a logic that permit the use of circuits that can automatically precharge themselves (i.e., reset
themselves) after some prescribed delays. These circuits are hence called postcharge or self-resetting logic
which are widely used in dynamic logic circuits. Overall performance of various adder designs is
evaluated by using Tanner tool . The earlier and the proposed SRLGDI primitives are simulated using
Tanner EDA with BSIM 0.250 lm technology with supply voltage ranging from 0 V to 5 V in steps of 0.2 V.
On comparing the various SRLGDI logic adders, the proposed adder shows low power, delay and low
PDP among its counterparts.
DESIGN AND PERFORMANCE ANALYSIS OF HYBRID ADDERS FOR HIGH SPEED ARITHMETIC CI...VLSICS Design
Adder cells using Gate Diffusion Technique (GDI) & PTL-GDI technique are described in this paper. GDI technique allows reducing power consumption, propagation delay and low PDP (power delay product) whereas Pass Transistor Logic (PTL) reduces the count of transistors used to make different logic gates, by eliminating redundant transistors. Performance comparison with various Hybrid Adder is been presented. In this paper, we propose two new designs based on GDI & PTL techniques, which is found to be much more power efficient in comparison with existing design technique. Only 10 transistors are used to implement the SUM & CARRY function for both the designs. The SUM and CARRY cell are implemented in a cascaded way i.e. firstly the XOR cell is implemented and then using XOR as input SUM as well as CARRY cell is implemented. For Proposed GDI adder the SUM as well as CARRY cell is designed using GDI technique. On the other hand in Proposed PTL-GDI adder the SUM cell is constructed using PTL technique and the CARRY cell is designed using GDI technique. The advantages of both the designs are discussed. The significance of these designs is substantiated by the simulation results obtained from Cadence Virtuoso 180nm environment.
Iaetsd design and simulation of high speed cmos full adder (2)Iaetsd Iaetsd
The document describes the design and simulation of high speed CMOS full adders. It analyzes different logic styles for full adder design including SR-CPL and transmission gate styles. Simulation results show that the transmission gate full adder has lower power dissipation of 4.876595e-008 watts compared to 2.309128e-005 watts for the SR-CPL full adder, demonstrating that transmission gate style is better for low power applications.
Leakage power optimization for ripple carry adder NAVEEN TOKAS
This document discusses the design of a ripple carry adder using a 3T XOR gate. It begins by providing background on ripple carry adders and how they are constructed from cascaded full adders. It then describes how full adders can be built from XOR gates. The author proposes designing a low-power 3T XOR gate to serve as the building block for an improved full adder circuit. This new full adder design aims to reduce area, complexity, power consumption and delay compared to previous adder designs. To test the design, the author created a 3T XOR gate simulation using Mentor Graphics software at the 35nm technology node.
The document discusses programmable logic arrays (PLAs) and their minimization and testing. It describes how PLAs can be used to implement combinational and sequential logic. PLA minimization techniques include removing redundant product terms and raising terms to optimize area usage. Folding is also described as a technique to minimize the PLA by allowing columns and rows to be shared, reducing the overall size. Column and row folding algorithms are discussed as well as their complexity.
FPGA Implementation of Viterbi Decoder using Hybrid Trace Back and Register E...ijsrd.com
Error correction is an integral part of any communication system and for this purpose, the convolution codes are widely used as forward error correction codes. For decoding of convolution codes, at the receiver end Viterbi Decoder is being employed. The parameters of Viterbi algorithm can be changed to suit a specific application. The high speed and small area are two important design parameters in today’s wireless technology. In this paper, a high speed feed forward viterbi decoder has been designed using hybrid track back and register exchange architecture and embedded BRAM of target FPGA. The proposed viterbi decoder has been designed with Matlab, simulated with Xilinx ISE 8.1i Tool, synthesized with Xilinx Synthesis Tool (XST), and implemented on Xilinx Virtex4 based xc4vlx15 FPGA device. The results show that the proposed design can operate at an estimated frequency of 107.7 MHz by consuming considerably less resources on target device to provide cost effective solution for wireless applications.
The signal processing algorithms can be implemented on hardware using various strategies such as DSP processors and ASIC. This PPT compares and contrasts the two methods.
SYNTHESIS OPTIMIZATION FOR FINITE STATE MACHINE DESIGN IN FPGASVLSICS Design
Synthesis optimization plays a vital role in modern FPGAs in order to achieve high performance, in terms of resource utilization and reducing time consuming test process. Cell-based design techniques, such as standard-cells and FPGAs, together with versatile hardware synthesis are rudiments for a high productivity in ASIC design. As the capacity of FPGAs increases, synthesis tools and efficient synthesis methods for targeted device become more significant to efficiently exploit the resources and logic capacity. The synthesis tool provides the selection of different constraint to optimize the circuit. This paper presents the design and synthesis optimization constraints in FPGA for Finite state machine. The primary goal of this sequential logic design is to optimize the speed and area by choosing the proper options available in the synthesis tool. More over the work focuses the design of FSM with more processes operates at a faster rate and the number of slices utilized in an FPGA is also reduced when compare to single process. The module functionality are described using Verilog HDL and performance issues like slice utilized, simulation time, percentage of logic utilization, level of logic are analyzed at 90 nm process technology using SPARTAN6 XC6SLX150 XILINX ISE12.1 tool.
This document summarizes a student's MASc research on developing an area-efficient FPGA architecture for datapath circuits. It proposes combining bus-based and bit-based routing to better utilize multibit computing elements. Simulation results show the multi-bit logic block approach reduces routing area by 14% compared to conventional FPGAs. Future work involves exploring directional single-driver wires which could further reduce area by 25% and delay by 9% on average. The student seeks feedback on modifications to the CAD flow needed to support the new architectural features.
FPGA based Efficient Interpolator design using DALUT Algorithmcscpconf
The document describes the design and implementation of an efficient interpolator for wireless communication systems using FPGA. It proposes a multiplier-less technique using distributed arithmetic look-up tables (DALUT) that replaces multiply-accumulate operations with LUT accesses. A 66th-order half-band polyphase FIR structure is implemented using the DALUT approach on Spartan-3E and Virtex2Pro FPGAs. Results show the proposed design achieves maximum frequencies of 92.859MHz on Virtex Pro and 61.6MHz on Spartan 3E while consuming fewer resources than a traditional MAC-based design.
Overview of signal integrity simulation for sfp+ interface serial links with ...Conference Papers
The document discusses signal integrity simulation for high-speed SFP+ serial links. It begins with an overview of the challenges of high-speed transmission and the need for simulation. It then covers:
1) The IBIS-AMI model structure and configuration options for simulating transmitters and receivers.
2) The simulation setup created in ADS to optimize equalization settings for a specific SFP+ interface design.
3) The results of simulations at 10.3125Gbps and 16Gbps, showing larger eye openings at the lower data rate.
4) Additional transmitter simulation results, indicating better eye openings when optimizing de-emphasis levels and voltage amplitudes.
EFFICIENT ABSOLUTE DIFFERENCE CIRCUIT FOR SAD COMPUTATION ON FPGAVLSICS Design
Video Compression is very essential to meet the technological demands such as low power, less memory and fast transfer rate for different range of devices and for various multimedia applications. Video compression is primarily achieved by Motion Estimation (ME) process in any video encoder which contributes to significant compression gain.Sum of Absolute Difference (SAD) is used as distortion metric in ME process.In this paper, efficient Absolute Difference(AD)circuit is proposed which uses Brent Kung Adder(BKA) and a comparator based on modified 1’s complement principle and conditional sum adder scheme. Results shows that proposed architecture reduces delay by 15% and number of slice LUTs by 42 % as compared to conventional architecture. Simulation and synthesis are done on Xilinx ISE 14.2 using Virtex 7 FPGA.
https://technoelectronics44.blogspot.com/
GDI TECHNOLOGY, here you get GDI implementation and design of GDI based gates AND, OR, XOR, and Adders like CLA, CIA, CSKA, performance analysis of CMOS And GDI
The document provides an overview of FPGA routing, which is an important step in the CAD process that connects logic blocks placed on the FPGA. It discusses the routing resources in Xilinx FPGAs including connection boxes, switch boxes, and wire segments. It also describes the FPGA routing model commonly used in academia, which simplifies the island-style architecture of commercial FPGAs. Efficient routing aims to minimize wiring area and critical path lengths to improve circuit performance.
International Journal of Computational Engineering Research(IJCER)ijceronline
This document summarizes a research paper on designing a high-speed arithmetic architecture for a parallel multiplier-accumulator based on the radix-2 modified Booth algorithm. It presents the design of a modified Booth multiplier using a carry lookahead adder for high accuracy with a fixed width. It also proposes a high-speed MAC architecture that improves performance by eliminating the accumulator and modifying the carry-save adder to add lower bits in advance, reducing the number of inputs to the final adder. The proposed CSA architecture can efficiently implement the operations of the new MAC arithmetic.
Hardware simulation for exponential blind equal throughput algorithm using sy...IJECEIAES
Scheduling mechanism is the process of allocating radio resources to User Equipment (UE) that transmits different flows at the same time. It is performed by the scheduling algorithm implemented in the Long Term Evolution base station, Evolved Node B. Normally, most of the proposed algorithms are not focusing on handling the real-time and non-real-time traffics simultaneously. Thus, UE with bad channel quality may starve due to no resources allocated for quite a long time. To solve the problems, Exponential Blind Equal Throughput (EXP-BET) algorithm is proposed. User with the highest priority metrics is allocated the resources firstly which is calculated using the EXP-BET metric equation. This study investigates the implementation of the EXP-BET scheduling algorithm on the FPGA platform. The metric equation of the EXP-BET is modelled and simulated using System Generator. This design has utilized only 10% of available resources on FPGA. Fixed numbers are used for all the input to the scheduler. The system verification is performed by simulating the hardware co-simulation for the metric value of the EXP-BET metric algorithm. The output from the hardware co-simulation showed that the metric values of EXP-BET produce similar results to the Simulink environment. Thus, the algorithm is ready for prototyping and Virtex-6 FPGA is chosen as the platform.
UNIT-III CASE STUDIES -FPGA & CPGA ARCHITECTURES APPLICATIONSDr.YNM
voltage circuits from the programming voltage.
This document discusses different types of programming technologies used in field programmable gate arrays (FPGAs). It describes SRAM-based programming technology, which is the most commonly used technology due to its re-programmability and use of standard CMOS processes. Flash programming technology and anti-fuse programming technology are also discussed. Each technology has advantages and disadvantages related to factors like area efficiency, volatility, re-programmability, and process requirements. The document provides detailed information on how each technology works at a circuit level.
This document presents a sequential quadratic programming (SQP) algorithm for sizing clock meshes to minimize area while meeting skew constraints. The algorithm uses adjoint sensitivity analysis and a compact gate model to efficiently compute sensitivities. It formulates and solves a quadratic programming subproblem at each iteration to determine wire width updates. Experimental results on ISCAS and ISPD benchmarks show up to 33% reduction in clock mesh area compared to initial designs. Future work will extend the approach to simultaneously size interconnects and buffers.
Design and analysis of optimized CORDIC based GMSK system on FPGA platform IJECEIAES
The gaussian minimum shift keying (GMSK) is one of the best suited digital modulation schemes in the global system for mobile communication (GSM) because of its constant envelop and spectral efficiency characteristics. Most of the conventional GMSK approaches failed to balance the digital modulation with efficient usage of spectrum. In this article, the hardware architecture of the optimized CORDIC-based GMSK system is designed, which includes GMSK Modulation with the channel and GMSK Demodulation. The modulation consists of non-return zero (NRZ) encoder, an integrator followed by Gaussian filtering and frequency modulation (FM). The GMSK demodulation consists of FM demodulator, followed by differentiation and NRZ decoder. The FM Modulation and demodulation use the optimized CORDIC model for an In-phase (I) and quadrature (Q) phase generation. The optimized CORDIC is designed by using quadrant mapping and pipelined structure to improve the hardware and computational complexity in GMSK systems. The GMSK system is designed on the Xilinx platform and implemented on Artix-7 and Spartan-3EFPGA. The hardware constraints like area, power, and timing utilization are summarized. The comparison of the optimized CORDIC model with similar CORDIC approaches is tabulated with improvements.
Comparison study of 8-PPM, 8-DPIM, and 8-RDH-PIM modulator FPGA hardware desi...journalBEEI
In this paper, a performance study of 8-Pulse-Position Modulation (PPM), 8-Digital Pulse Interval Modulation (DPIM), and 8-Reverse Dual Header-Pulse Interval Modulation (RDH-PIM) implementation in Verilog hardware design language is presented. The hardware design is chosen over software design since it could provide much more flexibility in term of transmission rate and reduce the workload of the processor in the complete system. Using 50 MHz clock as the reference data clock speeds, the transmission rate recorded are 11.11 Msymbol/second or 33.33 Mbps, 9.09 Msymbol/s or 27.27 Mbps, and 6.25 Msymbol/s or 18.75 Mbps for 8-RDH-PIM, 8-DPIM, and 8-PPM respectively. We conclude that 8-RDH-PIM modulator design provides better performance in term of bandwidth utilization and transmission rate as compared to 8-PPM and 8-DPIM.
This document summarizes a seminar on FPGA, CPLD, and VHDL programming basics. The seminar schedule includes sessions on FPGA technologies compared to previous programmable devices like CPLD, Microsemi FPGA devices and VHDL introduction. There is also an application example of using an FPGA for an Ethernet bus interface board and a discussion of current trends and technologies.
Reconfigurable CORDIC Low-Power Implementation of Complex Signal Processing f...Editor IJMTER
This document describes a proposed low-power CORDIC-based DCT architecture that prioritizes processing of low-frequency DCT coefficients over high-frequency coefficients to reduce power consumption with minimal image quality degradation. It uses a look-ahead CORDIC approach to allow varying the number of CORDIC iterations for different coefficients. Experimental results show the proposed architecture achieves 38.1% area and power savings compared to DA-based DCT, with comparable power to MCM-based DCT but using 100% less area and a minor 0.04dB quality loss.
Low Power-Area Design of Full Adder Using Self Resetting Logic with GDI Techn...VLSICS Design
Various electronic devices such as mobile phones, DSPs,ALU etc., are designed by using VLSI (Very
Large Scale Integration) technology. In VLSI dynamic CMOS logic circuits are concentrating on the Area
,reducing the power consumption and increasing the Speed by reducing the delay. ALU (Arithmetic Logic
Circuits) are designed by using adder, subtractors, multiplier, divider, etc.Various adder circuits designs
have been proposed over last few years with different logic styles. To reduce the power consumption
several parameters are to be taken into account, such as feedthrough, leakage power single-event upsets,
charge sharing by parasitic components while connecting source and drain of CMOS transistors There are
situations in a logic that permit the use of circuits that can automatically precharge themselves (i.e., reset
themselves) after some prescribed delays. These circuits are hence called postcharge or self-resetting logic
which are widely used in dynamic logic circuits. Overall performance of various adder designs is
evaluated by using Tanner tool . The earlier and the proposed SRLGDI primitives are simulated using
Tanner EDA with BSIM 0.250 lm technology with supply voltage ranging from 0 V to 5 V in steps of 0.2 V.
On comparing the various SRLGDI logic adders, the proposed adder shows low power, delay and low
PDP among its counterparts.
DESIGN AND PERFORMANCE ANALYSIS OF HYBRID ADDERS FOR HIGH SPEED ARITHMETIC CI...VLSICS Design
Adder cells using Gate Diffusion Technique (GDI) & PTL-GDI technique are described in this paper. GDI technique allows reducing power consumption, propagation delay and low PDP (power delay product) whereas Pass Transistor Logic (PTL) reduces the count of transistors used to make different logic gates, by eliminating redundant transistors. Performance comparison with various Hybrid Adder is been presented. In this paper, we propose two new designs based on GDI & PTL techniques, which is found to be much more power efficient in comparison with existing design technique. Only 10 transistors are used to implement the SUM & CARRY function for both the designs. The SUM and CARRY cell are implemented in a cascaded way i.e. firstly the XOR cell is implemented and then using XOR as input SUM as well as CARRY cell is implemented. For Proposed GDI adder the SUM as well as CARRY cell is designed using GDI technique. On the other hand in Proposed PTL-GDI adder the SUM cell is constructed using PTL technique and the CARRY cell is designed using GDI technique. The advantages of both the designs are discussed. The significance of these designs is substantiated by the simulation results obtained from Cadence Virtuoso 180nm environment.
Iaetsd design and simulation of high speed cmos full adder (2)Iaetsd Iaetsd
The document describes the design and simulation of high speed CMOS full adders. It analyzes different logic styles for full adder design including SR-CPL and transmission gate styles. Simulation results show that the transmission gate full adder has lower power dissipation of 4.876595e-008 watts compared to 2.309128e-005 watts for the SR-CPL full adder, demonstrating that transmission gate style is better for low power applications.
Leakage power optimization for ripple carry adder NAVEEN TOKAS
This document discusses the design of a ripple carry adder using a 3T XOR gate. It begins by providing background on ripple carry adders and how they are constructed from cascaded full adders. It then describes how full adders can be built from XOR gates. The author proposes designing a low-power 3T XOR gate to serve as the building block for an improved full adder circuit. This new full adder design aims to reduce area, complexity, power consumption and delay compared to previous adder designs. To test the design, the author created a 3T XOR gate simulation using Mentor Graphics software at the 35nm technology node.
This document discusses the I2C bus protocol and its implementation on an FPGA to interface with low speed peripheral devices. It also provides background on VLSI design, including the evolution of integration density over time, the VLSI design flow from behavioral to layout representations, and historical context on increasing processing power needs driving advances in integration technologies. The I2C protocol allows communication between multiple chips using only two pins, addressing the need for lower pin counts as chip sizes decrease. The document implements I2C on an FPGA to interface with a DS1307 peripheral and synthesizes it on a Spartan 3E chip.
Optimized Design of 2D Mesh NOC Router using Custom SRAM & Common Buffer Util...VLSICS Design
With the shrinking technology, reduced scale and power-hungry chip IO leads to System on Chip. The design of SOC using traditional standard bus scheme encounters with issues like non-uniform delay and routing problems. Crossbars could scale better when compared to buses but tend to become huge with increasing number of nodes. NOC has become the design paradigm for SOC design for its highly regularized interconnect structure, good scalability and linear design effort. The main components of an NoC topology are the network adapters, routing nodes, and network interconnect links. This paper mainly deals with the implementation of full custom SRAM based arrays over D FF based register arrays in the design of input module of routing node in 2D mesh NOC topology. The custom SRAM blocks replace DFF(D flip flop) memory implementations to optimize area and power of the input block. Full custom design of SRAMs has been carried out by MILKYWAY, while physical implementation of the input module with SRAMs has been carried out by IC Compiler of SYNOPSYS.The improved design occupies approximately 30% of the area of the original design. This is in conformity to the ratio of the area of an SRAM cell to the area of a D flip flop, which is approximately 6:28.The power consumption is almost halved to 1.5 mW. Maximum operating frequency is improved from 50 MHz to 200 MHz. It is intended to study and quantify the behavior of the single packet array design in relation to the multiple packet array design. Intuitively, a
common packet buffer would result in better utilization of available buffer space. This in turn would translate into lower delays in transmission. A MATLAB model is used to show quantitatively how performance is improved in a common packet array design.
Optimized Design of 2D Mesh NOC Router using Custom SRAM & Common Buffer Util...VLSICS Design
With the shrinking technology, reduced scale and power-hungry chip IO leads to System on Chip. The design of SOC using traditional standard bus scheme encounters with issues like non-uniform delay and routing problems. Crossbars could scale better when compared to buses but tend to become huge with increasing number of nodes. NOC has become the design paradigm for SOC design for its highly regularized interconnect structure, good scalability and linear design effort. The main components of an NoC topology are the network adapters, routing nodes, and network interconnect links. This paper mainly deals with the implementation of full custom SRAM based arrays over D FF based register arrays in the design of input module of routing node in 2D mesh NOC topology. The custom SRAM blocks replace D FF(D flip flop) memory implementations to optimize area and power of the input block. Full custom design of SRAMs has been carried out by MILKYWAY, while physical implementation of the input module with SRAMs has been carried out by IC Compiler of SYNOPSYS.The improved design occupies approximately 30% of the area of the original design. This is in conformity to the ratio of the area of an SRAM cell to the area of a D flip flop, which is approximately 6:28.The power consumption is almost halved to 1.5 mW. Maximum operating frequency is improved from 50 MHz to 200 MHz. It is intended to study and quantify the behavior of the single packet array design in relation to the multiple packet array design. Intuitively, a
common packet buffer would result in better utilization of available buffer space. This in turn would translate into lower delays in transmission. A MATLAB model is used to show quantitatively how performance is improved in a common packet array design.
Design of high speed adders for efficient digital design blocksBharath Chary
This document discusses the design of high-speed adders for efficient digital design blocks. It compares parallel-prefix adders like Kogge-Stone, Brent-Kung, Sklansky, and Kogge-Stone Ling adders. The Kogge-Stone Ling adder is found to perform most efficiently. Different tree adder structures are designed using CMOS logic and transmission gate logic and compared in terms of area, delay, and power consumption. Ling adders are analyzed, and it is shown that they reduce dependency on previous bit additions compared to other adders.
EFFICIENT ABSOLUTE DIFFERENCE CIRCUIT FOR SAD COMPUTATION ON FPGAVLSICS Design
Video Compression is very essential to meet the technological demands such as low power, less memory
and fast transfer rate for different range of devices and for various multimedia applications. Video
compression is primarily achieved by Motion Estimation (ME) process in any video encoder which
contributes to significant compression gain.Sum of Absolute Difference (SAD) is used as distortion metric
in ME process.In this paper, efficient Absolute Difference(AD)circuit is proposed which uses Brent Kung
Adder(BKA) and a comparator based on modified 1’s complement principle and conditional sum adder
scheme. Results shows that proposed architecture reduces delay by 15% and number of slice LUTs by 42
% as compared to conventional architecture. Simulation and synthesis are done on Xilinx ISE 14.2 using
Virtex 7 FPGA.
FPGA based Efficient Interpolator design using DALUT Algorithmcscpconf
Interpolator is an important sampling device used for multirate filtering to
provide signal processing in wireless communication system. There are many
applications in which sampling rate must be changed. Interpolators and decimators are
utilized to increase or decrease the sampling rate. In this paper an efficient method has
been presented to implement high speed and area efficient interpolator for wireless
communication systems. A multiplier less technique is used which substitutes multiplyand-accumulate
operations with look up table (LUT) accesses. Interpolator has been
implemented using Partitioned distributed arithmetic look up table (DALUT)
technique. This technique has been used to take an optimal advantage of embedded
LUTs of the target FPGA. This method is useful to enhance the system performance in
terms of speed and area. The proposed interpolator has been designed using half band
poly phase FIR structure with Matlab, simulated with ISE, synthesized with Xilinx
Synthesis Tools (XST) and implemented on Spartan-3E and Virtex2pro device. The
proposed LUT based multiplier less approach has shown a maximum operating
frequency of 92.859 MHz with Virtex Pro and 61.6 MHz with Spartan 3E by
consuming considerably less resources to provide cost effective solution for wireless
communication systems.
DESIGN AND ANALYSIS OF A 32-BIT PIPELINED MIPS RISC PROCESSORVLSICS Design
Pipelining is a technique that exploits parallelism, among the instructions in a sequential instruction stream
to get increased throughput, and it lessens the total time to complete the work. . The major objective of this
architecture is to design a low power high performance structure which fulfils all the requirements of the
design. The critical factors like power, frequency, area, propagation delay are analysed using Spartan 3E
XC3E 1600e device with Xilinx tool.
Design and Analysis of A 32-bit Pipelined MIPS Risc ProcessorVLSICS Design
The document describes the design and analysis of a 32-bit pipelined MIPS RISC processor. A 6-stage pipeline is implemented, consisting of instruction fetch, instruction decode, register read, memory access, execute, and write back stages. Various low power and high speed techniques are used, including power gating and deeper pipelining. The processor is implemented on a Spartan 3E FPGA and analyzed using Xilinx tools. Simulation results show the pipeline consumes low power of 0.129W and achieves a high frequency of 285.583MHz.
DESIGN AND ANALYSIS OF A 32-BIT PIPELINED MIPS RISC PROCESSORVLSICS Design
The document describes the design and analysis of a 32-bit pipelined MIPS RISC processor. A 6-stage pipeline is implemented, consisting of instruction fetch, instruction decode, register read, memory access, execute, and write back stages. Various techniques are used to optimize critical performance factors like power, frequency, area, and propagation delay. Power gating is applied to minimize power consumption, and deeper pipelining is used to increase speed. Simulation results show the pipeline consumes very low power of 0.129W, has a path delay of 11.180ns, and achieves a high frequency of 285.583MHz.
NEW DESIGN METHODOLOGIES FOR HIGH-SPEED MIXED-MODE CMOS FULL ADDER CIRCUITSVLSICS Design
This paper presents the design of high-speed full adder circuits using a new CMOS mixed mode logic family. The objective of this work is to present a new full adder design circuits combined with current mode circuit in one unit to implement a full adder cell. This paper also discusses a high- speed hybrid majority function based 1-bit full adder that uses MOS capacitors (MOSCAP) in its structure with conventional static and dynamic CMOS logic circuit. The static Majority function (bridge) design style enjoys a high degree of regularity and symmetric higher density than the conventional CMOS design style as well as lower power consumption by using bridge transistors. This technique helps in reducing power consumption, propagation delay, and area of digital circuits while maintaining low complexity of mixedmode logic designs. Dynamic CMOS circuits enjoy area, delay and testability advantages over static CMOS circuits. Simulation results illustrate the superiority of the new designed adder circuits against the reported conventional CMOS, dynamic and majority function adder circuits, in terms of power, delay, power delay product (PDP) and energy delay product (EDP). The design is implemented on UMC 0.18µm process models in Cadence Virtuoso Schematic Composer at 1.8 V single ended supply voltage and simulations are carried out on Spectre S.
EFFICIENT ABSOLUTE DIFFERENCE CIRCUIT FOR SAD COMPUTATION ON FPGAVLSICS Design
Video Compression is very essential to meet the technological demands such as low power, less memory and fast transfer rate for different range of devices and for various multimedia applications. Video compression is primarily achieved by Motion Estimation (ME) process in any video encoder which contributes to significant compression gain.Sum of Absolute Difference (SAD) is used as distortion metric in ME process.In this paper, efficient Absolute Difference(AD)circuit is proposed which uses Brent Kung Adder(BKA) and a comparator based on modified 1’s complement principle and conditional sum adder scheme. Results shows that proposed architecture reduces delay by 15% and number of slice LUTs by 42% as compared to conventional architecture. Simulation and synthesis are done on Xilinx ISE 14.2 using Virtex 7 FPGA.
Polar codes are one of the best linear block codes that are capacity achieving and incorporating along with it a simplified encoding and decoding routines. Successive cancellation (SC) algorithm is one of the predominantly used decoding algorithms due to its low complexity. It has broad scopes for hardware architecture design and reformulation. For polar code, the trade-off among the long latency and the silicon area of the SC algorithm is a bottleneck for the design of a high throughput polar decoder. The available prior SC polar decoder designs have higher area requirements for higher block length. This paper introduces a unique reformulation of the processing element (PE) block of SC decoding. The proposed reformulation leads to two benefits: firstly, critical path and hardware complexity of the PE are meaningfully reduced by using a unified adder block. Secondly, the silicon area requirement and the power consumption were also reduced considerably without any loss in performance. The proposed PE is used to build the decoder for various block lengths. Moreover, a Gate-level analysis of the proposed decoder has revealed that the design attains an 18% area reduction and 38% reduction in power consumption over the conventional one with similar performance.
Performance Analysis of a Low-Power High-Speed Hybrid 1-Bit Full Adder Circui...IRJET Journal
The document describes a proposed hybrid 1-bit full adder circuit designed using different CMOS technologies. Full adders are important building blocks for arithmetic circuits. Existing full adder designs have limitations like high power consumption, voltage degradation, and slow speed. The proposed hybrid design addresses these issues. It uses a combination of logic styles for the sum generation and carry generation circuits. The sum circuit uses an improved XNOR module to reduce power and overcome switching delays at low voltages. The carry generation circuit uses transmission gates to reduce the carry propagation delay. The proposed hybrid full adder circuit was simulated using Cadence at 180nm, 90nm, and 45nm process technologies. Simulation results showed improvements in power, delay, and transistor count compared
Addition is a fundamental arithmetic operation that is broadly used in many VLSI systems, such as application-specific digital signal processing (DSP) architectures and microprocessors. This addition module is also the core of other arithmetic operations such as subtraction, multiplication, division and address generation. The prime objective of this project is to design a full-adder having low-power consumption and low propagation delay which may result in the efficient implementation of modern digital systems. This model is referred as “hybrid” because of the combination of two different design logic styles namely CMOS logic and pass transistor logic. Performance parameters such as power, delay and hence energy were compared with the existing designs such as complementary CMOS logic full adder. In the existing hybrid systems, over 28 transistors were used. While the optimized hybrid full adder circuit reduces this count to 8 transistors, it still obtains better energy efficiency. Further the proper working of proposed full adder is verified by applying it in a Ripple carry Adder circuit.
PERFORMANCE EVALUATION OF LOW POWER CARRY SAVE ADDER FOR VLSI APPLICATIONSVLSICS Design
This document discusses the performance evaluation of a low power carry save adder (CSA) for VLSI applications. It begins with an abstract that examines subthreshold leakage in CSA circuits and how reducing threshold voltage can lower power consumption. The document then reviews previous work on CSA design. It presents the architecture of a proposed 4-bit CSA designed using gate diffusion input cells to reduce area and power. Simulation results show the CSA has total average power of 4.93μW, propagation delay of 16.3ns, and 37% reduced area due to using GDI cells. The CSA operates as intended in subthreshold regions with low static leakage current.
PERFORMANCE EVALUATION OF LOW POWER CARRY SAVE ADDER FOR VLSI APPLICATIONSVLSICS Design
This report examines the subject of sub threshold leakage on carry save adder. When the gate to source voltage reduces to the threshold voltage at that place is yet some amount of current flow in the circuit and that is undesired. As the process technology advancing much rapidly the threshold voltage of MOS devices reduces very drastically, and it must be applied in lower power devices since it contributes to low amount of leakage current which confine increases the power consumption of the devices. Adders are the basic building blocks for any digital circuit design and used in almost all arithmetic’s. The CSA proves efficient adders due to its quick and precise computations. Hence this paper performs sub threshold analysis on CSA and the scrutinize results that the total average power is around 4.93µW, the propagation delay for complete operation is 16.3ns and since this design uses GDI cell so there is a reduction in area with 37%.
Similar to LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGA (20)
Rainfall intensity duration frequency curve statistical analysis and modeling...bijceesjournal
Using data from 41 years in Patna’ India’ the study’s goal is to analyze the trends of how often it rains on a weekly, seasonal, and annual basis (1981−2020). First, utilizing the intensity-duration-frequency (IDF) curve and the relationship by statistically analyzing rainfall’ the historical rainfall data set for Patna’ India’ during a 41 year period (1981−2020), was evaluated for its quality. Changes in the hydrologic cycle as a result of increased greenhouse gas emissions are expected to induce variations in the intensity, length, and frequency of precipitation events. One strategy to lessen vulnerability is to quantify probable changes and adapt to them. Techniques such as log-normal, normal, and Gumbel are used (EV-I). Distributions were created with durations of 1, 2, 3, 6, and 24 h and return times of 2, 5, 10, 25, and 100 years. There were also mathematical correlations discovered between rainfall and recurrence interval.
Findings: Based on findings, the Gumbel approach produced the highest intensity values, whereas the other approaches produced values that were close to each other. The data indicates that 461.9 mm of rain fell during the monsoon season’s 301st week. However, it was found that the 29th week had the greatest average rainfall, 92.6 mm. With 952.6 mm on average, the monsoon season saw the highest rainfall. Calculations revealed that the yearly rainfall averaged 1171.1 mm. Using Weibull’s method, the study was subsequently expanded to examine rainfall distribution at different recurrence intervals of 2, 5, 10, and 25 years. Rainfall and recurrence interval mathematical correlations were also developed. Further regression analysis revealed that short wave irrigation, wind direction, wind speed, pressure, relative humidity, and temperature all had a substantial influence on rainfall.
Originality and value: The results of the rainfall IDF curves can provide useful information to policymakers in making appropriate decisions in managing and minimizing floods in the study area.
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to
precisely delineate tumor boundaries from magnetic resonance imaging (MRI)
scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating
the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The
model is rigorously trained and evaluated, exhibiting remarkable performance
metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted
IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of
our proposed model. These findings underscore the model’s competence in precise brain tumor localization, underscoring its potential to revolutionize medical
image analysis and enhance healthcare outcomes. This research paves the way
for future exploration and optimization of advanced CNN models in medical
imaging, emphasizing addressing false positives and resource efficiency.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
Introduction- e - waste – definition - sources of e-waste– hazardous substances in e-waste - effects of e-waste on environment and human health- need for e-waste management– e-waste handling rules - waste minimization techniques for managing e-waste – recycling of e-waste - disposal treatment methods of e- waste – mechanism of extraction of precious metal from leaching solution-global Scenario of E-waste – E-waste in India- case studies.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Batteries -Introduction – Types of Batteries – discharging and charging of battery - characteristics of battery –battery rating- various tests on battery- – Primary battery: silver button cell- Secondary battery :Ni-Cd battery-modern battery: lithium ion battery-maintenance of batteries-choices of batteries for electric vehicle applications.
Fuel Cells: Introduction- importance and classification of fuel cells - description, principle, components, applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell and direct methanol fuel cells.
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGA
1. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
DOI : 10.5121/vlsic.2012.3412 135
LOGIC OPTIMIZATION USING TECHNOLOGY
INDEPENDENT MUX BASED ADDERS IN FPGA
R.Uma and P.Dhavachelvan
Department of Computer Engineering, School of Engineering, Pondicherry University,
Pondicherry, India
uma.ramadass1@gmail.com
dhavachelvan@gmail.com
ABSTRACT
Adders form an almost obligatory component of every contemporary integrated circuit. The prerequisite of
the adder is that it is primarily fast and secondarily efficient in terms of power consumption and chip area.
Therefore, careful optimization of the adder is of the greatest importance. This optimization can be attained
in two levels; it can be circuit or logic optimization. In circuit optimization the size of transistors are
manipulated, where as in logic optimization the Boolean equations are rearranged (or manipulated) to
optimize speed, area and power consumption. This paper focuses the optimization of adder through
technology independent mapping. The work presents 20 different logical construction of 1-bit adder cell in
CMOS logic and its performance is analyzed in terms of transistor count, delay and power dissipation.
These performance issues are analyzed through Tanner EDA with TSMC MOSIS 250nm technology. From
this analysis the optimized equation is chosen to construct a full adder circuit in terms of multiplexer. This
logic optimized multiplexer based adders are incorporated in selected existing adders like ripple carry
adder, carry look-ahead adder, carry skip adder, carry select adder, carry increment adder and carry save
adder and its performance is analyzed in terms of area (slices used) and maximum combinational path
delay as a function of size. The target FPGA device chosen for the implementation of these adders was
Xilinx ISE 12.1 Spartan3E XC3S500-5FG320. Each adder type was implemented with bit sizes of: 8, 16,
32, 64 bits. This variety of sizes will provide with more insight about the performance of each adder in
terms of area and delay as a function of size.
KEYWORDS
Technology independent mapping, Adder topologies, FPGA, Multiplexer based Adders, Logical effort,
Delay calculation
1. INTRODUCTION
In most of the digital systems, adders are the fundamental component in the design of application
specific integrated circuits like RISC processors, digital signal processors (DSP), microprocessors
etc. The design criterion of a full adder cell is usually multi-fold. Transistor count is, of course, a
primary concern which largely affects the design complexity of many function units such as
multiplier and Arithmetic logic unit (ALU). The basic principle in designing digital adder
circuit hovers around reducing the required hardware thus reducing the cost too. To achieve
this, logical optimization helps to obtaining minimum number of literals to minimizing the
transistor count and the power consumption and increasing the speed of operation.
A logic expression can be expressed in various logic forms, which differ in literal counts.
In widely-used MOS circuits, the number of transistors to implement a Boolean expression is
directly proportional to literal counts in its logic form[1-2]. Thus, a logic optimization is
simply to derive a logic form with the fewest literals. Logic level optimization is the design
task where an RTL circuit description is optimized in terms of area, delay and power.
2. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
136
Conventionally, a logic level optimization can be achieved in two steps; they are Technology
Independent (TI) and Technology – Dependent (TD) optimization. In the former method the
circuit’s Boolean description is optimized ignoring the technology in which the circuit will be
implemented. In the second method, the output of the technology independent optimization step
(i.e. optimized Boolean network) is optimized considering the adopted technology. During the TI
step there is much flexibility to restructure circuit logic to minimize the number of nodes and
literals, thereby reducing the area of the circuit. During this stage the circuit can be most
effectively restructured to meet the specified delay constraints critical for circuit performance.
During the TD step, the delay characteristics of the target library are available, but very few
restructuring of the circuit is possible.
Logical effort [13, 14] has been widely used in a variety of application domains as well as in
industry standard EDA synthesis tools. Designing a circuit to achieve the greatest speed or to
meet a delay constraint presents a bewildering array of choices [13, 14]. The method of logical
effort is a design procedure for achieving the least delay along a path of a logic network. This
method is based on a simple approximation that treats MOS circuits as networks of resistance and
capacitance. This RC model provides simple mathematical calculation to obtain the circuit’s
maximum speed. In this paper the delay model for optimized full adder circuit and its delay
estimation is also presented.
In this paper, we proposed 20 different Boolean expressions (logic construction) to implement a
1-bit full adder circuit. All the Boolean expressions are realized in terms of CMOS logic. The
optimization method used in this work is technology independent optimization step. These
Boolean logic realization and performances are analyzed in terms of transistor count, delay and
power dissipation using Tanner EDA with TSMC MOSIS 250nm technology. From this analysis
the optimized equation is selected and it is implemented in terms of multiplexers and it is
incorporated in selected existing adder topologies like ripple carry adder, carry look-ahead adder,
carry skip adder, carry select adder, carry increment adder and carry save adder and its
performance is analyzed in terms of area (slices used) and maximum combinational path delay as
a function of size. Performance comparison of existing and logic optimized schemes are analyzed
on cell-based VLSI technologies, such as standard-cell based FPGAs. The cell-based approach is
justified by its wide-spread use in the ASIC design community and its compatibility with
hardware synthesis, which in turns satisfies the demand for ever higher productivity. This work
presents the significance of adder comparison in terms of CLBs occupied and its maximum
combinational delay exist in adder topology.
The organization of the paper is as follows: The section 2, describes the existing adder topologies.
Section 3, presents the mathematical Boolean expression for the design of 1-bit full adder cell.
Section 4 presents the simulation and analysis of full adder using Tanner EDA. Section 5 presents
the FPGA implementation of different adder topologies. Section 6 gives the summary of
comparison. Finally the conclusion is presented in section 7.
2. REVIEW OF EXISTING ADDER TOPOLOGY
Most of the VLSI applications, such as digital signal processing, image and video processing, and
microprocessors, extensively use arithmetic operations. Addition, subtraction, multiplication, and
multiply and accumulate (MAC) are examples of the most commonly used operations. The 1-bit
full-adder cell is the building block of all these modules. Thus, enhancing its performance is
critical for enhancing the overall module performance. This section presents the overview of the
existing adder topologies.
In FPGAs, the most fundamental component implemented for high speed applications like
microprocessors, arithmetic logic unit, program counters and multiply accumulate unit. Lot of
3. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
137
implementations has been made for these adder topologies for optimizing area, delay and power
dissipations. In the reference [1], it provides an overview for the comparison of adders in the
early design phase for selecting their appropriate design structure for implementing adders with
the constraints of area, delay and power dissipation. This paper also reveals the pre-estimation of
energy-delay, product, energy-delay estimation and power estimation in the energy delay space.
In the reference [2], the proposed high speed and low power full adder cells which has designed
with pass transistor logic styles to reduce the power delay product (PDP). This paper also reports
the performance comparison of adder cells with CMOS, DCVS, CPL, DPL, Swing restorer CPL
and hybrid styles. This paper shows the implementation of adder cells with enhanced carry
generation stage which is implemented with multiplexes. This feature provides that for this logic
there are no internal signals being generated for controlling the selection of output multiplexers,
thereby reducing the full voltage swing, delay and overall propagation delays.
The adder topology is present in literature [3-12], Ripple Carry Adder (RCA) is the simplest, but
slowest adders with O(n) area and O(n) delay, where n is the operand size in bits. Carry Look-
Ahead (CLA) have O(nlog(n)) area and O(log(n)) delay, but typically suffer from irregular
layout. On the other hand, Carry Skip Adder, carry increment and carry select have O(n) area and
2/ 1
( )l l
O n + +
delay provides a good compromise in terms of area and delay, along with a simple
and regular layout. Carry save adder have O(n) area and O(log n) delay. The ripple carry adder,
the most basic of flavours, is at the one extreme of the spectrum with the least amount of CLBs
but the highest delay. CLA adders can be realized in two gate levels provided there is no limit on
fan in/out. The carry select adders reduce the computation time by pre-computing the sum for all
possible carry bit values (ie ‘0’ and ‘1’). After the carry becomes available the correct sum is
selected using multiplexer. Carry select adders are in the class of fast adders, but they suffer from
fan-out limitation since the number of multiplexers that need to be driven by the carry signal
increases exponentially. In the worst case, a carry signal is used to select n/2 multiplexers in an n-
bit adder. When three or more operands are to be added simultaneously using two operand adders,
the time consuming carry propagation must be repeated several times. If the number of operands
is ‘k’, then carries have to propagate (k-1) times.
3. MATHEMATICAL EQUATIONS FOR FULL ADDER
A full adder is a combinational circuit that performs the arithmetic sum of three bits: A, B and a
carry in, C, from a previous addition produces the corresponding SUM, S, and a carry out,
CARRY. The various equations for SUM and CARRY are given below
ܷܵܯ = ܣ ⊕ ܤ ⊕ ܥ (1)
ܷܵܯ = ቀܤܣ + ܣܤቁ ܥ + ൫ܤܣ + ܤܣ൯ܥ (3)
ܻܴܴܣܥ = (ܣ ⊕ ܥ)ܤ + ܤܣ (5)
ܻܴܴܣܥ = ൫ܤܣ + (ܣ ⊕ ܤ൯ܥ (7)
ܻܴܴܣܥ = ܣ ⊕ ܤ • ܤ + ܣ ⊕ ܤ • ܥ (9)
ܻܴܴܣܥ = ܥ • (ܣ • )ܤ + ܥ • (ܣ + )ܤ (11)
ܻܴܴܣܥ = (ܣ ⊕ ܥ)ܤ • ܤܣ (13)
ܻܴܴܣܥ = ܤܣ • ܥܣ • ܥܤ (15)
ܻܴܴܣܥ = ܤܣ • ܣ + ܤ • ܥ (17)
ܻܴܴܣܥ = ܣ + ܤ + ܣ ⊕ ܤ + ܥ (19)
ܻܴܴܣܥ = ൫ܣ ⊕ ܤ൯ܥ ⊕ ܤܣ (20)
ܷܵܯ = ܣ ⊕ ܤ ⊕ ܥ (2)
ܷܵܯ = ቀܣ ܤ + ܤܣቁ ܥ + ൫ܣ ܤ + ܤܣ൯ܥ (4)
ܻܴܴܣܥ = ൬ܣ ⊕ ܤ൰ ܥ + ܤܣ (6)
ܻܴܴܣܥ = ܤܣ • ܥܣ • ܥܤ (8)
ܻܴܴܣܥ = (ܣ ⊕ )ܤ • ܥ ⊕ ܤܣ (10)
ܻܴܴܣܥ = ܤܣ + ܥܣ + ܥܤ (12)
ܻܴܴܣܥ = ܤܣ + ܣ ⊕ ܤ • ܥ (14)
ܻܴܴܣܥ = ܣ ⊕ ܤ • ܥܣ • ܥܤ (16)
ܻܴܴܣܥ = ܣ ⊕ ܤ • ܥ ⊕ ܤܣ (18)
4. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
138
Figure 1 Different structure for realizing Full Adder circuit
In this work 20 different Boolean expressions are formulated (Figure 1 shows different full adder
circuit). Using this logical equation it is possible to construct 64 full adder circuits. These adders
5. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
139
are implemented with CMOS logic with technology independent optimization process and its
performance are analyzed in terms of transistor count, delay and power dissipation using Tanner
EDA with TSMC MOSIS 250nm technology. From this analysis the optimized equation is
selected and it is implemented in terms of multiplexers and it is incorporated in selected existing
adder topologies like ripple carry adder, carry look-ahead adder, carry skip adder, carry select
adder, carry increment adder and carry save adder and its performance is analyzed in terms of
area (slices used) and maximum combinational path delay as a function of size.
Mathematically it is also possible to calculate the delay of a circuit by constructing delay models
instead of simulation tools using logical effort methods. The logical effort provides a simple
method “on the back of an envelope” [13, 14] to choose the best topology (logical constructs) and
number of stages of logic for a function. Speed optimization of circuit network can be achieved
by the method of logical effort. This method provides how many stages of logic are required for
the fastest implementation of any given logic function. The speed of the circuit depends on the
capacitive load that the circuit of the logic gate drives and the logic function of the gate. The
delay incurred in a logic gate are expressed as sum of two components namely, the parasitic delay
p and the effort delay f as follows [13-14].
The delay in single stage network is expressed as
d = gh + p (1)
where,
g - Logical effort (the ability of the logic gate’s topology to produce output current)
h - Electrical effort (the ratio of output capacitance to input capacitance)
p - Intrinsic delay (delay of the gate due to its own internal capacitance)
Table 1 presents the logical effort of common static CMOS gates assuming the aspect ratio of
pull-up and pull-down network to be 2:1 to have equal rise and fall delay. Table 2 presents the
parasitic delay of CMOS logic independent of the size of the logic gate and of the load
capacitance it drives. The principle contribution to parasitic delay is the capacitance of the
source/drain regions of the transistors that drive the gate’s output.
An example to calculate the delay of a full adder is shown in Figure (2) using the expression
SU M A B C= ⊕ ⊕ and C A R R Y A B A C B C= • • . The circuit is realized as two stage
network, stage1 and stage2 respectively. Assume that the input capacitance of 10pf on each input
and it will drive the output capacitance with a maximum of 10pf.
Table 1. Logical Effort of CMOS gates Table 2. Parasitic delay of CMOS gates
Gate type
Number of Inputs
Gate type
Number of Inputs
1 2 3 4 5 n 1 2 3 4 5 n
Inverter 1 Inverter 1
NAND 4/3 5/3 6/3 7/3 (n+1)/3 NAND 2 3 4 5 nPinv
NOR 5/3 7/3 9/3 11/3 (2n+1)/3 NOR 2 3 4 5 nPinv
XOR/XNOR 4 12 32 XOR/XNOR 4 4 4 4Pinv
6. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
140
Delay calculation for SUM Delay calculation for CARRY
The Electrical effort H = 10pf/10pf (Output
capacitance / Input capacitance) = 1
Path logical effort G along path (N-W) =
1 2g g•∏ (where g1 and g2 are logical
effort of XOR gate) =4X4=16
The parasitic delay along path (N-W) P =
4+4=8
(The parasitic delay of XOR gate is 4)
The Branching effort B =1 [because all of
the fan-out’s along the path are one]
Number of stages N= 2 stages
The path Efforts F=GBH =(16)X1X1=16
The path delay
^
1/ N
D N PF= +
= 2X(16)1/2
+8 =16ps
The Electrical effort H = 10pf/10pf (Output
capacitance / Input capacitance) = 1
Path logical effort G along path (N-W) =
3 4g g•∏ (where g3 and g4 are logical
effort of NAND gate) =4/3X4/3 = 16/9
The parasitic delay along path (N-W)P
=2+2=4 (The parasitic delay of NAND gate
is 2)
The Branching effort B =1 [because all of the
fan-out’s along the path are one]
Number of stages N= 2 stages
The path Efforts F=GBH =(16/9)X1X1=16/9
The path delay
^
1/ N
D N PF= +
= 2X(16/9)1/2
+4 =6.6ps
Figure 2. Logical Delay Model for Full Adder Circuit.
So the total delay will be the sum of CARRY and SUM which is equal to 22.6ps. From this
observation the delay of the circuit vary with change in the input and output capacitance value.
4. SIMULATION AND PERFORMANCE ANALYSIS OF FULL ADDER
The proposed 20 different Boolean expressions (logic construction) are simulated using Tanner
EDA with BSIM3v3 250nm technology with supply voltage ranging from 1V to 2V in steps of
0.2V. All the full adders are simulated with multiple design corners (TT, FF, FS, and SS) to
verify that operation across variations in device characteristics and environment. The simulated
setup for optimized full adder’s (using XOR,MUX) test bed and its gate equivalent along with its
input/output waveform is shown in Figure ( 3 ). The test bed is supplied with a nominal voltage of
2V in steps of 0.2V and it is invoked with the technology library file Generic 025 and it is
specified with TT, FF, FS and SS conditions. The W/L ratios of both nMOS and pMOS
transistors are taken as 2.5/0.25µm. To establish an unbiased testing environment, the simulations
have been carried out using a comprehensive input signal pattern, which covers every possible
transition for a 1- bit full adder.
7. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
141
The frequencies have been chosen in the range from 10 to 200MHz and its input and output
capacitances are set to 10pf. The three inputs to the full adder are A, B, C and all the test vectors
are generated and have been fed into the adder cell. The cell delay has been measured from the
moment the inputs reach 50% of the voltage supply level to the moment the latest of the SUM
and CARRY signals reach the same voltage level. All transitions from an input combination to
another (total 8 patterns, 000, 001, 010, 011, 100, 101, 110, 111) have been tested, and the delay
at each transition has been measured. The average has been reported as the cell delay. The power
consumption is also measured for these input patterns and its average power has been reported in
Table 3.
Figure 3. A. Snap Shot of Full Adder with XOR and MUX
Figure 3.B . Test bed For Full Adder
The simulation results are shown in Table 3. The performance of all the full adders has been
analyzed in terms of delay, transistor count and power dissipation. It is observed that adder
designed with XOR and MUX has the least delay, transistor count and power dissipation when
compared to other combinations of gate. So the adder realized with MUX and XOR is considered
8. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
142
to be the optimized adder in terms of delay, transistor count and power dissipation. The second
optimized full adder is realized from XNOR, NOT and MUX.
Figure 3. C. Input/ Output Wave of Full Adder
The worst case full adder construction is not using NOR gate which occupies large transistor
count, dissipates large power and has longer delay. The other optimized solution for constructing
full adders are using NAND gates only, XOR, XNOR,MUX combination and XOR,
AND,OR,MUX combination. Figure (4) shows the simulation result of full adders in terms of
delay, transistor count (TC) and power dissipation (PWR-DISSP). The power-delay product all
the full adders is shown in Figure (5).
Figure 4. Simulation result of adders in terms Figure 5. Power delay product
delay, area and power
9. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
143
Table 3. Simulated Result for 20 different Full adders
5. FPGA IMPLEMENTATION
In this work the adder structures used are: Ripple Carry Adder, Carry Look-Ahead Adder, Carry
Save Adder, Carry Increment adder, Carry Select Adder, Carry Skip Adder. From section IV it is
observed that the optimized equation for implementing 1-bit full adder is using XOR and MUX.
So the primitive of this adder cell is implemented with multiplexer and this module is
incorporated with existing adder topologies. The target FPGA device chosen for the
implementation of these adders was Xilinx ISE 12.1 Spartan3E XC3S500-5FG320. This device
was chosen because the Spartan3E families of Field-Programmable Gate Arrays (FPGAs) are
specifically designed to meet the needs of high volume, cost-sensitive consumer electronic
applications. The Spartan-3E family builds on the success of the earlier Spartan-3 family by
increasing the amount of logic per I/O, significantly reducing the cost per logic cell. These
Spartan-3E FPGA enhancements, combined with advanced 90 nm process technology, deliver
more functionality and bandwidth. Each adder type was implemented with bit sizes of: 8, 16, 32,
64 bits. This variety of sizes will provide with more insight about the performance of each adder
in terms of area and delay as a function of size. Structural Gate level modeling using Verilog
HDL was used to model each adder. The Xilinx ISE Foundation version 12.1i software was used
for synthesis and implementation.
Table 4 contains the results obtained. The adder abbreviations used in the table are: RCA for
Ripple Carry Adder, CLA for Carry Look-Ahead Adder, CSA for Carry Save adder, CIA for
Carry Increment Adder, CSeA for Carry Select Adder and CSkA for Carry Skip Adder. In the
Table, delay is measured in nanoseconds (ns) while area is measured in Slice Look-Up Tables
(LUT) units which represent configurable logic units within the FPGA. In the Table E-A
represents the adder designed with normal expression for SUM and CARRY, Op-A represents
Full adder using
Avg
Delay
(ps)
Transistor
count
Avg
Power
Dissipation(µW)
XOR,AND, OR (1) 36.1 38 12.693
XOR,AND,OR(2) 33.1 30 9.813
XNOR,AND,OR 35 38 10.371
XNOR,AND,OR,NOT 34 32 9.833
XOR, AND 36.2 30 12.35
XOR, NAND, Negative OR 30.13 30 10.547
XNOR,NAND,NOT 29.5 34 8.752
XNOR, NAND 23.01 30 9.585
XOR,NAND 23.023 30 9.485
XOR, MUX 18.02 18 7.704
XNOR, MUX, NOT 19.01 20 7.769
XOR, XNOR, MUX 20.8 24 8.691
XOR, AND, OR, MUX 23.6 30 9.798
XNOR,AND, OR,MUX 26.12 30 9.058
NAND 20.2 36 10.795
NOR 40.1 51 22.25
XOR,NAND,NOR,NOT 32.2 38 10.583
XNOR,NAND,NOR,NOT 31.1 38 10.1
XNOR,NOR,NOT,OR 34.1 40 11.723
XOR,NOR,NOT,OR 35 42 12.487
10. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
144
Figure 6 Different Adder topologies
11. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
145
the adder topology implemented with optimized equations that are realized in terms of
multiplexers. It is noticed that delay, area and power delay product are less when compared to the
normal expression. Figure 6 shows different adder topologies
Table 4 Comparison and Results obtained for different adder topologies
6. SUMMARY
The proposed 20 different Boolean expressions (logic construction) are simulated using Tanner
EDA with BSIM3v3 250nm technology with supply voltage ranging from 1V to 2V in steps of
0.2V. It is observed that adder designed with XOR and MUX has the least delay, transistor count
and power dissipation when compared to other combinations of gate. So the adder realized with
MUX and XOR is considered to be the optimized adder in terms of delay, transistor count and
power dissipation. A new low-power, high-speed full adder cell is proposed using XOR and
MUX gates. Its performances have been analyzed and reported in section 4. This optimized adder
is designed with fully MUX based structure in FPGA using VERILOG HDL and this module is
incorporated in the existing adder topologies and its comparison is made. The target FPGA
device chosen for the implementation of these adders was Xilinx ISE 12.1 Spartan3E XC3S500-
5FG320. The comparison of delay, slice occupied, AT and its power dissipation is depicted in the
A
d
d
e
r
s
Slice
utilized
AT
Power
Dissipation
(mw) PD
Delay (ns) (Area)
bit
E-A
Op-
A E-A
Op-
A E-A Op-A E-A Op-A E-A Op-A
R 8 13.2 12.93 9 9 118.83 116.37 80.98 80.1 1069.18 1035.69
C 16 21.69 21.11 18 18 390.42 379.89 81.1 79.98 1759.06 1687.98
A 32 38.67 37.46 37 37 1430.6 1385.87 82.3 78.5 3182.13 2940.3
64 72.62 70.16 74 74 5373.6 5191.77 85.67 80.2 6221.01 5626.75
C 8 13.2 12.93 9 9 118.83 116.37 78.91 77.5 1041.85 1002.08
L 16 21.69 21.11 18 18 390.42 379.89 80.98 78.2 1756.46 1650.41
A 32 38.67 37.46 37 37 1430.6 1385.87 81.2 79.1 3139.6 2962.77
64 72.62 70.16 74 74 5373.6 5191.77 82.3 79.3 5976.3 5563.61
C 8 12.06 11.1 14 13 168.81 144.3 85.23 80.98 1027.7 898.878
S 16 20.02 19.8 27 23 540.49 455.4 87.1 82.3 1743.57 1629.54
A 32 34.94 32.12 55 52 1921.5 1670.24 88.45 82.3 3090.18 2643.48
64 56.8 54.23 110 106 6247.9 5748.38 90.12 84.01 5118.73 4555.86
C 8 12.21 11.9 12 11 146.56 130.9 85.23 80.18 1040.91 954.142
I 16 16.57 14.32 24 22 397.66 315.04 87.2 82.03 1444.82 1174.67
A 32 27.45 25.67 49 47 1345.1 1206.49 88.23 82.03 2422.09 2105.71
64 47.2 45.21 100 98 4719.7 4430.58 90.1 84.12 4252.45 3803.07
C 8 12.6 10.11 15 11 188.94 111.188 78.91 77.5 993.95 783.37
S 16 21 15.75 32 30 672.1 472.56 80.98 78.2 1700.82 1231.81
E 32 37.93 24.36 67 61 2541 1486.2 81.2 79.1 3079.51 1927.19
A 64 71.77 31.41 135 125 9689 8790.12 82.3 79.3 5906.67 2490.65
C
S
K
A
8 12.78 11.15 13 13 166.17 144.885 78.91 77.5 1008.63 863.738
16 23.27 14.52 23 22 535.28 319.352 80.98 78.2 1884.65 1135.15
32 40.15 23.26 45 44 1806.9 1023.22 81.2 79.1 3260.5 1839.47
64 49.22 35.73 79 78 3888.5 2787.1 82.3 79.3 4050.97 2833.55
12. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
146
Figure (7). From this analysis it is found that for all the adder topologies the delay is less when
compare to the existing adder with normal equation ( S U M A B C= ⊕ ⊕ and
CARRY AB AC BC= + + ) and it is also observed that the delay for RCA and CLA are the same and
its distribution is shown in the graph (Figure 7a). In case of slice utilized there is no change
occurs for RCA and CLA hence its distribution is shown as single red line in the chart (Figure
7b). From AT chart (Figure 7c) it is noticed that the AT value is large for 64 bit carry select
adders and adders like ripple carry adder, carry look ahead adder and carry increment adder have
less AT Value. From PD distribution (Figure 7d) less power dissipation occurs for carry
increment and ripple carry adders, maximum dissipation occurs for carry save and carry skip
adders. According to the presented results, the adder topology which has the best compromise
between area, delay and power dissipation are carry look-ahead and carry increment adders and
they are suitable for high performance and low-power circuits. The fastest adders are carry select
and carry save adders with the penalty of area. The simplest adder topologies that are suitable for
low power applications are ripple carry adder, carry skip and carry bypass adder with least gate
count and maximum delay.
Figure 7. A. Delay Chart
Figure 7. B. Slices utilized Chart
13. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
147
Figure 7.C. AT Value Chart
Figure 7. D. PT Value Chart
7. CONCLUSION
An extensive performance analysis of 1-bit full-adder cells has been presented. Technology
independent logic optimization is used to design 1-bit full adder with 20 different Boolean
expressions and its performance was analyzed in terms of transistor count, delay and power
dissipation using Tanner EDA with TSMC MOSIS 250nm technology. From this analysis XOR
and MUX based expression provides low transistor count, minimum delay and minimum power
dissipation when compared to other logic equations. The second optimized full adder can be
realized using XNOR, NOT and MUX. The other optimized solution for constructing full adders
are using NAND gates only, XOR, XNOR,MUX combination and XOR, AND,OR,MUX
combination. The worst case full adder construction is not using NOR gate which occupies large
transistor count, dissipates large power and has longer delay. Logical effort delay model to
estimate the parasitic delay is also presented. Using the optimized expression the primitive adder
cell is implemented with multiplexer and this module is incorporated with existing adder
topologies like ripple carry adder, carry look-ahead adder, carry skip adder, carry select adder,
carry increment adder and carry save adder and its performance is analyzed in terms of area
(slices used) and maximum combinational path delay as a function of size. The target FPGA
device chosen for the implementation of these adders was Xilinx ISE 12.1 Spartan3E XC3S500-
5FG320. The comparison and its simulation results have been presented. Based on the
comparison it is observed that number of slices occupied, power dissipation and delay are less
using the optimized expression. The work presented in this paper gives more insight and deeper
understanding of constituting modules of the adder cell to help the designers in making their
choices.
14. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
148
REFERENCES
[1] Shih-Chieh Chang, Lukas P.P.P. van Ginneken, “Circuit Optimization by Rewiring”, IEEE
Transaction on Computers, Vol. 48, No. 9, September 1999.
[2] Oh-Hyeong Kwon, “A Boolean Extraction Technique For Multiple-Level Logic Optimization” IEEE
2003
[3] R.Uma, “4-Bit Fast Adder Design: Topology and Layout with Self-Resetting Logic for Low Power
VLSI Circuits”, International Journal of Advanced Engineering Sciences and Technology, Vol No. 7,
Issue No. 2, 197 – 205, 2011.
[4] Padma Devi, Ashima Girdher and Balwinder Singh, “Improved Carry Select Adder with Reduced
Area and Low Power Consumption”, International Journal of Computer Application,Vol 3.No.4, June
2010
[5] Shrirang K. Karandikar and Sachin S. Sapatnekar,” Fast Comparisons of Circuit Implementations”,
IEEE Transaction on Very Large Scale Integration (VLSI) Systems, Vol. 13, No. 12, December 2005
[6] Sutherland, B. Sproull, D. Harris, “Logical Effort: Designing Fast CMOS Circuits,” Morgan
Kaufmann Publisher, c1999.
[7] R.uma, Vidya Vijayan, M. Mohanapriya, Sharon Paul,” Area, Delay and Power Comparison of
Adder Topologies”, International Journal of VLSI and Communication Systems, 2012.
[8] G.Shyam Kishore, “A Novel Full Adder with High Speed Low Area”, 2nd National Conference on
Information and Communication Technology (NCICT) 2011 Proceedings published in International
Journal of Computer Applications® (IJCA).
[9] Mariano Aguirre-Hernandez and Monico Linares-Aranda, “CMOS Full-Adders for Energy-Efficient
Arithmetic Applications”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol.
19, No. 4, April 2011.
[10] Pudi. V, Sridhara., K, “Low Complexity Design of Ripple Carry and Brent Kung Adders in
CA”,Nanotechnology, IEEE transactions on,Vol.11,Issue.1,pp.105-119,2012.
[11] Shubin.V.V, “Analysis and Comparison of Ripple Carry Full Adders by Speed”, Micro/Nano
Technologies and Electron Devices (EDM),2010, International Conference and Seminar on, pp.132-
135,2010.
[12] Sreehari Veeramachaneni, M.B. Srinivas, “New Improved 1-Bit Full Adder Cells”, IEEE, 2008.
[13] Ivan E. Sutherland, Bob F. Sproull, David L. Harris “Logical Effort: Designing Fast CMOS Circuits”,
Morgan Kaufmann Publishers, Inc. 1998.
[14] R. F. Sproull and I. E. Sutherland. “Theory of Logical Effort: Designing for Speed on the Back of an
Envelope”. In IEEE Adv. Res. in VLSI, 1–16, 1991.
[15] Jian-Fei Jiang; Zhi-Gang Mao; Wei-Feng He; Qin Wang, “A New Full Adder Design for Tree
Structured Arithmetic Circuits”, Computer Engineering and Technology(ICCET),2010,2nd
International Conference on,Vol.4,pp.V4-246-V4- 249,2010.
[16] Shubhajit Roy Chowdhury, Aritra Banerjee, Aniruddha Roy, Hiranmay Saha, “A high Speed 8
Transistor Full Adder Design using Novel 3 Transistor XOR Gates”, International Journal of
Electrical and Computer Engineering 3:12 2008.
[15] Romana Yousuf and Najeeb-ud-din, “Synthesis of Carry Select Adder in 65 nm FPGA”, IEEE.
[16] D. Patel, P. G. Parate, P. S. Patil, and S. Subbaraman, ―ASIC implementation of 1-bit full
adder,ǁ in Proc. 1st Int. Conf. Emerging TrendsEng. Technol., Jul. 2008, pp. 463–467.
[17] Y. Sunil Gavaskar Reddy and V.V.G.S.Rajendra Prasad, “Power Comparison of CMOS and
Adiabatic Full Adder Circuits”, International Journal of VLSI design & Communication Systems
(VLSICS) Vol.2, No.3, September 2011
15. International Journal of VLSI design & Communication Systems (VLSICS) Vol.3, No.4, August 2012
149
Authors
R.Uma received her B.E. (EEE) degree from Bharathiyar University Coimbatore
in the year 1998, Post graduated in M.E (VLSI Design) from Anna
University Chennai in the year 2004. Currently she has been working as Assistant
Professor in Electronics and Communication Engineering, Rajiv Gandhi College
of Engineering and Technology, Puducherry. She authored books on VLSI Design.
She has published several papers on national and International journal and
conferences. She is the guest faculty for Pondicherry University for M.Tech
Electronics. She has received the best teacher award for the year 2006 and 2007.
Her research interests are Analog VLSI Design, Low power VLSI Design,
Testing of VLSI Circuits, Embedded systems and Image processing. She is a member of ISTE. Perusing her
Ph.D. from Pondicherry University in the Department of Computer Science.
Dr. P. Dhavachelvan is working as a Professor in the Department of Computer
Science, Pondicherry University, India. He obtained his B.E. in the field of
Electrical and Electronics Engineering from University of Madras, India. He
pursued his M.E. and Ph.D. in the field of Computer Science and Engineering from
Anna University, Chennai, India. He has about 15 years of experience as an
academician and his research areas include Software Engineering & Standards and
Web Service Computing. In his credit, he has more than 125 research papers
published in reputed International and National Journals and Conferences. He also
obtained Patents and proposed Standards in the domain of Software Engineering.