EVOLUTION OF STRUCTURE OF SOME BINARY GROUP-BASED N-BIT COMPARATOR, N-TO-2N D... - VIT-AP University
Reversible logic has attracted substantial interest because of its low power consumption, which is the main concern of low-power VLSI circuit design. This paper introduces a novel 4x4 reversible gate, called the inventive gate, and uses it to construct 1-bit, 2-bit, 8-bit, 32-bit and n-bit group-based reversible comparators with low values of the reversible parameters. MOS transistor realizations of the 1-bit, 2-bit and 8-bit reversible comparators are also presented, and their power, delay and power-delay product (PDP) are determined with an appropriate aspect ratio W/L. The inventive gate can also be used as an n-to-2^n decoder. The proposed reversible circuit designs are compared with existing ones. The comparison shows that the novel reversible gate has wide utility and that the group-based reversible comparator outperforms existing designs in terms of number of gates, garbage outputs and constant inputs.
Implementation and Comparison of Efficient 16-Bit SQRT CSLA Using Parity Pres... - IJERA Editor
In Very Large Scale Integration (VLSI) designs, the Carry Select Adder (CSLA) is one of the fastest adders and is used in many data-processing processors to perform fast arithmetic functions. This paper proposes a SQRT CSLA design using a parity-preserving reversible gate (P2RG). Reversible logic is an emerging field in today's VLSI design. In conventional circuits, logic gates such as AND and OR are irreversible, and computing with irreversible logic results in energy dissipation. This problem can be circumvented by using reversible logic: under ideal conditions, a reversible logic gate dissipates zero power. The proposed design is more efficient in terms of delay than the irreversible SQRT CSLA. The simulation is done using Xilinx.
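The distinction between irreversible and reversible gates drawn in this abstract can be checked mechanically: a gate is reversible when its input-to-output mapping is a bijection. The following is a minimal Python sketch of that check, not taken from the paper. The function names are illustrative, and the Feynman (CNOT) and Toffoli gates are used only as well-known textbook examples; the paper's P2RG gate is not modelled here.

```python
from itertools import product

def is_reversible(gate, n_inputs):
    """Return True if `gate` maps all 2**n_inputs input patterns to distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n_inputs)}
    return len(outputs) == 2 ** n_inputs

def and_gate(a, b):
    # Classical AND: two input bits collapse into one output bit.
    return (a & b,)

def feynman(a, b):
    # Feynman (CNOT) gate: (a, b) -> (a, a XOR b).
    return (a, a ^ b)

def toffoli(a, b, c):
    # Toffoli gate: (a, b, c) -> (a, b, c XOR (a AND b)).
    return (a, b, c ^ (a & b))

print(is_reversible(and_gate, 2))  # AND loses information
print(is_reversible(feynman, 2))
print(is_reversible(toffoli, 3))
```

Because AND maps three different input patterns to the same output, the inputs cannot be recovered from the output, which is the information loss the abstract associates with energy dissipation.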
Optimized Reversible Vedic Multipliers for High Speed Low Power Operations - ijsrd.com
Multiplier design is always a challenging task; however many novel designs are proposed, user needs demand still more optimized ones. Vedic mathematics is world renowned for algorithms that yield quicker results, whether for mental calculation or hardware design. Power dissipation is drastically reduced by the use of reversible logic. The reversible Urdhva Tiryakbhayam Vedic multiplier is one such multiplier that is effective in terms of both speed and power. In this paper we aim to enhance the performance of the previous design. The Total Reversible Logic Implementation Cost (TRLIC) is used to evaluate the proposed design. This multiplier can be efficiently adopted in designing Fast Fourier Transforms (FFTs), filters and other DSP applications such as imaging, software-defined radio and wireless communications.
Design of High speed Low Power Reversible Vedic multiplier and Reversible Div... - IJERA Editor
This paper presents a 32x32-bit reversible Vedic multiplier based on the "Urdhva Tiryakabhayam" sutra, meaning vertical and crosswise, designed using reversible logic gates, which is the first of its kind. The paper also proposes a new reversible unsigned division circuit, designed using reversible components such as a reversible parallel adder, a reversible left-shift register, a reversible multiplexer and a reversible n-bit register with parallel load line. The reversible Vedic multiplier and reversible divider modules were written in Verilog HDL and then synthesized and simulated using Xilinx ISE 9.2i. The results show that the reversible Vedic multiplier has less delay and lower power consumption than an array multiplier.
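The "vertical and crosswise" idea can be illustrated in software. The sketch below is a plain Python model under stated assumptions, not the paper's reversible hardware: for each output column it sums every partial product whose bit positions add up to that column, then propagates the carry. The function name `urdhva_multiply` and its parameters are hypothetical.

```python
def urdhva_multiply(a, b, n_bits):
    """Multiply two unsigned n_bits-wide integers column by column."""
    a_bits = [(a >> i) & 1 for i in range(n_bits)]
    b_bits = [(b >> i) & 1 for i in range(n_bits)]
    result = 0
    carry = 0
    for col in range(2 * n_bits - 1):
        # Crosswise step: add every partial product a_i * b_j with i + j == col.
        column_sum = carry
        for i in range(n_bits):
            j = col - i
            if 0 <= j < n_bits:
                column_sum += a_bits[i] & b_bits[j]
        result |= (column_sum & 1) << col  # low bit stays in this column
        carry = column_sum >> 1            # the rest moves to the next column
    # The final carry becomes the most significant product bit.
    result |= carry << (2 * n_bits - 1)
    return result

print(urdhva_multiply(13, 11, 4))  # same value as 13 * 11
```

In hardware the column sums are formed in parallel, which is where the speed advantage claimed for this scheme comes from; the sequential loop here only models the arithmetic.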
Design and minimization of reversible programmable logic arrays and its reali... - Sajib Mitra
Reversible computing dissipates zero energy in terms of information loss at the input, and it can also detect circuit errors by keeping a unique input-output mapping. In this paper, we propose a cost-effective design of Reversible Programmable Logic Arrays (RPLAs) that can realize multi-output ESOP (Exclusive-OR Sum-Of-Products) functions by using a cost-effective 3×3 reversible gate called MG (MUX Gate). A new algorithm is also proposed for calculating the critical path delay of reversible PLAs. The minimization process consists of algorithms for ordering the output functions followed by ordering the products. Five lower bounds on the numbers of gates, garbage outputs and quantum costs of reversible PLAs are also proposed. Finally, we compare the efficiency of the proposed design with the existing one through an analysis of benchmark functions. The experimental results show that the proposed design outperforms the existing one in terms of numbers of gates, garbage outputs, quantum cost and delay.
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORM - VLSICS Design
This paper is devoted to the design of a dual-core cryptoprocessor that executes both prime-field and binary-field instructions. The proposed design is specifically optimized for the field-programmable gate array (FPGA) platform. The combined execution of instructions from the two fields (prime field GF(p) and binary field GF(2^m)) is analysed. The design is implemented on Spartan 3E and Virtex-5, and the performance results of both are compared. The implementation results show parallel execution using dual-field instructions.
Application developers like to stick to the object-oriented style of programming by designing their application's logic as interaction between different object entities. In this process, every entity is modeled as a C++ class or structure. An array of structures (AoS) maintains a collection of those entities, which makes the code more readable and easier to maintain. But this user-friendly code can pose a challenge when it comes to vectorization efficiency. Often, the data needed to populate a vector register must be gathered, because the data is laid out in a non-unit-stride fashion in main memory. To make the data layout more vector-friendly, developers have often had to change their data structures manually from AoS to a structure of arrays (SoA). Single instruction multiple data (SIMD) layout templates from Intel help developers preserve an AoS interface while programming but, under the hood, the data structure is laid out in an SoA format. This is a win-win solution for both object-oriented and vector-friendly programming.
This presentation demonstrates how to analyze the memory access pattern in your performance-sensitive loops and how to enable the layout templates to make changes from constant- and variable-strided memory accesses to unit-strided memory access wherever possible.
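The AoS-to-SoA change described above can be pictured with a small layout example. This is only an illustrative Python sketch: it does not use Intel's layout templates or their API, and the `Point` class and `to_soa` helper are hypothetical names. It shows the manual conversion the presentation says developers previously had to do, where each field ends up in its own contiguous, unit-stride array.

```python
from array import array
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
    z: float

def to_soa(points):
    """Convert a list of Point objects (AoS) into one contiguous array per field (SoA)."""
    return {
        "x": array("d", (p.x for p in points)),
        "y": array("d", (p.y for p in points)),
        "z": array("d", (p.z for p in points)),
    }

# AoS: the fields of one entity sit together; the x values are strided in memory.
aos = [Point(1.0, 2.0, 3.0), Point(4.0, 5.0, 6.0)]

# SoA: all x values are adjacent, which is the unit-stride layout SIMD loads prefer.
soa = to_soa(aos)
print(list(soa["x"]))
```

Python itself does not vectorize these loops; the point of the sketch is the memory layout, which is what a layout template would manage behind an AoS-style interface.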
Design of Digital Adder Using Reversible Logic - IJERA Editor
Reversible logic circuits have promising applications in quantum computing, low-power VLSI design, nanotechnology, optical computing, DNA computing and quantum-dot cellular automata. Beyond these, another prominent application of reversible logic is in quantum computers, whose devices ideally operate at ultra-high speed with low power dissipation and must be built from reversible logic components. This has made reversible logic one of the most promising research areas of the past few decades. In VLSI design, delay is one of the major issues, along with area and power. This paper presents the implementation of Ripple Carry Adder (RCA) circuits using reversible logic gates.
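As a rough illustration of a ripple carry adder built from reversible gates, the Python sketch below models a full adder made of two Peres gates and chains it bit by bit. The abstract does not say which gates the paper uses, so the choice of the Peres gate is an assumption, and all function names are hypothetical.

```python
def peres(a, b, c):
    # Peres gate: (a, b, c) -> (a, a XOR b, (a AND b) XOR c).
    return a, a ^ b, (a & b) ^ c

def reversible_full_adder(a, b, cin):
    # First gate, with a constant 0 input, yields propagate (a XOR b) and a AND b.
    _, p, g = peres(a, b, 0)
    # Second gate folds in the carry: sum = p XOR cin, carry = (p AND cin) XOR g.
    _, s, cout = peres(p, cin, g)
    return s, cout

def ripple_carry_add(x, y, n_bits):
    """Add two unsigned n_bits-wide integers; return (sum bits as int, carry out)."""
    carry = 0
    total = 0
    for i in range(n_bits):
        s, carry = reversible_full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry

print(ripple_carry_add(9, 8, 4))  # 17 does not fit in 4 bits, so the carry out is set
```

The outputs discarded with `_` correspond to what the reversible-logic literature calls garbage outputs, and the constant 0 is a constant input; both are among the cost parameters these abstracts compare.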
Report-Implementation of Quantum Gates using Verilog - Shashank Kumar
This was a project-based work in which I was guided to implement quantum-based gates equivalent to classical gates. The project name was "FPGA Implementation of Digital Logic Design using Quantum Computing". It aims to mitigate a problem: in quantum computing, NAND-based circuits are not shown to be universal as they are in classical logic. The "IBM Quantum Composer" was therefore used to try to build a circuit that behaves as a NAND gate and is also reversible, as quantum physics requires, and the circuitry was simulated using Verilog.
FPGA Implementation of SubByte & Inverse SubByte for AES Algorithm - ijsrd.com
The Advanced Encryption Standard (AES) was accepted as a Federal Information Processing Standard (FIPS). In traditional look-up table (LUT) approaches, the unbreakable delay is longer than the total delay of the rest of the operations in each round, and the LUT approach consumes a large area. It is more efficient to apply composite field arithmetic in the SubBytes transformation of the AES algorithm. This not only reduces complexity but also enables deep sub-pipelining, so that higher speed can be achieved. Isomorphic mapping can be employed to convert GF(2^8) to GF(((2^2)^2)^2), so that the multiplicative inverse can be obtained easily. The SubBytes and InvSubBytes transformations are merged using composite field arithmetic. This is most important for the implementation of a low-cost, high-throughput AES architecture. Compared with the typical ROM-based look-up table, the presented implementation is both capable of higher speeds, since it can be pipelined, and small in terms of area occupancy (137/1290 slices on a Spartan III XCS200-5FPGA).
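The multiplicative inverse at the heart of SubBytes can be sketched directly in GF(2^8). The Python below is a reference model only and does not implement the composite-field mapping the abstract describes: it multiplies modulo the AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11B, from the AES specification rather than from this abstract) and computes the inverse as a^254. The names `gf_mul` and `gf_inv` are hypothetical.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, the AES reduction polynomial

def gf_mul(a, b):
    """Multiply two elements of GF(2^8) with shift-and-add, reducing as we go."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY    # reduce when the degree reaches 8
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse as a**254; by AES convention 0 maps to 0."""
    result = 1
    power = a
    exponent = 254
    while exponent:
        if exponent & 1:
            result = gf_mul(result, power)
        power = gf_mul(power, power)
        exponent >>= 1
    return result

print(hex(gf_inv(0x53)))
```

A composite-field design replaces this single 8-bit inversion with arithmetic on smaller subfields, which is what makes the pipelined, low-area implementation described above possible; a model like this one is useful mainly as a reference to check such a design against.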
Design and implementation of Parallel Prefix Adders using FPGAs - IOSR Journals
Abstract: Adders are frequently used in VLSI designs. In digital design, the basic adders are the half adder and the full adder, from which ripple carry adders (RCAs) can be implemented. An RCA can add numbers of any width, but it is a serial adder and suffers from a computation delay problem: as the number of half adders and full adders increases, the delay increases too. That is why we turn to parallel adders (parallel prefix adders). The parallel prefix adders considered are the KS (Kogge-Stone) adder, the SKS (sparse Kogge-Stone) adder, the spanning tree adder and the Brent-Kung adder. These adders are designed and implemented on an FPGA Spartan-3E kit, and simulated and synthesized using ModelSim 6.4b and Xilinx ISE 10.1.
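To make the contrast with the serial ripple carry adder concrete, here is a small Python model of the Kogge-Stone prefix computation. It is a behavioural sketch with hypothetical names, not the paper's FPGA design: generate and propagate signals are combined over distances 1, 2, 4, ... so that all carries are available after about log2(n) stages.

```python
def kogge_stone_add(x, y, n_bits):
    """Add two unsigned n_bits-wide integers; return (sum bits as int, carry out)."""
    g = [((x >> i) & (y >> i)) & 1 for i in range(n_bits)]  # generate
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(n_bits)]  # propagate
    half_sum = p[:]  # per-bit XOR, needed again for the final sum
    dist = 1
    while dist < n_bits:
        new_g, new_p = g[:], p[:]
        for i in range(dist, n_bits):
            # Prefix operator: merge the group ending at i with the one ending at i - dist.
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2
    # g[i] is now the carry out of bit i (carry-in to the adder is 0).
    total = 0
    for i in range(n_bits):
        carry_in = g[i - 1] if i > 0 else 0
        total |= (half_sum[i] ^ carry_in) << i
    return total, g[n_bits - 1]

print(kogge_stone_add(200, 100, 8))  # 300 overflows 8 bits, so the carry out is set
```

Each pass of the `while` loop corresponds to one prefix stage in hardware, where all positions `i` are updated in parallel; the sparse Kogge-Stone and Brent-Kung variants named in the abstract trade some of that parallelism for fewer prefix cells.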
Evolution of Structure of Some Binary Group-Based N-Bit Comparator, N-To-2N De... - VLSICS Design
Reversible logic has attracted substantial interest because of its low power consumption, which is the main concern of low-power VLSI systems. This paper introduces a novel 4x4 reversible gate, called the inventive gate, and uses it to construct 1-bit, 2-bit, 8-bit, 32-bit and n-bit group-based reversible comparators with low values of the reversible parameters. MOS transistor realizations of the 1-bit, 2-bit and 8-bit reversible comparators are also presented, and their power, delay and power-delay product (PDP) are determined with an appropriate aspect ratio W/L. The inventive gate can also be used as an n-to-2^n decoder. The novel reversible circuit designs are compared with existing ones. The comparison shows that the novel reversible gate has wide utility and that the group-based reversible comparator outperforms existing designs in terms of number of gates, garbage outputs and constant inputs.
Implementation of quantum gates using verilog - Shashank Kumar
The XOR, AND and OR gates are implemented as quantum circuits with the help of IBM Quantum Composer, a graphical programming tool. The quantum circuits are also described in an HDL, Verilog, using Xilinx ISE Design Suite version 14.7 to visualize the simulation waveforms for the XOR, AND, OR and NAND gates. Because the NAND gate is not found to be a universal gate in quantum computing, the work tries to build a NAND gate that is also reversible, and simulates it in Verilog to obtain the desired NAND output.
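One common way to obtain a reversible NAND is to use a Toffoli gate with its target input fixed to 1. The abstract does not say which construction the project used, so the Python sketch below is an assumption offered for illustration, with hypothetical function names.

```python
def toffoli(a, b, c):
    # Toffoli (CCNOT) gate: flips the target c only when both controls are 1.
    return a, b, c ^ (a & b)

def reversible_nand(a, b):
    # With the target preset to 1, the third output is 1 XOR (a AND b) = NAND(a, b).
    _, _, out = toffoli(a, b, 1)
    return out

for a in (0, 1):
    for b in (0, 1):
        print(a, b, reversible_nand(a, b))
```

The gate stays reversible because the two control inputs are passed through unchanged, and applying the Toffoli gate a second time restores the original target value.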
A Novel Design of a 4 Bit Reversible ALU using Kogge-Stone Adder - ijtsrd
Reversible circuits are one promising direction with applications in the field of low-power design or quantum computation. However, no real design flow for this new kind of circuits exists so far. Significant contributions have been made in the literature towards the design of reversible logic gate structures and arithmetic units; however, there are not many efforts directed towards the design of reversible ALUs. In this paper, a novel programmable reversible Kogge-Stone adder is presented and verified, and its implementation in the design of a reversible Arithmetic Logic Unit is demonstrated. Then, reversible implementations of ripple-carry and Kogge-Stone carry look-ahead adders are analyzed and compared in terms of delay. The proposed design consists of the reversible Fredkin, Feynman, MG, HNG, PG and RKSC gates. The performance characteristics analysis is carried out in the Xilinx environment. Swetha Potharla | Rajkumar R, "A Novel Design of a 4 Bit Reversible ALU using Kogge-Stone Adder", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-1 | Issue-6, October 2017, URL: http://www.ijtsrd.com/papers/ijtsrd5758.pdf http://www.ijtsrd.com/engineering/electronics-and-communication-engineering/5758/a-novel-design-of-a-4-bit-reversible-alu-using--kogge-stone-adder/swetha-potharla
IJERA (International Journal of Engineering Research and Applications) is an international online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
Efficient Design of Reversible Multiplexers with Low Quantum Cost - IJERA Editor
Multiplexing is the generic term for the operation of sending one or more analogue or digital signals over a common transmission line at different times or speeds; the device used to do this is called a multiplexer. In digital electronics, multiplexers are also known as data selectors because they can "select" each input line. They are made from individual analogue switches encased in a single IC package, as opposed to "mechanical" selectors such as standard conventional switches and relays. Today, reversibility has become an essential part of the digital world for making digital circuits more efficient. In this paper, we propose a new method to reduce quantum cost and power for various multiplexers. The results are simulated in Xilinx using VHDL.
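A reversible data selector can be illustrated with the Fredkin (controlled-swap) gate, whose second output acts as a 2:1 multiplexer. The paper's own method is not described in the abstract, so this Python sketch is only one well-known construction, and the names `mux2` and `mux4` are hypothetical.

```python
def fredkin(s, a, b):
    # Fredkin gate: swaps a and b when the control s is 1, otherwise passes them through.
    return (s, b, a) if s else (s, a, b)

def mux2(sel, d0, d1):
    # The second Fredkin output is d0 when sel == 0 and d1 when sel == 1.
    _, out, _ = fredkin(sel, d0, d1)
    return out

def mux4(s1, s0, d0, d1, d2, d3):
    # A 4:1 multiplexer as a tree of three 2:1 multiplexers.
    return mux2(s1, mux2(s0, d0, d1), mux2(s0, d2, d3))

print(mux4(1, 0, 0, 0, 1, 0))  # select lines 1,0 pick input d2
```

Each Fredkin gate leaves one unused output, so wider multiplexers built this way accumulate garbage outputs, which is one of the costs a low-quantum-cost design tries to reduce.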
MF-RALU: design of an efficient multi-functional reversible arithmetic and l... - IJECEIAES
Most modern computer applications use reversible logic gates to solve power dissipation issues. This manuscript uses an efficient multi-functional reversible arithmetic and logical unit (MF-RALU) to perform 30 operations. The 32-bit MF-RALU includes arithmetic, logical, complement, shifters, multiplexers, different adders, and multipliers. Multi-bit reversible multiplexers are used to construct the MF-RALU structure. A reduced instruction set computer (RISC) processor is designed to realize the functionality of the MF-RALU. The MF-RALU can perform its operation in a single clock cycle. The 1-bit RALU is developed and compared with existing approaches, with improvements in performance metrics. The 32-bit reversible arithmetic units (RAUs) and reversible logical units (RLUs) are constructed using the 1-bit RALU. The MF-RALU and RISC processor are synthesized individually in the Vivado environment using Verilog HDL and implemented on an Artix-7 field-programmable gate array (FPGA). The MF-RALU utilizes <11% of the chip area and consumes 332 mW total power. The RISC processor utilizes <3% of the chip area and works at a frequency of 483 MHz while consuming 159 mW of total power on the Artix-7 FPGA.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
With advances in CMOS technology, a lot of research has been done to develop various logic styles that improve the performance of logic circuits. D flip-flops (DFFs) are fundamental building blocks in almost every sequential logic circuit. Hence, the overall performance of a sequential logic circuit is affected by the performance of its constituent DFFs. In recent years, the focus has been on incorporating higher clock rates in processors for better performance. To achieve high clock rates, fine-granularity pipelining techniques are used, which implies that there are relatively fewer levels of logic in each pipeline stage. A major consequence of this design trend is that the pipeline overhead has become more significant. The primary causes of pipeline overhead are the latency of the flip-flop or latch used to design the processor and the clock skew of the system. This calls for incorporating logic functionality within the architecture of the flip-flop. The new family of flip-flops is called Embedded Logic Flip-Flops. In this paper, we review the various flip-flop architectures that have been proposed so far. Our aim is a qualitative analysis and comparison of the proposed embedded logic flip-flop designs.
This is a presentation on FPGA from my 3rd year academics which was the field of my mini project/seminar in the same. Main emphasis is laid on the application of FPGA in DSP domain
LOGIC OPTIMIZATION USING TECHNOLOGY INDEPENDENT MUX BASED ADDERS IN FPGAVLSICS Design
Adders form an almost obligatory component of every contemporary integrated circuit. The prerequisite of the adder is that it is primarily fast and secondarily efficient in terms of power consumption and chip area. Therefore, careful optimization of the adder is of the greatest importance. This optimization can be attained
in two levels; it can be circuit or logic optimization. In circuit optimization the size of transistors are manipulated, where as in logic optimization the Boolean equations are rearranged (or manipulated) to optimize speed, area and power consumption. This paper focuses the optimization of adder through technology independent mapping. The work presents 20 different logical construction of 1-bit adder cell in CMOS logic and its performance is analyzed in terms of transistor count, delay and power dissipation. These performance issues are analyzed through Tanner EDA with TSMC MOSIS 250nm technology. From this analysis the optimized equation is chosen to construct a full adder circuit in terms of multiplexer. This logic optimized multiplexer based adders are incorporated in selected existing adders like ripple carry
adder, carry look-ahead adder, carry skip adder, carry select adder, carry increment adder and carry save adder and its performance is analyzed in terms of area (slices used) and maximum combinational path delay as a function of size. The target FPGA device chosen for the implementation of these adders was Xilinx ISE 12.1 Spartan3E XC3S500-5FG320. Each adder type was implemented with bit sizes of: 8, 16, 32, 64 bits. This variety of sizes will provide with more insight about the performance of each adder in terms of area and delay as a function of size.
This is a classroom presentation for the basic concepts of HDL, using Verilog as the programming language. Module 2 deals with simulation and synthesis in Verilog.
This is a classroom presentation for the basic concepts of HDL, using Verilog as the programming language. Module 3 deals with programmable logic devices.
This is a slideshow depicting the importance of guru in the spiritual life. And in addition, about the practice of Gaayathree manthra and its specialties.
This is a slide-show containing the names of all the literary works written by Adi Shankaracharya. The last slide contains the summary of his greatest contributions.
An introduction to the practice of Ashtangayoga, with some prerequisites and attitudinal changes, concluding with some valid health tips and lifestyle changes.
Preparation to yogic breathing as well as some popular methods of yogic breathing (pranayama) are mentioned here, along with some additional health tips.
These slides are with less text and more pictures, with each slide sequentially related to the next one in an intuitive way, and hence the viewer should follow his/her intuitive skills in order to comprehend the flow. The truth is one, ultimately.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. 
Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfKamal Acharya
The College Bus Management system is completely developed by Visual Basic .NET Version. The application is connect with most secured database language MS SQL Server. The application is develop by using best combination of front-end and back-end languages. The application is totally design like flat user interface. This flat user interface is more attractive user interface in 2017. The application is gives more important to the system functionality. The application is to manage the student’s details, driver’s details, bus details, bus route details, bus fees details and more. The application has only one unit for admin. The admin can manage the entire application. The admin can login into the application by using username and password of the admin. The application is develop for big and small colleges. It is more user friendly for non-computer person. Even they can easily learn how to manage the application within hours. The application is more secure by the admin. The system will give an effective output for the VB.Net and SQL Server given as input to the system. The compiled java program given as input to the system, after scanning the program will generate different reports. The application generates the report for users. The admin can view and download the report of the data. The application deliver the excel format reports. Because, excel formatted reports is very easy to understand the income and expense of the college bus. This application is mainly develop for windows operating system users. In 2017, 73% of people enterprises are using windows operating system. So the application will easily install for all the windows operating system users. The application-developed size is very low. The application consumes very low space in disk. Therefore, the user can allocate very minimum local disk space for this application.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSEDuvanRamosGarzon1
AIRCRAFT GENERAL
The Single Aisle is the most advanced family aircraft in service today, with fly-by-wire flight controls.
The A318, A319, A320 and A321 are twin-engine subsonic medium range aircraft.
The family offers a choice of engines
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
The Benefits and Techniques of Trenchless Pipe Repair.pdf
System design using HDL - Module 5
1.
2. ABBREVIATIONS
CPLD Complex Programmable Logic Device
FPGA Field Programmable Gate Array
GAL Generic Array Logic
HDL Hardware Description Language
IEEE Institute of Electrical and Electronics Engineers
IP Intellectual Property
ILA Integrated Logic Analyzer
ISE Integrated Software Environment
ISP In-System Programming
JKFF Jack-Kilby Flip Flop
JTAG Joint Test Action Group
LEC Logic Equivalence Checker
LMG Logic Modeling Group
LUT Look-Up Table
NGC Native Generic Compiler
OTP One-Time Programmable
PACE Pin-out and Area Constraints Editor
PAL Programmable Array Logic
PCI Peripheral Component Interconnect
PLA Programmable Logic Array
TBW Test-Bench Waveform
UCF User Constraints File
VHDL VHSIC Hardware Description Language
VHSIC Very High Speed Integrated Circuit
XST Xilinx Synthesis Technology
3. SYSTEM DESIGN USING HDL (ECE43)
Textbook: Digital System Design Using Verilog, Charles Roth, Lizy Kurian John, Byeong Kil Lee, 1st Edition, 2016, Cengage Learning
#  Textbook sections
1  2.1, 2.2, 2.3 - 2.8, 2.11, 2.13 - 2.15
2  2.9, 2.10, 2.12, 2.16 - 2.19, 8.1, 8.2
3  3.1 - 3.4, 5.1, 5.2.1, 5.3
4  4.1 - 4.5, 4.8, 4.6, 4.7, 4.9, 4.11
5  6.1 - 6.5, 6.7 - 6.12
DESIGNING WITH FPGA
4. 07/03/2019
Aravinda K., Dept. of E&C, NHCE, Bengaluru 4
Example-1: Design of a 4:1 multiplexer using FPGA
Configurable Logic Block in FPGA
Each CLB in the FPGA contains
two 4-variable function generators.
It also contains two flip-flops which
can be used for latching the function.
5.
As a 4:1 mux has 6 inputs (4 data and 2 select), it cannot be
implemented using 1 CLB in the given FPGA. Therefore, the 4:1
mux can be decomposed into 2:1 mux blocks. Flip-flops are of
no use here.
M = S1'S0'I0 + S1'S0I1 + S1S0'I2 + S1S0I3
M1 = S0'I0 + S0I1
M2 = S0'I2 + S0I3
M = S1'M1 + S1M2
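The decomposition can be checked exhaustively. The Python sketch below is an illustration, not part of the original slides; the function names are assumptions. It compares the direct equation for M with the version built from three 2:1 muxes over all 64 input combinations.

```python
from itertools import product

def mux4(s1, s0, i0, i1, i2, i3):
    """Direct sum-of-products form of the 4:1 mux."""
    n1, n0 = 1 - s1, 1 - s0
    return (n1 & n0 & i0) | (n1 & s0 & i1) | (s1 & n0 & i2) | (s1 & s0 & i3)

def mux2(s, x, y):
    """2:1 mux: x when s = 0, y when s = 1."""
    return ((1 - s) & x) | (s & y)

def mux4_decomposed(s1, s0, i0, i1, i2, i3):
    """M1 and M2 select on S0; the final mux selects on S1."""
    m1 = mux2(s0, i0, i1)
    m2 = mux2(s0, i2, i3)
    return mux2(s1, m1, m2)

# Collect every input combination where the two forms disagree.
mismatches = [v for v in product((0, 1), repeat=6)
              if mux4(*v) != mux4_decomposed(*v)]
print(len(mismatches))
```

Each of M1, M2 and M has only 3 inputs, so each fits in one 4-variable function generator.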
6.
Instead of logic equations,
modern FPGAs use LUT as a
basic building block.
For this particular design of
4:1 mux, the contents of
LUT4 are as shown:
Each LUT4 can
implement a 1-bit function
of 4 input variables.
Hence, each LUT4 needs
16 SRAM cells, one per
input combination
(“don’t care” terms need
to be included as logic
states in the LUT).
3 LUT4s require 48
SRAM cells. Therefore,
this is an expensive
implementation.
7.
If the CLB in the FPGA has a
provision to combine the outputs
of the function generators, then
the 4:1 mux can be implemented
using a single CLB. (e.g.: XC4000)
This method requires
2 LUT4s and 1 LUT3.
Hence, the number of
SRAM cells required
is, 16 + 16 + 8 = 40.
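The two SRAM cell counts can be reproduced by generating the LUT contents. This is a minimal Python sketch under stated assumptions: the input ordering (select, two data inputs, one unused input) and the helper names are hypothetical, since the LUT figure is not part of the transcript.

```python
from itertools import product

def lut_contents(func, n_inputs):
    """Truth table stored in an n-input LUT: one SRAM cell per input combination."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]

def mux2_on_lut4(s, x, y, unused):
    """2:1 mux on a LUT4; the unused input is a don't care but still costs cells."""
    return y if s else x

def mux2_on_lut3(s, x, y):
    """2:1 mux on a LUT3; every input is used."""
    return y if s else x

lut4 = lut_contents(mux2_on_lut4, 4)
lut3 = lut_contents(mux2_on_lut3, 3)

three_lut4_cells = 3 * len(lut4)                  # M1, M2 and M each in a LUT4
combined_clb_cells = 2 * len(lut4) + len(lut3)    # M1, M2 in LUT4s, M in a LUT3
print(three_lut4_cells, combined_clb_cells)       # 48 40
```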
8.
Example-2: Circular Shift Register (Ring Counter)
Even though FGs are of no use, they have to be used.
9.
Q1: How many CLBs are required to implement a 3-to-8 decoder?
A1: If the decoder is implemented using logic gates, 3 NOT gates and
8 AND gates are required. When CLBs are used, as there are 8 outputs,
one for each input combination, 8 FGs are required. As each CLB
contains 2 sets of (FG4+FF), the number of CLBs required is 4.
If an LUT-based FPGA is used, each output requires 8 SRAM cells and
one 8:1 mux. Thus, 8 LUTs are required. But in the CLB, as
each FG4 contains 16 SRAM cells (including the “don’t care” term),
the 8 FG4s require 16x8 = 128 SRAM cells.
[Figure: LUT contents 1 0 0 0 0 0 0 0 for one decoder output, with inputs 0 0 0]
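The counts in A1 follow from simple arithmetic, sketched here in Python. The constants reflect the CLB described earlier (two 4-input function generators per CLB); the function and variable names are illustrative assumptions.

```python
import math
from itertools import product

N_INPUTS = 3
FG_INPUTS = 4       # each function generator is a LUT4
FGS_PER_CLB = 2     # two (FG4 + FF) sets per CLB

def decoder_output(k, bits):
    """Output k of the decoder is 1 only when the inputs encode k."""
    value = int("".join(map(str, bits)), 2)
    return int(value == k)

# One 8-entry truth table per decoder output.
tables = [[decoder_output(k, bits) for bits in product((0, 1), repeat=N_INPUTS)]
          for k in range(2 ** N_INPUTS)]

fgs = len(tables)                        # one FG per output
clbs = math.ceil(fgs / FGS_PER_CLB)
sram_cells = fgs * 2 ** FG_INPUTS        # unused input doubles each table
print(tables[0], clbs, sram_cells)
```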
10.
Implementing functions using Shannon’s decomposition
Shannon’s expansion
theorem helps to decompose a
function of a larger number
of variables into functions
of fewer variables.
In the example shown,
instead of a single 6-variable
FG, two 5-variable FGs along
with ½ of a 3rd FG are used to
realize the function.
Thus, Shannon’s
expansion theorem
helps in the reduction
of hardware.
11.
Example-3: Consider Z = abcd’ef’ + a’b’c’def’ + b’cde’f
By setting a = 0, Z = b’c’def’ + b’cde’f => Z0
By setting a = 1, Z = bcd’ef’ + b’cde’f => Z1
Hence, Z = a’Z0 + aZ1.
Therefore, two LUT5s, along with either a 2:1 mux or
another LUT5, can be utilized for implementing the function.
The number of terms in Z0 or Z1 does not matter, as each is
going to be implemented by a LUT.
If only LUT4s are available in the CLB, then the function needs
to be decomposed further, by using “a” and “b” together.
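As a check on Example-3, the following Python sketch (not from the slides; names are assumptions) confirms that selecting between Z0 and Z1 with a 2:1 mux on “a” reproduces Z for all 64 input combinations.

```python
from itertools import product

def Z(a, b, c, d, e, f):
    """Z = abcd'ef' + a'b'c'def' + b'cde'f, with x' written as (1 - x)."""
    n = lambda x: 1 - x
    t1 = a & b & c & n(d) & e & n(f)
    t2 = n(a) & n(b) & n(c) & d & e & n(f)
    t3 = n(b) & c & d & n(e) & f
    return t1 | t2 | t3

def Z0(b, c, d, e, f):
    """Cofactor for a = 0: b'c'def' + b'cde'f."""
    n = lambda x: 1 - x
    return (n(b) & n(c) & d & e & n(f)) | (n(b) & c & d & n(e) & f)

def Z1(b, c, d, e, f):
    """Cofactor for a = 1: bcd'ef' + b'cde'f."""
    n = lambda x: 1 - x
    return (b & c & n(d) & e & n(f)) | (n(b) & c & d & n(e) & f)

def shannon(a, b, c, d, e, f):
    """Z rebuilt as a'Z0 + aZ1; the final step is a 2:1 mux selected by a."""
    return Z1(b, c, d, e, f) if a else Z0(b, c, d, e, f)

expansion_holds = all(Z(*v) == shannon(*v) for v in product((0, 1), repeat=6))
print(expansion_holds)
```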
12.
This method requires seven LUT4s, in general.
13.
Example-4: Consider
Z = abcd’ef’+a’b’c’def’+b’cde’f
Substituting a = 0 & b = 0,
Y0 = c’def’ + cde’f
Substituting a = 0 & b = 1,
Y1 = 0
Substituting a = 1 & b = 0,
Y2 = cde’f
Substituting a = 1 & b = 1,
Y3 = cd’ef’
As there is a null function,
this requires only 5 LUT4s.
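The cofactors of Example-4 can be generated mechanically. In this Python sketch (an illustration with assumed names), each cofactor is the 16-entry truth table of one LUT4, and a cofactor that is all zeros needs no LUT. The LUT4 counts for the recombination step (3 when all four cofactors are live, 2 when one is null) are stated as an assumption in the code.

```python
from itertools import product

def Z(a, b, c, d, e, f):
    """Z = abcd'ef' + a'b'c'def' + b'cde'f."""
    n = lambda x: 1 - x
    return ((a & b & c & n(d) & e & n(f))
            | (n(a) & n(b) & n(c) & d & e & n(f))
            | (n(b) & c & d & n(e) & f))

def cofactor(a, b):
    """Truth table of Z over (c, d, e, f) with a and b fixed: one LUT4's contents."""
    return tuple(Z(a, b, *rest) for rest in product((0, 1), repeat=4))

# Y0..Y3 correspond to (a, b) = (0,0), (0,1), (1,0), (1,1).
Y = [cofactor(a, b) for a, b in product((0, 1), repeat=2)]

non_null = [y for y in Y if any(y)]   # a constant-0 cofactor needs no LUT
# Assumption: a'b'Y0 + a'bY1 + ab'Y2 + abY3 is recombined with LUT4s only.
# Four live cofactors give 6 inputs (3 LUT4s); three give 5 inputs (2 LUT4s).
combine_luts = {4: 3, 3: 2}[len(non_null)]
total_lut4 = len(non_null) + combine_luts
print([int(any(y)) for y in Y], total_lut4)
```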
14.
Q2: What is the max. no. of LUT5s for realizing a 7-variable function?
A2: Four LUT5s are required for implementing Y0, Y1, Y2 and Y3. Another
LUT5 is required to implement the first three terms of
Z = a’b’Y0 + a’bY1 + ab’Y2 + abY3. The last LUT5
is required to combine the last term with the previous output. Thus, the
total no. of LUT5s required is 6. However, in the last LUT5, one input
remains unused, and it has to be considered as “don’t care”.
Example-5: Implement a 7-variable function
using 4-input LUTs and 2:1 multiplexers.
(7-variable function = Two LUT6 + One 2:1 mux). (6-variable
function = Two LUT5 + One 2:1 mux). (5-variable function =
Two LUT4 + One 2:1 mux). Substituting accordingly, we obtain,
(7-variable function = Eight LUT4 + Seven 2:1 mux).
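The substitution in Example-5 is a recurrence, which the short Python sketch below makes explicit. The function name is an assumption introduced for illustration.

```python
def lut_and_mux_count(n_vars, lut_inputs=4):
    """Return (LUTs, 2:1 muxes) needed to build an n-variable function by
    repeatedly splitting on one variable (Shannon expansion)."""
    if n_vars <= lut_inputs:
        return 1, 0
    luts, muxes = lut_and_mux_count(n_vars - 1, lut_inputs)
    # Two copies of the smaller function plus one mux to select between them.
    return 2 * luts, 2 * muxes + 1

for n in (5, 6, 7):
    print(n, lut_and_mux_count(n))
```

For n = 7 this gives 8 LUT4s and 7 muxes, matching the result above.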
15.
If muxes are unavailable in the CLB,
then more LUTs are needed. Xilinx
Spartan FPGAs provide muxes in
addition to LUT4s. A logic unit in
these FPGAs is called a “slice”.
[Figure: SLICE]
16.
REALIZATION OF 7-VARIABLE
FUNCTION USING 4 SLICES
Example-6: Implement the parity
function A⊕B⊕C⊕D⊕E using 4-
variable Function Generators.
For direct implementation, this
5-variable function requires only
one LUT5.
Using Shannon’s expansion,
this function can be decomposed
into two 4-variable functions,
and can be realized using two
LUT4s and one 2:1 multiplexer.
If multiplexer is not present in
the CLB, then it requires three
LUT4s.
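The two-LUT4-plus-mux realization of Example-6 can be sketched in Python as below. Splitting on E is an assumption made for illustration; any of the five variables would work, and the names are hypothetical.

```python
from itertools import product
from functools import reduce
from operator import xor

def parity5(a, b, c, d, e):
    """Reference parity function A xor B xor C xor D xor E."""
    return a ^ b ^ c ^ d ^ e

# Cofactors with respect to E: each is a 4-variable function, i.e. one LUT4.
lut_e0 = {v: reduce(xor, v) for v in product((0, 1), repeat=4)}        # E = 0
lut_e1 = {v: 1 - reduce(xor, v) for v in product((0, 1), repeat=4)}    # E = 1

def parity5_shannon(a, b, c, d, e):
    """Two LUT4 look-ups followed by a 2:1 mux selected by E."""
    return lut_e1[(a, b, c, d)] if e else lut_e0[(a, b, c, d)]

ok = all(parity5(*v) == parity5_shannon(*v) for v in product((0, 1), repeat=5))
print(ok)
```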
17.
CARRY CHAINS IN FPGA
Addition is a very common
operation in digital circuits.
As LUT4 is a standard building
block in FPGA, two LUT4s are
required per bit: one for the sum
and one for the carry.
Thus, an n-bit adder requires
‘2n’ LUT4s.
But, if the FPGA can provide
dedicated circuitry for generating
and propagating the carry bit to the
next stage, then only ‘n’ LUT4s
are required, for the sum bits.
The dedicated carry chain
generates the carry bit in parallel.
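A minimal Python sketch of the LUT view of an adder follows. The table and function names are assumptions; each dictionary plays the role of one LUT's contents for a single full-adder bit (the fourth LUT4 input is unused and omitted here).

```python
from itertools import product

# Contents of the two LUTs of one full-adder bit, indexed by (a, b, cin).
SUM_LUT = {v: v[0] ^ v[1] ^ v[2] for v in product((0, 1), repeat=3)}
CARRY_LUT = {v: int(sum(v) >= 2) for v in product((0, 1), repeat=3)}

def ripple_add(x, y, n_bits):
    """Add two n-bit numbers one bit at a time using only LUT look-ups."""
    carry, result = 0, 0
    for i in range(n_bits):
        key = ((x >> i) & 1, (y >> i) & 1, carry)
        result |= SUM_LUT[key] << i
        carry = CARRY_LUT[key]
    return result, carry

def lut4_count(n_bits, dedicated_carry_chain):
    """n sum LUTs, plus n carry LUTs when no carry chain is available."""
    return n_bits if dedicated_carry_chain else 2 * n_bits

print(ripple_add(9, 7, 4), lut4_count(8, False), lut4_count(8, True))
```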
18.
CASCADE CHAINS IN FPGA
For a function with large number of variables, the FPGAs provide
cascade AND and cascade OR chains (for PoS and SoP terms).
Thus, instead of using separate FGs to perform AND or OR
functions, the cascade circuitry can be used to create such functions.
Hence, for a 32 variable SoP function, only 8 LUT4s are required.
But without the cascade chain, 11 LUT4s are required (8 + 2 + 1).
FPGAs such as Altera Stratix IV provide register chains as well.
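The 8 versus 11 comparison can be reproduced with a short count. This Python sketch assumes that, without a cascade chain, the first-level LUT outputs are combined by further levels of LUT4s; the function names are hypothetical.

```python
import math

def lut4_tree_count(n_vars, lut_inputs=4):
    """LUT4s needed when partial results are combined by further LUT4s."""
    total, signals = 0, n_vars
    while signals > 1:
        level = math.ceil(signals / lut_inputs)
        total += level
        signals = level
    return total

def lut4_cascade_count(n_vars, lut_inputs=4):
    """LUT4s needed when a cascade chain combines the first-level outputs."""
    return math.ceil(n_vars / lut_inputs)

print(lut4_cascade_count(32), lut4_tree_count(32))
```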
19.
Examples of Logic blocks in commercial FPGAs
Kintex uses LUT6 in each slice. CLB contains 4 copies of the slice.
Xilinx Virtex and Spartan FPGAs use LUT4. Each slice contains
two FGs, two muxes, two flip-flops, and additional logic.
1. Xilinx Kintex CLB
20.
Each Stratix IV LM contains two LUT6 and two flip-flops. Each
LUT6 has two independent inputs and four shared inputs. In
addition, two 1-bit adders are built in, with carry chaining.
Flip-flops with register chaining allow shift registers to be created.
2. Altera Stratix IV Logic Module
21.
Fusion VersaTile consists of muxes and logic gates. Each block
has 4 inputs – X1, X2, X3 and XC. The VersaTile block is of
significantly finer grain than the LUT4 present in other FPGAs.
Each VersaTile can be configured as: 3-input logic function, or
latch with (clear or set), or D flip-flop with (enable, clear or set).
3. Microsemi Fusion VersaTile
22.
DEDICATED MULTIPLIERS IN FPGA
Multiplication is also a common operation, and implementing
it requires several programmable logic blocks. In addition, such a
multiplier will be slower, because of the interconnecting switches.
Hence, some Xilinx and Altera FPGAs contain dedicated 18×18
multipliers. When multiplication of larger numbers is required,
several of the built-in multipliers can be put together.
e.g., if A and B are of 32 bits, then they can be represented as:
$A = (C \times 2^{16}) + D$, $B = (E \times 2^{16}) + F$ and $A \times B = (CE \times 2^{32}) + (DE + CF) \times 2^{16} + DF$.
Thus, 4 multipliers are required to generate the partial products CE,
DE, CF & DF, which are later added by means of several adders.
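The partial-product identity can be checked numerically. In this Python sketch (function name and test operands are assumptions), each of the four products has 16-bit operands, which fit within a dedicated 18×18 multiplier.

```python
MASK16 = (1 << 16) - 1

def mul32_from_16bit_parts(a, b):
    """32x32 product from four 16x16 partial products."""
    c, d = a >> 16, a & MASK16      # A = C * 2**16 + D
    e, f = b >> 16, b & MASK16      # B = E * 2**16 + F
    ce, de, cf, df = c * e, d * e, c * f, d * f   # the four multipliers
    return (ce << 32) + ((de + cf) << 16) + df    # shifts and adders

a, b = 0xDEADBEEF, 0x12345678
print(mul32_from_16bit_parts(a, b) == a * b)
```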
23.
Cost of programmability
The logic block shown requires a total of 46 SRAM cells
(276 transistors) for its configuration. There will be additional configuration
bits required, for programmable interconnect and for programmable
I/O. Thus, the flexibility of programmable points comes with a much
higher additional cost of associated memory cells (SRAM/Flash).
e.g.: Xilinx Virtex-II XC2V40 (with 512 LUT4s & 88 I/O pins), needs
338,976 configuration bits. Virtex-II XC2V8000 (with 93,184 LUT4s &
1108 I/O pins), needs more than 26 million configuration bits.
24.
FPGAs and One-Hot state assignment
While implementing a state machine, in general, state
encoding is performed with ‘n’ bits for $2^n$ states. e.g.: for
a machine with 4 states, 2-bit encoding is used.
An increase in ‘n’ requires a larger no. of logic blocks.
For faster implementation of the design, it is desirable
to reduce the no. of logic blocks and interconnections.
Hence, instead of the encoding method, the one-hot method
can be used, which reduces the no. of logic blocks.
This method, in turn, increases the no. of
flip-flops; but this does not affect the implementation
much, as each FPGA logic block contains two flip-flops.
25.
For the state graph shown, the
encoding can be 00, 01, 10 and 11.
But with the usage of the one-hot
method, the state encoding will be
1000, 0100, 0010 and 0001. The
states will use one flip-flop each.
The next state equation for the
flip-flop Q3 can be written as,
$Q_3^+ = X_1 Q_0 Q_1' Q_2' Q_3' + X_2 Q_0' Q_1 Q_2' Q_3' + X_3 Q_0' Q_1' Q_2 Q_3' + X_4 Q_0' Q_1' Q_2' Q_3$.
In the one-hot method, this
equation will get reduced to,
$Q_3^+ = X_1 Q_0 + X_2 Q_1 + X_3 Q_2 + X_4 Q_3$.
Here, each term in the equation
contains exactly one state variable.
The output equations are:
$Z_1 = X_1 Q_0 + X_3 Q_2$,
$Z_2 = X_2 Q_1 + X_4 Q_3$.
As terms contain one
state variable each, this
leads to fewer logic cells.
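The reduction is valid because only one flip-flop is set in any legal one-hot state. The Python sketch below (names are assumptions) checks that the full and reduced next-state equations for Q3 agree on the four one-hot states for every input combination.

```python
from itertools import product

def q3_next_full(x, q):
    """Next-state equation with every state variable in each term."""
    n = lambda v: 1 - v
    x1, x2, x3, x4 = x
    q0, q1, q2, q3 = q
    return ((x1 & q0 & n(q1) & n(q2) & n(q3))
            | (x2 & n(q0) & q1 & n(q2) & n(q3))
            | (x3 & n(q0) & n(q1) & q2 & n(q3))
            | (x4 & n(q0) & n(q1) & n(q2) & q3))

def q3_next_one_hot(x, q):
    """Reduced equation: one state variable per term."""
    x1, x2, x3, x4 = x
    q0, q1, q2, q3 = q
    return (x1 & q0) | (x2 & q1) | (x3 & q2) | (x4 & q3)

# (Q0, Q1, Q2, Q3) for the four legal states.
ONE_HOT_STATES = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

agree = all(q3_next_full(x, q) == q3_next_one_hot(x, q)
            for q in ONE_HOT_STATES
            for x in product((0, 1), repeat=4))
print(agree)
```

The two equations differ on illegal states such as (1, 1, 0, 0), which a one-hot machine never enters.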
26.
In electronic designs, a “cell” is
defined as a predesigned and
precharacterized circuit element.
Thus, a cell contains pretested
and prestored instances of the circuit
diagram, its circuit symbol, and its
physical description (layout).
27.
FPGA CAPACITY
(Maximum gates versus usable gates)
An ASIC contains exactly the number of gates that are required
for the design. But an FPGA contains arrays of gates, or arrays of
LUTs. Thus, if a larger design needs to be implemented in an
FPGA, the ASIC designer needs to have an idea of whether the
design will fit into a given FPGA.
For the designer, the number of gates inside the FPGA is not a
useful metric, as the FPGA is programmable. Hence, a term called
“equivalent gate count” is defined, as a count of the circuitry
that can fit into a particular FPGA. This type of gate count is
extremely difficult to compute, as it depends on the type of
circuitry, the type of interconnections, and the
routing resources available in the FPGA.
28.
One method for computing the equivalent gate count for a
CLB is as follows: 2:1 mux = 4 gates, 3-input XOR gate = 6
gates, 4-input XOR gate = 9 gates, Flip-flop = 7 gates, and so
on. Thus, the equivalent gate count for a CLB can be obtained.
The total gate count can be estimated, by multiplying the
equivalent gate count with the number of CLBs in the FPGA.
In general, this type of gate count is likely to be higher than
the gate count of the practical circuitry that is being realized.
Another method is to use the Benchmark circuits (e.g.:
Benchmark suite prepared by PREP [Programmable
Electronics Performance company]). For example, if an ASIC
contains 2000 gates, and if an FPGA can fit 20 copies of the
ASIC, with no routing between the copies, then the maximum
gate count of the FPGA can be considered as 40,000.
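Both estimation methods reduce to simple arithmetic, sketched in Python below. The per-element gate equivalents and the benchmark figures are the ones quoted above; the CLB make-up and the number of CLBs are hypothetical values chosen only for illustration.

```python
# Gate equivalents quoted above.
GATE_EQUIVALENTS = {"mux2": 4, "xor3": 6, "xor4": 9, "flip_flop": 7}

def clb_gate_count(contents):
    """Equivalent gates of one CLB given how many of each element it holds."""
    return sum(GATE_EQUIVALENTS[name] * qty for name, qty in contents.items())

# Hypothetical CLB make-up and CLB total (not from the slides).
example_clb = {"xor4": 2, "mux2": 1, "flip_flop": 2}
n_clbs = 100
per_clb = clb_gate_count(example_clb)
print(per_clb, per_clb * n_clbs)

# Benchmark method: copies of a known ASIC that fit, times its gate count.
asic_gates, copies_that_fit = 2000, 20
print(asic_gates * copies_that_fit)
```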
29.
DESIGN TRANSLATION (SYNTHESIS)
Synthesis is the process of translating an abstract
high-level design into a detailed circuit description.
The synthesis tool implements the digital system as an
interconnection of gates, flip-flops, registers, counters,
muxes, adders, and other basic building blocks.
The representation of the design as a logic schematic,
together with an associated wirelist, is called a netlist.
[Figure: two code fragments; the first results in an AND gate, the second results in an AND gate followed by a flip-flop.]
30.
The synthesis tool performs a line-by-
line translation of HDL into hardware.
The synthesis tool selects components
that are available in the library.
In general, ‘case’ statement results in
muxes, comparison results in adders,
shift results in registers, and so on.
For implementation with different
technologies, different component
libraries can be provided.
The resulting hardware is optimized
later on.
31.
Synthesis of a ‘case’ statement
module case_eg (a,b);
  input [1:0] a;
  output reg [1:0] b;
  always @(a)
  begin
    case (a)
      0: b<=1;
      1: b<=3;
      2: b<=0;
      3: b<=1;
    endcase
  end
endmodule
[Figure: synthesized circuit before optimization; logic optimization; synthesized circuit after optimization]
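One possible result of the logic optimization can be derived from the truth table of case_eg. The equations in this Python sketch are worked out here from the four case branches and are not taken from the original figure, so treat them as an assumption about what the optimized circuit looks like.

```python
CASE_TABLE = {0: 1, 1: 3, 2: 0, 3: 1}   # value of b for each value of a

def b_optimized(a):
    """Two-level logic that can replace the mux produced by the case statement."""
    a1, a0 = (a >> 1) & 1, a & 1
    b1 = (1 - a1) & a0          # b[1] = a[1]' a[0]
    b0 = (1 - a1) | a0          # b[0] = a[1]' + a[0]
    return (b1 << 1) | b0

print([b_optimized(a) for a in range(4)])
```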
32.
Unintentional latch creation
module latch_eg (a,b);
  input [1:0] a;
  output reg b;
  always @(a)
  begin
    case (a)
      0: b<=1;
      1: b<=0;
      2: b<=1;
      // no branch for a == 3, so b must hold its old value (latch)
    endcase
  end
endmodule
[Figure: initial output of naïve synthesizer; optimized output of naïve synthesizer]
33.
[Figure: output of optimizing synthesizer; output of naïve synthesizer]
Solution to eliminate latch
module latch_eg (a,b);
  input [1:0] a;
  output reg b;
  always @(a)
  begin
    case (a)
      0: b<=1;
      1: b<=0;
      2: b<=1;
      3: b<=0;   // every value of a now assigns b
    endcase
  end
endmodule
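The difference between the two versions of latch_eg can be modelled behaviourally. This Python sketch is an illustration with assumed function names: the incomplete case needs the previous output as an extra input, which is the memory that the latch provides, while the complete case is a pure function of a.

```python
def incomplete_case(a, b_prev):
    """Case statement without a branch for a == 3: the output keeps its
    previous value, which is what a latch does."""
    return {0: 1, 1: 0, 2: 1}.get(a, b_prev)

def complete_case(a):
    """Case statement with all four branches."""
    return {0: 1, 1: 0, 2: 1, 3: 0}[a]

# With a == 3 the first version depends on history; the second does not.
print(incomplete_case(3, b_prev=0), incomplete_case(3, b_prev=1))

# The complete version is purely combinational and equals NOT a[0].
print(all(complete_case(a) == 1 - (a & 1) for a in range(4)))
```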
34.
Synthesis of ‘if’ statements
Ambiguous code, that results in latch:
if (A == 1'b1)
begin
  nextstate <= 3;
  Z <= 1;
end
Unambiguous code:
if (A == 1'b1)
begin
  nextstate <= 3;
  Z <= 1;
end
else
begin
  nextstate <= 2;
  Z <= 0;
end
module if_eg (A,B,C,D,E,Z);
  input A,B;
  input [2:0] C,D,E;
  output reg [2:0] Z;
  // C, D and E are read in the block, so they belong in the sensitivity list
  always @(A or B or C or D or E)
  begin
    if (A == 1'b1)
      Z <= C;
    else if (B == 1'b0)
      Z <= D;
    else
      Z <= E;
  end
endmodule
[Figure: synthesized output]
35.
Synthesis of arithmetic components
module ar_eg (clk,A,B,ge,acc,count);
input clk;
input [3:0] A,B;
inout [3:0] acc,count;
output ge;
reg [3:0] acc_t, count_t;
assign acc = acc_t;
assign count = count_t;
assign ge = (A >= B);
always @(posedge clk)
begin
acc_t <= acc +B;
count_t <= count + 1;
end
endmodule
Synthesized output
36.
Example-7: What hardware results
from the statement,
assign LE = (A <= B);
where A and B are 4-bit vectors?
The symbol “<=” is a relational
operator here.
The following statement
inside the ‘always’ block,
LE <= (A <= B);
results in the same hardware
(the first “<=” is an assignment,
the second is the relational operator).
Example-8: What is the
optimized hardware for,
assign EQ3 = (A == 3);
where A is a 4-bit vector?
A naïve synthesizer may
produce a 4-bit comparator,
with ‘A’ and ‘3’ as inputs.
For optimization, the
statement can be altered as:
assign EQ3 =
~A[3]&~A[2]&A[1]&A[0];
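The rewrite in Example-8 is safe because the two forms agree for every 4-bit value of A, which the following Python sketch checks. The function names are assumptions, and le_comparator only models the behaviour of the statement in Example-7 for unsigned vectors.

```python
def eq3_comparator(a):
    """What a naive synthesizer builds: a 4-bit equality compare with 3."""
    return int((a & 0xF) == 3)

def eq3_gates(a):
    """The hand-optimized form ~A[3] & ~A[2] & A[1] & A[0]."""
    bit = lambda i: (a >> i) & 1
    return (1 - bit(3)) & (1 - bit(2)) & bit(1) & bit(0)

def le_comparator(a, b):
    """Behaviour of assign LE = (A <= B) for 4-bit unsigned vectors."""
    return int((a & 0xF) <= (b & 0xF))

print(all(eq3_comparator(a) == eq3_gates(a) for a in range(16)))
print(le_comparator(5, 9), le_comparator(9, 5))
```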
37.
Area, power and delay optimizations
Ideal requirements (practical tradeoffs):
Area of the silicon chip: Minimum
Power consumed: Minimum
Speed of operation: Maximum
Size of the product: Optimum
Weight of the product: Optimum
Memory capacity: Maximum
Cost of the product: Minimum
Delay of operation: Minimum
Area & delay of a circuit
are inversely related (e.g.:
serial v/s parallel).
Energy & delay of a circuit
are also inversely related
(more switching implies
increased dynamic power).
Thus, the Area-Time (AT)
product and the Energy-Delay
(ED) product are the metrics
used to qualify the circuit.
The path with the longest
delay in the circuit is called
the “critical path”.
38.
MAPPING, PLACEMENT AND ROUTING
These are the 3 major steps that happen, to transform
the design that is in the netlist form, to the appropriate
target technology (MPGA, CPLD, FPGA, ASIC).
Mapping is the process of translating the design into the
available building blocks in the target technology. [e.g.: LUT
with mux (Xilinx), Mux with gates (Microsemi)].
In other words, it is the process of binding the technology-
dependent circuits of the target technology to the technology-
independent circuits that are in the design.
In case of FPGA, the design has to be mapped into muxes,
LUTs etc. In case of ASIC, the design has to be mapped into the
standard cells that are available in the library (e.g.: logic gates,
muxes, decoders, encoders, comparators, counters etc.)
39.
Placement is the process
of taking the defined logic &
I/O blocks from the technology
mapper, and assigning them to
the physical locations of the
target implementation.
Routing is the process of
interconnecting those blocks
and sub-blocks on the target
implementation.
“Place & route” are often
done along with each other.
Two of the popular algorithms
used for the same purpose
are, ‘Simulated annealing’ and
‘Iterative improvement’.
40.
In metallurgy, annealing is the process used to toughen
a metal by heating it and then cooling it slowly, in a series
of steps. The temperature is kept high in the beginning, and it
is reduced gradually in the next steps.
In a similar fashion, for placing & routing, the simulated
annealing algorithm takes bigger risks in the beginning, by
making random modifications to a feasible solution, and
gradually arrives at an optimal solution. In the beginning, just
as at high temperature, risky moves are performed. In the next
steps, as the temperature is reduced, the probability of
accepting bad moves decreases.
In contrast, the iterative improvement algorithm accepts
only better solutions in each step. Such algorithms are called
‘greedy’. At the end of simulated annealing, the algorithm
has to be greedy, so as to accept only positive moves.
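The contrast between the two algorithms can be sketched on a toy placement problem. Everything in this Python example is an assumption made for illustration: blocks are placed in a single row of slots, the cost is the total distance between connected blocks, a move swaps two blocks, and the acceptance rule exp(-delta / t) and the cooling schedule are common textbook choices, not details from the slides.

```python
import math
import random

def wire_length(order, nets):
    """Cost of a placement: sum of distances between connected blocks."""
    slot = {block: i for i, block in enumerate(order)}
    return sum(abs(slot[a] - slot[b]) for a, b in nets)

def place(nets, n_blocks, t_start, t_end=0.01, cooling=0.95, moves=50, seed=1):
    """Swap-based placement. With t_start > 0 this is simulated annealing;
    with t_start = 0 no worse swap is ever accepted (iterative improvement)."""
    rng = random.Random(seed)
    order = list(range(n_blocks))
    rng.shuffle(order)
    cost = wire_length(order, nets)
    best_order, best_cost = order[:], cost
    t = t_start
    while True:
        for _ in range(moves):
            i, j = rng.sample(range(n_blocks), 2)
            order[i], order[j] = order[j], order[i]
            new_cost = wire_length(order, nets)
            delta = new_cost - cost
            # Worse moves are accepted with probability exp(-delta / t),
            # which shrinks as the temperature t falls.
            if delta <= 0 or (t > 0 and rng.random() < math.exp(-delta / t)):
                cost = new_cost
                if cost < best_cost:
                    best_order, best_cost = order[:], cost
            else:
                order[i], order[j] = order[j], order[i]   # undo the swap
        if t <= t_end:
            break
        t *= cooling
    return best_order, best_cost

# Eight blocks connected in a chain; the best possible cost is 7.
nets = [(i, i + 1) for i in range(7)]
sa_order, sa_cost = place(nets, 8, t_start=5.0)
greedy_order, greedy_cost = place(nets, 8, t_start=0.0)
print(sa_cost, greedy_cost)
```

At low temperature the acceptance probability for worse moves approaches zero, so the annealing run ends in the greedy regime described above.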