This document compares architectures for implementing the discrete cosine transform (DCT) in an image compression system, with the goal of reducing power consumption. It summarizes four architectures, including a baseline 2D DCT architecture, a row/column distributed arithmetic approach, and a fully pipelined architecture, and analyzes their speed and power savings. The row/column approach provides a 24.4% power savings compared to the baseline. The fully pipelined architecture provides 16.4% power savings and a higher throughput, reported as 4.703 GHz, by exploiting pipelining and parallelism.
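The row/column idea mentioned above rests on the separability of the 2D DCT: a 1D DCT over every row, followed by a 1D DCT over every column. The sketch below is only a software reference for that decomposition, assuming the orthonormal DCT-II; it does not model distributed arithmetic or pipelining, and the function names are illustrative.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (direct O(n^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def dct_2d(block):
    """2-D DCT computed as a row pass followed by a column pass."""
    rows = [dct_1d(r) for r in block]               # 1-D DCT of each row
    cols = [dct_1d(list(c)) for c in zip(*rows)]    # 1-D DCT of each column
    return [list(r) for r in zip(*cols)]            # transpose back to row-major

# A flat 8x8 block has all of its energy in the DC coefficient.
flat = [[1.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
```

In hardware, the two passes are typically separated by a transpose buffer, which is the role the final `zip(*cols)` plays here.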
HIGH SPEED REVERSE CONVERTER FOR HIGH DYNAMIC RANGE MODULI SET (P singh)
In this paper, a new reverse converter architecture for the five-moduli set {2^n, 2^(n/2)-1, 2^(n/2)+1, 2^n+1, 2^(2n-1)-1} is presented. The proposed converter is designed as a two-level architecture using the New Chinese Remainder Theorem-I (New CRT-I) and Mixed Radix Conversion (MRC). The proposed architecture achieves a significant improvement in reverse converter delay compared to state-of-the-art reverse converters.
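To make the task of a reverse converter concrete, the sketch below converts a residue tuple back to a binary integer for this moduli set. It uses the textbook Chinese Remainder Theorem rather than the New CRT-I/MRC two-level scheme of the paper, and it assumes n is even so that n/2 is an integer; the function names are illustrative.

```python
from math import prod

def moduli_set(n):
    """Five-moduli set from the abstract; n is assumed to be even."""
    h = n // 2
    return [2**n, 2**h - 1, 2**h + 1, 2**n + 1, 2**(2 * n - 1) - 1]

def to_residues(x, moduli):
    """Forward conversion: integer -> residue tuple."""
    return [x % m for m in moduli]

def crt_reverse(residues, moduli):
    """Textbook CRT: X = sum(r_i * M_i * (M_i^-1 mod m_i)) mod M."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # modular inverse (Python 3.8+)
    return total % M

moduli = moduli_set(4)                      # [16, 3, 5, 17, 127]
x = 123456
assert crt_reverse(to_residues(x, moduli), moduli) == x
```

The large modulo-M reduction in the last line of `crt_reverse` is the costly step that hardware-oriented formulations such as New CRT-I and MRC aim to shrink.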
Reconfigurable CORDIC Low-Power Implementation of Complex Signal Processing f... (Editor IJMTER)
This document describes a proposed low-power CORDIC-based DCT architecture that prioritizes processing of low-frequency DCT coefficients over high-frequency coefficients to reduce power consumption with minimal image quality degradation. It uses a look-ahead CORDIC approach to allow varying the number of CORDIC iterations for different coefficients. Experimental results show the proposed architecture achieves 38.1% area and power savings compared to DA-based DCT, with comparable power to MCM-based DCT but using 100% less area and a minor 0.04dB quality loss.
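The trade-off between iteration count and accuracy that this summary relies on can be seen in a plain software model of CORDIC rotation. The sketch below is the conventional iterative algorithm in floating point, not the look-ahead variant the paper proposes; `cordic_rotate` and its parameters are illustrative names.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians with `iterations` CORDIC micro-rotations.

    Each step uses only a scaling by 2^-i (a shift in hardware) and additions.
    Converges for |angle| up to roughly 1.74 rad.
    """
    # Accumulated gain of the micro-rotations, removed at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0           # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x / gain, y / gain

# Fewer iterations give a coarser result, more iterations a finer one.
coarse = cordic_rotate(1.0, 0.0, math.pi / 6, iterations=8)
fine = cordic_rotate(1.0, 0.0, math.pi / 6, iterations=24)
```

Spending fewer iterations on high-frequency coefficients, as the summary describes, corresponds to calling the routine with a smaller `iterations` value for those outputs.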
This document describes a proposed VLSI implementation of a high-speed DCT architecture for H.264 video codec design. It presents a Booth radix-8 multiplier-based multiply-accumulate (MAC) unit to improve throughput and minimize area complexity for 8x8 2D DCT computation. The proposed MAC architecture achieves a maximum operating frequency of 129.18MHz while reducing area by 64% compared to a regular merged MAC unit with a conventional multiplier. FPGA implementation and performance analysis demonstrate the suitability of the proposed DCT architecture for applications in HDTV systems.
Efficient Implementation of Low Power 2-D DCT Architecture (IJMER)
International Journal of Modern Engineering Research (IJMER) is a peer-reviewed online journal. It serves as an international archival forum for scholarly research related to engineering and science education.
Design of Continuous Time Multibit Sigma Delta ADC for Next Generation Wirele... (IJERA Editor)
This paper presents the design of a CT ΣΔ modulator that can provide high DR and SNR over a 20 MHz signal bandwidth. So far, CT SDMs have used either a feedback or a feedforward loop filter architecture. The proposed topology is a 3rd-order low-pass sigma-delta modulator that employs a combination of feedforward and feedback schemes. The loop filter is built from RC integrators because of their high linearity and easy interfacing. The design starts at the system level using Matlab/Simulink. The first integrator in the loop, which is the most critical block in the modulator, is then implemented at the transistor level using Cadence Virtuoso in 180 nm CMOS technology.
“Implimentation of SD Processor Based On CRDC Algorithm” (inventionjournals)
In Digital Signal Processing (DSP) there are many complex algorithms that require an efficient hardware implementation for real-time applications. One such algorithm is Singular-value Decomposition (SD), which has applications in varied domains of signal processing such as direction estimation, spectrum analysis and system identification. It generalizes the eigen-decomposition to non-square matrices and is therefore of great importance, particularly for subspace-based algorithms in signal processing. However, SD is a very complicated algorithm, with computational complexity of about O(N^3) for an N x N square matrix. For real-time computation of such an algorithm, a parallel, direct-mapped hardware solution is desirable.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Dynamic Texture Coding using Modified Haar Wavelet with CUDA (IJERA Editor)
A texture is an image containing repeated patterns. There are two types: static and dynamic. A static texture is an image whose patterns repeat in the spatial domain. A dynamic texture is a sequence of frames whose patterns repeat in both the spatial and temporal domains. This paper introduces a novel method for dynamic texture coding that achieves a higher compression ratio using a 2D modified Haar wavelet transform. Dynamic texture video contains highly redundant parts in the spatial and temporal domains, and these can be removed to achieve high compression ratios with better visual quality. The modified Haar wavelet is used to exploit spatial and temporal correlations among the pixels. The YCbCr color model is used to exploit the chromatic components, since the human visual system (HVS) is less sensitive to chrominance. To reduce the running time of the algorithm, it is parallelized using CUDA (Compute Unified Device Architecture). A GPU contains many more cores than a CPU, and these are used to reduce the running time.
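For orientation, one level of the standard Haar transform can be written in a few lines. The sketch below uses the plain averaging/differencing form, not the modified Haar wavelet of the paper (whose details the summary does not give), and runs on the CPU rather than with CUDA; it assumes even-length rows and columns.

```python
def haar_1d(signal):
    """One Haar level: pairwise averages first, then pairwise half-differences."""
    avg = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    diff = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return avg + diff

def haar_2d(image):
    """One 2-D Haar level: transform every row, then every column."""
    rows = [haar_1d(r) for r in image]
    cols = [haar_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]     # transpose back to row-major

# Smooth data leaves small detail coefficients, which is what makes it compressible.
print(haar_1d([4, 6, 10, 12]))   # [5.0, 11.0, -1.0, -1.0]
```

Every output pair depends on only two inputs, so the row and column passes are independent per pair, which is the kind of data parallelism a GPU implementation can exploit.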
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
POWER GATING STRUCTURE FOR REVERSIBLE PROGRAMMABLE LOGIC ARRAY (ecij)
This document summarizes a research paper that proposes a power gating structure using sleep transistors to reduce subthreshold leakage in a reversible programmable logic array (RPLA). It begins by introducing the concept of reversible logic for reducing power dissipation at the gate level. However, physical implementation with CMOS technology still leads to leakage during inactive periods. The paper then discusses power gating and sleep transistors as a technique to reduce leakage. It proposes a design for an RPLA using reversible AND and OR arrays with sleep transistors in a footer configuration to switch between active and sleep modes. Simulation results show 40.8% energy savings compared to a conventional CMOS design.
High Speed and Time Efficient 1-D DWT on Xilinx Virtex4 DWT Using 9/7 Filter ... (IOSR Journals)
This document describes a new efficient distributed arithmetic (NEDA) technique for implementing a high speed 1-D discrete wavelet transform (DWT) using a 9/7 filter on a Xilinx Virtex4 FPGA. The key aspects of the NEDA technique are that it uses adders as the main component and does not require multipliers, subtraction, or ROM. Simulation results show the proposed NEDA architecture requires fewer logic resources and has a shorter maximum path delay compared to existing distributed arithmetic techniques.
Design Analysis of Delay Register with PTL Logic using 90 nm Technology (IJEEE)
This paper presents a low-area, power-efficient delay register using CMOS transistors. The proposed register has a smaller area than the conventional register. The register design consists of 6 NMOS and 6 PMOS transistors. It has been designed in a logic editor and simulated using 90 nm technology. Layout simulation and parametric analysis have also been carried out. The register has been designed using both fully automatic layout design and semicustom layout design, and the performance of these designs has been analyzed and compared in terms of power, delay and area. The simulation results show that the delay register designed using PTL techniques improves power by 0.05% and area by 61.8%.
IJERA (International Journal of Engineering Research and Applications) is an international online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
This paper introduces two architectures for modulo 2^n+1 adders. The first architecture is based on a sparse carry computation unit that computes only some carries, enabled by a new inverted circular idempotency property. This sparse approach reduces area and power compared to prior proposals while maintaining speed. The second architecture unifies the design of modulo 2^n+1 and 2^n-1 adders by showing 2^n+1 addition can be treated as a subcase of 2^n-1 addition with minor extra logic.
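The close relationship between the two moduli can be illustrated with behavioral reference models. The sketch below assumes the common end-around-carry formulation for modulo 2^n-1 and the diminished-1 representation (a value A stored as A-1) for modulo 2^n+1, where the carry is fed back inverted; it says nothing about the sparse carry unit itself, and zero operands or zero results, which diminished-1 designs handle separately, are out of scope.

```python
def add_mod_2n_minus_1(a, b, n):
    """End-around-carry addition modulo 2^n - 1 (operands in 0 .. 2^n - 2)."""
    mask = (1 << n) - 1
    s = a + b
    s = (s & mask) + (s >> n)        # feed the carry-out back into the LSB
    return 0 if s == mask else s     # all-ones is the second encoding of zero

def add_mod_2n_plus_1_dim1(a_d, b_d, n):
    """Diminished-1 addition modulo 2^n + 1 (inputs and output are value - 1).

    Same structure as above, but the carry-out is inverted before it is fed back.
    Assumes nonzero operands and a nonzero result.
    """
    mask = (1 << n) - 1
    s = a_d + b_d
    carry = s >> n
    return (s + (1 - carry)) & mask

# Example with n = 4 (modulus 17): 9 + 12 = 21 = 4 (mod 17).
assert add_mod_2n_plus_1_dim1(9 - 1, 12 - 1, 4) + 1 == 4
```

The only difference between the two routines is the polarity of the fed-back carry, which gives some intuition for why one adder structure can serve both moduli.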
International Journal of Engineering and Science Invention (IJESI) (inventionjournals)
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of Engineering Science and Technology, including new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
IRJET- Low Power Adder and Multiplier Circuits Design Optimization in VLSI (IRJET Journal)
The document describes a proposed design for a low power 4-bit multiplier circuit using a hybrid full adder design with both pass-transistor logic and CMOS technology. The hybrid full adder uses 9 transistors compared to 12 in previous designs, reducing area and power. A faster Dadda algorithm is used to partition the partial product matrix into two parts that are reduced in parallel to two rows each using 3-bit and 2-bit counters, then combined with a carry look-ahead adder to form the final product. The proposed design aims to reduce propagation delay, power dissipation, and improve performance compared to previous multiplier circuit designs.
Fpga based low power and high performance address generator for wimax deinter... (eSAT Journals)
Abstract
The main aim of this project is to implement the address generation circuitry of the deinterleaver used in the WiMAX transceiver on a Xilinx Field Programmable Gate Array (FPGA). The floor function required by the IEEE 802.16e standard is very difficult to implement on an FPGA, so we eliminate the need for it by using a simple mathematical algorithm. Support for modulations such as QPSK, 16-QAM and 64-QAM, along with their code rates, makes our approach novel and highly efficient.
Keywords: Modulation circuits, Deinterleaver/Interleaver circuit, Wireless systems
This document describes an FPGA-based address generator for the deinterleaver used in WiMAX systems. It proposes algorithms to generate addresses for the deinterleaver that support different modulation schemes like QPSK, 16-QAM, and 64-QAM without using a floor function. The algorithms are implemented using VHDL on a Xilinx FPGA. Simulation results show the address generation for different modulation types matches the output of a MATLAB program. The FPGA implementation provides better performance and resource utilization than a conventional LUT-based approach.
This document describes the design of different types of parallel multipliers using low power techniques on a 0.18um technology node. It discusses Braun multipliers, row-bypassing multipliers, and column-bypassing multipliers. The multipliers are implemented using both conventional methods and the Gate-Diffusion-Input (GDI) technique. Simulation results show that implementing the multipliers using GDI reduces transistor counts and power consumption compared to conventional implementations. The 4x4 Braun multiplier implemented with GDI uses 136 transistors and consumes 3mW of power, providing significant improvements over the conventional implementation.
IMPLEMENTATION OF UNSIGNED MULTIPLIER USING MODIFIED CSLA (eeiej_journal)
This document discusses the implementation of an unsigned multiplier using a modified carry select adder technique. It begins with an introduction to digital arithmetic operations like multiplication and addition. It then describes the proposed system, which uses a modified carry select adder based multiplier to reduce area over a traditional carry look ahead adder based multiplier, while maintaining similar delay times. The document provides details on the design of regular and modified square root carry select adders used in the multiplier. It discusses how replacing ripple carry adders with binary to excess-1 converters in the modified design can further reduce area and power consumption.
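The BEC substitution described above can be modeled behaviorally: each block computes its sum once with carry-in 0, derives the carry-in 1 result by adding one to that sum (the job of the binary to excess-1 converter), and a multiplexer picks between them. The sketch below assumes uniform 4-bit blocks instead of square-root sizing and models function only, not area or delay; `csla_add` is an illustrative name.

```python
def csla_add(a, b, width=16, block=4):
    """Carry-select addition with a BEC in place of the second ripple adder.

    Returns (sum, carry_out) for unsigned `width`-bit operands.
    """
    mask = (1 << block) - 1
    carry, result = 0, 0
    for shift in range(0, width, block):
        xa = (a >> shift) & mask
        xb = (b >> shift) & mask
        s0 = xa + xb                     # ripple-carry result for carry-in = 0
        s1 = s0 + 1                      # BEC output: same result plus one
        chosen = s1 if carry else s0     # multiplexer driven by the real carry-in
        result |= (chosen & mask) << shift
        carry = chosen >> block          # carry-out selects in the next block
    return result, carry

assert csla_add(0xFFFF, 1) == (0, 1)
assert csla_add(1234, 4321) == (5555, 0)
```

Because `s1` is derived from `s0` rather than computed by a second adder, the model mirrors where the area saving in the modified design comes from.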
International Journal of Computational Engineering Research (IJCER) (ijceronline)
This document summarizes a research paper on designing a high-speed arithmetic architecture for a parallel multiplier-accumulator based on the radix-2 modified Booth algorithm. It presents the design of a modified Booth multiplier using a carry lookahead adder for high accuracy with a fixed width. It also proposes a high-speed MAC architecture that improves performance by eliminating the accumulator and modifying the carry-save adder to add lower bits in advance, reducing the number of inputs to the final adder. The proposed CSA architecture can efficiently implement the operations of the new MAC arithmetic.
An approach to design Flash Analog to Digital Converter for High Speed and Lo... (VLSICS Design)
This paper proposes a Flash ADC design using a Quantized Differential Comparator and a fat tree encoder. The approach uses a systematically incorporated input offset voltage in a differential amplifier to quantize the reference voltages needed by Flash ADC architectures, eliminating the need for a passive resistor array. This allows very small voltage comparisons and complete elimination of the resistor ladder circuit. The thermometer-code-to-binary-code encoder has become the bottleneck of ultra-high-speed flash ADCs, so this paper uses a fat tree thermometer-code-to-binary-code encoder. The simulation and implementation results show that the fat tree encoder outperforms the commonly used ROM encoder in terms of speed and power for the 6-bit CMOS flash ADC case. Speed improves by almost a factor of 2 with the fat tree encoder, which demonstrates that it is an effective solution to the bottleneck problem in ultra-high-speed ADCs. The design has been carried out in 0.18 µm technology using the CADENCE tool.
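A behavioral model of thermometer-to-binary encoding shows what the encoder has to compute. The sketch below assumes the two-stage structure usually associated with fat tree encoders (thermometer code to one-hot code, then OR-ing of the one-hot lines per output bit); it is a functional model only and does not capture the tree layout or timing of the paper's circuit.

```python
def thermometer_to_binary(therm):
    """Encode comparator outputs (lowest threshold first) as an integer.

    Example: [1, 1, 1, 0, 0, 0, 0] -> 3.
    """
    n = len(therm)
    # Stage 1: one-hot code marks the highest comparator that is still 1.
    one_hot = [therm[i] & ((1 - therm[i + 1]) if i + 1 < n else 1)
               for i in range(n)]
    # Stage 2: each output bit ORs the one-hot lines whose level has that bit set.
    value = 0
    for b in range(n.bit_length()):
        bit = 0
        for i, h in enumerate(one_hot):
            if ((i + 1) >> b) & 1:
                bit |= h
        value |= bit << b
    return value

assert thermometer_to_binary([1, 1, 1, 0, 0, 0, 0]) == 3
```

A 6-bit flash ADC, as in the summary, would feed 63 comparator outputs into the same routine.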
“FIELD PROGRAMMABLE DSP ARRAYS” - A NOVEL RECONFIGURABLE ARCHITECTURE FOR EFF... (sipij)
Digital Signal Processing functions are widely used in real-time, high-speed applications. These functions are generally implemented either on ASICs, which are inflexible, or on FPGAs, which suffer from a relatively low utilization factor or lower speed compared to ASICs. The proposed reconfigurable DSP processor resembles an FPGA, but uses basic fixed Common Modules (CMs) such as adders, subtractors, multipliers, scaling units and shifters instead of CLBs. This paper introduces the development of a reconfigurable DSP processor that integrates different filter and transform functions. Switching between DSP functions is done by reconfiguring the interconnections between CMs. The proposed reconfigurable architecture has been validated on a Virtex5 FPGA. The architecture provides a sufficient amount of flexibility, parallelism and scalability.
This document summarizes a study on removing fluoride from groundwater using an electrocoagulation (EC) process. The researchers conducted experiments with both a simulated control sample and a groundwater sample from India. They tested various operating parameters of the EC system including current density, flow rate, number of treatment stages, and residual aluminum levels. The results showed that a double stage treatment system achieved higher fluoride removal than a single stage system. Residual aluminum levels were low, indicating EC provides better water quality than other defluoridation methods. Current density and flow rate affected defluoridation efficiency, with higher current density generally improving removal up to a point.
This document summarizes a study on the effect of adding nanoclay particles on the machinability of aluminum metal matrix composites (MMCs). Composites were made with 2%, 4%, and 6% nanoclay by weight. Machinability tests were conducted by turning the composites and measuring tangential cutting forces. The results showed that cutting forces increased with higher cutting speed, feed rate, and depth of cut. Composites with more nanoclay produced more chips per gram removed, indicating better chip breaking ability. Overall, the nanoclay particles improved the machinability of the MMCs by facilitating chip fracture.
This document summarizes a research paper that proposes efficient and robust shape signatures for object recognition in content-based image retrieval systems. It presents several shape descriptors including scalar descriptors, simple boundary functions, and a shape signature using level set methods. Experiments were conducted on a database of 16 classes with 20 images each. The proposed shape signatures improved retrieval rates from an initial 60% to over 76% precision on average when retrieving the top 20 images. The signatures demonstrated effectiveness for both image retrieval and recognition while maintaining simplicity in implementation.
This document summarizes a study on the effect of copper corrosion passivators on paper insulated copper conductors. Samples with and without the passivator Irgamet 39 were aged in transformer oil at 140°C for 1400 hours and then analyzed using XRD, SEM, and EDX. Analysis of samples without passivator showed formation of copper sulfide on the copper surface and paper insulation. Samples with passivator showed a thin protective film formed on the copper surface, significantly reducing copper sulfide formation. The passivator was found to effectively inhibit the corrosion of copper in the presence of sulfur in transformer oil.
An approach to design Flash Analog to Digital Converter for High Speed and Lo...VLSICS Design
This paper proposes a Flash ADC design using a Quantized Differential Comparator and a fat tree encoder. This approach explores the use of a systematically incorporated input offset voltage in a differential amplifier for quantizing the reference voltages necessary for Flash ADC architectures, thereby eliminating the need for a passive resistor array. It allows very small voltage comparisons and complete elimination of the resistor ladder circuit. The thermometer code-to-binary code encoder has become the bottleneck of ultra-high speed flash ADCs. In this paper, the fat tree thermometer code-to-binary code encoder is used for ultra-high speed flash ADCs. The simulation and implementation results show that the fat tree encoder outperforms the commonly used ROM encoder in terms of speed and power for the 6 bit CMOS flash ADC case. The speed is improved by almost a factor of 2 when using the fat tree encoder, which demonstrates that the fat tree encoder is an effective solution for the bottleneck problem in ultra-high speed ADCs. The design has been carried out for the 0.18um technology using the CADENCE tool.
“FIELD PROGRAMMABLE DSP ARRAYS” - A NOVEL RECONFIGURABLE ARCHITECTURE FOR EFF...sipij
Digital Signal Processing functions are widely used in real time high speed applications. Those functions
are generally implemented either on ASICs with inflexibility, or on FPGAs with bottlenecks of relatively
smaller utilization factor or lower speed compared to ASIC. The proposed reconfigurable DSP processor is
redolent to FPGA, but with basic fixed Common Modules (CMs) (like adders, subtractors, multipliers,
scaling units, shifters) instead of CLBs. This paper introduces the development of a reconfigurable DSP
processor that integrates different filter and transform functions. Switching between DSP functions
occurs by reconfiguring the interconnection between CMs. Validation of the proposed reconfigurable
architecture has been achieved on Virtex5 FPGA. The architecture provides sufficient amount of flexibility,
parallelism and scalability.
This document summarizes a study on removing fluoride from groundwater using an electrocoagulation (EC) process. The researchers conducted experiments with both a simulated control sample and a groundwater sample from India. They tested various operating parameters of the EC system including current density, flow rate, number of treatment stages, and residual aluminum levels. The results showed that a double stage treatment system achieved higher fluoride removal than a single stage system. Residual aluminum levels were low, indicating EC provides better water quality than other defluoridation methods. Current density and flow rate affected defluoridation efficiency, with higher current density generally improving removal up to a point.
This document summarizes a study on the effect of adding nanoclay particles on the machinability of aluminum metal matrix composites (MMCs). Composites were made with 2%, 4%, and 6% nanoclay by weight. Machinability tests were conducted by turning the composites and measuring tangential cutting forces. The results showed that cutting forces increased with higher cutting speed, feed rate, and depth of cut. Composites with more nanoclay produced more chips per gram removed, indicating better chip breaking ability. Overall, the nanoclay particles improved the machinability of the MMCs by facilitating chip fracture.
This document summarizes a research paper that proposes efficient and robust shape signatures for object recognition in content-based image retrieval systems. It presents several shape descriptors including scalar descriptors, simple boundary functions, and a shape signature using level set methods. Experiments were conducted on a database of 16 classes with 20 images each. The proposed shape signatures improved retrieval rates from an initial 60% to over 76% precision on average when retrieving the top 20 images. The signatures demonstrated effectiveness for both image retrieval and recognition while maintaining simplicity in implementation.
This document summarizes a study on the effect of copper corrosion passivators on paper insulated copper conductors. Samples with and without the passivator Irgamet 39 were aged in transformer oil at 140°C for 1400 hours and then analyzed using XRD, SEM, and EDX. Analysis of samples without passivator showed formation of copper sulfide on the copper surface and paper insulation. Samples with passivator showed a thin protective film formed on the copper surface, significantly reducing copper sulfide formation. The passivator was found to effectively inhibit the corrosion of copper in the presence of sulfur in transformer oil.
The document discusses the importance of paper recycling for the environment and society. It explains that recycling paper saves trees, generates income, and conserves natural resources. It also lists which materials can and cannot be recycled and the benefits of selective paper collection for reducing pollution and extending the useful life of landfills.
Web 2.0 represents the evolution of traditional applications toward applications focused on the end user. It is not a technology in itself, but an attitude. It includes applications for creating and publishing content such as blogs and wikis, applications for sharing information such as YouTube and Flickr, and applications for accessing up-to-date information such as RSS and search engines.
The document provides an overview of cloud computing, discussing its main models (IaaS, PaaS, SaaS), structures (public and private networks), and advantages such as agility, cost reduction, and faster innovation. However, it also highlights challenges such as ensuring data security and privacy and dealing with variations in costs.
I encontro de educadores da carreira assistênciaJeovany Anjos
The document describes the I Encontro de Educadores da Carreira Assistência à Educação (First Meeting of Educators of the Education Assistance Career) of the DRE of Santa Maria. It presents the definition of the DRE, its organizational chart, and the responsibilities of each sector, such as the Secretariat, Administrative Assistance, Pedagogical Assistance, and the units for Human Resources, Records, Finance, School Support, School Sports, Pedagogical Monitoring, Materials, and Planning and Control.
The document introduces the subject of philosophy in an unconventional way. It argues that philosophy is everywhere and that even animals practice a philosophy of life through their interactions and natural rhythms. It urges the reader to observe nature and learn from it, since animals live in a more balanced and selfless way than human beings. Philosophy, according to the document, is something natural that constantly surrounds us through our everyday actions
The document discusses sustainable development and its enemies, such as pollution and deforestation. It explains that pollution can occur in the atmosphere, in soil, in water, and as noise. It then goes into detail on how pollution specifically affects water and soil and how it takes the form of noise, concluding with an invitation to draw up a concept map of what has been learned.
1. The document presents a booklet on environmental licensing produced by the Tribunal de Contas da União in partnership with the Instituto Brasileiro do Meio Ambiente e dos Recursos Naturais Renováveis.
2. The booklet explains concepts, procedures, and legislation related to environmental licensing in Brazil, aiming to inform and guide entrepreneurs, public agencies, and interested parties on the subject.
3. The second edition of the booklet brings legislative and case-law updates on licensing
The document makes predictions about the cars of the future, anticipating that by 2025-30 most vehicles in the United States will be electric. It also briefly discusses the history of car tuning and presents some classic Japanese racing models.
This document promotes a Dell computer, encouraging the reader to enter the world of technology with this machine to browse the Internet and carry out tasks, and warning them not to be left behind.
This document contains two registration forms for the 2010 Jogos Escolares de Santa Maria. The first is for regular registration and lists several sports. The second is for special (adapted) registration and lists some sports for students with special needs. Both forms request information about the institution and a signature agreeing to the terms of participation.
This document describes the characteristics and benefits of wikis. Wikis allow web pages to be created collaboratively in a fast and flexible way. Because they are created and edited jointly by teachers and students, wikis foster active and lasting learning in which everyone contributes to knowledge.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
The document discusses efficient VLSI implementations of image encryption using minimal operations. It proposes using discrete cosine transform (DCT) for image compression and encryption simultaneously. For encryption, a linear feedback shift register generates random numbers added to some DCT outputs. The DCT algorithm and arithmetic operators are optimized to reduce operations and increase throughput. Simulation results show encryption in the frequency domain at 656 million samples per second on an 82 MHz clock.
FPGA Implementation of 2-D DCT & DWT Engines for Vision Based Tracking of Dyn...IJERA Editor
Real time motion estimation for tracking is a challenging task. Several techniques can transform an image into frequency domain, such as DCT, DFT and wavelet transform. Direct implementation of 2-D DCT takes N^4 multiplications for an N x N image which is impractical. The proposed architecture for implementation of 2-D DCT uses look up tables. They are used to store pre-computed vector products that completely eliminate the multiplier. This makes the architecture highly time efficient, and the routing delay and power consumption is also reduced significantly. Another approach, 2-D discrete wavelet transform based motion estimation (DWT-ME) provides substantial improvements in quality and area. The proposed architecture uses Haar wavelet transform for motion estimation. In this paper, we present the comparison of the performance of discrete cosine transform, discrete wavelet transform for implementation in motion estimation.
Implementation of resource sharing strategy for power optimization in embedde...Alexander Decker
This document discusses the implementation of a resource sharing strategy to optimize power in embedded processors. The strategy is implemented at the hardware level in the decode stage of a 32-bit RISC processor with a 4-stage pipeline. By redefining some instructions to share common resources like adders and decoders, unnecessary switching activity is reduced, lowering dynamic power consumption. Power analysis shows the modified design consumes 3mW less power, a 2.65% improvement, across different clock frequencies compared to the original design. The proposed strategy successfully optimizes power through hardware-level resource sharing.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
This document presents a design for a low power 4-bit binary coded decimal (BCD) adder using a 14-transistor full adder circuit. It begins with background on BCD adders and issues with conventional designs. It then evaluates several existing full adder circuit designs before presenting a novel 14-transistor design with good driving capability. A 4-bit BCD adder is built using this full adder and simulated in 50nm technology. Results show the proposed design reduces power consumption to 0.03μW compared to conventional designs, with delay also reduced to 8ps. In conclusion, using a 14-transistor full adder improves performance metrics of power and delay for BCD addition
Design of High Speed and Low Power Veterbi Decoder for Trellis Coded Modulati...ijsrd.com
It is well known that the Viterbi decoder (VD) is the dominant module determining the overall power consumption of TCM decoders. High-speed, low-power design of Viterbi decoders for trellis coded modulation (TCM) systems is presented in this paper. We propose a pre-computation architecture incorporated with the T-algorithm for the VD, which can effectively reduce the power consumption without degrading the decoding speed much. A general solution to derive the optimal pre-computation steps is also given in the paper. The implementation result of a VD for a rate-3/4 convolutional code used in a TCM system shows that, compared with the full trellis VD, the pre-computation architecture reduces the power consumption by as much as 70% without performance loss, while the degradation in clock speed is negligible.
Modified Distributive Arithmetic Based DWT-IDWT Processor Design and FPGA Imp...IOSR Journals
1) The document describes a modified distributive arithmetic based discrete wavelet transform (DWT) processor architecture and its FPGA implementation for image compression.
2) The proposed architecture uses four lookup tables to store pre-computed partial products of filter coefficients, achieving a latency of 44 clock cycles and throughput of 4 clock cycles.
3) A software reference model is developed in Matlab to analyze the performance of various wavelets for image compression using the distributive arithmetic based DWT approach. The input image is resized and decomposed into sub-bands using DWT and reconstructed using IDWT.
HIGH-SPEED LOW-POWER VITERBI DECODER DESIGN FOR TCM DECODERSLalitha Gosukonda
This document presents a design for a high-speed low-power Viterbi decoder for trellis coded modulation decoders. It proposes a precomputation architecture incorporated with the T-algorithm to reduce power consumption without significantly degrading decoding speed. The architecture calculates branch metric minimum values in advance and compares them to path metrics to eliminate unlikely paths early. Implementation in Verilog and synthesis results show the proposed architecture operates at a lower supply voltage for moderate throughput applications, achieving quadratic power reduction over conventional decoders.
Design and testing of systolic array multiplier using fault injecting schemesCSITiaesprime
Nowadays low power design circuits are major important for data transmission and processing the information among various system designs. One of the major multipliers used for synchronizing the data transmission is the systolic array multiplier, low power designs are mostly used for increasing the performance and reducing the hardware complexity. Among all the mathematical operations, multiplier plays a major role where it processes more information and with the high complexity of circuit in the existing irreversible design. We develop a systolic array multiplier using reversible gates for low power appliances, faults and coverage of the reversible logic are calculated in this paper. To improvise more, we introduced a reversible logic gate and tested the reversible systolic array multiplier using the fault injection method of built-in self-test block observer (BILBO) in which all corner cases are covered which shows 97% coverage compared with existing designs. Finally, Xilinx ISE 14.7 was used for synthesis and simulation results and compared parameters with existing designs which prove more efficiency.
Adiabatic technique based low power synchronous counter designIJECEIAES
The performance of integrated circuits is evaluated by their design architecture, which ensures high reliability and optimizes energy. The majority of the system-level architectures consist of sequential circuits. Counters are fundamental blocks in numerous very large-scale integration (VLSI) applications. The T-flip-flop is an important block in synchronous counters, and its high-power consumption impacts the overall effectiveness of the system. This paper calculates the power dissipation (PD), power delay product (PDP), and latency of the presented T flip-flop. To create a 2-bit synchronous counter based on the novel T flip-flops, a performance matrix such as PD, latency, and PDP is analyzed. The analysis is carried out at 100 and 10 MHz frequencies with varying temperatures and operating voltages. It is observed that the presented counter design has a lesser power requirement and PDP compared to the existing counter architectures. The proposed T-flip-flop design at the 45 nm technology node shows an improvement of 30%, 76%, and 85% in latency, PD, and PDP respectively to the 180 nm node at 10 MHz frequency. Similarly, the proposed counter at the 45 nm technology node shows 96% and 97% improvement in power dissipation, delay, and PDP respectively compared to the 180 nm at 10 MHz frequency.
High Speed Low-Power Viterbi Decoder Using Trellis Code ModulationMangaiK4
Abstract - High speed low power viterbi decoders for trellis code modulation is well known for the delay consumption in underwater communication. In transmission system wireless communication is the transfer of information between two or more points that are not connected by an electrical conductor. WiMAX is the wireless communication standard designed to provide 30 to 40 Mega bits per second data rates. WiMAX as a standards based technology enabling the delivery of last mile wireless broadband access as an alternative to cable and DSL. WiMAX can provide at home or mobile internet access across whole cities or countries. The address generation of WiMAX is carried out by interleaver and deinterleaver. Interleaving is used to overcome correlated channel noise such as burst error or fading. The interleaver/deinterleaver rearranges input data such that consecutive data are spaced apart. The interleaved memory is to improve the speed of access to memory. The viterbi technique reduces the bit error rate and delay using wimax.
IRJET- Review Paper on Radix-2 DIT Fast Fourier Transform using Reversible GateIRJET Journal
1) The document discusses a review paper on implementing a radix-2 Decimation-In-Time (DIT) Fast Fourier Transform (FFT) using reversible DKG gates.
2) The proposed design uses a 4x4 reversible DKG gate that functions as both an adder and subtractor.
3) The design is synthesized using Xilinx ISE software and simulated using VHDL test benches to evaluate performance.
An Area Efficient and High Speed Reversible Multiplier Using NS GateIJERA Editor
A major problem in digital computer systems is power dissipation, which has motivated research on methods to reduce it while keeping designs area efficient and fast. This is the main reason for the emergence of reversible computing systems for digital computers and designs. Reversible computing is the path to future computing technologies, which all happen to use reversible logic. In addition, reversible computing will become mandatory because of the necessity to decrease power consumption. Reversible logic circuits have the same number of inputs and outputs, and have a one-to-one mapping between vectors of inputs and outputs; thus the vector of input states can always be reconstructed from the vector of output states. Consequently, a computation is reversible if it is always possible to uniquely recover the input, given the output. Each gate can be made reversible by adding some additional input and output wires if necessary. The main aims of reversible computing are lower power dissipation, area efficiency and high speed, along with other advantages such as data security and error prevention. Reversible logic has many applications, including low power CMOS, nanotechnology, DNA computing and quantum computing. This study focuses on two primary design implementations: the first is a reversible gate design and the second is a multiplier design using reversible gates. In this manuscript we have implemented an 8 * 8 reversible design called “NSG(Non linear Sign Flip)”. The whole project is implemented in Xilinx 14.7 ISE with the Spartan 3E family.
Highly Parallel Pipelined VLSI Implementation of Lifting Based 2D Discrete Wa...idescitation
The lifting scheme based Discrete Wavelet
Transform is a powerful tool for image processing
applications. The lack of disk space during transmission and
storage of images pushes the demand for high speed
implementation of efficient compression technique. This paper
proposes a highly pipelined and distributed VLSI architecture
of lifting based 2D DWT with lifting coefficients represented
in fixed point [2:14] format. Compared to conventional
architectures [11], [13]-[16], the proposed highly pipelined
architecture optimizes the design which increases
significantly the performance speed. The design raises the
operating frequency, at the expense of more hardware area.
In this paper, initially a software model of the proposed design
was developed using MATLAB ®. Corresponding to this
software model, an efficient highly parallel pipelined
architecture was designed and developed in Verilog HDL
and implemented in VIRTEX ® 6 (XC6VHX380T)
FPGA. Also the design was synthesized on TSMC 0.18μm
ASIC Library by using Synopsys Design Compiler. The entire
system is suitable for several real time applications.
A continuous time adc and digital signal processing system for smart dust and...eSAT Journals
This document discusses a continuous-time (CT) analog-to-digital converter (ADC) and digital signal processing system suitable for applications like smart dust and wireless sensor networks. The key benefits of the CT system are lower noise, no need for a clock generator or anti-aliasing filter.
The paper proposes a clockless, event-driven CTADC based on delta modulation. An unbuffered, area-efficient segmented resistor string digital-to-analog converter is used. This architecture achieves an 87.5% reduction in resistors, switches and flip-flops for an 8-bit converter compared to prior designs.
The CTADC uses a level-crossing sampling technique where samples are generated when
A continuous time adc and digital signal processing system for smart dust and...eSAT Publishing House
Hv2514131415
1. Gulnar Perveen / International Journal of Engineering Research and Applications (IJERA)
ISSN: 2248-9622 www.ijera.com Vol. 2, Issue5, September- October 2012, pp.1413-1415
Low Power DCT Implementation In An Image Compression
System
Gulnar Perveen
Solar Energy Centre, Ministry of New & Renewable Energy, CGO Complex, New Delhi, India
Abstract
Low power is one of the most important challenges in maximizing battery life and saving energy in many signal-processing system designs, particularly in multimedia cellular applications and multimedia system-on-chip design. The 2-D DCT is a commonly used frequency transformation in compression algorithms. In this paper, an efficient Baseline 2-D DCT Architecture is compared with the Row/Column Approach, a Distributed Arithmetic Architecture, and a Fully Pipelined DCT Architecture for wireless image compression systems. It is observed that the Row/Column DA DCT Architecture provides a power saving of 24.4% and the Fully Pipelined Architecture provides a power saving of 16.4% compared to the 2-D DCT Baseline Architecture. The speed is also measured, and it is observed that the Fully Pipelined Architecture exploits the principles of pipelining and parallelism to obtain a throughput of 4.703 GHz.
Keywords-DCT (Discrete Cosine Transform), 2-D (Two-Dimensional), DA (Distributed Arithmetic).
I. ROW/COLUMN ARCHITECTURE
In this approach an 8-point 1-D DCT is applied to each of the 8 rows, and then again to each of the 8 columns. The 1-D algorithm applied to the rows and to the columns is the same. Therefore, identical pieces of hardware can be used for the row computation as well as the column computation.
The bulk of the design and computation is in the 8-point 1-D DCT block, which can potentially be reused 16 times: 8 times for the rows and 8 times for the columns. Therefore, a fast algorithm for computing the 1-D DCT is usually selected. The DCT core processor implements a Row-Column Distributed Arithmetic fast DCT algorithm enhanced with the activity reduction methods. The Row/Column Architecture has two 1-D DCT units connected through a transposition matrix memory.
The design is synchronous, with a single positive clock edge and no internal tri-state buffers. It has RAM for storing the product results after the first DCT stage for maximized performance. This way both 1-D DCT units can work in parallel. This architecture takes 8-bit input data and produces 12-bit output using 12-bit DCT matrix coefficients [4].
Figure 1. Row/Column Approach Distributed Arithmetic DCT Architecture, D0-D7: Registers.
In the above architecture, when the second DCT stage reads out data from transposition memory 1, the first DCT stage can populate the second transposition memory with new data. The 1-D DCT units use Distributed Arithmetic with butterfly computation to compute the DCT values. Because of the parallel DA, they need a considerable amount of ROM to compute one DCT value.
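The row/column decomposition used by this architecture can be illustrated in software. The following is a minimal sketch, assuming an orthonormal DCT-II and floating-point arithmetic rather than the paper's fixed-point hardware; the function names (`dct_1d`, `transpose`, `dct_2d_row_column`, `dct_2d_direct`) are hypothetical and chosen only for illustration. It applies the same 1-D routine to the rows, transposes (the role played by the transposition memory), and applies it again.

```python
import math

N = 8  # block size used in the paper


def dct_1d(x):
    """Orthonormal N-point DCT-II, direct form (no fast algorithm)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        acc = sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                  for n in range(N))
        out.append(scale * acc)
    return out


def transpose(block):
    """Software stand-in for the transposition memory: rows become columns."""
    return [list(col) for col in zip(*block)]


def dct_2d_row_column(block):
    """2-D DCT as 1-D DCT on rows, transpose, 1-D DCT again, transpose back."""
    stage1 = [dct_1d(row) for row in block]               # first 1-D DCT unit
    stage2 = [dct_1d(row) for row in transpose(stage1)]   # second 1-D DCT unit
    return transpose(stage2)


def dct_2d_direct(block):
    """Direct 2-D definition (nested sums), kept only as a cross-check."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    def basis(n, k):
        return math.cos((2 * n + 1) * k * math.pi / (2 * N))

    return [[c(u) * c(v) * sum(block[i][j] * basis(i, u) * basis(j, v)
                               for i in range(N) for j in range(N))
             for v in range(N)]
            for u in range(N)]
```

The sketch reuses one 1-D routine 16 times per block (8 rows, then 8 columns), which is the hardware-sharing argument made above; the direct form is included only to confirm that both paths agree.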
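The idea behind Distributed Arithmetic, replacing multipliers with ROM lookups and shift-accumulate steps, can be sketched as follows. This is an illustrative model only, not the paper's RAC design: the function names `build_rom` and `da_inner_product` are hypothetical, the coefficients are integers for exactness, and a 4-input ROM is assumed (as would result if a butterfly stage halves the 8 inputs).

```python
def build_rom(coeffs):
    """Precompute ROM[addr] = sum of coeffs[k] for every bit k set in addr."""
    rom = []
    for addr in range(1 << len(coeffs)):
        rom.append(sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1))
    return rom


def da_inner_product(rom, xs, width=8):
    """Bit-serial inner product of fixed coefficients with signed inputs.

    Each input must fit in `width`-bit two's complement. One ROM lookup is
    made per bit position; no multiplication by a coefficient is performed.
    """
    mask = (1 << width) - 1
    acc = 0
    for b in range(width):
        # Form the ROM address from bit b of every input word.
        addr = 0
        for k, x in enumerate(xs):
            addr |= (((x & mask) >> b) & 1) << k
        term = rom[addr] << b
        # The most significant bit carries negative weight in two's complement.
        acc += -term if b == width - 1 else term
    return acc
```

With 4 inputs the ROM holds 16 words; with 8 inputs it would hold 256, which is one way to see why a parallel DA implementation needs a considerable amount of ROM per output value.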
Figure 2. RAC unit of Row/Column Approach.

A design based on Distributed Arithmetic does not use any multipliers for computing the MAC (multiply and accumulate) operation; instead, it stores precomputed results in ROM. There is a buffer between the 1-D DCT stages, which is essentially a memory arbiter. The DCT architecture implements a Row-Column Distributed Arithmetic fast DCT algorithm enhanced with the activity reduction methods [4].

II. BASELINE 2-D DCT ARCHITECTURE
The Baseline 2-D DCT Architecture provides a reference design for the application of our low power techniques and for the computation of power savings. It is based on the Chen algorithm. This approach requires three steps: eight 1-D DCT/IDCTs along the rows, a memory transposition, and another eight 1-D DCT/IDCTs along the transposed columns.

Figure 3. Baseline 2-D DCT Architecture.

The block diagram of the Baseline Architecture shown above includes the controller, which enables input of the first row of data (DIN) through the ser2par unit under the SEN signal. It then activates the 1-D DCT unit, with the SEL and REN signals determining the data path. The first row of the transposition memory stores the results, and the process repeats for the remaining seven rows of the input block. Next, the ISEL and COLACK signals enable the 1-D DCT unit to receive its input data from the columns of the transposition memory. The final results of the column-wise 1-D DCT are available at the output [1].

III. FULLY PIPELINED ARCHITECTURE
In this architecture, a row output vector is computed using multipliers, multiplexers, accumulators, and registers. The elements of the input vector X are fed into the circuit one at a time. The 8 output elements are computed simultaneously and are shifted out serially. An input vector is multiplied by the coefficient matrix M’ to get the output. The second element of Y is computed through some additions and subtractions of the elements of X, and is then multiplied by the constant a. Permutation is done before the result goes into the accumulator.

Figure 4. Fully Pipelined 2-D DCT Architecture.

The circuit accepts one pixel per clock cycle, and the entire processing is performed as a linear pipe. When the left column of register set RS is filled with eight data elements, the entire column is copied onto the corresponding registers in the right column. A similar process occurs in each of the partitions simultaneously. The transpose buffer consists of an 8x8 array of register pairs. The data is input to the transpose buffer in row-wise fashion until all 64 registers are loaded. The data in those registers is then copied in parallel onto the corresponding adjacent registers, which are connected in column-wise fashion. While the data is being read out from the column registers, the row registers keep receiving further data from the DCT module. Thus, the output of the row-wise DCT computation is transposed for the column-wise DCT computation [2].

The Fully Pipelined 2-D DCT architecture
exploits the principles of pipelining and parallelism to the maximum extent so as to obtain high speed and throughput when compared with the Row/Column (DA) DCT approach and the Baseline 2-D DCT approach.

IV. SIMULATION RESULTS
TABLE I. Comparison of Speed & Power for DCT Architectures

Architecture                          Speed       Power consumption   Power saving
2D DCT Baseline Architecture          2.934 GHz   9 mW                ---
Row Column DA Approach                3.23 GHz    6.3 mW              24.4%
Fully Pipelined 2D-DCT Architecture   4.703 GHz   7.5 mW              16.6%

TABLE II. Performance Analysis of JPEG Compressor Units

Units                Logic cells   Memory bits   Frequency (MHz)
DCT 2D               3,749         1,528         50.485
Quantization         309           700           85.700
Zigzag Buffer        84            1,380         181.285
Run Length Encoder   510           3,901         78.500
JPEG Compressor      4,652         7,509         395.970

V. CONCLUSION
The comparison of power consumption and speed for the three DCT architectures, and the performance analysis of the JPEG compressor discussed above, were carried out using Verilog, with synthesis using Xilinx.
The Row Column (Distributed Arithmetic) DCT Architecture was selected for implementation after comparing its power savings against the Fully Pipelined Architecture and the Baseline 2-D DCT Architecture. This architecture results in the maximum power savings because it uses the butterfly operation, which results in minimum data path bit widths; since fewer flip-flops are needed between stages, power consumption is reduced.

REFERENCES
[1] Nathaniel J. August and Dong Sam Ha, “Low power Design of DCT and IDCT for low bit rate video codecs”, IEEE Transactions on Multimedia, vol. 6, No. 3, June 2004.
[2] Jim Li and Shih-Lien Lu, “Low Power Design of Two-Dimensional DCT”, IEEE Transactions on communication, vol. 14, No. 4, April 1996.
[3] F. Bensaali, A. Amira and A. Bouridane, “An efficient Architecture for color space conversion using Distributed Arithmetic”, The 10th IEEE International Conference on Electronics, Circuits & Systems (ICECS 2003), Sharjah, UAE, December 14-17, 2004.
[4] Thucydides Xanthopoulos and Anantha P. Chandrakasan, “Low power DCT core using Adaptive Bit width and Arithmetic Activity Exploiting signal correlations and Quantization”, IEEE Journal of Solid State Circuits, vol. 35, No. 5, May 2000.
[5] Jie Chen and K.J. Ray Liu, “Low-Power Architectures for Compressed Domain Video Coding Co-Processor”, IEEE transaction on circuits & systems, vol. 7, pp. 459-467, June 2000.
[6] L. Fanucci, S. Saponara, “Data Driven VLSI Computation for Low Power DCT-based Video Coding”, Proceedings of IEEE, vol. 83, No. 2, pp. 220-246, May 2002.
[7] Hyeonuk Jeong, Jinsang Kim, and Won-kyung Cho, “Low-Power Multiplier less DCT Architecture using Image Data Correlation”, IEEE Transactions on Consumer Electronics, Vol. 50, No. 1, February 2004.