The document describes a modified radix-2^4 single-data-path discrete Fourier transform (DFT) algorithm and architecture for implementing a 128-point inverse fast Fourier transform (IFFT) module in FPGA-based multiband orthogonal frequency division multiplexing (MB-OFDM) ultra-wideband (UWB) systems. The key aspects of the proposed design include:
1) A modified radix-2^4 single-data-flow (SDF) DFT algorithm that reorders the twiddle factor sequence so that complex multiplication is easier to implement than in earlier designs.
2) A single-data-path pipelined architecture that achieves the required 528 MHz processing speed for the U
Efficient FPGA implementation of high speed digital delay for wideband beamfor... (journalBEEI)
In this paper, the authors present an FPGA implementation of a digital delay for beamforming applications. The digital delay is based on a Parallel Farrow Filter. This architecture reaches a very high processing rate with wideband signals and is suitable for use with Time-Interleaved Analog-to-Digital Converters (TI-ADC). The proposed delay has been simulated in MATLAB, implemented on an FPGA, and characterized in terms of amplitude and phase response, maximum clock frequency, and area.
FPGA Implementation of Viterbi Decoder using Hybrid Trace Back and Register E... (ijsrd.com)
Error correction is an integral part of any communication system, and convolutional codes are widely used as forward error correction codes for this purpose. At the receiver, a Viterbi decoder is employed to decode the convolutional codes. The parameters of the Viterbi algorithm can be changed to suit a specific application. High speed and small area are two important design parameters in today's wireless technology. In this paper, a high-speed feed-forward Viterbi decoder is designed using a hybrid trace back and register exchange architecture and the embedded BRAM of the target FPGA. The proposed Viterbi decoder was designed with MATLAB, simulated with the Xilinx ISE 8.1i tool, synthesized with the Xilinx Synthesis Tool (XST), and implemented on a Xilinx Virtex-4 based xc4vlx15 FPGA device. The results show that the proposed design can operate at an estimated frequency of 107.7 MHz while consuming considerably fewer resources on the target device, providing a cost-effective solution for wireless applications.
An Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr Algorithm (IJERA Editor)
Error correcting codes are used to recover data from a signal corrupted by noise and interference. There are many error correcting codes. Among them, turbo codes are considered to be the best because they come very close to the Shannon theoretical limit. The MAP algorithm is commonly used in the turbo decoder. Among the different versions of the MAP algorithm, the Constant log BCJR algorithm has lower complexity and good error performance. The Constant log BCJR algorithm can be easily designed using a look-up table, which reduces memory consumption. The proposed Constant log BCJR decoder is designed to decode two blocks of data at a time, which increases the throughput. The complexity of the decoder is further reduced by the use of add-compare-select (ACS) units and registers. The proposed decoder is simulated using Xilinx ISE and synthesized for a Spartan-3 FPGA, and it is found that the Constant log BCJR decoder uses less memory and power than the LUT log BCJR decoder.
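The difference between the exact log-domain MAP operation and its constant-log approximation can be illustrated with a minimal sketch. The function names and the correction constant `c = 0.5` with threshold `t = 1.5` are assumptions chosen for illustration (values of this kind are commonly quoted for constant-log-MAP); the abstract does not give the parameters used in the paper.

```python
import math


def max_star_exact(a: float, b: float) -> float:
    """Exact Jacobian logarithm ln(e^a + e^b), as used by log-MAP."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_star_constant(a: float, b: float, c: float = 0.5, t: float = 1.5) -> float:
    """Constant-log approximation: add a fixed correction c only when
    the two metrics are closer than the threshold t (assumed values)."""
    return max(a, b) + (c if abs(a - b) < t else 0.0)


if __name__ == "__main__":
    for a, b in [(1.0, 1.0), (1.0, 1.2), (0.0, 5.0)]:
        print(a, b, max_star_exact(a, b), max_star_constant(a, b))
```

The approximation replaces the logarithm with a single comparison and a constant addition, which is what makes a small look-up or comparator structure sufficient in hardware.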
Survey of Optimization of FFT processor for OFDM Receivers (ijsrd.com)
In the last few years, wireless communications have grown rapidly because of the high mobility they allow. However, wireless channels have disadvantages, such as multipath fading, that make them difficult to deal with. OFDM is a modulation that deals efficiently with selective fading channels. There are a large number of FFT algorithms and architectures in the signal processing literature, so the state-of-the-art algorithms and architectures should be analyzed and compared. Different algorithms and architectures lead to different power consumption, area, and speed of the processor. Their ASIC suitability should therefore be analyzed, and effort should be focused on choosing and optimizing algorithms and architectures. In this paper, an FFT processor with a pipelined architecture and a CORDIC-based ROM-free twiddle factor generator is proposed. The proposed algorithm and architecture are to be validated by MATLAB simulation before implementation. After that, the design is implemented on a DSP processor kit with Code Composer Studio. The synthesis results will be compared with other published FFT processor results.
SNOW 3G is a synchronous, word-oriented stream cipher used by the 3GPP standards as a confidentiality and integrity algorithm. It is used as the first set in long term evolution (LTE) and as the second set in universal mobile telecommunications system (UMTS) networks. The cipher uses a 128-bit key and a 128-bit IV to produce 32-bit ciphertext. The paper presents two techniques for performance enhancement. The first technique uses a novel CLA architecture to minimize the propagation delay of the 2^32 modulo adders. The second technique uses a novel S-box architecture to minimize the chip area. The presented work is coded in VHDL and implemented on the Xilinx Virtex xc5vfx100e FPGA device. The presented architecture achieved a maximum frequency of 254.9 MHz and a throughput of 7.2235 Gbps.
Design and Implementation of an Embedded System for Software Defined Radio (IJECEIAES)
This paper proposes the development of high-performance software for demanding real-time embedded systems. This software-based design will enable software engineers and system architects in emerging technology areas such as 5G wireless and Software Defined Networking (SDN) to build their algorithms. An ADSP-21364 floating-point SHARC Digital Signal Processor (DSP) running at 333 MHz is adopted as the platform for the embedded system. To evaluate the proposed embedded system, an implementation of frame, symbol, and carrier phase synchronization is presented as an application. Its performance is investigated with an online Quadrature Phase Shift Keying (QPSK) receiver. The results show that the designed software is implemented successfully on the SHARC DSP, which can be utilized efficiently for such algorithms. In addition, it is shown that the proposed embedded system is pragmatic and capable of dealing with the memory constraints and critical timing issues arising from the long interleaved coded data used for channel coding.
International Journal of Engineering Inventions (IJEI) provides a multidisciplinary passage for researchers, managers, professionals, practitioners and students around the globe to publish high quality, peer-reviewed articles on all theoretical and empirical aspects of Engineering and Science.
Transfer of ut information from fpga through ethernet interface (eSAT Publishing House)
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars, and students of related fields of Engineering and Technology.
Implementation of Viterbi Decoder on FPGA to Improve Design (ijsrd.com)
Data transmissions over wireless channels are affected by attenuation, distortion, interference, and noise, which impair the receiver's ability to receive correct information. Convolutional coding with Viterbi decoding is an FEC technique that is particularly suited to a channel in which the transmitted signal is corrupted mainly by additive white Gaussian noise (AWGN). Convolutional codes are used for error correction. They have rather good correcting capability and perform well even on very bad channels with high error probabilities. Viterbi decoding is the best technique for decoding convolutional codes, but it is limited to smaller constraint lengths. The Viterbi algorithm is a well-known maximum-likelihood algorithm for decoding convolutional codes.
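Maximum-likelihood decoding of a convolutional code can be sketched in a few lines. The example below assumes a rate-1/2 code with constraint length 3 and generators (7, 5) in octal, with hard-decision (Hamming distance) metrics; none of these parameters come from the abstract, and the names `encode` and `viterbi_decode` are illustrative. Survivor paths are copied per state, which loosely resembles a register exchange organization.

```python
K = 3                      # constraint length (assumed)
G = (0b111, 0b101)         # generator polynomials (7, 5) octal (assumed)
N_STATES = 1 << (K - 1)


def parity(x: int) -> int:
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def encode(bits):
    """Rate-1/2 convolutional encoder; the newest bit sits in the MSB."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1
    return out


def viterbi_decode(received, n_bits):
    """Hard-decision Viterbi decoder starting from the all-zero state."""
    inf = float("inf")
    metric = [0] + [inf] * (N_STATES - 1)
    paths = [[] for _ in range(N_STATES)]
    for i in range(n_bits):
        r = received[2 * i: 2 * i + 2]
        new_metric = [inf] * N_STATES
        new_paths = [None] * N_STATES
        for s in range(N_STATES):
            if metric[s] == inf:
                continue  # state not reachable yet
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                expected = [parity(reg & g) for g in G]
                dist = sum(e != x for e, x in zip(expected, r))
                nxt = reg >> 1
                m = metric[s] + dist
                # Add-compare-select: keep the better path into nxt.
                if m < new_metric[nxt]:
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = min(range(N_STATES), key=lambda s: metric[s])
    return paths[best]


if __name__ == "__main__":
    msg = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
    code = encode(msg)
    code[5] ^= 1  # inject a single channel error
    print(viterbi_decode(code, len(msg)) == msg)
```

The number of states doubles with each increment of the constraint length, which is why the technique is usually restricted to small constraint lengths.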
A study to Design and comparison of Full Adder using Various Techniques (IOSR Journals)
Abstract: Adders are widely used in applications such as digital signal processing (DSP) and microprocessors. In this paper, half adders are simulated and analyzed based on power dissipation, area, and speed on 90 nm technology using the Microwind and Dsch tools. The half adder is the basic building block in the Parallel Feedback Carry Adder (PFCA). Keywords: Full adder, Half adder, PFCA, VLSI
Implementation of UART with BIST Technique Using Low Power LFSR (IJERA Editor)
Asynchronous serial communication is usually implemented by a Universal Asynchronous Receiver Transmitter (UART), mostly used for low-cost, low-speed, short-distance data exchange between a processor and peripherals. A UART provides a full-duplex serial communication link and is used in data communication and control systems. There is a need to realize the UART function in a single chip or very few chips. Further, systems designed without full testability are open to an increased possibility of product failures and missed market opportunities. It is also necessary to ensure that the data transfer is error-free. This project targets the introduction of built-in self-test (BIST) and a status register to the UART. The basic idea is to reduce the switching activity among the test patterns as much as possible. In this approach, the single-input-change patterns generated by a counter and a Gray code generator are exclusive-ORed with the seed generated by the low-power linear feedback shift register (LP-LFSR). The 8-bit UART with status register and BIST module is coded in Verilog HDL, synthesized and simulated using Xilinx XST and ISim version 14.4, and realized on an FPGA.
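The two pattern sources mentioned above can be illustrated with a short sketch: a Gray code generator, whose consecutive outputs differ in exactly one bit, and a plain 8-bit Fibonacci LFSR. This is a generic LFSR, not the low-power LP-LFSR of the paper; the tap positions (8, 6, 5, 4) and the function names are assumptions for illustration only.

```python
def gray_code(n: int) -> int:
    """Binary-reflected Gray code of n; consecutive values differ in one bit."""
    return n ^ (n >> 1)


def lfsr8_step(state: int) -> int:
    """One step of an 8-bit Fibonacci LFSR with taps at bits 8, 6, 5, 4
    (1-based), shifting left and inserting the feedback at bit 0."""
    fb = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
    return ((state << 1) | fb) & 0xFF


def lfsr_period(seed: int = 0x01) -> int:
    """Count the steps until the LFSR returns to its seed."""
    state, n = lfsr8_step(seed), 1
    while state != seed:
        state = lfsr8_step(state)
        n += 1
    return n


if __name__ == "__main__":
    print([format(gray_code(i), "03b") for i in range(8)])
    print(lfsr_period())  # 255: every non-zero 8-bit state is visited
```

The single-bit transitions of the Gray sequence are what limit switching activity; the all-zero LFSR state is a lock-up state and must be avoided as a seed.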
FPGA Implementation of Mixed Radix CORDIC FFT (IJSRD)
In this paper, the architecture and FPGA implementation of a Coordinate Rotation Digital Computer (CORDIC) pipeline Fast Fourier Transform (FFT) processor is presented. The FFT is a highly efficient algorithm that uses a divide-and-conquer approach for fast calculation of the Discrete Fourier Transform (DFT) to obtain the frequency spectrum. The CORDIC algorithm is hardware efficient: it avoids conventional multiply-accumulate (MAC) units and instead evaluates trigonometric functions by rotating a complex vector using only add and shift operations. We have developed fixed-point FFT processors in VHDL for implementation on a Field Programmable Gate Array. A mixed-radix 8-point DIF FFT/IFFT architecture with a CORDIC twiddle factor generation unit and a pipelined FFT processor has been developed on a Xilinx XC3S500E Spartan-3E FPGA and simulated with a maximum frequency of 157.359 MHz for a 16-bit, 8-point FFT. Results show that the processor uses fewer LUTs while achieving this maximum frequency.
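The rotation-mode CORDIC iteration described above can be sketched as follows. Floating-point arithmetic is used for readability, so each multiplication by `2.0 ** -i` stands in for the right shift a fixed-point design would use; the function name `cordic_rotate` and the default of 16 iterations are assumptions, not details from the paper.

```python
import math


def cordic_rotate(x: float, y: float, angle: float, n_iter: int = 16):
    """Rotate (x, y) by `angle` radians (|angle| below about 1.74) using
    CORDIC micro-rotations, then compensate the accumulated gain."""
    # Elementary angles atan(2^-i); in hardware these are a small constant table.
    angles = [math.atan(2.0 ** -i) for i in range(n_iter)]
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle
    for i in range(n_iter):
        d = 1.0 if z >= 0 else -1.0          # rotation direction
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]                   # residual angle
    return x / gain, y / gain


if __name__ == "__main__":
    # Twiddle factor W_8^1 = exp(-j*2*pi/8): rotate (1, 0) by -pi/4.
    print(cordic_rotate(1.0, 0.0, -2.0 * math.pi / 8))
    print(math.cos(math.pi / 4), -math.sin(math.pi / 4))
```

Rotating the unit vector (1, 0) by the twiddle angle yields the twiddle factor itself, so the same structure can serve as a ROM-free twiddle generator.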
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Performance Analysis of IEEE 802.15.4 Transceiver System under Adaptive White... (IJECEIAES)
Zigbee technology has been developed for short-range wireless sensor networks and follows the IEEE 802.15.4 standard. For such sensors, several considerations should be taken into account, including low data rate and low design complexity, in order to achieve efficient transceiver performance. This research focuses on implementing a digital transceiver system for Zigbee sensors based on IEEE 802.15.4. The system is implemented using offset quadrature phase shift keying (OQPSK) modulation with half-sine pulse shaping. A direct conversion scheme is used in the design of the Zigbee receiver in order to fulfill the requirements mentioned above. System performance is analyzed in terms of BER under additive white Gaussian noise (AWGN), and the effect of using the direct sequence spread spectrum (DSSS) technique is also shown.
FPGA Implementation of FIR Filter using Various Algorithms: A Retrospective (IJORCS)
This paper is a review of low-cost, high-performance FPGA implementations of Finite Impulse Response (FIR) filters. Its key contribution is a detailed analysis of hardware implementations of FIR filters using different algorithms, namely Distributed Arithmetic (DA), DA-Offset Binary Coding (DA-OBC), Common Sub-expression Elimination (CSE), and sum-of-power-of-two (SOPOT), which use fewer resources without affecting the performance of the original FIR filter.
Design of Multiplier Less 32 Tap FIR Filter using VHDL (IJMER)
This paper presents the principles of Distributed Arithmetic and introduces them into FIR filter design. It then presents a 32-tap FIR low-pass filter using Distributed Arithmetic, which saves considerable MAC blocks to decrease the circuit scale; a pipeline structure is also used to increase the system speed. Implementing FIR filters on an FPGA with the traditional method costs considerable hardware resources, which works against decreasing the circuit scale and increasing the system speed. It is well known that an FIR filter consists of delay elements, multipliers, and adders. The use of multipliers in earlier designs gives rise to two demerits:
(i) Increase in area and
(ii) Increase in delay, which ultimately results in low performance (less speed).
Distributed Arithmetic for FIR filter design and implementation is therefore provided in this work to solve this problem. The Distributed Arithmetic structure is used to improve resource usage, and the pipeline structure is used to increase the system speed. Distributed Arithmetic can save considerable hardware resources by using LUTs in place of MAC units.
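How a look-up table can take the place of multipliers may be easier to see in a small sketch of bit-serial Distributed Arithmetic. It assumes integer coefficients, signed two's-complement samples of a fixed width, and only four taps so the table stays small; the names `build_lut` and `da_dot` are illustrative and not taken from the paper.

```python
def build_lut(coeffs):
    """LUT[addr] holds the sum of coeffs[k] for every bit k set in addr."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_dot(lut, samples, width=8):
    """Compute sum(coeffs[k] * samples[k]) without any multiplication.

    Samples must fit in `width`-bit two's complement. One LUT read and
    one shifted accumulation are performed per bit position."""
    acc = 0
    for j in range(width):
        # Gather bit j of every sample into one LUT address.
        addr = 0
        for k, s in enumerate(samples):
            addr |= ((s >> j) & 1) << k
        term = lut[addr] << j
        # The sign bit of a two's-complement word has negative weight.
        acc += -term if j == width - 1 else term
    return acc


if __name__ == "__main__":
    coeffs = [3, -1, 4, 2]
    window = [5, -7, 100, -128]   # current contents of the delay line
    lut = build_lut(coeffs)
    print(da_dot(lut, window), sum(c * s for c, s in zip(coeffs, window)))
```

The table grows as 2^N with the number of taps N, so a 32-tap design would normally split the taps into several smaller tables whose outputs are added; that partitioning is not shown here.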
Implementation of High Speed OFDM Transceiver using FPGA (MangaiK4)
Abstract - An efficient, multi-mode, reconfigurable interleaver/de-interleaver architecture for multiple standards, such as DVB, OFDM, and WLAN, is presented. The interleaver plays a vital role in 4G technologies in recovering symbols from burst errors. The aim of our work is to design a reconfigurable modulation technique, called an adaptive modulation scheme, that uses QAM, QPSK, and BPSK modulation and adapts based on the channel signal-to-noise ratio. A subcarrier allocation algorithm is used specifically to focus on utilizing channels with high gains. Our proposed model can achieve a data rate of at least 2.5 Gbps as per the 3GPP standard by adaptive modulation using QAM, BPSK, and QPSK.
Analysis of Women Harassment in Villages Using CETD Matrix Modal (MangaiK4)
Abstract - It is commonly understood that misbehavior is intended to upset. By law, repeated intentional misbehavior towards women is an offense. The main aim of this paper is to find something interesting that will make us reflect on what has been done for women's rights and gender equality. To address this problem, we adopt the CETD matrix in this paper.
IMPLEMENTATION OF SDC - SDF ARCHITECTURE FOR RADIX-4 FFT VLSICS Design
Very large scale integration and digital signal processing have been crucial technologies over the last few decades. DSP applications require high-performance, low-area and low-power VLSI circuits. This paper discusses the FFT, which is one of the vital components in digital signal processing. In this paper, we propose a single-path delay commutator-feedback (SDC-SDF) architecture for the radix-4 FFT and present its simulation and synthesis results. The radix-4 FFT architecture consists of log4 N-1 SDC stages and 1 SDF stage. The earlier radix-2 SDC-SDF FFT architecture includes log2 N-1 SDC stages and 1 SDF stage. The proposed radix-4 SDC-SDF architecture reduces the number of multiplications and additions as well as the number of stages, which reduces area and power. The architecture is simulated using Modelsim, and design verification and synthesis are done using Xilinx ISE. Compared with the radix-2 SDC-SDF FFT, the proposed architecture achieves less area as well as lower power.
Design of Scalable FFT architecture for Advanced Wireless Communication Stand...IOSRJECE
Nowadays, numerous wireless communication standards place ever more stringent requirements on both throughput and flexibility for FFT computation. Advanced wireless systems support multiple standards to satisfy the demands of user applications. A wireless system supporting multiple standards must also satisfy the performance requirements of each supported standard, which is a challenge when designing a system. Fast Fourier transforms, a kernel processing task in communication systems, have been studied intensively for efficient software and hardware implementations. To design an efficient system, it is necessary to design its performance-critical components efficiently. Each system must meet stringent design parameters such as high speed, low power, low area, low cost, high flexibility and high scalability; designing an FFT processor that supports multiple wireless standards while meeting these requirements is a difficult task. This paper proposes a highly efficient scalable architecture, software tool design, and design implementation. The FFT computation flow is restructured into a scalable structure, so that the FFT can easily be expanded to any-point FFT computation. The various parameters satisfy the conditions and give proper and efficient outputs compared with other platforms.
Implementation of High Throughput Radix-16 FFT ProcessorIJMER
This work extends the radix-4 algorithm to radix-16 to achieve a high throughput of 2.59 giga-samples/s for WPANs. We also reformulate the radix-16 algorithm to achieve low complexity, low area cost and high performance. The radix-16 FFT is obtained by cascading radix-4 butterfly units. This facilitates a low-complexity realization of the radix-16 butterfly operation and a high operating speed owing to its optimized pipelined structure. In addition, a new three-stage multiplier for twiddle factor multiplication is proposed, which has lower area and power consumption than conventional complex multipliers.
Performance Analysis of OFDM Transceiver with Folded FFT and LMS Filteridescitation
This paper presents an Orthogonal Frequency Division Multiplexing (OFDM)
transceiver that makes use of a low power Fast Fourier Transform (FFT) along with a Least
Mean Square (LMS) filter. The folded FFT is developed via folding transformation and
register minimization techniques with real values as inputs which leads to reduction in
hardware complexity by exploiting the redundancy present in computing the FFT samples
and also the amount of power consumed. A LMS filter is also designed for the purpose of
noise removal. The OFDM transceiver with the folded FFT and LMS filter is analyzed in
terms of error performance to validate the advantages of less power consumption and
hardware utilization when compared to the traditional OFDM system with conventional
FFT. The individual components and the entire OFDM system that has been proposed are
modeled using Verilog HDL and functionally verified using Xilinx ISIM simulator.
Field programmable gate array implementation of multiwavelet transform based...IJECEIAES
This article offers an efficient design and implementation of a discrete multiwavelet critical-sampling transform based orthogonal frequency division multiplexing (DMWCST-OFDM) transceiver using field programmable gate array (FPGA) platform. The design uses 16-point discrete multiwavelet critical-sampling transform (DMWCST) and its inverse as main processing modules. All modules were designed using a part of Vivado® Design Suite version (2015.2), which is Xilinx system generator (XSG), and is compatible with MATLAB Simulink version R2013b. The FPGA implementation is carried out on a Zynq (XC7Z020-1CLG484) evaluation board with joint test action group (JTAG) hardware cosimulation. According to the results obtained from the implementation tools, the implemented system is efficient in terms of resource utilization and could support the real-time operations.
A Novel VLSI Architecture for FFT Utilizing Proposed 4:2 & 7:2 CompressorIJERD Editor
With the appearance of new innovation in the fields of VLSI and correspondence, there is likewise a perpetually developing interest for fast transforming and low range outline. It is likewise a remarkable certainty that the multiplier unit structures a fundamental piece of processor configuration. Because of this respect, rapid multiplier architectures turn into the need of the day. In this paper, we acquaint a novel structural engineering with perform high velocity duplication utilizing old Vedic math's strategies. Another fast approach using 4:2 compressors and novel 7:2 compressors for expansion has additionally been joined in the same and has been investigated. Upon examination, the compressor based multiplier present in this paper, is just about two times quicker than the mainstream routines for augmentation. Likewise we outline a FFT utilizing compressor based multiplier. This all configuration and examinations were done on a Xilinx Spartan 3e arrangement of FPGA and the timing and zone of the outline, on the same have been ascertained.
Design of an efficient binary phase-shift keying based IEEE 802.15.4 transce...IJECEIAES
The IEEE 802.15.4 physical layer (PHY) standard is one of the communication standards with wireless features by providing low-power and low-data rates in wireless personal area network (WPAN) applications. In this paper, an efficient IEEE 802.15.4 digital transceiver hardware architecture is designed using the binary phase-shift keying (BPSK) technique. The transceiver mainly has transmitter and receiver modules along with the error calculation unit. The BPSK modulation and demodulation are designed using a digital frequency synthesizer (DFS). The DFS is used to generate the in-phase (I) and quadrature-phase (Q) signals and also provides better system performance than the conventional voltagecontrolled oscillator (VCO) and look up table (LUT) based memory methods. The differential encoding-decoding mechanism is incorporated to recover the bits effectively and to reduce the hardware complexity. The simulation results are illustrated and used to find the error bits. The design utilizes less chip area, works at 268.2 MHz, and consumes 108 mW of total power. The IEEE 802.15.4 transceiver provides a latency of 3.5 clock cycles and works with a throughput of 76.62 Mbps. The bit error rate (BER) of 2×10-5 is achieved by the proposed digital transceiver and is suitable for real-time applications. The work is compared with existing similar approaches with better improvement in performance parameters.
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORMVLSICS Design
This paper is devoted to the design of a dual-core crypto processor for executing both prime field and binary field instructions. The proposed design is specifically optimized for the field programmable gate array (FPGA) platform. The combined execution of instructions from two different fields (prime field GF(p) and binary field GF(2^m)) is analysed. The design is implemented on Spartan 3E and Virtex-5 devices, and the performance results of both are compared. The implementation results show the parallelism obtained using dual field instructions.
A Modified Radix-2^4 SDF Pipelined OFDM Module for FPGA based MB-OFDM UWB Systems
M. Santhi, S. Arun Kumar, G. S. Praveen Kalish, K. Murali, S. Siddharth, G. Lakshminarayanan
Department of ECE, National Institute of Technology, Thiruchirapalli.
santhiphd@gmail.com laksh@nitt.edu
Abstract - The OFDM module in the MB-OFDM UWB transmitter must operate at 528 MHz. This is a challenging task because the OFDM module has to calculate a 128-point IFFT. Earlier papers used the radix-2^4 SDF algorithm with parallel processing architectures of block size two to achieve the required speed and implemented the module on ASIC. In this paper a novel scheme, the "modified radix-2^4 SDF algorithm", is proposed for calculating the 128-point IFFT. In the proposed scheme, the order of the twiddle factor sequence differs from that of the earlier radix-2^4 SDF algorithm. The change in twiddle factor sequence makes the CSD multiplier used for the IFFT calculation easier to implement. It is also proposed that the required speed can be achieved on an FPGA itself without using parallel processing architectures, by pipelining the OFDM module and using LPMs. This reduces area compared with the earlier approach of using parallel processing architectures of block size two. To improve accuracy, the internal wordlength is maintained at 13 bits, which is 7 bits more than the input, to account for the overflows at each of the 7 stages of the OFDM module. The proposed scheme, with increased complexity for better accuracy, is tested on an ALTERA Stratix III EP3SL50F484C2 device. The implementation verifies that the OFDM module achieves a maximum clock speed of 528 MHz (528 MSamples/s). Since ASICs are in general about three times faster than FPGAs, operating an ASIC-based OFDM module at 528 MHz with the proposed modified radix-2^4 SDF pipelined algorithm should be considerably easier.
Keywords - MB-OFDM, SDF, FFT, FPGA.
I. INTRODUCTION
Ultra wideband (UWB) communication systems, which enable the delivery of data from a rate of 110 Mb/s at a distance of 10 m to a rate of 480 Mb/s at a distance of 2 m, are ideally suited to short-range wireless communications because they can share a frequency band with existing narrowband systems and offer a higher data rate than 802.11 or Bluetooth [1]. One of the communication methods for the IEEE 802.15.3a standard is Multiband Orthogonal Frequency Division Multiplexing (MB-OFDM), which offers 528 MHz bandwidth [2][3]. MB-OFDM-based UWB not only provides reliable high-data-rate transmission in time-dispersive or frequency-selective channels without complex time-domain channel equalizers but also offers high spectral efficiency.
The FFT/IFFT processor is one of the modules with high computational complexity in the physical layer of the UWB system, and the execution time of the 128-point FFT/IFFT in the UWB system is only 312.5 ns. Our processor reduces power consumption and hardware cost by using a higher-radix FFT algorithm, less memory, and fewer complex multipliers.
This paper is organized as follows. Section II describes the design issues of MB-OFDM UWB communication systems. Section III describes the proposed 128-point radix-2^4 FFT/IFFT algorithm. Section IV describes the proposed 128-point radix-2^4 FFT/IFFT architecture. In Section V, the implementation and performance of the proposed FFT/IFFT architecture are discussed. Conclusions and further work are presented in Sections VI and VII respectively.
II. DESIGN ISSUES OF THE FFT PROCESSOR
A block diagram of the proposed physical layer of the OFDM-based UWB system is shown in Fig. 1 [4]. In the UWB system, the data rate ranges from 53.3 Mb/s to 480 Mb/s with code rates of 1/3, 11/32, 1/2, 5/8, and 3/4. The bandwidth of the transmitted signal is 528 MHz and the OFDM symbol duration is 312.5 ns, including 60.61 ns for the cyclic prefix and 9.47 ns for the guard interval [2][3]. Thus, an FFT/IFFT has to compute one OFDM symbol within 312.5 ns, and the required throughput of the 128-point FFT/IFFT is up to 409.6 MSamples/s.
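The timing figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes the 528 MHz sample clock stated in the text; the sample counts it derives (32 for the prefix, 5 for the guard interval) are inferred from the quoted durations and are not stated explicitly in the paper.

```python
# Cross-check of the MB-OFDM symbol timing quoted in the text.
# Assumption: every duration is a whole number of 528 MHz sample periods.
SAMPLE_RATE_HZ = 528e6
FFT_POINTS = 128

SYMBOL_NS = 312.5   # full OFDM symbol duration
PREFIX_NS = 60.61   # cyclic prefix duration
GUARD_NS = 9.47     # guard interval duration


def samples(duration_ns: float) -> int:
    """Number of sample periods (rounded) that fit in a duration given in ns."""
    return round(duration_ns * 1e-9 * SAMPLE_RATE_HZ)


prefix_samples = samples(PREFIX_NS)
guard_samples = samples(GUARD_NS)
symbol_samples = samples(SYMBOL_NS)

# Average rate needed to finish 128 points within one symbol period.
required_msps = FFT_POINTS / (SYMBOL_NS * 1e-9) / 1e6

print(prefix_samples, guard_samples, symbol_samples, required_msps)
```

Under these assumptions the 128 data samples plus prefix and guard samples account for the whole symbol, and the 128 / 312.5 ns ratio reproduces the 409.6 MSamples/s requirement.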
Various FFT architectures, such as single-memory
architecture, dual-memory architecture, pipelined architecture,
array architecture, and cached-memory architecture, have been
proposed in the last three decades. In our view, the pipelined
architecture should be the best choice for UWB systems since
it can provide high throughput rate with acceptable hardware
cost.
The pipelined FFT architecture typically falls into one of the two following categories: multipath delay commutator (MDC) and single-path delay feedback (SDF) [5]. In general, the MDC scheme can achieve a higher throughput rate, while the SDF scheme needs less memory and hardware cost. In addition, a higher-radix FFT algorithm is difficult to implement in the traditional MDC architecture. Table 1 compares the hardware requirements of various architectures. The proposed architecture, based on the radix-2^4 SDF architecture, was selected for implementation owing to its low hardware cost and greater area efficiency, and it can also provide sufficient throughput to meet the UWB specifications.
Proceedings of the 2008 International Conference on Computing, Communication and Networking (ICCCN 2008)
978-1-4244-3595-1/08/$25.00 ©2008 IEEE
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY TIRUCHIRAPALLI. Downloaded on May 14, 2009 at 02:04 from IEEE Xplore. Restrictions apply.
Fig. 1. Block diagram of the MB-OFDM UWB receiver system
TABLE 1 COMPARISON OF HARDWARE REQUIREMENTS FOR N-LENGTH FFT WITH DIFFERENT ARCHITECTURES
Architecture   Complex Multiplier #   Complex Adder #   Memory size   Control circuit
R2SDF          log2(N)-2              log2(N)           N-1           simple
R2MDC          log2(N)-2              4log4(N)          3N/2-2        simple
R4SDF          log4(N)-1              log4(N)           N-1           medium
R4MDC          3(log4(N)-1)           8log4(N)          5N/2-4        simple
R2^2SDF        log4(N)-1              4log4(N)          N-1           simple
R2^3SDF        log8(N)-1              4log4(N)          N-1           simple
R2^4SDF        log16(N)-1             4log4(N)          N-1           simple
III. PROPOSED RADIX-2^4 SDF ALGORITHM
A discrete Fourier transform (DFT) of length N (= 128) is defined as
$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1 \tag{1}$$
where W_N, the so-called "twiddle factor", denotes the N-th primitive root of unity, with its exponent evaluated modulo N, k is the frequency index, and n is the time index. In order to derive the radix-2^4 algorithm, consider the first 4 steps of decomposition [6]. Applying a 5-dimensional linear index map, wherein the 5th dimension is itself decomposed into a 2-bit and a 1-bit index, we have
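$$n = \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + \tfrac{N}{8} n_3 + \tfrac{N}{16} n_4 + \tfrac{N}{64} n_5 + n_6 \right\rangle$$
$$k = \left\langle k_1 + 2k_2 + 4k_3 + 8k_4 + 32k_5 + 16k_6 \right\rangle \tag{2}$$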
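The common factor algorithm (CFA) takes the form of
$$X(k_1 + 2k_2 + 4k_3 + 8k_4 + 32k_5 + 16k_6)$$
$$= \sum_{n_6}\sum_{n_5}\sum_{n_4}\sum_{n_3}\sum_{n_2}\sum_{n_1} x\!\left(\tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + \tfrac{N}{8} n_3 + \tfrac{N}{16} n_4 + \tfrac{N}{64} n_5 + n_6\right) W_N^{nk}$$
$$= \sum_{n_6}\sum_{n_5} \left[ G(n_5, n_6, k_1, k_2, k_3, k_4)\, W_N^{\left(\frac{N}{64} n_5 + n_6\right)(k_1 + 2k_2 + 4k_3 + 8k_4)} \right] W_{N/16}^{\left(\frac{N}{64} n_5 + n_6\right)(2k_5 + k_6)} \tag{3}$$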
if
(5)
where H(n) denotes the second butterfly unit,
$$H(n) = H(n, k_1, k_2) = B(n, k_1) + (-j)^{(k_1 + 2k_2)}\, B\!\left(n + \tfrac{N}{4}, k_1\right)$$
and B(n, k1) denotes the first butterfly unit,
$$B(n, k_1) = x(n) + (-1)^{k_1}\, x\!\left(n + \tfrac{N}{2}\right)$$
The algorithm can use complex constant multipliers instead of programmable complex multipliers. The Canonic Signed Digit (CSD) representation of a constant contains the fewest non-zero digits, so a CSD constant multiplier can be used to reduce the area and power consumption [7]. Fig. 2 shows the signal flow graph (SFG) of the 128-point radix-2^4 SDF FFT algorithm.
Fig 2. Signal flow graph of the proposed R2^4SDF algorithm
IV. PROPOSED FFT ARCHITECTURE FOR THE MB-OFDM UWB SYSTEM
A block diagram of the proposed single data-path 128-point R2^4SDF FFT/IFFT processor is shown in Fig. 3. The proposed architecture consists of a memory block, butterfly units (BF1, BF2), programmable complex multipliers, CSD complex constant multipliers, register files, and some multiplexers. The FFT processor can be transformed into an IFFT block by performing the operation shown in Fig 4. The outputs of the butterfly units are the complex sum and complex difference of two input data x[n] and x[N/2+n], where N=128.
Due to the spatial regularity of the radix-2^4 algorithm, the synchronization control of the processor is very simple. A (log2 N)-bit binary counter serves two purposes: synchronization controller and address counter for twiddle factor reading in each stage. For the first N/2 cycles, the 2-to-1 multiplexers in butterfly module I (as shown in Fig. 5.i) switch to position "0", and the butterfly is idle. The input data from the left is directed to the shift registers until they are filled. In the next N/2 cycles, the multiplexers turn to position "1", and the butterfly computes a 2-point DFT with the incoming data and the data stored in the shift registers.
$$Z_1(n) = x(n) + x(n + N/2), \qquad 0 \le n < N/2$$
$$Z_1(n + N/2) = x(n) - x(n + N/2) \tag{6}$$
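The multiplexer schedule described above can be mimicked with a small behavioural model. The sketch below is a simplified software illustration, not the RTL of the paper: the function name `sdf_stage` is hypothetical, the feedback shift register is modelled with a `deque`, and twiddle multiplication is left out.

```python
from collections import deque


def sdf_stage(stream, delay):
    """Behavioural model of one radix-2 SDF butterfly stage.

    `delay` is the depth of the feedback shift register (N/2 for the first stage).
    One input sample is consumed and one output sample is produced per cycle.
    """
    fifo = deque([0] * delay)
    out = []
    for cycle, sample in enumerate(stream):
        stored = fifo.popleft()
        if (cycle // delay) % 2 == 0:
            # Mux position "0": butterfly idle, input fills the shift register,
            # and whatever was stored earlier (differences) streams out.
            fifo.append(sample)
            out.append(stored)
        else:
            # Mux position "1": 2-point DFT of stored and incoming samples.
            fifo.append(stored - sample)   # difference fed back to the register
            out.append(stored + sample)    # sum sent on to the next block
    return out


# 8-point frame followed by 4 flush samples so the stored differences drain out.
frame = [1, 2, 3, 4, 5, 6, 7, 8]
result = sdf_stage(frame + [0, 0, 0, 0], delay=4)
print(result)
```

With this input the sums x(n) + x(n+N/2) appear during the second group of N/2 cycles and the differences x(n) - x(n+N/2) during the third, matching the description of the register feedback.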
The butterfly output Z1(n) is sent on for twiddle-factor multiplication, and Z1(n + N/2) is sent back to the shift registers to be "multiplied" in the following N/2 cycles, while the first half of the next frame of the time sequence is loaded in. The operation of the second butterfly is similar to that of the first one, except that the "distance" of the butterfly input sequence is just N/4 and the trivial twiddle factor multiplication is implemented by real-imaginary swapping with a commutator and controlled add/subtract operations, as in Fig. 5-ii, which requires a two-bit control signal from the synchronizing counter. The data then goes through a full complex multiplier, working at 75% utility, which accomplishes the result of the first level of the radix-4 DFT word by word. Further processing repeats this pattern, with the distance of the input data decreasing by half at each consecutive butterfly stage. After N-1 clock cycles, the complete DFT result streams out to the right, in bit-reversed order. The next frame can be computed without pausing owing to the pipelined processing of each stage.
Fig 3. Block diagram of FFT/IFFT processor
Fig 4. Block diagram of the proposed 128-point R2^4 SDF FFT/IFFT processor
Fig 5.i Structure of BF1
Single-data-path architectures based on the radix-2^4 FFT algorithm have fewer multipliers than those based on lower-radix FFT algorithms. For example, the radix-2^4 algorithm has the same number of multipliers as the radix-2^2 algorithm but reduces the multiplicative complexity by replacing half of the full complex multipliers with trivial constant multipliers [8]. In the CSD complex constant multiplier, the multiplication by the twiddle factors is processed according to their scheduling in the signal flow graph. The output data generated by the BF in the sixth stage are multiplied by a trivial twiddle factor, -j, W(16) or W(48), before they are fed to the last stage.
The Simplification of the Complex Multiplication
Complex multiplication is the key design element in the FFT algorithm. Consider a complex multiplication whose two inputs are the data xr + i xi and the coefficient W = exp(j2π/N) = cos α + i sin α; the result can be expressed as Y = yr + i yi, where
yr = xr cos α - xi sin α = xi(cos α - sin α) - (xi - xr) cos α
yi = xi cos α + xr sin α = xr(cos α + sin α) + (xi - xr) cos α (7)
Fig 5.ii Structure of BF2
With Eq. 7 rewritten in this way, the complex multiplication needs only 3 real multiplications, 1 addition and 2 subtractions, provided that the sum and the difference of the real and imaginary parts of the coefficient are precomputed and stored in the ROM. This algorithm is used for the programmable complex multiplier to reduce the hardware complexity and to increase the speed.
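The three-multiplication scheme can be checked against an ordinary complex product. The sketch below is a floating-point illustration under stated assumptions: the function name `complex_mul_3` is hypothetical, the shared term is taken as (xi - xr)·cos α, and the values cos α - sin α and cos α + sin α stand in for the precomputed ROM entries.

```python
import cmath
import math


def complex_mul_3(xr, xi, alpha):
    """Multiply (xr + j*xi) by (cos(alpha) + j*sin(alpha)) with 3 real multiplications."""
    c, s = math.cos(alpha), math.sin(alpha)
    c_minus_s = c - s            # would be read from ROM, not computed at run time
    c_plus_s = c + s             # would be read from ROM, not computed at run time
    shared = (xi - xr) * c       # multiplication 1, subtraction 1
    yr = xi * c_minus_s - shared  # multiplication 2, subtraction 2
    yi = xr * c_plus_s + shared   # multiplication 3, addition 1
    return yr, yi


def max_error(points, angles):
    """Largest deviation from the direct complex product over a test grid."""
    worst = 0.0
    for xr, xi in points:
        for alpha in angles:
            yr, yi = complex_mul_3(xr, xi, alpha)
            ref = complex(xr, xi) * cmath.exp(1j * alpha)
            worst = max(worst, abs(complex(yr, yi) - ref))
    return worst


grid = [(1.0, 0.0), (0.0, 1.0), (3.0, -4.0), (-2.5, 7.25)]
angle_list = [2 * math.pi * k / 128 for k in range(128)]
print(max_error(grid, angle_list))
```

The operation count in the comments (three multiplications, one addition, two subtractions) matches the figures quoted in the text.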
CSD Multiplier
Since the twiddle factors in the FFT processor are known in advance, we propose the use of a multiplier-less architecture to perform the multiplication with the twiddle factors using shift-and-add operations. The canonical signed digit (CSD) algorithm has been applied to this architecture to further reduce the number of shift-and-add operations required. In this architecture, trivial multiplications are implemented without any multipliers by passing the data through, swapping the real and imaginary parts of the complex data, or changing a sign. The design presented in the paper takes advantage of the symmetries of the twiddle factors in the complex plane.
When the real and imaginary values of a twiddle factor are the same, two CSD constant multipliers and two adder/subtractors are used to generate the output. When the real and imaginary values are not the same, three CSD constant multipliers are used. If the inputs do not need to be multiplied by a twiddle factor, the output results are generated from the input directly.
Pipelining
The radix-2^4 architecture was thoroughly analyzed to find possible areas to be pipelined, based on the design and the critical path delays between the various implemented blocks. The processor was extensively pipelined to achieve the high working frequency needed to meet the UWB specification. Shimming registers are also needed for the control signals to comply with the revised timing.
V. IMPLEMENTATION AND PERFORMANCE
The word length of the proposed FFT/IFFT is 6 bits of external
FFT data [9] for both the real and imaginary parts. The 2's
complement representation of numbers is used in the
processor. Because each adder of the butterfly unit can overflow,
a 13-bit internal FFT precision has been maintained. The
chosen word length keeps the quantization noise low while
also limiting the hardware complexity.
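The relation between the 6-bit input and the 13-bit internal precision can be checked with a short calculation. The sketch assumes one extra bit per butterfly addition stage and log2(128) = 7 such stages; it models only adder growth and ignores any rounding inside the twiddle multipliers. The helper names are hypothetical.

```python
def stages_for(points):
    """Number of radix-2 butterfly stages for a power-of-two transform size."""
    assert points > 0 and points & (points - 1) == 0, "points must be a power of two"
    return points.bit_length() - 1


def internal_word_length(input_bits, points):
    """Input width plus one guard bit per butterfly stage."""
    return input_bits + stages_for(points)


def fits_signed(value, bits):
    """True if `value` is representable as a `bits`-wide 2's complement number."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return lo <= value <= hi


if __name__ == "__main__":
    bits = internal_word_length(6, 128)
    # Worst case for pure accumulation: all 128 inputs at the most negative 6-bit value.
    worst = 128 * -(1 << 5)
    print(bits, worst, fits_signed(worst, bits), fits_signed(worst, bits - 1))
```

The worst-case sum of 128 inputs just fits in 13 bits and does not fit in 12, which is consistent with adding one bit at each of the seven stages.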
After the appropriate word length of the proposed FFT/IFFT
processor was chosen, the architecture of the processor was
modeled in Verilog for an ALTERA Stratix III FPGA. Some of
the modules were generated from the ALTERA MegaWizard
Plug-in Manager and others were written at the RTL level,
including the top level wrapper file, which contains all the
instantiated modules and the connectivity information in RTL
(Verilog HDL). The TimeQuest timing analyzer and Chip
Planner (Floorplan and Chip Editor) of QUARTUS II 8.0 were
used to analyze timing, hardware expenditure and so on.
Vector waveforms associated with the RTL description were
created and the stimulus was provided in an external file. Using the
vector waveform file, simulations were carried out for the
design to validate the behavioral description. The results were
obtained incrementally, first for a sub block comprising one
module of the FFT. Finally the results were obtained for the
whole design comprising seven such sub blocks, a global
clock and dual port RAMs.

TABLE 2 IMPLEMENTATION RESULTS OF THE PROPOSED PROCESSOR
Family                              ALTERA Stratix III       ALTERA Stratix II
Device                              EP3SL50F484C2            EP2S60F1020C4
ALUTs                               7972/38000 (3%)          7822/48352 (16%)
ALMs                                3986/19000 (3%)          4375/19000 (3%)
DSP block elements                  6/216 (<3%)              6/288 (2%)
Total memory bits                   3328/1880064 (<1%)       8192/2544192 (<1%)
Word length                         I: 6 bits, Q: 6 bits     I: 6 bits, Q: 6 bits
Number of registers                 7580/38000 (20%)         7697/38000 (20%)
Programmable complex multipliers #  1                        1
Constant complex multipliers #      2                        2
Number of complex adders            28                       28
Clock rate                          528 MHz                  350 MHz
Throughput rate                     528 Msamples/s           350 Msamples/s
Critical path delay                 1.87 ns                  2.87 ns

The output of the Verilog coded architecture agreed with the
output data of MATLAB and with the FFT/IFFT in our UWB
platform, which was designed on an Excel worksheet that
clearly depicts the outputs together with the signal flow graph.
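A software cross-check of this kind can be sketched as below. The example compares a plain recursive radix-2 FFT against a direct DFT in floating point; it is a stand-in for the idea of validating against a reference model and does not reproduce the radix-2^4 SDF data flow, the fixed-point arithmetic, or the MATLAB and Excel models mentioned above. All names and the test stimulus are assumptions.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used here as the reference ("golden") model."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle times odd branch
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def max_error(a, b):
    """Largest absolute difference between two equally long sequences."""
    return max(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    # Deterministic 128-point stimulus with small integer parts (fits in 6 bits).
    x = [complex((i * 7) % 13 - 6, (i * 5) % 11 - 5) for i in range(128)]
    print(max_error(dft(x), fft_radix2(x)))
```

For a fixed-point hardware model the comparison threshold would have to account for quantization, so exact agreement is only expected when the reference applies the same rounding.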
The implementation of the proposed FFT/IFFT processor
was carried out on a Stratix II EP2S60F1020C4 device and
simulated for an ALTERA Stratix III EP3SL50F484C2. The
input data is supplied through a dual port RAM, and a PLL unit is
used to provide the required clock frequency. The output is
checked using a dual port RAM and the in-system memory
content editor. Table 2 shows the performance and resource
usage of the implemented processor. This shows that the processor
is area efficient, so the entire MB-OFDM receiver/transmitter
with the other modules can be accommodated in a
single chip. It has a significantly reduced number of complex
multiplications and complex additions. The critical path delay
occurs between the input RAM and the first butterfly unit, and so
the processor is capable of running at UWB speeds if
implemented within a larger system.
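The clock rates in Table 2 translate into a time per 128-point transform as follows. The sketch assumes one complex sample enters the pipeline per clock cycle, which matches the throughput figures in the table (528 Msamples/s at 528 MHz); the function name is hypothetical.

```python
def transform_time_ns(points, clock_mhz, samples_per_clock=1):
    """Time in nanoseconds to stream `points` samples through the pipeline."""
    return points / (clock_mhz * samples_per_clock) * 1e3


if __name__ == "__main__":
    for family, clock_mhz in (("Stratix III", 528), ("Stratix II", 350)):
        print(family, round(transform_time_ns(128, clock_mhz), 2), "ns")
```

This counts only the streaming time of one block of 128 samples and ignores the initial pipeline latency.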
All the previous implementations were on ASIC [9], and
so a comparison with them is not meaningful. Table 3 shows a
comparison of the performance of different FFT processors
implemented on FPGA. The validity and efficiency of the
proposed architecture have been verified by extensive
simulation and implementation. Fig 6 shows the
implementation results of the proposed FFT/IFFT processor.
TABLE 3 COMPARISON OF THE PERFORMANCE OF DIFFERENT PROCESSORS
Family                                                   Maximum frequency
Altera FFT MegaCore function on Stratix III [10]         456 MHz
Proposed processor on ALTERA Stratix II EP2S60F1020C4    350 MHz
Proposed processor on ALTERA Stratix III EP3SL50F484C2   528 MHz
VI. CONCLUSION
An OFDM module implemented as a 128-point FFT/IFFT
processor for an FPGA-based MB-OFDM UWB system using
the proposed modified radix-2^4 SDF pipelined algorithm has
been successfully implemented on ALTERA STRATIX III
and STRATIX II FPGAs without using parallel processing
architectures. The high speed is achieved by using extensive
pipelining on Altera's LPM. The hardware cost of memory
and complex multipliers is reduced by adopting delay feedback
and data scheduling approaches. In addition, the number of
complex multiplications is reduced effectively by using a
higher radix algorithm and CSD complex multipliers.
Also, to improve the accuracy of the proposed scheme, the
internal wordlength is maintained at 13 bits, which is 7 bits
more than the input, to account for the overflows at each of the
7 stages of the OFDM module. The implementation results
show that the throughput rate is 350 MSamples/s at 350 MHz
on the ALTERA STRATIX II and 528 MSamples/s at 528 MHz
on the ALTERA STRATIX III device. The high throughput rate
of the OFDM module, together with the internal wordlength
increased from 6 bits to 13 bits to improve accuracy, meets the
MB-OFDM UWB system's specifications.
Fig 6. Results of the implemented processor
VII. REFERENCES
[1] Time Domain, "UWB Applications, Demonstration & Regulatory
Update," Sept 2001 workshop, March 20, 2001.
[2] A. Batra et al., "Multi-band OFDM Physical Layer Proposal for IEEE
802.15 Task Group 3a," IEEE P802.15-03/268r3, March 2004.
[3] A. Batra, J. Balakrishnan, G. R. Aiello, J. R. Foerster, A. Dabak, "Design
of Multiband OFDM System for Realistic UWB Channel Environment,"
IEEE Trans. on Microwave Theory and Techniques, vol. 52, no. 9, pp.
2123-2138, Sept. 2004.
[4] Y-W. Lin, H-Y. Liu, and C-Y. Lee, "A 1-GS/s FFT/IFFT processor for
UWB applications," IEEE Journal of Solid-State Circuits, vol. 40, no. 8,
pp. 1726-1735, August 2005.
[5] S. He and M. Torkelson, "Designing pipeline FFT processor for
OFDM (de)modulation," in Proc. URSI Int. Symp. Signals, Systems,
and Electronics, vol. 29, Oct. 1998, pp. 257-262.
[6] J. Lee, H. Lee, S-I. Cho, S-S. Choi, "A High-Speed, Low-Complexity
Radix-2^4 FFT Processor for MB-OFDM UWB Systems," IEEE Inter.
Symp. on Circuits and Systems, pp. 4719-4722.
[7] S-M. Kim, J-G. Chung, and K. K. Parhi, "Low Error Fixed-width CSD
Multiplier with Efficient Sign Extension," IEEE Transactions on
Circuits and Systems-II, vol. 50, no. 12, Dec. 2003.
[8] H. Lee, M. Shin, "A High-Speed Low-Complexity Two-Parallel Radix-2^4
FFT/IFFT Processor for UWB Applications," IEEE Asian Solid-State
Circuits Conference, November 2007.
[9] R. S. Sherratt, S. Makino, "Numerical Precision Requirements on the
Multiband Ultra-Wideband System for Practical Consumer Electronic
Devices," IEEE Transactions on Consumer Electronics, vol. 51, no. 2,
May 2005.
[10] FFT MegaCore Function User Guide, MegaCore Version 7.2,
www.altera.com