In the last few years, efficient resource management turned out to be one of the major challenges for hardware designers. Strategies of reusability through reconfiguration have demonstrated interesting potentials to address it, providing also power and area minimization. The Multi-Dataflow Composer (MDC) tool has been presented to the scientific community to automatically build-up runtime coarse-grained reconfigurable platforms. Originally conceived in the field of Reconfigurable Video Coding (RVC), the MDC allows achieving attractive results also for hardware designers operating in different application fields. In this work, we intend to demonstrate the potential orthogonality of the MDC approach with respect to the RVC domain. A runtime reconfigurable wavelet denoiser, targeted for biomedical applications, has been developed and prototyped onto an FPGA Development Board Spartan 3E 1600. Impressive results in terms of resource minimization have been achieved with respect to more traditional solutions.
How to Terminate the GLIF by Building a Campus Big Data Freeway SystemLarry Smarr
12.10.11
Keynote Lecture
12th Annual Global LambdaGrid Workshop
Title: How to Terminate the GLIF by Building a Campus Big Data Freeway System
Chicago, IL
In this deck from the Perth HPC Conference, Rob Farber from TechEnablement presents: AI is Impacting HPC Everywhere.
"The convergence of AI and HPC has created a fertile venue that is ripe for imaginative researchers — versed in AI technology — to make a big impact in a variety of scientific fields. From new hardware to new computational approaches, the true impact of deep- and machine learning on HPC is, in a word, “everywhere”. Just as technology changes in the personal computer market brought about a revolution in the design and implementation of the systems and algorithms used in high performance computing (HPC), so are recent technology changes in machine learning bringing about an AI revolution in the HPC community. Expect new HPC analytic techniques including the use of GANs (Generative Adversarial Networks) in physics-based modeling and simulation, as well as reduced precision math libraries such as NLAFET and HiCMA to revolutionize many fields of research. Other benefits of the convergence of AI and HPC include the physical instantiation of data flow architectures in FPGAs and ASICs, plus the development of powerful data analytic services."
Learn more: http://www.techenablement.com/
and
http://hpcadvisorycouncil.com/events/2019/australia-conference/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
A High-Performance Campus-Scale Cyberinfrastructure for Effectively Bridging ...Larry Smarr
10.04.07
Presentation by Larry Smarr to the NSF Campus Bridging Workshop
University Place Conference Center
Title: A High-Performance Campus-Scale Cyberinfrastructure for Effectively Bridging End-User Laboratories to Data-Intensive Sources
Indianapolis, IN
In this deck from the HPC User Forum at Argonne, Andrew Siegel from Argonne presents: ECP Application Development.
"The Exascale Computing Project is accelerating delivery of a capable exascale computing ecosystem for breakthroughs in scientific discovery, energy assurance, economic competitiveness, and national security. ECP is chartered with accelerating delivery of a capable exascale computing ecosystem to provide breakthrough modeling and simulation solutions to address the most critical challenges in scientific discovery, energy assurance, economic competitiveness, and national security. This role goes far beyond the limited scope of a physical computing system. ECP’s work encompasses the development of an entire exascale ecosystem: applications, system software, hardware technologies and architectures, along with critical workforce development."
Watch the video: https://wp.me/p3RLHQ-kSL
Learn more: https://www.exascaleproject.org
and
http://hpcuserforum.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
This document discusses real-time embedded acoustic DSP projects. It begins by describing the objective of creating audio special effects using DSP algorithms. It then discusses the technologies applied, including the hardware used which is a TMS320C6713 DSK and the Code Composer Studio software. The core of the document discusses the theory behind building blocks like comb filters, all-pass filters, and notch filters that are used to create effects like echo, flanging, chorus, phasing, reverb, tremolo, and more. It concludes by discussing considerations for real-time embedded DSP applications using the TMS320C6713 architecture.
FR1.L09 - PREDICTIVE QUANTIZATION OF DECHIRPED SPOTLIGHT-MODE SAR RAW DATA IN...grssieee
This document presents methods for predictive quantization of dechirped spotlight-mode synthetic aperture radar (SAR) raw data in the transform domain. It discusses previous work on SAR data compression, analyzes the characteristics of spotlight SAR data in the inverse discrete Fourier transform (IDFT) domain, and proposes three predictive encoding schemes - transform domain block predictive quantization (TD-BPQ), transform domain block predictive vector quantization (TD-BPVQ), and predictive trellis coded quantization (TD-PTCQ) - to take advantage of correlations in the transformed data. Numerical results on an example dataset show SNR improvements of up to 6 dB compared to baseline block adaptive quantization.
iDiff 2008 conference #07 IP-Racine Project OverviewBenoit Michel
The IP-RACINE project was a 40-month, 11+ million Euro European Commission-funded effort to improve digital cinema workflow from content capture to exhibition. The project involved 11 partners across the cinema industry value chain working on 6 areas: virtual studios, post-production, exhibition, image/sound processing, skills training, and promotion. The project developed prototypes and demonstrated interoperability between partner technologies to create a coherent digital cinema workflow from scene to screen.
This document contains contact information for Candor Minds and a list of 29 codes describing VLSI and digital circuit design projects. The codes are grouped under categories such as High Speed, Low Power, Testing, and VLSI with MATLAB. Each code lists a title, brief 1-2 sentence description of the project. The document provides an overview of the types of projects available from Candor Minds in the areas of VLSI design, digital circuits, and DSP applications.
How to Terminate the GLIF by Building a Campus Big Data Freeway SystemLarry Smarr
12.10.11
Keynote Lecture
12th Annual Global LambdaGrid Workshop
Title: How to Terminate the GLIF by Building a Campus Big Data Freeway System
Chicago, IL
In this deck from the Perth HPC Conference, Rob Farber from TechEnablement presents: AI is Impacting HPC Everywhere.
"The convergence of AI and HPC has created a fertile venue that is ripe for imaginative researchers — versed in AI technology — to make a big impact in a variety of scientific fields. From new hardware to new computational approaches, the true impact of deep- and machine learning on HPC is, in a word, “everywhere”. Just as technology changes in the personal computer market brought about a revolution in the design and implementation of the systems and algorithms used in high performance computing (HPC), so are recent technology changes in machine learning bringing about an AI revolution in the HPC community. Expect new HPC analytic techniques including the use of GANs (Generative Adversarial Networks) in physics-based modeling and simulation, as well as reduced precision math libraries such as NLAFET and HiCMA to revolutionize many fields of research. Other benefits of the convergence of AI and HPC include the physical instantiation of data flow architectures in FPGAs and ASICs, plus the development of powerful data analytic services."
Learn more: http://www.techenablement.com/
and
http://hpcadvisorycouncil.com/events/2019/australia-conference/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
A High-Performance Campus-Scale Cyberinfrastructure for Effectively Bridging ...Larry Smarr
10.04.07
Presentation by Larry Smarr to the NSF Campus Bridging Workshop
University Place Conference Center
Title: A High-Performance Campus-Scale Cyberinfrastructure for Effectively Bridging End-User Laboratories to Data-Intensive Sources
Indianapolis, IN
In this deck from the HPC User Forum at Argonne, Andrew Siegel from Argonne presents: ECP Application Development.
"The Exascale Computing Project is accelerating delivery of a capable exascale computing ecosystem for breakthroughs in scientific discovery, energy assurance, economic competitiveness, and national security. ECP is chartered with accelerating delivery of a capable exascale computing ecosystem to provide breakthrough modeling and simulation solutions to address the most critical challenges in scientific discovery, energy assurance, economic competitiveness, and national security. This role goes far beyond the limited scope of a physical computing system. ECP’s work encompasses the development of an entire exascale ecosystem: applications, system software, hardware technologies and architectures, along with critical workforce development."
Watch the video: https://wp.me/p3RLHQ-kSL
Learn more: https://www.exascaleproject.org
and
http://hpcuserforum.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
This document discusses real-time embedded acoustic DSP projects. It begins by describing the objective of creating audio special effects using DSP algorithms. It then discusses the technologies applied, including the hardware used which is a TMS320C6713 DSK and the Code Composer Studio software. The core of the document discusses the theory behind building blocks like comb filters, all-pass filters, and notch filters that are used to create effects like echo, flanging, chorus, phasing, reverb, tremolo, and more. It concludes by discussing considerations for real-time embedded DSP applications using the TMS320C6713 architecture.
FR1.L09 - PREDICTIVE QUANTIZATION OF DECHIRPED SPOTLIGHT-MODE SAR RAW DATA IN...grssieee
This document presents methods for predictive quantization of dechirped spotlight-mode synthetic aperture radar (SAR) raw data in the transform domain. It discusses previous work on SAR data compression, analyzes the characteristics of spotlight SAR data in the inverse discrete Fourier transform (IDFT) domain, and proposes three predictive encoding schemes - transform domain block predictive quantization (TD-BPQ), transform domain block predictive vector quantization (TD-BPVQ), and predictive trellis coded quantization (TD-PTCQ) - to take advantage of correlations in the transformed data. Numerical results on an example dataset show SNR improvements of up to 6 dB compared to baseline block adaptive quantization.
iDiff 2008 conference #07 IP-Racine Project OverviewBenoit Michel
The IP-RACINE project was a 40-month, 11+ million Euro European Commission-funded effort to improve digital cinema workflow from content capture to exhibition. The project involved 11 partners across the cinema industry value chain working on 6 areas: virtual studios, post-production, exhibition, image/sound processing, skills training, and promotion. The project developed prototypes and demonstrated interoperability between partner technologies to create a coherent digital cinema workflow from scene to screen.
This document contains contact information for Candor Minds and a list of 29 codes describing VLSI and digital circuit design projects. The codes are grouped under categories such as High Speed, Low Power, Testing, and VLSI with MATLAB. Each code lists a title, brief 1-2 sentence description of the project. The document provides an overview of the types of projects available from Candor Minds in the areas of VLSI design, digital circuits, and DSP applications.
This document provides information about the instruction set of a microcontroller. It includes two summaries:
1. The first summary lists instructions and describes how they affect flag settings in the microcontroller's status register, such as the carry and overflow flags.
2. The second summary is a table that provides details about the microcontroller's instruction set. It lists instructions, describes their operation, and specifies the number of bytes and oscillator periods each instruction requires. The table is divided into sections for arithmetic, logical, branching, and other operations.
The document as a whole provides low-level technical specifications about a microcontroller's instruction set, including how instructions are encoded, how they modify status register flags, and their timing
The document describes several of the major projects and personal projects completed by the individual. Some of the major projects include image classification using neural networks, 3D position detection using stereo vision, sub-ammunition counting using optical flow, skin image classification, fingerprint image classification, pedestrian proximity detection, and image registration and defect detection. Personal projects include using deep learning for artistic style transfer and reinforcement learning for a game as well as an IoT plant watering system. The projects cover various domains and techniques including computer vision, image processing, neural networks, deep learning, and IoT.
CC-4005, Performance analysis of 3D Finite Difference computational stencils ...AMD Developer Central
The document discusses performance analysis of 3D finite difference computational stencils on Seamicro fabric compute systems. It provides an overview of the hardware including chassis, compute cards, storage cards, and 3D torus fabric topology. It then describes the software stack and various microbenchmarks performed, including CPU, memory, network and storage benchmarks. It also describes modeling of 3D Laplace's equation using an 8th order finite difference scheme and its discretization over a 25 point stencil for computation on the system.
The document describes the PNX85500 single chip LCD TV system with an integrated 120Hz HD frame rate converter. It contains the world's first digital TV system-on-chip implemented on a 45nm process. The chip integrates many functions including demodulation, networking interfaces, 120Hz frame rate conversion, H.264 decoding, and advanced picture quality algorithms. It has an extremely high level of functional integration while requiring the smallest memory footprint and driving low costs.
The document discusses the benefits of protocol aware automatic test equipment (ATE) compared to traditional ATE. Protocol aware ATE would allow testers to interact with devices under test using the same protocol level of abstraction as designers, making testing easier and reducing development cycles. It provides examples showing how protocol aware ATE could speed up silicon bring-up and debug by enabling direct register reads and writes using protocols instead of low-level vectors. This would help address issues of non-deterministic device behavior from processes like cycle slipping.
This document provides an overview of Ethernet and link layer technologies:
- It discusses the basic concepts of carrier sense multiple access with collision detection (CSMA/CD) used in Ethernet, including listening before transmitting, collision detection, and exponential backoff.
- Key aspects of original Ethernet are described, such as using coaxial cable, vampire taps, and supporting up to 256 stations and 1 km of cable length.
- Ethernet standards evolution is summarized, from the original 3 Mbps to modern gigabit Ethernet and wireless variants.
- Other link layer technologies like token ring, FDDI, and frame relay are mentioned as alternatives to Ethernet.
- Building larger networks using bridges and switches to interconnect
The document discusses the features and performance of the K3 HF Transceiver. It highlights its high performance first IF filters down to 200 Hz, high performance noise blankers, 8 band parametric TX and RX equalizers, and ability to decode PSK and RTTY directly from the CW paddle. It also summarizes the K3's dual diversity receiver design, high resolution color spectrum display, best-in-class blocking and intermodulation dynamic range, excellent DSP AGC impulse rejection, low phase noise, and strong front-end design providing great receiver performance. Real-world testing of the K3 on a DXpedition resulted in new records for number of stations worked simultaneously across multiple bands and modes.
The document describes three products from Zodiac Data Systems: the Cortex RTR integrated telemetry receiver, the Radio Signal Recorder/Reproducer (RSR), and the modular DATaRec 4 data acquisition systems. The Cortex RTR provides dual-channel reception in P, L, and S bands with excellent phase noise compatibility. The RSR offers multi-channel IF signal recording and reproduction. The DATaRec 4 is available in standard and ruggedized configurations and can accommodate applications from low to very high channel counts.
PSNC has been researching 4K since 2008, including software like streaming, codecs, and visualization as well as hardware like live 4K. They cooperate with universities, film schools, and artists. Their 4K 3D node includes a professional 3D rig, post-production software, 4K streaming, and a 4K LCD monitor. PSNC is developing new technologies like a 132MP wall, GPU cluster, and 4K/4K3D post-production. They are working on playback, streaming, and a live interface for the RED One camera.
1. The document presents a simulation of a low power analog channel decoder for error correction implemented in 65nm CMOS technology.
2. The decoder uses analog circuitry operating in the sub-threshold region to perform decoding, allowing for ultra-low power operation below 40uW for throughput up to 2.5Mbps.
3. The decoder architecture includes an analog decoding core that implements the sum-product algorithm, digital interfaces for input and output, and a digital controller to manage timing.
1) The document describes the design and simulation of a first-order 1-bit sigma-delta analog-to-digital converter (ADC) using Ngspice simulation software.
2) A key component of sigma-delta ADCs is the op-amp, which is used in the integrator stage. The designed op-amp has a high open loop voltage gain and bandwidth suitable for the integrator.
3) The 1-bit sigma-delta ADC was implemented in a 0.18μm CMOS process with a ±2.5V power supply. The circuit was constructed and simulated using Ngspice to verify the design.
Design and realization of iir digital band stop filter using modified analog ...Subhadeep Chakraborty
This document describes the design and realization of an IIR digital band stop filter using a modified analog to digital mapping technique. It begins with introducing band stop filters and their importance in digital signal processing applications. It then discusses the design of analog band stop filters using passive components like resistors and capacitors. A new algorithm for analog to digital mapping is presented which transforms the transfer function from the s-domain to the z-domain, allowing the analog filter to be realized digitally. The direct form I structure is used to realize the IIR digital band stop filter. Simulation results demonstrating the magnitude response, phase response, impulse response and pole-zero plot are shown for Butterworth band stop filters of orders 6, 8 and 10.
TCP over low-power and lossy networks: tuning the segment size to minimize en...Ahmed Ayadi
This document discusses minimizing energy consumption for TCP over low-power and lossy networks. It presents a mathematical model to predict the energy used for bulk data transfers over multi-hop networks. The model estimates energy based on bit error rate, maximum retransmissions, number of hops, forward error correction amount, and TCP maximum segment size. The model allows studying the tradeoffs of sending short versus long TCP segments to improve energy efficiency over 6LoWPAN and IEEE 802.15.4 networks.
This document summarizes a research paper that designed a digital matched filter (DMF) to compress a binary phase code modulation (BPCM) signal encoded with a 13-chip Barker code. The DMF was implemented on an FPGA using a digital convolution algorithm in the time domain. The DMF design included a direct digital frequency synthesizer to generate the BPCM signal, a digital noise generator to add noise, and digital delay lines and shift registers to perform the convolution. Test results showed the input and output signals on an oscilloscope for different signal-to-noise ratio levels. The DMF achieved a processing gain of 11.14 dB.
November RaspberryPiBASICSPresentationSLIGHTLYOUTOFDATEDale Finley
The document provides information on various topics related to Raspberry Pi including the DSI connector, LED meanings, keyboard configuration, Linux commands, and software installation instructions. It lists graphical displays that can connect to the Raspberry Pi's DSI connector and describes the meanings of the LEDs for power, LAN connection status, and SD card access. It also provides commands for software management and system administration tasks like changing the keyboard, starting the GUI, and installing/updating packages.
Slides of Nathan Piasco ICRA 2019 oral presentation about the paper "Learning Scene Geometry for Visual Localization in Challenging Conditions". Best paper in Robot Vision Finalist
This document provides a summary of the instruction set for the MCS-51 microcontroller. It includes a table listing over 40 instructions such as ADD, SUB, AND, OR, MOV, INC, DEC, along with the description, number of cycles, and whether it affects flag settings. It notes that instructions can specify registers, direct addresses, indirect RAM addresses, or immediate data as operands.
The microcontroller instruction set includes instructions for flow control, logic, arithmetic, and data movement. It supports operations on registers and direct or indirect addressed memory locations. The instruction set is encoded in an 8-bit format and includes operations that affect the processor status flags.
This document discusses an FPGA implementation of a four phase code design using a modified genetic algorithm. It summarizes the key aspects of the implementation as follows:
1) The proposed architecture efficiently implements a modified genetic algorithm on an FPGA to identify good pulse compression sequences based on discrimination factor.
2) Pulse compression techniques in radar allow long pulses to achieve high energy while maintaining the range resolution of short pulses. The receiver compresses the long signal into a narrow signal.
3) The criteria for good pulse compression sequences include high merit factor and discrimination factor. Merit factor measures quality by comparing main lobe energy to side lobe energy. Discrimination factor compares the main peak to maximum side lobes.
This document summarizes the porting and optimization of the ITU-T G.729.1 speech coding algorithm to the SC3850 DSP core. The authors modified the reference G.729.1 code to fix-point C and optimized it for the SC3850 architecture through techniques like replacing functions with intrinsics, inlining small functions, and restructuring loops and memory accesses. They achieved over 90% improvement in cycles per second (MCPS) through these optimizations while maintaining bit-exact accuracy against the ITU test vectors. The G.729.1 codec is an 8-32 kbps scalable wideband speech codec standardized by ITU, useful for applications like VoIP, audio conferencing, and
A Review of the Split Bregman Method for L1 Regularized ProblemsPardis N
The document discusses using the split Bregman method for L1 regularization problems such as total variation denoising. It introduces total variation denoising models, describes how the conventional variational model is easy to solve but gives disappointing results, and explains that the split Bregman method can be applied to improve upon these results. The document then outlines the split Bregman formulation and applying it to total variation denoising before discussing results showing its fast convergence and acceptable intermediate outputs.
The document discusses denoising signals using wavelet transforms. It begins with an overview of denoising and its goal of reconstructing a signal from a noisy one. It then compares denoising using wavelets to other methods like Fourier filtering and spline methods. The key advantages of wavelets are their ability to localize properties and concentrate a signal's energy. The document outlines the basic denoising process using wavelet transforms which involves decomposition, thresholding, and reconstruction. It also discusses different thresholding methods and commonly used thresholds like VisuShrink.
This document provides information about the instruction set of a microcontroller. It includes two summaries:
1. The first summary lists instructions and describes how they affect flag settings in the microcontroller's status register, such as the carry and overflow flags.
2. The second summary is a table that provides details about the microcontroller's instruction set. It lists instructions, describes their operation, and specifies the number of bytes and oscillator periods each instruction requires. The table is divided into sections for arithmetic, logical, branching, and other operations.
The document as a whole provides low-level technical specifications about a microcontroller's instruction set, including how instructions are encoded, how they modify status register flags, and their timing
The document describes several of the major projects and personal projects completed by the individual. Some of the major projects include image classification using neural networks, 3D position detection using stereo vision, sub-ammunition counting using optical flow, skin image classification, fingerprint image classification, pedestrian proximity detection, and image registration and defect detection. Personal projects include using deep learning for artistic style transfer and reinforcement learning for a game as well as an IoT plant watering system. The projects cover various domains and techniques including computer vision, image processing, neural networks, deep learning, and IoT.
CC-4005, Performance analysis of 3D Finite Difference computational stencils ...AMD Developer Central
The document discusses performance analysis of 3D finite difference computational stencils on Seamicro fabric compute systems. It provides an overview of the hardware including chassis, compute cards, storage cards, and 3D torus fabric topology. It then describes the software stack and various microbenchmarks performed, including CPU, memory, network and storage benchmarks. It also describes modeling of 3D Laplace's equation using an 8th order finite difference scheme and its discretization over a 25 point stencil for computation on the system.
The document describes the PNX85500 single chip LCD TV system with an integrated 120Hz HD frame rate converter. It contains the world's first digital TV system-on-chip implemented on a 45nm process. The chip integrates many functions including demodulation, networking interfaces, 120Hz frame rate conversion, H.264 decoding, and advanced picture quality algorithms. It has an extremely high level of functional integration while requiring the smallest memory footprint and driving low costs.
The document discusses the benefits of protocol aware automatic test equipment (ATE) compared to traditional ATE. Protocol aware ATE would allow testers to interact with devices under test using the same protocol level of abstraction as designers, making testing easier and reducing development cycles. It provides examples showing how protocol aware ATE could speed up silicon bring-up and debug by enabling direct register reads and writes using protocols instead of low-level vectors. This would help address issues of non-deterministic device behavior from processes like cycle slipping.
This document provides an overview of Ethernet and link layer technologies:
- It discusses the basic concepts of carrier sense multiple access with collision detection (CSMA/CD) used in Ethernet, including listening before transmitting, collision detection, and exponential backoff.
- Key aspects of original Ethernet are described, such as using coaxial cable, vampire taps, and supporting up to 256 stations and 1 km of cable length.
- Ethernet standards evolution is summarized, from the original 3 Mbps to modern gigabit Ethernet and wireless variants.
- Other link layer technologies like token ring, FDDI, and frame relay are mentioned as alternatives to Ethernet.
- Building larger networks using bridges and switches to interconnect
The document discusses the features and performance of the K3 HF Transceiver. It highlights its high performance first IF filters down to 200 Hz, high performance noise blankers, 8 band parametric TX and RX equalizers, and ability to decode PSK and RTTY directly from the CW paddle. It also summarizes the K3's dual diversity receiver design, high resolution color spectrum display, best-in-class blocking and intermodulation dynamic range, excellent DSP AGC impulse rejection, low phase noise, and strong front-end design providing great receiver performance. Real-world testing of the K3 on a DXpedition resulted in new records for number of stations worked simultaneously across multiple bands and modes.
The document describes three products from Zodiac Data Systems: the Cortex RTR integrated telemetry receiver, the Radio Signal Recorder/Reproducer (RSR), and the modular DATaRec 4 data acquisition systems. The Cortex RTR provides dual-channel reception in P, L, and S bands with excellent phase noise compatibility. The RSR offers multi-channel IF signal recording and reproduction. The DATaRec 4 is available in standard and ruggedized configurations and can accommodate applications from low to very high channel counts.
PSNC has been researching 4K since 2008, including software like streaming, codecs, and visualization as well as hardware like live 4K. They cooperate with universities, film schools, and artists. Their 4K 3D node includes a professional 3D rig, post-production software, 4K streaming, and a 4K LCD monitor. PSNC is developing new technologies like a 132MP wall, GPU cluster, and 4K/4K3D post-production. They are working on playback, streaming, and a live interface for the RED One camera.
1. The document presents a simulation of a low power analog channel decoder for error correction implemented in 65nm CMOS technology.
2. The decoder uses analog circuitry operating in the sub-threshold region to perform decoding, allowing for ultra-low power operation below 40uW for throughput up to 2.5Mbps.
3. The decoder architecture includes an analog decoding core that implements the sum-product algorithm, digital interfaces for input and output, and a digital controller to manage timing.
1) The document describes the design and simulation of a first-order 1-bit sigma-delta analog-to-digital converter (ADC) using Ngspice simulation software.
2) A key component of sigma-delta ADCs is the op-amp, which is used in the integrator stage. The designed op-amp has a high open loop voltage gain and bandwidth suitable for the integrator.
3) The 1-bit sigma-delta ADC was implemented in a 0.18μm CMOS process with a ±2.5V power supply. The circuit was constructed and simulated using Ngspice to verify the design.
Design and realization of iir digital band stop filter using modified analog ...Subhadeep Chakraborty
This document describes the design and realization of an IIR digital band stop filter using a modified analog to digital mapping technique. It begins with introducing band stop filters and their importance in digital signal processing applications. It then discusses the design of analog band stop filters using passive components like resistors and capacitors. A new algorithm for analog to digital mapping is presented which transforms the transfer function from the s-domain to the z-domain, allowing the analog filter to be realized digitally. The direct form I structure is used to realize the IIR digital band stop filter. Simulation results demonstrating the magnitude response, phase response, impulse response and pole-zero plot are shown for Butterworth band stop filters of orders 6, 8 and 10.
TCP over low-power and lossy networks: tuning the segment size to minimize en...Ahmed Ayadi
This document discusses minimizing energy consumption for TCP over low-power and lossy networks. It presents a mathematical model to predict the energy used for bulk data transfers over multi-hop networks. The model estimates energy based on bit error rate, maximum retransmissions, number of hops, forward error correction amount, and TCP maximum segment size. The model allows studying the tradeoffs of sending short versus long TCP segments to improve energy efficiency over 6LoWPAN and IEEE 802.15.4 networks.
This document summarizes a research paper that designed a digital matched filter (DMF) to compress a binary phase code modulation (BPCM) signal encoded with a 13-chip Barker code. The DMF was implemented on an FPGA using a digital convolution algorithm in the time domain. The DMF design included a direct digital frequency synthesizer to generate the BPCM signal, a digital noise generator to add noise, and digital delay lines and shift registers to perform the convolution. Test results showed the input and output signals on an oscilloscope for different signal-to-noise ratio levels. The DMF achieved a processing gain of 11.14 dB.
November RaspberryPiBASICSPresentationSLIGHTLYOUTOFDATEDale Finley
The document provides information on various topics related to Raspberry Pi including the DSI connector, LED meanings, keyboard configuration, Linux commands, and software installation instructions. It lists graphical displays that can connect to the Raspberry Pi's DSI connector and describes the meanings of the LEDs for power, LAN connection status, and SD card access. It also provides commands for software management and system administration tasks like changing the keyboard, starting the GUI, and installing/updating packages.
Slides of Nathan Piasco ICRA 2019 oral presentation about the paper "Learning Scene Geometry for Visual Localization in Challenging Conditions". Best paper in Robot Vision Finalist
This document provides a summary of the instruction set for the MCS-51 microcontroller. It includes a table listing over 40 instructions such as ADD, SUB, AND, OR, MOV, INC, DEC, along with the description, number of cycles, and whether it affects flag settings. It notes that instructions can specify registers, direct addresses, indirect RAM addresses, or immediate data as operands.
The microcontroller instruction set includes instructions for flow control, logic, arithmetic, and data movement. It supports operations on registers and direct or indirect addressed memory locations. The instruction set is encoded in an 8-bit format and includes operations that affect the processor status flags.
This document discusses an FPGA implementation of a four phase code design using a modified genetic algorithm. It summarizes the key aspects of the implementation as follows:
1) The proposed architecture efficiently implements a modified genetic algorithm on an FPGA to identify good pulse compression sequences based on discrimination factor.
2) Pulse compression techniques in radar allow long pulses to achieve high energy while maintaining the range resolution of short pulses. The receiver compresses the long signal into a narrow signal.
3) The criteria for good pulse compression sequences include high merit factor and discrimination factor. Merit factor measures quality by comparing main lobe energy to side lobe energy. Discrimination factor compares the main peak to maximum side lobes.
This document summarizes the porting and optimization of the ITU-T G.729.1 speech coding algorithm to the SC3850 DSP core. The authors modified the reference G.729.1 code to fix-point C and optimized it for the SC3850 architecture through techniques like replacing functions with intrinsics, inlining small functions, and restructuring loops and memory accesses. They achieved over 90% improvement in cycles per second (MCPS) through these optimizations while maintaining bit-exact accuracy against the ITU test vectors. The G.729.1 codec is an 8-32 kbps scalable wideband speech codec standardized by ITU, useful for applications like VoIP, audio conferencing, and
A Review of the Split Bregman Method for L1 Regularized ProblemsPardis N
The document discusses using the split Bregman method for L1 regularization problems such as total variation denoising. It introduces total variation denoising models, describes how the conventional variational model is easy to solve but gives disappointing results, and explains that the split Bregman method can be applied to improve upon these results. The document then outlines the split Bregman formulation and applying it to total variation denoising before discussing results showing its fast convergence and acceptable intermediate outputs.
The document discusses denoising signals using wavelet transforms. It begins with an overview of denoising and its goal of reconstructing a signal from a noisy one. It then compares denoising using wavelets to other methods like Fourier filtering and spline methods. The key advantages of wavelets are their ability to localize properties and concentrate a signal's energy. The document outlines the basic denoising process using wavelet transforms which involves decomposition, thresholding, and reconstruction. It also discusses different thresholding methods and commonly used thresholds like VisuShrink.
This document discusses image denoising techniques. It begins by defining image denoising as removing unwanted noise from an image to restore the original signal. It then discusses several types of noise like additive Gaussian noise, impulse noise, uniform noise, and periodic noise. For denoising, it covers spatial domain techniques like linear filters (mean, weighted mean), non-linear filters (median filter), and frequency domain techniques that apply a low-pass filter to the Fourier transform of the noisy image. The document provides examples of denoising noisy images using mean and median filters to remove different types of noise.
Images may contain different types of noises. Removing noise from image is often the first step in image processing, and remains a challenging problem in spite of sophistication of recent research. This ppt presents an efficient image denoising scheme and their reconstruction based on Discrete Wavelet Transform (DWT) and Inverse Discrete Wavelet Transform (IDWT).
This document discusses image restoration techniques. It describes how image degradation occurs through various processes and the goal of image restoration is to reconstruct the original image from its degraded version. There are two main categories of restoration techniques - deterministic and stochastic. Deterministic techniques use an inverse transformation when the degradation function is known, while stochastic techniques estimate the best restoration according to some criterion. Common degradation functions include relative motion, lens focus issues, and atmospheric turbulence. Inverse filtration and Wiener filtration are two restoration methods described, with Wiener filtration taking noise into account to minimize mean square error.
Speckle noise reduction from medical ultrasound images using wavelet threshIAEME Publication
This document proposes a method for reducing speckle noise from medical ultrasound images using wavelet thresholding and anisotropic diffusion. It first takes the logarithm of noisy images to convert speckle noise from multiplicative to additive. It then applies median filtering, followed by wavelet denoising using soft thresholding of detail coefficients. Finally, it uses SRAD filtering to enhance contrast while retaining edges. The proposed method is tested on three images and is found to improve SNR and PSNR compared to other filters based on statistical measurements.
1. The document proposes a flexible hardware architecture for image scaling using a programmable 2D separable convolution engine.
2. It describes how any scaling operation can be decomposed into three steps: anti-aliasing filtering, continuous image reconstruction via convolution, and resampling to the output grid.
3. The proposed architecture uses a memory to store a programmable interpolation kernel and enables different scaling algorithms like nearest neighbor and bicubic interpolation by programming the kernel.
1) The document summarizes projects completed by Adam McConnell during his first rotation at Analog Devices, including developing FPGA code to write HDCP keys to memory and automating jitter transfer function measurements using lab equipment and GPIB commands.
2) In his HDCP key project, Adam learned Verilog and programming FPGAs while figuring out how to correctly write three copies of an HDCP key to memory.
3) For the jitter transfer function project, Adam automated data collection and plotting using GPIB to control an arbitrary waveform generator and oscilloscope, gaining experience interfacing with lab equipment and characterizing PLL performance.
The Boolean expression at TP1 with respect to the corresponding inputs is:
TP1 = A + B
Question 2.
Question :
(TCO 3) Determine the Boolean expression at TP2 with respect to the corresponding inputs.
Student Answer:
TP2 = C
Instructor Explanation:
Correct. TP2 is simply the input C, so the Boolean expression is C.
Question 3.
Question :
(TCO 3) Determine the Boolean expression at TP3 with respect to the corresponding inputs.
Student Answer:
TP3 = A·C + B
Instructor Explanation:
Correct. TP3 is the output of an AND gate (A and C
Types Of Window Being Used For The Selected GranuleLeslie Lee
The document discusses different types of mode selective devices that can be used for mode multiplexing over few-mode fiber. Free-space based devices are bulky while fiber based devices are more compact and easier to integrate. Early demonstrations transmitted data over 107 Gb/s using the LP01 and LP11 fiber modes and 58.8 Gb/s using dual modes with electronic MIMO processing for mode separation. Mode selective devices can be categorized as either free-space based or fiber based, with fiber based being preferable due to their compact size and integration capabilities.
Design of Adjustable Reconfigurable Wireless Single Core CORDIC based Rake Re...IOSR Journals
In wireless communication system transmitted signals are subjected to multiple reflections,
diffractions and attenuation caused by obstacles such as buildings and hills, etc. At the receiver end, multiple
copies of the transmitted signal are received that arrive at clearly distinguishable time instants and are faded by
signal cancellation. Rake receiver is a technique to combine these so called multi-paths [2] by utilizing multiple
correlation receivers allocated to those delay positions on which the significant energy arrives which achieves a
significant improvement in the SNR of the output signal. This paper shows how the rake, including dispreading
and descrambling could be replaced by a receiver that can be implemented on a CORDIC based hardware
architecture. The performance in conjunction with the computational requirements of the receiver is widely
adjustable which is significantly better than that of the conventional rake receiver
Design of Adjustable Reconfigurable Wireless Single Core CORDIC based Rake Re...IOSR Journals
Abstract : In wireless communication system transmitted signals are subjected to multiple reflections, diffractions and attenuation caused by obstacles such as buildings and hills, etc. At the receiver end, multiple copies of the transmitted signal are received that arrive at clearly distinguishable time instants and are faded by signal cancellation. Rake receiver is a technique to combine these so called multi-paths [2] by utilizing multiple correlation receivers allocated to those delay positions on which the significant energy arrives which achieves a significant improvement in the SNR of the output signal. This paper shows how the rake, including dispreading and descrambling could be replaced by a receiver that can be implemented on a CORDIC based hardware architecture. The performance in conjunction with the computational requirements of the receiver is widely adjustable which is significantly better than that of the conventional rake receiver. Keywords - Rake receiver, Multi-paths, CORDIC
The document discusses the Chameleon Chip, a reconfigurable processor that can rewire itself dynamically to adapt to different software tasks. It contains reconfigurable processing fabric divided into slices that can be reconfigured independently. Algorithms are loaded sequentially onto the fabric for high performance. The chip architecture includes an ARC processor, memory controller, PCI controller, and programmable I/O. Its applications include wireless base stations, wireless local loops, and software-defined radio.
Design and Implementation of Low Power High Speed Symmetric Decoder Structure...Dr. Amarjeet Singh
The key objective of this project is to design a
decoder which can be used for hardware purposes.
Hardware, here accompanies with software which is more
we can discuss as a Software Defined Radio application. The
decoder implemented here offers to new radio equipment
(SDR), the flexibility of a programmable system. Nowadays,
the behavior of a communication system can be modified by
simply changing its software. Large tree decoder is made by
reusing smaller similar sub-modules. Thus the structure is
symmetric. The symmetric and regular structure of tree
decoder makes the system a less complexity one. The
structure obeys regularity and modularity concepts of VLSI
circuit, thus is easy to fabricate using cell library elements.
Design a Tree Decoder proposed architecture for SDR
application on FPGA. The Structures made here are
hardware synthesizable on FPGA board and are done in a
respective manner. The design to be implementing by using
Verilog-HDL language. The Simulation and Synthesis by
using Xilinx Vivado design suite.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Design And Simulation of Modulation Schemes used for FPGA Based Software Defi...Sucharita Saha
Design of a BPSK and QPSK digital Modulation scheme and its implementation on FPGAs for universal mobile telecommunications system and SDR applications. The simulation of the system is made in MATLAB Simulink environment and System Generator, a tool used for FPGA design. Hardware Co-Simulation is designed using VHDL a hardware description language targeting a Xilinx FPGA and is verified using MATLAB Simulink. It is then converted to VHDL level using Simulink HDL coder. The design is synthesized and fitted with Xilinx 14.2 ISE Edition software, and downloaded to Spartan 3E (XC3S500E) board.
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
This document discusses the design and analysis of multirate filters for WiMAX applications. It proposes a programmable multirate filter architecture that can be implemented using software-defined radio technology and multirate signal processing principles. The filters are designed using MATLAB's filter design and analysis tool to meet WiMAX specifications. A digital upconverter is presented that uses three cascaded FIR filters with interpolation factors of 1, 2, and 4 to achieve an overall interpolation factor of 8 as required by WiMAX. The filters are analyzed and simulated in MATLAB to verify they satisfy WiMAX's spectral mask requirements.
Warp processing is a technique that dynamically optimizes software to improve performance and energy efficiency. It works by profiling an application to identify critical regions, then partitioning those regions to hardware using an FPGA. The binary is updated to execute the partitioned regions on the FPGA circuit while the rest continues in software. This allows applications to achieve speedups of 2-100x or more while using 20x less memory and reducing power consumption by 38-94%.
IRJET- A Digital Down Converter on Zynq SoCIRJET Journal
This document describes the design and implementation of a digital down converter (DDC) on a Zynq System on Chip (SoC). Key points:
- The DDC is designed for airborne radar receivers to downconvert high sample rate digitized signals to a lower frequency for easier processing.
- The DDC implementation includes a direct digital synthesizer to generate input signals, complex multiplication for mixing, and a two-stage decimation and filtering process.
- The design is implemented on a Zynq SoC which provides the flexibility of a processor and programmability of an FPGA.
- Results show the DDC design achieves significant improvements in resource utilization compared to a full
International Journal of Engineering Inventions (IJEI) provides a multidisciplinary passage for researchers, managers, professionals, practitioners and students around the globe to publish high quality, peer-reviewed articles on all theoretical and empirical aspects of Engineering and Science.
The peer-reviewed International Journal of Engineering Inventions (IJEI) is started with a mission to encourage contribution to research in Science and Technology. Encourage and motivate researchers in challenging areas of Sciences and Technology.
Small form factor cognitive radio implemented via fpga partial reconfiguratio...Roberto Uribeetxeberria
This document describes a cognitive radio system implemented on an FPGA that can replace a wired video transmission system. Key points:
- The system senses the availability of the wireless transmission channel and can change its intermediate frequency based on channel conditions using FPGA partial reconfiguration.
- It implements an OQPSK modulation scheme and was designed using Xilinx System Generator rapid prototyping tools.
- The transmitter acquires video streaming data, modulates it using OQPSK, and upconverts it to either a 5MHz or 10MHz intermediate frequency depending on channel availability.
- The receiver searches for the signal power, reconfigures to the target frequency, performs synchronization and demodulation, and outputs the
GNU Radio based Real Time Data Transmission and ReceptionIRJET Journal
This document describes a study on real-time data transmission and reception using digital modulation techniques in GNU Radio. The researchers implemented Gaussian Minimum Shift Keying (GMSK), Gaussian Frequency Shift Keying (GFSK), and Differential Phase Shift Keying (DPSK) using GNU Waveguru and GNU Octave tools. GNU Radio was used as a software-defined radio framework to generate and analyze baseband signals and implement the digital modulations. GNU Octave accepted real-time text input from the user and saved it to a file. The modulation techniques were then used to transmit the text data over a wireless medium.
A Simulation of Wideband CDMA System on Digital Up/Down ConvertersEditor IJMTER
In this paper, I present FPGA implementation of a digital down converter (DDC) and
digital up converter (DUC) for a single carrier WCDMA system. The DDC and DUC is complex in
nature. The implementation of DDC is simple because it does not require mixers or filters. Xilinx
System Generator and Xilinx ISE are used to develop the hardware circuit for the FPGA. Both the
circuits are verified on the Spartan - 3 FPGA
We develop custom Image Recognition systems for Aerospace and defence applications. Using algorithms like Deep Convolutional Neural Networks and Regional Convolutional Neural Networks.
Our algorithms for Target Recognition and Tracking are designed from the beginning to be run on embedded systems. We target both GPU and FPGA devices.
To Train and Validate our algorithms we developed a process to generate photorealistic 3D environments.
Those 3D Environments are used to produce realistic video streams of the targets in different environmental conditions (lighting, adverse meteorological conditions, camouflage, point-of-view).
The same technology can be used to Train and Test Automotive Vision Systems.
This presentation discusses developing custom image recognition systems for automotive and defense applications using deep learning techniques. It describes generating 3D environments to realistically simulate environmental and system conditions for training and validating vision algorithms. These virtual environments can be used to test automotive vision systems. The presentation also covers algorithm development targeting GPU and FPGA hardware, including a proposed integrated development environment for designing convolutional neural networks on FPGAs.
Similar to A Coarse-Grained Reconfigurable Wavelet Denoiser Exploiting the Multi-Dataflow Composer Tool (20)
The Reconfigurable Video Coding (RVC) framework is a recent ISO standard aiming at providing a unified specification of MPEG video technology in the form of a library of components. The word “reconfigurable” evokes run-time instantiation of different decoders starting from an on-the-fly analysis of the input bitstream.
In this paper we move a first step towards the definition of systematic procedures that, based on the MPEG RVC specification formalism, are able to produce multi-decoder platforms, capable of fast switching between different configurations. Looking at the similarities between the decoding algorithms to implement, the papers describes an automatic tool for their composition into a single configurable multi-decoder built of all the required modules, and able to reuse the shared components so as to reduce the overall footprint (either from a hardware or software perspective).
The proposed approach, implemented in C++ leveraging on Flex and Bison code generation tools, typically exploited in the compilers front-end, demonstrates to be successful in the composition of two different decoders MPEG-4 Part 2 (SP): serial and parallel.
The Multi-Dataflow Composer Tool: a Runtime Reconfigurable HDL Platform ComposerMDC_UNICA
The document summarizes a presentation given at the Conference on Design and Architectures for Signal and Image Processing (DASIP) in 2011. It discusses the Multi-Dataflow Composer (MDC) tool, which automatically composes different functional units specified using dataflow models of computation onto a coarse-grained reconfigurable hardware template. The MDC aims to close the gap between complex multi-purpose hardware platforms and their software programming by leveraging dataflow models and runtime reconfiguration. It allows developing runtime reconfigurable multi-purpose systems from high-level dataflow descriptions.
DSE and Profiling of Multi-Context Coarse-Grained Reconfigurable SystemsMDC_UNICA
The implementation of multi-context systems over coarse-grained reconfigurable platforms could bring several benefits in terms of efficient resource usage and power management. Nevertheless on-the-fly reconfiguration and mapping are not so straightforward and the optimal configuration of the substrate could be extremely time consuming. In this paper we present an early stage design space exploration methodology intended for dataflow-based design flows where multiple input specifications have to be taken into account. The proposed approach, coupled to the Multi-Dataflow Composer tool, has been exploited to assemble the central reconfigurable computing core of an accelerator for video/image processing.
Automatic Generation of Dataflow-Based Reconfigurable Co-processing UnitsMDC_UNICA
Hardware accelerators are widely adopted to speed up computationally onerous applications. However their design is not trivial, especially if multiple applications/kernels need to be served. To this aim the Multi-Dataflow Composer (MDC) tool can be adopted to generate the internal computing core of flexible and reconfigurable hardware accelerators. Nevertheless, MDC is not able, as it is, to deploy ready to use accelerators. To address this lack, we conceived a fully automated design flow for coarse-grained reconfigurable and memory-mapped hardware accelerators, which required: a) the definition of a generic co-processing template, the Template Interface Layer; b) the extension of the MDC tool to characterize such a template and deploy the accelerators. This methodology represents, within MPEG Reconfigurable Video Coding studies, the first framework for the automatic generation of reconfigurable hardware accelerators and, as it will be discussed, it may be beneficial also in other contexts of execution with fairly limited adjustments. Results validated the proposed approach in a real use case scenario, comparing the automatically generated co-processor with a previous custom one.
Automated Design Flow for Coarse-Grained Reconfigurable Platforms: an RVC-CAL...MDC_UNICA
pecialized hardware infrastructures for efficient multi-application runtime reconfigurable platforms require to address several issues. The higher is the system complexity, the more error prone and time consuming is the entire design flow. Moreover, system configuration along with resource management and mapping are challenging, especially when runtime adaptivity is required. In order to address these issues, the Reconfigurable Video Coding Group within the MPEG group has developed the MPEG RMC standards ISO/IEC 23001-4 and 23002-4, based on the dataflow Model of Computation. In this paper, we propose an integrated design flow, leveraging on Xronos, TURNUS, and the Multi-Dataflow Composer tool, capable of automatic synthesis and mapping of reconfigurable systems. In particular, an RVC MPEG-4 SP decoder and the RVC Intra MPEG-4 SP decoder have been implemented on the same coarse-grained reconfigurable platform, targeting a Xilinx Virtex 5 330 FPGA board. Results confirmed the potentiality of the approach, capable of completely preserving the single decoders functionality and of providing, in addition, considerable power/area benefits with respect to the parallel implementation of the considered decoders on the same platform.
Power-Awarness in Coarse-Grained Reconfigurable Designs: a Dataflow Based Str...MDC_UNICA
Applications and hardware complexity management in modern systems tend to collide with efficient resource and power balance. Therefore, dedicated and power-aware design frameworks are necessary to implement efficient multi-functional runtime reconfigurable signal processing platforms. In this work, we adopt dataflow specifications as a starting point to challenge power minimization.
Automated Power Gating Methodology for Dataflow-Based Reconfigurable SystemsMDC_UNICA
Modern embedded systems designers are required to implement efficient multi-functional applications, over portable platforms under strong energy and resources constraints. Automatic tools may help them in challenging such a complex scenario: to develop complex reconfigurable systems while reducing time-to-market. At the same time, automated methodologies can aid them to manage power consumption. Dataflow models of computation, thanks to their modularity, turned out to be extremely useful to these purposes. In this paper, we will demonstrate as they can be used to automatically achieve power management since the earliest stage of the design flow. In particular, we are focussing on the automation of power gating. The methodology has been evaluated on an image processing use case targeting an ASIC 90 nm CMOS technology.
Reconfigurable Coprocessors Synthesis in the MPEG-RVC DomainMDC_UNICA
Flexibility and high efficiency are common design drivers in the embedded systems domain. Coarse-grained reconfigurable coprocessors can tackle these issues, but they suffer of complex design, debugging and applications mapping problems. In this paper, we propose an automated design flow that aids developers in design and managing coarse-grained reconfigurable coprocessors. It provides both the hardware IP and the software drivers, featuring two different levels of coupling with the host processor. The presented solution has been tested on a JPEG codec, targeting a commercial Xilinx Virtex-5 FPGA.
Power Modelling for Saving Strategies in Coarse Grained Reconfigurable SystemsMDC_UNICA
In the context of coarse-grained reconfigurable systems we present a power estimation model to guide the designer in deciding which part of the design may benefit from the application of a power gating technique. The model is assessed by adopting a reconfigurable core for image processing targeting an ASIC 90 nm technology.
Adaptable AES Implementation with Power-Gating SupportMDC_UNICA
The document proposes an adaptable AES implementation with power-gating support. It includes a multi-profile AES design that can reconfigure based on quality of service and battery life requirements. Different configurations offer varying levels of security, latency, and power consumption. The design also uses power gating techniques like sleep transistors and isolation cells to reduce power consumption. Evaluation shows the approach provides a flexible AES solution with efficient power management for Internet of Things devices.
Power and Clock Gating Modelling in Coarse Grained Reconfigurable SystemsMDC_UNICA
Power reduction is one of the biggest challenges in modern systems and tends to become a severe issue dealing with complex scenarios. To provide high-performance and exibility, designers often opt for coarse-grained recongurable (CGR) systems. Nevertheless, these systems require specic attention to the power problem, since large set of resources may be underutilized while computing a certain task.
This paper focuses on this issue. Targeting CGR devices, we propose a way to model in advance power and clock gating costs on the basis of the functional, technological and architectural parameters of the baseline CGR system. The
proposed flow guides designers towards optimal implementations, saving designer eort and time.
The RPCT project aims to define a software framework providing an efficient support for the automatic generation and management of dedicated reconfigurable systems. The starting points of this framework are the high-level specifications provided as dataflow models.
The main outcome of this research is the Multi Dataflow Composer (MDC) tool, which automatically generates a reconfigurable hardware platform. The MDC tool is extremely suitable in video and image processing contexts, portable platforms and bio-medical signal processing applications (i.e. implantable or wearable devices).
The RPCT research project has been developed within the Microelectronics and Bioengineering Lab (EOLAB) of the Department of Electrical and Electronic Engineering at the University of Cagliari.
Building a Raspberry Pi Robot with Dot NET 8, Blazor and SignalRPeter Gallagher
In this session delivered at NDC Oslo 2024, I talk about how you can control a 3D printed Robot Arm with a Raspberry Pi, .NET 8, Blazor and SignalR.
I also show how you can use a Unity app on an Meta Quest 3 to control the arm VR too.
You can find the GitHub repo and workshop instructions here;
https://bit.ly/dotnetrobotgithub
A Coarse-Grained Reconfigurable Wavelet Denoiser Exploiting the Multi-Dataflow Composer Tool
1. A Coarse-Grained Reconfigurable
Wavelet Denoiser Exploiting the
Multi-Dataflow Composer ToolMulti-Dataflow Composer Tool
N. Carta, C. Sau, F. Palumbo, D. Pani and L. Raffo
DIEE - Dept. of Electrical and Electronic Engineering
University of Cagliari, Italy
October 8October 8--10, 201310, 2013
Cagliari, ItalyCagliari, Italy
Conference on Design and Architectures for Signal and Image ProcessingConference on Design and Architectures for Signal and Image Processing
2. Introduction
Some application-specific fields, such as the biomedical one, push
towards area- and power- effective VLSI implementation of digital
signal processing algorithms mainly for the development of
wearable or implantable devices.
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
To this aim, the Multi-Dataflow Composer (MDC) tool, originally
conceived for image/video processing, can be exploited in other
domains leading to optimal hardware implementations.
The MDC-based hardware platforms are flexible, fitting to
parametric solutions adjustable at runtime, according to the
characteristics of the signal of interest.
3. Research goals
Exploit strategies of resources reusability through
reconfiguration to provide area and power minimization.
Demonstrate the potential orthogonality of the MDC approach
with respect to the Reconfigurable Video Domain (RVC)
domain.
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
Test the proposed approach on a runtime reconfigurable
Wavelet Denoiser, targeted for biomedical applications,
prototyped onto an FPGA Development Board.
Evaluate benefits in terms of area and power in comparison to
traditional solutions and verify the system functionality using
simulated datasets of neural signals.
4. AA BB CCA B C
AA BB CCA B C
AA BB CCA B C
Multi-Kernel Datapath Generation
A B C
A
E C
D
Single
Application
Dataflows
SADs
Network Language (NL)
Descriptions
SAD 2SAD 1
HDL Components library, the
Functional Units (FUs) implementing
the actors:
• IEEE 754-1985 compliant
arithmetic modules;
• control flow modules.
All the FUs obey to the same FIFO-
based communication protocol.
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
Platform
Composer
Multi-Dataflow
CAL Composer
(MDCC)
A
D
CSbox
Sbox
B
E
Directed
Flow Graph
(DFG)
MDC FRONT-END
MDC
BACK-END
a Coarse-Grained
Reconfigurable
Hardware Platform
with the minimum
number of FUs
Global Kernel
Multi-Dataflow
Composer (MDC) tool
5. Use Case: Wavelet Denoising (WD)
WD aims at removing the background noise corrupting the signal
of interest, especially when it can be modeled as a Gaussian-
distributed random noise.
Translation Invariant solution: à-trous algorithm implementation.
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
The number of Decomposition/Recomposition levels depends on
the chosen input frequency and on the spectral characteristics of
both noise and signal of interest.
6. Use Case: Wavelet Denoising (WD)
WD aims at removing the background noise corrupting the signal
of interest, especially when it can be modeled as a Gaussian-
distributed random noise.
Translation Invariant solution: à-trous algorithm implementation.
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
A pair of quadrature
mirror Finite Impulse
Response (FIR) filters at
each level
7. Use Case: Wavelet Denoising (WD)
WD aims at removing the background noise corrupting the signal
of interest, especially when it can be modeled as a Gaussian-
distributed random noise.
Translation Invariant solution: à-trous algorithm implementation.
Approximation Signal
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
Approximation Signal
Detail Signal
8. Use Case: Wavelet Denoising (WD)
WD aims at removing the background noise corrupting the signal
of interest, especially when it can be modeled as a Gaussian-
distributed random noise.
Translation Invariant solution: à-trous algorithm implementation.
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
Registers to balance the
delay of each path of
the trellis
9. Use Case: Wavelet Denoising (WD)
WD aims at removing the background noise corrupting the signal
of interest, especially when it can be modeled as a Gaussian-
distributed random noise.
Translation Invariant solution: à-trous algorithm implementation.
Band-Pass solution
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
10. WD: thresholding phase
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
To minimize the hardware resources, an hard thresholding has been
implemented: only samples with values greater than θ are passed whereas
the others are cleared to zero.
θ can coincide with a scaled version of the standard deviation σn of the
additive noise source, computed iteratively analysing consecutive
windows of b_len detail samples.
4
1
2,2
1_*4
1
9.3
n
n jj s
lenb
SCth
lenb
k
j
n
kds j
_
0
2
2,
11. Computing Kernels
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
Considering an input sampling frequency of 12kHz, 4 levels of
decomposition/recomposition and removing the approximation
signal at the 4th level, the denoised signal has a bandwidth
between 375Hz and 6kHz.
Different computational kernels can be identified for the MDC
tool. They should present commonalities at the actor level,
enabling reuse of common hardware FUs.
12. Computing Kernels
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
dec kernel
13. Computing Kernels
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
rec kernel
14. Computing Kernels
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
thr kernel
15. To evaluate the MDC-based wavelet denoiser functionality, a
processor-coprocessor system has been designed and
implemented on a Xilinx FPGA Spartan-3E 1600 Development
Board.
Test Environment
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
16. For every new input buffer frame of b_len samples:
Use-Case Execution Trace
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
17. For every new input buffer frame of b_len samples:
Use-Case Execution Trace
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
18. For every new input buffer frame of b_len samples:
Use-Case Execution Trace
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
19. Coprocessor Parameter Set
The implemented coprocessor is highly parametric. The designer can
choose at runtime:
the number N of levels used in decomposition/recomposition;
the number and the values of the coefficients used for the FIR
filters;
the possibility of removing the last approximation signal to
obtain a band-pass filtering;
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
obtain a band-pass filtering;
the threshold values for the details at each decomposition level;
the b_len number of samples for each input frame (allowable
maximum: 512).
The input signal can be applied in input to the testing environment
through an UDP communication via Ethernet between the FPGA and
the host PC.
20. Experimental results: Test Data
The performance of the reconfigurable Wavelet Denoiser has been
assessed using public available datasets of simulated neural signals,
constructed starting from a database of real physiological spikes.
The background noise overlapped to the significant signal is obtained
considering spikes of different neuronal cells at random times and
amplitudes.
Sampling Frequency: 12kHz Useful Bandwidth: 300Hz – 3kHz
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
0 0.05 0.1 0.15 0.2 0.25 0.3
-1.5
-1
-0.5
0
0.5
1
1.5
Time [sec]
Amplitude[V]
0 0.05 0.1 0.15 0.2 0.25 0.3
-1.5
-1
-0.5
0
0.5
1
1.5
Time [sec]
Amplitude[V]
Background noise: 0.1 Background noise: 0.3
21. Parameters set-up
Wavelet Denoiser parameters defined at runtime:
N = 4 of decomposition/recomposition levels;
the removal of the approximation signal at the 4th
decomposition level determining an output frequency
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
decomposition level determining an output frequency
bandwidth in the range of [375Hz-3kHz].
b_len = 500 samples as length of each input buffer frame;
the Haar/ Daubechies 2 families as mother wavelets.
22. 0.16 0.18 0.2 0.22 0.24 0.26 0.28
-1
-0.5
0
0.5
1
DenoisingInput[V]
Accuracy: low level noise and
Haar as mother wavelet
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
0.16 0.18 0.2 0.22 0.24 0.26 0.28
Time [sec]
0.16 0.18 0.2 0.22 0.24 0.26 0.28
-1
-0.5
0
0.5
1
Time [sec]
DenoisingOutput[V]
23. Accuracy: high level noise and
Haar as mother wavelet
0.16 0.18 0.2 0.22 0.24 0.26 0.28
-1
-0.5
0
0.5
1
DenoisingInput[V]
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
0.16 0.18 0.2 0.22 0.24 0.26 0.28
Time [sec]
0.16 0.18 0.2 0.22 0.24 0.26 0.28
-0.5
0
0.5
1
Time [sec]
DenoisingOutput[V]
25. Area occupancy and Power Consumption
The datapath of the reconfigurable filter has been synthesized in
order to evaluate the effectiveness of our approach in terms of area
and power saving.
We compared the Global Kernel assembled by the MDC tool with:
the datapath obtained by the 3 identified kernels without any
resource sharing among them (Static Global Kernel);
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
resource sharing among them (Static Global Kernel);
the Cascaded Kernel implementing the wavelet denoising as a
fully-parallel solution without sharing actors among the
various decomposition/recomposition levels.
Results for the Cascaded Kernel only represent an estimation taking
into account the number of required FIR filters and the hardware
resources correspondent to the various modules of the HDL
components library.
26. Synthesis Results for Multiplier and Adder FUs
“Multiplier” and “Adder” FUs determine the largest contribution in
terms of area and power consumption.
Implementation adders multipliers
MDC Global Kernel 3 2
Static Global Kernel 6 4
Cascaded WD 20 32
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
Cascaded WD 20 32
FPGA and ASIC (using Cadence RTL Compiler targeting a 90nm CMOS
library) synthesis results for the relevant FUs:
FU FPGA ASIC
Slices Flip Flops MULT18x18s Area
[µm2]
Power
[mW]
Frequency
[MHz]
adder 341 116 0 8644 1,679 555,56
multiplier 138 131 11 113497 2,031 357,14
27. FPGA synthesis using the Xilinx FPGA Spartan-3E 1600 as target:
FPGA synthesis results
Implementation Slices [%] Flip Flops [%] MULT18x18s [%]
MDC Global Kernel 14 5 22
Static Global Kernel 22 8 44
Cascaded WD 76 22 356
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
Cascaded WD 76 22 356
The MDC Global Kernel is able to achieve 36% of saving in terms
of slices compared to Static Global Kernel and 82% compared to
the Cascaded WD solution.
Considering the Xilinx multiplier primitives, the resource saving
percentage is 50% and 97% respectively.
29. Results evaluation
Provided that:
in the MDC Global Kernel case, the same resources are
reused at each level;
in the Cascaded Kernel case, the number of required
resources is linearly proportional to the number of levels;
the more will be the depth of the denoiser the larger will be the
benefits of adopting the MDC-based approach.
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
benefits of adopting the MDC-based approach.
Strict temporal constraints would make impossible the minimization
of hardware resources within kernels. In those cases, a compromise
design choice must be done to balance resource saving and
processing capacity.
In other cases different kernel models, for example exploiting
pipelining solutions, would be beneficial.
30. Conclusions
In this paper, the implementation of a Wavelet Denoiser allows to
demonstrate that:
the applicability of the MDC tool on a completely different
application field compared to the original one it was
conceived for;
a non-static implementation of a denoiser is possible
leveraging a coarse-grained reconfigurable hardware
design platform.
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
leveraging a coarse-grained reconfigurable hardware
design platform.
System correctness and accuracy have been properly verified using a
processor-coprocessor test environment and simulated neural signals
as input data, varying the values assigned to the various parameters.
The MDC-based solution is capable of an on-line processing of the
incoming samples maximizing the resources reuse and determining
area and power minimization.
31. Acknowledgement
The research leading to these results has received funding from the
Region of Sardinia, Fundamental Research Programme, L.R. 7/2007
“Promotion of the scientific research and technological innovation in
Sardinia” under grant agreement CRP-18324 RPCT Project (Y. 2010) and
CRP-60544 ELoRA Project (Y. 2012). Carlo Sau gratefully acknowledges
Sardinia Regional Government for the financial support of his PhD
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
Sardinia Regional Government for the financial support of his PhD
scholarship (P.O.R. Sardegna F.S.E. Operational Programme of the
Autonomous Region of Sardinia, European Social Fund 2007-2013 - Axis
IV Human Resources, Objective l.3, Line of Activity l.3.1).
32. A Coarse-Grained Reconfigurable
Wavelet Denoiser Exploiting the
Multi-Dataflow Composer ToolMulti-Dataflow Composer Tool
N. Carta, C. Sau, F. Palumbo, D. Pani and L. Raffo
DIEE - Dept. of Electrical and Electronic Engineering
University of Cagliari, Italy
33. Read full article
IEEE Xplore Digital Library
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6661532
RPCT Project Website:
http://sites.unica.it/rpct/references/
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
http://sites.unica.it/rpct/references/
34. Reference: BibTex
@INPROCEEDINGS{Carta_DASIP_2013,
author={N. Carta and C. Sau and F. Palumbo and D. Pani and L. Raffo},
booktitle={Design and Architectures for Signal and Image Processing
(DASIP), 2013 Conference on},
title={A coarse-grained reconfigurable wavelet denoiser exploiting the
Multi-Dataflow Composer tool},
year={2013},
Nicola CartaNicola Carta etet al.al.
A CoarseA Coarse--Grained Reconfigurable Wavelet Denoiser Exploiting the MultiGrained Reconfigurable Wavelet Denoiser Exploiting the Multi--Dataflow Composer ToolDataflow Composer Tool
year={2013},
pages={141-148}, }