I studied at the Indian Institute of Technology, Kharagpur, India, where I completed my B.Tech and M.Tech in the Department of Electronics and Electrical Communication Engineering as part of the 2018 batch. After graduating, I joined Schneider Electric Systems India Private Limited as a Software Design Engineer; I am currently designated Senior Firmware Engineer at the same company, with 4+ years of work experience. The uploaded presentation is my MTP thesis, "Temperature-Aware Application Mapping onto Mesh-Based Network-on-Chip Using Genetic Algorithm".
The MSc defense ceremony was held on 6-7-2017 at Mansoura University, Faculty of Engineering. This presentation is shared to help MSc students in the Faculty of Engineering prepare their thesis presentations and ease their tension before presenting.
ANALOG MODELING OF RECURSIVE ESTIMATOR DESIGN WITH FILTER DESIGN MODEL (VLSICS Design)
This document summarizes a research paper on implementing a low power design methodology for recursive encoders and decoders. It discusses how recursive coding can achieve better error correction performance at low signal-to-noise ratios compared to other codes. It then describes the design of a recursive decoder that uses the log-MAP algorithm to minimize power consumption. The decoder uses five main computational steps - branch metric calculation, forward metric computation, backward metric computation, log-likelihood ratio calculation, and extrinsic information calculation. It also compares the implementation of four-state and eight-state recursive encoders. The goal of the design is to optimize the power and area of recursive encoders and decoders.
ENERGY AND LATENCY AWARE APPLICATION MAPPING ALGORITHM & OPTIMIZATION FOR HOM... (cscpconf)
Energy efficiency is one of the most critical issues in the design of a System on Chip. In a Network-on-Chip (NoC) based system, energy consumption is influenced dramatically by the mapping of Intellectual Property (IP) cores, which affects the performance of the system. In this paper we test previously proposed algorithms and introduce a new energy-efficient mapping algorithm for 3D NoC architectures. In addition, a hybrid method has been implemented using a bio-inspired optimization technique (particle swarm optimization). The proposed algorithm has been implemented and evaluated on randomly generated benchmarks and on real-life applications such as MMS, Telecom and VOPD. The algorithm has also been tested with the E3S benchmark suite and compared with the existing spiral and crinkle algorithms, showing better reduction in communication energy consumption and improved system performance. Experimental results show that the average reduction in communication energy consumption is 19% against spiral and 17% against crinkle mapping, while the reduction in communication cost is 24% and 21%, and the reduction in latency is 24% and 22%, respectively. After optimizing both our work and the existing methods with the bio-inspired technique and comparing them, the average energy reduction is found to be 18% and 24%, respectively.
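A common cost model in NoC mapping work multiplies each communicating pair's bandwidth by the Manhattan hop distance between the tiles the IPs are mapped to. The sketch below illustrates that model on a made-up 2x2 mesh; the task graph, bandwidths and mappings are hypothetical, not the paper's data.

```python
# Communication-cost sketch for NoC mapping: cost = sum over task-graph
# edges of (bandwidth * hop distance of the mapped tiles). Toy data only.

def hops(a, b):
    """Manhattan (XY-routing) hop count between two mesh tiles."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def comm_cost(edges, mapping):
    """edges: {(src_ip, dst_ip): bandwidth}; mapping: ip -> (x, y) tile."""
    return sum(bw * hops(mapping[s], mapping[d]) for (s, d), bw in edges.items())

# Toy 4-IP task graph mapped onto a 2x2 mesh in two different ways.
edges = {("A", "B"): 70, ("A", "C"): 30, ("B", "D"): 50, ("C", "D"): 20}
good = {"A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1)}
bad  = {"A": (0, 0), "B": (1, 1), "C": (0, 1), "D": (1, 0)}

print(comm_cost(edges, good))  # 170: every communicating pair is adjacent
print(comm_cost(edges, bad))   # 260: the heaviest flows cross two hops
```

A mapping algorithm (spiral, crinkle, or a GA/PSO search) is essentially minimizing this objective over the space of IP-to-tile assignments.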
Improving initial generations in pso algorithm for transportation network des... (ijcsit)
The Transportation Network Design Problem (TNDP) aims to select the best set of projects from a number of candidate projects. Recently, metaheuristic methods have been applied to solve the TNDP in the sense of finding better solutions sooner. PSO, one such metaheuristic, is based on stochastic optimization and is a parallel evolutionary computation technique. The PSO system initializes with a number of random solutions and searches for the optimal solution by improving generations. This paper studies the behavior of PSO with respect to improving the initial generation and the fitness-value domain, to find better solutions than previous attempts.
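The PSO loop the abstract describes can be sketched in a few lines: a random initial generation, then velocity updates pulled toward each particle's personal best and the global best. The parameters (w, c1, c2) and the sphere fitness below are illustrative defaults, not the paper's settings.

```python
import random

# Minimal global-best PSO sketch minimizing a toy fitness function.
def pso(fitness, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    random.seed(1)  # deterministic for the demo
    pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal bests
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]       # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f

best, best_f = pso(lambda x: sum(v * v for v in x), dim=3)
print(best_f)  # close to 0 for the sphere function
```

Improving the initial generation, as the paper studies, amounts to replacing the uniform-random `pos` seeding with a better-informed one.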
Mining relevant facts from the web at the time of need is a tedious task. Research across diverse fields is fine-tuning methodologies toward this goal of extracting the information most relevant to the user's search query. The methodology proposed in this paper finds ways to ease search complexity by tackling the severe issues that hinder the performance of traditional approaches. It finds all possible semantically relatable frequent sets with the FP-Growth algorithm; this outcome in turn fuels a bio-inspired fuzzy PSO that finds the optimal attractor points around which web documents cluster, meeting the requirements of the search query without losing relevance. On the whole, the proposed system optimizes an objective function that minimizes intra-cluster differences and maximizes inter-cluster distances, while keeping all possible relationships with the search context intact. The major contribution is that the system finds all possible combinations matching the user's search transaction, thereby making the results more meaningful. These relatable sets form the particle set for both fuzzy clustering and PSO, remaining unbiased and maintaining innate herd behavior for any number of new additions. Evaluations reveal that the proposed methodology fares well as an optimized and effective enhancement over conventional approaches.
A Novel Route Optimized Cluster Based Routing Protocol for Pollution Controll... (IRJET Journal)
This document summarizes a research paper that proposes a novel cluster-based routing protocol for sensor networks aimed at pollution monitoring. The protocol aims to optimize routes while maintaining moderate energy efficiency. It first discusses challenges in sensor network routing related to dynamic topology and limited device power. It then outlines two lemmas: 1) reducing cluster head detection time can decrease routing time; and 2) changing cluster heads randomly over time can improve network lifetime. The proposed protocol selects cluster heads randomly based on energy levels and predicts cluster heads to reduce overhead. It aims to optimize routing time and energy efficiency through these techniques.
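The energy-aware random cluster-head selection described above can be sketched as roulette-wheel sampling weighted by residual energy, so higher-energy nodes are proportionally more likely to be chosen. The function and node data below are hypothetical illustrations of that idea, not the paper's protocol.

```python
import random

# Energy-weighted random cluster-head selection sketch (names made up).
def pick_cluster_heads(nodes, n_heads, rng):
    """nodes: {node_id: residual_energy}. Draws n_heads distinct nodes,
    each with probability proportional to its remaining energy."""
    ids = list(nodes)
    heads = []
    for _ in range(n_heads):
        total = sum(nodes[i] for i in ids)
        r = rng.uniform(0, total)
        acc = 0.0
        for i in ids:
            acc += nodes[i]
            if r <= acc:
                heads.append(i)
                ids.remove(i)   # a node heads at most one cluster per round
                break
    return heads

rng = random.Random(7)
nodes = {"n1": 5.0, "n2": 0.5, "n3": 4.0, "n4": 0.2, "n5": 3.0}
print(pick_cluster_heads(nodes, 2, rng))
```

Re-running the selection each round with fresh randomness is what rotates the cluster-head role over time, which is the lifetime-improving behavior the second lemma refers to.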
The document describes an algorithm for generating test data for analyzing local search algorithms for solving the Steiner tree problem (STP) in graphs. It first generates a spanning tree using a random tree generation algorithm, then adds two types of edges to make the graph biconnected. It also generates variants of the trees by connecting each node to a minimum number of other nodes determined by a connectivity ratio. This generates test networks to evaluate STP local search algorithms under different conditions.
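The two-phase generation idea (grow a random spanning tree, then add extra edges until a connectivity target is met) can be sketched as follows. The attachment rule and `min_degree` parameter are illustrative assumptions; the paper's exact procedure, including its biconnectivity step, differs.

```python
import random

# Test-network generation sketch: random spanning tree plus extra edges.
def random_tree(n, rng):
    """Random spanning tree over nodes 0..n-1: each new node attaches
    to a uniformly chosen node already in the tree."""
    edges = set()
    for v in range(1, n):
        u = rng.randrange(v)
        edges.add((u, v))          # (u, v) with u < v by construction
    return edges

def add_extra_edges(n, edges, min_degree, rng):
    """Add random edges until every node has at least min_degree neighbors."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    edges = set(edges)
    for v in range(n):
        while deg[v] < min_degree:
            u = rng.randrange(n)
            if u != v and (min(u, v), max(u, v)) not in edges:
                edges.add((min(u, v), max(u, v)))
                deg[u] += 1
                deg[v] += 1
    return edges

rng = random.Random(3)
tree = random_tree(8, rng)
net = add_extra_edges(8, tree, min_degree=2, rng=rng)
print(len(tree), len(net))  # the tree always has n-1 = 7 edges
```

Varying the seed and the connectivity ratio produces families of test networks with controlled density, which is the point of the generator.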
IRJET- Chord Classification of an Audio Signal using Artificial Neural Network (IRJET Journal)
This document presents a method for classifying chords in an audio signal using an artificial neural network. It extracts Chroma DCT-Reduced Log Pitch (CRP) features from a dataset of 2,000 recordings of 10 guitar chords. These CRP features are used to train a two-layer feedforward neural network with scaled conjugate gradient backpropagation. The trained network is able to classify chords in the input audio with an overall accuracy of 89.3%. The method demonstrates effective chord classification using a machine learning approach with chroma-based audio features.
Design and Implementation of an Embedded System for Software Defined Radio (IJECEIAES)
In this paper, developing high-performance software for demanding real-time embedded systems is proposed. This software-based design will enable software engineers and system architects in emerging technology areas like 5G wireless and Software Defined Networking (SDN) to build their algorithms. An ADSP-21364 floating-point SHARC Digital Signal Processor (DSP) running at 333 MHz is adopted as the platform for the embedded system. To evaluate the proposed embedded system, an implementation of frame, symbol and carrier-phase synchronization is presented as an application; its performance is investigated with an online Quadrature Phase Shift Keying (QPSK) receiver. The obtained results show that the designed software is implemented successfully on the SHARC DSP and can be utilized efficiently for such algorithms. In addition, the proposed embedded system proves pragmatic and capable of dealing with the memory constraints and critical timing issues caused by the long interleaved coded data used for channel coding.
Hamming net based Low Complexity Successive Cancellation Polar Decoder (RSIS International)
This paper aims to implement a hybrid Polar encoder using knowledge of mutual information and channel capacity. Further, a Hamming-weight successive cancellation decoder is simulated with QPSK modulation in the presence of additive white Gaussian noise. Experiments on the effect of channel polarization show that, for a 256-bit data stream, 30% of the channels have zero capacity and 49% of the channels have a capacity of one bit. The decoding complexity is reduced to almost half compared with the conventional successive cancellation decoding algorithm. However, the required SNR of 7 dB is achieved at the targeted BER of 10^-4. The penalty paid is the training time required at the decoding end.
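The polarization effect quantified above can be illustrated on a binary erasure channel (BEC), where the recursion has a closed form: a channel with erasure probability z splits into a worse channel (2z - z^2) and a better one (z^2) at each step. The 0.5 erasure rate and the 0.01/0.99 thresholds are assumptions for illustration; the paper uses a different channel model, so its 30%/49% figures are not reproduced here.

```python
# Channel-polarization sketch for a BEC: after 8 steps, the 2^8 = 256
# synthetic channels cluster toward capacity 0 or capacity 1.

def polarize(z0, n_steps):
    zs = [z0]
    for _ in range(n_steps):
        # each channel splits into a worse and a better synthetic channel
        zs = [f(z) for z in zs for f in (lambda z: 2*z - z*z, lambda z: z*z)]
    return zs

zs = polarize(0.5, 8)                          # 256 erasure probabilities
near_zero_cap = sum(1 for z in zs if z > 0.99)  # almost always erased
near_one_cap  = sum(1 for z in zs if z < 0.01)  # almost never erased
print(len(zs), near_zero_cap, near_one_cap)
```

Successive cancellation decoding then sends information only on the near-capacity-one channels and freezes the rest, which is what makes the low-complexity decoder possible.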
Hardware Architecture of Complex K-best MIMO Decoder (CSCJournals)
This paper presents a hardware architecture of a complex K-best Multiple Input Multiple Output (MIMO) decoder that reduces the complexity of the Maximum Likelihood (ML) detector. We develop a novel low-power VLSI design of a complex K-best decoder for MIMO with the 64-QAM modulation scheme. Use of Schnorr-Euchner (SE) enumeration and a new parameter, Rlimit, reduces the complexity of calculating the K best nodes to a certain level with increased performance. A total word length of only 16 bits has been adopted for the hardware design, limiting the bit error rate (BER) degradation to 0.3 dB with list size K and Rlimit both equal to 4. The proposed VLSI architecture is modeled in Verilog HDL using Xilinx tools and synthesized with Synopsys Design Vision in 45 nm CMOS technology. According to the synthesis results, it achieves 1090.8 Mbps throughput with a power consumption of 782 mW and a latency of 0.33 µs. The maximum frequency of the proposed design is 181.8 MHz.
Investigating the Performance of NoC Using Hierarchical Routing Approach (IJERA Editor)
The Network-on-Chip (NoC) model has emerged as a revolutionary methodology for integrating a large number of intellectual property (IP) blocks on a single die. As stated by the International Technology Roadmap for Semiconductors (ITRS), device sizes must be scaled down, and long interconnects should be avoided; new interconnect patterns are therefore needed. Three-dimensional ICs are capable of achieving superior performance, better noise resistance and lower interconnect power consumption compared to traditional planar ICs. In this paper, network data is routed using a hierarchical methodology. We analyze the total number of logic gates and registers, the power consumption and the delay when different widths of data are transmitted, using the Quartus II software.
This document summarizes a research paper that proposes using parallel concatenated turbo codes in wireless sensor networks in an adaptive way. The key points are:
1) Turbo codes can achieve near-Shannon limit performance but decoding is complex, making them difficult to implement on energy-constrained sensor nodes.
2) The proposed approach shifts the complex turbo decoding to the base station while sensor nodes implement encoding and basic error correction.
3) At sensor nodes, a parallel concatenated convolutional code (PCCC) circuit encodes data and detects/corrects errors in forwarded packets. This improves energy efficiency and reliability over the wireless sensor network.
This document describes a PhD thesis on designing unipolar orthogonal codes for optical code division multiple access (CDMA) networks. The research involves developing algorithms to generate sets of 1D and 2D unipolar orthogonal codes with maximum size and orthogonality. Two algorithms are proposed for generating maximal clique sets of 1D codes based on generating all possible codes in a difference of positions representation and using correlation constraints to identify orthogonal code sets. Computational complexity analysis shows the algorithms may have polynomial runtime complexity for certain code parameter ranges. The thesis will evaluate the proposed algorithms and compare results to hypothetical ideal schemes.
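The correlation constraints used to screen unipolar (0/1) codeword pairs can be sketched directly: a candidate set is orthogonal enough when every out-of-phase autocorrelation and every cross-correlation stays at or below a bound. The weight-3, length-13 codewords below are toy examples, not from the thesis.

```python
# Correlation-constraint sketch for unipolar optical-CDMA codewords.
def cyclic_corr(a, b, shift):
    """Number of coinciding 1s between a and b cyclically shifted."""
    n = len(a)
    return sum(a[i] & b[(i + shift) % n] for i in range(n))

def max_autocorr(a):
    """Largest out-of-phase autocorrelation (lower is better)."""
    return max(cyclic_corr(a, a, s) for s in range(1, len(a)))

def max_crosscorr(a, b):
    return max(cyclic_corr(a, b, s) for s in range(len(a)))

# Two weight-3, length-13 codewords given by the positions of their 1s.
c1 = [1 if i in (0, 1, 4) else 0 for i in range(13)]
c2 = [1 if i in (0, 2, 7) else 0 for i in range(13)]
print(max_autocorr(c1), max_crosscorr(c1, c2))  # 1 1: both constraints met
```

Generating all codes in a difference-of-positions representation, as the thesis proposes, makes these checks cheap, since correlations are determined by which position differences repeat.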
Area, Delay and Power Comparison of Adder Topologies (VLSICS Design)
This document compares the area, delay, and power characteristics of different adder topologies, including the ripple carry adder, carry look-ahead adder, carry skip adder, carry select adder, carry increment adder, carry save adder, and carry bypass adder. It analyzes the functionality and performance of 8-bit implementations of each adder type using the Microwind simulation software at a 0.12 µm CMOS technology node. The ripple carry adder has the simplest design but the longest propagation delay, which scales with the number of bits, while other topologies like the carry look-ahead adder and carry skip adder reduce delay at the cost of more complex circuitry. The document aims to guide the choice of adder topology by quantifying these area, delay and power trade-offs.
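The ripple-carry behavior described above is easy to see at the bit level: each full adder must wait for the previous carry, so delay grows linearly with width. A minimal bit-level sketch:

```python
# Bit-level ripple-carry addition: the carry ripples through every
# full adder, which is why delay scales with the bit width.

def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(x, y, width=8):
    carry, total = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry  # carry is the final carry-out

print(ripple_carry_add(100, 77))   # (177, 0)
print(ripple_carry_add(200, 100))  # (44, 1): 300 mod 256, carry-out 1
```

The faster topologies in the comparison all attack the same serial carry chain, trading extra gates for shorter critical paths.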
This document surveys and compares the performance of four types of parallel prefix adders: Kogge-Stone, Brent-Kung, Han-Carlson, and Ladner-Fischer. It analyzes their computational delay, interconnect usage, power consumption, number of cells, and maximum fan-out. Simulation results showed that the Kogge-Stone adder has the lowest delay but highest interconnect usage. The Brent-Kung adder exhibited the best performance in terms of power consumption and number of cells. In conclusion, the optimal adder depends on whether high speed, low power, or reduced area is prioritized.
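The parallel-prefix idea behind all four adders can be sketched with the Kogge-Stone wiring: per-bit (generate, propagate) pairs are combined at log2(width) levels instead of rippling, which is the source of its low delay (and of the heavy interconnect the survey notes). This is a behavioral sketch, not any paper's RTL.

```python
# Kogge-Stone carry computation: combine (g, p) pairs at doubling
# distances; after log2(width) levels, G[i] is the carry out of bit i.

def ks_add(x, y, width=8):
    g = [((x >> i) & 1) & ((y >> i) & 1) for i in range(width)]  # generate
    p = [((x >> i) & 1) ^ ((y >> i) & 1) for i in range(width)]  # propagate
    G, P = g[:], p[:]
    d = 1
    while d < width:
        # iterate downward so each cell reads the previous level's values
        for i in range(width - 1, d - 1, -1):
            G[i] = G[i] | (P[i] & G[i - d])
            P[i] = P[i] & P[i - d]
        d *= 2
    carries = [0] + G          # carry into bit i is G[i-1] (cin = 0)
    s = 0
    for i in range(width):
        s |= (p[i] ^ carries[i]) << i
    return s, carries[width]   # (sum, carry-out)

print(ks_add(100, 77))  # (177, 0)
```

Brent-Kung, Han-Carlson and Ladner-Fischer compute the same prefix with fewer cells and wires at the cost of more levels, which is exactly the speed/power/area trade-off the survey measures.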
In this paper, we describe an FPGA H.264/AVC encoder architecture performing in real time. To reduce the critical path length and increase throughput, the encoder uses a parallel and pipelined architecture, and all modules have been optimized with respect to area cost. Our design is described in VHDL and synthesized for an Altera Stratix III FPGA. The throughput of the FPGA architecture reaches a processing rate higher than 177 million pixels per second at 130 MHz, permitting its use with the H.264/AVC standard for HDTV.
Design and implementation of log domain decoder (IJECEIAES)
Low-Density Parity-Check (LDPC) codes have become prominent in communication systems for error correction, owing to their robust error-correcting performance and their ability to meet the requirements of 5G systems. However, the main challenge facing researchers is the hardware implementation, because of its high complexity and long run-time. In this paper, an efficient and optimized design for a log-domain decoder has been implemented using Xilinx System Generator with a Kintex-7 FPGA device (XC7K325T-2FFG900C). Results confirm that the proposed decoder gives a Bit Error Rate (BER) very close to theoretical calculations, which illustrates that this decoder is suitable for next-generation demands requiring a high data rate with very low BER.
Research Inventy: International Journal of Engineering and Science (researchinventy)
This document summarizes various methods that have been proposed for implementing 16-QAM (Quadrature Amplitude Modulation) in FPGAs (Field Programmable Gate Arrays). It reviews architectures for carrier synchronization, equalization, and digital up/down conversion. The document then proposes a new system generator-based 16-QAM transmitter model that considers issues like symbol mapping, interpolation filtering, and up-conversion to an intermediate frequency. Simulation results demonstrating the transmitter constellation and resource usage on an FPGA are also presented.
Improving The Performance of Viterbi Decoder using Window System (IJECEIAES)
An efficient Viterbi decoder, called the Viterbi decoder with window system, is introduced in this paper. Simulation results over Gaussian channels are presented for rates 1/2, 1/3 and 2/3 combined with a TCM encoder of memory order 2 and 3; they show that the proposed scheme outperforms the classical Viterbi decoder by a gain of 1 dB. We also propose a function called RSCPOLY2TRELLIS for recursive systematic convolutional (RSC) encoders, which creates the trellis structure of an RSC encoder from the matrix "H". Moreover, we compare the decoding algorithms for the TCM encoder, namely soft and hard Viterbi, with the variants of the MAP decoder known as the BCJR or forward-backward algorithm, which performs very well in decoding TCM but depends on the code size, the memory, and the CPU requirements of the application.
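For readers unfamiliar with the baseline being improved, here is a hard-decision Viterbi sketch for the classic rate-1/2, constraint-length-3 convolutional code with generators (7, 5) octal. It is a toy stand-in for the classical decoder, not the paper's window-system decoder or its TCM setup.

```python
# Hard-decision Viterbi decoding of the (7, 5) rate-1/2 convolutional code.
G = (0b111, 0b101)  # generator polynomials, octal 7 and 5

def encode(bits):
    state, out = 0, []
    for b in bits:
        reg = (b << 2) | state                       # [b, prev1, prev2]
        out += [bin(reg & g).count("1") & 1 for g in G]
        state = reg >> 1
    return out

def viterbi(received):
    n_states, INF = 4, float("inf")
    metric = [0] + [INF] * (n_states - 1)            # encoder starts in state 0
    paths = [[] for _ in range(n_states)]
    for t in range(0, len(received), 2):
        r = received[t:t + 2]
        new_metric = [INF] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            if metric[s] == INF:
                continue
            for b in (0, 1):                         # try both input bits
                reg = (b << 2) | s
                expect = [bin(reg & g).count("1") & 1 for g in G]
                m = metric[s] + sum(x != y for x, y in zip(expect, r))
                ns = reg >> 1
                if m < new_metric[ns]:               # keep the survivor
                    new_metric[ns] = m
                    new_paths[ns] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = min(range(n_states), key=lambda s: metric[s])
    return paths[best]

msg = [1, 0, 1, 1, 0, 0, 1, 0]
rx = encode(msg)
rx[3] ^= 1                      # flip one channel bit
print(viterbi(rx) == msg)       # True: the single error is corrected
```

The windowed scheme in the paper changes how survivors are managed over the received stream; the add-compare-select core above stays the same.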
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –... (IRJET Journal)
This document summarizes research on developing parallel algorithms to optimize solving the longest common subsequence (LCS) problem. LCS is commonly used for sequence comparison in bioinformatics. Traditional sequential dynamic-programming algorithms have a complexity of O(mn) for sequences of lengths m and n. The document reviews parallel algorithms developed using tools like OpenMP and GPU frameworks like CUDA to reduce computation time, and proposes the authors' own optimized parallel algorithm for multi-core CPUs using OpenMP.
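The O(mn) recurrence that the parallel versions start from is the standard dynamic-programming table: each cell extends a match diagonally or carries forward the best prefix score.

```python
# Sequential O(mn) LCS dynamic program, the baseline the paper parallelizes.
def lcs(a, b):
    m, n = len(a), len(b)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1        # extend the match
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])  # best prefix so far
    return L[m][n]

print(lcs("AGGTAB", "GXTXAYB"))  # 4, for the subsequence "GTAB"
```

Parallel versions exploit the fact that all cells on one anti-diagonal depend only on earlier anti-diagonals, so they can be filled concurrently.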
Welcome to International Journal of Engineering Research and Development (IJERD) (IJERD Editor)
The document presents a novel approach for forward error correction (FEC) decoding based on the belief propagation (BP) algorithm in LTE and WiMAX systems. Specifically, it proposes representing tail-biting convolutional codes and turbo codes using parity check matrices, which allows both code types to be decoded using a unified BP algorithm. This provides a lower complexity decoding architecture compared to traditional approaches. Simulation results show the BP algorithm achieves near-identical performance to maximum a posteriori (MAP) decoding for turbo codes, while being less complex. Representing codes with parity check matrices thus enables a universal decoder for LTE and WiMAX using a single BP algorithm.
THRESHOLD SENSITIVE HETEROGENOUS ROUTING PROTOCOL FOR BETTER ENERGY UTILIZATI... (ijassn)
Advancements in WSNs have led to the wide applicability of sensor networks in various fields. WSNs are basically classified into reactive and proactive networks: a reactive network responds to immediate changes in the monitored parameters of its environment, whereas a proactive network senses continuously. To make the network more energy-efficient over its lifetime, we need to reduce the energy expense of the network model, which is one of the most significant issues in wireless sensor networks (WSNs) [1, 2]. In this paper, we propose an efficient version of the TSEP protocol that prolongs the network lifetime through efficient utilization of sensor energy, as our simulations show. We evaluated the performance of our protocol and compared the results with TSEP; from the simulation results it can easily be concluded that our proposed routing protocol performs better in terms of network lifetime and stability period.
IRJET- Chord Classification of an Audio Signal using Artificial Neural NetworkIRJET Journal
This document presents a method for classifying chords in an audio signal using an artificial neural network. It extracts Chroma DCT-Reduced Log Pitch (CRP) features from a dataset of 2,000 recordings of 10 guitar chords. These CRP features are used to train a two-layer feedforward neural network with scaled conjugate gradient backpropagation. The trained network is able to classify chords in the input audio with an overall accuracy of 89.3%. The method demonstrates effective chord classification using a machine learning approach with chroma-based audio features.
Design and Implementation of an Embedded System for Software Defined RadioIJECEIAES
In this paper, developing high performance software for demanding real-time embed- ded systems is proposed. This software-based design will enable the software engineers and system architects in emerging technology areas like 5G Wireless and Software Defined Networking (SDN) to build their algorithms. An ADSP-21364 floating point SHARC Digital Signal Processor (DSP) running at 333 MHz is adopted as a platform for an embedded system. To evaluate the proposed embedded system, an implementation of frame, symbol and carrier phase synchronization is presented as an application. Its performance is investigated with an on line Quadrature Phase Shift keying (QPSK) receiver. Obtained results show that the designed software is implemented successfully based on the SHARC DSP which can utilized efficiently for such algorithms. In addition, it is proven that the proposed embedded system is pragmatic and capable of dealing with the memory constraints and critical time issue due to a long length interleaved coded data utilized for channel coding.
Hamming net based Low Complexity Successive Cancellation Polar Decoder - RSIS International
This paper aims to implement a hybrid Polar encoder using the knowledge of mutual information and channel capacity. Further, a Hamming-weight successive cancellation decoder is simulated with QPSK modulation in the presence of additive white Gaussian noise. Experiments on the effect of channel polarization have shown that, for a 256-bit data stream, 30% of the channels have zero capacity and 49% of the channels have a capacity of one bit. The decoding complexity is reduced to almost half compared with the conventional successive cancellation decoding algorithm, while the required SNR of 7 dB is achieved at the targeted BER of 10^-4. The penalty paid is the training time required at the decoding end.
Hardware Architecture of Complex K-best MIMO Decoder - CSCJournals
This paper presents a hardware architecture of a complex K-best Multiple Input Multiple Output (MIMO) decoder that reduces the complexity of the Maximum Likelihood (ML) detector. We develop a novel low-power VLSI design of a complex K-best decoder for MIMO with the 64-QAM modulation scheme. Use of Schnorr-Euchner (SE) enumeration and a new parameter, Rlimit, reduces the complexity of calculating the K best nodes while increasing performance. A total word length of only 16 bits has been adopted for the hardware design, limiting the bit error rate (BER) degradation to 0.3 dB with list size K and Rlimit equal to 4. The proposed VLSI architecture is modeled in Verilog HDL using Xilinx and synthesized with Synopsys Design Vision in 45 nm CMOS technology. According to the synthesis results, it achieves 1090.8 Mbps throughput with power consumption of 782 mW and latency of 0.33 µs. The maximum frequency of the proposed design is 181.8 MHz.
Investigating the Performance of NoC Using Hierarchical Routing Approach - IJERA Editor
The Network-on-Chip (NoC) model has emerged as a revolutionary methodology for incorporating a large number of intellectual property (IP) blocks on a die. According to the International Technology Roadmap for Semiconductors (ITRS), device sizes must be scaled down, and long interconnections should be avoided; for that, new interconnect patterns are needed. Three-dimensional ICs are capable of achieving superior performance, better noise resistance, and lower interconnect power consumption compared to traditional planar ICs. In this paper, network data is routed using a hierarchical methodology. We analyze the total number of logic gates and registers, power consumption, and delay when different widths of data are transmitted, using Quartus II software.
This document summarizes a research paper that proposes using parallel concatenated turbo codes in wireless sensor networks in an adaptive way. The key points are:
1) Turbo codes can achieve near-Shannon limit performance but decoding is complex, making them difficult to implement on energy-constrained sensor nodes.
2) The proposed approach shifts the complex turbo decoding to the base station while sensor nodes implement encoding and basic error correction.
3) At sensor nodes, a parallel concatenated convolutional code (PCCC) circuit encodes data and detects/corrects errors in forwarded packets. This improves energy efficiency and reliability over the wireless sensor network.
This document describes a PhD thesis on designing unipolar orthogonal codes for optical code division multiple access (CDMA) networks. The research involves developing algorithms to generate sets of 1D and 2D unipolar orthogonal codes with maximum size and orthogonality. Two algorithms are proposed for generating maximal clique sets of 1D codes based on generating all possible codes in a difference of positions representation and using correlation constraints to identify orthogonal code sets. Computational complexity analysis shows the algorithms may have polynomial runtime complexity for certain code parameter ranges. The thesis will evaluate the proposed algorithms and compare results to hypothetical ideal schemes.
Area, Delay and Power Comparison of Adder Topologies - VLSICS Design
This document compares the area, delay, and power characteristics of different adder topologies, including ripple carry adder, carry look-ahead adder, carry skip adder, carry select adder, carry increment adder, carry save adder, and carry bypass adder. It analyzes the functionality and performance of 8-bit implementations of each adder type using Microwind simulation software at a 0.12μm CMOS technology node. The ripple carry adder has the simplest design but the longest propagation delay that scales with the number of bits, while other topologies like the carry look-ahead adder and carry skip adder reduce delay but have more complex circuitry. The document aims to
This document surveys and compares the performance of four types of parallel prefix adders: Kogge-Stone, Brent-Kung, Han-Carlson, and Ladner-Fischer. It analyzes their computational delay, interconnect usage, power consumption, number of cells, and maximum fan-out. Simulation results showed that the Kogge-Stone adder has the lowest delay but highest interconnect usage. The Brent-Kung adder exhibited the best performance in terms of power consumption and number of cells. In conclusion, the optimal adder depends on whether high speed, low power, or reduced area is prioritized.
In this paper, we describe an FPGA H.264/AVC encoder architecture that performs in real time. To reduce the critical path length and increase throughput, the encoder uses a parallel, pipelined architecture, and all modules have been optimized with respect to area cost. Our design is described in VHDL and synthesized to an Altera Stratix III FPGA. The throughput of the FPGA architecture reaches a processing rate higher than 177 million pixels per second at 130 MHz, permitting its use with the H.264/AVC standard for HDTV.
Design and implementation of log domain decoder - IJECEIAES
Low-Density Parity-Check (LDPC) codes have become popular in communication systems for error correction, owing to their robust error-correcting performance and their ability to meet the requirements of 5G systems. However, the main challenge facing researchers is the hardware implementation, because of its high complexity and long run time. In this paper, an efficient and optimized design for a log domain decoder has been implemented using Xilinx System Generator with the FPGA device Kintex7 (XC7K325T-2FFG900C). Results confirm that the proposed decoder gives a Bit Error Rate (BER) very close to theoretical calculations, which illustrates that this decoder is suitable for next-generation demands requiring high data rates with very low BER.
Research Inventy: International Journal of Engineering and Science - researchinventy
This document summarizes various methods that have been proposed for implementing 16-QAM (Quadrature Amplitude Modulation) in FPGAs (Field Programmable Gate Arrays). It reviews architectures for carrier synchronization, equalization, and digital up/down conversion. The document then proposes a new system generator-based 16-QAM transmitter model that considers issues like symbol mapping, interpolation filtering, and up-conversion to an intermediate frequency. Simulation results demonstrating the transmitter constellation and resource usage on an FPGA are also presented.
Improving The Performance of Viterbi Decoder using Window System - IJECEIAES
An efficient Viterbi decoder, called the Viterbi decoder with window system, is introduced in this paper. Simulation results over Gaussian channels are obtained for rates 1/2, 1/3, and 2/3 joined to a TCM encoder with memory of order 2 or 3. These results show that the proposed scheme outperforms the classical Viterbi decoder by a gain of 1 dB. In addition, we propose a function called RSCPOLY2TRELLIS for recursive systematic convolutional (RSC) encoders, which creates the trellis structure of a recursive systematic convolutional encoder from the matrix “H”. Moreover, we present a comparison between the decoding algorithms for the TCM encoder, such as soft and hard Viterbi decoding, and the variants of the MAP decoder known as the BCJR or forward-backward algorithm, which performs very well in decoding TCM but depends on the code size, the memory, and the CPU requirements of the application.
An Optimized Parallel Algorithm for Longest Common Subsequence Using OpenMP ... - IRJET Journal
This document summarizes research on developing parallel algorithms to optimize solving the longest common subsequence (LCS) problem. LCS is commonly used for sequence comparison in bioinformatics. Traditional sequential dynamic programming algorithms have complexity of O(mn) for sequences of lengths m and n. The document reviews parallel algorithms developed using tools like OpenMP and GPUs like CUDA to reduce computation time. It proposes the authors' own optimized parallel algorithm for multi-core CPUs using OpenMP.
Welcome to International Journal of Engineering Research and Development (IJERD) - IJERD Editor
The document presents a novel approach for forward error correction (FEC) decoding based on the belief propagation (BP) algorithm in LTE and WiMAX systems. Specifically, it proposes representing tail-biting convolutional codes and turbo codes using parity check matrices, which allows both code types to be decoded using a unified BP algorithm. This provides a lower complexity decoding architecture compared to traditional approaches. Simulation results show the BP algorithm achieves near-identical performance to maximum a posteriori (MAP) decoding for turbo codes, while being less complex. Representing codes with parity check matrices thus enables a universal decoder for LTE and WiMAX using a single BP algorithm.
THRESHOLD SENSITIVE HETEROGENOUS ROUTING PROTOCOL FOR BETTER ENERGY UTILIZATI... - ijassn
Advancements in WSN have led to the wide applicability of sensor networks in various fields. The basic classification of WSNs is into reactive and proactive networks: reactive networks respond to immediate changes in the monitored parameters of their environment, whereas proactive networks sense continuously. To improve the energy efficiency of the network over its lifetime, we need to reduce the energy expense in the network model, which is one of the most significant issues in wireless sensor networks (WSNs) [1, 2]. In this paper, we propose an efficient version of the TSEP protocol, which prolongs the network lifetime through efficient utilization of sensor energy. We evaluated the performance of our protocol through simulation and compared the results with TSEP; the simulation results show that our proposed routing protocol performs better in terms of network lifetime and stability period.
Rainfall intensity duration frequency curve statistical analysis and modeling... - bijceesjournal
Using data from 41 years in Patna, India, the study's goal is to analyze the trends of how often it rains on a weekly, seasonal, and annual basis (1981−2020). First, utilizing the intensity-duration-frequency (IDF) curve and the relationship obtained by statistically analyzing rainfall, the historical rainfall data set for Patna, India, over a 41-year period (1981−2020) was evaluated for its quality. Changes in the hydrologic cycle as a result of increased greenhouse gas emissions are expected to induce variations in the intensity, length, and frequency of precipitation events. One strategy to lessen vulnerability is to quantify probable changes and adapt to them. Techniques such as log-normal, normal, and Gumbel (EV-I) are used. Distributions were created with durations of 1, 2, 3, 6, and 24 h and return periods of 2, 5, 10, 25, and 100 years. Mathematical correlations between rainfall and recurrence interval were also discovered.
Findings: Based on findings, the Gumbel approach produced the highest intensity values, whereas the other approaches produced values that were close to each other. The data indicates that 461.9 mm of rain fell during the monsoon season’s 301st week. However, it was found that the 29th week had the greatest average rainfall, 92.6 mm. With 952.6 mm on average, the monsoon season saw the highest rainfall. Calculations revealed that the yearly rainfall averaged 1171.1 mm. Using Weibull’s method, the study was subsequently expanded to examine rainfall distribution at different recurrence intervals of 2, 5, 10, and 25 years. Rainfall and recurrence interval mathematical correlations were also developed. Further regression analysis revealed that short wave irrigation, wind direction, wind speed, pressure, relative humidity, and temperature all had a substantial influence on rainfall.
Originality and value: The results of the rainfall IDF curves can provide useful information to policymakers in making appropriate decisions in managing and minimizing floods in the study area.
Software Engineering and Project Management - Introduction, Modeling Concepts... - Prakhyath Rai
Introduction, Modeling Concepts and Class Modeling: What is Object orientation? What is OO development? OO Themes; Evidence for usefulness of OO development; OO modeling history. Modeling
as Design technique: Modeling, abstraction, The Three models. Class Modeling: Object and Class Concept, Link and associations concepts, Generalization and Inheritance, A sample class model, Navigation of class models, and UML diagrams
Building the Analysis Models: Requirement Analysis, Analysis Model Approaches, Data modeling Concepts, Object Oriented Analysis, Scenario-Based Modeling, Flow-Oriented Modeling, class Based Modeling, Creating a Behavioral Model.
Introduction: e-waste – definition – sources of e-waste – hazardous substances in e-waste – effects of e-waste on environment and human health – need for e-waste management – e-waste handling rules – waste minimization techniques for managing e-waste – recycling of e-waste – disposal and treatment methods of e-waste – mechanism of extraction of precious metals from leaching solution – global scenario of e-waste – e-waste in India – case studies.
An improved modulation technique suitable for a three level flying capacitor ... - IJECEIAES
This research paper introduces an innovative modulation technique for controlling a 3-level flying capacitor multilevel inverter (FCMLI), aiming to streamline the modulation process in contrast to conventional methods. The proposed simplified modulation technique paves the way for more straightforward and efficient control of multilevel inverters, enabling their widespread adoption and integration into modern power electronic systems. Through the amalgamation of sinusoidal pulse width modulation (SPWM) with a high-frequency square wave pulse, this control technique attains energy equilibrium across the coupling capacitor. The modulation scheme incorporates a simplified switching pattern and a decreased count of voltage references, thereby simplifying the control algorithm.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
artificial intelligence and data science contents.pptx - GauravCar
What is artificial intelligence? Artificial intelligence is the ability of a computer or computer-controlled robot to perform tasks that are commonly associated with the intellectual processes characteristic of humans, such as the ability to reason.
Use PyCharm for remote debugging of WSL on a Windows machine - shadow0702a
This document serves as a comprehensive step-by-step guide on how to effectively use PyCharm for remote debugging of the Windows Subsystem for Linux (WSL) on a local Windows machine. It meticulously outlines several critical steps in the process, starting with the crucial task of enabling permissions, followed by the installation and configuration of WSL.
The guide then proceeds to explain how to set up the SSH service within the WSL environment, an integral part of the process. Alongside this, it also provides detailed instructions on how to modify the inbound rules of the Windows firewall to facilitate the process, ensuring that there are no connectivity issues that could potentially hinder the debugging process.
The document further emphasizes on the importance of checking the connection between the Windows and WSL environments, providing instructions on how to ensure that the connection is optimal and ready for remote debugging.
It also offers an in-depth guide on how to configure the WSL interpreter and files within the PyCharm environment. This is essential for ensuring that the debugging process is set up correctly and that the program can be run effectively within the WSL terminal.
Additionally, the document provides guidance on how to set up breakpoints for debugging, a fundamental aspect of the debugging process which allows the developer to stop the execution of their code at certain points and inspect their program at those stages.
Finally, the document concludes by providing a link to a reference blog. This blog offers additional information and guidance on configuring the remote Python interpreter in PyCharm, providing the reader with a well-rounded understanding of the process.
Advanced control scheme of doubly fed induction generator for wind turbine us... - IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Mechanical Engineering on AAI Summer Training Report-003.pdf
Kailash(13EC35032)_mtp.pptx
1. By
Kailash Chand Meena
(13EC35032)
under the supervision of
Prof. Santanu Chattopadhyay
Department of Electronics and Electrical Communication
Engineering
IIT Kharagpur
2. 1. Introduction:
Application mapping is one of the most important dimensions in Network-on-Chip (NoC) research. It affects the overall performance and power requirement of the system.
Rapid progress in technology scaling makes transistors smaller and faster over successive generations; consequently, the number of IP cores in a system increases, but the power consumption of a transistor no longer scales in proportion.
The increasing number of IP cores in a multi-processor system-on-chip makes it more challenging for NoC application mapping to find the optimum core-to-router mapping.
A significant proportion of the power consumed gets directly dissipated as heat, and the resulting increase in power density can lead to thermal hotspots.
Application mapping, with its ability to spread out high-power components, can potentially be a good approach to mitigate the looming issue of hotspots in many-core processors.
3. Terminology in Application Mapping
Application: An application consists of a set of tasks, each of which is implemented by an IP core.
IP Cores: Functional modules of the NoC are known as intellectual property (IP) cores.
Hopcount: Distance is measured in terms of hop count when transmitting a message from the source router to the destination router through the router fabric.
Core Graph: An application can be represented in the form of a core graph, with each vertex representing an IP core and each directed edge representing the communication between cores. For example, the video application VOPD (video object plane decoder) consists of 16 cores and DVOPD (dual video object plane decoder) consists of 32 cores.
6. Mesh Topology:
• The mesh topology is one of the most common network topologies because it provides a regular structure with short interconnects, a high bisection width, and a modular architecture for the NoC with equal-sized links.
7. 2. What is the Application Mapping Problem?
The core graph of an application is a directed graph CG(C, E), with each vertex c_i ∈ C representing a core and each directed edge e_{i,j} ∈ E representing the communication between cores c_i and c_j. The bandwidth requirement of the communication from c_i to c_j is the weight of the edge e_{i,j} and is denoted by comm_{i,j}.
The NoC topology graph is a directed graph TG(T, G), with each vertex t_i ∈ T representing a node in the topology and each directed edge g_{i,j} representing a physical link between the vertices t_i and t_j. The weight of the edge g_{i,j}, denoted bw_{i,j}, represents the bandwidth across the edge g_{i,j}.
A mapping of the core graph CG(C, E) onto the topology graph TG(T, G) is defined by the function map: CG → TG such that ∀ c_i ∈ C, ∃ t_j ∈ T with map(c_i) = t_j.
The quality of such a mapping is defined in terms of the total communication cost of the application under this mapping. The communication between each pair of cores can be treated as the flow of a single commodity d_k, k = 1, 2, ..., |E|.
The value of commodity d_k, corresponding to the communication between cores c_i and c_j, is equal to comm_{i,j}, the bandwidth requirement. The quantity X_k(i, j), indicating the value of commodity d_k flowing through link (t_i, t_j), is given by:
X_k(i, j) = value(d_k), if link (t_i, t_j) ∈ Path(source(d_k), destination(d_k))
X_k(i, j) = 0, otherwise
8. Contd.
To ensure that the bandwidth does not exceed the limits of individual links, the following constraint must be satisfied:
Σ_{k=1}^{|E|} X_k(i, j) ≤ bw_{i,j}, ∀ i, j ∈ {1, 2, ..., |T|}.
The communication cost between cores c_i and c_j is measured by:
Commcost_{i,j} = comm_{i,j} × MD(map(c_i), map(c_j))
where MD is the Manhattan distance (hop count) between the mapped tiles.
The total communication cost of a mapping solution is calculated as:
CommCost = Σ_{(c_i, c_j) ∈ E} Commcost(c_i, c_j)
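As a concrete illustration, the cost metric above can be sketched in Python, assuming XY routing on a 2D mesh so that hop count equals Manhattan distance; the core graph and mapping below are small hypothetical examples, not benchmark data.

```python
# Sketch of CommCost = sum of comm_{i,j} * MD(map(ci), map(cj)),
# assuming hop count equals Manhattan distance on a 2D mesh.

def manhattan_distance(tile_a, tile_b, mesh_cols):
    """Hop count between two tiles identified by router index (0-based)."""
    ax, ay = tile_a % mesh_cols, tile_a // mesh_cols
    bx, by = tile_b % mesh_cols, tile_b // mesh_cols
    return abs(ax - bx) + abs(ay - by)

def comm_cost(core_graph, mapping, mesh_cols):
    """Total cost over all core-graph edges.

    core_graph: dict {(ci, cj): comm_ij bandwidth}
    mapping:    dict {core: router index}
    """
    return sum(bw * manhattan_distance(mapping[ci], mapping[cj], mesh_cols)
               for (ci, cj), bw in core_graph.items())

# Hypothetical 4-core application mapped onto a 2x2 mesh.
cg = {(0, 1): 100, (1, 2): 50, (2, 3): 70}
mapping = {0: 0, 1: 1, 2: 3, 3: 2}
print(comm_cost(cg, mapping, mesh_cols=2))  # 100*1 + 50*1 + 70*1 = 220
```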
9. 3. Problem Statement:
Given the properties of the application (in terms of its core graph) and the NoC architecture (in terms of its topology graph), the optimum association between routers and cores has to be determined such that the weighted communication cost (BW × hop count) of the application and the peak temperature of the chip remain minimum under a given routing mechanism.
The following are the inputs to the problem:
1. A task graph CG, representing the application.
2. A topology graph TG, corresponding to the 2D NoC.
3. Power profile of each core.
4. Power profile of each router and link.
5. Floorplan for the NoC.
A core, together with its corresponding router, forms a tile. The tiles are identified by the router's ID, so each tile has an associated power profile, governed by the associated IP core, router, and links.
The above-mentioned problem has been solved using the Genetic Algorithm (GA).
10. 4. Why Genetic Algorithm (GA)?
GA offers several advantages over other stochastic strategies for the optimization of the application mapping problem, such as Simulated Annealing (SA) and Ant Colony Optimization (ACO).
In GA optimization, multiple solutions co-exist at any stage of the process, whereas SA progresses with only one solution. GA solutions are generally produced faster than those of SA and ACO, which use only a limited population and resources.
The proposed GA-based approach combines a local search method with a global (guided) search method to balance exploration and exploitation.
In the GA approach, chromosomes (mapping solutions) do not die, because the local best of a chromosome remains attached to that chromosome and gets updated whenever the chromosome identifies a better solution.
In SA, by contrast, the population moves together in an unguided search and some solutions are filtered out by the selection criteria. Similarly, in ACO, random paths are selected for each ant (solution), and because of that the solution takes time to converge.
11. 5. GA Formulation of the Application Mapping Problem:
5.1. Chromosome structure and initial population generation:
The length of each chromosome is equal to the number of vertices in the core graph, and the chromosome is encoded as an integer string.
Each gene (vertex in the core graph) in the chromosome contains an integer indicating a randomly chosen node in the mesh topology, and vertices cannot overlap each other.
A chromosome can efficiently be represented as a 1D array, in which the indices represent the router numbers and the values of the cells represent the core associated with the corresponding router. Thus, a chromosome is a permutation of the core numbers in the core graph.
Router: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Core: 16 4 3 2 14 5 6 1 13 12 7 9 15 11 8 10
12. Chromosome structure and corresponding NoC Mapping
A chromosome can conveniently be viewed as a 1D array in which chromosome[i] records the core mapped to the i-th router or node.
Router: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Core: 16 4 3 2 14 5 6 1 13 12 7 9 15 11 8 10
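The chromosome encoding above can be sketched in Python as a random permutation of core IDs indexed by router number; the `random_chromosome` helper is an illustrative assumption, and the example array reproduces the chromosome shown on the slide.

```python
# A chromosome as a permutation of core IDs: chromosome[i] is the core
# mapped to router i (1-based core IDs stored in a 0-based list).
import random

def random_chromosome(num_cores, rng=random):
    """Generate a random valid mapping (no core appears twice)."""
    chrom = list(range(1, num_cores + 1))
    rng.shuffle(chrom)
    return chrom

# The example chromosome from the slide: router 1 holds core 16, etc.
example = [16, 4, 3, 2, 14, 5, 6, 1, 13, 12, 7, 9, 15, 11, 8, 10]
assert sorted(example) == list(range(1, 17))  # a valid permutation
```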
13. 5.2. Evaluation of the fitness value of a chromosome by calculating the objective function:
• The communication cost between cores c_i and c_j is measured by:
Commcost_{i,j} = comm_{i,j} × MD(map(c_i), map(c_j))
• The total communication cost of a mapping solution is calculated as:
CommCost = Σ_{(c_i, c_j) ∈ E} Commcost(c_i, c_j)
• F_obj[i] = CommCost
• Fitness of the i-th chromosome:
Fitness[i] = 1 / (1 + F_obj[i])
5.3. Chromosome Selection for the Next Generation using the Roulette Wheel:
• The fitness probability for the i-th chromosome is:
P[i] = Fitness[i] / Σ_{i=1}^{N} Fitness[i]
• The cumulative probability for the k-th chromosome is:
C[k] = Σ_{i=1}^{k} P[i]
14. Contd.
Algorithm for the roulette wheel selection process:
begin
  k ← 0;
  while (k < population_size) do
    R[k] ← random(0, 1);
    for (i = 0 to population_size) do
      if (R[k] < C[i]) then
        chromosome[k] ← chromosome[i];
        break;
      end;
      i = i + 1;
    end;
    k = k + 1;
  end;
end;
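The selection scheme on slides 13-14 can be sketched as follows; the objective values, seed, and draw count are illustrative assumptions used only to show that lower-cost chromosomes are picked more often.

```python
# Roulette-wheel selection: fitness[i] = 1/(1 + F_obj[i]), probability
# proportional to fitness, cumulative probabilities C[k], uniform draw R.
import random

def roulette_select(objectives, rng=random):
    fitness = [1.0 / (1.0 + f) for f in objectives]
    total = sum(fitness)
    probs = [f / total for f in fitness]
    # Build the cumulative probability table C[k].
    cum, acc = [], 0.0
    for p in probs:
        acc += p
        cum.append(acc)
    r = rng.random()
    for i, c in enumerate(cum):
        if r < c:
            return i
    return len(cum) - 1  # guard against floating-point round-off

# Lower communication cost -> higher fitness -> selected more often.
costs = [220, 500, 1000]
counts = [0, 0, 0]
rng = random.Random(0)
for _ in range(10000):
    counts[roulette_select(costs, rng)] += 1
print(counts)  # index 0 (lowest cost) is selected most often
```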
15. 5.4. Crossover Operation over Chromosomes (Solutions):
For the crossover process, floating-point random numbers between 0 and 1 are generated. Chromosome k is selected as a parent if R[k] < crossover rate.
After a chromosome is selected as a parent, the position of the crossover point is determined by generating a random integer between 1 and (number of cores in the core graph − 1).
Algorithm:
begin
  k ← 0;
  while (k < population_size) do
    R[k] ← random(0, 1);
    if (R[k] < crossover_rate) then
      select chromosome[k] as parent;
    k = k + 1;
  end;
end;
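A sketch of parent selection and single-point crossover as on slide 15. Since a chromosome is a permutation, a plain cut-and-swap would duplicate cores; the repair step below (filling the tail with the other parent's genes in order) is my assumption, as the slides leave offspring repair unspecified.

```python
# Parent selection by crossover rate, then single-point crossover with
# an order-crossover-style repair to keep the child a valid permutation.
import random

def select_parents(population, crossover_rate, rng=random):
    """Chromosome k becomes a parent if R[k] < crossover_rate."""
    return [ch for ch in population if rng.random() < crossover_rate]

def crossover(parent_a, parent_b, rng=random):
    n = len(parent_a)
    point = rng.randint(1, n - 1)   # crossover point in [1, n-1]
    head = parent_a[:point]
    # Repair: take the remaining genes in parent_b's order (assumption).
    tail = [g for g in parent_b if g not in head]
    return head + tail

a = [1, 2, 3, 4, 5, 6]
b = [6, 5, 4, 3, 2, 1]
child = crossover(a, b, random.Random(1))
assert sorted(child) == sorted(a)   # still a valid permutation
```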
16. 5.5. Mutation Operation over Chromosomes:
The number of chromosomes that undergo mutation in the population is determined by the mutation rate parameter.
In the mutation process, two randomly selected genes within a chromosome are exchanged.
Total_members = number of cores in a chromosome × population size.
The mutation process is done by generating a random integer between 1 and Total_members. The generated number marks the position of the gene to be mutated.
Number of mutations = mutation rate × Total_members
Algorithm:
begin
  k ← 0;
  while (k < number_of_mutations) do
    R[k] ← random integer in [1, Total_members];
    a ← quotient of (R[k] / core_num);
    select chromosome[a] for mutation;
    b ← remainder of (R[k] / core_num);
    select position b in chromosome[a] for mutation;
    k = k + 1;
  end;
end;
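The mutation step can be sketched as a swap mutation; the choice of the second swap position is my assumption (the slide says two randomly selected genes are exchanged but does not fix how the partner is chosen).

```python
# Swap mutation following slide 16: a random integer in [1, Total_members]
# is decoded into (chromosome index, gene position) by quotient/remainder,
# and that gene is swapped with a randomly chosen partner gene.
import random

def mutate(population, mutation_rate, rng=random):
    core_num = len(population[0])
    total_members = core_num * len(population)
    num_mutations = int(mutation_rate * total_members)
    for _ in range(num_mutations):
        r = rng.randint(1, total_members)
        a = (r - 1) // core_num          # which chromosome to mutate
        b = (r - 1) % core_num           # which gene position to mutate
        other = rng.randrange(core_num)  # swap partner (assumption)
        chrom = population[a]
        chrom[b], chrom[other] = chrom[other], chrom[b]

pop = [[1, 2, 3, 4], [4, 3, 2, 1]]
mutate(pop, 0.5, random.Random(2))
for ch in pop:
    assert sorted(ch) == [1, 2, 3, 4]    # swaps keep permutations valid
```

Swapping (rather than overwriting) is what keeps every chromosome a valid core permutation after mutation.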
17. 6. Control over GA Iterations:
In this approach, the GA has been run several times to improve upon the best solution (gsuper) found in previous iterations. At the end of the n-th iteration of the GA, let the best solution for the k-th chromosome found in this iteration be lbest_n^k, and the best solution found in the previous n iterations be gsuper_n.
In the (n + 1)-th iteration, the GA starts with a new set of chromosomes. However, the lbest_n^k and gsuper_n solutions are passed on from the n-th to the (n + 1)-th iteration of the GA.
The maximum number of GA runs has been set as follows:
1. Either the number of GA iterations exceeds a user-defined value; for this work, this limit is set to 1000.
2. Or the fitness of the solution gsuper_n found in previous iterations does not change in the last 30 runs.
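The restart control above can be sketched as a driver loop that carries the best-so-far solution (gsuper) across runs and applies both stopping criteria; `run_ga_once` is a hypothetical stand-in for one full GA pass, not the thesis implementation.

```python
# Outer control loop from slide 17: stop after a fixed iteration budget
# (1000) or after 30 consecutive runs without improving gsuper.
def optimize(run_ga_once, max_iters=1000, stall_limit=30):
    g_super, best_fit = None, float("-inf")
    stall = 0
    for _ in range(max_iters):
        solution, fit = run_ga_once(g_super)  # pass gsuper into next run
        if fit > best_fit:
            g_super, best_fit, stall = solution, fit, 0
        else:
            stall += 1
        if stall >= stall_limit:
            break
    return g_super, best_fit

# Toy GA pass whose fitness stops improving after a few runs.
import itertools
fits = itertools.chain([0.1, 0.5, 0.9], itertools.repeat(0.9))
sol, fit = optimize(lambda prev: ("map", next(fits)))
print(fit)  # 0.9
```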
18. 7. Genetic Algorithm Formulation of Temperature-Aware Mapping:
7.1. Temperature Calculation:
The primary source of heat generation in a chip is the energy dissipation of
the tiles present in the silicon layer.
The heat generated in the silicon layer flows towards the heat sink through
the primary heat transfer path (PHTP): Silicon layer → Thermal Interface
layer → Heat Spreader → Heat Sink.
Each of these layers is divided into several smaller blocks, as in the block
model of Hotspot.
Each block in the Si-layer is taken to correspond to a tile of the NoC;
thus, if the NoC contains n tiles, the Si-layer is divided into n blocks.
The other layers in the PHTP, directly below the Si-layer, are divided into
n blocks in the same way, giving a total of (4 × n) blocks in the thermal
model.
In addition to these 4n blocks, the Heat Spreader layer contains 4 extra
peripheral blocks and the Heat Sink layer contains 8 extra peripheral
blocks. Hence the total number of blocks in the thermal model of the chip
(tot_blk) is (4 × n + 12).
The compact thermal model (CTM) works on the principle of duality between
thermal and electrical quantities.
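The block count follows directly from the layer structure; a trivial sketch (the function name is illustrative):

```python
def total_blocks(n_tiles):
    """tot_blk = 4 layers of n blocks each, plus 4 peripheral blocks
    in the heat spreader and 8 in the heat sink."""
    return 4 * n_tiles + 12
```

For example, a 4 × 4 mesh (n = 16) yields 76 blocks in the thermal model.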
19. Contd.
Thermal resistances along the x, y and z directions:
TRx = (1 / k_layer) × (0.5 × Dx / (Dy × Dz))
TRy = (1 / k_layer) × (0.5 × Dy / (Dz × Dx))
TRz = (1 / k_layer) × (0.5 × Dz / (2 × Dx × Dy))
The following equation is solved to determine the temperature matrix
[T]_(tot_blk×1):
[C]_(tot_blk×tot_blk) × [T]_(tot_blk×1) = [P]_(tot_blk×1)
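The per-block resistance formulas above translate directly to code. A minimal sketch, assuming block dimensions Dx, Dy, Dz in metres and layer thermal conductivity k_layer in W/(m·K); this is an illustration of the formulas, not the Hotspot implementation:

```python
def thermal_resistances(k_layer, dx, dy, dz):
    """Thermal resistances of one block along x, y and z,
    per the slide formulas."""
    tr_x = (1.0 / k_layer) * (0.5 * dx / (dy * dz))
    tr_y = (1.0 / k_layer) * (0.5 * dy / (dz * dx))
    tr_z = (1.0 / k_layer) * (0.5 * dz / (2.0 * dx * dy))
    return tr_x, tr_y, tr_z
```

By the thermal-electrical duality, these resistances populate the conductance matrix [C] (conductance ≈ 1/R), the power dissipations form [P] (≈ current sources), and solving [C][T] = [P] gives the block temperatures (≈ node voltages).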
20. 7.2 Fitness Calculation:
The fitness of each chromosome is evaluated using the following
expression:
Fitness = w × (CommCost / CommCost_max) + (1 − w) × (T_PeakChip / T_MaxMap)
When w = 0, minimizing the fitness minimizes the chip temperature; when
w = 1, it minimizes the communication cost.
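The fitness expression maps directly to code; a minimal sketch with illustrative parameter names:

```python
def fitness(comm_cost, comm_cost_max, t_peak_chip, t_max_map, w):
    """Weighted sum of normalized communication cost and normalized
    peak chip temperature; a lower value means a better mapping."""
    return (w * (comm_cost / comm_cost_max)
            + (1.0 - w) * (t_peak_chip / t_max_map))
```

Both terms are normalized to [0, 1] by their respective maxima, so the weight w alone controls the trade-off between the two objectives.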
21. 8. Simulation Results:
8.1. Comparison of Communication Cost for Benchmark Applications:
The applications are mapped onto 2-D mesh structures with the mesh sizes
noted in Table I.
TABLE I
NoC Benchmarks and Their Mesh Sizes
Benchmark NoCs    No. of Cores    2-D Mesh Size
DVOPD             32              8 × 4
VOPD              16              4 × 4
MPEG-4            12              4 × 4
PIP               8               4 × 2
MWD               12              4 × 4
263ENC MP3DEC     12              4 × 4
MP3ENC MP3DEC     13              4 × 4
263DEC MP3DEC     14              4 × 4
23. 8.2. Latency and Throughput for Benchmark Applications:
The SystemC-based Noxim simulator was used to calculate network latency and
throughput.
TABLE III
Noxim Settings
Parameters                          Values
Buffer Depth                        6
Minimum and Maximum Packet Size     64 flits (32 bits per flit)
Routing                             Dimension-ordered (XY)
Selection Logic                     Random
Warm-up Time                        10000 clock cycles
Simulation Time                     20000 clock cycles
Traffic                             Table-based
28. Contd.
To check the applicability of the GA-based thermal-aware mapping approach on
a larger scale, a few task graphs are generated using the TGFF tool.
TABLE VIII
Communication Cost and Peak Temperature Reduction for Different TGFF Task Graphs
Task Graphs    Comm_Cost (Hops × BW)    Peak Temp. Reduction (Kelvin)
Graph111       124732.77                92.55
Graph112       718853.43                92.09
Graph113       876083.87                96.97
Graph114       182443.65                92.56
Graph115       160572.93                94.38
Graph116       20306.87                 92.37
Graph117       20306.87                 97.57
Graph118       221245.67                90.66
29. 8.6. Trading off Communication Cost and Peak Temperature:
A trade-off is established between NoC peak temperature and communication
cost. The figure below shows the trade-off between communication cost and
peak temperature for the benchmark application VOPD.
30. 8.7. Imposing Thermal Safety by Temperature Constraints:
In this experiment, thermal safety is imposed by taking the peak temperature
as a constraint. The experiment finds the mapping solution that fits the
given temperature budget.
TABLE IX
Communication Cost and Peak Temperature Constraints
NoC Benchmark Applications
VOPD                                         DVOPD
Tcons (Kelvin)  Comm_Cost  Tpeak (Kelvin)    Tcons (Kelvin)  Comm_Cost  Tpeak (Kelvin)
361             3612       358.87            360             9427       356.38
359             4888       356.26            356             10486      354.23
356             4899       351.07            359             10510      357.10
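The constrained experiment above amounts to selecting, among candidate mappings, the cheapest one whose peak temperature stays within the budget Tcons. A minimal sketch of that selection rule (an assumed illustration, not the thesis code):

```python
def best_under_constraint(candidates, t_cons):
    """From (comm_cost, t_peak) candidate mappings, return the
    lowest-cost one whose peak temperature satisfies t_cons,
    or None if no candidate is thermally feasible."""
    feasible = [c for c in candidates if c[1] <= t_cons]
    return min(feasible, key=lambda c: c[0]) if feasible else None
```

Tightening Tcons shrinks the feasible set, which is why the table shows communication cost rising as the temperature budget is lowered.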
31. 8.8. Dynamic Simulation of Thermal-Aware Mapping:
The Noxim simulator has been used for dynamic simulation. Any NoC is
expected to have high throughput and low latency.
TABLE X
Throughput and Latency of NoC Benchmarks
Benchmark NoCs    Throughput (Flits/Cycle)    Latency (Cycles)
DVOPD             83735.70                    0.53
VOPD              82398.14                    0.57
MPEG-4            79998.50                    0.59
PIP               89475.20                    0.63
MWD               89963.70                    0.61
263ENC-MP3DEC     81997.10                    0.58
32. 9. Conclusions:
The proposed mapping approach produces a reasonable improvement in
communication cost compared to some of the previously reported strategies.
It can be noted from the simulation results that the proposed strategy
performs better than NMAP for NoCs with a higher number of cores.
The communication model used in the proposed approach assumes that every
router takes the same amount of time to traverse. In practice, this may not
be true.
The proposed thermal-aware mapping approach has been found to improve both
the communication cost and the peak temperature of the chip.
A trade-off has also been established between communication cost and peak
temperature, so that designers can choose the solution that best suits their
requirements.
Experimental results show that the proposed thermal-aware mapping approach
outperforms many contemporary approaches reported in the literature.
33. 10. Future Scope:
The proposed mapping strategy can be extended to joint mapping and routing
for NoC architectures with other network topologies, such as Ring and Torus.
The proposed thermal-aware mapping approach can be extended to 3-D
structured mapping strategies, targeting fault-tolerant and
reliability-aware mapping techniques for 2-D as well as 3-D NoC
environments.
34. 11. References:
[1] S. Murali and G. De Micheli, "Bandwidth-constrained mapping of cores
onto NoC architectures," in Proc. Design, Automation and Test in Europe
Conf. and Exhibition (DATE), vol. 2, Feb. 2004, pp. 896–901.
[2] P. K. Sahu, K. Manna, T. Shah and S. Chattopadhyay, "A constructive
heuristic for application mapping onto mesh-based network-on-chip," Journal
of Circuits, Systems, and Computers, vol. 24, no. 8, 2015, 1550126 (29
pages).
[3] P. K. Sahu and S. Chattopadhyay, "A survey on application mapping
strategies for network-on-chip design," J. Syst. Archit., vol. 59, 2013,
pp. 60–76.
[4] P. K. Sahu, T. Shah, K. Manna and S. Chattopadhyay, "Application mapping
onto mesh-based network-on-chip using discrete particle swarm optimization,"
IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 22, no. 2,
Feb. 2014.
[5] J. Hu and R. Marculescu, "Energy-aware mapping for tile-based NoC
architectures under performance constraints," in Proc. Asia South Pacific
Des. Autom. Conf., 2003, pp. 233–239.
[6] M. Moazzen, A. Reza and M. Reshadi, "CoolMap: A thermal-aware mapping
algorithm for application-specific networks-on-chip," in Proc. Euromicro
Conf. Digital Syst. Des., Sep. 2012, pp. 731–734.
[7] D. Zhu, L. Chen, T. Pinkston and M. Pedram, "TAPP: Temperature-aware
application mapping for NoC-based many-core processors," in Proc. Des.,
Autom. Test Eur. (DATE), 2015, pp. 1241–1244.
[8] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron and
M. Stan, "HotSpot: A compact thermal modeling methodology for early-stage
VLSI design," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14,
no. 5, 2006, pp. 501–513.
[9] Noxim User Guide,
http://mehransoft.ir/wp-content/uploads/2014/05/Noxim_User_Guide.pdf