This paper investigates methods for Molecular Dynamics (MD) simulation, a widely used computer simulation approach for studying the properties of molecular systems. Force calculation in MD is computationally intensive, and parallel programming techniques can be applied to accelerate it. The major aim of this paper is to speed up MD simulation calculations using the General-Purpose Graphics Processing Unit (GPU) computing paradigm, an efficient and economical approach to parallel computing. To that end, we propose a method called cell charge approximation for treating the electrostatic interactions in MD simulations. This method reduces the complexity of the force calculations.
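The abstract gives no implementation details, but the general idea of a cell-based charge approximation can be sketched as follows; the 1D geometry, unit Coulomb constant, and all names here are illustrative assumptions, not the paper's actual method:

```python
def cell_charge_potential(positions, charges, cell_size):
    """Approximate the pairwise electrostatic energy by lumping the total
    charge of each cell at the cell centre and interacting cell-to-cell,
    turning an O(N^2) particle sum into an O(C^2) cell sum."""
    # Bin particles into cells and accumulate the total charge per cell.
    cells = {}
    for x, q in zip(positions, charges):
        idx = int(x // cell_size)
        cells[idx] = cells.get(idx, 0.0) + q
    # Interact distinct cell pairs (unit Coulomb constant for brevity).
    energy = 0.0
    keys = sorted(cells)
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            r = abs(a - b) * cell_size  # centre-to-centre distance
            energy += cells[a] * cells[b] / r
    return energy
```

In a full MD code, interactions within the same or neighbouring cells would still be computed particle-by-particle; only the far field is approximated in this lumped fashion.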
MEDIAN BASED PARALLEL STEERING KERNEL REGRESSION FOR IMAGE RECONSTRUCTION (csandit)
Image reconstruction is the process of recovering the original image from corrupted data. Applications of image reconstruction include computed tomography, radar imaging, and weather forecasting. Recently, the steering kernel regression method has been applied to image reconstruction [1]. This technique has two major drawbacks. First, it is computationally intensive. Second, the output of the algorithm suffers from spurious edges, especially in the case of denoising. We propose a modified version of steering kernel regression called the Median Based Parallel Steering Kernel Regression Technique. In the proposed algorithm, the first problem is overcome by implementing it on GPUs and multi-cores. The second problem is addressed by a gradient-based suppression in which a median filter is used. Our algorithm gives better output than steering kernel regression; the results are compared using the root mean square error (RMSE). Our algorithm has also shown a speedup of 21x on GPUs and 6x on multi-cores.
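As a rough illustration of two ingredients named above, a median filter (shown in 1D for brevity) and the RMSE metric might look like this; the function names and edge handling are illustrative, not the paper's implementation:

```python
def median_filter_1d(signal, window=3):
    """Replace each sample with the median of its local window; the window
    is clamped at the signal boundaries.  Median filtering suppresses
    isolated outliers (spurious edges) while preserving true steps."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        win = sorted(signal[lo:hi])
        out.append(win[len(win) // 2])
    return out

def rmse(a, b):
    """Root mean square error, the comparison metric used in the paper."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5
```

A single impulsive outlier is removed entirely by the filter, which is the behaviour the gradient-based suppression step relies on.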
Optimum capacity allocation of distributed generation units using parallel ps... (eSAT Journals)
Abstract This paper proposes the application of the Parallel Particle Swarm Optimization (PPSO) technique to find the optimal sizing of multiple Distributed Generation (DG) units in a radial distribution network, reducing real power losses and enhancing the voltage profile. The Message Passing Interface (MPI) is used to parallelize PSO: the initial PSO population is divided between the processors at run time. The proposed technique is tested on the standard 123-bus test system, and the results show that the simulation time is significantly reduced, leading to the conclusion that parallelization enhances the performance of basic PSO. The procedure has been implemented in an environment in which OpenDSS (Open Distribution System Simulator) is driven from MATLAB. An adaptive-weight particle swarm optimization algorithm was developed in MATLAB, parallelization was achieved using MatlabMPI, and the unbalanced three-phase distribution load flow (DLF) was performed using the Electric Power Research Institute's (EPRI) open-source tool OpenDSS. Index Terms: Distributed Generation, Message Passing Interface, Optimal Placement, Parallel Particle Swarm Optimisation
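A minimal sketch of the two mechanisms described above, splitting the swarm across processors and the inertia-weight velocity update; the paper itself works in MATLAB with MatlabMPI, so the language, names, and default coefficients here are illustrative assumptions:

```python
def split_population(swarm, n_procs):
    """Divide the PSO population into nearly equal chunks, one per
    process (the role MPI ranks play in the parallelized PSO)."""
    size, rem = divmod(len(swarm), n_procs)
    chunks, start = [], 0
    for r in range(n_procs):
        end = start + size + (1 if r < rem else 0)
        chunks.append(swarm[start:end])
        start = end
    return chunks

def update_velocity(v, x, pbest, gbest, w, c1=2.0, c2=2.0, r1=0.5, r2=0.5):
    """Standard PSO velocity update; w is the (adaptive) inertia weight.
    r1, r2 are normally uniform random draws, fixed here for clarity."""
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
```

Each process would evaluate the fitness of its chunk independently and the ranks would then exchange their best particles to update the global best.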
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of engineering and technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of engineering and technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of engineering and technology.
Power system transient stability margin estimation using artificial neural ne... (elelijjournal)
This paper presents a methodology for estimating the normalized transient stability margin using a multilayer perceptron (MLP) neural network. The complex relationship between the input and output variables is captured by the neural network: the MLP establishes the nonlinear mapping between the normalized transient stability margin and the operating conditions of the power system. To obtain the training set for the neural network, the potential energy boundary surface (PEBS) method is used along with time-domain simulation. The proposed method is applied to the IEEE 9-bus system, and the results show that it provides a fast and accurate tool for online transient stability assessment.
CONFIGURABLE TASK MAPPING FOR MULTIPLE OBJECTIVES IN MACRO-PROGRAMMING OF WIR... (ijassn)
Macro-programming is a new-generation method of using wireless sensor networks (WSNs) in which application developers extract data from sensor nodes through a high-level abstraction of the system. Instead of developing the entire application, a task-graph representation of the WSN model offers a simplified approach to data collection. However, mapping tasks onto sensor nodes raises several problems in energy consumption and routing delay. In this paper, we present an efficient hybrid task-mapping approach for WSNs, a Hybrid Genetic Algorithm, that considers multiple optimization objectives: energy consumption, routing delay, and a soft real-time requirement. We also present a method to configure the algorithm to the user's needs by changing the heuristics used for optimization. A trade-off analysis between energy consumption and delivery delay was performed, and simulation results are presented. The algorithm is applicable during macro-programming, enabling developers to choose a better mapping according to their application requirements.
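A configurable multi-objective fitness of the kind described above is often built as a weighted sum with a penalty for missing the soft deadline; the weights, penalty term, and names below are illustrative assumptions, not the paper's actual fitness function:

```python
def mapping_fitness(energy, delay, deadline, w_energy=0.5, w_delay=0.5,
                    penalty=1000.0):
    """Weighted-sum cost of a candidate task mapping; lower is better.
    The weights make the optimisation configurable, and the penalty term
    enforces the soft real-time requirement when the deadline is missed."""
    cost = w_energy * energy + w_delay * delay
    if delay > deadline:  # soft deadline violated
        cost += penalty * (delay - deadline)
    return cost
```

A genetic algorithm would minimise this cost over candidate task-to-node assignments, and changing the weights shifts the trade-off between energy and delay.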
Performance analysis of real-time and general-purpose operating systems for p... (IJECEIAES)
In general, modern operating systems can be divided into two essential classes: real-time operating systems (RTOS) and general-purpose operating systems (GPOS). The main difference between them is whether the system is time-critical. In a GPOS, a high-priority thread cannot preempt a kernel call; in an RTOS, a low-priority task is preempted by a high-priority task when necessary, even while it is executing a kernel call. Most Linux distributions can be used as both a GPOS and an RTOS with kernel modifications. In this study, two Linux distributions, Ubuntu and Pardus, were analyzed and their performances compared, both as a GPOS and as an RTOS, for path planning of multi-robot systems. Robot groups with different numbers of members performed path-tracking tasks using both Ubuntu and Pardus in each configuration. In this way, the performance of the two Linux distributions in robotic applications was observed and compared in both forms, GPOS and RTOS.
Comparative study to realize an automatic speaker recognition system (IJECEIAES)
In this research, we present an automatic speaker recognition system based on adaptive orthogonal transformations. To obtain informative features of minimum dimension from the input signals, we created an adaptive operator that helps to identify the speaker's voice quickly and efficiently. We test the efficiency and performance of our method by comparing it with another approach, mel-frequency cepstral coefficients (MFCCs), which is widely used by researchers for feature extraction. The experimental results show the importance of the adaptive operator, which adds value to the proposed approach. The system achieved 96.8% accuracy using the Fourier transform as a compression method and 98.1% using correlation as a compression method.
ANALYSIS OF THRESHOLD BASED CENTRALIZED LOAD BALANCING POLICY FOR HETEROGENEO... (ijait)
Heterogeneous machines can perform significantly better than homogeneous machines, but an effective workload distribution policy is required. Maximum performance is realized when the system designer overcomes load-imbalance conditions within the system. Load distribution and load balancing policies together can reduce total execution time and increase system throughput.
In this paper, we provide an algorithmic analysis of a threshold-based job allocation and load balancing policy for heterogeneous systems, in which all incoming jobs are judiciously and transparently distributed among the sharing nodes on the basis of the jobs' requirements and processor capability, to maximize performance and reduce execution time. A brief discussion of the job allocation, transfer, and location policies is given, with an explanation of how load-imbalance conditions are resolved within the system. A flow of the scheme is given with essential code, and an analysis of the algorithm shows why it performs better.
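One plausible sketch of a threshold-based allocation decision of the kind analyzed above; the exact policy, the utilisation measure, and all names are illustrative assumptions rather than the paper's algorithm:

```python
def allocate_job(job_demand, loads, capacities, threshold):
    """Pick a node index for an incoming job: prefer the node whose
    post-allocation utilisation stays under the threshold and is lowest;
    if every node would exceed the threshold (load imbalance), fall back
    to the overall least-utilised node."""
    def util(i, extra=0.0):
        return (loads[i] + extra) / capacities[i]

    candidates = [i for i in range(len(loads))
                  if util(i, job_demand) <= threshold]
    pool = candidates if candidates else range(len(loads))
    return min(pool, key=lambda i: util(i, job_demand))
```

Heterogeneity enters through the per-node capacities: a faster node can absorb more load before crossing the same utilisation threshold.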
Real-time traffic sign detection and recognition using Raspberry Pi (IJECEIAES)
Nowadays, the number of road accidents in Malaysia is increasing rapidly. One way to reduce road accidents is the development of advanced driver assistance systems (ADAS). Several ADAS systems have been proposed, taking into consideration the delay tolerance and the accuracy of the system itself. In this work, a traffic sign recognition system has been developed to increase the safety of road users by installing the system inside the car for the driver's awareness. A TensorFlow-based algorithm is used for object recognition through machine learning because of its high accuracy. The algorithm is embedded in a Raspberry Pi 3 for processing and analysis to detect traffic signs in the real-time video from a Raspberry Pi NoIR camera. This work studies the accuracy, delay, and reliability of the developed system on the Raspberry Pi 3 processor under several scenarios related to the state of the environment and the condition of the traffic signs. A real-time testbed implementation considering twenty different traffic signs shows that the system achieves more than 90% accuracy and is reliable, with an acceptable delay.
An Adaptive Internal Model for Load Frequency Control using Extreme Learning ... (TELKOMNIKA JOURNAL)
As an important part of a power system, load frequency control must be equipped with a good controller to ensure frequency stability. In this paper, an Internal Model Control (IMC) scheme for Load Frequency Control (LFC) with an adaptive internal model is proposed. The effectiveness of the IMC controller has been tested on a three-area power system. Simulation results show that the proposed IMC with an Extreme Learning Machine (ELM) based adaptive model can accurately capture the power system dynamics. Furthermore, the proposed controller effectively reduces the frequency and mechanical power deviations under power system disturbances.
A study on dynamic load balancing in grid environment (IJSRD)
Grid computing pools computer resources from multiple locations to reach a common goal. It is distinguished from conventional high-performance computing in that grid resources are heterogeneous and more geographically dispersed than those of a cluster. One of the major issues in grid computing is load balancing. Load balancing schemes can be classified as static or dynamic, centralized or decentralized, and homogeneous or heterogeneous. Techniques such as Ant Colony Optimization, threshold-based balancing, and Optimal Heterogeneous scheduling have been used by researchers to balance load. This survey discusses a set of parameters for comparing the performance of each of them and indicates which technique is most useful in a grid environment.
(Paper) Task scheduling algorithm for multicore processor system for minimiz... (Naoki Shibata)
Shohei Gotoda, Naoki Shibata and Minoru Ito: "Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault," Proceedings of IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2012), pp. 260-267, DOI:10.1109/CCGrid.2012.23, May 15, 2012.
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the recovery time in the case of a single fail-stop failure of a multicore processor. Many recently developed processors have multiple cores on a single die, so one failure of a computing node results in the failure of many processors. In the case of a failure of a multicore processor, all tasks which have been executed on the failed multicore processor have to be recovered at once. The proposed algorithm is based on an existing checkpointing technique, and we assume that the state is saved when nodes send results to the next node. If a series of computations that depends on earlier results is executed on a single die, all parts of the series must be executed again when that processor fails. The proposed scheduling algorithm therefore tries not to concentrate tasks on the processors of a single die. We designed our algorithm as a parallel algorithm that achieves O(n) speedup, where n is the number of processors. We evaluated our method using simulations and experiments with four PCs. Compared with an existing scheduling method, in the simulation the execution time including recovery time in the case of a node failure is reduced by up to 50%, while the overhead in the case of no failure is a few percent in typical scenarios.
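The core idea of not concentrating a dependence chain on one die can be illustrated with a minimal round-robin sketch; this is a deliberate simplification, since the paper's actual scheduler also accounts for task dependencies, communication costs, and checkpoint placement:

```python
def spread_chain_across_dies(chain, n_dies):
    """Assign the tasks of a dependent chain round-robin across dies, so a
    single fail-stop die failure forces re-execution of at most
    ceil(len(chain) / n_dies) tasks instead of the whole chain."""
    return {task: i % n_dies for i, task in enumerate(chain)}
```

With two dies, a four-task chain loses at most two tasks to any single die failure, and the checkpoints taken at inter-node result transfers bound the work that must be redone.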
IJERA (International Journal of Engineering Research and Applications) is an international online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
A brief explanation of the tau-leaping process, parallel processing, and NVIDIA's CUDA architecture, and the use of cuTau-Leaping for the simulation of biological systems.
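For readers unfamiliar with the method, a bare-bones tau-leaping step looks like the following; this is a generic textbook sketch of the stochastic simulation technique, not cuTau-Leaping's CUDA implementation, and the function names are illustrative:

```python
import math
import random

def poisson(lam, rng):
    """Knuth's method for drawing a Poisson(lam) sample (fine for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def tau_leap_step(state, propensities, stoichiometry, tau, rng):
    """One tau-leaping step: reaction j fires Poisson(a_j * tau) times
    during the interval tau, and the species counts advance by the summed
    stoichiometric changes, instead of simulating each reaction event."""
    new_state = list(state)
    for a_j, v_j in zip(propensities(state), stoichiometry):
        k = poisson(a_j * tau, rng)
        for i, change in enumerate(v_j):
            new_state[i] += k * change
    return new_state
```

Because each reaction channel's firing count is sampled independently within a step, the channels map naturally onto parallel GPU threads, which is the source of the speedups a CUDA implementation targets.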
Comparison of Neural Network Training Functions for Hematoma Classification i... (IOSR Journals)
Classification is one of the most important tasks in the application areas of artificial neural networks (ANNs). Training neural networks is a complex task in the supervised learning field of research. The main difficulty in adopting ANNs is finding the most appropriate combination of learning, transfer, and training functions for the classification task. We compared the performance of three families of training algorithms in a feed-forward neural network for brain hematoma classification. In this work we selected gradient-descent-based backpropagation, gradient descent with momentum, and resilient backpropagation. Among conjugate-gradient-based algorithms, we selected scaled conjugate gradient backpropagation, conjugate gradient backpropagation with Polak-Ribière updates (CGP), and conjugate gradient backpropagation with Fletcher-Reeves updates (CGF). The last category is quasi-Newton-based algorithms, from which the BFGS and Levenberg-Marquardt algorithms were selected. The proposed work compares the training algorithms on the basis of mean squared error, accuracy, rate of convergence, and correctness of the classification. Our conclusions about the training functions are based on the simulation results.
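For reference, the weight-update rules of the first two training algorithms compared above can be sketched as follows; the learning rate and momentum values are illustrative defaults, not the settings used in the paper:

```python
def gd_step(w, grad, lr=0.1):
    """Plain gradient-descent weight update."""
    return w - lr * grad

def gd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """Gradient descent with momentum: the velocity term accumulates past
    gradients, which typically accelerates convergence along ravines in
    the error surface."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

def mse(targets, outputs):
    """Mean squared error, one of the comparison criteria used above."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)
```

The other families (conjugate gradient, quasi-Newton) replace the raw gradient direction with search directions built from gradient history or curvature estimates, at higher per-step cost.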
Comparative study to realize an automatic speaker recognition system IJECEIAES
In this research, we present an automatic speaker recognition system based on adaptive orthogonal transformations. To obtain the informative features with a minimum dimension from the input signals, we created an adaptive operator, which helped to identify the speaker’s voice in a fast and efficient manner. We test the efficiency and the performance of our method by comparing it with another approach, mel-frequency cepstral coefficients (MFCCs), which is widely used by researchers as their feature extraction method. The experimental results show the importance of creating the adaptive operator, which gives added value to the proposed approach. The performance of the system achieved 96.8% accuracy using Fourier transform as a compression method and 98.1% using Correlation as a compression method.
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
ANALYSIS OF THRESHOLD BASED CENTRALIZED LOAD BALANCING POLICY FOR HETEROGENEO...ijait
Heterogeneous machines can be significantly better than homogeneous machines but for that an effective workload distribution policy is required. Maximum realization of the performance can be achieved when system designer will overcome load imbalance condition within the system. Load
distribution and load balancing policy together can reduce total execution time and increase system throughput.
In this paper; we provide algorithm analysis of a threshold based job allocation and load balancing policy for heterogeneous system where all incoming jobs are judiciously and transparently distributed among sharing nodes on the basis of jobs’ requirement and processor capability for the maximization of performance and decline in execution time. A brief discussion of job allocation, transfer and location policy is given with explanation of how load imbalance condition is solved within the system. A flow of scheme is given with essential code and analysis of present algorithm is given to show how this algorithm is better.
Real-time traffic sign detection and recognition using Raspberry Pi IJECEIAES
Nowadays, the number of road accident in Malaysia is increasing expeditiously. One of the ways to reduce the number of road accident is through the development of the advanced driving assistance system (ADAS) by professional engineers. Several ADAS system has been proposed by taking into consideration the delay tolerance and the accuracy of the system itself. In this work, a traffic sign recognition system has been developed to increase the safety of the road users by installing the system inside the car for driver’s awareness. TensorFlow algorithm has been considered in this work for object recognition through machine learning due to its high accuracy. The algorithm is embedded in the Raspberry Pi 3 for processing and analysis to detect the traffic sign from the real-time video recording from Raspberry Pi camera NoIR. This work aims to study the accuracy, delay and reliability of the developed system using a Raspberry Pi 3 processor considering several scenarios related to the state of the environment and the condition of the traffic signs. A real-time testbed implementation has been conducted considering twenty different traffic signs and the results show that the system has more than 90% accuracy and is reliable with an acceptable delay.
An Adaptive Internal Model for Load Frequency Control using Extreme Learning ...TELKOMNIKA JOURNAL
As an important part of a power system, a load frequency control has to be prepared with a better controller to ensure internal frequency stability. In this paper, an Internal Model Control (IMC) scheme for a Load Frequency Control (LFC) with an adaptive internal model is proposed. The effectiveness of the IMC control has been tested in a three area power system. Results of the simulation show that the proposed IMC with Extreme Learning Machine (ELM) based adaptive model can accurately cover the power system dynamics. Furthermore, the proposed controller can effectively reduce the frequency and mechanical power deviation under disturbances of the power system.
A study on dynamic load balancing in grid environmentIJSRD
Grid computing is a collection of computer resources from multiple locations to reach a common goal. Grid computing distinguishes from conventional high performance computing systems that are heterogeneous and geographically dispersed than cluster computer. One of the major issues in grid computing is load balancing. Classification of load balancing is: Static – Dynamic, Centralized – Decentralized, Homogeneous – Heterogeneous. Techniques like: Ant Colony Optimization, Threshold based and Optimal Heterogeneous are used by some researcher to balance the load. This survey paper discusses set of parameters to be used for comparing performance of each of them. In addition to that it says which technique is more useful for grid environment.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
(Paper) Task scheduling algorithm for multicore processor system for minimiz...Naoki Shibata
Shohei Gotoda, Naoki Shibata and Minoru Ito : "Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault," Proceedings of IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2012), pp.260-267, DOI:10.1109/CCGrid.2012.23, May 15, 2012.
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the recovery time in the case of a single fail-stop failure of a multicore processor. Many recently developed processors have multiple cores on a single die, so one failure of a computing node results in the failure of many processors. When a multicore processor fails, all tasks which have been executed on it have to be recovered at once. The proposed algorithm is based on an existing checkpointing technique, and we assume that state is saved when nodes send results to the next node. If a series of computations that depends on earlier results is executed on a single die, the whole series must be executed again when that processor fails. The proposed scheduling algorithm therefore avoids concentrating tasks on the processors of one die. We designed our algorithm as a parallel algorithm that achieves O(n) speedup, where n is the number of processors. We evaluated our method using simulations and experiments with four PCs. Compared with an existing scheduling method, in simulation the execution time including recovery time in the case of a node failure was reduced by up to 50%, while the overhead in the case of no failure was a few percent in typical scenarios.
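The key idea — not placing a whole dependency chain on one die, so a die failure does not force re-execution of the entire chain — can be sketched as a simple round-robin placement. This is an illustrative toy, not the authors' algorithm (their method also accounts for checkpoint placement and communication costs).

```python
# Toy sketch: spread each dependency chain of tasks across dies so a
# single die failure never wipes out two consecutive tasks of a chain.
def spread_chains(chains, num_dies):
    """chains: list of lists of task ids, each inner list one chain of
    dependent tasks. Returns {task_id: die_index}; consecutive tasks of
    a chain land on different dies (when num_dies >= 2)."""
    placement = {}
    for c, chain in enumerate(chains):
        for i, task in enumerate(chain):
            # offset by chain index so chains start on different dies
            placement[task] = (c + i) % num_dies
    return placement
```

With this placement, recovering from one die failure only re-runs the individual tasks that were on that die, not whole chains.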
IJERA (International Journal of Engineering Research and Applications) is an international online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
A brief explanation of the tau-leaping process, parallel processing and NVIDIA's CUDA architecture, and the use of cuTau-Leaping for the simulation of biological systems.
MEDIAN BASED PARALLEL STEERING KERNEL REGRESSION FOR IMAGE RECONSTRUCTION (cscpconf)
Image reconstruction is the process of obtaining the original image from corrupted data. Applications of image reconstruction include computer tomography, radar imaging, weather forecasting, etc. Recently, the steering kernel regression method has been applied to image reconstruction [1]. There are two major drawbacks in this technique. Firstly, it is computationally intensive. Secondly, the output of the algorithm suffers from spurious edges (especially in the case of denoising). We propose a modified version of steering kernel regression called the Median Based Parallel Steering Kernel Regression Technique. In the proposed algorithm the first problem is overcome by implementing it on GPUs and multi-cores. The second problem is addressed by a gradient-based suppression in which a median filter is used. Our algorithm gives better output than steering kernel regression; the results are compared using Root Mean Square Error (RMSE). Our algorithm has also shown a speedup of 21x on GPUs and of 6x on multi-cores.
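RMSE, the comparison metric named above, has a standard definition that fits in a few lines (the abstract does not spell out its formula; this is the conventional one, computed here over flat pixel sequences):

```python
import math

# Root Mean Square Error between two equal-length pixel sequences:
# sqrt of the mean of squared per-pixel differences.
def rmse(original, reconstructed):
    assert len(original) == len(reconstructed)
    se = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return math.sqrt(se / len(original))
```

A lower RMSE against the ground-truth image means a better reconstruction.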
Comparison of Neural Network Training Functions for Hematoma Classification i... (IOSR Journals)
Classification is one of the most important tasks in application areas of artificial neural networks (ANN). Training neural networks is a complex task in the supervised learning field of research, and the main difficulty in adopting ANN is finding the most appropriate combination of learning, transfer and training functions for the classification task. We compared the performance of three types of training algorithms in a feed-forward neural network for brain hematoma classification. Among gradient descent based algorithms we selected gradient descent backpropagation, gradient descent with momentum, and resilient backpropagation; among conjugate gradient based algorithms, scaled conjugate gradient backpropagation, conjugate gradient backpropagation with Polak-Ribiére updates (CGP) and conjugate gradient backpropagation with Fletcher-Reeves updates (CGF). The last category is quasi-Newton based algorithms, under which the BFGS and Levenberg-Marquardt algorithms are selected. The proposed work compares the training algorithms on the basis of mean square error, accuracy, rate of convergence and correctness of the classification. Our conclusion about the training functions is based on the simulation results.
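Of the compared training rules, gradient descent with momentum has the simplest update: a velocity accumulates past gradients and the weights follow the velocity. The sketch below shows it on a toy 1-D quadratic loss; the loss function and hyperparameters are illustrative choices, not values from the paper.

```python
# Gradient descent with momentum on the toy loss f(w) = (w - 3)^2,
# whose minimum is at w = 3.
def train(w=0.0, lr=0.1, momentum=0.9, steps=300):
    v = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)        # dL/dw for f(w) = (w - 3)^2
        v = momentum * v - lr * grad  # velocity accumulates past gradients
        w = w + v                     # parameter step along the velocity
    return w
```

Setting `momentum=0.0` recovers plain gradient descent, the first algorithm in the comparison.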
A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME... (ijdpsjournal)
In this paper, a new progressive mesh algorithm is introduced in order to perform fast physical simulations using a lattice Boltzmann method (LBM) on a single-node multi-GPU architecture. This algorithm is able to mesh the simulation domain automatically according to the propagation of fluids, and it can be useful for several other types of physical simulations. In this paper, we associate the algorithm with a multiphase and multicomponent lattice Boltzmann model (MPMC–LBM) because it is able to perform various types of simulations on complex geometries. The use of this algorithm, combined with the massive parallelism of GPUs [5], yields very good performance in comparison with the static mesh method used in the literature. Several simulations are shown in order to evaluate the algorithm.
PROBABILISTIC DIFFUSION IN RANDOM NETWORK G... (ijfcstjournal)
In this paper, we consider a random network such that there could be a link between any two nodes in the network with a certain probability (plink). Diffusion is the phenomenon of spreading information throughout the network, starting from an initial set of nodes (called the early adopters). Information spreads along the links with a certain probability (pdiff). Diffusion happens in rounds, with the first round involving the early adopters. The nodes that receive the information for the first time are said to be covered and become candidates for diffusion in the subsequent round. Diffusion continues until all the nodes in the network have received the information (successful diffusion), or until there are no more candidate nodes to spread the information while one or more nodes are yet to receive it (diffusion failure). On the basis of exhaustive simulations conducted in this paper, we observe that for given plink and pdiff values, the fraction of successful diffusion attempts does not change appreciably with an increase in the number of early adopters, whereas the average number of rounds per successful diffusion attempt decreases as the number of early adopters increases. The invariant nature of the fraction of successful diffusion attempts with increasing numbers of early adopters in a random network (for fixed plink and pdiff values) is an interesting and noteworthy observation (for further research) that has not hitherto been reported in the literature.
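The diffusion model is specified concretely enough to sketch as a Monte Carlo simulation. The sketch below follows the description (a link between any pair with probability plink, information crossing a link with probability pdiff, rounds until full coverage or stall); the network size and probability values in the usage are illustrative, not the paper's experimental settings.

```python
import random

# One diffusion attempt on a freshly drawn random network.
def simulate(n, plink, pdiff, early_adopters, rng):
    """Returns (success, rounds): success is True iff every node is
    eventually covered; rounds counts diffusion rounds attempted."""
    # random network: undirected link between any two nodes w.p. plink
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < plink:
                adj[i].add(j)
                adj[j].add(i)
    covered = set(early_adopters)
    candidates = set(early_adopters)
    rounds = 0
    while candidates:
        rounds += 1
        newly = set()
        for u in candidates:
            for v in adj[u]:
                # each uncovered neighbour receives the information w.p. pdiff
                if v not in covered and rng.random() < pdiff:
                    newly.add(v)
        covered |= newly
        candidates = newly  # only newly covered nodes spread next round
    return len(covered) == n, rounds
```

Averaging `success` over many attempts, for varying sizes of the early-adopter set, reproduces the kind of measurement the paper reports.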
Hardback solution to accelerate multimedia computation through mgp in cmp (eSAT Publishing House)
APPROXIMATE ARITHMETIC CIRCUIT DESIGN FOR ERROR RESILIENT APPLICATIONS (VLSICS Design)
When the application context can accept different levels of exactness in its results, supported by human perceptual quality, 'Approximate Computing' — a term coined about a decade ago — becomes the first priority. Even though computer hardware and software are designed to generate exact results, approximate results are preferred whenever the error stays within a predefined, adaptive bound. Approximation reduces power demand and critical-path delay and improves other circuit metrics. Traditional arithmetic circuits, which generate correct results with limitations on performance, are rapidly being replaced by approximate arithmetic circuits — the need of the hour — and this paper discusses their design.
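As a concrete instance of the accuracy-for-efficiency trade-off described, the sketch below models a classic approximate adder from the literature — a lower-part-OR adder, which replaces the carry chain of the k least significant bits with a bitwise OR. This is a textbook example, not necessarily a circuit from this paper.

```python
# Behavioural model of a lower-part-OR approximate adder: the k low
# bits are combined with OR (no carry propagation, so shorter critical
# path and less power), while the high bits are added exactly.
def lower_or_adder(a, b, k, width=16):
    mask = (1 << k) - 1
    low = (a | b) & mask               # approximate low bits: no carry
    high = ((a >> k) + (b >> k)) << k  # exact addition of high bits
    return (high | low) & ((1 << width) - 1)
```

The error is bounded by the dropped low-bit carries (at most 2^k − 1 plus a lost carry into the high part), which is the "predefined bound" such designs rely on.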
Parallel implementation of pulse compression method on a multi-core digital ... (IJECEIAES)
The pulse compression algorithm is widely used in radar applications. It requires huge processing power in order to be executed in real time, so its processing must be distributed over multiple processing units. The present paper proposes a real-time platform based on the multi-core digital signal processor (DSP) C6678 from Texas Instruments (TI). The objective of this paper is to optimize the parallel implementation of the pulse compression algorithm over the eight cores of the C6678 DSP. Two parallelization approaches were implemented. The first is based on the Open Multi-Processing (OpenMP) programming interface, a software interface that helps execute different sections of a program on a multi-core processor. The second is an optimized method that we propose in order to distribute the processing and synchronize the eight cores of the C6678 DSP. The proposed method gives the best performance: a parallel efficiency of 94% was obtained when all eight cores were activated.
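Pulse compression itself is a matched-filtering operation: the received signal is correlated with the conjugate of the transmitted pulse, concentrating the pulse energy into a narrow peak. The sketch below shows the textbook form on a small linear-FM chirp; the paper's DSP-specific distribution across cores is not reproduced, and the direct O(N·M) correlation and signal sizes are illustrative simplifications (real implementations use FFT-based convolution).

```python
import cmath

def chirp(n):
    """Linear-frequency-modulated pulse of n unit-magnitude complex samples."""
    return [cmath.exp(1j * cmath.pi * k * k / n) for k in range(n)]

def matched_filter(rx, pulse):
    """Magnitude of the cross-correlation of rx with pulse (direct form)."""
    out = []
    for lag in range(len(rx) - len(pulse) + 1):
        acc = sum(rx[lag + k] * pulse[k].conjugate()
                  for k in range(len(pulse)))
        out.append(abs(acc))
    return out
```

Embedding the pulse at a known delay in a quiet background, the filter output peaks at that delay with height equal to the pulse energy — the compressed pulse.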
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY (cscpconf)
This paper presents a parallel approach to address the time complexity problem associated with sequential algorithms. An image steganography algorithm in the transform domain is considered for implementation. Image steganography is a technique to hide a secret message in an image. With the parallel implementation, a large message can be hidden in a large image without excessive processing time. The algorithm is implemented on GPU systems; parallel programming is done using OpenCL on CUDA cores from NVIDIA. The speed-up improvement obtained is very good, with reasonably good output signal quality, when a large amount of data is processed.
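As a simplified illustration of message hiding: the paper embeds in the transform domain, but the spatial-domain least-significant-bit (LSB) sketch below conveys the basic embed/extract idea, which parallelizes naturally because each pixel is processed independently.

```python
# Simplified spatial-domain LSB steganography (illustrative only; not
# the paper's transform-domain scheme).
def embed(pixels, bits):
    """Overwrite the LSB of each of the first len(bits) pixels with one
    message bit; each pixel changes by at most 1, so the image is
    visually unchanged."""
    assert len(bits) <= len(pixels)
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out

def extract(pixels, nbits):
    """Read the hidden bits back from the pixel LSBs."""
    return [p & 1 for p in pixels[:nbits]]
```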
DYNAMIC TASK PARTITIONING MODEL IN PARALLEL COMPUTING (cscpconf)
Parallel computing systems employ task partitioning strategies in a true multiprocessing manner. Such systems share algorithms and processing units as computing resources, which leads to heavy inter-process communication. The main part of the proposed algorithm is a resource management unit which performs task partitioning and co-scheduling. In this paper, we present a technique for integrated task partitioning and co-scheduling on a privately owned network, focusing on real-time and non-preemptive systems. A large variety of experiments have been conducted on the proposed algorithm using synthetic and real tasks. The goal of the computation model is to provide a realistic representation of the costs of programming, and the results show the benefit of the task partitioning. The main characteristics of our method are optimal scheduling and a strong link between partitioning, scheduling and communication. Some important models for task partitioning are also discussed in the paper. We target a task partitioning algorithm which improves inter-process communication between the tasks and uses the resources of the system in an efficient manner; the proposed algorithm contributes to minimizing the inter-process communication cost among the executing processes.
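One plausible minimal form of communication-cost-minimizing partitioning — merging heavily communicating tasks into the same partition subject to a size cap — can be sketched greedily. This is an illustration under assumed inputs, not the paper's algorithm.

```python
# Greedy communication-aware partitioning: visit communication edges
# from heaviest to lightest and merge the endpoints' partitions when
# the merged partition would not exceed the cap. Heavy communication
# then stays inside a partition instead of crossing the network.
def partition(comm, num_tasks, cap):
    """comm: {(i, j): cost} with i < j. Returns a list of task sets."""
    parts = [{t} for t in range(num_tasks)]
    for (i, j), cost in sorted(comm.items(), key=lambda kv: -kv[1]):
        pi = next(p for p in parts if i in p)
        pj = next(p for p in parts if j in p)
        if pi is not pj and len(pi) + len(pj) <= cap:
            pi |= pj          # merge: the pair now shares a partition
            parts.remove(pj)
    return parts
```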
Fault-Tolerance Aware Multi Objective Scheduling Algorithm for Task Schedulin... (csandit)
A Computational Grid (CG) creates a large heterogeneous and distributed paradigm to manage and execute computationally intensive applications. In grid scheduling, tasks are assigned to the proper processors in the grid system for execution, taking into account the execution policy and the optimization objectives. In this paper, makespan and the fault-tolerance of the computational nodes of the grid — two important parameters for task execution — are considered and optimized. As grid scheduling is considered to be NP-hard, meta-heuristic evolutionary techniques are often used to find a solution, and we propose an NSGA II for this purpose. The performance of the proposed Fault-tolerance Aware NSGA II (FTNSGA II) has been estimated with a program written in Matlab. The simulation results evaluate the performance of the proposed algorithm, and the results of the proposed model are compared with those of the existing Min-Min and Max-Min algorithms, which demonstrates the effectiveness of the model.
An Adaptive Load Balancing Middleware for Distributed Simulation (Gabriele D'Angelo)
Simulation is useful to support the design and performance evaluation of complex systems, possibly composed of a massive number of interacting entities. For this reason, the simulation of such systems may need aggregate computation and memory resources obtained from clusters of parallel and distributed execution units. Shared computer clusters composed of available Commercial-Off-the-Shelf hardware are preferable to dedicated systems, mainly for cost reasons. The performance of distributed simulations is influenced by the heterogeneity of the execution units and by their respective background CPU load. Adaptive load balancing mechanisms can improve resource utilization and simulation execution by dynamically tuning the simulation load with an eye to reducing synchronization and communication overheads. This work presents the GAIA+ framework, a new load balancing mechanism for distributed simulation. The framework has been evaluated by performing testbed simulations of a wireless ad hoc network model. The results confirm the effectiveness of the proposed solutions.
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU... (IJCNCJournal)
The rapid development of diverse computer architectures and hardware accelerators means that the design of parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The system's execution engine is open to optimizer implementations focusing on various criteria. In this paper, we propose a new optimizer for KernelHive that utilizes distributed databases and performs data prefetching to optimize the execution time of applications which process large input data. Employing a versatile data management scheme that allows combining various distributed data providers, we propose using NoSQL databases for our purposes. We support our solution with the results of experiments with real executions of our OpenCL implementation of a regular-expression matching application in various hardware configurations. Additionally, we propose a network-aware scheduling scheme for selecting hardware for the proposed optimizer and present simulations that demonstrate its advantages.
NEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGES (ijcax)
Semantic annotation of images is an important research topic in both image understanding and database or web image search. Image annotation is a technique for choosing appropriate labels for images by extracting effective and hidden features from pictures. In the feature extraction step of the proposed method, we present a model which combines effective features of visual topics (global features over an image) and regional contexts (relationships between the regions of an image and the regions of other images) for automatic image annotation. In the annotation step of the proposed method, we create a new ontology (based on the WordNet ontology) to exploit the semantic relationships between tags in classification and to narrow the semantic gap present in automatic image annotation. Experimental results on the 5k Corel dataset show that the proposed method, in addition to reducing the complexity of the classification, increases accuracy compared with other methods.
THE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEM (ijcax)
Constrained nonlinear programming problems are hard problems, and production planning is one of the most widespread and common problems to optimize in this form. In this study, one of the mathematical models of production planning is surveyed and the problem is solved by the cuckoo algorithm, an efficient method for solving continuous nonlinear problems. Moreover, the same production planning model is solved with a genetic algorithm and with the Lingo software, and the results are compared. The cuckoo algorithm is a suitable choice for optimization in terms of convergence of the solution.
COMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKS (ijcax)
A Mobile Ad Hoc Network (MANET) is a collection of mobile nodes that want to communicate without any pre-determined infrastructure or fixed organization of available links. Each node in a MANET operates as a router, forwarding information packets for other mobile nodes. There are many routing protocols that exhibit different performance levels in different scenarios, and the main task is to evaluate the existing routing protocols and find the best one by comparing them. In this article we compare the AODV, DSR, DSDV, OLSR and DYMO routing protocols in mobile ad hoc networks (MANETs) in order to specify the best operational conditions for each protocol. We study these five MANET routing protocols through different simulations in the NS-2 simulator and describe how the pause time parameter affects their performance. Performance is measured in terms of Packet Delivery Ratio, Average End-to-End Delay, Normalized Routing Load and Average Throughput.
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ... (ijcax)
In this study, which took place this year in the city of Maragheh in Iran, a number of high school students in the fields of mathematics, experimental sciences, humanities, vocational studies, business and science were studied and compared. The purpose of this research is to predict the academic major of high school students using Bayesian networks. The factors that influence academic major selection are used, for the first time, as indicators in a Bayesian network. The evaluation of the impacts of the indicators on each other, and the discretization and processing of the data, were performed with GeNIe. The appropriate course of study can then be advised to students continuing their education.
A Multi Criteria Decision Making Based Approach for Semantic Image Annotation (ijcax)
Automatic image annotation has emerged as an important research topic due to its potential application in both image understanding and web image search. This paper presents a model which integrates visual topics and regional contexts for automatic image annotation. Regional contexts model the relationships between regions, while visual topics provide the global distribution of topics over an image. Previous image annotation methods neglected the relationships between the regions in an image, yet these regions are precisely what explains the image semantics, so considering the relationships between them helps to annotate the images. Regional contexts and visual topics are learned by PLSA (Probabilistic Latent Semantic Analysis) from the training data. The proposed model incorporates these two types of information through an MCDM (Multi Criteria Decision Making) approach based on the WSM (Weighted Sum Method). Experiments conducted on the 5k Corel dataset demonstrate the effectiveness of the proposed model.
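The WSM combination step named above has a standard form: each candidate annotation receives a weighted sum of its criterion scores, and candidates are ranked by that sum. The labels, scores and weights below are illustrative assumptions, not values from the paper.

```python
# Weighted Sum Method over two criteria, here a regional-context score
# and a visual-topic score per candidate label.
def wsm_rank(scores, weights):
    """scores: {label: (s_regional, s_topic)}; weights: (w0, w1).
    Returns labels sorted by w0*s_regional + w1*s_topic, best first."""
    combined = {
        label: weights[0] * s[0] + weights[1] * s[1]
        for label, s in scores.items()
    }
    return sorted(combined, key=combined.get, reverse=True)
```

The top-k labels of the ranking become the image's annotations.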
On Fuzzy Soft Multi Set and Its Application in Information Systems (ijcax)
Research on information and communication technologies has developed rapidly, since it can be applied easily to several areas such as computer science, medical science, economics, environmental science and engineering, among others. Applications of soft set theory, especially in information systems, have been found to be of paramount importance. Recently, Mukherjee and Das defined some new operations in fuzzy soft multi set theory and showed that De Morgan-type results hold in fuzzy soft multi set theory with respect to these newly defined operations. In this paper, we extend their work and study some more basic properties of their defined operations. We also define some basic supporting tools for information systems, and an application of fuzzy soft multi sets in information systems is presented and discussed. Here we define the notion of a fuzzy multi-valued information system in fuzzy soft multi set theory and show that every fuzzy soft multi set is a fuzzy multi-valued information system.
The web is a collection of inter-related files on one or more web servers, while web mining means extracting valuable information from web databases. Web mining is one of the data mining domains in which data mining techniques are used to extract information from web servers. Web data includes web pages, web links, objects on the web and web logs. Web mining is used to understand customer behaviour and to evaluate a particular website based on the information stored in web log files, and it is carried out using data mining techniques, namely classification, clustering and association rules. It has beneficial application areas such as electronic commerce, e-learning, e-government, e-policies, e-democracy, electronic business, security, crime investigation and digital libraries. Retrieving the required web page from the web efficiently and effectively is a challenging task, because the web is made up of unstructured data which delivers a large amount of information and increases the complexity of dealing with information from different web service providers. The collected information becomes very hard to find, extract, filter or evaluate for relevance to users. In this paper, we study the basic concepts, classifications, processes and issues of web mining, and we also analyze the research challenges of web mining.
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR (ijcax)
A pattern classification system classifies a pattern into a region of feature space within a boundary. Pattern classification systems are used in adversarial applications, for example spam filtering, Network Intrusion Detection Systems (NIDS) and biometric authentication. Spam filtering is an adversarial application in which data can be manipulated by humans to degrade the intended operation. Numerous machine learning systems have been applied to appraise the security issues related to spam filtering. We present a framework for the experimental evaluation of classifier security in adversarial environments that combines and builds on the arms race and security-by-design concepts, adversary modelling, and the data distribution under attack. Furthermore, we present SVM, LR and MILR classifiers to categorize email as legitimate (ham) or spam on the basis of the text samples.
Visually impaired people face many problems in their day-to-day lives, and among them outdoor navigation is one of the major concerns. The existing solutions based on Wireless Sensor Networks (WSN) and the Global Positioning System (GPS) track ZigBee units or RFID (Radio Frequency Identification) tags fixed on the navigation system. The issues with these solutions are as follows: (1) they are suitable only when the visually impaired person is commuting in a familiar environment; (2) the device provides only one-way communication; (3) most of these instruments are heavy and sometimes costly. A preferable solution would be a system that is easy to carry and cheap.
The objective of this paper is to break down these technological barriers and to propose a system, in the form of an Android app, which would help a visually impaired person while traveling via public transport such as buses. The proposed system uses inbuilt features of the smart phone, such as the GPS location tracker to track the location of the user, and a text-to-speech converter. The system also integrates the Google speech-to-text converter to capture voice input and convert it to text. The system recommends installing a GPS module in buses for real-time tracking. With minor modification, this app can also help older people navigate independently.
INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W... (ijcax)
The rapid growth of the Internet has revolutionized online news reporting, and many users turn to online news websites for news information. In Sri Lanka there are numerous news websites, which are subscribed to on a daily basis. With the rise in the number of news websites, the Sri Lankan media authorities face the lack of a proper methodology or tool capable of tracking and regulating publications made by different disseminators of news.
This paper proposes a News Agent toolbox which periodically extracts news articles and associated comments with the aid of a concept called Mapping Rules, in order to classify them into Personalized Categories defined in terms of keyword-based Category Profiles. The proposed tool also analyzes comments made by readers, with the aid of simple statistical techniques, to discover the most popular news articles and fluctuations in the popularity of news stories.
ADVANCED E-VOTING APPLICATION USING ANDROID PLATFORM (ijcax)
Advancements in mobile devices and in wireless and web technologies have given rise to new applications that make the voting process easy and efficient. E-voting promises the possibility of a convenient, easy and safe way to capture and count votes in an election [1]. This research project provides the specification and requirements for e-voting on the Android platform. E-voting means casting votes in an election using an electronic device, and the Android platform is used here to develop an e-voting application. First, an introduction to the system is presented. Sections II and III describe all the concepts (survey, design and implementation) used in this work. Finally, the proposed e-voting system is presented. This technology helps users cast their vote without visiting the polling booth. The application follows proper authentication measures in order to prevent fraudulent voters from using the system. Once the voting session is completed, the results are available within a fraction of a second. Every candidate's vote count is encrypted and stored in the database in order to prevent attacks and disclosure of results by anyone other than the administrator. Once the session is completed, the admin can decrypt the vote counts, publish the results, and complete the voting process.
The design of silicon chips in the semiconductor industry involves testing these chips with other components on a board. The platform developed acts as a power-on vehicle for the silicon chips, and this printed circuit board design, which serves as a validation platform, is foundational to the semiconductor industry.
The manual, repetitive design activities that accompany the development of this board must be minimized to achieve high quality, improve design efficiency, and eliminate human error. One of the time-consuming tasks in board design is trace length matching. This paper aims to reduce the length matching time by automating it using SKILL scripts.
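The length-matching task being automated can be illustrated in a few lines — shown here in Python rather than SKILL, with net names, units and tolerance made up for illustration: traces in a matched group must all lie within a tolerance of the longest trace, and shorter ones need serpentine length added.

```python
# Illustrative length-matching check: report how much length each net
# needs added to fall within `tolerance` of the longest net in the group.
def length_match(trace_lengths, tolerance):
    """trace_lengths: {net: length}. Returns {net: extra_length},
    0 for nets already within tolerance of the longest trace."""
    target = max(trace_lengths.values())
    return {
        net: (0 if target - length <= tolerance else target - length)
        for net, length in trace_lengths.items()
    }
```

A script automating the task would run such a check and then insert serpentine segments for the nets with a nonzero deficit.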
RESEARCH TRENDS IN EDUCATIONAL TECHNOLOGY IN TURKEY: 2010-2018 YEAR THESIS AN... (ijcax)
The purpose of this research is to analyze, using meta-analysis, studies in the field of educational technology in Turkey and to demonstrate the trends in the field. For this purpose, a total of 263 studies were analyzed, including 98 theses and 165 articles published between 2010 and 2018. Purposive sampling was used when selecting publications. The theses and articles addressing Turkey were drawn from the YOK Tez Tarama Database, the Journal of Hacettepe University Faculty of Education, Educational Sciences: Theory & Practice, Education and Science, Elementary Education Online, The Turkish Online Journal of Education and The Turkish Online Journal of Educational Technology. Publications were reviewed under 11 criteria — index, year of publication, research scope, method, education level, sample, number of samples, data collection methods, analysis techniques, and research tendency — and from these, the research topics in educational technology research in Turkey were revealed. The data are interpreted in terms of percentage and frequency, and the results are shown in tables.
IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA... (ijcax)
With the great development that modern medical technology is witnessing today, medical devices and equipment have become a basic, indispensable pillar of every healthcare system in the world, and competition between the major manufacturers of medical devices and equipment has resulted in a huge variety of complex modern medical technologies. These medical devices and equipment require high accuracy in manufacturing and packaging, as well as in operation, maintenance and follow-up, because any error in any of these stages has bad consequences for patients and the health system; many accidents have led to deaths. Therefore, many medical device producers and medical companies, in addition to health service providers, seek systems and protocols to reduce accidents resulting from medical devices, and many systems have recently appeared that aim to protect against the dangers of medical devices and equipment. This research aims to study the effects of international standards on the safety of medical devices and equipment and on reducing their risks. By surveying the international standards in force in the Kingdom of Saudi Arabia as applied by the Saudi Food and Drug Authority, and by preparing questionnaires and distributing them to health service providers and the regulatory bodies for medical devices and equipment, the collected data will be analysed to evaluate the effectiveness of quality systems and standards in maintaining the effectiveness and quality of medical devices and equipment. The study covers both the governmental and the private health services sectors.
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR ijcax
The Pattern classification system classifies the pattern into feature space within a boundary. In case adversarial applications use, for example Spam Filtering, the Network Intrusion Detection System (NIDS), Biometric Authentication, the pattern classification systems are used. Spam filtering is an adversary
application in which data can be employed by humans to attenuate perspective operations. To appraise the
security issue related Spam Filtering voluminous machine learning systems. We presented a framework for the experimental evaluation of the classifier security in an adversarial environments, that combines and constructs on the arms race and security by design, Adversary modelling and Data distribution under
attack. Furthermore, we presented a SVM, LR and MILR classifier for classification to categorize email as legitimate (ham) or spam emails on the basis of thee text samples
Developing Product Configurator Tool Using CADs’ API with the help of Paramet...ijcax
Order placingis a crucial phase of lifecycle of a Mass-customizable product and seeks improvement in
Mechanical industry. ‘Product Configurator’ is a good solution to bring in data transparency and speed
up the process. Configuration tools arebeing used on a very small scale,reasons being lack of awareness
and dearer costs of existing tools. In this research work a product configurator is developedfor
Hydraulic Actuator (HA).This method uses Applicable Programing Interface (API) of a CAD tool coupled
with Visual Basics (VB) and MS Excel.Itis a standaloneapplication of VB and its integration into web
portal can be the future scope. The final aim was to reduce time delay at CRM phase,bring more
transparency in the ordering system and to establish a method which, small and medium scale enterprises
canafford. Trails on the tool developed generated Part-Assembly drawings, BOM and JT files in
moments.
DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...ijcax
A large no. of automobile companies finding a convinient way to manage design changes with the use of
various PLM techniques. Change in any product is something that should occur on timely basis to match
up with customer requirement and cost reduction. The change made in the vehicle designs directly affects
various concerned agencies. Automobile Vehicle structures contains thousands of parts and if there is any
change is occurring in child parts then it becomes important to track that impacted part, propose a solution
on that part and release a new assembly structure with feasible changes such that all efforts need to be
done for cost reduction.
Visually impaired people face many problems in their day to day lives. Among them, outdoor navigation is
one of the major concerns. The existing solutions based on Wireless Sensor Networks(WSN) and Global
Positioning System (GPS) track ZigBee units or RFID (Radio Frequency Identification) tags fixed on the
navigation system. The issues pertaining to these solutions are as follows: (1) It is suitable only when the
visually impaired person is commuting in a familiar environment; (2) The device provides only a one way
communication; (3) Most of these instruments are heavy and sometimes costly. Preferable solution would
be to make a system which is easy to carry and cheap.
The objective of this paper is to break down the technological barriers, and to propose a system by
developing an Android App which would help a visually impaired person while traveling via the public
transport system like Bus. The proposed system uses an inbuilt feature of smart phone such as GPS
location tracker to track the location of the user and Text to Speech converter. The system also integrates
Google Speech to Text converter for capturing the voice input and converts them to text. This system
recommends the requirement of installing a GPS module in buses for real time tracking. With minor
modification, this App can also help older people for independent navigation.
TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION ijcax
Today’s era is an era of modernization and globalization. Everything is happening at a very fast rate
whether it is politics, societal reforms, commercialization, transportation, or educational innovations. In
every few second, technology grows either in the form of arrival of the new devices/gadgets with millions of
apps and these latest technological objects may be in the form of hardware/software devices. We are the
educationists, teachers, students and stakeholders of present Indian educational system. These
gadgets/devices are partly being used by us or most of them are still unaware of these innovative
technologies due to the mass media or economical factor. So, there is a need to improvise ourselves
towards utilizing the future gadgets in order to explore the educational uses, barriers and preparatoryneeds of these available devices for educational purposes. This paper aims to study the opinion of the
teacher-educators about the usage of future gadgets in higher education. It will also contribute towards
establishing the list of latest technological devices, and how it can enhances the process of teachinglearning system.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...ssuser7dcef0
Power plants release a large amount of water vapor into the
atmosphere through the stack. The flue gas can be a potential
source for obtaining much needed cooling water for a power
plant. If a power plant could recover and reuse a portion of this
moisture, it could reduce its total cooling water intake
requirement. One of the most practical way to recover water
from flue gas is to use a condensing heat exchanger. The power
plant could also recover latent heat due to condensation as well
as sensible heat due to lowering the flue gas exit temperature.
Additionally, harmful acids released from the stack can be
reduced in a condensing heat exchanger by acid condensation. reduced in a condensing heat exchanger by acid condensation.
Condensation of vapors in flue gas is a complicated
phenomenon since heat and mass transfer of water vapor and
various acids simultaneously occur in the presence of noncondensable
gases such as nitrogen and oxygen. Design of a
condenser depends on the knowledge and understanding of the
heat and mass transfer processes. A computer program for
numerical simulations of water (H2O) and sulfuric acid (H2SO4)
condensation in a flue gas condensing heat exchanger was
developed using MATLAB. Governing equations based on
mass and energy balances for the system were derived to
predict variables such as flue gas exit temperature, cooling
water outlet temperature, mole fraction and condensation rates
of water and sulfuric acid vapors. The equations were solved
using an iterative solution technique with calculations of heat
and mass transfer coefficients and physical properties.
6th International Conference on Machine Learning & Applications (CMLA 2024)ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Applications.
Water billing management system project report.pdfKamal Acharya
Our project entitled “Water Billing Management System” aims is to generate Water bill with all the charges and penalty. Manual system that is employed is extremely laborious and quite inadequate. It only makes the process more difficult and hard.
The aim of our project is to develop a system that is meant to partially computerize the work performed in the Water Board like generating monthly Water bill, record of consuming unit of water, store record of the customer and previous unpaid record.
We used HTML/PHP as front end and MYSQL as back end for developing our project. HTML is primarily a visual design environment. We can create a android application by designing the form and that make up the user interface. Adding android application code to the form and the objects such as buttons and text boxes on them and adding any required support code in additional modular.
MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software. It is a stable ,reliable and the powerful solution with the advanced features and advantages which are as follows: Data Security.MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Cell Charge Approximation for Accelerating Molecular Simulation on CUDA-Enabled GPU
International Journal of Computer-Aided Technologies (IJCAx) Vol.1, No.1, April 2014
Cell Charge Approximation for Accelerating
Molecular Simulation on CUDA-Enabled GPU
Shyam Mohan K T, Jayaraj P B
Dept. of Computer Science and Engineering, National Institute of Technology Calicut,
Calicut, India
ABSTRACT
Methods for Molecular Dynamics (MD) simulations are investigated. MD simulation is a widely used
computer simulation approach to study the properties of molecular systems. Force calculation in MD is
computationally intensive. Parallel programming techniques can be applied to improve those calculations.
The major aim of this paper is to speed up the MD simulation calculations by using the General Purpose
Graphics Processing Unit (GPU) computing paradigm, an efficient and economical way for parallel
computing. For that we propose a method called cell charge approximation, which treats the
electrostatic interactions in MD simulations. This method reduces the complexity of force calculations.
Index Terms
Cell charge approximation, General Purpose Graphics Processing Unit (GPU), MD simulations.
I. INTRODUCTION
General Purpose Computation on Graphics Processing Units (GPUs) is a parallel computing
paradigm which provides a cost-effective solution for computationally expensive applications.
GPUs are massively parallel many-core processors that can be used to accelerate a wide range of
data-parallel general purpose applications, like molecular dynamics, in addition to graphics
processing. The power consumption of GPUs is also much lower than that of other high
performance devices.
Molecular Dynamics (MD) is a widely used computer simulation technique. In the molecular
dynamics approach, particle trajectories are calculated, through the integration of the equations
of motion, from the force exerted on each particle due to molecular interactions. The force on
each particle is calculated as the sum of the forces from all the other particles. Since the force
calculations are independent of each other, the parallel programming paradigm can improve
the simulation process. The introduction of GPU computing has made an impact on techniques for
molecular simulation. The use of GPGPU techniques as an alternative to distributed memory
clusters in MD simulations has become a real possibility. The availability of a large number of
processing cores makes GPU computing suitable here. The CUDA architecture introduced by
NVIDIA, which provides high-level abstractions and minimal language extensions to implement
parallel programs, significantly simplifies the process of programming the GPU. In this paper we
introduce an approximation technique, cell charge approximation, which reduces the
computation needed for long range Coulombic forces in molecular dynamics.
The rest of the paper is organized as follows: Section II explains the background information and
Section III describes the MD algorithm. Section IV presents improvements to the simulation,
Section V deals with implementation and results, and finally the paper is concluded in Section
VI.
II. BACKGROUND INFORMATION
A. Molecular Simulation
Molecular simulation is the technique for computing and visualizing the properties of molecules
using computers, which serves as a complement to conventional experiments, enabling us to
learn something new without doing the experiment [1]. Molecular simulation tools are helpful in
predicting the behavior of molecules, chemical interactions and physical properties, and are mainly
used in the fields of Physics, Chemistry, Bioinformatics and Nanotechnology. Molecular Dynamics
simulation is computation intensive, since the number of atoms in the system is always
tremendous and the time step is at the femtosecond level, which means a realistic simulation needs
millions of steps [1]. Fortunately most parts of these simulations show high parallelism, so that
parallel programming techniques can be applied to improve the performance of the simulation
process.
The most used approach for molecular simulation is the Molecular Dynamics method. The molecular
model in Molecular Dynamics is based on classical mechanics: in classical Molecular Dynamics,
force calculations are based on Newton's laws [1]. Since this is an approximation used to reduce
computation, it cannot describe some behaviors of a molecular system, such as the breakage and
formation of chemical bonds. To have a perfect simulation of a molecular system the calculations
must be done on the basis of quantum mechanics; such simulations are called ab initio simulations [2].
Required properties of the molecular system are found from the particle trajectories during the
simulation time, while particle trajectories are calculated from the force exerted on each particle
due to molecular interactions. The forces are non-linear functions of the distances between the
particles in the system and are either long-range or short-range in nature. A short-range force model
limits the range of influence of a particle in the system, thus reducing the computation needed.
Since each particle interacts with all other particles in the system, long range force calculation can
be computationally costlier, as it scales quadratically, but gives a higher quality simulation [1].
Parallelization of Molecular Dynamics simulation can be categorized into three types: Particle
Decomposition, Force Decomposition and Spatial Decomposition [1], [2]. In the particle
decomposition method a subset of atoms is assigned to a streaming multiprocessor in the GPU. Spatial
decomposition decomposes the whole system into different spatial regions and assigns each region
to a streaming processor [3], [2]. Force decomposition is rarely used in molecular dynamics
simulations. For molecular simulation on the GPU the best decomposition method is spatial
decomposition [2]. The data reuse and low maintenance overhead make it the best fit. Spatial
decomposition needs only one neighbor list for a set of particles in the system, so that a subset of
particles in the system shares the common data, which reduces the data transfer time and thus
increases the performance [4].
B. CUDA
The introduction of general purpose GPUs made an impact on the parallel processing paradigm.
The advantages of the GPU over other parallel architectures are its large arithmetic capability,
low cost and high throughput [5]. The CUDA architecture introduced by NVIDIA in 2006 is a general
purpose parallel computing architecture. It enables dramatic increases in computing performance
by harnessing the power of the graphics processing unit (GPU) [6]. It allows the programmer to
use the GPU as a computing device without explicitly mapping to the graphics API. CUDA
provides a parallel programming model and instruction set architecture which make
programming the GPU easier, as it provides low level hardware access, avoiding the limitations
imposed in fragment shaders.
The GPU is a multi-core processor with several Streaming Multiprocessors (SMPs), where each SMP
consists of streaming processors arranged as a grid of blocks [5]. Tasks are specified as blocks of
threads. In CUDA, threads are created and scheduled by hardware, which reduces the management
overhead. Since CUDA limits the number of threads per block and per SMP, the number of threads
required for a task is specified accordingly [6]. Synchronization of threads within a block is
possible in the CUDA architecture. A CUDA-compatible GPU can be programmed using CUDA C,
which is similar to C and has some extensions which support parallel programming [5]. A CUDA
program is divided into host code and device code. The part with little or no data parallelism is
implemented in the host code and runs on the CPU. The part that exhibits a rich amount of data
parallelism is implemented in the device code, or kernel function, and is executed on the GPU.
The GPGPU programming paradigm belongs to the Single Program Multiple Data (SPMD) model, in
which each thread executes the same program on different parts of the data. The major bottleneck of
the GPGPU programming model is the transfer of data to device memory from the primary memory
before invoking the kernel. The device memory hierarchy consists of a large but slower Graphics
DDR (GDDR) global memory shared by all threads, smaller and faster on-chip shared memory
shared per block, and local registers and local memory for each thread [6].
III. THE MD ALGORITHM
This section focuses on the basic idea behind MD simulation. In static MD, an isolated system of
fixed size is studied and hence its total energy is constant. The particles of the system interact
through model potentials and positions of the molecules are obtained by solving the equations of
motion for each molecule [4]. The basic MD algorithm is
Algorithm 1 MD Algorithm
Initialize simulation parameters
Set initial positions and velocities of particles
step := 0
repeat
Compute potential and forces on each particle in the system
Integrate the equations of motion to find the new position and velocity of particles
Update position and velocity values of each particle
Calculate required properties
Increment the time step
until step = TotalStep
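The loop above can be sketched in a few lines; this is an illustrative Python sketch, not the authors' CUDA implementation, and the simple explicit integration step and the placeholder force function `force_fn` are assumptions made for demonstration only.

```python
import numpy as np

def md_simulation(positions, velocities, masses, force_fn, dt, total_steps):
    """Minimal sketch of Algorithm 1: repeat force computation and integration."""
    trajectory = [positions.copy()]
    step = 0
    while step < total_steps:
        forces = force_fn(positions)              # compute forces on each particle
        velocities = velocities + (forces / masses[:, None]) * dt
        positions = positions + velocities * dt   # update positions
        trajectory.append(positions.copy())       # required properties would be sampled here
        step += 1                                 # increment the time step
    return np.array(trajectory)
```

For instance, with a harmonic restoring force `force_fn = lambda r: -r` the particles oscillate about the origin, and the trajectory array holds one snapshot per time step.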
A. Initialization
To start the simulation, a set of initial positions and velocities must be assigned to the particles.
Commonly the simulation starts from the results obtained from a previous simulation. Otherwise,
a set of positions and velocities is chosen such that the system is in thermal equilibrium. Positions
are defined on a crystalline lattice structure with some randomization, and initial velocities are taken
from a Maxwell-Boltzmann distribution at the desired temperature with zero total momentum [7].
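Such an initialization can be sketched as follows (an illustrative helper, not from the paper; reduced units and a scalar mass are assumed):

```python
import numpy as np

def init_velocities(n_particles, mass, kT, seed=None):
    """Sample velocities from a Maxwell-Boltzmann distribution (each Cartesian
    component is normal with variance kT/m) and remove the net momentum."""
    rng = np.random.default_rng(seed)
    v = rng.normal(0.0, np.sqrt(kT / mass), size=(n_particles, 3))
    v -= v.mean(axis=0)   # enforce zero total momentum
    return v
```

Subtracting the mean velocity guarantees the system does not drift as a whole during the simulation.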
B. Force Calculation
Calculation of intermolecular forces is the most computationally intensive part of molecular
dynamics [1]. The force on an atom can be calculated from the change in potential energy
between its current position and its position a small distance away [7]. The potential on a particle
arises from the interaction of the particle with the other particles in the molecular system. The value
of the potential energy is calculated as a sum of bonded and a sum of external or non-bonded terms.
The bonded interaction describes the potential arising from bond stretching, angular movement and
bond rotations in a molecule. The bond stretch interaction represents the energy required to
stretch or compress a covalent bond and is modeled by

Vbond(r) = (1/2) Kb (r − r0)^2    (1)

with r0 being the bond rest length.
The angular interaction is the energy required to bend a bond from its equilibrium angle θ0 to an
angle θ.
Vangle(θ) = (1/2) Ka (θ − θ0)^2    (2)

The energy required to deform a planar group of atoms from its equilibrium angle ϖ0, called
dihedral torsion, is given by

Vdih(ϖ) = (1/2) Kϕ (1 + cos(nϕ − ϖ0))    (3)

Non-bonded interactions act between atoms which are not linked by covalent bonds and are
categorized into electrostatic and non-electrostatic interactions. The non-electrostatic non-bonded
interaction, commonly called the van der Waals interaction, is the sum of the attractive or
repulsive forces between particles other than those due to covalent bonds and electrostatic
interaction. The Lennard-Jones (L-J) potential is used for modeling the van der Waals interaction;
contributions beyond a cut-off are neglected, since the magnitude of the interaction falls quickly
as the distance between the particles increases [4], [8], [3]. The L-J potential is given by

Vij = 4ε [ (σ/dij)^12 − (σ/dij)^6 ]    (4)

Electrostatic interactions between charged particles are modeled by Coulomb's law. Long range
electrostatic interactions cannot be neglected; since they converge slowly, the computation
is split into short range and long range interactions. Short range interactions are calculated
directly using Coulomb's law, while long range interactions are calculated separately using some
approximation technique [9]. The electrostatic interaction between two charged particles is given
by

Vij = (1/(4πε)) (q1 q2 / dij)    (5)

The force acting on a particle is calculated as the negative gradient of the potential with respect
to the co-ordinates of the particle in the system [7].

C. Integration

New position and velocity of a particle in the system are found by integrating the equations of
motion based on the current position and the force acting on the particle. The Verlet scheme is an
extensively used integration approach in MD simulations [7]. The new position and velocity of a
particle are given by the following equations in the Verlet scheme:

r(t + ∆t) = 2r(t) − r(t − ∆t) + (Fi/mi) ∆t^2    (7)

v(t) = (r(t + ∆t) − r(t − ∆t)) / (2∆t)    (8)
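To make Eqs. (4), (5), (7) and (8) concrete, here is an illustrative sketch in Python (the paper's implementation is in CUDA C; the function names, reduced units, and the `eps0` parameter are assumptions for demonstration):

```python
import numpy as np

def lj_potential(d, sigma, eps):
    """Lennard-Jones pair potential, Eq. (4)."""
    sr6 = (sigma / d) ** 6
    return 4.0 * eps * (sr6 ** 2 - sr6)

def coulomb_potential(d, q1, q2, eps0=1.0):
    """Coulomb pair potential, Eq. (5); 4*pi*eps0 sets the unit scale."""
    return q1 * q2 / (4.0 * np.pi * eps0 * d)

def verlet_step(r_t, r_prev, force, mass, dt):
    """One Verlet update: new position from Eq. (7), velocity from Eq. (8)."""
    r_next = 2.0 * r_t - r_prev + (force / mass) * dt ** 2
    v_t = (r_next - r_prev) / (2.0 * dt)
    return r_next, v_t
```

Note that the L-J potential vanishes at d = σ and reaches its minimum −ε at d = 2^(1/6) σ, which is why contributions beyond a modest cut-off radius can safely be neglected.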
IV. IMPROVING THE SIMULATION
A. Bonded interaction
The number of particles in the molecular system is always large, and since the bonded interaction
on each particle is independent of the interactions on the other particles in the system, the
calculations can be parallelized. The particle decomposition method is used in parallelizing the
bonded interaction: a thread is assigned to each particle in the system for the bonded interaction
calculation.
B. Van der Waals Interaction
The large exponents in Equation 4 reflect how the magnitude of these forces drops off
very rapidly at larger distances, so the effect of van der Waals forces beyond a cut-off radius (rc)
can be neglected [1]. Based on this, neighbor lists are used to improve the van der Waals force
calculation. A neighbor list is associated with each particle, and only the particles in the neighbor
list are considered during the interaction calculation. The complexity of the van der Waals force
calculation drops from O(n^2) to O(n), since only the particles in the neighbor list are to be
considered, and the number of particles in the neighbor list of a particle is much smaller than
the number of particles in the system, n. The overhead of this optimization is the construction
and updating of the neighbor lists, which grows to O(n^2), but the update is done less
frequently. Construction of the neighbor list is based on the distance between the particles: if the
distance between two particles is less than a specified neighbor distance then those two particles
are neighbors.
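The neighbor-list force loop can be sketched as follows (illustrative serial Python, not the paper's CUDA kernel; a precomputed `neighbors` list is assumed, and the pair force is the negative gradient of the L-J potential in Eq. (4)):

```python
import numpy as np

def vdw_forces(positions, neighbors, sigma, eps, r_cut):
    """Van der Waals forces over neighbor lists: only pairs inside the cut-off
    contribute, so the cost is O(n*k) for k neighbors per particle, not O(n^2)."""
    forces = np.zeros_like(positions)
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            rij = positions[i] - positions[j]
            d = np.linalg.norm(rij)
            if d < r_cut:
                sr6 = (sigma / d) ** 6
                # magnitude of -dV/dd for the L-J potential; force acts along rij
                f_mag = 24.0 * eps * (2.0 * sr6 ** 2 - sr6) / d
                forces[i] += f_mag * rij / d
    return forces
```

At the potential minimum d = 2^(1/6) σ the two terms cancel and the pair force vanishes, as expected.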
The neighbor distance is chosen such that the neighbor list of a particle will contain all the
particles that can be in the cut-off region until the neighbor list is updated again after a number of
time steps. Cell lists are used to reduce the computation needed in constructing the neighbor list
to O(n) [1]. The cubic simulation box is divided into a regular lattice of smaller cells whose size
equals the neighbor distance. The particles in each cell are stored in cell lists. The advantage of
the cell list is that it reduces the number of particles that need to be considered while constructing
the neighbor list, since only the particles in the current cell and its neighboring cells need to be
considered. The cell list construction has only linear time complexity. Neighbor list construction
is parallelized based on the particle decomposition approach. The neighbor lists of all particles
can be constructed in parallel using a CUDA thread for each particle, so that the construction can
be completed in O(n) in the direct method and in O(1) in the improved method. The spatial
decomposition technique is used for parallelizing the van der Waals interaction calculation, where
a thread is assigned to a set of neighboring particles. Since the nearest neighbors have the same set
of neighbors, data reuse is possible here and the shared memory facility available in CUDA can be
exploited. The common neighbor set is moved to the faster shared memory of the SMP, and the
SMPs calculate the interaction on each particle in that set, which reduces the running time by
reducing the number of expensive global memory accesses and results in a constant time
complexity.
C. Electrostatic Interaction
Long range electrostatic interaction cannot be neglected as in the case of the Van der Waals interaction. To reduce the time complexity from O(n²), some approximation techniques are used.
1) Cell Charge Approximation:
To reduce the time required for the long range electrostatic interaction computation, cell charge approximation is introduced. The cubic simulation box is divided into a regular lattice of smaller cells of size greater than the cut-off distance. Cell charge approximation treats each cell as a point charge located at the centroid of the cell, with a charge equal to the sum of the charges of the charged particles present in that cell. During the construction of the cell list, the total charge of each cell is computed and stored. In the computation of the long range interaction, the interactions between a particle and the particles present in the immediate neighboring cells are calculated directly. The interactions with particles in the non-neighboring cells are then computed from the cell charges, as the interaction between the particle and the virtual particle located at the centroid of each cell. This reduces the time complexity of computing the long range electrostatic interaction in a system of n particles from O(n²) to O(nm), where m << n is the number of cells in the lattice. Particle decomposition is chosen to parallelize the electrostatic interaction in the case of direct computation; parallelizing the cell charge approximation is done with spatial decomposition. Each SMP keeps the details of the virtual particles at the cell centroids, together with the common neighbor set of a set of neighboring particles, in its shared memory, and a thread is assigned to a subset of particles, which reduces the number of global memory fetches and hence the running time.
2) Particle Mesh Ewald:
Ewald Summation is a technique introduced by Peter Ewald in 1921 for the calculation of long range interactions between particles [10]. In Ewald Summation the potential energy of a system of particles is calculated as the sum of two rapidly converging terms, the direct sum and the reciprocal sum, and the calculation grows quadratically [11]. Particle Mesh Ewald (PME), an improved Ewald Summation technique with complexity O(n log n), is used in the long range electrostatic interaction calculation. The Particle Mesh Ewald method calculates the direct-space interactions within a finite distance using a modification of Coulomb's Law, and the reciprocal-space interactions using a Fourier transform to build a "mesh" of charges interpolated onto a grid. In PME, the potential energy is given by
V(ri) = Vdirect(ri) + Vself(ri) + Vreciprocal(ri) (9)
The direct part is computed using the complementary-error-function screened form of Coulomb's Law,
Vdirect(ri) = Σj≠i qj erfc(η rij) / rij (10)
Only the particles inside the cut-off region are considered during the computation. The self interaction is given by
Vself(ri) = −(2η/√π) qi (11)
The reciprocal part is computed by convolutions performed using Fast Fourier Transforms on a mesh (grid), where the charges are interpolated to the grid points [11]. First the charges are replaced by a grid-based charge density. This is done by interpolating the charges onto the grid points rn using an assignment function W(r),
q(rn) = Σi qi W(rn − ri) (12)
The Fourier coefficients of the mesh-based charge density are then computed with a forward FFT, the potential in Fourier space is obtained from them, and an inverse transform gives the potential back on the real-space grid. The potential on each particle is finally computed by interpolating back to the locations of the charged particles using the assignment function,
Vreciprocal(ri) = Σrn V(rn) W(rn − ri) (17)
Even though PME gives higher performance, parallelizing PME yields less speedup, since one part of the simulation, the charge assignment, cannot be parallelized. The charge assignment is done serially and the other parts of PME are parallelized.
3) PME with Cell Charge Approximation:
Combining cell charge approximation with PME gives higher performance. The Particle Mesh Ewald method calculates the direct-space interactions within a finite distance using a modification of Coulomb's Law, and the reciprocal-space interactions using a Fourier transform to build a "mesh" of charges,
interpolated onto a grid, and hence gives a time complexity of O(n log n). The cell charge approximation is combined with PME such that the virtual particle at the centroid of each cell is used as the charged particle in the PME approach, improving the long range electrostatic interaction calculation of the MD simulation. Dividing the simulation box into cells has the added advantage of reducing the computation needed for neighbor list construction, by limiting the particles that need to be considered to those in the current cell and its immediate neighboring cells.
V. IMPLEMENTATION AND RESULTS
Any existing package would have imposed too many restrictions on the underlying data structures and the order in which operations are performed; instead, a completely fresh MD code was built. The serial molecular dynamics algorithm was implemented in C, and the parallel versions were implemented using the CUDA C language extensions. The program implementing the parallel MD algorithm uses the serial version as its basis. The basic serial version follows the core MD algorithm given by Shrinidhi et al. [4].
The running times taken by the different implementations for different numbers of atoms are given in Table 1 and Table 2. Figure 1 and Figure 2 compare the speed of the different implementations on inputs of 3000 atoms and 6000 atoms respectively, and show how the different improvements reduce the running time of the MD simulation. The approximations used in the electrostatic interaction computation reduce the running time dramatically. The overhead of constructing the neighbor lists reduces the performance gain achieved by neglecting the long range Van der Waals interaction. Since the parallelizable portion of the work shrinks while the copy overhead remains the same, the reduction in running time of the parallel implementation due to the approximation methods is smaller than that of the serial version. The accuracy of the simulation results was checked against results obtained from LAMMPS; the results were similar, with some minor errors. Figure 3 compares the speedup obtained by the parallel implementations over the serial versions. The speedup is largest in the direct implementation and smallest in the final implementation, since the parallelizable portion shrinks as new approximations are introduced while the GPU overhead remains the same.
Table 1: Running Time of Different Implementations (3000 Atoms)
Table 2: Running Time of Different Implementations (6000 Atoms)
Fig 1: Performance Comparison (3000 atoms)
Fig 2: Performance Comparison (6000 atoms)
Fig 3: Speed up on GPU
VI. CONCLUSION
Molecular Dynamics is a widely used molecular simulation approach. Reducing the time complexity of MD simulation allows larger molecular systems to be simulated within a practically feasible time period. This work proposes the cell charge approximation technique for reducing the computation needed for the long range electrostatic interaction calculation in Molecular Dynamics simulation. A comparative analysis of the proposed approach against other schemes for calculating the long range interaction was carried out. The results obtained indicate that the cell charge approximation provides a good speedup in the force calculation without degrading the accuracy much.
REFERENCES
[1] Steve Plimpton. Fast parallel algorithms for short-range molecular dynamics. Journal of Computational Physics, 117:1–19, 1995.
[2] Hongjian Li, Shixin Sun, Hong Tang, Yusheng Dou, and Glenn V. Lo. Two-level parallelization of Ehrenfest force calculations in ab-initio molecular dynamics simulation. Volume 15, pages 255–263. Springer US, 2012.
[3] Weiguo Liu, Bertil Schmidt, Gerrit Voss, and Wolfgang Muller-Wittig. Molecular dynamics simulations on commodity GPUs with CUDA. In Proceedings of the 14th International Conference on High Performance Computing, HiPC'07, pages 185–196, Berlin, Heidelberg, 2007. Springer-Verlag.
[4] Shrinidhi Hudli, Shrihari Hudli, Raghu Hudli, Yashonath Subramanian, and T. S. Mohan. GPGPU-based parallel computation: application to molecular dynamics problems. In R. K. Shyamasundar and Lokendra Shastri, editors, Bangalore Compute Conf., page 10. ACM, 2011.
[5] NVIDIA Corporation. NVIDIA CUDA C programming guide, 2010. Version 3.2.
[6] Jason Sanders and Edward Kandrot. CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley Professional, 1st edition, 2010.
[7] D. C. Rapaport. The Art of Molecular Dynamics Simulation. Cambridge University Press, New York,
NY, USA, 1996.
[8] M. Taufer, N. Ganesan, and S. Patel. GPU-enabled macromolecular simulation: Challenges and opportunities. Volume PP, page 1, 2012.
[9] Christopher I. Rodrigues, David J. Hardy, John E. Stone, Klaus Schulten, and Wen-Mei W. Hwu. GPU acceleration of cutoff pair potentials for molecular modeling applications. In Proceedings of the 5th Conference on Computing Frontiers, CF '08, pages 273–282, New York, NY, USA, 2008. ACM.
[10] P. P. Ewald. Die berechnung optischer und elektrostatischer gitterpotentiale. Annalen der Physik,
369(3):253–287, 1921.
[11] Harry A Stern and Keith G Calkins. On mesh-based ewald methods: optimal parameters for two
differentiation schemes. J Chem Phys, 128(21):214106, 2008.
[12] Canqun Yang, Qiang Wu, Juan Chen, and Zhen Ge. GPU acceleration of high-speed collision molecular dynamics simulation. In CIT (2), pages 254–259. IEEE Computer Society, 2009.
[13] Weiguo Liu, Bertil Schmidt, and Wolfgang Muller-Wittig. CUDA-BLASTP: accelerating BLASTP on CUDA-enabled graphics hardware. Volume 8, pages 1678–1684, Los Alamitos, CA, USA, 2011. IEEE Computer Society.
[14] Marco Maggioni, Marco D. Santambrogio, and Jie Liang. GPU-accelerated chemical similarity assessment for large scale databases. Volume 4, pages 2007–2016, 2011.