This document compares two methods for parallel matrix multiplication using PVM (Parallel Virtual Machine): the row per slave method and the rows set per slave method. It finds that the row per slave method provides optimal computation time. The row per slave method assigns each slave a single row from the first matrix to compute, while the rows set per slave method assigns each slave a set of rows. Experimental results on matrices of varying sizes show the row per slave method takes less time, with an average 50% reduction in computation time compared to the rows set per slave method.
Performance comparison of row per slave and rows set per slave method in pvm ... - eSAT Journals
Abstract: Parallel computing operates on the principle that large problems can often be divided into smaller ones, which are then solved concurrently to save time, taking advantage of non-local resources and overcoming memory constraints. Multiplying large matrices requires a lot of computation time. This paper deals with two methods for parallel matrix multiplication. The first (rows set per slave) divides the rows of one input matrix into sets according to the number of slaves and assigns one set of rows to each slave for computation. The second (row per slave) assigns one row of one input matrix at a time: the first row goes to the first slave, the second row to the second slave, and so on, looping back to the first slave once the last slave has received a row, until all rows have been assigned. Both methods are implemented using Parallel Virtual Machine, and the computation is performed for different matrix sizes over different numbers of nodes. The results show that the row per slave method gives the optimal computation time in PVM-based parallel matrix multiplication. Keywords: Parallel Execution, Cluster Computing, MPI (Message Passing Interface), PVM (Parallel Virtual Machine), RAM (Random Access Memory).
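The two row-assignment schemes described in the abstract can be sketched as index mappings. This is only an illustration in plain Python, not the paper's PVM implementation: the function names are hypothetical, the "slaves" are simulated sequentially in one process, and the handling of leftover rows when the row count is not divisible by the number of slaves is an assumption the abstract does not specify.

```python
def rows_set_per_slave(n_rows, n_slaves):
    """Rows set per slave: each slave gets one contiguous set of rows.

    Assumption: leftover rows are spread over the first slaves.
    """
    base, extra = divmod(n_rows, n_slaves)
    assignment, start = [], 0
    for slave in range(n_slaves):
        size = base + (1 if slave < extra else 0)
        assignment.append(list(range(start, start + size)))
        start += size
    return assignment


def row_per_slave(n_rows, n_slaves):
    """Row per slave: row i goes to slave i mod n_slaves (round-robin)."""
    return [list(range(slave, n_rows, n_slaves)) for slave in range(n_slaves)]


def multiply_rows(a, b, rows):
    """Work done by one slave: compute the given rows of the product a*b."""
    inner, cols = len(b), len(b[0])
    return {
        i: [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in rows
    }


def multiply_with_assignment(a, b, assignment):
    """Master side: gather each slave's rows into the result matrix."""
    result = [None] * len(a)
    for rows in assignment:  # in PVM each iteration would be a separate task
        for i, row in multiply_rows(a, b, rows).items():
            result[i] = row
    return result
```

For 7 rows and 3 slaves, `rows_set_per_slave(7, 3)` yields `[[0, 1, 2], [3, 4], [5, 6]]`, while `row_per_slave(7, 3)` yields `[[0, 3, 6], [1, 4], [2, 5]]`. Both assignments produce the same product; they differ only in which slave computes which rows and in how the rows are handed out.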
DYNAMIC TASK PARTITIONING MODEL IN PARALLEL COMPUTING - cscpconf
Parallel computing systems compose task partitioning strategies in a true multiprocessing
manner. Such systems share the algorithm and the processing unit as computing resources, which
leads to high inter-process communication capability. The main part of the proposed
algorithm is a resource management unit, which performs task partitioning and co-scheduling.
In this paper, we present a technique for integrated task partitioning and co-scheduling on a
privately owned network. We focus on real-time, non-preemptive systems. A large variety of
experiments have been conducted on the proposed algorithm using synthetic and real tasks.
The goal of the computation model is to provide a realistic representation of the costs of programming.
The results show the benefit of task partitioning. The main characteristics of our method are
optimal scheduling and a strong link between partitioning, scheduling and communication. Some
important models for task partitioning are also discussed in the paper. We target an algorithm
for task partitioning that improves inter-process communication between tasks and uses
the resources of the system efficiently. The proposed algorithm contributes to
minimizing the inter-process communication cost among the executing processes.
Along with idling and contention, communication is a major overhead in parallel programs.
The cost of communication is dependent on a variety of features including the programming model semantics, the network topology, data handling and routing, and associated software protocols.
Parallel programming platforms are introduced here. For more information about parallel programming and distributed computing visit,
https://sites.google.com/view/vajira-thambawita/leaning-materials
Macromodel of High Speed Interconnect using Vector Fitting Algorithm - ijsrd.com
Efficient macromodeling of high-speed interconnects at high frequency is a perennially challenging task. We present systematic methodologies for generating rational function approximations of high-speed interconnects using the vector fitting technique for any type of termination condition, and for constructing an efficient multiport model that is directly compatible with circuit simulators.
In all-reduce, each node starts with a buffer of size m and the final results of the operation are identical buffers of size m on each node that are formed by combining the original p buffers using an associative operator.
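The all-reduce semantics described above can be illustrated with a small single-process simulation. This is a sketch of the input/output behaviour only, not of any particular communication algorithm or library; the function name `all_reduce` and the use of Python lists as buffers are assumptions made for illustration.

```python
from functools import reduce
import operator


def all_reduce(buffers, op=operator.add):
    """Simulate all-reduce over p nodes, each holding a buffer of size m.

    `op` is assumed to be associative. The result is a list of p
    identical buffers of size m, one per node.
    """
    m = len(buffers[0])
    if any(len(buf) != m for buf in buffers):
        raise ValueError("all buffers must have the same size m")
    # Combine the p original buffers element by element.
    combined = [reduce(op, (buf[i] for buf in buffers)) for i in range(m)]
    # Every node ends up with its own copy of the combined buffer.
    return [list(combined) for _ in buffers]
```

With three nodes holding `[1, 2]`, `[3, 4]` and `[5, 6]`, the default sum leaves `[9, 12]` on every node; passing `op=max` leaves `[5, 6]` on every node.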
A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME... - ijdpsjournal
In this paper, a new progressive mesh algorithm is introduced to perform fast physical simulations using a lattice Boltzmann method (LBM) on a single-node multi-GPU architecture. The algorithm automatically meshes the simulation domain according to the propagation of fluids, and it can also be useful for several other types of physical simulation. We associate this
algorithm with a multiphase and multicomponent lattice Boltzmann model (MPMC–LBM) because that model
can perform various types of simulations on complex geometries. Combining this algorithm
with the massive parallelism of GPUs[5] yields very good performance in comparison with the
static mesh method used in the literature. Several simulations are shown in order to evaluate the algorithm.
A natural extension of the Random Access Machine (RAM) serial architecture is the Parallel Random Access Machine, or PRAM.
PRAMs consist of p processors and a global memory of unbounded size that is uniformly accessible to all processors.
Processors share a common clock but may execute different instructions in each cycle.
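As a toy illustration of the PRAM model, the sketch below simulates a parallel sum in which each loop iteration stands for one shared clock cycle. The function name `pram_sum`, the tree-reduction algorithm, and the exclusive-read/exclusive-write access pattern are assumptions chosen for the example; the text above only defines the machine model.

```python
def pram_sum(values):
    """Simulate a tree-based sum on a PRAM and count clock cycles.

    `memory` plays the role of the global memory. In each cycle, the
    active processors each add one pair of cells; idle processors do
    nothing, which reflects that processors may execute different
    instructions in the same cycle.
    """
    memory = list(values)
    n = len(memory)
    if n == 0:
        return 0, 0
    stride, cycles = 1, 0
    while stride < n:
        snapshot = list(memory)  # all reads of a cycle precede its writes
        for pid in range(0, n, 2 * stride):  # conceptually runs in parallel
            if pid + stride < n:
                memory[pid] = snapshot[pid] + snapshot[pid + stride]
        stride *= 2
        cycles += 1
    return memory[0], cycles
```

For eight values the simulated machine needs three cycles rather than seven sequential additions, for example `pram_sum([1, 2, 3, 4, 5, 6, 7, 8])` returns `(36, 3)`.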
(Paper) Task scheduling algorithm for multicore processor system for minimiz... - Naoki Shibata
Shohei Gotoda, Naoki Shibata and Minoru Ito : "Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault," Proceedings of IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2012), pp.260-267, DOI:10.1109/CCGrid.2012.23, May 15, 2012.
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the
recovery time in case of a single fail-stop failure of a multicore
processor. Many of the recently developed processors have
multiple cores on a single die, so that one failure of a computing
node results in failure of many processors. In the case of a failure
of a multicore processor, all tasks which have been executed
on the failed multicore processor have to be recovered at once.
The proposed algorithm is based on an existing checkpointing
technique, and we assume that the state is saved when nodes
send results to the next node. If a series of computations that
depends on former results is executed on a single die, we need
to execute all parts of the series of computations again in
the case of failure of the processor. The proposed scheduling
algorithm tries not to concentrate tasks to processors on a die.
We designed our algorithm as a parallel algorithm that achieves
O(n) speedup where n is the number of processors. We evaluated
our method using simulations and experiments with four PCs.
We compared our method with an existing scheduling method, and
in the simulation, the execution time including recovery time in
the case of a node failure is reduced by up to 50% while the
overhead in the case of no failure was a few percent in typical
scenarios.
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEM - IJCSEA Journal
Task assignment is one of the most challenging problems in a distributed computing environment. An optimal task assignment guarantees minimum turnaround time for a given architecture. Researchers have proposed several approaches to optimal task assignment, ranging from graph-partitioning-based tools to heuristic graph matching. With heuristic graph matching, it is often impossible to obtain an optimal task assignment for practical test cases within an acceptable time limit. In this paper, we have parallelized the basic heuristic graph-matching algorithm for task assignment, which is suitable only for cases where processors and inter-processor links are homogeneous. This proposal is a derivative of the basic task assignment methodology using heuristic graph matching. The results show that near-optimal assignments are obtained much faster than with the sequential program in all cases, with reasonable speed-up.
Comprehensive Performance Evaluation on Multiplication of Matrices using MPI - ijtsrd
Matrix multiplication is a concept used in technology applications such as digital image processing, digital signal processing and graph problem solving. Multiplication of huge matrices requires a lot of computing time, as its complexity is O(n^3). Because most engineering and science applications require higher computational throughput in minimum time, many sequential and parallel algorithms have been developed. In this paper, methods of matrix multiplication are selected, implemented, and analyzed. A performance analysis is carried out, and some recommendations are given for using the OpenMP and MPI methods of parallel computing. Adamu Abubakar I | Oyku A | Mehmet K | Amina M. Tako, "Comprehensive Performance Evaluation on Multiplication of Matrices using MPI",
published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-2, February 2020.
URL: https://www.ijtsrd.com/papers/ijtsrd30015.pdf
Paper Url : https://www.ijtsrd.com/engineering/electrical-engineering/30015/comprehensive-performance-evaluation-on-multiplication-of-matrices-using-mpi/adamu-abubakar-i
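The cubic cost mentioned in this abstract can be made concrete with the textbook triple-loop algorithm. The sketch below is a plain sequential illustration, not the paper's OpenMP or MPI code; the function name `matmul_count` is hypothetical, and the counter exists only to expose the number of scalar multiplications.

```python
def matmul_count(a, b):
    """Multiply a (n x m) by b (m x p) and count scalar multiplications."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0] * p for _ in range(n)]
    multiplications = 0
    for i in range(n):          # each row of a
        for j in range(p):      # each column of b
            for k in range(m):  # dot product of row i and column j
                c[i][j] += a[i][k] * b[k][j]
                multiplications += 1
    return c, multiplications
```

For two square matrices of order n the counter ends at n^3, so doubling n multiplies the work by eight. The outer loop over `i` has independent iterations, which is what row-based parallel decompositions exploit.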
program partitioning and scheduling IN Advanced Computer Architecture - Pankaj Kumar Jain
Advanced Computer Architecture, Program Partitioning and Scheduling, Program Partitioning & Scheduling, Latency, Levels of Parallelism, Loop-level Parallelism, Subprogram-level Parallelism, Job or Program-Level Parallelism, Communication Latency, Grain Packing and Scheduling, Program Graphs and Packing
A survey of Parallel models for Sequence Alignment using Smith Waterman Algor... - iosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Over time, Machine Learning inference workloads have become more and more demanding in terms of latency and throughput. Moreover, many inference workloads compute predictions based on a limited number of models that are deployed in the system. This scenario leaves ample room for optimizing runtime and memory, which current systems fall short of exploring because they treat ML models and tasks as black boxes.
By contrast, Pretzel adopts a white-box description of ML models, which allows the framework to perform optimizations over deployed models and running tasks, saving memory and increasing overall system performance. In particular, Pretzel can properly schedule ML jobs on NUMA machines, whose complexities may affect latency and efficiency.
In this talk we will show the motivations behind Pretzel, its current design and possible future developments.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Procurement principle towards effective management of construction projects - eSAT Publishing House
Tuning of pid controller of inverted pendulum using genetic algorithm - eSAT Publishing House
Coupling the bionic surface friction contact performance and wear resistance ... - eSAT Publishing House
Analysis of multi hop relay algorithm for efficient broadcasting in manets - eSAT Publishing House
Improving quality of service using ofdm technique for 4 th generation network - eSAT Publishing House
Load balancing in public cloud combining the concepts of data mining and netw...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Hardback solution to accelerate multimedia computation through mgp in cmpeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Performance comparison of row per slave and rows set
1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 02 Issue: 12 | Dec-2013, Available @ http://www.ijret.org 249
PERFORMANCE COMPARISON OF ROW PER SLAVE AND ROWS SET
PER SLAVE METHOD IN PVM BASED PARALLEL MATRIX
MULTIPLICATION
Sampath S1, Nanjesh B R2, Bharat Bhushan Sagar3, C K Subbaraya4
1 Research Scholar, Sri Venkateshwara University, Gajraula, Amroha, Uttarpradesh, INDIA, 23.sampath@gmail.com
2 Department of Information Science and Engineering, Adichunchanagiri Institute of Technology, Chikmagalur, Karnataka, INDIA, nanjeshbr@gmail.com
3 Department of Computer Science and Engineering, Birla Institute of Technology, Noida, Uttarpradesh, INDIA, drbbsagar@gmail.com
4 Department of Computer Science and Engineering, Adichunchanagiri Institute of Technology, Chikmagalur, Karnataka, INDIA, subrayack@gmail.com
Abstract
Parallel computing operates on the principle that large problems can often be divided into smaller ones, which are then solved
concurrently to save time by taking advantage of non-local resources and overcoming memory constraints. Multiplication of larger
matrices requires a lot of computation time. This paper deals with two methods for handling parallel matrix multiplication. The first
method divides the rows of one of the input matrices into sets of rows based on the number of slaves and assigns one rows set to each
slave for computation. The second method assigns one row of one of the input matrices at a time to each slave, starting with the first
row to the first slave, the second row to the second slave and so on, looping back to the first slave once the last slave has been
assigned a row, and repeating until all rows have been assigned. These two methods are implemented using Parallel Virtual Machine,
and the computation is performed for different sizes of matrices over different numbers of nodes. The results show that the row per
slave method gives the optimal computation time in PVM based parallel matrix multiplication.
Keywords: Parallel Execution, Cluster Computing, MPI (Message Passing Interface), PVM (Parallel Virtual Machine),
RAM (Random Access Memory).
---------------------------------------------------------------------***----------------------------------------------------------------------
1. INTRODUCTION
Parallel processing refers to the concept of speeding up the
execution of a program by dividing the program into multiple
fragments that can execute simultaneously, each on its own
processor. Matrix multiplication is commonly used in the areas
of graph theory, numerical algorithms, image processing and
aviation. Multiplication of larger matrices requires a lot of
computation time. This paper deals with how to handle the
matrix multiplication problem, which can be split into
sub-problems that are solved simultaneously, using two
methods of parallel matrix multiplication.
MPI (Message Passing Interface) is a specification for
message-passing libraries that can be used for writing portable
parallel programs. In MPI programming, a fixed set of processes
is created at program initialization. Each process knows its
own number and the total number of processes, and can
communicate with the other processes. A process cannot
create new processes, so the group of processes is static [11].
PVM (Parallel Virtual Machine) is a software package that
allows a heterogeneous collection of workstations (host pool) to
function as a single high performance parallel virtual machine.
The PVM system consists of the daemon (or pvmd), the console
process and the interface library routines. One daemon process
resides on each constituent machine of the virtual machine.
Daemons are started when the user starts PVM by specifying a
host file, or by adding hosts using the PVM console [12].
This paper deals with the implementation of a parallel
application, matrix multiplication, under a recent version of
PVM, PVM3.4.6 [12], which is used for communication
between the cores and for the computation because it is
well suited to implementation on Linux systems.
2. RELATED WORKS
Amit Chhabra and Gurvinder Singh (2010) [1] proposed a cluster
based parallel computing framework that is based on the
master-slave computing paradigm and emulates the parallel
computing environment. Hai Jin et al (2001) [6] discussed the
incentive for using clusters as well as the technologies available
for building clusters, and also discussed a number of Linux-based
tools, such as MPI and PVM, and utilities for building
clusters. Rafiqul Zaman Khan and Md Firoj Ali (2011) [2]
presented a comparative study of the MPI and PVM parallel
programming tools in parallel distributed computing systems.
They described some of the features of parallel distributed
computing systems, with a particular focus on PVM and MPI,
which are the tools most used in today’s parallel and distributed
computing systems. Sampath S et al (2012) [3] presented a
framework that demonstrates the performance gains and losses
achieved through parallel processing, and analysed the performance
of parallel applications using this cluster based parallel
computing framework. Rajkumar Sharma et al (2011) [5]
evaluated the performance of parallel applications using MPI on
a cluster of nodes having different computing powers in terms of
hardware attributes/parameters. Cirtek P and Racek S (2007)
[4] compared the performance of distributed simulation
using PVM and MPI, presenting the possibilities for speeding up
simulation programs using parallel processing and
comparing the results of example experiments.
Eyas El-Qawsmeh et al [7] presented a quick matrix
multiplication algorithm and evaluated it on a cluster of
networked workstations consisting of Pentium hosts connected
together by Ethernet segments. Petre Anghelescu [8] showed
how a matrix multiplication can be implemented on a network
of computers using the MPI standard, presented extensive
experimental results regarding the performance of parallel
matrix multiplication algorithms, and described various ways
of distributing matrices among processors. Muhammad Ali
Ismail et al [9] performed concurrent matrix multiplication on
multi-core processors. That study is part of ongoing research
on the design of a new parallel programming model, SPC3 PM,
for multicore architectures. Ziad A.A. Alqadi et al [10]
conducted a performance analysis and evaluation of parallel
matrix multiplication algorithms. In that work, a theoretical
analysis of the performance of the parallel matrix
multiplication algorithms is carried out, and an experimental
analysis is performed to support the theoretical results.
Recommendations are made based on this analysis for
selecting the proper parallel multiplication algorithm.
In our work we compare the row per slave and rows set per
slave methods, both implemented using PVM. We show that
the optimal computation time is obtained using the row per
slave method of parallel matrix multiplication.
3. SYSTEM REQUIREMENTS
3.1 Hardware Requirements
• Processor: Pentium D (3 GHz)
• RAM: 256 MB and 1 GB
• Hard Disk Free Space: 5 GB
• Network: TCP/IP LAN using switches or hubs
3.2 Software Requirements
• Operating System: Linux
• Version: Fedora Core 14
• Compiler: GCC
• Communication protocol: PVM
• Network protocol: Secure Shell
Fig 1: Cluster based parallel computing architecture
4. CLUSTER BASED PARALLEL COMPUTING
ARCHITECTURE
Fig. 1 shows the cluster based parallel computing architecture
involving three nodes over which PVM based parallel
applications can run. Desktop PCs, termed nodes here, are
connected together using an Ethernet TCP/IP LAN to
work as a single high performance computing system. Each node
contains two cores. The processes perform the parallel
computation using the capacity of the underlying nodes. One
process acts as the master and the remaining processes act as
slaves. A unique task id (tid) is generated for each process to
identify it in the communication world. The master process
takes the main problem and divides it into tasks that it assigns
to the slaves. Each slave sends back the solution of its
assigned task.
5. ROWS SET PER SLAVE METHOD OF MATRIX
MULTIPLICATION
The operations involved in the rows set per slave method of
matrix multiplication are as follows. The master finds the
average number of rows to be sent to each slave and the number
of extra rows. From these it determines the number of rows in
each rows set of matrix A and sends each rows set, along with
its offset, to the available slaves. Assignment is done serially
from the first slave to the last slave. Each slave computes its
rows set of the resultant matrix C, using the entire matrix B
and the rows set assigned to it, and sends back the solution.
Solutions are received serially from the first slave to the last
slave. The master receives from each slave the solution of its
subtask, which is a part of the resultant matrix C, along with
the offset. The rows set per slave algorithms for the master
side and slave side operations are shown in Fig 2 and Fig 3
respectively.
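The partitioning step described above can be sketched in a few lines of Python. This is only an illustration, not the paper's PVM code: the function name partition_rows is hypothetical, and the rule that the first few slaves each absorb one extra row is an assumption, since the text does not say how the extra rows are distributed.

```python
def partition_rows(num_rows, num_slaves):
    """Return one (offset, rows) pair per slave for the rows of matrix A."""
    # Average number of rows per slave and the leftover (extra) rows.
    average_rows, extra_rows = divmod(num_rows, num_slaves)
    assignments = []
    offset = 0
    for slave in range(num_slaves):
        # Assumption: the first `extra_rows` slaves each take one additional row.
        rows = average_rows + 1 if slave < extra_rows else average_rows
        assignments.append((offset, rows))
        offset += rows
    return assignments
```

For example, under this assumption ten rows over three slaves are split into sets of 4, 3 and 3 rows starting at offsets 0, 4 and 7.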
6. ROW PER SLAVE METHOD OF MATRIX
MULTIPLICATION
The operations involved in the row per slave method of matrix
multiplication are as follows. The master sends to each slave
one row of the first matrix (matrix A) together with a count
value, which ranges from 0 to (size of matrix - 1) and
identifies the row. As soon as a slave receives a row of matrix
A, it computes the corresponding row of the resultant matrix C
using the received row and the matrix B predefined in it. The
slave then sends the resultant row of matrix C back to the
master along with its tid and the count value. Initially, the
master starts receiving resultant rows only after a single row
has been assigned to every available slave. The master then
receives the resultant row, count value and tid from whichever
process has finished its computation, and that process is set
free. The tid is used to assign the next row to that slave
process. The master copies the received row into matrix C in
the correct position using the count value. This procedure is
repeated until all the rows of matrix A have been assigned to
slaves. The master side and slave side operations of the row
per slave method are shown in Fig. 4.a and Fig. 4.b
respectively. The operations involved in obtaining the
resultant matrix are shown in Fig. 4.c with a simple example.
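The master loop described above can be imitated on a single machine with Python's standard thread pool. This is a rough analogy under stated assumptions, not the paper's PVM implementation: threads stand in for slave processes, the pool decides which free worker takes the next row (so the tid bookkeeping is implicit), and the names multiply_row and row_per_slave are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def multiply_row(count, row, b):
    """Slave side: compute one row of C from one row of A and the full matrix B."""
    n_cols = len(b[0])
    result = [sum(row[j] * b[j][k] for j in range(len(row))) for k in range(n_cols)]
    return count, result


def row_per_slave(a, b, num_slaves):
    """Master side: hand out rows one at a time and refill whichever slave finishes."""
    c = [None] * len(a)
    next_row = 0
    with ThreadPoolExecutor(max_workers=num_slaves) as pool:
        pending = set()
        # Initial round: one row to each available slave.
        while next_row < len(a) and len(pending) < num_slaves:
            pending.add(pool.submit(multiply_row, next_row, a[next_row], b))
            next_row += 1
        # Receive a finished row, place it using its count value,
        # then assign the next unassigned row to the freed slave.
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                count, result = future.result()
                c[count] = result
                if next_row < len(a):
                    pending.add(pool.submit(multiply_row, next_row, a[next_row], b))
                    next_row += 1
    return c
```

Because each result carries its count value, rows may come back in any order and still land in the right position of C, which is the point of the count value in the description above.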
7. RESULTS AND DISCUSSION
We compared the new row per slave method and the traditional
rows set per slave method using the parallel computing tool
PVM3.4.6. Computation time was measured for different sizes
of input matrices and for executions over different numbers of
nodes. Table 1 shows the computation time taken for rows set
per slave and row per slave based matrix multiplication using
PVM. Comparisons of the two methods over single, two and
three nodes using PVM are shown in Figs 5, 6 and 7. The row
per slave method takes less computation time than the rows set
per slave method. The rows set per slave method takes more
computation time because it assigns subtasks and retrieves
solutions serially, in units of rows sets.
Fig 2: Algorithm for Master side operations using rows set per slave method
Fig 3: Algorithm for Slave side operations using rows set per slave method
Fig 4: a) Flow diagram for operations involved at the Master side. b) Flow diagram for operations involved at the Slave side.
c) Example (1 task divided into 3 subtasks computed using 2 slaves) showing the operations involved.
define matrix B
get parent id, so we know where to receive from
mtype = FROM_MASTER
receive offset, matrix A’s subset and matrix B
for k = 0 to NCB - 1
    for i = 0 to rows - 1
        c[i][k] = 0.0
        for j = 0 to NCA - 1
            c[i][k] = c[i][k] + a[i][j] * b[j][k]
        end for
    end for
end for
mtype = FROM_WORKER
send offset, set of rows portion of resultant matrix C
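The computation at the heart of the slave side pseudocode can be written as a small Python function. This is a sketch only: message passing is left out, the function name compute_rows_set is hypothetical, and the loop bounds are taken to be exclusive of NCB, rows and NCA.

```python
def compute_rows_set(offset, a_rows, b):
    """Multiply a received rows set of A by the full matrix B.

    Returns the offset unchanged together with the matching rows of C,
    mirroring what the slave sends back to the master.
    """
    nca = len(b)      # columns of A, equal to rows of B
    ncb = len(b[0])   # columns of B
    c_rows = [[0.0] * ncb for _ in a_rows]
    for k in range(ncb):
        for i in range(len(a_rows)):
            for j in range(nca):
                c_rows[i][k] += a_rows[i][j] * b[j][k]
    return offset, c_rows
```

The offset is passed straight through so that the master can place the returned rows at the correct position in C.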
Table 1: Performance comparison of rows set per slave and row per slave methods using PVM (all times in seconds)
Number of nodes  Type of matrix multiplication  1000*1000  1500*1500  2000*2000  2500*2500  3000*3000
Single node      Rows set per slave             10.1783    36.2335    86.3251    171.3234   291.5617
Single node      Row per slave                  5.4749     28.2361    43.0896    133.6212   157.8411
Two nodes        Rows set per slave             7.5521     18.4315    45.8543    75.1528    136.2641
Two nodes        Row per slave                  2.8996     12.0456    22.0475    58.4992    80.4918
Three nodes      Rows set per slave             7.6645     19.1214    36.2816    66.9552    108.2531
Three nodes      Row per slave                  1.9545     7.4321     14.6072    31.4274    48.4312
Fig 5: Comparison over single node using PVM
Fig 6: Comparison over two nodes using PVM
(Charts plot time in seconds against matrix size for the rows set per slave and row per slave methods.)
Fig 7: Comparison over three nodes using PVM
Even though the slaves have finished computation, they must
wait until their turn comes to send the solution. In the row
per slave method, by contrast, each slave computes only one
row at a time, sends back the solution and receives another
row for computation.
CONCLUSIONS
The row per slave method gives a better computation time
than the rows set per slave method in PVM based parallel
matrix multiplication. The average reduction in computation
time with the row per slave method, compared to the rows set
per slave method of matrix multiplication, is around 50%.
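The reduction figure can be checked against Table 1 with a short calculation. The paper does not state how the average was taken, so the sketch below assumes an unweighted mean of the per-cell percentage reductions; the names ROWS_SET, ROW_PER_SLAVE, percent_reduction and mean_reduction are hypothetical.

```python
# Computation times in seconds copied from Table 1, ordered by matrix size
# (1000, 1500, 2000, 2500, 3000) for one, two and three nodes.
ROWS_SET = [
    [10.1783, 36.2335, 86.3251, 171.3234, 291.5617],
    [7.5521, 18.4315, 45.8543, 75.1528, 136.2641],
    [7.6645, 19.1214, 36.2816, 66.9552, 108.2531],
]
ROW_PER_SLAVE = [
    [5.4749, 28.2361, 43.0896, 133.6212, 157.8411],
    [2.8996, 12.0456, 22.0475, 58.4992, 80.4918],
    [1.9545, 7.4321, 14.6072, 31.4274, 48.4312],
]


def percent_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


def mean_reduction(baseline_table, improved_table):
    """Unweighted mean of the per-cell reductions (an assumed averaging rule)."""
    reductions = [
        percent_reduction(b, i)
        for b_row, i_row in zip(baseline_table, improved_table)
        for b, i in zip(b_row, i_row)
    ]
    return sum(reductions) / len(reductions)
```

Under this averaging rule, mean_reduction(ROWS_SET, ROW_PER_SLAVE) comes out at roughly 47%, which is in line with the "around 50%" stated above. The individual cells vary widely, from about 22% to about 75%.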
ACKNOWLEDGEMENTS
We offer our humble pranams to His Holiness SRI SRI SRI
Dr. BALAGANGADHARANATHA MAHA SWAMIJI and
seek his blessings. First and foremost, we would like to thank
Dr. C.K. Subbaraya, Principal, Adichunchanagiri Institute of
Technology, Chikmagalur, for his moral support towards
completing our work. We would also like to thank Dr.
Mallikarjuna Bennur for his valuable suggestions
throughout our work.
REFERENCES
[1] Amit Chhabra, Gurvinder Singh, “A Cluster Based
Parallel Computing Framework (CBPCF) for
Performance Evaluation of Parallel Applications”,
International Journal of Computer Theory and
Engineering, Vol. 2, No. 2, April 2010.
[2] Rafiqul Zaman Khan, Md Firoj Ali, “A Comparative
Study on Parallel Programming Tools in Parallel
Distributed Computing System: MPI and PVM”,
Proceedings of the 5th National Conference;
INDIACom-2011.
[3] Sampath S, Sudeepa K.B, Nanjesh B R “Performance
Analysis and Evaluation of Parallel Applications using a
CBPCF”, International Journal of Computer Science and
Information Technology Research Excellence
(IJCSITRE), Vol.2,Issue 1,Jan-Feb 2012.
[4] Cirtek P, Racek S, “Performance Comparison of
Distributed Simulation using PVM and MPI”, The
International Conference on "Computer as a Tool".
Page(s): 2238 – 2241, EUROCON, 2007.
[5] Rajkumar Sharma, Priyesh Kanungo, Manohar
Chandwani, “Performance Evaluation of Parallel
Applications using Message Passing Interface in
Network of Workstations of Different Computing
Powers”, Indian Journal of Computer Science and
Engineering (IJCSE), Vol. 2, No. 2, April-May 2011.
[6] Hai Jin, Rajkumar Buyya, Mark Baker, “Cluster
Computing Tools, Applications, and Australian
Initiatives for Low Cost Supercomputing”, MONITOR
Magazine, The Institution of Engineers Australia ,
Volume 25, No 4, Dec.2000-Feb 2001.
[7] Eyas El-Qawsmeh, Abdel-Elah AL-Ayyoub, Nayef
Abu-Ghazaleh, “Quick Matrix Multiplication on
Clusters of Workstations”, INFORMATICA, Volume
15, Issue.2, pages 203–218, 2004.
[8] Petre Anghelescu, “Parallel Algorithms for Matrix
Multiplication”, 2012 2nd International Conference on
Future Computers in Education, Vols.23-24, pages 65-
70, 2012.
[9] Muhammad Ali Ismail, S. H. Mirza, Talat Altaf,
“Concurrent Matrix Multiplication on Multi-Core
Processors”, International Journal of Computer Science
and Security, Volume 5, Issue 2, pages 208-220, Feb
2011.
[10] Ziad A.A. Alqadi, Musbah Aqel and Ibrahiem M. M. El
Emary, “Performance Analysis and Evaluation of
Parallel Matrix Multiplication Algorithms”, World
Applied Sciences Journal, Volume 5, Issue 2, pages
211-214, 2008.
[11] History of PVM versions:
http://www.netlib.org/pvm3/book/node156.html.
[12] PVM3.4.6: http://www.csm.ornl.gov/pvm/pvm3.4.6.