SlideShare a Scribd company logo
1 of 4
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1417
Latin Square Computation of Order-3 using Open CL
Avnish Kansal1, Ashish Chaturwedi2
1,2Department of Computer Science & Engineering, Carlox Teacher’s University, Ahmedabad, Gujarat
---------------------------------------------------------------------------***---------------------------------------------------------------------------
Abstract: Latin sqaure is widely used in steganography,
cryptography, digital watermarks, computer games,
sudoku, graph analysis, error correcting codes; generate
magic squares, statistics and mathematical field. The
Sudoku puzzles are a special case of Latin squares. When
we have to make the latin computation using the
sequential algorithm then it waste more clock time. By
using parallel programming (OpenCL) the time taken is
reduced and throughput is increased. Traditionally Latin
square methodology is based on heuristic cell based
technique and generates random Latin square using
genetic algorithmic approach both consumes high
processing time and decreases the throughput. Here we
are presenting the algorithm by using parallel processing
environment using OpenCL for computing latin square of
order3.
Keywords: OpenCL, gnuplot, Sequential architecture,
parallel architecture, GPU.
I. Introduction:
Latin square is an 𝑛×𝑛 array in which each cell is having at
most one symbol, chosen form an n-set, such that every
symbol occurs at most one time in each row and at- most
one time in each column. The “Latin square” name was
stimulated by mathematical papers by Leonhard Euler.
Two latin squares are said to be orthogonal if both Latin
squares of the same size such that when one latin square is
superposed on the other latin, each letter of the one
coincides once with each letter of the other. The two Latin
squares are held to be conjugate if the rows of one are the
columns of the other that is if the rows and columns of a
square be interchanged then conjugate square is
generated. An Adjugacy is a generality of the concept to
conjugacy in which a permutation of the constraints of one
generates another [12]. Each Latin square is defined as a
triple (r,c,s), where r is the row, c is the column, and s is
the symbol and from this triplet we attain a set of n2 triples
called the orthogonal array representation of the
square. Latin square of n x n order, in which every row is
derivative from any other in a cyclic permutation of
degree n, or by a power of such a permutation, is
a cyclic Latin square [9].
While the implementation done using sequential
algorithms techniques but with help of high level
languages parallel processing technique we are able to
decrease the processing time for matrix processing
application [3]. As the problem is divided into discrete set
of instances which are solved concurrently [1]. With help
of this parallel computation technique we are able to
execute two or more instructions at a same time
simultaneously. While executing the sequential algorithms
on a CPU it runs slower. In the proposed system the
sequential algorithms which have task parallelism or data
parallelism those algorithms are implemented to OpenCL
[14]. By the help of OpenCL we minimizes overhead on the
CPU and makes matrix processing run faster and efficiently
to get higher throughput.
Concurrency is the way to sharing of multiple resources in
a software. For the applications that are naturally parallel
the concurrency provides an abstraction [5].
When the execution of the multiple threads running in
parallel, this means that the active thread running
simultaneously on different hardware resources and
processing elements [10]. The execution of the
simultaneous threads is provided by the platforms, for
achieving parallel computing. In latest computing
machines we have SIMD or MIMD which have capabilities
to exploit either data level or task level parallism[6].
Task level parallism, to handle different number of tasks,
within a single problem at the same time. The efficiency of
this model will depend on the independent operations of
the task [6].
Data level parallelism, to handle the discrete chunks of
the same task at the same time simultaneously. The
efficiency of this model will depend on the independent
operations of the task [6].
1.1 Existing System & Proposed System
Traditionally all the computations and execution of
instructions are handled by the CPU in the computer.
Architecturally, the CPU consists of very few cores with
lots of cache memory which are able to handle a few
software threads at a time concurrently. These few cores
are used to optimize the query by sequential serial
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1418
processing architecture. So to alleviate the load of the CPU
by handle all its advanced computations which are
necessary to project the final display on the monitor a
concept of GPU’s being introduced. The capability of the
GPUs with 100+ cores to process thousands of threads can
accelerate software by 100x over a CPU alone. The GPUs
have massively parallel architecture which consists of
thousands of smaller and more efficient cores that are
designed for handling multiple tasks simultaneously. This
is the reason for the wide and mainstream acceptance of
the GPU’s now a day. The GPU-accelerated computing has
now grown into a mainstream movement that is supported
by the latest operating systems from Apple (with OpenCL)
and Microsoft (using DirectCompute). This accelerated
computing consist of graphics processing unit (GPU)
together with a CPU to accelerate scientific, analytics,
engineering, consumer, and enterprise applications.
1.2 Defining Algorithm
The parallel processing algorithms are intended for Latin
square. The representation of algorithms is presented in
Figure 1.The first step is input matrix of order 3, we have
input as a matrix in OpenCL programming code. The
matrix have complex data in the form of array values when
we applying sequential algorithm on an matrix it takes
more of time for execution on CPU. But if we are applying
the parallel processing concepts then we positively we
reduce time taken by the matrix execution.
Second step, is decomposing the input matrix according to
task parallelism or data parallelism technique. Then this
divided matrix is being used in third step for concurrent
execution
Third step, is defining the individual sub matrices which
are further being processed by various processing
elements.
Fourth step, after the division of matrix into sub matrices
each individual sub matrix is sent to number of processing
elements simultaneously in GPU.
Fifth step, the sub matrices result is being calculated with
the help of processing elements concurrently.
Sixth step is to combine the results of all matrices in single
processing element so as to obtain the final output of input
matrix in reduced time using parallel architecture.
Figure 1 Flowchart of Proposed Work
Seventh step, is to store the final result of input matrix
from various sub matrices which are further being saved
in the dynamic random access memory (DRAM) of
graphical processing unit.
Eighth step, after the final result is saved in the DRAM of
GPU; that result is copy to the DRAM of CPU for displaying
the final output to the user.
2. Implementing Latin Square using OpenCL:
The algorithm which we are chosen to parallel the matrix
over a number of workgroups, the matrix is further
divided into number of chunks as per the coarse grained
division. The workgroup is the part of a matrix by which
we make the sub-matrix stored in the on chip local
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1419
memory of the GPU. These partial latin square are further
reduced into the single main latin square results.
2.1 Overview is as follows, when we begin to design a
OpenCL kernel to the respective hardware we have to take
care work-items and the size of the work_group. The local
memory is resides in the work_group and we share the
data inside the work_group. The local memory is not
shared in the work_group only. If we want to share the
results of the each work_group to another then we are not
able to do this. The work_group in OpenCL consists of
work_items which further share between the local
memories within that work_group. We are performing all
this to reduce the memory overheads by storing sub-
matrix in local memory.
After that we have to define more work_groups for
exchanging more local memory data into the global
memory but we must assure that the number of
work_groups for efficient use of local memory and reduce
overhead on the global memory the number should be
close to the number of compute units we have in our
hardware.
To traverse the total input matrix we use here, global Ids,
local Ids, group Ids, group size, etc.
Our OpenCL program consists of two parts: First is kernel
part, the instances of kernel are copied to the different
compute units and the kernel is executed on the each
compute unit individually. Another is host code which we
have to add to work for the different models of the OpenCL
which is executed on the host or CPU.
The host program defines the context for the kernels and
manages their execution. Each OpenCL device has a
command queue, where the host programs are queued for
kernel execution and memory transfer.
The core of the OpenCL execution model is defined by
execution of kernels. When kernel is executed an index
space is defined which is an instance of the kernel as in our
problem. Each work_item executes the same code but the
execution pathway and the data used will be different.
Work items are organized into workgroups. Each
work_group is assigned a unique workgroup_Id and each
work_item in the workgroup assigned the local_Id.
A single work_group can be identified by its global_Id or
by its workgroup_Id or local_Id. The work_item in a single
work_group executes concurrently on a processing
elements of a single compute units. Work_item in a
work_group can synchronize each other and share data
through local memory in the compute unit. All the
work_items has read and write access to any position in
the global memory. The global and constant memory can
also be accessed from the host processor before and after
kernel execution.
3 Experiments and results: For each latin square, both
the GPU kernel code and CPU serial code are designed. The
processing that takes place on the CPU; the kernel code in
the algorithm as an instance copied to the GPU. The kernel
is executed on compute unit and after that the results are
copied to the CPU.
Figure 2: Latin Square Computation (Order-3) order v/s
time graph using procedural programming code technique
having more time complexity.
Figure 3: Latin Square Computation (order-3) time v/s
order graph using OpenCL code having reduced time
complexity.
The speed up of the latin square computation by the GPU is
significantly improves the computing speed by reducing
the time complexity and throughput. Here the graph
depicts while using the procedural programming
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1420
technique the time complexity increases as the order for
latin square increased. But when the same code is
implemented using OpenCL the time complexity decreases
tremendously.
4. Conclusions
In this paper a framework is tried to be developed an
efficient algorithmic approach to find a solution to
different latin square to enhance the existing approach. We
implemented the Latin square problem definition of order-
3 in GPGPU (GP2U), which providing heterogeneous
environment to execute and capability to reduce time
complexity which improves the performance of Latin
square computation over intensive domains latin square
We executed the latin square by using OpenCL
environment and compared with the sequential
implementations on CPU. On CPU algorithm takes very
huge amount of time. Obviously, the time taken becomes
low on GPU device. It provides novel and efficient
acceleration technique for matrix calculation and is cheap
in hardware implementation.
Future work is to gain deeper knowledge about the
parallization techniques to make best use of GPU device
and to work with other algorithms related to these project
concepts.
5. References
[1] “Heterogeneous computing with OpenCL” by Benedict
gaster,british libraries,printed USA
[2] http://www.khronos.org/OpenCL ,”Khronos Group”.
[3] Roberto Fontana, Random Latin squares and Sudoku
designs generation,2013
[4] Nan Zhang, Yun-shan Chen, Jian-li Wang. “Image
Parallel Processing Based on GPU”. International
Conference on Advanced Computer Control, March 2010.
[5] Pardalos P.M., Xue, J, “The maximum clique problem”,
Journal of Global Optimization, 4, 1994, 301—328.
[6] Demetres Christofides, Klas Markstrom (2003)
Random Latin square graphs
[7]http://developer.amd.com/pages/default.aspx
“University Kit 1.0”.
[8] C.Colbourn(1984) The Complexity of completing
partial Latin squares. Discrete Applied Mathematics8: 25-
30. Doi:10.1016/0166-218X(84)90075-1
[9] Denes, J. and A. Keedwell. 1991. Latin Squares: New
Developments in the Theory and applications. North-
Holland.
[10] AMD Accelerated Parallel Processing OpenCL
Programming Guide1.pdf
[11] Jacobson, M.T and P. Matthews (1996) Generating
uniformly distributed random Latin squares. Journals of
Combinatorial Designs 4(6), 405-406
[12] Brendan D. McKay and Ian M. Wanless(2000) On the
number of Latin squares Australian National University,
Canberra, ACT 0200, Australia
[13] J.A. Bate, G.H.J. van Rees, The Size of the Smallest
Strong Critical Set in a Latin Square University of
Manitoba, Winnipeg, Manitoba.

More Related Content

What's hot

Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationGeoffrey Fox
 
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...IJCNCJournal
 
program partitioning and scheduling IN Advanced Computer Architecture
program partitioning and scheduling  IN Advanced Computer Architectureprogram partitioning and scheduling  IN Advanced Computer Architecture
program partitioning and scheduling IN Advanced Computer ArchitecturePankaj Kumar Jain
 
Efficient Dynamic Scheduling Algorithm for Real-Time MultiCore Systems
Efficient Dynamic Scheduling Algorithm for Real-Time MultiCore Systems Efficient Dynamic Scheduling Algorithm for Real-Time MultiCore Systems
Efficient Dynamic Scheduling Algorithm for Real-Time MultiCore Systems iosrjce
 
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHYSPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHYcsandit
 
Hybrid Model Based Testing Tool Architecture for Exascale Computing System
Hybrid Model Based Testing Tool Architecture for Exascale Computing SystemHybrid Model Based Testing Tool Architecture for Exascale Computing System
Hybrid Model Based Testing Tool Architecture for Exascale Computing SystemCSCJournals
 
Scalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduceScalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReducePietro Michiardi
 
Pretzel: optimized Machine Learning framework for low-latency and high throug...
Pretzel: optimized Machine Learning framework for low-latency and high throug...Pretzel: optimized Machine Learning framework for low-latency and high throug...
Pretzel: optimized Machine Learning framework for low-latency and high throug...NECST Lab @ Politecnico di Milano
 
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
A NOBEL HYBRID APPROACH FOR EDGE  DETECTIONA NOBEL HYBRID APPROACH FOR EDGE  DETECTION
A NOBEL HYBRID APPROACH FOR EDGE DETECTIONijcses
 
Parallel Computing 2007: Overview
Parallel Computing 2007: OverviewParallel Computing 2007: Overview
Parallel Computing 2007: OverviewGeoffrey Fox
 
A survey of Parallel models for Sequence Alignment using Smith Waterman Algor...
A survey of Parallel models for Sequence Alignment using Smith Waterman Algor...A survey of Parallel models for Sequence Alignment using Smith Waterman Algor...
A survey of Parallel models for Sequence Alignment using Smith Waterman Algor...iosrjce
 
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node CombinersHadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node Combinersijcsit
 
Vol 3 No 1 - July 2013
Vol 3 No 1 - July 2013Vol 3 No 1 - July 2013
Vol 3 No 1 - July 2013ijcsbi
 

What's hot (18)

Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel application
 
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
 
Parallel Algorithms
Parallel AlgorithmsParallel Algorithms
Parallel Algorithms
 
program partitioning and scheduling IN Advanced Computer Architecture
program partitioning and scheduling  IN Advanced Computer Architectureprogram partitioning and scheduling  IN Advanced Computer Architecture
program partitioning and scheduling IN Advanced Computer Architecture
 
Efficient Dynamic Scheduling Algorithm for Real-Time MultiCore Systems
Efficient Dynamic Scheduling Algorithm for Real-Time MultiCore Systems Efficient Dynamic Scheduling Algorithm for Real-Time MultiCore Systems
Efficient Dynamic Scheduling Algorithm for Real-Time MultiCore Systems
 
Parallel computation
Parallel computationParallel computation
Parallel computation
 
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHYSPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
 
Hybrid Model Based Testing Tool Architecture for Exascale Computing System
Hybrid Model Based Testing Tool Architecture for Exascale Computing SystemHybrid Model Based Testing Tool Architecture for Exascale Computing System
Hybrid Model Based Testing Tool Architecture for Exascale Computing System
 
Todtree
TodtreeTodtree
Todtree
 
Scalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduceScalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduce
 
Pretzel: optimized Machine Learning framework for low-latency and high throug...
Pretzel: optimized Machine Learning framework for low-latency and high throug...Pretzel: optimized Machine Learning framework for low-latency and high throug...
Pretzel: optimized Machine Learning framework for low-latency and high throug...
 
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
A NOBEL HYBRID APPROACH FOR EDGE  DETECTIONA NOBEL HYBRID APPROACH FOR EDGE  DETECTION
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
 
Parallel Computing 2007: Overview
Parallel Computing 2007: OverviewParallel Computing 2007: Overview
Parallel Computing 2007: Overview
 
A survey of Parallel models for Sequence Alignment using Smith Waterman Algor...
A survey of Parallel models for Sequence Alignment using Smith Waterman Algor...A survey of Parallel models for Sequence Alignment using Smith Waterman Algor...
A survey of Parallel models for Sequence Alignment using Smith Waterman Algor...
 
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node CombinersHadoop Mapreduce Performance Enhancement Using In-Node Combiners
Hadoop Mapreduce Performance Enhancement Using In-Node Combiners
 
MapReduce in Cloud Computing
MapReduce in Cloud ComputingMapReduce in Cloud Computing
MapReduce in Cloud Computing
 
Bt0070
Bt0070Bt0070
Bt0070
 
Vol 3 No 1 - July 2013
Vol 3 No 1 - July 2013Vol 3 No 1 - July 2013
Vol 3 No 1 - July 2013
 

Similar to IRJET- Latin Square Computation of Order-3 using Open CL

Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP
Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP
Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP IJCSEIT Journal
 
Accelerating Real Time Applications on Heterogeneous Platforms
Accelerating Real Time Applications on Heterogeneous PlatformsAccelerating Real Time Applications on Heterogeneous Platforms
Accelerating Real Time Applications on Heterogeneous PlatformsIJMER
 
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY cscpconf
 
An Algorithm for Optimized Cost in a Distributed Computing System
An Algorithm for Optimized Cost in a Distributed Computing SystemAn Algorithm for Optimized Cost in a Distributed Computing System
An Algorithm for Optimized Cost in a Distributed Computing SystemIRJET Journal
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacmlmphuong06
 
Performance evaluation of larger matrices over cluster of four nodes using mpi
Performance evaluation of larger matrices over cluster of four nodes using mpiPerformance evaluation of larger matrices over cluster of four nodes using mpi
Performance evaluation of larger matrices over cluster of four nodes using mpieSAT Journals
 
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...IJCI JOURNAL
 
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...Sara Alvarez
 
Cloud Module 3 .pptx
Cloud Module 3 .pptxCloud Module 3 .pptx
Cloud Module 3 .pptxssuser41d319
 
Comprehensive Performance Evaluation on Multiplication of Matrices using MPI
Comprehensive Performance Evaluation on Multiplication of Matrices using MPIComprehensive Performance Evaluation on Multiplication of Matrices using MPI
Comprehensive Performance Evaluation on Multiplication of Matrices using MPIijtsrd
 
Complier design
Complier design Complier design
Complier design shreeuva
 
Performance analysis of sobel edge filter on heterogeneous system using opencl
Performance analysis of sobel edge filter on heterogeneous system using openclPerformance analysis of sobel edge filter on heterogeneous system using opencl
Performance analysis of sobel edge filter on heterogeneous system using opencleSAT Publishing House
 
Genetic Algorithm for task scheduling in Cloud Computing Environment
Genetic Algorithm for task scheduling in Cloud Computing EnvironmentGenetic Algorithm for task scheduling in Cloud Computing Environment
Genetic Algorithm for task scheduling in Cloud Computing EnvironmentSwapnil Shahade
 
Concurrent Matrix Multiplication on Multi-core Processors
Concurrent Matrix Multiplication on Multi-core ProcessorsConcurrent Matrix Multiplication on Multi-core Processors
Concurrent Matrix Multiplication on Multi-core ProcessorsCSCJournals
 

Similar to IRJET- Latin Square Computation of Order-3 using Open CL (20)

Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP
Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP
Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP
 
Accelerating Real Time Applications on Heterogeneous Platforms
Accelerating Real Time Applications on Heterogeneous PlatformsAccelerating Real Time Applications on Heterogeneous Platforms
Accelerating Real Time Applications on Heterogeneous Platforms
 
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY
 
An Algorithm for Optimized Cost in a Distributed Computing System
An Algorithm for Optimized Cost in a Distributed Computing SystemAn Algorithm for Optimized Cost in a Distributed Computing System
An Algorithm for Optimized Cost in a Distributed Computing System
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacm
 
Performance evaluation of larger matrices over cluster of four nodes using mpi
Performance evaluation of larger matrices over cluster of four nodes using mpiPerformance evaluation of larger matrices over cluster of four nodes using mpi
Performance evaluation of larger matrices over cluster of four nodes using mpi
 
1844 1849
1844 18491844 1849
1844 1849
 
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
 
unit 2 hpc.pptx
unit 2 hpc.pptxunit 2 hpc.pptx
unit 2 hpc.pptx
 
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
 
Cloud Module 3 .pptx
Cloud Module 3 .pptxCloud Module 3 .pptx
Cloud Module 3 .pptx
 
Comprehensive Performance Evaluation on Multiplication of Matrices using MPI
Comprehensive Performance Evaluation on Multiplication of Matrices using MPIComprehensive Performance Evaluation on Multiplication of Matrices using MPI
Comprehensive Performance Evaluation on Multiplication of Matrices using MPI
 
Complier design
Complier design Complier design
Complier design
 
Performance analysis of sobel edge filter on heterogeneous system using opencl
Performance analysis of sobel edge filter on heterogeneous system using openclPerformance analysis of sobel edge filter on heterogeneous system using opencl
Performance analysis of sobel edge filter on heterogeneous system using opencl
 
035
035035
035
 
K017617786
K017617786K017617786
K017617786
 
50120140503017 2
50120140503017 250120140503017 2
50120140503017 2
 
Genetic Algorithm for task scheduling in Cloud Computing Environment
Genetic Algorithm for task scheduling in Cloud Computing EnvironmentGenetic Algorithm for task scheduling in Cloud Computing Environment
Genetic Algorithm for task scheduling in Cloud Computing Environment
 
p850-ries
p850-riesp850-ries
p850-ries
 
Concurrent Matrix Multiplication on Multi-core Processors
Concurrent Matrix Multiplication on Multi-core ProcessorsConcurrent Matrix Multiplication on Multi-core Processors
Concurrent Matrix Multiplication on Multi-core Processors
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxpritamlangde
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
fitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .pptfitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .pptAfnanAhmad53
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdfKamal Acharya
 
Path loss model, OKUMURA Model, Hata Model
Path loss model, OKUMURA Model, Hata ModelPath loss model, OKUMURA Model, Hata Model
Path loss model, OKUMURA Model, Hata ModelDrAjayKumarYadav4
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxSCMS School of Architecture
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiessarkmank1
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXssuser89054b
 
Augmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxAugmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxMustafa Ahmed
 
Ground Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth ReinforcementGround Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth ReinforcementDr. Deepak Mudgal
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxMustafa Ahmed
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
Electromagnetic relays used for power system .pptx
Electromagnetic relays used for power system .pptxElectromagnetic relays used for power system .pptx
Electromagnetic relays used for power system .pptxNANDHAKUMARA10
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
Introduction to Artificial Intelligence ( AI)
Introduction to Artificial Intelligence ( AI)Introduction to Artificial Intelligence ( AI)
Introduction to Artificial Intelligence ( AI)ChandrakantDivate1
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxSCMS School of Architecture
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARKOUSTAV SARKAR
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfsumitt6_25730773
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Ramkumar k
 

Recently uploaded (20)

Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptx
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
fitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .pptfitting shop and tools used in fitting shop .ppt
fitting shop and tools used in fitting shop .ppt
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Path loss model, OKUMURA Model, Hata Model
Path loss model, OKUMURA Model, Hata ModelPath loss model, OKUMURA Model, Hata Model
Path loss model, OKUMURA Model, Hata Model
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
Augmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxAugmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptx
 
Ground Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth ReinforcementGround Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth Reinforcement
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptx
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Electromagnetic relays used for power system .pptx
Electromagnetic relays used for power system .pptxElectromagnetic relays used for power system .pptx
Electromagnetic relays used for power system .pptx
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Introduction to Artificial Intelligence ( AI)
Introduction to Artificial Intelligence ( AI)Introduction to Artificial Intelligence ( AI)
Introduction to Artificial Intelligence ( AI)
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdf
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 

IRJET- Latin Square Computation of Order-3 using Open CL

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1417 Latin Square Computation of Order-3 using Open CL Avnish Kansal1, Ashish Chaturwedi2 1,2Department of Computer Science & Engineering, Carlox Teacher’s University, Ahmedabad, Gujarat ---------------------------------------------------------------------------***--------------------------------------------------------------------------- Abstract: Latin sqaure is widely used in steganography, cryptography, digital watermarks, computer games, sudoku, graph analysis, error correcting codes; generate magic squares, statistics and mathematical field. The Sudoku puzzles are a special case of Latin squares. When we have to make the latin computation using the sequential algorithm then it waste more clock time. By using parallel programming (OpenCL) the time taken is reduced and throughput is increased. Traditionally Latin square methodology is based on heuristic cell based technique and generates random Latin square using genetic algorithmic approach both consumes high processing time and decreases the throughput. Here we are presenting the algorithm by using parallel processing environment using OpenCL for computing latin square of order3. Keywords: OpenCL, gnuplot, Sequential architecture, parallel architecture, GPU. I. Introduction: Latin square is an 𝑛×𝑛 array in which each cell is having at most one symbol, chosen form an n-set, such that every symbol occurs at most one time in each row and at- most one time in each column. The “Latin square” name was stimulated by mathematical papers by Leonhard Euler. Two latin squares are said to be orthogonal if both Latin squares of the same size such that when one latin square is superposed on the other latin, each letter of the one coincides once with each letter of the other. The two Latin squares are held to be conjugate if the rows of one are the columns of the other that is if the rows and columns of a square be interchanged then conjugate square is generated. An Adjugacy is a generality of the concept to conjugacy in which a permutation of the constraints of one generates another [12]. Each Latin square is defined as a triple (r,c,s), where r is the row, c is the column, and s is the symbol and from this triplet we attain a set of n2 triples called the orthogonal array representation of the square. Latin square of n x n order, in which every row is derivative from any other in a cyclic permutation of degree n, or by a power of such a permutation, is a cyclic Latin square [9]. While the implementation done using sequential algorithms techniques but with help of high level languages parallel processing technique we are able to decrease the processing time for matrix processing application [3]. As the problem is divided into discrete set of instances which are solved concurrently [1]. With help of this parallel computation technique we are able to execute two or more instructions at a same time simultaneously. While executing the sequential algorithms on a CPU it runs slower. In the proposed system the sequential algorithms which have task parallelism or data parallelism those algorithms are implemented to OpenCL [14]. By the help of OpenCL we minimizes overhead on the CPU and makes matrix processing run faster and efficiently to get higher throughput. Concurrency is the way to sharing of multiple resources in a software. For the applications that are naturally parallel the concurrency provides an abstraction [5]. When the execution of the multiple threads running in parallel, this means that the active thread running simultaneously on different hardware resources and processing elements [10]. The execution of the simultaneous threads is provided by the platforms, for achieving parallel computing. In latest computing machines we have SIMD or MIMD which have capabilities to exploit either data level or task level parallism[6]. Task level parallism, to handle different number of tasks, within a single problem at the same time. The efficiency of this model will depend on the independent operations of the task [6]. Data level parallelism, to handle the discrete chunks of the same task at the same time simultaneously. The efficiency of this model will depend on the independent operations of the task [6]. 1.1 Existing System & Proposed System Traditionally all the computations and execution of instructions are handled by the CPU in the computer. Architecturally, the CPU consists of very few cores with lots of cache memory which are able to handle a few software threads at a time concurrently. These few cores are used to optimize the query by sequential serial
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1418 processing architecture. So to alleviate the load of the CPU by handle all its advanced computations which are necessary to project the final display on the monitor a concept of GPU’s being introduced. The capability of the GPUs with 100+ cores to process thousands of threads can accelerate software by 100x over a CPU alone. The GPUs have massively parallel architecture which consists of thousands of smaller and more efficient cores that are designed for handling multiple tasks simultaneously. This is the reason for the wide and mainstream acceptance of the GPU’s now a day. The GPU-accelerated computing has now grown into a mainstream movement that is supported by the latest operating systems from Apple (with OpenCL) and Microsoft (using DirectCompute). This accelerated computing consist of graphics processing unit (GPU) together with a CPU to accelerate scientific, analytics, engineering, consumer, and enterprise applications. 1.2 Defining Algorithm The parallel processing algorithms are intended for Latin square. The representation of algorithms is presented in Figure 1.The first step is input matrix of order 3, we have input as a matrix in OpenCL programming code. The matrix have complex data in the form of array values when we applying sequential algorithm on an matrix it takes more of time for execution on CPU. But if we are applying the parallel processing concepts then we positively we reduce time taken by the matrix execution. Second step, is decomposing the input matrix according to task parallelism or data parallelism technique. Then this divided matrix is being used in third step for concurrent execution Third step, is defining the individual sub matrices which are further being processed by various processing elements. Fourth step, after the division of matrix into sub matrices each individual sub matrix is sent to number of processing elements simultaneously in GPU. Fifth step, the sub matrices result is being calculated with the help of processing elements concurrently. Sixth step is to combine the results of all matrices in single processing element so as to obtain the final output of input matrix in reduced time using parallel architecture. Figure 1 Flowchart of Proposed Work Seventh step, is to store the final result of input matrix from various sub matrices which are further being saved in the dynamic random access memory (DRAM) of graphical processing unit. Eighth step, after the final result is saved in the DRAM of GPU; that result is copy to the DRAM of CPU for displaying the final output to the user. 2. Implementing Latin Square using OpenCL: The algorithm which we are chosen to parallel the matrix over a number of workgroups, the matrix is further divided into number of chunks as per the coarse grained division. The workgroup is the part of a matrix by which we make the sub-matrix stored in the on chip local
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1419 memory of the GPU. These partial latin square are further reduced into the single main latin square results. 2.1 Overview is as follows, when we begin to design a OpenCL kernel to the respective hardware we have to take care work-items and the size of the work_group. The local memory is resides in the work_group and we share the data inside the work_group. The local memory is not shared in the work_group only. If we want to share the results of the each work_group to another then we are not able to do this. The work_group in OpenCL consists of work_items which further share between the local memories within that work_group. We are performing all this to reduce the memory overheads by storing sub- matrix in local memory. After that we have to define more work_groups for exchanging more local memory data into the global memory but we must assure that the number of work_groups for efficient use of local memory and reduce overhead on the global memory the number should be close to the number of compute units we have in our hardware. To traverse the total input matrix we use here, global Ids, local Ids, group Ids, group size, etc. Our OpenCL program consists of two parts: First is kernel part, the instances of kernel are copied to the different compute units and the kernel is executed on the each compute unit individually. Another is host code which we have to add to work for the different models of the OpenCL which is executed on the host or CPU. The host program defines the context for the kernels and manages their execution. Each OpenCL device has a command queue, where the host programs are queued for kernel execution and memory transfer. The core of the OpenCL execution model is defined by execution of kernels. When kernel is executed an index space is defined which is an instance of the kernel as in our problem. Each work_item executes the same code but the execution pathway and the data used will be different. Work items are organized into workgroups. Each work_group is assigned a unique workgroup_Id and each work_item in the workgroup assigned the local_Id. A single work_group can be identified by its global_Id or by its workgroup_Id or local_Id. The work_item in a single work_group executes concurrently on a processing elements of a single compute units. Work_item in a work_group can synchronize each other and share data through local memory in the compute unit. All the work_items has read and write access to any position in the global memory. The global and constant memory can also be accessed from the host processor before and after kernel execution. 3 Experiments and results: For each latin square, both the GPU kernel code and CPU serial code are designed. The processing that takes place on the CPU; the kernel code in the algorithm as an instance copied to the GPU. The kernel is executed on compute unit and after that the results are copied to the CPU. Figure 2: Latin Square Computation (Order-3) order v/s time graph using procedural programming code technique having more time complexity. Figure 3: Latin Square Computation (order-3) time v/s order graph using OpenCL code having reduced time complexity. The speed up of the latin square computation by the GPU is significantly improves the computing speed by reducing the time complexity and throughput. Here the graph depicts while using the procedural programming
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1420 technique the time complexity increases as the order for latin square increased. But when the same code is implemented using OpenCL the time complexity decreases tremendously. 4. Conclusions In this paper a framework is tried to be developed an efficient algorithmic approach to find a solution to different latin square to enhance the existing approach. We implemented the Latin square problem definition of order- 3 in GPGPU (GP2U), which providing heterogeneous environment to execute and capability to reduce time complexity which improves the performance of Latin square computation over intensive domains latin square We executed the latin square by using OpenCL environment and compared with the sequential implementations on CPU. On CPU algorithm takes very huge amount of time. Obviously, the time taken becomes low on GPU device. It provides novel and efficient acceleration technique for matrix calculation and is cheap in hardware implementation. Future work is to gain deeper knowledge about the parallization techniques to make best use of GPU device and to work with other algorithms related to these project concepts. 5. References [1] “Heterogeneous computing with OpenCL” by Benedict gaster,british libraries,printed USA [2] http://www.khronos.org/OpenCL ,”Khronos Group”. [3] Roberto Fontana, Random Latin squares and Sudoku designs generation,2013 [4] Nan Zhang, Yun-shan Chen, Jian-li Wang. “Image Parallel Processing Based on GPU”. International Conference on Advanced Computer Control, March 2010. [5] Pardalos P.M., Xue, J, “The maximum clique problem”, Journal of Global Optimization, 4, 1994, 301—328. [6] Demetres Christofides, Klas Markstrom (2003) Random Latin square graphs [7]http://developer.amd.com/pages/default.aspx “University Kit 1.0”. [8] C.Colbourn(1984) The Complexity of completing partial Latin squares. Discrete Applied Mathematics8: 25- 30. Doi:10.1016/0166-218X(84)90075-1 [9] Denes, J. and A. Keedwell. 1991. Latin Squares: New Developments in the Theory and applications. North- Holland. [10] AMD Accelerated Parallel Processing OpenCL Programming Guide1.pdf [11] Jacobson, M.T and P. Matthews (1996) Generating uniformly distributed random Latin squares. Journals of Combinatorial Designs 4(6), 405-406 [12] Brendan D. McKay and Ian M. Wanless(2000) On the number of Latin squares Australian National University, Canberra, ACT 0200, Australia [13] J.A. Bate, G.H.J. van Rees, The Size of the Smallest Strong Critical Set in a Latin Square University of Manitoba, Winnipeg, Manitoba.