European Journal of Scientific Research
ISSN 1450-216X / 1450-202X Vol.121 No.3, 2014, pp.258-266
http://www.europeanjournalofscientificresearch.com
Analysis of Matrix Multiplication Computational Methods
Khaled Matrouk
Corresponding Author, Department of Computer Engineering, Faculty of Engineering
Al-Hussein Bin Talal University, P. O. Box (20), Ma'an, Zip Code 71111, Jordan
E-mail: khaled.matrouk@ahu.edu.jo
Tel: +962-3-2179000 (ext. 8503), Fax: +962-3-2179050
Abdullah Al-Hasanat
Department of Computer Engineering, Faculty of Engineering
Al-Hussein Bin Talal University, P. O. Box (20), Ma'an, Zip Code 71111, Jordan
Haitham Alasha'ary
Department of Computer Engineering, Faculty of Engineering
Al-Hussein Bin Talal University, P. O. Box (20), Ma'an, Zip Code 71111, Jordan
Ziad Al-Qadi
Prof, Department of Computer Engineering, Faculty of Engineering
Al-Hussein Bin Talal University, P. O. Box (20), Ma'an, Zip Code 71111, Jordan
Hasan Al-Shalabi
Prof, Department of Computer Engineering, Faculty of Engineering
Al-Hussein Bin Talal University, P. O. Box (20), Ma'an, Zip Code 71111, Jordan
Abstract
Matrix multiplication is a basic concept that is used in engineering applications such
as digital image processing, digital signal processing and graph problem solving.
Multiplication of huge matrices requires a lot of computation time, as its complexity is
O(n³). Because most engineering applications require higher computational throughput
with minimum time, many sequential and parallel algorithms have been developed. In this paper,
methods of matrix multiplication are chosen, implemented, and analyzed. A performance
analysis is evaluated, and some recommendations are given for using the OpenMP and MPI
methods of parallel computing.
Keywords: OpenMP, MPI, Processing Time, Speedup, Efficiency
1. Introduction
With the advent of parallel hardware and software technologies, users are faced with the challenge of
choosing a programming paradigm best suited for the underlying computer architecture (Alqadi and
Abu-Jazzar, 2005a; Alqadi and Abu-Jazzar, 2005b; Alqadi et al, 2008). With the current trend in
parallel computer architectures towards clusters of shared-memory symmetric multi-processors (SMP),
parallel programming techniques have evolved to support parallelism beyond a single level (Choi et al,
1994).
Parallel programming within one SMP node can take advantage of the globally shared address
space. Compilers for shared memory architectures usually support multi-threaded execution of a
program. Loop level parallelism can be exploited by using compiler directives such as those defined in
the OpenMP standard (Dongarra et al, 1994; Alpatov et al, 1997).
OpenMP provides a fork-and-join execution model in which a program begins execution as a single
thread. This thread executes sequentially until a parallelization directive for a parallel region is found
(Alpatov et al, 1997; Anderson et al, 1987). At this time, the thread creates a team of threads and becomes
the master thread of the new team (Chtchelkanova et al, 1995; Barnett et al, 1994; Choi et al, 1992).
All threads execute the statements until the end of the parallel region. Work-sharing directives
are provided to divide the execution of the enclosed code region among the threads, and all threads must
synchronize at the end of parallel constructs. The advantage of OpenMP (web ref.) is that existing
code can be easily parallelized by placing OpenMP directives around time-consuming loops that do
not contain data dependencies, leaving the rest of the source code unchanged. The disadvantage is that it is not
easy for the user to optimize workflow and memory access.
On an SMP cluster the message passing programming paradigm can be employed within and
across several nodes. MPI (web ref.) is a widely accepted standard for writing message passing
programs (web ref.; Rabenseifner, 2003).
MPI provides the user with a programming model where processes communicate with other
processes by calling library routines to send and receive messages. The advantage of the MPI programming
model is that the user has complete control over data distribution and process synchronization, permitting
the optimization of data locality and workflow distribution. The disadvantage is that existing sequential
applications require a fair amount of restructuring for a parallelization based on MPI.
1.1. Serial Matrix Multiplication
Matrix multiplication involves two matrices A and B such that the number of columns of A and the
number of rows of B are equal. When carried out sequentially, it takes O(n³) time. The algorithm for
ordinary matrix multiplication is:
for i = 1 to n
    for j = 1 to n
        c(i,j) = 0
        for k = 1 to n
            c(i,j) = c(i,j) + a(i,k)*b(k,j)
        end
    end
end
1.2. Parallel Matrix Multiplication Using OpenMP
The master thread forks the outer loop among the slave threads, so each thread performs
matrix multiplication using a subset of the rows of the first matrix. When the threads'
multiplications are done, the master thread joins the partial results into the total product.
1.3. Parallel Matrix Multiplication Using MPI
The procedure for implementing the sequential algorithm in parallel using MPI can be divided into the
following steps:
• The master processor splits the first matrix row-wise among the different processors.
• The second matrix is broadcast to all processors.
• Each processor multiplies its part of the first matrix by the second matrix.
• Each processor sends its partial product back to the master processor.
Implementation:
• Master (processor 0) reads the data
• Master sends the size of the data to the slaves
• Slaves allocate memory
• Master broadcasts the second matrix to all other processors
• Master sends the respective parts of the first matrix to all other processors
• Every processor performs its local multiplication
• All slave processors send back their results
2. Methods and Tools
One station with a Pentium i5 processor at 2.5 GHz and 4 GB of memory is used to implement serial
matrix multiplication. Visual Studio 2008 with the OpenMP library is used as the environment for building,
executing and testing the matrix multiplication program, tested on the same Pentium i5 processor
at 2.5 GHz with 4 GB of memory. A distributed processing system with a varying number of processors
is used; each processor is a 4-core processor at 2.5 GHz with 4 GB of memory, and the processors are
connected through Visual Studio 2008 with the MPI environment.
3. Experimental Part
Different sets of two matrices are chosen (differing in size and data type), and each pair of matrices is
multiplied serially and in parallel using both the OpenMP and MPI environments; the average
multiplication time is recorded.
3.1. Experiment 1
The sequential matrix multiplication program is tested using matrices of different sizes and
data types (integer, float, double, and complex); 100 matrix pairs of different data types and sizes
are multiplied. Table 1 shows the average results obtained in this experiment.
Table 1: Experiment 1 Results
Matrices size Multiplication time in seconds
10*10 0.00641199
20*20 0.00735038
40*40 0.0063971
100*100 0.0142716
200*200 0.0386879
1000*1000 6.75335
1200*1200 11.889
5000*5000 2007
10000*10000 13000
3.2. Experiment 2
Matrix multiplication program is tested using small size matrices. Different size matrices with different
data types (integer, float, double, and complex) are chosen, 200 types of matrices with different data
types and different sizes are multiplied using openMP environment. Table 2 shows the average results
obtained in this experiment.
Table 2: Experiment 2 Results (multiplication time in seconds)

# of threads   10*10        20*20        40*40        100*100      200*200
1              0.00641199   0.00735038   0.0063971    0.0142716    0.0386879
2              0.03675360   0.06866570   0.0370589    0.0373609    0.0615986
3              0.06271470   0.06311360   0.0701701    0.0978940    0.0787245
4              0.07273020   0.06979990   0.0710032    0.0706766    0.0796430
5              0.06772930   0.07232620   0.0673493    0.0699920    0.0515310
6              0.06918620   0.07037430   0.0707350    0.0724863    0.0837632
7              0.07124480   0.07204210   0.0727263    0.0727355    0.0820219
8              0.74631600   0.07348000   0.0677064    0.0762404    0.0820226
3.3. Experiment 3
The matrix multiplication program is tested using large matrices. Matrices of different sizes and
data types (integer, float, double, and complex) are chosen, and 200 matrix pairs of different data
types and sizes are multiplied in the OpenMP environment with 8 threads. Table 3 shows the
average results obtained in this experiment.
Table 3: Experiment 3 Results
Matrices size Multiplication time (in seconds)
1000,1000 1.8377
1200,1200 3.19091
2000,2000 18.0225
5000,5000 508.1
10000,10000 333.3
3.4. Experiment 4
The matrix multiplication program is tested using large matrices. Matrices of different sizes and
data types (integer, float, double, and complex) are chosen, and 200 matrix pairs of different data
types and sizes are multiplied in the MPI environment with different numbers of processors.
Table 4 shows the average results obtained in this experiment.
Table 4: Experiment 4 Results (multiplication time in seconds)

Number of processors   1000x1000   5000x5000   10000x10000
1                      6.96        2007        13000
2                      5.9         1055        7090
4                      3.3         525         3290
5                      2.8         431         2965
8                      2.1         260         1920
10                     1.5         235         1600
20                     0.8         119         900
25                     0.6         91          830
50                     0.55        52          292
4. Results Discussion
From the results obtained in the previous section, we can categorize the matrices into three groups:
• Small matrices, with size less than 1000*1000
• Mid-size matrices, with 1000*1000 ≤ size ≤ 5000*5000
• Huge matrices, with size ≥ 5000*5000
The following recommendations can be made:
• For small matrices, it is preferable to use sequential matrix multiplication.
• For mid-size matrices, it is preferable to use parallel matrix multiplication with OpenMP.
• For huge matrices, it is preferable to use parallel matrix multiplication with MPI.
• It is also recommended to use hybrid parallel systems (MPI with OpenMP) to multiply huge matrices.
From the results obtained in Table 2, we can see that the speedup of using OpenMP is limited by
the number of physical cores actually available in the computer system, as shown in Table 5 and Fig. 1.
Speedup = (execution time with 1 thread) / (parallel execution time)
Table 5: Speedup results of Using OpenMP
Matrix size #of threads = 1 #of threads = 8 Speedup
300,300 0.110188 0.097704 1.1278
400,400 0.314468 0.170006 1.8497
500,500 0.601031 0.277821 2.1634
600,600 1.14773 0.64882 1.7689
700,700 2.17295 0.704228 3.0856
800,800 3.16512 0.963983 3.2834
900,900 4.93736 1.37456 3.5920
1000,1000 6.69186 1.8377 3.6414
1024,1024 7.18151 1.97027 3.6449
1200,1200 12.0819 3.19091 3.7863
2000,2000 72.8571 18.0996 4.0253
2048,2048 74.7383 18.8406 3.9669
Figure 1: Maximum performance of using OpenMP
From the results obtained in Tables 1 and 2 we can see that the matrix multiplication time
increases rapidly when the matrix size increases as shown in Figs 2 and 3.
Figure 2: Comparison between the 8-thread and 1-thread results (multiplication time in seconds versus n, for n×n matrices)
Figure 3: Relationship between the speedup and the matrix size (speedup versus n, for n×n matrices; the curve saturates near the maximum number of cores)
From the results obtained in Table 4 we can calculate the speedup of using MPI and the system
efficiency:
Efficiency = speedup/number of processors
The calculation results are shown in Table 6:
Table 6: Speedup and efficiency of using MPI

                       1000x1000 matrices      5000x5000 matrices      10000x10000 matrices
Number of processors   Speedup   Efficiency    Speedup   Efficiency    Speedup   Efficiency
1                      1         1             1         1             1         1
2                      1.17      0.585         1.9       0.95          1.83      0.92
4                      2.9       0.725         3.8       0.95          3.9       0.98
5                      2.46      0.492         4.66      0.93          4.4       0.88
8                      3.29      0.411         7.72      0.96          6.77      0.85
10                     4.6       0.46          8.54      0.85          8.13      0.81
20                     8.63      0.43          16.87     0.84          14.44     0.72
25                     11.5      0.45          22.05     0.88          15.66     0.63
50                     12.55     0.251         38.6      0.77          44.52     0.89
From Table 6 we can see that increasing the number of processors in an MPI environment
enhances the speedup of matrix multiplication but also degrades system efficiency, as
shown in Figs 4, 5 and 6.
Figure 4: Multiplication time for 10000x10000 matrices (running time in seconds versus number of processors)
Figure 5: Speedup of multiplication for 10000x10000 matrices (speedup versus number of processors)
Figure 6: System efficiency of matrix multiplication (efficiency versus number of processors, for 1000*1000, 5000*5000 and 10000*10000 matrices)
5. Conclusions
Based on the results obtained and shown above, the following conclusions can be drawn:
• Sequential matrix multiplication is preferable for matrices with small sizes.
• OpenMP is a good environment for parallel multiplication of mid-size matrices; here the speedup is limited by the number of available physical cores.
• MPI is a good environment for parallel multiplication of huge matrices; here the speedup can be increased by adding processors, but this negatively affects system efficiency.
• To avoid the limitations noted in the two previous conclusions, a hybrid parallel system (MPI with OpenMP) is recommended.
References
[1] Alqadi Z., and Abu-Jazzar A., 2005. "Methods Used for Optimizing Matrix Multiplication", Proceedings of the CNRS-NSF Workshop on Environments and Tools for Parallel Scientific Computing, Saint Hilaire du Touvet, France, Sept. 7-8, Elsevier Sci. Publishers, pp. 73-78.
[2] Alqadi Z., and Abu-Jazzar A., 2005. "Analysis of Program Methods Used for Optimizing
Matrix Multiplication", Journal of Engineering 15(1), pp. 73-78.
[3] Alqadi Z., Aqel M., and El Emary I. M. M., 2008. "Performance Analysis and Evaluation of Parallel Matrix Multiplication Algorithms", World Applied Sciences Journal 5(2).
[4] Dongarra, J. J., R.A. Van de Geijn, and D.W. Walker, 1994. "Scalability Issues Affecting the Design of a Linear Algebra Library, Parallel Linear Algebra Package Design", Distributed Computing 22(3), Proceedings of SC 97, pp. 523-537.
[5] Alpatov, P., G. Baker, C. Edwards, J. Gunnels, and P. Geng, 1997. "Parallel Matrix
Distributions: Parallel Linear Algebra Package", Tech. Report TR-95-40, Proceedings of the
SIAM Parallel Processing Computer Sciences Conference, The University of Texas, Austin.
[6] Choi, J., J. J. Dongarra and D.W. Walker, 1994. "A High-Performance Matrix Multiplication
Algorithm Pumma: Parallel Universal Matrix Multiplication on a Distributed Memory Parallel
Computer Using Algorithms on Distributed Memory Concurrent Overlapped Communication",
IBM J. Res. Develop., Computers, Concurrency: Practice and Experience 6(7), pp. 543-570.
[7] Chtchelkanova, A., J. Gunnels, and G. Morrow, 1986. "IEEE Implementation of BLAS:
General Techniques for Level 3 BLAS", Proceedings of the 1986 International Conference on
Parallel Processing, pp. 640-648, TR-95-40, Department of Computer Sciences, University of
Texas.
[8] Barnett, M., S. Gupta, D. Payne, and L. Shuler, 1994. "Using MPI: Communication Library
(InterCom), Scalable High Portable Programming with the Message-Passing Performance,
Computing Conference, pp. 17-31.
[9] Anderson E., Z. Bai, C. Bischof, and J. Demmel, 1987. "Solving Problems on Concurrent
Processors", Proceedings of Matrix Algorithms Supercomputing '90, IEEE 1, pp. 1-10.
[10] Choi J., J. J. Dongarra, R. Pozo and D.W. Walker, 1992. "Scalapack: A Scalable Linear
Algebra Library for Distributed Memory Concurrent Computers", Proceedings of the Fourth
Symposium on the Frontiers of Massively Parallel Computation. IEEE Comput. Soc. Press, pp.
120-127.
[11] MPI 1.1 Standard, http://www-unix.mcs.anl.gov/mpi/mpich.
[12] OpenMP Fortran Application Program Interface, http://www.openmp.org/.
[13] Rabenseifner, R., 2003. “Hybrid Parallel Programming: Performance Problems and Chances”,
Proceedings of the 45th Cray User Group Conference, Ohio, May 12-16, 2003.
Analytical Essay - What Is An Analytical Essay Before YAnalytical Essay - What Is An Analytical Essay Before Y
Analytical Essay - What Is An Analytical Essay Before Y
 
Comparative Essay English (Advanced) - Year 11 HSC
Comparative Essay English (Advanced) - Year 11 HSCComparative Essay English (Advanced) - Year 11 HSC
Comparative Essay English (Advanced) - Year 11 HSC
 
Pay Someone To Write A Letter For Me, Writing A Letter Requesting M
Pay Someone To Write A Letter For Me, Writing A Letter Requesting MPay Someone To Write A Letter For Me, Writing A Letter Requesting M
Pay Someone To Write A Letter For Me, Writing A Letter Requesting M
 
Essay Plan Essay Plan, Essay Writing, Essay Writin
Essay Plan Essay Plan, Essay Writing, Essay WritinEssay Plan Essay Plan, Essay Writing, Essay Writin
Essay Plan Essay Plan, Essay Writing, Essay Writin
 

Recently uploaded

Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........LeaCamillePacle
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxsqpmdrvczh
 

Recently uploaded (20)

Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptx
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptx
 

Analysis Of Matrix Multiplication Computational Methods

1. Introduction
With the advent of parallel hardware and software technologies, users are faced with the challenge of choosing the programming paradigm best suited to the underlying computer architecture (Alqadi and Abu-Jazzar, 2005a; Alqadi and Abu-Jazzar, 2005b; Alqadi et al., 2008). With the current trend in parallel computer architectures towards clusters of shared-memory symmetric multiprocessors (SMP), parallel programming techniques have evolved to support parallelism beyond a single level (Choi et al., 1994).
Parallel programming within one SMP node can take advantage of the globally shared address space. Compilers for shared-memory architectures usually support multi-threaded execution of a program. Loop-level parallelism can be exploited by using compiler directives such as those defined in the OpenMP standard (Dongarra et al., 1994; Alpatov et al., 1997). OpenMP provides a fork-and-join execution model in which a program begins execution as a single thread. This thread executes sequentially until a parallelization directive for a parallel region is found (Alpatov et al., 1997; Anderson et al., 1987). At this point, the thread creates a team of threads and becomes the master thread of the new team (Chtchelkanova et al., 1995; Barnett et al., 1994; Choi et al., 1992). All threads execute the statements up to the end of the parallel region, and work-sharing directives are provided to divide the execution of the enclosed code region among the threads. All threads must synchronize at the end of parallel constructs. The advantage of OpenMP (web ref.) is that existing code can be parallelized easily by placing OpenMP directives around time-consuming loops that contain no data dependences, leaving the rest of the source code unchanged. The disadvantage is that it is not easy for the user to optimize workflow and memory access.

On an SMP cluster, the message-passing programming paradigm can be employed within and across several nodes. MPI (web ref.) is a widely accepted standard for writing message-passing programs (web ref.; Rabenseifner, 2003). MPI provides the user with a programming model in which processes communicate with other processes by calling library routines to send and receive messages. The advantage of the MPI programming model is that the user has complete control over data distribution and process synchronization, permitting the optimization of data locality and workflow distribution.
The disadvantage is that existing sequential applications require a fair amount of restructuring for a parallelization based on MPI.

1.1. Serial Matrix Multiplication
Matrix multiplication involves two matrices A and B such that the number of columns of A equals the number of rows of B. When carried out sequentially, it takes time O(n^3). The algorithm for ordinary matrix multiplication is:

  for i = 1 to n
    for j = 1 to n
      c(i,j) = 0
      for k = 1 to n
        c(i,j) = c(i,j) + a(i,k) * b(k,j)
      end
    end
  end

1.2. Parallel Matrix Multiplication Using OpenMP
The master thread forks the outer loop among the slave threads, so each thread performs matrix multiplication using a subset of the rows of the first matrix. When the threads finish their multiplications, the master thread joins the partial results into the total matrix product.

1.3. Parallel Matrix Multiplication Using MPI
The procedure for implementing the sequential algorithm in parallel using MPI can be divided into the following steps:
  - The master processor splits the first matrix row-wise among the different processors.
  - The second matrix is broadcast to all processors.
  - Each processor multiplies its part of the first matrix by the second matrix.
  - Each processor sends its partial product back to the master processor.
Implementation:
  - The master (processor 0) reads the data.
  - The master sends the size of the data to the slaves.
  - The slaves allocate memory.
  - The master broadcasts the second matrix to all other processors.
  - The master sends the respective parts of the first matrix to all other processors.
  - Every processor performs its local multiplication.
  - All slave processors send their results back to the master.

2. Methods and Tools
One station with a Pentium i5 processor (2.5 GHz, 4 GB memory) is used to implement serial matrix multiplication. Visual Studio 2008 with the OpenMP library is used as the environment for building, executing and testing the matrix multiplication program; the program is tested on the same Pentium i5 machine (2.5 GHz, 4 GB memory). For MPI, a distributed processing system with a varying number of processors is used; each processor is a 4-core processor with 2.5 GHz and 4 GB memory, and the processors are connected through Visual Studio 2008 with the MPI environment.

3. Experimental Part
Different sets of two matrices are chosen (differing in size and data type), and each pair of matrices is multiplied serially and in parallel using both the OpenMP and MPI environments; the average multiplication time is taken.

3.1. Experiment 1
The sequential matrix multiplication program is tested using matrices of different sizes. Matrices with different data types (integer, float, double, and complex) are chosen; 100 pairs of matrices with different data types and sizes are multiplied. Table 1 shows the average results obtained in this experiment.

Table 1: Experiment 1 Results

  Matrix size     Multiplication time (seconds)
  10x10           0.00641199
  20x20           0.00735038
  40x40           0.0063971
  100x100         0.0142716
  200x200         0.0386879
  1000x1000       6.75335
  1200x1200       11.889
  5000x5000       2007
  10000x10000     13000

3.2. Experiment 2
The matrix multiplication program is tested using small matrices.
Matrices with different data types (integer, float, double, and complex) are chosen; 200 pairs of matrices with different data types and sizes are multiplied using the OpenMP environment. Table 2 shows the average results obtained in this experiment.
Table 2: Experiment 2 Results (time in seconds)

  # of threads   10x10        20x20        40x40      100x100    200x200
  1              0.00641199   0.00735038   0.0063971  0.0142716  0.0386879
  2              0.03675360   0.06866570   0.0370589  0.0373609  0.0615986
  3              0.06271470   0.06311360   0.0701701  0.0978940  0.0787245
  4              0.07273020   0.06979990   0.0710032  0.0706766  0.079643
  5              0.06772930   0.07232620   0.0673493  0.0699920  0.051531
  6              0.06918620   0.07037430   0.0707350  0.0724863  0.0837632
  7              0.07124480   0.07204210   0.0727263  0.0727355  0.0820219
  8              0.74631600   0.07348000   0.0677064  0.0762404  0.0820226

3.3. Experiment 3
The matrix multiplication program is tested using large matrices. Matrices with different data types (integer, float, double, and complex) are chosen; 200 pairs of matrices with different data types and sizes are multiplied using the OpenMP environment with 8 threads. Table 3 shows the average results obtained in this experiment.

Table 3: Experiment 3 Results

  Matrix size    Multiplication time (seconds)
  1000x1000      1.8377
  1200x1200      3.19091
  2000x2000      18.0225
  5000x5000      508.1
  10000x10000    333.3

3.4. Experiment 4
The matrix multiplication program is tested using large matrices. Matrices with different data types (integer, float, double, and complex) are chosen; 200 pairs of matrices with different data types and sizes are multiplied using the MPI environment with different numbers of processors. Table 4 shows the average results obtained in this experiment.

Table 4: Experiment 4 Results (multiplication time in seconds)

  Number of processors   1000x1000   5000x5000   10000x10000
  1                      6.96        2007        13000
  2                      5.9         1055        7090
  4                      3.3         525         3290
  5                      2.8         431         2965
  8                      2.1         260         1920
  10                     1.5         235         1600
  20                     0.8         119         900
  25                     0.6         91          830
  50                     0.55        52          292
4. Results Discussion
From the results obtained in the previous section, the matrices can be categorized into three groups:
  - Small matrices, with size less than 1000x1000
  - Mid-size matrices, with 1000x1000 ≤ size ≤ 5000x5000
  - Huge matrices, with size ≥ 5000x5000

The following recommendations can be made:
  - For small matrices, it is preferable to use sequential matrix multiplication.
  - For mid-size matrices, it is preferable to use parallel matrix multiplication with OpenMP.
  - For huge matrices, it is preferable to use parallel matrix multiplication with MPI.
  - It is also recommended to use hybrid parallel systems (MPI with OpenMP) to multiply matrices of huge sizes.

From the results obtained in Table 2, we can see that the speedup of using OpenMP is limited by the number of physical cores actually available in the computer system, as shown in Table 5 and Fig. 1.

  Speedup (times) = time of execution with 1 thread / parallel execution time

Table 5: Speedup Results of Using OpenMP

  Matrix size   # of threads = 1   # of threads = 8   Speedup
  300x300       0.110188           0.097704           1.1278
  400x400       0.314468           0.170006           1.8497
  500x500       0.601031           0.277821           2.1634
  600x600       1.14773            0.64882            1.7689
  700x700       2.17295            0.704228           3.0856
  800x800       3.16512            0.963983           3.2834
  900x900       4.93736            1.37456            3.5920
  1000x1000     6.69186            1.8377             3.6414
  1024x1024     7.18151            1.97027            3.6449
  1200x1200     12.0819            3.19091            3.7863
  2000x2000     72.8571            18.0996            4.0253
  2048x2048     74.7383            18.8406            3.9669

Figure 1: Maximum performance of using OpenMP

From the results obtained in Tables 1 and 2, we can see that the matrix multiplication time increases rapidly as the matrix size increases, as shown in Figs. 2 and 3.
Figure 2: Comparison between 1-thread and 8-thread results (multiplication time in seconds versus matrix size n, for n x n matrices)

Figure 3: Relationship between the speedup and the matrix size (speedup versus matrix size n x n; the speedup saturates at the maximum number of cores)

From the results obtained in Table 4, we can calculate the speedup of using MPI and the system efficiency:

  Efficiency = speedup / number of processors

The calculation results are shown in Table 6.
Table 6: Speedup and Efficiency of Using MPI

  Number of     1000x1000             5000x5000             10000x10000
  processors    Speedup  Efficiency   Speedup  Efficiency   Speedup  Efficiency
  1             1        1            1        1            1        1
  2             1.17     0.585        1.9      0.95         1.83     0.92
  4             2.9      0.725        3.8      0.95         3.9      0.98
  5             2.46     0.492        4.66     0.93         4.4      0.88
  8             3.29     0.411        7.72     0.96         6.77     0.85
  10            4.6      0.46         8.54     0.85         8.13     0.81
  20            8.63     0.43         16.87    0.84         14.44    0.72
  25            11.5     0.45         22.05    0.88         15.66    0.63
  50            12.55    0.251        38.6     0.77         44.52    0.89

From Table 6 we can see that increasing the number of processors in an MPI environment improves the speedup of matrix multiplication, but it also leads to poorer system efficiency, as shown in Figs. 4, 5 and 6.

Figure 4: Multiplication time for 10000x10000 matrices (running time in seconds versus number of processors)
Figure 5: Speedup of multiplication for 10000x10000 matrices (speedup versus number of processors)

Figure 6: System efficiency of matrix multiplication (efficiency versus number of processors, for 1000x1000, 5000x5000 and 10000x10000 matrices)

5. Conclusions
Based on the results obtained and shown above, the following conclusions can be drawn:
  - Sequential matrix multiplication is preferable for small matrices.
  - OpenMP is a good environment for parallel multiplication of mid-size matrices; here the speedup is limited by the number of available physical cores.
  - MPI is a good environment for parallel multiplication of huge matrices; here the speedup can be increased further, but this negatively affects the system efficiency.
  - To avoid the problems noted in the two previous conclusions, a hybrid parallel system (MPI with OpenMP) can be recommended.

References
[1] Alqadi Z., and Abu-Jazzar A., 2005. "Program Methods Used for Optimizing Matrix Multiplication", CNRS-NSF Workshop on Environments and Tools for Parallel Scientific Computing, Saint Hilaire du Touvet, France, Sept. 7-8, Elsevier Sci. Publishers.
[2] Alqadi Z., and Abu-Jazzar A., 2005. "Analysis of Program Methods Used for Optimizing Matrix Multiplication", Journal of Engineering 15(1), pp. 73-78.
[3] Alqadi Z., Aqel M., and El Emary I. M. M., 2008. "Performance Analysis and Evaluation of Parallel Matrix Multiplication Algorithms", World Applied Sciences Journal 5(2).
[4] Dongarra J. J., R. A. Van de Geijn, and D. W. Walker, 1994. "Scalability Issues Affecting the Design of a Dense Linear Algebra Library", Journal of Parallel and Distributed Computing 22(3), pp. 523-537.
[5] Alpatov P., G. Baker, C. Edwards, J. Gunnels, and P. Geng, 1997. "Parallel Matrix Distributions: Parallel Linear Algebra Package", Tech. Report TR-95-40, Proceedings of the SIAM Parallel Processing Conference, The University of Texas, Austin.
[6] Choi J., J. J. Dongarra and D. W. Walker, 1994. "PUMMA: Parallel Universal Matrix Multiplication Algorithms on Distributed Memory Concurrent Computers", Concurrency: Practice and Experience 6(7), pp. 543-570.
[7] Chtchelkanova A., J. Gunnels, and G. Morrow, 1995.
"IEEE Implementation of BLAS: General Techniques for Level 3 BLAS", Proceedings of the 1986 International Conference on Parallel Processing, pp. 640-648, TR-95-40, Department of Computer Sciences, University of Texas. [8] Barnett, M., S. Gupta, D. Payne, and L. Shuler, 1994. "Using MPI: Communication Library (InterCom), Scalable High Portable Programming with the Message-Passing Performance, Computing Conference, pp. 17-31. [9] Anderson E., Z. Bai, C. Bischof, and J. Demmel, 1987. "Solving Problems on Concurrent Processors", Proceedings of Matrix Algorithms Supercomputing '90, IEEE 1, pp. 1-10. [10] Choi J., J. J. Dongarra, R. Pozo and D.W. Walker, 1992. "Scalapack: A Scalable Linear Algebra Library for Distributed Memory Concurrent Computers", Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation. IEEE Comput. Soc. Press, pp. 120-127. [11] MPI 1.1 Standard, http://www-unix.mcs.anl.gov/mpi/mpich. [12] OpenMP Fortran Application Program Interface, http://www.openmp.org/. [13] Rabenseifner, R., 2003. “Hybrid Parallel Programming: Performance Problems and Chances”, Proceedings of the 45th Cray User Group Conference, Ohio, May 12-16, 2003. View publication stats View publication stats