PARALLELIZING MATRIX MULTIPLICATION IN COMPILER DESIGN
BY M. SRI NANDHINI, II-MSC(CS),
NADAR SARASWATHI COLLEGE OF ARTS & SCIENCE,
VADAPUTHUPATTI, THENI.
MATRIX MULTIPLICATION
• Consider two matrices A (n × m) and B (m × p). Since the matrices here are square, n = m = p.
• The resultant matrix AB can then be obtained element by element as
• (AB)_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}.
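As a quick check of the formula, here is a small worked example. The 2 × 2 matrices are made up for illustration and do not come from the slides.

```latex
\begin{aligned}
A &= \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad
B = \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} \\
(AB)_{11} &= A_{11}B_{11} + A_{12}B_{21} = 1 \cdot 5 + 2 \cdot 7 = 19 \\
(AB)_{12} &= A_{11}B_{12} + A_{12}B_{22} = 1 \cdot 6 + 2 \cdot 8 = 22 \\
(AB)_{21} &= A_{21}B_{11} + A_{22}B_{21} = 3 \cdot 5 + 4 \cdot 7 = 43 \\
(AB)_{22} &= A_{21}B_{12} + A_{22}B_{22} = 3 \cdot 6 + 4 \cdot 8 = 50 \\
AB &= \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}
\end{aligned}
```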
GENERATE RANDOM SQUARE MATRIX
• We start the implementation by creating random matrices to multiply. Memory is allocated dynamically on the heap with malloc, because testing requires matrices of different dimensions.
• The element type is defined as double and can be changed to suit the use case. The #pragma omp parallel for directive parallelizes the initialization loop, so the matrix can be filled more efficiently.
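The slide does not show the generator itself, so the following is only a minimal sketch of what it might look like. The names randomSquareMatrix, freeMatrix and mix are hypothetical. Because the C standard does not require rand() to be safe to call from several threads at once, the sketch swaps in a small integer hash of the element position instead; that choice is an assumption, not the author's method.

```c
#include <assert.h>
#include <stdlib.h>

#define TYPE double  /* element type; change to suit the use case */

/* Hypothetical helper: a small integer hash, so every element gets a
   reproducible value without sharing generator state between threads. */
static unsigned int mix(unsigned int x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    return x;
}

/* Allocates a dimension x dimension matrix on the heap and fills it with
   pseudo-random values in [0, 1). Returns NULL if an allocation fails. */
TYPE **randomSquareMatrix(int dimension, unsigned int seed)
{
    assert(dimension > 0);
    TYPE **matrix = malloc((size_t)dimension * sizeof *matrix);
    if (matrix == NULL)
        return NULL;
    for (int i = 0; i < dimension; i++) {
        matrix[i] = malloc((size_t)dimension * sizeof **matrix);
        if (matrix[i] == NULL) {
            while (i-- > 0)          /* release the rows allocated so far */
                free(matrix[i]);
            free(matrix);
            return NULL;
        }
    }
    /* Rows are independent, so the fill loop can be split across threads. */
    #pragma omp parallel for
    for (int i = 0; i < dimension; i++) {
        for (int j = 0; j < dimension; j++) {
            unsigned int h = mix(seed + (unsigned int)(i * dimension + j));
            matrix[i][j] = (TYPE)(h % 1000u) / 1000.0;
        }
    }
    return matrix;
}

/* Releases a matrix created by randomSquareMatrix. */
void freeMatrix(TYPE **matrix, int dimension)
{
    for (int i = 0; i < dimension; i++)
        free(matrix[i]);
    free(matrix);
}
```

The same seed always produces the same matrix, which makes it easier to compare timings between runs. Without OpenMP enabled at compile time the pragma is ignored and the fill simply runs sequentially.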
TRADITIONAL MATRIX MULTIPLICATION
• Leaving performance aside for now, we begin with the direct implementation of matrix multiplication.
• The operations are carried out sequentially, one element of the resultant matrix at a time.
• Matrix A and matrix B are the input matrices and matrix C is the resultant matrix. The resultant matrix therefore has to be passed into the function by reference (in C, as a pointer) so that the function can write into it.
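The direct implementation is not reproduced on the slide. A minimal sketch of it, assuming the same TYPE** layout used elsewhere in the deck and a hypothetical function name sequentialMultiply, could look like this:

```c
#include <assert.h>

#define TYPE double  /* element type used throughout the deck */

/* Computes C = A * B for square matrices, one result element at a time.
   matrixC is a pointer to caller-owned storage, so the caller sees the result. */
void sequentialMultiply(TYPE **matrixA, TYPE **matrixB, TYPE **matrixC, int dimension)
{
    assert(dimension > 0);
    for (int i = 0; i < dimension; i++) {
        for (int j = 0; j < dimension; j++) {
            TYPE sum = 0;
            for (int k = 0; k < dimension; k++) {
                sum += matrixA[i][k] * matrixB[k][j];
            }
            matrixC[i][j] = sum;
        }
    }
}
```

The three nested loops perform dimension³ multiply-add operations, which is why the running time grows quickly with the matrix size.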
MATRIX MULTIPLICATION USING PARALLEL FOR LOOPS
• To parallelize the loops in your algorithm, you can either use a library such as OpenMP, which does the hard work for you, or write your own implementation with threads, in which case you have to handle load balancing, race conditions, etc. yourself.
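PROGRAM
#include <sys/time.h>   /* gettimeofday, struct timeval */

/* TYPE is the element type defined earlier (double in this deck). */
/* Computes C = A * B in parallel and returns the elapsed time in seconds. */
double parallelMultiply(TYPE** matrixA, TYPE** matrixB, TYPE** matrixC, int dimension)
{
    struct timeval t0, t1;
    gettimeofday(&t0, 0);
    /* Rows of C are split among the threads; each thread writes only its own rows. */
    #pragma omp parallel for
    for (int i = 0; i < dimension; i++) {
        for (int j = 0; j < dimension; j++) {
            TYPE sum = 0;   /* local accumulator, so matrixC need not be zeroed beforehand */
            for (int k = 0; k < dimension; k++) {
                sum += matrixA[i][k] * matrixB[k][j];
            }
            matrixC[i][j] = sum;
        }
    }
    gettimeofday(&t1, 0);
    double elapsed = (t1.tv_sec - t0.tv_sec) * 1.0 +
                     (t1.tv_usec - t0.tv_usec) / 1000000.0;
    return elapsed;
}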
OPTIMIZED MATRIX MULTIPLICATION USING PARALLEL FOR LOOPS
• Since our matrices are stored on the heap as arrays of row pointers, they are not as cheap to access as a fixed-size array: every element access goes through an extra pointer, and the rows need not be adjacent in memory. It is better to copy the data from the heap into predefined, fixed-size arrays before starting the multiplication, so we first declare containers for that data.
• TYPE flatA[MAX_DIM];
• TYPE flatB[MAX_DIM];
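The copy step itself is not shown on the slide. The sketch below is one way it could be written, under several assumptions: the value of MAX_DIM (chosen here to hold a 2000 × 2000 matrix), the function name convert, and the decision to store B transposed are all hypothetical. The buffers are declared at file scope because two 32 MB arrays would typically not fit in a default-sized thread stack.

```c
#include <assert.h>

#define TYPE double
#define MAX_DIM (2000 * 2000)  /* assumed: room for the largest test matrix */

/* File-scope (static storage) buffers, 32 MB each at this MAX_DIM. */
static TYPE flatA[MAX_DIM];
static TYPE flatB[MAX_DIM];

/* Copies A row by row, and B transposed, into the flat buffers so that a
   row of A and a column of B are both contiguous in memory afterwards. */
void convert(TYPE **matrixA, TYPE **matrixB, int dimension)
{
    assert(dimension > 0 && dimension * dimension <= MAX_DIM);
    /* Each (i, j) pair writes to its own slots, so rows can be copied in parallel. */
    #pragma omp parallel for
    for (int i = 0; i < dimension; i++) {
        for (int j = 0; j < dimension; j++) {
            flatA[i * dimension + j] = matrixA[i][j];
            flatB[j * dimension + i] = matrixB[i][j];
        }
    }
}
```

After the call, element A[i][k] is at flatA[i * dimension + k] and element B[k][j] is at flatB[j * dimension + k], so the inner product for C[i][j] reads both buffers with stride 1.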
STEPS OF OPTIMIZED MATRIX MULTIPLICATION IMPLEMENTATION
1. Put common calculations in one place:
Most of the time we ignore small calculations that are repeated throughout a program, because clarity usually matters more than performance. Here performance matters, so such values are computed once and reused.
2. Cache-friendly algorithm implementation:
Memory has a linear arrangement, so every N-dimensional array is stored sequentially in memory. Reading elements in the order in which they are stored makes better use of the cache.
3. Using stack vs. heap memory efficiently:
• Stack memory is usually faster to access than heap memory, but the stack is limited in size. We have therefore stored the large input matrices in heap memory. For efficient intermediate calculations we have used the stack with predefined memory allocations.
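The three steps above can be sketched together in one multiplication routine. This is an illustration under stated assumptions rather than the author's code: the function name optimizedMultiply is hypothetical, and the flat inputs are passed as parameters, with flatBT assumed to already hold B in transposed order.

```c
#include <assert.h>

#define TYPE double

/* flatA holds A in row-major order; flatBT holds B transposed, so column j
   of B starts at flatBT[j * dimension]. Both names are hypothetical. */
void optimizedMultiply(const TYPE *flatA, const TYPE *flatBT,
                       TYPE **matrixC, int dimension)
{
    assert(dimension > 0);
    #pragma omp parallel for
    for (int i = 0; i < dimension; i++) {
        const int rowA = i * dimension;       /* step 1: offset computed once per row of A */
        for (int j = 0; j < dimension; j++) {
            const int rowBT = j * dimension;  /* step 1: offset computed once per element of C */
            TYPE sum = 0;                     /* step 3: accumulator local to the thread */
            for (int k = 0; k < dimension; k++) {
                /* step 2: both operands are read with stride 1 */
                sum += flatA[rowA + k] * flatBT[rowBT + k];
            }
            matrixC[i][j] = sum;              /* one write to heap memory per element */
        }
    }
}
```

Keeping the running sum in a local variable means the heap-allocated result matrix is touched once per element instead of once per multiplication.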
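• Here we have launched 40 threads for the multiplication. Since we are dealing with dimensions of 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800 and 2000, all of which are multiples of 40, the rows can be divided equally among the threads (5 rows per thread at dimension 200, 50 rows per thread at dimension 2000).
• In the OpenMP directive we have explicitly declared matrixC as a shared resource. Race conditions are avoided because each thread writes only to its own rows of matrixC.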
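The directive itself does not appear on the slide. A minimal sketch of how the thread count and the sharing of matrixC might be spelled out is given below. The function name multiplyWith40Threads, the use of default(none) and the static schedule are assumptions made for illustration.

```c
#include <assert.h>

#define TYPE double

/* Computes C = A * B with an explicit thread count and explicit data sharing. */
void multiplyWith40Threads(TYPE **matrixA, TYPE **matrixB,
                           TYPE **matrixC, int dimension)
{
    assert(dimension > 0);
    /* default(none) forces every variable used in the region to be listed.
       schedule(static) hands each thread one contiguous block of rows. */
    #pragma omp parallel for num_threads(40) schedule(static) \
            default(none) shared(matrixA, matrixB, matrixC, dimension)
    for (int i = 0; i < dimension; i++) {      /* i is private to each thread */
        for (int j = 0; j < dimension; j++) {
            TYPE sum = 0;                      /* declared inside the loop, so private */
            for (int k = 0; k < dimension; k++) {
                sum += matrixA[i][k] * matrixB[k][j];
            }
            matrixC[i][j] = sum;               /* only the thread that owns row i writes here */
        }
    }
}
```

All threads see the same matrixC, but since the outer loop assigns each row index to exactly one thread, no two threads ever write the same element and no locking is needed.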
THANK YOU!!!