1
Distributed Memory
Programming with MPI
Slides extended from
An Introduction to Parallel Programming by Peter
Pacheco
Dilum Bandara
Dilum.Bandara@uom.lk
2
Distributed Memory Systems
 We discuss developing programs for these
systems using MPI
 MPI – Message Passing Interface
 A library of routines that can be called from C, C++, &
Fortran
Copyright © 2010, Elsevier Inc. All rights Reserved
3
Why MPI?
 Standardized & portable message-passing
system
 One of the oldest libraries
 Wide-spread adoption
 Minimal requirements on underlying hardware
 Explicit parallelization
 Achieves high performance
 Scales to a large no of processors
 Intellectually demanding
Copyright © 2010, Elsevier Inc. All rights Reserved
4
Our First MPI Program
Copyright © 2010, Elsevier Inc. All rights Reserved
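The program itself appears as an image on the slide. A minimal sketch consistent with the output shown on slide 7 (the file name mpi_hello.c comes from the compilation slide; the exact message wording is inferred from the sample output):

#include <stdio.h>
#include <string.h>
#include <mpi.h>

#define MAX_STRING 100

int main(int argc, char* argv[]) {
   char greeting[MAX_STRING];
   int  comm_sz;   /* no of processes */
   int  my_rank;   /* my process rank */

   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
   MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

   if (my_rank != 0) {
      /* Workers build a greeting & send it to process 0 */
      sprintf(greeting, "Greetings from process %d of %d !", my_rank, comm_sz);
      MPI_Send(greeting, strlen(greeting) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
   } else {
      /* Process 0 prints its own greeting, then the others' in rank order */
      printf("Greetings from process %d of %d !\n", my_rank, comm_sz);
      for (int q = 1; q < comm_sz; q++) {
         MPI_Recv(greeting, MAX_STRING, MPI_CHAR, q, 0, MPI_COMM_WORLD,
                  MPI_STATUS_IGNORE);
         printf("%s\n", greeting);
      }
   }

   MPI_Finalize();
   return 0;
}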
5
Compilation
Copyright © 2010, Elsevier Inc. All rights Reserved
mpicc -g -Wall -o mpi_hello mpi_hello.c
mpicc – wrapper script to compile
-g – produce debugging information
-Wall – turns on all warnings
-o mpi_hello – create this executable file name
(as opposed to default a.out)
mpi_hello.c – source file
6
Execution
Copyright © 2010, Elsevier Inc. All rights Reserved
mpiexec -n <no of processes> <executable>
mpiexec -n 1 ./mpi_hello (run with 1 process)
mpiexec -n 4 ./mpi_hello (run with 4 processes)
7
Execution
Copyright © 2010, Elsevier Inc. All rights Reserved
mpiexec -n 1 ./mpi_hello
mpiexec -n 4 ./mpi_hello
Greetings from process 0 of 1 !
Greetings from process 0 of 4 !
Greetings from process 1 of 4 !
Greetings from process 2 of 4 !
Greetings from process 3 of 4 !
8
MPI Programs
 Need to include the mpi.h header file
 Identifiers defined by MPI start with “MPI_”
 1st letter following underscore is uppercase
 For function names & MPI-defined types
 Helps to avoid confusion
Copyright © 2010, Elsevier Inc. All rights Reserved
9
6 Golden MPI Functions
Copyright © 2010, Elsevier Inc. All rights Reserved
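The list is an image on the slide; the six functions are commonly taken to be MPI_Init, MPI_Finalize, MPI_Comm_size, MPI_Comm_rank, MPI_Send, & MPI_Recv. Their C prototypes (parameter names follow the textbook's convention):

int MPI_Init(int* argc_p, char*** argv_p);
int MPI_Finalize(void);
int MPI_Comm_size(MPI_Comm comm, int* comm_sz_p);
int MPI_Comm_rank(MPI_Comm comm, int* my_rank_p);
int MPI_Send(void* msg_buf_p, int msg_size, MPI_Datatype msg_type,
             int dest, int tag, MPI_Comm communicator);
int MPI_Recv(void* msg_buf_p, int buf_size, MPI_Datatype buf_type,
             int source, int tag, MPI_Comm communicator,
             MPI_Status* status_p);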
10
MPI Components
 MPI_Init
 Tells MPI to do all necessary setup
 e.g., allocate storage for message buffers, decide rank of a
process
 argc_p & argv_p are pointers to argc & argv
arguments in main( )
 Function returns error codes
Copyright © 2010, Elsevier Inc. All rights Reserved
11
MPI Components (Cont.)
 MPI_Finalize
 Tells MPI we’re done, so clean up anything allocated
for this program
Copyright © 2010, Elsevier Inc. All rights Reserved
12
Communicators
 Collection of processes that can send messages
to each other
 Messages from other communicators are ignored
 MPI_Init defines a communicator that consists of
all processes created when the program is
started
 Called MPI_COMM_WORLD
Copyright © 2010, Elsevier Inc. All rights Reserved
13
Communicators (Cont.)
Copyright © 2010, Elsevier Inc. All rights Reserved
MPI_Comm_rank – my rank (rank of the process making this call)
MPI_Comm_size – no of processes in the communicator
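A sketch of the two calls these labels annotate:

int comm_sz;   /* no of processes in the communicator           */
int my_rank;   /* rank of the process making this call          */

MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);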
14
Single-Program Multiple-Data (SPMD)
 We compile 1 program
 Process 0 does something different
 Receives messages & prints them while the
other processes do the work
 if-else construct makes our program
SPMD
 We can run this program on any no of
processors
 e.g., 4, 8, 32, 1000, …
Copyright © 2010, Elsevier Inc. All rights Reserved
15
Communication
 msg_buf_p, msg_size, msg_type
 Determine the content of the message
 dest – destination process’s rank
 tag – used to distinguish messages that are identical in
content
Copyright © 2010, Elsevier Inc. All rights Reserved
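A sketch of a send that maps onto these parameters (the message text & the use of <string.h> are illustrative assumptions):

char msg[] = "x = 3.14";
int  dest = 0;   /* rank of the receiving process                        */
int  tag  = 0;   /* distinguishes messages that are otherwise identical  */

/* msg_buf_p = msg, msg_size = strlen(msg) + 1, msg_type = MPI_CHAR */
MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);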
16
Data Types
Copyright © 2010, Elsevier Inc. All rights Reserved
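The slide’s table maps MPI’s predefined datatypes to C types; commonly used entries include:
 MPI_CHAR – signed char
 MPI_SHORT – signed short int
 MPI_INT – signed int
 MPI_LONG – signed long int
 MPI_UNSIGNED – unsigned int
 MPI_FLOAT – float
 MPI_DOUBLE – double
 MPI_LONG_DOUBLE – long double
 MPI_BYTE & MPI_PACKED – raw/packed bytes (no single C equivalent)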
17
Communication (Cont.)
Copyright © 2010, Elsevier Inc. All rights Reserved
Use MPI_ANY_SOURCE to receive messages from any
source, in order of arrival
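A sketch of such a receive (the use of MPI_ANY_TAG is an assumption, not shown on the slide):

double x;
MPI_Status status;

/* Accept one double from any sender, any tag; messages are taken in
 * the order they arrive at this process */
MPI_Recv(&x, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);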
18
Message Matching
Copyright © 2010, Elsevier Inc. All rights Reserved
A message sent by process q matches a receive posted by
process r when recv_comm = send_comm, recv_tag = send_tag,
dest = r, & src = q
19
Receiving Messages
 Receiver can get a message without
knowing
 Amount of data in message
 Sender of message
 Tag of message
 How can those be found out?
Copyright © 2010, Elsevier Inc. All rights Reserved
20
How Much Data am I Receiving?
Copyright © 2010, Elsevier Inc. All rights Reserved
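The answer on the slide is the MPI_Status argument plus MPI_Get_count; a sketch (recv_buf & MAX_SIZE are assumed names):

MPI_Status status;
int count, sender, tag;

MPI_Recv(recv_buf, MAX_SIZE, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);

MPI_Get_count(&status, MPI_DOUBLE, &count);  /* amount of data in message */
sender = status.MPI_SOURCE;                  /* sender of message         */
tag    = status.MPI_TAG;                     /* tag of message            */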
21
Issues With Send & Receive
 Exact behavior is determined by MPI
implementation
 MPI_Send may behave differently with regard to
buffer size, cutoffs, & blocking
 Cutoff
 If message size < cutoff, the message is buffered
 If message size ≥ cutoff, MPI_Send will block
 MPI_Recv always blocks until a matching
message is received
 Message ordering from a given sender is preserved
 Know your implementation
 Don’t make assumptions!
Copyright © 2010, Elsevier Inc. All rights Reserved
22
Trapezoidal Rule
Copyright © 2010, Elsevier Inc. All rights Reserved
23
Trapezoidal Rule (Cont.)
Copyright © 2010, Elsevier Inc. All rights Reserved
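For reference, the rule shown on these slides: with n trapezoids of equal width h = (b − a)/n and endpoints x_i = a + i·h,

\int_a^b f(x)\,dx \;\approx\;
h\left[\frac{f(x_0)}{2} + f(x_1) + f(x_2) + \cdots + f(x_{n-1}) + \frac{f(x_n)}{2}\right]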
24
Serial Pseudo-code
Copyright © 2010, Elsevier Inc. All rights Reserved
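The pseudo-code itself is on the slide image; a serial sketch in C (f is some user-supplied function, & the names are assumptions):

/* Estimate integral of f from a to b using n trapezoids of width h */
double Trap(double a, double b, int n, double h) {
   double approx = (f(a) + f(b)) / 2.0;
   for (int i = 1; i <= n - 1; i++)
      approx += f(a + i * h);
   return h * approx;
}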
25
Parallel Pseudo-Code
Copyright © 2010, Elsevier Inc. All rights Reserved
26
Tasks & Communications for
Trapezoidal Rule
Copyright © 2010, Elsevier Inc. All rights Reserved
27
First Version
Copyright © 2010, Elsevier Inc. All rights Reserved
28
First Version (Cont.)
Copyright © 2010, Elsevier Inc. All rights Reserved
29
First Version (Cont.)
Copyright © 2010, Elsevier Inc. All rights Reserved
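The code is on the slide images; the core of a first MPI version, sketched along the same lines (variable names & the hard-coded a, b, n are assumptions):

int my_rank, comm_sz, n = 1024, local_n;
double a = 0.0, b = 3.0, h, local_a, local_b, local_int, total_int;

MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

h = (b - a) / n;          /* width of each trapezoid      */
local_n = n / comm_sz;    /* no of trapezoids per process */
local_a = a + my_rank * local_n * h;
local_b = local_a + local_n * h;
local_int = Trap(local_a, local_b, local_n, h);

if (my_rank != 0) {
   /* Workers send their partial integral to process 0 */
   MPI_Send(&local_int, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
} else {
   /* Process 0 adds up the pieces & prints the estimate */
   total_int = local_int;
   for (int source = 1; source < comm_sz; source++) {
      MPI_Recv(&local_int, 1, MPI_DOUBLE, source, 0, MPI_COMM_WORLD,
               MPI_STATUS_IGNORE);
      total_int += local_int;
   }
   printf("With n = %d trapezoids, the estimate of the integral is %.15e\n",
          n, total_int);
}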
30
COLLECTIVE
COMMUNICATION
Copyright © 2010, Elsevier Inc. All rights Reserved
31
Collective Communication
Copyright © 2010, Elsevier Inc. All rights Reserved
A tree-structured global sum
32
Alternative Tree-Structured Global Sum
Copyright © 2010, Elsevier Inc. All rights Reserved
Which is optimal?
Can we do better?
33
MPI_Reduce
Copyright © 2010, Elsevier Inc. All rights Reserved
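Its prototype, & a call that replaces the send/receive loop of the trapezoid program (a sketch; parameter names follow the textbook's convention):

int MPI_Reduce(void* input_data_p, void* output_data_p, int count,
               MPI_Datatype datatype, MPI_Op operator,
               int dest_process, MPI_Comm comm);

/* Sum every process's local_int into total_int on process 0 */
MPI_Reduce(&local_int, &total_int, 1, MPI_DOUBLE, MPI_SUM, 0,
           MPI_COMM_WORLD);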
34
Predefined Reduction Operators
Copyright © 2010, Elsevier Inc. All rights Reserved
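The slide’s table lists the predefined MPI_Op values:
 MPI_MAX – maximum, MPI_MIN – minimum
 MPI_SUM – sum, MPI_PROD – product
 MPI_LAND, MPI_BAND – logical & bitwise and
 MPI_LOR, MPI_BOR – logical & bitwise or
 MPI_LXOR, MPI_BXOR – logical & bitwise exclusive or
 MPI_MAXLOC, MPI_MINLOC – maximum/minimum & its location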
35
Collective vs. Point-to-Point Communications
 All processes in the communicator must call the
same collective function
 e.g., a program that attempts to match a call to
MPI_Reduce on 1 process with a call to MPI_Recv on
another process is erroneous
 Program will hang or crash
 Arguments passed by each process to an MPI
collective communication must be “compatible”
 e.g., if 1 process passes in 0 as dest_process &
another passes in 1, then the outcome of a call to
MPI_Reduce is erroneous
 Program is likely to hang or crash
Copyright © 2010, Elsevier Inc. All rights Reserved
36
Collective vs. P-to-P Communications (Cont.)
 output_data_p argument is only used on
dest_process
 However, all of the processes still need to pass in an
actual argument corresponding to output_data_p,
even if it’s just NULL
 Point-to-point communications are matched on
the basis of tags & communicators
 Collective communications don’t use tags
 Matched solely on the basis of communicator & order
in which they’re called
Copyright © 2010, Elsevier Inc. All rights Reserved
37
MPI_Allreduce
 Useful when all processes need result of a global
sum to complete some larger computation
Copyright © 2010, Elsevier Inc. All rights Reserved
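Its prototype is identical to MPI_Reduce's except that there is no dest_process; every process receives the result. A sketch:

int MPI_Allreduce(void* input_data_p, void* output_data_p, int count,
                  MPI_Datatype datatype, MPI_Op operator, MPI_Comm comm);

/* After the call, every process holds the global sum in total_int */
MPI_Allreduce(&local_int, &total_int, 1, MPI_DOUBLE, MPI_SUM,
              MPI_COMM_WORLD);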
38
MPI_Allreduce (Cont.)
Global sum followed by distribution of result
Copyright © 2010, Elsevier Inc. All rights Reserved
39
Butterfly-Structured Global Sum
Copyright © 2010, Elsevier Inc. All rights Reserved
Processes exchange partial results
40
Broadcast
 Data belonging to a single process is sent to all
of the processes in the communicator
Copyright © 2010, Elsevier Inc. All rights Reserved
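A sketch of MPI_Bcast (the scanf-on-process-0 example is an assumption, but a typical use):

int MPI_Bcast(void* data_p, int count, MPI_Datatype datatype,
              int source_proc, MPI_Comm comm);

/* Process 0 reads n, then every process (including 0) gets a copy */
if (my_rank == 0)
   scanf("%d", &n);
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);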
41
Tree-Structured Broadcast
Copyright © 2010, Elsevier Inc. All rights Reserved
42
Data Distributions – Compute a Vector Sum
Copyright © 2010, Elsevier Inc. All rights Reserved
Serial implementation
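The serial code is on the slide image; a sketch (names are assumptions):

/* x = a + b, componentwise */
void Vector_sum(double x[], double a[], double b[], int n) {
   for (int i = 0; i < n; i++)
      x[i] = a[i] + b[i];
}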
43
Partitioning Options
Copyright © 2010, Elsevier Inc. All rights Reserved
 Block partitioning
 Assign blocks of consecutive components to each process
 Cyclic partitioning
 Assign components in a round robin fashion
 Block-cyclic partitioning
 Use a cyclic distribution of blocks of components
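For example, with 12 components, 3 processes, & a block size of 2 (an illustrative assumption), the three schemes assign components as:
Block: P0: 0 1 2 3 | P1: 4 5 6 7 | P2: 8 9 10 11
Cyclic: P0: 0 3 6 9 | P1: 1 4 7 10 | P2: 2 5 8 11
Block-cyclic: P0: 0 1 6 7 | P1: 2 3 8 9 | P2: 4 5 10 11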
44
Parallel Implementation
Copyright © 2010, Elsevier Inc. All rights Reserved
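A sketch of the parallel version, where each process adds only its own block of local_n = n / comm_sz components (names are assumptions):

void Parallel_vector_sum(double local_x[], double local_a[],
                         double local_b[], int local_n) {
   for (int local_i = 0; local_i < local_n; local_i++)
      local_x[local_i] = local_a[local_i] + local_b[local_i];
}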
45
MPI_Scatter
 Can be used in a function that reads in an entire
vector on process 0 but sends only the needed
components to each of the other processes
Copyright © 2010, Elsevier Inc. All rights Reserved
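Its prototype (note that send_count is the amount of data going to each process, not the total):

int MPI_Scatter(void* send_buf_p, int send_count, MPI_Datatype send_type,
                void* recv_buf_p, int recv_count, MPI_Datatype recv_type,
                int src_proc, MPI_Comm comm);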
46
Reading & Distributing a Vector
Copyright © 2010, Elsevier Inc. All rights Reserved
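The function is on the slide image; a sketch of the idea, in which process 0 reads the whole vector & MPI_Scatter hands each process its block (function & variable names are assumptions; assumes <stdio.h> & <stdlib.h>):

void Read_vector(double local_a[], int local_n, int n,
                 int my_rank, MPI_Comm comm) {
   double* a = NULL;
   if (my_rank == 0) {
      a = malloc(n * sizeof(double));
      for (int i = 0; i < n; i++)
         scanf("%lf", &a[i]);
   }
   /* On processes other than 0 the send buffer is ignored, so NULL is fine */
   MPI_Scatter(a, local_n, MPI_DOUBLE,        /* whole vector on process 0 */
               local_a, local_n, MPI_DOUBLE,  /* this process's block      */
               0, comm);
   if (my_rank == 0)
      free(a);
}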
47
MPI_Gather
 Collect all components of a vector onto process 0
 Then process 0 can process all of the components
Copyright © 2010, Elsevier Inc. All rights Reserved
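Its prototype, plus a sketch of a call that collects a distributed result onto process 0 (here recv_count is the amount received from each process; local_x, local_n, & x are assumed names):

int MPI_Gather(void* send_buf_p, int send_count, MPI_Datatype send_type,
               void* recv_buf_p, int recv_count, MPI_Datatype recv_type,
               int dest_proc, MPI_Comm comm);

/* Gather every process's local_x block into x on process 0 */
MPI_Gather(local_x, local_n, MPI_DOUBLE, x, local_n, MPI_DOUBLE,
           0, MPI_COMM_WORLD);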
48
MPI_Allgather
 Concatenates contents of each process’
send_buf_p & stores this in each process’
recv_buf_p
 recv_count is the amount of data being received from
each process
Copyright © 2010, Elsevier Inc. All rights Reserved
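Its prototype & a sketch of a call; like MPI_Allreduce, there is no destination process:

int MPI_Allgather(void* send_buf_p, int send_count, MPI_Datatype send_type,
                  void* recv_buf_p, int recv_count, MPI_Datatype recv_type,
                  MPI_Comm comm);

/* Every process ends up with the full vector x */
MPI_Allgather(local_x, local_n, MPI_DOUBLE, x, local_n, MPI_DOUBLE,
              MPI_COMM_WORLD);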
49
Summary
Copyright © 2010, Elsevier Inc. All rights Reserved
Source: https://computing.llnl.gov/tutorials/mpi/



Editor's Notes

  1. 8 January 2024
  2. MPI began in Supercomputing ’92
  3. Tag – e.g., with 2 messages of identical content, 1 should be printed while the other is used for calculation
  4. MPI_Reduce arguments: sendbuf – address of send buffer (choice); count – no of elements in send buffer (integer); datatype – data type of elements of send buffer (handle); op – reduce operation (handle); root – rank of root process (integer); comm – communicator (handle)