Introduction to MPI
The Message Passing Model
Applications that do not share a global address space
need a Message Passing Framework.
An application passes messages among processes in order
to perform a task.
Almost any parallel application can be expressed with the
message passing model.
Four classes of operations:
 Environment Management
 Data movement/ Communication
 Collective computation/communication
 Synchronization
General MPI Program Structure
Header File
– C/C++: #include "mpi.h"
– Fortran: include 'mpif.h'
Initialize MPI Env.
– MPI_Init(..)
Terminate MPI Env.
– MPI_Finalize()
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[] )
{
MPI_Init( &argc, &argv );
printf( "Hello, world!n" );
MPI_Finalize();
return 0;
}
Environment Management Routines
Group of Routines used for interrogating and setting the
MPI execution environment.
MPI_Init
Initializes the MPI execution environment. This function
must be called in every MPI program
MPI_Finalize
Terminates the MPI execution environment. This function
should be the last MPI routine called in every MPI
program - no other MPI routines may be called after it.
MPI_Get_processor_name
Returns the processor name. Also returns the length of the name.
The buffer for "name" must be at least
MPI_MAX_PROCESSOR_NAME characters in size. What is
returned into "name" is implementation dependent - may not be the
same as the output of the "hostname" or "host" shell commands.
MPI_Get_processor_name (&name,&resultlength)
MPI_Wtime
Returns an elapsed wall clock time in seconds (double precision) on
the calling processor.
MPI_Wtime ()
Environment Management Routines
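As a minimal illustrative sketch (not part of the original slides), the program below reports the processor name and times a placeholder block of work with MPI_Wtime; the variable names are just examples.
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[] )
{
    char name[MPI_MAX_PROCESSOR_NAME];
    int len;
    double t0, t1;
    MPI_Init( &argc, &argv );
    /* buffer must hold at least MPI_MAX_PROCESSOR_NAME characters */
    MPI_Get_processor_name( name, &len );
    t0 = MPI_Wtime();            /* wall-clock time before the work */
    /* ... do some work here ... */
    t1 = MPI_Wtime();            /* wall-clock time after the work */
    printf( "Running on %s, elapsed %f seconds\n", name, t1 - t0 );
    MPI_Finalize();
    return 0;
}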
Communication
Communicator: All MPI communication occurs within a
group of processes.
Rank: Each process in the group has a unique identifier.
Size: Number of processes in a group or communicator.
The default, pre-defined communicator is MPI_COMM_WORLD,
which is the group of all processes.
Environment / Communication
MPI_Comm_size
Returns the total number of MPI processes in the specified
communicator, such as MPI_COMM_WORLD. If the
communicator is MPI_COMM_WORLD, then it represents the
number of MPI tasks available to your application.
MPI_Comm_size (comm,&size)
MPI_Comm_rank
Returns the rank of the calling MPI process within the specified
communicator. Initially, each process will be assigned a unique
integer rank between 0 and number of tasks - 1 within the
communicator MPI_COMM_WORLD.
This rank is often referred to as a task ID. If a process becomes
associated with other communicators, it will have a unique rank
within each of these as well.
MPI_Comm_rank (comm,&rank)
MPI – HelloWorld Example
#include <mpi.h>
#include <iostream>
using namespace std;
int main(int argc, char **argv) {
int rank;
int size;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
cout << "Hello, I’m process "
<< rank << " of " << size << endl;
MPI_Finalize();
return 0;
}
MPI_Init(int *argc, char ***argv);
MPI_Init(NULL, NULL);
Hello, I’m process 0 of 3
Hello, I’m process 2 of 3
Hello, I’m process 1 of 3
Not necessarily sorted!
Communication
Point-to-point communications : Transfer
message from one process to another
process
❖ It involves an explicit send and receive,
which is called two-sided communication.
❖ Message: data + (source + destination +
communicator )
❖ Almost all of the MPI commands are built
around point-to-point operations.
MPI Send and Receive
The foundation of communication is built upon send and receive
operations among processes.
Almost every single function in MPI can be implemented with basic
send and receive calls.
1. Process A decides a message needs to be sent to process B.
2. Process A then packs up all of its necessary data into a buffer for
process B.
3. These buffers are often referred to as envelopes since the data is
being packed into a single message before transmission.
4. After the data is packed into a buffer, the communication device
(which is often a network) is responsible for routing the message to
the proper location.
5. The location identifier is the rank of the destination process.
MPI Send and Receive
6. Send and Recv must occur in pairs and are blocking functions.
7. Even though the message is routed to B, process B still has to
acknowledge that it wants to receive A’s data. Once it does this, the
data has been transmitted. Process A is notified that the data has
been transmitted and may go back to work. (Blocking)
8. Sometimes A might have to send many different types of messages
to B. Instead of B having to go through extra measures to
differentiate all these messages,
9. MPI allows senders and receivers to specify message IDs with the
message (known as tags).
10. When process B only requests a message with a certain tag
number, messages with different tags will be buffered by the
network until B is ready for them.
Blocking Send & Receive
Non-Blocking Send & Receive
More MPI Concepts
Blocking: A blocking send or receive routine does not return until the
operation is complete.
-- Blocking sends ensure that it is safe to overwrite the sent data.
-- Blocking receives ensure that the data has arrived and is ready
for use.
Non-blocking: A non-blocking send or receive routine returns
immediately, with no information about completion.
-- The user should test for success or failure of the communication.
-- In between, the process is free to handle other tasks.
-- It is less likely to produce deadlocking code.
-- It is used with MPI_Wait() or MPI_Test().
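A minimal non-blocking sketch (illustrative, not from the slides; run with at least two processes): rank 0 posts MPI_Isend and rank 1 posts MPI_Irecv, both can overlap other work and then complete the operation with MPI_Wait.
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[] )
{
    int rank, value = 0;
    MPI_Request req;
    MPI_Status status;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    if ( rank == 0 ) {
        value = 42;
        MPI_Isend( &value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req );
        /* ... other work can be done while the send is in flight ... */
        MPI_Wait( &req, &status );   /* safe to reuse 'value' only after this */
    } else if ( rank == 1 ) {
        MPI_Irecv( &value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req );
        /* ... other work ... */
        MPI_Wait( &req, &status );   /* data is valid only after this */
        printf( "Received %d\n", value );
    }
    MPI_Finalize();
    return 0;
}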
MPI Send and Receive
MPI_Send ( &data, count, MPI_INT, 1, tag, comm);
Parses memory from the starting address as count contiguous
elements of the given datatype.
MPI_Recv (void* data, int count, MPI_Datatype datatype, int
source, int tag, MPI_Comm communicator, MPI_Status* status);
Arguments: address of the data, number of elements, data type,
destination/source (rank), message identifier (tag, an int), and
communicator.
MPI Send and Receive
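As a concrete blocking example (an illustrative sketch, not from the slides; needs at least two processes), rank 0 sends one integer with a tag to rank 1:
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[] )
{
    int rank, number, tag = 7;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    if ( rank == 0 ) {
        number = -1;
        MPI_Send( &number, 1, MPI_INT, 1, tag, MPI_COMM_WORLD );   /* blocking send to rank 1 */
    } else if ( rank == 1 ) {
        MPI_Recv( &number, 1, MPI_INT, 0, tag, MPI_COMM_WORLD,
                  MPI_STATUS_IGNORE );                              /* blocking receive from rank 0 */
        printf( "Process 1 received number %d from process 0\n", number );
    }
    MPI_Finalize();
    return 0;
}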
MPI Datatypes
MPI datatype             C equivalent
MPI_SHORT                short int
MPI_INT                  int
MPI_LONG                 long int
MPI_LONG_LONG            long long int
MPI_UNSIGNED_CHAR        unsigned char
MPI_UNSIGNED_SHORT       unsigned short int
MPI_UNSIGNED             unsigned int
MPI_UNSIGNED_LONG        unsigned long int
MPI_UNSIGNED_LONG_LONG   unsigned long long int
MPI_FLOAT                float
MPI_DOUBLE               double
MPI_LONG_DOUBLE          long double
MPI_BYTE                 char
➢ MPI predefines its primitive data types.
➢ Primitive data types are contiguous.
➢ MPI also provides facilities for you to define your own data
structures based upon sequences of the MPI primitive data types.
➢ Such user-defined structures are called derived data types.
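A small sketch of a derived data type (illustrative, not from the slides): MPI_Type_contiguous builds a new type from a run of primitive elements; here three doubles form one "point", which rank 0 sends to rank 1.
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[] )
{
    double point[3] = { 1.0, 2.0, 3.0 };
    MPI_Datatype point_type;
    int rank;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Type_contiguous( 3, MPI_DOUBLE, &point_type );   /* 3 contiguous doubles */
    MPI_Type_commit( &point_type );                      /* must commit before use */
    if ( rank == 0 )
        MPI_Send( point, 1, point_type, 1, 0, MPI_COMM_WORLD );
    else if ( rank == 1 )
        MPI_Recv( point, 1, point_type, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
    MPI_Type_free( &point_type );
    MPI_Finalize();
    return 0;
}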
Compute pi by Numerical Integration
N processes (0, 1, ..., N-1); master process: process 0.
Divide the computational task into N portions; each process
computes its own (partial) sum.
At the end, the master (process 0) collects all the partial
sums and forms the total sum.
Basic set of MPI functions used
◦ Init
◦ Finalize
◦ Send
◦ Recv
◦ Comm Size
◦ Rank
MPI_Init(&argc,&argv); // Initialize
MPI_Comm_size(MPI_COMM_WORLD, &num_procs); // Get # processors
MPI_Comm_rank(MPI_COMM_WORLD, &myid);
N = # intervals used to do the integration...
w = 1.0/(double) N;
mypi = 0.0; // My partial sum (from a MPI processor)
Compute my part of the partial sum based on
1. myid
2. num_procs
if ( I am the master of the group )
{
for ( i = 1; i < num_procs; i++)
{
receive the partial sum from MPI processor i;
Add partial sum to my own partial sum;
}
Print final total;
}
else
{
Send my partial sum to the master of the MPI group;
}
MPI_Finalize();
int main(int argc, char *argv[]) {
int N; // Number of intervals
double w, x; // width and x point
int i, myid, num_procs;
double mypi, others_pi;
MPI_Init(&argc,&argv); // Initialize
// Get # processors
MPI_Comm_size(MPI_COMM_WORLD, &num_procs);
MPI_Comm_rank(MPI_COMM_WORLD, &myid);
N = atoi(argv[1]);
w = 1.0/(double) N;
mypi = 0.0;
//Each MPI Process has its own copy of every variable
Compute PI by Numerical Integration C code
/* --------------------------------------------------------------------------------
Every MPI process computes a partial sum for the integral
------------------------------------------------------------------------------------ */
for (i = myid; i < N; i = i + num_procs)
{
x = w*(i + 0.5);
mypi = mypi + w*f(x);
}
P = total number of processes, N is the total number of rectangles
Process 0 computes the sum of f(w *(0.5)), f(w*(P+0.5)), f(w*(2P+0.5))……
Process 1 computes the sum of f(w *(1.5)), f(w*(P+1.5)), f(w*(2P+1.5))……
Process 2 computes the sum of f(w *(2.5)), f(w*(P+2.5)), f(w*(2P+2.5))……
Process 3 computes the sum of f(w *(3.5)), f(w*(P+3.5)), f(w*(2P+3.5))……
Process 4 computes the sum of f(w *(4.5)), f(w*(P+4.5)), f(w*(2P+4.5))……
Compute PI by Numerical Integration C code
if ( myid == 0 ) //Now put the sum together...
{
// Proc 0 collects and others send data to proc 0
for (i = 1; i < num_procs; i++)
{
MPI_Recv(&others_pi, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
mypi += others_pi;
}
cout << "Pi = " << mypi<< endl << endl; // Output...
}
else
{
//The other processors send their partial sum to processor 0
MPI_Send(&mypi, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
}
MPI_Finalize();
}
Compute PI by Numerical Integration C code
Collective Communication
Communications that involve all processes in a group.
One to all
♦ Broadcast
♦ Scatter (personalized)
All to one
♦ Gather
All to all
♦ Allgather
♦ Alltoall (personalized)
“Personalized” means each process gets different data
Collective Communication
In a collective operation, processes must reach the same
point in the program code in order for the communication to
begin.
The call to the collective function is blocking.
Broadcast: Root
Process sends the same
piece of data to all
Processes in a
communicator group.
Scatter: Takes an array
of elements and
distributes the
elements in the order
of process rank.
Collective Communication
Gather: Takes elements
from many processes and
gathers them to one single
process. This routine is
highly useful to many
parallel algorithms, such
as parallel sorting and
searching
Collective Communication
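An illustrative scatter/gather sketch (not from the slides): the root distributes one integer to each process, each process increments its value, and the root gathers the results; the buffer names are placeholders.
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char *argv[] )
{
    int rank, size, i, value;
    int *sendbuf = NULL, *recvbuf = NULL;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    if ( rank == 0 ) {                        /* root prepares one element per process */
        sendbuf = (int*) malloc( size * sizeof(int) );
        recvbuf = (int*) malloc( size * sizeof(int) );
        for ( i = 0; i < size; i++ ) sendbuf[i] = i * 10;
    }
    MPI_Scatter( sendbuf, 1, MPI_INT, &value, 1, MPI_INT, 0, MPI_COMM_WORLD );
    value = value + 1;                        /* each process works on its element */
    MPI_Gather( &value, 1, MPI_INT, recvbuf, 1, MPI_INT, 0, MPI_COMM_WORLD );
    if ( rank == 0 ) {
        for ( i = 0; i < size; i++ ) printf( "%d ", recvbuf[i] );
        printf( "\n" );
        free( sendbuf ); free( recvbuf );
    }
    MPI_Finalize();
    return 0;
}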
Reduce: Takes an
array of input
elements on each
process and returns an
array of output
elements to the root
process. The output
elements contain the
reduced result.
Reduction Operation:
Max, Min, Sum,
Product, Logical and
Bitwise Operations.
MPI_Reduce (&local_sum, &global_sum, 1,
MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD)
Collective Computation
Collective Computation
All Gather: Just like
MPI_Gather, the elements from
each process are gathered in
order of their rank, except this
time the elements are gathered
to all processes
All to All: Extension to
MPI_Allgather. The jth block
from process i is received by
process j and stored in the i-th
block. Useful in applications
like matrix transposes or FFTs
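A short allgather sketch (illustrative, not from the slides): every process contributes its own rank and receives the full array of ranks.
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char *argv[] )
{
    int rank, size, i, *all;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    all = (int*) malloc( size * sizeof(int) );
    /* every process contributes one int and receives one from every process */
    MPI_Allgather( &rank, 1, MPI_INT, all, 1, MPI_INT, MPI_COMM_WORLD );
    printf( "Process %d sees:", rank );
    for ( i = 0; i < size; i++ ) printf( " %d", all[i] );
    printf( "\n" );
    free( all );
    MPI_Finalize();
    return 0;
}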
If one process reads data from disk or the command line, it can use a
broadcast or a scatter to get the information to the other processes.
Likewise, at the end of a program run, a gather or reduction can be used
to collect summary information about the program run.
However, a more common scenario is that the result of a collective is
needed on all processes.
Consider the computation of the standard deviation: every process
needs the mean μ of all the Xi values in order to form its squared
deviation (Xi - μ)^2.
Assume that every processor stores just one Xi value.
You can compute μ by doing a reduction followed by a broadcast.
It is better to use a so-called allreduce operation, which does the
reduction and leaves the result on all processors.
Collectives: Use
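A minimal sketch of this idea (illustrative, not from the slides), assuming each process holds exactly one value x: a first MPI_Allreduce forms the global sum so every process can compute the mean, and a second MPI_Allreduce sums the squared deviations so every process ends up with the standard deviation.
#include "mpi.h"
#include <stdio.h>
#include <math.h>
int main( int argc, char *argv[] )
{
    int rank, size;
    double x, sum, mean, sqdev, var;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    x = (double) rank;                 /* placeholder: each process holds one value */
    MPI_Allreduce( &x, &sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD );
    mean = sum / size;                 /* mean is now available on every process */
    sqdev = (x - mean) * (x - mean);
    MPI_Allreduce( &sqdev, &var, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD );
    var = var / size;
    printf( "Process %d: mean = %f, stddev = %f\n", rank, mean, sqrt(var) );
    MPI_Finalize();
    return 0;
}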
MPI_Barrier(MPI_Comm comm)
Provides the ability to block the calling process until all processes
in the communicator have reached this routine.
#include "mpi.h"
#include <stdio.h>
int main(int argc, char *argv[]) {
int rank, nprocs;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&nprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
MPI_Barrier(MPI_COMM_WORLD);
printf("Hello, world. I am %d of %dn", rank, procs); fflush(stdout);
MPI_Finalize();
return 0;
}
Synchronization
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
MPI_Get_processor_name(processor_name,&namelen);
fprintf(stderr,"Process# %d with name %s on %d processors\n",
myid, processor_name, numprocs);
if (myid == 0)
{
scanf("%d",&n);
printf("Number of intervals: %d (0 quits)n", n);
startwtime = MPI_Wtime();
}
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
Compute PI by Numerical Integration C code
h = 1.0 / (double) n;
sum = 0.0;
for (i = myid + 1; i <= n; i += numprocs)
{
x = h * ((double)i - 0.5);
sum += f(x);
}
mypi = h * sum;
MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
if (myid == 0){
printf("pi is approximately %.16fn", pi );
endwtime = MPI_Wtime();
printf("wall clock time = %fn", endwtime-startwtime);
}
MPI_Finalize();
Compute PI by Numerical Integration C code
Programming Environment
A Programming Environment (PrgEnv)
Set of related software components like
compilers, scientific software libraries,
implementations of parallel programming
paradigms, batch job schedulers, and other
third-party tools, all of which cooperate with each other.
Current Environments on Cray
PrgEnv-cray, PrgEnv-gnu and PrgEnv-intel
Implementations of MPI
Examples of Different Implementations
▪ MPICH - developed by Argonne National Labs (free)
▪ MPI/LAM - developed by Indiana, OSC, Notre Dame (free)
▪ MPI/Pro - commercial product
▪ Apple's X Grid
▪ OpenMPI
CRAY XC40 provides an implementation of the MPI-3.0
standard via the Cray Message Passing Toolkit (MPT), which
is based on the MPICH 3 library and optimised for the Cray
Aries interconnect.
All Programming Environments (PrgEnv-cray, PrgEnv-gnu and
PrgEnv-intel) can utilize the MPI library that is implemented
by Cray
Compiling an MPI program
Depends upon the implementation of MPI
Some Standard implementations : MPICH/ OPENMPI
Language    Wrapper Compiler Name
C           mpicc
C++         mpicxx or mpic++
Fortran     mpifort (for v1.7 and above);
            mpif77 and mpif90 (for older versions)
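For example, with an MPICH- or OpenMPI-style wrapper the compile step might look as follows (the file and executable names are placeholders):
mpicc -o hello_mpi hello_mpi.c        (C source)
mpicxx -o hello_mpi hello_mpi.cpp     (C++ source)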
Running an MPI program
Execution mode:
Interactive Mode : mpirun -np <#Number of Processors>
<name_of_executable>
Batch Mode: Using a job script (details on the SERC webpage)
#!/bin/csh
#PBS -N jobname
#PBS -l nodes=1:ppn=16
#PBS -l walltime=1:00:00
#PBS -e /path_of_executable/error.log
cd /path_of_executable
NPROCS=`wc -l < $PBS_NODEFILE`
HOSTS=`cat $PBS_NODEFILE | uniq | tr '\n' "," | sed 's|,$||'`
mpirun -np $NPROCS --host $HOSTS /name_of_executable
Error Handling
Most MPI routines include a return/error code parameter.
However, according to the MPI standard, the default behavior of an MPI
call is to abort if there is an error.
You will probably not be able to capture a return/error code other than
MPI_SUCCESS (zero).
The standard does provide a means to override this default error
handler. Consult the error handling section of the relevant MPI Standard
documentation located at http://www.mpi-forum.org/docs/.
The types of errors displayed to the user are implementation
dependent.
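As an illustrative sketch (not from the slides), the default handler on a communicator can be replaced with MPI_ERRORS_RETURN so that error codes can be inspected instead of aborting; the checked call below is just an example.
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[] )
{
    int rc, rank, errlen;
    char errstring[MPI_MAX_ERROR_STRING];
    MPI_Init( &argc, &argv );
    /* return error codes instead of aborting */
    MPI_Comm_set_errhandler( MPI_COMM_WORLD, MPI_ERRORS_RETURN );
    rc = MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    if ( rc != MPI_SUCCESS ) {
        MPI_Error_string( rc, errstring, &errlen );
        fprintf( stderr, "MPI error: %s\n", errstring );
    }
    MPI_Finalize();
    return 0;
}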
Thank You
