PROGRAMMING USING MPI AND OPENMP
- Mayuri Sewatkar (16101A1002)
Topics Covered
 MPI
 MPI Principles
 Building blocks
 The Message Passing Interface (MPI)
 Overlapping Communication and Computation
 Collective Communication Operations
 Composite Synchronization Constructs
 Pros and Cons of MPI
 OpenMP
 Threading
 Parallel Programming Model
 Combining MPI and OpenMP
 Shared Memory Programming
 Pros and Cons of OpenMP
What is MPI???
 Message Passing Interface (MPI) is a language-independent
communications protocol used to program parallel computers. Both
point-to-point and collective communication are supported.
 MPI "is a message-passing application programmer interface, together
with protocol and semantic specifications for how its features must
behave in any implementation." So, MPI is a specification, not an
implementation.
 MPI's goals are high performance, scalability, and portability.
MPI Principles
 MPI-1 model has no shared memory concept.
 MPI-2 has only a limited distributed shared memory
concept.
 MPI-3 includes new Fortran 2008 bindings, while it
removes deprecated C++ bindings as well as many
deprecated routines and MPI objects.
MPI Building Blocks
 Since interactions are accomplished by sending and receiving messages,
the basic operations in the message-passing programming paradigm are
SEND and RECEIVE.
 In their simplest form, the prototypes of these operations are defined as
follows:
 send(void *sendbuf, int nelems, int dest)
 receive(void *recvbuf, int nelems, int source)
 The sendbuf points to a buffer that stores the data to be sent, recvbuf
points to a buffer that stores the data to be received, nelems is the
number of data units to be sent and received, dest is the identifier of the
process that receives the data, and source is the identifier of the process
that sends the data.
MPI: the Message Passing Interface
 MPI defines a standard library for message-passing that can be used to
develop portable message-passing programs using either C or Fortran.
 The MPI standard defines both the syntax and the semantics of a core set of library routines that are very useful in writing message-passing programs.
 The MPI library contains over 125 routines.
 These routines are used to initialize and terminate the MPI library, to
get information about the parallel computing environment, and to send
and receive messages.
MPI: the Message Passing Interface
 MPI_Init - Initializes MPI.
 This function must be called in every MPI program, must be called
before any other MPI functions and must be called only once in an MPI
program.
 MPI_Init(&argc,&argv);
 MPI_Comm_size - Determines the number of processes.
 Returns the total number of MPI processes in the specified
communicator (MPI_COMM_WORLD).
 It represents the number of MPI tasks available to your application.
MPI: the Message Passing Interface
 MPI_Comm_rank - Determines the label of the calling process.
 Returns the rank of the calling MPI process within the specified
communicator. Initially, each process will be assigned a unique
integer rank between 0 and number of tasks - 1 within the
communicator MPI_COMM_WORLD. This rank is often referred
to as a task ID.
 MPI_Comm_rank (comm,&rank);
 MPI_Send - Sends a message.
 It performs a blocking send i.e. this routine may block until the
message is received by the destination process.
 int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
 buf -> initial address of send buffer
 count -> number of elements in send buffer
 datatype -> datatype of each send buffer element
 dest -> rank of destination
 tag -> message tag
 comm -> communicator
MPI: the Message Passing Interface
 MPI_Recv - Receives a message.
 The count argument indicates the maximum length of a message.
 int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
 buf -> initial address of receive buffer
 count -> maximum number of elements in receive buffer
 datatype -> datatype of each receive buffer element
 source -> rank of source
 tag -> message tag
 comm -> communicator
 status -> status object
 MPI_Finalize - Terminates MPI.
 This function should be the last MPI routine called in every MPI program - no other MPI routines may be called after it.
 MPI_Finalize();
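The routines above fit together as follows. A minimal point-to-point sketch (not part of the original slides): process 0 sends one integer to process 1, which receives and prints it; it assumes the program is run with at least two processes.
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[] )
{
    int rank, value;
    MPI_Status status;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    if (rank == 0) {
        value = 42;  /* arbitrary payload */
        MPI_Send( &value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD );           /* dest = 1, tag = 0 */
    } else if (rank == 1) {
        MPI_Recv( &value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status );  /* source = 0, tag = 0 */
        printf( "Process 1 received %d from process 0\n", value );
    }
    MPI_Finalize();
    return 0;
}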
Compiling and running MPI
Compiling: mpicc -o helloworld helloworld.c
Running: mpirun -np 4 ./helloworld
MPI Example – Hello World
#include "mpi.h"
#include <stdio.h>
int main( int argc, char *argv[] )
{
int rank, size;
MPI_Init( &argc, &argv );
MPI_Comm_rank( MPI_COMM_WORLD, &rank );
MPI_Comm_size( MPI_COMM_WORLD, &size );
printf( "Hello World from process %d of %d\n", rank, size );
MPI_Finalize();
return 0;
}
MPI Example – Hello World
Output –
Hello World from process 0 of 4
Hello World from process 2 of 4
Hello World from process 3 of 4
Hello World from process 1 of 4
Overlapping Communication and Computation
 A blocking send operation remains blocked until the message has been
copied out of the send buffer (either into a system buffer at the source
process or sent to the destination process).
 Similarly, a blocking receive operation returns only after the message
has been received and copied into the receive buffer.
 In order to overlap communication with computation, MPI provides a
pair of functions for performing non-blocking send and receive
operations.
 These functions are:
 MPI_Isend and
 MPI_Irecv.
Overlapping Communication and Computation
 MPI_Isend
 MPI_Isend starts a send operation but does not complete it; that is, it returns before the data is copied out of the buffer.
 The calling sequence of MPI_Isend is
 int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)
 MPI_Irecv
 MPI_Irecv starts a receive operation but returns before the data has been received and copied into the buffer.
 The calling sequence of MPI_Irecv is
 int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)
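A hedged sketch of the overlap (not from the slides; it assumes rank has been obtained with MPI_Comm_rank and exactly two processes are running): each process posts a non-blocking receive and send, computes while the messages are in flight, and then completes both requests with the standard MPI_Waitall routine before touching the buffers.
double sendval = 1.0, recvval = 0.0, local = 0.0;
int other = (rank == 0) ? 1 : 0;       /* partner rank */
MPI_Request reqs[2];
MPI_Irecv( &recvval, 1, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, &reqs[0] );
MPI_Isend( &sendval, 1, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, &reqs[1] );
for (int i = 0; i < 1000000; i++)      /* independent computation overlaps the transfers */
    local += 1.0 / (i + 1);
MPI_Waitall( 2, reqs, MPI_STATUSES_IGNORE );   /* both operations must complete before the buffers are reused */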
Collective Communication Operations
 MPI provides the following routines for collective communication:
 MPI_Bcast() -> Broadcast (one to all)
 MPI_Reduce() -> Reduction (all to one)
 MPI_Allreduce() -> Reduction (all to all)
 MPI_Scatter() -> Distribute data (one to all)
 MPI_Gather() -> Collect data (all to one)
 MPI_Alltoall() -> Distribute data (all to all)
 MPI_Allgather() -> Collect data (all to all)
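For illustration, a hedged fragment using two of these routines (it assumes MPI is initialized and rank obtained as in the earlier example): rank 0 broadcasts a value, every rank computes a contribution, and the contributions are summed back onto rank 0.
int n = 0, local, total = 0;
if (rank == 0) n = 100;                                  /* value known only at the root */
MPI_Bcast( &n, 1, MPI_INT, 0, MPI_COMM_WORLD );          /* one to all */
local = rank * n;                                        /* arbitrary per-process contribution */
MPI_Reduce( &local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD );   /* all to one: sum at rank 0 */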
Composite Synchronization Constructs
 By design, Pthreads provide support for a basic set of operations.
 Higher-level constructs can be built using these basic synchronization constructs.
 We discuss two such constructs - read-write locks and barriers.
 A read lock is granted even when other threads already hold read locks.
 If there is a write lock on the data (or if there are queued write locks),
the thread performs a condition wait.
 If there are multiple threads requesting a write lock, they must perform
a condition wait.
 With this description, we can design functions for
 read locks mylib_rwlock_rlock,
 write locks mylib_rwlock_wlock and
 unlocking mylib_rwlock_unlock.
Read-Write Locks
 The lock data type mylib_rwlock_t holds the following:
 a count of the number of readers,
 the writer (a 0/1 integer specifying whether a writer is present),
 a condition variable readers_proceed that is signaled when readers
can proceed,
 a condition variable writer_proceed that is signaled when one of the
writers can proceed,
 a count pending_writers of pending writers, and
 a mutex read_write_lock associated with the shared data structure
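A hedged sketch of this structure and of the read-lock routine built from its fields (the write-lock and unlock routines follow the same pattern and are omitted here):
typedef struct {
    int readers;                        /* number of readers holding the lock */
    int writer;                         /* 0/1: a writer is present */
    pthread_cond_t readers_proceed;     /* signaled when readers can proceed */
    pthread_cond_t writer_proceed;      /* signaled when one writer can proceed */
    int pending_writers;                /* writers waiting for the lock */
    pthread_mutex_t read_write_lock;    /* mutex protecting the structure */
} mylib_rwlock_t;

void mylib_rwlock_rlock(mylib_rwlock_t *l)
{
    pthread_mutex_lock(&(l -> read_write_lock));
    /* wait while a writer holds the lock or writers are queued */
    while ((l -> pending_writers > 0) || (l -> writer > 0))
        pthread_cond_wait(&(l -> readers_proceed), &(l -> read_write_lock));
    l -> readers ++;
    pthread_mutex_unlock(&(l -> read_write_lock));
}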
Barriers
 As in MPI, a barrier holds a thread until all threads participating in the
barrier have reached it.
 Barriers can be implemented using a counter, a mutex and a condition
variable.
 A single integer is used to keep track of the number of threads that have
reached the barrier.
 If the count is less than the total number of threads, the threads execute
a condition wait.
 The last thread entering (and setting the count to the number of
threads) wakes up all the threads using a condition broadcast.
Barriers
typedef struct
{
pthread_mutex_t count_lock;
pthread_cond_t ok_to_proceed;
int count;
} mylib_barrier_t;
void mylib_init_barrier(mylib_barrier_t *b)
{
b -> count = 0;
pthread_mutex_init(&(b -> count_lock), NULL);
pthread_cond_init(&(b -> ok_to_proceed), NULL);
}
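The listing above only initializes the barrier. A hedged sketch of the barrier operation itself, following the counter / condition-wait / broadcast description on the previous slide (num_threads is the total number of participating threads):
void mylib_barrier(mylib_barrier_t *b, int num_threads)
{
    pthread_mutex_lock(&(b -> count_lock));
    b -> count ++;
    if (b -> count == num_threads) {
        b -> count = 0;                                   /* reset for the next use */
        pthread_cond_broadcast(&(b -> ok_to_proceed));    /* last thread wakes everyone */
    } else {
        /* wait until the last thread broadcasts */
        while (pthread_cond_wait(&(b -> ok_to_proceed), &(b -> count_lock)) != 0);
    }
    pthread_mutex_unlock(&(b -> count_lock));
}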
Pros and Cons of MPI
 Pros
 Does not require shared-memory architectures, which are more expensive than distributed-memory architectures
 Can be used on a wider range of problems since it exploits both task
parallelism and data parallelism
 Can run on both shared memory and distributed memory architectures
 Highly portable with specific optimization for the implementation on most
hardware
 Cons
 Requires more programming changes to go from a serial to a parallel version
 Can be harder to debug
What is OpenMP???
 OpenMP (Open Multi-Processing) is an application programming
interface (API) that supports multi-platform shared memory
multiprocessing programming in C, C++, and Fortran, on most
platforms, processor architectures and operating systems, including
Solaris, AIX, HP-UX, Linux, MacOS, and Windows.
 OpenMP uses a portable, scalable model that gives programmers a
simple and flexible interface for developing parallel applications for
platforms ranging from the standard desktop computer to the
supercomputer.
What is OpenMP???
 OpenMP is essentially an add-on in the compiler. It is available in GCC (the GNU compiler), the Intel compiler, and other compilers.
 OpenMP targets shared-memory systems, i.e. systems where the processors share the main memory.
 OpenMP is based on a thread approach. It launches a single process, which in turn can create n threads as desired. It is based on what is called the "fork and join method", i.e. depending on the particular task it launches the desired number of threads as directed by the user.
Threading
 A thread is a single stream of control in the flow of a program.
 Static Threads
 All work is allocated and assigned at runtime
 Dynamic Threads
 Consists of one Master and a pool of threads
 The pool is assigned some of the work at runtime, but not all of it
 When a thread from the pool becomes idle, the Master gives it a new
assignment
 “Round-robin assignments”
Parallel Programming Model
 OpenMP uses the fork-join model of parallel execution.
 All OpenMP programs begin with a single master thread.
 The master thread executes sequentially until a parallel region is
encountered, when it creates a team of parallel threads (FORK).
 When the team threads complete the parallel region, they synchronize and
terminate, leaving only the master thread that executes sequentially
(JOIN).
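A minimal fork/join sketch (not part of the original slides): the serial parts run on the master thread alone, the parallel region forks a team, and the team joins at the closing brace.
#include <omp.h>
#include <stdio.h>
int main(void)
{
    printf("serial part: master thread only\n");
    #pragma omp parallel                                /* FORK: a team of threads is created */
    {
        printf("parallel part: thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }                                                   /* JOIN: the team synchronizes and terminates */
    printf("serial part again: master thread only\n");
    return 0;
}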
Variables
 2 types of Variables
 Private
 Shared
 Private Variables
 Variables in a thread’s private space can only be accessed by the thread
 Private variable has a different address in the execution context of every
thread.
 Clause: private(variable list)
 Shared Variables
 Variables in the global data space are accessed by all parallel threads.
 A shared variable has the same address in the execution context of every thread. All threads have access to shared variables.
Variables
 A thread can access its own private variables, but cannot access the
private variable of another thread.
 In the parallel for pragma, variables are shared by default, except the loop index variable, which is private.
#pragma omp parallel for private(privIndx, privDbl)
for ( i = 0; i < arraySize; i++ ) {
    /* privIndx and privDbl are written by every thread, so both must be private */
    for ( privIndx = 0; privIndx < 16; privIndx++ ) {
        privDbl = ( (double) privIndx ) / 16;
        y[i] = sin( exp( cos( - exp( sin(x[i]) ) ) ) ) + cos( privDbl );
    }
}
OpenMP Functions
 omp_get_num_procs()
 Returns the number of CPUs in the multiprocessor on which this thread is executing.
 The integer returned by this function may be less than the total number of
physical processors in the multiprocessor, depending on how the run-time
system gives processes access to processors.
 e.g. int t= omp_get_num_procs();
 omp_get_num_threads()
 Returns the number of threads active in the current parallel region
 t=omp_get_num_threads();
OpenMP Functions Contd.
 omp_set_num_threads()
 Allows setting the number of threads that execute the parallel sections of code
 Typically set equal to the number of available CPUs
 e.g. omp_set_num_threads(t);
 omp_get_thread_num()
 Returns the thread identification number, from 0 to n-1, where n is the number of active threads.
 tid = omp_get_thread_num();
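A hedged fragment putting these four functions together (assumes <omp.h> and <stdio.h> are included; one thread per CPU is just the common choice suggested above):
int t = omp_get_num_procs();           /* CPUs available to this process */
omp_set_num_threads(t);                /* request one thread per available CPU */
#pragma omp parallel
{
    int tid = omp_get_thread_num();    /* 0 .. n-1 */
    if (tid == 0)
        printf("%d threads active\n", omp_get_num_threads());
}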
OpenMP compiler directives (Pragma)
 A compiler directive in C or C++ is called a pragma.
 Format:
 #pragma omp directive-name [clause, ...]
1. #pragma omp parallel
 Block of code should be executed by all of the threads (code block is
replicated among the threads)
 use curly braces {} to create a block of code from a statement group.
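A short sketch of the directive format with one clause (assuming the same headers as before; tid is a hypothetical variable name): the braces turn the statement group into a single block that every thread executes.
int tid;
#pragma omp parallel private(tid)      /* directive-name "parallel" with one clause */
{
    tid = omp_get_thread_num();
    printf("this block is replicated on thread %d\n", tid);
}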
OpenMP compiler directives (Pragma)
2. #pragma omp parallel for
 indicate to the compiler that the iterations of a for loop may
be executed in parallel.
 e.g.
#pragma omp parallel for
for (i = first; i < size; i += prime)
marked[i] = 1;
Compiling and running OpenMP
Compiling: $ gcc -o hello_omp hello_omp.c -fopenmp
Running: $ ./hello_omp
Combining MPI and OpenMP
 In many cases hybrid programs using both MPI and OpenMP execute faster than programs using only MPI.
 Sometimes hybrid programs execute faster because they have lower
communication overhead.
 Suppose we are executing our program on a cluster of m multiprocessors,
where each multiprocessor has k CPUs. In order to utilize every CPU, a
program relying on MPI must create mk processes. During
communication steps, mk processes are active.
 On the other hand, a hybrid program need only create m processes. In
parallel sections of code, the workload is divided among k threads on
each multiprocessor. Hence every CPU is utilized.
 However, during communication steps, only m processes are active. This may well give the hybrid program lower communication overhead than a "pure" MPI program, resulting in higher speedup.
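A hedged skeleton of this hybrid scheme (the numerical loop is only a placeholder): one MPI process is started per multiprocessor, OpenMP threads share the loop inside each process, and only the m processes take part in the communication step. All MPI calls are made outside the parallel region.
#include "mpi.h"
#include <omp.h>
#include <stdio.h>
int main( int argc, char *argv[] )
{
    int rank, size;
    double local = 0.0, total = 0.0;
    MPI_Init( &argc, &argv );                      /* m processes, one per multiprocessor */
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    #pragma omp parallel for reduction(+:local)    /* k threads share the work on each node */
    for (int i = rank; i < 1000000; i += size)
        local += 1.0 / (i + 1);                    /* placeholder computation */
    /* communication step: only the m processes take part */
    MPI_Reduce( &local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD );
    if (rank == 0) printf( "total = %f\n", total );
    MPI_Finalize();
    return 0;
}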
Shared Memory Programming
 The underlying hardware is assumed to be a collection
of processors, each with access to the same shared
memory.
 Because they have access to the same memory
locations, processors can interact and synchronize with
each other through shared variables.
 The standard view of parallelism in a shared memory program is
fork/join parallelism.
 When the program begins execution, only a single thread, called the
master thread, is active.
 The master thread executes the sequential portions of the algorithm. At those points where parallel operations are required, the master thread forks (creates or awakens) additional threads.
 The master thread and the created threads work concurrently through the parallel section. At the end of the parallel code the created threads die or are suspended, and the flow of control returns to the single master thread. This is called a join.
 The shared-memory model is characterized by fork/join parallelism, in which parallelism comes and goes.
 At the beginning of execution only a
single thread, called the master thread,
is active.
 The master thread executes the serial portions of the program. It forks additional threads to help it execute parallel portions of the program.
 These threads are deactivated when
serial execution resumes.
 A key difference, then, between the shared-memory model and the
message passing model is that in the message-passing model all
processes typically remain active throughout the execution of the
program, whereas in the shared-memory model the number of
active threads is one at the program's start and finish and may
change dynamically throughout the execution of the program.
 Parallel shared-memory programs range from those with only a single fork/join around a single loop to those in which most of the code segments are executed in parallel. Hence the shared-memory model supports incremental parallelization, the process of transforming a sequential program into a parallel program one block of code at a time.
Pros and Cons of OpenMP
 Pros
 Considered by some to be easier to program and debug (compared to
MPI)
 Data layout and decomposition is handled automatically by directives.
 Allows incremental parallelism: directives can be added incrementally,
so the program can be parallelized one portion after another and thus
no dramatic change to code is needed.
 Unified code for both serial and parallel applications: OpenMP
constructs are treated as comments when sequential compilers are
used.
 Original (serial) code statements need not, in general, be modified when
parallelized with OpenMP. This reduces the chance of inadvertently
introducing bugs and helps maintenance as well.
 Both coarse-grained and fine-grained parallelism are possible
Pros and Cons of OpenMP
 Cons
 Currently only runs efficiently in shared-memory multiprocessor
platforms
 Requires a compiler that supports OpenMP.
 Scalability is limited by memory architecture.
 Reliable error handling is missing.
 Lacks fine-grained mechanisms to control thread-processor
mapping.
 Synchronization between subsets of threads is not allowed.
 Mostly used for loop parallelization
 Can be difficult to debug, due to implicit communication between
threads via shared variables.