IJSRD - International Journal for Scientific Research & Development | Vol. 3, Issue 10, 2015 | ISSN (online): 2321-0613
Advanced Scalable Decomposition Method with MPICH Environment for HPC
Hiral Patel1 Ishan Rajani2
1,2Department of Computer Engineering
1,2Darshan Institute of Engg. & Tech.
Abstract— MPI (Message Passing Interface) has been used effectively in the high performance computing community for years and is the main programming model. MPI is widely used to develop parallel programs on computing systems such as clusters. As a major component of the high performance computing (HPC) environment, MPI is becoming increasingly prevalent. MPI implementations typically equate an MPI process with an OS process, resulting in a decomposition technique where MPI processes are bound to the physical cores. An integrated approach makes it possible to expose more concurrency than the available parallelism, while minimizing the overheads related to context switches, scheduling and synchronization. It uses fibers to support multiple MPI processes inside an operating system process. There are three widely used MPI libraries: OpenMPI, MPICH2 and MVAPICH2. This paper works on decomposition techniques and also integrates them with the MPI environment using MPICH2.
Keywords: HPC, MPI, MPICH
I. INTRODUCTION
MPI was originally intended for distributed-memory systems. Unlike OpenMP, pthreads or other parallelization solutions for shared memory, it does not support shared data. Instead, MPI programs exchange data by message passing. Because memory is partitioned, when there are thousands of cores on one computer, message passing will demonstrate its advantage in scalability. It will be more efficient than shared data access. Thus, it is important for MPI implementations to increase the speed of data communication. There are many frequently used open MPI implementations, such as MPICH2 and OpenMPI. To fully exploit multicore architectures, these implementations may use certain novel technologies. MPI has been very successful in High Performance Computing for implementing message-passing programs on compute clusters. There are many applications and a variety of libraries that have been written using MPI. Many of these programs are written as SPMD programs, where the program is parameterized by "N", the number of MPI processes. The parameter N determines the granularity of the program and provides the measure of available concurrency. In executing MPI programs, one typically matches the number of MPI processes to the number of cores, the measure of available parallelism. MPI (Message Passing Interface) is the leading model used for parallel programming in high performance computing [1]. MPI is successful because of the work over the last 15 years on the MPI standard and the middleware that ensures that MPI programs continue to perform well on parallel and cluster architectures across a wide variety of network fabrics. Almost all MPI implementations bind the execution of an MPI process to an operating system (OS) process, where usually a "one process" per "processor core" mapping is used. As a result, the notion of an MPI process is tightly bound to the physical resources of the machine, in particular to the number of cores and OS processes that can be created. Programs written using MPI tend to be coarse-grained and cannot easily exploit more fine-grained parallelism without resorting to threads or combining MPI with other APIs like OpenMP [2]. In this paper, we evaluate the data decomposition technique to improve the efficiency of the MPI implementation.
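As a brief illustration of the SPMD style described above, the following minimal C sketch (ours, not taken from the paper's experiments) is parameterized at run time by N, the number of MPI processes chosen at launch; every rank but 0 sends a message to rank 0 by message passing:

/* spmd_hello.c: minimal SPMD MPI sketch (illustrative only).
 * The same program runs on every process; behaviour is parameterized
 * by the rank and by N, the total number of MPI processes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* N, the number of MPI processes */

    if (rank != 0) {
        /* every worker sends its rank to process 0 by message passing */
        MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else {
        int i, r;
        for (i = 1; i < size; i++) {
            MPI_Recv(&r, 1, MPI_INT, i, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("received greeting from rank %d of %d\n", r, size);
        }
    }

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched as mpiexec -n N ./spmd_hello, the same binary adapts to whatever N is chosen, which is what makes N the measure of available concurrency.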
II. MPICH VERSIONS
Various versions of MPICH are available. The latest versions are:
A. MPICH 3.2rc2 Released
Released on 4th November, 2015
Displaying source after attaching to an MPI job
Hydra: Automatically handle bad ssh setup
Build fails with missing routines
Cancel send not implemented in the MXM netmod
B. MPICH 3.2b3 Released
Released on 4th June, 2015
Support for the MPI-3.1 standard
Support for full Fortran 2008
Support for the Mellanox MXM interface for InfiniBand
Support for the Mellanox HCOLL interface for collective communication
Significant improvements to the MPICH portals4 implementation
Completely revamped RMA infrastructure providing scalability and performance improvements
C. MPICH 3.2b2 Released
Released on 14th April, 2015
Support for the MPI-3.1 standard
Support for full Fortran 2008
Support for the Mellanox MXM interface for InfiniBand
Support for the Mellanox HCOLL interface for collective communication
III. MPICH
MPICH is a high-performance and widely portable implementation of the Message Passing Interface (MPI) standard (MPI-1, MPI-2 and MPI-3). The goals of MPICH are:
To provide an MPI implementation that efficiently supports different computation and communication platforms, including high-speed networks like InfiniBand and Myrinet.
To provide an easy-to-extend modular framework for other derived implementations.
MPICH2 implements additional features of the MPI-2 standard over what was implemented in the original MPICH (now referred to as MPICH-1). Here we describe how to download and install the latest version of MPICH. Steps to install MPICH 3.2rc2:
1) Unzip the tar file
tar xfz mpich-3.2rc2.tar.gz
2) Decide on an installation directory
mkdir /home/student/mpich-install
3) Decide on a build directory
mkdir /home/student/mpich-3.2rc2sub
4) Decide on any configure options
--prefix sets the installation directory for MPICH.
5) Configure MPICH, specifying the installation directory
cd /home/student/mpich-3.2rc2sub
/home/student/mpich-3.2rc2/configure --prefix=/home/student/mpich-install |& tee c.txt
6) Build MPICH 3.2rc2
make |& tee m.txt (for csh and tcsh) OR make 2>&1 | tee m.txt (for bash and sh)
7) Install the MPICH commands
make install |& tee mi.txt
8) Add the bin subdirectory of the installation directory to your path
setenv PATH /home/student/mpich-install/bin:$PATH (for csh and tcsh), or
export PATH=/home/student/mpich-install/bin:$PATH (for bash), and
PATH=/home/student/mpich-install/bin:$PATH; export PATH (for sh).
Check the setup with: which mpicc and which mpiexec
9) Run the job with mpiexec
mpiexec -n 5 ./cpi
The cpi example will tell you which hosts it is running on.
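For reference, the cpi program launched in step 9 estimates pi in parallel. A minimal C sketch in that spirit is shown below; it is only an illustration under assumed values (a fixed interval count n), not the exact source shipped with MPICH:

/* cpi_sketch.c: illustrative pi estimation in the spirit of MPICH's cpi example. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, i;
    const int n = 1000000;            /* number of intervals (assumed value) */
    double h, sum = 0.0, x, mypi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* midpoint-rule integration of 4/(1+x^2) on [0,1], strided over ranks */
    h = 1.0 / (double)n;
    for (i = rank; i < n; i += size) {
        x = h * ((double)i + 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;

    /* combine the partial sums on rank 0 */
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("pi is approximately %.16f\n", pi);

    MPI_Finalize();
    return 0;
}

Compiled with mpicc cpi_sketch.c -o cpi, it can be launched exactly as in step 9 with mpiexec -n 5 ./cpi.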
IV. MPICH ARCHITECTURE
A. ROMIO
It is a high-performance, portable implementation of MPI-IO that works with any MPI implementation on numerous file systems.
It is included as part of MPICH2, MPICH1 and vendor MPI implementations, and supports the PVFS, SGI XFS, PanFS and UFS file systems.
ROMIO performs sophisticated optimizations that enable applications to achieve high I/O performance.
Collective I/O, data sieving and I/O aggregation are integrated by ROMIO.
ROMIO also accepts a number of hints from the user for tuning I/O performance, such as file striping and algorithm-switching parameters.
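To make the hint mechanism concrete, the C sketch below opens a file through MPI-IO with striping hints and performs a collective write. The file name and hint values are illustrative assumptions, not taken from the paper:

/* romio_hints.c: illustrative MPI-IO collective write with ROMIO striping hints. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, i, buf[1024];
    MPI_File fh;
    MPI_Info info;
    MPI_Offset offset;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (i = 0; i < 1024; i++) buf[i] = rank;        /* dummy data */

    /* pass file-striping hints to ROMIO (values are examples only) */
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "4");      /* stripe across 4 I/O servers */
    MPI_Info_set(info, "striping_unit", "65536");    /* 64 KB stripe size */

    MPI_File_open(MPI_COMM_WORLD, "testfile.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* collective write: each rank writes its own contiguous block */
    offset = (MPI_Offset)rank * sizeof(buf);
    MPI_File_write_at_all(fh, offset, buf, 1024, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}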
B. Nemesis
Nemesis is a scalable, high-performance, shared-memory, multi-network messaging subsystem within MPICH2.
Nemesis offers low-latency, high-bandwidth messaging, particularly for intra-node communication.
C. Lustre
It is a parallel distributed file system, normally used for large-scale cluster computing.
D. GPFS
It is a high-performance clustered file system developed by IBM.
E. InfiniBand
It is a computer-networking communications link used in high performance computing, featuring very high throughput and low latency.
It is used as a data interconnect both among and within computers.
F. Myrinet
It is a cost-effective, high-performance packet communication and switching technology that is widely used to interconnect clusters and workstations.
V. HOW TO BUILD A DISTRIBUTED ENVIRONMENT
A. Using MPICH in Windows
1) Hardware
Before starting, you should have the following hardware and software:
Two computers with the same operating system.
The correct drivers must be installed.
2) Software
MPI is a de-facto standard.
1) Step 1: Download the latest version of MPICH2, then unpack MPICH2 with administrator permission.
2) Step 2: Copy all files of the lib folder in MPICH2 to the other folder.
3) Step 3: Install the cluster manager service on each host you want to use for remote execution of MPI processes.
4) Step 4: Follow steps 1 and 2 for each host in the cluster.
5) Step 5: Now start mpiexec (from the folder C:\MPICH2\bin) by double-clicking it.
VI. FLOW OF FLOYD ALGORITHM
Fig. 1: Flow of Floyd Algorithm
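The flow in Fig. 1 can be read as a row-wise domain decomposition of the distance matrix: each MPI process owns N/size rows, and at every iteration k the owner of row k broadcasts it so that the other processes can relax their own rows. The C sketch below only illustrates this decomposition under assumed values (matrix size N, unit edge weights, N divisible by the process count); it is not the exact program whose results are reported here:

/* floyd_mpi.c: hedged sketch of Floyd's all-pairs shortest-path algorithm
 * with a row-wise data decomposition over MPI processes. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 8                       /* number of vertices (assumed divisible by size) */
#define INF 1000000

int main(int argc, char *argv[])
{
    int rank, size, rows, k, i, j;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    rows = N / size;                              /* rows owned by this process */
    int *dist = malloc(rows * N * sizeof(int));   /* local block of the matrix */
    int *rowk = malloc(N * sizeof(int));          /* buffer for the k-th row */

    /* illustrative initialisation: weight 1 on the super-diagonal, INF elsewhere */
    for (i = 0; i < rows; i++)
        for (j = 0; j < N; j++) {
            int gi = rank * rows + i;             /* global row index */
            dist[i * N + j] = (gi == j) ? 0 : (j == gi + 1 ? 1 : INF);
        }

    for (k = 0; k < N; k++) {
        int owner = k / rows;                     /* process holding global row k */
        if (rank == owner)
            for (j = 0; j < N; j++)
                rowk[j] = dist[(k - owner * rows) * N + j];
        /* broadcast row k so every process can relax its own rows */
        MPI_Bcast(rowk, N, MPI_INT, owner, MPI_COMM_WORLD);

        for (i = 0; i < rows; i++)
            for (j = 0; j < N; j++)
                if (dist[i * N + k] + rowk[j] < dist[i * N + j])
                    dist[i * N + j] = dist[i * N + k] + rowk[j];
    }

    if (rank == 0)
        printf("dist[0][N-1] = %d\n", dist[N - 1]);

    free(dist); free(rowk);
    MPI_Finalize();
    return 0;
}

Run with, for example, mpiexec -n 4 ./floyd_mpi; only one row of the matrix is communicated per iteration, which is what gives the row-wise decomposition its scalability.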
VII. RESULTS
Fig. 3: Results
VIII. CONCLUSION
MPI has wide applicability, and many years of effort have gone into ensuring its performance portability. Our work is an attempt to extend the use of the MPI model with a decomposition technique. The domain decomposition technique extends the MPICH2 implementation of MPI to make it possible to have multiple MPI processes within an OS process. By means of the decomposition technique, we improved the MPI environment using MPICH2.
REFERENCES
[1] V. R. Basili, J. C. Carver, D. Cruzes, L. M. Hochstein, J. K. Hollingsworth, F. Shull, and M. V. Zelkowitz, "Understanding the high-performance-computing community: A software engineer's perspective," IEEE Softw., vol. 25, no. 4, pp. 29–36, 2008.
[2] OpenMP, "The OpenMP Application Program Interface," Available from http://openmp.org/wp/about-openmp/.
[3] E. Lusk, "MPI on a hundred million processors... Why not?" Talk at Clusters and Computational Grids for Scientific Computing 2008, Available from http://www.cs.utk.edu/_dongarra/ccgsc2008/talks/
[4] "PVFS project," http://www.pvfs.org/, Aug 2011.
[5] W. Yu, S. Liang, and D. K. Panda, "High performance support of parallel virtual file system (pvfs2) over quadrics," in Proceedings of the 19th Annual International Conference on Supercomputing, ser. ICS '05. New York, NY, USA: ACM, 2005, pp. 323–331. [Online]. Available: http://doi.acm.org/10.1145/1088149.
[6] D. Buntinas, G. Mercier, and W. Gropp, "Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communication subsystem," Parallel Comput., vol. 33, no. 9, pp. 634–644, 2007.
[7] H. Kamal, S. M. Mirtaheri, and A. Wagner, "Scalability of communicators and groups in MPI," in Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, ser. HPDC '10. New York, NY, USA: ACM, 2010, pp. 264–275.
[8] Argonne National Laboratory, "MPICH2: A high performance and portable implementation of MPI standard," Available from http://www.mcs.anl.gov/research/projects/mpich2/index.php.
[9] D. Buntinas, W. Gropp, and G. Mercier, "Design and evaluation of Nemesis, a scalable, low-latency, message-passing communication subsystem," in CCGRID '06: Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid. Washington, DC, USA: IEEE Computer Society, 2006, pp. 521–530. [Online]. Available: http://dx.doi.org/10.1109/CCGRID.2006.31
[10] D. Buntinas, G. Mercier, and W. Gropp, "Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communication subsystem," Parallel Comput., vol. 33, no. 9, pp. 634–644, 2007.
[11] Y. H. Choi, W. H. Cho, H. Eom, and H. Y. Yeom, "A study of the fault-tolerant PVFS2," School of Computer Science and Engineering, Seoul National University, Seoul 151-742, Korea.
[12]http://www.mpich.org.
[13]http://www.mpi-forum.org
[14]http://www.mcs.anl.gov/reserch/projects/mpich2