This project is based on Fault Tolerance in Cluster Computing and is implemented entirely in the C language on the Linux operating system. It uses the MPICH2 package and MPI commands to operate the cluster.
4. INTRODUCTION
What Is a Cluster?
A cluster is a set of connected computers that work together so that they can be viewed as a single system. It works on a master-slave connection.
What Is Cluster Computing?
Cluster computing is also known as HPC (high-performance computing), as it is used to solve large problems in less time compared with other techniques. HPC may include parallel, cluster, grid, cloud, and green computing.
5. CONTINUE...
What Is a Fault?
A fault is any error or unwanted condition that may arise in a system and cause the system to stop its execution. It may be of a natural or man-made type.
What Is Fault Tolerance?
Fault tolerance is the ability to tolerate some types of faults so that we still get the correct final outcome, e.g. despite a faulty processor.
6. PURPOSE
The purpose of cluster technology is to eliminate single points of failure. When availability of data is your paramount consideration, clustering is ideal. Using a cluster, we can avoid all of these single points of failure:
Network card failure
Processor failure
Motherboard failure
8. ADVANTAGES OF USING LINUX
The following are some advantages of using Linux:
Linux is readily available on the Internet and can be
downloaded without cost.
It is easy to fix bugs and improve system performance.
Users can develop or fine-tune hardware drivers which
can easily be made available to other users.
The most important advantage of using Linux for this project is that it allows several processes to run on a single processor, which helps in enhancing the performance of the system.
9. OBJECTIVE
We are working on the Linux operating system and on the communication patterns of clusters using MPI.
Our aim is to find faults and to recover from those faults that cause unexpected behaviour (errors, bugs, etc.).
10. MESSAGE PASSING INTERFACE (MPI)
The Message Passing Interface is the generic form of message passing in parallel computing.
It is used as a medium of communication among the nodes.
In message passing, data is moved from the address space of one process to that of another by means of cooperative operations such as a send/receive pair.
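As an illustration, a minimal sketch of such a send/receive pair in C (assuming MPICH2 and a run with at least two processes; the value 42 is an arbitrary choice):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                       /* arbitrary data to move */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}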
11. BASIC MPI ROUTINES/COMMANDS
For communication among different processes, some routines are used, which are:
MPI_Send, to send a message to another process.
MPI_Recv, to receive a message from another process.
MPI_Gather, MPI_Gatherv, to gather data from
participating processes into a single structure.
MPI_Comm_size() – number of MPI processes.
MPI_Comm_rank() – internal process number.
MPI_Get_processor_name() – external processor name.
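A minimal sketch showing the last three inquiry routines in a complete program (the printed output format is our own choice):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int size, rank, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of MPI processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* internal process number */
    MPI_Get_processor_name(name, &len);     /* external processor name */

    printf("process %d of %d running on %s\n", rank, size, name);
    MPI_Finalize();
    return 0;
}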
12. CONTINUE…
MPI_Scatter, MPI_Scatterv, to break a structure into portions and distribute those portions to other processes.
MPI_Allgather, MPI_Allgatherv, to gather data from
different processes into a single structure that is then sent
to all participants (Gather-to-all).
MPI_Alltoall, MPI_Alltoallv, to gather data and
then scatter it to all participants (All-to-all
scatter/gather).
MPI_Bcast, to broadcast data to other processes.
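As an illustration of the collectives, a minimal sketch in which the root broadcasts a seed value, each process derives a local result, and MPI_Gather collects the results on the root (the seed value and the rank-times-seed computation are arbitrary choices):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int size, rank, seed = 0, local, *all = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        seed = 10;
        all = malloc(size * sizeof(int));    /* gather target on root */
    }
    MPI_Bcast(&seed, 1, MPI_INT, 0, MPI_COMM_WORLD);
    local = seed * rank;                     /* per-process contribution */
    MPI_Gather(&local, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < size; i++)
            printf("all[%d] = %d\n", i, all[i]);
        free(all);
    }
    MPI_Finalize();
    return 0;
}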
13. COMMUNICATION PATTERNS
Cluster computing works on four communication patterns:
1. Single Direction Communication
2. Pair-based Communication
3. Pre-posted Communication
4. All-start Communication
14. SINGLE DIRECTION COMMUNICATION
Processes are paired off, with the lower rank sending messages to the higher rank in a tight loop.
The individual pairs synchronize before communication begins.
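A minimal sketch of this pattern, assuming an even number of processes so that ranks pair off as (0,1), (2,3), and so on; the pair synchronizes with an empty exchange first, and the message count of 100 is an arbitrary choice:

#include <mpi.h>

#define NMSG 100

int main(int argc, char *argv[])
{
    int rank, buf = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int partner = (rank % 2 == 0) ? rank + 1 : rank - 1;

    /* synchronize the pair with a zero-byte exchange before the loop */
    MPI_Sendrecv(NULL, 0, MPI_BYTE, partner, 0,
                 NULL, 0, MPI_BYTE, partner, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    for (int i = 0; i < NMSG; i++) {
        if (rank % 2 == 0)       /* lower rank: sends in a tight loop */
            MPI_Send(&buf, 1, MPI_INT, partner, 1, MPI_COMM_WORLD);
        else                     /* higher rank: receives only */
            MPI_Recv(&buf, 1, MPI_INT, partner, 1, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}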
15. PAIR-BASED COMMUNICATION
Each process communicates with a small number of
remote processes in each communication phase.
Communication is paired, so that a given process is
both sending and receiving messages with exactly one
other process at a time, rotating to a new process when
communication is complete.
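A minimal sketch of pair-based communication, assuming the number of processes is a power of two so that partner = rank XOR phase gives a valid, symmetric pairing in every phase:

#include <mpi.h>

int main(int argc, char *argv[])
{
    int size, rank, out, in;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    out = rank;
    for (int p = 1; p < size; p++) {
        int partner = rank ^ p;              /* rotate to a new partner */
        /* each process both sends to and receives from exactly one
         * other process in this phase */
        MPI_Sendrecv(&out, 1, MPI_INT, partner, p,
                     &in,  1, MPI_INT, partner, p,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}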
16. PRE-POSTED COMMUNICATION
Expected message receives for the next communication phase are posted before starting the computation phase.
This guarantees that the receive buffer will be available during the communication phase.
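A minimal sketch of pre-posting in a ring of processes; compute() is a hypothetical placeholder for the computation phase:

#include <mpi.h>

static void compute(void) { /* placeholder computation phase */ }

int main(int argc, char *argv[])
{
    int size, rank, out, in;
    MPI_Request req;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;

    /* pre-post the receive before the computation phase begins, so
     * the receive buffer is guaranteed to be available later */
    MPI_Irecv(&in, 1, MPI_INT, left, 0, MPI_COMM_WORLD, &req);

    compute();                               /* computation phase */

    out = rank;
    MPI_Send(&out, 1, MPI_INT, right, 0, MPI_COMM_WORLD);
    MPI_Wait(&req, MPI_STATUS_IGNORE);       /* complete the receive */

    MPI_Finalize();
    return 0;
}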
17. ALL-START COMMUNICATION
It is much the same as pre-posted communication, but it does not guarantee that all receives are pre-posted.
After the computation, MPI_Waitall is called.
A call to MPI_Waitall can be used to wait for all pending operations in a list.
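A minimal sketch of the all-start pattern in the same ring arrangement; again compute() is a hypothetical placeholder:

#include <mpi.h>

static void compute(void) { /* placeholder computation phase */ }

int main(int argc, char *argv[])
{
    int size, rank, out, in;
    MPI_Request reqs[2];
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;
    out = rank;

    /* start all operations at once; nothing guarantees the receive
     * is posted before the matching send arrives */
    MPI_Irecv(&in,  1, MPI_INT, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&out, 1, MPI_INT, right, 0, MPI_COMM_WORLD, &reqs[1]);

    compute();                               /* computation phase */

    /* wait for all pending operations in the request list */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    MPI_Finalize();
    return 0;
}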
18. WORKING STRATEGY
Installation of Ubuntu 10.04 LTS.
Installation of C in Ubuntu 10.04 LTS.
Use of terminal.
Installation of the MPICH2 package on our Linux system.
Study of basic Linux commands & other Linux features.
Study of MPI, its basic commands & syntax.
Execution of basic Linux & MPI commands.
Execution of a matrix program using C on the Linux platform (a sketch is given below).
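The matrix program itself is not reproduced here; the following is a hypothetical sketch of how such a program might distribute work, assuming a 4 x 4 matrix-vector multiply run with exactly 4 processes (one row per process):

#include <mpi.h>
#include <stdio.h>

#define N 4   /* assumes a run with exactly N processes */

int main(int argc, char *argv[])
{
    int rank;
    double A[N][N], x[N], row[N], partial, y[N];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                         /* root builds A and x */
        for (int i = 0; i < N; i++) {
            x[i] = 1.0;
            for (int j = 0; j < N; j++)
                A[i][j] = i + j;
        }
    }

    /* one row of A to each process, and x to everyone */
    MPI_Scatter(A, N, MPI_DOUBLE, row, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(x, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    partial = 0.0;                           /* local dot product */
    for (int j = 0; j < N; j++)
        partial += row[j] * x[j];

    MPI_Gather(&partial, 1, MPI_DOUBLE, y, 1, MPI_DOUBLE, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        for (int i = 0; i < N; i++)
            printf("y[%d] = %g\n", i, y[i]);

    MPI_Finalize();
    return 0;
}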
19. CONTINUE...
Execution of basic programs using MPI.
Execution of parallel computing programs.
We will generate faults, then detect them, and at last recover from them by assigning the task of the faulty process to some other process so as to overcome the failure.
We will apply fault tolerance techniques, i.e.:
Coordinated checkpoints (a sketch follows after this list)
Message logging
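A minimal sketch of a coordinated checkpoint, under the simplifying assumption that each process's state is a single integer; the barrier coordinates all ranks so the saved states are mutually consistent, and the ckpt_<rank> file name is a hypothetical choice:

#include <mpi.h>
#include <stdio.h>

static void checkpoint(int rank, int step)
{
    char path[64];
    MPI_Barrier(MPI_COMM_WORLD);             /* coordinate all ranks */
    snprintf(path, sizeof path, "ckpt_%d", rank);
    FILE *f = fopen(path, "w");
    if (f) {
        fprintf(f, "%d\n", step);            /* persist local state */
        fclose(f);
    }
}

int main(int argc, char *argv[])
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int step = 0; step < 100; step++) {
        /* ... computation and communication would go here ... */
        if (step % 10 == 0)                  /* periodic checkpoint */
            checkpoint(rank, step);
    }
    MPI_Finalize();
    return 0;
}

On restart after a failure, each rank would read its ckpt_<rank> file back and resume from the saved step instead of starting from scratch.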
20. RESEARCH GAP
Up to now, fault tolerance has not been applied to communication patterns.
To overcome this problem, we need to introduce fault tolerance into the communication patterns so as to reach the correct final outcome.