3. Introduction:
Supercomputers and mainframe computers are not cost effective.
Cluster technology has been developed that allows multiple low-cost
computers to work in a coordinated fashion to process applications.
Clustering is an alternative to symmetric multiprocessing as an approach to providing high
performance, and is attractive for server applications.
A cluster is a group of interconnected, whole computers working together as a unified
computing resource that can create the illusion of being one machine.
The term whole computer means a system that can run on its own, apart
from the cluster; each computer in a cluster is typically referred to as a
node.
A cluster is composed of many commodity computers, linked together by a high-speed
dedicated network.
5. Cluster Configuration:
Clustering Methods:
Passive Standby:
Older method.
A secondary server takes over in case of primary server failure.
Easy to implement.
High cost because the secondary server is unavailable for other processing
tasks.
The standby detects primary failure through the absence of heartbeat messages.
Active Secondary:
The secondary server is also used for processing tasks.
Reduced cost because secondary servers can be used for processing.
Increased complexity.
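The passive-standby scheme above can be sketched as follows. This is a minimal illustration, not a real clustering product; the class name, timeout value, and method names are all assumptions made for the example.

```python
import time

# Minimal sketch of passive standby with heartbeat monitoring.
# The standby tracks when it last heard from the primary; if the
# heartbeat goes silent for too long, it promotes itself (failover).

HEARTBEAT_TIMEOUT = 3.0  # seconds of silence before failover (illustrative)

class StandbyServer:
    def __init__(self):
        self.last_heartbeat = time.monotonic()
        self.role = "standby"

    def receive_heartbeat(self):
        # Called whenever a heartbeat message arrives from the primary.
        self.last_heartbeat = time.monotonic()

    def check_primary(self, now=None):
        # Promote to primary if the heartbeat has been silent too long.
        now = time.monotonic() if now is None else now
        if now - self.last_heartbeat > HEARTBEAT_TIMEOUT:
            self.role = "primary"
        return self.role

standby = StandbyServer()
standby.receive_heartbeat()
print(standby.check_primary())  # heartbeat is fresh, so still "standby"
# Simulate 10 seconds of silence from the primary:
print(standby.check_primary(now=standby.last_heartbeat + 10))  # "primary"
```

Real implementations must also handle the failed primary rejoining (failback) and avoid both machines acting as primary at once, which this sketch ignores.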
6. Separate Servers:
No disks shared between systems.
High performance as well as high availability
Scheduling software needed for load balancing
Data must be constantly copied among systems so that each system has access to the current data of
the other systems.
Provides high availability, but at the cost of communication overhead.
7. Shared Nothing and Shared Disk:
Reduce the communication overhead.
Servers connected to common disk.
Shared Nothing: common disks are partitioned into volumes, and each volume owned by a single computer.
Shared Disk: Multiple computers share the same disks at the same time, so that each computer
has access to all the volumes on all of the disks.
Requires a locking mechanism to coordinate concurrent access to the shared disks.
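One simple locking scheme for shared-disk access can be sketched with an atomically created lock file. This is only an illustration of the idea; the file names and functions are made up for the example, and real clusters use a distributed lock manager rather than lock files.

```python
import os
import tempfile

# Sketch of volume locking for shared-disk clustering: a node must
# acquire the lock for a volume before writing it. os.open with
# O_CREAT | O_EXCL creates the lock file atomically, so only one
# node can succeed.

def acquire_lock(lock_path):
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True   # this node now owns the volume
    except FileExistsError:
        return False  # another node already holds the lock

def release_lock(lock_path):
    os.remove(lock_path)

lock = os.path.join(tempfile.mkdtemp(), "volume0.lock")
print(acquire_lock(lock))  # True: the first node gets the lock
print(acquire_lock(lock))  # False: a second node is refused
release_lock(lock)         # the volume is free again
```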
8. Operating System Design Issues:
Failure Management:
How failures are managed by a cluster depends on the clustering method used.
Two approaches:
Highly Available clusters:
These clusters are designed to provide uninterrupted availability of data or service
(typically web services) to the end user community.
If a node fails, the service can be restored without affecting the availability of the services
provided by the cluster; the application remains available, but with a
performance drop due to the missing node.
Uses failover (switching services from a failed node to another node) and failback (restoring services once the failed node recovers).
Fault tolerant clusters:
Ensures that all resources are always available. This is achieved by the use of redundant
shared disks and mechanisms for backing out uncommitted transactions and committing
completed transactions.
Load Balancing:
This type of cluster distributes the incoming requests for resources or content among multiple nodes
running the same program or having the same content.
Every node in the cluster is able to handle requests for the same content or application.
Middleware mechanisms need to recognize that services can appear on different members of cluster
and may migrate from one member to another.
Almost all load balancing clusters are HA clusters.
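Since every node can handle any request, the simplest distribution policy is round-robin. The sketch below illustrates that idea; the class and node names are invented for the example and do not represent any particular load-balancing product.

```python
import itertools

# Minimal sketch of a load-balancing front end: incoming requests are
# spread round-robin over nodes that all serve the same content.

class RoundRobinBalancer:
    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def route(self, request):
        # Any node can handle any request, so just take the next in rotation.
        return next(self._cycle)

lb = RoundRobinBalancer(["node1", "node2", "node3"])
print([lb.route(f"req{i}") for i in range(4)])
# ['node1', 'node2', 'node3', 'node1']
```

A production balancer would also weight nodes by load and skip failed members, which ties load balancing back to the failure-management mechanisms above.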
9. Parallelizing Computation:
Parallelizing Compiler:
Compiler determines which part of an application can be executed in parallel at
compile time.
Parallel parts are split off and assigned to different computers in the cluster.
Performance depends on the nature of the problem and how well the compiler is
designed.
Such compilers are difficult to design.
Parallelized application:
The programmer writes the application from the outset to run on a cluster, and
uses message passing to move data, as required, between cluster nodes. This
places a high burden on the programmer but may be the best approach for
exploiting clusters for some applications.
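The message-passing style can be sketched as follows. Here queues stand in for the cluster interconnect and a thread stands in for a remote node; the worker logic (summing a chunk of data) is made up purely for illustration.

```python
import queue
import threading

# Sketch of explicit message passing in a hand-parallelized application:
# the coordinator sends a chunk of data to a "remote" node, which
# computes a partial result and sends it back.

def worker(inbox, outbox):
    chunk = inbox.get()        # receive data sent by the coordinator
    outbox.put(sum(chunk))     # send the partial result back

inbox, outbox = queue.Queue(), queue.Queue()
node = threading.Thread(target=worker, args=(inbox, outbox))
node.start()
inbox.put([1, 2, 3, 4])        # move data to the "remote" node
result = outbox.get()          # collect the partial result
node.join()
print(result)  # 10
```

On a real cluster the queues would be replaced by network communication, e.g. a message-passing library such as MPI.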
Parametric computing:
Can be used if the essence of the application is a program that must be
executed a large number of times, each time with a different set of starting
conditions or parameters, e.g., a simulation model.
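Parametric computing can be sketched by running one function over many parameter sets in parallel. The `simulate` function below is a made-up toy growth model, and a thread pool stands in for the cluster's job-scheduling tool; both are assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch of parametric computing: the same program is executed many
# times, each run with a different set of starting parameters.

def simulate(params):
    # Toy compound-growth model: grow an initial value of 1.0
    # by `rate` per step, for `steps` steps.
    rate, steps = params
    value = 1.0
    for _ in range(steps):
        value *= (1 + rate)
    return round(value, 4)

parameter_sets = [(0.01, 10), (0.02, 10), (0.05, 10)]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(simulate, parameter_sets))
print(results)  # one result per parameter set
```

On a cluster, each parameter set would become an independent job dispatched to a node; because the runs share no data, this pattern scales almost perfectly.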
11. Individual computers are connected by some high-speed LAN or switch hardware.
Each computer is capable of operating independently.
Middleware layer of software is installed in each computer to enable cluster
operation.
The middleware provides a unified system image to the user, known as a single-system
image.
The middleware is responsible for providing high availability, by means of load
balancing and responding to failures in individual components. Following are the
desirable cluster middleware services and functions:
Single entry point
Single file hierarchy
Single control point
Single virtual networking (any node can access any other point)
Single memory space
Single job-management system
Single user interface
Single I/O space
Single process space
Checkpointing (saves the process state, recovery after failure)
Process migration (load balancing)
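Checkpointing, the availability service in the list above, can be sketched as periodically saving process state so work can resume after a failure. The state dictionary and file name below are illustrative assumptions, not part of any real middleware API.

```python
import os
import pickle
import tempfile

# Sketch of checkpointing: save process state after each unit of work
# so a restarted node can resume instead of starting over.

def checkpoint(state, path):
    with open(path, "wb") as f:
        pickle.dump(state, f)

def restore(path):
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.mkdtemp(), "job.ckpt")
state = {"iteration": 0, "total": 0}
for i in range(1, 6):
    state["iteration"] = i
    state["total"] += i
    checkpoint(state, path)   # persist state after every step

# After a (simulated) failure, the job resumes from the last checkpoint.
recovered = restore(path)
print(recovered)  # {'iteration': 5, 'total': 15}
```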
12. The last two items in the preceding list (checkpointing and process migration) enhance the availability of the cluster.
Others are concerned with providing a single system image.
Implementation: Blade Servers
A blade server is a server architecture that houses multiple server modules (‘blades’) in a single chassis.
Used in data centres.
Increases server density, lowers power and cooling costs, eases server expansion, and simplifies
data centre management.
14. Clusters Compared to SMP:
Both clusters and symmetric multiprocessors provide a configuration with multiple
processors to support high-demand applications.
Both solutions are commercially available.
SMP refers to the processing of programs by multiple processors that share a common OS and
memory, whereas in a cluster individual whole systems are tied together.
The main aim of SMP is saving time, while that of cluster computing is high availability.
SMP is easier to manage and configure than cluster.
SMP takes less physical space and draws less power than a comparable cluster.
SMP products are well established and stable.
Clusters target the high-performance server market.
Clusters are superior to SMPs in terms of incremental and absolute scalability.
Clusters are also superior in terms of availability, because all components of the system
can readily be made highly redundant.