CLUSTER COMPUTING
INTRODUCTION
A computer cluster is a group of tightly coupled computers that work together so closely that they can be viewed as a single computer.
Clusters are commonly connected through fast local area networks.
Clusters have evolved to support applications ranging from e-commerce to high-performance database applications.
HISTORY
The first commodity clustering product was ARCnet, developed by Datapoint in 1977.
The next product was VAXcluster, released by DEC in the 1980s.
Microsoft, Sun Microsystems, and other leading hardware and software companies offer clustering packages.
WHY CLUSTERS?
Price/Performance: The reason for the growth in the use of clusters is that they have significantly reduced the cost of processing power.
Availability: Single points of failure can be eliminated; if any one system component goes down, the system as a whole stays highly available.
Scalability: HPC clusters can grow in overall capacity because processors and nodes can be added as demand increases.
The components critical to the development of low-cost clusters are:
Processors
Memory
Networking components
Motherboards, buses, and other sub-systems
 
LOGICAL VIEW OF CLUSTER
ARCHITECTURE
A cluster is a type of parallel/distributed processing system that consists of a collection of interconnected stand-alone computers working together cooperatively as a single, integrated computing resource.
A node is a single-processor or multiprocessor system with memory, I/O facilities, and an operating system.
A cluster generally consists of 2 or more computers (nodes) connected together, either in a single cabinet or physically separated and connected via a LAN.
It appears as a single system to users and applications.
It provides a cost-effective way to gain features and benefits such as availability and scalability.
COMPONENTS
1. Multiple High Performance Computers
   a. PCs
   b. Workstations
   c. SMPs (CLUMPS)
   d. Distributed HPC Systems
2. State-of-the-art Operating Systems
   a. Linux (Beowulf)
   b. Microsoft NT (Illinois HPVM)
   c. Sun Solaris (Berkeley NOW)
   d. IBM AIX (IBM SP2)
   e. HP-UX (Illinois - PANDA)
3. High Performance Networks/Switches
   a. Ethernet (10 Mbps)
   b. Fast Ethernet (100 Mbps)
   c. Gigabit Ethernet (1 Gbps)
   d. ATM
   e. Myrinet (1.2 Gbps)
   f. Digital Memory Channel
   g. FDDI
4. Network Interface Card
   a. Myrinet has its own NIC
5. Fast Communication Protocols and Services
   a. Active Messages (Berkeley)
   b. Fast Messages (Illinois)
6. Cluster Middleware
   a. Single System Image (SSI)
   b. System Availability (SA) Infrastructure
7. Parallel Programming Environments and Tools
   a. Threads (PCs, SMPs, NOW, ...)
   b. MPI
   c. Compilers
   d. RAD (rapid application development) tools
   e. Debuggers
   f. Performance Analysis Tools
   g. Visualization Tools
8. Applications
   a. Sequential
   b. Parallel / Distributed (cluster-aware applications)
DIFFERENT KINDS OF CLUSTERS
High Performance (HP) Clusters
Load Balancing Clusters
High Availability (HA) Clusters
HIGH PERFORMANCE CLUSTER
Started in 1994, when Donald Becker of NASA assembled the first such cluster.
Also called a Beowulf cluster.
Used for applications such as data mining, simulations, parallel processing, and weather modeling.
LOAD BALANCING CLUSTER
A PC cluster delivers performance by balancing the load across its nodes.
Commonly used with busy FTP and web servers that have a large client base.
A large number of nodes share the load.
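One simple way to share load across nodes is round-robin dispatch. The sketch below is only an illustration of that idea, not something the slides prescribe; the node names, request IDs, and the function `round_robin_assign` are hypothetical.

```python
from collections import Counter
from itertools import cycle


def round_robin_assign(requests, nodes):
    """Pair each request with the next node in a repeating cycle."""
    node_cycle = cycle(nodes)
    return [(request, next(node_cycle)) for request in requests]


# Hypothetical node names and request IDs, for illustration only.
nodes = ["node-a", "node-b", "node-c"]
requests = [f"req-{i}" for i in range(9)]

assignment = round_robin_assign(requests, nodes)
load = Counter(node for _, node in assignment)

for node in nodes:
    print(node, load[node])  # each node receives 3 of the 9 requests
```

Round-robin ignores how busy each node actually is; real load balancers often weight nodes or track active connections instead.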
HIGH AVAILABILITY CLUSTER
Avoids a single point of failure.
This requires at least two nodes: a primary and a backup.
Always built with redundancy.
Almost all load balancing clusters have HA capability.
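The primary/backup arrangement can be sketched as a priority-ordered failover check. This is a minimal illustration under stated assumptions: the `health` table stands in for a real heartbeat or health probe, and the names `pick_active`, `primary`, and `backup` are hypothetical.

```python
def pick_active(nodes_by_priority, is_healthy):
    """Return the first healthy node in priority order, or None if all are down."""
    for node in nodes_by_priority:
        if is_healthy(node):
            return node
    return None


# Assumed stand-in for a real health probe (e.g. a heartbeat).
health = {"primary": True, "backup": True}
nodes_by_priority = ["primary", "backup"]


def is_healthy(node):
    return health[node]


active_before = pick_active(nodes_by_priority, is_healthy)  # "primary"

health["primary"] = False  # simulate a failure of the primary node
active_after = pick_active(nodes_by_priority, is_healthy)   # "backup"

print(active_before, "->", active_after)
```

If every node is unhealthy the function returns `None`, which is the case the redundancy is meant to make unlikely.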
ISSUES TO BE CONSIDERED
Cluster Networking
Cluster Software
Programming
Timing
Network Selection
Speed Selection
Cluster Networking
If you are mixing hardware that has different networking technologies, there will be large differences in the speed with which data will be accessed and in how individual nodes can communicate.
If it is within your budget, make sure that all of the machines you want to include in your cluster have similar networking capabilities and, if at all possible, network adapters from the same manufacturer.
Cluster Software
You will have to build versions of the clustering software for each kind of system you include in your cluster.
Programming
Our code will have to be written to support the lowest common denominator for data types supported by the least powerful node in our cluster.
With mixed machines, the more powerful machines will have attributes that cannot be attained in the less powerful machines.
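One common way to honor a lowest common denominator for data types is to exchange data in an explicit, fixed-width wire format that every node can read. The sketch below uses Python's standard `struct` module for this; the record layout (a 4-byte integer plus an 8-byte double) and the helper names are assumptions made for illustration, not part of the original slides.

```python
import struct

# "<" selects little-endian byte order with standard sizes and no padding;
# "i" is a 4-byte signed integer and "d" is an 8-byte IEEE 754 double.
# The layout itself is a hypothetical example.
RECORD_FORMAT = "<id"


def encode_record(step, value):
    """Pack one record into bytes with a platform-independent layout."""
    return struct.pack(RECORD_FORMAT, step, value)


def decode_record(payload):
    """Unpack bytes produced by encode_record into a (step, value) tuple."""
    return struct.unpack(RECORD_FORMAT, payload)


payload = encode_record(42, 3.5)
print(len(payload))            # 12 bytes, regardless of the node's native sizes
print(decode_record(payload))  # (42, 3.5)
```

Fixing the byte order and field widths means a node with a different native word size or endianness still decodes the same values.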
Timing
This is the most problematic aspect of a cluster.
Since these machines have different performance profiles, our code will execute at different rates on the different kinds of nodes.
This can cause serious bottlenecks if a process on one node is waiting for the results of a calculation on a slower node.
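The bottleneck can be made concrete with a small calculation. The sketch below assumes three nodes with made-up speeds and a made-up amount of work, and compares an equal split with a split proportional to node speed; it ignores communication costs, so it is an idealized illustration only.

```python
def makespan(shares, speeds):
    """Time until the slowest node finishes its share of the work."""
    return max(share / speed for share, speed in zip(shares, speeds))


# Hypothetical figures: node speeds in work units per second and total work units.
speeds = [4.0, 2.0, 1.0]
total_work = 84.0

# Equal split: every node gets the same amount of work.
equal_shares = [total_work / len(speeds)] * len(speeds)

# Proportional split: faster nodes get more work.
proportional_shares = [total_work * s / sum(speeds) for s in speeds]

print(makespan(equal_shares, speeds))         # 28.0 (the slowest node dominates)
print(makespan(proportional_shares, speeds))  # 12.0 (all nodes finish together)
```

With the equal split the fast nodes sit idle for most of the run, which is exactly the waiting described on the slide.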
Network Selection
There are a number of different kinds of network topologies, including buses, cubes of various degrees, and grids/meshes.
These network topologies are implemented using one or more network interface cards (NICs) installed in the head node and compute nodes of our cluster.
Speed Selection
No matter what topology you choose for your cluster, you will want to get the fastest network that your budget allows.
Fortunately, the availability of high-speed computers has also forced the development of high-speed networking systems.
Examples are 10 Mbit Ethernet, 100 Mbit Ethernet, gigabit networking, and channel bonding.
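To see how much the link speed matters, the following sketch estimates the time to move a payload over each of the networks named above. The 100 MB payload is an assumed example, the bonded figure assumes two gigabit links that scale perfectly, and protocol overhead and latency are ignored, so the numbers are best-case estimates.

```python
# Nominal link rates in megabits per second, taken from the examples above.
LINK_SPEEDS_MBIT = {
    "10Mbit Ethernet": 10,
    "100Mbit Ethernet": 100,
    "Gigabit Ethernet": 1000,
    # Assumption: two bonded gigabit links with ideal (2x) scaling.
    "2 x Gigabit (channel bonding, ideal)": 2000,
}


def transfer_seconds(size_megabytes, link_mbit):
    """Best-case transfer time: megabytes * 8 bits per byte / link rate."""
    return size_megabytes * 8 / link_mbit


payload_megabytes = 100  # hypothetical payload size

for name, mbit in LINK_SPEEDS_MBIT.items():
    print(f"{name}: {transfer_seconds(payload_megabytes, mbit):.1f} s")
```

Under these assumptions the same payload takes 80 seconds on 10 Mbit Ethernet but under a second on gigabit links.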
CONCLUSION
Clusters are promising.
They solve the parallel processing paradox.
New trends in hardware and software technologies are likely to make clusters more promising.
Cluster-based supercomputers (Linux-based clusters) can be seen everywhere.
THANK YOU
