Copyright © 2012, Elsevier Inc. All rights reserved.
Distributed and Cloud Computing
K. Hwang, G. Fox and J. Dongarra

Chapter 2: Computer Clusters for Scalable Parallel Computing
(suggested for 3 lectures in 150 minutes)

Prepared by Kai Hwang
University of Southern California
March 30, 2012
Figure 2.3 The Top-500 supercomputer performance from 1993 to 2010 (Courtesy of http://www.top500.org [25])
Figure 2.2 Architectural share of the Top-500
systems
(Courtesy of http://www.top500.org [25])
What is a computing cluster?
 A computing cluster consists of a collection of interconnected, stand-alone/complete computers that work cooperatively together as a single, integrated computing resource. A cluster exploits parallelism at the job level and supports distributed computing with higher availability.
 A typical cluster:
 Merges multiple system images into an SSI (single-system image) at certain functional levels
 Applies low-latency communication protocols
 Is more loosely coupled than an SMP, even with an SSI
Multicomputer Clusters:
 Cluster: a network of computers supported by middleware and interacting by message passing (a minimal message-passing sketch follows this list)
 PC Cluster (most Linux clusters)
 Workstation Cluster (NOW, COW)
 Server cluster or server farm
 Cluster of SMPs or ccNUMA systems
 Cluster-structured massively parallel processors (MPP) – about 85% of the Top-500 systems
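As a minimal sketch of the message-passing style named above, assuming an MPI runtime and the mpi4py package are available on the cluster and the job is launched with something like mpiexec -n 4 python hello_cluster.py (none of these tools or names is prescribed by this chapter):

# Minimal message-passing sketch with mpi4py; the MPI runtime, the package,
# and the script name are illustrative assumptions only.
from mpi4py import MPI

comm = MPI.COMM_WORLD        # communicator spanning every launched process
rank = comm.Get_rank()       # this process's id within the cluster job
size = comm.Get_size()       # total number of cooperating processes

if rank == 0:
    # Rank 0 collects a short report from every other node via messages.
    for src in range(1, size):
        print(comm.recv(source=src, tag=0))
else:
    comm.send("hello from rank %d of %d" % (rank, size), dest=0, tag=0)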
Multi-Computer Cluster Components
Operational Benefits of Clustering
 System availability (HA): a cluster offers inherently high system availability due to the redundancy of hardware, operating systems, and applications (a heartbeat/failover sketch follows this list)
 Hardware fault tolerance: a cluster has some degree of redundancy in most system components, including both hardware and software modules
 OS and application reliability: run multiple copies of the OS and applications, and tolerate individual failures through this redundancy
 Scalability: add servers to a cluster, or more clusters to a network, as application needs arise
 High performance: running cluster-enabled programs yields higher throughput
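As a rough illustration of how redundancy is turned into availability, the sketch below polls a primary node and promotes a standby when the primary stops answering. The node names, the use of ping as the heartbeat, and the timeout are all hypothetical choices, not a prescribed HA mechanism:

# Hypothetical heartbeat/failover loop; node names, ICMP ping as the
# heartbeat, and the 5-second timeout are illustrative only (Linux ping flags assumed).
import subprocess
import time

PRIMARY, STANDBY = "node1", "node2"   # hypothetical cluster hosts
TIMEOUT_S = 5

def alive(host):
    """One ping as a stand-in for a real cluster heartbeat protocol."""
    return subprocess.call(
        ["ping", "-c", "1", "-W", str(TIMEOUT_S), host],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) == 0

while True:
    if not alive(PRIMARY) and alive(STANDBY):
        print("primary lost -- promoting", STANDBY, "(failover would start here)")
        break
    time.sleep(TIMEOUT_S)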
Cluster Opportunities:
 MPP/DSM: compute across multiple systems – parallelism
 Network RAM: page across other nodes' idle memory instead of local disk (a conceptual sketch follows this list)
 Software RAID: a file system supporting parallel I/O, reliability, and mass storage
 Multi-path communication: communicate across multiple networks – Ethernet, ATM, Myrinet
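Purely as a conceptual sketch of the network-RAM idea (not any particular DSM or remote-paging system), a node might export a memory page to a peer's idle RAM over the network instead of swapping it to disk. The peer address, port, and pickle wire format below are illustrative assumptions:

# Conceptual network-RAM sketch: page out to a peer's idle memory over TCP
# rather than to local disk. Peer address, port, and the pickle-based wire
# format are assumptions, not a real remote-paging protocol.
import pickle
import socket

PEER = ("node2.cluster.example", 5000)   # hypothetical idle-memory server

def page_out(page_id, data):
    """Ship one memory page to the remote idle-memory store."""
    with socket.create_connection(PEER) as s:
        s.sendall(pickle.dumps(("put", page_id, data)))

def page_in(page_id):
    """Fetch a previously exported page back from the peer."""
    with socket.create_connection(PEER) as s:
        s.sendall(pickle.dumps(("get", page_id)))
        s.shutdown(socket.SHUT_WR)            # tell the server we are done sending
        return pickle.loads(s.makefile("rb").read())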
Resource Sharing in a Cluster of Computers
Issues in Cluster Design
 Size Scalability (physical & application)
 Enhanced Availability (failure management)
 Single System Image (middleware, OS extensions)
 Fast Communication (networks & protocols)
 Load Balancing (CPU, net, memory, disk)
 Security and Encryption (clusters and grids)
 Distributed Environment (user-friendly)
 Manageability (jobs and resources)
 Programmability (simple API required)
 Applicability (cluster- and grid-awareness)
Compute Node Architectures:
Cluster Interconnects:
IBM BlueGene/L Supercomputer: The World's Fastest Message-Passing MPP, Built in 2005
Built jointly by IBM and LLNL teams and funded by the US DoE ASCI Research Program
Overview of Blue Gene/L
 Blue Gene/L is a supercomputer jointly developed by IBM and Lawrence Livermore National Laboratory
 It occupies 17 of the top 100 slots in the rankings at top500.org, including 5 of the top 10
 360 TeraFLOPS theoretical peak speed
 Largest configuration:
 At Lawrence Livermore Nat'l Lab
 Runs simulations of the US nuclear weapon stockpile
 64 physical racks
 65,536 compute nodes
 Torus interconnection network of 64 x 32 x 32 (neighbor addressing sketched below)
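To make the torus interconnect concrete, the sketch below computes the six nearest neighbors of a node in a 64 x 32 x 32 3-D torus, with wraparound at the edges; it only illustrates torus addressing in general, not Blue Gene/L's actual routing hardware:

# Neighbor addressing in a 3-D torus of the 64 x 32 x 32 shape quoted above.
# Illustrative only; this is not Blue Gene/L's routing implementation.
DIMS = (64, 32, 32)

def torus_neighbors(x, y, z):
    """Yield the six (x, y, z) coordinates adjacent to a node, with wraparound."""
    coords = (x, y, z)
    for axis in range(3):
        for step in (-1, +1):
            n = list(coords)
            n[axis] = (n[axis] + step) % DIMS[axis]   # wrap at the torus edge
            yield tuple(n)

print(sorted(torus_neighbors(0, 0, 31)))   # the z = 31 neighbor wraps back to z = 0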
Cluster Middleware
 Resides between the OS and applications and offers an infrastructure for supporting:
 Single System Image (SSI)
 System Availability (SA)
 SSI makes the collection appear as a single machine (a globalized view of system resources), e.g., telnet cluster.myinstitute.edu
 Checkpointing and process migration
What is Single System Image (SSI)?
 A single system image is the illusion, created by software or hardware, that presents a collection of resources as one integrated, powerful resource.
 SSI makes the cluster appear like a single machine to the user, to applications, and to the network.
 A cluster with multiple system images is nothing but a collection of independent computers (distributed systems in general).
Desired SSI Services
 Single Entry Point: telnet cluster.usc.edu rather than telnet node1.cluster.usc.edu (a round-robin sketch follows this list)
 Single File Hierarchy: xFS, AFS, Solaris MC Proxy
 Single Control Point: management from a single GUI
 Single Virtual Networking over multiple physical networks
 Single Memory Space: Network RAM / DSM
 Single Job Management: GlUnix, Codine, LSF, etc.
 Single User Interface: like CDE in Solaris/NT
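One common way to realize a single entry point is to advertise one cluster name and spread incoming logins across several interchangeable login nodes. The sketch below does this with a trivial round-robin picker; the first host name comes from the slide above, the others are hypothetical, and real clusters typically do this in DNS or a front-end load balancer rather than in client code:

# Single-entry-point sketch: one advertised name, many interchangeable login
# nodes, picked round-robin. Illustrative only.
import itertools

LOGIN_NODES = ["node1.cluster.usc.edu", "node2.cluster.usc.edu",
               "node3.cluster.usc.edu"]          # node2/node3 are hypothetical
_next_node = itertools.cycle(LOGIN_NODES)

def entry_point(alias="cluster.usc.edu"):
    """Map the single advertised cluster name onto one concrete login node."""
    chosen = next(_next_node)
    print(alias, "->", chosen)
    return chosen

entry_point()   # users always type the same name; the cluster picks the node
entry_point()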
Single Entry Point to access a
Cluster from any physical point
Four SSI Features in Networking, I/O Space, Memory Sharing, and Cluster Control
Availability Support Functions
 Single I/O Space (SIO):
 Any node can access any peripheral or disk device without knowledge of its physical location.
 Single Process Space (SPS):
 Every process has a cluster-wide process ID, and processes communicate through signals, pipes, etc., as if they were on a single node.
 Checkpointing and Process Migration (sketched below):
 Save process state and intermediate results from memory for rollback recovery after failures.
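A minimal sketch of the checkpoint-and-rollback idea, assuming the application state fits in a picklable dict and that a shared path is visible to whichever node restarts the job; real cluster checkpointing saves whole process images transparently, which this does not attempt:

# Minimal checkpoint/rollback-recovery sketch. The shared path is a
# hypothetical location on a cluster-wide file system.
import os
import pickle

CKPT = "/shared/app.ckpt"        # hypothetical path on shared storage

def save_checkpoint(state):
    with open(CKPT + ".tmp", "wb") as f:
        pickle.dump(state, f)
    os.replace(CKPT + ".tmp", CKPT)      # atomic rename: never a torn checkpoint

def recover():
    """Roll back to the last saved state, or start fresh if none exists."""
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "partial_sum": 0}

state = recover()                        # a restarted node resumes here
for step in range(state["step"], 100):
    state["partial_sum"] += step
    state["step"] = step + 1
    if step % 10 == 0:
        save_checkpoint(state)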
Distributed RAID – The RAID-x Architecture (block placement sketched below)
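As a generic illustration of distributed RAID (not the actual RAID-x data layout), the sketch below stripes data blocks across the nodes' local disks and places each block's mirror on a different node, so the loss of any single node leaves every block available:

# Generic distributed-RAID placement sketch: stripe blocks across nodes and
# mirror each block on a different node. Illustrative only; not the RAID-x layout.
def place_blocks(n_blocks, n_nodes):
    """Return {block: (primary_node, mirror_node)} with mirrors off-node."""
    layout = {}
    for b in range(n_blocks):
        primary = b % n_nodes                # stripe across the nodes
        mirror = (primary + 1) % n_nodes     # mirror lands on a different node
        layout[b] = (primary, mirror)
    return layout

# Any single node can fail and every block still has a surviving copy.
print(place_blocks(8, 4))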
Single Points of Failure in SMP and Clusters
Figure 2.4 Country share of the Top-500
supercomputers over time [25]
Figure 2.5 Application-area share of Top-500 systems over time.
(Courtesy of http://www.top500.org [25])
Top-500 Release in June 2010
The Cray XT5 Jaguar Supercomputer
IBM Roadrunner System
A proposed NVIDIA GPU processor chip architecture with 128 cores (160 GFLOPS each) plus 8 latency processors (LPs) connected to 1,024 SRAMs (L2 caches) by a NoC, where the MCs are the memory controllers connecting to off-chip DRAMs and NI is the network interface to the next level of network (Courtesy of Bill Dally, reprinted with permission [10]).
The architecture of a GPU cluster built with a hierarchical network of processor chips (GPUs) that can deliver 2.6 PFlops per cabinet. It takes at least N = 400 cabinets to achieve the desired PFlops or EFlops performance (arithmetic check below). (Courtesy of Bill Dally, reprinted with permission [10].)
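A quick back-of-the-envelope check of the figures quoted on these two slides, using only the numbers stated there (128 cores at 160 GFLOPS per chip, 2.6 PFlops per cabinet, N = 400 cabinets):

# Sanity-check arithmetic using only the numbers quoted on the two slides above.
cores_per_chip = 128
gflops_per_core = 160
print("per chip:", cores_per_chip * gflops_per_core / 1e3, "TFLOPS")   # 20.48 TFLOPS

pflops_per_cabinet = 2.6
cabinets = 400
print("system:", pflops_per_cabinet * cabinets / 1e3, "EFLOPS")        # 1.04 EFLOPS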
Papers/Books on Clusters and MPPs:
1. G. Bell, J. Gray, and A. Szalay, "Petascale Computational Systems: Balanced Cyber-Infrastructure in a Data-Centric World," IEEE Computer, 2006.
2. K. Hwang, G. Fox, and J. Dongarra, Distributed and Cloud Computing, Chapter 2, Morgan Kaufmann, 2011.
3. G. F. Pfister, In Search of Clusters, 2nd edition, Prentice-Hall, NJ, 2001.

Updated Studies: Conduct an updated study of the top 5 systems in the latest Top-500 list. Produce an updated Table 2.3 and describe the No. 1 system in detail, like the treatment of Tianhe-1A, Jaguar, and Roadrunner in 2011, 2010, and 2009, respectively, in Chapter 2.