1. This document introduces parallel computing, which involves dividing large problems into smaller concurrent tasks that can be solved simultaneously using multiple processors to reduce computation time.
2. Parallel computing systems include single machines with multi-core CPUs and computer clusters consisting of multiple interconnected machines. Common parallel programming models involve message passing between distributed memory processors.
3. Performance of parallel programs is measured by metrics like speedup and efficiency. Factors like load balancing, serial fractions of problems, and parallel overhead affect how well a problem can scale with additional processors.
4. What is Parallel Computing?
A form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently ("in parallel").¹ [Almasi and Gottlieb, 1989]
[Diagram: a problem is decomposed into tasks, each task into a stream of instructions, and the instruction streams execute simultaneously on separate CPUs.]
5. Patterns of Parallelism
Data parallelism [Quinn, 2003]²
Independent tasks apply the same operation to different elements of a data set:
for i ← 0 to 99 do
a[i] = b[i] + c[i]
endfor
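A minimal runnable version of this loop in C with OpenMP; a sketch assuming the arrays are plain C arrays (the deck's reference text, Quinn 2003, pairs C with OpenMP):

#include <stdio.h>
#include <omp.h>

#define N 100

int main(void) {
    double a[N], b[N], c[N];

    /* Initialize the inputs. */
    for (int i = 0; i < N; i++) { b[i] = i; c[i] = 2.0 * i; }

    /* Every iteration is independent, so OpenMP can split the
       index range among threads: data parallelism. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = b[i] + c[i];

    printf("a[99] = %.1f\n", a[99]);
    return 0;
}

Compile with gcc -fopenmp; without the flag the pragma is ignored and the loop simply runs serially.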
Functional parallelism [Quinn, 2003]²
Independent tasks apply different operations to different data elements:
a = 2, b = 3
m = (a + b) / 2
n = a² + b²
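The same two statements in C with OpenMP sections; a hedged sketch (not from the slides) in which each section applies a different operation to the shared inputs:

#include <stdio.h>
#include <omp.h>

int main(void) {
    double a = 2.0, b = 3.0;
    double m = 0.0, n = 0.0;

    /* The two computations are independent, so different
       operations can run on different threads at once. */
    #pragma omp parallel sections
    {
        #pragma omp section
        m = (a + b) / 2.0;      /* mean */

        #pragma omp section
        n = a * a + b * b;      /* sum of squares */
    }

    printf("m = %.2f, n = %.2f\n", m, n);
    return 0;
}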
7. Why use Parallel Computing?
Reduce computing time
More processors
8. Why use Parallel Computing? (1)
Solve larger problems
More memory
[Diagram: the problem's tasks and data are distributed across multiple machines, pooling the RAM of all nodes.]
9. Parallel Computing Systems
• A single machine with multi-core processors
[Diagram: a multithreaded process on a multi-core machine; the threads share the process's memory while running on separate cores.]
Limits of a single machine (performance, available memory)
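A minimal sketch of this multithreaded model in C with POSIX threads (an illustration assumed here, not taken from the slides): the threads share the process's memory, so each one can fill its own slot of a common array.

#include <pthread.h>
#include <stdio.h>

#define N 8
static int results[N];              /* memory shared by all threads */

static void *worker(void *arg) {
    int id = *(int *)arg;
    results[id] = id * id;          /* each thread writes its own slot */
    return NULL;
}

int main(void) {
    pthread_t threads[N];
    int ids[N];

    for (int i = 0; i < N; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < N; i++)
        pthread_join(threads[i], NULL);

    for (int i = 0; i < N; i++)
        printf("results[%d] = %d\n", i, results[i]);
    return 0;
}

Compile with gcc -pthread.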
10. What is a Cluster?
A group of linked computers working together closely so that in many respects they form a single computer.
Built to improve performance and/or availability over that provided by a single computer.³ [Webopedia computer dictionary, 2007]
Two flavors: high-performance clusters and high-availability clusters.
12. Message-Passing model
The system is assumed to be a collection of processors, each with its own local memory (a distributed-memory system).
A processor has direct access only to the instructions and data stored in its local memory.
An interconnection network supports message passing between processors.
MPI Standard² [Quinn, 2003]
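A minimal MPI sketch in C of the model just described: each process owns its local memory, and data moves only as explicit messages over the interconnect (a hedged illustration; the value, tag, and two-process layout are made up):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                 /* exists only in rank 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Rank 1 cannot read rank 0's memory; it must receive a message. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}

Build and run with, e.g., mpicc msg.c -o msg && mpirun -np 2 ./msg.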
13. Performance metrics for parallel computing
• Speedup [Kumar et al., 1994]⁴
How much performance gain is achieved by parallelizing a given application over a sequential implementation.
Sp - speedup with P processors:

Sp = Ts / Tp

where
Ts - sequential execution time
Tp - parallel execution time with P processors
P - number of processors

Worked example: P = 4, Ts = 40, Tp = 15 → Sp = 40 / 15 ≈ 2.67
15. Efficiency
A measure of processor utilization [Quinn, 2003]²
Ep - efficiency with P processors:

Ep = Sp / P

Worked example: P = 4, Sp = 2 → Ep = 0.5; P = 8, Sp = 3 → Ep = 0.375

In practice, speedup is less than P and efficiency is between zero and one, depending on the degree of effectiveness with which the processors are utilized.⁵ [Eijkhout, 2011]
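A sketch of how both metrics might be measured in practice, using OpenMP's wall-clock timer (the workload, problem size, and function name here are invented for illustration):

#include <math.h>
#include <stdio.h>
#include <omp.h>

#define N 50000000

/* Sum of square roots: enough arithmetic to make timing meaningful. */
static double work(int parallel) {
    double sum = 0.0;
    /* The if(parallel) clause lets the same loop run serially or in parallel. */
    #pragma omp parallel for reduction(+:sum) if(parallel)
    for (int i = 0; i < N; i++)
        sum += sqrt((double)i);
    return sum;
}

int main(void) {
    int P = omp_get_max_threads();

    double t0 = omp_get_wtime();
    work(0);                                /* sequential run */
    double Ts = omp_get_wtime() - t0;

    t0 = omp_get_wtime();
    work(1);                                /* parallel run */
    double Tp = omp_get_wtime() - t0;

    double Sp = Ts / Tp;                    /* speedup    */
    double Ep = Sp / P;                     /* efficiency */
    printf("P=%d  Ts=%.2fs  Tp=%.2fs  Sp=%.2f  Ep=%.2f\n", P, Ts, Tp, Sp, Ep);
    return 0;
}

Compile with gcc -fopenmp -lm.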
16. Effective factors of Parallel Performance
• Portion of computation [Quinn, 2003]²
Some computations must be performed sequentially; others can be performed in parallel.
fs - serial fraction of the computation
fp - parallel fraction of the computation

Sp = Ts / Tp = Ts / (fs·Ts + fp·Ts / P) = 1 / (fs + fp / P)

Worked example: Ts = 100, fs = 10%, fp = 90% → fs·Ts = 10, fp·Ts = 90
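This is Amdahl's law (the slide does not name it, but the form matches). One consequence worth spelling out: as the processor count grows, the speedup is capped by the serial fraction,

$$\lim_{P \to \infty} S_p = \lim_{P \to \infty} \frac{1}{f_s + f_p / P} = \frac{1}{f_s}$$

so for the worked example above (fs = 0.1), no number of processors can push the speedup beyond 1 / 0.1 = 10.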
17. Effective factors of Parallel Performance (1)
• Parallel overhead [Barney, 2011]⁶
The amount of time required to coordinate parallel tasks, as opposed to doing useful work (a timing sketch follows the list):
o Task start-up time
o Synchronizations
o Data communications
o Task termination time
• Load balancing, etc.
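This overhead can be observed directly: timing an (almost) empty OpenMP parallel region isolates thread start-up and synchronization cost, since no useful work happens inside it (a hedged sketch; the repetition count is arbitrary):

#include <stdio.h>
#include <omp.h>

int main(void) {
    const int reps = 1000;

    double t0 = omp_get_wtime();
    for (int r = 0; r < reps; r++) {
        /* No useful work: what we time is pure parallel overhead. */
        #pragma omp parallel
        { }
    }
    double t1 = omp_get_wtime();

    printf("average parallel-region overhead: %.2f microseconds\n",
           (t1 - t0) / reps * 1e6);
    return 0;
}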
19. Effective factors of Parallel Performance (3)
Fixed problem size:

Sp = Ts / Tp = Ts / (fs·Ts + (1 − fs)·Ts / P + Toverhead)

With the problem size fixed, adding processors shrinks only the (1 − fs)·Ts / P term; the serial part and the overhead remain.
20. Effective factors of Parallel Performance (4)
Fixed P; as the problem size grows, the speedup grows:

Sp = Ts / Tp = Ts / (fs·Ts + (1 − fs)·Ts / P + Toverhead)

Growing the problem increases the parallel part of the work while the serial part stays fixed, so the serial fraction fs shrinks:

                         small problem      large problem
2D grid calculations     85 mins (85%)      680 mins (97.84%)
Serial fraction          15 mins (15%)      15 mins (2.16%)
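Working through the slide's numbers (this shrinking serial fraction is the observation behind scaled speedup, often credited to Gustafson): the serial part stays at 15 minutes while the parallel part grows from 85 to 680 minutes, so

$$f_s = \frac{15}{85 + 15} = 15\% \quad\longrightarrow\quad f_s = \frac{15}{680 + 15} \approx 2.16\%$$

and the serial-fraction cap on speedup, 1/fs, rises from about 6.7 to about 46.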
21. Case Study
Hardware Configuration
Linux Cluster (4 compute nodes)
Details of each compute node:
o 2x Intel Xeon 2.80 GHz (Single core)
o 4 GB RAM
o Gigabit Ethernet
o CentOS 4.3
22. Case Study - CFD
Parallel Fluent Processing [Junhong, 2004]⁷
Run the Fluent solver on two or more CPUs simultaneously to calculate a computational fluid dynamics (CFD) job.
24. Case Study – CFD (2)
Case Test #1 – Runtime [chart]
25. Case Study – CFD (3)
Case Test #1 – Speedup [chart]
26. Case Study – CFD (4)
Case Test #1 – Efficiency [chart]
27. Conclusion
Parallel computing helps save computation time and solve larger problems than a single computer (sequential computing) can handle.
To use parallel computers, software is developed with a parallel programming model.
Performance of parallel computing is measured with speedup and efficiency.
28. References
1. G.S. Almasi and A. Gottlieb. 1989. Highly Parallel Computing. The Benjamin-Cummings Publishers, Redwood City, CA.
2. M.J. Quinn. 2003. Parallel Programming in C with MPI and OpenMP. The McGraw-Hill Companies, Inc., NY.
3. What is clustering? Webopedia Computer Dictionary. Retrieved November 7, 2007.
4. V. Kumar, A. Grama, A. Gupta, and G. Karypis. 1994. Introduction to Parallel Computing: Design and Analysis of Parallel Algorithms. The Benjamin-Cummings Publishers, Redwood City, CA.
5. V. Eijkhout. 2011. Introduction to Parallel Computing. Texas Advanced Computing Center (TACC), The University of Texas at Austin.
6. B. Barney. 2011. Introduction to Parallel Computing. Lawrence Livermore National Laboratory.
7. W. Junhong. 2004. Parallel Fluent Processing. SVU/Academic Computing, Computer Centre, National University of Singapore.
Editor's Notes
Serial computation: the program runs on a single computer with a single Central Processing Unit (CPU). A problem is broken into a discrete series of instructions, which are executed one after another; only one instruction may execute at any moment in time.
Multithreading, as a widespread programming and execution model, allows multiple threads to exist within the context of a single process. These threads share the process's resources but are able to execute independently. The threaded programming model provides developers with a useful abstraction of concurrent execution. Perhaps the most interesting application of the technology is applying it to a single process to enable parallel execution on a multiprocessor system.
Shared-memory systems (SMPs, cc-NUMAs) have a single address space. OpenMP is the standard for shared-memory programming (compiler directives).
Clusters vs. MPPs. The key differences between a cluster and an MPP system are:
• In a cluster, various components or layers can change relatively independently of each other, whereas components in MPP systems are much more tightly integrated. For example, a cluster administrator can choose to upgrade the interconnect, say from Fast Ethernet to Gigabit Ethernet, just by adding new network interface cards (NICs) and switches to the cluster; in most cases, the administrator of an MPP system cannot do such upgrades without upgrading the whole machine.
• A cluster decouples the development of system software from innovations in the underlying hardware. Cluster management tools and parallel programming libraries can be optimized independently of changes in the node hardware itself. This results in more mature and reliable cluster middleware compared to the system software layer in an MPP-class system, which requires at least a major rewrite with each generation of the system hardware.
• An MPP usually has a single system serial number used for software licensing and support tracking; clusters and NOWs have multiple serial numbers, one for each of their constituent nodes.
MPI is the standard for distributed-memory programming (a library of subprogram calls).
In computer hardware, shared memory refers to a (typically) large block of random-access memory (RAM) that can be accessed by several different central processing units (CPUs) in a multiple-processor computer system. Shared-memory systems (SMPs, cc-NUMAs) have a single address space; distributed-memory systems have separate address spaces for each processor.
Message Passing Interface (MPI) is a standardized and portable message-passing system designed by a group of researchers from academia and industry to function on a wide variety of parallel computers. The standard defines the syntax and semantics of a core of library routines useful to a wide range of users writing portable message-passing programs in Fortran 77 or C. Several well-tested and efficient implementations of MPI exist, including some that are free and in the public domain. These fostered the development of a parallel software industry and encouraged the development of portable and scalable large-scale parallel applications. MPI is a library specification for message passing, proposed as a standard by a broadly based committee of vendors, implementors, and users.
From a programming perspective, message-passing implementations usually comprise a library of subroutines whose calls are embedded in source code; the programmer is responsible for determining all parallelism. Historically, a variety of message-passing libraries have been available since the 1980s, but these implementations differed substantially from each other, making it difficult for programmers to develop portable applications. In 1992, the MPI Forum was formed with the primary goal of establishing a standard interface for message-passing implementations.