How Much Parallelism?
CS4532 Concurrent Programming
Dilum Bandara
Dilum.Bandara@uom.lk
Slides adapted slightly from “The Art of Multiprocessor Programming”
by Maurice Herlihy & Nir Shavit, & Dr. Srinath Perera
Why Do We Care?
 Want as much of the code as possible to
execute concurrently (in parallel)
 Larger sequential part implies reduced
performance
 Amdahl’s law: this relation is not linear…
Amdahl’s Law

$$\text{Speedup} = \frac{\text{OldExecutionTime}}{\text{NewExecutionTime}}$$

…of computation given n CPUs instead of 1
Amdahl’s Law

$$\text{Speedup} = \frac{1}{(1 - p) + \dfrac{p}{n}}$$

where p = parallel fraction, (1 − p) = sequential fraction, & n = number of processors
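A minimal sketch (not from the slides; the class & method names are our own) that evaluates this formula for the four examples that follow:

```java
/** Amdahl's law: speedup of a program with parallel fraction p on n processors. */
public class Amdahl {

    static double speedup(double p, int n) {
        // Sequential part (1 - p) is unchanged; parallel part p shrinks by a factor of n
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        for (double p : new double[] {0.6, 0.8, 0.9, 0.99}) {
            System.out.printf("p = %.2f, n = 10 -> speedup = %.2f%n", p, speedup(p, 10));
        }
    }
}
```

Running it prints 2.17, 3.57, 5.26, & 9.17, matching Examples 1–4 below.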
Example – 1
 10 processors
 60% concurrent, 40% sequential
 How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{1 - 0.6 + \dfrac{0.6}{10}} = \frac{1}{0.46} \approx 2.17$$
Example – 2
 10 processors
 80% concurrent, 20% sequential
 How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{1 - 0.8 + \dfrac{0.8}{10}} = \frac{1}{0.28} \approx 3.57$$
Example – 3
 10 processors
 90% concurrent, 10% sequential
 How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{1 - 0.9 + \dfrac{0.9}{10}} = \frac{1}{0.19} \approx 5.26$$
Example – 4
 10 processors
 99% concurrent, 1% sequential
 How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{1 - 0.99 + \dfrac{0.99}{10}} = \frac{1}{0.109} \approx 9.17$$
Speedup Against No of Processors
 Even with an infinite no of processors, maximum speedup is limited to 1/(1 – p) (see the derivation below)
 e.g., with only 5% of computation being serial, maximum speedup is 20
[Figure: speedup vs. no of processors. Source: http://wiki.ccs.tulane.edu/index.php5/Speedup/Scaling]
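The bound follows by letting n grow without limit in Amdahl’s formula (a one-line derivation, not on the slide):

$$\lim_{n \to \infty} \frac{1}{(1 - p) + \dfrac{p}{n}} = \frac{1}{1 - p}, \qquad p = 0.95 \;\Rightarrow\; \text{Speedup} \le \frac{1}{0.05} = 20$$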
The Moral
 Making good use of our multiple processors (cores) means
 Finding ways to effectively parallelize our code
 Minimizing sequential parts
 It’s worth our effort to parallelize even the last 10% of serial code
 Reducing idle time in which threads wait without executing
 This is what this course is about…
 The % that is not easy to make concurrent may yet have a large impact on overall speedup
Costs of Parallel Programming
 Costs
 Task start-up time
 Synchronization
 Data communication
 Software overhead imposed by parallel compilers, libraries, tools, operating system, etc.
 Task termination time
 Parallel programs have efficiency < 1, which means they waste resources (see the worked figure after this list)
 For small programs, the additional cost will be prohibitive
 Parallel programming lets us get faster results at the cost of efficiency
 e.g., solve a 1 CPU-year problem within a day by using more CPUs
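The slides do not define efficiency; assuming the standard speedup-per-processor measure, a worked figure from Example 3:

$$E = \frac{\text{Speedup}}{n} = \frac{5.26}{10} \approx 0.53$$

i.e., nearly half the aggregate CPU capacity goes to overheads & idling, in exchange for the answer arriving about 5× sooner.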
Complexity
 Parallel programs are often more complex than their serial counterparts
 Complexity is measured in terms of the programmer’s time in different steps of the lifecycle
 Design
 Coding
 Debugging
 Tuning
 Maintenance
 They should yield a significant improvement to justify the costs
 Using parallelism to achieve only a 10–20% gain is not useful
Performance in General
 We can never measure the real performance of a system
 Yet, we still try to do it
 To understand a system, 2 readings are required
 1. Latency – time to finish 1 instance of the problem
 2. Throughput – no of instances that can be finished in a unit time
 Does throughput = 1/Latency? (see the illustration after this list)
 Examples
 Water pipe
 Car vs. bus
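A numeric illustration of the car vs. bus example (the numbers are assumptions, not from the slide): a car carries 4 people on a 1-hour trip (latency = 1 h, throughput = 4 people/h), while a bus carries 50 people on a 2-hour trip (latency = 2 h, throughput = 25 people/h). The bus is worse on latency yet far better on throughput, so throughput ≠ 1/Latency whenever work is overlapped or batched.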
Measuring Throughput or Latency
 When to measure Latency?
 When you have only 1 instance to run
 When an operation has a user waiting on it (user interactions)
 When time-sensitive deadlines are involved
 e.g., real-time applications like predicting a storm as soon as possible
 When to measure Throughput?
 When latency is not important & overall utilization is more crucial
 Sometimes we need both
Note on Performance Analysis
 When you measure a system, you are taking a sample
 Central Limit Theorem
 When we draw n samples from a distribution with mean µ & variance σ², then as the sample size n increases, the distribution of the sample average approaches a normal distribution with mean µ & variance σ²/n, irrespective of the shape of the original distribution
 Confidence Interval + Error Bars
 More readings mean a tighter confidence interval
Confidence Interval
[Figure: illustration of a confidence interval around a sample mean]
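A minimal sketch of how such an interval is computed, as mean ± z·s/√n; the sample values & class name are hypothetical, and with this few samples a t-quantile would be more appropriate than z = 1.960:

```java
/** 95% confidence interval for a sample mean (normal approximation). */
public class ConfidenceInterval {
    public static void main(String[] args) {
        double[] samples = {19.2, 20.5, 21.1, 18.7, 20.9, 19.8};  // hypothetical response times (s)
        int n = samples.length;

        double mean = 0;
        for (double x : samples) mean += x;
        mean /= n;

        double var = 0;                       // sample variance (n - 1 in the denominator)
        for (double x : samples) var += (x - mean) * (x - mean);
        double s = Math.sqrt(var / (n - 1));  // sample standard deviation

        double z = 1.960;                     // 95% level, normal approximation
        double half = z * s / Math.sqrt(n);   // half-width of the interval
        System.out.printf("mean = %.2f s, 95%% CI = [%.2f, %.2f]%n",
                          mean, mean - half, mean + half);
    }
}
```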
No of Samples?
 How many observations n are needed to get an accuracy of ±r% and a confidence level of 100(1 – α)%?
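A standard form of this sample-size formula (an assumption on our part: the example below appears to follow Jain’s The Art of Computer Systems Performance Analysis), with z the normal quantile for the chosen confidence level, s the sample standard deviation, & x̄ the sample mean:

$$n = \left( \frac{100 \, z \, s}{r \, \bar{x}} \right)^2$$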
Example
 Sample mean of response time = 20 s
 Sample standard deviation = 5
 How many repetitions are needed to get the response time accurate within 1 second at 95% confidence?
 Required accuracy (r) = 1 in 20 = 5%
 z = 1.960
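Plugging into the sample-size formula above (a worked completion under the same assumption):

$$n = \left( \frac{100 \times 1.960 \times 5}{5 \times 20} \right)^2 = 9.8^2 \approx 96.04 \;\Rightarrow\; 97 \text{ repetitions}$$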
Data Presentation
 Numbers
 Average, std, min, max, percentiles
 Tables
 Enable comparisons
 Graphs
 Easy to see trends
 Enable more complex comparisons
Graphs
[Figure: sample graphs]
Error Bars & Box Plots
[Figure: error bars & box plots]
Box Plots (Cont.)
[Figure: box plots, continued]
Graph Rules
 Use a suitable graph type for the case under analysis & the data
 Should have a title or caption
 Axes properly titled, with units
 Independent variable always goes on the x-axis
 Time always on the x-axis
 Range of each axis may be different
 Ticks should be large enough to cover the needed range without lots of extra space
 No need to start at zero
 Use a key to explain colors or symbols
 Graph should fill the available space
 Error bars are encouraged to indicate uncertainty in a measurement