How Much Parallelism?
CS4532 Concurrent Programming
Dilum Bandara
Dilum.Bandara@uom.lk
Slides adapted slightly from “The Art of Multiprocessor Programming”
by Maurice Herlihy & Nir Shavit, & Dr. Srinath Perera
Why Do We Care?
• Want as much of the code as possible to execute concurrently (in parallel)
• Larger sequential part implies reduced performance
• Amdahl’s law: this relation is not linear…
Amdahl’s Law
$$\text{Speedup} = \frac{\text{OldExecutionTime}}{\text{NewExecutionTime}}$$
…of computation given n CPUs instead of 1
Amdahl’s Law
$$\text{Speedup} = \frac{1}{(1 - p) + \dfrac{p}{n}}$$
where p is the parallel fraction, 1 - p is the sequential fraction, and n is the number of processors.
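A short derivation sketch, assuming the single-CPU run takes time T (a symbol introduced here for illustration), the parallel fraction p divides perfectly across n processors, and parallelization adds no overhead:

$$T_n = (1 - p)\,T + \frac{p}{n}\,T \quad\Rightarrow\quad \text{Speedup} = \frac{T}{T_n} = \frac{1}{(1 - p) + \dfrac{p}{n}}$$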
Example – 1
• 10 processors
• 60% concurrent, 40% sequential
• How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{(1 - 0.6) + \dfrac{0.6}{10}} \approx 2.17$$
Example – 2
• 10 processors
• 80% concurrent, 20% sequential
• How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{(1 - 0.8) + \dfrac{0.8}{10}} \approx 3.57$$
Example – 3
• 10 processors
• 90% concurrent, 10% sequential
• How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{(1 - 0.9) + \dfrac{0.9}{10}} \approx 5.26$$
Example – 4
• 10 processors
• 99% concurrent, 1% sequential
• How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{(1 - 0.99) + \dfrac{0.99}{10}} \approx 9.17$$
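The four examples can be checked with a few lines of Python. This is a minimal sketch; the function name `amdahl_speedup` is an assumption made for illustration and does not come from the slides.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Speedup predicted by Amdahl's law for parallel fraction p on n processors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


# Parallel fractions from Examples 1 to 4, each on 10 processors.
for p in (0.6, 0.8, 0.9, 0.99):
    print(f"p = {p:.2f}, n = 10 -> speedup = {amdahl_speedup(p, 10):.2f}")
```

Even at 99% parallel code, 10 processors fall short of a 10-fold speedup.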
Speedup Against No of Processors
• Even with an infinite number of processors, maximum speedup is limited to 1/(1 – p)
• e.g., with only 5% of computation being serial, maximum speedup is 20
Source:
http://wiki.ccs.tulane.edu/index.php5/
Speedup/Scaling
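The numbers behind this limit can be tabulated directly. The sketch below assumes ideal Amdahl behavior with no overhead; `amdahl_speedup` and `max_speedup` are illustrative names, not taken from the slides.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Amdahl's law: speedup for parallel fraction p on n processors."""
    return 1.0 / ((1.0 - p) + p / n)


def max_speedup(p: float) -> float:
    """Upper bound on speedup as n grows without limit; requires p < 1."""
    return 1.0 / (1.0 - p)


p = 0.95  # 5% of the computation is serial, as in the bullet above
for n in (1, 10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: speedup = {amdahl_speedup(p, n):6.2f}")
print(f"limit as n grows: {max_speedup(p):.2f}")
```

Each tenfold increase in processors buys less than the previous one, and the speedup never reaches the bound.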
The Moral
• Making good use of our multiple processors (cores) means
  • Finding ways to effectively parallelize our code
    • Minimize sequential parts
    • It’s worth our effort to try & parallelize even the last 10% of serial code
  • Reducing idle time in which threads wait without executing
• This is what this course is about…
  • The percentage of code that is not easy to make concurrent, yet may have a large impact on overall speedup
Costs of Parallel Programming
• Costs
  • Task start-up time
  • Synchronization
  • Data communication
  • Software overhead imposed by parallel compilers, libraries, tools, operating system, etc.
  • Task termination time
• Parallel programs have efficiency < 1, which means they waste resources
  • For small programs, the additional cost will be prohibitive
• Parallel programming lets us get faster results at the cost of efficiency
  • Lets us solve a 1-CPU-year problem within a day using more CPUs
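The slide does not define efficiency. A common definition, assumed here, is speedup divided by the number of processors. The sketch below applies it to the "1 CPU year within a day" remark; the 50% efficiency figure is purely hypothetical, and both function names are illustrative.

```python
import math


def efficiency(speedup: float, n: int) -> float:
    """Fraction of the n processors' capacity doing useful work (assumed definition)."""
    return speedup / n


def cpus_needed(target_speedup: float, assumed_efficiency: float) -> int:
    """Processors required to reach target_speedup at a given, fixed efficiency."""
    return math.ceil(target_speedup / assumed_efficiency)


# A 1-CPU-year job finished within a day needs roughly a 365-fold speedup.
# At a hypothetical efficiency of 0.5, that takes about twice as many CPUs.
print("CPUs needed:", cpus_needed(365, 0.5))

# Efficiency of the 90%-parallel example (speedup 5.26 on 10 processors).
print("Efficiency:", efficiency(5.26, 10))
```

In practice efficiency tends to drop as processors are added, so treating it as fixed is a simplification.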
Complexity
• Parallel programs are often more complex than their serial counterparts
• Complexity is measured in terms of programmer time in different steps of the lifecycle
  • Design
  • Coding
  • Debugging
  • Tuning
  • Maintenance
• They should yield significant improvement to justify the costs
  • Using parallelism to achieve a 10-20% gain is not useful
Performance in General
• We can never measure the real performance of a system
  • Yet, we still try to do it
• To understand a system, 2 readings are required
  1. Latency – time to finish 1 instance of the problem
  2. Throughput – number of instances that can be finished in a unit of time
• Does throughput = 1/latency?
• Examples
  • Water pipe
  • Car vs. bus
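One way to read the car vs. bus example, using made-up numbers: the car is the lower-latency option and the bus the higher-throughput one, so throughput is not simply 1/latency once several instances are handled at the same time. The seat counts and trip times below are hypothetical.

```python
def throughput(instances_per_trip: int, latency_hours: float) -> float:
    """Instances completed per hour when a whole batch finishes together."""
    return instances_per_trip / latency_hours


# Hypothetical figures: a car moves 4 people in 0.5 h, a bus moves 40 people in 1 h.
car_latency, bus_latency = 0.5, 1.0
car_throughput = throughput(4, car_latency)
bus_throughput = throughput(40, bus_latency)

print(f"car: latency {car_latency} h, throughput {car_throughput} people/h")
print(f"bus: latency {bus_latency} h, throughput {bus_throughput} people/h")
```

Throughput equals 1/latency only when a single instance is processed at a time.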
Measuring Throughput or Latency
• When to measure latency?
  • When you have only 1 instance to run
  • When an operation has a user waiting on it (user interactions)
  • When time-sensitive deadlines are involved
    • e.g., real-time applications like predicting a storm as soon as possible
• When to measure throughput?
  • When latency is not important & overall utilization is more crucial
• Sometimes we need both
Note on Performance Analysis
• When you measure a system, you are taking a sample
• Central Limit Theorem
  • When we draw n samples from a distribution with mean µ & variance σ², as the sample size n increases, the distribution of the sample average of these random variables approaches a normal distribution with mean µ & variance σ²/n, irrespective of the shape of the distribution
• Confidence Interval + Error Bars
  • More readings mean a better confidence interval
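The Central Limit Theorem statement can be checked empirically. The sketch below is an illustration under assumptions not in the slides: it draws from an exponential distribution with rate 1 (mean 1, variance 1, clearly non-normal) and compares the variance of the sample averages against σ²/n. The seed, trial count, and function name are arbitrary choices.

```python
import random
import statistics


def sample_means(n: int, trials: int, rng: random.Random) -> list:
    """Return `trials` sample averages, each taken over n exponential(1) draws."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (5, 50, 200):
    means = sample_means(n, trials=2000, rng=rng)
    print(
        f"n = {n:>3}: mean of averages = {statistics.fmean(means):.3f}, "
        f"variance of averages = {statistics.variance(means):.5f}, "
        f"sigma^2/n = {1.0 / n:.5f}"
    )
```

The variance of the averages shrinks roughly in proportion to 1/n, which is why more readings tighten the confidence interval.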
Confidence Interval
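The body of this slide (likely a formula or figure) did not survive extraction. A common large-sample form, assumed here rather than taken from the slide, is $\bar{x} \pm z_{1-\alpha/2}\, s/\sqrt{n}$, where $\bar{x}$ is the sample mean, s the sample standard deviation, and n the number of readings. A minimal sketch using the normal approximation follows; for small n a Student's t quantile is normally used in place of z. The readings are hypothetical.

```python
import math
import statistics


def confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `samples`."""
    n = len(samples)
    mean = statistics.fmean(samples)
    s = statistics.stdev(samples)  # sample standard deviation (n - 1 divisor)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical response-time readings in seconds.
readings = [19.0, 20.0, 21.0, 20.0, 20.0]
low, high = confidence_interval(readings)
print(f"95% CI for the mean: [{low:.2f}, {high:.2f}] s")
```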
No of Samples?
• How many observations n are needed to get an accuracy of ± r% at a confidence level of 100(1 - α)%?
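The slide's formula did not survive extraction. A standard result, obtained by requiring the confidence-interval half-width to equal r% of the sample mean, is sketched below. The symbols are assumptions introduced here: $\bar{x}$ is the sample mean, s the sample standard deviation, and z the normal quantile for the chosen confidence level.

$$z\,\frac{s}{\sqrt{n}} = \frac{r}{100}\,\bar{x} \quad\Rightarrow\quad n = \left(\frac{100\, z\, s}{r\, \bar{x}}\right)^{2}$$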
Example
• Sample mean of response time = 20 s
• Sample standard deviation = 5 s
• How many repetitions are needed to get response time accurate within 1 second at 95% confidence?
  • Required accuracy (r) = 1 in 20 = 5%
  • z = 1.960
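The slide's worked answer was not recoverable from the extraction. Assuming the usual sample-size formula n = (100 z s / (r x̄))², where x̄ is the sample mean and s the sample standard deviation, the numbers above can be plugged in as follows. The function name is illustrative.

```python
import math


def required_samples(mean: float, std_dev: float, accuracy_percent: float, z: float) -> int:
    """Observations needed for +/- accuracy_percent of the mean at the confidence implied by z."""
    n = (100.0 * z * std_dev / (accuracy_percent * mean)) ** 2
    return math.ceil(n)  # round up: a fractional observation is not possible


# Values from the example: mean 20 s, standard deviation 5 s, r = 5%, z = 1.960.
print(required_samples(mean=20.0, std_dev=5.0, accuracy_percent=5.0, z=1.960))
```

With these inputs the expression inside the square is 9.8, so n is about 96.04 and rounds up to 97 repetitions.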
Data Presentation
• Numbers
  • Average, std, min, max, percentiles
• Tables
  • Enable comparisons
• Graphs
  • Easy to see trends
  • Enable more complex comparisons
Graphs
Error Bars & Box Plots
Box Plots (Cont.)
Graph Rules
• Use a suitable graph type for the case under analysis & the data
• Should have a title or caption
• Axes properly titled with units
  • Independent variable always goes on x-axis
  • Time always on x-axis
• Range of each axis may be different
  • Ticks should each be large enough to cover the needed range without lots of extra space
  • No need to start at zero
• Use a key to explain colors or symbols
• Graph should fill available space
• Error bars are encouraged to indicate uncertainty in a measurement
