Chapter 6
Basic Parallelism and CPU
• 6.1 Introduction
• 6.2 SISD Computers
• 6.3 Hardware and software parallelism: hardware parallelism and software parallelism
• 6.4 The role of compilers
• 6.5 Communication latency
• 6.6 Grain packing and scheduling
• 6.7 Static multiprocessor scheduling
• 6.8 Node duplication
6.1 Introduction
• SISD CPUs
– Parallelism in a conventional CPU
– Multiple-issue CPUs
– Multiple functional units
– Parallelism with multiple CPUs
– Grain packing and node duplication
– Scheduling
6.2 SISD Computers
• A simple processing element executes a single instruction on a single data stream.
• To recap:
• This is the conventional von Neumann computer.
• A single processor executes instructions sequentially.
• The operations are ordered in time and may be easily traced from start to end.
• Modern uniprocessor systems use some form of pipelining and superscalar techniques.
• Pipelining introduces temporal parallelism by allowing the sequential execution of instructions to be overlapped in time (using multiple functional units).
• The need for branching may reduce its effectiveness.
• Very long instruction words can be used to reduce the impact of branching.
• Recap
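The overlap that pipelining provides can be put in numbers. The sketch below assumes an ideal k-stage pipeline with one cycle per stage and no stalls or branches; the function names and the values n = 100 and k = 5 are illustrative assumptions, not taken from the slides.

```python
def unpipelined_cycles(n, k):
    """Cycles to run n instructions when each one occupies all k stages alone."""
    return n * k


def pipelined_cycles(n, k):
    """Cycles on an ideal k-stage pipeline: k to fill, then one result per cycle."""
    return k + (n - 1)


n, k = 100, 5  # hypothetical instruction count and pipeline depth
speedup = unpipelined_cycles(n, k) / pipelined_cycles(n, k)
print(unpipelined_cycles(n, k), pipelined_cycles(n, k), round(speedup, 2))
```

Under these assumptions the speedup approaches k as n grows; branches and other stalls, as noted above, keep real pipelines below that bound.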
6.3 Hardware and software parallelism
• Implementing parallelism requires special hardware and software support.
• We distinguish between hardware parallelism and software parallelism.
• There is a mismatch problem between hardware and software.
• Compilation support is needed to close the gap between hardware and software.
• Parallelism cannot be achieved for free.
• Details of the special hardware functions and software support follow.
Hardware parallelism
• Defined by the machine hardware and hardware multiplicity.
• A function of cost and performance tradeoffs.
• Indicates the peak performance of the processor resources.
• A processor that issues k instructions per machine cycle is called a k-issue processor.
• A conventional processor takes one or more cycles to issue a single instruction.
• Such a processor is a one-issue machine.
• Examples: the i960CA is a three-issue processor, the Pentium 4 a four-issue processor, etc.
Software parallelism
• Defined by the control and data dependences of programs.
• The degree of parallelism is revealed in the program profile or in the program flow graph.
• Software parallelism is a function of the algorithm.
• The program flow graph displays the patterns of simultaneously executable operations.
• Example: Hwang (fig. 2.3, page 58, and fig. 2.4, page 59)
• Control and data parallelism: control parallelism appears as pipelining or a multiplicity of functional units; data parallelism offers a higher potential for concurrency, on SIMD and MIMD systems.
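To make "degree of parallelism revealed in the program flow graph" concrete, the sketch below computes a parallelism profile for a small graph. The eight-operation graph (four loads, two multiplies, one add, one subtract) and all names are hypothetical, and every operation is assumed to take one cycle with no limit on how many can run at once.

```python
from collections import Counter

# Hypothetical flow graph: each operation maps to the operations it depends on.
DEPS = {
    "L1": [], "L2": [], "L3": [], "L4": [],
    "X1": ["L1", "L2"],
    "X2": ["L3", "L4"],
    "ADD": ["X1", "X2"],
    "SUB": ["X1", "X2"],
}


def levels(deps):
    """Earliest cycle (1-based) of each operation, assuming unit latency."""
    level = {}

    def visit(op):
        if op not in level:
            level[op] = 1 + max((visit(p) for p in deps[op]), default=0)
        return level[op]

    for op in deps:
        visit(op)
    return level


def parallelism_profile(deps):
    """Number of operations that can execute in each cycle."""
    counts = Counter(levels(deps).values())
    return [counts[c] for c in range(1, max(counts) + 1)]


profile = parallelism_profile(DEPS)
average = sum(profile) / len(profile)
print(profile, round(average, 2))  # [4, 2, 2] 2.67
```

The average, total operations divided by the number of cycles, is one simple measure of the software parallelism available in this graph.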
• To solve the mismatch problem between software parallelism and hardware parallelism:
• Develop compilation support.
• Redesign the hardware and use an intelligent, optimizing compiler.
• The instruction scheduler exploits the pipeline hardware by filling branch and load delay slots (using cache and dynamic scheduling).
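The mismatch can be illustrated by scheduling one dependence graph on processors of different issue widths. This is a minimal sketch, assuming unit-latency operations, an acyclic graph, and a simple greedy in-order scheduler; the graph and the function name are hypothetical and do not reproduce any figure from Hwang.

```python
# Hypothetical dependence graph: operation -> operations it depends on.
DEPS = {
    "L1": [], "L2": [], "L3": [], "L4": [],
    "X1": ["L1", "L2"],
    "X2": ["L3", "L4"],
    "ADD": ["X1", "X2"],
    "SUB": ["X1", "X2"],
}


def schedule_cycles(deps, k):
    """Cycles needed when at most k ready operations are issued per cycle."""
    done = set()
    cycles = 0
    while len(done) < len(deps):
        # An operation is ready once all of its inputs finished in earlier cycles.
        ready = [op for op in deps
                 if op not in done and all(p in done for p in deps[op])]
        done.update(ready[:k])
        cycles += 1
    return cycles


results = {k: schedule_cycles(DEPS, k) for k in (1, 2, 4, 8)}
print(results)  # {1: 8, 2: 4, 4: 3, 8: 3}
```

With k = 1 or k = 2 the hardware is the limit; from k = 4 onward the three-cycle dependence chain in the program is the limit, so extra issue slots go unused.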
6.4 The role of compilers
• Compiler techniques are used to exploit hardware features and improve performance.
• These include loop transformation, software pipelining, and features developed in existing optimizing compilers to support parallelism.
• Hardware and software should be designed jointly, at the same time.
• Hardware and software design tradeoffs also exist in terms of cost, complexity, expandability, compatibility, and performance.
• Granularity and communication latency play an important role in code optimization and scheduling.
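As one illustration of a loop transformation, the sketch below unrolls a dot-product loop by four with separate accumulators, which exposes independent multiply-add chains that a multiple-issue processor could overlap. The function names and the unroll factor are assumptions for illustration; a real optimizing compiler applies this to machine code, and Python itself gains no speed from it.

```python
def dot_rolled(a, b):
    """Original loop: every iteration depends on the previous value of total."""
    total = 0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def dot_unrolled(a, b):
    """Unrolled by four: the four accumulators do not depend on each other."""
    n = len(a)
    s0 = s1 = s2 = s3 = 0
    i = 0
    while i + 4 <= n:
        s0 += a[i] * b[i]
        s1 += a[i + 1] * b[i + 1]
        s2 += a[i + 2] * b[i + 2]
        s3 += a[i + 3] * b[i + 3]
        i += 4
    while i < n:  # leftover iterations when n is not a multiple of four
        s0 += a[i] * b[i]
        i += 1
    return s0 + s1 + s2 + s3


a = list(range(10))
b = list(range(10, 20))
print(dot_rolled(a, b), dot_unrolled(a, b))
```

The example uses integers so that both versions give identical results; with floating-point data the changed summation order can alter rounding.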
6.5 Communication latency
• Balancing granularity and latency achieves better performance (the balance depends on technology, scalability, and machine size).
• Memory latency increases with memory capacity.
• Various latency-hiding and latency-tolerating techniques exist.
• Inter-process communication latency is another important parameter.
• n tasks communicating with each other require n(n-1)/2 communication links (the number grows quadratically).
• Communication patterns.
• Patterns include permutations, broadcast, multicast, and conference.
• Communication demand may limit the granularity of parallelism.
• There is a tradeoff between communication and granularity.
• Reduce the latency and complexity of communication.
• Prevent deadlock.
• Minimize blocking in communication.
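The n(n-1)/2 link count for n fully connected tasks can be checked by enumerating every pair of tasks. The function name and the sample sizes below are illustrative assumptions.

```python
from itertools import combinations


def link_count(n):
    """Links needed so that each of n tasks has a direct link to every other."""
    return n * (n - 1) // 2


for n in (2, 4, 8, 16):
    # Each unordered pair of tasks needs exactly one link.
    pairs = list(combinations(range(n), 2))
    assert link_count(n) == len(pairs)
    print(n, link_count(n))
```

Doubling n from 8 to 16 raises the count from 28 to 120, roughly a fourfold increase, which is the quadratic growth mentioned above.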
6.6 Grain packing and scheduling
• Two fundamental questions:
– 1. How can we partition a program into parallel branches, program modules, or grains to yield the shortest possible execution time?
– 2. What is the optimal size of concurrent grains in a computation?
• Both problems are machine-dependent.
• The goal is a short schedule for fast execution of the subdivided program modules.
• There are tradeoffs between parallelism and scheduling/synchronization overhead.
• Partitioning involves the algorithm designer, programmer, compiler, operating system support, etc.
• Hwang (fig. 2.6, page 65)
• Node label (n, s): n is the node name, s is the grain size
• Edge label (v, d): v is the output variable, d is the delay
• Fine grain, coarse grain, and grain packing
• Hwang (fig. 2.7, page 66)
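The tradeoff behind grain packing can be sketched with a tiny program graph in the (n, s) and (v, d) style. The sketch assumes each grain runs on its own processor, nodes inside a grain run one after another, and delays between nodes of the same grain are zero. The three-node graph, its sizes and delays, and the function name are hypothetical and are not taken from Hwang's figures.

```python
def schedule_length(sizes, edges, grain_of):
    """Finish time when each grain runs on its own processor.

    sizes:    {node: grain size in cycles}, listed in a topological order
    edges:    {(src, dst): communication delay in cycles}
    grain_of: {node: grain id}; delays inside one grain are taken as zero
    """
    finish = {}
    free_at = {}  # grain id -> time at which its processor becomes free
    for node, size in sizes.items():
        start = free_at.get(grain_of[node], 0)
        for (src, dst), delay in edges.items():
            if dst == node:
                cost = 0 if grain_of[src] == grain_of[node] else delay
                start = max(start, finish[src] + cost)
        finish[node] = start + size
        free_at[grain_of[node]] = finish[node]
    return max(finish.values())


sizes = {"A": 1, "B": 1, "C": 2}
fine = {"A": 0, "B": 1, "C": 2}    # every node is its own grain
packed = {"A": 0, "B": 0, "C": 0}  # all nodes packed into one grain

slow_links = {("A", "C"): 6, ("B", "C"): 6}
fast_links = {("A", "C"): 0, ("B", "C"): 0}

print(schedule_length(sizes, slow_links, fine),
      schedule_length(sizes, slow_links, packed))  # 9 4
print(schedule_length(sizes, fast_links, fine),
      schedule_length(sizes, fast_links, packed))  # 3 4
```

With costly communication the packed grain wins (4 versus 9 cycles); with free communication the fine-grain version wins (3 versus 4) because A and B run in parallel. This is why the optimal grain size is machine-dependent.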
6.7 Static multiprocessor scheduling
• Grain packing may not always produce a shorter schedule.
• Dynamic multiprocessor scheduling is an NP-hard problem.
6.8 Node duplication
• Node duplication eliminates idle time and reduces communication delay.
• Four major steps for grain packing and scheduling:
– 1. Construct a fine-grain program graph.
– 2. Schedule the fine-grain computation.
– 3. Perform grain packing to produce the coarse grains.
– 4. Generate a parallel schedule based on the packed graph.
• Hwang (fig. 2.8, page 67)
• Calculation of grain size and communication delay
• Hwang (fig. 2.9, page 68)
• Sequential versus parallel scheduling
• Hwang (fig. 2.10, page 69)
• Grain packing for the problem in fig. 2.9(c)
• Hwang (fig. 2.11, page 70)
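Node duplication can be illustrated with a two-processor schedule in which a producer node is copied onto the processor of its remote consumer. This is a sketch under stated assumptions: the node sizes, delays, processor names, and the makespan function are hypothetical, and placements must be listed so that every copy of a node's predecessors appears before the node itself.

```python
def makespan(sizes, edges, placements):
    """Length of a schedule given as (processor, node) pairs in execution order."""
    finish = {}   # (processor, node) -> finish time
    free_at = {}  # processor -> time at which it becomes free
    for proc, node in placements:
        start = free_at.get(proc, 0)
        for (src, dst), delay in edges.items():
            if dst != node:
                continue
            # Take the input from whichever copy of the producer arrives first;
            # a copy on the same processor costs no communication delay.
            arrival = min(t + (0 if p == proc else delay)
                          for (p, n), t in finish.items() if n == src)
            start = max(start, arrival)
        finish[(proc, node)] = start + sizes[node]
        free_at[proc] = finish[(proc, node)]
    return max(finish.values())


sizes = {"A": 4, "B": 4, "C": 4}
edges = {("A", "B"): 8, ("A", "C"): 8}  # delay applies only across processors

without_dup = [("P1", "A"), ("P1", "B"), ("P2", "C")]
with_dup = [("P1", "A"), ("P2", "A"), ("P1", "B"), ("P2", "C")]

print(makespan(sizes, edges, without_dup))  # 16
print(makespan(sizes, edges, with_dup))     # 8
```

Without duplication, P2 sits idle until A's result arrives at time 12. Running a second copy of A on P2 costs 4 extra cycles of work but removes the 8-cycle delay, halving the schedule length in this example.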
