Advanced Computer Architecture
Conditions of Parallelism
The exploitation of parallelism in computing
requires understanding the basic theory associated
with it. Progress is needed in several areas:
computation models for parallel computing
interprocessor communication in parallel architectures
integration of parallel systems into general environments
Data and Resource Dependencies
Program segments cannot be executed in parallel unless
they are independent.
Independence comes in several forms:
Data dependence: data modified by one segment must not be
modified by another parallel segment.
Control dependence: if the control flow of segments cannot be
identified before run time, then the data dependence between the
segments is variable.
Resource dependence: even if several segments are independent
in other ways, they cannot be executed in parallel if there aren’t
sufficient processing resources (e.g. functional units).
Dependence Graph
A dependence graph is used to describe the
relations between statements.
The nodes correspond to the program
statements, and directed edges with different labels
show the ordered relations among the statements.
Data Dependence - 1
The ordering relationship between statements is
shown by data dependence. The following are the five
types of data dependence:
Flow dependence (denoted S1 → S2): A
statement S2 is flow-dependent on statement S1 if
an execution path exists from S1 to S2 and if at
least one output of S1 is used as input to S2.
Antidependence: Statement
S2 is antidependent on statement S1 if S2 follows S1
in program order and if the output of S2 overlaps the
input of S1.
Data Dependence - 2
Output dependence: Two
statements are output-dependent if they produce the
same output variable.
I/O dependence: Read and
write are I/O statements. I/O dependence occurs
when the same file is referenced by both I/O
statements.
Data Dependence - 3
Unknown dependence arises in situations such as:
The subscript of a variable is itself subscripted.
The subscript does not contain the loop index variable.
A variable appears more than once with subscripts
having different coefficients of the loop variable (that is,
different functions of the loop variable).
The subscript is nonlinear in the loop index variable.
Parallel execution of program segments which do
not have total data independence can produce
non-deterministic results.
EENG-630 - Chapter 2
I/O dependence example
S1: Read (4), A(I)
S2: Rewind (4)
S3: Write (4), B(I)
S4: Rewind (4)
S1 and S3 are I/O-dependent: both reference the same file (unit 4).
Control Dependence
Control dependence arises when the order of
execution of statements cannot be determined
before run time.
For example, conditional statements (IF) will not
be resolved until run time. Different paths taken
after a conditional branch may introduce or
eliminate data dependence among instructions.
Dependence may also exist between operations
performed in successive iterations of a looping
procedure.
Control Dependence
Dependence may also exist between operations
performed in successive iterations of a looping
procedure.
The successive iterations of the following loop are
control-independent. Example:
for (i = 0; i < n; i++) {
  a[i] = c[i];
  if (a[i] < 0) a[i] = 1;
}
Control Dependence
The successive iterations of the following loop are
control-dependent, since the condition tested in
iteration i depends on a[i-1], which may have been set
in the previous iteration. Example:
for (i = 1; i < n; i++) {
  if (a[i-1] < 0) a[i] = 1;
}
Compiler techniques are needed to get around
control dependence limitations.
Resource Dependence
Data and control dependencies are based on the
independence of the work to be done.
Resource dependence is concerned with
conflicts in using shared resources, such as
registers, integer and floating-point ALUs, etc.
ALU conflicts are called ALU dependence.
Memory (storage) conflicts are called storage
dependence.
The transformation of a sequentially coded
program into a parallel executable form can be
done manually by the programmer using explicit
parallelism,
or by a compiler detecting implicit parallelism
automatically.
In both approaches, the decomposition of the
program is the primary objective.
Bernstein’s Conditions - 1
Bernstein revealed a set of conditions that
must hold for two processes to execute in parallel.
Notation
Let P1 and P2 be two processes .
Ii is the set of all input variables for a process Pi .
Oi is the set of all output variables for a process Pi .
If P1 and P2 can execute in parallel (which is written
as P1 || P2), then:
I1 ∩ O2 = ∅
I2 ∩ O1 = ∅
O1 ∩ O2 = ∅
Bernstein’s Conditions - 2
In terms of data dependencies, Bernstein’s
conditions imply that two processes can execute in
parallel if they are flow-independent,
anti-independent, and output-independent.
The parallelism relation || is commutative (Pi || Pj
implies Pj || Pi ), but not transitive (Pi || Pj and Pj || Pk
does not imply Pi || Pk ) . Therefore, || is not an
equivalence relation.
Intersection of the input sets is allowed.
Hardware And Software Parallelism
For the implementation of parallelism, special
hardware and software support is needed.
Compilation support is also needed to close the gap
between hardware and software.
Hardware Parallelism
Hardware parallelism is defined by machine architecture
and hardware multiplicity.
Hardware parallelism is a function of cost and performance
trade offs.
It displays the resource utilization patterns of
simultaneously executable operations.
It can also indicate the peak performance of the
processor resources.
It can be characterized by the number of
instructions that can be issued per machine cycle.
If a processor issues k instructions per machine
cycle, it is called a k-issue processor.
Conventional processors are one-issue machines.
Examples: the Intel i960CA is a three-issue processor
(arithmetic, memory access, branch); the IBM RS/6000
is a four-issue processor (arithmetic, floating-point,
memory access, branch).
A machine with n k-issue processors should be
able to handle a maximum of nk threads
simultaneously.
Software Parallelism
Software parallelism is defined by the control and
data dependence of programs, and is revealed in
the program’s flow graph.
It is a function of algorithm, programming style, and
compiler optimization.
The program flow graph displays the patterns of
simultaneously executable operations.
Parallelism in a program varies during the
execution period.
Mismatch between software and
hardware parallelism - 1
[Figure: dataflow graph showing maximum software parallelism (L = load, X/+/− = arithmetic): four loads L1–L4 in cycle 1, two multiplies X1 and X2 in cycle 2, and an add and a subtract producing A and B in cycle 3 — three cycles in total.]
Mismatch between software and
hardware parallelism - 2
[Figure: the same problem scheduled on a two-issue superscalar processor. With one memory-access operation and one arithmetic operation issued per cycle, the loads L1–L4 serialize and X1, X2, +, − overlap where their operands are ready, stretching execution to 7 cycles.]
Mismatch between software and
hardware parallelism - 3
[Figure: the same problem on two single-issue processors. Extra stores (S1, S2) and loads (L5, L6) are inserted for synchronization between the processors; one processor produces A and the other B, and execution takes 6 cycles.]
Types of Software Parallelism
Control Parallelism – two or more operations can
be performed simultaneously. This can be
detected by a compiler, or a programmer can
explicitly indicate control parallelism by using
special language constructs or dividing a program
into multiple processes.
Control parallelism appears in the form of pipelining
or multiple functional units, and is limited by the pipeline
length and by the multiplicity of functional units.
Both are handled by the hardware.
Types of Software Parallelism
Data parallelism – multiple data elements have the
same operations applied to them at the same time.
This offers the highest potential for concurrency
(in SIMD and MIMD modes). Synchronization in
SIMD machines is handled by hardware.
Data-parallel code is easier to write and to
debug.
Solving the Mismatch Problems
Develop compilation support
Redesign hardware for more efficient exploitation
by compilers
Use large register files and sustained instruction
pipelining.
Have the compiler fill the branch and load delay
slots in code generated for RISC processors.
The Role of Compilers
Compilers are used to exploit hardware features to
improve performance.
Interaction between compiler and architecture
design is a necessity in modern computer
development.
It is not necessarily the case that more software
parallelism will improve performance in
conventional scalar processors.
The hardware and compiler should be designed at
the same time.