Chapter 4
Parallel Processing Concepts
• 4.1 Program flow mechanism
• 4.2 Control flow versus data flow; A data flow
Architecture
• 4.3 Demand driven mechanism; Reduction machine
model
• 4.4 Comparison of flow mechanisms
• 4.5 Coroutines; Fork and Join, Data flow, ParBegin and ParEnd
• 4.6 Processes; Remote Procedure Call
• 4.7 Implicit Parallelism; Explicit versus implicit
parallelism
Introduction
• Program flow mechanisms will be introduced.
• The data-driven, demand-driven, and control-driven approaches will be described.
• Typical architectures of such systems will be given in this chapter.
• Parallel processing concepts and the fundamentals of parallel processing will be presented.
4.1 Program flow mechanism
• Conventional computers are based on a control-flow mechanism, by which the order of program execution is explicitly stated in the user program.
• Data-flow computers are based on a data-driven mechanism, in which the execution of any instruction is driven by data (operand) availability.
• Dataflow computers emphasize a high degree of parallelism at the fine-grain, instruction level.
• Reduction computers are based on a demand-driven mechanism, which initiates an operation based on the demand for its results by other computations.
4.2 Control flow versus data flow
• Von Neumann computers use a program counter (PC) to sequence the execution of instructions in a program.
– The PC is sequenced by the instruction flow in a program.
– This sequential execution style has been called control-driven.
– Control flow computers use shared memory to hold program instructions and data objects.
– Variables in shared memory are updated by many instructions.
– This may produce side effects since memory is shared.
– These side effects may prevent parallelism (see the sketch after this list).
– Control flow can be made parallel by using parallel language constructs or parallelizing compilers.
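A minimal C sketch of such a side effect (the variables sum, a, and b are hypothetical): both instructions touch the same shared variable, so their order is fixed and they cannot safely run in parallel.

#include <stdio.h>

int sum = 0;                /* shared memory object */

int main(void)
{
    int a = 3, b = 4;
    sum = sum + a;          /* instruction 1 writes sum           */
    sum = sum * b;          /* instruction 2 reads what 1 wrote   */
    printf("%d\n", sum);    /* prints 12; swapping the order would give 3 */
    return 0;
}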
• In data flow computers, the execution of an instruction is driven by data availability instead of being guided by a program counter.
– The instructions in a data driven program are not ordered in any way.
– Computational results (data tokens) are passed directly between instructions.
– The data generated by an instruction is duplicated into many copies and forwarded directly to all needy instructions.
– The data-driven scheme requires no shared memory, no program counter, and no control sequencer.
– It requires a special mechanism to detect data availability and to match tokens with needy instructions.
– This implies the need for handshaking or token-matching operations.
– Data flow computers exploit fine-grain parallelism, as the sketch below illustrates.
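The following is a minimal C sketch of data-driven firing, not any actual machine design: a node fires as soon as both of its operand tokens have been matched, with no program counter involved. All names are illustrative.

#include <stdio.h>

typedef struct {
    int have_left, have_right;   /* token-matching flags            */
    double left, right;          /* matched operand tokens          */
    char op;                     /* operation: '+' or '*'           */
} Node;

/* Deliver one data token to a node; the node fires when both operands match. */
static int send_token(Node *n, int is_left, double v, double *result)
{
    if (is_left) { n->left = v;  n->have_left = 1; }
    else         { n->right = v; n->have_right = 1; }
    if (n->have_left && n->have_right) {
        *result = (n->op == '+') ? n->left + n->right
                                 : n->left * n->right;
        return 1;                /* fired: result token produced    */
    }
    return 0;                    /* still waiting for the other token */
}

int main(void)
{
    Node add = { 0, 0, 0.0, 0.0, '+' };
    double out;
    send_token(&add, 1, 2.0, &out);      /* first token arrives: no firing yet */
    if (send_token(&add, 0, 3.0, &out))  /* second token arrives: node fires   */
        printf("fired: %g\n", out);      /* prints 5 */
    return 0;
}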
A Data Flow Architecture
• There are a few experimental data flow computer projects.
• MIT developed a tagged-token architecture for building data flow computers.
• (See Hwang, Fig. 2.12, p. 72.)
• n PEs are interconnected by an n×n routing network.
• The system supports pipelined dataflow operations in all n PEs.
• The machine provides a low-level token-matching mechanism. Instructions are stored in the program memory.
• Tagged tokens enter each PE through its local path.
• Tokens are also passed to other PEs through the routing network.
• Each instruction represents a synchronization operation.
• Another synchronization mechanism, the I-structure, is a tagged memory unit that allows overlapping use of a data structure by both the producer and consumer processes.
• An I-structure uses a 2-bit tag indicating whether a word is empty, full, or has a pending request.
• This is a departure from the pure data flow approach; a small sketch of an I-structure word follows.
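A small C sketch of an I-structure word, assuming the 2-bit tag encodes the three states named above; the names and the single-process form are illustrative only.

typedef enum { EMPTY, FULL, PENDING } Tag;   /* fits in 2 bits */

typedef struct {
    Tag tag;
    int value;         /* valid only when tag == FULL */
} IWord;

/* Consumer: a read of a FULL word succeeds; a read of an EMPTY word is
   deferred by marking it PENDING until the producer writes it. */
int iread(IWord *w, int *out)
{
    if (w->tag == FULL) { *out = w->value; return 1; }
    w->tag = PENDING;      /* request recorded; consumer must retry/wait */
    return 0;
}

/* Producer: a write fills the word; any PENDING readers may then proceed. */
void iwrite(IWord *w, int v)
{
    w->value = v;
    w->tag = FULL;
}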
• Comparison of data and control flow machines: see Hwang, Fig. 2.13, p. 73.
• Data flow computers can absorb communication latency and minimize losses due to synchronization waits.
• Data flow offers an ideal model for massively parallel computations because all far-reaching side effects are removed.
4.3 Demand driven mechanism
• In a reduction machine, the computation is triggered by the demand for an operation's results.
• Consider a = ((b+1)*c - (d/e)).
• A data driven computation chooses a bottom-up approach, starting from the innermost operations,
• b+1 and d/e,
• then proceeding to the * operation, and finally the -.
• Such a computation is called eager evaluation, because operations are carried out immediately after all their operands become available.
• A demand driven computation chooses a top-down approach, first demanding the value of a, which triggers the demand for evaluating the next-level expressions (b+1)*c and d/e, and then b+1.
• A demand driven computation corresponds to lazy evaluation, because operations are executed only when their results are required by another instruction (see the sketch at the end of this section).
• The demand driven approach matches naturally with the functional programming concept.
• The removal of side effects in functional programming makes programs easier to parallelize.
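A small C sketch contrasting the two evaluation orders on the example expression; the function-pointer thunks stand in for demand propagation and are not a real reduction machine. All names and the sample values are illustrative.

#include <stdio.h>

double b = 4, c = 2, d = 9, e = 3;

/* Eager (data driven): inner operations run as soon as operands exist. */
double eager(void)
{
    double t1 = b + 1;       /* fires first: operands b and 1 available */
    double t2 = d / e;       /* fires first: operands d and e available */
    double t3 = t1 * c;      /* fires once t1 arrives                   */
    return t3 - t2;          /* fires last                              */
}

/* Lazy (demand driven): nothing runs until 'a' is demanded; the demand
   then propagates top-down through the expression tree. */
double mul_part(void) { return (b + 1) * c; }   /* evaluated only on demand */
double div_part(void) { return d / e; }         /* evaluated only on demand */
double demand_sub(double (*l)(void), double (*r)(void)) { return l() - r(); }

int main(void)
{
    printf("eager: %g\n", eager());                        /* 7 */
    printf("lazy:  %g\n", demand_sub(mul_part, div_part)); /* 7 */
    return 0;
}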
Reduction machine model
• In a string reduction model, each demander gets a
separate copy of the expression for its own
evaluation.
• The operator is suspended while its input
arguments are being evaluated.
• Different parts of the program graph, or sub-regions, can be reduced or evaluated in parallel upon demand.
• The determined value (a copy) is returned to the original demanding instruction.
4.4 Comparison of flow mechanisms
• Data, control, and demand flow mechanisms are compared in Hwang, Table 2.1, p. 76.
• The degree of explicit control decreases from control driven to demand driven to data driven.
• Advantages and disadvantages of each are given in the table.
• Both the data driven and demand driven mechanisms, despite their higher potential for parallelism, are still in the research stage.
• Control flow machines still dominate the market.
4.5 Coroutines
• The fundamental design characteristic is the single-processor model.
• There is only one instruction stream with sequential flow control.
• The processor, as a system resource, can be engaged and released by a set of coroutines in an orderly manner.
• A quasi-parallel execution takes place between two or more coroutines.
• Execution starts with the call of one particular coroutine (as a kind of procedure).
• Each coroutine may contain any number of transfer statements that switch the flow of control to a different coroutine.
• This is not a procedure call.
• The transfer of control has to be explicitly specified by the application programmer (who also makes sure that the flow of control is transferred at the correct points).
• In Modula-2, the procedure TRANSFER is provided in order to switch the flow of control between coroutines:
• PROCEDURE TRANSFER (VAR source, destination: ADDRESS);
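For comparison, a minimal C sketch of the same transfer idea using the POSIX ucontext interface, where swapcontext() plays the role of TRANSFER; this is an analogy under POSIX assumptions, not Modula-2's implementation.

#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static char stack[64 * 1024];            /* stack for the coroutine */

static void coroutine(void)
{
    printf("coroutine: step 1\n");
    swapcontext(&co_ctx, &main_ctx);     /* TRANSFER back to main   */
    printf("coroutine: step 2\n");
    swapcontext(&co_ctx, &main_ctx);
}

int main(void)
{
    getcontext(&co_ctx);                 /* initialize the coroutine context */
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = sizeof stack;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, coroutine, 0);

    swapcontext(&main_ctx, &co_ctx);     /* TRANSFER(main, coroutine)        */
    printf("main: between transfers\n");
    swapcontext(&main_ctx, &co_ctx);     /* resume coroutine at step 2       */
    printf("main: done\n");
    return 0;
}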
Fork and Join
• The fork and join constructs are among the earliest parallel language constructs.
• It is possible to start parallel processes in the Unix operating system with the fork operation and to wait for their termination with the wait operation.
• In this type of parallel programming, two fundamentally different concepts are mixed: first, the declaration of parallel processes; and second, the synchronization of the processes.
• Actually, the functionality of the fork operation in Unix is not as general as shown in figure 4.2.
• Instead, an identical copy of the calling process is generated, which then executes in parallel to the original.
• The only means a process has of determining its identity is an identification number.
• In order to start a different program and wait for its termination, the two Unix calls can be embedded in the C language in the following way:
int status;
if (fork() == 0)
    execlp("program_B", ...);   /* child process: start program B  */
...                             /* parent process continues here   */
wait(&status);                  /* wait for the child to terminate */
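In a complete program this snippet also needs the <unistd.h> and <sys/wait.h> headers, and the argument list of execlp must end with a null pointer; the elided arguments above are the command-line arguments passed to program_B.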
Program Segments
• Master-Slave programming
Program Segments (2)
• Program code
• Global variable declarations are duplicated in each process; after the fork, each process works on its own copy (a sketch follows below).
• The call to the fork operation returns the process number of the child process to the parent process (a value not equal to 0).
• To the child, fork returns 0.
• The child immediately executes the execlp operation.
• The parent process can wait for the termination of the child process (the wait operation).
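A small C sketch illustrating this duplication (counter is a hypothetical global): the child's update is invisible to the parent, since each process owns a separate copy after the fork.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int counter = 0;      /* global: duplicated into both processes */

int main(void)
{
    if (fork() == 0) {            /* child */
        counter = 100;
        printf("child:  counter = %d\n", counter);   /* 100 */
        return 0;
    }
    wait(NULL);                   /* parent: wait for the child  */
    printf("parent: counter = %d\n", counter);       /* still 0 */
    return 0;
}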
ParBegin and ParEnd
• Blocks of parallel code are defined with ParBegin and
ParEnd (cobegin and coend) in a manner analogous to the
sequential begin and end.
• However, the instructions in the block are carried out simultaneously.
• This construct is used in the language AL to control several robots and coordinate them.
• Synchronization between processes occurs through semaphores.
• Due to the restrictions mentioned (synchronization, etc.), this statement concept has found little application in modern programming languages. A sketch of the idea with threads follows.
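A minimal C sketch of the ParBegin/ParEnd idea expressed with POSIX threads, which is one conventional way to obtain the effect today; stmt1 and stmt2 are placeholder statements (compile with -lpthread).

#include <stdio.h>
#include <pthread.h>

void *stmt1(void *arg) { printf("statement 1\n"); return NULL; }
void *stmt2(void *arg) { printf("statement 2\n"); return NULL; }

int main(void)
{
    pthread_t t1, t2;
    /* ParBegin: start both statements concurrently */
    pthread_create(&t1, NULL, stmt1, NULL);
    pthread_create(&t2, NULL, stmt2, NULL);
    /* ParEnd: block until both parallel statements have finished */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}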
4.6 Processes
• Processes are declared similarly to procedures and are started with a specific instruction.
• If several copies of a process need to be executed, that process type must be started with multiple calls, possibly with different parameters.
• The synchronization between processes executing in parallel may be controlled through the concepts of semaphores or monitors with condition variables (a sketch follows at the end of this section).
• The explicit synchronization of parallel processes exacts not only an additional control cost but is also extremely susceptible to errors and deadlocks.
• Communication and synchronization are accomplished in systems with shared memory (“tightly coupled”) via monitors with condition variables.
• In systems without shared memory (“loosely coupled”), message passing is used; the concept is illustrated in figure 4.4.
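A small C sketch of a monitor-like object built from a POSIX mutex and condition variable; this is one conventional realization of the concept, and all names are illustrative.

#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;     /* monitor entry lock; initialize with       */
    pthread_cond_t  nonzero;  /* pthread_mutex_init / pthread_cond_init    */
    int count;
} Monitor;

/* "Entry procedures" run with the monitor lock held. */
void mon_put(Monitor *m)
{
    pthread_mutex_lock(&m->lock);
    m->count++;
    pthread_cond_signal(&m->nonzero);   /* wake one waiting process */
    pthread_mutex_unlock(&m->lock);
}

void mon_get(Monitor *m)
{
    pthread_mutex_lock(&m->lock);
    while (m->count == 0)               /* wait on the condition variable */
        pthread_cond_wait(&m->nonzero, &m->lock);
    m->count--;
    pthread_mutex_unlock(&m->lock);
}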
Remote procedure call
• In order to extend the process concept to a parallel computer system without shared memory, the communication between processes located on different processors has to be carried out by message passing.
• The programming system is divided into multiple parallel processes, where each process takes on the role of either a client or a server.
• Each server can also become a client by using the services of another server.
• Each client confers tasks on one or more appropriately configured server processes.
• This type of parallel task distribution is implemented with the remote procedure call (RPC) mechanism.
• Here, a remote procedure call resembles a task deposit operation.
• Returning the results after the server's calculation requires another explicit data exchange in the opposite direction, as the sketch below illustrates.
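A minimal C sketch of the task-deposit pattern using two pipes between a client (parent) and a server (child); a real RPC system adds stubs, parameter marshalling, and an error-tolerant protocol on top of such message passing. The squaring service is purely illustrative.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int req[2], rep[2];                   /* request and reply channels    */
    pipe(req);
    pipe(rep);

    if (fork() == 0) {                    /* server process                */
        int x;
        read(req[0], &x, sizeof x);       /* receive the deposited task    */
        x = x * x;                        /* perform the service           */
        write(rep[1], &x, sizeof x);      /* explicit reply message        */
        return 0;
    }

    int arg = 7, result;                  /* client process                */
    write(req[1], &arg, sizeof arg);      /* "remote call": send request   */
    read(rep[0], &result, sizeof result); /* wait for the result           */
    printf("server returned %d\n", result); /* 49 */
    wait(NULL);
    return 0;
}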
• Problems with the remote procedure call include the need for error-tolerant protocols for resetting, or possibly restarting, the client after a server failure.
4.7 Implicit parallelism
• All parallel concepts covered so far use special, explicit language constructs for controlling parallel execution.
• Several languages do not require any language constructs for parallelism, but nevertheless allow parallel processing.
• Such programming languages are called languages with implicit parallelism.
• The programmer is much more limited in controlling the parallel processors that execute the program (efficient parallelization is left to an intelligent compiler).
• The compiler has no interaction with the application programmer (declarative languages represent knowledge or problems to be solved using complex mathematical formulas).
• Vector expressions, for example in functional programming languages, exhibit implicit parallelism.
• As shown in figure 4.7, the mathematical notation of a matrix addition contains implicit parallelism that can quite easily be mapped onto a parallel computer architecture through automatic parallelization.
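A small C sketch of the matrix addition example: the loop iterations are independent, so an auto-parallelizing or vectorizing compiler may run them concurrently although the source contains no parallel construct (N and the function name are illustrative).

#define N 100

void matrix_add(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i][j] = a[i][j] + b[i][j];   /* independent for every (i, j) */
}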
Explicit versus Implicit parallelism
• A summary of the advantages and disadvantages of explicit and implicit parallelism is presented in figure 4.8.
• Programming occurs at a high level of abstraction; for this reason, implicit parallelism is often found in higher-level, non-procedural languages.
• In contrast, explicit parallelism gives the programmer considerably more flexibility, which can lead to better processor utilization and higher performance.
• This advantage is paid for with a more complicated and more error-prone programming method.