PARALLELISM
Executing two or more operations or streams of instructions at the same time is known as
parallelism.
The amount of parallelism available within a basic block (a straight-line code sequence with no
branches in except at the entry and no branches out except at the exit) is quite small. For typical
MIPS programs, the average dynamic branch frequency is often between 15% and 25%, meaning
that between three and six instructions execute between a pair of branches. Since these instructions
are likely to depend upon one another, the amount of overlap we can exploit within a basic block is
likely to be less than the average basic block size. To obtain substantial performance enhancements,
we must exploit instruction-level parallelism (ILP) across multiple basic blocks. The simplest and
most common way to increase ILP is to exploit parallelism among iterations of a loop. This type of
parallelism is often called loop-level parallelism. Here is a simple example of a loop that adds two
1000-element arrays and is completely parallel:
for (i=0; i<=999; i=i+1) x[i] = x[i] + y[i];
Every iteration of the loop can overlap with any other iteration, although within each loop iteration
there is little or no opportunity for overlap.
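As a rough illustration of why this loop is completely parallel, the Python sketch below runs the same iterations in an arbitrary order and compares the result with the sequential order. The function name `add_arrays_in_order`, the sample data, and the shuffled schedule are assumptions made for this example, not part of the original notes.

```python
import random

def add_arrays_in_order(x, y, order):
    """Apply x[i] = x[i] + y[i] for each index i, visiting indices in `order`."""
    result = list(x)  # work on a copy so the caller's list is untouched
    for i in order:
        result[i] = result[i] + y[i]
    return result

n = 1000
x = list(range(n))
y = [2 * v for v in range(n)]

# Sequential schedule: i = 0, 1, ..., 999, as in the original loop.
sequential = add_arrays_in_order(x, y, range(n))

# Hypothetical "any order" schedule: the same iterations, shuffled.
shuffled_order = list(range(n))
random.Random(0).shuffle(shuffled_order)
shuffled = add_arrays_in_order(x, y, shuffled_order)

# No iteration reads a value written by another, so the order is irrelevant.
print(sequential == shuffled)
```

Because iteration i touches only x[i] and y[i], any schedule of the iterations, including running them at the same time, gives the same arrays.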
GOALS
• The purpose of parallel processing is to speed up the computer's processing capability; in
other words, it increases the computational speed.
• The system may have two or more processors operating concurrently.
• It improves the performance of the computer for a given clock speed.
TYPES OF PARALLELISM
1) Instruction Level Parallelism (ILP)
• Pipelining
• Superscalar
2) Processor Level Parallelism (PLP)
• Array Computer
• Multiprocessor
1) INSTRUCTION PIPELINING
• An instruction pipeline reads consecutive instructions from memory while previous
instructions are being executed in other segments.
• The computer needs to process each instruction with the following sequence of steps.
Pipelining Steps
• Fetch the instruction from memory
• Decode the instruction
• Calculate the effective address
• Fetch the operands from memory
• Execute the instruction
• Store the result in the proper place
Flow Diagram
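A pipeline's flow is commonly drawn as a space-time diagram, with one row per instruction and one column per clock cycle. The sketch below prints such a diagram for the six steps listed above. It assumes an idealized pipeline in which every step takes exactly one cycle and no conflicts occur; the stage abbreviations (FI, DI, CA, FO, EX, SR) and the function names are made up for this example.

```python
# Hypothetical abbreviations for the six steps: fetch instruction, decode
# instruction, calculate address, fetch operands, execute, store result.
STAGES = ["FI", "DI", "CA", "FO", "EX", "SR"]

def pipeline_diagram(num_instructions, stages=STAGES):
    """Return one row per instruction; column c holds the stage active in cycle c."""
    total_cycles = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters the pipeline in cycle i
        rows.append(row)
    return rows

def ideal_speedup(num_instructions, num_stages):
    """Unpipelined cycles divided by pipelined cycles, assuming no stalls."""
    unpipelined = num_instructions * num_stages
    pipelined = num_stages + num_instructions - 1
    return unpipelined / pipelined

for number, row in enumerate(pipeline_diagram(4), start=1):
    print(f"I{number}: " + " ".join(row))
print("ideal speedup for 4 instructions:", ideal_speedup(4, len(STAGES)))
```

Under these assumptions, n instructions on a k-stage pipeline finish in k + n - 1 cycles instead of n * k, so the ideal speedup approaches k as n grows. Real pipelines fall short of this because of the conflicts described next.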
Pipelining Conflicts
• Resource conflicts are caused by two segments accessing memory at the same time. These may
be resolved by using separate instruction and data memories.
• Data dependency conflicts arise when an instruction depends on the result of a previous
instruction, but this result is not yet available.
2) SUPERSCALAR
Superscalar execution is execution in which multiple execution units are used to
execute multiple instructions in parallel. In typical superscalar processors, the instructions
executing simultaneously are adjacent in the original program order.
• A superscalar CPU architecture implements a form of parallelism called instruction-level
parallelism within a single processor.
• It therefore allows faster CPU throughput than would otherwise be possible at a given clock
rate.
• A superscalar processor executes more than one instruction during a clock cycle by
simultaneously dispatching multiple instructions to redundant functional units on the processor.
• The term superscalar, first coined in 1987, refers to a machine that is designed to improve the
performance of the execution of scalar instructions.
Superscalar Implementation
• A superscalar implementation of a processor architecture is one in which common
instructions (integer and floating-point arithmetic, loads, stores, and conditional branches) can
be initiated simultaneously and executed independently.
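To make the idea of issuing adjacent instructions together more concrete, here is a minimal sketch of an in-order issue stage. It is a simplification under stated assumptions: instructions are hypothetical (text, destination, sources) tuples, every instruction takes one cycle, functional-unit limits and branches are ignored, and an instruction may join the current group only if it neither reads nor writes a register written earlier in that group. The function name `issue_groups` and the sample program are invented for illustration.

```python
def issue_groups(instructions, width=2):
    """Pack adjacent instructions into issue groups of at most `width`.

    Each instruction is (text, dest_register_or_None, [source_registers]).
    A group is closed when it is full or when the next instruction depends
    on (or overwrites) a register written by an instruction in the group.
    """
    groups = []
    current = []
    written = set()  # registers written by instructions in the current group
    for text, dest, sources in instructions:
        conflict = any(src in written for src in sources) or dest in written
        if current and (len(current) == width or conflict):
            groups.append(current)
            current, written = [], set()
        current.append(text)
        if dest is not None:
            written.add(dest)
    if current:
        groups.append(current)
    return groups

# Hypothetical five-instruction program used only for this example.
program = [
    ("ADD R1,R2,R3", "R1", ["R2", "R3"]),
    ("SUB R4,R5,R6", "R4", ["R5", "R6"]),
    ("MUL R7,R1,R4", "R7", ["R1", "R4"]),
    ("ADD R8,R7,R2", "R8", ["R7", "R2"]),
    ("SUB R9,R5,R6", "R9", ["R5", "R6"]),
]

for cycle, group in enumerate(issue_groups(program), start=1):
    print(f"cycle {cycle}: {group}")
```

In this toy model the five instructions issue in three groups rather than five single-instruction cycles; the instruction that reads R7 cannot share a group with the instruction that produces it.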
Why Do We Use Superscalar?
• The CPU hardware dynamically checks for data dependencies between instructions at run time
(versus software checking at compile time).
• The CPU accepts multiple instructions per clock cycle.
• The branch instruction processing
Superscalar Organization
Data Dependences
There are three different types of dependences: data dependences (also called true data dependences),
name dependences, and control dependences. An instruction j is data dependent on instruction i if
either of the following holds:
■ Instruction i produces a result that may be used by instruction j.
■ Instruction j is data dependent on instruction k, and instruction k is data dependent on instruction i.
For example, consider the following MIPS code sequence that increments a vector of values in memory
(starting at 0(R1) and with the last element at 8(R2)) by a scalar in register F2. (For simplicity,
throughout this chapter, our examples ignore the effects of delayed branches.)
Loop: L.D    F0,0(R1)    ;F0=array element
      ADD.D  F4,F0,F2    ;add scalar in F2
      S.D    F4,0(R1)    ;store result
      DADDIU R1,R1,#-8   ;decrement pointer 8 bytes
      BNE    R1,R2,Loop  ;branch R1!=R2
The data dependences in this code sequence involve both floating-point data:
      L.D F0,0(R1)  ->  ADD.D F4,F0,F2  ->  S.D F4,0(R1)
and integer data:
      DADDIU R1,R1,#-8  ->  BNE R1,R2,Loop
In both of the above dependent sequences, as shown by the arrows, each instruction depends on the
previous one. The arrows here and in following examples show the order that must be preserved for
correct execution. The arrow points from an instruction that must precede the instruction that the
arrowhead points to. If two instructions are data dependent, they must execute in order and cannot
execute simultaneously or be completely overlapped. The dependence implies that there would be a
chain of one or more data hazards between the two instructions. (See Appendix C for a brief description
of data hazards, which we will define precisely in a few pages.) Executing the instructions simultaneously
will cause a processor with pipeline interlocks (and a pipeline depth longer than the distance between
the instructions in cycles) to detect a hazard and stall, thereby reducing or eliminating the overlap. In a
processor without interlocks that relies on compiler scheduling, the compiler cannot schedule
dependent instructions in such a way that they completely overlap, since the program will not execute
correctly. The presence of a data dependence in an instruction sequence reflects a data dependence in
the source code from which the instruction sequence was generated. The effect of the original data
dependence must be preserved.
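The two-part definition of data dependence (a direct producer-consumer link, or a chain of such links) can be sketched in a few lines. The example below encodes one iteration of the loop as hypothetical (name, destination, sources) tuples, finds the direct register dependences, and then takes their transitive closure. It considers registers only, ignores dependences through memory and across iterations, and all function names are assumptions for this illustration.

```python
# One iteration of the loop: (mnemonic, destination register, source registers).
LOOP_BODY = [
    ("L.D",    "F0", ["R1"]),
    ("ADD.D",  "F4", ["F0", "F2"]),
    ("S.D",    None, ["F4", "R1"]),
    ("DADDIU", "R1", ["R1"]),
    ("BNE",    None, ["R1", "R2"]),
]

def direct_dependences(instructions):
    """Return pairs (i, j) where j reads a register most recently written by i."""
    last_writer = {}
    pairs = set()
    for j, (_, dest, sources) in enumerate(instructions):
        for reg in sources:
            if reg in last_writer:
                pairs.add((last_writer[reg], j))
        if dest is not None:
            last_writer[dest] = j
    return pairs

def transitive_closure(pairs):
    """Add (i, j) whenever (i, k) and (k, j) are both present, until stable."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (i, k) in list(closure):
            for (k2, j) in list(closure):
                if k == k2 and (i, j) not in closure:
                    closure.add((i, j))
                    changed = True
    return closure

direct = direct_dependences(LOOP_BODY)
everything = transitive_closure(direct)
for i, j in sorted(everything):
    kind = "direct" if (i, j) in direct else "via chain"
    print(f"{LOOP_BODY[j][0]} depends on {LOOP_BODY[i][0]} ({kind})")
```

The direct pairs reproduce the floating-point chain (L.D, ADD.D, S.D) and the integer chain (DADDIU, BNE); the closure adds the indirect dependence of S.D on L.D.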
Loop: L.D    F0,0(R1)    ;F0=array element
      ADD.D  F4,F0,F2    ;add scalar in F2
      S.D    F4,0(R1)    ;store result
      DADDIU R1,R1,#-8   ;decrement pointer
                         ;8 bytes (per DW)
      BNE    R1,R2,Loop  ;branch R1!=R2
Name Dependences
The second type of dependence is a name dependence. A name dependence occurs when two
instructions use the same register or memory location, called a name, but there is no flow of data
between the instructions associated with that name. There are two types of name dependences
between an instruction i that precedes instruction j in program order:
1. An antidependence between instruction i and instruction j occurs when instruction j writes a register
or memory location that instruction i reads. The original ordering must be preserved to ensure that i
reads the correct value. In the loop example above, there is an antidependence between S.D and DADDIU
on register R1.
2. An output dependence occurs when instruction i and instruction j write the same register or memory
location. The ordering between the instructions must be preserved to ensure that the value finally
written corresponds to instruction j.
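The two kinds of name dependence can be found mechanically by comparing register reads and writes. The sketch below is a simplified pairwise check, assuming the same hypothetical (name, destination, sources) encoding of the loop body; it looks at registers only and does not account for intervening writes to the same register. The function name `name_dependences` and the two-instruction `OVERWRITE` example are invented for illustration.

```python
# One iteration of the loop: (mnemonic, destination register, source registers).
LOOP_BODY = [
    ("L.D",    "F0", ["R1"]),
    ("ADD.D",  "F4", ["F0", "F2"]),
    ("S.D",    None, ["F4", "R1"]),
    ("DADDIU", "R1", ["R1"]),
    ("BNE",    None, ["R1", "R2"]),
]

def name_dependences(instructions):
    """Return (anti, output) lists of (earlier, later, register) triples."""
    anti, output = [], []
    for i, (name_i, dest_i, sources_i) in enumerate(instructions):
        for j in range(i + 1, len(instructions)):
            name_j, dest_j, _ = instructions[j]
            if dest_j is None:
                continue  # later instruction writes no register
            if dest_j in sources_i:
                anti.append((name_i, name_j, dest_j))    # j writes what i reads
            if dest_j == dest_i:
                output.append((name_i, name_j, dest_j))  # both write same name
    return anti, output

anti, output = name_dependences(LOOP_BODY)
print("antidependences:", anti)
print("output dependences:", output)

# Hypothetical pair that writes the same register twice.
OVERWRITE = [("ADD", "R1", ["R2"]), ("SUB", "R1", ["R3"])]
print("output dependences:", name_dependences(OVERWRITE)[1])
```

On the loop body this reports the S.D/DADDIU antidependence on R1 mentioned in the text (and the analogous one between L.D and DADDIU), and no output dependences.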
Hazards
A hazard exists whenever there is a name or data dependence between instructions and they are close
enough that the overlap during execution would change the order of access to the operand involved in
the dependence. Because of the dependence, we must preserve what is called program order, that is,
the order that the instructions would execute in if executed sequentially one at a time as determined
by the original source program. The goal of both our software and hardware techniques is to exploit
parallelism by preserving program order only where it affects the outcome of the program. Detecting
and avoiding hazards ensures that necessary program order is preserved. Data hazards, which are
informally described in Appendix C, may be classified as one of three types, depending on the order of
read and write accesses in the instructions. By convention, the hazards are named by the ordering in
the program that must be preserved by the pipeline. Consider two instructions i and j, with i preceding
j in program order. The possible data hazards are
■ RAW (read after write): j tries to read a source before i writes it, so j incorrectly gets the old value.
This hazard is the most common type and corresponds to a true data dependence. Program order must
be preserved to ensure that j receives the value from i.
■ WAW (write after write): j tries to write an operand before it is written by i. The writes end up being
performed in the wrong order, leaving the value written by i rather than the value written by j in the
destination. This hazard corresponds to an output dependence. WAW hazards are present only in
pipelines that write in more than one pipe stage or allow an instruction to proceed even when a
previous instruction is stalled.
■ WAR (write after read): j tries to write a destination before it is read by i, so i incorrectly gets the
new value. This hazard arises from an antidependence (or name dependence). WAR hazards cannot
occur in most static issue pipelines (even deeper pipelines or floating-point pipelines) because all
reads are early and all writes are late.
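The effect of each hazard can be seen by executing a pair of instructions in program order and then in the swapped order. The following sketch models registers as a dictionary and instructions as hypothetical (destination, compute) pairs; the register names, starting values, and the three example pairs are assumptions chosen only to make each hazard visible.

```python
def run(instructions, registers):
    """Execute (dest, compute) pairs one at a time; return the final registers."""
    regs = dict(registers)
    for dest, compute in instructions:
        regs[dest] = compute(regs)
    return regs

START = {"R1": 0, "R2": 10, "R3": 20, "R4": 0}

# For each hazard: (instruction i, instruction j), with i first in program order.
CASES = {
    # j reads R1, which i writes.
    "RAW": (("R1", lambda r: r["R2"] + r["R3"]), ("R4", lambda r: r["R1"] + 1)),
    # i and j both write R1.
    "WAW": (("R1", lambda r: 1), ("R1", lambda r: 2)),
    # j writes R1, which i reads.
    "WAR": (("R4", lambda r: r["R1"] + 1), ("R1", lambda r: 99)),
}

for kind, (first, second) in CASES.items():
    in_order = run([first, second], START)
    swapped = run([second, first], START)
    print(kind, "in order:", in_order)
    print(kind, "swapped: ", swapped)
```

In the RAW case the swapped order makes j compute with the old R1; in the WAW case the value from i is left in R1; in the WAR case i picks up the new R1. In each case the swapped order leaves different register contents than program order does.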
Control Dependences
The last type of dependence is a control dependence. A control dependence determines the ordering of
an instruction, i, with respect to a branch instruction so that instruction i is executed in correct program
order and only when it should be. Every instruction, except for those in the first basic block of the
program, is control dependent on some set of branches, and, in general, these control dependences
must be preserved to preserve program order. One of the simplest examples of a control dependence is
the dependence of the statements in the "then" part of an if statement on the branch. For example, in
the code segment
if p1 {
    S1;
};
if p2 {
    S2;
}
S1 is control dependent on p1, and S2 is control dependent on p2 but not on p1. In general, two
constraints are imposed by control dependences:
1. An instruction that is control dependent on a branch cannot be moved before the branch so that its
execution is no longer controlled by the branch. For example, we cannot take an instruction from the
then portion of an if statement and move it before the if statement.
2. An instruction that is not control dependent on a branch cannot be moved after the branch so that
its execution is controlled by the branch. For example, we cannot take a statement before the if
statement and move it into the then portion.
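Both constraints can be demonstrated with a toy program whose statements simply record that they ran. In the sketch below, S0 stands for a statement placed before the if statement and S1 for the statement in its then portion; the function names and the use of a list as an execution trace are assumptions for illustration only.

```python
def original(p1):
    """S0 runs unconditionally; S1 runs only when p1 holds."""
    trace = ["S0"]
    if p1:
        trace.append("S1")
    return trace

def s1_hoisted(p1):
    """Violates constraint 1: S1 moved before the branch, so it always runs."""
    trace = ["S0", "S1"]
    if p1:
        pass
    return trace

def s0_sunk(p1):
    """Violates constraint 2: S0 moved into the then portion, so it may not run."""
    trace = []
    if p1:
        trace.append("S0")
        trace.append("S1")
    return trace

for p1 in (True, False):
    print(p1, original(p1), s1_hoisted(p1), s0_sunk(p1))
```

When p1 is true all three versions record the same statements, which is why such a transformation can look harmless; the difference shows up only when p1 is false.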
Processor Level Parallelism (Machine Parallelism)
• In a multiprocessing system, all CPUs may be equal, or some may be reserved for special
purposes.
• In multiprocessing, the processors can be used to execute a single sequence of instructions in
multiple contexts.
• Multiprocessing is the use of two or more central processing units (CPUs) within a single
computer system.
• The term also refers to the ability of a system to support more than one processor and/or the
ability to allocate tasks between them.
• Multiprocessing sometimes refers to the execution of multiple concurrent software processes in
a system as opposed to a single process at any one instant.
• The terms multitasking or multiprogramming are more appropriate to describe this concept,
which is implemented mostly in software, whereas multiprocessing is more appropriate to
describe the use of multiple hardware CPUs.
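Amdahl's Law
• Amdahl's law, also known as Amdahl's argument, is named after computer architect Gene
Amdahl, and is used to find the maximum expected improvement to an overall system when
only part of the system is improved.
• Amdahl's law states that if a fraction P of the running time is sped up by a factor S, the
overall speedup of applying the improvement will be:
Old Running Time = 1
New Running Time = (1 - P) + P/S
Speedup = Old Running Time / New Running Time = 1 / ((1 - P) + P/S)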
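As a small worked example of the running-time expressions above, the sketch below takes P to be the fraction of the original running time that is improved and S to be the speedup of that fraction (the usual reading of the symbols), and divides the old running time by the new one. The function names and the sample values are assumptions for illustration.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the run time is sped up by factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    old_time = 1.0
    new_time = (1.0 - p) + p / s
    return old_time / new_time

def speedup_limit(p):
    """Upper bound on the speedup as s grows without limit."""
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)

# Hypothetical values: half of the run time is made twice as fast.
print(amdahl_speedup(0.5, 2))
# Hypothetical values: 90% of the run time is made ten times as fast.
print(amdahl_speedup(0.9, 10))
print(speedup_limit(0.9))
```

The limit shows the main consequence of the law: however large S becomes, the part that is not improved, 1 - P, bounds the overall speedup at 1 / (1 - P).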
parallelism

  • 1.
  • 2. PARALELLISM Executingtwoor more operationsorstreamsof instructionatthe same time is knownas Parallelism. The amount of parallelismavailable withinabasicblock—astraight-line code sequencewithno branchesinexcepttothe entryandno branchesout exceptatthe exit—isquitesmall.Fortypical MIPS programs,the average dynamicbranchfrequencyisoftenbetween15% and 25%, meaning that betweenthree andsix instructionsexecute betweenapairof branches.Since these instructions are likelytodependuponone another,the amountof overlapwe canexploitwithinabasicblockis likelytobe lessthanthe average basic blocksize.Toobtainsubstantial performance enhancements, we must exploitILPacrossmultiple basicblocks.The simplestandmostcommonwayto increase the ILP isto exploitparallelismamongiterationsof aloop.Thistype of parallelismisoftencalled loop-levelparallelism.Here isasimple example of aloopthat addstwo 1000-elementarraysand is completelyparallel. for (i=0; i<=999; i=i+1) x[i] = x[i] +y[i]; Every iterationof the loopcan overlapwithanyotheriteration,althoughwithineachloopiteration there islittle orno opportunityforoverlap. GOALS  The purpose of parallel processingistospeed upthe computerprocessingcapabilityorin words,itincreasesthe computational speed  The systemmay have twoor more processorsoperatingconcurrently.  Improvesthe performance of the computerforagivenclockspeed. TYPES OF PARALLELISM 1) InstructionLevel Parallelism(ILP)  Pipelining  Superscalar 2) Process Level Parallelism(PLP)  Array Computer  Multiprocessor 1) INSTRUCTION PIPELINING
  • 3.  An instructionpipeliningreadsconsecutive instructionsfrommemorywhile previous instructionsare beingexecutedinothersegments.  Computerneedstoprocesseachinstructionwiththe followingsequence of steps. Pipelining Steps  Fetchthe instructionfrommemory  Decode the instruction  Calculate the effective address  Fetchthe operandsfrommemory  Execute the instruction  Store the resultinthe properplace
  • 5.  Resource conflictscausedbyaccessto memorybytwosegmentsatthe same time.These may be resolvedbyusingseparate instructionanddatamemories  Data Dependencyconflictsarise whenaninstructiondependsonthe resultof aprevious instruction,butthisresultisnotyetavailable. 2) Superscalar execution in whichmultiple executionunitsare usedto execute multipleinstructionsinparallel.Intypical superscalarprocessors,the instructions executingsimultaneouslyare adjacentinthe original programorder.  A superscalarCPUarchitecture implementsaformof parallelismcalledinstruction-level parallelismwithinasingle processor.  It therefore allowsfasterCPUthroughputthanwouldotherwise be possible ata givenclock rate.  A superscalarprocessorexecutesmore thanone instructionduringaclockcycle by simultaneouslydispatchingmultiple instructionstoredundantfunctional unitsonthe processor.  The term superscalar, firstcoinedin1987 referstoa machine that isdesignedtoimprove the performance of the executionof scalarinstructions. Super scaler Implementation  A superscalarimplementationof a processor architecture isone inwhichcommon instructions—integerandfloating-pointarithmetic,loads,stores,andconditional branches—can be initiatedsimultaneouslyandexecutedindependently. Why we use Super scaler?  CPU hardware dynamicallychecksfordatadependenciesbetweeninstructionsat runtime (versussoftware checkingatcompile time)  The CPU acceptsmultiple instructionsperclockcycle.  The branch instructionprocessing
  • 6. Super scaler organization Data Dependences There are three differenttypesof dependences:data dependences(alsocalledtrue datadependences), name dependences,andcontrol dependences.Aninstructionj isdatadependentoninstructioni if eitherof the followingholds: ■ Instructioni producesa resultthatmay be usedby instructionj. ■ Instructionj is data dependentoninstructionk,andinstructionkisdatadependentoninstructioni.
  • 7. For example considerthe followingMIPScode sequencethatincrementsavectorof valuesinmemory(startingat 0(R1) and withthe lastelementat8(R2)) by a scalar inregisterF2.(For simplicity,throughoutthis chapter,our examplesignore the effectsof delayedbranches.) Loop: L.D F0,0(R1) ;F0=array elementADD.DF4,F0,F2;add scalar inF2 S.D F4,0(R1) ;store resultDADDUI R1,R1,#-8 ;decrementpointer8bytesBNE R1,R2,LOOP;branch R1!=R2 The data dependencesinthiscode sequenceinvolve bothfloating-pointdata: and integerdata: In bothof the above dependentsequences,asshownbythe arrows,eachinstructiondependsonthe previousone.The arrows here andinfollowingexamplesshow the orderthatmustbe preservedfor correct execution.The arrowpointsfromaninstructionthatmustprecede the instructionthatthe arrowheadpointsto.If two instructionsare datadependent,theymustexecute inorderandcannot execute simultaneouslyorbe completelyoverlapped.The dependence impliesthatthere wouldbe a chainof one or more data hazards betweenthe twoinstructions.(See Appendix Cfora brief description of data hazards,whichwe will define preciselyinafew pages.) Executingthe instructionssimultaneously will cause a processorwithpipelineinterlocks(andapipelinedepthlongerthanthe distance between the instructionsincycles) todetectahazard and stall,therebyreducingoreliminatingthe overlap.Ina processorwithoutinterlocksthatreliesoncompilerscheduling,the compilercannotschedule dependentinstructionsinsucha waythat theycompletelyoverlap,since the programwill notexecute correctly.The presence of a data dependence inaninstructionsequence reflectsadata dependence in the source code fromwhichthe instructionsequence wasgenerated.The effectof the original data dependence mustbe preserved. Loop: L.D F0,0(R1) ;F0=array elementADD.DF4,F0,F2;add scalar inF2 S.D F4,0(R1) ;store result DADDIU R1,R1,#-8 ;decrementpointer;8bytes(perDW) BNE R1,R2,Loop ;branch R1!=R2
  • 8. Name Dependences The secondtype of dependence isaname dependence.A name dependence occurswhentwo instructionsuse the same registerormemorylocation,calledaname,butthere isno flow of data betweenthe instructionsassociatedwiththatname.There are twotypesof name dependences betweenaninstructioni thatprecedesinstructionj inprogramorder: 1. An antidependence betweeninstructioni andinstructionj occurswheninstructionj writesaregister or memorylocationthatinstructioni reads.The original orderingmustbe preservedtoensure thati readsthe correct value.Inthe example onpage 151, there is antidependence betweenS.DandDADDIU on registerR1. 2. An outputdependence occurswheninstructioni andinstructionj write the same registerormemory location.The orderingbetweenthe instructions mustbe preservedtoensure thatthe value finally writtencorrespondstoinstructionj. Hazards A hazard existswheneverthere isaname or data dependence betweeninstructions. Because of the dependence,we mustpreservewhatiscalledprogramorder—thatis,the orderthatthe instructionswouldexecuteinif executedsequentiallyone ata time as determinedbythe original source program.The goal of bothour software andhardware techniquesistoexploitparallelismbypreserving program orderonlywhere itaffectsthe outcome of the program.Detectingandavoidinghazards ensuresthatnecessaryprogramorderispreserved.Datahazards,whichare informallydescribedin Appendix C,maybe classifiedasone of three types,dependingonthe orderof read and write accesses inthe instructions.Byconvention,the hazardsare namedbythe orderinginthe program thatmust be preservedbythe pipeline.Considertwoinstructionsi andj,withi precedingj inprogram order.The possible datahazardsare ■ RAW (readafterwrite)—j triestoreadasource before i writesit,soj incorrectlygetsthe oldvalue. Thishazard isthe mostcommontype and correspondstoa true data dependence.Programordermust be preservedtoensure thatj receivesthe value fromi. 
■ WAW (write after write): j tries to write an operand before it is written by i. The writes end up being performed in the wrong order, leaving the value written by i rather than the value written by j in the destination. This hazard corresponds to an output dependence. WAW hazards are present only in pipelines that write in more than one pipe stage or allow an instruction to proceed even when a previous instruction is stalled.
■ WAR (write after read): j tries to write a destination before it is read by i, so i incorrectly gets the new value. This hazard arises from an antidependence (or name dependence). WAR hazards cannot occur in most static issue pipelines, even deeper pipelines or floating-point pipelines, because all reads are early and all writes are late.
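The three hazard types can be told apart by comparing the read and write sets of the two instructions. The following is a minimal Python sketch under stated assumptions: the function name classify and the set-based instruction model are hypothetical, only registers are tracked, and the result names the orderings that must be preserved. Whether a stall actually occurs depends on the pipeline.

```python
def classify(i_writes, i_reads, j_writes, j_reads):
    """Hazard types possible between instruction i and a later instruction j.

    Each argument is the set of register names the instruction writes or reads.
    """
    hazards = set()
    if i_writes & j_reads:
        hazards.add("RAW")  # j reads what i writes: true data dependence
    if i_reads & j_writes:
        hazards.add("WAR")  # j writes what i reads: antidependence
    if i_writes & j_writes:
        hazards.add("WAW")  # both write the same name: output dependence
    return hazards


# ADD.D F4,F0,F2 followed by S.D F4,0(R1): S.D reads the F4 that ADD.D writes.
print(classify({"F4"}, {"F0", "F2"}, set(), {"F4", "R1"}))

# S.D F4,0(R1) followed by DADDIU R1,R1,#-8: DADDIU writes the R1 that S.D reads.
print(classify(set(), {"F4", "R1"}, {"R1"}, {"R1"}))

# Two loads into F0 (a hypothetical pair, not taken from the loop).
print(classify({"F0"}, {"R1"}, {"F0"}, {"R3"}))
```

A pair of instructions can fall into more than one category at once, which is why the sketch returns a set rather than a single label.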
  • 9. Control Dependences
The last type of dependence is a control dependence. A control dependence determines the ordering of an instruction, i, with respect to a branch instruction so that instruction i is executed in correct program order and only when it should be. Every instruction, except for those in the first basic block of the program, is control dependent on some set of branches, and, in general, these control dependences must be preserved to preserve program order. One of the simplest examples of a control dependence is the dependence of the statements in the “then” part of an if statement on the branch. For example, in the code segment
    if p1 { S1; };
    if p2 { S2; }
S1 is control dependent on p1, and S2 is control dependent on p2 but not on p1.
In general, two constraints are imposed by control dependences:
1. An instruction that is control dependent on a branch cannot be moved before the branch so that its execution is no longer controlled by the branch. For example, we cannot take an instruction from the then portion of an if statement and move it before the if statement.
2. An instruction that is not control dependent on a branch cannot be moved after the branch so that its execution is controlled by the branch. For example, we cannot take a statement before the if statement and move it into the then portion.
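A small Python sketch can illustrate the first constraint. It is only an illustration with hypothetical function names: S1 is modelled as an increment of x, and the two functions differ only in whether S1 stays inside the branch.

```python
def original(p1, x):
    """S1 is control dependent on p1: it runs only when the branch is taken."""
    if p1:
        x = x + 1  # S1
    return x


def hoisted(p1, x):
    """S1 moved before the if statement: it now runs regardless of p1."""
    x = x + 1  # S1, no longer controlled by the branch
    if p1:
        pass
    return x


# The two versions agree when p1 is true and disagree when it is false.
print(original(True, 0), hoisted(True, 0))
print(original(False, 0), hoisted(False, 0))
```

The disagreement for a false p1 is the change in program outcome that the first constraint rules out.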
  • 10. Processor Level Parallelism (Machine Parallelism)
 In a multiprocessing system, all CPUs may be equal, or some may be reserved for special purposes.
 In multiprocessing, the processors can be used to execute a single sequence of instructions in multiple contexts.
 Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system.
 The term also refers to the ability of a system to support more than one processor and/or the ability to allocate tasks between them.
 Multiprocessing sometimes refers to the execution of multiple concurrent software processes in a system as opposed to a single process at any one instant.
 The terms multitasking or multiprogramming are more appropriate to describe this concept, which is implemented mostly in software, whereas multiprocessing is more appropriate to describe the use of multiple hardware CPUs.
  • 11. Amdahl’s Law
 Amdahl's law, also known as Amdahl's argument, is named after computer architect Gene Amdahl, and is used to find the maximum expected improvement to an overall system when only part of the system is improved.
 Amdahl's law states what the overall speedup of applying the improvement will be.
    Old Running Time = 1
    New Running Time = (1 - P) + P/S
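The two running times give the overall speedup as their ratio, Old Running Time / New Running Time. The following is a minimal Python sketch that assumes the usual reading of the symbols, which this slide does not spell out: P is the fraction of the original running time that benefits from the improvement, and S is the speedup of that fraction. The function name amdahl_speedup is hypothetical.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the running time is sped up by a factor s.

    Assumed meanings: 0 <= p <= 1 is the improved fraction, s > 0 is its speedup.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    old_time = 1.0
    new_time = (1.0 - p) + p / s  # unimproved part plus the shortened improved part
    return old_time / new_time


# Speeding up half of the program by a factor of 2.
print(amdahl_speedup(0.5, 2.0))

# Even a very large s cannot push the result past 1 / (1 - p).
print(amdahl_speedup(0.9, 1e12))
```

The second call shows the bound the law is usually quoted for: with 90% of the running time improved, the overall speedup approaches 10 and does not exceed it.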