Computer Organization &
Architecture
Pipeline Hazards
1
Pipeline Hazards
•There are situations, called hazards, that prevent the
next instruction in the instruction stream from
executing during its designated cycle
•There are three classes of hazards
• Structural hazard
• Data hazard
• Branch hazard
Structural Hazards. They arise from resource conflicts when the hardware cannot support all
possible combinations of instructions in simultaneous overlapped execution.
Data Hazards. They arise when an instruction depends on the result of a previous instruction
in a way that is exposed by the overlapping of instructions in the pipeline.
Control Hazards.They arise from the pipelining of branches and other instructions
that change the PC.
What Makes Pipelining Hard?
• Power failing,
• Arithmetic overflow,
• I/O device request,
• OS call,
• Page fault
3
Pipeline Hazards
•Structural hazard
• Resource conflicts when the hardware cannot support all possible
combination of instructions simultaneously
•Data hazard
• An instruction depends on the results of a previous instruction
•Branch hazard
• Instructions that change the PC
Structural Hazard
• This is the situation when two instructions require the use of a given
hardware resource at the same time.
• Arises from resource conflicts when the hardware can’t support all
possible combinations of overlapping instructions.
• Root Cause: resource conflicts
e.g., a processor with 1 reg write portbut having two instructions to
write in register
• Solution
stall one of the instructions until required unit is available
5
Structural Hazard
• Example
same resource i.e
mem conflict
data access
vs
instr fetch
Load
Instr i+3
Instr i+2
Instr i+1
MEM
IF
Structural hazard
• To solve this hazard, we “stall” the pipeline
until the resource is freed
• A stall is commonly called pipeline bubble,
since it floats through the pipeline taking space
but carry no useful work
How is it resolved?
Time
Reg
Mem Reg
Load
Instruction 1
Instruction 2
Stall
Instruction 3
Bubbl
e
Bubbl
e
Bubbl
e
Bubbl
e
Bubbl
e
Pipeline generally stalled
by
inserting a “bubble” or
NOP
Reg
Mem MEM Reg
Reg
Mem Reg
Reg
Mem Reg
MEM
MEM
MEM
Structural Hazard Solution
Stall Instr i+3
till CC 5
10
Structural Hazards
Stalling instruction i+3 to start from clock cycle 5
Data Hazards
• A data hazard occur if either the source or the destination
operands of an instruction are not available at the time
expected in the pipeline. As a result some operations has to be
delayed and the pipeline stalls.
• A data hazard is a situation in which the pipeline is stalled
because the data to be operated on are delayed for some
reason.
11
Data hazard
Example:
ADD R1,R2,R3 R1R2+R3
SUB R4,R1,R5 R4R1-R5
AND R6,R1,R7 R6R1 AND R7
OR R8,R1,R9 R8R1 OR R9
XOR R10,R1,R11 R10R1 XOR R11
ADD writes the register in WB but SUB needs it in ID.
This is a data hazard
All instructions
after ADD use result
of ADD
Data hazard
• Delay load approach inserts a no-operation instruction to avoid the
data conflict
ADD R1R2+R3
No-op
No-op
SUB R4R1-R5
AND R6R1 AND R7
OR R8R1 OR R9
XOR R10R1 XOR R11
Data hazard
Data hazard
• It can be further solved by a simple hardware technique called forwarding (also
called bypassing or short-circuiting)
• The insight in forwarding is that the result is not really needed by SUB until the
ADD execute completely
• If the forwarding hardware detects that the previous ALU operation has written the
register corresponding to a source for the current ALU operation, control logic
selects the results in ALU instead of from memory
Forwarding
• Problem illustrated on previous slide can actually be solved
relatively easily – with forwarding
• Can we move the result from EX/MEM register to the beginning of
ALU (where SUB needs it)?
– Yes!
• Generally speaking:
– Forwarding occurs when a result is passed directly to functional
unit that requires it.
– Result goes from output of one unit to input of another
When can we forward?
Reg
Mem DM Reg
Reg
Mem DM Reg
Reg
Mem DM
Reg
Mem
Time
ADD R1, R2, R3
SUB R4, R1, R5
AND R6, R1, R7
OR R8, R1, R9
XOR R10, R1, R11
Reg
Mem
SUB gets info.
from EX/MEM
pipe register
1
7
AND gets info.
from MEM/WB
pipe register
OR gets info. by
forwarding from
register file
Rule of thumb: If line goes “forward” you can do forwarding.
If its drawn backward, it’s physically impossible.
Data hazard
19
Data Hazard Classification
• Three types of data hazards
– RAW : Read After Write
– WAW : Write After Write
– WAR : Write After Read
• RAR : Read After Read
– Is this a hazard?
Read After Write (RAW)
• A read after write (RAW) data hazard refers to a
situation where an instruction refers to a result
that has not yet been calculated or retrieved.
• This can occur because even though an
instruction is executed after a previous
instruction, the previous instruction has not
been completely processed through the pipeline.
example:
i1. R2 <- R1 + R3
i2. R4 <- R2 + R3
Write After Read (WAR)
• A write after read (WAR) data hazard
represents a problem with concurrent
execution.
For example:
i1. R4 <- R1 + R5
i2. R5 <- R1 + R2
Write After Write (WAW
• A write after write (WAW) data hazard may occur
in a concurrent execution environment.
example:
i1. R2 <- R4 + R7
i2. R2 <- R1 + R3
We must delay the WB (Write Back) of i2 until the execution of i1
Control Hazards (Effect of Branching)
• Control hazard occurs whenever the pipeline makes an incorrect branch
prediction decisions, resulting in instructions entering the pipeline that
needs to be discarded.
• These result from the pipelining of branches and other instructions that
change the Program Counter.
• The action of fetching a non sequential or remote instruction after a branch
instruction is called branch taken.
• The instruction to be executed after a branch taken is called a branch
target.
• When a branch taken occurs all the instructions following the branch in the
pipeline becomes useless and will be drained from the pipeline.
23
Branch hazards
• Branch hazards can cause a greater performance loss for
pipelines
• When a branch instruction is executed, it may or may not
change the PC
• If a branch changes the PC to its target address, it is a
taken branch Otherwise, it is untaken
Control Hazard Example
12: BEQ R1, R3, 36
16: AND R2, R3, R5
20: OR R6, R1, R7
24: ADD R8, R1, R9
36: XOR R10, R1, R11
Here 12,16,20,24,36 are the address of the instructions.
BEQ (Branch if Equal) instructions executes and if R1 and R3 are equal, so jump to instruction whose
memory address is 36.
So, once branch is taken, the next three in-order instructions which entered into the pipeline needs to
discarded. Hence, clock cycles are wasted and pipeline is stalled.
25
• A branch instruction can be conditional or unconditional.
• The branch instruction breaks the normal sequence of the
instruction stream, causing difficulties in the operation of the
instruction pipeline.
Handling Branching Instructions
26
Branch Difficulties
Solutions:
– Prefetch target instruction
– Loop Buffer
– Branch target buffer (BTB)
– Branch Prediction
– Delayed Branch
27
Prefetch target instruction
• Prefetch the target instruction in addition to the instruction
following the branch.
• If the branch condition is successful, the pipeline continues
from the branch target instruction.
28
Loop Buffer
• Very fast memory
• Maintained by fetch stage of pipeline
• Check buffer before fetching from memory
• When a program loop is detected it is stored in the loop buffer in
its entirety including all branches.
• The loop buffer is similar (in principle) to a cache dedicated to
instructions. The differences are that the loop buffer only retains
instructions in sequence, and is much smaller in size (and lower
in cost).
29
Branch target buffer (BTB)
• BTB is an associative memory.
• Each entry in the BTB consists of the address of a previously
executed branch instruction and the target instruction for the
branch.
30
Branch Prediction
• A pipeline with branch prediction uses some additional logic
to guess the outcome of a conditional branch instruction
before it is executed.
31
Branch Prediction
• Various techniques can be used to predict whether a branch will be
taken or not:
– Prediction never taken
– Prediction always taken
– Prediction by opcode
– Branch history table
• The first three approaches are static: they do not depend on the
execution history up to the time of the conditional branch instruction.
The last approach is dynamic: they depend on the execution history.
32
Delayed Branch
• In this procedure, the compiler detects the branch instruction
and rearrange the machine language code sequence by
inserting useful instructions that keep the pipeline operating
without interrupts.
33
Example
• Five instructions need to be carried out:
Load from memory to R1
Increment R2
Add R3 to R4
Subtract R5 from R6
Branch to address X
34
Delay Branch
35
Rearrange the Instruction
36

Pipeline Hazards _

  • 1.
  • 2.
    Pipeline Hazards •There aresituations, called hazards, that prevent the next instruction in the instruction stream from executing during its designated cycle •There are three classes of hazards • Structural hazard • Data hazard • Branch hazard Structural Hazards. They arise from resource conflicts when the hardware cannot support all possible combinations of instructions in simultaneous overlapped execution. Data Hazards. They arise when an instruction depends on the result of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline. Control Hazards.They arise from the pipelining of branches and other instructions that change the PC.
  • 3.
    What Makes PipeliningHard? • Power failing, • Arithmetic overflow, • I/O device request, • OS call, • Page fault 3
  • 4.
    Pipeline Hazards •Structural hazard •Resource conflicts when the hardware cannot support all possible combination of instructions simultaneously •Data hazard • An instruction depends on the results of a previous instruction •Branch hazard • Instructions that change the PC
  • 5.
    Structural Hazard • Thisis the situation when two instructions require the use of a given hardware resource at the same time. • Arises from resource conflicts when the hardware can’t support all possible combinations of overlapping instructions. • Root Cause: resource conflicts e.g., a processor with 1 reg write portbut having two instructions to write in register • Solution stall one of the instructions until required unit is available 5
  • 6.
    Structural Hazard • Example sameresource i.e mem conflict data access vs instr fetch Load Instr i+3 Instr i+2 Instr i+1 MEM IF
  • 7.
    Structural hazard • Tosolve this hazard, we “stall” the pipeline until the resource is freed • A stall is commonly called pipeline bubble, since it floats through the pipeline taking space but carry no useful work
  • 8.
    How is itresolved? Time Reg Mem Reg Load Instruction 1 Instruction 2 Stall Instruction 3 Bubbl e Bubbl e Bubbl e Bubbl e Bubbl e Pipeline generally stalled by inserting a “bubble” or NOP Reg Mem MEM Reg Reg Mem Reg Reg Mem Reg MEM MEM MEM
  • 9.
  • 10.
    10 Structural Hazards Stalling instructioni+3 to start from clock cycle 5
  • 11.
    Data Hazards • Adata hazard occur if either the source or the destination operands of an instruction are not available at the time expected in the pipeline. As a result some operations has to be delayed and the pipeline stalls. • A data hazard is a situation in which the pipeline is stalled because the data to be operated on are delayed for some reason. 11
  • 12.
    Data hazard Example: ADD R1,R2,R3R1R2+R3 SUB R4,R1,R5 R4R1-R5 AND R6,R1,R7 R6R1 AND R7 OR R8,R1,R9 R8R1 OR R9 XOR R10,R1,R11 R10R1 XOR R11 ADD writes the register in WB but SUB needs it in ID. This is a data hazard All instructions after ADD use result of ADD
  • 13.
    Data hazard • Delayload approach inserts a no-operation instruction to avoid the data conflict ADD R1R2+R3 No-op No-op SUB R4R1-R5 AND R6R1 AND R7 OR R8R1 OR R9 XOR R10R1 XOR R11
  • 14.
  • 15.
    Data hazard • Itcan be further solved by a simple hardware technique called forwarding (also called bypassing or short-circuiting) • The insight in forwarding is that the result is not really needed by SUB until the ADD execute completely • If the forwarding hardware detects that the previous ALU operation has written the register corresponding to a source for the current ALU operation, control logic selects the results in ALU instead of from memory
  • 16.
    Forwarding • Problem illustratedon previous slide can actually be solved relatively easily – with forwarding • Can we move the result from EX/MEM register to the beginning of ALU (where SUB needs it)? – Yes! • Generally speaking: – Forwarding occurs when a result is passed directly to functional unit that requires it. – Result goes from output of one unit to input of another
  • 17.
    When can weforward? Reg Mem DM Reg Reg Mem DM Reg Reg Mem DM Reg Mem Time ADD R1, R2, R3 SUB R4, R1, R5 AND R6, R1, R7 OR R8, R1, R9 XOR R10, R1, R11 Reg Mem SUB gets info. from EX/MEM pipe register 1 7 AND gets info. from MEM/WB pipe register OR gets info. by forwarding from register file Rule of thumb: If line goes “forward” you can do forwarding. If its drawn backward, it’s physically impossible.
  • 18.
  • 19.
    19 Data Hazard Classification •Three types of data hazards – RAW : Read After Write – WAW : Write After Write – WAR : Write After Read • RAR : Read After Read – Is this a hazard?
  • 20.
    Read After Write(RAW) • A read after write (RAW) data hazard refers to a situation where an instruction refers to a result that has not yet been calculated or retrieved. • This can occur because even though an instruction is executed after a previous instruction, the previous instruction has not been completely processed through the pipeline. example: i1. R2 <- R1 + R3 i2. R4 <- R2 + R3
  • 21.
    Write After Read(WAR) • A write after read (WAR) data hazard represents a problem with concurrent execution. For example: i1. R4 <- R1 + R5 i2. R5 <- R1 + R2
  • 22.
    Write After Write(WAW • A write after write (WAW) data hazard may occur in a concurrent execution environment. example: i1. R2 <- R4 + R7 i2. R2 <- R1 + R3 We must delay the WB (Write Back) of i2 until the execution of i1
  • 23.
    Control Hazards (Effectof Branching) • Control hazard occurs whenever the pipeline makes an incorrect branch prediction decisions, resulting in instructions entering the pipeline that needs to be discarded. • These result from the pipelining of branches and other instructions that change the Program Counter. • The action of fetching a non sequential or remote instruction after a branch instruction is called branch taken. • The instruction to be executed after a branch taken is called a branch target. • When a branch taken occurs all the instructions following the branch in the pipeline becomes useless and will be drained from the pipeline. 23
  • 24.
    Branch hazards • Branchhazards can cause a greater performance loss for pipelines • When a branch instruction is executed, it may or may not change the PC • If a branch changes the PC to its target address, it is a taken branch Otherwise, it is untaken
  • 25.
    Control Hazard Example 12:BEQ R1, R3, 36 16: AND R2, R3, R5 20: OR R6, R1, R7 24: ADD R8, R1, R9 36: XOR R10, R1, R11 Here 12,16,20,24,36 are the address of the instructions. BEQ (Branch if Equal) instructions executes and if R1 and R3 are equal, so jump to instruction whose memory address is 36. So, once branch is taken, the next three in-order instructions which entered into the pipeline needs to discarded. Hence, clock cycles are wasted and pipeline is stalled. 25
  • 26.
    • A branchinstruction can be conditional or unconditional. • The branch instruction breaks the normal sequence of the instruction stream, causing difficulties in the operation of the instruction pipeline. Handling Branching Instructions 26
  • 27.
    Branch Difficulties Solutions: – Prefetchtarget instruction – Loop Buffer – Branch target buffer (BTB) – Branch Prediction – Delayed Branch 27
  • 28.
    Prefetch target instruction •Prefetch the target instruction in addition to the instruction following the branch. • If the branch condition is successful, the pipeline continues from the branch target instruction. 28
  • 29.
    Loop Buffer • Veryfast memory • Maintained by fetch stage of pipeline • Check buffer before fetching from memory • When a program loop is detected it is stored in the loop buffer in its entirety including all branches. • The loop buffer is similar (in principle) to a cache dedicated to instructions. The differences are that the loop buffer only retains instructions in sequence, and is much smaller in size (and lower in cost). 29
  • 30.
    Branch target buffer(BTB) • BTB is an associative memory. • Each entry in the BTB consists of the address of a previously executed branch instruction and the target instruction for the branch. 30
  • 31.
    Branch Prediction • Apipeline with branch prediction uses some additional logic to guess the outcome of a conditional branch instruction before it is executed. 31
  • 32.
    Branch Prediction • Varioustechniques can be used to predict whether a branch will be taken or not: – Prediction never taken – Prediction always taken – Prediction by opcode – Branch history table • The first three approaches are static: they do not depend on the execution history up to the time of the conditional branch instruction. The last approach is dynamic: they depend on the execution history. 32
  • 33.
    Delayed Branch • Inthis procedure, the compiler detects the branch instruction and rearrange the machine language code sequence by inserting useful instructions that keep the pipeline operating without interrupts. 33
  • 34.
    Example • Five instructionsneed to be carried out: Load from memory to R1 Increment R2 Add R3 to R4 Subtract R5 from R6 Branch to address X 34
  • 35.
  • 36.

Editor's Notes

  • #6 Here’s an example of structural hazard due to memory conflict. Assume the processor has only memory port. A structural hazard will arise in clock cycle 4 when the load instruction reads data from memory and instruction i plus 3 fetches instruction from memory.
  • #9 The solution to this structural hazard is stall instruction i+3 for one clock cycle.