SlideShare a Scribd company logo
Dr. Shadrokh Samavi 1
Advanced Computer
Architecture
Pipelining
Dr. Shadrokh Samavi 2
Dr. Shadrokh Samavi 3
Some slides are from the instructorsā€™ resources which
accompany the 5th and previous editions.
Some slides are from David Patterson, David Culler
and Krste Asanovic of UC Berkeley; Israel Koren
of UM Amherst, and Milos Prvulovic of Georgia
Tech.
Sources of some slides are mentioned at the bottom
of the page.
Please inform me if I am missing a name in the above
list.
3
Dr. Shadrokh Samavi 4
C.1. What Is Pipelining?
Dr. Shadrokh Samavi 5
ā€¢ An implementation technique whereby multiple
instructions are overlapped in execution.
ā€¢ pipe stage
ā€¢ throughput
ā€¢ machine cycle
Dr. Shadrokh Samavi 6
IF ID EXE MEM WB
Dr. Shadrokh Samavi 7
IF
Dr. Shadrokh Samavi 8
ID
Dr. Shadrokh Samavi 9
EXE
Dr. Shadrokh Samavi 10
MEM
Dr. Shadrokh Samavi 11
WB
Dr. Shadrokh Samavi 12
Pipelined
Dr. Shadrokh Samavi 13
Simple RISC Pipeline
Clock number
Instruction number 1 2 3 4 5 6 7 8 9
Instruction i IF ID EX MEM WB
Instruction i + 1 IF ID EX MEM WB
Instruction i + 2 IF ID EX MEM WB
Instruction i + 3 IF ID EX MEM WB
Instruction i + 4 IF ID EX MEM WB
Dr. Shadrokh Samavi 14
Review: Visualizing Pipelining
I
n
s
t
r.
O
r
d
e
r
Time (clock cycles)
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 6 Cycle 7
Cycle 5
Adapted from Patterson, Katz and Kubiatowicz Ā© UCB
Dr. Shadrokh Samavi 15
Dr. Shadrokh Samavi 16
C.2. Pipeline Hazards
Dr. Shadrokh Samavi 17
ā€¢ Hazards: circumstances that would cause incorrect
execution if next instruction were launched
ā€“ Structural hazards: Attempting to use the same
hardware to do two different things at the
same time
ā€“ Data hazards: Instruction depends on result of
prior instruction still in the pipeline
ā€“ Control hazards: Caused by delay between the
fetching of instructions and decisions about
changes in control flow (branches and jumps).
Dr. Shadrokh Samavi 18
Speedup from pipelining
Average instruction time unpipelined
Average instruction time pipelined
-----------------------------------------
-
=
CPI unpipelined Clock cycle unpipelined
Ɨ
CPI pipelined Clock cycle pipelined
----------------- -
=
CPI unpipelined
CPI pipelined
----------------- Clock cycle unpipelined
Clock cycle pipelined
--------------------------
-
=
CPI pipelined Ideal CPI Pipeline stall clock cycles per instruction
+
=
1 Pipeline stall clock cycles per instruction
+
=
Speedup
CPI unpipelined
1 Pipeline stall cycles per instruction
+
------------------------------------------
-
=
Speedup Pipeline depth
1 Pipeline stall cycles per instruction
+
------------------------------------------
=
Speedup from pipelining
CPI unpipelined
CPI pipelined
-----------------
-
Clock cycle unpipelined
Clock cycle pipelined
-------------------------
-
=
1
1 Pipeline stall cycles per instruction
+
-----------------------------------------
Clock cycle unpipelined
Clock cycle pipelined
------------------------
-
=
--------------------------
Assuming
same Clock
cycle for
pipelined &
unpipelined
Ɨ
Ɨ
Ɨ
Dr. Shadrokh Samavi 19
Example: Structural Hazard
I
n
s
t
r.
O
r
d
e
r
Time (clock cycles)
Load
Instr 1
Instr 2
Instr 3
Instr 4
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 6 Cycle 7
Cycle 5
DMem
Structural Hazard
Dr. Shadrokh Samavi 20
Resolving structural hazards
ā€¢ Definition of structural hazard: attempt to
use same hardware for two different things
at the same time
ā€¢ Solution 1: Wait (stall)
ļƒžmust detect the hazard
ļƒžmust have mechanism to stall
ā€¢ Solution 2: Use more hardware
Dr. Shadrokh Samavi 21
Detecting and Resolving Structural Hazard
I
n
s
t
r.
O
r
d
e
r
Time (clock cycles)
Load
Instr 1
Instr 2
Stall
Instr 3
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 6 Cycle 7
Cycle 5
Reg
ALU
DMem
Ifetch Reg
Bubble Bubble Bubble Bubble
Bubble
Adapted from Patterson, Katz and Kubiatowicz Ā© UCB
Dr. Shadrokh Samavi 22
Eliminating Structural Hazards at Design
Time
ALU
Instr
Cache
Reg
File
MUX
MUX
Data
Cache
MUX
Sign
Extend
Zero?
IF/ID
ID/EX
MEM/WB
EX/MEM
4
Adder Next SEQ PC Next SEQ PC
RD RD RD
WB
Data
Next PC
Address
RS1
RS2
Imm
MUX
Datapath
Control Path
Dr. Shadrokh Samavi 23
WB
M
EX
WB
M WB
Control
IF/ID ID/EX EX/MEM MEM/WB
Instruction
RegDst
ALUOp
ALUSrc
Branch
MemRead
MemWrite
MemtoReg
RegWrite
strip off signals for
execution phase
strip off signals for
write-back phase
strip off signals for
memory phase
Genera-
tion
CSE 60321 University of Notre Dame
Dr. Shadrokh Samavi 24
Role of ISA in Structural Hazard Resolution
ā€¢ Simple to determine the sequence of
resources used by an instruction
ā€“ opcode tells it all
ā€¢ Uniformity in the resource usage
ā€¢ Compare MIPS to IA32?
ā€¢ MIPS approach => all instructions flow
through same 5-stage pipelining
Dr. Shadrokh Samavi 25
Data Hazards
I
n
s
t
r.
O
r
d
e
r
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Time (clock cycles)
IF ID/RF EX MEM WB
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Adapted from Patterson, Katz and Kubiatowicz Ā© UCB
Dr. Shadrokh Samavi 26
Program
execution
order
(in
instructions)
Time (in clock cycles)
CC1 CC2 CC3 CC4 CC5 CC6
ADD R1, R2, R3 DM
DM
DM
Reg
Reg
Reg
IM
IM
IM
Reg
Reg
ALU
ALU
ALU
LW R4, 0(R1)
SW 12(R1), R4
Stores require an operand during MEM, and forwarding of
that operand is shown here
Dr. Shadrokh Samavi 27
Data Hazards Classification
Resource Objects (R.O.): all addressable locations
Data Objects (D.O.): content of resource objects
D(I): Domain of instruction I = all R.O. that their
D.O. effect the operation of I.
R(I): Range of instruction I = all R.O. that their
D.O. are effected by the execution of I.
Dr. Shadrokh Samavi 28
A RAW hazard exists on register ļ²
if ļ² ļƒŽ R ( i ) ļƒ‡ D( j ) ļ‚¹ ļ¦
A WAW hazard exists on register ļ²
if ļ² ļƒŽ R( i ) ļƒ‡ R(j ) ļ‚¹ ļ¦
A WAR hazard exists on register ļ²
if ļ² ļƒŽ D( i ) ļƒ‡ R (j ) ļ‚¹ ļ¦
Dr. Shadrokh Samavi 29
D(I) R(I)
D(J) D(J)
I
write
Read
J
D(J) R(J)
D(I) D(I)
J
write
Read
I
D(I)
R(I)
D(J) R(J)
I
write
write
J
RAW
WAW
WAR
Dr. Shadrokh Samavi 30
ā€¢ Read After Write (RAW)
InstrJ tries to read operand before InstrI writes it
ā€¢ Caused by a ā€œData Dependenceā€ (in compiler
nomenclature). This hazard results from an actual
need for communication.
Three Generic Data Hazards
I: add r1,r2,r3
J: sub r4,r1,r3
Dr. Shadrokh Samavi 31
ā€¢ Write After Read (WAR)
InstrJ writes operand before InstrI reads it
ā€¢ Called an ā€œanti-dependenceā€ by compiler writers.
This results from reuse of the name ā€œr1ā€.
ā€¢ Canā€™t happen in MIPS 5 stage pipeline because:
ā€“ All instructions take 5 stages, and
ā€“ Reads are always in stage 2, and
ā€“ Writes are always in stage 5
I: sub r4,r1,r3
J: add r1,r2,r3
K: mul r6,r1,r7
Three Generic Data Hazards
Dr. Shadrokh Samavi 32
ā€¢ Write After Write (WAW)
InstrJ writes operand before InstrI writes it.
ā€¢ Called an ā€œoutput dependenceā€ by compiler writers
This also results from the reuse of name ā€œr1ā€.
ā€¢ Canā€™t happen in MIPS 5 stage pipeline because:
ā€“ All instructions take 5 stages, and
ā€“ Writes are always in stage 5
ā€¢ Will see WAR and WAW in later more complicated
pipes
I: sub r1,r4,r3
J: add r1,r2,r3
K: mul r6,r1,r7
Three Generic Data Hazards
Dr. Shadrokh Samavi 33
Time (clock cycles)
Forwarding to Avoid Data Hazard
I
n
s
t
r.
O
r
d
e
r
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Dr. Shadrokh Samavi 34
Time (clock cycles)
I
n
s
t
r.
O
r
d
e
r
lw r1, 0(r2)
sub r4,r1,r6
and r6,r1,r7
or r8,r1,r9
Data Hazard Even with Forwarding
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Adapted from Patterson, Katz and Kubiatowicz Ā© UCB
Dr. Shadrokh Samavi 35
Resolving this load hazard
ā€¢ Adding hardware? ... not
ā€¢ Detection?
ā€¢ Compilation techniques?
ā€¢ What is the cost of load delays?
Dr. Shadrokh Samavi 36
Resolving the Load Data Hazard
Time (clock cycles)
or r8,r1,r9
I
n
s
t
r.
O
r
d
e
r
lw r1, 0(r2)
sub r4,r1,r6
and r6,r1,r7
Reg
ALU
DMem
Ifetch Reg
Reg
Ifetch
ALU
DMem Reg
Bubble
Ifetch
ALU
DMem Reg
Bubble Reg
Ifetch
ALU
DMem
Bubble Reg
How is this different from the instruction issue stall?
Dr. Shadrokh Samavi 37
Try producing fast code for
a = b + c;
d = e ā€“ f;
assuming a, b, c, d ,e, and f in memory.
Slow code:
LW Rb,b
LW Rc,c
ADD Ra,Rb,Rc
SW a,Ra
LW Re,e
LW Rf,f
SUB Rd,Re,Rf
SW d,Rd
Software Scheduling to Avoid Load Hazards
Fast code:
LW Rb, b
LW Rc, c
LW Re,e
ADD Ra, Rb, Rc
LW Rf, f
SW a,Ra
SUB Rd, Re, Rf
SW d,Rd
Dr. Shadrokh Samavi 38
Instruction Set Connection
ā€¢ What is exposed about this organizational
hazard in the instruction set?
ā€¢ k cycle delay?
ā€“ bad, CPI is not part of ISA
ā€¢ k instruction slot delay
ā€“ load should not be followed by use of the value in the
next k instructions
ā€¢ Nothing, but code can reduce run-time
delays
ā€¢ MIPS did the transformation in the
assembler
Dr. Shadrokh Samavi 39
Fraction of loads that cause a stall
Benchmark
compress
eqntott
espresso
gcc
li
doduc
ear
hydro2d
mdljdp
su2cor
24%
41%
12%
23%
24%
20% 20%
10% 10%
4%
0%
5%
10%
15%
20%
25%
30%
35%
40%
45%
Dr. Shadrokh Samavi 40
Control Hazard on Branches
=> Three Stage Stall
10: beq r1,r3,36
14: and r2,r3,r5
18: or r6,r1,r7
22: add r8,r1,r9
36: xor r10,r1,r11
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Reg
ALU
DMem
Ifetch Reg
Dr. Shadrokh Samavi 41
Example: Branch Stall Impact
ā€¢ If 30% branch, Stall 3 cycles significant
ā€¢ Two part solution:
ā€“ Determine branch taken or not sooner, AND
ā€“ Compute taken branch address earlier
ā€¢ MIPS branch tests if register = 0 or ļ‚¹ 0
ā€¢ MIPS Solution:
ā€“ Move Zero test to ID/RF stage
ā€“ Adder to calculate new PC in ID/RF stage
ā€“ 1 clock cycle penalty for branch versus 3
Dr. Shadrokh Samavi 42
The frequency of
instructions
that may change
the PC
Percentage of instructions executed
0% 25%
5% 10% 15% 20%
10%
0%
0%
2%
1%
2%
6%
4%
4%
6%
2%
2%
11%
8%
4%
12%
4%
3%
11%
1%
4%
22%
2%
2%
11%
3%
3%
9%
0%
1%
Forward conditional
branches
Unconditional branches
Backward conditional
branches
Benchmark
compress
eqntott
espresso
gcc
li
doduc
ear
hydro2d
mdljdp
su2cor
Dr. Shadrokh Samavi 43
Four Branch Hazard Alternatives
#1: Stall until branch direction is clear
#2: Predict Branch Not Taken
ā€“ Execute successor instructions in sequence
ā€“ ā€œSquashā€ instructions in pipeline if branch actually taken
ā€“ Advantage of late pipeline state update
ā€“ 47% MIPS branches not taken on average
ā€“ PC+4 already calculated, so use it to get next instruction
#3: Predict Branch Taken
ā€“ 53% MIPS branches taken on average
ā€“ But havenā€™t calculated branch target address in MIPS
Ā» MIPS still incurs 1 cycle branch penalty
Ā» Other machines: branch target known before outcome
Dr. Shadrokh Samavi 44
#4: Delayed Branch
ā€“ Define branch to take place AFTER a following instruction
branch instruction
sequential successor1
sequential successor2
........
sequential successorn
........
branch target if taken
ā€“ 1 slot delay allows proper decision and branch target
address in 5 stage pipeline
ā€“ MIPS uses this
Branch delay of length n
Four Branch Hazard Alternatives
Dr. Shadrokh Samavi 45
ā€¢ Where to get instructions to fill branch delay slot?
ā€“ Before branch instruction
ā€“ From the target address: only valuable when branch taken
ā€“ From fall through: only valuable when branch not taken
ā€“ Canceling branches allow more slots to be filled
ā€¢ Compiler effectiveness for single branch delay slot:
ā€“ Fills about 60% of branch delay slots
ā€“ About 80% of instructions executed in branch delay slots useful in
computation
ā€“ About 50% (60% x 80%) of slots usefully filled
ā€¢ Delayed Branch downside: 7-8 stage pipelines,
multiple instructions issued per clock (superscalar)
Delayed Branch
Dr. Shadrokh Samavi 46
(a) From before (b) From target (c) From fall through
SUB R4, R5, R6
ADD R1, R2, R3
if R1 = 0 then
ADD R1, R2, R3
if R1 = 0 then
SUB R4, R5, R6
SUB R4, R5, R6
ADD R1, R2, R3
if R1 = 0 then
OR R7, R8, R9
ADD R1, R2, R3
if R1 = 0 then
SUB R4, R5, R6
ADD R1, R2, R3
if R2 = 0 then
if R2 = 0 then
ADD R1, R2, R3
Becomes
Becomes
Becomes
Delay slot
Delay slot
Delay slot
OR R7, R8, R9
SUB R4, R5, R6
Dr. Shadrokh Samavi 47
Canceling or Nullifying Branch.
In a canceling branch, the instruction includes the
direction that the branch was predicted. When the
branch behaves as predicted, the instruction in the
branch-delay slot is simply executed as it would
normally be with a delayed branch. When the branch
is incorrectly predicted, the instruction in the
branch-delay slot is simply turned into a no-op.
Dr. Shadrokh Samavi 48
Delayed-branch scheduling schemes
Scheduling strategy
(a) From before.
(b) From target
(c) From fall
through
Improves performance when?
Always.
When branch is taken. May
enlarge program if
instructions are duplicated.
When branch is not taken.
Requirements
ā€¢Branch must not depend
on the rescheduled
instructions
ā€¢Must be OK to execute
rescheduled instructions
if branch is not taken.
May need to duplicate
instructions.
ā€¢Must be OK to execute
instructions if branch is
taken.
Dr. Shadrokh Samavi 49
The behavior of a predicted-taken canceling branch
depends on whether the branch is taken or not.
Branch is NOT TAKEN (wrong prediction)
Branch is TAKEN ( predicted correctly)
Dr. Shadrokh Samavi 50
Delayed and Canceling Delay Branches
Dr. Shadrokh Samavi 51
Overall costs of a variety of branch schemes
Dr. Shadrokh Samavi 52
For an R4000-style pipeline, it takes 3 pipeline stages before the
branch target address is known & 1 more cycle to evaluate branch
condition. This leads to the following branch penalties:
Find the effective addition to the CPI arising from branches given
that the relative frequency of unconditional, conditional untaken, and
conditional taken branches are 4%, 6%, and 10%, respectively.
Dr. Shadrokh Samavi 53
Control Hazards
1. Find out whether the branch is taken or not taken
earlier in the pipeline.
2. Compute the taken PC (i.e., the address of the branch
target) earlier.
Branch instruction IF ID EX MEM WB
Branch successor IF stall stall IF ID EX MEM WB
Branch successor + 1 IF ID EX MEM WB
Branch successor + 2 IF ID EX MEM
Branch successor + 3 IF ID EX
Branch successor + 4 IF ID
Branch successor + 5 IF
Dr. Shadrokh Samavi 54
C.3.Implementation of
Pipelining
Dr. Shadrokh Samavi 55
Instruction fetch
Instruction decode/
register fetch
Execute/
Address
calculation
Memory
access
Write
back
Dr. Shadrokh Samavi 56
IF/ID
ID/EX
MEM/WB
EX/MEM
IF
IF ID EX M
E
M
WB
INST
1
INST
2
Dr. Shadrokh Samavi 57
Events on every pipe stage of the MIPS pipeline.
Dr. Shadrokh Samavi 58
5 Steps of MIPS Datapath
Memory
Access
Write
Back
Instruction
Fetch
Instr. Decode
Reg. Fetch
Execute
Addr. Calc
ALU
Memory
Reg
File
MUX
MUX
Memory
MUX
Sign
Extend
Zero?
IF/ID
ID/EX
MEM/WB
EX/MEM
4
Adder Next SEQ PC Next SEQ PC
RD RD RD
WB
Data
Next PC
Address
RS1
RS2
Imm
MUX
Datapath
Control Path
Inst
1
Dr. Shadrokh Samavi 59
5 Steps of MIPS Datapath
Memory
Access
Write
Back
Instruction
Fetch
Instr. Decode
Reg. Fetch
Execute
Addr. Calc
ALU
Memory
Reg
File
MUX
MUX
Memory
MUX
Sign
Extend
Zero?
IF/ID
ID/EX
MEM/WB
EX/MEM
4
Adder Next SEQ PC Next SEQ PC
RD RD RD
WB
Data
Next PC
Address
RS1
RS2
Imm
MUX
Datapath
Control Path
Inst
1
Inst
2
Dr. Shadrokh Samavi 60
5 Steps of MIPS Datapath
Memory
Access
Write
Back
Instruction
Fetch
Instr. Decode
Reg. Fetch
Execute
Addr. Calc
ALU
Memory
Reg
File
MUX
MUX
Memory
MUX
Sign
Extend
Zero?
IF/ID
ID/EX
MEM/WB
EX/MEM
4
Adder Next SEQ PC Next SEQ PC
RD RD RD
WB
Data
Next PC
Address
RS1
RS2
Imm
MUX
Datapath
Control Path
Inst
1
Inst
2
Inst
3
Dr. Shadrokh Samavi 61
Dr. Shadrokh Samavi 62
Situation
Example code
sequence Action
No dependence LWR1,45(R2)
ADD R5,R6,R7
SUB R8,R6,R7
OR R9,R6,R7
No hazard possible because no dependence
exists on R1 in the immediately following
three instructions
.
Dependence
requiring stall
LWR1,45(R2)
ADD R5,
R1,R7
SUB R8,R6,R7
OR R9,R6,R7
Comparators detect the use of R1 in the
ADD and stall the ADD (and SUB and OR)
before the ADD begins EX.
Dependence
overcome by
forwarding
LWR1,45(R2)
ADD R5,R6,R7
SUB R8,R1,R7
OR R9,R6,R7
Comparators detect use of R1 in SUB
and forward result of load to ALU in
time for SUB to begin EX.
Dependence
with accesses
in order
LWR1,45(R2)
ADD R5,R6,R7
SUB R8,R6,R7
OR R9,R1,R7
No action required because the read of R1
by OR occurs in the second half of the ID
phase, while the write of the loaded data
occurred in the first half.
No action required because the read of R1
by OR occurs in the second half of the ID
phase, while the write of the loaded data
occurred in the first half.
Situations that the pipeline hazard detection hardware can see by
comparing the destination and sources of adjacent instructions.
Dr. Shadrokh Samavi 63
Forwarding of data
Pipeline
register
containing
source
instruction
Opcode
of source
instruction
Pipeline
register
containing
destination
instruction
Opcode of
destination
instruction
Destination
of the
forwarded
result
Comparison
(if equal then
forward)
EX/MEM ID/EX
ALU immediate, load,
store, branch
Top ALU
input
EX/MEM.IR 16..20 ==
ID/EX.IR 6..10
EX/MEM ID/EX Bottom ALU
input
EX/MEM.IR 16..20 ==
ID/EX.IR 11..15
MEM/WB ID/EX Register-register ALU,
ALU immediate, load,
store, branch
MEM/WB.IR 16..20 ==
ID/EX.IR 6..10
MEM/WB ID/EX Register-register ALU MEM/WB.IR 16..20 ==
ID/EX.IR 11..15
EX/MEM ALU
immediate
ID/EX
ALU immediate, load,
store, branch
EX/MEM.IR 11..15 ==
ID/EX.IR 6..10
EX/MEM ALU
immediate
ID/EX EX/MEM.IR 11..15 ==
ID/EX.IR 11..15
MEM/WB ALU
immediate
ID/EX
ALU immediate, load,
store, branch
MEM/WB.IR 11..15 ==
ID/EX.IR 6..10
MEM/WB ALU
immediate
ID/EX MEM/WB.IR 11..15 ==
ID/EX.IR 11..15
MEM/WB Load ID/EX
ALU immediate, load,
store, branch
MEM/WB.IR 11..15 ==
ID/EX.IR 6..10
MEM/WB Load ID/EX MEM/WB.IR 11..15 ==
ID/EX.IR 11..15
Register-register ALU
Register-register ALU
Register-register ALU
Register-register ALU
Register-register ALU
Register-register ALU
Register-register ALU,
Register-register ALU,
Register-
register ALU,
Register-
register ALU,
Register-
register ALU,
Register-
register ALU,
Top ALU
input
Bottom ALU
input
Top ALU
input
Bottom ALU
input
Top ALU
input
Bottom ALU
input
Top ALU
input
Bottom ALU
input
Dr. Shadrokh Samavi 64
HW Change for Forwarding
MEM/WR
ID/EX
EX/MEM
Data
Memory
ALU
mux
mux
Registers
NextPC
Immediate
mux
Dr. Shadrokh Samavi 65
Memory
Access
Write
Back
Instruction
Fetch
Instr. Decode
Reg. Fetch
Execute
Addr. Calc
ALU
Memory
Reg
File
MUX
MUX
Memory
MUX
Sign
Extend
Zero?
IF/ID
ID/EX
MEM/WB
EX/MEM
4
Adder
Next SEQ PC Next SEQ PC
RD RD RD
WB
Data
Next PC
Address
RS1
RS2
Imm
MUX
Datapath
Dr. Shadrokh Samavi 66
Adder
IF/ID
Pipelined MIPS Datapath
Memory
Access
Write
Back
Instruction
Fetch
Instr. Decode
Reg. Fetch
Execute
Addr. Calc
ALU
Memory
Reg
File
MUX
Data
Memory
MUX
Sign
Extend
Zero?
MEM/WB
EX/MEM
4
Adder
Next
SEQ PC
RD RD RD
WB
Data
ā€¢ Data stationary control
ā€“ local decode for each instruction phase / pipeline stage
Next PC
Address
RS1
RS2
Imm
MUX
ID/EX
Dr. Shadrokh Samavi 67
C.5. Multicycle Operations
Dr. Shadrokh Samavi 68
The MIPS pipeline with three
additional unpipelined, floating-point,
functional units.
Dr. Shadrokh Samavi 69
Latency : the number of intervening cycles between an
instruction that produces a result and an instruction
that uses the result.
The initiation or repeat interval: the number of cycles
that must elapse between issuing two operations of a
given type.
Dr. Shadrokh Samavi 70
Dr. Shadrokh Samavi 71
The pipeline timing of a set of independent FP operations.
A typical FP code sequence showing the stalls arising from RAW hazards.
Dr. Shadrokh Samavi 72
Three instructions want to perform a write back to the
FP register file simultaneously, as shown in clock cycle
11.
Dr. Shadrokh Samavi 73
Dr. Shadrokh Samavi 74
View publication stats
View publication stats

More Related Content

Similar to 2-Advanced Computer Architecture Pipelining

Condition Monitoring of a Large-scale PV Power Plant in Australia
Condition Monitoring of a Large-scale PV Power Plant in AustraliaCondition Monitoring of a Large-scale PV Power Plant in Australia
Condition Monitoring of a Large-scale PV Power Plant in Australia
Amit Dhoke
Ā 
Lect 1.pptx
Lect 1.pptxLect 1.pptx
Lect 1.pptx
JimingKang
Ā 
pipeline and vector processing
pipeline and vector processingpipeline and vector processing
pipeline and vector processing
Acad
Ā 
Topic2a ss pipelines
Topic2a ss pipelinesTopic2a ss pipelines
Topic2a ss pipelines
turki_09
Ā 
Complex test pattern generation for high speed fault diagnosis in Embedded SRAM
Complex test pattern generation for high speed fault diagnosis in Embedded SRAMComplex test pattern generation for high speed fault diagnosis in Embedded SRAM
Complex test pattern generation for high speed fault diagnosis in Embedded SRAM
IJMER
Ā 
Towards Dynamic Consistency Checking in Goal-directed Predicate Answer Set Pr...
Towards Dynamic Consistency Checking in Goal-directed Predicate Answer Set Pr...Towards Dynamic Consistency Checking in Goal-directed Predicate Answer Set Pr...
Towards Dynamic Consistency Checking in Goal-directed Predicate Answer Set Pr...
Universidad Rey Juan Carlos
Ā 
Instruction Level Parallelism ā€“ Compiler Techniques
Instruction Level Parallelism ā€“ Compiler TechniquesInstruction Level Parallelism ā€“ Compiler Techniques
Instruction Level Parallelism ā€“ Compiler Techniques
Dilum Bandara
Ā 
Combining Cluster Sampling and ACE analysis to improve fault-injection based ...
Combining Cluster Sampling and ACE analysis to improve fault-injection based ...Combining Cluster Sampling and ACE analysis to improve fault-injection based ...
Combining Cluster Sampling and ACE analysis to improve fault-injection based ...
Stefano Di Carlo
Ā 
Introduction to Java Profiling
Introduction to Java ProfilingIntroduction to Java Profiling
Introduction to Java Profiling
Jerry Yoakum
Ā 
CPU Pipelining and Hazards - An Introduction
CPU Pipelining and Hazards - An IntroductionCPU Pipelining and Hazards - An Introduction
CPU Pipelining and Hazards - An Introduction
Dilum Bandara
Ā 
Engineering C-programing module1 ppt (18CPS13/23)
Engineering C-programing module1 ppt (18CPS13/23)Engineering C-programing module1 ppt (18CPS13/23)
Engineering C-programing module1 ppt (18CPS13/23)
kavya R
Ā 
Be cps-18 cps13or23-module1
Be cps-18 cps13or23-module1Be cps-18 cps13or23-module1
Be cps-18 cps13or23-module1
kavya R
Ā 
Pinpointing Vulnerabilities (Ravel)
Pinpointing Vulnerabilities (Ravel)Pinpointing Vulnerabilities (Ravel)
Pinpointing Vulnerabilities (Ravel)
Yue Chen
Ā 
Pipelining & All Hazards Solution
Pipelining  & All Hazards SolutionPipelining  & All Hazards Solution
Pipelining & All Hazards Solution
.AIR UNIVERSITY ISLAMABAD
Ā 
Data link control
Data link controlData link control
Data link control
Iffat Anjum
Ā 
Fall 2016 Insurance Case Study ā€“ Finance 360Loss ControlLoss.docx
Fall 2016 Insurance Case Study ā€“ Finance 360Loss ControlLoss.docxFall 2016 Insurance Case Study ā€“ Finance 360Loss ControlLoss.docx
Fall 2016 Insurance Case Study ā€“ Finance 360Loss ControlLoss.docx
lmelaine
Ā 
Gene's law
Gene's lawGene's law
Gene's law
Hoopeer Hoopeer
Ā 
Advanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILPAdvanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILP
A B Shinde
Ā 
LTE KPI Optimization - A to Z Abiola.pptx
LTE KPI Optimization - A to Z Abiola.pptxLTE KPI Optimization - A to Z Abiola.pptx
LTE KPI Optimization - A to Z Abiola.pptx
ssuser574918
Ā 
Crossing the Boundaries: Development Strategies for (P)SoCs
Crossing the Boundaries: Development Strategies for (P)SoCsCrossing the Boundaries: Development Strategies for (P)SoCs
Crossing the Boundaries: Development Strategies for (P)SoCs
Andreas Koschak
Ā 

Similar to 2-Advanced Computer Architecture Pipelining (20)

Condition Monitoring of a Large-scale PV Power Plant in Australia
Condition Monitoring of a Large-scale PV Power Plant in AustraliaCondition Monitoring of a Large-scale PV Power Plant in Australia
Condition Monitoring of a Large-scale PV Power Plant in Australia
Ā 
Lect 1.pptx
Lect 1.pptxLect 1.pptx
Lect 1.pptx
Ā 
pipeline and vector processing
pipeline and vector processingpipeline and vector processing
pipeline and vector processing
Ā 
Topic2a ss pipelines
Topic2a ss pipelinesTopic2a ss pipelines
Topic2a ss pipelines
Ā 
Complex test pattern generation for high speed fault diagnosis in Embedded SRAM
Complex test pattern generation for high speed fault diagnosis in Embedded SRAMComplex test pattern generation for high speed fault diagnosis in Embedded SRAM
Complex test pattern generation for high speed fault diagnosis in Embedded SRAM
Ā 
Towards Dynamic Consistency Checking in Goal-directed Predicate Answer Set Pr...
Towards Dynamic Consistency Checking in Goal-directed Predicate Answer Set Pr...Towards Dynamic Consistency Checking in Goal-directed Predicate Answer Set Pr...
Towards Dynamic Consistency Checking in Goal-directed Predicate Answer Set Pr...
Ā 
Instruction Level Parallelism ā€“ Compiler Techniques
Instruction Level Parallelism ā€“ Compiler TechniquesInstruction Level Parallelism ā€“ Compiler Techniques
Instruction Level Parallelism ā€“ Compiler Techniques
Ā 
Combining Cluster Sampling and ACE analysis to improve fault-injection based ...
Combining Cluster Sampling and ACE analysis to improve fault-injection based ...Combining Cluster Sampling and ACE analysis to improve fault-injection based ...
Combining Cluster Sampling and ACE analysis to improve fault-injection based ...
Ā 
Introduction to Java Profiling
Introduction to Java ProfilingIntroduction to Java Profiling
Introduction to Java Profiling
Ā 
CPU Pipelining and Hazards - An Introduction
CPU Pipelining and Hazards - An IntroductionCPU Pipelining and Hazards - An Introduction
CPU Pipelining and Hazards - An Introduction
Ā 
Engineering C-programing module1 ppt (18CPS13/23)
Engineering C-programing module1 ppt (18CPS13/23)Engineering C-programing module1 ppt (18CPS13/23)
Engineering C-programing module1 ppt (18CPS13/23)
Ā 
Be cps-18 cps13or23-module1
Be cps-18 cps13or23-module1Be cps-18 cps13or23-module1
Be cps-18 cps13or23-module1
Ā 
Pinpointing Vulnerabilities (Ravel)
Pinpointing Vulnerabilities (Ravel)Pinpointing Vulnerabilities (Ravel)
Pinpointing Vulnerabilities (Ravel)
Ā 
Pipelining & All Hazards Solution
Pipelining  & All Hazards SolutionPipelining  & All Hazards Solution
Pipelining & All Hazards Solution
Ā 
Data link control
Data link controlData link control
Data link control
Ā 
Fall 2016 Insurance Case Study ā€“ Finance 360Loss ControlLoss.docx
Fall 2016 Insurance Case Study ā€“ Finance 360Loss ControlLoss.docxFall 2016 Insurance Case Study ā€“ Finance 360Loss ControlLoss.docx
Fall 2016 Insurance Case Study ā€“ Finance 360Loss ControlLoss.docx
Ā 
Gene's law
Gene's lawGene's law
Gene's law
Ā 
Advanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILPAdvanced Techniques for Exploiting ILP
Advanced Techniques for Exploiting ILP
Ā 
LTE KPI Optimization - A to Z Abiola.pptx
LTE KPI Optimization - A to Z Abiola.pptxLTE KPI Optimization - A to Z Abiola.pptx
LTE KPI Optimization - A to Z Abiola.pptx
Ā 
Crossing the Boundaries: Development Strategies for (P)SoCs
Crossing the Boundaries: Development Strategies for (P)SoCsCrossing the Boundaries: Development Strategies for (P)SoCs
Crossing the Boundaries: Development Strategies for (P)SoCs
Ā 

More from Karla Adamson

Cover Essay Tulisan
Cover Essay TulisanCover Essay Tulisan
Cover Essay Tulisan
Karla Adamson
Ā 
Political Science Research Paper Outline. Free Politica
Political Science Research Paper Outline. Free PoliticaPolitical Science Research Paper Outline. Free Politica
Political Science Research Paper Outline. Free Politica
Karla Adamson
Ā 
Essay Professional Essay Writing Services And How T
Essay Professional Essay Writing Services And How TEssay Professional Essay Writing Services And How T
Essay Professional Essay Writing Services And How T
Karla Adamson
Ā 
Essay Writing Website Reviews - The Oscillation Band
Essay Writing Website Reviews - The Oscillation BandEssay Writing Website Reviews - The Oscillation Band
Essay Writing Website Reviews - The Oscillation Band
Karla Adamson
Ā 
003 Essay Example Grading Rubric Synthesisqkeythings Page Thatsnotus
003 Essay Example Grading Rubric Synthesisqkeythings Page Thatsnotus003 Essay Example Grading Rubric Synthesisqkeythings Page Thatsnotus
003 Essay Example Grading Rubric Synthesisqkeythings Page Thatsnotus
Karla Adamson
Ā 
List Of Scholarships 2021 - Schoolarship
List Of Scholarships 2021 - SchoolarshipList Of Scholarships 2021 - Schoolarship
List Of Scholarships 2021 - Schoolarship
Karla Adamson
Ā 
BPG Indian Font This BPG Indian Font Styl
BPG Indian Font This BPG Indian Font StylBPG Indian Font This BPG Indian Font Styl
BPG Indian Font This BPG Indian Font Styl
Karla Adamson
Ā 
A Friend In Need Is A Friend Indeed Essay English Handwriting
A Friend In Need Is A Friend Indeed Essay English HandwritingA Friend In Need Is A Friend Indeed Essay English Handwriting
A Friend In Need Is A Friend Indeed Essay English Handwriting
Karla Adamson
Ā 
What Is A Essay Map - Premier Unique School Writings And Services
What Is A Essay Map - Premier Unique School Writings And ServicesWhat Is A Essay Map - Premier Unique School Writings And Services
What Is A Essay Map - Premier Unique School Writings And Services
Karla Adamson
Ā 
How To Write On Black Paper With
How To Write On Black Paper WithHow To Write On Black Paper With
How To Write On Black Paper With
Karla Adamson
Ā 
Tips On Writing GMAT AWA Essays
Tips On Writing GMAT AWA EssaysTips On Writing GMAT AWA Essays
Tips On Writing GMAT AWA Essays
Karla Adamson
Ā 
Writing Assignment 2
Writing Assignment 2Writing Assignment 2
Writing Assignment 2
Karla Adamson
Ā 
The Top Ten Academic Writing Accounts (Both Har
The Top Ten Academic Writing Accounts (Both HarThe Top Ten Academic Writing Accounts (Both Har
The Top Ten Academic Writing Accounts (Both Har
Karla Adamson
Ā 
How To Word A Thesis. Format
How To Word A Thesis. FormatHow To Word A Thesis. Format
How To Word A Thesis. Format
Karla Adamson
Ā 
Guide To Writing Basic Essay
Guide To Writing Basic EssayGuide To Writing Basic Essay
Guide To Writing Basic Essay
Karla Adamson
Ā 
Scholarship Essay Descripti
Scholarship Essay DescriptiScholarship Essay Descripti
Scholarship Essay Descripti
Karla Adamson
Ā 
016 Essay Example Expository Prompts That
016 Essay Example Expository Prompts  That016 Essay Example Expository Prompts  That
016 Essay Example Expository Prompts That
Karla Adamson
Ā 
Buy Writing Paper For Kids Handwriting Practice Pape
Buy Writing Paper For Kids  Handwriting Practice PapeBuy Writing Paper For Kids  Handwriting Practice Pape
Buy Writing Paper For Kids Handwriting Practice Pape
Karla Adamson
Ā 
Top Online Essay Writing Help From Best Premium Writers
Top Online Essay Writing Help From Best Premium WritersTop Online Essay Writing Help From Best Premium Writers
Top Online Essay Writing Help From Best Premium Writers
Karla Adamson
Ā 
Great Writing 2 Great Paragraphs (Edition
Great Writing 2 Great Paragraphs (EditionGreat Writing 2 Great Paragraphs (Edition
Great Writing 2 Great Paragraphs (Edition
Karla Adamson
Ā 

More from Karla Adamson (20)

Cover Essay Tulisan
Cover Essay TulisanCover Essay Tulisan
Cover Essay Tulisan
Ā 
Political Science Research Paper Outline. Free Politica
Political Science Research Paper Outline. Free PoliticaPolitical Science Research Paper Outline. Free Politica
Political Science Research Paper Outline. Free Politica
Ā 
Essay Professional Essay Writing Services And How T
Essay Professional Essay Writing Services And How TEssay Professional Essay Writing Services And How T
Essay Professional Essay Writing Services And How T
Ā 
Essay Writing Website Reviews - The Oscillation Band
Essay Writing Website Reviews - The Oscillation BandEssay Writing Website Reviews - The Oscillation Band
Essay Writing Website Reviews - The Oscillation Band
Ā 
003 Essay Example Grading Rubric Synthesisqkeythings Page Thatsnotus
003 Essay Example Grading Rubric Synthesisqkeythings Page Thatsnotus003 Essay Example Grading Rubric Synthesisqkeythings Page Thatsnotus
003 Essay Example Grading Rubric Synthesisqkeythings Page Thatsnotus
Ā 
List Of Scholarships 2021 - Schoolarship
List Of Scholarships 2021 - SchoolarshipList Of Scholarships 2021 - Schoolarship
List Of Scholarships 2021 - Schoolarship
Ā 
BPG Indian Font This BPG Indian Font Styl
BPG Indian Font This BPG Indian Font StylBPG Indian Font This BPG Indian Font Styl
BPG Indian Font This BPG Indian Font Styl
Ā 
A Friend In Need Is A Friend Indeed Essay English Handwriting
A Friend In Need Is A Friend Indeed Essay English HandwritingA Friend In Need Is A Friend Indeed Essay English Handwriting
A Friend In Need Is A Friend Indeed Essay English Handwriting
Ā 
What Is A Essay Map - Premier Unique School Writings And Services
What Is A Essay Map - Premier Unique School Writings And ServicesWhat Is A Essay Map - Premier Unique School Writings And Services
What Is A Essay Map - Premier Unique School Writings And Services
Ā 
How To Write On Black Paper With
How To Write On Black Paper WithHow To Write On Black Paper With
How To Write On Black Paper With
Ā 
Tips On Writing GMAT AWA Essays
Tips On Writing GMAT AWA EssaysTips On Writing GMAT AWA Essays
Tips On Writing GMAT AWA Essays
Ā 
Writing Assignment 2
Writing Assignment 2Writing Assignment 2
Writing Assignment 2
Ā 
The Top Ten Academic Writing Accounts (Both Har
The Top Ten Academic Writing Accounts (Both HarThe Top Ten Academic Writing Accounts (Both Har
The Top Ten Academic Writing Accounts (Both Har
Ā 
How To Word A Thesis. Format
How To Word A Thesis. FormatHow To Word A Thesis. Format
How To Word A Thesis. Format
Ā 
Guide To Writing Basic Essay
Guide To Writing Basic EssayGuide To Writing Basic Essay
Guide To Writing Basic Essay
Ā 
Scholarship Essay Descripti
Scholarship Essay DescriptiScholarship Essay Descripti
Scholarship Essay Descripti
Ā 
016 Essay Example Expository Prompts That
016 Essay Example Expository Prompts  That016 Essay Example Expository Prompts  That
016 Essay Example Expository Prompts That
Ā 
Buy Writing Paper For Kids Handwriting Practice Pape
Buy Writing Paper For Kids  Handwriting Practice PapeBuy Writing Paper For Kids  Handwriting Practice Pape
Buy Writing Paper For Kids Handwriting Practice Pape
Ā 
Top Online Essay Writing Help From Best Premium Writers
Top Online Essay Writing Help From Best Premium WritersTop Online Essay Writing Help From Best Premium Writers
Top Online Essay Writing Help From Best Premium Writers
Ā 
Great Writing 2 Great Paragraphs (Edition
Great Writing 2 Great Paragraphs (EditionGreat Writing 2 Great Paragraphs (Edition
Great Writing 2 Great Paragraphs (Edition
Ā 

Recently uploaded

Francesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptxFrancesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptx
EduSkills OECD
Ā 
Multithreading_in_C++ - std::thread, race condition
Multithreading_in_C++ - std::thread, race conditionMultithreading_in_C++ - std::thread, race condition
Multithreading_in_C++ - std::thread, race condition
Mohammed Sikander
Ā 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
Ā 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
Peter Windle
Ā 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
vaibhavrinwa19
Ā 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
SACHIN R KONDAGURI
Ā 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
Ā 
Digital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion DesignsDigital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion Designs
chanes7
Ā 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
Ā 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
Ā 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
Ā 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
Ā 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
Ā 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
Ā 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
Ā 
The Accursed House by Ɖmile Gaboriau.pptx
The Accursed House by Ɖmile Gaboriau.pptxThe Accursed House by Ɖmile Gaboriau.pptx
The Accursed House by Ɖmile Gaboriau.pptx
DhatriParmar
Ā 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
Ā 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
Ā 
Best Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDABest Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDA
deeptiverma2406
Ā 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
Ā 

Recently uploaded (20)

Francesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptxFrancesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptx
Ā 
Multithreading_in_C++ - std::thread, race condition
Multithreading_in_C++ - std::thread, race conditionMultithreading_in_C++ - std::thread, race condition
Multithreading_in_C++ - std::thread, race condition
Ā 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Ā 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
Ā 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
Ā 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
Ā 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
Ā 
Digital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion DesignsDigital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion Designs
Ā 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Ā 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Ā 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Ā 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ā 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Ā 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Ā 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Ā 
The Accursed House by Ɖmile Gaboriau.pptx
The Accursed House by Ɖmile Gaboriau.pptxThe Accursed House by Ɖmile Gaboriau.pptx
The Accursed House by Ɖmile Gaboriau.pptx
Ā 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Ā 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Ā 
Best Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDABest Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDA
Ā 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Ā 

2-Advanced Computer Architecture Pipelining

  • 1. Dr. Shadrokh Samavi 1 Advanced Computer Architecture Pipelining
  • 3. Dr. Shadrokh Samavi 3 Some slides are from the instructorsā€™ resources which accompany the 5th and previous editions. Some slides are from David Patterson, David Culler and Krste Asanovic of UC Berkeley; Israel Koren of UM Amherst, and Milos Prvulovic of Georgia Tech. Sources of some slides are mentioned at the bottom of the page. Please inform me if I am missing a name in the above list. 3
  • 4. Dr. Shadrokh Samavi 4 C.1. What Is Pipelining?
  • 5. Dr. Shadrokh Samavi 5 ā€¢ An implementation technique whereby multiple instructions are overlapped in execution. ā€¢ pipe stage ā€¢ throughput ā€¢ machine cycle
  • 6. Dr. Shadrokh Samavi 6 IF ID EXE MEM WB
  • 12. Dr. Shadrokh Samavi 12 Pipelined
  • 13. Dr. Shadrokh Samavi 13 Simple RISC Pipeline Clock number Instruction number 1 2 3 4 5 6 7 8 9 Instruction i IF ID EX MEM WB Instruction i + 1 IF ID EX MEM WB Instruction i + 2 IF ID EX MEM WB Instruction i + 3 IF ID EX MEM WB Instruction i + 4 IF ID EX MEM WB
  • 14. Dr. Shadrokh Samavi 14 Review: Visualizing Pipelining I n s t r. O r d e r Time (clock cycles) Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 6 Cycle 7 Cycle 5 Adapted from Patterson, Katz and Kubiatowicz Ā© UCB
  • 16. Dr. Shadrokh Samavi 16 C.2. Pipeline Hazards
  • 17. Dr. Shadrokh Samavi 17 ā€¢ Hazards: circumstances that would cause incorrect execution if next instruction were launched ā€“ Structural hazards: Attempting to use the same hardware to do two different things at the same time ā€“ Data hazards: Instruction depends on result of prior instruction still in the pipeline ā€“ Control hazards: Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps).
  • 18. Dr. Shadrokh Samavi 18 Speedup from pipelining Average instruction time unpipelined Average instruction time pipelined ----------------------------------------- - = CPI unpipelined Clock cycle unpipelined Ɨ CPI pipelined Clock cycle pipelined ----------------- - = CPI unpipelined CPI pipelined ----------------- Clock cycle unpipelined Clock cycle pipelined -------------------------- - = CPI pipelined Ideal CPI Pipeline stall clock cycles per instruction + = 1 Pipeline stall clock cycles per instruction + = Speedup CPI unpipelined 1 Pipeline stall cycles per instruction + ------------------------------------------ - = Speedup Pipeline depth 1 Pipeline stall cycles per instruction + ------------------------------------------ = Speedup from pipelining CPI unpipelined CPI pipelined ----------------- - Clock cycle unpipelined Clock cycle pipelined ------------------------- - = 1 1 Pipeline stall cycles per instruction + ----------------------------------------- Clock cycle unpipelined Clock cycle pipelined ------------------------ - = -------------------------- Assuming same Clock cycle for pipelined & unpipelined Ɨ Ɨ Ɨ
  • 19. Dr. Shadrokh Samavi 19 Example: Structural Hazard I n s t r. O r d e r Time (clock cycles) Load Instr 1 Instr 2 Instr 3 Instr 4 Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 6 Cycle 7 Cycle 5 DMem Structural Hazard
  • 20. Dr. Shadrokh Samavi 20 Resolving structural hazards ā€¢ Definition of structural hazard: attempt to use same hardware for two different things at the same time ā€¢ Solution 1: Wait (stall) ļƒžmust detect the hazard ļƒžmust have mechanism to stall ā€¢ Solution 2: Use more hardware
  • 21. Dr. Shadrokh Samavi 21 Detecting and Resolving Structural Hazard I n s t r. O r d e r Time (clock cycles) Load Instr 1 Instr 2 Stall Instr 3 Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 6 Cycle 7 Cycle 5 Reg ALU DMem Ifetch Reg Bubble Bubble Bubble Bubble Bubble Adapted from Patterson, Katz and Kubiatowicz Ā© UCB
  • 22. Dr. Shadrokh Samavi 22 Eliminating Structural Hazards at Design Time ALU Instr Cache Reg File MUX MUX Data Cache MUX Sign Extend Zero? IF/ID ID/EX MEM/WB EX/MEM 4 Adder Next SEQ PC Next SEQ PC RD RD RD WB Data Next PC Address RS1 RS2 Imm MUX Datapath Control Path
  • 23. Dr. Shadrokh Samavi 23 WB M EX WB M WB Control IF/ID ID/EX EX/MEM MEM/WB Instruction RegDst ALUOp ALUSrc Branch MemRead MemWrite MemtoReg RegWrite strip off signals for execution phase strip off signals for write-back phase strip off signals for memory phase Genera- tion CSE 60321 University of Notre Dame
  • 24. Dr. Shadrokh Samavi 24 Role of ISA in Structural Hazard Resolution ā€¢ Simple to determine the sequence of resources used by an instruction ā€“ opcode tells it all ā€¢ Uniformity in the resource usage ā€¢ Compare MIPS to IA32? ā€¢ MIPS approach => all instructions flow through same 5-stage pipelining
  • 25. Dr. Shadrokh Samavi 25 Data Hazards I n s t r. O r d e r add r1,r2,r3 sub r4,r1,r3 and r6,r1,r7 or r8,r1,r9 xor r10,r1,r11 Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Time (clock cycles) IF ID/RF EX MEM WB Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Adapted from Patterson, Katz and Kubiatowicz Ā© UCB
  • 26. Dr. Shadrokh Samavi 26 Program execution order (in instructions) Time (in clock cycles) CC1 CC2 CC3 CC4 CC5 CC6 ADD R1, R2, R3 DM DM DM Reg Reg Reg IM IM IM Reg Reg ALU ALU ALU LW R4, 0(R1) SW 12(R1), R4 Stores require an operand during MEM, and forwarding of that operand is shown here
  • 27. Dr. Shadrokh Samavi 27 Data Hazards Classification Resource Objects (R.O.): all addressable locations Data Objects (D.O.): content of resource objects D(I): Domain of instruction I = all R.O. that their D.O. effect the operation of I. R(I): Range of instruction I = all R.O. that their D.O. are effected by the execution of I.
  • 28. Dr. Shadrokh Samavi 28 A RAW hazard exists on register ļ² if ļ² ļƒŽ R ( i ) ļƒ‡ D( j ) ļ‚¹ ļ¦ A WAW hazard exists on register ļ² if ļ² ļƒŽ R( i ) ļƒ‡ R(j ) ļ‚¹ ļ¦ A WAR hazard exists on register ļ² if ļ² ļƒŽ D( i ) ļƒ‡ R (j ) ļ‚¹ ļ¦
  • 29. Dr. Shadrokh Samavi 29 D(I) R(I) D(J) D(J) I write Read J D(J) R(J) D(I) D(I) J write Read I D(I) R(I) D(J) R(J) I write write J RAW WAW WAR
  • 30. Dr. Shadrokh Samavi 30 ā€¢ Read After Write (RAW) InstrJ tries to read operand before InstrI writes it ā€¢ Caused by a ā€œData Dependenceā€ (in compiler nomenclature). This hazard results from an actual need for communication. Three Generic Data Hazards I: add r1,r2,r3 J: sub r4,r1,r3
  • 31. Dr. Shadrokh Samavi 31 ā€¢ Write After Read (WAR) InstrJ writes operand before InstrI reads it ā€¢ Called an ā€œanti-dependenceā€ by compiler writers. This results from reuse of the name ā€œr1ā€. ā€¢ Canā€™t happen in MIPS 5 stage pipeline because: ā€“ All instructions take 5 stages, and ā€“ Reads are always in stage 2, and ā€“ Writes are always in stage 5 I: sub r4,r1,r3 J: add r1,r2,r3 K: mul r6,r1,r7 Three Generic Data Hazards
  • 32. Dr. Shadrokh Samavi 32 ā€¢ Write After Write (WAW) InstrJ writes operand before InstrI writes it. ā€¢ Called an ā€œoutput dependenceā€ by compiler writers This also results from the reuse of name ā€œr1ā€. ā€¢ Canā€™t happen in MIPS 5 stage pipeline because: ā€“ All instructions take 5 stages, and ā€“ Writes are always in stage 5 ā€¢ Will see WAR and WAW in later more complicated pipes I: sub r1,r4,r3 J: add r1,r2,r3 K: mul r6,r1,r7 Three Generic Data Hazards
  • 33. Dr. Shadrokh Samavi 33 Time (clock cycles) Forwarding to Avoid Data Hazard I n s t r. O r d e r add r1,r2,r3 sub r4,r1,r3 and r6,r1,r7 or r8,r1,r9 xor r10,r1,r11 Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg
  • 34. Dr. Shadrokh Samavi 34 Time (clock cycles) I n s t r. O r d e r lw r1, 0(r2) sub r4,r1,r6 and r6,r1,r7 or r8,r1,r9 Data Hazard Even with Forwarding Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Adapted from Patterson, Katz and Kubiatowicz Ā© UCB
  • 35. Dr. Shadrokh Samavi 35 Resolving this load hazard ā€¢ Adding hardware? ... not ā€¢ Detection? ā€¢ Compilation techniques? ā€¢ What is the cost of load delays?
  • 36. Dr. Shadrokh Samavi 36 Resolving the Load Data Hazard Time (clock cycles) or r8,r1,r9 I n s t r. O r d e r lw r1, 0(r2) sub r4,r1,r6 and r6,r1,r7 Reg ALU DMem Ifetch Reg Reg Ifetch ALU DMem Reg Bubble Ifetch ALU DMem Reg Bubble Reg Ifetch ALU DMem Bubble Reg How is this different from the instruction issue stall?
  • 37. Dr. Shadrokh Samavi 37 Try producing fast code for a = b + c; d = e ā€“ f; assuming a, b, c, d ,e, and f in memory. Slow code: LW Rb,b LW Rc,c ADD Ra,Rb,Rc SW a,Ra LW Re,e LW Rf,f SUB Rd,Re,Rf SW d,Rd Software Scheduling to Avoid Load Hazards Fast code: LW Rb, b LW Rc, c LW Re,e ADD Ra, Rb, Rc LW Rf, f SW a,Ra SUB Rd, Re, Rf SW d,Rd
  • 38. Dr. Shadrokh Samavi 38 Instruction Set Connection ā€¢ What is exposed about this organizational hazard in the instruction set? ā€¢ k cycle delay? ā€“ bad, CPI is not part of ISA ā€¢ k instruction slot delay ā€“ load should not be followed by use of the value in the next k instructions ā€¢ Nothing, but code can reduce run-time delays ā€¢ MIPS did the transformation in the assembler
  • 39. Dr. Shadrokh Samavi 39 Fraction of loads that cause a stall Benchmark compress eqntott espresso gcc li doduc ear hydro2d mdljdp su2cor 24% 41% 12% 23% 24% 20% 20% 10% 10% 4% 0% 5% 10% 15% 20% 25% 30% 35% 40% 45%
  • 40. Dr. Shadrokh Samavi 40 Control Hazard on Branches => Three Stage Stall 10: beq r1,r3,36 14: and r2,r3,r5 18: or r6,r1,r7 22: add r8,r1,r9 36: xor r10,r1,r11 Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg Reg ALU DMem Ifetch Reg
  • 41. Dr. Shadrokh Samavi 41 Example: Branch Stall Impact ā€¢ If 30% branch, Stall 3 cycles significant ā€¢ Two part solution: ā€“ Determine branch taken or not sooner, AND ā€“ Compute taken branch address earlier ā€¢ MIPS branch tests if register = 0 or ļ‚¹ 0 ā€¢ MIPS Solution: ā€“ Move Zero test to ID/RF stage ā€“ Adder to calculate new PC in ID/RF stage ā€“ 1 clock cycle penalty for branch versus 3
  • 42. Dr. Shadrokh Samavi 42 The frequency of instructions that may change the PC Percentage of instructions executed 0% 25% 5% 10% 15% 20% 10% 0% 0% 2% 1% 2% 6% 4% 4% 6% 2% 2% 11% 8% 4% 12% 4% 3% 11% 1% 4% 22% 2% 2% 11% 3% 3% 9% 0% 1% Forward conditional branches Unconditional branches Backward conditional branches Benchmark compress eqntott espresso gcc li doduc ear hydro2d mdljdp su2cor
  • 43. Dr. Shadrokh Samavi 43 Four Branch Hazard Alternatives #1: Stall until branch direction is clear #2: Predict Branch Not Taken ā€“ Execute successor instructions in sequence ā€“ ā€œSquashā€ instructions in pipeline if branch actually taken ā€“ Advantage of late pipeline state update ā€“ 47% MIPS branches not taken on average ā€“ PC+4 already calculated, so use it to get next instruction #3: Predict Branch Taken ā€“ 53% MIPS branches taken on average ā€“ But havenā€™t calculated branch target address in MIPS Ā» MIPS still incurs 1 cycle branch penalty Ā» Other machines: branch target known before outcome
  • 44. Dr. Shadrokh Samavi 44 #4: Delayed Branch ā€“ Define branch to take place AFTER a following instruction branch instruction sequential successor1 sequential successor2 ........ sequential successorn ........ branch target if taken ā€“ 1 slot delay allows proper decision and branch target address in 5 stage pipeline ā€“ MIPS uses this Branch delay of length n Four Branch Hazard Alternatives
  • 45. Dr. Shadrokh Samavi 45 ā€¢ Where to get instructions to fill branch delay slot? ā€“ Before branch instruction ā€“ From the target address: only valuable when branch taken ā€“ From fall through: only valuable when branch not taken ā€“ Canceling branches allow more slots to be filled ā€¢ Compiler effectiveness for single branch delay slot: ā€“ Fills about 60% of branch delay slots ā€“ About 80% of instructions executed in branch delay slots useful in computation ā€“ About 50% (60% x 80%) of slots usefully filled ā€¢ Delayed Branch downside: 7-8 stage pipelines, multiple instructions issued per clock (superscalar) Delayed Branch
  • 46. Dr. Shadrokh Samavi 46 (a) From before (b) From target (c) From fall through SUB R4, R5, R6 ADD R1, R2, R3 if R1 = 0 then ADD R1, R2, R3 if R1 = 0 then SUB R4, R5, R6 SUB R4, R5, R6 ADD R1, R2, R3 if R1 = 0 then OR R7, R8, R9 ADD R1, R2, R3 if R1 = 0 then SUB R4, R5, R6 ADD R1, R2, R3 if R2 = 0 then if R2 = 0 then ADD R1, R2, R3 Becomes Becomes Becomes Delay slot Delay slot Delay slot OR R7, R8, R9 SUB R4, R5, R6
  • 47. Dr. Shadrokh Samavi 47 Canceling or Nullifying Branch. In a canceling branch, the instruction includes the direction that the branch was predicted. When the branch behaves as predicted, the instruction in the branch-delay slot is simply executed as it would normally be with a delayed branch. When the branch is incorrectly predicted, the instruction in the branch-delay slot is simply turned into a no-op.
  • 48. Dr. Shadrokh Samavi 48 Delayed-branch scheduling schemes Scheduling strategy (a) From before. (b) From target (c) From fall through Improves performance when? Always. When branch is taken. May enlarge program if instructions are duplicated. When branch is not taken. Requirements ā€¢Branch must not depend on the rescheduled instructions ā€¢Must be OK to execute rescheduled instructions if branch is not taken. May need to duplicate instructions. ā€¢Must be OK to execute instructions if branch is taken.
  • 49. Dr. Shadrokh Samavi 49 The behavior of a predicted-taken canceling branch depends on whether the branch is taken or not. Branch is NOT TAKEN (wrong prediction) Branch is TAKEN ( predicted correctly)
  • 50. Dr. Shadrokh Samavi 50 Delayed and Canceling Delay Branches
  • 51. Dr. Shadrokh Samavi 51 Overall costs of a variety of branch schemes
  • 52. Dr. Shadrokh Samavi 52 For an R4000-style pipeline, it takes 3 pipeline stages before the branch target address is known & 1 more cycle to evaluate branch condition. This leads to the following branch penalties: Find the effective addition to the CPI arising from branches given that the relative frequency of unconditional, conditional untaken, and conditional taken branches are 4%, 6%, and 10%, respectively.
  • 53. Dr. Shadrokh Samavi 53 Control Hazards 1. Find out whether the branch is taken or not taken earlier in the pipeline. 2. Compute the taken PC (i.e., the address of the branch target) earlier. Branch instruction IF ID EX MEM WB Branch successor IF stall stall IF ID EX MEM WB Branch successor + 1 IF ID EX MEM WB Branch successor + 2 IF ID EX MEM Branch successor + 3 IF ID EX Branch successor + 4 IF ID Branch successor + 5 IF
  • 54. Dr. Shadrokh Samavi 54 C.3.Implementation of Pipelining
  • 55. Dr. Shadrokh Samavi 55 Instruction fetch Instruction decode/ register fetch Execute/ Address calculation Memory access Write back
  • 56. Dr. Shadrokh Samavi 56 IF/ID ID/EX MEM/WB EX/MEM IF IF ID EX M E M WB INST 1 INST 2
  • 57. Dr. Shadrokh Samavi 57 Events on every pipe stage of the MIPS pipeline.
  • 58. Dr. Shadrokh Samavi 58 5 Steps of MIPS Datapath Memory Access Write Back Instruction Fetch Instr. Decode Reg. Fetch Execute Addr. Calc ALU Memory Reg File MUX MUX Memory MUX Sign Extend Zero? IF/ID ID/EX MEM/WB EX/MEM 4 Adder Next SEQ PC Next SEQ PC RD RD RD WB Data Next PC Address RS1 RS2 Imm MUX Datapath Control Path Inst 1
  • 59. Dr. Shadrokh Samavi 59 5 Steps of MIPS Datapath Memory Access Write Back Instruction Fetch Instr. Decode Reg. Fetch Execute Addr. Calc ALU Memory Reg File MUX MUX Memory MUX Sign Extend Zero? IF/ID ID/EX MEM/WB EX/MEM 4 Adder Next SEQ PC Next SEQ PC RD RD RD WB Data Next PC Address RS1 RS2 Imm MUX Datapath Control Path Inst 1 Inst 2
  • 60. Dr. Shadrokh Samavi 60 5 Steps of MIPS Datapath Memory Access Write Back Instruction Fetch Instr. Decode Reg. Fetch Execute Addr. Calc ALU Memory Reg File MUX MUX Memory MUX Sign Extend Zero? IF/ID ID/EX MEM/WB EX/MEM 4 Adder Next SEQ PC Next SEQ PC RD RD RD WB Data Next PC Address RS1 RS2 Imm MUX Datapath Control Path Inst 1 Inst 2 Inst 3
  • 62. Dr. Shadrokh Samavi 62 Situation Example code sequence Action No dependence LWR1,45(R2) ADD R5,R6,R7 SUB R8,R6,R7 OR R9,R6,R7 No hazard possible because no dependence exists on R1 in the immediately following three instructions . Dependence requiring stall LWR1,45(R2) ADD R5, R1,R7 SUB R8,R6,R7 OR R9,R6,R7 Comparators detect the use of R1 in the ADD and stall the ADD (and SUB and OR) before the ADD begins EX. Dependence overcome by forwarding LWR1,45(R2) ADD R5,R6,R7 SUB R8,R1,R7 OR R9,R6,R7 Comparators detect use of R1 in SUB and forward result of load to ALU in time for SUB to begin EX. Dependence with accesses in order LWR1,45(R2) ADD R5,R6,R7 SUB R8,R6,R7 OR R9,R1,R7 No action required because the read of R1 by OR occurs in the second half of the ID phase, while the write of the loaded data occurred in the first half. No action required because the read of R1 by OR occurs in the second half of the ID phase, while the write of the loaded data occurred in the first half. Situations that the pipeline hazard detection hardware can see by comparing the destination and sources of adjacent instructions.
  • 63. Dr. Shadrokh Samavi 63 Forwarding of data Pipeline register containing source instruction Opcode of source instruction Pipeline register containing destination instruction Opcode of destination instruction Destination of the forwarded result Comparison (if equal then forward) EX/MEM ID/EX ALU immediate, load, store, branch Top ALU input EX/MEM.IR 16..20 == ID/EX.IR 6..10 EX/MEM ID/EX Bottom ALU input EX/MEM.IR 16..20 == ID/EX.IR 11..15 MEM/WB ID/EX Register-register ALU, ALU immediate, load, store, branch MEM/WB.IR 16..20 == ID/EX.IR 6..10 MEM/WB ID/EX Register-register ALU MEM/WB.IR 16..20 == ID/EX.IR 11..15 EX/MEM ALU immediate ID/EX ALU immediate, load, store, branch EX/MEM.IR 11..15 == ID/EX.IR 6..10 EX/MEM ALU immediate ID/EX EX/MEM.IR 11..15 == ID/EX.IR 11..15 MEM/WB ALU immediate ID/EX ALU immediate, load, store, branch MEM/WB.IR 11..15 == ID/EX.IR 6..10 MEM/WB ALU immediate ID/EX MEM/WB.IR 11..15 == ID/EX.IR 11..15 MEM/WB Load ID/EX ALU immediate, load, store, branch MEM/WB.IR 11..15 == ID/EX.IR 6..10 MEM/WB Load ID/EX MEM/WB.IR 11..15 == ID/EX.IR 11..15 Register-register ALU Register-register ALU Register-register ALU Register-register ALU Register-register ALU Register-register ALU Register-register ALU, Register-register ALU, Register- register ALU, Register- register ALU, Register- register ALU, Register- register ALU, Top ALU input Bottom ALU input Top ALU input Bottom ALU input Top ALU input Bottom ALU input Top ALU input Bottom ALU input
  • 64. Dr. Shadrokh Samavi 64 HW Change for Forwarding MEM/WR ID/EX EX/MEM Data Memory ALU mux mux Registers NextPC Immediate mux
  • 65. Dr. Shadrokh Samavi 65 Memory Access Write Back Instruction Fetch Instr. Decode Reg. Fetch Execute Addr. Calc ALU Memory Reg File MUX MUX Memory MUX Sign Extend Zero? IF/ID ID/EX MEM/WB EX/MEM 4 Adder Next SEQ PC Next SEQ PC RD RD RD WB Data Next PC Address RS1 RS2 Imm MUX Datapath
  • 66. Dr. Shadrokh Samavi 66 Adder IF/ID Pipelined MIPS Datapath Memory Access Write Back Instruction Fetch Instr. Decode Reg. Fetch Execute Addr. Calc ALU Memory Reg File MUX Data Memory MUX Sign Extend Zero? MEM/WB EX/MEM 4 Adder Next SEQ PC RD RD RD WB Data ā€¢ Data stationary control ā€“ local decode for each instruction phase / pipeline stage Next PC Address RS1 RS2 Imm MUX ID/EX
  • 67. Dr. Shadrokh Samavi 67 C.5. Multicycle Operations
  • 68. Dr. Shadrokh Samavi 68 The MIPS pipeline with three additional unpipelined, floating-point, functional units.
  • 69. Dr. Shadrokh Samavi 69 Latency : the number of intervening cycles between an instruction that produces a result and an instruction that uses the result. The initiation or repeat interval: the number of cycles that must elapse between issuing two operations of a given type.
  • 71. Dr. Shadrokh Samavi 71 The pipeline timing of a set of independent FP operations. A typical FP code sequence showing the stalls arising from RAW hazards.
  • 72. Dr. Shadrokh Samavi 72 Three instructions want to perform a write back to the FP register file simultaneously, as shown in clock cycle 11.
  • 74. Dr. Shadrokh Samavi 74 View publication stats View publication stats