SlideShare a Scribd company logo
1 of 26
1
Single-cycle datapath, slightly rearranged
MemToReg
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite
MemRead
1
0
4
Shift
left 2
P
C
Add
1
0
PCSrc
Sign
extend
ALUSrc
Result
Zero
ALU
ALUOp
Instr [15 - 0]
RegDst
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite
Add
Instr [15 - 11]
Instr [20 - 16]
0
1
0
1
2
Pipeline registers
 In pipelining, we divide instruction execution into multiple cycles
— IF ID EX MEM WB
 Information computed during one cycle may be needed in a later cycle:
— Instruction read in IF stage determines which registers are fetched in
ID stage, what immediate is used for EX stage, and what destination
register is for WB
— Register values read in ID are used in EX and/or MEM stages
— ALU output produced in EX is an effective address for MEM or a result
for WB
 A lot of information to save!
— Saved in intermediate registers called pipeline registers
 The registers are named for the stages they connect:
IF/ID ID/EX EX/MEM MEM/WB
 No register is needed after the WB stage, because after WB the
instruction is done
3
Pipelined datapath
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite
MemRead
1
0
MemToReg
4
Shift
left 2
Add
Sign
extend
ALUSrc
Result
Zero
ALU
ALUOp
Instr [15 - 0]
RegDst
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite
Add
Instr [15 - 11]
Instr [20 - 16]
0
1
0
1
IF/ID ID/EX EX/MEM MEM/WB
1
0
PCSrc
P
C
4
Propagating values forward
 Data values required later propagated through the pipeline registers
 The most extreme example is the destination register (rd or rt)
— It is retrieved in IF, but isn’t updated until the WB
— Thus, it must be passed through all pipeline stages, as shown in red
on the next slide
 Notice that we can’t keep a single “instruction register,” because the
pipelined machine needs to fetch a new instruction every clock cycle
5
The destination register
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite
MemRead
1
0
MemToReg
4
Shift
left 2
Add
ALUSrc
Result
Zero
ALU
ALUOp
Instr [15 - 0]
RegDst
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite
Add
Instr [15 - 11]
Instr [20 - 16]
0
1
0
1
IF/ID ID/EX EX/MEM MEM/WB
1
0
PCSrc
P
C
Sign
extend
6
What about control signals?
 Control signals generated similar to the single-cycle processor
— in the ID stage, the processor decodes the instruction fetched in IF
and produces the appropriate control values
 Some of the control signals will not be needed until later stages
— These signals must be propagated through the pipeline until they
reach the appropriate stage
— We just pass them in the pipeline registers, along with the data
 Control signals can be categorized by the pipeline stage that uses them
Stage Control signals needed
EX ALUSrc ALUOp RegDst
MEM MemRead MemWrite PCSrc
WB RegWrite MemToReg
7
Pipelined datapath and control
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite
MemRead
1
0
MemToReg
4
Shift
left 2
Add
ALUSrc
Result
Zero
ALU
ALUOp
Instr [15 - 0]
RegDst
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite
Add
Instr [15 - 11]
Instr [20 - 16]
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
P
C
1
0
PCSrc
Sign
extend
EX
M
WB
8
 Here’s a sample sequence of instructions to execute
1000: lw $8, 4($29)
1004: sub $2, $4, $5
1008: and $9, $10, $11
1012: or $16, $17, $18
1016: add $13, $14, $0
 We’ll make some assumptions, just so we can show actual data values:
— Each register contains its number plus 100. For instance, register $8
contains 108, register $29 contains 129, etc.
— Every data memory location contains 99
 Our pipeline diagrams will follow some conventions:
— An X indicates values that aren’t important, like the constant field of
an R-type instruction
— Question marks ??? indicate values we don’t know, usually resulting
from instructions coming before and after the ones in our example
An example execution sequence
addresses
in decimal
9
Cycle 1 (filling)
IF: lw $8, 4($29) MEM: ??? WB: ???
EX: ???
ID: ???
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite (?)
MemRead (?)
1
0
MemToReg
(?)
Shift
left 2
Add
1
0
PCSrc
ALUSrc (?)
Result
Zero
ALU
ALUOp (???)
RegDst (?)
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite (?)
Add
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
1000
1004
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
???
4
P
C
Sign
extend
EX
M
WB
10
Cycle 2
ID: lw $8, 4($29)
IF: sub $2, $4, $5 MEM: ??? WB: ???
EX: ???
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
1
0
4
Shift
left 2
Add
PCSrc
Result
Zero
ALU
4
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
Add
X
8
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
1004
29
X
1008
129
X
MemToReg
(?)
???
???
???
???
???
???
RegWrite (?)
MemWrite (?)
MemRead (?)
???
???
???
ALUSrc (?)
ALUOp (???)
RegDst (?)
???
???
???
???
???
???
???
P
C
Sign
extend
EX
M
WB
1
0
11
Cycle 3
ID: sub $2, $4, $5
IF: and $9, $10, $11 EX: lw $8, 4($29) MEM: ??? WB: ???
MemToReg
(?)
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite (?)
MemRead (?)
1
0
4
Shift
left 2
Add
PCSrc
ALUSrc (1)
Result
Zero
ALU
ALUOp (add)
X
RegDst (0)
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
Add
2
X
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
1008
4
5
1012
104
105
129
4
X
8
X
8
133
4
???
???
???
???
???
???
RegWrite (?)
???
???
???
P
C
Sign
extend
1
0
EX
M
WB
12
Cycle 4
ID: and $9, $10, $11
IF: or $16, $17, $18 EX: sub $2, $4, $5 MEM: lw $8, 4($29) WB: ???
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite (0)
MemRead (1)
1
0
MemToReg
(?)
4
Shift
left 2
Add
PCSrc
ALUSrc (0)
Result
Zero
ALU
ALUOp (sub)
X
RegDst (1)
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite (?)
Add
9
X
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
1012
10
11
1016
110
111
104
X
105
X
2
2
–1
133
X
99
8
???
???
???
???
???
???
P
C
Sign
extend
EX
M
WB
1
0
13
Cycle 5 (full)
ID: or $16, $17, $18
IF: add $13, $14, $0 EX: and $9, $10, $11 MEM: sub $2, $4, $5 WB:
lw $8, 4($29)
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite (0)
MemRead (0)
1
0
MemToReg
(1)
4
Shift
left 2
Add
PCSrc
ALUSrc (0)
Result
Zero
ALU
ALUOp (and)
X
RegDst (1)
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite (1)
Add
16
X
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
1016
17
18
1020
117
118
110
X
111
X
9
9
110
-1
105
X
2
99
133
99
8
99
8
P
C
Sign
extend
EX
M
WB
1
0
14
Cycle 6 (emptying)
ID: add $13, $14, $0
IF: ??? EX: or $16, $17, $18 MEM: and $9, $10, $11 WB: sub
$2, $4, $5
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite (0)
MemRead (0)
1
0
MemToReg
(0)
4
Shift
left 2
Add
PCSrc
ALUSrc (0)
Result
Zero
ALU
ALUOp (or)
X
RegDst (1)
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite (1)
Add
13
X
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
1020
14
0
???
114
0
117
X
118
X
16
16
119
110
111
X
9
X
-1
-1
2
-1
2
P
C
Sign
extend
1
0
EX
M
WB
15
Cycle 7
ID: ???
IF: ??? EX: add $13, $14, $0 MEM: or $16, $17, $18 WB: and
$9, $10, $11
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite (0)
MemRead (0)
1
0
MemToReg
(0)
4
Shift
left 2
Add
PCSrc
ALUSrc (0)
Result
Zero
ALU
ALUOp (add)
???
RegDst (1)
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite (1)
Add
???
???
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
???
???
???
???
114
X
0
X
13
13
114
119
118
X
16
X
110
110
9
110
9
P
C
Sign
extend
???
???
EX
M
WB
1
0
16
Cycle 8
ID: ???
IF: ??? EX: ??? MEM: add $13, $14, $0 WB: or $16,
$17, $18
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite (0)
MemRead (0)
1
0
MemToReg
(0)
4
Shift
left 2
Add
PCSrc
ALUSrc (?)
Result
Zero
ALU
ALUOp (???)
???
RegDst (?)
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite (1)
Add
???
???
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
???
???
???
???
???
???
114
0
X
13
X
119
119
16
119
16
P
C
Sign
extend
???
??? ???
???
???
???
???
1
0
EX
M
WB
17
Cycle 9
ID: ???
IF: ??? EX: ??? MEM: ??? WB: add
$13, $14, $0
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite (?)
MemRead (?)
1
0
MemToReg
(0)
4
Shift
left 2
Add
PCSrc
ALUSrc (?)
Result
Zero
ALU
ALUOp (???)
???
RegDst (?)
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite (1)
Add
???
???
0
1
0
1
IF/ID
ID/EX
EX/MEM
MEM/WB
Control
M
WB
WB
???
???
???
???
???
???
???
???
?
X
???
X
114
114
13
114
13
P
C
Sign
extend
???
???
???
???
???
???
1
0
EX
M
WB
18
That’s a lot of diagrams there
 Compare the last few slides with the pipeline diagram above
— You can see how instruction executions are overlapped
— Each functional unit is used by a different instruction in each cycle
— The pipeline registers save control and data values generated in
previous clock cycles for later use
— When the pipeline is full in clock cycle 5, all of the hardware units
are utilized. This is the ideal situation, and what makes pipelined
processors so fast
 See the textbook for more examples
Clock cycle
1 2 3 4 5 6 7 8 9
lw $t0, 4($sp) IF ID EX MEM WB
sub $v0, $a0, $a1 IF ID EX MEM WB
and $t1, $t2, $t3 IF ID EX MEM WB
or $s0, $s1, $s2 IF ID EX MEM WB
add $t5, $t6, $0 IF ID EX MEM WB
19
Instruction set architectures and pipelining
 The MIPS instruction set was designed especially for easy pipelining:
— All instructions are 32-bits long, so the instruction fetch stage just
needs to read one word on every clock cycle
— Fields are in the same position in different instruction formats—the
opcode is always the first six bits, rs is the next five bits, etc. This
makes things easy for the ID stage
— MIPS is a register-to-register architecture, so arithmetic operations
cannot contain memory references. This keeps the pipeline shorter
and simpler
 Pipelining is harder for older/more complex instruction sets:
— If different instructions had different lengths or formats, the fetch
and decode stages would need extra time to determine the actual
length of each instruction and the position of the fields
— With memory-to-memory instructions, additional pipeline stages may
be needed to compute effective addresses and read memory before
the EX stage
20
Note how everything goes left to right, except …
Read
address
Instruction
memory
Instruction
[31-0]
Address
Write
data
Data
memory
Read
data
MemWrite
MemRead
1
0
MemToReg
4
Shift
left 2
Add
Sign
extend
ALUSrc
Result
Zero
ALU
ALUOp
Instr [15 - 0]
RegDst
Read
register 1
Read
register 2
Write
register
Write
data
Read
data 2
Read
data 1
Registers
RegWrite
Add
Instr [15 - 11]
Instr [20 - 16]
0
1
0
1
IF/ID ID/EX EX/MEM MEM/WB
1
0
PCSrc
P
C
21
An example with dependencies
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
 There are several dependencies in this new code fragment
— the first instruction, SUB, stores a value into $2
— that register is used as a source in the rest of the instructions
 This is not a problem for the single-cycle datapath
— each instruction is executed completely before the next one begins,
so instructions 2 through 5 above use the new value of $2
 How would this code sequence fare in our 5-stage MIPS pipeline?
22
Clock cycle
1 2 3 4 5 6 7 8 9
sub $2, $1, $3 IF ID EX MEM WB
and $12, $2, $5 IF ID EX MEM WB
or $13, $6, $2 IF ID EX MEM WB
add $14, $2, $2 IF ID EX MEM WB
sw $15, 100($2) IF ID EX MEM WB
 The sub instruction does not write to register $2 until clock cycle 5. This
causes two data hazards in our current pipelined datapath:
— the and reads register $2 in cycle 3, and since sub hasn’t modified
the register yet, this will be the old value of $2, not the new one
— the or instruction uses register $2 in cycle 4, again before it’s
actually updated by sub
Data hazards in the pipeline diagram
23
Clock cycle
1 2 3 4 5 6 7 8 9
sub $2, $1, $3 IF ID EX MEM WB
and $12, $2, $5 IF ID EX MEM WB
or $13, $6, $2 IF ID EX MEM WB
add $14, $2, $2 IF ID EX MEM WB
sw $15, 100($2) IF ID EX MEM WB
 The add instruction is okay, because of the register file design
— registers are written at the beginning of a clock cycle
— the new value will be available by the end of that cycle
 The sw is no problem at all, since it reads $2 after the sub finishes
Things that are okay
24
Clock cycle
1 2 3 4 5 6 7 8 9
sub $2, $1, $3 IF ID EX MEM WB
and $12, $2, $5 IF ID EX MEM WB
or $13, $6, $2 IF ID EX MEM WB
add $14, $2, $2 IF ID EX MEM WB
sw $15, 100($2) IF ID EX MEM WB
 Arrows indicate the flow of data between instructions
— The tails of the arrows show when register $2 is written
— The heads of the arrows show when $2 is read
 Any arrow that points backwards in time represents a data hazard in our
basic pipelined datapath
Dependency arrows
25
Bypassing the register file
 The actual result $1 - $3 is computed in clock cycle 3, before it is
needed in cycles 4 and 5
 If we could somehow bypass the writeback and register read stages when
needed, then we can eliminate these data hazards
 Essentially, we need to pass the ALU output from sub directly to the and
and or instructions, without going through the register file
Clock cycle
1 2 3 4 5 6 7
sub $2, $1, $3 IF ID EX MEM WB
and $12, $2, $5 IF ID EX MEM WB
or $13, $6, $2 IF ID EX MEM WB
26
Pipleline Registers to the rescue!
 Pipeline stages communicate through pipeline registers:
IF/ID ID/EX EX/MEM MEM/WB
 We “forward” data from pipeline registers to later instructions
Instruction
memory
Data
memory
1
0
PC
ALU
Registers
Rd
Rt
0
1
IF/ID ID/EX EX/MEM MEM/WB
ALU output
available here

More Related Content

Similar to L11.ppt

cse211 power point presentation for engineering
cse211 power point presentation for engineeringcse211 power point presentation for engineering
cse211 power point presentation for engineering
VishnuVinay6
 
ESD_05_ARM_Instructions set for preparation
ESD_05_ARM_Instructions set for preparationESD_05_ARM_Instructions set for preparation
ESD_05_ARM_Instructions set for preparation
ssuser026e94
 

Similar to L11.ppt (20)

material for studentbasic computer organization and design .pptx
material for studentbasic computer organization and design .pptxmaterial for studentbasic computer organization and design .pptx
material for studentbasic computer organization and design .pptx
 
cse211 power point presentation for engineering
cse211 power point presentation for engineeringcse211 power point presentation for engineering
cse211 power point presentation for engineering
 
Ch5_MorrisMano.pptx
Ch5_MorrisMano.pptxCh5_MorrisMano.pptx
Ch5_MorrisMano.pptx
 
Basic computer organization and design
Basic computer organization and designBasic computer organization and design
Basic computer organization and design
 
PattPatelCh04.ppt
PattPatelCh04.pptPattPatelCh04.ppt
PattPatelCh04.ppt
 
Presentation computer architechure (1)
Presentation computer architechure (1)Presentation computer architechure (1)
Presentation computer architechure (1)
 
timing_diagram_of_8085.pptx
timing_diagram_of_8085.pptxtiming_diagram_of_8085.pptx
timing_diagram_of_8085.pptx
 
Introduction to 8085 microprocessor
Introduction to 8085 microprocessorIntroduction to 8085 microprocessor
Introduction to 8085 microprocessor
 
Introduction to Processor Design and ARM Processor
Introduction to Processor Design and ARM ProcessorIntroduction to Processor Design and ARM Processor
Introduction to Processor Design and ARM Processor
 
Instruction codes and computer registers
Instruction codes and computer registersInstruction codes and computer registers
Instruction codes and computer registers
 
Unit_2 (4).pptx
Unit_2 (4).pptxUnit_2 (4).pptx
Unit_2 (4).pptx
 
OptimizingARM
OptimizingARMOptimizingARM
OptimizingARM
 
Bca 2nd sem-u-2.2-overview of register transfer, micro operations and basic c...
Bca 2nd sem-u-2.2-overview of register transfer, micro operations and basic c...Bca 2nd sem-u-2.2-overview of register transfer, micro operations and basic c...
Bca 2nd sem-u-2.2-overview of register transfer, micro operations and basic c...
 
B.sc cs-ii-u-2.2-overview of register transfer, micro operations and basic co...
B.sc cs-ii-u-2.2-overview of register transfer, micro operations and basic co...B.sc cs-ii-u-2.2-overview of register transfer, micro operations and basic co...
B.sc cs-ii-u-2.2-overview of register transfer, micro operations and basic co...
 
ESD_05_ARM_Instructions set for preparation
ESD_05_ARM_Instructions set for preparationESD_05_ARM_Instructions set for preparation
ESD_05_ARM_Instructions set for preparation
 
Implementation Of MIPS Single Cycle Processor
Implementation Of MIPS Single Cycle ProcessorImplementation Of MIPS Single Cycle Processor
Implementation Of MIPS Single Cycle Processor
 
Presentation
PresentationPresentation
Presentation
 
Control unit-implementation
Control unit-implementationControl unit-implementation
Control unit-implementation
 
UNIT-3.pptx
UNIT-3.pptxUNIT-3.pptx
UNIT-3.pptx
 
Design of microcontroller CPU.pdf
Design of microcontroller CPU.pdfDesign of microcontroller CPU.pdf
Design of microcontroller CPU.pdf
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

L11.ppt

  • 1. 1 Single-cycle datapath, slightly rearranged MemToReg Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite MemRead 1 0 4 Shift left 2 P C Add 1 0 PCSrc Sign extend ALUSrc Result Zero ALU ALUOp Instr [15 - 0] RegDst Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite Add Instr [15 - 11] Instr [20 - 16] 0 1 0 1
  • 2. 2 Pipeline registers  In pipelining, we divide instruction execution into multiple cycles — IF ID EX MEM WB  Information computed during one cycle may be needed in a later cycle: — Instruction read in IF stage determines which registers are fetched in ID stage, what immediate is used for EX stage, and what destination register is for WB — Register values read in ID are used in EX and/or MEM stages — ALU output produced in EX is an effective address for MEM or a result for WB  A lot of information to save! — Saved in intermediate registers called pipeline registers  The registers are named for the stages they connect: IF/ID ID/EX EX/MEM MEM/WB  No register is needed after the WB stage, because after WB the instruction is done
  • 3. 3 Pipelined datapath Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite MemRead 1 0 MemToReg 4 Shift left 2 Add Sign extend ALUSrc Result Zero ALU ALUOp Instr [15 - 0] RegDst Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite Add Instr [15 - 11] Instr [20 - 16] 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB 1 0 PCSrc P C
  • 4. 4 Propagating values forward  Data values required later propagated through the pipeline registers  The most extreme example is the destination register (rd or rt) — It is retrieved in IF, but isn’t updated until the WB — Thus, it must be passed through all pipeline stages, as shown in red on the next slide  Notice that we can’t keep a single “instruction register,” because the pipelined machine needs to fetch a new instruction every clock cycle
  • 5. 5 The destination register Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite MemRead 1 0 MemToReg 4 Shift left 2 Add ALUSrc Result Zero ALU ALUOp Instr [15 - 0] RegDst Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite Add Instr [15 - 11] Instr [20 - 16] 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB 1 0 PCSrc P C Sign extend
  • 6. 6 What about control signals?  Control signals generated similar to the single-cycle processor — in the ID stage, the processor decodes the instruction fetched in IF and produces the appropriate control values  Some of the control signals will not be needed until later stages — These signals must be propagated through the pipeline until they reach the appropriate stage — We just pass them in the pipeline registers, along with the data  Control signals can be categorized by the pipeline stage that uses them Stage Control signals needed EX ALUSrc ALUOp RegDst MEM MemRead MemWrite PCSrc WB RegWrite MemToReg
  • 7. 7 Pipelined datapath and control Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite MemRead 1 0 MemToReg 4 Shift left 2 Add ALUSrc Result Zero ALU ALUOp Instr [15 - 0] RegDst Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite Add Instr [15 - 11] Instr [20 - 16] 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB P C 1 0 PCSrc Sign extend EX M WB
  • 8. 8  Here’s a sample sequence of instructions to execute 1000: lw $8, 4($29) 1004: sub $2, $4, $5 1008: and $9, $10, $11 1012: or $16, $17, $18 1016: add $13, $14, $0  We’ll make some assumptions, just so we can show actual data values: — Each register contains its number plus 100. For instance, register $8 contains 108, register $29 contains 129, etc. — Every data memory location contains 99  Our pipeline diagrams will follow some conventions: — An X indicates values that aren’t important, like the constant field of an R-type instruction — Question marks ??? indicate values we don’t know, usually resulting from instructions coming before and after the ones in our example An example execution sequence addresses in decimal
  • 9. 9 Cycle 1 (filling) IF: lw $8, 4($29) MEM: ??? WB: ??? EX: ??? ID: ??? Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite (?) MemRead (?) 1 0 MemToReg (?) Shift left 2 Add 1 0 PCSrc ALUSrc (?) Result Zero ALU ALUOp (???) RegDst (?) Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite (?) Add 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB 1000 1004 ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? 4 P C Sign extend EX M WB
  • 10. 10 Cycle 2 ID: lw $8, 4($29) IF: sub $2, $4, $5 MEM: ??? WB: ??? EX: ??? Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data 1 0 4 Shift left 2 Add PCSrc Result Zero ALU 4 Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers Add X 8 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB 1004 29 X 1008 129 X MemToReg (?) ??? ??? ??? ??? ??? ??? RegWrite (?) MemWrite (?) MemRead (?) ??? ??? ??? ALUSrc (?) ALUOp (???) RegDst (?) ??? ??? ??? ??? ??? ??? ??? P C Sign extend EX M WB 1 0
  • 11. 11 Cycle 3 ID: sub $2, $4, $5 IF: and $9, $10, $11 EX: lw $8, 4($29) MEM: ??? WB: ??? MemToReg (?) Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite (?) MemRead (?) 1 0 4 Shift left 2 Add PCSrc ALUSrc (1) Result Zero ALU ALUOp (add) X RegDst (0) Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers Add 2 X 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB 1008 4 5 1012 104 105 129 4 X 8 X 8 133 4 ??? ??? ??? ??? ??? ??? RegWrite (?) ??? ??? ??? P C Sign extend 1 0 EX M WB
  • 12. 12 Cycle 4 ID: and $9, $10, $11 IF: or $16, $17, $18 EX: sub $2, $4, $5 MEM: lw $8, 4($29) WB: ??? Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite (0) MemRead (1) 1 0 MemToReg (?) 4 Shift left 2 Add PCSrc ALUSrc (0) Result Zero ALU ALUOp (sub) X RegDst (1) Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite (?) Add 9 X 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB 1012 10 11 1016 110 111 104 X 105 X 2 2 –1 133 X 99 8 ??? ??? ??? ??? ??? ??? P C Sign extend EX M WB 1 0
  • 13. 13 Cycle 5 (full) ID: or $16, $17, $18 IF: add $13, $14, $0 EX: and $9, $10, $11 MEM: sub $2, $4, $5 WB: lw $8, 4($29) Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite (0) MemRead (0) 1 0 MemToReg (1) 4 Shift left 2 Add PCSrc ALUSrc (0) Result Zero ALU ALUOp (and) X RegDst (1) Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite (1) Add 16 X 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB 1016 17 18 1020 117 118 110 X 111 X 9 9 110 -1 105 X 2 99 133 99 8 99 8 P C Sign extend EX M WB 1 0
  • 14. 14 Cycle 6 (emptying) ID: add $13, $14, $0 IF: ??? EX: or $16, $17, $18 MEM: and $9, $10, $11 WB: sub $2, $4, $5 Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite (0) MemRead (0) 1 0 MemToReg (0) 4 Shift left 2 Add PCSrc ALUSrc (0) Result Zero ALU ALUOp (or) X RegDst (1) Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite (1) Add 13 X 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB 1020 14 0 ??? 114 0 117 X 118 X 16 16 119 110 111 X 9 X -1 -1 2 -1 2 P C Sign extend 1 0 EX M WB
  • 15. 15 Cycle 7 ID: ??? IF: ??? EX: add $13, $14, $0 MEM: or $16, $17, $18 WB: and $9, $10, $11 Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite (0) MemRead (0) 1 0 MemToReg (0) 4 Shift left 2 Add PCSrc ALUSrc (0) Result Zero ALU ALUOp (add) ??? RegDst (1) Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite (1) Add ??? ??? 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB ??? ??? ??? ??? 114 X 0 X 13 13 114 119 118 X 16 X 110 110 9 110 9 P C Sign extend ??? ??? EX M WB 1 0
  • 16. 16 Cycle 8 ID: ??? IF: ??? EX: ??? MEM: add $13, $14, $0 WB: or $16, $17, $18 Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite (0) MemRead (0) 1 0 MemToReg (0) 4 Shift left 2 Add PCSrc ALUSrc (?) Result Zero ALU ALUOp (???) ??? RegDst (?) Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite (1) Add ??? ??? 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB ??? ??? ??? ??? ??? ??? 114 0 X 13 X 119 119 16 119 16 P C Sign extend ??? ??? ??? ??? ??? ??? ??? 1 0 EX M WB
  • 17. 17 Cycle 9 ID: ??? IF: ??? EX: ??? MEM: ??? WB: add $13, $14, $0 Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite (?) MemRead (?) 1 0 MemToReg (0) 4 Shift left 2 Add PCSrc ALUSrc (?) Result Zero ALU ALUOp (???) ??? RegDst (?) Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite (1) Add ??? ??? 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB Control M WB WB ??? ??? ??? ??? ??? ??? ??? ??? ? X ??? X 114 114 13 114 13 P C Sign extend ??? ??? ??? ??? ??? ??? 1 0 EX M WB
  • 18. 18 That’s a lot of diagrams there  Compare the last few slides with the pipeline diagram above — You can see how instruction executions are overlapped — Each functional unit is used by a different instruction in each cycle — The pipeline registers save control and data values generated in previous clock cycles for later use — When the pipeline is full in clock cycle 5, all of the hardware units are utilized. This is the ideal situation, and what makes pipelined processors so fast  See the textbook for more examples Clock cycle 1 2 3 4 5 6 7 8 9 lw $t0, 4($sp) IF ID EX MEM WB sub $v0, $a0, $a1 IF ID EX MEM WB and $t1, $t2, $t3 IF ID EX MEM WB or $s0, $s1, $s2 IF ID EX MEM WB add $t5, $t6, $0 IF ID EX MEM WB
  • 19. 19 Instruction set architectures and pipelining  The MIPS instruction set was designed especially for easy pipelining: — All instructions are 32-bits long, so the instruction fetch stage just needs to read one word on every clock cycle — Fields are in the same position in different instruction formats—the opcode is always the first six bits, rs is the next five bits, etc. This makes things easy for the ID stage — MIPS is a register-to-register architecture, so arithmetic operations cannot contain memory references. This keeps the pipeline shorter and simpler  Pipelining is harder for older/more complex instruction sets: — If different instructions had different lengths or formats, the fetch and decode stages would need extra time to determine the actual length of each instruction and the position of the fields — With memory-to-memory instructions, additional pipeline stages may be needed to compute effective addresses and read memory before the EX stage
  • 20. 20 Note how everything goes left to right, except … Read address Instruction memory Instruction [31-0] Address Write data Data memory Read data MemWrite MemRead 1 0 MemToReg 4 Shift left 2 Add Sign extend ALUSrc Result Zero ALU ALUOp Instr [15 - 0] RegDst Read register 1 Read register 2 Write register Write data Read data 2 Read data 1 Registers RegWrite Add Instr [15 - 11] Instr [20 - 16] 0 1 0 1 IF/ID ID/EX EX/MEM MEM/WB 1 0 PCSrc P C
  • 21. 21 An example with dependencies sub $2, $1, $3 and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 sw $15, 100($2)  There are several dependencies in this new code fragment — the first instruction, SUB, stores a value into $2 — that register is used as a source in the rest of the instructions  This is not a problem for the single-cycle datapath — each instruction is executed completely before the next one begins, so instructions 2 through 5 above use the new value of $2  How would this code sequence fare in our 5-stage MIPS pipeline?
  • 22. 22 Clock cycle 1 2 3 4 5 6 7 8 9 sub $2, $1, $3 IF ID EX MEM WB and $12, $2, $5 IF ID EX MEM WB or $13, $6, $2 IF ID EX MEM WB add $14, $2, $2 IF ID EX MEM WB sw $15, 100($2) IF ID EX MEM WB  The sub instruction does not write to register $2 until clock cycle 5. This causes two data hazards in our current pipelined datapath: — the and reads register $2 in cycle 3, and since sub hasn’t modified the register yet, this will be the old value of $2, not the new one — the or instruction uses register $2 in cycle 4, again before it’s actually updated by sub Data hazards in the pipeline diagram
  • 23. 23 Clock cycle 1 2 3 4 5 6 7 8 9 sub $2, $1, $3 IF ID EX MEM WB and $12, $2, $5 IF ID EX MEM WB or $13, $6, $2 IF ID EX MEM WB add $14, $2, $2 IF ID EX MEM WB sw $15, 100($2) IF ID EX MEM WB  The add instruction is okay, because of the register file design — registers are written at the beginning of a clock cycle — the new value will be available by the end of that cycle  The sw is no problem at all, since it reads $2 after the sub finishes Things that are okay
  • 24. 24 Clock cycle 1 2 3 4 5 6 7 8 9 sub $2, $1, $3 IF ID EX MEM WB and $12, $2, $5 IF ID EX MEM WB or $13, $6, $2 IF ID EX MEM WB add $14, $2, $2 IF ID EX MEM WB sw $15, 100($2) IF ID EX MEM WB  Arrows indicate the flow of data between instructions — The tails of the arrows show when register $2 is written — The heads of the arrows show when $2 is read  Any arrow that points backwards in time represents a data hazard in our basic pipelined datapath Dependency arrows
  • 25. 25 Bypassing the register file  The actual result $1 - $3 is computed in clock cycle 3, before it is needed in cycles 4 and 5  If we could somehow bypass the writeback and register read stages when needed, then we can eliminate these data hazards  Essentially, we need to pass the ALU output from sub directly to the and and or instructions, without going through the register file Clock cycle 1 2 3 4 5 6 7 sub $2, $1, $3 IF ID EX MEM WB and $12, $2, $5 IF ID EX MEM WB or $13, $6, $2 IF ID EX MEM WB
  • 26. 26 Pipleline Registers to the rescue!  Pipeline stages communicate through pipeline registers: IF/ID ID/EX EX/MEM MEM/WB  We “forward” data from pipeline registers to later instructions Instruction memory Data memory 1 0 PC ALU Registers Rd Rt 0 1 IF/ID ID/EX EX/MEM MEM/WB ALU output available here