3. Performance
KICT, IIUM Single Cycle Processor Design
Recall, performance is determined by:
Instruction count
Clock cycles per instruction (CPI)
Clock cycle time
Processor design will affect
Clock cycles per instruction
Clock cycle time
Single cycle datapath and control design:
Advantage: One clock cycle per instruction
Disadvantage: long cycle time
[Figure: performance factors I-Count, CPI, and Cycle time]
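A minimal sketch of how the three factors above combine, assuming the usual product form (execution time = instruction count × CPI × clock cycle time). The function name and the numbers are hypothetical and only illustrate the single-cycle trade-off (CPI of 1, long cycle).

```python
def cpu_time(instruction_count, cpi, cycle_time_ns):
    """Execution time in ns as the product of the three performance factors."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical numbers: a single-cycle design has CPI = 1 but a long cycle.
single_cycle = cpu_time(1_000_000, 1, 8)
# A hypothetical multi-cycle design: more cycles per instruction, shorter cycle.
multi_cycle = cpu_time(1_000_000, 4, 2)
```

With these made-up values both designs take the same time, which shows why CPI and cycle time have to be considered together.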
4. Step by Step Design of a Processor
Analyze instruction set => datapath requirements
The meaning of each instruction is given by the register transfers
Datapath must include storage elements for ISA registers
Datapath must support each register transfer
Select datapath components and clocking methodology
Assemble datapath meeting the requirements
Analyze implementation of each instruction
Determine the setting of control signals for register transfer
Assemble the control logic
5. MIPS Instruction
All instructions are 32-bit wide
Three instruction formats: R-type, I-type, and J-type
Op6: 6-bit opcode of the instruction
Rs5, Rt5, Rd5: 5-bit source and destination register numbers
sa5: 5-bit shift amount used by shift instructions
funct6: 6-bit function field for R-type instructions
immediate16: 16-bit immediate value or address offset
immediate26: 26-bit target address of the jump instruction
R-type: Op6 | Rs5 | Rt5 | Rd5 | sa5 | funct6
I-type: Op6 | Rs5 | Rt5 | immediate16
J-type: Op6 | immediate26
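The field layout can be sketched as shifts and masks on a 32-bit word. This is an illustrative decoder, not part of the original slides; the function name `decode` and the dictionary keys are assumptions. All fields are extracted unconditionally, and the caller picks the ones that apply to the format.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op":    (word >> 26) & 0x3F,       # bits 31-26
        "rs":    (word >> 21) & 0x1F,       # bits 25-21
        "rt":    (word >> 16) & 0x1F,       # bits 20-16
        "rd":    (word >> 11) & 0x1F,       # bits 15-11 (R-type)
        "sa":    (word >> 6) & 0x1F,        # bits 10-6  (R-type)
        "funct": word & 0x3F,               # bits 5-0   (R-type)
        "imm16": word & 0xFFFF,             # bits 15-0  (I-type)
        "imm26": word & 0x3FFFFFF,          # bits 25-0  (J-type)
    }

# add $3, $1, $2 encodes as op = 0, rs = 1, rt = 2, rd = 3, sa = 0, funct = 0x20
fields = decode(0x00221820)
```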
6. MIPS Instruction
Only a subset of the MIPS instructions is considered
ALU instructions (R-type): add, sub, and, or, xor, slt
Immediate instructions (I-type): addi, slti, andi, ori, xori
Load and Store (I-type): lw, sw
Branch (I-type): beq, bne
Jump (J-type): j
This subset does not include all instructions
Sufficient to illustrate design of datapath and control
Concepts used to implement the MIPS subset are
used to construct a broad spectrum of computers
7. MIPS Instruction
Instruction          Meaning            Format
add  Rd, Rs, Rt      addition           op6 = 0    | Rs5 | Rt5 | Rd5 | 0 | 0x20
sub  Rd, Rs, Rt      subtraction        op6 = 0    | Rs5 | Rt5 | Rd5 | 0 | 0x22
and  Rd, Rs, Rt      bitwise and        op6 = 0    | Rs5 | Rt5 | Rd5 | 0 | 0x24
or   Rd, Rs, Rt      bitwise or         op6 = 0    | Rs5 | Rt5 | Rd5 | 0 | 0x25
xor  Rd, Rs, Rt      exclusive or       op6 = 0    | Rs5 | Rt5 | Rd5 | 0 | 0x26
slt  Rd, Rs, Rt      set on less than   op6 = 0    | Rs5 | Rt5 | Rd5 | 0 | 0x2a
addi Rt, Rs, Im16    add immediate      op6 = 0x08 | Rs5 | Rt5 | Im16
slti Rt, Rs, Im16    slt immediate      op6 = 0x0a | Rs5 | Rt5 | Im16
andi Rt, Rs, Im16    and immediate      op6 = 0x0c | Rs5 | Rt5 | Im16
ori  Rt, Rs, Im16    or immediate       op6 = 0x0d | Rs5 | Rt5 | Im16
xori Rt, Rs, Im16    xor immediate      op6 = 0x0e | Rs5 | Rt5 | Im16
lw   Rt, Im16(Rs)    load word          op6 = 0x23 | Rs5 | Rt5 | Im16
sw   Rt, Im16(Rs)    store word         op6 = 0x2b | Rs5 | Rt5 | Im16
beq  Rs, Rt, Im16    branch if equal    op6 = 0x04 | Rs5 | Rt5 | Im16
bne  Rs, Rt, Im16    branch not equal   op6 = 0x05 | Rs5 | Rt5 | Im16
j    Im26            jump               op6 = 0x02 | Im26
8. Processor Implementation
Single Cycle
perform each instruction in 1 clock cycle
clock cycle must be long enough for slowest instruction
disadvantage: only as fast as slowest instruction
Multi-Cycle
break fetch/execute cycle into multiple steps
perform 1 step in each clock cycle
advantage: each instruction uses only as many cycles as it needs
Pipelined
execute each instruction in multiple steps
perform 1 step / instruction in each clock cycle
process multiple instructions in parallel
9. Register Transfer Level
RTL is a description of data flow between registers
RTL gives a meaning to the instructions
All instructions are fetched from memory at address PC
Instruction   RTL Description
ADD           Reg(Rd) ← Reg(Rs) + Reg(Rt); PC ← PC + 4
SUB           Reg(Rd) ← Reg(Rs) – Reg(Rt); PC ← PC + 4
ORI           Reg(Rt) ← Reg(Rs) | zero_ext(Im16); PC ← PC + 4
LW            Reg(Rt) ← MEM[Reg(Rs) + sign_ext(Im16)]; PC ← PC + 4
SW            MEM[Reg(Rs) + sign_ext(Im16)] ← Reg(Rt); PC ← PC + 4
BEQ           if (Reg(Rs) == Reg(Rt))
                  PC ← PC + 4 + 4 × sign_ext(Im16)
              else PC ← PC + 4
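The register transfers above can be read as a tiny interpreter. The sketch below is only illustrative: the tuple encoding of instructions, the `step` function, and the dictionary-based memory are assumptions made for the example, not the hardware design.

```python
MASK = 0xFFFFFFFF

def sign_ext(imm16):
    """Interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def zero_ext(imm16):
    return imm16 & 0xFFFF

def step(reg, mem, pc, instr):
    """Apply one instruction's register transfers and return the next PC.

    reg: list of 32 words, mem: dict from byte address to word,
    instr: a tuple such as ("ADD", rd, rs, rt) (hypothetical encoding).
    """
    op, x, y, z = instr
    next_pc = pc + 4
    if op == "ADD":                      # ("ADD", rd, rs, rt)
        reg[x] = (reg[y] + reg[z]) & MASK
    elif op == "SUB":                    # ("SUB", rd, rs, rt)
        reg[x] = (reg[y] - reg[z]) & MASK
    elif op == "ORI":                    # ("ORI", rt, rs, imm16)
        reg[x] = reg[y] | zero_ext(z)
    elif op == "LW":                     # ("LW", rt, rs, imm16)
        reg[x] = mem.get((reg[y] + sign_ext(z)) & MASK, 0)
    elif op == "SW":                     # ("SW", rt, rs, imm16)
        mem[(reg[y] + sign_ext(z)) & MASK] = reg[x]
    elif op == "BEQ":                    # ("BEQ", rs, rt, imm16)
        if reg[x] == reg[y]:
            next_pc = pc + 4 + 4 * sign_ext(z)
    else:
        raise ValueError(f"unsupported instruction: {op}")
    reg[0] = 0                           # MIPS register 0 always reads as zero
    return next_pc & MASK
```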
10. Instruction Execution
R-type
  Fetch instruction:  Instruction ← MEM[PC]
  Fetch operands:     data1 ← Reg(Rs), data2 ← Reg(Rt)
  Execute operation:  ALU_result ← func(data1, data2)
  Write ALU result:   Reg(Rd) ← ALU_result
  Next PC address:    PC ← PC + 4
I-type
  Fetch instruction:  Instruction ← MEM[PC]
  Fetch operands:     data1 ← Reg(Rs), data2 ← Extend(imm16)
  Execute operation:  ALU_result ← op(data1, data2)
  Write ALU result:   Reg(Rt) ← ALU_result
  Next PC address:    PC ← PC + 4
BEQ
  Fetch instruction:  Instruction ← MEM[PC]
  Fetch operands:     data1 ← Reg(Rs), data2 ← Reg(Rt)
  Equality:           zero ← subtract(data1, data2)
  Branch:             if (zero) PC ← PC + 4 × (1 + sign_ext(imm16))
                      else PC ← PC + 4
11. Instruction Execution
LW
  Fetch instruction:    Instruction ← MEM[PC]
  Fetch base register:  base ← Reg(Rs)
  Calculate address:    address ← base + sign_extend(imm16)
  Read memory:          data ← MEM[address]
  Write register Rt:    Reg(Rt) ← data
  Next PC address:      PC ← PC + 4
SW
  Fetch instruction:    Instruction ← MEM[PC]
  Fetch registers:      base ← Reg(Rs), data ← Reg(Rt)
  Calculate address:    address ← base + sign_extend(imm16)
  Write memory:         MEM[address] ← data
  Next PC address:      PC ← PC + 4
Jump
  Fetch instruction:    Instruction ← MEM[PC]
  Target PC address:    target ← PC[31:28] || Imm26 || 00   (|| denotes concatenation)
  Jump:                 PC ← target
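The jump target concatenation (upper 4 bits of the PC, the 26-bit immediate, and two zero bits) can be sketched with masks and shifts. The function name is an assumption for illustration.

```python
def jump_target(pc, imm26):
    """Form PC[31:28] || Imm26 || 00 as a 32-bit address."""
    upper4 = pc & 0xF0000000               # keep PC[31:28] in place
    return upper4 | ((imm26 & 0x3FFFFFF) << 2)

target = jump_target(0x10000008, 0x10)     # word index 0x10 becomes byte offset 0x40
```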
12. What do we need?
Two types of functional hardware elements are needed:
elements that operate on data (called combinational elements)
elements that contain data (called sequential or state elements)
13. Fetch and Execute Cycle
Abstraction of fetch/execute implementation
use the PC to supply the instruction address
fetch the instruction from memory and increment PC
use fields of the instruction to select registers to read
execute depending on the instruction
repeat…
[Figure: abstract datapath with PC, instruction memory, registers, ALU, and data memory]
15. Basic Hardware
1. AND gate (c = a . b)
   a b | c
   0 0 | 0
   0 1 | 0
   1 0 | 0
   1 1 | 1
2. OR gate (c = a + b)
   a b | c
   0 0 | 0
   0 1 | 1
   1 0 | 1
   1 1 | 1
3. Inverter (c = NOT a)
   a | c
   0 | 1
   1 | 0
4. Multiplexor (if d == 0, c = a; else c = b)
16. Truth Table and Simplification
Problem: Consider logic functions with three inputs: A, B, C.
output D is true if at least one input is true
output E is true if exactly two inputs are true
output F is true only if all three inputs are true
Show the truth table for these three functions
Show the Boolean equations for these three functions
Show an implementation consisting of AND-OR-NOT gates.
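One possible way to check an answer to this problem is to write candidate Boolean equations and enumerate all eight input combinations. The equations below are a suggested solution, not taken from the slides: D = A + B + C, E = A·B·C' + A·B'·C + A'·B·C, F = A·B·C.

```python
from itertools import product

def D(a, b, c):
    return a | b | c                      # at least one input is true

def E(a, b, c):
    # exactly two inputs true: one product term per choice of the false input
    return (a & b & (1 - c)) | (a & (1 - b) & c) | ((1 - a) & b & c)

def F(a, b, c):
    return a & b & c                      # all three inputs are true

def truth_table():
    """Rows of (A, B, C, D, E, F) for all eight input combinations."""
    return [(a, b, c, D(a, b, c), E(a, b, c), F(a, b, c))
            for a, b, c in product((0, 1), repeat=3)]

for row in truth_table():
    print(*row)
```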
17. A Simple Multifunction Logic Unit
To warm up, let's build a logic unit to support the AND and OR instructions for MIPS (32-bit registers)
we'll just build a 1-bit unit and use 32 of them
Possible implementation using a multiplexor:
[Figure: 1-bit logic unit with inputs a and b, an operation selector, and one output]
18. Using Multiplexor
Selects one of the inputs to be the output based on a control input
Let's build our ALU using a MUX (multiplexor):
[Figure: 1-bit unit with inputs a and b; a multiplexor with inputs 0 and 1 selects Result according to Operation]
19. Implementation
Not easy to decide the best way to implement something
do not want too many inputs to a single gate
do not want to have to go through too many gates (= levels)
for our purposes, ease of comprehension is important
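Let's look at a 1-bit ALU for addition:
[Figure: 1-bit adder with inputs a, b, CarryIn and outputs Sum, CarryOut]
$C_{out} = a \cdot b + a \cdot c_{in} + b \cdot c_{in}$
$Sum = a \cdot \overline{b} \cdot \overline{c_{in}} + \overline{a} \cdot b \cdot \overline{c_{in}} + \overline{a} \cdot \overline{b} \cdot c_{in} + a \cdot b \cdot c_{in} = a \oplus b \oplus c_{in}$
How could we build a 1-bit ALU for add, and, and or?
How could we build a 32-bit ALU?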
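One way to approach the two questions is sketched below: a 1-bit cell computes AND, OR, and the full-adder sum, a multiplexor picks one of them, and 32 cells are chained through their carries. The operation encoding (0 = and, 1 = or, 2 = add) and the function names are assumptions for illustration.

```python
def full_adder(a, b, cin):
    """Sum = a xor b xor cin; CarryOut = a.b + a.cin + b.cin."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def alu1(a, b, cin, op):
    """1-bit ALU: the mux selects and (0), or (1), or the adder sum (2)."""
    s, cout = full_adder(a, b, cin)
    return (a & b, a | b, s)[op], cout

def alu32(a, b, op):
    """32-bit ripple ALU: CarryOut of bit i feeds CarryIn of bit i + 1."""
    result, carry = 0, 0
    for i in range(32):
        bit, carry = alu1((a >> i) & 1, (b >> i) & 1, carry, op)
        result |= bit << i
    return result
```

The ripple structure is the simplest to follow; it is not the fastest, since the carry must pass through all 32 cells.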
23. Subtraction
Two's complement approach: just negate b and add.
Negation: invert each bit of b and set Cin (LSB, ALU0) to 1
[Figure: 1-bit ALU with inputs a, b, Binvert, CarryIn, and Operation; outputs Result and CarryOut]
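The negate-and-add idea can be sketched on 32-bit words: inverting b and setting the carry into bit 0 to 1 adds the two's complement of b. The function name and the use of a single `binvert` flag for both the inversion and the initial carry are assumptions for this example.

```python
MASK = 0xFFFFFFFF

def add_sub32(a, b, binvert):
    """binvert = 0 computes a + b; binvert = 1 computes a - b (mod 2**32)."""
    if binvert:
        b = ~b & MASK                 # invert each bit of b
    return (a + b + binvert) & MASK   # CarryIn of the LSB is 1 when subtracting

diff = add_sub32(5, 6, 1)             # 5 - 6 wraps to the pattern for -1
```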
24. Detecting Overflow
No overflow when adding a positive and a negative number
No overflow when subtracting numbers with the same sign
Overflow occurs when the result has “wrong” sign (verify!):
Consider the operations A + B, and A – B
can overflow occur if B is 0 ?
can overflow occur if A is 0 ?
Operation   Operand A   Operand B   Result Indicating Overflow
A + B       ≥ 0         ≥ 0         < 0
A + B       < 0         < 0         ≥ 0
A – B       ≥ 0         < 0         < 0
A – B       < 0         ≥ 0         ≥ 0
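The four overflow conditions can be checked directly on signed 32-bit values. This is a sketch with hypothetical helper names; `wrap32` models what a 32-bit adder would produce.

```python
def wrap32(x):
    """Wrap an integer to the signed 32-bit range, as a 32-bit adder would."""
    return ((x + 2**31) % 2**32) - 2**31

def add_overflows(a, b):
    r = wrap32(a + b)
    return (a >= 0 and b >= 0 and r < 0) or (a < 0 and b < 0 and r >= 0)

def sub_overflows(a, b):
    r = wrap32(a - b)
    return (a >= 0 and b < 0 and r < 0) or (a < 0 and b >= 0 and r >= 0)
```

Under these definitions B = 0 never overflows, while A = 0 can overflow for A – B when B is the most negative value (–2^31).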
26. Set Less Than Instruction
MIPS has set on less than instructions
slt   rd, rs, rt     if (rs < rt) rd = 1 else rd = 0
sltu  rd, rs, rt     unsigned <
slti  rt, rs, im16   if (rs < im16) rt = 1 else rt = 0
sltiu rt, rs, im16   unsigned <
Signed / Unsigned Comparisons produce different results
Assume $s0 = 1 and $s1 = -1 = 0xffffffff
slt  $t0, $s0, $s1 results in $t0 = 0
sltu $t0, $s0, $s1 results in $t0 = 1
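The signed and unsigned comparisons can be sketched on 32-bit patterns as follows. The helper names mirror the instruction mnemonics but are otherwise assumptions for this example.

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x & 0x80000000 else x

def slt(rs, rt):
    return int(to_signed32(rs) < to_signed32(rt))

def sltu(rs, rt):
    return int((rs & 0xFFFFFFFF) < (rt & 0xFFFFFFFF))

s0, s1 = 1, 0xFFFFFFFF        # $s1 holds -1 when read as signed
signed_result = slt(s0, s1)   # 1 < -1 is false
unsigned_result = sltu(s0, s1)  # 1 < 4294967295 is true
```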
27. ALU to MIPS: Less than and Equality
Need to support the set-on-less-than instruction
slt $t0, $t3, $t4 produces 1 if Rs < Rt, otherwise 0
Idea is to use subtraction: Rs < Rt ⇔ Rs – Rt < 0. Recall that the MSB of a negative number is 1
two cases after subtraction Rs – Rt:
if no overflow then Rs < Rt ⇔ most significant bit of Rs – Rt = 1
5ten – 6ten = 0101 – 0110 = 0101 + 1010 = 1111 (ok!)
if overflow then Rs < Rt ⇔ most significant bit of Rs – Rt = 0
-7ten – (+6ten) = 1001 – 0110 = 1001 + 1010 = 0011 (overflow!)
Therefore, set bit = (MSB of Rs – Rt) ⊕ (overflow bit), where
set bit, which is output from ALU31, gives the result of slt
set bit is sent from ALU31 to ALU0 as the Less bit at ALU0; all other Less bits are hardwired 0; so Less is the 32-bit result of slt
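The rule "set bit = MSB of the difference XOR overflow" can be tried on the 4-bit examples above. The sketch below is parameterized by word width so the same code covers 4 and 32 bits; the function name and the `bits` parameter are assumptions.

```python
def slt_via_sub(rs, rt, bits=32):
    """Return 1 if rs < rt (signed), using subtraction, the MSB, and overflow."""
    mask = (1 << bits) - 1

    def msb(x):
        return (x >> (bits - 1)) & 1

    diff = (rs + (~rt & mask) + 1) & mask      # rs - rt in two's complement
    # A - B overflows when the operand signs differ and the result sign
    # differs from the sign of A.
    overflow = int(msb(rs) != msb(rt) and msb(diff) != msb(rs))
    return msb(diff) ^ overflow

no_overflow_case = slt_via_sub(0b0101, 0b0110, bits=4)   # 5 < 6
overflow_case = slt_via_sub(0b1001, 0b0110, bits=4)      # -7 < 6
```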
31. 1-bit ALU for the 31 LSBs
[Figure a: 1-bit ALU for the 31 LSBs, with inputs a, b, Less, Binvert, CarryIn, Operation and outputs Result, CarryOut]
1-bit ALU for the MSB
[Figure b: 1-bit ALU for the MSB, with the same inputs plus overflow detection and outputs Result, Set, Overflow]
Extra set bit, to be routed to the Less input of the least significant 1-bit ALU, is computed from the most significant Result bit and the Overflow bit (it is not the output of the adder as the figure seems to indicate)
Supporting slt
[Figure: 32-bit ALU built from ALU0 to ALU31 with shared Binvert and Operation inputs, chained CarryIn/CarryOut, and the Set output of ALU31 routed to the Less input of ALU0]
Less input of the 31 most significant ALUs is always 0
32-bit ALU from 31 copies of ALU at top left and 1 copy of ALU at bottom left in the most significant position
33. What do we need?
ALU for executing instructions
Memory
Instruction memory where instructions are stored
Data memory where data is stored
Registers
32 × 32-bit general purpose registers, R0 is always zero
Read source register Rs
Read source register Rt
Write destination register Rt or Rd
Program counter PC register and Adder to increment PC
Sign and Zero extender for immediate constant
34. Datapath Components
Combinational Elements
ALU, Adder
Multiplexers
Immediate extender
Storage Elements
Instruction memory
Data memory
Program Counter
Register file
Clocking methodology
Timing of writes
[Figure: datapath components: instruction memory (Address, Instruction), multiplexer (select), extender (16 to 32 bits, ExtOp), ALU (ALU control, ALU result, zero, overflow), PC (clk), register file (Rs, Rt, Rd, BusS, BusT, BusD, Write Enable, clk), data memory (Address, Data_in, Data_out, MemRead, MemWrite, clk)]
35. Register
Register
Similar to the D-type Flip-Flop
n-bit input and output
Write Enable (WE):
Enable/disable writing of register
Disable (0): Data_Out will not change
Enable (1): Data_Out will become Data_In after clock edge
Edge triggered clocking
Register output is modified at clock edge
[Figure: n-bit register with inputs Data_In, Clock, and Write Enable (WE), and output Data_Out]
36. MIPS Register File
Register File consists of 32 × 32-bit registers
BusS and BusT: 32-bit output busses for reading 2 registers
BusD: 32-bit input bus for writing a register when WE is 1
Two registers read and one written in a cycle
Registers are selected by:
Rs selects register to be read on BusS
Rt selects register to be read on BusT
Rd selects the register to be written
Clock input
The clock input is used ONLY during write operation
During read, register file behaves as a combinational logic block
Rs or Rt valid => BusS or BusT valid after access time
[Figure: register file with inputs Rs, Rt, Rd (5 bits each), BusD (32 bits), Write Enable, and clk, and outputs BusS and BusT (32 bits each)]
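A behavioral sketch of the register file: reads are combinational (available without a clock), while a write happens only on a clock edge with Write Enable set. The class and method names are assumptions for illustration; register 0 is kept at zero, as in MIPS.

```python
class RegisterFile:
    """32 x 32-bit register file with two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rs, rt):
        """Combinational read: returns (BusS, BusT)."""
        return self.regs[rs], self.regs[rt]

    def clock_edge(self, rd, bus_d, write_enable):
        """State changes only here, and only when write_enable is 1."""
        if write_enable and rd != 0:          # R0 always stays zero
            self.regs[rd] = bus_d & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(rd=3, bus_d=42, write_enable=1)
bus_s, bus_t = rf.read(3, 0)
```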
37. Register File
Registers are implemented with arrays of D-FFs
Register file with two read ports and one write port
[Figure: register file with inputs Read register number 1, Read register number 2, Write register (5 bits each), Write data (32 bits), the Write control signal, and Clock, and outputs Read data 1 and Read data 2 (32 bits each)]
38. Register File
Read ports are implemented with a pair of multiplexors, each with a 5-bit select for 32 registers
[Figure: read ports: two multiplexors select Read data 1 and Read data 2 from Register 0 to Register n using Read register number 1 and Read register number 2]
Write port is implemented using a decoder, a 5-to-32 decoder for 32 registers. Clock is relevant to write as register state may change only at clock edge
[Figure: write port: a decoder driven by Register number and Write selects which of Register 0 to Register n loads Register data on the Clock]
39. Details of Register File
[Figure: register file details: RS and RT decoders (5-bit inputs) control tri-state buffers that drive BusS and BusT; the RD decoder (5-bit input), together with RegWrite and Clock, drives the WE inputs of R1 to R31, which load from BusD; all buses are 32 bits wide]
R0 is not used
40. Different Design of ALU
[Figure: ALU built from a shifter, a logic unit, and an adder operating on Rs and Rt (32 bits each), with a final multiplexor selecting the ALU Result; outputs sign, zero, and overflow]
Shift/Rotate Operation: SLL = 00, SRL = 01, SRA = 10, ROR = 11 (Shift Amount is 5 bits)
Logical Operation: AND = 00, OR = 01, NOR = 10, XOR = 11
Arithmetic Operation: ADD = 0, SUB = 1
ALU Selection: Shift = 00, SLT = 01, Arith = 10, Logic = 11
SLT: ALU does a SUB and checks the sign and overflow
41. Instruction and Data Memory
Instruction memory needs only provide read access
Because datapath does not write instructions
Behaves as combinational logic for read
Address selects Instruction after access time
Data Memory is used for load and store
MemRead: enables output on Data_out
Address selects the word to put on Data_out
MemWrite: enables writing of Data_in
Address selects the memory word to be written
The Clock synchronizes the write operation
Separate instruction and data memories
Caches memory
[Figure: instruction memory (32-bit Address in, 32-bit Instruction out) and data memory (Address, Data_in, Data_out, MemRead, MemWrite, Clock)]
42. Clocking Methodology
Clocks are needed in sequential logic to decide when a state element (register) should be updated
To ensure correctness, a clocking methodology defines when data can be written and read
[Figure: Register1, combinational logic, Register2, and a clock waveform with rising and falling edges]
We assume edge-triggered clocking
All state changes occur on the same clock edge
Data must be valid and stable before arrival of clock edge
Edge-triggered clocking allows a register to be read and written during same clock cycle
43. Clock Cycle
With edge-triggered clocking, the clock cycle must be long enough to accommodate the path from one register through the combinational logic to another register
Tcycle ≥ Tclk-q + Tmax_comb + Ts
[Figure: Register1, combinational logic, Register2, and a timing diagram showing Tclk-q, Tmax_comb, Ts, and Th around the writing edge]
Tclk-q: clock to output delay through register
Tmax_comb: longest delay through combinational logic
Ts: setup time that input to a register must be stable before arrival of clock edge
Th: hold time that input to a register must hold after arrival of clock edge
Hold time (Th) is normally satisfied since Tclk-q > Th
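As a worked example of the cycle-time bound, the sketch below takes the longest combinational path over a set of instruction classes and adds the register overheads. All delay values are made up for illustration, and the function name is an assumption.

```python
def min_cycle_time(t_clk_q, comb_delays, t_setup):
    """Tcycle >= Tclk-q + Tmax_comb + Ts, with Tmax_comb the slowest path."""
    t_max_comb = max(comb_delays.values())
    return t_clk_q + t_max_comb + t_setup

# Hypothetical combinational delays per instruction class, in picoseconds.
delays_ps = {"R-type": 600, "lw": 800, "sw": 700, "beq": 500}
t_cycle_ps = min_cycle_time(t_clk_q=50, comb_delays=delays_ps, t_setup=50)
```

In a single-cycle design every instruction pays this cycle time, even those whose own path is much shorter than the slowest one.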
45. Datapath: Fetch Cycle
Assemble the datapath from its components
For instruction fetching, we need …
Program Counter
Instruction Memory
Adder for incrementing PC
[Figure: fetch datapath: 32-bit PC addresses the instruction memory; an adder adds 4 to produce next PC]
Datapath does not handle branch or jump instructions
The least significant 2 bits of the PC are ‘00’ since PC is a multiple of 4
Improved datapath increments upper 30 bits of PC by 1
[Figure: improved datapath: 30-bit PC with ‘00’ appended addresses the instruction memory; a +1 incrementer produces next PC]
47. Datapath for R-Type Instruction
Control signals
ALUCtrl is derived from the funct field because Op = 0 for R-type
RegWrite is used to enable the writing of the ALU result
R-type format: Op6 | Rs5 | Rt5 | Rd5 | sa5 | funct6
[Figure: R-type datapath: PC and instruction memory with the PC + 4 adder; register file (Rs, Rt, Rd, BusS, BusT, BusD, RegWrite); ALU controlled by ALUCtrl]
Rs and Rt fields select two registers to read. Rd field selects register to write
BusS & BusT provide data input to ALU. ALU result is connected to BusD
Same clock updates PC and Rd register
48. Datapath for R-Type Instruction
add Rd, Rs, Rt
R[rd] ← R[rs] + R[rt];
[Figure: instruction fields op, rs, rt, rd, shamt, funct feed the register file (RN1, RN2, WN, WD, RD1, RD2, RegWrite) and the ALU (3-bit Operation, Zero)]
49. Datapath for I-type Instructions
Control signals
ALUCtrl is derived from the Op field
RegWrite is used to enable the writing of the ALU result
ExtOp is used to control the extension of the 16-bit immediate
I-type format: Op6 | Rs5 | Rt5 | immediate16
[Figure: I-type datapath: PC and instruction memory with the PC + 4 adder; register file (Rs, Rt, BusS, BusT, BusD, RegWrite); extender (Imm16, ExtOp); ALU controlled by ALUCtrl]
Second ALU input comes from the extended immediate. Rt and BusT are not used
Rt selects register to write, not Rd
Same clock edge updates PC and Rt
50. Immediate Extension
Two types of extensions
Zero-extension for unsigned constants
Sign-extension for signed constants
Control signal ExtOp indicates type of extension
Extender Implementation: wiring and one AND gate
ExtOp = 0: Upper16 = 0
ExtOp = 1: Upper16 = sign bit
[Figure: extender: the lower 16 bits pass Imm16 through; each of the upper 16 bits is ExtOp AND the sign bit of Imm16]
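The "wiring and one AND gate" extender can be sketched as follows: the AND of ExtOp and the sign bit is replicated into the upper 16 bits. The function name is an assumption for this example.

```python
def extend(imm16, ext_op):
    """Extend a 16-bit immediate to 32 bits (ext_op: 0 = zero, 1 = sign)."""
    sign_bit = (imm16 >> 15) & 1
    upper_bit = sign_bit & ext_op            # the single AND gate
    upper16 = 0xFFFF if upper_bit else 0     # the bit is wired to all upper bits
    return (upper16 << 16) | (imm16 & 0xFFFF)

sign_extended = extend(0x8000, 1)
zero_extended = extend(0x8000, 0)
```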
51. Combination of R and I
Control signals
ALUCtrl is derived from either the Op or the funct field
RegWrite enables the writing of the ALU result
ExtOp controls the extension of the 16-bit immediate
RegDst selects the register destination as either Rt or Rd
ALUSrc selects the 2nd ALU source as BusT or extended immediate
A mux selects the destination register as either Rt or Rd
Another mux selects 2nd ALU input as either data on BusT or the extended immediate
[Figure: combined R-type and I-type datapath: PC and instruction memory; register file with a RegDst mux on the write register; extender (Imm16, ExtOp); ALUSrc mux on the second ALU input; ALU controlled by ALUCtrl]
52. Adding Data Memory
Additional Control signals
MemRead for load instructions
MemWrite for store instructions
MemtoReg selects data on BusD as ALU result or Memory Data_out
BusT is connected to Data_in of Data Memory for store instructions
A data memory is added for load and store instructions
A 3rd mux selects data on BusD as either ALU result or memory data_out
ALU calculates data memory address
[Figure: datapath with data memory: PC and instruction memory; register file with RegDst mux; extender (ExtOp); ALUSrc mux; ALU (ALUCtrl); data memory (Address, Data_in, Data_out, MemRead, MemWrite); MemtoReg mux driving BusD]
53. Datapath of LOAD
lw Rt, offset(Rs)
R[rt] <- MEM[R[rs] + s_extend(offset)];
54. Datapath of STORE
sw rt, offset(rs)
MEM[R[rs] + sign_extend(offset)] <- R[rt]
55. Branching Instruction
beq Rs, Rt, LABEL
if (R[rs] == R[rt]) then PC <- PC + 4 + (s_extend(offset) << 2)
56. add Rd, Rs, Rt
[Figure: datapath for add Rd, Rs, Rt: register file (RN1, RN2, WN, WD, RD1, RD2, RegWrite), sign extender (16 to 32 bits), ALU (3-bit Operation, Zero), data memory (ADDR, WD, RD, MemRead, MemWrite), and muxes controlled by ALUSrc and MemtoReg]
57. lw Rt, offset(Rs)
[Figure: datapath for lw rt, offset(rs): register file, sign extender (16 to 32 bits), ALU, data memory, and muxes controlled by ALUSrc and MemtoReg]
58. sw Rt, offset (Rs)
[Figure: datapath for sw rt, offset(rs): register file, sign extender (16 to 32 bits), ALU, data memory, and muxes controlled by ALUSrc and MemtoReg]
59. Datapath with Fetch
Adding instruction fetch
[Figure: datapath with fetch: PC, instruction memory, and a PC + 4 adder added in front of the register file, sign extender, ALU, and data memory; control signals RegWrite, ALU operation (3 bits), ALUSrc, MemRead, MemWrite, MemtoReg]
Separate instruction memory as instruction and data read occur in the same clock cycle
Separate adder as ALU operations and PC increment occur in the same clock cycle
60. Datapath
Adding branch capability and another multiplexor
[Figure: datapath with branch: a second adder computes the branch target from PC + 4 and the sign-extended offset shifted left by 2; a new multiplexor controlled by PCSrc selects the next instruction address]
Instruction address is either PC+4 or branch target address
Extra adder needed as both adders operate in each cycle
New multiplexor
Important note: in a single-cycle implementation data cannot be stored during an instruction – it only moves through combinational logic
Question: is the MemRead signal really needed?! Think of RegWrite…!
61. add Rd, Rs, Rt
[Figure: full datapath for add Rd, Rs, Rt: PC, instruction memory, PC + 4 adder, branch adder with <<2, PCSrc mux, register file, sign extender, ALU, data memory, and muxes controlled by ALUSrc and MemtoReg]
62. lw Rt, offset (Rs)
[Figure: full datapath for lw Rt, offset(Rs): PC, instruction memory, PC + 4 adder, branch adder with <<2, PCSrc mux, register file, sign extender, ALU, data memory, and muxes controlled by ALUSrc and MemtoReg]
63. sw Rt, offset(Rs)
[Figure: full datapath for sw Rt, offset(Rs): PC, instruction memory, PC + 4 adder, branch adder with <<2, PCSrc mux, register file, sign extender, ALU, data memory, and muxes controlled by ALUSrc and MemtoReg]
64. beq rs, rt, offset
[Figure: full datapath for beq r1, r2, offset: PC, instruction memory, PC + 4 adder, branch adder with <<2, PCSrc mux, register file, sign extender, ALU (Zero output), data memory, and muxes controlled by ALUSrc and MemtoReg]
65. Where we are now?
Computer
  Processor
    Control
    Datapath
  Memory (passive): where programs and data live when running
  Devices
    Input: Keyboard, Mouse
    Output: Display, Printer
    Disk: where programs and data live when not running
68. Single Cycle MIPS Datapath
[Figure: single-cycle MIPS datapath: PC, instruction memory, PC + 4 adder, branch adder with shift left 2, PCSrc mux, register file, sign extender, ALUSrc mux, ALU (3-bit ALU operation, Zero), data memory (MemRead, MemWrite), and MemtoReg mux]
69. Control
Control unit takes input from the instruction OPCODE bits
Control unit generates
ALU control input
Write/read enable signals for each storage element
Selector controls for each multiplexor
70. ALU Control
Main control sends a 2-bit ALUOp control field to the ALU control. Based on ALUOp and function field of instruction the ALU control generates the 3-bit ALU control field
ALU control field   Function
000                 and
001                 or
010                 add
110                 sub
111                 slt
ALU must perform
add for load/stores (ALUOp 00)
sub for branches (ALUOp 01)
one of and, or, add, sub, slt for R-type instructions, depending on the instruction’s 6-bit function field (ALUOp 10)
[Figure: Main Control sends the 2-bit ALUOp to ALU Control, which also takes the 6-bit instruction funct field and sends the 3-bit ALU control input to the ALU]
71. Setting ALU Control Bits
Truth table for ALU control bits
ALUOp1 ALUOp0 | F5 F4 F3 F2 F1 F0 | Operation
0      0      | X  X  X  X  X  X  | 010
0      1      | X  X  X  X  X  X  | 110
1      X      | X  X  0  0  0  0  | 010
1      X      | X  X  0  0  1  0  | 110
1      X      | X  X  0  1  0  0  | 000
1      X      | X  X  0  1  0  1  | 001
1      X      | X  X  1  0  1  0  | 111

Instruction opcode   ALUOp   Instruction operation   Funct field   Desired ALU action   ALU control input
LW                   00      load word               xxxxxx        add                  010
SW                   00      store word              xxxxxx        add                  010
Branch eq            01      BEQ                     xxxxxx        subtract             110
R-type               10      ADD                     100000        add                  010
R-type               10      SUB                     100010        subtract             110
R-type               10      AND                     100100        and                  000
R-type               10      OR                      100101        or                   001
R-type               10      SLT                     101010        set on less          111
The n-bit ALUCtrl is encoded according to the ALU implementation
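The truth table can be written as a small function of ALUOp and the funct field. This is a sketch of the table's behavior, not a gate-level design; the function and table names are assumptions. Only F3 to F0 are examined for R-type instructions, matching the don't-care entries for F5 and F4.

```python
# R-type rows of the truth table, keyed by funct bits F3..F0.
R_TYPE_OPERATION = {
    0b0000: 0b010,   # add
    0b0010: 0b110,   # subtract
    0b0100: 0b000,   # and
    0b0101: 0b001,   # or
    0b1010: 0b111,   # set on less than
}

def alu_control(alu_op, funct):
    """Return the 3-bit ALU control input for a 2-bit ALUOp and 6-bit funct."""
    if alu_op & 0b10:                          # ALUOp1 = 1: R-type, use funct
        return R_TYPE_OPERATION[funct & 0xF]
    return 0b110 if alu_op & 0b01 else 0b010   # beq subtracts, lw/sw add

slt_operation = alu_control(0b10, 0b101010)
```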
72. Setting ALU Control Bits
opcode   ALUOp1   ALUOp0   Operation          funct    ALU function       ALU control
lw       0        0        load word          XXXXXX   add                010
sw       0        0        store word         XXXXXX   add                010
beq      0        1        branch equal       XXXXXX   subtract           110
R-type   1        0        add                100000   add                010
R-type   1        0        subtract           100010   subtract           110
R-type   1        0        AND                100100   AND                000
R-type   1        0        OR                 100101   OR                 001
R-type   1        0        set-on-less-than   101010   set-on-less-than   111
73. ALU Control
The n-bit ALUCtrl is encoded according to the ALU implementation
Other ALU control encodings are also possible. The idea is to choose a binary encoding that will simplify the logic
74. Main Control
R-type: Op6 (31-26) | Rs5 (25-21) | Rt5 (20-16) | Rd5 (15-11) | sa5 (10-6) | funct6 (5-0)
I-type: Op6 (31-26) | Rs5 (25-21) | Rt5 (20-16) | immediate16 (15-0)
Some Observations
opcode is always in bits 31-26
two registers to be read are always Rs (bits 25-21) and Rt (bits 20-16)
base register for load/stores is always Rs (bits 25-21)
16-bit offset for branch equal and load/store is always bits 15-0
destination register for loads is in bits 20-16 (Rt) while for R-type instructions it is in bits 15-11 (Rd) (will require multiplexor to select)
75. Control Signal
Main Control Input:
6-bit opcode field from instruction
Main Control Output:
10 control signals to the Datapath
ALU Control Input:
6-bit opcode field from instruction
6-bit function field from instruction
ALU Control Output:
ALUCtrl signal for ALU
[Figure: the instruction memory feeds Op6 to Main Control and Op6 and funct6 to ALU Control; Main Control drives RegDst, RegWrite, ExtOp, ALUSrc, MemRead, MemWrite, MemtoReg, Beq, Bne, and J into the Datapath; ALU Control drives ALUCtrl into the ALU]
76. Datapath with Control
[Figure: single-cycle datapath with control: the Control unit decodes Instruction [31-26] and drives RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; Instruction [25-21] and [20-16] select the read registers; a mux chooses Instruction [20-16] or [15-11] as the write register; Instruction [15-0] is sign-extended from 16 to 32 bits; ALU control takes Instruction [5-0]; PCSrc selects between PC + 4 and the branch target]
77. Datapath with Control
[Figure: datapath with control: Main Control takes Op and produces RegDst, RegWrite, ExtOp, ALUSrc, MemRead, MemWrite, MemtoReg, J, Beq, Bne, and ALUop; ALU Ctrl takes ALUop and func; the Next PC block takes Imm26, Imm16, zero, and the incremented 30-bit PC and forms the 30-bit jump or branch target address selected by PCSrc]
79. Main Control Signals
Signal     Effect when ‘0’                                                       Effect when ‘1’
RegDst     Destination register = Rt                                             Destination register = Rd
RegWrite   None                                                                  Destination register is written with the data value on BusD
ExtOp      16-bit immediate is zero-extended                                     16-bit immediate is sign-extended
ALUSrc     Second ALU operand comes from the second register file output (BusT)  Second ALU operand comes from the extended 16-bit immediate
MemRead    None                                                                  Data memory is read: Data_out ← Memory[address]
MemWrite   None                                                                  Data memory is written: Memory[address] ← Data_in
MemtoReg   BusD = ALU result                                                     BusD = Data_out from Memory
Beq, Bne   PC ← PC + 4                                                           PC ← Branch target address if branch is taken
J          PC ← PC + 4                                                           PC ← Jump target address
80. Main Control Signal Values
80KICT, IIUM Single Cycle Processor Design
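Op | RegDst | RegWrite | ExtOp | ALUSrc | Beq | Bne | J | MemRead | MemWrite | MemtoReg
R-type | 1 = Rd | 1 | x | 0 = BusT | 0 | 0 | 0 | 0 | 0 | 0
addi | 0 = Rt | 1 | 1 = sign | 1 = Imm | 0 | 0 | 0 | 0 | 0 | 0
slti | 0 = Rt | 1 | 1 = sign | 1 = Imm | 0 | 0 | 0 | 0 | 0 | 0
andi | 0 = Rt | 1 | 0 = zero | 1 = Imm | 0 | 0 | 0 | 0 | 0 | 0
ori | 0 = Rt | 1 | 0 = zero | 1 = Imm | 0 | 0 | 0 | 0 | 0 | 0
xori | 0 = Rt | 1 | 0 = zero | 1 = Imm | 0 | 0 | 0 | 0 | 0 | 0
lw | 0 = Rt | 1 | 1 = sign | 1 = Imm | 0 | 0 | 0 | 1 | 0 | 1
sw | x | 0 | 1 = sign | 1 = Imm | 0 | 0 | 0 | 0 | 1 | x
beq | x | 0 | x | 0 = BusT | 1 | 0 | 0 | 0 | 0 | x
bne | x | 0 | x | 0 = BusT | 0 | 1 | 0 | 0 | 0 | x
j | x | 0 | x | x | 0 | 0 | 1 | 0 | 0 | x
x is a don’t care (can be 0 or 1), used to minimize logic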
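As a rough illustration, the control table above can be written as a lookup from instruction class to signal values. The names `FIELDS`, `CONTROL`, and `main_control` are assumptions made for this sketch, and don't-care entries are represented as `None`; a real control unit would be combinational logic on the 6-bit opcode, not a dictionary.

```python
X = None  # don't care: the hardware may drive either 0 or 1

# Column order follows the table: one tuple entry per control signal.
FIELDS = ("RegDst", "RegWrite", "ExtOp", "ALUSrc",
          "Beq", "Bne", "J", "MemRead", "MemWrite", "MemtoReg")

CONTROL = {
    "R-type": (1, 1, X, 0, 0, 0, 0, 0, 0, 0),
    "addi":   (0, 1, 1, 1, 0, 0, 0, 0, 0, 0),
    "slti":   (0, 1, 1, 1, 0, 0, 0, 0, 0, 0),
    "andi":   (0, 1, 0, 1, 0, 0, 0, 0, 0, 0),
    "ori":    (0, 1, 0, 1, 0, 0, 0, 0, 0, 0),
    "xori":   (0, 1, 0, 1, 0, 0, 0, 0, 0, 0),
    "lw":     (0, 1, 1, 1, 0, 0, 0, 1, 0, 1),
    "sw":     (X, 0, 1, 1, 0, 0, 0, 0, 1, X),
    "beq":    (X, 0, X, 0, 1, 0, 0, 0, 0, X),
    "bne":    (X, 0, X, 0, 0, 1, 0, 0, 0, X),
    "j":      (X, 0, X, X, 0, 0, 1, 0, 0, X),
}


def main_control(op: str) -> dict:
    """Return the control signal settings for one instruction class."""
    return dict(zip(FIELDS, CONTROL[op]))


print(main_control("lw"))
```

Note that the signals that change machine state (RegWrite, MemWrite) and the PC-steering signals (Beq, Bne, J) are never don't-cares; only signals whose result is discarded may be left as x.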
82. Controlling ALU Instructions
For R-type ALU instructions, RegDst is ‘1’ to select Rd as the destination register and ALUSrc is ‘0’ to select BusT as the second ALU input. The active part of the datapath is shown in green.
For I-type ALU instructions, RegDst is ‘0’ to select Rt as the destination register and ALUSrc is ‘1’ to select the extended immediate as the second ALU input. The active part of the datapath is shown in green.
[Figure: two copies of the datapath (PC, Instruction Memory, Registers, Extender, ALU). Top, R-type: RegDst = 1, ALUSrc = 0, RegWrite = 1. Bottom, I-type: RegDst = 0, ALUSrc = 1, RegWrite = 1.]
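The effect of the two multiplexers on R-type and I-type ALU instructions can be sketched as below. The function `select_alu_inputs` and its parameter names are hypothetical, chosen only to mirror the signal names on the slide.

```python
def select_alu_inputs(reg_dst, alu_src, rt, rd, bus_t, ext_imm):
    """Model the RegDst and ALUSrc muxes (names are illustrative).

    Returns (destination register number, second ALU operand).
    """
    dest_reg = rd if reg_dst == 1 else rt          # RegDst mux
    operand_b = ext_imm if alu_src == 1 else bus_t  # ALUSrc mux
    return dest_reg, operand_b


# R-type, e.g. add $3, $1, $2 with register $2 holding 7
r_type = select_alu_inputs(reg_dst=1, alu_src=0, rt=2, rd=3, bus_t=7, ext_imm=0)

# I-type, e.g. addi $2, $1, 5 (BusT is read but ignored)
i_type = select_alu_inputs(reg_dst=0, alu_src=1, rt=2, rd=0, bus_t=7, ext_imm=5)

print(r_type, i_type)
```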
83. Controlling LOAD Instructions
[Figure: datapath for the load instruction with the active path highlighted.]
ALUCtrl = ‘ADD’ to calculate data memory address as Reg(Rs) + sign-extend(Imm16)
ExtOp = 1 to sign-extend Immediate16 to 32 bits
ALUSrc = ‘1’ selects extended immediate as second ALU input
MemRead = ‘1’ to read data memory
MemWrite = ‘0’ because data memory is not written
RegDst = ‘0’ selects Rt as destination register
RegWrite = ‘1’ to enable writing of register file
MemtoReg = ‘1’ places the data read from memory on BusD
Clock edge updates PC and Register Rt
lw Rt, offset(Rs)
84. Controlling STORE Instructions
[Figure: datapath for the store instruction with the active path highlighted.]
ALUCtrl = ‘ADD’ to calculate data memory address as Reg(Rs) + sign-extend(Imm16)
ExtOp = 1 to sign-extend Immediate16 to 32 bits
ALUSrc = ‘1’ selects extended immediate as second ALU input
MemWrite = ‘1’ to write data memory
MemRead = ‘0’ because data memory is not read
RegDst = ‘X’ because no register is written
RegWrite = ‘0’ to disable writing of register file
MemtoReg = ‘X’ because we don’t care what data is put on BusD
Clock edge updates PC and Data Memory
sw Rt, offset(Rs)
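The load and store settings can be summarized with a small behavioral sketch. It assumes a register file modeled as a Python list and a data memory modeled as a dictionary keyed by byte address; the helper names (`effective_address`, `lw`, `sw`) are introduced here for illustration only.

```python
def effective_address(base: int, imm16: int) -> int:
    """ALUCtrl = ADD with ExtOp = 1: Reg(Rs) + sign-extend(Imm16)."""
    imm16 &= 0xFFFF
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16  # sign-extend
    return (base + offset) & 0xFFFFFFFF                     # 32-bit wrap


def lw(regs, mem, rs, rt, imm16):
    """MemRead = 1, MemtoReg = 1, RegDst = 0, RegWrite = 1."""
    addr = effective_address(regs[rs], imm16)
    regs[rt] = mem.get(addr, 0)   # unwritten memory reads as 0 (assumption)


def sw(regs, mem, rs, rt, imm16):
    """MemWrite = 1, RegWrite = 0: only data memory changes."""
    addr = effective_address(regs[rs], imm16)
    mem[addr] = regs[rt]


regs = [0] * 32
mem = {}
regs[29] = 100           # base register
regs[8] = 0xABCD         # value to store
sw(regs, mem, rs=29, rt=8, imm16=0xFFFC)   # offset -4, address 96
lw(regs, mem, rs=29, rt=9, imm16=0xFFFC)   # reads the same word back
print(mem, regs[9])
```

Both instructions share the same address calculation; they differ only in which state element is written at the clock edge.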
85. Datapath with Control
[Figure: single-cycle datapath with control. Blocks: PC, instruction memory, PC + 4 adder, branch adder with shift left 2, registers, sign extend (16 to 32), ALU with Zero output, ALU control, data memory, and four muxes. Instruction [31-26] feeds Control, which drives RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. Instruction [25-21] and [20-16] select the read registers, Instruction [15-11] is the alternative write register, Instruction [15-0] feeds sign extend, and Instruction [5-0] feeds ALU control. PCSrc selects between PC + 4 and the branch target.]
86. Adding Jump and Branch
Additional Control Signals
J, Beq, Bne for jump and branch instructions
Zero flag of the ALU is examined
PCSrc = 1 for jump & taken branch
Next PC logic computes jump or branch target instruction address
[Figure: datapath with the Next PC block added. Next PC takes Imm16, Imm26, the incremented PC, the J, Beq, and Bne signals, and the ALU zero flag; it outputs PCSrc and the 30-bit jump or branch target address, which a mux selects in place of the incremented PC.]
87. Next PC
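[Figure: Next PC block. Imm16 is sign-extended (SE) and added to Inc PC to form the branch target; the 4 most-significant bits of the PC are concatenated with Imm26 to form the jump target; a mux selects the branch or jump target address; Beq, Bne, J, and Zero produce PCSrc.]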
Imm16 is sign-extended to 30 bits
Jump target address: upper 4 bits of PC are concatenated with Imm26
PCSrc = J + (Beq . Zero) + (Bne . NOT Zero)
Sign-Extension: Most-significant bit is replicated
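A minimal behavioral sketch of the Next PC logic follows, assuming a 30-bit word-addressed PC as in this datapath. A branch on Bne is taken when Zero is 0. The function names are hypothetical, and taking the upper 4 bits from the current PC (rather than the incremented one) follows the slide text and is an assumption of this sketch.

```python
MASK30 = (1 << 30) - 1


def sign_extend16_to30(imm16: int) -> int:
    """Replicate the most-significant bit of Imm16 into the upper 14 bits."""
    imm16 &= 0xFFFF
    return imm16 | 0x3FFF0000 if imm16 & 0x8000 else imm16


def next_pc(pc30, imm16, imm26, j, beq, bne, zero):
    """Return the 30-bit PC value loaded at the next clock edge."""
    inc_pc = (pc30 + 1) & MASK30
    # PCSrc = J + (Beq . Zero) + (Bne . NOT Zero)
    pc_src = j | (beq & zero) | (bne & (zero ^ 1))
    if j:
        # Jump target: upper 4 bits of PC concatenated with Imm26
        target = ((pc30 >> 26) << 26) | (imm26 & 0x3FFFFFF)
    else:
        # Branch target: incremented PC + sign-extended Imm16 (mod 2^30)
        target = (inc_pc + sign_extend16_to30(imm16)) & MASK30
    return target if pc_src else inc_pc


print(next_pc(4, 16, 0, j=0, beq=1, bne=0, zero=1))   # taken beq
print(next_pc(4, 16, 0, j=0, beq=1, bne=0, zero=0))   # beq not taken
```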
88. Controlling Jump Instruction
[Figure: datapath for the jump instruction with the active path highlighted. Labels: J = 1, Beq = 0, Bne = 0, PCSrc = 1, RegWr = 0, MemRead = 0, MemWrite = 0; RegDst, ExtOp, ALUSrc, ALUCtrl, and MemtoReg = x.]
MemRead, MemWrite, and RegWrite are 0
J = 1 to control jump. Next PC outputs Jump Target Address
We don’t care about RegDst, ExtOp, ALUSrc, ALUCtrl, and MemtoReg
Clock edge updates PC only
89. Controlling Branch Instr
[Figure: datapath for a taken beq with the active path highlighted. Labels: Beq = 1, Bne = 0, J = 0, Zero = 1, PCSrc = 1, ALUCtrl = SUB, ALUSrc = 0, RegWr = 0, MemRead = 0, MemWrite = 0; RegDst, ExtOp, and MemtoReg = x.]
RegWrite, MemRead, and MemWrite are 0
Either Beq = 1 or Bne = 1, depending on opcode
ALUSrc = 0 to select value on BusT
ALUCtrl = SUB to generate Zero Flag
Next PC outputs branch target address
PCSrc = 1 if branch is taken
Clock edge updates PC register only
beq Rs,Rt,label
90. Generic Datapath
[Figure: generic datapath. PC and +4 adder, instruction memory, registers (Rs, Rt, Rd), imm, ALU, data memory.]
1. Instruction Fetch
2. Decode/Register Read
3. Execute
4. Memory
5. Register Write
91. Drawbacks
Long cycle time
All instructions take as much time as the slowest instruction
ALU: Instruction Fetch, Decode / Reg Read, ALU, Reg Write
Load: Instruction Fetch, Decode / Reg Read, Compute Address, Memory Read, Reg Write (longest delay)
Store: Instruction Fetch, Decode / Reg Read, Compute Address, Memory Write
Jump: Instruction Fetch, Decode, PC Write
Branch: Instruction Fetch, Reg Read / Br Target, Compare & PC Write
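To make the drawback concrete, the sketch below computes the clock cycle time as the delay of the slowest instruction class. All stage delays are hypothetical numbers chosen for illustration (they do not come from the slides), and the PC write is assumed to take no extra time.

```python
# Hypothetical stage delays in picoseconds (assumed, not from the slides).
STAGE_DELAY_PS = {
    "fetch": 200,
    "decode_reg_read": 100,
    "alu": 200,
    "memory": 200,
    "reg_write": 100,
}

# Stages used by each instruction class, following the rows above.
STAGES = {
    "alu":    ["fetch", "decode_reg_read", "alu", "reg_write"],
    "load":   ["fetch", "decode_reg_read", "alu", "memory", "reg_write"],
    "store":  ["fetch", "decode_reg_read", "alu", "memory"],
    "jump":   ["fetch", "decode_reg_read"],
    "branch": ["fetch", "decode_reg_read", "alu"],
}

# Delay of each instruction class is the sum of its stage delays.
delay_ps = {name: sum(STAGE_DELAY_PS[s] for s in stages)
            for name, stages in STAGES.items()}

# In a single-cycle design the clock must fit the slowest instruction.
slowest = max(delay_ps, key=delay_ps.get)
cycle_time_ps = delay_ps[slowest]

print(delay_ps)
print(slowest, cycle_time_ps)
```

With these assumed delays a jump needs less than half the cycle it is given, which is the inefficiency that motivates multi-cycle and pipelined designs.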
92. Summary
5 steps to design a processor
Analyze instruction set => datapath requirements
Select datapath components & establish clocking methodology
Assemble datapath meeting the requirements
Analyze implementation of each instruction to determine control signals
Assemble the control logic
MIPS makes control easy
Instructions are of the same size
Source registers are always in the same place
Immediate values are of the same size and in the same location
Operations are always on registers/immediates
Single cycle datapath => CPI = 1, but long clock cycle
93. Put Control Signal for BNE
[Figure: datapath with PC, instruction memory, registers, ALU (3-bit ALU operation, Zero output), data memory, sign extend (16 to 32), shift left 2, two adders, and muxes; control signals RegWrite, MemRead, MemWrite, PCSrc, ALUSrc, MemtoReg.]
Adding branch capability and another multiplexor
Instruction address is either PC+4 or branch target address
Extra adder needed as both adders operate in each cycle
New multiplexor
Important note: in a single-cycle implementation data cannot be stored during an instruction; it only moves through combinational logic
Question: is the MemRead signal really needed?! Think of RegWrite…!
96. addi $s7, $sp, 16
RegDst ALUSrc PCSrc MemtoReg
0 1 0 0
A B C D E F
29 19 16 X 16 100
G H I J K L
64 84 20 X 116 16
97. addi $s7, $sp, 16
[Figure: datapath annotated with the values x, 100, X, 16, 116, 84, 20, 64, 23, 16, 16, 29, 23.]
1. SP = 100
2. PC = 16
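The numeric values in this exercise can be re-derived with a short sketch, given SP = 100 and PC = 16. The mapping from the letter labels to wires is not recoverable here, so the dictionary keys below are descriptive names introduced for illustration; the register numbers ($sp = 29, $s7 = 23) and the addi opcode (8) are standard MIPS values.

```python
REG_NUM = {"$sp": 29, "$s7": 23}   # standard MIPS register numbers
ADDI_OPCODE = 0x08


def trace_addi(pc, rs_value, rs, rt, imm16):
    """Compute the main datapath values for addi rt, rs, imm16."""
    imm16 &= 0xFFFF
    ext = imm16 - 0x10000 if imm16 & 0x8000 else imm16  # sign-extend
    return {
        "instruction": (ADDI_OPCODE << 26) | (rs << 21) | (rt << 16) | imm16,
        "read_register_1": rs,
        "write_register": rt,             # RegDst = 0 selects Rt
        "pc_plus_4": pc + 4,
        "sign_extended_imm": ext,
        "shifted_imm": ext << 2,          # shift left 2
        "branch_adder": pc + 4 + (ext << 2),
        "alu_result": rs_value + ext,     # ALUSrc = 1 selects the immediate
    }


t = trace_addi(pc=16, rs_value=100,
               rs=REG_NUM["$sp"], rt=REG_NUM["$s7"], imm16=16)
for name, value in t.items():
    print(name, value)
```

The branch adder still produces a value even though addi is not a branch; PCSrc = 0 simply discards it, which is typical of a single-cycle datapath where every unit computes on every cycle.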
98. Fault Tolerant
[Figure: single-cycle datapath with control: PC, instruction memory, registers, sign extend, shift left 2, two adders, ALU, ALU control, data memory, muxes, and the Control unit driving RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite.]
1. Line Cut
2. Stuck at 0
3. Stuck at 1
99. Quiz 2
Syllabus: Lecture 6
Date and Day: 03/04/2017 Monday
Time: 40 min (class time)
Venue: Class Room
Questions
As shown in the last couple of slides