Instruction Set
Architecture: MIPS
Prasenjit Dey
Instruction Set
 Set of instructions supported by a machine
 Specific to a machine
 However, follows a similar format
 Earlier computers used to have small and simple instruction sets
 Many modern computers also have simple instruction sets
Dr. Prasenjit Dey
2
MIPS Instruction Set Architecture(ISA)
 Well known ISA
 Established in 1980’s
 Used in embedded systems, e.g., consumer electronics, network/storage
equipment, cameras, printers, etc.
 Sony, Silicon graphics, etc. uses MIPS architecture
Dr. Prasenjit Dey
3
Arithmetic Instructions
 Add instruction: add two operands (a = b + c)
 Two source operands (b and c) and one destination operand (a)
add a, b, c
 Add three operands (a = b + c + d)
add c, c, d #c = c + d
add a, b, c #a = b + c
 All arithmetic operations are performed in this manner
 Design Principle 1: Simplicity favors regularity
 Regularity makes implementation simpler
 Simplicity enables higher performance at lower cost
Dr. Prasenjit Dey
4
Arithmetic MIPS instructions
 C code:
f = (a + b) * (c - d);
 Compiled MIPS code:
add t0, a, b # temp t0 = a + b
sub t1, c, d # temp t1 = c - d
mul f, t0, t1 # f = t0 * t1
Dr. Prasenjit Dey
5
Register Operands
 Arithmetic instructions use register operands
 MIPS has 32 registers, numbered from 0 to 31
 Each register is 32 bit long(word)
 Design Principle 2: Smaller is faster
 Registers are small but faster than memory
Dr. Prasenjit Dey
6
Register number and name Mapping
 Register 0 is the zero register($Zero) has a fixed value 0
 2 Value registers
 Register 2 and register 3, denoted by $v0 and $v1 respectively
 4 arguments
 Register 4 to register 7, denoted by $a0, $a1, $a2, $a3
 10 temporary registers
 Register 8 to register 15, denoted by $t0, $t1, …, $t7; $t0  register 8, $t1  register 9, so on.
 Register 24 and register 25, denoted by $t8 and $t9 respectively
 8 save registers
 Register16 to register 23, denoted by $s0, $s1, …, $s7; $s0  register 16, $s1  register 17, so on.
 Register 28 is the global pointer for static data denote by $gp
 Register 29 is the stack pointer denote by $sp
 Register 30 is the frame pointer denote by $fp
 Register 31 is the return address denote by $ra
Dr. Prasenjit Dey
7
Use of Register in MIPS
 C code:
f = (a + b) * (c - d);
 Compiled MIPS code:
save f, a, b, c, d in $s0, $s1, $s2, $s3, $s4
respectively
add $t0, $s1, $s2 # temp t0 = a + b
sub $t1, $s3, $s4 # temp t1 = c - d
mul $s0, $t0, $t1 # f = t0 * t1
Dr. Prasenjit Dey
8
Memory Operands
 Memory used to store composite data
 Arrays, structures, dynamic data
 Uses in arithmetic operations
 Load values from memory into registers
 Store result from register to memory
 Memory is byte addressed
 Each address identifies an 8-bit byte
 An word, 32 bit data occupy 4bytes
 byte [0, 1, 2, 3]  1 word
 Byte of an word can be aligned in 2 ways into the memory
 Big Endian: Most-significant byte(0) at least address(0) of a word
 Little Endian: least-significant byte(3) at least address (0) of a words
Memory
address
1st Byte 2nd Byte 3rd Byte 4th Byte
100
104
108
112
Memory
address
1st Byte 2nd Byte 3rd Byte 4th Byte
100 0 (MSB) 1 2 3 (LSB)
104 4 5 6 7
108 8 9 10 11
112 12 13 14 15
Dr. Prasenjit Dey
9
Memory Operand Example
 C code:
c = b + A[8];
 c, b are in $s1, $s2
 base address of A in $s3
A[12] = b + A[8];
 Compiled MIPS code:
 Array Index = 8
 1 item occupy 4 bytes(1 word)
 requires offset of (8*4) = 32
lw $t0, 32($s3) # Load Word A[8] into $t0
add $s1, $s2, $t0 # add b and A[8]
lw $t0, 32($s3) # Load Word A[8] into $t0
add $t0, $s2, $t0 # add b and A[8]
sw $t0, 48($s3) # store into A[12]
Dr. Prasenjit Dey
10
Immediate Operands
 Constant operand specified in an instruction
addi $s0, $s0, 4 # $s0  a; a = a + 4;
 No subtract immediate instruction
 Just use a negative constant
addi $s0, $s0, -4 # $s0  a; a = a - 4;
 Design Principle 3: Make the common case fast
 Small constants are common
 Immediate operand avoids a load instruction
Dr. Prasenjit Dey
11
The Constant Zero
 In MIPS, register 0 ($zero) has a constant 0
 Cannot be overwritten
 Add $zero, $zero, 2
 Usage
 E.g., move between registers
 mov $t2, $s1 # $t2  $s1
add $t2, $s1, $zero # $t2 = $s1 + 0
 NOT instruction can be implemented by NOR instruction
 nor $t0, $t1, $zero
Dr. Prasenjit Dey
12
Instruction Format
 32-bit instruction
 3 types of instructions
1. R-format instruction (Register type instructions)
2. I-format instruction (Immediate type instructions)
3. J-format instruction (Jump type instructions)
Dr. Prasenjit Dey
13
R-format Instructions
 Instruction fields
 op: operation code (opcode) (2^6 = 64 distinct opcode can be represented)
 rs: first source register number (As MIPS has 32 registers, so 5bits(2^5 = 32))
 rt: second source register number (3rd register/Register Third=(RT))
 rd: destination register number
 shamt: shift amount (5-bit can represent (2^5 = 32) bit shift, as reg. size = 1 word)
 funct: function code (extends opcode)
op rs rt rd shamt funct
6 bits 6 bits
5 bits 5 bits 5 bits 5 bits
Dr. Prasenjit Dey
14
R-format Instruction: Arithmetic operations
add $t0, $s1, $s2
arithmetic $s1 $s2 $t0 0 add
0 17 18 8 0 16
000000 10001 10010 01000 00000 010000
000000100011001001000000000100002
op rs rt rd shamt funct
6 bits 6 bits
5 bits 5 bits 5 bits 5 bits
0232401016
Let Opcode for arithmetic = 0
Let Encoding for add = 16
Dr. Prasenjit Dey
15
R-format Instructions: Shift Operations
op rs rt rd shamt funct
6 bits 6 bits
5 bits 5 bits 5 bits 5 bits
shamt: how many positions to shift
sll $t0, $t1, 5 # shift left 5 bits
srl $t0, $t1, 5 # shift right 5 bits
Shift $t1 $t0 5 sll
0
8 9 0 8 5 16
Let Opcode for shift = 8
Let Encoding for sll = 16
Dr. Prasenjit Dey
16
R-format Instructions: Logical Operations
op rs rt rd shamt funct
6 bits 6 bits
5 bits 5 bits 5 bits 5 bits
and $t0, $s0, $s1
or $t0, $s0, $s1
Logical $s0 $t0 0 and
$s1
10 16 17 8 0 32
Let Opcode for logical = 10
Let Encoding for and = 32
Dr. Prasenjit Dey
17
I-format Instructions
 Immediate arithmetic and load/store instructions
 rs: source register, base address of an array
 rt: destination or source register number
 Address: offset field
 Design Principle 4: Good design demands good compromises
 Different formats complicate decoding
 Keep formats as similar as possible
op rs rt constant or address
6 bits 5 bits 5 bits 16 bits
Dr. Prasenjit Dey
18
I-format Instruction: Example
 Arithmetic instructions with constants
 addi $s1, $s0, 4
 Load/store instructions
 lw $t0, 32($s3)
addi $s0 $s1 4
6 bits 5 bits 5 bits 16 bits
lw $s3 $t0 32
6 bits 5 bits 5 bits 16 bits
Dr. Prasenjit Dey
19
Conditional Operations
 Conditional branch
 If a condition is true, branch to a labeled instruction
 Otherwise, continue sequentially
 bne rs, rt, L1 #branch if not equal
 if (rs != rt) branch to instruction labeled L1;
 beq rs, rt, L2 #branch if equal
 if (rs == rt) branch to instruction labeled L2;
 Unconditional branch instruction
 j L3
 unconditional jump to instruction labeled L3
Dr. Prasenjit Dey
20
If Statements
 C code:
if (i==j)
c = a + b;
else
c = a - b;
 Compiled MIPS code:
Save c, a, b, i , j in … in $s0, $s1, $s2, $s3, $s4
bne $s3, $s4, Else # i != j
add $s0, $s1, $s2 # c = a + b
j Exit # unconditional jump
Else: sub $s0, $s1, $s2 # c = a – b
Exit
Dr. Prasenjit Dey
21
Swap operations
 C code:
 Swap procedure (leaf)
 void swap(int v[], int i)
{
int temp;
temp = v[i];
v[i] = v[i+1];
v[i+1] = temp;
}
 MIPS code:
Save v in $a0, i in $a1, temp in $t0
swap: sll $t1, $a1, 2 # $t1 = i * 4
add $t1, $a0, $t1 # $t1 = v+(i*4) #(address of v[i])
lw $t0, 0($t1) #$t0 = v[i]
lw $t2, 4($t1) #$t2 = v[i+1]
sw $t2, 0($t1) #v[i] = $t2
sw $t0, 4($t1) #v[i+1] = $t0
jr $ra #return to calling routine
Dr. Prasenjit Dey
22
Loop Statements
 C code:
 while (save[i] == k)
i += 1;
 Compiled MIPS code:
i in $s0, k in $s1, address of save in $s2
Loop: sll $t1, $s0, 2 #t1 = i*4;byte addressing
add $t1, $t1, $s2 #t1 = base addr. + offset
lw $t0, 0($t1) #load data into $t0
bne $t0, $s1, Exit
addi $s0, $s0, 1
j Loop
Exit:
Dr. Prasenjit Dey
23
Basic Blocks
 A basic block is
 Sequence of instructions
 No branches
 e.g., a set of instructions within a if/else construct, loop etc.
 Can be optimized via compiler
 Advanced processor can also be used for faster execution
Dr. Prasenjit Dey
24
other Conditional instructions
 slt rd, rs, rt #(set less than) set result to 1 if condition is true, else set to 0
 if (rs < rt) rd = 1; else rd = 0;
 slti rt, rs, constant
 if (rs < constant) rt = 1; else rt = 0;
 Uses in conjunction with beq (branch equal), bne (branch not equal)
 blt $s0, $s1 L #(branch less than) take branch if condition is true, else don’t
if ($s0 < $s1) goto L
slt $t0, $s0, $s1 # if ($s0 < $s1), $t0 = 1
bne $t0, $zero, L # branch to L
Dr. Prasenjit Dey
25
Branch Instruction Design
 Why not blt(branch less than), ble(branch less than or equal), etc?
 Hardware for <, ≤, … slower than =, ≠
 Composite instruction, requiring a slower clock
 This is a good design compromise
 beq and bne are the common case
Dr. Prasenjit Dey
26
Procedure Call
 Sequences
1. Place parameters in registers
2. Transfer control to procedure
3. Acquire storage for procedure
4. Perform procedure’s operations
5. Place result in register for caller
6. Return to place of call
Dr. Prasenjit Dey
27
Procedure Call Instructions
 Procedure call:
 jump and link to the location of the procedure definition
jal ProcedureLabel #(jump and link) jump to first address of the procedure
 Save the address of next instruction after the procedure call in $ra #return address
 Procedure return:
jr $ra # (jump register) jump to $ra
 Update PC  $ra
Dr. Prasenjit Dey
28
Procedure Call: Example
 C code:
 int leaf_example (int a, b, c, d)
{
int f;
f = (a + b) - (c + d);
return f;
}
 MIPS code:
Save arguments a, b, c, d in $a0, …, $a3
f in $s0 (on stack), result in $v0
leaf_example:
addi $sp, $sp, -4
sw $s0, 0($sp)
add $t0, $a0, $a1
add $t1, $a2, $a3
sub $s0, $t0, $t1
add $v0, $s0, $zero
lw $s0, 0($sp)
addi $sp, $sp, 4
jr $ra
#Save $s0 on stack
#Procedure body
#Restore $s0
#Result
#Return
$s0
$sp
Stack
Dr. Prasenjit Dey
29
Nested Procedure Call: Example
 For nested call, caller needs to save on the stack:
 Its return address
 Any arguments and temporaries needed after the call
 Restore from the stack after the call
Dr. Prasenjit Dey
30
Nested Procedure Call: Example
 C code:
 int fact (int n)
{
if (n < 1) return f; # f=1 (0!=1)
else return n * fact(n - 1);
}
 MIPS code:
 Save argument n in $a0, result f in $v0
 fact:
addi $sp, $sp, -8 # adjust stack for 2 items
sw $ra, 4($sp) # save return address
sw $a0, 0($sp) # save argument
slti $t0, $a0, 1 # if n < 1 then $t0 = 1
beq $t0, $zero, L1 # (if n not less than1) goto L1
addi $v0, $zero, 1 # if so, result is 1
addi $sp, $sp, 8 # pop 2 items from stack
jr $ra # and return
L1: addi $a0, $a0, -1 # decrement n = n -1
jal fact # recursive call
lw $a0, 0($sp) # restore original n
lw $ra, 4($sp) # and return address
addi $sp, $sp, 8 # pop 2 items from stack
mul $v0, $a0, $v0 # multiply to get result (n*(n-1))
jr $ra # and return
$a0
$ra
$sp
Stack
$v0 = n
$a0 = n-1
Dr. Prasenjit Dey
31
32 bit operations
 Load 32-bit address/constant into 32 bit register
 lui $t0, 0000 0000 0011 1101 (higher order 2 bytes) #(load upper immediate)
 Load most significant 16 bits of address/constant into upper half of $t0
 ori $t0, $t0, 0000 0000 1010 1111 (lower order 2 bytes)
 OR least significant 16 bits into lower half of $t0, keeping upper half unchanged
0000 0000 0011 1101 0000 0000 0000 0000
0000 0000 0011 1101 0000 0000 1010 1111
Dr. Prasenjit Dey
32
Branch Addressing
 Conditional branch instructions has
 Opcode, two registers, address field
 PC-relative addressing
 Target address = PC + offset × 4 #1 word = 4 byte
 Here, PC already incremented by 4
op rs rt constant or address
6 bits 5 bits 5 bits 16 bits
Dr. Prasenjit Dey
33
J-format instructions
 Unconditional branch instructions follow J-format instruction format
 j and jal
 Can jump into any location within a text segment
 Size of a text segment = 2^28 = 256MB (26bit address field + 2)
 Pseudo-direct jump addressing
 To update PC
 concatenate top 4 bits of old PC with 26-bit jump address
 Target address = PC31…28 : (address × 4)
op address
6 bits 26 bits
Dr. Prasenjit Dey
34
Branching Far Away
 If branch target is too far to encode with 16-bit offset, assembler rewrites the code
 For example
beq $s0, $s1, L1
↓
bne $s0, $s1, L2
j L1
L2:
Dr. Prasenjit Dey
35
Addressing Mode Summary
op rs rt immediate
Immediate Addressing
op rs rt rd Shamt funct
Register Addressing
op rs rt Address
Base Addressing
op rs rt Address
PC-relative Addressing
op Address
Pseudo Direct Addressing
Register
PC
PC
Register
Memory
Memory
Memory
+
+
+
Dr. Prasenjit Dey
36
Code Optimization using Pointers
 Use pointers instead of array for code optimization
 Array involves indexing
 To get the effective address
 Compute offset by multiplying index by element size
 Add offset with the base address of array
 Pointers have the effective address automatically
 Don’t need to compute offset from index
 Don’t need to computer effective address from offset
Dr. Prasenjit Dey
37
Demonstration: C Code
ArrayTest(int a[], int size) {
int i;
for (i = 0; i < size; i += 1)
a[i] = 0;
}
Save i, a, size in $t0, $a0, $a1
add $t0, $zero, $zero # i = 0
L1: sll $t1, $t0, 2 # $t1 = i * 4
add $t2, $a0, $t1 # $t2 = &a[i]
sw $zero, 0($t2) # a[i] = 0
addi $t0, $t0, 1 # i = i + 1
slt $t3, $t0, $a1 # $t3 =1 if(i < size)
bne $t3, $zero, L1 # if $t3=1, goto L1
PointerTest(int *a, int size) {
int *p;
for (p = &a[0]; p < &a[size]; p = p + 1)
*p = 0;
}
Save p in $t0
add $t0, $a0, $zero # p = & a[0]
sll $t1, $a1, 2 # $t1 = size * 4
add $t2, $a0, $t1 # $t2 =&a[size]
L2: sw $zero, 0($t2) # a[p] = 0
addi $t0, $t0, 4 # p = p + 4
slt $t3, $t0, $t2 #$t3 =(p<&a[size])
bne $t3, $zero, L2 # if $t3=1, goto L2
Dr. Prasenjit Dey
38
6 instructions inside loop
4 instructions inside loop
Constraints
 Complex instructions reduce size of instruction set
 Provide high performance
 However, complex instructions are hard to implement
 Creates pipeline hazards, hamper overall performance
 Compiler optimization can be used to run simple instructions faster
 Assembly codes can be used to produce high performance
 Handling Pointers with precautions
 Dangling pointers, memory dump
 Usually, ISAs are backward compatible
 Add up new instructions in the ISA but doesn't remove old instructions
Dr. Prasenjit Dey
39
Summary
 Design principles
1. Simplicity favors regularity
2. Smaller is faster
3. Make the common case fast
4. Good design demands good compromises
Dr. Prasenjit Dey
40

Instruction Set Architecture: MIPS

  • 1.
  • 2.
    Instruction Set  Setof instructions supported by a machine  Specific to a machine  However, follows a similar format  Earlier computers used to have small and simple instruction sets  Many modern computers also have simple instruction sets Dr. Prasenjit Dey 2
  • 3.
    MIPS Instruction SetArchitecture(ISA)  Well known ISA  Established in 1980’s  Used in embedded systems, e.g., consumer electronics, network/storage equipment, cameras, printers, etc.  Sony, Silicon graphics, etc. uses MIPS architecture Dr. Prasenjit Dey 3
  • 4.
    Arithmetic Instructions  Addinstruction: add two operands (a = b + c)  Two source operands (b and c) and one destination operand (a) add a, b, c  Add three operands (a = b + c + d) add c, c, d #c = c + d add a, b, c #a = b + c  All arithmetic operations are performed in this manner  Design Principle 1: Simplicity favors regularity  Regularity makes implementation simpler  Simplicity enables higher performance at lower cost Dr. Prasenjit Dey 4
  • 5.
    Arithmetic MIPS instructions C code: f = (a + b) * (c - d);  Compiled MIPS code: add t0, a, b # temp t0 = a + b sub t1, c, d # temp t1 = c - d mul f, t0, t1 # f = t0 * t1 Dr. Prasenjit Dey 5
  • 6.
    Register Operands  Arithmeticinstructions use register operands  MIPS has 32 registers, numbered from 0 to 31  Each register is 32 bit long(word)  Design Principle 2: Smaller is faster  Registers are small but faster than memory Dr. Prasenjit Dey 6
  • 7.
    Register number andname Mapping  Register 0 is the zero register($Zero) has a fixed value 0  2 Value registers  Register 2 and register 3, denoted by $v0 and $v1 respectively  4 arguments  Register 4 to register 7, denoted by $a0, $a1, $a2, $a3  10 temporary registers  Register 8 to register 15, denoted by $t0, $t1, …, $t7; $t0  register 8, $t1  register 9, so on.  Register 24 and register 25, denoted by $t8 and $t9 respectively  8 save registers  Register16 to register 23, denoted by $s0, $s1, …, $s7; $s0  register 16, $s1  register 17, so on.  Register 28 is the global pointer for static data denote by $gp  Register 29 is the stack pointer denote by $sp  Register 30 is the frame pointer denote by $fp  Register 31 is the return address denote by $ra Dr. Prasenjit Dey 7
  • 8.
    Use of Registerin MIPS  C code: f = (a + b) * (c - d);  Compiled MIPS code: save f, a, b, c, d in $s0, $s1, $s2, $s3, $s4 respectively add $t0, $s1, $s2 # temp t0 = a + b sub $t1, $s3, $s4 # temp t1 = c - d mul $s0, $t0, $t1 # f = t0 * t1 Dr. Prasenjit Dey 8
  • 9.
    Memory Operands  Memoryused to store composite data  Arrays, structures, dynamic data  Uses in arithmetic operations  Load values from memory into registers  Store result from register to memory  Memory is byte addressed  Each address identifies an 8-bit byte  An word, 32 bit data occupy 4bytes  byte [0, 1, 2, 3]  1 word  Byte of an word can be aligned in 2 ways into the memory  Big Endian: Most-significant byte(0) at least address(0) of a word  Little Endian: least-significant byte(3) at least address (0) of a words Memory address 1st Byte 2nd Byte 3rd Byte 4th Byte 100 104 108 112 Memory address 1st Byte 2nd Byte 3rd Byte 4th Byte 100 0 (MSB) 1 2 3 (LSB) 104 4 5 6 7 108 8 9 10 11 112 12 13 14 15 Dr. Prasenjit Dey 9
  • 10.
    Memory Operand Example C code: c = b + A[8];  c, b are in $s1, $s2  base address of A in $s3 A[12] = b + A[8];  Compiled MIPS code:  Array Index = 8  1 item occupy 4 bytes(1 word)  requires offset of (8*4) = 32 lw $t0, 32($s3) # Load Word A[8] into $t0 add $s1, $s2, $t0 # add b and A[8] lw $t0, 32($s3) # Load Word A[8] into $t0 add $t0, $s2, $t0 # add b and A[8] sw $t0, 48($s3) # store into A[12] Dr. Prasenjit Dey 10
  • 11.
    Immediate Operands  Constantoperand specified in an instruction addi $s0, $s0, 4 # $s0  a; a = a + 4;  No subtract immediate instruction  Just use a negative constant addi $s0, $s0, -4 # $s0  a; a = a - 4;  Design Principle 3: Make the common case fast  Small constants are common  Immediate operand avoids a load instruction Dr. Prasenjit Dey 11
  • 12.
    The Constant Zero In MIPS, register 0 ($zero) has a constant 0  Cannot be overwritten  Add $zero, $zero, 2  Usage  E.g., move between registers  mov $t2, $s1 # $t2  $s1 add $t2, $s1, $zero # $t2 = $s1 + 0  NOT instruction can be implemented by NOR instruction  nor $t0, $t1, $zero Dr. Prasenjit Dey 12
  • 13.
    Instruction Format  32-bitinstruction  3 types of instructions 1. R-format instruction (Register type instructions) 2. I-format instruction (Immediate type instructions) 3. J-format instruction (Jump type instructions) Dr. Prasenjit Dey 13
  • 14.
    R-format Instructions  Instructionfields  op: operation code (opcode) (2^6 = 64 distinct opcode can be represented)  rs: first source register number (As MIPS has 32 registers, so 5bits(2^5 = 32))  rt: second source register number (3rd register/Register Third=(RT))  rd: destination register number  shamt: shift amount (5-bit can represent (2^5 = 32) bit shift, as reg. size = 1 word)  funct: function code (extends opcode) op rs rt rd shamt funct 6 bits 6 bits 5 bits 5 bits 5 bits 5 bits Dr. Prasenjit Dey 14
  • 15.
    R-format Instruction: Arithmeticoperations add $t0, $s1, $s2 arithmetic $s1 $s2 $t0 0 add 0 17 18 8 0 16 000000 10001 10010 01000 00000 010000 000000100011001001000000000100002 op rs rt rd shamt funct 6 bits 6 bits 5 bits 5 bits 5 bits 5 bits 0232401016 Let Opcode for arithmetic = 0 Let Encoding for add = 16 Dr. Prasenjit Dey 15
  • 16.
    R-format Instructions: ShiftOperations op rs rt rd shamt funct 6 bits 6 bits 5 bits 5 bits 5 bits 5 bits shamt: how many positions to shift sll $t0, $t1, 5 # shift left 5 bits srl $t0, $t1, 5 # shift right 5 bits Shift $t1 $t0 5 sll 0 8 9 0 8 5 16 Let Opcode for shift = 8 Let Encoding for sll = 16 Dr. Prasenjit Dey 16
  • 17.
    R-format Instructions: LogicalOperations op rs rt rd shamt funct 6 bits 6 bits 5 bits 5 bits 5 bits 5 bits and $t0, $s0, $s1 or $t0, $s0, $s1 Logical $s0 $t0 0 and $s1 10 16 17 8 0 32 Let Opcode for logical = 10 Let Encoding for and = 32 Dr. Prasenjit Dey 17
  • 18.
    I-format Instructions  Immediatearithmetic and load/store instructions  rs: source register, base address of an array  rt: destination or source register number  Address: offset field  Design Principle 4: Good design demands good compromises  Different formats complicate decoding  Keep formats as similar as possible op rs rt constant or address 6 bits 5 bits 5 bits 16 bits Dr. Prasenjit Dey 18
  • 19.
    I-format Instruction: Example Arithmetic instructions with constants  addi $s1, $s0, 4  Load/store instructions  lw $t0, 32($s3) addi $s0 $s1 4 6 bits 5 bits 5 bits 16 bits lw $s3 $t0 32 6 bits 5 bits 5 bits 16 bits Dr. Prasenjit Dey 19
  • 20.
    Conditional Operations  Conditionalbranch  If a condition is true, branch to a labeled instruction  Otherwise, continue sequentially  bne rs, rt, L1 #branch if not equal  if (rs != rt) branch to instruction labeled L1;  beq rs, rt, L2 #branch if equal  if (rs == rt) branch to instruction labeled L2;  Unconditional branch instruction  j L3  unconditional jump to instruction labeled L3 Dr. Prasenjit Dey 20
  • 21.
    If Statements  Ccode: if (i==j) c = a + b; else c = a - b;  Compiled MIPS code: Save c, a, b, i , j in … in $s0, $s1, $s2, $s3, $s4 bne $s3, $s4, Else # i != j add $s0, $s1, $s2 # c = a + b j Exit # unconditional jump Else: sub $s0, $s1, $s2 # c = a – b Exit Dr. Prasenjit Dey 21
  • 22.
    Swap operations  Ccode:  Swap procedure (leaf)  void swap(int v[], int i) { int temp; temp = v[i]; v[i] = v[i+1]; v[i+1] = temp; }  MIPS code: Save v in $a0, i in $a1, temp in $t0 swap: sll $t1, $a1, 2 # $t1 = i * 4 add $t1, $a0, $t1 # $t1 = v+(i*4) #(address of v[i]) lw $t0, 0($t1) #$t0 = v[i] lw $t2, 4($t1) #$t2 = v[i+1] sw $t2, 0($t1) #v[i] = $t2 sw $t0, 4($t1) #v[i+1] = $t0 jr $ra #return to calling routine Dr. Prasenjit Dey 22
  • 23.
    Loop Statements  Ccode:  while (save[i] == k) i += 1;  Compiled MIPS code: i in $s0, k in $s1, address of save in $s2 Loop: sll $t1, $s0, 2 #t1 = i*4;byte addressing add $t1, $t1, $s2 #t1 = base addr. + offset lw $t0, 0($t1) #load data into $t0 bne $t0, $s1, Exit addi $s0, $s0, 1 j Loop Exit: Dr. Prasenjit Dey 23
  • 24.
    Basic Blocks  Abasic block is  Sequence of instructions  No branches  e.g., a set of instructions within a if/else construct, loop etc.  Can be optimized via compiler  Advanced processor can also be used for faster execution Dr. Prasenjit Dey 24
  • 25.
    other Conditional instructions slt rd, rs, rt #(set less than) set result to 1 if condition is true, else set to 0  if (rs < rt) rd = 1; else rd = 0;  slti rt, rs, constant  if (rs < constant) rt = 1; else rt = 0;  Uses in conjunction with beq (branch equal), bne (branch not equal)  blt $s0, $s1 L #(branch less than) take branch if condition is true, else don’t if ($s0 < $s1) goto L slt $t0, $s0, $s1 # if ($s0 < $s1), $t0 = 1 bne $t0, $zero, L # branch to L Dr. Prasenjit Dey 25
  • 26.
    Branch Instruction Design Why not blt(branch less than), ble(branch less than or equal), etc?  Hardware for <, ≤, … slower than =, ≠  Composite instruction, requiring a slower clock  This is a good design compromise  beq and bne are the common case Dr. Prasenjit Dey 26
  • 27.
    Procedure Call  Sequences 1.Place parameters in registers 2. Transfer control to procedure 3. Acquire storage for procedure 4. Perform procedure’s operations 5. Place result in register for caller 6. Return to place of call Dr. Prasenjit Dey 27
  • 28.
    Procedure Call Instructions Procedure call:  jump and link to the location of the procedure definition jal ProcedureLabel #(jump and link) jump to first address of the procedure  Save the address of next instruction after the procedure call in $ra #return address  Procedure return: jr $ra # (jump register) jump to $ra  Update PC  $ra Dr. Prasenjit Dey 28
  • 29.
    Procedure Call: Example C code:  int leaf_example (int a, b, c, d) { int f; f = (a + b) - (c + d); return f; }  MIPS code: Save arguments a, b, c, d in $a0, …, $a3 f in $s0 (on stack), result in $v0 leaf_example: addi $sp, $sp, -4 sw $s0, 0($sp) add $t0, $a0, $a1 add $t1, $a2, $a3 sub $s0, $t0, $t1 add $v0, $s0, $zero lw $s0, 0($sp) addi $sp, $sp, 4 jr $ra #Save $s0 on stack #Procedure body #Restore $s0 #Result #Return $s0 $sp Stack Dr. Prasenjit Dey 29
  • 30.
    Nested Procedure Call:Example  For nested call, caller needs to save on the stack:  Its return address  Any arguments and temporaries needed after the call  Restore from the stack after the call Dr. Prasenjit Dey 30
  • 31.
    Nested Procedure Call:Example  C code:  int fact (int n) { if (n < 1) return f; # f=1 (0!=1) else return n * fact(n - 1); }  MIPS code:  Save argument n in $a0, result f in $v0  fact: addi $sp, $sp, -8 # adjust stack for 2 items sw $ra, 4($sp) # save return address sw $a0, 0($sp) # save argument slti $t0, $a0, 1 # if n < 1 then $t0 = 1 beq $t0, $zero, L1 # (if n not less than1) goto L1 addi $v0, $zero, 1 # if so, result is 1 addi $sp, $sp, 8 # pop 2 items from stack jr $ra # and return L1: addi $a0, $a0, -1 # decrement n = n -1 jal fact # recursive call lw $a0, 0($sp) # restore original n lw $ra, 4($sp) # and return address addi $sp, $sp, 8 # pop 2 items from stack mul $v0, $a0, $v0 # multiply to get result (n*(n-1)) jr $ra # and return $a0 $ra $sp Stack $v0 = n $a0 = n-1 Dr. Prasenjit Dey 31
  • 32.
    32 bit operations Load 32-bit address/constant into 32 bit register  lui $t0, 0000 0000 0011 1101 (higher order 2 bytes) #(load upper immediate)  Load most significant 16 bits of address/constant into upper half of $t0  ori $t0, $t0, 0000 0000 1010 1111 (lower order 2 bytes)  OR least significant 16 bits into lower half of $t0, keeping upper half unchanged 0000 0000 0011 1101 0000 0000 0000 0000 0000 0000 0011 1101 0000 0000 1010 1111 Dr. Prasenjit Dey 32
  • 33.
    Branch Addressing  Conditionalbranch instructions has  Opcode, two registers, address field  PC-relative addressing  Target address = PC + offset × 4 #1 word = 4 byte  Here, PC already incremented by 4 op rs rt constant or address 6 bits 5 bits 5 bits 16 bits Dr. Prasenjit Dey 33
  • 34.
    J-format instructions  Unconditionalbranch instructions follow J-format instruction format  j and jal  Can jump into any location within a text segment  Size of a text segment = 2^28 = 256MB (26bit address field + 2)  Pseudo-direct jump addressing  To update PC  concatenate top 4 bits of old PC with 26-bit jump address  Target address = PC31…28 : (address × 4) op address 6 bits 26 bits Dr. Prasenjit Dey 34
  • 35.
    Branching Far Away If branch target is too far to encode with 16-bit offset, assembler rewrites the code  For example beq $s0, $s1, L1 ↓ bne $s0, $s1, L2 j L1 L2: Dr. Prasenjit Dey 35
  • 36.
    Addressing Mode Summary oprs rt immediate Immediate Addressing op rs rt rd Shamt funct Register Addressing op rs rt Address Base Addressing op rs rt Address PC-relative Addressing op Address Pseudo Direct Addressing Register PC PC Register Memory Memory Memory + + + Dr. Prasenjit Dey 36
  • 37.
    Code Optimization usingPointers  Use pointers instead of array for code optimization  Array involves indexing  To get the effective address  Compute offset by multiplying index by element size  Add offset with the base address of array  Pointers have the effective address automatically  Don’t need to compute offset from index  Don’t need to computer effective address from offset Dr. Prasenjit Dey 37
  • 38.
    Demonstration: C Code ArrayTest(inta[], int size) { int i; for (i = 0; i < size; i += 1) a[i] = 0; } Save i, a, size in $t0, $a0, $a1 add $t0, $zero, $zero # i = 0 L1: sll $t1, $t0, 2 # $t1 = i * 4 add $t2, $a0, $t1 # $t2 = &a[i] sw $zero, 0($t2) # a[i] = 0 addi $t0, $t0, 1 # i = i + 1 slt $t3, $t0, $a1 # $t3 =1 if(i < size) bne $t3, $zero, L1 # if $t3=1, goto L1 PointerTest(int *a, int size) { int *p; for (p = &a[0]; p < &a[size]; p = p + 1) *p = 0; } Save p in $t0 add $t0, $a0, $zero # p = & a[0] sll $t1, $a1, 2 # $t1 = size * 4 add $t2, $a0, $t1 # $t2 =&a[size] L2: sw $zero, 0($t2) # a[p] = 0 addi $t0, $t0, 4 # p = p + 4 slt $t3, $t0, $t2 #$t3 =(p<&a[size]) bne $t3, $zero, L2 # if $t3=1, goto L2 Dr. Prasenjit Dey 38 6 instructions inside loop 4 instructions inside loop
  • 39.
    Constraints  Complex instructionsreduce size of instruction set  Provide high performance  However, complex instructions are hard to implement  Creates pipeline hazards, hamper overall performance  Compiler optimization can be used to run simple instructions faster  Assembly codes can be used to produce high performance  Handling Pointers with precautions  Dangling pointers, memory dump  Usually, ISAs are backward compatible  Add up new instructions in the ISA but doesn't remove old instructions Dr. Prasenjit Dey 39
  • 40.
    Summary  Design principles 1.Simplicity favors regularity 2. Smaller is faster 3. Make the common case fast 4. Good design demands good compromises Dr. Prasenjit Dey 40