Velammal Engineering College
Department of Computer Science
and Engineering
Welcome…
Slide Sources: Patterson & Hennessy COD book
website (copyright Morgan Kaufmann) adapted
and supplemented
Mr. A. Arockia Abins &
Ms. R. Amirthavalli,
Asst. Prof,
CSE,
Velammal Engineering College
Course Objectives
• This course aims to teach the basic structure and operations of
a computer.
• The course covers the ALU, pipelined execution,
parallelism and multi-core processors.
• The course will enable students to understand memory
hierarchies, cache memories and virtual memories.
Course Outcomes
CO 1 Discuss the basic structure of computers, operations and
instructions.
CO 2 Design arithmetic and logic unit.
CO 3 Analyze pipelined execution and design control unit.
CO 4 Analyze parallel processing architectures.
CO 5 Examine the performance of various memory systems
CO 6 Organize the various I/O communications.
Syllabus
Unit Titles:
• Unit I Basic Structure of a Computer System
• Unit II Arithmetic for Computers
• Unit III Processor and Control Unit
• Unit IV Parallelism
• Unit V Memory & I/O Systems
Syllabus – Unit I
UNIT-I BASIC STRUCTURE OF A COMPUTER
SYSTEM
Functional Units – Basic operational concepts – Instructions:
Operations, Operands – Instruction representation – Instruction
Types – MIPS addressing, Performance
Syllabus – Unit II
UNIT-II ARITHMETIC FOR COMPUTERS
Addition and Subtraction – Multiplication – Division – Floating
Point Representation – Floating Point Addition and Subtraction.
Syllabus – Unit III
UNIT-III PROCESSOR AND CONTROL UNIT
A Basic MIPS implementation – Building a Datapath – Control
Implementation Scheme – Pipelining – Pipelined datapath and
control – Handling Data Hazards & Control Hazards.
Syllabus – Unit IV
UNIT-IV PARALLELISM
Introduction to Multicore processors and other shared memory
multiprocessors – Flynn’s classification: SISD, MIMD, SIMD,
SPMD and Vector – Hardware multithreading – GPU
architecture.
Syllabus – Unit V
UNIT-V MEMORY & I/O SYSTEMS
Memory Hierarchy – Memory technologies – Cache Memory –
Performance Considerations – Virtual Memory, TLBs – Accessing
I/O devices – Interrupts – Direct Memory Access – Bus Structure
– Bus operation.
Text Books
• Book 1:
o Name: Computer Organization and Design: The
Hardware/Software Interface
o Authors: David A. Patterson and John L. Hennessy
o Publisher: Morgan Kaufmann / Elsevier
o Edition: Fifth Edition, 2014
• Book 2:
o Name: Computer Organization and Embedded Systems
o Authors: Carl Hamacher, Zvonko Vranesic, Safwat Zaky and
Naraig Manjikian
o Publisher: Tata McGraw Hill
o Edition: Sixth Edition, 2012
Introduction
• What is meant by Computer Architecture?
Hardware parts
Instruction set
Interface between hardware &
software
Introduction
ISA: a+b -> add a,b ->000100110101010
Instruction Set Architecture
(ISA)
ISA: The interface or contract between the hardware and
the software
Rules about how to code and interpret machine
instructions:
Execution model (program counter)
Operations (instructions)
Data formats (sizes, addressing modes)
Processor state (registers)
Input and Output (memory, etc.)
Introduction
• What is meant by Computer
Architecture?
Computer architecture encompasses
the specification of an instruction set
and the functional behavior of the
hardware units that implement the
instructions.
Introduction
Technology Evolution
UNIT-I
BASIC STRUCTURE OF A
COMPUTER SYSTEM
Topics:
• Functional Units
• Basic operational concepts
• Instructions: Operations, Operands
• Instruction representation
• Instruction Types
• MIPS addressing mode
• Performance
Functional Units
Also called
as Datapath
Functional Units
Functional Units
• Input unit
• Output unit
• Memory unit
• Arithmetic Logic unit
• Control unit
Functional Units
Arithmetic & Logic unit and Control unit
Basic Operational Concepts
Unit I
Connection between the processor and the main
memory
Code Snippet:
Load R2, LOC
Add R4, R3, R2
Store LOC, R4
IR & PC
• Instruction Register:
The instruction register (IR) holds the
instruction that is currently being executed.
• Program Counter:
The program counter (PC) contains the
memory address of the next instruction to be
fetched and executed.
Memory Locations and Addresses
Examples of encoded information in a
32-bit word.
Instructions
Steps in program
translation
Translations
Machine vs Assembly
Language
Machine Language
• A particular set of instructions that the CPU can directly
execute, written as ones and zeros
• Ex: 0100001010101
Assembly Language
• A symbolic version of the equivalent machine language
• Ex: add a,b
Instructions
• Instruction Set:
o The vocabulary of commands understood by a
given architecture.
• Some ISA:
o ARM
o Intel x86
o IBM Power
o MIPS
o SPARC
• Different CPUs implement different sets of
instructions.
MIPS
MIPS - Microprocessor without Interlocked Pipeline Stages
Features:
• five-stage execution pipeline: fetch, decode, execute,
memory-access, write-result
• regular instruction set, all instructions are 32-bit
• three-operand arithmetical and logical instructions
• 32 general-purpose registers of 32-bits each
• only the load and store instruction access memory
• flat address space of 4 GBytes of main memory (2^32
bytes)
MIPS Assembly Language
• Categories:
o Arithmetic – Only processor and registers
involved (sum of two registers)
o Data transfer – Interacts with memory
(load and store)
o Logical – Only processor and registers
involved (and, sll)
o Conditional branch – Changes the flow of
execution (branch instructions)
o Unconditional jump – Changes the flow of
execution (jump to a subroutine)
MIPS Registers
Arithmetic
Data Transfer
Load & Store Instructions
• Load:
o Transfers data from memory to a register
• Store:
o Transfers data from a register to memory
• Both load and store must specify the memory address
[Figure: LOAD moves data from Memory to the Processor; STORE moves data from the Processor to Memory]
Logical
Conditional
Unconditional Jump
MIPS Arithmetic
• All MIPS arithmetic instructions have 3 operands
• Operand order is fixed (e.g., destination first)
• Example:
C code: A = B + C
MIPS code: add $s0, $s1, $s2
compiler’s job to associate
variables with registers
MIPS Arithmetic
• Design Principle 1: simplicity favors regularity.
Translation: Regular instructions make for simple hardware!
• Simpler hardware reduces design time and manufacturing cost.
• Of course this complicates some things...
C code: A = B + C + D;
        E = F - A;
MIPS code (arithmetic): add $t0, $s1, $s2
                        add $s0, $t0, $s3
                        sub $s4, $s5, $s0
• Performance penalty: high-level code translates to denser machine
code.
• Allowing a variable number of operands would simplify the assembly
code but complicate the hardware.
MIPS Arithmetic
Variable: a   b   c   f   g   h   i   j
Register: $s0 $s1 $s2 $s3 $s4 $s5 $s6 $s7
C code: a = b - c;
        f = (g + h) - (i + j);
MIPS code: sub $s0, $s1, $s2
           add $t0, $s4, $s5
           add $t1, $s6, $s7
           sub $s3, $t0, $t1
Try:
1. f = g + (h - 5)
2. f = (i + j) - (k - 20)
Registers vs. Memory
• Arithmetic instruction operands must be in registers
o MIPS has 32 registers
• Compiler associates variables with registers
• What about programs with lots of variables (arrays, etc.)? Keep them in
memory and use load/store operations to transfer data between memory and
registers. If there are not enough registers, spill registers to memory.
• MIPS is a load/store architecture
[Figure: Processor (Control, Datapath), Memory, and I/O (Input, Output)]
Memory Organization
• Viewed as a large single-dimension array with access by
address
• A memory address is an index into the memory array
• Byte addressing means that the index points to a byte of
memory, and that the unit of memory accessed by a load/store
is a byte
[Figure: memory as an array of bytes; addresses 0, 1, 2, 3, 4, 5, 6, ... each hold 8 bits of data]
Memory Organization
• Bytes are load/store units, but most data items use larger words
• For MIPS, a word is 32 bits or 4 bytes.
• 2^32 bytes with byte addresses from 0 to 2^32 - 1
• 2^30 words with byte addresses 0, 4, 8, ... 2^32 - 4
o i.e., words are aligned
o what are the 2 least significant bits of a word address?
[Figure: memory as an array of words; byte addresses 0, 4, 8, 12, ... each hold 32 bits of data]
Registers correspondingly hold 32 bits of data
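The alignment rule can be checked with a few lines of Python. This is only a sketch: the helper names `word_address` and `is_word_aligned` are made up for illustration and are not part of MIPS or of any tool.

```python
WORD_BYTES = 4  # a MIPS word is 32 bits = 4 bytes


def word_address(index: int) -> int:
    """Byte address of the index-th word, counting from address 0."""
    return index * WORD_BYTES


def is_word_aligned(byte_address: int) -> bool:
    """An address is word aligned when its two lowest bits are both 0."""
    return byte_address & 0b11 == 0


# Print the first few word addresses in decimal and in binary.
for i in range(4):
    addr = word_address(i)
    print(f"word {i}: byte address {addr:2d} = {addr:06b}")
```

Every word address printed is a multiple of 4, so its binary form ends in the same two bits.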
The Endian Question
MIPS can also load and store 4-byte words and 2-byte halfwords.
The endian question: when you read a word, in what order do the bytes
appear?
• Little Endian: Intel, DEC, et al.
• Big Endian: Motorola, IBM, Sun, et al.
• MIPS can do either
• SPIM adopts its host’s convention
Big Endian (bit 31 ... bit 0): byte 0 | byte 1 | byte 2 | byte 3
Little Endian (bit 31 ... bit 0): byte 3 | byte 2 | byte 1 | byte 0
The Endian Question
x = 0x01234567
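A small Python sketch of how the four bytes of this value would be laid out at increasing memory addresses under each convention. It uses only the built-in `int.to_bytes`; the `show` helper is a made-up name for printing.

```python
x = 0x01234567

# Byte sequences in order of increasing memory address.
big = x.to_bytes(4, byteorder="big")        # most significant byte first
little = x.to_bytes(4, byteorder="little")  # least significant byte first


def show(label: str, data: bytes) -> None:
    """Print each byte next to its offset from the word's base address."""
    cells = "  ".join(f"addr+{i}: {b:02x}" for i, b in enumerate(data))
    print(f"{label:<14}{cells}")


show("Big endian", big)
show("Little endian", little)
```

Reading the bytes back with the matching convention recovers the same value of x; only the storage order differs.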
Load/Store Instructions
• Load and store instructions
• Example:
C code: A[8] = h + A[8];
MIPS code: lw $t0, 32($s3) # load
           add $t0, $s2, $t0 # arithmetic
           sw $t0, 32($s3) # store
• Load word has destination first, store has destination last
• Remember MIPS arithmetic operands are registers, not memory
locations
o therefore, words must first be moved from memory to registers using
loads before they can be operated on; then result can be stored back to
memory
• In lw $t0, 32($s3): 32 is the offset, $s3 holds the base address, and $t0 receives the value
So far we’ve learned:
• MIPS
o loading words but addressing bytes
o arithmetic on registers only
• Instruction          Meaning
add $s1, $s2, $s3      $s1 = $s2 + $s3
sub $s1, $s2, $s3      $s1 = $s2 - $s3
lw $s1, 100($s2)       $s1 = Memory[$s2+100]
sw $s1, 100($s2)       Memory[$s2+100] = $s1
• Try: Find the assembly code of B[8] = A[i] + A[j];
The base addresses of A and B are in $s6 and $s7 respectively
$s0-$s4 hold the values f-j
Exercise
Q: For the following C statement, what is the corresponding
MIPS assembly code? Assume that the variables f, g, h,
and i are given and could be considered 32-bit integers as
declared in a C program. Use a minimal number of MIPS
assembly instructions. f = g + (h − 5);
Solution:
f -> $s1, g -> $s2, h -> $s3
addi $t0, $s3,-5
add $s1, $s2, $t0
Representing Instructions
in the Computer
• Instruction format:
o A form of representation of an instruction
composed of fields of binary numbers.
• All MIPS instructions are 32 bits long.
• Three types of instruction formats:
o R-type (for register) or R-format
o I-type (for immediate) or I-format
o J-type (for jump) or J-format
R-type (for register)
• MIPS fields:
• op: Basic operation of the instruction (opcode)
• rs: The first register source operand
• rt: The second register source operand
• rd: The register destination operand
• shamt: Shift amount
• funct: Function. It selects the specific variant of the
operation in the op field. (function code)
Ex: add $t0, $s1, $s2
I-type (for immediate)
• MIPS fields:
• op: Basic operation of the instruction (opcode)
• rs: The register source operand
• rt: destination register, which receives the result of the
load
• constant or address: It contains a 16-bit constant or
address value.
I-type (for immediate)
• MIPS fields:
Ex: addi $t1, $s0, 10
lw $t0, 40($s4)
bne $s5,$s6, 100
J-type (for jump)
• MIPS fields:
• op: Basic operation of the instruction (opcode)
• address: It contains a 26-bit address value.
• Ex:
j 10000
Instruction formats for
MIPS architecture
MIPS instruction
encoding
MIPS Registers
Mapping register names
to register numbers
t0 t1 t2 t3 t4 t5 t6 t7
8 9 10 11 12 13 14 15
s0 s1 s2 s3 s4 s5 s6 s7
16 17 18 19 20 21 22 23
Translating a MIPS Assembly
Instruction into a Machine Instruction
Given instruction: add $t0,$s1,$s2
• Solution:
• Identify the instruction format: R-type
• Format: Operation rd, rs, rt
• rs -> $s1 (17), rt -> $s2 (18), rd -> $t0 (8), shamt -> not used (0)
• op -> 0, funct -> 32
• Decimal representation:
op rs rt rd shamt funct
0 17 18 8 0 32
• Binary representation:
op rs rt rd shamt funct
000000 10001 10010 01000 00000 100000
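The field packing can be reproduced with a short Python sketch. The function names `encode_r` and `fields` are illustrative only; the register numbers come from the mapping table above, and the funct values 32 (add) and 34 (sub) are the ones used in the worked examples of this section.

```python
# Register numbers: $t0-$t7 are 8-15 and $s0-$s7 are 16-23.
REG = {f"$t{i}": 8 + i for i in range(8)}
REG.update({f"$s{i}": 16 + i for i in range(8)})

# funct codes for the two R-type instructions used in this section.
FUNCT = {"add": 32, "sub": 34}


def encode_r(name: str, rd: str, rs: str, rt: str, shamt: int = 0) -> int:
    """Pack op | rs | rt | rd | shamt | funct into a 32-bit word (op = 0)."""
    op = 0
    return (
        (op << 26)
        | (REG[rs] << 21)
        | (REG[rt] << 16)
        | (REG[rd] << 11)
        | (shamt << 6)
        | FUNCT[name]
    )


def fields(word: int) -> str:
    """Show a 32-bit word split into the six R-type bit fields."""
    b = f"{word:032b}"
    return " ".join([b[0:6], b[6:11], b[11:16], b[16:21], b[21:26], b[26:32]])


# Operands are passed in assembly order: destination first.
print(fields(encode_r("add", "$t0", "$s1", "$s2")))
```

Note that the assembly order (rd, rs, rt) differs from the order of the fields in the machine word (rs, rt, rd), which is why the function reorders its arguments.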
Exercise
Q: Translate the following MIPS Assembly code
into binary code.
sub $t3,$s4,$s5
op rs rt rd Shamt Funct
0 20 21 11 0 34
000000 10100 10101 01011 00000 100010
Translating a MIPS Assembly
Instruction into a Machine Instruction
Given instruction: lw $t0,32($s3)
• Solution:
• Identify the instruction format: I-type
• Format: Operation rt, address(rs)
• rs -> $s3 (19), rt -> $t0 (8), immediate -> 32
• Decimal representation:
op rs rt address
35 19 8 32
• Binary representation:
op rs rt address
100011 10011 01000 0000 0000 0010 0000
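The same translation can be sketched in Python for I-type loads and stores. The names `encode_i` and `fields` are illustrative; the opcode 35 for lw is taken from the table above, and 43 for sw corresponds to the bit pattern 101011.

```python
# Register numbers: $t0-$t7 are 8-15 and $s0-$s7 are 16-23.
REG = {f"$t{i}": 8 + i for i in range(8)}
REG.update({f"$s{i}": 16 + i for i in range(8)})

OPCODE = {"lw": 35, "sw": 43}


def encode_i(name: str, rt: str, offset: int, rs: str) -> int:
    """Pack op | rs | rt | 16-bit address for an instruction like lw rt, offset(rs)."""
    imm = offset & 0xFFFF  # keep the low 16 bits (two's complement if negative)
    return (OPCODE[name] << 26) | (REG[rs] << 21) | (REG[rt] << 16) | imm


def fields(word: int) -> str:
    """Show a 32-bit word split into the four I-type bit fields."""
    b = f"{word:032b}"
    return " ".join([b[0:6], b[6:11], b[11:16], b[16:32]])


print(fields(encode_i("lw", "$t0", 32, "$s3")))
```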
Exercise
Q: Translate the following MIPS Assembly code
into binary code.
sw $t2,58($s5)
101011 10101 01010 0000 0000 0011 1010
Translating High level Language
into Machine Language
Q: Consider the following high level statement
A[300] = h + A[300];
If $t1 has the base of the array A and $s2 corresponds to
h, What is the MIPS machine language code?
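One possible answer follows the lw / add / sw pattern used for A[8] earlier: element 300 sits at byte offset 300 × 4 = 1200 from the base in $t1. The sketch below assumes $t0 as the temporary register (the question does not name one) and uses made-up helper names `i_type` and `r_type` to print the three machine words.

```python
# Register numbers: $t0-$t7 are 8-15 and $s0-$s7 are 16-23.
REG = {f"$t{i}": 8 + i for i in range(8)}
REG.update({f"$s{i}": 16 + i for i in range(8)})


def i_type(op: int, rt: str, offset: int, rs: str) -> int:
    """op | rs | rt | 16-bit address."""
    return (op << 26) | (REG[rs] << 21) | (REG[rt] << 16) | (offset & 0xFFFF)


def r_type(funct: int, rd: str, rs: str, rt: str) -> int:
    """op = 0 | rs | rt | rd | shamt = 0 | funct."""
    return (REG[rs] << 21) | (REG[rt] << 16) | (REG[rd] << 11) | funct


OFFSET = 300 * 4  # byte offset of A[300]; each word is 4 bytes

program = [
    ("lw  $t0,1200($t1)", i_type(35, "$t0", OFFSET, "$t1")),  # $t0 = A[300]
    ("add $t0,$s2,$t0", r_type(32, "$t0", "$s2", "$t0")),     # $t0 = h + A[300]
    ("sw  $t0,1200($t1)", i_type(43, "$t0", OFFSET, "$t1")),  # A[300] = $t0
]

for text, word in program:
    print(f"{text:<20}{word:032b}")
```

The lw and sw words differ only in their opcode field, since both use the same base register, temporary register and offset.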
Logical Operations
Shift operations
• Shifts allow bits to be moved around inside a register.
• Shift left logical
Example: sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits
Machine Code:
op rs rt rd shamt funct
000000 00000 10000 01010 00100 000000
Shift Left Logical(sll)
• Example: sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits
• If $s0=10
• Value of $t2=???
Shift operations
• Shift right logical
Example: srl $t5,$s3,2 # reg $t5 = reg $s3 >> 2 bits
Machine Code:
op rs rt rd shamt funct
000000 00000 10011 01101 00010 000010
op rs rt rd shamt funct
0 0 19 13 2 2
Shift Right Logical(srl)
Example: srl $t5,$s3,2 # reg $t5 = reg $s3 >> 2 bits
• If $s3=12
• Value of $t5=???
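Both shift questions can be checked with a minimal Python model of a 32-bit register. The function names `sll` and `srl` simply mirror the MIPS mnemonics; the mask is needed because Python integers are not limited to 32 bits.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register


def sll(value: int, shamt: int) -> int:
    """Shift left logical: bits shifted past bit 31 are discarded."""
    return (value << shamt) & MASK32


def srl(value: int, shamt: int) -> int:
    """Shift right logical: zeros are shifted in from the left."""
    return (value & MASK32) >> shamt


print(sll(10, 4))  # models sll $t2,$s0,4 with $s0 = 10
print(srl(12, 2))  # models srl $t5,$s3,2 with $s3 = 12
```

Shifting left by n multiplies by 2^n and shifting right by n divides by 2^n (discarding the remainder), so the two calls print 160 and 3.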
Logical Operations –
AND, OR & NOT
• A logical bit-by-bit operation with two operands.
• EX:
and $t0,$t1,$t2 # reg $t0 = reg $t1 & reg $t2
or $t0,$t1,$t2 # reg $t0 = reg $t1 | reg $t2
nor $t0,$t1,$t3 # reg $t0 = ~ (reg $t1 | reg $t3)
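A minimal Python sketch of the three operations on 32-bit values. The operand values are chosen only for illustration, and the function names are made up. NOT has no instruction of its own here; it can be obtained as a nor with an operand of zero.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register


def mips_and(a: int, b: int) -> int:
    return a & b & MASK32


def mips_or(a: int, b: int) -> int:
    return (a | b) & MASK32


def mips_nor(a: int, b: int) -> int:
    """NOT (a OR b), truncated to 32 bits."""
    return ~(a | b) & MASK32


t1 = 0x00003C00  # illustrative register contents
t2 = 0x00000DC0

print(f"and: {mips_and(t1, t2):08x}")
print(f"or : {mips_or(t1, t2):08x}")
print(f"not: {mips_nor(t1, 0):08x}")  # nor with zero inverts every bit of t1
```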
Example
Instructions for Making
Decisions
• Sequences that allow programs to execute statements in order
one after another.
• Branches that allow programs to jump to other points in a
program.
• Loops that allow a program to execute a fragment of code
multiple times.
• MIPS Instructions:
beq register1, register2, L1
bne register1, register2, L1
• beq and bne are mnemonics
• Conditional branches
Instructions for Making
Decisions
Q: In the following code segment, f, g, h, i, and j are
variables. If the five variables f through j correspond to the
five registers $s0 through $s4, what is the compiled MIPS
code for this C if statement?
if (i == j) f = g + h; else f = g - h;
• Solution:
High level code:
if (i == j)
f = g + h;
else
f = g - h;
MIPS code:
bne $s3,$s4,Else # go to Else if i ≠ j
add $s0,$s1,$s2 # f = g + h (skipped if i ≠ j)
j Exit # go to Exit
Else: sub $s0,$s1,$s2 # f = g - h (skipped if i = j)
Exit:
Compiling a while Loop
in C
while (save[i] == k)
i += 1;
Assume that i and k correspond to registers $s3 and $s5
and the base of the array save is in $s6. What is the MIPS
assembly code corresponding to this C segment?
Steps:
1. Load save[i] into a temporary register
o first add i (scaled to a byte offset) to the base of array save to form the address
2. Perform the loop test
o go to Exit if save[i] ≠ k
3. Add 1 to i
4. Go back to the while test at the top of the loop
5. Exit
Solution:
Loop: sll $t1,$s3,2 # Temp reg $t1 = i * 4
add $t1,$t1,$s6 # $t1 = address of save[i]
lw $t0,0($t1) # Temp reg $t0 = save[i]
bne $t0,$s5, Exit # go to Exit if save[i] ≠ k
addi $s3,$s3,1 # i = i + 1
j Loop # go to Loop
Exit:
MIPS Addressing Mode
• The different ways for specifying the locations
of instruction operands are known as
addressing mode.
• The MIPS addressing modes are the following:
1. Immediate addressing mode
2. Register addressing mode
3. Base or displacement addressing mode
4. PC-relative addressing mode
5. Pseudodirect addressing mode
Immediate addressing mode
• Def:
o the operand is a constant within the instruction itself
• Ex:
o addi $s1, $s2, 20 #$s1=$s2+20
• Illustration:
Register addressing mode
• Def:
o the source and destination operands are all held in processor registers.
o Direct addressing mode
• Ex:
o add $s1, $s2, $s3 #$s1=$s2+$s3
• Illustration:
Base or displacement
addressing mode
• Def:
o the operand is at the memory location whose address is the
sum of a register and a constant in the instruction
o Indirect addressing mode
• Ex:
o lw $s1, 20 ($s3) #$s1= Memory[$s3+20]
• Illustration:
PC-relative addressing mode
• Def:
o the branch address is the sum of the PC and a constant in
the instruction
• Ex:
o bne $s4, $s5, 25 # if ($s4 != $s5), go to
PC = 12 + 4 + 100, for a bne at address 12 (offset 25 words = 100 bytes)
• Illustration:
Pseudodirect addressing
mode
• Def:
o the jump address is the 26 bits of the instruction
concatenated with the upper bits of the PC
• Ex:
o j 1000
• Illustration:
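The two address calculations used by branches and jumps can be sketched in Python. The function names are made up for illustration, and `address26` stands for the raw 26-bit field stored in the instruction. In MIPS the "upper bits of the PC" are the top four bits of the address of the following instruction (PC + 4).

```python
def branch_target(pc: int, word_offset: int) -> int:
    """PC-relative: (PC + 4) plus the offset converted from words to bytes."""
    return (pc + 4 + (word_offset << 2)) & 0xFFFFFFFF


def jump_target(pc: int, address26: int) -> int:
    """Pseudodirect: top 4 bits of PC + 4, then the 26-bit field, then two 0 bits."""
    return ((pc + 4) & 0xF0000000) | ((address26 & 0x03FFFFFF) << 2)


# A bne at address 12 with a word offset of 25: 12 + 4 + 100.
print(branch_target(12, 25))

# A jump with field value 1000, executed from an address in the 0x1... region.
print(hex(jump_target(0x10000000, 1000)))
```

Both fields count words rather than bytes, which is why each is shifted left by two bits before use.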
Decoding Machine Code
• Q: What is the assembly language statement
corresponding to this machine instruction?
00af8020hex
Solution:
converting hexadecimal to binary
Binary instruction format
Assembly instruction
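The three steps can be sketched in Python: view the word in binary, split it into the R-type fields, then map the numbers back to names. The sketch handles only R-type add and sub. The register names outside $t0-$t7 and $s0-$s7 follow the standard MIPS numbering, which is not listed in the text above, and `decode_r` is a made-up name.

```python
# Standard MIPS register names indexed by register number 0-31.
REG_NAMES = (
    ["$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3"]
    + [f"$t{i}" for i in range(8)]   # 8-15
    + [f"$s{i}" for i in range(8)]   # 16-23
    + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"]
)

R_FUNCT = {32: "add", 34: "sub"}


def decode_r(word: int) -> str:
    """Decode an R-type add/sub machine word into assembly text."""
    op = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    funct = word & 0x3F
    if op != 0 or funct not in R_FUNCT:
        raise ValueError("only R-type add/sub are handled in this sketch")
    return f"{R_FUNCT[funct]} {REG_NAMES[rd]},{REG_NAMES[rs]},{REG_NAMES[rt]}"


print(f"{0x00AF8020:032b}")   # step 1: hexadecimal to binary
print(decode_r(0x00AF8020))   # steps 2 and 3: fields, then assembly
```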
Translating Machine Language
to Assembly Language
• Translate the following machine language code into
assembly language.
0x02F34022
Performance
• Performance is the key to understanding underlying motivation for
the hardware and its organization
• Measure, report, and summarize performance to enable users to
o make intelligent choices
o see through the marketing hype!
• Why is some hardware better than others for different programs?
• What factors of system performance are hardware related?
(e.g., do we need a new machine, or a new operating system?)
• How does the machine's instruction set affect performance?
Computer Performance:
TIME, TIME, TIME!!!
• Response Time (elapsed time, latency):
o how long does it take for my job to run?
o how long does it take to execute (start to
finish) my job?
o how long must I wait for the database query?
• Throughput:
o how many jobs can the machine run at once?
o what is the average execution rate?
o how much work is getting done?
• If we upgrade a machine with a new processor what do we increase?
• If we add a new machine to the lab what do we increase?
• Response time is the individual user's concern; throughput is the systems manager's concern.
Execution Time
• Elapsed Time
o counts everything (disk and memory accesses, waiting for I/O, running
other programs, etc.) from start to finish
o a useful number, but often not good for comparison purposes
elapsed time = CPU time + wait time (I/O, other programs, etc.)
• CPU time
o doesn't count waiting for I/O or time spent running other programs
o can be divided into user CPU time and system CPU time (OS calls)
CPU time = user CPU time + system CPU time
⇒ elapsed time = user CPU time + system CPU time + wait time
• Our focus: user CPU time (CPU execution time or, simply, execution
time)
o time spent executing the lines of code that are in our program
Definition of Performance
• For some program running on machine X:
Performance_X = 1 / Execution time_X
• For two machines X and Y, if the performance of X is greater than the performance of Y:
Performance_X > Performance_Y
i.e., 1 / Execution time_X > 1 / Execution time_Y
• X is n times faster than Y means:
Performance_X / Performance_Y = Execution time_Y / Execution time_X = n
Q: If computer A runs a program in 10 sec
and computer B runs the same program in
15 sec, how much faster is A than B?
• We know that,
Performance_A / Performance_B
= Execution time_B / Execution time_A = n
Thus the performance ratio is,
Execution time_B / Execution time_A = 15 / 10 = 1.5
i.e., Performance_A / Performance_B = 1.5
Therefore, A is 1.5 times faster than B
Clock Cycles
• Instead of reporting execution time in seconds, we often use cycles.
In modern computers hardware events progress cycle by cycle: in
other words, each event, e.g., multiplication, addition, etc., is a
sequence of cycles
• Clock ticks indicate start and end of cycles:
• cycle time = time between ticks = seconds per cycle
• clock rate (frequency) = clock cycles per second (1 Hz = 1
cycle/sec, 1 MHz = 10^6 cycles/sec)
• Example: A 200 MHz clock has a cycle time of ????
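[Figure: clock signal along a time axis; each tick marks the start of a new cycle]
\[ \frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}} \]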
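Cycle time is the reciprocal of clock rate. A minimal Python sketch of that conversion follows; it works in whole picoseconds to avoid floating-point rounding, and `cycle_time_ps` is a made-up helper name.

```python
PS_PER_SECOND = 10**12


def cycle_time_ps(clock_rate_hz: int) -> int:
    """Cycle time in picoseconds for a given clock rate in Hz.

    Integer division is used, so rates that do not divide 10^12 evenly
    are rounded down to a whole picosecond.
    """
    return PS_PER_SECOND // clock_rate_hz


print(cycle_time_ps(200 * 10**6), "ps")  # the 200 MHz clock from the example
```

For the 200 MHz clock this gives 5000 ps, i.e. 5 ns.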
Performance Equation I
• So, to improve performance one can either:
o reduce the number of cycles for a program, or
o reduce the clock cycle time, or, equivalently,
o increase the clock rate
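\[ \frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}} \]
Equivalently,
\[ \text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time} \]
Also,
\[ \text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}} \]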
Our favorite program runs in 10 seconds on computer A, which has a 2
GHz clock. We are trying to help a computer designer build a computer,
B, which will run this program in 6 seconds. The designer has determined
that a substantial increase in the clock rate is possible, but this increase
will affect the rest of the CPU design, causing computer B to require 1.2
times as many clock cycles as computer A for this program. What clock
rate should we tell the designer to target?
CPU time_A = CPU clock cycles_A / clock rate_A
10 sec = CPU clock cycles_A / (2 × 10^9 cycles/sec)
CPU clock cycles_A = 10 sec × 2 × 10^9 cycles/sec
= 20 × 10^9 cycles
CPU time_B = 1.2 × CPU clock cycles_A / clock rate_B
6 sec = 1.2 × 20 × 10^9 cycles / clock rate_B
clock rate_B = 1.2 × 20 × 10^9 cycles / 6 sec = 4 × 10^9 Hz
To run the program in 6 sec, B must have a clock rate of 4 × 10^9 Hz (4 GHz), twice that of A
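The same calculation as a short Python sketch, using exact fractions so that the factor 1.2 introduces no rounding. The variable names are chosen here for illustration.

```python
from fractions import Fraction

time_a = 10                      # seconds on computer A
rate_a = 2 * 10**9               # A's clock rate: 2 GHz
time_b = 6                       # target seconds on computer B
cycle_factor = Fraction(12, 10)  # B needs 1.2 times as many cycles as A

cycles_a = time_a * rate_a          # CPU clock cycles = CPU time x clock rate
cycles_b = cycle_factor * cycles_a
rate_b = cycles_b / time_b          # clock rate = clock cycles / CPU time

print(f"cycles on A  = {cycles_a:.3e}")
print(f"clock rate B = {float(rate_b) / 1e9} GHz")
```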
Instruction Performance
• The previous equation makes no reference to the number of instructions
• The execution time depends on the number of
instructions in the program
Clock cycles per instruction (CPI)
• Average number of clock cycles per instruction for a
program or program fragment
Suppose we have two implementations of the same instruction
set architecture. Computer A has a clock cycle time of 250 ps
and a CPI of 2.0 for some program, and computer B has a
clock cycle time of 500 ps and a CPI of 1.2 for the same
program. Which computer is faster for this program and by
how much?
• The same number of instructions is
executed on both computers
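Since both computers execute the same number of instructions, it is enough to compare the time per instruction, CPI × clock cycle time. A minimal Python sketch of that comparison, with variable names chosen for illustration:

```python
from fractions import Fraction

# Time per instruction in picoseconds = CPI x clock cycle time.
time_a = Fraction(20, 10) * 250   # computer A: CPI 2.0, 250 ps cycle
time_b = Fraction(12, 10) * 500   # computer B: CPI 1.2, 500 ps cycle

# Performance ratio = Execution time_B / Execution time_A; the shared
# instruction count cancels out of the ratio.
ratio = time_b / time_a

print(f"A: {time_a} ps per instruction")
print(f"B: {time_b} ps per instruction")
print(f"A is {float(ratio)} times faster than B")
```

Computer A comes out ahead despite its higher CPI because its clock cycle is half as long.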
Instruction Performance
\[ \text{CPU execution time for a program} = \text{Instruction count for a program} \times \text{average CPI} \times \text{Clock cycle time} \]
Or
\[ \text{CPU execution time for a program} = \frac{\text{Instruction count for a program} \times \text{average CPI}}{\text{Clock rate}} \]
Instruction Performance
Which code sequence
executes the most instructions?
• Sequence 1 executes,
2 + 1 + 2 = 5 instructions
• Sequence 2 executes,
4+ 1 + 1 = 6 instructions
Sequence 2 executes the larger number of instructions
Which will be faster?
• So code sequence 2 is faster
What is the CPI for each
sequence?
• Sequence 2 has the lower CPI, since it takes fewer clock cycles
yet executes more instructions
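The comparison can be reproduced with a short Python sketch. The instruction mixes (2, 1, 2 and 4, 1, 1) are the ones given above. The class labels A, B, C and their CPIs of 1, 2 and 3 are an assumption taken from the textbook version of this example, since the table does not survive in the text here.

```python
# Assumed CPI per instruction class (not stated in the text above).
CPI = {"A": 1, "B": 2, "C": 3}

# Instruction counts per class for each code sequence.
SEQUENCES = {
    1: {"A": 2, "B": 1, "C": 2},
    2: {"A": 4, "B": 1, "C": 1},
}


def summarize(counts: dict) -> tuple:
    """Return (instruction count, clock cycles, average CPI) for one sequence."""
    instructions = sum(counts.values())
    cycles = sum(CPI[cls] * n for cls, n in counts.items())
    return instructions, cycles, cycles / instructions


for number, counts in SEQUENCES.items():
    instructions, cycles, cpi = summarize(counts)
    print(f"sequence {number}: {instructions} instructions, {cycles} cycles, CPI {cpi}")
```

Under these assumed CPIs, sequence 2 needs 9 cycles against 10 for sequence 1, which matches the conclusions stated above.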
Basic components of
Performance
Factors affecting
Performance
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 

CE412 -advanced computer Architecture lecture 1.pdf

  • 1. Velammal Engineering College Department of Computer Science and Engineering Welcome… Slide Sources: Patterson & Hennessy COD book website (copyright Morgan Kaufmann) adapted and supplemented Mr. A. Arockia Abins & Ms. R. Amirthavalli, Asst. Prof, CSE, Velammal Engineering College
  • 2. Course Objectives • This course aims to teach the basic structure and operations of a computer. • The course covers the ALU, pipelined execution, parallelism and multi-core processors. • The course will enable students to understand memory hierarchies, cache memories and virtual memories.
  • 3. Course Outcomes CO 1 Discuss the basic structure of computers, operations and instructions. CO 2 Design an arithmetic and logic unit. CO 3 Analyze pipelined execution and design a control unit. CO 4 Analyze parallel processing architectures. CO 5 Examine the performance of various memory systems. CO 6 Organize the various I/O communications.
  • 4. Syllabus Unit Titles: • Unit I Basic Structure of a Computer System • Unit II Arithmetic for Computers • Unit III Processor and Control Unit • Unit IV Parallelism • Unit V Memory & I/O Systems
  • 5. Syllabus – Unit I UNIT-I BASIC STRUCTURE OF A COMPUTER SYSTEM Functional Units – Basic operational concepts –– Instructions: Operations, Operands – Instruction representation – Instruction Types – MIPS addressing, Performance
  • 6. Syllabus – Unit II UNIT-II ARITHMETIC FOR COMPUTERS Addition and Subtraction – Multiplication – Division – Floating Point Representation – Floating Point Addition and Subtraction.
  • 7. Syllabus – Unit III UNIT-III PROCESSOR AND CONTROL UNIT A Basic MIPS implementation – Building a Datapath – Control Implementation Scheme – Pipelining – Pipelined datapath and control – Handling Data Hazards & Control Hazards.
  • 8. Syllabus – Unit IV UNIT-IV PARALLELISM Introduction to Multicore processors and other shared memory multiprocessors – Flynn’s classification: SISD, MIMD, SIMD, SPMD and Vector – Hardware multithreading – GPU architecture.
  • 9. Syllabus – Unit V • UNIT-V MEMORY & I/O SYSTEMS Memory Hierarchy – memory technologies – Cache Memory – Performance Considerations, Virtual Memory,TLB’s – Accessing I/O devices – Interrupts – Direct Memory Access – Bus Structure – Bus operation.
  • 10. Text Books • Book 1: o Name: Computer Organization and Design: The Hardware/Software Interface o Authors: David A. Patterson and John L. Hennessy o Publisher: Morgan Kaufmann / Elsevier o Edition: Fifth Edition, 2014 • Book 2: o Name: Computer Organization and Embedded Systems Interface o Authors: Carl Hamacher, Zvonko Vranesic, Safwat Zaky and Naraig Manjikian o Publisher: Tata McGraw Hill o Edition: Sixth Edition, 2012
  • 11. Introduction • What is meant by Computer Architecture? Hardware parts; Instruction set; Interface between hardware & software
  • 12. Introduction ISA: a+b -> add a,b ->000100110101010
  • 13. Instruction Set Architecture (ISA) ISA: The interface or contract between the hardware and the software. Rules about how to code and interpret machine instructions: Execution model (program counter); Operations (instructions); Data formats (sizes, addressing modes); Processor state (registers); Input and Output (memory, etc.)
  • 14. Introduction • What is meant by Computer Architecture? Computer architecture encompasses the specification of an instruction set and the functional behavior of the hardware units that implement the instructions.
  • 17. UNIT-I BASIC STRUCTURE OF A COMPUTER SYSTEM Topics: • Functional Units • Basic operational concepts • Instructions: Operations, Operands • Instruction representation • Instruction Types • MIPS addressing mode • Performance
  • 20. Functional Units • Input unit • Output unit • Memory unit • Arithmetic Logic unit • Control unit
  • 26. Functional Units Arithmetic & Logic unit and Control unit
  • 28. Connection between the processor and the main memory Code Snippet: Load R2, LOC Add R4, R3, R2 Store LOC, R4
  • 29. IR & PC • Instruction Register: The instruction register (IR) holds the instruction that is currently being executed. • Program Counter: The program counter (PC) contains the memory address of the next instruction to be fetched and executed.
  • 30. Memory Locations and Addresses
  • 31. Examples of encoded information in a 32-bit word.
  • 35. Machine vs Assembly Language • Machine Language: a particular set of instructions that the CPU can directly execute, but these are ones and zeros. Ex: 0100001010101 • Assembly Language: a symbolic version of the equivalent machine language. Ex: add a,b
  • 37. Instructions • Instruction Set: o The vocabulary of commands understood by a given architecture. • Some ISAs: o ARM o Intel x86 o IBM Power o MIPS o SPARC • Different CPUs implement different sets of instructions.
  • 38. MIPS MIPS - Microprocessor without Interlocked Pipelined Stages Features: • five-stage execution pipeline: fetch, decode, execute, memory-access, write-result • regular instruction set, all instructions are 32-bit • three-operand arithmetical and logical instructions • 32 general-purpose registers of 32 bits each • only the load and store instructions access memory • flat address space of 4 GBytes of main memory (2^32 bytes)
  • 39. MIPS Assembly Language • Categories: o Arithmetic – Only processor and registers involved (sum of two registers) o Data transfer – Interacts with memory (load and store) o Logical – Only processor and registers involved (and, sll) o Conditional branch – Change flow of execution (branch instructions) o Unconditional jump – Change flow of execution (jump to a subroutine)
  • 43. Load & Store Instructions • Load: o Transfer data from memory to a register • Store: o Transfer data from a register to memory • Memory address must be specified by load and store • (Figure: LOAD moves data from Memory to Processor; STORE moves data from Processor to Memory)
  • 48. MIPS Arithmetic • All MIPS arithmetic instructions have 3 operands • Operand order is fixed (e.g., destination first) • Example: C code: A = B + C MIPS code: add $s0, $s1, $s2 compiler’s job to associate variables with registers
  • 49. MIPS Arithmetic • Design Principle 1: simplicity favors regularity. Translation: Regular instructions make for simple hardware! • Simpler hardware reduces design time and manufacturing cost. • Of course this complicates some things... C code: A = B + C + D; E = F - A; MIPS code (arithmetic): add $t0, $s1, $s2; add $s0, $t0, $s3; sub $s4, $s5, $s0 • Performance penalty: high-level code translates to denser machine code. Allowing a variable number of operands would simplify the assembly code but complicate the hardware.
  • 50. MIPS Arithmetic Variable-to-register mapping: a -> $s0, b -> $s1, c -> $s2, f -> $s3, g -> $s4, h -> $s5, i -> $s6, j -> $s7 • a = b - c; MIPS code: sub $s0, $s1, $s2 • f = (g + h) - (i + j); MIPS code: add $t0, $s4, $s5; add $t1, $s6, $s7; sub $s3, $t0, $t1 • Try: 1. f = g + (h - 5) 2. f = (i + j) - (k - 20)
  • 51. Registers vs. Memory • Arithmetic instruction operands must be in registers o MIPS has 32 registers • Compiler associates variables with registers • What about programs with lots of variables (arrays, etc.)? Use memory, with load/store operations to transfer data between memory and registers; if there are not enough registers, spill registers to memory • MIPS is a load/store architecture (Figure labels: Processor, Control, Datapath, Memory, I/O, Input, Output)
  • 52. Memory Organization • Viewed as a large single-dimension array with access by address • A memory address is an index into the memory array • Byte addressing means that the index points to a byte of memory, and that the unit of memory accessed by a load/store is a byte (Figure: byte addresses 0, 1, 2, 3, 4, 5, 6, ..., each holding 8 bits of data)
  • 53. Memory Organization • Bytes are load/store units, but most data items use larger words • For MIPS, a word is 32 bits or 4 bytes. • 2^32 bytes with byte addresses from 0 to 2^32 - 1 • 2^30 words with byte addresses 0, 4, 8, ... 2^32 - 4 o i.e., words are aligned o what are the 2 least significant bits of a word address? (Figure: word addresses 0, 4, 8, 12, ..., each holding 32 bits of data) Registers correspondingly hold 32 bits of data
  • 54. The Endian Question MIPS can also load and store 4-byte words and 2-byte halfwords. The endian question: when you read a word, in what order do the bytes appear? Little Endian: Intel, DEC, et al. Big Endian: Motorola, IBM, Sun, et al. MIPS can do either; SPIM adopts its host’s convention. Big Endian (bit 31 down to bit 0): byte 0, byte 1, byte 2, byte 3. Little Endian (bit 31 down to bit 0): byte 3, byte 2, byte 1, byte 0.
  • 55. The Endian Question x = 0x01234567
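The byte orders for x = 0x01234567 can be checked with a short Python sketch. It uses the standard `struct` module; the helper name `byte_list` is only illustrative and is not part of the slides.

```python
import struct

x = 0x01234567

# ">I" packs an unsigned 32-bit word most significant byte first (big endian);
# "<I" packs it least significant byte first (little endian).
big = struct.pack(">I", x)
little = struct.pack("<I", x)

def byte_list(raw):
    """Return the bytes as two-digit hex strings, lowest address first."""
    return [f"{b:02x}" for b in raw]

print("big endian   :", byte_list(big))     # lowest address holds 01
print("little endian:", byte_list(little))  # lowest address holds 67
```

The same four bytes are stored in both cases; only the order in which they appear at increasing addresses differs.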
  • 56. Load/Store Instructions • Load and store instructions • Example: C code: A[8] = h + A[8]; MIPS code (load): lw $t0, 32($s3) (arithmetic): add $t0, $s2, $t0 (store): sw $t0, 32($s3) • Load word has destination first, store has destination last • Remember MIPS arithmetic operands are registers, not memory locations o therefore, words must first be moved from memory to registers using loads before they can be operated on; then the result can be stored back to memory (Figure labels: offset, address, value)
  • 57. So far we’ve learned: • MIPS o loading words but addressing bytes o arithmetic on registers only • Instruction: Meaning o add $s1, $s2, $s3: $s1 = $s2 + $s3 o sub $s1, $s2, $s3: $s1 = $s2 – $s3 o lw $s1, 100($s2): $s1 = Memory[$s2+100] o sw $s1, 100($s2): Memory[$s2+100] = $s1 • Try: Find the assembly code for B[8] = A[i] + A[j]; the base addresses of A and B are available in $s6 and $s7 respectively, and $s0-$s5 hold the values f-j
  • 58. Exercise Q: For the following C statement, what is the corresponding MIPS assembly code? Assume that the variables f, g, h, and i are given and could be considered 32-bit integers as declared in a C program. Use a minimal number of MIPS assembly instructions. f = g + (h − 5); Solution: f -> $s1, g -> $s2, h -> $s3 addi $t0, $s3,-5 add $s1, $s2, $t0
  • 59. Representing Instructions in the Computer • Instruction format: o A form of representation of an instruction composed of fields of binary numbers. • All MIPS instructions are 32 bit long. • Three types of instruction formats: o R-type (for register) or R-format o I-type (for immediate) or I-format o J-type (for jump) or J-format
  • 60. R-type (for register) • MIPS fields: • op: Basic operation of the instruction (opcode) • rs: The first register source operand • rt: The second register source operand • rd: The register destination operand • shamt: Shift amount • funct: Function. It selects the specific variant of the operation in the op field. (function code) Ex: add $t0, $s1, $s2
  • 61. I-type (for immediate) • MIPS fields: • op: Basic operation of the instruction (opcode) • rs: The register source operand • rt: destination register, which receives the result of the load • constant or address: It contains 16 bit constant or address value.
  • 62. I-type (for immediate) • MIPS fields: Ex: addi $t1, $s0, 10 lw $t0, 40($s4) bne $s5,$s6, 100
  • 63. J-type (for jump) • MIPS fields: • op: Basic operation of the instruction (opcode) • address: It contains 26 bit address value. • Ex: j 10000
  • 67. Mapping register names to register numbers: $t0-$t7 map to 8-15; $s0-$s7 map to 16-23
  • 68. Translating a MIPS Assembly Instruction into a Machine Instruction Given instruction: add $t0,$s1,$s2 • Solution: • Identify the instruction format: R-type • Format: Operation rd, rs, rt • rs -> $s1, rt -> $s2, rd -> $t0, shamt – NA • op -> 0, funct -> 32 • Decimal representation (op rs rt rd shamt funct): 0 17 18 8 0 32 • Binary representation (op rs rt rd shamt funct): 000000 10001 10010 01000 00000 100000
  • 69. Exercise Q: Translate the following MIPS Assembly code into binary code. sub $t3,$s4,$s5 • Decimal representation (op rs rt rd shamt funct): 0 20 21 11 0 34 • Binary representation: 000000 10100 10101 01011 00000 100010
  • 70. Exercise Q: Translate the following MIPS Assembly code into binary code. sub $t3,$s4,$s5 000000 10100 10101 01011 00000 100010
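The R-type translations above can be reproduced with a minimal Python sketch. The register numbers and funct codes come from the slides; the names `REG`, `FUNCT`, `encode_r` and `fields` are hypothetical helpers introduced only for illustration, and only `add` and `sub` are handled.

```python
# Register numbers as given on the register-mapping slide.
REG = {f"$t{i}": 8 + i for i in range(8)}
REG.update({f"$s{i}": 16 + i for i in range(8)})

# funct codes used on the slides; the op field is 0 for both instructions.
FUNCT = {"add": 32, "sub": 34}

def encode_r(name, rd, rs, rt, shamt=0):
    """Pack op | rs | rt | rd | shamt | funct into one 32-bit word."""
    op = 0
    return ((op << 26) | (REG[rs] << 21) | (REG[rt] << 16)
            | (REG[rd] << 11) | (shamt << 6) | FUNCT[name])

def fields(word):
    """Split a 32-bit word into the six R-type fields as bit strings."""
    b = f"{word:032b}"
    return " ".join([b[0:6], b[6:11], b[11:16], b[16:21], b[21:26], b[26:32]])

print(fields(encode_r("add", "$t0", "$s1", "$s2")))
print(fields(encode_r("sub", "$t3", "$s4", "$s5")))
```

Note that the assembly operand order (rd, rs, rt) differs from the field order in the machine word (rs, rt, rd), which is why `encode_r` takes the operands in assembly order and rearranges them.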
  • 71. Translating a MIPS Assembly Instruction into a Machine Instruction Given instruction: lw $t0,32($s3) • Solution: • Identify the instruction format: I-type • Format: Operation rt, address(rs) • rs -> $s3, rt -> $t0, immediate -> 32 • Decimal representation (op rs rt address): 35 19 8 32 • Binary representation (op rs rt address): 100011 10011 01000 0000 0000 0010 0000
  • 72. Exercise Q: Translate the following MIPS Assembly code into binary code. sw $t2,58($s5) 101011 10101 01010 0000 0000 0011 1010
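The two I-type translations above (lw and sw) can be checked the same way. This is a sketch under the same assumptions: opcodes 35 and 43 are taken from the slides, while `OPCODE`, `encode_i` and `fields_i` are illustrative names, not part of the source.

```python
# Register numbers as given on the register-mapping slide.
REG = {f"$t{i}": 8 + i for i in range(8)}
REG.update({f"$s{i}": 16 + i for i in range(8)})

OPCODE = {"lw": 35, "sw": 43}

def encode_i(name, rt, offset, rs):
    """Pack op | rs | rt | 16-bit constant/address into one 32-bit word."""
    imm = offset & 0xFFFF  # keep the low 16 bits (two's complement for negatives)
    return (OPCODE[name] << 26) | (REG[rs] << 21) | (REG[rt] << 16) | imm

def fields_i(word):
    """Split a 32-bit word into the four I-type fields as bit strings."""
    b = f"{word:032b}"
    return " ".join([b[0:6], b[6:11], b[11:16], b[16:32]])

print(fields_i(encode_i("lw", "$t0", 32, "$s3")))
print(fields_i(encode_i("sw", "$t2", 58, "$s5")))
```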
  • 73. Translating High level Language into Machine Language Q: Consider the following high level statement A[300] = h + A[300]; If $t1 has the base of the array A and $s2 corresponds to h, What is the MIPS machine language code?
  • 75. Shift operations • Shift allow bits to be moved around inside of a register. • Shift left logical Example: sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits Machine Code: op rs rt rd shamt funct 000000 00000 10000 01010 00100 000000
  • 76. Shift Left Logical(sll) • Example: sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits • If $s0=10 • Value of $t2=???
  • 77. Shift operations • Shift right logical Example: srl $t5,$s3,2 # reg $t5 = reg $s3 >> 2 bits Machine Code: Decimal representation (op rs rt rd shamt funct): 0 0 19 13 2 2 Binary representation (op rs rt rd shamt funct): 000000 00000 10011 01101 00010 000010
  • 78. Shift Right Logical(srl) Example: srl $t5,$s3,2 # reg $t5 = reg $s3 >> 2 bits • If $s3=12 • Value of $t5=???
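The two shift questions above can be tried out with a small Python sketch. Python integers are unbounded, so the result is masked to 32 bits to mimic a MIPS register; the function names `sll` and `srl` simply mirror the mnemonics and are otherwise an assumption of this example.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register

def sll(value, shamt):
    """Shift left logical: bits shifted past bit 31 are discarded."""
    return (value << shamt) & MASK32

def srl(value, shamt):
    """Shift right logical: zeros are shifted in from the left."""
    return (value & MASK32) >> shamt

print(sll(10, 4))  # $s0 = 10, result goes to $t2
print(srl(12, 2))  # $s3 = 12, result goes to $t5
```

Shifting left by 4 multiplies by 2^4 and shifting right by 2 divides by 2^2 (discarding the remainder), so the printed values are 160 and 3.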
  • 79. Logical Operations – AND, OR & NOT • A logical bit-by-bit operation with two operands. • EX: and $t0,$t1,$t2 # reg $t0 = reg $t1 & reg $t2 or $t0,$t1,$t2 # reg $t0 = reg $t1 | reg $t2 nor $t0,$t1,$t3 # reg $t0 = ~ (reg $t1 | reg $t3)
  • 81. Instructions for Making Decisions • Sequences that allow programs to execute statements in order one after another. •  Branches that allow programs to jump to other points in a program. •  Loops that allow a program to execute a fragment of code multiple times. • MIPS Instructions: beq register1, register2, L1 bne register1, register2, L1 • beq and bne are mnemonics • Conditional branches
  • 82. Instructions for Making Decisions Q: In the following code segment, f, g, h, i, and j are variables. If the five variables f through j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for this C if statement? if (i == j) f = g + h; else f = g - h;
  • 84. Instructions for Making Decisions High level code: if (i == j) f = g + h; else f = g - h; MIPS code: bne $s3,$s4,Else # go to Else if i ≠ j add $s0,$s1,$s2 # f = g + h (skipped if i ≠ j) j Exit # go to Exit Else: sub $s0,$s1,$s2 # f = g - h (skipped if i = j) Exit:
  • 85. Compiling a while Loop in C while (save[i] == k) i += 1; Assume that i and k correspond to registers $s3 and $s5 and the base of the array save is in $s6. What is the MIPS assembly code corresponding to this C segment?
  • 86. Compiling a while Loop in C while (save[i] == k) i += 1; 1. Load save[i] into a temporary register: add i (scaled by 4) to the base of array save to form the address 2. Perform the loop test: go to Exit if save[i] ≠ k 3. Add 1 to i 4. Go back to the while test at the top of the loop 5. Exit
  • 87. while (save[i] == k) i += 1; Assume that i and k correspond to registers $s3 and $s5 and the base of the array save is in $s6. What is the MIPS assembly code corresponding to this C segment? Solution: Loop: sll $t1,$s3,2 # Temp reg $t1 = i * 4 add $t1,$t1,$s6 # $t1 = address of save[i] lw $t0,0($t1) # Temp reg $t0 = save[i] bne $t0,$s5, Exit # go to Exit if save[i] ≠ k addi $s3,$s3,1 # i = i + 1 j Loop # go to Loop Exit:
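The compiled loop above can be traced with a rough Python simulation in which each statement mirrors one MIPS instruction. The function `run_loop`, the dictionary used as memory, and the base address 0x1000 are assumptions made for this sketch only.

```python
def run_loop(save, i, k, base=0x1000):
    """Trace the MIPS while loop; returns the final value of i ($s3)."""
    # Model memory as byte address -> 32-bit word; words are 4 bytes apart.
    memory = {base + 4 * n: v for n, v in enumerate(save)}
    s3, s5, s6 = i, k, base      # $s3 = i, $s5 = k, $s6 = base of save
    while True:                  # Loop:
        t1 = s3 << 2             #   sll  $t1,$s3,2    ($t1 = i * 4)
        t1 = t1 + s6             #   add  $t1,$t1,$s6  (address of save[i])
        t0 = memory[t1]          #   lw   $t0,0($t1)   ($t0 = save[i])
        if t0 != s5:             #   bne  $t0,$s5,Exit
            break
        s3 = s3 + 1              #   addi $s3,$s3,1
                                 #   j    Loop
    return s3                    # Exit:

print(run_loop([7, 7, 7, 3], i=0, k=7))
```

As in the C original, the array must contain an element different from k at or after position i; otherwise the loop reads past the end of the array (here, a `KeyError`).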
  • 88. MIPS Addressing Mode • The different ways for specifying the locations of instruction operands are known as addressing mode. • The MIPS addressing modes are the following: 1. Immediate addressing mode 2. Register addressing mode 3. Base or displacement addressing mode 4. PC-relative addressing mode 5. Pseudodirect addressing mode
  • 89. Immediate addressing mode • Def: o the operand is a constant within the instruction itself • Ex: o addi $s1, $s2, 20 #$s1=$s2+20 • Illustration:
  • 90. Register addressing mode • Def: o source and destination operands are registers which are available in processor registers. o Direct addressing mode • Ex: o add $s1, $s2, $s3 #$s1=$s2+$s3 • Illustration:
  • 91. Base or displacement addressing mode • Def: o the operand is at the memory location whose address is the sum of a register and a constant in the instruction o Indirect addressing mode • Ex: o lw $s1, 20($s3) #$s1= Memory[$s3+20] • Illustration:
  • 92. PC-relative addressing mode • Def: o the branch address is the sum of the PC and a constant in the instruction • Ex: o bne $s4, $s5, 25 # if ($s4 != $s5), go to PC = 12 + 4 + 100 (for a bne at address 12; the offset of 25 words is 25 × 4 = 100 bytes) • Illustration:
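The PC-relative target calculation can be sketched in Python. The function name `branch_target` is hypothetical, and the address 12 for the `bne` is the value implied by the slide's comment rather than something stated explicitly.

```python
def branch_target(pc, offset_words):
    """Target of a taken branch: (PC + 4) plus the word offset scaled to bytes."""
    return (pc + 4 + (offset_words << 2)) & 0xFFFFFFFF

# bne at address 12 with a constant of 25 words
print(branch_target(12, 25))
```

The constant counts instructions (words), not bytes, which lets the 16-bit field reach four times further than a byte offset would. Negative offsets give backward branches.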
  • 93. Pseudodirect addressing mode • Def: o the jump address is the 26 bits of the instruction concatenated with the upper bits of the PC • Ex: o j 1000 • Illustration:
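A small Python sketch of how the pseudodirect jump address is formed: the 26-bit field is shifted left by 2 (word to byte address) and combined with the upper 4 bits of PC + 4. The function name `jump_target` and the sample PC values are assumptions for illustration.

```python
def jump_target(pc, address26):
    """Concatenate the upper 4 bits of PC + 4 with the 26-bit field times 4."""
    upper = (pc + 4) & 0xF0000000                 # top 4 bits of the next PC
    return upper | ((address26 & 0x03FFFFFF) << 2)

print(jump_target(0x00000000, 1000))        # j 1000 from low memory
print(hex(jump_target(0x40000008, 1000)))   # same field, different 256 MB region
```

With the upper PC bits equal to zero, `j 1000` lands at byte address 4000.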
  • 94. Decoding Machine Code • Q: What is the assembly language statement corresponding to this machine instruction? 00af8020hex Solution: converting hexadecimal to binary Binary instruction format Assembly instruction
  • 95. Translating Machine Language to Assembly Language • Translate the following machine language code into assembly language. 0x02F34022
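The two decoding questions above can be worked through with a minimal Python decoder. It handles only R-type `add` and `sub`; the register name table follows the standard MIPS naming convention (only $t0-$t7 and $s0-$s7 appear on the slides), and `REG_NAMES`, `FUNCT_NAMES` and `decode_r` are illustrative names.

```python
# Standard MIPS register names for numbers 0..31 (an assumption beyond the
# slides, which list only $t0-$t7 = 8-15 and $s0-$s7 = 16-23).
REG_NAMES = (["$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3"]
             + [f"$t{i}" for i in range(8)]
             + [f"$s{i}" for i in range(8)]
             + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"])

FUNCT_NAMES = {32: "add", 34: "sub"}

def decode_r(word):
    """Decode a 32-bit R-type word into assembly text (add/sub only)."""
    op = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    funct = word & 0x3F
    if op != 0 or funct not in FUNCT_NAMES:
        raise ValueError("this sketch only handles R-type add and sub")
    return f"{FUNCT_NAMES[funct]} {REG_NAMES[rd]},{REG_NAMES[rs]},{REG_NAMES[rt]}"

print(decode_r(0x00AF8020))
print(decode_r(0x02F34022))
```

The steps match the slide: convert hex to binary, split the bits into the op, rs, rt, rd, shamt and funct fields, then map each field back to its mnemonic or register name.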
  • 96. Performance • Performance is the key to understanding underlying motivation for the hardware and its organization • Measure, report, and summarize performance to enable users to o make intelligent choices o see through the marketing hype! • Why is some hardware better than others for different programs? • What factors of system performance are hardware related? (e.g., do we need a new machine, or a new operating system?) • How does the machine's instruction set affect performance?
  • 97. Computer Performance: TIME, TIME, TIME!!! • Response Time (elapsed time, latency): o how long does it take for my job to run? o how long does it take to execute (start to finish) my job? o how long must I wait for the database query? • Throughput: o how many jobs can the machine run at once? o what is the average execution rate? o how much work is getting done? • If we upgrade a machine with a new processor what do we increase? • If we add a new machine to the lab what do we increase? Individual user concerns… Systems manager concerns…
  • 98. Execution Time • Elapsed Time o counts everything (disk and memory accesses, waiting for I/O, running other programs, etc.) from start to finish o a useful number, but often not good for comparison purposes elapsed time = CPU time + wait time (I/O, other programs, etc.) • CPU time o doesn't count waiting for I/O or time spent running other programs o can be divided into user CPU time and system CPU time (OS calls) CPU time = user CPU time + system CPU time  elapsed time = user CPU time + system CPU time + wait time • Our focus: user CPU time (CPU execution time or, simply, execution time) o time spent executing the lines of code that are in our program
  • 99. Definition of Performance • For some program running on machine X: PerformanceX = 1 / Execution timeX • If there are two machines X and Y if the performance of X is greater than performance of Y, PerformanceX > PerformanceY ie., 1 / Execution timeX > 1 / Execution timeY • X is n times faster than Y means: PerformanceX / PerformanceY = n PerformanceX / PerformanceY = Execution timeY / Execution timeX = n
  • 100. Q: If computer A runs a program in 10 sec and computer B runs the same program in 15 sec, how much faster is A than B? • We know that PerformanceA / PerformanceB = Execution timeB / Execution timeA = n. Thus the performance ratio is Execution timeB / Execution timeA = 15 / 10 = 1.5, i.e., PerformanceA / PerformanceB = 1.5. Therefore, A is 1.5 times faster than B.
  • 101. Clock Cycles • Instead of reporting execution time in seconds, we often use cycles: seconds/program = (cycles/program) × (seconds/cycle). In modern computers hardware events progress cycle by cycle: in other words, each event, e.g., multiplication, addition, etc., is a sequence of cycles • Clock ticks indicate start and end of cycles (Figure: a time axis marked with ticks; one cycle lies between consecutive ticks) • cycle time = time between ticks = seconds per cycle • clock rate (frequency) = clock cycles per second (1 Hz = 1 cycle/sec, 1 MHz = 10^6 cycles/sec) • Example: A 200 MHz clock has a cycle time of ????
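Since cycle time is the reciprocal of clock rate, the 200 MHz example can be evaluated with a short Python sketch. The function names are hypothetical and the cycle count in the second call is a made-up figure, not taken from the slides.

```python
def cycle_time_seconds(clock_rate_hz):
    """Cycle time is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz

def cpu_time_seconds(cycles, clock_rate_hz):
    """seconds/program = cycles/program * seconds/cycle."""
    return cycles * cycle_time_seconds(clock_rate_hz)

ns = cycle_time_seconds(200e6) * 1e9
print(f"200 MHz clock: {ns:.1f} ns per cycle")

# Hypothetical program needing 1e9 cycles on the same clock.
print(f"{cpu_time_seconds(1e9, 200e6):.1f} s")
```

A 200 MHz clock therefore has a cycle time of 1 / (200 × 10^6) s = 5 ns.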
  • 102. Performance Equation I • seconds/program = (cycles/program) × (seconds/cycle), i.e., CPU execution time for a program = CPU clock cycles for a program × Clock cycle time • Equivalently, CPU execution time for a program = CPU clock cycles for a program / Clock rate • So, to improve performance one can either: o reduce the number of cycles for a program, or o reduce the clock cycle time, or, equivalently, o increase the clock rate
  • 103. Our favorite program runs in 10 seconds on computer A, which has a 2 GHz clock. We are trying to help a computer designer build a computer, B, which will run this program in 6 seconds. The designer has determined that a substantial increase in the clock rate is possible, but this increase will affect the rest of the CPU design, causing computer B to require 1.2 times as many clock cycles as computer A for this program. What clock rate should we tell the designer to target? CPU timeA = CPU Clock cyclesA / clock rateA; 10 sec = CPU Clock cyclesA / (2 * 10^9 cycles/sec); CPU Clock cyclesA = 10 sec * 2 * 10^9 cycles/sec = 20 * 10^9 cycles. CPU timeB = 1.2 * CPU Clock cyclesA / clock rateB; 6 sec = 1.2 * 20 * 10^9 cycles / clock rateB; clock rateB = 1.2 * 20 * 10^9 cycles / 6 sec = 4 * 10^9 Hz. To run the program in 6 sec, B must have a clock rate of 4 * 10^9 Hz (4 GHz).
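The arithmetic in this worked example can be re-run with a few lines of Python. The function `required_clock_rate` and its parameter names are assumptions of this sketch; the numbers are those given in the problem statement.

```python
def required_clock_rate(time_a, rate_a, cycle_factor, time_b):
    """Clock rate B needs so that cycle_factor * cyclesA finishes in time_b."""
    cycles_a = time_a * rate_a           # CPU clock cycles on A
    cycles_b = cycle_factor * cycles_a   # B needs 1.2 times as many cycles
    return cycles_b / time_b             # clock rate = cycles / seconds

rate_b = required_clock_rate(time_a=10, rate_a=2e9, cycle_factor=1.2, time_b=6)
print(f"{rate_b / 1e9:.1f} GHz")
```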
  • 104. Instruction Performance • No reference to no of instructions in previous equation • The execution time depends on the number of instructions in the program Clock cycles per instruction (CPI) • Average number of clock cycles per instruction for a program or program fragment
  • 105. Suppose we have two implementations of the same instruction set architecture. Computer A has a clock cycle time of 250 ps and a CPI of 2.0 for some program, and computer B has a clock cycle time of 500 ps and a CPI of 1.2 for the same program. Which computer is faster for this program and by how much? • The same number of instructions is executed on both computers
  • 106. Instruction Performance CPU execution time for a program = Instruction count for a program * average CPI * Clock cycle time Or CPU execution time for a program = Instruction count for a program * average CPI / Clock rate
  • 108. Which code sequence executes the most instructions? • Sequence 1 executes 2 + 1 + 2 = 5 instructions • Sequence 2 executes 4 + 1 + 1 = 6 instructions • Sequence 2 executes more instructions
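Because both computers execute the same instruction count, it cancels out and the comparison reduces to CPI × clock cycle time. A minimal Python sketch of that comparison follows; the helper name `time_per_instruction_ps` is illustrative.

```python
def time_per_instruction_ps(cpi, cycle_time_ps):
    """Average time per instruction = CPI * clock cycle time."""
    return cpi * cycle_time_ps

a = time_per_instruction_ps(2.0, 250)   # computer A
b = time_per_instruction_ps(1.2, 500)   # computer B

# Same instruction count on both, so the ratio of execution times
# equals the ratio of the per-instruction times.
ratio = b / a
print(f"A: {a:.0f} ps, B: {b:.0f} ps, A is {ratio:.1f} times faster")
```

Computer A spends 500 ps per instruction against 600 ps for B, so A is 1.2 times faster for this program despite its higher CPI.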
  • 109. Which will be faster? • So code sequence 2 is faster
  • 110. What is the CPI for each sequence? • Sequence 2 has lower CPI as it takes fewer clock cycles but has more instructions
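The three answers above can be reproduced in Python under stated assumptions. The instruction counts (2, 1, 2 and 4, 1, 1) are from the slides, but the per-class CPI values are not given in this transcript: the sketch assumes classes A, B and C with CPIs of 1, 2 and 3, as in the corresponding Patterson & Hennessy example, and the function name `summarize` is hypothetical.

```python
# Assumed CPI per instruction class (not stated in the transcript).
CPI = {"A": 1, "B": 2, "C": 3}

def summarize(counts):
    """Return (instruction count, clock cycles, average CPI) for a sequence."""
    instructions = sum(counts.values())
    cycles = sum(CPI[cls] * n for cls, n in counts.items())
    return instructions, cycles, cycles / instructions

seq1 = summarize({"A": 2, "B": 1, "C": 2})
seq2 = summarize({"A": 4, "B": 1, "C": 1})
print("sequence 1:", seq1)
print("sequence 2:", seq2)
```

Under these assumptions sequence 1 needs 10 cycles (CPI 2.0) and sequence 2 needs 9 cycles (CPI 1.5), which agrees with the slides: sequence 2 executes more instructions yet is faster.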