
  1. Chapter Two: Central Processing Unit (CPU)
  2. Outline (computer architecture and organization, 3/24/2023)
     • Computer arithmetic
     • Instruction sets, instruction formats, and addressing modes
     • CPU structure, RISC and CISC
     • Pipelining
     • The control unit (hardwired and microprogrammed implementations)
  3. CPU
     • The part of the computer that performs the bulk of data-processing operations
     • Interprets and executes machine-level instructions
     • Controls data transfers between main memory (MM) and the CPU
     • Detects errors
  4. Arithmetic Logic Unit (ALU)
     • The ALU is the part of the computer that actually performs arithmetic and logic operations on data.
     • All other elements of the computer system (CU, registers, main memory, and I/O) are mainly used to bring data into the ALU for processing and to take the results back out.
  5. ALU Inputs and Outputs
     • Data are presented to the ALU in registers, and the results of an operation are stored in registers.
     • These registers are temporary storage locations within the processor.
     • The ALU may also set flags as the result of an operation. For example, an overflow flag is set to 1 if the result of a computation exceeds the length of the register into which it is to be stored.
     • Flags are also stored in registers within the processor.
     • The control unit provides signals that control the operation of the ALU and the movement of data into and out of the ALU.
     Figure 10.1 ALU Inputs and Outputs (control signals and operand registers in; flags and result registers out)
  6. Computer Arithmetic
     • The two principal concerns for computer arithmetic are:
        The way in which numbers are represented (the binary format)
        The algorithms used for the basic arithmetic operations
     • The basic arithmetic operations are add, subtract, multiply, and divide.
     • Arithmetic instructions operate on binary (or decimal) data.
     • Arithmetic operations are executed in the ALU.
     • Computer arithmetic is commonly performed on two very different types of numbers: integer and floating point.
  7. Cont..
     • Integer representation
        Sign-magnitude representation
        Two's complement representation
     • Integer arithmetic
        Addition and subtraction
        Multiplication and division
        Negation
  8. Integer Representation
     • Only 0 and 1 are available to represent numbers.
     • Positive numbers are stored in binary, e.g. 41 = 00101001.
     • There is no minus sign and no radix point.
  9. Sign-Magnitude Representation
     • The simplest form of representation that employs a sign bit is the sign-magnitude representation.
     • The leftmost bit is the sign bit: 0 means positive, 1 means negative.
     • +18 = 00010010
     • –18 = 10010010 (sign magnitude)
  10. Two's Complement Representation
     • Like sign magnitude, two's complement representation uses the most significant bit as a sign bit, making it easy to test whether an integer is positive or negative.
     • It differs from sign-magnitude representation in the way the other bits are interpreted.
        +2 = 00000010
        +1 = 00000001
        +0 = 00000000
        –1 = 11111111
        –2 = 11111110
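The interpretation rule behind the table above (the sign bit carries weight –2^(n–1), the remaining bits carry ordinary positive weights) can be sketched in Python; the helper name is hypothetical:

```python
def from_twos_complement(bits: str) -> int:
    """Interpret a bit string as an n-bit two's-complement integer.

    The most significant bit carries weight -2^(n-1); all other bits
    carry their ordinary positive binary weights.
    """
    n = len(bits)
    value = int(bits, 2)        # unsigned interpretation first
    if bits[0] == "1":          # sign bit set -> negative number
        value -= 1 << n         # subtract 2^n
    return value

# The 8-bit examples from the slide:
print(from_twos_complement("00000010"))  # 2
print(from_twos_complement("11111111"))  # -1
print(from_twos_complement("11111110"))  # -2
```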
  11. Integer Arithmetic
     1. Negation (two's complement operation)
        • Take the Boolean complement of each bit of the integer (including the sign bit).
        • Treating the result as an unsigned binary integer, add 1.
     • The negative of the negative of a number is the number itself:
        +18 = 00010010 (two's complement)
        bitwise complement = 11101101
        + 1 = 11101110 = –18
        –18 = 11101110 (two's complement)
        bitwise complement = 00010001
        + 1 = 00010010 = +18
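The two-step negation rule above (complement every bit, then add 1) can be written as a short sketch; the function name is made up for illustration:

```python
def negate(bits: str) -> str:
    """Two's-complement negation: complement every bit, then add 1.

    Works on a fixed width of len(bits); the add-1 wraps modulo 2^n.
    """
    n = len(bits)
    complemented = int(bits, 2) ^ ((1 << n) - 1)   # flip all n bits
    return format((complemented + 1) % (1 << n), f"0{n}b")

print(negate("00010010"))          # +18 -> 11101110 (-18)
print(negate(negate("00010010")))  # negating twice gives back 00010010
```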
  12. 2. Addition and Subtraction
     (a) (–7) + (+5): 1001 + 0101 = 1110 = –2
     (b) (–4) + (+4): 1100 + 0100 = 10000 → 0000 = 0 (carry out of the sign bit is ignored)
     (c) (+3) + (+4): 0011 + 0100 = 0111 = 7
     (d) (–4) + (–1): 1100 + 1111 = 11011 → 1011 = –5 (carry ignored)
     (e) (+5) + (+4): 0101 + 0100 = 1001 → overflow
     (f) (–7) + (–6): 1001 + 1010 = 10011 → overflow
     Figure 10.3 Addition of Numbers in Twos Complement Representation
  13. OVERFLOW RULE: If two numbers are added, and they are both positive or both negative, then overflow occurs if and only if the result has the opposite sign.
     SUBTRACTION RULE: To subtract one number (the subtrahend) from another (the minuend), take the two's complement (negation) of the subtrahend and add it to the minuend.
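Both rules can be checked with a minimal 4-bit sketch (the width and function names are illustrative assumptions, matching the 4-bit examples on the surrounding slides):

```python
BITS = 4

def add(a: int, b: int):
    """Add two BITS-wide two's-complement values; report overflow.

    The carry out of the top bit is discarded, as in Figure 10.3.
    Overflow rule: if both operands have the same sign and the result
    has the opposite sign, overflow occurred.
    """
    mask = (1 << BITS) - 1
    raw = (a + b) & mask                                  # discard carry out
    result = raw - (1 << BITS) if raw >> (BITS - 1) else raw
    overflow = (a >= 0) == (b >= 0) and (result >= 0) != (a >= 0)
    return result, overflow

def sub(minuend: int, subtrahend: int):
    """Subtraction rule: negate the subtrahend and add it to the minuend."""
    return add(minuend, -subtrahend)

print(add(-7, 5))   # (-2, False), case (a) of Figure 10.3
print(add(5, 4))    # (-7, True): overflow, case (e)
print(sub(2, 7))    # (-5, False), case (a) of Figure 10.4
```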
  14. Subtraction examples (M – S, computed as M + (–S)):
     (a) M = 2 (0010), S = 7 (0111), –S = 1001: 0010 + 1001 = 1011 = –5
     (b) M = 5 (0101), S = 2 (0010), –S = 1110: 0101 + 1110 = 10011 → 0011 = 3
     (c) M = –5 (1011), S = 2 (0010), –S = 1110: 1011 + 1110 = 11001 → 1001 = –7
     (d) M = 5 (0101), S = –2 (1110), –S = 0010: 0101 + 0010 = 0111 = 7
     (e) M = 7 (0111), S = –7 (1001), –S = 0111: 0111 + 0111 = 1110 → overflow
     (f) M = –6 (1010), S = 4 (0100), –S = 1100: 1010 + 1100 = 10110 → overflow
     Figure 10.4 Subtraction of Numbers in Twos Complement Representation (M – S)
  15. Hardware for Addition and Subtraction
     • For addition, the two numbers are presented to the adder from two registers, the A and B registers. The result may be stored in one of these registers or in a third.
     • For subtraction, the subtrahend (B register) is passed through a two's complementer so that its two's complement is presented to the adder.
     • The overflow indication is stored in a 1-bit overflow flag OF (0 = no overflow; 1 = overflow).
     • Control signals are needed to control whether or not the complementer is used, depending on whether the operation is addition or subtraction (selected by switch SW).
     Figure 10.6 Block Diagram of Hardware for Addition and Subtraction (A register, B register, complementer, adder; OF = overflow bit, SW = switch to select addition or subtraction)
  16. Machine Instruction Characteristics
     • The operation of the processor is determined by the instructions it executes, referred to as machine instructions or computer instructions.
     • The collection of different instructions that the processor can execute is referred to as the processor's instruction set.
     • Each instruction must contain the information required by the processor for execution.
  17. Elements of a Machine Instruction
     • Operation code: Specifies the operation to be performed (e.g., ADD, I/O). The operation is specified by a binary code, known as the operation code, or opcode.
     • Source operand reference: The operation may involve one or more source operands, that is, operands that are inputs for the operation.
     • Result operand reference: The operation may produce a result.
     • Next instruction reference: This tells the processor where to fetch the next instruction after the execution of this instruction is complete.
  18. Figure 12.1 Instruction Cycle State Diagram (instruction address calculation → instruction fetch → instruction operation decoding → operand address calculation → operand fetch → data operation → operand address calculation → operand store → instruction complete, fetch next instruction; with loops for multiple operands, multiple results, and a return for string or vector data)
     © 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
  19. Source and result operands can be in one of four areas:
     • Main or virtual memory: As with next instruction references, the main or virtual memory address must be supplied.
     • Processor register: With rare exceptions, a processor contains one or more registers that may be referenced by machine instructions.
     • Immediate: The value of the operand is contained in a field in the instruction being executed.
     • I/O device: The instruction must specify the I/O module and device for the operation. If memory-mapped I/O is used, this is just another main or virtual memory address.
  20. Instruction Format
     • An instruction is one of the ordered steps that together form a program; the CU reads an instruction from memory and executes it.
     • An instruction consists of an opcode, usually with additional information such as where the operands come from and where the results go.
     • Each instruction is represented by a sequence of bits.
     • The bits of an instruction are divided into groups called fields.
  21. Opcodes are represented by abbreviations called mnemonics. Examples include:
        ADD   Add
        SUB   Subtract
        MUL   Multiply
        DIV   Divide
        LOAD  Load data from memory
        STOR  Store data to memory
     • Operands are also represented symbolically. For example, ADD R, Y means add the value contained in data location Y to the contents of register R.
     • Each symbolic opcode has a fixed binary representation. The programmer specifies the location of each symbolic operand.
  22. Instruction Types
     Any program written in a high-level language must be translated into machine language to be executed. Thus, the set of machine instructions must be sufficient to express any of the instructions from a high-level language. Instructions are categorized as follows:
     • Data processing: Arithmetic and logic instructions.
     • Data storage: Movement of data into or out of register and/or memory locations.
     • Data movement: I/O instructions. These are needed to transfer programs and data into memory and the results of computations back out to the user.
     • Control: Test and branch instructions. Test instructions are used to test the value of a data word or the status of a computation; branch instructions are then used to branch to a different set of instructions depending on the decision made.
  23. Types of Instruction
     • Note: the number of address fields in the instruction format of a computer depends on the internal organization of its registers.
     1. Three-address instructions
        • Operand 1, operand 2, result
        • Format: OP X, Y, Z; e.g. ADD X, Y, Z
        • Not common
        • Needs very long words (a complex design) to hold three addresses
  24. Types of Instruction cont..
     2. Two-address instructions
        • One address doubles as operand and result
        • Format: OP X, Y; e.g. SUB X, Y
        • Most common in commercial computers
        • Reduces the length of the instruction
        • Requires some extra work: temporary storage to hold some results
  25. Cont…
     3. One-address instructions
        • Implicit second address
        • Uses an accumulator, which holds one of the operands and is used to store the result
        • Format: OP X; e.g. MUL X
        • Common on early machines
     4. Zero-address instructions
        • All addresses implicit
        • Uses a stack
        • Format: OP; e.g. DIV
  26. Example: Evaluate (A+B) × (C+D)
     • Three-address
        1. ADD R1, A, B   ; R1 ← M[A] + M[B]
        2. ADD R2, C, D   ; R2 ← M[C] + M[D]
        3. MUL X, R1, R2  ; M[X] ← R1 × R2
     • Two-address
        1. MOV R1, A      ; R1 ← M[A]
        2. ADD R1, B      ; R1 ← R1 + M[B]
        3. MOV R2, C      ; R2 ← M[C]
        4. ADD R2, D      ; R2 ← R2 + M[D]
        5. MUL R1, R2     ; R1 ← R1 × R2
        6. MOV X, R1      ; M[X] ← R1
  27. Example: Evaluate (A+B) × (C+D)
     • One-address
        1. LOAD A    ; AC ← M[A]
        2. ADD B     ; AC ← AC + M[B]
        3. STORE T   ; M[T] ← AC
        4. LOAD C    ; AC ← M[C]
        5. ADD D     ; AC ← AC + M[D]
        6. MUL T     ; AC ← AC × M[T]
        7. STORE X   ; M[X] ← AC
  28. Example: Evaluate (A+B) × (C+D)
     • Zero-address
        1. PUSH A    ; TOS ← A
        2. PUSH B    ; TOS ← B
        3. ADD       ; TOS ← (A + B)
        4. PUSH C    ; TOS ← C
        5. PUSH D    ; TOS ← D
        6. ADD       ; TOS ← (C + D)
        7. MUL       ; TOS ← (A + B) × (C + D)
        8. POP X     ; M[X] ← TOS
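The zero-address sequence above can be simulated with an explicit stack. The memory contents for A, B, C, and D below are made-up values for illustration:

```python
# Hypothetical memory contents for the named locations.
memory = {"A": 2, "B": 3, "C": 4, "D": 5}
stack = []

def push(name): stack.append(memory[name])        # PUSH name
def add():      stack.append(stack.pop() + stack.pop())  # ADD: pop two, push sum
def mul():      stack.append(stack.pop() * stack.pop())  # MUL: pop two, push product
def pop(name):  memory[name] = stack.pop()        # POP name: store TOS

# PUSH A; PUSH B; ADD; PUSH C; PUSH D; ADD; MUL; POP X
push("A"); push("B"); add()
push("C"); push("D"); add()
mul()
pop("X")
print(memory["X"])  # (2+3) * (4+5) = 45
```

Every arithmetic opcode implicitly operates on the top of the stack, which is why no address fields are needed.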
  29. (a) Three-address instructions:
        SUB Y, A, B   ; Y ← A – B
        MPY T, D, E   ; T ← D × E
        ADD T, T, C   ; T ← T + C
        DIV Y, Y, T   ; Y ← Y ÷ T
     (b) Two-address instructions:
        MOVE Y, A     ; Y ← A
        SUB Y, B      ; Y ← Y – B
        MOVE T, D     ; T ← D
        MPY T, E      ; T ← T × E
        ADD T, C      ; T ← T + C
        DIV Y, T      ; Y ← Y ÷ T
     (c) One-address instructions:
        LOAD D        ; AC ← D
        MPY E         ; AC ← AC × E
        ADD C         ; AC ← AC + C
        STOR Y        ; Y ← AC
        LOAD A        ; AC ← A
        SUB B         ; AC ← AC – B
        DIV Y         ; AC ← AC ÷ Y
        STOR Y        ; Y ← AC
     Figure 12.3 Programs to Execute Y = (A – B) / (C + D × E)
  30. Table 12.1 Utilization of Instruction Addresses (Non-branching Instructions)
     Number of Addresses | Symbolic Representation | Interpretation
     3                   | OP A, B, C              | A ← B OP C
     2                   | OP A, B                 | A ← A OP B
     1                   | OP A                    | AC ← AC OP A
     0                   | OP                      | T ← (T – 1) OP T
     AC = accumulator; T = top of stack; (T – 1) = second element of stack; A, B, C = memory or register locations
  31. Instruction Set Design
     The most important of the fundamental design issues include the following:
     • Operation repertoire: How many and which operations to provide, and how complex operations should be.
     • Data types: The various types of data upon which operations are performed.
     • Instruction format: Instruction length (in bits), number of addresses, size of various fields, and so on.
     • Registers: Number of processor registers that can be referenced by instructions, and their use.
     • Addressing: The mode or modes by which the address of an operand is specified.
  32. Addressing Modes
     • Addressing modes specify where an operand is located and how to compute the exact memory address of the operand.
     • They can specify a constant, a register, or a memory location.
     • The actual location of an operand is its effective address.
     • Generally, an addressing mode tells two things:
        1. The exact location of an operand
        2. How to find that memory address
  33. Types of Addressing Modes
     • An operand reference in an instruction either contains the actual value of the operand (immediate) or a reference to the address of the operand.
     • A wide variety of addressing modes is used in various instruction sets, including:
        Immediate
        Direct
        Indirect
        Register
        Register indirect
        Displacement (indexed)
        Stack
  34. Immediate Addressing
     • The operand is part of the instruction: operand = address field.
     • e.g. ADD 5: add 5 to the contents of the accumulator; 5 is the operand.
     • The simplest form of addressing, in which the operand value is present in the instruction itself.
     • No memory reference is needed to fetch the data (saves one memory cycle).
     • Fast, but with a limited operand range: the size of the number is restricted to the size of the address field.
  35. Direct Addressing
     • The address field contains the address of the operand.
     • Effective address (EA) = address field (A)
     • e.g. ADD A: add the contents of cell A to the accumulator (look in memory at address A for the operand).
     • Single memory reference to access the data.
     • No additional calculations to work out the effective address.
     • Limitation: limited address space.
  36. Indirect Addressing
     • With direct addressing, the length of the address field is usually less than the word length, thus limiting the address range.
     • Solution: have the address field refer to the address of a word in memory which contains a full-length address of the operand. This is known as indirect addressing.
     • EA = (A): look in A, find address (A), and look there for the operand.
     • e.g. ADD (A): add the contents of the cell pointed to by the contents of A to the accumulator.
  37. Cont..
     • Multiple memory accesses are needed to find the operand: instruction execution requires two memory references, one to get the operand's address and a second to get its value. Hence it is slower.
     • Large address space: 2^n, where n = word length.
     • May be nested, multilevel, or cascaded, e.g. EA = (((A))).
     • Parentheses are to be interpreted as meaning "contents of".
  38. Register Addressing
     • Similar to direct addressing; the only difference is that the address field refers to a register rather than a main memory address.
     • The operand is held in the register named in the address field: EA = R.
     • Only a very small address field is needed: shorter instructions and faster instruction fetch.
     • Limited number of registers; the address space is very limited.
  39. Register Addressing …
     • Advantages:
        No memory access; no time-consuming memory references are required
        Very fast execution
     • Disadvantage: very limited address space
  40. Register Indirect Addressing
     • Analogous to indirect addressing; the only difference is whether the address field refers to a memory location or a register: EA = (R).
     • The operand is in the memory cell pointed to by the contents of register R.
     • The address space limitation of register addressing is overcome by having the field refer to a word-length location containing an address.
     • Large address space (2^n).
     • Uses one less memory reference than indirect addressing.
  41. Displacement Addressing
     • Combines the capabilities of direct addressing and register indirect addressing: EA = A + (R).
     • The instruction has two address fields: A = a base value, and R = a register whose contents are added to A to produce the effective address (or vice versa).
     • Common uses of displacement addressing:
        Relative addressing (PC-relative addressing)
        Base-register addressing
        Indexing
  42. Relative Addressing
     • The content of the program counter (PC) is added to the address field of the instruction to obtain the effective address.
     • EA = PC + address field value
  43. Base-Register Addressing
     • The base register content is added to the address field of the instruction to obtain the effective address.
  44. Indexing
     • The content of an index register is added to the address part of the instruction to obtain the EA.
     • The method of calculating the EA is the same as for base-register addressing.
     • Auto-indexing: automatically increment or decrement the index register after each reference to it
        EA = A + (R)
        (R) ← (R) + 1
     • Postindexing: indexing is performed after the indirection
        EA = (A) + (R)
     • Preindexing: indexing is performed before the indirection
        EA = (A + (R))
  45. Stack Addressing
     • A stack is a linear array of locations, sometimes referred to as a pushdown list or last-in-first-out queue.
     • A stack is a reserved block of locations: items are appended to the top of the stack, so the block is partially filled.
     • Machine instructions need not include a memory reference but implicitly operate on the top of the stack.
     • The operand is (implicitly) on top of the stack, e.g. ADD: pop the top two items from the stack, add them, and push the result onto the stack.
  46. Summary of Addressing Modes
     Mode              | Algorithm         | Principal Advantage | Principal Disadvantage
     Immediate         | Operand = A       | No memory reference | Limited operand magnitude
     Direct            | EA = A            | Simple              | Limited address space
     Indirect          | EA = (A)          | Large address space | Multiple memory references
     Register          | EA = R            | No memory reference | Limited address space
     Register indirect | EA = (R)          | Large address space | Extra memory reference
     Displacement      | EA = A + (R)      | Flexibility         | Complexity
     Stack             | EA = top of stack | No memory reference | Limited applicability
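The algorithms in the summary table can be sketched as one dispatch function. The memory and register contents below are hypothetical, chosen only to make the four EA calculations visible:

```python
memory = {100: 200, 200: 7}    # hypothetical memory: address -> contents
registers = {"R1": 100}        # hypothetical register file

def effective_address(mode, A=None, R=None):
    """Compute EA per the summary table for the memory-referencing modes.

    (Immediate and register modes need no EA: the operand is in the
    instruction or in the named register itself.)
    """
    if mode == "direct":             # EA = A
        return A
    if mode == "indirect":           # EA = (A): memory word at A holds the EA
        return memory[A]
    if mode == "register_indirect":  # EA = (R)
        return registers[R]
    if mode == "displacement":       # EA = A + (R)
        return A + registers[R]
    raise ValueError(f"unknown mode: {mode}")

print(effective_address("direct", A=100))              # 100
print(effective_address("indirect", A=100))            # 200
print(effective_address("register_indirect", R="R1"))  # 100
print(effective_address("displacement", A=50, R="R1")) # 150
```

Note how indirect addressing costs an extra memory read (the `memory[A]` lookup) compared with direct addressing, which is exactly the trade-off listed in the table.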
  47. CPU Structure and Function: Processor Organization
     Things a CPU must do (CPU requirements):
     • Fetch instruction: read an instruction from memory (register, cache, main memory).
     • Interpret instruction: decode the instruction to determine what action is required.
     • Fetch data: read data from memory or an I/O module.
     • Process data: perform some arithmetic or logical operation on the data.
     • Write data: write the execution result to memory or an I/O module.
     A small amount of internal memory, called registers, is needed to fulfill these requirements (to store some data temporarily).
  48. Figure 14.1 The CPU with the System Bus (ALU, registers, and control unit connected via the system bus: control bus, data bus, and address bus)
     • The ALU does the actual computation or processing of data.
     • The control unit controls the movement of data and instructions into and out of the processor, controls the operation of the ALU, and decodes the opcode.
  49. Figure 14.2 Internal Structure of the CPU (control unit, registers, and an arithmetic and logic unit comprising arithmetic and Boolean logic, a complementer, a shifter, and status flags, connected by the internal CPU bus and control paths)
  50. Register Organization
     Within the processor there is a set of registers that function as a level of memory above main memory and cache in the hierarchy. The registers in the processor perform two roles:
     • User-visible registers: enable the machine- or assembly-language programmer to minimize main memory references by optimizing the use of registers.
     • Control and status registers: used by the control unit to control the operation of the CPU, and by privileged operating system programs to control the execution of programs; not visible to the user.
  51. User-Visible Registers
     • General-purpose registers: can be used for any type of function.
     • Data registers: used only to hold data.
     • Address registers: used to hold address information.
        Segment pointers: hold the address of the base of a segment.
        Index registers: used for indexed addressing; may be auto-indexed.
        Stack pointer: a dedicated register that points to the top of the stack.
     • Condition codes (also referred to as flags): bits set by the processor hardware as the result of operations. For example, an arithmetic operation may produce a positive, negative, zero, or overflow result.
  52. Control and Status Registers
     Four registers are essential to instruction execution:
     • Program counter (PC): contains the address of an instruction to be fetched.
     • Instruction register (IR): contains the instruction most recently fetched.
     • Memory address register (MAR): contains the address of a location in memory.
     • Memory buffer register (MBR): contains a word of data to be written to memory or the word most recently read.
  53. Program Status Word (PSW): a register or set of registers that contains status information. Common fields or flags include:
     • Sign: the sign bit of the result of the last arithmetic operation.
     • Zero: set when the result is 0.
     • Carry: set if an operation resulted in a carry (addition) into, or borrow (subtraction) out of, a high-order bit.
     • Equal: set if a logical compare result is equality.
     • Overflow: used to indicate arithmetic overflow.
     • Interrupt enable/disable: used to enable or disable interrupts.
     • Supervisor: indicates whether the processor is executing in supervisor or user mode.
  54. CPU Design Issues
     • Whether to use completely general-purpose registers or to specialize their use. Specialized registers save bits in the instruction because their use can be implicit; general-purpose registers are more flexible.
     • The number of registers, either general purpose or data plus address, to be provided. More registers require more operand specifier bits; between 8 and 32 registers appears optimum.
     • Register length. Registers must be at least long enough to hold the largest address, and data registers should be able to hold values of most data types.
  55. The Instruction Cycle
     An instruction cycle includes the following stages:
     • Fetch: read the next instruction from memory into the processor.
     • Execute: interpret the opcode and perform the indicated operation.
     • Interrupt: if interrupts are enabled and an interrupt has occurred, save the current process state and service the interrupt.
     The instruction cycle consists of these phases:
     – Fetch an instruction from memory
     – Decode the instruction
     – Read the effective address from memory if the operand has an indirect address
     – Execute the instruction
  56. Figure 14.4 The Instruction Cycle (fetch, indirect, execute, and interrupt phases)
  57. Figure 14.5 Instruction Cycle State Diagram (Figure 12.1 extended with indirection states on operand fetch and operand store, and an interrupt check after instruction completion: no interrupt → fetch the next instruction; interrupt → service it first)
  58. Instruction Pipelining
     • To improve the performance of a CPU we have two options:
        1. Improve the hardware by introducing faster circuits.
        2. Arrange the hardware so that more than one operation can be performed at the same time.
     • Since there is a limit on the speed of hardware and the cost of faster circuits is quite high, the second option is better.
     • Pipelining is an arrangement of the hardware elements of the CPU such that its overall performance is increased.
     • In a pipelined processor, more than one instruction executes simultaneously.
     • This is achieved without additional hardware, simply by letting different parts of the hardware work on different instructions at the same time.
     • Thus, pipelined operation increases the efficiency of a system.
  59. Cont.…
     • Processors make use of instruction pipelining to speed up execution.
     • The instruction cycle is broken up into a number of separate stages that occur in sequence, such as fetch instruction, decode instruction, determine operand addresses, fetch operands, execute instruction, and write operand result.
     • Instructions move through these stages, as on an assembly line, so that in principle each stage can be working on a different instruction at the same time.
     • Data dependencies can be addressed by reordering the instructions when possible (by the compiler).
  60. Cont.…
     • It might appear that a greater number of stages always provides better performance. However:
        A greater number of stages increases the overhead in moving information between stages and in synchronization between stages.
        With the number of stages, the complexity of the CPU grows.
        It is difficult to keep a large pipeline at maximum rate because of pipeline hazards.
  61. Figure 14.9 Two-Stage Instruction Pipeline: (a) simplified view (fetch instruction → execute → result); (b) expanded view, in which a new address from the execute stage causes the fetched instruction to be discarded and the fetch stage to wait
  62. Additional Stages
     • Fetch instruction (FI): read the next expected instruction into a buffer.
     • Decode instruction (DI): determine the opcode and the operand specifiers.
     • Calculate operands (CO): calculate the effective address of each source operand. This may involve displacement, register indirect, indirect, or other forms of address calculation.
     • Fetch operands (FO): fetch each operand from memory. Operands in registers need not be fetched.
     • Execute instruction (EI): perform the indicated operation and store the result, if any, in the specified destination operand location.
     • Write operand (WO): store the result in memory.
  63. Figure 14.10 Timing Diagram for Instruction Pipeline Operation (nine instructions flowing through the six stages FI, DI, CO, FO, EI, WO; one instruction enters per clock cycle, and all nine complete within 14 cycles)
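The timing diagram can be reproduced with a small simulation. Assuming one cycle per stage and no hazards, instruction i (1-based) occupies stage s (0-based) during cycle i + s; the helper name below is hypothetical:

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]

def schedule(n_instructions: int):
    """Map each instruction (1-based) to the cycle in which it
    occupies each stage, assuming one new instruction enters the
    pipeline per cycle and there are no hazards."""
    return {i + 1: {stage: i + s + 1 for s, stage in enumerate(STAGES)}
            for i in range(n_instructions)}

table = schedule(9)
print(table[1]["FI"], table[1]["WO"])  # instruction 1: cycles 1 and 6
print(table[9]["WO"])                  # instruction 9 finishes in cycle 14
```

With 9 instructions and 6 stages, the last instruction completes in cycle 9 + 6 − 1 = 14, matching the diagram.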
  64. Figure 14.11 The Effect of a Conditional Branch on Instruction Pipeline Operation (instruction 3 is a branch taken to instruction 15: the partially processed instructions 4 through 7 are discarded, and the cycles lost before instruction 15 can proceed are the branch penalty)
  65. Figure 14.12 Six-Stage Instruction Pipeline (fetch instruction → decode instruction → calculate operands → fetch operands → execute instruction → write operands; an unconditional branch, or a taken branch or interrupt, updates the PC and empties the pipe)
  66. Figure 14.14 Speedup Factors with Instruction Pipelining: (a) speedup versus number of instructions (log scale, up to 128) for k = 6, 9, and 12 stages; (b) speedup versus number of stages for n = 10, 20, and 30 instructions
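The curves in the figure follow the standard pipeline timing formula: n instructions through a k-stage pipeline take k + (n − 1) cycle times (the first instruction fills all k stages, then one instruction completes per cycle), versus nk without pipelining, giving a speedup of nk / (k + n − 1):

```python
def speedup(n: int, k: int) -> float:
    """Speedup of a k-stage pipeline over non-pipelined execution of
    n instructions, assuming no stalls or branch penalties."""
    return (n * k) / (k + n - 1)

print(speedup(30, 6))   # approaches k = 6 as n grows
print(speedup(10, 12))  # short programs benefit less from deep pipelines
```

For large n the speedup approaches k, which is why each curve in panel (a) flattens toward its stage count.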
  67. Pipeline Hazards
     • Pipeline hazards are situations that prevent the next instruction in the instruction stream from executing during its designated clock cycle.
     • The instruction is said to be stalled.
     • When an instruction is stalled, all instructions later in the pipeline than the stalled instruction are also stalled; instructions earlier than the stalled one can continue.
     • No new instructions are fetched during the stall.
  68. Pipeline Hazards
     • Occur when the pipeline, or some portion of the pipeline, must stall because conditions do not permit continued execution.
     • Also referred to as a pipeline bubble.
     • There are three types of hazards: resource, data, and control.
  69. Types of Hazards
     • Structural (resource) hazards: arise from hardware resource conflicts, that is, when the hardware cannot service all the combinations of parallel use attempted by the stages in the pipeline.
     • Data hazards: arise when an instruction depends on the (data) results of another instruction that has not yet produced the needed result.
     • Control hazards: arise from the presence of branches or other instructions in the pipeline that alter the sequential instruction flow; an instruction fetch depends on the result of an instruction still in the pipeline.
  70. Structural (Resource) Hazards
     • Occur when a certain resource (memory, functional unit) is requested by more than one instruction at the same time: insufficient resources to service the need.
     • Sometimes resources are not sufficiently duplicated, e.g. read/write ports.
     • Commonly arise when the pipe stages have uneven service rates.
     • Possible solutions:
        Stall.
        Refactor the pipeline, or pipeline the offending pipe stage.
        Duplicate or split the resource (e.g. split/duplicate caches to alleviate memory pressure).
        Build instruction buffers to alleviate memory pressure.
  71. Figure 14.15 Example of Resource Hazard: (a) five-stage pipeline, ideal case; (b) I1 source operand in memory, forcing an idle cycle
     • In Figure 14.15b, assume that the source operand for instruction I1 is in memory rather than a register. Therefore, the fetch instruction stage of the pipeline must idle for one cycle before beginning the instruction fetch for instruction I3.
     • A resource hazard also occurs if multiple instructions are ready to enter the execute instruction phase and there is a single ALU. The solution is to increase the available resources, such as having multiple ports into main memory and multiple ALU units.
  72. Data Hazards
     • Occur when the pipeline changes the order of read/write access to operands, so that the program produces an incorrect result because of the use of pipelining.
     • Suppose we have two instructions, I1 and I2, and the execution of I2 starts before I1 has terminated. If I2 needs the result produced by I1, but this result has not yet been generated, we have a data hazard. E.g.:
        I1: MUL R2, R3   ; R2 ← R2 * R3
        I2: ADD R1, R2   ; R1 ← R1 + R2
     • An early pipe stage attempts to read a data/operand value that has not yet been produced by an instruction in a later pipe stage.
     • Possible solutions:
        Stall.
        Data forwarding (allow the earlier pipe stage to fetch the stale data, but then overwrite it with the result forwarded from the later pipe stage), also called bypassing or short-circuiting.
73. [Figure 14.16 Example of Data Hazard: ADD EAX, EBX followed by SUB ECX, EAX; the SUB's fetch-operands stage idles until the ADD's write-back completes.] The first instruction adds the contents of the 32-bit registers EAX and EBX and stores the result in EAX. The second instruction subtracts the contents of EAX from ECX and stores the result in ECX. If the two instructions are executed in strict sequence, no problem occurs. However, if the instructions are executed in a pipeline, it is possible for the program to produce an incorrect result.
74. Types of Data Hazard • Read after write (RAW), or true dependency • An instruction modifies a register or memory location and a succeeding instruction reads the data in that location. • A hazard occurs if the read takes place before the write operation is complete. • Write after read (WAR), or antidependency • An instruction reads a register or memory location and a succeeding instruction writes to the location. • A hazard occurs if the write operation completes before the read operation takes place. • Write after write (WAW), or output dependency • Two instructions both write to the same location. • A hazard occurs if the write operations take place in the reverse order of the intended sequence.
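All three dependency types can be checked mechanically from each instruction's read set and write set. A minimal sketch (Python; the tuple encoding of an instruction is my own assumption, not from the slides):

```python
def classify_hazards(i1, i2):
    """Classify data hazards between two instructions, each given as
    (set_of_read_locations, set_of_written_locations), with i1 preceding
    i2 in program order."""
    reads1, writes1 = i1
    reads2, writes2 = i2
    hazards = []
    if writes1 & reads2:
        hazards.append("RAW")   # true dependency: i2 reads what i1 writes
    if reads1 & writes2:
        hazards.append("WAR")   # antidependency: i2 overwrites what i1 reads
    if writes1 & writes2:
        hazards.append("WAW")   # output dependency: both write same location
    return hazards

# The slide's example: MUL R2,R3 then ADD R1,R2
mul = ({"R2", "R3"}, {"R2"})
add = ({"R1", "R2"}, {"R1"})
```

For the MUL R2,R3 / ADD R1,R2 pair this reports only a RAW hazard, since ADD reads R2 after MUL writes it.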
75. Control Hazards • Also known as branch hazards (produced by branch instructions). • The presence of a (conditional) branch alters the sequential flow of instructions, and it is not known where to continue until the branch outcome is resolved. Possible Solutions Multiple streams: replicate the initial portions of the pipeline and allow it to fetch both paths, making use of two streams. Prefetch branch target: when a conditional branch is recognized, the target of the branch is prefetched, in addition to the instruction following the branch. Loop buffer: the most recently fetched instructions are buffered in sequence; if a branch is taken, the hardware first checks whether the branch target is within the buffer. Delayed branch: redefine the runtime behavior of branches to take effect only after the partially fetched/executed instructions have flowed through the pipeline. Branch prediction: predict (statically or dynamically) the outcome of the branch and fetch from there.
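Dynamic branch prediction is often implemented with a 2-bit saturating counter per branch. A toy single-counter model (Python sketch, not from the slides; the initial "weakly taken" state is an arbitrary choice):

```python
def simulate_2bit_predictor(outcomes):
    """Simulate one 2-bit saturating counter. outcomes is a list of actual
    branch results (True = taken). Counter states are 0..3; predict taken
    when the counter is >= 2. Returns the number of correct predictions."""
    counter = 2              # start in the weakly-taken state (assumption)
    correct = 0
    for taken in outcomes:
        prediction = counter >= 2
        if prediction == taken:
            correct += 1
        # Saturating update toward the actual outcome
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return correct
```

For a loop branch taken nine times and then falling through, this predicts 9 of 10 outcomes correctly; the classic advantage over a 1-bit scheme is that the single not-taken outcome at loop exit does not cause a second misprediction when the loop is re-entered.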
76. RISC & CISC • The instruction set determines the way that machine language programs are constructed. • Its design is an important aspect of computer architecture. • Based on instruction set design, computers are classified as CISC (Complex Instruction Set Computing) or RISC (Reduced Instruction Set Computing). • Both approaches try to increase CPU performance. • RISC: reduces the cycles per instruction at the cost of the number of instructions per program. • CISC: attempts to minimize the number of instructions per program at the cost of an increase in the number of cycles per instruction.
77. Complex Instruction Set Architecture (CISC) • The main idea is that a single instruction does all the loading, evaluating and storing: a multiplication command, for example, also loads its data, evaluates, and stores the result; hence it is complex. • Minimizes the number of instructions per program at the cost of an increase in the number of cycles per instruction. • Code size is smaller, but there are more cycles per instruction. • Needs more elaborate hardware and powerful processing. • Large variety of addressing modes. • Variable-length instruction formats. • Uses more RAM and fewer registers.
78. Characteristics of CISC • Complex instructions, hence complex instruction decoding. • Instructions may be larger than one word. • An instruction may take more than a single clock cycle to execute. • Fewer general-purpose registers, as operations can be performed in memory itself. • Complex addressing modes. • More data types.
79. Reduced Instruction Set Architecture (RISC) • The main idea is to make the hardware simpler by using an instruction set composed of a few basic steps for loading, evaluating and storing: a load command loads data, a store command stores data. • A type of microprocessor architecture that utilizes a small, highly optimized set of instructions. • Reduces the cycles per instruction at the cost of the number of instructions per program. • Designed to perform a set of smaller computer instructions so that it can operate at higher speeds. • Code size is larger, but the clock cycle is shorter. • Uses more registers and less RAM.
80. Characteristics of RISC • Simple instructions, hence simple instruction decoding. • Instructions fit within one word. • Instructions take a single clock cycle to execute. • More general-purpose registers. • Simple addressing modes. • All operations are done within the registers of the CPU. • Fixed-length, easily decoded instruction format. • Fewer data types. • Pipelining can be achieved. • Hardwired control unit.
81. Cont.… Advantages of RISC 1) Because each instruction requires only one clock cycle to execute, the entire program executes in approximately the same amount of time as the multi-cycle "MUL" command. 2) The RISC "reduced instructions" require fewer transistors of hardware space than complex instructions, leaving more room for general-purpose registers. 3) Because all instructions execute in a uniform amount of time (i.e., one clock cycle), pipelining is possible.
82. Difference between CISC and RISC (RISC vs CISC) • Focus on software vs focus on hardware. • Uses only a hardwired control unit vs uses both hardwired and microprogrammed control units. • Transistors are used for more registers vs transistors are used for storing complex instructions. • Fixed-size instructions vs variable-size instructions. • Can perform only register-to-register arithmetic operations vs can perform REG-to-REG, REG-to-MEM, or MEM-to-MEM. • Requires more registers vs requires fewer registers. • Code size is large vs code size is small. • An instruction executes in a single clock cycle vs an instruction takes more than one clock cycle. • An instruction fits in one word vs instructions can be larger than one word.
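Both columns of the comparison reduce to the same performance equation: time = instruction count × cycles per instruction × clock period. A quick numeric sketch (Python; the instruction counts and CPIs are hypothetical, chosen only to illustrate the trade-off):

```python
def execution_time(instructions, cpi, cycle_time_ns):
    """Classic performance equation: instructions x CPI x clock period."""
    return instructions * cpi * cycle_time_ns

# Hypothetical workload: CISC needs fewer instructions but more cycles each;
# RISC needs more instructions at one cycle each.
cisc_ns = execution_time(instructions=1_000_000, cpi=4, cycle_time_ns=1.0)
risc_ns = execution_time(instructions=1_800_000, cpi=1, cycle_time_ns=1.0)
```

With these made-up numbers the RISC code runs faster despite being 80% larger; a different CPI or instruction-count ratio could reverse the outcome, which is exactly the trade-off the comparison describes.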
83. Control Unit The execution of a program consists of the sequential execution of instructions. • Each instruction is executed during an instruction cycle made up of shorter sub-cycles (fetch, indirect, execute, interrupt). • Each sub-cycle is made up of a sequence of more fundamental operations, called micro-operations.  A single micro-operation involves a transfer between registers, a transfer between a register and an external bus, or a simple ALU operation. The control unit of a processor performs two tasks: 1. It causes the processor to step through a series of micro-operations in the proper sequence, based on the program being executed (sequencing). 2. It generates the control signals that cause each micro-operation to be executed (executing). These signals cause the opening and closing of logic gates, resulting in the transfer of data to and from registers and the operation of the ALU.
84. [Figure 20.1 Constituent Elements of a Program Execution: program execution is a sequence of instruction cycles; each instruction cycle is made up of fetch, indirect, execute, and interrupt sub-cycles; each sub-cycle is a sequence of micro-operations (µOPs).]
85. The three-step process leads to a characterization of the control unit: 1. Define the basic elements of the processor. 2. Describe the micro-operations that the processor performs. 3. Determine the functions that the control unit must perform to cause the micro-operations to be performed. The basic functional elements of the processor are the following: ■ ALU ■ Registers ■ Internal data paths ■ External data paths ■ Control unit. Micro-operations fall into one of the following categories: • Transfer data from one register to another. • Transfer data from a register to an external interface (e.g., system bus). • Transfer data from an external interface to a register. • Perform an arithmetic or logic operation, using registers for input and output.
86. Control Unit [Figure 20.4 Block Diagram of the Control Unit: inputs are the instruction register, flags, clock, and control signals from the control bus; outputs are control signals within the CPU and control signals to the control bus.] The inputs are: ■ Clock: causes one micro-operation (or set of simultaneous micro-operations) to be performed per clock pulse. ■ Instruction register: the opcode and addressing mode of the current instruction determine which micro-operations to perform. ■ Flags: indicate the status of the CPU and the outcome of previous ALU operations. ■ Control signals from the control bus: provide interrupt and acknowledgment signals to the control unit. The outputs are as follows: ■ Control signals within the CPU: cause data movement from one register to another and activate specific ALU functions. ■ Control signals to the control bus: control signals to memory and to the I/O modules.  Three types of control signals:  those that activate an ALU function;  those that activate a data path; and  those that are signals on the external system bus or other external interface.
87. The first step in execution is to transfer the contents of the PC to the MAR.  The control unit activates the control signal that opens the gates between the bits of the PC and the MAR. The next step is to read a word from memory into the MBR and increment the PC. The control unit does this by sending the following control signals simultaneously: ■ A control signal that opens gates, allowing the contents of the MAR onto the address bus; ■ A memory read control signal on the control bus; ■ A control signal that opens the gates, allowing the contents of the data bus to be stored in the MBR; ■ Control signals to logic that adds 1 to the contents of the PC and stores the result back in the PC.  The control unit then sends a control signal that opens the gates between the MBR and the IR.  Finally, the control unit decides whether to perform an indirect cycle or an execute cycle next. To decide, it examines the IR to see whether an indirect memory reference is made.
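The gate-opening sequence above amounts to three register-transfer time steps, which can be mimicked directly (a Python sketch; the register names follow the slides, while the memory contents below are invented for illustration):

```python
class Processor:
    """Toy register-transfer model of the fetch cycle's micro-operations."""

    def __init__(self, memory):
        self.memory = memory          # word-addressed main memory
        self.pc = 0                   # program counter
        self.mar = self.mbr = self.ir = 0

    def fetch(self):
        # t1: MAR <- PC
        self.mar = self.pc
        # t2: MBR <- memory[MAR] and, in parallel, PC <- PC + 1
        self.mbr = self.memory[self.mar]
        self.pc += 1
        # t3: IR <- MBR
        self.ir = self.mbr

cpu = Processor(memory=[0x1940, 0x5941, 0x2941])
cpu.fetch()   # IR now holds the first word; PC points at the next one
```

After the first `fetch()`, the IR holds `memory[0]` and the PC has advanced to 1, ready for the next instruction cycle.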
88. Two major types of Control Unit Hardwired Control Unit:  The control logic is implemented with gates, flip-flops, decoders, and other digital circuits.  + Fast operation.  - Requires rewiring if the design has to be modified. Microprogrammed Control Unit:  The control information is stored in a control memory, and the control memory is programmed to initiate the required sequence of micro-operations.  + Any required change can be made by updating the microprogram in control memory (easy to change or modify).  - Slower operation.
89. Hardwired Implementation • The control unit is a combinational circuit. • The control signals are generated by hardware using conventional logic design techniques. • The control signals that specify the micro-operations are a group of bits that select the paths in multiplexers, decoders, and arithmetic logic units. Problems  Complex sequencing and micro-operation logic.  Difficult to design and test.  Inflexible design.  Difficult to add new instructions.
90. [Figure 20.10 Control Unit with Decoded Inputs: the instruction register feeds a decoder producing instruction signals I0…Ik; the clock drives a timing generator producing timing signals T1…Tn; together with the flags, these feed the control unit, which emits control signals C0…Cm.]
91. Microprogrammed Implementation An alternative to a hardwired control unit. The logic of the control unit is specified by a microprogram, which consists of a sequence of microinstructions.  A microprogram is a sequence of microinstructions.  Microinstructions are very simple instructions that specify micro-operations. The microinstructions are stored in a control memory in the form of control words,  used to initiate (generate) the required sequence of micro-operations. A microprogrammed control unit is a relatively simple logic circuit that is capable of 1. Sequencing through microinstructions, and 2. Generating the control signals to execute each microinstruction. • The control signals generated by a microinstruction are used to cause register transfers and ALU operations.
92. Cont... Dynamic microprogramming: control memory = RAM. The RAM can be written (a writable control memory can be changed). The microprogram is loaded initially from an auxiliary memory such as a magnetic disk. Static microprogramming: control memory = ROM. Control words in ROM are made permanent during hardware production. Control Memory • A memory which is part of the control unit. Computer memory is thus split into: • Main memory: for storing the user program (machine instructions/data). • Control memory: for storing the microprogram (a series of microinstructions).
93. [Figure 21.4 Functioning of Microprogrammed Control Unit: the instruction register feeds a decoder into the control address register; sequencing logic, the control memory, and the control buffer register close the loop; decoders emit control signals within the CPU and to the system bus.] In each clock pulse: 1. The sequencing logic issues a READ command to the control memory. 2. The control word whose address is specified in the control address register is read into the control buffer register. 3. The contents of the control buffer register generate control signals and next-address information for the sequencing logic unit. 4. The sequencing logic unit loads a new address into the control address register, based on the next-address information from the control buffer register and the ALU flags. • The upper decoder translates the opcode of the IR into a control memory address.
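Steps 1–4 form a simple read-emit-branch loop over the control memory, which a few lines of Python can model. The control-word format below (signals, next address if the tested flag is clear, next address if set, flag name) is invented for illustration and is not the slides' format:

```python
def run_microprogram(control_memory, start_address, flags):
    """Minimal model of the microprogrammed control unit's loop: the CAR
    addresses control memory, the fetched word goes to the CBR, and the
    CBR's fields give the control signals to emit plus the next address
    (possibly chosen using a flag). A next address of None halts."""
    car = start_address
    emitted = []
    while car is not None:
        cbr = control_memory[car]            # read control word into CBR
        signals, next_clear, next_set, flag = cbr
        emitted.append(signals)              # drive the control signals
        if flag is None:
            car = next_clear
        else:                                # sequencing uses the flag
            car = next_set if flags.get(flag) else next_clear
    return emitted

# Hypothetical control memory for a fetch sequence, with a branch on an
# "indirect" flag before a (single-step) indirect cycle:
cm = {
    0: ("MAR<-PC", 1, 1, None),
    1: ("MBR<-M[MAR]; PC<-PC+1", 2, 2, None),
    2: ("IR<-MBR", None, 3, "indirect"),
    3: ("MAR<-IR(addr); MBR<-M[MAR]", None, None, None),
}
```

Selecting a different next address when the "indirect" flag is set is how the choice between an execute cycle and an indirect cycle would be encoded in the next-address field.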
95. Hardwired vs Microprogrammed (hardwired vs microprogrammed) • Control signals: generated using hardware vs generated using software (the microprogram). • Structure: based on hardware, so rigid vs based on software, so flexible. • Modification: done by redesigning vs done by reprogramming. • Instruction set: small and simple vs large and complex. • Debugging: difficult vs easy. • Emulation: not possible vs possible. • Execution speed: very fast vs slower. • Memory: no control memory required vs memory required for the microprogram. • Cost: low, as no control memory is needed vs high, due to the control memory. • Processor: preferred in RISC vs preferred in CISC. • Design process: sequential circuit design vs programming. • Chip area: more vs less. • Pipelining: small and efficient vs long and less efficient.
  96. Thank You 