Computer organization processing unit
 


Computer Organization-Pdf,Lecture note,Ppt


Usage Rights

© All Rights Reserved


    Computer organization processing unit: Presentation Transcript

    • Processing Unit. Deepak John, Department of IT, CE Poonjar
    • Fundamental Concepts
      • The processor fetches one instruction at a time and performs the operation specified.
      • Instructions are fetched from successive memory locations until a branch or a jump instruction is encountered.
      • The processor keeps track of the address of the memory location containing the next instruction to be fetched in the Program Counter (PC).
      • The Instruction Register (IR) holds the instruction that is currently being executed.
    • Executing an Instruction Fetch the contents of the memory location pointed to by the PC. The contents of this location are loaded into the IR (fetch phase). IR ← [[PC]] Assuming that the memory is byte addressable, increment the contents of the PC by 4 (fetch phase). PC ← [PC] + 4 Carry out the actions specified by the instruction in the IR (execution phase). Deepak John, Department Of IT, CE Poonjar
    • Processor Organization
    • Executing an Instruction Transfer a word of data from one processor register to another or to the ALU. Perform an arithmetic or a logic operation and store the result in a processor register. Fetch the contents of a given memory location and load them into a processor register. Store a word of data from a processor register into a given memory location.Register Transfers For each register two control signals are used1. To place the contents of that register on the bus2. To load the data on the bus into register The input and output of register are connected to the bus via switches controlled by the signals Rin and Rout Deepak John, Department Of IT, CE Poonjar
    • Performing an Arithmetic or Logic Operation
      • The ALU is a combinational circuit that has no internal storage.
      • The ALU gets its two operands from the MUX and from the bus. The result is temporarily stored in register Z.
      • Sequence of operations to add the contents of register R1 to those of R2 and store the result in R3:
        1. R1out, Yin
        2. R2out, SelectY, Add, Zin
        3. Zout, R3in
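The three control steps can be traced with a small simulation. It is a sketch under stated assumptions: the function `execute`, the dictionary of registers, and the rule that any signal ending in `out` drives the bus while any signal ending in `in` loads a register are modelling choices made here, not part of the slides.

```python
def execute(steps, regs):
    """Apply single-bus control steps. Each step is a list of active signals.
    Assumed model: 'Xout' drives the bus from X, 'Xin' loads X from the bus,
    except Zin, which loads the ALU output."""
    for signals in steps:
        bus, alu = None, None
        for s in signals:                    # one register drives the bus
            if s.endswith("out"):
                bus = regs[s[:-3]]
        if "Add" in signals:                 # MUX picks Y (SelectY) or the constant 4
            a = regs["Y"] if "SelectY" in signals else 4
            alu = a + bus
        for s in signals:
            if s.endswith("in"):
                name = s[:-2]
                regs[name] = alu if name == "Z" else bus
    return regs


regs = {"R1": 5, "R2": 7, "R3": 0, "Y": 0, "Z": 0}
steps = [
    ["R1out", "Yin"],
    ["R2out", "SelectY", "Add", "Zin"],
    ["Zout", "R3in"],
]
result = execute(steps, regs)
print(result["R3"])
```

Three steps are needed because the single bus can carry only one operand at a time, so R1 has to be parked in Y before R2 is placed on the bus.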
    • Fetching a Word from Memory: the address is loaded into MAR, a Read operation is issued, and the data arrive in MDR. The processor waits until the memory signals that the transfer is done by asserting Memory-Function-Completed (MFC).
    • Execution of Branch Instructions
      • A branch instruction replaces the contents of the PC with the branch target address, which is usually obtained by adding an offset X given in the branch instruction.
      • The offset X is usually the difference between the branch target address and the address immediately following the branch instruction.
      • A conditional branch also checks the condition codes before the target address is loaded into the PC.
        Step  Action
        1     PCout, MARin, Read, Select4, Add, Zin
        2     Zout, PCin, Yin, WMFC
        3     MDRout, IRin
        4     Offset-field-of-IRout, Add, Zin
        5     Zout, PCin, End
      Figure 7.7. Control sequence for an unconditional branch instruction.
    • Multiple-Bus Organization
      • Register file: all general-purpose registers are combined into a single block.
      • Buses A and B: used to transfer the source operands to the ALU.
        Step  Action
        1     PCout, R=B, MARin, Read, IncPC
        2     WMFC
        3     MDRoutB, R=B, IRin
        4     R4outA, R5outB, SelectA, Add, R6in, End
      Figure 7.9. Control sequence for the instruction Add R4,R5,R6 for the three-bus organization.
    • Hardwired Control
    • Control Unit Organization
      • To execute instructions, the processor must have some means of generating the control signals needed in the proper sequence.
      • Two categories: hardwired control and microprogrammed control.
      • A hardwired system can operate at high speed, but with little flexibility.
      • Figure (blocks only): the clock (CLK) drives the control step counter; the decoder/encoder combines the counter output with the IR, the external inputs, and the condition codes to produce the control signals.
    • Detailed Block Description
      • Step decoder: provides a separate signal line for each step, or time slot, in the control sequence.
      • Instruction decoder: its output consists of a separate line for each machine instruction. For any instruction in the IR, one of the output lines INS1 through INSm is set to 1, and all other lines are set to 0.
      • The input signals to the encoder block are combined to generate the individual control signals Yin, PCout, Add, End, etc.
    • Generating Zin: $Z_{in} = T_1 + T_6 \cdot ADD + T_4 \cdot BR + \dots$ (figure: an OR gate fed by T1 and by two AND gates, one combining T6 with Add and one combining T4 with Branch)
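The encoder expression for Zin can be written as a Boolean function of the current step and the decoded instruction. This sketch covers only the three terms shown on the slide; the terms hidden behind the ellipsis are left out, and the function name `z_in` and its string arguments are assumptions for illustration.

```python
def z_in(step, instruction):
    """Zin = T1 + T6.ADD + T4.BR, restricted to the terms listed on the slide.
    `step` is the active time slot; `instruction` is the decoded opcode name."""
    def t(k):
        return step == k      # output Tk of the step decoder

    return t(1) or (t(6) and instruction == "ADD") or (t(4) and instruction == "BR")


print([k for k in range(1, 8) if z_in(k, "ADD")])
```

T1 appears without an instruction term because step 1 is the instruction fetch, which is the same for every instruction.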
    • ADVANTAGES Hardwired Control Unit is fast because control signals are generated by combinational circuits. The delay in generation of control signals depends upon the number of gates.DISADVANTAGES more complex will be the design of control unit. Modifications in control signal are very difficult. That means it requires rearranging of wires in the hardware circuit. It is difficult to correct mistake in original design or adding new feature in existing design of control unit. Deepak John, Department Of IT, CE Poonjar
    • A Complete Processor
    • Microprogrammed Control
    • Control signals are generated by a program similar to machine language programs.
      • Control word (CW): a word whose individual bits represent the various control signals. Each step of the instruction execution is represented by a control word in which all of the bits corresponding to the control signals needed for that step are set to one.
      • Microroutine: a sequence of CWs corresponding to the control sequence of a machine instruction.
      • Microinstruction: an individual control word in a microroutine. It consists of one or more micro-operations to be executed and the address of the next microinstruction to be executed.
    • Control store: the memory in which the microroutines for all instructions in the instruction set of a computer are stored.
    • The previous organization cannot handle the situation in which the control unit must check the status of the condition codes or external inputs to choose between alternative courses of action. The solution is a conditional branch microinstruction.
        Address  Microinstruction
        0        PCout, MARin, Read, Select4, Add, Zin
        1        Zout, PCin, Yin, WMFC
        2        MDRout, IRin
        3        Branch to starting address of appropriate microroutine
        ...      ...
        25       If N=0, then branch to microinstruction 0
        26       Offset-field-of-IRout, SelectY, Add, Zin
        27       Zout, PCin, End
      Figure 7.17. Microroutine for the instruction Branch<0.
    • Figure 7.18. Organization of the control unit to allow conditional branching in the microprogram. (Blocks: the IR, the external inputs, and the condition codes feed the starting and branch address generator; the clock drives the µPC, which addresses the control store; the control store outputs the CW.)
    • Microinstructions
      • A straightforward way to structure microinstructions is to assign one bit position to each control signal. However, this is very inefficient.
      • The length can be reduced because most signals are not needed simultaneously and many signals are mutually exclusive.
      • All mutually exclusive signals are placed in the same group and binary coded.
    • Partial Format for the Microinstructions
    • Further Improvement
      • Vertical organization:
        1. Each microinstruction specifies only a small number of control functions.
        2. Operating speed is lower because more microinstructions are needed to perform the desired function.
        3. Less hardware is needed to handle the execution of microinstructions.
      • Horizontal organization:
        1. Each microinstruction specifies many control signals.
        2. It is useful when higher operating speed is desired and the machine structure allows parallel use of resources.
    • Microprogram sequencing
      • Several operations execute with varying addressing modes. For example, consider ADD r1, r2, r3; ADD (r1), r2, r3; ADD x(r1), r2, r3; and ADD (r1)+, r2, r3.
      • Two disadvantages of the simple approach:
        • Having a separate microroutine for each machine instruction results in a large total number of microinstructions and a large control store.
        • Execution time is longer because it takes more time to carry out the required branches.
      • A separate microroutine for each combination of instruction and addressing mode would produce considerable duplication of common parts.
      • Organize the microprogram so that the microroutines share as many common parts as possible.
    • Prefetching microinstructions
      • Microprogrammed control leads to a slower operating speed because of the time it takes to fetch microinstructions from the control store.
      • This can be avoided by prefetching the next microinstruction while the current one is being executed.
      • Prefetching has a difficulty: the status flags and the result of the current microinstruction are sometimes needed to determine the address of the next microinstruction. Straightforward prefetching therefore occasionally fetches the wrong microinstruction, in which case the fetch must be repeated with the correct address.
    • Emulation
      • Allows obsolete equipment to be replaced with more up-to-date machines.
      • If the replacement computer fully emulates the original one, no software changes have to be made to run existing programs.
      • Facilitates the transition to a new computer system with minimal disruption.
    • ADVANTAGES The design of micro-program control unit is less complex because micro-programs are implemented using software routines. The micro-programmed control unit is more flexible because design modifications, correction and enhancement is easily possible. The new or modified instruction set of CPU can be easily implemented by simply rewriting or modifying the contents of control memory. The fault can be easily diagnosed in the micro-program control unit using diagnostics tools by maintaining the contents of flags, registers and counters. Deepak John, Department Of IT, CE Poonjar
    • Microprogrammed Control: Disadvantages
      • The microprogrammed control unit is slower than the hardwired control unit: executing an instruction takes more time.
      • The microprogrammed control unit is more expensive than the hardwired control unit when hardware resources are limited.
      • For a smaller CPU, the design of a microprogrammed control unit takes longer than that of a hardwired control unit.
    • Arithmetic
    • Signed Numbers
      • The leftmost bit is the sign bit: 0 means positive, 1 means negative.
      • In sign-and-magnitude form, +18 = 00010010 and -18 = 10010010.
    • 2's complement number system
      • To compute the representation of a negative number quickly, find the representation of its absolute value in n bits, invert all bits of this number, and add 1 to it.
      • For example, for -10 in 8 bits: 10 in 8 bits is 00001010. Invert all bits to get 11110101. Add 1 to get 11110110.
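The invert-and-add-1 rule can be checked with a few lines of Python. The helper name `twos_complement` and its default width of 8 bits are assumptions made for this sketch.

```python
def twos_complement(value, n=8):
    """Return the n-bit 2's-complement bit string of an integer,
    using the invert-all-bits-then-add-1 rule for negative values."""
    mask = (1 << n) - 1
    magnitude = abs(value)
    if value >= 0:
        return format(magnitude, f"0{n}b")
    inverted = magnitude ^ mask               # invert all n bits
    return format((inverted + 1) & mask, f"0{n}b")


print(twos_complement(-10))
print(twos_complement(-18))
```

Unlike the sign-and-magnitude pattern 10010010 for -18, the 2's-complement pattern can be added to other numbers with an ordinary binary adder.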
    • Addition/subtraction of signed numbers. At the ith stage, the inputs are $x_i$, $y_i$, and the carry-in $c_i$; the outputs are the sum $s_i$ and the carry-out $c_{i+1}$ to the (i+1)st stage.
        x_i  y_i  c_i | s_i  c_{i+1}
        0    0    0   | 0    0
        0    0    1   | 1    0
        0    1    0   | 1    0
        0    1    1   | 0    1
        1    0    0   | 1    0
        1    0    1   | 0    1
        1    1    0   | 0    1
        1    1    1   | 1    1
      $s_i = \overline{x_i}\,\overline{y_i}\,c_i + \overline{x_i}\,y_i\,\overline{c_i} + x_i\,\overline{y_i}\,\overline{c_i} + x_i\,y_i\,c_i = x_i \oplus y_i \oplus c_i$
      $c_{i+1} = y_i c_i + x_i c_i + x_i y_i$
      Example: X = 7 = 0111, Y = 6 = 0110, Z = X + Y = 13 = 1101, with carries $c_4 c_3 c_2 c_1 c_0$ = 0 1 1 0 0.
    • Addition logic for a single stage (figure: gate-level circuits for the sum $s_i$ and the carry $c_{i+1}$, with inputs $x_i$, $y_i$, $c_i$). Full adder (FA): symbol for the complete circuit for a single stage of addition.
    • n-bit adder
      • Cascade n full adder (FA) blocks to form an n-bit adder.
      • Carries propagate, or ripple, through this cascade: an n-bit ripple-carry adder (figure: FA stages from the least significant bit position, with inputs $x_0$, $y_0$, $c_0$, to the most significant bit position, with inputs $x_{n-1}$, $y_{n-1}$ and carry-out $c_n$).
      • The carry-in $c_0$ into the LSB position provides a convenient way to perform subtraction.
    • kn-bit adder
      • kn-bit numbers can be added by cascading k n-bit adders.
      • Each n-bit adder forms a block, so this is a cascade of blocks. Carries ripple, or propagate, through the blocks: a blocked ripple-carry adder.
    • n-bit subtractor
      • Recall that X - Y is equivalent to adding the 2's complement of Y to X.
      • The 2's complement is equivalent to the 1's complement + 1.
      • $X - Y = X + \overline{Y} + 1$
      • The 2's complement of positive and negative numbers is computed similarly.
      • Figure: the ripple-carry adder with each $y_i$ inverted and the carry-in $c_0$ set to 1.
    • n-bit adder/subtractor (figure: an n-bit adder with inputs $x_{n-1} \dots x_0$ and $y_{n-1} \dots y_0$, outputs $s_{n-1} \dots s_0$ and $c_n$, and an Add/Sub control line that also drives $c_0$)
      • Add/Sub control = 0: addition.
      • Add/Sub control = 1: subtraction.
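A ripple-carry adder/subtractor can be sketched directly from the full-adder equations. The function names and the convention of passing operands as non-negative integer bit patterns are assumptions of this example; XORing each $y_i$ with the control line and feeding the control line into $c_0$ is the usual way to realize $X + \overline{Y} + 1$.

```python
def full_adder(x, y, c):
    """One stage of addition: returns (s_i, c_{i+1})."""
    s = x ^ y ^ c
    c_out = (y & c) | (x & c) | (x & y)
    return s, c_out


def add_sub(x, y, n, sub=0):
    """n-bit ripple-carry adder/subtractor on integer bit patterns.
    sub=0 adds; sub=1 inverts every y_i and sets c0 = 1, giving X - Y."""
    carry, result = sub, 0
    for i in range(n):                       # the carry ripples from LSB to MSB
        xi = (x >> i) & 1
        yi = ((y >> i) & 1) ^ sub
        s, carry = full_adder(xi, yi, carry)
        result |= s << i
    return result, carry


print(add_sub(7, 6, 4))          # 0111 + 0110
print(add_sub(3, 5, 4, sub=1))   # 0011 - 0101, result is the 4-bit pattern of -2
```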
    • Detecting overflows
      • Overflow can occur only when the two operands have the same sign.
      • Overflow occurs if the sign of the result is different from the sign of the operands.
      • Recall that the MSB represents the sign: $x_{n-1}$, $y_{n-1}$, $s_{n-1}$ are the signs of operand x, operand y, and result s respectively.
      • A circuit to detect overflow can be implemented with either of the following logic expressions:
        $Overflow = x_{n-1}\,y_{n-1}\,\overline{s_{n-1}} + \overline{x_{n-1}}\,\overline{y_{n-1}}\,s_{n-1}$
        $Overflow = c_n \oplus c_{n-1}$
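Both overflow expressions can be evaluated side by side to see that they agree. The function below is a sketch: its name, the use of integer bit patterns for the operands, and the way the carries $c_n$ and $c_{n-1}$ are extracted are choices made for this example.

```python
def add_with_overflow(x, y, n=8):
    """Add two n-bit 2's-complement patterns. Returns the n-bit sum and the
    overflow flag computed from the sign bits and from the two top carries."""
    mask = (1 << n) - 1
    s = (x + y) & mask
    xs, ys, ss = (x >> (n - 1)) & 1, (y >> (n - 1)) & 1, (s >> (n - 1)) & 1
    by_signs = (xs & ys & (ss ^ 1)) | ((xs ^ 1) & (ys ^ 1) & ss)
    c_n = ((x + y) >> n) & 1                      # carry out of the MSB stage
    low = mask >> 1
    c_n_1 = (((x & low) + (y & low)) >> (n - 1)) & 1   # carry into the MSB stage
    return s, by_signs, c_n ^ c_n_1


print(add_with_overflow(100, 100))     # 100 + 100 does not fit in 8 signed bits
print(add_with_overflow(0xFF, 0x01))   # -1 + 1
```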
    • Computing the add time. Consider the 0th stage (figure: a single FA with inputs $x_0$, $y_0$, $c_0$ and outputs $s_0$, $c_1$, together with its sum and carry circuits):
      • $c_1$ is available after 2 gate delays.
      • $s_0$ is available after 1 gate delay.
    • Computing the add time (contd.). Cascade of 4 full adders, or a 4-bit adder:
      • $s_0$ available after 1 gate delay, $c_1$ available after 2 gate delays.
      • $s_1$ available after 3 gate delays, $c_2$ available after 4 gate delays.
      • $s_2$ available after 5 gate delays, $c_3$ available after 6 gate delays.
      • $s_3$ available after 7 gate delays, $c_4$ available after 8 gate delays.
      For an n-bit adder, $s_{n-1}$ is available after 2n-1 gate delays and $c_n$ is available after 2n gate delays.
    • Fast addition
      • Recall the equations: $s_i = x_i \oplus y_i \oplus c_i$ and $c_{i+1} = x_i y_i + x_i c_i + y_i c_i$
      • The second equation can be written as: $c_{i+1} = x_i y_i + (x_i + y_i) c_i$
      • We can write: $c_{i+1} = G_i + P_i c_i$, where $G_i = x_i y_i$ and $P_i = x_i + y_i$
      • $G_i$ is called the generate function and $P_i$ is called the propagate function.
      • $G_i$ and $P_i$ are computed only from $x_i$ and $y_i$ and not from $c_i$, so they can be computed in one gate delay after X and Y are applied to the inputs of an n-bit adder.
    • Carry lookahead
      $c_{i+1} = G_i + P_i c_i$
      $c_i = G_{i-1} + P_{i-1} c_{i-1}$
      $\Rightarrow c_{i+1} = G_i + P_i (G_{i-1} + P_{i-1} c_{i-1})$
      continuing: $c_{i+1} = G_i + P_i (G_{i-1} + P_{i-1} (G_{i-2} + P_{i-2} c_{i-2}))$
      until: $c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \dots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_0 c_0$
      • All carries can be obtained 3 gate delays after X, Y, and $c_0$ are applied: one gate delay for $P_i$ and $G_i$, and two gate delays in the AND-OR circuit for $c_{i+1}$.
      • All sums can be obtained 1 gate delay after the carries are computed.
      • Independent of n, n-bit addition requires only 4 gate delays.
      • This is called a carry-lookahead adder.
    • Carry-lookahead adder (figure: a 4-bit carry-lookahead adder built from four B cells; each B cell takes $x_i$, $y_i$, $c_i$ and produces $s_i$, $G_i$, $P_i$, and the carry-lookahead logic produces the carries)
      $c_1 = G_0 + P_0 c_0$
      $c_2 = G_1 + P_1 G_0 + P_1 P_0 c_0$
      $c_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 c_0$
      $c_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 c_0$
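The four carry equations can be evaluated directly from the generate and propagate signals, without waiting for a carry to ripple. The following is a sketch; the function name `cla_4bit` and the use of small integers as operands are assumptions for illustration.

```python
def cla_4bit(x, y, c0=0):
    """4-bit carry-lookahead addition of integers 0..15.
    Every carry is computed from G, P and c0 only, as in the expanded equations."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]
    g = [a & b for a, b in zip(xb, yb)]       # generate  G_i = x_i y_i
    p = [a | b for a, b in zip(xb, yb)]       # propagate P_i = x_i + y_i
    c1 = g[0] | p[0] & c0
    c2 = g[1] | p[1] & g[0] | p[1] & p[0] & c0
    c3 = g[2] | p[2] & g[1] | p[2] & p[1] & g[0] | p[2] & p[1] & p[0] & c0
    c4 = (g[3] | p[3] & g[2] | p[3] & p[2] & g[1] | p[3] & p[2] & p[1] & g[0]
          | p[3] & p[2] & p[1] & p[0] & c0)
    carries = [c0, c1, c2, c3]
    s = sum((xb[i] ^ yb[i] ^ carries[i]) << i for i in range(4))
    return s, c4


print(cla_4bit(7, 6))
print(cla_4bit(9, 8))
```

In software the expressions are of course evaluated one after another; the point of the hardware form is that each carry is a two-level AND-OR function of inputs that are all available at the same time.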
    • Blocked Carry-Lookahead adderIn order to add operands longer than 4 bits, we can cascade 4- bit Carry-Lookahead adders. Cascade of Carry-Lookahead adders is called Blocked Carry-Lookahead adder. Carry-out from a 4-bit block can be given as: c4 = G3 + P3 G2 + P3 P2 G1 + P3 P2 P G0 + P3 P2 P1P0 c0 1 Rewrite this as: P0I = P3 P2 P1 P0 G0I = G3 + P3 G2 + P3 P2 G1 + P3 P2 P1G0 Subscript I denotes the blocked carry lookahead and identifies the block. Cascade 4 4-bit adders, c16 can be expressed as: c16 = G3I + P3I G2I + P3I P2I G1I + P3I P2I P10 G0I + P3I P2I P10 P00 c0 Deepak John, Department Of IT, CE Poonjar
    • Blocked Carry-Lookahead adder (figure: four 4-bit adders for bits 15-12, 11-8, 7-4, and 3-0, with carries $c_{16}$, $c_{12}$, $c_8$, $c_4$, $c_0$; each block sends $G_k^I$ and $P_k^I$ to the carry-lookahead logic). After $x_i$, $y_i$, and $c_0$ are applied as inputs:
      • $G_i$ and $P_i$ for each stage are available after 1 gate delay.
      • $P^I$ is available after 2 gate delays and $G^I$ after 3 gate delays.
      • All carries are available after 5 gate delays.
      • $c_{16}$ is available after 5 gate delays.
      • $s_{15}$, which depends on $c_{12}$, is available after 8 (5 + 3) gate delays. (Recall that for a 4-bit carry-lookahead adder, the last sum bit is available 3 gate delays after all inputs are available.)
    • Multiplication
    • Multiplication of unsigned numbers
      • The product of two n-bit numbers is at most a 2n-bit number.
      • Unsigned multiplication can be viewed as addition of shifted versions of the multiplicand.
    • Multiplication of unsigned numbers (contd.). Rules to implement multiplication:
      • The value of the partial product at the start is 0.
      • If bit 0 of the multiplier is 1, add the multiplicand to 0; if bit 0 is 0, the current partial product remains 0.
      • If bit i (i > 0) of the multiplier is 1, shift the multiplicand and add the shifted multiplicand to the current value of the partial product.
      • If bit i of the multiplier is 0, shift the multiplicand and add 0 to the current partial product.
      • Hand the partial product over to the next stage.
    • Multiplication of unsigned numbers: typical multiplication cell (figure: a full adder FA whose inputs are a bit of the incoming partial product PPi, the jth multiplicand bit gated by the ith multiplier bit, and the carry in; its outputs are a bit of the outgoing partial product PP(i+1) and the carry out)
    • Combinational array multiplier (figure: an array of multiplication cells; the multiplicand bits $m_3 \dots m_0$ enter with an initial partial product PP0 of 0, the rows are controlled by the multiplier bits $q_0 \dots q_3$ and produce PP1, PP2, PP3 and the final product bits). The product is $p_7, p_6, \dots, p_0$. The multiplicand is shifted by displacing it through an array of adders.
    • Combinatorial array multipliers:
      • are extremely inefficient;
      • have a high gate count for multiplying numbers of practical size, such as 32-bit or 64-bit numbers;
      • perform only one function, namely the unsigned integer product.
      Gate efficiency can be improved by mixing combinatorial array techniques with sequential techniques that require less combinational logic.
    • Sequential Circuit Multiplier (figure: carry flip-flop C, register A (initially 0, bits $a_{n-1} \dots a_0$), and multiplier register Q (bits $q_{n-1} \dots q_0$) shift right together; an n-bit adder adds to A either the multiplicand M (bits $m_{n-1} \dots m_0$) or 0, selected by a MUX under the Add/Noadd control of the control sequencer)
    • Sequential multiplication example, M = 1101 (13), Q = 1011 (11):
                               C  A     Q
        Initial configuration  0  0000  1011
        First cycle    Add     0  1101  1011
                       Shift   0  0110  1101
        Second cycle   Add     1  0011  1101
                       Shift   0  1001  1110
        Third cycle    No add  0  1001  1110
                       Shift   0  0100  1111
        Fourth cycle   Add     1  0001  1111
                       Shift   0  1000  1111
      Product: A, Q = 1000 1111 (143)
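The add-and-shift cycles of the C, A, Q registers can be reproduced in a few lines. This is a sketch of the register behaviour only; the function name and the default width n = 4 are assumptions for this example.

```python
def sequential_multiply(m, q, n=4):
    """Unsigned shift-and-add multiplication using registers C, A and Q.
    In each cycle M is added to A if q0 = 1, then C, A, Q shift right together."""
    mask = (1 << n) - 1
    c, a = 0, 0
    for _ in range(n):
        if q & 1:                                  # q0 = 1: Add, otherwise No add
            total = a + m
            c, a = total >> n, total & mask
        q = ((a & 1) << (n - 1)) | (q >> 1)        # a0 shifts into the top of Q
        a = (c << (n - 1)) | (a >> 1)              # C shifts into the top of A
        c = 0
    return (a << n) | q                            # product is held in A, Q


print(format(sequential_multiply(0b1101, 0b1011), "08b"))
```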
    • Signed Multiplication
    • Signed Multiplication Considering 2’s-complement signed operands, Extend the sign bit value of the multiplicand to the left as far as the product will extend (-13)(+11) 1 0 0 1 1 ( - 13) 0 1 0 1 1 ( + 11) 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 1 ( - 143) Deepak John, Department Of IT, CE Poonjar
    • Signed Multiplication For a negative multiplier, a straightforward solution is to form the 2’s-complement of both the multiplier and the multiplicand and proceed as in the case of a positive multiplier. This is possible because complementation of both operands does not change the value or the sign of the product. A technique that works equally well for both negative and positive multipliers – Booth algorithm. Deepak John, Department Of IT, CE Poonjar
    • Booth Algorithm
      Booth multiplier recoding table:
        Multiplier bit i   bit i-1   Version of multiplicand selected by bit i
        0                  0          0 × M
        0                  1         +1 × M
        1                  0         -1 × M
        1                  1          0 × M
      Booth multiplication with a negative multiplier, (+13)(-6):
                  0 1 1 0 1   (+13)
                × 1 1 0 1 0   (-6), recoded as 0 -1 +1 -1 0
        0 0 0 0 0 0 0 0 0 0
        1 1 1 1 1 0 0 1 1     (2's complement of the multiplicand)
        0 0 0 0 1 1 0 1
        1 1 1 0 0 1 1
        0 0 0 0 0 0
        1 1 1 0 1 1 0 0 1 0   (-78)
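Booth recoding and the resulting sum of signed, shifted multiplicands can be sketched as follows. The function names are assumptions, and Python's unbounded signed integers stand in for the sign-extended 2's-complement summands of the hardware version.

```python
def booth_recode(multiplier, n):
    """Recode an n-bit 2's-complement multiplier (given as a signed int)
    into Booth digits -1, 0, +1, least significant digit first."""
    digits, prev = [], 0                 # implied 0 to the right of the LSB
    for i in range(n):
        bit = (multiplier >> i) & 1
        digits.append(prev - bit)        # 01 -> +1, 10 -> -1, 00 and 11 -> 0
        prev = bit
    return digits


def booth_multiply(multiplicand, multiplier, n):
    """Sum the selected versions of the multiplicand, each shifted by its position."""
    return sum(d * multiplicand * (1 << i)
               for i, d in enumerate(booth_recode(multiplier, n)))


print(booth_recode(-6, 5))       # least significant digit first
print(booth_multiply(13, -6, 5))
```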
    • Fast Multiplication
    • Bit-Pair Recoding of Multipliers. Bit-pair recoding halves the maximum number of summands (versions of the multiplicand).
        Multiplier with sign extension:   1  1  1  0  1  0   (implied 0 to the right of the LSB)
        Booth recoding:                   0  0 -1 +1 -1  0
        Bit-pair recoding:                   0    -1    -2
      (a) Example of bit-pair recoding derived from Booth recoding
    • (b) Table of multiplicand selection decisions
        Multiplier bit pair   Multiplier bit on the right   Multiplicand selected
        i+1   i               i-1                           at position i
        0     0               0                              0 × M
        0     0               1                             +1 × M
        0     1               0                             +1 × M
        0     1               1                             +2 × M
        1     0               0                             -2 × M
        1     0               1                             -1 × M
        1     1               0                             -1 × M
        1     1               1                              0 × M
      Example, (+13)(-6) with the multiplier recoded as 0 -1 -2:
                  0 1 1 0 1   (+13)
                × 1 1 0 1 0   (-6), recoded as 0 -1 -2
        1 1 1 1 1 0 0 1 1 0
        1 1 1 1 0 0 1 1
        0 0 0 0 0 0
        1 1 1 0 1 1 0 0 1 0   (-78)
      Figure 6.15. Multiplication requiring only n/2 summands.
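The selection table corresponds to the digit value $-2 b_{i+1} + b_i + b_{i-1}$ for each bit pair, which the sketch below uses directly. The function name and the use of a signed Python integer (whose right shift supplies the sign extension automatically) are assumptions of this example.

```python
def bit_pair_recode(multiplier, n):
    """Recode an n-bit 2's-complement multiplier (a signed int) into digits
    -2..+2, one per bit pair, least significant pair first."""
    def bit(i):
        return (multiplier >> i) & 1 if i >= 0 else 0   # implied 0 right of the LSB

    pairs = (n + 1) // 2                 # sign-extend to an even number of bits
    return [-2 * bit(2 * k + 1) + bit(2 * k) + bit(2 * k - 1) for k in range(pairs)]


digits = bit_pair_recode(-6, 5)
print(digits)                            # least significant pair first
print(sum(d * 13 * 4 ** k for k, d in enumerate(digits)))
```

Each digit selects one summand, so a 5-bit multiplier needs three summands instead of five.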
    • Carry-Save Addition of Summands. CSA speeds up the addition process. (Figures: adder arrays producing the product bits P7 ... P0, first with ripple-carry rows and then with carry-save rows.)
    • When adding many summands, we can:
      • Group the summands in threes and perform carry-save addition on each of these groups in parallel to generate a set of S and C vectors in one full-adder delay.
      • Group all of the S and C vectors into threes and perform carry-save addition on them, generating a further set of S and C vectors in one more full-adder delay.
      • Continue with this process until only two vectors remain.
      • Add the last two vectors in a ripple-carry adder (RCA) or carry-lookahead adder (CLA) to produce the desired product.
    • Multiplication example, 45 × 63:
                    1 0 1 1 0 1   (45)  M
                  × 1 1 1 1 1 1   (63)  Q
      The six summands A, B, C, D, E, F are all 1 0 1 1 0 1, with A unshifted and each following summand shifted one more position to the left.
        Product: 1 0 1 1 0 0 0 1 0 0 1 1   (2,835)
      Figure 6.17. A multiplication example used to illustrate carry-save addition as shown in Figure 6.18.
    • Carry-save addition of the six summands of M = 101101, Q = 111111. All vectors are written zero-padded to 12 bits, and each carry vector is shown already shifted one position to the left.
        A  = 000000101101
        B  = 000001011010
        C  = 000010110100
        S1 = 000011000011   (sum vector of A, B, C)
        C1 = 000001111000   (carry vector of A, B, C)
        D  = 000101101000
        E  = 001011010000
        F  = 010110100000
        S2 = 011000011000   (sum vector of D, E, F)
        C2 = 001111000000   (carry vector of D, E, F)
        S3 = 011010100011   (sum vector of S1, C1, S2)
        C3 = 000010110000   (carry vector of S1, C1, S2)
        S4 = 010111010011   (sum vector of S3, C3, C2)
        C4 = 010101000000   (carry vector of S3, C3, C2)
        Product = S4 + C4 = 101100010011
      Figure 6.18. The multiplication example from Figure 6.17 performed using carry-save addition.
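The grouping-in-threes procedure can be sketched on plain integers, where a carry-save adder is a bitwise XOR for the sum vector and a shifted bitwise majority for the carry vector. The function names are assumptions; the grouping order below happens to match the 45 × 63 example (A, B, C and D, E, F first), but other groupings would give the same product.

```python
def carry_save_add(a, b, c):
    """One level of carry-save addition: three vectors in, (S, C) out.
    No carry propagates between bit positions."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def csa_tree_sum(summands):
    """Reduce summands in groups of three until two vectors remain,
    then add those two with an ordinary carry-propagate addition."""
    vectors = list(summands)
    while len(vectors) > 2:
        reduced = []
        while len(vectors) >= 3:
            s, c = carry_save_add(*vectors[:3])
            vectors = vectors[3:]
            reduced += [s, c]
        vectors = reduced + vectors      # leftover vectors join the next level
    return sum(vectors)


summands = [45 << i for i in range(6)]   # A..F for 45 x 63
print(carry_save_add(*summands[:3]))     # S1 and C1 as integers
print(csa_tree_sum(summands))
```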
    • Integer Division
    • Longhand Division Steps
      • Position the divisor appropriately with respect to the dividend and perform a subtraction.
      • If the remainder is zero or positive, a quotient bit of 1 is determined, the remainder is extended by another bit of the dividend, the divisor is repositioned, and another subtraction is performed.
      • If the remainder is negative, a quotient bit of 0 is determined, the dividend is restored by adding back the divisor, and the divisor is repositioned for another subtraction.
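    • Division of Unsigned Binary Integers
        Divisor: 1011   Dividend: 10010011   Quotient: 00001101
        10010 - 1011 leaves the partial remainder 0111; bringing down the next bit gives 001110
        1110 - 1011 leaves the partial remainder 0011; bringing down the next two bits gives 001111
        1111 - 1011 leaves the remainder 100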
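    • Circuit Arrangement (blocks: register A, bits $a_n \dots a_0$, and register Q, bits $q_{n-1} \dots q_0$, shift left together; Q initially holds the dividend and ends up holding the quotient; an (n+1)-bit adder adds or subtracts the divisor M, bits $0, m_{n-1} \dots m_0$, under Add/Subtract control from the control sequencer, which also sets the quotient bit)
      Figure 6.21. Circuit arrangement for binary division.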
    • Restoring Division. Repeat these steps n times:
        1. Shift A and Q left one binary position.
        2. Subtract M from A, and place the answer back in A.
        3. If the sign of A is 1, set q0 to 0 and add M back to A (restore A); otherwise, set q0 to 1.
    • Example: 1000 divided by 11, with M = 00011
                                  A      Q
        Initially                 00000  1000
        First cycle    Shift      00001  000_
                       Subtract   11110
                       Set q0     11110  0000
                       Restore    00001  0000
        Second cycle   Shift      00010  000_
                       Subtract   11111
                       Set q0     11111  0000
                       Restore    00010  0000
        Third cycle    Shift      00100  000_
                       Subtract   00001
                       Set q0     00001  0001
        Fourth cycle   Shift      00010  001_
                       Subtract   11111
                       Set q0     11111  0010
                       Restore    00010  0010
        Remainder = 00010, Quotient = 0010
      Figure 6.22. A restoring-division example.
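The shift, subtract, and restore steps translate almost line for line into code. This is a sketch: the function name is an assumption, and a signed Python integer plays the role of the (n+1)-bit register A, so "the sign of A is 1" becomes `a < 0`.

```python
def restoring_divide(dividend, divisor, n=4):
    """Restoring division of an n-bit unsigned dividend. Returns (quotient, remainder)."""
    mask = (1 << n) - 1
    a, q = 0, dividend
    for _ in range(n):
        a = (a << 1) | ((q >> (n - 1)) & 1)   # shift A and Q left one position
        q = (q << 1) & mask
        a -= divisor                          # subtract M from A
        if a < 0:
            a += divisor                      # restore A, q0 stays 0
        else:
            q |= 1                            # set q0 to 1
    return q, a


print(restoring_divide(0b1000, 0b11))
```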
    • Nonrestoring Division. Avoid the need for restoring A after an unsuccessful subtraction.
      Step 1 (repeat n times):
        • If the sign of A is 0, shift A and Q left one bit position and subtract M from A; otherwise, shift A and Q left and add M to A.
        • Now, if the sign of A is 0, set q0 to 1; otherwise, set q0 to 0.
      Step 2: If the sign of A is 1, add M to A.
    • Example: 1000 divided by 11, with M = 00011
                                  A      Q
        Initially                 00000  1000
        First cycle    Shift      00001  000_
                       Subtract   11110
                       Set q0     11110  0000
        Second cycle   Shift      11100  000_
                       Add        11111
                       Set q0     11111  0000
        Third cycle    Shift      11110  000_
                       Add        00001
                       Set q0     00001  0001
        Fourth cycle   Shift      00010  001_
                       Subtract   11111
                       Set q0     11111  0010
        Restore remainder: Add    00010
        Remainder = 00010, Quotient = 0010
      A nonrestoring-division example.
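The two steps of nonrestoring division can be sketched the same way. As before, the function name is an assumption, and a signed Python integer models the 2's-complement register A, so shifting A left with the top bit of Q entering on the right is written as `2 * a + msb`.

```python
def nonrestoring_divide(dividend, divisor, n=4):
    """Nonrestoring division of an n-bit unsigned dividend. Returns (quotient, remainder)."""
    mask = (1 << n) - 1
    a, q = 0, dividend
    for _ in range(n):                        # Step 1, repeated n times
        msb = (q >> (n - 1)) & 1
        if a >= 0:
            a = 2 * a + msb - divisor         # shift left, then subtract M
        else:
            a = 2 * a + msb + divisor         # shift left, then add M
        q = (q << 1) & mask
        if a >= 0:
            q |= 1                            # sign of A is 0: q0 = 1
    if a < 0:                                 # Step 2: final correction of the remainder
        a += divisor
    return q, a


print(nonrestoring_divide(0b1000, 0b11))
```

A negative A is carried into the next cycle and corrected by the following addition, so at most one restoring addition is needed, at the very end.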
    • Floating-Point Numbers and Operations
    • Fractions
      • If b is a binary vector, then we have seen that it can be interpreted as an unsigned integer by: $V(b) = b_{31} \cdot 2^{31} + b_{30} \cdot 2^{30} + b_{29} \cdot 2^{29} + \dots + b_1 \cdot 2^{1} + b_0 \cdot 2^{0}$
      • This vector has an implicit binary point to its immediate right: $b_{31} b_{30} b_{29} \dots b_1 b_0 .$
      • Suppose instead that the binary vector is interpreted with the implicit binary point just to the left of the leftmost bit: $. b_{31} b_{30} b_{29} \dots b_1 b_0$
      • The value of b is then given by: $V(b) = b_{31} \cdot 2^{-1} + b_{30} \cdot 2^{-2} + b_{29} \cdot 2^{-3} + \dots + b_1 \cdot 2^{-31} + b_0 \cdot 2^{-32}$
    • Range of fractions
      • The value of the unsigned binary fraction is: $V(b) = b_{31} \cdot 2^{-1} + b_{30} \cdot 2^{-2} + b_{29} \cdot 2^{-3} + \dots + b_1 \cdot 2^{-31} + b_0 \cdot 2^{-32}$
      • The range of the numbers represented in this format is: $0 \le V(b) \le 1 - 2^{-32} \approx 0.9999999998$
      • In general, for an n-bit binary fraction (a number with an assumed binary point at the immediate left of the vector), the range of values is: $0 \le V(b) \le 1 - 2^{-n}$
    • Scientific notation
      • The previous representations have a fixed point: the point is either to the immediate right or to the immediate left. This is called fixed-point representation.
      • Fixed-point representation has the drawback that it can represent only a finite, and quite small, range of numbers.
      • A more convenient representation is the scientific representation, where numbers are represented in the form: $x = m_1 . m_2 m_3 m_4 \times b^{\pm e}$
      • The components of these numbers are the mantissa (m), the implied base (b), and the exponent (e).
    • Significant digits
      • A number such as the following is said to have 7 significant digits: $x = \pm 0.m_1 m_2 m_3 m_4 m_5 m_6 m_7 \times b^{\pm e}$
      • Fractions in the range 0.0 to 0.9999999 need about 24 bits of precision (in binary). For example, the binary fraction with 24 ones: 0.111111111111111111111111 (binary) ≈ 0.9999999404
      • Not every real number between 0 and 0.9999999404 can be represented by a 24-bit fractional number.
      • The smallest non-zero number that can be represented is: 0.000000000000000000000001 (binary) ≈ $5.96046 \times 10^{-8}$
      • Every other non-zero number is constructed in increments of this value.
    • Sign and exponent digits
      • In a 32-bit number, suppose we allocate 24 bits to represent a fractional mantissa.
      • Assume that the mantissa is represented in sign-and-magnitude format, and that we have allocated one bit to represent the sign.
      • We allocate 7 bits to represent the exponent, and assume that the exponent is represented as a 2's-complement integer.
      • No bits are allocated to represent the base; we assume for now that the base is implied and equal to 2.
      • Since a 7-bit 2's-complement number can represent values in the range -64 to 63, the range of numbers that can be represented is: $0.0000001 \times 2^{-64} \le |x| \le 0.9999999 \times 2^{63}$
      • In decimal representation this range is: $0.5421 \times 10^{-20} \le |x| \le 9.2234 \times 10^{18}$
    • A sample representation
        | Sign (1 bit) | Exponent (7 bits) | Fractional mantissa (24 bits) |
      • 24-bit mantissa with an implied binary point to the immediate left
      • 7-bit exponent in 2's-complement form, and the implied base is 2.
    • Normalization
      • Consider the number: $x = 0.0004056781 \times 10^{12}$
      • If the number is to be represented using only 7 significant mantissa digits, the representation, ignoring rounding, is: $x = 0.0004056 \times 10^{12}$
      • If the number is shifted so that as many significant digits as possible are brought into the 7 available slots: $x = 0.4056781 \times 10^{9}$, which is the same value as $0.0004056781 \times 10^{12}$ with no digits lost.
      • The exponent of x was decreased by 1 for every left shift of x.
      • A number which is brought into a form in which all of the available mantissa digits are optimally used (this is different from all being occupied, which may not hold) is called a normalized number.
      • The same methodology holds for binary mantissas, where the bits in parentheses lie beyond the 10 available mantissa bits: $0001101000(10110) \times 2^{8} = 1101000101(10) \times 2^{5}$
    • Normalization (contd.)
      • A floating-point number is in normalized form if the most significant 1 in the mantissa is in the most significant bit of the mantissa.
      • All normalized floating-point numbers in this system will be of the form: 0.1xxxxx.......xx
      • The range of numbers representable in this system, if every number must be normalized, is: $0.5 \times 2^{-64} \le |x| < 1 \times 2^{63}$
    • Normalization, overflow and underflow
      • The procedure for normalizing a floating point number is:
          Do (until MSB of mantissa == 1)
              Shift the mantissa left (or right)
              Decrement (increment) the exponent by 1
          end do
      • Applying the normalization procedure to $.000111001110....0010 \times 2^{-62}$ gives $.111001110........ \times 2^{-65}$. But we cannot represent an exponent of -65; in trying to normalize the number we have underflowed our representation.
      • Applying the normalization procedure to $1.00111000............ \times 2^{63}$ gives $0.100111.............. \times 2^{64}$. This overflows the representation.
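The normalization loop and its two failure cases can be sketched as follows. This assumes the 24-bit fractional mantissa and the -64 to 63 exponent range of the sample format; the mantissa is held as an integer scaled by 2^24, and the names `normalize` and `UnderflowError` are illustrative, not from the slides.

```python
MANTISSA_BITS = 24          # fractional mantissa width of the sample format
EXP_MIN, EXP_MAX = -64, 63  # 7-bit 2's complement exponent range


class UnderflowError(ArithmeticError):
    """Raised when the normalized exponent falls below EXP_MIN (hypothetical name)."""


def normalize(mantissa: int, exponent: int):
    """Normalize a mantissa stored as an integer scaled by 2**MANTISSA_BITS.

    A value >= 2**MANTISSA_BITS means the mantissa is >= 1.0 and needs
    right shifts; a value with a clear top bit needs left shifts.
    """
    if mantissa == 0:
        return mantissa, exponent       # zero cannot be normalized

    one = 1 << MANTISSA_BITS            # integer that stands for 1.0
    msb = 1 << (MANTISSA_BITS - 1)      # weight of the mantissa MSB (0.5)

    while mantissa >= one:              # shift right, increment exponent
        mantissa >>= 1
        exponent += 1
    while mantissa < msb:               # shift left, decrement exponent
        mantissa <<= 1
        exponent -= 1

    if exponent > EXP_MAX:
        raise OverflowError("exponent exceeds +63")
    if exponent < EXP_MIN:
        raise UnderflowError("exponent is below -64")
    return mantissa, exponent
```

Calling `normalize(0b000111001110 << 12, -62)` reproduces the underflow case from the slide, since three left shifts would require an exponent of -65.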
    • Changing the implied base
      • So far we have assumed an implied base of 2, that is, our floating point numbers are of the form: $x = m \cdot 2^{e}$
      • If we choose an implied base of 16, then: $x = m \cdot 16^{e}$
      • Then: $y = (m \cdot 16) \cdot 16^{e-1} = (m \cdot 2^{4}) \cdot 16^{e-1} = m \cdot 16^{e} = x$
      • Thus, every four left shifts of a binary mantissa result in a decrease of 1 in a base-16 exponent.
      • Normalization in this case means shifting the mantissa, four bits at a time, until there is a 1 somewhere in the first four bits of the mantissa.
    • Excess notation
      • Rather than representing an exponent in 2's complement form, it turns out to be more beneficial to represent the exponent in excess notation.
      • If 7 bits are allocated to the exponent, exponents can be represented in the range -64 to +63, that is: $-64 \le e \le 63$
      • The exponent can also be represented using the following coding, called excess-64: $E' = E_{true} + 64$
      • In general, excess-p coding is: $E' = E_{true} + p$
      • A true exponent of -64 is represented as 0, 0 is represented as 64, and 63 is represented as 127.
      • This enables efficient comparison of the relative sizes of two floating point numbers.
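A short sketch of the excess-64 coding above, with a 2's complement encoder alongside it to show why the excess form compares more easily. The helper names are hypothetical.

```python
BIAS = 64  # excess-64, as in the slide


def to_excess(true_exponent: int) -> int:
    """Encode a true exponent in -64..63 as an unsigned 7-bit field."""
    if not -64 <= true_exponent <= 63:
        raise ValueError("exponent out of range for 7 bits")
    return true_exponent + BIAS


def from_excess(field: int) -> int:
    """Recover the true exponent from an excess-64 field."""
    return field - BIAS


def to_twos_complement(true_exponent: int) -> int:
    """Encode the same exponent as a 7-bit 2's complement field."""
    return true_exponent & 0x7F


# Excess-64 fields order the same way as the true exponents, so an
# unsigned integer comparison is enough. 2's complement fields do not.
excess_order_ok = to_excess(-1) < to_excess(1)
twos_order_ok = to_twos_complement(-1) < to_twos_complement(1)
```

Here `excess_order_ok` is true while `twos_order_ok` is false, because -1 encodes as 127 in 7-bit 2's complement.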
    • IEEE notation
      • IEEE floating point notation is the standard representation in use. There are two representations: single precision and double precision. Both have an implied base of 2.
      • Single precision: 32 bits (23-bit mantissa, 8-bit exponent in excess-127 representation).
      • Double precision: 64 bits (52-bit mantissa, 11-bit exponent in excess-1023 representation).
      • Fractional mantissa, with an implied binary point at its immediate left.
      Field:  Sign | Exponent | Mantissa
      Bits:   1    | 8 or 11  | 23 or 52
    • Peculiarities of IEEE notation
      • Floating point numbers have to be represented in a normalized form to maximize the use of available mantissa digits.
      • In a base-2 representation, this implies that the MSB of the mantissa is always equal to 1, so we can do away with storing it.
      • IEEE notation assumes that all numbers are normalized so that the MSB of the mantissa is a 1, and does not store this bit.
      • So the MSB of the stored mantissa field may be either a 0 or a 1.
      • The values of the numbers represented in the IEEE single precision notation are of the form: $\pm 1.M \times 2^{(E - 127)}$, where E is the stored exponent field.
      • The hidden 1 forms the integer part of the mantissa.
      • Note that excess-127 and excess-1023 (not excess-128 or excess-1024) are used to represent the exponent.
    • Exponent field
      • In the IEEE representation, the exponent is in excess-127 (excess-1023) notation.
      • The actual exponents represented are $-126 \le E \le 127$ and $-1022 \le E \le 1023$, not $-127 \le E \le 128$ and $-1023 \le E \le 1024$.
      • This is because IEEE reserves the exponents -127 and 128 (and -1023 and 1024), that is, the stored field values 0 and 255 (0 and 2047 in double precision), to represent special conditions:
        - Exact zero
        - Infinity
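The single precision fields described above can be inspected with Python's standard `struct` module. This is a minimal sketch; the function names `fields` and `value_from_fields` are illustrative, and the reconstruction handles only normalized numbers (stored exponent 1 to 254).

```python
import struct


def fields(x: float):
    """Split the IEEE single precision encoding of x into (sign, E, M)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exp_field = (bits >> 23) & 0xFF   # 8-bit excess-127 exponent
    mantissa = bits & 0x7FFFFF        # 23 stored mantissa bits
    return sign, exp_field, mantissa


def value_from_fields(sign: int, exp_field: int, mantissa: int) -> float:
    """Rebuild (+/-) 1.M x 2**(E - 127) for a normalized number."""
    if not 1 <= exp_field <= 254:
        raise ValueError("field values 0 and 255 are reserved")
    significand = 1 + mantissa / (1 << 23)   # restore the hidden 1
    value = significand * 2.0 ** (exp_field - 127)
    return -value if sign else value
```

For instance, -6.5 is 1.625 x 2^2, so its stored exponent is 129 and its stored mantissa is the fraction 0.625. Exact zero and infinity show up with the reserved exponent fields 0 and 255.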
    • Floating point arithmetic
      • Addition: $3.1415 \times 10^{8} + 1.19 \times 10^{6} = 3.1415 \times 10^{8} + 0.0119 \times 10^{8} = 3.1534 \times 10^{8}$
      • Multiplication: $(3.1415 \times 10^{8}) \times (1.19 \times 10^{6}) = (3.1415 \times 1.19) \times 10^{(8+6)}$
      • Division: $(3.1415 \times 10^{8}) / (1.19 \times 10^{6}) = (3.1415 / 1.19) \times 10^{(8-6)}$
      • Biased exponent problem: suppose a true exponent e is represented in excess-p notation, that is, as e+p. Then consider what happens under multiplication: $(a \times 10^{(x+p)}) \cdot (b \times 10^{(y+p)}) = (a \cdot b) \times 10^{(x+p+y+p)} = (a \cdot b) \times 10^{(x+y+2p)}$
      • Representing the result in excess-p notation implies that the exponent should be x+y+p. Instead it is x+y+2p.
      • The bias must therefore be corrected for in floating point arithmetic.
    • Floating point arithmetic: ADD/SUB rule
      • Choose the number with the smaller exponent.
      • Shift its mantissa right until the exponents of both numbers are equal.
      • Add or subtract the mantissas.
      • Determine the sign of the result.
      • Normalize the result if necessary and truncate/round to the number of mantissa bits.
      • Note: This does not consider the possibility of overflow/underflow.
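The ADD/SUB rule can be sketched on a toy format. The assumptions here are not from the slides: each number is a tuple (sign, excess-64 exponent, 24-bit fractional mantissa held as an integer), bits shifted out are simply chopped, and, like the rule itself, the sketch ignores overflow and underflow. `fp_add` and `to_float` are hypothetical names.

```python
BIAS = 64   # excess-64 exponent (assumed toy format)
BITS = 24   # fractional mantissa width


def to_float(n):
    """Convert a (sign, biased exponent, mantissa) tuple to a Python float."""
    sign, exp, mant = n
    value = mant / (1 << BITS) * 2.0 ** (exp - BIAS)
    return -value if sign else value


def fp_add(a, b):
    """Add two toy floating point numbers following the ADD/SUB rule."""
    (sa, ea, ma), (sb, eb, mb) = a, b

    # Step 1: let b be the operand with the smaller exponent.
    if ea < eb:
        (sa, ea, ma), (sb, eb, mb) = (sb, eb, mb), (sa, ea, ma)

    # Step 2: shift its mantissa right until the exponents match.
    mb >>= ea - eb

    # Step 3: add or subtract the mantissas as signed magnitudes.
    total = (-ma if sa else ma) + (-mb if sb else mb)

    # Step 4: determine the sign of the result.
    sign = 1 if total < 0 else 0
    magnitude = abs(total)
    if magnitude == 0:
        return (0, 0, 0)

    # Step 5: normalize so the mantissa MSB is 1 and it fits in BITS bits.
    exp = ea
    while magnitude >= (1 << BITS):
        magnitude >>= 1
        exp += 1
    while magnitude < (1 << (BITS - 1)):
        magnitude <<= 1
        exp -= 1
    return (sign, exp, magnitude)
```

With 1.0 stored as 0.5 x 2^1 and 3.0 as 0.75 x 2^2, the sum comes out as 0.5 x 2^3 after one right shift during normalization.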
    • Floating point arithmetic: MUL rule
      • Add the exponents.
      • Subtract the bias.
      • Multiply the mantissas and determine the sign of the result.
      • Normalize the result (if necessary).
      • Truncate/round the mantissa of the result.
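A minimal sketch of the MUL rule, under assumptions that are not in the slides: operands are (sign, excess-64 exponent, 24-bit fractional mantissa) tuples, both inputs are already normalized, truncation is by chopping, and overflow/underflow are not checked. The name `fp_mul` is hypothetical.

```python
BIAS = 64   # excess-64 exponent (assumed toy format)
BITS = 24   # fractional mantissa width


def fp_mul(a, b):
    """Multiply two normalized toy floating point numbers."""
    (sa, ea, ma), (sb, eb, mb) = a, b

    # Add the biased exponents, then subtract the bias once so the
    # result carries a single bias instead of two.
    exp = ea + eb - BIAS

    # The sign is negative exactly when the operand signs differ.
    sign = sa ^ sb

    # Product of two fractions scaled by 2**BITS is scaled by 2**(2*BITS).
    product = ma * mb
    if product == 0:
        return (0, 0, 0)

    # For normalized inputs the product lies in [0.25, 1), so at most
    # one left shift is needed to normalize it.
    if product < (1 << (2 * BITS - 1)):
        product <<= 1
        exp -= 1

    # Truncate (chop) back to BITS mantissa bits.
    return (sign, exp, product >> BITS)
```

For example, 1.0 (0.5 x 2^1) times 3.0 (0.75 x 2^2) gives a raw product of 0.375 x 2^3, which one left shift normalizes to 0.75 x 2^2.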
    • Floating point arithmetic: DIV rule
      • Subtract the exponents.
      • Add the bias.
      • Divide the mantissas and determine the sign of the result.
      • Normalize the result if necessary.
      • Truncate/round the mantissa of the result.
      • Note: Multiplication and division do not require alignment of the mantissas the way addition and subtraction do.
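A minimal sketch of the DIV rule, under assumptions that are not in the slides: operands are (sign, excess-64 exponent, 24-bit fractional mantissa) tuples, both inputs are normalized, the quotient is chopped, and overflow/underflow are not checked. The name `fp_div` is hypothetical.

```python
BIAS = 64   # excess-64 exponent (assumed toy format)
BITS = 24   # fractional mantissa width


def fp_div(a, b):
    """Divide normalized toy floating point number a by b."""
    (sa, ea, ma), (sb, eb, mb) = a, b

    # Subtracting biased exponents cancels the bias, so add it back.
    exp = ea - eb + BIAS

    # The sign is negative exactly when the operand signs differ.
    sign = sa ^ sb

    # Pre-scale the dividend so the integer quotient is scaled by 2**BITS.
    # A zero divisor raises ZeroDivisionError here.
    quotient = (ma << BITS) // mb
    if quotient == 0:
        return (0, 0, 0)

    # For normalized inputs the quotient lies in (0.5, 2), so at most
    # one right shift is needed to bring it back below 1.0.
    if quotient >= (1 << BITS):
        quotient >>= 1
        exp += 1
    return (sign, exp, quotient)
```

For example, 3.0 (0.75 x 2^2) divided by 1.0 (0.5 x 2^1) gives a raw quotient of 1.5 x 2^1, which one right shift normalizes to 0.75 x 2^2.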
    • Guard bits
      • While adding two floating point numbers with 24-bit mantissas, we shift the mantissa of the number with the smaller exponent to the right until the two exponents are equalized.
      • This implies that mantissa bits may be lost during the right shift (that is, bits of precision may be shifted out of the mantissa being shifted).
      • To prevent this, floating point operations are implemented by keeping guard bits, that is, extra bits of precision at the least significant end of the mantissa.
      • The arithmetic on the mantissas is performed with these extra bits of precision.
      • After an arithmetic operation, the guarded mantissas are:
        - Normalized (if necessary)
        - Converted back to a 24-bit mantissa by a process called truncation/rounding.
    • Truncation/rounding
      • Straight chopping: the guard bits (excess bits of precision) are dropped.
      • Von Neumann rounding: if the guard bits are all 0, they are dropped. However, if any of the guard bits is a 1, then the LSB of the retained bits is set to 1.
      • Rounding: if there is a 1 in the MSB of the guard bits, then a 1 is added to the LSB of the retained bits.
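The three methods above can be compared side by side on an integer whose low `guard_bits` bits are the guard bits. This is a small sketch with hypothetical function names; it assumes at least one guard bit is present.

```python
def chop(value: int, guard_bits: int) -> int:
    """Straight chopping: drop the guard bits."""
    return value >> guard_bits


def von_neumann(value: int, guard_bits: int) -> int:
    """Von Neumann rounding: set the retained LSB to 1 if any guard bit is 1."""
    kept = value >> guard_bits
    guard = value & ((1 << guard_bits) - 1)
    return kept | 1 if guard else kept


def round_nearest(value: int, guard_bits: int) -> int:
    """Rounding: add 1 to the retained LSB if the MSB of the guard bits is 1.

    The result may need one more bit than the retained width when the
    addition carries out, which is the renormalization case.
    """
    kept = value >> guard_bits
    guard_msb = (value >> (guard_bits - 1)) & 1
    return kept + guard_msb
```

With retained bits 101010 and guard bits 011, chopping and rounding both keep 101010, while von Neumann rounding gives 101011.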
    • Rounding Rounding is evidently the most accurate truncation method. However,  Rounding requires an addition operation.  Rounding may require a renormalization, if the addition operation de- normalizes the truncated number. 0.111111100000 rounds to 0.111111 + 0.000001 =1.000000 which must be renormalized to 0.100000 IEEE uses the rounding method. Deepak John, Department Of IT, CE Poonjar