Advanced computer architecture

  1. CSE 8383 - Advanced Computer Architecture. Week 3, week of Jan 26, 2004. engr.smu.edu/~rewini/8383
  2. Contents: Linear Pipelines; Nonlinear Pipelines; Instruction Pipelines; Arithmetic Operations; Design of Multifunction Pipeline
  3. Linear Pipeline: Processing stages are linearly connected and perform a fixed function. Synchronous pipeline: clocked latches between Stage i and Stage i+1; equal delays in all stages. Asynchronous pipeline: handshaking between stages.
  4. Latches: S1 → L1 → S2 → L2 → S3. The slowest stage determines the delay; with equal delays in all stages, that delay sets the clock period.
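  5. Reservation Table (rows are stages, columns are time steps): S1 is used at time 1, S2 at time 2, S3 at time 3, S4 at time 4.
  6. 5 tasks on 4 stages: S1 is busy at times 1 to 5, S2 at times 2 to 6, S3 at times 3 to 7, S4 at times 4 to 8.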
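The "5 tasks on 4 stages" table can be reproduced with a few lines of Python. This is a minimal sketch, assuming one new task enters the pipeline per time step and every stage takes exactly one step; the function names `pipeline_cycles` and `reservation_rows` are made up for this illustration.

```python
def pipeline_cycles(num_stages: int, num_tasks: int) -> int:
    """Time steps needed for num_tasks to pass through a linear pipeline.

    Assumes one task enters per time step and each stage takes one step:
    the first task needs num_stages steps, each later task adds one more.
    """
    if num_stages < 1 or num_tasks < 1:
        raise ValueError("need at least one stage and one task")
    return num_stages + num_tasks - 1


def reservation_rows(num_stages: int, num_tasks: int) -> dict:
    """Map each stage name to the list of time steps in which it is busy."""
    return {
        f"S{s}": list(range(s, s + num_tasks))
        for s in range(1, num_stages + 1)
    }


if __name__ == "__main__":
    print(pipeline_cycles(4, 5))
    for stage, busy in reservation_rows(4, 5).items():
        print(stage, busy)
```

For 4 stages and 5 tasks the last stage is busy through time step 8, which is the width of the table on the slide.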
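  7. Nonlinear Pipelines: variable functions; feed-forward connections; feedback connections
  8. 3 stages & 2 functions: stages S1, S2, S3; functions X and Y
  9. Reservation Tables for X & Y (rows are stages, entries are the time steps in which the stage is used). X: S1 at 1, 6, 8; S2 at 2, 4; S3 at 3, 5, 7. Y: S1 at 1, 5; S2 at 3; S3 at 2, 4, 6.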
  10. Linear Instruction Pipelines: Assume the following instruction execution phases: Fetch (F), Decode (D), Operand Fetch (O), Execute (E), Write results (W)
  11. Pipeline Instruction Execution: I1, I2, I3 pass through F in cycles 1, 2, 3; D in cycles 2, 3, 4; O in cycles 3, 4, 5; E in cycles 4, 5, 6; W in cycles 5, 6, 7.
  12. Dependencies: Data dependency (an operand is not ready yet); instruction dependency (branching). Will that cause a problem?
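  13. Data Dependency: I1: Add R1, R2, R3; I2: Sub R4, R1, R5. Over cycles 1 to 6, I1 passes through F, D, O, E, W in cycles 1 to 5 and I2 follows one cycle behind, in cycles 2 to 6. I2 therefore reaches operand fetch (O) in cycle 4, but I1 does not write R1 until cycle 5.
  14. Solutions: stall; forwarding; write and read in one cycle; …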
  15. Instruction Dependency: I1: Branch o; I2: (left blank on the slide). Over cycles 1 to 6, I1 passes through F, D, O, E, W in cycles 1 to 5 and I2 follows one cycle behind, in cycles 2 to 6.
  16. Solutions: stall; predict branch taken; predict branch not taken; …
  17. Floating Point Multiplication: Inputs (Mantissa1, Exponent1), (Mantissa2, Exponent2). Add the two exponents → Exponent-out. Multiply the two mantissas. Normalize the mantissa and adjust the exponent. Round the product mantissa to a single-length mantissa; the exponent may need adjusting afterwards.
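The four multiplication steps can be traced in software. This is a rough sketch, not a model of any particular hardware: it assumes mantissas are fractions with magnitude in [0.5, 1), which is the convention of Python's `math.frexp`, and it rounds to an illustrative `mantissa_bits` of 24. The function name `fp_multiply` is invented for this example.

```python
import math


def fp_multiply(a, b, mantissa_bits: int = 24):
    """Multiply two (mantissa, exponent) pairs following the slide's steps.

    Assumes each mantissa has magnitude in [0.5, 1), as math.frexp returns.
    """
    (m1, e1), (m2, e2) = a, b
    if m1 == 0 or m2 == 0:
        return 0.0, 0

    exponent = e1 + e2        # step 1: add the two exponents
    mantissa = m1 * m2        # step 2: multiply the two mantissas

    # Step 3: normalize. The product magnitude lies in [0.25, 1), so at
    # most one left shift is needed.
    if abs(mantissa) < 0.5:
        mantissa *= 2
        exponent -= 1

    # Step 4: round to a single-length mantissa.
    scale = 1 << mantissa_bits
    mantissa = round(mantissa * scale) / scale

    # Rounding can carry up to magnitude 1.0, so renormalize if needed.
    if abs(mantissa) >= 1.0:
        mantissa /= 2
        exponent += 1
    return mantissa, exponent


if __name__ == "__main__":
    product = fp_multiply(math.frexp(3.0), math.frexp(5.0))
    print(product, math.ldexp(*product))
```

The final renormalization is why the rounding step may still need an exponent adjustment.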
  18. Linear Pipeline for floating-point multiplication: Add Exponents → Multiply Mantissa → Normalize → Round. Refined into: Add Exponents → Partial Products → Accumulator → Normalize → Round → Renormalize.
  19. Linear Pipeline for floating-point addition: Subtract Exponents → Partial Shift → Add Mantissa → Find Leading 1 → Partial Shift → Round → Renormalize.
  20. Combined Adder and Multiplier: stages labeled A to H, built from the blocks Exponents Subtract / ADD, Partial Products, Add Mantissa, Find Leading 1, Partial Shift (two blocks), Round, and Renormalize.
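  21. Reservation Table for Multiply (columns are cycles 1 to 7): A at 1; B at 2, 3; C at 3, 4; D at 5, 7; E at 6; F, G, H unused.
  22. Reservation Table for Addition (columns are cycles 1 to 9), marks per row: A: Y; B: unused; C: Y; D: Y; E: Y; F: Y Y; G: Y; H: Y Y.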
  23. Nonlinear Pipeline Design: Latency: the number of clock cycles between two initiations of a pipeline. Collision: a resource conflict. Forbidden latencies: latencies that cause collisions.
  24. Nonlinear Pipeline Design (cont.): Latency sequence: a sequence of permissible latencies between successive task initiations. Latency cycle: a latency sequence that repeats the same subsequence. Collision vector: C = (Cm, Cm-1, …, C2, C1), m <= n-1, where n = number of columns in the reservation table and Ci = 1 if latency i causes a collision, 0 otherwise.
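  25. Mul-Mul Collision (launch after 1 cycle), where X is the first multiply and Z the second: A: X at 1, Z at 2; B: X at 2, 3 and Z at 3, 4; C: X at 3, 4 and Z at 4, 5; D: X at 5, 7 and Z at 6; E: X at 6, Z at 7. X and Z collide in B at cycle 3 and in C at cycle 4.
  26. Mul-Mul Collision (launch after 2 cycles): A: X at 1, Z at 3; B: X at 2, 3 and Z at 4, 5; C: X at 3, 4 and Z at 5, 6; D: X at 5, 7 and Z at 7; E: X at 6. X and Z collide in D at cycle 7.
  27. Mul-Mul Collision (launch after 3 cycles): A: X at 1, Z at 4; B: X at 2, 3 and Z at 5, 6; C: X at 3, 4 and Z at 6, 7; D: X at 5, 7; E: X at 6. No collision.
  28. Collision Vector for Multiply after Multiply: Forbidden latencies: 1, 2. Collision vector: 0 0 0 0 1 1 → 11. Maximum forbidden latency = 2 → m = 2.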
  29. Example: functions X and Y on stages S1, S2, S3
  30. Reservation Tables for X & Y (rows are stages, entries are the time steps in which the stage is used). X: S1 at 1, 6, 8; S2 at 2, 4; S3 at 3, 5, 7. Y: S1 at 1, 5; S2 at 3; S3 at 2, 4, 6.
  31. Reservation Tables for X & Y (same tables as slide 30)
  32. Forbidden Latencies: X after X; X after Y; Y after X; Y after Y
  33. X after X, overlaying a second initiation X2 on X1's reservation table. Latency 2: X2 collides with X1 in S1 at cycle 8, in S2 at cycle 4, and in S3 at cycles 5 and 7. Latency 5: X2 collides with X1 in S1 at cycle 6.
  34. X after X (cont.). Latency 4: X2 collides with X1 in S3 at cycle 7. Latency 7: X2 collides with X1 in S1 at cycle 8.
  35. Collision Vector: Forbidden latencies: 2, 4, 5, 7. Collision vector = 1011010
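  36. Y after Y, overlaying a second initiation of Y on the first. Latency 2: the two initiations collide in S3 at cycles 4 and 6. Latency 4: they collide in S1 at cycle 5 and in S3 at cycle 6.
  37. Collision Vector: Forbidden latencies: 2, 4. Collision vector = 1010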
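Forbidden latencies and collision vectors follow directly from a reservation table: any two marks in the same row that are d columns apart forbid latency d. The sketch below applies that rule. The tables are entered as the cycle positions read off the slides' layout, which is an assumption of this example, and the function names are invented for illustration.

```python
def forbidden_latencies(table: dict) -> list:
    """Return the sorted forbidden latencies of a reservation table.

    `table` maps a stage name to the cycles (1-based) in which it is used.
    Two marks in the same row that are d cycles apart forbid latency d.
    """
    forbidden = set()
    for cycles in table.values():
        for first in cycles:
            for second in cycles:
                if second > first:
                    forbidden.add(second - first)
    return sorted(forbidden)


def collision_vector(table: dict) -> str:
    """Return C = (Cm ... C1) as a bit string, m = largest forbidden latency."""
    forbidden = forbidden_latencies(table)
    if not forbidden:
        return ""
    m = max(forbidden)
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


# Cycle positions as read from the slides (assumed, see lead-in).
X_TABLE = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
Y_TABLE = {"S1": [1, 5], "S2": [3], "S3": [2, 4, 6]}
MUL_TABLE = {"A": [1], "B": [2, 3], "C": [3, 4], "D": [5, 7], "E": [6]}

if __name__ == "__main__":
    for name, table in [("X", X_TABLE), ("Y", Y_TABLE), ("Mul", MUL_TABLE)]:
        print(name, forbidden_latencies(table), collision_vector(table))
```

With these tables the results agree with the vectors stated on the slides: 1011010 for X, 1010 for Y, and 11 for multiply after multiply.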
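  38. Exercise: find the collision vector for this reservation table (columns are cycles 1 to 7): A: X at 1, 3, 4; B: X at 2, 6; C: X X; D: X.
  39. State Diagram for X: The initial state is 1011010. From 1011010: latency 1* leads to 1111111, latency 3 or 6 leads to 1011011, latency 8+ returns to 1011010. From 1011011: latency 3* or 6 returns to 1011011, latency 8+ leads to 1011010. From 1111111: latency 8+ leads to 1011010. An asterisk marks the minimum latency leaving a state.
  40. Cycles: Simple cycles: each state appears only once: (3), (6), (8), (1, 8), (3, 8), and (6, 8). Greedy cycles: simple cycles whose edges are all made with minimum latencies from their respective starting states: (1, 8) and (3). One of them has the minimal average latency (MAL).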