1. CSE 8383 - Advanced Computer Architecture. Week 3, week of Jan 26, 2004. engr.smu.edu/~rewini/8383
2. Contents: Linear Pipelines; Nonlinear Pipelines; Instruction Pipelines; Arithmetic Operations; Design of Multifunction Pipeline
3. Linear Pipeline: Processing stages are linearly connected, and each performs a fixed function. Synchronous pipeline: clocked latches between Stage i and Stage i+1, with equal delays in all stages. Asynchronous pipeline: handshaking between stages.
4. Latches: Stages S1, S2, S3 are separated by latches L1 and L2. The slowest stage determines the delay; equal stage delays set the clock period.
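5. Reservation Table (rows are stages S1-S4, columns are time): each of S1, S2, S3, and S4 carries a single X, one time step after the stage before it.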
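6. 5 tasks on 4 stages (rows are stages S1-S4, columns are time): every stage carries five X marks, one per task, and each row is shifted one time step to the right of the row above it.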
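
The overlap on this slide can be reproduced with a short sketch. It assumes one new task enters per clock and every stage takes exactly one cycle, so task j occupies stage i in cycle i + j - 1 and n tasks on k stages finish in k + n - 1 cycles. The function names are illustrative and do not come from the lecture.

```python
def space_time(num_stages, num_tasks):
    """Return {stage: [cycles in which that stage is busy]} for a linear pipeline.

    Assumes one new task per clock and one cycle per stage.
    """
    table = {}
    for stage in range(1, num_stages + 1):
        # Task j (1-based) reaches stage i in cycle i + j - 1.
        table[f"S{stage}"] = [stage + task - 1 for task in range(1, num_tasks + 1)]
    return table


def total_cycles(num_stages, num_tasks):
    """Cycles to finish all tasks: k cycles for the first, one more per extra task."""
    return num_stages + num_tasks - 1


if __name__ == "__main__":
    width = total_cycles(4, 5)
    for stage, busy in space_time(4, 5).items():
        row = " ".join("X" if cycle in busy else "." for cycle in range(1, width + 1))
        print(stage, row)
    print("total cycles:", width)
```

For the slide's case of 5 tasks on 4 stages, `total_cycles(4, 5)` gives 8, against 20 cycles if the tasks ran one after another without overlap.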
7. Nonlinear Pipelines: Variable functions; Feed-forward connections; Feedback connections
8. 3 stages & 2 functions: a pipeline with stages S1, S2, S3 that evaluates two functions, X and Y
9. Reservation Tables for X & Y (rows are stages, columns are time). X: S1: X X X; S2: X X; S3: X X X. Y: S1: Y Y; S2: Y; S3: Y Y Y.
10. Linear Instruction Pipelines: Assume the following instruction execution phases: Fetch (F), Decode (D), Operand Fetch (O), Execute (E), Write results (W)
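11. Pipeline Instruction Execution (rows are phases, columns are time): F: I1 I2 I3; D: I1 I2 I3; O: I1 I2 I3; E: I1 I2 I3; W: I1 I2 I3. Each row starts one cycle later than the row above it.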
12. Dependencies: Data dependency (an operand is not ready yet); Instruction dependency (branching). Will that cause a problem?
13. Data Dependency: I1: Add R1, R2, R3; I2: Sub R4, R1, R5. Timing over cycles 1-6: I1 passes through F, D, O, E, W in cycles 1-5, and I2 follows one cycle behind in cycles 2-6.
14. 14. Solutions STALL Forwarding Write and Read in one cycle ….
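15. Instruction Dependency: I1 is a branch and I2 is the instruction that follows it. Timing over cycles 1-6: I1 passes through F, D, O, E, W in cycles 1-5, and I2 follows one cycle behind in cycles 2-6.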
16. 16. Solutions STALL Predict Branch taken Predict Branch not taken ….
17. Floating Point Multiplication: Inputs: (Mantissa1, Exponent1), (Mantissa2, Exponent2). Add the two exponents to get Exponent-out. Multiply the two mantissas. Normalize the mantissa and adjust the exponent. Round the product mantissa to a single-length mantissa; the exponent may need adjusting again.
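
The four steps above can be sketched on integer mantissas. This is a minimal illustration under stated assumptions, not the lecture's design: values are unsigned and equal mantissa x 2^exponent, mantissas are P-bit integers with the top bit set, P = 8 is an arbitrary choice, and rounding is round-half-up rather than any particular standard's rule. The name `fp_multiply` is hypothetical.

```python
P = 8  # single-length mantissa width in bits (illustrative choice)


def fp_multiply(m1, e1, m2, e2, p=P):
    """Multiply (m1 * 2**e1) by (m2 * 2**e2); return (mantissa, exponent).

    Signs are ignored and rounding is round-half-up (assumptions of this sketch).
    """
    exponent = e1 + e2            # step 1: add the two exponents
    product = m1 * m2             # step 2: multiply mantissas (double length)
    if product == 0:
        return 0, 0

    # Step 3: normalize so that exactly p significant bits will remain.
    shift = product.bit_length() - p
    exponent += shift

    # Step 4: round to a single-length mantissa.
    if shift > 0:
        mantissa = (product + (1 << (shift - 1))) >> shift
    else:
        mantissa = product << -shift

    # Rounding can carry into bit p; renormalize and adjust the exponent again.
    if mantissa.bit_length() > p:
        mantissa >>= 1
        exponent += 1
    return mantissa, exponent


if __name__ == "__main__":
    m, e = fp_multiply(192, -7, 160, -7)   # 1.5 * 1.25
    print(m, e, m * 2.0 ** e)
```

The last branch is the "you may adjust the exponent" case: when rounding overflows the mantissa width, one more shift and an exponent increment restore the normalized form.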
18. Linear Pipeline for floating-point multiplication: Add Exponents -> Multiply Mantissa -> Normalize -> Round; refined as Add Exponents -> Partial Products -> Accumulator -> Normalize -> Round -> Renormalize
19. Linear Pipeline for floating-point Addition: Subtract Exponents -> Partial Shift -> Add Mantissa -> Find Leading 1 -> Partial Shift -> Round -> Renormalize
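20. Combined Adder and Multiplier: one pipeline with stages labeled A through H. It merges the adder stages (Subtract Exponents, Partial Shift, Add Mantissa, Find Leading 1, Partial Shift, Round, Renormalize) with the multiplier's Partial Products stage; the exponent stage performs Subtract / ADD.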
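21. Reservation Table for Multiply (rows are stages A-H, columns are cycles 1-7): A: X; B: X X; C: X X; D: X X; E: X; F, G, H: unused
22. Reservation Table for Addition (rows are stages A-H, columns are cycles 1-9): A: Y; B: unused; C: Y; D: Y; E: Y; F: Y Y; G: Y; H: Y Y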
23. Nonlinear Pipeline Design: Latency: the number of clock cycles between two initiations of a pipeline. Collision: a resource conflict. Forbidden latencies: latencies that cause collisions.
24. Nonlinear Pipeline Design (cont.): Latency sequence: a sequence of permissible latencies between successive task initiations. Latency cycle: a latency sequence that repeats the same subsequence. Collision vector: C = (Cm, Cm-1, …, C2, C1), where m <= n-1 and n = number of columns in the reservation table; Ci = 1 if latency i causes a collision, 0 otherwise.
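
These definitions translate directly into code: a latency is forbidden when two marks in the same row of the reservation table are that many columns apart. The sketch below applies this to the three-stage X and Y functions. The deck's text does not preserve the column positions, so the placements in `X_TABLE` and `Y_TABLE` are an assumption, chosen to match the number of marks per row and to reproduce the forbidden latencies the deck states for X (2, 4, 5, 7) and Y (2, 4).

```python
def forbidden_latencies(table):
    """Return the set of forbidden latencies of one reservation table.

    `table` maps a stage name to the 1-based columns marked in its row.
    """
    forbidden = set()
    for columns in table.values():
        for earlier in columns:
            for later in columns:
                if later > earlier:
                    # Two marks in one row, `later - earlier` cycles apart.
                    forbidden.add(later - earlier)
    return forbidden


def collision_vector(forbidden):
    """Bit string Cm ... C1, where m is the largest forbidden latency."""
    if not forbidden:
        return ""
    m = max(forbidden)
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


# Assumed column placements (not recoverable from the slide text itself).
X_TABLE = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
Y_TABLE = {"S1": [1, 5], "S2": [3], "S3": [2, 4, 6]}

if __name__ == "__main__":
    for name, table in (("X", X_TABLE), ("Y", Y_TABLE)):
        latencies = forbidden_latencies(table)
        print(name, sorted(latencies), collision_vector(latencies))
```

Because the vector is written with Cm first, the rightmost bit always corresponds to latency 1.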
25. Mul-Mul Collision (launch after 1 cycle). X marks the first multiply, Z the second; columns are cycles 1-7: A: X Z; B: X X Z Z; C: X X Z Z; D: X Z X; E: X Z; F, G, H: unused
26. Mul-Mul Collision (launch after 2 cycles). Columns are cycles 1-7: A: X Z; B: X X Z Z; C: X X Z Z; D: X X Z; E: X; F, G, H: unused
27. Mul-Mul Collision (launch after 3 cycles). Columns are cycles 1-7: A: X Z; B: X X Z Z; C: X X Z Z; D: X X; E: X; F, G, H: unused
28. Collision Vector for Multiply after Multiply: Forbidden latencies: 1, 2. Collision vector: 0 0 0 0 1 1, which reduces to 11. Maximum forbidden latency = 2, so m = 2.
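29. Example: the pipeline with stages S1, S2, S3 and two functions, X and Y
30. Reservation Tables for X & Y (rows are stages, columns are time). X: S1: X X X; S2: X X; S3: X X X. Y: S1: Y Y; S2: Y; S3: Y Y Y.
31. Reservation Tables for X & Y (rows are stages, columns are time). X: S1: X X X; S2: X X; S3: X X X. Y: S1: Y Y; S2: Y; S3: Y Y Y.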
32. Forbidden Latencies: X after X; X after Y; Y after X; Y after Y
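
All four cases reduce to one rule: if the first function uses a stage in column a and the second uses the same stage in column b, then starting the second function a - b cycles later (when a > b) makes both need that stage in the same cycle. The sketch below applies that rule. The column placements for X and Y are an assumption, since the deck's text keeps only the mark counts, and the helper name `forbidden_after` is hypothetical. The cross-function results therefore hold only for these assumed tables.

```python
def forbidden_after(first, second):
    """Latencies p > 0 at which starting `second` p cycles after `first` collides.

    Each table maps a stage name to the 1-based columns in which it is used.
    """
    forbidden = set()
    for stage, first_columns in first.items():
        for a in first_columns:
            for b in second.get(stage, []):
                if a > b:
                    # `first` needs the stage in cycle a, `second` in cycle b + p.
                    forbidden.add(a - b)
    return forbidden


# Assumed column placements (the slide text keeps only the mark counts).
X_TABLE = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
Y_TABLE = {"S1": [1, 5], "S2": [3], "S3": [2, 4, 6]}

if __name__ == "__main__":
    tables = {"X": X_TABLE, "Y": Y_TABLE}
    for later in tables:
        for earlier in tables:
            result = sorted(forbidden_after(tables[earlier], tables[later]))
            print(f"{later} after {earlier}:", result)
```

With a function paired against itself, the rule gives back the ordinary forbidden latencies of that function's own reservation table.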
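33. X after X (X1 is the first initiation, X2 the second). Latency 2: S1: X1 X2 X1 X2 X1; S2: X1 X2 X1 X2; S3: X1 X2 X1 X2 X1. Latency 5: S1: X1 X2 X1 X1; S2: X1 X1 X2; S3: X1 X1 X1 X2.
34. X after X (continued). Latency 4: S1: X1 X2 X1 X1; S2: X1 X1 X2 X2; S3: X1 X1 X2 X1. Latency 7: S1: X1 X1 X2 X1; S2: X1 X1; S3: X1 X1 X1.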
35. Collision Vector (X after X): Forbidden latencies: 2, 4, 5, 7. Collision vector = 1011010
36. Y after Y. First table: S1: Y Y Y; S2: Y Y; S3: Y Y Y Y Y. Second table: S1: Y Y; S2: Y; S3: Y Y Y Y Y.
37. Collision Vector (Y after Y): Forbidden latencies: 2, 4. Collision vector = 1010
38. Exercise: Find the collision vector. Reservation table (rows are stages A-D, columns are cycles 1-7): A: X X X; B: X X; C: X X; D: X
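39. State Diagram for X: The initial state is 1011010. From it, latency 1* leads to 1111111, latencies 3 and 6 lead to 1011011, and 8+ returns to 1011010. From 1011011, latencies 3* and 6 lead back to 1011011, and 8+ returns to 1011010. From 1111111, only 8+ is permissible, and it returns to 1011010. An asterisk marks the minimum latency out of a state.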
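
The diagram follows mechanically from the collision vector: from a state, latency p is permissible when bit Cp is 0, and the next state is the current state shifted right by p positions, ORed with the initial collision vector. The sketch below builds the whole diagram that way. Representing the "8+" edge by the single latency m + 1 is a simplification of this sketch, and the function name is illustrative.

```python
def state_diagram(initial):
    """Build {state: {latency: next_state}} from a collision-vector bit string.

    The key m + 1 (8 for a 7-bit vector) stands for every latency greater
    than m, all of which lead back to the initial state.
    """
    m = len(initial)
    start = int(initial, 2)
    diagram = {}
    pending = [start]
    while pending:
        state = pending.pop()
        if state in diagram:
            continue
        edges = {}
        for latency in range(1, m + 1):
            # Bit C_latency sits `latency - 1` places from the right.
            if not (state >> (latency - 1)) & 1:
                edges[latency] = (state >> latency) | start
        edges[m + 1] = start
        diagram[state] = edges
        pending.extend(edges.values())

    def bits(value):
        return format(value, f"0{m}b")

    return {bits(s): {p: bits(t) for p, t in e.items()} for s, e in diagram.items()}


if __name__ == "__main__":
    for state, edges in state_diagram("1011010").items():
        print(state, edges)
```

For 1011010 this yields the same three states and the same edges as the slide.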
40. Cycles: Simple cycles: each state appears only once: (3), (6), (8), (1, 8), (3, 8), and (6, 8). Greedy cycles: simple cycles whose edges are all made with minimum latencies from their respective starting states: (1, 8) and (3). One of them gives the MAL (minimal average latency).
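
To see which greedy cycle gives the MAL, follow the minimum-latency edge out of each state until a state repeats, then compare average latencies. The sketch below does this on the state diagram for X, entered by hand. It treats the "8+" edge as exactly 8, which is an assumption of this sketch; the function names are illustrative.

```python
# State diagram for X; the key 8 stands for the "8+" edge (assumed to be exactly 8).
DIAGRAM = {
    "1011010": {1: "1111111", 3: "1011011", 6: "1011011", 8: "1011010"},
    "1011011": {3: "1011011", 6: "1011011", 8: "1011010"},
    "1111111": {8: "1011010"},
}


def greedy_cycle(diagram, start):
    """Follow the minimum-latency edge from each state until a state repeats.

    Returns the latencies on the repeating part of the walk.
    """
    path, latencies = [], []
    state = start
    while state not in path:
        path.append(state)
        latency = min(diagram[state])
        latencies.append(latency)
        state = diagram[state][latency]
    # Drop any lead-in before the first state of the cycle.
    return tuple(latencies[path.index(state):])


def average_latency(cycle):
    return sum(cycle) / len(cycle)


if __name__ == "__main__":
    cycles = {greedy_cycle(DIAGRAM, state) for state in DIAGRAM}
    for cycle in sorted(cycles):
        print(cycle, average_latency(cycle))
    print("MAL:", min(average_latency(c) for c in cycles))
```

Starting from 1011010 the walk gives (1, 8) with average 4.5, and starting from 1011011 it gives (3) with average 3, so (3) is the greedy cycle that achieves the MAL here.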