CSE 8383 - Advanced
Computer Architecture

            Week-3
     Week of Jan 26, 2004
   engr.smu.edu/~rewini/8383
Contents
   Linear Pipelines
   Nonlinear pipelines
   Instruction Pipelines
   Arithmetic Operations
   Design of Multifunction Pipeline
Linear Pipeline
   Processing Stages are linearly
    connected
   Perform fixed function
   Synchronous Pipeline
       Clocked latches between Stage i and
        Stage i+1
       Equal delays in all stages
   Asynchronous Pipeline (Handshaking)
Latches
   S1 → L1 → S2 → L2 → S3
   Slowest stage determines delay
   Equal delays → clock period
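The timing rule above can be sketched in a few lines. This is only an illustration: the stage delays and the latch delay parameter are hypothetical values, not figures from the slides.

```python
def clock_period(stage_delays_ns, latch_delay_ns=0.0):
    """Clock period of a synchronous linear pipeline.

    The slowest stage sets the period; latch_delay_ns is an assumed
    extra term for the latch between stages (not given in the slides).
    """
    return max(stage_delays_ns) + latch_delay_ns


# Hypothetical delays for S1, S2, S3 in nanoseconds.
delays = [10.0, 12.0, 9.0]
print(clock_period(delays, latch_delay_ns=1.0))
```

With equal stage delays, no stage sits idle waiting for a slower one, so the period equals the common stage delay plus any latch overhead.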
Reservation Table
      Time →
      1   2   3   4
S1    X
S2        X
S3            X
S4                X
5 tasks on 4 stages
                  Time

S1    X   X   X   X      X

S2        X   X   X      X   X

S3            X   X      X   X   X

S4                X      X   X   X   X
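The diagram above can be reproduced with a short sketch. The function names are illustrative, and the speedup figure assumes equal stage delays and a non-pipelined unit that needs one cycle per stage for every task.

```python
def total_cycles(num_stages, num_tasks):
    """Cycles needed to finish num_tasks on a linear pipeline."""
    return num_stages + num_tasks - 1


def space_time(num_stages, num_tasks):
    """One string per stage: 'X' where the stage is busy, '.' where idle."""
    width = total_cycles(num_stages, num_tasks)
    return [
        "".join("X" if s <= t < s + num_tasks else "." for t in range(width))
        for s in range(num_stages)
    ]


def speedup(num_stages, num_tasks):
    """Speedup over a non-pipelined unit (assumes equal stage delays)."""
    return (num_stages * num_tasks) / total_cycles(num_stages, num_tasks)


for stage, row in enumerate(space_time(4, 5), start=1):
    print(f"S{stage}  {row}")
print(total_cycles(4, 5), speedup(4, 5))
```

For 5 tasks on 4 stages the diagram is 8 cycles wide, against 20 cycles without pipelining.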
Nonlinear Pipelines
   Variable functions
   Feed-Forward
   Feedback
3 stages & 2 functions
       X                  Y



 S1        S2        S3
Reservation Tables for X & Y
      1   2   3   4   5   6   7   8
S1    X                   X       X
S2        X       X
S3            X       X       X

      1   2   3   4   5   6
S1    Y               Y
S2            Y
S3        Y       Y       Y
Linear Instruction Pipelines
   Assume the following instruction
    execution phases:
       Fetch (F)
       Decode (D)
       Operand Fetch (O)
       Execute (E)
       Write results (W)
Pipeline Instruction Execution

F    I1   I2   I3

D         I1   I2   I3

O              I1   I2   I3

E                   I1   I2   I3

W                        I1   I2   I3
Dependencies
   Data Dependency
    (Operand is not ready yet)

   Instruction Dependency
    (Branching)

    Will that Cause a Problem?
Data Dependency
I1 -- Add R1, R2, R3
I2 -- Sub R4, R1, R5
      1    2    3    4    5    6
F     I1   I2
D          I1   I2
O               I1   I2
E                    I1   I2
W                         I1   I2
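The hazard in this example can be checked mechanically. The sketch below rests on assumptions the slides do not state: the first operand is the destination register, operands are read in stage O and written in stage W, and there is no forwarding. All names are illustrative.

```python
STAGES = ["F", "D", "O", "E", "W"]


def parse(text):
    """Split 'Add R1, R2, R3' into opcode, destination and sources.

    Assumes the first register is the destination.
    """
    op, operands = text.split(maxsplit=1)
    regs = [r.strip() for r in operands.split(",")]
    return {"op": op, "dest": regs[0], "srcs": regs[1:]}


def stage_cycle(issue_cycle, stage):
    """Cycle in which an instruction issued at issue_cycle occupies stage."""
    return issue_cycle + STAGES.index(stage)


def raw_stall(first, second, gap=1, same_cycle_rw=False):
    """Stall cycles for `second`, fetched `gap` cycles after `first`."""
    if first["dest"] not in second["srcs"]:
        return 0
    write = stage_cycle(1, "W")        # first instruction writes its result
    read = stage_cycle(1 + gap, "O")   # second instruction fetches operands
    earliest_read = write if same_cycle_rw else write + 1
    return max(0, earliest_read - read)


i1 = parse("Add R1, R2, R3")
i2 = parse("Sub R4, R1, R5")
print(raw_stall(i1, i2))                      # stall without forwarding
print(raw_stall(i1, i2, same_cycle_rw=True))  # write and read in one cycle
```

I2 would fetch R1 in cycle 4 while I1 writes it in cycle 5, so under these assumptions I2 has to wait.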
Solutions
   STALL
   Forwarding
   Write and Read in one cycle
   ….
Instruction Dependency
I1 – Branch o
I2 –
      1    2    3    4    5    6
F     I1   I2
D          I1   I2
O               I1   I2
E                    I1   I2
W                         I1   I2
Solutions
   STALL
   Predict Branch taken
   Predict Branch not taken
   ….
Floating Point Multiplication
   Inputs: (Mantissa1, Exponent1), (Mantissa2,
    Exponent2)
   Add the two exponents → Exponent-out
   Multiply the two mantissas
   Normalize the mantissa and adjust the exponent
   Round the product mantissa to a single-length
    mantissa; the exponent may need adjusting
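The four steps above can be traced in software. This is a sketch, not a model of any particular hardware unit: it assumes operands are split into mantissa and exponent with Python's `math.frexp` (mantissa magnitude in [0.5, 1)), and `mantissa_bits` is a hypothetical parameter standing in for the single-length mantissa width.

```python
import math


def fp_multiply(a, b, mantissa_bits=24):
    """Multiply two floats by following the four steps listed above."""
    m1, e1 = math.frexp(a)   # a == m1 * 2**e1
    m2, e2 = math.frexp(b)

    exponent = e1 + e2       # add the two exponents
    mantissa = m1 * m2       # multiply the two mantissas

    # Normalize: the product magnitude lies in [0.25, 1), so at most
    # one left shift is needed; the exponent is adjusted to match.
    if mantissa != 0 and abs(mantissa) < 0.5:
        mantissa *= 2
        exponent -= 1

    # Round to a single-length mantissa.
    scale = 2 ** mantissa_bits
    mantissa = round(mantissa * scale) / scale

    # Rounding can carry out to 1.0, which requires renormalizing.
    if abs(mantissa) >= 1.0:
        mantissa /= 2
        exponent += 1

    return math.ldexp(mantissa, exponent)


print(fp_multiply(1.5, 2.0))
print(fp_multiply(0.9375, 1.0, mantissa_bits=2))  # rounding carries out
```

The second call shows why the exponent may need a second adjustment: rounding the mantissa to 2 bits pushes it up to 1.0.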
Linear Pipeline for floating-point multiplication
   Basic stages:
    Add Exponents → Multiply Mantissa → Normalize → Round
   Refined stages:
    Add Exponents → Partial Products → Accumulator → Normalize → Round → Renormalize
Linear Pipeline for floating-point Addition
   Subtract Exponents → Partial Shift → Add Mantissa → Find Leading 1 → Partial Shift → Round → Renormalize
Combined Adder and Multiplier
   A  Exponents Subtract / Add
   B  Partial Products
   C  Add Mantissa
   D  Renormalize
   E  Round
   F  Partial Shift
   G  Find Leading 1
   H  Partial Shift
Reservation Table for Multiply
    1   2   3   4   5   6   7

A   X
B       X   X
C           X   X
D                   X       X
E                       X
F

G

H
Reservation Table for Addition
    1   2   3   4   5   6   7   8   9
A   Y
B
C               Y
D                                   Y
E                               Y
F       Y   Y
G                   Y
H                       Y   Y
Nonlinear Pipeline Design
   Latency
      The number of clock cycles between two
      initiations of a pipeline
   Collision
      Resource Conflict
   Forbidden Latencies
      Latencies that cause collisions
Nonlinear Pipeline Design
cont
   Latency Sequence
      A sequence of permissible latencies between
      successive task initiations
   Latency Cycle
      A latency sequence that repeats the same
      subsequence indefinitely
   Collision vector
    C = (Cm, Cm-1, …, C2, C1), m <= n-1
    n = number of columns in the reservation table
    Ci = 1 if latency i causes a collision, 0 otherwise
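These definitions translate directly into code. The sketch below uses one common way to find forbidden latencies: take the distance between every pair of marks in the same row of a reservation table. The dictionary encoding of the table and the function names are illustrative choices, and the data is the Multiply reservation table shown earlier.

```python
from itertools import combinations


def forbidden_latencies(table):
    """table maps a stage name to the clock cycles in which it is used."""
    forbidden = set()
    for cycles in table.values():
        # Two marks in the same row, d cycles apart, collide at latency d.
        for t1, t2 in combinations(sorted(cycles), 2):
            forbidden.add(t2 - t1)
    return forbidden


def collision_vector(table):
    """Return C = (Cm, ..., C2, C1) as a bit string, Cm first."""
    forbidden = forbidden_latencies(table)
    m = max(forbidden, default=0)
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


# Reservation Table for Multiply (stages F, G, H are unused).
MULTIPLY = {"A": [1], "B": [2, 3], "C": [3, 4], "D": [5, 7], "E": [6]}

print(sorted(forbidden_latencies(MULTIPLY)))
print(collision_vector(MULTIPLY))
```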
Mul – Mul Collision (launch after 1 cycle)
    1   2   3   4   5   6   7

A   X   Z
B       X   XZ  Z
C           X   XZ  Z
D                   X   Z   X
E                       X   Z
F
G
H
Mul – Mul Collision (launch after 2 cycles)
    1   2   3   4   5   6   7

A   X       Z
B       X   X   Z   Z
C           X   X   Z   Z
D                   X       XZ
E                       X
F
G
H
Mul – Mul Collision (launch after 3 cycles)
    1   2   3   4   5   6   7

A   X           Z
B       X   X       Z   Z
C           X   X       Z   Z
D                   X       X
E                       X
F

G

H
Collision Vector for Multiply after Multiply
Forbidden Latencies: 1, 2

Collision vector
0 0 0 0 1 1 → 11

Maximum forbidden latency = 2 → m = 2
Example
      X             Y



 S1       S2   S3
Reservation Tables for X & Y
      1   2   3   4   5   6   7   8
S1    X                   X       X
S2        X       X
S3            X       X       X

      1   2   3   4   5   6
S1    Y               Y
S2            Y
S3        Y       Y       Y
Forbidden Latencies
   X after X
   X after Y
   Y after X
   Y after Y
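All four cases can be explored by overlaying one reservation table on another, shifted by the candidate latency, and checking whether any stage is claimed twice in the same cycle. The sketch below assumes "B after A" means A is initiated first and B follows; the set-based table encoding and the function names are illustrative.

```python
# Reservation tables for X and Y: stage -> set of busy clock cycles.
X = {"S1": {1, 6, 8}, "S2": {2, 4}, "S3": {3, 5, 7}}
Y = {"S1": {1, 5}, "S2": {3}, "S3": {2, 4, 6}}


def collides(first, second, latency):
    """True if `second`, started `latency` cycles after `first`,
    needs a stage in a cycle where `first` is still using it."""
    for stage, busy in first.items():
        shifted = {t + latency for t in second.get(stage, set())}
        if busy & shifted:
            return True
    return False


def forbidden(first, second):
    """Forbidden latencies for initiating `second` after `first`."""
    span = max(t for cycles in first.values() for t in cycles)
    # At latency >= span, `first` has already left the pipeline.
    return [p for p in range(1, span) if collides(first, second, p)]


print("X after X:", forbidden(X, X))
print("X after Y:", forbidden(Y, X))
print("Y after X:", forbidden(X, Y))
print("Y after Y:", forbidden(Y, Y))
```

The two cross cases (X after Y, Y after X) are whatever the overlay check computes from the tables; only the same-function cases are worked out on the slides.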
X after X
   Latency 2
        1     2     3     4     5     6     7     8
S1      X1          X2                X1          X1X2
S2            X1          X1X2        X2
S3                  X1          X1X2        X1X2

   Latency 5
        1     2     3     4     5     6     7     8
S1      X1                            X1X2        X1
S2            X1          X1                X2
S3                  X1          X1          X1    X2
X after X (cont)
   Latency 4
        1     2     3     4     5     6     7     8
S1      X1                      X2    X1          X1
S2            X1          X1          X2          X2
S3                  X1          X1          X1X2

   Latency 7
        1     2     3     4     5     6     7     8
S1      X1                            X1          X1X2
S2            X1          X1
S3                  X1          X1          X1
Collision Vector
 Forbidden Latencies: 2, 4, 5, 7
 Collision Vector =

 1011010
Y after Y
   Latency 2
        1     2     3     4     5     6
S1      Y1          Y2          Y1
S2                  Y1          Y2
S3            Y1          Y1Y2        Y1Y2

   Latency 4
        1     2     3     4     5     6
S1      Y1                      Y1Y2
S2                  Y1
S3            Y1          Y1          Y1Y2
Collision Vector
   Forbidden Latencies: 2, 4
   Collision Vector =
    1010
Exercise – Find the collision
vector

    1   2   3   4   5   6   7

A   X       X   X

B       X               X

C                   X       X

D               X
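State Diagram for X
   Initial state 1011010
      latency 1* → 1111111
      latency 3 or 6 → 1011011
      latency 8+ → 1011010
   State 1011011
      latency 3* or 6 → 1011011
      latency 8+ → 1011010
   State 1111111
      latency 8+ → 1011010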
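The state diagram can be generated from the collision vector alone. The sketch below uses the usual construction: a latency p is permitted when bit Cp of the current state is 0, and the next state is the current state shifted right by p places, ORed with the initial collision vector. The function name and the dictionary output are illustrative, and the key n+1 stands for every latency greater than n (the "8+" edges).

```python
def state_diagram(collision_vector):
    """Map each reachable state to {latency: next_state}."""
    n = len(collision_vector)
    initial = int(collision_vector, 2)   # rightmost character is C1
    diagram = {}
    pending = [initial]
    while pending:
        state = pending.pop()
        name = format(state, f"0{n}b")
        if name in diagram:
            continue
        edges = {}
        for latency in range(1, n + 1):
            if ((state >> (latency - 1)) & 1) == 0:   # C_latency is 0
                nxt = (state >> latency) | initial
                edges[latency] = format(nxt, f"0{n}b")
                pending.append(nxt)
        edges[n + 1] = collision_vector   # latencies > n clear the pipeline
        diagram[name] = edges
    return diagram


for state, edges in sorted(state_diagram("1011010").items()):
    print(state, edges)
```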
Cycles
   Simple cycles → each state appears
    only once
    (3), (6), (8), (1, 8), (3, 8), and (6, 8)
   Greedy cycles → simple cycles whose
    edges are all made with minimum
    latencies from their respective starting
    states
    (1, 8), (3) → one of them gives the MAL
    (minimum average latency)
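The greedy cycles can be found by always taking the minimum-latency edge out of each state until a state repeats. The sketch below assumes the "8+" edge is taken with its smallest value (n+1), and it relies on the statement above that one of the greedy cycles attains the MAL. All function names are illustrative.

```python
def min_latency(state, n):
    """Smallest permitted latency from `state`; n+1 if every bit is set."""
    for p in range(1, n + 1):
        if ((state >> (p - 1)) & 1) == 0:
            return p
    return n + 1


def next_state(state, latency, initial, n):
    return initial if latency > n else (state >> latency) | initial


def reachable_states(initial, n):
    found, pending = set(), [initial]
    while pending:
        s = pending.pop()
        if s in found:
            continue
        found.add(s)
        for p in range(1, n + 1):
            if ((s >> (p - 1)) & 1) == 0:
                pending.append((s >> p) | initial)
    return found


def greedy_cycle(start, initial, n):
    """Follow minimum-latency edges from `start` until a state repeats."""
    seen, latencies, state = {}, [], start
    while state not in seen:
        seen[state] = len(latencies)
        p = min_latency(state, n)
        latencies.append(p)
        state = next_state(state, p, initial, n)
    cycle = tuple(latencies[seen[state]:])
    # Pick one canonical rotation so (8, 1) and (1, 8) count once.
    return min(cycle[i:] + cycle[:i] for i in range(len(cycle)))


def greedy_cycles(collision_vector):
    n, initial = len(collision_vector), int(collision_vector, 2)
    states = reachable_states(initial, n)
    return sorted({greedy_cycle(s, initial, n) for s in states})


cycles = greedy_cycles("1011010")
mal = min(sum(c) / len(c) for c in cycles)
print(cycles, mal)
```

For the collision vector of X, the average latencies of the two greedy cycles are 4.5 and 3, so the smaller one is the candidate for the MAL.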
