Describe nonlinear dynamic pipeline concepts: creation of a reservation table from a nonlinear pipeline architecture, creation of the collision vector from the reservation table, generation of the state diagram, and derivation of simple cycles, greedy cycles, and MAL (Minimum Average Latency)
2. Two types of pipelines
Linear pipeline
Non-linear Pipeline
Non-linear pipelines are also known as general pipelines
3. Linear Pipeline
Streamline (cascade) connections of processing stages
Performs a fixed function
Contains k processing stages
External input fed in at stage S1
Final result emerges from stage Sk
Asynchronous model
4. General Pipeline
Contains feed forward/feedback connections
Making pipeline a nonlinear one
Multiple paths
Parallel usage of multiple storages
Can implement variable functions at different times by reconfiguration
The same pipeline can be used to evaluate different functions by following different dataflow patterns
5. Reservation Table
Shows the utilization pattern of successive pipeline stages
Represents the data flow through the pipeline for one complete
evaluation of a given function
Represented by a matrix, where rows represent stages and
columns represent clock cycles
Multiple reservation tables for different functions
One-to-many mapping between a pipeline configuration and its reservation tables
     1  2  3  4  5  6  7  8
S1   X  -  -  -  -  X  -  X
S2   -  X  -  X  -  -  -  -
S3   -  -  X  -  X  -  X  -
Reservation table for function X
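A minimal sketch of the matrix view described above, with rows as stages and columns as clock cycles. The dictionary layout and the names `RESERVATION_X`, `to_matrix`, and `render` are illustrative assumptions; the mark positions are taken from the row-wise distances used in the forbidden-latency computation of this deck.

```python
# Assumption: marks sit at S1 = cycles 1, 6, 8; S2 = cycles 2, 4; S3 = cycles 3, 5, 7.
RESERVATION_X = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}


def to_matrix(table):
    """Return {stage: [0/1, ...]} with one entry per clock cycle."""
    n = max(max(cycles) for cycles in table.values())
    return {
        stage: [1 if t in cycles else 0 for t in range(1, n + 1)]
        for stage, cycles in table.items()
    }


def render(table):
    """Format the table as text, marking used slots with X and free slots with -."""
    matrix = to_matrix(table)
    n = len(next(iter(matrix.values())))
    lines = ["    " + " ".join(f"{t:>2}" for t in range(1, n + 1))]
    for stage, row in matrix.items():
        cells = " ".join(" X" if used else " -" for used in row)
        lines.append(f"{stage:<4}" + cells)
    return "\n".join(lines)


print(render(RESERVATION_X))
```

Storing only the used clock cycles per stage keeps the table compact; the 0/1 matrix is derived on demand.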
6. Formation of Reservation Table
     1  2  3  4  5  6  7  8
S1   X  -  -  -  -  X  -  X
S2   -  X  -  X  -  -  -  -
S3   -  -  X  -  X  -  X  -
Reservation table for function X
1 2 3 4 5 6
S1 Y Y
S2 Y
S3 Y Y Y
Reservation table for function Y
7. Latency
An initiation refers to the start of a function evaluation, and the
latency refers to the number of time units between two
initiations
Any attempt by two or more initiations to use the same pipeline
stage at the same time causes a collision (resource conflict
between two initiations)
Collision-causing latencies are called forbidden latencies
To detect forbidden latencies, check the distance between every pair of marks in the same row of the reservation table, e.g., 2, 4, 5, and 7 for reservation table X
8. Computation of forbidden latencies
S1: {(6-1), (8-1), (8-6)} = {5, 7, 2}
S2: {(4-2)} = {2}
S3: {(5-3), (7-5), (7-3)} = {2, 2, 4}
Forbidden latencies = {2, 4, 5, 7}
Latencies that do not cause collisions are called permissible latencies
Permissible latencies = {1, 3, 6, 8+}; every latency of 8 or more is permissible because the reservation table spans 8 clock cycles
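The row-by-row distance check can be sketched as follows. This is an illustrative sketch, not code from the slides: the dictionary layout and the function names are assumptions, and the mark positions are the ones used in the subtraction above.

```python
from itertools import combinations

# Reservation table X: stage -> clock cycles in which the stage is used.
RESERVATION_X = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}


def forbidden_latencies(table):
    """Collect the distance between every pair of marks in the same row."""
    forbidden = set()
    for cycles in table.values():
        for earlier, later in combinations(sorted(cycles), 2):
            forbidden.add(later - earlier)
    return forbidden


def permissible_latencies(table):
    """Permissible latencies below the table length n.

    Every latency >= n is permissible as well (the "8+" case for table X).
    """
    n = max(max(cycles) for cycles in table.values())
    return set(range(1, n)) - forbidden_latencies(table)


print(sorted(forbidden_latencies(RESERVATION_X)))    # [2, 4, 5, 7]
print(sorted(permissible_latencies(RESERVATION_X)))  # [1, 3, 6]
```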
10. Latency Analysis
A sequence of permissible latencies between successive task initiations is called a latency sequence
A latency sequence that repeats the same subsequence indefinitely is called a latency cycle
The average latency of a cycle is the sum of all its latencies divided by the number of latencies in the cycle
Latency cycle (1, 8) has average latency = (1+8)/2 = 4.5
A latency cycle that contains only one latency value is called a constant cycle
Constant cycle = (3), with average latency 3
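The average-latency rule is a plain arithmetic mean. A small sketch, assuming a latency cycle is given as a tuple of integers (the name `average_latency` is illustrative):

```python
from fractions import Fraction


def average_latency(cycle):
    """Sum of the latencies divided by the number of latencies in the cycle."""
    return Fraction(sum(cycle), len(cycle))


print(float(average_latency((1, 8))))  # 4.5
print(float(average_latency((3,))))    # 3.0
```

`Fraction` keeps the result exact, which helps when cycles are later compared for the minimum.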
11. Demonstration of latency cycles
Latency cycle (1, 8) = 1, 8, 1, 8, 1, 8, …, with avg. latency 4.5
      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
S1   X1  X2   -   -   -  X1  X2  X1  X2  X3  X4   -   -   -  X3  X4  X3  X4
S2    -  X1  X2  X1  X2   -   -   -   -   -  X3  X4  X3  X4   -   -   -   -
S3    -   -  X1  X2  X1  X2  X1  X2   -   -   -  X3  X4  X3  X4  X3  X4   -
Cycle repeats (1,8)
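The schedule above can be checked mechanically by placing successive initiations on the reservation table and looking for a stage that is claimed twice in the same clock cycle. The sketch below is illustrative: `find_collision` and its parameters are assumed names, and the number of simulated initiations is an arbitrary choice.

```python
from itertools import cycle as repeat_forever

RESERVATION_X = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}


def find_collision(table, latency_cycle, initiations=8):
    """Schedule tasks by repeating the latency cycle.

    Returns (stage, clock, earlier_task, later_task) for the first collision,
    or None when the simulated initiations are collision free.
    """
    busy = {}                       # (stage, clock) -> task using that slot
    start = 1                       # clock cycle of the current initiation
    latencies = repeat_forever(latency_cycle)
    for task in range(1, initiations + 1):
        for stage, offsets in table.items():
            for offset in offsets:
                slot = (stage, start + offset - 1)
                if slot in busy:
                    return (stage, slot[1], busy[slot], task)
                busy[slot] = task
        start += next(latencies)
    return None


print(find_collision(RESERVATION_X, (1, 8)))  # None
print(find_collision(RESERVATION_X, (3,)))    # None
print(find_collision(RESERVATION_X, (2,)))    # ('S1', 8, 1, 2)
```

With latency 2, the second task needs S1 in clock cycle 8, which the first task already occupies, matching the forbidden latency 2.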
12. Collision free scheduling
To avoid collisions, all tasks should be scheduled properly
The objective is to achieve shortest average latency between
initiations without causing collisions
Steps to achieve shortest average latency are
Computation of collision vector
Formation of State diagram
Computation of greedy cycles
Computation of Minimum Average Latency (MAL)
13. Collision Vectors
An m-bit binary vector c = (cm, cm-1, …, c2, c1), where m is the maximum forbidden latency
E.g., c = 1011010
m ≤ n-1, where n is the number of clock cycles (columns) in the reservation table
Displays the set of permissible and forbidden latencies
ci = 1 if latency i causes a collision
ci = 0 if latency i is permissible
cm = 1 always (m is the maximum forbidden latency)
14. Illustration
Maximum forbidden latency (m) = 7
Forbidden latencies = {2, 4, 5, 7}
Permissible latencies = {1, 3, 6}
7-bit collision vector c = (c7 c6 c5 c4 c3 c2 c1) = (1 0 1 1 0 1 0)
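Building the collision vector from the forbidden set is a direct bit-by-bit mapping. A minimal sketch, assuming the vector is written as a string from cm down to c1 (the function name is illustrative):

```python
def collision_vector(forbidden):
    """Return the collision vector as a bit string c_m ... c_1."""
    m = max(forbidden)  # the maximum forbidden latency fixes the length
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


print(collision_vector({2, 4, 5, 7}))  # 1011010
```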
15. State Diagrams
Constructed from the collision vector
Specifies the permissible state transitions among successive initiations
The collision vector at the initial state of the pipeline is called the initial collision vector (ICV)
Allowed State Transitions
Permissible latencies = {1, 3, 6, 8+}; 8+ indicates that any latency greater than the maximum forbidden latency (7) is permissible
A transition is allowed only for a latency whose bit position in the current state's collision vector is 0
The next state of the collision vector is obtained by making one of these transitions
16. Steps to Draw State Diagrams
To obtain the next state of the collision vector
Right-shift the collision vector of the present state by p bits for each permissible latency p of that state (for the ICV: {1, 3, 6, 8+}), filling with 0s from the left
OR the right-shifted collision vector with the ICV
For latency 8+, the shift clears every bit, so the next state is the ICV itself
ICV = (1011010)
Possible state transitions = {1, 3, 6, 8+}
Next state for transition = 1
1011010 -> 0101101 (1-bit right shift)
0101101 OR 1011010 (ICV) = 1111111
State diagram edge: 1011010 -> 1111111, labeled with latency 1
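The shift-and-OR rule can be applied repeatedly until no new state appears, which yields the whole state diagram. The sketch below is an assumption-laden illustration: states are kept as bit strings, and the key m + 1 stands in for every latency greater than m (the "8+" edge).

```python
def state_diagram(icv_bits):
    """Map each reachable state to {latency: next_state}, all as bit strings."""
    m = len(icv_bits)
    icv = int(icv_bits, 2)

    def as_bits(value):
        return format(value, f"0{m}b")

    diagram = {}
    pending = [icv]
    while pending:
        state = pending.pop()
        if as_bits(state) in diagram:
            continue
        edges = {}
        for latency in range(1, m + 1):
            bit = (state >> (latency - 1)) & 1    # c_latency of this state
            if bit == 0:                          # permissible latency
                nxt = (state >> latency) | icv    # right shift, then OR with ICV
                edges[latency] = as_bits(nxt)
                pending.append(nxt)
        edges[m + 1] = as_bits(icv)               # any latency > m leads back to the ICV
        diagram[as_bits(state)] = edges
    return diagram


for state, edges in state_diagram("1011010").items():
    print(state, edges)
```

For ICV 1011010 this produces three states: 1011010, 1111111 (reached with latency 1), and 1011011 (reached with latency 3 or 6).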
18. Greedy Cycles
Simple cycle: a latency cycle in which each state appears only once
Greedy cycle: a simple cycle whose edges are all made with the minimum latency from their respective starting states
Their average latencies must be lower than those of other simple
cycles
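Simple and greedy cycles can be enumerated from the state diagram with a small depth-first search. This is a sketch under stated assumptions: the diagram is rebuilt from the ICV 1011010 with the shift-and-OR rule, the edge labeled m + 1 stands for "8+", and all function names are illustrative.

```python
def build_diagram(icv_bits):
    """State diagram as {state: {latency: next_state}} with integer states."""
    m, icv = len(icv_bits), int(icv_bits, 2)
    diagram, pending = {}, [icv]
    while pending:
        state = pending.pop()
        if state in diagram:
            continue
        edges = {lat: (state >> lat) | icv
                 for lat in range(1, m + 1)
                 if ((state >> (lat - 1)) & 1) == 0}
        edges[m + 1] = icv                      # the "8+" edge back to the ICV
        diagram[state] = edges
        pending.extend(edges.values())
    return diagram


def simple_cycles(diagram):
    """Return every simple cycle as a list of (state, latency) steps."""
    order = {state: i for i, state in enumerate(diagram)}
    found = []

    def walk(start, state, path, visited):
        for lat, nxt in diagram[state].items():
            step = path + [(state, lat)]
            if nxt == start:
                found.append(step)              # closed a cycle
            elif nxt not in visited and order[nxt] > order[start]:
                walk(start, nxt, step, visited | {nxt})

    for start in diagram:                       # each cycle is found from its first state only
        walk(start, start, [], {start})
    return found


def is_greedy(diagram, cycle):
    """True if every step uses the minimum latency leaving its state."""
    return all(lat == min(diagram[state]) for state, lat in cycle)


diagram = build_diagram("1011010")
cycles = simple_cycles(diagram)
print(sorted(tuple(lat for _, lat in c) for c in cycles))
# [(1, 8), (3,), (3, 8), (6,), (6, 8), (8,)]
print(sorted(tuple(lat for _, lat in c) for c in cycles if is_greedy(diagram, c)))
# [(1, 8), (3,)]
```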
19. MAL (Minimum Average Latency)
The minimum average latency obtained from the greedy cycles
Lower-bounded by the maximum number of checkmarks in any row of the reservation table
Lower than or equal to the average latency of any greedy cycle in the state diagram
The average latency of any greedy cycle is upper-bounded by the number of 1's in the initial collision vector plus 1
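The bounds can be checked numerically for reservation table X. The sketch assumes the greedy cycles (1, 8) and (3), which follow from the state diagram for ICV 1011010; the variable names are illustrative.

```python
from fractions import Fraction

RESERVATION_X = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
ICV = "1011010"
# Assumption: greedy cycles read off the state diagram built from ICV.
GREEDY_CYCLES = [(1, 8), (3,)]


def average_latency(cycle):
    return Fraction(sum(cycle), len(cycle))


# MAL taken as the smallest average latency among the greedy cycles.
mal = min(average_latency(c) for c in GREEDY_CYCLES)

# Lower bound: maximum number of marks in any row of the reservation table.
lower_bound = max(len(cycles) for cycles in RESERVATION_X.values())

# Upper bound on any greedy cycle's average latency: number of 1's in the ICV plus 1.
upper_bound = ICV.count("1") + 1

print(mal, lower_bound, upper_bound)  # 3 3 5
```

Here the MAL of 3 coincides with the lower bound, so the constant cycle (3) already gives the best possible schedule for this table.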
20. Schedule Optimization
A greedy cycle alone does not guarantee an optimal MAL, so the lower bound on the MAL is used as the target
Optimize the reservation table so that the MAL reaches this lower bound
21. Reference
Hwang, Kai, and Fayé A. Briggs. Computer Architecture and Parallel Processing. McGraw-Hill, 1984.
Hennessy, John L., and David A. Patterson. Computer Architecture: A Quantitative Approach. Elsevier, 2011.