Loop Alignment
(Advanced Compilers)
By Isha Pandya and Sumita Das

Introduction
Loop distribution eliminates loop-carried dependences by
executing the sources of all dependences before executing any
sinks.
Many carried dependences are due to array alignment issues.
If we can align all references, the dependences go away and
parallelism becomes possible.
For example:
DO I = 2,N
A(I) = B(I)+C(I)
D(I) = A(I-1)*2.0
ENDDO

This loop cannot be run in parallel because the value of A
computed on iteration I is used on iteration I+1.
The two statements can be aligned to compute and use the
values in the same iteration by adding an extra iteration and
adjusting the indices of one of the statements to produce:
DO I = 1,N
IF (I .GT. 1) A(I) = B(I)+C(I)
IF (I .LT. N) D(I+1) = A(I)*2.0
ENDDO
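The effect of alignment can be checked with a small simulation. The following is a minimal Python sketch, not part of the original slides: the helper names (make_data, original, aligned) and the test data are hypothetical, and the aligned loop is assumed to run over I = 1..N with guards I > 1 and I < N so that it writes exactly A(2..N) and D(2..N), the same elements as the original loop. Running the iterations in a shuffled order stands in for parallel execution.

```python
import random

def make_data(n):
    """Build 1-based test arrays (index 0 is unused padding)."""
    a = [100.0 + i for i in range(n + 2)]
    b = [float(i) for i in range(n + 2)]
    c = [10.0 * i for i in range(n + 2)]
    d = [0.0] * (n + 2)
    return a, b, c, d

def original(n, iters):
    """DO I = 2,N ; A(I) = B(I)+C(I) ; D(I) = A(I-1)*2.0"""
    a, b, c, d = make_data(n)
    for i in iters:
        a[i] = b[i] + c[i]
        d[i] = a[i - 1] * 2.0
    return a, d

def aligned(n, iters):
    """Aligned form over I = 1..N (assumed bounds, see the text above)."""
    a, b, c, d = make_data(n)
    for i in iters:
        if i > 1:
            a[i] = b[i] + c[i]
        if i < n:
            d[i + 1] = a[i] * 2.0
    return a, d

n = 8
reference = original(n, range(2, n + 1))   # sequential original loop

shuffled = list(range(1, n + 1))
random.shuffle(shuffled)                   # arbitrary iteration order
aligned_ok = aligned(n, shuffled) == reference

# The original loop, run backwards, reads stale values of A.
original_ok = original(n, range(n, 1, -1)) == reference
print(aligned_ok, original_ok)
```

In this sketch the aligned loop matches the sequential result in any iteration order, while the original loop does not, which is the property that makes the aligned form safe to parallelize.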

Illustration of Loop Alignment

Alignment
Loop alignment does incur some overhead: one extra loop
iteration and the extra work required to test the conditionals.
This overhead can be reduced by executing the last iteration of
the first statement with the first iteration of the second
statement:
DO I = 2,N
J = MOD(I+N-4,N-1)+2
A(J) = B(J)+C(J)
D(I) = A(I-1)*2.0
ENDDO

For every iteration other than the first, J is one less than I, so
the assignment writes A(I-1), the element that the second
statement reads in the same iteration.
On the first iteration (I = 2), J = N, so the assignment
to the last location of A is correctly executed.
As a result, the total number of loop iterations is restored to its
original count, but there is still the overhead of the MOD
calculation.
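The index arithmetic of the MOD-based loop is easy to get wrong, so it helps to tabulate J. The Python sketch below is illustrative only: the function names and test data are hypothetical, C is treated as an array as in the original loop, and Python's % operator is used in place of MOD (the two agree here because both operands are non-negative for N >= 2).

```python
def make_data(n):
    """Build 1-based test arrays (index 0 is unused padding)."""
    a = [100.0 + i for i in range(n + 2)]
    b = [float(i) for i in range(n + 2)]
    c = [10.0 * i for i in range(n + 2)]
    d = [0.0] * (n + 2)
    return a, b, c, d

def original(n):
    """Sequential reference: DO I = 2,N ; A(I) = B(I)+C(I) ; D(I) = A(I-1)*2.0"""
    a, b, c, d = make_data(n)
    for i in range(2, n + 1):
        a[i] = b[i] + c[i]
        d[i] = a[i - 1] * 2.0
    return a, d

def mod_aligned(n, iters):
    """MOD-based aligned loop; iters may be any ordering of 2..N."""
    a, b, c, d = make_data(n)
    for i in iters:
        j = (i + n - 4) % (n - 1) + 2   # J = MOD(I+N-4, N-1) + 2
        a[j] = b[j] + c[j]
        d[i] = a[i - 1] * 2.0
    return a, d

n = 6
# J for I = 2..N: the first iteration handles A(N), the rest handle A(I-1).
j_sequence = [(i + n - 4) % (n - 1) + 2 for i in range(2, n + 1)]
print(j_sequence)

same_forward = mod_aligned(n, range(2, n + 1)) == original(n)
same_backward = mod_aligned(n, range(n, 1, -1)) == original(n)
print(same_forward, same_backward)
```

For N = 6 the sequence of J values is N, 2, 3, ..., N-1, and the loop gives the same result whether the iterations run forwards or backwards.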

Alternatively, the conditional statements can be eliminated
without adding calls to MOD by peeling off the first and last
executions for each of the statements, yielding:
D(2) = A(1)*2.0
DO I = 2,N-1
A(I) = B(I)+C(I)
D(I+1) = A(I)*2.0
ENDDO
A(N) = B(N)+C(N)
This form permits efficient parallelism with the added
overhead of two statements, one before and one after the
loop, that cannot be executed in parallel.
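As a sanity check on the peeled form, the sketch below compares it with the original DO I = 2,N loop in Python. The function names and test data are hypothetical, and the comparison assumes N >= 2; for N = 2 the peeled loop body runs zero times and only the two peeled statements execute.

```python
def make_data(n):
    """Build 1-based test arrays (index 0 is unused padding)."""
    a = [7.0 * i + 1.0 for i in range(n + 2)]
    b = [float(i) for i in range(n + 2)]
    c = [3.0 * i for i in range(n + 2)]
    d = [0.0] * (n + 2)
    return a, b, c, d

def original(n):
    """Sequential reference: DO I = 2,N ; A(I) = B(I)+C(I) ; D(I) = A(I-1)*2.0"""
    a, b, c, d = make_data(n)
    for i in range(2, n + 1):
        a[i] = b[i] + c[i]
        d[i] = a[i - 1] * 2.0
    return a, d

def peeled(n, reverse=False):
    """Peeled form; reverse=True runs the loop body backwards to mimic
    an arbitrary (parallel) iteration order."""
    a, b, c, d = make_data(n)
    d[2] = a[1] * 2.0                      # peeled first execution of the D statement
    body = range(2, n)                     # I = 2..N-1
    for i in (reversed(body) if reverse else body):
        a[i] = b[i] + c[i]
        d[i + 1] = a[i] * 2.0
    a[n] = b[n] + c[n]                     # peeled last execution of the A statement
    return a, d

results = [peeled(n) == original(n) and peeled(n, reverse=True) == original(n)
           for n in (2, 3, 8)]
print(results)
```

Each loop iteration here writes A(I) and immediately reads it, so no value crosses an iteration boundary and the order of iterations does not matter.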

It is not possible to use alignment to eliminate all carried
dependences in a loop if the carried dependence is involved in
a recurrence, as the following example shows:
DO I = 1, N
A(I) = B(I) + C
B(I+1) = A(I) + D
ENDDO
In this example, the references to B create a carried
dependence.
For alignment to be successful in this case, we would need to
interchange the order of the two statements in the loop body.

However, the loop-independent dependence involving A
prevents interchanging the statements before alignment, so our
hope is that we can do the alignment and statement interchange in
a single step to eliminate the carried dependence:
DO I = 1, N+1
IF (I .NE. 1) B(I) = A(I-1) + D
IF (I .NE. N+1) A(I) = B(I) + C
ENDDO
Although B is now aligned, the references to A are misaligned,
creating a new carried dependence.
Looking at this example, it is reasonable to believe that loop
alignment cannot eliminate carried dependences in a recurrence.

Theorem
Alignment, replication, and statement reordering are
sufficient to eliminate all carried dependences in a single
loop containing no recurrence, and in which the distance of
each dependence is a constant independent of the loop
index.
We can establish this constructively.
Let G = (V, E, δ) be a weighted graph. Each v ∈ V is a
statement, and δ(v1, v2) is the dependence distance
between v1 and v2. Let o: V → Z give the offset of
vertices.
G is said to be carry free if o(v1) + δ(v1, v2) = o(v2) for
every edge (v1, v2) ∈ E.
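The carry-free condition can be written as a one-line check. In this Python sketch the representation is an assumption made for illustration: edges are (v1, v2, distance) triples, offsets are a dictionary, and the statement labels S1 and S2 are hypothetical names for the two statements of a loop body.

```python
def is_carry_free(edges, offset):
    """Return True if o(v1) + distance == o(v2) holds for every edge.

    edges  : iterable of (v1, v2, distance) triples
    offset : dict mapping each statement to its integer offset o(v)
    """
    return all(offset[v1] + dist == offset[v2] for v1, v2, dist in edges)

# One carried dependence of distance 1 from S1 to S2
# (as in A(I) = ... followed by D(I) = A(I-1)*2.0).
single = [("S1", "S2", 1)]
unaligned = is_carry_free(single, {"S1": 0, "S2": 0})
aligned = is_carry_free(single, {"S1": 0, "S2": 1})

# Two dependences between the same statements with different distances:
# no single pair of offsets satisfies both constraints.
conflict = [("S1", "S2", 0), ("S1", "S2", 1)]
conflict_results = [is_carry_free(conflict, {"S1": 0, "S2": k}) for k in (0, 1)]
print(unaligned, aligned, conflict_results)
```

The last case shows why offsets alone are not always enough: satisfying one edge breaks the other, which is where replication comes in.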

Can carried dependences that are not involved in a recurrence
always be eliminated by alignment without introducing
new carried dependences?
No, because of the possibility of an alignment conflict: two or
more dependences that cannot be simultaneously aligned.
Consider the following example:
DO I = 1, N
A(I+1) = B(I) + C
X(I)= A(I+1) + A(I)
ENDDO
This loop contains two dependences involving the array A, one
loop-independent dependence and a loop-carried dependence.

If the statements are aligned to eliminate the carried
dependence, the following code results:
DO I = 0, N
IF (I .NE. 0) A(I+1) = B(I) + C
IF (I .NE. N) X(I+1)= A(I+2) + A(I+1)
ENDDO
The original loop-carried dependence has been eliminated,
but the process of eliminating it has transformed the original
loop-independent dependence into a loop-carried dependence.
The loop still cannot be correctly run in parallel.

Alignment Procedure
procedure Align(V,E,δ,0)
  while V is not empty
    remove element v from V
    for each (w,v) ∈ E
      if w ∈ V
        W ← W ∪ {w}
        o(w) ← o(v) - δ(w,v)
      else if o(w) != o(v) - δ(w,v)
        create vertex w'
        replace (w,v) with (w',v)
        replicate all edges into w onto w'
        W ← W ∪ {w'}
        o(w') ← o(v) - δ(w,v)
    for each (v,w) ∈ E
      if w ∈ V
        W ← W ∪ {w}
        o(w) ← o(v) + δ(v,w)
      else if o(w) != o(v) + δ(v,w)
        create vertex v'
        replace (v,w) with (v',w)
        replicate edges into v onto v'
        W ← W ∪ {v'}
        o(v') ← o(w) - δ(v,w)
end align
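The offset-propagation half of the procedure can be sketched in Python. This is a simplified illustration under stated assumptions, not the procedure itself: it only assigns offsets along dependence edges and reports the edges whose constraint cannot be met, where the procedure above would instead create a replicated vertex. The function name propagate_offsets, the (v1, v2, distance) edge format, and the labels S1 and S2 are hypothetical.

```python
from collections import deque

def propagate_offsets(vertices, edges):
    """Assign offsets so that o(v1) + distance == o(v2) where possible.

    Returns (offset, conflicts). conflicts lists the edges whose constraint
    could not be satisfied; a full alignment procedure would replicate a
    statement for each of them rather than just report it.
    """
    offset = {}
    conflicts = []
    for start in vertices:
        if start in offset:
            continue
        offset[start] = 0                  # arbitrary base offset per component
        work = deque([start])
        while work:
            v = work.popleft()
            for v1, v2, dist in edges:
                if v1 == v:                # edge out of v: o(v2) = o(v) + dist
                    other, wanted = v2, offset[v] + dist
                elif v2 == v:              # edge into v: o(v1) = o(v) - dist
                    other, wanted = v1, offset[v] - dist
                else:
                    continue
                if other not in offset:
                    offset[other] = wanted
                    work.append(other)
                elif offset[other] != wanted and (v1, v2, dist) not in conflicts:
                    conflicts.append((v1, v2, dist))
    return offset, conflicts

stmts = ["S1", "S2"]
simple = propagate_offsets(stmts, [("S1", "S2", 1)])
conflict = propagate_offsets(stmts, [("S1", "S2", 0), ("S1", "S2", 1)])
recurrence = propagate_offsets(stmts, [("S1", "S2", 0), ("S2", "S1", 1)])
print(simple, conflict, recurrence)
```

In this sketch the single carried dependence is aligned without conflict, the pair of dependences with distances 0 and 1 leaves one edge unsatisfied (the alignment-conflict case, which replication is meant to resolve), and the recurrence also leaves an unsatisfied edge, in line with the observation that alignment cannot break a recurrence.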

References
[1] Randy Allen and Ken Kennedy, "Optimizing Compilers for Modern
Architectures", Chapter 6: Creating Coarse-Grained Parallelism,
1st Edition.
