Cgo2007 P3 3 Birkbeck

From aiQUANT, 7 months ago

A Dimension Abstraction Approach to Vectorization in Matlab

Slideshow transcript

Slide 1: A Dimension Abstraction Approach to Vectorization in Matlab
  Neil Birkbeck, Jonathan Levesque, Jose Nelson Amaral
  Computing Science, University of Alberta, Edmonton, Alberta, Canada

Slide 2: Problem
  Problem statement: generate equivalent, error-free vectorized source code for Matlab source, while utilizing higher-level matrix operations when possible to improve efficiency.

Slide 3: Motivation
  Loop-based code is slower than vector code in Matlab.
    Loop version:
      n=1000;
      for i=1:n, A(i)=B(i)+C(i); end
    Vector version (5x faster!):
      n=1000;
      A(1:n)=B(1:n)+C(1:n);
  Why?
    - interpretive overhead (type/shape checking, ...)
    - resizing of arrays in loops
  Vectorization is also useful for compiled Matlab code, where optimized vector routines could be substituted.

Slide 4: Related Work
  Data dependence vectorization
    - Allen & Kennedy's Codegen algorithm
    - Build data dependence graph
    - Topological visit of strongly connected components
  Abstract Matrix Form (AMF) [Menon & Pingali]
    - axioms used to transform array code
    - take advantage of matrix multiplication
    - Not clear if it is easily extensible or allows for vectorization of irregular access (e.g., access to the diagonal)

Slide 5: Incorrect Vectorization
  Example 1:
    for i=1:n,
      a(i)=b(i)+c(i);
    end
  Pull the statement out of the loop with index variable substitution (i -> 1:n):
    a(1:n)=b(1:n)+c(1:n)
  The vectorization is correct if a, b, and c are all row vectors or all column vectors.
  If this is not true, the vectorized code will introduce an error!
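The failure described in Example 1 can be reproduced with a short script. This is only a sketch: the data values and the names a_loop, a_vec, and failed are illustrative assumptions, not part of the slides. On older Matlab releases the mixed-orientation addition itself should be rejected; on releases with implicit expansion, and on Octave, the sum becomes an n-by-n matrix and the assignment should be rejected instead.

```matlab
n = 4;
b = 1:n;          % row vector, dim (1,*)
c = (1:n)';       % column vector, dim (*,1)

% Loop form: every iteration adds two scalars, so it always works.
a_loop = zeros(1, n);
for i = 1:n
    a_loop(i) = b(i) + c(i);
end

% Naive vectorized form: b(1:n) is 1-by-n but c(1:n) is n-by-1.
failed = false;
try
    a_vec = zeros(1, n);
    a_vec(1:n) = b(1:n) + c(1:n);
catch
    failed = true;   % orientations disagree, so the statement is rejected
end
```

The loop runs without complaint, which is why a purely dependence-based vectorizer would not notice the problem.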

Slide 6: Incorrect Vectorization
  Example 2:
    for i=1:n,
      x(i)=y(i,h)*z(h,i);
    end
  Matlab is untyped.
  The vectorization depends on whether h is a vector or a scalar.
    If h is a scalar:  x(1:n)=y(1:n,h).*z(h,1:n)';
    Otherwise:         x(1:n)=sum(y(1:n,h).*z(h,1:n)',2);
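Both vectorized forms of Example 2 can be checked against the loop. The sketch below assumes small integer test matrices built with magic so that the comparison is exact; the names x_loop, x_vec, scalar_ok, and vector_ok are illustrative.

```matlab
n = 5;
y = magic(n);          % illustrative n-by-n integer data
z = magic(n)';

% Case 1: h is a scalar, so y(i,h)*z(h,i) multiplies two scalars.
h = 3;
x_loop = zeros(n, 1);
for i = 1:n
    x_loop(i) = y(i,h) * z(h,i);
end
x_vec = zeros(n, 1);
x_vec(1:n) = y(1:n,h) .* z(h,1:n)';
scalar_ok = isequal(x_loop, x_vec);

% Case 2: h is a vector, so y(i,h)*z(h,i) is an inner product.
h = [2 4];
x_loop = zeros(n, 1);
for i = 1:n
    x_loop(i) = y(i,h) * z(h,i);
end
x_vec = zeros(n, 1);
x_vec(1:n) = sum(y(1:n,h) .* z(h,1:n)', 2);
vector_ok = isequal(x_loop, x_vec);
```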

Slide 7: Overview of Solution (flowchart)
  1. A data dependence-based vectorizer identifies a vectorizable statement.
  2. Using knowledge of the shape of variables, propagate dimensionality up the parse tree.
  3. Do the dimensions agree?
       Yes: perform transformations and output the vector statement.
       No:  leave the statement in the loop.

Slide 8: More Specifically
  Represent the dimensionality of expressions as a list of symbols: 1 or "*" (>1).
  Examples:
    Type          dim
    scalar        (1)
    1xn vector    (1,*)
    nx1 vector    (*,1), (*)
    mxn matrix    (*,*)
  Assume the dimensionality is known for variables.
  Propagate up the parse tree according to Matlab rules.
  Compatibility: dim(A) ≈ dim(B) when the lists are equivalent (after removal of redundant 1's).
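The compatibility test can be sketched in a few lines. Everything here is an assumption made for illustration: dimensionalities are encoded as comma-separated strings, the names strip_ones and compatible are hypothetical, and a "redundant" 1 is taken to mean a trailing singleton entry (the slides do not spell out which 1's count as redundant).

```matlab
% Dimensionalities as comma-separated strings, e.g. '*,1' for (*,1).
% Assumption: only trailing 1's are redundant, as in (*,1) vs. (*).
strip_ones = @(d) regexprep(d, '(,1)+$', '');
compatible = @(a, b) strcmp(strip_ones(a), strip_ones(b));

col_ok  = compatible('*,1', '*');      % nx1 vector written two ways
mat_ok  = compatible('*,*', '*,*,1');  % matrix with a trailing 1
row_col = compatible('1,*', '*,1');    % row vector vs. column vector
```

Under this reading a row vector and a column vector are not compatible, which matches the error case of Example 1.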

Slide 9: Vectorized Dimensionality
  Vectorized dimensionality:
    - representation of dimensions after vectorization of a loop
    - denoted dim_i for a loop with index variable i
    - introduce a new symbol r_i for index variable i
  Example loop:
    for i=1:n,
      a(i)=10+i;
    end
  exp     dim(exp)   vectorized   dim_i(exp)
  10      (1)        10           (1)
  i       (1)        1:n          (1,r_i)
  a       (*)        a            (*)
  a(i)    (1)        a(1:n)       (r_i)

Slide 10: Vectorized Dimensionality
  Expressions with incompatible vectorized dimensionality should not be vectorized.
  When do dimensionalities agree?
    Assignment expressions, e_lhs = e_rhs:
      dim_i(e_lhs) ≈ dim_i(e_rhs)  ||  dim_i(e_rhs) ≈ (1)
    Element-wise binary operators, e = e_lhs Θ e_rhs, with Θ in {+, -, .*, ...}:
      dim_i(e_lhs) ≈ (1)  ||  dim_i(e_rhs) ≈ (1)  ||  dim_i(e_lhs) ≈ dim_i(e_rhs)
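The two agreement rules translate directly into predicates. In this sketch the string encoding of dimensionalities, the trailing-1 reading of "redundant 1's", and the names approx, agree_binop, and agree_assign are all assumptions for illustration.

```matlab
% Dimensionalities as comma-separated strings, e.g. 'ri,rj' for (r_i,r_j).
strip_ones = @(d) regexprep(d, '(,1)+$', '');
approx     = @(a, b) strcmp(strip_ones(a), strip_ones(b));

% Element-wise operator: either operand is scalar, or both are compatible.
agree_binop  = @(l, r) approx(l, '1') || approx(r, '1') || approx(l, r);
% Assignment: both sides compatible, or the right-hand side is scalar.
agree_assign = @(l, r) approx(l, r) || approx(r, '1');

ok_scalar   = agree_binop('1', '1,ri');       % 10 + (1:n)
ok_same     = agree_binop('ri,rj', 'ri,rj');  % identical shapes
bad_swapped = agree_binop('ri,rj', 'rj,ri');  % swapped index order
ok_fill     = agree_assign('ri', '1');        % scalar assigned to a range
```

The swapped case is rejected by these rules even though a transpose would make it valid.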

Slide 11: Vectorized Dimensionality
  The rules are very restrictive. Assume dim(A)=dim(B)=dim(C)=(*,*):
    for i=1:100,
      for j=1:100
        A(i,j)=B(j,i)+C(i,j);
      end
    end
  dim_{i,j}(B(j,i)) = (r_j,r_i)
  dim_{i,j}(C(i,j)) = (r_i,r_j)
  Vectorization fails because (r_i,r_j) is not compatible with (r_j,r_i).

Slide 12: Transpose Transformation
  The extension to utilize transpose when necessary is straightforward.
  For assignment: if dim_i(A) ≈ reverse(dim_i(B)) then A=B^T is allowable.
    for i=1:m,
      for j=1:n
        A(i,j)=B(j,i);
      end
    end
  dim_{i,j}(A) = reverse(dim_{i,j}(B)) = (r_i,r_j)
  Vectorized:
    A(1:m,1:n)=(B(1:n,1:m))'
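A quick numerical check of the transposed assignment, as a sketch with assumed sizes and data (m, n, and the contents of B are illustrative; A_loop and A_vec are hypothetical names):

```matlab
m = 3; n = 4;
B = reshape(1:n*m, n, m);     % illustrative n-by-m data

% Loop form from the slide.
A_loop = zeros(m, n);
for i = 1:m
    for j = 1:n
        A_loop(i,j) = B(j,i);
    end
end

% Vectorized form: index ranges substituted, then a transpose applied.
A_vec = zeros(m, n);
A_vec(1:m,1:n) = (B(1:n,1:m))';
```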

Slide 13: Transpose Transformation
  The extension to utilize transpose when necessary is straightforward.
  Similar for pointwise operations:
    - if dim_i(A) ≈ reverse(dim_i(B)) then A Θ B^T is allowable; propagate dim_i(A Θ B^T) = dim_i(A)
    - if reverse(dim_i(A)) ≈ dim_i(B) then A^T Θ B is allowable; propagate dim_i(A^T Θ B) = dim_i(B)

Slide 14: Pattern Database
  Dimensionality disagreement at binary operators inhibits vectorization.
  Recognizing patterns (consisting of operator type and operand dimensionalities) can be used to identify a transformation enabling vectorization.
  Pattern:
    lhs          operation   rhs        output
    (r_i,r_j)    Θ           (r_i,1)    (r_i,r_j)
  Example:
    for i=1:m,
      for j=1:n,
        A(i,j)=B(i,j)+C(i);
      end
    end
  Transformed result:
    B(i,j)+C(i)  ->  B(1:m,1:n)+repmat(C(1:m),1,n);
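The repmat transformation can be checked on small data. This sketch assumes C is a column vector, matching the (r_i,1) operand in the pattern; the sizes, values, and the names A_loop and A_vec are illustrative.

```matlab
m = 3; n = 4;
B = reshape(1:m*n, m, n);     % illustrative m-by-n data
C = (10:10:10*m)';            % column vector, dim (r_i,1)

% Loop form from the slide.
A_loop = zeros(m, n);
for i = 1:m
    for j = 1:n
        A_loop(i,j) = B(i,j) + C(i);
    end
end

% Transformed form: replicate C across n columns so the shapes agree.
A_vec = zeros(m, n);
A_vec(1:m,1:n) = B(1:m,1:n) + repmat(C(1:m), 1, n);
```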

Slide 15: Pattern Database
  Diagonal access pattern:
    lhs          operation   rhs    output
    (r_i,r_i)    (index)     nil    (1,r_i)
  Example:
    for i=1:n,
      a(i)=A(i,i)*b(i);
    end
  Transformed result (column-major indexing of A):
    a(1:n)=A((1:n)+size(A,1)*((1:n)-1)).*b(1:n);
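The diagonal pattern relies on the column-major linear index of A(i,i) being i + size(A,1)*(i-1). A small check, assuming b is a row vector so that it matches the (1,r_i) output shape (the data and the names idx, a_loop, and a_vec are illustrative):

```matlab
n = 4;
A = magic(n);                 % illustrative n-by-n data
b = 1:n;                      % row vector

% Loop form from the slide.
a_loop = zeros(1, n);
for i = 1:n
    a_loop(i) = A(i,i) * b(i);
end

% Linear indices of the diagonal in column-major storage.
idx = (1:n) + size(A,1)*((1:n)-1);
a_vec = zeros(1, n);
a_vec(1:n) = A(idx) .* b(1:n);
```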

Slide 16: Additive Reduction Statements
  Additive-reduction statements use a loop variable to perform an accumulation.
  Not all loop nest index variables appear in the output dimensionality.
  General form:
    for i1=…,
      for i2=…,
        …
          for ik=…
            A(J)=A(J)+E;
          end
        …
      end
    end
  Loop nest variables: I={i1,i2,…,ik}; J is a subset of I.
  Example, with I={i,j} and J={i}:
    for i=1:m,
      for j=1:n,
        a(i)=a(i)+B(i,j);
      end
    end

Slide 17: Additive Reduction (Solution)
  Maintain/propagate dimensionality and reduced variables for an expression.
  ρ(E) denotes the reduced variables for expression E.
  When checking a statement A(J)=A(J)+E:
    - ensure dim_{i1,i2,…,ik}(A(J)) ≈ dim_{i1,i2,…,ik}(E) and ρ(E)=I-J
    - any variable r_i in I-J but not in ρ(E) must be reduced
  Example 1 (I={i}, J={}, I-J={i}):
    for i=1:m
      a=a+b(i);
    end
    ρ(b(i))={}; r_i is in dim_i(b(i))=(r_i,1)
    Reduce: b(i) -> sum(b(i),1)
    Vectorize: a=a+sum(b(1:m));
  Example 2 (I-J={i}):
    for i=1:m
      a=a+10;
    end
    ρ(10)={}; r_i is not in dim_i(10)
    Reduce: 10 -> m*10, ρ(m*10)={r_i}
    Vectorize: a=a+m*10;
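Both reductions can be confirmed numerically. The sketch below assumes b is a column vector and starts the accumulators at zero; the names a_loop, a_vec, c_loop, and c_vec are illustrative.

```matlab
m = 6;
b = (1:m)';                   % column vector, dim (r_i,1)

% a = a + b(i): r_i appears in the dimensionality, so reduce with sum.
a_loop = 0;
for i = 1:m
    a_loop = a_loop + b(i);
end
a_vec = 0 + sum(b(1:m));

% a = a + 10: r_i does not appear, so the constant is scaled by m.
c_loop = 0;
for i = 1:m
    c_loop = c_loop + 10;
end
c_vec = 0 + m*10;
```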

Slide 18: Additive Reduction via Matrix Multiplication
  Matrix multiplication can be used to perform reductions on e=e_lhs*e_rhs, provided:
    1. dim_{i1,…,ik}(e_lhs)=(S_l,r_k)
    2. dim_{i1,…,ik}(e_rhs)=(r_k,S_r)
    3. r_k is a reduction variable.
  Implies:
    dim_{i1,…,ik}(e)=(S_l,S_r)
    ρ(e)=union(ρ(e_lhs), ρ(e_rhs), {r_k})
  Example:
    for i=1:m
      for j=1:n
        a(i)=a(i)+B(i,j)*x(j);
      end
    end
    - j is used for reduction
    - dim_{i,j}(B(i,j))=(r_i,r_j)
    - dim_{i,j}(x(j))=(r_j)
  Vectorized:
    a(1:m)=a(1:m)+...
      B(1:m,1:n)*x(1:n);
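The matrix-vector example can be checked directly. This sketch assumes a and x are column vectors and uses small integer data so the comparison is exact; the initial value of a and the names a_loop and a_vec are illustrative.

```matlab
m = 3; n = 4;
B = reshape(1:m*n, m, n);     % illustrative m-by-n data
x = (1:n)';                   % column vector, dim (r_j)

% Loop form from the slide: j is the reduction variable.
a_loop = ones(m, 1);
for i = 1:m
    for j = 1:n
        a_loop(i) = a_loop(i) + B(i,j)*x(j);
    end
end

% Vectorized form: the reduction over j becomes a matrix-vector product.
a_vec = ones(m, 1);
a_vec(1:m) = a_vec(1:m) + B(1:m,1:n)*x(1:n);
```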

Slide 19: Additive Reduction Example
  Additive reduction example:
    for i=1:m,
      for j=1:n,
        d(i)=d(i)+a(i,j)*b(j)+c(i,j)
      end
    end
  Step by step:
    ρ(a(i,j))={}, dim_{i,j}(a(i,j))=(r_i,r_j)
    ρ(b(j))={}, dim_{i,j}(b(j))=(r_j)
    r_j is the reduction variable; use matrix multiplication to reduce r_j:
      ρ(a(i,j)*b(j))={r_j}, dim_{i,j}(a(i,j)*b(j))=(r_i)
    ρ(c(i,j))={}, dim_{i,j}(c(i,j))=(r_i,r_j)
    Need to reduce r_j: c(i,j) -> sum(c(i,j),2)
      ρ(a(i,j)*b(j)+sum(c(i,j),2))={r_j}, dim_{i,j}(a(i,j)*b(j)+sum(c(i,j),2))=(r_i)
  Dimensionality and reduced variables agree; now replace the index variables:
    d(1:m)=d(1:m)+a(1:m,1:n)*b(1:n)+sum(c(1:m,1:n),2);
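The combined example can be verified on small integer data. The sketch assumes b and d are column vectors; the sizes, values, and the names d_loop and d_vec are illustrative.

```matlab
m = 3; n = 4;
a = reshape(1:m*n, m, n);         % illustrative m-by-n data
b = (1:n)';                       % column vector, dim (r_j)
c = reshape(m*n:-1:1, m, n);      % illustrative m-by-n data

% Loop form from the slide.
d_loop = zeros(m, 1);
for i = 1:m
    for j = 1:n
        d_loop(i) = d_loop(i) + a(i,j)*b(j) + c(i,j);
    end
end

% Vectorized form: matrix product reduces r_j for a*b, sum reduces it for c.
d_vec = zeros(m, 1);
d_vec(1:m) = d_vec(1:m) + a(1:m,1:n)*b(1:n) + sum(c(1:m,1:n), 2);
```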

Slide 20: Implementation
  Prototype vectorizer (flowchart). Input: original loop. Output: vectorized loop.
  Stages and decisions named in the diagram: Octave parser, embedded control statements (yes/no), create DDG, vectorize statement, dimension check, success (yes/no), code generator.
  The pattern database and corresponding transformations are specified in a modular, end-user extensible manner.

Slide 21: Results
  Source-to-source transformation.
  Timing results averaged over 100 runs.
  Platform:
    - Matlab 7.2.0.283
    - 3.0 GHz Pentium D processor

Slide 22: Results
  Histogram equalization:
  Input source:
    h=hist(im(:),[0:255]); %histogram
    heq=255*cumsum(h(:))/sum(h(:));
    for i=1:size(im,1),
      for j=1:size(im,2),
        im2(i,j)=heq(im(i,j)+1);
      end
    end
  Vectorized result:
    h=hist(im(:),[(0:255)]);
    heq=255*cumsum(h(:))/sum(h(:));
    im2(1:size(im,1),1:size(im,2))=...
      heq(im(1:size(im,1),1:size(im,2))+1);
  For a monochrome 8-bit 800x600 image (original/vectorized):
    Entire routine:    0.178s/0.114s   (speedup: 1.56)
    Loop portion only: 0.0814s/0.0176s (speedup: 4.6)
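The vectorized lookup works because indexing a vector with a matrix of indices returns a result shaped like the index matrix. A sketch on a tiny synthetic "image" follows; the 3-by-4 test data and the names im2_loop and im2_vec are assumptions, and no timing is attempted.

```matlab
% Illustrative 3-by-4 "image" with integer values in 0..255.
im = mod(reshape(0:11, 3, 4)*37, 256);

h = hist(im(:), 0:255);               % histogram with one bin per gray level
heq = 255*cumsum(h(:))/sum(h(:));     % equalization lookup table (256-by-1)

% Loop form from the slide.
im2_loop = zeros(size(im));
for i = 1:size(im,1)
    for j = 1:size(im,2)
        im2_loop(i,j) = heq(im(i,j)+1);
    end
end

% Vectorized form: one gather through the lookup table.
im2_vec = zeros(size(im));
im2_vec(1:size(im,1),1:size(im,2)) = ...
    heq(im(1:size(im,1),1:size(im,2))+1);
```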

Slide 23: Results (Menon & Pingali Examples)
  Example 1
    Input:
      for k=1:p,
        for j=1:(i-1),
          X(i,k)=X(i,k)-L(i,j)*X(j,k);
        end
      end
    Output:
      X(i,1:p)=X(i,1:p)-L(i,1:i-1)*X(1:i-1,1:p);
  Example 2
    Input:
      for i=1:N,
        for j=1:N
          phi(k)=phi(k)+a(i,j)*x_se(i)*f(j);
        end
      end
    Output:
      phi(k)=phi(k)+sum(a(1:N,1:N)'*x_se(1:N).*f(1:N),1);
  Example 3
    Input:
      for i=1:n,
        for j=1:n,
          for k=1:n,
            for l=1:n
              y(i)=y(i)+x(j)*A(i,k)*B(l,k)*C(l,j);
            end
          end
        end
      end
    Output:
      y(1:n)=y(1:n)+x(1:n)'*...
        (A(1:n,1:n)*B(1:n,1:n)'*C(1:n,1:n))';
  Settings        Input time (s)   Output time (s)   Speedup
  i=500, p=5000   0.536            0.030             17
  N=1000          0.174            0.012             14
  n=40            0.622            0.0001            5000
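The four-deep loop nest and its vectorized form can be compared on a small case. This sketch assumes x is a column vector and y is a row vector, which is what the vectorized statement requires; n, the integer test matrices, and the names y_loop and y_vec are illustrative, and no timing is attempted.

```matlab
n = 3;
A = magic(n);                 % illustrative n-by-n integer data
B = reshape(1:n*n, n, n);
C = magic(n)';
x = (1:n)';                   % column vector
 
% Loop form from the slide: j, k, and l are all reduction variables.
y_loop = zeros(1, n);         % row vector
for i = 1:n
    for j = 1:n
        for k = 1:n
            for l = 1:n
                y_loop(i) = y_loop(i) + x(j)*A(i,k)*B(l,k)*C(l,j);
            end
        end
    end
end

% Vectorized form: three reductions expressed as matrix products.
y_vec = zeros(1, n);
y_vec(1:n) = y_vec(1:n) + x(1:n)'*...
    (A(1:n,1:n)*B(1:n,1:n)'*C(1:n,1:n))';
```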

Slide 24: Remaining Issues/Future Work
  - Each pattern transformation is local; there is no optimization over the entire statement (e.g., we do not optimize and distribute transposes).
  - Control flow within a loop.
  - Function calls: functions are treated as pointwise operators (correct for many predefined arithmetic functions).
  - Incorporate our analysis directly with shape analysis.

Slide 25: Summary
  Contributions:
    - A simple method to prevent incorrect vectorization in Matlab.
    - A user-extensible operator/dimensionality pattern database can be used to improve vectorization.
    - These patterns can make use of higher-level semantics (e.g., matrix multiplication) or diagonal accesses in vectorization.

Slide 26: Acknowledgements
  Funding provided by NSERC.
  Grateful for reviewers' comments and suggestions.

Slide 27: Thank You Questions?