# A Dimension Abstraction Approach to Vectorization in Matlab

To help improve code vectorisation in Matlab.

Published in: Technology

1. A Dimension Abstraction Approach to Vectorization in Matlab
   - Neil Birkbeck, Jonathan Levesque, Jose Nelson Amaral
   - Computing Science, University of Alberta, Edmonton, Alberta, Canada
2. Problem
   - Problem statement: generate equivalent, error-free vectorized source code for Matlab source, using higher-level matrix operations where possible to improve efficiency.
3. Motivation
   - Loop-based code is slower than vector code in Matlab. Why?
     - Interpretive overhead (type/shape checking, …)
     - Resizing of arrays in loops
   - Vectorization is also useful for compiled Matlab code, where optimized vector routines could be substituted.
   - Example: `n=1000; for i=1:n, A(i)=B(i)+C(i); end` becomes `n=1000; A(1:n)=B(1:n)+C(1:n);`, which the slide reports as 5x faster.
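The comparison on this slide can be tried with a minimal sketch like the one below. It assumes `B` and `C` are row vectors of random data; the names `A_loop`, `A_vec`, `t_loop` and `t_vec` are introduced here for illustration. Timings depend on the machine and release, so no speedup figure is asserted.

```matlab
n = 1000;
B = rand(1, n);
C = rand(1, n);

% Loop version: A_loop grows one element at a time (no preallocation),
% which is the "resizing of arrays in loops" cost named on the slide.
A_loop = [];
tic;
for i = 1:n
    A_loop(i) = B(i) + C(i);
end
t_loop = toc;

% Vectorized version: one array operation replaces the loop.
A_vec = [];
tic;
A_vec(1:n) = B(1:n) + C(1:n);
t_vec = toc;

fprintf('loop: %.6f s, vectorized: %.6f s\n', t_loop, t_vec);
```

Both versions perform the same additions, so the results should be identical element for element.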
4. Related Work
   - Data dependence vectorization
     - Allen & Kennedy's Codegen algorithm
       - Build the data dependence graph
       - Visit strongly connected components in topological order
   - Abstract Matrix Form (AMF) [Menon & Pingali]
     - Axioms used to transform array code
     - Takes advantage of matrix multiplication
     - Not clear whether it is easily extensible or allows vectorization of irregular accesses (e.g., access to the diagonal)
5. Incorrect Vectorization
   - Example 1: `for i=1:n, a(i)=b(i)+c(i); end`
     - Pull the statement out of the loop and substitute the index variable (i → 1:n): `a(1:n)=b(1:n)+c(1:n)`
   - The vectorization is correct if a, b, and c are all row vectors or all column vectors.
   - If this does not hold, the vectorized code introduces an error.
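The failure described in Example 1 can be reproduced with a small sketch, assuming `b` is a row vector and `c` is a column vector. The exact failure mode depends on the release: with implicit expansion (MATLAB R2016b and later, and recent Octave) the addition yields an n-by-n matrix that the assignment then rejects, while older releases reject the addition itself. The `b(:) + c(:)` repair at the end is only an illustration of restoring agreement by hand, not the transformation the slides propose.

```matlab
n = 4;
b = (1:n);      % row vector, 1-by-n
c = (1:n)';     % column vector, n-by-1

% Loop form: scalar indexing works regardless of orientation.
a_loop = zeros(1, n);
for i = 1:n
    a_loop(i) = b(i) + c(i);
end

% Naive vectorized form: operand shapes disagree, so this errors.
failed = false;
try
    a_vec = zeros(1, n);
    a_vec(1:n) = b(1:n) + c(1:n);
catch
    failed = true;
end

% Illustration only: forcing both operands to columns restores agreement.
a_fix = zeros(1, n);
a_fix(1:n) = b(:) + c(:);
```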
6. Incorrect Vectorization
   - Example 2: `for i=1:n, x(i)=y(i,h)*z(h,i); end`
   - Matlab is untyped.
   - Vectorization depends on whether h is a vector or a scalar.
     - If h is a scalar: `x(1:n)=y(1:n,h).*z(h,1:n)';`
     - Otherwise: `x(1:n)=sum(y(1:n,h).*z(h,1:n)',2);`
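Both vectorized forms from Example 2 can be checked against the loop with a short sketch. The sizes `n` and `m`, the random data, and the index vector `hv` are assumptions chosen for illustration.

```matlab
n = 5; m = 4;
y = rand(n, m);
z = rand(m, n);

% Case 1: h is a scalar, so y(i,h)*z(h,i) is a scalar product.
h = 2;
x_loop = zeros(n, 1);
for i = 1:n
    x_loop(i) = y(i, h) * z(h, i);
end
x_vec = zeros(n, 1);
x_vec(1:n) = y(1:n, h) .* z(h, 1:n)';

% Case 2: h is a vector, so y(i,h)*z(h,i) is an inner product
% and the vectorized form needs a sum along the second dimension.
hv = [1 3 4];
xv_loop = zeros(n, 1);
for i = 1:n
    xv_loop(i) = y(i, hv) * z(hv, i);
end
xv_vec = zeros(n, 1);
xv_vec(1:n) = sum(y(1:n, hv) .* z(hv, 1:n)', 2);
```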
7. Overview of Solution
   - Inputs: a vectorizable statement from the data dependence-based vectorizer, plus knowledge of the shape of variables.
   - Propagate dimensionality up the parse tree.
   - Dimensions agree?
     - No: leave the statement in the loop.
     - Yes: perform transformations and output the vector statement.
8. More Specifically
   - Represent the dimensionality of expressions as a list of symbols, each either 1 or "*" (>1).
     - Assume it is known for variables.
     - Propagate it up the parse tree according to Matlab rules.
   - Compatibility:
     - $\dim(A) \approx \dim(B)$ when the lists are equivalent (after removal of redundant 1's)
   - Examples:

   | dim | Type |
   | --- | --- |
   | $(\ast, \ast)$ | m×n matrix |
   | $(\ast, 1)$, $(\ast)$ | n×1 vector |
   | $(1, \ast)$ | 1×n vector |
   | $(1)$ | scalar |
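The abstraction can be sketched with a few anonymous functions. This is an illustrative encoding, not the authors' implementation: dimensionalities are cell arrays of the strings `'1'` and `'*'`, and "redundant 1's" is assumed to mean trailing singleton dimensions, which matches the table's treatment of (*,1) and (*) as the same type while keeping (1,*) distinct. The names `strip`, `compat`, `symbols` and `absdim` are hypothetical.

```matlab
% Assumption: redundant 1s are trailing singletons; at least one symbol is kept.
strip  = @(d) d(1:max([1, find(~strcmp(d, '1'), 1, 'last')]));

% Two dimensionalities are compatible when they match after stripping.
compat = @(a, b) isequal(strip(a), strip(b));

% Abstract dimensionality of a concrete array: '*' where the extent is > 1.
symbols = {'1', '*'};
absdim  = @(A) symbols((size(A) > 1) + 1);

col = absdim(zeros(3, 1));    % {'*','1'}
row = absdim(zeros(1, 3));    % {'1','*'}
sc  = absdim(7);              % {'1','1'}

disp(compat(col, {'*'}));     % column vector matches (*)
disp(compat(row, col));       % row and column vectors do not match
```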
9. Vectorized Dimensionality
   - Vectorized dimensionality:
     - the representation of dimensions after vectorization of a loop
     - denoted $\dim_i$ for a loop with index variable $i$
     - introduces a new symbol $r_i$ for index variable $i$
   - Example: `for i=1:n, a(i)=10+i; end`

   | exp | $\dim(\text{exp})$ | $\dim_i(\text{exp})$ | vectorized |
   | --- | --- | --- | --- |
   | `a` | $(\ast)$ | $(\ast)$ | `a` |
   | `a(i)` | $(1)$ | $(r_i)$ | `a(1:n)` |
   | `i` | $(1)$ | $(1, r_i)$ | `1:n` |
   | `10` | $(1)$ | $(1)$ | `10` |
10. Vectorized Dimensionality
    - Expressions with incompatible vectorized dimensionality should not be vectorized.
    - When do dimensionalities agree?
      - Assignment expressions $e_{lhs} = e_{rhs}$: $\dim_i(e_{lhs}) \approx \dim_i(e_{rhs})$ or $\dim_i(e_{rhs}) \approx (1)$
      - Element-wise binary operators $e = e_{lhs} \mathbin{\Theta} e_{rhs}$, with $\Theta \in \{+, -, .\ast, \ldots\}$: $\dim_i(e_{lhs}) \approx (1)$ or $\dim_i(e_{rhs}) \approx (1)$ or $\dim_i(e_{lhs}) \approx \dim_i(e_{rhs})$
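The two agreement rules can be written as predicates over symbolic dimensionalities. This is a sketch under stated assumptions: dimensionalities are cell arrays of strings such as `'1'`, `'*'` or `'r_i'`, redundant 1's are taken to be trailing singletons, and the helper names (`strip`, `compat`, `is_scalar`, `assign_ok`, `elementwise_ok`) are hypothetical.

```matlab
% Assumed compatibility test: equal after dropping trailing '1' symbols.
strip  = @(d) d(1:max([1, find(~strcmp(d, '1'), 1, 'last')]));
compat = @(a, b) isequal(strip(a), strip(b));
is_scalar = @(d) compat(d, {'1'});

% Assignment: both sides agree, or the right-hand side is scalar.
assign_ok = @(lhs, rhs) compat(lhs, rhs) || is_scalar(rhs);

% Element-wise operator: either operand is scalar, or both agree.
elementwise_ok = @(lhs, rhs) is_scalar(lhs) || is_scalar(rhs) || compat(lhs, rhs);

% b(i)+c(i) with b and c both rows: operands agree.
ok_rows  = elementwise_ok({'1','r_i'}, {'1','r_i'});
% b a row and c a column: operands disagree, so leave the statement in the loop.
ok_mixed = elementwise_ok({'1','r_i'}, {'r_i','1'});
% a(i) = 10: a scalar right-hand side is always assignable.
ok_const = assign_ok({'r_i'}, {'1'});
```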
11. Vectorized Dimensionality
    - The rules are very restrictive. Assume $\dim(A) = \dim(B) = \dim(C) = (\ast, \ast)$ and consider `for i=1:100, for j=1:100, A(i,j)=B(j,i)+C(i,j); end end`.
      - $\dim_{i,j}(B(j,i)) = (r_j, r_i)$
      - $\dim_{i,j}(C(i,j)) = (r_i, r_j)$
    - Vectorization fails because $(r_i, r_j)$ is not compatible with $(r_j, r_i)$.
12. Transpose Transformation
    - The extension to use a transpose when necessary is straightforward.
      - For assignment: if $\dim_i(A) \approx \mathrm{reverse}(\dim_i(B))$ then $A = B^T$ is allowable.
    - Example: `for i=1:m, for j=1:n, A(i,j)=B(j,i); end end`
      - $\dim_{i,j}(A(i,j)) = \mathrm{reverse}(\dim_{i,j}(B(j,i))) = (r_i, r_j)$
      - Result: `A(1:m,1:n)=(B(1:n,1:m))'`
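A quick numerical check of the transpose assignment, with sizes `m` and `n` and random `B` chosen here for illustration. The `fliplr` comparison mirrors the reverse test on symbolic dimensionalities, encoded as cell arrays of strings (an assumed encoding).

```matlab
m = 3; n = 4;
B = rand(n, m);

% Original loop nest.
A_loop = zeros(m, n);
for i = 1:m
    for j = 1:n
        A_loop(i, j) = B(j, i);
    end
end

% Symbolic check: dim(A(i,j)) equals reverse(dim(B(j,i))).
d_lhs = {'r_i', 'r_j'};
d_rhs = {'r_j', 'r_i'};
needs_transpose = isequal(d_lhs, fliplr(d_rhs));

% Vectorized statement from the slide.
A_vec = zeros(m, n);
A_vec(1:m, 1:n) = (B(1:n, 1:m))';
```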
13. Transpose Transformation
    - The same extension applies to pointwise operations:
      - if $\dim_i(A) \approx \mathrm{reverse}(\dim_i(B))$ then $A \mathbin{\Theta} B^T$ is allowable; propagate $\dim_i(A \mathbin{\Theta} B^T) = \dim_i(A)$
      - if $\mathrm{reverse}(\dim_i(A)) \approx \dim_i(B)$ then $A^T \mathbin{\Theta} B$ is allowable; propagate $\dim_i(A^T \mathbin{\Theta} B) = \dim_i(B)$
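Applied to the statement `A(i,j)=B(j,i)+C(i,j)`, the pointwise rule transposes the `B` operand so that both operands have dimensionality (r_i, r_j). A minimal check, assuming square random matrices of a small illustrative size `n`:

```matlab
n = 4;
B = rand(n);
C = rand(n);

% Original loop nest.
A_loop = zeros(n);
for i = 1:n
    for j = 1:n
        A_loop(i, j) = B(j, i) + C(i, j);
    end
end

% Vectorized with a transpose on the operand whose dimensions are reversed.
A_vec = zeros(n);
A_vec(1:n, 1:n) = (B(1:n, 1:n))' + C(1:n, 1:n);
```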
14. Pattern Database
    - Dimensionality disagreement at binary operators inhibits vectorization.
    - Recognizing patterns (consisting of operator type and operand dimensionalities) can be used to identify a transformation that enables vectorization.
    - Pattern:

    | lhs | operation | rhs | output |
    | --- | --- | --- | --- |
    | $(r_i, r_j)$ | $\Theta$ | $(r_i, 1)$ | $(r_i, r_j)$ |

    - Example: `for i=1:m, for j=1:n, A(i,j)=B(i,j)+C(i); end end`
      - Expression: `B(i,j)+C(i);`
      - Transformed result: `B(1:m,1:n)+repmat(C(1:m),1,n);`
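The `repmat` pattern can be checked numerically. The sketch assumes `C` is a column vector, so that `C(1:m)` is m-by-1 as the (r_i, 1) entry in the pattern requires; sizes and data are illustrative.

```matlab
m = 3; n = 4;
B = rand(m, n);
C = rand(m, 1);    % column vector: C(1:m) has dimensionality (r_i, 1)

% Original loop nest.
A_loop = zeros(m, n);
for i = 1:m
    for j = 1:n
        A_loop(i, j) = B(i, j) + C(i);
    end
end

% Pattern result: replicate the column n times so both operands are m-by-n.
A_vec = zeros(m, n);
A_vec(1:m, 1:n) = B(1:m, 1:n) + repmat(C(1:m), 1, n);
```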
15. Pattern Database
    - Diagonal access pattern:

    | lhs | operation | rhs | output |
    | --- | --- | --- | --- |
    | $(r_i, r_i)$ | (index) | nil | $(1, r_i)$ |

    - Example: `for i=1:n, a(i)=A(i,i)*b(i); end`
      - Result: `a(1:n)=A((1:n)+size(A,1)*((1:n)-1)).*b(1:n);`
      - This uses column-major linear indexing of A.
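The linear-index expression works because element (i,i) of a column-major array sits at position i + size(A,1)*(i-1). A small sketch, assuming a `magic` square for `A` and a row vector `b` so that both operands of `.*` are rows:

```matlab
n = 4;
A = magic(n);
b = 1:n;           % row vector, matching the (1, r_i) output of the pattern

% Original loop.
a_loop = zeros(1, n);
for i = 1:n
    a_loop(i) = A(i, i) * b(i);
end

% Column-major linear indices of the diagonal entries.
idx = (1:n) + size(A, 1) * ((1:n) - 1);

% Vectorized statement from the slide.
a_vec = zeros(1, n);
a_vec(1:n) = A(idx) .* b(1:n);
```

Indexing a matrix with a row vector of linear indices returns a row vector, which is why the pattern's output dimensionality is (1, r_i).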
16. Additive Reduction Statements
    - Additive-reduction statements use a loop variable to perform an accumulation.
      - Not all loop nest index variables appear in the output dimensionality.
    - General form: `for i1=…, for i2=…, … for ik=… A(J)=A(J)+E; … end end end`
      - Loop nest variables: $I = \{i_1, i_2, \ldots, i_k\}$
      - $J$ is a subset of $I$
    - Example: `for i=1:m, for j=1:n, a(i)=a(i)+B(i,j); end end`
      - $I = \{i, j\}$, $J = \{i\}$
17. Additive Reduction (Solution)
    - Maintain and propagate both dimensionality and reduced variables for an expression.
      - $\rho(E)$ denotes the reduced variables for expression $E$.
    - When checking the statement `A(J)=A(J)+E`:
      - ensure $\dim_{i_1,i_2,\ldots,i_k}(A(J)) \approx \dim_{i_1,i_2,\ldots,i_k}(E)$ and $\rho(E) = I - J$
      - any variable $r_i$ in $I - J$ but not in $\rho(E)$ must be reduced
    - Example 1: `for i=1:m a=a+b(i); end`
      - $I = \{i\}$, $J = \{\}$, $I - J = \{i\}$, $\rho(b(i)) = \{\}$
      - $r_i$ appears in $\dim_i(b(i)) = (r_i, 1)$
      - Reduce: `b(i)` → `sum(b(i),1);`
      - Vectorize: `a=a+sum(b(1:m));`
    - Example 2: `for i=1:m a=a+10; end`
      - $I = \{i\}$, $J = \{\}$, $I - J = \{i\}$, $\rho(10) = \{\}$
      - $r_i$ does not appear in $\dim_i(10)$
      - Reduce: `10` → `m*10`, with $\rho(m \cdot 10) = \{r_i\}$
      - Vectorize: `a=a+m*10;`
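Both reductions can be compared with their loops in a few lines. The starting value `a0`, the size `m` and the random column vector `b` are assumptions for illustration.

```matlab
m = 6;
b = rand(m, 1);    % column vector, so dim_i(b(i)) = (r_i, 1)
a0 = 0.5;

% Example 1: the loop index appears in the summand, so reduce with sum.
a_loop = a0;
for i = 1:m
    a_loop = a_loop + b(i);
end
a_vec = a0 + sum(b(1:m));

% Example 2: the summand is loop-invariant, so reduce by multiplying by m.
c_loop = a0;
for i = 1:m
    c_loop = c_loop + 10;
end
c_vec = a0 + m * 10;
```

Floating-point summation order can differ between the loop and `sum`, so results are best compared with a small tolerance.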
18. Additive Reduction via Matrix Multiplication
    - Matrix multiplication can be used to perform reductions on $e = e_{lhs} * e_{rhs}$, provided:
      - $\dim_{i_1,\ldots,i_k}(e_{lhs}) = (S_l, r_k)$
      - $\dim_{i_1,\ldots,i_k}(e_{rhs}) = (r_k, S_r)$
      - $r_k$ is a reduction variable.
    - This implies:
      - $\dim_{i_1,\ldots,i_k}(e) = (S_l, S_r)$
      - $\rho(e) = \rho(e_{lhs}) \cup \rho(e_{rhs}) \cup \{r_k\}$
    - Example: `for i=1:m for j=1:n a(i)=a(i)+B(i,j)*x(j); end end`
      - j is used for reduction
      - $\dim_{i,j}(B(i,j)) = (r_i, r_j)$
      - $\dim_{i,j}(x(j)) = (r_j)$
      - Result: `a(1:m)=a(1:m)+B(1:m,1:n)*x(1:n);`
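A numerical check of the matrix-vector example, assuming column vectors for `a` and `x` and small illustrative sizes:

```matlab
m = 3; n = 4;
B = rand(m, n);
x = rand(n, 1);    % column vector: dim_{i,j}(x(j)) = (r_j)
a0 = rand(m, 1);

% Original loop nest: j is the reduction variable.
a_loop = a0;
for i = 1:m
    for j = 1:n
        a_loop(i) = a_loop(i) + B(i, j) * x(j);
    end
end

% The shared dimension r_j is reduced by the matrix product.
a_vec = a0;
a_vec(1:m) = a_vec(1:m) + B(1:m, 1:n) * x(1:n);
```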
19. Additive Reduction Example
    - Loop: `for i=1:m, for j=1:n, d(i)=d(i)+a(i,j)*b(j)+c(i,j); end end`, where $r_j$ is the reduction variable.
    - Leaves:
      - $\rho(a(i,j)) = \{\}$, $\dim_{i,j}(a(i,j)) = (r_i, r_j)$
      - $\rho(b(j)) = \{\}$, $\dim_{i,j}(b(j)) = (r_j)$
      - $\rho(c(i,j)) = \{\}$, $\dim_{i,j}(c(i,j)) = (r_i, r_j)$
    - Use matrix multiplication to reduce $r_j$: $\rho(a(i,j)*b(j)) = \{r_j\}$, $\dim_{i,j}(a(i,j)*b(j)) = (r_i)$
    - Need to reduce $r_j$ in the remaining term: `c(i,j)` → `sum(c(i,j),2);`
    - Whole expression: $\rho(a(i,j)*b(j)+\mathrm{sum}(c(i,j),2)) = \{r_j\}$, $\dim_{i,j}(a(i,j)*b(j)+\mathrm{sum}(c(i,j),2)) = (r_i)$
    - Dimensionality and reduced variables agree, so replace the index variables: `d(1:m)=d(1:m)+a(1:m,1:n)*b(1:n)+sum(c(1:m,1:n),2);`
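The final statement can be checked against the loop nest. The sketch assumes `d` and `b` are column vectors, so that the matrix product and the row sums both come out m-by-1; sizes and data are illustrative.

```matlab
m = 3; n = 5;
a = rand(m, n);
b = rand(n, 1);
c = rand(m, n);
d0 = rand(m, 1);

% Original loop nest: both terms accumulate over j.
d_loop = d0;
for i = 1:m
    for j = 1:n
        d_loop(i) = d_loop(i) + a(i, j) * b(j) + c(i, j);
    end
end

% First term reduced by matrix multiplication, second by a sum along dim 2.
d_vec = d0;
d_vec(1:m) = d_vec(1:m) + a(1:m, 1:n) * b(1:n) + sum(c(1:m, 1:n), 2);
```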
20. Implementation Prototype
    - The pattern database and corresponding transformations are specified in a modular, end-user-extensible manner.
    - Flow diagram (flattened in this transcript); its labels, in the order they appear: Original Loop, Octave Parser, Embedded Control Statements, Create DDG, Dimension Check, Success, Vectorize Statement, Code Generator, Vectorizer, Vectorized Loop. Two decision nodes carry yes/no branches.
21. Results
    - Source-to-source transformation
    - Timing results are averaged over 100 runs.
    - Platform:
      - Matlab 7.2.0.283
      - 3.0 GHz Pentium D processor
22. Results
    - Histogram equalization
    - Input source:
      - `h=hist(im(:),[0:255]); %histogram`
      - `heq=255*cumsum(h(:))/sum(h(:));`
      - `for i=1:size(im,1), for j=1:size(im,2), im2(i,j)=heq(im(i,j)+1); end end`
    - Vectorized result:
      - `h=hist(im(:),[(0:255)]);`
      - `heq=255*cumsum(h(:))/sum(h(:));`
      - `im2(1:size(im,1),1:size(im,2))=heq(im(1:size(im,1),1:size(im,2))+1);`
    - For a monochrome 8-bit 800x600 image (original/vectorized):
      - Entire routine: 0.178s/0.114s (speedup: 1.56)
      - Loop portion only: 0.0814s/0.0176s (speedup: 4.6)
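The lookup-table step can be checked on a small synthetic image. Two assumptions differ from the slide and are made only to keep the sketch self-contained: the image holds double values in 0..255 (with `uint8` data, `im(i,j)+1` would saturate at 255), and the histogram is built with `accumarray` instead of `hist`.

```matlab
% Synthetic 6-by-8 "image" with double values in 0..255.
im = mod(reshape(0:47, 6, 8) * 37, 256);

% Histogram via accumarray (swapped in for hist), then the equalization map.
h = accumarray(im(:) + 1, 1, [256 1]);
heq = 255 * cumsum(h(:)) / sum(h(:));

% Original loop nest.
im2_loop = zeros(size(im));
for i = 1:size(im, 1)
    for j = 1:size(im, 2)
        im2_loop(i, j) = heq(im(i, j) + 1);
    end
end

% Vectorized lookup: indexing a vector with a matrix returns the matrix shape.
im2_vec = zeros(size(im));
im2_vec(1:size(im,1), 1:size(im,2)) = heq(im(1:size(im,1), 1:size(im,2)) + 1);
```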
23. Results (Menon & Pingali Examples)

    | Input | Output | Settings | Input time (s) | Output time (s) | Speedup |
    | --- | --- | --- | --- | --- | --- |
    | `for k=1:p, for j=1:(i-1), X(i,k)=X(i,k)-L(i,j)*X(j,k); end end` | `X(i,1:p)=X(i,1:p)-L(i,1:i-1)*X(1:i-1,1:p);` | i=500, p=5000 | 0.536 | 0.030 | 17 |
    | `for i=1:N, for j=1:N phi(k)=phi(k)+a(i,j)*x_se(i)*f(j); end end` | `phi(k)=phi(k)+sum(a(1:N,1:N)'*x_se(1:N).*f(1:N),1);` | N=1000 | 0.174 | 0.012 | 14 |
    | `for i=1:n, for j=1:n, for k=1:n, for l=1:n y(i)=y(i)+x(j)*A(i,k)*B(l,k)*C(l,j); end end end end` | `y(1:n)=y(1:n)+x(1:n)'*(A(1:n,1:n)*B(1:n,1:n)'*C(1:n,1:n))';` | n=40 | 0.622 | 0.0001 | 5000 |
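The three vectorized statements can be checked for equivalence on small inputs. Sizes and random data here are illustrative (far smaller than the reported settings), and the sketch assumes `x_se`, `f` and `x` are column vectors and `y` is a row vector. It checks correctness only, not timing.

```matlab
% Example 1: update of row i (named r here to keep i free as a loop index).
p = 5; r = 4;
L = rand(r, r); X0 = rand(r, p);
X_loop = X0;
for k = 1:p
    for j = 1:(r-1)
        X_loop(r, k) = X_loop(r, k) - L(r, j) * X_loop(j, k);
    end
end
X_vec = X0;
X_vec(r, 1:p) = X_vec(r, 1:p) - L(r, 1:r-1) * X_vec(1:r-1, 1:p);
err1 = max(abs(X_loop(:) - X_vec(:)));

% Example 2: double reduction into the scalar phi(k), with k fixed.
N = 6; k = 1;
a = rand(N, N); x_se = rand(N, 1); f = rand(N, 1);
phi_loop = 0;
for i = 1:N
    for j = 1:N
        phi_loop(k) = phi_loop(k) + a(i, j) * x_se(i) * f(j);
    end
end
phi_vec = 0;
phi_vec(k) = phi_vec(k) + sum(a(1:N, 1:N)' * x_se(1:N) .* f(1:N), 1);
err2 = abs(phi_loop - phi_vec);

% Example 3: four nested loops collapse into a chain of matrix products.
n = 4;
x = rand(n, 1); A = rand(n); B = rand(n); C = rand(n);
y_loop = zeros(1, n);
for i = 1:n
    for j = 1:n
        for k = 1:n
            for l = 1:n
                y_loop(i) = y_loop(i) + x(j) * A(i, k) * B(l, k) * C(l, j);
            end
        end
    end
end
y_vec = zeros(1, n);
y_vec(1:n) = y_vec(1:n) + x(1:n)' * (A(1:n, 1:n) * B(1:n, 1:n)' * C(1:n, 1:n))';
err3 = max(abs(y_loop - y_vec));
```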
24. Remaining Issues/Future Work
    - Each pattern transformation is local; there is no optimization over the entire statement.
      - e.g., we do not optimize and distribute transposes
    - Control flow within the loop
    - Function calls
      - Functions are treated as pointwise operators (correct for many predefined arithmetic functions).
    - Incorporate our analysis directly with shape analysis.
25. Summary
    - Contributions:
      - A simple method to prevent incorrect vectorization in Matlab
      - A user-extensible operator/dimensionality pattern database that can be used to improve vectorization
        - These patterns can make use of higher-level semantics (e.g., matrix multiplication) or diagonal accesses in vectorization.
26. Acknowledgements
    - Funding provided by NSERC
    - Grateful for the reviewers' comments and suggestions
27. Thank You
    - Questions?