Generative and Meta-Programming - Modern C++ Design for Parallel Computing

2,137 views

Published on

Those slides are from my 2011 Texas A&M IAMCS seminar.

Generative and Generic progrmamign techniques for HPC are presented over NT2 and Quaff projects

Published in: Technology
1 Comment
0 Likes
Statistics
Notes
  • Please dear can you contact me on my email id jessicaduale@yahoo.com, i have something private to discuss with you. Thank, i will be happy to meet you in my email. Jessica
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Be the first to like this

No Downloads
Views
Total views
2,137
On SlideShare
0
From Embeds
0
Number of Embeds
10
Actions
Shares
0
Downloads
15
Comments
1
Likes
0
Embeds 0
No embeds

No notes for slide

Generative and Meta-Programming - Modern C++ Design for Parallel Computing

  1. 1. Introduction Generative Programming NT2 Quaff Conclusion Generative and Meta-Programming Modern C++ Design for Parallel Computing Joel Falcou 05/02/2011 LRI, University Paris Sud XI LSU Seminar 1 / 23
  2. 2. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific LSU Seminar 2 / 23
  3. 3. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific Applications are domain driven Users = Developers Users are reluctant to changes LSU Seminar 2 / 23
  4. 4. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific Applications are domain driven Users = Developers Users are reluctant to changes there is Computing LSU Seminar 2 / 23
  5. 5. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific Applications are domain driven Users = Developers Users are reluctant to changes there is Computing Computing requires performance ... ... which implies architectures specific tuning ... which requires expertise ... which may or may not be available LSU Seminar 2 / 23
  6. 6. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific Applications are domain driven Users = Developers Users are reluctant to changes there is Computing Computing requires performance ... ... which implies architectures specific tuning ... which requires expertise ... which may or may not be available The Problem People using computers to do science want to do science first. LSU Seminar 2 / 23
  7. 7. Introduction Generative Programming NT2 Quaff Conclusion The Problem – and how we want to solve it The Facts The ”Library to bind them all” doesn’t exist (or we should have it already ) All those users want to take advantage of new architectures Few of them want to actually handle all the dirty work The Ends Provide a ”familiar” interface that let users benefit from parallelism Helps compilers to generate better parallel code Increase sustainability by decreasing amount of code to write The Means Generative Programming Template Meta-Programming Embedded Domain Specific Languages LSU Seminar 3 / 23
  8. 8. Introduction Generative Programming NT2 Quaff Conclusion Talk Layout 1 Introduction 2 Generative Programming 3 NT2 4 Quaff 5 Conclusion LSU Seminar 4 / 23
  9. 9. Introduction Generative Programming NT2 Quaff Conclusion Generative Programming Domain Specific Generative Component Concrete Application Application Description Translator Parametric Sub-components LSU Seminar 5 / 23
  10. 10. Introduction Generative Programming NT2 Quaff Conclusion Generative Programming as a Tool Available techniques Dedicated compilers External pre-processing tools Languages supporting meta-programming LSU Seminar 6 / 23
  11. 11. Introduction Generative Programming NT2 Quaff Conclusion Generative Programming as a Tool Available techniques Dedicated compilers External pre-processing tools Languages supporting meta-programming LSU Seminar 6 / 23
  12. 12. Introduction Generative Programming NT2 Quaff Conclusion Generative Programming as a Tool Available techniques Dedicated compilers External pre-processing tools Languages supporting meta-programming Definition of Meta-programming Meta-programming is the writing of computer programs that analyse, transform and generate other programs (or them- selves) as their data. LSU Seminar 6 / 23
  13. 13. Introduction Generative Programming NT2 Quaff Conclusion From Generative to Meta-programming Meta-programmable languages template H ASKELL metaOcaml C++ LSU Seminar 7 / 23
  14. 14. Introduction Generative Programming NT2 Quaff Conclusion From Generative to Meta-programming Meta-programmable languages template H ASKELL metaOcaml C++ LSU Seminar 7 / 23
  15. 15. Introduction Generative Programming NT2 Quaff Conclusion From Generative to Meta-programming Meta-programmable languages template H ASKELL metaOcaml C++ C++ meta-programming Relies on the C++ template sub-language Handles types and integral constants at compile-time Proved to be Turing-complete LSU Seminar 7 / 23
  16. 16. Introduction Generative Programming NT2 Quaff Conclusion Embedded Domain Specific Languages What’s an EDSL ? DSL = Domain Specific Language Declarative language, easy-to-use, fitting the domain EDSL = DSL within a general purpose language EDSL in C++ Relies on operator overload abuse (Expression Templates) Carry semantic information around code fragment Generic implementation become self-aware of optimizations Exploiting static AST At the expression level: code generation At the function level: inter-procedural optimization LSU Seminar 8 / 23
  17. 17. Introduction Generative Programming NT2 Quaff Conclusion Expression Templates matrix x(h,w),a(h,w),b(h,w); = x = cos(a) + (b*a); x + expr<assign ,expr<matrix&> ,expr<plus cos * , expr<cos ,expr<matrix&> > a b a , expr<multiplies ,expr<matrix&> ,expr<matrix&> > #pragma omp parallel for >(x,a,b); for(int j=0;j<h;++j) { for(int i=0;i<w;++i) { Arbitrary Transforms applied x(j,i) = cos(a(j,i)) on the meta-AST + ( b(j,i) * a(j,i) ); } } LSU Seminar 9 / 23
  18. 18. Introduction Generative Programming NT2 Quaff Conclusion What is NT 2 ? A Scientific Computing Library Provide a simple, M ATLAB-like interface for users Provide high-performance computing entities and primitives Easily extendable LSU Seminar 10 / 23
  19. 19. Introduction Generative Programming NT2 Quaff Conclusion What is NT 2 ? A Scientific Computing Library Provide a simple, M ATLAB-like interface for users Provide high-performance computing entities and primitives Easily extendable Principles Take a .m file, copy to a .cpp file LSU Seminar 10 / 23
  20. 20. Introduction Generative Programming NT2 Quaff Conclusion What is NT 2 ? A Scientific Computing Library Provide a simple, M ATLAB-like interface for users Provide high-performance computing entities and primitives Easily extendable Principles Take a .m file, copy to a .cpp file Add #include <nt2/nt2.hpp> and do cosmetic changes LSU Seminar 10 / 23
  21. 21. Introduction Generative Programming NT2 Quaff Conclusion What is NT 2 ? A Scientific Computing Library Provide a simple, M ATLAB-like interface for users Provide high-performance computing entities and primitives Easily extendable Principles Take a .m file, copy to a .cpp file Add #include <nt2/nt2.hpp> and do cosmetic changes Compile the file and link with libnt2.a LSU Seminar 10 / 23
  22. 22. Introduction Generative Programming NT2 Quaff Conclusion M ATLAB you said ? R = I(:,:,1); G = I(:,:,2); B = I(:,:,3); Y = min(abs(0.299.*R+0.587.*G+0.114.*B),235); U = min(abs(-0.169.*R-0.331.*G+0.5.*B),240); V = min(abs(0.5.*R-0.419.*G-0.081.*B),240); LSU Seminar 11 / 23
  23. 23. Introduction Generative Programming NT2 Quaff Conclusion Now with NT 2 table<double> R = I(_,_,1); table<double> G = I(_,_,2); table<double> B = I(_,_,3); table<double> Y, U, V; Y = min(abs(0.299*R+0.587*G+0.114*B),235); U = min(abs(-0.169*R-0.331*G+0.5*B),240); V = min(abs(0.5*R-0.419*G-0.081*B),240); LSU Seminar 12 / 23
  24. 24. Introduction Generative Programming NT2 Quaff Conclusion Now with NT 2 table<float,settings(shallow,of_size_<640,480>)> R = I(_,_,1); table<float,settings(shallow,of_size_<640,480>)> G = I(_,_,2); table<float,settings(shallow,of_size_<640,480>)> B = I(_,_,3); table<float,settings(of_size_<640,480>)> Y, U, V; Y = min(abs(0.299*R+0.587*G+0.114*B),235), U = min(abs(-0.169*R-0.331*G+0.5*B),240), V = min(abs(0.5*R-0.419*G-0.081*B),240); LSU Seminar 12 / 23
  25. 25. Introduction Generative Programming NT2 Quaff Conclusion Some Performances RGB2YUV timing (in cycles/pixels) Size 128x128 256x256 512x512 1024x1024 M ATLAB 2010a (2 cores) 85 89 97 102 C (1 core) 23.6 23.8 23.9 24.0 NT2 (1 core) 22.3 22.4 22.2 24.1 NT2 OpenMP (2 cores) 11.4 11.4 11.5 12.2 NT2 SIMD 5.5 5.6 5.6 5.9 NT2 SIMD+OpenMP 2.9 2.9 2.9 3.0 NT2 SIMD speed-up 3.9 3.9 3.9 4.0 NT2 OpenMP speed-up 1.92 1.96 1.98 1.95 NT2 vs M ATLAB speed-up 29.3 30.7 33.5 34 LSU Seminar 13 / 23
  26. 26. Introduction Generative Programming NT2 Quaff Conclusion Some real life applications LSU Seminar 14 / 23
  27. 27. Introduction Generative Programming NT2 Quaff Conclusion Parallel Skeletons in a nutshell Basic Principles [COLE 89] There are patterns in parallel applications Those patterns can be generalized in Skeletons Applications are assembled as combination of such patterns LSU Seminar 15 / 23
  28. 28. Introduction Generative Programming NT2 Quaff Conclusion Parallel Skeletons in a nutshell Basic Principles [COLE 89] There are patterns in parallel applications Those patterns can be generalized in Skeletons Applications are assembled as combination of such patterns Functionnal point of view Skeletons are Higher-Order Functions Skeletons support a compositionnal semantic Applications become composition of state-less functions LSU Seminar 15 / 23
  29. 29. Introduction Generative Programming NT2 Quaff Conclusion Spotting skeletons when you see one Pipeline Farm Processus 3 F2 Processus 1 Processus 2 Processus 4 Processus 6 Processus 7 F1 Distrib. F2 Collect. F3 Processus 5 F2 LSU Seminar 16 / 23
  30. 30. Introduction Generative Programming NT2 Quaff Conclusion Objectives Proposing a C++ skeletons library Limit runtime overhead Use the formal model skeleton as guidelines LSU Seminar 17 / 23
  31. 31. Introduction Generative Programming NT2 Quaff Conclusion Objectives Proposing a C++ skeletons library Limit runtime overhead Use the formal model skeleton as guidelines What we want to write run( pipeline( seq(F1), seq(F2), seq(F3) ) ) LSU Seminar 17 / 23
  32. 32. Introduction Generative Programming NT2 Quaff Conclusion Objectives Proposing a C++ skeletons library Limit runtime overhead Use the formal model skeleton as guidelines What we want to run if( rank == 0 ) do { out = F1(); MPI_Send(&out,1,MPI_INT,1,0,MPI_COMM_WORLD); } while( isValid(out) ); if( rank == 1 ) do { MPI_Recv(&in,1,MPI_INT,0,0,MPI_COMM_WORLD,&s); out = F2(in); MPI_Send(&in,1,MPI_INT,2,0,MPI_COMM_WORLD); } while ( isValid(out) ) if( rank == 2 ) do { MPI_Recv(&in,1,MPI_INT,1,0,MPI_COMM_WORLD,&s); F2(in); } while ( isValid(in) ) LSU Seminar 17 / 23
  33. 33. Introduction Generative Programming NT2 Quaff Conclusion Design Rationale Code generation process Generating the skeleton tree at compile-time Turning this tree into a process network using MPL Producing the C + MPI target code using Fusion PIPE Génération Production φ 2 Génération AST du RPSC φ 2 φ 2 de code C+MPI C C++ φ1 FARM3 φ3 MPI φ 1 f φ 3 φ2 LSU Seminar 18 / 23
  34. 34. Introduction Generative Programming NT2 Quaff Conclusion Quaff over the CELL The OCELLE project IEF/CEA/TRT Tools for heterog. proc. MPI and skeleton for the Cell Developing for the CELL How to find a good mapping for an application ? LSU Seminar 19 / 23
  35. 35. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy LSU Seminar 20 / 23
  36. 36. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy Skeleton Code and Benchmarks void full_chain_harris(tile const&,tile&) { run( pardo<8>((seq(grad),seq(mul),seq(gauss),seq(coarsity))); } LSU Seminar 20 / 23
  37. 37. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy Skeleton Code and Benchmarks void half_chain_harris(tile const&,tile&) { run( pardo<4>((seq(grad),seq(mul)) | (seq(gauss),seq(coarsity))); } LSU Seminar 20 / 23
  38. 38. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy Skeleton Code and Benchmarks void no_chain_harris(tile const&,tile&) { run( pardo<2>( seq(grad) | seq(mul) | seq(gauss) | seq(coarsity)); } LSU Seminar 20 / 23
  39. 39. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy Results (PACT 09) Version Full-chain Half-chain No-chain Manual 11.26 8.36 9.97 Skell 11.86 8.64 10.43 Overhead 5.33% 3.35% 4.61% LSU Seminar 20 / 23
  40. 40. Introduction Generative Programming NT2 Quaff Conclusion Let’s round this up! Parallel Computing for Scientist Parallel programmign tools are required BUT: ... They need to cater to users’ needs ! ... Users should feel «at home» LSU Seminar 21 / 23
  41. 41. Introduction Generative Programming NT2 Quaff Conclusion Let’s round this up! Parallel Computing for Scientist Parallel programmign tools are required BUT: ... They need to cater to users’ needs ! ... Users should feel «at home» Our claims Software Libraries built as Generic and Generative components can solve a large chunk of parallelism related problems while being easy to use. EDSL are an efficient way to design such libraries C++ provides a wide selection of idioms to achieve this goal. LSU Seminar 21 / 23
  42. 42. Introduction Generative Programming NT2 Quaff Conclusion Prospectives What we’re cooking at the moment Get a public release of NT 2 and Quaff NT 2 : Automatic matlab conversion with source to source compiler Quaff: Simplify new skeletons design by providing a Semantic Rules EDSL What we’ll be cooking in the future Multi-stage programming for skeleton over the Grid/Cloud NT 2 supports for embedded architectures LSU Seminar 22 / 23
  43. 43. Thanks for your attention

×