SlideShare a Scribd company logo
Introduction      Generative Programming                NT2   Quaff         Conclusion




                  Generative and Meta-Programming
               Modern C++ Design for Parallel Computing

                                       Joel Falcou
                                           05/02/2011
                               LRI, University Paris Sud XI




                                                                      LSU Seminar
                                                                                    1 / 23
Introduction           Generative Programming    NT2   Quaff         Conclusion


                                           Context
       In Scientific Computing ...
               there is Scientific




                                                               LSU Seminar
                                                                             2 / 23
Introduction           Generative Programming        NT2   Quaff         Conclusion


                                           Context
       In Scientific Computing ...
               there is Scientific
                    Applications are domain driven
                    Users = Developers
                    Users are reluctant to changes




                                                                   LSU Seminar
                                                                                 2 / 23
Introduction           Generative Programming        NT2   Quaff         Conclusion


                                           Context
       In Scientific Computing ...
               there is Scientific
                    Applications are domain driven
                    Users = Developers
                    Users are reluctant to changes
               there is Computing




                                                                   LSU Seminar
                                                                                 2 / 23
Introduction           Generative Programming        NT2             Quaff         Conclusion


                                           Context
       In Scientific Computing ...
               there is Scientific
                    Applications are domain driven
                    Users = Developers
                    Users are reluctant to changes
               there is Computing
                    Computing requires performance ...
                    ... which implies architectures specific tuning
                    ... which requires expertise
                    ... which may or may not be available




                                                                             LSU Seminar
                                                                                           2 / 23
Introduction           Generative Programming        NT2             Quaff         Conclusion


                                           Context
       In Scientific Computing ...
               there is Scientific
                    Applications are domain driven
                    Users = Developers
                    Users are reluctant to changes
               there is Computing
                    Computing requires performance ...
                    ... which implies architectures specific tuning
                    ... which requires expertise
                    ... which may or may not be available

       The Problem
       People using computers to do science want to do science first.


                                                                             LSU Seminar
                                                                                           2 / 23
Introduction             Generative Programming            NT2              Quaff            Conclusion


        The Problem – and how we want to solve it

       The Facts
               The ”Library to bind them all” doesn’t exist (or we should have it already )
               All those users want to take advantage of new architectures
               Few of them want to actually handle all the dirty work

       The Ends
               Provide a ”familiar” interface that let users benefit from parallelism
               Helps compilers to generate better parallel code
               Increase sustainability by decreasing amount of code to write

       The Means
               Generative Programming
               Template Meta-Programming
               Embedded Domain Specific Languages




                                                                                       LSU Seminar
                                                                                                     3 / 23
Introduction       Generative Programming   NT2   Quaff         Conclusion


                                   Talk Layout


       1   Introduction

       2   Generative Programming

       3   NT2

       4   Quaff

       5   Conclusion




                                                          LSU Seminar
                                                                        4 / 23
Introduction      Generative Programming                    NT2              Quaff             Conclusion


                     Generative Programming


                  Domain Specific         Generative Component     Concrete Application
               Application Description



                                             Translator



                                             Parametric
                                           Sub-components




                                                                                         LSU Seminar
                                                                                                       5 / 23
Introduction          Generative Programming   NT2     Quaff         Conclusion


                 Generative Programming as a Tool


       Available techniques
               Dedicated compilers
               External pre-processing tools
               Languages supporting meta-programming




                                                               LSU Seminar
                                                                             6 / 23
Introduction          Generative Programming   NT2     Quaff         Conclusion


                 Generative Programming as a Tool


       Available techniques
               Dedicated compilers
               External pre-processing tools
               Languages supporting meta-programming




                                                               LSU Seminar
                                                                             6 / 23
Introduction          Generative Programming   NT2     Quaff         Conclusion


                 Generative Programming as a Tool


       Available techniques
               Dedicated compilers
               External pre-processing tools
               Languages supporting meta-programming
       Definition of Meta-programming
       Meta-programming is the writing of computer programs that
       analyse, transform and generate other programs (or them-
       selves) as their data.




                                                               LSU Seminar
                                                                             6 / 23
Introduction         Generative Programming   NT2   Quaff         Conclusion


               From Generative to Meta-programming


       Meta-programmable languages
               template H ASKELL
               metaOcaml
               C++




                                                            LSU Seminar
                                                                          7 / 23
Introduction         Generative Programming   NT2   Quaff         Conclusion


               From Generative to Meta-programming


       Meta-programmable languages
               template H ASKELL
               metaOcaml
               C++




                                                            LSU Seminar
                                                                          7 / 23
Introduction          Generative Programming    NT2         Quaff           Conclusion


               From Generative to Meta-programming


       Meta-programmable languages
               template H ASKELL
               metaOcaml
               C++
       C++ meta-programming
               Relies on the C++ template sub-language
               Handles types and integral constants at compile-time
               Proved to be Turing-complete




                                                                      LSU Seminar
                                                                                    7 / 23
Introduction            Generative Programming        NT2             Quaff         Conclusion


               Embedded Domain Specific Languages

       What’s an EDSL ?
               DSL = Domain Specific Language
               Declarative language, easy-to-use, fitting the domain
               EDSL = DSL within a general purpose language

       EDSL in C++
               Relies on operator overload abuse (Expression Templates)
               Carry semantic information around code fragment
               Generic implementation become self-aware of optimizations

       Exploiting static AST
               At the expression level: code generation
               At the function level: inter-procedural optimization



                                                                              LSU Seminar
                                                                                            8 / 23
Introduction         Generative Programming      NT2                 Quaff             Conclusion


                          Expression Templates

               matrix x(h,w),a(h,w),b(h,w);            =
               x = cos(a) + (b*a);
                                                  x          +
               expr<assign
                   ,expr<matrix&>
                   ,expr<plus                          cos            *
                        , expr<cos
                              ,expr<matrix&>
                              >
                                                       a         b           a
                        , expr<multiplies
                              ,expr<matrix&>
                              ,expr<matrix&>
                              >                #pragma omp parallel for
                        >(x,a,b);              for(int j=0;j<h;++j)
                                               {
                                                 for(int i=0;i<w;++i)
                                                 {
               Arbitrary Transforms applied
                                                   x(j,i) = cos(a(j,i))
                     on the meta-AST
                                                          + ( b(j,i)
                                                             * a(j,i)
                                                          );
                                                 }
                                               }




                                                                                 LSU Seminar
                                                                                               9 / 23
Introduction            Generative Programming       NT2            Quaff         Conclusion


                                     What is NT 2 ?


       A Scientific Computing Library
               Provide a simple, M ATLAB-like interface for users
               Provide high-performance computing entities and primitives
               Easily extendable




                                                                            LSU Seminar
                                                                                          10 / 23
Introduction            Generative Programming       NT2            Quaff         Conclusion


                                     What is NT 2 ?


       A Scientific Computing Library
               Provide a simple, M ATLAB-like interface for users
               Provide high-performance computing entities and primitives
               Easily extendable

       Principles
               Take a .m file, copy to a .cpp file




                                                                            LSU Seminar
                                                                                          10 / 23
Introduction            Generative Programming       NT2            Quaff         Conclusion


                                     What is NT 2 ?


       A Scientific Computing Library
               Provide a simple, M ATLAB-like interface for users
               Provide high-performance computing entities and primitives
               Easily extendable

       Principles
               Take a .m file, copy to a .cpp file
               Add #include <nt2/nt2.hpp> and do cosmetic changes




                                                                            LSU Seminar
                                                                                          10 / 23
Introduction            Generative Programming       NT2            Quaff         Conclusion


                                     What is NT 2 ?


       A Scientific Computing Library
               Provide a simple, M ATLAB-like interface for users
               Provide high-performance computing entities and primitives
               Easily extendable

       Principles
               Take a .m file, copy to a .cpp file
               Add #include <nt2/nt2.hpp> and do cosmetic changes
               Compile the file and link with libnt2.a




                                                                            LSU Seminar
                                                                                          10 / 23
Introduction      Generative Programming   NT2        Quaff         Conclusion


                          M ATLAB you said ?



      R = I(:,:,1);
      G = I(:,:,2);
      B = I(:,:,3);

      Y = min(abs(0.299.*R+0.587.*G+0.114.*B),235);
      U = min(abs(-0.169.*R-0.331.*G+0.5.*B),240);
      V = min(abs(0.5.*R-0.419.*G-0.081.*B),240);




                                                              LSU Seminar
                                                                            11 / 23
Introduction      Generative Programming   NT2     Quaff         Conclusion


                                Now with NT 2



      table<double>   R = I(_,_,1);
      table<double>   G = I(_,_,2);
      table<double>   B = I(_,_,3);
      table<double>   Y, U, V;

      Y = min(abs(0.299*R+0.587*G+0.114*B),235);
      U = min(abs(-0.169*R-0.331*G+0.5*B),240);
      V = min(abs(0.5*R-0.419*G-0.081*B),240);




                                                           LSU Seminar
                                                                         12 / 23
Introduction      Generative Programming   NT2        Quaff         Conclusion


                                Now with NT 2



      table<float,settings(shallow,of_size_<640,480>)> R = I(_,_,1);
      table<float,settings(shallow,of_size_<640,480>)> G = I(_,_,2);
      table<float,settings(shallow,of_size_<640,480>)> B = I(_,_,3);
      table<float,settings(of_size_<640,480>)> Y, U, V;

      Y = min(abs(0.299*R+0.587*G+0.114*B),235),
      U = min(abs(-0.169*R-0.331*G+0.5*B),240),
      V = min(abs(0.5*R-0.419*G-0.081*B),240);




                                                              LSU Seminar
                                                                            12 / 23
Introduction           Generative Programming        NT2            Quaff            Conclusion


                              Some Performances

       RGB2YUV timing (in cycles/pixels)

           Size                         128x128   256x256   512x512     1024x1024
           M ATLAB 2010a (2 cores)         85        89        97          102
           C (1 core)                     23.6      23.8      23.9         24.0
           NT2 (1 core)                   22.3      22.4      22.2         24.1
           NT2 OpenMP (2 cores)           11.4      11.4      11.5         12.2
           NT2 SIMD                       5.5        5.6       5.6         5.9
           NT2 SIMD+OpenMP                2.9        2.9       2.9         3.0
           NT2 SIMD speed-up              3.9       3.9       3.9           4.0
           NT2 OpenMP speed-up            1.92     1.96      1.98           1.95
           NT2 vs M ATLAB speed-up        29.3     30.7      33.5            34




                                                                               LSU Seminar
                                                                                             13 / 23
Introduction   Generative Programming   NT2   Quaff         Conclusion


               Some real life applications




                                                      LSU Seminar
                                                                    14 / 23
Introduction            Generative Programming        NT2       Quaff             Conclusion


                    Parallel Skeletons in a nutshell


       Basic Principles [COLE 89]
               There are patterns in parallel applications
               Those patterns can be generalized in Skeletons
               Applications are assembled as combination of such patterns




                                                                            LSU Seminar
                                                                                          15 / 23
Introduction            Generative Programming        NT2        Quaff            Conclusion


                    Parallel Skeletons in a nutshell


       Basic Principles [COLE 89]
               There are patterns in parallel applications
               Those patterns can be generalized in Skeletons
               Applications are assembled as combination of such patterns

       Functionnal point of view
               Skeletons are Higher-Order Functions
               Skeletons support a compositionnal semantic
               Applications become composition of state-less functions




                                                                            LSU Seminar
                                                                                          15 / 23
Introduction             Generative Programming                 NT2                 Quaff              Conclusion


               Spotting skeletons when you see one


               Pipeline
                              Farm
                                                  Processus 3


                                                     F2
                Processus 1     Processus 2       Processus 4         Processus 6      Processus 7


                   F1           Distrib.             F2               Collect.              F3

                                                  Processus 5


                                                     F2




                                                                                                 LSU Seminar
                                                                                                               16 / 23
Introduction           Generative Programming      NT2       Quaff         Conclusion


                                        Objectives




       Proposing a C++ skeletons library
               Limit runtime overhead
               Use the formal model skeleton as guidelines




                                                                     LSU Seminar
                                                                                   17 / 23
Introduction           Generative Programming      NT2       Quaff         Conclusion


                                        Objectives



       Proposing a C++ skeletons library
               Limit runtime overhead
               Use the formal model skeleton as guidelines

       What we want to write
       run( pipeline( seq(F1), seq(F2), seq(F3) ) )




                                                                     LSU Seminar
                                                                                   17 / 23
Introduction           Generative Programming         NT2      Quaff         Conclusion


                                        Objectives

       Proposing a C++ skeletons library
               Limit runtime overhead
               Use the formal model skeleton as guidelines

       What we want to run
       if( rank == 0 )
         do { out = F1();
              MPI_Send(&out,1,MPI_INT,1,0,MPI_COMM_WORLD);
            } while( isValid(out) );

       if( rank == 1 )
         do { MPI_Recv(&in,1,MPI_INT,0,0,MPI_COMM_WORLD,&s);
              out = F2(in);
              MPI_Send(&in,1,MPI_INT,2,0,MPI_COMM_WORLD);
            } while ( isValid(out) )

       if( rank == 2 )
         do { MPI_Recv(&in,1,MPI_INT,1,0,MPI_COMM_WORLD,&s);
              F2(in);
            } while ( isValid(in) )




                                                                       LSU Seminar
                                                                                     17 / 23
Introduction                Generative Programming                       NT2                              Quaff                 Conclusion


                                        Design Rationale


       Code generation process
                Generating the skeleton tree at compile-time
                Turning this tree into a process network using MPL
                Producing the C + MPI target code using Fusion

                                       PIPE
                     Génération                     Production                   φ   2
                                                                                                           Génération
                       AST                           du RPSC             φ   2           φ   2           de code C+MPI
                                                                                                                           C
               C++
                                  φ1   FARM3   φ3                                                                         MPI
                                                                 φ   1           f               φ   3




                                        φ2




                                                                                                                         LSU Seminar
                                                                                                                                       18 / 23
Introduction   Generative Programming           NT2           Quaff           Conclusion


                      Quaff over the CELL


                                  The OCELLE project
                                        IEF/CEA/TRT
                                        Tools for heterog. proc.
                                        MPI and skeleton for the Cell


                                  Developing for the CELL
                                  How to find a good mapping for an application ?




                                                                        LSU Seminar
                                                                                      19 / 23
Introduction       Generative Programming                  NT2                Quaff             Conclusion


                        Harris Corner Detector




                                   Mul      Ixx=Ix*Ix   Gauss    Sxx

                    Grad X   Ix
                                                                         coarsity
               I                   Mul      Ixy=Ix*Iy   Gauss    Sxy   Sxx*Syy-Sxy²
                                                                                      K

                    Grad Y   Iy

                                   Mul      Iyy=Iy*Iy   Gauss    Syy




                                                                                          LSU Seminar
                                                                                                        20 / 23
Introduction           Generative Programming                  NT2                Quaff             Conclusion


                            Harris Corner Detector


                                       Mul      Ixx=Ix*Ix   Gauss    Sxx

                        Grad X   Ix
                                                                             coarsity
                   I                   Mul      Ixy=Ix*Iy   Gauss    Sxy   Sxx*Syy-Sxy²
                                                                                          K

                        Grad Y   Iy

                                       Mul      Iyy=Iy*Iy   Gauss    Syy




       Skeleton Code and Benchmarks
       void full_chain_harris(tile const&,tile&)
       {
        run( pardo<8>((seq(grad),seq(mul),seq(gauss),seq(coarsity)));
       }




                                                                                              LSU Seminar
                                                                                                            20 / 23
Introduction           Generative Programming                  NT2                Quaff             Conclusion


                            Harris Corner Detector


                                       Mul      Ixx=Ix*Ix   Gauss    Sxx

                        Grad X   Ix
                                                                             coarsity
                   I                   Mul      Ixy=Ix*Iy   Gauss    Sxy   Sxx*Syy-Sxy²
                                                                                          K

                        Grad Y   Iy

                                       Mul      Iyy=Iy*Iy   Gauss    Syy




       Skeleton Code and Benchmarks
       void half_chain_harris(tile const&,tile&)
       {
        run( pardo<4>((seq(grad),seq(mul)) | (seq(gauss),seq(coarsity)));
       }




                                                                                              LSU Seminar
                                                                                                            20 / 23
Introduction           Generative Programming                  NT2                Quaff             Conclusion


                            Harris Corner Detector



                                       Mul      Ixx=Ix*Ix   Gauss    Sxx

                        Grad X   Ix
                                                                             coarsity
                   I                   Mul      Ixy=Ix*Iy   Gauss    Sxy   Sxx*Syy-Sxy²
                                                                                          K

                        Grad Y   Iy

                                       Mul      Iyy=Iy*Iy   Gauss    Syy




       Skeleton Code and Benchmarks
       void no_chain_harris(tile const&,tile&)
       {
        run( pardo<2>( seq(grad) | seq(mul) | seq(gauss) | seq(coarsity));
       }




                                                                                              LSU Seminar
                                                                                                            20 / 23
Introduction        Generative Programming                    NT2                 Quaff             Conclusion


                         Harris Corner Detector


                                     Mul     Ixx=Ix*Ix     Gauss    Sxx

                     Grad X   Ix
                                                                             coarsity
               I                     Mul     Ixy=Ix*Iy     Gauss    Sxy    Sxx*Syy-Sxy²
                                                                                          K

                     Grad Y   Iy

                                     Mul     Iyy=Iy*Iy     Gauss    Syy




       Results (PACT 09)

                    Version        Full-chain            Half-chain       No-chain
                    Manual           11.26                 8.36             9.97
                     Skell           11.86                 8.64            10.43
                   Overhead         5.33%                 3.35%            4.61%



                                                                                              LSU Seminar
                                                                                                            20 / 23
Introduction            Generative Programming        NT2     Quaff         Conclusion


                                Let’s round this up!


       Parallel Computing for Scientist
               Parallel programmign tools are required BUT:
               ... They need to cater to users’ needs !
               ... Users should feel «at home»




                                                                      LSU Seminar
                                                                                    21 / 23
Introduction            Generative Programming        NT2          Quaff             Conclusion


                                Let’s round this up!


       Parallel Computing for Scientist
               Parallel programmign tools are required BUT:
               ... They need to cater to users’ needs !
               ... Users should feel «at home»
       Our claims
               Software Libraries built as Generic and Generative components can
               solve a large chunk of parallelism related problems while being
               easy to use.
               EDSL are an efficient way to design such libraries
               C++ provides a wide selection of idioms to achieve this goal.




                                                                               LSU Seminar
                                                                                             21 / 23
Introduction           Generative Programming       NT2           Quaff            Conclusion


                                     Prospectives


       What we’re cooking at the moment
               Get a public release of NT 2 and Quaff
               NT 2 : Automatic matlab conversion with source to source compiler
               Quaff: Simplify new skeletons design by providing a Semantic
               Rules EDSL
       What we’ll be cooking in the future
               Multi-stage programming for skeleton over the Grid/Cloud
               NT 2 supports for embedded architectures




                                                                           LSU Seminar
                                                                                         22 / 23
Thanks for your attention

More Related Content

Viewers also liked

cara membuat kalkulator dengan C#
cara membuat kalkulator dengan C#cara membuat kalkulator dengan C#
cara membuat kalkulator dengan C#Hibaten Wafiroh
 
Seri Belajar Mandiri - Pemrograman C# Untuk Pemula
Seri Belajar Mandiri - Pemrograman C# Untuk PemulaSeri Belajar Mandiri - Pemrograman C# Untuk Pemula
Seri Belajar Mandiri - Pemrograman C# Untuk PemulaAgus Kurniawan
 
Pemrograman Game Tetris Dengan C#
Pemrograman Game Tetris Dengan C#Pemrograman Game Tetris Dengan C#
Pemrograman Game Tetris Dengan C#Robby Angryawan
 
Belajar koding c#
Belajar koding c#Belajar koding c#
Belajar koding c#Ali Ikhsan
 
Introduction to csharp
Introduction to csharpIntroduction to csharp
Introduction to csharpSatish Verma
 
Pengenalan bahasa c#
Pengenalan bahasa c#Pengenalan bahasa c#
Pengenalan bahasa c#Heru Khoir
 

Viewers also liked (8)

cara membuat kalkulator dengan C#
cara membuat kalkulator dengan C#cara membuat kalkulator dengan C#
cara membuat kalkulator dengan C#
 
Seri Belajar Mandiri - Pemrograman C# Untuk Pemula
Seri Belajar Mandiri - Pemrograman C# Untuk PemulaSeri Belajar Mandiri - Pemrograman C# Untuk Pemula
Seri Belajar Mandiri - Pemrograman C# Untuk Pemula
 
Pemrograman Game Tetris Dengan C#
Pemrograman Game Tetris Dengan C#Pemrograman Game Tetris Dengan C#
Pemrograman Game Tetris Dengan C#
 
Tutorial csharp
Tutorial csharpTutorial csharp
Tutorial csharp
 
Pemrograman Dasar Pengenalan C#
Pemrograman Dasar Pengenalan C#Pemrograman Dasar Pengenalan C#
Pemrograman Dasar Pengenalan C#
 
Belajar koding c#
Belajar koding c#Belajar koding c#
Belajar koding c#
 
Introduction to csharp
Introduction to csharpIntroduction to csharp
Introduction to csharp
 
Pengenalan bahasa c#
Pengenalan bahasa c#Pengenalan bahasa c#
Pengenalan bahasa c#
 

Similar to Generative and Meta-Programming - Modern C++ Design for Parallel Computing

Deep Learning libraries and first experiments with Theano
Deep Learning libraries and first experiments with TheanoDeep Learning libraries and first experiments with Theano
Deep Learning libraries and first experiments with TheanoVincenzo Lomonaco
 
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...paperpublications3
 
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...paperpublications3
 
Compiler Engineering Lab#1
Compiler Engineering Lab#1Compiler Engineering Lab#1
Compiler Engineering Lab#1MashaelQ
 
Literate computing for reproducible infrastructure - our basic practices in a...
Literate computing for reproducible infrastructure - our basic practices in a...Literate computing for reproducible infrastructure - our basic practices in a...
Literate computing for reproducible infrastructure - our basic practices in a...No Bu
 
Seminar Report - Managing the Cloud with Open Source Tools
Seminar Report - Managing the Cloud with Open Source ToolsSeminar Report - Managing the Cloud with Open Source Tools
Seminar Report - Managing the Cloud with Open Source ToolsNakul Ezhuthupally
 
Distributed Deployment Model Driven Development
Distributed Deployment Model Driven DevelopmentDistributed Deployment Model Driven Development
Distributed Deployment Model Driven DevelopmentPathfinder Solutions
 
229 Convergence In Device Software
229   Convergence In Device Software229   Convergence In Device Software
229 Convergence In Device SoftwareEric Cloninger
 
Jtag Tools For Linux
Jtag Tools For LinuxJtag Tools For Linux
Jtag Tools For Linuxsheilamia
 
RTL Design Engineer - Vinothkumar Murugesan - 3.3Yrs - Microchip
RTL Design Engineer - Vinothkumar Murugesan - 3.3Yrs - MicrochipRTL Design Engineer - Vinothkumar Murugesan - 3.3Yrs - Microchip
RTL Design Engineer - Vinothkumar Murugesan - 3.3Yrs - Microchipvinothkumar M
 
Directive-based approach to Heterogeneous Computing
Directive-based approach to Heterogeneous ComputingDirective-based approach to Heterogeneous Computing
Directive-based approach to Heterogeneous ComputingRuymán Reyes
 
Master Thesis of Computer Engineering: OpenTranslator
Master Thesis of Computer Engineering: OpenTranslatorMaster Thesis of Computer Engineering: OpenTranslator
Master Thesis of Computer Engineering: OpenTranslatorGiuseppe D'Onofrio
 
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARNHadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARNJosh Patterson
 
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1Kenta Oono
 
Cag Corporate Dossier May 2012
Cag Corporate Dossier May 2012Cag Corporate Dossier May 2012
Cag Corporate Dossier May 2012fastmpj
 

Similar to Generative and Meta-Programming - Modern C++ Design for Parallel Computing (20)

Deep Learning libraries and first experiments with Theano
Deep Learning libraries and first experiments with TheanoDeep Learning libraries and first experiments with Theano
Deep Learning libraries and first experiments with Theano
 
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
 
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
A Java-Compatible Multi-Thread Middleware for an Experimental Wireless Sensor...
 
Compiler Engineering Lab#1
Compiler Engineering Lab#1Compiler Engineering Lab#1
Compiler Engineering Lab#1
 
Literate computing for reproducible infrastructure - our basic practices in a...
Literate computing for reproducible infrastructure - our basic practices in a...Literate computing for reproducible infrastructure - our basic practices in a...
Literate computing for reproducible infrastructure - our basic practices in a...
 
Seminar Report - Managing the Cloud with Open Source Tools
Seminar Report - Managing the Cloud with Open Source ToolsSeminar Report - Managing the Cloud with Open Source Tools
Seminar Report - Managing the Cloud with Open Source Tools
 
Mpi4py
Mpi4pyMpi4py
Mpi4py
 
Distributed Deployment Model Driven Development
Distributed Deployment Model Driven DevelopmentDistributed Deployment Model Driven Development
Distributed Deployment Model Driven Development
 
229 Convergence In Device Software
229   Convergence In Device Software229   Convergence In Device Software
229 Convergence In Device Software
 
Thesis Defense
Thesis DefenseThesis Defense
Thesis Defense
 
FA 10 Poster
FA 10 PosterFA 10 Poster
FA 10 Poster
 
Jtag Tools For Linux
Jtag Tools For LinuxJtag Tools For Linux
Jtag Tools For Linux
 
RTL Design Engineer - Vinothkumar Murugesan - 3.3Yrs - Microchip
RTL Design Engineer - Vinothkumar Murugesan - 3.3Yrs - MicrochipRTL Design Engineer - Vinothkumar Murugesan - 3.3Yrs - Microchip
RTL Design Engineer - Vinothkumar Murugesan - 3.3Yrs - Microchip
 
Thesis
ThesisThesis
Thesis
 
Directive-based approach to Heterogeneous Computing
Directive-based approach to Heterogeneous ComputingDirective-based approach to Heterogeneous Computing
Directive-based approach to Heterogeneous Computing
 
Ppt on fft
Ppt on fftPpt on fft
Ppt on fft
 
Master Thesis of Computer Engineering: OpenTranslator
Master Thesis of Computer Engineering: OpenTranslatorMaster Thesis of Computer Engineering: OpenTranslator
Master Thesis of Computer Engineering: OpenTranslator
 
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARNHadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
 
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1
 
Cag Corporate Dossier May 2012
Cag Corporate Dossier May 2012Cag Corporate Dossier May 2012
Cag Corporate Dossier May 2012
 

More from Joel Falcou

Designing C++ portable SIMD support
Designing C++ portable SIMD supportDesigning C++ portable SIMD support
Designing C++ portable SIMD supportJoel Falcou
 
The Goal and The Journey - Turning back on one year of C++14 Migration
The Goal and The Journey - Turning back on one year of C++14 MigrationThe Goal and The Journey - Turning back on one year of C++14 Migration
The Goal and The Journey - Turning back on one year of C++14 MigrationJoel Falcou
 
HDR Defence - Software Abstractions for Parallel Architectures
HDR Defence - Software Abstractions for Parallel ArchitecturesHDR Defence - Software Abstractions for Parallel Architectures
HDR Defence - Software Abstractions for Parallel ArchitecturesJoel Falcou
 
Lattice Boltzmann sur architecture multicoeurs vectorielle - Une approche de ...
Lattice Boltzmann sur architecture multicoeurs vectorielle - Une approche de ...Lattice Boltzmann sur architecture multicoeurs vectorielle - Une approche de ...
Lattice Boltzmann sur architecture multicoeurs vectorielle - Une approche de ...Joel Falcou
 
Automatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSELAutomatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSELJoel Falcou
 
Software Abstractions for Parallel Hardware
Software Abstractions for Parallel HardwareSoftware Abstractions for Parallel Hardware
Software Abstractions for Parallel HardwareJoel Falcou
 
(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel ArchitecturesJoel Falcou
 
Designing Architecture-aware Library using Boost.Proto
Designing Architecture-aware Library using Boost.ProtoDesigning Architecture-aware Library using Boost.Proto
Designing Architecture-aware Library using Boost.ProtoJoel Falcou
 

More from Joel Falcou (10)

Designing C++ portable SIMD support
Designing C++ portable SIMD supportDesigning C++ portable SIMD support
Designing C++ portable SIMD support
 
The Goal and The Journey - Turning back on one year of C++14 Migration
The Goal and The Journey - Turning back on one year of C++14 MigrationThe Goal and The Journey - Turning back on one year of C++14 Migration
The Goal and The Journey - Turning back on one year of C++14 Migration
 
HDR Defence - Software Abstractions for Parallel Architectures
HDR Defence - Software Abstractions for Parallel ArchitecturesHDR Defence - Software Abstractions for Parallel Architectures
HDR Defence - Software Abstractions for Parallel Architectures
 
Lattice Boltzmann sur architecture multicoeurs vectorielle - Une approche de ...
Lattice Boltzmann sur architecture multicoeurs vectorielle - Une approche de ...Lattice Boltzmann sur architecture multicoeurs vectorielle - Une approche de ...
Lattice Boltzmann sur architecture multicoeurs vectorielle - Une approche de ...
 
Boost.SIMD
Boost.SIMDBoost.SIMD
Boost.SIMD
 
Automatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSELAutomatic Task-based Code Generation for High Performance DSEL
Automatic Task-based Code Generation for High Performance DSEL
 
Software Abstractions for Parallel Hardware
Software Abstractions for Parallel HardwareSoftware Abstractions for Parallel Hardware
Software Abstractions for Parallel Hardware
 
(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures(Costless) Software Abstractions for Parallel Architectures
(Costless) Software Abstractions for Parallel Architectures
 
Boost.Dispatch
Boost.DispatchBoost.Dispatch
Boost.Dispatch
 
Designing Architecture-aware Library using Boost.Proto
Designing Architecture-aware Library using Boost.ProtoDesigning Architecture-aware Library using Boost.Proto
Designing Architecture-aware Library using Boost.Proto
 

Recently uploaded

The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfFIDO Alliance
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?Mark Billinghurst
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...FIDO Alliance
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfFIDO Alliance
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfFIDO Alliance
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfFIDO Alliance
 
Agentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfAgentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfChristopherTHyatt
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...CzechDreamin
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Julian Hyde
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessUXDXConf
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutesconfluent
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka DoktorováCzechDreamin
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomCzechDreamin
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastUXDXConf
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...FIDO Alliance
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityScyllaDB
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2DianaGray10
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxJennifer Lim
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaCzechDreamin
 

Recently uploaded (20)

The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Agentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfAgentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdf
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara Laskowska
 

Generative and Meta-Programming - Modern C++ Design for Parallel Computing

  • 1. Introduction Generative Programming NT2 Quaff Conclusion Generative and Meta-Programming Modern C++ Design for Parallel Computing Joel Falcou 05/02/2011 LRI, University Paris Sud XI LSU Seminar 1 / 23
  • 2. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific LSU Seminar 2 / 23
  • 3. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific Applications are domain driven Users = Developers Users are reluctant to changes LSU Seminar 2 / 23
  • 4. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific Applications are domain driven Users = Developers Users are reluctant to changes there is Computing LSU Seminar 2 / 23
  • 5. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific Applications are domain driven Users = Developers Users are reluctant to changes there is Computing Computing requires performance ... ... which implies architectures specific tuning ... which requires expertise ... which may or may not be available LSU Seminar 2 / 23
  • 6. Introduction Generative Programming NT2 Quaff Conclusion Context In Scientific Computing ... there is Scientific Applications are domain driven Users = Developers Users are reluctant to changes there is Computing Computing requires performance ... ... which implies architectures specific tuning ... which requires expertise ... which may or may not be available The Problem People using computers to do science want to do science first. LSU Seminar 2 / 23
  • 7. Introduction Generative Programming NT2 Quaff Conclusion The Problem – and how we want to solve it The Facts The ”Library to bind them all” doesn’t exist (or we should have it already ) All those users want to take advantage of new architectures Few of them want to actually handle all the dirty work The Ends Provide a ”familiar” interface that let users benefit from parallelism Helps compilers to generate better parallel code Increase sustainability by decreasing amount of code to write The Means Generative Programming Template Meta-Programming Embedded Domain Specific Languages LSU Seminar 3 / 23
  • 8. Introduction Generative Programming NT2 Quaff Conclusion Talk Layout 1 Introduction 2 Generative Programming 3 NT2 4 Quaff 5 Conclusion LSU Seminar 4 / 23
  • 9. Introduction Generative Programming NT2 Quaff Conclusion Generative Programming Domain Specific Generative Component Concrete Application Application Description Translator Parametric Sub-components LSU Seminar 5 / 23
  • 10. Introduction Generative Programming NT2 Quaff Conclusion Generative Programming as a Tool Available techniques Dedicated compilers External pre-processing tools Languages supporting meta-programming LSU Seminar 6 / 23
  • 11. Introduction Generative Programming NT2 Quaff Conclusion Generative Programming as a Tool Available techniques Dedicated compilers External pre-processing tools Languages supporting meta-programming LSU Seminar 6 / 23
  • 12. Introduction Generative Programming NT2 Quaff Conclusion Generative Programming as a Tool Available techniques Dedicated compilers External pre-processing tools Languages supporting meta-programming Definition of Meta-programming Meta-programming is the writing of computer programs that analyse, transform and generate other programs (or them- selves) as their data. LSU Seminar 6 / 23
  • 13. Introduction Generative Programming NT2 Quaff Conclusion From Generative to Meta-programming Meta-programmable languages template H ASKELL metaOcaml C++ LSU Seminar 7 / 23
  • 14. Introduction Generative Programming NT2 Quaff Conclusion From Generative to Meta-programming Meta-programmable languages template H ASKELL metaOcaml C++ LSU Seminar 7 / 23
  • 15. Introduction Generative Programming NT2 Quaff Conclusion From Generative to Meta-programming Meta-programmable languages template H ASKELL metaOcaml C++ C++ meta-programming Relies on the C++ template sub-language Handles types and integral constants at compile-time Proved to be Turing-complete LSU Seminar 7 / 23
  • 16. Introduction Generative Programming NT2 Quaff Conclusion Embedded Domain Specific Languages What’s an EDSL ? DSL = Domain Specific Language Declarative language, easy-to-use, fitting the domain EDSL = DSL within a general purpose language EDSL in C++ Relies on operator overload abuse (Expression Templates) Carry semantic information around code fragment Generic implementation become self-aware of optimizations Exploiting static AST At the expression level: code generation At the function level: inter-procedural optimization LSU Seminar 8 / 23
  • 17. Introduction Generative Programming NT2 Quaff Conclusion Expression Templates matrix x(h,w),a(h,w),b(h,w); = x = cos(a) + (b*a); x + expr<assign ,expr<matrix&> ,expr<plus cos * , expr<cos ,expr<matrix&> > a b a , expr<multiplies ,expr<matrix&> ,expr<matrix&> > #pragma omp parallel for >(x,a,b); for(int j=0;j<h;++j) { for(int i=0;i<w;++i) { Arbitrary Transforms applied x(j,i) = cos(a(j,i)) on the meta-AST + ( b(j,i) * a(j,i) ); } } LSU Seminar 9 / 23
  • 18. Introduction Generative Programming NT2 Quaff Conclusion What is NT 2 ? A Scientific Computing Library Provide a simple, M ATLAB-like interface for users Provide high-performance computing entities and primitives Easily extendable LSU Seminar 10 / 23
  • 19. Introduction Generative Programming NT2 Quaff Conclusion What is NT 2 ? A Scientific Computing Library Provide a simple, M ATLAB-like interface for users Provide high-performance computing entities and primitives Easily extendable Principles Take a .m file, copy to a .cpp file LSU Seminar 10 / 23
  • 20. Introduction Generative Programming NT2 Quaff Conclusion What is NT 2 ? A Scientific Computing Library Provide a simple, M ATLAB-like interface for users Provide high-performance computing entities and primitives Easily extendable Principles Take a .m file, copy to a .cpp file Add #include <nt2/nt2.hpp> and do cosmetic changes LSU Seminar 10 / 23
  • 21. Introduction Generative Programming NT2 Quaff Conclusion What is NT 2 ? A Scientific Computing Library Provide a simple, M ATLAB-like interface for users Provide high-performance computing entities and primitives Easily extendable Principles Take a .m file, copy to a .cpp file Add #include <nt2/nt2.hpp> and do cosmetic changes Compile the file and link with libnt2.a LSU Seminar 10 / 23
  • 22. Introduction Generative Programming NT2 Quaff Conclusion M ATLAB you said ? R = I(:,:,1); G = I(:,:,2); B = I(:,:,3); Y = min(abs(0.299.*R+0.587.*G+0.114.*B),235); U = min(abs(-0.169.*R-0.331.*G+0.5.*B),240); V = min(abs(0.5.*R-0.419.*G-0.081.*B),240); LSU Seminar 11 / 23
  • 23. Introduction Generative Programming NT2 Quaff Conclusion Now with NT 2 table<double> R = I(_,_,1); table<double> G = I(_,_,2); table<double> B = I(_,_,3); table<double> Y, U, V; Y = min(abs(0.299*R+0.587*G+0.114*B),235); U = min(abs(-0.169*R-0.331*G+0.5*B),240); V = min(abs(0.5*R-0.419*G-0.081*B),240); LSU Seminar 12 / 23
  • 24. Introduction Generative Programming NT2 Quaff Conclusion Now with NT 2 table<float,settings(shallow,of_size_<640,480>)> R = I(_,_,1); table<float,settings(shallow,of_size_<640,480>)> G = I(_,_,2); table<float,settings(shallow,of_size_<640,480>)> B = I(_,_,3); table<float,settings(of_size_<640,480>)> Y, U, V; Y = min(abs(0.299*R+0.587*G+0.114*B),235), U = min(abs(-0.169*R-0.331*G+0.5*B),240), V = min(abs(0.5*R-0.419*G-0.081*B),240); LSU Seminar 12 / 23
  • 25. Introduction Generative Programming NT2 Quaff Conclusion Some Performances RGB2YUV timing (in cycles/pixels) Size 128x128 256x256 512x512 1024x1024 M ATLAB 2010a (2 cores) 85 89 97 102 C (1 core) 23.6 23.8 23.9 24.0 NT2 (1 core) 22.3 22.4 22.2 24.1 NT2 OpenMP (2 cores) 11.4 11.4 11.5 12.2 NT2 SIMD 5.5 5.6 5.6 5.9 NT2 SIMD+OpenMP 2.9 2.9 2.9 3.0 NT2 SIMD speed-up 3.9 3.9 3.9 4.0 NT2 OpenMP speed-up 1.92 1.96 1.98 1.95 NT2 vs M ATLAB speed-up 29.3 30.7 33.5 34 LSU Seminar 13 / 23
  • 26. Introduction Generative Programming NT2 Quaff Conclusion Some real life applications LSU Seminar 14 / 23
  • 27. Introduction Generative Programming NT2 Quaff Conclusion Parallel Skeletons in a nutshell Basic Principles [COLE 89] There are patterns in parallel applications Those patterns can be generalized in Skeletons Applications are assembled as combination of such patterns LSU Seminar 15 / 23
  • 28. Introduction Generative Programming NT2 Quaff Conclusion Parallel Skeletons in a nutshell Basic Principles [COLE 89] There are patterns in parallel applications Those patterns can be generalized in Skeletons Applications are assembled as combination of such patterns Functionnal point of view Skeletons are Higher-Order Functions Skeletons support a compositionnal semantic Applications become composition of state-less functions LSU Seminar 15 / 23
  • 29. Introduction Generative Programming NT2 Quaff Conclusion Spotting skeletons when you see one Pipeline Farm Processus 3 F2 Processus 1 Processus 2 Processus 4 Processus 6 Processus 7 F1 Distrib. F2 Collect. F3 Processus 5 F2 LSU Seminar 16 / 23
  • 30. Introduction Generative Programming NT2 Quaff Conclusion Objectives Proposing a C++ skeletons library Limit runtime overhead Use the formal model skeleton as guidelines LSU Seminar 17 / 23
  • 31. Introduction Generative Programming NT2 Quaff Conclusion Objectives Proposing a C++ skeletons library Limit runtime overhead Use the formal model skeleton as guidelines What we want to write run( pipeline( seq(F1), seq(F2), seq(F3) ) ) LSU Seminar 17 / 23
  • 32. Introduction Generative Programming NT2 Quaff Conclusion Objectives Proposing a C++ skeletons library Limit runtime overhead Use the formal model skeleton as guidelines What we want to run if( rank == 0 ) do { out = F1(); MPI_Send(&out,1,MPI_INT,1,0,MPI_COMM_WORLD); } while( isValid(out) ); if( rank == 1 ) do { MPI_Recv(&in,1,MPI_INT,0,0,MPI_COMM_WORLD,&s); out = F2(in); MPI_Send(&in,1,MPI_INT,2,0,MPI_COMM_WORLD); } while ( isValid(out) ) if( rank == 2 ) do { MPI_Recv(&in,1,MPI_INT,1,0,MPI_COMM_WORLD,&s); F2(in); } while ( isValid(in) ) LSU Seminar 17 / 23
  • 33. Introduction Generative Programming NT2 Quaff Conclusion Design Rationale Code generation process Generating the skeleton tree at compile-time Turning this tree into a process network using MPL Producing the C + MPI target code using Fusion PIPE Génération Production φ 2 Génération AST du RPSC φ 2 φ 2 de code C+MPI C C++ φ1 FARM3 φ3 MPI φ 1 f φ 3 φ2 LSU Seminar 18 / 23
  • 34. Introduction Generative Programming NT2 Quaff Conclusion Quaff over the CELL The OCELLE project IEF/CEA/TRT Tools for heterog. proc. MPI and skeleton for the Cell Developing for the CELL How to find a good mapping for an application ? LSU Seminar 19 / 23
  • 35. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy LSU Seminar 20 / 23
  • 36. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy Skeleton Code and Benchmarks void full_chain_harris(tile const&,tile&) { run( pardo<8>((seq(grad),seq(mul),seq(gauss),seq(coarsity))); } LSU Seminar 20 / 23
  • 37. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy Skeleton Code and Benchmarks void half_chain_harris(tile const&,tile&) { run( pardo<4>((seq(grad),seq(mul)) | (seq(gauss),seq(coarsity))); } LSU Seminar 20 / 23
  • 38. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy Skeleton Code and Benchmarks void no_chain_harris(tile const&,tile&) { run( pardo<2>( seq(grad) | seq(mul) | seq(gauss) | seq(coarsity)); } LSU Seminar 20 / 23
  • 39. Introduction Generative Programming NT2 Quaff Conclusion Harris Corner Detector Mul Ixx=Ix*Ix Gauss Sxx Grad X Ix coarsity I Mul Ixy=Ix*Iy Gauss Sxy Sxx*Syy-Sxy² K Grad Y Iy Mul Iyy=Iy*Iy Gauss Syy Results (PACT 09) Version Full-chain Half-chain No-chain Manual 11.26 8.36 9.97 Skell 11.86 8.64 10.43 Overhead 5.33% 3.35% 4.61% LSU Seminar 20 / 23
  • 40. Introduction Generative Programming NT2 Quaff Conclusion Let’s round this up! Parallel Computing for Scientist Parallel programmign tools are required BUT: ... They need to cater to users’ needs ! ... Users should feel «at home» LSU Seminar 21 / 23
  • 41. Introduction Generative Programming NT2 Quaff Conclusion Let’s round this up! Parallel Computing for Scientist Parallel programmign tools are required BUT: ... They need to cater to users’ needs ! ... Users should feel «at home» Our claims Software Libraries built as Generic and Generative components can solve a large chunk of parallelism related problems while being easy to use. EDSL are an efficient way to design such libraries C++ provides a wide selection of idioms to achieve this goal. LSU Seminar 21 / 23
  • 42. Introduction Generative Programming NT2 Quaff Conclusion Prospectives What we’re cooking at the moment Get a public release of NT 2 and Quaff NT 2 : Automatic matlab conversion with source to source compiler Quaff: Simplify new skeletons design by providing a Semantic Rules EDSL What we’ll be cooking in the future Multi-stage programming for skeleton over the Grid/Cloud NT 2 supports for embedded architectures LSU Seminar 22 / 23
  • 43. Thanks for your attention