Successfully reported this slideshow.
Your SlideShare is downloading. ×

Simplificando el paralelismo con Patrones Genéricos Paralelos

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Loading in …3
×

Check these out next

1 of 135 Ad

Simplificando el paralelismo con Patrones Genéricos Paralelos

Download to read offline

La programación genérica ofrece un mecanismo para abstraer componentes software sin traducir esta abstracción en un sobrecoste en el rendimiento. Esto ha permitido poder expresar patrones y algoritmos de uso común, permitiendo que los programadores de aplicación centrarse en los detalles del software de aplicación. Esta idea ha sido, durante décadas, uno de los principios de la Standard Template Library (STL) del lenguaje C++. El mismo principio puede aplicarse los esqueletos paralelos y los patrones de patrones de paralelismo, que aplican la idea de patrón de diseño al caso de la programación paralela. En este caso, un parámetro adicional de configuración es el modelo de ejecución a utilizar. En esta charla, consideraré diversos aspectos de la integración de las ideas de programación paralela y programación genérica. Daré una visión muy general del estado actual en C++17 y me centraré en un enfoque alternativo basado en la biblioteca GrPPI (Generic and reusable Parallel Pattern Interface).

La programación genérica ofrece un mecanismo para abstraer componentes software sin traducir esta abstracción en un sobrecoste en el rendimiento. Esto ha permitido poder expresar patrones y algoritmos de uso común, permitiendo que los programadores de aplicación centrarse en los detalles del software de aplicación. Esta idea ha sido, durante décadas, uno de los principios de la Standard Template Library (STL) del lenguaje C++. El mismo principio puede aplicarse los esqueletos paralelos y los patrones de patrones de paralelismo, que aplican la idea de patrón de diseño al caso de la programación paralela. En este caso, un parámetro adicional de configuración es el modelo de ejecución a utilizar. En esta charla, consideraré diversos aspectos de la integración de las ideas de programación paralela y programación genérica. Daré una visión muy general del estado actual en C++17 y me centraré en un enfoque alternativo basado en la biblioteca GrPPI (Generic and reusable Parallel Pattern Interface).

Advertisement
Advertisement

More Related Content

More from Facultad de Informática UCM (20)

Recently uploaded (20)

Advertisement

Simplificando el paralelismo con Patrones Genéricos Paralelos

  1. 1. An introduction to GrPPI An introduction to GrPPI Generic Reusable Parallel Patterns Interface J. Daniel Garcia ARCOS Group University Carlos III of Madrid Spain November 2019 cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 1/80
  2. 2. An introduction to GrPPI Warning c This work is under Attribution-NonCommercial- NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license. You are free to Share — copy and redistribute the ma- terial in any medium or format. b You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. e You may not use the material for commercial purposes. d If you remix, transform, or build upon the material, you may not distribute the modified material. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 2/80
  3. 3. An introduction to GrPPI ARCOS@uc3m UC3M: A young international research oriented university. ARCOS: An applied research group. Lines: High Performance Computing, Big data, Cyberphysical Systems, and Programming models for application improvement. Programming Models for Application Improvement: Provide programming tools for improving: Performance. Energy efficiency. Maintainability. Correctness. Standardization: ISO/IEC JTC/SC22/WG21. ISO C++ standards committe. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 3/80
  4. 4. An introduction to GrPPI Acknowledgements Project 801091 “ASPIDE: Exascale programming models for extreme programming funded by the European Commission through H2020 FET-HPC (2018-2020). Project ICT 644235 “REPHRASE: REfactoring Parallel Heterogeneous Resource-aware Applications” funded by the European Commission through H2020 program (2015-2018). Project TIN2016-79673-P “Towards Unification of HPC and Big Data Paradigms” funded by the Spanish Ministry of Economy and Competitiveness (2016-2019). cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 4/80
  5. 5. An introduction to GrPPI GrPPI team Main team J. Daniel Garcia (UC3M, lead). Javier Garcia Blas (UC3M). David del Río (UC3M). Javier Fernández (UC3M). Cooperation Marco Danelutto (Univ. Pisa) Massimo Torquati (Univ. Pisa) Marco Aldinucci (Univ. Torino) Fabio Tordini (Univ. Torino) Plácido Fernández (UC3M-CERN). Manuel F. Dolz (Univ. Jaume I). . . . cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 5/80
  6. 6. An introduction to GrPPI Introduction 1 Introduction 2 Simple use 3 Patterns in GrPPI 4 Evaluation 5 Conclusions cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 6/80
  7. 7. An introduction to GrPPI Introduction Sequential Programming versus Parallel Programming Sequential programming Well-known set of control-structures embedded in programming languages. Control structures inherently sequential. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 7/80
  8. 8. An introduction to GrPPI Introduction Sequential Programming versus Parallel Programming Sequential programming Well-known set of control-structures embedded in programming languages. Control structures inherently sequential. Traditional Parallel programming Constructs adapting sequential control structures to the parallel world (e.g. parallel-for). cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 7/80
  9. 9. An introduction to GrPPI Introduction Sequential Programming versus Parallel Programming Sequential programming Well-known set of control-structures embedded in programming languages. Control structures inherently sequential. Traditional Parallel programming Constructs adapting sequential control structures to the parallel world (e.g. parallel-for). But wait! What if we had constructs that could be both sequential and parallel? cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 7/80
  10. 10. An introduction to GrPPI Introduction Software design There are two ways of constructing a software design: cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
  11. 11. An introduction to GrPPI Introduction Software design There are two ways of constructing a software design: One way is cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
  12. 12. An introduction to GrPPI Introduction Software design There are two ways of constructing a software design: One way is to make it so simple that there are obviously no defi- ciencies, cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
  13. 13. An introduction to GrPPI Introduction Software design There are two ways of constructing a software design: One way is to make it so simple that there are obviously no defi- ciencies, and the other way is cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
  14. 14. An introduction to GrPPI Introduction Software design There are two ways of constructing a software design: One way is to make it so simple that there are obviously no defi- ciencies, and the other way is to make it so complicated that there are no obvious deficiencies. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
  15. 15. An introduction to GrPPI Introduction Software design There are two ways of constructing a software design: One way is to make it so simple that there are obviously no defi- ciencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. C.A.R Hoare cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 8/80
  16. 16. An introduction to GrPPI Introduction Adding two vectors Traditional way using numvec = std::vector<double>; numvec add(const numvec & v1, const numvec & v2) { numvec res; res.reserve(v1.size() ) ; // Asume equal sizes for (int i =0; i <v1.size() ; ++i) { res.push_back(v1[i]+v2[i]) ; } return res; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 9/80
  17. 17. An introduction to GrPPI Introduction Adding two vectors Traditional way using numvec = std::vector<double>; numvec add(const numvec & v1, const numvec & v2) { numvec res; res.reserve(v1.size() ) ; // Asume equal sizes for (int i =0; i <v1.size() ; ++i) { res.push_back(v1[i]+v2[i]) ; } return res; } Adds additional constraints. Traversing in-order. Potential mistakes. i<v1.size() versus i<=v1.size(). cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 9/80
  18. 18. An introduction to GrPPI Introduction Adding two vectors The STL way (C++98/03) using numvec = std::vector<double>; numvec add(const numvec & v1, const numvec & v2) { numvec res; res.reserve(v1.size() ) ; // Asume equal sizes std :: transform(v1.begin(), v1.end(), v2.begin() , std :: back_inserter(res), std :: plus<>{}) ; return res; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 10/80
  19. 19. An introduction to GrPPI Introduction Adding two vectors The STL way (C++98/03) using numvec = std::vector<double>; numvec add(const numvec & v1, const numvec & v2) { numvec res; res.reserve(v1.size() ) ; // Asume equal sizes std :: transform(v1.begin(), v1.end(), v2.begin() , std :: back_inserter(res), std :: plus<>{}) ; return res; } Minimize off-by-one mistakes. Type specific optimizations. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 10/80
  20. 20. An introduction to GrPPI Introduction Adding two vectors The STL way (C++14) using numvec = std::vector<double>; numvec add(const numvec & v1, const numvec & v2) { numvec res; res.reserve(v1.size() ) ; // Asume equal sizes std :: transform(v1.begin(), v1.end(), v2.begin() , std :: back_inserter(res), []( auto x, auto y) { return x+y; }) ; return res; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 11/80
  21. 21. An introduction to GrPPI Introduction Adding two vectors The STL way (C++14) using numvec = std::vector<double>; numvec add(const numvec & v1, const numvec & v2) { numvec res; res.reserve(v1.size() ) ; // Asume equal sizes std :: transform(v1.begin(), v1.end(), v2.begin() , std :: back_inserter(res), []( auto x, auto y) { return x+y; }) ; return res; } Use (possibly generic) lambdas . cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 11/80
  22. 22. An introduction to GrPPI Introduction A brief history of patterns From building and architecture (Cristopher Alexander): 1977: A Pattern Language: Towns, Buildings, Construction. 1979: The timeless way of buildings. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 12/80
  23. 23. An introduction to GrPPI Introduction A brief history of patterns From building and architecture (Cristopher Alexander): 1977: A Pattern Language: Towns, Buildings, Construction. 1979: The timeless way of buildings. To software design (Gamma et al.): 1993: Design Patterns: abstraction and reuse of object oriented design. ECOOP. 1995: Design Patterns. Elements of Reusable Object-Oriented Software. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 12/80
  24. 24. An introduction to GrPPI Introduction A brief history of patterns From building and architecture (Cristopher Alexander): 1977: A Pattern Language: Towns, Buildings, Construction. 1979: The timeless way of buildings. To software design (Gamma et al.): 1993: Design Patterns: abstraction and reuse of object oriented design. ECOOP. 1995: Design Patterns. Elements of Reusable Object-Oriented Software. To parallel programming (McCool, Reinders, Robinson): 2012: Structured Parallel Programming: Patterns for Efficient Computation. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 12/80
  25. 25. An introduction to GrPPI Introduction Parallel Patterns Levels of patterns: Software Architecture Patterns. Design Patterns. Algorithm Strategy Patterns. Implementation patterns. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 13/80
  26. 26. An introduction to GrPPI Introduction Parallel Patterns Levels of patterns: Software Architecture Patterns. Design Patterns. Algorithm Strategy Patterns. Implementation patterns. Algorithm Strategy Pattern or Algorithmic Skeleton A way to codify best practices in parallel programming in a reusable way. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 13/80
  27. 27. An introduction to GrPPI Introduction Parallel Patterns Levels of patterns: Software Architecture Patterns. Design Patterns. Algorithm Strategy Patterns. Implementation patterns. Algorithm Strategy Pattern or Algorithmic Skeleton A way to codify best practices in parallel programming in a reusable way. A single Parallel Pattern may have different realizations in different programming models. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 13/80
  28. 28. An introduction to GrPPI Introduction Example: map pattern Apply a single elemental operation to different data items. SAXPY: Sequential for (size_t i=0; i<n; ++i) { y[ i ] = a ∗ x[ i ] + y[ i ]; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 14/80
  29. 29. An introduction to GrPPI Introduction Example: map pattern Apply a single elemental operation to different data items. SAXPY: Sequential for (size_t i=0; i<n; ++i) { y[ i ] = a ∗ x[ i ] + y[ i ]; } SAXPY: OpenMP #pragma omp parallel for for (size_t i=0; i<n; ++i) { y[ i ] = a ∗ x[ i ] + y[ i ]; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 14/80
  30. 30. An introduction to GrPPI Introduction Example: map pattern Apply a single elemental operation to different data items. SAXPY: Sequential for (size_t i=0; i<n; ++i) { y[ i ] = a ∗ x[ i ] + y[ i ]; } SAXPY: OpenMP #pragma omp parallel for for (size_t i=0; i<n; ++i) { y[ i ] = a ∗ x[ i ] + y[ i ]; } SAXPY: TBB tbb :: parallel_for (tbb :: blocked_range<int>{0,n}, [&](tbb :: blocked_range<int> r) { for (auto i : r) { y[ i ] = a ∗ x[ i ] + y[ i ]; } }) ; cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 14/80
  31. 31. An introduction to GrPPI Introduction Ideals for Parallel Patterns Applications should be expressed independently of the execution model. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
  32. 32. An introduction to GrPPI Introduction Ideals for Parallel Patterns Applications should be expressed independently of the execution model. Computations should be expressed in terms of structured composable patterns. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
  33. 33. An introduction to GrPPI Introduction Ideals for Parallel Patterns Applications should be expressed independently of the execution model. Computations should be expressed in terms of structured composable patterns. Multiple back-ends should be offered with simple switching mechanisms. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
  34. 34. An introduction to GrPPI Introduction Ideals for Parallel Patterns Applications should be expressed independently of the execution model. Computations should be expressed in terms of structured composable patterns. Multiple back-ends should be offered with simple switching mechanisms. Interface should integrate seamlessly with modern C++ and its standard library. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
  35. 35. An introduction to GrPPI Introduction Ideals for Parallel Patterns Applications should be expressed independently of the execution model. Computations should be expressed in terms of structured composable patterns. Multiple back-ends should be offered with simple switching mechanisms. Interface should integrate seamlessly with modern C++ and its standard library. Applications should be able to take advantage of modern C++ language features. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 15/80
  36. 36. An introduction to GrPPI Introduction C++17: Execution policies An execution policy allows to disambiguate parallel algorithms overloads. sequenced_policy: Not parallelized. std::execution::seq. parallel_policy: Execution can be parallelized in multiple threads. std::execution::par. parallel_unsequenced_policy: Execution can be parallelized, vectorized, or migrated across threads. std::execution::par_unseq. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 16/80
  37. 37. An introduction to GrPPI Introduction C++17: Parallel algorithms Most algorithms in <algorithm> and <numeric> have execution policy based versions. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 17/80
  38. 38. An introduction to GrPPI Introduction C++17: Parallel algorithms Most algorithms in <algorithm> and <numeric> have execution policy based versions. SAXPY: C++17 std :: transform(std::execution::par, y.begin(), y.end(), x.begin(), y.begin(), [=]( auto vx, auto vy) { return a ∗ vx + vy; }) ; cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 17/80
  39. 39. An introduction to GrPPI Introduction C++17: Parallel algorithms Most algorithms in <algorithm> and <numeric> have execution policy based versions. SAXPY: C++17 std :: transform(std::execution::par, y.begin(), y.end(), x.begin(), y.begin(), [=]( auto vx, auto vy) { return a ∗ vx + vy; }) ; However . . . No fine control knobs. Lack of task parallelism and stream parallelism. OK as an entry level to parallelism. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 17/80
  40. 40. An introduction to GrPPI Introduction GrPPI https://github.com/arcosuc3m/grppi cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 18/80
  41. 41. An introduction to GrPPI Introduction GrPPI https://github.com/arcosuc3m/grppi Generic reusable Parallel Pattern Interface. A header only library (might change). A set of execution policies. A set of type safe generic algorithms. Requires C++14. Apache 2.0 License. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 18/80
  42. 42. An introduction to GrPPI Simple use 1 Introduction 2 Simple use 3 Patterns in GrPPI 4 Evaluation 5 Conclusions cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 19/80
  43. 43. An introduction to GrPPI Simple use Example: Transforming a sequence Given a sequence of frames generate a new sequence of frames in grayscale. struct frame { /∗ ... ∗/ }; frame togray(const frame & f); frame[0] frame[1] frame[2] frame[3] frame[4] frame[0] frame[1] frame[2] frame[3] frame[4] togray() togray() togray() togray() togray() cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 20/80
  44. 44. An introduction to GrPPI Simple use Transforming a sequence Traditional explicit loop using frameseq = std::vector<frame>; frameseq seq_togray(const frameseq & s) { frameseq r; r.reserve(s.size() ) ; // Requires processing in−order for (const auto & f : s) { r.push_back(togray(f)); } return r; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 21/80
  45. 45. An introduction to GrPPI Simple use Transforming a sequence STL way using frameseq = std::vector<frame>; frameseq seq_togray(const frameseq & s) { frameseq r; r.reserve(s.size() ) ; std :: transform(s.begin(), s.end(), std :: back_inserter(r), togray); return r; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 22/80
  46. 46. An introduction to GrPPI Simple use Transforming a sequence Parallel STL way (C++17) using frameseq = std::vector<frame>; frameseq seq_togray(const frameseq & s) { frameseq r; r.reserve(s.size () ) ; // No execution order assumed std :: transform(std :: execution::par, s.begin(), s.end(), std :: back_inserter(r), togray); return r; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 23/80
  47. 47. An introduction to GrPPI Simple use Transforming a sequence GrPPI using frameseq = std::vector<frame>; frameseq seq_togray(const frameseq & s) { frameseq r(s.size() ); grppi :: sequential_execution seq; grppi :: map(seq, s.begin(), s.end(), r.begin(), togray); return r; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 24/80
  48. 48. An introduction to GrPPI Simple use Transforming a sequence GrPPI + lambda using frameseq = std::vector<frame>; frameseq seq_togray(const frameseq & s) { frameseq r(s.size() ); grppi :: sequential_execution seq; grppi :: map(seq, s.begin(), s.end(), r.begin(), []( const frame & f) { return filter (f,64); }) ; return r; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 25/80
  49. 49. An introduction to GrPPI Simple use Transforming a sequence GrPPI + lambda - iterators using frameseq = std::vector<frame>; frameseq seq_togray(const frameseq & s) { frameseq r(s.size() ); grppi :: sequential_execution seq; grppi :: map(seq, s, r, []( const frame & f) { return filter (f,64); }) ; return r; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 26/80
  50. 50. An introduction to GrPPI Patterns in GrPPI 1 Introduction 2 Simple use 3 Patterns in GrPPI 4 Evaluation 5 Conclusions cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 27/80
  51. 51. An introduction to GrPPI Patterns in GrPPI Controlling execution 3 Patterns in GrPPI Controlling execution Patterns overview Data patterns Streaming patterns cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 28/80
  52. 52. An introduction to GrPPI Patterns in GrPPI Controlling execution Execution types Execution model is encapsulated in execution types. Always provided as first argument to patterns. Current concrete execution types: Sequential: sequential_execution. ISO C++ Threads: parallel_execution_native. OpenMP: parallel_execution_omp. Intel TBB: parallel_execution_tbb. FastFlow: parallel_execution_ff. Run-time polymorphic wrapper through type erasure: dynamic_execution. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 29/80
  53. 53. An introduction to GrPPI Patterns in GrPPI Controlling execution Execution model properties Some execution types allow finer configurtion. Example: Concurrency degree. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 30/80
  54. 54. An introduction to GrPPI Patterns in GrPPI Controlling execution Execution model properties Some execution types allow finer configurtion. Example: Concurrency degree. Interface: ex.set_concurrency_degree(4); int n = ex.concurrency_degree(); cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 30/80
  55. 55. An introduction to GrPPI Patterns in GrPPI Controlling execution Execution model properties Some execution types allow finer configurtion. Example: Concurrency degree. Interface: ex.set_concurrency_degree(4); int n = ex.concurrency_degree(); Default values: Sequential ⇒ 1. Native ⇒ std::thread::hardware_concurrency(). OpenMP ⇒ omp_get_num_threads(). cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 30/80
  56. 56. An introduction to GrPPI Patterns in GrPPI Patterns overview 3 Patterns in GrPPI Controlling execution Patterns overview Data patterns Streaming patterns cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 31/80
  57. 57. An introduction to GrPPI Patterns in GrPPI Patterns overview A classification Data patterns: Express computations over a data set. map, reduce, map/reduce, stencil. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 32/80
  58. 58. An introduction to GrPPI Patterns in GrPPI Patterns overview A classification Data patterns: Express computations over a data set. map, reduce, map/reduce, stencil. Task patterns: Express task composition. divide/conquer. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 32/80
  59. 59. An introduction to GrPPI Patterns in GrPPI Patterns overview A classification Data patterns: Express computations over a data set. map, reduce, map/reduce, stencil. Task patterns: Express task composition. divide/conquer. Streaming patterns: Express computations ver a (possibly unbounded) data set. pipeline. Specialized stages: farm, filter, reduction, iteration. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 32/80
  60. 60. An introduction to GrPPI Patterns in GrPPI Data patterns 3 Patterns in GrPPI Controlling execution Patterns overview Data patterns Streaming patterns cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 33/80
  61. 61. An introduction to GrPPI Patterns in GrPPI Data patterns Patterns on data sets A data pattern performs an operation on one or more data sets that are already in memory. Input: One or more data sets. Operations. Output: A data set (map, stencil). A single value (reduce, map/reduce). cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 34/80
  62. 62. An introduction to GrPPI Patterns in GrPPI Data patterns Maps on data sequences A map pattern applies an operation to every element in a a data set, generating a new data set. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 35/80
  63. 63. An introduction to GrPPI Patterns in GrPPI Data patterns Maps on data sequences A map pattern applies an operation to every element in a a data set, generating a new data set. Unidimensional: cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 35/80
  64. 64. An introduction to GrPPI Patterns in GrPPI Data patterns Maps on data sequences A map pattern applies an operation to every element in a a data set, generating a new data set. Unidimensional: Multidimensional: cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 35/80
  65. 65. An introduction to GrPPI Patterns in GrPPI Data patterns Single sequences mapping Double all elements in vector sequentially std :: vector<double> double_elements(const std::vector<double> & v) { std :: vector<double> res(v.size()); grppi :: sequential_execution seq; grppi :: map(seq, v, res, []( double x) { return 2∗x; }) ; return res; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 36/80
  66. 66. An introduction to GrPPI Patterns in GrPPI Data patterns Single sequences mapping Double all elements in vector with OpenMP std :: vector<double> double_elements(const std::vector<double> & v) { std :: vector<double> res(v.size()); grppi :: parallel_execution_omp omp; grppi :: map(omp, v, res, []( double x) { return 2∗x; }) ; return res; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 37/80
  67. 67. An introduction to GrPPI Patterns in GrPPI Data patterns Multiple sequences mapping Add two vectors template <typename Execution> std :: vector<double> add_vectors(const Execution & ex, const std::vector<double> & v1, const std::vector<double> & v2) { auto size = std :: min(v1.size() , v2.size() ); std :: vector<double> res(size); grppi :: map(ex, zip(v1, v2), res, []( double x, double y) { return x+y; }) ; return res; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 38/80
  68. 68. An introduction to GrPPI Patterns in GrPPI Data patterns Multiple sequences mapping Add three vectors template <typename Execution> std :: vector<double> add_vectors(const Execution & ex, const std::vector<double> & v1, const std::vector<double> & v2, const std::vector<double> & v3) { auto size = std :: min(v1.size() , v2.size() ); std :: vector<double> res(size); grppi :: map(ex, zip(v1,v2,v3), res, []( double x, double y, double z) { return x+y+z; }); return res; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 39/80
  69. 69. An introduction to GrPPI Patterns in GrPPI Data patterns Heterogeneous mapping The result can be from a different type. Complex vector from real and imaginary vectors template <typename Execution> std :: vector<complex<double>> create_cplx(const Execution & ex, const std::vector<double> & re, const std::vector<double> & im) { auto size = std :: min(re.size() , im.size() ); std :: vector<complex<double>> res(size); grppi :: map(ex, zip(re,im), res, []( double r, double i) −> complex<double> { return {r,i}; }); return res; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 40/80
  70. 70. An introduction to GrPPI Patterns in GrPPI Data patterns Reductions on data sequences A reduce pattern combines all values in a data set using a binary combination operation. id result cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 41/80
  71. 71. An introduction to GrPPI Patterns in GrPPI Data patterns Homogeneous reduction Add a sequence of values template <typename Execution> double add_sequence(const Execution & ex, const vector<double> & v) { return grppi :: reduce(ex, v, 0.0, []( double x, double y) { return x+y; }) ; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 42/80
  72. 72. An introduction to GrPPI Patterns in GrPPI Data patterns Heterogeneous reduction Add areas of shapes template <typename Execution> double add_areas(const Execution & ex, const std::vector<shape> & shapes) { return grppi :: reduce(ex, shapes, 0.0, []( double a, const auto & s) { return a + s.area()() ; }) ; } Note: Better expressed as a map reduce. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 43/80
  73. 73. An introduction to GrPPI Patterns in GrPPI Data patterns Map/reduce pattern A map/reduce pattern combines a map pattern and a reduce pattern into a single pattern. 1 One or more data sets are mapped applying a transformation operation. 2 The results are combined by a reduction operation. A map/reduce could be also expressed by the composition of a map and a reduce. However, map/reduce may potentially fuse both stages, allowing for extra optimizations. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 44/80
  74. 74. An introduction to GrPPI Patterns in GrPPI Data patterns Map/reduce id result cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 45/80
  75. 75. An introduction to GrPPI Patterns in GrPPI Data patterns Single sequence map/reduce Sum of squares template <typename Execution> double sum_squares(const Execution & ex, const std::vector<double> & v) { return grppi :: map_reduce(ex, v, 0.0, []( double x) { return x∗x; } []( double x, double y) { return x+y; } ); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 46/80
  76. 76. An introduction to GrPPI Patterns in GrPPI Data patterns Map/reduce on two data sets Scalar product template <typename Execution> double scalar_product(const Execution & ex, const std::vector<double> & v1, const std::vector<double> & v2) { return grppi :: map_reduce(ex, zip(v1,v2), 0.0, []( double x, double y) { return x∗y; }, []( double x, double y) { return x+y; }) ; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 47/80
  77. 77. An introduction to GrPPI Patterns in GrPPI Data patterns Heterogeneous reduction versus map/reduce Add areas of shapes template <typename Execution> int add_areas(const Execution & ex, const std::vector<shape> & shapes) { return grppi :: map_reduce(ex, shapes, 0.0, []( const auto & s) { return s.area(); }, []( double a, double b) { return a+b; } ); } Note: Better than previous heterogeneous reduce. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 48/80
  78. 78. An introduction to GrPPI Patterns in GrPPI Data patterns Cannonical map/reduce Given a sequence of words, produce a container where: The key is the word. The value is the number of occurrences of that word. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
  79. 79. An introduction to GrPPI Patterns in GrPPI Data patterns Cannonical map/reduce Given a sequence of words, produce a container where: The key is the word. The value is the number of occurrences of that word. Word frequencies template <typename Execution> auto word_freq(const Execution & ex, const std::vector<std::string> & words) { cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
  80. 80. An introduction to GrPPI Patterns in GrPPI Data patterns Cannonical map/reduce Given a sequence of words, produce a container where: The key is the word. The value is the number of occurrences of that word. Word frequencies template <typename Execution> auto word_freq(const Execution & ex, const std::vector<std::string> & words) { using namespace std; using dictionary = std :: map<string,int>; cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
  81. 81. An introduction to GrPPI Patterns in GrPPI Data patterns Cannonical map/reduce Given a sequence of words, produce a container where: The key is the word. The value is the number of occurrences of that word. Word frequencies template <typename Execution> auto word_freq(const Execution & ex, const std::vector<std::string> & words) { using namespace std; using dictionary = std :: map<string,int>; return grppi :: map_reduce(ex, words, dictionary{}, cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
  82. 82. An introduction to GrPPI Patterns in GrPPI Data patterns Cannonical map/reduce Given a sequence of words, produce a container where: The key is the word. The value is the number of occurrences of that word. Word frequencies template <typename Execution> auto word_freq(const Execution & ex, const std::vector<std::string> & words) { using namespace std; using dictionary = std :: map<string,int>; return grppi :: map_reduce(ex, words, dictionary{}, []( string w) −> dictionary { return {w,1}; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
  83. 83. An introduction to GrPPI Patterns in GrPPI Data patterns Cannonical map/reduce Given a sequence of words, produce a container where: The key is the word. The value is the number of occurrences of that word. Word frequencies template <typename Execution> auto word_freq(const Execution & ex, const std::vector<std::string> & words) { using namespace std; using dictionary = std :: map<string,int>; return grppi :: map_reduce(ex, words, dictionary{}, []( string w) −> dictionary { return {w,1}; } []( dictionary & lhs, const dictionary & rhs) −> dictionary { for (auto & entry : rhs) { lhs[entry. first ] += entry.second; } return lhs; }) ; } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 49/80
  84. 84. An introduction to GrPPI Patterns in GrPPI Streaming patterns 3 Patterns in GrPPI Controlling execution Patterns overview Data patterns Streaming patterns cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 50/80
  85. 85. An introduction to GrPPI Patterns in GrPPI Streaming patterns Pipeline pattern A pipeline pattern allows processing a data stream where the computation may be divided in multiple stages. Each stage processes the data item generated in the previous stage and passes the produced result to the next stage. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 51/80
  86. 86. An introduction to GrPPI Patterns in GrPPI Streaming patterns Standalone pipeline A standalone pipeline is a top-level pipeline. Invoking the pipeline translates into its execution. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 52/80
  87. 87. An introduction to GrPPI Patterns in GrPPI Streaming patterns Standalone pipeline A standalone pipeline is a top-level pipeline. Invoking the pipeline translates into its execution. Given: A generator g : ∅ → T1 ∪ ∅ A sequence of transformers ti : Ti → Ti+1 For every non-empty value generated by g, it evaluates: tn(tn−1(. . . t1(g()))) cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 52/80
  88. 88. An introduction to GrPPI Patterns in GrPPI Streaming patterns Generators A generator g is any callable C++ entity that: Takes no argument. Returns a value of type T that may hold (or not) a value. Null value signals end of stream. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
  89. 89. An introduction to GrPPI Patterns in GrPPI Streaming patterns Generators A generator g is any callable C++ entity that: Takes no argument. Returns a value of type T that may hold (or not) a value. Null value signals end of stream. The return value must be any type that: Is copy-constructible or move-constructible. T x = g(); cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
  90. 90. An introduction to GrPPI Patterns in GrPPI Streaming patterns Generators A generator g is any callable C++ entity that: Takes no argument. Returns a value of type T that may hold (or not) a value. Null value signals end of stream. The return value must be any type that: Is copy-constructible or move-constructible. T x = g(); Is contextually convertible to bool if (x) { /∗ ... ∗/ } if (!x) { /∗ ... ∗/ } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
  91. 91. An introduction to GrPPI Patterns in GrPPI Streaming patterns Generators A generator g is any callable C++ entity that: Takes no argument. Returns a value of type T that may hold (or not) a value. Null value signals end of stream. The return value must be any type that: Is copy-constructible or move-constructible. T x = g(); Is contextually convertible to bool if (x) { /∗ ... ∗/ } if (!x) { /∗ ... ∗/ } Can be derreferenced auto val = ∗x; cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
  92. 92. An introduction to GrPPI Patterns in GrPPI Streaming patterns Generators A generator g is any callable C++ entity that: Takes no argument. Returns a value of type T that may hold (or not) a value. Null value signals end of stream. The return value must be any type that: Is copy-constructible or move-constructible. T x = g(); Is contextually convertible to bool if (x) { /∗ ... ∗/ } if (!x) { /∗ ... ∗/ } Can be derreferenced auto val = ∗x; The standard library offers an excellent candidate std::experimental::optional<T>. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 53/80
  93. 93. An introduction to GrPPI Patterns in GrPPI Streaming patterns Simple pipeline x -> x*x -> 1/x -> print template <typename Execution> void run_pipe(const Execution & ex, int n) { grppi :: pipeline(ex, [ i=0,max=n] () mutable −> optional<int> { if ( i<max) return i++; else return {}; }, []( int x) −> double { return x∗x; }, []( double x) { return 1/x; }, []( double x) { cout << x << "n"; } ); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 54/80
  94. 94. An introduction to GrPPI Patterns in GrPPI Streaming patterns Nested pipelines Pipelines may be nested. An inner pipeline: Does not take an execution policy. All stages are transformers (no generator). The last stage must also produce values. The inner pipeline uses the same execution policy than the outer pipeline. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 55/80
  95. 95. An introduction to GrPPI Patterns in GrPPI Streaming patterns Nested pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { grppi:parallel_execution_native ex; cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 56/80
  96. 96. An introduction to GrPPI Patterns in GrPPI Streaming patterns Nested pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { grppi:parallel_execution_native ex; grppi :: pipeline(ex, cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 56/80
  97. 97. An introduction to GrPPI Patterns in GrPPI Streaming patterns Nested pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { grppi:parallel_execution_native ex; grppi :: pipeline(ex, [& in_file ]() −> optional<frame> { frame f = read_frame(file); if (! file ) return {}; return f ; }, cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 56/80
  98. 98. An introduction to GrPPI Patterns in GrPPI Streaming patterns Nested pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { grppi:parallel_execution_native ex; grppi :: pipeline(ex, [& in_file ]() −> optional<frame> { frame f = read_frame(file); if (! file ) return {}; return f ; }, pipeline( []( const frame & f) { return filter (f); }, []( const frame & f) { return gray_scale(f); }, }, cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 56/80
  99. 99. An introduction to GrPPI Patterns in GrPPI Streaming patterns Nested pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { grppi:parallel_execution_native ex; grppi :: pipeline(ex, [& in_file ]() −> optional<frame> { frame f = read_frame(file); if (! file ) return {}; return f ; }, pipeline( []( const frame & f) { return filter (f); }, []( const frame & f) { return gray_scale(f); }, }, [& out_file ](const frame & f) { write_frame(out_file , f); } ); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 56/80
  100. 100. An introduction to GrPPI Patterns in GrPPI Streaming patterns Piecewise pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
  101. 101. An introduction to GrPPI Patterns in GrPPI Streaming patterns Piecewise pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { auto reader = [& in_file ]() −> optional<frame> { frame f = read_frame(file); if (! file ) return {}; return f ; }; cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
  102. 102. An introduction to GrPPI Patterns in GrPPI Streaming patterns Piecewise pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { auto reader = [& in_file ]() −> optional<frame> { frame f = read_frame(file); if (! file ) return {}; return f ; }; auto transformer = pipeline( []( const frame & f) { return filter (f); }, []( const frame & f) { return gray_scale(f); }, }; cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
  103. 103. An introduction to GrPPI Patterns in GrPPI Streaming patterns Piecewise pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { auto reader = [& in_file ]() −> optional<frame> { frame f = read_frame(file); if (! file ) return {}; return f ; }; auto transformer = pipeline( []( const frame & f) { return filter (f); }, []( const frame & f) { return gray_scale(f); }, }; auto writer = [& out_file ](const frame & f) { write_frame(out_file , f); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
  104. 104. An introduction to GrPPI Patterns in GrPPI Streaming patterns Piecewise pipelines Image processing void process(std::istream & in_file , std :: ostream & out_file) { auto reader = [& in_file ]() −> optional<frame> { frame f = read_frame(file); if (! file ) return {}; return f ; }; auto transformer = pipeline( []( const frame & f) { return filter (f); }, []( const frame & f) { return gray_scale(f); }, }; auto writer = [& out_file ](const frame & f) { write_frame(out_file , f); } grppi:parallel_execution_native ex; grppi :: pipeline(ex, reader, transformer, writer ); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 57/80
  105. 105. An introduction to GrPPI Patterns in GrPPI Streaming patterns Farm pattern A farm is a streaming pattern applicable to a stage in a pipeline, providing multiple tasks to process data items from a data stream. A farm has an associated cardinality which is the number of parallel tasks used to serve the stage. Each task in a farm runs a transformer for each data item it receives. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 58/80
  106. 106. An introduction to GrPPI Patterns in GrPPI Streaming patterns Farms in pipelines Improving a video template <typename Execution> void run_pipe(const Execution & ex, std::ifstream & filein , std :: ofstream & fileout ) { grppi :: pipeline(ex, [& filein ] () −> optional<frame> { frame f = read_frame(filein); if (! filein ) retrun {}; return f ; }, farm(4, []( const frame & f) { return improve(f); }, [& fileout ] (const frame & f) { write_frame(f); } ); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 59/80
  107. 107. An introduction to GrPPI Patterns in GrPPI Streaming patterns Piecewise farms Improving a video template <typename Execution> void run_pipe(const Execution & ex, std::ifstream & filein , std :: ofstream & fileout ) { auto improver = farm(4, []( const frame & f) { return improve(f); }) ; grppi :: pipeline(ex, [& filein ] () −> optional<frame> { frame f = read_frame(filein); if (! filein ) retrun {}; return f ; }, improver, [& fileout ] (const frame & f) { write_frame(f); } ); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 60/80
  108. 108. An introduction to GrPPI Patterns in GrPPI Streaming patterns Ordering Signals if pipeline items must be consumed in the same order they were produced. Do they need to be time-stamped? Default is ordered. API ex.enable_ordering() ex.disable_ordering() bool o = ex.is_ordered() cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 61/80
  109. 109. An introduction to GrPPI Patterns in GrPPI Streaming patterns Queueing properties Some policies (native and omp) use queues to communicate pipeline stages. Properties: Queue size: Buffer size of the queue. Mode: blocking versus lock-free. API ex.set_queue_attributes(100, mode::blocking) cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 62/80
  110. 110. An introduction to GrPPI Patterns in GrPPI Streaming patterns Filter pattern A filter pattern discards (or keeps) the data items from a data stream based on the outcome of a predicate. This pattern can be used only as a stage of a pipeline. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 63/80
  111. 111. An introduction to GrPPI Patterns in GrPPI Streaming patterns Filter pattern A filter pattern discards (or keeps) the data items from a data stream based on the outcome of a predicate. This pattern can be used only as a stage of a pipeline. Alternatives: Keep: Only data items satisfying the predicate are sent to the next stage. Discard: Only data items not satisfying the predicate are sent to the next stage. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 63/80
  112. 112. An introduction to GrPPI Patterns in GrPPI Streaming patterns Filtering in Print primes bool is_prime(int n); template <typename Execution> void print_primes(const Execution & ex, int n) { grppi :: pipeline(exec, [ i=0,max=n]() mutable −> optional<int> { if ( i<=n) return i++; else return {}; }, grppi :: keep(is_prime), []( int x) { cout << x << "n"; } ); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 64/80
  113. 113. An introduction to GrPPI Patterns in GrPPI Streaming patterns Filtering out Discard words template <typename Execution> void print_primes(const Execution & ex, std::istream & is) { grppi :: pipeline(exec, [& file ]() −> optional<string> { string word; file >> word; if (! file ) { return {}; } else { return word; } }, grppi :: discard ([]( std :: string w) { return w.length() < 4; }, []( std :: string w) { cout << x << "n"; } ); } cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 65/80
  114. 114. An introduction to GrPPI Evaluation 1 Introduction 2 Simple use 3 Patterns in GrPPI 4 Evaluation 5 Conclusions cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 66/80
  115. 115. An introduction to GrPPI Evaluation Example Video frame processing for border detection with two filters. Gaussian blur. Sobel. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 67/80
  116. 116. An introduction to GrPPI Evaluation Example Video frame processing for border detection with two filters. Gaussian blur. Sobel. Using pipeline pattern: S1: Frame reading. S2: Gaussian blur (may apply farm). S3: Sobel filter (may apply farm). S4: Frame writing. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 67/80
  117. 117. An introduction to GrPPI Evaluation Example Video frame processing for border detection with two filters. Gaussian blur. Sobel. Using pipeline pattern: S1: Frame reading. S2: Gaussian blur (may apply farm). S3: Sobel filter (may apply farm). S4: Frame writing. Approaches: Manual. GrPPI. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 67/80
  118. 118. An introduction to GrPPI Evaluation Parallelization effort Pipeline % LOC increase Composition C++ Threads OpenMP Intel TBB GrPPI ( p | p | p | p ) +8.8 % +13.0 % +25.9 % +1.8 % ( p | f | p | p ) +59.4 % +62.6 % +25.9 % +3.1 % ( p | p | f | p ) +60.0 % +63.9 % +25.9 % +3.1 % ( p | f | f | p ) +106.9 % +109.4 % +25.9 % +4.4 % cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 68/80
  119. 119. An introduction to GrPPI Evaluation Performance: frames per second 1 4 16 64 480p 720p 1080p 1440p 2160p Fotogramasporsegundo Pipeline (p|p|p|p) 1 4 16 64 480p 720p 1080p 1440p 2160p Pipeline (p|f|p|p) 1 4 16 64 480p 720p 1080p 1440p 2160p Fotogramasporsegundo Resolución de video Pipeline (p|p|f|p) 1 4 16 64 256 480p 720p 1080p 1440p 2160p Pipeline (p|f|f|p) C++11 GrPPI C++11 OpenMP GrPPI OpenMP TBB GrPPI TBB Resolución de video cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 69/80
  120. 120. An introduction to GrPPI Evaluation Brain MRI (Magnetic Resonance Imaging) Non intrusive method for getting internal anatomy. Huge amount of data generated. Applied to neuro-sciences. Bipolar disorder. Paranoia. Schizophrenia. Identification of fibers and connectivity between areas in brain. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 70/80
  121. 121. An introduction to GrPPI Evaluation Fibbers in brain cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 71/80
  122. 122. An introduction to GrPPI Evaluation MRI Evaluation 0 100 200 300 400 500 600 700 800 900 1000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Time(seconds) Number of threads GRPPI+SEQ GRPPI+TBB GRPPI+OMP GRPPI+THR GRPPI+FastFlow cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 72/80
  123. 123. An introduction to GrPPI Conclusions 1 Introduction 2 Simple use 3 Patterns in GrPPI 4 Evaluation 5 Conclusions cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 73/80
  124. 124. An introduction to GrPPI Conclusions Summary A unified programming model for sequential and parallel modes. Multiple back-ends available: sequential, OpenMP, TBB, native, FastFlow. Current pattern set: Data: map, reduce, map/reduce, stencil. Task: divide/conquer. Streaming: pipeline with nesting of farm, filter, reduction, iteration. Low programming effort. Very low performance overhead. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 74/80
  125. 125. An introduction to GrPPI Conclusions Further reading A Generic Parallel Pattern Interface for Stream and Data Processing. D. del Rio, M. F. Dolz, J. Fernández, J. D. García. Concurrency and Computation: Practice and Experience. 2017. Exploring stream parallel patterns in distributed MPI environments. Javier López-Gómez, Javier Fernández Muñoz, David del Rio Astorga, Manuel F. Dolz, J. Daniel Garcia. Parallel Computing. 84:24–36, 2019. Challenging the abstraction penalty in parallel patterns libraries. J. D. Garcia, D. del Rio, M. Aldinucci, F. Tordini, M. Danelutto, G. Mencagli, M. Torquati. Journal of Supercomputing, 21pp. 2019. Paving the way towards high-level parallel pattern interfaces for data stream processing. D. del Rio, M. F. Dolz, J. Fernandez, and J. D. Garcia. Future Generation Computer Systems 87:228-241. 2018 Supporting Advanced Patterns in GrPPI: a Generic Parallel Pattern Interface. D. del rio, M. F. Dolz, J. Fernandez, and J. D. Garcia, Auto-DaSP 2017 (Euro-Par 2017). cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 75/80
  126. 126. An introduction to GrPPI Conclusions Further reading Towards Automatic Parallelization of Stream Processing Applications. M. F. Dolz, D. del Rio, J. Fernandez, J. D. Garcia, and J. Carretero, IEEE Access, 6:39944-39961. 2018. Finding parallel patterns through static analysis in C++ applications. D. R. del Astorga, M. F. Dolz, L. M. Sanchez, J. D. Garcia, M. Danelutto, and M. Torquati, International Journal of High Performance Computing Applications, 2017. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 76/80
  127. 127. An introduction to GrPPI Conclusions GrPPI https://github.com/arcosuc3m/grppi cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 77/80
  128. 128. An introduction to GrPPI Conclusions Future work Additional backends: Accelerator integration, distributed computing, . . . cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 78/80
  129. 129. An introduction to GrPPI Conclusions Future work Additional backends: Accelerator integration, distributed computing, . . . Next generation C++: Concepts (reduce metaprogramming), modules. Executors: parallel versus reactive models. Ranges. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 78/80
  130. 130. An introduction to GrPPI Conclusions Future work Additional backends: Accelerator integration, distributed computing, . . . Next generation C++: Concepts (reduce metaprogramming), modules. Executors: parallel versus reactive models. Ranges. More patterns and less patterns: Simplified interface. Pattern catalog extension. Minimal building blocks identification. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 78/80
  131. 131. An introduction to GrPPI Conclusions Future work Additional backends: Accelerator integration, distributed computing, . . . Next generation C++: Concepts (reduce metaprogramming), modules. Executors: parallel versus reactive models. Ranges. More patterns and less patterns: Simplified interface. Pattern catalog extension. Minimal building blocks identification. Pattern composition: Pattern fusion. Pattern transformation rules. Multi-context approach for heterogenous computing. Better support of NUMA and affinity. cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 78/80
  132. 132. An introduction to GrPPI Conclusions The future grppi :: omp_execution omp; grppi :: cuda_execution cuda; std :: vector<int> v = get() ; auto comp = v | grppi :: map([](auto x) { return x∗x; }) | grppi :: reduce([](auto x, auto y) { return x+y; }) ; cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 79/80
  133. 133. An introduction to GrPPI Conclusions The future grppi :: omp_execution omp; grppi :: cuda_execution cuda; std :: vector<int> v = get() ; auto comp = v | grppi :: map([](auto x) { return x∗x; }) | grppi :: reduce([](auto x, auto y) { return x+y; }) ; comp >> grppi::on(omp); cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 79/80
  134. 134. An introduction to GrPPI Conclusions The future grppi :: omp_execution omp; grppi :: cuda_execution cuda; std :: vector<int> v = get() ; auto comp = v | grppi :: map([](auto x) { return x∗x; }) | grppi :: reduce([](auto x, auto y) { return x+y; }) ; comp >> grppi::on(omp); auto pipe = generator | ( grppi :: seq ([]( const frame & f) { return filter (f); }) | grppi :: farm(4, []( const frame & f) { return to_gray(f); }) >> on(cuda) ) | frame_writer >> on(omp); cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 79/80
  135. 135. An introduction to GrPPI Conclusions An introduction to GrPPI Generic Reusable Parallel Patterns Interface J. Daniel Garcia ARCOS Group University Carlos III of Madrid Spain November 2019 cbed – J. Daniel Garcia – ARCOS@UC3M (josedaniel.garcia@uc3m.es) – Twitter: @jdgarciauc3m 80/80

×