EEDC Programming Models

1. EEDC (34330) – Execution Environments for Distributed Computing
Scientific Programming Models
Master in Computer Architecture, Networks and Systems - CANS
Group members:
– Francesc Lordan francesc.lordan@bsc.es
– Roger Rafanell roger.rafanell@bsc.es
2. Outline
Scientific Programming Models
– Part 1: Introduction
– Part 2: Reference parallel programming models
– Part 3: Novel parallel programming models
– Part 4: Conclusions
– Part 5: Questions
3. Introduction
Scientific applications:
– Solve complex problems
– Are usually long-running applications
– Are implemented as a sequence of steps
– Each step (task) can be hard to compute
– So …
4. Introduction
In time terms, scientific applications can no longer be approached in a sequential way!
5. Introduction
We need solutions based on distributing and parallelizing the work.
6. Introduction: MPI
1980s - early 1990s: distributed-memory parallel computing started as a bunch of incompatible software tools for writing programs.
In 1994, MPI (Message Passing Interface) became the new reference standard. It provides:
– Portability
– Performance
– Functionality
– Availability (many implementations)
Good for: parallelizing the processing by distributing the work among different machines/nodes.
7. Introduction: OpenMP
In the early 90s, vendors of shared-memory machines supplied similar, directive-based programming extensions for Fortran:
– The user could extend a serial Fortran program with directives specifying which loops were to be parallelized.
– The compiler automatically parallelized such loops across the SMP processors.
– Implementations were all functionally similar, but were diverging (as usual).
Good for: parallelizing the computation among all the resources of a single machine.
8. Reference PM: OpenMP
Programming model:
– Computation is done by threads.
– Fork-join model: threads are dynamically created and destroyed.
– The programmer can specify which variables are shared among threads and which are private.
9. Reference PM: OpenMP
Example of sequential PI calculation
10. Reference PM: OpenMP
Example of OpenMP PI calculation
11. Reference PM: OpenMP
Strong points:
– Keeps the sequential version.
– Communication is implicit.
– Easy to program, debug and modify.
– Good performance and scalability.
Weaknesses:
– Communication is implicit (less control).
– Simple and flat memory model (does not run on clusters).
– No support for accelerators.
12. Reference PM: MPI
Programming model:
– Computation is done by several processes that execute the same program.
– Processes communicate by passing data (send/receive).
– The programmer decides which role each process plays by branching, and orders the communications.
13. Reference PM: MPI
Example of MPI PI calculation
14. Reference PM: MPI
Strong points:
– Any parallel algorithm can be expressed in terms of the MPI paradigm.
– Data placement problems are rarely observed.
– Suitable for clusters/supercomputers (large numbers of processors).
– Excellent performance and scalability.
Weaknesses:
– Communication is explicit.
– Retrofitting serial code with MPI often requires refactoring.
– Dynamic load balancing is difficult to implement.
15. Reference PM: The best of both worlds
Hybrid (MPI + OpenMP):
– MPI is most effective for problems with coarse-grained parallelism.
– Fine-grained parallelism is successfully handled by OpenMP.
When to use hybrid programming?
– The code exhibits limited scaling with MPI.
– The code could make use of dynamic load balancing.
– The code exhibits fine-grained parallelism, or a combination of fine-grained and coarse-grained parallelism.
Some algorithms, such as computational fluid dynamics, benefit greatly from a hybrid approach!
16. Reference PM: Hybrid (MPI + OpenMP)
Example of MPI + OpenMP PI calculation
17. Reference PM: New reference approaches
Heterogeneous parallel computing:
– CUDA (from NVIDIA)
– OpenCL (Open Computing Language):
  • Cross-platform: implementations for ATI GPUs, NVIDIA GPUs and x86 CPUs.
  • API similar to OpenGL.
  • Based on C.
18. Novel PMs
Workflows:
– Based on processes
– Require planning and scheduling
– Need flow control
– In-transit visibility
Novel PMs:
– Complex problems require simple solutions (not based on the reference PMs)
19. Microsoft Dryad
The Dryad Project investigated a programming model for writing parallel and distributed programs that scale from a small cluster to a large data center.
– A theoretical approach (not used in practice): its last and only publication dates from 2007.
The user defines:
– a set of methods
– a task dependency graph, written in a specific language.
20. Microsoft Dryad
GraphBuilder XSet = moduleX^N;
GraphBuilder DSet = moduleD^N;
GraphBuilder MSet = moduleM^(N*4);
GraphBuilder SSet = moduleS^(N*4);
GraphBuilder YSet = moduleY^N;
GraphBuilder HSet = moduleH^1;
GraphBuilder XInputs = (ugriz1 >= XSet) || (neighbor >= XSet);
GraphBuilder YInputs = ugriz2 >= YSet;
GraphBuilder XToY = XSet >= DSet >> MSet >= SSet;
for (i = 0; i < N*4; ++i)
{
    XToY = XToY || (SSet.GetVertex(i) >= YSet.GetVertex(i/4));
}
GraphBuilder YToH = YSet >= HSet;
GraphBuilder HOutputs = HSet >= output;
GraphBuilder final = XInputs || YInputs || XToY || YToH || HOutputs;
21. MapReduce
The programmer only defines 2 functions:
– Map(KInput, VInput) -> list(KTemp, VTemp)
– Reduce(KTemp, list(VTemp)) -> list(VTemp)
The library is in charge of all the rest.
22. MapReduce
Weaknesses:
– Specific programming model.
– Not always easy to express a problem as key-value pairs.
Strong points:
– Efficiency.
– Simplicity of the model.
– Community and tools.
23. The COMP Superscalar (COMPSs)
24. COMPSs overview - Objective
Reduce the development complexity of Grid/Cluster/Cloud applications to the minimum:
– As easy as writing a sequential application.
Target applications: composed of tasks, most of them repetitive.
– Granularity of the tasks at the level of simulations or programs.
– Data: files, objects, arrays, primitive types.
25. COMPSs overview - Main idea
Sequential code such as:
for (i = 0; i < N; i++) {
    T1 (data1, data2);
    T2 (data4, data5);
    T3 (data2, data5, data6);
    T4 (data7, data8);
    T5 (data6, data8, data9);
}
is run on parallel resources by the runtime:
(a) Task selection + parameter direction (input, output, inout)
(b) Task graph creation based on data dependencies
(c) Scheduling, data transfer, task execution on resources 1..N
(d) Task completion, synchronization
26. Programming model - Sample application
Main program:
public void main() {
    Double sum = 0.0;
    double pi;
    double step = 1.0d / (double) num_steps;
    for (int i = 0; i < num_steps; i++) {
        computeInterval(i, step, sum);
    }
    pi = sum * step;
}
Subroutine:
// sum is an object (Double, not a primitive) so the runtime can track it as INOUT
public static void computeInterval(int index, double step, Double acum) {
    double x = (index + 0.5) * step;
    acum = acum + 4.0 / (1.0 + x * x);
}
27. Programming Model - Task Selection
Task selection interface:
public interface PiItf {
    @Method(declaringClass = "Pi")
    void computeInterval(
        @Parameter(direction = IN) int index,
        @Parameter(direction = IN) double step,
        @Parameter(direction = INOUT) Double sum
    );
}
The @Parameter annotations carry the metadata (type and direction) of each parameter of the task implementation.
28. Programming Model – Main code
// NO CHANGES with respect to the sequential version!
public static void main(String[] args) {
    Double sum = 0.0;
    double pi;
    double step = 1.0d / (double) num_steps;
    for (int i = 0; i < num_steps; i++) {
        computeInterval(i, step, sum);
    }
    pi = sum * step;
}
(Figure: the runtime builds a graph of computeInterval tasks 0..num_steps-1, each updating sum, followed by a synchronization on sum.)
29. Programming Model – Real Example: HMMER
(Figure: hmmpfam takes a protein database and an aminoacid sequence, and produces a list of models with their scores and E-values.)
30. Programming Model – Real Example
(Figure: the aminoacid sequence.)
31. Programming Model – Real Example
String[] outputs = new String[numDBFrags];
// Process: run hmmpfam on every database fragment
for (int dbNum = 0; dbNum < numDBFrags; dbNum++) {
    outputs[dbNum] = HMMPfamImpl.hmmpfam(sequence, dbFrags[dbNum]);
}
// Merge: pairwise reduction of the partial outputs
int neighbor = 1;
while (neighbor < numDBFrags) {
    for (int db = 0; db < numDBFrags; db += 2 * neighbor) {
        if (db + neighbor < numDBFrags) {
            HMMPfamImpl.merge(outputs[db], outputs[db + neighbor]);
        }
    }
    neighbor *= 2;
}
32. Programming Model – Real Example
public interface HMMPfamItf {
    @Method(declaringClass = "worker.hmmerobj.HMMPfamImpl")
    String hmmpfam(
        @Parameter(type = Type.FILE, direction = Direction.IN) String seqFile,
        @Parameter(type = Type.STRING, direction = Direction.IN) String dbFile
    );
    @Method(declaringClass = "worker.hmmerobj.HMMPfamImpl")
    void merge(
        @Parameter(type = Type.OBJECT, direction = Direction.INOUT) String resultFile1,
        @Parameter(type = Type.OBJECT, direction = Direction.IN) String resultFile2
    );
}
33. Programming Model – Real Example
34. Programming Model – Real Example
35. COMPSs
Strong points:
– Sequential programming approach
– Parallelization at task level
– Transparent data management and remote execution
– Can operate on different infrastructures:
  • Cluster/Grid
  • Cloud (public/private): PaaS, IaaS
  • Web services
Weaknesses:
– Under continuous development
– Does not offer bindings to other languages (currently)
36. Tutorial
Sample & Development Virtual Appliance:
– http://bscgrid06.bsc.es/~lezzi/vms/COMPSs_Tutorial.ova
Tutorial:
– http://bscgrid06.bsc.es/~lezzi/ppts/tutorial-Life.ppt
37. Manjrasoft Aneka
.NET-based Platform-as-a-Service. Allows the usage of:
– Private clouds.
– Public clouds: Amazon EC2, Azure, GoGrid.
Offers mechanisms to control, reserve and monitor the resources:
– Also offers autoscaling mechanisms.
3 programming models:
– Task-based: tasks are put in a bag of executable tasks.
– Thread-based: exposes the .NET thread API, but threads are created remotely.
– MapReduce.
No data dependency analysis!
38. Microsoft Azure
.NET-based Platform-as-a-Service.
Computing services:
– Web Role: Web Service frontend.
– Worker Role: backend.
Storage services.
Strong point:
– Scalable architecture.
Weakness:
– Platform-tied applications.
39. Conclusions
Scientific problems are usually complex.
Current reference PMs are often unsuitable.
New, flexible PMs have come into the game.
There is a gap between scientists and user-friendly, workflow-oriented programming models.
A sea of available solutions (DSLs).
40. Questions
