EEDC Programming Models

  • The programming model can be defined as task-based and dependency-aware. In it, the programmer is only required to select a set of methods called from a sequential Java application, for them to be run as parallel tasks on the available distributed resources. Initially, the application starts running sequentially in one node and, whenever a call to a selected method is found, an asynchronous task is created instead, letting the main program continue its execution right away. The created tasks are processed by the runtime, which discovers the dependencies between them, building a task dependency graph. A renaming technique is used to avoid some kinds of dependencies. The parallelism exhibited by the graph is exploited as much as possible, scheduling the dependency-free tasks on the available resources. The scheduling is locality-aware: nodes can cache task data for later use, and a node that already has some or all the input data for a task gets more chances to run it. The runtime also manages these data - performing data copies or transfers if necessary - and controls the completion of tasks.
  • First, the user has to provide a Java interface which declares the methods that must be executed on the Grid, that is to say, the different kinds of task. As I mentioned before, a task is a given call to one of these methods from the application code. In addition, the user can utilise Java annotations to provide: first, the class that implements the method; second, the constraints for each kind of task, that is, the capabilities that a resource must have to run the task (this is optional); third, the type and direction of the parameters for each kind of task, which is mandatory. Currently we support the file type, the string type and all the primitive types.
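The locality-aware scheduling mentioned in the notes above can be pictured with a small Java sketch. It is only an illustration under stated assumptions: the class `LocalityScheduler`, the method `pickNode` and the node and data names are hypothetical, and the deterministic "most cached inputs wins" rule is a simplification of "gets more chances to run it", not the actual runtime policy.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: prefer the node that already caches most of a task's inputs.
final class LocalityScheduler {

    // Returns the node whose cache contains the largest number of the task's inputs.
    // Ties are broken by the iteration order of nodeCaches (first node wins).
    static String pickNode(List<String> taskInputs, Map<String, Set<String>> nodeCaches) {
        String best = null;
        long bestScore = -1;
        for (Map.Entry<String, Set<String>> node : nodeCaches.entrySet()) {
            long score = taskInputs.stream().filter(node.getValue()::contains).count();
            if (score > bestScore) {
                bestScore = score;
                best = node.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> caches = new LinkedHashMap<>();
        caches.put("node1", Set.of("d1"));
        caches.put("node2", Set.of("d2", "d5"));
        // node2 already holds two of the three inputs, so it is selected.
        System.out.println(pickNode(List.of("d2", "d5", "d6"), caches));
    }
}
```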

Transcript

  • 1. EEDC 34330 Execution Environments for Distributed Computing: Scientific Programming Models. Master in Computer Architecture, Networks and Systems - CANS. Group members: Francesc Lordan francesc.lordan@bsc.es, Roger Rafanell roger.rafanell@bsc.es
  • 2. Outline: Scientific Programming Models – Part 1: Introduction – Part 2: Reference parallel programming models – Part 3: Novel parallel programming models – Part 4: Conclusions – Part 5: Questions
  • 3. Introduction Scientific applications: – Solve complex problems – Are usually long-running – Are implemented as a sequence of steps – Each step (task) can be hard to compute – So …
  • 4. Introduction In terms of time… scientific applications can no longer be treated as sequential programs!
  • 5. Introduction We need solutions based on distributing and parallelizing the work.
  • 6. Introduction: MPI 1980s - early 1990s: distributed-memory and parallel computing started as a bunch of incompatible software tools for writing programs. MPI (Message Passing Interface) became a new reference standard in 1994. It provides: – Portability – Performance – Functionality – Availability (many implementations) Good for: parallelizing the processing by distributing the work among different machines/nodes.
  • 7. Introduction: OpenMP In the early 90s: vendors of shared-memory machines supplied similar, directive-based programming extensions for Fortran: the user extends a serial Fortran program with directives specifying which loops are to be parallelized, and the compiler automatically parallelizes those loops across the SMP processors. Implementations were all functionally similar, but were diverging (as usual). Good for: parallelizing the computation among all the resources of a single machine.
  • 8. Reference PM: OpenMP Programming model: Computation is done by threads. Fork-join model: Threads are dynamically created and destroyed. Programmer can specify which variables are shared among threads and which are private.
  • 9. Reference PM: OpenMP Example of sequential PI calculation
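As a reference point for the sequential PI calculation named on this slide, here is a minimal Java sketch. It assumes the usual midpoint-rule integration of 4/(1+x²) over [0, 1], the same formula the deck's later Java example uses; the class and method names are hypothetical.

```java
// Hypothetical sketch: sequential approximation of pi by numerical integration.
final class PiSequential {

    // Midpoint rule for the integral of 4/(1+x^2) over [0,1], which equals pi.
    static double computePi(int numSteps) {
        double step = 1.0 / numSteps;
        double sum = 0.0;
        for (int i = 0; i < numSteps; i++) {
            double x = (i + 0.5) * step;   // midpoint of the i-th interval
            sum += 4.0 / (1.0 + x * x);
        }
        return sum * step;
    }

    public static void main(String[] args) {
        System.out.println(computePi(1_000_000));
    }
}
```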
  • 10. Reference PM: OpenMP Example of OpenMP PI calculation
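OpenMP itself targets C, C++ and Fortran, so the following is not OpenMP code. It is a hedged Java analogue of the fork-join idea described above: threads are forked, each keeps a private partial sum, and the shared result is combined after the join. The class name `PiForkJoin` and its parameters are hypothetical.

```java
// Hypothetical Java analogue of a fork-join parallel loop with a sum reduction.
final class PiForkJoin {

    static double computePi(int numSteps, int numThreads) {
        final double step = 1.0 / numSteps;               // shared, read-only
        final double[] partial = new double[numThreads];  // one result slot per thread
        Thread[] workers = new Thread[numThreads];

        for (int t = 0; t < numThreads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                double local = 0.0;                        // private to this thread
                for (int i = id; i < numSteps; i += numThreads) {
                    double x = (i + 0.5) * step;
                    local += 4.0 / (1.0 + x * x);
                }
                partial[id] = local;
            });
            workers[t].start();                            // fork
        }

        double sum = 0.0;
        try {
            for (int t = 0; t < numThreads; t++) {
                workers[t].join();                         // join, then reduce
                sum += partial[t];
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        return sum * step;
    }

    public static void main(String[] args) {
        System.out.println(computePi(1_000_000, 4));
    }
}
```

The sequential loop body is unchanged; only the iteration space is split among threads, which is the property the slides credit to OpenMP.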
  • 11. Reference PM: OpenMP Strong Points: – Keeps the sequential version. – Communication is implicit. – Easy to program, debug and modify. – Good performance and scalability. Weaknesses: – Communication is implicit (less control). – Simple and flat memory model (does not run on clusters). – No support for accelerators.
  • 12. Reference PM: MPI Programming model: Computation is done by several processes that execute the same program. Processes communicate by passing data (send/receive). Programmer decides: – Which role each process plays, by means of branches. – The order in which communications are done.
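The message-passing style described above can be imitated in plain Java. The sketch below does not use any MPI binding; it is an assumption-laden illustration in which threads stand in for processes, a `BlockingQueue` stands in for send/receive, and every "rank" runs the same `program` method and branches on its rank. All names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: same program on every rank, explicit send/receive of partial sums.
final class PiMessagePassing {

    // The single program executed by every rank; its role is decided by a branch on rank.
    static void program(int rank, int size, int numSteps,
                        BlockingQueue<Double> toRoot, double[] result)
            throws InterruptedException {
        double step = 1.0 / numSteps;
        double local = 0.0;
        for (int i = rank; i < numSteps; i += size) {
            double x = (i + 0.5) * step;
            local += 4.0 / (1.0 + x * x);
        }
        if (rank == 0) {
            double sum = local;
            for (int r = 1; r < size; r++) {
                sum += toRoot.take();      // "receive" one partial sum
            }
            result[0] = sum * step;
        } else {
            toRoot.put(local);             // "send" the partial sum to rank 0
        }
    }

    static double computePi(int numSteps, int size) {
        BlockingQueue<Double> toRoot = new LinkedBlockingQueue<>();
        double[] result = new double[1];
        Thread[] ranks = new Thread[size];
        for (int r = 0; r < size; r++) {
            final int rank = r;
            ranks[r] = new Thread(() -> {
                try {
                    program(rank, size, numSteps, toRoot, result);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            ranks[r].start();
        }
        try {
            for (Thread t : ranks) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for ranks", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(computePi(1_000_000, 4));
    }
}
```

Unlike the fork-join style, the data exchange is written out by the programmer, which is the explicit communication listed among MPI's weaknesses.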
  • 13. Reference PM: MPI Example of MPI PI calculation
  • 14. Reference PM: MPI Strong Points: – Any parallel algorithm can be expressed in terms of the MPI paradigm. – Data placement problems are rarely observed. – Suitable for clusters/supercomputers (large number of processors). – Excellent performance and scalability. Weaknesses: – Communication is explicit. – Re-fitting serial code using MPI often requires refactoring. – Dynamic load balancing is difficult to implement.
  • 15. Reference PM: The best of both worlds Hybrid (MPI + OpenMP): – MPI is most effective for problems with “coarse-grained” parallelism. – “Fine-grained” parallelization is successfully handled by OpenMP. When to use hybrid programming? – The code exhibits limited scaling with MPI. – The code could make use of dynamic load balancing. – The code exhibits fine-grained parallelism, or a combination of fine-grained and coarse-grained parallelism. Some algorithms, such as those used in computational fluid dynamics, benefit greatly from a hybrid approach!
  • 16. Reference PM: Hybrid (MPI + OpenMP) Example of MPI + OpenMP PI calculation
  • 17. Reference PM: New reference approaches Heterogeneous parallel computing: – CUDA (from NVIDIA) – OpenCL (Open Computing Language) – Cross-platform • Implementations for – ATI GPUs – NVIDIA GPUs – x86 CPUs – API similar to OpenGL. – Based on C.
  • 18. Novel PMs Workflows: – Based on processes – Require planning and scheduling – Need flow control – In-transit visibility Novel PMs: – Complex problems require simple solutions (not based on the reference PMs)
  • 19. Microsoft Dryad The Dryad Project is investigating programming models for writing parallel and distributed programs that scale from a small cluster to a large data center. Theoretical approach (not used) – Its only publication dates from 2007. User defines: – a set of methods – a task dependency graph, written in a specific language.
  • 20. Microsoft Dryad
      GraphBuilder XSet = moduleX^N;
      GraphBuilder DSet = moduleD^N;
      GraphBuilder MSet = moduleM^(N*4);
      GraphBuilder SSet = moduleS^(N*4);
      GraphBuilder YSet = moduleY^N;
      GraphBuilder HSet = moduleH^1;
      GraphBuilder XInputs = (ugriz1 >= XSet) || (neighbor >= XSet);
      GraphBuilder YInputs = ugriz2 >= YSet;
      GraphBuilder XToY = XSet >= DSet >> MSet >= SSet;
      for (i = 0; i < N*4; ++i) {
          XToY = XToY || (SSet.GetVertex(i) >= YSet.GetVertex(i/4));
      }
      GraphBuilder YToH = YSet >= HSet;
      GraphBuilder HOutputs = HSet >= output;
      GraphBuilder final = XInputs || YInputs || XToY || YToH || HOutputs;
  • 21. MapReduce Programmer only defines 2 functions – Map(KInput, VInput) → list(Ktemp, Vtemp) – Reduce(Ktemp, list(Vtemp)) → list(Vtemp) The library is in charge of all the rest
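To make the two signatures above concrete, here is a small in-memory word count in Java. It is a sketch only: `MiniMapReduce` and its `run` method are hypothetical stand-ins for "the library", which in a real framework would also handle distribution, shuffling and fault tolerance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of the MapReduce model (word count).
final class MiniMapReduce {

    // Map(KInput, VInput) -> list(Ktemp, Vtemp): emit (word, 1) for every word.
    static List<Map.Entry<String, Integer>> map(String docName, String text) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Reduce(Ktemp, list(Vtemp)) -> list(Vtemp): sum the counts of one word.
    static List<Integer> reduce(String word, List<Integer> counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return List.of(total);
    }

    // Stand-in for the library: apply map, group values by key, apply reduce per key.
    static Map<String, List<Integer>> run(Map<String, String> inputs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> in : inputs.entrySet()) {
            for (Map.Entry<String, Integer> kv : map(in.getKey(), in.getValue())) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, List<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> g : grouped.entrySet()) {
            result.put(g.getKey(), reduce(g.getKey(), g.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {be=[2], do=[1], not=[1], or=[1], to=[3]}
        System.out.println(run(Map.of("a.txt", "to be or not to be", "b.txt", "to do")));
    }
}
```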
  • 22. MapReduce Weaknesses – Specific programming. – Not easy to find key-value pairs. Strong points – Efficiency. – Simplicity of the model. – Community and tools.
  • 23. The COMP Superscalar (COMPSs) 23
  • 24. COMPSs overview - Objective Reduce the development complexity of Grid/Cluster/Cloud applications to the minimum – As easy as writing a sequential application. Target applications: composed of tasks, most of them repetitive – Task granularity at the level of simulations or programs. – Data: files, objects, arrays, primitive types.
  • 25. COMPSs overview - Main idea [Figure: the sequential code for (i=0; i<N; i++) { T1(data1, data2); T2(data4, data5); T3(data2, data5, data6); T4(data7, data8); T5(data6, data8, data9); } is run on parallel resources (Resource 1 … Resource N) in four steps: (a) task selection + parameters direction (input, output, inout); (b) task graph creation based on data dependencies; (c) scheduling, data transfer, task execution; (d) task completion, synchronization.]
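Step (b), task graph creation based on data dependencies, can be sketched in a few lines of Java. This is not the COMPSs runtime: `TaskGraphSketch` and `addTask` are hypothetical names, only read-after-write dependencies are tracked (the renaming of data that removes other kinds of dependency is not modelled), and the parameter directions used in `main` are an assumption, since the slide does not state them.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: derive task dependencies from which data each task reads and writes.
final class TaskGraphSketch {

    // For every datum, the task that most recently wrote it.
    private final Map<String, String> lastWriter = new HashMap<>();

    // Registers a task and returns the tasks it must wait for (read-after-write only).
    // An INOUT parameter would appear in both reads and writes.
    Set<String> addTask(String task, List<String> reads, List<String> writes) {
        Set<String> dependsOn = new TreeSet<>();
        for (String datum : reads) {
            String writer = lastWriter.get(datum);
            if (writer != null) {
                dependsOn.add(writer);
            }
        }
        for (String datum : writes) {
            lastWriter.put(datum, task);
        }
        return dependsOn;
    }

    public static void main(String[] args) {
        // Assumption: in each call the last argument is written and the others are read.
        TaskGraphSketch graph = new TaskGraphSketch();
        System.out.println("T1 <- " + graph.addTask("T1", List.of("data1"), List.of("data2")));
        System.out.println("T2 <- " + graph.addTask("T2", List.of("data4"), List.of("data5")));
        System.out.println("T3 <- " + graph.addTask("T3", List.of("data2", "data5"), List.of("data6")));
        System.out.println("T4 <- " + graph.addTask("T4", List.of("data7"), List.of("data8")));
        System.out.println("T5 <- " + graph.addTask("T5", List.of("data6", "data8"), List.of("data9")));
    }
}
```

Under that assumption T1, T2 and T4 have no predecessors and can run in parallel, T3 waits for T1 and T2, and T5 waits for T3 and T4.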
  • 26. Programming model - Sample application
      Main program:
      public static void main(String[] args) {
          Double sum = 0.0;
          double pi;
          double step = 1.0d / (double) num_steps;
          for (int i = 0; i < num_steps; i++) {
              computeInterval(i, step, sum);
          }
          pi = sum * step;
      }
      Subroutine:
      public static void computeInterval(int index, double step, Double acum) {
          // Midpoint of the interval; acum is the INOUT accumulator of the task.
          double x = (index + 0.5) * step;
          acum = acum + 4.0 / (1.0 + x * x);
      }
  • 27. Programming Model - Task Selection
      Task selection interface:
      public interface PiItf {
          @Method(declaringClass = "Pi")                  // implementation
          void computeInterval(
              @Parameter(direction = IN) int index,       // parameter metadata
              @Parameter(direction = IN) double step,
              @Parameter(direction = INOUT) Double acum
          );
      }
  • 28. Programming Model – Main code (NO CHANGES!)
      public static void main(String[] args) {
          Double sum = 0.0;
          double pi;
          double step = 1.0d / (double) num_steps;
          for (int i = 0; i < num_steps; i++) {
              computeInterval(i, step, sum);
          }
          pi = sum * step;
      }
      [Figure: task graph of Compute Interval tasks (steps up to N-1) linked through sum, ending in a SYNCH on sum.]
  • 29. Programming Model – Real Example: HMMER. Inputs: a protein database and an aminoacid sequence (IQKKSGKWHTLTDLRA VNAVIQPMGPLQPGLP SPAMIPKDWPLIIIDLK DCFFTIPLAEQDCEKFA FTIPAINNKEPATRF). Output:
      Model     Score    E-value  N
      --------  ------   -------  ---
      IL6_2      -78.5   0.13     1
      COLFI_2   -164.5   0.35     1
      pgtp_13    -36.3   0.48     1
      clf2       -15.6   3.6      1
      PKD_9      -24.0   5        1
  • 30. Programming Model – Real Example [Figure: aminoacid sequence]
  • 31. Programming Model – Real Example
      String[] outputs = new String[numDBFrags];
      // Process: one hmmpfam task per database fragment
      int dbNum = 0;
      for (String dbFrag : dbFrags) {
          outputs[dbNum++] = HMMPfamImpl.hmmpfam(sequence, dbFrag);
      }
      // Merge: pairwise merges, doubling the distance between partners each round
      int neighbor = 1;
      while (neighbor < numDBFrags) {
          for (int db = 0; db < numDBFrags; db += 2 * neighbor) {
              if (db + neighbor < numDBFrags) {
                  HMMPfamImpl.merge(outputs[db], outputs[db + neighbor]);
              }
          }
          neighbor *= 2;
      }
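The merge loop above performs a pairwise, tree-shaped reduction. The following Java sketch reproduces only its index pattern so the order of merges can be inspected; `MergePlan` and `plan` are hypothetical names, and no HMMER code is involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: list the (target <- source) pairs produced by the merge loop.
final class MergePlan {

    // Returns the merges in execution order, each written as "target<-source".
    static List<String> plan(int numDBFrags) {
        List<String> merges = new ArrayList<>();
        int neighbor = 1;
        while (neighbor < numDBFrags) {
            for (int db = 0; db < numDBFrags; db += 2 * neighbor) {
                if (db + neighbor < numDBFrags) {
                    merges.add(db + "<-" + (db + neighbor));
                }
            }
            neighbor *= 2;
        }
        return merges;
    }

    public static void main(String[] args) {
        // prints [0<-1, 2<-3, 0<-2]
        System.out.println(plan(4));
    }
}
```

Merges within one round touch disjoint outputs, so a dependency-aware runtime can run them in parallel; the final result accumulates in position 0.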
  • 32. Programming Model – Real Example
      public interface HMMPfamItf {
          @Method(declaringClass = "worker.hmmerobj.HMMPfamImpl")
          String hmmpfam(
              @Parameter(type = Type.FILE, direction = Direction.IN) String seqFile,
              @Parameter(type = Type.STRING, direction = Direction.IN) String dbFile
          );
          @Method(declaringClass = "worker.hmmerobj.HMMPfamImpl")
          void scoreRatingSameDB(
              @Parameter(type = Type.OBJECT, direction = Direction.INOUT) String resultFile1,
              @Parameter(type = Type.OBJECT, direction = Direction.IN) String resultFile2
          );
      }
  • 33. Programming Model – Real Example 33
  • 34. Programming Model – Real Example 34
  • 35. COMPSs Strong points – Sequential programming approach – Parallelization at task level – Transparent data management and remote execution – Can operate on different infrastructures: • Cluster/Grid • Cloud (Public/Private) – PaaS – IaaS • Web services Weaknesses: – Under continuous development – Does not offer binding to other languages (currently) 35
  • 36. Tutorial Sample & Development Virtual Appliance – http://bscgrid06.bsc.es/~lezzi/vms/COMPSs_Tutorial.ova Tutorial – http://bscgrid06.bsc.es/~lezzi/ppts/tutorial-Life.ppt 36
  • 37. Manjrasoft Aneka .NET-based Platform-as-a-Service Allows the usage of: – Private Clouds. – Public Clouds: Amazon EC2, Azure, GoGrid. Offers mechanisms to control, reserve and monitor the resources. – Also offers autoscaling mechanisms. 3 programming models – Task-based: tasks are put in a bag of executable tasks. – Thread-based: exposes the .NET thread API, but threads are created remotely. – MapReduce No data dependency analysis!
  • 38. Microsoft Azure .NET-based Platform-as-a-Service Computing services – Web Role: Web Service frontend. – Worker Role: Backend. Storage Services Strong Point – Scalable architecture. Weakness – Platform-tied applications.
  • 39. Conclusions Scientific problems are usually complex. Current reference PMs are usually unsuitable. Novel, flexible PMs have come into play. A gap remains between scientists and user-friendly, workflow-oriented programming models. A sea of available solutions (DSLs)
  • 40. Questions 40