Architecture and Performance of Runtime Environments for Data Intensive Scalable Computing
SC09 Doctoral Symposium, Portland, 11/18/2009
Student: Jaliya Ekanayake
Advisor: Prof. Geoffrey Fox
Community Grids Laboratory, Digital Science Center
Pervasive Technology Institute
Indiana University

Cloud Runtimes for Data/Compute Intensive Applications
Cloud Runtimes
  • MapReduce, Dryad/DryadLINQ, Sector/Sphere
  • Moving computation to data
  • Simple communication topologies: MapReduce, Directed Acyclic Graphs (DAGs)
  • Distributed file systems
  • Fault tolerance
Data/Compute Intensive Applications
  • Represented as filter pipelines
  • Parallelizable filters

Applications using Hadoop and DryadLINQ (1)
  • CAP3 [1]: Expressed Sequence Tag assembly to reconstruct full-length mRNA
  • [Figure labels: Input files (FASTA), three CAP3 instances, DryadLINQ, Output files]
  • “Map only” operation in Hadoop
  • Single “Select” operation in DryadLINQ
[1] X. Huang, A. Madan, “CAP3: A DNA Sequence Assembly Program,” Genome Research, vol. 9, no. 9, pp. 868-877, 1999.
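The "map only" pattern named on this slide can be sketched as below. This is a minimal illustration under stated assumptions, not the Hadoop or DryadLINQ code from the talk: `assemble` is a hypothetical stand-in for invoking the CAP3 executable on one FASTA input, `map_only` is a hypothetical helper name, and a local thread pool stands in for the cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def assemble(fasta_text: str) -> str:
    """Hypothetical stand-in for running CAP3 on one FASTA input.

    It only counts the sequence records, so the sketch stays runnable
    without the CAP3 binary.
    """
    records = [line for line in fasta_text.splitlines() if line.startswith(">")]
    return f"{len(records)} sequences processed"


def map_only(inputs: dict) -> dict:
    """Apply `assemble` to every input independently; there is no reduce step."""
    names = list(inputs)
    with ThreadPoolExecutor() as pool:
        # Each input is an independent task, so the order of execution is free.
        outputs = pool.map(assemble, (inputs[name] for name in names))
        return dict(zip(names, outputs))


if __name__ == "__main__":
    files = {"a.fsa": ">s1\nACGT\n>s2\nGGCC\n", "b.fsa": ">s1\nTTTT\n"}
    print(map_only(files))
```

Because no task depends on another, the same structure maps onto a single "Select" over the input collection.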
Applications using Hadoop and DryadLINQ (2)
  • PhyloD [1] project from Microsoft Research
  • Derive associations between HLA alleles and HIV codons and between codons themselves
  • DryadLINQ implementation
[1] Microsoft Computational Biology Web Tools, http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/

Applications using Hadoop and DryadLINQ (3)
Calculate Pairwise Distances (Smith Waterman Gotoh)
  • Calculate pairwise distances for a collection of genes (used for clustering, MDS)
  • Fine grained tasks in MPI
  • Coarse grained tasks in DryadLINQ
  • Performed on 768 cores (Tempest Cluster)
  • 125 million distances in 4 hours and 46 minutes
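One way to picture the coarse grained tasks is to split the upper triangle of the distance matrix into blocks and treat each block as one task. The sketch below assumes that decomposition; the slides do not state how the work was partitioned. The `distance` function is a deliberately simple stand-in (mismatch count plus length difference), not Smith Waterman Gotoh, and all function names are hypothetical.

```python
from itertools import combinations_with_replacement


def distance(a: str, b: str) -> int:
    """Stand-in distance: mismatches plus length difference (not Smith Waterman Gotoh)."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def block_tasks(n: int, block: int):
    """Yield (rows, cols) index ranges covering the upper triangle of an n x n matrix."""
    starts = range(0, n, block)
    for r, c in combinations_with_replacement(starts, 2):
        yield range(r, min(r + block, n)), range(c, min(c + block, n))


def compute_block(seqs, rows, cols):
    """One coarse grained task: every pair (i, j) with i < j inside one block."""
    return {(i, j): distance(seqs[i], seqs[j]) for i in rows for j in cols if i < j}


def pairwise(seqs, block=2):
    """Run all block tasks sequentially and merge their results."""
    result = {}
    for rows, cols in block_tasks(len(seqs), block):
        result.update(compute_block(seqs, rows, cols))
    return result
```

For n sequences the blocks together cover n(n-1)/2 pairs; each block could be handed to a separate worker since the blocks share no output entries.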
Applications using Hadoop and DryadLINQ (4)
  • High Energy Physics (HEP)
  • K-Means Clustering
  • Matrix Multiplication
  • Multi-Dimensional Scaling (MDS)

MapReduce for Iterative Computations
Classic MapReduce Runtimes
  • Google, Apache Hadoop, Sector/Sphere, DryadLINQ (DAG based)
  • Focus on single-step MapReduce computations only
  • Intermediate data is stored and accessed via file systems
  • Better fault tolerance support
  • Higher latencies
Iterative MapReduce computations use new maps/reduces in each iteration
  • Fixed data is loaded again and again
  • Inefficient for many iterative computations to which the MapReduce technique could be applied
Solution: i-MapReduce
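The reload cost described on this slide can be illustrated with a toy count of how often the fixed data is read. Everything here is hypothetical and only for illustration: `StaticData` is an in-memory stand-in for files on a distributed file system, and `step` is an arbitrary iterative update rather than any algorithm from the talk.

```python
class StaticData:
    """Toy data source that counts how often the fixed input is loaded."""

    def __init__(self, points):
        self._points = points
        self.loads = 0

    def load(self):
        self.loads += 1
        return list(self._points)


def step(points, center):
    """One iteration of a toy computation: move the center halfway to the mean."""
    mean = sum(points) / len(points)
    return (center + mean) / 2


def run_reloading(source, center, iterations):
    """Classic style: new tasks in each iteration, so the fixed data is read again."""
    for _ in range(iterations):
        center = step(source.load(), center)
    return center


def run_cached(source, center, iterations):
    """Iterative style: long running tasks keep the fixed data in memory."""
    points = source.load()
    for _ in range(iterations):
        center = step(points, center)
    return center
```

Both functions compute the same result; they differ only in the number of loads, which grows with the iteration count in the first and stays at one in the second.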
Applications & Different Interconnection Patterns
[Figure: interconnection patterns, with labels Input, map, reduce, Output, iterations, Pij, and MPI; caption: Domain of MapReduce and Iterative Extensions]

i-MapReduce
  • In-memory MapReduce
  • Distinction between static data and variable data (data flow vs. δ flow)
  • Cacheable map/reduce tasks (long running tasks)
  • Combine operation
  • Support for fast intermediate data transfers
[Figure labels: User Program, Static data, δ flow, Iterate, Configure(), Map(Key, Value), Reduce(Key, List<Value>), Combine(Key, List<Value>), Close()]
Different synchronization and intercommunication mechanisms used by the parallel runtimes

i-MapReduce Programming Model
[Figure labels, user program's process space: configureMaps(), configureReduce(), runMapReduce(), while(condition){ ... updateCondition() } //end while, close()]
[Figure labels, worker nodes: Map(), Reduce(), Combine() operation, Iterations, Local Disk]
  • Cacheable map/reduce tasks
  • Can send <Key,Value> pairs directly
  • Communications/data transfers via the pub-sub broker network
  • Two configuration options: using local disks (only for maps), or using the pub-sub bus
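The programming model on this slide can be sketched with a one-dimensional K-means loop. This is an illustration under stated assumptions, not the i-MapReduce API: the class and function names are hypothetical, the ordering of calls (configure once, then iterate) is one reading of the slide, and everything runs in a single process instead of on worker nodes.

```python
from collections import defaultdict


class KMeansMapTask:
    """Long running map task: static points are configured once and cached."""

    def configure(self, points):
        self.points = points  # static data, kept across iterations

    def map(self, centers):
        # Variable data (the current centers) arrives in each iteration.
        for p in self.points:
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            yield nearest, (p, 1)


def reduce_fn(key, values):
    """Average the points assigned to one center."""
    total = sum(v[0] for v in values)
    count = sum(v[1] for v in values)
    return key, total / count


def combine(reduce_outputs, old_centers):
    """Collect all reduce outputs into the next list of centers."""
    centers = list(old_centers)
    for key, mean in reduce_outputs:
        centers[key] = mean
    return centers


def run_map_reduce(partitions, centers, tolerance=1e-9, max_iterations=100):
    tasks = []
    for part in partitions:  # assumed analogue of configureMaps()
        task = KMeansMapTask()
        task.configure(part)
        tasks.append(task)
    for _ in range(max_iterations):  # assumed analogue of while(condition)
        groups = defaultdict(list)
        for task in tasks:
            for key, value in task.map(centers):
                groups[key].append(value)
        new_centers = combine((reduce_fn(k, v) for k, v in groups.items()), centers)
        converged = max(abs(a - b) for a, b in zip(new_centers, centers)) < tolerance
        centers = new_centers  # assumed analogue of updateCondition()
        if converged:
            break
    return centers
```

The point of the design is that only the small list of centers moves between iterations, while each task keeps its partition of the points from the first configuration onward.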
i-MapReduce Architecture
[Figure labels: Pub/Sub Broker Network, User Program, MR Driver, Worker Nodes, MR Daemon (D), Map Worker (M), Reduce Worker (R), File System, Data Read/Write, Communication, Data Split]
  • Streaming based communication
  • Eliminates file based communication
  • Cacheable map/reduce tasks
  • Static data remains in memory
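Streaming based communication between map and reduce workers can be pictured with a toy in-process publish/subscribe broker. The sketch below is an assumption-laden illustration: `Broker`, `ReduceWorker`, `map_worker`, and the topic naming scheme are all hypothetical, and none of it reproduces the API of a real pub/sub system.

```python
from collections import defaultdict


class Broker:
    """Minimal in-process publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)


class ReduceWorker:
    """Receives <key, value> pairs as messages instead of reading files."""

    def __init__(self, broker, topic):
        self.received = defaultdict(list)
        broker.subscribe(topic, self.on_message)

    def on_message(self, message):
        key, value = message
        self.received[key].append(value)

    def reduce(self):
        return {key: sum(values) for key, values in self.received.items()}


def map_worker(broker, words, num_reducers):
    """Publish each map output directly to the topic of its reducer."""
    for word in words:
        # Deterministic partitioning so the same key always reaches the same reducer.
        topic = f"reduce-{sum(word.encode()) % num_reducers}"
        broker.publish(topic, (word, 1))
```

A usage sketch: create one `Broker`, subscribe two `ReduceWorker` instances to `reduce-0` and `reduce-1`, call `map_worker` once per input split, then merge the dictionaries returned by each `reduce()`. No intermediate file is written at any point.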

Editor's Notes

  • #12 Currently uses NaradaBrokering, but it is easily extensible to use any other pub/sub message infrastructure such as Apache ActiveMQ.