HPC with Clouds and Cloud Technologies

    Presentation Transcript

    • Jaliya Ekanayake and Geoffrey Fox, School of Informatics and Computing, Indiana University Bloomington. Cloud Computing and Software Services: Theory and Techniques, July 2010. Presented by: Inderjeet Singh
    •  Outline: Introduction, Problem, Data Analysis Applications, Evaluations and Analysis, Performance of MPI on Clouds, Benchmarks and Results, Conclusions and Future Work, Critique
    •  Cloud technologies / parallel runtimes / cloud runtimes: Apache Hadoop (open-source version of Google MapReduce), DryadLINQ (Microsoft API for Dryad), CGL-MapReduce (iterative version of MapReduce)
    •  On-demand provisioning of resources; customizable virtual machines (VMs) with root privileges; very fast provisioning (within minutes); you pay only for what you use; better resource utilization
    • Cloud technologies: moving computation to data, better Quality of Service (QoS), simple communication topologies, distributed file systems (HDFS, GFS). Most HPC applications are based on MPI: many fine-grained communication topologies and use of fast networks
    • A software framework to support distributed computing on large datasets across clusters of computers. Map step: the master node takes the input, partitions it into smaller sub-problems, and distributes them to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure; the worker node processes the smaller problem and passes the answer back to its master node. Reduce step: the master node collects the answers to all the sub-problems and combines them in some way to form the output or answer
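The map/reduce steps above can be sketched in a few lines. This is a minimal single-process illustration of the programming model (word count as the stand-in problem), not Hadoop itself: the chunk list and the sequential loop stand in for the master's partitioning and the worker nodes.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map step: each worker turns its input chunk into (key, value) pairs."""
    return [(word, 1) for word in chunk.split()]

def reduce_phase(pairs):
    """Reduce step: group the intermediate pairs by key and combine values."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

# The master partitions the input and hands one chunk to each worker;
# here the workers are simulated by a sequential loop.
chunks = ["hpc with clouds", "clouds and cloud technologies", "hpc clouds"]
intermediate = []
for chunk in chunks:
    intermediate.extend(map_phase(chunk))
result = reduce_phase(intermediate)
print(result["clouds"])  # -> 3
```

In a real runtime the shuffle between the two phases moves pairs with the same key to the same reducer; here the single `reduce_phase` call plays that role.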
    • Large data- and compute-intensive applications. Traditional approach: execution on clusters/grids/supercomputers, moving both application and data to the available computational power; efficiency decreases with large datasets. Better approach: execution with cloud technologies, moving computation to the data to perform processing; a more data-centric approach
    • Comparison of features supported by different cloud technologies and MPI
    •  What applications are best handled by cloud technologies? What overheads do they introduce? Can traditional parallel runtimes such as MPI be used in the cloud? If so, what overheads do they have?
    • Types of applications (based on communication pattern): map-only (Cap3), MapReduce (HEP), iterative/complex style (matrix multiplication and K-means clustering)
    •  Cap3: a sequence assembly program that operates on a collection of gene sequence files to produce several outputs. HEP: a High Energy Physics data analysis application. K-means clustering: iteratively refines the computation of clusters. Matrix multiplication: Cannon's algorithm
    •  MapReduce does not support iterative/complex style applications, so [Fox] built CGL-MapReduce. CGL-MapReduce supports long-running tasks and retains static data in memory across invocations
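The iterative pattern CGL-MapReduce targets can be illustrated with K-means: the large static dataset (the points) is loaded once and reused across iterations, and only the small, changing state (the centroids) moves between them. This is a minimal 1-D, single-process sketch of that structure, not the CGL-MapReduce API; the point values and starting centroids are made up for illustration.

```python
import random

def kmeans_iteration(points, centroids):
    """One MapReduce-style iteration: 'map' assigns each point to its
    nearest centroid, 'reduce' averages each cluster into a new centroid."""
    clusters = {i: [] for i in range(len(centroids))}
    for p in points:  # map: nearest-centroid assignment
        nearest = min(range(len(centroids)),
                      key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    return [sum(c) / len(c) if c else centroids[i]  # reduce: recompute means
            for i, c in sorted(clusters.items())]

# Static data stays resident across iterations; only the centroid list
# (a few floats) changes per iteration, which is what an iterative
# runtime like CGL-MapReduce exploits instead of re-reading the input.
random.seed(0)
points = ([random.gauss(0, 1) for _ in range(50)] +
          [random.gauss(10, 1) for _ in range(50)])
centroids = [0.0, 1.0]
for _ in range(10):
    centroids = kmeans_iteration(points, centroids)
# centroids converge near the two cluster means (about 0 and 10)
```

In plain Hadoop, each of those ten iterations would be a separate job that re-reads the points from disk; keeping them in memory is the source of CGL-MapReduce's advantage on this workload.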
    •  Metrics: performance (average running time) and overhead = [P * T(P) - T(1)] / T(1), where P is the number of processes. Runtimes compared: DryadLINQ, Hadoop, CGL-MapReduce, MPI
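The overhead formula above compares total parallel work (P times the parallel running time) against the serial running time; anything beyond T(1) is communication and synchronization cost. A small helper makes the arithmetic concrete; the timing numbers are hypothetical.

```python
def parallel_overhead(t1, tp, p):
    """Overhead = (P * T(P) - T(1)) / T(1): the fraction of extra total
    work introduced by running on P processes instead of one."""
    return (p * tp - t1) / t1

def speedup(t1, tp):
    """Classic speedup T(1) / T(P), for comparison with the overhead."""
    return t1 / tp

# Hypothetical timings: 100 s serially, 15 s on 8 processes.
t1, tp, p = 100.0, 15.0, 8
print(parallel_overhead(t1, tp, p))  # -> 0.2 (20% of the work is overhead)
print(round(speedup(t1, tp), 2))     # -> 6.67 (out of an ideal 8)
```

Zero overhead corresponds to perfect linear speedup; the high overheads reported later for the iterative applications on cloud runtimes mean P * T(P) grows much faster than T(1).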
    •  Cap3 (map-only) and HEP (MapReduce) perform well with cloud runtimes. K-means clustering (iterative) and matrix multiplication (iterative) show high overheads with cloud runtimes compared to the MPI runtime. CGL-MapReduce also gives lower overhead for large datasets
    • Goals: measure the overhead of virtual machines (VMs) on parallel MPI applications; study how applications with different communication-to-computation (c/c) ratios perform on the cloud; examine the effect of different CPU core assignment strategies for VMs when running MPI applications on them
    • Three MPI applications with different c/c ratio requirements: matrix multiplication (Cannon's algorithm), K-means clustering, concurrent wave equation solver
    • Computation and communication complexities of the different MPI applications used
    •  Eucalyptus and Xen based cloud infrastructure; 16 nodes, each with 2 quad-core Intel Xeon processors and 32 GB of memory; nodes connected by 1 gigabit Ethernet. Same software configuration for both bare-metal nodes and VMs: OS - Red Hat Enterprise Linux Server release 5.2; OpenMPI version 1.3.2
    • Different CPU core/virtual machine assignment strategies. Invariant used to select the number of MPI processes: number of MPI processes = number of CPU cores used
    • Performance – 64 CPU cores. Speedup with fixed matrix size (5184 x 5184): speedup decreases by 34% between bare metal and 8 VMs/node at 81 processes, due to the exchange of large messages and more communication
    • Performance – 128 CPU cores. Total overhead (number of MPI processes = 128): communication is much smaller than computation; communication here depends on the number of clusters formed; overhead is large for small data sizes, so less speedup is observed
    • Performance – 128 CPU cores. Total overhead (number of MPI processes = 128): the amount of communication is fixed, with lower data transfer rates; the lower c/c ratio of O(1/n) leads to more latency and lower performance on VMs; 8 VMs per node has 7% more overhead than a bare-metal node
    • Communication between dom0 and domUs when 1 VM per node is deployed (top); communication between dom0 and domUs when 8 VMs per node are deployed (bottom). In a multi-VM configuration, scheduling of I/O operations of domUs (user domains) happens via dom0 (the privileged OS)
    • Figure: LAM vs. OpenMPI in different VM configurations. When using multiple VMs on multi-core CPUs, it is good to use runtimes supporting in-node communication (OpenMPI vs. LAM-MPI)
    •  Cloud runtimes work well for pleasingly parallel (map-only and MapReduce) applications with large datasets. Overheads of cloud runtimes are high for parallel applications that require iterative/complex communication patterns (MPI-based applications); work needs to be done on finding cloud-friendly algorithms for these applications. CGL-MapReduce is efficient for iterative-style MapReduce applications (K-means)
    •  Overheads for MPI applications increase as the number of VMs per node increases (22-50% degradation); in-node communication is important. MapReduce applications (not susceptible to latencies) may perform well on VMs deployed on clouds. Future work: integration of MapReduce and MPI (a biological DNA sequencing application)
    •  No results for implementations of the pleasingly parallel applications (Cap3, HEP) with MPI; time comparisons between MPI and cloud runtimes are missing. Missing evaluations of HPC applications implemented with cloud runtimes on the private cloud, which is critical to show the effect of multi-VM/multi-core configurations on the performance of these applications. The difference in memory sizes (16/32 GB) for clusters with different OSes could lead to biased results
    •  References: Jaliya Ekanayake and Geoffrey Fox, "High Performance Parallel Computing with Clouds and Cloud Technologies," Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, Volume 34 (2010), 20 pages. High Performance Parallel Computing with Clouds and Cloud Technologies: http://www.slideshare.net/jaliyae/high-performance-parallel-computing-with-clouds-and-cloud-technologies. MapReduce, Wikipedia: http://en.wikipedia.org/wiki/MapReduce