2. Acknowledgements Work on Deformable Mesh Abstractions is joint work with Geeta Iyer and Sriram Kailasam. Work on Edge Node File Systems is joint work with Kovendhan. Work on Deformable Mesh Abstractions is funded by Yahoo Research.
3. Introduction Cloud computing provides pay-for-use access to compute and storage resources over the Internet. Smart applications embed intelligence within the application itself (e.g., recommender systems). Computation, data requirements, and algorithms are becoming increasingly complex. Popular programming models for the cloud: MapReduce, Dryad. Are these the right abstractions for smart apps?
4. MapReduce Origins Primary motivation: to facilitate indexing-, searching-, and sorting-like operations on massive datasets over large collections of resources. Inspired by the map and reduce primitives in LISP. Computations are performed on key-value pairs to generate intermediate key-value pairs, and all values with the same key are reduced together. The runtime is responsible for parallelizing the map and reduce tasks and handles the other low-level details.
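The key-value contract described above can be sketched in a few lines. This is an illustrative single-process simulation (not Google's or Hadoop's API): `map_fn` emits intermediate (key, value) pairs, the shuffle is simulated with a sort plus `groupby`, and `reduce_fn` folds all values sharing a key.

```python
from itertools import groupby
from operator import itemgetter

def map_fn(_, line):
    # Emit an intermediate (word, 1) pair for every word in the line.
    for word in line.split():
        yield (word, 1)

def reduce_fn(word, counts):
    # Fold all counts that share the same key.
    yield (word, sum(counts))

def run_mapreduce(records, map_fn, reduce_fn):
    intermediate = [kv for key, val in records for kv in map_fn(key, val)]
    intermediate.sort(key=itemgetter(0))            # simulated shuffle/sort phase
    output = []
    for key, group in groupby(intermediate, key=itemgetter(0)):
        output.extend(reduce_fn(key, (v for _, v in group)))
    return dict(output)

print(run_mapreduce([(0, "a b a"), (1, "b c")], map_fn, reduce_fn))
# {'a': 2, 'b': 2, 'c': 1}
```

In the real frameworks the runtime distributes the map and reduce tasks and performs the shuffle over the network; only the two user functions remain.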
5. Limitations and Proposed Extensions Limitations of the original MR model: Input/output is restricted to key-value pairs. Jobs are loosely synchronized (no connected computation). No support for iteration or recursion. No direct support for multiple inputs to a job. Optimized for batch processing. Different nodes are assumed to perform work at roughly the same rate, with the inherent assumption that all tasks require the same amount of time. Extensions: Iterative MR: adds support for iterations; relies on long-running MapReduce tasks and on streaming data between iterations. Spark: supports iterations and interactive queries. Each iteration is handled as a separate MapReduce job, incurring job-submission overheads. Streaming makes fault tolerance difficult.
6. Basic Database Operations Projection, Selection, Aggregation; Join, Cartesian product, Set operations. Only the unary operations can be directly modeled with the original MapReduce framework. There is no direct support for operations over multiple, possibly heterogeneous input data sources; they can be handled indirectly by chaining extra MapReduce steps.
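The chaining workaround above can be illustrated with a reduce-side equi-join, the standard trick for a binary operation in MapReduce: tag each record with its source relation in an extra map step, then pair rows sharing the join key in the reducer. All names here are illustrative, not any framework's API.

```python
from collections import defaultdict

def map_tagged(source_tag, records, key_index):
    # Extra map step: tag each record with its source relation.
    for rec in records:
        yield (rec[key_index], (source_tag, rec))

def reduce_join(key, tagged_rows):
    # Pair every left row with every right row sharing the join key.
    left = [r for tag, r in tagged_rows if tag == "L"]
    right = [r for tag, r in tagged_rows if tag == "R"]
    for l in left:
        for r in right:
            yield (key, l, r)

def join(left_rows, right_rows):
    buckets = defaultdict(list)                 # simulated shuffle by join key
    for k, v in map_tagged("L", left_rows, 0):
        buckets[k].append(v)
    for k, v in map_tagged("R", right_rows, 0):
        buckets[k].append(v)
    out = []
    for k in sorted(buckets):
        out.extend(reduce_join(k, buckets[k]))
    return out

users = [(1, "alice"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "lamp")]
print(join(users, orders))
# [(1, (1, 'alice'), (1, 'book')), (1, (1, 'alice'), (1, 'pen'))]
```

The cost of the workaround is visible: an extra tagging pass and a full shuffle, which Dryad-style systems avoid by supporting multiple inputs natively.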
7. Dryad & DryadLINQ Motivated primarily by parallel databases. Makes the communication graph explicit. The execution graph is expressed as a Directed Acyclic Graph (DAG). DryadLINQ allows computations to be expressed in terms of LINQ operators (similar to SQL operators), automatically parallelized by the Dryad execution engine. Supports multiple datasets and runtime optimization of the complete execution graph.
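The explicit-DAG idea can be sketched as a tiny scheduler: vertices are functions, edges carry upstream outputs downstream, and execution follows a topological order. This is a toy illustration of the model, not Dryad's actual interface.

```python
from graphlib import TopologicalSorter

def run_dag(vertices, edges):
    # Build each vertex's predecessor set from the explicit edge list.
    deps = {v: set() for v in vertices}
    for src, dst in edges:
        deps[dst].add(src)
    results = {}
    # Execute vertices in topological order; in Dryad, independent
    # vertices would run in parallel on different machines.
    for v in TopologicalSorter(deps).static_order():
        inputs = [results[u] for u in sorted(deps[v])]
        results[v] = vertices[v](*inputs)
    return results

vertices = {
    "read": lambda: [3, 1, 2],
    "sort": lambda xs: sorted(xs),
    "sum":  lambda xs: sum(xs),
    "out":  lambda srt, total: (total, srt),   # inputs arrive in sorted vertex-name order
}
edges = [("read", "sort"), ("read", "sum"), ("sort", "out"), ("sum", "out")]
print(run_dag(vertices, edges)["out"])
# (6, [1, 2, 3])
```

Because the whole graph is known up front, a runtime can rewrite it before execution (e.g., fuse vertices or insert aggregation trees), which is exactly the runtime optimization the slide refers to.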
8. Limitations Lacks support for recursively spawning new tasks as the computation proceeds. Adaptive computations such as AI planning and branch-and-bound cannot be supported directly.
10. Different nodes executing in parallel need to communicate; this requires support for a shared communication model.
13. Real-world graphs may not be captured well by hash-based partitioning; alternate partitioning schemes are needed. Classes of Applications: AI planning, decision tree algorithms, association rule mining, recommender systems, data mining, graph algorithms, clustering algorithms.
14. Deformable Mesh Abstraction Focus: a new programming model targeted at the wider class of applications that cannot be modeled efficiently using existing frameworks, while at the same time supporting MapReduce-like computations efficiently. Brings out a clear separation between programmer-expressibility issues and runtime-environment issues.
34. Heuristic-guided Problem Solving General methodology (e.g., AI planning): a set of actions is evaluated in parallel on the problem state. Newly generated states are inserted into a priority queue, ordered by a heuristic value. The best state is selected from the queue for further processing. The iteration continues until the goal state is reached. Requirements: the state of the queue must be preserved across iterations; on-the-fly evaluation of the termination condition decides the number of iterations.
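The loop above is classic best-first search. A minimal sequential sketch on a toy problem (reach a goal integer via +1 / *2 actions; the problem and all names are illustrative) shows the two requirements: the priority queue persists across iterations, and the termination test runs on the fly.

```python
import heapq

def best_first(start, goal):
    heuristic = lambda s: abs(goal - s)
    # Priority queue of (heuristic value, state, path); persists across iterations.
    queue = [(heuristic(start), start, [start])]
    seen = {start}
    while queue:
        _, state, path = heapq.heappop(queue)   # dequeue the best state
        if state == goal:                       # on-the-fly termination check
            return path
        # Evaluate the set of actions on the state (in parallel in DMA).
        for nxt in (state + 1, state * 2):
            if nxt <= goal * 2 and nxt not in seen:
                seen.add(nxt)
                # Enqueue new states ordered by heuristic value.
                heapq.heappush(queue, (heuristic(nxt), nxt, path + [nxt]))
    return None

print(best_first(1, 10))
# [1, 2, 4, 8, 9, 10]
```

In the distributed setting the action evaluations become parallel Solve tasks and the heap becomes a distributed priority queue, which is exactly the structure of the Sapa case study that follows.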
35. Case Study 1: Sapa Planner* [Diagram: starting from the current state and all actions, a Split fans out one (current state, applicable action) pair per Solve task (Solve1 … Solven); each Solve evaluates its action on the current state and computes a heuristic; communication performs enqueue() into a distributed priority queue sorted by heuristic value; a Combine step performs dequeue() to select the next state, and the cycle repeats.] *M. B. Do and S. Kambhampati, “Sapa: a multi-objective metric temporal planner,” J. Artif. Int. Res., vol. 20, no. 1, pp. 155–194, 2003.
36. Modeling Sapa Planner using DMA Solve tasks are assigned to different machines, which evaluate actions on a particular state in parallel. Recursive split is facilitated through an invokeSplit() call from within Combine. Preliminary result: the Split, Solve, and Combine operations are modeled with minimal modification of the sequential planner code. The shared information required for the Split, Solve, and Combine operations is loaded only once on the different machines, thus avoiding recursive-split overheads.
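The Split/Solve/Combine cycle with a recursive split from within Combine can be sketched as follows. This is a sequential toy (same +1 / *2 search problem as before); `split`, `solve`, `combine`, and `invoke_split` are illustrative stand-ins for the DMA interfaces, not the actual API, and the local heap stands in for the distributed priority queue.

```python
import heapq

def split(state, actions):
    # Fan out one (state, action) pair per Solve task.
    return [(state, a) for a in actions]

def solve(state, action, heuristic):
    # Evaluate one action on the state and compute its heuristic value.
    nxt = action(state)
    return (heuristic(nxt), nxt)

def combine(results, queue, goal, invoke_split):
    # Enqueue the new states into the shared priority queue, then either
    # terminate or recursively split on the best dequeued state.
    for item in results:
        heapq.heappush(queue, item)
    _, best = heapq.heappop(queue)
    if best == goal:
        return best
    return invoke_split(best)

def plan(start, goal, actions, heuristic):
    queue = []  # distributed priority queue in DMA; a local heap here
    def invoke_split(state):
        tasks = split(state, actions)
        # Solve tasks would run on different machines; sequential here.
        results = [solve(s, a, heuristic) for s, a in tasks]
        return combine(results, queue, goal, invoke_split)
    return invoke_split(start)

actions = [lambda s: s + 1, lambda s: s * 2]
print(plan(1, 10, actions, lambda s: abs(10 - s)))
# 10
```

The point of the sketch is the control flow: Combine does not return to a driver between iterations but directly triggers the next Split, which is what lets the runtime keep the shared state resident on the worker machines.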
37. Case Study 2: SGPlan4 Planner* [Diagram: the set of Goals is split into Subgoal1 … Subgoaln via constraint-based subgoal partitioning, each handled by a Solve task; each subgoal is split further (e.g., Subgoal11 … Subgoal1n) based on landmark analysis, with Solve tasks evaluating the actions applicable to the current state; a final split based on path optimization follows; Combine-and-evaluate steps merge the subplans and update the penalty values of the global constraints, with communication based on the global constraints; the producible resources are checked and the cycle repeats.] *Chen Y. X., Wah B. W., and Hsu C. W., “Temporal planning using subgoal partitioning and resolution in SGPlan,” J. of Artificial Intelligence Research, 2006.
38. Edge Node File System (ENFS)* ENFS Architecture: metadata management is distributed amongst supernodes, rather than centrally managed at a single namenode. *K. Ponnavaikko and D. Janakiram, “The edge node file system: A distributed file system for high performance computing,” Scalable Computing: Practice and Experience, vol. 10, pp. 111–114, 2009.
44. A supernode is responsible for maintaining the shared storage’s metadata, while the shared storage itself is distributed across the cluster nodes.
46. Extending DMA on Hadoop The clear separation between expressibility issues and runtime issues facilitates extending DMA to the Hadoop environment. Advantages: the DMA interfaces can exploit the efficient runtime provided by Hadoop, while at the same time a wider class of applications can be captured.
Federation of clusters. Clusters: sets of geographically proximal Autonomous Systems (AS); O(10^3) nodes per cluster. A dynamic set of relatively capable nodes, Supernodes, manage the resources within a cluster (devices, users, etc.) and portions of the file system namespace. Clusters are connected by a system-wide structured overlay.