Mbrace plos-slides final
“MBrace: Cloud Computing with Monads” has been accepted for presentation at the Programming Languages and Operating Systems workshop, co-located with the SOSP 2013 conference.

    Presentation Transcript

    • MBrace: Cloud Computing with Monads
      Jan Dzik, Nick Palladinos, Kostas Rontogiannis, Eirik Tsarpalis, Nikolaos Vathis
      Nessos Information Technologies, SA
      7th Workshop on Programming Languages and Operating Systems, November 3, 2013
      Eirik Tsarpalis (Nessos IT), MBrace: Cloud Computing with Monads, PLOS ’13
    • Motivation
      Distributed computation is challenging.
      Key to success: choose the right distribution framework.
      Each framework is tied to a particular programming abstraction: Map-Reduce, Actor model, Dataflow model, etc.
    • Established distributed frameworks
      Restrict to specific distribution patterns.
      Not expressive enough for certain classes of algorithms.
      Difficult to influence task granularity.
      Time-consuming to deploy, manage and debug.
    • What is MBrace?
      1. A new programming model for the cloud.
      2. An elastic, fault-tolerant, multitasking cluster infrastructure.
    • In This Talk
      Concentrate on the programming model.
      Distributed Computation.
      Distributed Data.
      Benchmarks.
    • The MBrace Programming Model
      A monad for composing distribution workflows.
      Essentially a continuation monad that admits distribution.
      Based on F# computation expressions.
      Inspired by the successful F# asynchronous workflows.
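The slide describes the cloud monad as a continuation monad exposed through F# computation expressions, but it does not show how such a builder is defined. The following is a minimal local sketch of that general idea, not MBrace's implementation: `Toy<'T>`, `ToyBuilder`, `toy`, `run` and `addOne` are hypothetical names introduced here, and nothing in it is distributed.

```fsharp
// Toy model only: a continuation monad with a computation-expression builder.
// All names are hypothetical and are not part of the MBrace API.
type Toy<'T> = Toy of (('T -> unit) -> unit)

type ToyBuilder() =
    // `return x` passes x straight to the continuation.
    member _.Return(x : 'T) : Toy<'T> = Toy (fun k -> k x)
    // `return! m` yields an existing workflow unchanged.
    member _.ReturnFrom(m : Toy<'T>) : Toy<'T> = m
    // `let! x = m` runs m, then feeds its result to the rest of the workflow.
    member _.Bind(m : Toy<'T>, f : 'T -> Toy<'U>) : Toy<'U> =
        let (Toy runM) = m
        Toy (fun k ->
            runM (fun x ->
                let (Toy runNext) = f x
                runNext k))

let toy = ToyBuilder()

// Run a workflow locally and capture the value passed to the final continuation.
let run (workflow : Toy<'T>) : 'T =
    let (Toy start) = workflow
    let result = ref Unchecked.defaultof<'T>
    start (fun x -> result.Value <- x)
    result.Value

let addOne (n : int) : Toy<int> = toy { return n + 1 }

let workflow : Toy<int> =
    toy {
        let! a = addOne 1
        let! b = addOne a
        return a + b
    }

printfn "%d" (run workflow) // prints 5
```

Because every `let!` hands the rest of the workflow to `Bind` as an explicit function, a runtime is free to decide where and when that continuation runs, which is presumably what makes this style amenable to distribution.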
    • A basic cloud workflow
          let download (url : string) =
              cloud {
                  let client = new System.Net.WebClient()
                  let content = client.DownloadString(url)
                  return content
              } : Cloud<string>
    • Composing cloud workflows
          let downloadSequential () =
              cloud {
                  let! c1 = download "http://m-brace.net/"
                  let! c2 = download "http://nessos.gr/"
                  let c = c1 + c2
                  return c
              }
    • Parallel Composition
          let downloadParallel () =
              cloud {
                  let! c1, c2 =
                      download "http://m-brace.net/"
                          <||>
                      download "http://nessos.gr/"
                  return c1 + c2
              }
    • Distribution Primitives: an overview
      Binary parallel operator:
          <||> : Cloud<'T> -> Cloud<'U> -> Cloud<'T * 'U>
      Variadic parallel combinator:
          Cloud.Parallel : Cloud<'T> [] -> Cloud<'T []>
      Non-deterministic parallel combinator:
          Cloud.Choice : Cloud<'T option> [] -> Cloud<'T option>
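The shapes of these signatures can be tried out locally with F# asynchronous workflows, which the talk names as its inspiration. The sketch below is only an analogue under that assumption: `Async<'T>` stands in for `Cloud<'T>`, the `<||>` defined here is a local operator written for illustration rather than the MBrace one, and `square` and `label` are hypothetical helpers.

```fsharp
// Local analogue only: Async<'T> stands in for Cloud<'T>; nothing here is distributed.
// This `<||>` is defined for the sketch and is not the MBrace operator.
let (<||>) (left : Async<'T>) (right : Async<'U>) : Async<'T * 'U> =
    async {
        // StartChild launches each computation without waiting for it.
        let! leftHandle = Async.StartChild left
        let! rightHandle = Async.StartChild right
        // Awaiting the handles collects both results.
        let! l = leftHandle
        let! r = rightHandle
        return l, r
    }

let square (n : int) : Async<int> = async { return n * n }
let label : Async<string> = async { return "three" }

// Binary composition may pair results of different types.
let pair : int * string = Async.RunSynchronously (square 3 <||> label)

// Variadic composition takes same-typed jobs and returns results in input order.
let squares : int [] =
    Async.RunSynchronously (Async.Parallel [| for i in 1 .. 4 -> square i |])

printfn "%A %A" pair squares
```

The binary operator exists because a tuple can hold two different result types, whereas the variadic combinator works over an array and therefore needs all jobs to share one type.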
    • Cloud Monad: additional constructs
      Monadic for loops.
      Monadic while loops.
      Monadic exception handling.
    • Example: Inverse squares
          let inverseSquares (inputs : int []) =
              cloud {
                  let jobs : Cloud<float> [] =
                      [| for i in inputs -> cloud { return 1.0 / float (i * i) } |]
                  try
                      let! results = Cloud.Parallel jobs
                      return Array.sum results
                  with :? DivideByZeroException ->
                      return -1.0
              }
    • How is it all executed?
      Scheduler/worker cluster organization.
      Symbolic execution stack (free monad/trampolines).
      Scheduler interprets “monadic skeleton”.
      Native “leaf expressions” dispatched to workers.
      Symbolic stack winds across multiple machines.
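The slide gives no implementation detail, so the following is only a rough sketch of what "interpreting a monadic skeleton with a symbolic stack" can look like in general. Everything in it is an assumption made for illustration: the integer-only `Workflow` type, the `interpret` function, and the `localWorker` function that stands in for dispatching a leaf to a remote worker.

```fsharp
// Hypothetical toy types: integer-only workflows and a single local "worker".
// This is not MBrace's internal representation.
type Workflow =
    | Return of int                        // a finished value
    | Leaf of (unit -> int)                // native code a worker would run
    | Bind of Workflow * (int -> Workflow) // skeleton: run first, then continue

// Interpret the skeleton with an explicit stack of pending continuations,
// so the host call stack does not grow with the depth of the workflow.
let interpret (dispatch : (unit -> int) -> int) (workflow : Workflow) : int =
    let rec loop (current : Workflow) (stack : (int -> Workflow) list) : int =
        match current, stack with
        | Bind (inner, k), _ -> loop inner (k :: stack)   // push continuation
        | Leaf f, _ -> loop (Return (dispatch f)) stack   // hand leaf to a worker
        | Return v, k :: rest -> loop (k v) rest          // pop continuation
        | Return v, [] -> v                               // nothing left to run

// Stand-in for a worker node: counts dispatches and runs the leaf locally.
let dispatched = ref 0
let localWorker (f : unit -> int) : int =
    dispatched.Value <- dispatched.Value + 1
    f ()

let secondStep (a : int) : Workflow =
    Bind (Leaf (fun () -> a + 1), (fun b -> Return (a + b)))

let example : Workflow = Bind (Leaf (fun () -> 20), secondStep)

let result = interpret localWorker example
printfn "%d after %d dispatches" result dispatched.Value // prints "41 after 2 dispatches"
```

In this toy version the stack is an ordinary in-memory list; the slide's point that the stack "winds across multiple machines" suggests the real one is data that can be moved between nodes, which this sketch does not attempt.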
    • A Map-Reduce implementation
          let rec mapReduce (map : 'T -> Cloud<'R>)
                            (reduce : 'R -> 'R -> Cloud<'R>)
                            (identity : 'R)
                            (input : 'T list) =
              cloud {
                  match input with
                  | [] -> return identity
                  | [value] -> return! map value
                  | _ ->
                      let left, right = List.split input
                      let! r1, r2 =
                          (mapReduce map reduce identity left)
                              <||>
                          (mapReduce map reduce identity right)
                      return! reduce r1 r2
              }
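The same divide-and-conquer recursion can be exercised without a cluster by swapping `Cloud` for `Async`. This is a local sketch under that substitution, not the MBrace code: `splitInHalf` is a hypothetical helper standing in for the slide's `List.split`, `Async.Parallel` replaces `<||>` because both halves have the same result type, and `squareAsync` and `addAsync` are made-up inputs.

```fsharp
// Hypothetical helper standing in for the slide's List.split: halve a list.
let splitInHalf (xs : 'T list) : 'T list * 'T list =
    List.splitAt (List.length xs / 2) xs

// Local analogue of the slide's mapReduce, with Async in place of Cloud.
let rec mapReduce (map : 'T -> Async<'R>)
                  (reduce : 'R -> 'R -> Async<'R>)
                  (identity : 'R)
                  (input : 'T list) : Async<'R> =
    async {
        match input with
        | [] -> return identity
        | [value] -> return! map value
        | _ ->
            // Lists of length >= 2 always split into two non-empty halves.
            let left, right = splitInHalf input
            let! results =
                Async.Parallel [| mapReduce map reduce identity left
                                  mapReduce map reduce identity right |]
            return! reduce results.[0] results.[1]
    }

let squareAsync (x : int) : Async<int> = async { return x * x }
let addAsync (a : int) (b : int) : Async<int> = async { return a + b }

let sumOfSquares : int =
    Async.RunSynchronously (mapReduce squareAsync addAsync 0 [1 .. 5])

printfn "%d" sumOfSquares // prints 55
```

The recursion forms a binary tree over the input, so the amount of parallelism and the size of each task follow directly from how the list is split.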
    • What about Data Distribution?
      MBrace does NOT include a storage service (for now).
      Relies on third-party storage services.
      Storage Provider plugin architecture.
      Out-of-the-box support for FileSystem, SQL and Azure.
      Future support for HDFS and Amazon S3.
    • The MBrace Data Programming Model
      Storage services interfaced through data primitives.
      Data primitives act as references to distributed resources.
      Initialized or updated through the monad.
      Come in immutable or mutable flavors.
    • Cloud Ref
      Simplest distributed data primitive of MBrace.
      Generic reference to a stored value.
      Conceptually similar to ML ref cells.
      Immutable by design.
      Cached in worker nodes for performance.
    • Cloud Ref: Example
          let createRef (inputs : int []) =
              cloud {
                  let! ref = CloudRef.New inputs
                  return ref : CloudRef<int []>
              }

          let deRef (ref : CloudRef<int []>) =
              cloud {
                  let content = ref.Value
                  return content : int []
              }
    • Application: Data Sharding
          type DistribTree<'T> =
              | Leaf of 'T
              | Branch of CloudRef<DistribTree<'T>> * CloudRef<DistribTree<'T>>

          let rec map (f : 'T -> 'S) (tree : DistribTree<'T>) =
              cloud {
                  match tree with
                  | Leaf t -> return! CloudRef.New (Leaf (f t))
                  | Branch(l,r) ->
                      let! l', r' = map f l.Value <||> map f r.Value
                      return! CloudRef.New (Branch(l',r'))
              }
    • Cloud File
      References files in the distributed store.
      Untyped, immutable, binary blobs.
    • Cloud File: Example
          let getSize (file : CloudFile) =
              cloud {
                  let! bytes = CloudFile.ReadAllBytes file
                  return bytes.Length / 1024
              }

          cloud {
              let! files = CloudDir.GetFiles "/path/to/files"
              let jobs = Array.map getSize files
              let! sizes = Cloud.Parallel jobs
              return Array.sum sizes
          }
    • Performance
      We tested MBrace against Hadoop.
      Both frameworks were run on Windows Azure.
      Clusters consisted of 4, 8, 16 and 32 quad-core nodes.
      Two algorithms were tested: grep and k-means.
      Source code available on GitHub.
    • Distributed Grep (Windows Azure)
      Count occurrences of a given pattern in input files.
      Straightforward Map-Reduce algorithm.
      Input data was 32, 64, 128 and 256 GB of text.
    • Distributed Grep (Windows Azure)
      [Figure: time (sec) versus number of worker cores, MBrace compared with Hadoop.]
    • k-means Clustering (Windows Azure)
      Centroid computation out of a set of vectors.
      Iterative algorithm.
      Not naturally definable with Map-Reduce workflows.
      Hadoop implementation from Apache Mahout library.
      Input was 10^6 randomly generated, 100-dimensional points.
    • k-means Clustering (Windows Azure)
      [Figure: time (sec) versus number of worker cores, MBrace compared with Hadoop.]
    • Conclusions
      A big data platform for the .NET framework.
      Language-integrated cloud workflows.
      User-specifiable parallelism patterns and task granularity.
      Distributed exception handling.
      Pluggable storage services.
      Data API integrated with programming model.
    • Future Work
      Improved C# support.
      A rich library of combinators and parallelism patterns.
      A LINQ provider for data parallelism.
      Support for the Mono framework and Linux.
    • Thank You!
      Questions?
      http://m-brace.net