MBrace: Cloud Computing with Monads
Jan Dzik

Nick Palladinos Kostas Rontogiannis
Eirik Tsarpalis Nikolaos Vathis
Mbrace plos-slides final


“MBrace: Cloud Computing with Monads” has been accepted for presentation at the Programming Languages and Operating Systems workshop, co-located with the SOSP 2013 conference.

Published in: Technology, Business

Transcript of "Mbrace plos-slides final"

  1. MBrace: Cloud Computing with Monads
     Jan Dzik, Nick Palladinos, Kostas Rontogiannis, Eirik Tsarpalis, Nikolaos Vathis
     Nessos Information Technologies, SA
     7th Workshop on Programming Languages and Operating Systems (PLOS ’13), November 3, 2013
     Slide footer: Eirik Tsarpalis (Nessos IT), MBrace: Cloud Computing with Monads, PLOS ’13
  2. Introduction / Motivation
     • Distributed computation is challenging.
     • Key to success: choose the right distribution framework.
     • Each framework is tied to a particular programming abstraction: Map-Reduce, Actor model, Dataflow model, etc.
  3. Introduction / Motivation / Established distributed frameworks
     • Restrict to specific distribution patterns.
     • Not expressive enough for certain classes of algorithms.
     • Difficult to influence task granularity.
     • Time consuming to deploy, manage and debug.
  4. Introduction / What is MBrace?
     1. A new programming model for the cloud.
     2. An elastic, fault tolerant, multitasking cluster infrastructure.
  5. Introduction / In This Talk
     • Concentrate on the programming model.
     • Distributed Computation.
     • Distributed Data.
     • Benchmarks.
  6. The MBrace Programming Model / The Cloud Monad
     • A monad for composing distribution workflows.
     • Essentially a continuation monad that admits distribution.
     • Based on F# computation expressions.
     • Inspired by the successful F# asynchronous workflows.
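To make "a continuation monad based on F# computation expressions" concrete, here is a minimal single-machine sketch. The `Cont` type, the `ContBuilder` class, the `cont` builder and `runCont` are hypothetical names introduced only for this illustration; this is not MBrace's `cloud` builder and it performs no distribution.

```fsharp
// A computation in continuation-passing style: given a continuation
// for the result, it runs and eventually calls that continuation.
type Cont<'T> = Cont of (('T -> unit) -> unit)

// Hypothetical builder; the methods are the ones F# needs to
// desugar `let!` and `return` inside `cont { ... }`.
type ContBuilder() =
    member _.Return (x : 'T) : Cont<'T> = Cont (fun k -> k x)
    member _.Bind (c : Cont<'T>, f : 'T -> Cont<'U>) : Cont<'U> =
        Cont (fun k ->
            let (Cont run) = c
            run (fun x ->
                let (Cont next) = f x
                next k))
    member _.Delay (f : unit -> Cont<'T>) : Cont<'T> =
        Cont (fun k ->
            let (Cont run) = f ()
            run k)

let cont = ContBuilder()

// Run a computation and capture the value passed to the final continuation.
let runCont (c : Cont<'T>) : 'T =
    let (Cont run) = c
    let result = ref Unchecked.defaultof<'T>
    run (fun x -> result.Value <- x)
    result.Value

// Workflows compose with let!, just like the cloud { ... } examples in the deck.
let addLengths (a : string) (b : string) : Cont<int> = cont {
    let! x = cont { return a.Length }
    let! y = cont { return b.Length }
    return x + y
}

let total = runCont (addLengths "cloud" "monad")
```

Because every step receives "the rest of the computation" as an explicit function, a runtime is free to decide where and when to invoke it, which is the property the slide refers to as admitting distribution.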
  7. The MBrace Programming Model / The Cloud Monad / A basic cloud workflow

         let download (url : string) = cloud {
             let client = new System.Net.WebClient()
             let content = client.DownloadString(url)
             return content
         } : Cloud<string>
  8. The MBrace Programming Model / The Cloud Monad / Composing cloud workflows

         let downloadSequential () = cloud {
             let! c1 = download "http://m-brace.net/"
             let! c2 = download "http://nessos.gr/"
             let c = c1 + c2
             return c
         }
  9. The MBrace Programming Model / Distribution Combinators / Parallel Composition

         let downloadParallel () = cloud {
             let! c1,c2 =
                 download "http://m-brace.net/"
                 <||> download "http://nessos.gr/"
             return c1 + c2
         }
  10. The MBrace Programming Model / Distribution Combinators / Distribution Primitives: an overview
     • Binary parallel operator:
           <||> : Cloud<'T> -> Cloud<'U> -> Cloud<'T * 'U>
     • Variadic parallel combinator:
           Cloud.Parallel : Cloud<'T> [] -> Cloud<'T []>
     • Non-deterministic parallel combinator:
           Cloud.Choice : Cloud<'T option> [] -> Cloud<'T option>
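The shapes of these three primitives can be tried locally with F# `Async` as a stand-in for `Cloud`. This is only an analogy under stated assumptions: the `<||>` operator below is defined here for illustration on top of `Async.StartChild`, `Async.Parallel` and `Async.Choice` are the standard FSharp.Core members (the latter assumes FSharp.Core 4.0 or later), and none of this is MBrace code.

```fsharp
// Local stand-in for the binary parallel operator, built on Async.
let (<||>) (left : Async<'T>) (right : Async<'U>) : Async<'T * 'U> = async {
    // Start both computations as children so they can run concurrently.
    let! l = Async.StartChild left
    let! r = Async.StartChild right
    let! x = l
    let! y = r
    return x, y
}

let square (n : int) = async { return n * n }

// Binary parallel composition of two differently typed computations.
let pair =
    (square 3 <||> async { return "done" })
    |> Async.RunSynchronously

// Variadic parallel composition: results come back in input order.
let squares =
    [| 1; 2; 3 |]
    |> Array.map square
    |> Async.Parallel
    |> Async.RunSynchronously

// Non-deterministic choice: the first computation that yields Some wins.
// Only one input produces Some here, so the outcome is deterministic.
let firstEven =
    [| 1; 4; 7 |]
    |> Array.map (fun n -> async { return (if n % 2 = 0 then Some n else None) })
    |> Async.Choice
    |> Async.RunSynchronously
```

The type signatures line up with the slide one for one, with `Async` in place of `Cloud`; the difference is that `Async` schedules work on local threads, whereas the deck describes `Cloud` workflows being dispatched to cluster workers.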
  11. The MBrace Programming Model / Additional Constructs / Cloud Monad: additional constructs
     • Monadic for loops.
     • Monadic while loops.
     • Monadic exception handling.
  12. The MBrace Programming Model / Additional Constructs / Example: Inverse squares

         let inverseSquares (inputs : int []) = cloud {
             let jobs : Cloud<float> [] =
                 [| for i in inputs -> cloud { return 1.0 / float (i * i) } |]
             try
                 let! results = Cloud.Parallel jobs
                 return Array.sum results
             with :? DivideByZeroException ->
                 return -1.0
         }
  13. The MBrace Programming Model / Evaluation in the Cloud / How is it all executed?
     • Scheduler/worker cluster organization.
     • Symbolic execution stack (free monad/trampolines).
     • Scheduler interprets “monadic skeleton”.
     • Native “leaf expressions” dispatched to workers.
     • Symbolic stack winds across multiple machines.
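The trampolining idea mentioned above can be sketched on a single machine: a computation returns its next step as data, and a separate interpreter loop decides when to run it. The `Step` type, `run` and `sumTo` are hypothetical names for this illustration only; the deck does not show MBrace's actual scheduler or its stack representation.

```fsharp
// A suspended computation: either a final value or a thunk for the next step.
type Step<'T> =
    | Done of 'T
    | More of (unit -> Step<'T>)

// The interpreter loop owns the control flow and runs one step at a time,
// so the host call stack does not grow with the depth of the computation.
let run (start : Step<'T>) : 'T =
    let mutable current = start
    let mutable result = None
    while Option.isNone result do
        match current with
        | Done value -> result <- Some value
        | More next -> current <- next ()
    Option.get result

// Sum 1..n, written so that each recursive call is returned as data
// instead of being executed immediately.
let rec sumTo (n : int64) (acc : int64) : Step<int64> =
    if n = 0L then Done acc
    else More (fun () -> sumTo (n - 1L) (acc + n))

let total = run (sumTo 1000000L 0L)
```

Once the pending work is an ordinary value, the loop that interprets it no longer has to live in the same process as the code that produced it, which is one way to read the slide's remark that the symbolic stack winds across multiple machines.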
  14. The MBrace Programming Model / Map-Reduce / A Map-Reduce implementation

         let rec mapReduce (map : 'T -> Cloud<'R>)
                           (reduce : 'R -> 'R -> Cloud<'R>)
                           (identity : 'R)
                           (input : 'T list) =
             cloud {
                 match input with
                 | [] -> return identity
                 | [value] -> return! map value
                 | _ ->
                     // The slide calls a List.split helper, which FSharp.Core does not provide;
                     // List.splitAt at the midpoint halves the input.
                     let left, right = List.splitAt (List.length input / 2) input
                     let! r1, r2 =
                         (mapReduce map reduce identity left)
                         <||> (mapReduce map reduce identity right)
                     return! reduce r1 r2
             }
  15. The Distributed Data Programming Model / Introduction / What about Data Distribution?
     • MBrace does NOT include a storage service (for now).
     • Relies on third-party storage services.
     • Storage Provider plugin architecture.
     • Out-of-the-box support for FileSystem, SQL and Azure.
     • Future support for HDFS and Amazon S3.
  16. The Distributed Data Programming Model / The MBrace Data Programming Model
     • Storage services interfaced through data primitives.
     • Data primitives act as references to distributed resources.
     • Initialized or updated through the monad.
     • Come in immutable or mutable flavors.
  17. The Distributed Data Programming Model / Cloud Ref
     • Simplest distributed data primitive of MBrace.
     • Generic reference to a stored value.
     • Conceptually similar to ML ref cells.
     • Immutable by design.
     • Cached in worker nodes for performance.
  18. The Distributed Data Programming Model / Cloud Ref / Cloud Ref: Example

         let createRef (inputs : int []) = cloud {
             let! ref = CloudRef.New inputs
             return ref : CloudRef<int []>
         }

         let deRef (ref : CloudRef<int []>) = cloud {
             let content = ref.Value
             return content : int []
         }
  19. The Distributed Data Programming Model / Cloud Ref / Application: Data Sharding

         type DistribTree<'T> =
             | Leaf of 'T
             | Branch of CloudRef<DistribTree<'T>> * CloudRef<DistribTree<'T>>

         let rec map (f : 'T -> 'S) (tree : DistribTree<'T>) = cloud {
             match tree with
             | Leaf t -> return! CloudRef.New (Leaf (f t))
             | Branch(l,r) ->
                 let! l', r' = map f l.Value <||> map f r.Value
                 return! CloudRef.New (Branch(l',r'))
         }
  20. The Distributed Data Programming Model / Cloud File
     • References files in the distributed store.
     • Untyped, immutable, binary blobs.
  21. The Distributed Data Programming Model / Cloud File / Cloud File: Example

         let getSize (file : CloudFile) = cloud {
             let! bytes = CloudFile.ReadAllBytes file
             return bytes.Length / 1024
         }

         cloud {
             let! files = CloudDir.GetFiles "/path/to/files"
             let jobs = Array.map getSize files
             let! sizes = Cloud.Parallel jobs
             return Array.sum sizes
         }
  22. The MBrace Framework / Performance
     • We tested MBrace against Hadoop.
     • Both frameworks were run on Windows Azure.
     • Clusters consisted of 4, 8, 16 and 32 quad-core nodes.
     • Two algorithms were tested, grep and k-means.
     • Source code available on GitHub.
  23. The MBrace Framework / Performance / Distributed Grep (Windows Azure)
     • Count occurrences of given pattern from input files.
     • Straightforward Map-Reduce algorithm.
     • Input data was 32, 64, 128 and 256 GB of text.
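The map and reduce steps of such a grep can be sketched locally as follows. This is not the benchmark code, which the deck does not show: `countMatches`, `combine` and the in-memory `documents` list are hypothetical stand-ins for illustration, and everything runs sequentially on one machine.

```fsharp
open System.Text.RegularExpressions

// Map step: count matches of the pattern in one document.
let countMatches (pattern : string) (text : string) : int =
    Regex.Matches(text, pattern).Count

// Reduce step: combine two partial counts.
let combine (a : int) (b : int) : int = a + b

// Hypothetical in-memory inputs standing in for the input files.
let documents = [ "cloud monad cloud"; "no match here"; "cloud" ]

// Map every document to a count, then reduce the counts starting from 0.
let total =
    documents
    |> List.map (countMatches "cloud")
    |> List.fold combine 0
```

Because `combine` is associative and 0 is its identity, the partial counts can be merged in any grouping, which is what lets a divide-and-conquer `mapReduce` like the one on slide 14 split the input across workers.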
  24. The MBrace Framework / Performance / Distributed Grep (Windows Azure)
     [Plot: time (sec) versus number of worker cores, MBrace and Hadoop.]
  25. The MBrace Framework / Performance / k-means Clustering (Windows Azure)
     • Centroid computation out of a set of vectors.
     • Iterative algorithm.
     • Not naturally definable with Map-Reduce workflows.
     • Hadoop implementation from Apache Mahout library.
     • Input was 10^6 randomly generated, 100-dimensional points.
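For readers unfamiliar with the algorithm, here is a small local sketch of the iteration at the heart of k-means (assign points to the nearest centroid, then recompute each centroid as a mean). It is not the benchmarked MBrace or Mahout implementation; all function names are hypothetical, and the fixed iteration count replaces a real convergence test.

```fsharp
// Squared Euclidean distance between two points of equal dimension.
let distance2 (a : float []) (b : float []) : float =
    Array.fold2 (fun acc x y -> acc + (x - y) * (x - y)) 0.0 a b

// Index of the centroid closest to the given point.
let nearest (centroids : float [] []) (point : float []) : int =
    centroids
    |> Array.mapi (fun i c -> i, distance2 c point)
    |> Array.minBy snd
    |> fst

// Component-wise mean of a non-empty group of points.
let mean (points : float [] []) : float [] =
    let n = float points.Length
    Array.init points.[0].Length (fun d ->
        (points |> Array.sumBy (fun p -> p.[d])) / n)

// One iteration: group points by nearest centroid, then recompute centroids.
// A centroid that attracts no points is kept unchanged.
let step (points : float [] []) (centroids : float [] []) : float [] [] =
    let groups = points |> Array.groupBy (nearest centroids) |> Map.ofArray
    centroids
    |> Array.mapi (fun i c ->
        match Map.tryFind i groups with
        | Some members -> mean members
        | None -> c)

// Repeat for a fixed number of iterations (a simplification).
let rec iterate (n : int) (points : float [] []) (centroids : float [] []) =
    if n = 0 then centroids
    else iterate (n - 1) points (step points centroids)

let points = [| [| 0.0; 0.0 |]; [| 0.0; 2.0 |]; [| 10.0; 10.0 |]; [| 10.0; 12.0 |] |]
let initial = [| [| 0.0; 0.0 |]; [| 10.0; 10.0 |] |]
let centroids = iterate 5 points initial
```

Each iteration depends on the centroids produced by the previous one, so the algorithm is a loop around a parallelizable step rather than a single map-reduce pass, which is the point the slide makes about iterative algorithms.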
  26. The MBrace Framework / Performance / k-means Clustering (Windows Azure)
     [Plot: time (sec) versus number of worker cores, MBrace and Hadoop.]
  27. Conclusions & Future Work / Conclusions
     • A big data platform for the .NET framework.
     • Language-integrated cloud workflows.
     • User-specifiable parallelism patterns and task granularity.
     • Distributed exception handling.
     • Pluggable storage services.
     • Data API integrated with programming model.
  28. Conclusions & Future Work / Future Work
     • Improved C# support.
     • A rich library of combinators and parallelism patterns.
     • A LINQ provider for data parallelism.
     • Support for the Mono framework and Linux.
  29. Conclusions & Future Work / Thank You!
     Questions?
     http://m-brace.net