MBrace: Cloud Computing with F#

MBrace is a programming model and cluster infrastructure for defining and executing large-scale computations in the cloud. Based on the .NET framework, it builds upon and extends F# asynchronous workflows.

https://skillsmatter.com/skillscasts/5157-mbrace-large-scale-distributed-computation-with-f

Published in: Technology, Business

  1. Cloud Computing with F#
  2. About Nessos IT
     • Athens-based ISV company
     • Specializes in the .NET framework and C#/F#
     • Various business fields
       ◦ Business process management
       ◦ GIS
       ◦ Application framework development
     • R&D Development
       ◦ OR Mappers
       ◦ MBrace and related frameworks
       ◦ Open Source development
  3. What is MBrace?
     • A Programming Model.
       ◦ Leverages the power of the F# language.
       ◦ Inspired by F#’s asynchronous workflows.
       ◦ Declarative, compositional, higher-order.
     • A Cluster Infrastructure.
       ◦ Based on the .NET framework.
       ◦ Elastic, fault tolerant, multitasking.
  4. The MBrace Programming Model: HelloWorld

         val hello : Cloud<unit>

         let hello =
             cloud {
                 printfn "hello, world!"
                 return ()
             }

         MBrace.CreateProcess <@ hello @>
  5. The MBrace Programming Model: Sequential Composition

         let first = cloud { return 15 }
         let second = cloud { return 27 }

         cloud {
             let! x = first
             let! y = second
             return x + y
         }
  6. The MBrace Programming Model: Example: Sequential fold

         val foldl : ('S -> 'T -> Cloud<'S>) -> 'S -> 'T list -> Cloud<'S>

         let rec foldl f s ts =
             cloud {
                 match ts with
                 | [] -> return s
                 | t :: ts' ->
                     let! s' = f s t
                     return! foldl f s' ts'
             }
  7. The MBrace Programming Model: Parallel Composition

         val (<||>) : Cloud<'T> -> Cloud<'S> -> Cloud<'T * 'S>

         cloud {
             let first = cloud { return 15 }
             let second = cloud { return 27 }
             let! x,y = first <||> second
             return x + y
         }
  8. The MBrace Programming Model: Parallel Composition (Variadic)

         val Cloud.Parallel : Cloud<'T> [] -> Cloud<'T []>

         cloud {
             let sqr x = cloud { return x * x }
             let jobs = Array.map sqr [|1 .. 100|]
             let! sqrs = Cloud.Parallel jobs
             return Array.sum sqrs
         }
  9. The MBrace Programming Model: Non-Deterministic Parallelism

         val Cloud.Choice : Cloud<'T option> [] -> Cloud<'T option>

         let tryPick (f : 'T -> Cloud<'S option>) (ts : 'T []) =
             cloud {
                 let jobs = Array.map f ts
                 return! Cloud.Choice jobs
             }
  10. The MBrace Programming Model: Exception handling

          let first = cloud { return 17 }
          let second = cloud { return 25 / 0 }

          cloud {
              try
                  let! x,y = first <||> second
                  return x + y
              with :? DivideByZeroException -> return -1
          }
  11. The MBrace Programming Model: Example: Map-Reduce

          let mapReduce (mapF : 'T -> ICloud<'S>)
                        (reduceF : 'S -> 'S -> ICloud<'S>)
                        (identity : 'S)
                        (inputs : 'T list) =
              let rec aux inputs =
                  cloud {
                      match inputs with
                      | [] -> return identity
                      | [t] -> return! mapF t
                      | _ ->
                          let left,right = List.split inputs
                          let! s1, s2 = aux left <||> aux right
                          return! reduceF s1 s2
                  }
              aux inputs
  12. Demo 1
  13. About that MapReduce workflow…
  14. About that MapReduce workflow…
      • Communication Overhead.
        ◦ Data captured in cloud workflow closures.
        ◦ Needlessly passed between worker machines.
      • Granularity issues.
        ◦ Each input entails a scheduling decision by the cluster.
        ◦ Cluster size not taken into consideration.
        ◦ Multicore capacity of worker nodes ignored.
  15. Distributed Data in MBrace: The Cloud Ref

          let createRef (data : string list) =
              cloud {
                  let! cref = CloudRef.New data
                  return cref : CloudRef<string list>
              }

          let deRef (cref : CloudRef<string list>) =
              cloud { return cref.Value }
  16. Distributed Data in MBrace: The Cloud Ref
      • Simplest data primitive in MBrace.
      • References a value stored in the cluster.
      • Conceptually similar to ML ref types.
      • Immutable by design.
      • Values cached in worker nodes for performance.
  17. Distributed Data in MBrace: Disposable types

          cloud {
              use! data = CloudRef.New [| 1 .. 1000000 |]
              let! x,y = doSomething data <||> doSomethingElse data
              return x + y
          }
  18. Demo 2
  19. Performance
      • We tested MBrace against Hadoop.
      • Tests were staged on Windows Azure.
      • Clusters of 4, 8, 16 and 32 Large Azure instances.
      • Two algorithms were tested, grep and k-means.
      • Source code available on GitHub.
  20. Performance: Distributed grep
      • Find occurrences of given pattern in text files.
      • Straightforward Map-Reduce algorithm.
      • Input data was 32, 64, 128 and 256 GB of text.
  22. Performance: Distributed grep
  23. Performance: K-means
      • Centroid computation out of a set of vectors.
      • Iterative algorithm.
      • Not naturally describable in Map-Reduce workflows.
      • Hadoop implementation using Apache Mahout.
      • Input was 10^6 randomly generated 100-dimensional points.
  24. Performance: K-means
  25. Future
      • Better C# support.
        ◦ LinqOptimizer, LinqOptimizer.GPU and CloudLINQ.
        ◦ Support for the upcoming C# interactive.
      • Open Source.
        ◦ FsPickler, Thespian, CloudLINQ, etc. components of MBrace already published.
      • Mono/Linux support.
  26. Find more at http://www.m-brace.net
      http://github.com/nessos
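
The granularity issues listed on slide 14 (one scheduling decision per input, cluster size ignored) suggest batching inputs into roughly one chunk per worker. The following is a minimal sketch of that idea, not the approach shown in the talk: it uses plain F# `async` and `Async.Parallel` as stand-ins for the `cloud` builder and `Cloud.Parallel` so that it runs with FSharp.Core alone. The names `chunkedMapReduce` and `workers` are hypothetical, and the sketch assumes `reduceF` is associative with `identity` as its neutral element.

```fsharp
// Sketch only: F# `async` stands in for MBrace's `cloud` builder.
// `chunkedMapReduce` and `workers` are hypothetical names, not MBrace API.

/// Map-reduce that creates one parallel job per chunk rather than per input.
let chunkedMapReduce (mapF : 'T -> 'S)
                     (reduceF : 'S -> 'S -> 'S)
                     (identity : 'S)
                     (workers : int)
                     (inputs : 'T []) : Async<'S> =
    async {
        if Array.isEmpty inputs then
            return identity
        else
            // Array.splitInto yields at most `workers` non-empty chunks.
            let chunks = Array.splitInto workers inputs
            // Each chunk is mapped and folded locally inside a single job.
            let job (chunk : 'T []) =
                async {
                    return chunk |> Array.fold (fun acc t -> reduceF acc (mapF t)) identity
                }
            let! partials = chunks |> Array.map job |> Async.Parallel
            // Combine the per-chunk partial results.
            return Array.fold reduceF identity partials
    }

// Usage: sum of squares of 1 .. 100, split into at most 4 jobs.
let total =
    chunkedMapReduce (fun x -> x * x) (+) 0 4 [| 1 .. 100 |]
    |> Async.RunSynchronously
```

With this shape the number of scheduled jobs is bounded by `workers` instead of growing with the input size; how the worker count would be obtained from an actual cluster is left open here.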
