MBrace: Cloud Computing with F#

Eirik George Tsarpalis
F# Developer at Nessos Information Technologies S.A.
About Nessos IT
• Athens-based ISV company
• Specializes in the .NET framework and C#/F#
• Various business fields
  ◦ Business process management
  ◦ GIS
  ◦ Application framework development
• R&D
  ◦ OR mappers
  ◦ MBrace and related frameworks
  ◦ Open source development
What is MBrace?
• A programming model.
  ◦ Leverages the power of the F# language.
  ◦ Inspired by F#'s asynchronous workflows.
  ◦ Declarative, compositional, higher-order.
• A cluster infrastructure.
  ◦ Based on the .NET framework.
  ◦ Elastic, fault tolerant, multitasking.
The MBrace Programming Model
Hello World
    val hello : Cloud<unit>

    let hello = cloud {
        printfn "hello, world!"
        return ()
    }

    MBrace.CreateProcess <@ hello @>
Sequential Composition
    let first = cloud { return 15 }
    let second = cloud { return 27 }

    cloud {
        let! x = first
        let! y = second
        return x + y
    }
Example: Sequential fold
    val foldl :
        ('S -> 'T -> Cloud<'S>) ->
            'S -> 'T list -> Cloud<'S>

    let rec foldl f s ts = cloud {
        match ts with
        | [] -> return s
        | t :: ts' ->
            let! s' = f s t
            return! foldl f s' ts'
    }
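To try the fold without a cluster, the same shape can be written against F#'s built-in async workflows, which the deck names as the inspiration for cloud workflows. This is only a local sketch: `foldlAsync` and `addStep` are hypothetical names, and `async` stands in for `cloud`.

```fsharp
// Local stand-in: async { ... } replaces cloud { ... } so the fold runs in-process.
let rec foldlAsync (f : 'S -> 'T -> Async<'S>) (s : 'S) (ts : 'T list) : Async<'S> =
    async {
        match ts with
        | [] -> return s
        | t :: ts' ->
            // Each step waits for the previous state before continuing.
            let! s' = f s t
            return! foldlAsync f s' ts'
    }

// Hypothetical step function: add each element to the running total.
let addStep (acc : int) (x : int) : Async<int> = async { return acc + x }

let total = foldlAsync addStep 0 [ 1 .. 10 ] |> Async.RunSynchronously
printfn "%d" total
```

The steps run strictly one after another, because each call to `f` needs the state produced by the previous one.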
Parallel Composition
    val (<||>) : Cloud<'T> -> Cloud<'S> -> Cloud<'T * 'S>

    cloud {
        let first = cloud { return 15 }
        let second = cloud { return 27 }
        let! x,y = first <||> second
        return x + y
    }
Parallel Composition (Variadic)
    val Cloud.Parallel : Cloud<'T> [] -> Cloud<'T []>

    cloud {
        let sqr x = cloud { return x * x }
        let jobs = Array.map sqr [|1 .. 100|]
        let! sqrs = Cloud.Parallel jobs
        return Array.sum sqrs
    }
Non-Deterministic Parallelism
    val Cloud.Choice : Cloud<'T option> [] -> Cloud<'T option>

    let tryPick (f : 'T -> Cloud<'S option>) (ts : 'T []) =
        cloud {
            let jobs = Array.map f ts
            return! Cloud.Choice jobs
        }
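A local analogue of `tryPick` can be sketched with `Async.Choice` from FSharp.Core, which runs the given computations in parallel and returns the first `Some` result, or `None` if every computation returns `None`. The names `tryPickAsync` and `isMultipleOf7` are hypothetical, and treating `Async.Choice` as a stand-in for `Cloud.Choice` is an assumption made for illustration.

```fsharp
// Local stand-in for the cloud tryPick, using FSharp.Core's Async.Choice.
let tryPickAsync (f : 'T -> Async<'S option>) (ts : 'T []) : Async<'S option> =
    async {
        let jobs = Array.map f ts
        return! Async.Choice jobs
    }

// Hypothetical predicate: succeeds only on multiples of 7.
let isMultipleOf7 (n : int) : Async<int option> =
    async { return (if n % 7 = 0 then Some n else None) }

// Exactly one candidate (7) succeeds, so the result does not depend on scheduling.
let hit = tryPickAsync isMultipleOf7 [| 1 .. 10 |] |> Async.RunSynchronously

// No candidate succeeds, so the result is None.
let miss = tryPickAsync isMultipleOf7 [| 1 .. 6 |] |> Async.RunSynchronously

printfn "%A %A" hit miss
```

When several inputs could succeed, which one is returned depends on which job finishes first; that is the non-determinism the slide title refers to.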
Exception handling
    open System

    let first = cloud { return 17 }
    let second = cloud { return 25 / 0 }

    cloud {
        try
            let! x,y = first <||> second
            return x + y
        with :? DivideByZeroException ->
            return -1
    }
Example: Map-Reduce
    let mapReduce (mapF : 'T -> Cloud<'S>)
                  (reduceF : 'S -> 'S -> Cloud<'S>)
                  (identity : 'S) (inputs : 'T list) =
        let rec aux inputs = cloud {
            match inputs with
            | [] -> return identity
            | [t] -> return! mapF t
            | _ ->
                // split the inputs into two halves
                let left,right = List.splitAt (List.length inputs / 2) inputs
                // process both halves in parallel, then combine
                let! s1, s2 = aux left <||> aux right
                return! reduceF s1 s2
        }
        aux inputs
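The divide-and-conquer structure can be exercised locally by swapping `cloud` for `async`. In the sketch below the `<||>` operator is a hypothetical re-implementation on top of `Async.StartChild`, not MBrace's operator, and `countWords` and `add` are made-up helpers for a small word-count run.

```fsharp
// Hypothetical local version of <||>: start the left job as a child, run the
// right job in the meantime, then wait for the left result.
let (<||>) (left : Async<'T>) (right : Async<'S>) : Async<'T * 'S> =
    async {
        let! pendingLeft = Async.StartChild left
        let! rightResult = right
        let! leftResult = pendingLeft
        return leftResult, rightResult
    }

// Same recursion as the slide, with async standing in for cloud.
let mapReduce (mapF : 'T -> Async<'S>)
              (reduceF : 'S -> 'S -> Async<'S>)
              (identity : 'S) (inputs : 'T list) : Async<'S> =
    let rec aux (inputs : 'T list) : Async<'S> =
        async {
            match inputs with
            | [] -> return identity
            | [ t ] -> return! mapF t
            | _ ->
                let left, right = List.splitAt (List.length inputs / 2) inputs
                let! (s1, s2) = aux left <||> aux right
                return! reduceF s1 s2
        }
    aux inputs

// Hypothetical usage: count the words across a few lines of text.
let countWords (line : string) : Async<int> = async { return line.Split(' ').Length }
let add (x : int) (y : int) : Async<int> = async { return x + y }

let wordCount =
    mapReduce countWords add 0 [ "a b"; "c d e"; "f" ]
    |> Async.RunSynchronously

printfn "%d" wordCount
```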
Demo 1
About that MapReduce workflow…
• Communication overhead.
  ◦ Data captured in cloud workflow closures.
  ◦ Needlessly passed between worker machines.
• Granularity issues.
  ◦ Each input entails a scheduling decision by the cluster.
  ◦ Cluster size not taken into consideration.
  ◦ Multicore capacity of worker nodes ignored.
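One common way to address the granularity issue is to group the inputs into one chunk per worker, so that the number of scheduled jobs follows the cluster size rather than the input size. The sketch below illustrates that idea locally with `async`; it is not taken from MBrace, and `chunkedSum` and its `workers` parameter are hypothetical.

```fsharp
// Coarse-grained variant: one job per chunk instead of one job per input.
let chunkedSum (workers : int) (f : int -> int) (inputs : int []) : Async<int> =
    async {
        let jobs =
            inputs
            // Array.splitInto yields at most `workers` chunks.
            |> Array.splitInto workers
            // Each chunk is mapped and reduced sequentially inside a single job.
            |> Array.map (fun chunk -> async { return Array.sumBy f chunk })
        // Only the per-chunk partial results travel back to the caller.
        let! partials = Async.Parallel jobs
        return Array.sum partials
    }

// Sum of squares of 1..100 computed with 4 jobs instead of 100.
let sumOfSquares =
    chunkedSum 4 (fun x -> x * x) [| 1 .. 100 |]
    |> Async.RunSynchronously

printfn "%d" sumOfSquares
```

With 100 inputs and 4 workers this makes 4 scheduling decisions instead of 100, and each job returns a single integer rather than one result per input.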
Distributed Data in MBrace
The Cloud Ref
    let createRef (data : string list) = cloud {
        let! cref = CloudRef.New data
        return cref : CloudRef<string list>
    }

    let deRef (cref : CloudRef<string list>) = cloud {
        return cref.Value
    }
The Cloud Ref
• Simplest data primitive in MBrace.
• References a value stored in the cluster.
• Conceptually similar to ML ref types.
• Immutable by design.
• Values cached in worker nodes for performance.
Disposable types
    cloud {
        use! data = CloudRef.New [| 1 .. 1000000 |]
        let! x,y = doSomething data <||> doSomethingElse data
        return x + y
    }
Demo 2
Performance
• We tested MBrace against Hadoop.
• Tests were staged on Windows Azure.
• Clusters of 4, 8, 16 and 32 Large Azure instances.
• Two algorithms were tested: grep and k-means.
• Source code available on GitHub.
Distributed grep
• Find occurrences of a given pattern in text files.
• Straightforward Map-Reduce algorithm.
• Input data was 32, 64, 128 and 256 GB of text.
K-means
• Centroid computation out of a set of vectors.
• Iterative algorithm.
• Not naturally describable in Map-Reduce workflows.
• Hadoop implementation using Apache Mahout.
• Input was 10^6 randomly generated 100-dimensional points.
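For readers unfamiliar with the algorithm, the iterative structure can be sketched as below. This is a minimal local version of the standard Lloyd iteration on tiny in-memory data, not the benchmark code; all function names (`distSq`, `nearest`, `step`, `iterate`) are hypothetical.

```fsharp
// Squared Euclidean distance between two points of equal dimension.
let distSq (a : float []) (b : float []) : float =
    Array.fold2 (fun acc x y -> acc + (x - y) * (x - y)) 0.0 a b

// Index of the centroid closest to point p.
let nearest (centroids : float [] []) (p : float []) : int =
    centroids
    |> Array.mapi (fun i c -> i, distSq c p)
    |> Array.minBy snd
    |> fst

// One iteration: assign every point to its nearest centroid, then move each
// centroid to the mean of its assigned points (unchanged if it has none).
let step (points : float [] []) (centroids : float [] []) : float [] [] =
    let groups = points |> Array.groupBy (nearest centroids) |> dict
    centroids
    |> Array.mapi (fun i c ->
        match groups.TryGetValue i with
        | true, members ->
            let n = float members.Length
            Array.init c.Length (fun d -> (members |> Array.sumBy (fun m -> m.[d])) / n)
        | false, _ -> c)

// Each iteration consumes the centroids produced by the previous one.
let rec iterate (n : int) (points : float [] []) (centroids : float [] []) : float [] [] =
    if n = 0 then centroids else iterate (n - 1) points (step points centroids)

// Tiny one-dimensional example with two obvious clusters.
let points = [| [| 0.0 |]; [| 1.0 |]; [| 9.0 |]; [| 10.0 |] |]
let centroids = iterate 2 points [| [| 0.0 |]; [| 10.0 |] |]
printfn "%A" centroids
```

The dependency of every iteration on the previous result is what makes the algorithm awkward to express as a single map-reduce pass.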
Future
• Better C# support.
  ◦ LinqOptimizer, LinqOptimizer.GPU and CloudLINQ.
  ◦ Support for the upcoming C# interactive.
• Open source.
  ◦ FsPickler, Thespian, CloudLINQ and other components of MBrace already published.
• Mono/Linux support.
Find more at
    http://www.m-brace.net
    http://github.com/nessos