Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consulting at MLconf NYC - 4/15/16

Scripts that Scale with F# and mbrace.io:
Nothing beats interactive scripting for productive data exploration and rapid prototyping: grab data, run code, and iterate based on feedback. However, that story starts to break down once you need to process large datasets or run expensive computations. Your local machine becomes the bottleneck, and you are left with a slow and unresponsive environment.

In this talk, we will demonstrate with live examples how you can have your cake and eat it, too, using mbrace.io, a free, open-source engine for scalable cloud programming. Using a simple programming model, you can keep working from your favorite scripting environment and execute code interactively against a cluster on the Azure cloud. We will discuss the relevance of F# and mbrace in a data science and machine learning context, from parallelizing code and data processing in a functional style to leveraging F# type providers to consume data or even run R packages.

Published in: Technology

  1. Scripts that scale: F#, Azure & mbrace.io
  2. What the hell is F#? Why do I care?
     • You already use many languages
     • F# is probably not one of them
     I have 2 goals today: 1) Show you why F# is awesome for ML, 2) In particular with mbrace.io + Azure
  3. F# in general
     • Functional-first
     • Statically typed, lightweight syntax
     • Open Source, from Microsoft Research
     • Very active & engaged community
     • In the ML family (not machine learning)
     • First class .NET language
  4. What are pain-points in ML?
     • I need data!
     • I need clean data!
     • Damn, this is a lot of data!
     • Damn, when will this computation finish!
  5. F# is awesome at this
     • I need data! → Type Providers
     • I need clean data! → Pipelines with |>
     • Damn, this is a lot of data! → mbrace.io / CloudFlow
     • Slow computations → mbrace.io / cloud { }
  6. Why functional?
     • Functional ≈ immutable data + functions
     • map ≈ opportunity for parallelism
     Example: mapping f(x) = x + 1 over 1 2 3 4 5 6 gives 2 3 4 5 6 7
  7. Kaggle Home Depot competition
  8. Conclusion: F#
     • Awesome scripting language
     • Type providers
       • focus on data, not on how to get it
     • Natural data pipelines
     • Functional helps exploit parallelism
     • Code ready to ship in production
  9. Conclusion: mbrace.io + Azure
     • Simple, extensible model
     • Data parallelism
     • Compute parallelism
     • Run anything, distributed
     • Scale fast, for cheap, on Azure
  10. Thank you
     • More on F# at fsharp.org
     • Ping me as @brandewinder
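
The "map ≈ opportunity for parallelism" point from slide 6 can be illustrated with a minimal F# script sketch. It is not taken from the talk: it uses only `Array.map` and `Array.Parallel.map` from FSharp.Core and runs on local cores rather than on an mbrace.io cluster, and the names `f`, `inputs`, `viaMap` and `viaParallelMap` are chosen here for illustration.

```fsharp
// Slide 6's example function: f(x) = x + 1.
let f x = x + 1

// Immutable input data, as on the slide.
let inputs = [| 1; 2; 3; 4; 5; 6 |]

// Sequential map, written as a pipeline with |>.
let viaMap = inputs |> Array.map f

// Parallel map over local cores. Because f has no side effects and each
// element is processed independently, the result is the same as above
// and keeps the input order.
let viaParallelMap = inputs |> Array.Parallel.map f

printfn "%A" viaMap          // prints [|2; 3; 4; 5; 6; 7|]
printfn "%A" viaParallelMap  // prints [|2; 3; 4; 5; 6; 7|]
```

Swapping `Array.map` for `Array.Parallel.map` changes only where the work runs, not what is computed, which is the property the slides attribute to functional code.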
