F# IN YOUR PIPE
COPENHAGEN R
Phil Trelford
@ptrelford
F#UNCTIONAL LONDONERS
Founded Feb 2010
800 Members
Meet every 2 weeks
Key topics:
 • Functional Programming
 • Machine Learning
 • Apps and Games
http://meetup.com/fsharplondon
CASE STUDIES
Phillip Trelford, @ptrelford
Copenhagen R, 2014
F# TOOLS FOR HALO 3
Questions
Controllable player skill distribution (slow down!)
Controllable skills distributions (re-ordering)
Simulations
Large scale simulation of 8,000,000,000 matches
Distributed computation – 15 machines for 2 weeks
Tools
Result viewer (Logged results: 52GB of data)
Real-time simulator of partial update
ADCENTER
Weeks of data in training:
7,000,000,000 impressions
2 weeks of CPU time during sessions:
2 wks x 7 days x 86,400 sec/day = 1,209,600 sec
Learning algorithm speed requirement:
5,787 impression updates/sec
172.8 µs per impression update
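The speed requirement follows directly from the figures on the slide. A minimal sketch of the arithmetic, assuming the whole two-week budget is spent on impression updates (the variable names are illustrative, not from the talk):

```fsharp
// Figures taken from the slide.
let impressions = 7.0e9                   // impressions in the training data
let seconds = 2.0 * 7.0 * 86400.0         // 2 weeks expressed in seconds

// Required throughput and the resulting time budget per update.
let updatesPerSec = impressions / seconds
let microsPerUpdate = 1.0e6 / updatesPerSec

printfn "%.0f updates/sec, %.1f µs per update" updatesPerSec microsPerUpdate
// prints: 5787 updates/sec, 172.8 µs per update
```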
TESTIMONIALS – F# IN MACHINE LEARNING
FSHARP.ORG/TESTIMONIALS - MICROSOFT
For a machine learning scientist, speed of experimentation is the critical factor to optimize. Compiling is fast but loading large amounts of data in memory takes a long time. With F#’s REPL, you only need to load the data once and you can then code and explore in the interactive environment. Unlike C# and C++, F# was designed for this mode of interaction.
- Patrice Simard, Microsoft
FSHARP.ORG/TESTIMONIALS - AMYRIS BIOTECH
F# has been phenomenally useful. I would be writing a lot of this in Python otherwise and F# is more robust, 20x - 100x faster to run and for anything but the most trivial programs, faster to develop.
- Darren Platt, Amyris Biotechnology
LIVE DEMOS
PIPE IT
load(transform(extract(data)))
f3( f2( f1( x ) ) )
let (|>) x f = f x
x |> f1 |> f2 |> f3
data |> extract |> transform |> load
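The two spellings on the slide are equivalent: the pipe operator only moves the argument to the front. A small sketch, assuming hypothetical `extract`, `transform` and `load` stages (the talk does not show their bodies); `|>` is already defined in the F# core library, so it is not redefined here:

```fsharp
// Hypothetical stages, named after the slide's extract/transform/load.
let extract (csv: string) = csv.Split(',') |> Array.map int    // parse numbers
let transform (xs: int[]) = xs |> Array.map (fun x -> x * x)   // square each
let load (xs: int[]) = Array.sum xs                            // reduce to a total

let data = "1,2,3"

// Nested calls read inside-out; the piped form reads left to right.
let nested = load (transform (extract data))
let piped = data |> extract |> transform |> load

printfn "%d %d" nested piped   // prints: 14 14
```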
K-MEANS CLUSTERING ALGORITHM
(* K-Means Algorithm *)
// Assumes the helpers `distance`, `average` and `Seq.iterate` are in scope;
// `Seq.iterate` is not part of FSharp.Core.

/// Group all the vectors by the nearest center.
let classify centroids vectors =
    vectors |> Array.groupBy (fun v -> centroids |> Array.minBy (distance v))

/// Repeatedly classify the vectors, starting with the seed centroids
let computeCentroids seed vectors =
    seed |> Seq.iterate (fun centers ->
        classify centers vectors
        |> Array.map (snd >> average))
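The slide's code relies on helpers it does not show. The sketch below fills them in so the algorithm can be run in F# Interactive; `distance` (Euclidean), `average` (component-wise mean) and `Seq.iterate` are assumptions made for illustration, not necessarily the definitions used in the demo, and the sample points are invented.

```fsharp
// Assumed helper: Seq.iterate is not in FSharp.Core, so define it here.
module Seq =
    /// Infinite sequence x, f x, f (f x), ...
    let iterate f x = Seq.unfold (fun state -> Some (state, f state)) x

/// Assumed helper: Euclidean distance between two vectors of equal length.
let distance (a: float[]) (b: float[]) =
    Array.fold2 (fun acc x y -> acc + (x - y) * (x - y)) 0.0 a b |> sqrt

/// Assumed helper: component-wise mean of a non-empty array of vectors.
let average (vs: float[][]) =
    let n = float vs.Length
    Array.init vs.[0].Length (fun i -> (vs |> Array.sumBy (fun v -> v.[i])) / n)

/// Group all the vectors by the nearest center.
let classify centroids vectors =
    vectors |> Array.groupBy (fun v -> centroids |> Array.minBy (distance v))

/// Repeatedly classify the vectors, starting with the seed centroids.
let computeCentroids seed vectors =
    seed |> Seq.iterate (fun centers ->
        classify centers vectors |> Array.map (snd >> average))

// Invented sample data: two obvious clusters, seeded with one point from each.
let points = [| [| 1.0; 1.0 |]; [| 1.0; 2.0 |]; [| 9.0; 9.0 |]; [| 10.0; 9.0 |] |]
let seed = [| points.[0]; points.[2] |]

// The sequence is infinite, so take the centers after a fixed number of steps.
let centers = computeCentroids seed points |> Seq.item 5
printfn "%A" centers
```

Because `computeCentroids` returns a lazy, infinite sequence, the caller decides when to stop, here after five iterations. A center that attracts no vectors drops out of the `groupBy` result, so the number of centers can shrink between steps.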
TYPE PROVIDERS
JSON
XML
CSV
Excel
SQL
R
MATLAB
Hadoop
…
CSV TYPE PROVIDER
R – TYPE PROVIDER
FSLAB
Data
Charting
Deedle
Math.Net
R Provider
RESOURCES
TRY IT BEFORE YOU BUY IT
BUY THE BOOK
JOIN THE COMMUNITY
F# Foundation
http://fsharp.org
Copenhagen F# Meetup
http://www.meetup.com/MoedegruppeFunktionelleKoebenhavnere/
Progressive F# Tutorials London
https://skillsmatter.com/conferences/1926-progressive-f-tutorials-2014


Editor's Notes

  • #14 http://fsharp.github.io/FSharp.Data/library/CsvProvider.html
    http://clear-lines.com/blog/post/Random-Forest-classification-in-F-first-cut.aspx