5. FSHARP.ORG/TESTIMONIALS
For a machine learning scientist, speed of experimentation is the critical factor to optimize.
Compiling is fast but loading large amounts of data in memory takes a long time.
With F#’s REPL, you only need to load the data once
and you can then code and explore in the interactive environment.
Unlike C# and C++, F# was designed for this mode of interaction.
- Patrice Simard, Microsoft
6. FSHARP.ORG/TESTIMONIALS - AMYRIS BIOTECH
F# has been phenomenally useful.
I would be writing a lot of this in Python otherwise
and F# is more robust, 20x - 100x faster to run
and for anything but the most trivial programs,
faster to develop.
- Darren Platt, Amyris Biotechnology
8. F# TOOLS FOR HALO 3
Questions
• Controllable player skill distribution (slow down!)
• Controllable skills distributions (re-ordering)
Simulations
• Large scale simulation of 8,000,000,000 matches
• Distributed computation – 15 machines for 2 weeks
Tools
• Result viewer (Logged results: 52GB of data)
• Real-time simulator of partial update
9. ADCENTER
Weeks of data in training:
• 7,000,000,000 impressions
2 weeks of CPU time during sessions:
• 2 weeks × 7 days/week × 86,400 sec/day = 1,209,600 sec
Learning algorithm speed requirement:
• 5,787 impression updates/sec
• 172.8 µs per impression update
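The figures on this slide follow from dividing the impression count by the available training time. A small F# sketch of that arithmetic is given here; the variable names are illustrative and do not come from the slide.

```fsharp
// Back-of-the-envelope check of the slide's numbers (names are illustrative).
let impressions = 7e9                         // 7,000,000,000 impressions
let trainingSeconds = 2.0 * 7.0 * 86400.0     // 2 weeks of wall-clock time, in seconds

// Required throughput, and the time budget for a single update.
let updatesPerSecond = impressions / trainingSeconds
let microsecondsPerUpdate = 1e6 / updatesPerSecond

printfn "%.0f updates/sec, %.1f µs per update" updatesPerSecond microsecondsPerUpdate
// prints: 5787 updates/sec, 172.8 µs per update
```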
11. TYPE PROVIDERS: JSON
open FSharp.Data
type Simple = JsonProvider<"sample.js">
let simple = Simple.Parse(""" { "name":"Tomas", "age":4 } """)
simple.Age
13. SPLIT DATA SET (FROM ML IN ACTION)
Python
def splitDataSet(dataSet, axis, value):
    retDataSet = []
    for featVec in dataSet:
        if featVec[axis] == value:
            reducedFeatVec = featVec[:axis]
            reducedFeatVec.extend(featVec[axis+1:])
            retDataSet.append(reducedFeatVec)
    return retDataSet
F#
// The array annotation lets the compiler resolve the indexer featVec.[axis].
let splitDataSet(dataSet: _[][], axis, value) =
    [| for featVec in dataSet do
         if featVec.[axis] = value then
           yield featVec |> Array.removeAt axis |]
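As a usage sketch, the F# version can be tried on a small hand-made data set. The toy rows below are hypothetical (two feature columns followed by a 0/1 label), the function is repeated with explicit `int` annotations so the snippet stands alone, and `Array.removeAt` is assumed to be available, which requires FSharp.Core 6.0 or later.

```fsharp
// Copy of the slide's function, annotated so that it compiles standalone.
let splitDataSet (dataSet: int[][], axis: int, value: int) =
    [| for featVec in dataSet do
         if featVec.[axis] = value then
             // Keep the row, minus the column we just matched on.
             yield featVec |> Array.removeAt axis |]

// Hypothetical toy data: feature 0, feature 1, label.
let data = [| [|1; 1; 1|]; [|1; 1; 1|]; [|1; 0; 0|]; [|0; 1; 0|]; [|0; 1; 0|] |]

// Rows whose feature 0 equals 1, with that column removed.
let kept = splitDataSet (data, 0, 1)
// kept = [| [|1; 1|]; [|1; 1|]; [|0; 0|] |]
printfn "%A" kept
```

The input array is left unmodified: the comprehension builds a new array of new rows, which mirrors what the Python version does with its slices.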
14. K-MEANS CLUSTERING ALGORITHM
(* K-Means Algorithm *)
// `distance`, `average`, and `Seq.iterate` are helpers not shown on this slide.
/// Group all the vectors by the nearest center.
let classify centroids vectors =
    vectors |> Array.groupBy (fun v -> centroids |> Array.minBy (distance v))

/// Repeatedly classify the vectors, starting with the seed centroids.
let computeCentroids seed vectors =
    seed |> Seq.iterate (fun centers ->
        classify centers vectors |> Array.map (snd >> average))
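The slide's code depends on `distance`, `average`, and `Seq.iterate`, none of which are defined on the slide or in FSharp.Core. The sketch below fills them in under stated assumptions: Euclidean distance, a component-wise mean, and an `iterate` helper built on `Seq.unfold`. These definitions are one plausible reading and not necessarily the ones the author used.

```fsharp
// Assumed helper: Euclidean distance between two vectors of equal length.
let distance (a: float[]) (b: float[]) =
    Array.fold2 (fun acc x y -> acc + (x - y) * (x - y)) 0.0 a b |> sqrt

// Assumed helper: component-wise mean of a non-empty group of vectors.
let average (vs: float[][]) =
    let n = float vs.Length
    Array.init vs.[0].Length (fun i -> (vs |> Array.sumBy (fun v -> v.[i])) / n)

// Assumed helper: the infinite lazy sequence x, f x, f (f x), ...
module Seq =
    let iterate f x = Seq.unfold (fun s -> Some (s, f s)) x

/// Group all the vectors by the nearest center.
let classify centroids vectors =
    vectors |> Array.groupBy (fun v -> centroids |> Array.minBy (distance v))

/// Repeatedly classify the vectors, starting with the seed centroids.
let computeCentroids seed vectors =
    seed |> Seq.iterate (fun centers ->
        classify centers vectors |> Array.map (snd >> average))

// Hypothetical toy data: two well-separated pairs of points.
let vectors = [| [|0.0; 0.0|]; [|0.0; 1.0|]; [|10.0; 10.0|]; [|10.0; 11.0|] |]
let seed = [| [|0.0; 0.0|]; [|10.0; 10.0|] |]

// The sequence is infinite, so the caller decides how many iterations to take.
let centers = computeCentroids seed vectors |> Seq.item 5
// centers = [| [|0.0; 0.5|]; [|10.0; 10.5|] |]
printfn "%A" centers
```

Two properties of this sketch are worth noting. The sequence never terminates on its own, so a caller has to pick an iteration count or compare successive elements for convergence. A centroid that attracts no vectors produces no group and silently drops out of the next iteration.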