- 1. ALL YOUR TYPES ARE BELONG TO US! PHILLIP TRELFORD, @PTRELFORD DDD DUNDEE 2013, #DUNDDD
- 2. F#UNCTIONAL LONDONERS Meetup: • 600 members • 50 meetups • Meets every 2 weeks • Talks & Hands On Topics: • Finance • Machine Learning • Big Data • Gaming
- 4. F# TESTIMONIALS – MACHINE LEARNING
- 5. FSHARP.ORG/TESTIMONIALS For a machine learning scientist, speed of experimentation is the critical factor to optimize. Compiling is fast but loading large amounts of data in memory takes a long time. With F#’s REPL, you only need to load the data once and you can then code and explore in the interactive environment. Unlike C# and C++, F# was designed for this mode of interaction. - Patrice Simard, Microsoft
- 6. FSHARP.ORG/TESTIMONIALS - AMYRIS BIOTECH F# has been phenomenally useful. I would be writing a lot of this in Python otherwise and F# is more robust, 20x - 100x faster to run and for anything but the most trivial programs, faster to develop. - Darren Platt, Amyris Biotechnology
- 7. CASE STUDIES
- 8. F# TOOLS FOR HALO 3 Questions: • Controllable player skill distribution (slow down!) • Controllable skills distributions (re-ordering) Simulations: • Large-scale simulation of 8,000,000,000 matches • Distributed computation: 15 machines for 2 weeks Tools: • Result viewer (logged results: 52 GB of data) • Real-time simulator of partial update
- 9. ADCENTER Weeks of data in training: • 7,000,000,000 impressions 2 weeks of CPU time during sessions: • 2 wks × 7 days × 86,400 sec/day Learning algorithm speed requirement: • 5,787 impression updates/sec • 172.8 µs per impression update
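The speed requirement on this slide can be checked by dividing the impression count by the seconds available in two weeks. The working below uses only the slide's own figures; the symbol $T$ is introduced here for convenience.

$$
\begin{aligned}
T &= 2 \times 7 \times 86{,}400\ \text{s} = 1{,}209{,}600\ \text{s} \\
\text{required rate} &= \frac{7 \times 10^{9}\ \text{impressions}}{1{,}209{,}600\ \text{s}} \approx 5{,}787\ \text{updates/s} \\
\text{time per update} &= \frac{1{,}209{,}600\ \text{s}}{7 \times 10^{9}} = 172.8\ \mu\text{s}
\end{aligned}
$$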
- 10. LIVE DEMOS
- 11. TYPE PROVIDERS: JSON

          open FSharp.Data
          type Simple = JsonProvider<"sample.js">
          let simple = Simple.Parse(""" { "name":"Tomas", "age":4 } """)
          simple.Age
- 13. SPLIT DATA SET (FROM ML IN ACTION)

      Python

          def splitDataSet(dataSet, axis, value):
              retDataSet = []
              for featVec in dataSet:
                  if featVec[axis] == value:
                      reducedFeatVec = featVec[:axis]
                      reducedFeatVec.extend(featVec[axis+1:])
                      retDataSet.append(reducedFeatVec)
              return retDataSet

      F#

          let splitDataSet(dataSet, axis, value) =
              [|for featVec in dataSet do
                  if featVec.[axis] = value then
                      yield featVec |> Array.removeAt axis|]
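To show the F# version of `splitDataSet` in use, here is a minimal sketch under two assumptions that are not on the slide: the rows are annotated as `string[][]` so the indexer type-checks, and the compiler is F# 6 or later, where `Array.removeAt` ships in FSharp.Core. The toy data set is hypothetical.

```fsharp
// Sketch only: needs F# 6 or later, where Array.removeAt is part of FSharp.Core.
// The string[][] annotation is an assumption added so the indexer type-checks.
let splitDataSet (dataSet: string[][], axis: int, value: string) =
    [| for featVec in dataSet do
         if featVec.[axis] = value then
             yield featVec |> Array.removeAt axis |]

// Hypothetical toy data: two features followed by a class label.
let dataSet =
    [| [| "1"; "1"; "yes" |]
       [| "1"; "0"; "no" |]
       [| "0"; "1"; "no" |] |]

// Keep the rows whose first feature is "1", dropping that column.
let subset = splitDataSet (dataSet, 0, "1")
printfn "%A" subset
```

Only the first two rows match, and each loses its first column, so `subset` holds `[| "1"; "yes" |]` and `[| "0"; "no" |]`.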
- 14. K-MEANS CLUSTERING ALGORITHM

          (* K-Means Algorithm *)

          /// Group all the vectors by the nearest center.
          let classify centroids vectors =
              vectors |> Array.groupBy (fun v -> centroids |> Array.minBy (distance v))

          /// Repeatedly classify the vectors, starting with the seed centroids
          let computeCentroids seed vectors =
              seed |> Seq.iterate (fun centers -> classify centers vectors |> Array.map (snd >> average))
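The slide's k-means code relies on `distance`, `average` and `Seq.iterate`, none of which it defines, and `Seq.iterate` is not part of FSharp.Core. The sketch below fills them in so the two slide functions run as a script. The Euclidean `distance`, the component-wise `average`, the `iterate` helper and the `converge` stopping rule are all assumptions made for illustration, not the speaker's definitions.

```fsharp
/// Assumed metric: Euclidean distance between two equal-length vectors.
let distance (a: float[]) (b: float[]) =
    Array.map2 (fun x y -> (x - y) ** 2.0) a b |> Array.sum |> sqrt

/// Assumed helper: component-wise mean of a non-empty group of vectors.
let average (vectors: float[][]) =
    let n = float vectors.Length
    Array.init vectors.[0].Length (fun i ->
        (vectors |> Array.sumBy (fun v -> v.[i])) / n)

/// Hypothetical stand-in for the slide's Seq.iterate:
/// the infinite lazy sequence x, f x, f (f x), ...
let iterate f x = Seq.unfold (fun state -> Some (state, f state)) x

/// Group all the vectors by the nearest center (as on the slide).
let classify centroids vectors =
    vectors |> Array.groupBy (fun v -> centroids |> Array.minBy (distance v))

/// Repeatedly classify the vectors, starting with the seed centroids.
let computeCentroids seed vectors =
    seed |> iterate (fun centers ->
        classify centers vectors |> Array.map (snd >> average))

/// Assumed stopping rule: stop once two consecutive centroid sets are equal.
let converge vectors seed =
    computeCentroids seed vectors
    |> Seq.pairwise
    |> Seq.find (fun (previous, next) -> previous = next)
    |> snd

// Hypothetical data: two well-separated pairs of points.
let points = [| [| 0.0; 0.0 |]; [| 0.0; 1.0 |]; [| 10.0; 10.0 |]; [| 10.0; 11.0 |] |]
let centroids = converge points [| [| 0.0; 0.0 |]; [| 10.0; 10.0 |] |]
printfn "%A" centroids
```

Two caveats on this sketch: a centroid that attracts no vectors drops out, because `Array.groupBy` only returns non-empty groups, and the exact-equality stopping test suits small examples like this one rather than large or noisy data.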
- 15. R – TYPE PROVIDER
- 16. WORLD BANK DATA
- 17. RESOURCES
- 18. TYPE PROVIDERS • JSON • XML • CSV • Excel • SQL • R • MATLAB • Hadoop • ...
- 19. TRYFSHARP.ORG
- 20. BUY THE BOOK
- 21. GET THE T-SHIRT
- 22. MACHINE LEARNING JOB TRENDS • Source: indeed.co.uk
- 23. QUESTIONS

- Fsharp.org map
- http://fsharp.github.io/FSharp.Data/library/CsvProvider.html
- http://clear-lines.com/blog/post/Random-Forest-classification-in-F-first-cut.aspx
- http://www.indeed.com/jobanalytics/jobtrends?q=machine+learning&l=