Doing data science with F#


The ability to take data, understand it, visualize it and extract useful information from it is becoming a hugely important skill. How can you turn all those logs, histories of purchases and trades, or open government data into useful information that helps your business make money?

In this talk, we’ll look at doing data science using F#. The F# language is perfectly suited for this task: type providers integrate external data directly into the language, so your language suddenly _understands_ CSV, XML, JSON, REST services and other sources. The interactive development style makes it easy to explore data and test your algorithms as you’re writing them. A rich set of libraries for working with data frames, time series and visualization gives you all the tools you need. And finally, F# integrates easily with statistical environments like R and Matlab, giving you access to industry-standard libraries.

  1. Doing data science with F# Tomas Petricek | @tomaspetricek PhD Student at Cambridge & Coordinator of
  2. software stacks trainings mac and linux teaching F# user groups snippets community books and tutorials F# Software Foundation consulting open-source MonoDevelop contributions research support cross-platform mailing lists
  3. Community matters!
  4. All the Data of the World
  5. data acquisition statistics data cleaning machine learning data transformation visualization type providers F# Data Science Working Group kaggle vega grammar R provider data sources presentation time-series visualization data aggregation
  6. Acquire, Visualize, Analyze
  7. Demo: Analyzing Titanic survivors
  8. Deedle data frame: data exploration; indexing and aggregation. F# Charting library: simple & composable; interactive style.
  9. Demo: Understanding the world
  10. F# Data type providers: first-class data; CSV, REST, WorldBank… R type provider: statistics & visualization; 5000 tested packages.
  11. Demo: US debt over the last century
  12. Deedle data frame: time-series alignment; data transformations. Vega visualization: F# wrapper for Vega; pre-alpha version.
  13. F# for Data Science: acquire, analyze, visualize; interactive experience; safety and efficiency of .NET; ready for production. @tomaspetricek
  14. Going forward: use #fsharp for fun & profit; join local user groups; help us build data science tools. @tomaspetricek