Feature vectors – sequences of heterogenous types – are the basic unit of any machine learning algorithm. Further, feature engineering involves manipulations of these feature vectors and is a fundamental step in optimizing the accuracy of machine learning models. These manipulations may take the form of regular Scala sequence operations that can also be distributed using frameworks such as Spark or Flink. When building a general purpose machine learning framework, the types of engineered features is not known in advance, which is a problem for statically typed languages. In this talk, I will walk through possible solutions for designing type-safe feature vectors in Scala that provide compile-time type safety for feature engineering and other machine learning use cases. The solutions will demonstrate applications of Shapeless, Scala Macros, and Quasiquotes.
Presentation Recording - SBTB2016 https://www.youtube.com/watch?v=FfpSyXTx0uo