We will discuss what feature engineering is all about , various techniques to use and how to scale to 20000 column datasets using random forest, svd, pca. Also demonstrated is how we can build a service around these to save time and effort when building 100s of models. We will share how we did all this using spark ml to build logistic regression, neural networks, Bayesian networks, etc.