The document discusses automating data science on Spark using a Bayesian approach for hyper-parameter tuning, featuring various methods such as Bayesian optimization and a machine learning pipeline. It includes experimental results from two case studies: one focusing on SF crime data and the other on airline delays, highlighting hyper-parameter selections and experiments conducted for optimization. The authors promote the use of tools like Spearmint and Hyperopt for enhancing automated data science workflows.