This document discusses a study on integrating machine learning tools into a Hadoop-based analytics platform at Twitter. Specifically, it focuses on extending the Pig dataflow language to provide predictive analytics capabilities using machine learning algorithms like logistic regression and decision trees. Key points discussed include developing storage functions in Pig for both batch and online learning algorithms, training models by streaming training data through the functions, and scaling out training via model ensembles. An example application of sentiment analysis on tweets using these new Pig extensions is also described to illustrate the end-to-end workflow.