Talk at the London Data Science Meetup, March 23, 2016: http://www.meetup.com/Data-Science-London/events/229755935/

At Schibsted Technology we have been building predictive pipelines to model our users' attributes and behavioural traits, such as age, gender, interests and (buying) intent. Predictive models of these properties are fundamental to core parts of the business, such as our ad-targeting platform. In this talk I will present the challenges we had to overcome to put together our scalable predictive pipelines, and how we used Spark ML features such as User Defined Aggregate Functions (UDAFs) to achieve this.