Budapest Spark Meetup - Apache Spark @enbrite.ly presentation held on
March 30, 2016.
The vision we all share at enbrite.ly is to create the next-generation decision support system for online advertising, one that combines the market's needs (anti-fraud, viewability, brand safety, and traffic quality assurance) in a single platform. We do this by analyzing vast amounts of data to create value for our customers. Over the last 6 months we built our ETL pipeline, the core component of our data platform, on Apache Spark. In this presentation I share the journey from whiteboard designs to the maintenance of a TB-scale data pipeline, along with the lessons we learned and the ups and downs of using Spark at scale.