Presented at Data Science Meetup on 4/12/2018.
In this talk, Grega Kespret (head of analytics group) will present Celtra’s data analytics pipeline and how it evolved through the years - sometimes forward, sometimes backward. On this journey, we became early adopter of different technologies: BigQuery, Vertica (pre-join projections), Spark (version 0.5), Databricks (beta users) and Snowflake (one of the first users). As the business grew and the product evolved, volume and complexity of data increased ten-fold, as has the number of users generating insights from this data. How come BigQuery did not scale? Why was choosing Vertica a mistake for our use case, and what have we learned from it? What requirements did we have for the analytics database, why did we have to abandon MySQL, and why we finally chose Snowflake? This talk will be heavily opinionated and will describe our experience and learnings - what worked for us and what didn't.