
Snowplow: where we came from and where we are going - March 2016


A look back at the history of the Snowplow platform, and a look forward at the key areas of product development on our roadmap

Published in: Data & Analytics

  1. Where we came from and where we’re going, March 2016
  2. Snowplow was born in 2012
     • Web data is rich, but GA / SiteCatalyst are limited: marketing, not product analytics; siloed, so it can’t be joined with other customer data
     • “Big data” tech: open source frameworks and cloud services
     • Snowplow: an open source clickstream data warehouse; event level, so any query is possible; built on top of CloudFront / EMR / Hadoop
  3. The plan: spend 6 months building a pipeline… then get back to using the data
  4. So what went wrong?
  5. Increased project scope
     • Clickstream data warehouse -> event analytics platform
     • Collect events from anywhere, not just the web
     • Make event data actionable in real time
     • Support more in-pipeline processing steps (enrichment and modeling)
     • Support more storage targets (where your data is has big implications for what you can do with that data)
  6. Track events from anywhere
     • Events
     • Entities
  7. Make event data actionable in real time
     • Personalization
     • Marketing automation
     • Content analytics
  8. Today, Snowplow is an event data pipeline
  9. What makes Snowplow special?
     • Data pipeline evolves with your business
     • Channel coverage
     • Flexibility in where your data is delivered
     • Flexibility in how your data is processed (enrichment and modeling)
     • Data quality
     • Speed
     • Transparency
  10. Used by 100s (1000s?) of companies… to answer their most important business questions
  11. But there’s still much more to build!
     • Improve automation around schema evolution (Iglu: machine-readable schema registry)
     • Make modeling event data easier, more robust, more performant (data modeling in Spark)
     • Support more storage targets (Druid, BigQuery, graph databases)
     • Make it easier to act on event data (Analytics SDKs, Sauna)
  12. Questions? We can take them now or after the other talks
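Slide 5’s in-pipeline processing steps (enrichment and modeling between collection and storage) can be sketched as a toy in-process pipeline. This is only an illustration of the shape of the idea, not Snowplow’s implementation, which runs on CloudFront / EMR / Hadoop; every name here is hypothetical:

```python
# Toy sketch of an event pipeline with pluggable enrichment steps.
# All names are hypothetical; this is not the Snowplow API.

def ip_to_geo(event):
    # Hypothetical enrichment: annotate the event with a geo lookup
    # keyed on the collected IP address.
    geo = {"1.2.3.4": "AU"}.get(event.get("user_ipaddress"))
    return {**event, "geo_country": geo}

def run_pipeline(raw_events, enrichments):
    # Apply each enrichment to each collected event, in order,
    # producing the enriched stream that storage targets consume.
    enriched = []
    for event in raw_events:
        for enrich in enrichments:
            event = enrich(event)
        enriched.append(event)
    return enriched

events = run_pipeline(
    [{"event": "page_view", "user_ipaddress": "1.2.3.4"}],
    [ip_to_geo],
)
print(events[0]["geo_country"])  # AU
```

Adding a processing step means appending another function to the enrichment list, which is the flexibility the slide is pointing at.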
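Slide 6’s events and entities are both expressed as self-describing JSON: a data payload paired with the Iglu URI of the JSON Schema it claims to conform to. A minimal sketch (the com.acme vendor, event name, and fields are hypothetical):

```python
import json

# A self-describing event: the envelope names the Iglu schema the
# payload should validate against. Vendor, name and fields are hypothetical.
video_play_event = {
    "schema": "iglu:com.acme/video_play/jsonschema/1-0-0",
    "data": {"videoId": "hero-clip", "positionSeconds": 12},
}

# An entity (context) attached to the event uses the same envelope,
# describing the thing the event happened to or the actor behind it.
user_entity = {
    "schema": "iglu:com.acme/user/jsonschema/1-0-0",
    "data": {"plan": "pro"},
}

print(json.dumps(video_play_event))
```

Because events and entities share one envelope, the same validation and warehousing machinery handles both.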
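Slide 11’s Iglu registry holds self-describing JSON Schemas: ordinary JSON Schema plus a `self` block giving vendor, name, format, and SchemaVer version, which is what makes schema evolution machine-readable. A sketch for a hypothetical com.acme `video_play` event (the vendor and fields are invented for illustration):

```python
# Sketch of a self-describing JSON Schema as stored in an Iglu registry.
# The "self" block lets the registry resolve it by vendor/name/version.
# The com.acme vendor and all fields are hypothetical.
video_play_schema = {
    "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
    "self": {
        "vendor": "com.acme",
        "name": "video_play",
        "format": "jsonschema",
        "version": "1-0-0",
    },
    "type": "object",
    "properties": {
        "videoId": {"type": "string"},
        "positionSeconds": {"type": "integer", "minimum": 0},
    },
    "required": ["videoId"],
    "additionalProperties": False,
}
```

Evolving the event then means publishing a new SchemaVer version (e.g. an additive 1-0-1) rather than silently changing the shape of the data, which is the automation the roadmap bullet is about.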