In this talk at Snowplow London Meetup #3, I introduced Tupilak, Snowplow’s unified log fabric. Putting a real-time event pipeline into production poses several challenges: the pipeline must scale automatically with event volumes, it needs constant monitoring to prevent data loss and minimise end-to-end lag, and it must be possible to upgrade and extend it with zero downtime. We call software that does all this a “unified log fabric”, to distinguish it from the unified logs (e.g. Kafka and Kinesis) and stream processing frameworks (e.g. Spark Streaming and Kafka Streams) that such a fabric monitors and orchestrates.
As part of incorporating Snowplow’s Kinesis-based event pipeline into our Managed Service, we developed our own unified log fabric, called Tupilak. I explained Tupilak’s core monitoring and scaling functions and showed live real-time pipelines visualised in the Tupilak UI. I then dived into its architecture, shared its basic scaling algorithm, and looked at how Tupilak itself is built on a Snowplow event stream. Finally, I covered the roadmap for Tupilak, including our plans to introduce lag-based auto-scaling and to port Tupilak to Kubernetes.