Heap's analytics infrastructure is built around PostgreSQL. The most important choice to make when building a system this way is the schema you'll use to represent your data. This foundation will determine your write throughput, what sorts of read queries will be fast, what indexing strategies will be available to you, and what data inconsistencies will be possible. With the wrong choice, you won't be able to leverage PostgreSQL's most powerful features.
This talk walks through the different schemas we've used to power Heap over the last three years, their relative strengths and weaknesses, and the mistakes we've made.