Metrics play an important role in data-driven companies like LinkedIn, where we leverage them extensively for reporting, experimentation, and in-product applications. We built an offline platform to help people define and produce metrics driven through their transformation code, mostly in Pig or Hive, and metadata-rich configurations. Many of our users would like to look at these metrics in a real-time fashion. To support this, we recently built an extension to the platform that auto-generates Samza real-time flow from existing offline transformation code with just a single command. Combining with the existing offline platform, we delivered Lambda architecture without maintaining multiple code bases.
In this talk, we will describe how we use Apache Calcite to translate our offline logic, served as the single source of truth, into both Samza code and configuration for real-time execution.