Large-scale data stream processing has come a long way to where it is today. It combines all the essential requirements of modern data analytics: subsecond latency, high throughput and impressively, strong consistency. Apache Flink is a system that serves as a proof-of-concept of these characteristics and it is mainly well-known for its lightweight fault tolerance. Data engineers and analysts can now let the system handle Terabytes of computational state without worrying about failures that can potentially occur.
This presentation describes all the fundamental challenges behind exactly-once processing guarantees in large-scale streaming in a simple and intuitive way. Furthermore, it demonstrate the basic and extended versions of Flink's state-of-the-art snapshotting algorithm tailored to the needs of a dataflow graph.