Stream Processing Systems
Karthik Ramasamy
Twitter
@karthikz
Value of Real Time Data
It’s contextual
[1] Courtesy Michael Franklin, BIRTE, 2015.
Heron
Design: Goals
Batching of tuples: amortizing the cost of transferring tuples
Task isolation: ease of debuggability, isolation, and profiling
Fully API compatible with Storm: directed acyclic graph; topologies, spouts, and bolts
Support for back pressure: topologies should be self-adjusting
Use of mainstream languages: C++, Java, and Python
Efficiency: reduce resource consumption
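
The batching and back pressure goals can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Heron's implementation: `BoundedBuffer`, `make_batches`, `run`, and the tick-based scheduling are hypothetical names and simplifications introduced here. A fast spout hands batches to a slow bolt through a bounded buffer; when the buffer is full the spout waits rather than dropping tuples.

```python
from collections import deque


class BoundedBuffer:
    """Hypothetical bounded queue sitting between a spout and a bolt."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.batches = deque()

    def is_full(self):
        return len(self.batches) >= self.capacity

    def put(self, batch):
        if self.is_full():
            raise RuntimeError("buffer full; caller should have backed off")
        self.batches.append(batch)

    def take(self):
        # Return the oldest batch, or None when the buffer is empty.
        return self.batches.popleft() if self.batches else None


def make_batches(tuples, batch_size):
    """Group tuples so that one transfer carries many tuples."""
    return [tuples[i:i + batch_size] for i in range(0, len(tuples), batch_size)]


def run(tuples, batch_size=4, capacity=2, drain_every=3):
    """Simulate a fast spout feeding a slow bolt (assumed toy model).

    The spout tries to send one batch per tick; the bolt drains one batch
    every `drain_every` ticks. When the buffer is full the spout skips its
    turn instead of dropping data, which is the back pressure signal.
    Returns (processed_tuples, transfers, stalled_ticks).
    """
    pending = deque(make_batches(tuples, batch_size))
    buffer = BoundedBuffer(capacity)
    processed, transfers, stalled, tick = [], 0, 0, 0
    while pending or buffer.batches:
        tick += 1
        if pending:
            if buffer.is_full():
                stalled += 1  # spout is throttled on this tick
            else:
                buffer.put(pending.popleft())
                transfers += 1
        if tick % drain_every == 0:
            batch = buffer.take()
            if batch is not None:
                processed.extend(batch)
    return processed, transfers, stalled
```

Calling `run(list(range(20)))` moves 20 tuples in 5 transfers instead of 20, and the nonzero stall count shows the spout slowing to the bolt's pace with no tuple lost.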
Better Storm
Twitter Heron
Container Based Architecture
Separate Monitoring and Scheduling
Simplified Execution Model
Much Better Performance
Heron
Sample Topologies
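
As a rough illustration of what a topology looks like, here is a toy word count wired as spout, then split bolt, then count bolt. It is plain Python and does not use the Storm or Heron API; `sentence_spout`, `split_bolt`, `count_bolt`, `run_topology`, and the sample sentences are hypothetical, chosen only to show the directed acyclic graph shape.

```python
from collections import Counter


def sentence_spout():
    """Hypothetical spout: emits a fixed stream of sentence tuples."""
    for sentence in ["the cow jumped", "the moon rose", "the cow slept"]:
        yield sentence


def split_bolt(sentences):
    """Bolt: splits each sentence tuple into individual word tuples."""
    for sentence in sentences:
        for word in sentence.split():
            yield word


def count_bolt(words):
    """Bolt: keeps a running count per word and returns the final counts."""
    counts = Counter()
    for word in words:
        counts[word] += 1
    return counts


def run_topology():
    """Wire spout -> split -> count, a three-node directed acyclic graph."""
    return count_bolt(split_bolt(sentence_spout()))
```

Each stage consumes the previous stage's stream lazily, so tuples flow through the graph one at a time, much as they would between a spout and its downstream bolts.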
Heron@Twitter
Storm is decommissioned
3X reduction in resource usage
Auto scaling the system in the presence of unpredictability
Technology Challenges
The Road Ahead
Auto tuning of real time analytics jobs/queries
Exploiting faster networks for efficiently moving data
Get in Touch
@karthikz

The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter