Twitter is a stream processing pioneer, handling some of the biggest streams on the internet. Storm's introduction five years ago was a revolution in real-time distributed computing, but after a few years Twitter decided to replace it. What were Storm's issues? Why was Heron implemented? Why was using an existing engine (e.g. Samza, Spark, Flink) not an option? Is this a revolution in open source stream processing? I will give you a brief overview of Twitter's stream processing history, with interesting technical details.
13. Storm at Twitter
(2013)
● Benchmarked at a million tuples processed per second
● Running 30 topologies in a 200-node cluster
● Processing 50 billion messages a day with an average complete latency under 50 ms
http://www.slideshare.net/KrishnaGade2/storm-at-twitter/39-numbers_benchmarked_at_a_million
14. Storm is very powerful, but...
http://www.slideshare.net/KrishnaGade2/storm-at-twitter
15. Apache Storm issues
Performance
● Every worker is homogeneous, which results in inefficient utilization of allocated resources
● There is no backpressure mechanism
● Topologies using a large amount of RAM for a worker encounter GC cycles greater than a minute
Debugging
● Each worker runs a mix of tasks
● Logs from multiple tasks are written into a single file
● Each tuple has to pass through four threads in the worker process from the point of entry to the point of exit
Scheduling
● Multiple levels of scheduling
● A single task failure takes down the whole worker process
● Nimbus is a single point of failure
https://blog.acolyer.org/2015/06/15/twitter-heron-stream-processing-at-scale/
16. Enhancing Storm would take too long, and no other system met their
scaling, throughput, and latency needs. Moreover, other systems were not
compatible with Storm’s API, which would have required rewriting all
topologies. The decision was to create Heron, but keep its external API
compatible with Storm’s.
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
Twitter's approach...
17. Flying faster with Twitter Heron
Tuesday, June 2, 2015 | By Karthik Ramasamy (@karthikz), Engineering Manager
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
18. Flying Faster with Twitter Heron
Scheduler
Pluggable solution, fitted to Twitter's infrastructure: Apache Mesos + Apache Aurora
Back Pressure
Automatically slows down tuple production when queues are overloaded
Easy Debugging
Moved from the typical thread-based system to a process-based system (running each task in isolation)
Compatibility with Storm
Easy migration from Storm to Heron (see the topology sketch below)
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
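To make the compatibility point concrete, here is a minimal sketch of a word-count topology written against Storm's public TopologyBuilder API; because Heron keeps Storm's external API, code like this is meant to move between the two engines without a rewrite. RandomWordSpout and CountBolt are hypothetical user classes (a possible CountBolt is sketched under slide 19), and the org.apache.storm packages are Storm 1.x's; older Storm and early Heron code used the backtype.storm packages instead.

    // Minimal sketch: a word-count topology against Storm's public API.
    // RandomWordSpout and CountBolt are hypothetical user classes.
    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.tuple.Fields;

    public class WordCountTopology {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("words", new RandomWordSpout(), 4);
            builder.setBolt("count", new CountBolt(), 8)
                   .fieldsGrouping("words", new Fields("word")); // partition by word

            Config conf = new Config();
            conf.setNumWorkers(4);
            // Submitting to Heron means swapping the submission tooling,
            // not rewriting the topology definition itself.
            StormSubmitter.submitTopology("word-count", conf, builder.createTopology());
        }
    }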
19. Heron Performance
We compared the performance of Heron with Twitter’s production version of Storm, which was forked from an open source
version in October 2013, using a word count topology. This topology counts the distinct words in a stream generated from a
set of 150,000 words.
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
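For readers unfamiliar with the shape of such a benchmark, here is a hypothetical sketch of the counting bolt a word-count topology of this kind could use (illustrative only, not Twitter's benchmark code). Each bolt instance keeps per-word counts in an in-memory map, which is why the topology sketch above partitions the stream with fieldsGrouping.

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    // Hypothetical counting bolt: accumulates a running count per word.
    public class CountBolt extends BaseBasicBolt {
        private Map<String, Long> counts;

        @Override
        public void prepare(Map stormConf, TopologyContext context) {
            counts = new HashMap<>(); // per-instance state, safe thanks to fieldsGrouping
        }

        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            String word = tuple.getStringByField("word");
            long count = counts.merge(word, 1L, Long::sum); // increment running count
            collector.emit(new Values(word, count));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word", "count"));
        }
    }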
20. Heron at Twitter
At Twitter, Heron is used as our primary streaming system, running
hundreds of development and production topologies. Since Heron is
efficient in terms of resource usage, after migrating all Twitter’s
topologies to it we’ve seen an overall 3x reduction in hardware, causing
a significant improvement in our infrastructure efficiency.
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
23. Open Sourcing Twitter Heron
Wednesday, May 25, 2016 | By Karthik Ramasamy (@karthikz), Engineering Manager
https://blog.twitter.com/2016/open-sourcing-twitter-heron
24. Inside Heron
Written in Java & Python (~80%)
Critical parts of the framework (the code that manages the topologies
and the network communications) are not written in a JVM language; they are written in C++
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
26. Storm has evolved
Heron's speed improvements are measured against the Storm 0.8.x code it
diverged from, not the current version; if you have already migrated to
Storm 1.0, you might not see much further improvement over your
current Storm topologies, and you may run into incompatibilities
between Storm's and Heron's implementations of newer features such as
back-pressure support
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
28. Storm has evolved
➢ Support for back pressure
➢ Introduced Pacemaker (a daemon for offloading heartbeat traffic
from ZooKeeper, freeing larger topologies from the infamous
ZooKeeper bottleneck)
➢ Nimbus HA
➢ Distributed cache
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
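As a rough illustration of the Nimbus HA and Pacemaker features, the storm.yaml fragment below uses the configuration keys documented for Storm 1.x. Treat it as a sketch: key names varied across 1.x releases (e.g. pacemaker.host vs. pacemaker.servers), the hostnames are placeholders, and you should verify everything against the release you actually run.

    # Nimbus HA: list several Nimbus hosts (replaces the old single nimbus.host)
    nimbus.seeds: ["nimbus1.example.com", "nimbus2.example.com"]

    # Pacemaker: route worker heartbeats to the Pacemaker daemon instead of ZooKeeper
    pacemaker.servers: ["pacemaker1.example.com"]
    storm.cluster.state.store: "org.apache.storm.pacemaker.pacemaker_state_factory"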
29. Storm has evolved
➢ Improved debugging and profiling options
➢ 60 percent decrease in latency
➢ Up to 16x speed improvement
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
30. When to use Storm?
➢ Want to avoid infrastructure configuration overhead (Heron is
currently tied to Mesos, so if you don't have existing Mesos
infrastructure, you'll need to set that up as well, which is no small
undertaking)
➢ Don’t need extremely large scale
➢ DRPC (deprecated in Heron)
➢ More ready-to-use integrations
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
31. When to use Heron?
➢ Have Mesos infrastructure
➢ Larger scale
➢ Running multiple clusters
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
Let us count the ways…
Multiple levels of scheduling and their complex interaction lead to uncertainty about when tasks are being scheduled.
Each worker runs a mix of tasks, making it difficult to reason about the behaviour and performance of a particular task, since it is not possible to isolate its resource usage.
Logs from multiple tasks are written into a single file, making it hard to identify errors and exceptions associated with a particular task, and causing tasks that log verbosely to swamp the logs of other tasks.
An unhandled exception in a single task takes down the whole worker process killing other (perfectly fine) tasks.
Storm assumes that every worker is homogeneous, which results in inefficient utilization of allocated resources, and often results in over-provisioning.
Because of the large amount of memory allocated to workers, use of common profiling tools becomes very cumbersome. Dumps take so long that the heartbeats are missed and the supervisor kills the process (preventing the dump from completing).
Re-architecting Storm to run one task per worker would lead to big inefficiencies in resource usage and limit the degree of parallelism achieved.
Each tuple has to pass through four (count ’em) threads in the worker process from the point of entry to the point of exit. This design leads to significant overhead and contention issues.
Nimbus is functionally overloaded and becomes an operational bottleneck.
Storm workers belonging to different topologies but running on the same machine can interfere with each other, which leads to untraceable performance issues. Thus Twitter had to run production Storm topologies in isolation on dedicated machines. Which of course leads to wasted resources.
Nimbus is a single point of failure. When it fails, you can’t submit any new topologies or kill existing ones. Nor can any topology that undergoes failures be detected and recovered.
There is no backpressure mechanism. This can result in unbounded tuple drops with little visibility into the situation when acknowledgements are disabled. Work done by upstream components can be lost, and in extreme scenarios the topology can fail to make any progress while consuming all resources.
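To see why the lack of back pressure is so dangerous, consider a minimal, illustrative Java sketch (not Storm or Heron internals): with a bounded blocking queue, a slow consumer naturally throttles the producer, which is the essence of the back-pressure mechanism Heron added; with an unbounded or best-effort queue, the producer keeps running and tuples are silently dropped or memory grows without limit.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Illustrative only: a bounded queue as a stand-in for an inter-stage buffer.
    public class BackPressureSketch {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> buffer = new ArrayBlockingQueue<>(1024);

            Thread producer = new Thread(() -> {
                try {
                    for (long i = 0; ; i++) {
                        // put() blocks when the buffer is full: the fast producer
                        // is slowed to the consumer's pace instead of dropping tuples.
                        buffer.put("tuple-" + i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            Thread consumer = new Thread(() -> {
                try {
                    while (true) {
                        buffer.take();   // drain one tuple
                        Thread.sleep(1); // simulate a slow downstream bolt
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            producer.start();
            consumer.start();
            Thread.sleep(2_000); // let it run briefly, then shut down
            producer.interrupt();
            consumer.interrupt();
        }
    }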
A tuple failure anywhere in the tuple tree leads to failure of the whole tuple tree.
Topologies using a large amount of RAM for a worker encounter gc cycles greater than a minute.
There can be a lot of contention at the transfer queues, especially when a worker runs several executors.
To mitigate some of these performance risks, Twitter often had to over-provision the allocated resources. And they really do mean over-provision: one of their topologies used 600 cores at an average 20-30% utilization. From the analysis, one would have expected the topology to require only about 150 cores (600 cores at roughly 25% utilization amounts to 150 cores of useful work).