Twitter is a stream processing pioneer, handling some of the biggest streams on the internet. Storm's introduction five years ago was a revolution in real-time distributed computing, but after a few years Twitter decided to replace it. What were Storm's issues? Why was Heron implemented? Why was using an existing engine (e.g. Samza, Spark, Flink) not an option? Is this a revolution in open source stream processing? I will give you a brief overview of Twitter's stream processing history, with interesting technical details.
13. Storm at Twitter
(2013)
● Benchmarked at a million tuples processed per second
● Running 30 topologies in a 200-node cluster
● Processing 50 billion messages a day with an average complete latency under 50 ms
http://www.slideshare.net/KrishnaGade2/storm-at-twitter/39-numbers_benchmarked_at_a_million
14. Storm is very powerful, but...
http://www.slideshare.net/KrishnaGade2/storm-at-twitter
15. Apache Storm issues
Performance
● Every worker is homogeneous, which results in inefficient utilization of allocated resources
● There is no backpressure mechanism
● Topologies using a large amount of RAM for a worker encounter GC cycles greater than a minute
Debugging
● Each worker runs a mix of tasks
● Logs from multiple tasks are written into a single file
● Each tuple has to pass through four threads in the worker process from the point of entry to the point of exit
Scheduling
● Multiple levels of scheduling
● A single task failure takes down the whole worker process
● Nimbus is a single point of failure
https://blog.acolyer.org/2015/06/15/twitter-heron-stream-processing-at-scale/
16. Enhancing Storm would take too long, and no other system met their
scaling, throughput, and latency needs. Moreover, other systems were not
compatible with Storm’s API, which would have required rewriting all
topologies. The decision was to create Heron, but keep its external API
compatible with Storm’s.
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
Twitter's approach...
17. Flying faster with Twitter Heron
Tuesday, June 2, 2015 | By Karthik Ramasamy (@karthikz), Engineering Manager
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
18. Flying Faster with Twitter Heron
Scheduler
Pluggable solution, fitted to Twitter's infrastructure: Apache Mesos + Apache Aurora
Back Pressure
Automatically slows down tuple production when queues are overloaded
Easy Debugging
Moved from the typical thread-based system to a process-based system (running each task in isolation)
Compatibility with Storm
Easy migration from Storm to Heron (see the topology sketch below)
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
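To make the compatibility point concrete, here is a minimal sketch of a word-count topology written against Storm's public TopologyBuilder API; because Heron keeps Storm's external API, code like this is meant to move between the two engines without a rewrite. RandomWordSpout and CountBolt are hypothetical user classes (a possible CountBolt is sketched under slide 19), and the org.apache.storm packages are Storm 1.x's; older Storm and early Heron code used the backtype.storm packages instead.

    // Minimal sketch: a word-count topology against Storm's public API.
    // RandomWordSpout and CountBolt are hypothetical user classes.
    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.tuple.Fields;

    public class WordCountTopology {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("words", new RandomWordSpout(), 4);
            builder.setBolt("count", new CountBolt(), 8)
                   .fieldsGrouping("words", new Fields("word")); // partition by word

            Config conf = new Config();
            conf.setNumWorkers(4);
            // Submitting to Heron means swapping the submission tooling,
            // not rewriting the topology definition itself.
            StormSubmitter.submitTopology("word-count", conf, builder.createTopology());
        }
    }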
19. Heron Performance
We compared the performance of Heron with Twitter’s production version of Storm, which was forked from an open source
version in October 2013, using a word count topology. This topology counts the distinct words in a stream generated from a
set of 150,000 words.
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
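For readers unfamiliar with the shape of such a benchmark, here is a hypothetical sketch of the counting bolt a word-count topology of this kind could use (illustrative only, not Twitter's benchmark code). Each bolt instance keeps per-word counts in an in-memory map, which is why the topology sketch above partitions the stream with fieldsGrouping.

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    // Hypothetical counting bolt: accumulates a running count per word.
    public class CountBolt extends BaseBasicBolt {
        private Map<String, Long> counts;

        @Override
        public void prepare(Map stormConf, TopologyContext context) {
            counts = new HashMap<>(); // per-instance state, safe thanks to fieldsGrouping
        }

        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            String word = tuple.getStringByField("word");
            long count = counts.merge(word, 1L, Long::sum); // increment running count
            collector.emit(new Values(word, count));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word", "count"));
        }
    }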
20. Heron at Twitter
At Twitter, Heron is used as our primary streaming system, running
hundreds of development and production topologies. Since Heron is
efficient in terms of resource usage, after migrating all Twitter’s
topologies to it we’ve seen an overall 3x reduction in hardware, causing
a significant improvement in our infrastructure efficiency.
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
23. Open Sourcing Twitter Heron
Wednesday, May 25, 2016 | By Karthik Ramasamy (@karthikz), Engineering Manager
https://blog.twitter.com/2016/open-sourcing-twitter-heron
24. Inside Heron
Written in Java & Python (~80%)
Critical parts of the framework (the code that manages the topologies
and the network communications) are not written in a JVM language; they are written in C++
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
26. Storm has evolved
Heron's speed improvements are measured against the Storm 0.8.x code it
diverged from, not the current version; if you have already migrated to
Storm 1.0, you might not see much further improvement over your
current Storm topologies, and you may run into incompatibilities
between Storm's and Heron's implementations of newer features such as
back-pressure support
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
28. Storm has evolved
➢ Support for back pressure
➢ Introduced Pacemaker (a daemon for offloading heartbeat traffic
from ZooKeeper, freeing larger topologies from the infamous
ZooKeeper bottleneck)
➢ Nimbus HA
➢ Distributed cache
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
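As a rough illustration of the Nimbus HA and Pacemaker features, the storm.yaml fragment below uses the configuration keys documented for Storm 1.x. Treat it as a sketch: key names varied across 1.x releases (e.g. pacemaker.host vs. pacemaker.servers), the hostnames are placeholders, and you should verify everything against the release you actually run.

    # Nimbus HA: list several Nimbus hosts (replaces the old single nimbus.host)
    nimbus.seeds: ["nimbus1.example.com", "nimbus2.example.com"]

    # Pacemaker: route worker heartbeats to the Pacemaker daemon instead of ZooKeeper
    pacemaker.servers: ["pacemaker1.example.com"]
    storm.cluster.state.store: "org.apache.storm.pacemaker.pacemaker_state_factory"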
29. Storm has evolved
➢ Improved debugging and profiling options
➢ 60 percent decrease in latency
➢ Up to 16x speed improvement
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
30. When to use Storm?
➢ Want to avoid infrastructure configuration overhead (Heron is
currently tied to Mesos, so if you don't have existing Mesos
infrastructure, you'll need to set that up as well, which is no small
undertaking)
➢ Don’t need extremely large scale
➢ DRPC (deprecated in Heron)
➢ More ready-to-use integrations
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
31. When to use Heron?
➢ Have Mesos infrastructure
➢ Larger scale
➢ Running multiple clusters
https://blog.twitter.com/2015/flying-faster-with-twitter-heron
Let us count the ways…
Multiple levels of scheduling and their complex interaction lead to uncertainty about when tasks are being scheduled.
Each worker runs a mix of tasks, making it difficult to reason about the behaviour and performance of a particular task, since it is not possible to isolate its resource usage.
Logs from multiple tasks are written into a single file, making it hard to identify errors and exceptions associated with a particular task, and causing tasks that log verbosely to swamp the logs of other tasks.
An unhandled exception in a single task takes down the whole worker process killing other (perfectly fine) tasks.
Storm assumes that every worker is homogeneous, which results in inefficient utilization of allocated resources, and often results in over-provisioning.
Because of the large amount of memory allocated to workers, use of common profiling tools becomes very cumbersome. Dumps take so long that the heartbeats are missed and the supervisor kills the process (preventing the dump from completing).
Re-architecting Storm to run one task per worker would lead to big inefficiencies in resource usage and limit the degree of parallelism achieved.
Each tuple has to pass through four (count ’em) threads in the worker process from the point of entry to the point of exit. This design leads to significant overhead and contention issues.
Nimbus is functionally overloaded and becomes an operational bottleneck.
Storm workers belonging to different topologies but running on the same machine can interfere with each other, which leads to untraceable performance issues. Thus Twitter had to run production Storm topologies in isolation on dedicated machines. Which of course leads to wasted resources.
Nimbus is a single point of failure. When it fails, you can’t submit any new topologies or kill existing ones. Nor can any topology that undergoes failures be detected and recovered.
There is no backpressure mechanism. This can result in unbounded tuple drops with little visibility into the situation when acknowledgements are disabled. Work done by upstream components can be lost, and in extreme scenarios the topology can fail to make any progress while consuming all resources.
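To see why the lack of back pressure is so dangerous, consider a minimal, illustrative Java sketch (not Storm or Heron internals): with a bounded blocking queue, a slow consumer naturally throttles the producer, which is the essence of the back-pressure mechanism Heron added; with an unbounded or best-effort queue, the producer keeps running and tuples are silently dropped or memory grows without limit.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Illustrative only: a bounded queue as a stand-in for an inter-stage buffer.
    public class BackPressureSketch {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> buffer = new ArrayBlockingQueue<>(1024);

            Thread producer = new Thread(() -> {
                try {
                    for (long i = 0; ; i++) {
                        // put() blocks when the buffer is full: the fast producer
                        // is slowed to the consumer's pace instead of dropping tuples.
                        buffer.put("tuple-" + i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            Thread consumer = new Thread(() -> {
                try {
                    while (true) {
                        buffer.take();   // drain one tuple
                        Thread.sleep(1); // simulate a slow downstream bolt
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            producer.start();
            consumer.start();
            Thread.sleep(2_000); // let it run briefly, then shut down
            producer.interrupt();
            consumer.interrupt();
        }
    }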
A tuple failure anywhere in the tuple tree leads to failure of the whole tuple tree.
Topologies using a large amount of RAM for a worker encounter gc cycles greater than a minute.
There can be a lot of contention at the transfer queues, especially when a worker runs several executors.
To mitigate some of these performance risks, Twitter often had to over-provision the allocated resources. And they really do mean over-provision: one of their topologies used 600 cores at an average 20-30% utilization. From the analysis, one would have expected the topology to require only about 150 cores (600 cores at roughly 25% utilization amounts to 150 cores of useful work).