SERC – CADL
Indian Institute of Science
Bangalore, India
TWITTER STORM
Real Time, Fault Tolerant Distributed Framework
Created : 25th May, 2013
SONAL RAJ
National Institute of Technology,
Jamshedpur, India
Background
• Created by Nathan Marz @ BackType/Twitter
• Analyzes tweets, links, and users on Twitter
• Open-sourced in Sep 2011
• Eclipse Public License 1.0
• Storm 0.5.2
• 16k Java and 7k Clojure LOC
• Current stable release 0.8.2
• 0.9.0 brings major core improvements
Background
• Active user group
• https://groups.google.com/group/storm-user
• https://github.com/nathanmarz/storm
• Most-watched Java repo on GitHub (>4k watchers)
• Used by over 30 companies
• Twitter, Groupon, Alibaba, GumGum, ..
What led to storm . .
Problems . . .
•Scale is painful
•Poor fault-tolerance
• Hadoop is stateful
•Coding is tedious
•Batch processing
• Long latency
• No real-time processing
Storm . . .Problems Solved !!
•Scalable and robust
• No persistent layer
•Guarantees no data loss
•Fault-tolerant
•Programming language agnostic
•Use case
• Stream processing
• Distributed RPC
• Continuous computation
STORM FEATURES
• Guaranteed data processing
• Horizontal scalability
• Fault-tolerance
• No intermediate message brokers!
• Higher-level abstraction than message passing
• "Just works"
Storm’s edge over hadoop
HADOOP
• Batch processing
• Jobs run to completion
• JobTracker is SPOF*
• Stateful nodes
• Scalable
• Guarantees no data loss
• Open source

STORM
• Real-time processing
• Topologies run forever
• No single point of failure
• Stateless nodes
• Scalable
• Guarantees no data loss
• Open source
* Hadoop 0.21 added some checkpointing
SPOF: Single Point Of Failure
Streaming Computation
Paradigm of stream computation: Queues / Workers
[Diagram: the general method routes messages through message queues to workers; message routing can become complex]
storm use cases
COMPONENTS
• Nimbus daemon is comparable to Hadoop JobTracker. It is
the master
• Supervisor daemon spawns workers, it is comparable to
Hadoop TaskTracker
• Worker is spawned by supervisor, one per port defined in
storm.yaml configuration
• Task is run as a thread in workers
• Zookeeper is a distributed system, used to store metadata.
Nimbus and Supervisor daemons are fail-fast and stateless.
All state is kept in Zookeeper.
Notice that all communication between Nimbus and
Supervisors is done through Zookeeper.
On a cluster with 2k+1 Zookeeper nodes, the
system can recover when at most k nodes fail.
STORM ARCHITECTURE
Storm architecture
Master Node ( Similar to Hadoop Job-Tracker )
STORM ARCHITECTURE
Used for Cluster Co-ordination
STORM ARCHITECTURE
Runs Worker Nodes / Processes
CONCEPTS
• Streams
• Topology
• A spout
• A bolt
• An edge represents a grouping
streams
spouts
• Example
• Read from logs, API calls,
event data, queues, …
SPOUTS
•Interface ISpout
Method summary:
• void ack(java.lang.Object msgId) : Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed.
• void activate() : Called when a spout has been activated out of a deactivated mode.
• void close() : Called when an ISpout is going to be shut down.
• void deactivate() : Called when a spout has been deactivated.
• void fail(java.lang.Object msgId) : The tuple emitted by this spout with the msgId identifier has failed to be fully processed.
• void nextTuple() : When this method is called, Storm is requesting that the spout emit tuples to the output collector.
• void open(java.util.Map conf, TopologyContext context, SpoutOutputCollector collector) : Called when a task for this component is initialized within a worker on the cluster.
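For illustration, a minimal spout sketch against the backtype.storm API of the Storm 0.8.x era (not from the original deck; the RandomSentenceSpout name and sample sentences are made up):

import java.util.Map;
import java.util.Random;
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

public class RandomSentenceSpout extends BaseRichSpout {
    private SpoutOutputCollector _collector;
    private Random _rand;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        // Called once when this task is initialized within a worker
        _collector = collector;
        _rand = new Random();
    }

    @Override
    public void nextTuple() {
        // Storm calls this in a loop; emit one tuple per call
        String[] sentences = { "the cow jumped over the moon", "an apple a day keeps the doctor away" };
        _collector.emit(new Values(sentences[_rand.nextInt(sentences.length)]));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("sentence"));
    }
}

BaseRichSpout provides no-op defaults for ack(), fail(), activate(), deactivate() and close().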
Bolts
•Bolts
• Processes input streams and produces new streams
• Example
• Stream Joins, DBs, APIs, Filters, Aggregation, …
BOLTS
• Interface IBolt
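For reference, the corresponding bolt interface (backtype.storm.task.IBolt in Storm 0.8.x) is roughly the following sketch; check the Javadoc for the exact signatures:

public interface IBolt extends Serializable {
    // Called when a task for this component is initialized within a worker
    void prepare(Map stormConf, TopologyContext context, OutputCollector collector);
    // Process a single input tuple
    void execute(Tuple input);
    // Called when the bolt is shut down
    void cleanup();
}

IRichBolt additionally declares output fields via declareOutputFields(); IBasicBolt (used by WordCount later in this deck) acks tuples automatically.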
TOPOLOGY
•Topology
• is a graph where each node is a spout or bolt, and the edges
indicate which bolts are subscribing to which streams.
TASKS
• Parallelism is implemented by running multiple instances of each spout
and bolt as similar, simultaneous tasks. Spouts and bolts execute as
many tasks across the cluster.
• Managed by the supervisor daemon
Stream groupings
When a tuple is emitted, which task
does it go to?
Stream grouping
Shuffle grouping: pick a random task
Fields grouping: consistent hashing on a
subset of tuple fields
All grouping: send to all tasks
Global grouping: pick task with lowest id
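A short sketch of how these groupings are declared when wiring a topology (the component ids and MyBolt* classes are placeholders, not from the deck):

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 5);

// Shuffle grouping: each tuple goes to a random task of "boltA"
builder.setBolt("boltA", new MyBoltA(), 8).shuffleGrouping("spout");
// Fields grouping: tuples with the same "word" value always reach the same task
builder.setBolt("boltB", new MyBoltB(), 12).fieldsGrouping("boltA", new Fields("word"));
// All grouping: every task of "boltC" receives a copy of each tuple
builder.setBolt("boltC", new MyBoltC()).allGrouping("boltA");
// Global grouping: all tuples go to the single task with the lowest id
builder.setBolt("boltD", new MyBoltD()).globalGrouping("boltB");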
example : streaming word count
• TopologyBuilder is used to construct topologies in Java.
• Define a Spout in the Topology with parallelism of 5 tasks.
abstraction : DRPC
Consumer decides what data it receives and how it gets
grouped
• Split Sentences into words with parallelism of 8 tasks.
• Create a word count stream
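Putting the bullets above together, the wiring likely looked roughly like this (a sketch assuming the SplitSentence and WordCount classes shown on the next slides and a sentence-emitting spout; the count bolt's parallelism of 12 is illustrative):

TopologyBuilder builder = new TopologyBuilder();

// Spout with a parallelism of 5 tasks
builder.setSpout("spout", new RandomSentenceSpout(), 5);

// Split sentences into words with a parallelism of 8 tasks
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");

// Word count stream: fields grouping so each word is always counted by the same task
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));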
ABSTRACTION : DRPC
public static class SplitSentence extends ShellBolt implements IRichBolt {
    public SplitSentence() {
        super("python", "splitsentence.py");
    }
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
import storm

class SplitSentenceBolt(storm.BasicBolt):
    def process(self, tup):
        words = tup.values[0].split(" ")
        for word in words:
            storm.emit([word])
INSIDE A BOLT ..
public static class WordCount implements IBasicBolt {
    Map<String, Integer> counts = new HashMap<String, Integer>();

    public void prepare(Map conf, TopologyContext context) {
    }

    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String word = tuple.getString(0);
        Integer count = counts.get(word);
        if (count == null) count = 0;
        count++;
        counts.put(word, count);
        collector.emit(new Values(word, count));
    }

    public void cleanup() {
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word", "count"));
    }
}
abstraction : DRPC
• Submitting Topologies to the cluster
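The code on this slide is not legible in this copy; cluster submission typically looks like the following sketch (topology name and settings are illustrative, builder is the TopologyBuilder from the word-count example):

Config conf = new Config();
conf.setNumWorkers(20);
conf.setMaxSpoutPending(5000);

// Submits to the cluster whose Nimbus host is configured in storm.yaml
StormSubmitter.submitTopology("word-count", conf, builder.createTopology());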
abstraction : DRPC
• Running the Topology in Local Mode
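Local mode runs the whole topology inside one JVM; a sketch (names and the 10-second run time are illustrative):

Config conf = new Config();
conf.setDebug(true);

// LocalCluster simulates a Storm cluster in-process, for development and debugging
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("word-count", conf, builder.createTopology());

Utils.sleep(10000);              // let the topology run for a while
cluster.killTopology("word-count");
cluster.shutdown();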
Fault-Tolerance
• Zookeeper stores metadata in a very robust way
• Nimbus and Supervisor are stateless and only need metadata from ZK to
work/restart
• When a node dies
• The tasks will time out and be reassigned to other workers by Nimbus.
• When a worker dies
• The supervisor will restart the worker.
• Nimbus will reassign the worker to another supervisor if no heartbeats are
sent.
• If that is not possible (no free ports), the tasks will be run on other workers in
the topology. If more capacity is added to the cluster later, Storm will
automatically initialize a new worker and spread out the tasks.
• When Nimbus or a supervisor dies
• Workers will continue to run
• Workers cannot be reassigned without Nimbus
• Nimbus and Supervisor should be run under a process-monitoring tool that
restarts them automatically if they fail.
AT LEAST ONCE Processing
• STORM guarantees at-least-once processing of tuples.
• A message id gets assigned to a tuple when it is emitted from a spout or bolt; it is
64 bits long.
• Tree of tuples is the tuples generated (directly and indirectly) from a spout tuple.
• Ack is called on spout, when tree of tuples for spout tuple is fully processed.
• Fail is called on spout, if one of the tuples in the tree of tuples fails or the tree of
tuples is not fully processed within a specified timeout (default is 30 seconds).
• It is possible to specify the message id, when emitting a tuple. This might be
useful for replaying tuples from a queue.
The ack/fail method is called when the tree of
tuples has been fully processed or has
failed / timed out.
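For example, a spout that wants replay from a queue can pass a message id as the second argument when emitting (a sketch; msgId would typically be the queue's own message identifier):

// With a message id, Storm will call ack(msgId) or fail(msgId) back on this spout
_collector.emit(new Values(sentence), msgId);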
AT Least once processing
• Anchoring is used to copy the spout tuple message id(s) to the new
tuples generated. In this way, every tuple knows the message id(s) of all
spout tuples.
• Multi-anchoring is when multiple tuples are anchored. If the tuple tree
fails, then multiple spout tuples will be replayed. Useful for doing
streaming joins and more.
• Ack called from a bolt indicates the tuple has been processed as
intended
• Fail called from a bolt, replays the spout tuple(s)
• Every tuple must be acked/failed or the task will run out of memory at
some point.
_collector.emit(tuple, new Values(word));   // uses anchoring
_collector.emit(new Values(word));          // does NOT use anchoring
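A small sketch of a reliable bolt that anchors its output tuples and acks its input (class name and fields are illustrative, not from the deck):

public static class ReliableSplitSentence extends BaseRichBolt {
    private OutputCollector _collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        _collector = collector;
    }

    @Override
    public void execute(Tuple tuple) {
        for (String word : tuple.getString(0).split(" ")) {
            // Anchor each emitted word to the input tuple
            _collector.emit(tuple, new Values(word));
        }
        // Mark this node of the tuple tree as complete
        _collector.ack(tuple);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}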
exactly once processing
• Transactional topologies (TT) are an abstraction built on STORM primitives.
• TT guarantees exactly-once-processing of tuples.
• Acking is optimized in TT, no need to do anchoring or acking manually.
• Bolts execute as new instances per attempt of processing a batch
• Example
[Diagram: a spout (task 1) connected via an all grouping to two bolt tasks (task 2 and task 3)]
1. A spout tuple is emitted to tasks 2 and 3
2. The worker responsible for task 3 fails
3. The supervisor restarts the worker
4. The spout tuple is replayed and emitted to tasks 2 and 3
5. Tasks 2 and 3 initiate new bolt instances because of the new attempt
Now there is no problem
ABSTRACTION : DRPC
[Diagram: Distributed RPC architecture. The DRPC server receives "args" from a client and hands the topology a tuple of ["request-id", "args", "return-info"]; the topology computes and returns ["request-id", "result"], and the DRPC server sends "result" back to the client.]
Distributed RPC Architecture
WHY DRPC ?
Before Distributed RPC, time-sensitive queries relied
on a pre-computed index
Storm does away with the indexing!
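The canonical minimal DRPC example from the Storm documentation looks roughly like this (a sketch; the "exclamation" function and class names are from the standard tutorial, not this deck):

public static class ExclaimBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // tuple[0] is the request id, tuple[1] is the argument string
        collector.emit(new Values(tuple.getValue(0), tuple.getString(1) + "!"));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("id", "result"));
    }
}

public static void main(String[] args) {
    LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("exclamation");
    builder.addBolt(new ExclaimBolt(), 3);

    LocalDRPC drpc = new LocalDRPC();
    LocalCluster cluster = new LocalCluster();
    cluster.submitTopology("drpc-demo", new Config(), builder.createLocalTopology(drpc));

    System.out.println(drpc.execute("exclamation", "hello"));   // prints "hello!"

    cluster.shutdown();
    drpc.shutdown();
}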
abstraction : DRPC example
• Calculating the “Reach” of a URL on the fly (in real time!)
• Written by Nathan Marz, the author of Storm
• A real-world application of Storm, open source, available
at http://github.com/nathanmarz/storm
• Reach is the number of unique people exposed to a URL
(tweet) on Twitter at any given time.
abstraction : DRPC >> computing reach
ABSTRACTION : DRPC >> REACH TOPOLOGY
[Diagram: Reach topology. A spout feeds the bolts via shuffle grouping; ["follower-id"] tuples are partitioned across tasks and finally combined with a global grouping.]
abstraction : DRPC >> Reach topology
Create the Topology for the DRPC
Implementation of Reach Computation
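A sketch of the reach topology wiring with LinearDRPCTopologyBuilder; the GetTweeters, GetFollowers and CountAggregator class names follow the storm-starter ReachTopology example and, along with the parallelism numbers, are assumptions here (PartialUniquer is shown on the next slide):

LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("reach");

// Request URL -> the people who tweeted it
builder.addBolt(new GetTweeters(), 3);
// Tweeter -> followers, fanned out with a shuffle grouping
builder.addBolt(new GetFollowers(), 12).shuffleGrouping();
// De-duplicate followers per request id; fields grouping keeps a given follower on one task
builder.addBolt(new PartialUniquer(), 6).fieldsGrouping(new Fields("id", "follower"));
// Sum the partial counts per request id to get the reach
builder.addBolt(new CountAggregator(), 2).fieldsGrouping(new Fields("id"));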
ABSTRACTION : DRPC
public static class PartialUniquer implements IRichBolt, FinishedCallback {
    OutputCollector _collector;
    // Keep a set of followers for each request id in memory
    Map<Object, Set<String>> _sets = new HashMap<Object, Set<String>>();

    public void execute(Tuple tuple) {
        Object id = tuple.getValue(0);
        Set<String> curr = _sets.get(id);
        if (curr == null) {
            curr = new HashSet<String>();
            _sets.put(id, curr);
        }
        curr.add(tuple.getString(1));
        _collector.ack(tuple);
    }

    @Override
    public void finishedId(Object id) {
        Set<String> curr = _sets.remove(id);
        int count = 0;
        if (curr != null) count = curr.size();
        _collector.emit(new Values(id, count));
    }

    // prepare(), cleanup() and declareOutputFields() are omitted on the slide
}
guaranteeing message processing
Tuple Tree
Guaranteeing message processing
• A spout tuple is not fully processed until all tuples in
the tree have been completed.
• If the tuple tree is not completed within a specified
timeout, the spout tuple is replayed
• Storm provides a built-in Reliability API for this
Guaranteeing message processing
Acking marks a single node in
the tree as complete
“Anchoring” creates a
new edge in the tuple
tree
Storm tracks tuple trees for you in an extremely efficient way
Running a storm application
•Local Mode
• Runs on a single JVM
• Used for development testing and debugging
•Remote Mode
• Submits the topology to a Storm cluster, which has many processes
running on different machines.
• Doesn’t show debugging info, hence it is considered Production Mode.
STORM UI
[Screenshot: the Storm UI, showing the topology summary (executors with host, port, emitted/transferred counts and process latency), a component summary, bolt stats, and input/output stats (all time) per executor.]
DOCUMENTATION
[Screenshot: the nathanmarz/storm GitHub wiki home page.]
"Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, and is a lot of fun to use!"
Read these first: Rationale, Setting up a development environment, Creating a new Storm project, Tutorial.
Getting help: feel free to ask questions on Storm's mailing list (https://groups.google.com/group/storm-user). You can also come to the #storm-user room on freenode, where you can usually find a Storm developer to help you out.
STORM LIBRARIES . .
STORM uses a lot of libraries. The most prominent are:
• Clojure : a Lisp-family programming language (a crash-course follows)
• Jetty : an embedded web server, used to host the UI of Nimbus
• Kryo : a fast serializer, used when sending tuples
• Thrift : a framework for building services; Nimbus is a Thrift daemon
• ZeroMQ : a very fast transport layer
• Zookeeper : a distributed coordination system, used to store metadata
References
• Twitter Storm
  • Nathan Marz
  • http://www.storm-project.org
• Storm
  • nathanmarz@github
  • http://www.github.com/nathanmarz/storm
• Realtime Analytics with Storm and Hadoop
  • Hadoop Summit
