Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Storm introduction


Published on

A Brief introduction to Apache Storm. Talk given at the October Toronto Java User Group meeting, video available at

Published in: Technology
  • Be the first to comment

Storm introduction

  1. 1. Apache Storm A Brief Introduction
  2. 2. What Is Storm ● Distributed ● Stream Oriented ● Real Time* ● Scalable ● Reliable**
  3. 3. Topologies ● Storm’s equivalent of an application ○ Consists of Components (Spouts and Bolts) ■ parallelism ■ Data represented as streams of tuples
  4. 4. Spouts ● Spouts ○ Expose an external input source ■ An unbounded stream of data ○ Are polled, by storm, for their next Tuple ○ Produce one or more streams of Tuples ○ Notified when a Tuple is completely processed ○ Notified when a Tuple fails to be processed
  5. 5. Bolts ● Bolts ○ Subscribes to one or more Streams ■ Grouped by all, randomly, or by Field values ○ Produces zero or more Streams ○ Single threaded execution per instance
  6. 6. An Example Classic word count: Lets assume we have an unbounded incoming stream of sentences and we want to have a constantly updated count of the words found in the text. Code:
  7. 7. Building a topology We provide a description of the topology to be deployed via the fluent TopologyBuilder class. TopologyBuilder builder = new TopologyBuilder();
  8. 8. Spout First we’ll create a spout to generate random sentences and add it to the topology public class RandomSentenceSpout extends BaseRichSpout { … public void nextTuple() { String[] sentences = new String[]{ "the cow jumped over the moon", "an apple a day keeps the doctor away", "four score and seven years ago", "snow white and the seven dwarfs", "i am at two with nature" }; String sentence = sentences[_rand.nextInt(sentences.length)]; _collector.emit(new Values(sentence)); } … } builder.setSpout("spout", new RandomSentenceSpout(), 5);
  9. 9. Split Messages Next we’ll split each item into words. The example is in python, here it is in Java: public static class SplitSentence extends BaseBasicBolt { ... public void execute(Tuple tuple, BasicOutputCollector collector) { String sentence = tuple.getString(0); for (String word : sentence.split(“s”)) { collector.emit(new Values(word)); } } ... } builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
  10. 10. Count Words Now we can count the words public static class WordCount extends BaseBasicBolt { Map<String, Integer> counts = new HashMap<String, Integer>(); public void execute(Tuple tuple, BasicOutputCollector collector) { String word = tuple.getString(0); Integer count = counts.get(word); if (count == null) count = 0; count++; counts.put(word, count); collector.emit(new Values(word, count)); } ... } builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
  11. 11. Discussion ● Component Lifecycle Events ● In Memory State ● Deployment view
  12. 12. Ecosystem ● Trident ● SummingBird
  13. 13. Alternatives ● Spark Streaming
  14. 14. Questions? comments, concerns or criticisms?