Realtime Computation with Storm

Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, and is a lot of fun to use! We will talk about how Storm is architected, how to interoperate with Hadoop, and a few real-world use-cases.

  1. Realtime Computation with Storm. Brad Anderson, banderson@maprtech.com, @boorad
  2. Definition & Overview; Interoperability; Use Cases
  3. Stream Processing; CEP; Distributed RPC
  4. Source Data: Social Media Feeds; Network Sensors; App/Web Logs; Stock Tick Data; Weather Data; Auctions of Ad Impressions; Payment Transactions
  5. Before Storm: Queues, Workers
  6. Example (simplified)
  7. Storm: Guaranteed data processing; Horizontal scalability; Fault-tolerance; No intermediate message brokers!; Higher level abstraction than message passing; “Just works”
  8. Concepts
  9. streams: Unbounded sequence of tuples
  10. spouts: Source of streams
  11. spouts:

          public interface ISpout extends Serializable {
              void open(Map conf,
                        TopologyContext context,
                        SpoutOutputCollector collector);
              void close();
              void nextTuple();
              void ack(Object msgId);
              void fail(Object msgId);
          }
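The `ack` and `fail` callbacks in `ISpout` are what let a spout take part in guaranteed processing: Storm reports back, by message id, whether a tuple was fully processed. As a rough sketch in plain Java with no Storm dependency, a source can remember each emitted message until it is acked and re-queue it on failure. The class `ReplayingSource` and its helper methods are hypothetical stand-ins for illustration, not Storm's API.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for a spout: it mirrors only the
// nextTuple/ack/fail shape of ISpout, not the real Storm types.
class ReplayingSource {
    private final Queue<String> ready = new ArrayDeque<>();   // messages waiting to be emitted
    private final Map<Long, String> pending = new HashMap<>(); // emitted but not yet acked
    private long nextId = 0;

    ReplayingSource(Iterable<String> messages) {
        for (String m : messages) {
            ready.add(m);
        }
    }

    // Emits the next ready message and returns its id, or -1 if none is ready.
    long nextTuple() {
        String msg = ready.poll();
        if (msg == null) {
            return -1;
        }
        long id = nextId++;
        pending.put(id, msg);
        return id;
    }

    // The message was fully processed, so it can be forgotten.
    void ack(long msgId) {
        pending.remove(msgId);
    }

    // Processing failed, so the message goes back on the ready queue for replay.
    void fail(long msgId) {
        String msg = pending.remove(msgId);
        if (msg != null) {
            ready.add(msg);
        }
    }

    int pendingCount() { return pending.size(); }
    int readyCount() { return ready.size(); }
}
```

In this sketch a replayed message gets a fresh id on its next emit; a real spout would typically reuse whatever id its upstream queue provides.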
  12. bolts: Processes input streams and produces new streams
  13. bolts:

          public class DoubleAndTripleBolt extends BaseRichBolt {
              private OutputCollector _collector;

              public void prepare(Map conf,
                                  TopologyContext context,
                                  OutputCollector collector) {
                  _collector = collector;
              }

              public void execute(Tuple input) {
                  int val = input.getInteger(0);
                  // Anchor the new tuple to the input, then ack the input.
                  _collector.emit(input, new Values(val*2, val*3));
                  _collector.ack(input);
              }

              public void declareOutputFields(OutputFieldsDeclarer declarer) {
                  declarer.declare(new Fields("double", "triple"));
              }
          }
  14. topologies: Network of spouts and bolts
  15. topologies:

          TopologyBuilder builder = new TopologyBuilder();

          builder.setSpout("spout", new RandomSentenceSpout(), 5);

          builder.setBolt("split", new SplitSentence(), 8)
                 .shuffleGrouping("spout");
          builder.setBolt("count", new WordCount(), 12)
                 .fieldsGrouping("split", new Fields("word"));
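In the topology on slide 15, `fieldsGrouping("split", new Fields("word"))` is what lets each `WordCount` task keep a correct local count: tuples with the same `word` value always reach the same task, whereas `shuffleGrouping` spreads tuples without regard to their content. A minimal plain-Java sketch of that routing idea follows, assuming simple hash-modulo partitioning. It is an illustration only, not Storm's actual implementation, and `FieldsGroupingSketch`, `taskFor`, and `countWords` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration of fields grouping: route by a hash of the grouping field.
class FieldsGroupingSketch {

    // Picks a task index for a word; equal words always map to the same index.
    static int taskFor(String word, int numTasks) {
        return Math.floorMod(word.hashCode(), numTasks);
    }

    // Splits sentences into words and counts each word in the task it routes to.
    static List<Map<String, Integer>> countWords(List<String> sentences, int numTasks) {
        List<Map<String, Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            tasks.add(new HashMap<>());
        }
        for (String sentence : sentences) {
            for (String word : sentence.split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                tasks.get(taskFor(word, numTasks)).merge(word, 1, Integer::sum);
            }
        }
        return tasks;
    }
}
```

Because every occurrence of a word lands in one task's map, no cross-task merge is needed to read a word's total.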
  16. Trident: Cascading for Storm
  17. Trident Facilities: Joins; Aggregations; Grouping; Functions; Filters; Consistent, Exactly-Once Semantics
  18. Trident:

          TridentTopology topology = new TridentTopology();
          TridentState wordCounts =
              topology.newStream("spout1", spout)
                      .each(new Fields("sentence"), new Split(), new Fields("word"))
                      .groupBy(new Fields("word"))
                      .persistentAggregate(new MemoryMapState.Factory(),
                                           new Count(),
                                           new Fields("count"))
                      .parallelismHint(6);
  19. Interoperability
  20. spouts: Kafka (with transactions); Kestrel; JMS; AMQP; Beanstalkd
  21. bolts: Functions; Filters; Aggregation; Joins; Talk to databases, Hadoop write-behind
  22. Diagram: Raw Data; Queue; Storm (realtime processes); Hadoop (batch processes); Apps; Business Value
  23. Diagram: Raw Data; Queue; Storm (realtime processes); Parallel Cluster Ingest; Hadoop (batch processes); Apps; Business Value
  24. Diagram: Raw Data; Queue; Franz; TailSpout; Storm (realtime processes); Hadoop (batch processes); Apps; Business Value
  25. Diagram: Raw Data; Franz; TailSpout; Storm (realtime processes); Hadoop (batch processes); Apps; Business Value
  26. Use Cases
  27. Twitter (diagram): URL → Tweeter → Follower → Distinct follower → Reach
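The Twitter reach slide describes a fan-out followed by de-duplication: find everyone who tweeted a URL, collect each tweeter's followers, and count the distinct followers. A single-process Java sketch of that computation is below. The in-memory maps stand in for the tweeter and follower lookups, and every name here (`ReachSketch`, `reach`, the map parameters) is hypothetical rather than taken from the talk.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration of the reach computation: URL -> tweeters -> followers -> distinct count.
class ReachSketch {

    static int reach(String url,
                     Map<String, List<String>> tweetersByUrl,
                     Map<String, List<String>> followersByTweeter) {
        Set<String> distinctFollowers = new HashSet<>();
        // Fan out from the URL to each tweeter, then to each tweeter's followers.
        for (String tweeter : tweetersByUrl.getOrDefault(url, Collections.emptyList())) {
            distinctFollowers.addAll(
                followersByTweeter.getOrDefault(tweeter, Collections.emptyList()));
        }
        // A follower shared by several tweeters is counted once.
        return distinctFollowers.size();
    }
}
```

In a distributed version each stage of the fan-out could run in parallel, with the de-duplication step partitioned by follower id so that duplicates meet in the same task.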
  28. Heartbyte
  29. Fleet Logistics
  30. http://github.com/{tdunning | boorad}/mapr-spout Brad Anderson, banderson@maprtech.com, @boorad
  31. Thank you. http://github.com/{tdunning | boorad}/mapr-spout Brad Anderson, banderson@maprtech.com, @boorad