Storm

  1. Distributed Stream Processing With Storm. Petar Kostov, March 2013
  2. Contents
       • Concepts
       • Cluster structure
       • Parallelism
       • Reliability
       • Abstractions: DRPC, Trident
       • Utils
  3. Concepts
       • Bolt
       • Spout
       • Tuple
       • Stream
  4. Cluster structure
  5. Parallelism
  6. Taming Parallelism
       • Stream groupings
       [Diagram labels: Emitter component (Source Task); Consumer component (Task 1, Task 2, …, Task N)]
  7. Reliability
       • Tuple ACK-ing: at 10K acks/sec, it would take about 50,000,000 years for the ack tracking to make a mistake
       • Reliable/Unreliable spout
  8. Abstractions: DRPC
  9. Abstractions: Trident
       • Stateful stream processing
       • Exactly once semantics
       • The new way to do DRPC
  10. Abstractions: Trident
        // `spout` and `Split` are defined elsewhere; Split presumably emits one "word" tuple per word of "sentence".
        TridentTopology topology = new TridentTopology();
        TridentState wordCounts =
            topology.newStream("spout1", spout)
                .each(new Fields("sentence"), new Split(), new Fields("word"))
                .groupBy(new Fields("word"))
                // Keep a running count per word in in-memory state.
                .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
                .parallelismHint(6);
  11. Utils
       • Local mode
       • storm-deploy
       • Storm UI
  12. THANK YOU!
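
The "Stream groupings" bullet on the Taming Parallelism slide is about how an emitting task chooses which of the consumer's N tasks receives each tuple. The following is a minimal plain-Java sketch of two such rules, not Storm's own code: the class and method names (`GroupingSketch`, `fieldsGrouping`, `RoundRobinShuffle`) are hypothetical, the hash-modulo rule is an assumed way to send equal field values to the same task, and round-robin stands in for Storm's randomized shuffle.

```java
import java.util.Objects;

class GroupingSketch {

    /** Fields-style grouping: equal key values always map to the same task index. */
    static int fieldsGrouping(Object key, int numTasks) {
        // floorMod keeps the index in [0, numTasks) even for negative hash codes.
        return Math.floorMod(Objects.hashCode(key), numTasks);
    }

    /** Shuffle-style grouping, simplified to round-robin so every task gets an equal share. */
    static final class RoundRobinShuffle {
        private int next = 0;

        int chooseTask(int numTasks) {
            int chosen = next;
            next = (next + 1) % numTasks;
            return chosen;
        }
    }

    public static void main(String[] args) {
        int numTasks = 4; // hypothetical number of consumer tasks
        RoundRobinShuffle shuffle = new RoundRobinShuffle();
        for (String word : new String[] {"storm", "bolt", "storm", "spout", "storm"}) {
            System.out.println(word
                + " -> fields task " + fieldsGrouping(word, numTasks)
                + ", shuffle task " + shuffle.chooseTask(numTasks));
        }
    }
}
```

Every "storm" line reports the same fields task, which is what a per-word counter needs, while the shuffle task cycles through all tasks regardless of content.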

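The 50,000,000-year figure on the Reliability slide can be roughly checked with a back-of-the-envelope estimate. The slides do not show the derivation, so the model here is an assumption: each ack has a $2^{-64}$ chance of falsely completing a tuple tree (a 64-bit tracking value hitting zero by accident), so about $2^{64}$ acks are expected before the first mistake.

```latex
\[
\frac{2^{64}\ \text{acks}}{10^{4}\ \text{acks/s}} \approx 1.8 \times 10^{15}\ \text{s},
\qquad
\frac{1.8 \times 10^{15}\ \text{s}}{3.15 \times 10^{7}\ \text{s/year}} \approx 5.8 \times 10^{7}\ \text{years}
\]
```

Under that assumption the result is on the order of 50 million years, consistent with the slide.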
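
The tuple ACK-ing on the Reliability slide can be illustrated with XOR-based tracking, the scheme described in Storm's documentation on guaranteed message processing: one 64-bit value per spout tuple is XORed with each tuple ID when the tuple is created and again when it is acked, so it returns to zero once the whole tuple tree is done. The sketch below is a simplified plain-Java model, not Storm's acker; `AckTrackerSketch`, `report`, and `treeComplete` are hypothetical names, and the fixed IDs stand in for random 64-bit IDs.

```java
class AckTrackerSketch {
    // XOR of every tuple ID reported so far; zero means every created tuple was also acked.
    private long ackVal = 0L;

    /** Call once when a tuple is created (anchored) and once more when it is acked. */
    void report(long tupleId) {
        ackVal ^= tupleId;
    }

    boolean treeComplete() {
        return ackVal == 0L;
    }

    public static void main(String[] args) {
        // Fixed stand-ins for random 64-bit tuple IDs (assumption for determinism).
        long root = 0x9E3779B97F4A7C15L;
        long childA = 0xC2B2AE3D27D4EB4FL;
        long childB = 0x165667B19E3779F9L;

        AckTrackerSketch tracker = new AckTrackerSketch();
        tracker.report(root);    // spout emits the root tuple
        tracker.report(childA);  // a bolt emits two tuples anchored to the root...
        tracker.report(childB);
        tracker.report(root);    // ...and acks the root
        System.out.println(tracker.treeComplete()); // prints "false": children still pending

        tracker.report(childA);  // downstream bolts ack both children
        tracker.report(childB);
        System.out.println(tracker.treeComplete()); // prints "true"
    }
}
```

The appeal of this design is constant memory per spout tuple, regardless of how large the tuple tree grows; the cost is a tiny chance that unrelated IDs XOR to zero early.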