Storm

Transcript

  • 1. Distributed Stream Processing With Storm. Petar Kostov, March 2013
  • 2. Contents
      • Concepts
      • Cluster structure
      • Parallelism
      • Reliability
      • Abstractions – DRPC, Trident
      • Utils
    Storm March 2013 #2
  • 3. Concepts
      • Bolt
      • Spout
      • Tuple
      • Stream
  • 4. Cluster structure
  • 5. Parallelism
  • 6. Taming Parallelism
      • Stream groupings
    [Diagram labels: emitter component (source task); consumer component (Task 1, Task 2, …, Task N)]
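A stream grouping decides which consumer task receives each tuple sent by a source task. The sketch below is a simplified plain-Java model of two common groupings, shuffle and fields. It does not use Storm's API, and the names `GroupingSketch`, `shuffleTask` and `fieldsTask` are hypothetical. In a real topology the equivalent choice is declared with `shuffleGrouping` or `fieldsGrouping` when a bolt is wired up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical model of task selection; not Storm's implementation.
final class GroupingSketch {

    // Shuffle grouping (modelled): any consumer task may receive the tuple.
    static int shuffleTask(Random rng, int numTasks) {
        return rng.nextInt(numTasks);
    }

    // Fields grouping (modelled): hash the grouping field values so that
    // equal values are always routed to the same consumer task.
    static int fieldsTask(List<Object> groupingValues, int numTasks) {
        return Math.floorMod(groupingValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 4;
        Random rng = new Random();
        for (String word : Arrays.asList("storm", "bolt", "storm")) {
            int shuffled = shuffleTask(rng, numTasks);
            int byField = fieldsTask(Arrays.<Object>asList(word), numTasks);
            System.out.println(word + ": shuffle -> task " + shuffled
                    + ", fields -> task " + byField);
        }
    }
}
```

With fields grouping, both occurrences of "storm" land on the same task, which is what makes per-key state such as counters safe to keep inside a single task.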
  • 7. Reliability
      • Tuple ACK-ing: at 10K acks/sec, it would take about 50,000,000 years before the ack tracking makes a mistake
      • Reliable/unreliable spout
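The 50,000,000-year figure comes from how acks are tracked: each spout tuple carries a 64-bit value that is XOR-ed with the id of every tuple created and every tuple acked in its tree, and the tree counts as fully processed when that value returns to zero. The sketch below models the idea in plain Java and redoes the arithmetic, assuming roughly one accidental zero per 2^64 acks. The class and method names are hypothetical, and this is not Storm's acker code.

```java
// Hypothetical model of XOR-based ack tracking for one spout tuple.
final class AckerSketch {
    private long ackVal = 0L;

    // A tuple with this random 64-bit id was created in the tuple tree.
    void emitted(long tupleId) {
        ackVal ^= tupleId;
    }

    // The same tuple was acked; XOR-ing its id again cancels it out.
    void acked(long tupleId) {
        ackVal ^= tupleId;
    }

    // Zero means every created tuple has also been acked.
    boolean fullyProcessed() {
        return ackVal == 0L;
    }

    // Years needed to perform 2^64 acks at the given rate.
    static double yearsFor2Pow64Acks(double acksPerSecond) {
        double secondsPerYear = 365.25 * 24 * 3600;
        return Math.pow(2, 64) / acksPerSecond / secondsPerYear;
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        long a = 0x1234L, b = 0x5678L;
        acker.emitted(a);
        acker.emitted(b);
        acker.acked(a);
        System.out.println("after one ack: " + acker.fullyProcessed());
        acker.acked(b);
        System.out.println("after both acks: " + acker.fullyProcessed());
        System.out.println("years at 10K acks/sec: " + yearsFor2Pow64Acks(10_000));
    }
}
```

The computed value is a little under 60 million years, the same order of magnitude as the figure on the slide.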
  • 8. Abstractions: DRPC
  • 9. Abstractions: Trident
      • Stateful stream processing
      • Exactly-once semantics
      • The new way to do DRPC
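One way to get exactly-once state updates is to store a transaction id next to each value and skip any batch whose id has already been applied. The plain-Java sketch below illustrates that idea under two assumptions: batches are applied in transaction-id order, and a replayed batch carries the same tuples as the original. The class `TransactionalCountSketch` and its methods are hypothetical and not part of Trident.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of idempotent, transaction-id based state updates.
final class TransactionalCountSketch {
    // key -> { running count, id of the last batch applied to this key }
    private final Map<String, long[]> state = new HashMap<>();

    void applyBatch(long txid, Map<String, Long> batchCounts) {
        for (Map.Entry<String, Long> e : batchCounts.entrySet()) {
            long[] stored = state.get(e.getKey());
            if (stored == null) {
                state.put(e.getKey(), new long[] {e.getValue(), txid});
            } else if (stored[1] != txid) {
                stored[0] += e.getValue();
                stored[1] = txid;
            }
            // Otherwise this batch was already applied to the key: skip the replay.
        }
    }

    long count(String key) {
        long[] stored = state.get(key);
        return stored == null ? 0L : stored[0];
    }

    public static void main(String[] args) {
        TransactionalCountSketch counts = new TransactionalCountSketch();
        counts.applyBatch(1L, Collections.singletonMap("storm", 2L));
        counts.applyBatch(1L, Collections.singletonMap("storm", 2L)); // replay, ignored
        counts.applyBatch(2L, Collections.singletonMap("storm", 1L));
        System.out.println("storm = " + counts.count("storm"));
    }
}
```

The replayed batch leaves the count unchanged, so each tuple contributes to the stored total once even though it was processed twice.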
  • 10. Abstractions: Trident

        TridentTopology topology = new TridentTopology();
        TridentState wordCounts =
            topology.newStream("spout1", spout)
                .each(new Fields("sentence"), new Split(), new Fields("word"))
                .groupBy(new Fields("word"))
                .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
                .parallelismHint(6);
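To make the data flow of the Trident word count concrete, the sketch below computes the same result over a small in-memory list in plain Java: split each sentence into words, group by word, and count. It assumes `Split` breaks sentences on single spaces, and it ignores everything Trident adds (batching, distribution, persistent state). The class `WordCountSketch` and the sample sentences are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process equivalent of the split / groupBy / count pipeline.
final class WordCountSketch {

    static Map<String, Long> countWords(List<String> sentences) {
        Map<String, Long> counts = new TreeMap<>();
        for (String sentence : sentences) {
            // Corresponds to each(new Fields("sentence"), new Split(), new Fields("word")).
            for (String word : sentence.split(" ")) {
                if (word.isEmpty()) {
                    continue;
                }
                // Corresponds to groupBy(word) followed by the Count aggregator.
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "the cow jumped over the moon",
                "the man went to the store");
        System.out.println(countWords(sentences));
    }
}
```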
  • 11. Utils
      • Local mode
      • storm-deploy
      • Storm UI
  • 12. THANK YOU!