Short introduction to Storm
Presentation given in class for Cloud Computing at Universitat Politècnica de Catalunya

Published in: Technology, Education
Transcript

  • 1. STORM: DISTRIBUTED AND FAULT-TOLERANT REALTIME COMPUTATION
      Jimmy Zöger, CLC < FIB < UPC, 2013-06-03
  • 2. INTRODUCTION
      • Like Hadoop for realtime processing instead of batch
      • Open Source
      • Developed by BackType, which was later acquired by Twitter
      • Developed for analyzing Twitter data
      • Similar to S4
  • 3. STORM TOPOLOGY
  • 4. SPOUTS
  • 5. SPOUTS
      • The component responsible for feeding messages into the topology
      • Emits tuples
      • Can be reliable or unreliable (ack() and fail())
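
The reliable case on this slide can be illustrated with a toy model. The sketch below is plain Python, not Storm's API: the class name `ReliableQueueSpout`, the method `next_tuple`, and the integer message ids are assumptions made for illustration. It only shows the idea that a tuple stays tracked until it is acked, and is replayed if it fails.

```python
from collections import deque


class ReliableQueueSpout:
    """Toy model of a reliable spout (hypothetical, not Storm's actual API)."""

    def __init__(self, messages):
        # (msg_id, payload) pairs that have not been emitted yet.
        self.pending = deque(enumerate(messages))
        # msg_id -> payload for tuples emitted but not yet acked or failed.
        self.in_flight = {}

    def next_tuple(self):
        """Emit the next tuple, or None when nothing is waiting."""
        if not self.pending:
            return None
        msg_id, payload = self.pending.popleft()
        self.in_flight[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        """The tuple was fully processed: stop tracking it."""
        self.in_flight.pop(msg_id, None)

    def fail(self, msg_id):
        """Processing failed: put the tuple back in the queue for replay."""
        if msg_id in self.in_flight:
            self.pending.append((msg_id, self.in_flight.pop(msg_id)))


spout = ReliableQueueSpout(["a", "b"])
first = spout.next_tuple()      # emits "a"
spout.fail(first[0])            # "a" is queued again
second = spout.next_tuple()     # emits "b"
spout.ack(second[0])            # "b" is done
replayed = spout.next_tuple()   # "a" is emitted a second time
```

An unreliable spout in this model would simply skip the `in_flight` bookkeeping, so a failed tuple would be lost rather than replayed.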
  • 6. INTEGRATION
      • Kestrel
      • RabbitMQ
      • Kafka
      • JMS
      • Integration is easy with the simple Spout abstraction
  • 7. BOLTS
  • 8. BOLTS
      • A component that takes tuples as input and produces tuples as output
      • Can do filtering, joining, functions, aggregations etc.
      • Does not have to process a tuple immediately and may hold onto tuples to process later
      • Comparison with Hadoop: a bolt can be a mapper or a reducer (or anything)
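
As a rough illustration of these points, the sketch below models two bolts in plain Python. It is not Storm's API; the class names, the `execute` method returning a list of output tuples, and the `batch_size` parameter are all assumptions for this example. The first bolt acts like a function (one tuple in, several out); the second holds tuples and only emits an aggregate once a batch is full.

```python
from collections import Counter


class SplitSentenceBolt:
    """Function-style bolt: one sentence tuple in, one tuple per word out."""

    def execute(self, tup):
        return [(word.lower(),) for word in tup[0].split()]


class BatchingCountBolt:
    """Aggregating bolt that holds tuples until `batch_size` have arrived."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.held = []

    def execute(self, tup):
        self.held.append(tup[0])
        if len(self.held) < self.batch_size:
            return []                    # tuple is held, nothing emitted yet
        counts = Counter(self.held)
        self.held = []
        return sorted(counts.items())    # one (word, count) tuple per word


split = SplitSentenceBolt()
count = BatchingCountBolt(batch_size=3)
outputs = [count.execute(t) for t in split.execute(("Storm storm bolt",))]
```

In Hadoop terms, `SplitSentenceBolt` plays the mapper role and `BatchingCountBolt` the reducer role, although nothing forces a bolt into either shape.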
  • 9. STORM TOPOLOGY
  • 10. STORM TOPOLOGY
      • Spouts, bolts and streams
      • Distributed
      • Runs indefinitely until it is stopped
      • Arbitrary complexity
      • Streams requiring multiple steps also require multiple bolts
      • No intermediate queues for streams
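
The wiring of a spout and several bolts can be sketched with chained Python generators. This is a single-process toy under stated assumptions, not how Storm distributes work: the function names are hypothetical, and the stream ends when the input list ends, whereas a real topology runs until it is stopped. Each stage pulls tuples directly from the previous one, with no queue in between, and the two processing steps need two bolts.

```python
from collections import Counter


def sentence_spout(sentences):
    """Spout: source of the stream, emits one-field tuples."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream):
    """Step 1: turn each sentence tuple into one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream):
    """Step 2: emit a running (word, count) tuple for every word seen."""
    counts = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(sentences):
    # Wiring: sentence_spout -> split_bolt -> count_bolt.
    return list(count_bolt(split_bolt(sentence_spout(sentences))))


result = run_topology(["to be", "or not to be"])
```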
  • 11. FAULT-TOLERANCE
      • Nimbus daemon and Supervisor daemons are fail-fast and stateless
      • Each worker sends heartbeats to Nimbus
      • Transactional topologies → Guaranteed processing
      [Diagram: Nimbus, Zookeeper nodes and Supervisor nodes]
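
Heartbeat-based failure detection can be sketched in a few lines. The model below is an assumption-laden toy, not Nimbus's implementation: the class `HeartbeatMonitor`, the 30-second timeout, and the explicit `now` timestamps (used so the example is deterministic) are all made up for illustration. A worker counts as dead once its last heartbeat is older than the timeout.

```python
class HeartbeatMonitor:
    """Toy failure detector: tracks the last heartbeat time per worker."""

    def __init__(self, timeout_secs):
        self.timeout_secs = timeout_secs
        self.last_seen = {}   # worker_id -> time of the latest heartbeat

    def heartbeat(self, worker_id, now):
        """Record that `worker_id` reported in at time `now`."""
        self.last_seen[worker_id] = now

    def dead_workers(self, now):
        """Workers whose latest heartbeat is older than the timeout."""
        return sorted(
            worker
            for worker, seen in self.last_seen.items()
            if now - seen > self.timeout_secs
        )


monitor = HeartbeatMonitor(timeout_secs=30)
monitor.heartbeat("worker-1", now=0)
monitor.heartbeat("worker-2", now=0)
monitor.heartbeat("worker-1", now=25)   # worker-1 keeps reporting
dead = monitor.dead_workers(now=40)     # worker-2 has been silent for 40 s
```

In a design like this, whatever receives `dead` would be responsible for reassigning the work of the listed workers.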
  • 12. USE CASES
      • Counting words!
      • Realtime analytics - trending topics on Twitter
      • Online machine learning
      • Continuous computation
      • Distributed RPC
      • Extract, Transform and Load (ETL)
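
The trending-topics use case boils down to counting over a sliding time window. The sketch below shows one possible way to do that in plain Python; it is not taken from the presentation or from Storm. The class `TrendingTopics`, the 60-second window, and the sample hashtags are hypothetical, and the code assumes events arrive in timestamp order.

```python
from collections import Counter, deque


class TrendingTopics:
    """Counts topics seen during the last `window_secs` seconds."""

    def __init__(self, window_secs):
        self.window_secs = window_secs
        self.events = deque()    # (timestamp, topic), oldest first
        self.counts = Counter()  # topic -> occurrences inside the window

    def observe(self, topic, now):
        """Record one occurrence of `topic` at time `now`."""
        self.events.append((now, topic))
        self.counts[topic] += 1
        self._expire(now)

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        while self.events and now - self.events[0][0] >= self.window_secs:
            _, old_topic = self.events.popleft()
            self.counts[old_topic] -= 1
            if self.counts[old_topic] == 0:
                del self.counts[old_topic]

    def top(self, n, now):
        """The `n` most frequent topics, ties broken alphabetically."""
        self._expire(now)
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


trends = TrendingTopics(window_secs=60)
for ts, topic in [(0, "#storm"), (10, "#hadoop"), (20, "#storm"), (70, "#kafka")]:
    trends.observe(topic, now=ts)
top_now = trends.top(2, now=70)
```

Because the counts are updated on every event, the ranking is always current, which is the difference from a batch job that recomputes it periodically.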
  • 13. FAST
      One benchmark clocked it at over a million tuples processed per second per node
      [Diagram: stream of tuples {x,y,z} ↠ {x,y,z} ↠ {x,y,z} ↠ {x,y,z} ↠ {x,y,z} ↠]
  • 14. STORM: DISTRIBUTED AND FAULT-TOLERANT REALTIME COMPUTATION
      Jimmy Zöger, CLC < FIB < UPC, 2013-06-03
