Storm - Altamira University Presentation
Published in Technology, Business

Transcript

  • 1. Apache Storm: A distributed, real-time computation system. Ryan Lanman. Some content borrowed from Nathan Marz's presentation of a similar name.
  • 2. Objectives: 1. Their Motivation 2. Our Motivation 3. Storm Basics 4. Demo
  • 3. Their Motivation: How Storm Came To Be
  • 4. What They Wanted
      • Guaranteed data processing
      • Horizontal scalability
      • Fault-tolerance
      • No intermediate message brokers!
      • Higher level abstraction than message passing
      • “Just works”
  • 5. Our Motivation: Why We Chose Storm
  • 6. Lumify Ingest (diagram stages)
      • Raw Data
      • Text Extraction
      • Entity Extraction
      • Text Highlighting
      • Location Extraction
      • Full Text Indexing
  • 7. Issues
      • No Reducers
      • High DB Read/Writes
      • Batch-style processing
      • M/R Overhead
      • Zero Fault Tolerance
  • 8. What We Really Wanted
      • Distributed, Stream-type Processing
      • Simple Logical DAG
      • Better Fault Tolerance
  • 9. Storm Ingest Workflow (diagram labels): Documents, Raw Data, Content Sorter, Video, Images, Text Extraction, Video Frame Splitting, Image Text Extraction, Video Frame Text Extraction, Text …
  • 10. Storm Basics: What the heck’s a Topology?
  • 11–14. Storm Cluster (the same diagram repeated on four slides): one Nimbus, three Zookeeper nodes, five Supervisors
  • 15. Storm Data Concepts
      • Tuples
      • Streams
      • Spouts
      • Bolts
      • Topologies
  • 16. Tuples
      • Single unit of data in Storm
      • Examples
          – Tweet
          – User Activity Log Entry
          – File Info
  • 17. Streams: An unbounded sequence of Tuples
  • 18. Spouts: Producers of Streams
  • 19. Bolts: Process input streams to create new streams
  • 20. Examples
      • Spout Examples
          – HDFS Filesystem Spout
          – Kafka Queue Spout
      • Bolt Examples
          – Filtering
          – Aggregation
          – DB Operations
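The tuple, stream, spout, and bolt concepts from slides 15 to 20 can be sketched in plain Java. This is only an in-memory model under stated assumptions, not Storm's API: the names `StreamConceptsSketch`, `Bolt`, `NonEmptyFilterBolt`, and `CountingBolt` are hypothetical, and the "spout" is a finite list replayed in a loop, whereas a real stream is unbounded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of tuples, spouts, and bolts. None of these types are Storm's API. */
final class StreamConceptsSketch {

    /** A tuple: a single unit of data, modelled here as one named field. */
    static Map<String, Object> tuple(String field, Object value) {
        Map<String, Object> t = new LinkedHashMap<String, Object>();
        t.put(field, value);
        return t;
    }

    /** A bolt consumes one input tuple and emits zero or more new tuples. */
    interface Bolt {
        List<Map<String, Object>> execute(Map<String, Object> input);
    }

    /** Filtering bolt: passes a tuple on only if its "line" field is non-empty. */
    static final class NonEmptyFilterBolt implements Bolt {
        public List<Map<String, Object>> execute(Map<String, Object> input) {
            List<Map<String, Object>> out = new ArrayList<Map<String, Object>>();
            String line = (String) input.get("line");
            if (line != null && !line.trim().isEmpty()) {
                out.add(input);
            }
            return out;
        }
    }

    /** Aggregation bolt: keeps a running count and emits it as a new tuple. */
    static final class CountingBolt implements Bolt {
        int count = 0;

        public List<Map<String, Object>> execute(Map<String, Object> input) {
            count++;
            List<Map<String, Object>> out = new ArrayList<Map<String, Object>>();
            out.add(tuple("count", count));
            return out;
        }
    }

    /** Wires spout -> filter bolt -> counting bolt and returns the final count. */
    static int run(List<String> lines) {
        Bolt filter = new NonEmptyFilterBolt();
        CountingBolt counter = new CountingBolt();
        for (String line : lines) {                       // stand-in for a spout
            for (Map<String, Object> kept : filter.execute(tuple("line", line))) {
                counter.execute(kept);
            }
        }
        return counter.count;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("first tweet", "", "second tweet")));
    }
}
```

The filter and counter here correspond to the "Filtering" and "Aggregation" bolt examples on slide 20; a "DB Operations" bolt would follow the same shape but write each tuple to a store instead of counting it.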
  • 21. Topologies (diagram showing three Spouts)
  • 22. Demo
  • 23. Demo Topology: Twitter → Twitter Hosebird Spout → Sentence Splitter → Word Count → Accumulo
  • 24. Demo Topology: Twitter → Twitter Hosebird Spout → (Shuffle Grouping) → Sentence Splitter → (Field Grouping) → Word Count → Accumulo
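The two groupings named on slide 24 differ in how tuples are routed to a bolt's parallel tasks. The sketch below models that routing in plain Java rather than with Storm's API. It assumes shuffle grouping can be approximated by round-robin (Storm distributes randomly but evenly) and that field grouping hashes the field value modulo the task count. The class and method names (`GroupingSketch`, `shuffleTask`, `fieldTask`, `wordCount`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of shuffle vs. field grouping. Not Storm's API. */
final class GroupingSketch {

    /** Shuffle grouping, simplified to round-robin: spreads load, ignores content. */
    static int shuffleTask(int tupleIndex, int numTasks) {
        return tupleIndex % numTasks;
    }

    /** Field grouping: equal field values always map to the same task. */
    static int fieldTask(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    /**
     * Splits each sentence into words, routes every word to a counting task
     * by field grouping, and returns one word-count map per task.
     */
    static List<Map<String, Integer>> wordCount(List<String> sentences, int numTasks) {
        List<Map<String, Integer>> perTask = new ArrayList<Map<String, Integer>>();
        for (int i = 0; i < numTasks; i++) {
            perTask.add(new HashMap<String, Integer>());
        }
        for (String sentence : sentences) {
            for (String word : sentence.toLowerCase().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // The same word always lands on the same task, so its count
                // lives in exactly one map.
                perTask.get(fieldTask(word, numTasks)).merge(word, 1, Integer::sum);
            }
        }
        return perTask;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("storm is fast", "storm scales");
        for (int i = 0; i < sentences.size(); i++) {
            System.out.println("sentence " + i + " -> splitter task " + shuffleTask(i, 2));
        }
        System.out.println(wordCount(sentences, 2));
    }
}
```

The design point is that sentence splitting is stateless, so any task can take any sentence, while word counting is stateful: if the same word could reach two different tasks, each would hold only a partial count.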