Storm - Altamira University Presentation
 


Usage Rights

© All Rights Reserved

    Presentation Transcript

    • Apache Storm: A distributed, real-time computation system. Ryan Lanman. Some content borrowed from Nathan Marz’ presentation of a similar name.
    • Objectives
      1. Their Motivation
      2. Our Motivation
      3. Storm Basics
      4. Demo
    • Their Motivation: How Storm Came To Be
    • What They Wanted
      • Guaranteed data processing
      • Horizontal scalability
      • Fault-tolerance
      • No intermediate message brokers!
      • Higher level abstraction than message passing
      • “Just works”
    • Our Motivation: Why We Chose Storm
    • Lumify Ingest
      • Raw Data
      • Text Extraction
      • Entity Extraction
      • Text Highlighting
      • Location Extraction
      • Full Text Indexing
    • Issues
      • No Reducers
      • High DB Read/Writes
      • Batch-style processing
      • M/R Overhead
      • Zero Fault Tolerance
    • What We Really Wanted
      • Distributed, Stream-type Processing
      • Simple Logical DAG
      • Better Fault Tolerance
    • Storm Ingest Workflow (diagram labels): Documents, Raw Data, Content Sorter, Video, Images, Text Extraction, Video Frame Splitting, Image Text Extraction, Video Frame Text Extraction, Text …
    • Storm Basics: What the heck’s a Topology?
    • Storm Cluster (diagram): one Nimbus, three Zookeeper nodes, five Supervisors
    • Storm Data Concepts
      • Tuples
      • Streams
      • Spouts
      • Bolts
      • Topologies
    • Tuples
      • Single unit of data in Storm
      • Examples
        – Tweet
        – User Activity Log Entry
        – File Info
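As a rough illustration of a tuple as a "single unit of data", the sketch below models a tweet as a small record with named fields. This is plain Python, not the Storm API; the `TweetTuple` class and its field names are hypothetical.

```python
from typing import NamedTuple


class TweetTuple(NamedTuple):
    """Hypothetical tuple schema for a tweet; field names are illustrative only."""
    user: str
    text: str


tweet = TweetTuple(user="alice", text="storm is fast")

# Values can be read by field name or by position, loosely mirroring
# how a Storm tuple is a named list of values.
text_by_name = tweet.text
user_by_index = tweet[0]
```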
    • Streams: An unbounded sequence of Tuples
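One way to picture an unbounded sequence is a generator that never finishes. The sketch below is an assumption-laden stand-in in plain Python (the `sensor_stream` name and its payload are invented for illustration); a consumer takes tuples as they arrive instead of waiting for an end.

```python
import itertools
from typing import Iterator, Tuple


def sensor_stream() -> Iterator[Tuple[int, float]]:
    """Hypothetical unbounded stream: yields (sequence_number, reading) forever."""
    for n in itertools.count():
        yield (n, n * 0.5)


# The stream has no end, so we only ever look at a finite window of it.
first_three = list(itertools.islice(sensor_stream(), 3))
```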
    • Spouts: Producers of Streams
    • Bolts: Process input streams to create new streams
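The spout and bolt roles can be sketched with generators: a spout produces a stream, and a bolt consumes one stream and emits another. This is a minimal single-process illustration, not Storm code; `sentence_spout` and `split_bolt` are hypothetical names.

```python
from typing import Iterable, Iterator


def sentence_spout(sentences: Iterable[str]) -> Iterator[str]:
    """Stand-in spout: emits one sentence per tuple from a fixed source."""
    for sentence in sentences:
        yield sentence


def split_bolt(stream: Iterable[str]) -> Iterator[str]:
    """Stand-in bolt: reads sentences and emits a new stream of words."""
    for sentence in stream:
        for word in sentence.split():
            yield word


words = list(split_bolt(sentence_spout(["the quick fox", "the dog"])))
```

A real spout would read from an external source such as a queue, and would keep emitting for as long as the topology runs.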
    • Examples
      • Spout Examples
        – HDFS Filesystem Spout
        – Kafka Queue Spout
      • Bolt Examples
        – Filtering
        – Aggregation
        – DB Operations
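To make the filtering and aggregation bolt examples concrete, here is a small sketch in plain Python. The function names and the `min_length` parameter are assumptions for illustration; the point is that a filtering bolt may emit nothing for an input tuple, while an aggregating bolt keeps running state.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def filter_bolt(stream: Iterable[str], min_length: int) -> Iterator[str]:
    """Filtering: drop words shorter than min_length, pass the rest through."""
    for word in stream:
        if len(word) >= min_length:
            yield word


def count_bolt(stream: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Aggregation: keep a running count per word and emit each update."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield (word, counts[word])


updates = list(count_bolt(filter_bolt(["a", "storm", "is", "storm"], min_length=3)))
```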
    • Topologies (diagram showing three Spouts)
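A topology wires spouts and bolts into a graph. The toy runner below is a hedged, single-process sketch of that idea only; the `Topology` class, its methods, and the component names are all invented here and do not correspond to Storm's builder API.

```python
from collections import defaultdict, deque


class Topology:
    """Toy, single-process stand-in for a topology (not the Storm API)."""

    def __init__(self):
        self.bolts = {}                      # bolt name -> function(tuple) -> iterable of tuples
        self.downstream = defaultdict(list)  # component name -> names of subscribed bolts

    def add_bolt(self, name, fn, source):
        """Register a bolt and subscribe it to the output of `source`."""
        self.bolts[name] = fn
        self.downstream[source].append(name)

    def run(self, spout_name, spout_tuples):
        """Push every spout tuple through the graph; return what each bolt emitted."""
        emitted = defaultdict(list)
        queue = deque((spout_name, t) for t in spout_tuples)
        while queue:
            source, tup = queue.popleft()
            for bolt_name in self.downstream[source]:
                for out in self.bolts[bolt_name](tup):
                    emitted[bolt_name].append(out)
                    queue.append((bolt_name, out))
        return emitted


topology = Topology()
topology.add_bolt("split", lambda s: s.split(), source="sentences")
topology.add_bolt("upper", lambda w: [w.upper()], source="split")
topology.add_bolt("length", lambda w: [len(w)], source="split")

result = topology.run("sentences", ["hello storm"])
```

Here one stream ("split") feeds two bolts, which is what makes the structure a graph and not just a pipeline.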
    • Demo
    • Demo Topology: Twitter → Twitter Hosebird Spout → Sentence Splitter → Word Count → Accumulo
    • Demo Topology (with groupings): Twitter → Twitter Hosebird Spout → (Shuffle Grouping) → Sentence Splitter → (Field Grouping) → Word Count → Accumulo
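The two groupings on this slide decide which task of the next bolt receives each tuple. The sketch below simulates that routing in plain Python under stated assumptions: three tasks per bolt (`NUM_TASKS` is hypothetical), a seeded random choice standing in for shuffle grouping, and a CRC32 hash of the word standing in for grouping by field. It is not the demo's code and does not use the Storm API.

```python
import random
import zlib
from collections import Counter

NUM_TASKS = 3  # hypothetical parallelism for each bolt


def shuffle_grouping(rng: random.Random) -> int:
    """Shuffle grouping stand-in: any task may receive the tuple."""
    return rng.randrange(NUM_TASKS)


def fields_grouping(word: str) -> int:
    """Field grouping stand-in: the same word always maps to the same task."""
    return zlib.crc32(word.encode("utf-8")) % NUM_TASKS


rng = random.Random(0)
sentences = ["storm is fast", "storm is fun"]

# Sentences are spread over the splitter tasks with no regard to content.
splitter_inbox = [[] for _ in range(NUM_TASKS)]
for sentence in sentences:
    splitter_inbox[shuffle_grouping(rng)].append(sentence)

# Words are routed by value, so each counter task owns a disjoint set of words.
counters = [Counter() for _ in range(NUM_TASKS)]
for inbox in splitter_inbox:
    for sentence in inbox:
        for word in sentence.split():
            counters[fields_grouping(word)][word] += 1

total = sum(counters, Counter())
```

Routing by word is what lets each Word Count task keep a correct local count: no two tasks ever see the same word, so no cross-task merge is needed before writing results out.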