An introduction to Apache Storm

A short introduction to Apache Storm: what is it and how does it work?
How can it provide real-time data processing for big data?

Published in: Technology

  1. Apache Storm
     ● What is it?
     ● Architecture
     ● Storm vs Hadoop
     ● History
     ● Terms
     www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  2. Apache Storm – What is it?
     ● A real-time big data processing system
     ● Stream based
     ● Fault tolerant and distributed
     ● Non-persistent
     ● In the Apache Incubator
     ● Written in Clojure and Java
     ● Originally released under the Eclipse Public License; Apache releases use the Apache License 2.0
  3. Apache Storm – Storm vs Hadoop
     Hadoop
     ● Distributed & fault tolerant
     ● Batch / file based
     ● Master/slave plus ZooKeeper
     ● Persistent, uses HDFS
     ● Big data analysis
     Storm
     ● Distributed & fault tolerant
     ● Real-time / stream based
     ● Master/slave plus ZooKeeper
     ● Non-persistent
     ● Big data analysis
  4. Apache Storm – Storm vs Hadoop
     Hadoop versus Storm
     ● They are complementary technologies
     ● They might both be used in a single system
     ● Storm to process real-time streams of data
     ● Hadoop and MapReduce to process batched data on HDFS
  5. Apache Storm – Architecture
     [Diagram: Storm architecture at a high level]
  6. Apache Storm – Architecture
     ● Composed of streams of tuples, processed by bolts
     ● Streams are sourced via spouts
  7. Apache Storm – Architecture
     ● From these components we form topologies
  8. Apache Storm – History
     What is Apache Storm's history?
     ● Developed by BackType
     ● BackType was acquired by Twitter
     ● Open sourced by Twitter in Sept 2011
     ● Added to the Apache Incubator in 2013
  9. Apache Storm – Terms
     ● Tuple – an ordered list of elements
     ● Stream – an unbounded feed of tuples
     ● Spout – like a tap or faucet, a source of streams
     ● Bolt – functions, filters etc. that process streams
     ● Topology – an ETL-like architecture built from spouts, streams and bolts
     ● Nimbus – the master node, like the Hadoop JobTracker
     ● Supervisor – controls worker processes
  10. Contact Us
     ● Feel free to contact us at
       – www.semtech-solutions.co.nz
       – info@semtech-solutions.co.nz
     ● We offer IT project consultancy
     ● We are happy to hear about your problems
     ● You pay only for the hours you need to solve your problems
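
The terms on the "Terms" slide (tuple, stream, spout, bolt, topology) can be illustrated with a small in-process sketch. This is not the Storm API: every function name below is hypothetical, the "stream" is a plain Python generator over a finite list rather than an unbounded feed, and nothing here is distributed or fault tolerant. It only shows how tuples flow from a spout through bolts that are wired together into a topology.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def sentence_spout(sentences: Iterable[str]) -> Iterator[Tuple[str]]:
    """Spout (hypothetical): the source of a stream, emitting one-field tuples."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt (hypothetical): consumes sentence tuples, emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.lower().split():
            yield (word,)


def count_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Bolt (hypothetical): keeps a running count per word and emits
    a (word, count) tuple as soon as each input tuple arrives."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(sentences: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Topology (hypothetical): spout -> split bolt -> count bolt."""
    return count_bolt(split_bolt(sentence_spout(sentences)))


if __name__ == "__main__":
    for tup in run_topology(["the cow jumped", "over the moon"]):
        print(tup)
```

Because each stage is a generator, a result is emitted for every incoming tuple without waiting for the input to end, which is the stream-based behaviour the slides describe. In a real Storm cluster the spouts and bolts would run as tasks in worker processes managed by the Supervisors, with Nimbus as the master node.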

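
The "Storm vs Hadoop" slides contrast batch / file based processing with real-time / stream based processing. A minimal sketch of that difference, using hypothetical names and plain Python rather than either system's API, might look like this: the batch function needs the complete data set before it starts, while the stream counter updates its state as each event arrives.

```python
from collections import Counter
from typing import Iterable


def batch_count(events: Iterable[str]) -> Counter:
    """Batch style (hypothetical helper): the whole data set is read
    first and one final result is produced."""
    return Counter(events)


class StreamCounter:
    """Stream style (hypothetical class): state is held in memory and
    updated per event, so a current answer is always available."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def on_event(self, event: str) -> int:
        """Process a single event and return its count so far."""
        self.counts[event] += 1
        return self.counts[event]


if __name__ == "__main__":
    events = ["login", "click", "login"]
    print(batch_count(events))        # one result, after all events are read
    counter = StreamCounter()
    for event in events:
        print(event, counter.on_event(event))  # a result after every event
```

Both approaches reach the same totals on the same input, which is why the slides describe the two systems as complementary: the stream side answers while data is still arriving, and the batch side processes data already stored.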