Apache Storm
Course Instructor: Dr. Zarifzadeh
Presented By: Pouyan Rezazadeh, Ali Rezaie
Introduction
Hadoop and related technologies have made it possible to store and process data at large scales.
Unfortunately, these data processing technologies are not realtime systems.
Hadoop does batch processing instead of realtime processing.
Processing jobs one by one
Introduction
Batch processing: processing jobs in batches
Batch processing jobs can take hours, e.g. a billing system.
Realtime processing: processing jobs immediately, e.g. an airline system.
Realtime data processing at massive scale is becoming more and more of a requirement for businesses.
The lack of a "Hadoop of realtime" has become the biggest hole in the data processing ecosystem.
There's no hack that will turn Hadoop into a realtime system.
Solution: Apache Storm
A distributed realtime computation system
Open-sourced in 2011
Implemented in Clojure (a dialect of Lisp), with some Java
Advantages
Free, simple and open source
Can be used with any programming language
Very fast
Scalable
Fault-tolerant
Guarantees your data will be processed
Integrates with any database technology
Storm Use Cases
And many others …
Storm vs Hadoop
A Storm cluster is superficially similar to a Hadoop cluster.
Hadoop runs "MapReduce jobs", while Storm runs "topologies".
A MapReduce job eventually finishes, whereas a topology processes messages forever (or until you kill it).
Spouts and Bolts
[Figure: example topology with Spout 1, Spout 2, Bolt 1, Bolt 2, Bolt 3 and Bolt 4]
A stream is an unbounded sequence of tuples.
A spout is a source of streams. For example, a spout may read tuples off of a queue and emit them as a stream.
A bolt consumes any number of input streams, does some processing, and possibly emits new streams.
Each node (spout or bolt) in a Storm topology executes in parallel.
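The stream, spout and bolt ideas can be sketched in plain Java without any Storm dependency. This is only a model under stated assumptions: `StreamSketch`, `cyclingSpout` and `runBolt` are hypothetical names, not part of the Storm API, and the "unbounded" stream is cut off after a fixed number of tuples so the program terminates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Plain-Java model of a spout feeding a bolt; not the Storm API. */
final class StreamSketch {

    /** Hypothetical spout: cycles over a fixed list forever, like an unbounded stream. */
    static Iterator<String> cyclingSpout(final List<String> source) {
        return new Iterator<String>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return !source.isEmpty();
            }

            @Override
            public String next() {
                if (source.isEmpty()) {
                    throw new NoSuchElementException();
                }
                String tuple = source.get(index);
                index = (index + 1) % source.size();
                return tuple;
            }
        };
    }

    /** Hypothetical bolt: transforms each of the first `limit` tuples and emits the results. */
    static <T, R> List<R> runBolt(Iterator<T> stream, Function<T, R> bolt, int limit) {
        List<R> emitted = new ArrayList<>();
        for (int i = 0; i < limit && stream.hasNext(); i++) {
            emitted.add(bolt.apply(stream.next()));
        }
        return emitted;
    }

    public static void main(String[] args) {
        Iterator<String> spout = cyclingSpout(Arrays.asList("a", "b", "c"));
        // The bolt upper-cases each tuple; only five tuples are consumed here.
        System.out.println(runBolt(spout, tuple -> tuple.toUpperCase(), 5)); // prints "[A, B, C, A, B]"
    }
}
```

In a real topology the bolt would run continuously and in parallel with the spout; the `limit` parameter exists only to keep the sketch finite.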
Architecture
A machine in a Storm cluster may run one or more worker processes.
Each topology has one or more worker processes.
Each worker process runs executors (threads) for a specific topology.
Each executor runs one or more tasks of the same component (spout or bolt).
[Figure: a worker process containing executors, each running tasks]
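To make the executor/task relationship concrete, here is a small plain-Java sketch that spreads the tasks of one component over its executors. The round-robin layout and the names `TaskLayoutSketch` and `assign` are assumptions made for illustration; Storm's own scheduler may lay tasks out differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative layout of one component's tasks over its executors; not Storm's scheduler. */
final class TaskLayoutSketch {

    /** Spreads task ids 0..numTasks-1 over numExecutors threads, round-robin. */
    static List<List<Integer>> assign(int numTasks, int numExecutors) {
        if (numExecutors <= 0 || numTasks < numExecutors) {
            throw new IllegalArgumentException("need at least one task per executor");
        }
        List<List<Integer>> executors = new ArrayList<>();
        for (int e = 0; e < numExecutors; e++) {
            executors.add(new ArrayList<Integer>());
        }
        for (int task = 0; task < numTasks; task++) {
            executors.get(task % numExecutors).add(task);
        }
        return executors;
    }

    public static void main(String[] args) {
        // Four tasks of one bolt on two executors: each thread runs two tasks.
        System.out.println(assign(4, 2)); // prints "[[0, 2], [1, 3]]"
    }
}
```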
Architecture
[Figure: cluster diagram with Nimbus, three ZooKeeper nodes and five Supervisors]
Hadoop v1: JobTracker / Storm: Nimbus (only 1)
. distributes code around the cluster
. assigns tasks to machines/supervisors
. failure monitoring
Hadoop v1: TaskTracker / Storm: Supervisor (many)
. listens for work assigned to its machine
. starts and stops worker processes as necessary, based on Nimbus
Storm: ZooKeeper
. coordination between Nimbus and the Supervisors
Architecture
The Nimbus and Supervisor daemons are stateless.
All state is kept in ZooKeeper (or on local disk).
1 ZK instance per machine
When Nimbus or a Supervisor fails, it starts back up like nothing happened.
A topology is submitted with the storm client:
storm jar all-my-code.jar org.apache.storm.MyTopology arg1 arg2
Architecture
A running topology consists of many worker processes spread across many machines.
Topology
[Figure: a topology's tasks spread across two worker processes]
Topology with Tasks in Detail
Shuffle grouping: randomized round-robin
Fields grouping: all tuples with the same field value(s) are always routed to the same task
Direct grouping: the producer of the tuple decides which task of the consumer will receive the tuple
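The three groupings can be modeled in a few lines of plain Java. This is a sketch under assumptions, not Storm's implementation: hashing the field value modulo the task count is one common way to get "same value, same task", and `GroupingSketch`, `Shuffle` and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Plain-Java model of three stream groupings; illustrative only. */
final class GroupingSketch {

    /** Fields grouping: equal field values always map to the same task index. */
    static int fieldsGrouping(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    /** Shuffle grouping modeled as round-robin over a randomly ordered task list. */
    static final class Shuffle {
        private final List<Integer> order = new ArrayList<>();
        private int next = 0;

        Shuffle(int numTasks, long seed) {
            for (int task = 0; task < numTasks; task++) {
                order.add(task);
            }
            Collections.shuffle(order, new Random(seed));
        }

        int chooseTask() {
            int task = order.get(next);
            next = (next + 1) % order.size();
            return task;
        }
    }

    /** Direct grouping: the producer names the consumer task itself. */
    static int directGrouping(int taskChosenByProducer, int numTasks) {
        if (taskChosenByProducer < 0 || taskChosenByProducer >= numTasks) {
            throw new IllegalArgumentException("no such task: " + taskChosenByProducer);
        }
        return taskChosenByProducer;
    }

    public static void main(String[] args) {
        // The same word is always routed to the same task.
        System.out.println(fieldsGrouping("dog", 4) == fieldsGrouping("dog", 4)); // prints "true"
        // Eight tuples over four tasks: every task receives exactly two.
        Shuffle shuffle = new Shuffle(4, 42L);
        for (int i = 0; i < 8; i++) {
            System.out.print(shuffle.chooseTask() + " ");
        }
        System.out.println();
    }
}
```

Fields grouping is what makes a per-word counter correct when the counting bolt runs as several tasks: every occurrence of a word reaches the task that holds that word's count.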
Sample Code for Configuring a Topology
TopologyBuilder topologyBuilder = new TopologyBuilder();
Fault Tolerance
Workers heartbeat back to Nimbus via ZooKeeper.
When a worker dies, the supervisor will restart it.
If the worker continuously fails on startup and is unable to heartbeat to Nimbus, Nimbus will reschedule it.
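Heartbeat-based failure detection can be illustrated with a tiny plain-Java monitor. Everything here is an assumption made for the sketch: the class `HeartbeatSketch`, the worker ids and the 30-second timeout are hypothetical, and the real bookkeeping in Storm goes through ZooKeeper rather than an in-memory map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative heartbeat monitor; not how Nimbus is implemented. */
final class HeartbeatSketch {
    private final Map<String, Long> lastBeatMillis = new HashMap<>();
    private final long timeoutMillis;

    HeartbeatSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records that a worker reported in at the given time. */
    void heartbeat(String workerId, long nowMillis) {
        lastBeatMillis.put(workerId, nowMillis);
    }

    /** Returns, in sorted order, the workers whose last heartbeat is older than the timeout. */
    List<String> timedOut(long nowMillis) {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastBeatMillis.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                dead.add(entry.getKey());
            }
        }
        Collections.sort(dead);
        return dead;
    }

    public static void main(String[] args) {
        HeartbeatSketch monitor = new HeartbeatSketch(30_000L);
        monitor.heartbeat("worker-1", 0L);
        monitor.heartbeat("worker-2", 0L);
        monitor.heartbeat("worker-1", 20_000L); // worker-2 stays silent
        System.out.println(monitor.timedOut(35_000L)); // prints "[worker-2]"
    }
}
```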
Fault Tolerance
If a supervisor node dies, Nimbus will reassign the work to other nodes.
If Nimbus dies, topologies will continue to function normally, but won't be able to perform reassignments.
This is in contrast to Hadoop, where if the JobTracker dies, all the running jobs are lost.
Preferably run ZK with nodes >= 3 so that you can tolerate the failure of 1 ZK server.
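The "nodes >= 3" advice follows from majority quorums: a ZooKeeper ensemble stays available only while more than half of its servers are up. A short arithmetic sketch (the class and method names are hypothetical):

```java
/** Majority-quorum arithmetic for an ensemble of n servers. */
final class QuorumSketch {

    /** Smallest number of servers that forms a strict majority of n. */
    static int quorumSize(int n) {
        return n / 2 + 1;
    }

    /** How many servers may fail while a majority is still running. */
    static int toleratedFailures(int n) {
        return n - quorumSize(n);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " servers: quorum " + quorumSize(n)
                    + ", tolerates " + toleratedFailures(n) + " failure(s)");
        }
    }
}
```

With 3 servers the quorum is 2, so one failure is tolerated. A fourth server raises the quorum to 3 without tolerating any more failures, which is why odd ensemble sizes are the usual choice.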
A Sample Word Count Topology
Sentence Spout -> Split Sentence Bolt -> Word Count Bolt -> Report Bolt
Sentence Spout: { "sentence": "my dog has fleas" }
Split Sentence Bolt: { "word" : "my" } { "word" : "dog" } { "word" : "has" } { "word" : "fleas" }
Word Count Bolt: { "word" : "dog", "count" : 5 }
Report Bolt: prints the contents
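Before looking at Storm classes, the data flow of this topology can be followed in a single-threaded plain-Java sketch. It only mirrors the split, count and report steps; `WordCountSketch` and its methods are hypothetical names and nothing here uses the Storm API or runs in parallel.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Single-threaded model of the split -> count -> report flow. */
final class WordCountSketch {

    /** Split step: one sentence becomes several words. */
    static List<String> split(String sentence) {
        return Arrays.asList(sentence.split(" "));
    }

    /** Count step: keeps a running count per word, sorted by word for reporting. */
    static Map<String, Long> count(List<String> sentences) {
        Map<String, Long> counts = new TreeMap<>();
        for (String sentence : sentences) {
            for (String word : split(sentence)) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Long> counts =
                count(Arrays.asList("my dog has fleas", "the dog ate my homework"));
        // Report step: print every word with its count.
        System.out.println(counts);
        // prints "{ate=1, dog=2, fleas=1, has=1, homework=1, my=2, the=1}"
    }
}
```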
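A Sample Word Count Code: SentenceSpout
    import java.util.Map;
    import org.apache.storm.spout.SpoutOutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichSpout;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Values;

    // Emits a fixed list of sentences over and over as an unbounded stream.
    public class SentenceSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;
        private String[] sentences = {
            "my dog has fleas",
            "i like cold beverages",
            "the dog ate my homework",
            "don't have a cow man",
            "i don't think i like fleas"
        };
        private int index = 0;

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("sentence"));
        }

        public void open(Map config, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
        }

        public void nextTuple() {
            this.collector.emit(new Values(sentences[index]));
            index++;
            if (index >= sentences.length) {
                index = 0; // wrap around so the stream never ends
            }
        }
    }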
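A Sample Word Count Code: SplitSentenceBolt
    import java.util.Map;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    // Splits each incoming sentence on spaces and emits one tuple per word.
    public class SplitSentenceBolt extends BaseRichBolt {
        private OutputCollector collector;

        public void prepare(Map config, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        public void execute(Tuple tuple) {
            String sentence = tuple.getStringByField("sentence");
            String[] words = sentence.split(" ");
            for (String word : words) {
                this.collector.emit(new Values(word));
            }
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }
    }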
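A Sample Word Count Code: WordCountBolt
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    // Keeps a running count per word and emits the updated (word, count) pair.
    public class WordCountBolt extends BaseRichBolt {
        private OutputCollector collector;
        private HashMap<String, Long> counts = null;

        public void prepare(Map config, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.counts = new HashMap<String, Long>();
        }

        public void execute(Tuple tuple) {
            String word = tuple.getStringByField("word");
            Long count = this.counts.get(word);
            if (count == null) {
                count = 0L; // first time this word is seen
            }
            count++;
            this.counts.put(word, count);
            this.collector.emit(new Values(word, count));
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word", "count"));
        }
    }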
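A Sample Word Count Code: ReportBolt
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;

    // Stores the latest count for each word and prints all counts on cleanup.
    public class ReportBolt extends BaseRichBolt {
        private HashMap<String, Long> counts = null;

        public void prepare(Map config, TopologyContext context, OutputCollector collector) {
            this.counts = new HashMap<String, Long>();
        }

        public void execute(Tuple tuple) {
            String word = tuple.getStringByField("word");
            Long count = tuple.getLongByField("count");
            this.counts.put(word, count);
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // this bolt does not emit anything
        }

        public void cleanup() {
            List<String> keys = new ArrayList<String>();
            keys.addAll(this.counts.keySet());
            Collections.sort(keys);
            for (String key : keys) {
                System.out.println(key + " : " + this.counts.get(key));
            }
        }
    }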