Page1
Developing Java Streaming Applications
with Apache Storm
Lester Martin www.ajug.org - Nov 2017
Page2
Connection before Content
Lester Martin – Hadoop/Spark/Storm Trainer & Consultant
lester.martin@gmail.com
http://lester.website (links to blog, twitter, github, LI, FB, etc)
Page3
Agenda – Needs Updating!!!!
• What is Storm?
• Conceptual Model
• Compile Time
• DEMO: Develop Word Count Topology
• Runtime
• DEMO: Submit Word Count Topology
• Additional Features
• DEMO: Kafka > Storm > HBase Topology in Local Cluster
Page4
What is Storm?
Page5
Storm is …
→ Streaming
– Key enabler of the Lambda Architecture
→ Fast
– Clocked at 1M+ messages per second per node
→ Scalable
– Thousands of workers per cluster
→ Fault Tolerant
– Failure is expected, and embraced
→ Reliable
– Guaranteed message delivery
– Exactly-once semantics (via Trident)
Page6
Storm in the Lambda Architecture
[Diagram labels: Hadoop (persists data, batch processing, batch feeds); Storm (real-time data feeds); update event models; pattern templates, key-performance indicators, and alerts; dashboards and applications]
Page7
Conceptual Model
Page8
TUPLE
{…}
Page9
Tuple
→ Unit of work to be processed
→ Immutable ordered set of serializable values
→ Each field must have an assigned name
{…}
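The three properties above can be illustrated with a small plain-Java sketch. It does not use Storm's own tuple classes; SketchTuple and its methods are hypothetical names chosen only for illustration, and the sketch assumes values are supplied as simple lists.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a tuple (not a Storm class): an immutable,
// ordered list of values in which every position has an assigned field name.
final class SketchTuple {
    private final List<String> fieldNames;
    private final List<Object> values;

    SketchTuple(List<String> fieldNames, List<Object> values) {
        if (fieldNames.size() != values.size()) {
            throw new IllegalArgumentException("every value needs a field name");
        }
        // Defensive copies keep the tuple immutable after construction.
        this.fieldNames = Collections.unmodifiableList(new ArrayList<>(fieldNames));
        this.values = Collections.unmodifiableList(new ArrayList<>(values));
    }

    // Positional access, relying on the fixed order of the values.
    Object getValue(int index) {
        return values.get(index);
    }

    // Access by assigned field name.
    Object getValueByField(String name) {
        int index = fieldNames.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("unknown field: " + name);
        }
        return values.get(index);
    }

    int size() {
        return values.size();
    }
}
```

A tuple built with the field names "word" and "count" can then be read either by position or by name.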
Page10
Stream
→ Core abstraction of Storm
→ Unbounded sequence of Tuples
{…} {…} {…} {…} {…} {…} {…}
Page11
SPOUT
Page12
Spout
→ Source of Streams
→ Wraps an event source and emits Tuples
Page13
Message Queues
Message queues are often the source of the data processed by Storm.
Storm Spouts integrate with many types of message queues.
[Diagram: real-time data sources (operating systems, services and applications, sensors) send log entries, events, errors, status messages, etc. to a message queue (Kestrel, RabbitMQ, AMQP, Kafka, JMS, others…); data from the queue is read by Storm]
Page14
BOLT
Page15
Bolt
→ Core unit of computation
→ Receive Tuples and do stuff
→ Optionally, emit additional Tuples
Page16
Bolt
→ Write to a data store
Page17
Bolt
→ Read from a data store
Page18
Bolt
→ Perform arbitrary computation
Page19
Bolt
→ (Optionally) Emit additional Stream(s)
Page20
TOPOLOGY
Page21
Topology
→ DAG of Spouts and Bolts
→ Data Flow Representation
→ Streaming Computation
Page22
Topology
→ Storm executes Spouts and Bolts as Tasks that run in parallel on multiple machines
Page23
Parallel Execution of Topology Components
[Diagram: a logical topology (spout A, bolt A, bolt B, bolt C) and a physical implementation spread across machines A through G: spout A as two tasks, bolt A as two tasks, bolt B as two tasks, bolt C as one task]
Page24
Stream Groupings
Stream Groupings determine how Storm routes Tuples between Tasks.
Grouping Type    Routing Behavior
Shuffle          Randomized round-robin (evenly distributes load to downstream Bolts)
Fields           Ensures all Tuples with the same Field value(s) are always routed to the same Task
All              Replicates the Stream across all of the Bolt's Tasks (use with care)
Other options    Including custom roll-your-own (RYO) grouping logic
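The routing behaviors in the table can be sketched in plain Java. This is only an illustration of the descriptions above, not Storm's implementation: GroupingSketch and its members are hypothetical names, and the hash-modulo choice for the Fields grouping is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical routing helpers; these are not Storm classes.
final class GroupingSketch {
    // Fields grouping: the same key always maps to the same task index.
    static int fieldsGrouping(Object key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Shuffle grouping: visit the tasks in a randomized order, one tuple each,
    // and reshuffle the order once every task has received a tuple.
    static final class ShuffleGrouping {
        private final List<Integer> order = new ArrayList<>();
        private int next = 0;

        ShuffleGrouping(int numTasks) {
            for (int i = 0; i < numTasks; i++) {
                order.add(i);
            }
            Collections.shuffle(order);
        }

        int chooseTask() {
            if (next == order.size()) {
                Collections.shuffle(order);
                next = 0;
            }
            return order.get(next++);
        }
    }

    // All grouping: every task receives a copy of the tuple.
    static List<Integer> allGrouping(int numTasks) {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            targets.add(i);
        }
        return targets;
    }
}
```

With a Fields grouping on a "word" field, every occurrence of the same word lands on the same task, which is what lets a counting bolt keep a correct per-word total in local memory.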
Page25
Compile Time
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("sentence"));
}
Page26
Example Spout Code (1 of 2)
import java.util.Map;
import java.util.Random;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class RandomSentenceSpout extends BaseRichSpout {
    SpoutOutputCollector _collector;
    Random _rand;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        _collector = collector;
        _rand = new Random();
    }

    @Override
    public void nextTuple() {
        Utils.sleep(100);
        String[] sentences = new String[]{
            "the cow jumped over the moon",
            "an apple a day keeps the doctor away",
            "four score and seven years ago",
            "snow white and the seven dwarfs",
            "i am at two with nature" };
        // Pick one sentence at random and emit it as a single-field tuple.
        String sentence = sentences[_rand.nextInt(sentences.length)];
        _collector.emit(new Values(sentence));
    }
Continued next page…
RandomSentenceSpout is the name of the spout class. BaseRichSpout is the Storm spout class used as a "template".
Storm uses open to open the spout and provide it with its configuration, a context object providing information about components in the topology, and an output collector used to emit tuples.
Storm uses nextTuple to request that the spout emit the next tuple.
The spout uses emit to send a tuple to one or more bolts.
Page27
Example Spout Code (2 of 2)
    @Override
    public void ack(Object id) {
    }

    @Override
    public void fail(Object id) {
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("sentence"));
    }
}
Storm calls the spout's ack method to signal that a tuple has been fully processed.
Storm calls the spout's fail method to signal that a tuple has not been fully processed.
The declareOutputFields method names the fields in a tuple.
Page28
Example Bolt Code
public static class ExclamationBolt extends BaseRichBolt {
    OutputCollector _collector;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        _collector = collector;
    }

    public void execute(Tuple tuple) {
        // Emit a new tuple anchored to the input tuple, then acknowledge the input.
        _collector.emit(tuple, new Values(tuple.getString(0) + "!!!"));
        _collector.ack(tuple);
    }

    public void cleanup() {
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
ExclamationBolt is the name of the bolt class. BaseRichBolt is the bolt class used as a "template."
The prepare method provides the bolt with its configuration and an OutputCollector used to emit tuples.
The execute method receives a tuple from a stream and emits a new tuple. It also calls the collector's ack method, which is used to acknowledge the input tuple after it has been processed successfully.
The cleanup method releases system resources when the bolt is shut down.
The declareOutputFields method names the fields in the output tuples. More detail later.
Page29
Example Topology Code
public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("words", new TestWordSpout());                                    // runs code in TestWordSpout()
    builder.setBolt("exclaim1", new NewExclamationBolt()).shuffleGrouping("words");    // runs code in NewExclamationBolt()
    builder.setBolt("exclaim2", new NewExclamationBolt()).shuffleGrouping("exclaim1"); // runs code in NewExclamationBolt()
    Config conf = new Config();
    StormSubmitter.submitTopology("add-exclamation", conf, builder.createTopology());
}
This code builds this Topology:
words → (shuffleGrouping) → exclaim1 → (shuffleGrouping) → exclaim2
Page30
DEMO
Develop Word Count Topology
Page31
Runtime
Nimbus
Supervisor
Supervisor
Supervisor
Supervisor
Page32
Physical View
Page33
Topology Deployment
Topology Submitter uploads topology:
• topology.jar
• topology.ser
• conf.ser
Page34
Topology Deployment
Nimbus calculates assignments and sends them to ZooKeeper
Page35
Topology Deployment
Supervisor nodes receive assignment information via ZooKeeper watches
Page36
Topology Deployment
Supervisor nodes download topology from Nimbus:
• topology.jar
• topology.ser
• conf.ser
Page37
Topology Deployment
Supervisors spawn workers (JVM processes)
Page38
DEMO
Submit Word Count Topology to Storm
Page39
Additional Features
Page40
Local Versus Distributed Storm Clusters
The topology code submitted to Storm with the storm jar command differs depending on whether it targets local mode or a distributed cluster.
The submitTopology method is used in both cases.
• The difference is the class that contains the submitTopology method.
Local mode:
Config conf = new Config();
LocalCluster cluster = new LocalCluster();              // Instantiate a local cluster object.
cluster.submitTopology("mytopology", conf, topology);   // Submit a topology to a local cluster.
Distributed cluster:
Config conf = new Config();
StormSubmitter.submitTopology("mytopology", conf, topology);   // Submit a topology to a distributed cluster.
Same method name, different classes.
Page41
Reliable Processing
Bolts may emit Tuples Anchored to one received.
Tuple “B” is a descendant of Tuple “A”
Page42
Reliable Processing
Multiple Anchorings form a Tuple tree
(bolts not shown)
Page43
Reliable Processing
Bolts can Acknowledge that a tuple
has been processed successfully.
ACK
Page44
Reliable Processing
Bolts can also Fail a tuple to trigger a spout to
replay the original.
FAIL
Page45
Reliable Processing
Any failure in the Tuple tree will trigger a
replay of the original tuple
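The anchoring, ack, and fail rules on the preceding slides can be modeled with a short plain-Java sketch. It is not Storm's acker and does not use the Storm API; TupleTreeSketch, its string ids, and its State enum are hypothetical names introduced only to show how one failure anywhere in the tree marks the original tuple for replay.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker for one spout tuple and its descendants.
// It only illustrates the ack/fail rules; it is not Storm's acker.
final class TupleTreeSketch {
    enum State { PENDING, FULLY_PROCESSED, FAILED }

    private final Set<String> pending = new HashSet<>();
    private State state = State.PENDING;

    TupleTreeSketch(String rootId) {
        pending.add(rootId);
    }

    // A bolt emits a new tuple anchored to one it received.
    void anchor(String parentId, String childId) {
        if (state == State.PENDING && pending.contains(parentId)) {
            pending.add(childId);
        }
    }

    // A bolt acknowledges a tuple; the tree completes when nothing is pending.
    void ack(String id) {
        if (state != State.PENDING) {
            return;
        }
        pending.remove(id);
        if (pending.isEmpty()) {
            state = State.FULLY_PROCESSED;
        }
    }

    // Any failure in the tree fails the whole tree, so the root would be replayed.
    void fail(String id) {
        if (state == State.PENDING && pending.contains(id)) {
            state = State.FAILED;
        }
    }

    State state() {
        return state;
    }
}
```

In this model a bolt must anchor its output before acking its input, which matches the order used in the ExclamationBolt example (emit, then ack).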
Page46
More Stuff
→ Topology description/deployment options
– Flux
– Storm SQL
→ Polyglot development
→ Micro-batching with Trident
→ Fault tolerance & deployment isolation
→ Integrations
– Messaging: Kafka, Redis, Kestrel, Kinesis, MQTT, JMS
– Databases: HBase, Hive, Druid, Cassandra, MongoDB, JDBC
– Search Engines: Solr, Elasticsearch
– HDFS
– And more!
Page47
DEMO
Kafka > Storm > HBase Topology in a Local Cluster
Page48
Kafka > Storm > HBase Example
Requirements:
• Land simulated server logs into Kafka
• Configure a Kafka Spout to consume the server log messages
• Ignore all messages that are not either WARN or ERROR
• Persist WARN and ERROR messages into HBase
– Keep 10 most recent messages for each server
– Maintain a running total of these concerning messages
• Publish these messages back to Kafka
[Diagram components: Kafka, Parse, Filter, HBase, Kafka]
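The filtering and bookkeeping requirements above can be sketched in plain Java, independent of Kafka, Storm, and HBase. LogFilterSketch is a hypothetical in-memory stand-in: it assumes each message arrives already parsed into a server name, a level, and a text, and it keeps its state in a map where the demo would use HBase.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the filter and persist steps.
final class LogFilterSketch {
    private static final int MAX_RECENT = 10;
    private final Map<String, Deque<String>> recentByServer = new HashMap<>();
    private long runningTotal = 0;

    // Returns true when the message is kept (WARN or ERROR), false when ignored.
    boolean process(String server, String level, String message) {
        if (!"WARN".equals(level) && !"ERROR".equals(level)) {
            return false;
        }
        Deque<String> recent = recentByServer.computeIfAbsent(server, s -> new ArrayDeque<>());
        recent.addFirst(level + " " + message); // newest first
        if (recent.size() > MAX_RECENT) {
            recent.removeLast();                // drop the oldest
        }
        runningTotal++;
        return true;
    }

    // The most recent kept messages for one server, newest first.
    List<String> recent(String server) {
        Deque<String> recent = recentByServer.get(server);
        return recent == null ? new ArrayList<String>() : new ArrayList<String>(recent);
    }

    long runningTotal() {
        return runningTotal;
    }
}
```

In a topology, the level check would sit in a filter bolt and the per-server bookkeeping in the bolt that writes to HBase; a Fields grouping on the server name would keep each server's messages on one task.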
Page49
Questions?
Lester Martin – Hadoop/Spark/Storm Trainer & Consultant
lester.martin@gmail.com
http://lester.website (links to blog, twitter, github, LI, FB, etc)
THANKS FOR YOUR TIME!!

More Related Content

What's hot

Rapport PFE ISMAGI SQLI Microsoft
Rapport PFE ISMAGI SQLI MicrosoftRapport PFE ISMAGI SQLI Microsoft
Rapport PFE ISMAGI SQLI MicrosoftOussama BAHLOULI
 
Rapport d'une application mobile de recommendation de livres
Rapport d'une application mobile de recommendation de livresRapport d'une application mobile de recommendation de livres
Rapport d'une application mobile de recommendation de livreskaies Labiedh
 
Rapport de Stage PFE - Développement d'un Projet ALTEN MAROC Concernant le Sy...
Rapport de Stage PFE - Développement d'un Projet ALTEN MAROC Concernant le Sy...Rapport de Stage PFE - Développement d'un Projet ALTEN MAROC Concernant le Sy...
Rapport de Stage PFE - Développement d'un Projet ALTEN MAROC Concernant le Sy...tayebbousfiha1
 
rapport PFE ingénieur génie logiciel INSAT
rapport PFE ingénieur génie logiciel INSATrapport PFE ingénieur génie logiciel INSAT
rapport PFE ingénieur génie logiciel INSATSiwar GUEMRI
 
Livre Blanc Cloud Computing / Sécurité
Livre Blanc Cloud Computing / Sécurité Livre Blanc Cloud Computing / Sécurité
Livre Blanc Cloud Computing / Sécurité Syntec Numérique
 
Rapport de projet de fin d"études
Rapport de projet de fin d"étudesRapport de projet de fin d"études
Rapport de projet de fin d"étudesMohamed Boubaya
 
presentation projet domotique
presentation projet domotiquepresentation projet domotique
presentation projet domotiquets4riadhoc
 
Reconnaissance faciale
Reconnaissance facialeReconnaissance faciale
Reconnaissance facialeAymen Fodda
 
exercices base de données - sql
exercices  base de données - sql exercices  base de données - sql
exercices base de données - sql Yassine Badri
 
Rapport application web (Spring BOOT,angular4) et mobile(ionc3) gestion des a...
Rapport application web (Spring BOOT,angular4) et mobile(ionc3) gestion des a...Rapport application web (Spring BOOT,angular4) et mobile(ionc3) gestion des a...
Rapport application web (Spring BOOT,angular4) et mobile(ionc3) gestion des a...MOHAMMED MOURADI
 
Rapport PFE | Remitec | Automatisation d'une installation de production des e...
Rapport PFE | Remitec | Automatisation d'une installation de production des e...Rapport PFE | Remitec | Automatisation d'une installation de production des e...
Rapport PFE | Remitec | Automatisation d'une installation de production des e...Zouhair Boufakri
 
Mémoire : Cloud iaas Slim Hannachi
Mémoire :  Cloud iaas Slim HannachiMémoire :  Cloud iaas Slim Hannachi
Mémoire : Cloud iaas Slim Hannachislim Hannachi
 
Rapport de stage boite à idées innovantes avec dashboard
Rapport de stage boite à idées innovantes avec dashboardRapport de stage boite à idées innovantes avec dashboard
Rapport de stage boite à idées innovantes avec dashboardSiwar GUEMRI
 
RapportPFE_IngenieurInformatique_ESPRIT
RapportPFE_IngenieurInformatique_ESPRITRapportPFE_IngenieurInformatique_ESPRIT
RapportPFE_IngenieurInformatique_ESPRITLina Meddeb
 
Chatbot arabe-dialectale-covid19
Chatbot arabe-dialectale-covid19Chatbot arabe-dialectale-covid19
Chatbot arabe-dialectale-covid19othmanakka
 
nombres aléatoires en langage C
nombres aléatoires en langage Cnombres aléatoires en langage C
nombres aléatoires en langage Cmohamednacim
 
Mise en place d’une application mobile de géolocalisation
Mise en place d’une application mobile de géolocalisationMise en place d’une application mobile de géolocalisation
Mise en place d’une application mobile de géolocalisationCléa Aurianne Leencé BAWE
 
Deploiement solution_ha_de_stockage_ceph_sous_une_plateforme_virtualisee_vsph...
Deploiement solution_ha_de_stockage_ceph_sous_une_plateforme_virtualisee_vsph...Deploiement solution_ha_de_stockage_ceph_sous_une_plateforme_virtualisee_vsph...
Deploiement solution_ha_de_stockage_ceph_sous_une_plateforme_virtualisee_vsph...Abdelmadjid Djebbari
 

What's hot (20)

Rapport PFE ISMAGI SQLI Microsoft
Rapport PFE ISMAGI SQLI MicrosoftRapport PFE ISMAGI SQLI Microsoft
Rapport PFE ISMAGI SQLI Microsoft
 
Rapport d'une application mobile de recommendation de livres
Rapport d'une application mobile de recommendation de livresRapport d'une application mobile de recommendation de livres
Rapport d'une application mobile de recommendation de livres
 
Rapport de Stage PFE - Développement d'un Projet ALTEN MAROC Concernant le Sy...
Rapport de Stage PFE - Développement d'un Projet ALTEN MAROC Concernant le Sy...Rapport de Stage PFE - Développement d'un Projet ALTEN MAROC Concernant le Sy...
Rapport de Stage PFE - Développement d'un Projet ALTEN MAROC Concernant le Sy...
 
rapport PFE ingénieur génie logiciel INSAT
rapport PFE ingénieur génie logiciel INSATrapport PFE ingénieur génie logiciel INSAT
rapport PFE ingénieur génie logiciel INSAT
 
Livre Blanc Cloud Computing / Sécurité
Livre Blanc Cloud Computing / Sécurité Livre Blanc Cloud Computing / Sécurité
Livre Blanc Cloud Computing / Sécurité
 
Rapport de projet de fin d"études
Rapport de projet de fin d"étudesRapport de projet de fin d"études
Rapport de projet de fin d"études
 
presentation projet domotique
presentation projet domotiquepresentation projet domotique
presentation projet domotique
 
Reconnaissance faciale
Reconnaissance facialeReconnaissance faciale
Reconnaissance faciale
 
exercices base de données - sql
exercices  base de données - sql exercices  base de données - sql
exercices base de données - sql
 
Rapport application web (Spring BOOT,angular4) et mobile(ionc3) gestion des a...
Rapport application web (Spring BOOT,angular4) et mobile(ionc3) gestion des a...Rapport application web (Spring BOOT,angular4) et mobile(ionc3) gestion des a...
Rapport application web (Spring BOOT,angular4) et mobile(ionc3) gestion des a...
 
Rapport PFE | Remitec | Automatisation d'une installation de production des e...
Rapport PFE | Remitec | Automatisation d'une installation de production des e...Rapport PFE | Remitec | Automatisation d'une installation de production des e...
Rapport PFE | Remitec | Automatisation d'une installation de production des e...
 
Rapport pfa
Rapport pfaRapport pfa
Rapport pfa
 
Mémoire : Cloud iaas Slim Hannachi
Mémoire :  Cloud iaas Slim HannachiMémoire :  Cloud iaas Slim Hannachi
Mémoire : Cloud iaas Slim Hannachi
 
Rapport de stage boite à idées innovantes avec dashboard
Rapport de stage boite à idées innovantes avec dashboardRapport de stage boite à idées innovantes avec dashboard
Rapport de stage boite à idées innovantes avec dashboard
 
RapportPFE_IngenieurInformatique_ESPRIT
RapportPFE_IngenieurInformatique_ESPRITRapportPFE_IngenieurInformatique_ESPRIT
RapportPFE_IngenieurInformatique_ESPRIT
 
Chatbot arabe-dialectale-covid19
Chatbot arabe-dialectale-covid19Chatbot arabe-dialectale-covid19
Chatbot arabe-dialectale-covid19
 
nombres aléatoires en langage C
nombres aléatoires en langage Cnombres aléatoires en langage C
nombres aléatoires en langage C
 
Mise en place d’une application mobile de géolocalisation
Mise en place d’une application mobile de géolocalisationMise en place d’une application mobile de géolocalisation
Mise en place d’une application mobile de géolocalisation
 
PFE .NET CRM
PFE .NET CRMPFE .NET CRM
PFE .NET CRM
 
Deploiement solution_ha_de_stockage_ceph_sous_une_plateforme_virtualisee_vsph...
Deploiement solution_ha_de_stockage_ceph_sous_une_plateforme_virtualisee_vsph...Deploiement solution_ha_de_stockage_ceph_sous_une_plateforme_virtualisee_vsph...
Deploiement solution_ha_de_stockage_ceph_sous_une_plateforme_virtualisee_vsph...
 

Similar to Developing Java Streaming Applications with Apache Storm

Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormDavorin Vukelic
 
Distributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache StormDistributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache Stormthe100rabh
 
Real-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpacesReal-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpacesOleksii Diagiliev
 
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormDataStax
 
Java design patterns
Java design patternsJava design patterns
Java design patternsShawn Brito
 
NET Systems Programming Learned the Hard Way.pptx
NET Systems Programming Learned the Hard Way.pptxNET Systems Programming Learned the Hard Way.pptx
NET Systems Programming Learned the Hard Way.pptxpetabridge
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache StormP. Taylor Goetz
 
Real time and reliable processing with Apache Storm
Real time and reliable processing with Apache StormReal time and reliable processing with Apache Storm
Real time and reliable processing with Apache StormAndrea Iacono
 
Continuations in scala (incomplete version)
Continuations in scala (incomplete version)Continuations in scala (incomplete version)
Continuations in scala (incomplete version)Fuqiang Wang
 
The GO Language : From Beginners to Gophers
The GO Language : From Beginners to GophersThe GO Language : From Beginners to Gophers
The GO Language : From Beginners to GophersAlessandro Sanino
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 

Similar to Developing Java Streaming Applications with Apache Storm (20)

Apache Storm Tutorial
Apache Storm TutorialApache Storm Tutorial
Apache Storm Tutorial
 
Storm
StormStorm
Storm
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
 
Distributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache StormDistributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache Storm
 
Storm is coming
Storm is comingStorm is coming
Storm is coming
 
Real-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpacesReal-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpaces
 
Storm
StormStorm
Storm
 
Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
 
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
 
Storm 0.8.2
Storm 0.8.2Storm 0.8.2
Storm 0.8.2
 
Java design patterns
Java design patternsJava design patterns
Java design patterns
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
NET Systems Programming Learned the Hard Way.pptx
NET Systems Programming Learned the Hard Way.pptxNET Systems Programming Learned the Hard Way.pptx
NET Systems Programming Learned the Hard Way.pptx
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
Real time and reliable processing with Apache Storm
Real time and reliable processing with Apache StormReal time and reliable processing with Apache Storm
Real time and reliable processing with Apache Storm
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
Continuations in scala (incomplete version)
Continuations in scala (incomplete version)Continuations in scala (incomplete version)
Continuations in scala (incomplete version)
 
The GO Language : From Beginners to Gophers
The GO Language : From Beginners to GophersThe GO Language : From Beginners to Gophers
The GO Language : From Beginners to Gophers
 
Storm
StormStorm
Storm
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 

Recently uploaded

2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group MeetingAlison Pitt
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxDilipVasan
 
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdfGenerative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdfEmmanuel Dauda
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理pyhepag
 
Fuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertaintyFuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertaintyRafigAliyev2
 
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理pyhepag
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictJack Cole
 
2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Calllward7
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理pyhepag
 
Easy and simple project file on mp online
Easy and simple project file on mp onlineEasy and simple project file on mp online
Easy and simple project file on mp onlinebalibahu1313
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理cyebo
 
Data analytics courses in Nepal Presentation
Data analytics courses in Nepal PresentationData analytics courses in Nepal Presentation
Data analytics courses in Nepal Presentationanshikakulshreshtha11
 
一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理cyebo
 
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理pyhepag
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxStephen266013
 
Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Jon Hansen
 
How I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prisonHow I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prisonPayment Village
 
Artificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfArtificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfscitechtalktv
 
AI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdfAI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdfMichaelSenkow
 

Recently uploaded (20)

2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdfGenerative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
Fuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertaintyFuzzy Sets decision making under information of uncertainty
Fuzzy Sets decision making under information of uncertainty
 
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
 
2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
 
Easy and simple project file on mp online
Easy and simple project file on mp onlineEasy and simple project file on mp online
Easy and simple project file on mp online
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
 
Data analytics courses in Nepal Presentation
Data analytics courses in Nepal PresentationData analytics courses in Nepal Presentation
Data analytics courses in Nepal Presentation
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
 
一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理一比一原版麦考瑞大学毕业证成绩单如何办理
一比一原版麦考瑞大学毕业证成绩单如何办理
 
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptx
 
Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)
 
How I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prisonHow I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prison
 
Artificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfArtificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdf
 
AI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdfAI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdf
 

Developing Java Streaming Applications with Apache Storm

  • 1. Page1 Developing Java Streaming Applications with Apache Storm Lester Martin www.ajug.org - Nov 2017
  • 2. Page2 Connection before Content Lester Martin – Hadoop/Spark/Storm Trainer & Consultant lester.martin@gmail.com http://lester.website (links to blog, twitter, github, LI, FB, etc)
  • 3. Page3 Agenda – Needs Updating!!!! • What is Storm? • Conceptual Model • Compile Time • DEMO: Develop Word Count Topology • Runtime • DEMO: Submit Word Count Topology • Additional Features • DEMO: Kafka > Storm > HBase Topology in Local Cluster
  • 5. Page5 Storm is … à Streaming – Key enabler of the Lambda Architecture à Fast – Clocked at 1M+ messages per second per node à Scalable – Thousands of workers per cluster à Fault Tolerant – Failure is expected, and embraced à Reliable – Guaranteed message delivery – Exactly-once semantics
  • 6. Page6 Storm in the Lambda Architecture persists data Hadoop batch processing batch feeds Update event models Pattern templates, key- performance indicators, and alerts Dashboards and Applications Stormreal-time data feeds
  • 9. Page9 Tuple à Unit of work to be processes à Immutable ordered set of serializable values à Fields must have assigned name {…}
  • 10. Page10 Stream à Core abstraction of Storm à Unbounded sequence of Tuples {…} {…} {…} {…} {…} {…} {…}
  • 12. Page12 Spout à Source of Streams à Wrap an event source and emit Tuples
  • 13. Page13 Message Queues Message queues are often the source of the data processed by Storm Storm Spouts integrate with many types of message queues real-time data source operating systems, services and applications, sensors Kestrel, RabbitMQ, AMQP, Kafka, JMS, others… message queue log entries, events, errors, status messages, etc. Storm data from queue is read by Storm
  • 15. Page15 Bolt à Core unit of computation à Receive Tuples and do stuff à Optionally, emit additional Tuples
  • 16. Page16 Bolt à Write to a data store
  • 17. Page17 Bolt à Read from a data store
  • 19. Page19 Bolt à (Optionally) Emit additional Stream(s)
  • 21. Page21 Topology à DAG of Spouts and Bolts à Data Flow Representation à Streaming Computation
  • 22. Page22 Topology à Storm executes Spouts and Bolts as Tasks that run in parallel on multiple machines
  • 23. Page23 Parallel Execution of Topology Components a logical topology spout A bolt A bolt B bolt C a physical implementation machine A machine B machine E machine C machine D machine F machine G spout A two tasks bolt A two tasks bolt B two tasks bolt C one task
  • 24. Page24 Stream Groupings Stream Groupings determine how Storm routes Tuples between Tasks Grouping Type Routing Behavior Shuffle Randomized round-robin (evenly distribute load to downstream Bolts) Fields Ensures all Tuples with the same Field value(s) are always routed to the same Task All Replicates Stream across all the Bolt’s Tasks (use with care) Other options Including custom RYO grouping logic
  • 25. Page25 Compile Time @Override public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields(”sentence")); }
  • 26. Page26 Example Spout Code (1 of 2) public class RandomSentenceSpout extends BaseRichSpout { SpoutOutputCollector _collector; Random _rand; @Override public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) { _collector = collector; _rand = new Random(); } @Override public void nextTuple() { Utils.sleep(100); String[] sentences = new String[]{ "the cow jumped over the moon", "an apple a day keeps the doctor away", "four score and seven years ago", "snow white and the seven dwarfs", "i am at two with nature" }; String sentence = sentences[_rand.nextInt(sentences.length)]; _collector.emit(new Values(sentence)); } Continued next page… Storm uses open to open the spout and provide it with its configuration, a context object providing information about components in the topology, and an output collector used to emit tuples. Storm uses nextTuple to request the spout emit the next tuple. The spout uses emit to send a tuple to one or more bolts. Name of the spout class. Storm spout class used as a “template”.
  • 27. Page27 Example Spout Code (2 of 2) @Override public void ack(Object id) { } @Override public void fail(Object id) { } @Override public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields("sentence")); } } Callouts: Storm calls the spout’s ack method to signal that a tuple has been fully processed. Storm calls the spout’s fail method to signal that a tuple has not been fully processed. The declareOutputFields method names the fields in a tuple. Continued…
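The ack and fail methods on the slide are empty, so the example spout never replays anything. The bookkeeping a reliable spout might do in those callbacks can be sketched without any Storm classes. Everything here is an assumption for illustration: `PendingTracker` and its methods are hypothetical names, and `emitNext` stands in for a `nextTuple` that emits each sentence together with a message ID.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Stand-alone model of reliable-spout bookkeeping (hypothetical, no Storm classes). */
final class PendingTracker {
    private final Queue<String> toEmit = new ArrayDeque<>();
    private final Map<Long, String> pending = new HashMap<>();
    private long nextId = 0;

    /** Queue a sentence that has not been emitted yet. */
    void offer(String sentence) {
        toEmit.add(sentence);
    }

    /** Stand-in for nextTuple(): emit one sentence and remember it by message ID.
     *  Returns the message ID, or -1 when there is nothing to emit. */
    long emitNext() {
        String sentence = toEmit.poll();
        if (sentence == null) {
            return -1;
        }
        long id = nextId++;
        pending.put(id, sentence);
        return id;
    }

    /** Stand-in for ack(Object id): the tuple was fully processed, so forget it. */
    void ack(long id) {
        pending.remove(id);
    }

    /** Stand-in for fail(Object id): put the sentence back in the queue for replay. */
    void fail(long id) {
        String sentence = pending.remove(id);
        if (sentence != null) {
            toEmit.add(sentence);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    int queuedCount() {
        return toEmit.size();
    }

    public static void main(String[] args) {
        PendingTracker tracker = new PendingTracker();
        tracker.offer("the cow jumped over the moon");
        long id = tracker.emitNext();
        tracker.fail(id); // simulate a downstream failure
        System.out.println("queued for replay: " + tracker.queuedCount());
    }
}
```

In a real spout the same idea would sit behind the collector: the sentence is emitted with a message ID, and Storm later hands that ID back through ack or fail.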
  • 28. Page28 Example Bolt Code public static class ExclamationBolt extends BaseRichBolt { OutputCollector _collector; public void prepare(Map conf, TopologyContext context, OutputCollector collector) { _collector = collector; } public void execute(Tuple tuple) { _collector.emit(tuple, new Values(tuple.getString(0) + "!!!")); _collector.ack(tuple); } public void cleanup() { } public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields("word")); } } Callouts: ExclamationBolt is the name of the bolt class. BaseRichBolt is the bolt class used as a “template.” The prepare method provides the bolt with its configuration and an OutputCollector used to emit tuples. The execute method receives a tuple from a stream and emits a new tuple anchored to it, then calls ack on the collector to acknowledge that the input tuple was processed successfully. The cleanup method releases system resources when the bolt is shut down. The declareOutputFields method names the fields in the output tuples. More detail later.
  • 29. Page29 Example Topology Code public static void main(String[] args) throws Exception { TopologyBuilder builder = new TopologyBuilder(); builder.setSpout("words", new TestWordSpout()); builder.setBolt("exclaim1", new NewExclamationBolt()).shuffleGrouping("words"); builder.setBolt("exclaim2", new NewExclamationBolt()).shuffleGrouping("exclaim1"); Config conf = new Config(); StormSubmitter.submitTopology("add-exclamation", conf, builder.createTopology()); } This code builds this Topology: words → (shuffleGrouping) → exclaim1 → (shuffleGrouping) → exclaim2. The words component runs the code in TestWordSpout(); exclaim1 and exclaim2 each run the code in NewExclamationBolt().
  • 33. Page33 Topology Deployment Topology Submitter uploads topology: topology.jar, topology.ser, conf.ser
  • 34. Page34 Topology Deployment Nimbus calculates assignments and sends them to ZooKeeper
  • 35. Page35 Topology Deployment Supervisor nodes receive assignment information via ZooKeeper watches
  • 36. Page36 Topology Deployment Supervisor nodes download topology from Nimbus: topology.jar, topology.ser, conf.ser
  • 40. Page40 Local Versus Distributed Storm Clusters The topology program code differs depending on whether the topology runs in local mode or is submitted to a distributed cluster with storm jar. The submitTopology method is used in both cases; the difference is the class that contains it. Local cluster: Config conf = new Config(); LocalCluster cluster = new LocalCluster(); cluster.submitTopology("mytopology", conf, topology); (instantiate a local cluster object, then submit the topology to it). Distributed cluster: Config conf = new Config(); StormSubmitter.submitTopology("mytopology", conf, topology); (submit the topology to a distributed cluster). Same method name, different classes.
  • 41. Page41 Reliable Processing Bolts may emit Tuples Anchored to one received. Tuple “B” is a descendant of Tuple “A”
  • 42. Page42 Reliable Processing Multiple Anchorings form a Tuple tree (bolts not shown)
  • 43. Page43 Reliable Processing Bolts can Acknowledge (ACK) that a tuple has been processed successfully.
  • 44. Page44 Reliable Processing Bolts can also Fail (FAIL) a tuple to trigger a spout to replay the original.
  • 45. Page45 Reliable Processing Any failure in the Tuple tree will trigger a replay of the original tuple
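The anchoring, ack, and fail slides above describe a tuple tree that must be fully acknowledged before the original tuple counts as processed. Storm's documentation describes tracking this with a single XOR value per root tuple; the sketch below models that idea in plain Java. `AckerSketch` and its method names are hypothetical, the tuple IDs are hand-picked for readability (Storm uses random 64-bit IDs), and none of this is Storm's actual acker code.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of XOR-based tuple-tree tracking (hypothetical, not Storm's acker). */
final class AckerSketch {
    // Root ID -> XOR of every tuple ID that has been created or acked in its tree.
    private final Map<Long, Long> ackVals = new HashMap<>();

    /** The spout emitted a root tuple whose first tuple ID is firstTupleId. */
    void init(long rootId, long firstTupleId) {
        ackVals.put(rootId, firstTupleId);
    }

    /** A bolt acks ackedId after emitting childIds anchored to it.
     *  Returns true once every tuple in the tree has been acked. */
    boolean ack(long rootId, long ackedId, long... childIds) {
        Long current = ackVals.get(rootId);
        if (current == null) {
            return false; // unknown, already completed, or already failed tree
        }
        long update = ackedId;
        for (long child : childIds) {
            update ^= child; // each child ID enters the value once here...
        }
        long val = current ^ update; // ...and cancels out when the child is acked
        if (val == 0) {
            ackVals.remove(rootId);
            return true;
        }
        ackVals.put(rootId, val);
        return false;
    }

    /** Any failure abandons the tree; the spout would then replay the original. */
    boolean fail(long rootId) {
        return ackVals.remove(rootId) != null;
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        acker.init(1L, 0x11L);                        // tuple A
        acker.ack(1L, 0x11L, 0x22L, 0x33L);           // A acked; B and C anchored to A
        acker.ack(1L, 0x22L);                         // B acked
        System.out.println("tree complete: " + acker.ack(1L, 0x33L)); // C acked
    }
}
```

Because every tuple ID is XORed in exactly twice (once when created, once when acked), the value returns to zero only when the whole tree is done, and the tracker needs constant memory per root tuple regardless of tree size.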
  • 46. Page46 More Stuff → Topology description/deployment options: Flux, Storm SQL → Polyglot development → Micro-batching with Trident → Fault tolerance & deployment isolation → Integrations: Messaging (Kafka, Redis, Kestrel, Kinesis, MQTT, JMS); Databases (HBase, Hive, Druid, Cassandra, MongoDB, JDBC); Search Engines (Solr, Elasticsearch); HDFS; and more!
  • 47. Page47 DEMO Kafka > Storm > HBase Topology in a Local Cluster
  • 48. Page48 Kafka > Storm > HBase Example Requirements: • Land simulated server logs into Kafka • Configure a Kafka Spout to consume the server log messages • Ignore all messages that are not either WARN or ERROR • Persist WARN and ERROR messages into HBase: keep the 10 most recent messages for each server, and maintain a running total of these concerning messages • Publish these messages back to Kafka. Diagram components: Kafka, Kafka spout, Parse, Filter, HBase bolt, HBase, Kafka bolt, Kafka.
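The filtering and bookkeeping requirements above can be prototyped in plain Java before any Kafka or HBase wiring exists. The sketch below is an assumption-heavy illustration: `ConcerningLogTracker` is a hypothetical name, a log record is assumed to have already been parsed into server, level, and message, the running total is assumed to be one global counter, and in-memory collections stand in for HBase.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for the filter and persistence logic (hypothetical). */
final class ConcerningLogTracker {
    static final int MAX_RECENT = 10;

    private final Map<String, Deque<String>> recentByServer = new HashMap<>();
    private long runningTotal = 0;

    /** Returns true if the message was kept (WARN or ERROR), false if ignored. */
    boolean accept(String server, String level, String message) {
        if (!"WARN".equals(level) && !"ERROR".equals(level)) {
            return false; // requirement: ignore everything else
        }
        Deque<String> recent =
                recentByServer.computeIfAbsent(server, s -> new ArrayDeque<>());
        recent.addFirst(level + " " + message); // newest first
        if (recent.size() > MAX_RECENT) {
            recent.removeLast();                // drop the oldest entry
        }
        runningTotal++;
        return true;
    }

    /** Most recent kept messages for a server, newest first. */
    List<String> recent(String server) {
        Deque<String> recent = recentByServer.get(server);
        return recent == null ? new ArrayList<String>() : new ArrayList<String>(recent);
    }

    /** Running total of all WARN and ERROR messages seen so far. */
    long total() {
        return runningTotal;
    }

    public static void main(String[] args) {
        ConcerningLogTracker tracker = new ConcerningLogTracker();
        tracker.accept("web1", "INFO", "request served");
        tracker.accept("web1", "ERROR", "disk full");
        System.out.println(tracker.recent("web1") + " total=" + tracker.total());
    }
}
```

In the topology itself, the check at the top of `accept` would live in the Filter bolt, and the two pieces of state would be writes against HBase rather than fields on an object.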
  • 49. Page49 Questions? Lester Martin – Hadoop/Spark/Storm Trainer & Consultant lester.martin@gmail.com http://lester.website (links to blog, twitter, github, LI, FB, etc) THANKS FOR YOUR TIME!!