SlideShare a Scribd company logo
Page1
Developing Java Streaming Applications
with Apache Storm
Lester Martin www.ajug.org - Nov 2017
Page2
Connection before Content
Lester Martin – Hadoop/Spark/Storm Trainer & Consultant
lester.martin@gmail.com
http://lester.website (links to blog, twitter,
github, LI, FB, etc)
Page3
Agenda – Needs Updating!!!!
• What is Storm?
• Conceptual Model
• Compile Time
• DEMO: Develop Word Count Topology
• Runtime
• DEMO: Submit Word Count Topology
• Additional Features
• DEMO: Kafka > Storm > HBase Topology in Local Cluster
Page4
What is Storm?
Page5
Storm is …
à Streaming
– Key enabler of the Lambda Architecture
à Fast
– Clocked at 1M+ messages per second per node
à Scalable
– Thousands of workers per cluster
à Fault Tolerant
– Failure is expected, and embraced
à Reliable
– Guaranteed message delivery
– Exactly-once semantics
Page6
Storm in the Lambda Architecture
persists data
Hadoop
batch processing
batch feeds
Update event models
Pattern templates, key-
performance indicators, and
alerts
Dashboards and Applications
Stormreal-time data
feeds
Page7
Conceptual Model
Page8
TUPLE
{…}
Page9
Tuple
à Unit of work to be processes
à Immutable ordered set of serializable values
à Fields must have assigned name
{…}
Page10
Stream
à Core abstraction of Storm
à Unbounded sequence of Tuples
{…} {…} {…} {…} {…} {…} {…}
Page11
SPOUT
Page12
Spout
à Source of Streams
à Wrap an event source and emit Tuples
Page13
Message Queues
Message queues are often the source of the data processed by Storm
Storm Spouts integrate with many types of message queues
real-time data
source
operating
systems,
services and
applications,
sensors
Kestrel,
RabbitMQ,
AMQP, Kafka,
JMS, others…
message
queue
log entries,
events, errors,
status
messages, etc.
Storm
data from queue
is read by Storm
Page14
BOLT
Page15
Bolt
à Core unit of computation
à Receive Tuples and do stuff
à Optionally, emit additional Tuples
Page16
Bolt
à Write to a data store
Page17
Bolt
à Read from a data store
Page18
Bolt
à Perform arbitrary computation
Page19
Bolt
à (Optionally) Emit additional Stream(s)
Page20
TOPOLOGY
Page21
Topology
à DAG of Spouts and Bolts
à Data Flow Representation
à Streaming Computation
Page22
Topology
à Storm executes Spouts and Bolts as Tasks that run in parallel on
multiple machines
Page23
Parallel Execution of Topology Components
a logical
topology
spout A
bolt A bolt B
bolt C
a physical
implementation
machine A
machine B
machine E
machine C
machine D
machine F
machine G
spout A
two tasks
bolt A
two tasks
bolt B two
tasks
bolt C
one task
Page24
Stream Groupings
Stream Groupings determine how Storm routes Tuples between Tasks
Grouping Type Routing Behavior
Shuffle Randomized round-robin (evenly distribute
load to downstream Bolts)
Fields Ensures all Tuples with the same Field
value(s) are always routed to the same Task
All Replicates Stream across all the Bolt’s
Tasks (use with care)
Other options Including custom RYO grouping logic
Page25
Compile Time
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields(”sentence"));
}
Page26
Example Spout Code (1 of 2)
public class RandomSentenceSpout extends BaseRichSpout {
SpoutOutputCollector _collector;
Random _rand;
@Override
public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
_collector = collector;
_rand = new Random();
}
@Override
public void nextTuple() {
Utils.sleep(100);
String[] sentences = new String[]{ "the cow jumped over the moon", "an apple a day keeps
the doctor away", "four score and seven years ago", "snow white and the seven dwarfs",
"i am at two with nature" };
String sentence = sentences[_rand.nextInt(sentences.length)];
_collector.emit(new Values(sentence));
}
Continued next page…
Storm uses open to open the spout and provide it with its configuration,
a context object providing information about components in the
topology, and an output collector used to emit tuples.
Storm uses nextTuple to request
the spout emit the next tuple.
The spout uses emit to send a
tuple to one or more bolts.
Name of the spout class. Storm spout class used as a “template”.
Page27
Example Spout Code (2 of 2)
@Override
public void ack(Object id) {
}
@Override
public void fail(Object id) {
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields(”sentence"));
}
}
Storm calls the spout’s ack method to signal that
a tuple has been fully processed.
Storm calls the spout’s fail method to signal
that a tuple has not been fully processed.
The declareOutputFields
method names the fields in a tuple.
Continued…
Page28
Example Bolt Code
public static class ExclamationBolt extends BaseRichBolt {
OutputCollector _collector;
public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
_collector = collector;
}
public void execute(Tuple tuple) {
_collector.emit(tuple, new Values(tuple.getString(0) + "!!!"));
_collector.ack(tuple);
}
public void cleanup(); {
}
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields("word"));
}
}
The prepare method
provides the bolt with
its configuration and
an
OutputCollector
used to emit tuples.
The execute method
receives a tuple from a
stream and emits a
new tuple. It also
provides an ack
method that can be
used after successful
delivery.
The cleanup method
releases system
resources when bolt is
shut down.
Names the fields in the output
tuples. More detail later.
Name of the bolt class. Bolt class used as a “template.”
Page29
Example Topology Code
public static main(String[] args) throws exception {
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout(“words”, new TestWordSpout());
builder.setBolt(“exclaim1”, new NewExclamationBolt()).shuffleGrouping(“words”);
builder.setBolt(“exclaim2”, new NewExclamationBolt()).shuffleGrouping(“exclaim1”);
Config conf = new Config();
StormSubmitter.submitTopology(”add-exclamation", conf, builder.createTopology());
}
This code…
words exclaim1 exclaim2
shuffleGrouping shuffleGrouping
…builds this
Topology.
runs code in
TestWordSpout()
runs code in
NewExclamationBolt()
runs code in
NewExclamationBolt()
Page30
DEMO
Develop Word Count Topology
Page31
Runtime
Nimbus
Supervisor
Supervisor
Supervisor
Supervisor
Page32
Physical View
Page33
Topology Submitter uploads topology:
• topology.jar
• topology.ser
• conf.ser
Topology Deployment
Page34
Topology Deployment
Nimbus calculates assignments and sends to Zookeeper
Page35
Topology Deployment
Supervisor nodes receive assignment information
via Zookeeper watches
Page36
Topology Deployment
Supervisor nodes download topology from Nimbus:
• topology.jar
• topology.ser
• conf.ser
Page37
Topology Deployment
Supervisors spawn workers (JVM processes)
Page38
DEMO
Submit Topology to Storm Topology
Page39
Additional Features
FAIL
Page40
Local Versus Distributed Storm Clusters
The topology program code submitted to Storm using storm jar is
different when submitting to local mode versus a distributed cluster.
The submitTopology method is used in both cases.
• The difference is the class that contains the submitTopology method.
Config conf = new Config();
LocalCluster cluster = new LocalCluster();
LocalCluster.submitTopology("mytopology", conf, topology);
Config conf = new Config();
StormSubmitter.submitTopology("mytopology", conf, topology);
Instantiate a local
cluster object.
Submit a topology
to a local cluster.
Submit a topology to a
distributed cluster.
Same method
name, different
classes
Same method
name, different
classes.
Page41
Reliable Processing
Bolts may emit Tuples Anchored to one received.
Tuple “B” is a descendant of Tuple “A”
Page42
Reliable Processing
Multiple Anchorings form a Tuple tree
(bolts not shown)
Page43
Reliable Processing
Bolts can Acknowledge that a tuple
has been processed successfully.
ACK
Page44
Reliable Processing
Bolts can also Fail a tuple to trigger a spout to
replay the original.
FAIL
Page45
Reliable Processing
Any failure in the Tuple tree will trigger a
replay of the original tuple
Page46
More Stuff
à Topology description/deployment options
– Flux
– Storm SQL
à Polyglot development
à Micro-batching with Trident
à Fault tolerance & deployment isolation
à Integrations
– Messaging; Kafka, Redis, Kestrel, Kinesis, MQTT, JMS
– Databases; HBase, Hive, Druid, Cassandra, MongoDB, JDBC
– Search Engines; Solr, Elasticsearch
– HDFS
– And more!
Page47
DEMO
Kafka > Storm > HBase Topology in a Local Cluster
Page48
Kafka > Storm > HBase Example
Requirements:
• Land simulated server logs into Kafka
• Configure a Kafka Bolt to consume the server log messages
• Ignore all messages that are not either WARN or ERROR
• Persist WARN and ERROR messages into HBase
– Keep 10 most recent messages for each server
– Maintain a running total of these concerning messages
• Publish these messages back to Kafka
Kafka
Kafka
HBase
HBaseParse FilterKafka
Kafka
Page49
Questions?
Lester Martin – Hadoop/Spark/Storm Trainer & Consultant
lester.martin@gmail.com
http://lester.website (links to blog, twitter, github, LI, FB, etc)
THANKS FOR YOUR TIME!!

More Related Content

What's hot

Microservices Architecture Part 2 Event Sourcing and Saga
Microservices Architecture Part 2 Event Sourcing and SagaMicroservices Architecture Part 2 Event Sourcing and Saga
Microservices Architecture Part 2 Event Sourcing and Saga
Araf Karsh Hamid
 
Presentation du socle technique Java open source Scub Foundation
Presentation du socle technique Java open source Scub FoundationPresentation du socle technique Java open source Scub Foundation
Presentation du socle technique Java open source Scub Foundation
Stéphane Traumat
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
Rishabh Indoria
 
Apache Kafka, Un système distribué de messagerie hautement performant
Apache Kafka, Un système distribué de messagerie hautement performantApache Kafka, Un système distribué de messagerie hautement performant
Apache Kafka, Un système distribué de messagerie hautement performant
ALTIC Altic
 
Monolith to serverless service based architectures in the enterprise
Monolith to serverless  service based architectures in the enterpriseMonolith to serverless  service based architectures in the enterprise
Monolith to serverless service based architectures in the enterprise
Sameh Deabes
 
TD3-UML-Correction
TD3-UML-CorrectionTD3-UML-Correction
TD3-UML-Correction
Lilia Sfaxi
 
Kubernetes Architecture
 Kubernetes Architecture Kubernetes Architecture
Kubernetes Architecture
Knoldus Inc.
 
Hadoop Hbase - Introduction
Hadoop Hbase - IntroductionHadoop Hbase - Introduction
Hadoop Hbase - Introduction
Blandine Larbret
 
Architecture orientée service (SOA)
Architecture orientée service (SOA)Architecture orientée service (SOA)
Architecture orientée service (SOA)
Klee Group
 
An Introduction to OpenStack Heat
An Introduction to OpenStack HeatAn Introduction to OpenStack Heat
An Introduction to OpenStack Heat
Mirantis
 
End to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
End to end Machine Learning using Kubeflow - Build, Train, Deploy and ManageEnd to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
End to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
Animesh Singh
 
kubernetes, pourquoi et comment
kubernetes, pourquoi et commentkubernetes, pourquoi et comment
kubernetes, pourquoi et comment
Jean-Baptiste Claramonte
 
Cours Big Data Chap6
Cours Big Data Chap6Cours Big Data Chap6
Cours Big Data Chap6
Amal Abid
 
Introduction à pl/sql
Introduction à pl/sqlIntroduction à pl/sql
Introduction à pl/sql
Abdelouahed Abdou
 
OpenStack Ironic - Bare Metal-as-a-Service
OpenStack Ironic - Bare Metal-as-a-ServiceOpenStack Ironic - Bare Metal-as-a-Service
OpenStack Ironic - Bare Metal-as-a-Service
Ramon Acedo Rodriguez
 
Diagramme d'état transition
Diagramme d'état transitionDiagramme d'état transition
Diagramme d'état transition
abdoMarocco
 
Terraform 101
Terraform 101Terraform 101
Terraform 101
Pradeep Loganathan
 
Formation jpa-hibernate-spring-data
Formation jpa-hibernate-spring-dataFormation jpa-hibernate-spring-data
Formation jpa-hibernate-spring-data
Lhouceine OUHAMZA
 
Kubernetes (K8s) 簡介 | GDSC NYCU
Kubernetes (K8s) 簡介 | GDSC NYCUKubernetes (K8s) 簡介 | GDSC NYCU
Kubernetes (K8s) 簡介 | GDSC NYCU
秀吉(Hsiu-Chi) 蔡(Tsai)
 
Une introduction à Hive
Une introduction à HiveUne introduction à Hive
Une introduction à Hive
Modern Data Stack France
 

What's hot (20)

Microservices Architecture Part 2 Event Sourcing and Saga
Microservices Architecture Part 2 Event Sourcing and SagaMicroservices Architecture Part 2 Event Sourcing and Saga
Microservices Architecture Part 2 Event Sourcing and Saga
 
Presentation du socle technique Java open source Scub Foundation
Presentation du socle technique Java open source Scub FoundationPresentation du socle technique Java open source Scub Foundation
Presentation du socle technique Java open source Scub Foundation
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
Apache Kafka, Un système distribué de messagerie hautement performant
Apache Kafka, Un système distribué de messagerie hautement performantApache Kafka, Un système distribué de messagerie hautement performant
Apache Kafka, Un système distribué de messagerie hautement performant
 
Monolith to serverless service based architectures in the enterprise
Monolith to serverless  service based architectures in the enterpriseMonolith to serverless  service based architectures in the enterprise
Monolith to serverless service based architectures in the enterprise
 
TD3-UML-Correction
TD3-UML-CorrectionTD3-UML-Correction
TD3-UML-Correction
 
Kubernetes Architecture
 Kubernetes Architecture Kubernetes Architecture
Kubernetes Architecture
 
Hadoop Hbase - Introduction
Hadoop Hbase - IntroductionHadoop Hbase - Introduction
Hadoop Hbase - Introduction
 
Architecture orientée service (SOA)
Architecture orientée service (SOA)Architecture orientée service (SOA)
Architecture orientée service (SOA)
 
An Introduction to OpenStack Heat
An Introduction to OpenStack HeatAn Introduction to OpenStack Heat
An Introduction to OpenStack Heat
 
End to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
End to end Machine Learning using Kubeflow - Build, Train, Deploy and ManageEnd to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
End to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
 
kubernetes, pourquoi et comment
kubernetes, pourquoi et commentkubernetes, pourquoi et comment
kubernetes, pourquoi et comment
 
Cours Big Data Chap6
Cours Big Data Chap6Cours Big Data Chap6
Cours Big Data Chap6
 
Introduction à pl/sql
Introduction à pl/sqlIntroduction à pl/sql
Introduction à pl/sql
 
OpenStack Ironic - Bare Metal-as-a-Service
OpenStack Ironic - Bare Metal-as-a-ServiceOpenStack Ironic - Bare Metal-as-a-Service
OpenStack Ironic - Bare Metal-as-a-Service
 
Diagramme d'état transition
Diagramme d'état transitionDiagramme d'état transition
Diagramme d'état transition
 
Terraform 101
Terraform 101Terraform 101
Terraform 101
 
Formation jpa-hibernate-spring-data
Formation jpa-hibernate-spring-dataFormation jpa-hibernate-spring-data
Formation jpa-hibernate-spring-data
 
Kubernetes (K8s) 簡介 | GDSC NYCU
Kubernetes (K8s) 簡介 | GDSC NYCUKubernetes (K8s) 簡介 | GDSC NYCU
Kubernetes (K8s) 簡介 | GDSC NYCU
 
Une introduction à Hive
Une introduction à HiveUne introduction à Hive
Une introduction à Hive
 

Similar to Developing Java Streaming Applications with Apache Storm

Apache Storm Tutorial
Apache Storm TutorialApache Storm Tutorial
Apache Storm Tutorial
Farzad Nozarian
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Davorin Vukelic
 
Distributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache StormDistributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache Storm
the100rabh
 
Storm is coming
Storm is comingStorm is coming
Storm is coming
Grzegorz Kolpuc
 
Real-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpacesReal-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpaces
Oleksii Diagiliev
 
Storm
StormStorm
Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
Tiziano De Matteis
 
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
DataStax
 
Storm 0.8.2
Storm 0.8.2Storm 0.8.2
Java design patterns
Java design patternsJava design patterns
Java design patterns
Shawn Brito
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
DataWorks Summit/Hadoop Summit
 
NET Systems Programming Learned the Hard Way.pptx
NET Systems Programming Learned the Hard Way.pptxNET Systems Programming Learned the Hard Way.pptx
NET Systems Programming Learned the Hard Way.pptx
petabridge
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
P. Taylor Goetz
 
Real time and reliable processing with Apache Storm
Real time and reliable processing with Apache StormReal time and reliable processing with Apache Storm
Real time and reliable processing with Apache Storm
Andrea Iacono
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
DataWorks Summit/Hadoop Summit
 
Continuations in scala (incomplete version)
Continuations in scala (incomplete version)Continuations in scala (incomplete version)
Continuations in scala (incomplete version)
Fuqiang Wang
 
The GO Language : From Beginners to Gophers
The GO Language : From Beginners to GophersThe GO Language : From Beginners to Gophers
The GO Language : From Beginners to Gophers
Alessandro Sanino
 
Storm
StormStorm
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
oscon2007
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
oscon2007
 

Similar to Developing Java Streaming Applications with Apache Storm (20)

Apache Storm Tutorial
Apache Storm TutorialApache Storm Tutorial
Apache Storm Tutorial
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
 
Distributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache StormDistributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache Storm
 
Storm is coming
Storm is comingStorm is coming
Storm is coming
 
Real-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpacesReal-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpaces
 
Storm
StormStorm
Storm
 
Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
 
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
 
Storm 0.8.2
Storm 0.8.2Storm 0.8.2
Storm 0.8.2
 
Java design patterns
Java design patternsJava design patterns
Java design patterns
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
NET Systems Programming Learned the Hard Way.pptx
NET Systems Programming Learned the Hard Way.pptxNET Systems Programming Learned the Hard Way.pptx
NET Systems Programming Learned the Hard Way.pptx
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
Real time and reliable processing with Apache Storm
Real time and reliable processing with Apache StormReal time and reliable processing with Apache Storm
Real time and reliable processing with Apache Storm
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
Continuations in scala (incomplete version)
Continuations in scala (incomplete version)Continuations in scala (incomplete version)
Continuations in scala (incomplete version)
 
The GO Language : From Beginners to Gophers
The GO Language : From Beginners to GophersThe GO Language : From Beginners to Gophers
The GO Language : From Beginners to Gophers
 
Storm
StormStorm
Storm
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 

Recently uploaded

reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdfreading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
perranet1
 
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
yuvarajkumar334
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
nyvan3
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
aguty
 
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
uevausa
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
ytypuem
 
社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .
NABLAS株式会社
 
Telemetry Solution for Gaming (AWS Summit'24)
Telemetry Solution for Gaming (AWS Summit'24)Telemetry Solution for Gaming (AWS Summit'24)
Telemetry Solution for Gaming (AWS Summit'24)
GeorgiiSteshenko
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
bmucuha
 
Template xxxxxxxx ssssssssssss Sertifikat.pptx
Template xxxxxxxx ssssssssssss Sertifikat.pptxTemplate xxxxxxxx ssssssssssss Sertifikat.pptx
Template xxxxxxxx ssssssssssss Sertifikat.pptx
TeukuEriSyahputra
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
Salesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - CanariasSalesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - Canarias
davidpietrzykowski1
 
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdfNamma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
22ad0301
 
Econ3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdfEcon3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdf
blueshagoo1
 
Senior Engineering Sample EM DOE - Sheet1.pdf
Senior Engineering Sample EM DOE  - Sheet1.pdfSenior Engineering Sample EM DOE  - Sheet1.pdf
Senior Engineering Sample EM DOE - Sheet1.pdf
Vineet
 
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
z6osjkqvd
 
一比一原版南昆士兰大学毕业证如何办理
一比一原版南昆士兰大学毕业证如何办理一比一原版南昆士兰大学毕业证如何办理
一比一原版南昆士兰大学毕业证如何办理
ugydym
 
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
9gr6pty
 
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
hqfek
 
Sid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.pptSid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.ppt
ArshadAyub49
 

Recently uploaded (20)

reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdfreading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
 
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS NOTES FOR MCA
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
 
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
 
社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .
 
Telemetry Solution for Gaming (AWS Summit'24)
Telemetry Solution for Gaming (AWS Summit'24)Telemetry Solution for Gaming (AWS Summit'24)
Telemetry Solution for Gaming (AWS Summit'24)
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
Template xxxxxxxx ssssssssssss Sertifikat.pptx
Template xxxxxxxx ssssssssssss Sertifikat.pptxTemplate xxxxxxxx ssssssssssss Sertifikat.pptx
Template xxxxxxxx ssssssssssss Sertifikat.pptx
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
Salesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - CanariasSalesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - Canarias
 
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdfNamma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
 
Econ3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdfEcon3060_Screen Time and Success_ final_GroupProject.pdf
Econ3060_Screen Time and Success_ final_GroupProject.pdf
 
Senior Engineering Sample EM DOE - Sheet1.pdf
Senior Engineering Sample EM DOE  - Sheet1.pdfSenior Engineering Sample EM DOE  - Sheet1.pdf
Senior Engineering Sample EM DOE - Sheet1.pdf
 
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
 
一比一原版南昆士兰大学毕业证如何办理
一比一原版南昆士兰大学毕业证如何办理一比一原版南昆士兰大学毕业证如何办理
一比一原版南昆士兰大学毕业证如何办理
 
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
 
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
一比一原版爱尔兰都柏林大学毕业证(本硕)ucd学位证书如何办理
 
Sid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.pptSid Sigma educational and problem solving power point- Six Sigma.ppt
Sid Sigma educational and problem solving power point- Six Sigma.ppt
 

Developing Java Streaming Applications with Apache Storm

  • 1. Page1 Developing Java Streaming Applications with Apache Storm Lester Martin www.ajug.org - Nov 2017
  • 2. Page2 Connection before Content Lester Martin – Hadoop/Spark/Storm Trainer & Consultant lester.martin@gmail.com http://lester.website (links to blog, twitter, github, LI, FB, etc)
  • 3. Page3 Agenda – Needs Updating!!!! • What is Storm? • Conceptual Model • Compile Time • DEMO: Develop Word Count Topology • Runtime • DEMO: Submit Word Count Topology • Additional Features • DEMO: Kafka > Storm > HBase Topology in Local Cluster
  • 5. Page5 Storm is … à Streaming – Key enabler of the Lambda Architecture à Fast – Clocked at 1M+ messages per second per node à Scalable – Thousands of workers per cluster à Fault Tolerant – Failure is expected, and embraced à Reliable – Guaranteed message delivery – Exactly-once semantics
  • 6. Page6 Storm in the Lambda Architecture persists data Hadoop batch processing batch feeds Update event models Pattern templates, key- performance indicators, and alerts Dashboards and Applications Stormreal-time data feeds
  • 9. Page9 Tuple à Unit of work to be processes à Immutable ordered set of serializable values à Fields must have assigned name {…}
  • 10. Page10 Stream à Core abstraction of Storm à Unbounded sequence of Tuples {…} {…} {…} {…} {…} {…} {…}
  • 12. Page12 Spout à Source of Streams à Wrap an event source and emit Tuples
  • 13. Page13 Message Queues Message queues are often the source of the data processed by Storm Storm Spouts integrate with many types of message queues real-time data source operating systems, services and applications, sensors Kestrel, RabbitMQ, AMQP, Kafka, JMS, others… message queue log entries, events, errors, status messages, etc. Storm data from queue is read by Storm
  • 15. Page15 Bolt à Core unit of computation à Receive Tuples and do stuff à Optionally, emit additional Tuples
  • 16. Page16 Bolt à Write to a data store
  • 17. Page17 Bolt à Read from a data store
  • 19. Page19 Bolt à (Optionally) Emit additional Stream(s)
  • 21. Page21 Topology à DAG of Spouts and Bolts à Data Flow Representation à Streaming Computation
  • 22. Page22 Topology à Storm executes Spouts and Bolts as Tasks that run in parallel on multiple machines
  • 23. Page23 Parallel Execution of Topology Components a logical topology spout A bolt A bolt B bolt C a physical implementation machine A machine B machine E machine C machine D machine F machine G spout A two tasks bolt A two tasks bolt B two tasks bolt C one task
  • 24. Page24 Stream Groupings Stream Groupings determine how Storm routes Tuples between Tasks Grouping Type Routing Behavior Shuffle Randomized round-robin (evenly distribute load to downstream Bolts) Fields Ensures all Tuples with the same Field value(s) are always routed to the same Task All Replicates Stream across all the Bolt’s Tasks (use with care) Other options Including custom RYO grouping logic
  • 25. Page25 Compile Time @Override public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields(”sentence")); }
  • 26. Page26 Example Spout Code (1 of 2) public class RandomSentenceSpout extends BaseRichSpout { SpoutOutputCollector _collector; Random _rand; @Override public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) { _collector = collector; _rand = new Random(); } @Override public void nextTuple() { Utils.sleep(100); String[] sentences = new String[]{ "the cow jumped over the moon", "an apple a day keeps the doctor away", "four score and seven years ago", "snow white and the seven dwarfs", "i am at two with nature" }; String sentence = sentences[_rand.nextInt(sentences.length)]; _collector.emit(new Values(sentence)); } Continued next page… Storm uses open to open the spout and provide it with its configuration, a context object providing information about components in the topology, and an output collector used to emit tuples. Storm uses nextTuple to request the spout emit the next tuple. The spout uses emit to send a tuple to one or more bolts. Name of the spout class. Storm spout class used as a “template”.
  • 27. Page27 Example Spout Code (2 of 2) @Override public void ack(Object id) { } @Override public void fail(Object id) { } @Override public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields(”sentence")); } } Storm calls the spout’s ack method to signal that a tuple has been fully processed. Storm calls the spout’s fail method to signal that a tuple has not been fully processed. The declareOutputFields method names the fields in a tuple. Continued…
  • 28. Page28 Example Bolt Code public static class ExclamationBolt extends BaseRichBolt { OutputCollector _collector; public void prepare(Map conf, TopologyContext context, OutputCollector collector) { _collector = collector; } public void execute(Tuple tuple) { _collector.emit(tuple, new Values(tuple.getString(0) + "!!!")); _collector.ack(tuple); } public void cleanup(); { } public void declareOutputFields(OutputFieldsDeclarer declarer) { declarer.declare(new Fields("word")); } } The prepare method provides the bolt with its configuration and an OutputCollector used to emit tuples. The execute method receives a tuple from a stream and emits a new tuple. It also provides an ack method that can be used after successful delivery. The cleanup method releases system resources when bolt is shut down. Names the fields in the output tuples. More detail later. Name of the bolt class. Bolt class used as a “template.”
  • 29. Page29 Example Topology Code public static main(String[] args) throws exception { TopologyBuilder builder = new TopologyBuilder(); builder.setSpout(“words”, new TestWordSpout()); builder.setBolt(“exclaim1”, new NewExclamationBolt()).shuffleGrouping(“words”); builder.setBolt(“exclaim2”, new NewExclamationBolt()).shuffleGrouping(“exclaim1”); Config conf = new Config(); StormSubmitter.submitTopology(”add-exclamation", conf, builder.createTopology()); } This code… words exclaim1 exclaim2 shuffleGrouping shuffleGrouping …builds this Topology. runs code in TestWordSpout() runs code in NewExclamationBolt() runs code in NewExclamationBolt()
  • 33. Page33 Topology Submitter uploads topology: • topology.jar • topology.ser • conf.ser Topology Deployment
  • 34. Page34 Topology Deployment Nimbus calculates assignments and sends to Zookeeper
  • 35. Page35 Topology Deployment Supervisor nodes receive assignment information via Zookeeper watches
  • 36. Page36 Topology Deployment Supervisor nodes download topology from Nimbus: • topology.jar • topology.ser • conf.ser
  • 40. Page40 Local Versus Distributed Storm Clusters The topology program code submitted to Storm using storm jar is different when submitting to local mode versus a distributed cluster. The submitTopology method is used in both cases. • The difference is the class that contains the submitTopology method. Config conf = new Config(); LocalCluster cluster = new LocalCluster(); LocalCluster.submitTopology("mytopology", conf, topology); Config conf = new Config(); StormSubmitter.submitTopology("mytopology", conf, topology); Instantiate a local cluster object. Submit a topology to a local cluster. Submit a topology to a distributed cluster. Same method name, different classes Same method name, different classes.
  • 41. Page41 Reliable Processing Bolts may emit Tuples Anchored to one received. Tuple “B” is a descendant of Tuple “A”
  • 42. Page42 Reliable Processing Multiple Anchorings form a Tuple tree (bolts not shown)
  • 43. Page43 Reliable Processing Bolts can Acknowledge that a tuple has been processed successfully. ACK
  • 44. Page44 Reliable Processing Bolts can also Fail a tuple to trigger a spout to replay the original. FAIL
  • 45. Page45 Reliable Processing Any failure in the Tuple tree will trigger a replay of the original tuple
  • 46. Page46 More Stuff à Topology description/deployment options – Flux – Storm SQL à Polyglot development à Micro-batching with Trident à Fault tolerance & deployment isolation à Integrations – Messaging; Kafka, Redis, Kestrel, Kinesis, MQTT, JMS – Databases; HBase, Hive, Druid, Cassandra, MongoDB, JDBC – Search Engines; Solr, Elasticsearch – HDFS – And more!
  • 47. Page47 DEMO Kafka > Storm > HBase Topology in a Local Cluster
  • 48. Page48 Kafka > Storm > HBase Example Requirements: • Land simulated server logs into Kafka • Configure a Kafka Bolt to consume the server log messages • Ignore all messages that are not either WARN or ERROR • Persist WARN and ERROR messages into HBase – Keep 10 most recent messages for each server – Maintain a running total of these concerning messages • Publish these messages back to Kafka Kafka Kafka HBase HBaseParse FilterKafka Kafka
  • 49. Page49 Questions? Lester Martin – Hadoop/Spark/Storm Trainer & Consultant lester.martin@gmail.com http://lester.website (links to blog, twitter, github, LI, FB, etc) THANKS FOR YOUR TIME!!