Scalable Distributed Systems
Tutorial
Miguel Cárcamo Vásquez
Daniel Wladdimiro Cottet
Professors: Erika Rosas Olivos
Nicolás Hidalgo Castillo
Departamento de Ingeniería Informática
Universidad de Santiago de Chile
November, 2014
M. Cárcamo & D. Wladdimiro (USACH) Kafka & Storm November, 2014 1 / 31
Kafka
What is Kafka?
Apache Kafka is publish-subscribe messaging rethought as a distributed
commit log.
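• Fast
  • A single broker can handle hundreds of megabytes of reads and writes per second
• Scalable
  • Can be expanded elastically
  • Can be expanded transparently, without downtime
• Durable
  • Messages are persisted on disk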
Architecture
It is a distributed, partitioned, replicated commit log service. It provides
the functionality of a messaging system, but with a unique design.
A two-server Kafka cluster hosting four partitions (P0-P3) with two
consumer groups. Consumer group A has two consumer instances and
group B has four.
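The consumer-group layout described above can be sketched as a small model. This is only an illustration, not Kafka's actual assignment logic: the function name `assign_partitions`, the consumer names (A1, A2, B1 to B4) and the round-robin policy are assumptions made for the example. What it shows is that every group reads all four partitions, while inside a group each partition goes to exactly one consumer.

```python
def assign_partitions(partitions, consumers):
    """Spread the partitions over the consumers of ONE group, round-robin.

    Hypothetical helper: Kafka decides the real assignment itself.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = ["P0", "P1", "P2", "P3"]

# Group A has two consumer instances, group B has four (as in the figure).
group_a = assign_partitions(partitions, ["A1", "A2"])
group_b = assign_partitions(partitions, ["B1", "B2", "B3", "B4"])

print("group A:", group_a)
print("group B:", group_b)
```

With two instances each consumer in group A owns two partitions; with four instances each consumer in group B owns one.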
ZooKeeper
zookeeperServer.sh
bin/zookeeper-server-start.sh ../config/zookeeper.properties
Configuration
• dataDir
• clientPort
• maxClientCnxns
Kafka Server
kafkaServer.sh
bin/kafka-server-start.sh ../config/server.properties
Mandatory configuration
• broker.id
• log.dirs
• zookeeper.connect
Optional configuration
• Log basics
• num.partitions
• Log Retention Policy
• log.retention.hours
• log.flush.interval.messages
• log.flush.interval.ms
Create Topics
createTopics.sh
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 \
    --partitions 1 --topic $1
Parameters
• replication-factor
• partitions
• topic
Configuration (--config)
• max.message.bytes
• index.interval.bytes
• flush.messages
• flush.ms
Check Topics
checkTopics.sh
bin/kafka-topics.sh --list --zookeeper localhost:2181
Producer
createProducer.sh
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic $1
Mandatory configuration
• metadata.broker.list
• request.required.acks
• producer.type
• serializer.class
Optional configuration
• compression.codec
• request.timeout.ms
Consumer
createConsumer.sh
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic $1 \
    --from-beginning
Mandatory configuration
• group.id
• zookeeper.connect
Optional configuration
• fetch.message.max.bytes
• consumer.id
Clients
• Producer Daemon
• Python
• Go (AKA golang)
• C
• C++
• .NET
• Ruby
• Storm
• Scala DSL
• HTTP REST
• JRuby
• Perl
• Clojure
• Node.js
Multi-Broker
createMultiBroker.sh
cp config/server.properties config/server-1.properties
cp config/server.properties config/server-2.properties
config/server-1.properties:
broker.id=1
port=9093
log.dir=/tmp/kafka-logs-1
config/server-2.properties:
broker.id=2
port=9094
log.dir=/tmp/kafka-logs-2
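The copy-and-edit step above can also be scripted. The following is a minimal sketch, assuming a base file that contains the three keys shown on the slide; the helper name `derive_broker_config` and the contents of `BASE` are hypothetical stand-ins for the real config/server.properties. It rewrites `broker.id`, `port` and `log.dir` for each extra broker and leaves every other line untouched.

```python
def derive_broker_config(base_text, broker_id, port, log_dir):
    """Return a copy of base_text with the three per-broker keys overridden."""
    overrides = {
        "broker.id": str(broker_id),
        "port": str(port),
        "log.dir": log_dir,
    }
    seen = set()
    lines = []
    for line in base_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in overrides:
            lines.append(f"{key}={overrides[key]}")
            seen.add(key)
        else:
            lines.append(line)  # comments and unrelated settings pass through
    # Append any override that the base file did not define at all.
    for key, value in overrides.items():
        if key not in seen:
            lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical excerpt standing in for config/server.properties.
BASE = """# base broker settings (illustrative only)
broker.id=0
port=9092
log.dir=/tmp/kafka-logs
zookeeper.connect=localhost:2181
"""

for n in (1, 2):
    text = derive_broker_config(BASE, n, 9092 + n, f"/tmp/kafka-logs-{n}")
    print(f"--- config/server-{n}.properties ---")
    print(text, end="")
```

Each broker needs its own ID, port and log directory because all three run on the same machine in this setup.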
Create Kafka Server
Kafka Server 1
../bin/kafka-server-start.sh config/server-1.properties &
Kafka Server 2
../bin/kafka-server-start.sh config/server-2.properties &
Topic with replication
Create new topic
../bin/kafka-topics.sh --create --zookeeper localhost:2181 \
    --replication-factor 3 --partitions 1 --topic my-replicated-topic
Show topic
../bin/kafka-topics.sh --describe --zookeeper localhost:2181 \
    --topic my-replicated-topic
Fault Tolerance
Kill a replica
ps -ef | grep server-1.properties
kill -9 PID   # PID is the process ID reported by ps
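What happens to a replicated partition when one broker is killed can be pictured with a toy model. This is a simplified sketch, not Kafka's controller code: the function `fail_broker`, the dictionary layout and the broker IDs 0, 1 and 2 are assumptions for illustration. The dead broker drops out of the in-sync replica set (ISR), and if it was the leader, a surviving in-sync replica takes over.

```python
def fail_broker(state, broker_id):
    """Return the partition state after broker_id stops responding."""
    # The failed broker leaves the in-sync replica set.
    isr = [b for b in state["isr"] if b != broker_id]
    leader = state["leader"]
    if leader == broker_id:
        # Simplified election: first assigned replica that is still in sync.
        leader = next((b for b in state["replicas"] if b in isr), None)
    return {"replicas": list(state["replicas"]), "isr": isr, "leader": leader}


# Hypothetical state of the single partition of my-replicated-topic.
before = {"replicas": [1, 2, 0], "isr": [1, 2, 0], "leader": 1}
after = fail_broker(before, 1)

print("before:", before)
print("after: ", after)
```

Running the describe command again after the kill is the way to see the real leader and ISR reported by the cluster.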
Storm
What is Storm?
• Computation platform for stream data processing
• Fault Tolerant
• Scalable
• Distributed
• Reliable
• Easy to learn, code and run
Architecture
Fig. 1: Storm Cluster
Spouts & Bolts
Fig. 2: Spouts & Bolts
Physical & Logical
Fig. 3: Physical & Logical Architecture
Before coding
• Install Maven or Gradle
• Install Eclipse (only if you want to)
Coding a Spout
Structure
• import libraries
• public class SpoutName extends BaseRichSpout
• class variables
• public void open(Map conf, TopologyContext topologyContext, SpoutOutputCollector collector)
• public void nextTuple()
• public void declareOutputFields(OutputFieldsDeclarer declarer)
• Your methods
Coding a Bolt
Structure
• import libraries
• public class BoltName extends BaseRichBolt
• class variables
• public BoltName() (constructor)
• public void prepare(Map map, TopologyContext topologyContext, OutputCollector collector)
• public void execute(Tuple input)
• public void declareOutputFields(OutputFieldsDeclarer declarer)
• Your methods
Coding a Topology
Structure
• import libraries
• public class Topology
• class variables
• public static void main(String[] args)
• Config config = new Config()
• TopologyBuilder b = new TopologyBuilder()
• b.setSpout("SpoutName", new SpoutName())
• b.setBolt("BoltName", new BoltName()).shuffleGrouping("SpoutName")
• final LocalCluster cluster = new LocalCluster()
• cluster.submitTopology("TopologyName", config, b.createTopology())
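The data flow that such a topology wires together can be mimicked in a few lines of plain Python. This is a toy model and not the Storm API: the classes `WordSpout` and `CountBolt` and the function `run_topology` are hypothetical, and everything runs sequentially in one process, whereas Storm runs spouts and bolts in parallel across workers. The method names only echo `nextTuple()` and `execute()` from the outlines above.

```python
from collections import Counter, deque


class WordSpout:
    """Toy source: hands out one tuple per next_tuple() call."""

    def __init__(self, words):
        self._pending = deque(words)

    def next_tuple(self):
        # Return a one-field tuple, or None when the source is exhausted.
        return (self._pending.popleft(),) if self._pending else None


class CountBolt:
    """Toy processing step: keeps a running count per word."""

    def __init__(self):
        self.counts = Counter()

    def execute(self, tup):
        word = tup[0]
        self.counts[word] += 1
        return (word, self.counts[word])  # tuple emitted downstream


def run_topology(spout, bolt):
    """Pull tuples from the spout and push each one through the bolt."""
    emitted = []
    while True:
        tup = spout.next_tuple()
        if tup is None:
            break
        emitted.append(bolt.execute(tup))
    return emitted


results = run_topology(WordSpout(["storm", "kafka", "storm"]), CountBolt())
print(results)
```

The spout only produces tuples and the bolt only consumes and transforms them; the topology is the wiring between the two.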
Compile & Run
• Download a Storm release, unpack it, and put the unpacked bin/ directory on your PATH.
• cd myapp
• mvn package
• storm jar target/my-app-1.0-SNAPSHOT.jar com.mycompany.app.App
Grouping
Fig. 4: Groupings
• Shuffle grouping: stream tuples are randomly distributed across the bolt's tasks such that each task is guaranteed to get an equal number of tuples.
• Fields grouping: stream tuples are partitioned by the fields specified in the grouping, so tuples with the same values for those fields always go to the same task.
• All grouping: stream tuples are replicated across all of the bolt's tasks.
• Global grouping: the entire stream goes to a single task of the bolt.
• Direct grouping: the source decides which task of the receiving component will receive the tuple.
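The groupings listed above can be compared side by side with a small simulation. This is a sketch in plain Python, not Storm code: the function names, the sample `votes` tuples and the use of a CRC32 hash for the fields grouping are all assumptions chosen for illustration. Each function returns one list of tuples per receiving task.

```python
import random
import zlib


def shuffle_grouping(tuples, n_tasks, seed=0):
    """Random but even: deal tuples out in a freshly shuffled task order."""
    rng = random.Random(seed)
    tasks = [[] for _ in range(n_tasks)]
    order = []
    for tup in tuples:
        if not order:  # start a new round over all tasks
            order = list(range(n_tasks))
            rng.shuffle(order)
        tasks[order.pop()].append(tup)
    return tasks


def fields_grouping(tuples, n_tasks, key_index):
    """Equal key values always land on the same task (hash partitioning)."""
    tasks = [[] for _ in range(n_tasks)]
    for tup in tuples:
        digest = zlib.crc32(str(tup[key_index]).encode("utf-8"))
        tasks[digest % n_tasks].append(tup)
    return tasks


def all_grouping(tuples, n_tasks):
    """Every task receives a copy of every tuple."""
    return [list(tuples) for _ in range(n_tasks)]


def global_grouping(tuples, n_tasks):
    """The whole stream goes to one task (here: the lowest index)."""
    tasks = [[] for _ in range(n_tasks)]
    tasks[0] = list(tuples)
    return tasks


def direct_grouping(tuples, n_tasks, choose_task):
    """The sender picks the receiving task for each tuple."""
    tasks = [[] for _ in range(n_tasks)]
    for tup in tuples:
        tasks[choose_task(tup) % n_tasks].append(tup)
    return tasks


# Hypothetical (user, vote) tuples.
votes = [("alice", 1), ("bob", 1), ("alice", 1),
         ("carol", 1), ("bob", 1), ("alice", 1)]

print("shuffle:", shuffle_grouping(votes, 3))
print("fields: ", fields_grouping(votes, 3, key_index=0))
print("all:    ", all_grouping(votes, 3))
print("global: ", global_grouping(votes, 3))
print("direct: ", direct_grouping(votes, 3, lambda tup: len(tup[0])))
```

Fields grouping is the one to reach for when a bolt keeps per-key state, such as a counter per user, since all tuples for a key then meet in the same task.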
Project Topology
Fig. 5: Project Topology
Web Services
Node.js
Install Node.js
https://github.com/joyent/node/archive/master.zip
./configure
make
make install
Run web services
node server.js
Kafka
Server Start
Stages
1. zookeeperServer.sh
2. kafkaServer.sh
3. createTopics.sh voteLog
Web Services
Connecting to Kafka
Install the kafka-python API
pip install ./kafka-python
runKafkaLogs.sh
./tail2kafka/tail2kafka -l ../logs/vote-info.log -t voteLog -s localhost -p 9092 -d 5
Final stage
createProducer.sh voteLog
Questions?
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 

Tutorial Kafka-Storm

  • 1. Scalable Distributed Systems (Sistemas distribuidos escalables). Tutorial. Miguel Cárcamo Vásquez, Daniel Wladdimiro Cottet. Professors: Erika Rosas Olivos, Nicolás Hidalgo Castillo. Departamento de Ingeniería Informática, Universidad de Santiago de Chile. November, 2014. M. Cárcamo & D. Wladdimiro (USACH) Kafka & Storm November, 2014 1 / 31
  • 2. Kafka: What is Kafka? Apache Kafka is publish-subscribe messaging rethought as a distributed commit log. • Fast: hundreds of megabytes of reads and writes per second • Scalable: expands elastically and transparently • Durable: messages are persisted on disk
  • 3. Kafka: Architecture. Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
  • 4. Kafka: Architecture. A two-server Kafka cluster hosting four partitions (P0-P3) with two consumer groups. Consumer group A has two consumer instances and group B has four.
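
The consumer-group picture on slide 4 can be sketched in plain Java. This is a simplified model, not Kafka's real assignment code: it assumes partitions are dealt round-robin, and the class name `GroupAssignmentSketch` and the consumer names (`A1`, `B1`, ...) are made up for illustration. The point it shows is that, inside one group, each partition is owned by exactly one consumer instance, while every group sees all four partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model (assumption): partitions are dealt round-robin to the
// consumer instances of one group. Kafka's own assignment logic differs.
final class GroupAssignmentSketch {

    // Returns, for each consumer, the list of partition ids it owns.
    static Map<String, List<Integer>> assign(int partitions, List<String> consumers) {
        Map<String, List<Integer>> owned = new LinkedHashMap<>();
        for (String consumer : consumers) {
            owned.put(consumer, new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            // Each partition gets exactly one owner within the group.
            owned.get(consumers.get(p % consumers.size())).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Four partitions (P0-P3), as on the slide.
        System.out.println("group A: " + assign(4, Arrays.asList("A1", "A2")));
        System.out.println("group B: " + assign(4, Arrays.asList("B1", "B2", "B3", "B4")));
    }
}
```

With two instances, each consumer in group A owns two partitions; with four instances, each consumer in group B owns one.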
  • 5. Kafka: ZooKeeper • zookeeperServer.sh: bin/zookeeper-server-start.sh ../config/zookeeper.properties • Configuration: dataDir, clientPort, maxClientCnxns
  • 6. Kafka: Kafka Server • kafkaServer.sh: bin/kafka-server-start.sh ../config/server.properties • Mandatory configuration: broker.id, log.dirs, zookeeper.connect • Optional configuration: Log basics (num.partitions); Log Retention Policy (log.retention.hours, log.flush.interval.messages, log.flush.interval.ms)
  • 7. Kafka: Create Topics • createTopics.sh: bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic $1 • Parameters: replication-factor, partitions, topic • Configuration (--config): max.message.bytes, index.interval.bytes, flush.messages, flush.ms
  • 8. Kafka: Check Topics • checkTopics.sh: bin/kafka-topics.sh --list --zookeeper localhost:2181
  • 9. Kafka: Producer • createProducer.sh: bin/kafka-console-producer.sh --broker-list localhost:9092 --topic $1 • Mandatory configuration: metadata.broker.list, request.required.acks, producer.type, serializer.class • Optional configuration: compression.codec, request.timeout.ms
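
The producer settings listed on slide 9 can be collected in a `java.util.Properties` object, which is how a Java client of that era would typically receive them. This is a minimal sketch: the property names come from the slide, while the values (`1`, `sync`, `kafka.serializer.StringEncoder`, `none`, `10000`) and the class name `ProducerConfigSketch` are assumptions chosen for illustration, not settings taken from the tutorial.

```java
import java.util.Properties;
import java.util.TreeSet;

// Sketch only: builds the configuration, it does not open a Kafka connection.
final class ProducerConfigSketch {

    static Properties producerConfig(String brokerList) {
        Properties props = new Properties();
        // Settings the slide lists as mandatory.
        props.setProperty("metadata.broker.list", brokerList);
        props.setProperty("request.required.acks", "1");       // assumed value
        props.setProperty("producer.type", "sync");            // assumed value
        props.setProperty("serializer.class", "kafka.serializer.StringEncoder"); // assumed value
        // Settings the slide lists as optional.
        props.setProperty("compression.codec", "none");        // assumed value
        props.setProperty("request.timeout.ms", "10000");      // assumed value
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerConfig("localhost:9092");
        // Print the settings in alphabetical order.
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

The broker address `localhost:9092` matches the `--broker-list` value used by createProducer.sh.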
  • 10. Kafka: Consumer • createConsumer.sh: bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic $1 --from-beginning • Mandatory configuration: group.id, zookeeper.connect • Optional configuration: fetch.message.max.bytes, consumer.id
  • 11. Kafka: Clients. Producer Daemon, Storm, Python, Scala DSL, Go (AKA golang), HTTP REST, C, JRuby, C++, Perl, .NET, Clojure, Ruby, Node.js
  • 12. Kafka: Multi-Broker • createMultiBroker.sh: cp config/server.properties config/server-1.properties; cp config/server.properties config/server-2.properties • config/server-1.properties: broker.id=1, port=9093, log.dir=/tmp/kafka-logs-1 • config/server-2.properties: broker.id=2, port=9094, log.dir=/tmp/kafka-logs-2
  • 13. Kafka: Create Kafka Server • Kafka Server 1: ../bin/kafka-server-start.sh config/server-1.properties & • Kafka Server 2: ../bin/kafka-server-start.sh config/server-2.properties &
  • 14. Topic with replication • Create new topic: ../bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic • Show topic: ../bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
  • 15. Fault Tolerance • Kill a replica: ps -ef | grep server-1.properties, then kill -9 <pid>, where <pid> is the process id reported by ps
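
What the kill on slide 15 exercises can be pictured with a toy model of a single replicated partition. This is a sketch under loud assumptions: real Kafka elects leaders through its controller and ZooKeeper, whereas the hypothetical `FailoverSketch` class below simply promotes the next broker in the in-sync replica list. It only illustrates why a topic created with replication factor 3 survives the loss of one broker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (assumption): one partition, a leader, and a list of in-sync
// replicas identified by broker id. Not Kafka's actual election code.
final class FailoverSketch {
    private final List<Integer> inSyncReplicas;
    private int leader;

    FailoverSketch(List<Integer> replicas) {
        this.inSyncReplicas = new ArrayList<>(replicas);
        this.leader = replicas.get(0); // first replica starts as leader
    }

    // Removes the dead broker; if it was the leader, promotes the next replica.
    void brokerDied(int brokerId) {
        inSyncReplicas.remove(Integer.valueOf(brokerId));
        if (brokerId == leader) {
            if (inSyncReplicas.isEmpty()) {
                throw new IllegalStateException("no in-sync replica left");
            }
            leader = inSyncReplicas.get(0);
        }
    }

    int leader() { return leader; }

    List<Integer> inSyncReplicas() { return inSyncReplicas; }

    public static void main(String[] args) {
        // Replication factor 3 across brokers 1, 2 and 0; broker 1 leads.
        FailoverSketch partition = new FailoverSketch(Arrays.asList(1, 2, 0));
        partition.brokerDied(1); // corresponds to killing server-1
        System.out.println("leader=" + partition.leader()
            + " isr=" + partition.inSyncReplicas());
    }
}
```

On a real cluster, running the --describe command from slide 14 before and after the kill is the way to observe the actual leader and in-sync replica set.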
  • 16. Storm: What is Storm? • Computation platform for stream data processing • Fault tolerant • Scalable • Distributed • Reliable • Learn, code and run
  • 17. Architecture. Fig. 1: Storm Cluster
  • 18. Spouts & Bolts. Fig. 2: Spouts & Bolts
  • 19. Physical & Logical. Fig. 3: Physical & Logical Architecture
  • 20. Before coding • Install Maven or Gradle • Install Eclipse (only if you want to)
  • 21. Coding a Spout: Structure • import libraries • public class SpoutName extends BaseRichSpout • class variables • public void open(Map conf, TopologyContext topologyContext, SpoutOutputCollector collector) • public void nextTuple() • public void declareOutputFields(OutputFieldsDeclarer declarer) • Your methods
  • 22. Coding a Bolt: Structure • import libraries • public class BoltName extends BaseRichBolt • class variables • public BoltName() (constructor) • public void prepare(Map map, TopologyContext topologyContext, OutputCollector collector) • public void execute(Tuple input) • public void declareOutputFields(OutputFieldsDeclarer declarer) • Your methods
  • 23. Coding a Topology: Structure • import libraries • public class Topology • class variables • public static void main(String[] args) • Config config = new Config() • TopologyBuilder b = new TopologyBuilder() • b.setSpout("SpoutName", new SpoutName()) • b.setBolt("BoltName", new BoltName()).shuffleGrouping("SpoutName") • final LocalCluster cluster = new LocalCluster() • cluster.submitTopology("TopologyName", config, b.createTopology())
  • 24. Compile & Run • Download a Storm release, unpack it, and put the unpacked bin/ directory on your PATH. • cd myapp • mvn package • storm jar target/my-app-1.0-SNAPSHOT.jar com.mycompany.app.App
  • 25. Grouping. Fig. 4: Groupings
  • 26. Grouping • Shuffle grouping: stream tuples are randomly distributed across the bolt's tasks such that each task is guaranteed to get an equal number of tuples. • Fields grouping: stream tuples are partitioned by the fields specified in the grouping, so tuples with the same field values always go to the same task. • All grouping: stream tuples are replicated across all the bolt's tasks. • Global grouping: the entire stream goes to a single task of the bolt. • Direct grouping: the source decides which task of the receiving component will get the tuple.
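
The difference between the groupings on slide 26 comes down to how a target task is picked for each tuple. The sketch below models three of them in plain Java, without Storm on the classpath. It is a simplification under stated assumptions: the hypothetical `GroupingSketch` class uses a plain random pick for shuffle grouping (which balances load only on average, unlike the equal-share guarantee described above) and a hash of a single field for fields grouping.

```java
import java.util.Random;

// Simplified model of task selection; not Storm's internal code.
final class GroupingSketch {

    // Shuffle grouping (assumption: plain random pick among the tasks).
    static int shuffleTask(Random rnd, int numTasks) {
        return rnd.nextInt(numTasks);
    }

    // Fields grouping: hashing the field value sends equal values to the same task.
    static int fieldsTask(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    // Global grouping: the whole stream goes to one task (index 0 in this model).
    static int globalTask() {
        return 0;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        int numTasks = 4;
        for (String word : new String[] {"alice", "bob", "alice"}) {
            System.out.println(word
                + " shuffle->" + shuffleTask(rnd, numTasks)
                + " fields->" + fieldsTask(word, numTasks)
                + " global->" + globalTask());
        }
    }
}
```

Both "alice" tuples report the same fields-grouping task, while their shuffle-grouping task may differ between them and from run to run. That property is what makes fields grouping suitable for per-key counting.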
  • 27. Project Topology. Fig. 5: Project Topology
  • 28. Web Services: Node.js • Install Node.js: download https://github.com/joyent/node/archive/master.zip, then run ./configure, make, make install • Run web services: node server.js
  • 29. Kafka: Server Start • Stages: (1) zookeeperServer.sh (2) kafkaServer.sh (3) createTopics.sh voteLog
  • 30. Web Services: Connection to Kafka • Install API Kafka-Python: pip install ./kafka-python • runKafkaLogs.sh: ./tail2kafka/tail2kafka -l ../logs/vote-info.log -t voteLog -s localhost -p 9092 -d 5 • Final stage: createProducer.sh voteLog
  • 31. Questions?