Kafka & message bus
Robin GRAILLON & Alexandre ANDRÉ
12/10/2016
Synchronous
● Call
● Processing
○ Might be long
● Response
Asynchronous
● Call
● Response
● Post-Processing
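The two call styles above can be sketched in a few lines. This is a minimal illustration, not code from the talk: `process`, `handle_sync` and `run_async` are hypothetical names, and an in-process queue with one worker thread stands in for a real message bus.

```python
import queue
import threading
import time


def process(payload):
    """Stand-in for slow work (hypothetical; sleeps briefly)."""
    time.sleep(0.01)
    return payload.upper()


def handle_sync(payload):
    # Call -> Processing -> Response: the caller waits for the processing.
    return "done: " + process(payload)


def run_async(payloads):
    """Call -> Response -> Post-processing, using a queue and one worker."""
    jobs = queue.Queue()
    processed = []

    def worker():
        while True:
            payload = jobs.get()
            if payload is None:  # sentinel: no more work
                break
            processed.append(process(payload))

    thread = threading.Thread(target=worker)
    thread.start()

    responses = []
    for payload in payloads:
        jobs.put(payload)             # hand the work over...
        responses.append("accepted")  # ...and respond immediately

    jobs.put(None)
    thread.join()  # only so the demo can inspect the results
    return responses, processed


sync_response = handle_sync("login")
async_responses, async_processed = run_async(["login", "logout"])
print(sync_response)
print(async_responses, async_processed)
```

In the asynchronous case the response no longer depends on how long the processing takes, which is the property a message bus provides between separate services.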
Message Bus
● Event aggregation
● CQRS oriented
● Microservices oriented
● Event sourcing oriented
● Language agnostic
● Multiple implementations
○ RabbitMQ
○ Kafka
○ etc.
Apache Kafka
● Created at LinkedIn
● Open sourced in 2011
● Filesystem oriented
● Written in Scala
● Highly scalable
● Used by big companies
○ LinkedIn
○ Netflix
○ Spotify
○ Meetic
LinkedIn statistics
● 800 billion messages/day
○ 175 TB
● 13 million messages/sec at peak
○ 2.75 GB/sec
● 1,100 Kafka instances
○ 60 clusters
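A quick sanity check of these figures, as a sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) and that "2.75 GB" is a per-second volume; the variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = 800e9     # 800 billion messages/day
bytes_per_day = 175e12       # 175 TB/day (decimal units assumed)
stated_messages_per_sec = 13e6
stated_bytes_per_sec = 2.75e9  # assumed to be GB per second
instances = 1100
clusters = 60

# Average rate implied by the daily total.
avg_messages_per_sec = messages_per_day / SECONDS_PER_DAY

# Average message size implied by each pair of figures.
avg_size_daily = bytes_per_day / messages_per_day
avg_size_per_sec = stated_bytes_per_sec / stated_messages_per_sec

instances_per_cluster = instances / clusters

print(f"average rate: {avg_messages_per_sec / 1e6:.2f} million messages/sec")
print(f"message size (daily figures): {avg_size_daily:.0f} bytes")
print(f"message size (per-second figures): {avg_size_per_sec:.0f} bytes")
print(f"instances per cluster: {instances_per_cluster:.1f}")
```

The daily total works out to a lower average rate than the quoted 13 million messages/sec, which is consistent with that figure being a busy-hour rate. Both pairs of figures imply messages of roughly 200 bytes.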
Basic workflow
[Diagram: Producers 1, 2, 3 ... X → Kafka stack → Consumers 1, 2, 3 ... Y]
Kafka stack
● ZooKeeper: coordinates and manages the cluster (scalability)
● Kafka broker: a Kafka server instance
● Consumer: consumes events and acts on them
● Producer: produces events (e.g. "a user has connected")
● Topic: the event name
● Partition: a way to split a topic's messages between brokers
Kafka partitions (1)
For one topic (let’s say “astronaut.connection”)
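How messages of one topic end up in different partitions can be sketched as "hash the message key, take it modulo the partition count". This is a toy model: Kafka's default partitioner uses its own hash function, and `zlib.crc32` is only a stand-in here. The function names and the sample keys are hypothetical.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a message key to a partition (crc32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def spread(keys, num_partitions):
    """Group keys by the partition they would be written to."""
    partitions = {p: [] for p in range(num_partitions)}
    for key in keys:
        partitions[choose_partition(key, num_partitions)].append(key)
    return partitions


# Hypothetical keys for the "astronaut.connection" topic.
keys = ["armstrong", "aldrin", "collins", "armstrong", "aldrin"]
layout = spread(keys, 3)
for partition, members in layout.items():
    print(partition, members)
```

The useful property is that the same key always lands in the same partition, so events for one key keep their relative order.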
Kafka stack & partitions
Let’s play...
...but easily!
./console run
HELP
  run:zk     run zookeeper
  kill:zk    kill zookeeper
  run:kk     run kafka
  kill:kk    kill kafka
  producer   run producer on topic
  consumer   run consumer on topic
  t:c        create a topic
  t:d        delete a topic
  t:l        list topics
What do you want to do?
Step 1: launch Zookeeper!
What do you want to do? run:zk
Configuration file? [./zookeeper.properties]
Port? [2181]
./../bin/zookeeper-server-start.sh ./zookeeper.properties
What do you want to do?
Step 2: launch Kafka!
What do you want to do? run:kk
Step 3: list topics!
What do you want to do? t:l
Zookeeper host? [localhost:2181]
./../bin/kafka-topics.sh --list --zookeeper localhost:2181
What do you want to do?
Step 4: create a topic!
What do you want to do? t:c
Topic name? [test]
Zookeeper host? [localhost:2181]
Partitions? [1]
Replication factor? [1]
./../bin/kafka-topics.sh --create --zookeeper localhost:2181 --topic test --partitions 1 --replication-factor 1
Created topic "test".
What do you want to do?
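The prompts in step 4 map one-to-one onto command-line flags. The sketch below shows how a wrapper could assemble that command from the answers, with the bracketed defaults as default arguments. The real `./console` script is not shown in these slides, so `create_topic_command` and its parameters are hypothetical; only flags that appear in the session above are used. Note that this is the 2016-era CLI: Kafka 3.0 and later removed `--zookeeper` from `kafka-topics.sh` in favour of `--bootstrap-server`.

```python
import shlex


def create_topic_command(
    topic="test",
    zookeeper="localhost:2181",
    partitions=1,
    replication_factor=1,
    kafka_bin="./../bin",  # path taken from the session above
):
    """Build the kafka-topics.sh argument list from the prompt answers."""
    return [
        f"{kafka_bin}/kafka-topics.sh",
        "--create",
        "--zookeeper", zookeeper,
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


default_command = create_topic_command()
print(shlex.join(default_command))

# Hypothetical variation: a topic with two partitions.
print(shlex.join(create_topic_command(topic="astronaut.connection", partitions=2)))
```

Keeping the command as a list of arguments (rather than one string) means it can be passed to `subprocess.run` without shell quoting problems.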
Step 5: list topics AGAIIIIN!
What do you want to do? t:l
Zookeeper host? [localhost:2181]
./../bin/kafka-topics.sh --list --zookeeper localhost:2181
test
What do you want to do?
Step 6: run a consumer!
New terminal!
$ ./console run
What do you want to do? consumer
Topic name? [test]
Zookeeper host? [localhost:2181]
From beginning? [1]
./../bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
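What `--from-beginning` changes can be shown with a toy model of a topic as an append-only log: a consumer is just a position (offset) in that log. With the flag, the console consumer starts at the oldest retained message; without it, it only sees messages produced after it started. `TopicLog` and `Consumer` below are illustrative classes, not Kafka APIs.

```python
class TopicLog:
    """A single-partition topic modelled as an append-only list."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)
        return len(self.messages) - 1  # offset of the new message

    def read(self, offset):
        return self.messages[offset:]


class Consumer:
    """Tracks its own offset; reading never removes messages from the log."""

    def __init__(self, log, from_beginning):
        self.log = log
        self.offset = 0 if from_beginning else len(log.messages)

    def poll(self):
        batch = self.log.read(self.offset)
        self.offset += len(batch)
        return batch


log = TopicLog()
log.append("old-1")
log.append("old-2")

replaying = Consumer(log, from_beginning=True)
latest_only = Consumer(log, from_beginning=False)

log.append("new-1")

replayed = replaying.poll()
fresh = latest_only.poll()
print(replayed)
print(fresh)
```

Because consuming only moves an offset, two consumers can read the same messages independently, and a consumer can replay history.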
Step 7: generate data!
What do you want to do? g
Message? [I'm an astronaut!]
How many times? [1] 10000000
File? [data_10000000.txt]
10000000/10000000 [============================] 100%
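The generator step presumably just writes the same message N times to a file. A minimal sketch under that assumption follows; `generate_dataset` is a hypothetical name, the `data_<N>.txt` naming is copied from the prompt above, and one message per line is chosen because the console producer treats each line of its input as one message.

```python
import tempfile
from pathlib import Path


def generate_dataset(message, times, directory):
    """Write `message` on `times` lines of <directory>/data_<times>.txt."""
    path = Path(directory) / f"data_{times}.txt"
    with path.open("w", encoding="utf-8") as handle:
        for _ in range(times):
            handle.write(message + "\n")
    return path


# Small demo in a temporary directory (the talk used 10000000 lines).
with tempfile.TemporaryDirectory() as tmp:
    dataset = generate_dataset("I'm an astronaut!", 3, tmp)
    dataset_name = dataset.name
    content = dataset.read_text(encoding="utf-8")

print(dataset_name)
print(content, end="")
```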
Step 8: produce events!
Back to the other terminal
What do you want to do? p
Topic name? [test]
brokers? [localhost:9092]
data_10000000.txt
Dataset to use? [data_10000000.txt]
./../bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test < ./data/data_10000000.txt
GroupId: experiment 1
● 2 group IDs
● 2 partitions
● 2 consumers
GroupId: experiment 2
● 1 group ID
● 2 consumers
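What to expect from the two experiments can be simulated with a toy model of consumer groups. The rule being modelled is Kafka's: every group receives every message, and within one group each partition is read by exactly one consumer. The round-robin assignment below is a simplification of Kafka's real assignors, the names are hypothetical, and experiment 2 is assumed to keep the same 2 partitions (the slide does not say).

```python
def assign_partitions(partitions, consumers):
    """Spread partitions round-robin over the consumers of ONE group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def deliver(log, groups):
    """log: {partition: [messages]}, groups: {group_id: [consumers]}."""
    received = {}
    for group_id, consumers in groups.items():
        assignment = assign_partitions(sorted(log), consumers)
        for consumer, partitions in assignment.items():
            received[(group_id, consumer)] = [
                message for p in partitions for message in log[p]
            ]
    return received


log = {0: ["m0", "m2"], 1: ["m1", "m3"]}  # one topic, 2 partitions

# Experiment 1: 2 group IDs, one consumer each.
experiment_1 = deliver(log, {"group-a": ["c1"], "group-b": ["c2"]})

# Experiment 2: 1 group ID shared by both consumers.
experiment_2 = deliver(log, {"group-a": ["c1", "c2"]})

print(experiment_1)
print(experiment_2)
```

In experiment 1 both consumers see all four messages (broadcast). In experiment 2 the messages are split between the two consumers (load sharing). With a single partition, the second consumer of a shared group would sit idle in this model.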
Documentation/resources
● Kafka quick setup
https://kafka.apache.org/quickstart
● Kafka at LinkedIn
https://engineering.linkedin.com/kafka/running-kafka-scale
● Why ZooKeeper?
https://www.quora.com/What-is-the-actual-role-of-ZooKeeper-in-Kafka