SlideShare a Scribd company logo
1 of 83
Download to read offline
Paul Brebner
Technology Evangelist
www.instaclustr.com
sales@instaclustr.com
© Instaclustr Pty Limited, 2022 [https://www.instaclustr.com/
company/policies/terms-conditions/]. Except as permitted by the
copyright law applicable to you, you may not reproduce, distribute,
publish, display, communicate or transmit any of the content of this
document, in any form, but any means, without the prior written
permission of Instaclustr Pty Limited.
In this Visual Introduction to Kafka,
we’re going to build a
Postal
Service
We’ll learn about Kafka Producers,
Consumers, Topics, Partitions, Keys,
Records, Delivery Semantics
(Guaranteed delivery, and who gets what
messages), Consumer Groups, Kafka
Connect and Streams!
©Instaclustr Pty Limited 2019, 2021, 2022
Kafka is a distributed streams processing system, it
allows distributed producers to send messages to
distributed consumers via a Kafka cluster.
What is
©Instaclustr Pty Limited 2019, 2021, 2022
Kafka?
Kafka has lots of benefits:
It’s Fast: It has high throughput and low latency
It’s Scalable: It’s horizontally scalable, to scale just add nodes and partitions
It’s Reliable: It’s distributed and fault tolerant
It has Zero Data Loss: Messages are persisted to disk with an immutable log
It’s Open Source: An Apache project
And it’s available as an Instaclustr Managed Service: On multiple cloud platforms
Managed
Service
Fast
Scalable
Reliable
Durable
Open Source
©Instaclustr Pty Limited 2019, 2021, 2022
But the usual
Kafka diagram
(right) is a bit
monochrome
and boring.
©Instaclustr Pty Limited 2019, 2021, 2022
This visual introduction
will be more
colourful
and it’s going to be
an extended story…
©Instaclustr Pty Limited 2019, 2021, 2022
Let’s build a modern day fully electronic postal service
T
o send messages from A to B
Postal
Service
A B
©Instaclustr Pty Limited 2019, 2021, 2022
T
o B, the consumer,
the recipient of the
message.
A is a producer,
it sends a message…
First, we need an “A”.
©Instaclustr Pty Limited 2019, 2021, 2022
Due to the decline in
“snail mail” volumes,
direct deliveries have
been canceled.
CANCELED
Actually,
not.
©Instaclustr Pty Limited 2019, 2021, 2022
Consumers poll for messages by
visiting the counter at the post
office.
Poste Restante is not a post office
in a restaurant, it’s called general
delivery (in the US).
The mail is delivered to a post
office, and they hold it for you
until you call for it.
Instead we have
“Poste Restante”
Image: La Poste Restante, Francois-Auguste Biard (Wikimedia)
©Instaclustr Pty Limited 2019, 2021, 2022
Disconnected delivery—consumer doesn’t need to be
available to receive messages
There’s less effort for the messaging service— only
has to deliver to a few locations not many consumer
addresses
And it can scale better and handle more
complex delivery semantics! Postal
Service
Kafka topics act like a Post Office.
What are the benefits?
©Instaclustr Pty Limited 2019, 2021, 2022
Kafka Topics have 1 or more Partitions. Partitions function like multiple counters
and enable high concurrency.
A single counter introduces
delays and limits concurrency.
More counters
increases concurrency and
reduces delays.
First lets see how it scales.
What if there are many consumers for a topic?
©Instaclustr Pty Limited 2019, 2021, 2022
Santa
North
Pole
Let’s see what a message looks like.
In Kafka a message is
called a Record and is a
bit like a letter.
The topic is the
destination,
The North Pole.
©Instaclustr Pty Limited 2019, 2021, 2022
Santa
North
Pole Time semantics are flexible,
either the time of event
creation, ingestion, or
processing.
timestamp,
offset,
partition
T
opic
The “Postmark” includes a
timestamp, offset in the
topic, and the partition it
was sent to.
©Instaclustr Pty Limited 2019, 2021, 2022
Santa
North
Pole
We want this letter
sent to Santa not
just a random Elf.
timestamp,
offset,
partition
T
opic
Key Partition
(optional)
There’s also a thing called a Key, which is
optional. It refines the destination so it’s a
bit like the rest of the address.
©Instaclustr Pty Limited 2019, 2021, 2022
Santa
North
Pole
And the value is the contents (just a byte array).
Kafka Producers and consumers need to have a shared
serializer and de-serializer for both the key and value.
timestamp,
offset,
partition
T
opic
Key Partition
(optional)
Value (Content)
©Instaclustr Pty Limited 2019, 2021, 2022
Kafka doesn’t look
inside the value, but
the Producer and
Consumer do, and the
Consumer can try and
make sense of the
message
(Can you?!)
Image: Dear Santa by Zack Poitras / http://theinclusive.net/article.php?id=268
©Instaclustr Pty Limited 2019, 2021, 2022
let’s look at
delivery semantics
For example, do we care if the
message actually arrives or not?
Next
©Instaclustr Pty Limited 2019, 2021, 2022
Last century, homing pigeons were
prone to getting lost or eaten by
predators, so the same message was
sent with several pigeons.
Yes we do!
Guaranteed
message
delivery is
desirable.
©Instaclustr Pty Limited 2019, 2021, 2022
How does Kafka guarantee delivery?
The message is always
persisted to disk.
This makes it
resilient to
power failure
A Message (M1) is
written to a broker (2).
Producer
M1
M1
Broker
1
Broker
2
Broker
3
©Instaclustr Pty Limited 2019, 2021, 2022
Producer
Broker
1
Broker Broker
3
M1 M1 M1
The message is also replicated on
multiple brokers, 3 is typical.
2
©Instaclustr Pty Limited 2019, 2021, 2022
Producer
M1
M1 M1
And makes it resilient to
loss of some servers
(all but one).
Broker
1
©Instaclustr Pty Limited 2019, 2021, 2022
Finally the producer gets acknowledgement once the message is
persisted and replicated (configurable for number, and sync or async).
Producer
M1
Broker
1
Broker
2
Broker
3
M1 M1
Acknowledgement
This also increases the
read concurrency as
partitions are spread
over multiple brokers.
The message is now
available from more
than one broker in
case some fail.
©Instaclustr Pty Limited 2019, 2021, 2022
let’s look at another aspect of
delivery semantics
Who gets the messages and how many
times are messages delivered?
Now
©Instaclustr Pty Limited 2019, 2021, 2022
Producer
Consumer
Consumer
Consumer
Consumer
?
Kafka is “pub-sub”. It’s loosely coupled,
producers and consumers don’t know about
each other.
©Instaclustr Pty Limited 2019, 2021, 2022
Filtering, or which consumers get which messages, is topic based.
- Producers send messages to topics.
- Consumers subscribe to topics of interest, e.g. parties.
- When they poll they only receive messages sent to those topics.
None of these consumers will receive messages sent to the “Work” topic.
Producer
Consumer
Consumer
Consumer
Consumer
Topic “Parties”
Topic “Work”
Consumers subscribed
to Topic “Parties”
Consumers poll to
receive messages
from “Parties”
Consumers not subscribed to
“Work” messages
©Instaclustr Pty Limited 2019, 2021, 2022
A few more details and we can see how this works.
Kafka works like Amish Barn raising.
Partitions and a consumer group share work
across multiple consumers, the more
partitions a topic has the more consumers
it supports.
Image: Paul Cyr ©2018 NorthernMainePhotos.com
©Instaclustr Pty Limited 2019, 2021, 2022
Kafka also works like Clones.
It supports delivery of the same message to
multiple consumers with consumer groups.
Kafka doesn’t throw messages away
immediately they are delivered, so the
same message can be delivered to
multiple consumer groups.
Image: Shutterstock.com
©Instaclustr Pty Limited 2019, 2021, 2022
Consumers subscribed to ”parties” topic are allocated partitions.
When they poll they will only get messages from their allocated
partitions.
Consumer
Partition n
Topic “Parties”
Partition 1
Producer
Partition 2
Consumer Group
Consumer
Consumer Group
Consumer
Consumer
©Instaclustr Pty Limited 2019, 2021, 2022
This enables consumers in the same group to share the work
around. Each consumer gets only a subset of the available
messages.
Partition n
Topic “Parties”
Partition 1
Producer
Partition 2
Consumer Group
Consumer
Consumer
Consumers share
work within groups
Consumer
©Instaclustr Pty Limited 2019, 2021, 2022
Multiple groups enable message broadcasting. Messages
are duplicated across groups, as each consumer group
receives a copy of each message.
Consumer
Consumer
Consumer
Consumer
Topic “Parties”
Partition 1
Partition 2
Partition n
Producer
Consumer Group
Consumer Group
Messages are
duplicated across
Consumer groups
©Instaclustr Pty Limited 2019, 2021, 2022
Which messages are delivered to which consumers?
The final aspect of delivery semantics
is to do with message keys.
If a message has a key, then Kafka uses
Partition based delivery.
Messages with
the same key are
always sent to the same partition and
therefore the same consumer. And the
order (within partitions) is guaranteed.
Key
©Instaclustr Pty Limited 2019, 2021, 2022
But if the key is null, then Kafka uses
round robin delivery.
Each message is delivered to the next partition.
Round
robin
delivery
©Instaclustr Pty Limited 2019, 2021, 2022
Let’s look at a concrete example with two consumer groups:
Group 1: Nerds
which has multiple consumers
Group 2: The Pugsters
which has a single consumer, Zug
Image: Shutterstock.com
Bill
Paul
Penny
Kate
Millie
Jenny
Image: Nenad Aksic / Shutterstock.com
©Instaclustr Pty Limited 2019, 2021, 2022
Consumer 1
(Bill)
Consumer 2
(Jenny)
Consumer 1 (Zug
from The Pugsters)
Topic “Parties”
Partition 1
Partition 2
Partition n
Producer
Group “Nerds”
Group “Pugsters”
Consumers
subscribe to
“Parties”
Each message (1, 2, etc.) is sent to the next
partition, and consumers allocated to that
partition will receive the message when they
poll next.
Looking at the case
where there’s
No Keyfirst
Round robin
No
Key
1
2
etc
1
2
1
2
Consumer n
©Instaclustr Pty Limited 2019, 2021, 2022
Here’s what actually happens.
We’re not showing the producer,
topics, or partitions for simplicity.
You’ll have to imagine them.
Bill
Paul
Penny
Kate
Millie
Jenny
No
Key
©Instaclustr Pty Limited 2019, 2021, 2022
Bill
Penny
Kate
Millie
Jenny
Both Groups subscribe to T
opic“parties”
(assuming 6 partitions, each consumer in the Nerds
group gets 1 partition each; Zug gets them all)
1
Paul
Subscribe to
“Parties”
No
Key
©Instaclustr Pty Limited 2019, 2021, 2022
Bill
Pau
l
Penny
Kate
Millie
Jenny
Producer sends record with the
value “Pool party—Invitation” to
“parties” topic (there’s no key)
2
Invitation
No
Key
©Instaclustr Pty Limited 2019, 2021, 2022
Value
Bill
Paul
Penny
Kate
Millie
Jenny
Bill and Zug receive a copy of the
invitation and plan to attend
3
Invitation Invitation
No
Key
©Instaclustr Pty Limited 2019, 2021, 2022
Bill
Pen ny
Pau
l
Kate
Millie
Jenny
The Producer sends another record with the
value “Pool party—Canceled”
4
No
Key
Invitation
Canceled
©Instaclustr Pty Limited 2019, 2021, 2022
Invitation
Bill
Paul
Penny
Kate
Millie
Jenny
In the Nerds group, Jenny gets the message this time as it’s round robin, and Zug
gets it as he’s the only consumer in his group:
▶ Jenny ignores it as she didn’t get the original invite
▶ Bill wastes his time trying to go (as he doesn’t know it’s canceled)
▶ The rest of the gang aren’t surprised at not receiving any invites and
stay home to do some hacking
5
Invitation
Canceled
No
Key
Invitation
Canceled
©Instaclustr Pty Limited 2019, 2021, 2022
Zug plans
something else
fun instead… A
jam session with
his band
Image: Shutterstock.com
©Instaclustr Pty Limited 2019, 2021, 2022
Consumer 1
(Bill)
Consumer 2
(Jenny)
Consumer 1
(Zug)
Topic “Parties”
Partition 1
Partition 2
Partition n
Producer
Group “Nerds”
Group “Pugster”
Consumers
subscribe to
“Parties”
The key is hashed to a partition, so the Message is
always sent to that partition. Assume there are 3
messages, and messages 1 and 2 are hashed to the
same partition.
How does it work if
there is a Key?
1,2
3
etc
1,2
3
1,2
3
Consumer n
Hashed to
partition Key
©Instaclustr Pty Limited 2019, 2021, 2022
Bill
Paul
Penny
Kate
Millie
Jenny
As before Both Groups subscribe to
Topic “parties”
The Producer sends a record with the
key equal to “Pool Party” and the
value equal to “Invitation” to
“parties” topic
Here’s what happens with a
key, assuming that the key is
the “title” of the message
(“Pool Party”), and the value
is invitation or canceled
1
2
Key
Invitation
Key
Value
©Instaclustr Pty Limited 2019, 2021, 2022
Bill
Paul
Penny
Kate
Millie
Jenny
As before, Bill and Zug receive a copy of the
invitation and plan to attend
3
Invitation
Key
Key
Invitation
Key
Value
©Instaclustr Pty Limited 2019, 2021, 2022
Value
Bill
Pen n
y
Kate
Millie
Jenny
The Producer sends another record with
the same key but with the value
“canceled” to “parties” topic
4
Key
Invitation
Key
Value
Canceled
Paul
Value
Key
©Instaclustr Pty Limited 2019, 2021, 2022
Invitation
Key
Value
Paul
Penny
Kate
Millie
Jenny
This time, Bill and Zug receive the cancelation
(the same consumers as the key is identical)
5
BK
il
el
y
Key
Value
©Instaclustr Pty Limited 2019, 2021, 2022
Invitation
Key
Value
Bill
Canceled
Key
Value
Key
Invitation
BK
il
el
y
Key
Canceled
Key
Value
Key
Paul
Penny
Kate
Millie
The Producer sends out another
invitation to a Halloween party.
The key is different this time.
6
Key
©Instaclustr Pty Limited 2019, 2021, 2022
Key
Jenny
Bill
Paul
Penny
Kate
Millie
Jenny
Jenny receives the Halloween invitation as the key is
different and the record is sent to Jenny’s partition.
Zug is the only consumer in his group so he gets
every record no matter what partition it’s sent to.
7
Key
©Instaclustr Pty Limited 2019, 2021, 2022
Bill
Jenny’s
partitionkey
This time Zug
gets dressed up
and has fun at
the party.
Image: Shutterstock.com
©Instaclustr Pty Limited 2019, 2021, 2022
But wait! There’s more—
event reprocessing
(time travel)!
Kafka stores message streams on disk, so
Consumers can go back and request the same
messages they’ve already received, earlier messages, or
ignore some messages etc.
Image: Shutterstock.com
©Instaclustr Pty Limited 2019, 2021, 2022
©Instaclustr Pty Limited 2019, 2021, 2022
So Zug can go
“back to the
future”!
©Instaclustr Pty Limited 2019, 2021, 2022
But! The postal system
is global and
heterogeneous
©Instaclustr Pty Limited 2019, 2021, 2022
How can
post offices
be connected?
©Instaclustr Pty Limited 2019, 2021, 2022
Underground pneumatic tubes delivered mail between
postal facilities in USA cities in the 1900’s
(Source: Wikimediacommons)
Compressed Air?
©Instaclustr Pty Limited 2019, 2021, 2022
Sink
Source
Kafka Connect enables message flows across
heterogenous systems.
From Sources to Sinks via a Lake (Kafka)
©Instaclustr Pty Limited 2019, 2021, 2022
Kafka Connect Architecture:
Source and Sink Connectors
©Instaclustr Pty Limited 2019, 2021, 2022
Form Pipelines (Berlin)
©Instaclustr Pty Limited 2019, 2021, 2022
(Source: Paul Brebner)
For Beer?
©Instaclustr Pty Limited 2019, 2021, 2022
(Source: Paul Brebner)
Tides Topic
REST call
JSON result
{"metadata": {
"id":"8724580",
"name":"Key West",
"lat":"24.5508”,
"lon":"-81.8081"},
"data":[{
"t":"2020-09-24 04:18",
"v":"0.597"}]}
Elasticsearch sink connector Tides Index
Example of a Kafka Connect IoT pipeline:
Tidal Data à Kafka à Elasticsearch à Kibana
©Instaclustr Pty Limited, 2021
REST source connector
{"metadata": {
"id":"8724580",
"name":"Key West",
"lat":"24.5508”,
"lon":"-81.8081"},
"data":[{
"t":"2020-09-24
04:18",
"v":"0.597"}]}
©Instaclustr Pty Limited 2019, 2021, 2022
Tides Topic
REST call
JSON result
{"metadata": {
"id":"8724580",
"name":"Key West",
"lat":"24.5508”,
"lon":"-81.8081"},
"data":[{
"t":"2020-09-24 04:18",
"v":"0.597"}]}
Elasticsearch sink connector Tides Index
Example of a Kafka Connect IoT pipeline:
Tidal Data à Kafka à Elasticsearch à Kibana
©Instaclustr Pty Limited, 2021
REST source connector
{"metadata": {
"id":"8724580",
"name":"Key West",
"lat":"24.5508”,
"lon":"-81.8081"},
"data":[{
"t":"2020-09-24
04:18",
"v":"0.597"}]}
©Instaclustr Pty Limited 2019, 2021, 2022
Connectors
require
Configuration
Source: https://commons.wikimedia.org/wiki/File:Royal_mail_sorting.jpg
What’s Still Missing?
Mail Sorting
©Instaclustr Pty Limited 2019, 2021, 2022
Automated!
Source: https://commons.wikimedia.org/wiki/File:Post_Sorting_Machine_(4479045801).jpg
©Instaclustr Pty Limited 2019, 2021, 2022
At Scale!
(Source: https://commons.wikimedia.org/w/index.php?search=mail+sorting&title=Special:MediaSearch&go=Go&type=image)
©Instaclustr Pty Limited 2019, 2021, 2022
Kafka Streams:
Topics in, Topics out, via Streams
©Instaclustr Pty Limited 2019, 2021, 2022
All Kafka APIs: Producer, Consumer, Connect, Streams
(Source: Shutterstock)
Simple Streams Topology
©Instaclustr Pty Limited 2019, 2021, 2022
Join
Group
Filter
Aggregate
etc.
Complex Streams Topology
(Source: Shutterstock)
Kafka Streams = Rapids!?
©Instaclustr Pty Limited 2019, 2021, 2022
(Source: Shutterstock)
Streams get complicated quickly!
One way to keep dry…
©Instaclustr Pty Limited 2019, 2021, 2022
Or, This Diagram Which Explains
The Order of Streams DSL Operations
(Source: https://kafka.apache.org/20/documentation/streams/developer-guide/dsl-api.html)
©Instaclustr Pty Limited 2019, 2021, 2022
Dr Black has been murdered in the
Billiard Room with a Candlestick!
Whodunnit?!
[KSTREAM-FILTER-0000000024]: Conservatory: Professor Plum has no alibi
[KSTREAM-FILTER-0000000024]: Library: Colonel Mustard has no alibi
[KSTREAM-FILTER-0000000024]: Billiard Room: Mrs White has no alibi
Cluedo Kafka Streams Example
Tracks who’s in
what rooms and
when, and emits
list of suspects
without an alibi
©Instaclustr Pty Limited 2019, 2021, 2022
Topology of Cluedo Streams Example
This tool is very useful for
visualizing and debugging streams
https://zz85.github.io/kafka-streams-viz/
©Instaclustr Pty Limited 2019, 2021, 2022
Some Kafka Use Cases
©Instaclustr Pty Limited 2019, 2021, 2022
Example 1 - Kafka ”Kongo” Logistics IoT
Application – Goods, Warehouses, Trucks,
Sensors and Rules
©Instaclustr Pty Limited 2019, 2021, 2022
Detect transportation and storage violations in
real-time
©Instaclustr Pty Limited 2019, 2021, 2022
And Kafka Streams to prevent Truck Overloading
©Instaclustr Pty Limited 2019, 2021, 2022
(Source: Shutterstock)
Example 2 - One of these things is not
like the others
©Instaclustr Pty Limited 2019, 2021, 2022
(Source: Shutterstock)
Massively Scalable Anomaly
Detection with Kafka and Cassandra
©Instaclustr Pty Limited 2019, 2021, 2022
19 Billion Checks/day with 470 CPU Cores
©Instaclustr Pty Limited 2019, 2021, 2022
0
2
4
6
8
10
12
14
16
18
20
0 50 100 150 200 250 300 350 400 450 500
Billion
checks/day
Total CPU Cores
Anomaly checks/day (billion)
19
Billion
Example 3 - Which Came First? State
or Events?
©Instaclustr Pty Limited 2019, 2021, 2022
State
(Source: Shutterstock)
State
Events
Change Data Capture (CDC) with Debezium and
Kafka Connect
State to Events and back to State again
©Instaclustr Pty Limited 2019, 2021, 2022
State
Events
State
Apache Kafka:
https://kafka.apache.org/
Gently down the Stream:
www.gentlydownthe.stream
That’s it for this
short visual
introduction to
Apache Kafka.
For more information
please have a look at the
Apache Kafka docs, the
Instaclustr Blogs, and check out
our free Kafka trial.
“Gently down the
Stream” - another
“Visual” introduction to
Kafka, with Otters!
©Instaclustr Pty Limited 2019, 2021, 2022
All of my blogs (Cassandra, Kafka, MirrorMaker, Spark, Zookeeper,
OpenSearch, Redis, PostgreSQL, Debezium, Cadence, etc)
www.instaclustr.com/paul-brebner/
Kafka Streams Cluedo Example (part of “Kongo” Kafka intro series)
www.instaclustr.com/blog/kongo-5-3-apache-kafka-streams-examples/
Kafka Connect Pipeline Series (Tides data processing)
www.instaclustr.com/blog/kafka-connect-pipelines-conclusion-pipeline-series-part-10/
Kafka Xmas Tree Lights Simulation (my 1st Kafka program)
www.instaclustr.com/blog/seasons-greetings-instaclustr-kafka-christmas-tree-light-simulation/
Instaclustr’s Managed Kafka (Free Trial)
www.instaclustr.com/platform/managed-apache-kafka/
Instaclustr Blogs
© Instaclustr Pty Limited 2019, 2021, 2022 [https://www.instaclustr.com/company/policies/terms-conditions/]. Except as permitted by the copyright law applicable to you, you may not reproduce, distribute,
publish, display, communicate or transmit any of the content of this document, in any form, but any means, without the prior written permission of Instaclustr Pty Limited.

More Related Content

What's hot

[GKE & Spanner 勉強会] Cloud Spanner の技術概要
[GKE & Spanner 勉強会] Cloud Spanner の技術概要[GKE & Spanner 勉強会] Cloud Spanner の技術概要
[GKE & Spanner 勉強会] Cloud Spanner の技術概要Google Cloud Platform - Japan
 
Inside the InfluxDB storage engine
Inside the InfluxDB storage engineInside the InfluxDB storage engine
Inside the InfluxDB storage engineInfluxData
 
Introduction à OpenStack
Introduction à OpenStackIntroduction à OpenStack
Introduction à OpenStackAnDaolVras
 
Cassandraのしくみ データの読み書き編
Cassandraのしくみ データの読み書き編Cassandraのしくみ データの読み書き編
Cassandraのしくみ データの読み書き編Yuki Morishita
 
Hadoopの概念と基本的知識
Hadoopの概念と基本的知識Hadoopの概念と基本的知識
Hadoopの概念と基本的知識Ken SASAKI
 
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...NTT DATA Technology & Innovation
 
Internet Week 2018 知っておくべきIPv6とセキュリティの話
Internet Week 2018 知っておくべきIPv6とセキュリティの話Internet Week 2018 知っておくべきIPv6とセキュリティの話
Internet Week 2018 知っておくべきIPv6とセキュリティの話Akira Nakagawa
 
[2018] 오픈스택 5년 운영의 경험
[2018] 오픈스택 5년 운영의 경험[2018] 오픈스택 5년 운영의 경험
[2018] 오픈스택 5년 운영의 경험NHN FORWARD
 
CVE-2021-3156 Baron samedit (sudoの脆弱性)
CVE-2021-3156 Baron samedit (sudoの脆弱性)CVE-2021-3156 Baron samedit (sudoの脆弱性)
CVE-2021-3156 Baron samedit (sudoの脆弱性)Tetsuya Hasegawa
 
Apache CloudStack Architecture by Alex Huang
Apache CloudStack Architecture by Alex HuangApache CloudStack Architecture by Alex Huang
Apache CloudStack Architecture by Alex Huangbuildacloud
 
Apache Sparkについて
Apache SparkについてApache Sparkについて
Apache SparkについてBrainPad Inc.
 
CloudStack再入門!15分でおさらいするCloudStackの基礎
CloudStack再入門!15分でおさらいするCloudStackの基礎CloudStack再入門!15分でおさらいするCloudStackの基礎
CloudStack再入門!15分でおさらいするCloudStackの基礎Satoshi Shimazaki
 
openstack+cephインテグレーション
openstack+cephインテグレーションopenstack+cephインテグレーション
openstack+cephインテグレーションOSSラボ株式会社
 
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...DataStax
 
Cassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary DifferencesCassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary DifferencesScyllaDB
 

What's hot (20)

[GKE & Spanner 勉強会] Cloud Spanner の技術概要
[GKE & Spanner 勉強会] Cloud Spanner の技術概要[GKE & Spanner 勉強会] Cloud Spanner の技術概要
[GKE & Spanner 勉強会] Cloud Spanner の技術概要
 
Apache OpenWhiskで実現するプライベートFaaS環境 #tjdev
Apache OpenWhiskで実現するプライベートFaaS環境 #tjdevApache OpenWhiskで実現するプライベートFaaS環境 #tjdev
Apache OpenWhiskで実現するプライベートFaaS環境 #tjdev
 
Inside the InfluxDB storage engine
Inside the InfluxDB storage engineInside the InfluxDB storage engine
Inside the InfluxDB storage engine
 
Introduction à OpenStack
Introduction à OpenStackIntroduction à OpenStack
Introduction à OpenStack
 
Cassandraのしくみ データの読み書き編
Cassandraのしくみ データの読み書き編Cassandraのしくみ データの読み書き編
Cassandraのしくみ データの読み書き編
 
Hadoopの概念と基本的知識
Hadoopの概念と基本的知識Hadoopの概念と基本的知識
Hadoopの概念と基本的知識
 
Ingenierie des reseaux radiomobiles 3
Ingenierie des reseaux radiomobiles 3Ingenierie des reseaux radiomobiles 3
Ingenierie des reseaux radiomobiles 3
 
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
大規模データ処理の定番OSS Hadoop / Spark 最新動向 - 2021秋 -(db tech showcase 2021 / ONLINE 発...
 
Internet Week 2018 知っておくべきIPv6とセキュリティの話
Internet Week 2018 知っておくべきIPv6とセキュリティの話Internet Week 2018 知っておくべきIPv6とセキュリティの話
Internet Week 2018 知っておくべきIPv6とセキュリティの話
 
[2018] 오픈스택 5년 운영의 경험
[2018] 오픈스택 5년 운영의 경험[2018] 오픈스택 5년 운영의 경험
[2018] 오픈스택 5년 운영의 경험
 
CVE-2021-3156 Baron samedit (sudoの脆弱性)
CVE-2021-3156 Baron samedit (sudoの脆弱性)CVE-2021-3156 Baron samedit (sudoの脆弱性)
CVE-2021-3156 Baron samedit (sudoの脆弱性)
 
Apache CloudStack Architecture by Alex Huang
Apache CloudStack Architecture by Alex HuangApache CloudStack Architecture by Alex Huang
Apache CloudStack Architecture by Alex Huang
 
Apache Sparkについて
Apache SparkについてApache Sparkについて
Apache Sparkについて
 
Nwdafまとめ
NwdafまとめNwdafまとめ
Nwdafまとめ
 
CloudStack再入門!15分でおさらいするCloudStackの基礎
CloudStack再入門!15分でおさらいするCloudStackの基礎CloudStack再入門!15分でおさらいするCloudStackの基礎
CloudStack再入門!15分でおさらいするCloudStackの基礎
 
openstack+cephインテグレーション
openstack+cephインテグレーションopenstack+cephインテグレーション
openstack+cephインテグレーション
 
「ネットワーク超入門 IPsec VPN編」
「ネットワーク超入門 IPsec VPN編」「ネットワーク超入門 IPsec VPN編」
「ネットワーク超入門 IPsec VPN編」
 
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
 
Hadoop入門
Hadoop入門Hadoop入門
Hadoop入門
 
Cassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary DifferencesCassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary Differences
 

Similar to A Visual Introduction to Apache Kafka.pdf

Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Timothy Spann
 
How Digital is Changing Direct Mail
How Digital is Changing Direct MailHow Digital is Changing Direct Mail
How Digital is Changing Direct MailDirect Development
 
MuleSoft Meetup Singapore #8 March 2021
MuleSoft Meetup Singapore #8 March 2021MuleSoft Meetup Singapore #8 March 2021
MuleSoft Meetup Singapore #8 March 2021Julian Douch
 
Open Source Telecom Software Survey 2022, Alan Quayle
Open Source Telecom Software Survey 2022, Alan QuayleOpen Source Telecom Software Survey 2022, Alan Quayle
Open Source Telecom Software Survey 2022, Alan QuayleAlan Quayle
 
APAC-05 XMPP AccessGrid presentation
APAC-05 XMPP AccessGrid presentationAPAC-05 XMPP AccessGrid presentation
APAC-05 XMPP AccessGrid presentationSteve Smith
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Databricks
 
TADSummit EMEA 2019, Challenges Consuming Programmable Telecoms from the Deve...
TADSummit EMEA 2019, Challenges Consuming Programmable Telecoms from the Deve...TADSummit EMEA 2019, Challenges Consuming Programmable Telecoms from the Deve...
TADSummit EMEA 2019, Challenges Consuming Programmable Telecoms from the Deve...Alan Quayle
 
Challenges Consuming Programmable Telecoms from the Developer’s Perspective
Challenges Consuming Programmable Telecoms from the Developer’s PerspectiveChallenges Consuming Programmable Telecoms from the Developer’s Perspective
Challenges Consuming Programmable Telecoms from the Developer’s PerspectiveSebastian Schumann
 
The big shift 2011 07
The big shift 2011 07The big shift 2011 07
The big shift 2011 07Frank Bennett
 
Magazine collect
Magazine collectMagazine collect
Magazine collectYing-Wen Su
 
IBM MQ and Kafka, what is the difference?
IBM MQ and Kafka, what is the difference?IBM MQ and Kafka, what is the difference?
IBM MQ and Kafka, what is the difference?David Ware
 
Unleashing the Power of the Unbound Book
Unleashing the Power of the Unbound BookUnleashing the Power of the Unbound Book
Unleashing the Power of the Unbound BookMuruga J
 
Cloud Computing-The Challenges for Data Networks-Final Poster
Cloud Computing-The Challenges for Data Networks-Final PosterCloud Computing-The Challenges for Data Networks-Final Poster
Cloud Computing-The Challenges for Data Networks-Final PosterCharles Edwards
 
Cubeacon Smart Retail Industry with iBeacon Technology
Cubeacon Smart Retail Industry with iBeacon TechnologyCubeacon Smart Retail Industry with iBeacon Technology
Cubeacon Smart Retail Industry with iBeacon TechnologyAvianto Tiyo
 
VMblog - 2018 Containers Predictions from 16 Industry Experts
VMblog - 2018 Containers Predictions from 16 Industry ExpertsVMblog - 2018 Containers Predictions from 16 Industry Experts
VMblog - 2018 Containers Predictions from 16 Industry Expertsvmblog
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA
 
D Villegas - Associate Editor.PDF
D Villegas - Associate Editor.PDFD Villegas - Associate Editor.PDF
D Villegas - Associate Editor.PDFDean Villegas
 

Similar to A Visual Introduction to Apache Kafka.pdf (20)

Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
 
How Digital is Changing Direct Mail
How Digital is Changing Direct MailHow Digital is Changing Direct Mail
How Digital is Changing Direct Mail
 
MuleSoft Meetup Singapore #8 March 2021
MuleSoft Meetup Singapore #8 March 2021MuleSoft Meetup Singapore #8 March 2021
MuleSoft Meetup Singapore #8 March 2021
 
Open Source Telecom Software Survey 2022, Alan Quayle
Open Source Telecom Software Survey 2022, Alan QuayleOpen Source Telecom Software Survey 2022, Alan Quayle
Open Source Telecom Software Survey 2022, Alan Quayle
 
Kafka RealTime Streaming
Kafka RealTime StreamingKafka RealTime Streaming
Kafka RealTime Streaming
 
APAC-05 XMPP AccessGrid presentation
APAC-05 XMPP AccessGrid presentationAPAC-05 XMPP AccessGrid presentation
APAC-05 XMPP AccessGrid presentation
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
 
TADSummit EMEA 2019, Challenges Consuming Programmable Telecoms from the Deve...
TADSummit EMEA 2019, Challenges Consuming Programmable Telecoms from the Deve...TADSummit EMEA 2019, Challenges Consuming Programmable Telecoms from the Deve...
TADSummit EMEA 2019, Challenges Consuming Programmable Telecoms from the Deve...
 
Challenges Consuming Programmable Telecoms from the Developer’s Perspective
Challenges Consuming Programmable Telecoms from the Developer’s PerspectiveChallenges Consuming Programmable Telecoms from the Developer’s Perspective
Challenges Consuming Programmable Telecoms from the Developer’s Perspective
 
The big shift 2011 07
The big shift 2011 07The big shift 2011 07
The big shift 2011 07
 
Magazine collect
Magazine collectMagazine collect
Magazine collect
 
IBM MQ and Kafka, what is the difference?
IBM MQ and Kafka, what is the difference?IBM MQ and Kafka, what is the difference?
IBM MQ and Kafka, what is the difference?
 
Unleashing the Power of the Unbound Book
Unleashing the Power of the Unbound BookUnleashing the Power of the Unbound Book
Unleashing the Power of the Unbound Book
 
Cloud Computing-The Challenges for Data Networks-Final Poster
Cloud Computing-The Challenges for Data Networks-Final PosterCloud Computing-The Challenges for Data Networks-Final Poster
Cloud Computing-The Challenges for Data Networks-Final Poster
 
Mobile Marketing 2015
Mobile Marketing   2015Mobile Marketing   2015
Mobile Marketing 2015
 
Cubeacon Smart Retail Industry with iBeacon Technology
Cubeacon Smart Retail Industry with iBeacon TechnologyCubeacon Smart Retail Industry with iBeacon Technology
Cubeacon Smart Retail Industry with iBeacon Technology
 
A Whitepaper on Hybrid Set-Top-Box
A Whitepaper on Hybrid Set-Top-BoxA Whitepaper on Hybrid Set-Top-Box
A Whitepaper on Hybrid Set-Top-Box
 
VMblog - 2018 Containers Predictions from 16 Industry Experts
VMblog - 2018 Containers Predictions from 16 Industry ExpertsVMblog - 2018 Containers Predictions from 16 Industry Experts
VMblog - 2018 Containers Predictions from 16 Industry Experts
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 
D Villegas - Associate Editor.PDF
D Villegas - Associate Editor.PDFD Villegas - Associate Editor.PDF
D Villegas - Associate Editor.PDF
 

Recently uploaded

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 

Recently uploaded (20)

Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 

A Visual Introduction to Apache Kafka.pdf

  • 1. Paul Brebner Technology Evangelist www.instaclustr.com sales@instaclustr.com © Instaclustr Pty Limited, 2022 [https://www.instaclustr.com/ company/policies/terms-conditions/]. Except as permitted by the copyright law applicable to you, you may not reproduce, distribute, publish, display, communicate or transmit any of the content of this document, in any form, but any means, without the prior written permission of Instaclustr Pty Limited.
  • 2. In this Visual Introduction to Kafka, we’re going to build a Postal Service We’ll learn about Kafka Producers, Consumers, Topics, Partitions, Keys, Records, Delivery Semantics (Guaranteed delivery, and who gets what messages), Consumer Groups, Kafka Connect and Streams! ©Instaclustr Pty Limited 2019, 2021, 2022
  • 3. Kafka is a distributed streams processing system, it allows distributed producers to send messages to distributed consumers via a Kafka cluster. What is ©Instaclustr Pty Limited 2019, 2021, 2022 Kafka?
  • 4. Kafka has lots of benefits: It’s Fast: It has high throughput and low latency It’s Scalable: It’s horizontally scalable, to scale just add nodes and partitions It’s Reliable: It’s distributed and fault tolerant It has Zero Data Loss: Messages are persisted to disk with an immutable log It’s Open Source: An Apache project And it’s available as an Instaclustr Managed Service: On multiple cloud platforms Managed Service Fast Scalable Reliable Durable Open Source ©Instaclustr Pty Limited 2019, 2021, 2022
  • 5. But the usual Kafka diagram (right) is a bit monochrome and boring. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 6. This visual introduction will be more colourful and it’s going to be an extended story… ©Instaclustr Pty Limited 2019, 2021, 2022
  • 7. Let’s build a modern day fully electronic postal service T o send messages from A to B Postal Service A B ©Instaclustr Pty Limited 2019, 2021, 2022
  • 8. T o B, the consumer, the recipient of the message. A is a producer, it sends a message… First, we need an “A”. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 9. Due to the decline in “snail mail” volumes, direct deliveries have been canceled. CANCELED Actually, not. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 10. Consumers poll for messages by visiting the counter at the post office. Poste Restante is not a post office in a restaurant, it’s called general delivery (in the US). The mail is delivered to a post office, and they hold it for you until you call for it. Instead we have “Poste Restante” Image: La Poste Restante, Francois-Auguste Biard (Wikimedia) ©Instaclustr Pty Limited 2019, 2021, 2022
  • 11. Disconnected delivery—consumer doesn’t need to be available to receive messages There’s less effort for the messaging service— only has to deliver to a few locations not many consumer addresses And it can scale better and handle more complex delivery semantics! Postal Service Kafka topics act like a Post Office. What are the benefits? ©Instaclustr Pty Limited 2019, 2021, 2022
  • 12. Kafka Topics have 1 or more Partitions. Partitions function like multiple counters and enable high concurrency. A single counter introduces delays and limits concurrency. More counters increases concurrency and reduces delays. First lets see how it scales. What if there are many consumers for a topic? ©Instaclustr Pty Limited 2019, 2021, 2022
  • 13. Santa North Pole Let’s see what a message looks like. In Kafka a message is called a Record and is a bit like a letter. The topic is the destination, The North Pole. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 14. Santa North Pole Time semantics are flexible, either the time of event creation, ingestion, or processing. timestamp, offset, partition T opic The “Postmark” includes a timestamp, offset in the topic, and the partition it was sent to. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 15. Santa North Pole We want this letter sent to Santa not just a random Elf. timestamp, offset, partition T opic Key Partition (optional) There’s also a thing called a Key, which is optional. It refines the destination so it’s a bit like the rest of the address. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 16. Santa North Pole And the value is the contents (just a byte array). Kafka Producers and consumers need to have a shared serializer and de-serializer for both the key and value. timestamp, offset, partition T opic Key Partition (optional) Value (Content) ©Instaclustr Pty Limited 2019, 2021, 2022
  • 17. Kafka doesn’t look inside the value, but the Producer and Consumer do, and the Consumer can try and make sense of the message (Can you?!) Image: Dear Santa by Zack Poitras / http://theinclusive.net/article.php?id=268 ©Instaclustr Pty Limited 2019, 2021, 2022
  • 18. let’s look at delivery semantics For example, do we care if the message actually arrives or not? Next ©Instaclustr Pty Limited 2019, 2021, 2022
  • 19. Last century, homing pigeons were prone to getting lost or eaten by predators, so the same message was sent with several pigeons. Yes we do! Guaranteed message delivery is desirable. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 20. How does Kafka guarantee delivery? The message is always persisted to disk. This makes it resilient to power failure A Message (M1) is written to a broker (2). Producer M1 M1 Broker 1 Broker 2 Broker 3 ©Instaclustr Pty Limited 2019, 2021, 2022
  • 21. Producer Broker 1 Broker Broker 3 M1 M1 M1 The message is also replicated on multiple brokers, 3 is typical. 2 ©Instaclustr Pty Limited 2019, 2021, 2022
  • 22. Producer M1 M1 M1 And makes it resilient to loss of some servers (all but one). Broker 1 ©Instaclustr Pty Limited 2019, 2021, 2022
  • 23. Finally the producer gets acknowledgement once the message is persisted and replicated (configurable for number, and sync or async). Producer M1 Broker 1 Broker 2 Broker 3 M1 M1 Acknowledgement This also increases the read concurrency as partitions are spread over multiple brokers. The message is now available from more than one broker in case some fail. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 24. let’s look at another aspect of delivery semantics Who gets the messages and how many times are messages delivered? Now ©Instaclustr Pty Limited 2019, 2021, 2022
  • 25. Producer Consumer Consumer Consumer Consumer ? Kafka is “pub-sub”. It’s loosely coupled, producers and consumers don’t know about each other. ©Instaclustr Pty Limited 2019, 2021, 2022
  • 26. Filtering, or which consumers get which messages, is topic based. - Producers send messages to topics. - Consumers subscribe to topics of interest, e.g. parties. - When they poll they only receive messages sent to those topics. None of these consumers will receive messages sent to the “Work” topic. Producer Consumer Consumer Consumer Consumer Topic “Parties” Topic “Work” Consumers subscribed to Topic “Parties” Consumers poll to receive messages from “Parties” Consumers not subscribed to “Work” messages ©Instaclustr Pty Limited 2019, 2021, 2022
  • 27. A few more details and we can see how this works. Kafka works like Amish Barn raising. Partitions and a consumer group share work across multiple consumers, the more partitions a topic has the more consumers it supports. Image: Paul Cyr ©2018 NorthernMainePhotos.com ©Instaclustr Pty Limited 2019, 2021, 2022
  • 28. Kafka also works like Clones. It supports delivery of the same message to multiple consumers with consumer groups. Kafka doesn’t throw messages away immediately they are delivered, so the same message can be delivered to multiple consumer groups. Image: Shutterstock.com ©Instaclustr Pty Limited 2019, 2021, 2022
  • 29. Consumers subscribed to ”parties” topic are allocated partitions. When they poll they will only get messages from their allocated partitions. Consumer Partition n Topic “Parties” Partition 1 Producer Partition 2 Consumer Group Consumer Consumer Group Consumer Consumer ©Instaclustr Pty Limited 2019, 2021, 2022
  • 30. This enables consumers in the same group to share the work around. Each consumer gets only a subset of the available messages. Partition n Topic “Parties” Partition 1 Producer Partition 2 Consumer Group Consumer Consumer Consumers share work within groups Consumer ©Instaclustr Pty Limited 2019, 2021, 2022
  • 31. Multiple groups enable message broadcasting. Messages are duplicated across groups, as each consumer group receives a copy of each message. Consumer Consumer Consumer Consumer Topic “Parties” Partition 1 Partition 2 Partition n Producer Consumer Group Consumer Group Messages are duplicated across Consumer groups ©Instaclustr Pty Limited 2019, 2021, 2022
  • 32. Which messages are delivered to which consumers? The final aspect of delivery semantics is to do with message keys. If a message has a key, then Kafka uses Partition based delivery. Messages with the same key are always sent to the same partition and therefore the same consumer. And the order (within partitions) is guaranteed. Key ©Instaclustr Pty Limited 2019, 2021, 2022
  • 33. But if the key is null, then Kafka uses round robin delivery. Each message is delivered to the next partition. Round robin delivery ©Instaclustr Pty Limited 2019, 2021, 2022
  • 34. Let’s look at a concrete example with two consumer groups: Group 1: Nerds which has multiple consumers Group 2: The Pugsters which has a single consumer, Zug Image: Shutterstock.com Bill Paul Penny Kate Millie Jenny Image: Nenad Aksic / Shutterstock.com ©Instaclustr Pty Limited 2019, 2021, 2022
  • 35. Consumer 1 (Bill) Consumer 2 (Jenny) Consumer 1 (Zug from The Pugsters) Topic “Parties” Partition 1 Partition 2 Partition n Producer Group “Nerds” Group “Pugsters” Consumers subscribe to “Parties” Each message (1, 2, etc.) is sent to the next partition, and consumers allocated to that partition will receive the message when they poll next. Looking at the case where there’s No Keyfirst Round robin No Key 1 2 etc 1 2 1 2 Consumer n ©Instaclustr Pty Limited 2019, 2021, 2022
  • 36. Here’s what actually happens. We’re not showing the producer, topics, or partitions for simplicity. You’ll have to imagine them. Bill Paul Penny Kate Millie Jenny No Key ©Instaclustr Pty Limited 2019, 2021, 2022
  • 37. Bill Penny Kate Millie Jenny Both Groups subscribe to T opic“parties” (assuming 6 partitions, each consumer in the Nerds group gets 1 partition each; Zug gets them all) 1 Paul Subscribe to “Parties” No Key ©Instaclustr Pty Limited 2019, 2021, 2022
  • 38. Bill Pau l Penny Kate Millie Jenny Producer sends record with the value “Pool party—Invitation” to “parties” topic (there’s no key) 2 Invitation No Key ©Instaclustr Pty Limited 2019, 2021, 2022 Value
  • 39. Bill Paul Penny Kate Millie Jenny Bill and Zug receive a copy of the invitation and plan to attend 3 Invitation Invitation No Key ©Instaclustr Pty Limited 2019, 2021, 2022
  • 40. Bill Pen ny Pau l Kate Millie Jenny The Producer sends another record with the value “Pool party—Canceled” 4 No Key Invitation Canceled ©Instaclustr Pty Limited 2019, 2021, 2022 Invitation
  • 41. Bill Paul Penny Kate Millie Jenny In the Nerds group, Jenny gets the message this time as it’s round robin, and Zug gets it as he’s the only consumer in his group: ▶ Jenny ignores it as she didn’t get the original invite ▶ Bill wastes his time trying to go (as he doesn’t know it’s canceled) ▶ The rest of the gang aren’t surprised at not receiving any invites and stay home to do some hacking 5 Invitation Canceled No Key Invitation Canceled ©Instaclustr Pty Limited 2019, 2021, 2022
  • 42. Zug plans something else fun instead… A jam session with his band Image: Shutterstock.com ©Instaclustr Pty Limited 2019, 2021, 2022
  • 43. Consumer 1 (Bill) Consumer 2 (Jenny) Consumer 1 (Zug) Topic “Parties” Partition 1 Partition 2 Partition n Producer Group “Nerds” Group “Pugster” Consumers subscribe to “Parties” The key is hashed to a partition, so the Message is always sent to that partition. Assume there are 3 messages, and messages 1 and 2 are hashed to the same partition. How does it work if there is a Key? 1,2 3 etc 1,2 3 1,2 3 Consumer n Hashed to partition Key ©Instaclustr Pty Limited 2019, 2021, 2022
  • 44. Bill Paul Penny Kate Millie Jenny As before Both Groups subscribe to Topic “parties” The Producer sends a record with the key equal to “Pool Party” and the value equal to “Invitation” to “parties” topic Here’s what happens with a key, assuming that the key is the “title” of the message (“Pool Party”), and the value is invitation or canceled 1 2 Key Invitation Key Value ©Instaclustr Pty Limited 2019, 2021, 2022
  • 45. Bill Paul Penny Kate Millie Jenny As before, Bill and Zug receive a copy of the invitation and plan to attend 3 Invitation Key Key Invitation Key Value ©Instaclustr Pty Limited 2019, 2021, 2022 Value
  • 46. Bill Pen n y Kate Millie Jenny The Producer sends another record with the same key but with the value “canceled” to “parties” topic 4 Key Invitation Key Value Canceled Paul Value Key ©Instaclustr Pty Limited 2019, 2021, 2022 Invitation Key Value
  • 47. Paul Penny Kate Millie Jenny This time, Bill and Zug receive the cancelation (the same consumers as the key is identical) 5 BK il el y Key Value ©Instaclustr Pty Limited 2019, 2021, 2022 Invitation Key Value Bill Canceled Key Value Key Invitation BK il el y Key Canceled Key Value Key
  • 48. Paul Penny Kate Millie The Producer sends out another invitation to a Halloween party. The key is different this time. 6 Key ©Instaclustr Pty Limited 2019, 2021, 2022 Key Jenny Bill
  • 49. Paul Penny Kate Millie Jenny Jenny receives the Halloween invitation as the key is different and the record is sent to Jenny’s partition. Zug is the only consumer in his group so he gets every record no matter what partition it’s sent to. 7 Key ©Instaclustr Pty Limited 2019, 2021, 2022 Bill Jenny’s partitionkey
  • 50. This time Zug gets dressed up and has fun at the party. Image: Shutterstock.com ©Instaclustr Pty Limited 2019, 2021, 2022
  • 51. But wait! There’s more— event reprocessing (time travel)! Kafka stores message streams on disk, so Consumers can go back and request the same messages they’ve already received, earlier messages, or ignore some messages etc. Image: Shutterstock.com ©Instaclustr Pty Limited 2019, 2021, 2022
  • 52. ©Instaclustr Pty Limited 2019, 2021, 2022 So Zug can go “back to the future”! ©Instaclustr Pty Limited 2019, 2021, 2022
  • 53. But! The postal system is global and heterogeneous ©Instaclustr Pty Limited 2019, 2021, 2022
  • 54. How can post offices be connected? ©Instaclustr Pty Limited 2019, 2021, 2022
  • 55. Underground pneumatic tubes delivered mail between postal facilities in USA cities in the 1900’s (Source: Wikimediacommons) Compressed Air? ©Instaclustr Pty Limited 2019, 2021, 2022
  • 56. Sink Source Kafka Connect enables message flows across heterogenous systems. From Sources to Sinks via a Lake (Kafka) ©Instaclustr Pty Limited 2019, 2021, 2022
  • 57. Kafka Connect Architecture: Source and Sink Connectors ©Instaclustr Pty Limited 2019, 2021, 2022
  • 58. Form Pipelines (Berlin) ©Instaclustr Pty Limited 2019, 2021, 2022 (Source: Paul Brebner)
  • 59. For Beer? ©Instaclustr Pty Limited 2019, 2021, 2022 (Source: Paul Brebner)
  • 60. Tides Topic REST call JSON result {"metadata": { "id":"8724580", "name":"Key West", "lat":"24.5508”, "lon":"-81.8081"}, "data":[{ "t":"2020-09-24 04:18", "v":"0.597"}]} Elasticsearch sink connector Tides Index Example of a Kafka Connect IoT pipeline: Tidal Data à Kafka à Elasticsearch à Kibana ©Instaclustr Pty Limited, 2021 REST source connector {"metadata": { "id":"8724580", "name":"Key West", "lat":"24.5508”, "lon":"-81.8081"}, "data":[{ "t":"2020-09-24 04:18", "v":"0.597"}]} ©Instaclustr Pty Limited 2019, 2021, 2022
  • 61. Tides Topic REST call JSON result {"metadata": { "id":"8724580", "name":"Key West", "lat":"24.5508”, "lon":"-81.8081"}, "data":[{ "t":"2020-09-24 04:18", "v":"0.597"}]} Elasticsearch sink connector Tides Index Example of a Kafka Connect IoT pipeline: Tidal Data à Kafka à Elasticsearch à Kibana ©Instaclustr Pty Limited, 2021 REST source connector {"metadata": { "id":"8724580", "name":"Key West", "lat":"24.5508”, "lon":"-81.8081"}, "data":[{ "t":"2020-09-24 04:18", "v":"0.597"}]} ©Instaclustr Pty Limited 2019, 2021, 2022 Connectors require Configuration
  • 62. Source: https://commons.wikimedia.org/wiki/File:Royal_mail_sorting.jpg What’s Still Missing? Mail Sorting ©Instaclustr Pty Limited 2019, 2021, 2022
  • 65. Kafka Streams: Topics in, Topics out, via Streams ©Instaclustr Pty Limited 2019, 2021, 2022 All Kafka APIs: Producer, Consumer, Connect, Streams
  • 66. (Source: Shutterstock) Simple Streams Topology ©Instaclustr Pty Limited 2019, 2021, 2022
  • 68. (Source: Shutterstock) Kafka Streams = Rapids!? ©Instaclustr Pty Limited 2019, 2021, 2022
  • 69. (Source: Shutterstock) Streams get complicated quickly! One way to keep dry… ©Instaclustr Pty Limited 2019, 2021, 2022
  • 70. Or, This Diagram Which Explains The Order of Streams DSL Operations (Source: https://kafka.apache.org/20/documentation/streams/developer-guide/dsl-api.html) ©Instaclustr Pty Limited 2019, 2021, 2022
  • 71. Dr Black has been murdered in the Billiard Room with a Candlestick! Whodunnit?! [KSTREAM-FILTER-0000000024]: Conservatory: Professor Plum has no alibi [KSTREAM-FILTER-0000000024]: Library: Colonel Mustard has no alibi [KSTREAM-FILTER-0000000024]: Billiard Room: Mrs White has no alibi Cluedo Kafka Streams Example Tracks who’s in what rooms and when, and emits list of suspects without an alibi ©Instaclustr Pty Limited 2019, 2021, 2022
  • 72. Topology of Cluedo Streams Example This tool is very useful for visualizing and debugging streams https://zz85.github.io/kafka-streams-viz/ ©Instaclustr Pty Limited 2019, 2021, 2022
  • 73. Some Kafka Use Cases ©Instaclustr Pty Limited 2019, 2021, 2022
  • 74. Example 1 - Kafka ”Kongo” Logistics IoT Application – Goods, Warehouses, Trucks, Sensors and Rules ©Instaclustr Pty Limited 2019, 2021, 2022
  • 75. Detect transportation and storage violations in real-time ©Instaclustr Pty Limited 2019, 2021, 2022
  • 76. And Kafka Streams to prevent Truck Overloading ©Instaclustr Pty Limited 2019, 2021, 2022 (Source: Shutterstock)
  • 77. Example 2 - One of these things is not like the others ©Instaclustr Pty Limited 2019, 2021, 2022 (Source: Shutterstock)
  • 78. Massively Scalable Anomaly Detection with Kafka and Cassandra ©Instaclustr Pty Limited 2019, 2021, 2022
  • 79. 19 Billion Checks/day with 470 CPU Cores ©Instaclustr Pty Limited 2019, 2021, 2022 0 2 4 6 8 10 12 14 16 18 20 0 50 100 150 200 250 300 350 400 450 500 Billion checks/day Total CPU Cores Anomaly checks/day (billion) 19 Billion
  • 80. Example 3 - Which Came First? State or Events? ©Instaclustr Pty Limited 2019, 2021, 2022 State (Source: Shutterstock) State Events
  • 81. Change Data Capture (CDC) with Debezium and Kafka Connect State to Events and back to State again ©Instaclustr Pty Limited 2019, 2021, 2022 State Events State
  • 82. Apache Kafka: https://kafka.apache.org/ Gently down the Stream: www.gentlydownthe.stream That’s it for this short visual introduction to Apache Kafka. For more information please have a look at the Apache Kafka docs, the Instaclustr Blogs, and check out our free Kafka trial. “Gently down the Stream” - another “Visual” introduction to Kafka, with Otters! ©Instaclustr Pty Limited 2019, 2021, 2022
  • 83. All of my blogs (Cassandra, Kafka, MirrorMaker, Spark, Zookeeper, OpenSearch, Redis, PostgreSQL, Debezium, Cadence, etc) www.instaclustr.com/paul-brebner/ Kafka Streams Cluedo Example (part of “Kongo” Kafka intro series) www.instaclustr.com/blog/kongo-5-3-apache-kafka-streams-examples/ Kafka Connect Pipeline Series (Tides data processing) www.instaclustr.com/blog/kafka-connect-pipelines-conclusion-pipeline-series-part-10/ Kafka Xmas Tree Lights Simulation (my 1st Kafka program) www.instaclustr.com/blog/seasons-greetings-instaclustr-kafka-christmas-tree-light-simulation/ Instaclustr’s Managed Kafka (Free Trial) www.instaclustr.com/platform/managed-apache-kafka/ Instaclustr Blogs © Instaclustr Pty Limited 2019, 2021, 2022 [https://www.instaclustr.com/company/policies/terms-conditions/]. Except as permitted by the copyright law applicable to you, you may not reproduce, distribute, publish, display, communicate or transmit any of the content of this document, in any form, but any means, without the prior written permission of Instaclustr Pty Limited.