What is Apache Kafka? Why is it so popular? Should I use it?
1. BASEL | BERN | BRUGG | BUCHAREST | COPENHAGEN | DÜSSELDORF | FRANKFURT A.M. | FREIBURG I.BR.
GENEVA | HAMBURG | LAUSANNE | MANNHEIM | MUNICH | STUTTGART | VIENNA | ZURICH
http://guidoschmutz.wordpress.com @gschmutz
Guido Schmutz
Trivadis Speed Session 2019
2. Guido Schmutz
• Working at Trivadis for more than 22 years
• Consultant, Trainer, Platform Architect for Java, Oracle, SOA and Big Data / Fast Data
• Oracle Groundbreaker Ambassador & Oracle ACE Director
@gschmutz guidoschmutz.wordpress.com
174th edition
3. Event Hub
Kafka Message Broker – Key properties
• Publish / Subscribe Messaging – a message can be consumed by 0 – n consumers
• horizontally scalable – throughput increases with more nodes
• highly available – no single point of failure (SPOF)
• durable – messages are persisted and replicated, so they are not lost
• Schema-less – the Kafka broker has no knowledge of message content and format
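The publish / subscribe and schema-less properties can be illustrated with a toy in-memory log. This is only a sketch under simplifying assumptions, not Kafka's client API: `ToyTopic`, `publish` and `poll` are hypothetical names, and there is no network, replication or disk persistence involved.

```python
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for one topic: an append-only list of opaque byte payloads."""

    def __init__(self):
        self._log = []                     # messages stay in the log after being read
        self._offsets = defaultdict(int)   # next position to read, per subscriber

    def publish(self, payload: bytes) -> int:
        """Append a message and return its position (offset) in the log."""
        self._log.append(payload)
        return len(self._log) - 1

    def poll(self, subscriber: str, max_messages: int = 10) -> list:
        """Return unread messages for this subscriber and advance only its offset."""
        start = self._offsets[subscriber]
        batch = self._log[start:start + max_messages]
        self._offsets[subscriber] = start + len(batch)
        return batch


topic = ToyTopic()
# The log never inspects the payload, so any byte format is accepted.
topic.publish(b'{"vehicle": 42, "speed": 87}')
topic.publish(b"\x00\x01binary-payload")

billing = topic.poll("billing")      # first subscriber reads both messages
analytics = topic.poll("analytics")  # second subscriber independently reads the same two
again = topic.poll("billing")        # nothing new for the first subscriber
```

Because reading does not remove anything from the log, the same message can be consumed by zero, one or many subscribers, each at its own pace.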
6. Event Hub
[Diagram: Streaming Data Sources (Vehicle, Weather), Stream Data Integration, Event Hub, Stream Analytics]
Stream Analytics
• Stream-to-Stream Joins
• Stream-to-Table Joins
• Time Windowed State Management
• Event Pattern Detection
• Machine Learning Model Execution (Inference)
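Two of the Stream Analytics capabilities listed above, a stream-to-table join and time-windowed state, can be sketched in plain Python. This is a minimal illustration and not a stream processing framework: the `drivers` table, the sample `events`, the 60-second window and both function names are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical reference table: vehicle id -> driver name
drivers = {"v1": "Anna", "v2": "Ben"}

# Hypothetical event stream: (timestamp in seconds, vehicle id, speed)
events = [
    (0, "v1", 80),
    (12, "v2", 95),
    (47, "v1", 110),
    (61, "v1", 70),
    (75, "v3", 60),   # no matching row in the table
]

WINDOW_SECONDS = 60


def enrich(stream, table):
    """Stream-to-table join: look up each event's key in the table as it arrives."""
    for ts, vehicle, speed in stream:
        # table.get returns None when there is no match (left-join behaviour)
        yield ts, vehicle, speed, table.get(vehicle)


def count_per_window(stream, window_seconds):
    """Tumbling window: keep one counter per (window start, key)."""
    state = defaultdict(int)
    for ts, vehicle, *_ in stream:
        window_start = ts - ts % window_seconds
        state[(window_start, vehicle)] += 1
    return dict(state)


enriched = list(enrich(events, drivers))
counts = count_per_window(events, WINDOW_SECONDS)
```

The `state` dictionary is the "state management" part: a real engine would also have to persist it and decide when a window is complete, which this sketch leaves out.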
7. Event Hub
[Diagram: architecture of the previous slide, plus Batch Data Integration, Data Lake / DWH, Batch and Visualize]
Data Lake Ingestion
• Machine Learning
• Graph Algorithms
• Natural Language Processing
8. Event Hub
[Diagram: architecture of the previous slide, plus Legacy App, Machine, IIoT and a Batch Data Sources group]
10. Event Hub
[Diagram: architecture of the previous slide, plus a Stream Data Integration path using CDC (change data capture)]
(Right-Time) Legacy Integration
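The idea behind CDC is to turn changes in a legacy database into a stream of events. Production CDC tools often read the database's transaction log; the sketch below swaps in a much simpler technique, comparing two snapshots of a table, purely to show what the resulting change events can look like. The function name, the event layout and the sample rows are all hypothetical.

```python
def diff_snapshots(before: dict, after: dict) -> list:
    """Compare two snapshots (primary key -> row) and return change events."""
    changes = []
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old is None:
            changes.append({"op": "insert", "key": key, "after": new})
        elif new is None:
            changes.append({"op": "delete", "key": key, "before": old})
        elif old != new:
            changes.append({"op": "update", "key": key, "before": old, "after": new})
    return changes


# Hypothetical customer table at two points in time
snapshot_1 = {
    1: {"name": "Anna", "city": "Bern"},
    2: {"name": "Ben", "city": "Basel"},
}
snapshot_2 = {
    1: {"name": "Anna", "city": "Zurich"},   # row 1 changed
    3: {"name": "Cleo", "city": "Geneva"},   # row 3 is new, row 2 is gone
}

change_events = diff_snapshots(snapshot_1, snapshot_2)
```

Each change event could then be published to the event hub, so that downstream consumers see legacy data shortly after it changes instead of waiting for a batch load.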
12. Event Hub
[Diagram: architecture of the previous slide, plus a second Stream Data Integration path using CDC]
(Right-Time) Legacy Integration
14. Event Hub
[Diagram: architecture of the previous slide, plus a further Stream Data Integration path and Streaming Visualize]
Streaming Visualization
15. Event Hub
[Diagram: architecture of the previous slide, plus a further Stream Data Integration path feeding NoSQL and NewSQL stores]
Result Store Integration
16. Event Hub
[Diagram: architecture of the previous slide, plus a Microservice]
Highly Decoupled Modern Apps
17. Event Hub
[Diagram: architecture of the previous slide, plus a second Microservice]
Highly Decoupled Modern Apps
18. Event Hub
[Diagram: architecture of the previous slide, plus a Gateway]
Data Source talks to Kafka through MQTT
19. Event Hub
[Diagram: same architecture as on the previous slide]
Kafka becomes the central nervous system for data
20. Apache Kafka
[Diagram: Kafka cluster and surrounding components]
• Kafka Cluster: Broker 1, Broker 2, Broker 3
• Zookeeper Ensemble: ZK 1, ZK 2, ZK 3
• Producers: Producer 1, Producer 2, Producer 3
• Consumers: Consumer 1, Consumer 2, Consumer 3
• Schema Registry: Service 1
• Management: Control Center, Kafka Manager, KAdmin
• kafkacat
Data Retention:
• Never delete (retain indefinitely)
• Time (TTL) or Size-based
• Log-Compacted based
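The retention options can be compared on a small in-memory log. This is a simplified sketch, not Kafka's implementation: records are hypothetical `(timestamp, key, value)` tuples, size is counted in records rather than bytes, and the three function names are made up for the example.

```python
def retain_by_time(log, now, max_age):
    """Time-based (TTL): drop records older than max_age."""
    return [rec for rec in log if now - rec[0] <= max_age]


def retain_by_size(log, max_records):
    """Size-based: keep only the newest max_records records."""
    return log[-max_records:] if max_records > 0 else []


def compact(log):
    """Log compaction: keep only the latest record for each key."""
    latest = {}
    for position, (_, key, _) in enumerate(log):
        latest[key] = position   # a later position overwrites an earlier one
    return [rec for position, rec in enumerate(log) if latest[rec[1]] == position]


# Hypothetical log of vehicle status updates: (timestamp, key, value)
log = [
    (0, "v1", "parked"),
    (10, "v2", "driving"),
    (20, "v1", "driving"),
    (30, "v1", "parked"),
]

by_time = retain_by_time(log, now=35, max_age=20)
by_size = retain_by_size(log, max_records=3)
compacted = compact(log)
```

Time- and size-based retention discard the oldest data regardless of key, while compaction keeps the most recent value per key, which is useful when a topic represents the current state of a table.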
21. Apache Kafka
• No SPoF, highly available
• Consumer polls for new messages
• horizontally scalable, guaranteed order (within a partition)
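How scalability and ordering fit together can be sketched with a toy partitioned topic: records with the same key always land in the same partition, so their relative order is preserved, while different partitions can be spread over different brokers. The sketch assumes three partitions and uses CRC32 as a stand-in hash; it is not the hash Kafka's clients use, and `partition_for`, `produce` and `poll` are hypothetical helpers.

```python
import zlib

NUM_PARTITIONS = 3


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable key -> partition mapping (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# One append-only list per partition
partitions = [[] for _ in range(NUM_PARTITIONS)]


def produce(key: str, value) -> None:
    """Append the record to the partition chosen by its key."""
    partitions[partition_for(key)].append((key, value))


def poll(partition: int, offset: int, max_records: int = 2):
    """Consumer side: pull the next batch from one partition, starting at offset."""
    batch = partitions[partition][offset:offset + max_records]
    return batch, offset + len(batch)


# Interleave updates from four hypothetical vehicles
for step in range(3):
    for vehicle in ("v1", "v2", "v3", "v4"):
        produce(vehicle, step)

# A consumer polls the partition that holds "v1" and tracks its own offset
v1_partition = partition_for("v1")
first_batch, next_offset = poll(v1_partition, 0)
second_batch, next_offset = poll(v1_partition, next_offset)
```

There is no ordering across partitions in this model: two records with different keys may be read in either order if they sit in different partitions.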