Wide Spectrum of Messaging Offerings
Ultra-low Latency (often no broker in the middle)
High Volume (Persistent or Non-Persistent)
Highly Available (Clustered and Fault Tolerant)
Embedded Messaging (inside apps)
Cross Datacenter / Organizational / B2B
Enterprise Message Bus
Web / IoT Messaging
“publish-subscribe messaging rethought as a distributed commit log”
Kafka is a Mashup
A mashup of well-proven concepts into something even greater and easier to use:
EAI + ETL
Messaging Middleware + Big Data
Batch + Real-time
Data Movement + Data Processing
Log Data Streams + Structured Database Tables
+ Distributed clustered storage
Kafka is a blend of messaging, stream processing, ETL and
modern database designs built around a distributed log
+ Streaming platform
Kafka is much more than messaging
+ Exactly Once
+ Designed for the Cloud
+ Inter DC replication
+ Schema evolution
What’s different about Kafka? Topics are also Queues
Consumers can share one copy of the data
• Independent consumers share the same log
• Inter-dependent consumers share the same log
• No need for Topic/Queue bridging or multiple
copies of the data
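
A minimal sketch of the "topics are also queues" idea, using a toy in-memory log rather than the Kafka client API. The `SharedLog` class, the group names, and the order values are all made up for illustration. Each consumer group keeps its own read position, so independent groups each see every record, while workers inside one group share a position and split the records between them:

```python
from collections import defaultdict


class SharedLog:
    """A toy, in-memory stand-in for one topic partition (not the Kafka API)."""

    def __init__(self):
        self.records = []                  # append-only; reads never remove data
        self.offsets = defaultdict(int)    # next offset to read, per group

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1       # offset of the new record

    def poll(self, group):
        """Return the next unread record for this group, or None if caught up."""
        offset = self.offsets[group]
        if offset >= len(self.records):
            return None
        self.offsets[group] = offset + 1
        return self.records[offset]


log = SharedLog()
for order in ["order-1", "order-2", "order-3"]:
    log.append(order)

# Two independent groups each see every record (topic semantics).
billing = [log.poll("billing") for _ in range(3)]
audit = [log.poll("audit") for _ in range(3)]

# Two workers in one group share a single position, so they split the
# work between them (queue semantics).
worker_a = log.poll("shipping")
worker_b = log.poll("shipping")
```

This single-partition model is a simplification: in Kafka itself, the members of a consumer group divide a topic's partitions among themselves rather than sharing one cursor. The point that carries over is that there is one copy of the data and reading never removes it.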
Message processing is greatly simplified
- There is no “head” of the queue
- Writes are sequential, distributed, and
What’s different about Kafka? Messages are not deleted when
Messages in the commit logs are persistent and immutable
Slow Consumers are (very) decoupled from Fast Producers
Batch and real-time are unified
Message Replay, Replication, and Auditing are built-in (for free)
All production messaging deployments need some form of these
Message Retention is not a waste of disk space
You need to size for offline/disconnected consumers anyway
Distributed State can always be recreated from a common commit log
Makes distributed HA apps much easier to build
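
A hedged sketch of why a retained, immutable log makes state recoverable. The account events and the function names `apply_event` and `rebuild` are hypothetical. Replaying the log from the start, or from a saved snapshot plus the remaining offsets, produces the same state, provided each event is applied deterministically:

```python
def apply_event(state, event):
    """Fold one event into the state; must be deterministic for replay to work."""
    account, delta = event
    state[account] = state.get(account, 0) + delta
    return state


def rebuild(log, from_offset=0, state=None):
    """Recreate state by replaying the log from a given offset."""
    state = dict(state or {})              # copy, so the snapshot is not mutated
    for event in log[from_offset:]:
        apply_event(state, event)
    return state


# Hypothetical account events; a tuple models a log that is never rewritten.
log = (("alice", 100), ("bob", 50), ("alice", -30), ("bob", 25))

live_state = rebuild(log)                  # a service that read everything
snapshot = rebuild(log[:2])                # state saved after offset 1
recovered = rebuild(log, from_offset=2, state=snapshot)   # restart after a crash
```

A slow or disconnected consumer is just a reader with a smaller offset; it catches up with the same replay loop that a fresh instance uses.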
What’s different about Kafka? Topic Partitions and Keyed Messages
- Topics/Queues are not the smallest unit of
- Topic partitions are distributed across
brokers for parallel in-order consumption
- This is very different from a cluster of
traditional message brokers
- [graphic of topic partitions with parallel
Producers, Brokers, and Consumers]
- Sometimes you can just use more keys
instead of more topics
- E.g., don’t create a new topic for every user
or IoT device; give each one a unique key instead
- This is proven to scale to many millions of
connected users, cars and IoT devices
- [graphic to show Keyed messages get
distributed across topic partitions]
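
A small sketch of how keyed messages can spread across partitions while staying ordered per key. The device ids and readings are invented, and `zlib.crc32` is only a stand-in hash chosen for illustration; it is not the hash the Kafka client uses. What matters is that the same key always maps to the same partition:

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    """Map a key to a partition. CRC32 is a stand-in for the client's real hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Group (key, value) messages by partition, preserving arrival order."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


# Hypothetical device readings on ONE topic, keyed by device id.
messages = [
    ("car-17", "speed=50"),
    ("car-42", "speed=30"),
    ("car-17", "speed=55"),
    ("car-42", "speed=35"),
    ("car-17", "speed=60"),
]
partitions = route(messages, num_partitions=4)

# All readings for one device sit in one partition, in the order produced.
car_17 = [v for k, v in partitions[partition_for("car-17", 4)] if k == "car-17"]
```

Adding a new device only adds a new key; the topic and its partition count stay the same. Note that changing the partition count changes the key-to-partition mapping under this modulo scheme.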
From an event stream / transaction log we can derive all of the following
database centric features:
- Secondary Indexing
- Materialized Views
What’s different about Kafka? Duality of streams and databases
The duality of message streams and database tables is a key design point
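
The stream/table duality can be sketched in a few lines. The order events and the helper names `materialize`, `index_by`, and `changelog` are assumptions made for this example, not part of any Kafka API. Folding a keyed event stream gives a table (a materialized view), a second pass gives a secondary index, and dumping the table gives a stream that rebuilds it:

```python
from collections import defaultdict


def materialize(events):
    """Stream -> table: keep the latest value per key; None acts as a delete."""
    table = {}
    for key, value in events:
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
    return table


def index_by(table, field):
    """Derive a secondary index: field value -> sorted list of primary keys."""
    index = defaultdict(list)
    for key in sorted(table):
        index[table[key][field]].append(key)
    return dict(index)


def changelog(table):
    """Table -> stream: emit one event per row, enough to rebuild the table."""
    return [(key, table[key]) for key in sorted(table)]


# Hypothetical order events keyed by order id.
events = [
    ("o1", {"status": "new", "user": "ann"}),
    ("o2", {"status": "new", "user": "bob"}),
    ("o1", {"status": "paid", "user": "ann"}),
    ("o3", {"status": "new", "user": "ann"}),
    ("o2", None),                          # order o2 deleted
]

orders = materialize(events)               # materialized view of current orders
by_user = index_by(orders, "user")         # secondary index on "user"
by_status = index_by(orders, "status")     # secondary index on "status"
```

Because the view and both indexes are pure functions of the event list, any of them can be dropped and recomputed from the log at any time.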
(Good) Microservices avoid shared mutable state
Shared, mutable state
Old World: REST-Based Microservices Interconnect
[graphic of UI, Order, Pay, Fulfilment, and Stock services]
Each microservice has to maintain its own state
in its own database
1. Difficult to enforce the same REST API standards
across many languages and microservices.
2. REST APIs Inherently Slow: Limited to Thousands
3. Inter Service Dependencies are Messy.
4. Each Service Needs to Maintain State.
5. Difficult to enforce consistent security standards.
6. Logging is distributed between services.
7. Version compatibility between services is difficult.
Streaming Microservices with Kafka
Database Sources Now Centralized on the Kafka Bus for all microservices
1. Service inter-communication standard enforced by Kafka Schema Registry.
2. Millions of messages per second on cheap hardware.
3. No Inter-Service Dependency: just depend on Kafka.
4. Each service can be stateless: Kafka maintains state.
5. Security can be enforced by ACLs from Kafka.
6. Logs can be aggregated into Kafka.
7. Version compatibility can be enforced by Schema Registry.
8. Kafka is inherently HA and horizontally scalable: still no central point of failure.
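
To make the version-compatibility points concrete, here is a deliberately simplified sketch of a backward-compatibility check between two schema versions. The function `is_backward_compatible`, the dict-based schema format, and the `order` fields are assumptions for illustration only. This is not the Schema Registry API, and a real registry applies richer rules (for example type promotions and several compatibility modes):

```python
def is_backward_compatible(old_fields, new_fields):
    """Can a reader using new_fields decode records written with old_fields?

    Each schema is a dict: field name -> {"type": ..., optional "default": ...}.
    Simplified rule: shared fields keep their type, added fields need a default.
    """
    for name, spec in new_fields.items():
        if name in old_fields:
            if old_fields[name]["type"] != spec["type"]:
                return False               # type changed under the reader
        elif "default" not in spec:
            return False                   # old records cannot fill this field
    return True


# Hypothetical versions of an "order" message schema.
v1 = {"id": {"type": "string"}, "amount": {"type": "double"}}
v2 = {**v1, "currency": {"type": "string", "default": "USD"}}   # adds a default
v3 = {**v1, "currency": {"type": "string"}}                     # no default

accepted = is_backward_compatible(v1, v2)
rejected = is_backward_compatible(v1, v3)
```

Running a check like this at publish time, in one central place, is what lets producers and consumers upgrade independently without coordinating releases.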
What’s different about Kafka? Ecosystem and Adoption
The Kafka ecosystem is flourishing and developer adoption continues to grow
• Confluent Platform additions (REST Proxy, Schema Registry, KSQL etc.)
• Third-Party Connectors (Confluent Hub)
• Open Source contributions from individuals, corporations, vendors, consulting organizations
• Inside and outside of Big Data/Stream Processing
Adoption of Event Streaming
60% of Fortune 100 Companies
Using Apache Kafka
Event Streaming at the Heart of the Enterprise