Introduction to Apache Kafka

Traditional Messaging
● Java Message Service (JMS)
○ Apache ActiveMQ, IBM WebSphere MQ, HornetQ, FioranoMQ
● Advanced Message Queuing Protocol (AMQP)
○ RabbitMQ
● Message Queuing Telemetry Transport (MQTT)
○ RabbitMQ
○ HiveMQ
A very famous question
https://www.quora.com/What-are-the-differences-between-Apache-Kafka-and-RabbitMQ

“Performance-wise, both are excellent performers, but have major architectural differences.”
-- from the Quora discussion

What’s the Diff?
● Traditional broker: the server pushes (delivers) messages to the Subscriber
○ The server does a lot of work in memory
● Kafka: the Subscriber pulls (picks up) messages from the server
○ Not much in-memory work for the server; it just stores messages

What’s the Diff?
○ Traditional broker: stores each message and its state (delivered, etc.)
○ Kafka: just stores the message; it does not care whether it was picked up or not

What’s the Diff?
○ Traditional broker: the server maintains the order of messages
○ Kafka: ordering logic is dictated by the client and the storage format

What’s the Diff?
● Traditional broker: hence mostly an ‘online’ processing model
● Kafka: hence mostly an ‘offline’ processing model

What’s the Diff?
● Traditional broker: the server can do complex routing logic
● Kafka: the client maintains the routing logic; the server is blind to it
○ The Subscriber also stores state, i.e. which messages it has picked up
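
The push versus pull contrast in these slides can be sketched in a few lines of Python. This is a toy in-memory model, not how RabbitMQ or Kafka are implemented; the class names PushBroker, LogBroker and PullSubscriber are hypothetical and exist only for illustration.

```python
from collections import deque


class PushBroker:
    """Traditional style: the server tracks pending messages and pushes them."""

    def __init__(self):
        self.pending = deque()   # messages the server still has to deliver
        self.subscribers = []    # callbacks registered by subscribers

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, msg):
        self.pending.append(msg)
        self._deliver()

    def _deliver(self):
        # The server does the work: deliver each message, then forget it.
        while self.pending and self.subscribers:
            msg = self.pending.popleft()
            for callback in self.subscribers:
                callback(msg)


class LogBroker:
    """Log style: the server only appends and serves reads by offset."""

    def __init__(self):
        self.log = []

    def publish(self, msg):
        self.log.append(msg)

    def read(self, offset):
        return self.log[offset] if 0 <= offset < len(self.log) else None


class PullSubscriber:
    """The subscriber, not the server, remembers what it has picked up."""

    def __init__(self, broker):
        self.broker = broker
        self.next_offset = 0

    def poll(self):
        msg = self.broker.read(self.next_offset)
        if msg is not None:
            self.next_offset += 1
        return msg
```

Note that LogBroker keeps every message after it is read, while PushBroker drops a message as soon as it is delivered. That difference is what the later slides on offsets build on.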

Apache Kafka
Notions:
● Publisher
● Message
● Topic
○ Topic Partition
● Broker
● Subscriber/Consumer
● Message Offset

Summary
● Publisher chooses a topic to publish onto.
○ It also decides the routing logic, i.e. chooses which partition to publish onto (using a partitioning key).
● Broker receives the message and appends it to the end of the topic partition.
● Subscriber requests the message at a specific offset in a topic partition from the broker.
● It is up to the Subscriber to remember which message offset it has processed.
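
The flow in this summary can be modelled with a small in-memory sketch. Everything here is an assumption for illustration: the Broker, Publisher and Subscriber classes are hypothetical, and the CRC32 hash is only a stand-in for whatever partitioner a real Kafka client uses.

```python
import zlib


class Broker:
    """Stores messages only; knows nothing about routing or who has read what."""

    def __init__(self):
        self.topics = {}  # topic name -> list of partitions (each an append-only list)

    def create_topic(self, name, num_partitions):
        self.topics[name] = [[] for _ in range(num_partitions)]

    def append(self, topic, partition, message):
        log = self.topics[topic][partition]
        log.append(message)
        return len(log) - 1  # offset of the message just appended

    def fetch(self, topic, partition, offset):
        log = self.topics[topic][partition]
        return log[offset] if 0 <= offset < len(log) else None


class Publisher:
    """Chooses the topic and, via the partitioning key, the partition."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, key, message):
        num_partitions = len(self.broker.topics[topic])
        # Assumption: CRC32 of the key modulo the partition count.
        partition = zlib.crc32(key.encode("utf-8")) % num_partitions
        offset = self.broker.append(topic, partition, message)
        return partition, offset


class Subscriber:
    """Asks for a message at an offset and remembers its own position."""

    def __init__(self, broker, topic, partition):
        self.broker = broker
        self.topic = topic
        self.partition = partition
        self.offset = 0  # next offset to request

    def poll(self):
        message = self.broker.fetch(self.topic, self.partition, self.offset)
        if message is not None:
            self.offset += 1
        return message
```

A short usage example: after `broker.create_topic("orders", 3)`, two calls to `Publisher(broker).send("orders", "user-1", ...)` land in the same partition at offsets 0 and 1, and a Subscriber on that partition reads them back in that order.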

A lovely use case - REPLAY
● Since the Subscriber requests a message at an offset in a topic partition, it is free to REPLAY the processing at any point in time.
● Handy when outages occur.
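
A minimal sketch of replay after an outage, assuming a partition is modelled as a plain Python list and the subscriber keeps its own checkpoint. The function name process_from and the handler are hypothetical, not part of any Kafka client.

```python
def process_from(log, start_offset, handler):
    """Process log[start_offset:] in order and return the next offset to read.

    If the handler raises, stop and return the offset of the failed message,
    so a later call can replay from exactly that point.
    """
    offset = start_offset
    while offset < len(log):
        try:
            handler(log[offset])
        except Exception:
            return offset
        offset += 1
    return offset


# Usage: a simulated downstream outage while handling "m2".
log = ["m0", "m1", "m2", "m3"]
processed = []
outage = {"active": True}


def handler(msg):
    if msg == "m2" and outage["active"]:
        raise RuntimeError("downstream outage")
    processed.append(msg)


checkpoint = process_from(log, 0, handler)           # stops at the failed message
outage["active"] = False
checkpoint = process_from(log, checkpoint, handler)  # replays from the checkpoint
```

Because the log itself is never modified by reading, passing 0 as the start offset would reprocess everything from the beginning.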

Things to Ponder about
● How do I achieve high read/write throughput?
○ Have more partitions per topic; the partition count determines read/write throughput.
● Can multiple publishers publish concurrently to the same topic partition?
○ Yes
● Should multiple Consumers read from the same topic partition?
○ Ideally one Consumer per partition, or one Consumer group per partition.
● What about replication of data?
○ While creating a topic, you can set a replication factor, which applies to each topic partition.
● What about the data retention time policy?
○ Set it while creating the topic. You can edit it later on.
● Think about the producer partitioning key ...
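
Two of these points, the partitioning key and the mapping of partitions to consumers, can be explored with a small sketch. The helper names are hypothetical, the CRC32 hash is an assumed stand-in for a real partitioner, and the round-robin assignment is just one simple strategy chosen for illustration.

```python
import zlib
from collections import Counter


def partition_for(key, num_partitions):
    """Map a partitioning key to a partition (assumed hash: CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_load(keys, num_partitions):
    """Count how many messages each partition would receive for these keys."""
    return Counter(partition_for(k, num_partitions) for k in keys)


def assign_partitions(num_partitions, consumers):
    """Round-robin assignment: each partition is read by exactly one consumer."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment
```

For example, `assign_partitions(4, ["c1", "c2"])` gives each consumer two partitions, while `assign_partitions(2, ["c1", "c2", "c3"])` leaves the third consumer with nothing to read. Similarly, `partition_load(["hot"] * 10, 4)` puts all ten messages on a single partition, which is why a skewed key can cancel out the throughput gained from adding partitions.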

Others ...
● Amazon Kinesis is similar to Kafka.
● There is also Redis Pub/Sub (different guarantees, not similar to Kafka).

What I did not cover :)
● Kafka replication mechanism
○ ISR = in-sync replica set
● Tools like Kafka MirrorMaker
● ZooKeeper interaction (yes, Kafka depends on ZooKeeper)

What’s new in Kafka?
● Kafka Streams API
● Kafka SQL (KSQL)
● See release notes … :)

producer.send("Any questions? Thanks")
