Kafka as a message queue

KAFKA AS A MQ
CAN YOU DO IT, AND SHOULD YOU DO IT? 
Adam Warski, Apache Kafka London Meetup

@adamwarski, SoftwareMill, Kafka London Meetup
THE PLAN
➤ Acknowledgments in plain Kafka
➤ Why selective acknowledgments?
➤ Why not …MQ?
➤ Kmq implementation
➤ Demo
➤ Performance

ACKNOWLEDGMENTS IN PLAIN KAFKA
➤ Offset commits:
[Diagram: a topic with partitions 1 to 3 holding msg18 to msg25; committed positions shown as "commit offset: 20" and "commit offset: 24"]
➤ Using this, we can implement:
➤ at-least-once processing
➤ at-most-once processing
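
The difference between the two processing modes comes down to whether the offset is committed before or after a message is handled. The following is a minimal in-memory sketch of that ordering, not Kafka client code: the list stands in for one partition, and `consume`, `make_handler` and `run` are hypothetical names introduced only for illustration.

```python
# In-memory model of one partition and a consumer's committed offset.
# No Kafka client is involved; all names here are illustrative.

def consume(messages, committed, handler, commit_first):
    """Process messages starting at `committed`; return the new committed offset.

    commit_first=True  -> at-most-once  (commit, then process)
    commit_first=False -> at-least-once (process, then commit)
    """
    offset = committed
    try:
        while offset < len(messages):
            if commit_first:
                committed = offset + 1      # commit before processing
                handler(messages[offset])
            else:
                handler(messages[offset])
                committed = offset + 1      # commit only after success
            offset += 1
    except RuntimeError:
        pass  # simulated crash: the consumer stops here
    return committed


def make_handler(fail_on, seen):
    """Handler that records handled messages and crashes once on `fail_on`."""
    state = {"failed": False}

    def handler(msg):
        if msg == fail_on and not state["failed"]:
            state["failed"] = True
            raise RuntimeError("simulated crash")
        seen.append(msg)

    return handler


def run(commit_first):
    messages = ["msg18", "msg19", "msg20", "msg21"]
    seen = []
    handler = make_handler("msg20", seen)
    committed = consume(messages, 0, handler, commit_first)  # crashes on msg20
    consume(messages, committed, handler, commit_first)      # restart from the committed offset
    return seen
```

With `run(False)` the failed message is read again after the restart; with `run(True)` it is skipped, because its offset was already committed when the crash happened.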

WHY SELECTIVE ACKNOWLEDGMENTS?
➤ Integrating with external systems
➤ e.g. HTTP/REST endpoints
➤ email
➤ other messaging
➤ Individual calls might fail
➤ should be retried
➤ without retrying the whole batch
➤ without delaying subsequent batches

WHY NOT …MQ?
➤ Typical usage scenario for a message queue
➤ RabbitMQ, ActiveMQ, Artemis, SQS …
➤ Kafka:
➤ proven & reliable clustering & replication mechanisms
➤ performance
➤ convenience: reduce operational complexity

AMAZON SQS
➤ Message queue as-a-service
➤ Simple API:
➤ CreateQueue
➤ SendMessage
➤ ReceiveMessage
➤ DeleteMessage
➤ Received messages are blocked for a period of time
➤ visibility timeout
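
The visibility timeout can be illustrated with a toy in-memory queue. This is only a sketch of the semantics listed above, not the SQS API: the class `VisibilityQueue`, its method names and the explicit `now` parameter are assumptions made for the example, and real SQS does not promise the strict ordering this model happens to have.

```python
class VisibilityQueue:
    """Toy in-memory queue with SQS-like visibility timeout semantics."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}         # message id -> body
        self.invisible_until = {}  # message id -> time at which it is visible again
        self.next_id = 0

    def send_message(self, body):
        msg_id = self.next_id
        self.next_id += 1
        self.messages[msg_id] = body
        self.invisible_until[msg_id] = 0   # visible immediately
        return msg_id

    def receive_message(self, now):
        """Return (id, body) of the first visible message, or None.

        The returned message is hidden from other receivers for the timeout.
        """
        for msg_id in sorted(self.messages):
            if self.invisible_until[msg_id] <= now:
                self.invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, self.messages[msg_id]
        return None

    def delete_message(self, msg_id):
        """Acknowledge a single message; it will never be delivered again."""
        self.messages.pop(msg_id, None)
        self.invisible_until.pop(msg_id, None)
```

A message that is received but never deleted becomes visible again once the timeout passes, which is the per-message redelivery behaviour that kmq aims to reproduce on top of Kafka.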

KMQ: IMPLEMENTATION
➤ Two topics:
➤ queue: messages to process
➤ markers: for each message, start/end markers
➤ same number of partitions
➤ A number of queue clients
➤ here data is processed
➤ A number of redelivery trackers

QUEUE CLIENT
1. Read message from queue
2. Write start [offset] to markers
➤ wait for send to complete!
3. Commit offset to queue
4. Process the message
5. Write end [offset] to markers
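
The five steps above can be sketched with plain Python lists standing in for one queue partition and its matching markers partition. This is an illustrative model under those assumptions, not the KmqClient API: `Marker`, `QueueClient`, `poll_and_process` and `unacknowledged` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    kind: str       # "start" or "end"
    offset: int     # offset of the message in the queue topic
    timestamp: int  # when the marker was written


class QueueClient:
    def __init__(self, queue, markers):
        self.queue = queue      # list standing in for one queue-topic partition
        self.markers = markers  # list standing in for the matching markers partition
        self.committed = 0      # next offset to read

    def poll_and_process(self, process, now):
        """Run steps 1-5 for one message; return False when nothing is left."""
        if self.committed >= len(self.queue):
            return False
        offset = self.committed
        message = self.queue[offset]                       # 1. read message
        self.markers.append(Marker("start", offset, now))  # 2. write start marker (completes before step 3)
        self.committed = offset + 1                        # 3. commit offset
        try:
            process(message)                               # 4. process the message
        except Exception:
            return True  # no end marker: the message is left for redelivery
        self.markers.append(Marker("end", offset, now))    # 5. write end marker
        return True


def unacknowledged(markers):
    """Offsets that have a start marker but no end marker."""
    started = {m.offset for m in markers if m.kind == "start"}
    ended = {m.offset for m in markers if m.kind == "end"}
    return sorted(started - ended)
```

Because the start marker is written before the offset is committed, a message whose processing fails still leaves a trace in the markers topic, even though the queue offset has already moved past it.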

[Diagram: queue client flow between the queue topic and the markers topic (partitions 1 to 3 each)]
1. read messages from topic (msg39, msg40)
2. write start markers (start marker, offset: 39)
3. commit offsets (msg38, offset: 38)
4. process message (msg37): on success, confirm message processed; on failed processing, wait for redelivery
5. write end markers (end marker, offset: 37)
[Diagram: redelivery tracker]
marker stream → started, not ended markers:
offset=10, time=1488010644
offset=15, time=1488141843
offset=24, time=1488289812
…
every second: trigger, redeliver timed out messages (read & redeliver message, e.g. msg10)

REDELIVERY TRACKER
➤ A Kafka application
➤ consumes the markers topic
➤ Multiple instances for fail-over
➤ Uses Kafka’s auto-partition-assignment

REDELIVERY TRACKER
➤ In-memory priority queue
➤ by Kafka’s marker timestamp
➤ messages with start markers, but no end markers
➤ Checks for messages to redeliver at regular intervals
➤ redelivery: seek + send
➤ in order
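
The tracking logic described above can be sketched with Python's `heapq` as the priority queue. This is a simplified model, not the kmq implementation: `RedeliveryTracker`, `on_marker`, `due` and `redeliver` are hypothetical names, the timeout is an assumed parameter, and a list stands in for the queue topic.

```python
import heapq


class RedeliveryTracker:
    def __init__(self, timeout):
        self.timeout = timeout
        self.heap = []          # (start marker timestamp, offset), oldest first
        self.in_flight = set()  # offsets with a start marker but no end marker

    def on_marker(self, kind, offset, timestamp):
        """Consume one record from the markers topic."""
        if kind == "start":
            self.in_flight.add(offset)
            heapq.heappush(self.heap, (timestamp, offset))
        elif kind == "end":
            self.in_flight.discard(offset)  # the heap entry is dropped lazily in due()

    def due(self, now):
        """Pop and return, oldest first, the offsets whose start marker has timed out."""
        result = []
        while self.heap and self.heap[0][0] + self.timeout <= now:
            _, offset = heapq.heappop(self.heap)
            if offset in self.in_flight:
                self.in_flight.discard(offset)
                result.append(offset)
        return result


def redeliver(queue, offsets):
    """"Seek + send": read each message at its offset and append it to the queue again."""
    for offset in offsets:
        queue.append(queue[offset])
```

Calling `due` at a regular interval (every second in the diagram) and passing the result to `redeliver` gives the periodic check; a redelivered message gets a new offset, so it is tracked by its own start marker the next time it is read.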

PERFORMANCE
➤ 3-node Kafka cluster
➤ m4.2xlarge servers (8 CPUs, 32GiB RAM)
➤ single AZ
➤ 100 byte messages, sent in batches of up to 10
➤ Up to 8 sender/receiver nodes
➤ 64 to 160 partitions
➤ replication-factor=3
➤ min.insync.replicas=2
➤ acks=all (-1)

[Chart: plain Kafka vs. kmq]

LATENCY
➤ Plain Kafka: ~50 ms
➤ kmq: 50 ms to 130 ms

WHAT IF MESSAGES ARE DROPPED?
➤ 50% drop rate

KMQ INTERNALS
➤ RedeliveryTracker
➤ Implemented in Scala, with a Java API
➤ Uses Akka
➤ One tracking actor per markers topic partition
➤ One redeliver actor per queue topic partition
➤ Started/stopped when partitions are assigned/revoked
➤ KmqClient
➤ Single Java class
➤ + marker value classes

ABOUT ME
➤ Software engineer, co-founder @
➤ Custom software development: Scala/Kafka/Java/Cassandra/…
➤ Open-source: sttp, QuickLens, ElasticMQ, Envers, MacWire, …
➤ Blog @ softwaremill.com/blog
➤ Twitter @ twitter.com/adamwarski

SUMMARY
➤ Individual, selective message acknowledgments
➤ similar to SQS
➤ Alternative to batch/up-to-offset acknowledgments in plain Kafka
➤ Storage overhead: additional meta-data topic
➤ Performance overhead: comparable
➤ Integrating with external systems

LINKS
➤ GitHub: https://github.com/softwaremill/kmq
➤ Introductory blog: https://softwaremill.com/using-kafka-as-a-message-queue/
➤ Message queue performance: https://softwaremill.com/mqperf/
➤ @adamwarski / adam@warski.org
