Apache Kafka as Message Queue
for your microservices and other occasions
Michael Reinsch

michael@movingfast.io

at Rug::B Feb 2018
Message Queues?
Message Queue
• Queue acts as buffer

• Indirect communication

• Multiple consumer processes

• Multiple producers
[Diagram: Producer -> Message queue (holding “Hello!” messages) -> Consumer]
With Exchange
• Queue is strongly coupled to consumer

• Exchange needs to know system topology

• Adding additional consumers is relatively expensive
[Diagram: Producer, Exchange (X), Message queue (holding “Hello!” messages), Consumer X, Consumer Y]
Apache Kafka
• Topics are always multi-publisher and multi-subscriber

• Adding / removing consumers is very cheap
[Diagram: Producer, Topic holding messages (“Hello!”, “Greets”), Consumer X, Consumer Y]
Apache Kafka:
Distributed Streaming Platform
Key capabilities:

• Publish and subscribe to streams of records

• Store streams of records in a fault-tolerant way

• Process streams of records as they occur
Topics
• ‘Category’ for a stream of records

• Producers only append

• All published records are retained
for a configurable retention period

• Consumers use offsets to keep track of
the last record they processed
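
The append-only log and per-consumer offsets can be sketched with a toy in-memory model. This is not the Kafka client API: `ToyTopic`, `ToyConsumer` and the timestamps are made up for illustration, and real brokers handle retention per log segment rather than per record.

```ruby
# Toy in-memory model of a topic; not the Kafka client API.
class ToyTopic
  Record = Struct.new(:offset, :timestamp, :value)

  def initialize(retention_seconds:)
    @retention_seconds = retention_seconds
    @records = []
    @next_offset = 0
  end

  # Producers only append; existing records are never modified.
  def append(value, now: Time.now)
    record = Record.new(@next_offset, now, value)
    @records << record
    @next_offset += 1
    record.offset
  end

  # Drop records older than the retention period, whether or not they were read.
  def expire!(now: Time.now)
    @records.reject! { |r| now - r.timestamp > @retention_seconds }
  end

  # Return all retained records at or after the given offset.
  def read_from(offset)
    @records.select { |r| r.offset >= offset }
  end
end

# Each consumer keeps its own offset pointer, so reads are independent.
class ToyConsumer
  attr_reader :next_offset

  def initialize(topic)
    @topic = topic
    @next_offset = 0
  end

  def poll
    records = @topic.read_from(@next_offset)
    @next_offset = records.last.offset + 1 unless records.empty?
    records.map(&:value)
  end
end

topic = ToyTopic.new(retention_seconds: 60)
start = Time.at(0)
topic.append("Hello!", now: start)
topic.append("Greets", now: start + 30)

consumer_x = ToyConsumer.new(topic)
consumer_y = ToyConsumer.new(topic)
first_poll_x = consumer_x.poll          # X reads both records
topic.append("Hello again!", now: start + 50)
second_poll_x = consumer_x.poll         # X only sees the new record
topic.expire!(now: start + 70)          # the record from t=0 is now too old
first_poll_y = consumer_y.poll          # Y starts late and misses the expired record
```

Reading does not remove anything: records disappear only when the retention period has passed, which is why a late consumer can still catch up on whatever is retained.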
Topic Partitions
• Topic can have many partitions

• Partitions are an ordered,
immutable sequence of records

• Partition size is limited by disk
space

• Partitions are replicated for fault tolerance

• Unit of parallelism
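
A common way to decide which partition a record goes to is to hash its key, so that all records with the same key land in the same partition and keep their relative order. The sketch below assumes a CRC32-based hash and a hypothetical `partition_for` helper; the partitioner of a real client may differ.

```ruby
require "zlib"

# Hypothetical helper: map a record key to a partition index.
def partition_for(key, partition_count)
  Zlib.crc32(key) % partition_count
end

# Toy topic with several partitions, each an append-only array.
partitions = Array.new(3) { [] }

events = [
  ["user-1", "signed up"],
  ["user-2", "signed up"],
  ["user-1", "changed email"],
  ["user-1", "deleted account"],
]

events.each do |key, value|
  partitions[partition_for(key, partitions.size)] << [key, value]
end

# All events for one key sit in a single partition, in publishing order.
user_1_partition = partitions[partition_for("user-1", partitions.size)]
user_1_events = user_1_partition.select { |key, _| key == "user-1" }.map(&:last)
```

Ordering holds only within a partition, so events that must stay in order relative to each other should share a key.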
Consumer Groups
• Consumers can form consumer groups

• Each consumer in a group is the
exclusive consumer of a “fair share” of
partitions

• Strong ordering guarantee within a topic
partition
[Diagram: Producer, Topic holding messages (“Hello!”, “greets”), Consumer Group X, Consumer Group Y]
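
The “fair share” idea can be illustrated by spreading partitions over the members of a group. The round-robin strategy and the `assign_partitions` helper below are assumptions for illustration; the assignment strategy Kafka clients actually use may differ.

```ruby
# Hypothetical helper: spread partitions over the members of one consumer group.
# Round-robin is an assumption; real assignment strategies may differ.
def assign_partitions(partition_ids, consumer_names)
  assignment = consumer_names.to_h { |name| [name, []] }
  partition_ids.each_with_index do |partition_id, index|
    owner = consumer_names[index % consumer_names.size]
    assignment[owner] << partition_id
  end
  assignment
end

partition_ids = [0, 1, 2, 3, 4, 5]

# Within a group, every partition has exactly one owner.
group_x = assign_partitions(partition_ids, ["x-1", "x-2", "x-3"])

# Another group gets its own, independent assignment of the same partitions.
group_y = assign_partitions(partition_ids, ["y-1"])
```

Because a partition has a single consumer per group, adding more consumers than partitions to a group would leave some of them idle in this model.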
Demo!
Scripts at github.com/mreinsch/kafka_demo
Let’s compare
Kafka vs. REST API
• Asynchronous, indirect communication

• Less coupling in producer -> easier to extend

• Fewer critical paths
[Diagram: Producer, Topic (examples: new users, new orders), Consumer Group X, Consumer Group Y]
Kafka vs. job queue
• Similar, but different

• Less coupling in producer
[Diagram: Producer, Topic (examples: new users, new orders), Consumer Group X, Consumer Group Y]
Challenges
when using Kafka (or other message queues) for your microservices
Asynchronicity
• Example: existing clients use REST API, but processing is done by
some microservice

• Possible solution:

• Return `202 Accepted` pointing to a job resource

• Job resource returns status / actual resource location when
finished

• Use another Kafka topic to communicate job status changes

• http://restcookbook.com/Resources/asynchroneous-operations/
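
The flow above can be sketched with plain Ruby functions that return Rack-style `[status, headers, body]` triples. Everything here is hypothetical: the routes, the `order-requests` topic name, the in-memory `JOBS` store and the injected `publish` lambda stand in for a real web framework and Kafka producer. Answering a finished job with `303 See Other` is one common choice, not something the slide prescribes.

```ruby
require "securerandom"
require "json"

# In-memory stand-in for wherever job state is kept (hypothetical).
JOBS = {}

# POST /orders: accept the request, hand the work off, answer immediately.
def create_order(params, publish:)
  job_id = SecureRandom.uuid
  JOBS[job_id] = { status: "pending", resource: nil }
  publish.call("order-requests", { job_id: job_id, params: params })
  [202, { "Location" => "/jobs/#{job_id}" }, ""]
end

# GET /jobs/:id: report progress, or point at the finished resource.
def show_job(job_id)
  job = JOBS[job_id]
  return [404, {}, ""] unless job

  if job[:status] == "done"
    [303, { "Location" => job[:resource] }, ""]
  else
    [200, { "Content-Type" => "application/json" }, JSON.generate({ status: job[:status] })]
  end
end

# Called by whatever consumes the job status topic.
def handle_job_status(event)
  job = JOBS[event[:job_id]]
  return unless job

  job[:status] = event[:status]
  job[:resource] = event[:resource]
end

# Stand-in for a Kafka producer: just remember what was published.
published = []
publish = ->(topic, payload) { published << [topic, payload] }

accepted = create_order({ item: "book" }, publish: publish)
job_id = published.first.last[:job_id]
pending = show_job(job_id)
handle_job_status({ job_id: job_id, status: "done", resource: "/orders/42" })
finished = show_job(job_id)
```

Existing clients keep talking plain HTTP; only the API service needs to know that the work happens asynchronously behind a topic.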
More Asynchronicity
• Example: need data from another service

• Pragmatic solution:

• Just keep using a REST API

• Scalable solution:

• Combine REST API with local cache which gets
invalidated by an asynchronous event
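
A minimal sketch of the cache-plus-invalidation idea, assuming a hypothetical `CachedUserClient` whose remote lookup is injected as a lambda so the example runs without a network. In a real service the lambda would wrap the REST call and `handle_user_changed` would be wired to a Kafka consumer.

```ruby
# Read-through cache in front of a remote lookup. All names are hypothetical.
class CachedUserClient
  def initialize(fetch_user)
    @fetch_user = fetch_user   # e.g. a lambda wrapping a REST call
    @cache = {}
  end

  # Serve from the cache, falling back to the remote lookup on a miss.
  def find(user_id)
    @cache.fetch(user_id) { @cache[user_id] = @fetch_user.call(user_id) }
  end

  # Wire this to the consumer of the other service's change events.
  def handle_user_changed(event)
    @cache.delete(event[:user_id])
  end
end

remote_users = { 1 => { name: "Alice" } }
rest_calls = 0
fetch_user = lambda do |user_id|
  rest_calls += 1
  remote_users.fetch(user_id).dup
end

client = CachedUserClient.new(fetch_user)
client.find(1)
client.find(1)                               # served from the cache
calls_before_change = rest_calls

remote_users[1] = { name: "Alicia" }         # the owning service changes the user
client.handle_user_changed({ user_id: 1 })   # and publishes an event about it
name_after_change = client.find(1)[:name]    # next read goes back to the REST API
```

The event only needs to carry the identifier of what changed; the fresh data is still fetched through the REST API on the next read.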
Error handling
• You need a strategy for handling errors to avoid consumer
processes getting stuck

• ruby-kafka doesn’t provide this; higher-level libraries exist

• kafka_worker: a minimalist worker abstraction

• Pushes the message and its metadata into an error topic on hard
errors

• Work in progress…
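
One possible shape for such a strategy is to retry a few times and then park the message in an error topic so the consumer can move on. The sketch below is not the API of ruby-kafka or kafka_worker; `process_with_error_topic`, the retry count and the error entry fields are assumptions, and an array stands in for the error topic.

```ruby
# Hypothetical worker loop: retry a few times, then park the message in an
# error topic so the consumer does not get stuck on it.
def process_with_error_topic(messages, publish_error:, max_attempts: 3)
  messages.each do |message|
    attempts = 0
    begin
      attempts += 1
      yield message
    rescue StandardError => e
      retry if attempts < max_attempts
      # Hard error: keep the message and some metadata for later inspection.
      publish_error.call(
        payload: message,
        error_class: e.class.name,
        error_message: e.message,
        attempts: attempts
      )
    end
  end
end

error_topic = []   # stand-in for a real error topic
processed = []
flaky_calls = 0

process_with_error_topic(
  ["ok-1", "flaky", "poison", "ok-2"],
  publish_error: ->(**entry) { error_topic << entry }
) do |message|
  case message
  when "flaky"
    flaky_calls += 1
    raise "temporary glitch" if flaky_calls < 2
  when "poison"
    raise ArgumentError, "cannot parse"
  end
  processed << message
end
```

The message that keeps failing ends up in `error_topic` together with the error details, while the messages after it are still processed.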
Event Loops
• Example: 

• Consumer A: consumes topic-a, publishes to topic-b

• Consumer B: consumes topic-b, publishes to topic-a

• Usually much more complex…

• We haven’t had one yet, but with more services it
becomes more likely
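
If the consumed and produced topics of each service are written down, a simple graph search can flag loops like the one in the example. The `TOPOLOGY` format and the helper names below are made up for illustration, and the extra `mailer` service is hypothetical.

```ruby
# Hypothetical description of which service consumes and produces which topics.
TOPOLOGY = {
  "consumer-a" => { consumes: ["topic-a"], produces: ["topic-b"] },
  "consumer-b" => { consumes: ["topic-b"], produces: ["topic-a"] },
  "mailer"     => { consumes: ["topic-b"], produces: [] },
}

# Build topic -> downstream topic edges: a topic leads to every topic that
# one of its consumers produces.
def topic_graph(topology)
  graph = Hash.new { |hash, key| hash[key] = [] }
  topology.each_value do |service|
    service[:consumes].each do |source|
      graph[source].concat(service[:produces])
    end
  end
  graph
end

# Depth-first walk from one topic; returns the cycle as a list of topics, or nil.
def find_cycle(graph, topic, path = [])
  return path.drop(path.index(topic)) + [topic] if path.include?(topic)

  graph.fetch(topic, []).each do |next_topic|
    cycle = find_cycle(graph, next_topic, path + [topic])
    return cycle if cycle
  end
  nil
end

# Try every topic as a starting point and return the first cycle found.
def first_cycle(graph)
  graph.keys.each do |topic|
    cycle = find_cycle(graph, topic)
    return cycle if cycle
  end
  nil
end

cycle = first_cycle(topic_graph(TOPOLOGY))
```

A check like this only sees loops that are visible in the documented topology; whether a consumer actually republishes for every message it reads still depends on its logic.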
Some Tips
• Add some common metadata to each event, such as
Origin-UUID (passed on when triggering other events) and
Seen-By

• Document which service consumes / produces which
events

• Only include data relevant to the event; other data should
be fetched as needed
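
The metadata tip could look roughly like this. The envelope layout, the `build_event` helper and the event type names are assumptions; only the Origin-UUID and Seen-By fields come from the slide, and using Seen-By to skip events is just one possible use.

```ruby
require "securerandom"

# Hypothetical envelope builder. The field names mirror the slide's
# Origin-UUID and Seen-By; the exact format is an assumption.
def build_event(type:, data:, service:, caused_by: nil)
  {
    type: type,
    data: data,
    metadata: {
      # Reuse the origin of the triggering event so the whole chain can be traced.
      origin_uuid: caused_by ? caused_by[:metadata][:origin_uuid] : SecureRandom.uuid,
      # Record every service that has handled the chain so far.
      seen_by: (caused_by ? caused_by[:metadata][:seen_by] : []) + [service],
    },
  }
end

# One possible use of Seen-By (an assumption): skip events that this service
# has already contributed to.
def already_seen?(event, service)
  event[:metadata][:seen_by].include?(service)
end

# The payload carries only the identifier; details are fetched when needed.
user_created = build_event(type: "user.created", data: { user_id: 7 }, service: "accounts")
welcome_sent = build_event(
  type: "mail.sent",
  data: { user_id: 7 },
  service: "mailer",
  caused_by: user_created
)
```

With the origin passed along, all events triggered by one initial action share the same identifier and can be correlated across services.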
Get in touch
Michael Reinsch

michael@movingfast.io

GitHub: mreinsch
Looking for new interesting projects / opportunities!
