Reducing Microservice Complexity with Kafka and Reactive Streams

Senior Software Developer
Jim Riecken
@jimriecken - jim.riecken@hootsuite.com

Agenda
• Monolith to Microservices + Complexity
• Asynchronous Messaging
• Kafka
• Reactive Streams + Akka Streams

Anti-Agenda
• Details on how to set up a Kafka cluster
• In-depth tutorial on Akka Streams

[Figure: Efficiency vs. Time]
• Small
• Scalable
• Independent
• Easy to Create
• Clear ownership

Network Calls
• Latency
• Failure

Reliability
• Individual services: 99.9%, 99.9%, 99.9%, 99.9%
• Overall: ~99.5%
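The reliability figure above follows from multiplying availabilities along a call chain. The sketch below assumes independent failures and equal per-service availability; the object and method names are illustrative only. Under those assumptions, four hops at 99.9% come to roughly 99.6%, and five hops (for example, a caller plus four downstream services) to roughly 99.5%.

```scala
object ChainReliability {
  // Availability of a request that must pass through `hops` services in series,
  // assuming independent failures and the same availability for every service.
  def combined(perService: Double, hops: Int): Double =
    math.pow(perService, hops)

  def main(args: Array[String]): Unit =
    for (hops <- 1 to 5)
      println(f"$hops%d hop(s): ${combined(0.999, hops) * 100}%.2f%%")
}
```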

Coordination
• Between services
• Between teams

Message Bus
Synchronous
Asynchronous

Why?
• Decoupling
• Pub/Sub
• Less coordination
• Additional consumers are easy
• Help scale organization

Messaging Requirements
• Well-defined delivery semantics
• High-Throughput
• Highly-Available
• Durable
• Scalable
• Backpressure

What is Kafka?
• Distributed, partitioned, replicated commit log service
• Pub/Sub messaging functionality
• Created by LinkedIn, now an Apache open-source project

Producers
Kafka Brokers
Consumers

Topics + Partitions
Topic
P0: 0 | 1 | 2 | 3 | 4 | 5
P1: 0 | 1 | 2 | 3 | 4 | 5 | 6
P2: 0 | 1 | 2 | 3
(New messages are appended to the end of each partition)

Producers
• Send messages to topics
• Responsible for choosing which partition to send to
  • Round-robin
  • Consistent hashing based on a message key
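The two partition-selection strategies above can be sketched as follows. This is a simplified illustration, not the Kafka client's partitioner: `PartitionChooser` is a hypothetical name, `String.hashCode` stands in for whatever hash function a real client uses, and the key-based variant is a plain hash modulo the partition count, so a key stays on one partition only while the partition count is unchanged.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: picks a partition index in [0, numPartitions).
final class PartitionChooser(numPartitions: Int) {
  private val counter = new AtomicInteger(0)

  // Round-robin: spread keyless messages evenly across partitions.
  def roundRobin(): Int =
    Math.floorMod(counter.getAndIncrement(), numPartitions)

  // Key-based: the same key always maps to the same partition,
  // so messages for one key keep their relative order.
  def byKey(key: String): Int =
    Math.floorMod(key.hashCode, numPartitions)
}
```

With three partitions, six successive `roundRobin()` calls cycle through 0, 1, 2 twice, while `byKey("user-42")` returns the same index on every call.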

Consumers
• Pull messages from topics
• Track their own offset in each partition
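To make the pull model concrete, here is a minimal in-memory sketch of one partition and a consumer that tracks its own offset. `PartitionLog` and `PullConsumer` are hypothetical names for illustration and are not part of any Kafka API; a real broker persists and replicates the log.

```scala
import scala.collection.mutable.ArrayBuffer

// One partition: an append-only sequence where a message's offset is its index.
final class PartitionLog[A] {
  private val entries = ArrayBuffer.empty[A]

  // Append a message and return the offset it was stored at.
  def append(msg: A): Long = { entries += msg; entries.size - 1L }

  // Return up to `max` messages starting at `offset`.
  def read(offset: Long, max: Int): Vector[A] =
    entries.slice(offset.toInt, offset.toInt + max).toVector
}

// The consumer, not the log, remembers how far it has read.
final class PullConsumer[A](log: PartitionLog[A]) {
  private var offset = 0L
  def position: Long = offset

  // Pull the next batch and advance this consumer's offset past it.
  def poll(max: Int): Vector[A] = {
    val batch = log.read(offset, max)
    offset += batch.size
    batch
  }
}
```

Because the offset lives in the consumer, two consumers can read the same log at different speeds without affecting each other.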

[Diagram: a topic with partitions P0, P1, and P2, read by consumers 1-6 split across consumer groups Group 1 and Group 2]

How does Kafka meet the requirements?

Kafka is Fast
• Hundreds of MB/s of reads/writes from thousands of concurrent clients
• LinkedIn (2015)
  • 800 billion messages per day (18 million/s peak)
  • 175 TB of data produced per day
  • > 1000 servers in 60 clusters

Kafka is Resilient
• Brokers
  • All data is persisted to disk
  • Partitions replicated to other nodes
• Consumers
  • Start where they left off
• Producers
  • Can retry - at-least-once messaging
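The "can retry" point is why delivery is at-least-once rather than exactly-once: if a send succeeds on the broker but the acknowledgement is lost, the retry stores the message a second time. The sketch below simulates that with a hypothetical `flakySend` that stores the message but reports a failure on its first call; none of these names come from the Kafka client.

```scala
import scala.annotation.tailrec
import scala.collection.mutable.ArrayBuffer
import scala.util.{Failure, Success, Try}

object RetryingSend {
  // Retry `send` until it succeeds or `maxAttempts` is exhausted.
  @tailrec
  def sendWithRetry[A](msg: A, maxAttempts: Int)(send: A => Try[Unit]): Try[Unit] =
    send(msg) match {
      case Success(_)                    => Success(())
      case Failure(_) if maxAttempts > 1 => sendWithRetry(msg, maxAttempts - 1)(send)
      case failure                       => failure
    }

  // Simulated broker: every call stores the message, but the first
  // acknowledgement is "lost", so the producer sees a failure.
  val stored = ArrayBuffer.empty[String]
  private var acksToLose = 1
  def flakySend(msg: String): Try[Unit] = {
    stored += msg
    if (acksToLose > 0) { acksToLose -= 1; Failure(new RuntimeException("ack lost")) }
    else Success(())
  }
}
```

After `RetryingSend.sendWithRetry("m1", 3)(RetryingSend.flakySend)` the call succeeds, but `stored` holds `"m1"` twice, so consumers need to tolerate duplicates.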

Kafka is Scalable
• Capacity can be added at runtime with zero downtime
• More servers => more disk space
• Topics can be larger than any single node could hold
• Additional partitions can be added to add more parallelism

Kafka Helps with Back-Pressure
• Large storage capacity
• Topic retention is a Consumer SLA
• Almost impossible for a fast producer to overload a slow consumer
• Allows real-time as well as batch consumption

Messages
• Array[Byte]
• Serialization?
• JSON?
• Protocol Buffers
  • Binary - Fast
  • IDL - Code Generation
  • Message evolution

Processing Data with Reactive Streams

Reactive Streams
• Standard for async stream processing with non-blocking back-pressure
• Subscriber signals demand to publisher
• Publisher sends no more than demand
• Low-level
• Mainly meant for library authors

Publisher[T]
  subscribe(s: Subscriber[-T])
Subscriber[T]
  onSubscribe(s: Subscription)
  onNext(t: T)
  onComplete()
  onError(t: Throwable)
Subscription
  request(n: Long)
  cancel()
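The demand protocol behind these interfaces can be sketched with the JDK's `java.util.concurrent.Flow` types (JDK 9+), which mirror the Reactive Streams interfaces. This is a deliberately simplified, single-threaded illustration: `RangePublisher` and `CollectingSubscriber` are hypothetical names, and the publisher does not implement every rule of the specification (for example, it supports one subscriber and no concurrency).

```scala
import java.util.concurrent.Flow.{Publisher, Subscriber, Subscription}
import scala.collection.mutable.ArrayBuffer

// Publishes the integers from..to, never emitting more than was requested.
final class RangePublisher(from: Int, to: Int) extends Publisher[Integer] {
  def subscribe(s: Subscriber[_ >: Integer]): Unit =
    s.onSubscribe(new Subscription {
      private var next = from
      private var done = false

      def request(n: Long): Unit =
        if (n <= 0) s.onError(new IllegalArgumentException("demand must be positive"))
        else {
          var remaining = n
          while (!done && remaining > 0 && next <= to) {
            val value = next
            next += 1
            remaining -= 1
            s.onNext(Integer.valueOf(value))
          }
          if (!done && next > to) { done = true; s.onComplete() }
        }

      def cancel(): Unit = done = true
    })
}

// Collects elements and lets the caller signal demand explicitly via pull(n).
final class CollectingSubscriber extends Subscriber[Integer] {
  val received = ArrayBuffer.empty[Int]
  var completed = false
  private var subscription: Subscription = _

  def onSubscribe(s: Subscription): Unit = subscription = s
  def onNext(i: Integer): Unit = received += i.intValue
  def onError(t: Throwable): Unit = throw t
  def onComplete(): Unit = completed = true

  // Signal demand for n more elements.
  def pull(n: Long): Unit = subscription.request(n)
}
```

Subscribing alone delivers nothing; elements arrive only after `pull(n)`, and never more than `n` per call. That is the back-pressure contract in miniature.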

Processing Data with Akka Streams

Akka Streams
• Library on top of Akka Actors and Reactive Streams
• Process sequences of elements using bounded buffer space
• Strongly Typed

Concepts
• Source
• Flow
• Sink
• Fan-Out
• Fan-In

Materialization
• Turning on the tap
• Create actors
• Open files/sockets/other resources
• Materialized values
  • Source: Actor, Promise, Subscriber
  • Sink: Actor, Future, Producer
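A small sketch of a Source, Flow, and Sink being materialized may help here. It assumes the akka-stream library is on the classpath and uses `ActorMaterializer` to match the era of this deck (newer Akka versions can take the materializer from the implicit `ActorSystem`). The object and value names are illustrative only.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Keep, Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object MaterializationSketch {
  def sumOfDoubles(): Int = {
    implicit val system: ActorSystem = ActorSystem("materialization-sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try {
      val source  = Source(1 to 5)                  // emits 1, 2, 3, 4, 5
      val doubler = Flow[Int].map(_ * 2)            // transforms each element
      val sum     = Sink.fold[Int, Int](0)(_ + _)   // materializes a Future[Int]

      // The blueprint is only a description; nothing runs yet.
      val blueprint = source.via(doubler).toMat(sum)(Keep.right)

      // run() "turns on the tap" and returns the sink's materialized value.
      val result: Future[Int] = blueprint.run()
      Await.result(result, 5.seconds)
    } finally {
      system.terminate()
    }
  }
}
```

`Keep.right` selects the sink's materialized value (the `Future[Int]`) over the source's; blocking with `Await` is done here only to keep the sketch short.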

Reactive Kafka
• https://github.com/akka/reactive-kafka
• Akka Streams wrapper around Kafka API
• Consumer Source
• Producer Sink

Producer
• Sink - sends message to Kafka topic
• Flow - sends message to Kafka topic + emits result downstream
• When the stream completes/fails the connection to Kafka will be automatically closed

Consumer
• Source - pulls messages from Kafka topics
• Offset Management
• Back-pressure
• Materialization
  • Object that can stop the consumer (and complete the stream)

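Simple Producer Example
import akka.actor.ActorSystem
import akka.kafka.ProducerSettings
import akka.kafka.scaladsl.Producer
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Source
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.{ByteArraySerializer, StringSerializer}

implicit val system = ActorSystem("producer-test")
implicit val materializer = ActorMaterializer()

val producerSettings = ProducerSettings(
  system, new ByteArraySerializer, new StringSerializer
).withBootstrapServers("localhost:9092")

// Publish "Message 1" .. "Message 100" to the "lower" topic
Source(1 to 100)
  .map(i => s"Message $i")
  .map(m => new ProducerRecord[Array[Byte], String]("lower", m))
  .to(Producer.plainSink(producerSettings)).run()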

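Simple Consumer Example
import akka.actor.ActorSystem
import akka.kafka.ConsumerSettings
import akka.kafka.scaladsl.Consumer
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Sink
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, StringDeserializer}

implicit val system = ActorSystem("consumer-test")
implicit val materializer = ActorMaterializer()

val consumerSettings = ConsumerSettings(
  system, new ByteArrayDeserializer, new StringDeserializer, Set("lower")
).withBootstrapServers("localhost:9092").withGroupId("test-group")

// Print every value read from the "lower" topic
val control = Consumer.atMostOnceSource(consumerSettings.withClientId("client1"))
  .map(record => record.value)
  .to(Sink.foreach(v => println(v))).run()

// Later, when shutting down: stop the consumer and complete the stream
control.stop()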

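Combined Example
// Reuses consumerSettings and producerSettings from the examples above:
// read from "lower", upper-case each value, write it to "upper",
// and commit each offset once its message has been produced
val control = Consumer.committableSource(consumerSettings.withClientId("client1"))
  .map { msg =>
    val upper = msg.value.toUpperCase
    Producer.Message(
      new ProducerRecord[Array[Byte], String]("upper", upper),
      msg.committableOffset)
  }.to(Producer.commitableSink(producerSettings)).run()

// Later, when shutting down
control.stop()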

Wrap-Up
• Microservices have many advantages, but can introduce failure and complexity.
• Asynchronous messaging can help reduce this complexity, and Kafka is a great option.
• Akka Streams makes reliably processing data from Kafka with back-pressure easy.

Thank you!
Questions?
@jimriecken - jim.riecken@hootsuite.com
Jim Riecken
