Reducing Microservice Complexity with Kafka and Reactive Streams
Jim Riecken
My talk from ScalaDays 2016 in New York on May 11, 2016:

Transitioning from a monolithic application to a set of microservices can help increase performance and scalability, but it can also drastically increase complexity. Layers of inter-service network calls add latency and an increasing risk of failure where previously only local function calls existed. In this talk, I'll speak about how to tame this complexity using Apache Kafka and Reactive Streams to:
- Extract non-critical processing from the critical path of your application to reduce request latency
- Provide back-pressure to handle both slow and fast producers/consumers
- Maintain high availability, high performance, and reliable messaging
- Evolve message payloads while maintaining backwards and forwards compatibility.


  1. Reducing Microservice Complexity with Kafka and Reactive Streams Jim Riecken
  2. Reducing Microservice Complexity with Kafka and Reactive Streams Senior Software Developer Jim Riecken @jimriecken - jim.riecken@hootsuite.com
  3. @jimriecken
  4. Agenda • Monolith to Microservices + Complexity • Asynchronous Messaging • Kafka • Reactive Streams + Akka Streams
  5. Anti-Agenda • Details on how to set up a Kafka cluster • In-depth tutorial on Akka Streams
  6. Monolith to Microservices
  7. (Diagram: a single monolith, M)
  8. (Chart: efficiency over time)
  9. (Diagram: monolith M with extracted services S1 and S2)
  10. (Diagram: frontend F calling services S1 through S5)
  11. (Chart: efficiency over time) • Small • Scalable • Independent • Easy to Create • Clear ownership
  12. Network Calls • Latency • Failure
  13. (Diagram: Reliability - services at 99.9% each compose to ~99.5% end-to-end)
  14. Coordination • Between services • Between teams
  15. Asynchronous Messaging
  16. (Diagram: synchronous request/response vs. asynchronous messaging through a message bus)
  17. Why? • Decoupling • Pub/Sub • Less coordination • Additional consumers are easy • Help scale organization
  18. Messaging Requirements • Well-defined delivery semantics • High-Throughput • Highly-Available • Durable • Scalable • Backpressure
  19. Kafka
  20. What is Kafka? • Distributed, partitioned, replicated commit log service • Pub/Sub messaging functionality • Created by LinkedIn, now an Apache open-source project
  21. (Diagram: Producers → Kafka Brokers → Consumers)
  22. Topics + Partitions (Diagram: a topic with partitions P0-P2, each an append-only log of numbered offsets; new messages are appended at the end)
  23. Producers • Send messages to topics • Responsible for choosing which partition to send to • Round-robin • Consistent hashing based on a message key
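The two partition-selection strategies on the Producers slide can be sketched in a few lines of plain Scala. This is a hypothetical `PartitionChooser`, not Kafka's actual default partitioner, which is more involved:

```scala
// Toy sketch of producer-side partition selection (hypothetical helper).
object PartitionChooser {
  private var counter = -1L

  // Round-robin: keyless messages cycle evenly across partitions.
  def roundRobin(numPartitions: Int): Int = {
    counter += 1
    (counter % numPartitions).toInt
  }

  // Key hash: the same key always maps to the same partition,
  // which is what preserves per-key ordering.
  def byKey(key: Array[Byte], numPartitions: Int): Int =
    Math.floorMod(java.util.Arrays.hashCode(key), numPartitions)
}
```

Keyed hashing is the interesting case: because every message with a given key lands in the same partition, consumers see those messages in the order they were produced.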
  24. Consumers • Pull messages from topics • Track their own offset in each partition
  25. (Diagram: a topic with partitions P0-P2 and offsets 1-6, consumed independently by consumer Group 1 and Group 2)
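The consumer-group picture, where two groups read the same topic at their own pace because each tracks its own offsets, can be modeled with a toy in-memory log. These are hypothetical classes, not the Kafka client API:

```scala
import scala.collection.mutable

// Toy model of a partitioned topic: the log is shared, but each
// consumer group keeps its own offset per partition.
class TopicLog(numPartitions: Int) {
  private val partitions =
    Array.fill(numPartitions)(mutable.ArrayBuffer.empty[String])
  private val offsets = mutable.Map.empty[(String, Int), Int]

  def append(partition: Int, msg: String): Unit =
    partitions(partition) += msg

  // Each group pulls from its own saved position and advances it;
  // one group's progress never affects another's.
  def poll(group: String, partition: Int): Option[String] = {
    val off = offsets.getOrElse((group, partition), 0)
    if (off < partitions(partition).size) {
      offsets((group, partition)) = off + 1
      Some(partitions(partition)(off))
    } else None
  }
}
```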
  26. How does Kafka meet the requirements?
  27. Kafka is Fast • Hundreds of MB/s of reads/writes from thousands of concurrent clients • LinkedIn (2015) • 800 billion messages per day (18 million/s peak) • 175 TB of data produced per day • > 1000 servers in 60 clusters
  28. Kafka is Resilient • Brokers • All data is persisted to disk • Partitions replicated to other nodes • Consumers • Start where they left off • Producers • Can retry - at-least-once messaging
  29. Kafka is Scalable • Capacity can be added at runtime with zero downtime • More servers => more disk space • Topics can be larger than any single node could hold • Additional partitions can be added to add more parallelism
  30. Kafka Helps with Back-Pressure • Large storage capacity • Topic retention is a Consumer SLA • Almost impossible for a fast producer to overload a slow consumer • Allows real-time as well as batch consumption
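"Topic retention is a Consumer SLA" can be made concrete with a toy bounded log: the broker keeps a retention window, so a slow consumer never slows the producer down; the only risk is falling behind the window. A hypothetical sketch with retention measured in messages rather than time:

```scala
// Toy retention-window log (hypothetical; real Kafka retains by time/size).
class RetainedLog(retention: Int) {
  private var entries = Vector.empty[(Long, String)]
  private var next = 0L

  // Producers append freely; old entries beyond the window are dropped.
  def append(msg: String): Unit = {
    entries = (entries :+ (next -> msg)).takeRight(retention)
    next += 1
  }

  // A consumer reads at its own pace. If it fell behind the window
  // (the "SLA" breach), it resumes at the oldest retained offset.
  def read(offset: Long): Option[(Long, String)] =
    entries.find(_._1 >= offset)
}
```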
  31. Message Data Format
  32. Messages • Array[Byte] • Serialization? • JSON? • Protocol Buffers • Binary - Fast • IDL - Code Generation • Message evolution
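The "message evolution" point is what field-numbered formats like Protocol Buffers buy you: a reader can decode both old and new payloads by defaulting missing fields and ignoring unknown ones. A toy Scala sketch of that idea (hypothetical `UserEvent` and field map, not real protobuf wire format):

```scala
// A v2 reader: field 3 was added in v2, so v1 payloads decode
// with a default; unknown future field numbers are simply ignored.
case class UserEvent(id: Long, name: String, source: String)

def decode(fields: Map[Int, String]): UserEvent =
  UserEvent(
    id = fields(1).toLong,                  // present since v1
    name = fields.getOrElse(2, ""),         // v1 field, defaulted if absent
    source = fields.getOrElse(3, "unknown") // added in v2, defaulted for old data
  )
```

This is why producers and consumers can be upgraded independently: old readers skip fields they don't know, and new readers tolerate fields that aren't there yet.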
  33. Processing Data with Reactive Streams
  34. Reactive Streams • Standard for async stream processing with non-blocking back-pressure • Subscriber signals demand to publisher • Publisher sends no more than demand • Low-level • Mainly meant for library authors
  35. Publisher[T]: subscribe(s: Subscriber[T]) / Subscriber[T]: onSubscribe(s: Subscription), onNext(t: T), onComplete(), onError(t: Throwable) / Subscription: request(n: Long), cancel()
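The demand protocol on slide 35 can be mimicked in plain Scala with a toy synchronous publisher (not the real `org.reactivestreams` interfaces, which are asynchronous and far stricter): elements flow only when the subscriber requests them, which is the essence of non-blocking back-pressure.

```scala
// Toy demand-driven publisher. The returned function plays the
// role of Subscription.request(n): nothing is emitted until the
// subscriber signals demand, and never more than was requested.
class RangePublisher(elems: Range) {
  def subscribe(onNext: Int => Unit, onComplete: () => Unit): Long => Unit = {
    val it = elems.iterator
    (n: Long) => {
      var remaining = n
      while (remaining > 0 && it.hasNext) {
        onNext(it.next())
        remaining -= 1
      }
      if (!it.hasNext) onComplete()
    }
  }
}
```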
  36. Processing Data with Akka Streams
  37. Akka Streams • Library on top of Akka Actors and Reactive Streams • Process sequences of elements using bounded buffer space • Strongly Typed
  38. Concepts (Diagram: Source, Sink, Flow, Fan-Out, and Fan-In stages)
  39. Concepts (Diagram: stages composed into a Runnable Graph)
  40. Composition
  41. Materialization • Turning on the tap • Create actors • Open files/sockets/other resources • Materialized values • Source: Actor, Promise, Subscriber • Sink: Actor, Future, Producer
  42. Reactive Kafka
  43. Reactive Kafka • https://github.com/akka/reactive-kafka • Akka Streams wrapper around Kafka API • Consumer Source • Producer Sink
  44. Producer • Sink - sends message to Kafka topic • Flow - sends message to Kafka topic + emits result downstream • When the stream completes/fails, the connection to Kafka will be automatically closed
  45. Consumer • Source - pulls messages from Kafka topics • Offset Management • Back-pressure • Materialization • Object that can stop the consumer (and complete the stream)
  46. Simple Producer Example

```scala
implicit val system = ActorSystem("producer-test")
implicit val materializer = ActorMaterializer()

val producerSettings = ProducerSettings(
  system, new ByteArraySerializer, new StringSerializer
).withBootstrapServers("localhost:9092")

Source(1 to 100)
  .map(i => s"Message $i")
  .map(m => new ProducerRecord[Array[Byte], String]("lower", m))
  .to(Producer.plainSink(producerSettings))
  .run()
```
  47. Simple Consumer Example

```scala
implicit val system = ActorSystem("consumer-test")
implicit val materializer = ActorMaterializer()

val consumerSettings = ConsumerSettings(
  system, new ByteArrayDeserializer, new StringDeserializer, Set("lower")
).withBootstrapServers("localhost:9092").withGroupId("test-group")

val control = Consumer.atMostOnceSource(consumerSettings.withClientId("client1"))
  .map(record => record.value)
  .to(Sink.foreach(v => println(v)))
  .run()

control.stop()
```
  48. Combined Example

```scala
val control = Consumer.committableSource(consumerSettings.withClientId("client1"))
  .map { msg =>
    val upper = msg.value.toUpperCase
    Producer.Message(
      new ProducerRecord[Array[Byte], String]("upper", upper),
      msg.committableOffset)
  }
  .to(Producer.commitableSink(producerSettings))
  .run()

control.stop()
```
  49. Demo
  50. Wrap-Up
  51. Wrap-Up • Microservices have many advantages, but can introduce failure and complexity • Asynchronous messaging can help reduce this complexity, and Kafka is a great option • Akka Streams makes reliably processing data from Kafka with back-pressure easy
  52. Thank you! Questions? @jimriecken - jim.riecken@hootsuite.com Jim Riecken
