Streaming Design Patterns Using
Alpakka Kafka Connector
Sean Glover, Lightbend
@seg1o
Who am I?
I’m Sean Glover
• Principal Engineer at Lightbend
• Member of the Lightbend Pipelines team
• Organizer of Scala Toronto (scalator)
• Author and contributor to various projects in the Kafka
ecosystem including Kafka, Alpakka Kafka (reactive-kafka),
Strimzi, Kafka Lag Exporter, DC/OS Commons SDK
2
/ seg1o
3
“ “The Alpakka project is an initiative to
implement a library of integration
modules to build stream-aware, reactive
pipelines for Java and Scala.
4
Cloud Services Data Stores
JMS
Messaging
5
kafka connector
“ “This Alpakka Kafka connector lets you
connect Apache Kafka to Akka Streams.
Top Alpakka Modules
6
Alpakka Module Downloads in August 2018
Kafka 61177
Cassandra 15946
AWS S3 15075
MQTT 11403
File 10636
Simple Codecs 8285
CSV 7428
AWS SQS 5385
AMQP 4036
7
“
“
Akka Streams is a library toolkit that provides low-latency,
complex event processing streaming semantics using the
Reactive Streams specification, implemented internally with
an Akka actor system.
streams
8
Source Flow Sink
User Messages (flow downstream)
Internal Back-pressure Messages (flow upstream)
Outlet
Inlet
streams
Reactive Streams Specification
9
“ “Reactive Streams is an initiative to
provide a standard for asynchronous
stream processing with non-blocking
back pressure.
http://www.reactive-streams.org/
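To make the spec concrete: Akka Streams can expose any source as a Reactive Streams Publisher and consume any Publisher back again, which is what lets different Reactive Streams implementations interoperate. A minimal sketch, not from the slides, assuming an Akka 2.5.x classpath (the object name is made up for illustration):

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import org.reactivestreams.Publisher

object ReactiveStreamsInterop extends App {
  implicit val system = ActorSystem("interop")
  implicit val materializer = ActorMaterializer()

  // Expose an Akka Streams source as a Reactive Streams Publisher ...
  val publisher: Publisher[Int] = Source(1 to 10).runWith(Sink.asPublisher(fanout = false))

  // ... and consume it again from any Reactive Streams compatible library.
  Source.fromPublisher(publisher).runForeach(println)
}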
Reactive Streams Libraries
10
streams
Spec now part of JDK 9
java.util.concurrent.Flow
migrating to
GraphStage
Akka Actor Concepts
11
Mailbox
1. Constrained Actor Mailbox
2. One message at a time
“Single Threaded Illusion”
3. May contain state
Actor
State
// Message Handler “Receive Block”
def receive = {
case message: MessageType =>
}
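The receive block above is just a skeleton. A minimal sketch of a complete actor, not from the slides (the Counter name and messages are made up), illustrating the mailbox, one-message-at-a-time processing, and optional state:

import akka.actor.{Actor, ActorSystem, Props}

// State is safe to mutate because the mailbox delivers one message at a time
// (the "single threaded illusion").
class Counter extends Actor {
  private var count = 0

  def receive = {
    case "increment" => count += 1
    case "report"    => sender() ! count
  }
}

object CounterApp extends App {
  val system = ActorSystem("actor-concepts")
  val counter = system.actorOf(Props[Counter], "counter")
  counter ! "increment"
}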
Back Pressure Demo
12
Source Flow Sink
Source Kafka Topic
Destination Kafka Topic
I need some
messages.
Demand request is
sent upstream
I need to load
some messages
for downstream
...
Key: EN, Value: {“message”: “Hi Akka!” }
Key: FR, Value: {“message”: “Salut Akka!” }
Key: ES, Value: {“message”: “Hola Akka!” }
...
Demand satisfied
downstream
...
Key: EN, Value: {“message”: “Bye Akka!” }
Key: FR, Value: {“message”: “Au revoir Akka!” }
Key: ES, Value: {“message”: “Adiós Akka!” }
...
openclipart
Dynamic Push Pull
13
Source Flow
Bounded Mailbox
Flow sends demand request
(pull) of 5 messages max
x
I can handle 5 more
messages
Source sends (push) a batch
of 5 messages downstream
I can’t send more messages
downstream because I have no
more demand to fulfill.
Flow’s mailbox is full!
Slow Consumer
Fast Producer
openclipart
Why Back Pressure?
• Prevent cascading failure
• Alternative to using a big buffer (i.e. Kafka)
• Back Pressure flow control can use several strategies
• Slow down until there’s demand (classic back pressure, “throttling”)
• Discard elements
• Buffer in memory to some max, then discard elements
• Shutdown
14
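These strategies map fairly directly onto built-in Akka Streams operators. A small sketch, not from the slides (element counts and names are arbitrary):

import akka.stream.{OverflowStrategy, ThrottleMode}
import akka.stream.scaladsl.{Flow, Source}
import scala.concurrent.duration._

// Classic back pressure: slow the producer down to a fixed rate.
val slowedDown = Source(1 to 1000).throttle(10, 1.second, 10, ThrottleMode.Shaping)

// Buffer in memory to some max, then discard the oldest elements.
val dropOldest = Flow[Int].buffer(100, OverflowStrategy.dropHead)

// Buffer in memory to some max, then fail (shut down) the stream.
val shutDown = Flow[Int].buffer(100, OverflowStrategy.fail)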
Why Back Pressure? A case study.
15
https://medium.com/@programmerohit/back-pressure-implementation-aws-sqs-polling-from-a-sharded-akka-cluster-running-on-kubernetes-56ee8c67efb
Akka Streams Factorial Example
import ...
object Main extends App {
implicit val system = ActorSystem("QuickStart")
implicit val materializer = ActorMaterializer()
val source: Source[Int, NotUsed] = Source(1 to 100)
val factorials = source.scan(BigInt(1))((acc, next) ⇒ acc * next)
val result: Future[IOResult] =
factorials
.map(num => ByteString(s"$num\n"))
.runWith(FileIO.toPath(Paths.get("factorials.txt")))
}
16
https://doc.akka.io/docs/akka/2.5/stream/stream-quickstart.html
Apache Kafka
17
Kafka Documentation
“ “Apache Kafka is a distributed
streaming system. It’s best suited to
support fast, high volume, and fault
tolerant data streaming platforms.
18
streams
streams!=
Akka Streams and Kafka Streams solve different problems
When to use Alpakka Kafka?
When to use Alpakka Kafka?
1. To build back pressure aware integrations
2. Complex Event Processing
3. A need to model the most complex of graphs
19
Anatomy of an Alpakka Kafka app
Alpakka Kafka Setup
val consumerClientConfig = system.settings.config.getConfig("akka.kafka.consumer")
val consumerSettings =
  ConsumerSettings(consumerClientConfig, new StringDeserializer, new ByteArrayDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("group1")
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

val producerClientConfig = system.settings.config.getConfig("akka.kafka.producer")
val producerSettings = ProducerSettings(system, new StringSerializer, new ByteArraySerializer)
  .withBootstrapServers("localhost:9092")
21
Alpakka Kafka config & Kafka Client
config can go here
Set ad-hoc Kafka client config
Anatomy of an Alpakka Kafka App
val control = Consumer
  .committableSource(consumerSettings, Subscriptions.topics(topic1))
  .map { msg =>
    ProducerMessage.single(
      new ProducerRecord(topic1, msg.record.key, msg.record.value),
      passThrough = msg.committableOffset)
  }
  .via(Producer.flexiFlow(producerSettings))
  .map(_.passThrough)
  .toMat(Committer.sink(committerSettings))(Keep.both)
  .mapMaterializedValue(DrainingControl.apply)
  .run()

// Add shutdown hook to respond to SIGTERM and gracefully shutdown stream
sys.ShutdownHookThread {
  Await.result(control.shutdown(), 10.seconds)
}
22
A small Consume -> Transform -> Produce Akka Streams app using Alpakka Kafka
val control = Consumer
  .committableSource(consumerSettings, Subscriptions.topics(topic1))
  .map { msg =>
    ProducerMessage.single(
      new ProducerRecord(topic1, msg.record.key, msg.record.value),
      passThrough = msg.committableOffset)
  }
  .via(Producer.flexiFlow(producerSettings))
  .map(_.passThrough)
  .toMat(Committer.sink(committerSettings))(Keep.both)
  .mapMaterializedValue(DrainingControl.apply)
  .run()

// Add shutdown hook to respond to SIGTERM and gracefully shutdown stream
sys.ShutdownHookThread {
  Await.result(control.shutdown(), 10.seconds)
}
23
Anatomy of an Alpakka Kafka App
The Committable Source propagates Kafka offset
information downstream with consumed messages
val control = Consumer
  .committableSource(consumerSettings, Subscriptions.topics(topic1))
  .map { msg =>
    ProducerMessage.single(
      new ProducerRecord(topic1, msg.record.key, msg.record.value),
      passThrough = msg.committableOffset)
  }
  .via(Producer.flexiFlow(producerSettings))
  .map(_.passThrough)
  .toMat(Committer.sink(committerSettings))(Keep.both)
  .mapMaterializedValue(DrainingControl.apply)
  .run()

// Add shutdown hook to respond to SIGTERM and gracefully shutdown stream
sys.ShutdownHookThread {
  Await.result(control.shutdown(), 10.seconds)
}
24
Anatomy of an Alpakka Kafka App
ProducerMessage is used to map the consumed offset to
transformed results.

One to One (1:1)
ProducerMessage.single

One to Many (1:M)
ProducerMessage.multi(
  immutable.Seq(
    new ProducerRecord(topic1, msg.record.key, msg.record.value),
    new ProducerRecord(topic2, msg.record.key, msg.record.value)),
  passThrough = msg.committableOffset
)

One to None (1:0)
ProducerMessage.passThrough(msg.committableOffset)
val control = Consumer
  .committableSource(consumerSettings, Subscriptions.topics(topic1))
  .map { msg =>
    ProducerMessage.single(
      new ProducerRecord(topic1, msg.record.key, msg.record.value),
      passThrough = msg.committableOffset)
  }
  .via(Producer.flexiFlow(producerSettings))
  .map(_.passThrough)
  .toMat(Committer.sink(committerSettings))(Keep.both)
  .mapMaterializedValue(DrainingControl.apply)
  .run()

// Add shutdown hook to respond to SIGTERM and gracefully shutdown stream
sys.ShutdownHookThread {
  Await.result(control.shutdown(), 10.seconds)
}
25
Anatomy of an Alpakka Kafka App
Produce messages to the destination topic
flexiFlow accepts the new ProducerMessage
type and will replace the deprecated Producer.flow
in the future.
val control = Consumer
  .committableSource(consumerSettings, Subscriptions.topics(topic1))
  .map { msg =>
    ProducerMessage.single(
      new ProducerRecord(topic1, msg.record.key, msg.record.value),
      passThrough = msg.committableOffset)
  }
  .via(Producer.flexiFlow(producerSettings))
  .map(_.passThrough)
  .toMat(Committer.sink(committerSettings))(Keep.both)
  .mapMaterializedValue(DrainingControl.apply)
  .run()

// Add shutdown hook to respond to SIGTERM and gracefully shutdown stream
sys.ShutdownHookThread {
  Await.result(control.shutdown(), 10.seconds)
}
26
Anatomy of an Alpakka Kafka App
Batches consumed offset commits
Passthrough allows us to track what messages
have been successfully processed for At Least
Once message delivery guarantees.
val control = Consumer
  .committableSource(consumerSettings, Subscriptions.topics(topic1))
  .map { msg =>
    ProducerMessage.single(
      new ProducerRecord(topic1, msg.record.key, msg.record.value),
      passThrough = msg.committableOffset)
  }
  .via(Producer.flexiFlow(producerSettings))
  .map(_.passThrough)
  .toMat(Committer.sink(committerSettings))(Keep.both)
  .mapMaterializedValue(DrainingControl.apply)
  .run()

// Add shutdown hook to respond to SIGTERM and gracefully shutdown stream
sys.ShutdownHookThread {
  Await.result(control.shutdown(), 10.seconds)
}
27
Anatomy of an Alpakka Kafka App
Gracefully shut down the stream
1. Stop consuming (polling) new messages
from Source
2. Wait for all messages to be successfully
committed (when applicable)
3. Wait for all produced messages to ACK
val control = Consumer
  .committableSource(consumerSettings, Subscriptions.topics(topic1))
  .map { msg =>
    ProducerMessage.single(
      new ProducerRecord(topic1, msg.record.key, msg.record.value),
      passThrough = msg.committableOffset)
  }
  .via(Producer.flexiFlow(producerSettings))
  .map(_.passThrough)
  .toMat(Committer.sink(committerSettings))(Keep.both)
  .mapMaterializedValue(DrainingControl.apply)
  .run()

// Add shutdown hook to respond to SIGTERM and gracefully shutdown stream
sys.ShutdownHookThread {
  Await.result(control.shutdown(), 10.seconds)
}
28
Anatomy of an Alpakka Kafka App
Graceful shutdown when SIGTERM is sent to the
app (e.g. by the Docker daemon)
Force shutdown after a grace interval
Consumer Group Rebalancing
Why use Consumer Groups?
1. Easy, robust, and
performant scaling of
consumers to reduce
consumer lag
30
Back
Pressure
Consumer Group
Latency and Offset Lag
31
Cluster
Topic
Producer 1
Producer 2
Producer n
...
Throughput: 10 MB/s
Consumer 1
Consumer 2
Consumer 3
Consumer Throughput ~3 MB/s
each
~9 MB/s
Total offset lag and latency is
growing.
openclipart
Consumer Group
Latency and Offset Lag
32
Cluster
Topic
Producer 1
Producer 2
Producer n
...
Data Throughput: 10 MB/s
Consumer 1
Consumer 2
Consumer 3
Consumer 4
Add new consumer and rebalance
Consumers can now support a
throughput of ~12 MB/s
Offset lag and latency decrease until
consumers are caught up
Committable Sink
val committerSettings = CommitterSettings(system)
val control: DrainingControl[Done] =
Consumer
.committableSource(consumerSettings, Subscriptions.topics(topic))
.mapAsync(1) { msg =>
business(msg.record.key, msg.record.value)
.map(_ => msg.committableOffset)
}
.toMat(Committer.sink(committerSettings))(Keep.both)
.mapMaterializedValue(DrainingControl.apply)
.run()
33
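The business(...) call on this slide is left undefined. A hypothetical placeholder, assuming it returns a Future so mapAsync can sequence it, could be as simple as:

import akka.Done
import scala.concurrent.Future

// Hypothetical stand-in for the slide's business(...) call: any function that
// processes the record and completes a Future will do.
def business(key: String, value: Array[Byte]): Future[Done] = Future.successful(Done)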
Anatomy of a Consumer Group
34
Client A
Client B
Client C
Cluster
Consumer Group
Partitions: 0,1,2
Partitions: 3,4,5
Partitions: 6,7,8
Consumer Group Offsets topic
Ex)
P0: 100489
P1: 128048
P2: 184082
P3: 596837
P4: 110847
P5: 99472
P6: 148270
P7: 3582785
P8: 182483
Consumer
Offset Log
T3
T1 T2
Consumer
Group
Coordinator
Consumer Group Topic Subscription
Important Consumer Group Client Config
Topic Subscription:
Subscription.topics(“Topic1”, “Topic2”, “Topic3”)
Kafka Consumer Properties:
group.id: [“my-group”]
session.timeout.ms: [30000 ms]
partition.assignment.strategy: [RangeAssignor]
heartbeat.interval.ms: [3000 ms]
Consumer Group
Leader
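A sketch, not from the slides, of how the configuration listed above can be applied through Alpakka Kafka's ConsumerSettings (the value name groupedConsumerSettings is made up, an ActorSystem is assumed in scope, and the property values are the defaults quoted on the slide):

import akka.kafka.ConsumerSettings
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, StringDeserializer}

val groupedConsumerSettings =
  ConsumerSettings(system, new StringDeserializer, new ByteArrayDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("my-group")
    .withProperty(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000")
    .withProperty(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
      "org.apache.kafka.clients.consumer.RangeAssignor")
    .withProperty(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000")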
Consumer Group Rebalance (1/7)
35
Client A
Client B
Client C
Cluster
Consumer Group
Partitions: 0,1,2
Partitions: 3,4,5
Partitions: 6,7,8
Consumer
Offset Log
T3
T1 T2
Consumer
Group
Coordinator
Consumer Group
Leader
Consumer Group Rebalance (2/7)
36
Client D
Client A
Client B
Client C
Cluster
Consumer Group
Partitions: 0,1,2
Partitions: 3,4,5
Partitions: 6,7,8
Consumer
Offset Log
T3
T1 T2
Consumer
Group
Coordinator
Consumer Group
Leader
New Client D with the same group.id sends a request to join
the group to the Consumer Group Coordinator.
Consumer Group Rebalance (3/7)
37
Client D
Client A
Client B
Client C
Cluster
Consumer Group
Partitions: 0,1,2
Partitions: 3,4,5
Partitions: 6,7,8
Consumer
Offset Log
T3
T1 T2
Consumer
Group
Coordinator
Consumer Group
Leader
Consumer group coordinator requests group leader to
calculate new Client:partition assignments.
Consumer Group Rebalance (4/7)
38
Client D
Client A
Client B
Client C
Cluster
Consumer Group
Partitions: 0,1,2
Partitions: 3,4,5
Partitions: 6,7,8
Consumer
Offset Log
T3
T1 T2
Consumer
Group
Coordinator
Consumer Group
Leader
Consumer group leader sends new Client:Partition
assignment to group coordinator.
Consumer Group Rebalance (5/7)
39
Client D
Client A
Client B
Client C
Cluster
Consumer Group
Assign Partitions: 0,1
Assign Partitions: 2,3
Assign Partitions: 6,7,8
Consumer
Offset Log
T3
T1 T2
Consumer
Group
Coordinator
Consumer Group
Leader
Consumer group coordinator informs all clients of their
new Client:Partition assignments.
Assign Partitions: 4,5
Consumer Group Rebalance (6/7)
40
Client D
Client A
Client B
Client C
Cluster
Consumer Group
Consumer
Offset Log
T3
T1 T2
Consumer
Group
Coordinator
Consumer Group
Leader
Clients that had partitions revoked are given the chance
to commit their latest processed offsets.
Partitions to Commit: 2
Partitions to Commit: 3,5
Partitions to Commit: 6,7,8
Consumer Group Rebalance (7/7)
41
Client D
Client A
Client B
Client C
Cluster
Consumer Group
Consumer
Offset Log
T3
T1 T2
Consumer
Group
Coordinator
Consumer Group
Leader
Rebalance complete. Clients begin consuming partitions
from their last committed offsets.
Partitions: 0,1
Partitions: 2,3
Partitions: 4,5
Partitions: 6,7,8
Consumer Group Rebalancing (asynchronous)
42
class RebalanceListener extends Actor with ActorLogging {
def receive: Receive = {
case TopicPartitionsAssigned(sub, assigned) =>
case TopicPartitionsRevoked(sub, revoked) =>
processRevokedPartitions(revoked)
}
}
val subscription = Subscriptions.topics("topic1", "topic2")
.withRebalanceListener(system.actorOf(Props[RebalanceListener]))
val control = Consumer.committableSource(consumerSettings, subscription)
...
Declare an Akka Actor to handle
assigned and revoked partition messages
asynchronously.
Useful to perform asynchronous actions
during rebalance, but not for blocking
operations you want to happen during
rebalance.
Consumer Group Rebalancing (asynchronous)
43
class RebalanceListener extends Actor with ActorLogging {
def receive: Receive = {
case TopicPartitionsAssigned(sub, assigned) =>
case TopicPartitionsRevoked(sub, revoked) =>
processRevokedPartitions(revoked)
}
}
val subscription = Subscriptions.topics("topic1", "topic2")
.withRebalanceListener(system.actorOf(Props[RebalanceListener]))
val control = Consumer.committableSource(consumerSettings, subscription)
...
Add Actor Reference to Topic
subscription to use.
Consumer Group Rebalancing (synchronous)
Synchronous partition assignment handler for the next release, by Enno Runne
https://github.com/akka/alpakka-kafka/pull/761
● Synchronous operations are difficult to model in an async library
● Will allow users to block the Consumer Actor thread (Consumer.poll thread)
● Provides limited consumer operations
○ Seek to offset
○ Synchronous commit
44
Transactional “Exactly-Once”
Kafka Transactions
46
“ “
Transactions enable atomic writes to
multiple Kafka topics and partitions.
All of the messages included in the
transaction will be successfully written
or none of them will be.
Message Delivery Semantics
• At most once
• At least once
• “Exactly once”
47
openclipart
Exactly Once Delivery vs Exactly Once Processing
48
“ “Exactly-once message delivery is
impossible between two parties where
failures of communication are
possible.
Two Generals/Byzantine Generals problem
Why use Transactions?
1. Zero tolerance for duplicate messages
2. Less boilerplate (deduping, client offset
management)
49
Anatomy of Kafka Transactions
50
Client
Cluster
Consumer
Offset Log
Topic Sub
Consumer
Group
Coordinator
Transaction
Log
Transaction
Coordinator
Topic Dest
Transformation
CM UM UM CM UM UM
Control Messages
Important Client Config
Topic Subscription:
Subscription.topics(“Topic1”, “Topic2”, “Topic3”)
Destination topic partitions get included in the transaction based on messages that are
produced.
Kafka Consumer Properties:
group.id: “my-group”
isolation.level: “read_committed”
plus other relevant consumer group configuration
Kafka Producer Properties:
transactional.id: “my-transaction”
enable.idempotence: “true” (implicit)
max.in.flight.requests.per.connection: “1” (implicit)
“Consume, Transform, Produce”
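The same client configuration expressed with the Kafka client's config constants — a sketch, not from the slides. When the Alpakka Transactional.source/sink APIs shown later are used, some of these (e.g. the transactional.id) are supplied through those APIs instead, so treat this only as a mapping of the property names:

import java.util.Properties
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.clients.producer.ProducerConfig

val txConsumerProps = new Properties()
txConsumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group")
txConsumerProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed")

val txProducerProps = new Properties()
txProducerProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-transaction")
txProducerProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true")          // implicit with transactions
txProducerProps.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1") // implicit with transactions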
Kafka Features That Enable Transactions
1. Idempotent producer
2. Multiple partition atomic writes
3. Consumer read isolation level
51
Idempotent Producer (1/5)
52
Client
Cluster Broker
KafkaProducer.send(k,v)
sequence num = 0
producer id = 123
Leader Partition
Log
Idempotent Producer (2/5)
53
Client
Cluster Broker
Leader Partition
Log
Append (k,v) to partition
sequence num = 0
producer id = 123
(k,v)
seq = 0
pid = 123
Idempotent Producer (3/5)
54
Client
Cluster Broker
Leader Partition
Log
(k,v)
seq = 0
pid = 123
KafkaProducer.send(k,v)
sequence num = 0
producer id = 123
Broker acknowledgement fails
x
Idempotent Producer (4/5)
55
Client
Cluster Broker
Leader Partition
Log
(k,v)
seq = 0
pid = 123
(Client Retry)
KafkaProducer.send(k,v)
sequence num = 0
producer id = 123
Idempotent Producer (5/5)
56
Client
Cluster Broker
Leader Partition
Log
(k,v)
seq = 0
pid = 123
KafkaProducer.send(k,v)
sequence num = 0
producer id = 123
Broker acknowledgement succeeds
ack(duplicate)
Multiple Partition Atomic Writes
57
Client
Consumer
Offset Log
Transactions
Log
User Defined
Partition 1
User Defined
Partition 2
User Defined
Partition 3
Cluster Transaction and Consumer Group Coordinators
CM
UM
UM
CM
UM
UM
CM
UM
UM
CM
CM
CM
CM
CM
CM
Ex) Second phase of two phase commit
KafkaProducer.commitTransaction()
Last Offset Processed for
Consumer Subscription
Transaction Committed
(internal) Transaction Committed control
messages (user topics)
Multiple Partitions Committed Atomically, “All or nothing”
Consumer Read Isolation Level
58
Client
User Defined
Partition 1
User Defined
Partition 2
User Defined
Partition 3
Cluster
CM
UM
UM
CM
UM
UM
CM
UM
UM
Kafka Consumer Properties:
isolation.level: “read_committed”
Transactional Pipeline Latency
59
Client Client Client
Transaction Batches every 100ms
End-to-end Latency
~300ms
Alpakka Kafka Transactions
60
Transactional
Source
Transform
Transactional
Sink
Source Kafka Partition(s)
Destination Kafka Partitions
...
Key: EN, Value: {“message”: “Hi Akka!” }
Key: FR, Value: {“message”: “Salut Akka!” }
Key: ES, Value: {“message”: “Hola Akka!” }
...
...
Key: EN, Value: {“message”: “Bye Akka!” }
Key: FR, Value: {“message”: “Au revoir Akka!” }
Key: ES, Value: {“message”: “Adiós Akka!” }
...
akka.kafka.producer.eos-commit-interval = 100ms
Cluster
Cluster
Messages waiting for ack
before commit
openclipart
Transactional GraphStage (1/7)
61
Transactional GraphStage
Transaction
Flow
Back Pressure Status
Resume Demand
Waiting for ACK
Commit Loop
Waiting
Transaction Status
Begin Transaction
Mailbox
Transactional GraphStage (2/7)
62
Transactional GraphStage
Transaction
Flow
Back Pressure Status
Resume Demand
Waiting for ACK
Commit Loop
Commit Interval Elapses
Transaction Status
Transaction is Open
Mailbox
Messages flowing
Transactional GraphStage (3/7)
63
Transactional GraphStage
Transaction
Flow
Back Pressure Status
Resume Demand
Waiting for ACK
Transaction Status
Transaction is Open
Commit Loop
Commit Interval Elapses
Messages flowing
Mailbox
Commit loop “tick”
message
100ms
Transactional GraphStage (4/7)
64
Transactional GraphStage
Transaction
Flow
Back Pressure Status
Suspend Demand
Waiting for ACK
Transaction Status
Transaction is Open
Commit Loop
Commit Interval Elapses
x
Mailbox
Messages stopped
Transactional GraphStage (5/7)
65
Transactional GraphStage
Transaction
Flow
Back Pressure Status
Suspend Demand
Waiting for ACK
Transaction Status
Send Consumed Offsets
Commit Loop
Commit Interval Elapses
x
Mailbox
Messages stopped
Transactional GraphStage (6/7)
66
Transactional GraphStage
Transaction
Flow
Back Pressure Status
Suspend Demand
Waiting for ACK
Transaction Status
Commit Transaction
Commit Loop
Commit Interval Elapses
x
Mailbox
Messages stopped
Transactional GraphStage (7/7)
67
Transactional GraphStage
Transaction
Flow
Back Pressure Status
Resume Demand
Waiting for ACK
Commit Loop
Waiting
Transaction Status
Begin New Transaction
Mailbox
Messages flowing
again
Alpakka Kafka Transactions
68
val producerSettings = ProducerSettings(system, new StringSerializer, new ByteArraySerializer)
  .withBootstrapServers("localhost:9092")
  .withEosCommitInterval(100.millis)

val control =
  Transactional
    .source(consumerSettings, Subscriptions.topics("source-topic"))
    .via(transform)
    .map { msg =>
      ProducerMessage.single(
        new ProducerRecord[String, Array[Byte]]("sink-topic", msg.record.value),
        msg.partitionOffset)
    }
    .to(Transactional.sink(producerSettings, "transactional-id"))
    .run()
Optionally provide a Transaction
commit interval (default is 100ms)
Alpakka Kafka Transactions
69
val producerSettings = ProducerSettings(system, new StringSerializer, new ByteArraySerializer)
  .withBootstrapServers("localhost:9092")
  .withEosCommitInterval(100.millis)

val control =
  Transactional
    .source(consumerSettings, Subscriptions.topics("source-topic"))
    .via(transform)
    .map { msg =>
      ProducerMessage.single(
        new ProducerRecord[String, Array[Byte]]("sink-topic", msg.record.value),
        msg.partitionOffset)
    }
    .to(Transactional.sink(producerSettings, "transactional-id"))
    .run()
Use Transactional.source to
propagate necessary info to
Transactional.sink (CG ID,
Offsets)
Alpakka Kafka Transactions
70
val producerSettings = ProducerSettings(system, new StringSerializer, new ByteArraySerializer)
  .withBootstrapServers("localhost:9092")
  .withEosCommitInterval(100.millis)

val control =
  Transactional
    .source(consumerSettings, Subscriptions.topics("source-topic"))
    .via(transform)
    .map { msg =>
      ProducerMessage.single(
        new ProducerRecord[String, Array[Byte]]("sink-topic", msg.record.value),
        msg.partitionOffset)
    }
    .to(Transactional.sink(producerSettings, "transactional-id"))
    .run()
Call Transactional.sink or .flow
to produce and commit messages.
Complex Event Processing
What is Complex Event Processing (CEP)?
72
“ “
Complex event processing, or CEP, is
event processing that combines data
from multiple sources to infer events
or patterns that suggest more
complicated circumstances.
Foundations of Complex Event Processing, Cornell
Calling into an Akka Actor System
73
Source
Ask
? Sink Cluster Cluster
“Ask pattern” models non-blocking request and
response of Akka messages.
openclipart
Actor System
& JVM
Actor System
& JVM
Actor System
& JVM
Cluster
Router
Akka Cluster/Actor System
Actor
Actor System Integration
class ProblemSolverRouter extends Actor {
def receive = {
case problem: Problem =>
val solution = businessLogic(problem)
sender() ! solution // reply to the ask
}
}
...
val control = Consumer
.committableSource(consumerSettings, Subscriptions.topics("topic1", "topic2"))
...
.mapAsync(parallelism = 5)(problem => (problemSolverRouter ? problem).mapTo[Solution])
...
.run()
74
Transform your stream by processing messages in an
Actor System. All you need is an ActorRef.
Actor System Integration
class ProblemSolverRouter extends Actor {
def receive = {
case problem: Problem =>
val solution = businessLogic(problem)
sender() ! solution // reply to the ask
}
}
...
val control = Consumer
.committableSource(consumerSettings, Subscriptions.topics("topic1", "topic2"))
...
.mapAsync(parallelism = 5)(problem => (problemSolverRouter ? problem).mapTo[Solution])
...
.run()
75
Use Ask pattern (? function) to call provided
ActorRef to get an async response
Actor System Integration
class ProblemSolverRouter extends Actor {
def receive = {
case problem: Problem =>
val solution = businessLogic(problem)
sender() ! solution // reply to the ask
}
}
...
val control = Consumer
.committableSource(consumerSettings, Subscriptions.topics("topic1", "topic2"))
...
.mapAsync(parallelism = 5)(problem => (problemSolverRouter ? problem).mapTo[Solution])
...
.run()
76
Parallelism is used to limit how many messages are in
flight, so we don’t overwhelm the mailbox of the destination
Actor and we maintain stream back-pressure.
Persistent Stateful Stages
Options for implementing Stateful Streams
1. Provided Akka Streams stages: fold, scan,
etc.
2. Custom GraphStage
3. Call into an Akka Actor System
78
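As a concrete illustration of option 1, a small sketch, not from the slides (Totals and the byte-counting logic are made up), of keeping state in the stream with the built-in scan stage:

import akka.stream.scaladsl.Flow

final case class Totals(bytesPerKey: Map[String, Long] = Map.empty)

// scan folds each element into an accumulator and emits the running state downstream.
val runningTotals = Flow[(String, Array[Byte])].scan(Totals()) { (totals, elem) =>
  val (key, value) = elem
  val updated = totals.bytesPerKey.getOrElse(key, 0L) + value.length
  Totals(totals.bytesPerKey.updated(key, updated))
}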
Persistent Stateful Stages using Event Sourcing
79
1. Recover state after failure
2. Create an event log
3. Share state
Sound familiar? KTables!
Persistent GraphStage using Event Sourcing
80
Source
Stateful
Stage
Sink Cluster Cluster
Event Log
Response (Event)
Triggers State Change
akka.persistence.Journal
State
Akka Persistence Plugins
Request
Handler
Event
Handler
Request (Command/Query)
Writes Reads
(Replays)
openclipart
81
krasserm / akka-stream-eventsourcing
“ “This project brings to Akka Streams what
Akka Persistence brings to Akka Actors:
persistence via event sourcing.
Experimental
Public Domain Vectors
New in Alpakka Kafka 1.0
Alpakka Kafka 1.0 Release Notes
Released Feb 28, 2019. Highlights:
● Upgraded the Kafka client to version 2.0.0 #544 by @fr3akX
○ Support new APIs from KIP-299: Fix Consumer indefinite blocking behaviour in #614 by
@zaharidichev
● New Committer.sink for standardised committing #622 by @rtimush
● Commit with metadata #563 and #579 by @johnclara
● Factored out akka.kafka.testkit for internal and external use: see Testing
● Support for merging commit batches #584 by @rtimush
● Reduced risk of message loss for partitioned sources #589
● Expose Kafka errors to stream #617
● Java APIs for all settings classes #616
● Much more comprehensive tests
83
Conclusion
85
kafka connector
openclipart
Thank You!
Sean Glover
@seg1o
in/seanaglover
sean.glover@lightbend.com
Free eBook!
https://bit.ly/2J9xmZm
More Related Content

What's hot

Introducing Kafka's Streams API
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams API
confluent
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
SANG WON PARK
 
Kafka Security 101 and Real-World Tips
Kafka Security 101 and Real-World Tips Kafka Security 101 and Real-World Tips
Kafka Security 101 and Real-World Tips
confluent
 

What's hot (20)

Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
 
Introducing Kafka's Streams API
Introducing Kafka's Streams APIIntroducing Kafka's Streams API
Introducing Kafka's Streams API
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producer
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
 
Storing 16 Bytes at Scale
Storing 16 Bytes at ScaleStoring 16 Bytes at Scale
Storing 16 Bytes at Scale
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?
 
Kafka Tutorial - DevOps, Admin and Ops
Kafka Tutorial - DevOps, Admin and OpsKafka Tutorial - DevOps, Admin and Ops
Kafka Tutorial - DevOps, Admin and Ops
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
 
Kafka Security 101 and Real-World Tips
Kafka Security 101 and Real-World Tips Kafka Security 101 and Real-World Tips
Kafka Security 101 and Real-World Tips
 
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentTemporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 
Introduction to the Disruptor
Introduction to the DisruptorIntroduction to the Disruptor
Introduction to the Disruptor
 
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical Optimization
 
Exactly-once Semantics in Apache Kafka
Exactly-once Semantics in Apache KafkaExactly-once Semantics in Apache Kafka
Exactly-once Semantics in Apache Kafka
 
Kafka Streams State Stores Being Persistent
Kafka Streams State Stores Being PersistentKafka Streams State Stores Being Persistent
Kafka Streams State Stores Being Persistent
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 
Event-sourced architectures with Akka
Event-sourced architectures with AkkaEvent-sourced architectures with Akka
Event-sourced architectures with Akka
 

Similar to Streaming Design Patterns Using Alpakka Kafka Connector (Sean Glover, Lightbend) Kafka Summit NYC 2019

Spark Streaming Info
Spark Streaming InfoSpark Streaming Info
Spark Streaming Info
Doug Chang
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Guido Schmutz
 
Reactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka StreamsReactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka Streams
Konrad Malawski
 

Similar to Streaming Design Patterns Using Alpakka Kafka Connector (Sean Glover, Lightbend) Kafka Summit NYC 2019 (20)

Let the alpakka pull your stream
Let the alpakka pull your streamLet the alpakka pull your stream
Let the alpakka pull your stream
 
Spark streaming and Kafka
Spark streaming and KafkaSpark streaming and Kafka
Spark streaming and Kafka
 
Reactive Streams 1.0 and Akka Streams
Reactive Streams 1.0 and Akka StreamsReactive Streams 1.0 and Akka Streams
Reactive Streams 1.0 and Akka Streams
 
Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!
 
Sparkstreaming
SparkstreamingSparkstreaming
Sparkstreaming
 
Spark Streaming Info
Spark Streaming InfoSpark Streaming Info
Spark Streaming Info
 
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
 
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...Leveraging Azure Databricks to minimize time to insight by combining Batch an...
Leveraging Azure Databricks to minimize time to insight by combining Batch an...
 
Designing a Scalable Data Platform
Designing a Scalable Data PlatformDesigning a Scalable Data Platform
Designing a Scalable Data Platform
 
I can't believe it's not a queue: Kafka and Spring
I can't believe it's not a queue: Kafka and SpringI can't believe it's not a queue: Kafka and Spring
I can't believe it's not a queue: Kafka and Spring
 
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streams
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka StreamsKafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streams
Kafka Summit SF 2017 - Exactly-once Stream Processing with Kafka Streams
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka Streams
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
 
Training
TrainingTraining
Training
 
Reactive Streams - László van den Hoek
Reactive Streams - László van den HoekReactive Streams - László van den Hoek
Reactive Streams - László van den Hoek
 
Service messaging using Kafka
Service messaging using KafkaService messaging using Kafka
Service messaging using Kafka
 
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache Kafka
 
Reactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka StreamsReactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka Streams
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
 

More from confluent

More from confluent (20)

Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 

Recently uploaded

Recently uploaded (20)

THE BEST IPTV in GERMANY for 2024: IPTVreel
THE BEST IPTV in  GERMANY for 2024: IPTVreelTHE BEST IPTV in  GERMANY for 2024: IPTVreel
THE BEST IPTV in GERMANY for 2024: IPTVreel
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
 
Buy Epson EcoTank L3210 Colour Printer Online.pptx
Buy Epson EcoTank L3210 Colour Printer Online.pptxBuy Epson EcoTank L3210 Colour Printer Online.pptx
Buy Epson EcoTank L3210 Colour Printer Online.pptx
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
ECS 2024 Teams Premium - Pretty Secure
ECS 2024   Teams Premium - Pretty SecureECS 2024   Teams Premium - Pretty Secure
ECS 2024 Teams Premium - Pretty Secure
 
The UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoThe UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, Ocado
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 

Streaming Design Patterns Using Alpakka Kafka Connector (Sean Glover, Lightbend) Kafka Summit NYC 2019

  • 1. Streaming Design Patterns Using Alpakka Kafka Connector Sean Glover, Lightbend @seg1o
  • 2. Who am I? I’m Sean Glover • Principal Engineer at Lightbend • Member of the Lightbend Pipelines team • Organizer of Scala Toronto (scalator) • Author and contributor to various projects in the Kafka ecosystem including Kafka, Alpakka Kafka (reactive-kafka), Strimzi, Kafka Lag Exporter, DC/OS Commons SDK 2 / seg1o
  • 3. 3 “ “The Alpakka project is an initiative to implement a library of integration modules to build stream-aware, reactive, pipelines for Java and Scala.
  • 4. 4 Cloud Services Data Stores JMS Messaging
  • 5. 5 kafka connector “ “This Alpakka Kafka connector lets you connect Apache Kafka to Akka Streams.
  • 6. Top Alpakka Modules 6 Alpakka Module Downloads in August 2018 Kafka 61177 Cassandra 15946 AWS S3 15075 MQTT 11403 File 10636 Simple Codecs 8285 CSV 7428 AWS SQS 5385 AMQP 4036
  • 7. 7 “ “ Akka Streams is a library toolkit to provide low latency complex event processing streaming semantics using the Reactive Streams specification implemented internally with an Akka actor system. streams
  • 8. 8 Source Flow Sink User Messages (flow downstream) Internal Back-pressure Messages (flow upstream) Outlet Inlet streams
  • 9. Reactive Streams Specification 9 “ “Reactive Streams is an initiative to provide a standard for asynchronous stream processing with non-blocking back pressure. http://www.reactive-streams.org/
  • 10. Reactive Streams Libraries 10 streams Spec now part of JDK 9 java.util.concurrent.Flow migrating to
  • 11. GraphStageAkka Actor Concepts 11 Mailbox 1. Constrained Actor Mailbox 2. One message at a time “Single Threaded Illusion” 3. May contain state Actor State // Message Handler “Receive Block” def receive = { case message: MessageType => }
  • 12. Back Pressure Demo 12 Source Flow Sink Source Kafka Topic Destination Kafka Topic I need some messages. Demand request is sent upstream I need to load some messages for downstream ... Key: EN, Value: {“message”: “Hi Akka!” } Key: FR, Value: {“message”: “Salut Akka!” } Key: ES, Value: {“message”: “Hola Akka!” } ... Demand satisfied downstream ... Key: EN, Value: {“message”: “Bye Akka!” } Key: FR, Value: {“message”: “Au revoir Akka!” } Key: ES, Value: {“message”: “Adiós Akka!” } ... openclipart
  • 13. Dynamic Push Pull 13 Source Flow Bounded Mailbox Flow sends demand request (pull) of 5 messages max x I can handle 5 more messages Source sends (push) a batch of 5 messages downstream I can’t send more messages downstream because I no more demand to fulfill. Flow’s mailbox is full! Slow Consumer Fast Producer openclipart
  • 14. Why Back Pressure? • Prevent cascading failure • Alternative to using a big buffer (i.e. Kafka) • Back Pressure flow control can use several strategies • Slow down until there’s demand (classic back pressure, “throttling”) • Discard elements • Buffer in memory to some max, then discard elements • Shutdown 14
  • 15. Why Back Pressure? A case study. 15 https://medium.com/@programmerohit/back-press ure-implementation-aws-sqs-polling-from-a-shard ed-akka-cluster-running-on-kubernetes-56ee8c67 efb
  • 16. Akka Streams Factorial Example import ... object Main extends App { implicit val system = ActorSystem("QuickStart") implicit val materializer = ActorMaterializer() val source: Source[Int, NotUsed] = Source(1 to 100) val factorials = source.scan(BigInt(1))((acc, next) ⇒ acc * next) val result: Future[IOResult] = factorials .map(num => ByteString(s"$numn")) .runWith(FileIO.toPath(Paths.get("factorials.txt"))) } 16 https://doc.akka.io/docs/akka/2.5/stream/stream-quickstart.html
  • 17. Apache Kafka 17 Kafka Documentation “ “Apache Kafka is a distributed streaming system. It’s best suited to support fast, high volume, and fault tolerant, data streaming platforms.
  • 18. 18 streams streams!= Akka Streams and Kafka Streams solve different problems When to use Alpakka Kafka?
  • 19. When to use Alpakka Kafka? 1. To build back pressure aware integrations 2. Complex Event Processing 3. A need to model the most complex of graphs 19
  • 20. Anatomy of an Alpakka Kafka app
  • 21. Alpakka Kafka Setup val consumerClientConfig = system.settings. config.getConfig( "akka.kafka.consumer") val consumerSettings = ConsumerSettings(consumerClientConfig, new StringDeserializer, new ByteArrayDeserializer) .withBootstrapServers( "localhost:9092") .withGroupId( "group1") .withProperty(ConsumerConfig. AUTO_OFFSET_RESET_CONFIG, "earliest") val producerClientConfig = system.settings. config.getConfig( "akka.kafka.producer") val producerSettings = ProducerSettings(system, new StringSerializer, new ByteArraySerializer) .withBootstrapServers( "localhost:9092") 21 Alpakka Kafka config & Kafka Client config can go here Set ad-hoc Kafka client config
  • 22. Anatomy of an Alpakka Kafka App val control = Consumer .committableSource(consumerSettings, Subscriptions. topics(topic1)) .map { msg => ProducerMessage. single( new ProducerRecord(topic1, msg.record.key, msg.record.value), passThrough = msg.committableOffset) } .via(Producer. flexiFlow(producerSettings)) .map(_.passThrough) .toMat(Committer. sink(committerSettings))(Keep. both) .mapMaterializedValue(DrainingControl. apply) .run() // Add shutdown hook to respond to SIGTERM and gracefully shutdown stream sys.ShutdownHookThread { Await.result(control.shutdown(), 10.seconds) } 22 A small Consume -> Transform -> Produce Akka Streams app using Alpakka Kafka
  • 23. val control = Consumer .committableSource(consumerSettings, Subscriptions. topics(topic1)) .map { msg => ProducerMessage. single( new ProducerRecord(topic1, msg.record.key, msg.record.value), passThrough = msg.committableOffset) } .via(Producer. flexiFlow(producerSettings)) .map(_.passThrough) .toMat(Committer. sink(committerSettings))(Keep. both) .mapMaterializedValue(DrainingControl. apply) .run() // Add shutdown hook to respond to SIGTERM and gracefully shutdown stream sys.ShutdownHookThread { Await.result(control.shutdown(), 10.seconds) } 23 Anatomy of an Alpakka Kafka App The Committable Source propagates Kafka offset information downstream with consumed messages
  • 24. val control = Consumer .committableSource(consumerSettings, Subscriptions. topics(topic1)) .map { msg => ProducerMessage. single( new ProducerRecord(topic1, msg.record.key, msg.record.value), passThrough = msg.committableOffset) } .via(Producer. flexiFlow(producerSettings)) .map(_.passThrough) .toMat(Committer. sink(committerSettings))(Keep. both) .mapMaterializedValue(DrainingControl. apply) .run() // Add shutdown hook to respond to SIGTERM and gracefully shutdown stream sys.ShutdownHookThread { Await.result(control.shutdown(), 10.seconds) } 24 Anatomy of an Alpakka Kafka App ProducerMessage used to map consumed offset to transformed results. One to One (1:1) ProducerMessage. single One to Many (1:M) ProducerMessage. multi( immutable.Seq( new ProducerRecord(topic1, msg.record.key, msg.record.value), new ProducerRecord(topic2, msg.record.key, msg.record.value)), passthrough = msg.committableOffset ) One to None (1:0) ProducerMessage. passThrough(msg.committableOffset)
  • 25. val control = Consumer .committableSource(consumerSettings, Subscriptions. topics(topic1)) .map { msg => ProducerMessage. single( new ProducerRecord(topic1, msg.record.key, msg.record.value), passThrough = msg.committableOffset) } .via(Producer. flexiFlow(producerSettings)) .map(_.passThrough) .toMat(Committer. sink(committerSettings))(Keep. both) .mapMaterializedValue(DrainingControl. apply) .run() // Add shutdown hook to respond to SIGTERM and gracefully shutdown stream sys.ShutdownHookThread { Await.result(control.shutdown(), 10.seconds) } 25 Anatomy of an Alpakka Kafka App Produce messages to destination topic flexiFlow accepts new ProducerMessage type and will replace deprecated flow in the future.
  • 26. val control = Consumer .committableSource(consumerSettings, Subscriptions. topics(topic1)) .map { msg => ProducerMessage. single( new ProducerRecord(topic1, msg.record.key, msg.record.value), passThrough = msg.committableOffset) } .via(Producer. flexiFlow(producerSettings)) .map(_.passThrough) .toMat( Committer.sink(committerSettings))(Keep.both) .mapMaterializedValue(DrainingControl. apply) .run() // Add shutdown hook to respond to SIGTERM and gracefully shutdown stream sys.ShutdownHookThread { Await.result(control.shutdown(), 10.seconds) } 26 Anatomy of an Alpakka Kafka App Batches consumed offset commits Passthrough allows us to track what messages have been successfully processed for At Least Once message delivery guarantees.
  • 27. val control = Consumer .committableSource(consumerSettings, Subscriptions. topics(topic1)) .map { msg => ProducerMessage. single( new ProducerRecord(topic1, msg.record.key, msg.record.value), passThrough = msg.committableOffset) } .via(Producer. flexiFlow(producerSettings)) .map(_.passThrough) .toMat(Committer. sink(committerSettings))(Keep. both) .mapMaterializedValue(DrainingControl. apply) .run() // Add shutdown hook to respond to SIGTERM and gracefully shutdown stream sys.ShutdownHookThread { Await.result(control.shutdown(), 10.seconds) } 27 Anatomy of an Alpakka Kafka App Gacefully shutdown stream 1. Stop consuming (polling) new messages from Source 2. Wait for all messages to be successfully committed (when applicable) 3. Wait for all produced messages to ACK
  • 28. val control = Consumer .committableSource(consumerSettings, Subscriptions. topics(topic1)) .map { msg => ProducerMessage. single( new ProducerRecord(topic1, msg.record.key, msg.record.value), passThrough = msg.committableOffset) } .via(Producer. flexiFlow(producerSettings)) .map(_.passThrough) .toMat(Committer. sink(committerSettings))(Keep. both) .mapMaterializedValue(DrainingControl. apply) .run() // Add shutdown hook to respond to SIGTERM and gracefully shutdown stream sys.ShutdownHookThread { Await.result(control.shutdown(), 10.seconds) } 28 Anatomy of an Alpakka Kafka App Graceful shutdown when SIGTERM sent to app (i.e. by docker daemon) Force shutdown after grace interval
  • 30. Why use Consumer Groups? 1. Easy, robust, and performant scaling of consumers to reduce consumer lag 30
  • 31. Back Pressure Consumer Group Latency and Offset Lag 31 Cluster Topic Producer 1 Producer 2 Producer n ... Throughput: 10 MB/s Consumer 1 Consumer 2 Consumer 3 Consumer Throughput ~3 MB/s each ~9 MB/s Total offset lag and latency is growing. openclipart
  • 32. Consumer Group Latency and Offset Lag 32 Cluster Topic Producer 1 Producer 2 Producer n ... Data Throughput: 10 MB/s Consumer 1 Consumer 2 Consumer 3 Consumer 4 Add new consumer and rebalance Consumers now can support a throughput of ~12 MB/s Offset lag and latency decreases until consumers are caught up
  • 33. Committable Sink val committerSettings = CommitterSettings(system) val control: DrainingControl[Done] = Consumer .committableSource(consumerSettings, Subscriptions.topics(topic)) .mapAsync(1) { msg => business(msg.record.key, msg.record.value) .map(_ => msg.committableOffset) } .toMat(Committer.sink(committerSettings))(Keep.both) .mapMaterializedValue(DrainingControl.apply) .run() 33
  • 34. Anatomy of a Consumer Group 34 Client A Client B Client C Cluster Consumer Group Partitions: 0,1,2 Partitions: 3,4,5 Partitions: 6,7,8 Consumer Group Offsets topic Ex) P0: 100489 P1: 128048 P2: 184082 P3: 596837 P4: 110847 P5: 99472 P6: 148270 P7: 3582785 P8: 182483 Consumer Offset Log T3 T1 T2 Consumer Group Coordinator Consumer Group Topic Subscription Important Consumer Group Client Config Topic Subscription: Subscription.topics(“Topic1”, “Topic2”, “Topic3”) Kafka Consumer Properties: group.id: [“my-group”] session.timeout.ms: [30000 ms] partition.assignment.strategy: [RangeAssignor] heartbeat.interval.ms: [3000 ms] Consumer Group Leader
  • 35. Consumer Group Rebalance (1/7) 35 Client A Client B Client C Cluster Consumer Group Partitions: 0,1,2 Partitions: 3,4,5 Partitions: 6,7,8 Consumer Offset Log T3 T1 T2 Consumer Group Coordinator Consumer Group Leader
  • 36. Consumer Group Rebalance (2/7) 36 Client D Client A Client B Client C Cluster Consumer Group Partitions: 0,1,2 Partitions: 3,4,5 Partitions: 6,7,8 Consumer Offset Log T3 T1 T2 Consumer Group Coordinator Consumer Group Leader Client D requests to join the consumer groupNew Client D with same group.id sends a request to join the group to Coordinator
  • 37. Consumer Group Rebalance (3/7) 37 Client D Client A Client B Client C Cluster Consumer Group Partitions: 0,1,2 Partitions: 3,4,5 Partitions: 6,7,8 Consumer Offset Log T3 T1 T2 Consumer Group Coordinator Consumer Group Leader Consumer group coordinator requests group leader to calculate new Client:partition assignments.
  • 38. Consumer Group Rebalance (4/7) 38 Client D Client A Client B Client C Cluster Consumer Group Partitions: 0,1,2 Partitions: 3,4,5 Partitions: 6,7,8 Consumer Offset Log T3 T1 T2 Consumer Group Coordinator Consumer Group Leader Consumer group leader sends new Client:Partition assignment to group coordinator.
  • 39. Consumer Group Rebalance (5/7) 39 Client D Client A Client B Client C Cluster Consumer Group Assign Partitions: 0,1 Assign Partitions: 2,3 Assign Partitions: 4,5 Assign Partitions: 6,7,8 Consumer Offset Log T3 T1 T2 Consumer Group Coordinator Consumer Group Leader Consumer group coordinator informs all clients of their new Client:Partition assignments.
  • 40. Consumer Group Rebalance (6/7) 40 Client D Client A Client B Client C Cluster Consumer Group Consumer Offset Log T3 T1 T2 Consumer Group Coordinator Consumer Group Leader Clients that had partitions revoked are given the chance to commit their latest processed offsets. Partitions to Commit: 2 Partitions to Commit: 3,5 Partitions to Commit: 6,7,8
  • 41. Consumer Group Rebalance (7/7) 41 Client D Client A Client B Client C Cluster Consumer Group Consumer Offset Log T3 T1 T2 Consumer Group Coordinator Consumer Group Leader Rebalance complete. Clients begin consuming partitions from their last committed offsets. Partitions: 0,1 Partitions: 2,3 Partitions: 4,5 Partitions: 6,7,8
  • 42. Consumer Group Rebalancing (asynchronous) 42 class RebalanceListener extends Actor with ActorLogging { def receive: Receive = { case TopicPartitionsAssigned(sub, assigned) => case TopicPartitionsRevoked(sub, revoked) => processRevokedPartitions(revoked) } } val subscription = Subscriptions.topics("topic1", "topic2") .withRebalanceListener(system.actorOf(Props[RebalanceListener])) val control = Consumer.committableSource(consumerSettings, subscription) ... Declare an Akka Actor to handle assigned and revoked partition messages asynchronously. Useful for performing asynchronous actions during a rebalance, but not for blocking operations that must complete before the rebalance proceeds.
  • 43. Consumer Group Rebalancing (asynchronous) 43 class RebalanceListener extends Actor with ActorLogging { def receive: Receive = { case TopicPartitionsAssigned(sub, assigned) => case TopicPartitionsRevoked(sub, revoked) => processRevokedPartitions(revoked) } } val subscription = Subscriptions.topics("topic1", "topic2") .withRebalanceListener(system.actorOf(Props[RebalanceListener])) val control = Consumer.committableSource(consumerSettings, subscription) ... Add Actor Reference to Topic subscription to use.
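The processRevokedPartitions call in these snippets is not defined on the slides; a hypothetical stand-in, assumed to live inside the RebalanceListener actor so that log from ActorLogging is available, just to show the types involved:

import org.apache.kafka.common.TopicPartition

// Hypothetical placeholder: a real handler might flush in-flight work or
// update metrics for the partitions this client is about to lose.
def processRevokedPartitions(revoked: Set[TopicPartition]): Unit =
  log.info("Partitions revoked: {}", revoked.mkString(", "))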
  • 44. Consumer Group Rebalancing (synchronous) Synchronous partition assignment handler planned for the next release, by Enno Runne https://github.com/akka/alpakka-kafka/pull/761 ● Synchronous operations are difficult to model in an async library ● Will allow users to block the Consumer Actor thread (the Consumer.poll thread) ● Provides limited consumer operations ○ Seek to offset ○ Synchronous commit 44
  • 46. Kafka Transactions 46 “ “ Transactions enable atomic writes to multiple Kafka topics and partitions. All of the messages included in the transaction will be successfully written or none of them will be.
  • 47. Message Delivery Semantics • At most once • At least once • “Exactly once” 47 openclipart
  • 48. Exactly Once Delivery vs Exactly Once Processing 48 “ “Exactly-once message delivery is impossible between two parties where failures of communication are possible. Two Generals/Byzantine Generals problem
  • 49. Why use Transactions? 1. Zero tolerance for duplicate messages 2. Less boilerplate (deduping, client offset management) 49
  • 50. Anatomy of Kafka Transactions 50 Client Cluster Consumer Offset Log Topic Sub Consumer Group Coordinator Transaction Log Transaction Coordinator Topic Dest Transformation CM UM UM CM UM UM Control Messages Important Client Config Topic Subscription: Subscription.topics(“Topic1”, “Topic2”, “Topic3”) Destination topic partitions get included in the transaction based on messages that are produced. Kafka Consumer Properties: group.id: “my-group” isolation.level: “read_committed” plus other relevant consumer group configuration Kafka Producer Properties: transactional.id: “my-transaction” enable.idempotence: “true” (implicit) max.in.flight.requests.per.connection: “1” (implicit) “Consume, Transform, Produce”
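The client properties on this slide, written out as plain Kafka client configuration for the "consume, transform, produce" loop; the broker address is an assumption and serializers/deserializers are omitted for brevity:

import java.util.Properties
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.clients.producer.ProducerConfig

// Consumer side: only read messages from committed transactions and do not
// auto-commit, because offsets are committed inside the transaction itself.
val consumerProps = new Properties()
consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group")
consumerProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed")
consumerProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false")

// Producer side: setting a transactional.id implicitly turns on idempotence
// and restricts in-flight requests, as noted on the slide.
val producerProps = new Properties()
producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed
producerProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-transaction")
producerProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true")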
  • 51. Kafka Features That Enable Transactions 1. Idempotent producer 2. Multiple partition atomic writes 3. Consumer read isolation level 51
  • 52. Idempotent Producer (1/5) 52 Client Cluster Broker KafkaProducer.send(k,v) sequence num = 0 producer id = 123 Leader Partition Log
  • 53. Idempotent Producer (2/5) 53 Client Cluster Broker Leader Partition Log Append (k,v) to partition sequence num = 0 producer id = 123 (k,v) seq = 0 pid = 123
  • 54. Idempotent Producer (3/5) 54 Client Cluster Broker Leader Partition Log (k,v) seq = 0 pid = 123 KafkaProducer.send(k,v) sequence num = 0 producer id = 123 Broker acknowledgement fails x
  • 55. Idempotent Producer (4/5) 55 Client Cluster Broker Leader Partition Log (k,v) seq = 0 pid = 123 (Client Retry) KafkaProducer.send(k,v) sequence num = 0 producer id = 123
  • 56. Idempotent Producer (5/5) 56 Client Cluster Broker Leader Partition Log (k,v) seq = 0 pid = 123 KafkaProducer.send(k,v) sequence num = 0 producer id = 123 Broker acknowledgement succeeds ack(duplicate)
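A sketch of enabling the idempotent producer with the plain Kafka client; the broker address and topic name are assumptions. With enable.idempotence set, a retried send (as in the lost-ack scenario above) carries the same producer id and sequence number, so the broker appends the record only once:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed broker
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true")
props.put(ProducerConfig.ACKS_CONFIG, "all") // required by idempotence

val producer = new KafkaProducer[String, String](props, new StringSerializer, new StringSerializer)
// If the broker's acknowledgement is lost, the client retry is de-duplicated
// by (producer id, sequence number) in the leader partition log.
producer.send(new ProducerRecord("destination-topic", "key", "value"))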
  • 57. Multiple Partition Atomic Writes 57 Client Consumer Offset Log Transactions Log User Defined Partition 1 User Defined Partition 2 User Defined Partition 3 Cluster Transaction and Consumer Group Coordinators CM UM UM CM UM UM CM UM UM CM CM CM CM CM CM Ex) Second phase of two phase commit KafkaProducer.commitTransaction() Last Offset Processed for Consumer Subscription Transaction Committed (internal) Transaction Committed control messages (user topics) Multiple Partitions Committed Atomically, “All or nothing”
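Underneath, the atomic multi-partition write is driven by the plain producer transaction API; Alpakka Kafka's Transactional.source/sink (shown later) issue these calls for you. A minimal sketch of the "consume, transform, produce" commit, assuming a transactional producer named producer, a source partition source-topic/0, a last processed offset of 42, and consumer group "my-group":

import java.util.Collections
import org.apache.kafka.clients.consumer.OffsetAndMetadata
import org.apache.kafka.common.TopicPartition

producer.initTransactions() // once per producer instance
producer.beginTransaction()
// ... producer.send(...) the transformed records to the destination topic(s) ...
producer.sendOffsetsToTransaction(
  Collections.singletonMap(new TopicPartition("source-topic", 0), new OffsetAndMetadata(43L)), // next offset to read
  "my-group")
producer.commitTransaction() // offsets and records become visible atomically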
  • 58. Consumer Read Isolation Level 58 Client User Defined Partition 1 User Defined Partition 2 User Defined Partition 3Cluster CM UM UM CM UM UM CM UM UM Kafka Consumer Properties: isolation.level: “read_committed”
  • 59. Transactional Pipeline Latency 59 Client Client Client Transaction Batches every 100ms End-to-end Latency ~300ms
  • 60. Alpakka Kafka Transactions 60 Transactional Source Transform Transactional Sink Source Kafka Partition(s) Destination Kafka Partitions ... Key: EN, Value: {“message”: “Hi Akka!” } Key: FR, Value: {“message”: “Salut Akka!” } Key: ES, Value: {“message”: “Hola Akka!” } ... ... Key: EN, Value: {“message”: “Bye Akka!” } Key: FR, Value: {“message”: “Au revoir Akka!” } Key: ES, Value: {“message”: “Adiós Akka!” } ... akka.kafka.producer.eos-commit-interval = 100ms Cluster Cluster Messages waiting for ack before commit openclipart
  • 61. Transactional GraphStage (1/7) 61 Transactional GraphStage Transaction Flow Back Pressure Status Resume Demand Waiting for ACK Commit Loop Waiting Transaction Status Begin Transaction Mailbox
  • 62. Transactional GraphStage (2/7) 62 Transactional GraphStage Transaction Flow Back Pressure Status Resume Demand Waiting for ACK Commit Loop Commit Interval Elapses Transaction Status Transaction is Open Mailbox Messages flowing
  • 63. Transactional GraphStage (3/7) 63 Transactional GraphStage Transaction Flow Back Pressure Status Resume Demand Waiting for ACK Transaction Status Transaction is Open Commit Loop Commit Interval Elapses Messages flowing Mailbox Commit loop “tick” message 100ms
  • 64. Transactional GraphStage (4/7) 64 Transactional GraphStage Transaction Flow Back Pressure Status Suspend Demand Waiting for ACK Transaction Status Transaction is Open Commit Loop Commit Interval Elapses x Mailbox Messages stopped
  • 65. Transactional GraphStage (5/7) 65 Transactional GraphStage Transaction Flow Back Pressure Status Suspend Demand Waiting for ACK Transaction Status Send Consumed Offsets Commit Loop Commit Interval Elapses x Mailbox Messages stopped
  • 66. Transactional GraphStage (6/7) 66 Transactional GraphStage Transaction Flow Back Pressure Status Suspend Demand Waiting for ACK Transaction Status Commit Transaction Commit Loop Commit Interval Elapses x Mailbox Messages stopped
  • 67. Transactional GraphStage (7/7) 67 Transactional GraphStage Transaction Flow Back Pressure Status Resume Demand Waiting for ACK Commit Loop Waiting Transaction Status Begin New Transaction Mailbox Messages flowing again
  • 68. Alpakka Kafka Transactions 68 val producerSettings = ProducerSettings(system, new StringSerializer, new ByteArraySerializer) .withBootstrapServers("localhost:9092") .withEosCommitInterval(100.millis) val control = Transactional .source(consumerSettings, Subscriptions.topics("source-topic")) .via(transform) .map { msg => ProducerMessage.single(new ProducerRecord[String, Array[Byte]]("sink-topic", msg.record.value), msg.partitionOffset) } .to(Transactional.sink(producerSettings, "transactional-id")) .run() Optionally provide a transaction commit interval (default is 100ms)
  • 69. Alpakka Kafka Transactions 69 val producerSettings = ProducerSettings(system, new StringSerializer, new ByteArraySerializer) .withBootstrapServers("localhost:9092") .withEosCommitInterval(100.millis) val control = Transactional .source(consumerSettings, Subscriptions.topics("source-topic")) .via(transform) .map { msg => ProducerMessage.single(new ProducerRecord[String, Array[Byte]]("sink-topic", msg.record.value), msg.partitionOffset) } .to(Transactional.sink(producerSettings, "transactional-id")) .run() Use Transactional.source to propagate necessary info to Transactional.sink (consumer group ID, offsets)
  • 70. Alpakka Kafka Transactions 70 val producerSettings = ProducerSettings(system, new StringSerializer, new ByteArraySerializer) .withBootstrapServers("localhost:9092") .withEosCommitInterval(100.millis) val control = Transactional .source(consumerSettings, Subscriptions.topics("source-topic")) .via(transform) .map { msg => ProducerMessage.single(new ProducerRecord[String, Array[Byte]]("sink-topic", msg.record.value), msg.partitionOffset) } .to(Transactional.sink(producerSettings, "transactional-id")) .run() Call Transactional.sink or .flow to produce and commit messages.
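The transform stage in the three snippets above is not defined on the slides; a hypothetical stand-in showing its shape. Transactional.source emits ConsumerMessage.TransactionalMessage values, and whatever the flow does, it must keep the message (and so its partitionOffset) available downstream; the key/value types match the ProducerRecord[String, Array[Byte]] used above:

import akka.NotUsed
import akka.kafka.ConsumerMessage.TransactionalMessage
import akka.stream.scaladsl.Flow

// Hypothetical placeholder: real logic would inspect or transform
// msg.record.value while passing the TransactionalMessage through.
val transform: Flow[TransactionalMessage[String, Array[Byte]],
                    TransactionalMessage[String, Array[Byte]], NotUsed] =
  Flow[TransactionalMessage[String, Array[Byte]]].map(identity)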
  • 72. What is Complex Event Processing (CEP)? 72 “ “ Complex event processing, or CEP, is event processing that combines data from multiple sources to infer events or patterns that suggest more complicated circumstances. Foundations of Complex Event Processing, Cornell
  • 73. Calling into an Akka Actor System 73 Source Ask ? SinkCluster Cluster “Ask pattern” models non-blocking request and response of Akka messages. openclipart Actor System & JVM Actor System & JVM Actor System & JVM Cluster Router Akka Cluster/Actor System Actor
  • 74. Actor System Integration class ProblemSolverRouter extends Actor { def receive = { case problem: Problem => val solution = businessLogic(problem) sender() ! solution // reply to the ask } } ... val control = Consumer .committableSource(consumerSettings, Subscriptions.topics("topic1", "topic2")) ... .mapAsync(parallelism = 5)(problem => (problemSolverRouter ? problem).mapTo[Solution]) ... .run() 74 Transform your stream by processing messages in an Actor System. All you need is an ActorRef.
  • 75. Actor System Integration class ProblemSolverRouter extends Actor { def receive = { case problem: Problem => val solution = businessLogic(problem) sender() ! solution // reply to the ask } } ... val control = Consumer .committableSource(consumerSettings, Subscriptions.topics("topic1", "topic2")) ... .mapAsync(parallelism = 5)(problem => (problemSolverRouter ? problem).mapTo[Solution]) ... .run() 75 Use Ask pattern (? function) to call provided ActorRef to get an async response
  • 76. Actor System Integration class ProblemSolverRouter extends Actor { def receive = { case problem: Problem => val solution = businessLogic(problem) sender() ! solution // reply to the ask } } ... val control = Consumer .committableSource(consumerSettings, Subscriptions.topics("topic1", "topic2")) ... .mapAsync(parallelism = 5)(problem => (problemSolverRouter ? problem).mapTo[Solution]) ... .run() 76 Parallelism limits how many messages are in flight at once, so we don’t overwhelm the destination Actor’s mailbox and we maintain stream back-pressure.
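The snippets above elide the implicit akka.util.Timeout that the ask (?) operator requires, as well as the message types and the actor reference being asked. A minimal sketch of those supporting pieces; the names, the 5 second timeout, and the message payloads are assumptions:

import akka.actor.{ActorRef, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._

// Hypothetical request/response messages handled by ProblemSolverRouter
case class Problem(payload: String)
case class Solution(payload: String)

// The ask (?) operator needs an implicit Timeout; the returned Future fails
// if no reply arrives within it. 5 seconds is an arbitrary choice.
implicit val askTimeout: Timeout = Timeout(5.seconds)

// Assumes the stream's ActorSystem is in scope as `system`
val problemSolverRouter: ActorRef =
  system.actorOf(Props[ProblemSolverRouter], "problem-solver")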
  • 78. Options for implementing Stateful Streams 1. Provided Akka Streams stages: fold, scan, etc. 2. Custom GraphStage 3. Call into an Akka Actor System 78
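Option 1 from the list above can be as small as a built-in scan stage. A minimal sketch, assuming the keys arrive downstream as plain strings; note the state lives only inside the running stage and is lost on restart, which is what the event-sourced approach on the following slides addresses:

import akka.NotUsed
import akka.stream.scaladsl.Flow

// Keeps a running count of messages per key; the Map is the stage's state.
val countPerKey: Flow[String, Map[String, Long], NotUsed] =
  Flow[String].scan(Map.empty[String, Long]) { (counts, key) =>
    counts.updated(key, counts.getOrElse(key, 0L) + 1)
  }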
  • 79. Persistent Stateful Stages using Event Sourcing 79 1. Recover state after failure 2. Create an event log 3. Share state Sound familiar? KTables!
  • 80. Persistent GraphStage using Event Sourcing 80 Source Stateful Stage SinkCluster Cluster Event Log Response (Event) Triggers State Change akka.persistence.Journal State Akka Persistence Plugins Request Handler Event Handler Request (Command/Query) Writes Reads (Replays) openclipart
  • 81. 81 krasserm / akka-stream-eventsourcing “ “This project brings to Akka Streams what Akka Persistence brings to Akka Actors: persistence via event sourcing. Experimental Public Domain Vectors
  • 82. New in Alpakka Kafka 1.0
  • 83. Alpakka Kafka 1.0 Release Notes Released Feb 28, 2019. Highlights: ● Upgraded the Kafka client to version 2.0.0 #544 by @fr3akX ○ Support new API’s from KIP-299: Fix Consumer indefinite blocking behaviour in #614 by @zaharidichev ● New Committer.sink for standardised committing #622 by @rtimush ● Commit with metadata #563 and #579 by @johnclara ● Factored out akka.kafka.testkit for internal and external use: see Testing ● Support for merging commit batches #584 by @rtimush ● Reduced risk of message loss for partitioned sources #589 ● Expose Kafka errors to stream #617 ● Java APIs for all settings classes #616 ● Much more comprehensive tests 83