This document discusses backpressure in Akka Streams, with examples from envy. It begins by defining backpressure and its importance, then explains how backpressure is implemented in Akka Streams using the publisher/subscriber model and demand signalling. Examples show how to handle backpressure when writing to MongoDB (to avoid overloading it) and when receiving messages from RabbitMQ (to limit the rate). The key aspects covered are batching, throttling, and how backpressure propagates upstream.
Backpressure in akka-streams
1. Today we will talk about Akka Streams and backpressure,
and we will show examples in envy
Backpressure in Akka-streams
2. At the end of this talk you should be able to answer
these questions
Keep these in mind ...
• What is backpressure, and why do I need it?
• How is backpressure implemented in akka-streams?
• What is the speed of a graph when there is backpressure?
• How is envy handling backpressure?
• What is the link between batching, throttling and backpressure?
• Why is Erik using Z everywhere? (systemaz, signupz, ...)
3. Terminology to impress your friendz
A set of interfaces and components and their relations:
• Reactive Streams: a high-level specification for stream libraries
• Publisher: generates elements (can be unbounded); Source[Events] in akka-streams
• Subscriber: subscribes to a publisher; Sink[Events] or Flow[In,Out] in akka-streams
• Backpressure: ...
7. The answer is the new arrow: the feedback to the publisher
that tells it how much room is left in the subscriber's buffer, even if that is only 1/s
8. For backpressure to be possible, these features
need to be implemented in the Publisher
(in Kafka this is possible to do):
- Not generate elements, if it is able to control their production rate,
- Try buffering the elements in a bounded manner until more demand is signalled,
- Drop elements until more demand is signalled,
- Tear down the stream if unable to apply any of the above strategies.
Fast Publisher (Kafka source), slow Subscriber (Mongo sink):
"Wait a second! Intel is sending more data than I can handle!!"
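The bounded-buffering and dropping strategies above can be sketched in plain Scala. This is a toy model for illustration only, not the Akka Streams API (where the equivalent is `buffer(n, OverflowStrategy.backpressure / dropHead / fail)`); the class name and capacity parameter are invented for the example.

```scala
import scala.collection.mutable

// Toy subscriber with a bounded buffer: new elements are dropped when
// the buffer is full, until draining (demand) frees space again.
class DroppingSubscriber[T](capacity: Int) {
  private val buffer = mutable.Queue.empty[T]
  var dropped = 0

  def onNext(element: T): Unit =
    if (buffer.size < capacity) buffer.enqueue(element)
    else dropped += 1 // bounded: drop instead of growing without limit

  // consuming an element frees one slot of capacity
  def drain(): Option[T] =
    if (buffer.nonEmpty) Some(buffer.dequeue()) else None

  def buffered: Int = buffer.size
}
```

In a real stream the "drain" side is the downstream stage signalling demand; here it is just a method call, which is enough to show why the buffer stays bounded.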
9. The protocol defines how many elements the Subscriber can receive, and that is the demand.
There is a guarantee that the Publisher cannot send more than the demand.
Backpressure protocol
The back pressure protocol is defined in terms of the number of
elements a downstream Subscriber is able to receive and buffer,
referred to as demand. The source of data, referred to as Publisher in
Reactive Streams terminology and implemented as Source in Akka
Streams, guarantees that it will never emit more elements than the
received total demand for any given Subscriber.
Publisher.emit <= Subscriber.total_demand
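That guarantee can be sketched in plain Scala. This is a toy model of the demand accounting only, not the real org.reactivestreams API (which also has onSubscribe/onError/onComplete); the class and method names are invented for the example.

```scala
// Toy publisher that tracks total demand and never emits more
// elements than the subscriber has requested so far.
class DemandPublisher[T](elements: Iterator[T]) {
  private var totalDemand = 0L
  var emitted = 0L

  // The subscriber signals demand: "I can receive n more elements."
  // The publisher emits only while emitted < totalDemand.
  def request(n: Long)(onNext: T => Unit): Unit = {
    totalDemand += n
    while (emitted < totalDemand && elements.hasNext) {
      onNext(elements.next())
      emitted += 1
    }
  }
}
```

Even though the iterator holds 100 elements, nothing is emitted until demand is signalled, and the invariant `emitted <= totalDemand` holds at all times.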
10. The slowest branch (Mongo) slows down the graph
(backpressure propagates to the source)
11. This is a graph: source, flow, sink. Different branches:
fast Rabbit, slow Mongo
12. There is no mention of backpressure; it's built into the
stages. The mongo writer will tell the KafkaSource to slow down,
backpressure will propagate to the source, and batching will start
to work.
Kafka is faster than Mongo
//Publisher
KafkaSource[Event]
  // batch events in a List when mongo is too slow to handle the load
  .batch(batchSize, List(_))((els, el) => el :: els)
  // incidents are snapshots, group by id and keep only the last one
  .map(keepLatestEvent)
  // write to mongo in batches, can be very slow (~ 1 min)
  // creates backpressure for the source
  .via(mongoEventWriter)
//Subscriber
13. Batch per size: Mongo has a
hard limit of 16MB per batch
Source(updates)
  // split my list in batches of maximum 16MB
  .batchWeighted(MongoMaxDocumentSize, estimateUpdatesSize, List(_))((s, o) => o :: s)
  // insert to mongo in batches
  .mapAsync(1) { updateElements =>
    val batchCommand = BatchCommands.UpdateCommand(updateElements)
    runBatchUpdateCommand(batchCommand): Future[Unit]
  }
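The weighted-batching semantics can be sketched as a plain Scala fold. This is a toy model for illustration: the real `batchWeighted` operator only accumulates while the downstream is slow, whereas this function batches eagerly; the function name and weight parameter are invented for the example.

```scala
// Toy version of weighted batching: group elements into batches whose
// total weight stays under a limit (e.g. Mongo's 16MB document cap).
// Elements heavier than the limit still get a batch of their own.
def batchWeighted[T](elements: List[T], maxWeight: Long, weight: T => Long): List[List[T]] =
  elements.foldLeft((List.empty[List[T]], List.empty[T], 0L)) {
    case ((done, current, w), el) =>
      val elW = weight(el)
      if (current.nonEmpty && w + elW > maxWeight)
        (current.reverse :: done, List(el), elW) // close the batch, start a new one
      else
        (done, el :: current, w + elW) // element still fits in the current batch
  } match {
    case (done, Nil, _)     => done.reverse
    case (done, current, _) => (current.reverse :: done).reverse
  }
```

With a weight limit of 4 and string length as the weight, `List("aa", "bbb", "c", "dddd")` splits into `List(List("aa"), List("bbb", "c"), List("dddd"))`: each batch is as full as it can be without crossing the limit.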
14. Create backpressure to slow
down RabbitMQ
Flow[NewNotifications.Notification]
  .conflateWithSeed(List(_))((list, elem) => elem :: list)
  .mapConcat { batch =>
    // too many notifications: replace them by 1 refresh notif,
    // the UI will read from mongo instead
    if (batch.size > threshold) {
      List(NewNotifications.Refresh(organization, System.currentTimeMillis()))
    } else { batch.reverse }
  }
  // create backpressure by limiting the rate to 10/s
  .throttle(10, 1.second, 10, ThrottleMode.shaping)
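The conflate-then-replace step can be sketched in plain Scala. This is a toy model: the real `conflateWithSeed` stage only accumulates while the downstream is backpressuring, and the `Notification`/`Update`/`Refresh` types and `collapse` function below are invented stand-ins for envy's real ones.

```scala
sealed trait Notification
case class Update(id: Int) extends Notification
case class Refresh(at: Long) extends Notification

// Toy version of the conflate + mapConcat step: if too many
// notifications piled up while downstream was slow, replace the
// whole batch with a single Refresh; otherwise pass them through.
def collapse(batch: List[Notification], threshold: Int): List[Notification] =
  if (batch.size > threshold) List(Refresh(System.currentTimeMillis()))
  else batch
```

Combined with the 10/s throttle, this bounds both the rate and the size of what the UI has to process: slow consumers see one cheap Refresh instead of a flood of individual updates.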