11. Streams
“You cannot enter the same river twice”
~ Heraclitus
http://en.wikiquote.org/wiki/Heraclitus
12. Streams
Real Time Stream Processing
When you attach “late” to a Publisher,
you may miss initial elements – it’s a river of data.
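The "river" behaviour above can be sketched in plain Scala. This is a toy model, not any library's API: `HotPublisher`, `subscribe`, and `emit` are hypothetical names invented for the illustration. It only shows that a publisher which keeps no history cannot replay elements to a subscriber that attaches late.

```scala
import scala.collection.mutable.ListBuffer

object HotStreamSketch {
  // Hypothetical minimal "hot" publisher: each element goes to whoever is
  // subscribed at that moment; nothing is stored for late arrivals.
  final class HotPublisher[A] {
    private val subscribers = ListBuffer.empty[A => Unit]

    def subscribe(onNext: A => Unit): Unit = {
      subscribers += onNext
      ()
    }

    def emit(element: A): Unit = subscribers.foreach(s => s(element))
  }

  // Returns what an early and a late subscriber each observed.
  def demo(): (List[Int], List[Int]) = {
    val publisher = new HotPublisher[Int]
    val early = ListBuffer.empty[Int]
    val late = ListBuffer.empty[Int]

    publisher.subscribe(x => { early += x; () })
    publisher.emit(1)
    publisher.emit(2)
    // Attaches after 1 and 2 have already flowed past.
    publisher.subscribe(x => { late += x; () })
    publisher.emit(3)

    (early.toList, late.toList)
  }
}
```

In this sketch the early subscriber observes all three elements, while the late one only observes the element emitted after it attached.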
21. Reactive Streams - Inter-op
We want to make different implementations
co-operate with each other.
http://reactive-streams.org
22. Reactive Streams - Inter-op
The different implementations “talk to each other”
using the Reactive Streams protocol.
23. Reactive Streams - Inter-op
The Reactive Streams SPI is NOT meant to be a user-facing API.
You should use one of the implementing libraries.
46. Back-pressure? RS: Dynamic Push/Pull
Just push – not safe when the Subscriber is slow
Just pull – too slow when the Subscriber is fast
Solution:
Dynamic adjustment
47. Back-pressure? RS: Dynamic Push/Pull
Slow Subscriber sees its buffer can take 3 elements.
The Publisher will never overflow that buffer.
48. Back-pressure? RS: Dynamic Push/Pull
Fast Publisher will send at most 3 elements.
This is pull-based back-pressure.
50. Back-pressure? RS: Dynamic Push/Pull
Fast Subscriber can issue more Request(n),
before more data arrives.
Publisher can accumulate demand.
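The Request(n) handshake described on these slides can be modelled in a few lines of plain Scala. This is a single-threaded sketch under stated assumptions, not the Reactive Streams interfaces themselves: `BoundedPublisher`, `request`, and `pushAvailable` are hypothetical names. It illustrates two points: the publisher never sends more than the outstanding demand, and repeated requests add up.

```scala
import scala.collection.mutable.ListBuffer

object DemandSketch {
  // Hypothetical publisher that only pushes while the subscriber has
  // signalled demand.
  final class BoundedPublisher(elements: Iterator[Int], deliver: Int => Unit) {
    private var demand = 0L // outstanding demand signalled by the subscriber

    // Subscriber side: request(n) adds to the demand, it does not replace it.
    def request(n: Long): Unit = {
      require(n > 0, "demand must be positive")
      demand += n
    }

    // Publisher side: push while there is both data and demand.
    // Returns how many elements were actually sent.
    def pushAvailable(): Int = {
      var sent = 0
      while (demand > 0 && elements.hasNext) {
        deliver(elements.next())
        demand -= 1
        sent += 1
      }
      sent
    }
  }

  def demo(): (Int, Int, List[Int]) = {
    val received = ListBuffer.empty[Int]
    val publisher =
      new BoundedPublisher(Iterator.from(1), x => { received += x; () })

    publisher.request(3)                  // "my buffer can take 3 elements"
    val first = publisher.pushAvailable() // sends exactly 3, then stops

    publisher.request(2)                  // more demand arrives before data...
    publisher.request(2)                  // ...and accumulates to 4
    val second = publisher.pushAvailable()

    (first, second, received.toList)
  }
}
```

Even though the source here is infinite, the publisher stops after three elements and resumes only once new demand has been signalled.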
59. Akka
Akka is a high-performance concurrency
library for Scala and Java.
At its core it focuses on the Actor Model:
An Actor can only:
• Send and receive messages
• Create Actors
• Change its behaviour
60. Akka
import akka.actor.Actor

class Player extends Actor {
  // NextTurn and Move are the game's message types, defined elsewhere
  def receive = {
    case NextTurn => sender() ! decideOnMove()
  }
  def decideOnMove(): Move = ???
}
67. Akka Streams – Linear Flow
Flow[Double].map(_.toInt). [...]
No Source attached yet.
“Pipe ready to work with Doubles”.
68. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
An ActorSystem is the world in which Actors live.
Akka Streams uses Actors, so it needs an ActorSystem.
69. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
Contains logic on HOW to materialise the stream.
70. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
A materialiser chooses HOW to materialise a Stream.
The Flow’s AST is fully “lifted”.
The Materialiser can choose to materialise the Flow in any way it sees fit.
Our implementation uses Actors.
But you could easily plug in a SparkMaterializer!
71. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
You can configure its buffer sizes etc.
72. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
val foreachSink = Sink.foreach[Int](println)
val mf = Source(1 to 3).runWith(foreachSink)
73. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
val foreachSink = Sink.foreach[Int](println)
val mf = Source(1 to 3).runWith(foreachSink)(mat)
runWith takes the FlowMaterializer as an implicit parameter (written out explicitly here).
74. Akka Streams – Linear Flow
implicit val sys = ActorSystem("tokyo-sys")
implicit val mat = FlowMaterializer()
// sugar for runWith
Source(1 to 3).foreach(println)
75. Akka Streams – Linear Flow
val mf = Flow[Int].
map(_ * 2).
runWith(Sink.foreach(println))
// is missing a Source,
// can NOT run == won’t compile!
f.connect(Source(1 to 10)).run()
80. Akka Streams – Linear Flow
val f = Flow[Int].
map(_ * 2).
runWith(Sink.foreach(i => println(s"i = $i")))
// needs Source to run!
f.connect(Source(1 to 10)).run()
With a Source attached… it can run()
81. Akka Streams – Linear Flow
Flow[Int].
map(_.toString).
runWith(Source(1 to 10), Sink.ignore)
Connects Source and Sink, then runs
82. Akka Streams – Flows are reusable
f.withSource(IterableSource(1 to 10)).run()
f.withSource(IterableSource(1 to 100)).run()
f.withSource(IterableSource(1 to 1000)).run()
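Why a flow can be reused like this is easier to see in a small model. The sketch below is plain Scala and does not use Akka's API: `Pipe`, `RunnablePipe`, `start`, and this `withSource` are hypothetical names chosen for the illustration. It assumes only that a flow is an immutable description of steps, and that work happens when a source is attached and `run()` is called.

```scala
object BlueprintSketch {
  // An immutable description of processing steps; nothing runs when it is built.
  final case class Pipe[In, Out](transform: Iterator[In] => Iterator[Out]) {
    def map[B](f: Out => B): Pipe[In, B] =
      Pipe[In, B](in => transform(in).map(f))

    // Attaching a source yields something runnable; the Pipe itself is unchanged.
    def withSource(source: Iterable[In]): RunnablePipe[Out] =
      RunnablePipe[Out](() => transform(source.iterator))
  }

  object Pipe {
    // Starting point: a pipe that passes elements through untouched.
    def start[A]: Pipe[A, A] = Pipe[A, A](in => in)
  }

  final case class RunnablePipe[Out](start: () => Iterator[Out]) {
    // Each run() pulls a fresh iterator from the source, so runs are independent.
    def run(): List[Out] = start().toList
  }

  val doubled: Pipe[Int, Int] = Pipe.start[Int].map(_ * 2)

  def demo(): (List[Int], List[Int]) =
    (doubled.withSource(1 to 3).run(), doubled.withSource(1 to 5).run())
}
```

The same `doubled` value is attached to two different sources here, and neither run affects the other, because the description holds no per-run state.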
84. Akka Streams <-> Actors – Advanced
val subscriber = ActorSubscriber(
  system.actorOf(Props[SubStreamParent], "parent"))
Source(1 to 100).
map(_.toString).
filter(_.length == 2).
drop(2).
groupBy(_.last).
runWith(subscriber)
Each “group” is a stream too! It’s a “Stream of Streams”.
85. Akka Streams <-> Actors – Advanced
groupBy(_.last).
GroupBy groups “11” to group “1”, “12” to group “2” etc.
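To see which elements end up in which group, the same operator chain can be run on an ordinary in-memory collection. This is only an analogy using the Scala standard library: the collection version returns a finished `Map`, whereas the stream version emits each group as its own sub-stream.

```scala
object GroupBySketch {
  // Same operator chain as on the slides, applied to a strict collection.
  val groups: Map[Char, Seq[String]] =
    (1 to 100)
      .map(_.toString)
      .filter(_.length == 2) // keeps "10" .. "99"
      .drop(2)               // drops "10" and "11"
      .groupBy(_.last)       // key = last character of the string
}
```

With this input there are ten groups, keyed '0' to '9'. Because `drop(2)` removes "10" and "11", the first element in group '2' is "12" and the first in group '1' is "21".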
86. Akka Streams <-> Actors – Advanced
groupBy(_.last).
It offers (groupKey, subStreamSource) to Subscriber
87. Akka Streams <-> Actors – Advanced
groupBy(_.last).
It can then start children, to handle the sub-flows!
88. Akka Streams <-> Actors – Advanced
groupBy(_.last).
For example, one child for each group.
89. Akka Streams <-> Actors – Advanced
val subscriber = ActorSubscriber(
  system.actorOf(Props[SubStreamParent], "parent"))
Source(1 to 100).
map(_.toString).
filter(_.length == 2).
drop(2).
groupBy(_.last).
runWith(subscriber)
The Actor will consume the sub-stream offers.
99. Akka Streams – FlowGraph
Linear Flows
or
non-Akka pipelines
Could be another RS implementation!
100. Akka Streams – FlowGraph
Fan-out elements
and
Fan-in elements
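As a rough intuition for the two kinds of junction, here is a plain-Scala sketch on strict sequences. The `broadcast` and `merge` functions are hypothetical helpers written for this illustration and are not Akka's `Broadcast` or `Merge` elements. A real stream merge also makes no promise about ordering across its inputs, so the sketch sorts the result before comparing.

```scala
object FanSketch {
  // Fan-out: every downstream branch sees every element.
  def broadcast[A](in: Seq[A], branches: Int): Seq[Seq[A]] =
    Seq.fill(branches)(in)

  // Fan-in: several inputs are combined into one output.
  // Here the inputs are simply concatenated; a streaming merge would interleave.
  def merge[A](ins: Seq[A]*): Seq[A] = ins.flatten

  def demo(): Seq[Int] = {
    val branches = broadcast(Seq(3, 5, 7), branches = 2)
    val scaled = branches(0).map(_ * 10)  // first branch: 30, 50, 70
    val large = branches(1).filter(_ > 3) // second branch: 5, 7
    merge(scaled, large).sorted
  }
}
```

Each branch processes its own copy of the input independently, and the merged output contains the results of both branches.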
101. Akka Streams – FlowGraph
// first define some pipeline pieces
val f1 = Flow[Input].map(_.toIntermediate)
val f2 = Flow[Intermediate].map(_.enrich)
val f3 = Flow[Enriched].filter(_.isImportant)
val f4 = Flow[Intermediate].mapFuture(_.enrichAsync)
// then add input and output placeholders
val in = SubscriberSource[Input]
val out = PublisherSink[Enriched]
103. Akka Streams – FlowGraph
val b3 = Broadcast[Int]("b3")
val b7 = Broadcast[Int]("b7")
val b11 = Broadcast[Int]("b11")
val m8 = Merge[Int]("m8")
val m9 = Merge[Int]("m9")
val m10 = Merge[Int]("m10")
val m11 = Merge[Int]("m11")
val in3 = Source(List(3))
val in5 = Source(List(5))
val in7 = Source(List(7))
110. Akka Streams – FlowGraph
Sinks and Sources are “keys”
which can be addressed within the graph
val resultFuture2 = Sink.future[Seq[Int]]
val resultFuture9 = Sink.future[Seq[Int]]
val resultFuture10 = Sink.future[Seq[Int]]
val g = FlowGraph { implicit b =>
  // ...
  m10 ~> Flow[Int].grouped(1000) ~> resultFuture10
  // ...
}.run()
Await.result(g.get(resultFuture2), 3.seconds).sorted should be(List(5, 7))
112. Akka Streams – FlowGraph
val g = FlowGraph {}
FlowGraph is immutable and safe to share and re-use!
119. Rough plans
• 0.9 released
• 0.10 in 2~3 weeks
• 1.0 “soon” after…
• Means “stabilised APIs”
• Will not yet be performance tuned, though it’s already
pretty good… we know where and how we can tune it
for 1.x.
0.7 early preview
120. Java DSL
• Partial Java DSL in 0.9 (released)
• Full Java DSL in 0.10 (in 2~3 weeks)
• as 1st class citizen (!)
121. Spray => Akka-Http && ReactiveStreams
Spray is now merged into Akka, as Akka-Http
Works on Reactive Streams
Streaming end-to-end!