Dr. Roland Kuhn
Akka Tech Lead
@rolandkuhn
Reactive Streams
Handling Data-Flows the Reactive Way
Introduction: Streams
What is a Stream?
• ephemeral flow of data
• focused on describing transformation
• possibly unbounded in size
3
Common uses of Streams
• bulk data transfer
• real-time data sources
• batch processing of large data sets
• monitoring and analytics
4
What is special about Reactive Streams?
Reactive	
  Applications
The Four Reactive Traits
6
http://reactivemanifesto.org/
Needed: Asynchrony
• Resilience demands it:
• encapsulation
• isolation
• Scalability demands it:
• distribution across nodes
• distribution across cores
7
Many Kinds of Async Boundaries
8
Many Kinds of Async Boundaries
• between different applications
8
Many Kinds of Async Boundaries
• between different applications
• between network nodes
8
Many Kinds of Async Boundaries
• between different applications
• between network nodes
• between CPUs
8
Many Kinds of Async Boundaries
• between different applications
• between network nodes
• between CPUs
• between threads
8
Many Kinds of Async Boundaries
• between different applications
• between network nodes
• between CPUs
• between threads
• between actors
8
The Problem:
!
Getting Data across an Async Boundary
Possible Solutions
10
Possible Solutions
• the Traditional way: blocking calls
10
Possible Solutions
10
Possible Solutions
!
• the Push way: buffering and/or dropping
11
Possible Solutions
11
Possible Solutions
!
!
• the Reactive way:

non-blocking & non-dropping & bounded
12
How do we achieve that?
Supply and Demand
• data items flow downstream
• demand flows upstream
• data items flow only when there is demand
• recipient is in control of incoming data rate
• data in flight is bounded by signaled demand
14
Publisher Subscriber
data
demand
Dynamic Push–Pull
• “push” behavior when consumer is faster
• “pull” behavior when producer is faster
• switches automatically between these
• batching demand allows batching data
15
Publisher Subscriber
data
demand
Explicit Demand: Tailored Flow Control
16
demand
data
splitting the data means merging the demand
Explicit Demand: Tailored Flow Control
17
merging the data means splitting the demand
Reactive Streams
• asynchronous non-blocking data flow
• asynchronous non-blocking demand flow
• minimal coordination and contention
• message passing allows for distribution
• across applications
• across nodes
• across CPUs
• across threads
• across actors
18
Are Streams Collections?
What is a Collection?
20
What is a Collection?
• Oxford Dictionary:
• “a group of things or people”
20
What is a Collection?
• Oxford Dictionary:
• “a group of things or people”
• wikipedia:
• “a grouping of some variable number of data items”
20
What is a Collection?
• Oxford Dictionary:
• “a group of things or people”
• wikipedia:
• “a grouping of some variable number of data items”
• backbone.js:
• “collections are simply an ordered set of models”
20
What is a Collection?
• Oxford Dictionary:
• “a group of things or people”
• wikipedia:
• “a grouping of some variable number of data items”
• backbone.js:
• “collections are simply an ordered set of models”
• java.util.Collection:
• definite size, provides an iterator, query membership
20
User Expectations
• an Iterator is expected to visit all elements

(especially with immutable collections)
• x.head + x.tail == x
• the contents does not depend on who is
processing the collection
• the contents does not depend on when the
processing happens

(especially with immutable collections)
21
Streams have Unexpected Properties
• the observed sequence depends on
• … when the observer subscribed to the stream
• … whether the observer can process fast enough
• … whether the streams flows fast enough
22
Streams are not Collections!
• java.util.stream:

Stream is not derived from Collection

“Streams differ from Coll’s in several ways”
• no storage
• functional in nature
• laziness seeking
• possibly unbounded
• consumable
23
Streams are not Collections!
• a collection can be streamed
• a stream observer can create a collection
• … but saying that a Stream is just a lazy
Collection evokes the wrong associations
24
So, Reactive Streams:
why not just java.util.stream.Stream?
Java 8 Stream
26
import java.util.stream.*;
!
// get some stream
final Stream<Integer> s = Stream.of(1, 2, 3);
// describe transformation
final Stream<String> s2 = s.map(i -> "a" + i);
// make a pull collection
s2.iterator();
// or alternatively push it somewhere
s2.forEach(i -> System.out.println(i));
// (need to pick one, Stream is consumable)
Java 8 Stream
• provides a DSL for describing transformation
• introduces staged computation

(but does not allow reuse)
• prescribes an eager model of execution
• offers either push or pull, chosen statically
27
What about RxJava?
RxJava
29
import rx.Observable;
import rx.Observable.*;
!
// get some stream source
final Observable<Integer> obs = range(1, 3);
// describe transformation
final Observable<String> obs2 =
obs.map(i -> "b" + i);
// and use it twice
obs2.subscribe(i -> System.out.println(i));
obs2.filter(i -> i.equals("b2"))
.subscribe(i -> System.out.println(i));
RxJava
• implements pure “push” model
• includes extensive DSL for transformations
• only allows blocking for back pressure
• currently uses unbounded buffering for
crossing an async boundary
• work on distributed Observables sparked
participation in Reactive Streams
30
The Reactive Streams Project
Participants
• Engineers from
• Netflix
• Oracle
• Pivotal
• Red Hat
• Twitter
• Typesafe
• Individuals like Doug Lea and Todd Montgomery
32
The Motivation
• all participants had the same basic problem
• all are building tools for their community
• a common solution benefits everybody
• interoperability to make best use of efforts
• e.g. use Reactor data store driver with Akka
transformation pipeline and Rx monitoring to drive a
vert.x REST API (purely made up, at this point)
33
see also Jon Brisbin’s post on “Tribalism as a Force for Good”
Recipe for Success
• minimal interfaces
• rigorous specification of semantics
• full TCK for verification of implementation
• complete freedom for many idiomatic APIs
34
The Meat
35
trait Publisher[T] {
def subscribe(sub: Subscriber[T]): Unit
}
trait Subscription {
def requestMore(n: Int): Unit
def cancel(): Unit
}
trait Subscriber[T] {
def onSubscribe(s: Subscription): Unit
def onNext(elem: T): Unit
def onError(thr: Throwable): Unit
def onComplete(): Unit
}
The Sauce
• all calls on Subscriber must dispatch async
• all calls on Subscription must not block
• Publisher is just there to create Subscriptions
36
How does it Connect?
37
SubscriberPublisher
subscribe
onSubscribeSubscription
requestMore
onNext
Akka Streams
Akka Streams
• powered by Akka Actors
• execution
• distribution
• resilience
• type-safe streaming through Actors with
bounded buffering
39
Basic Akka Example
40
implicit val system = ActorSystem("Sys")
val mat = FlowMaterializer(...)
!
Flow(text.split("s").toVector).
map(word => word.toUpperCase).
foreach(tranformed => println(tranformed)).
onComplete(mat) {
case Success(_) => system.shutdown()
case Failure(e) =>
println("Failure: " + e.getMessage)
system.shutdown()
}
More Advanced Akka Example
41
val s = Source.fromFile("log.txt", "utf-8")
Flow(s.getLines()).
groupBy {
case LogLevelPattern(level) => level
case other => "OTHER"
}.
foreach { case (level, producer) =>
val out = new PrintWriter(level+".txt")
Flow(producer).
foreach(line => out.println(line)).
onComplete(mat)(_ => Try(out.close()))
}.
onComplete(mat)(_ => Try(s.close()))
Java 8 Example
42
final ActorSystem system = ActorSystem.create("Sys");
final MaterializerSettings settings =
MaterializerSettings.create();
final FlowMaterializer materializer =
FlowMaterializer.create(settings, system);
!
final String[] lookup = { "a", "b", "c", "d", "e", "f" };
final Iterable<Integer> input = Arrays.asList(0, 1, 2, 3, 4, 5);
!
Flow.create(input).drop(2).take(3). // leave 2, 3, 4
map(elem -> lookup[elem]). // translate to "c","d","e"
filter(elem -> !elem.equals("c")). // filter out the "c"
grouped(2). // make into a list
mapConcat(list -> list). // flatten the list
fold("", (acc, elem) -> acc + elem). // accumulate into "de"
foreach(elem -> System.out.println(elem)). // print it
consume(materializer);
Akka TCP Echo Server
43
val future = IO(StreamTcp) ? Bind(addr, ...)
future.onComplete {
case Success(server: TcpServerBinding) =>
println("listen at "+server.localAddress)
!
Flow(server.connectionStream).foreach{ c =>
println("client from "+c.remoteAddress)
c.inputStream.produceTo(c.outputStream)
}.consume(mat)
!
case Failure(ex) =>
println("cannot bind: "+ex)
system.shutdown()
}
Akka HTTP Sketch (NOT REAL CODE)
44
val future = IO(Http) ? RequestChannelSetup(...)
future.onSuccess {
case ch: HttpRequestChannel =>
val p = ch.processor[Int]
val body = Flow(new File(...)).toProducer(mat)
Flow(HttpRequest(method = POST,
uri = "http://ex.amp.le/42",
entity = body) -> 42)
.produceTo(mat, p)
Flow(p).map { case (resp, token) =>
val out = FileSink(new File(...))
Flow(resp.entity.data).produceTo(mat, out)
}.consume(mat)
}
Closing Remarks
Current State
• Early Preview is available:

"org.reactivestreams" % "reactive-streams-spi" % "0.2"

"com.typesafe.akka" %% "akka-stream-experimental" % "0.3"
• check out the Activator template

"Akka Streams with Scala!"

(https://github.com/typesafehub/activator-akka-stream-scala)
46
Next Steps
• we work towards inclusion in future JDK
• try it out and give feedback!
• http://reactive-streams.org/
• https://github.com/reactive-streams
47
©Typesafe 2014 – All Rights Reserved

Reactive Streams: Handling Data-Flow the Reactive Way

  • 1.
    Dr. Roland Kuhn AkkaTech Lead @rolandkuhn Reactive Streams Handling Data-Flows the Reactive Way
  • 2.
  • 3.
    What is aStream? • ephemeral flow of data • focused on describing transformation • possibly unbounded in size 3
  • 4.
    Common uses ofStreams • bulk data transfer • real-time data sources • batch processing of large data sets • monitoring and analytics 4
  • 5.
    What is specialabout Reactive Streams?
  • 6.
    Reactive  Applications The FourReactive Traits 6 http://reactivemanifesto.org/
  • 7.
    Needed: Asynchrony • Resiliencedemands it: • encapsulation • isolation • Scalability demands it: • distribution across nodes • distribution across cores 7
  • 8.
    Many Kinds ofAsync Boundaries 8
  • 9.
    Many Kinds ofAsync Boundaries • between different applications 8
  • 10.
    Many Kinds ofAsync Boundaries • between different applications • between network nodes 8
  • 11.
    Many Kinds ofAsync Boundaries • between different applications • between network nodes • between CPUs 8
  • 12.
    Many Kinds ofAsync Boundaries • between different applications • between network nodes • between CPUs • between threads 8
  • 13.
    Many Kinds ofAsync Boundaries • between different applications • between network nodes • between CPUs • between threads • between actors 8
  • 14.
    The Problem: ! Getting Dataacross an Async Boundary
  • 15.
  • 16.
    Possible Solutions • theTraditional way: blocking calls 10
  • 17.
  • 18.
    Possible Solutions ! • thePush way: buffering and/or dropping 11
  • 19.
  • 20.
    Possible Solutions ! ! • theReactive way:
 non-blocking & non-dropping & bounded 12
  • 21.
    How do weachieve that?
  • 22.
    Supply and Demand •data items flow downstream • demand flows upstream • data items flow only when there is demand • recipient is in control of incoming data rate • data in flight is bounded by signaled demand 14 Publisher Subscriber data demand
  • 23.
    Dynamic Push–Pull • “push”behavior when consumer is faster • “pull” behavior when producer is faster • switches automatically between these • batching demand allows batching data 15 Publisher Subscriber data demand
  • 24.
    Explicit Demand: TailoredFlow Control 16 demand data splitting the data means merging the demand
  • 25.
    Explicit Demand: TailoredFlow Control 17 merging the data means splitting the demand
  • 26.
    Reactive Streams • asynchronousnon-blocking data flow • asynchronous non-blocking demand flow • minimal coordination and contention • message passing allows for distribution • across applications • across nodes • across CPUs • across threads • across actors 18
  • 27.
  • 28.
    What is aCollection? 20
  • 29.
    What is aCollection? • Oxford Dictionary: • “a group of things or people” 20
  • 30.
    What is aCollection? • Oxford Dictionary: • “a group of things or people” • wikipedia: • “a grouping of some variable number of data items” 20
  • 31.
    What is aCollection? • Oxford Dictionary: • “a group of things or people” • wikipedia: • “a grouping of some variable number of data items” • backbone.js: • “collections are simply an ordered set of models” 20
  • 32.
    What is aCollection? • Oxford Dictionary: • “a group of things or people” • wikipedia: • “a grouping of some variable number of data items” • backbone.js: • “collections are simply an ordered set of models” • java.util.Collection: • definite size, provides an iterator, query membership 20
  • 33.
    User Expectations • anIterator is expected to visit all elements
 (especially with immutable collections) • x.head + x.tail == x • the contents does not depend on who is processing the collection • the contents does not depend on when the processing happens
 (especially with immutable collections) 21
  • 34.
    Streams have UnexpectedProperties • the observed sequence depends on • … when the observer subscribed to the stream • … whether the observer can process fast enough • … whether the streams flows fast enough 22
  • 35.
    Streams are notCollections! • java.util.stream:
 Stream is not derived from Collection
 “Streams differ from Coll’s in several ways” • no storage • functional in nature • laziness seeking • possibly unbounded • consumable 23
  • 36.
    Streams are notCollections! • a collection can be streamed • a stream observer can create a collection • … but saying that a Stream is just a lazy Collection evokes the wrong associations 24
  • 37.
    So, Reactive Streams: whynot just java.util.stream.Stream?
  • 38.
    Java 8 Stream 26 importjava.util.stream.*; ! // get some stream final Stream<Integer> s = Stream.of(1, 2, 3); // describe transformation final Stream<String> s2 = s.map(i -> "a" + i); // make a pull collection s2.iterator(); // or alternatively push it somewhere s2.forEach(i -> System.out.println(i)); // (need to pick one, Stream is consumable)
  • 39.
    Java 8 Stream •provides a DSL for describing transformation • introduces staged computation
 (but does not allow reuse) • prescribes an eager model of execution • offers either push or pull, chosen statically 27
  • 40.
  • 41.
    RxJava 29 import rx.Observable; import rx.Observable.*; ! //get some stream source final Observable<Integer> obs = range(1, 3); // describe transformation final Observable<String> obs2 = obs.map(i -> "b" + i); // and use it twice obs2.subscribe(i -> System.out.println(i)); obs2.filter(i -> i.equals("b2")) .subscribe(i -> System.out.println(i));
  • 42.
    RxJava • implements pure“push” model • includes extensive DSL for transformations • only allows blocking for back pressure • currently uses unbounded buffering for crossing an async boundary • work on distributed Observables sparked participation in Reactive Streams 30
  • 43.
  • 44.
    Participants • Engineers from •Netflix • Oracle • Pivotal • Red Hat • Twitter • Typesafe • Individuals like Doug Lea and Todd Montgomery 32
  • 45.
    The Motivation • allparticipants had the same basic problem • all are building tools for their community • a common solution benefits everybody • interoperability to make best use of efforts • e.g. use Reactor data store driver with Akka transformation pipeline and Rx monitoring to drive a vert.x REST API (purely made up, at this point) 33 see also Jon Brisbin’s post on “Tribalism as a Force for Good”
  • 46.
    Recipe for Success •minimal interfaces • rigorous specification of semantics • full TCK for verification of implementation • complete freedom for many idiomatic APIs 34
  • 47.
    The Meat 35 trait Publisher[T]{ def subscribe(sub: Subscriber[T]): Unit } trait Subscription { def requestMore(n: Int): Unit def cancel(): Unit } trait Subscriber[T] { def onSubscribe(s: Subscription): Unit def onNext(elem: T): Unit def onError(thr: Throwable): Unit def onComplete(): Unit }
  • 48.
    The Sauce • allcalls on Subscriber must dispatch async • all calls on Subscription must not block • Publisher is just there to create Subscriptions 36
  • 49.
    How does itConnect? 37 SubscriberPublisher subscribe onSubscribeSubscription requestMore onNext
  • 50.
  • 51.
    Akka Streams • poweredby Akka Actors • execution • distribution • resilience • type-safe streaming through Actors with bounded buffering 39
  • 52.
    Basic Akka Example 40 implicitval system = ActorSystem("Sys") val mat = FlowMaterializer(...) ! Flow(text.split("s").toVector). map(word => word.toUpperCase). foreach(tranformed => println(tranformed)). onComplete(mat) { case Success(_) => system.shutdown() case Failure(e) => println("Failure: " + e.getMessage) system.shutdown() }
  • 53.
    More Advanced AkkaExample 41 val s = Source.fromFile("log.txt", "utf-8") Flow(s.getLines()). groupBy { case LogLevelPattern(level) => level case other => "OTHER" }. foreach { case (level, producer) => val out = new PrintWriter(level+".txt") Flow(producer). foreach(line => out.println(line)). onComplete(mat)(_ => Try(out.close())) }. onComplete(mat)(_ => Try(s.close()))
  • 54.
    Java 8 Example 42 finalActorSystem system = ActorSystem.create("Sys"); final MaterializerSettings settings = MaterializerSettings.create(); final FlowMaterializer materializer = FlowMaterializer.create(settings, system); ! final String[] lookup = { "a", "b", "c", "d", "e", "f" }; final Iterable<Integer> input = Arrays.asList(0, 1, 2, 3, 4, 5); ! Flow.create(input).drop(2).take(3). // leave 2, 3, 4 map(elem -> lookup[elem]). // translate to "c","d","e" filter(elem -> !elem.equals("c")). // filter out the "c" grouped(2). // make into a list mapConcat(list -> list). // flatten the list fold("", (acc, elem) -> acc + elem). // accumulate into "de" foreach(elem -> System.out.println(elem)). // print it consume(materializer);
  • 55.
    Akka TCP EchoServer 43 val future = IO(StreamTcp) ? Bind(addr, ...) future.onComplete { case Success(server: TcpServerBinding) => println("listen at "+server.localAddress) ! Flow(server.connectionStream).foreach{ c => println("client from "+c.remoteAddress) c.inputStream.produceTo(c.outputStream) }.consume(mat) ! case Failure(ex) => println("cannot bind: "+ex) system.shutdown() }
  • 56.
    Akka HTTP Sketch(NOT REAL CODE) 44 val future = IO(Http) ? RequestChannelSetup(...) future.onSuccess { case ch: HttpRequestChannel => val p = ch.processor[Int] val body = Flow(new File(...)).toProducer(mat) Flow(HttpRequest(method = POST, uri = "http://ex.amp.le/42", entity = body) -> 42) .produceTo(mat, p) Flow(p).map { case (resp, token) => val out = FileSink(new File(...)) Flow(resp.entity.data).produceTo(mat, out) }.consume(mat) }
  • 57.
  • 58.
    Current State • EarlyPreview is available:
 "org.reactivestreams" % "reactive-streams-spi" % "0.2"
 "com.typesafe.akka" %% "akka-stream-experimental" % "0.3" • check out the Activator template
 "Akka Streams with Scala!"
 (https://github.com/typesafehub/activator-akka-stream-scala) 46
  • 59.
    Next Steps • wework towards inclusion in future JDK • try it out and give feedback! • http://reactive-streams.org/ • https://github.com/reactive-streams 47
  • 60.
    ©Typesafe 2014 –All Rights Reserved