
Journey into Reactive Streams and Akka Streams

Are streams just collections? What's the difference between Java 8 streams and Reactive Streams? How do I implement Reactive Streams with Akka? Pub/sub, dynamic push/pull, non-blocking, non-dropping; these are some of the other concepts covered. We'll also discuss how to leverage streams in a real-world application.


  1. 1. A journey into stream processing with Reactive Streams and Akka Streams
  2. 2. What to expect • An Introduction • The Reactive Streams specification • A deep-dive into Akka Streams • Code walkthrough and demo • Q&A
  3. 3. An Introduction Part 1 of 4
  4. 4. What's an array? • A series of elements arranged in memory • Has a beginning and an end
  5. 5. What's a stream? • A series of elements emitted over time • Live data (e.g., events) or data at rest (e.g., partitions of a file) • May not have a beginning or an end
  6. 6. Appeal of stream processing? • Scaling business logic • Processing real-time data (fast data) • Batch processing of large data sets (big data) • Monitoring, analytics, complex event processing, etc.
  7. 7. Challenges? • Ephemeral • Unbounded in size • Potential "flooding" downstream • Unfamiliar programming paradigm You cannot step twice into the same stream. For as you are stepping in, other waters are ever flowing on to you. — Heraclitus
  8. 8. Exploring two challenges of stream processing • An Rx-based approach for passing data across an asynchronous boundary • An approach for implementing back pressure
  9. 9. Synchrony
  10. 10. Asynchrony
  11. 11. Asynchrony
  12. 12. Back pressure
  13. 13. Flow control options
  14. 14. Flow control • We need a way to signal when a subscriber is able to process more data • Effectively push-based (dynamic pull/push) A lack of back pressure will eventually lead to an Out of Memory Exception (OOME), which is the worst possible outcome. Then you lose not just the work that overloaded the system, but everything, even the stuff that you were safely working on.  — Jim Powers, Typesafe
  15. 15. Subscriber usually has some kind of buffer.
  16. 16. Fast publishers can overwhelm the buffer of a slow subscriber.
  17. 17. Option 1: Use bounded buffer and drop messages.
  18. 18. Option 2: Increase buffer size if memory available.
  19. 19. Option 3: Pull-based backpressure.
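The three options above can be contrasted with a toy sketch. This is not Akka code; all names here are invented for illustration. With a bounded buffer that drops (option 1), excess elements are lost; with pull-based backpressure (option 3), excess elements are simply never produced.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FlowControlSketch {
    // Option 1: bounded buffer that drops new messages once full.
    static int deliveredWithDrop(int produced, int capacity) {
        Deque<Integer> buffer = new ArrayDeque<>();
        for (int i = 0; i < produced; i++) {
            if (buffer.size() < capacity) buffer.add(i); // otherwise: dropped
        }
        return buffer.size(); // the slow subscriber sees at most `capacity` elements
    }

    // Option 3: pull-based backpressure; the producer emits only what was requested.
    static int deliveredWithRequest(int produced, long requested) {
        long demand = requested;
        int delivered = 0;
        for (int i = 0; i < produced && demand > 0; i++) {
            delivered++;
            demand--; // each emitted element consumes one unit of demand
        }
        return delivered; // never exceeds the subscriber's stated demand
    }

    public static void main(String[] args) {
        System.out.println(deliveredWithDrop(100, 8));    // 8 - the other 92 were dropped
        System.out.println(deliveredWithRequest(100, 8)); // 8 - the other 92 were never produced
    }
}
```

The difference matters: with dropping, work is done and thrown away; with pull-based demand, the producer never does the wasted work in the first place.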
  20. 20. Reactive Streams Part 2 of 4
  21. 21. Why Reactive Streams? • Reactive Streams is a specification and low-level API for library developers. • Started as an initiative in late 2013 between engineers at Netflix, Pivotal, and Typesafe • Streaming was complex! • Play had “iteratees”, Akka had Akka IO
  22. 22. What is Reactive Streams? • TCK (Technology Compatibility Kit) • API (JVM, JavaScript) • Specifications for library developers • Early conversation on future spec for IO
  23. 23. 1. Flow control via back pressure • Fast publisher responsibilities 1. Not generate elements if it is able to control their production rate 2. Buffer elements in a bounded manner until more demand is signalled 3. Drop elements until more demand is signalled 4. Tear down the stream if unable to apply any of the above strategies
  24. 24. 2. An Rx-based approach to asynchrony

      public interface Processor<T, R> extends Subscriber<T>, Publisher<R> {}

      public interface Publisher<T> {
          public void subscribe(Subscriber<? super T> s);
      }

      public interface Subscriber<T> {
          public void onSubscribe(Subscription s);
          public void onNext(T t);
          public void onError(Throwable t);
          public void onComplete();
      }

      public interface Subscription {
          public void request(long n);
          public void cancel();
      }
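The four interfaces fit together in a simple handshake: subscribe, receive a Subscription, then signal demand with request(n) before any onNext arrives. The sketch below is a minimal synchronous illustration, with the interfaces reproduced locally so it is self-contained; real code would use org.reactivestreams, and all other names here are invented.

```java
import java.util.ArrayList;
import java.util.List;

public class HandshakeSketch {
    interface Publisher<T> { void subscribe(Subscriber<? super T> s); }
    interface Subscriber<T> {
        void onSubscribe(Subscription s);
        void onNext(T t);
        void onError(Throwable t);
        void onComplete();
    }
    interface Subscription { void request(long n); void cancel(); }

    // Publishes the integers [0, count), emitting only as much as was requested.
    static Publisher<Integer> range(int count) {
        return sub -> sub.onSubscribe(new Subscription() {
            int next = 0;
            boolean done = false;
            public void request(long n) {
                while (n-- > 0 && next < count) sub.onNext(next++);
                if (next == count && !done) { done = true; sub.onComplete(); }
            }
            public void cancel() { done = true; next = count; }
        });
    }

    // Drains a publisher by requesting one element at a time (dynamic pull).
    static List<Integer> drain(Publisher<Integer> p) {
        List<Integer> out = new ArrayList<>();
        p.subscribe(new Subscriber<Integer>() {
            Subscription s;
            public void onSubscribe(Subscription s) { this.s = s; s.request(1); }
            public void onNext(Integer t) { out.add(t); s.request(1); }
            public void onError(Throwable t) { throw new RuntimeException(t); }
            public void onComplete() { }
        });
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(range(5))); // [0, 1, 2, 3, 4]
    }
}
```

Note that the subscriber controls the pace: elements only flow in response to request(1), which is the dynamic pull half of the protocol.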
  25. 25. Interoperability • RxJava (Netflix) • Reactor (Pivotal) • Vert.x (RedHat) • Akka Streams and Slick (Typesafe)
  26. 26. Three main repositories • Reactive Streams for the JVM • Reactive Streams for JavaScript • Reactive Streams IO (for network protocols such as TCP, WebSockets and possibly HTTP/2) • Early exploration kicked off by Netflix • 2016 timeframe
  27. 27. Reactive Streams Visit the Reactive Streams website for more information. http://www.reactive-streams.org/
  28. 28. Akka Streams Part 3 of 4
  29. 29. Akka Streams Akka Streams provides a way to express and run a chain of asynchronous processing steps acting on a sequence of elements. • DSL for async/non-blocking stream processing • Default back pressure • Conforms to the Reactive Streams spec for interop
  30. 30. Basics
  31. 31. • Source - A processing stage with exactly one output • Sink - A processing stage with exactly one input • Flow - A processing stage which has exactly one input and output • RunnableFlow - A Flow that has both ends "attached" to a Source and Sink
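The four shapes above can be modeled as plain function composition. This toy sketch (invented for illustration, not the Akka API) captures the key point of the next two slides as well: a Source/Flow/Sink pipeline is just an immutable blueprint, and nothing runs until it is materialized.

```java
import java.util.function.Supplier;
import java.util.stream.Stream;

public class ShapesSketch {
    interface Source<T> extends Supplier<Stream<T>> {}                              // exactly one output
    interface Flow<A, B> extends java.util.function.Function<Stream<A>, Stream<B>> {} // one input, one output
    interface Sink<T, R> extends java.util.function.Function<Stream<T>, R> {}       // exactly one input

    // "Attaching" a Source and Sink to a Flow yields a runnable blueprint.
    static <A, B, R> Supplier<R> runnable(Source<A> src, Flow<A, B> flow, Sink<B, R> sink) {
        return () -> sink.apply(flow.apply(src.get())); // still lazy: nothing has run yet
    }

    static int demo() {
        Source<Integer> src = () -> Stream.of(1, 2, 3, 4);
        Flow<Integer, Integer> doubler = s -> s.map(x -> x * 2);
        Sink<Integer, Integer> sum = s -> s.reduce(0, Integer::sum);
        Supplier<Integer> blueprint = runnable(src, doubler, sum);
        return blueprint.get(); // the pipeline executes only now
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 20
    }
}
```

Because the blueprint is just composed values, it can be stored, shared, and run more than once, which is the design intent behind the explicit materialization step.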
  32. 32. API design considerations • Immutable, composable stream blueprints • Explicit materialization step • No magic at the expense of some extra code
  33. 33. Materialization • Separate the what from the how • Declarative Source/Flow/Sink to create a blueprint • FlowMaterializer turns blueprint into actors • Involves an extra step, but no magic
  34. 34. Error handling

      val decider: Supervision.Decider = exc => exc match {
        case _: ArithmeticException => Supervision.Resume
        case _ => Supervision.Stop
      }

      // ActorFlowMaterializer takes the list of transformations comprising an
      // akka.stream.scaladsl.Flow and materializes them in the form of
      // org.reactivestreams.Processor
      implicit val mat = ActorFlowMaterializer(
        ActorFlowMaterializerSettings(system).withSupervisionStrategy(decider))

      val source = Source(0 to 5).map(100 / _)
      val result = source.runWith(Sink.fold(0)(_ + _))

      • The element causing division by zero will be dropped
      • Result will be a Future completed with Success(228)
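The Resume semantics from the slide can be checked with a plain sketch (invented names, not Akka): the element whose transformation throws ArithmeticException is dropped, and the fold over the survivors is 100 + 50 + 33 + 25 + 20 = 228, matching the Success(228) above.

```java
import java.util.function.IntUnaryOperator;
import java.util.stream.IntStream;

public class SupervisionSketch {
    static int foldWithResume(IntStream source, IntUnaryOperator f) {
        int acc = 0;
        for (int x : source.toArray()) {
            try {
                acc += f.applyAsInt(x); // normal path: fold the transformed element
            } catch (ArithmeticException e) {
                // Supervision.Resume: drop the failing element, keep the stream alive
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        // 100/0 throws and is dropped; 100 + 50 + 33 + 25 + 20 = 228
        System.out.println(foldWithResume(IntStream.rangeClosed(0, 5), x -> 100 / x)); // 228
    }
}
```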
  35. 35. Dynamic push/pull backpressure • Fast subscriber can issue more Request(n) even before more data arrives • Publisher can accumulate demand • Conforming to "fast publisher" responsibilities • Total demand of elements is safe to publish • Subscriber's buffer will never overflow
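Why accumulated demand makes publishing safe can be shown with a toy demand counter (invented for this sketch): the publisher may emit at most the total requested so far, so the subscriber's buffer can never overflow.

```java
public class DemandSketch {
    private long demand = 0;

    // Subscriber signals capacity early, possibly several times before data arrives.
    void request(long n) { demand += n; }

    // Publisher tries to emit `n` elements; only pending demand is honored.
    int tryPublish(int n) {
        int emitted = (int) Math.min(n, demand);
        demand -= emitted;
        return emitted;
    }

    public static void main(String[] args) {
        DemandSketch s = new DemandSketch();
        s.request(3);
        s.request(2);                         // demand accumulates to 5
        System.out.println(s.tryPublish(10)); // 5 - the rest must wait for more demand
        System.out.println(s.tryPublish(10)); // 0 - demand is exhausted
    }
}
```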
  36. 36. In-depth
  37. 37. Fan out • Broadcast[T] (1 input, n outputs) • Signals each output given an input signal • Balance[T] (1 input, n outputs) • Signals one of its output ports given an input signal • FlexiRoute[In] (1 input, n outputs) • Write custom fan out elements using a simple DSL
  38. 38. Fan in • Merge[In] (n inputs, 1 output) • Picks signals randomly from inputs • Zip[A,B,Out] (2 inputs, 1 output) • Zipping into an (A,B) tuple stream • Concat[T] (2 inputs, 1 output) • Concatenate streams (first, then second)
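The Zip and Concat semantics can be sketched over plain lists (helpers invented for illustration, not the Akka operators): Zip pairs elements until the shorter input completes, and Concat emits all of the first input before the second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FanInSketch {
    // Zip: pair elements positionally; stops when the shorter input is exhausted.
    static <A, B> List<String> zip(List<A> as, List<B> bs) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < Math.min(as.size(), bs.size()); i++) {
            out.add("(" + as.get(i) + "," + bs.get(i) + ")");
        }
        return out;
    }

    // Concat: all of the first stream, then all of the second.
    static <T> List<T> concat(List<T> first, List<T> second) {
        List<T> out = new ArrayList<>(first);
        out.addAll(second);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(zip(Arrays.asList(1, 2, 3), Arrays.asList("a", "b"))); // [(1,a), (2,b)]
        System.out.println(concat(Arrays.asList(1, 2), Arrays.asList(3, 4)));     // [1, 2, 3, 4]
    }
}
```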
  39. 39. Scala example

      val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder =>
        import FlowGraph.Implicits._

        val in = Source(1 to 10)
        val out = Sink.ignore

        val bcast = builder.add(Broadcast[Int](2))
        val merge = builder.add(Merge[Int](2))

        val f1, f2, f3, f4 = Flow[Int].map(_ + 10)

        in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out
                    bcast ~> f4 ~> merge
      }
  40. 40. Advanced flow control

      // return only the freshest element when the subscriber signals demand
      val droppyStream: Flow[Message, Message] =
        Flow[Message].conflate(seed = identity)((lastMessage, newMessage) => newMessage)

      • conflate can be thought of as a special fold operation that collapses multiple upstream elements into one aggregate element • groupedWithin chunks up this stream into groups of elements received within a time window, or limited by the given number of elements, whichever happens first
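The conflation semantics from the snippet can be illustrated with a toy holder (invented names, not the Akka operator): while the subscriber is busy, each new upstream element is folded into the pending aggregate, here by keeping only the freshest message, so a fast producer never builds a queue.

```java
public class ConflateSketch {
    private String pending = null; // at most one buffered (aggregated) element

    // Upstream pushes: (lastMessage, newMessage) -> newMessage, keep only the freshest.
    void push(String newMessage) {
        pending = newMessage;
    }

    // Downstream signals demand and receives the current aggregate.
    String pull() {
        String out = pending;
        pending = null;
        return out;
    }

    public static void main(String[] args) {
        ConflateSketch c = new ConflateSketch();
        c.push("v1");
        c.push("v2");
        c.push("v3");                 // the producer outran the consumer
        System.out.println(c.pull()); // v3 - older elements were conflated away
    }
}
```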
  41. 41. Other sinks and sources - simple streaming from/to Kafka

      implicit val actorSystem = ActorSystem("ReactiveKafka")
      implicit val materializer = ActorMaterializer()

      val kafka = new ReactiveKafka(host = "localhost:9092", zooKeeperHost = "localhost:2181")
      val publisher = kafka.consume("lowercaseStrings", "groupName", new StringDecoder())
      val subscriber = kafka.publish("uppercaseStrings", "groupName", new StringEncoder())

      // consume lowercase strings from Kafka and publish them transformed to uppercase
      Source(publisher).map(_.toUpperCase).to(Sink(subscriber)).run()
  42. 42. A quick comparison with Java 8 Streams • Pull-based, synchronous sequences of values • Iterators with a more parallelism-friendly interface • Intermediate operations are lazy (e.g., filter, map) • Terminal operations are eager (e.g., reduce) • Only high-level control (no next/hasNext) • Similar to Scala Collections
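The lazy/eager split above is easy to observe directly: an intermediate map does no work until a terminal reduce pulls the pipeline. The helper names below are invented for this sketch.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class LazinessSketch {
    static int mapCallsBeforeTerminal() {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> s = Stream.of(1, 2, 3)
                .map(x -> { calls.incrementAndGet(); return x * 2; });
        return calls.get(); // 0 - map has not run yet; the pipeline is only a recipe
    }

    static int mapCallsAfterTerminal() {
        AtomicInteger calls = new AtomicInteger();
        int sum = Stream.of(1, 2, 3)
                .map(x -> { calls.incrementAndGet(); return x * 2; })
                .reduce(0, Integer::sum); // eager terminal op drives the pipeline
        return calls.get(); // 3 - one map call per element
    }

    public static void main(String[] args) {
        System.out.println(mapCallsBeforeTerminal()); // 0
        System.out.println(mapCallsAfterTerminal());  // 3
    }
}
```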
  43. 43. Java 8 Streams

      String concatenatedString = listOfStrings
          .stream()
          .peek(s -> listOfStrings.add("three")) // don't do this!
          .reduce((a, b) -> a + " " + b)
          .get();
  44. 44. Code review and demo Part 4 of 4 Source code available at https://github.com/rocketpages
  45. 45. Thank you!
