Akka Stream and CQRS
Milan Das (Principal Consultant) : StructuredMap.com
About Me
1. Name: Milan Das
a. https://www.linkedin.com/in/milandas/
b. milan.das77@gmail.com
c. StructuredMap LLC
d. Java Developer since 1999
2. Sr. Consultant Open Source Talend:
DataStream/Spark
3. StructuredMap: Principal Consultant
Akka Stream
sbt
scalaVersion := "2.12.3"
libraryDependencies ++= Seq(
"com.typesafe.akka" %% "akka-actor" % "2.5.4",
"com.typesafe.akka" %% "akka-testkit" % "2.5.4" % Test,
"com.typesafe.akka" %% "akka-stream" % "2.5.4",
/"com.lightbend.akka" %% "akka-stream-alpakka-sse" % "0.11",
"com.typesafe.akka" %% "akka-stream-kafka" % "0.16",
"com.typesafe.akka" %% "akka-stream-testkit" % "2.5.4" % Test
)
● IDE (Intellij)
● SBT (Scala Build tool)
● Kafka (confluent.io)
● Akka Stream
What is Streaming?
Stream processing analyzes and performs
actions on real-time data through the
use of continuous queries. Streaming
Analytics connects to external data
sources, enabling applications to
integrate certain data into the
application flow, or to update an
external database with processed
information.
-- Dataversity.net
Akka Stream: Source
Source:
A Source is an input to the stream. It
has a single output channel and no
input channel. A Source can generate
data itself or connect to another
system to produce its elements.
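As a sketch, a few common ways to construct a Source (the tick-based one is unbounded):

```scala
import akka.NotUsed
import akka.stream.scaladsl.Source
import scala.concurrent.duration._

// A bounded Source from a range of Ints
val range: Source[Int, NotUsed] = Source(1 to 100)

// A Source that emits exactly one element
val single: Source[String, NotUsed] = Source.single("hello")

// An unbounded Source that emits a "tick" element on a schedule
val ticks = Source.tick(initialDelay = 0.seconds, interval = 1.second, tick = "tick")
```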
Akka Stream: Flow
Flow:
A Flow connects stream stages and, if
needed, transforms data as it passes
through. It has one input and one
output channel.
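A minimal sketch of a Flow that doubles each Int and renders it as a String, then wires a Source through it into a Sink:

```scala
import akka.NotUsed
import akka.stream.scaladsl.{Flow, Sink, Source}

// A reusable Flow: Int in, String out
val double: Flow[Int, String, NotUsed] = Flow[Int].map(i => (i * 2).toString)

// A Flow sits between a Source and a Sink: Source ~> Flow ~> Sink
val graph = Source(1 to 3).via(double).to(Sink.foreach[String](println))
```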
Akka Stream: Sink
Sink:
A Sink is the endpoint of a stream,
where data is consumed. It has a single
input channel and no output channel.
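A sketch of two common Sinks; `Sink.fold` materializes a `Future` holding the final value:

```scala
import akka.stream.scaladsl.{Keep, Sink, Source}

// A Sink that consumes elements by printing them
val printSink = Sink.foreach[Int](println)

// A Sink that folds the stream into a single value, materialized as a Future[Int]
val sumSink = Sink.fold[Int, Int](0)(_ + _)

// Keep.right keeps the Sink's materialized value when running
// (needs an implicit ActorSystem/Materializer in scope to actually run):
// val futureSum = Source(1 to 10).toMat(sumSink)(Keep.right).run()
```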
Akka Stream: Materializer
ActorMaterializer:
The Materializer is a factory for stream
execution engines; it is the thing that
makes streams run.
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}

object AkkaStream1 extends App {
  implicit val system = ActorSystem("AkkaStream1")
  implicit val materializer = ActorMaterializer()
  import system.dispatcher

  val source = Source(1 to 300)
  val sink = Sink.foreach[Int](elem => println(s"sink received: $elem"))

  val graph = source.to(sink)
  graph.run()
}
Code 1
Akka Stream: Real world Unbounded data stream
● The data elements in the stream arrive
online.
● The system has no control over the
order in which data elements arrive to
be processed, either within a data
stream or across data streams.
● Data streams are potentially
unbounded in size.
● Once an element from a data stream
has been processed, it is discarded or
archived; it cannot be retrieved easily
unless it is explicitly stored in memory,
which is typically small relative to the
size of the data streams.
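A sketch of processing an unbounded stream: `Iterator.from(0)` simulates an endless feed, throttled so elements are handled one by one and then dropped, never held in memory as a whole:

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ThrottleMode}
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.duration._

implicit val system = ActorSystem("unbounded")
implicit val materializer = ActorMaterializer()

// An endless stream of increasing Ints, slowed to 10 elements per second.
// Each element is processed and then dropped -- nothing is retained.
Source.fromIterator(() => Iterator.from(0))
  .throttle(10, 1.second, 10, ThrottleMode.shaping)
  .map(_ * 2)
  .runWith(Sink.foreach[Int](println))
```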
Code 2: alpakka
Complex Graph GraphDSL
Graphs are built from simple Flows,
which serve as the linear connections
within the graph, and from junctions,
which serve as fan-in and fan-out
points for the Flows.
Image source: http://doc.akka.io/docs/akka/2.5.3/images/simple-graph-example.png
Complex Graph GraphDSL
Akka Streams junctions:
Fan-out
● Broadcast[T] – (1 input, N outputs) given an input element, emits it to each output
● Balance[T] – (1 input, N outputs) given an input element, emits it to one of its output ports
● UnzipWith[In,A,B,...] – (1 input, N outputs) takes a function of 1 input that, given a value for the input, emits N output
elements (where N <= 20)
● Unzip[A,B] – (1 input, 2 outputs) splits a stream of (A,B) tuples into two streams, one of type A and one of type B
Fan-in
● Merge[In] – (N inputs, 1 output) picks randomly from its inputs, pushing elements one by one to its output
● MergePreferred[In] – like Merge, but if elements are available on the preferred port it picks from it, otherwise randomly from
the others
● ZipWith[A,B,...,Out] – (N inputs, 1 output) takes a function of N inputs that, given a value for each input, emits 1
output element
● Zip[A,B] – (2 inputs, 1 output) a ZipWith specialised to zipping input streams of A and B into an (A,B) tuple stream
● Concat[A] – (2 inputs, 1 output) concatenates two streams (first consumes one, then the other)
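The junctions above compose in the GraphDSL; this sketch mirrors the graph from the referenced image (a Broadcast fan-out into two Flows, merged back into one Sink):

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ClosedShape}
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Merge, RunnableGraph, Sink, Source}

implicit val system = ActorSystem("graphs")
implicit val materializer = ActorMaterializer()

val graph = RunnableGraph.fromGraph(GraphDSL.create() { implicit b =>
  import GraphDSL.Implicits._

  val in  = Source(1 to 10)
  val out = Sink.foreach[Int](println)

  val bcast = b.add(Broadcast[Int](2))   // fan-out: 1 input, 2 outputs
  val merge = b.add(Merge[Int](2))       // fan-in: 2 inputs, 1 output

  val f1 = Flow[Int].map(_ + 10)
  val f2 = Flow[Int].map(_ * 2)

  in ~> bcast
  bcast ~> f1 ~> merge
  bcast ~> f2 ~> merge
  merge ~> out
  ClosedShape
})

graph.run()
```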
Akka Stream Asynchronous
Source(1 to 3)
.map { i => println(s"A: $i"); i }.async
.map { i => println(s"B: $i"); i }.async
.map { i => println(s"C: $i"); i }.async
.runWith(Sink.ignore)
Materializer Buffer:
akka.stream.materializer.max-input-buffer-size = 16
val materializer = ActorMaterializer(
ActorMaterializerSettings(system)
.withInputBuffer(
initialSize = 64,
maxSize = 64))
Akka Stream Backpressure
// Up to 1000 jobs are buffered; each line below shows a different overflow strategy
// Getting a stream of jobs from an imaginary external system as a Source
val jobs: Source[Job, NotUsed] = inboundJobsConnector()
jobs.buffer(1000, OverflowStrategy.backpressure)
jobs.buffer(1000, OverflowStrategy.dropTail)
jobs.buffer(1000, OverflowStrategy.dropNew)
jobs.buffer(1000, OverflowStrategy.dropHead)
jobs.buffer(1000, OverflowStrategy.dropBuffer)
jobs.buffer(1000, OverflowStrategy.fail)
Akka Persistence (CQRS)
sbt
scalaVersion := "2.12.3"
libraryDependencies ++= Seq(
"com.typesafe.akka" %% "akka-actor" % "2.5.4",
"com.typesafe.akka" %% "akka-persistence" % "2.5.3",
)
● IDE (Intellij)
● SBT (Scala Build tool)
● Kafka (confluent.io)
● Akka Persistence
Typical Architecture
Problem?
Databases are shared.
Data is mutable.
Event Sourcing Architecture & CQRS
Solved?
Immutable data as events
Append only storage
Replay events
Command
Query
Responsibility
Segregation
Akka Persistence Actor: Journal
● Database: Cassandra, HBase,
Mongo..
● Command to events
● Store events
● Update State
● Recover state on crash
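The bullets above can be sketched with a PersistentActor; `AddItem`/`ItemAdded` are hypothetical command/event types for illustration:

```scala
import akka.persistence.PersistentActor

// Hypothetical command and event for illustration
final case class AddItem(item: String)    // command (the intent)
final case class ItemAdded(item: String)  // event (the fact, stored in the journal)

class CartActor extends PersistentActor {
  override def persistenceId: String = "cart-1"

  private var items: List[String] = Nil   // in-memory state

  // Commands are validated and turned into events; the journal appends each event
  override def receiveCommand: Receive = {
    case AddItem(item) =>
      persist(ItemAdded(item)) { evt =>
        items = evt.item :: items         // update state only after the event is stored
      }
  }

  // On crash/restart, the journal replays events to rebuild state
  override def receiveRecover: Receive = {
    case ItemAdded(item) => items = item :: items
  }
}
```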
Akka Persistence Actor: Snapshot
● Database: Cassandra, HBase,
Mongo..
● Periodically save a snapshot of state
● Recovery starts from the latest snapshot
● Only events newer than the snapshot are replayed
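Snapshots shorten recovery: a sketch that saves state every 100 events (the interval and the `ItemAdded` event type are illustrative assumptions):

```scala
import akka.persistence.{PersistentActor, SnapshotOffer}

final case class ItemAdded(item: String)  // illustrative event type

class CartWithSnapshots extends PersistentActor {
  override def persistenceId: String = "cart-2"

  private var items: List[String] = Nil

  override def receiveCommand: Receive = {
    case item: String =>
      persist(ItemAdded(item)) { evt =>
        items = evt.item :: items
        // Every 100 events, store a snapshot so recovery can skip older events
        if (lastSequenceNr % 100 == 0) saveSnapshot(items)
      }
  }

  // Recovery starts from the latest snapshot, then replays only newer events
  override def receiveRecover: Receive = {
    case SnapshotOffer(_, snapshot: List[String] @unchecked) => items = snapshot
    case ItemAdded(item)                                     => items = item :: items
  }
}
```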
Akka Persistence Actor: Persistence Query
● Eventual Consistency
● Polling frequency
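A sketch of the read side using the LevelDB read journal (assumes the akka-persistence-query dependency and a configured LevelDB journal); `eventsByPersistenceId` is a live query that keeps polling for new events, which is why the read model is only eventually consistent:

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Sink
import akka.persistence.query.PersistenceQuery
import akka.persistence.query.journal.leveldb.scaladsl.LeveldbReadJournal

implicit val system = ActorSystem("query")
implicit val materializer = ActorMaterializer()

// The read journal exposes stored events as an Akka Streams Source
val readJournal = PersistenceQuery(system)
  .readJournalFor[LeveldbReadJournal](LeveldbReadJournal.Identifier)

// Live query: emits existing events, then keeps polling for new ones
readJournal.eventsByPersistenceId("cart-1", 0L, Long.MaxValue)
  .runWith(Sink.foreach(envelope => println(envelope.event)))
```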