
Data in Motion: Streaming Static Data Efficiently

Distributed streaming performance, consistency, reliable delivery, durability, optimisations, event time processing and other concepts, discussed and explained using Akka Persistence and other examples.


  1. MANCHESTER LONDON NEW YORK
  2. Martin Zapletal @zapletal_martin #ScalaDays. Data in Motion: Streaming Static Data Efficiently in Akka Persistence (and elsewhere) @cakesolutions
  3. Databases
  4. Batch processing
  5. Data at scale ● Reactive ● Real time, asynchronous and message driven ● Elastic and scalable ● Resilient and fault tolerant
  6. Streams
  7. Streaming static data ● Turning a database into a stream
  8. Pulling data from source (slides 8-13: diagram of a consumer repeatedly polling the source and reading values 0, 5 and 10; subsequent polls have to pick up inserts and updates as well)
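      A minimal sketch of this pull approach, assuming an Akka Streams setup and a hypothetical query(fromOffset) returning the rows appended since the last poll:

        import scala.concurrent.Future
        import scala.concurrent.ExecutionContext.Implicits.global
        import akka.NotUsed
        import akka.stream.scaladsl.Source

        // Hypothetical row type and query; stands in for the real database read.
        final case class DbRow(offset: Long, value: Long)
        def query(fromOffset: Long): Future[Seq[DbRow]] = ???

        // Repeatedly poll, carrying the last seen offset between polls so each
        // round only reads rows appended since the previous one. A real version
        // would also delay between empty polls.
        val polled: Source[DbRow, NotUsed] =
          Source
            .unfoldAsync(0L) { lastOffset =>
              query(lastOffset).map { rows =>
                val nextOffset = rows.lastOption.map(_.offset + 1).getOrElse(lastOffset)
                Some((nextOffset, rows)) // never None: the stream over a table is infinite
              }
            }
            .mapConcat(_.toList)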
  14. Pushing data from source ● Change log, change data capture (slides 14-16: diagram of a newly written value 1 being pushed from the change log to the consumer)
  17. Infinite streams of finite data source ● Consistent snapshot and change log (diagram: a snapshot of values 0, 5 and 10 combined with the change log entry for value 1)
  18. Log data structure (diagram: an append-only log with offsets 0-4 holding the entries "Inserted value 0", "Inserted value 5", "Inserted value 10", "Inserted value 1" and "Inserted value 5")
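      As a minimal illustration (not the talk's code), an append-only log keyed by offset might look like:

        // Minimal append-only log sketch: the offset of an entry is simply its index.
        final case class Log[A](entries: Vector[A] = Vector.empty[A]) {
          def append(value: A): Log[A] = Log(entries :+ value)
          // Read everything at or after fromOffset, paired with each entry's offset.
          def read(fromOffset: Int): Vector[(Int, A)] =
            entries.zipWithIndex.drop(fromOffset).map { case (v, o) => (o, v) }
        }

        val log = Log[String]()
          .append("Inserted value 0").append("Inserted value 5").append("Inserted value 10")
        log.read(1) // Vector((1, "Inserted value 5"), (2, "Inserted value 10"))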
  19. Pulling data from a log (slides 19-21: diagram of a consumer repeatedly pulling from the log, each time resuming from the last seen offset and reading newly appended values)
  22. Akka Persistence (diagram: a journal holding persistence_id1, events 1-4)
  23. Akka Persistence Query ● eventsByPersistenceId, allPersistenceIds, eventsByTag (diagram: the same journal exposed as queryable streams)
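      A usage sketch (not from the slides) of these queries against the Cassandra read journal, using the Akka 2.4-era API and the persistence id from the diagram:

        import akka.actor.ActorSystem
        import akka.persistence.query.PersistenceQuery
        import akka.persistence.cassandra.query.scaladsl.CassandraReadJournal
        import akka.stream.ActorMaterializer

        implicit val system = ActorSystem("queries")
        implicit val materializer = ActorMaterializer()

        val journal = PersistenceQuery(system)
          .readJournalFor[CassandraReadJournal](CassandraReadJournal.Identifier)

        // A live (infinite) stream of all events for one persistence id.
        journal.eventsByPersistenceId("persistence_id1", 0L, Long.MaxValue)
          .runForeach(envelope => println(s"${envelope.sequenceNr}: ${envelope.event}"))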
  24. Akka Persistence Query Cassandra ● Purely pull ● Event (log) data (diagram: the messages table partitioned by persistence_id and partition_nr, e.g. partition 0 holding events 0-2 and partition 1 holding events 100-102)
  25. Actor publisher:

      private[query] abstract class QueryActorPublisher[MessageType, State: ClassTag](
          refreshInterval: Option[FiniteDuration]) extends ActorPublisher[MessageType] {

        protected def initialState: Future[State]
        protected def initialQuery(initialState: State): Future[Action]
        protected def requestNext(state: State, resultSet: ResultSet): Future[Action]
        protected def requestNextFinished(state: State, resultSet: ResultSet): Future[Action]
        protected def updateState(state: State, row: Row): (Option[MessageType], State)
        protected def completionCondition(state: State): Boolean

        private[this] def nextBehavior(...): Receive = {
          if (shouldFetchMore(...)) {
            listenableFutureToFuture(resultSet.fetchMoreResults()).map(FetchedResultSet).pipeTo(self)
            awaiting(resultSet, state, finished)
          } else if (shouldIdle(...)) {
            idle(resultSet, state, finished)
          } else if (shouldComplete(...)) {
            onCompleteThenStop()
            Actor.emptyBehavior
          } else if (shouldRequestMore(...)) {
            if (finished) requestNextFinished(state, resultSet).pipeTo(self)
            else requestNext(state, resultSet).pipeTo(self)
            awaiting(resultSet, state, finished)
          } else {
            idle(resultSet, state, finished)
          }
        }
      }
  27. (State machine diagram of the publisher: behaviour driven by initialQuery, initialNewResultSet, initialFinished, request, newResultSet, fetchedResultSet, finished, continue, Cancel and SubscriptionTimeout events, guarded by shouldFetchMore, shouldIdle, shouldTerminate and shouldRequestMore. Red transitions deliver the buffer and update internal state (progress); blue transitions issue an asynchronous database query)
  28. Events by persistence id:

      SELECT * FROM ${tableName} WHERE
        persistence_id = ? AND
        partition_nr = ? AND
        sequence_nr >= ? AND
        sequence_nr <= ?
  29. (Slides 29-34: animation of the query above walking partition 0, events 0-2, then moving on to partition 1, events 100-102)
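      The partition a sequence number lands in is simple arithmetic. A sketch, assuming events are numbered from 0 and each partition holds a fixed number of events, matching the diagram above; the actual plugin numbers sequences from 1 and derives the partition from a configurable partition size:

        val targetPartitionSize = 100L // assumed; configurable in the real plugin

        def partitionNr(sequenceNr: Long): Long = sequenceNr / targetPartitionSize

        partitionNr(2L)   // 0
        partitionNr(100L) // 1: event 100 opens partition 1, as in the diagram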
  35. private[query] class EventsByPersistenceIdPublisher(...)
        extends QueryActorPublisher[PersistentRepr, EventsByPersistenceIdState](...) {

        override protected def initialState: Future[EventsByPersistenceIdState] = {
          ...
          EventsByPersistenceIdState(initialFromSequenceNr, 0, currentPnr)
        }

        override protected def updateState(
            state: EventsByPersistenceIdState,
            row: Row): (Option[PersistentRepr], EventsByPersistenceIdState) = {
          val event = extractEvent(row)
          val partitionNr = row.getLong("partition_nr") + 1
          (Some(event),
            EventsByPersistenceIdState(event.sequenceNr + 1, state.count + 1, partitionNr))
        }
      }
  37. All persistence ids (diagram: the same messages table; the query returns each (persistence_id, partition_nr) pair once):

      SELECT DISTINCT persistence_id, partition_nr FROM $tableName
  38. (Slides 38-40: animation of the DISTINCT scan walking both partitions of the table)
  41. private[query] class AllPersistenceIdsPublisher(...)
        extends QueryActorPublisher[String, AllPersistenceIdsState](...) {

        override protected def initialState: Future[AllPersistenceIdsState] =
          Future.successful(AllPersistenceIdsState(Set.empty))

        override protected def updateState(
            state: AllPersistenceIdsState,
            row: Row): (Option[String], AllPersistenceIdsState) = {
          val event = row.getString("persistence_id")
          if (state.knownPersistenceIds.contains(event)) {
            (None, state)
          } else {
            (Some(event), state.copy(knownPersistenceIds = state.knownPersistenceIds + event))
          }
        }
      }
  43. Events by tag (diagram: the messages table again, with some events carrying tag 1, e.g. events 1 and 100 of persistence id 0 and event 2 of both persistence ids)
  44. (Slides 44-52: animation of tagged events being copied from the messages table into a separate, time-ordered events-by-tag materialized view)
  53. Events by tag (diagram: the tag 1 view bucketed by day, e.g. tag 1 / 1/1/2016 and tag 1 / 1/2/2016, holding Id 0, event 1; Id 0, event 2; Id 1, event 2; Id 0, event 100):

      SELECT * FROM $eventsByTagViewName$tagId WHERE
        tag$tagId = ? AND
        timebucket = ? AND
        timestamp > ? AND
        timestamp <= ?
      ORDER BY timestamp ASC LIMIT ?
  54. (Slides 54-58: animation of the query draining the 1/1/2016 bucket in timestamp order, then moving on to the 1/2/2016 bucket)
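      A sketch of how such a day-granularity bucket key might be derived from an event timestamp; the plugin's actual bucket format and granularity are configuration details, and yyyyMMdd is an assumption here:

        import java.time.{Instant, ZoneOffset}
        import java.time.format.DateTimeFormatter

        val bucketFormat = DateTimeFormatter.ofPattern("yyyyMMdd")

        // Bucket an epoch-millis timestamp into its UTC day.
        def timeBucket(timestampMillis: Long): String =
          Instant.ofEpochMilli(timestampMillis).atZone(ZoneOffset.UTC).toLocalDate.format(bucketFormat)

        timeBucket(1451692800000L) // "20160102"; a query walks bucket by bucket, as above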
  59. (Slides 59-61: while reading the tag view, the publisher tracks the expected sequence number per persistence id; a delivered event advances the expectation, and a gap, e.g. seeing event 100 of persistence id 0 while expecting event 2, signals a missed event)
  62. seqNumbers match {
        case None =>
          replyTo ! UUIDPersistentRepr(offs, toPersistentRepr(row, pid, seqNr))
          loop(n - 1)

        case Some(s) =>
          s.isNext(pid, seqNr) match {
            case SequenceNumbers.Yes | SequenceNumbers.PossiblyFirst =>
              seqNumbers = Some(s.updated(pid, seqNr))
              replyTo ! UUIDPersistentRepr(offs, toPersistentRepr(row, pid, seqNr))
              loop(n - 1)

            case SequenceNumbers.After =>
              replyTo ! ReplayAborted(seqNumbers, pid, s.get(pid) + 1, seqNr)
              // end loop

            case SequenceNumbers.Before =>
              // duplicate, discard
              if (!backtracking)
                log.debug(s"Discarding duplicate. Got sequence number [$seqNr] for [$pid], " +
                  s"but current sequence number is [${s.get(pid)}]")
              loop(n - 1)
          }
      }
  64. def replay(): Unit = {
        val backtracking = isBacktracking
        val limit =
          if (backtracking) maxBufferSize
          else maxBufferSize - buf.size
        val toOffs =
          if (backtracking && abortDeadline.isEmpty) highestOffset
          else UUIDs.endOf(System.currentTimeMillis() - eventualConsistencyDelayMillis)
        context.actorOf(EventsByTagFetcher.props(tag, currTimeBucket, currOffset, toOffs, limit,
          backtracking, self, session, preparedSelect, seqNumbers, settings))
        context.become(replaying(limit))
      }

      def replaying(limit: Int): Receive = {
        case env @ UUIDPersistentRepr(offs, _) => // Deliver buffer
        case ReplayDone(count, seqN, highest) => // Request more
        case ReplayAborted(seqN, pid, expectedSeqNr, gotSeqNr) =>
          // Causality violation, wait and retry. Only applicable if all events
          // for a persistence_id are tagged
        case ReplayFailed(cause) => // Failure
        case _: Request => // Deliver buffer
        case Continue => // Do nothing
        case Cancel => // Stop
      }
  66. Akka Persistence Cassandra replay:

      def asyncReplayMessages(persistenceId: String, fromSequenceNr: Long, toSequenceNr: Long,
          max: Long)(replayCallback: (PersistentRepr) => Unit): Future[Unit] = Future {
        new MessageIterator(persistenceId, fromSequenceNr, toSequenceNr, max)
          .foreach(msg => replayCallback(msg))
      }

      class MessageIterator(persistenceId: String, fromSequenceNr: Long, toSequenceNr: Long,
          max: Long) extends Iterator[PersistentRepr] {

        private val initialFromSequenceNr =
          math.max(highestDeletedSequenceNumber(persistenceId) + 1, fromSequenceNr)
        private val iter = new RowIterator(persistenceId, initialFromSequenceNr, toSequenceNr)
        private var mcnt = 0L
        private var c: PersistentRepr = null
        private var n: PersistentRepr = PersistentRepr(Undefined)
        fetch()

        def hasNext: Boolean = ...
        def next(): PersistentRepr = ...
        ...
      }
  69. class RowIterator(persistenceId: String, fromSequenceNr: Long, toSequenceNr: Long)
        extends Iterator[Row] {

        var currentPnr = partitionNr(fromSequenceNr)
        var currentSnr = fromSequenceNr
        var fromSnr = fromSequenceNr
        var toSnr = toSequenceNr
        var iter = newIter()

        def newIter() =
          session.execute(preparedSelectMessages.bind(persistenceId, currentPnr, fromSnr, toSnr)).iterator

        final def hasNext: Boolean =
          if (iter.hasNext) true
          else if (!inUse) false
          else {
            // Advance to the next partition and keep scanning from the last seen sequence number.
            currentPnr += 1
            fromSnr = currentSnr
            iter = newIter()
            hasNext
          }

        def next(): Row = {
          val row = iter.next()
          currentSnr = row.getLong("sequence_nr")
          row
        }
      }
  72. Non-blocking asynchronous replay:

      private[this] val queries: CassandraReadJournal = new CassandraReadJournal(
        extendedActorSystem,
        context.system.settings.config.getConfig("cassandra-query-journal"))

      override def asyncReplayMessages(
          persistenceId: String,
          fromSequenceNr: Long,
          toSequenceNr: Long,
          max: Long)(replayCallback: (PersistentRepr) => Unit): Future[Unit] =
        queries
          .eventsByPersistenceId(persistenceId, fromSequenceNr, toSequenceNr, max,
            replayMaxResultSize, None, "asyncReplayMessages")
          .runForeach(replayCallback)
          .map(_ => ())
  74. Benchmarks (charts: replay time for the blocking vs the asynchronous implementation, plus strong and weak scaling of time against the number of actors and threads)
  75. Alternative architecture (slides 75-76: diagram of a single log per node_id, e.g. node 0 holding persistence_id 0, events 0-3 interleaved with persistence_id 1, event 0 and persistence_id 2, event 0)
  77. (Slides 77-78: from the per-node logs, denormalized views are maintained: an events-by-tag view, e.g. tag 1 holding Id 0, event 1 and Id 2, event 1, an allIds view, and per-id event streams)
  79. Writing all views atomically in one logged batch:

      val boundStatements = statementGroup(eventsByPersistenceId, eventsByTag, allPersistenceIds)

      Future.sequence(boundStatements).flatMap { stmts =>
        val batch = new BatchStatement().setConsistencyLevel(...).setRetryPolicy(...)
        stmts.foreach(batch.add)
        session.underlying().flatMap(_.executeAsync(batch))
      }
  81. Alternatively, the events-by-persistence-id write executes separately from the batched view writes:

      val eventsByPersistenceIdStatement = statementGroup(eventsByPersistenceIdStatement)
      val boundStatements = statementGroup(eventsByTagStatement, allPersistenceIdsStatement)
      ...
      session.underlying().flatMap { s =>
        val ebpResult = s.executeAsync(eventsByPersistenceIdStatement)
        val batchResult = s.executeAsync(batch)
        ...
      }
  83. Event time processing ● Ingestion time, processing time, event time
  84. Ordering (slides 84-85: three events with keys 0-2 and event times 12:34:56, 12:34:57 and 12:34:58 arrive out of order as 1, 0, 2 and are reordered into 0, 1, 2 by event time)
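      A minimal sketch of event time reordering (not from the talk): buffer events until a watermark passes them, then release them sorted by event time. Assumes events carry their own timestamps and a bounded lateness:

        final case class Event(key: Int, eventTime: Long, value: Int)

        // Watermark = max seen event time - allowed lateness. Everything at or
        // below the watermark is released in event-time order; the rest waits.
        final class Reorderer(allowedLateness: Long) {
          private var buffer = Vector.empty[Event]
          private var maxSeen = Long.MinValue

          def offer(e: Event): Vector[Event] = {
            buffer :+= e
            maxSeen = math.max(maxSeen, e.eventTime)
            val watermark = maxSeen - allowedLateness
            val (ready, rest) = buffer.partition(_.eventTime <= watermark)
            buffer = rest
            ready.sortBy(_.eventTime)
          }
        }

      Events later than the allowed lateness are still released, just out of order; handling them (drop, side output, state update) is a policy decision.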
  86. Distributed causal stream merging (slides 86-97: the per-node logs are merged into one causally ordered stream per persistence id. The merger keeps a table of the next expected sequence number for each persistence id, delivers an event only when it matches the expectation, and replays a node's log from the start to fill in anything it missed. Slide 97 adds a durable (stream_id, seq) table recording the position reached in each node's log, so the merge can resume after failure)
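      A minimal, single-process sketch of the merging rule (the talk's real version is distributed): deliver an event only if its sequence number is the next expected one for its persistence id, buffer events from the future, and drop duplicates from the past.

        final case class Evt(persistenceId: String, seqNr: Long)

        final class CausalMerge {
          private var expected = Map.empty[String, Long].withDefaultValue(0L)
          private var pending = Map.empty[String, Vector[Evt]].withDefaultValue(Vector.empty)

          // Returns the events that became deliverable after accepting e.
          def accept(e: Evt): Vector[Evt] = {
            val pid = e.persistenceId
            if (e.seqNr < expected(pid)) Vector.empty // duplicate, discard
            else {
              pending += pid -> (pending(pid) :+ e)
              var out = Vector.empty[Evt]
              var next = pending(pid).find(_.seqNr == expected(pid))
              while (next.isDefined) {
                val ev = next.get
                out :+= ev
                pending += pid -> pending(pid).filterNot(_.seqNr == ev.seqNr)
                expected += pid -> (ev.seqNr + 1)
                next = pending(pid).find(_.seqNr == expected(pid))
              }
              out
            }
          }
        }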
  98. Exactly once delivery
  99. (Slides 99-103: diagram of the merged stream being delivered downstream; each delivered event is acknowledged, and an unacknowledged event, here Id 0, event 2, is retained and redelivered until its ACK arrives)
  104. (Diagram: checkpointing; a state backend records source offsets, e.g. Source 1: 6791, Source 2: 7252, Source 3: 5589, Source 4: 6843, operator state pointers and sink acknowledgements, so the pipeline can resume from a consistent snapshot)
  105. class KafkaSource(private var offsetManagers: Map[TopicAndPartition, KafkaOffsetManager])
         extends TimeReplayableSource {

         def open(context: TaskContext, startTime: Option[TimeStamp]): Unit = {
           fetch.setStartOffset(topicAndPartition, offsetManager.resolveOffset(time))
           ...
         }

         def read(batchSize: Int): List[Message]
         def close(): Unit
       }
  107. class DirectKafkaInputDStream[K, V, U <: Decoder[K]: ClassTag, T <: Decoder[V]: ClassTag, R](
           _ssc: StreamingContext,
           val kafkaParams: Map[String, String],
           val fromOffsets: Map[TopicAndPartition, Long],
           messageHandler: MessageAndMetadata[K, V] => R
         ) extends InputDStream[R](_ssc) with Logging {

         override def compute(validTime: Time): Option[KafkaRDD[K, V, U, T, R]] = {
           val untilOffsets = latestLeaderOffsets(maxRetries)
           ...
         }
       }
  109. Exactly once delivery ● Durable offset (slides 109-112: diagram of a log with offsets 0-4; the consumer durably records the offset of the last processed event and resumes from it after a restart, neither skipping nor reprocessing events)
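      A minimal sketch of the durable offset idea, with assumed names rather than a specific library: process an event, then record the offset; on restart, resume from the stored offset. Combined with idempotent (or transactional) processing this yields effectively exactly-once handling:

        import scala.concurrent.{ExecutionContext, Future}

        // Assumed store abstraction: offset persistence must be atomic with, or
        // idempotent relative to, the processing side effect.
        trait OffsetStore {
          def load(): Future[Long]
          def save(offset: Long): Future[Unit]
        }

        def run(store: OffsetStore,
                readFrom: Long => Iterator[(Long, String)], // (offset, event)
                process: String => Unit)(implicit ec: ExecutionContext): Future[Unit] =
          store.load().flatMap { start =>
            readFrom(start).foldLeft(Future.successful(())) { case (acc, (offset, event)) =>
              acc.flatMap { _ =>
                process(event)         // must be idempotent: may re-run after a crash
                store.save(offset + 1) // the next run resumes after this event
              }
            }
          }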
  113. Optimisation (diagram: a query plan with select, map and filter stages spread naively across many workers, each stage reading from the stream sources)
  114. (Slides 114-115: the same plan optimised; select and where stages are fused and pushed down towards, and finally into, the stream sources, reducing the number of workers and the data moved)
  116. val partitioner = partitionerClassName match {
         case "org.apache.cassandra.dht.Murmur3Partitioner" => Murmur3TokenFactory
         case "org.apache.cassandra.dht.RandomPartitioner" => RandomPartitionerTokenFactory
         case _ => throw new IllegalArgumentException(s"Unsupported partitioner: $partitionerClassName")
       }

       private def splitToCqlClause(range: TokenRange): Iterable[CqlTokenRange] = {
         if (range.end == tokenFactory.minToken)
           List(CqlTokenRange(s"token($pk) > ?", startToken))
         else if (range.start == tokenFactory.minToken)
           List(CqlTokenRange(s"token($pk) <= ?", endToken))
         else if (!range.isWrapAround)
           List(CqlTokenRange(s"token($pk) > ? AND token($pk) <= ?", startToken, endToken))
         else List(
           CqlTokenRange(s"token($pk) > ?", startToken),
           CqlTokenRange(s"token($pk) <= ?", endToken))
       }
  119. override def getPreferredLocations(split: Partition): Seq[String] =
         split.asInstanceOf[CassandraPartition].endpoints.flatMap(nodeAddresses.hostNames).toSeq

       override def getPartitions: Array[Partition] = {
         val partitioner = CassandraRDDPartitioner(connector, tableDef, splitCount, splitSize)
         val partitions = partitioner.partitions(where)
         partitions
       }

       override def compute(split: Partition, context: TaskContext): Iterator[R] = {
         val session = connector.openSession()
         val partition = split.asInstanceOf[CassandraPartition]
         val tokenRanges = partition.tokenRanges
         val metricsUpdater = InputMetricsUpdater(context, readConf)
         val rowIterator = tokenRanges.iterator.flatMap(fetchTokenRange(session, _, metricsUpdater))
         new CountingIterator(rowIterator, limit)
       }
  121. object PushPredicateThroughProject extends Rule[LogicalPlan] with PredicateHelper {
         def apply(plan: LogicalPlan): LogicalPlan = plan transform {
           case filter @ Filter(condition, project @ Project(fields, grandChild))
               if fields.forall(_.deterministic) =>
             val aliasMap = AttributeMap(fields.collect { case a: Alias => (a.toAttribute, a.child) })
             project.copy(child = Filter(replaceAlias(condition, aliasMap), grandChild))
         }
       }
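      An illustration of what this Catalyst rule achieves (not the talk's code; uses the SparkSession API purely for demonstration): a filter sitting on top of a projection is rewritten to run underneath it, after resolving any aliases the projection introduced, so less data flows upward.

        import org.apache.spark.sql.SparkSession

        val spark = SparkSession.builder().master("local[*]").appName("pushdown").getOrCreate()
        import spark.implicits._

        val df = Seq(0, 1, 2, 3).toDF("a")

        // Before optimisation the filter references the alias x produced by the
        // projection: Filter(x > 1) over Project(a AS x). The rule resolves the
        // alias (x -> a) and pushes the filter below, so these two optimise to
        // the same plan:
        val plan1 = df.select($"a".as("x")).filter($"x" > 1)
        val plan2 = df.filter($"a" > 1).select($"a".as("x"))
        plan1.explain(true) // optimized plan shows the filter below the project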
  123. Table and stream duality (slides 123-127: a stream of events, e.g. Id 0, Event 1 and Id 0, Event 2, folds into a table of current state, e.g. Id 0 -> State X; conversely, the table's change log is again a stream. A snapshot of the table for offset N, e.g. Id 0, Offset 123, State X, lets a consumer resume folding from N instead of from the beginning)
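      A minimal sketch of that duality, with assumed event and state types: state is a left fold over the event stream, and a snapshot is just that fold's value persisted together with the offset it covers.

        final case class Event(id: Int, payload: String)
        final case class Snapshot(offset: Long, state: Map[Int, String])

        // Folding a stream of events into a table of the latest state per id.
        def applyEvent(state: Map[Int, String], e: Event): Map[Int, String] =
          state + (e.id -> e.payload)

        // Resume from a snapshot: fold only the events after the snapshot offset.
        def recover(snapshot: Snapshot, eventsAfter: Seq[(Long, Event)]): Map[Int, String] =
          eventsAfter.foldLeft(snapshot.state) { case (s, (_, e)) => applyEvent(s, e) }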
  128. Infinite streams application (diagram: updates flow into the original table, the source of truth; a continuous stream applies a transformation function to maintain a cache / view / index / replica / system / service)
  129. (Diagram: internet services, devices and social feeds flow into Kafka; stream processing apps and stream consumers feed search, apps, services, databases and batch systems, with serialisation at the boundaries)
  130. Distributed systems (diagram: users, mobile clients and systems talk to a mesh of microservices built on CQRS/ES, relational and NoSQL storage)
  131. (Diagram: distributed machine learning; clients 1-3 hold model devices and input data, compute updates ΔP and exchange them with parameter devices holding the shared parameters P)
  132. Challenges ● All the solved problems ○ Exactly once delivery ○ Consistency ○ Availability ○ Fault tolerance ○ Cross service invariants and consistency ○ Transactions ○ Automated deployment and configuration management ○ Serialization, versioning, compatibility ○ Automated elasticity ○ No downtime version upgrades ○ Graceful shutdown of nodes ○ Distributed system verification, logging, tracing, monitoring, debugging ○ Split brains ○ ...
  133. Conclusion ● From request, response, synchronous, mutable state ● To streams, asynchronous messaging ● Production ready distributed systems
  134. MANCHESTER LONDON NEW YORK Questions
  135. MANCHESTER LONDON NEW YORK @zapletal_martin @cakesolutions 347 708 1518 enquiries@cakesolutions.net We are hiring http://www.cakesolutions.net/careers
