Streaming, Database &
Distributed Systems:
Bridging the Divide
Ben Stopford (@benstopford)
Codemesh 2016
Event Driven
Systems
Most stateful systems have to pull
from these three worlds
Today we have 2 goals
1.  Understand Stateful Stream
Processing (now & near future)
2.  Case for SSP as a general framework
for building data-centric systems.
Data systems come in
different forms
•  Database (OLTP)
•  Analytics Database (OLAP/Hadoop)
•  Messaging
•  Distributed log
•  Stream Processing
•  Stateful Stream Processing
Database (OLTP)
Focuses on providing a consistent view that
supports updates and queries on individual tuples.
Analytics Database (OLAP/Hadoop)
1.  Focuses on aggregations via table scans.
2.  Executes as distributed system
Messaging
Focuses on asynchronous information transfer with limited
state
Distributed Log
1.  Similar to messaging, but data can be retained
2.  Executes as distributed system (scale + fault tolerance)
Stream Processing
Manipulate concurrent streams of events
Comes from CEP background (ephemeral)
Stateful Stream Processing
Moves stream processing to be a more general
framework for building data-centric systems.
What is stream processing?
Data
Index
Query
Engine
Query
Engine
vs
Database
Finite source
Stream Processor
Infinite source
Infinite streams need
windows
How many items will we bring into the machine at
one time?
Windows bound a computation
Buffering allows us to handle
late events
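A minimal sketch of how a window bounds a computation over an unbounded stream, and how buffering lets late events still be counted. The class name `TumblingCounter` and the `allowed_lateness` parameter are illustrative assumptions, not the API of any particular engine.

```python
from collections import defaultdict


class TumblingCounter:
    """Counts events per fixed-size window and keeps each window buffered
    for `allowed_lateness` extra time units so late events can join it."""

    def __init__(self, size, allowed_lateness):
        self.size = size
        self.allowed_lateness = allowed_lateness
        self.open_windows = defaultdict(int)  # window start -> running count
        self.closed = {}                      # window start -> final count
        self.watermark = 0                    # highest event time seen so far

    def add(self, event_time):
        """Returns True if the event was counted, False if it was too late."""
        self.watermark = max(self.watermark, event_time)
        start = (event_time // self.size) * self.size
        if start + self.size + self.allowed_lateness <= self.watermark:
            return False  # the window has already been emitted
        self.open_windows[start] += 1
        self._close_expired()
        return True

    def _close_expired(self):
        # sorted() copies the keys, so popping inside the loop is safe
        for start in sorted(self.open_windows):
            if start + self.size + self.allowed_lateness <= self.watermark:
                self.closed[start] = self.open_windows.pop(start)
```

Only the open windows are held in memory, which is what keeps the computation bounded even though the source is infinite.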
Some query
Over some time window
Emitting at some frequency
Continually executing query
Stream(s)
Stream Processing Engine
Derived Stream
Avg(p.time – o.time)
From orders, payment
Group by payment.region
over 1 day window
emitting every second
Stream Processing
orders
payments
Completion time, by region
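The query on this slide can be sketched in plain Python. This is only an illustration of the semantics (join orders to payments, group by region, restrict to a time window); the function name and the record shapes are assumptions. A streaming engine would re-evaluate it continually, which is imitated here by calling the function with a new `now`.

```python
from collections import defaultdict

DAY = 24 * 60 * 60  # window length in seconds


def avg_completion_by_region(orders, payments, now, window=DAY):
    """orders: {order_id: order_time}
    payments: iterable of (order_id, region, payment_time)
    Returns {region: average of (payment_time - order_time)} for payments
    whose time falls in the window (now - window, now]."""
    totals = defaultdict(lambda: [0.0, 0])  # region -> [sum, count]
    for order_id, region, pay_time in payments:
        if pay_time <= now - window or pay_time > now:
            continue  # outside the window
        if order_id not in orders:
            continue  # no matching order (inner join)
        acc = totals[region]
        acc[0] += pay_time - orders[order_id]
        acc[1] += 1
    return {region: s / n for region, (s, n) in totals.items()}
```

Usage: `avg_completion_by_region(orders, payments, now=current_time)` called once per emit interval gives the derived stream of results.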
Avg(p.time – o.time)
From orders, payment
Group by payment.region
over 1 day window
emitting every second
Materialised View (DB)
Query
orders
payments
Completion time, by region
Avg(p.time – o.time)
From orders, payment, user
Group by user.region
over 1 day window
emitting every second
Stateful Stream Processing
Streams
Stream Processing Engine
Derived Stream
Query
Derived “Table”
Table
“View” is output as
table or stream
Table == Stream + Window (0 -> N)
A table is a stream with an infinite window (i.e. a buffer from 0 -> now)
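A minimal sketch of this equivalence: replaying a keyed stream from offset 0 and keeping the latest value per key yields the table. The function name is hypothetical, and treating a `None` value as a delete is an assumed convention for this sketch.

```python
def materialise(stream):
    """stream: iterable of (key, value) pairs in arrival order.
    Returns the table obtained by buffering the stream from offset 0 to now."""
    table = {}
    for key, value in stream:
        if value is None:
            table.pop(key, None)  # assumed convention: None deletes the key
        else:
            table[key] = value    # a later value overwrites an earlier one
    return table
```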
SSP is about creating
materialised views.
Materialised as a table, or
materialised as a stream
Features: similar to database query
engine
Join, Filter, Aggregate, View
Windowed
Streams
Can distribute over many machines
in two dimensions
Join, Filter, Aggregate, View (pipeline shown three times)
Scale Out Scale Forward
Stateful Stream Processing engines typically
use Kafka (a distributed commit log)
Join, Filter, Aggregate, View
Kafka (a distributed log)
A log is a very simple idea
Messages are added at the end of the log
Just think of the log as a file
Old New
Readers have a position & scan
Sally
is here
George
is here
Fred
is here
Old New
Scan Scan
Scan
Can “Rewind & Replay” the log
Rewind & Replay
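The log described on these slides can be sketched as an append-only list with independent reader positions. The `Log` and `Reader` classes are illustrative only; they show the idea, not Kafka's client API.

```python
class Log:
    """Append-only sequence of messages; the offset is the list index."""

    def __init__(self):
        self._entries = []

    def append(self, message):
        self._entries.append(message)
        return len(self._entries) - 1  # offset of the new message

    def read_from(self, offset):
        return self._entries[offset:]


class Reader:
    """Each reader tracks its own position, so readers never interfere."""

    def __init__(self, log, position=0):
        self.log = log
        self.position = position

    def poll(self):
        batch = self.log.read_from(self.position)
        self.position += len(batch)  # scan forward
        return batch

    def rewind(self, position=0):
        self.position = position     # the next poll() replays from here
```

Because reading never removes anything, "rewind and replay" is just resetting a reader's position.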
Compacted Log
(Tabular View)
Version 3
Version 2
Version 1
Version 2
Version 1
Version 5
Version 4
Version 3
Version 2
Version 1
Version 2
Version 3
Version 5
STREAM
(All versions)
COMPACTED STREAM
(Latest Key only)
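Compaction can be sketched as a pass that keeps only the most recent record for each key while preserving offsets and order. The function below is a simplified illustration with a hypothetical name; a real broker compacts in the background and segment by segment.

```python
def compact(log):
    """log: list of (key, value) pairs in offset order.
    Returns [(offset, key, value)] holding only the latest record per key."""
    latest_offset = {}
    for offset, (key, _value) in enumerate(log):
        latest_offset[key] = offset  # later records overwrite earlier ones
    return [
        (offset, key, value)
        for offset, (key, value) in enumerate(log)
        if latest_offset[key] == offset
    ]
```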
The log is a
Distributed System
For scalability and fault tolerance
Shard on the way in
Producers
Kafka
Consumers
Each shard is a queue
Producers
Kafka
Consumers
Producers
Kafka
Many consumers
share partitions
in one topic
Consumers share consumption of a
single topic
The Log reassigns data on failure
Producers
Kafka
Many consumers
share partitions in
one topic
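A rough sketch of the two ideas on these slides: messages are sharded by key on the way in, and partitions are shared among the live consumers and reassigned when one fails. The CRC32 hash and the round-robin assignment are stand-ins chosen for this example; they are not claimed to be the strategies Kafka itself uses.

```python
import zlib


def partition_for(key, num_partitions):
    """Stable hash of the key, so one key always lands in one partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(partitions, consumers):
    """Spread partitions round-robin over the currently live consumers."""
    live = sorted(consumers)
    if not live:
        raise ValueError("no live consumers")
    assignment = {consumer: [] for consumer in live}
    for i, partition in enumerate(sorted(partitions)):
        assignment[live[i % len(live)]].append(partition)
    return assignment
```

Calling `assign` again with the surviving consumers models the reassignment after a failure: every partition still has exactly one owner.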
Kafka supplies two levels of
leader election
Replicas in Kafka have
an elected leader
Consumers in Kafka
have an elected leader
The log is important for SSP
Maintains History: Acts like a “push based” distributed file system
The log is important: Two Primitives
Stream
Compacted Stream (‘table’)
The Log is, to a streaming
engine, what HDFS is to Hadoop
But it’s a bit more than an HDFS
replacement: Processors inherit the
idea of “membership” from the log
So stateful Stream Processors use
the Log
Join, Filter, Aggregate, View
Kafka (Distributed Log)
They also use local storage
Join, Filter, Aggregate, View
(1) a Kafka
(2) Local KV Store
Local KV store has a few uses
(1)  It caches streams on disk
(2) It caches “tables” on disk
Join, Filter, Aggregate, View
This makes join operations fast as they’re entirely local
Streams just cache recent
messages to help with joins
Tables are fully
“realised” locally
Stateful Stream Processing
stream
Compacted
stream
Join
Stream data
Stream-Tabular
Data
Infinite
Stream
Locally Cached
Table
(disk resident)
Kafka
Kafka Streams
e.g. Useful for Enrichment
stream
Compacted
stream
Join
Orders
Customers
Kafka
Kafka Streams
Local DB
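The enrichment pattern can be sketched as a stream-to-table join against a locally held table. A plain dict stands in for the disk-resident local store, and the topic names, field names and `run_join` function are assumptions made for this example.

```python
def run_join(events):
    """events: iterable of (topic, key, value) in arrival order.
    'customers' records update the local table; 'orders' records are
    enriched with the customer's current region (inner join)."""
    customers = {}  # local table, fed by the compacted customers stream
    enriched = []
    for topic, key, value in events:
        if topic == "customers":
            customers[key] = value  # latest version of the customer wins
        elif topic == "orders":
            customer = customers.get(value["customer_id"])
            if customer is not None:  # orders for unknown customers are dropped
                enriched.append(
                    {**value, "order_id": key, "region": customer["region"]}
                )
    return enriched
```

The lookup is a local read, which is why the join needs no network call per order.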
Aggregates need intermediary state
stream
Compacted
stream
Join
Orders
Customers
Kafka
Sum(orders)
group by region
Persist current value,
in case we fail
State store inherits durability from
the log
State store flushes
back to the log
Join, Filter, Aggregate, View
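A sketch of how a state store might inherit durability from the log: running aggregates are kept locally, flushed to a changelog, and rebuilt from that changelog after a failure. The `StateStore` class and its methods are hypothetical, and a Python list stands in for a compacted Kafka topic.

```python
class StateStore:
    """Local running sums per key, backed by a changelog for recovery."""

    def __init__(self, changelog):
        self.changelog = changelog  # list standing in for a compacted topic
        self.local = {}
        self.dirty = set()          # keys changed since the last flush

    def add(self, key, amount):
        self.local[key] = self.local.get(key, 0) + amount
        self.dirty.add(key)

    def flush(self):
        """Write the current value of every changed key back to the log."""
        for key in sorted(self.dirty):
            self.changelog.append((key, self.local[key]))
        self.dirty.clear()

    @classmethod
    def restore(cls, changelog):
        """Rebuild local state on a new node by replaying the changelog."""
        store = cls(changelog)
        for key, value in changelog:
            store.local[key] = value  # later entries overwrite earlier ones
        return store
```

Updates made after the last flush are not in the changelog, so a recovering node has to reprocess the corresponding input.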
Separate Data, Processing & View
View
Orders Payments View
View
Storage Layer
(a Kafka)
Processing & View
Query
You can query the views from
anywhere
View
Orders Payments View
View
Storage Layer
(a Kafka)
Processing & View
Query
So what happens on failure?
View
Orders Payments View
View
Storage Layer
(a Kafka)
Processing & View
Clustering Reroutes Data to
surviving node
View
Orders Payments View
View
Storage Layer
(Kafka)
Ownership of partitions is re-routed from dead node
Processing & View
But what about state?
View
Orders Payments View
View
Storage Layer
(Kafka)
“Cold” replica of state
takes over
Processing & View
Primitives for sharding &
replication
Stock
Orders Payments Stock
Stock
Redundant copies are
cached on other nodes
Sharding spreads data
over processors
So processors inherit much
from the log
Clustering comes
from the log
You just write the
functional bit
General framework for distributed, realtime data
computation
Protection from
broker failure
Protection from
engine failure
Join tables & streams
(in process)
Event Driven
Create views which
can be queried
Query
But stream
processing has a
problem
Correctness Guarantees in multi
layer topologies
Join, Filter, Aggregate, View (pipeline shown nine times)
Duplicates are a side effect of all at-least-once delivery mechanisms
Data is rerouted, on failure, which
can cause duplicates
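A small sketch of why redelivery matters for aggregates: a running sum is not idempotent, so applying a redelivered message twice changes the answer. The second function shows one common single-stage remedy, remembering the last applied input offset alongside the state. That remedy is an assumption added for illustration, not something the slides prescribe, and it only covers one partition of one stage.

```python
def naive_sum(deliveries):
    """deliveries: iterable of (offset, amount), possibly with repeats."""
    total = 0
    for _offset, amount in deliveries:
        total += amount  # a redelivered message is counted again
    return total


def offset_tracking_sum(deliveries):
    """Skips any message whose offset has already been applied."""
    total, last_applied = 0, -1
    for offset, amount in deliveries:
        if offset <= last_applied:
            continue     # redelivery after a failure: ignore it
        total += amount
        last_applied = offset
    return total
```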
Idempotence isn’t enough
Join, Filter, Aggregate, View (pipeline shown four times, plus a standalone Filter stage)
Distributed Snapshots*
(transactions)
Join, Filter, Aggregate, View (pipeline shown three times)
Transaction markers:
[Begin], [Prepare], [Commit], [Abort]
Buffer
Chandy, Lamport - Distributed Snapshots: Determining Global States of Distributed Systems
*In development in Kafka
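A heavily simplified sketch of how a reader could use transaction markers: records between [Begin] and [Commit] are buffered and released only on commit, and records ending in [Abort] are discarded. It assumes a single writer with non-overlapping transactions and omits the [Prepare] step. It illustrates the buffering idea only and is not the protocol Kafka implements.

```python
BEGIN, COMMIT, ABORT = "[Begin]", "[Commit]", "[Abort]"


def read_committed(log):
    """log: list of records and markers in offset order.
    Returns only the records that belong to committed transactions
    (or were written outside any transaction)."""
    visible, buffer, in_txn = [], [], False
    for record in log:
        if record == BEGIN:
            buffer, in_txn = [], True
        elif record == COMMIT:
            visible.extend(buffer)     # release the whole transaction at once
            buffer, in_txn = [], False
        elif record == ABORT:
            buffer, in_txn = [], False # drop everything that was buffered
        elif in_txn:
            buffer.append(record)      # hold back until the outcome is known
        else:
            visible.append(record)     # non-transactional record
    return visible
```

Records from a transaction that is still open at the end of the log stay buffered and are never exposed.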
So why use these
tools?
(1) Streaming is a
superset of batch
Databases look backwards
Batch == Streaming from offset 0
Query
Query
Query
Distributed File
System (HDFS)
Query
Query
Query
Distributed Log
(Kafka)
MPP Batch System MPP Streaming System
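The claim "Batch == Streaming from offset 0" can be illustrated with one function used both ways: once over the whole log, and once record by record with the state carried forward. The function names and the word-count style aggregate are assumptions chosen for the example.

```python
def count_by_key(records, state=None):
    """Folds records into a {key: count} state and returns the new state."""
    state = dict(state or {})
    for key in records:
        state[key] = state.get(key, 0) + 1
    return state


def run_as_batch(log):
    return count_by_key(log)  # one pass over the finite data set


def run_as_stream(log):
    state = {}
    for offset in range(len(log)):  # start at offset 0, one record at a time
        state = count_by_key(log[offset:offset + 1], state)
    return state
```

Both runs produce the same result; the streaming run simply has an answer available after every record and can keep going as the log grows.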
Streaming is the superset of batch
Streaming
Batch
Database
Global, Linearisable
consistency model
(2) Separates store & view
“Engine” part is lightweight
but stateful
Storage
Just a Java process
which uses a library
Log handles fault
tolerance of both layers
Separates Concerns of
Model & View – Think MVC
Storage
View & Controller
Model
Physically Separates Read &
Write – Think CQRS
Storage
View & Controller
Model
Database vs SSP
Data
Index
Query
Engine
Query
Engine
vs
Database Stateful Stream Processor
Query
Query
View
Index Data
(3) Decentralised approaches
are more general
Rather than pushing processing
into an “appliance”
(code -> data)
Centralised Processing
App
Data-centric Architecture
Distributed
Log
Decentralised Processing over many
user-specific views
This is more general
than just
analytics use cases
It’s more than taking a
database and adding push
notifications
Whether you’re building a hulking,
multistage, analytic platform
Query
Final View
Intermediary View (2)
Intermediary View (1)
Or a simple microservice that
needs to run hot-hot & scale
Business Logic
Manage local
state
Join various
streams
Hot secondary
instance
Composable Primitives
Declarative
Function
Traditional DB
Work
Distribution
Replication
Sharding
Query
Engine
Distributed DB Distributed Systems
Membership
Global
Consistency
General framework for distributed, event-
driven data computation
Protection from
broker failure
Protection from
engine failure
Join tables & streams
(in process)
Event Driven
Create views which
can be queried
Query
Stateful Stream Processing
A framework for building streaming data
systems, just for you
Find out more:
•  http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple/
•  https://martin.kleppmann.com/2015/02/11/database-inside-out-at-salesforce.html
•  http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
•  https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/cidr07p42.pdf
•  http://highscalability.com/blog/2015/5/4/elements-of-scale-composing-and-scaling-data-platforms.html
•  https://speakerdeck.com/bobbycalderwood/commander-decoupled-immutable-rest-apis-with-kafka-streams
•  https://timothyrenner.github.io/engineering/2016/08/11/kafka-streams-not-looking-at-facebook.html
•  https://www.madewithtea.com/processing-tweets-with-kafka-streams.html
•  http://www.infolace.com/blog/2016/07/14/simple-spatial-windowing-with-kafka-streams/
•  http://www.slideshare.net/zacharycox/updating-materialized-views-and-caches-using-kafka
The end
@benstopford
http://benstopford.com
