Event-driven applications
with streams and snapshots
(Apache Flink)
@StephanEwen
J on the Beach, 2017
1
2
Why it is a great idea to build
applications on top of a stream processor
Like (micro) services?
Or complete web sites?
Streaming and Streaming Processing
3
 First wave for streaming was lambda architecture
• Aid batch systems to be more real-time
 Second wave was analytics (real time and lag-time)
• Based on distributed collections, functions, and windows
 The next wave is much broader:
A new architecture for event-driven applications
4
Large
Distributed State
Time / Order /
Completeness
offers unique building blocks to handle
A streaming architecture
for event-driven applications
Matters of State
5
Building stateful applications…
6
… typically starts with picking a data store
… and a data model
… then thinking through consistency models and guarantees
… then worrying about read/write performance
… which may lead to rethinking the data model
and consistency model
… often gets complicated at scale
7
Image by Pablo Castilla
https://pablocastilla.wordpress.com/
A core problem in many architectures
8
database
layer
compute
layer
application state
+ historic state
Separation of compute and working state
All state access / update
is remote
Often requires heavy
caching layers
Expensive and (at least partially)
synchronous durability
A core problem in many architectures
9
database
layer
compute
layer
application state
+ historic state
Separation of compute and working state
All state access / update
is remote
Often requires heavy
caching layers
Expensive and (at least partially)
synchronous durability
All just to have the state or
working set fault tolerant
Event Sourcing + Memory Image
10
event log
persists events
(temporarily)
event /
command
Process
main memory
update local
variables/structures
periodically snapshot
the memory
Event Sourcing + Memory Image
11
Recovery: Restore snapshot and replay events
since snapshot
event log
persists events
(temporarily)
Process
Distributed Memory Image
12
Distributed application, many memory images.
Snapshots are all consistent together.
13
Apache Flink in a Nutshell
That is almost
Events, State, Time, and Snapshots
14
f(a,b)
Event-driven function
executed distributedly
Events, State, Time, and Snapshots
15
f(a,b)
Maintain fault tolerant local state similar to
any normal application
Main memory +
out of core (for maps)
Events, State, Time, and Snapshots
16
f(a,b)
wall clock
event time clock
Access and react to
notions of time and progress,
handle out-of-order events
Events, State, Time, and Snapshots
17
f(a,b)
wall clock
event time clock
Snapshot point-in-time
view for recovery,
rollback, cloning,
versioning, etc.
Stateful Event & Stream Processing
18
Source
Transformation
Transformation
Sink
val lines: DataStream[String] = env.addSource(new FlinkKafkaConsumer09(…))
val events: DataStream[Event] = lines.map((line) => parse(line))
val stats: DataStream[Statistic] = stream
.keyBy("sensor")
.timeWindow(Time.seconds(5))
.sum(new MyAggregationFunction())
stats.addSink(new RollingSink(path))
Streaming
Dataflow
Source Transform Window
(state read/write)
Sink
Stateful Event & Stream Processing
19
Source
Filter /
Transform
State
read/write
Sink
Stateful Event & Stream Processing
20
Scalable embedded state
Access at memory speed &
scales with parallel operators
Stateful Event & Stream Processing
21
Re-load state
Reset positions
in input streams
Rolling back computation
Re-processing
Stateful Event & Stream Processing
22
Restore to different
programs
Bugfixes, Upgrades, A/B testing, etc
Voila: A lightweight CQRS architecture
event log
write model +
building the read views
writes reads
Voila: A lightweight CQRS architecture
event log
writes reads
mirror the
state
write model +
building the read views
"Classical" versus
Streaming Architecture
25
Compute, State, and Storage
26
Classic tiered architecture Streaming architecture
database
layer
compute
layer
application state
+ backup
compute
+
stream storage
and
snapshot storage
(backup)
application state
Performance
27
synchronous reads/writes
across tier boundary
asynchronous writes
of large blobs
all modifications
are local
Classic tiered architecture Streaming architecture
Consistency
28
distributed transactions
at scale typically
at-most / at-least once
exactly once
per state
=1 =1snapshot consistency
across states
Classic tiered architecture Streaming architecture
Scaling a Service
29
separately provision additional
database capacity
provision compute
and state together
Classic tiered architecture Streaming architecture
provision compute
Rolling out a new Service
30
provision a new database
(or add capacity to an existing one)
provision compute
and state together
simply occupies some
additional backup space
Classic tiered architecture Streaming architecture
Repair External State
31
Streaming architecture
events
live application external state
wrong results
backed up data
(HDFS, S3, etc.)
Repair External State
32
Streaming architecture
live application external state
overwrite
with correct results
backed up data
(HDFS, S3, etc.)
application on backup input
events
Repair External State
33
Streaming architecture
live application external state
overwrite
with correct results
backed up date
(HDFS, S3, etc.)
Each application doubles as
a batch job!
application on backup input
events
Versioning the state of applications
34
Savepoint
Savepoint
Savepoint
App. A
App. B
App. C
Time
Savepoint
(It's) About Time
36
Time, Completeness, Out-of-order
37
?
event time clocks
define data completeness
event time timers
handle actions for
out-of-order data
Classic tiered architecture Streaming architecture
Time: Different Notions of Time
38
Event Producer Message Queue
Flink
Data Source
Flink
Window Operator
partition 1
partition 2
Event
Time
Ingestion
Time
Window
Processing
Time
Broker
Time
Time: Event Time Example
39
1977 1980 1983 1999 2002 2005 2015
Processing Time
Episode
IV
Episode
V
Episode
VI
Episode
I
Episode
II
Episode
III
Episode
VII
Event Time
Time: Watermarks
40
7
W(11)W(17)
11159121417122220 171921
Watermark
Event
Event timestamp
Stream (in order)
7
W(11)W(20)
Watermark
991011141517
Event
Event timestamp
1820 192123
Stream (out of order)
Time: Watermarks in Parallel
41
Source
(1)
Source
(2)
map
(1)
map
(2)
window
(1)
window
(2)
29
29
17
14
14
29
14
14
W(33)
W(17)
W(17)
A|30B|31
C|30
D|15
E|30
F|15G|18H|20
K|35
Watermark
Event Time
at the operator
Event
[id|timestamp]
Event Time
at input streams
Watermark
Generation
M|39N|39Q|44
L|22O|23R|37
Event–driven applications
42
Event-driven
Applications
Stream Processing
Batch Processing
Stateful, event-driven,
event-time-aware processing
(event sourcing, CQRS, …)
(streams, windows, …)
(data sets)
Layers of abstraction
43
Ogres have
layers
So do
squirrels
Apache Flink's Layered APIs
44
Process Function (events, state, time)
DataStream API (streams, windows)
Table API (dynamic tables)
Stream SQL
Stream- &
Batch Processing
Analytics
Stateful
Event-Driven
Applications
Process Function
45
class MyFunction extends ProcessFunction[MyEvent, Result] {
// declare state to use in the program
lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…)
def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = {
// work with event and state
(event, state.value) match { … }
out.collect(…) // emit events
state.update(…) // modify state
// schedule a timer callback
ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}
def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[Result]): Unit = {
// handle callback when event-/processing- time instant is reached
}
}
Data Stream API
46
val lines: DataStream[String] = env.addSource(
new FlinkKafkaConsumer09<>(…))
val events: DataStream[Event] = lines.map((line) => parse(line))
val stats: DataStream[Statistic] = stream
.keyBy("sensor")
.timeWindow(Time.seconds(5))
.sum(new MyAggregationFunction())
stats.addSink(new RollingSink(path))
Table API & Stream SQL
47
48
Streaming Analytics and
Event-driven applications move closer
to each other (and merge)
Stream Processing is a great way of
building applications.
Think of it as event-sourcing/CQRS
on steroids.
49
Thank you! 

Building Applications with Streams and Snapshots

  • 1.
    Event-driven applications with streamsand snapshots (Apache Flink) @StephanEwen J on the Beach, 2017 1
  • 2.
    2 Why it isa great idea to build applications on top of a stream processor Like (micro) services? Or complete web sites?
  • 3.
    Streaming and StreamingProcessing 3  First wave for streaming was lambda architecture • Aid batch systems to be more real-time  Second wave was analytics (real time and lag-time) • Based on distributed collections, functions, and windows  The next wave is much broader: A new architecture for event-driven applications
  • 4.
    4 Large Distributed State Time /Order / Completeness offers unique building blocks to handle A streaming architecture for event-driven applications
  • 5.
  • 6.
    Building stateful applications… 6 …typically starts with picking a data store … and a data model … then thinking through consistency models and guarantees … then worrying about read/write performance … which may lead to rethinking the data model and consistency model
  • 7.
    … often getscomplicated at scale 7 Image by Pablo Castilla https://pablocastilla.wordpress.com/
  • 8.
    A core problemin many architectures 8 database layer compute layer application state + historic state Separation of compute and working state All state access / update is remote Often requires heavy caching layers Expensive and (at least partially) synchronous durability
  • 9.
    A core problemin many architectures 9 database layer compute layer application state + historic state Separation of compute and working state All state access / update is remote Often requires heavy caching layers Expensive and (at least partially) synchronous durability All just to have the state or working set fault tolerant
  • 10.
    Event Sourcing +Memory Image 10 event log persists events (temporarily) event / command Process main memory update local variables/structures periodically snapshot the memory
  • 11.
    Event Sourcing +Memory Image 11 Recovery: Restore snapshot and replay events since snapshot event log persists events (temporarily) Process
  • 12.
    Distributed Memory Image 12 Distributedapplication, many memory images. Snapshots are all consistent together.
  • 13.
    13 Apache Flink ina Nutshell That is almost
  • 14.
    Events, State, Time,and Snapshots 14 f(a,b) Event-driven function executed distributedly
  • 15.
    Events, State, Time,and Snapshots 15 f(a,b) Maintain fault tolerant local state similar to any normal application Main memory + out of core (for maps)
  • 16.
    Events, State, Time,and Snapshots 16 f(a,b) wall clock event time clock Access and react to notions of time and progress, handle out-of-order events
  • 17.
    Events, State, Time,and Snapshots 17 f(a,b) wall clock event time clock Snapshot point-in-time view for recovery, rollback, cloning, versioning, etc.
  • 18.
    Stateful Event &Stream Processing 18 Source Transformation Transformation Sink val lines: DataStream[String] = env.addSource(new FlinkKafkaConsumer09(…)) val events: DataStream[Event] = lines.map((line) => parse(line)) val stats: DataStream[Statistic] = stream .keyBy("sensor") .timeWindow(Time.seconds(5)) .sum(new MyAggregationFunction()) stats.addSink(new RollingSink(path)) Streaming Dataflow Source Transform Window (state read/write) Sink
  • 19.
    Stateful Event &Stream Processing 19 Source Filter / Transform State read/write Sink
  • 20.
    Stateful Event &Stream Processing 20 Scalable embedded state Access at memory speed & scales with parallel operators
  • 21.
    Stateful Event &Stream Processing 21 Re-load state Reset positions in input streams Rolling back computation Re-processing
  • 22.
    Stateful Event &Stream Processing 22 Restore to different programs Bugfixes, Upgrades, A/B testing, etc
  • 23.
    Voila: A lightweightCQRS architecture event log write model + building the read views writes reads
  • 24.
    Voila: A lightweightCQRS architecture event log writes reads mirror the state write model + building the read views
  • 25.
  • 26.
    Compute, State, andStorage 26 Classic tiered architecture Streaming architecture database layer compute layer application state + backup compute + stream storage and snapshot storage (backup) application state
  • 27.
    Performance 27 synchronous reads/writes across tierboundary asynchronous writes of large blobs all modifications are local Classic tiered architecture Streaming architecture
  • 28.
    Consistency 28 distributed transactions at scaletypically at-most / at-least once exactly once per state =1 =1snapshot consistency across states Classic tiered architecture Streaming architecture
  • 29.
    Scaling a Service 29 separatelyprovision additional database capacity provision compute and state together Classic tiered architecture Streaming architecture provision compute
  • 30.
    Rolling out anew Service 30 provision a new database (or add capacity to an existing one) provision compute and state together simply occupies some additional backup space Classic tiered architecture Streaming architecture
  • 31.
    Repair External State 31 Streamingarchitecture events live application external state wrong results backed up data (HDFS, S3, etc.)
  • 32.
    Repair External State 32 Streamingarchitecture live application external state overwrite with correct results backed up data (HDFS, S3, etc.) application on backup input events
  • 33.
    Repair External State 33 Streamingarchitecture live application external state overwrite with correct results backed up date (HDFS, S3, etc.) Each application doubles as a batch job! application on backup input events
  • 34.
    Versioning the stateof applications 34 Savepoint Savepoint Savepoint App. A App. B App. C Time Savepoint
  • 35.
  • 36.
    Time, Completeness, Out-of-order 37 ? eventtime clocks define data completeness event time timers handle actions for out-of-order data Classic tiered architecture Streaming architecture
  • 37.
    Time: Different Notionsof Time 38 Event Producer Message Queue Flink Data Source Flink Window Operator partition 1 partition 2 Event Time Ingestion Time Window Processing Time Broker Time
  • 38.
    Time: Event TimeExample 39 1977 1980 1983 1999 2002 2005 2015 Processing Time Episode IV Episode V Episode VI Episode I Episode II Episode III Episode VII Event Time
  • 39.
    Time: Watermarks 40 7 W(11)W(17) 11159121417122220 171921 Watermark Event Eventtimestamp Stream (in order) 7 W(11)W(20) Watermark 991011141517 Event Event timestamp 1820 192123 Stream (out of order)
  • 40.
    Time: Watermarks inParallel 41 Source (1) Source (2) map (1) map (2) window (1) window (2) 29 29 17 14 14 29 14 14 W(33) W(17) W(17) A|30B|31 C|30 D|15 E|30 F|15G|18H|20 K|35 Watermark Event Time at the operator Event [id|timestamp] Event Time at input streams Watermark Generation M|39N|39Q|44 L|22O|23R|37
  • 41.
    Event–driven applications 42 Event-driven Applications Stream Processing BatchProcessing Stateful, event-driven, event-time-aware processing (event sourcing, CQRS, …) (streams, windows, …) (data sets)
  • 42.
    Layers of abstraction 43 Ogreshave layers So do squirrels
  • 43.
    Apache Flink's LayeredAPIs 44 Process Function (events, state, time) DataStream API (streams, windows) Table API (dynamic tables) Stream SQL Stream- & Batch Processing Analytics Stateful Event-Driven Applications
  • 44.
    Process Function 45 class MyFunctionextends ProcessFunction[MyEvent, Result] { // declare state to use in the program lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext().getState(…) def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = { // work with event and state (event, state.value) match { … } out.collect(…) // emit events state.update(…) // modify state // schedule a timer callback ctx.timerService.registerEventTimeTimer(event.timestamp + 500) } def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[Result]): Unit = { // handle callback when event-/processing- time instant is reached } }
  • 45.
    Data Stream API 46 vallines: DataStream[String] = env.addSource( new FlinkKafkaConsumer09<>(…)) val events: DataStream[Event] = lines.map((line) => parse(line)) val stats: DataStream[Statistic] = stream .keyBy("sensor") .timeWindow(Time.seconds(5)) .sum(new MyAggregationFunction()) stats.addSink(new RollingSink(path))
  • 46.
    Table API &Stream SQL 47
  • 47.
    48 Streaming Analytics and Event-drivenapplications move closer to each other (and merge) Stream Processing is a great way of building applications. Think of it as event-sourcing/CQRS on steroids.
  • 48.