Stream processing has traditionally been associated with real-time analytics. Modern stream processors like Apache Flink, however, go far beyond that and give us a new approach to building applications and services as a whole.
This talk shows how to build applications on *data streams*, *state*, and *snapshots* (point-in-time views of application state) using Apache Flink. Rather than separating computation (application) and state (database), Flink manages the application logic and state as a tight pair and uses snapshots for consistent views of the application and its state. With features like Flink's queryable state, the stream processor and database effectively become one.
This application pattern has many interesting properties: aside from having fewer moving parts, it supports very high event rates because of the tight integration between computation and state and its simple concurrency and recovery model. At the same time, it exposes a powerful consistency model, allows seamless forking/updating/rollback of online applications, generalizes across historic and real-time data, and easily incorporates event-time semantics and handling of late data. Finally, it allows applications to be defined concisely via streaming SQL.
2. Why it is a great idea to build applications on top of a stream processor
Like (micro)services?
Or complete web sites?
3. Streaming and Stream Processing
The first wave of streaming was the Lambda Architecture
• aiding batch systems to be more real-time
The second wave was analytics (real-time and lag-time)
• based on distributed collections, functions, and windows
The next wave is much broader:
a new architecture for event-driven applications
4. A streaming architecture for event-driven applications
offers unique building blocks to handle:
• large distributed state
• time / order / completeness
6. Building stateful applications…
… typically starts with picking a data store
… and a data model
… then thinking through consistency models and guarantees
… then worrying about read/write performance
… which may lead to rethinking the data model and consistency model
7. … often gets complicated at scale
Image by Pablo Castilla
https://pablocastilla.wordpress.com/
8. A core problem in many architectures
Separation of compute and working state: a compute layer on top of a database layer (application state + historic state).
• All state access / updates are remote
• Often requires heavy caching layers
• Expensive and (at least partially) synchronous durability
9. A core problem in many architectures
(same architecture as the previous slide)
All of this just to make the state or working set fault tolerant
10. Event Sourcing + Memory Image
An event log persists events (temporarily). On each event / command, the process updates local variables/structures in main memory, and periodically snapshots the memory.
11. Event Sourcing + Memory Image
Recovery: restore the latest snapshot and replay the events persisted in the event log since that snapshot.
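The recovery idea on this slide can be sketched in a few lines of Scala. This is a minimal, non-Flink illustration under simplifying assumptions: the event log is an in-memory buffer, the state is a map of keyed counters, and a snapshot is a (log position, state copy) pair. All names are illustrative.

```scala
object MemoryImageSketch {
  case class Event(key: String, delta: Long)

  // append-only event log (durable in a real system, in memory here)
  val log = scala.collection.mutable.ArrayBuffer.empty[Event]

  // the "memory image": plain local state, updated once per event
  var state = Map.empty[String, Long]

  // last snapshot: (position in the log, point-in-time copy of the state)
  var snapshot: (Int, Map[String, Long]) = (0, state)

  def process(e: Event): Unit = {
    log += e  // persist the event before applying it
    state = state.updated(e.key, state.getOrElse(e.key, 0L) + e.delta)
  }

  // periodically snapshot the memory: cheap, since the map is immutable
  def takeSnapshot(): Unit = snapshot = (log.size, state)

  // recovery: restore the snapshot, then replay events logged after it
  def recover(): Map[String, Long] =
    log.drop(snapshot._1).foldLeft(snapshot._2) { (s, e) =>
      s.updated(e.key, s.getOrElse(e.key, 0L) + e.delta)
    }
}
```

After a crash, `recover()` reproduces exactly the state the process had, because every event after the snapshot position is re-applied in log order.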
14. Events, State, Time, and Snapshots
f(a,b): an event-driven function, executed in a distributed fashion
15. Events, State, Time, and Snapshots
f(a,b) maintains fault-tolerant local state, similar to any normal application: main memory plus out-of-core storage (for maps)
16. Events, State, Time, and Snapshots
Access and react to notions of time and progress (wall clock, event-time clock); handle out-of-order events
17. Events, State, Time, and Snapshots
Snapshot a point-in-time view of the state for recovery, rollback, cloning, versioning, etc.
28. Consistency
Classic tiered architecture: distributed transactions; at scale, typically at-most-once / at-least-once.
Streaming architecture: exactly-once per state; snapshot consistency across states.
29. Scaling a Service
Classic tiered architecture: provision compute, and separately provision additional database capacity.
Streaming architecture: provision compute and state together.
30. Rolling out a new Service
Classic tiered architecture: provision a new database (or add capacity to an existing one).
Streaming architecture: provision compute and state together; the new service simply occupies some additional backup space.
32. Repair External State
Streaming architecture: run the application on the backed-up input events (HDFS, S3, etc.) and overwrite the external state written by the live application with the correct results.
33. Repair External State
(same setup as the previous slide)
Each application doubles as a batch job!
34. Versioning the state of applications
Diagram: App. A, App. B, and App. C running over time, with savepoints taken along the way.
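The savepoint timeline above can be mimicked with plain immutable maps: because a savepoint is an immutable point-in-time copy of the state, a new application version can start from it while the original keeps running. A conceptual sketch, not the Flink savepoint API:

```scala
object SavepointSketch {
  // App. A's running state; an immutable Map stands in for application state
  var appA = Map("count" -> 10L)

  // taking a "savepoint" is just keeping a reference: the map never mutates
  val savepoint: Map[String, Long] = appA

  // App. B forks from the savepoint and evolves independently
  var appB = savepoint.updated("count", savepoint("count") + 5L)

  // App. A keeps running; the savepoint itself is unaffected
  appA = appA.updated("count", appA("count") + 1L)
}
```

The same trick supports rollback (restart App. A from the savepoint) and A/B testing (run App. A and App. B side by side from the same starting state).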
36. Time, Completeness, Out-of-order
Classic tiered architecture: ?
Streaming architecture: event-time clocks define data completeness; event-time timers handle actions for out-of-order data.
37. Time: Different Notions of Time
Diagram: Event Producer → Message Queue → Flink Data Source → Flink Window Operator (two partitions), annotated with Event Time (at the producer), Broker Time (in the queue), Ingestion Time (at the source), and Window Processing Time (at the operator).
38. Time: Event Time Example
Processing time (release year): Episode IV (1977), Episode V (1980), Episode VI (1983), Episode I (1999), Episode II (2002), Episode III (2005), Episode VII (2015).
Event time: the in-story order of the episodes, I through VII.
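The episode example translates directly into code: the arrival order (processing time, i.e. release year) differs from the event-time order (episode number), and sorting by the timestamp carried in the event itself recovers the in-story order. A tiny sketch with illustrative names:

```scala
object EventTimeSketch {
  case class Episode(number: Int, releaseYear: Int)

  // arrival order = processing time (the release years on the slide)
  val releases = List(
    Episode(4, 1977), Episode(5, 1980), Episode(6, 1983),
    Episode(1, 1999), Episode(2, 2002), Episode(3, 2005),
    Episode(7, 2015))

  // event-time order: reorder by the timestamp inside the event
  val storyOrder = releases.sortBy(_.number)
}
```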
43. Apache Flink's Layered APIs
• Process Function (events, state, time)
• DataStream API (streams, windows)
• Table API (dynamic tables)
• Stream SQL
These layers span stream & batch processing, analytics, and stateful event-driven applications.
44. Process Function

class MyFunction extends ProcessFunction[MyEvent, Result] {

  // declare state to use in the program
  lazy val state: ValueState[CountWithTimestamp] =
    getRuntimeContext().getState(…)

  def processElement(event: MyEvent, ctx: Context, out: Collector[Result]): Unit = {
    // work with the event and state
    (event, state.value) match { … }

    out.collect(…)   // emit events
    state.update(…)  // modify state

    // schedule a timer callback
    ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
  }

  def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[Result]): Unit = {
    // handle the callback when the event-/processing-time instant is reached
  }
}
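The ProcessFunction above is elided (…) and depends on the Flink runtime. As a self-contained illustration of the same pattern, here is a hypothetical mini version with per-key state and event-time timers; the event-time clock simply advances with each event's timestamp, and due timers fire as it passes them. None of this is Flink API.

```scala
case class Ev(key: String, timestamp: Long)

class MiniProcessFunction(timeout: Long) {
  // per-key state: an event count (standing in for the elided ValueState)
  private var state = Map.empty[String, Long]
  // pending event-time timers: (fire time, key)
  private var timers = List.empty[(Long, String)]
  // output collected here, instead of a Collector
  var emitted = List.empty[(String, Long)]

  def processElement(e: Ev): Unit = {
    state = state.updated(e.key, state.getOrElse(e.key, 0L) + 1)
    // register an event-time timer, like registerEventTimeTimer(ts + 500)
    timers = (e.timestamp + timeout, e.key) :: timers
    // advance the event-time clock to this event's timestamp
    advanceEventTime(e.timestamp)
  }

  // fire every timer the event-time clock has passed (the onTimer analogue)
  private def advanceEventTime(watermark: Long): Unit = {
    val (due, pending) = timers.partition { case (t, _) => t <= watermark }
    timers = pending
    due.foreach { case (_, key) => emitted = (key, state(key)) :: emitted }
  }
}
```

The point of the sketch: output is driven not only by incoming events but also by the progress of (event) time, which is exactly what the timer service gives a real ProcessFunction.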
45. DataStream API

val lines: DataStream[String] = env.addSource(
  new FlinkKafkaConsumer09[String](…))

val events: DataStream[Event] = lines.map((line) => parse(line))

val stats: DataStream[Statistic] = events
  .keyBy("sensor")
  .timeWindow(Time.seconds(5))
  .aggregate(new MyAggregationFunction())

stats.addSink(new RollingSink(path))
47. Streaming analytics and event-driven applications move closer to each other (and merge).
Stream processing is a great way of building applications. Think of it as event sourcing / CQRS on steroids.