Piotr Nowojski
@PiotrNowojski
piotr@data-artisans.com
Apache Flink
Better, Faster & Uncut
Big Data Warsaw 2018
Original creators of Apache Flink®
Providers of the dA Platform 2, a supported Flink distribution
This will be about ...
● What is Apache Flink?
● What can I do with it?
● What has recently changed?
Stateful Stream Processing
“Why” and “What is” ...
Stream Processing
(Diagram: your code processes records one at a time)
A long-running computation on an endless stream of input
Distributed Stream Processing
(Diagram: multiple parallel instances of your code, each consuming a partition of the input stream)
● Partitions input streams by some key in the data
● Distributes computation across multiple instances
● Each instance is responsible for some key range
Stateful Stream Processing
(Diagram: two parallel instances of your code; each instance keeps its own local sum)

    var sum = 0

    def map(element):
        sum += element
        return sum
Stateful Stream Processing
(Diagram: parallel instances of your code, each with an embedded local state backend)
● Embedded local state backend
● State co-partitioned with the input stream by key (see the sketch below)
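As an illustration, the running-sum logic above could be expressed with Flink's keyed state API roughly as follows. This is a minimal sketch: the class and state names are made up, and the function must be applied to a stream keyed by some field so that the state is scoped per key.

    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.util.Collector;

    // Keeps a running sum per key in Flink-managed keyed state.
    public class SumFunction extends RichFlatMapFunction<Long, Long> {

        private transient ValueState<Long> sum;

        @Override
        public void open(Configuration parameters) {
            // Register the state with the embedded local state backend.
            sum = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("sum", Long.class));
        }

        @Override
        public void flatMap(Long element, Collector<Long> out) throws Exception {
            Long current = sum.value();                 // null if nothing stored yet
            long updated = (current == null ? 0L : current) + element;
            sum.update(updated);
            out.collect(updated);
        }
    }

Because the state is co-partitioned with the input by key, each parallel instance only ever sees and stores the sums for its own key range.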
State fault tolerance
Fault tolerance concerns for a stateful stream processor:
● Which guarantees: no guarantees, at-least-once, or exactly-once?
● How to ensure exactly-once semantics for the state?
● How to create consistent snapshots of distributed state?
● More importantly, how to do it efficiently without interrupting the computation?
State fault tolerance
(Diagram: parallel operator instances, each with its own local state)
● Consistent snapshotting:
State fault tolerance
● Consistent snapshotting: Checkpoint
(Diagram: each operator instance copies its checkpointed state to a Distributed File System)
State fault tolerance
Restore:
(Diagram: each operator instance reloads its checkpointed state from the Distributed File System)
● Recover all embedded state
● Reset the position in the input stream (see the checkpointing configuration sketch below)
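A minimal sketch of enabling this checkpoint/restore behavior in a job. The interval, the choice of FsStateBackend, and the checkpoint path are illustrative assumptions, not values from the talk.

    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.CheckpointingMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointingSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Take a consistent snapshot of all operator state every 10 seconds.
            env.enableCheckpointing(10_000);
            env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);

            // Persist the snapshots to a distributed file system (path is hypothetical).
            env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));

            // ... build and execute the job here ...
        }
    }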
Apache Flink®
Apache Flink Stack
● Libraries (on top of the APIs)
● DataStream API: Stream Processing
● DataSet API: Batch Processing
● Runtime: Distributed Streaming Data Flow
Streaming and batch as first class citizens.
Programming Model
(Diagram: a dataflow graph of sources, stateful transformations/computations, and sinks)
API and Execution
DataStream<String> lines = env.addSource(new FlinkKafkaConsumer010(…));

DataStream<Event> events = lines.map(line -> parse(line));

DataStream<Statistic> stats = events
    .keyBy("id")
    .timeWindow(Time.seconds(5))
    .aggregate(new MyAggregationFunction());

stats.addSink(new BucketingSink(path));
(Diagram: the resulting streaming dataflow, with two parallel instances each of Source, map(), and keyBy()/window()/apply(), and one Sink instance)
Levels of abstraction
● Process Function (events, state, time): low-level (stateful stream processing) (sketch below)
● DataStream API (streams, windows): stream processing & analytics
● Table API (dynamic tables): declarative DSL
● Stream SQL: high-level language
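To illustrate the lowest level, here is a rough sketch of a Process Function that combines events, state, and time. All names and the 60-second timer are made up, and it must run on a keyed stream so the state is scoped per key.

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;

    // Counts events per key and emits the count one minute after an event arrives.
    public class CountWithTimeout extends ProcessFunction<String, String> {

        private transient ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("count", Long.class));
        }

        @Override
        public void processElement(String value, Context ctx, Collector<String> out)
                throws Exception {
            Long current = count.value();
            count.update(current == null ? 1L : current + 1);

            // Time: register a processing-time timer that fires 60 seconds from now.
            ctx.timerService().registerProcessingTimeTimer(
                    ctx.timerService().currentProcessingTime() + 60_000);
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out)
                throws Exception {
            out.collect("count so far: " + count.value());
        }
    }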
End to end exactly-once
Exactly-once
● End to end exactly-once
○ before Flink 1.4: only for writing to files
○ since Flink 1.4: also for Pravega and Kafka (see the producer sketch below)
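For example, attaching a Kafka 0.11 sink with exactly-once semantics might look roughly like this. The topic, properties, and serialization wrapper are assumptions, and constructor overloads differ slightly between connector versions; the relevant knob is the EXACTLY_ONCE semantic.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;
    import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

    public class ExactlyOnceKafkaSink {

        // Attach a transactional Kafka 0.11 sink to an existing stream.
        public static void addExactlyOnceSink(DataStream<String> stream) {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");
            // Kafka transactions must be allowed to outlive the checkpoint
            // interval (the value here is only illustrative).
            props.setProperty("transaction.timeout.ms", "600000");

            stream.addSink(new FlinkKafkaProducer011<String>(
                    "output-topic",                                   // hypothetical topic
                    new KeyedSerializationSchemaWrapper<String>(new SimpleStringSchema()),
                    props,
                    FlinkKafkaProducer011.Semantic.EXACTLY_ONCE));    // transactional writes
        }
    }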
Exactly-once two-phase commit
(Diagram: Kafka (external system) → Data Source → Operator → Data Sink → Kafka (external system), alongside the State Backend and the Job Manager)
Exactly-once two-phase commit
Pre-commit (checkpoint starts):
(Diagram: the Job Manager injects a checkpoint barrier at the Data Source (1))
Exactly-once two-phase commit
Pre-commit without external state:
(Diagram: the Data Source snapshots its Kafka offsets (2) and passes the checkpoint barrier downstream (2))
Exactly-once two-phase commit
Pre-commit the second operator, which has no external state:
(Diagram: the Operator snapshots its state (3) and passes the checkpoint barrier downstream (3))
Exactly-once two-phase commit
Pre-commit with external state in the data sink:
(Diagram: the Data Sink pre-commits its external transaction (4) and snapshots its state (4))
Exactly-once two-phase commit
(Diagram: once the checkpoint is complete, the Job Manager notifies all operators that the checkpoint has completed (1))
Exactly-once two-phase commit
(Diagram: upon the checkpoint-completed notification (1), the Data Sink commits its external transaction (2))
Exactly-once
● End to end exactly-once requires support for transactional writes from the external systems (see the sketch below)
● Kafka supports transactional writes only since version 0.11 (released in the second half of 2017)
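Flink 1.4 ships a TwoPhaseCommitSinkFunction base class that implements the protocol above. The snippet below is only a simplified, self-contained illustration of the same idea, not that exact API; all names in it are made up.

    import java.util.ArrayList;
    import java.util.List;

    // Simplified illustration of a sink following the two-phase-commit protocol.
    public class TwoPhaseCommitSketch {

        // A toy "transaction": records are buffered until they are committed.
        static class Transaction {
            final List<String> buffered = new ArrayList<>();
        }

        private Transaction current = new Transaction();

        // Normal processing: write records into the currently open transaction.
        void invoke(String record) {
            current.buffered.add(record);
        }

        // Phase 1: called when the checkpoint barrier reaches the sink.
        // Flush everything, but do not make it visible to consumers yet.
        Transaction preCommit() {
            Transaction preCommitted = current;
            current = new Transaction();   // open a fresh transaction for new records
            return preCommitted;
        }

        // Phase 2: called when the Job Manager notifies that the checkpoint
        // completed on all operators. Only now does the data become visible.
        void commit(Transaction transaction) {
            transaction.buffered.forEach(System.out::println);
        }

        // Called if the checkpoint fails: discard the pre-committed data.
        void abort(Transaction transaction) {
            transaction.buffered.clear();
        }
    }

The real base class additionally stores the handles of open and pre-committed transactions in operator state, so that after a failure they can still be committed or aborted.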
Large state
Large state
● Large state (multiple GBs per machine) makes checkpointing take too long
● Large state often changes only a little between checkpoints
● Long recovery times
Flink State and Distributed Snapshots
(Diagram: a Source feeds events to a Stateful Operation backed by a State Backend)
Flink State and Distributed Snapshots
(Diagram: trigger checkpoint: a checkpoint barrier is injected at the Source)
Flink State and Distributed Snapshots
(Diagram: take state snapshot: when the barrier reaches the Stateful Operation, its state snapshot is triggered synchronously, e.g. via copy-on-write)
Flink State and Distributed Snapshots
"Asynchronous Snapshotting"
(Diagram: the processing pipeline continues while full snapshots are durably persisted to the DFS asynchronously)
Asynchronous Checkpoints
● Minimize pipeline stall time while taking the snapshot
● Keep overhead (memory, CPU,…) as low as possible while writing
the snapshot
● Support multiple parallel checkpoints
Incremental checkpointing
● Large state with small changes over time
● Efficiently detect the (minimal) set of state changes between two checkpoints
○ copy-on-write
● Persist only the difference (see the configuration sketch below)
○ faster snapshots
○ slower recovery (replaying the differences)
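A sketch of enabling this with the RocksDB state backend, whose constructor exposes an incremental-checkpointing flag; the checkpoint URI is an assumption.

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class IncrementalCheckpointsSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // The second argument enables incremental checkpoints: only the
            // RocksDB files changed since the last checkpoint are uploaded.
            env.setStateBackend(
                    new RocksDBStateBackend("hdfs:///flink/checkpoints", true));

            env.enableCheckpointing(10_000);
            // ... build and execute the job here ...
        }
    }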
Local recovery - checkpointing
Checkpointing the state of the operators
(Diagram: the Stateful Operation takes copy-on-write snapshots and persists them to the DFS)
Local recovery - checkpointing
Checkpointing the state of the operators
(Diagram: full snapshots are durably persisted to the DFS asynchronously, and additionally stored on local disks)
Local recovery - restoring
Restoring the state of the operators
(Diagram: by default state is recovered from the DFS, but it is recovered from local state whenever possible)
Local recovery
● Higher cost of asynchronous snapshotting
○ If the system is not overloaded, it does not affect latency/throughput
● Faster recovery (see the configuration sketch below)
○ Smaller downtime
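Task-local recovery is a configuration switch in Flink 1.5; a sketch of the relevant flink-conf.yaml entry (key name as documented for 1.5, so treat it as an assumption):

    # flink-conf.yaml: keep a local copy of each task's state snapshot
    # and prefer it during recovery
    state.backend.local-recovery: true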
Low latency
Challenges in Streaming
● Streaming is very different from batch and micro-batch processing
● Providing high throughput together with low latency is difficult
Challenges in Streaming
(Diagram: your code processes records one at a time)
Batching records hides the per-record hand-over costs, in exchange for higher latency (see the sketch below).
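Flink exposes this throughput/latency trade-off directly; a sketch using the network buffer timeout (the 5 ms value is only an illustration):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class BufferTimeoutSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Network buffers are flushed either when full (max throughput)
            // or after this timeout (bounds the added latency). Lower values
            // mean lower latency at the cost of some throughput.
            env.setBufferTimeout(5);

            // ... build and execute the job here ...
        }
    }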
Network changes in Flink 1.5
Conclusion
TL;DR
● Stateful stream processing as a paradigm for continuous
data
● Apache Flink is a sophisticated and battle-tested
stateful stream processor with a comprehensive set of
features
● Efficiency, management, and operational issues for state
are taken very seriously
Thank you!
@PiotrNowojski
@ApacheFlink
@dataArtisans
We are hiring!
data-artisans.com/careers
