Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Stefan Richter 
@stefanrrichter 
 
29.10.2016
A look at Flink 1.2 and beyond

Agenda
▪ Flink 1.2 feature overview & walkthrough
▪ Taking a closer look at two features:
▪ Queryable state
▪ Dynamic scaling

Feature Overview
Flink Release 1.2

Flink 1.1+ ongoing development
▪ Areas: Operations, Ecosystem, Application Features, Broader Audience
▪ Topics: Session Windows, (Stream) SQL, Library enhancements, Metric System, Metrics & Visualization, Dynamic Scaling, Savepoint compatibility, Checkpoints to savepoints, Connectors in Flink, Stream SQL Windows, Large state Maintenance, Fine grained recovery, Side in-/outputs, Window DSL, Security, Mesos & others, Dynamic Resource Management, Authentication, Queryable State, Apache Bahir connectors

Flink 1.2 Improvements
▪ Areas: Operations, Ecosystem, Application Features, Broader Audience
▪ Topics: Session Windows, (Stream) SQL, Library enhancements, Metric System, Metrics & Visualization, Dynamic Scaling, Savepoint compatibility, Checkpoints to savepoints, Connectors in Flink, Stream SQL Windows, Large state Maintenance, Fine grained recovery, Side in-/outputs, Window DSL, Security, Mesos & others, Dynamic Resource Management, Authentication, Queryable State, Apache Bahir connectors

Security / Authentication - Flink 1.2
Authorized data access
Secured clusters with Kerberos-based authentication
• Kafka, ZooKeeper, HDFS, YARN, HBase, …
Encrypted traffic between Flink Processes
• RPC, Data Exchange, Web UI, … - "SSL for all connections"
Largely contributed by
Prevent malicious users from hooking into Flink jobs

Cluster Management - Flink 1.1
Standalone
Flink on Yarn

Cluster Management - Flink 1.2
Mesos integration contributed by
Standalone
Flink on Yarn
Flink on Mesos

Cluster Management - Beyond 1.2
Efforts to seamlessly interoperate with various
cluster managers.
Generalized abstraction (FLIP-6).
Driven by and

Cluster Management - Beyond (ct’d)
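Without ResourceManager:
(1) TaskManager registers at JobManager
(2) JobManager deploys tasks
With Dispatcher and ResourceManager:
(0) Dispatcher starts JobManager
(1) JobManager requests slots from ResourceManager
(2) ResourceManager starts TaskManager
(3) TaskManager registers
(4) JobManager deploys tasks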

Metrics
▪ Rates
▪ Latency (operator)
▪ Visualization in WebUI

Savepoint / Checkpoint Robustness
▪ Resume job from checkpoints
▪ Use older checkpoint on failed recovery
▪ Skip failed checkpoints
▪ Backwards compatible: savepoints from 1.1 can be used in 1.2
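
The fallback to an older checkpoint can be pictured with a small, self-contained Java sketch. It does not use Flink classes: `CheckpointFallback`, `restoreLatest`, and the idea of modelling each checkpoint as a `Supplier` that throws when its data cannot be read are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper, not a Flink class.
class CheckpointFallback {
    // Tries the completed checkpoints from newest to oldest and returns the
    // first state that can be read. A checkpoint whose restore throws is
    // skipped and the next older one is tried.
    static <S> Optional<S> restoreLatest(List<Supplier<S>> oldestFirst) {
        for (int i = oldestFirst.size() - 1; i >= 0; i--) {
            try {
                return Optional.of(oldestFirst.get(i).get());
            } catch (RuntimeException restoreFailed) {
                // Fall back to the next older checkpoint.
            }
        }
        return Optional.empty(); // nothing restorable
    }
}
```

With checkpoints C1, C2, C3 where C3 cannot be read, this sketch would hand back the state of C2.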

Processing Function
Problem: Implement custom windowing?
[Figure: API stack with Stream SQL, Streaming API, Processing Function, Window Operator, Timer Handling]
Interface ProcessingFunction:
void flatMap(I value, Context ctx, Collector<O> out) throws Exception;
void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception;
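
As a rough illustration of how these two callbacks can express custom windowing, the sketch below counts elements per session and emits the count once no element has arrived for a given gap. Everything in it is a stand-in written for this example: the `Collector`, `Context`, and `OnTimerContext` interfaces (including `timestamp()` and `registerTimer()`), the `SessionCounter` function, and the `TimerHarness` driver are assumptions shaped after the slide's signatures, not Flink classes. The function released with Flink 1.2 is named `ProcessFunction` and uses `processElement` in place of `flatMap`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-ins shaped after the slide's signatures; not Flink classes.
interface Collector<O> { void collect(O record); }
interface Context { long timestamp(); void registerTimer(long time); }
interface OnTimerContext extends Context { }

interface ProcessingFunction<I, O> {
    void flatMap(I value, Context ctx, Collector<O> out) throws Exception;
    void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception;
}

// Custom session-style window: emit the element count once no element
// has arrived for `gap` time units.
class SessionCounter implements ProcessingFunction<String, Integer> {
    private final long gap;
    private int count = 0;
    private long lastSeen = 0;

    SessionCounter(long gap) { this.gap = gap; }

    @Override
    public void flatMap(String value, Context ctx, Collector<Integer> out) {
        count++;
        lastSeen = ctx.timestamp();
        ctx.registerTimer(lastSeen + gap); // one timer per element
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Integer> out) {
        // Only the timer of the session's last element closes the session.
        if (count > 0 && timestamp == lastSeen + gap) {
            out.collect(count);
            count = 0;
        }
    }
}

class SimpleContext implements OnTimerContext {
    private final long now;
    private final TreeSet<Long> timers;
    SimpleContext(long now, TreeSet<Long> timers) { this.now = now; this.timers = timers; }
    public long timestamp() { return now; }
    public void registerTimer(long time) { timers.add(time); }
}

// Minimal driver standing in for the runtime's timer handling.
class TimerHarness {
    static List<Integer> run(ProcessingFunction<String, Integer> fn, long... timestamps) {
        List<Integer> out = new ArrayList<>();
        TreeSet<Long> timers = new TreeSet<>();
        try {
            for (long ts : timestamps) {
                fireUpTo(ts, fn, timers, out); // time advances to ts
                fn.flatMap("event", new SimpleContext(ts, timers), out::add);
            }
            fireUpTo(Long.MAX_VALUE, fn, timers, out); // end of input
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return out;
    }

    private static void fireUpTo(long time, ProcessingFunction<String, Integer> fn,
                                 TreeSet<Long> timers, List<Integer> out) throws Exception {
        while (!timers.isEmpty() && timers.first() <= time) {
            long t = timers.pollFirst();
            fn.onTimer(t, new SimpleContext(t, timers), out::add);
        }
    }
}
```

Calling `TimerHarness.run(new SessionCounter(10), 1L, 5L, 30L, 32L)` groups the timestamps into two sessions of two elements each. State is kept in plain fields here for brevity; a real job would keep it in keyed state.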

Table API & Stream SQL
▪ Group-windows
Example:
table
.groupBy('user')
.window(Session withGap 10.minutes on 'rowtime')
.select('uid', 'product.count')
▪ More SQL operations
Example: EXISTS, VALUES, LIMIT
▪ More built-in scalar functions
Example: CURRENT_DATE, INITCAP, NULLIF
▪ More datatypes & better integration
Example:
pojo.get('field')
pojo.flatten()
▪ User-defined scalar functions
Example:
table.select('uid', parseName('userJson'))

Many more improvements…
▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different files w.r.t. user logic
▪ Detached execution: first step in programmatically controlled job
▪ Async IO operator: non-blocking queries to external systems
▪ Improved scalability, robustness + bugfixes

Queryable State - Motivation
Realtime queries: periodically (every second) flush new aggregates to Redis
[Figure: plot over number of keys]
Where is the bottleneck?
Writes to the key/value store take too long

Queryable State - Idea
"Stream processor as a database"
▪ Realtime queries
▪ Archive database: optional + only at end of windows

Queryable State - Performance
[Figure: plot over number of keys]

Queryable State - Implementation
Query: /job/operation/state-name/key
(1) Query Client asks the State Location Server (Job Manager) for the location of the "key-partition" for "operator" of "job"
(2) State Location Server looks up the location (ExecutionGraph)
(3) State Location Server responds with the location
(4) Query Client queries state-name and key at the State Registry of the Task Manager holding the local state
Task Managers: window() / sum() operators register their state at the State Registry
Job Manager: deploys tasks and tracks their status
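
The four lookup steps can be sketched in plain Java. This is not the Flink queryable state client: `TaskManagerStub`, `StateLocationServer`, `QueryClient`, and hash partitioning of keys over task managers are hypothetical names and assumptions used only to make the message flow concrete.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the boxes in the diagram; not Flink classes.
class TaskManagerStub {
    // State registry: state-name -> (key -> value), held as local state.
    private final Map<String, Map<String, Long>> registry = new HashMap<>();

    void register(String stateName, String key, long value) {
        registry.computeIfAbsent(stateName, n -> new HashMap<>()).put(key, value);
    }

    // Step (4): answer a query from local state, or null if unknown.
    Long query(String stateName, String key) {
        Map<String, Long> state = registry.get(stateName);
        return state == null ? null : state.get(key);
    }
}

class StateLocationServer {
    private final TaskManagerStub[] taskManagers;

    StateLocationServer(TaskManagerStub... taskManagers) {
        this.taskManagers = taskManagers;
    }

    // Steps (1)-(3): map the key to its partition and return that location.
    // Hash partitioning over task managers is an assumption of this sketch.
    TaskManagerStub locate(String key) {
        return taskManagers[Math.floorMod(key.hashCode(), taskManagers.length)];
    }
}

class QueryClient {
    private final StateLocationServer locationServer;

    QueryClient(StateLocationServer locationServer) {
        this.locationServer = locationServer;
    }

    Long query(String stateName, String key) {
        TaskManagerStub owner = locationServer.locate(key); // steps (1)-(3)
        return owner.query(stateName, key);                 // step (4)
    }
}
```

The point of the design is that the client only asks the location server where a key lives; the value itself is read directly from the task that owns it.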

Queryable State Enablers
▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the operators that create/update it
▪ State is continuous (not mini batched)
▪ State is scalable (e.g., embedded RocksDB state backend)

Motivation - Changing Workloads

Motivation - Resource Adaption
[Figure: workload and resources over time]

Basic Idea
• Spread work across more workers to decrease the workload per worker

Scaling Stateless Jobs
[Figure: Source, Mapper, Sink job, scaled up and scaled down]
• Scale up: Deploy new tasks
• Scale down: Cancel running tasks

Scaling Stateful Jobs
• Problem 1: Which state to assign to new task?
• Problem 2: Read + filter whole state?

Non-keyed vs Keyed State
Non-keyed:
• State bound only to operator
• E.g. Source state
Keyed:
• State bound to an operator + key
• E.g. Keyed UDF and window state
• "SELECT count(*) FROM t GROUP BY t.key"

Repartitioning Non-keyed state
[Figure: state granules #1 to #4]
Flink 1.1:
T snapshot()
void restore(T)
Flink 1.2:
List<T> snapshot()
void restore(List<T>)
Idea: break up state into finer granules that can be redistributed independently
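
A minimal sketch of what redistributing such granules could look like, assuming a simple round-robin policy. `ListStateRedistribution` and `redistribute` are hypothetical names, and the policy is an assumption of this example rather than something stated on the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Flink class.
class ListStateRedistribution {
    // Takes the List<T> snapshot of every old task and deals the granules
    // round-robin to the new tasks; each result list is what one new task
    // would receive in restore(List<T>).
    static <T> List<List<T>> redistribute(List<List<T>> snapshots, int newParallelism) {
        List<List<T>> perTask = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            perTask.add(new ArrayList<>());
        }
        int next = 0;
        for (List<T> snapshot : snapshots) {
            for (T granule : snapshot) {
                perTask.get(next % newParallelism).add(granule);
                next++;
            }
        }
        return perTask;
    }
}
```

Because each granule is self-contained, the same function covers scaling out (more lists than before) and scaling in (fewer lists than before).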

Example: Kafka Source Flink 1.1
partitionId: 1, offset: 42
Operator state is a black box. How to repartition?

Return a list of sub-states which can be freely repartitioned.

Scale Out

Scale In

Repartitioning Keyed State
▪ Split key space into key groups
▪ Every key falls into exactly one key group
▪ Assign key groups to tasks
[Figure: key space divided into key groups #1 to #4, each key lying in one key group]

Repartitioning Keyed State (ct’d)
▪ Rescaling changes key group assignment
▪ Maximum parallelism defined by #key groups
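
The two-level mapping (key to key group, key group to task) can be sketched as follows. `KeyGroups`, `keyGroupFor`, and `taskFor` are hypothetical names; using the plain `hashCode()` and assigning contiguous ranges of key groups to tasks are assumptions of this sketch, and key groups are numbered from 0 here rather than from #1 as in the figure.

```java
// Hypothetical helper, not a Flink class.
class KeyGroups {
    // Every key falls into exactly one key group. The number of key groups
    // is fixed for the lifetime of the job's state.
    static int keyGroupFor(Object key, int numKeyGroups) {
        return Math.floorMod(key.hashCode(), numKeyGroups);
    }

    // Key groups are assigned to tasks in contiguous ranges. Only this
    // mapping changes on rescaling; the key-to-group mapping stays the same.
    static int taskFor(int keyGroup, int numKeyGroups, int parallelism) {
        if (parallelism < 1 || parallelism > numKeyGroups) {
            throw new IllegalArgumentException(
                "parallelism must be between 1 and the number of key groups");
        }
        return keyGroup * parallelism / numKeyGroups;
    }
}
```

With 4 key groups, a parallelism of 2 gives each task two key groups and a parallelism of 4 gives each task one. A parallelism of 5 is rejected, which is the sense in which the number of key groups bounds the maximum parallelism.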

Current State in Flink 1.2
▪ Manual rescaling
1. Take savepoint
2. Restart job with adjusted parallelism and savepoint

Next Steps beyond Flink 1.2
▪ Rescaling individual operators w/o restart
▪ Refactor Flink deployment and process model (previously discussed)
▪ On-the-fly Scaling

Autoscaling Policies
• Latency
• Throughput
• Resource utilization
• Kubernetes on GCE, EC2 and Mesos (marathon-autoscale) already support auto-scaling

Conclusion
▪ Many great features in Flink 1.2
▪ Walkthrough
▪ Queryable State & Dynamic Scaling
▪ Glimpse beyond the 1.2 release

Thank you!
@stefanrrichter
@ApacheFlink
@dataArtisans
