Introduction to
Apache Flink™
Kostas Tzoumas
@kostas_tzoumas
2
Flink is a stream processor with many faces
Streaming dataflow runtime
3
// Transitive closure: repeatedly join the discovered paths with the
// edge set to extend them by one hop, then deduplicate (10 iterations).
case class Path(from: Long, to: Long)

val tc = edges.iterate(10) { paths: DataSet[Path] =>
  val next = paths
    .join(edges)
    .where("to")
    .equalTo("from") { (path, edge) =>
      Path(path.from, edge.to)
    }
    .union(paths)
    .distinct()
  next
}
[Diagram: Flink's internal execution model. Pre-flight (client): type extraction stack, optimizer, dataflow graph and metadata. JobManager: task scheduling, deploys operators, tracks intermediate results. TaskManagers execute the physical plan, e.g. DataSource orders.tbl → Filter → Map and DataSource lineitem.tbl feeding a Hybrid Hash Join (build hash table / probe side, hash-partitioned on field [0]), followed by a sort-based GroupReduce.]
Flink's internal execution model
4
Flink execution model
 A program is a DAG of operators
 Operators = computation + state
 Operators produce intermediate results = logical streams of records
 Other operators can consume those
5
[Diagram: a dataflow DAG in which a map and a join feed a sum; the operators' intermediate results are labeled ID1, ID2, and ID3.]
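The DAG model above can be sketched in plain Scala. This is a toy illustration using only the standard library, not Flink's API: operators are functions over logical streams of records, and each one produces a named intermediate result that other operators may consume.

```scala
// Toy model (not Flink's API): an operator turns input streams of
// records into a new, named intermediate result.
case class IntermediateResult[T](id: String, records: Seq[T])

// A map operator: pure computation, no state.
def mapOp[A, B](in: IntermediateResult[A], id: String)(f: A => B): IntermediateResult[B] =
  IntermediateResult(id, in.records.map(f))

// A sum operator: computation plus state (the running total).
def sumOp(in: IntermediateResult[Int], id: String): IntermediateResult[Int] = {
  var state = 0 // operator state
  in.records.foreach(state += _)
  IntermediateResult(id, Seq(state))
}

// Wire a tiny DAG: source -> map (ID1) -> sum (ID2); any other
// operator could consume ID1 again.
val id1 = mapOp(IntermediateResult("source", Seq(1, 2, 3)), "ID1")(_ * 2)
val id2 = sumOp(id1, "ID2")
```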
6
A map-reduce job with Flink
[Diagram: the JobManager turns the program into an ExecutionGraph: map tasks M1 and M2 produce result partitions RP1 and RP2, which are consumed by reduce tasks R1 and R2, spread across TaskManager 1 and TaskManager 2; numbered steps (1, 2, 3a/3b, 4a/4b, 5a/5b) show deployment and data exchange.]
One runtime for batch and streaming

                Pipelined               Blocked
Ephemeral       Stream data shuffles    Batch data shuffles
Checkpointed    Caching for recovery or reuse
7
Pipelining
8
Basic building block to "keep the data moving"
Note: pipelined systems do not usually transfer individual tuples, but buffers that batch several tuples!
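The note above can be made concrete with a toy sketch in plain Scala (an illustration, not Flink's network stack): a channel that buffers tuples and ships them downstream in batches.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy pipelined channel: tuples are not shipped one by one but
// collected into fixed-size buffers that are flushed downstream.
class BufferedChannel[T](capacity: Int, downstream: Seq[T] => Unit) {
  private val buffer = ArrayBuffer.empty[T]

  def send(t: T): Unit = {
    buffer += t
    if (buffer.size == capacity) flush() // keep the data moving
  }

  def flush(): Unit = if (buffer.nonEmpty) {
    downstream(buffer.toSeq)
    buffer.clear()
  }
}

val received = ArrayBuffer.empty[Seq[Int]]
val channel = new BufferedChannel[Int](3, batch => received += batch)
(1 to 7).foreach(channel.send) // ships two full buffers of three
channel.flush()                // remainder goes out on flush
```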
Streaming fault tolerance
 Ensure that operators see all events
• "At least once"
• Solved by replaying a stream from a checkpoint, e.g., from a past Kafka offset
 Ensure that operators do not perform duplicate updates to their state
• "Exactly once"
• Several solutions
9
Exactly once approaches
 Discretized streams (Spark Streaming)
• Treat streaming as a series of small atomic computations
• "Fast track" to fault tolerance, but restricts the computational and programming model (e.g., cannot mutate state across "mini-batches", window functions correlated with mini-batch size)
 MillWheel (Google Cloud Dataflow)
• State update and derived events committed as an atomic transaction to a high-throughput transactional store
• Requires a very high-throughput transactional store
 Chandy-Lamport distributed snapshots (Flink)
10
11
JobManager
[Figure: a checkpoint barrier is injected into the stream and registered on the master; after a failure, replay will start from here.]
12
JobManager
[Figure: barriers flow through the dataflow, "pushing" prior events ahead of them (assumes in-order delivery in individual channels). Legend: operator checkpointing starting / in progress / finished.]
13
JobManager
Operator checkpointing takes a snapshot of the state after acknowledged data have updated the state. Checkpoints are currently one-off and synchronous; incremental and asynchronous checkpointing is work in progress.
State backup: a pluggable mechanism, currently either the JobManager (for small state) or a file system (HDFS/Tachyon); in-memory grids are work in progress.
14
JobManager
State snapshots at the sinks signal the successful end of this checkpoint. At failure, recover the last checkpointed state and restart the sources from the last barrier; this guarantees at least once.
State backup
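The barrier mechanism of the last few slides can be illustrated with a toy simulation in plain Scala (not Flink's internals): a barrier in an in-order channel separates pre-checkpoint from post-checkpoint events, so the snapshot covers exactly the events that arrived before the barrier.

```scala
// Toy Chandy-Lamport-style snapshot on a single in-order channel.
sealed trait Msg
case class Event(value: Int) extends Msg
case object Barrier extends Msg

// Returns (snapshotted state, final state).
def runOperator(channel: Seq[Msg]): (Int, Int) = {
  var state = 0
  var snapshot = 0
  channel.foreach {
    case Event(v) => state += v       // events update operator state
    case Barrier  => snapshot = state // checkpoint: snapshot the state
  }
  (snapshot, state)
}

// Events 1 and 2 arrive before the barrier, event 3 after it. On
// failure, restore the snapshot and replay from the barrier: event 3
// is applied exactly once to the restored state.
val (snapshot, finalState) = runOperator(
  Seq(Event(1), Event(2), Barrier, Event(3)))
```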
Best of all worlds for streaming
 Low latency
• Thanks to pipelined engine
 Exactly-once guarantees
• Variation of Chandy-Lamport
 High throughput
• Controllable checkpointing overhead
 Separates app logic from recovery
• Checkpointing interval is just a config parameter
15
Faces of a stream processor
16
Stream
processing
Batch
processing
Machine Learning at scale
Graph Analysis
Stream data analytics
17
DataStream API
18
case class Word(word: String, frequency: Int)

DataStream API (streaming):

val lines: DataStream[String] = env.fromSocketStream(...)

lines.flatMap { line => line.split(" ")
    .map(word => Word(word, 1)) }
  .window(Time.of(5, SECONDS)).every(Time.of(1, SECONDS))
  .groupBy("word").sum("frequency")
  .print()

DataSet API (batch):

val lines: DataSet[String] = env.readTextFile(...)

lines.flatMap { line => line.split(" ")
    .map(word => Word(word, 1)) }
  .groupBy("word").sum("frequency")
  .print()
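To see what the streaming variant computes, here is a plain-Scala sketch of its semantics (standard library only; count-based windows of a few elements stand in for the time-based windows above):

```scala
case class Word(word: String, frequency: Int)

// Word frequencies within one window of lines.
def wordCounts(lines: Seq[String]): Seq[Word] =
  lines.flatMap(_.split(" "))
    .groupBy(identity)
    .map { case (w, occurrences) => Word(w, occurrences.size) }
    .toSeq.sortBy(_.word)

// Sliding windows: a window of `size` lines, advancing by `slide`.
def slidingWordCounts(lines: Seq[String], size: Int, slide: Int): Seq[Seq[Word]] =
  lines.sliding(size, slide).map(wordCounts).toSeq
```

Each emitted window overlaps the previous one, which is why a record can contribute to several counts, just as with the 5-second window sliding every second.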
Flink stack
19
DataStream (Java/Scala)
Streaming dataflow runtime
Batch data analytics
20
Batch is a special case of streaming
21
[Diagram: the batch DAG (map, join, sum with intermediate results ID1, ID2, ID3) embedded in a streaming topology.]
Blocking work units are embedded in the streaming topology; fault tolerance has lower overhead because intermediate results can be replayed.
Managed memory in Flink
22
Memory runs out
Cost-based optimizer
23
Flink stack
24
Table
DataSet (Java/Scala) DataStream (Java/Scala)
Hadoop M/R
Local Cluster Yarn Tez Embedded
Table
Streaming dataflow runtime
Iterative processing
25
FlinkML
 API for ML pipelines inspired by scikit-learn
 Collection of packaged algorithms
• SVM, Multiple Linear Regression, Optimization, ALS, ...
26
val trainingData: DataSet[LabeledVector] = ...
val testingData: DataSet[Vector] = ...
val scaler = StandardScaler()
val polyFeatures = PolynomialFeatures().setDegree(3)
val mlr = MultipleLinearRegression()
val pipeline = scaler.chainTransformer(polyFeatures).chainPredictor(mlr)
pipeline.fit(trainingData)
val predictions: DataSet[LabeledVector] = pipeline.predict(testingData)
Gelly
 Graph API and library
 Packaged algorithms
• PageRank, SSSP, Label Propagation, Community
Detection, Connected Components
27
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
Graph<Long, Long, NullValue> graph = ...
DataSet<Vertex<Long, Long>> verticesWithCommunity = graph.run(
new LabelPropagation<Long>(30)).getVertices();
verticesWithCommunity.print();
env.execute();
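What LabelPropagation computes can be sketched in plain Scala (a toy synchronous variant, not Gelly's implementation; breaking label ties toward the smaller label is an assumption made here for determinism):

```scala
// Toy label propagation: every vertex repeatedly adopts the most
// frequent label among its neighbors, for a fixed number of iterations.
def labelPropagation(edges: Seq[(Int, Int)], initial: Map[Int, Int],
                     maxIterations: Int): Map[Int, Int] = {
  val neighbors = (edges ++ edges.map(_.swap))
    .groupBy(_._1).map { case (v, es) => v -> es.map(_._2) }
  var labels = initial
  for (_ <- 1 to maxIterations)
    labels = labels.map { case (v, label) =>
      val around = neighbors.getOrElse(v, Seq.empty).map(labels)
      if (around.isEmpty) v -> label
      else v -> around.groupBy(identity)
        .maxBy { case (l, occ) => (occ.size, -l) }._1 // tie: smaller label
    }
  labels
}

// Two disjoint triangles: label propagation finds two communities.
val edges = Seq((1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6))
val communities =
  labelPropagation(edges, Map(1 -> 1, 2 -> 2, 3 -> 3, 4 -> 4, 5 -> 5, 6 -> 6), 30)
```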
Iterative processing in Flink
Flink offers built-in iterations and delta iterations to execute ML and graph algorithms efficiently
28
[Diagram: an iterative dataflow — map, join, and sum with intermediate results ID1, ID2, ID3 — whose results feed back into the next iteration.]
Example: Matrix Factorization
29
Factorizing a matrix with 28 billion ratings for recommendations
More at: http://data-artisans.com/computing-recommendations-with-flink.html
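At toy scale, the idea can be sketched in plain Scala. Note that the linked post uses ALS on Flink; the sketch below uses plain stochastic gradient descent on a 2x2 ratings matrix, and every name in it is made up for illustration:

```scala
// Toy matrix factorization: learn user and item factor vectors whose
// dot products approximate the observed (user, item, rating) triples.
def factorize(ratings: Seq[(Int, Int, Double)], users: Int, items: Int,
              rank: Int, epochs: Int, lr: Double): (Array[Array[Double]], Array[Array[Double]]) = {
  val rnd = new scala.util.Random(42) // fixed seed for reproducibility
  val u = Array.fill(users, rank)(rnd.nextDouble() * 0.1)
  val v = Array.fill(items, rank)(rnd.nextDouble() * 0.1)
  for (_ <- 1 to epochs; (i, j, r) <- ratings) {
    val err = r - (0 until rank).map(k => u(i)(k) * v(j)(k)).sum
    for (k <- 0 until rank) {
      val ui = u(i)(k)
      u(i)(k) += lr * err * v(j)(k) // gradient step on the user factor
      v(j)(k) += lr * err * ui      // gradient step on the item factor
    }
  }
  (u, v)
}

def squaredError(ratings: Seq[(Int, Int, Double)],
                 u: Array[Array[Double]], v: Array[Array[Double]]): Double =
  ratings.map { case (i, j, r) =>
    val pred = u(i).zip(v(j)).map { case (a, b) => a * b }.sum
    (r - pred) * (r - pred)
  }.sum

val ratings = Seq((0, 0, 5.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 5.0))
val (u, v) = factorize(ratings, users = 2, items = 2, rank = 2, epochs = 2000, lr = 0.05)
```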
The full stack
30
Gelly
Table
ML
SAMOA
DataSet (Java/Scala) DataStream
Hadoop M/R
Local Cluster Yarn Tez Embedded
Dataflow
Dataflow (WiP)
MRQL
Table
Cascading (WiP)
Streaming dataflow runtime
Storm (WiP)
Zeppelin
Closing
31
tl;dr: what was this about?
 The case for Flink as a stream processor
• Low latency
• High throughput
• Exactly once
• Easy to use APIs, library ecosystem
• Growing community
 A stream processor that is great for batch analytics as well
32
Demo time
33
I Flink, do you?
34
If you find this exciting, get involved and start a discussion on Flink's mailing list, or stay tuned by subscribing to news@flink.apache.org, following flink.apache.org/blog, and @ApacheFlink on Twitter
35
flink-forward.org
First Flink Bay Area meetup