Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Integrations and Use Cases

Stream & Batch Processing
with Apache Flink
Robert Metzger
@rmetzger_
rmetzger@apache.org
Munich Apache Flink Meetup,
November 11, 2015

A bit of history
From incubation until now
2

3
Apr 2014 Jun 2015Dec 2014
0.70.60.5 0.90.9-m1 0.10
Nov 2015
Top level
0.8

Community growth
Flink is one of the largest and most active Apache
big data projects with well over 120 contributors
4

Flink meetups around the globe
5

Organizations at Flink Forward
6

The streaming era
Flink is all about
8

9
batch
event
based
need new systems
well served

10
Streaming is the biggest change in
data infrastructure since Hadoop

What is stream processing
 Real-world data is unbounded and is
pushed to systems
 Right now: people are using the batch
paradigm for stream analysis (there was
no good stream processor available)
 New systems (Flink, Kafka) embrace
streaming nature of data
11
Web server Kafka topic
Stream processing

12
Flink is a stream processor with many faces
Streaming dataflow runtime

Requirements for a stream processor
 Low latency
• Fast results (milliseconds)
 High throughput
• handle large data amounts (millions of events
per second)
 Exactly-once guarantees
• Correct results, also in failure cases
 Programmability
• Intuitive APIs
14

Pipelining
15
Basic building block to “keep the data moving”
• Low latency
• Operators push
data forward
• Data shipping as
buffers, not tuple-
wise
• Natural handling
of back-pressure

Fault Tolerance in streaming
 at least once: ensure all operators see all
events
• Storm: Replay stream in failure case
 Exactly once: Ensure that operators do
not perform duplicate updates to their
state
• Flink: Distributed Snapshots
• Spark: Micro-batches on batch runtime
16

Flink’s Distributed Snapshots
 Lightweight approach of storing the state
of all operators without pausing the
execution
 high throughput, low latency
 Implemented using barriers flowing
through the topology
17
Kafka
Consumer
offset = 162
Element
Counter
value = 152
Operator
stateData Stream
barrier
Before barrier =
part of the snapshot
After barrier =
Not in snapshot
(backup till next snapshot)

DataStream API
18
case class WordCount(word: String, count: Int)
val text: DataStream[String] = …;
text
.flatMap { line => line.split(" ") }
.map { word => new WordCount(word, 1) }
.keyBy("word")
.window(GlobalWindows.create())
.trigger(new EOFTrigger())
.sum("count")
Batch Word Count in the DataStream API

Batch Word Count
in the DataSet API
19
case class WordCount(word: String, count: Int)
val text: DataStream[String] = …;
text
.keyBy("word")
.window(GlobalWindows.create())
.trigger(new EOFTrigger())
.sum("count")
val text: DataSet[String] = …;
text
.groupBy("word")
.sum("count")

Best of all worlds for streaming
 Low latency
• Thanks to pipelined engine
 Exactly-once guarantees
• Distributed Snapshots
 High throughput
• Controllable checkpointing overhead
 Programmability
• APIs similar to those known from the batch world
20

Throughput of distributed grep
21
Data
Generator
“grep”
operator
30 machines, 120 cores
0
20,000,000
40,000,000
60,000,000
80,000,000
100,000,000
120,000,000
140,000,000
160,000,000
180,000,000
200,000,000
Flink, no fault
tolerance
Flink, exactly
once (5s)
Storm, no
fault tolerance
Storm, micro-
batches
aggregate throughput
of 175 million
elements per second
of 9 million elements
per second
• Flink achieves 20x
higher throughput
• Flink throughput
almost the same
with and without
exactly-once

Aggregate throughput for stream record
grouping
22
0
10,000,000
20,000,000
30,000,000
40,000,000
50,000,000
60,000,000
70,000,000
80,000,000
90,000,000
100,000,000
Flink, no
fault
tolerance
Flink,
exactly
once
Storm, no
fault
tolerance
Storm, at
least once
of 83 million elements
per second
8,6 million elements/s
309k elements/s  Flink achieves 260x
higher throughput with
fault tolerance
30 machines,
120 cores Network
transfer

Latency in stream record grouping
23
Data
Generator
Receiver:
Throughput /
Latency measure
• Measure time for a record to
travel from source to sink
0.00
5.00
10.00
15.00
20.00
25.00
30.00
Flink, no
fault
tolerance
Flink, exactly
once
Storm, at
least once
Median latency
25 ms
1 ms
0.00
10.00
20.00
30.00
40.00
50.00
60.00
Flink, no
fault
tolerance
Flink,
exactly
once
Storm, at
least
once
99th percentile
latency
50 ms

Exactly-Once with YARN Chaos Monkey
 Validate exactly-once guarantees with
state-machine
25

Performance: Summary
26
Continuous
streaming
Latency-bound
buffering
Distributed
Snapshots
High Throughput &
Low Latency
With configurable throughput/latency tradeoff

Faces of a stream processor
28
Stream
processing
Batch
processing
Machine Learning at scale
Graph Analysis

The Flink Stack
29
Specialized
Abstractions
/ APIs
Core APIs
Flink Core
Runtime
Deployment

The Flink Stack
30
DataSet (Java/Scala) DataStream (Java/Scala)
Experimental
Python API also
available
Data Source
orders.tbl
Filter
Map DataSource
lineitem.tbl
Join
Hybrid Hash
buildHT probe
hash-part [0] hash-part [0]
GroupRed
sort
forward
API independent Dataflow
Graph representation
Batch Optimizer Graph Builder

Batch is a special case of streaming
 Batch: run a bounded stream (data set) on
a stream processor
 Form a global window over the entire data
set for join or grouping operations
31

Batch-specific optimizations
 Managed memory on- and off-heap
• Operators (join, sort, …) with out-of-core
support
• Optimized serialization stack for user-types
 Cost-based Optimizer
• Job execution depends on data size
32

The Flink Stack
33
Specialized
Abstractions
/ APIs
Core APIs
Flink Core
Runtime
Deployment
DataSet (Java/Scala) DataStream

FlinkML: Machine Learning
 API for ML pipelines inspired by scikit-learn
 Collection of packaged algorithms
• SVM, Multiple Linear Regression, Optimization, ALS, ...
34
val trainingData: DataSet[LabeledVector] = ...
val testingData: DataSet[Vector] = ...
val scaler = StandardScaler()
val polyFeatures = PolynomialFeatures().setDegree(3)
val mlr = MultipleLinearRegression()
val pipeline = scaler.chainTransformer(polyFeatures).chainPredictor(mlr)
pipeline.fit(trainingData)
val predictions: DataSet[LabeledVector] = pipeline.predict(testingData)

Gelly: Graph Processing
 Graph API and library
 Packaged algorithms
• PageRank, SSSP, Label Propagation, Community
Detection, Connected Components
35
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
Graph<Long, Long, NullValue> graph = ...
DataSet<Vertex<Long, Long>> verticesWithCommunity = graph.run(
new LabelPropagation<Long>(30)).getVertices();
verticesWithCommunity.print();
env.execute();

Flink Stack += Gelly, ML
36
Gelly
ML

Integration with other systems
37
SAMOA
DataSet DataStream
HadoopM/R
GoogleDataflow
Cascading
Storm
Zeppelin
• Use Hadoop Input/Output Formats
• Mapper / Reducer implementations
• Hadoop’s FileSystem implementations
• Run applications implemented against Google’s Data Flow API
on premise with Flink
• Run Cascading jobs on Flink, with almost no code change
• Benefit from Flink’s vastly better performance than
MapReduce
• Interactive, web-based data exploration
• Machine learning on data streams
• Compatibility layer for running Storm code
• FlinkTopologyBuilder: one line replacement for
existing jobs
• Wrappers for Storm Spouts and Bolts
• Coming soon: Exactly-once with Storm
• Build-in serializer support for
Hadoop’s Writable type

Deployment options
Gelly
Table
ML
SAMOA
DataSet(Java/Scala)DataStream
Hadoop
LocalClusterYARNTezEmbedded
Dataflow
Dataflow
MRQL
Table
Cascading
Streamingdataflowruntime
Storm
Zeppelin
• Start Flink in your IDE / on your machine
• Local debugging / development using the
same code as on the cluster
• “bare metal” standalone installation of Flink
on a cluster
• Flink on Hadoop YARN (Hadoop 2.2.0+)
• Restarts failed containers
• Support for Kerberos-secured YARN/HDFS
setups

The full stack
39
Gelly
Table
ML
SAMOA
HadoopM/R
Local Cluster Yarn Tez Embedded
Dataflow
Dataflow(WiP)
MRQL
Table
Cascading
Storm(WiP)
Zeppelin

Flink Use-cases and
production users
40

What is currently happening?
 Features for upcoming 0.10 release:
• Master High Availability
• Vastly improved monitoring GUI
• Watermarks / Event time processing /
Windowing rework
 The next talk by Matthias
• Stable streaming API (no more “beta” flag)
 Next release: 1.0
• API stability
47

How do I get started?
48
Mailing Lists: (news | user | dev)@flink.apache.org
Twitter: @ApacheFlink
Blogs: flink.apache.org/blog, data-artisans.com/blog/
IRC channel: irc.freenode.net#flink
Start Flink on YARN in 4 commands:
# get the hadoop2 package from the Flink download page at
# http://flink.apache.org/downloads.html
wget <download url>
tar xvzf flink-0.9.1-bin-hadoop2.tgz
cd flink-0.9.1/
./bin/flink run -m yarn-cluster -yn 4 ./examples/flink-java-
examples-0.9.1-WordCount.jar

flink.apache.org 49
• Check out the slides: http://flink-
forward.org/?post_type=session
• Video recordings on YouTube, “Flink Forward”
channel

Managed (off-heap) memory and out-of-
core support
55Memory runs out

Cost-based Optimizer
56
DataSource
orders.tbl
Filter
Map DataSource
lineitem.tbl
Join
Hybrid Hash
buildHT probe
broadcast forward
Combine
GroupRed
sort
DataSource
orders.tbl
Filter
Map DataSource
lineitem.tbl
Join
Hybrid Hash
buildHT probe
hash-part [0,1]
GroupRed
sort
forward
Best plan
depends on
relative sizes
of input files

57
case class Path (from: Long, to:
Long)
val tc = edges.iterate(10) {
paths: DataSet[Path] =>
val next = paths
.join(edges)
.where("to")
.equalTo("from") {
(path, edge) =>
Path(path.from, edge.to)
}
.union(paths)
.distinct()
next
}
Optimizer
Type extraction
stack
Task
scheduling
Dataflow
metadata
Pre-flight (Client)
JobManager
TaskManagers
Data
Source
orders.tbl
Filter
Map
DataSourc
e
lineitem.tbl
Join
Hybrid Hash
build
HT
probe
GroupRed
sort
forward
Program
Dataflow
Graph
deploy
operators
track
intermediate
results
LocalCluster:YARN,Standalone

Iterative processing in Flink
Flink offers built-in iterations and delta
iterations to execute ML and graph
algorithms efficiently
58
map
join sum
ID1
ID2
ID3

Example: Matrix Factorization
59
Factorizing a matrix with
28 billion ratings for
recommendations
More at: http://data-artisans.com/computing-recommendations-with-flink.html

60
Batch
aggregation
ExecutionGraph
JobManager
TaskManager 1
TaskManager 2
M1
M2
RP1RP2
R1
R2
1
2 3a
3b
4a
4b
5a
5b
"Blocked" result partition

61
Streaming
window
aggregation
ExecutionGraph
JobManager
TaskManager 1
TaskManager 2
M1
M2
RP1RP2
R1
R2
1
2 3a
3b
4a
4b
5a
5b
"Pipelined" result partition

Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Integrations and Use Cases

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Integrations and Use Cases

Similar to Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Integrations and Use Cases (20)

More from Robert Metzger

More from Robert Metzger (16)

Recently uploaded

Recently uploaded (20)

Apache Flink Meetup Munich (November 2015): Flink Overview, Architecture, Integrations and Use Cases

Editor's Notes