1
Robert Metzger
@rmetzger_
Debunking Common Myths
in Stream Processing
DataWorks Summit 2017
April 5, 2017, Munich
Ufuk Celebi
@iamuce
2
Original creators of Apache
Flink®
Providers of the
data Artisans Platform, a
supported Flink distribution
Outline
 Introduction to Stream Processing with Apache Flink®
 Myth 1: Streaming is Approximate
 Myth 2: Exactly-Once not Possible
 Myth 3: The Throughput/Latency Tradeoff
 Myth 4: Streaming is only for Real-time
 Conclusion
3
Introduction to Stream
Processing with Apache Flink®
4
Apache Flink
 Apache Flink is an open source stream
processing framework
• Low latency
• High throughput
• Stateful
• Distributed
5
Traditional data processing
6
Web server
Logs
Web server
Logs
Web server
Logs
HDFS / S3
Periodic (custom) or
continuous ingestion
(Flume) into HDFS
Batch job(s) for
log analysis
Periodic log analysis
job
Serving
layer
 Log analysis example using a batch processor
Job scheduler
(Oozie)
Traditional data processing
7
Web server
Logs
Web server
Logs
Web server
Logs
HDFS / S3
Periodic (custom) or
continuous ingestion
(Flume) into HDFS
Batch job(s) for
log analysis
Periodic log analysis
job
Serving
layer
 Latency from log event to serving layer usually in
the range of hours
every 2 hrs
Job scheduler
(Oozie)
Data processing without stream processor
8
Web server
Logs
Web server
Logs
HDFS / S3
Batch job(s) for
log analysis
 This architecture is a hand-crafted micro-batch
model
Batch interval: ~2 hours

Approach                                 Latency
Manually triggered periodic batch job    hours
Batch processor with micro-batches       seconds to minutes
Stream processor                         milliseconds
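The "hours" row follows from simple arithmetic: an event that just misses a batch waits a full interval before the next job picks it up, plus that job's runtime. A toy calculation (all numbers hypothetical):

```java
// Illustrative arithmetic only; interval and runtime are hypothetical numbers.
public class LatencySketch {
    // An event arriving just after a batch job starts waits a full interval
    // before the next job picks it up, plus that job's own runtime.
    public static long batchWorstCaseMillis(long batchIntervalMillis, long jobRuntimeMillis) {
        return batchIntervalMillis + jobRuntimeMillis;
    }
}
```

With a 2-hour interval and, say, a 20-minute analysis job, the worst case is 2 h 20 min from log event to serving layer; a stream processor handles each event as it arrives.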
Downsides of stream processing with a batch engine
 Very high latency (hours)
 Delayed results
 Complex architecture required:
• Periodic job scheduler (e.g. Oozie)
• Data loading into HDFS (e.g. Flume)
• Batch processor
• (When using the “lambda architecture”: a stream processor)
 Maintenance
 Backpressure: How does the pipeline handle load spikes?
 …
9
Log event analysis using a stream processor
10
Web server
Web server
Web server
High throughput
publish/subscribe
bus
Serving
layer
 Stream processors make it possible to analyze
events with sub-second latency.
Options:
• Apache Kafka
• Amazon Kinesis
• MapR Streams
• Google Cloud Pub/Sub
Forward events
immediately to
pub/sub bus
Stream Processor
Options:
• Apache Flink
• Apache Beam
• Apache Samza
Process events in
real time & update
serving layer
11
Real-world data is produced in a
continuous fashion.
New systems like Flink and Kafka embrace the
streaming nature of data.
Web server Kafka topic
Stream processor
What is Streaming?
 Computations on never-
ending “streams” of data
records (“events”)
 A stream processor
distributes the
computation in a cluster
12
[Diagram — logical view: a pipeline of "Your code" operators; distributed view: parallel "Your code" instances spread across the cluster]
Stateful Streaming
 Computation and state
• E.g., counters, windows of past
events, state machines, trained ML
models
 Result depends on history of
stream
 A stateful stream processor gives
the tools to manage state
• Recover, roll back, version,
upgrade, etc
13
Your
code
state
Stateful Streaming
14
// Fault-tolerant managed state holding a single value
ValueStateDescriptor<Long> descriptor =
    new ValueStateDescriptor<>("count", Long.class);
ValueState<Long> count = getRuntimeContext().getState(descriptor);

// Read and update the state (value() is null before the first update)
Long currentCount = count.value();
count.update(currentCount == null ? 1L : currentCount + 1);
Event-Time Streaming
 Data records associated with
timestamps (time series data)
 Processing depends on timestamps
 An event-time stream processor gives
you the tools to reason about time
• E.g., handle streams that are out of
order
• Core feature is watermarks – a clock
to measure event time
15
[Diagram: stateful user code consuming out-of-order events t1…t4, with watermarks bounding the intervals t1–t2 and t3–t4]
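The watermark idea can be sketched in a few lines: the watermark trails the largest event timestamp seen by an allowed out-of-orderness, and anything at or below it counts as late. This is a minimal illustrative version in the spirit of Flink's bounded-out-of-orderness generators; the class and method names here are our own, not Flink's API.

```java
// Minimal sketch of a bounded-out-of-orderness watermark (illustrative names).
public class WatermarkSketch {
    private final long maxOutOfOrderness;
    private long maxTimestampSeen = Long.MIN_VALUE;

    public WatermarkSketch(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    // Observe an event timestamp; the watermark trails the largest
    // timestamp seen so far by the allowed out-of-orderness.
    public void onEvent(long eventTimestamp) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestamp);
    }

    public long currentWatermark() {
        return maxTimestampSeen - maxOutOfOrderness;
    }

    // An event is "late" if its timestamp is at or below the watermark.
    public boolean isLate(long eventTimestamp) {
        return eventTimestamp <= currentWatermark();
    }
}
```

Out-of-order events with timestamps above the watermark are still processed normally; only events older than the watermark need special late-data handling.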
Apache Flink stack
16
Deployment: Local, Cluster (YARN, Mesos), Docker
Core Engine: streaming dataflow runtime
Core APIs: DataStream (Java/Scala), DataSet (Java/Scala)
Specialized APIs and abstractions: SQL/Table, CEP, Gelly, ML, SAMOA, Cascading, Apache Beam, Storm API, Hadoop M/R, Zeppelin
Common Myths in Stream
Processing
28
Myth 1: Streaming is
Approximate
29
Myth Variations
 “Stream processing cannot handle high data
volume”
 “There is no streaming without batch”
 “Stream processing needs to be coupled with
batch processing”
30
Lambda Architecture
31
file 2 Job 2
file 1 Job 1
Streaming job
Serve&
store
Lambda No Longer Needed
 Lambda was useful in the first days of stream
processing
 Not any more
• Stream processors can compute accurate results
• Stream processors can handle high data volumes
 Good news: We don’t hear Lambda so often
anymore
32
Myth 2: Exactly-Once Not Possible
33
Myth Variations
 “Exactly-once is not needed”
 “Exactly-once requires significant
performance trade-offs”
 “End-to-end exactly-once is not possible”
34
What is “Exactly-Once”?
 Under failures, system computes
result as if there was no failure
 In contrast to:
• At-most-once: no guarantees
• At-least-once: duplicates possible
 Exactly-once state versus exactly-
once delivery
• Exactly-Once Delivery:
In general, not possible
• Exactly-Once State:
Possible via Flink Checkpoints
35
Your
code
state
Flink Checkpoints
 Periodic asynchronous consistent snapshots of application state
• Provide exactly-once state semantics
• On Failure: Restore state from checkpoint and replay
36
Task
Checkpoint Barrier
Part of checkpoint X,
older records
Part of checkpoint X+1,
newer records
State
Flink Checkpoints
 Periodic asynchronous consistent snapshots of application state
• Provide exactly-once state semantics
• On Failure: Restore state from checkpoint and replay
37
Task
Part of checkpoint X,
older records
Part of checkpoint X+1,
newer records
State
Flink Checkpoints
 Periodic asynchronous consistent snapshots of application state
• Provide exactly-once state semantics
• On Failure: Restore state from checkpoint and replay
38
Task
Snapshot to Durable
Storage (sync/async)
Part of checkpoint X,
older records
Part of checkpoint X+1,
newer records
State
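The restore-and-replay behavior can be simulated with a toy operator whose checkpoint captures both its state and its input offset, so replay starts exactly where the snapshot left off. All names here are hypothetical; this is a sketch of the principle, not Flink's checkpointing implementation.

```java
import java.util.List;

// Sketch of exactly-once *state*: a checkpoint stores the operator state and
// the input offset together, so after a failure we restore both and replay
// only the records that the snapshot had not yet covered.
public class ExactlyOnceSketch {
    long count = 0;      // operator state (a running sum)
    int offset = 0;      // position in the replayable input log

    long[] checkpoint() { return new long[] { count, offset }; }

    void restore(long[] snapshot) {
        count = snapshot[0];
        offset = (int) snapshot[1];
    }

    // Process records from the current offset; a non-negative failAfter
    // simulates a crash when that offset is reached.
    void run(List<Long> log, int failAfter) {
        while (offset < log.size()) {
            if (failAfter >= 0 && offset == failAfter) throw new RuntimeException("crash");
            count += log.get(offset);
            offset++;
        }
    }
}
```

Restoring state without the matching offset (or replaying from offset 0) would double-count records, which is exactly the at-least-once failure mode.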
End-to-End Exactly-Once
 Checkpoints double as a transaction coordination mechanism
with external systems
 Source and sink operators can take part in checkpoints
 For example the “Kafka -> Flink -> HDFS” pipeline is end-to-
end exactly-once
39
Transactional Sinks
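The staging idea behind transactional sinks can be sketched as a two-phase protocol: stage writes when the checkpoint barrier arrives, publish them only when the checkpoint completes globally. The names below are illustrative, not a real Flink API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of a transactional sink: writes become visible in the external
// system only when their checkpoint completes (illustrative names).
public class TransactionalSinkSketch {
    private final List<String> pending = new ArrayList<>();           // open txn
    private final Map<Long, List<String>> preCommitted = new LinkedHashMap<>();
    private final List<String> committed = new ArrayList<>();         // "external system"

    public void write(String record) { pending.add(record); }

    // Checkpoint barrier arrives: stage the open transaction.
    public void preCommit(long checkpointId) {
        preCommitted.put(checkpointId, new ArrayList<>(pending));
        pending.clear();
    }

    // Checkpoint globally complete: make the staged writes visible.
    public void commit(long checkpointId) {
        List<String> staged = preCommitted.remove(checkpointId);
        if (staged != null) committed.addAll(staged);
    }

    // On failure, staged-but-unconfirmed writes are simply discarded.
    public void abort(long checkpointId) { preCommitted.remove(checkpointId); }

    public List<String> visibleRecords() { return committed; }
}
```

On recovery the job replays from the last completed checkpoint, re-produces the aborted writes, and stages them again, so the external system never sees duplicates.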
State Management
 Checkpoints triple as a state
versioning mechanism
(savepoints)
 Go back and forth in time while
maintaining state consistency
 Ease code upgrades (Flink or
app), maintenance, migration,
and debugging, what-if
simulations, A/B tests
40
Myth 3: The Throughput/Latency
Tradeoff
41
Myth Variations
 “Low latency systems cannot support high
throughput”
 “There is a high throughput category and a
low-latency category of stream processors”
 “Mini-batch systems always have better
throughput”
42
Batching vs. Buffering
 Batching: Treat a stream as a sequence of small batches.
• Drawbacks beyond introduced latency
 Buffering: a performance optimization
• Should be opaque to the user
• Should not dictate system behavior in any other way
• Should not impose artificial stream boundaries
• Should not limit what you can do with the system
43
Buffering
 It is natural to handle many records together
• All software and hardware systems do that
• E.g., network bundles bytes into frames
 Every streaming system buffers records for
performance (Flink certainly does)
• You don’t want to send single records over the
network
• "Record-at-a-time" does not exist at the physical level
44
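Buffering as a transparent optimization can be sketched as follows: records are shipped in batches for throughput, while a flush path (timer-driven in a real system) bounds the added latency. This is a simplified sketch with made-up names; Flink's actual buffer management lives in its network stack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of network-style output buffering: batch for throughput, flush on a
// bound so latency stays limited. Illustrative names, not a Flink API.
public class OutputBufferSketch {
    private final int maxBufferedRecords;
    private final Consumer<List<String>> ship;   // e.g. a network send
    private final List<String> buffer = new ArrayList<>();

    public OutputBufferSketch(int maxBufferedRecords, Consumer<List<String>> ship) {
        this.maxBufferedRecords = maxBufferedRecords;
        this.ship = ship;
    }

    public void emit(String record) {
        buffer.add(record);
        if (buffer.size() >= maxBufferedRecords) flush();  // throughput path
    }

    // In a real system a timer calls this, so latency is bounded even when
    // the buffer does not fill up.
    public void flush() {
        if (!buffer.isEmpty()) {
            ship.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

Note that none of this changes program semantics: the user sees the same records in the same order, which is what distinguishes buffering from batching as a programming model.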
Physical Limits
 Many stream processing pipelines are
network bottlenecked
 The network dictates both (1) the
latency and (2) the throughput
 A well-engineered system achieves the
physical limits allowed by the network
45
Myth 4: Streaming is only for Real-
time
46
Myth variations
47
 “I don’t have low-latency applications, and
therefore I don’t need a stream processor”
Accurate Computation
 Batch processing is not an accurate computation model for
unbounded data
• Misses the right concepts and primitives
• Time handling and state across batch boundaries are pushed out of
the system
 Processed data is most often unbounded (e.g. clicks, sensors, logs)
48
[Timeline of hourly batch boundaries: 2016-3-1 12:00 am, 1:00 am, 2:00 am, … 2016-3-11 10:00 pm, 11:00 pm, 2016-3-12 12:00 am, 1:00 am, 2:00 am, 3:00 am, …]
Accurate Computation
 Stateful stream processing is a better model
• Embrace unbounded nature of produced data and
offer primitives for time and state to deal with it in
an accurate way
• Real-time/low-latency is the icing on the cake
49
partition
partition
Myths Debunked
 These and other myths are becoming less prevalent
each day
 Streaming is a powerful model for continuous
processing of unbounded data sets
• Time and State as first-class citizens
• Low latency, high throughput
• Bounded data/batch as a special case of unbounded
data/streaming
54
Streaming Enablers
55
Low Latency
High Volume/
Throughput
State &
Accuracy
Latency Down to
Milliseconds
Millions events/second
for stateful apps
Exactly-once,
Event-Time
Conclusion
57
Flink Community
58
[Growth charts, Feb 15 / Dec 15 / Dec 16: Meetup Members (scale to 20,000), Number of Contributors (scale to 300), Stars on GitHub (scale to 2,000), Forks on GitHub (scale to 1,400)]
Companies using Flink
59
Zalando, one of the largest ecommerce
companies in Europe, uses Flink for real-
time business process monitoring.
King, the creators of Candy Crush Saga,
uses Flink to provide data science teams
with real-time analytics.
Bouygues Telecom uses Flink for real-time
event processing over billions of Kafka
messages per day.
Alibaba, the world's largest retailer, built a
Flink-based system (Blink) to optimize
search rankings in real time.
See more at flink.apache.org/poweredby.html
30 Flink applications in production for more than one
year. 10 billion events (2TB) processed daily
Complex jobs of > 30 operators running 24/7,
processing 30 billion events daily, maintaining state
of 100s of GB with exactly-once guarantees
Largest job has > 20 operators, runs on > 5000
vCores in 1000-node cluster, processes millions of
events per second
60
61
63
Complex Event Processing
64
Tomorrow at 12:20 PM:
“Unifying Stream SQL
and CEP for
Declarative Stream Processing
with Apache Flink”
by Till Rohrmann
We are hiring!
data-artisans.com/careers
Questions?
 eMail: rmetzger@apache.org | uce@apache.org
 Twitter: @rmetzger_ | @iamuce
 Follow: @ApacheFlink
 Read: flink.apache.org/blog, data-artisans.com/blog/
 (news | user | dev | community)@flink.apache.org
66
Flink 1.2+
67
Flink 1.2+ Ongoing Development
68
[Roadmap grid across four themes — Operations, Ecosystem, Application Features, Broader Audience — with topics: Metrics & Visualization, Dynamic Scaling, Savepoint compatibility, State Management, Connectors, More connectors, (Stream) SQL, CEP, Library enhancements, Metric System, Session Windows, Large state maintenance, Fine grained recovery, Side in-/outputs, Window DSL, Security, Authentication, Mesos & others, Dynamic Resource Management, Queryable State]
SQL on Streams
69
 Continuous queries, incremental results
 Windows, event time, processing time
 Consistent with SQL on bounded data
Table result = tableEnv.sql("SELECT product, amount
FROM Orders WHERE product LIKE '%Rubber%'");
http://flink.apache.org/news/2017/03/29/table-sql-api-update.html
Queryable State
70
Elastic Parallelism
71
Maintaining exactly-once
state consistency
No extra effort for the user
No need to carefully plan
partitions
Apache Flink 1.2 (released)
 Dynamic Scaling (change of parallelism)
 Process Function
 Mesos and DC/OS support
 Secure data access (Kerberos)
 Queryable State
 Window support in Table API
72
Apache Flink 1.3 (upcoming)
 Asynchronous Memory State backend
 History Server
 Dynamic Tables for SQL
 Side Outputs (e.g. late window elements)
 CEP library and Kafka enhancements
73
Streaming use cases
Application
(Near) real-time apps/
Continuous apps
Analytics on historical
data
Request/response apps
Technology
Low-/High-latency
Streaming
Batch as special case of
streaming
Large queryable state
74
In summary
 The need for streaming comes from a rethinking of
data infrastructure architecture
• Stream processing then just becomes natural
 Debunking myths
• Myth 1: The Lambda Architecture
• Myth 2: Exactly once not possible
• Myth 3: The throughput/latency tradeoff
• Myth 4: Streaming Is for (Near) Real-time Only
75
Appendix
76
Performance of Flink's State
77
window()/
sum()
Source /
filter() /
map()
State index
(e.g., RocksDB)
Events are persistent
and ordered (per partition / key)
in the log (e.g., Apache Kafka)
Events flow without replication or synchronous writes
Performance of Flink's State
78
window()/
sum()
Source /
filter() /
map()
Trigger checkpoint Inject checkpoint barrier
Performance of Flink's State
79
window()/
sum()
Source /
filter() /
map()
Take state snapshot RocksDB:
Trigger state
copy-on-write
Performance of Flink's State
80
window()/
sum()
Source /
filter() /
map()
Persist state snapshots Durably persist
snapshots
asynchronously
Processing pipeline continues
Flink 1.1 + ongoing development
81
[Roadmap grid across four themes — Operations, Ecosystem, Application Features, Broader Audience — with topics: Metrics & Visualization, Dynamic Scaling, Savepoint compatibility, Checkpoints to savepoints, Connectors, More connectors, (Stream) SQL, Stream SQL Windows, Library enhancements, Metric System, Session Windows, Large state maintenance, Fine grained recovery, Side in-/outputs, Window DSL, Security, Authentication, Mesos & others, Dynamic Resource Management, Queryable State]
Checkpoints / Savepoints
82
Recover a running job into a new job
Recover a running job onto a new cluster
Application state backwards compatibility
• Flink 1.0 made the APIs backwards compatible
• Now making the savepoints backwards compatible
• Applications can be moved to newer versions of
Flink even when state backends or internals change
v1.x → v1.y → v2.0
Dynamic scaling
83
Changing load implies changing resource requirements
• Need to adjust the parallelism of running streaming jobs
Re-scaling stateless operators is trivial
Re-scaling stateful operators is hard (windows, user state)
• Efficiently re-shard state
time
Workload
Resources
Re-scaling Flink jobs preserves
exactly-once guarantees
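Efficient re-sharding of keyed state builds on the key-group idea Flink uses: keys are hashed into a fixed number of key groups, and re-scaling reassigns whole groups rather than splitting any key's state. A simplified sketch (the formulas mirror the idea; treat the names as illustrative):

```java
// Simplified sketch of key groups: keys hash into a fixed number of groups,
// and re-scaling moves whole groups between operator instances.
public class KeyGroupSketch {
    public static int keyGroup(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Map a key group to an operator instance for the current parallelism.
    // Changing the parallelism only changes this mapping, never the group a
    // key belongs to, so a key's state always stays together.
    public static int operatorIndex(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }
}
```

Because the number of key groups is fixed at job start, a snapshot partitioned by key group can be redistributed to any new parallelism without rehashing individual keys.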
Security / Authentication
84
No unauthorized data access
Secured clusters with Kerberos-based authentication
• Kafka, ZooKeeper, HDFS, YARN, HBase, …
No unencrypted traffic between Flink Processes
• RPC, Data Exchange, Web UI
Largely contributed by
Prevent malicious users from hooking into Flink jobs
Cluster management
85
Series of improvements to seamlessly
interoperate with various cluster managers
• YARN, Mesos, Docker, Standalone, …
Driven by
Mesos integration contributed by
and
Stream SQL
86
SQL is the standard high-level query language
A natural way to open up streaming to more people
Problem: There is no Streaming SQL standard
• At least beyond the basic operations
• Challenging: Incorporate windows and time semantics
Flink community working with
Apache Calcite to draft a new model
Restarting topologies
 Streaming programs can
• Establish connections to databases or other systems
• Be scheduled across 1000s of tasks or more (Flink does)
• Keep state across any kind of batch boundary
• Have several shuffle stages, joins, etc
 All these need to be initialized once and run continuously –
restarting them every few milliseconds is not feasible
 Hence, the bulk of the computation is by nature continuous
87
Forms of batching
 Batching in Flink (and Apex, and others)
• Buffer records and send buffer over network
 Batching in Trident (Storm)
• Create a very large record
 Batching in Spark Streaming
• Restart topology every batch interval
• Let’s focus on this one
88
