On September 21, we had the pleasure of hosting at our offices a Meetup given by our colleague Paco Guerrero on the Apache Flink platform.
"Apache Flink is an open source platform for real-time processing that is on the rise because it offers features that competing technologies lack, without any impact on performance. In this session we will introduce the philosophy and the processing engine that make Flink so special and powerful. We will also go through the basic pillars that make Flink the most promising streaming platform today."
18. Streaming
“Streaming is the next programming paradigm for data applications, and you need to start thinking in terms of streams”
“Continuous processing of data that is continuously produced”
Data Stream: Infinite sequence of data arriving in a continuous fashion.
Stream processing is the backbone of the new data infrastructure.
“The world beyond batch”: a high-level tour of modern data-processing concepts. By Tyler Akidau, August 5, 2015
https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
29. Micro Batch
[Diagram: Batch Job 1, Batch Job 2, Batch Job 3; labels: All Input, All Output]
Batch frequency?
Timestamps keep the real-time fingerprint
30. Streaming Technologies
Batch, Micro Batch, Streaming
Stateless: record acknowledgements
CPU-bound performance
No expressive, declarative, functional API: low-level API
No auto scaling
Low-level programmatic topology
Poor streaming window functionalities
Not compatible with Hadoop APIs
Streams
41. Flink
Open Source Stream Processing Framework.
Latest available release: 1.1.1
Top-Level Apache Project since Dec '14
Main Features
Native Stream
Low Latency
High throughput
Stateful
Exactly-once guarantees
Distributed
Expressive APIs
And more…
56. Tasks scheduled and executed in workers (slots)
Tasks as chains of operators
Run operator logic in a pipelined fashion
[Diagram: Stream Job, Batch Job, ML Job and Graph Job, each with its own optimizer, on top of the Flink Runtime Engine]
59. If you want to know one thing about Flink, it is that you don't need to know the internals of Flink
65. Event Times & Windowing
[Diagram: Data Source (Event Time) → Flink (Ingestion Time) → Flink Window Operator (Processing Time)]
66. Event Times & Windowing
Explicit handling of time. 3 choices:
Event time: when data is generated
Ingestion time: when data is loaded from the source
Processing time: when data is processed
Event time helps to process out-of-order events and to replay elements as they occurred (deterministic results)
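The difference between the notions of time can be sketched in a few lines of plain Python. This is only an illustration, not Flink code: the event dictionaries, the field names and the 10-unit window size are assumptions made up for the example. It counts the same events per tumbling window, once by event time and once by processing time, to show why only event time gives results that do not depend on arrival delays.

```python
from collections import defaultdict

def tumbling_window_start(timestamp, size):
    """Start of the tumbling window of length `size` that contains `timestamp`."""
    return timestamp - (timestamp % size)

def count_per_window(events, size, time_field):
    """Count events per tumbling window, using the chosen notion of time."""
    counts = defaultdict(int)
    for event in events:
        counts[tumbling_window_start(event[time_field], size)] += 1
    return dict(counts)

# Hypothetical events: generated at `event_time`, processed later and out of order.
events = [
    {"id": "a", "event_time": 1, "processing_time": 11},
    {"id": "b", "event_time": 12, "processing_time": 13},
    {"id": "c", "event_time": 3, "processing_time": 21},  # delayed in transit
]

by_event_time = count_per_window(events, size=10, time_field="event_time")
by_processing_time = count_per_window(events, size=10, time_field="processing_time")
print(by_event_time)        # windows keyed by when the data was generated
print(by_processing_time)   # windows keyed by when the data happened to be processed
```

Replaying the same events later would change every `processing_time`, and therefore the second result, while the event-time result would stay the same.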
77. Event time. Watermarks
[Animation: events 1 to 10 arrive out of order and are grouped into Event Time Windows and Ingestion Time Windows]
Watermark 5, with a late time of 2: no event with an event time before 5 will come
Watermark 10, with a late time of 2: no event with an event time before 10 will come
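A watermark with an allowed lateness can be simulated in plain Python. The sketch below is a simplification, not Flink's implementation: it assumes the watermark is the highest event time seen so far minus the allowed lateness, that a window fires once the watermark reaches its end, and that the arrival order and the window size of 5 are invented for the example.

```python
def run_with_watermarks(event_times, window_size, max_lateness):
    """Assign events to tumbling event-time windows; fire a window when the watermark passes its end.

    Assumed rule: watermark = highest event time seen so far - max_lateness.
    """
    open_windows = {}     # window start -> event times collected so far
    fired = []            # (window start, sorted contents), in firing order
    fired_starts = set()
    dropped = []          # events that arrived after their window had fired
    max_seen = float("-inf")
    for t in event_times:
        start = t - (t % window_size)
        if start in fired_starts:
            dropped.append(t)
        else:
            open_windows.setdefault(start, []).append(t)
        max_seen = max(max_seen, t)
        watermark = max_seen - max_lateness
        for s in sorted(open_windows):
            if s + window_size <= watermark:
                fired.append((s, sorted(open_windows.pop(s))))
                fired_starts.add(s)
    return fired, open_windows, dropped

# Hypothetical arrival order: events 4 and 7 arrive late.
arrival = [1, 2, 3, 5, 4, 6, 8, 7, 9, 10]
fired, still_open, dropped = run_with_watermarks(arrival, window_size=5, max_lateness=2)
fired_strict, _, dropped_strict = run_with_watermarks(arrival, window_size=5, max_lateness=0)
print(fired, dropped)
print(fired_strict, dropped_strict)
```

With a lateness of 2 the window [0, 5) waits long enough to include the late event 4; with a lateness of 0 it fires as soon as event 5 is seen and event 4 is dropped. The trade-off is latency: a larger lateness delays every result.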
83. Windowing
Windows: grouping of events according to time, session*, count
Powerful built-in windows:
Count: number of events that triggers the window. Process the last X events every Y events.
Time:
- Tumbling: triggers every X time units with the events received
- Sliding: triggers every X time units with the events received in the last Y time units
Session: all events from session/user X until the session times out (gap)
High-level API for user-defined windows: Window Assigner, Trigger, Evictor
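As a rough illustration of the three time-based groupings, here is a plain-Python sketch of how an event timestamp could be assigned to tumbling, sliding and session windows. The function names and the half-open `[start, end)` convention are assumptions for the example and are not Flink's Window Assigner API.

```python
def tumbling_windows(timestamp, size):
    """The single [start, end) window of length `size` containing `timestamp`."""
    start = timestamp - (timestamp % size)
    return [(start, start + size)]

def sliding_windows(timestamp, size, slide):
    """Every [start, end) window of length `size`, started every `slide`, containing `timestamp`."""
    windows = []
    start = timestamp - (timestamp % slide)   # latest window start at or before the timestamp
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)

def session_windows(timestamps, gap):
    """Group the timestamps of one user into sessions; a silence of `gap` or more closes a session."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] < gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions

print(tumbling_windows(7, size=5))
print(sliding_windows(7, size=10, slide=5))
print(session_windows([1, 2, 3, 10, 11, 30], gap=5))
```

A tumbling window is the special case of a sliding window whose slide equals its size, which is why an event belongs to exactly one tumbling window but to `size / slide` sliding windows.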
86. Stateful Streaming
[Diagram: Stateless Stream Processing (Op) vs. Stateful Stream Processing (Op with State)]
- Built-in internal state in each operator for exactly-once semantics
- User state can be declared in each operator and is saved locally in memory (API, key/value pairs)
- Snapshots: local in-memory states are periodically persisted in lightweight distributed snapshots. No global pause!
- Checkpoint: a globally consistent point-in-time snapshot built from a set of distributed snapshots.
- Pluggable state backend for snapshots: JobManager, HDFS, RocksDB
- Savepoints: user-triggered retained checkpoints
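The idea of local key/value state that can be snapshotted and restored can be sketched with a toy operator in plain Python. The class `KeyedCounter` and its methods are hypothetical names for this illustration; they do not correspond to Flink's state API or state backends.

```python
import copy

class KeyedCounter:
    """Toy stateful operator: keeps a running count per key as local key/value state."""

    def __init__(self):
        self.state = {}

    def process(self, key):
        """Update the state for `key` and emit (key, running count)."""
        self.state[key] = self.state.get(key, 0) + 1
        return key, self.state[key]

    def snapshot(self):
        """Return an independent copy of the state, as a snapshot would persist it."""
        return copy.deepcopy(self.state)

    def restore(self, snapshot):
        """Replace the local state with a previously taken snapshot."""
        self.state = copy.deepcopy(snapshot)

op = KeyedCounter()
outputs = [op.process(k) for k in ["a", "b", "a"]]
saved = op.snapshot()
op.process("a")       # state moves on after the snapshot
op.restore(saved)     # rolling back discards that update
print(outputs, op.state)
```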
88. Handling Checkpoints
Periodic Chandy-Lamport snapshots
“The global-state-detection algorithm is to be superimposed on the underlying computation: it must run concurrently with, but not alter, this underlying computation”
- Triggers snapshots asynchronously
- Snapshot algorithm embedded in the stream of data (barriers)
- No global pause, lightweight impact on performance
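The barrier idea can be illustrated with a deliberately small simulation in plain Python. It assumes a single operator with a single input channel, so it leaves out the barrier alignment across several inputs and the asynchronous persistence that a real engine needs; the `BARRIER` marker and the function name are made up for the example.

```python
BARRIER = "barrier"

def run_operator(stream):
    """Sum the records of a stream; on each barrier, snapshot the running sum and forward the barrier."""
    total = 0
    snapshots = {}   # checkpoint id -> state at the moment the barrier passed
    output = []
    for item in stream:
        if isinstance(item, tuple) and item[0] == BARRIER:
            checkpoint_id = item[1]
            snapshots[checkpoint_id] = total   # state reflects exactly the records before the barrier
            output.append(item)                # downstream operators see the same barrier
        else:
            total += item
            output.append(total)
    return output, snapshots

# Barriers travel inside the data stream; processing never stops for them.
stream = [1, 2, (BARRIER, 1), 3, 4, (BARRIER, 2), 5]
output, snapshots = run_operator(stream)
print(output)
print(snapshots)
```

Because the barrier flows with the records, each snapshot separates "records before checkpoint n" from "records after checkpoint n" without pausing the stream.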
95. Streaming Fault Tolerance
In case of failure, the last global checkpoint is recovered
(recovery from partial checkpoints / individual snapshots is coming)
A stateful source such as Kafka is needed to ensure end-to-end exactly-once semantics in case of failure.
The Kafka sink doesn't guarantee end-to-end exactly-once (multiple writes to the topic are possible), only at-least-once
Semantics in Flink:
At least once: events are never lost, but they might be reprocessed
Exactly once: events are neither lost nor reprocessed.
Exactly once by default, with low impact on performance
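The difference between the two guarantees can be shown with a small plain-Python simulation of a single failure. It assumes a replayable source addressed by offsets (as Kafka offsets would be) and a checkpoint that stores the operator state together with the source offset; the function name and parameters are hypothetical.

```python
def count_with_recovery(records, checkpoint_every, fail_at, restore_state):
    """Count records per key with one simulated failure just before offset `fail_at`.

    restore_state=True  : state is rolled back to the checkpoint (exactly-once state).
    restore_state=False : only the source rewinds (at-least-once, replayed records count twice).
    """
    counts = {}
    checkpoint = ({}, 0)   # (copy of the state, source offset to resume from)
    offset = 0
    failed = False
    while offset < len(records):
        if not failed and offset == fail_at:
            failed = True
            saved_counts, saved_offset = checkpoint
            if restore_state:
                counts = dict(saved_counts)   # roll the state back to the checkpoint
            offset = saved_offset             # the replayable source rewinds
            continue
        key = records[offset]
        counts[key] = counts.get(key, 0) + 1
        offset += 1
        if offset % checkpoint_every == 0:
            checkpoint = (dict(counts), offset)
    return counts

records = ["a", "b", "a", "c", "a"]
exactly_once = count_with_recovery(records, checkpoint_every=2, fail_at=3, restore_state=True)
at_least_once = count_with_recovery(records, checkpoint_every=2, fail_at=3, restore_state=False)
print(exactly_once)
print(at_least_once)
```

In the at-least-once run the record replayed after the failure is counted a second time; restoring the state together with the source offset is what removes the double counting.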
96. If you want to know one thing about Flink, it is that you don't need to know the internals of Flink
98. Performance Tuning
Exactly-once semantics with low impact on performance
Controllable checkpointing overhead
Higher throughput using processing time
Performance improvements thanks to:
- operator chaining during the optimization phase
- its own optimized serialization stack with code generation
99. Streaming Benchmark
Benchmark for “Streaming Computation” published by Yahoo. Dec 18, 2015
https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
Production use case:
- counting ad impressions grouped by campaign
- aggregations over a 10-second window
- save the current aggregate value to Redis every second
100. Throughput vs Latency Graph
[Graph: throughput (thousands of events/sec) vs. 99th percentile latency (ms)]
No operator combining in Storm: a more complicated topology, more steps per event and more overhead
101. Performance Tuning
Apache Storm without Trident
- At least once / double counting after a failure / lost state after failures
- CPU bound
Apache Spark
- Latency increases with throughput
Apache Flink
- Exactly once / no double counting / no state loss
- Limited by the bandwidth between Kafka and the Flink cluster (1 GigE)
- Kafka brokers within the Kafka cluster (10 GigE)
- Achieved 15 million messages/sec (3 million messages/sec before) with exactly-once semantics
[Chart: throughput axis marked at 10,000,000 and 20,000,000 messages/sec; 1 GigE vs. 10 GigE]
108. API
Working on data streams (bounded?)
Stream Processing: Explicit Handling of Time
Java & Scala. Python coming.
Java: Bean-type classes vs. Tuples with positional addressing.
Scala: case classes.
Operators:
Sources: Kafka, FileSystem, Cassandra…
Sinks: Kafka, HDFS, Cassandra…
Transformations:
Basic: map, flatMap, filter, grouping, iterate, project, join, cross, …
Streaming: Windowing + Aggregations, Temporal Binary
Iterative Stream operators
Core API: DataStream<?>, DataSet<?>
1 implementation*, 2 interfaces
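For readers who have not used a dataflow API, the basic transformations can be pictured with ordinary Python generators. This is only an analogy under that assumption: it uses no Flink classes, and the input lines are invented. Each stage pulls elements lazily from the previous one, which loosely mirrors how chained operators pass records along.

```python
from collections import Counter

# Hypothetical input; in a real job this would come from a source such as Kafka.
lines = ["to be or not to be", "a stream"]

words = (word for line in lines for word in line.split())   # flatMap: one line -> many words
long_words = (word for word in words if len(word) > 1)      # filter: drop one-letter words
pairs = ((word, 1) for word in long_words)                  # map: word -> (word, 1)

counts = Counter()
for word, n in pairs:                                       # grouping by key + sum
    counts[word] += n

print(dict(counts))
```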
119. Table API
API extension for DataSets and DataStreams
Based on a relational Table abstraction
Table <=> Source / DataSet / DataStream
Operators like: where, select, as, groupBy, join, union, minus, distinct, orderBy, ...
120. SQL Support
Execute SQL-like statements on DataSets and DataStreams
Results are returned as a Table (Table API), convertible to a DataStream or DataSet
SQL and the Table API can be seamlessly mixed over DataStreams/DataSets
Flink’s SQL support is not feature complete, yet.
Queries that include unsupported SQL will fail!
121. Apache Calcite
Parsing and the logical plan for Table operators and SQL are optimized using Apache Calcite
Only a subset of the comprehensive SQL standard is supported
Apache Calcite provides:
SQL parsing
An API for building expressions in relational algebra
A query planning engine
It provides SQL for streaming queries with window aggregations:
SELECT STREAM TUMBLE_END(rowtime, INTERVAL '1' HOUR) AS rowtime, productId, COUNT(*) AS c, SUM(units) AS units
FROM Orders
GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR), productId
[Diagram: SQL statement → Apache Calcite (SQL to logical plan as relational algebra) → Flink optimizer (logical plan to execution plan)]
122. If you want to know one thing about Flink, it is that you don't need to know the internals of Flink
So… Batch
127. Batch on Stream
Stream: unbounded data stream
Batch: bounded stream (data set) on a stream processor
Global window over the entire data set
Optimization in operators for joins and grouping, with blocking data exchange if needed
Batch-specific optimizations:
Cost-based optimizer: data set size known beforehand
Managed memory on/off-heap for join, sort, …
Optimized serialization stack for user types
[Diagram: Unbounded Data Stream vs. Bounded Data Set]
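The "batch is a bounded stream" idea can be sketched in plain Python: the same running aggregation works on an endless generator and on a finite list, and the batch answer is simply what a single global window emits at end of input. The function names are hypothetical and nothing here is Flink API.

```python
from itertools import count, islice

def running_sums(stream):
    """Streaming view: emit an updated sum after every element (works on unbounded input)."""
    total = 0
    for value in stream:
        total += value
        yield total

def global_window_sum(bounded_stream):
    """Batch view: one global window over the whole data set, emitted once the input ends."""
    result = None
    for result in running_sums(bounded_stream):
        pass
    return result

unbounded = count(1)                              # 1, 2, 3, ... never ends
first_four = list(islice(running_sums(unbounded), 4))
batch_result = global_window_sum([1, 2, 3, 4])    # bounded data set

print(first_four)
print(batch_result)
```

The batch result equals the last value the streaming computation would emit over the same bounded input, which is the sense in which batch runs "on" the stream processor.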
134. Conclusions
Flink's pure streaming engine matches real life. No abstraction
Batch on streaming
Flexible windowing semantics with explicit time handling
Competitive performance, low latency and high throughput
Apache Beam, open sourced by Google, uses Flink as a first-class runner for batch and streaming processing, in partnership with data Artisans.
100% compliance with the data processing model “what, where, when, how”