SlideShare a Scribd company logo
Stateful stream processing made
easy with Apache Flink
A. Mancini - F. Tosi
ROME - APRIL 13/14 2018
1
STATEFUL STREAM PROCESSING
MADE EASY WITH
APACHE FLINK
Alberto Mancini
alberto@k-teq.com
Francesca Tosi
francesca@k-teq.com
2
Alberto Mancini
Software Architect
K-TEQ Srls
alberto@k-teq.com
#Flink #Java #GWT #Web
3
WHO WE ARE
Francesca Tosi
Software Architect
K-TEQ Srls
francesca@k-teq.com
#Java #Flink #Web #GWT
4
Apache Flink
Apache Flink is an open source stream
processing framework developed by the
Apache Software Foundation.
The core of Apache Flink is a distributed
streaming dataflow engine written in
Java and Scala.
5
TL;DR
Flink's pipelined runtime system enables
the execution of bulk/batch and stream
processing programs.
6
TL;DR
- Native Stream
- Low Latency
- High Throughput
- Stateful
- Exactly-one guarantees
- Distributed
- Expressive Apis
- …
Main Features
7
TL;DR
https://github.com/apache/flink
#contributors +377
#commits +13.565
#committers over 25
#linesOdCode +1.257.949
Flink conferences (at the
beginning of this week
Flink Forward in San Francisco,
5th Int’l Flink conference).
8
TL;DR 2017 - Year in Review
#contributors #stars #forks
9
TL;DR 2017 - Year in Review
#LinesOfCode
10
TL;DR
https://github.com/apache/flink
#contributors +25
#commits +13.565
#committers over 25
#linesOdCode +1.257.949
Flink conferences (at the
beginning of this week
Flink Forward in San Francisco,
5th Int’l Flink conference).
11
TL;DR
#FlinkForward
A bit of History
Actually flink was born as a sort of spin-off
from the project Stratosphere.
“Stratosphere is a research project whose
goal is to develop the next generation Big
Data Analytics platform.
aimed at Next Generation Big Data
Analytics Platform
The project includes universities from the
area of Berlin, namely, TU Berlin, Humboldt
University and the Hasso Plattner Institute.” 12
TL;DR
13
TL;DR (Flink use cases)
uses Flink for real-time
process monitoring and ETL.
Telefónica NEXT's TÜV-certified
Data Anonymization Platform
is powered by Flink.
uses a fork of Flink called Blink
to optimize search rankings in
real time.
14
TL;DR (Flink use cases)
Ericsson used Flink to build a
real-time anomaly detector
over large infrastructures.
MediaMath uses Flink to power
its real-time reporting
infrastructure.
uses Flink to surface near
real-time intelligence from SaaS
application activity.
https://flink.apache.org/poweredby.html
15
co-flatMap
keyBy
Map
TL;DR
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
//create a kafka source
FlinkKafkaConsumer010<Event> kafkaConsumer
= new FlinkKafkaConsumer010<Event>("topic", new MessageDeserializer(), properties);
DataStreamSource<Event> kafkaStream = env.addSource(kafkaConsumer);
ZookeeperSource zookeeperSource = new ZookeeperSource("Quorum", "/path");
DataStreamSource<Map<String, String>> zookeeperStream = env.addSource(zookeeperSource);
ConnectedStreams<Event, Map<String, String>> connectedStream
= kafkaStream.connect(zookeeperStream);
CoFlatMapFunction<Event, Map<String, String>, Event> coFlatMap = new DoSomethingFlatMap();
SingleOutputStreamOperator<Event> flatmappedStream = connectedStream.flatMap(coFlatMap);
KeyedStream<Event, String> keyedStream = flatmappedStream.keyBy(new MyKeySelector());
SingleOutputStreamOperator<OutputEvent> transformed
= keyedStream.map(new AnotherMapThatIsExecutedInKeyedContext());
transformed.writeUsingOutputFormat( new MyHbaseOutputFormat() );
env.execute("Codemotion Sample");
16
TL;DR (code snippet)
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
//create a kafka source
FlinkKafkaConsumer010<Event> kafkaConsumer
= new FlinkKafkaConsumer010<Event>("topic", new MessageDeserializer(), properties);
DataStreamSource<Event> kafkaStream = env.addSource(kafkaConsumer);
ZookeeperSource zookeeperSource = new ZookeeperSource("Quorum", "/path");
DataStreamSource<Map<String, String>> zookeeperStream = env.addSource(zookeeperSource);
ConnectedStreams<Event, Map<String, String>> connectedStream
= kafkaStream.connect(zookeeperStream);
CoFlatMapFunction<Event, Map<String, String>, Event> coFlatMap = new DoSomethingFlatMap();
SingleOutputStreamOperator<Event> flatmappedStream = connectedStream.flatMap(coFlatMap);
KeyedStream<Event, String> keyedStream = flatmappedStream.keyBy(new MyKeySelector());
SingleOutputStreamOperator<OutputEvent> transformed
= keyedStream.map(new AnotherMapThatIsExecutedInKeyedContext());
transformed.writeUsingOutputFormat( new MyHbaseOutputFormat() );
env.execute("Codemotion Sample");
17
TL;DR (code snippet)
StreamExecutionEnvironment env
= StreamExecutionEnvironment.getExecutionEnvironment();
FlinkKafkaConsumer010<Event> kafkaConsumer
= new FlinkKafkaConsumer010<>("topic", new MessageDeserializer(), properties);
//start from latest message
kafkaConsumer.setStartFromLatest();
ZookeeperSource zookeeperSource = new ZookeeperSource("Quorum", "/path");
env.addSource(kafkaConsumer)
.connect( env.addSource(zookeeperSource) )
.flatMap( new DoSomethingFlatMap() )
.keyBy( new MyKeySelector() )
.map( new AnotherMapThatIsExecutedInKeyedContext() )
.writeUsingOutputFormat( new MyHbaseOutputFormat() );
env.execute("Codemotion Sample");
18
TL;DR (code snippet)
19
TL;DR
(Plan Visualizer)
20
TL;DR
(Plan Visualizer)
That's a lot of fun but
let’s start
from the very beginning
…
21
BigData is
…
well, big!
IBM (2014): every day about 2.5 trillion (1018
) of data
bytes are created and 90% of the data has been
created only in the last two years
Each year about EXABYTE (10^18, 2^60) of data.
22
BigData
still grows
From:
https://dvmobile.io/dvmobile-blog/feeling-overwhelmed-by-a-deluge-
of-iot-data-iot-data-analytics-dashboards-can-help 23
Processing
Model
From: https://data-artisans.com/what-is-stream-processing
24
From:
http://dme.rwth-aachen.de/en/research/projects/mapreduce
25
Computing Model (google docet)
From:
http://sqlknowledgebank.blogspot.it/2018/01/azure-data-lake-introductory.html
26
… that eventually can evolve as
From:
https://dvmobile.io/dvmobile-blog/feeling-overwhelmed-by-a-deluge-of-iot-data-iot-
data-analytics-dashboards-can-help
27
BigData still grows
1,60,0,1,60,0,1,60,0,1,60,0,1,60,0,1,60,0,1,60,0,1,60,0,1,60,0,1,60,0,1,60,0,60,0,0,0,1,60,0,60,0,0,0,1,60,0,60,0,0,0,1,60,0,60,0,0,0,1,60,0,60,0,0,0,1,1505661693,0,
1,1505661693,0,1,1505661693,0,1,1505661693,0,1,1505661693,0,1,60,0,60,0,0,0,1,60,0,60,0,0,0,1,60,0,60,0,0,0,1,60,0,60,0,0,0,1,60,0,60,0,0,0
1,354,0,1,354,0,1,354,0,1,354,0,1,354,0,1.000000032,353.9999996,4.59E-06,1.000031776,353.9996187,0.004575383,1.031757072,353.6306448,4.295
83941,2.597515222,346.6197997,34.09504708,5.319895236,344.2626949,22.18829857,1.000000032,353.9999996,0.002142519,353.9999996,4.5
9E-06,0,0,1.000031776,353.9996187,0.067641574,353.9996187,0.004575383,0,0,1.031757072,353.6306448,2.072640685,353.6306448,4.295839
41,0,0,2.597515222,346.6197997,5.839096427,346.6197997,34.09504708,0,0,5.319895236,344.2626949,4.710445687,344.2626949,22.18829857,
0,0,1.000000032,4.980575201,4.23E-07,1.000031776,4.980690887,0.000422029,1.031757072,5.092511423,0.418370441,2.597515222,31.24484876,6
976.104978,5.319895236,58.01651105,12740.63613,1.000000032,353.9999996,0.002142519,353.9999996,4.59E-06,0,0,1.000031776,353.9996187,0.
067641574,353.9996187,0.004575383,0,0,1.031757072,353.6306448,2.072640685,353.6306448,4.29583941,0,0,2.597515222,346.6197997,5.839
096427,346.6197997,34.09504708,0,0,5.319895236,344.2626949,4.710445687,344.2626949,22.18829857,0,0
1.857878541,360.4589798,35.78933753,1.91212736,360.2757326,35.92397154,1.969806657,360.0919684,35.9915418,1.99693884,360.0091976,35.
9999154,1.999693461,360.0009198,35.99999915,1.857878569,360.4589795,35.78934202,1.912156343,360.2754556,35.92848955,2.000604877,
359.8134525,40.39880279,3.589563813,352.0188402,100.0815133,6.318264483,347.7030867,81.6250773,1.857878569,360.4589795,5.982419412,
360.4589795,35.78934202,0,0,1.912156343,360.2754556,5.994037834,360.2754556,35.92848955,0,0,2.000604877,359.8134525,6.356005254,
359.8134525,40.39880279,0,0,3.589563813,352.0188402,10.00407484,352.0188402,100.0815133,0,0,6.318264483,347.7030867,9.034659778,347.
7030867,81.6250773,0,0,1.857878569,2.323596243,6.056225848,1.912156343,2.399071467,6.079503362,2.000604877,2.569134347,6.58053185,3
.589563813,22.55281278,5228.309522,6.318264483,48.84116231,11171.88781,1.857878569,360.4589795,5.982419412,360.4589795,35.78934202,0,
0,1.912156343,360.2754556,5.994037834,360.2754556,35.92848955,0,0,2.000604877,359.8134525,6.356005254,359.8134525,40.39880279,0,
0,3.589563813,352.0188402,10.00407484,352.0188402,100.0815133,0,0,6.318264483,347.7030867,9.034659778,347.7030867,81.6250773,0,0
1,337,0,1,337,0,1,337,0,1,337,0,1,337,0,1,337,0,1,337,0,1,337,0,1,337,0,1,337,0,1,337,0,337,0,0,0,1,337,0,337,0,0,0,1,337,0,337,0,0,0,1,337,0,337,0,0,0,1,337,0,337,
0,0,0,1,1505661697,0,1,1505661697,0,1,1505661697,0,1,1505661697,0,1,1505661697,0,1,337,0,337,0,0,0,1,337,0,337,0,0,0,1,337,0,337,0,0,0,1,337,0,337,0,0,0
,1,337,0,337,0,0,0
… x 40k lines
https://archive.ics.uci.edu/ml/machine-learning-databases/00442/Danmini_Doorbell/
28
not just big: big and often raw, often senseless
Government
- smarter surveillance: analyze data from vehicles and cameras to alert
law enforcement of potential issues
HealthCare
- proactive treatment: continuously improve care based on
personalized data streams
Finance
- manage risk: continuously monitor trades and calculate derivative
values in real-time
Automotive
- improved quality and functionalities: detect problems sooner and
predict breakdowns
Telco
- processing call data: predictive spam and fraud detection
29
areas of application
cf. use-cases-streaming-analytics
Government
- smarter surveillance: analyze data from vehicles and cameras to alert
law enforcement of potential issues
HealthCare
- proactive treatment: continuously improve care based on
personalized data streams
Finance
- manage risk: continuously monitor trades and calculate derivative
values in real-time
Automotive
- improved quality and functionalities: detect problems sooner and
predict breakdowns
Telco
- processing call data: predictive spam and fraud detection
30
areas of application
cf. use-cases-streaming-analytics
31
stream processing
From:
https://www.slideshare.net/ConfluentInc/kafka-and-stream-processing-taking-analytics-realt
ime-mike-spicer
32
process the data as asap
From:
https://www.flickr.com/photos/sheila_sund/15221366308
33
long running (distributed) application
- state management
- fault tolerance and recovery
- performance and scalability
- programming model
- ecosystem
34
Challenges
only a few problems can be solved without keeping some sort of
application state
state is needed for any kind of aggregation or counting
35
State Management
only a few problems can be solved without keeping some sort of
application state
● on a single node (and neglecting
threading) keeping state seems easy,
just keep it in the local memory but
with threads in the picture, even
on a single node, keeping state
consistent requires careful
synchronization;
● on a multi node/multi thread/long running application it may
will end up in a mess.
36
State Management
only a few problems can be solved without keeping some sort of
application state
Classical solution: keep state in an external database (KV-stores are
frequently the best fit).
● yet another system to manage
● yet another bottleneck to avoid
● yet another syncronization point to
care about
37
State Management
Only a few problems can be solved without keeping some sort of
application state
Flink keeps state for your application: synchronize, distribute and even
rescale.
38
State Management
Flink state comes in two flavors.
https://www.slideshare.net/dataArtisans/apache-flink-training-working-with-state
39
State Management
Flink state backends are threefold:
- MemoryStateBackend
holds data internally as objects on the Java heap, then
collected in the JobManager (master)
- FsStateBackend
holds in-flight data in the TaskManager’s memory then on
filesystem (hdfs or s3 for instance)
- RocksDBStateBackend
holds in-flight data in a RocksDB data base per task then the
whole RocksDB data base is stored on disk
40
State Management
From:
https://www.flickr.com/photos/sheila_sund/15221366308
41
Long running distributed stateful application
From:
https://www.wikihow.com/Untangle-a-Newton%27s-Cradle 42
Checkpoints
43
fault tolerance and recovery
Savepoints
“Savepoints are externally stored self-contained checkpoints that you
can use to stop-and-resume or update your Flink programs. They use
Flink’s checkpointing mechanism to create a (non-incremental)
snapshot of the state of your streaming program and write the
checkpoint data and meta data out to an external file system.”
caveat: you must consider serializers evolution for objects stored in the
state
44
fault tolerance and recovery
- event time semantics
- flexible windowing
- metrics and counters
- queryable state
- DataSet API
- Event Processing (CEP)
- Graphs: Gelly
- Machine Learning,
- TABLE API,
- SteamSQL
45
- Map: DataStream → DataStream
- FlatMap: DataStream → DataStream
- Filter: DataStream → DataStream
- KeyBy: DataStream → KeyedStream
- Reduce: KeyedStream (DataStream)
- Fold: KeyedStream (DataStream)
- Union: DataStream* → DataStream
- Connect: DataStream, DataStream →
ConnectedStreams
- CoMap: C’dStreams → DataStream
- CoFlatMap: C’dStreams →
DataStream
- Split: DataStream → SplitStream
- Select: SplitStream → DataStream
- Iterate: DataStream →
IterativeStream →
DataStream
- …
and there’s a lot more to tell
46
the big picture
Q & A
47
-
Q. Isn’t easier to use just Kafka ?
A. Well, No.
Kafka, or exactly Kafka Streams API
is a library that any standard Java application can embed and
hence does not attempt to dictate a deployment method;
whereas
Flink is a cluster framework, which means that the framework
takes care of deploying the application
https://www.confluent.io/blog/apache-flink-apache-kafka-streams-comparison-guideline-users/
48
Q. Did You write “Real Time” ?
A. Well, yes … actually, i meant …
Rigorously
“Real-time programs must guarantee response
within specified time constraints”
Here Real-time means (as in Cambridge
Dictionary)
“communicated, shown, presented, etc. at the
same time as events actually happen”
49
Q. What about Apache Storm ?
A. There is actually a compatiblility suite that let’s
you
● Run unmodified Storm topologies
● Embed Storm code (spouts and bolts) as
operators inside Flink DataStream programs.
50
Q. any kind of comparison chart ?
https://www.gmv.com/blog_gmv/future-streaming-technologies-apache-flink/
51

More Related Content

What's hot

OSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoringOSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoring
Gianluca Arbezzano
 
Virtual Flink Forward 2020: Everything is connected: How watermarking, scalin...
Virtual Flink Forward 2020: Everything is connected: How watermarking, scalin...Virtual Flink Forward 2020: Everything is connected: How watermarking, scalin...
Virtual Flink Forward 2020: Everything is connected: How watermarking, scalin...
Flink Forward
 
Flink Forward Berlin 2018: Steven Wu - "Failure is not fatal: what is your re...
Flink Forward Berlin 2018: Steven Wu - "Failure is not fatal: what is your re...Flink Forward Berlin 2018: Steven Wu - "Failure is not fatal: what is your re...
Flink Forward Berlin 2018: Steven Wu - "Failure is not fatal: what is your re...
Flink Forward
 
Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp
Kubernetes + Operator + PaaSTA = Flink @ Yelp -  Antonio Verardi, YelpKubernetes + Operator + PaaSTA = Flink @ Yelp -  Antonio Verardi, Yelp
Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp
Flink Forward
 
Apache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorApache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream Processor
Aljoscha Krettek
 
Virtual Flink Forward 2020: Data driven matchmaking streaming at Hyperconnect...
Virtual Flink Forward 2020: Data driven matchmaking streaming at Hyperconnect...Virtual Flink Forward 2020: Data driven matchmaking streaming at Hyperconnect...
Virtual Flink Forward 2020: Data driven matchmaking streaming at Hyperconnect...
Flink Forward
 
A look at Flink 1.2
A look at Flink 1.2A look at Flink 1.2
A look at Flink 1.2
Stefan Richter
 
Reactive Programming in Java and Spring Framework 5
Reactive Programming in Java and Spring Framework 5Reactive Programming in Java and Spring Framework 5
Reactive Programming in Java and Spring Framework 5
Richard Langlois P. Eng.
 
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
Flink Forward
 
Servlet vs Reactive Stacks in 5 Use Cases
Servlet vs Reactive Stacks in 5 Use CasesServlet vs Reactive Stacks in 5 Use Cases
Servlet vs Reactive Stacks in 5 Use Cases
VMware Tanzu
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache Flink
C4Media
 
Flink. Pure Streaming
Flink. Pure StreamingFlink. Pure Streaming
Flink. Pure Streaming
Indizen Technologies
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
Ververica
 
HBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnBHBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnB
HBaseCon
 
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Towards Flink 2.0:  Unified Batch & Stream Processing - Aljoscha Krettek, Ver...Towards Flink 2.0:  Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Flink Forward
 
Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at ...
Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at ...Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at ...
Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at ...
Flink Forward
 
Apache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedInApache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedIn
Chris Riccomini
 
Fabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache FlinkFabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache Flink
Ververica
 

What's hot (18)

OSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoringOSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoring
 
Virtual Flink Forward 2020: Everything is connected: How watermarking, scalin...
Virtual Flink Forward 2020: Everything is connected: How watermarking, scalin...Virtual Flink Forward 2020: Everything is connected: How watermarking, scalin...
Virtual Flink Forward 2020: Everything is connected: How watermarking, scalin...
 
Flink Forward Berlin 2018: Steven Wu - "Failure is not fatal: what is your re...
Flink Forward Berlin 2018: Steven Wu - "Failure is not fatal: what is your re...Flink Forward Berlin 2018: Steven Wu - "Failure is not fatal: what is your re...
Flink Forward Berlin 2018: Steven Wu - "Failure is not fatal: what is your re...
 
Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp
Kubernetes + Operator + PaaSTA = Flink @ Yelp -  Antonio Verardi, YelpKubernetes + Operator + PaaSTA = Flink @ Yelp -  Antonio Verardi, Yelp
Kubernetes + Operator + PaaSTA = Flink @ Yelp - Antonio Verardi, Yelp
 
Apache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorApache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream Processor
 
Virtual Flink Forward 2020: Data driven matchmaking streaming at Hyperconnect...
Virtual Flink Forward 2020: Data driven matchmaking streaming at Hyperconnect...Virtual Flink Forward 2020: Data driven matchmaking streaming at Hyperconnect...
Virtual Flink Forward 2020: Data driven matchmaking streaming at Hyperconnect...
 
A look at Flink 1.2
A look at Flink 1.2A look at Flink 1.2
A look at Flink 1.2
 
Reactive Programming in Java and Spring Framework 5
Reactive Programming in Java and Spring Framework 5Reactive Programming in Java and Spring Framework 5
Reactive Programming in Java and Spring Framework 5
 
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
 
Servlet vs Reactive Stacks in 5 Use Cases
Servlet vs Reactive Stacks in 5 Use CasesServlet vs Reactive Stacks in 5 Use Cases
Servlet vs Reactive Stacks in 5 Use Cases
 
Stream Processing with Apache Flink
Stream Processing with Apache FlinkStream Processing with Apache Flink
Stream Processing with Apache Flink
 
Flink. Pure Streaming
Flink. Pure StreamingFlink. Pure Streaming
Flink. Pure Streaming
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
HBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnBHBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnB
 
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Towards Flink 2.0:  Unified Batch & Stream Processing - Aljoscha Krettek, Ver...Towards Flink 2.0:  Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Ver...
 
Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at ...
Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at ...Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at ...
Virtual Flink Forward 2020: Keynote: The Evolution of Data Infrastructure at ...
 
Apache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedInApache Incubator Samza: Stream Processing at LinkedIn
Apache Incubator Samza: Stream Processing at LinkedIn
 
Fabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache FlinkFabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache Flink
 

Similar to Stateful stream processing made easy with Apache Flink

Robust stream processing with Apache Flink
Robust stream processing with Apache FlinkRobust stream processing with Apache Flink
Robust stream processing with Apache Flink
Aljoscha Krettek
 
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
Apache Fink 1.0: A New Era  for Real-World Streaming AnalyticsApache Fink 1.0: A New Era  for Real-World Streaming Analytics
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
Slim Baltagi
 
Do flink on web with flow - Dongwon Kim & Haemee park, SK Telecom)
Do flink on web with flow - Dongwon Kim & Haemee park, SK Telecom)Do flink on web with flow - Dongwon Kim & Haemee park, SK Telecom)
Do flink on web with flow - Dongwon Kim & Haemee park, SK Telecom)
Flink Forward
 
K. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward KeynoteK. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward Keynote
Flink Forward
 
Apache flink
Apache flinkApache flink
Apache flink
Ahmed Nader
 
FlinkForward Asia 2019 - Evolving Keystone to an Open Collaborative Real Time...
FlinkForward Asia 2019 - Evolving Keystone to an Open Collaborative Real Time...FlinkForward Asia 2019 - Evolving Keystone to an Open Collaborative Real Time...
FlinkForward Asia 2019 - Evolving Keystone to an Open Collaborative Real Time...
Zhenzhong Xu
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
Suneel Marthi
 
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developer
Paul Czarkowski
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics Frameworks
Slim Baltagi
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System Overview
Flink Forward
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Apps
ssuser73434e
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming apps
Timothy Spann
 
maxbox starter72 multilanguage coding
maxbox starter72 multilanguage codingmaxbox starter72 multilanguage coding
maxbox starter72 multilanguage coding
Max Kleiner
 
project_docs
project_docsproject_docs
project_docs
Andrey Lavrinovic
 
BigDataFest Building Modern Data Streaming Apps
BigDataFest  Building Modern Data Streaming AppsBigDataFest  Building Modern Data Streaming Apps
BigDataFest Building Modern Data Streaming Apps
ssuser73434e
 
eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, I...
eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, I...eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, I...
eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, I...
InfluxData
 
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
DataWorks Summit
 
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca ArbezzanoOSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
NETWAYS
 
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
HostedbyConfluent
 
Maximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Maximilian Michels – Google Cloud Dataflow on Top of Apache FlinkMaximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Maximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Flink Forward
 

Similar to Stateful stream processing made easy with Apache Flink (20)

Robust stream processing with Apache Flink
Robust stream processing with Apache FlinkRobust stream processing with Apache Flink
Robust stream processing with Apache Flink
 
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
Apache Fink 1.0: A New Era  for Real-World Streaming AnalyticsApache Fink 1.0: A New Era  for Real-World Streaming Analytics
Apache Fink 1.0: A New Era for Real-World Streaming Analytics
 
Do flink on web with flow - Dongwon Kim & Haemee park, SK Telecom)
Do flink on web with flow - Dongwon Kim & Haemee park, SK Telecom)Do flink on web with flow - Dongwon Kim & Haemee park, SK Telecom)
Do flink on web with flow - Dongwon Kim & Haemee park, SK Telecom)
 
K. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward KeynoteK. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward Keynote
 
Apache flink
Apache flinkApache flink
Apache flink
 
FlinkForward Asia 2019 - Evolving Keystone to an Open Collaborative Real Time...
FlinkForward Asia 2019 - Evolving Keystone to an Open Collaborative Real Time...FlinkForward Asia 2019 - Evolving Keystone to an Open Collaborative Real Time...
FlinkForward Asia 2019 - Evolving Keystone to an Open Collaborative Real Time...
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
 
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developer
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics Frameworks
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System Overview
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Apps
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming apps
 
maxbox starter72 multilanguage coding
maxbox starter72 multilanguage codingmaxbox starter72 multilanguage coding
maxbox starter72 multilanguage coding
 
project_docs
project_docsproject_docs
project_docs
 
BigDataFest Building Modern Data Streaming Apps
BigDataFest  Building Modern Data Streaming AppsBigDataFest  Building Modern Data Streaming Apps
BigDataFest Building Modern Data Streaming Apps
 
eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, I...
eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, I...eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, I...
eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, I...
 
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
Apache Flume - Streaming data easily to Hadoop from any source for Telco oper...
 
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca ArbezzanoOSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
 
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
 
Maximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Maximilian Michels – Google Cloud Dataflow on Top of Apache FlinkMaximilian Michels – Google Cloud Dataflow on Top of Apache Flink
Maximilian Michels – Google Cloud Dataflow on Top of Apache Flink
 

Recently uploaded

Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 

Recently uploaded (20)

Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 

Stateful stream processing made easy with Apache Flink