© 2019 Ververica
Konstantin Knauf, @snntrable, Solutions Architect
99 WAYS TO ENRICH STREAMING DATA WITH APACHE FLINK
Agenda
• Introduction
• Per-Record Reference Data Lookup
• Reference Data Pre-Loading
• Reference Data Change Stream
• Summary
Introduction
Running Example
Diagram: Sensor Measurements → Apache Flink → Enriched Measurements; Apache Flink looks up Sensor Reference Data by sensorID
Sensor Reference Data:
• low update frequency
• one record per key
Sensor Measurements:
• high frequency
• many events per key over time
Per-Record Reference Data Lookup
Per-Record Synchronous Lookup
Diagram label: Sensor Reference Data
Implementation in Flink:
● RichFlatMapFunction
● Database Client instantiated in open()
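The life cycle named on this slide can be sketched in plain Python. This is not Flink API: `InMemoryReferenceDb` and `SyncEnricher` are hypothetical stand-ins for a database client and a rich flat-map function, and the dictionary replaces a real database.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional, Tuple


@dataclass(frozen=True)
class Measurement:
    sensor_id: str
    value: float


class InMemoryReferenceDb:
    """Hypothetical stand-in for an external database client."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self._rows = rows
        self.queries = 0  # counts round trips to make the database load visible

    def get(self, sensor_id: str) -> Optional[str]:
        self.queries += 1
        return self._rows.get(sensor_id)


class SyncEnricher:
    """Mirrors the open()/flatMap() life cycle of a rich function."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self._rows = rows
        self._db: Optional[InMemoryReferenceDb] = None

    def open(self) -> None:
        # Create the client once per operator instance, not once per record.
        self._db = InMemoryReferenceDb(self._rows)

    def flat_map(self, m: Measurement) -> Iterator[Tuple[str, float, str]]:
        location = self._db.get(m.sensor_id)  # one blocking lookup per record
        if location is not None:
            yield (m.sensor_id, m.value, location)
```

Every record costs one round trip, which is why the query counter grows linearly with the number of events.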
Per-Record Synchronous Lookup
Pros (+):
• very simple
• always uses up-to-date reference data
Cons (-):
• high latency
• high load on database
• low CPU utilization on TaskManagers
• low throughput
• overall not suitable for high frequency streams
Per-Record Asynchronous Lookup
Diagram label: Sensor Reference Data
Implementation in Flink:
● AsyncDataStream#unorderedWait
● https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/asyncio.html
Per-Record Asynchronous Lookup
Code
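The code from the original slide is not part of this transcript. As a rough illustration of the idea behind unordered asynchronous lookups, here is a plain-Python `asyncio` sketch. It does not use Flink's Async I/O API; `lookup`, `enrich_unordered`, the `capacity` parameter, and the sample data are assumptions made for the example.

```python
import asyncio
from typing import Dict, List, Tuple

REFERENCE: Dict[str, str] = {"s1": "hall-a", "s2": "hall-b"}  # hypothetical data


async def lookup(sensor_id: str, delay: float) -> str:
    """Hypothetical non-blocking client call; the sleep simulates network latency."""
    await asyncio.sleep(delay)
    return REFERENCE[sensor_id]


async def enrich_unordered(
    events: List[Tuple[str, float, float]], capacity: int
) -> List[Tuple[str, float, str]]:
    """Enrich (sensor_id, value, delay) events with at most `capacity` lookups in flight."""
    limit = asyncio.Semaphore(capacity)

    async def enrich(sensor_id: str, value: float, delay: float):
        async with limit:
            return (sensor_id, value, await lookup(sensor_id, delay))

    tasks = [asyncio.create_task(enrich(*event)) for event in events]
    results = []
    for finished in asyncio.as_completed(tasks):
        results.append(await finished)  # results are emitted in completion order
    return results
```

Several lookups overlap instead of blocking one another, so throughput rises, but each record still pays the full round-trip latency and the database still sees one query per record.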
Per-Record Asynchronous Lookup
Pros (+):
• still pretty simple
• always uses up-to-date reference data
• higher throughput than “Per-Record Synchronous Lookups”
Cons (-):
• high latency
• high load on database
• overall not suitable for high frequency streams
Per-Record (A)Synchronous Lookup with In-Memory Cache
Diagram labels: Sensor Reference Data; two Cache boxes
Implementation in Flink:
● RichFlatMapFunction
● Simple HashMap or cache implementation of your favorite library
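A minimal sketch of such a cache in plain Python, assuming a time-to-live policy. `TtlCache`, `load`, and `ttl_seconds` are hypothetical names, and the injectable `clock` exists only to make the behaviour easy to check; a caching library would normally take this role.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Caches lookup results and reloads an entry once it is older than the TTL."""

    def __init__(
        self,
        load: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load          # the expensive per-record database lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, loaded_at)

    def get(self, key: str) -> str:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[1] < self._ttl:
            return hit[0]          # cache hit: no database round trip
        value = self._load(key)    # cache miss or expired entry
        self._entries[key] = (value, now)
        return value
```

The sketch keeps every key it has seen, so memory grows with the key space; a bounded cache with eviction would be needed when the reference data does not fit in memory.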
Per-Record (A)Synchronous Lookup with In-Memory Cache
Pros (+):
• lower load on database
• higher throughput if cache hit rate is high
• low latency if cache is hit
Cons (-):
• high tail latencies
• high load on database during warm up phase, possibly hard to predict
• events might be enriched by stale data
• cache size limited by available memory
Reference Data Pre-Loading
Pre-Loading of Reference Data
Diagram label: Sensor Reference Data
Implementation in Flink:
● RichFlatMapFunction
● HashMap populated in open()
Pre-Loading of Reference Data
Pros (+):
• usually very simple (depending on the database client)
• high throughput
• low latency
Cons (-):
• reference data size limited by memory of single TaskManager
• events are enriched with stale data, no updates to reference data
Partitioned Pre-Loading of Reference Data
Diagram label: Sensor Reference Data
Implementation in Flink:
● DataStream#partitionCustom
● RichFlatMapFunction
Partitioned Pre-Loading of Reference Data
Code
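The code from the original slide is not part of this transcript. The sketch below models the idea in plain Python rather than with Flink's `partitionCustom`: a deterministic partition function routes each event to one subtask, and each subtask pre-loads only the keys that function assigns to it. `partition_for`, `PartitionedEnricher`, and `run_job` are hypothetical names.

```python
import zlib
from typing import Dict, List, Tuple


def partition_for(sensor_id: str, num_partitions: int) -> int:
    """Deterministic partitioner; CRC32 is used because Python's str hash is salted."""
    return zlib.crc32(sensor_id.encode("utf-8")) % num_partitions


class PartitionedEnricher:
    """One parallel subtask that holds only its own slice of the reference data."""

    def __init__(self, subtask_index: int, num_partitions: int) -> None:
        self._index = subtask_index
        self._n = num_partitions
        self.local: Dict[str, str] = {}

    def open(self, all_rows: Dict[str, str]) -> None:
        # A real job would push this filter into the database query instead.
        self.local = {
            key: row
            for key, row in all_rows.items()
            if partition_for(key, self._n) == self._index
        }

    def enrich(self, sensor_id: str, value: float) -> Tuple[str, float, str]:
        return (sensor_id, value, self.local[sensor_id])


def run_job(rows: Dict[str, str], events: List[Tuple[str, float]], num_partitions: int):
    subtasks = [PartitionedEnricher(i, num_partitions) for i in range(num_partitions)]
    for subtask in subtasks:
        subtask.open(rows)
    # Route every event with the same function that was used for pre-loading.
    out = [subtasks[partition_for(sid, num_partitions)].enrich(sid, v) for sid, v in events]
    return subtasks, out
```

The important design point is that loading and routing share one partition function; if they disagree, an event arrives at a subtask that never loaded its reference record.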
Partitioned Pre-Loading of Reference Data
Pros (+):
• high throughput
• low latency
Cons (-):
• reference data size limited by memory of all TaskManagers
• events are enriched with stale data, no updates to reference data
• requires custom partitioning of DataStream
Periodic (Partitioned) Pre-Loading of Reference Data
Diagram label: Sensor Reference Data
Implementation in Flink:
● CoProcessFunction
● Processing or Event Time Timers for Reloading
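A plain-Python sketch of the reload logic, under simplifying assumptions: there are no real timers here, so the reload is triggered lazily by the first event that arrives after the deadline, which stands in for a timer callback. `PeriodicReloadEnricher`, `load_all`, and `interval` are hypothetical names, and time is passed in explicitly.

```python
from typing import Callable, Dict, Mapping, Optional, Tuple


class PeriodicReloadEnricher:
    """Keeps a full local copy of the reference data and refreshes it on a schedule."""

    def __init__(self, load_all: Callable[[], Mapping[str, str]], interval: float) -> None:
        self._load_all = load_all   # bulk read of the whole reference table
        self._interval = interval
        self._reference: Dict[str, str] = {}
        self._next_reload = 0.0

    def open(self, now: float) -> None:
        self._reload(now)

    def _reload(self, now: float) -> None:
        self._reference = dict(self._load_all())  # snapshot copy
        self._next_reload = now + self._interval  # schedule the next refresh

    def process(self, sensor_id: str, value: float, now: float) -> Optional[Tuple[str, float, str]]:
        if now >= self._next_reload:  # stands in for a timer firing
            self._reload(now)
        reference = self._reference.get(sensor_id)
        return None if reference is None else (sensor_id, value, reference)
```

Between two reloads the operator serves the old snapshot, so staleness is bounded by `interval`, and each reload is a bulk read that can show up as a load spike.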
Periodic (Partitioned) Pre-Loading of Reference Data
Pros (+):
• high throughput
• low latency
• staleness of reference data limited by refresh interval
Cons (-):
• reference data size limited by memory of all TaskManagers
• events are enriched with stale reference data
• load spikes/high tail latencies during refresh of reference data
Per-Record Lookup with Initial Cache Pre-Loading
Diagram labels: Sensor Reference Data; two Cache boxes
Implementation in Flink:
● RichFlatMapFunction
● Cache Preloading in open()
Per-Record Lookup with Initial Cache Pre-Loading
Pros (+):
• high throughput
• low latency
• staleness of reference data limited by cache timeout
Cons (-):
• reference data size limited by memory of all TaskManagers
• events are enriched with stale reference data
• load spikes/high tail latencies depending on cache miss rate
Half-Time Assessment
• multiple solutions for enrichment via external database
• So far, always a trade-off between
– database load
– staleness of reference data
– reference data size
– latency & throughput of event stream
Reference Data Change Stream
High-Level Architecture
Diagram: Sensor Measurements → Apache Flink → Enriched Measurements; Apache Flink looks up Sensor Reference Data by sensorID
Sensor Reference Data:
• low update frequency
• one record per key
Sensor Measurements:
• high frequency
• many events per key over time
More Streamy High-Level Architecture
Diagram labels: Sensor Reference Data, Sensor Reference Data Updates, Sensor Measurements, Apache Flink, Enriched Measurements
• database updates are captured and written into message queue
• local stream join instead of external lookup
High-Level Streaming/Event-Driven Architecture
Diagram labels: Sensor Management System, Sensor Reference Data Updates, Sensor Reference Data, Sensor Measurements, Apache Flink, Enriched Measurements
• ground truth is moved to message broker
• former sensor reference database and Apache Flink consume the same stream of reference data updates
• local stream join instead of external lookup
Simple Streaming Enrichment
KeyedCoProcessFunction
Implementation in Flink:
● Key by sensorId
● ValueState<SensorReferenceData>
Simple Streaming Enrichment
Code
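The code from the original slide is not part of this transcript. The following plain-Python sketch models the two-input pattern without Flink: one handler per input stream, and a dictionary standing in for the per-key value state. `StreamingEnricher` and its method names are hypothetical.

```python
from typing import Dict, Optional, Tuple


class StreamingEnricher:
    """Joins measurements with the latest reference record seen for the same key."""

    def __init__(self) -> None:
        # Stands in for keyed state: one value per sensor id.
        self._state: Dict[str, str] = {}

    def process_reference(self, sensor_id: str, reference: str) -> None:
        # A reference update simply overwrites the previous value for its key.
        self._state[sensor_id] = reference

    def process_measurement(self, sensor_id: str, value: float) -> Optional[Tuple[str, float, str]]:
        reference = self._state.get(sensor_id)
        if reference is None:
            return None  # no reference yet; a real job might buffer the event instead
        return (sensor_id, value, reference)
```

The join uses whatever reference value arrived last in processing order, regardless of timestamps. That is why a late measurement can be enriched with reference data that is newer than the measurement itself.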
Simple Streaming Enrichment
Pros (+):
• high throughput
• low latency
• always uses up-to-date reference data
• reference data size not limited by memory (RocksDB)
Cons (-):
• might necessitate change of high-level architecture or a conversation with DBAs
• events might be enriched by reference data from the future
Simple Event Time Join
CoProcessFunction
Look up based on event time
Implementation in Flink:
● Key by sensorId
● MapState<Long, SensorReferenceData>
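A plain-Python sketch of the event-time lookup, assuming the map state is keyed by the timestamp from which a reference version is valid. `EventTimeJoin` and its method names are hypothetical; sorted lists and `bisect` stand in for the state backend.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class EventTimeJoin:
    """Keeps every reference version per key and joins by event timestamp."""

    def __init__(self) -> None:
        self._timestamps: Dict[str, List[int]] = {}      # sorted version timestamps per key
        self._versions: Dict[str, Dict[int, str]] = {}   # timestamp -> reference per key

    def process_reference(self, sensor_id: str, timestamp: int, reference: str) -> None:
        versions = self._versions.setdefault(sensor_id, {})
        if timestamp not in versions:
            bisect.insort(self._timestamps.setdefault(sensor_id, []), timestamp)
        versions[timestamp] = reference

    def process_measurement(
        self, sensor_id: str, timestamp: int, value: float
    ) -> Optional[Tuple[str, float, str]]:
        timestamps = self._timestamps.get(sensor_id, [])
        # Find the latest version whose timestamp is <= the measurement's timestamp.
        i = bisect.bisect_right(timestamps, timestamp)
        if i == 0:
            return None  # no version was valid at that time
        return (sensor_id, value, self._versions[sensor_id][timestamps[i - 1]])
```

Because the join is immediate, a measurement is matched against the versions that have arrived so far; a version that is still in flight cannot be considered.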
Simple Event Time Join
Pros (+):
• high throughput
• low latency
• always uses latest available reference data for each record
• reference data size not limited by memory (RocksDB)
Cons (-):
• might necessitate change of high-level architecture or a conversation with DBAs
Temporal Table Join
KeyedCoProcessFunction
On each watermark: joins all events up to the watermark with the correct reference data
Implementation in Flink:
● https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/streaming/temporal_tables.html
● org.apache.flink.table.runtime.join.TemporalRowtimeJoin
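The watermark-driven behaviour can be sketched in plain Python. This is an illustration only, not the logic of `TemporalRowtimeJoin`: `WatermarkJoin` and its methods are hypothetical, and the sketch assumes the watermark passed to `on_watermark` already covers both inputs, so every reference version up to that time has arrived.

```python
import heapq
from typing import Dict, List, Tuple


class WatermarkJoin:
    """Buffers measurements and joins them only once the watermark has passed them."""

    def __init__(self) -> None:
        self._versions: Dict[str, Dict[int, str]] = {}  # key -> {valid_from: reference}
        self._pending: List[Tuple[int, int, str, float]] = []  # min-heap ordered by timestamp
        self._seq = 0  # tie-breaker so equal timestamps keep arrival order

    def process_reference(self, sensor_id: str, timestamp: int, reference: str) -> None:
        self._versions.setdefault(sensor_id, {})[timestamp] = reference

    def process_measurement(self, sensor_id: str, timestamp: int, value: float) -> None:
        heapq.heappush(self._pending, (timestamp, self._seq, sensor_id, value))
        self._seq += 1

    def on_watermark(self, watermark: int) -> List[Tuple[str, int, float, str]]:
        out = []
        while self._pending and self._pending[0][0] <= watermark:
            timestamp, _, sensor_id, value = heapq.heappop(self._pending)
            versions = self._versions.get(sensor_id, {})
            valid = [t for t in versions if t <= timestamp]
            if valid:  # pick the version that was valid at the event's timestamp
                out.append((sensor_id, timestamp, value, versions[max(valid)]))
        return out
```

Waiting for the watermark is what adds latency: a measurement is held back until no older reference update can still arrive. The sketch never discards old versions; a real job would need to clean them up.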
Temporal Table Join
Pros (+):
• high throughput
• always uses latest available reference data for each record & reference data is complete
• reference data size not limited by memory (RocksDB)
Cons (-):
• might necessitate change of high-level architecture or a conversation with DBAs
• higher latency
TL;DL
• Flink provides numerous ways to enrich streaming data with slowly changing reference data.
• The highest performance and cleanest semantics result from a stream processing architecture and streaming enrichment methods like Temporal Table Joins.
• More Resources
– https://www.ververica.com/resources/flink-forward-san-francisco-2019/how-to-join-two-data-streams
– https://github.com/knaufk/enrichments-with-flink
• Join the community!
– Subscribe to mailing lists
– Join Flink Forward Europe 2019 in October
Questions?
www.ververica.com  @VervericaData  konstantin@ververica.com

Kostas Kloudas - Extending Flink's Streaming APIs
 
Fabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache FlinkFabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache Flink
 
Stephan Ewen - Stream Processing as a Foundational Paradigm and Apache Flink'...
Stephan Ewen - Stream Processing as a Foundational Paradigm and Apache Flink'...Stephan Ewen - Stream Processing as a Foundational Paradigm and Apache Flink'...
Stephan Ewen - Stream Processing as a Foundational Paradigm and Apache Flink'...
 
Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup
Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup
Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup
 
Robert Metzger - Apache Flink Community Updates November 2016 @ Berlin Meetup
Robert Metzger - Apache Flink Community Updates November 2016 @ Berlin Meetup Robert Metzger - Apache Flink Community Updates November 2016 @ Berlin Meetup
Robert Metzger - Apache Flink Community Updates November 2016 @ Berlin Meetup
 
Keynote: Stephan Ewen - Stream Processing as a Foundational Paradigm and Apac...
Keynote: Stephan Ewen - Stream Processing as a Foundational Paradigm and Apac...Keynote: Stephan Ewen - Stream Processing as a Foundational Paradigm and Apac...
Keynote: Stephan Ewen - Stream Processing as a Foundational Paradigm and Apac...
 

Webinar: 99 Ways to Enrich Streaming Data with Apache Flink - Konstantin Knauf

  • 1. © 2019 Ververica Konstantin Knauf, @snntrable, Solutions Architect 99 WAYS TO ENRICH STREAMING DATA WITH APACHE FLINK
  • 2. Agenda: Introduction • Per-Record Reference Data Lookup • Reference Data Pre-Loading • Reference Data Change Stream • Summary
  • 4. Running Example: Apache Flink joins Sensor Measurements (high frequency, many events per key over time) with Sensor Reference Data (low update frequency, one record per key), looked up by sensorID, to produce Enriched Measurements.
  • 5. Per-Record Reference Data Lookup
  • 6. Per-Record Synchronous Lookup [diagram: Sensor Reference Data]. Implementation in Flink: ● RichFlatMapFunction ● Database client instantiated in open()
  • 7. Per-Record Synchronous Lookup. Pros: very simple; always uses up-to-date reference data. Cons: high latency; high load on database; low CPU utilization on TaskManagers; low throughput; overall not suitable for high-frequency streams.
  • 8. Per-Record Asynchronous Lookup [diagram: Sensor Reference Data]. Implementation in Flink: ● AsyncDataStream#unorderedWait ● https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/asyncio.html
  • 9. Per-Record Asynchronous Lookup: Code
  • 10. Per-Record Asynchronous Lookup. Pros: still pretty simple; always uses up-to-date reference data; higher throughput than "Per-Record Synchronous Lookup". Cons: high latency; high load on database; overall not suitable for high-frequency streams.
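
The throughput gain of the asynchronous variant comes from keeping many lookups in flight at once. The following is a minimal plain-Java sketch of that idea, not the code from slide 9 and not Flink API: `AsyncLookupSketch`, `REFERENCE` and `enrichAll` are hypothetical names, and the in-memory map stands in for a real asynchronous database client. Unlike `AsyncDataStream#unorderedWait`, this sketch returns results in input order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical stand-in for an asynchronous database client (illustration only).
final class AsyncLookupSketch {

    // Pretend reference table: sensorId -> location (made-up data).
    static final Map<Integer, String> REFERENCE = Map.of(1, "hall-a", 2, "hall-b");

    // Issues one lookup per measurement without waiting for earlier ones to finish.
    static List<String> enrichAll(List<Integer> sensorIds, ExecutorService pool) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (Integer id : sensorIds) {
            pending.add(CompletableFuture.supplyAsync(
                    () -> id + "@" + REFERENCE.getOrDefault(id, "unknown"), pool));
        }
        // join() blocks only here, after all requests have been submitted.
        List<String> enriched = new ArrayList<>();
        for (CompletableFuture<String> future : pending) {
            enriched.add(future.join());
        }
        return enriched;
    }
}
```

Every record still costs one database request, which is why the database load listed among the cons does not go away.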
  • 11. Per-Record (A)Synchronous Lookup with In-Memory Cache [diagram: Sensor Reference Data, Cache]. Implementation in Flink: ● RichFlatMapFunction ● Simple HashMap or cache implementation of your favorite library
  • 12. Per-Record (A)Synchronous Lookup with In-Memory Cache. Pros: lower load on database; higher throughput if cache hit rate is high; low latency if cache is hit. Cons: high tail latencies; high load on database during warm-up phase, possibly hard to predict; events might be enriched by stale data; cache size limited by available memory.
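
As a rough illustration of the cached lookup, here is a plain-Java sketch of a size-bounded LRU cache in front of a slow lookup. It is an assumption-laden toy: `CachedLookupSketch` and its `database` function are hypothetical, there is no Flink dependency, and there is no expiry, so a cached record is served until it is evicted (the staleness listed among the cons).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class CachedLookupSketch {
    private final Function<Integer, String> database; // hypothetical slow lookup
    private final Map<Integer, String> cache;
    private int databaseCalls = 0;

    CachedLookupSketch(Function<Integer, String> database, int maxEntries) {
        this.database = database;
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Serves from the cache when possible; otherwise asks the database and remembers the answer.
    String lookup(int sensorId) {
        String cached = cache.get(sensorId);
        if (cached != null) {
            return cached;
        }
        databaseCalls++;
        String fresh = database.apply(sensorId);
        cache.put(sensorId, fresh);
        return fresh;
    }

    int databaseCalls() {
        return databaseCalls;
    }
}
```

With `maxEntries` set to 2, the lookups 1, 1, 2, 3, 1 reach the database four times: only the second lookup of sensor 1 is a hit, and the last one misses because sensor 1 was evicted when sensor 3 arrived.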
  • 13. Reference Data Pre-Loading
  • 14. Pre-Loading of Reference Data [diagram: Sensor Reference Data]. Implementation in Flink: ● RichFlatMapFunction ● HashMap populated in open()
  • 15. Pre-Loading of Reference Data. Pros: usually very simple (depending on the database client); high throughput; low latency. Cons: reference data size limited by memory of a single TaskManager; events are enriched with stale data, no updates to reference data.
  • 16. Partitioned Pre-Loading of Reference Data [diagram: Sensor Reference Data]. Implementation in Flink: ● DataStream#partitionCustom ● RichFlatMapFunction
  • 17. Partitioned Pre-Loading of Reference Data: Code
  • 18. Partitioned Pre-Loading of Reference Data. Pros: high throughput; low latency. Cons: reference data size limited by memory of all TaskManagers; events are enriched with stale data, no updates to reference data; requires custom partitioning of DataStream.
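
The partitioned variant relies on one function deciding both where an event is routed and which slice of the reference data each parallel instance loads. The plain-Java sketch below illustrates that idea only; it is not the code from slide 17, and `PartitionedPreloadSketch`, `partitionFor` and `preload` are hypothetical names with no Flink dependency.

```java
import java.util.HashMap;
import java.util.Map;

final class PartitionedPreloadSketch {

    // The same function must route events and select the reference data slice.
    static int partitionFor(int sensorId, int numPartitions) {
        return Math.floorMod(Integer.hashCode(sensorId), numPartitions);
    }

    // What one parallel instance would load at startup: only the keys it will ever see.
    static Map<Integer, String> preload(Map<Integer, String> allReferenceData,
                                        int partition, int numPartitions) {
        Map<Integer, String> slice = new HashMap<>();
        for (Map.Entry<Integer, String> entry : allReferenceData.entrySet()) {
            if (partitionFor(entry.getKey(), numPartitions) == partition) {
                slice.put(entry.getKey(), entry.getValue());
            }
        }
        return slice;
    }
}
```

Because each instance holds only its slice, the total reference data can grow with the number of instances, at the price of the custom partitioning listed among the cons.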
  • 19. Periodic (Partitioned) Pre-Loading of Reference Data [diagram: Sensor Reference Data]. Implementation in Flink: ● CoProcessFunction ● Processing or Event Time Timers for Reloading
  • 20. Periodic (Partitioned) Pre-Loading of Reference Data. Pros: high throughput; low latency; staleness of reference data limited by refresh interval. Cons: reference data size limited by memory of all TaskManagers; events are enriched with stale reference data; load spikes/high tail latencies during refresh of reference data.
  • 21. Per-Record Lookup with Initial Cache Pre-Loading [diagram: Sensor Reference Data, Cache]. Implementation in Flink: ● RichFlatMapFunction ● Cache pre-loading in open()
  • 22. Per-Record Lookup with Initial Cache Pre-Loading. Pros: high throughput; low latency; staleness of reference data limited by cache timeout. Cons: reference data size limited by memory of all TaskManagers; events are enriched with stale reference data; load spikes/high tail latencies depending on cache miss rate.
  • 23. Half-Time Assessment • Multiple solutions for enrichment via external database • So far, always a trade-off between: database load; staleness of reference data; reference data size; latency & throughput of event stream
  • 24. Reference Data Change Stream
  • 25. High-Level Architecture: Apache Flink joins Sensor Measurements (high frequency, many events per key over time) with Sensor Reference Data (low update frequency, one record per key), looked up by sensorID, to produce Enriched Measurements.
  • 26. More Streamy High-Level Architecture [diagram: Sensor Reference Data, Sensor Reference Data Updates, Sensor Measurements, Apache Flink, Enriched Measurements] • database updates are captured and written into message queue • local stream join instead of external lookup
  • 27. High-Level Streaming/Event-Driven Architecture [diagram: Sensor Management System, Sensor Reference Data Updates, Sensor Reference Data, Sensor Measurements, Apache Flink, Enriched Measurements] • ground truth is moved to message broker • former sensor reference database and Apache Flink consume the same stream of reference data updates • local stream join instead of external lookup
  • 28. Simple Streaming Enrichment [diagram: KeyedCoProcessFunction]. Implementation in Flink: ● Key by sensorId ● ValueState<SensorReferenceData>
  • 29. Simple Streaming Enrichment: Code
  • 30. Simple Streaming Enrichment. Pros: high throughput; low latency; always uses up-to-date reference data; reference data size not limited by memory (RocksDB). Cons: might necessitate change of high-level architecture or a conversation with DBAs; events might be enriched by reference data from the future.
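
The logic of the simple streaming enrichment can be sketched in plain Java as two handlers sharing per-key state. This is not the code from slide 29 and uses no Flink API: `StreamingEnrichmentSketch` is a hypothetical class, the `HashMap` stands in for keyed `ValueState`, and returning an empty result when no reference record has arrived yet is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class StreamingEnrichmentSketch {
    // Stands in for keyed ValueState: one reference record per sensorId.
    private final Map<Integer, String> latestReference = new HashMap<>();

    // Input 1: a reference data update overwrites the stored record for its key.
    void onReferenceUpdate(int sensorId, String reference) {
        latestReference.put(sensorId, reference);
    }

    // Input 2: a measurement is joined with whatever reference record is stored right now.
    // Returning empty when nothing is stored yet is an assumption of this sketch.
    Optional<String> onMeasurement(int sensorId, double value) {
        String reference = latestReference.get(sensorId);
        return reference == null
                ? Optional.empty()
                : Optional.of(sensorId + ":" + value + ":" + reference);
    }
}
```

Since a measurement is joined with whatever is stored at arrival time, a late measurement can pick up a reference record that was written after the measurement was taken, which is the "reference data from the future" caveat.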
  • 31. Simple Event Time Join [diagram: CoProcessFunction, look up based on event time]. Implementation in Flink: ● Key by sensorId ● MapState<Long, SensorReferenceData>
  • 32. Simple Event Time Join. Pros: high throughput; low latency; always uses latest available reference data for each record; reference data size not limited by memory (RocksDB). Cons: might necessitate change of high-level architecture or a conversation with DBAs.
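
A lookup based on event time amounts to keeping several timestamped versions per key and picking the newest one that is not later than the measurement. The plain-Java sketch below illustrates this with a `TreeMap` in place of `MapState<Long, SensorReferenceData>`; `EventTimeJoinSketch` and its method names are hypothetical, and cleanup of old versions is omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

final class EventTimeJoinSketch {
    // Per sensorId: reference versions keyed by the timestamp from which they are valid.
    private final Map<Integer, TreeMap<Long, String>> versions = new HashMap<>();

    void onReferenceUpdate(int sensorId, long validFrom, String reference) {
        versions.computeIfAbsent(sensorId, k -> new TreeMap<>()).put(validFrom, reference);
    }

    // Picks the newest version whose validFrom is not after the measurement's timestamp.
    Optional<String> referenceAt(int sensorId, long eventTime) {
        TreeMap<Long, String> byTime = versions.get(sensorId);
        if (byTime == null) {
            return Optional.empty();
        }
        Map.Entry<Long, String> entry = byTime.floorEntry(eventTime);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }
}
```

This answers with the latest version available when the measurement is processed; a version that is valid for the measurement's timestamp but has not arrived yet is missed.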
  • 33. Temporal Table Join [diagram: KeyedCoProcessFunction; on each watermark, joins all events up to the watermark with the correct reference data]. Implementation in Flink: ● https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/streaming/temporal_tables.html ● org.apache.flink.table.runtime.join.TemporalRowtimeJoin
  • 34. Temporal Table Join. Pros: high throughput; always uses latest available reference data for each record & reference data is complete; reference data size not limited by memory (RocksDB). Cons: might necessitate change of high-level architecture or a conversation with DBAs; higher latency.
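
The watermark-driven join can be pictured as buffering measurements and joining them only once the watermark has passed their timestamp. The following plain-Java sketch shows that mechanism for a single key; it is not Flink's `TemporalRowtimeJoin`, `WatermarkJoinSketch` and its methods are hypothetical, and state cleanup is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WatermarkJoinSketch {
    private final TreeMap<Long, String> versions = new TreeMap<>(); // validFrom -> reference
    private final List<long[]> buffered = new ArrayList<>();        // {eventTime, value}

    void onReferenceUpdate(long validFrom, String reference) {
        versions.put(validFrom, reference);
    }

    // Measurements are only buffered; nothing is emitted until a watermark arrives.
    void onMeasurement(long eventTime, long value) {
        buffered.add(new long[] {eventTime, value});
    }

    // Emits every buffered measurement with eventTime <= watermark, in time order.
    List<String> onWatermark(long watermark) {
        buffered.sort(Comparator.comparingLong(m -> m[0]));
        List<String> out = new ArrayList<>();
        while (!buffered.isEmpty() && buffered.get(0)[0] <= watermark) {
            long[] m = buffered.remove(0);
            Map.Entry<Long, String> ref = versions.floorEntry(m[0]);
            out.add(m[1] + "@" + m[0] + ":" + (ref == null ? "none" : ref.getValue()));
        }
        return out;
    }
}
```

A measurement that arrives before the reference update valid for its timestamp is still joined correctly, because it waits in the buffer until the watermark passes. That waiting is the source of the higher latency listed among the cons.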
  • 35. TL;DL • Flink provides numerous ways to enrich streaming data with slowly changing reference data. • The highest performance and cleanest semantics result from a stream processing architecture and streaming enrichment methods like Temporal Table Joins. • More resources: https://www.ververica.com/resources/flink-forward-san-francisco-2019/how-to-join-two-data-streams and https://github.com/knaufk/enrichments-with-flink • Join the community: subscribe to the mailing lists; join Flink Forward Europe 2019 in October.
  • 37. www.ververica.com • @VervericaData • konstantin@ververica.com