Unlocking the Power of Apache Flink:
An Introduction in 4 Acts
David Anderson
Software Practice Lead, Confluent
Apache Flink Committer
Today’s consumers expect real-time services
Real-time data: a sale, a shipment, a trade, a customer experience
Rich front-end customer experiences
Real-time backend operations
Real-time services rely on stream processing
Real-time Stream Processing
Driving business value with Apache Flink
Real-time analytics
Continuously produce and update results which are displayed and delivered to users as real-time data streams are consumed
● Ad/campaign performance
● Content performance
● Quality monitoring of Telco networks
● Usage metering and billing
Event-driven applications
Recognize patterns and react to incoming events by triggering computations, state updates, or external actions
● Fraud detection
● Anomaly detection
● Business process monitoring
● Geo-fencing
Streaming data pipelines
Real-time data pipelines that continuously ingest, enrich, and transform data streams, loading them into destination systems for timely action (vs. batch processing)
● Continuous ETL
● Real-time search index building
● ML pipelines
● Data lake ingestion
Developers are choosing Flink because of its performance and rich feature set
Scalability & high performance
Flink supports stream processing workloads at tremendous scale
Language flexibility
Flink supports Java, Python, & SQL, enabling developers to work in their language of choice
Unified processing
Flink supports stream processing, batch processing, and ad-hoc analytics through one technology
Fault tolerance & high availability
Flink's checkpointing mechanism provides exactly-once guarantees automatically
Flink is a top 5 Apache project and has a very active community
@yourtwitterhandle | developer.confluent.io
The four cornerstones on which Flink is built
● Streaming
● State
● Time
● Snapshots
Streaming
● A stream is a sequence of events
● Business data is always a stream: bounded or unbounded
● For Flink, batch processing is just a special case in the runtime
[Diagram: a timeline (past, now, future) with a bounded stream and an unbounded stream]
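To make the bounded/unbounded distinction concrete, here is a minimal plain-Python sketch (not Flink code). The names `running_count` and `unbounded_events` are made up for illustration. The same processing logic is applied to an input that ends and to one that never does.

```python
import itertools
from typing import Iterable, Iterator


def running_count(events: Iterable[str]) -> Iterator[int]:
    """Yield an updated count after every event, without waiting for the end."""
    count = 0
    for _ in events:
        count += 1
        yield count


def unbounded_events() -> Iterator[str]:
    """Hypothetical unbounded source: it never finishes on its own."""
    for n in itertools.count():
        yield f"event-{n}"


# Bounded stream: the input ends, so a final result exists.
bounded_result = list(running_count(["a", "b", "c"]))

# Unbounded stream: results keep arriving; only the first five are taken here.
first_five = list(itertools.islice(running_count(unbounded_events()), 5))
```

The processing function is identical in both cases; only the source decides whether the job ever finishes.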
Real-time services rely on stream processing
[Diagram: sources and sinks (Kafka, databases, key/value stores, files, apps) connected through real-time stream processing]
The Job Graph (or Topology)
[Diagram: a job graph whose nodes are operators and whose edges are connections]
Stream processing
• Parallel
• Forward
• Repartition
• Rebalance
[Diagram: a parallel SOURCE whose events are grouped by shape, a FILTER, and a COUNT that receives events grouped by color and keeps running counts (1, 2, 3, ...)]
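The connection patterns listed above can be sketched as small routing functions that pick the downstream parallel instance for each event. This is an illustrative plain-Python model, not Flink's implementation: the function names are hypothetical, and `zlib.crc32` is only a stand-in for a stable hash.

```python
import itertools
import zlib
from typing import Callable, Iterator


def forward(source_subtask: int) -> int:
    """Forward: the event stays in the same parallel instance."""
    return source_subtask


def repartition(key: str, parallelism: int) -> int:
    """Repartition: every event with the same key goes to the same instance.

    crc32 is only a stand-in for a stable hash; it is not what Flink uses.
    """
    return zlib.crc32(key.encode("utf-8")) % parallelism


def make_rebalancer(parallelism: int) -> Callable[[], int]:
    """Rebalance: spread events round-robin, ignoring their content."""
    targets: Iterator[int] = itertools.cycle(range(parallelism))
    return lambda: next(targets)


colors = ["blue", "green", "blue", "yellow", "green", "blue"]
by_color = [repartition(c, 2) for c in colors]
next_target = make_rebalancer(2)
round_robin = [next_target() for _ in colors]
```

Repartitioning by color is what lets a counting operator see all events of one color in one place, while rebalancing only evens out the load.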
Stream processing with SQL
INSERT INTO results
SELECT color, COUNT(*)
FROM events
WHERE color <> 'orange'
GROUP BY color;
[Diagram: events → WHERE color <> 'orange' → GROUP BY color → COUNT → results]
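On a stream, a query like the one above does not produce a single final table; the result is updated as events arrive. The following plain-Python sketch imitates that behavior. It is an assumption-laden illustration (the function `count_by_color` is hypothetical and no Flink API is involved).

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def count_by_color(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Mimic SELECT color, COUNT(*) ... WHERE color <> 'orange' GROUP BY color.

    Instead of one final table, emit an updated (color, count) row per input event.
    """
    counts: Dict[str, int] = defaultdict(int)
    for color in events:
        if color == "orange":  # WHERE color <> 'orange'
            continue
        counts[color] += 1  # GROUP BY color, COUNT(*)
        yield color, counts[color]


updates = list(count_by_color(["blue", "orange", "green", "blue"]))
```

Each emitted row supersedes the previous row for the same color, so a sink would treat the output as a stream of updates rather than appends.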
Flink’s APIs
[Diagram: layered APIs. The Table / SQL API sits on an Optimizer / Planner; it and the DataStream API build on the Low-Level Stream Operator API, which runs on the Apache Flink Runtime]
Runtime Architecture
Flink supports streaming and batch
Streaming
● Bounded or unbounded streams
● Entire pipeline must always be running
● Input must be processed as it arrives
● Results are reported as they become ready
● Failure recovery resumes from a recent snapshot
● Flink guarantees effectively exactly-once results despite out-of-order data and restarts due to failures, etc.
Batch
● Only bounded streams
● Execution proceeds in stages, running as needed
● Input may be pre-sorted by time and key
● Results are reported at the end of the job
● Failure recovery does a reset and full restart
● Effectively exactly-once guarantees are more straightforward
Streaming State
Time Snapshots
Stateful stream processing with Flink SQL
INSERT INTO results
SELECT color, COUNT(*)
FROM events
WHERE color <> 'orange'
GROUP BY color;
[Diagram: events → WHERE color <> 'orange' → GROUP BY color → COUNT → results]
Stateful stream processing with Flink SQL
● Counting requires state
[Diagram: events → WHERE color <> 'orange' → GROUP BY color → COUNT → results, with state attached to COUNT]
State
• Local
• Fast
• Fault tolerant
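A rough way to picture local state: each parallel instance of the counting operator owns the counters for its own keys and updates them in-process. The sketch below is plain Python under stated assumptions. `CountSubtask` and `route` are hypothetical names, and `zlib.crc32` merely stands in for whatever hashing the runtime uses.

```python
import zlib
from typing import Dict, List


class CountSubtask:
    """One parallel instance of the COUNT operator with its own local state."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {}  # key/value state, touched by one thread only

    def process(self, color: str) -> int:
        self.state[color] = self.state.get(color, 0) + 1
        return self.state[color]


def route(color: str, parallelism: int) -> int:
    """Stand-in for keyed partitioning (crc32 is an assumption, not Flink's hash)."""
    return zlib.crc32(color.encode("utf-8")) % parallelism


subtasks: List[CountSubtask] = [CountSubtask() for _ in range(2)]
for color in ["blue", "green", "blue", "yellow", "blue"]:
    subtasks[route(color, len(subtasks))].process(color)

# Each key lives in exactly one subtask, so no coordination is needed to update it.
owners = {
    c: [i for i, s in enumerate(subtasks) if c in s.state]
    for c in ["blue", "green", "yellow"]
}
```

Because a key is only ever touched by one instance, updates need no locks or remote calls, which is where the "local" and "fast" properties come from.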
Streaming State
Time Snapshots
Time
• Synchronize
• Wait
• Timeout
Event time (e.g., 09:05:44)
When the event was created at its original source.
Processing time (e.g., 09:08:01)
When the event is being processed. This time varies between applications.
Out-of-order event streams
● Streams are (roughly) ordered by time
[Diagram: a stream of events labeled with timestamps such as 10:10 and 10:14]
Coping with out-of-order events
[Diagram: an out-of-order stream, marking the event that will be read next and the events that follow]
Imagine a window counting events for the hour ending at 2:00. How long should this window wait before producing its results?
Watermarks measure progress of event time
● This watermark has been generated by assuming that the stream is at most 5 minutes out-of-order
● The watermark is the max timestamp seen so far, minus this out-of-orderness estimate: 1:50 - 5 minutes = 1:45
● A watermark is an assertion about the completeness of the stream: now this stream is complete up to 1:45
Imagine a window counting events for the hour ending at 2:00. How long should this window wait before producing its results?
It should wait for a watermark with a timestamp of at least 2:00.
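The watermark arithmetic described above (max timestamp seen so far, minus the out-of-orderness estimate) can be sketched in a few lines of plain Python. This is an illustration, not Flink's watermark generator; the helper names are hypothetical and all event times except 1:50 are made up.

```python
from typing import Iterable, Iterator, Optional, Tuple

MAX_OUT_OF_ORDERNESS = 5  # minutes, the assumption used in the slides


def minutes(clock_time: str) -> int:
    """Convert 'H:MM' to minutes so timestamps can be compared and subtracted."""
    hours, mins = clock_time.split(":")
    return int(hours) * 60 + int(mins)


def clock(total_minutes: int) -> str:
    """Convert minutes back to 'H:MM' for display."""
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


def watermarks(event_times: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event time, watermark) pairs: watermark = max timestamp seen - bound."""
    max_seen: Optional[int] = None
    for t in event_times:
        ts = minutes(t)
        max_seen = ts if max_seen is None else max(max_seen, ts)
        yield t, clock(max_seen - MAX_OUT_OF_ORDERNESS)


# 1:47 and 1:49 arrive after 1:50, i.e. out of order; the watermark never moves back.
trace = list(watermarks(["1:42", "1:50", "1:47", "1:49"]))
```

Once 1:50 has been seen, the watermark stays at 1:45 until a later timestamp arrives, which matches the 1:50 - 5 = 1:45 example.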
What are watermarks for?
They make things happen when the time is right.
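As a sketch of "making things happen when the time is right", the following plain-Python model counts events for the hour ending at 2:00 and produces its result only once the watermark reaches 2:00. The function name, the 5-minute bound, and the event times are assumptions for illustration. Events for the window that arrive after it has fired are counted as late and left out of the result.

```python
from typing import List, Optional, Tuple

WINDOW_END = 120          # 2:00, in minutes since midnight
MAX_OUT_OF_ORDERNESS = 5  # minutes


def run_window(event_times: List[int]) -> Tuple[Optional[int], int]:
    """Count events in [1:00, 2:00); return (fired count, number of late events)."""
    count = 0
    fired: Optional[int] = None
    late = 0
    max_seen = float("-inf")
    for ts in event_times:
        in_window = WINDOW_END - 60 <= ts < WINDOW_END
        if in_window:
            if fired is None:
                count += 1
            else:
                late += 1  # the window already produced its result
        max_seen = max(max_seen, ts)
        watermark = max_seen - MAX_OUT_OF_ORDERNESS
        if fired is None and watermark >= WINDOW_END:
            fired = count  # the stream is considered complete up to 2:00
    return fired, late


# Event times in minutes: 1:50, 2:01, 1:58, 2:05, 1:59 (made-up values).
result = run_window([110, 121, 118, 125, 119])
```

Here the window fires when the 2:05 event pushes the watermark to 2:00. The 1:58 event is still counted because it arrived before that point, while the 1:59 event arrives too late.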
The idle stream problem
● Streams that are idle do not advance the watermark
● This prevents windows from producing results
Solutions
● Balance the partitions so none are empty or idle, or
● Send keep-alive events, or
● Configure the watermarking to use idleness detection
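One way to see why an idle stream blocks progress: when an operator reads several partitions, its watermark is the minimum of theirs, so a silent partition holds everything back. The plain-Python sketch below models that, plus a simple idleness timeout. The function `combined_watermark` and its parameters are hypothetical, and times are in minutes.

```python
from typing import Dict, Optional


def combined_watermark(
    partition_watermarks: Dict[str, Optional[int]],
    last_event_at: Dict[str, int],
    now: int,
    idle_timeout: Optional[int] = None,
) -> Optional[int]:
    """Return the minimum watermark over all partitions that count as active.

    A partition with no watermark yet (None) blocks progress unless idleness
    detection is on and it has been silent for at least idle_timeout.
    """
    active = []
    for name, wm in partition_watermarks.items():
        silent_for = now - last_event_at[name]
        if idle_timeout is not None and silent_for >= idle_timeout:
            continue  # marked idle: ignored when computing the minimum
        if wm is None:
            return None  # an active partition has not reported anything yet
        active.append(wm)
    return min(active) if active else None


partition_wms = {"p0": 115, "p1": 60}   # p1 stopped receiving events long ago
last_seen = {"p0": 119, "p1": 65}
stuck = combined_watermark(partition_wms, last_seen, now=120)
unblocked = combined_watermark(partition_wms, last_seen, now=120, idle_timeout=30)
```

Without idleness detection the combined watermark stays at p1's old value; with a timeout, p1 is ignored and p0 alone drives progress.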
Watermarks
● Not needed for applications that only use wall-clock (processing) time
● Not needed for batch processing
● Are needed for triggering actions based on event-time, e.g., closing a window
● Are generated based on an assumption of how out of order the data might be
● Provide control over the tradeoff between completeness and latency
● Flink SQL drops late events; the DataStream API offers more control
● Allow for consistent, reproducible results
● Potentially idle sources require special attention
Streaming State
Time Snapshots
A checkpoint is an automatic snapshot created by Flink, primarily for the purpose of failure recovery
A savepoint is a manual snapshot created for some operational purpose (e.g., a stateful upgrade)
Snapshots
[Diagram: events → FILTER → GROUP BY color → COUNT → results, running as Source → Filter → Count by color → Sink with two parallel instances]
What each operator contributes to the snapshot:
● Source: offsets for some partitions; offsets for other partitions
● Filter: (empty)
● Count by color: counters for some colors; counters for other colors
● Sink: transaction ID
Taking a snapshot does NOT stop the world
Checkpoints and savepoints are created asynchronously, while the job continues to process events and produce results
Recovery
Because these are self-consistent, global snapshots
● Flink provides (effectively) exactly-once guarantees
● Recovery involves restarting the entire job from the most recent checkpoint
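The idea of recovering from a self-consistent snapshot can be sketched with a toy job in plain Python. Everything here is an illustrative assumption (the `CountingJob` class, the in-memory log standing in for a replayable source), and unlike Flink this sketch takes its snapshot synchronously and leaves sinks out.

```python
import copy
from typing import Dict, List


class CountingJob:
    """Toy job: reads events by offset and keeps a counter per color."""

    def __init__(self, log: List[str]) -> None:
        self.log = log                      # replayable source, like a Kafka partition
        self.offset = 0                     # source state
        self.counts: Dict[str, int] = {}    # operator state

    def process_next(self) -> None:
        color = self.log[self.offset]
        self.offset += 1
        if color != "orange":
            self.counts[color] = self.counts.get(color, 0) + 1

    def snapshot(self) -> dict:
        """Capture offset and counters together so they describe the same moment."""
        return {"offset": self.offset, "counts": copy.deepcopy(self.counts)}

    def restore(self, snap: dict) -> None:
        self.offset = snap["offset"]
        self.counts = copy.deepcopy(snap["counts"])


log = ["blue", "orange", "green", "blue", "green", "blue"]
job = CountingJob(log)
for _ in range(3):
    job.process_next()
checkpoint = job.snapshot()      # offset 3, counts {blue: 1, green: 1}
job.process_next()               # work done after the checkpoint...
job.restore(checkpoint)          # ...is rolled back by a simulated failure
while job.offset < len(log):
    job.process_next()           # events after the checkpoint are replayed
```

Because the offset and the counters are restored together, the replayed event is counted once in the final state, not twice, which is the sense in which the results are effectively exactly-once.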
Wrap-up
Streaming
Unfamiliar to many developers, but ultimately straightforward
State
Delightfully simple
● local
● key/value
● single-threaded
Event time and watermarks
Watermarks encapsulate something complex in one place: the sources
● how out-of-order?
● can it be idle?
State snapshots for recovery
Transparent to application developers
Where is the Flink community?
To subscribe to the mailing lists, or get an invite to Slack, see https://flink.apache.org/community/
Your Apache Flink® journey begins here
developer.confluent.io

More Related Content

What's hot

Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Flink Forward
 
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022HostedbyConfluent
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink Forward
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsFlink Forward
 
Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - B...
Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - B...Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - B...
Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - B...Flink Forward
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkFlink Forward
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkFabian Hueske
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberFlink Forward
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?confluent
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022Flink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Databricks
 
From Zero to Hero with Kafka Connect
From Zero to Hero with Kafka ConnectFrom Zero to Hero with Kafka Connect
From Zero to Hero with Kafka Connectconfluent
 
Diving into the Deep End - Kafka Connect
Diving into the Deep End - Kafka ConnectDiving into the Deep End - Kafka Connect
Diving into the Deep End - Kafka Connectconfluent
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewenconfluent
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 

What's hot (20)

Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!
 
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
CDC Stream Processing With Apache Flink With Timo Walther | Current 2022
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - B...
Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - B...Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - B...
Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - B...
 
Changelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache FlinkChangelog Stream Processing with Apache Flink
Changelog Stream Processing with Apache Flink
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache Flink
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
 
From Zero to Hero with Kafka Connect
From Zero to Hero with Kafka ConnectFrom Zero to Hero with Kafka Connect
From Zero to Hero with Kafka Connect
 
Diving into the Deep End - Kafka Connect
Diving into the Deep End - Kafka ConnectDiving into the Deep End - Kafka Connect
Diving into the Deep End - Kafka Connect
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 

Similar to Unlocking the Power of Apache Flink: An Introduction in 4 Acts

Making Sense of Apache Flink: A Fearless Introduction
Making Sense of Apache Flink: A Fearless IntroductionMaking Sense of Apache Flink: A Fearless Introduction
Making Sense of Apache Flink: A Fearless IntroductionHostedbyConfluent
 
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteAdvanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteStreamNative
 
Flink SQL: The Challenges to Build a Streaming SQL Engine
Flink SQL: The Challenges to Build a Streaming SQL EngineFlink SQL: The Challenges to Build a Streaming SQL Engine
Flink SQL: The Challenges to Build a Streaming SQL EngineHostedbyConfluent
 
Streaming SQL to unify batch and stream processing: Theory and practice with ...
Streaming SQL to unify batch and stream processing: Theory and practice with ...Streaming SQL to unify batch and stream processing: Theory and practice with ...
Streaming SQL to unify batch and stream processing: Theory and practice with ...Fabian Hueske
 
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache FlinkUnifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache FlinkDataWorks Summit/Hadoop Summit
 
Unbounded bounded-data-strangeloop-2016-monal-daxini
Unbounded bounded-data-strangeloop-2016-monal-daxiniUnbounded bounded-data-strangeloop-2016-monal-daxini
Unbounded bounded-data-strangeloop-2016-monal-daxiniMonal Daxini
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...Flink Forward
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward
 
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...C4Media
 
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드confluent
 
When Streaming Needs Batch With Konstantin Knauf | Current 2022
When Streaming Needs Batch With Konstantin Knauf | Current 2022When Streaming Needs Batch With Konstantin Knauf | Current 2022
When Streaming Needs Batch With Konstantin Knauf | Current 2022HostedbyConfluent
 
Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconN Masahiro
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingApache Apex
 
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...Ververica
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
 
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data ArtisansApache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data ArtisansEvention
 

Similar to Unlocking the Power of Apache Flink: An Introduction in 4 Acts (20)

Making Sense of Apache Flink: A Fearless Introduction
Making Sense of Apache Flink: A Fearless IntroductionMaking Sense of Apache Flink: A Fearless Introduction
Making Sense of Apache Flink: A Fearless Introduction
 
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 KeynoteAdvanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
Advanced Stream Processing with Flink and Pulsar - Pulsar Summit NA 2021 Keynote
 
Flink SQL: The Challenges to Build a Streaming SQL Engine
Flink SQL: The Challenges to Build a Streaming SQL EngineFlink SQL: The Challenges to Build a Streaming SQL Engine
Flink SQL: The Challenges to Build a Streaming SQL Engine
 
Streaming SQL to unify batch and stream processing: Theory and practice with ...
Streaming SQL to unify batch and stream processing: Theory and practice with ...Streaming SQL to unify batch and stream processing: Theory and practice with ...
Streaming SQL to unify batch and stream processing: Theory and practice with ...
 
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache FlinkUnifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
Unifying Stream, SWL and CEP for Declarative Stream Processing with Apache Flink
 
Flink. Pure Streaming
Flink. Pure StreamingFlink. Pure Streaming
Flink. Pure Streaming
 
Unbounded bounded-data-strangeloop-2016-monal-daxini
Unbounded bounded-data-strangeloop-2016-monal-daxiniUnbounded bounded-data-strangeloop-2016-monal-daxini
Unbounded bounded-data-strangeloop-2016-monal-daxini
 
Zurich Flink Meetup
Zurich Flink MeetupZurich Flink Meetup
Zurich Flink Meetup
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
 
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
Have Your Cake and Eat It Too -- Further Dispelling the Myths of the Lambda A...
 
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
Confluent Workshop Series: ksqlDB로 스트리밍 앱 빌드
 
When Streaming Needs Batch With Konstantin Knauf | Current 2022
When Streaming Needs Batch With Konstantin Knauf | Current 2022When Streaming Needs Batch With Konstantin Knauf | Current 2022
When Streaming Needs Batch With Konstantin Knauf | Current 2022
 
Gcp dataflow
Gcp dataflowGcp dataflow
Gcp dataflow
 
Fluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at KubeconFluentd and Distributed Logging at Kubecon
Fluentd and Distributed Logging at Kubecon
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...
2018-04 Kafka Summit London: Stephan Ewen - "Apache Flink and Apache Kafka fo...
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
 
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data ArtisansApache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
 
Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Unlocking the Power of Apache Flink: An Introduction in 4 Acts

  • 1. Unlocking the Power of Apache Flink: An Introduction in 4 Acts. David Anderson, Software Practice Lead, Confluent; Apache Flink Committer
  • 2. Today’s consumers expect real-time services
  • 3. Real-time services rely on stream processing [Diagram: Real-time Data (A Sale, A Shipment, A Trade, A Customer Experience) → Real-time Stream Processing → Rich Front-End Customer Experiences, Real-Time Backend Operations]
  • 4. Driving business value with Apache Flink. Real-time analytics: Continuously produce and update results which are displayed and delivered to users as real-time data streams are consumed ● Ad/campaign performance ● Content performance ● Quality monitoring of Telco networks ● Usage metering and billing. Event-driven applications: Recognize patterns and react to incoming events by triggering computations, state updates, or external actions ● Fraud detection ● Anomaly detection ● Business process monitoring ● Geo-fencing. Streaming data pipelines: Real-time data pipelines that continuously ingest, enrich, and transform data streams, loading them into destination systems for timely action (vs. batch processing) ● Continuous ETL ● Real-time search index building ● ML pipelines ● Data lake ingestion
  • 5. Developers are choosing Flink because of its performance and rich feature set. Scalability & high performance: Flink supports stream processing workloads at tremendous scale. Language flexibility: Flink supports Java, Python, & SQL, enabling developers to work in their language of choice. Unified processing: Flink supports stream processing, batch processing, and ad-hoc analytics through one technology. Fault tolerance & high availability: Flink's checkpointing mechanism provides exactly-once guarantees automatically. Flink is a top 5 Apache project and has a very active community.
  • 6. The four cornerstones on which Flink is built: Streaming, State, Time, Snapshots
  • 7. Streaming ● A stream is a sequence of events ● Business data is always a stream: bounded or unbounded ● For Flink, batch processing is just a special case in the runtime [Diagram: timeline from past through now to future, showing a bounded stream and an unbounded stream]
  • 8. Real-time services rely on stream processing [Diagram: Sources → Real-time Stream Processing → Sinks; systems shown: Kafka, Databases, Key/Value Stores, Files, Apps]
  • 10. The Job Graph (or Topology)
  • 11. The Job Graph (or Topology) OPERATOR CONNECTION
  • 12. Stream processing ● Parallel ● Forward ● Repartition ● Rebalance [Diagram: SOURCE, with events grouped by shape]
  • 14. Stream processing ● Parallel ● Forward ● Repartition ● Rebalance [Diagram: FILTER, followed by a repartition that groups by color]
  • 15. Stream processing ● Parallel ● Forward ● Repartition ● Rebalance [Diagram: COUNT, with a running count per color]
  • 16. Stream processing with SQL: INSERT INTO results SELECT color, COUNT(*) FROM events WHERE color <> 'orange' GROUP BY color; [Diagram: events → WHERE color <> 'orange' → GROUP BY color → COUNT → results]
  • 21. Flink’s APIs [Diagram layers: DataStream API; Table / SQL API with its Optimizer / Planner; Low-Level Stream Operator API; Apache Flink Runtime]
  • 24. Flink supports streaming and batch. Streaming: ● Bounded or unbounded streams ● Entire pipeline must always be running ● Input must be processed as it arrives ● Results are reported as they become ready ● Failure recovery resumes from a recent snapshot ● Flink guarantees effectively exactly-once results despite out-of-order data and restarts due to failures, etc. Batch: ● Only bounded streams ● Execution proceeds in stages, running as needed ● Input may be pre-sorted by time and key ● Results are reported at the end of the job ● Failure recovery does a reset and full restart ● Effectively exactly-once guarantees are more straightforward
  • 26. Stateful stream processing with Flink SQL: INSERT INTO results SELECT color, COUNT(*) FROM events WHERE color <> 'orange' GROUP BY color; [Diagram: events → WHERE color <> 'orange' → GROUP BY color → COUNT → results]
  • 28. Stateful stream processing with Flink SQL ● Counting requires state [Diagram: events → WHERE color <> 'orange' → GROUP BY color → COUNT → results]
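
To make "counting requires state" concrete, here is a minimal plain-Python sketch of what the query above has to do for each event. It is not Flink code: the function name `count_by_color` and the sample events are assumptions made for illustration, and a dictionary stands in for Flink's keyed state.

```python
from collections import defaultdict


def count_by_color(events, excluded="orange"):
    """Yield an updated (color, count) result for every event that passes the filter."""
    counts = defaultdict(int)  # the state: one counter per key (color)
    for color in events:
        if color == excluded:  # WHERE color <> 'orange' (stateless)
            continue
        counts[color] += 1  # GROUP BY color, COUNT(*) (stateful)
        yield color, counts[color]


# Hypothetical input stream, processed one event at a time.
updates = list(count_by_color(["blue", "orange", "green", "blue", "blue"]))
for color, count in updates:
    print(color, count)
```

The filter needs no memory of earlier events, while the count must remember one number per color for as long as the job runs. That per-key memory is the state that has to survive failures.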
  • 32. Time ● Synchronize ● Wait ● Timeout. Event time (09:05:44): When the event was created at its original source. Processing time (09:08:01): When the event is being processed. This time varies between applications.
  • 33. Out-of-order event streams ● Streams are (roughly) ordered by time [Diagram: event timestamps, e.g. 10:10 and 10:14]
  • 34. Coping with out-of-order events: This event will be read next
  • 35. Coping with out-of-order events: These events follow
  • 36. Coping with out-of-order events: Imagine a window counting events for the hour ending at 2:00. How long should this window wait before producing its results?
  • 37. Watermarks measure progress of event time ● This watermark has been generated by assuming that the stream is at most 5 minutes out-of-order ● The watermark is the max timestamp seen so far, minus this out-of-orderness estimate: 1:50 - 5 = 1:45 ● A watermark is an assertion about the completeness of the stream: now this stream is complete up to 1:45
  • 40. Watermarks measure progress of event time Imagine a window counting events for the hour ending at 2:00. How long should this window wait before producing its results? It should wait for a watermark with a timestamp of at least 2:00.
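
The watermark arithmetic from these slides can be sketched in a few lines of plain Python. This follows the slide's definition (max timestamp seen so far, minus the out-of-orderness estimate) and is not Flink's implementation; the helper names and the sample timestamps are assumptions chosen to reproduce the 1:50 - 5 = 1:45 example.

```python
def minutes(hhmm):
    """Convert an 'H:MM' string to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def watermarks(timestamps, bound):
    """Return the watermark after each event: max timestamp seen so far minus the bound."""
    max_seen = None
    result = []
    for ts in timestamps:
        max_seen = ts if max_seen is None else max(max_seen, ts)
        result.append(max_seen - bound)
    return result


# Hypothetical, roughly ordered stream of event timestamps.
events = [minutes(s) for s in ["1:42", "1:50", "1:47", "1:58", "2:03", "1:59", "2:06"]]
wms = watermarks(events, bound=5)

# The window for the hour ending at 2:00 fires at the first watermark >= 2:00.
window_end = minutes("2:00")
fires_at = next(i for i, wm in enumerate(wms) if wm >= window_end)
print(wms, fires_at)
```

In this sketch the window does not fire when the 2:03 event arrives, because the watermark is then only 1:58. It fires on the 2:06 event, which pushes the watermark past 2:00. A larger bound waits longer and tolerates more disorder; a smaller bound lowers latency but risks treating more events as late.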
  • 41. What are watermarks for? They make things happen when the time is right.
  • 43. The idle stream problem ● Streams that are idle do not advance the watermark ● This prevents windows from producing results. Solutions: ● Balance the partitions so none are empty or idle, or ● Send keep-alive events, or ● Configure the watermarking to use idleness detection
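
A small plain-Python sketch shows why one idle partition stalls everything, under the assumption that an operator's watermark is the minimum of the watermarks of its inputs. The function name, partition names, and values are hypothetical; in the DataStream API, idleness detection is configured with `WatermarkStrategy.withIdleness`, which this sketch only imitates.

```python
def combined_watermark(partition_watermarks, idle=frozenset()):
    """Return the minimum watermark over the partitions not marked idle, or None if all are idle."""
    active = [wm for name, wm in partition_watermarks.items() if name not in idle]
    return min(active) if active else None


# Watermarks in minutes since midnight; p2 stopped receiving events at 1:00.
per_partition = {"p0": 121, "p1": 118, "p2": 60}

stalled = combined_watermark(per_partition)  # held back by p2
advancing = combined_watermark(per_partition, idle={"p2"})  # p2 ignored
print(stalled, advancing)
```

With p2 included, the combined watermark stays at 1:00 and a window ending at 2:00 never fires. Once p2 is marked idle, the watermark advances to 1:58 and is driven by the active partitions alone.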
  • 44. Watermarks ● Not needed for applications that only use wall-clock (processing) time ● Not needed for batch processing ● Are needed for triggering actions based on event-time, e.g., closing a window ● Are generated based on an assumption of how out of order the data might be ● Provide control over the tradeoff between completeness and latency ● Flink SQL drops late events; the DataStream API offers more control ● Allow for consistent, reproducible results ● Potentially idle sources require special attention
  • 46. A checkpoint is an automatic snapshot created by Flink, primarily for the purpose of failure recovery. A savepoint is a manual snapshot created for some operational purpose (e.g., a stateful upgrade).
  • 49. Snapshots [Diagram: Source → Filter → Count by color → Sink, for the pipeline events → FILTER → GROUP BY color → COUNT → results] ● Source: offsets for some partitions in one parallel instance, offsets for other partitions in another ● Filter: nothing to record ● Count by color: counters for some colors in one parallel instance, counters for other colors in another ● Sink: transaction ID
  • 54. Taking a snapshot does NOT stop the world: Checkpoints and savepoints are created asynchronously, while the job continues to process events and produce results
  • 55. Recovery: Because these are self-consistent, global snapshots ● Flink provides (effectively) exactly-once guarantees ● Recovery involves restarting the entire job from the most recent checkpoint
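
The recovery idea can be illustrated with a toy, single-process Python sketch: a snapshot records the source offset together with the counters, so restoring both and replaying from that offset gives the same counts as a run with no failure. The class `CountJob` and its methods are hypothetical names, not Flink APIs, and the sketch leaves out the sink's transaction handling as well as parallelism and asynchronous snapshotting.

```python
from collections import defaultdict


class CountJob:
    """Toy pipeline: replayable source -> filter -> count by color."""

    def __init__(self, events):
        self.events = events  # replayable input, like a log partition
        self.offset = 0  # source state
        self.counts = defaultdict(int)  # operator state

    def process(self, n):
        """Process up to n further events."""
        end = min(self.offset + n, len(self.events))
        while self.offset < end:
            color = self.events[self.offset]
            self.offset += 1
            if color != "orange":
                self.counts[color] += 1

    def snapshot(self):
        """Capture offset and counters together so they stay consistent."""
        return {"offset": self.offset, "counts": dict(self.counts)}

    def restore(self, snap):
        self.offset = snap["offset"]
        self.counts = defaultdict(int, snap["counts"])


events = ["blue", "orange", "green", "blue", "blue", "green"]

job = CountJob(events)
job.process(3)
checkpoint = job.snapshot()
job.process(2)  # work done after the checkpoint...
job.restore(checkpoint)  # ...is discarded by the simulated failure
job.process(len(events))  # replay from the checkpointed offset
recovered = dict(job.counts)

reference = CountJob(events)
reference.process(len(events))
print(recovered, dict(reference.counts))
```

Each event affects the recovered counters exactly once because the offset and the counters are rolled back together. Restoring only one of them would either double-count or skip the events processed between the checkpoint and the failure.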
  • 57. Streaming: Unfamiliar to many developers, but ultimately straightforward. State: Delightfully simple ● local ● key/value ● single-threaded. Event time and watermarks: Watermarks encapsulate something complex in one place, the sources ● how out-of-order? ● can it be idle? State snapshots for recovery: Transparent to application developers
  • 58. Where is the Flink community? To subscribe to the mailing lists, or get an invite to Slack, see https://flink.apache.org/community/
  • 61. Your Apache Flink® journey begins here developer.confluent.io