FLINK SNAPSHOTS: A COMPREHENSIVE GUIDE FOR NEW USERS
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Flink Snapshots
A Comprehensive Guide for New Users
Danny Cranmer
Principal Engineer at AWS
Apache Flink PMC Member
Agenda
1. Stateful processing recap
2. Flink Checkpoints
3. State backends
4. Common problems
Simple Flink Example
SELECT SUM(clicks) FROM MyKafkaTopic
Input: 2, 5, 7, 3, 4, 2
Output: 2, 7, 14, 17, 21, 23
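The query keeps a running total, so every output depends on all inputs seen so far. A minimal sketch of that stateful behaviour in plain Java follows. It is not Flink API, and the class name `RunningSum` is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for the stateful SUM operator; not Flink API.
final class RunningSum {
    private long sum = 0; // operator state: total of all inputs seen so far

    // Processes one input record and returns the updated total.
    long add(long clicks) {
        sum += clicks;
        return sum;
    }

    // Feeds every input through a fresh operator and collects each emitted total.
    static List<Long> outputsFor(long... inputs) {
        RunningSum operator = new RunningSum();
        List<Long> outputs = new ArrayList<>();
        for (long input : inputs) {
            outputs.add(operator.add(input));
        }
        return outputs;
    }
}
```

Calling `RunningSum.outputsFor(2, 5, 7, 3, 4, 2)` yields 2, 7, 14, 17, 21, 23. If the `sum` field is lost on failure, the outputs can no longer be reproduced without re-reading the whole input.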
Simple Flink Example
[Figure: in-flight state. Inputs 3, 4, 2 still pending; outputs 2, 7, 14 emitted; running sum 14; source offsets o1 to o6, currently at o4.]
Exactly once processing
At least once processing - Duplicates
At most once processing – Dropped Records
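The three guarantees can be compared with a small simulation. This is a simplified sketch in plain Java, not Flink's recovery code. It assumes a running-sum operator, a snapshot taken before the record at index `checkpointAt`, and a failure before the record at index `failAt`. The class, enum, and parameter names are illustrative.

```java
// Simplified model of recovery after a failure; not Flink API.
final class DeliveryGuarantees {
    enum Mode { EXACTLY_ONCE, AT_LEAST_ONCE, AT_MOST_ONCE }

    // Sums inputs[from] up to, but not including, inputs[to].
    private static long sumRange(long[] inputs, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += inputs[i];
        }
        return sum;
    }

    // Returns the final total after one failure and recovery in the given mode.
    static long finalSum(long[] inputs, int checkpointAt, int failAt, Mode mode) {
        if (checkpointAt < 0 || checkpointAt > failAt || failAt > inputs.length) {
            throw new IllegalArgumentException("need 0 <= checkpointAt <= failAt <= inputs.length");
        }
        long checkpointedSum = sumRange(inputs, 0, checkpointAt);
        long sumAtFailure = sumRange(inputs, 0, failAt);
        switch (mode) {
            case EXACTLY_ONCE:
                // State and source position are restored together.
                return checkpointedSum + sumRange(inputs, checkpointAt, inputs.length);
            case AT_LEAST_ONCE:
                // Source rewinds but the state keeps the replayed records: duplicates.
                return sumAtFailure + sumRange(inputs, checkpointAt, inputs.length);
            default:
                // State is restored but the source does not rewind: dropped records.
                return checkpointedSum + sumRange(inputs, failAt, inputs.length);
        }
    }
}
```

With inputs 2, 5, 7, 3, 4, 2, `checkpointAt = 2` and `failAt = 4`, the modes give 23 (exactly once), 33 (7 and 3 counted twice), and 13 (7 and 3 dropped).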
Flink Checkpoints
Barrier Alignment
Checkpoint Lifecycle
Checkpointing Configuration
execution.checkpointing.interval
execution.checkpointing.min-pause
execution.checkpointing.timeout
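How the three settings interact can be sketched with a simplified timing model. It assumes the next checkpoint starts one interval after the previous start, but no sooner than the minimum pause after the previous checkpoint finished, and that a checkpoint running longer than the timeout is discarded. The class and method names are illustrative, and the real scheduler differs in detail.

```java
// Simplified timing model for the three settings above; not Flink API.
final class CheckpointSchedule {
    // Earliest start of the next checkpoint: one interval after the previous start,
    // but at least minPause after the previous checkpoint ended.
    static long nextStartMillis(long previousStart, long previousEnd,
                                long intervalMillis, long minPauseMillis) {
        return Math.max(previousStart + intervalMillis, previousEnd + minPauseMillis);
    }

    // True when a checkpoint started at `start` has exceeded the timeout by `now`.
    static boolean isTimedOut(long start, long now, long timeoutMillis) {
        return now - start > timeoutMillis;
    }
}
```

In this model, a slow checkpoint pushes the next one back: the min-pause term dominates whenever the previous checkpoint finishes late in the interval.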
Checkpoint Statistics
Demo
Checkpoint vs Savepoint
State Backends
State Backend Selection
state size < sum(Task Manager memory) / ?
    ? HashMap
    : RocksDB
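The rule of thumb above can be written as a small helper. This is a sketch in plain Java, not Flink API. The slide leaves the divisor open, so the code takes it as a parameter rather than assuming a value. The class, enum, and parameter names are illustrative.

```java
// Illustrative helper for the selection rule; not Flink API.
final class StateBackendChoice {
    enum Backend { HASHMAP, ROCKSDB }

    // Picks HashMap when the state fits in the given fraction of total
    // Task Manager memory, RocksDB otherwise. The divisor is supplied by the
    // caller because the slide does not fix a value.
    static Backend choose(long stateSizeBytes, long totalTaskManagerMemoryBytes, double divisor) {
        if (divisor <= 0) {
            throw new IllegalArgumentException("divisor must be positive");
        }
        return stateSizeBytes < totalTaskManagerMemoryBytes / divisor
                ? Backend.HASHMAP
                : Backend.ROCKSDB;
    }
}
```

A larger divisor leaves more headroom for everything else that shares Task Manager memory, so it tips the choice towards RocksDB sooner.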
Bottlenecks
Buffer Debloating
Buffer Debloating - Disabled
Buffer Debloating - Enabled
Buffer Debloating
taskmanager.network.memory.buffer-debloat.enabled: true
taskmanager.network.memory.buffer-debloat.target: 1s
Unaligned Checkpoints
- Checkpoint barriers jump the queue
- Records in buffers stored in the checkpoint
- Not supported for savepoints
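The first two points can be illustrated with a toy model of a single input channel. This is a sketch in plain Java, not Flink internals. The channel is a list whose index 0 is the next element to be processed, and the `BARRIER` marker and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one input channel; not Flink API.
final class UnalignedBarrier {
    static final String BARRIER = "BARRIER";

    // Records queued ahead of the barrier. With unaligned checkpoints these are
    // stored in the checkpoint instead of being processed before the snapshot.
    static List<String> overtakenRecords(List<String> channel) {
        int barrierIndex = channel.indexOf(BARRIER);
        if (barrierIndex < 0) {
            return new ArrayList<>();
        }
        return new ArrayList<>(channel.subList(0, barrierIndex));
    }

    // Channel contents after the barrier jumps to the head of the queue.
    static List<String> afterOvertake(List<String> channel) {
        List<String> result = new ArrayList<>(channel);
        if (result.remove(BARRIER)) {
            result.add(0, BARRIER);
        }
        return result;
    }
}
```

For a channel holding r1, r2, BARRIER, r3, the barrier moves to the front and r1 and r2 become part of the checkpoint. More buffered data therefore means a larger checkpoint.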
Unaligned Checkpoints
env.getCheckpointConfig().enableUnalignedCheckpoints();
execution.checkpointing.unaligned: true
Incremental Checkpoints
env.setStateBackend(new RocksDBStateBackend(filebackend, true));
state.backend.incremental: true
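The idea behind incremental checkpoints can be sketched as a set difference: only files that were not part of an earlier checkpoint need uploading. The plain Java below is a simplified model, not the RocksDB backend's implementation. It assumes state files are immutable once written, and the class, method, and file names are illustrative.

```java
import java.util.Set;
import java.util.TreeSet;

// Simplified model of an incremental upload; not Flink API.
final class IncrementalCheckpoint {
    // Returns the files in the current state that no earlier checkpoint uploaded.
    static Set<String> filesToUpload(Set<String> previouslyUploaded, Set<String> currentFiles) {
        Set<String> delta = new TreeSet<>(currentFiles);
        delta.removeAll(previouslyUploaded);
        return delta;
    }
}
```

If the previous checkpoint holds sst-1 and sst-2 and the current state is sst-1, sst-2, sst-3, only sst-3 is uploaded. After a compaction rewrites files, the new files are uploaded in full, so checkpoint size can vary between runs.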
Load Test/Skew
Thank you!

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 

Recently uploaded (20)

Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 

Flink Snapshots: A Comprehensive Guide for New Users

  • 1. Flink Snapshots: A Comprehensive Guide for New Users. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. Flink Snapshots: A Comprehensive Guide for New Users. Danny Cranmer, Principal Engineer at AWS, Apache Flink PMC Member
  • 3. Agenda: 1. Stateful processing recap; 2. Flink Checkpoints; 3. State backends; 4. Common problems
  • 4. Simple Flink Example: SELECT SUM(clicks) FROM MyKafkaTopic. Input: 2, 5, 7, 3, 4, 2. Output (running sum): 2, 7, 14, 17, 21, 23.
  • 5. Simple Flink Example: in-flight state (diagram; labels o1 to o6)
  • 6. Exactly once processing
  • 7. At least once processing: Duplicates
  • 8. At least once processing: Duplicates
  • 9. At most once processing: Dropped Records
  • 10. At most once processing: Dropped Records
  • 11. Agenda: 1. Stateful processing recap; 2. Flink Checkpoints; 3. State backends; 4. Common problems
  • 12. Flink Checkpoints
  • 13. Flink Checkpoints
  • 14. Flink Checkpoints
  • 15. Flink Checkpoints
  • 16. Barrier Alignment
  • 17. Barrier Alignment
  • 18. Barrier Alignment
  • 19. Checkpoint Lifecycle
  • 20. Checkpoint Lifecycle
  • 21. Checkpoint Lifecycle
  • 22. Checkpoint Lifecycle
  • 23. Checkpointing Configuration: execution.checkpointing.interval, execution.checkpointing.min-pause, execution.checkpointing.timeout
  • 24. Checkpoint Statistics Demo
  • 25. Checkpoint vs Savepoint
  • 26. Agenda: 1. Stateful processing recap; 2. Flink Checkpoints; 3. State backends; 4. Common problems
  • 27. State Backends
  • 28. State Backend Selection: state size < sum(Task Manager memory) / ? ? HashMap : RocksDB
  • 29. Agenda: 1. Stateful processing recap; 2. Flink Checkpoints; 3. State backends; 4. Common problems
  • 30. Bottlenecks
  • 31. Buffer Debloating
  • 32. Buffer Debloating: Disabled
  • 33. Buffer Debloating: Disabled
  • 34. Buffer Debloating: Disabled
  • 35. Buffer Debloating: Disabled
  • 36. Buffer Debloating: Disabled
  • 37. Buffer Debloating: Disabled
  • 38. Buffer Debloating: Enabled
  • 39. Buffer Debloating: Enabled
  • 40. Buffer Debloating: Enabled
  • 41. Buffer Debloating: taskmanager.network.memory.buffer-debloat.enabled: true, taskmanager.network.memory.buffer-debloat.target: 1s
  • 42. Unaligned Checkpoints: checkpoint barriers jump the queue; records in buffers are stored in the checkpoint; not supported for savepoints
  • 43. Unaligned Checkpoints: env.getCheckpointConfig().enableUnalignedCheckpoints(); (Java API) or execution.checkpointing.unaligned: true (configuration)
  • 44. Incremental Checkpoints
  • 45. Incremental Checkpoints: env.setStateBackend(new RocksDBStateBackend(filebackend, true)); (Java API) or state.backend.incremental: true (configuration)
  • 46. Load Test/Skew
  • 47. Load Test/Skew
  • 48. Thank you! Danny Cranmer, Principal Engineer at AWS, Apache Flink PMC Member
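
The processing guarantees on slides 6 to 10 can be illustrated with a toy model of the running SUM from slide 4. This is a minimal sketch in plain Java, not Flink code: the class and method names are hypothetical, and it assumes the three guarantees differ only in whether the restored source offset and the restored operator state come from the same point in the stream.

```java
/** Toy model of recovery for a running SUM; plain Java, not the Flink API. */
final class DeliverySemanticsSketch {

    /** Operator state (the running sum) after the first n records. */
    static int stateAfter(int[] input, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += input[i];
        }
        return sum;
    }

    /** Restores (offset, state), replays the source from that offset, returns the final sum. */
    static int recover(int[] input, int restoredOffset, int restoredState) {
        int sum = restoredState;
        for (int i = restoredOffset; i < input.length; i++) {
            sum += input[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] clicks = {2, 5, 7, 3, 4, 2}; // input from the SUM(clicks) example

        // Offset and state captured together after 3 records: nothing lost, nothing repeated.
        int exactlyOnce = recover(clicks, 3, stateAfter(clicks, 3));

        // State is newer than the offset: records 3 and 4 are counted twice.
        int atLeastOnce = recover(clicks, 3, stateAfter(clicks, 5));

        // Offset is newer than the state: records 3 and 4 are never counted.
        int atMostOnce = recover(clicks, 5, stateAfter(clicks, 3));

        System.out.println("exactly once:  " + exactlyOnce);  // 23
        System.out.println("at least once: " + atLeastOnce);  // 30
        System.out.println("at most once:  " + atMostOnce);   // 16
    }
}
```

Only the first case reproduces the correct total of 23, which is why a checkpoint has to capture source positions and operator state as one consistent snapshot.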
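
How the three settings on slide 23 interact can be sketched as simple arithmetic. The model below is a simplification that assumes at most one checkpoint runs at a time; the class, the method names, and the durations used in `main` are hypothetical and are not Flink defaults or Flink API.

```java
/** Simplified model of checkpoint scheduling; plain Java, not the Flink API. */
final class CheckpointScheduleSketch {

    /**
     * Earliest start of the next checkpoint: the interval is measured from the
     * previous trigger, the minimum pause from the previous completion.
     */
    static long nextTriggerMillis(long lastTriggerMillis, long lastCompletionMillis,
                                  long intervalMillis, long minPauseMillis) {
        return Math.max(lastTriggerMillis + intervalMillis,
                        lastCompletionMillis + minPauseMillis);
    }

    /** A checkpoint still running longer than the timeout is abandoned. */
    static boolean timedOut(long triggerMillis, long nowMillis, long timeoutMillis) {
        return nowMillis - triggerMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        long interval = 60_000; // illustrative value for execution.checkpointing.interval
        long minPause = 30_000; // illustrative value for execution.checkpointing.min-pause
        long timeout = 600_000; // illustrative value for execution.checkpointing.timeout

        // Fast checkpoint (done after 10 s): the interval decides the next start.
        System.out.println(nextTriggerMillis(0, 10_000, interval, minPause)); // 60000

        // Slow checkpoint (done after 50 s): the minimum pause delays the next start.
        System.out.println(nextTriggerMillis(0, 50_000, interval, minPause)); // 80000

        // A checkpoint triggered at 0 and still running at 600001 ms has timed out.
        System.out.println(timedOut(0, 600_001, timeout)); // true
    }
}
```

The second case shows why a minimum pause matters when checkpoints are slow: it guarantees the job some time for regular processing between snapshots.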
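
The rule of thumb on slide 28 can be written as a small helper. The slide leaves the divisor unspecified ("/ ?"), so this sketch takes it as a parameter; the value 4.0 used in `main` is purely a placeholder, not a recommendation, and the class and enum names are hypothetical.

```java
/** Rule-of-thumb state backend choice from slide 28; plain Java, not the Flink API. */
final class StateBackendChoiceSketch {

    enum Backend { HASHMAP, ROCKSDB }

    /**
     * Picks HashMap when the state fits in a fraction of total Task Manager memory,
     * RocksDB otherwise. The divisor is an assumption supplied by the caller.
     */
    static Backend choose(long stateSizeBytes, long[] taskManagerMemoryBytes, double divisor) {
        if (divisor <= 0) {
            throw new IllegalArgumentException("divisor must be positive");
        }
        long totalMemory = 0;
        for (long memory : taskManagerMemoryBytes) {
            totalMemory += memory;
        }
        return stateSizeBytes < totalMemory / divisor ? Backend.HASHMAP : Backend.ROCKSDB;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        long[] taskManagers = {4 * gib, 4 * gib}; // two Task Managers with 4 GiB each
        double divisor = 4.0;                     // placeholder for the slide's "?"

        System.out.println(choose(1 * gib, taskManagers, divisor)); // HASHMAP
        System.out.println(choose(3 * gib, taskManagers, divisor)); // ROCKSDB
    }
}
```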