Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Piotr Nowojski
piotr@data-artisans.com
@PiotrNowojski
“Hit me, baby, just one
time”
Building End-to-End Exactly-Once
Appli...
Overview of delivering
guarantees
2
No guarantees and failures
3
At-Least-Once and failures
4
Exactly-Once and failures
5
How to achieve Exactly-Once
6
Exactly-Once on a single node
▪ Simple transactions
▪ Write data in transaction
▪ Include read offsets in transaction
▪ On...
Exactly-Once on a single node
▪ What about application state?
▪ For example running average
▪ Include in the transaction, ...
Exactly-Once in a distributed system
▪ Persistent communication channels
9
Exactly-Once in a distributed system
▪ Persistent communication channels
▪ Allow to re-process messages from last
committe...
Exactly-Once in Flink
1
1
Exactly-Once two phase commit
▪ Can we do better? Yes! Two phase commit
protocol
12
Input data Process 1 Process 2 Output ...
Exactly-Once two phase commit
13
Input data Process 1 Process 2 Output data
State
backend
Job
manager
Exactly-Once two phase commit
14
Input data Process 1 Process 2 Output data
State
backend
Job
manager
Start external
trans...
Exactly-Once two phase commit
15
Input data Process 1 Process 2 Output data
State
backend
Job
manager
Inject checkpoint
ba...
Exactly-Once two phase commit
16
Input data Process 1 Process 2 Output data
State
backend
Job
manager
Inject checkpoint
ba...
Exactly-Once two phase commit
17
Input data Process 1 Process 2 Output data
State
backend
Job
manager
Inject checkpoint
ba...
Exactly-Once two phase commit
▪ Without side effects (only internal state)
•Example: calculating running average
•State ma...
Exactly-Once two phase commit
19
Input data Process 1 Process 2 Output data
State
backend
Job
manager
Inject checkpoint
ba...
Exactly-Once two phase commit
20
Input data Process 1 Process 2 Output data
State
backend
Job
manager
Notify checkpoint
co...
Exactly-Once two phase commit
21
Input data Process 1 Process 2 Output data
State
backend
Job
manager
Notify checkpoint
co...
Exactly-Once two phase commit
▪ Once all processes successfully complete pre-
commit, commit is issued
▪ If at least one p...
Two phase commit example
implementation
▪ Begin transaction - create a temporary file
▪ Write element - write data to that...
Exactly-Once two phase commit
▪ In case of system crash, we can restore
state of the application to the latest
snapshot
▪ ...
TwoPhaseCommitSinkFunction
▪ Flink’s snapshots/checkpoints are very
similar to two phase commit
▪ There is a TwoPhaseCommi...
Exactly-Once producers in Flink
▪ BucketingSink
▪ Pravega connector
▪ Kafka 0.11 connector
▪ Kafka 0.11 introduced transac...
Summarizon
▪ Many ways to achieve Exactly-Once
▪ Flink checkpointing is similar to the two
phase commit protocol
▪ No mate...
Questions?
2
8
29
Thank you!
@PiotrNowojski
@ApacheFlink
@dataArtisans
We are hiring!
data-artisans.com/careers
Upcoming SlideShare
Loading in …5
×

of

Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 1 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 2 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 3 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 4 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 5 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 6 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 7 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 8 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 9 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 10 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 11 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 12 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 13 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 14 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 15 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 16 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 17 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 18 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 19 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 20 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 21 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 22 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 23 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 24 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 25 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 26 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 27 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 28 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 29 Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink Slide 30
Upcoming SlideShare
What to Upload to SlideShare
Next

6 Likes

Share

Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink

Getting data in and out of Flink is by far the most important aspect, and an everyday typical requirement of building Flink applications. Doing so in an end-to-end exactly-once manner, however, can be tricky. Being able to reliably consume data from the outside world without any duplicate processing and guaranteeing consistent distributed state, and at the same time provide computed results back to the outside world also without introducing duplicates, is crucial for the consistency and correctness of applications built upon stream processors. In this talk, we will talk about how end-to-end exactly-once guarantees can be achieved with Apache Flink. We will talk about Flink’s checkpointing mechanism, and how exactly to leverage it when consuming and producing data from your Flink streaming pipelines. In particular, we will be having a detailed review on how our supported connectors do so, with the aim to provide reference implementations for your own custom consumers and sinks.

Flink Forward Berlin 2017: Piotr Nowojski - "Hit me, baby, just one time" - Building End-to-End Exactly-Once Applications with Flink

  1. 1. Piotr Nowojski piotr@data-artisans.com @PiotrNowojski “Hit me, baby, just one time” Building End-to-End Exactly-Once Applications with Flink
  2. 2. Overview of delivering guarantees 2
  3. 3. No guarantees and failures 3
  4. 4. At-Least-Once and failures 4
  5. 5. Exactly-Once and failures 5
  6. 6. How to achieve Exactly-Once 6
  7. 7. Exactly-Once on a single node ▪ Simple transactions ▪ Write data in transaction ▪ Include read offsets in transaction ▪ On success - commit transaction ▪ On failure - abort transaction and restart from last committed read offset 7
  8. 8. Exactly-Once on a single node ▪ What about application state? ▪ For example running average ▪ Include in the transaction, by adding as a separate file/table just before commit 8
  9. 9. Exactly-Once in a distributed system ▪ Persistent communication channels 9
  10. 10. Exactly-Once in a distributed system ▪ Persistent communication channels ▪ Allow to re-process messages from last committed read offsets ▪ High costs of communication ▪ Processes operate independent of each other 10
  11. 11. Exactly-Once in Flink 1 1
  12. 12. Exactly-Once two phase commit ▪ Can we do better? Yes! Two phase commit protocol 12 Input data Process 1 Process 2 Output data
  13. 13. Exactly-Once two phase commit 13 Input data Process 1 Process 2 Output data State backend Job manager
  14. 14. Exactly-Once two phase commit 14 Input data Process 1 Process 2 Output data State backend Job manager Start external transaction
  15. 15. Exactly-Once two phase commit 15 Input data Process 1 Process 2 Output data State backend Job manager Inject checkpoint barrier (1) Pre-commit (checkpoint starts)
  16. 16. Exactly-Once two phase commit 16 Input data Process 1 Process 2 Output data State backend Job manager Inject checkpoint barrier (1) Snapshot offsets & state (2) Pass checkpoint barrier (2) Pre-commit
  17. 17. Exactly-Once two phase commit 17 Input data Process 1 Process 2 Output data State backend Job manager Inject checkpoint barrier (1) Snapshot offsets & state (2) Pass checkpoint barrier (2) Snapshot state (3) Pre-commit external transaction (3) Pre-commit
  18. 18. Exactly-Once two phase commit ▪ Without side effects (only internal state) •Example: calculating running average •State managed by Flink ▪ With side effects •Operator must manage external state on it’s own 18
  19. 19. Exactly-Once two phase commit 19 Input data Process 1 Process 2 Output data State backend Job manager Inject checkpoint barrier (1) Snapshot offsets & state (2) Pass checkpoint barrier (2) Snapshot state (3) Pre-commit external transaction (3) Pre-commit
  20. 20. Exactly-Once two phase commit 20 Input data Process 1 Process 2 Output data State backend Job manager Notify checkpoint completed (1) Commit
  21. 21. Exactly-Once two phase commit 21 Input data Process 1 Process 2 Output data State backend Job manager Notify checkpoint completed (1) Commit external transaction (2) Commit
  22. 22. Exactly-Once two phase commit ▪ Once all processes successfully complete pre- commit, commit is issued ▪ If at least one pre-commit fails others are aborted ▪ After a successful pre-commit, commit MUST be guaranteed to eventually succeed ▪ This guarantees that all processes agree on the final outcome 22
  23. 23. Two phase commit example implementation ▪ Begin transaction - create a temporary file ▪ Write element - write data to that file ▪ Pre-commit - flush the file ▪ And begin new transaction for subsequent writes ▪ Commit - move the file to a target directory ▪ Can increase latency 23
  24. 24. Exactly-Once two phase commit ▪ In case of system crash, we can restore state of the application to the latest snapshot ▪ We could/should abort previous “pending” transactions •Delete temporary files 24
  25. 25. TwoPhaseCommitSinkFunction ▪ Flink’s snapshots/checkpoints are very similar to two phase commit ▪ There is a TwoPhaseCommitSinkFunction which maps checkpoint calls to beginTransaction, preCommit, commit and abort calls 25
  26. 26. Exactly-Once producers in Flink ▪ BucketingSink ▪ Pravega connector ▪ Kafka 0.11 connector ▪ Kafka 0.11 introduced transactions ▪ Implemented on top of the TwoPhaseCommitSinkFunction ▪ Low overhead 26
  27. 27. Summarizon ▪ Many ways to achieve Exactly-Once ▪ Flink checkpointing is similar to the two phase commit protocol ▪ No materialization of the data in transit ▪ Lower latency ▪ Higher throughput 27
  28. 28. Questions? 2 8
  29. 29. 29 Thank you! @PiotrNowojski @ApacheFlink @dataArtisans
  30. 30. We are hiring! data-artisans.com/careers
  • andrewlin114

    Jan. 14, 2021
  • FengXu56

    May. 22, 2019
  • jhaoniu

    Jul. 13, 2018
  • superzhu

    Apr. 19, 2018
  • larryliuqing

    Mar. 5, 2018
  • javafan

    Sep. 28, 2017

Getting data in and out of Flink is by far the most important aspect, and an everyday typical requirement of building Flink applications. Doing so in an end-to-end exactly-once manner, however, can be tricky. Being able to reliably consume data from the outside world without any duplicate processing and guaranteeing consistent distributed state, and at the same time provide computed results back to the outside world also without introducing duplicates, is crucial for the consistency and correctness of applications built upon stream processors. In this talk, we will talk about how end-to-end exactly-once guarantees can be achieved with Apache Flink. We will talk about Flink’s checkpointing mechanism, and how exactly to leverage it when consuming and producing data from your Flink streaming pipelines. In particular, we will be having a detailed review on how our supported connectors do so, with the aim to provide reference implementations for your own custom consumers and sinks.

Views

Total views

1,516

On Slideshare

0

From embeds

0

Number of embeds

142

Actions

Downloads

2

Shares

0

Comments

0

Likes

6

×