Click-Through Example for Flink’s KafkaConsumer Checkpointing

More from Robert Metzger (20)


  1. Click-Through Example for Flink’s KafkaConsumer Checkpointing
  2. This toy example reads from a Kafka topic with two partitions, each containing the messages “a”, “b”, “c”, … Both partition offsets start at 0, and the counter in the map operator is initialized to 0. [Diagram state: consumer offsets = 0, 0; counter = 0; ZooKeeper offsets: partition 0 = 0, partition 1 = 0; checkpoint coordinator: nothing pending or completed.]
  3. The Kafka consumer starts reading messages from partition 0. Message “a” is in flight, and the offset of the first consumer has advanced to 1. [Diagram state: consumer offsets = 1, 0; counter = 0; ZooKeeper offsets: partition 0 = 0, partition 1 = 0.]
  4. Message “a” arrives at the map operator, and the counter is set to 1. Both consumers read their next records (“b” from partition 0 and “a” from partition 1), and the offsets advance accordingly. In parallel, the checkpoint coordinator decides to trigger a checkpoint at the source … [Diagram state: consumer offsets = 2, 1; counter = 1; ZooKeeper offsets: partition 0 = 0, partition 1 = 0.]
  5. The source has created a snapshot of its state (“offsets = 2, 1”), which is now stored in the checkpoint coordinator. The sources emitted a checkpoint barrier after messages “a” and “b”. [Diagram state: consumer offsets = 3, 1; counter = 2; checkpoint in coordinator: offsets = 2, 1; ZooKeeper offsets: partition 0 = 0, partition 1 = 0.]
  6. The map operator has received checkpoint barriers from both sources. It checkpoints its state (counter = 3) in the coordinator. Meanwhile, the consumers keep reading more data from the Kafka partitions. [Diagram state: consumer offsets = 3, 2; counter = 3; checkpoint in coordinator: offsets = 2, 1 and counter = 3; ZooKeeper offsets: partition 0 = 0, partition 1 = 0.]
  7. The checkpoint coordinator notifies the Kafka consumer that the checkpoint has been completed. The consumer then commits the checkpoint’s offsets to ZooKeeper. Note that Flink does not rely on the Kafka offsets in ZooKeeper for restoring from failures. [Diagram state: consumer offsets = 3, 2; counter = 4; checkpoint in coordinator: offsets = 2, 1 and counter = 3; ZooKeeper offsets: partition 0 = 0, partition 1 = 0.]
  8. The checkpoint’s offsets are now persisted in ZooKeeper. External tools such as the Kafka Offset Checker can see the lag of the consumer group. [Diagram state: consumer offsets = 3, 2; counter = 4; checkpoint in coordinator: offsets = 2, 1 and counter = 3; ZooKeeper offsets: partition 0 = 2, partition 1 = 1.]
  9. Processing advances further. [Diagram state: consumer offsets = 4, 2; counter = 5; checkpoint in coordinator: offsets = 2, 1 and counter = 3; ZooKeeper offsets: partition 0 = 2, partition 1 = 1.]
  10. A failure has occurred (such as a worker failure). [Diagram state, unchanged from the previous slide: consumer offsets = 4, 2; counter = 5; checkpoint in coordinator: offsets = 2, 1 and counter = 3; ZooKeeper offsets: partition 0 = 2, partition 1 = 1.]
  11. The checkpoint coordinator resets all operators that participate in checkpointing to the last completed checkpoint. The Kafka sources restart from offsets 2 and 1, and the counter’s value is 3. [Diagram state: consumer offsets = 2, 1; counter = 3; checkpoint in coordinator: offsets = 2, 1 and counter = 3; ZooKeeper offsets: partition 0 = 2, partition 1 = 1.]
  12. The system continues processing. The counter’s value is consistent across the worker failure. [Diagram state: consumer offsets = 3, 1; counter = 3; message “c” in flight; checkpoint in coordinator: offsets = 2, 1 and counter = 3; ZooKeeper offsets: partition 0 = 2, partition 1 = 1.]
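
The walkthrough above can be replayed as a small simulation. This is only a sketch of the bookkeeping shown on the slides, not Flink's implementation or API: the class `ToyJob`, its method names, and the single shared `in_flight` queue are all hypothetical. In particular, a single barrier in one queue stands in for the per-source barriers that the map operator waits for on slide 6.

```python
from dataclasses import dataclass, field

# Two partitions holding the same messages, as on the slides.
PARTITIONS = [list("abcde"), list("abcde")]
BARRIER = object()  # hypothetical marker standing in for a checkpoint barrier


@dataclass
class ToyJob:
    """Hypothetical stand-in for the job on the slides; not Flink's API."""
    offsets: list = field(default_factory=lambda: [0, 0])    # consumer state
    counter: int = 0                                         # map operator state
    committed: list = field(default_factory=lambda: [0, 0])  # offsets in ZooKeeper
    pending: dict = field(default_factory=dict)              # checkpoint in progress
    completed: dict = field(default_factory=dict)            # last completed checkpoint
    in_flight: list = field(default_factory=list)            # records between source and map

    def read(self, partition):
        """A consumer reads its next record and advances its offset."""
        self.in_flight.append(PARTITIONS[partition][self.offsets[partition]])
        self.offsets[partition] += 1

    def trigger_checkpoint(self):
        """Sources snapshot their offsets and emit a barrier behind earlier records."""
        self.pending = {"offsets": list(self.offsets)}
        self.in_flight.append(BARRIER)

    def process(self):
        """The map operator handles the next in-flight element."""
        item = self.in_flight.pop(0)
        if item is BARRIER:
            self.pending["counter"] = self.counter  # snapshot operator state
        else:
            self.counter += 1

    def notify_complete(self):
        """Checkpoint is complete; the consumer commits its offsets to ZooKeeper."""
        self.completed, self.pending = self.pending, {}
        self.committed = list(self.completed["offsets"])

    def fail_and_restore(self):
        """Drop in-flight data and reset state from the completed checkpoint."""
        self.in_flight.clear()
        self.offsets = list(self.completed["offsets"])
        self.counter = self.completed["counter"]


def replay():
    job = ToyJob()
    job.read(0)                                                  # slide 3
    job.process(); job.read(0); job.read(1)                      # slide 4
    job.trigger_checkpoint()                                     # slide 4
    job.read(0); job.process()                                   # slide 5
    job.process(); job.process(); job.read(1)                    # slide 6
    job.process(); job.notify_complete()                         # slides 7-8
    job.read(0); job.process()                                   # slide 9
    job.fail_and_restore()                                       # slides 10-11
    job.read(0)                                                  # slide 12
    return job


if __name__ == "__main__":
    job = replay()
    print(job.offsets, job.counter, job.committed)
```

Run as a script, this prints `[3, 1] 3 [2, 1]`, matching slide 12: the consumers resumed from the checkpointed offsets 2 and 1 rather than from the offsets 4 and 2 reached before the failure, and the counter restarted from 3. Note that `fail_and_restore` reads only `completed`, never `committed`, which mirrors the remark on slide 7 that the offsets in ZooKeeper are not used for recovery.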
