When it absolutely, positively,
has to be there
Reliability Guarantees in Apache Kafka
@jeffholoman @gwenshap
Kafka
• High Throughput
• Low Latency
• Scalable
• Centralized
• Real-time
“If data is the lifeblood of high technology, Apache
Kafka is the circulatory system”
--Todd Palino
Kafka SRE @ LinkedIn
If Kafka is a critical piece of our pipeline
• Can we be 100% sure that our data will get there?
• Can we lose messages?
• How do we verify?
• Whose fault is it?
Distributed Systems
• Things fail
• Systems are designed to tolerate failure
• We must expect failures and design our code and configure our systems to handle them
Data Flow
[Diagram: the produce path from client machine to broker machine. An application thread hands data to the Kafka client, which sends asynchronously through the O/S socket buffer and NIC, across the network, into the broker's O/S socket buffer, then to the broker, the page cache, and finally disk. An ack or exception flows back to the client's callback. ✗ marks each point where a failure can occur.]
Data Flow
[Diagram: the same picture for consuming. Data flows from the broker's page cache (a miss goes to disk) through the O/S socket buffer and NIC, across the network, to the Kafka client and the application thread; consumed offsets are stored in ZooKeeper or Kafka. Again, ✗ marks each point where a failure can occur.]
Replication is your friend
• Kafka protects against failures by replicating data
• The unit of replication is the partition
• One replica is designated as the Leader
• Follower replicas fetch data from the leader
• The leader holds the list of “in-sync” replicas
Replication and ISRs
[Diagram: a producer writes to topic my_topic (Partitions: 3, Replicas: 3), spread across brokers 100, 101, and 102, each holding a copy of partitions 0, 1, and 2:]
Partition: 0   Leader: 100   ISR: 101,102
Partition: 1   Leader: 101   ISR: 100,102
Partition: 2   Leader: 102   ISR: 101,100
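You can check leaders and ISRs with the kafka-topics tool. A hedged sketch, reusing the broker IDs from the diagram and an illustrative ZooKeeper address; note that real output lists the leader inside the ISR as well:

$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my_topic
Topic: my_topic  Partition: 0  Leader: 100  Replicas: 100,101,102  Isr: 100,101,102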
ISR
• Two things make a replica in-sync:
– Lag behind the leader
• replica.lag.time.max.ms – a replica that didn’t fetch, or is behind
• replica.lag.max.messages – has gone away in 0.9
– Connection to Zookeeper
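As a sketch, the surviving lag setting lives in the broker's server.properties; the value below is the common default of that era, not a recommendation:

replica.lag.time.max.ms=10000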
Terminology
• Acked
– Producers will not retry sending
– Depends on the producer setting
• Committed
– Consumers can read
– Only when the message got to all ISRs
• replica.lag.time.max.ms
– How long can a dead replica prevent consumers from reading?
Replication
• acks = all
– Only waits for in-sync replicas to reply
[Diagram: Replicas 1, 2, and 3 each hold message 100.]
Replication
• Replica 3 stopped replicating for some reason
[Diagram: Replica 3 holds only message 100; Replicas 1 and 2 hold 100 and 101. Message 100 is acked under acks = all and is “committed”; message 101 is acked under acks = 1 but is not “committed”.]
Replication
[Diagram: Replica 3 holds message 100; Replicas 1 and 2 hold 100 and 101.]
• One replica drops out of ISR, or goes offline
• All messages are now acked and committed
Replication
• 2nd replica drops out, or is offline
[Diagram: Replica 3 holds 100; Replica 2 holds 100 and 101; Replica 1, now alone, goes on to accept 102, 103, and 104.]
Replication
[Diagram: same state: Replica 3 at 100, Replica 2 at 100 and 101, Replica 1 at 100 through 104.]
• Now we’re in trouble
Replication
• If Replica 2 or 3 comes back online before the leader, you will lose data
[Diagram: Replica 3 at 100, Replica 2 at 100 and 101, Replica 1 at 100 through 104. All of those are “acked” and “committed”.]
So what to do
• Disable Unclean Leader Election
– unclean.leader.election.enable = false
• Set replication factor
– default.replication.factor = 3
• Set minimum ISRs
– min.insync.replicas = 2
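As a broker-level sketch, the three settings above go in server.properties (0.9-era names, as used on this slide):

unclean.leader.election.enable=false
default.replication.factor=3
min.insync.replicas=2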
Warning
• min.insync.replicas is applied at the topic level
• If a topic was created before the server-level change, you must alter its configuration manually
• Before 0.9.0 you must alter the topic manually in any case (KAFKA-2114)
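A hedged example of fixing up an existing topic (0.8/0.9-era syntax; the ZooKeeper address is illustrative):

$ bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic my_topic --config min.insync.replicas=2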
Replication
• Replication = 3
• Min ISR = 2
[Diagram: Replicas 1, 2, and 3 each hold message 100.]
Replication
[Diagram: Replica 3 holds 100; Replicas 1 and 2 hold 100 and 101.]
• One replica drops out of ISR, or goes offline
Replication
[Diagram: Replica 3 holds 100; Replica 2 holds 100 and 101; only Replica 1 remains, and messages 102, 103, and 104 sit buffered in the producer.]
• 2nd replica fails out, or is out of sync
• With Min ISR = 2 no longer satisfied, new messages buffer in the producer instead of being committed
Producer Internals
• Producer sends batches of messages to a buffer
[Diagram: application threads call send(); messages M0 through M3 accumulate into Batches 1 through 3 in the buffer; a sender drains batches to the broker; the response updates the Future and fires the callback with metadata or an exception, and failed batches may be retried.]
Basics
• Durability can be configured with the producer configuration request.required.acks
– 0: The message is written to the network (buffer)
– 1: The message is written to the leader
– all: The producer gets an ack after all ISRs receive the data; the message is committed
• Make sure the producer doesn’t just throw messages away!
– block.on.buffer.full = true
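A minimal producer-config sketch matching these bullets; the broker address and serializers are illustrative, and block.on.buffer.full applies to the 0.8/0.9 producer:

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("acks", "all");                  // wait for all in-sync replicas
props.put("block.on.buffer.full", "true"); // block instead of dropping messages
KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);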
“New” Producer
• All calls are non-blocking and async
• 2 options for checking for failures:
– Immediately block for the response: send().get()
– Do follow-up work in a Callback; close the producer after an error threshold
• Be careful about buffering these failures. Future work? KAFKA-1955
• Don’t forget to close the producer! producer.close() will block until in-flight requests complete
• retries (new producer config) defaults to 0
• message.send.max.retries (old producer config) defaults to 3
• In-flight requests could lead to message re-ordering
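Hedged sketches of the two options; the record and the error handling are illustrative:

// Option 1: block on every send. Simple, but serializes sends.
producer.send(record).get();

// Option 2: check failures in a callback. Fast, but track errors yourself.
producer.send(record, new Callback() {
  @Override
  public void onCompletion(RecordMetadata metadata, Exception e) {
    if (e != null) {
      // count or log the failure; close the producer past some threshold
    }
  }
});

producer.close(); // blocks until in-flight requests complete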
Consumer
• Three choices for Consumer API
– Simple Consumer
– High Level Consumer (ZookeeperConsumer)
– New KafkaConsumer
New Consumer – attempt #1
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "10000");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
  ConsumerRecords<String, String> records = consumer.poll(100);
  for (ConsumerRecord<String, String> record : records) {
    processAndUpdateDB(record);
  }
}
Commits automatically every 10 seconds. But what if we crash after 8 seconds?
New Consumer – attempt #2
props.put("enable.auto.commit", "false");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
  ConsumerRecords<String, String> records = consumer.poll(100);
  for (ConsumerRecord<String, String> record : records) {
    processAndUpdateDB(record);
    consumer.commitSync();
  }
}
What are you really committing?
New Consumer – attempt #3
props.put("enable.auto.commit", "false");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
  ConsumerRecords<String, String> records = consumer.poll(100);
  for (ConsumerRecord<String, String> record : records) {
    processAndUpdateDB(record);
    TopicPartition tp = new TopicPartition(record.topic(), record.partition());
    OffsetAndMetadata oam = new OffsetAndMetadata(record.offset() + 1);
    consumer.commitSync(Collections.singletonMap(tp, oam));
  }
}
Is this fast enough?
New Consumer – attempt #4
props.put("enable.auto.commit", "false");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
int counter = 0;
while (true) {
  ConsumerRecords<String, String> records = consumer.poll(500);
  for (ConsumerRecord<String, String> record : records) {
    counter++;
    processAndUpdateDB(record);
    if (counter % 100 == 0) {
      TopicPartition tp = new TopicPartition(record.topic(), record.partition());
      OffsetAndMetadata oam = new OffsetAndMetadata(record.offset() + 1);
      consumer.commitSync(Collections.singletonMap(tp, oam));
    }
  }
}
Almost.
Consumer Offsets
[Diagram: partitions P0 P2 P3 P4 P5 P6, with the committed offset marked in each partition: Commit]
Consumer Offsets
[Diagram: the same partitions consumed by Threads 1 through 4 of one consumer; records processed after the last commit are re-read after a crash: Duplicates]
Rebalance Listener
public class MyRebalanceListener implements ConsumerRebalanceListener {
  @Override
  public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
  }
  @Override
  public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
    commitOffsets();
  }
}
consumer.subscribe(Arrays.asList("foo", "bar"), new MyRebalanceListener());
Careful! This method will need to know the topic, partition, and offset of the last record you got; see the sketch below.
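One hedged way to give the listener that information: have the poll loop record the latest offset per partition in a map, and commit that map on revocation (commitOffsets() above stands in for exactly this):

Map<TopicPartition, OffsetAndMetadata> currentOffsets = new HashMap<TopicPartition, OffsetAndMetadata>();

// in the poll loop, after processing each record:
currentOffsets.put(new TopicPartition(record.topic(), record.partition()),
                   new OffsetAndMetadata(record.offset() + 1));

// in onPartitionsRevoked:
consumer.commitSync(currentOffsets);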
At Least Once Consuming
1. Commit your own offsets – set enable.auto.commit = false
2. Use a Rebalance Listener to limit duplicates
3. Make sure you commit only what you are done processing
4. Note: the new consumer is single-threaded – use one consumer per thread
Exactly Once Semantics
• At most once is easy
• At least once is not bad either – commit only after you are 100% sure the data is safe
• Exactly once is tricky
– Commit data and offsets in one transaction
– Idempotent producer
Using External Store
• Don’t use commitSync()
• Implement your own “commit” that saves both data and
offsets to external store.
• Use the RebalanceListener to find the correct offset
Seeking the right offset
public class SaveOffsetsOnRebalance implements ConsumerRebalanceListener {
  private Consumer<?, ?> consumer;

  @Override
  public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
    // save each partition's offset in an external store, using custom code not described here
    for (TopicPartition partition : partitions)
      saveOffsetInExternalStore(partition, consumer.position(partition));
  }

  @Override
  public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
    // read the offsets back from the external store, using custom code not described here
    for (TopicPartition partition : partitions)
      consumer.seek(partition, readOffsetFromExternalStore(partition));
  }
}
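Wiring it up is the same subscribe call as before; this assumes a constructor that captures the consumer, which the snippet above leaves out:

consumer.subscribe(Arrays.asList("foo", "bar"), new SaveOffsetsOnRebalance(consumer));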
Monitoring for Data Loss
• Monitor for producer errors – watch the retry numbers
• Monitor consumer lag – MaxLag or via offsets
• Standard schema:
– Each message should contain a timestamp and the originating service and host
• Each producer can report message counts and offsets to a special topic
• A “monitoring consumer” reports message counts to another special topic
• “Important consumers” also report message counts
• Reconcile the results
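A sketch of the count-reporting idea; the “audit” topic name and record layout are made up for illustration:

// each producer periodically reports what it sent, keyed by origin
producer.send(new ProducerRecord<String, String>(
    "audit", host + "/" + service, timestamp + "," + messageCount));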
Be Safe, Not Sorry
• acks = all
• block.on.buffer.full = true
• retries = MAX_INT
• ( max.in.flight.requests.per.connection = 1 )
• Call producer.close()
• Replication factor >= 3
• min.insync.replicas = 2
• unclean.leader.election.enable = false
• enable.auto.commit = false
• Commit after processing
• Monitor!
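Pulled together as one hedged client-config sketch (0.9-era names, values straight from this checklist; broker-side settings shown as comments):

// producer
props.put("acks", "all");
props.put("retries", String.valueOf(Integer.MAX_VALUE));
props.put("max.in.flight.requests.per.connection", "1");
props.put("block.on.buffer.full", "true");
// consumer
props.put("enable.auto.commit", "false");
// broker/topic: replication factor >= 3, min.insync.replicas=2,
// unclean.leader.election.enable=false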
Editor's Notes

1. Low-level diagram (not talking about producer/consumer design yet; maybe this is too low-level): show the flow of a network send -> O/S socket -> NIC -> network -> NIC -> O/S socket buffer -> socket -> internal message flow / socket server -> response back to the client, and how writes get persisted to disk, including O/S buffers and async writes. Then overlay the places where things can fail.
2. Highlight boxes with a different color.
3. Kafka exposes its binary TCP protocol via a Java API, which is what we'll be discussing here. Everything in the box is what happens inside the producer. Generally speaking, you have an application thread, or threads, that take individual messages and "send" them to Kafka. Under the covers, these messages are batched up where possible to amortize the overhead of the send, stored in a buffer, and communicated over to Kafka. After Kafka has completed its work, a response is returned for each message. This happens asynchronously, using Java's concurrent API. The response is either an exception or a metadata record. If metadata is returned, which contains the offset, partition, and topic, then things are good and we continue processing. If an error is returned, the producer will automatically retry the failed message, up to a configurable number of times or amount of time. When this exception occurs and retries are enabled, the retries go right back to the start of the batches being prepared for sending to Kafka.
4. Commit every 10 seconds, but we don't really have any control over what's processed, and this can lead to duplicates.
5. If you are doing too much work, commits don't count as heartbeats.
6. So let's say we have auto-commit enabled, and we are chugging along, counting on the consumer to commit our offsets for us. This is great because we don't have to code anything or think about the frequency of commits and its impact on throughput. Life is good. But now we've lost a thread or a process, and we don't really know where we are in the processing, because the last auto-commit committed things we hadn't actually written to disk.
7. So now we're in a situation where we think we've read all of our data, but we have gaps. The same risk applies if we lose a partition or broker and get a new leader.