Stress the point about application-level resends. Encourage people to rely on producer retries and not to re-send messages from their apps.
New concept: control messages. Mention that commit markers are special messages which log the producer id and the result of the transaction. These messages are not passed on to the application; the client interprets them and acts accordingly.
Mention ‘read_uncommitted’. Mention that the buffering is broker-side.
Mention transaction expiration.
Specify that you don’t need to use any of the new features to get these performance savings.
Limiting max.in.flight.requests.per.connection is required for idempotence. It will cause a slowdown because you now have a sync producer.
Introducing Exactly Once Semantics To Apache Kafka
Jason Gustafson, Guozhang Wang, Sriram Subramaniam, and Apurva Mehta
• Kafka’s existing delivery semantics.
• Why did we improve them?
• What’s new?
• How do you use it?
• Stream processing is becoming an ever bigger part of the
• Apache Kafka is the heart of the streams platform.
• Strengthening Kafka’s semantics expands the universe of
A motivating example
A peer-to-peer lending platform which processes micro-loans
• Sequence numbers and producer ids:
• enable de-dup
• are in the log.
• Hence de-dup works transparently across leader
• Will not de-dup application-level resends.
• Works transparently – no API changes.
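The de-dup idea above can be sketched with a toy log. This is a minimal illustration, not Kafka's implementation: the `PartitionLog` class and its methods are hypothetical names, and the real broker also rejects out-of-order sequence numbers, which is left out here.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy partition log that de-duplicates on (producer id, sequence number)."""
    records: list = field(default_factory=list)
    last_seq: dict = field(default_factory=dict)  # producer id -> last appended sequence

    def append(self, producer_id, seq, value):
        """Append unless this sequence was already written; return True if appended."""
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # duplicate caused by a retry: drop it
        self.records.append((producer_id, seq, value))
        self.last_seq[producer_id] = seq
        return True

    @classmethod
    def from_records(cls, records):
        """Rebuild de-dup state from the log alone, as a new leader would have to."""
        log = cls()
        for producer_id, seq, value in records:
            log.append(producer_id, seq, value)
        return log


log = PartitionLog()
log.append(producer_id=7, seq=0, value="loan-1")  # first attempt
log.append(producer_id=7, seq=0, value="loan-1")  # producer retry: same sequence, dropped
log.append(producer_id=7, seq=1, value="loan-1")  # application-level resend: new sequence, kept
print([value for _, _, value in log.records])     # ['loan-1', 'loan-1']
```

The retry is dropped because it reuses sequence 0, while the application-level resend gets a fresh sequence number and is stored twice. That is why relying on producer retries is preferable to re-sending from the application.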
Some notes on consuming transactions
• Two ‘isolation levels’: read_committed and read_uncommitted.
• Messages read in offset order.
• read_committed consumers read to the point where there
are no open transactions.
• Transaction coordinator and transaction log maintain
• Use the new producer APIs for transactions.
• Consumers can read only committed messages.
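The read_committed behaviour described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the function name `read_committed`, the tuple layout of the log, and the idea of scanning the whole log up front are all hypothetical, and every data record is assumed to belong to a transaction. Commit and abort markers are modelled as control entries that are never returned to the caller.

```python
def read_committed(log):
    """Return values visible to a read_committed consumer, in offset order.

    Each log entry is (producer_id, kind, value), where kind is
    "data", "commit" or "abort". Markers stand in for control messages.
    """
    open_since = {}  # producer id -> offset of the first record of its open transaction
    pending = {}     # producer id -> data offsets in its open transaction
    committed = {}   # data offset -> True if its transaction committed, False if aborted
    for offset, (pid, kind, _) in enumerate(log):
        if kind == "data":
            open_since.setdefault(pid, offset)
            pending.setdefault(pid, []).append(offset)
        else:
            # A marker closes the producer's open transaction.
            for data_offset in pending.pop(pid, []):
                committed[data_offset] = (kind == "commit")
            open_since.pop(pid, None)
    # Read only up to the first offset that still belongs to an open transaction.
    stable_end = min(open_since.values(), default=len(log))
    return [log[o][2] for o in range(stable_end)
            if log[o][1] == "data" and committed[o]]


log = [
    ("A", "data", "a1"),    # offset 0
    ("B", "data", "b1"),    # offset 1
    ("A", "commit", None),  # offset 2
    ("B", "abort", None),   # offset 3
    ("C", "data", "c1"),    # offset 4, transaction still open
    ("A", "data", "a2"),    # offset 5
    ("A", "commit", None),  # offset 6
]
print(read_committed(log))  # ['a1']
```

Here `b1` is skipped because its transaction aborted, and `a2` is held back even though it is committed, because it sits past the first record of producer C's still-open transaction.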
What’s new, part 3: Performance boost!
• Up to +20% producer throughput
• Up to +50% consumer throughput
• Up to -20% disk utilization
• Savings start when you batch
• Details: https://bit.ly/kafka-eos-perf
A visual comparison with 7 records, 10 bytes each
• With a batch size of 2, the new format starts saving
• Savings are maximal for large batches of small
• Hence higher throughput when IO bound.
• Works as soon as you upgrade to the new format.
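The break-even point can be checked with some back-of-the-envelope arithmetic. The overhead constants below are assumptions chosen for illustration, not figures taken from the slides; see the linked details for measured numbers. The sketch only shows the shape of the trade-off: a fixed per-batch header in the new format versus a larger per-message header in the old one.

```python
OLD_PER_RECORD_OVERHEAD = 34  # assumed header bytes per message, old format
NEW_PER_BATCH_OVERHEAD = 53   # assumed header bytes per batch, new format
NEW_PER_RECORD_OVERHEAD = 7   # assumed header bytes per record, new format


def old_format_size(n_records, payload_bytes):
    """Bytes used when every message carries its own full header."""
    return n_records * (OLD_PER_RECORD_OVERHEAD + payload_bytes)


def new_format_size(n_records, payload_bytes):
    """Bytes used when one batch header is shared by all records in the batch."""
    return NEW_PER_BATCH_OVERHEAD + n_records * (NEW_PER_RECORD_OVERHEAD + payload_bytes)


for n in (1, 2, 7):
    print(n, old_format_size(n, 10), new_format_size(n, 10))
# 1 44 70
# 2 88 87
# 7 308 172
```

Under these assumed overheads, a single 10-byte record is larger in the new format, two records roughly break even, and seven records in one batch take a little over half the space.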