Apache Kafka: New Features That You Might Not Know About
Yaroslav Tkachenko
Software Architect at Activision
Apache Kafka Versions
• 0.10.0.0 (May 2016), then 0.10.0.1 to 0.10.2.2
• 0.11.0.0 (June 2017), then 0.11.0.1 to 0.11.0.3
• 1.0.0 (November 2017), then 1.0.1 to 1.1.1
• 2.0.0 (July 2018), then 2.0.1 to 2.1.0
0.11: New Message Format
Message Format v2
Record Batch:
  ...
  magic: 2
  …
  attributes:
    …
    bit 4: isTransactional
    …
  producerId: int64
  producerEpoch: int16
  records: [Record]
Record:
  ...
  key: byte[]
  value: byte[]
  headers: [Header]
Header:
  ...
  headerKey: String
  value: byte[]
0.11: Headers
Message Headers
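// org.apache.kafka.common.header.Header
public interface Header {
    String key();
    byte[] value();
}

// RecordHeader lives in org.apache.kafka.common.header.internals.
// Header values are raw bytes, so encode strings with an explicit charset.
List<Header> headers = Arrays.asList(
    new RecordHeader("hkey1", "hvalue1".getBytes(StandardCharsets.UTF_8)),
    new RecordHeader("hkey2", "hvalue2".getBytes(StandardCharsets.UTF_8))
);
// Arguments: topic, partition, key, value, headers
new ProducerRecord<>("topic", 0, "key", "value", headers);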
Pros
• No need to deserialize the whole message payload for routing / filtering use-cases
Cons
• Harder to save the headers together with the payload when archiving, persisting to data stores or integrating with third-party systems
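The routing benefit can be sketched without any Kafka dependency. The `Message` class and the "event-type" header below are hypothetical stand-ins, not part of Kafka's API; with the real consumer API the equivalent lookup would go through `record.headers().lastHeader(key)`. The point is only that the payload bytes are never parsed.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class HeaderRouter {

    /** Minimal model of a consumed record (NOT Kafka's ConsumerRecord). */
    static final class Message {
        final Map<String, byte[]> headers = new HashMap<>();
        final byte[] payload; // opaque bytes, never deserialized by the router

        Message(byte[] payload) {
            this.payload = payload;
        }

        Message header(String key, String value) {
            headers.put(key, value.getBytes(StandardCharsets.UTF_8));
            return this;
        }
    }

    /** Picks an output topic from the hypothetical "event-type" header only. */
    static String route(Message m) {
        byte[] raw = m.headers.get("event-type");
        if (raw == null) {
            return "events.unknown";
        }
        return "events." + new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Message m = new Message(new byte[] {0x01, 0x02}).header("event-type", "purchase");
        System.out.println(route(m)); // prints "events.purchase"
    }
}
```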
0.11: Transactions
Transactions
• Atomic writes to multiple Kafka topics and partitions
• Offset commits happen in the same transaction
• Every transactional producer is identified by a transactional.id plus an epoch, which lets the broker fence off stale instances
• Consumers must use the “read_committed” isolation level to consume only committed transactional data
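As a rough illustration of the settings these bullets refer to, the sketch below builds the client properties with plain `java.util.Properties`. The broker address, the transactional id and the group id are placeholder values, and serializer settings are omitted; only the property names are real Kafka client configuration keys.

```java
import java.util.Properties;

class TransactionalConfig {

    /** Producer side: a stable transactional.id enables the transactions API. */
    static Properties producerProps() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "localhost:9092");     // assumed address
        p.setProperty("transactional.id", "my-transactional-id"); // placeholder, should be stable across restarts
        p.setProperty("enable.idempotence", "true");              // required for transactions
        return p;
    }

    /** Consumer side: only read messages from committed transactions. */
    static Properties consumerProps() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        p.setProperty("group.id", "my-group");                // placeholder
        p.setProperty("isolation.level", "read_committed");   // default is read_uncommitted
        p.setProperty("enable.auto.commit", "false");         // offsets are committed via the producer transaction
        return p;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("transactional.id"));
        System.out.println(consumerProps().getProperty("isolation.level"));
    }
}
```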
Transactions
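KafkaProducer<String, String> producer = ...   // configured with a transactional.id
producer.initTransactions();

KafkaConsumer<String, String> consumer = ...   // key/value types are assumed here
consumer.subscribe(Collections.singletonList("inputTopic"));

ConsumerRecords<String, String> records = consumer.poll(Long.MAX_VALUE);
try {
    producer.beginTransaction();
    for (ConsumerRecord<String, String> record : records) {
        // processAndProduceRecord is an application-defined helper returning a ProducerRecord
        producer.send(processAndProduceRecord("outputTopic", record));
    }
    // Commit the consumed offsets as part of the same transaction
    producer.sendOffsetsToTransaction(currentOffsets(consumer), groupId);
    producer.commitTransaction();
} catch (Exception e) {
    // Simplified: fatal errors such as ProducerFencedException require closing the producer instead
    producer.abortTransaction();
}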
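The `currentOffsets(consumer)` helper is not shown on the slide. One detail it has to get right is that the offset committed for a partition is the position of the next record to read, that is, the last processed offset plus one. The sketch below illustrates just that arithmetic with a hypothetical `Rec` type and string partition names; the real API expects a `Map<TopicPartition, OffsetAndMetadata>`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OffsetsToCommit {

    /** Hypothetical stand-in for a consumed record: only partition and offset matter here. */
    static final class Rec {
        final String topicPartition;
        final long offset;

        Rec(String topicPartition, long offset) {
            this.topicPartition = topicPartition;
            this.offset = offset;
        }
    }

    /** For each partition, returns the highest processed offset + 1. */
    static Map<String, Long> nextOffsets(List<Rec> processed) {
        Map<String, Long> next = new HashMap<>();
        for (Rec r : processed) {
            next.merge(r.topicPartition, r.offset + 1, Long::max);
        }
        return next;
    }

    public static void main(String[] args) {
        List<Rec> batch = new ArrayList<>();
        batch.add(new Rec("inputTopic-0", 41));
        batch.add(new Rec("inputTopic-0", 42));
        batch.add(new Rec("inputTopic-1", 7));
        Map<String, Long> next = nextOffsets(batch);
        System.out.println(next.get("inputTopic-0")); // prints 43
        System.out.println(next.get("inputTopic-1")); // prints 8
    }
}
```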
Transactions
In practice, for a producer producing 1KB records at maximum
throughput, committing messages every 100ms results in only a
3% degradation in throughput.
https://www.confluent.io/blog/transactions-apache-kafka/
0.11: Exactly-Once Delivery
Exactly-Once: Why is it so Hard?
Delivery Guarantees
At most once
• May or may not be received
• No duplicates
• Possibly missing data
At least once
• Delivery guaranteed
• Possible duplicates
• No missing data
Exactly once
• Delivery guaranteed
• No duplicates
• No missing data
Idempotence
• Idempotent producer writes
Transactions
• Transactions API
• Atomic writes and reads
Idempotence
• A unique producer ID is assigned to each producer
• A monotonically increasing sequence number is generated for every topic/partition write
• The broker persists and validates sequence numbers:
  • lower than or equal to the last → duplicate, reject
  • more than one greater than the last → out-of-sequence error, reject
  • exactly one greater than the last → allow
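The validation rules above can be sketched as a small in-memory check. This is a simplified model, not the broker's implementation: the class and method names are made up, sequences are assumed to start at 0, and producer epochs and batching are ignored.

```java
import java.util.HashMap;
import java.util.Map;

class SequenceValidator {

    enum Result { ACCEPT, DUPLICATE, OUT_OF_SEQUENCE }

    // Last accepted sequence number per "producerId:topic-partition" key.
    private final Map<String, Integer> lastSeq = new HashMap<>();

    Result validate(long producerId, String topicPartition, int seq) {
        String key = producerId + ":" + topicPartition;
        int last = lastSeq.getOrDefault(key, -1); // -1 means nothing accepted yet
        if (seq <= last) {
            return Result.DUPLICATE;        // already written, e.g. a producer retry
        }
        if (seq > last + 1) {
            return Result.OUT_OF_SEQUENCE;  // a gap: an earlier write is missing
        }
        lastSeq.put(key, seq);              // exactly last + 1
        return Result.ACCEPT;
    }

    public static void main(String[] args) {
        SequenceValidator v = new SequenceValidator();
        System.out.println(v.validate(1L, "topic-0", 0)); // ACCEPT
        System.out.println(v.validate(1L, "topic-0", 0)); // DUPLICATE
        System.out.println(v.validate(1L, "topic-0", 2)); // OUT_OF_SEQUENCE
        System.out.println(v.validate(1L, "topic-0", 1)); // ACCEPT
    }
}
```

Because the state is keyed by producer and partition, a retried write on one partition does not affect the sequence tracking of another.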
Enabling Exactly-Once in Kafka Streams?
Just set “processing.guarantee” to “exactly_once”. That’s it!
You don’t need to think about checkpointing and related challenges (as in some other frameworks...).
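A minimal sketch of that setting, using plain `java.util.Properties`. The application id and broker address are placeholders; `processing.guarantee` is the real Kafka Streams configuration key, and its default is `at_least_once`. Later Kafka releases introduced an `exactly_once_v2` value, so the accepted values depend on the version in use.

```java
import java.util.Properties;

class StreamsExactlyOnceConfig {

    static Properties streamsProps() {
        Properties p = new Properties();
        p.setProperty("application.id", "my-streams-app");      // placeholder
        p.setProperty("bootstrap.servers", "localhost:9092");   // assumed address
        p.setProperty("processing.guarantee", "exactly_once");  // default: at_least_once
        return p;
    }

    public static void main(String[] args) {
        System.out.println(streamsProps().getProperty("processing.guarantee"));
    }
}
```

These properties would then be passed to the `KafkaStreams` constructor together with the topology.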
1.1: Controller Improvements
Controller Improvements
• One Controller per cluster
• Responsible for state management of partitions and replicas
• Communicates with ZooKeeper
Before 1.1.0
• Partition leaders are updated one by one, sequentially, during the controlled shutdown
• The synchronous ZooKeeper API is used during the controlled shutdown and controller failover
• Controlled shutdown time: 6.5 minutes
After 1.1.0
• Partition leaders are updated in batches during the controlled shutdown
• The asynchronous ZooKeeper API is used during the controlled shutdown and controller failover
• Controlled shutdown time: 3 seconds
2.0: Kafka Streams Improvements
Kafka Streams Improvements
• Message header support in the Processor API
• TopicNameExtractor for dynamic routing
• kafka-streams-test-utils helper for unit-testing
• Scala wrapper for the Streams DSL
Thanks!
@sap1ens

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 

Apache Kafka: New Features That You Might Not Know About

  • 1. Apache Kafka: New Features That You Might Not Know About Yaroslav Tkachenko Software Architect at Activision
  • 2. Apache Kafka Versions 0.10.0.0 (May 2016): 0.10.0.1, 0.10.2.2. 0.11.0.0 (June 2017): 0.11.0.1, 0.11.0.3. 1.0.0 (November 2017): 1.0.1, 1.1.1. 2.0.0 (July 2018): 2.0.1, 2.1.0.
  • 4. Record Batch: ... magic: 2 … attributes: … bit 4: isTransactional … producerId: int64 producerEpoch: int16 records: [Record] Message Format v2 Record: ... key: byte[] value: byte[] headers: [Header] Header: ... headerKey: String value: byte[]
  • 6. Message Headers public interface Header { String key(); byte[] value(); } List<Header> headers = Arrays.asList( new RecordHeader("hkey1", "hvalue1".getBytes()), new RecordHeader("hkey2", "hvalue2".getBytes()) ); new ProducerRecord<>("topic", 0, "key", "value", headers);
  • 7. Message Headers Pros: • No need to deserialize the whole message payload for routing / filtering use-cases Cons: • Harder to save the headers together with the payload when archiving, persisting to data stores or integrating with 3rd party systems
  • 10. Transactions • Atomic writes to multiple Kafka topics and partitions • Offset commits happen in the same transaction • transactional.id + epoch for every producer • Consumers must use “read_committed” isolation level for consuming only committed transactional data
  • 11. Transactions KafkaProducer producer = ... producer.initTransactions(); KafkaConsumer consumer = ... consumer.subscribe(Collections.singletonList("inputTopic")); ConsumerRecords records = consumer.poll(Long.MAX_VALUE); try { producer.beginTransaction(); for (ConsumerRecord record : records) { producer.send(processAndProduceRecord("outputTopic", record)); } producer.sendOffsetsToTransaction(currentOffsets(consumer), groupId); producer.commitTransaction(); } catch (Exception e) { producer.abortTransaction(); }
  • 13. Transactions In practice, for a producer producing 1KB records at maximum throughput, committing messages every 100ms results in only a 3% degradation in throughput. https://www.confluent.io/blog/transactions-apache-kafka/
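The transactional flow on slides 10 and 11 depends on a few client settings that the slides mention but do not show. The sketch below collects them with plain `java.util.Properties`. The class name `TransactionalConfigs`, the method names, and the sample values in the usage note are illustrative assumptions; the configuration keys themselves are standard Kafka client settings.

```java
import java.util.Properties;

// Hypothetical helper (not from the slides) that gathers the client
// settings a transactional read-process-write loop relies on.
public class TransactionalConfigs {

    // Producer side: a stable transactional.id lets the broker fence
    // older producer instances by epoch.
    public static Properties producerProps(String bootstrapServers, String transactionalId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("transactional.id", transactionalId);
        // Transactions build on the idempotent producer.
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        return props;
    }

    // Consumer side: only committed transactional data is returned.
    public static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("isolation.level", "read_committed");
        // Offsets are committed through the producer's transaction,
        // so consumer auto-commit is switched off.
        props.setProperty("enable.auto.commit", "false");
        return props;
    }
}
```

These objects would be passed to the `KafkaProducer` and `KafkaConsumer` constructors, for example `new KafkaProducer<>(TransactionalConfigs.producerProps("localhost:9092", "my-tx-id"))`, with serializer and deserializer settings added as needed.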
  • 16. Delivery Guarantees At most once: • May or may not be received • No duplicates • Probably missing data At least once: • Delivery guaranteed • Possible duplicates • No missing data Exactly once: • Delivery guaranteed • No duplicates • No missing data
  • 17. Idempotence: idempotent producer writes. Transactions: Transactions API, atomic writes and reads.
  • 18. Idempotence • Unique producer ID is assigned to each producer • Monotonically increasing sequence number is generated for every topic/partition write • Broker persists and validates sequence numbers: • lower number → duplicate, reject • higher number → out-of-sequence error, reject • exactly one greater than the last → allow
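The sequence-number rules on slide 18 can be illustrated with a small in-memory model. This is a simplified sketch, not the broker's actual implementation: the class `SequenceValidator`, its `Result` enum, and the string key format are hypothetical, and it assumes the first sequence number for a producer and partition is 0.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of broker-side sequence validation (illustrative only).
public class SequenceValidator {

    public enum Result { ACCEPTED, DUPLICATE, OUT_OF_SEQUENCE }

    // Last accepted sequence number per producer ID and topic/partition.
    private final Map<String, Integer> lastSequence = new HashMap<>();

    public Result validate(long producerId, String topicPartition, int sequence) {
        String key = producerId + ":" + topicPartition;
        // Assumption: nothing written yet is modelled as -1, so 0 is the first valid number.
        int last = lastSequence.getOrDefault(key, -1);
        if (sequence <= last) {
            return Result.DUPLICATE;        // lower (or equal) number: already seen
        }
        if (sequence > last + 1) {
            return Result.OUT_OF_SEQUENCE;  // gap: an earlier write is missing
        }
        lastSequence.put(key, sequence);    // exactly one greater than the last
        return Result.ACCEPTED;
    }

    public static void main(String[] args) {
        SequenceValidator broker = new SequenceValidator();
        System.out.println(broker.validate(1L, "orders-0", 0)); // ACCEPTED
        System.out.println(broker.validate(1L, "orders-0", 1)); // ACCEPTED
        System.out.println(broker.validate(1L, "orders-0", 1)); // DUPLICATE (retry)
        System.out.println(broker.validate(1L, "orders-0", 3)); // OUT_OF_SEQUENCE
        System.out.println(broker.validate(1L, "orders-0", 2)); // ACCEPTED
    }
}
```

Tracking the state per producer ID and partition is what makes a producer retry safe: a resent write carries the same sequence number and is recognised as a duplicate instead of being appended twice.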
  • 19. Enabling Exactly-Once in Kafka Streams? Just set “processing.guarantee” to “exactly_once”. That’s it! Don’t need to think about checkpointing and related challenges (like in some other frameworks...)
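As a minimal sketch of the setting named on slide 19, the configuration might look like the following. The class name `ExactlyOnceStreamsConfig`, the method name, and the sample values in the usage note are assumptions for illustration; `processing.guarantee` and its value `exactly_once` come from the slide.

```java
import java.util.Properties;

// Hypothetical helper (not from the slides) that builds a Kafka Streams
// configuration with exactly-once processing switched on.
public class ExactlyOnceStreamsConfig {

    public static Properties build(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        // The single switch from the slide.
        props.setProperty("processing.guarantee", "exactly_once");
        return props;
    }
}
```

The resulting `Properties` object would be handed to the `KafkaStreams` constructor together with the topology, for example with `build("orders-app", "localhost:9092")`.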
  • 21. Controller Improvements • One Controller per cluster • Responsible for state management of partitions and replicas • Communicates with Zookeeper
  • 22. Before 1.1.0: Updating partition leaders one by one, sequentially during the controlled shutdown. Zookeeper Synchronous API is used during the controlled shutdown and controller failover. Controlled shutdown time: 6.5 minutes. After 1.1.0: Updating partition leaders in batches during the controlled shutdown. Zookeeper Asynchronous API is used during the controlled shutdown and controller failover. Controlled shutdown time: 3 seconds.
  • 24. Kafka Streams Improvements • Message header support in the Processor API • TopicNameExtractor for dynamic routing • kafka-streams-testutil helper for unit-testing • Scala wrapper for the Streams DSL