Deep Dive into Kafka for Big Data Applications
Presented By: Amit Kumar
KnolX Etiquettes
Lack of etiquette and manners is a huge turn-off.
Punctuality
Join the session 5 minutes prior to
the session start time. We start on
time and conclude on time!
Feedback
Make sure to submit constructive
feedback for all sessions, as it is
very helpful for the presenter.
Silent Mode
Keep your mobile devices in silent
mode; feel free to step out of the
session if you need to attend an
urgent call.
Avoid Disturbance
Avoid unwanted chit-chat during
the session.
Our Agenda
01 Kafka Introduction
02 Topic Config
03 Producer Config
04 Consumer Config
05 Demo
Introduction
● Apache Kafka is used primarily to build real-time data streaming pipelines.
● It is used to store streams of data, and it is built on the pub/sub model.
● Some streams receive tens of thousands of records per second, while others receive only one or two records
per hour.
● This makes Kafka an important tool for storing the intermediate data/events of a big data application.
● These streams of data are stored in Kafka as Kafka topics.
● Once a stream is stored in Kafka, it can be consumed by multiple applications for different use cases, such as
storing it in a database or running analytics.
● Apache Kafka also works as a cushion between two applications, especially when the downstream application is slower.
Kafka Features
● Low Latency: Kafka offers low latency, down to around 10 milliseconds.
● High Throughput: because of its low latency, Kafka can handle data of high velocity and volume.
● Fault Tolerance: Kafka handles node/machine failures within the cluster.
● Durability: messages are durable because of Kafka's replication feature.
● Distributed: Kafka has a distributed architecture, which makes it scalable.
● Real-Time Handling: Kafka is able to handle real-time data pipelines.
● Batching: Kafka also supports batching and batch-like use cases.
Kafka Components
● Apache Kafka stores data streams in topics.
● The Producer API is used to write/publish data to a Kafka topic.
● The Consumer API is used to consume data from a topic for further use.
Kafka Topic
A topic is similar to a table in a database: Kafka uses topics to organise messages of a particular category.
Unlike a database table, we cannot query a Kafka topic. Instead, we create a producer to write data and a
consumer to read it back, in sequential order.
Data in a topic is deleted as per the retention period.
Important Kafka topic configs (a minimal creation sketch follows below):
Number of Partitions
Replication Factor
Message Size
Log Cleanup Policy
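For illustration (not part of the original deck), a topic with these settings can be created programmatically with Kafka's AdminClient. In the Scala sketch below, the topic name "knolx-events" and the broker address are assumptions:

import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}
import scala.jdk.CollectionConverters._

val adminProps = new Properties()
adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed broker address
val admin = AdminClient.create(adminProps)

// 3 partitions, replication factor 3, plus topic-level overrides
val topic = new NewTopic("knolx-events", 3, 3.toShort).configs(Map(
  "max.message.bytes" -> "10485760", // ~10 MB max message size
  "cleanup.policy"    -> "delete"    // log cleanup policy: delete old segments
).asJava)

admin.createTopics(Collections.singleton(topic)).all().get()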
Config Details
The number of partitions governs the parallelism of the application.
To compute in parallel we need multiple consumer instances, and since one partition cannot feed data to
multiple consumers within the same group, we have to increase the partition count to achieve this (see the
sketch below for adding partitions to an existing topic).
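As a sketch (reusing the admin client and assumed topic name from the previous example), the partition count of an existing topic can only be increased, never decreased:

import org.apache.kafka.clients.admin.NewPartitions

// grow knolx-events from 3 to 6 partitions so 6 consumers in a group can work in parallel
admin.createPartitions(Map("knolx-events" -> NewPartitions.increaseTo(6)).asJava).all().get()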
Continued…
The replication factor is the number of copies of the data kept on different brokers. It helps us deal
with data loss when a broker goes offline or fails: the remaining replicas serve the data.
Typically we set the replication factor to 3.
Increasing the replication factor further hurts performance, while keeping it lower risks
losing data.
Message Size: Kafka has a default limit of 1 MB per message in a topic.
In a few scenarios we need to send data larger than 1 MB. In that case we can raise the maximum
message size, for example up to 10 MB. Several related settings are involved: the topic-level
max.message.bytes, the broker's message.max.bytes and replica.fetch.max.bytes, and the producer's
max.request.size, e.g.
replica.fetch.max.bytes=10485760
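A hedged sketch of raising the topic-level limit on an existing topic (reusing the admin client and assumed topic name from earlier; the broker and producer settings still have to be raised separately):

import org.apache.kafka.clients.admin.{AlterConfigOp, ConfigEntry}
import org.apache.kafka.common.config.ConfigResource

val topicResource = new ConfigResource(ConfigResource.Type.TOPIC, "knolx-events")
val raiseLimit = new AlterConfigOp(new ConfigEntry("max.message.bytes", "10485760"), AlterConfigOp.OpType.SET)
val ops: java.util.Collection[AlterConfigOp] = List(raiseLimit).asJavaCollection
admin.incrementalAlterConfigs(Map(topicResource -> ops).asJava).all().get()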
Continued…
The log cleanup policy makes sure that older messages in the topic get cleaned up, freeing
storage space on the broker.
With the default delete policy, it is controlled by the following two configurations.
log.retention.hours: the most common configuration for how long Kafka will retain
messages is by time. The default is specified in the configuration file using the
log.retention.hours parameter, and it is set to 168 hours, the equivalent of one week.
log.retention.bytes: another way to expire messages is based on the total number of bytes of
messages retained. This value is set using the log.retention.bytes parameter, and it is applied per partition.
The default is -1, meaning that there is no limit and only a time limit is applied.
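These two are broker-wide defaults; per topic, the equivalent overrides are retention.ms and retention.bytes. A sketch under the same assumptions as the previous examples:

// 7 days by time, unlimited by size (topic-level equivalents of the broker defaults above)
val retentionOps: java.util.Collection[AlterConfigOp] = List(
  new AlterConfigOp(new ConfigEntry("retention.ms", "604800000"), AlterConfigOp.OpType.SET), // 168 hours
  new AlterConfigOp(new ConfigEntry("retention.bytes", "-1"), AlterConfigOp.OpType.SET)
).asJavaCollection
admin.incrementalAlterConfigs(Map(topicResource -> retentionOps).asJava).all().get()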
Kafka Producer
Once a topic has been created in Kafka, the next step is to send data into the topic. This is
where Kafka producers come in.
A Kafka producer sends messages to a topic, and messages are distributed to partitions
according to a mechanism such as key hashing.
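For reference (my addition, describing Kafka's default partitioner rather than anything specific to this deck), a non-null key is hashed with murmur2 to pick the partition, roughly:

import org.apache.kafka.common.utils.Utils

// keyBytes: Array[Byte] and numPartitions: Int are assumed to be in scope.
// Same key => same partition; null keys use a sticky partitioner instead.
val partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions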
Continued…
Kafka messages are created by the producer. A Kafka message consists of the following elements:
Continued…
Key: optional in a Kafka message, and it can be null. A key may be a string, number, or any object; the
key is then serialized into binary format.
Value: represents the content of the message and can also be null. The value format is arbitrary and is
then also serialized into binary format.
Compression Type: Kafka messages can be compressed. The compression options are none, gzip, lz4, snappy,
and zstd.
Headers: optional key-value pairs, added especially for tracing of the message.
Partition + Offset: once a message is sent into a Kafka topic, it receives a partition number and an offset ID. The
combination of topic + partition + offset uniquely identifies the message.
Timestamp: a timestamp is added to the message either by the user or by the system.
Continued…
// Must-have config (bootstrapServers and stringSerializer are assumed to be defined elsewhere,
// e.g. val stringSerializer = classOf[StringSerializer].getName)
val producerProps: Map[String, String] = Map(
  "bootstrap.servers" -> bootstrapServers,
  "key.serializer" -> stringSerializer,
  "value.serializer" -> stringSerializer,
  // safe producer
  "enable.idempotence" -> "true",
  "acks" -> "all",
  "retries" -> Integer.MAX_VALUE.toString,
  "max.in.flight.requests.per.connection" -> "5",
  // high-throughput producer, at the expense of a bit of latency and CPU usage
  "compression.type" -> "snappy",
  "linger.ms" -> "20",
  "batch.size" -> (32 * 1024).toString // 32 KB
)
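A minimal sketch of building a producer from these properties and sending a message (my addition; the topic name, key, and value are assumptions):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val javaProps = new Properties()
producerProps.foreach { case (k, v) => javaProps.put(k, v) }

val producer = new KafkaProducer[String, String](javaProps)
// the key "user-42" routes all messages for that user to the same partition
producer.send(new ProducerRecord[String, String]("knolx-events", "user-42", "page-view"))
producer.flush()
producer.close()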
Ack
Safe Producer:
acks is the number of brokers that need to acknowledge receiving the message before it is considered a
successful write.
acks=0: producers consider messages as "written successfully" the moment the message is sent, without
waiting for the broker to accept it at all. This is the fastest approach, but data loss is possible.
acks=1: producers consider messages as "written successfully" when the message is acknowledged by
the leader only.
acks=all: producers consider messages as "written successfully" when the message is accepted by all in-sync
replicas (ISR).
Retry
Retries ensure that no messages are dropped when sent to Apache Kafka.
For Kafka >= 2.1 the default retries value is the max int: retries = 2147483647.
That doesn't mean the producer will keep retrying forever: retrying is bounded by delivery.timeout.ms.
The default setting for the timeout is 2 minutes: delivery.timeout.ms=120000.
max.in.flight.requests.per.connection: without idempotence, keeping the ordering maintained requires
setting the max in-flight requests to 1, but that hurts performance. With enable.idempotence=true,
ordering is preserved with up to 5 in-flight requests, so to keep performance high we set it to 5.
Compression
Producers group messages into a batch before sending.
If the producer is sending compressed messages, all the messages in a single producer batch are compressed
together and sent as the "value" of a "wrapper message".
Compression is more effective the bigger the batch of messages being sent to Kafka.
The compression options are compression.type = none, gzip, lz4, snappy, and zstd.
Batching
By default, Kafka producers try to send records as soon as possible.
If we want to increase throughput, we have to enable batching.
Batching is mainly controlled by two producer settings: linger.ms and batch.size.
The default value of linger.ms is 0 ms, and the default batch.size is 16 KB.
With the settings shown earlier (linger.ms=20, batch.size=32 KB), the producer waits until either the
batch fills up or 20 ms passes, whichever comes first, before sending the batch.
Kafka Consumer
Applications that pull event data from one or more Kafka topics are known as Kafka consumers.
Consumers can read from one or more partitions at a time in Apache Kafka (a minimal poll loop is
sketched below).
Data is read in order within each partition.
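A minimal poll-loop sketch (my addition; the broker address, group ID, and topic name are assumptions):

import java.time.Duration
import java.util.Properties
import org.apache.kafka.clients.consumer.KafkaConsumer
import scala.jdk.CollectionConverters._

val consumerProps = new Properties()
consumerProps.put("bootstrap.servers", "localhost:9092")
consumerProps.put("group.id", "knolx-group")
consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

val consumer = new KafkaConsumer[String, String](consumerProps)
consumer.subscribe(List("knolx-events").asJava)

while (true) {
  val records = consumer.poll(Duration.ofMillis(100)) // one batch of records per poll() call
  records.asScala.foreach { r =>
    println(s"partition=${r.partition()} offset=${r.offset()} value=${r.value()}")
  }
}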
Delivery Semantics
A consumer reading from a Kafka partition may choose when to commit offsets.
At Most Once Delivery: offsets are committed as soon as a message batch is received after calling poll().
If processing fails, the message is lost: it will not be read again, because its offset has already been
committed.
At Least Once Delivery: in this semantic we don't want to lose any message, so we commit the offset only
after processing is done; due to retries, though, this can lead to duplicate processing.
This is suitable for consumers that cannot afford any data loss.
Exactly Once Delivery: in this semantic we want each message to be processed exactly once, with no
duplicate data. It is applicable to payments and similar sensitive use cases. In Kafka Streams it is enabled with
processing.guarantee=exactly_once (exactly_once_v2 in newer versions).
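A hedged at-least-once sketch (my addition, reusing the consumer from the earlier poll-loop example and assuming enable.auto.commit=false plus an application-defined process function): commit offsets only after processing succeeds, so a crash causes reprocessing rather than loss:

while (true) {
  val records = consumer.poll(Duration.ofMillis(100))
  records.asScala.foreach(r => process(r)) // process(...) is an assumed application function
  if (!records.isEmpty) consumer.commitSync() // commit only after the whole batch is processed
}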
Polling
Kafka consumers poll the Kafka broker to receive batches of data.
Polling allows consumers to control:
● From where in the log they want to consume
● How fast they want to consume
● The ability to replay events
The consumer sends a heartbeat at a regular interval (heartbeat.interval.ms). If it stops sending heartbeats,
the group coordinator waits up to session.timeout.ms and then triggers a rebalance.
Consumers poll brokers periodically using the .poll() method. If two .poll() calls are separated by more than
max.poll.interval.ms, then the consumer will be removed from the group.
The default value of max.poll.interval.ms is 5 minutes.
Continued…
max.poll.records (default 500): controls the maximum number of records that a single call to poll() will return.
This is an important config that controls how fast data flows into our application.
If your application processes data slowly, set max.poll.records lower; if it processes data quickly,
you can set it higher (an illustrative tuning sketch follows below).
If processing a particular batch takes longer than max.poll.interval.ms, the consumer will be removed
from the group. To make sure this doesn't happen, reduce the max.poll.records value.
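Illustrative values only (my assumptions, not recommendations from the deck), set on consumerProps from the earlier sketch before the KafkaConsumer is constructed:

consumerProps.put("max.poll.records", "100")        // smaller batches for a slow processor
consumerProps.put("max.poll.interval.ms", "300000") // 5 minutes, the default
consumerProps.put("session.timeout.ms", "45000")    // how long the coordinator waits for heartbeats
consumerProps.put("heartbeat.interval.ms", "3000")  // heartbeat frequency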
Auto Offsets Reset
● A consumer is expected to read from the log continuously.
● But due to a network error or a bug in the application, we may stop processing messages for a while.
● Once we restart the application, the previously committed offset may no longer be available (deleted due to
the retention policy).
● The consumer then has two options: read from the beginning or read from the latest offset.
This behavior is controlled by auto.offset.reset, which can take the following values:
latest: consumers read messages from the tail of the partition
earliest: consumers read from the oldest offset in the partition
none: an exception is thrown to the consumer if no previous offset is found
Thank You!
Get in touch with us: