Watch full webinar here: https://buff.ly/43PDVsz
In today's fast-paced, data-driven world, organizations need real-time data pipelines and streaming applications to make informed decisions. Apache Kafka, a distributed streaming platform, provides a powerful solution for building such applications while scaling without downtime and handling high volumes of data. At the heart of Apache Kafka lie Kafka Topics, which enable communication between clients and brokers in the Kafka cluster.
Join us for this session with Pooja Dusane, Data Engineer at Denodo, where we will explore the critical role that Kafka listeners play in enabling connectivity to Kafka Topics. We'll dive deep into the technical details, discussing the key concepts of Kafka listeners, including their role in enabling real-time communication between consumers and producers. We'll also explore the various configuration options available for Kafka listeners and demonstrate how they can be customized to suit specific use cases.
Attend and Learn:
- The critical role that Kafka listeners play in enabling connectivity in Apache Kafka.
- Key concepts of Kafka listeners and how they enable real-time communication between clients and brokers.
- Configuration options available for Kafka listeners and how they can be customized to suit specific use cases.
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time Data Enrichment
1. UNLOCKING THE POWER OF APACHE KAFKA: HOW KAFKA LISTENERS FACILITATE REAL-TIME DATA ENRICHMENT
Pooja Dusane
Data Engineer | Denodo
2. AGENDA
1. Kafka
a. Why is Kafka Popular?
b. Kafka History
c. What is Kafka
d. Kafka Key Terminologies
2. Kafka Listener
a. What are Kafka Listeners
b. How Kafka Listeners facilitate real-time data enrichment
c. Denodo Kafka Listener
d. Difference between Custom Wrapper and Listener
3. Demo
4. Closing Remarks
4. More than 80% of all Fortune 100 companies trust and use Kafka.
5. WHY IS KAFKA POPULAR?
Architecture - Kafka uses a partitioned log model, which combines the messaging-queue and publish-subscribe approaches.
Scalability - Kafka scales by distributing a topic's partitions across different servers.
Zero Downtime - Kafka clusters can be expanded, upgraded, and rebalanced while continuing to deliver in-order, continuous messaging.
Low Latency & High Throughput - Without requiring especially powerful hardware, Apache Kafka can handle high-volume, high-speed data with millisecond latency, which is what most new use cases require.
Fault Tolerance - If an instance running a job fails, Kafka Streams automatically resumes the work on one of the remaining running instances of the application.
Extensibility - Kafka's prominence has prompted numerous other programs to develop integrations with it over time.
Guaranteed Delivery - Kafka ensures that no duplicate messages are created in a topic and that messages sent by a producer to a specific topic partition are appended in the order in which they were sent.
6. HISTORY
● 2010: LinkedIn developed Kafka.
● 2012: Kafka is donated to the Apache Software Foundation.
● 2015: Kafka version 0.8.2 is released.
● 2017: Kafka version 1.0.0 stable release.
● 2019: Confluent raised money to expand.
● 2021: Kafka version 2.8.0 is released (improvements).
7. WHAT IS KAFKA
Apache Kafka is a distributed data store optimized for ingesting and processing streaming data in real time.
Different models are available:
▪ Publish-Subscribe model
▪ Queuing model
Apache Kafka is horizontally scalable, highly available, and fault tolerant. It supports clustered architectures and load-balancer configuration, and its topics are partitioned.
8. KEY TERMINOLOGY
● Broker : Apache Kafka runs as a cluster on one or more servers that can span multiple data centers.
● Producer : Writes data to the brokers.
● Consumer : Consumes data from the brokers.
● Topics : A topic is a category/feed name to which messages are stored and published.
● Partitions : Kafka topics are divided into a number of partitions, which contain immutable messages.
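The relationship between these terms can be sketched in a few lines of code. This is a minimal in-memory illustration of the partitioned-log model, not the real broker implementation: a topic is a set of append-only partitions, a message's position in its partition is its offset, and a keyed message always hashes to the same partition.

```python
import zlib

class Topic:
    """Toy model of a Kafka topic: a set of append-only partition logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Keyed messages hash to a fixed partition, so all messages with the
        # same key stay ordered within a single partition.
        index = zlib.crc32(key.encode()) % len(self.partitions)
        self.partitions[index].append((key, value))
        return index, len(self.partitions[index]) - 1  # (partition, offset)

orders = Topic("orders", num_partitions=3)
p1, o1 = orders.append("customer-42", "order placed")
p2, o2 = orders.append("customer-42", "order shipped")
assert p1 == p2      # same key -> same partition
assert o2 == o1 + 1  # offsets grow monotonically within a partition
```

This also shows why ordering in Kafka is a per-partition guarantee: only messages sharing a key are guaranteed to land in the same log.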
11. WHAT ARE KAFKA LISTENERS
● Kafka listeners are the part of an application that consumes data from Kafka topics.
● They continuously poll Kafka for new messages in near real time.
● Kafka listeners retrieve messages and process them according to the application's logic.
● Kafka listeners can be configured to listen to one or more topics and use consumer groups for fault tolerance and load balancing.
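The poll-and-dispatch loop described above can be sketched as follows. This is an illustrative stand-in, not a real Kafka client: an in-memory deque plays the role of the topic, and `run_listener` and `max_polls` are hypothetical names for the loop and its stopping condition.

```python
from collections import deque

def run_listener(source, handler, max_polls):
    """Poll `source` up to `max_polls` times, handing each new message to `handler`."""
    processed = 0
    for _ in range(max_polls):
        while source:                 # drain whatever arrived since the last poll
            message = source.popleft()
            handler(message)          # application-specific processing logic
            processed += 1
    return processed

topic = deque(["event-1", "event-2", "event-3"])
seen = []
count = run_listener(topic, seen.append, max_polls=2)
assert seen == ["event-1", "event-2", "event-3"]
```

A real listener would call a Kafka consumer's poll method inside the loop and typically run forever; the bounded `max_polls` is only there to make the sketch terminate.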
12. HOW KAFKA LISTENERS FACILITATE REAL-TIME DATA ENRICHMENT
● Real-time data enrichment is the process of adding additional information to incoming data in real time.
● Kafka listeners allow applications to consume data from Kafka topics and process it in real time.
● When a Kafka listener is configured to listen to a particular Kafka topic, it receives a stream of messages as they are published to the topic.
● The listener can then process each message and add additional information to it before passing it on to downstream systems or a consuming Kafka topic.
● With Kafka listeners, organizations can build highly performant and scalable applications that handle large volumes of data in real time.
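The enrichment step itself is just the listener's handler joining each incoming message with reference data before forwarding it. A minimal sketch, with hypothetical field names and an in-memory list standing in for the downstream topic:

```python
# Reference data the listener enriches against (illustrative).
customer_lookup = {"42": {"name": "Acme Corp", "tier": "gold"}}

def enrich(message):
    """Join a raw event with reference attributes in real time."""
    extra = customer_lookup.get(message["customer_id"], {})
    return {**message, **extra}

downstream = []  # stands in for the consuming Kafka topic
for raw in [{"customer_id": "42", "amount": 99.5}]:
    downstream.append(enrich(raw))

assert downstream[0]["tier"] == "gold"
```

In production the lookup would typically hit a cache, a database, or (in Denodo's case) a virtual view, and the enriched record would be produced to another Kafka topic.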
13. Overview
KAFKA LISTENERS IN DENODO
● Component in the Denodo Platform that allows receiving events from, and sending events to, Apache Kafka
● Executes statements against Denodo based on the information received in Apache Kafka events
● Extension of the VQL language to allow configuring the created components
● Graphical component in Design Studio to manage the created components
14. Overview
KAFKA LISTENER IN DENODO
In Virtual DataPort you can create a Kafka listener to subscribe to data originating from a Kafka server. The listener can:
● Execute the VQL statements received from the Kafka server, or
● Run a query defined with the interpolation variable (@LISTENEREXPRESSION)
15. Difference between Kafka Listener and Kafka Custom Wrapper
Custom Wrapper
● Enables "pull" (or query-based) access
● Allows access to topic information in the same way as a conventional data source
● Access is incremental, or from a certain point onward, to obtain all the requested data
● Only reads from Kafka topics, so the data can be combined with other views
● Key Use Case - Accessing Kafka topics as a data source for publishing data in web services or reporting tools
Listener
● Enables "push" (or event-based) access
● The listener's objective is to process the information from the topics it subscribes to
● Access is through VQL statements or the interpolation variable
● Reads from and writes to Kafka topics
● Key Use Case - Data enrichment of producer data
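The pull-versus-push contrast above can be made concrete with a small sketch. The class and method names are illustrative, and an in-memory object stands in for Kafka: a wrapper-style client queries the log on demand, while a listener is invoked as each event is published.

```python
class InMemoryTopic:
    """Toy topic supporting both access styles."""

    def __init__(self):
        self.log = []
        self.listeners = []

    def publish(self, event):
        self.log.append(event)
        for callback in self.listeners:   # push: listeners react immediately
            callback(event)

    def query(self, from_offset=0):       # pull: caller asks when it wants data
        return self.log[from_offset:]

topic = InMemoryTopic()
pushed = []
topic.listeners.append(pushed.append)     # register a listener
topic.publish("a")
topic.publish("b")
assert pushed == ["a", "b"]               # push: events arrived as published
assert topic.query(from_offset=1) == ["b"]  # pull: incremental query from an offset
```

The offset parameter mirrors the wrapper's incremental access "from a certain point", while the callback mirrors the listener's event-driven processing.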
19. CLOSING REMARKS
● Kafka listeners continuously poll Kafka for new messages in near real time.
● The listener can process each message and add additional information, enriching the data before passing it on to a consuming Kafka topic.
● In Denodo, a Kafka listener can execute VQL statements received from the Kafka server, or you can use a query with the interpolation variable (@LISTENEREXPRESSION).