SlideShare a Scribd company logo
1 of 14
KAFKA INTRODUCTION
KAFKA VS JMS: SIMILARITIES AND DIFFERENCES
EXAMPLE: MATECO
Mateco Business ‘Streams’
• Microsoft Dynamics 365
• CRM + ERP
• Invoicing
• Q.Rent
• Q.Planning
• Q.Service
• Q.Trade
• Integrations
Each ‘Stream’ has their own IT squad and
their own set of applications.
WHY DO WE NEED MESSAGES/EVENTS?
 IT Architecture with multiple teams
 Make the teams independent as much as possible
 No central database
 No distributed transactions
 No blocking REST calls
 But: Keep data consistent across teams’ services
 ‘Eventual Consistency’
 Solution
 each service publishes an event whenever it updates its
data
 other services subscribe to events
 when an event is received, that service updates its data
EVENT STREAMING
 Information is packaged as ‘events’, with enough
information for any consumer to be able to handle it
 The event is published in a central event store
 The sender does not need to know which consumers
will process it
 Allows:
 Replicate information across independent services
 In pseudo-real-time
 Create new applications without disrupting other
applications
JMS TOPICS AND QUEUES VS KAFKA TOPICS
JMS
 Topic: publish-subscribe
 All subscribers receive all
messages
 Messages are only sent to active
subscribers
 Queue: send-receive
 Messages are queue’d until a
consumer consumes it.
 Allows horizontal scaling
 Needs a separate queue for a
different receiver group
Kafka Topic
 Stores all messages
 Limited retention (optional)
 Stores offsets on the server for
each consumer group
 Multiple consumer(group)s can
receive all messages
 Consumers can restart from the
beginning
 New consumers can be added
KAFKA ARCHITECTURE: CONSUMER GROUPS
KAFKA ARCHITECTURE: BROKERS
Broker: Server for Partitions
• Partition master
• Partition replica
ZooKeeper
• Used for synchronization within brokers
• Optional since Kafka 2.8
• Older Kafka versions used ZooKeeper
for connection management and to
store offsets
MESSAGE GROUPS VS PARTITIONS
JMS
 Message Group ID is an optional header
 Guarantees that messages for the same
Message Group ID are processed by the same
thread
KAFKA
 Each message can have a key
 A topic is divided into partitions, each message will
be put into a partition based on a hash of its key
 Random partition if there’s no key
 Subscribers get a fixed number of partitions
assigned
 Partitions are also used for horizontal scaling
 Partitions are distributed across servers
 Consumers only need to connect to owners of their
partitions
KAFKA DEMO
 Run Kafka on Kubernetes
 Using helm charts, e.g: https://artifacthub.io/packages/helm/bitnami/kafka
 Alternative: Confluent Cloud, AWS Managed Kafka, …
 Demo application using Spring Boot: https://www.baeldung.com/java-kafka-streams-vs-kafka-consumer
KAFKA TOPIC CLEANUP POLICY
Kafka topics are stored in append-only segments.
 Cleanup is still done, but per segment, not per
message
 Time-based retention
 Size-based retention
 Unlimited retention
 Each topic has a cleanup policy that decides what
happens when the retention expires
 Delete: delete the oldest messages
 Compact: delete messages with duplicate keys
 Keep only the latest version of the value for each key
 How to delete a key: overwrite it with an empty value,
a.k.a ‘tombstone’
KEY/VALUE SERIALIZATION
 Message Keys and Values are Binary, but can be
serialized/deserialized by the client
 AVRO is a popular encoding format, more
compact than JSON
 You can generate java classes from AVRO schema’s
AVRO IDL EXAMPLE
@namespace("example.avr")
protocol ExampleProtocol {
record User {
string name;
int? favorite_number;
string? favorite_color = “red”;
}
}
Optional: Schema Registry server
KAFKA STREAMS
 Library for Event Processing
 Write your event processing logic as a series of processors
 Map, aggregate, reduce events
 Write to tables, join with other tables
 Kafka Streams makes it fault-tolerant and highly scalable
 Load-balances the processors via Kafka partitioning
 Re-distributes the load between processors via intermediate topics
 Tables can be replicated (global tables) or partitioned (local tables)
STREAMS DEMO

More Related Content

Similar to Kafka Introduction.pptx

Cluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQCluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQShameera Rathnayaka
 
Copy of Kafka-Camus
Copy of Kafka-CamusCopy of Kafka-Camus
Copy of Kafka-CamusDeep Shah
 
Streaming Data with Apache Kafka
Streaming Data with Apache KafkaStreaming Data with Apache Kafka
Streaming Data with Apache KafkaMarkus Günther
 
Large scale, distributed and reliable messaging with Kafka
Large scale, distributed and reliable messaging with KafkaLarge scale, distributed and reliable messaging with Kafka
Large scale, distributed and reliable messaging with KafkaRafał Hryniewski
 
Kafka 10000 feet view
Kafka 10000 feet viewKafka 10000 feet view
Kafka 10000 feet viewyounessx01
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersJean-Paul Azar
 
Apache Big Data Europe 2015: Selected Talks
Apache Big Data Europe 2015: Selected TalksApache Big Data Europe 2015: Selected Talks
Apache Big Data Europe 2015: Selected TalksAndrii Gakhov
 
NaradaBrokering Grid Messaging and Applications as Web Services
NaradaBrokering Grid Messaging and Applications as Web ServicesNaradaBrokering Grid Messaging and Applications as Web Services
NaradaBrokering Grid Messaging and Applications as Web ServicesVideoguy
 
NaradaBrokering Grid Messaging and Applications as Web Services
NaradaBrokering Grid Messaging and Applications as Web ServicesNaradaBrokering Grid Messaging and Applications as Web Services
NaradaBrokering Grid Messaging and Applications as Web ServicesVideoguy
 
Brief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformBrief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformJean-Paul Azar
 
Introduction to Kafka
Introduction to KafkaIntroduction to Kafka
Introduction to KafkaDucas Francis
 
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...DataStax Academy
 
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...GeeksLab Odessa
 
Introducing KSML: Kafka Streams for low code environments | Jeroen van Dissel...
Introducing KSML: Kafka Streams for low code environments | Jeroen van Dissel...Introducing KSML: Kafka Streams for low code environments | Jeroen van Dissel...
Introducing KSML: Kafka Streams for low code environments | Jeroen van Dissel...HostedbyConfluent
 

Similar to Kafka Introduction.pptx (20)

Kafka overview
Kafka overviewKafka overview
Kafka overview
 
Cluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQCluster_Performance_Apache_Kafak_vs_RabbitMQ
Cluster_Performance_Apache_Kafak_vs_RabbitMQ
 
Copy of Kafka-Camus
Copy of Kafka-CamusCopy of Kafka-Camus
Copy of Kafka-Camus
 
Streaming Data with Apache Kafka
Streaming Data with Apache KafkaStreaming Data with Apache Kafka
Streaming Data with Apache Kafka
 
Large scale, distributed and reliable messaging with Kafka
Large scale, distributed and reliable messaging with KafkaLarge scale, distributed and reliable messaging with Kafka
Large scale, distributed and reliable messaging with Kafka
 
Kafka 10000 feet view
Kafka 10000 feet viewKafka 10000 feet view
Kafka 10000 feet view
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer Consumers
 
Apache Big Data Europe 2015: Selected Talks
Apache Big Data Europe 2015: Selected TalksApache Big Data Europe 2015: Selected Talks
Apache Big Data Europe 2015: Selected Talks
 
Kafka
KafkaKafka
Kafka
 
NaradaBrokering Grid Messaging and Applications as Web Services
NaradaBrokering Grid Messaging and Applications as Web ServicesNaradaBrokering Grid Messaging and Applications as Web Services
NaradaBrokering Grid Messaging and Applications as Web Services
 
NaradaBrokering Grid Messaging and Applications as Web Services
NaradaBrokering Grid Messaging and Applications as Web ServicesNaradaBrokering Grid Messaging and Applications as Web Services
NaradaBrokering Grid Messaging and Applications as Web Services
 
Brief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformBrief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming Platform
 
Kafka RealTime Streaming
Kafka RealTime StreamingKafka RealTime Streaming
Kafka RealTime Streaming
 
Introduction to Kafka
Introduction to KafkaIntroduction to Kafka
Introduction to Kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
 
Apache Kafka
Apache Kafka Apache Kafka
Apache Kafka
 
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
AI&BigData Lab 2016. Сарапин Виктор: Размер имеет значение: анализ по требова...
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Introducing KSML: Kafka Streams for low code environments | Jeroen van Dissel...
Introducing KSML: Kafka Streams for low code environments | Jeroen van Dissel...Introducing KSML: Kafka Streams for low code environments | Jeroen van Dissel...
Introducing KSML: Kafka Streams for low code environments | Jeroen van Dissel...
 

More from Geert Pante

OAuth2 and OpenID with Spring Boot
OAuth2 and OpenID with Spring BootOAuth2 and OpenID with Spring Boot
OAuth2 and OpenID with Spring BootGeert Pante
 
Kubernetes and Amazon ECS
Kubernetes and Amazon ECSKubernetes and Amazon ECS
Kubernetes and Amazon ECSGeert Pante
 
Docker in practice
Docker in practiceDocker in practice
Docker in practiceGeert Pante
 
Spring JMS and ActiveMQ
Spring JMS and ActiveMQSpring JMS and ActiveMQ
Spring JMS and ActiveMQGeert Pante
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELKGeert Pante
 
Spring 4 en spring data
Spring 4 en spring dataSpring 4 en spring data
Spring 4 en spring dataGeert Pante
 
Spring and SOA (2006)
Spring and SOA (2006)Spring and SOA (2006)
Spring and SOA (2006)Geert Pante
 
Maven plugins, properties en profiles: Advanced concepts in Maven
Maven plugins, properties en profiles: Advanced concepts in MavenMaven plugins, properties en profiles: Advanced concepts in Maven
Maven plugins, properties en profiles: Advanced concepts in MavenGeert Pante
 
The glory of REST in Java: Spring HATEOAS, RAML, Temenos IRIS
The glory of REST in Java: Spring HATEOAS, RAML, Temenos IRISThe glory of REST in Java: Spring HATEOAS, RAML, Temenos IRIS
The glory of REST in Java: Spring HATEOAS, RAML, Temenos IRISGeert Pante
 
Version Management in Maven
Version Management in MavenVersion Management in Maven
Version Management in MavenGeert Pante
 

More from Geert Pante (11)

OAuth2 and OpenID with Spring Boot
OAuth2 and OpenID with Spring BootOAuth2 and OpenID with Spring Boot
OAuth2 and OpenID with Spring Boot
 
Kubernetes and Amazon ECS
Kubernetes and Amazon ECSKubernetes and Amazon ECS
Kubernetes and Amazon ECS
 
Docker in practice
Docker in practiceDocker in practice
Docker in practice
 
Spring JMS and ActiveMQ
Spring JMS and ActiveMQSpring JMS and ActiveMQ
Spring JMS and ActiveMQ
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELK
 
Java EE 6
Java EE 6Java EE 6
Java EE 6
 
Spring 4 en spring data
Spring 4 en spring dataSpring 4 en spring data
Spring 4 en spring data
 
Spring and SOA (2006)
Spring and SOA (2006)Spring and SOA (2006)
Spring and SOA (2006)
 
Maven plugins, properties en profiles: Advanced concepts in Maven
Maven plugins, properties en profiles: Advanced concepts in MavenMaven plugins, properties en profiles: Advanced concepts in Maven
Maven plugins, properties en profiles: Advanced concepts in Maven
 
The glory of REST in Java: Spring HATEOAS, RAML, Temenos IRIS
The glory of REST in Java: Spring HATEOAS, RAML, Temenos IRISThe glory of REST in Java: Spring HATEOAS, RAML, Temenos IRIS
The glory of REST in Java: Spring HATEOAS, RAML, Temenos IRIS
 
Version Management in Maven
Version Management in MavenVersion Management in Maven
Version Management in Maven
 

Recently uploaded

The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 

Recently uploaded (20)

The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 

Kafka Introduction.pptx

  • 1. KAFKA INTRODUCTION KAFKA VS JMS: SIMILARITIES AND DIFFERENCES
  • 2. EXAMPLE: MATECO Mateco Business ‘Streams’ • Microsoft Dynamics 365 • CRM + ERP • Invoicing • Q.Rent • Q.Planning • Q.Service • Q.Trade • Integrations Each ‘Stream’ has their own IT squad and their own set of applications.
  • 3. WHY DO WE NEED MESSAGES/EVENTS?  IT Architecture with multiple teams  Make the teams independent as much as possible  No central database  No distributed transactions  No blocking REST calls  But: Keep data consistent across teams’ services  ‘Eventual Consistency’  Solution  each service publishes an event whenever it updates its data  other services subscribe to events  when an event is received, that service updates its data
  • 4. EVENT STREAMING  Information is packaged as ‘events’, with enough information for any consumer to be able to handle it  The event is published in a central event store  The sender does not need to know which consumers will process it  Allows:  Replicate information across independent services  In pseudo-real-time  Create new applications without disrupting other applications
  • 5. JMS TOPICS AND QUEUES VS KAFKA TOPICS JMS  Topic: publish-subscribe  All subscribers receive all messages  Messages are only sent to active subscribers  Queue: send-receive  Messages are queue’d until a consumer consumes it.  Allows horizontal scaling  Needs a separate queue for a different receiver group Kafka Topic  Stores all messages  Limited retention (optional)  Stores offsets on the server for each consumer group  Multiple consumer(group)s can receive all messages  Consumers can restart from the beginning  New consumers can be added
  • 7. KAFKA ARCHITECTURE: BROKERS Broker: Server for Partitions • Partition master • Partition replica ZooKeeper • Used for synchronization within brokers • Optional since Kafka 2.8 • Older Kafka versions used ZooKeeper for connection management and to store offsets
  • 8. MESSAGE GROUPS VS PARTITIONS JMS  Message Group ID is an optional header  Guarantees that messages for the same Message Group ID are processed by the same thread KAFKA  Each message can have a key  A topic is divided into partitions, each message will be put into a partition based on a hash of its key  Random partition if there’s no key  Subscribers get a fixed number of partitions assigned  Partitions are also used for horizontal scaling  Partitions are distributed across servers  Consumers only need to connect to owners of their partitions
  • 9. KAFKA DEMO  Run Kafka on Kubernetes  Using helm charts, e.g: https://artifacthub.io/packages/helm/bitnami/kafka  Alternative: Confluent Cloud, AWS Managed Kafka, …  Demo application using Spring Boot: https://www.baeldung.com/java-kafka-streams-vs-kafka-consumer
  • 10. KAFKA TOPIC CLEANUP POLICY Kafka topics are stored in append-only segments.  Cleanup is still done, but per segment, not per message  Time-based retention  Size-based retention  Unlimited retention  Each topic has a cleanup policy that decides what happens when the retention expires  Delete: delete the oldest messages  Compact: delete messages with duplicate keys  Keep only the latest version of the value for each key  How to delete a key: overwrite it with an empty value, a.k.a ‘tombstone’
  • 11. KEY/VALUE SERIALIZATION  Message Keys and Values are Binary, but can be serialized/deserialized by the client  AVRO is a popular encoding format, more compact than JSON  You can generate java classes from AVRO schema’s
  • 12. AVRO IDL EXAMPLE @namespace("example.avr") protocol ExampleProtocol { record User { string name; int? favorite_number; string? favorite_color = “red”; } } Optional: Schema Registry server
  • 13. KAFKA STREAMS  Library for Event Processing  Write your event processing logic as a series of processors  Map, aggregate, reduce events  Write to tables, join with other tables  Kafka Streams makes it fault-tolerant and highly scalable  Load-balances the processors via Kafka partitioning  Re-distributes the load between processors via intermediate topics  Tables can be replicated (global tables) or partitioned (local tables)