SlideShare a Scribd company logo
1 of 9
Distributed streaming platform:
•Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system
•Store streams of records in a fault-tolerant durable way
•Process streams of records as they occur
•
•Kafka is run as a cluster on one or more servers that can span multiple datacenters.
•The Kafka cluster stores streams of records in categories called topics.
•Each record consists of a key, a value, and a timestamp.
•
Used for:
•
•Building real-time streaming data pipelines that reliably get data between systems or applications
•Building real-time streaming applications that transform or react to the streams of data
What is Kafka?
Kafka API
Kafka API
The Producer API:
Allows an application to publish a stream of records to one or more Kafka topics.
The Consumer API:
Allows an application to subscribe to one or more topics and process the stream of records
produced to them.
The Streams API allows:
An application to act as a stream processor, consuming an input stream from one or more topics
and producing an output stream to one or more output topics, effectively transforming the input
streams to output streams.
The Connector API:
Allows building and running reusable producers or consumers that connect Kafka topics to existing
applications or data systems. For example, a connector to a relational database might capture
every change to a table.
Kafka Communication
Protocol: TCP Protocol
Communication between client and server is done
•simple
•high-performance
•language agnostic
•
Java client for Kafka is provided and available in following language
C/C++, Python, Go (AKA golang), Erlang, .NET, Clojure, Ruby, Node.js, Proxy (HTTP
REST, etc), Perl, stdin/stdout, PHP, Rust, Alternative Java
Storm, Scala DSL , Clojure, Swift, Client Libraries Previously Supported
Kafka Topics
•Topic is a category or feed name to which records are published
•Topics in Kafka are always multi-subscriber
•A topic can have zero, one, or many consumers that subscribe to the data written to it
•For each topic, the Kafka cluster maintains a partitioned log that looks like this:
Kafka Partitions
Each partition:
•Is an ordered,
•immutable sequence of records that is continually appended to—a
structured commit log
Offset:
•The records in the partitions are each assigned a sequential id number
called the offset that uniquely identifies each record within the partition
Records:
•Records configurable retention period
•If the retention policy is set to two days, then for the two days after a record
is published, it is available for consumption, after which it will be discarded to
free up space.
Performance:
•Effectively constant with respect to data size so storing data for a long time
is not a problem
Kafka Consumer Group
•Each consumer in the group is assigned a set of partitions to consume from. They will receive messages from a different
subset of the partitions in the topic. Kafka guarantees that a message is only read by a single consumer in the group.
•The number of partitions impacts the maximum parallelism of consumers as you cannot have more consumers than
partitions.
•Kafka only provides a total order over records within a partition, not between different partitions in a topic. Per-partition
ordering combined with the ability to partition data by key is sufficient for most applications. However, if you require a total
order over records this can be achieved with a topic that has only one partition, though this will mean only one consumer
process per consumer group.
Kafka At Yelp with MySQL
MySQL Kafka and Flat File

More Related Content

What's hot

Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaAngelo Cesaro
 
Kafka Basic For Beginners
Kafka Basic For BeginnersKafka Basic For Beginners
Kafka Basic For BeginnersRiby Varghese
 
Kafka Tutorial - basics of the Kafka streaming platform
Kafka Tutorial - basics of the Kafka streaming platformKafka Tutorial - basics of the Kafka streaming platform
Kafka Tutorial - basics of the Kafka streaming platformJean-Paul Azar
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overviewiamtodor
 
Apache Kafka: Next Generation Distributed Messaging System
Apache Kafka: Next Generation Distributed Messaging SystemApache Kafka: Next Generation Distributed Messaging System
Apache Kafka: Next Generation Distributed Messaging SystemEdureka!
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache KafkaJoe Stein
 
Kafka clients and emitters
Kafka clients and emittersKafka clients and emitters
Kafka clients and emittersEdgar Domingues
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Timothy Spann
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin PodvalMartin Podval
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Christopher Curtin
 

What's hot (20)

Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
Kafka Basic For Beginners
Kafka Basic For BeginnersKafka Basic For Beginners
Kafka Basic For Beginners
 
Kafka Tutorial - basics of the Kafka streaming platform
Kafka Tutorial - basics of the Kafka streaming platformKafka Tutorial - basics of the Kafka streaming platform
Kafka Tutorial - basics of the Kafka streaming platform
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overview
 
Kafka tutorial
Kafka tutorialKafka tutorial
Kafka tutorial
 
Kafka basics
Kafka basicsKafka basics
Kafka basics
 
kafka
kafkakafka
kafka
 
ES & Kafka
ES & KafkaES & Kafka
ES & Kafka
 
Apache Kafka: Next Generation Distributed Messaging System
Apache Kafka: Next Generation Distributed Messaging SystemApache Kafka: Next Generation Distributed Messaging System
Apache Kafka: Next Generation Distributed Messaging System
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 
Kafka clients and emitters
Kafka clients and emittersKafka clients and emitters
Kafka clients and emitters
 
kafka for db as postgres
kafka for db as postgreskafka for db as postgres
kafka for db as postgres
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
 
Flume vs. kafka
Flume vs. kafkaFlume vs. kafka
Flume vs. kafka
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 

Similar to Kafka

Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginerYousun Jeong
 
apachekafka-160907180205.pdf
apachekafka-160907180205.pdfapachekafka-160907180205.pdf
apachekafka-160907180205.pdfTarekHamdi8
 
Kafka syed academy_v1_introduction
Kafka syed academy_v1_introductionKafka syed academy_v1_introduction
Kafka syed academy_v1_introductionSyed Hadoop
 
Unleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxUnleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxKnoldus Inc.
 
kafka_session_updated.pptx
kafka_session_updated.pptxkafka_session_updated.pptx
kafka_session_updated.pptxKoiuyt1
 
Distributed messaging with Apache Kafka
Distributed messaging with Apache KafkaDistributed messaging with Apache Kafka
Distributed messaging with Apache KafkaSaumitra Srivastav
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland HochmuthNETWAYS
 
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland HochmuthNETWAYS
 
Kafka pub sub demo
Kafka pub sub demoKafka pub sub demo
Kafka pub sub demoSrish Kumar
 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...Lucas Jellema
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersJean-Paul Azar
 
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...Denodo
 
Brief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformBrief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformJean-Paul Azar
 

Similar to Kafka (20)

Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginer
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
apachekafka-160907180205.pdf
apachekafka-160907180205.pdfapachekafka-160907180205.pdf
apachekafka-160907180205.pdf
 
Kafka syed academy_v1_introduction
Kafka syed academy_v1_introductionKafka syed academy_v1_introduction
Kafka syed academy_v1_introduction
 
Unleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxUnleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptx
 
kafka_session_updated.pptx
kafka_session_updated.pptxkafka_session_updated.pptx
kafka_session_updated.pptx
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Distributed messaging with Apache Kafka
Distributed messaging with Apache KafkaDistributed messaging with Apache Kafka
Distributed messaging with Apache Kafka
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
 
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
 
Kafka pub sub demo
Kafka pub sub demoKafka pub sub demo
Kafka pub sub demo
 
Kafka overview
Kafka overviewKafka overview
Kafka overview
 
Notes leo kafka
Notes leo kafkaNotes leo kafka
Notes leo kafka
 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer Consumers
 
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
 
Brief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming PlatformBrief introduction to Kafka Streaming Platform
Brief introduction to Kafka Streaming Platform
 
Kafka.pptx
Kafka.pptxKafka.pptx
Kafka.pptx
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
 

Recently uploaded

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Recently uploaded (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Kafka

  • 1. Distributed streaming platform: •Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system •Store streams of records in a fault-tolerant durable way •Process streams of records as they occur • •Kafka is run as a cluster on one or more servers that can span multiple datacenters. •The Kafka cluster stores streams of records in categories called topics. •Each record consists of a key, a value, and a timestamp. • Used for: • •Building real-time streaming data pipelines that reliably get data between systems or applications •Building real-time streaming applications that transform or react to the streams of data What is Kafka?
  • 3. Kafka API The Producer API: Allows an application to publish a stream of records to one or more Kafka topics. The Consumer API: Allows an application to subscribe to one or more topics and process the stream of records produced to them. The Streams API allows: An application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams to output streams. The Connector API: Allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table.
  • 4. Kafka Communication Protocol: TCP Protocol Communication between client and server is done •simple •high-performance •language agnostic • Java client for Kafka is provided and available in following language C/C++, Python, Go (AKA golang), Erlang, .NET, Clojure, Ruby, Node.js, Proxy (HTTP REST, etc), Perl, stdin/stdout, PHP, Rust, Alternative Java Storm, Scala DSL , Clojure, Swift, Client Libraries Previously Supported
  • 5. Kafka Topics •Topic is a category or feed name to which records are published •Topics in Kafka are always multi-subscriber •A topic can have zero, one, or many consumers that subscribe to the data written to it •For each topic, the Kafka cluster maintains a partitioned log that looks like this:
  • 6. Kafka Partitions Each partition: •Is an ordered, •immutable sequence of records that is continually appended to—a structured commit log Offset: •The records in the partitions are each assigned a sequential id number called the offset that uniquely identifies each record within the partition Records: •Records configurable retention period •If the retention policy is set to two days, then for the two days after a record is published, it is available for consumption, after which it will be discarded to free up space. Performance: •Effectively constant with respect to data size so storing data for a long time is not a problem
  • 7. Kafka Consumer Group •Each consumer in the group is assigned a set of partitions to consume from. They will receive messages from a different subset of the partitions in the topic. Kafka guarantees that a message is only read by a single consumer in the group. •The number of partitions impacts the maximum parallelism of consumers as you cannot have more consumers than partitions. •Kafka only provides a total order over records within a partition, not between different partitions in a topic. Per-partition ordering combined with the ability to partition data by key is sufficient for most applications. However, if you require a total order over records this can be achieved with a topic that has only one partition, though this will mean only one consumer process per consumer group.
  • 8. Kafka At Yelp with MySQL
  • 9. MySQL Kafka and Flat File