SlideShare a Scribd company logo
1
Fundamentals for Apache Kafka®
Apache Kafka Architecture & Fundamentals Explained
Joe Desmond, Sr. Technical Trainer, Confluent
2
Session Schedule
● Session 1: Benefits of Stream Processing and Apache Kafka Use Cases
● Session 2: Apache Kafka Architecture & Fundamentals Explained
● Session 3: How Apache Kafka Works
● Session 4: Integrating Apache Kafka into your Environment
3
Learning Objectives
After this module you will be able to:
● Identify the key elements in a Kafka cluster
● Name the essential responsibilities of each key
element
● Explain what a Topic is and describe its relation to
Partitions and Segments
4
The World Produces Data
5
Producers
6
Kafka Brokers
7
Consumers
8
Architecture
9
Decoupling Producers and Consumers
● Producers and Consumers are decoupled
● Slow Consumers do not affect Producers
● Add Consumers without affecting Producers
● Failure of Consumer does not affect System
10
How Kafka
Uses
ZooKeeper
11
ZooKeeper Basics
● Open Source Apache Project
● Distributed Key Value Store
● Maintains configuration information
● Stores ACLs and Secrets
● Enables highly reliable distributed coordination
● Provides distributed synchronization
● Three or five servers form an ensemble
12
Topics
● Topics: Streams of “related” Messages in Kafka
○ Is a Logical Representation
○ Categorizes Messages into Groups
● Developers define Topics
● Producer Topic: N to N Relation
● Unlimited Number of Topics
13
Topics, Partitions, and Segments
14
Topics, Partitions, and Segments
15
The Log
16
Log Structured Data Flow
17
The Stream
18
Data Elements
19
Brokers Manage Partitions
● Messages of Topic spread across Partitions
● Partitions spread across Brokers
● Each Broker handles many Partitions
● Each Partition stored on Broker’s disk
● Partition: 1..n log files
● Each message in Log identified by Offset
● Configurable Retention Policy
20
Broker Basics
● Producer sends Messages to
Brokers
● Brokers receive and store
Messages
● A Kafka Cluster can have many
Brokers
● Each Broker manages multiple
Partitions
21
Broker Replication
22
Producer Basics
● Producers write Data as Messages
● Can be written in any language
○ Native: Java, C/C++, Python, Go,, .NET, JMS
○ More Languages by Community
○ REST Server for any unsupported Language
● Command Line Producer Tool
23
Load Balancing and Semantic Partitioning
● Producers use a Partitioning Strategy to assign each message to a Partition
● Two Purposes:
○ Load Balancing
○ Semantic Partitioning
● Partitioning Strategy specified by Producer
○ Default Strategy: hash(key) % number_of_partitions
○ No Key Round-Robin
● Custom Partitioner possible
24
Consumer Basics
● Consumers pull messages from 1..n topics
● New inflowing messages are automatically retrieved
● Consumer offset
○ Keeps track of the last message read
○ Is stored in special topic
● CLI tools exist to read from cluster
25
Consumer Offset
26
Distributed Consumption
27
Scalable Data Pipeline
28
Q&A
Questions:
● Why do we need an odd number of ZooKeeper nodes?
● How many Kafka brokers can a cluster maximally have?
● How many Kafka brokers do you minimally need for high
availability?
● What is the criteria that two or more consumers form a
consumer group?
29
Continue your Apache Kafka Education!
● Confluent Operations for Apache Kafka
● Confluent Developer Skills for Building Apache Kafka
● Confluent Stream Processing using Apache Kafka Streams
and KSQL
● Confluent Advanced Skills for Optimizing Apache Kafka
For more details, seehttp://confluent.io/training
30
30
Certifications
Confluent Certified Developer
for Apache Kafka
(aligns to Confluent Developer Skills
for Building Apache Kafka course)
Confluent Certified
Administrator for Apache
Kafka
(aligns to Confluent Operations Skills
for Apache Kafka)
What you Need to Know
○ Qualifications: 6-to-9 months hands-on
experience
○ Duration: 90 mins
○ Availability: Live, online 24/7
○ Cost: $150
○ Register online:
www.confluent.io/certification
31
31
cnfl.io/slack
Stay in touch!
cnfl.io/kafka-training
cnfl.io/download
32
Thank you for attending!
• Thank you for attending thesession!
• Feedback to: training-admin@confluent.io
33
Copyright ©Confluent, Inc. 2014-2019. Privacy Policy | Terms &Conditions.
Apache, Apache Kafka, Kafka and the Kafka logo are trademarks of
the Apache Software Foundation

More Related Content

What's hot

What's hot (20)

Stream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksStream Processing – Concepts and Frameworks
Stream Processing – Concepts and Frameworks
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
kafka
kafkakafka
kafka
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Kafka At Scale in the Cloud
Kafka At Scale in the CloudKafka At Scale in the Cloud
Kafka At Scale in the Cloud
 
Stream Processing with Apache Kafka and .NET
Stream Processing with Apache Kafka and .NETStream Processing with Apache Kafka and .NET
Stream Processing with Apache Kafka and .NET
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overview
 
Apache Kafka - Overview
Apache Kafka - OverviewApache Kafka - Overview
Apache Kafka - Overview
 

Similar to Apache KAfka

Copy of Kafka-Camus
Copy of Kafka-CamusCopy of Kafka-Camus
Copy of Kafka-Camus
Deep Shah
 

Similar to Apache KAfka (20)

Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
 
kafka
kafkakafka
kafka
 
Event driven architectures with Kinesis
Event driven architectures with KinesisEvent driven architectures with Kinesis
Event driven architectures with Kinesis
 
Citi Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and PerformanceCiti Tech Talk: Monitoring and Performance
Citi Tech Talk: Monitoring and Performance
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache Kafka
 
Apache Kafka's Common Pitfalls & Intricacies: A Customer Support Perspective
Apache Kafka's Common Pitfalls & Intricacies: A Customer Support PerspectiveApache Kafka's Common Pitfalls & Intricacies: A Customer Support Perspective
Apache Kafka's Common Pitfalls & Intricacies: A Customer Support Perspective
 
Kafka in action - Tech Talk - Paytm
Kafka in action - Tech Talk - PaytmKafka in action - Tech Talk - Paytm
Kafka in action - Tech Talk - Paytm
 
Uber: Kafka Consumer Proxy
Uber: Kafka Consumer ProxyUber: Kafka Consumer Proxy
Uber: Kafka Consumer Proxy
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Apache Kafka - Free Friday
Apache Kafka - Free FridayApache Kafka - Free Friday
Apache Kafka - Free Friday
 
Insta clustr seattle kafka meetup presentation bb
Insta clustr seattle kafka meetup presentation   bbInsta clustr seattle kafka meetup presentation   bb
Insta clustr seattle kafka meetup presentation bb
 
Twitter’s Apache Kafka Adoption Journey | Ming Liu, Twitter
Twitter’s Apache Kafka Adoption Journey | Ming Liu, TwitterTwitter’s Apache Kafka Adoption Journey | Ming Liu, Twitter
Twitter’s Apache Kafka Adoption Journey | Ming Liu, Twitter
 
Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018
 
Copy of Kafka-Camus
Copy of Kafka-CamusCopy of Kafka-Camus
Copy of Kafka-Camus
 
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ... A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 
How Much Kafka?
How Much Kafka?How Much Kafka?
How Much Kafka?
 
Tokyo AK Meetup Speedtest - Share.pdf
Tokyo AK Meetup Speedtest - Share.pdfTokyo AK Meetup Speedtest - Share.pdf
Tokyo AK Meetup Speedtest - Share.pdf
 
Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021
 
Day in the life event-driven workshop
Day in the life  event-driven workshopDay in the life  event-driven workshop
Day in the life event-driven workshop
 

Recently uploaded

Recently uploaded (20)

Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
 
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 

Apache KAfka

  • 1. 1 Fundamentals for Apache Kafka® Apache Kafka Architecture & Fundamentals Explained Joe Desmond, Sr. Technical Trainer, Confluent
  • 2. 2 Session Schedule ● Session 1: Benefits of Stream Processing and Apache Kafka Use Cases ● Session 2: Apache Kafka Architecture & Fundamentals Explained ● Session 3: How Apache Kafka Works ● Session 4: Integrating Apache Kafka into your Environment
  • 3. 3 Learning Objectives After this module you will be able to: ● Identify the key elements in a Kafka cluster ● Name the essential responsibilities of each key element ● Explain what a Topic is and describe its relation to Partitions and Segments
  • 9. 9 Decoupling Producers and Consumers ● Producers and Consumers are decoupled ● Slow Consumers do not affect Producers ● Add Consumers without affecting Producers ● Failure of Consumer does not affect System
  • 11. 11 ZooKeeper Basics ● Open Source Apache Project ● Distributed Key Value Store ● Maintains configuration information ● Stores ACLs and Secrets ● Enables highly reliable distributed coordination ● Provides distributed synchronization ● Three or five servers form an ensemble
  • 12. 12 Topics ● Topics: Streams of “related” Messages in Kafka ○ Is a Logical Representation ○ Categorizes Messages into Groups ● Developers define Topics ● Producer Topic: N to N Relation ● Unlimited Number of Topics
  • 19. 19 Brokers Manage Partitions ● Messages of Topic spread across Partitions ● Partitions spread across Brokers ● Each Broker handles many Partitions ● Each Partition stored on Broker’s disk ● Partition: 1..n log files ● Each message in Log identified by Offset ● Configurable Retention Policy
  • 20. 20 Broker Basics ● Producer sends Messages to Brokers ● Brokers receive and store Messages ● A Kafka Cluster can have many Brokers ● Each Broker manages multiple Partitions
  • 22. 22 Producer Basics ● Producers write Data as Messages ● Can be written in any language ○ Native: Java, C/C++, Python, Go,, .NET, JMS ○ More Languages by Community ○ REST Server for any unsupported Language ● Command Line Producer Tool
  • 23. 23 Load Balancing and Semantic Partitioning ● Producers use a Partitioning Strategy to assign each message to a Partition ● Two Purposes: ○ Load Balancing ○ Semantic Partitioning ● Partitioning Strategy specified by Producer ○ Default Strategy: hash(key) % number_of_partitions ○ No Key Round-Robin ● Custom Partitioner possible
  • 24. 24 Consumer Basics ● Consumers pull messages from 1..n topics ● New inflowing messages are automatically retrieved ● Consumer offset ○ Keeps track of the last message read ○ Is stored in special topic ● CLI tools exist to read from cluster
  • 28. 28 Q&A Questions: ● Why do we need an odd number of ZooKeeper nodes? ● How many Kafka brokers can a cluster maximally have? ● How many Kafka brokers do you minimally need for high availability? ● What is the criteria that two or more consumers form a consumer group?
  • 29. 29 Continue your Apache Kafka Education! ● Confluent Operations for Apache Kafka ● Confluent Developer Skills for Building Apache Kafka ● Confluent Stream Processing using Apache Kafka Streams and KSQL ● Confluent Advanced Skills for Optimizing Apache Kafka For more details, seehttp://confluent.io/training
  • 30. 30 30 Certifications Confluent Certified Developer for Apache Kafka (aligns to Confluent Developer Skills for Building Apache Kafka course) Confluent Certified Administrator for Apache Kafka (aligns to Confluent Operations Skills for Apache Kafka) What you Need to Know ○ Qualifications: 6-to-9 months hands-on experience ○ Duration: 90 mins ○ Availability: Live, online 24/7 ○ Cost: $150 ○ Register online: www.confluent.io/certification
  • 32. 32 Thank you for attending! • Thank you for attending thesession! • Feedback to: training-admin@confluent.io
  • 33. 33 Copyright ©Confluent, Inc. 2014-2019. Privacy Policy | Terms &Conditions. Apache, Apache Kafka, Kafka and the Kafka logo are trademarks of the Apache Software Foundation