
Apache Kafka Architecture & Fundamentals Explained


Watch this talk here: https://www.confluent.io/online-talks/apache-kafka-architecture-and-fundamentals-explained-on-demand

This session explains Apache Kafka’s internal design and architecture. Companies like LinkedIn are now sending more than 1 trillion messages per day to Apache Kafka. Learn about the underlying design in Kafka that leads to such high throughput.

This talk provides a comprehensive overview of Kafka architecture and internal functions, including:
- Topics, partitions and segments
- The commit log and streams
- Brokers and broker replication
- Producer basics
- Consumers, consumer groups and offsets

This session is part 2 of 4 in our Fundamentals for Apache Kafka series.



Apache Kafka Architecture & Fundamentals Explained

  1. Fundamentals for Apache Kafka®: Apache Kafka Architecture & Fundamentals Explained. Joe Desmond, Sr. Technical Trainer, Confluent
  2. Session Schedule
     ● Session 1: Benefits of Stream Processing and Apache Kafka Use Cases
     ● Session 2: Apache Kafka Architecture & Fundamentals Explained
     ● Session 3: How Apache Kafka Works
     ● Session 4: Integrating Apache Kafka into your Environment
  3. Learning Objectives. After this module you will be able to:
     ● Identify the key elements in a Kafka cluster
     ● Name the essential responsibilities of each key element
     ● Explain what a Topic is and describe its relation to Partitions and Segments
  4. The World Produces Data
  5. Producers
  6. Kafka Brokers
  7. Consumers
  8. Architecture
  9. Decoupling Producers and Consumers
     ● Producers and Consumers are decoupled
     ● Slow Consumers do not affect Producers
     ● Add Consumers without affecting Producers
     ● Failure of Consumer does not affect System
  10. How Kafka Uses ZooKeeper
  11. ZooKeeper Basics
     ● Open Source Apache Project
     ● Distributed Key Value Store
     ● Maintains configuration information
     ● Stores ACLs and Secrets
     ● Enables highly reliable distributed coordination
     ● Provides distributed synchronization
     ● Three or five servers form an ensemble
  12. Topics
     ● Topics: Streams of “related” Messages in Kafka
       ○ A Topic is a Logical Representation
       ○ Categorizes Messages into Groups
     ● Developers define Topics
     ● Producer-to-Topic relation: N to N
     ● Unlimited Number of Topics
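To make the "Developers define Topics" point concrete, here is a minimal sketch (not part of the original deck) that creates a topic with the Java AdminClient; the broker address, the topic name "payments", and the partition and replication counts are illustrative assumptions.

    // Minimal sketch: creating a topic programmatically with the Java AdminClient.
    // Topic name, partition count, and replication factor are illustrative only.
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateTopicExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // 6 partitions, replication factor 3 (assumes a cluster with at least 3 brokers)
                NewTopic topic = new NewTopic("payments", 6, (short) 3);
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
        }
    }

The same operation can also be performed from the command line with the kafka-topics tool that ships with Kafka.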
  13. Topics, Partitions, and Segments
  14. Topics, Partitions, and Segments
  15. The Log
  16. Log Structured Data Flow
  17. The Stream
  18. Data Elements
  19. Brokers Manage Partitions
     ● Messages of Topic spread across Partitions
     ● Partitions spread across Brokers
     ● Each Broker handles many Partitions
     ● Each Partition stored on Broker’s disk
     ● Partition: 1..n log files
     ● Each message in Log identified by Offset
     ● Configurable Retention Policy
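The "Configurable Retention Policy" bullet can be illustrated with a small sketch (not from the deck) that sets a seven-day retention on a single topic via the Java AdminClient; the topic name and the retention value are illustrative assumptions.

    // Minimal sketch: retention is configurable per topic.
    // "payments" and the 7-day value are illustrative, not taken from the slides.
    import java.util.Collection;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class RetentionConfigExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "payments");
                AlterConfigOp setRetention = new AlterConfigOp(
                        new ConfigEntry("retention.ms", "604800000"),  // 7 days in milliseconds
                        AlterConfigOp.OpType.SET);
                Map<ConfigResource, Collection<AlterConfigOp>> updates =
                        Map.of(topic, List.of(setRetention));
                admin.incrementalAlterConfigs(updates).all().get();
            }
        }
    }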
  20. Broker Basics
     ● Producer sends Messages to Brokers
     ● Brokers receive and store Messages
     ● A Kafka Cluster can have many Brokers
     ● Each Broker manages multiple Partitions
  21. Broker Replication
  22. Producer Basics
     ● Producers write Data as Messages
     ● Can be written in any language
       ○ Native: Java, C/C++, Python, Go, .NET, JMS
       ○ More Languages by Community
       ○ REST Server for any unsupported Language
     ● Command Line Producer Tool
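As a hedged illustration of the native Java client mentioned above (not part of the original deck), a minimal producer might look like this; the broker address, topic name, key, and value are illustrative assumptions.

    // Minimal sketch of a Java producer: serialize String key/value pairs
    // and send one message to the (hypothetical) "payments" topic.
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // The message key (here a customer id) influences which partition is chosen
                producer.send(new ProducerRecord<>("payments", "customer-42", "amount=99.95"));
                producer.flush();
            }
        }
    }

The command line producer tool referenced on the slide is kafka-console-producer, which serves the same purpose for quick tests.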
  23. Load Balancing and Semantic Partitioning
     ● Producers use a Partitioning Strategy to assign each message to a Partition
     ● Two Purposes:
       ○ Load Balancing
       ○ Semantic Partitioning
     ● Partitioning Strategy specified by Producer
       ○ Default Strategy: hash(key) % number_of_partitions
       ○ No Key: Round-Robin
     ● Custom Partitioner possible
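The "Custom Partitioner possible" bullet can be sketched as follows; this hypothetical partitioner is not from the deck and only shows the shape of the Partitioner interface. It sends keys starting with "vip-" to partition 0 (semantic partitioning) and hashes everything else across the partitions (load balancing).

    // Hypothetical custom partitioning strategy, for illustration only.
    import java.util.Arrays;
    import java.util.Map;
    import java.util.concurrent.ThreadLocalRandom;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;

    public class VipFirstPartitioner implements Partitioner {
        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionCountForTopic(topic);
            if (key instanceof String && ((String) key).startsWith("vip-")) {
                return 0;  // semantic partitioning: all "vip-" keys land on partition 0
            }
            // load balancing: keyless messages get a random partition, keyed ones a hash
            int hash = (keyBytes == null)
                    ? ThreadLocalRandom.current().nextInt()
                    : Arrays.hashCode(keyBytes);
            return (hash & 0x7fffffff) % numPartitions;
        }

        @Override
        public void configure(Map<String, ?> configs) { }

        @Override
        public void close() { }
    }

A producer would opt into such a class via its partitioner.class configuration property.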
  24. Consumer Basics
     ● Consumers pull messages from 1..n topics
     ● New inflowing messages are automatically retrieved
     ● Consumer offset
       ○ Keeps track of the last message read
       ○ Is stored in a special topic
     ● CLI tools exist to read from cluster
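For illustration (not part of the original deck), a minimal Java consumer that joins a hypothetical group "payments-app", subscribes to the "payments" topic, and lets the broker track its offset in the internal __consumer_offsets topic might look like this; the broker address, group id, and topic name are assumptions.

    // Minimal sketch of a Java consumer in a consumer group.
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class ConsumerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "payments-app");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singleton("payments"));
                while (true) {
                    // poll() pulls newly arrived messages; the offset advances as records are read
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                record.partition(), record.offset(), record.value());
                    }
                }
            }
        }
    }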
  25. Consumer Offset
  26. Distributed Consumption
  27. Scalable Data Pipeline
  28. Q&A. Questions:
     ● Why do we need an odd number of ZooKeeper nodes?
     ● How many Kafka brokers can a cluster maximally have?
     ● How many Kafka brokers do you minimally need for high availability?
     ● What are the criteria for two or more consumers to form a consumer group?
  29. Continue your Apache Kafka Education!
     ● Confluent Operations for Apache Kafka
     ● Confluent Developer Skills for Building Apache Kafka
     ● Confluent Stream Processing using Apache Kafka Streams and KSQL
     ● Confluent Advanced Skills for Optimizing Apache Kafka
     For more details, see http://confluent.io/training
  30. Certifications
     ● Confluent Certified Developer for Apache Kafka (aligns to Confluent Developer Skills for Building Apache Kafka course)
     ● Confluent Certified Administrator for Apache Kafka (aligns to Confluent Operations Skills for Apache Kafka)
     What you Need to Know
       ○ Qualifications: 6-to-9 months hands-on experience
       ○ Duration: 90 mins
       ○ Availability: Live, online 24/7
       ○ Cost: $150
       ○ Register online: www.confluent.io/certification
  31. Stay in touch! cnfl.io/slack | cnfl.io/kafka-training | cnfl.io/download
  32. Thank you for attending!
     • Thank you for attending the session!
     • Feedback to: training-admin@confluent.io
  33. Copyright © Confluent, Inc. 2014-2019. Privacy Policy | Terms & Conditions. Apache, Apache Kafka, Kafka and the Kafka logo are trademarks of the Apache Software Foundation
