Kafka Internals
Ayyappadas Ravindran
LinkedIn Bangalore SRE Team
Introduction
• Who am I ?
– Ayyappadas Ravindran
– Staff SRE at LinkedIn
– Responsible for the Data Infra Streaming team
• What is this talk about ?
– Kafka building blocks in detail
– Operating Kafka
– Data assurance with Kafka
– Kafka 0.9
Agenda
• Kafka – Reminder !
• Zookeeper
• Kafka Cluster – Brokers
• Kafka – Message
• Producers
• Schema Registry
• Consumers
• Data Assurance
• What is new in Kafka (Kafka 0.9)
• Q & A
Kafka Pub/Sub Basics – Reminder !
[Diagram: a producer publishes to a Kafka broker holding topic A partitions (P0 and P1, plus a replica of P0); a consumer reads from the broker, and Zookeeper coordinates the cluster]
Zookeeper
• Distributed coordination service
• Also used for maintaining configuration
• Guarantees
– Order
– Atomicity
– Reliability
• Simple API
• Hierarchical Namespace
• Ephemeral Nodes
• Watches
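To make ephemeral nodes and watches concrete, here is a minimal leader-election sketch against the ZooKeeper Java client, the same pattern the speaker notes describe. The connect string, session timeout and the /election path are illustrative (the parent znode is assumed to already exist); this is not code from the talk.

import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class LeaderElectionSketch {
    public static void main(String[] args) throws Exception {
        // Connect to ZooKeeper (address and session timeout are illustrative).
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });

        // Each candidate creates an EPHEMERAL_SEQUENTIAL znode: ZooKeeper appends a
        // monotonically increasing suffix, and the node vanishes when the session dies.
        String me = zk.create("/election/candidate-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        // The candidate owning the znode with the lowest sequence number is the leader.
        List<String> children = zk.getChildren("/election", false);
        Collections.sort(children);
        boolean leader = me.endsWith(children.get(0));
        System.out.println(leader ? "I am the leader" : "Follower, watching " + children.get(0));
    }
}

Followers would typically set a watch (e.g. zk.exists(path, watcher)) on the next-lowest candidate to learn when the leader goes away; Kafka's controller election uses the same ephemeral-znode idea.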
Zookeeper in Kafka ecosystem
• Used to store metadata information
– About brokers
– About topics & partitions
– Consumers / Consumer groups
• Service coordination
– Controller election
– For administrative tasks
Zookeeper at LinkedIn
• We are running Zookeeper 3.4
• Cluster of 5 (participants) + 1 (observer)
• Network and power redundancy
• Transaction logs on SSD.
• Lesson learned: do not overbuild your cluster
Kafka Cluster - Brokers
• Brokers
– Run the Kafka process
– Store the commit logs
• Why a cluster ?
– Redundancy and fault tolerance
– Horizontal scalability
– Better read and write throughput; better network usage & disk IO
• Controller – special broker
Kafka Message
• Distributed, partitioned, replicated commit log.
• Messages
– Fixed-size header
– Variable-length payload (byte array)
– Payload can carry any serialized data
– LinkedIn uses Avro
• Commit Logs
– Stored as sequence files under folders named after the topic
– Contain a sequence of log entries
Kafka Message - continued
• Logs
– A log entry (message) has a 4-byte header followed by N bytes of message payload
– The offset is a 64-bit integer
– The offset gives the position of a message from the start of the stream
– On disk, log files are saved as segment files
– Segment files are named after the first offset contained in that file, e.g.
00000000000.kafka
Kafka Message - continued
• Write to logs
– Appends to the latest segment file
– OS flushes the messages to disk either based on number of messages or time
• Reads from logs (see the fetch sketch below)
– The consumer provides an offset & a chunk size
– Kafka returns an iterator over the message set
– On failure, consumers can start consuming either from the start of the stream or from the
latest offset
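A rough sketch of that read path using the 0.8-era SimpleConsumer Java API; the host, port, topic, partition, starting offset and chunk size are all placeholders, and a real client would first locate the leader broker for the partition.

import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset;

public class FetchSketch {
    public static void main(String[] args) {
        // Connect directly to the broker that leads the partition (host/port illustrative).
        SimpleConsumer consumer = new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "fetch-sketch");

        // Ask for messages starting at a given offset, with a chunk (fetch) size in bytes.
        FetchRequest request = new FetchRequestBuilder()
                .clientId("fetch-sketch")
                .addFetch("mytopic", 0, 0L, 100000)   // topic, partition, offset, chunk size
                .build();
        FetchResponse response = consumer.fetch(request);

        // The broker returns a message set that we iterate over.
        for (MessageAndOffset messageAndOffset : response.messageSet("mytopic", 0)) {
            System.out.println("offset " + messageAndOffset.offset());
        }
        consumer.close();
    }
}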
Message Retention
• Kafka retains and expires messages via three options
– Time-based (the default, which keeps messages for at least 168 hours)
– Size-based (configurable amount of messages per-partition)
– Key-based (one message is retained for each discrete key)
• Time and size retention can work together, but not with key-based
– With time and size configured, messages are retained either until the size limit is reached
OR the time limit is reached, whichever comes first
• Retention can be overridden per-topic
– Use the kafka-topics.sh CLI to set these configs (see the example below)
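As a rough illustration of such overrides (config names as used around the 0.8.2/0.9 era; verify against your broker version), the commands below keep messages for 7 days, cap each partition at 1 GB, and switch a topic to key-based compaction:

bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic mytopic --config retention.ms=604800000
bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic mytopic --config retention.bytes=1073741824
bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic mytopic --config cleanup.policy=compact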
Kafka Producer
• Producers publish messages to topics (see the sketch below); key configs include
– metadata.broker.list
– serializer.class
– partitioner.class
– request.required.acks (0,1,-1)
– topics
• Partition strategy
– DefaultPartitioner – Round Robin
– DefaultPartitioner with Keyed messages – Hashing
Ref : https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example
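A minimal sketch of the old (0.8) Java producer wired with the configs listed above, in the style of the linked 0.8.0 Producer Example; the broker address, topic name and serializer are illustrative, not the talk's own code.

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class ProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");          // brokers used to bootstrap metadata
        props.put("serializer.class", "kafka.serializer.StringEncoder"); // pluggable payload encoder
        props.put("request.required.acks", "1");                      // 0 = no ack, 1 = leader ack, -1 = all in-sync replicas

        Producer<String, String> producer = new Producer<>(new ProducerConfig(props));
        // With no key the DefaultPartitioner spreads messages round-robin;
        // with a key it hashes the key to pick the partition.
        producer.send(new KeyedMessage<>("newtopic", "some-key", "hello kafka"));
        producer.close();
    }
}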
Kafka Producer - Continued
• Message Batching
• Compression (gzip, snappy & lz4)
• Sticky partition
• CLI
– Create a topic
• bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic
newtopic --replication-factor 1 --partitions 1
– Produce messages
• bin/kafka-console-producer.sh --broker-list localhost:9092 --topic
newtopic
Ref : https://cwiki.apache.org/confluence/display/KAFKA/Clients
Schema Registry
Kafka consumer
• Consumers are processes that subscribe to a topic and process its feed of messages
• High level consumer (see the sketch below)
– Multi-threaded
– Manages offsets for you
• Simple consumer
– Greater control over consumption
– Need to manage offsets yourself
– Need to find the broker hosting the leader partition
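A minimal high-level consumer sketch against the 0.8-era Java API; the Zookeeper address, group id and topic are placeholders. The connector handles rebalancing and offset commits, which is exactly the convenience called out above.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class HighLevelConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // offsets & group membership live in ZK
        props.put("group.id", "sketch-group");
        props.put("auto.offset.reset", "smallest");       // where to start if no committed offset
        props.put("auto.commit.enable", "true");          // let the client commit offsets for you

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // One stream (thread) for the topic; the connector handles rebalancing and offsets.
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("mytopic", 1));

        for (MessageAndMetadata<byte[], byte[]> record : streams.get("mytopic").get(0)) {
            System.out.println("partition " + record.partition() + " offset " + record.offset());
        }
    }
}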
Kafka Consumer -- continued
• Important options to provide while consuming
– Zookeeper details
– Topic name
– Where to start consuming (from beginning or from the tail)
– auto.offset.reset
– group.id
– auto.commit.enable (true)
• Console consumer
– Helps in debugging issues & can be used inside applications
– bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic mytopic
--from-beginning
Basic Kafka operations
• Add a topic
– bin/kafka-topics.sh --zookeeper zk_host:port/chroot --create --topic
newtopic --partitions 10 --replication-factor 3 --config x=y
• Modify topic
– bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic
newtopic --partitions 20
– Beware: this may impact semantically partitioned topics
• Modify configuration
– bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic
newtopic --config x=y
• Delete configurations
– bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic
newtopic --deleteConfig x
Basic Kafka operations -- continued
• DO NOT DELETE TOPICS ! Though you have the option to do so
• What happens when a broker dies ?
– Leader fail-over
– Corrupted index / log files
– URP (under-replicated partitions)
– Uneven leader distribution
• Preferred replica election
– bin/kafka-preferred-replica-election.sh --zookeeper zk_host:port/chroot
– or auto.leader.rebalance.enable=true
Adding a broker
[Diagram: producers and consumers connected to a cluster of brokers; topic partitions (A, B and C, P0–P7) and their replicas are spread across the brokers, illustrating how partitions are distributed when a broker is added]
Kafka operations – continued
• Expanding a Kafka cluster
– Create brokers with new, unique broker IDs
– Kafka will not automatically move existing topics to the new brokers
– An admin needs to initiate the move (see the JSON sketch below)
• Generate the plan : bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file topics-to-move.json --broker-list "5,6" --generate
• Execute the plan : bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file expand-cluster-reassignment.json --execute
• Verify the execution : bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file expand-cluster-reassignment.json --verify
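For reference, a rough sketch of the two JSON files involved (topic names and broker IDs are illustrative): topics-to-move.json lists the topics to relocate, and the --generate step prints a reassignment plan that you save (e.g. as expand-cluster-reassignment.json) and feed to --execute and --verify.

topics-to-move.json:
{ "version": 1,
  "topics": [ { "topic": "newtopic" } ] }

expand-cluster-reassignment.json (roughly what --generate produces):
{ "version": 1,
  "partitions": [ { "topic": "newtopic", "partition": 0, "replicas": [5, 6] } ] }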
Data Assurance
• No data loss and no reordering
– Critical for applications like DB replication
– Can Kafka do this ? Yes !
• Causes of data loss on the producer side
– Setting block.on.buffer.full=false
– Retries getting exhausted
– Sending messages without acks=all
• How can you fix it ? (see the sketch below)
– Set block.on.buffer.full=true
– Set retries to Long.MAX_VALUE
– Set acks to all
– Resend from your callback function (producer.send(record, callback))
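Putting those producer-side settings together, here is a hedged sketch against the new Java producer of the 0.8.2/0.9 era; the broker address and topic are placeholders, and since the retries config is an int, Integer.MAX_VALUE stands in for the slide's Long.MAX_VALUE.

import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class NoDataLossProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");                              // wait for all in-sync replicas
        props.put("retries", Integer.MAX_VALUE);               // "retries" is an int config; effectively retry forever
        props.put("block.on.buffer.full", true);               // block instead of dropping when the buffer fills
        props.put("max.in.flight.requests.per.connection", 1); // avoid reordering on retry
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        final KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        final ProducerRecord<String, String> record =
                new ProducerRecord<>("newtopic", "key", "value");

        producer.send(record, new Callback() {
            @Override
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                if (exception != null) {
                    producer.send(record, this);   // resend from the callback on failure
                }
            }
        });
        producer.close();   // close() waits for in-flight requests to complete
    }
}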
Data Assurance - Continued
• Cause of data loss on consumer side
– offsets are carelessly committed
– data loss can happen if consumer committed the offset, but died while processing the
message
• Fixing data loss on consumer side
– commit offset only after processing of the message is completed
– Disable automatic offset commits (auto.commit.enable=false)
• Fixing on the broker side (see the config sketch below)
– Have a replication factor >= 3
– Set min.insync.replicas to 2
– Disable unclean leader election
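A sketch of the corresponding broker and topic settings; the config names are as used around the 0.8.2/0.9 era, so check them against your broker version.

# server.properties (broker side)
default.replication.factor=3
min.insync.replicas=2
unclean.leader.election.enable=false

# or override durability per topic via the CLI
bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic newtopic --config min.insync.replicas=2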
Data Assurance - Continued
• Message reordering
– If more than one message is in transit
– and also retry is enabled
• Fixing message reordering
– set max.in.flight.requests.per.connection=1
Kafka 0.9 (Beta release)
• Security
– Kerberos- or TLS-based authentication
– Unix-like permissions to restrict who can access data
– Encryption on the wire via SSL
• Kafka Connect
– Supports large-scale, real-time import and export for Kafka
– Takes care of fault tolerance, offset management and delivery management
– Will support connectors for Hadoop and databases
• User-defined quotas
– To manage abusive clients
– Rate-limit traffic on the producer side and the consumer side
Kafka 0.9 (Beta release)
– E.g. allow only 10 MBps for reads and 5 MBps for writes
– If clients violate the quota, they are slowed down
– Can be overridden
• New Consumer (see the sketch below)
– Removes the distinction between the high level consumer and the simple consumer
– Unified consumer API
– No longer Zookeeper-dependent
– Offers pluggable offset management
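A minimal sketch of the 0.9 new consumer API; the bootstrap servers, group id and topic are placeholders. Offsets are committed manually only after processing, matching the consumer-side advice earlier.

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class NewConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // talks to brokers, not Zookeeper
        props.put("group.id", "sketch-group");
        props.put("enable.auto.commit", "false");            // commit offsets only after processing
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("mytopic"));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.partition() + ":" + record.offset() + " " + record.value());
            }
            consumer.commitSync();   // manual offset commit after the batch is processed
        }
    }
}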
How Can You Get Involved?
• http://kafka.apache.org
• Join the mailing lists
– users@kafka.apache.org
• irc.freenode.net - #apache-kafka
Q & A
Want to contact us ?
Akash Vacher (avacher@linkedin.com)
Ayyappadas Ravindran (appu@linkedin.com)
Talent Partner : Syed Hussain (sshussain@linkedin.com)
Mob : +91 953 581 8876
Editor's Notes
  1. Kafka is a publish-subscribe messaging system, in which there are four components: - Broker (what we call the Kafka server) - Zookeeper (which serves as a data store for information about the cluster and consumers) - Producer (sends data into the system) - Consumer (reads data out of the system) Data is organized into topics (here we show a topic named “A”) and topics are split into partitions (we have partitions 0 and 1 here). A “message” is a discrete unit of data within Kafka. Producers create messages and send them into the system. The broker stores them, and any number of consumers can then read those messages. In order to provide scalability, we have multiple brokers. By spreading out the partitions, we can handle more messages in any topic. This also provides redundancy. We can now replicate partitions on separate brokers. When we do this, one broker is the designated “leader” for each partition. This is the only broker that producers and consumers connect to for that partition. The brokers that hold the replicas are designated “followers” and all they do with the partition is keep it in sync with the leader. When a broker fails, one of the brokers holding an in-sync replica takes over as the leader for the partition. The producer and consumer clients have logic built-in to automatically rebalance and find the new leader when the cluster changes like this. When the original broker comes back online, it gets its replicas back in sync, and then it functions as the follower. It does not become the leader again until something else happens to the cluster (such as a manual change of leaders, or another broker going offline).
  2. In the previous slide we saw that Zookeeper is an integral part of the Kafka ecosystem, so let's see what Zookeeper is. Zookeeper is a distributed coordination service for distributed applications. Zookeeper is also used for configuration maintenance. Zookeeper exposes simple APIs, using which applications can build higher-level coordination services. Zookeeper guarantees ordering, atomicity and reliability. Zookeeper is implemented using a shared hierarchical namespace, modeled on a shared Linux file system. Every node in Zookeeper is called a znode. A znode is similar to a file: it stores data and has ACL and stat information. Two important concepts in the Zookeeper ecosystem are ephemeral nodes and watches. An ephemeral node exists only as long as the session that created it exists. Clients can set watches on a znode and are informed when the znode changes. Now let's quickly see the coordination service in action with a leader election. Consider that you have multiple clients competing to become the leader; the challenge is how to elect one. Zookeeper can be used for leader election. Znodes are created with the SEQUENCE and EPHEMERAL flags set. With the SEQUENCE flag, each node is created with a monotonically increasing number appended to the end of the path. The client which manages to create the znode with the lowest sequence ID is elected as the leader. The znode created is ephemeral, so it exists only as long as the leader exists.
  3. Now let's see how Zookeeper is used in the Kafka environment. Kafka uses Zookeeper both for storing configuration information and for coordination (leader election & executing administrative tasks). Zookeeper is used to store metadata about brokers, topics and consumers. When a broker comes up, it registers itself with ZK: it creates a znode and stores the broker ID, hostname and endpoint details in it. Two types of topic-related information are stored in ZK. The first is broker-related topic information: which broker hosts which topic/partition, replication information, and which replicas are leaders and which are followers. The second is topic configuration, which stores per-topic settings like retention, clean-up policies etc. Zookeeper also stores consumer information, like which consumers are consuming from which partition and up to what point (in the log) a consumer has consumed, i.e. offset information. Coming to the coordination service: one of the brokers in the Kafka cluster takes on the role of controller. The controller is responsible for managing the state of brokers, partitions and replicas, and it also performs administrative tasks. Controller election is done using Zookeeper.
  4. We run Zookeeper 3.4. It's not just used for Kafka but also for other critical applications at LinkedIn. We have a cluster size of 5 + 1, where 5 are voting members and 1 is a non-voting member called an observer. The primary role of the observer is disaster recovery, and it also helps with read scalability. We make sure the nodes are in different racks to ensure power redundancy, and we use bond0 (balance-rr bonding), which provides load balancing and fault tolerance. If your system is write-heavy it's good to have better disk performance; we use SSDs for keeping transaction logs, or at least a separate drive for transaction logs other than the drive used for application logs and snapshots. Do not overbuild your cluster: as the cluster size increases, the latency for ZK write transactions increases.
  5. Alright, we have seen Zookeeper, now let's talk about brokers. Brokers are the nodes which run the Kafka process. Brokers store the commit logs for topics/partitions. Brokers register themselves with Zookeeper when they start. Multiple brokers form a Kafka cluster. A cluster is good because it helps with redundancy (replicas) and fault tolerance. The default replication factor we use is 2, so we can afford one node failure. The cluster can be horizontally scaled: as and when you want to expand, you add more brokers. You get better network usage and disk IO with multiple machines. The controller is a broker with additional responsibility. The controller is the brain of the cluster; it's a state machine. We keep the state in Zookeeper, and when there is a state change the controller acts on it. The controller manages the brokers, takes care of partitions and replication, and performs administrative tasks.
  6. As said earlier, in Kafka a message is a discrete unit of data. Messages are stored in commit logs. Commit logs are distributed, partitioned and replicated. A message contains a header and a payload. The header contains information like the size of the payload, a CRC32 checksum, and the compression used (snappy or gzip). Leaving the payload as a byte array gives a lot of flexibility. At LinkedIn our messages are Avro-formatted. Commit logs are stored in sequence files. Sequence files are stored under a folder named after the topic-partition. Sequence files contain log entries.
  7. The header size is 4 bytes. The payload can be of variable size; at LinkedIn we cap it at 1 MB. Messages in the commit log are identified using an offset number. An offset is a 64-bit number; it represents the position of the message from the start of the stream, i.e. the start of that topic partition. Segments are named after the first offset in the segment.
  8. Writes happen at the tail end of the latest segment. Messages are written to the OS page cache and flushed to disk either after a number of messages or after a period of time. While reading from the log, consumers provide an offset number and a chunk size. Kafka returns an iterator, which contains a message set. Ideally the chunk size will cover multiple messages. There can be a corner case in which the message is larger than the chunk size provided; in that case the consumer doubles its chunk size and retries. On consumer failures (here failure means the consumer trying to fetch an offset which doesn't exist), the consumer can fail, or it has the option to reset the offset to the start or to the current offset.
  9. Now you have commit logs which keep getting data; at some point this is going to fill your disk, so you need a retention policy to rotate and purge the logs. Kafka provides two clean-up policies: you can either rotate the logs or compact them. Rotation can happen based on time or size. Log compaction is interesting: here we don't purge an entire segment of the log, instead we remove log entries having the same key and just retain the latest entry. Compaction can only happen with semantic partitioning. You can have a per-topic retention policy; Kafka ships a CLI which can be used to set this value.
  10. An application which writes data into a Kafka topic is called a producer. The producer needs to be given the details of a broker from which it can fetch metadata. The metadata contains information about the brokers and the broker IDs where the leader partitions for the topic reside. The serialization class is pluggable; you can specify an encoder class. At LinkedIn we use Avro serialization. The partitioner class specifies how messages should be partitioned, i.e. to which partition a message should be written. request.required.acks specifies whether the producer needs to wait for an ack from the broker or not. It has 3 values: 0 – don't wait; 1 – get an ack from the leader at least; -1 or all – get acks from all followers as well.
  11. When the producer sends messages to the broker, you can ask the producer to batch multiple messages and send them in one go. This way you can compress the messages and also have less overhead in terms of creating connections to the broker. Different types of compression are supported, like gzip, snappy and lz4. Sticky partitioning is specific to LinkedIn; it makes sure that we send messages to only one partition for a given period of time. This way we can reduce the connection count.
  12. So we talked about messages. At LinkedIn we send messages in Avro format. An Avro message contains the schema of the data and the data itself; the data is stored in binary (serialized) format. LinkedIn's custom producer adds extra information to each message for tracking and auditing purposes. To save on storage and on the network, the schema is stripped off from the message and stored in a centralized location; the message carries a schema ID used to retrieve the schema. When the consumer wants to read the data it retrieves the schema from the schema registry and reads the message. Schemas are cached locally so as to reduce load on the schema registry. We don't want to break existing consumers, so old backward-compatible schemas are also stored in the schema registry.
  13. Consumers are the processes responsible for consuming from topics. They subscribe to a topic, consume the messages and process them. Kafka offers a consumer abstraction called the 'consumer group'. A consumer group has one or more consumer instances; multiple consumer instances label themselves with the same consumer group. Traditionally consumers work either in queue mode or pub-sub mode. In queue mode each message is sent to one consumer instance. In pub-sub mode, a message is sent to all instances. Messaging systems guarantee ordering, but when messages are delivered to multiple consumers asynchronously they may not be received in order. The workaround is to use one single consumer, but in this approach there won't be any parallel consumption. Kafka solves this via partitions: for an N-partition topic you can have N consumers, so each consumer consumes from one partition. This guarantees ordering, and with multiple partitions you get parallelism. High level consumers are multi-threaded and manage offsets for you; they have consumer groups and do rebalancing when a consumer instance joins or leaves the group. The simple consumer gives you greater control: you can read a subset of partitions, and messages can be read repeatedly. The drawback is that you need to deal with offsets and need to find the leader partitions.
  14. Consumers keep their offset information in Zookeeper. This is the point up to which a particular consumer has consumed; in case that consumer thread dies, it knows exactly from where it needs to consume again. Obviously you need to tell consumers which topic to consume from. You have the option to start consuming at a particular offset, or from the start or end of the stream. auto.offset.reset: remember that consumers store offsets in Zookeeper; say the consumer was not able to reach Zookeeper, or the consumer provided an offset which doesn't exist – what should the default behaviour be, should the consumer consume from the tail end or from the beginning? This is controlled by auto.offset.reset. The consumer group is an abstraction that helps consumer instances consume messages either in a queuing fashion or in pub-sub mode. group.id is used to set the consumer group name; it takes a string that represents the consumer group and should be unique. auto.commit.enable: consumers store the offset up to which they have consumed; by enabling this, the consumer commits offsets automatically.
  15. Kafka ships command line tools to manage Kafka clusters. These tools are used for maintenance and debugging; we will quickly go through them.
  16. Now let's see the operational challenges when a broker dies. Kafka does the leader fail-over, so one of the followers in the ISR becomes the new leader. You will end up with corrupt index/log files. You will end up with under-replicated partitions – obviously, since you lost one of the replicas. Kafka takes care of the corrupt index/log files: it discards incomplete log entries. URPs are fixed when the broker comes back up. A point to remember is that the replicas will come back as followers. This creates a challenge: now you have an uneven leader distribution across your cluster. Kafka ships a CLI with which you can rebalance the leader distribution. There is also an option to do this automatically, but it's not very clean.
  17. Partition reassignment. Our broker leveling script moves data to even out the data volume per broker. As I mentioned, when you add a broker to a cluster it won't be used by existing partitions. With the 0.8.1 release of Kafka there is a new feature, partition reassignment! Now when you add a broker to the cluster, it can be used by your existing topics and partitions. Existing partitions can be moved around live, completely transparently to all consumers and producers. We have developed a tool that sits on top of the partition reassignment tool and balances a cluster after you add new brokers, or if your cluster is simply unbalanced (there are many ways you can wind up in this state). What it does is go out to each broker and figure out how big each partition is (on disk) and the total amount of storage used on each broker. Next it starts calling the partition reassignment tool to make the larger brokers smaller and the smaller brokers larger. It stops once the overall data size is within 1 GB between the smallest and largest brokers. This is just one example of the many possible ways to optimize a cluster with the partition reassignment tool.
  18. Expanding a Kafka cluster is very simple: you create brokers with unique broker IDs and start the Kafka server, and the servers are automatically added to the cluster. But adding new brokers won't trigger automatic balancing in the Kafka cluster; an admin needs to move topics to the new brokers. Only the initiation is manual, the process is automated.
  19. Data loss is not a big deal in applications like pageview event tracking; losing one or two messages in a million is fine. It becomes critical for applications like DB replication and transactions which involve money. Where can the loss happen? It can happen on the producer side, the consumer side and the broker side. Let's see the issues on each side. Causes of data loss on the producer end: setting block.on.buffer.full to false will throw an error and discard the messages. Another cause can be the number of retries being exhausted. It also matters whether you are running the cluster in async mode or sync mode, and in sync mode whether you are waiting for a commit acknowledgement from all replicas or not. When setting block.on.buffer.full to true, the producer won't take any more messages when its buffer is full. If you set retries to Long.MAX_VALUE, it will retry 2^63 - 1 times. Set acks to all.
  20. The cause of data loss on the consumer end is that you are careless! Just kidding. This can happen if you consume the messages and commit the offset before really processing them; you can have a failure during processing. How do you fix this? One, commit the offset only after processing the message. Two, disable automatic offset commits.
  21. Data needs to be moved in and out of Kafka and other systems. People use multiple solutions.
  22. So how can you get more involved in the Kafka community? The most obvious answer is to go to kafka.apache.org. From there you can join the mailing lists, either on the development or the user side. You can also dive into the source repository, and work on and contribute your own tools back. Kafka may be young, but it's a critical piece of data infrastructure for many of us.