Apache Kafka Reference

  1. Apache Kafka Reference (Biju Nair)
  2. Architecture
     Cluster: brokers BK1…BK5 coordinated through ZooKeeper (ZK1…ZK3); replicas stay in sync by fetching from partition leaders.
     •  Producer – serializes data, identifies the partition, sends data (sync/async), waits for ack
     •  Consumer – gets assigned partitions and polls for messages; sends heartbeats to the group coordinator; a change in the number of consumers triggers a partition rebalance; the first consumer in a group becomes the consumer group lead; one broker is assigned as group coordinator
     •  Brokers – all brokers register with ZooKeeper (/brokers/ids – ephemeral nodes, subscribed to by Kafka components); the first broker to create the /controller znode becomes the controller, responsible for partition leader election; controller failure and re-election use a ZooKeeper watch on /controller; the controller epoch is advanced through a ZK conditional increment operation
     •  Partitions – all reads and writes go through the partition leader, which keeps track of the ISR; writes block if replicas fall below min ISR; broker failure results in the next in-sync replica being assigned partition leader; a new partition is assigned to the mount point with the fewest partitions, and a partition is restricted to one disk/mount point (log.dirs); max message size: message.max.bytes
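The producer's "identify partition" step above can be sketched in plain Java. Kafka's default partitioner actually applies murmur2 to the serialized key; the stand-in hash here just keeps the sketch dependency-free, and the keyless fallback to partition 0 is a simplification of Kafka's round-robin behavior.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PartitionChooser {
    static int choosePartition(byte[] serializedKey, int numPartitions) {
        if (serializedKey == null) {
            // Keyless records are spread across partitions by Kafka;
            // a fixed partition 0 here is a simplification.
            return 0;
        }
        // Mask off the sign bit so the modulo result is non-negative.
        return (Arrays.hashCode(serializedKey) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] key = "order-42".getBytes(StandardCharsets.UTF_8);
        // Same key always maps to the same partition in [0, numPartitions)
        System.out.println("partition = " + choosePartition(key, 6));
    }
}
```

Because the mapping depends on the partition count, adding partitions to a topic changes where keys land, which is why key-based ordering is only guaranteed within a partition.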
  3. Partition Leader Election
     •  When a broker fails, the next broker hosting an in-sync replica is assigned the leader
     •  Only brokers holding an in-sync replica can become the leader
     •  The broker that was the leader when the partition was created is the Preferred Leader (PL)
     •  If the PL is not the current leader but its replica is in sync, a leader election is triggered
        –  auto.leader.rebalance.enable = true|false
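The election rule on this slide can be sketched as a small function: only in-sync replicas are eligible, and the preferred leader wins whenever it is in the ISR. The method shape and integer broker ids are illustrative, not the controller's real implementation.

```java
import java.util.List;

public class LeaderElection {
    // Returns the broker id of the new leader, or -1 if no replica is eligible.
    static int electLeader(int preferredLeader, List<Integer> isr) {
        if (isr.isEmpty()) return -1;              // no in-sync replica to promote
        if (isr.contains(preferredLeader)) return preferredLeader; // PL preferred
        return isr.get(0);                         // otherwise next in-sync replica
    }

    public static void main(String[] args) {
        System.out.println(electLeader(1, List.of(2, 3))); // PL out of sync
        System.out.println(electLeader(1, List.of(2, 1))); // PL in sync
    }
}
```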
  4. Partition Replication
     •  Replicas send fetch requests to partition leaders
     •  A replica is out of sync if
        –  it has sent no fetch request for 10 seconds, or
        –  it has not caught up to the last message in 10 seconds
        –  replica.lag.time.max.ms
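The out-of-sync test reduces to a timestamp comparison against replica.lag.time.max.ms, sketched below with plain millisecond timestamps (the real leader tracks last-fetch and last-caught-up times per follower).

```java
public class IsrCheck {
    // A follower stays in the ISR only if it caught up within lagMaxMs.
    static boolean isInSync(long nowMs, long lastCaughtUpMs, long lagMaxMs) {
        return (nowMs - lastCaughtUpMs) <= lagMaxMs;
    }

    public static void main(String[] args) {
        long lagMax = 10_000;                          // replica.lag.time.max.ms
        System.out.println(isInSync(20_000, 15_000, lagMax)); // caught up 5s ago
        System.out.println(isInSync(30_000, 15_000, lagMax)); // 15s behind: drop
    }
}
```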
  5. Kafka Client
     •  Uses a binary protocol
     •  Requests are ordered – Produce/Fetch
     •  Acceptor thread hands connections to processing/network threads
     •  Number of processing threads is configurable
        –  Places requests into the request queue
        –  Picks responses from the response queue and returns them
     •  Requests are picked up and processed by IO threads
     •  Client makes a metadata request and caches the result
        –  metadata.max.age.ms
  6. Request Header
     •  Request Type
     •  Request Version – client version
     •  Correlation Id – unique ID to relate request/response
     •  Client Id – identifies the application
  7. Producer Request
     •  Broker checks for
        –  User privilege
        –  Valid acks – 0, 1, ALL
        –  For acks ALL, that there are enough ISRs
     •  Ack ALL – leader responds only when the write is replicated to all ISRs
        –  Until then the request is held in a buffer called Purgatory
  8. Producer Request
     Producer construction: bootstrap servers, key serializer, value serializer.
     Flow: ProducerRecord (topic, partition, key, value) -> serializer -> partitioner -> sender threads -> broker (BK1/BK2/BK3); the broker returns RecordMetadata.
     Send patterns:
     •  producer.send(r)
     •  producer.send(r).get()
     •  producer.send(r, new Callback())
     Key configs:
     •  acks = 0|1|ALL
     •  buffer.memory
     •  compression.type = snappy|gzip|lz4
     •  retry.backoff.ms
     •  batch.size – batch size in bytes
     •  linger.ms – time to wait before sending a batch
     •  client.id – for stats/logging
     •  timeout.ms, request.timeout.ms, metadata.fetch.timeout.ms, max.block.ms
     •  max.request.size
     •  receive.buffer.bytes, send.buffer.bytes
     •  max.in.flight.requests.per.connection = 1
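The settings above are what a KafkaProducer is typically constructed from. A minimal sketch of that Properties object, using java.util.Properties only; the broker address and serializer class names are illustrative values, not part of the slide.

```java
import java.util.Properties;

public class ProducerConfigSketch {
    static Properties build() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");   // hypothetical broker
        p.put("key.serializer",
              "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer",
              "org.apache.kafka.common.serialization.StringSerializer");
        p.put("acks", "all");                 // 0 | 1 | all
        p.put("compression.type", "snappy");  // snappy | gzip | lz4
        p.put("batch.size", "16384");         // batch size in bytes
        p.put("linger.ms", "5");              // wait up to 5 ms to fill a batch
        p.put("max.in.flight.requests.per.connection", "1"); // preserve ordering
        return p;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

Limiting in-flight requests to 1 trades throughput for strict ordering when retries are enabled.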
  9. Fetch Request
     •  Topic, partition, offset
     •  Limit the data returned
        –  Size or # of messages
     •  Uses zero-copy for performance
        –  File system cache to network buffer
     •  Can set a minimum size to minimize network traffic
        –  Also a time in ms to wait before sending data
     •  Only sees data that has been replicated – the high water mark
     •  replica.lag.time.max.ms
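The high-water-mark rule above means a fetch response only exposes offsets below the highest offset replicated to all in-sync replicas. A minimal sketch of that visibility filter:

```java
import java.util.Arrays;

public class HighWatermark {
    // Only offsets strictly below the high watermark are visible to consumers.
    static long[] visible(long[] offsets, long highWatermark) {
        return Arrays.stream(offsets).filter(o -> o < highWatermark).toArray();
    }

    public static void main(String[] args) {
        long[] log = {0, 1, 2, 3, 4};      // leader has 5 messages appended
        // Only 3 are fully replicated, so consumers see offsets 0..2
        System.out.println(Arrays.toString(visible(log, 3)));
    }
}
```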
  10. Partition Allocation
     •  Rack awareness
     •  Equal distribution
     •  Leader at node A, followers at A+1, A+2
     •  Partition assigned to the directory with the least number of partitions
     •  Partitions are divided into segments
        –  Segments store 1 GB or 1 week's worth of data
     •  File handles are kept open to all segments in all partitions
        –  The OS ulimit needs to be raised for open file handles
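The "leader at node A, followers at A+1, A+2" rule can be sketched as modular arithmetic over the broker list. This ignores the rack-awareness refinement and uses a fixed starting broker; it only shows the wrap-around placement pattern.

```java
import java.util.Arrays;

public class ReplicaPlacement {
    // Leader of partition i on broker (i mod N), followers on the next brokers.
    static int[] replicasFor(int partition, int numBrokers, int replicationFactor) {
        int[] replicas = new int[replicationFactor];
        for (int r = 0; r < replicationFactor; r++) {
            replicas[r] = (partition + r) % numBrokers; // leader first, then A+1, A+2
        }
        return replicas;
    }

    public static void main(String[] args) {
        // 5 brokers, replication factor 3: partition 4 wraps around the ring
        System.out.println(Arrays.toString(replicasFor(4, 5, 3)));
    }
}
```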
  11. Files
     •  Indexes
        –  Index segment files -> positions within segments
        –  Indexes correspond to data segments
        –  Indexes are purged along with data
     •  Compaction
        –  Retention policy: delete|compact
        –  log.cleaner.enabled
        –  To delete a message, produce a message with a NULL value (tombstone)
        –  No compaction of active segments
        –  Compaction runs on topics where 50% of records are dirty
     Data file record layout: Offset | Magic/Checksum | Compression codec | Timestamp | Key size | Key | Value size | Value
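Compaction as described above keeps only the newest value per key and lets a NULL value act as a tombstone. A minimal in-memory sketch (the real cleaner works segment by segment and retains tombstones for a configurable period before removing them):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompactionSketch {
    // records are [key, value] pairs in log order; null value = tombstone.
    static Map<String, String> compact(String[][] records) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] rec : records) {
            if (rec[1] == null) latest.remove(rec[0]); // tombstone deletes the key
            else latest.put(rec[0], rec[1]);           // newer value wins
        }
        return latest;
    }

    public static void main(String[] args) {
        String[][] log = {{"a", "1"}, {"b", "2"}, {"a", "3"}, {"b", null}};
        System.out.println(compact(log)); // only a's latest value survives
    }
}
```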
  12. Consumer
     •  Consumer Group
        –  Consumers -> partitions
     •  More consumers than partitions -> idle consumers
     •  Adding/dropping consumers -> partition rebalancing
        –  While rebalancing, consumers can't consume messages
     •  Group membership is maintained by heartbeats to the group coordinator
        –  Heartbeats are sent during poll()
     •  A commit records the offset of the message consumed
     •  A consumer crash leaves the assigned partition unprocessed until rebalance
        –  session.timeout.ms / max.poll.interval.ms
  13. Consumer Group
     •  One broker acts as Group Coordinator
     •  Consumers make a JoinGroup request
     •  The first consumer becomes group leader
     •  The leader receives details about all consumers
     •  It assigns partitions to consumers using a "PartitionAssignor"
     •  It sends the assignments to the Group Coordinator
     •  The Group Coordinator sends each consumer its relevant information, such as its assigned partitions
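What the leader's PartitionAssignor produces for the "range" strategy can be sketched as splitting one topic's partitions into contiguous chunks, with the first consumers taking any remainder. This covers a single topic only; the real assignor works per topic across subscriptions.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeAssignor {
    // Returns, per consumer, the list of partition ids it is assigned.
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        List<List<Integer>> out = new ArrayList<>();
        int per = numPartitions / numConsumers;
        int extra = numPartitions % numConsumers;  // first consumers get one more
        int next = 0;
        for (int c = 0; c < numConsumers; c++) {
            int take = per + (c < extra ? 1 : 0);
            List<Integer> mine = new ArrayList<>();
            for (int i = 0; i < take; i++) mine.add(next++);
            out.add(mine);
        }
        return out;
    }

    public static void main(String[] args) {
        // 5 partitions over 2 consumers: contiguous ranges, first gets the extra
        System.out.println(assign(5, 2));
    }
}
```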
  14. Consumer
     •  Consumer construction
        –  Bootstrap servers, KeyDeserializer, ValueDeserializer, group.id
     •  subscribe()
     •  poll()
        –  Returns records
        –  Sends heartbeats
     •  close()
     •  pause()/resume()
        –  poll() continues to heartbeat without retrieving data
     •  wakeup()
        –  Called from a JVM shutdown hook to break out of poll() for housekeeping
  15. Consumer Attributes
     •  fetch.min.bytes
     •  fetch.max.wait.ms – 500 ms
     •  max.partition.fetch.bytes > max.message.size
     •  session.timeout.ms – 3 secs
     •  heartbeat.interval.ms
     •  auto.offset.reset – latest|earliest
     •  enable.auto.commit – true|false
     •  auto.commit.interval.ms
     •  partition.assignment.strategy – range|round-robin
     •  client.id
     •  max.poll.records
     •  receive.buffer.bytes, send.buffer.bytes
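As with the producer, these attributes form the Properties object a KafkaConsumer is built from. A minimal sketch using java.util.Properties; broker address, group id, and deserializer class names are illustrative values.

```java
import java.util.Properties;

public class ConsumerConfigSketch {
    static Properties build() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");   // hypothetical broker
        p.put("group.id", "example-group");             // hypothetical group
        p.put("key.deserializer",
              "org.apache.kafka.common.serialization.StringDeserializer");
        p.put("value.deserializer",
              "org.apache.kafka.common.serialization.StringDeserializer");
        p.put("auto.offset.reset", "earliest");  // latest | earliest
        p.put("enable.auto.commit", "false");    // commit manually after processing
        p.put("max.poll.records", "500");        // cap records per poll()
        return p;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```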
  16. Commits & Offsets
     •  Topic __consumer_offsets stores committed offsets
        –  enable.auto.commit = true
        –  auto.commit.interval.ms – 5 secs
     •  During rebalancing, data can be processed twice or missed with auto commit
     •  Disable auto commit – enable.auto.commit = false
        –  poll() and process all the records
        –  Then commit:
           •  commitSync()
           •  commitAsync()
           •  commitAsync(new OffsetCommitCallback())
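The manual-commit flow above amounts to tracking, per partition, the next offset to commit after each processed record (last processed offset + 1). A dependency-free sketch; commitSync() here is a stand-in for the real consumer call that writes to __consumer_offsets.

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetTracker {
    private final Map<Integer, Long> toCommit = new HashMap<>();

    void process(int partition, long offset) {
        // ... handle the record, then remember the NEXT offset to resume from.
        toCommit.put(partition, offset + 1);
    }

    Map<Integer, Long> commitSync() {   // stand-in for consumer.commitSync()
        return new HashMap<>(toCommit);
    }

    public static void main(String[] args) {
        OffsetTracker t = new OffsetTracker();
        t.process(0, 41);
        t.process(0, 42);
        System.out.println(t.commitSync()); // partition 0 resumes at offset 43
    }
}
```

Committing offset + 1 rather than the offset itself is what makes a restarted consumer resume at the first unprocessed record.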
  17. Handling Rebalancing
     •  Pass a "ConsumerRebalanceListener" to subscribe()
        –  onPartitionsRevoked
        –  onPartitionsAssigned
     •  consumer.seekToBeginning(TopicPartition)
     •  consumer.seekToEnd(TopicPartition)
     •  consumer.seek(partition, offset)
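The usual shape of such a listener: commit what you have processed when partitions are revoked, and seek to the stored offset when partitions are assigned. The sketch below fakes Kafka's ConsumerRebalanceListener with static methods and integer partition ids to stay self-contained; the offset store stands in for an external store or committed offsets.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RebalanceSketch {
    static final Map<Integer, Long> offsetStore = new HashMap<>();

    // Called before losing ownership: persist the processed offsets.
    static void onPartitionsRevoked(List<Integer> partitions,
                                    Map<Integer, Long> processed) {
        for (int p : partitions) {
            if (processed.containsKey(p)) offsetStore.put(p, processed.get(p));
        }
    }

    // Called after gaining ownership: where consumer.seek() should resume.
    static long onPartitionAssigned(int partition) {
        return offsetStore.getOrDefault(partition, 0L);
    }

    public static void main(String[] args) {
        onPartitionsRevoked(List.of(0), Map.of(0, 100L));
        System.out.println(onPartitionAssigned(0)); // resume at stored offset
    }
}
```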
  18. Serializer/Deserializer
     •  Avro
     •  String
     •  Integer
     •  ByteArray
  19. Broker Configs
     •  default.replication.factor – replication.factor
     •  broker.rack
     •  unclean.leader.election.enable – true|false
     •  min.insync.replicas
  20. Administration
     Shell script : Feature
     •  kafka-topics.sh : Create, delete, alter, list, describe topics
     •  kafka-consumer-groups.sh : List, describe, delete (old consumers) consumer groups
     •  kafka-run-class.sh kafka.tools.ExportZkOffsets : Export offsets of a consumer group
     •  kafka-run-class.sh kafka.tools.ImportZkOffsets : Import offsets for a consumer group
     •  kafka-configs.sh : Dynamic configuration changes for topics and client quotas
     •  kafka-preferred-replica-election.sh : Request preferred replica leader election
     •  kafka-reassign-partitions.sh : Reassign partitions, rebalance, change replication
     •  kafka-run-class.sh kafka.tools.DumpLogSegments : Dump log segments and verify indexes
     •  kafka-replica-verification.sh : Replica verification
     •  kafka-verifiable-producer.sh / kafka-verifiable-consumer.sh : Produce/consume verifiable test messages