Kafka used at scale to deliver real-time notifications
Sergio Nunes
Infrastructure Engineer @ Zendesk
@sbrnunes
Agenda
Delivering real-time notifications
Monitoring
Some challenges
Kafka for real-time notifications
Kafka for real-time notifications
Some initial requirements
● Listen to any database event that could translate into a notification
● Process and aggregate those events into transactions
● Translate transactions into notifications
● Push those notifications to mobile devices
○ In real-time
○ In a sane order (for the User)
○ With an accurate badge count (number of unread notifications)
Kafka for real-time notifications
Architecture of the system
[Diagram, built up over four slides. Components: MySQL, Maxwell, Events Stream, Transactions Stream, Notifications Stream, Notifications Service, API Server]
[Diagram annotations: "Partitioned by database === account", "Partitioned by account", "Partitioned by user", "Mark as “read”", "Badge Update", "Notification"]
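The diagram annotations say the streams are partitioned by account or by user. A minimal sketch of why that gives per-key ordering: every message with the same key lands in the same partition, and a partition is an append-only log. The `partition_for` helper, the partition count, and the sample events are hypothetical; CRC32 is used only for illustration and is not Kafka's default partitioner (the Java client hashes keys with murmur2).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 8  # assumption for illustration, not a figure from the talk


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (CRC32, illustration only)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events):
    """Append each (key, payload) event to the log of the partition its key maps to."""
    logs = defaultdict(list)
    for key, payload in events:
        logs[partition_for(key)].append((key, payload))
    return logs


# Hypothetical notification events keyed by user.
events = [
    ("user-1", "ticket assigned"),
    ("user-2", "comment added"),
    ("user-1", "ticket solved"),
    ("user-1", "marked as read"),
]

logs = route(events)
user1_partition = partition_for("user-1")
# All of user-1's events sit in one partition, in the order they were produced.
user1_events = [payload for key, payload in logs[user1_partition] if key == "user-1"]
print(user1_partition, user1_events)
```

Ordering is only guaranteed within a partition, so the choice of key (account or user) decides which events stay ordered relative to each other.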
Kafka for real-time notifications
Conclusions
● Crazy fast!!!
○ Easily streaming at 7-10K database events/s
○ We’ve seen it handle up to 20-25K/s (> 1M/min)
● Ordering guarantees provided really matched our needs
● Highly configurable
● Easily scalable (horizontally)
○ We can “easily” add or remove nodes to the Kafka cluster
○ We can easily add and remove consumer instances
Questions ?
Monitoring
Monitoring
How to monitor a service like this ?
● We need to be able to answer a few questions
○ Are we getting messages from Kafka ?
○ How fast are we reading those messages?
○ How much are we lagging ?
● We need to capture metrics!
● We can use those metrics to create alerts!
Monitoring
Application metrics
● Some metrics can be easily captured in the application and then reported to some monitoring service
○ Examples: events produced /s, events consumed /s, etc.
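As a rough sketch of an application-level metric such as "events consumed /s", the counter below accumulates events and turns them into a rate each time it is flushed. `RateMeter` is a hypothetical helper, not part of any monitoring library; a real service would hand the flushed value to its metrics client.

```python
import time


class RateMeter:
    """Counts events and reports events per second since the last flush."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._count = 0
        self._window_start = clock()

    def mark(self, n: int = 1) -> None:
        """Record that n events were produced or consumed."""
        self._count += n

    def flush(self) -> float:
        """Return the rate for the current window and start a new window."""
        now = self._clock()
        elapsed = now - self._window_start
        rate = self._count / elapsed if elapsed > 0 else 0.0
        self._count = 0
        self._window_start = now
        return rate


# Usage with a fake clock so the example is deterministic.
fake_now = [0.0]
meter = RateMeter(clock=lambda: fake_now[0])
meter.mark(7000)
meter.mark(3000)
fake_now[0] = 2.0
rate = meter.flush()  # 10000 events over 2 seconds
print(rate)
```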
Monitoring
Kafka metrics
● Other metrics are hard to get
● Consumer Lag: probably the most important metric
○ Size of partition (last offset) - consumer offset (last committed)
○ Brokers know about partition sizes
○ Consumer owns the consumed offsets
● Consumers acknowledge offsets by “committing” them back into a special Kafka topic
○ What do we do with this?
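The lag definition above (last offset of the partition minus the last committed consumer offset) can be sketched as a small function. The offsets are made-up numbers, and treating a partition with no commit as committed at offset 0 is an assumption for this example.

```python
def consumer_lag(log_end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag: last offset known by the broker minus last committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with no commit yet counts as committed at 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


# Hypothetical offsets for a three-partition topic.
log_end_offsets = {0: 1500, 1: 980, 2: 2000}   # known by the brokers
committed_offsets = {0: 1500, 1: 900}          # owned by the consumer group

lag = consumer_lag(log_end_offsets, committed_offsets)
total_lag = sum(lag.values())
print(lag, total_lag)
```

The two inputs live in different places (brokers and the offsets topic), which is why the metric is hard to get from inside the application alone.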
Monitoring
Burrow
● Burrow to the rescue!!!
○ Monitoring application open-sourced by LinkedIn: https://github.com/linkedin/Burrow
○ Can be deployed as a sidecar application for Kafka
○ Keeps track of committed offsets, as well as the last offsets known by the brokers
○ Exposes a REST API to gather all this information
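A hedged sketch of how a client might summarise a lag response from Burrow's REST API. The endpoint path and the JSON field names are assumptions modelled on Burrow's documented HTTP API and should be verified against the Burrow version you deploy. The sample payload is invented and no network call is made.

```python
import json


def lag_url(base_url: str, cluster: str, group: str) -> str:
    """Build the consumer-group lag URL (path layout is an assumption)."""
    return f"{base_url.rstrip('/')}/v3/kafka/{cluster}/consumer/{group}/lag"


def summarize(payload: dict) -> dict:
    """Reduce a lag response to the group state and per-partition lag."""
    status = payload["status"]
    per_partition = {
        (p["topic"], p["partition"]): p["end"]["lag"]
        for p in status.get("partitions", [])
    }
    return {
        "group": status["group"],
        "state": status["status"],
        "total_lag": sum(per_partition.values()),
        "per_partition": per_partition,
    }


# Invented payload in the assumed response shape.
sample = json.loads("""
{
  "error": false,
  "status": {
    "cluster": "local",
    "group": "notifications-service",
    "status": "WARN",
    "partitions": [
      {"topic": "transactions", "partition": 0, "end": {"offset": 1500, "lag": 0}},
      {"topic": "transactions", "partition": 1, "end": {"offset": 900, "lag": 80}}
    ]
  }
}
""")

summary = summarize(sample)
url = lag_url("http://localhost:8000/", "local", "notifications-service")
print(url, summary["state"], summary["total_lag"])
```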
Monitoring
Datadog
● Open source plugin for Burrow
○ https://github.com/packetloop/datadog-agent-burrow
○ Uses Burrow’s API to fetch some metrics (including the consumer lag)
○ Publishes the metrics to Datadog
Some challenges
Some challenges
The problem of (not understanding) Kafka horizontal scaling
● Partition is considered the unit of parallelism in Kafka
○ So… we created our topics with 200 partitions!!! The more the merrier!
○ How many consumer instances ?
■ Service was running on 2 hosts, 4 cores each, maybe 4 instances on each machine?
■ This gives us a total of 8 consumer instances for 200 partitions (?!?)
○ Did we need 200 partitions ? No !
https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
● The problem: we can add partitions to a topic, but we can’t remove them
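A back-of-the-envelope sketch of the sizing question. The linked Confluent post suggests at least max(t/p, t/c) partitions, where t is the target throughput and p and c are the measured per-partition producer and consumer throughputs. The per-partition figures below are assumptions for illustration; only the 25K events/s peak, the 200 partitions, and the 8 consumer instances come from the slides.

```python
import math


def partitions_needed(target: float, producer_per_partition: float,
                      consumer_per_partition: float) -> int:
    """Rule of thumb: at least max(t/p, t/c) partitions, rounded up."""
    return math.ceil(max(target / producer_per_partition,
                         target / consumer_per_partition))


def partitions_per_consumer(num_partitions: int, num_consumers: int) -> list:
    """Spread partitions as evenly as possible across the consumers of one group."""
    base, extra = divmod(num_partitions, num_consumers)
    return [base + 1 if i < extra else base for i in range(num_consumers)]


# Peak throughput from the talk; per-partition throughputs are assumed values.
needed = partitions_needed(target=25_000,
                           producer_per_partition=10_000,
                           consumer_per_partition=2_500)

# What the talk's setup meant in practice: 200 partitions over 8 consumers.
spread = partitions_per_consumer(200, 8)
print(needed, spread)
```

Under these assumed figures about ten partitions would cover the peak, and each of the 8 consumers ends up owning 25 of the 200 partitions.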
Some challenges
Reducing the number of partitions
[Diagram: Producers, Topic V1, Topic V2, Consumers]
Process followed to reduce the partitions:
1. Create topic V2
2. Make the consumer read from V1 and V2
3. Make the producer write only to V2
4. Wait until V1 is drained
5. Re-create V1 with the new number of partitions
6. Make the producer write only to V1
7. When V2 is drained, make the consumer read only from V1
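The seven steps above can be walked through with in-memory queues standing in for topics. This is a toy model, not Kafka code: `Topic`, `drain`, and `migrate` are hypothetical names, V2 is assumed to be created with the new partition count, and cross-topic ordering while the consumer reads from both topics is not modelled.

```python
from collections import deque


class Topic:
    """Toy stand-in for a Kafka topic: a name, a partition count, and one queue."""

    def __init__(self, name: str, partitions: int):
        self.name = name
        self.partitions = partitions
        self.queue = deque()


def drain(topic: Topic, sink: list) -> None:
    """Consume everything currently in the topic."""
    while topic.queue:
        sink.append((topic.name, topic.queue.popleft()))


def migrate(old_partitions: int, new_partitions: int, backlog: list):
    consumed = []
    topics = {"v1": Topic("v1", old_partitions)}
    topics["v1"].queue.extend(backlog)

    topics["v2"] = Topic("v2", new_partitions)         # 1. create topic V2
    read_from = ["v1", "v2"]                           # 2. consumer reads V1 and V2
    write_to = "v2"                                    # 3. producer writes only to V2
    topics[write_to].queue.append("sent-after-step-3")
    drain(topics["v1"], consumed)                      # 4. wait until V1 is drained
    topics["v1"] = Topic("v1", new_partitions)         # 5. re-create V1
    write_to = "v1"                                    # 6. producer writes only to V1
    topics[write_to].queue.append("sent-after-step-6")
    drain(topics["v2"], consumed)                      # 7. V2 drained ...
    read_from = ["v1"]                                 #    ... consumer reads only V1
    drain(topics["v1"], consumed)
    return topics["v1"].partitions, read_from, consumed


partitions, read_from, consumed = migrate(200, 16, ["a", "b"])
print(partitions, read_from, consumed)
```

No message is dropped in the walk-through because the producer only switches topics while the consumer is already reading from both.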
Some challenges
Unbalanced partitions
● Advantages of partitioning the data per account:
○ Account data gets load balanced across the cluster
○ Guarantees the ordering per account
● Disadvantages:
○ Partitions can become heavily unbalanced
■ Some accounts are bigger than others
■ We can’t guarantee that big accounts won’t end up in the same partition
■ Some accounts may be under load
● The problem: this can slow down the assigned consumer a lot!!!
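To illustrate the imbalance, the sketch below sums hypothetical per-account event volumes by partition and compares the busiest partition with the average. The account names, volumes, and partition count are invented, and CRC32 stands in for Kafka's real key hashing.

```python
import zlib


def partition_loads(account_volumes: dict, num_partitions: int) -> list:
    """Sum the event volume of every account that hashes to each partition."""
    loads = [0] * num_partitions
    for account, volume in account_volumes.items():
        partition = zlib.crc32(account.encode("utf-8")) % num_partitions
        loads[partition] += volume
    return loads


def imbalance(loads: list) -> float:
    """Ratio of the busiest partition to the mean load (1.0 means perfectly even)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


# Invented volumes: two large accounts among many small ones.
account_volumes = {f"account-{i}": 100 for i in range(50)}
account_volumes["big-account-a"] = 20_000
account_volumes["big-account-b"] = 15_000

loads = partition_loads(account_volumes, num_partitions=8)
print(loads, round(imbalance(loads), 2))
```

Whichever consumer owns the partition holding a large account has to process that whole volume alone, since splitting the account across partitions would give up per-account ordering.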
Some challenges
Increasing the number of consumers
● On some rare occasions (load spikes), we needed to increase the throughput
● Recommended solution: add more consumer instances
● The problem: we only had two hosts, so adding more consumer instances had to be done at the application level
○ Our implementation with Akka Streams was not helping
○ To add a new consumer we had to replicate the entire stream (significant increase in the number of threads)
○ This was causing us some issues and actually reducing the throughput
● The solution: keep one or two consumer instances per host, within the app, but consider adding more nodes
Some challenges
Upgrading Kafka clients
● A few surprises upgrading to 0.9
○ Processing big chunks of data can delay the heartbeat mechanism, which may cause constant rebalances
○ Possible solutions:
■ Reduce the maximum size of data consumed
■ Increase the consumer session timeout (to prevent it from expiring before the data is processed)
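A sketch of the trade-off between the two fixes, under stated assumptions. `session.timeout.ms` and `max.partition.fetch.bytes` are real consumer settings. The processing speed, the 25 assigned partitions, and the model "one poll can return up to the fetch limit for every assigned partition, and no heartbeat is sent until it is processed" are simplifications for illustration.

```python
def worst_case_batch_ms(config: dict, assigned_partitions: int,
                        bytes_processed_per_ms: float) -> float:
    """Time to process one poll if every assigned partition returns a full fetch."""
    max_bytes = config["max.partition.fetch.bytes"] * assigned_partitions
    return max_bytes / bytes_processed_per_ms


def risks_rebalance(config: dict, assigned_partitions: int,
                    bytes_processed_per_ms: float) -> bool:
    """True if processing one worst-case batch could outlast the session timeout."""
    batch_ms = worst_case_batch_ms(config, assigned_partitions, bytes_processed_per_ms)
    return batch_ms >= config["session.timeout.ms"]


# Starting point (illustrative values).
baseline = {"session.timeout.ms": 30_000, "max.partition.fetch.bytes": 1_048_576}
# Fix 1: reduce the maximum size of data consumed.
smaller_fetch = dict(baseline, **{"max.partition.fetch.bytes": 262_144})
# Fix 2: increase the consumer session timeout.
longer_timeout = dict(baseline, **{"session.timeout.ms": 60_000})

PARTITIONS = 25      # assumed: 200 partitions spread over 8 consumers
BYTES_PER_MS = 500   # assumed processing speed

for name, cfg in [("baseline", baseline), ("smaller fetch", smaller_fetch),
                  ("longer timeout", longer_timeout)]:
    print(name, risks_rebalance(cfg, PARTITIONS, BYTES_PER_MS))
```

A longer session timeout also means the group takes longer to notice a consumer that has really died, and the broker only accepts timeouts within its configured minimum and maximum.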
Questions ?
TM and © 2017 Zendesk Inc. All rights reserved.
