
Kafka at Scale: Multi-Tier Architectures

This is a talk given at ApacheCon 2015

If data is the lifeblood of high technology, Apache Kafka is the circulatory system in use at LinkedIn. It is used for moving every type of data between systems, and it touches virtually every server, every day. This can only be accomplished with multiple Kafka clusters, installed at several sites, and they must all work together to ensure no message loss and almost no message duplication. In this presentation, we will discuss the architectural choices behind how the clusters are deployed, and the tools and processes that have been developed to manage them. Todd Palino will also discuss some of the challenges of running Kafka at this scale, and how they are being addressed both operationally and in the Kafka development community.

Note: each slide has extensive speaker notes that go into detail. Please make sure to check the downloaded file to get the full content!


  1. SITE RELIABILITY ENGINEERING ©2015 LinkedIn Corporation. All Rights Reserved. Kafka at Scale
  2. Todd Palino
  3. You may remember me from such talks as… “Apache Kafka Meetup” and “Enterprise Kafka: QoS and Multitenancy”
  4. Who Am I?
     - Kafka, Samza, and Zookeeper SRE at LinkedIn
     - Site Reliability Engineering
       – Administrators
       – Architects
       – Developers
     - Keep the site running, always
  5. What Will We Talk About?
     - Tiered Cluster Architecture
     - Kafka Mirror Maker
     - Performance Tuning
     - Data Assurance
     - What’s Next?
  6. Kafka At LinkedIn
     - 300+ Kafka brokers
     - Over 18,000 topics
     - 140,000+ Partitions
     - 220 Billion messages per day
     - 40 Terabytes In
     - 160 Terabytes Out
     - Peak Load
       – 3.25 Million messages/sec
       – 5.5 Gigabits/sec Inbound
       – 18 Gigabits/sec Outbound

     - 1100+ Kafka brokers
     - Over 32,000 topics
     - 350,000+ Partitions
     - 875 Billion messages per day
     - 185 Terabytes In
     - 675 Terabytes Out
     - Peak Load
       – 10.5 Million messages/sec
       – 18.5 Gigabits/sec Inbound
       – 70.5 Gigabits/sec Outbound
  7. Tiered Cluster Architecture
  8. One Kafka Cluster
  9. Single Cluster – Remote Clients
  10. Multiple Clusters – Local and Remote Clients
  11. Multiple Clusters – Message Aggregation
  12. Why Not Direct?
     - Network Concerns
       – Bandwidth
       – Network partitioning
       – Latency
     - Security Concerns
       – Firewalls and ACLs
       – Encrypting data in transit
     - Resource Concerns
       – A misbehaving application can swamp production resources
  13. Kafka Mirror Maker
  14. Kafka Mirror Maker
     - Consumes from one cluster, produces to another
     - No communication from producer back to consumer
     - Best practice is to keep the mirror maker local to the target cluster
     - Kafka does not prevent loops
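The mirror maker ships with Kafka as a command-line tool. A typical invocation, kept local to the target cluster as the slide recommends, might look like the following (the two config file names are placeholders for files you would write yourself):

```shell
# Consume from the source cluster (settings in source-cluster-consumer.properties),
# produce to the local target cluster (settings in target-cluster-producer.properties).
bin/kafka-mirror-maker.sh \
  --consumer.config source-cluster-consumer.properties \
  --producer.config target-cluster-producer.properties \
  --num.streams 4 \
  --whitelist ".*"
```

The whitelist is a regular expression; mirroring everything with `.*` is only safe because nothing in this architecture produces back to the source, which is how loops are avoided.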
  15. Rules of Aggregation
     - NEVER produce to aggregate clusters
  16. NEVER produce to aggregate clusters!
  17. Rules of Aggregation
     - NEVER produce to aggregate clusters
     - Not every topic needs to be aggregated
       – Log compacted topics do not play nice
       – Most queuing topics are local only
     - But your whitelist/blacklist configurations must be consistent
       – If you have a topic that is aggregated, make sure to do it from all source clusters to all aggregate clusters
     - Carefully consider if you want front-line aggregate clusters
       – It can encourage creating single-master services
       – Sometimes it is necessary, such as for search services
  18. Mirror Maker Concerns
     - Adding a site increases the number of mirror maker instances
       – Solution: multi-consumer mirror makers
     - Mirror maker can lose messages like any producer
       – Solution: reduce inflight batches and acks=-1
     - Mirror maker has to decompress and recompress every batch
       – Possible solution: flag compressed batches for keyed messages
     - Message partitions are not preserved
       – Possible solution: an identity mirror maker
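The loss-avoidance settings above map to standard Kafka (Java) producer configs. A sketch of the producer side of a mirror maker's configuration might look like this; the values are illustrative, not recommendations:

```properties
# Wait for all in-sync replicas to acknowledge each batch
acks=-1
# Keep a single batch in flight so a failed-and-retried batch
# cannot be reordered behind a later one
max.in.flight.requests.per.connection=1
# Retry on transient broker errors rather than dropping the batch
retries=2147483647
```

The trade-off is throughput: one in-flight batch per connection and full-ISR acks cost latency, which is why these settings matter most on pipelines where loss is unacceptable.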
  19. Performance Tuning
  20. Kafka Cluster Sizing
     - How big for your local cluster?
       – How much disk space do you have?
       – How much network bandwidth do you have?
       – CPU, memory, disk I/O
     - How big for your aggregate cluster?
       – In general, multiply the number of brokers by the number of local clusters
       – May have additional concerns with lots of consumers
  21. Topic Configuration
     - Partition Counts for Local
       – Many theories on how to do this correctly, but the answer is “it depends”
       – How many consumers do you have?
       – Do you have specific partition requirements?
       – Keep partition sizes manageable
     - Partition Counts for Aggregate
       – Multiply the number of partitions in a local cluster by the number of local clusters
       – Periodically review partition counts in all clusters
     - Message Retention
       – If aggregate is where you really need the messages, only retain them in local long enough to cover mirror maker problems
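The two multiplication rules above (aggregate brokers and aggregate partitions both scale with the number of local clusters feeding in) can be sketched as simple arithmetic; the function names and sample numbers here are made up for illustration:

```python
def aggregate_broker_count(local_brokers: int, num_local_clusters: int) -> int:
    """Rule of thumb: size the aggregate cluster at roughly the local
    broker count times the number of local clusters feeding into it."""
    return local_brokers * num_local_clusters


def aggregate_partition_count(local_partitions: int, num_local_clusters: int) -> int:
    """An aggregated topic receives every local cluster's partitions,
    so its partition count is the local count times the cluster count."""
    return local_partitions * num_local_clusters


# Example: three local clusters of 20 brokers, topics with 8 partitions each
print(aggregate_broker_count(20, 3))     # 60
print(aggregate_partition_count(8, 3))   # 24
```

These are starting points, not answers; the slide's caveats (consumer fan-out, partition size) are why the counts should be reviewed periodically rather than set once.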
  22. Mirror Maker Sizing
     - Number of servers and streams
       – Size the number of servers based on the peak bytes per second
       – Co-locate mirror makers
       – Run more mirror makers in an instance than you need
       – Use multiple consumer and producer streams
     - Other tunables to look at
       – Partition assignment strategy
       – In flight requests per connection
       – Linger time
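The three tunables named above correspond to standard Kafka client configs (consumer-side assignment strategy, producer-side batching and pipelining). A sketch, with illustrative values rather than recommendations:

```properties
# Consumer: spread partitions evenly across mirror maker streams
# instead of assigning contiguous ranges to one stream
partition.assignment.strategy=roundrobin
# Producer: wait briefly before sending so batches fill up
linger.ms=10
# Producer: how many batches may be pipelined per broker connection
# (higher = more throughput, but reordering/loss risk on retries)
max.in.flight.requests.per.connection=1
```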
  23. Segregation of Topics
     - Not all topics are created equal
     - High Priority Topics
       – Topics that change search results
       – Topics used for hourly or daily reporting
     - Run a separate mirror maker for these topics
       – One bloated topic won’t affect reporting
       – Restarting the mirror maker takes less time
       – Less time to catch up when you fall behind
  24. Data Assurance
  25. Monitoring
     - Kafka is great for monitoring your applications
  26. Monitoring
     - Have a system for monitoring Kafka components that does not use Kafka
       – At least for critical metrics
     - For tiered architectures
       – Simple health check on mirror maker instances
       – Mirror maker consumer lag
     - Is the data intact?
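Mirror maker consumer lag is just the gap between the broker's log-end offset and the mirror maker's committed offset, summed across partitions. A minimal sketch of the check, using hypothetical sample offsets rather than values fetched from a real cluster:

```python
def total_lag(log_end_offsets: dict, committed_offsets: dict) -> int:
    """Sum of per-partition lag: log-end offset minus committed offset.

    A partition with no committed offset is treated as fully behind.
    Alert when the total exceeds a threshold for your pipeline."""
    return sum(
        log_end_offsets[p] - committed_offsets.get(p, 0)
        for p in log_end_offsets
    )


# Hypothetical offsets for three partitions of one mirrored topic
log_end = {0: 1000, 1: 2500, 2: 800}
committed = {0: 990, 1: 2500, 2: 750}
print(total_lag(log_end, committed))  # 60
```

Note this check itself should not depend on the Kafka pipeline it is watching, per the first bullet above.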
  27. Auditing Message Flows
  28. Audit Content
     - Message audit header
       – Timestamp
       – Service and hostname
     - Audit messages
       – Start and end timestamps
       – Topic and tier
       – Count
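The audit scheme boils down to count reconciliation: each tier reports how many messages it saw for a topic in a time bucket, and the topic checks out only when every tier reports the same total. A minimal sketch of that comparison (tier names, topic, and counts are hypothetical):

```python
from collections import defaultdict


def reconcile(audit_messages):
    """audit_messages: iterable of (tier, topic, bucket, count) tuples.

    Returns {(topic, bucket): True/False} where True means every tier
    reported the same total for that topic and time bucket."""
    counts = defaultdict(dict)  # (topic, bucket) -> {tier: running total}
    for tier, topic, bucket, count in audit_messages:
        counts[(topic, bucket)][tier] = counts[(topic, bucket)].get(tier, 0) + count
    # Complete when all tiers collapse to a single distinct total
    return {key: len(set(tiers.values())) == 1 for key, tiers in counts.items()}


audit = [
    ("producer",  "PageView", "10:00", 1000),
    ("local",     "PageView", "10:00", 1000),
    ("aggregate", "PageView", "10:00", 990),   # 10 messages missing downstream
]
print(reconcile(audit))  # {('PageView', '10:00'): False}
```

Because only totals are compared, equal-sized loss and duplication cancel out, which is exactly the first concern raised on the next slide.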
  29. Audit Concerns
     - We are only counting messages
       – Duplication of messages can hide losses
       – Using the detailed service and host audit criteria, we can get around this
     - We can’t audit all consumers
       – The relational DB has issues keeping up with bootstrapping clients
       – This can be improved with changes to the database backend
     - We cannot handle complex message flows
       – The total number of messages has to appear in each tier that the topic is in
       – Multiple source clusters must have the same tier name
  30. Conclusion
  31. Work Needed in Kafka
     - Access controls
     - Encryption
     - Quotas
     - Decompression improvements in mirror maker
  32. Getting Involved With Kafka
     - http://kafka.apache.org
     - Join the mailing lists
       – users@kafka.apache.org
       – dev@kafka.apache.org
     - irc.freenode.net – #apache-kafka
     - Meetups
       – Apache Kafka: http://www.meetup.com/http-kafka-apache-org
       – Bay Area Samza: http://www.meetup.com/Bay-Area-Samza-Meetup/
     - Contribute code
