
Kafka overview and use cases

An overview of Kafka: how it works, its components, and its use cases, including Kafka at LinkedIn.
Download the slides to see animations explaining how the components fit together.
This was presented at a Kafka meetup held on June 11, 2016 at the LinkedIn Bangalore office.

Published in: Technology

  1. Kafka - Overview
     Indrajeet Kumar, Site Reliability Engineer at LinkedIn
     SITE RELIABILITY ENGINEERING ©2016 LinkedIn Corporation. All Rights Reserved.
  2. So what is it?
     It is a high-throughput, low-latency messaging system.
  3. And who uses it?
  4. What for?
     • Messaging
     • Website Activity Tracking
     • Metrics
     • Log Aggregation
     • Stream Processing
     • For fun ;)
  5. So how does it work?
     ▪ Components
       – Producer
       – Broker
         ▪ Topic
         ▪ Partition
       – Consumer
  6. Broker
     [Diagram labels: three producers; brokers B1, B2; partitions P1, P2; replicas P1 R, P2 R; three consumers]
  7. The consumer
     [Diagram labels: consumer; C1, C2; brokers B1, B2, B3; partitions P1, P2, P3; replica P1 R]
  8. The Producer
     [Diagram labels: Producer; brokers B1, B2, B3; partitions P1, P2, P3; P1, P2; replica P1 R]
  9. Attributes of a Kafka Cluster
     ▪ Durable
     ▪ Scalable
     ▪ Low Latency
     ▪ Finite Retention
     ▪ No single point of failure
  10. Kafka At LinkedIn
      ▪ Multiple Datacenters, Multiple Clusters
      ▪ Mirroring between clusters
      ▪ Message Types
        – Metrics
        – Tracking
        – Queuing
      ▪ Data transport from applications to Hadoop, and back
  11. Some numbers!
      ▪ 1800+ Broker machines
      ▪ 79K+ Topics
      ▪ 1.1M+ Partitions
      ▪ 1.3 Trillion messages per day
      ▪ 330 Terabytes in/day
      ▪ 1.2 Petabytes out/day
      ▪ Peak load for a single cluster
        – 2 million messages/sec
        – 4.7 Gigabits/sec inbound
        – 15 Gigabits/sec outbound
  12. Questions
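
The components named on slide 5 (producer, broker, topic, partition, consumer) can be illustrated with a small in-memory model. This is only a sketch and not Kafka's client API: `ToyTopic`, `ToyConsumer`, `produce`, and `consume` are hypothetical names, the `page-views` topic is made up, and `zlib.crc32` stands in for the real partitioner's hash purely to keep the example deterministic. It assumes keyed messages are mapped to a partition by hashing the key, and that each consumer tracks its own read offset per partition.

```python
import zlib


class ToyTopic:
    """Hypothetical in-memory stand-in for a topic split into partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        # Each partition is an append-only list; the list index is the offset.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Stable hash of the key, so the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def produce(self, key, value):
        """Append a message and return its (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


class ToyConsumer:
    """Hypothetical consumer that remembers its own offset per partition."""

    def __init__(self, topic, assigned):
        self.topic = topic
        self.offsets = {p: 0 for p in assigned}

    def consume(self):
        """Return all unread messages from the assigned partitions."""
        out = []
        for p in list(self.offsets):
            log = self.topic.partitions[p]
            out.extend(log[self.offsets[p]:])
            # Advance the offset; the messages themselves stay in the log.
            self.offsets[p] = len(log)
        return out


topic = ToyTopic("page-views", num_partitions=2)
for i in range(6):
    topic.produce(f"user-{i % 3}", f"view-{i}")

# Two consumers split the partitions between them.
c1 = ToyConsumer(topic, assigned=[0])
c2 = ToyConsumer(topic, assigned=[1])
first = c1.consume() + c2.consume()
second = c1.consume() + c2.consume()  # nothing new has been produced
```

Because reading only advances an offset, a second consumer with its own offsets could re-read the same partitions from the start, which is how several independent readers can share one topic. Ordering holds within a partition, not across partitions.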
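
The figures on slide 11 can be turned into per-second and per-message averages with some simple arithmetic. The sketch below assumes decimal units (1 Terabyte = 10^12 bytes, 1 Petabyte = 10^15 bytes) and that the peak message rate and peak inbound bandwidth refer to the same cluster at about the same time; neither assumption is stated in the slides, so the results are rough estimates only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily totals from the slide (decimal units assumed).
messages_per_day = 1.3e12    # 1.3 Trillion messages per day
bytes_in_per_day = 330e12    # 330 Terabytes in/day
bytes_out_per_day = 1.2e15   # 1.2 Petabytes out/day

avg_messages_per_sec = messages_per_day / SECONDS_PER_DAY
avg_message_bytes = bytes_in_per_day / messages_per_day
fan_out = bytes_out_per_day / bytes_in_per_day

# Peak load for a single cluster, from the same slide.
peak_messages_per_sec = 2e6
peak_in_bits_per_sec = 4.7e9
peak_avg_message_bytes = peak_in_bits_per_sec / 8 / peak_messages_per_sec

print(f"average messages/sec (all clusters): {avg_messages_per_sec:,.0f}")
print(f"average inbound message size: {avg_message_bytes:.0f} bytes")
print(f"bytes out per byte in: {fan_out:.1f}")
print(f"average message size at single-cluster peak: {peak_avg_message_bytes:.0f} bytes")
```

Under these assumptions the daily totals work out to roughly 15 million messages per second on average across all clusters, about 254 bytes per inbound message, and about 3.6 bytes read for every byte written. The last ratio would be consistent with each message being consumed more than once, for example by several consumers or by mirroring between clusters.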
