Kafka - LinkedIn’s messaging backbone
Who are we?
▪ Kafka SRE at LinkedIn
▪ Site Reliability Engineering
– Administrators
– Architects
– Developers
▪ Keep the site running, always
Presenters
▪ Clark Haskins
– Manager for Data Infra Streaming SRE (Mountain View, CA)
▪ Ayyappadas Ravindran
– Staff Site Reliability Engineer, Data Infra Streaming (Bengaluru)
▪ Akash Vacher
– Site Reliability Engineer, Data Infra Streaming (Bengaluru)
Agenda
▪ What the heck is Kafka?
– Brief intro
– Motivation to build Kafka
▪ Okay, why should I bother?
– Kafka facts, scale & performance
▪ You have my attention, tell me more!
– Core concepts
– Operating Kafka
– Kafka @ LinkedIn
▪ Nice, where all do you use Kafka?
– Tale of two applications
▪ Have questions?
What the heck is Kafka?
▪ A high-throughput distributed messaging system
▪ Developed at LinkedIn and open sourced in early 2011
▪ Implemented in Scala and Java
▪ LinkedIn’s messaging backbone
▪ Kafka powers around 1000 companies including
LinkedIn, Yahoo!, Netflix, Uber, Twitter and many more
"If data is the lifeblood of high technology, Apache Kafka is the circulatory system used at LinkedIn"
– Todd Palino (Staff Site Reliability Engineer, LinkedIn)
Motivation to create Kafka?
▪ Needed a unified platform to handle all
real-time data feeds and stream processing
▪ Wanted a messaging system with high
throughput to support high-volume event feeds
▪ Needed data persistence for offline
systems and for service recovery
▪ Low latency
▪ Fault tolerant
▪ Linearly scalable
Okay! What was the motivation to create Kafka?
Before
After
How is Kafka used at LinkedIn?
▪ Application and system monitoring (inGraphs)
▪ User tracking on LinkedIn websites
▪ Email, push & SMS notifications
▪ Live search updates
▪ Samza Jobs (standardization, call graph and more)
▪ Database Replication
Okay, why should I bother?
▪ Over 1,300,000,000,000 messages are transported
via Kafka every day at LinkedIn
▪ 300 Terabytes of inbound and 900 Terabytes of
outbound traffic
▪ 4.5 million messages per second on a single
cluster
▪ Kafka runs on around 1300 servers at LinkedIn
Hmm! How good is Kafka?
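To put these figures in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes the message volume is spread evenly over 24 hours, that the inbound and outbound traffic figures are also per day, and that a terabyte means 10^12 bytes. None of these assumptions is stated on the slide.

```python
# Back-of-the-envelope arithmetic on the figures quoted on the slide.
MESSAGES_PER_DAY = 1_300_000_000_000   # from the slide
INBOUND_TB = 300                       # from the slide (assumed to be per day)
OUTBOUND_TB = 900                      # from the slide (assumed to be per day)
SECONDS_PER_DAY = 24 * 60 * 60

# Average rate across all clusters, assuming an even spread over the day.
avg_messages_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY

# How many times each inbound byte is read back out, on average.
fan_out = OUTBOUND_TB / INBOUND_TB

# Average message size, assuming decimal terabytes (10**12 bytes).
avg_message_bytes = INBOUND_TB * 10**12 / MESSAGES_PER_DAY

print(f"average rate: {avg_messages_per_second:,.0f} messages/second")
print(f"read fan-out: {fan_out:.1f}x")
print(f"average message size: {avg_message_bytes:.0f} bytes")
```

Under those assumptions the fleet-wide average works out to roughly 15 million messages per second, with each inbound byte consumed about three times. The 4.5 million figure on the slide is for a single cluster, so the two numbers are not directly comparable.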
You have my attention, tell me more!
▪ Building blocks
– Message
– Producers
– Consumers
– Topics
– Partitions
– Segments
– Brokers
– Replicas
Awesome! I am in. Tell me more!
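A toy in-memory model can make a few of these building blocks concrete. The sketch below is not Kafka's implementation: the `Topic` and `Partition` classes are hypothetical, `zlib.crc32` stands in for whatever hash a real client uses, and segments, brokers and replicas are left out entirely. It only illustrates that a topic is split into partitions, that each partition is an append-only log addressed by offset, and that messages with the same key land in the same partition.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Partition:
    """An append-only log; a message's offset is its position in the log."""
    log: List[bytes] = field(default_factory=list)

    def append(self, message: bytes) -> int:
        self.log.append(message)
        return len(self.log) - 1  # offset of the message just written


@dataclass
class Topic:
    """A named feed of messages, split into a fixed number of partitions."""
    name: str
    num_partitions: int
    partitions: List[Partition] = field(init=False)

    def __post_init__(self) -> None:
        self.partitions = [Partition() for _ in range(self.num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Assumption: a simple hash-mod partitioner (crc32 is a stand-in).
        return zlib.crc32(key) % self.num_partitions

    def produce(self, key: bytes, value: bytes) -> Tuple[int, int]:
        """Producer side: append to the key's partition, return (partition, offset)."""
        p = self.partition_for(key)
        return p, self.partitions[p].append(value)

    def consume(self, partition: int, offset: int) -> List[bytes]:
        """Consumer side: read everything from `offset` onwards."""
        return self.partitions[partition].log[offset:]


topic = Topic("page-views", num_partitions=4)
first = topic.produce(b"member-42", b"viewed /home")
second = topic.produce(b"member-42", b"viewed /jobs")
print(first, second)
print(topic.consume(first[0], 0))
```

Because both messages share a key, they go to the same partition and receive consecutive offsets, so a consumer reading that partition sees them in the order they were produced.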
Bird’s eye view
The data continues...
What Is Kafka?
[Diagram labels: Producer, Broker (A P0, A P1, A P0), Consumer, ZooKeeper]
Performance recipes
▪ OS page cache
▪ Linear I/O: never fear the file system!
▪ sendfile() system call
▪ Message batching
Dude, tell me the performance secret!
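Of the recipes above, message batching is the easiest to show in a few lines. The sketch below is only an illustration of the idea, not Kafka's producer: `BatchingSender` and its `send_batch` callback are hypothetical names, and the only flush trigger is a message count. Real clients typically also flush on a timer or a byte limit.

```python
from typing import Callable, List


class BatchingSender:
    """Buffers messages and hands them to `send_batch` in groups.

    Sending one request per batch instead of one per message amortizes
    the per-request overhead (network round trip, system calls).
    """

    def __init__(self, send_batch: Callable[[List[bytes]], None],
                 max_batch_size: int = 3) -> None:
        if max_batch_size < 1:
            raise ValueError("max_batch_size must be at least 1")
        self._send_batch = send_batch
        self._max_batch_size = max_batch_size
        self._buffer: List[bytes] = []

    def send(self, message: bytes) -> None:
        self._buffer.append(message)
        if len(self._buffer) >= self._max_batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, if anything."""
        if self._buffer:
            self._send_batch(list(self._buffer))
            self._buffer.clear()


requests: List[List[bytes]] = []          # stands in for the network
sender = BatchingSender(requests.append, max_batch_size=3)
for i in range(7):
    sender.send(f"event-{i}".encode())
sender.flush()                            # push out the partial last batch
print([len(batch) for batch in requests])  # prints [3, 3, 1]
```

Seven messages result in three requests rather than seven. The final explicit `flush()` matters: without it the last, partially filled batch would stay in the buffer.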
Operating Kafka
▪ Broker Hardware
– Cisco C240, Intel Xeon, 64 GB
RAM, 14-disk RAID 10
▪ ZooKeeper Hardware
– 5 + 1 ensemble, 64 GB RAM,
500 GB SSD
▪ Monitoring
– Lag monitoring
– Under-replicated partitions
– Unclean leader election
– Burrow
▪ Cluster rebalance
– Size-wise rebalance
– Partition-wise rebalance
Tell me how you manage this beast!
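Two of the monitoring items above reduce to simple arithmetic, sketched below. The function names and the dictionary shapes are hypothetical, chosen for illustration; this is not how Burrow or any LinkedIn tool is implemented. Consumer lag is taken as the partition's log end offset minus the consumer's committed offset, and a partition counts as under-replicated when its in-sync replica set is smaller than its assigned replica set.

```python
from typing import Dict, List, Tuple


def consumer_lag(log_end_offsets: Dict[int, int],
                 committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: messages written but not yet consumed.

    Assumption: a partition with no committed offset is treated as
    committed at 0, i.e. the consumer has read nothing yet.
    """
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in log_end_offsets.items()
    }


def under_replicated(
    assignments: Dict[int, Tuple[List[int], List[int]]]
) -> List[int]:
    """Partitions whose in-sync replicas (ISR) are fewer than their replicas.

    `assignments` maps partition -> (replica broker ids, in-sync broker ids).
    """
    return sorted(
        partition
        for partition, (replicas, isr) in assignments.items()
        if len(isr) < len(replicas)
    )


lag = consumer_lag({0: 1500, 1: 900}, {0: 1480, 1: 900})
urp = under_replicated({
    0: ([1, 2, 3], [1, 2, 3]),   # fully replicated
    1: ([2, 3, 4], [2]),         # two followers have fallen out of sync
})
print(lag)  # prints {0: 20, 1: 0}
print(urp)  # prints [1]
```

A single lag reading says little on its own; what matters for alerting is usually whether lag keeps growing over time, which this sketch does not attempt to model.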
Mirror Maker and Audit
Kafka Audit (event count)
Kafka Audit (data transport time)
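The slides give only the titles of the audit views, so the following is a guess at the kind of check an event-count audit performs, not a description of LinkedIn's audit system. It assumes producers report how many messages they sent in each time bucket and that a downstream tier reports how many it saw for the same bucket. The function name and data shapes are hypothetical.

```python
from typing import Dict


def audit_completeness(producer_counts: Dict[str, int],
                       tier_counts: Dict[str, int]) -> Dict[str, float]:
    """Fraction of produced messages seen downstream, per time bucket.

    A value below 1.0 suggests loss or delay; above 1.0 suggests duplicates.
    Buckets with nothing produced are reported as complete (1.0).
    """
    result: Dict[str, float] = {}
    for bucket, produced in producer_counts.items():
        seen = tier_counts.get(bucket, 0)
        result[bucket] = seen / produced if produced else 1.0
    return result


produced = {"12:00": 1000, "12:10": 800}    # counts reported by producers
aggregate = {"12:00": 1000, "12:10": 792}   # counts seen in a downstream tier
for bucket, ratio in audit_completeness(produced, aggregate).items():
    print(f"{bucket}: {ratio:.2%} complete")
```

In this made-up data the 12:10 bucket is missing eight messages downstream, which is the kind of gap such an audit would be expected to surface.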
Kafka @ LinkedIn
▪ Cluster Types
– Tracking
– Metrics
– Queuing
▪ Kafka REST
▪ Schema Registry
Kafka @ LinkedIn - Schema Registry
Autometrics
▪ Building Blocks
– Sensors
– EventBus
– Kafka REST
– Kafka cluster
– Kafka consumer
– RRD
– Front end
▪ Facts & Figures
– 320,000,000 metrics
collected per minute
– 530 TB of disk space
– Over 210,000 metrics
collected per service
inGraphs
Kafka for database replication - Master-slave
Kafka for database replication - Multi-master
Have questions?
