Apache Kafka is a simple, high-performance, distributed, fault-tolerant messaging system. It was initially developed at LinkedIn and is now used at many companies, including Twitter, Square, Mozilla, Foursquare, and Tumblr. This talk will cover the architecture of Kafka and how LinkedIn uses it to build a distributed, low-latency pipeline that handles all messaging, tracking, logging, and metrics data. This unified pipeline feeds data into Hadoop and into a diverse set of user-facing, real-time stream-processing applications. We will describe the lessons learned while scaling this service to thousands of data feeds and many terabytes of messages per day.