Stream Processing with Kafka and Samza

Published in: Technology

  1. Stream Processing with Kafka and Samza. Diego Pacheco (@diego_pacheco), Principal Software Architect
  2. ● LinkedIn, 2011
     ● Implemented in Scala and Java
     ● Motivation: real-time data feeds
     ● Goals:
       – Low latency
       – High throughput
     ● Kafka at LinkedIn (2014):
       – 300+ brokers
       – 18k topics
       – 140k partitions
       – 220B messages per day
       – 40 TB inbound
       – 160 TB outbound
       – Peak load: 3.25M messages/second
     ● Use cases: activity stream, offline log processing
  3. NO JMS
  4. ● LinkedIn, 2013
     ● Stream processing with save points
     ● Multi-tenancy: 1 thread per container
     ● State is simple
       – You handle logging and restoring
       – Single-threaded programming
     ● Works with YARN
     ● Works well with Kafka
     ● Simple API
       – Record-like
  5. ● Stream processing
     ● Low latency
     ● Async processing
     ● Local state
     ● Stores data locally on disk
     ● Same machine where the container runs
       – Awesome fit for stateful processing
     ● Tight integration with Kafka
     ● Strong model for streams: ordered, highly available, partitioned, and durable (Kafka)
     ● Full feature set of Kafka
     ● Client-side join
  6. Stream Processing with Kafka and Samza. Diego Pacheco (@diego_pacheco), Principal Software Architect. Thank you! Obrigado!
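
The 2014 figures on slide 2 can be related to each other with some quick arithmetic. The sketch below is only a back-of-the-envelope check: it assumes all numbers describe the same period and that traffic is spread evenly over a day, neither of which the slide states. The variable names are illustrative.

```python
# Figures copied from slide 2 (Kafka at LinkedIn, 2014).
MESSAGES_PER_DAY = 220e9
PEAK_MESSAGES_PER_SECOND = 3.25e6
TOPICS = 18_000
PARTITIONS = 140_000
INBOUND_TB = 40
OUTBOUND_TB = 160

SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: messages arrive evenly across the day.
average_rate = MESSAGES_PER_DAY / SECONDS_PER_DAY

# How far the reported peak sits above that daily average.
peak_to_average = PEAK_MESSAGES_PER_SECOND / average_rate

# Mean number of partitions behind each topic.
partitions_per_topic = PARTITIONS / TOPICS

# Bytes read per byte written (assumes both figures cover the same period).
read_fan_out = OUTBOUND_TB / INBOUND_TB

print(f"average rate:         {average_rate / 1e6:.2f}M messages/second")
print(f"peak / average:       {peak_to_average:.2f}x")
print(f"partitions per topic: {partitions_per_topic:.1f}")
print(f"read fan-out:         {read_fan_out:.1f}x")
```

Under these assumptions the average works out to roughly 2.55M messages/second, so the reported peak is only about 1.28 times the average, and each byte written is read about four times.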

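The "save points" and "single-threaded programming" bullets on slide 4 can be illustrated with a small simulation. This is a minimal sketch and not the Samza API: `CountingTask`, `run`, and the in-memory list standing in for one Kafka partition are all hypothetical, and the checkpoint here snapshots the state together with the offset as a simplification of logging and restoring state.

```python
class CountingTask:
    """Hypothetical single-threaded task that counts messages per value."""

    def __init__(self, checkpoint_every=2):
        self.counts = {}            # task state
        self.next_offset = 0        # next log position to read
        self.checkpoint_every = checkpoint_every
        self._saved = (0, {})       # last save point: (offset, state copy)

    def process(self, offset, message):
        # One message at a time, so no locking is needed.
        self.counts[message] = self.counts.get(message, 0) + 1
        self.next_offset = offset + 1
        if self.next_offset % self.checkpoint_every == 0:
            self.checkpoint()

    def checkpoint(self):
        # Save the offset and the state together so they stay consistent.
        self._saved = (self.next_offset, dict(self.counts))

    def restore(self):
        # Roll back to the last save point, e.g. after a container restart.
        offset, counts = self._saved
        self.next_offset = offset
        self.counts = dict(counts)


def run(task, log, stop_before=None):
    """Feed the task from its current offset up to stop_before (exclusive)."""
    end = len(log) if stop_before is None else stop_before
    for offset in range(task.next_offset, end):
        task.process(offset, log[offset])


log = ["a", "b", "a", "c", "a"]     # stands in for one ordered partition

task = CountingTask(checkpoint_every=2)
run(task, log, stop_before=3)       # "crash" after three messages
state_before_crash = dict(task.counts)

task.restore()                      # back to the save point at offset 2
resumed_from = task.next_offset
run(task, log)                      # replay from the save point to the end
final_counts = dict(task.counts)

print(resumed_from, final_counts)
```

Because the offset and the state are restored together, the message at offset 2 is replayed without being counted twice.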
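
The "local state" and "client-side join" bullets on slide 5 describe a pattern that can be sketched without any framework: two streams are partitioned by the same key, and each task joins them against a store it keeps locally. The code below is an illustration under stated assumptions, not Samza code. `JoinTask`, `partition_for`, and the event tuples are hypothetical, and a plain dict stands in for the on-disk store.

```python
def partition_for(key, num_partitions):
    """Deterministic partitioner so the same key always lands on the same task."""
    return sum(key.encode("utf-8")) % num_partitions


class JoinTask:
    """Hypothetical task that joins page views against locally stored profiles."""

    def __init__(self):
        self.profiles = {}   # local state; a real store would live on local disk
        self.output = []     # joined records emitted by this task

    def on_profile(self, user, profile):
        # Keep only the latest profile per user.
        self.profiles[user] = profile

    def on_view(self, user, page):
        # The lookup is local: no remote call is made during the join.
        self.output.append((user, page, self.profiles.get(user)))


NUM_PARTITIONS = 2
tasks = [JoinTask() for _ in range(NUM_PARTITIONS)]

# Both streams are keyed by user, so they are co-partitioned.
events = [
    ("profile", "alice", "engineer"),
    ("view", "alice", "/jobs"),
    ("view", "bob", "/home"),        # arrives before bob's profile
    ("profile", "bob", "designer"),
    ("view", "bob", "/feed"),
]

for kind, user, value in events:
    task = tasks[partition_for(user, NUM_PARTITIONS)]
    if kind == "profile":
        task.on_profile(user, value)
    else:
        task.on_view(user, value)

joined = sorted(
    (record for t in tasks for record in t.output),
    key=lambda r: (r[0], r[1]),
)
for record in joined:
    print(record)
```

The view of `/home` is joined with `None` because it was processed before that user's profile arrived, which shows why ordering within a partition matters for this kind of join.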