Apache Kafka is publish-subscribe messaging rethought as a distributed commit log. It is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Storm is a distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. This session presents the main concepts of Kafka and Storm and then shows how a simple stream processing application is implemented using these two technologies.
Clipping is a handy way to collect important slides you want to go back to later.