Message Intelligence Service (MIS) collects connection and message metadata from thousands of production email clusters for near-real-time monitoring, reporting, and searching. The service processes billions of messages a day and manages hundreds of terabytes of indices with a standard 30-day retention period.
MIS consists of conventional data pipeline components from the Kafka and Elastic ecosystems. What may be unusual is the choice to deploy these components on Kubernetes.
Kubernetes is the current choice for deploying stateless services in the cloud. Unfortunately, many of the simplifying assumptions that make Kubernetes outstanding for deploying stateless services cause complications for stateful services such as Kafka and Elasticsearch.
This presentation discusses some of these challenges and how we dealt with them.