This document presents a memory capacity model for optimizing data filtering applications built on the Apache Samza framework. The model estimates the size of the live data set from application parameters such as input topics, partitions, and message sizes. It then sets the required heap size to twice the live data size, leaving the garbage collector enough headroom to avoid collection-related problems. Evaluation shows that the model accurately predicts memory usage, which allows more Samza containers to run on each node while still meeting service level agreements.
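The sizing rule summarized above can be illustrated with a minimal sketch. Only the final step (heap size equals twice the live data size) comes from the text. The way the live data size is computed here is an assumption made for illustration: it simply multiplies topics, partitions, buffered messages, and average message size, then adds a fixed overhead. The names `FilterJobParams`, `buffered_messages_per_partition`, and `fixed_overhead_bytes` are hypothetical and are not part of Samza or of the model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterJobParams:
    """Application parameters the model takes as input (names are hypothetical)."""
    input_topics: int
    partitions_per_topic: int
    avg_message_bytes: int
    # Assumption: how many messages are held in memory per partition at once.
    buffered_messages_per_partition: int
    # Assumption: live data that does not depend on the input (framework state, etc.).
    fixed_overhead_bytes: int = 0


def estimate_live_bytes(params: FilterJobParams) -> int:
    """Illustrative live data estimate; the actual model may use a different formula."""
    buffered_messages = (
        params.input_topics
        * params.partitions_per_topic
        * params.buffered_messages_per_partition
    )
    return buffered_messages * params.avg_message_bytes + params.fixed_overhead_bytes


def required_heap_bytes(live_bytes: int, headroom_factor: int = 2) -> int:
    """Heap size rule from the text: twice the live data size."""
    return headroom_factor * live_bytes


if __name__ == "__main__":
    params = FilterJobParams(
        input_topics=2,
        partitions_per_topic=8,
        avg_message_bytes=1024,
        buffered_messages_per_partition=1000,
        fixed_overhead_bytes=50 * 1024 * 1024,
    )
    live = estimate_live_bytes(params)
    heap = required_heap_bytes(live)
    print(f"live data: {live} bytes, required heap: {heap} bytes")
```

Keeping the headroom factor as a parameter makes it easy to see how the per-container heap, and therefore the number of containers that fit on a node, changes if a different multiple of the live data size is chosen.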