This document discusses using a stack of Hadoop, Spark, and Elasticsearch to perform anomaly detection on large datasets, in both batch and real time. Hadoop provides large-scale data storage and preprocessing. Spark performs in-depth analysis to identify common entities and build models of normal behavior. Elasticsearch enables real-time search over the data and aggregations that surface uncommon entities. A feedback loop continuously updates the models with streaming data, improving anomaly detection over time.
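The core rare-entity idea behind this stack can be sketched in plain Python. This is an illustrative, single-machine sketch only: in the pipeline described above, the frequency counts would be computed with Spark over data stored in Hadoop, and the rare-entity lookup would be served by Elasticsearch aggregations. The entity names and threshold below are hypothetical.

```python
from collections import Counter

def find_rare_entities(events, threshold=0.02):
    """Flag entities whose relative frequency falls below `threshold`.

    A toy stand-in for the batch model: count how often each entity
    appears, then mark entities seen rarely enough to be anomalous.
    """
    counts = Counter(events)
    total = len(events)
    return {entity for entity, count in counts.items()
            if count / total < threshold}

# Hypothetical stream of entity identifiers (e.g. user agents or IPs).
events = ["ua-common"] * 95 + ["ua-other"] * 4 + ["ua-rare"]
print(find_rare_entities(events))  # → {'ua-rare'}
```

In the full system, this threshold-based check would run continuously against incoming events, and the counts themselves would be refreshed as new data arrives, which is the feedback loop the paragraph above describes.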