This document describes DeviceAnalyzer, a system that builds predictive models in near real time using Apache Spark and Apache Lucene. It covers four topics: 1) integrating Spark and Lucene, which adds column search capabilities to Spark and Spark operations to Lucene; 2) representing Spark DataFrames as Lucene documents so that a distributed Lucene index can be built from DataFrames; 3) using the index to search for devices matching a query, generate statistical and predictive models on the retrieved devices, and find dimensions correlated with the selected devices; and 4) architectural components such as Trapezium for batch, streaming, and API services, and a LuceneDAO for indexing DataFrames and querying the index.
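
To make the row-to-document idea concrete, here is a minimal sketch in plain Python. It does not use the Spark or Lucene APIs and is not DeviceAnalyzer's implementation: a small in-memory inverted index stands in for the Lucene index, a list of dicts stands in for a DataFrame, and the column names, sample rows, and function names (`row_to_document`, `build_index`, `search`, `dimension_counts`) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows standing in for a DataFrame of devices (made-up data).
rows = [
    {"device_id": "d1", "os": "android", "zip": "94301", "app": "maps"},
    {"device_id": "d2", "os": "ios", "zip": "94301", "app": "mail"},
    {"device_id": "d3", "os": "android", "zip": "10001", "app": "maps"},
]


def row_to_document(row):
    """Flatten one row into (field, term) pairs, one pair per indexed column."""
    return [(col, str(val)) for col, val in row.items() if col != "device_id"]


def build_index(rows):
    """Build an inverted index mapping (field, term) to a set of device ids."""
    index = defaultdict(set)
    for row in rows:
        for field_term in row_to_document(row):
            index[field_term].add(row["device_id"])
    return index


def search(index, **criteria):
    """Return the device ids matching every column=value criterion (AND query)."""
    postings = [index.get((col, str(val)), set()) for col, val in criteria.items()]
    return set.intersection(*postings) if postings else set()


def dimension_counts(rows, selected_ids, column):
    """Count the values of one column over the selected devices."""
    counts = defaultdict(int)
    for row in rows:
        if row["device_id"] in selected_ids:
            counts[row[column]] += 1
    return dict(counts)


index = build_index(rows)
selected = search(index, os="android")
app_counts = dimension_counts(rows, selected, "app")
```

In this sketch, `search` plays the role of retrieving devices that match a query, and `dimension_counts` is a crude stand-in for the statistics computed on the retrieved devices. A distributed version would presumably build one such index per partition and merge query results, but that part is an assumption and is not shown here.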