Incident response, post-facto forensics, and network troubleshooting rely on the ability to quickly extract relevant information. To this end, security analysts and network operators need a system that (i) allows queries to be expressed directly using domain-specific constructs, (ii) delivers the performance required for interactive analysis, and (iii) is not affected by a continuously arriving stream of semi-structured data.
This talk covers the design and implementation plans of a distributed analytics platform that meets these requirements. Proven Google architectures such as GFS, Bigtable, Chubby, and Dremel heavily influenced the design of the system, which uses bitmap indexes to meet the interactive query requirements. The goal is to develop a prototype ready for production use in the next few months and to obtain feedback from running it at various large-scale sites serving tens of thousands of machines.
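
To illustrate why bitmap indexes suit interactive queries, the following is a minimal sketch of an equality-encoded bitmap index, not the platform's actual implementation. The class `BitmapIndex`, the helper `row_ids`, and the sample connection events are all hypothetical and exist only for illustration. Each distinct value gets one bit vector, so a conjunctive query reduces to a bitwise AND.

```python
from collections import defaultdict


class BitmapIndex:
    """Hypothetical equality-encoded bitmap index: one bit vector per value."""

    def __init__(self):
        # Maps each distinct value to a Python int used as a bit vector.
        self.bitmaps = defaultdict(int)
        self.size = 0  # number of rows appended so far

    def append(self, value):
        # Set the bit for the new row in the bit vector of `value`.
        self.bitmaps[value] |= 1 << self.size
        self.size += 1

    def lookup(self, value):
        # Return the bit vector of rows equal to `value` (0 if unseen).
        return self.bitmaps.get(value, 0)


def row_ids(bitmap):
    """Convert a bit vector into the list of row positions whose bit is set."""
    ids = []
    pos = 0
    while bitmap:
        if bitmap & 1:
            ids.append(pos)
        bitmap >>= 1
        pos += 1
    return ids


# Made-up connection events, appended in arrival order.
events = [
    {"src": "10.0.0.1", "port": 80},
    {"src": "10.0.0.2", "port": 22},
    {"src": "10.0.0.1", "port": 22},
    {"src": "10.0.0.3", "port": 80},
]

src_index = BitmapIndex()
port_index = BitmapIndex()
for event in events:
    src_index.append(event["src"])
    port_index.append(event["port"])

# Query: src == 10.0.0.1 AND port == 22, answered with one bitwise AND.
hits = src_index.lookup("10.0.0.1") & port_index.lookup(22)
result = row_ids(hits)
print(result)
```

Because new events only append bits, the index can grow with an arriving stream without rewriting existing entries. A production system would likely compress the bit vectors, which this sketch omits.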