This document proposes using Flume to collect logs from distributed machines centrally and in near real time, store them in Hadoop (HDFS), and make them queryable through Hive. Flume agents would monitor the log files on each machine and transfer new entries to HDFS. Hive provides a SQL-like interface for inserting and querying the log data stored in HDFS, so reports can be generated by running SQL-like queries against the centralized logs.
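
To make the reporting step concrete, here is a minimal sketch in Python of the kind of query the centralized store would enable. It is not Flume or Hive code: the standard-library `sqlite3` module stands in for Hive, and an in-memory list stands in for the lines that agents would deliver to HDFS. The log line format, the table name `logs`, the host names, and the helper functions are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical log lines, as they might arrive from agents on two machines.
# Assumed format: "<host> <ISO date> <level> <message>"
COLLECTED_LINES = [
    "web01 2024-01-05 ERROR disk full",
    "web01 2024-01-05 INFO request served",
    "web02 2024-01-05 ERROR timeout",
    "web02 2024-01-06 INFO request served",
]


def parse_line(line):
    """Split one log line into (host, day, level, message)."""
    host, day, level, message = line.split(" ", 3)
    return host, day, level, message


def build_store(lines):
    """Load parsed lines into one central table (a stand-in for a Hive table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logs (host TEXT, day TEXT, level TEXT, message TEXT)"
    )
    conn.executemany(
        "INSERT INTO logs VALUES (?, ?, ?, ?)",
        [parse_line(line) for line in lines],
    )
    return conn


def errors_per_host(conn):
    """Report: number of ERROR lines per host, ordered by host name."""
    cursor = conn.execute(
        "SELECT host, COUNT(*) FROM logs "
        "WHERE level = 'ERROR' GROUP BY host ORDER BY host"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    store = build_store(COLLECTED_LINES)
    for host, count in errors_per_host(store):
        print(host, count)
```

The point of the sketch is only that, once logs from every machine sit in one queryable table, a report reduces to a single aggregate query. In the proposed setup the same `GROUP BY` style of query would be written in Hive's SQL-like language against data stored in HDFS.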