The document describes a log collection and processing system. Logs are collected from clients by log collection daemons and sent to log processors. The log processors aggregate the logs by category, clean and convert the data, then store it in HDFS. The data is then replicated to analytics clusters where it can be accessed for analysis, reporting and model training.