Paul Kim from Daum presented on data analysis using Hadoop. The goals were to improve search quality and to understand users by analyzing 40TB of search logs spanning six months. Features were extracted from query-collection, query-document-session, and session-query relationships using MapReduce jobs on Hadoop clusters, and models were then built with SAS, Weka, and R for tasks such as spam indexing and blog classification. Hadoop made it feasible to analyze the full dataset at low cost, without resorting to sampling.
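The kind of feature extraction described can be sketched as a local map/reduce simulation. This is an illustrative assumption, not Daum's actual pipeline: the log schema (tab-separated timestamp, session, query, clicked document) and function names are invented here, and a real deployment would run the equivalent logic as Hadoop MapReduce jobs over the 40TB log set.

```python
from collections import defaultdict


# Hypothetical log format (an assumption for illustration, not Daum's schema):
# timestamp \t session_id \t query \t clicked_document_id
def map_phase(lines):
    """Mapper: emit ((query, doc), 1) for each click record."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip malformed records
        _ts, _session, query, doc = fields
        yield (query, doc), 1


def reduce_phase(pairs):
    """Reducer: sum the counts per (query, doc) key."""
    counts = defaultdict(int)
    for key, n in pairs:
        counts[key] += n
    return dict(counts)


logs = [
    "t1\ts1\thadoop\tdoc_a",
    "t2\ts1\thadoop\tdoc_a",
    "t3\ts2\thadoop\tdoc_b",
]
# Query-document click counts, usable as features for downstream models
features = reduce_phase(map_phase(logs))
```

The resulting counts (e.g., how often a document is clicked for a query) are the sort of relationship features that could then be exported to tools like SAS, Weka, or R for model building.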