This document summarizes a presentation on using semi-supervised learning on Hadoop to understand user behavior on large websites. The approach clusters user sessions to identify distinct user segments, labels the resulting clusters, and then uses supervised learning to classify every session into a segment. Key metrics, such as satisfaction scores, are computed for each segment to identify opportunities to improve the user experience and business metrics. The metrics are smoothed over time so that ordinary day-to-day fluctuations do not alarm the people reading them. The overall goal is to measure and drive user satisfaction across a diverse user population.
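
The pipeline described above (cluster, label, classify, measure per segment, smooth) can be sketched on a single machine as follows. This is a minimal illustration, not the presenters' implementation: the session features (pages viewed, minutes on site), the segment names "browser" and "researcher", the satisfaction scores, the nearest-centroid classifier, and the choice of exponential smoothing are all assumptions made for the example. The summary does not say which algorithms were used, and a Hadoop deployment would distribute these steps across a cluster.

```python
import random
from collections import defaultdict

def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of feature tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda i: dist2(p, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: returns k cluster centers."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return centers

def train_nearest_centroid(labeled):
    """Supervised step: one centroid per label, from (features, label) pairs."""
    by_label = defaultdict(list)
    for features, label in labeled:
        by_label[label].append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}

def classify(model, features):
    """Assign the label whose centroid is closest to the session."""
    return min(model, key=lambda label: dist2(features, model[label]))

def segment_means(model, sessions, scores):
    """Mean satisfaction score per predicted segment."""
    by_segment = defaultdict(list)
    for s, score in zip(sessions, scores):
        by_segment[classify(model, s)].append(score)
    return {seg: sum(v) / len(v) for seg, v in by_segment.items()}

def smooth(values, alpha=0.3):
    """Exponential smoothing of a daily metric series (assumed method)."""
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out

# Hypothetical sample of sessions: (pages viewed, minutes on site).
sample = [(1, 1), (2, 1), (3, 2), (2, 2), (20, 15), (22, 14), (19, 16), (21, 15)]

# 1. Cluster the sample (unsupervised).
centers = kmeans(sample, k=2)

# 2. Label the clusters. A person would inspect each cluster; here a
#    simple rule stands in: fewer pages = "browser", more = "researcher".
order = sorted(range(2), key=lambda i: centers[i][0])
names = {order[0]: "browser", order[1]: "researcher"}
labeled = [(s, names[nearest(s, centers)]) for s in sample]

# 3. Train a classifier on the labeled sample, then classify all sessions.
model = train_nearest_centroid(labeled)
all_sessions = sample + [(1, 2), (23, 17)]

# 4. Per-segment satisfaction (scores are made up for illustration).
scores = [0.9, 0.8, 0.7, 0.8, 0.4, 0.5, 0.3, 0.4, 0.8, 0.4]
per_segment = segment_means(model, all_sessions, scores)

# 5. Smooth a hypothetical daily series for one segment.
daily = [0.80, 0.55, 0.82, 0.79, 0.60, 0.81]
smoothed = smooth(daily, alpha=0.3)
print(per_segment)
print([round(v, 3) for v in smoothed])
```

The smoothing factor `alpha` controls the trade-off: smaller values damp daily noise more strongly but react more slowly to a real shift in a segment's satisfaction.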