This document provides an overview of clustering techniques and presents a sample MapReduce implementation of K-Means clustering and Canopy clustering on a large dataset. It discusses how clustering can be used to group large, high-dimensional datasets and describes hierarchical and partitional clustering algorithms. It also outlines the steps taken in the MapReduce implementation to distribute the clustering computation across multiple nodes.
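To make the idea of distributing K-Means concrete, here is a minimal single-process sketch of one K-Means iteration written as a map step and a reduce step. It is an illustration under stated assumptions, not the implementation this document presents: the function names (`map_point`, `reduce_cluster`, `kmeans_iteration`) are hypothetical, distances are assumed to be Euclidean, and an in-memory dictionary stands in for the shuffle phase that a MapReduce framework would perform across nodes. Canopy clustering is not covered.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_point(point, centroids):
    """Map step: emit (index of the nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, (point, 1)


def reduce_cluster(values):
    """Reduce step: average all points that were assigned to one centroid."""
    total = None
    count = 0
    for point, n in values:
        if total is None:
            total = list(point)
        else:
            total = [a + b for a, b in zip(total, point)]
        count += n
    return tuple(t / count for t in total)


def kmeans_iteration(points, centroids):
    """Run one map/shuffle/reduce round and return the updated centroids."""
    groups = defaultdict(list)  # stands in for the framework's shuffle phase
    for p in points:
        key, value = map_point(p, centroids)
        groups[key].append(value)
    # Assumption: a centroid that received no points keeps its old position.
    return [
        reduce_cluster(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


if __name__ == "__main__":
    points = [(0, 0), (0, 2), (10, 10), (10, 12)]
    centroids = [(0, 0), (10, 10)]
    print(kmeans_iteration(points, centroids))
```

In a real cluster each mapper would process only its own split of the data and the current centroids would be broadcast to every node. Repeating the iteration until the centroids stop moving gives the usual K-Means loop.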