This document discusses techniques for finding similar sets of items, with documents as the main example. It introduces three steps: shingling, minhashing, and locality-sensitive hashing (LSH). Shingling converts each document to the set of n-grams it contains. Minhashing compresses each large set into a short signature, built from multiple hash functions, such that the similarity of two signatures approximates the similarity of the underlying sets. Locality-sensitive hashing restricts attention to signature pairs that are likely to be similar: it divides each signature into bands, hashes each band to a bucket, and treats pairs that share a bucket in at least one band as candidate pairs for further comparison. Together, these techniques find similar documents efficiently when directly comparing all pairs would be prohibitively expensive.
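
The three steps can be illustrated with a minimal Python sketch. It is not taken from the document itself: the function names, the character-level shingles of length k = 5, the use of CRC32 to map shingles to integers, the affine hash family modulo a Mersenne prime, and the band and row counts are all assumptions chosen for illustration.

```python
import random
import zlib
from collections import defaultdict
from itertools import combinations

PRIME = (1 << 61) - 1  # Mersenne prime used as the hash modulus (an assumption)


def shingles(text, k=5):
    """Return the set of character k-grams of text after normalizing case and whitespace."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}


def jaccard(a, b):
    """Exact Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) coefficients for hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(shingle_set, params):
    """Signature = for each hash function, the minimum hash value over the set."""
    ids = [zlib.crc32(s.encode("utf-8")) for s in shingle_set]
    return [min((a * x + b) % PRIME for x in ids) for a, b in params]


def estimate_similarity(sig1, sig2):
    """Fraction of signature positions that agree; approximates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig1, sig2)) / len(sig1)


def lsh_candidates(signatures, bands, rows):
    """Return pairs of names whose signatures are identical in at least one band.

    signatures maps a name to a signature of length bands * rows.
    """
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for name, sig in signatures.items():
            # The band's slice itself serves as the bucket key.
            key = tuple(sig[band * rows:(band + 1) * rows])
            buckets[key].append(name)
        for names in buckets.values():
            candidates.update(combinations(sorted(names), 2))
    return candidates


docs = {
    "a": "The quick brown fox jumps over the lazy dog",
    "b": "the quick  brown fox jumps over the LAZY dog",
    "c": "Locality-sensitive hashing generates candidate pairs",
}
params = make_hash_params(num_hashes=20)
signatures = {name: minhash_signature(shingles(text), params)
              for name, text in docs.items()}
candidates = lsh_candidates(signatures, bands=5, rows=4)
```

In this sketch, documents "a" and "b" normalize to the same text, so they receive identical signatures and always appear as a candidate pair. For a signature split into b bands of r rows, a pair whose signatures agree in a fraction s of positions becomes a candidate with probability 1 - (1 - s^r)^b, so using more bands with fewer rows each admits more candidates, at the cost of more false positives to check afterwards.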