This document provides an overview of data preprocessing techniques. It discusses data quality issues like missing values, noise, and inconsistencies that require cleaning. Major tasks in preprocessing include data cleaning, integration, reduction, and transformation. Data cleaning techniques are described for handling incomplete, noisy, and inconsistent data. Methods for data integration, reduction through dimensionality reduction and sampling, and transformation through normalization and discretization are also summarized.