This document discusses techniques for data preprocessing, an important step in the knowledge discovery process. Preprocessing is necessary because real-world data is often incomplete, noisy, or inconsistent. The techniques covered are data cleaning, integration, transformation, and reduction. Data cleaning fills in missing values, smooths noisy data, and corrects inconsistencies. Normalization and discretization are forms of data transformation: normalization scales attribute values into a specified range, and discretization reduces a continuous attribute to a small number of discrete intervals. The goal of data preprocessing is to convert raw data into a form from which data mining algorithms can effectively discover useful knowledge.
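
As a rough illustration of three of the steps named above (filling in missing values, normalization, and discretization), here is a minimal Python sketch. The function names, the choice of mean imputation, min-max scaling, and equal-width binning, and the sample `ages` data are all assumptions made for this example; they are one common option for each step, not necessarily the methods the document itself uses.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Data cleaning: replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Normalization: linearly rescale values into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map everything to new_min.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def equal_width_bins(values, n_bins):
    """Discretization: assign each value the index of an equal-width interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last interval, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical attribute with one missing value.
ages = [20, None, 40, 60]

cleaned = fill_missing_with_mean(ages)
normalized = min_max_normalize(cleaned)
bins = equal_width_bins(cleaned, n_bins=2)

print(cleaned)
print(normalized)
print(bins)
```

Each function handles one attribute at a time, which keeps the steps independent: cleaning runs first so that the later steps never see a missing value, and the bin indices can be mapped back to interval labels if a mining algorithm needs categorical input.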