This document discusses module 3 on data pre-processing. It begins with an overview of data quality issues like accuracy, completeness, consistency and timeliness. The major tasks in data pre-processing are then summarized as data cleaning, integration, reduction, and transformation. Data cleaning techniques like handling missing values, noisy data and outliers are covered. The need for feature engineering and dimensionality reduction due to the curse of dimensionality is also highlighted. Finally, techniques for data integration, reduction and discretization are briefly introduced.