Data preprocessing prepares raw data for mining. Data cleaning handles missing values, smooths noise, and identifies or removes outliers. Data integration combines data from multiple sources, and data transformation applies operations such as normalization and aggregation. Data reduction, including dimensionality reduction, shrinks the volume of data to be analyzed, while discretization converts continuous attributes into discrete intervals or higher-level concepts. Together these steps improve data quality, which in turn helps produce higher-quality mining results.
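
As a rough illustration of a few of these steps, the following Python sketch fills missing values, clamps an implausible value, applies min-max normalization, and discretizes the result into equal-width bins. The function names, the sample `ages` list, the clipping bounds, and the choice of median imputation are all assumptions made for this example; they are one simple option among many, not a prescribed method.

```python
from statistics import median


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, lower, upper):
    """Clamp each value to [lower, upper]; the bounds are an analyst's choice."""
    return [min(max(v, lower), upper) for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, n_bins):
    """Map values in [0, 1] to equal-width bin indices 0 .. n_bins - 1."""
    return [min(int(v * n_bins), n_bins - 1) for v in values]


# Hypothetical attribute with one missing entry and one implausible value.
ages = [23, None, 35, 41, 29, 350]

cleaned = clip_outliers(fill_missing(ages), lower=0, upper=60)
scaled = min_max_normalize(cleaned)
bins = discretize(scaled, n_bins=3)

print(cleaned)
print(bins)
```

The median is used for imputation here because it is less sensitive to extreme values than the mean. In practice the clipping bounds and the number of bins depend on the attribute and on the mining task.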