Data preprocessing prepares raw data for analysis through four major tasks: cleaning, integration, reduction, and transformation. Data cleaning handles missing values and resolves inconsistencies. Data integration combines data from multiple sources. Data reduction shrinks the data volume through techniques such as aggregation and dimensionality reduction. Data transformation reshapes the data through operations such as normalization, discretization, and generalization.
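
The four tasks can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the toy `sales` and `customers` records, the function names, mean imputation as the cleaning strategy, min-max scaling as the normalization, and the 0.5 threshold used for discretization. Real pipelines typically choose these per dataset.

```python
from statistics import mean

# Hypothetical toy records from two sources; names and values are illustrative only.
sales = [
    {"id": 1, "region": "north", "amount": 120.0},
    {"id": 2, "region": "north", "amount": None},  # missing value
    {"id": 3, "region": "south", "amount": 80.0},
    {"id": 4, "region": "south", "amount": 200.0},
]
customers = {1: "retail", 2: "retail", 3: "wholesale", 4: "retail"}


def clean(rows):
    """Cleaning: replace missing amounts with the mean of the observed ones."""
    observed = [r["amount"] for r in rows if r["amount"] is not None]
    fill = mean(observed)
    return [{**r, "amount": fill if r["amount"] is None else r["amount"]} for r in rows]


def integrate(rows, segments):
    """Integration: attach the customer segment from a second source, keyed by id."""
    return [{**r, "segment": segments.get(r["id"])} for r in rows]


def normalize(rows):
    """Transformation: min-max scale amounts into the range [0, 1]."""
    values = [r["amount"] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [{**r, "amount_scaled": (r["amount"] - lo) / span} for r in rows]


def discretize(rows, threshold=0.5):
    """Discretization: map the scaled amount to a coarse 'low'/'high' label."""
    return [
        {**r, "level": "high" if r["amount_scaled"] >= threshold else "low"}
        for r in rows
    ]


def reduce_by_region(rows):
    """Reduction: aggregate the rows down to one total amount per region."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    return totals


prepared = discretize(normalize(integrate(clean(sales), customers)))
region_totals = reduce_by_region(prepared)
```

In this sketch `prepared` keeps one row per original record with the imputed amount, the joined segment, the scaled value, and the discrete label, while `region_totals` is the reduced, aggregated view. The order of the steps is a design choice of the example, not a fixed rule.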