This document provides an introduction to data mining. It discusses why organizations mine data from both commercial and scientific viewpoints. Large amounts of data are being collected but not fully analyzed. Data mining can help discover useful patterns and information hidden within large datasets. The document defines data mining and differentiates it from simple queries. It outlines common data mining tasks such as classification, clustering, and association rule mining, along with their applications. Overall, the document serves as a high-level overview of the key concepts and motivations behind data mining.
3. Why Mine Data? Scientific Viewpoint
• Data collected and stored at enormous speeds (GB/hour)
  – remote sensors on a satellite
  – telescopes scanning the skies
  – microarrays generating gene expression data
  – scientific simulations generating terabytes of data
• Traditional techniques infeasible for raw data
• Data mining may help scientists
  – in classifying and segmenting data
  – in hypothesis formation