Data Mining
By:
Rizgar Ramadhan
Data Mining as the Evolution of Information Technology
What Is Data Mining?
 Data mining is the non-trivial process of extracting valid, novel,
potentially useful, and ultimately understandable patterns or
information from data.
 Alternative name
Process of Data Mining
Data mining is essentially the process of extracting non-obvious but
useful information from large databases.
The entire process is interactive and iterative.
Process of Data Mining:
◦ Data mining: the core of the knowledge discovery process
[Figure: stages of the knowledge discovery process: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]
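The early stages named above (data integration, data cleaning, selection) can be illustrated with a minimal Python sketch. The record fields, source names, and helper functions here are hypothetical, chosen only for illustration; a real warehouse would involve far more than list operations.

```python
# Hypothetical toy sources; field names and values are illustrative assumptions.
sales_db = [
    {"id": 1, "age": 34, "city": "A", "spend": 120.0},
    {"id": 2, "age": None, "city": "B", "spend": 80.0},   # missing value
]
web_db = [
    {"id": 3, "age": 27, "city": "A", "spend": 45.5},
    {"id": 1, "age": 34, "city": "A", "spend": 120.0},    # duplicate of id 1
]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [record for source in sources for record in source]

def clean(records):
    """Data cleaning: drop records with missing values or repeated ids."""
    seen, cleaned = set(), []
    for r in records:
        if any(v is None for v in r.values()) or r["id"] in seen:
            continue
        seen.add(r["id"])
        cleaned.append(r)
    return cleaned

def select(records, attributes):
    """Selection: keep only the task-relevant attributes."""
    return [{a: r[a] for a in attributes} for r in records]

warehouse = clean(integrate(sales_db, web_db))
task_relevant = select(warehouse, ["age", "spend"])
print(task_relevant)
```

The resulting task-relevant data is what a mining algorithm would then consume; dropping incomplete records is only one possible cleaning policy.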
Steps of a KDD Process
 Learning the application domain
◦ Relevant prior knowledge and goals of application
 Creating a target data set: data selection
 Data cleaning and preprocessing (may take 60% of the effort!)
 Data reduction and transformation
◦ Find useful features, dimensionality/variable reduction.
 Choosing functions of data mining
◦ Classification, regression, association, clustering.
 Choosing the mining algorithm(s)
 Pattern evaluation and knowledge presentation
◦ Visualization, transformation, removing redundant patterns, etc.
 Use of discovered knowledge
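To make the later steps concrete, here is a small sketch of one mining function (association) followed by a simple pattern evaluation. It is an illustration under stated assumptions, not a prescribed algorithm: the transactions, the thresholds, and the `count` and `mine_rules` helpers are all hypothetical, and only single-item rules are considered.

```python
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "milk", "butter"},
    {"bread"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def mine_rules(transactions, min_support, min_confidence):
    """Find rules lhs -> rhs between single items.

    Pattern evaluation here means keeping only rules whose support and
    confidence reach the given thresholds.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_count = count({a, b}, transactions)
        pair_support = pair_count / n
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_count / count({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules

rules = mine_rules(transactions, min_support=0.4, min_confidence=0.7)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

Raising either threshold removes weaker rules, which is one simple way of discarding uninteresting patterns before presenting the results.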
