Knowledge Discovery
Published in Technology
  • 1. Knowledge Discovery
  • 2. Definition
    Definition – “Non-trivial extraction of implicit, previously unknown and potentially useful information from data.”
    Data Mining – The step that detects patterns in the pre-processed (prepared) data. It is only one part of the knowledge discovery process.
  • 3. Applications
    Can be divided into four major kinds:
    Classification
    Numerical prediction
    Association rules
    Clustering
    Some examples:
    Automatic abstraction
    Financial forecasting
    Targeted marketing
    Medical diagnosis
    Credit card fraud detection
    Weather forecasting, etc.
  • 4. Labeled & Unlabeled data
    General Terminology:
    Instances – The individual examples that make up a dataset
    Attributes – The variables recorded for each instance
    Labeled data
    Has a specifically designated attribute; its value in known instances can be used to predict its value in unseen instances
    Unlabeled data
    Has no such designated attribute
    Supervised learning – Data mining using labeled data
    Unsupervised learning – Data mining using unlabeled data
  • 5. Labeled data
    The designated attribute can be of two types:
    Categorical attribute
    Takes a value from a fixed set of values (like an enumeration), e.g. ‘very good’, ‘good’, ‘poor’
    Supervised learning on a categorical attribute is called classification
    Numerical attribute
    Takes a value from a continuous range of numbers
    Supervised learning on a numerical attribute is called regression
  • 6. Unlabeled data
    Unlabeled data has no specifically designated attribute
    Unsupervised learning
    Data mining using unlabeled data
    Purpose – To extract as much information as possible from the available data.
  • 7. Supervised learning: Classification
    It is based on the following three methods:
    Nearest neighbor matching:
    Identifying the classified instances that are closest (in some sense) to the unclassified one
    Classification rules:
    Look for rules that can be used to predict the classification of an unknown instance
    Classification tree:
    Generates classification rules in the form of a tree-like structure
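As an illustration of nearest neighbor matching, here is a minimal sketch. It assumes that "closest" means Euclidean distance and that only the single nearest instance is consulted; the slides fix neither choice. The function name `nearest_neighbor` and the toy training data are hypothetical.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the training instance closest to `query`.

    `train` is a list of (features, label) pairs. Euclidean distance
    is an assumption; the slides only say "closest (in some sense)".
    """
    best_label, best_dist = None, math.inf
    for features, label in train:
        dist = math.dist(features, query)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical classified instances: two numeric attributes plus a label.
train = [
    ((1.0, 1.0), "poor"),
    ((5.0, 5.0), "good"),
    ((9.0, 9.0), "very good"),
]

# Classify an unseen instance by copying the label of its nearest neighbor.
prediction = nearest_neighbor(train, (6.0, 4.5))
print(prediction)
```

In practice several neighbors are often consulted and a majority vote taken; the single-neighbor version is used here only to keep the sketch short.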
  • 8. Supervised Learning: Numerical Prediction
    Regression can be done using neural networks, among other methods
    Neural network: Given a set of inputs, predicts one or more outputs
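To make numerical prediction concrete, the sketch below trains a single linear neuron (one weight, one bias) by gradient descent on mean squared error. This is the simplest possible stand-in for a neural network and is an assumption made for illustration; a real network would stack several layers with non-linear activations. The function name, learning rate, epoch count, and data are all hypothetical.

```python
def train_linear_neuron(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical labeled data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train_linear_neuron(xs, ys)

# Predict the numerical attribute for an unseen input.
predicted = w * 5.0 + b
print(round(w, 3), round(b, 3), round(predicted, 3))
```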
  • 9. Unsupervised Learning: Association Rules
    Association Rules: To find any relationship that exists amongst the values of variables within a training set
    IF variable_1 > 90 and switch_6 = open
    THEN variable_3 < 47.5 and switch_9 = closed
    (probability = 0.8)
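One way to read the probability attached to such a rule is as the fraction of instances satisfying the IF part that also satisfy the THEN part (often called the rule's confidence). The sketch below computes that fraction under this interpretation, which is an assumption, since the slide does not define the term. The records are made up so that the result matches the 0.8 in the example; `rule_confidence` is a hypothetical helper.

```python
def rule_confidence(records, antecedent, consequent):
    """Fraction of records matching `antecedent` that also match `consequent`.

    Returns None when no record matches the antecedent.
    """
    matching = [r for r in records if antecedent(r)]
    if not matching:
        return None
    return sum(1 for r in matching if consequent(r)) / len(matching)

# Hypothetical training set using the variable names from the rule above.
records = [
    {"variable_1": 95, "switch_6": "open", "variable_3": 40.0, "switch_9": "closed"},
    {"variable_1": 97, "switch_6": "open", "variable_3": 45.0, "switch_9": "closed"},
    {"variable_1": 92, "switch_6": "open", "variable_3": 30.0, "switch_9": "closed"},
    {"variable_1": 99, "switch_6": "open", "variable_3": 12.5, "switch_9": "closed"},
    {"variable_1": 91, "switch_6": "open", "variable_3": 60.0, "switch_9": "open"},
    {"variable_1": 50, "switch_6": "closed", "variable_3": 70.0, "switch_9": "open"},
]

def antecedent(r):
    # IF variable_1 > 90 and switch_6 = open
    return r["variable_1"] > 90 and r["switch_6"] == "open"

def consequent(r):
    # THEN variable_3 < 47.5 and switch_9 = closed
    return r["variable_3"] < 47.5 and r["switch_9"] == "closed"

confidence = rule_confidence(records, antecedent, consequent)
print(confidence)
```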
  • 10. Unsupervised Learning: Clustering
    To find groups of items that are similar
    Example: a company may group its customers by income so that it can target its policies at each group.
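The customer-income example can be sketched with a small one-dimensional k-means loop. The slides do not name a clustering algorithm, so k-means is an assumption chosen for illustration; the income figures, starting centers, and the function name `kmeans_1d` are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around `centers` with a basic k-means loop.

    Returns the final centers and the last cluster assignment.
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical customer incomes, in thousands.
incomes = [22, 25, 27, 80, 85, 90]

# Start from the lowest and highest income as initial centers.
centers, clusters = kmeans_1d(incomes, [min(incomes), max(incomes)])
print(centers)
print(clusters)
```

Starting from the minimum and maximum keeps the run deterministic; real implementations usually pick starting centers at random and repeat the run several times.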
  • 11. Visit more self help tutorials
    Pick a tutorial of your choice and browse through it at your own pace.
    The tutorials section is free, self-guiding and will not involve any additional support.
    Visit us at www.dataminingtools.net