BY
G. KARTHIGA, M.SC INFO TECH
NADAR SARASWATHI COLLEGE OF ARTS
& SCIENCE, THENI.

DATA MINING TECHNIQUES
Prediction uses existing variables in the database
to predict unknown or future values of interest.
Description focuses on finding patterns that describe
the data and on presenting them for user
interpretation.
Data mining (DM) techniques can be classified as:
1) User-guided or verification-driven data mining
2) Discovery-driven data mining, or automatic
discovery of rules

VERIFICATION MODEL
In verification-driven data mining, the user makes a
hypothesis and tests it against the data to verify
its validity.
The emphasis is on the user, who is responsible for
formulating the hypothesis and issuing the query
on the data to affirm or negate it.
For example, with a limited budget for a mailing
campaign to launch a new product, the user might
hypothesize which customers are most likely to buy
and query the data to check.
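
The verification-driven workflow can be illustrated with a minimal sketch. The customer records, the age cut-off of 30, and the `response_rate` helper are all made-up assumptions for illustration; the point is only that the user supplies the hypothesis and the query, and the data affirms or negates it.

```python
# Toy records (hypothetical): did each customer respond to an earlier mailing?
customers = [
    {"age": 24, "responded": True},
    {"age": 27, "responded": True},
    {"age": 29, "responded": False},
    {"age": 35, "responded": False},
    {"age": 41, "responded": True},
    {"age": 52, "responded": False},
    {"age": 60, "responded": False},
]


def response_rate(records):
    """Fraction of the given records that responded."""
    return sum(1 for r in records if r["responded"]) / len(records)


# User-formulated hypothesis: customers under 30 respond more often.
young = [r for r in customers if r["age"] < 30]
others = [r for r in customers if r["age"] >= 30]

# The "query" that affirms or negates the hypothesis on this data.
hypothesis_holds = response_rate(young) > response_rate(others)
print(f"under 30: {response_rate(young):.2f}, others: {response_rate(others):.2f}")
print("hypothesis affirmed" if hypothesis_holds else "hypothesis negated")
```

A real verification step would normally also check whether the difference is statistically significant; that is omitted here.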

DISCOVERY MODEL
The discovery model differs in its emphasis: here
the system automatically discovers important
information hidden in the data.
The typical discovery-driven tasks are:
1) discovery of association rules
2) discovery of classification rules
3) clustering
4) discovery of frequent episodes
5) deviation detection

DISCOVERY OF ASSOCIATION RULES
An association with very high support and
confidence is a pattern that occurs often in the
database and should already be obvious to the end user.
Patterns with extremely low support and
confidence should be regarded as of no significance.
An association rule is an expression of the form
X ⇒ Y, where X and Y are sets of items.
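
Support and confidence can be computed directly from a list of transactions. The sketch below assumes the usual definitions (support of a rule is the fraction of transactions containing both X and Y; confidence is that support divided by the support of X). The grocery transactions and function names are illustrative only.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Estimated probability that a transaction containing X also contains Y."""
    support_x = support(transactions, x)
    if support_x == 0:
        return 0.0  # rule is undefined when X never occurs
    return support(transactions, set(x) | set(y)) / support_x


# Toy transactions (hypothetical).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

# Rule: {bread} => {milk}
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule miner would keep only rules whose support and confidence exceed user-chosen minimum thresholds.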

CLUSTERING
Clustering is a method of grouping data into
different groups, so that the data in each group
share similar trends and patterns.
Clustering algorithms constitute a major class of
data mining algorithms.
The objectives of clustering are:
1) to uncover natural groupings
2) to initiate hypotheses about the data
3) to find a consistent and valid organization
of the data
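
As one concrete instance of clustering, here is a minimal k-means sketch in plain Python. The slides do not name a specific algorithm, so k-means is an assumption chosen for brevity, as are the sample points and the naive initialisation (the first k points serve as starting centroids).

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, max_iterations=50):
    centroids = [tuple(p) for p in points[:k]]  # naive initialisation
    labels = assign(points, centroids)
    for _ in range(max_iterations):
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(centroid(members) if members else centroids[c])
        new_labels = assign(points, new_centroids)
        centroids = new_centroids
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids


# Toy 2-D data (hypothetical): two visually separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

The result depends on the initial centroids; practical implementations usually try several initialisations.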

FREQUENT EPISODES
Frequent episodes are sequences of events that occur
frequently close to each other; they are extracted from a
time sequence.
The time sequence is given by the user as the input, and the
output is a set of prediction rules for that sequence.
S = {(A1,t1),(A2,t2),(A3,t3)} is an ordered sequence of
events, where each Ai is an event and ti is its time of
occurrence.
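
A simplified sketch of episode counting over a sequence of (event, time) pairs: it counts how often one event is followed by another within a fixed time window and keeps the pairs that reach a minimum count. This is a deliberately reduced version of the idea (two-event serial episodes only), and the sequence, `window`, and `min_count` values are assumptions for illustration.

```python
from collections import Counter


def count_serial_pairs(sequence, window):
    """Count ordered pairs (a, b) where b occurs after a within `window` time units."""
    events = sorted(sequence, key=lambda e: e[1])  # order by time
    counts = Counter()
    for i, (a, t_a) in enumerate(events):
        for b, t_b in events[i + 1:]:
            if t_b - t_a > window:
                break  # later events are even further away
            if t_b > t_a:
                counts[(a, b)] += 1
    return counts


def frequent_pairs(sequence, window, min_count):
    counts = count_serial_pairs(sequence, window)
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy time sequence S of (event, time) pairs (hypothetical).
S = [("A", 1), ("B", 2), ("C", 5), ("A", 6), ("B", 7), ("A", 10), ("B", 12)]
frequent = frequent_pairs(S, window=2, min_count=2)
print(frequent)
```

A frequent pair such as ("A", "B") suggests a prediction rule of the kind "when A occurs, B tends to follow within the window".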

DEVIATION DETECTION
Deviation detection identifies outlying points
in a particular data set and explains whether they
are due to noise or other impurities present
in the data, or due to trivial reasons.
Deviations can be obtained by calculating measures
on the current data and comparing them with
reference values, such as earlier or expected values.
Deviation detection can be applied in forecasting, fraud
detection, customer retention, etc.
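
One simple way to flag outlying points is a z-score test: measure how many standard deviations each value lies from the mean. The slides do not prescribe a particular measure, so the z-score, the threshold of 2.0, and the sample readings are assumptions for illustration.

```python
from statistics import mean, pstdev


def find_deviations(values, threshold=2.0):
    """Return (index, value) pairs lying more than `threshold` std devs from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing deviates
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Toy measurements (hypothetical); the last one is far from the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_deviations(readings)
print(outliers)
```

Flagging a point is only the first half of the task; deciding whether it reflects noise or something meaningful still needs domain knowledge.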

NEURAL NETWORK
These methods are the result of academic attempts to
model learning in the nervous system.
Neural networks have the remarkable ability to
derive meaning from complicated or imprecise data
and can be used to extract patterns and detect trends
that are too complex to be noticed by either
humans or other computer techniques.
This distinguishes neural networks from traditional
computing programs, which simply follow instructions
in a fixed sequential order.
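
To show the contrast with a fixed sequence of instructions, the sketch below trains a single artificial neuron (a perceptron) to reproduce logical AND from examples alone. A perceptron is far simpler than the networks used in practice; it is chosen here only as the smallest learnable unit, and the learning rate and epoch count are assumptions.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of inputs plus bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Perceptron learning rule: nudge weights toward each misclassified sample."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
                bias += learning_rate * error
    return weights, bias


# Training examples for logical AND: the rule is never written down explicitly.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_samples]
print(weights, bias, predictions)
```

The behaviour comes from the learned weights rather than from hand-written rules, which is the distinction drawn above.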

GENETIC ALGORITHM
Genetic algorithms are a relatively new paradigm
inspired by Darwin’s theory of evolution.
A mutation process is also used to randomly
modify the genetic structure of some members of
each new generation.
The algorithm runs to generate solutions over
successive generations.
Their use has been helped by the availability of
affordable, high-speed computers.
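
A minimal genetic algorithm sketch on a toy problem (maximise the number of 1-bits in a bit string). The slides mention only mutation and successive generations; the selection scheme, single-point crossover, elitism, and all parameter values below are assumptions added to make a complete, runnable example.

```python
import random


def fitness(bits):
    """Toy objective: number of 1-bits (higher is better)."""
    return sum(bits)


def evolve(length=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # keep the fitter half as parents
        next_generation = [population[0][:]]   # elitism: carry the best forward unchanged
        while len(next_generation) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, length - 1)   # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: randomly flip some bits of the new member.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_generation.append(child)
        population = next_generation
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the best member is copied unchanged into each new generation, the best fitness never decreases from one generation to the next.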
