2.
Prediction makes use of existing variables in the
database in order to predict unknown or future values
of interest.
Description focuses on finding patterns describing the
data and the subsequent presentation for user
interpretation.
One way to classify DM techniques is as:
1) User-guided or verification-driven data mining
2) Discovery-driven or automatic discovery of
rules
DATA MINING
TECHNIQUES
3.
In the verification model of data mining, the user makes a
hypothesis and tests the hypothesis on the data to
verify its validity.
The emphasis is on the user who is responsible for
formulating the hypothesis and issuing the query
on the data to affirm or negate the hypothesis.
For example, with a limited budget for a mailing
campaign to launch a new product, the user might hypothesize
which customers are most likely to buy and query the data to check.
VERIFICATION MODEL
4.
The discovery model differs in its emphasis: it is
the system that automatically discovers important
information hidden in the data.
The typical discovery-driven tasks are:
1) Discovery of association rules
2) Discovery of classification rules
3) Clustering
4) Discovery of frequent episodes
5) Deviation detection
DISCOVERY MODEL
5.
An association with very high support and
confidence is a pattern that occurs so often in the
database that it should already be obvious to the end user.
Patterns with extremely low support and
confidence should be regarded as having no significance.
An association rule is an expression of the form
X ⇒ Y, where X and Y are sets of items.
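The slide uses support and confidence without defining them. The sketch below assumes the usual definitions: support is the fraction of transactions containing every item of a set, and the confidence of X ⇒ Y is the support of X together with Y divided by the support of X. The function names and the basket data are hypothetical, invented only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Support of X and Y together divided by support of X, for the rule X => Y."""
    sx = support(transactions, x)
    if sx == 0:
        return 0.0  # X never occurs, so the rule has no evidence
    return support(transactions, set(x) | set(y)) / sx


# Hypothetical market-basket data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})          # 2 of 4 baskets
rule_confidence = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 bread baskets
```

In practice, rules are kept only when both values exceed user-chosen minimum thresholds.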
DISCOVERY OF
ASSOCIATION RULES
6.
Clustering is a method of grouping data into
different groups, so that the data in each group
share similar trends and patterns.
Clustering constitutes a major class of data mining
algorithms.
The objectives of clustering are:
1) To uncover natural groupings
2) To generate hypotheses about the data
3) To find a consistent and valid organization
of the data
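The slide does not name a clustering algorithm. As one possible illustration, the sketch below uses plain k-means on 2-D points. The choice of k-means, the seeding from the first k points, and the sample points are all assumptions made for this example.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            groups[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its group.
        for i, g in enumerate(groups):
            if g:
                centroids[i] = (sum(p[0] for p in g) / len(g),
                                sum(p[1] for p in g) / len(g))
    return centroids, groups


# Hypothetical points forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, groups = kmeans(points, k=2)
```

Each returned group is a candidate "natural grouping" that the analyst can then inspect and turn into a hypothesis about the data.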
CLUSTERING
7.
Frequent episodes are sequences of events that occur
frequently close to each other and are extracted from a
time sequence.
How close the events must be is given by the user as input, and the output is the
set of prediction rules for the time sequence.
S = {(A1, t1), (A2, t2), (A3, t3)} is the ordered sequence of
events, where event Ai occurs at time ti.
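A minimal sketch of the idea, assuming the sequence is a list of (event, time) pairs and that "close" means within a user-supplied time window. It only counts ordered pairs of events, which is a simplification of full episode mining. The function name, parameters, and sample sequence are hypothetical.

```python
from collections import Counter


def frequent_serial_pairs(sequence, window, min_count):
    """Count ordered pairs (a, b) where b follows a within `window` time units."""
    events = sorted(sequence, key=lambda e: e[1])  # order by timestamp
    counts = Counter()
    for i, (a, ta) in enumerate(events):
        for b, tb in events[i + 1:]:
            if tb - ta > window:
                break  # later events are even further away in time
            counts[(a, b)] += 1
    # Keep only the pairs seen at least `min_count` times.
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical event sequence S of (event, time) pairs.
S = [("A", 1), ("B", 2), ("C", 6), ("A", 7), ("B", 8), ("A", 12), ("B", 15)]
episodes = frequent_serial_pairs(S, window=2, min_count=2)
```

Here the pair ("A", "B") is the only one that recurs within the window, which suggests a prediction rule of the form "A is often followed shortly by B".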
FREQUENT EPISODE
8.
Deviation detection identifies outlying points
in a particular data set and explains whether they
are due to noise or other impurities present
in the data or to trivial reasons.
By calculating the values of measures on the current
data and comparing them with expected values, the deviation can be obtained.
It can be applied in forecasting, fraud
detection, customer retention, etc.
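The slide does not say which measure is calculated or what it is compared against. The sketch below assumes one common choice: flag current values lying more than a threshold number of standard deviations from the mean of a reference data set. The function name, threshold, and numbers are hypothetical.

```python
from statistics import mean, pstdev


def deviations(reference, current, threshold=3.0):
    """Return current values more than `threshold` standard deviations from the reference mean."""
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        # A constant reference: anything different counts as a deviation.
        return [x for x in current if x != mu]
    return [x for x in current if abs(x - mu) / sigma > threshold]


# Hypothetical reference and current measurements.
reference = [100, 102, 98, 101, 99]
current = [100, 97, 140, 103]
flagged = deviations(reference, current)
```

A flagged value still needs explanation: the analyst has to decide whether it reflects noise in the data or a genuine anomaly such as fraud.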
DEVIATION DETECTION
9.
These methods are the result of academic attempts to
model how the nervous system learns.
Neural networks have the remarkable ability to
derive meaning from complicated or imprecise data
and can be used to extract patterns and detect trends
that are too complex to be noticed by either
humans or other computer techniques.
This distinguishes neural networks from traditional
computing programs that simply follow instructions
in a fixed sequential order.
NEURAL NETWORK
10.
Genetic algorithms are a relatively new paradigm
inspired by Darwin’s theory of evolution.
A mutation process is also used to randomly
modify the genetic structure of some members of
each new generation.
The algorithm runs to generate solutions for
successive generations.
Their use has been helped by the availability of affordable, high-speed
computers.
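A minimal sketch of a genetic algorithm over bit strings. The slide mentions only mutation and successive generations. The selection rule, one-point crossover, elitism, parameter values, and the toy fitness function are assumptions added so the example can run.

```python
import random


def evolve(fitness, length=12, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Tiny genetic algorithm over bit strings; returns the fittest individual found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]   # selection: keep the fitter half
        children = [population[0][:]]          # elitism: carry the best forward unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)     # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: randomly flip some bits of the new member.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness)


best = evolve(fitness=sum)  # toy objective: maximise the number of 1 bits
```

Because the best individual is copied into each new generation, the best fitness found never decreases from one generation to the next.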
GENETIC ALGORITHM