By Ajaydeep, Abhishek Kutiyal


Other classification methods in data mining
Published in: Technology
  • Genetic algorithms accept their input coded as a finite-length string (a chromosome). Each element of the chromosome is a gene, and each gene has an allele value.

    1. By Ajaydeep, Abhishek Kutiyal
    2. Classification
        • Classification is the process of finding a model that describes and distinguishes data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown.
        • It predicts categorical class labels (discrete or nominal).
        • It classifies data (constructs a model) based on the training set and the values (class labels) in a classifying attribute, and uses the model to classify new data.
    3. Classification Algorithms

        Training Data

        name | age    | income | loan decision
        Mike | young  | low    | risky
        Mary | young  | low    | risky
        Bill | midage | high   | safe
        Jim  | midage | low    | risky
        Dave | senior | low    | safe
        Anne | senior | medium | safe

        Classifier (Model)

        IF age=youth THEN loan_deci=risky
        IF income=high THEN loan_deci=safe
        IF age=mid AND income=low THEN loan_deci=risky
    4. Classifier

        Testing Data

        name   | age     | income | loan_deci
        Tom    | senior  | low    | safe
        Mariya | mid_age | low    | risky
        George | mid_age | high   | safe
        ...... | .....   | ...... | .......

        Unseen Data: (john, mid_age, low). Loan deci?
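The way the three rules on the classifier slide are applied to the testing data and to the unseen tuple can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, checking the rules in order with the first match winning is an assumption, and so is treating `youth`/`young` and `mid`/`midage`/`mid_age` as the same value.

```python
# Rules from the classifier slide, checked in order (assumption: first match wins).
AGE_ALIASES = {"youth": "young", "mid": "mid_age", "midage": "mid_age"}


def classify(age, income):
    """Return the predicted loan decision, or None if no rule applies."""
    age = AGE_ALIASES.get(age, age)  # assumed spelling normalization
    if age == "young":
        return "risky"
    if income == "high":
        return "safe"
    if age == "mid_age" and income == "low":
        return "risky"
    return None  # the slide lists no rule for this combination


# Testing data as listed on the slide: (name, age, income, known label).
testing_data = [
    ("Tom", "senior", "low", "safe"),
    ("Mariya", "mid_age", "low", "risky"),
    ("George", "mid_age", "high", "safe"),
]

for name, age, income, label in testing_data:
    print(name, "predicted:", classify(age, income), "actual:", label)

# The unseen tuple from the slide.
print("john ->", classify("mid_age", "low"))
```

With only these three rules, a tuple such as (senior, low) is not covered by any rule, so a complete classifier would need either more rules or a default class.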
    5. Other classification methods:
        • Genetic Algorithms
        • Rough Set Approach
        • Fuzzy Set Approach
    6. Genetic algorithms are examples of evolutionary computing methods and are optimization-type algorithms. Given a population of potential problem solutions (individuals), evolutionary computing expands this population with new and potentially better solutions.
    7. The basis for evolutionary computing algorithms is biological evolution, where over time evolution produces the best or “fittest” individuals. In data mining, genetic algorithms may be used for clustering, prediction, and even association rules.
    8. Individual (chromosome):
        • Feasible solution in an optimization problem

       Population:
        • Set of individuals
        • Should be maintained in each generation
    9. Representation is the most important starting point in developing a genetic algorithm. Each gene has its own meaning. Based on this representation, we can define:
        • the fitness evaluation function,
        • the crossover operator,
        • the mutation operator.
    10. The fitness function takes a single chromosome as input and returns a measure of the goodness of the solution represented by the chromosome.
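As a minimal sketch of that signature, the function below takes one chromosome (a bit string) and returns a single number. The objective used here, counting the 1 bits, is a hypothetical toy problem chosen for illustration; the slides do not specify a particular fitness function.

```python
def fitness(chromosome):
    """Toy fitness (assumed objective): the number of 1 bits in the chromosome."""
    return chromosome.count("1")


print(fitness("01101"))
print(fitness("11100"))
```

Any function with this shape works, as long as a higher return value means a better solution for the problem at hand.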
    11. In genetic algorithms, reproduction is defined by precise algorithms that indicate how to combine the given set of individuals to produce new ones. These are called “crossover algorithms”. Given two individuals (parents) from a population, the crossover technique generates new individuals (offspring or children) by switching subsequences of the strings.
    12. Crossover examples, using the parents 11101001000 and 00001010101:
        • Single-point crossover: offspring 11101010101 and 00001001000
        • Two-point crossover: offspring 11001011000 and 00101000101
        • Uniform crossover (crossover template 10011010011): offspring 10001000100 and 01101011001
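All three crossover variants can be expressed with one template-driven routine: where the template holds a 1, the first child copies the first parent and the second child copies the second parent; where it holds a 0, the roles swap. The sketch below uses the bit strings from the crossover slide. The helper names are hypothetical, and the single-point and two-point cut positions (5, and 2 to 7) are assumptions chosen because they are consistent with the slide's strings; only the uniform template is printed on the slide.

```python
def crossover(parent1, parent2, template):
    """Build two children by swapping parent bits according to a 0/1 template."""
    child1 = "".join(a if t == "1" else b for a, b, t in zip(parent1, parent2, template))
    child2 = "".join(b if t == "1" else a for a, b, t in zip(parent1, parent2, template))
    return child1, child2


def single_point_template(length, point):
    """Ones before the cut point, zeros after it."""
    return "1" * point + "0" * (length - point)


def two_point_template(length, start, end):
    """Ones between the two cut points, zeros elsewhere."""
    return "0" * start + "1" * (end - start) + "0" * (length - end)


p1, p2 = "11101001000", "00001010101"

print(crossover(p1, p2, single_point_template(11, 5)))   # assumed cut after bit 5
print(crossover(p1, p2, two_point_template(11, 2, 7)))   # assumed cuts after bits 2 and 7
print(crossover(p1, p2, "10011010011"))                  # template from the slide
```

Treating single-point and two-point crossover as special templates keeps one code path for all three operators; in practice the cut points or the template would be drawn at random.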
    13. Mutation
        • Usually changes a single bit in a bit string.
        • This operator should happen with very low probability.
        • Example: 0 1 1 0 1 becomes 0 1 1 1 1 when the fourth bit is chosen as the (random) mutation point.
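A minimal sketch of this operator, assuming bit-string chromosomes: with a small probability one randomly chosen bit is flipped, otherwise the string is returned unchanged. The function names and the probability value are illustrative, not taken from the slides.

```python
import random


def flip_bit(bits, point):
    """Return a copy of the bit string with the bit at index `point` inverted."""
    flipped = "1" if bits[point] == "0" else "0"
    return bits[:point] + flipped + bits[point + 1:]


def mutate(bits, probability, rng):
    """With the given (low) probability, flip one randomly chosen bit."""
    if rng.random() >= probability:
        return bits
    return flip_bit(bits, rng.randrange(len(bits)))


# The slide's example: flipping the fourth bit (index 3) of 01101 gives 01111.
print(flip_bit("01101", 3))

# Illustrative call with an assumed mutation probability of 0.01.
print(mutate("01101", 0.01, random.Random(0)))
```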
    14. Producing a new generation
        • Old generation: 01001, 11100, 00111, 01101, 11100, 11101
        • Probabilistically select individuals: crossover mates are probabilistically selected based on their fitness value.
        • Crossover point randomly selected: 11|101 and 01|001 give 11|001 and 01|101
        • Mutation point (random)
        • New generation: 01111, 11001, 10111, 01101, 11100, 11101
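The step from an old generation to a new one can be sketched as follows. This is one possible arrangement under stated assumptions, not the slide's exact procedure: the fitness is a hypothetical count of 1 bits, selection is fitness-proportionate via `random.choices`, every pair of mates is recombined with single-point crossover, and the mutation probability is an arbitrary small value.

```python
import random


def fitness(bits):
    """Assumed toy fitness: the number of 1 bits."""
    return bits.count("1")


def next_generation(population, mutation_probability, rng):
    """Select mates by fitness, recombine them, and occasionally mutate a child."""
    # +1 keeps every weight positive so random.choices never sees an all-zero list.
    weights = [fitness(individual) + 1 for individual in population]
    new_population = []
    while len(new_population) < len(population):
        mother, father = rng.choices(population, weights=weights, k=2)
        point = rng.randrange(1, len(mother))  # crossover point randomly selected
        children = [mother[:point] + father[point:], father[:point] + mother[point:]]
        for child in children:
            if rng.random() < mutation_probability:
                i = rng.randrange(len(child))  # mutation point (random)
                child = child[:i] + ("1" if child[i] == "0" else "0") + child[i + 1:]
            new_population.append(child)
    return new_population[:len(population)]


old_generation = ["01001", "11100", "00111", "01101", "11100", "11101"]
new_generation = next_generation(old_generation, 0.05, random.Random(1))
print(new_generation)
```

Calling `next_generation` repeatedly, and stopping once the best fitness stops improving, gives the usual overall loop of a genetic algorithm.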
    15. A rough set is a formal approximation of a crisp set in terms of a pair of sets which give the lower and the upper approximation of the original set. The tuple composed of the lower and upper approximation is called a rough set.
    16. In rough set theory, a given class C is approximated by two sets:
        1. The lower approximation of C consists of all the data tuples that, based on the knowledge of the attributes, are certain to belong to C without ambiguity.
        2. The upper approximation of C consists of all the data tuples that, based on the knowledge of the attributes, cannot be described as not belonging to C.
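The two approximations can be computed by grouping tuples that are indistinguishable on the chosen attributes and then checking each group against C. The sketch below reuses the names, incomes and loan decisions from the training data on the classification slide as they read in the transcript; restricting the knowledge to the single attribute `income` is an assumption made so that the two approximations differ, and the function name is hypothetical.

```python
from collections import defaultdict


def approximations(tuples, attributes, target):
    """Return (lower, upper) approximations of the set `target`."""
    # Tuples with identical values on `attributes` cannot be told apart.
    groups = defaultdict(set)
    for name, row in tuples.items():
        groups[tuple(row[a] for a in attributes)].add(name)
    lower, upper = set(), set()
    for members in groups.values():
        if members <= target:   # the whole group lies inside C: certain members
            lower |= members
        if members & target:    # the group overlaps C: cannot be ruled out
            upper |= members
    return lower, upper


training = {
    "Mike": {"income": "low"},
    "Mary": {"income": "low"},
    "Bill": {"income": "high"},
    "Jim": {"income": "low"},
    "Dave": {"income": "low"},
    "Anne": {"income": "medium"},
}
safe = {"Bill", "Dave", "Anne"}  # class C: tuples labeled safe

lower, upper = approximations(training, ["income"], safe)
print(sorted(lower))
print(sorted(upper))
```

Dave is safe but shares `income=low` with three risky tuples, so that whole group falls into the upper approximation only; the difference between the two sets is the boundary region where income alone cannot decide membership.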
    17. Rough set theory is one of the newer data mining theories. It can be used for:
        1. Classification, to discover structured relationships within noisy data
        2. Attribute subset selection
        3. Reduction of the data set
        4. Finding hidden data patterns
        5. Generation of decision rules
    18. Fuzzy logic uses truth values between 0.0 and 1.0 to represent the degree of membership (for example, using a fuzzy membership graph).
        • Attribute values are converted to fuzzy values, e.g., income is mapped into the discrete categories {low, medium, high} with fuzzy values calculated.
        • For a given new sample, more than one fuzzy value may apply.
        • Each applicable rule contributes a vote for membership in the categories.
        • Typically, the truth values for each predicted category are summed, and these sums are combined.
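The fuzzification and voting steps can be sketched as below. Everything numeric here is hypothetical: the income breakpoints (20000, 40000, 60000, 80000), the piecewise-linear membership shapes, and the three rules mapping income categories to loan decisions are assumptions for illustration, not values from the slides.

```python
def falling(x, a, b):
    """Membership that is 1.0 up to a, then drops linearly to 0.0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def rising(x, a, b):
    """Membership that is 0.0 up to a, then rises linearly to 1.0 at b."""
    return 1.0 - falling(x, a, b)


def income_memberships(income):
    """Map a crisp income to fuzzy values for {low, medium, high} (assumed breakpoints)."""
    return {
        "low": falling(income, 20000, 40000),
        "medium": min(rising(income, 20000, 40000), falling(income, 60000, 80000)),
        "high": rising(income, 60000, 80000),
    }


# Hypothetical rules: each income category votes for one loan decision.
RULES = {"low": "risky", "medium": "safe", "high": "safe"}


def vote(income):
    """Sum the truth values contributed by every applicable rule, per category."""
    totals = {"risky": 0.0, "safe": 0.0}
    for category, truth in income_memberships(income).items():
        totals[RULES[category]] += truth
    return totals


print(income_memberships(30000))
print(vote(30000))
print(vote(70000))
```

An income of 30000 is partly low and partly medium at the same time, which is the "more than one fuzzy value may apply" case: both rules fire and each adds its truth value to its category's sum.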