XL-Miner: Classification


Published in: Technology

  1. Introduction to XLMiner™
     The data mining add-in for Microsoft Excel.
     Classification
     XLMiner and Microsoft Office are registered trademarks of their respective owners.
  2. CLASSIFICATION
     XLMiner provides the following tools for classifying data:
       • Discriminant Analysis
       • Logistic Regression
       • Classification Tree
       • Naive Bayes
       • Neural Network (multilayer feed-forward)
       • k-Nearest Neighbors
     Let us look at these methods one by one.
     http://dataminingtools.net
  3. CLASSIFICATION - Discriminant Analysis
     Discriminant analysis is a technique for classifying a set of observations into predefined classes. The purpose is to determine the class of an observation from a set of variables known as predictors or input variables.
     The model is built from a set of observations whose classes are known, sometimes referred to as the training set. From the training set, the technique constructs a set of linear functions of the predictors, known as discriminant functions.
     We will use Wine.xls as the data source.
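To make the idea of discriminant functions concrete, here is a minimal sketch for two predictors. It assumes the textbook form of linear discriminant analysis (class means, a pooled covariance matrix, and class priors taken from the training set); whether XLMiner computes its functions in exactly this way is not stated in the slides. The function names and the toy data are hypothetical.

```python
import math

def fit_discriminant(rows, labels):
    """Fit one linear discriminant function per class (two predictors).

    Assumption: textbook LDA with a pooled covariance matrix, not
    necessarily XLMiner's exact computation.
    """
    classes = sorted(set(labels))
    n = len(rows)
    means, priors = {}, {}
    for c in classes:
        members = [r for r, y in zip(rows, labels) if y == c]
        means[c] = [sum(col) / len(members) for col in zip(*members)]
        priors[c] = len(members) / n

    # Pooled within-class covariance matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r, y in zip(rows, labels):
        d = [r[0] - means[y][0], r[1] - means[y][1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j] / (n - len(classes))

    # Invert the 2 x 2 matrix by hand.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]

    # Each class gets weights w and a constant b: score = w . x + b.
    functions = {}
    for c in classes:
        m = means[c]
        w = [inv[0][0] * m[0] + inv[0][1] * m[1],
             inv[1][0] * m[0] + inv[1][1] * m[1]]
        b = -0.5 * (w[0] * m[0] + w[1] * m[1]) + math.log(priors[c])
        functions[c] = (w, b)
    return functions

def classify(functions, x):
    """Assign x to the class whose discriminant function scores highest."""
    scores = {c: w[0] * x[0] + w[1] * x[1] + b
              for c, (w, b) in functions.items()}
    return max(scores, key=scores.get)

# Hypothetical training set: two predictors, two known classes.
rows = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.6),
        (6.0, 7.0), (6.5, 6.8), (7.0, 7.5), (6.2, 7.6)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
functions = fit_discriminant(rows, labels)
print(classify(functions, (1.3, 2.1)), classify(functions, (6.4, 7.1)))
```

A new observation is scored by every class's function and assigned to the class with the highest score.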
  4. CLASSIFICATION - Discriminant Analysis (Step 1)
     Select the independent variables to use as input variables, and the dependent variable to use as the output variable.
  5. CLASSIFICATION - Discriminant Analysis (Step 2)
     Choosing “According to relative occurrences” sets the prior probability of each class equal to its relative frequency in the training set.
     Choosing “Use equal” gives every class the same prior probability.
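The two prior-probability options can be sketched as follows. The function name `class_priors`, the `mode` values, and the labels are illustrative assumptions and are not XLMiner identifiers.

```python
from collections import Counter

def class_priors(labels, mode="relative"):
    """Return a prior probability for each class.

    mode="relative": prior equals the class's share of the training set.
    mode="equal":    every class gets the same prior.
    """
    counts = Counter(labels)
    if mode == "relative":
        return {c: n / len(labels) for c, n in counts.items()}
    if mode == "equal":
        return {c: 1 / len(counts) for c in counts}
    raise ValueError("mode must be 'relative' or 'equal'")

# Hypothetical training labels: 6 of class A, 3 of class B, 1 of class C.
training_labels = ["A"] * 6 + ["B"] * 3 + ["C"]
relative = class_priors(training_labels, "relative")
equal = class_priors(training_labels, "equal")
print(relative)
print(equal)
```

With unbalanced classes the two options give noticeably different priors, which can change how borderline records are classified.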
  6. CLASSIFICATION - Discriminant Analysis (Step 3)
     Check the options that you want displayed in the output, then click Finish.
  7. CLASSIFICATION - Discriminant Analysis (Output)
  8. CLASSIFICATION - Discriminant Analysis
     This section of the output shows how each training data case was classified. The highest probability value in each record is highlighted.
  9. CLASSIFICATION - Classification Trees
     Classification trees are very useful for classifying or predicting outcomes. They generate simple rules that can easily be translated into natural language or a query language.
     Decision trees work by binary recursive partitioning: at each node, a record is sent down one of two branches depending on whether it meets that node's criterion, and the check is repeated at the next node.
     Since the partitioning is binary, the two branches of a node must represent mutually exclusive conditions.
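A single binary split, and the rule it yields, might look like the sketch below. It assumes Gini impurity as the splitting criterion, which the slides do not specify; the helper names and the data are hypothetical. The two branches use `<=` and `>` so that they are mutually exclusive.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature_index, threshold) of the split with the lowest
    weighted Gini impurity. Assumed criterion, for illustration only."""
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds lie midway between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]

# Hypothetical records with two predictors each.
rows = [(2.0, 10.0), (3.0, 40.0), (4.0, 20.0),
        (6.0, 30.0), (7.0, 15.0), (8.0, 35.0)]
labels = ["low", "low", "low", "high", "high", "high"]

feature, threshold = best_split(rows, labels)
# The split reads directly as a rule with two mutually exclusive branches.
print(f"IF x{feature} <= {threshold} THEN left branch ELSE right branch")
```

Applying the same search again inside each branch gives the recursive part of the partitioning.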
  10. CLASSIFICATION - Classification Trees (Step 1)
  11. CLASSIFICATION - Classification Trees (Step 2)
      “Minimum #records in terminal node” determines when partitioning stops: once a node reaches the minimum number of records, it is not split further, so that the model is not overfitted.
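The effect of a minimum terminal-node size can be sketched with a small recursive tree builder. This is an illustration under assumptions, not XLMiner's algorithm: it uses Gini impurity, and it interprets the setting as "reject any split that would leave a branch with fewer than `min_records` records". All names and data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def grow(rows, labels, min_records=1):
    """Recursively partition the records; return a class label for a
    terminal node or a dict describing a split."""
    majority = Counter(labels).most_common(1)[0][0]
    # Stop if the node is pure or too small to yield two valid branches.
    if len(set(labels)) == 1 or len(rows) < 2 * min_records:
        return majority
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            # Assumed rule: no branch may fall below min_records.
            if len(left) < min_records or len(right) < min_records:
                continue
            score = sum(len(side) * gini([labels[i] for i in side])
                        for side in (left, right))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority
    _, f, t, left, right = best
    return {
        "feature": f,
        "threshold": t,
        "left": grow([rows[i] for i in left],
                     [labels[i] for i in left], min_records),
        "right": grow([rows[i] for i in right],
                      [labels[i] for i in right], min_records),
    }

def count_leaves(tree):
    """Number of terminal nodes in a tree returned by grow()."""
    if not isinstance(tree, dict):
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])

# Hypothetical noisy data: one predictor, two classes.
rows = [(1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,)]
labels = ["A", "A", "B", "A", "B", "B", "A", "B"]
print(count_leaves(grow(rows, labels, min_records=1)))
print(count_leaves(grow(rows, labels, min_records=4)))
```

A larger minimum gives a smaller tree that ignores the isolated, noisy records instead of building a branch for each of them.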
  12. CLASSIFICATION - Classification Trees (Step 3)
      Select the options for the output. Selecting “Best pruned tree” causes the tree to be pruned and the pruned tree that best fits the validation set to be selected.
  13. CLASSIFICATION - Classification Trees (Output)
      Rules that are used to create nodes.
  14. CLASSIFICATION - Classification Trees (Output)
  15. CLASSIFICATION - Classification Trees (Output)
  16. CLASSIFICATION - Naïve Bayes Theorem
      Naïve Bayes classification assumes that the predictor variables are independent of one another within each class, i.e. the value of one variable does not affect that of the others. If a classification technique has to consider, say, 10 variables, naïve Bayes classifies by taking each variable into account separately.
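The "each variable separately" idea can be sketched for categorical predictors as below. This is a bare-bones illustration with no smoothing, so a value never seen with a class gives that class a score of zero; the function names and the toy data are hypothetical and are not taken from XLMiner.

```python
from collections import Counter

def fit_naive_bayes(rows, labels):
    """Count classes and (variable, value, class) combinations."""
    class_counts = Counter(labels)
    value_counts = Counter()
    for r, y in zip(rows, labels):
        for i, v in enumerate(r):
            value_counts[(i, v, y)] += 1
    return class_counts, value_counts

def predict(model, x):
    """Score each class as prior times one factor per variable."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior probability of the class
        for i, v in enumerate(x):
            # Each variable contributes its own factor, independently
            # of the other variables (the "naive" assumption).
            score *= value_counts[(i, v, c)] / n_c
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical records: (outlook, temperature) and a yes/no class.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "cool"), ("rainy", "hot")]
labels = ["no", "no", "yes", "yes", "yes", "no"]
model = fit_naive_bayes(rows, labels)
print(predict(model, ("rainy", "cool")), predict(model, ("sunny", "hot")))
```

Because the factors are simply multiplied, adding more variables only adds more terms to the product; no joint table over all variables is needed.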
  17. CLASSIFICATION - Naïve Bayes Theorem (Step 1)
  18. CLASSIFICATION - Naïve Bayes Theorem (Step 2-3)
  19. CLASSIFICATION - Naïve Bayes Theorem (Output)
  20. CLASSIFICATION - Naïve Bayes Theorem (Output)
  21. CLASSIFICATION - k-nearest neighbors
      In k-nearest neighbors (k-NN) classification, the k nearest neighbors of each record are identified (nearness is defined by the Euclidean distance to the record in question), and the class to which a majority of them belong is determined.
      The record is then assigned to that class.
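The procedure described above can be sketched in a few lines. The function name `knn_classify`, the default `k=3`, and the data are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_classify(train_rows, train_labels, x, k=3):
    """Classify x by a majority vote among its k nearest training records."""
    # Euclidean distance from x to every training record, nearest first.
    distances = sorted(
        (math.dist(r, x), y) for r, y in zip(train_rows, train_labels)
    )
    votes = Counter(y for _, y in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training records with two predictors each.
train_rows = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
              (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_labels = ["A", "A", "A", "B", "B", "B"]
print(knn_classify(train_rows, train_labels, (1.8, 1.6)))
print(knn_classify(train_rows, train_labels, (6.4, 6.6)))
```

Because Euclidean distance is sensitive to scale, predictors measured in very different units are usually normalized before the distances are computed.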
  22. CLASSIFICATION - k-nearest neighbors (Step 1)
  23. CLASSIFICATION - k-nearest neighbors (Step 2)
  24. CLASSIFICATION - k-nearest neighbors (Output)
  25. CLASSIFICATION - k-nearest neighbors (Output)
      Each record is placed in the class with the highest probability.
  26. Thank you
      For more presentations and tutorial videos on data mining, please visit http://dataminingtools.net