# Enhancing the performance of Naive Bayesian Classifier using Information Gain concept of Decision Tree

Enhancing the performance of the Naïve Bayesian (NB) classifier by using only the attributes with the highest information gain.

Published in: Technology, Education

1. Title slide (7/29/2013).
2. Outline:
   - Abstract of the work
   - Why we need it?
   - Naïve Bayesian Classifier: definition, algorithm
   - Gaussian Distribution
   - Decision Tree: definition, algorithm
3. Outline (continued):
   - Information Gain
   - My Algorithm
   - Experimental Design
   - Experimental Results
   - Remarks
4. Abstract of the work:
   - Apply the Naïve Bayesian classifier.
   - Build a decision tree based on information gain and select attributes from it.
   - Apply the Naïve Bayesian classifier again, using only the selected attributes.
   - Minimize the time and space needed for analysis.
   - Can work with a continuous data stream.
5. Why we need it:
   1. Nowadays the data volume generated by internet users is growing ever larger.
   2. Machine learning is getting harder day by day.
   3. Pre-processing the data may be a solution.
   4. Using only the necessary data can make the learning process faster.
6. Why we need it (continued):
   5. A better technique can make the process more organized by using only the necessary data.
   6. Cut all unimportant attributes from the data set.
   7. The dataset becomes compact in terms of attributes, and calculation becomes fast.
   8. Get better performance than before in terms of time and space.
7. Background:
   - Naïve Bayesian Classifier
   - Gaussian Distribution
   - Decision Tree
   - Information Gain
8. Naïve Bayesian Classifier:
   - The Naïve Bayesian classifier (NB) is a straightforward and frequently used method for supervised learning.
   - It provides a flexible way of dealing with any number of attributes or classes.
   - It is based on statistical probability theory.
9. Naïve Bayesian Classifier (continued):
   - It is asymptotically the fastest learning algorithm that examines all of its training input.
   - It has been demonstrated to perform surprisingly well in a very wide variety of problems, in spite of the simplistic nature of the model.
   - Furthermore, small amounts of bad data, or "noise," do not perturb the results by much.
10. Classification setup:
   - There are classes Ck into which the data are to be classified.
   - Each class has a prior probability P(Ck) of an instance belonging to Ck.
   - Given n attribute values vj, the goal of classification is to find the class that maximizes the conditional probability P(Ck | v1 ∧ v2 ∧ … ∧ vn).
11. By Bayes' rule, this probability is equivalent to

    P(Ck | v1 ∧ … ∧ vn) = P(v1 ∧ … ∧ vn | Ck) · P(Ck) / P(v1 ∧ … ∧ vn)

    and, under the naive assumption that the attributes are conditionally independent given the class,

    P(Ck | v1 ∧ … ∧ vn) ∝ P(Ck) · ∏j P(vj | Ck).
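As a concrete illustration of the rule above, a minimal sketch in Python; the helper name and the probability tables used below are illustrative, not from the slides:

```python
# Naive Bayes posterior under the independence assumption:
# P(Ck | v1..vn) ∝ P(Ck) * prod_j P(vj | Ck).
def nb_posterior(priors, likelihoods, values):
    """Return normalized class posteriors for the observed attribute values.

    priors:      {class: P(class)}
    likelihoods: {class: [ {value: P(value | class)} per attribute ]}
    values:      observed attribute values v1..vn
    """
    scores = {}
    for ck, prior in priors.items():
        score = prior
        for table, vj in zip(likelihoods[ck], values):
            score *= table.get(vj, 0.0)  # P(vj | Ck), 0 if unseen
        scores[ck] = score
    total = sum(scores.values())
    # Normalizing makes the proportionality an equality.
    return {ck: s / total for ck, s in scores.items()}
```

Normalizing by the sum over classes avoids computing P(v1 ∧ … ∧ vn) explicitly, since it is the same for every class.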
12. The probability density of the Gaussian distribution at a particular point x is

    f(x) = (1 / (σ √(2π))) · exp(−(x − µ)² / (2σ²)),

    where µ is the mean and σ is the standard deviation of the continuous-valued attribute X.
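The density above translates directly into code; a minimal sketch (the function name is illustrative):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density at x, given mean mu and std dev sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
```

This is how a Gaussian NB estimates P(vj | Ck) for a continuous attribute: µ and σ are computed per class from the training data, then plugged in here.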
13. Decision Tree:
   - Decision trees are one of the most popular methods used for inductive inference.
   - The basic algorithm for decision tree induction is a greedy algorithm that constructs the tree in a top-down, recursive, divide-and-conquer manner.
   - The key concept for selecting an attribute while constructing the tree is Information Gain (IG).
14. The basic idea behind any decision tree algorithm is as follows:
   - Choose the best attribute to split the remaining instances, using Information Gain, and make that attribute a decision node.
   - Repeat this process recursively for each child.
   - Stop when: all instances have the same target attribute value, there are no more attributes, or there are no more instances.
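The steps above can be sketched as an ID3-style recursion. This is a simplified sketch, not the exact C4.5 used later in the slides; the dict-based tree nodes and integer attribute indices are illustrative choices:

```python
import math
from collections import Counter

def entropy(labels):
    """Expected information of a label multiset: -sum(pk * log2 pk)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def best_attribute(rows, labels, attrs):
    """Pick the attribute whose split yields the highest information gain."""
    def gain(a):
        parts = {}
        for row, lab in zip(rows, labels):
            parts.setdefault(row[a], []).append(lab)
        remainder = sum(len(p) / len(labels) * entropy(p)
                        for p in parts.values())
        return entropy(labels) - remainder
    return max(attrs, key=gain)

def build_tree(rows, labels, attrs):
    """Recursive construction following the three stop conditions above."""
    if len(set(labels)) == 1:          # all instances share one target value
        return labels[0]
    if not attrs or not rows:          # no more attributes / no more instances
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(rows, labels, attrs)
    node = {"attr": a, "children": {}}
    for value in set(row[a] for row in rows):
        sub = [(r, l) for r, l in zip(rows, labels) if r[a] == value]
        sub_rows, sub_labels = zip(*sub)
        node["children"][value] = build_tree(
            list(sub_rows), list(sub_labels),
            [x for x in attrs if x != a])
    return node
```

On a tiny commute-style dataset this reproduces the shape of the example tree on the next slide: the departure-time attribute has the highest gain and becomes the root.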
15. Example decision tree for predicting commute time (root: Leave At, with branches 8 AM → Stall?, 9 AM → Accident?, 10 AM; leaf values: Short, Medium, Long). If we leave at 9 AM and no accident happened on the road, what will our commute time be?
16. Information Gain:
   - The critical step in decision trees is the selection of the best test attribute.
   - The information gain measure is used to select the test attribute at each node in the tree.
   - The expected information needed to classify a given sample is

     I(s1, s2, …, sm) = −Σk pk log2(pk),

     where pk is the probability that an arbitrary sample belongs to class Ck and is estimated by sk / s.
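The expected-information formula is short enough to check numerically; a sketch (function name illustrative), using the classic 9-positive / 5-negative example as a sanity check in the usage below:

```python
import math
from collections import Counter

def expected_information(labels):
    """I(s1..sm) = -sum(pk * log2 pk), with pk estimated as sk / s."""
    s = len(labels)
    return -sum((sk / s) * math.log2(sk / s)
                for sk in Counter(labels).values())
```

For a perfectly balanced two-class sample this gives 1 bit; for a single-class sample it gives 0, which is why pure partitions stop the tree-growing recursion.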
17. My algorithm:
   1. Run the Naïve Bayesian classifier on the training data set.
   2. Run C4.5 on the training data from step 1.
   3. Select as relevant features only the attributes that appear in the simplified decision tree.
   4. Run the Naïve Bayesian classifier on the training data using only the attributes selected in step 3.
   5. Compare the result of step 4 with that of step 1.
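The five steps can be sketched with scikit-learn (one of the tools the slides list). Note this is an approximation of the slides' setup: scikit-learn implements CART rather than C4.5, so `DecisionTreeClassifier` with a small `max_depth` stands in for the simplified C4.5 tree, and `feature_importances_ > 0` stands in for "attributes that appear in the tree":

```python
# Sketch of the Selective Bayesian Classifier pipeline on the Iris data,
# using CART as a stand-in for C4.5.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Step 1: plain Naive Bayes on all attributes.
nb_all = GaussianNB().fit(X_train, y_train)
acc_all = nb_all.score(X_test, y_test)

# Steps 2-3: grow a shallow (i.e. simplified) tree and keep the
# attributes it actually splits on.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
selected = np.flatnonzero(tree.feature_importances_ > 0)

# Step 4: Naive Bayes restricted to the selected attributes.
nb_sel = GaussianNB().fit(X_train[:, selected], y_train)
acc_sel = nb_sel.score(X_test[:, selected], y_test)

# Step 5: compare acc_sel with acc_all.
```

On Iris this typically selects a small subset of the four attributes, mirroring the "4 → 2" reduction reported in the results below, though the exact subset depends on the split and the tree settings.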
18. Experimental design:
   - Each dataset is shuffled randomly.
   - Disjoint training and test sets are produced as follows:
     - 80% training & 20% test data
     - 70% training & 30% test data
     - 60% training & 40% test data
   - For each pair of training and test sets, run:
     - Naïve Bayesian Classifier (NBC)
     - C4.5
     - Selective Bayesian Classifier (SBC)
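The shuffle-then-split procedure might look like the following sketch (the helper name, ratio encoding, and fixed seed are illustrative assumptions, not from the slides):

```python
import random

def disjoint_splits(instances, ratios=(0.8, 0.7, 0.6), seed=42):
    """Shuffle once, then cut disjoint training/test sets for each ratio."""
    data = list(instances)
    random.Random(seed).shuffle(data)   # each dataset is shuffled randomly
    splits = {}
    for r in ratios:
        cut = int(len(data) * r)
        key = f"{int(r * 100)}:{100 - int(r * 100)}"
        splits[key] = (data[:cut], data[cut:])  # train, test are disjoint
    return splits
```

Within each ratio the training and test sets are disjoint by construction, since they are complementary slices of the same shuffled list.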
19. Number of instances and attributes before & after decision-tree selection:

| Dataset | # of instances | # of attributes | # of attributes selected |
|---|---|---|---|
| Iris | 150 | 4 | 2 |
| Diabetes | 768 | 8 | 6 |
| Ionosphere | 351 | 34 | 14 |
| Breast Cancer | 286 | 9 | 6 |
| Ecoli | 336 | 8 | 7 |
20. Number of test instances classified properly:

Iris:

| Training : Test | Number of instances | Naïve Bayesian (correct) | Accuracy (%) | Selective Naïve Bayesian (correct) | Accuracy (%) |
|---|---|---|---|---|---|
| 80 : 20 | 30 | 27 | 90.00 | 29 | 96.67 |
| 70 : 30 | 45 | 42 | 93.33 | 43 | 95.56 |
| 60 : 40 | 60 | 56 | 93.33 | 57 | 95.00 |

Diabetes:

| Training : Test | Number of instances | Naïve Bayesian (correct) | Accuracy (%) | Selective Naïve Bayesian (correct) | Accuracy (%) |
|---|---|---|---|---|---|
| 80 : 20 | 154 | 119 | 77.27 | 126 | 81.81 |
| 70 : 30 | 231 | 173 | 76.20 | 181 | 78.35 |
| 60 : 40 | 308 | 239 | 77.60 | 246 | 79.87 |
21. Number of test instances classified properly (continued):

Breast Cancer:

| Training : Test | Number of instances | Naïve Bayesian (correct) | Accuracy (%) | Selective Naïve Bayesian (correct) | Accuracy (%) |
|---|---|---|---|---|---|
| 80 : 20 | 137 | 134 | 97.81 | 135 | 98.54 |
| 70 : 30 | 205 | 200 | 97.56 | 202 | 98.54 |
| 60 : 40 | 274 | 261 | 95.26 | 264 | 96.35 |

Ecoli:

| Training : Test | Number of instances | Naïve Bayesian (correct) | Accuracy (%) | Selective Naïve Bayesian (correct) | Accuracy (%) |
|---|---|---|---|---|---|
| 80 : 20 | 68 | 56 | 82.35 | 58 | 85.29 |
| 70 : 30 | 101 | 81 | 80.20 | 82 | 81.19 |
| 60 : 40 | 135 | 110 | 81.48 | 110 | 81.48 |
22. Number of test instances classified properly (continued):

Ionosphere:

| Training : Test | Number of instances | Naïve Bayesian (correct) | Accuracy (%) | Selective Naïve Bayesian (correct) | Accuracy (%) |
|---|---|---|---|---|---|
| 80 : 20 | 81 | 74 | 91.36 | 78 | 96.30 |
| 70 : 30 | 106 | 97 | 91.51 | 100 | 94.34 |
| 60 : 40 | 141 | 131 | 92.91 | 134 | 95.04 |
23. Result of cross-validation (10-fold), Iris:

| Fold | Naïve Bayesian (correct) | Selective Naïve Bayesian (correct) | Number of instances |
|---|---|---|---|
| 1 | 15 | 16 | 16 |
| 2 | 16 | 16 | 16 |
| 3 | 14 | 14 | 16 |
| 4 | 16 | 16 | 16 |
| 5 | 13 | 13 | 16 |
| 6 | 16 | 16 | 16 |
| 7 | 15 | 15 | 16 |
| 8 | 15 | 16 | 16 |
| 9 | 15 | 16 | 16 |
| 10 | 15 | 15 | 15 |
24. Result of cross-validation (10-fold), Breast Cancer:

| Fold | Naïve Bayesian (correct) | Selective Naïve Bayesian (correct) | Number of instances |
|---|---|---|---|
| 1 | 65 | 63 | 69 |
| 2 | 68 | 68 | 69 |
| 3 | 68 | 68 | 69 |
| 4 | 66 | 65 | 69 |
| 5 | 65 | 65 | 69 |
| 6 | 66 | 66 | 69 |
| 7 | 68 | 69 | 69 |
| 8 | 67 | 68 | 69 |
| 9 | 65 | 66 | 69 |
| 10 | 67 | 69 | 69 |
25. Result of cross-validation (10-fold), Diabetes:

| Fold | Naïve Bayesian (correct) | Selective Naïve Bayesian (correct) | Number of instances |
|---|---|---|---|
| 1 | 69 | 68 | 77 |
| 2 | 53 | 56 | 77 |
| 3 | 56 | 57 | 77 |
| 4 | 61 | 62 | 77 |
| 5 | 65 | 64 | 77 |
| 6 | 56 | 56 | 77 |
| 7 | 56 | 57 | 77 |
| 8 | 60 | 59 | 77 |
| 9 | 52 | 54 | 77 |
| 10 | 59 | 60 | 77 |
26. Result of cross-validation (10-fold), Ecoli:

| Fold | Naïve Bayesian (correct) | Selective Naïve Bayesian (correct) | Number of instances |
|---|---|---|---|
| 1 | 21 | 21 | 34 |
| 2 | 31 | 31 | 34 |
| 3 | 31 | 31 | 34 |
| 4 | 26 | 26 | 34 |
| 5 | 25 | 25 | 34 |
| 6 | 23 | 23 | 34 |
| 7 | 24 | 24 | 34 |
| 8 | 27 | 27 | 34 |
| 9 | 29 | 29 | 34 |
| 10 | 30 | 30 | 34 |
27. Result of cross-validation (10-fold), Ionosphere:

| Fold | Naïve Bayesian (correct) | Selective Naïve Bayesian (correct) | Number of instances |
|---|---|---|---|
| 1 | 35 | 33 | 36 |
| 2 | 33 | 33 | 36 |
| 3 | 31 | 32 | 36 |
| 4 | 33 | 34 | 36 |
| 5 | 33 | 35 | 36 |
| 6 | 30 | 31 | 36 |
| 7 | 30 | 31 | 36 |
| 8 | 32 | 33 | 36 |
| 9 | 31 | 31 | 36 |
| 10 | 33 | 34 | 36 |
28. Datasets and tools:
   - Datasets:
     - UCI Machine Learning Repository
     - Weka-provided datasets
   - Software & tools:
     - Weka 3.6.9
     - Python data-mining libraries: sklearn, numpy, pylab
29. Thank You.