# Belief Networks & Bayesian Classification

An Introduction to Bayesian Belief Networks and Naïve Bayesian Classification

Published in: Technology, Education

### Belief Networks & Bayesian Classification

1. Belief Networks & Bayesian Classification: An Introduction to Bayesian Belief Networks and Naïve Bayesian Classification. Adnan Masood, scis.nova.edu/~adnan, adnan@nova.edu
2. Overview
   - Probability and Uncertainty
   - Probability Notation
   - Bayesian Statistics
   - Notation of Probability
   - Axioms of Probability
   - Probability Table
   - Bayesian Belief Network
   - Joint Probability Table
   - Probability of Disjunctions
   - Conditional Probability
   - Conditional Independence
   - Bayes Rule
   - Classification with Bayes rule
   - Bayesian Classification
   - Conclusion & Further Reading
3. Probability and Uncertainty
   - Probability provides a way of summarizing uncertainty.
     - 60% chance of rain today
     - 85% chance of an alarm in case of a burglary
   - Probability is calculated from past performance or from a degree of belief.
4. Bayesian Statistics
   - Three approaches to probability:
     - Axiomatic: probability by definition and properties
     - Relative frequency: repeated trials
     - Degree of belief (subjective): a personal measure of uncertainty
   - Examples:
     - The chance that a meteor strikes Earth is 1%
     - The probability of rain today is 30%
     - The chance of getting an A on the exam is 50%
5. Notation of Probability
6. Notation of Probability
7. Axioms of Probability
8. Probability Table
   - P(Weather = sunny) = P(sunny) = 5/14
   - P(Weather) = {5/14, 4/14, 5/14}
   - Calculate probabilities from data:

   | Outlook     | sunny | overcast | rainy |
   |-------------|-------|----------|-------|
   | Probability | 5/14  | 4/14     | 5/14  |
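
As a minimal sketch of "calculate probabilities from data", the relative frequencies on slide 8 can be reproduced by counting. The `outlook` list below is rebuilt from the slide's counts (5 sunny, 4 overcast, 5 rainy) rather than from the original dataset rows, and the function name `marginal` is an illustrative choice, not something from the slides.

```python
from collections import Counter
from fractions import Fraction

# Outlook column rebuilt from the slide's counts; row order is arbitrary
# and does not affect the estimate (assumption: 14 observations in total).
outlook = ["sunny"] * 5 + ["overcast"] * 4 + ["rainy"] * 5


def marginal(values):
    """Estimate P(X = v) as the relative frequency of v in the data."""
    total = len(values)
    return {v: Fraction(c, total) for v, c in Counter(values).items()}


p_outlook = marginal(outlook)
for value, p in p_outlook.items():
    print(f"P(Outlook={value}) = {float(p):.3f}")
```

Exact fractions are used so that the estimates can be compared directly with the 5/14, 4/14, 5/14 values on the slide.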
9. An expert-built belief network using the weather dataset (Mitchell; Witten & Frank)
   - Bayesian inference can help answer questions like the probability of game play if:
     - a. Outlook=sunny, Temperature=cool, Humidity=high, Wind=strong
     - b. Outlook=overcast, Temperature=cool, Humidity=high, Wind=strong
10. Bayesian Belief Network
    - A Bayesian belief network allows a subset of the variables to be conditionally independent.
    - A graphical model of causal relationships
    - Several cases of learning Bayesian belief networks:
      - Given both network structure and all the variables: easy
      - Given network structure but only some variables
      - When the network structure is not known in advance
11. Bayesian Belief Network
    - Nodes: Family History (FH), Smoker (S), Lung Cancer (LC), Emphysema, Positive X-Ray, Dyspnea
    - The conditional probability table for the variable Lung Cancer:

    |     | (FH, S) | (FH, ~S) | (~FH, S) | (~FH, ~S) |
    |-----|---------|----------|----------|-----------|
    | LC  | 0.8     | 0.5      | 0.7      | 0.1       |
    | ~LC | 0.2     | 0.5      | 0.3      | 0.9       |
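
One way to read the conditional probability table on slide 11 is as a lookup keyed by the parent values. The sketch below stores only the LC row and derives ~LC as the complement; the names `cpt_lung_cancer` and `p_lung_cancer` are hypothetical and chosen for illustration.

```python
# CPT for Lung Cancer from slide 11, keyed by (family_history, smoker).
# Each value is P(LC = true | FH, S); P(LC = false | FH, S) is its complement.
cpt_lung_cancer = {
    (True, True): 0.8,
    (True, False): 0.5,
    (False, True): 0.7,
    (False, False): 0.1,
}


def p_lung_cancer(lc, family_history, smoker):
    """Return P(LungCancer = lc | FamilyHistory, Smoker) from the CPT."""
    p_true = cpt_lung_cancer[(family_history, smoker)]
    return p_true if lc else 1.0 - p_true


# Example: a smoker with no family history.
print(p_lung_cancer(True, family_history=False, smoker=True))
```

Storing one row per parent configuration is enough for a binary variable, because each column of the table must sum to 1.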
12. A Hypothesis for playing tennis
13. Joint Probability Table

    | Temperature \ Outlook | sunny | overcast | rainy |
    |-----------------------|-------|----------|-------|
    | hot                   | 2/14  | 2/14     | 0/14  |
    | mild                  | 2/14  | 1/14     | 3/14  |
    | cool                  | 1/14  | 1/14     | 2/14  |
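
The joint table on slide 13 determines the single-variable probabilities by summing over the other variable. A minimal sketch, assuming the rows are Temperature and the columns are Outlook as laid out on the slide; the helper names are illustrative.

```python
from fractions import Fraction

# Joint counts out of 14 days, keyed by (temperature, outlook), from slide 13.
joint_counts = {
    ("hot", "sunny"): 2, ("hot", "overcast"): 2, ("hot", "rainy"): 0,
    ("mild", "sunny"): 2, ("mild", "overcast"): 1, ("mild", "rainy"): 3,
    ("cool", "sunny"): 1, ("cool", "overcast"): 1, ("cool", "rainy"): 2,
}
TOTAL = sum(joint_counts.values())


def p_joint(temperature, outlook):
    """P(Temperature = temperature, Outlook = outlook)."""
    return Fraction(joint_counts[(temperature, outlook)], TOTAL)


def p_outlook(outlook):
    """Marginal P(Outlook): sum the joint table over all temperatures."""
    return sum(p_joint(t, o) for (t, o) in joint_counts if o == outlook)


def p_temperature(temperature):
    """Marginal P(Temperature): sum the joint table over all outlooks."""
    return sum(p_joint(t, o) for (t, o) in joint_counts if t == temperature)


print(p_outlook("sunny"), p_temperature("hot"))
```

The Outlook marginals recovered this way agree with the probability table on slide 8.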
14. Example: Calculating Global Probabilistic Beliefs
    - P(PlayTennis) = 9/14 = 0.64
    - P(~PlayTennis) = 5/14 = 0.36
    - P(Outlook=sunny|PlayTennis) = 2/9 = 0.22
    - P(Outlook=sunny|~PlayTennis) = 3/5 = 0.60
    - P(Temperature=cool|PlayTennis) = 3/9 = 0.33
    - P(Temperature=cool|~PlayTennis) = 1/5 = 0.20
    - P(Humidity=high|PlayTennis) = 3/9 = 0.33
    - P(Humidity=high|~PlayTennis) = 4/5 = 0.80
    - P(Wind=strong|PlayTennis) = 3/9 = 0.33
    - P(Wind=strong|~PlayTennis) = 3/5 = 0.60
15. Probability of Disjunctions
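
The transcript keeps only the title of this slide. As a hedged illustration, the standard rule for the probability of a disjunction (inclusion-exclusion) can be applied to the joint table on slide 13; the choice of events "sunny" and "hot" is an example picked here, not one taken from the slides.

$$
\begin{aligned}
P(A \lor B) &= P(A) + P(B) - P(A \land B) \\
P(\text{sunny} \lor \text{hot}) &= \tfrac{5}{14} + \tfrac{4}{14} - \tfrac{2}{14} = \tfrac{7}{14} = \tfrac{1}{2}
\end{aligned}
$$

The subtraction removes the sunny-and-hot cell, which would otherwise be counted twice.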
16. Conditional Probability
    - Probabilities discussed so far are called prior probabilities or unconditional probabilities.
    - These probabilities depend only on the data, not on any other variable.
    - But what if you have some evidence or knowledge about the situation? You now have a toothache. What is the probability of having a cavity?
17. Conditional Probability
18. Conditional Probability
19. Conditional Independence
20. The independence hypothesis…
    - … makes computation possible
    - … yields optimal classifiers when satisfied
    - … but is seldom satisfied in practice, as attributes (variables) are often correlated.
    - Attempts to overcome this limitation:
      - Bayesian networks, which combine Bayesian reasoning with causal relationships between attributes
      - Decision trees, which reason on one attribute at a time, considering the most important attributes first
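
Written out, the independence hypothesis says that the attributes are conditionally independent given the class. A sketch in standard notation, where the symbols $C$ (class) and $x_1, \dots, x_n$ (attribute values) are introduced here for illustration:

$$
P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)
\quad\Longrightarrow\quad
P(C \mid x_1, \dots, x_n) \propto P(C) \prod_{i=1}^{n} P(x_i \mid C)
$$

This is why the hypothesis "makes computation possible": each factor $P(x_i \mid C)$ can be estimated by counting one attribute at a time, instead of estimating a joint table over all attribute combinations.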
21. Conditional Independence
22. Bayes' Rule
    - Remember conditional probabilities:
      - P(A|B) = P(A,B)/P(B), so P(B)P(A|B) = P(A,B)
      - P(B|A) = P(B,A)/P(A), so P(A)P(B|A) = P(B,A)
      - P(B,A) = P(A,B), so P(B)P(A|B) = P(A)P(B|A)
    - Bayes' Rule: P(A|B) = P(B|A)P(A)/P(B)
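
Bayes' rule can be checked numerically with the weather figures already given on slides 8 and 14. The sketch below takes A = PlayTennis and B = Outlook=sunny; the function name `bayes` is an illustrative choice.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


p_sunny_given_play = Fraction(2, 9)   # P(Outlook=sunny | PlayTennis), slide 14
p_play = Fraction(9, 14)              # P(PlayTennis), slide 14
p_sunny = Fraction(5, 14)             # P(Outlook=sunny), slide 8

p_play_given_sunny = bayes(p_sunny_given_play, p_play, p_sunny)
print(p_play_given_sunny)  # 2/5
```

The result matches a direct count: of the 5 sunny days, tennis was played on 2.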
23. Bayes' Rule
24. Classification with Bayes Rule
25. Naïve Bayes Classifier
26. Bayesian Classification: Why?
    - Probabilistic learning: computes explicit probabilities for hypotheses; among the most practical approaches to certain types of learning problems.
    - Incremental: each training example can incrementally increase or decrease the probability that a hypothesis is correct. Prior knowledge can be combined with observed data.
    - Probabilistic prediction: predicts multiple hypotheses, weighted by their probabilities.
    - Benchmark: even if Bayesian methods are computationally intractable, they can provide a benchmark for other algorithms.
27. Classification with Bayes Rule. Courtesy, Simafore: http://www.simafore.com/blog/bid/100934/Beware-of-2-facts-when-using-Naive-Bayes-classification-for-analytics
28. Issues with Naïve Bayes
    - Change in classifier data (on the fly, during classification)
    - The conditional independence assumption is violated
      - Consider the task of classifying whether or not a certain word is a corporation name, e.g. "Google," "Microsoft," "IBM," and "ACME."
      - Two useful features we might want to use are capitalized and all-capitals.
      - Naïve Bayes will assume that these two features are independent given the class, but this clearly isn't the case: things that are all-caps must also be capitalized.
    - However, naïve Bayes seems to work well in practice even when this assumption is violated.
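29. Naïve Bayes Classifier
30. Naïve Bayesian Classifier: given a training set, we can compute the probabilities (P = play, N = don't play)

    | Attribute   | Value    | P   | N   |
    |-------------|----------|-----|-----|
    | Outlook     | sunny    | 2/9 | 3/5 |
    | Outlook     | overcast | 4/9 | 0   |
    | Outlook     | rain     | 3/9 | 2/5 |
    | Temperature | hot      | 2/9 | 2/5 |
    | Temperature | mild     | 4/9 | 2/5 |
    | Temperature | cool     | 3/9 | 1/5 |
    | Humidity    | high     | 3/9 | 4/5 |
    | Humidity    | normal   | 6/9 | 1/5 |
    | Windy       | true     | 3/9 | 3/5 |
    | Windy       | false    | 6/9 | 2/5 |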
32. Estimating a-posteriori probabilities
33. Naïve Bayesian Classification
34. Class priors and conditional probabilities
    - P(p) = 9/14, P(n) = 5/14
    - outlook
      - P(sunny|p) = 2/9, P(sunny|n) = 3/5
      - P(overcast|p) = 4/9, P(overcast|n) = 0
      - P(rain|p) = 3/9, P(rain|n) = 2/5
    - temperature
      - P(hot|p) = 2/9, P(hot|n) = 2/5
      - P(mild|p) = 4/9, P(mild|n) = 2/5
      - P(cool|p) = 3/9, P(cool|n) = 1/5
    - humidity
      - P(high|p) = 3/9, P(high|n) = 4/5
      - P(normal|p) = 6/9, P(normal|n) = 1/5
    - windy
      - P(true|p) = 3/9, P(true|n) = 3/5
      - P(false|p) = 6/9, P(false|n) = 2/5
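
With these priors and conditional probabilities, the two questions from slide 9 can be answered by multiplying the relevant factors for each class and normalizing. This is a minimal sketch rather than the deck's own code: the dictionary layout and function names are illustrative, Wind=strong is assumed to correspond to windy=true (their probabilities on slides 14 and 34 coincide), and P(normal|n) is taken as 1/5 from the slide 30 table.

```python
from fractions import Fraction as F

# Class priors: p = play, n = don't play.
priors = {"p": F(9, 14), "n": F(5, 14)}

# likelihoods[attribute][value][cls] = P(attribute = value | cls)
likelihoods = {
    "outlook": {
        "sunny": {"p": F(2, 9), "n": F(3, 5)},
        "overcast": {"p": F(4, 9), "n": F(0)},
        "rain": {"p": F(3, 9), "n": F(2, 5)},
    },
    "temperature": {
        "hot": {"p": F(2, 9), "n": F(2, 5)},
        "mild": {"p": F(4, 9), "n": F(2, 5)},
        "cool": {"p": F(3, 9), "n": F(1, 5)},
    },
    "humidity": {
        "high": {"p": F(3, 9), "n": F(4, 5)},
        "normal": {"p": F(6, 9), "n": F(1, 5)},
    },
    "windy": {
        "true": {"p": F(3, 9), "n": F(3, 5)},
        "false": {"p": F(6, 9), "n": F(2, 5)},
    },
}


def posterior(instance):
    """Return P(cls | instance) for each class under the naive assumption."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for attribute, value in instance.items():
            score *= likelihoods[attribute][value][cls]
        scores[cls] = score
    total = sum(scores.values())  # assumed non-zero for the queries below
    return {cls: score / total for cls, score in scores.items()}


query_a = {"outlook": "sunny", "temperature": "cool", "humidity": "high", "windy": "true"}
query_b = {"outlook": "overcast", "temperature": "cool", "humidity": "high", "windy": "true"}

posterior_a = posterior(query_a)
posterior_b = posterior(query_b)
print({cls: round(float(p), 3) for cls, p in posterior_a.items()})
print({cls: round(float(p), 3) for cls, p in posterior_b.items()})
```

For query (a) the "don't play" class wins with a posterior of about 0.795. For query (b), P(overcast|n) = 0 forces the "don't play" score to zero, so a single zero count decides the outcome regardless of the other attributes.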
35. Play Tennis example
36. Conclusion & Further Reading
    - Probabilities
    - Joint Probabilities
    - Conditional Probabilities
    - Independence, Conditional Independence
    - Naïve Bayes Classifier
37. References
    - J. Han, M. Kamber; Data Mining; Morgan Kaufmann Publishers: San Francisco, CA.
    - Charniak; Bayesian Networks without Tears; AI Magazine. http://www.aaai.org/ojs/index.php/aimagazine/article/view/918
    - Adnan Darwiche; Bayesian networks; Automated Reasoning Group, UCLA.