SlideShare a Scribd company logo
1 of 37
Download to read offline
Naïve Bayes Classification
Things We’d Like to Do
• Spam Classification
– Given an email, predict whether it is spam or not
• Medical Diagnosis
– Given a list of symptoms, predict whether a patient has
disease X or not
• Weather
– Based on temperature, humidity, etc… predict if it will rain
tomorrow
• The relationship between attribute set and
the class variable is non-deterministic.
• Even if the attributes are same, the class label
may differ in training set even and hence can
not be predicted with certainty.
• Reason: noisy data, certain other attributes
are not included in the data.
• Example: Task of predicting whether a person
is at risk for heart disease based on the
person’s diet and workout frequency.
• So need an approach to model probabilistic
relationship between attribute set and the
class variable.
Bayesian Classification
• Problem statement:
– Given features X1,X2,…,Xn
– Predict a label Y
Another Application
• Digit Recognition
• X1,…,Xn  {0,1} (Black vs. White pixels)
• Y  {5,6} (predict whether a digit is a 5 or a 6)
Classifier 5
Bayes Classifier
• A probabilistic framework for solving
classification problems
• Conditional Probability:
• Bayes theorem:
)
(
)
(
)
|
(
)
|
(
A
P
C
P
C
A
P
A
C
P 
)
(
)
,
(
)
|
(
)
(
)
,
(
)
|
(
C
P
C
A
P
C
A
P
A
P
C
A
P
A
C
P


Example of Bayes Theorem
• Given:
– A doctor knows that Cold causes fever 50% of the time
– Prior probability of any patient having cold is 1/50,000
– Prior probability of any patient having fever is 1/20
• If a patient has fever, what’s the probability
he/she has cold?
0002
.
0
20
/
1
50000
/
1
5
.
0
)
(
)
(
)
|
(
)
|
( 



F
P
C
P
C
F
P
F
C
P
Bayesian Classifiers
• Consider each attribute and class label as
random variables
• Given a record with attributes (A1, A2,…,An)
– Goal is to predict class C
– Specifically, we want to find the value of C that
maximizes P(C| A1, A2,…,An )
• Can we estimate P(C| A1, A2,…,An ) directly from
data?
Bayesian Classifiers
• Approach:
– compute the posterior probability P(C | A1, A2, …, An) for all
values of C using the Bayes theorem
– Choose value of C that maximizes
P(C | A1, A2, …, An)
– Equivalent to choosing value of C that maximizes
P(A1, A2, …, An|C) P(C)
• How to estimate P(A1, A2, …, An | C )?
)
(
)
(
)
|
(
)
|
(
2
1
2
1
2
1
n
n
n
A
A
A
P
C
P
C
A
A
A
P
A
A
A
C
P


 
Naïve Bayes Classifier
• Assume independence among attributes Ai when class is
given:
– P(A1, A2, …, An |C) = P(A1| Cj) P(A2| Cj)… P(An| Cj)
– Can estimate P(Ai| Cj) for all Ai and Cj.
– New point is classified to Cj if P(Cj)  P(Ai| Cj) is
maximum.
How to Estimate Probabilities from
Data?
• Class: P(C) = Nc/N
– e.g., P(No) = 7/10,
P(Yes) = 3/10
• For discrete attributes:
P(Ai | Ck) = |Aik|/ Nc
– where |Aik| is number of
instances having attribute Ai
and belongs to class Ck
– Examples:
P(Status=Married|No) = 4/7
P(Refund=Yes|Yes)=0
k
Tid Refund Marital
Status
Taxable
Income Evade
1 Yes Single 125K No
2 No Married 100K No
3 No Single 70K No
4 Yes Married 120K No
5 No Divorced 95K Yes
6 No Married 60K No
7 Yes Divorced 220K No
8 No Single 85K Yes
9 No Married 75K No
10 No Single 90K Yes
10
categorical
categorical
continuous
class
How to Estimate Probabilities from
Data?
• For continuous attributes:
– Discretize the range into bins
• one ordinal attribute per bin
• violates independence assumption
– Two-way split: (A < v) or (A > v)
• choose only one of the two splits as new attribute
– Probability density estimation:
• Assume attribute follows a normal distribution
• Use data to estimate parameters of distribution
(e.g., mean and standard deviation)
• Once probability distribution is known, can use it to
estimate the conditional probability P(Ai|c)
k
How to Estimate Probabilities from
Data?
• Normal distribution:
– One for each (Ai,ci) pair
• For (Income, Class=No):
– If Class=No
• sample mean = 110
• sample variance = 2975
Tid Refund Marital
Status
Taxable
Income Evade
1 Yes Single 125K No
2 No Married 100K No
3 No Single 70K No
4 Yes Married 120K No
5 No Divorced 95K Yes
6 No Married 60K No
7 Yes Divorced 220K No
8 No Single 85K Yes
9 No Married 75K No
10 No Single 90K Yes
10
categorical
categorical
continuous
class
2
2
2
)
(
2
2
1
)
|
( ij
ij
i
A
ij
j
i
e
c
A
P 





0072
.
0
)
54
.
54
(
2
1
)
|
120
( )
2975
(
2
)
110
120
( 2





e
No
Income
P

Example of Naïve Bayes Classifier
P(Refund=Yes|No) = 3/7
P(Refund=No|No) = 4/7
P(Refund=Yes|Yes) = 0
P(Refund=No|Yes) = 1
P(Marital Status=Single|No) = 2/7
P(Marital Status=Divorced|No)=1/7
P(Marital Status=Married|No) = 4/7
P(Marital Status=Single|Yes) = 2/7
P(Marital Status=Divorced|Yes)=1/7
P(Marital Status=Married|Yes) = 0
For taxable income:
If class=No: sample mean=110
sample variance=2975
If class=Yes: sample mean=90
sample variance=25
naive Bayes Classifier:
120K)
Income
Married,
No,
Refund
( 


X
 P(X|Class=No) = P(Refund=No|Class=No)
 P(Married| Class=No)
 P(Income=120K| Class=No)
= 4/7  4/7  0.0072 = 0.0024
 P(X|Class=Yes) = P(Refund=No| Class=Yes)
 P(Married| Class=Yes)
 P(Income=120K| Class=Yes)
= 1  0  1.2  10-9 = 0
Since P(X|No)P(No) > P(X|Yes)P(Yes)
Therefore P(No|X) > P(Yes|X)
=> Class = No
Given a Test Record:
Naïve Bayes Classifier
• If one of the conditional probability is zero,
then the entire expression becomes zero
• Probability estimation:
m
N
mp
N
C
A
P
c
N
N
C
A
P
N
N
C
A
P
c
ic
i
c
ic
i
c
ic
i







)
|
(
:
estimate
-
m
1
)
|
(
:
Laplace
)
|
(
:
Original
c: number of classes
p: prior probability
m: parameter
Example of Naïve Bayes Classifier
Name Give Birth Can Fly Live in Water Have Legs Class
human yes no no yes mammals
python no no no no non-mammals
salmon no no yes no non-mammals
whale yes no yes no mammals
frog no no sometimes yes non-mammals
komodo no no no yes non-mammals
bat yes yes no yes mammals
pigeon no yes no yes non-mammals
cat yes no no yes mammals
leopard shark yes no yes no non-mammals
turtle no no sometimes yes non-mammals
penguin no no sometimes yes non-mammals
porcupine yes no no yes mammals
eel no no yes no non-mammals
salamander no no sometimes yes non-mammals
gila monster no no no yes non-mammals
platypus no no no yes mammals
owl no yes no yes non-mammals
dolphin yes no yes no mammals
eagle no yes no yes non-mammals
Give Birth Can Fly Live in Water Have Legs Class
yes no yes no ?
0027
.
0
20
13
004
.
0
)
(
)
|
(
021
.
0
20
7
06
.
0
)
(
)
|
(
0042
.
0
13
4
13
3
13
10
13
1
)
|
(
06
.
0
7
2
7
2
7
6
7
6
)
|
(
















N
P
N
A
P
M
P
M
A
P
N
A
P
M
A
P
A: attributes
M: mammals
N: non-mammals
P(A|M)P(M) >
P(A|N)P(N)
=> Mammals
Naïve Bayes (Summary)
• Robust to isolated noise points
• Handle missing values by ignoring the instance
during probability estimate calculations
• Robust to irrelevant attributes
• Independence assumption may not hold for
some attributes
– Use other techniques such as Bayesian Belief
The Bayes Classifier
• Use Bayes Rule!
• Why did this help? Well, we think that we might be able to
specify how features are “generated” by the class label
Normalization Constant
Likelihood Prior
The Bayes Classifier
• Let’s expand this for our digit recognition task:
• To classify, we’ll simply compute these two probabilities and predict based
on which one is greater
Model Parameters
• For the Bayes classifier, we need to “learn” two functions, the
likelihood and the prior
• How many parameters are required to specify the prior for
our digit recognition example?
Model Parameters
• The problem with explicitly modeling P(X1,…,Xn|Y) is that
there are usually way too many parameters:
– We’ll run out of space
– We’ll run out of time
– And we’ll need tons of training data (which is usually not
available)
The Naïve Bayes Model
• The Naïve Bayes Assumption: Assume that all features are
independent given the class label Y
• Equationally speaking:
• (We will discuss the validity of this assumption later)
Naïve Bayes Training
• Now that we’ve decided to use a Naïve Bayes classifier, we need to train it
with some data:
MNIST Training Data
Naïve Bayes Training
• Training in Naïve Bayes is easy:
– Estimate P(Y=v) as the fraction of records with Y=v
– Estimate P(Xi=u|Y=v) as the fraction of records with Y=v for which Xi=u
Naïve Bayes Training
• In practice, some of these counts can be zero
• Fix this by adding “virtual” counts:
– This is called Smoothing
Naïve Bayes Training
• For binary digits, training amounts to averaging all of the training fives
together and all of the training sixes together.
Naïve Bayes Classification
Another Example of the Naïve Bayes
Classifier
The weather data, with counts and probabilities
outlook temperature humidity windy play
yes no yes no yes no yes no yes no
sunny 2 3 hot 2 2 high 3 4 false 6 2 9 5
overcast 4 0 mild 4 2 normal 6 1 true 3 3
rainy 3 2 cool 3 1
sunny 2/9 3/5 hot 2/9 2/5 high 3/9 4/5 false 6/9 2/5 9/14 5/14
overcast 4/9 0/5 mild 4/9 2/5 normal 6/9 1/5 true 3/9 3/5
rainy 3/9 2/5 cool 3/9 1/5
A new day
outlook temperature humidity windy play
sunny cool high true ?
• Likelihood of yes
• Likelihood of no
• Therefore, the prediction is No
0053
.
0
14
9
9
3
9
3
9
3
9
2






0206
.
0
14
5
5
3
5
4
5
1
5
3






The Naive Bayes Classifier for Data
Sets with Numerical Attribute Values
• One common practice to handle numerical
attribute values is to assume normal
distributions for numerical attributes.
The numeric weather data with summary statistics
outlook temperature humidity windy play
yes no yes no yes no yes no yes no
sunny 2 3 83 85 86 85 false 6 2 9 5
overcast 4 0 70 80 96 90 true 3 3
rainy 3 2 68 65 80 70
64 72 65 95
69 71 70 91
75 80
75 70
72 90
81 75
sunny 2/9 3/5 mean 73 74.6 mean 79.1 86.2 false 6/9 2/5 9/14 5/14
overcast 4/9 0/5 std
dev
6.2 7.9 std
dev
10.2 9.7 true 3/9 3/5
rainy 3/9 2/5
• Let x1, x2, …, xn be the values of a numerical attribute
in the training data set.
 
 
2
2
2
1
)
(
1
1
1
1
2
1


















w
e
w
f
x
n
x
n
n
i
i
n
i
i
• For examples,
• Likelihood of Yes =
• Likelihood of No =
000036
.
0
14
9
9
3
0221
.
0
0340
.
0
9
2





000136
.
0
14
5
5
3
038
.
0
0291
.
0
5
3





 
 
 
 
0340
.
0
2
.
6
2
1
Yes
|
66
e
temperatur
2
2
.
6
2
2
73
66





e
f

Outputting Probabilities
• What’s nice about Naïve Bayes (and generative models in
general) is that it returns probabilities
– These probabilities can tell us how confident the algorithm
is
– So… don’t throw away those probabilities!
Recap
• We defined a Bayes classifier but saw that it’s intractable to
compute P(X1,…,Xn|Y)
• We then used the Naïve Bayes assumption – that everything
is independent given the class label Y
• A natural question: is there some happy compromise where
we only assume that some features are conditionally
independent?
Conclusions
• Naïve Bayes is:
– Really easy to implement and often works well
– Often a good first thing to try
– Commonly used as a “punching bag” for smarter
algorithms

More Related Content

Similar to naive bayes example.pdf

Pattern recognition binoy 05-naive bayes classifier
Pattern recognition binoy 05-naive bayes classifierPattern recognition binoy 05-naive bayes classifier
Pattern recognition binoy 05-naive bayes classifier108kaushik
 
Data classification sammer
Data classification sammer Data classification sammer
Data classification sammer Sammer Qader
 
Cs221 lecture5-fall11
Cs221 lecture5-fall11Cs221 lecture5-fall11
Cs221 lecture5-fall11darwinrlo
 
Introduction to Bayesian Statistics.ppt
Introduction to Bayesian Statistics.pptIntroduction to Bayesian Statistics.ppt
Introduction to Bayesian Statistics.pptLong Dang
 
Naïve Bayes Classifier Algorithm.pptx
Naïve Bayes Classifier Algorithm.pptxNaïve Bayes Classifier Algorithm.pptx
Naïve Bayes Classifier Algorithm.pptxPriyadharshiniG41
 
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์ เอื้อวัฒนามงคล
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์  เอื้อวัฒนามงคลMachine Learning: An introduction โดย รศ.ดร.สุรพงค์  เอื้อวัฒนามงคล
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์ เอื้อวัฒนามงคลBAINIDA
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nbTony Nguyen
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nbDavid Hoen
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nbJames Wong
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nbFraboni Ec
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nbYoung Alista
 
Data Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptxData Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptxSubrata Kumer Paul
 

Similar to naive bayes example.pdf (20)

Pattern recognition binoy 05-naive bayes classifier
Pattern recognition binoy 05-naive bayes classifierPattern recognition binoy 05-naive bayes classifier
Pattern recognition binoy 05-naive bayes classifier
 
ch8Bayes.ppt
ch8Bayes.pptch8Bayes.ppt
ch8Bayes.ppt
 
Data classification sammer
Data classification sammer Data classification sammer
Data classification sammer
 
Cs221 lecture5-fall11
Cs221 lecture5-fall11Cs221 lecture5-fall11
Cs221 lecture5-fall11
 
Introduction to Bayesian Statistics.ppt
Introduction to Bayesian Statistics.pptIntroduction to Bayesian Statistics.ppt
Introduction to Bayesian Statistics.ppt
 
Machine learning mathematicals.pdf
Machine learning mathematicals.pdfMachine learning mathematicals.pdf
Machine learning mathematicals.pdf
 
Naïve Bayes Classifier Algorithm.pptx
Naïve Bayes Classifier Algorithm.pptxNaïve Bayes Classifier Algorithm.pptx
Naïve Bayes Classifier Algorithm.pptx
 
bayesjaw.ppt
bayesjaw.pptbayesjaw.ppt
bayesjaw.ppt
 
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์ เอื้อวัฒนามงคล
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์  เอื้อวัฒนามงคลMachine Learning: An introduction โดย รศ.ดร.สุรพงค์  เอื้อวัฒนามงคล
Machine Learning: An introduction โดย รศ.ดร.สุรพงค์ เอื้อวัฒนามงคล
 
Inferential statistics nominal data
Inferential statistics   nominal dataInferential statistics   nominal data
Inferential statistics nominal data
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nb
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nb
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nb
 
Datamining withnb
Datamining withnbDatamining withnb
Datamining withnb
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nb
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nb
 
Datamining with nb
Datamining with nbDatamining with nb
Datamining with nb
 
Bayes 6
Bayes 6Bayes 6
Bayes 6
 
Data Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptxData Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptx
 
Naive Bayes Presentation
Naive Bayes PresentationNaive Bayes Presentation
Naive Bayes Presentation
 

Recently uploaded

Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 

Recently uploaded (20)

Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 

naive bayes example.pdf

  • 2. Things We’d Like to Do • Spam Classification – Given an email, predict whether it is spam or not • Medical Diagnosis – Given a list of symptoms, predict whether a patient has disease X or not • Weather – Based on temperature, humidity, etc… predict if it will rain tomorrow
  • 3. • The relationship between attribute set and the class variable is non-deterministic. • Even if the attributes are same, the class label may differ in training set even and hence can not be predicted with certainty. • Reason: noisy data, certain other attributes are not included in the data.
  • 4. • Example: Task of predicting whether a person is at risk for heart disease based on the person’s diet and workout frequency. • So need an approach to model probabilistic relationship between attribute set and the class variable.
  • 5. Bayesian Classification • Problem statement: – Given features X1,X2,…,Xn – Predict a label Y
  • 6. Another Application • Digit Recognition • X1,…,Xn  {0,1} (Black vs. White pixels) • Y  {5,6} (predict whether a digit is a 5 or a 6) Classifier 5
  • 7. Bayes Classifier • A probabilistic framework for solving classification problems • Conditional Probability: • Bayes theorem: ) ( ) ( ) | ( ) | ( A P C P C A P A C P  ) ( ) , ( ) | ( ) ( ) , ( ) | ( C P C A P C A P A P C A P A C P  
  • 8. Example of Bayes Theorem • Given: – A doctor knows that Cold causes fever 50% of the time – Prior probability of any patient having cold is 1/50,000 – Prior probability of any patient having fever is 1/20 • If a patient has fever, what’s the probability he/she has cold? 0002 . 0 20 / 1 50000 / 1 5 . 0 ) ( ) ( ) | ( ) | (     F P C P C F P F C P
  • 9. Bayesian Classifiers • Consider each attribute and class label as random variables • Given a record with attributes (A1, A2,…,An) – Goal is to predict class C – Specifically, we want to find the value of C that maximizes P(C| A1, A2,…,An ) • Can we estimate P(C| A1, A2,…,An ) directly from data?
  • 10. Bayesian Classifiers • Approach: – compute the posterior probability P(C | A1, A2, …, An) for all values of C using the Bayes theorem – Choose value of C that maximizes P(C | A1, A2, …, An) – Equivalent to choosing value of C that maximizes P(A1, A2, …, An|C) P(C) • How to estimate P(A1, A2, …, An | C )? ) ( ) ( ) | ( ) | ( 2 1 2 1 2 1 n n n A A A P C P C A A A P A A A C P    
  • 11. Naïve Bayes Classifier • Assume independence among attributes Ai when class is given: – P(A1, A2, …, An |C) = P(A1| Cj) P(A2| Cj)… P(An| Cj) – Can estimate P(Ai| Cj) for all Ai and Cj. – New point is classified to Cj if P(Cj)  P(Ai| Cj) is maximum.
  • 12. How to Estimate Probabilities from Data? • Class: P(C) = Nc/N – e.g., P(No) = 7/10, P(Yes) = 3/10 • For discrete attributes: P(Ai | Ck) = |Aik|/ Nc – where |Aik| is number of instances having attribute Ai and belongs to class Ck – Examples: P(Status=Married|No) = 4/7 P(Refund=Yes|Yes)=0 k Tid Refund Marital Status Taxable Income Evade 1 Yes Single 125K No 2 No Married 100K No 3 No Single 70K No 4 Yes Married 120K No 5 No Divorced 95K Yes 6 No Married 60K No 7 Yes Divorced 220K No 8 No Single 85K Yes 9 No Married 75K No 10 No Single 90K Yes 10 categorical categorical continuous class
  • 13. How to Estimate Probabilities from Data? • For continuous attributes: – Discretize the range into bins • one ordinal attribute per bin • violates independence assumption – Two-way split: (A < v) or (A > v) • choose only one of the two splits as new attribute – Probability density estimation: • Assume attribute follows a normal distribution • Use data to estimate parameters of distribution (e.g., mean and standard deviation) • Once probability distribution is known, can use it to estimate the conditional probability P(Ai|c) k
  • 14. How to Estimate Probabilities from Data? • Normal distribution: – One for each (Ai,ci) pair • For (Income, Class=No): – If Class=No • sample mean = 110 • sample variance = 2975 Tid Refund Marital Status Taxable Income Evade 1 Yes Single 125K No 2 No Married 100K No 3 No Single 70K No 4 Yes Married 120K No 5 No Divorced 95K Yes 6 No Married 60K No 7 Yes Divorced 220K No 8 No Single 85K Yes 9 No Married 75K No 10 No Single 90K Yes 10 categorical categorical continuous class 2 2 2 ) ( 2 2 1 ) | ( ij ij i A ij j i e c A P       0072 . 0 ) 54 . 54 ( 2 1 ) | 120 ( ) 2975 ( 2 ) 110 120 ( 2      e No Income P 
  • 15. Example of Naïve Bayes Classifier P(Refund=Yes|No) = 3/7 P(Refund=No|No) = 4/7 P(Refund=Yes|Yes) = 0 P(Refund=No|Yes) = 1 P(Marital Status=Single|No) = 2/7 P(Marital Status=Divorced|No)=1/7 P(Marital Status=Married|No) = 4/7 P(Marital Status=Single|Yes) = 2/7 P(Marital Status=Divorced|Yes)=1/7 P(Marital Status=Married|Yes) = 0 For taxable income: If class=No: sample mean=110 sample variance=2975 If class=Yes: sample mean=90 sample variance=25 naive Bayes Classifier: 120K) Income Married, No, Refund (    X  P(X|Class=No) = P(Refund=No|Class=No)  P(Married| Class=No)  P(Income=120K| Class=No) = 4/7  4/7  0.0072 = 0.0024  P(X|Class=Yes) = P(Refund=No| Class=Yes)  P(Married| Class=Yes)  P(Income=120K| Class=Yes) = 1  0  1.2  10-9 = 0 Since P(X|No)P(No) > P(X|Yes)P(Yes) Therefore P(No|X) > P(Yes|X) => Class = No Given a Test Record:
  • 16. Naïve Bayes Classifier • If one of the conditional probability is zero, then the entire expression becomes zero • Probability estimation: m N mp N C A P c N N C A P N N C A P c ic i c ic i c ic i        ) | ( : estimate - m 1 ) | ( : Laplace ) | ( : Original c: number of classes p: prior probability m: parameter
  • 17. Example of Naïve Bayes Classifier Name Give Birth Can Fly Live in Water Have Legs Class human yes no no yes mammals python no no no no non-mammals salmon no no yes no non-mammals whale yes no yes no mammals frog no no sometimes yes non-mammals komodo no no no yes non-mammals bat yes yes no yes mammals pigeon no yes no yes non-mammals cat yes no no yes mammals leopard shark yes no yes no non-mammals turtle no no sometimes yes non-mammals penguin no no sometimes yes non-mammals porcupine yes no no yes mammals eel no no yes no non-mammals salamander no no sometimes yes non-mammals gila monster no no no yes non-mammals platypus no no no yes mammals owl no yes no yes non-mammals dolphin yes no yes no mammals eagle no yes no yes non-mammals Give Birth Can Fly Live in Water Have Legs Class yes no yes no ? 0027 . 0 20 13 004 . 0 ) ( ) | ( 021 . 0 20 7 06 . 0 ) ( ) | ( 0042 . 0 13 4 13 3 13 10 13 1 ) | ( 06 . 0 7 2 7 2 7 6 7 6 ) | (                 N P N A P M P M A P N A P M A P A: attributes M: mammals N: non-mammals P(A|M)P(M) > P(A|N)P(N) => Mammals
  • 18. Naïve Bayes (Summary) • Robust to isolated noise points • Handle missing values by ignoring the instance during probability estimate calculations • Robust to irrelevant attributes • Independence assumption may not hold for some attributes – Use other techniques such as Bayesian Belief
  • 19. The Bayes Classifier • Use Bayes Rule! • Why did this help? Well, we think that we might be able to specify how features are “generated” by the class label Normalization Constant Likelihood Prior
  • 20. The Bayes Classifier • Let’s expand this for our digit recognition task: • To classify, we’ll simply compute these two probabilities and predict based on which one is greater
  • 21. Model Parameters • For the Bayes classifier, we need to “learn” two functions, the likelihood and the prior • How many parameters are required to specify the prior for our digit recognition example?
  • 22. Model Parameters • The problem with explicitly modeling P(X1,…,Xn|Y) is that there are usually way too many parameters: – We’ll run out of space – We’ll run out of time – And we’ll need tons of training data (which is usually not available)
  • 23. The Naïve Bayes Model • The Naïve Bayes Assumption: Assume that all features are independent given the class label Y • Equationally speaking: • (We will discuss the validity of this assumption later)
  • 24. Naïve Bayes Training • Now that we’ve decided to use a Naïve Bayes classifier, we need to train it with some data: MNIST Training Data
  • 25. Naïve Bayes Training • Training in Naïve Bayes is easy: – Estimate P(Y=v) as the fraction of records with Y=v – Estimate P(Xi=u|Y=v) as the fraction of records with Y=v for which Xi=u
  • 26. Naïve Bayes Training • In practice, some of these counts can be zero • Fix this by adding “virtual” counts: – This is called Smoothing
  • 27. Naïve Bayes Training • For binary digits, training amounts to averaging all of the training fives together and all of the training sixes together.
  • 29. Another Example of the Naïve Bayes Classifier The weather data, with counts and probabilities outlook temperature humidity windy play yes no yes no yes no yes no yes no sunny 2 3 hot 2 2 high 3 4 false 6 2 9 5 overcast 4 0 mild 4 2 normal 6 1 true 3 3 rainy 3 2 cool 3 1 sunny 2/9 3/5 hot 2/9 2/5 high 3/9 4/5 false 6/9 2/5 9/14 5/14 overcast 4/9 0/5 mild 4/9 2/5 normal 6/9 1/5 true 3/9 3/5 rainy 3/9 2/5 cool 3/9 1/5 A new day outlook temperature humidity windy play sunny cool high true ?
  • 30. • Likelihood of yes • Likelihood of no • Therefore, the prediction is No 0053 . 0 14 9 9 3 9 3 9 3 9 2       0206 . 0 14 5 5 3 5 4 5 1 5 3      
  • 31. The Naive Bayes Classifier for Data Sets with Numerical Attribute Values • One common practice to handle numerical attribute values is to assume normal distributions for numerical attributes.
  • 32. The numeric weather data with summary statistics outlook temperature humidity windy play yes no yes no yes no yes no yes no sunny 2 3 83 85 86 85 false 6 2 9 5 overcast 4 0 70 80 96 90 true 3 3 rainy 3 2 68 65 80 70 64 72 65 95 69 71 70 91 75 80 75 70 72 90 81 75 sunny 2/9 3/5 mean 73 74.6 mean 79.1 86.2 false 6/9 2/5 9/14 5/14 overcast 4/9 0/5 std dev 6.2 7.9 std dev 10.2 9.7 true 3/9 3/5 rainy 3/9 2/5
  • 33. • Let x1, x2, …, xn be the values of a numerical attribute in the training data set.     2 2 2 1 ) ( 1 1 1 1 2 1                   w e w f x n x n n i i n i i
  • 34. • For examples, • Likelihood of Yes = • Likelihood of No = 000036 . 0 14 9 9 3 0221 . 0 0340 . 0 9 2      000136 . 0 14 5 5 3 038 . 0 0291 . 0 5 3              0340 . 0 2 . 6 2 1 Yes | 66 e temperatur 2 2 . 6 2 2 73 66      e f 
  • 35. Outputting Probabilities • What’s nice about Naïve Bayes (and generative models in general) is that it returns probabilities – These probabilities can tell us how confident the algorithm is – So… don’t throw away those probabilities!
  • 36. Recap • We defined a Bayes classifier but saw that it’s intractable to compute P(X1,…,Xn|Y) • We then used the Naïve Bayes assumption – that everything is independent given the class label Y • A natural question: is there some happy compromise where we only assume that some features are conditionally independent?
  • 37. Conclusions • Naïve Bayes is: – Really easy to implement and often works well – Often a good first thing to try – Commonly used as a “punching bag” for smarter algorithms