Naïve Bayes Classifier
Dr. Binoy B Nair
Algorithm
• A Naive Bayesian model is easy to build, with no complicated iterative parameter estimation, which makes it particularly useful for very large datasets.
• Despite its simplicity, the Naive Bayesian classifier often does surprisingly well and is widely used.
Assume there are n features in the dataset; then X = {x1, x2, …, xn}.
Naïve Bayes - Details
• Bayes classification:
  $P(C \mid \mathbf{X}) \propto P(\mathbf{X} \mid C)\,P(C) = P(X_1, \dots, X_n \mid C)\,P(C)$
  Difficulty: learning the joint probability $P(X_1, \dots, X_n \mid C)$
• Naïve Bayes classification: assume that all input features are conditionally
  independent given the class!
  $P(X_1, X_2, \dots, X_n \mid C) = P(X_1 \mid C)\,P(X_2 \mid C) \cdots P(X_n \mid C)$
Naïve Bayes
• NB classification rule:
• For a given $X = (x_1, x_2, x_3, \dots, x_n)$ and $L$ classes $c_1, c_2, \dots, c_L$, the
vector $X$ is assigned to class $c^*$ when:
$[P(x_1 \mid c^*) \cdots P(x_n \mid c^*)]\,P(c^*) > [P(x_1 \mid c) \cdots P(x_n \mid c)]\,P(c), \quad c \ne c^*, \; c \in \{c_1, \dots, c_L\}$
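As an illustration of this decision rule, here is a minimal Python sketch (not part of the original slides) of the generic MAP step, assuming the priors and per-feature likelihoods have already been estimated; the names `prior` and `likelihood` are hypothetical placeholders.

```python
import math

def nb_predict(x, classes, prior, likelihood):
    """Assign x to the class c* that maximizes P(c) * prod_j P(x_j | c).

    prior:      dict mapping class -> P(c)
    likelihood: callable (j, x_j, c) -> P(x_j | c)   (hypothetical interface)
    """
    def log_score(c):
        # Work in log space so the product of many small factors does not underflow.
        return math.log(prior[c]) + sum(math.log(likelihood(j, xj, c))
                                        for j, xj in enumerate(x))
    return max(classes, key=log_score)
```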
Naïve Bayes
• Algorithm: Continuous-valued Features
– Conditional probability is often modeled with the normal distribution:
  $\hat{P}(X_j \mid C = c_i) = \dfrac{1}{\sqrt{2\pi}\,\sigma_{ji}} \exp\!\left(-\dfrac{(X_j - \mu_{ji})^2}{2\sigma_{ji}^2}\right)$
  $\mu_{ji}$: mean (average) of the values of feature $X_j$ over examples for which $C = c_i$
  $\sigma_{ji}$: standard deviation of the values of feature $X_j$ over examples for which $C = c_i$
– Learning Phase: for $\mathbf{X} = (X_1, \dots, X_n)$ and $C = c_1, \dots, c_L$,
  Output: $n \times L$ normal distributions and the priors $P(C = c_i)$, $i = 1, \dots, L$
– Test Phase: Given an unknown instance $\mathbf{X}' = (a_1, \dots, a_n)$,
  • Instead of looking up tables, calculate the conditional probabilities with all the
    normal distributions obtained in the learning phase
  • Apply the MAP rule to make a decision
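To make the two phases concrete, the following is a minimal Python sketch (illustrative, not code from the slides): the learning phase estimates a mean and standard deviation per class and feature plus the class priors, and the test phase evaluates the normal densities for an unseen instance and applies the MAP rule.

```python
import math
from collections import defaultdict

def gaussian_pdf(x, mu, sigma):
    """Normal density, used as the conditional likelihood P(X_j = x | C = c_i)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def learn(samples):
    """samples: list of (feature_vector, class_label).
    Returns priors P(c) and, per class, a list of (mu, sigma) pairs, one per feature."""
    by_class = defaultdict(list)
    for x, c in samples:
        by_class[c].append(x)
    priors, params = {}, {}
    for c, rows in by_class.items():
        priors[c] = len(rows) / len(samples)
        params[c] = []
        for j in range(len(rows[0])):
            vals = [r[j] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / (len(vals) - 1)  # sample variance
            params[c].append((mu, math.sqrt(var)))
    return priors, params

def predict(x, priors, params):
    """MAP rule: pick the class maximizing prior * product of Gaussian likelihoods."""
    def numerator(c):
        p = priors[c]
        for xj, (mu, sigma) in zip(x, params[c]):
            p *= gaussian_pdf(xj, mu, sigma)
        return p
    return max(priors, key=numerator)
```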
Example 3 - Naïve Bayes Classifier with Continuous Attributes
• Problem: classify
whether a given
person is a male or a
female based on the
measured features.
The features include:
height, weight, and
foot size.
Training
Example training set below.
Sex (o/p class)   Height (ft)   Weight (lbs)   Foot size (inches)
male              6             180            12
male              5.92          190            11
male              5.58          170            12
male              5.92          165            10
female            5             100            6
female            5.5           150            8
female            5.42          130            7
female            5.75          150            9
Example 3
• Solution
• Phase 1: Training
• The classifier created from the training set using a Gaussian distribution assumption would be:
sex      mean (height)   variance (height)   mean (weight)   variance (weight)   mean (foot size)   variance (foot size)
male     5.855           3.50E-02            176.25          1.23E+02            11.25              9.17E-01
female   5.4175          9.72E-02            132.5           5.58E+02            7.5                1.67E+00
We have equiprobable classes from the dataset, so P(male)= P(female) = 0.5.
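These means and variances can be reproduced with a short Python check (a sketch based on the eight training rows above; using the sample variance, with n − 1 in the denominator, matches the values in the table):

```python
# Example 3 training data: (height_ft, weight_lbs, foot_size_in) per class
data = {
    "male":   [(6.00, 180, 12), (5.92, 190, 11), (5.58, 170, 12), (5.92, 165, 10)],
    "female": [(5.00, 100,  6), (5.50, 150,  8), (5.42, 130,  7), (5.75, 150,  9)],
}

for sex, rows in data.items():
    for j, name in enumerate(["height", "weight", "foot size"]):
        vals = [r[j] for r in rows]
        mu = sum(vals) / len(vals)
        var = sum((v - mu) ** 2 for v in vals) / (len(vals) - 1)  # sample variance
        print(f"{sex:6s} {name:9s} mean={mu:.6g} var={var:.3g}")
# male:   height 5.855 / 0.035,   weight 176.25 / 123,  foot size 11.25 / 0.917
# female: height 5.4175 / 0.0972, weight 132.5 / 558,   foot size 7.5 / 1.67
```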
Example 3
• Phase 2: Testing
• Below is a sample X to be classified as a male or female.
sex           height (ft)   weight (lbs)   foot size (inches)
to identify   6             130            8
Solution:
X={6,130,8}
Given this information, we wish to determine which is greater, p(male|X) or p(female|X).
p(male|X) = P(male)*P(height|male)*P(weight|male)*P(foot size|male) / evidence
p(female|X) = P(female)*P(height|female)*P(weight|female)*P(foot size|female) / evidence
Example 3
• The evidence (also termed normalizing constant) may be calculated
since the sum of the posteriors equals one.
• evidence = P(male)*P(height|male)*P(weight|male)*P(foot size|male) +
P(female)*P(height|female)*P(weight|female)*P(foot size|female)
• The evidence may be ignored when comparing the classes, since it is a positive constant and is the
same for both classes. (Normal densities are always positive.)
Example 3
• We now determine the sex of the sample.
• P(male) = 0.5
• P(height|male) = 1.5789 (A probability density greater than 1 is fine; it is the total area under
the bell curve that must equal 1.)
• P(weight|male) = 5.9881e-06
• P(foot size|male) = 1.3112e-3
• numerator of p(male|X) = their product = 6.1984e-09
Example 3
• P(female) = 0.5
• P(height|female) = 2.2346e-1
• P(weight|female) = 1.6789e-2
• P(foot size|female) = 2.8669e-1
• numerator of p(female|X) = their product = 5.3778e-04
Result:
Since the posterior numerator of p(female|X) is greater than that of p(male|X), the sample is classified as
female.
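For reference, a short Python check of these numbers (a sketch; the Gaussian densities use the means and variances from the training table above):

```python
import math

def gauss(x, mu, var):
    """Normal density with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

x = (6, 130, 8)  # height (ft), weight (lbs), foot size (inches)
male   = [(5.855, 0.035033), (176.25, 122.9167), (11.25, 0.916667)]
female = [(5.4175, 0.097225), (132.5, 558.3333), (7.5, 1.666667)]

num_male   = 0.5 * math.prod(gauss(v, mu, var) for v, (mu, var) in zip(x, male))
num_female = 0.5 * math.prod(gauss(v, mu, var) for v, (mu, var) in zip(x, female))
print(f"{num_male:.4e}  {num_female:.4e}")  # ~6.2e-09 vs ~5.4e-04 -> classify as female
```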
Naïve Bayes
• Algorithm: Discrete-Valued Features
– Learning Phase: Given a training set S,
  For each target value $c_i$ ($c_i = c_1, \dots, c_L$):
    $\hat{P}(C = c_i) \leftarrow$ estimate $P(C = c_i)$ with examples in S;
  For every feature value $x_{jk}$ of each feature $X_j$ ($j = 1, \dots, n$; $k = 1, \dots, N_j$):
    $\hat{P}(X_j = x_{jk} \mid C = c_i) \leftarrow$ estimate $P(X_j = x_{jk} \mid C = c_i)$ with examples in S;
  Output: conditional probability tables; for $X_j$, $N_j \times L$ elements
– Test Phase: Given an unknown instance $\mathbf{X}' = (a_1, \dots, a_n)$,
  Look up the tables to assign the label $c^*$ to $\mathbf{X}'$ if
  $[\hat{P}(a_1 \mid c^*) \cdots \hat{P}(a_n \mid c^*)]\,\hat{P}(c^*) > [\hat{P}(a_1 \mid c) \cdots \hat{P}(a_n \mid c)]\,\hat{P}(c), \quad c \ne c^*, \; c \in \{c_1, \dots, c_L\}$
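A minimal Python sketch of both phases for discrete features (illustrative only; feature values are assumed to be hashable, and the estimates are the plain relative-frequency counts described above, without smoothing):

```python
from collections import Counter, defaultdict

def learn(S):
    """S: list of (feature_tuple, class_label).
    Returns priors, conditional probability tables, and per-class counts."""
    class_counts = Counter(c for _, c in S)
    priors = {c: n / len(S) for c, n in class_counts.items()}
    cpt = defaultdict(lambda: defaultdict(Counter))  # cpt[c][j][value] -> count
    for x, c in S:
        for j, v in enumerate(x):
            cpt[c][j][v] += 1
    return priors, cpt, class_counts

def classify(x, priors, cpt, class_counts):
    """Table look-up and MAP rule: pick c* maximizing P(c) * prod_j P(x_j | c)."""
    def score(c):
        p = priors[c]
        for j, v in enumerate(x):
            p *= cpt[c][j][v] / class_counts[c]
        return p
    return max(priors, key=score)
```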
Example
• Example: Play Tennis
Given a new instance, predict its label
x’=(Outlook=Sunny, Temperature=Cool,
Humidity=High, Wind=Strong)
Example
• Learning Phase
Outlook Play=Yes Play=No
Sunny 2/9 3/5
Overcast 4/9 0/5
Rain 3/9 2/5
Temperature Play=Yes Play=No
Hot 2/9 2/5
Mild 4/9 2/5
Cool 3/9 1/5
Humidity Play=Yes Play=No
High 3/9 4/5
Normal 6/9 1/5
Wind Play=Yes Play=No
Strong 3/9 3/5
Weak 6/9 2/5
P(Play=Yes) = 9/14 P(Play=No) = 5/14
We have four variables; for each of them we calculate the conditional probability table.
Example
• Test Phase
– Given a new instance, predict its label
x’=(Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
– Look up the tables obtained in the learning phase
– Decision making with the MAP rule
P(Outlook=Sunny|Play=No) = 3/5
P(Temperature=Cool|Play=No) = 1/5
P(Humidity=High|Play=No) = 4/5
P(Wind=Strong|Play=No) = 3/5
P(Play=No) = 5/14
P(Outlook=Sunny|Play=Yes) = 2/9
P(Temperature=Cool|Play=Yes) = 3/9
P(Humidity=High|Play=Yes) = 3/9
P(Wind=Strong|Play=Yes) = 3/9
P(Play=Yes) = 9/14
P(Yes|x’): [P(Sunny|Yes)P(Cool|Yes)P(High|Yes)P(Strong|Yes)]P(Play=Yes) = 0.0053
P(No|x’): [P(Sunny|No) P(Cool|No)P(High|No)P(Strong|No)]P(Play=No) = 0.0206
Since P(Yes|x’) < P(No|x’), we label x’ as “No”.
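The two products can be checked in a couple of lines of Python (a sketch using the table values from the learning phase above):

```python
# x' = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
p_yes = (2/9) * (3/9) * (3/9) * (3/9) * (9/14)  # [P(Sunny|Yes)P(Cool|Yes)P(High|Yes)P(Strong|Yes)]P(Yes)
p_no  = (3/5) * (1/5) * (4/5) * (3/5) * (5/14)  # [P(Sunny|No)P(Cool|No)P(High|No)P(Strong|No)]P(No)
print(round(p_yes, 4), round(p_no, 4))          # 0.0053 0.0206 -> predict "No"
```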
Example 2: Training dataset
age income student credit_rating buys_computer
<=30 high no fair no
<=30 high no excellent no
31…40 high no fair yes
>40 medium no fair yes
>40 low yes fair yes
>40 low yes excellent no
31…40 low yes excellent yes
<=30 medium no fair no
<=30 low yes fair yes
>40 medium yes fair yes
<=30 medium yes excellent yes
31…40 medium no excellent yes
31…40 high yes fair yes
>40 medium no excellent no
Class:
C1:buys_computer=‘yes’
C2:buys_computer=‘no’
Data sample:
X =
(age<=30,
Income=medium,
Student=yes,
Credit_rating=Fair)
Naïve Bayesian Classifier: Example 2
• Compute P(X|Ci) for each class
P(age=“<30” | buys_computer=“yes”) = 2/9=0.222
P(age=“<30” | buys_computer=“no”) = 3/5 =0.6
P(income=“medium” | buys_computer=“yes”)= 4/9 =0.444
P(income=“medium” | buys_computer=“no”) = 2/5 = 0.4
P(student=“yes” | buys_computer=“yes”)= 6/9 =0.667
P(student=“yes” | buys_computer=“no”)= 1/5=0.2
P(credit_rating=“fair” | buys_computer=“yes”)=6/9=0.667
P(credit_rating=“fair” | buys_computer=“no”)=2/5=0.4
• X=(age<=30 ,income =medium, student=yes,credit_rating=fair)
P(X|Ci) : P(X|buys_computer=“yes”)= 0.222 x 0.444 x 0.667 x 0.667 = 0.044
P(X|buys_computer=“no”)= 0.6 x 0.4 x 0.2 x 0.4 =0.019
P(X|Ci)*P(Ci ) : P(X|buys_computer=“yes”) * P(buys_computer=“yes”)=0.028
P(X|buys_computer=“no”) * P(buys_computer=“no”)=0.007
→ X belongs to class “buys_computer=yes”
P(buys_computer=“yes“)=9/14
P(buys_computer=“no“)=5/14
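These products can likewise be verified in a few lines of Python (a sketch using the conditional probabilities computed above):

```python
# X = (age<=30, income=medium, student=yes, credit_rating=fair)
p_x_yes = (2/9) * (4/9) * (6/9) * (6/9)  # P(X|buys_computer=yes) ~ 0.044
p_x_no  = (3/5) * (2/5) * (1/5) * (2/5)  # P(X|buys_computer=no)  ~ 0.019
print(round(p_x_yes * 9/14, 3), round(p_x_no * 5/14, 3))  # 0.028 0.007 -> buys_computer = yes
```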
Summary
• Naïve Bayes: the conditional independence assumption
• Training is very easy and fast; it just requires considering each attribute
in each class separately
• Testing is straightforward; just look up tables or calculate conditional
probabilities with the estimated distributions
• A popular generative model
• Performance is competitive with most state-of-the-art classifiers, even when the
independence assumption is violated
• Many successful applications, e.g., spam mail filtering
• A good candidate for a base learner in ensemble learning
• Apart from classification, naïve Bayes can do more…
Thank You