Bayesian Classification
Thomas Bayes (1701 – 7 April 1761) was an English
statistician, philosopher and Presbyterian minister
who is known for having formulated a specific case of
the theorem that bears his name: Bayes' theorem.
Bayes never published what would eventually
become his most famous accomplishment; his notes
were edited and published after his death by Richard
Price.
Thomas Bayes
Slides by Manu Chandel, IIT Roorkee 1
Bayes Theorem
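Total Probability
$P(A) = \sum_{i=1}^{N} P(E_i)\,P(A \mid E_i)$
Bayes Theorem
$P(E_i \mid A) = \dfrac{P(E_i)\,P(A \mid E_i)}{\sum_{k=1}^{N} P(E_k)\,P(A \mid E_k)}$
[Figure: events E1, E2, E3, …, EN and the outcome A]
1. A is an outcome that can result from
any of the events E1, E2, …, EN.
2. The events E1, E2, E3, …, EN are
mutually exclusive and exhaustive.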
Bayes Theorem Example
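Q. Given two bags, A and B, each containing red and white balls.
Both bags have an equal chance of being chosen.
If a ball is picked at random and found to be red,
what is the probability that the ball was chosen from bag A?
Ans. Total probability of a red ball:
$P(\text{Red}) = P(A)\,P(\text{Red} \mid A) + P(B)\,P(\text{Red} \mid B) = \tfrac{1}{2}\,P(\text{Red} \mid A) + \tfrac{1}{2}\,P(\text{Red} \mid B)$
Probability that the red ball was from bag A:
$P(A \mid \text{Red}) = \dfrac{P(A)\,P(\text{Red} \mid A)}{P(\text{Red})}$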
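The two steps of this answer (total probability, then Bayes' theorem) can be sketched in a few lines of Python. The bag contents used here are hypothetical, chosen only for illustration (bag A: 3 red and 2 white, bag B: 1 red and 4 white); they are assumptions, not the numbers from the slide.

```python
from fractions import Fraction

# Hypothetical bag contents (assumed, NOT from the slides): (red, white) counts.
bags = {"A": (3, 2), "B": (1, 4)}

# Both bags are equally likely to be chosen, as stated in the question.
prior = {name: Fraction(1, len(bags)) for name in bags}

# Likelihood of drawing a red ball from each bag.
p_red_given_bag = {
    name: Fraction(red, red + white) for name, (red, white) in bags.items()
}

# Total probability of drawing a red ball.
p_red = sum(prior[b] * p_red_given_bag[b] for b in bags)

# Bayes' theorem: probability that the red ball came from bag A.
p_a_given_red = prior["A"] * p_red_given_bag["A"] / p_red

print(p_red, p_a_given_red)
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with a hand calculation.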
Discriminative vs. Generative classifiers
For a prediction function $Y = f(X)$:
Discriminative classifiers
estimate $P(Y \mid X)$ directly from
the training data.
Generative classifiers estimate
$P(X \mid Y)$ and $P(Y)$ directly
from the training data.
Naïve Bayes Classifier is a generative classifier.
MAP Classification Rule
The Maximum A Posteriori (MAP) rule says:
"Jiski lathi uski bhains" (a Hindi proverb: "whoever holds the stick owns the buffalo")
Input data belongs to the class whose posterior probability $P(Y \mid X)$ is highest.
Example:
Suppose a news article is to be classified into one of the following three categories: a) Politics, b) Finance and
c) Sports.
So X is our news article, and the three categories are denoted by Y1, Y2 and Y3.
Let's say $P(Y_2 \mid X)$ is larger than both $P(Y_1 \mid X)$ and $P(Y_3 \mid X)$;
then according to the MAP classification rule the news article will be classified into category 2, i.e. Finance.
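As a minimal sketch of the MAP rule, the snippet below picks the class with the largest posterior. The posterior values are hypothetical (assumed for illustration only), and `map_classify` is an illustrative helper name, not something defined in the slides.

```python
# Hypothetical posteriors P(Y_k | X) for one news article (assumed values).
posteriors = {"Politics": 0.2, "Finance": 0.5, "Sports": 0.3}


def map_classify(posteriors):
    """Return the class label whose posterior probability is highest."""
    return max(posteriors, key=posteriors.get)


predicted = map_classify(posteriors)
print(predicted)
```

Because only the largest value matters, the same function works on un-normalized scores, which is what Naïve Bayes computes in practice.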
Naïve Bayes (Discrete values)
An input to the classifier is often a feature vector containing various feature values,
e.g. a news article given to a news article classifier may be represented as a vector of words.
In Bayes classification we need to learn $P(X \mid Y)$ and $P(Y)$ from the given data.
Here $X$ is the feature vector with $x_1, x_2, \ldots, x_n$ as feature values.
Learning the joint probability $P(x_1, x_2, \ldots, x_n \mid Y)$
is difficult. Hence Naïve Bayes
assumes that the features are independent of each other given the class. Assuming
independence of features leads to
$P(x_1, x_2, \ldots, x_n \mid Y) = \prod_{i=1}^{n} P(x_i \mid Y)$
Naïve Bayes Algorithm (with Example)
The learning phase of Naïve Bayes is illustrated with an example.
The classifier needs to learn $P(x_i \mid Y)$ and $P(Y)$ for all Y.

Sr  Year  Height   Pocket Money  Grade    Single
1   1     Average  Low           High     Yes
2   2     Tall     Average       Low      No
3   3     Short    High          High     No
4   4     Average  Average       Low      No
5   2     Tall     High          Low      Yes
6   3     Tall     Low           High     No
7   3     Average  High          Average  Yes
8   1     Tall     Average       Average  Yes
9   4     Short    Average       High     Yes

Data collected anonymously from B.Tech students at IIT Roorkee.
Naïve Bayes (Learning Phase )
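Year     P(Year | Single = Yes)    P(Year | Single = No)
1        2/5                       0
2        1/5                       1/4
3        1/5                       2/4
4        1/5                       1/4

Height   P(Height | Single = Yes)  P(Height | Single = No)
Tall     2/5                       2/4
Short    1/5                       1/4
Average  2/5                       1/4

PM       P(PM | Single = Yes)      P(PM | Single = No)
High     2/5                       1/4
Low      1/5                       1/4
Average  2/5                       2/4

Grade    P(Grade | Single = Yes)   P(Grade | Single = No)
High     2/5                       2/4
Low      1/5                       2/4
Average  2/5                       0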
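The relative-frequency estimates in these tables can be reproduced by counting over the nine training rows. The following is a minimal sketch; the names `learn`, `cond` and `features` are illustrative choices, not part of the slides.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Training data from the table: (Year, Height, Pocket Money, Grade, Single).
rows = [
    (1, "Average", "Low", "High", "Yes"),
    (2, "Tall", "Average", "Low", "No"),
    (3, "Short", "High", "High", "No"),
    (4, "Average", "Average", "Low", "No"),
    (2, "Tall", "High", "Low", "Yes"),
    (3, "Tall", "Low", "High", "No"),
    (3, "Average", "High", "Average", "Yes"),
    (1, "Tall", "Average", "Average", "Yes"),
    (4, "Short", "Average", "High", "Yes"),
]
features = ["Year", "Height", "PM", "Grade"]


def learn(rows):
    """Estimate P(Y) and P(x_i | Y) by relative frequency."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (feature, class) -> value counts
    for *values, label in rows:
        for name, value in zip(features, values):
            value_counts[(name, label)][value] += 1
    priors = {c: Fraction(n, len(rows)) for c, n in class_counts.items()}

    def cond(name, value, label):
        """Return the estimate of P(name = value | Single = label)."""
        return Fraction(value_counts[(name, label)][value], class_counts[label])

    return priors, cond


priors, cond = learn(rows)
print(priors["Yes"], cond("Year", 1, "Yes"), cond("Grade", "Average", "No"))
```

Note that `Fraction` reduces values automatically, so 2/4 in the tables appears as 1/2 here.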
Naïve Bayes (Testing Phase)
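What will be the outcome for X = <4, Tall, Average, High>?
$P(\text{Single}=\text{Yes} \mid X) \propto P(\text{Year}=4 \mid \text{Yes}) \cdot P(\text{Height}=\text{Tall} \mid \text{Yes}) \cdot P(\text{PM}=\text{Average} \mid \text{Yes}) \cdot P(\text{Grade}=\text{High} \mid \text{Yes}) \cdot P(\text{Single}=\text{Yes})$
= 1/5 * 2/5 * 2/5 * 2/5 * 5/9
= 0.00711
$P(\text{Single}=\text{No} \mid X) \propto P(\text{Year}=4 \mid \text{No}) \cdot P(\text{Height}=\text{Tall} \mid \text{No}) \cdot P(\text{PM}=\text{Average} \mid \text{No}) \cdot P(\text{Grade}=\text{High} \mid \text{No}) \cdot P(\text{Single}=\text{No})$
= 1/4 * 2/4 * 2/4 * 2/4 * 4/9
= 0.0139
Since 0.0139 > 0.00711, X is classified as Single = No.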
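The test-phase arithmetic can be checked with exact fractions. This sketch hard-codes the conditional probabilities read from the learning-phase tables for X = <4, Tall, Average, High>; the helper name `score` is an illustrative choice.

```python
from fractions import Fraction as F

# P(Year=4|Y), P(Height=Tall|Y), P(PM=Average|Y), P(Grade=High|Y) per class,
# taken from the learning-phase tables.
likelihoods = {
    "Yes": [F(1, 5), F(2, 5), F(2, 5), F(2, 5)],
    "No": [F(1, 4), F(2, 4), F(2, 4), F(2, 4)],
}
priors = {"Yes": F(5, 9), "No": F(4, 9)}


def score(label):
    """Un-normalized posterior: prior times the product of the likelihoods."""
    result = priors[label]
    for p in likelihoods[label]:
        result *= p
    return result


scores = {label: score(label) for label in priors}
prediction = max(scores, key=scores.get)

# Dividing by the sum of the scores turns them into posterior probabilities.
posterior_no = scores["No"] / sum(scores.values())

print({k: float(v) for k, v in scores.items()}, prediction, float(posterior_no))
```

The score for Single = No is exactly 1/72, which is about 0.01389, and the score for Single = Yes is 8/1125, which is about 0.00711, so the MAP decision is Single = No.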
Naïve Bayes (Continuous Values )
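Conditional probability is often modeled with the normal distribution:
$P(x_i \mid Y_k) = \dfrac{1}{\sqrt{2\pi}\,\sigma_{ik}} \exp\!\left(-\dfrac{(x_i - \mu_{ik})^2}{2\sigma_{ik}^2}\right)$
$\mu_{ik}$ = mean of the values of feature $x_i$ over the examples with $Y = Y_k$
$\sigma_{ik}$ = standard deviation of the values of feature $x_i$ over the examples with $Y = Y_k$
Learning Phase
For $X = (x_1, x_2, \ldots, x_n)$ and $Y = Y_1, Y_2, \ldots, Y_L$, output $n \times L$ normal distributions.
Test Phase
Given an unknown instance $X' = (x'_1, x'_2, \ldots, x'_n)$:
• Instead of looking up tables, calculate the conditional probabilities with the normal distributions obtained in
the learning phase.
• Apply the MAP rule to make a decision.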
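A minimal sketch of the test phase for continuous features follows. The function names (`gaussian_pdf`, `map_predict`) and the one-feature toy parameters are assumptions made for illustration; they are not taken from the slides.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x, used as P(x_i | Y_k)."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * std)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))


def map_predict(x, params, priors):
    """Return the MAP label.

    params[label] is a list of (mean, std) pairs, one per feature,
    as produced by the learning phase.
    """
    def score(label):
        s = priors[label]
        for xi, (mean, std) in zip(x, params[label]):
            s *= gaussian_pdf(xi, mean, std)
        return s

    return max(priors, key=score)


# Hypothetical one-feature model with two classes (assumed values).
params = {"a": [(0.0, 1.0)], "b": [(5.0, 1.0)]}
priors = {"a": 0.5, "b": 0.5}
print(map_predict([0.2], params, priors), map_predict([4.0], params, priors))
```

The densities play the role that the look-up tables play in the discrete case; everything else in the MAP decision is unchanged.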
Naïve Bayes Continuous Value Example
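• Temperature is naturally continuous-valued.
• Yes: 25.2, 19.3, 18.5, 21.7, 20.1, 24.3, 22.8, 23.1, 19.8
• No: 27.3, 30.1, 17.4, 29.5, 15.1
• Estimate the mean and variance for each class:
• $\mu = \dfrac{1}{N}\sum_{n=1}^{N} x_n$ and $\sigma^2 = \dfrac{1}{N-1}\sum_{n=1}^{N} (x_n - \mu)^2$
• $\mu_{Yes} = 21.64,\ \sigma_{Yes} = 2.35$; $\mu_{No} = 23.88,\ \sigma_{No} = 7.09$
• The learning phase outputs two Gaussian models for $P(\text{Temperature} \mid Y)$:
• $P(x \mid \text{Yes}) = \dfrac{1}{2.35\sqrt{2\pi}} \exp\!\left(-\dfrac{(x - 21.64)^2}{2 \cdot 2.35^2}\right)$
• $P(x \mid \text{No}) = \dfrac{1}{7.09\sqrt{2\pi}} \exp\!\left(-\dfrac{(x - 23.88)^2}{2 \cdot 7.09^2}\right)$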
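The per-class mean and standard deviation can be recomputed from the listed temperatures with Python's standard `statistics` module. This is a sketch under the assumption that the sample (N − 1) standard deviation is intended; the test temperature 22.0 is a hypothetical value.

```python
import statistics

temps = {
    "Yes": [25.2, 19.3, 18.5, 21.7, 20.1, 24.3, 22.8, 23.1, 19.8],
    "No": [27.3, 30.1, 17.4, 29.5, 15.1],
}

# Fit one Gaussian per class; statistics.stdev uses the N-1 denominator.
models = {
    label: statistics.NormalDist(statistics.mean(values), statistics.stdev(values))
    for label, values in temps.items()
}

for label, dist in models.items():
    print(label, round(dist.mean, 2), round(dist.stdev, 2))

# Class-conditional densities at a hypothetical test temperature (assumed).
x = 22.0
densities = {label: dist.pdf(x) for label, dist in models.items()}
print(densities)
```

To turn the densities into a MAP decision they would still need to be multiplied by the class priors.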
Relevant Issues
1. Violation of independence Assumption
2. Zero Conditional Probability Problem
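If no training example of class $Y_k$ contains the feature value $x_i = a$, the estimate is $P(x_i = a \mid Y_k) = 0$. In this circumstance the whole product $\prod_i P(x_i \mid Y_k)$ becomes zero at test time, whatever the other feature values are.
This can be solved by smoothing the probability estimates, for example with Laplace (add-one) smoothing.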
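One common remedy for zero conditional probabilities is additive (Laplace) smoothing. Whether the slide used exactly this form is an assumption; the function name `smoothed_estimate` and the parameter `alpha` are illustrative.

```python
from fractions import Fraction


def smoothed_estimate(count, class_total, num_values, alpha=1):
    """Additive-smoothing estimate of P(x_i = a | Y_k).

    count:       training examples of class Y_k with x_i = a
    class_total: training examples of class Y_k
    num_values:  number of distinct values feature x_i can take
    alpha:       pseudo-count added to every value (1 = Laplace smoothing)
    """
    return Fraction(count + alpha, class_total + alpha * num_values)


# Grade = Average never occurs with Single = No in the example data
# (0 of 4 examples, 3 possible grades), so the raw estimate is 0.
p_average = smoothed_estimate(count=0, class_total=4, num_values=3)

# Smoothed estimates for High (2 of 4), Low (2 of 4) and Average (0 of 4).
grade_no = [smoothed_estimate(c, 4, 3) for c in (2, 2, 0)]
print(p_average, grade_no)
```

The smoothed estimates are never zero and still sum to one over the values of the feature.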
Underflow Prevention
• Multiplying lots of probabilities, which are between 0 and 1 by definition, can
result in floating-point underflow.
• Since $\log(xy) = \log(x) + \log(y)$, it is better to perform all computations by
summing logs of probabilities rather than multiplying probabilities.
• The class with the highest final un-normalized log probability score is still the most
probable.
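The effect is easy to demonstrate. In this sketch the 1000 likelihood values of 0.01 are hypothetical, chosen only so that the direct product falls below the smallest positive double-precision number.

```python
import math

# 1000 hypothetical feature likelihoods of 0.01 each (assumed values).
probs = [0.01] * 1000

# Direct multiplication: the true value 1e-2000 is not representable.
product = 1.0
for p in probs:
    product *= p

# Summing logs keeps the score finite and comparable across classes.
log_score = sum(math.log(p) for p in probs)

print(product, log_score)
```

Because the logarithm is monotonically increasing, comparing log scores selects the same class as comparing the original products.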
Summary
• Naïve Bayes: the conditional independence assumption
• Training is very easy and fast; it only requires considering each attribute in each class separately.
• Testing is straightforward: just look up tables or calculate conditional probabilities with the
estimated distributions.
• A popular generative model
• Performance is often competitive with most state-of-the-art classifiers, even when the
independence assumption is violated.
• Many successful applications, e.g., spam mail filtering
