Lesson 16
Classification Techniques – Naïve Bayes Kush Kulshrestha
Bayes Theorem
Bayes' theorem (alternatively Bayes' law or Bayes' rule) describes the probability of an event, based on prior
knowledge of conditions that might be related to the event.
For example, if cancer is related to age, then, using Bayes' theorem, a person's age can be used to more accurately
assess the probability that they have cancer, compared to the assessment of the probability of cancer made without
knowledge of the person's age.
Bayes' theorem is stated mathematically as the following equation:
P(A|B) = P(B|A) * P(A) / P(B)
where A and B are events and P(B) ≠ 0.
• P (A|B) is a conditional probability: the likelihood of event A occurring given that B is true.
• P (B|A) is also a conditional probability: the likelihood of event B occurring given that A is true.
• P(A) and P(B) are the probabilities of observing A and B independently of each other. P(B) is also called the
marginal probability of B, and P(A) is called the prior.
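The formula above can be illustrated with a tiny Python sketch. The function and the cancer/age numbers below are purely illustrative, not from the slides:

```python
def bayes(p_b_given_a, p_a, p_b):
    """Posterior P(A|B) via Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_b_given_a * p_a / p_b

# Hypothetical numbers for the cancer/age example: suppose 1% of people have
# cancer (the prior), 80% of cancer patients are over 60, and 30% of all
# people are over 60.
posterior = bayes(p_b_given_a=0.80, p_a=0.01, p_b=0.30)
print(round(posterior, 4))  # knowing the age raises the estimate above the 1% prior
```

Note how the posterior (about 2.7%) is well above the 1% prior: the extra information (age) updates the estimate, exactly as in the cancer example above.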
Bayes Theorem
Why is Prior important?
Imagine you are standing in a line of people, and the person in front of you has long hair. What can you conclude
about that person's gender?
Most people would guess that a person with long hair is probably a woman.
Now suppose you are given some additional information (a prior): the line you are standing in leads to the gents'
restroom. What would your estimate be now?
You can say with very high certainty that the person is a man.
This is how incorporating a prior into your analysis helps you predict the best-fitting category / class.
Intro to Naïve Bayes
Candy Selection Example
Once there lived an old Grandma. Every year on her birthday, her entire family would visit her and stay at her
mansion. Sons, daughters, their spouses, her grandchildren. It would be a big bash every year, with a lot of fanfare.
But what Grandma loved the most was meeting her grandchildren and getting to play with them. She had ten
grandchildren in total, all of them around 10 years of age, and she would lovingly call them "random variables".
Every year, Grandma would present a candy to each of the kids.
Grandma had a large box full of candies of ten different kinds.
She would give a single candy to each one of the kids, since she
didn't want to spoil their teeth.
But, as she loved the kids so much, she took great pains to
decide which candy to present to which kid, so as to
maximize their total happiness (the maximum-likelihood
estimate, as she would call it).
Intro to Naïve Bayes
But that was not an easy task for Grandma. She knew that each type of candy had a certain probability of making a
kid happy. That probability was different for different candy types, and for different kids. Rakesh liked the red candy
more than the green one, while Sheila liked the orange one above all else. Each of the 10 kids had different
preferences for each of the 10 candies. Moreover, their preferences largely depended on external factors which were
unknown (hidden variables) to Grandma. If Sameer had seen a blue building on the way to the mansion, he'd want a
blue candy, while Sandeep always wanted the candy that matched the colour of his shirt that day. But the biggest
challenge was that their happiness depended on what candies the other kids got! If Rohan got a red candy, then
Niyati would want a red candy as well, and anything else would make her go crying into her mother's arms
(conditional dependency). Sakshi always wanted what the majority of kids got (positive correlation), while Tanmay
would be happiest if nobody else got the kind of candy that he received (negative correlation). Grandma had
concluded long ago that her grandkids were completely mutually dependent.
It was computationally a big task for Grandma to get the candy selection right. There were too many conditions to
consider and she could not simplify the calculation. Every year before her birthday, she would spend days figuring
out the optimal assignment of candies, by enumerating all configurations of candies for all the kids together.
She was getting old, and the task was getting harder and harder. She used to feel that she would die before figuring
out the optimal selection of candies that would make her kids the happiest all at once.
Intro to Naïve Bayes
But an interesting thing happened. As the years passed and the kids grew up, they finally grew out of their teens and
turned into independent adults. Their choices became less and less dependent on each other, and it became easier
to figure out each one's most preferred candy.
Grandma was quick to realize this, and she joyfully began calling them "independent random variables".
It was much easier for her to figure out the optimal selection of candies - she just had to think of one kid at a time
and, for each kid, assign a happiness probability to each of the 10 candy types for that kid. Then she would pick the
candy with the highest happiness probability for that kid, without worrying about what she would assign to the
other kids. This was a super easy task, and Grandma was finally able to get it right.
Learnings:
• In statistical modelling, having mutually dependent random variables makes it really hard to find out the optimal
assignment of values for each variable that maximizes the cumulative probability of the set.
• However, if the variables are independent, it is easy to pick out the individual assignments that maximize the
probability of each variable, and then combine the individual assignments to get a configuration for the entire set.
• In Naive Bayes, you make the assumption that the variables are independent (even if they are actually not). That’s
why it is called Naïve.
• This simplifies your calculation, and in many cases, it actually gives estimates that are comparable to those which
you would have obtained from a more (computationally) expensive model that takes into account the conditional
dependencies between variables.
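The learnings above can be sketched numerically. Under the naive independence assumption, the joint likelihood of a set of features factorizes into a simple product of per-feature probabilities, so each class can be scored on its own. All probabilities and class names below are made up for illustration:

```python
# Hypothetical per-class probabilities P(feature_i | class) for three features.
likelihoods = {
    "spam":     [0.8, 0.6, 0.9],
    "not_spam": [0.2, 0.3, 0.1],
}
priors = {"spam": 0.4, "not_spam": 0.6}

def score(cls):
    """Prior times product of per-feature likelihoods (the naive assumption)."""
    p = priors[cls]
    for lik in likelihoods[cls]:
        p *= lik  # independence: joint likelihood = product of marginals
    return p

best = max(priors, key=score)
print(best)  # the class with the highest prior * product-of-likelihoods
```

Because each class score is just a product, no enumeration over joint configurations is needed: this is exactly what made Grandma's task easy once the "random variables" became independent.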
How does Naïve Bayes work?
Below is a training data set of weather conditions and the corresponding target variable 'Play' (indicating whether a
game was played).
How does Naïve Bayes work?
"Players will play if the weather is sunny." Is this statement correct?
We can answer it using the posterior-probability method (Bayes' theorem) discussed above:
P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
We know from the data: P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, P(Yes) = 9/14 = 0.64
Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 = 0.60, which is the higher probability, so the statement looks correct.
Naive Bayes uses a similar method to predict the probabilities of different classes based on various attributes. The
algorithm is mostly used in text classification and in problems with multiple classes.
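The calculation on this slide can be reproduced directly from the counts in the data set (3 sunny days among the 9 'Yes' days, 5 sunny days and 9 'Yes' days out of 14 records in total):

```python
# Counts taken from the weather / 'Play' training data on the previous slide.
total = 14
sunny = 5
yes = 9
sunny_and_yes = 3

p_sunny_given_yes = sunny_and_yes / yes  # 3/9  ≈ 0.33
p_yes = yes / total                      # 9/14 ≈ 0.64
p_sunny = sunny / total                  # 5/14 ≈ 0.36

# Bayes' theorem: P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(round(p_yes_given_sunny, 2))  # 0.6
```

Working with the exact fractions, the answer is exactly 3/5 = 0.6; the slide's 0.33 * 0.64 / 0.36 is the same computation with rounded intermediate values.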
Pros and Cons of Naïve Bayes
Pros:
• It is easy and fast to predict the class of a test data set. It also performs well in multi-class prediction.
• When the assumption of independence holds, a Naive Bayes classifier performs better than comparable models such
as logistic regression, and it needs less training data.
• It performs reasonably well even when the independence assumption is somewhat violated.
Cons:
• If a categorical variable has a category in the test data set that was not observed in the training data set, the model
will assign it a zero probability and will be unable to make a prediction. This is often known as the "zero-frequency"
problem; a common remedy is a smoothing technique such as Laplace estimation.
• Another limitation of Naive Bayes is the assumption of independent predictors. In real life, it is almost impossible
to get a set of predictors that are completely independent.
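The zero-frequency problem mentioned above is commonly handled with Laplace (add-one) smoothing, which adds a pseudo-count to every category so that no conditional probability is exactly zero. A minimal sketch, with illustrative counts:

```python
def smoothed_prob(count, class_total, n_categories, alpha=1):
    """Laplace-smoothed estimate of P(category | class).

    Adds `alpha` pseudo-counts per category, so even an unseen
    category gets a small non-zero probability.
    """
    return (count + alpha) / (class_total + alpha * n_categories)

# Suppose 'Overcast' was never observed together with class 'No' in
# training (count = 0), out of 5 'No' records and 3 weather categories.
p = smoothed_prob(count=0, class_total=5, n_categories=3)
print(round(p, 3))  # 0.125 rather than 0, so prediction is still possible
```

Without smoothing, a single unseen category zeroes out the whole product of likelihoods for that class; with it, the class merely gets a lower score.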