Data Mining
Bayes Classification
Bayes' Theorem
• Bayes' Theorem is named after Thomas Bayes. It relates two types of probabilities:
• Posterior probability, P(H|X)
• Prior probability, P(H)
• where X is a data tuple and H is some hypothesis.
• According to Bayes' Theorem,
• P(H|X) = P(X|H) P(H) / P(X)
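The theorem can be checked numerically with a few lines of Python. This is a minimal sketch: the function name `posterior` and the three input values are hypothetical, chosen only for illustration and not taken from the slides.

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    if evidence <= 0:
        raise ValueError("P(X) must be positive")
    return likelihood * prior / evidence


# Hypothetical numbers, for illustration only.
p_x_given_h = 0.8  # P(X|H), the likelihood
p_h = 0.3          # P(H), the prior
p_x = 0.5          # P(X), the evidence

print(posterior(p_x_given_h, p_h, p_x))
```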
Naïve Bayes
• It is a classification technique based on Bayes' Theorem with an
assumption of independence among predictors.
• In simple terms, a Naive Bayes classifier assumes that, given the
class, the presence of a particular feature is unrelated to the
presence of any other feature.
• For example, a fruit may be considered to be an apple if it is
red, round, and about 3 inches in diameter. Even if these
features depend on each other or upon the existence of the
other features, the classifier treats each property as
contributing independently to the probability that this fruit is
an apple, which is why it is known as 'Naive'.
• A Naive Bayes model is easy to build and particularly useful for
very large data sets. Despite its simplicity, Naive Bayes often
performs competitively with more sophisticated classification
methods.
• Bayes' theorem provides a way of calculating the posterior
probability P(c|x) from P(c), P(x) and P(x|c):
Formula
P(c|x) = P(x|c) P(c) / P(x)
• P(c|x) is the posterior probability of the class (c, target) given
the predictor (x, attributes).
• P(c) is the prior probability of the class.
• P(x|c) is the likelihood, which is the probability of the predictor
given the class.
• P(x) is the prior probability of the predictor.
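Under the independence assumption, the likelihood of a tuple with several attributes factorizes into one term per attribute. A sketch of the standard form follows; the symbols x_1, ..., x_n (the attribute values) and n (their number) are notation introduced here, not taken from the slides.

```latex
P(c \mid x_1, \dots, x_n)
  = \frac{P(c)\,\prod_{i=1}^{n} P(x_i \mid c)}{P(x_1, \dots, x_n)}
\qquad\Longrightarrow\qquad
\hat{c} = \arg\max_{c}\; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

Because the denominator is the same for every class, comparing the numerators is enough to pick the predicted class.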
Dataset
Day  Outlook   Temperature  Humidity  Wind   Class: Play ball
D1   Sunny     Hot          High      False  No
D2   Sunny     Hot          High      True   No
D3   Overcast  Hot          High      False  Yes
D4   Rain      Mild         High      False  Yes
D5   Rain      Cool         Normal    False  Yes
D6   Rain      Cool         Normal    True   No
D7   Overcast  Cool         Normal    True   Yes
D8   Sunny     Mild         High      False  No
D9   Sunny     Cool         Normal    False  Yes
D10  Rain      Mild         Normal    False  Yes
D11  Sunny     Mild         Normal    True   Yes
D12  Overcast  Mild         High      True   Yes
D13  Overcast  Hot          Normal    False  Yes
D14  Rain      Mild         High      True   No
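The per-class counts that Naive Bayes needs can be tallied directly from these 14 rows. The sketch below is one possible encoding; the names `COLUMNS`, `ROWS`, `class_counts` and `counts` are illustrative choices, not part of the slides.

```python
from collections import Counter, defaultdict

COLUMNS = ("outlook", "temperature", "humidity", "wind")
ROWS = [
    ("Sunny", "Hot", "High", False, "No"),
    ("Sunny", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Rain", "Mild", "High", False, "Yes"),
    ("Rain", "Cool", "Normal", False, "Yes"),
    ("Rain", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Sunny", "Mild", "High", False, "No"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Rain", "Mild", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", True, "Yes"),
    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),
    ("Rain", "Mild", "High", True, "No"),
]

# Number of rows per class label.
class_counts = Counter(row[-1] for row in ROWS)

# counts[attribute][value][label] = rows with that attribute value and label.
counts = defaultdict(lambda: defaultdict(Counter))
for *features, label in ROWS:
    for name, value in zip(COLUMNS, features):
        counts[name][value][label] += 1

print(dict(class_counts))
print(dict(counts["outlook"]["Sunny"]))
```

Dividing each count by the matching class total gives the conditional probabilities, for example `counts["outlook"]["Sunny"]["Yes"] / class_counts["Yes"]` for P(outlook=sunny|yes).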
Problem
The weather data, with counts and probabilities

Counts
outlook   yes  no | temperature  yes  no | humidity  yes  no | windy  yes  no | play  yes  no
sunny      2    3 | hot           2    2 | high       3    4 | false   6    2 |        9    5
overcast   4    0 | mild          4    2 | normal     6    1 | true    3    3 |
rainy      3    2 | cool          3    1 |                   |                |

Probabilities
outlook   yes  no  | temperature  yes  no  | humidity  yes  no  | windy  yes  no  | play  yes   no
sunny     2/9  3/5 | hot          2/9  2/5 | high      3/9  4/5 | false  6/9  2/5 |       9/14  5/14
overcast  4/9  0/5 | mild         4/9  2/5 | normal    6/9  1/5 | true   3/9  3/5 |
rainy     3/9  2/5 | cool         3/9  1/5 |                    |                 |

A new day
outlook  temperature  humidity  windy  play
sunny    cool         high      true   ?

Outlook   Temp  Humidity  Wind
Overcast  Mild  Normal    True
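The slides do not work out the second instance (overcast, mild, normal, true). A hedged sketch using the probabilities from the table is given below; the variable names are illustrative. It also shows the effect of the 0/5 entry for overcast under "no".

```python
from fractions import Fraction as F
from math import prod

# Query: outlook=overcast, temp=mild, humidity=normal, windy=true.
# Each score is the product of the four conditionals times the class prior.
yes_score = prod([F(4, 9), F(4, 9), F(6, 9), F(3, 9)]) * F(9, 14)
no_score = prod([F(0, 5), F(2, 5), F(1, 5), F(3, 5)]) * F(5, 14)

print(float(yes_score), float(no_score))
print("Yes" if yes_score > no_score else "No")
```

Because no overcast day in the data has class No, the No score is exactly zero whatever the other attributes say. In practice this zero-frequency effect is often softened with add-one (Laplace) smoothing, which the slides do not cover.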
Problem
For the new day X = (outlook=sunny, temp=cool, humidity=high, windy=true):
P(outlook=sunny|yes) = 2/9
P(temp=cool|yes) = 3/9
P(humidity=high|yes) = 3/9
P(windy=true|yes) = 3/9
P(X|yes) P(yes) = 2/9 * 3/9 * 3/9 * 3/9 * 9/14 = 0.00529
P(outlook=sunny|no) = 3/5
P(temp=cool|no) = 1/5
P(humidity=high|no) = 4/5
P(windy=true|no) = 3/5
P(X|no) P(no) = 3/5 * 1/5 * 4/5 * 3/5 * 5/14 = 0.0206
P(yes|X) < P(no|X), since both scores share the same denominator P(X)
Prediction = No
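The arithmetic for the (sunny, cool, high, true) day can be reproduced exactly with rational numbers. This is a minimal sketch; the dictionary layout and the names `PRIOR`, `LIKELIHOOD` and `score` are assumptions made for illustration. It also normalizes the two scores, a step the slides skip because it does not change the prediction.

```python
from fractions import Fraction as F

PRIOR = {"yes": F(9, 14), "no": F(5, 14)}

# Conditionals for (outlook=sunny, temp=cool, humidity=high, windy=true).
LIKELIHOOD = {
    "yes": [F(2, 9), F(3, 9), F(3, 9), F(3, 9)],
    "no": [F(3, 5), F(1, 5), F(4, 5), F(3, 5)],
}


def score(label):
    """Unnormalized posterior: prior times the product of the conditionals."""
    result = PRIOR[label]
    for p in LIKELIHOOD[label]:
        result *= p
    return result


scores = {label: score(label) for label in PRIOR}
total = sum(scores.values())

# Dividing by the total turns the scores into posteriors that sum to 1.
posteriors = {label: float(s / total) for label, s in scores.items()}
prediction = max(scores, key=scores.get)

print({label: round(float(s), 5) for label, s in scores.items()})
print(posteriors, prediction)
```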
• Likelihood of yes = 2/9 × 3/9 × 3/9 × 3/9 × 9/14 = 0.0053
• Likelihood of no = 3/5 × 1/5 × 4/5 × 3/5 × 5/14 = 0.0206
• Therefore, the prediction is No
• Predict stolen for
• Color=red
• Type=suv
• Origin=domestic
Counts and conditional probabilities (class: Stolen = Yes / No)
Color   Yes  No  | Type    Yes  No  | Origin  Yes  No
Red      3    2  | Sports   4    2  | Dom      2    3
Yellow   2    3  | SUV      1    3  | Imp      3    2
Red     3/5  2/5 | Sports  4/5  2/5 | Dom     2/5  3/5
Yellow  2/5  3/5 | SUV     1/5  3/5 | Imp     3/5  2/5
Total Rows = 10
P(Yes) = 5/10
P(No) = 5/10
Predict stolen for X = (Color=red, Type=suv, Origin=domestic)
Likelihood for Yes
Color=red = 3/5
Type=suv = 1/5
Origin=domestic = 2/5
P(X|Yes) P(Yes) = 3/5 * 1/5 * 2/5 * 5/10 = 0.024
Likelihood for No
Color=red = 2/5
Type=suv = 3/5
Origin=domestic = 3/5
P(X|No) P(No) = 2/5 * 3/5 * 3/5 * 5/10 = 0.072
Prediction: Stolen = No
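The car example can be recomputed from the raw counts, with each conditional probability taken as the count divided by its class total (5 stolen, 5 not stolen). The sketch below is one way to do it; the names `COUNTS`, `CLASS_TOTALS` and `scores` are illustrative.

```python
from fractions import Fraction

# COUNTS[attribute][value] = (count with Stolen=Yes, count with Stolen=No)
COUNTS = {
    "color": {"red": (3, 2), "yellow": (2, 3)},
    "type": {"sports": (4, 2), "suv": (1, 3)},
    "origin": {"domestic": (2, 3), "imported": (3, 2)},
}
CLASS_TOTALS = (5, 5)  # (Yes, No)
TOTAL_ROWS = 10


def scores(query):
    """Return (yes_score, no_score): class prior times the conditionals."""
    result = []
    for k, class_total in enumerate(CLASS_TOTALS):
        s = Fraction(class_total, TOTAL_ROWS)  # prior P(class)
        for attribute, value in query.items():
            # P(value | class) = count for this class / class total
            s *= Fraction(COUNTS[attribute][value][k], class_total)
        result.append(s)
    return tuple(result)


yes_score, no_score = scores(
    {"color": "red", "type": "suv", "origin": "domestic"}
)
print(float(yes_score), float(no_score))
print("Stolen = Yes" if yes_score > no_score else "Stolen = No")
```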

Module 4 bayes classification
