www.edureka.co/data-science | EDUREKA DATA SCIENCE CERTIFICATION TRAINING
What to expect?
• What is Machine Learning?
• Introduction to Classification
• Classification Algorithms
• What is Naive Bayes?
• Use Cases of Naive Bayes
• Demo – Employee Salary Prediction
What is Machine Learning?
• Machine Learning is the study and construction of algorithms that can learn from data and make predictions on it.
• It is closely related to computational statistics.
• It is used to devise complex models and algorithms that lend themselves to prediction; in commercial use this is known as predictive analytics.
Examples: Speech Recognition, Face Recognition, Anti-Virus, Weather Prediction
Supervised vs Unsupervised Learning
Supervised Learning
Classification is the result of supervised learning, which means that there is a known label that you want the system to generate.
E.g. If you build a fruit classifier, the labels will be “this is an orange”, “this is an apple” and “this is a banana”, based on showing the classifier examples of apples, oranges and bananas.
Unsupervised Learning
Clustering is the result of unsupervised learning, which means that you’ve seen lots of examples, but don’t have labels.
E.g. In the same example, a fruit clustering will produce groups such as “fruits with soft skin and lots of dimples”, “fruits with shiny hard skin” and “elongated yellow fruits”.
Introduction to Classification
• Classification is the problem of identifying to which of a set of categories a new observation belongs.
• It is based on a training set of data containing observations whose category is already known.
Figure: Examples of Classification
Classification Algorithms
Classifier
Quadratic
Linear
SVM
Logistic Regression
Naive Bayes
Neural Networks
Decision Trees
Kernel Estimation
Perceptron
What is Naive Bayes?
Let us understand Naive Bayes with the help of an example.
Hi! I just cannot seem to figure out which are the best days to play football with my friends. Can you help me out?
All possible weather combinations:
Season: Summer, Monsoon, Winter
Sun: Sunny, No Sun
Wind: Windy, No Wind
I have noted down all the days it was good/bad to play football and the combination of weather metrics on that particular day.
That is perfect. We will be using the Naive Bayes algorithm to predict if you should play on a particular day or not.
Case 1 – Sunny
Moving further, we can draw charts based on the probabilities of days favouring games.
[Chart: Sunny (Yes / No) against Season (Summer, Monsoon, Winter)]
• We have categorized the probability to play into “High” (P > 0.5) and “Low” (P < 0.5).
• Big circles represent “High”, i.e. probability greater than 0.5.
• Small circles represent “Low”, i.e. probability less than 0.5.
Case 2 – Windy
• The second attribute is the wind speed on a particular day.
• Let us look at how wind affects the chances of playing football on a particular day.
Here, we will look at the days when there was wind and whether it was good to play.
[Chart: Windy (Yes / No) against Season (Summer, Monsoon, Winter)]
Here, we have the complete set of attributes and whether to play on that day or not.
[Charts: two panels against Season (Summer, Monsoon, Winter). Panel “Sunny = No”: rows (Sunny = No, Windy = Yes) and (Sunny = No, Windy = No). Panel “Sunny = Yes”: rows (Sunny = Yes, Windy = Yes) and (Sunny = Yes, Windy = No)]
If you look at Summer in the first graph, it is advisable to play when there is no sun. But the second graph shows a different picture. This is because a day in Summer which is not Sunny might have P > 0.5, but when there is also no wind, the posterior probability is P < 0.5.
• The Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
• Bayes' theorem is stated mathematically as the following equation:
P(A | B) = P(B | A) * P(A) / P(B)
where A and B are events and P(B) ≠ 0.
Understanding Bayes’ Theorem
• Let us understand how Bayes’ Theorem can be used in the Naive Bayes classifier:
P(c | x) = P(x | c) * P(c) / P(x)
where
P(c | x) = Posterior Probability
P(x | c) = Likelihood
P(c) = Class Prior Probability
P(x) = Predictor Prior Probability
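The relationship between posterior, likelihood, class prior and predictor prior can be written as a small function. This is only a minimal sketch in Python: the function name `bayes_posterior` and the input numbers are assumptions made for illustration and do not come from the deck.

```python
def bayes_posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Return P(c | x) = P(x | c) * P(c) / P(x)."""
    if evidence == 0:
        # Bayes' theorem is only defined when the predictor prior is non-zero.
        raise ValueError("P(x) must be non-zero")
    return likelihood * prior / evidence


# Illustrative inputs only (hypothetical, not taken from the slides).
posterior = bayes_posterior(likelihood=0.5, prior=0.4, evidence=0.25)
print(posterior)
```

The guard mirrors the condition that the denominator must not be zero.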
In Figure 1, we have the posterior probability of Sunny across seasons, excluding wind speed.
In Figure 2, we have the posterior probabilities for full attribute combinations (e.g. Sunny = No, Windy = Yes and Season = Summer).
[Figure 1] [Figure 2]
We can use the Naive Bayes classifier to predict whether to play football on (Season = Winter, Sunny = No, Windy = Yes).
Our demo will help you clearly understand Naive Bayes.
From the dataset we have obtained, we will populate a frequency table for each of the attributes.
Frequency Table: Season vs Play
Season     Play = Yes   Play = No
Summer     3            2
Monsoon    4            0
Winter     2            3
Frequency Table: Sunny vs Play
Sunny      Play = Yes   Play = No
Yes        3            4
No         6            1
Frequency Table: Windy vs Play
Windy      Play = Yes   Play = No
Yes        6            2
No         3            3
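Turning the frequency tables into likelihoods is a matter of dividing each count by its column total. The sketch below does this in Python with exact fractions; the counts are the ones from the slides, while the names `FREQ` and `likelihood_table` are assumptions introduced here for illustration.

```python
from fractions import Fraction

# Frequency tables from the slides:
# attribute value -> (count with Play = Yes, count with Play = No)
FREQ = {
    "Season": {"Summer": (3, 2), "Monsoon": (4, 0), "Winter": (2, 3)},
    "Sunny": {"Yes": (3, 4), "No": (6, 1)},
    "Windy": {"Yes": (6, 2), "No": (3, 3)},
}


def likelihood_table(freq):
    """Convert counts into (P(value | Play = Yes), P(value | Play = No))."""
    total_yes = sum(yes for yes, _ in freq.values())
    total_no = sum(no for _, no in freq.values())
    return {
        value: (Fraction(yes, total_yes), Fraction(no, total_no))
        for value, (yes, no) in freq.items()
    }


for attribute, freq in FREQ.items():
    print(attribute, likelihood_table(freq))
```

Note that `Fraction` reduces its values, so 3/9 is printed as 1/3.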
For each of the frequency tables, we will find the likelihoods for each of the cases.
Likelihood Table: Season vs Play
Season     Play = Yes   Play = No   P(Season)
Summer     3/9          2/5         5/14
Monsoon    4/9          0/5         4/14
Winter     2/9          3/5         5/14
P(Play)    9/14         5/14
Here, c = Play and x = variables like Season, Sunny & Windy.
P(x | c) = P(Summer | Yes) = 3/9 = 0.33
P(c) = P(Yes) = 9/14 = 0.64
P(x) = P(Summer) = 5/14 = 0.36
The posterior probability of ‘Yes’ given Summer is:
P(c | x) = P(Yes | Summer) = P(Summer | Yes) * P(Yes) / P(Summer) = (3/9 x 9/14) / (5/14) = 0.60
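As a cross-check of the single-attribute calculation, the posterior can be computed for every season with exact fractions. This is a minimal Python sketch using the Season counts from the slides; the helper name `posterior_yes` is an assumption. With only one attribute, Bayes' theorem reduces to the share of "Yes" days within that season, which the loop verifies.

```python
from fractions import Fraction

# Season frequency table from the slides: season -> (Play = Yes, Play = No)
SEASON = {"Summer": (3, 2), "Monsoon": (4, 0), "Winter": (2, 3)}
TOTAL_YES = sum(yes for yes, _ in SEASON.values())    # 9 days with Play = Yes
TOTAL = sum(yes + no for yes, no in SEASON.values())  # 14 days in all


def posterior_yes(season):
    """P(Yes | season) computed through Bayes' theorem."""
    yes, no = SEASON[season]
    likelihood = Fraction(yes, TOTAL_YES)  # P(season | Yes)
    prior = Fraction(TOTAL_YES, TOTAL)     # P(Yes)
    evidence = Fraction(yes + no, TOTAL)   # P(season)
    return likelihood * prior / evidence


for season, (yes, no) in SEASON.items():
    # Same value as the row proportion read directly off the frequency table.
    assert posterior_yes(season) == Fraction(yes, yes + no)
    print(season, posterior_yes(season))
```

For Summer this gives 3/5, i.e. 0.60.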
Let us use the likelihood table to predict whether to play football on (Season = Winter, Sunny = No, Windy = Yes).
P(c | x) = P(Play = Yes | Winter, Sunny = No, Windy = Yes)
= [P(Winter | Yes) * P(Sunny = No | Yes) * P(Windy = Yes | Yes) * P(Yes)] / [P(Winter) * P(Sunny = No) * P(Windy = Yes)]
= [(2/9) * (6/9) * (6/9) * (9/14)] / [(5/14) * (7/14) * (8/14)] = 0.6222
Since the probability is greater than 0.5, we should play football on that day.
Yayiee!!
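The arithmetic for (Season = Winter, Sunny = No, Windy = Yes) can be checked with exact fractions. The sketch below is in Python and uses only the counts from the slides' tables. The second calculation, which normalises over both classes instead of dividing by the product of the attribute marginals, is an alternative that is not on the slides; it is shown because the product of marginals is not in general equal to P(x), so only the normalised values are guaranteed to sum to 1 across classes.

```python
from fractions import Fraction as F

# Likelihoods and priors read off the slides' tables for
# (Season = Winter, Sunny = No, Windy = Yes).
score_yes = F(2, 9) * F(6, 9) * F(6, 9) * F(9, 14)  # P(x | Yes) * P(Yes)
score_no = F(3, 5) * F(1, 5) * F(2, 5) * F(5, 14)   # P(x | No) * P(No)

# The slide divides by the product of the attribute marginals.
marginals = F(5, 14) * F(7, 14) * F(8, 14)
slide_value = score_yes / marginals

# Alternative (not on the slide): normalise over both classes.
normalised_yes = score_yes / (score_yes + score_no)

print(round(float(slide_value), 4))     # 0.6222
print(round(float(normalised_yes), 4))  # 0.7874
print("Play" if score_yes > score_no else "Do not play")
```

Both approaches lead to the same decision here, since the class with the larger numerator wins either way.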
Use Cases of Naive Bayes
Email Spam Detection
Categorizing News
Face Recognition
Sentiment Analysis
Weather Prediction
Digit Recognition
Medical Diagnosis
Demo – Problem Statement
Problem Statement: To devise a model to predict an employee’s salary based on the given set of attributes using the Naive Bayes classifier.
• We have an Employee Dataset with 14 attributes; our output variable is the employee’s salary.
• We will use the Naive Bayes classifier to predict an employee’s salary as high (>50k) or low (<50k) by finding out the probabilities for the given attribute combination.
Demo – Employee Salary Prediction
Steps:
1. Data Acquisition
2. Feature Selection
3. Divide Dataset
4. Implement Model
5. Optimize Model
6. Model Validation
7. Prediction
Data Acquisition
Field                Description
Age_Of_emp           Age of the employee
Emp_Stat_type        Type of the employment industry
srnumber             Serial number of the employee
Edu_of_Emp           Employee education details
Edu_Cat              Employee’s education category
marital_Status       Employee marital status
Occ_Of_Emp           Job description of the employee
Emp_rel_status       Employee relationship status
Emp_race_type        Race of the employee
sex_of_emp           Sex of the employee
capital_gain         Income from investment sources apart from wages/salary
capital_loss         Losses from investment sources apart from wages/salary
Work_hour_in_week    Number of weekly working hours
country_of_res       Country of residence
Emp_sal              Employee’s salary
Feature Selection
• From the fields listed above, we need to filter out unnecessary columns which will not affect the employee’s salary.
• We will be removing the fields srnumber, marital_Status, Emp_rel_status, Emp_race_type, sex_of_emp, capital_gain and capital_loss because these fields are factors which do not affect a person’s salary.
• The remaining fields will be used to build our model.
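The column filtering can be sketched as follows. This is a Python illustration only, not the demo's own code: the field names come from the field table, while the function name `select_features` and the record values are hypothetical.

```python
# Fields the slides remove before modelling.
DROPPED_FIELDS = {
    "srnumber", "marital_Status", "Emp_rel_status", "Emp_race_type",
    "sex_of_emp", "capital_gain", "capital_loss",
}


def select_features(record: dict) -> dict:
    """Return a copy of the record without the dropped fields."""
    return {key: value for key, value in record.items() if key not in DROPPED_FIELDS}


# Hypothetical record: every value here is made up for illustration.
record = {
    "Age_Of_emp": 39, "Emp_Stat_type": "Private", "srnumber": 1,
    "Edu_of_Emp": "Bachelors", "Edu_Cat": 13, "marital_Status": "Single",
    "Occ_Of_Emp": "Sales", "Emp_rel_status": "Not-in-family",
    "Emp_race_type": "A", "sex_of_emp": "F", "capital_gain": 0,
    "capital_loss": 0, "Work_hour_in_week": 40, "country_of_res": "India",
    "Emp_sal": "Low",
}
print(sorted(select_features(record)))
```

Seven of the fifteen fields are dropped, leaving seven predictors plus the output variable Emp_sal.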
Divide Dataset
We will divide our entire dataset into two subsets:
• Training dataset -> To train the model
• Testing dataset -> To validate and make predictions
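A train/test split can be sketched with the Python standard library alone. The slides do not state the split ratio, so the 30% test fraction, the fixed seed and the function name `train_test_split` are all assumptions made for this example.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for the dataset rows
train, test = train_test_split(rows)
print(len(train), len(test))  # 7 3
```

Shuffling before splitting avoids any ordering in the source file leaking into one of the subsets.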
Implement Model
• We build the Naive Bayes model using the R library ‘e1071’ on the training dataset that we just created.
• The model is called emp_nb.
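To show what such a model does internally, here is a minimal categorical Naive Bayes written from scratch in Python. It is a stand-in sketch and not the e1071 API used in the demo: the class name `CategoricalNaiveBayes`, its methods and the tiny training set are all hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Tiny Naive Bayes for categorical features (no smoothing)."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[class][feature index][value] -> count
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[label][i][value] += 1
        return self

    def scores(self, row):
        """Unnormalised P(x | c) * P(c) for each class c."""
        result = {}
        for label, count in self.class_counts.items():
            score = count / self.total  # class prior P(c)
            for i, value in enumerate(row):
                # likelihood P(x_i | c), multiplied in under the naive assumption
                score *= self.value_counts[label][i][value] / count
            result[label] = score
        return result

    def predict(self, row):
        scores = self.scores(row)
        return max(scores, key=scores.get)


# Hypothetical training rows: (education, occupation) -> salary band.
rows = [("Bachelors", "Sales"), ("Masters", "Tech"), ("Masters", "Sales"),
        ("HS", "Sales"), ("HS", "Tech"), ("Bachelors", "Tech")]
labels = ["Low", "High", "High", "Low", "Low", "High"]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("Masters", "Tech")))
print(model.predict(("HS", "Sales")))
```

The class with the largest product of prior and likelihoods is returned as the prediction.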
The following is the output from the emp_nb model:
[Output: likelihood of High & Low salaries; likelihood of employee department against High & Low salaries]
Optimize Model
Optimizing a model refers to modifying it so as to achieve the highest accuracy.
If the P-value is greater than 0.05, then we should reject the model. Our P-value is less than 0.05, so our model is acceptable.
Kappa is the value obtained by:
Kappa = (totalAccuracy - randomAccuracy) / (1 - randomAccuracy)
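The Kappa formula can be evaluated from a confusion matrix as sketched below in Python. Here totalAccuracy is taken as the share of observations on the diagonal and randomAccuracy as the agreement expected by chance from the row and column totals. The function name `cohen_kappa` and the counts are hypothetical and are not results from the demo.

```python
def cohen_kappa(matrix):
    """Kappa from a square confusion matrix (rows = actual, columns = predicted)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # totalAccuracy: share of observations on the diagonal.
    total_accuracy = sum(matrix[i][i] for i in range(size)) / total
    # randomAccuracy: agreement expected by chance from the row/column totals.
    random_accuracy = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    return (total_accuracy - random_accuracy) / (1 - random_accuracy)


# Hypothetical counts, chosen only for illustration.
matrix = [[40, 10],
          [5, 45]]
print(round(cohen_kappa(matrix), 2))  # 0.7
```

With these counts the accuracy is 0.85 and the chance agreement is 0.5, so Kappa is 0.35 / 0.5.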
The Naive Bayes classifier can be further improved using the following steps:
• Include Laplace Correction
• Normalization
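Laplace correction can be illustrated with the zero count from the football example, where Monsoon never occurs with Play = No (0 of 5 days). The Python sketch below applies add-one smoothing; the function name `smoothed_likelihood` and the choice of alpha = 1 are assumptions for illustration.

```python
from fractions import Fraction


def smoothed_likelihood(count, class_total, n_values, alpha=1):
    """P(value | class) with additive (Laplace) smoothing."""
    return Fraction(count + alpha, class_total + alpha * n_values)


# Unsmoothed, P(Monsoon | No) is 0 and wipes out the whole product for "No".
print(Fraction(0, 5))                # 0
# Smoothed over the 3 season values: (0 + 1) / (5 + 3)
print(smoothed_likelihood(0, 5, 3))  # 1/8
```

The smoothed likelihoods for Summer, Monsoon and Winter given Play = No become 3/8, 1/8 and 4/8, which still sum to 1.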
Model Validation
• We can go ahead and validate the predictions.
• We will populate the Confusion Matrix, which reports metrics such as accuracy, sensitivity, specificity and prevalence.
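The metrics named above follow directly from the four cells of a binary confusion matrix. The Python sketch below uses the standard definitions; the function name `confusion_metrics` and the counts are hypothetical and do not reproduce the demo's output.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Summary metrics for a binary confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,     # share of correct predictions
        "sensitivity": tp / (tp + fn),     # true positive rate
        "specificity": tn / (tn + fp),     # true negative rate
        "prevalence": (tp + fn) / total,   # share of actual positives
    }


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(name, value)
```

Which class counts as "positive" is a modelling choice and should be fixed before reading sensitivity and specificity.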
Prediction
• The final step in our project is to predict the salary of the employee using the Naive Bayes model that we have created.
• The prediction for our specific input is Low.
Summary
• What is Machine Learning?
• Introduction to Classification
• Classification Algorithms
• What is Naive Bayes?
• Use Cases of Naive Bayes
• Demo
Thank You …
Questions/Queries/Feedback

More Related Content

What's hot

Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Marina Santini
 
Statistical Pattern recognition(1)
Statistical Pattern recognition(1)Statistical Pattern recognition(1)
Statistical Pattern recognition(1)
Syed Atif Naseem
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
umeskath
 
Naive Bayes Classifier
Naive Bayes ClassifierNaive Bayes Classifier
Naive Bayes Classifier
Arunabha Saha
 
CART: Not only Classification and Regression Trees
CART: Not only Classification and Regression TreesCART: Not only Classification and Regression Trees
CART: Not only Classification and Regression Trees
Marc Garcia
 
Belief Networks & Bayesian Classification
Belief Networks & Bayesian ClassificationBelief Networks & Bayesian Classification
Belief Networks & Bayesian Classification
Adnan Masood
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
Sulman Ahmed
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
Functional Imperative
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
Knoldus Inc.
 
Decision Tree Learning
Decision Tree LearningDecision Tree Learning
Decision Tree Learning
Milind Gokhale
 
Bayes Belief Networks
Bayes Belief NetworksBayes Belief Networks
Bayes Belief Networks
Sai Kumar Kodam
 
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
Simplilearn
 
Naive Bayes Classifier
Naive Bayes ClassifierNaive Bayes Classifier
Naive Bayes Classifier
Yiqun Hu
 
End-to-End Machine Learning Project
End-to-End Machine Learning ProjectEnd-to-End Machine Learning Project
End-to-End Machine Learning Project
Eng Teong Cheah
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
Ashraf Uddin
 
Machine learning with scikitlearn
Machine learning with scikitlearnMachine learning with scikitlearn
Machine learning with scikitlearn
Pratap Dangeti
 
ML - Simple Linear Regression
ML - Simple Linear RegressionML - Simple Linear Regression
ML - Simple Linear Regression
Andrew Ferlitsch
 
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Simplilearn
 
Dcgan
DcganDcgan
Dcgan
Brian Kim
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
YashwantGahlot1
 

What's hot (20)

Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
 
Statistical Pattern recognition(1)
Statistical Pattern recognition(1)Statistical Pattern recognition(1)
Statistical Pattern recognition(1)
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
Naive Bayes Classifier
Naive Bayes ClassifierNaive Bayes Classifier
Naive Bayes Classifier
 
CART: Not only Classification and Regression Trees
CART: Not only Classification and Regression TreesCART: Not only Classification and Regression Trees
CART: Not only Classification and Regression Trees
 
Belief Networks & Bayesian Classification
Belief Networks & Bayesian ClassificationBelief Networks & Bayesian Classification
Belief Networks & Bayesian Classification
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
 
Decision Tree Learning
Decision Tree LearningDecision Tree Learning
Decision Tree Learning
 
Bayes Belief Networks
Bayes Belief NetworksBayes Belief Networks
Bayes Belief Networks
 
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
Logistic Regression | Logistic Regression In Python | Machine Learning Algori...
 
Naive Bayes Classifier
Naive Bayes ClassifierNaive Bayes Classifier
Naive Bayes Classifier
 
End-to-End Machine Learning Project
End-to-End Machine Learning ProjectEnd-to-End Machine Learning Project
End-to-End Machine Learning Project
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
Machine learning with scikitlearn
Machine learning with scikitlearnMachine learning with scikitlearn
Machine learning with scikitlearn
 
ML - Simple Linear Regression
ML - Simple Linear RegressionML - Simple Linear Regression
ML - Simple Linear Regression
 
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
 
Dcgan
DcganDcgan
Dcgan
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 

Similar to Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Bayes in R | Edureka

Naive.pdf
Naive.pdfNaive.pdf
Naive.pdf
MahimMajee
 
Naive_hehe.pptx
Naive_hehe.pptxNaive_hehe.pptx
Naive_hehe.pptx
MahimMajee
 
Naïve Bayes Machine Learning Classification with R Programming: A case study ...
Naïve Bayes Machine Learning Classification with R Programming: A case study ...Naïve Bayes Machine Learning Classification with R Programming: A case study ...
Naïve Bayes Machine Learning Classification with R Programming: A case study ...
SubmissionResearchpa
 
Navies bayes
Navies bayesNavies bayes
Navies bayes
HassanRaza323
 
Supervised algorithms
Supervised algorithmsSupervised algorithms
Supervised algorithms
Yassine Akhiat
 
Data Science Training | Data Science Tutorial | Data Science Certification | ...
Data Science Training | Data Science Tutorial | Data Science Certification | ...Data Science Training | Data Science Tutorial | Data Science Certification | ...
Data Science Training | Data Science Tutorial | Data Science Certification | ...
Edureka!
 
Unit-2.ppt
Unit-2.pptUnit-2.ppt
Unit-2.ppt
AshwaniShukla47
 
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Edureka!
 
Using Open Source Tools for Machine Learning
Using Open Source Tools for Machine LearningUsing Open Source Tools for Machine Learning
Using Open Source Tools for Machine Learning
All Things Open
 
NAIVE BAYES ALGORITHM
NAIVE BAYES ALGORITHMNAIVE BAYES ALGORITHM
NAIVE BAYES ALGORITHM
Rang Technologies
 
Machine learning algorithms
Machine learning algorithmsMachine learning algorithms
Machine learning algorithms
Shalitha Suranga
 
Calculus in Machine Learning
Calculus in Machine Learning Calculus in Machine Learning
Calculus in Machine Learning
Gokul Jayan
 
Data Mining the City - A (practical) introduction to Machine Learning
Data Mining the City - A (practical) introduction to Machine LearningData Mining the City - A (practical) introduction to Machine Learning
Data Mining the City - A (practical) introduction to Machine Learning
Danil Nagy
 
Barga Data Science lecture 7
Barga Data Science lecture 7Barga Data Science lecture 7
Barga Data Science lecture 7
Roger Barga
 
09learning.ppt
09learning.ppt09learning.ppt
09learning.ppt
ABINASHPADHY6
 
Learning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and Kaggle
Yvonne K. Matos
 
IRJET- Big Data and Bayes Theorem used Analyze the Student’s Performance in E...
IRJET- Big Data and Bayes Theorem used Analyze the Student’s Performance in E...IRJET- Big Data and Bayes Theorem used Analyze the Student’s Performance in E...
IRJET- Big Data and Bayes Theorem used Analyze the Student’s Performance in E...
IRJET Journal
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
Abdullah al Mamun
 
Data mining
Data mining Data mining
Data mining
Jhadesunil
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classication
IJECEIAES
 

Similar to Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Bayes in R | Edureka (20)

Naive.pdf
Naive.pdfNaive.pdf
Naive.pdf
 
Naive_hehe.pptx
Naive_hehe.pptxNaive_hehe.pptx
Naive_hehe.pptx
 
Naïve Bayes Machine Learning Classification with R Programming: A case study ...
Naïve Bayes Machine Learning Classification with R Programming: A case study ...Naïve Bayes Machine Learning Classification with R Programming: A case study ...
Naïve Bayes Machine Learning Classification with R Programming: A case study ...
 
Navies bayes
Navies bayesNavies bayes
Navies bayes
 
Supervised algorithms
Supervised algorithmsSupervised algorithms
Supervised algorithms
 
Data Science Training | Data Science Tutorial | Data Science Certification | ...
Data Science Training | Data Science Tutorial | Data Science Certification | ...Data Science Training | Data Science Tutorial | Data Science Certification | ...
Data Science Training | Data Science Tutorial | Data Science Certification | ...
 
Unit-2.ppt
Unit-2.pptUnit-2.ppt
Unit-2.ppt
 
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
 
Using Open Source Tools for Machine Learning
Using Open Source Tools for Machine LearningUsing Open Source Tools for Machine Learning
Using Open Source Tools for Machine Learning
 
NAIVE BAYES ALGORITHM
NAIVE BAYES ALGORITHMNAIVE BAYES ALGORITHM
NAIVE BAYES ALGORITHM
 
Machine learning algorithms
Machine learning algorithmsMachine learning algorithms
Machine learning algorithms
 
Calculus in Machine Learning
Calculus in Machine Learning Calculus in Machine Learning
Calculus in Machine Learning
 
Data Mining the City - A (practical) introduction to Machine Learning
Data Mining the City - A (practical) introduction to Machine LearningData Mining the City - A (practical) introduction to Machine Learning
Data Mining the City - A (practical) introduction to Machine Learning
 
Barga Data Science lecture 7
Barga Data Science lecture 7Barga Data Science lecture 7
Barga Data Science lecture 7
 
09learning.ppt
09learning.ppt09learning.ppt
09learning.ppt
 
Learning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and Kaggle
 
IRJET- Big Data and Bayes Theorem used Analyze the Student’s Performance in E...
IRJET- Big Data and Bayes Theorem used Analyze the Student’s Performance in E...IRJET- Big Data and Bayes Theorem used Analyze the Student’s Performance in E...
IRJET- Big Data and Bayes Theorem used Analyze the Student’s Performance in E...
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
 
Data mining
Data mining Data mining
Data mining
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classication
 

More from Edureka!

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
Edureka!
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
Edureka!
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
Edureka!
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
Edureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
Edureka!
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
Edureka!
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
Edureka!
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
Edureka!
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
Edureka!
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
Edureka!
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
Edureka!
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
Edureka!
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
Edureka!
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
Edureka!
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
Edureka!
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
Edureka!
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
Edureka!
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
Edureka!
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
Edureka!
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
Edureka!
 

More from Edureka! (20)

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
 

Naive Bayes Classifier Tutorial | Naive Bayes Classifier Example | Naive Bayes in R | Edureka

  • 2. What to expect?
      • What is Machine Learning?
      • Introduction to Classification
      • Classification Algorithms
      • What is Naive Bayes?
      • Use Cases of Naive Bayes
      • Demo – Employee Salary Prediction
  • 3. What is Machine Learning?
  • 4. What is Machine Learning?
      • Machine Learning explores the study and construction of algorithms that can learn from and make predictions on data.
      • It is closely related to computational statistics.
      • It is used to devise complex models and algorithms that lend themselves to prediction, which in commercial use is known as predictive analytics.
      • Examples: Speech Recognition, Face Recognition, Anti Virus, Weather Prediction
  • 5. Supervised vs Unsupervised Learning
      • Supervised Learning: Classification is the result of supervised learning, which means that there is a known label that you want the system to generate. E.g. if you build a fruit classifier, the labels will be “this is an orange”, “this is an apple” and “this is a banana”, based on showing the classifier examples of apples, oranges and bananas.
      • Unsupervised Learning: Clustering is the result of unsupervised learning, which means that you have seen lots of examples but don’t have labels. E.g. in the same example, a fruit clustering will produce groups such as “fruits with soft skin and lots of dimples”, “fruits with shiny hard skin” and “elongated yellow fruits”.
  • 6. Introduction to Classification
  • 7. Introduction to Classification
      • Classification is the problem of identifying which of a set of categories a new observation belongs to.
      • It is based on a training set of data containing observations whose category is already known.
      • Figure: Examples of Classification
  • 8. Classification Algorithms
  • 9. Classification Algorithms
      [Diagram: Classifier types: Quadratic, Linear, SVM, Logistic Regression, Naive Bayes, Neural Networks, Decision Trees, Kernel Estimation, Perceptron]
  • 10. What is Naive Bayes?
  • 11. What is Naive Bayes?
      Let us understand Naive Bayes with the help of an example.
      “Hi! I just cannot seem to figure out which are the best days to play football with my friends. Can you help me out?”
      All possible weather combinations: Season (Summer, Monsoon, Winter), Sunny or No Sun, Windy or No Wind
  • 12. What is Naive Bayes?
      “I have noted down all the days it was good/bad to play football and the combination of weather metrics on that particular day.”
      “That is perfect. We will be using the Naive Bayes algorithm to predict if you should play on a particular day or not.”
  • 13. What is Naive Bayes?
      Case 1 – Sunny
      Moving further, we can draw charts based on the probabilities of days favouring games.
      • We have categorized the probability to play into “High” (P > 0.5) and “Low” (P < 0.5).
      • Big circles represent “High”, i.e. probability greater than 0.5.
      • Small circles represent “Low”, i.e. probability less than 0.5.
      [Chart: Season (Summer, Monsoon, Winter) against Sunny (Yes, No)]
  • 14. What is Naive Bayes?
      Case 2 – Windy
      • The second attribute is the wind speed on a particular day.
      • Let us look at how wind affects the chances of playing football on a particular day.
      Here, we will look at days where there was wind and when it was good to play.
      [Chart: Season (Summer, Monsoon, Winter) against Windy (Yes, No)]
  • 15. What is Naive Bayes?
      Here, we have the complete set of attributes and whether to play on that day or not.
      [Chart, Sunny = No: Season (Summer, Monsoon, Winter) for (Sunny = No, Windy = Yes) and (Sunny = No, Windy = No)]
      [Chart, Sunny = Yes: Season (Summer, Monsoon, Winter) for (Sunny = Yes, Windy = Yes) and (Sunny = Yes, Windy = No)]
  • 16. What is Naive Bayes?
      Notice that in Summer it looks advisable to play when there is no sun. But the second graph shows a different picture: a Summer day that is not Sunny may have P > 0.5, yet when there is also no wind, the posterior probability is P < 0.5.
  • 17. What is Naive Bayes?
      • The Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
      • Bayes' theorem is stated mathematically as the following equation:
          P(A | B) = P(B | A) * P(A) / P(B)
        where A and B are events and P(B) ≠ 0.
  • 18. Understanding Bayes’ Theorem
      • Let us understand how Bayes’ Theorem can be used in the Naive Bayes classifier:
          P(c | x) = P(x | c) * P(c) / P(x)
        where P(c | x) is the Posterior Probability, P(x | c) is the Likelihood, P(c) is the Class Prior Probability and P(x) is the Predictor Prior Probability.
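As a minimal sketch of the formula above, the posterior is the likelihood times the class prior, divided by the predictor prior. Python is used here purely for illustration; the function name `posterior` and the input numbers are assumptions, not taken from the slides.

```python
from fractions import Fraction

def posterior(likelihood, class_prior, predictor_prior):
    """Bayes' theorem: P(c|x) = P(x|c) * P(c) / P(x)."""
    if predictor_prior == 0:
        raise ValueError("P(x) must be non-zero")
    return likelihood * class_prior / predictor_prior

# Illustrative numbers only (hypothetical, not from the slides):
# P(x|c) = 1/2, P(c) = 1/4, P(x) = 1/2
p = posterior(Fraction(1, 2), Fraction(1, 4), Fraction(1, 2))
print(p)  # 1/4
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to compare against hand calculations.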
  • 19. Understanding Bayes’ Theorem
      In Figure 1, we have the Posterior Probability of Sunny across seasons, excluding wind speed.
      In Figure 2, we have the Posterior Probabilities for the full attribute combinations (e.g. Sunny = No, Windy = Yes and Season = Summer).
  • 20. Understanding Bayes’ Theorem
      We can use the Naive Bayes classifier to predict whether to play football on (Season = Winter, Sunny = No, Windy = Yes). Our demo will help you clearly understand Naive Bayes.
  • 21. Understanding Bayes’ Theorem
      From the dataset we have obtained, we will populate frequency tables for each of the attributes.
      Frequency Table: Season
          Season    Play = Yes   Play = No
          Summer    3            2
          Monsoon   4            0
          Winter    2            3
      Frequency Table: Sunny
          Sunny     Play = Yes   Play = No
          Yes       3            4
          No        6            1
      Frequency Table: Windy
          Windy     Play = Yes   Play = No
          Yes       6            2
          No        3            3
  • 22. Understanding Bayes’ Theorem
      For each of the frequency tables, we will find the likelihoods for each of the cases. Here, c = Play and x = variables like Season, Sunny and Windy.
      Likelihood Table: Season
          Season    Play = Yes   Play = No   P(Season)
          Summer    3/9          2/5         5/14
          Monsoon   4/9          0/5         4/14
          Winter    2/9          3/5         5/14
          P(Play)   9/14         5/14
      The posterior probability of ‘Yes’ given Summer is:
          P(x | c) = P(Summer | Yes) = 3/9 = 0.33
          P(c) = P(Yes) = 9/14 = 0.64
          P(x) = P(Summer) = 5/14 = 0.36
          P(c | x) = P(Yes | Summer) = P(Summer | Yes) * P(Yes) / P(Summer) = (0.33 x 0.64) / 0.36 = 0.60
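The calculation above can be reproduced from the Season frequency counts. This is a small Python sketch for illustration (the variable names are assumptions); it derives the likelihood table from the counts and then applies Bayes' theorem for P(Yes | Summer).

```python
from fractions import Fraction

# Frequency table from the slides: season -> (count of Play=Yes, count of Play=No)
season_counts = {"Summer": (3, 2), "Monsoon": (4, 0), "Winter": (2, 3)}

total_yes = sum(yes for yes, _ in season_counts.values())  # 9
total_no = sum(no for _, no in season_counts.values())     # 5
total = total_yes + total_no                               # 14

# Likelihood table: season -> (P(season | Yes), P(season | No))
likelihood = {
    season: (Fraction(yes, total_yes), Fraction(no, total_no))
    for season, (yes, no) in season_counts.items()
}

p_yes = Fraction(total_yes, total)                         # P(Yes) = 9/14
p_summer = Fraction(sum(season_counts["Summer"]), total)   # P(Summer) = 5/14
p_summer_given_yes = likelihood["Summer"][0]               # P(Summer | Yes) = 3/9

p_yes_given_summer = p_summer_given_yes * p_yes / p_summer
print(p_yes_given_summer, float(p_yes_given_summer))  # 3/5 0.6
```

Working with exact fractions avoids the small rounding differences that appear when the intermediate values are rounded to two decimals.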
  • 23. Understanding Bayes’ Theorem
      Let us use the likelihood tables to predict whether to play football on (Season = Winter, Sunny = No, Windy = Yes):
          P(c | x) = P(Play = Yes | Winter, Sunny = No, Windy = Yes)
                   = [P(Winter | Yes) * P(Sunny = No | Yes) * P(Windy = Yes | Yes) * P(Yes)] / [P(Winter) * P(Sunny = No) * P(Windy = Yes)]
                   = [(2/9) * (6/9) * (6/9) * (9/14)] / [(5/14) * (7/14) * (8/14)]
                   = 0.6222
      Since the probability is greater than 0.5, we should play football on that day. Yayiee!!
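The same prediction can be checked with a short Python sketch built on the three frequency tables (function and variable names here are assumptions for illustration). It first follows the slide's approach of dividing by the product of the marginal probabilities. It then shows a common alternative, which is not on the slides: normalizing the two class scores so that they sum to 1. Both give a value above 0.5 for this day.

```python
from fractions import Fraction

# (Play=Yes count, Play=No count) per attribute value, from the frequency tables.
counts = {
    "Season": {"Summer": (3, 2), "Monsoon": (4, 0), "Winter": (2, 3)},
    "Sunny": {"Yes": (3, 4), "No": (6, 1)},
    "Windy": {"Yes": (6, 2), "No": (3, 3)},
}
N_YES, N_NO = 9, 5
N = N_YES + N_NO

def class_score(day, cls):
    """P(class) times the product of P(value | class) over the attributes in `day`."""
    idx, n_cls = (0, N_YES) if cls == "Yes" else (1, N_NO)
    score = Fraction(n_cls, N)
    for attr, value in day.items():
        score *= Fraction(counts[attr][value][idx], n_cls)
    return score

def evidence_product(day):
    """Product of the marginals P(value), as used in the slide's denominator."""
    prod = Fraction(1)
    for attr, value in day.items():
        prod *= Fraction(sum(counts[attr][value]), N)
    return prod

day = {"Season": "Winter", "Sunny": "No", "Windy": "Yes"}

# Slide's approach: divide by the product of the marginal probabilities.
slide_value = class_score(day, "Yes") / evidence_product(day)
print(round(float(slide_value), 4))  # 0.6222

# Alternative (not from the slides): normalize the two class scores.
yes, no = class_score(day, "Yes"), class_score(day, "No")
normalized_yes = yes / (yes + no)
print(round(float(normalized_yes), 4))  # 0.7874
```

With the slide's denominator, the values for Yes and No need not add up to 1, because the product of marginals only approximates the joint probability of the attributes. The normalized form avoids that, and the predicted class is the same in this case.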
  • 24. Use Cases of Naive Bayes
  • 25. Use Cases of Naive Bayes: Email Spam Detection, Categorizing News, Face Recognition, Sentiment Analysis
  • 26. Use Cases of Naive Bayes: Weather Prediction, Digit Recognition, Medical Diagnosis
  • 27. Demo – Employee Salary Prediction
  • 28. Demo – Problem Statement
      Problem Statement: To devise a model that predicts an employee’s salary from the given set of attributes using the Naive Bayes classifier.
      • We have an Employee Dataset with 14 attributes; our output variable is the Employee’s Salary.
      • We will use the Naive Bayes classifier to predict an Employee’s Salary as high (>50k) or low (<50k) by finding the probabilities for the given attribute combination.
  • 29. Demo – Employee Salary Prediction
      Steps: Data Acquisition → Feature Selection → Divide Dataset → Implement Model → Optimize Model → Model Validation → Prediction
  • 30. Demo – Employee Salary Prediction
          Field               Description
          Age_Of_emp          Age of the employee
          Emp_Stat_type       Type of the employment industry
          srnumber            Serial number of the employee
          Edu_of_Emp          Employee education details
          Edu_Cat             Employee’s education category
          marital_Status      Employee marital status
          Occ_Of_Emp          Job description of the employee
          Emp_rel_status      Employee relationship status
          Emp_race_type       Race of the employee
          sex_of_emp          Sex of the employee
          capital_gain        Income from investment sources apart from wages/salary
          capital_loss        Losses from investment sources apart from wages/salary
          Work_hour_in_week   Number of weekly working hours
          country_of_res      Country of residence
          Emp_sal             Employee’s salary
  • 31. Demo – Employee Salary Prediction
      • From the fields listed above, we need to filter out unnecessary columns which will not affect the Employee’s Salary.
      • We will be removing the fields srnumber, marital_Status, Emp_rel_status, Emp_race_type, sex_of_emp, capital_gain and capital_loss, because these are treated here as factors which do not affect a person’s salary.
      • The remaining fields will be used to build our model.
  • 32. Demo – Employee Salary Prediction
      We will divide our entire dataset into two subsets:
      • Training dataset -> To train the model
      • Testing dataset -> To validate and make predictions
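A split like this can be sketched in a few lines. The slides do not state the split ratio or the method used, so the 70/30 ratio, the fixed seed and the function name `train_test_split` below are all assumptions made for illustration, written in plain Python.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of `rows` and split it into (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                   # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)   # seeded for a reproducible split
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))  # placeholder records standing in for employee rows
train, test = train_test_split(rows)
print(len(train), len(test))  # 7 3
```

Shuffling before splitting matters when the file is ordered (for example by serial number), since otherwise the test set could differ systematically from the training set.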
  • 33. Demo – Employee Salary Prediction
      • We build the Naive Bayes model using the ‘e1071’ library on the training dataset that we just created.
      • The model is called emp_nb.
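The demo builds the model with R's e1071 library, and the R code is not part of this transcript. To show what fitting a categorical Naive Bayes model involves, here is a standalone Python sketch. It is not e1071's implementation; the class `TinyNaiveBayes`, its `laplace` parameter and the toy records are all hypothetical. Fitting counts the class priors and the per-feature value frequencies, and prediction multiplies them and normalizes.

```python
from collections import Counter, defaultdict

class TinyNaiveBayes:
    """Categorical Naive Bayes (illustrative sketch only)."""

    def __init__(self, laplace=0.0):
        self.laplace = laplace  # add-k smoothing constant; 0 means no smoothing

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[feature index][class label][feature value] -> count
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        self.levels = defaultdict(set)  # distinct values seen per feature
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[i][label][value] += 1
                self.levels[i].add(value)
        return self

    def predict_proba(self, row):
        scores = {}
        for cls, n_cls in self.class_counts.items():
            score = n_cls / self.total  # class prior P(c)
            for i, value in enumerate(row):
                count = self.value_counts[i][cls][value]
                denom = n_cls + self.laplace * len(self.levels[i])
                score *= (count + self.laplace) / denom  # P(x_i | c)
            scores[cls] = score
        norm = sum(scores.values())
        if norm == 0:
            raise ValueError("all class scores are zero; try laplace > 0")
        return {cls: s / norm for cls, s in scores.items()}

    def predict(self, row):
        proba = self.predict_proba(row)
        return max(proba, key=proba.get)

# Hypothetical toy records: (education category, weekly hours bucket)
train_rows = [
    ("Graduate", "40+"), ("Graduate", "40+"), ("School", "<40"),
    ("School", "40+"), ("Graduate", "<40"), ("School", "<40"),
]
train_labels = ["High", "High", "Low", "Low", "High", "Low"]

emp_nb = TinyNaiveBayes(laplace=1).fit(train_rows, train_labels)
print(emp_nb.predict(("Graduate", "40+")))  # High
```

The variable is named `emp_nb` only to echo the model name used in the demo; the real dataset has more fields and many more rows.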
  • 34. Demo – Employee Salary Prediction
      The following is the output from the emp_nb model:
      [Output: Likelihood of High & Low Salaries]
      [Output: Likelihood of Employee Department against High & Low Salaries]
  • 35. Demo – Employee Salary Prediction
      Optimizing a model refers to modifying it so as to achieve the highest accuracy.
      If the P-value is greater than 0.05, we should reject the model. Our P-value is less than 0.05, so our model is acceptable.
      Kappa is the value obtained by:
          Kappa = (totalAccuracy - randomAccuracy) / (1 - randomAccuracy)
      The Naive Bayes classifier can be further improved using the following steps:
      • Include Laplace Correction
      • Normalization
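To see why the Laplace correction helps, recall the football example: Monsoon never occurs with Play = No (0 out of 5), so any Monsoon day gets a probability of exactly zero for "No", whatever the other attributes say. A minimal Python sketch of add-one smoothing follows; the function name `smoothed_likelihoods` and the parameter `alpha` are assumptions for illustration.

```python
from fractions import Fraction

def smoothed_likelihoods(counts, alpha=1):
    """Return P(value | class) with add-alpha (Laplace) smoothing.

    `counts` maps each attribute value to its count within one class.
    """
    k = len(counts)            # number of distinct attribute values
    n = sum(counts.values())   # number of observations in the class
    return {value: Fraction(c + alpha, n + alpha * k) for value, c in counts.items()}

# Play = No counts per season, from the Season frequency table.
no_counts = {"Summer": 2, "Monsoon": 0, "Winter": 3}

print(smoothed_likelihoods(no_counts, alpha=0)["Monsoon"])  # 0
print(smoothed_likelihoods(no_counts)["Monsoon"])           # 1/8
```

After smoothing, the three likelihoods still sum to 1, but none of them is zero, so a single unseen combination can no longer wipe out a class.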
  • 36. Demo – Employee Salary Prediction
      • We can go ahead and validate the predictions.
      • We will populate the Confusion Matrix, which reports metrics such as accuracy, sensitivity, specificity and prevalence.
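The metrics named above, together with the Kappa formula quoted in the optimization step, can all be computed from the four cells of a two-class confusion matrix. The sketch below is in Python for illustration; the function name and the counts are hypothetical and are not the demo's actual results. It assumes each class appears at least once, so no denominator is zero.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute common metrics from a 2x2 confusion matrix.

    tp/fn: actual positives predicted positive/negative
    fp/tn: actual negatives predicted positive/negative
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)     # true positive rate
    specificity = tn / (tn + fp)     # true negative rate
    prevalence = (tp + fn) / total   # share of actual positives
    # Accuracy expected by chance, given the row and column totals.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((fn + tn) / total) * ((fp + tn) / total)
    random_accuracy = p_pos + p_neg
    kappa = (accuracy - random_accuracy) / (1 - random_accuracy)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "prevalence": prevalence,
        "kappa": kappa,
    }

# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
print(round(metrics["accuracy"], 3), round(metrics["kappa"], 3))  # 0.7 0.4
```

Kappa is lower than accuracy here because half of the agreement (0.5) would be expected by chance alone with these row and column totals.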
  • 37. Demo – Employee Salary Prediction
      • The final step in our project is to predict the salary of an employee using the Naive Bayes model that we have created.
      • The prediction for our specific input is Low.
  • 38. Summary
  • 39. Summary: What is Machine Learning?, Introduction to Classification, Classification Algorithms, What is Naive Bayes?, Use Cases of Naive Bayes, Demo
  • 40. Thank You … Questions/Queries/Feedback