Introduction
Naive Bayes is a family of probabilistic algorithms based on Bayes' Theorem. It’s particularly suited for classification tasks and is widely used in natural language processing (NLP), spam detection, sentiment analysis, and recommendation systems.
The term “naive” comes from the assumption that features are conditionally independent given the class label, which simplifies the computation but is rarely true in real-world data. Despite this simplification, Naive Bayes often performs surprisingly well.
Bayes’ Theorem
At the core of Naive Bayes lies Bayes' Theorem:
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
In classification terms:
P(A∣B): Posterior probability of class A given data B.
P(B∣A): Likelihood of data B given class A.
P(A): Prior probability of class A.
P(B): Probability of data B (the evidence).
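To make these pieces concrete, here is a minimal sketch in Python for a spam-filtering scenario; the prior, likelihood, and evidence values are made up purely for illustration.

```python
# Hypothetical numbers for illustration only: probability that an email is
# spam given that it contains the word "free".
p_spam = 0.2              # P(A): prior probability of the spam class
p_free_given_spam = 0.6   # P(B|A): likelihood of seeing "free" in spam
p_free = 0.16             # P(B): overall probability of seeing "free"

# Bayes' Theorem: posterior = likelihood * prior / evidence
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(p_spam_given_free)  # 0.75
```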
Naive Bayes Classifier: Principle
Given a set of features X = {x_1, x_2, ..., x_n} and a class variable C, Naive Bayes calculates the probability of each class C_i given the features and selects the class with the highest probability:
\hat{C} = \arg\max_{C_i} \; P(C_i) \prod_{j=1}^{n} P(x_j \mid C_i)
This formula assumes that all features x_j are independent given the class C_i.
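As a minimal sketch of this decision rule, the snippet below assumes the priors P(C_i) and the per-feature likelihoods P(x_j ∣ C_i) have already been estimated; all numbers are invented, and log-probabilities are summed instead of multiplying raw probabilities to avoid numerical underflow.

```python
import math

priors = {"spam": 0.4, "ham": 0.6}                     # P(C_i), made-up values
likelihoods = {                                        # P(x_j | C_i), made-up values
    "spam": {"free": 0.30, "meeting": 0.05},
    "ham":  {"free": 0.02, "meeting": 0.20},
}

def predict(features):
    """Return the class maximising P(C_i) * prod_j P(x_j | C_i)."""
    best_class, best_score = None, float("-inf")
    for c, prior in priors.items():
        score = math.log(prior)
        for x in features:
            score += math.log(likelihoods[c][x])
        if score > best_score:
            best_class, best_score = c, score
    return best_class

print(predict(["free"]))     # -> spam
print(predict(["meeting"]))  # -> ham
```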
Types of Naive Bayes Classifiers
1. Gaussian Naive Bayes
Used when features are continuous and assumed to follow a Gaussian (normal) distribution.
P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right)
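A minimal sketch of this density in Python, assuming the class mean μ_y and variance σ_y² have already been estimated from the training data (the values below are invented):

```python
import math

def gaussian_likelihood(x, mu, sigma2):
    """P(x | y) under the Gaussian assumption, with per-class mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Example with made-up statistics for one feature of one class.
print(gaussian_likelihood(5.1, mu=5.0, sigma2=0.04))
```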
2. Multinomial Naive Bayes
Used for discrete features such as word counts in a document (common in text classification).
3. Bernoulli Naive Bayes
Used when features are binary (presence/absence of a feature, e.g., word in a document).
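As a quick illustration of which variant fits which feature type, here is a small scikit-learn sketch; the toy arrays are invented and serve only to show the expected kind of input for each classifier.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

# Toy data invented for illustration: two classes, three features.
X_cont = np.array([[5.1, 3.5, 1.4], [6.2, 2.9, 4.3],
                   [5.0, 3.4, 1.5], [6.0, 2.7, 4.1]])   # continuous measurements
X_counts = np.array([[3, 0, 1], [0, 2, 4],
                     [2, 1, 0], [0, 3, 5]])              # e.g. word counts
X_binary = (X_counts > 0).astype(int)                    # presence/absence of each word
y = np.array([0, 1, 0, 1])

GaussianNB().fit(X_cont, y)       # continuous, roughly Gaussian features
MultinomialNB().fit(X_counts, y)  # discrete counts (text classification)
BernoulliNB().fit(X_binary, y)    # binary features
```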
Applications of Naive Bayes
Text Classification: Spam filtering, sentiment analysis.
Medical Diagnosis: Predicting diseases based on symptoms.
Document Categorization: Grouping articles into categories.
Recommendation Systems: Predicting user preferences.
Credit Scoring: Classifying customers as risky or not.
Advantages of Naive Bayes
Simple and easy to implement.
Requires relatively little training data compared to more complex models.
Handles both continuous and discrete data.
Works well with high-dimensional data.
Robust to irrelevant features.
Disadvantages
Assumes feature independence.
Performance may degrade if features are highly correlated.
Assigns zero probability to feature values not seen during training (handled using Laplace smoothing).
Laplace Smoothing
To avoid assigning zero probability to unseen features:
P(x_i \mid y) = \frac{\text{count}(x_i, y) + 1}{\text{count}(y) + |X|}
Where |X| is the total number of unique features.
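A minimal sketch of add-one smoothing with made-up word counts for a single class, showing how an unseen feature avoids a zero probability:

```python
# count(x_i, y) for one class y; "meeting" was never seen in this class.
word_counts = {"free": 3, "offer": 2, "meeting": 0}
total = sum(word_counts.values())   # count(y)
vocab_size = len(word_counts)       # |X|

for word, c in word_counts.items():
    unsmoothed = c / total if total else 0.0
    smoothed = (c + 1) / (total + vocab_size)
    print(f"{word}: unsmoothed={unsmoothed:.3f}, smoothed={smoothed:.3f}")
```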
Model Training and Prediction
Training Phase:
Compute the prior probabilities P(C) for each class.
Compute the conditional probabilities P(x_j ∣ C) for each feature given each class.