SlideShare a Scribd company logo
1 of 3
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 433
Implementation of Spam Classifier using Naïve Bayes Algorithm
Ajay Gangare1, Jitesh Rathore2, Akash Kumar Tadge3, Abhinav Shrivastav4, Rashmi Yadav5,
Pankaj Singh Sisodiya6
1-6Dept. of Computer Science and Engineering, Shri Balaji Institute of Technology and Management, Betul, M.P
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - Spam involves sending someone unwanted
messages. Currently the internet is the biggest platformtoget
some information, also social media is going to be very
popular nowadays. Because ofthat, manyspammerswilltryto
mislead users by sending lots of spam messages. And because
of spam messages, there are lots of problems, fraud occurs. So
we want to filter messages into spam or ham. To classify the
messages as spam or not spam we are using machine learning
(the multinomial naïve Bayes classifier algorithm) and
CountVectorizer provided by Scikit-learn library in python
programming. First, we collect the datasets and convert them
into numerical data (matrix) by CountVectorizer and then we
apply the naïve Bayes algorithm on datasets for classification
purposes.
Key Words: naive Bayes algorithm, CountVectorizer,
machine learning, Bayesian classification.
1.INTRODUCTION
In today’s world, since social media and the internet is very
popular, so there are lots of people using abusing messages
or shady comments, also many spammers will send a bulk of
spam messages like they send (malicious spam) fake link to
us and when we click on the link, then they get the access of
our information, because of that we may get into trouble.
Many organizations and people could face financial loss.
Some of spamming include unwanted advertisement of the
product, sometimes it becomes very irritating for people.
Some of the spammers are doing the work of spreading
computer viruses.
1.1 Problem Statement
Every day, we receive tons of junk messages, emails, some
spam messages are just marketing/advertising messages
while some are fake spam messages. People are
annoyed/irritated because of such spam messages they
receive. The main objective of the problem is to classify the
comments into spam or not spam. And in our project , we are
designing 3 separate modules of spam classifier for social
media spam, SMS spam, and email spam.
2. LITERATURE SURVEY
There are many types of spam, such as web spam, short
message spam, email spam. social network spamandothers.
In this paper, we are focusing on social media, mobile SMS
spam, and email spam.
2.1 Existing System
Data mining techniques are commonly used for spam
classification. In our paper, we are using machine learning
algorithm to automate the process of spam detection.
2.2 Proposed System
In our proposed system, count vectorizer is used for
extracting features from a given dataset, and a design model
is used for generating tests and training sets from given
dataset. Then the naive bayes classifier is trained in training
data. And the proposed system will say given data isspamor
not.
3. Naïve Bayes
Naïve Bayes classifiers are a type of linear classifiers. Naïve
Bayes algorithm is a very simple algorithm used for the
classification purpose, and the naïve bayesclassifierisbased
on a probabilistic model, the base of the naïve bayes
algorithm is the Bayesian theorem. Naïve Bayes algorithm
will calculate the probability of input word based on the
probability of past (trained dataset) and classify input as
spam or not spam. Naïve bayes algorithm uses the formula
for classifying input message as spam or not spam is given
below.
Fig -3.1: Formula of Bayesian theorem
In this article [7] the author Vinoth introduced (in above
picture) that
Above,
P(c|x) is the posterior probability of class (c, target)
given predictor (x, attributes).
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 434
P(c) is the prior probability of class.
P(x|c) is the likelihood which is the probability
of predictor given class.
P(x) is the prior probability of predictor.
4. Count Vectorizer
CountVectorizer is a tool in python programming. Since we
can only apply our machine learning algorithm (naïve bayes
algorithm) to numerical data, we need to convert our text
message into numerical data, for that we use
CountVectorizer. Now CountVectorizer will analyze the text
data and create a matrix and in the matrix, each row
represents the certain text data and each column represents
a unique word. The count of the word in certain text data is
considered as the value of each cell. As a result, we get a
vector that gives (specify) the length of the entire text data.
5. Flowchart
Fig 5.1: Flowchart of proposed system
6. METHODOLOGY AND IMPLEMENTATION
We are going to build our project (spam classifier
application) in 4 phases
(i) Collection of Datasets (CSV files)
(ii) Pre-Processing of the dataset
(iii) Features Selection and Extraction
(iv) Classification phase
(i) Data Collection
In the dataset collection phase, the dataset that isgoingto be
used for the training dataset we downloaded through API
contained 9 CSV files of email spam and testing for machine
learning algorithm is downloaded from the website UCI
machine learning repository, The, mobile SMS spam, social
media spam respectively.
(ii) Pre – processing
In the pre-processing phase, we take the collected dataset
and we will preprocess it, preprocess means we deletesome
unwanted information (data) from the datasets like we
delete some columns and rows that are not needed
(important). If the collected dataset is not cleaned, then we
have to perform certain operations to make it clean. In the
pre-processing phase, if the same words are occurring
multiple times then we consider it only once.
(iii) Feature Selection and Extraction
After preprocessing phase, we need to find the feature from
the preprocessed dataset, so that the naïve bayes algorithm
can be run on the feature. First, we analyzethe preprocessed
dataset and we use several feature selection methodstofind
the best feature that will be well suited to train dataand give
the best result.
(iv) Classification
In the classification phase we have to divide our data
(dataset) into 2 phases – (a) Training Phase (b) Testing
Phase. 60% of data is used for the training phase and 40% of
data is used for the testing phase. After completing the step
feature selection and extraction, we get the feature and the
feature that is selected is considered spam. In this
classification phase, our datasetswill betrainedbasedon the
naïve bayes algorithm. For training the dataset we use a
linear classification algorithm that is naïve bayes algorithm.
The feature is selected in the featureselectionand extraction
phase and we use the naïve bayes algorithm to run on the
features so that output is produced. There are many
classifiers available for the naïve bayes algorithm, but in our
article, we are using the naïve bayes multinomial classifier
for classification purposes because the multinomial naïve
bayes algorithm gives higher accuracy.
7. PROPOSED SYSTEM
There are many existing systems available that are based on
data mining and machine learning techniques, but our
proposed system will be based on a machine learning
algorithm (naïve bayes algorithm) which is based on the
Bayesian theorem. The proposed system will calculate the
probabilities of spam and not spam messages,andaccording
to the probabilities, it will produce the result.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 435
8. RESULT
After Creating the model (completion of the project) we can
predict whether the comment is SPAM or NOT We are able
to classify the messages as spam or non-spam. After
implementing algorithm (model) on dataset , after testing
our model ,the efficiency of our model will be up to 94%.
9. CONCLUSION
In this article, we developed the spam classifier application
(model) by using a machine learning algorithm. there is
always the issue of security in the social media platform. The
proposed system is used to identifythespamandclassifythe
messages into spam and not spamcategoriesusingtheNaıve
Bayes technique. But if we use the Support Vector Machine
for training dataset along with naïve bayes algorithm then,
we can get better efficiency. After testing various test cases
we conclude that the efficiency of our proposed system
(model) will be up to 94%. We can improve the efficiency of
our model by doing some improvements like using “tfidf
vectorizer” etc.
REFERENCES
[1] Nurul Fitriah Rusland, Norfaradilla Wahid, Shahreen
Kasim, Hanayanti Hafi 2017 IOP Conf. Ser.: Mater. Sci. Eng.
226 012091
[2] D. Sen, C. Das and S. Chakraborty, “A New Machine
Learning based Approach for Text Spam Filtering
Technique”, Communications on Applied Electronics, Vol. 6,
No. 10, pp. 28-34, 2017.
[3] H. Sajedi, G.Z. Parast and F. Akbari, “SMS Spam Filtering
using Machine Learning Techniques: A Survey”, Machine
Learning Research, Vol. 1, No. 1, pp. 1-14, 2016.
[4] Rizky et al. “The Effect of Best First and Spread
subsample on Selection of a Feature Wrapper with Naïve
Bayes Classifier for Classification of Ratio of Inpatients”.
Scientific Journal of Informatics.
[5] Anjana Kumari,” Study on Naive Bayesian Classifier and
its relation to Information Gain,” International Journal on
Recent and Innovation Trends in Computing and
Communication, Volume: 2, Issue- 3, March 2014, pp.601 –
603.
[6] Rushdi Shams and Robert Mercer,” Classifying Spam
Emails using Text and Readability Features,” IEEE 13th
International Conference on Data Mining (ICDM), 2013, pp.
657-666.
[7] Naive Bayes Algorithm, 2019.
https://huntdatascience.wordpress.com/2019/09/24/naive
-bayes-algorithm/. Accessed September 24, 2019.

More Related Content

Similar to Implementation of Spam Classifier using Naïve Bayes Algorithm

Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated ProcessIRJET Journal
 
Online java compiler with security editor
Online java compiler with security editorOnline java compiler with security editor
Online java compiler with security editorIRJET Journal
 
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHMEMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHMIRJET Journal
 
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET -  	  Recommendation System using Big Data Mining on Social NetworksIRJET -  	  Recommendation System using Big Data Mining on Social Networks
IRJET - Recommendation System using Big Data Mining on Social NetworksIRJET Journal
 
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
IRJET -  	  Cognitive based Emotion Analysis of a Child Reading a BookIRJET -  	  Cognitive based Emotion Analysis of a Child Reading a Book
IRJET - Cognitive based Emotion Analysis of a Child Reading a BookIRJET Journal
 
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660syaidatulamirah
 
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET-  	  Analysis of Brand Value Prediction based on Social Media DataIRJET-  	  Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media DataIRJET Journal
 
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...IRJET Journal
 
IRJET- Automated CV Classification using Clustering Technique
IRJET- Automated CV Classification using Clustering TechniqueIRJET- Automated CV Classification using Clustering Technique
IRJET- Automated CV Classification using Clustering TechniqueIRJET Journal
 
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET Journal
 
Intelligent Spam Mail Detection System
Intelligent Spam Mail Detection SystemIntelligent Spam Mail Detection System
Intelligent Spam Mail Detection SystemIRJET Journal
 
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning AlgorithmsIRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning AlgorithmsIRJET Journal
 
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMA Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMIRJET Journal
 
IRJET- Detecting Phishing Websites using Machine Learning
IRJET- Detecting Phishing Websites using Machine LearningIRJET- Detecting Phishing Websites using Machine Learning
IRJET- Detecting Phishing Websites using Machine LearningIRJET Journal
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...IRJET Journal
 
trialFinal report7th sem.pdf
trialFinal report7th sem.pdftrialFinal report7th sem.pdf
trialFinal report7th sem.pdfUMAPATEL34
 
A Real Time Advance Automated Attendance System using Face-Net Algorithm
A Real Time Advance Automated Attendance System using Face-Net AlgorithmA Real Time Advance Automated Attendance System using Face-Net Algorithm
A Real Time Advance Automated Attendance System using Face-Net AlgorithmIRJET Journal
 
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
Analysis on Fraud Detection Mechanisms Using Machine Learning TechniquesAnalysis on Fraud Detection Mechanisms Using Machine Learning Techniques
Analysis on Fraud Detection Mechanisms Using Machine Learning TechniquesIRJET Journal
 
IRJET- Advanced Phishing Identification Technique using Machine Learning
IRJET-  	  Advanced Phishing Identification Technique using Machine LearningIRJET-  	  Advanced Phishing Identification Technique using Machine Learning
IRJET- Advanced Phishing Identification Technique using Machine LearningIRJET Journal
 
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningEmail Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningIRJET Journal
 

Similar to Implementation of Spam Classifier using Naïve Bayes Algorithm (20)

Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated Process
 
Online java compiler with security editor
Online java compiler with security editorOnline java compiler with security editor
Online java compiler with security editor
 
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHMEMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
 
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET -  	  Recommendation System using Big Data Mining on Social NetworksIRJET -  	  Recommendation System using Big Data Mining on Social Networks
IRJET - Recommendation System using Big Data Mining on Social Networks
 
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
IRJET -  	  Cognitive based Emotion Analysis of a Child Reading a BookIRJET -  	  Cognitive based Emotion Analysis of a Child Reading a Book
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
 
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660
 
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET-  	  Analysis of Brand Value Prediction based on Social Media DataIRJET-  	  Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media Data
 
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
 
IRJET- Automated CV Classification using Clustering Technique
IRJET- Automated CV Classification using Clustering TechniqueIRJET- Automated CV Classification using Clustering Technique
IRJET- Automated CV Classification using Clustering Technique
 
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
 
Intelligent Spam Mail Detection System
Intelligent Spam Mail Detection SystemIntelligent Spam Mail Detection System
Intelligent Spam Mail Detection System
 
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning AlgorithmsIRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
 
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMA Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVM
 
IRJET- Detecting Phishing Websites using Machine Learning
IRJET- Detecting Phishing Websites using Machine LearningIRJET- Detecting Phishing Websites using Machine Learning
IRJET- Detecting Phishing Websites using Machine Learning
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
 
trialFinal report7th sem.pdf
trialFinal report7th sem.pdftrialFinal report7th sem.pdf
trialFinal report7th sem.pdf
 
A Real Time Advance Automated Attendance System using Face-Net Algorithm
A Real Time Advance Automated Attendance System using Face-Net AlgorithmA Real Time Advance Automated Attendance System using Face-Net Algorithm
A Real Time Advance Automated Attendance System using Face-Net Algorithm
 
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
Analysis on Fraud Detection Mechanisms Using Machine Learning TechniquesAnalysis on Fraud Detection Mechanisms Using Machine Learning Techniques
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
 
IRJET- Advanced Phishing Identification Technique using Machine Learning
IRJET-  	  Advanced Phishing Identification Technique using Machine LearningIRJET-  	  Advanced Phishing Identification Technique using Machine Learning
IRJET- Advanced Phishing Identification Technique using Machine Learning
 
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningEmail Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture designssuser87fa0c1
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 

Recently uploaded (20)

Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture design
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 

Implementation of Spam Classifier using Naïve Bayes Algorithm

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 433 Implementation of Spam Classifier using Naïve Bayes Algorithm Ajay Gangare1, Jitesh Rathore2, Akash Kumar Tadge3, Abhinav Shrivastav4, Rashmi Yadav5, Pankaj Singh Sisodiya6 1-6Dept. of Computer Science and Engineering, Shri Balaji Institute of Technology and Management, Betul, M.P ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - Spam involves sending someone unwanted messages. Currently the internet is the biggest platformtoget some information, also social media is going to be very popular nowadays. Because ofthat, manyspammerswilltryto mislead users by sending lots of spam messages. And because of spam messages, there are lots of problems, fraud occurs. So we want to filter messages into spam or ham. To classify the messages as spam or not spam we are using machine learning (the multinomial naïve Bayes classifier algorithm) and CountVectorizer provided by Scikit-learn library in python programming. First, we collect the datasets and convert them into numerical data (matrix) by CountVectorizer and then we apply the naïve Bayes algorithm on datasets for classification purposes. Key Words: naive Bayes algorithm, CountVectorizer, machine learning, Bayesian classification. 1.INTRODUCTION In today’s world, since social media and the internet is very popular, so there are lots of people using abusing messages or shady comments, also many spammers will send a bulk of spam messages like they send (malicious spam) fake link to us and when we click on the link, then they get the access of our information, because of that we may get into trouble. Many organizations and people could face financial loss. Some of spamming include unwanted advertisement of the product, sometimes it becomes very irritating for people. Some of the spammers are doing the work of spreading computer viruses. 1.1 Problem Statement Every day, we receive tons of junk messages, emails, some spam messages are just marketing/advertising messages while some are fake spam messages. People are annoyed/irritated because of such spam messages they receive. The main objective of the problem is to classify the comments into spam or not spam. And in our project , we are designing 3 separate modules of spam classifier for social media spam, SMS spam, and email spam. 2. LITERATURE SURVEY There are many types of spam, such as web spam, short message spam, email spam. social network spamandothers. In this paper, we are focusing on social media, mobile SMS spam, and email spam. 2.1 Existing System Data mining techniques are commonly used for spam classification. In our paper, we are using machine learning algorithm to automate the process of spam detection. 2.2 Proposed System In our proposed system, count vectorizer is used for extracting features from a given dataset, and a design model is used for generating tests and training sets from given dataset. Then the naive bayes classifier is trained in training data. And the proposed system will say given data isspamor not. 3. Naïve Bayes Naïve Bayes classifiers are a type of linear classifiers. Naïve Bayes algorithm is a very simple algorithm used for the classification purpose, and the naïve bayesclassifierisbased on a probabilistic model, the base of the naïve bayes algorithm is the Bayesian theorem. Naïve Bayes algorithm will calculate the probability of input word based on the probability of past (trained dataset) and classify input as spam or not spam. Naïve bayes algorithm uses the formula for classifying input message as spam or not spam is given below. Fig -3.1: Formula of Bayesian theorem In this article [7] the author Vinoth introduced (in above picture) that Above, P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 434 P(c) is the prior probability of class. P(x|c) is the likelihood which is the probability of predictor given class. P(x) is the prior probability of predictor. 4. Count Vectorizer CountVectorizer is a tool in python programming. Since we can only apply our machine learning algorithm (naïve bayes algorithm) to numerical data, we need to convert our text message into numerical data, for that we use CountVectorizer. Now CountVectorizer will analyze the text data and create a matrix and in the matrix, each row represents the certain text data and each column represents a unique word. The count of the word in certain text data is considered as the value of each cell. As a result, we get a vector that gives (specify) the length of the entire text data. 5. Flowchart Fig 5.1: Flowchart of proposed system 6. METHODOLOGY AND IMPLEMENTATION We are going to build our project (spam classifier application) in 4 phases (i) Collection of Datasets (CSV files) (ii) Pre-Processing of the dataset (iii) Features Selection and Extraction (iv) Classification phase (i) Data Collection In the dataset collection phase, the dataset that isgoingto be used for the training dataset we downloaded through API contained 9 CSV files of email spam and testing for machine learning algorithm is downloaded from the website UCI machine learning repository, The, mobile SMS spam, social media spam respectively. (ii) Pre – processing In the pre-processing phase, we take the collected dataset and we will preprocess it, preprocess means we deletesome unwanted information (data) from the datasets like we delete some columns and rows that are not needed (important). If the collected dataset is not cleaned, then we have to perform certain operations to make it clean. In the pre-processing phase, if the same words are occurring multiple times then we consider it only once. (iii) Feature Selection and Extraction After preprocessing phase, we need to find the feature from the preprocessed dataset, so that the naïve bayes algorithm can be run on the feature. First, we analyzethe preprocessed dataset and we use several feature selection methodstofind the best feature that will be well suited to train dataand give the best result. (iv) Classification In the classification phase we have to divide our data (dataset) into 2 phases – (a) Training Phase (b) Testing Phase. 60% of data is used for the training phase and 40% of data is used for the testing phase. After completing the step feature selection and extraction, we get the feature and the feature that is selected is considered spam. In this classification phase, our datasetswill betrainedbasedon the naïve bayes algorithm. For training the dataset we use a linear classification algorithm that is naïve bayes algorithm. The feature is selected in the featureselectionand extraction phase and we use the naïve bayes algorithm to run on the features so that output is produced. There are many classifiers available for the naïve bayes algorithm, but in our article, we are using the naïve bayes multinomial classifier for classification purposes because the multinomial naïve bayes algorithm gives higher accuracy. 7. PROPOSED SYSTEM There are many existing systems available that are based on data mining and machine learning techniques, but our proposed system will be based on a machine learning algorithm (naïve bayes algorithm) which is based on the Bayesian theorem. The proposed system will calculate the probabilities of spam and not spam messages,andaccording to the probabilities, it will produce the result.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 435 8. RESULT After Creating the model (completion of the project) we can predict whether the comment is SPAM or NOT We are able to classify the messages as spam or non-spam. After implementing algorithm (model) on dataset , after testing our model ,the efficiency of our model will be up to 94%. 9. CONCLUSION In this article, we developed the spam classifier application (model) by using a machine learning algorithm. there is always the issue of security in the social media platform. The proposed system is used to identifythespamandclassifythe messages into spam and not spamcategoriesusingtheNaıve Bayes technique. But if we use the Support Vector Machine for training dataset along with naïve bayes algorithm then, we can get better efficiency. After testing various test cases we conclude that the efficiency of our proposed system (model) will be up to 94%. We can improve the efficiency of our model by doing some improvements like using “tfidf vectorizer” etc. REFERENCES [1] Nurul Fitriah Rusland, Norfaradilla Wahid, Shahreen Kasim, Hanayanti Hafi 2017 IOP Conf. Ser.: Mater. Sci. Eng. 226 012091 [2] D. Sen, C. Das and S. Chakraborty, “A New Machine Learning based Approach for Text Spam Filtering Technique”, Communications on Applied Electronics, Vol. 6, No. 10, pp. 28-34, 2017. [3] H. Sajedi, G.Z. Parast and F. Akbari, “SMS Spam Filtering using Machine Learning Techniques: A Survey”, Machine Learning Research, Vol. 1, No. 1, pp. 1-14, 2016. [4] Rizky et al. “The Effect of Best First and Spread subsample on Selection of a Feature Wrapper with Naïve Bayes Classifier for Classification of Ratio of Inpatients”. Scientific Journal of Informatics. [5] Anjana Kumari,” Study on Naive Bayesian Classifier and its relation to Information Gain,” International Journal on Recent and Innovation Trends in Computing and Communication, Volume: 2, Issue- 3, March 2014, pp.601 – 603. [6] Rushdi Shams and Robert Mercer,” Classifying Spam Emails using Text and Readability Features,” IEEE 13th International Conference on Data Mining (ICDM), 2013, pp. 657-666. [7] Naive Bayes Algorithm, 2019. https://huntdatascience.wordpress.com/2019/09/24/naive -bayes-algorithm/. Accessed September 24, 2019.