Submit Search
Upload
Implementation of Spam Classifier using Naïve Bayes Algorithm
•
0 likes
•
5 views
IRJET Journal
Follow
https://www.irjet.net/archives/V9/i2/IRJET-V9I272.pdf
Read less
Read more
Engineering
Report
Share
Report
Share
1 of 3
Download now
Download to read offline
Recommended
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
IRJET Journal
IRJET- Machine Learning based Network Security
IRJET- Machine Learning based Network Security
IRJET Journal
PHISHING URL DETECTION USING MACHINE LEARNING
PHISHING URL DETECTION USING MACHINE LEARNING
IRJET Journal
IRJET- Analysis and Detection of E-Mail Phishing using Pyspark
IRJET- Analysis and Detection of E-Mail Phishing using Pyspark
IRJET Journal
IRJET - Survey on Malware Detection using Deep Learning Methods
IRJET - Survey on Malware Detection using Deep Learning Methods
IRJET Journal
IRJET - Fake News Detection using Machine Learning
IRJET - Fake News Detection using Machine Learning
IRJET Journal
Sentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVM
IRJET Journal
E-Mail Spam Detection Using Supportive Vector Machine
E-Mail Spam Detection Using Supportive Vector Machine
IRJET Journal
Recommended
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
IRJET Journal
IRJET- Machine Learning based Network Security
IRJET- Machine Learning based Network Security
IRJET Journal
PHISHING URL DETECTION USING MACHINE LEARNING
PHISHING URL DETECTION USING MACHINE LEARNING
IRJET Journal
IRJET- Analysis and Detection of E-Mail Phishing using Pyspark
IRJET- Analysis and Detection of E-Mail Phishing using Pyspark
IRJET Journal
IRJET - Survey on Malware Detection using Deep Learning Methods
IRJET - Survey on Malware Detection using Deep Learning Methods
IRJET Journal
IRJET - Fake News Detection using Machine Learning
IRJET - Fake News Detection using Machine Learning
IRJET Journal
Sentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVM
IRJET Journal
E-Mail Spam Detection Using Supportive Vector Machine
E-Mail Spam Detection Using Supportive Vector Machine
IRJET Journal
Bug Triage: An Automated Process
Bug Triage: An Automated Process
IRJET Journal
Online java compiler with security editor
Online java compiler with security editor
IRJET Journal
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
IRJET Journal
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET Journal
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
IRJET Journal
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660
syaidatulamirah
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET Journal
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
IRJET Journal
IRJET- Automated CV Classification using Clustering Technique
IRJET- Automated CV Classification using Clustering Technique
IRJET Journal
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET Journal
Intelligent Spam Mail Detection System
Intelligent Spam Mail Detection System
IRJET Journal
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET Journal
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVM
IRJET Journal
IRJET- Detecting Phishing Websites using Machine Learning
IRJET- Detecting Phishing Websites using Machine Learning
IRJET Journal
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
IRJET Journal
trialFinal report7th sem.pdf
trialFinal report7th sem.pdf
UMAPATEL34
A Real Time Advance Automated Attendance System using Face-Net Algorithm
A Real Time Advance Automated Attendance System using Face-Net Algorithm
IRJET Journal
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
IRJET Journal
IRJET- Advanced Phishing Identification Technique using Machine Learning
IRJET- Advanced Phishing Identification Technique using Machine Learning
IRJET Journal
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
IRJET Journal
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
IRJET Journal
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
IRJET Journal
More Related Content
Similar to Implementation of Spam Classifier using Naïve Bayes Algorithm
Bug Triage: An Automated Process
Bug Triage: An Automated Process
IRJET Journal
Online java compiler with security editor
Online java compiler with security editor
IRJET Journal
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
IRJET Journal
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET Journal
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
IRJET Journal
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660
syaidatulamirah
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET Journal
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
IRJET Journal
IRJET- Automated CV Classification using Clustering Technique
IRJET- Automated CV Classification using Clustering Technique
IRJET Journal
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET Journal
Intelligent Spam Mail Detection System
Intelligent Spam Mail Detection System
IRJET Journal
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET Journal
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVM
IRJET Journal
IRJET- Detecting Phishing Websites using Machine Learning
IRJET- Detecting Phishing Websites using Machine Learning
IRJET Journal
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
IRJET Journal
trialFinal report7th sem.pdf
trialFinal report7th sem.pdf
UMAPATEL34
A Real Time Advance Automated Attendance System using Face-Net Algorithm
A Real Time Advance Automated Attendance System using Face-Net Algorithm
IRJET Journal
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
IRJET Journal
IRJET- Advanced Phishing Identification Technique using Machine Learning
IRJET- Advanced Phishing Identification Technique using Machine Learning
IRJET Journal
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
IRJET Journal
Similar to Implementation of Spam Classifier using Naïve Bayes Algorithm
(20)
Bug Triage: An Automated Process
Bug Triage: An Automated Process
Online java compiler with security editor
Online java compiler with security editor
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET - Recommendation System using Big Data Mining on Social Networks
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
IRJET - Cognitive based Emotion Analysis of a Child Reading a Book
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media Data
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
Cross Domain Recommender System using Machine Learning and Transferable Knowl...
IRJET- Automated CV Classification using Clustering Technique
IRJET- Automated CV Classification using Clustering Technique
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
Intelligent Spam Mail Detection System
Intelligent Spam Mail Detection System
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
IRJET- Sentimental Analysis for Online Reviews using Machine Learning Algorithms
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVM
IRJET- Detecting Phishing Websites using Machine Learning
IRJET- Detecting Phishing Websites using Machine Learning
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
trialFinal report7th sem.pdf
trialFinal report7th sem.pdf
A Real Time Advance Automated Attendance System using Face-Net Algorithm
A Real Time Advance Automated Attendance System using Face-Net Algorithm
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques
IRJET- Advanced Phishing Identification Technique using Machine Learning
IRJET- Advanced Phishing Identification Technique using Machine Learning
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
More from IRJET Journal
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
IRJET Journal
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
IRJET Journal
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
IRJET Journal
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
IRJET Journal
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
IRJET Journal
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
IRJET Journal
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
IRJET Journal
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
IRJET Journal
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
IRJET Journal
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
IRJET Journal
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
IRJET Journal
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
IRJET Journal
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
IRJET Journal
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
IRJET Journal
React based fullstack edtech web application
React based fullstack edtech web application
IRJET Journal
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
IRJET Journal
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
IRJET Journal
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
IRJET Journal
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
IRJET Journal
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
IRJET Journal
More from IRJET Journal
(20)
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
React based fullstack edtech web application
React based fullstack edtech web application
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Recently uploaded
Past, Present and Future of Generative AI
Past, Present and Future of Generative AI
abhishek36461
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
ROCENODodongVILLACER
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
PoojaBan
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting .
Satyam Kumar
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
eptoze12
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
asadnawaz62
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
wendy cai
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
Asst.prof M.Gokilavani
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
KartikeyaDwivedi3
pipeline in computer architecture design
pipeline in computer architecture design
ssuser87fa0c1
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
959SahilShah
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
C Sai Kiran
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Low Rate Call Girls In Saket, Delhi NCR
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
jennyeacort
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
Asst.prof M.Gokilavani
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
dollysharma2066
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
VICTOR MAESTRE RAMIREZ
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
null - The Open Security Community
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
Dr SOUNDIRARAJ N
Recently uploaded
(20)
Past, Present and Future of Generative AI
Past, Present and Future of Generative AI
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting .
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
pipeline in computer architecture design
pipeline in computer architecture design
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
Implementation of Spam Classifier using Naïve Bayes Algorithm
1.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 433 Implementation of Spam Classifier using Naïve Bayes Algorithm Ajay Gangare1, Jitesh Rathore2, Akash Kumar Tadge3, Abhinav Shrivastav4, Rashmi Yadav5, Pankaj Singh Sisodiya6 1-6Dept. of Computer Science and Engineering, Shri Balaji Institute of Technology and Management, Betul, M.P ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - Spam involves sending someone unwanted messages. Currently the internet is the biggest platformtoget some information, also social media is going to be very popular nowadays. Because ofthat, manyspammerswilltryto mislead users by sending lots of spam messages. And because of spam messages, there are lots of problems, fraud occurs. So we want to filter messages into spam or ham. To classify the messages as spam or not spam we are using machine learning (the multinomial naïve Bayes classifier algorithm) and CountVectorizer provided by Scikit-learn library in python programming. First, we collect the datasets and convert them into numerical data (matrix) by CountVectorizer and then we apply the naïve Bayes algorithm on datasets for classification purposes. Key Words: naive Bayes algorithm, CountVectorizer, machine learning, Bayesian classification. 1.INTRODUCTION In today’s world, since social media and the internet is very popular, so there are lots of people using abusing messages or shady comments, also many spammers will send a bulk of spam messages like they send (malicious spam) fake link to us and when we click on the link, then they get the access of our information, because of that we may get into trouble. Many organizations and people could face financial loss. Some of spamming include unwanted advertisement of the product, sometimes it becomes very irritating for people. Some of the spammers are doing the work of spreading computer viruses. 1.1 Problem Statement Every day, we receive tons of junk messages, emails, some spam messages are just marketing/advertising messages while some are fake spam messages. People are annoyed/irritated because of such spam messages they receive. The main objective of the problem is to classify the comments into spam or not spam. And in our project , we are designing 3 separate modules of spam classifier for social media spam, SMS spam, and email spam. 2. LITERATURE SURVEY There are many types of spam, such as web spam, short message spam, email spam. social network spamandothers. In this paper, we are focusing on social media, mobile SMS spam, and email spam. 2.1 Existing System Data mining techniques are commonly used for spam classification. In our paper, we are using machine learning algorithm to automate the process of spam detection. 2.2 Proposed System In our proposed system, count vectorizer is used for extracting features from a given dataset, and a design model is used for generating tests and training sets from given dataset. Then the naive bayes classifier is trained in training data. And the proposed system will say given data isspamor not. 3. Naïve Bayes Naïve Bayes classifiers are a type of linear classifiers. Naïve Bayes algorithm is a very simple algorithm used for the classification purpose, and the naïve bayesclassifierisbased on a probabilistic model, the base of the naïve bayes algorithm is the Bayesian theorem. Naïve Bayes algorithm will calculate the probability of input word based on the probability of past (trained dataset) and classify input as spam or not spam. Naïve bayes algorithm uses the formula for classifying input message as spam or not spam is given below. Fig -3.1: Formula of Bayesian theorem In this article [7] the author Vinoth introduced (in above picture) that Above, P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).
2.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 434 P(c) is the prior probability of class. P(x|c) is the likelihood which is the probability of predictor given class. P(x) is the prior probability of predictor. 4. Count Vectorizer CountVectorizer is a tool in python programming. Since we can only apply our machine learning algorithm (naïve bayes algorithm) to numerical data, we need to convert our text message into numerical data, for that we use CountVectorizer. Now CountVectorizer will analyze the text data and create a matrix and in the matrix, each row represents the certain text data and each column represents a unique word. The count of the word in certain text data is considered as the value of each cell. As a result, we get a vector that gives (specify) the length of the entire text data. 5. Flowchart Fig 5.1: Flowchart of proposed system 6. METHODOLOGY AND IMPLEMENTATION We are going to build our project (spam classifier application) in 4 phases (i) Collection of Datasets (CSV files) (ii) Pre-Processing of the dataset (iii) Features Selection and Extraction (iv) Classification phase (i) Data Collection In the dataset collection phase, the dataset that isgoingto be used for the training dataset we downloaded through API contained 9 CSV files of email spam and testing for machine learning algorithm is downloaded from the website UCI machine learning repository, The, mobile SMS spam, social media spam respectively. (ii) Pre – processing In the pre-processing phase, we take the collected dataset and we will preprocess it, preprocess means we deletesome unwanted information (data) from the datasets like we delete some columns and rows that are not needed (important). If the collected dataset is not cleaned, then we have to perform certain operations to make it clean. In the pre-processing phase, if the same words are occurring multiple times then we consider it only once. (iii) Feature Selection and Extraction After preprocessing phase, we need to find the feature from the preprocessed dataset, so that the naïve bayes algorithm can be run on the feature. First, we analyzethe preprocessed dataset and we use several feature selection methodstofind the best feature that will be well suited to train dataand give the best result. (iv) Classification In the classification phase we have to divide our data (dataset) into 2 phases – (a) Training Phase (b) Testing Phase. 60% of data is used for the training phase and 40% of data is used for the testing phase. After completing the step feature selection and extraction, we get the feature and the feature that is selected is considered spam. In this classification phase, our datasetswill betrainedbasedon the naïve bayes algorithm. For training the dataset we use a linear classification algorithm that is naïve bayes algorithm. The feature is selected in the featureselectionand extraction phase and we use the naïve bayes algorithm to run on the features so that output is produced. There are many classifiers available for the naïve bayes algorithm, but in our article, we are using the naïve bayes multinomial classifier for classification purposes because the multinomial naïve bayes algorithm gives higher accuracy. 7. PROPOSED SYSTEM There are many existing systems available that are based on data mining and machine learning techniques, but our proposed system will be based on a machine learning algorithm (naïve bayes algorithm) which is based on the Bayesian theorem. The proposed system will calculate the probabilities of spam and not spam messages,andaccording to the probabilities, it will produce the result.
3.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 02 | Feb 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 435 8. RESULT After Creating the model (completion of the project) we can predict whether the comment is SPAM or NOT We are able to classify the messages as spam or non-spam. After implementing algorithm (model) on dataset , after testing our model ,the efficiency of our model will be up to 94%. 9. CONCLUSION In this article, we developed the spam classifier application (model) by using a machine learning algorithm. there is always the issue of security in the social media platform. The proposed system is used to identifythespamandclassifythe messages into spam and not spamcategoriesusingtheNaıve Bayes technique. But if we use the Support Vector Machine for training dataset along with naïve bayes algorithm then, we can get better efficiency. After testing various test cases we conclude that the efficiency of our proposed system (model) will be up to 94%. We can improve the efficiency of our model by doing some improvements like using “tfidf vectorizer” etc. REFERENCES [1] Nurul Fitriah Rusland, Norfaradilla Wahid, Shahreen Kasim, Hanayanti Hafi 2017 IOP Conf. Ser.: Mater. Sci. Eng. 226 012091 [2] D. Sen, C. Das and S. Chakraborty, “A New Machine Learning based Approach for Text Spam Filtering Technique”, Communications on Applied Electronics, Vol. 6, No. 10, pp. 28-34, 2017. [3] H. Sajedi, G.Z. Parast and F. Akbari, “SMS Spam Filtering using Machine Learning Techniques: A Survey”, Machine Learning Research, Vol. 1, No. 1, pp. 1-14, 2016. [4] Rizky et al. “The Effect of Best First and Spread subsample on Selection of a Feature Wrapper with Naïve Bayes Classifier for Classification of Ratio of Inpatients”. Scientific Journal of Informatics. [5] Anjana Kumari,” Study on Naive Bayesian Classifier and its relation to Information Gain,” International Journal on Recent and Innovation Trends in Computing and Communication, Volume: 2, Issue- 3, March 2014, pp.601 – 603. [6] Rushdi Shams and Robert Mercer,” Classifying Spam Emails using Text and Readability Features,” IEEE 13th International Conference on Data Mining (ICDM), 2013, pp. 657-666. [7] Naive Bayes Algorithm, 2019. https://huntdatascience.wordpress.com/2019/09/24/naive -bayes-algorithm/. Accessed September 24, 2019.
Download now