SlideShare a Scribd company logo
1 of 19
 INTRODUCTION
 NATURAL LANGUAGE PROCESSING
 SPAMMING
 RELATED WORK
 N-GRAM MODELLING
 WORD STEMMING
 BAYESIAN CLASSIFICATION
 PROPOSED MODEL
 COMPONENTS
 BLOCK DIAGRAM
 WORKING
 CONCLUSION
 QUESTION & ANSWER
 THANK YOU
Natural Language Processing (NLP)
is a field of computer science and
linguistics concerned with the
interactions between computers and
human (natural) languages.
Spamming is the use of electronic
messaging systems like e-mails and
other digital delivery systems and
broadcast media to send unwanted
bulk messages indiscriminately.
N-GRAM MODELLING
WORD STEMMING
BAYESIAN CLASSIFICATION
 An N-gram is an N-character slice of a longer string.
 N-grams of several different lengths simultaneously are
used in the processing while detecting spam emails.
 In N-gram blank space is replaces by underscore
character(“_”).
Here the example ,
WORD = STAY
 Bi-grams: _S, ST, TA, SY, Y_
 Tri-grams: _ _S, _ST, STA, TAY, AY_, Y_
 Quad-grams: _ _ _ S, _ _ ST, _STA, STAY,
TAY_, AY_ _, Y_ _ _
The following steps are used in a word stemming algorithm:
 Remove all non-alpha characters. (Allow some characters
like '/' ' ' '|' etc. which can be used together to look like some
characters, such as / for 'V')
 Remove all vowels from the word except the initial one.
 Replace all digits and symbols by similar looking digits or
characters.
 Replace consecutive repeated characters by a
single character and vice versa.
 Use phonetic algorithms like sound ex on the
resultant string.
 Give it a numeric value depending on the
operations performed over it.
 Update the database for that particular keyword.
 Bayesian Spam Filtering is a statistical method of
classifying an email into spam and legitimate classes.
 Naive Bayes classifiers work by correlating the use of
tokens with spam and non-spam emails and then using
Bayesian inference to calculate a probability that an email is
or is not spam.
Here,
A= no. Of Ham meassage
b= no. Of Spam messgae
 Email Input
 URL Source Check
 URL Blacklist Database
 Threshold Counter
 Keyword Extractor
 Classifier
 NLP Engine
WORKING
 An unclassified email is taken as input for processing.
 The URL Source checking block checks the URL of the incoming
email and searches URL Blacklist Database to find a match. If the
search is successful then, it categorizes the email as a spam or
considers it to be ham and processes further.
 At the same time there is a Threshold Counter which keeps track
of the number of emails coming from this source over a period of
time t.
 The Email is then passed to the Keyword Extractor which
will tokenize the message into keywords.
 These keywords are passed to the Classifier which will
classify the mail into a specific category E.
 The email is then passed to the NLP Engine along with its
category.
Spam mails are a serious concern to and a major annoyance
for many Internet users. The mode proposed as a solution in
this paper is highly beneficial because it introduces a
threshold counter which helps overcome congestion on the
web server and also maintain the spam filter efficiency but
at the same time, it also requires overhead storage space for
the databases.
Spam Detection Using Natural Language processing
Spam Detection Using Natural Language processing

More Related Content

What's hot

final-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptxfinal-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptxinfotowards
 
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660syaidatulamirah
 
E mail image spam filtering techniques
E mail image spam filtering techniquesE mail image spam filtering techniques
E mail image spam filtering techniquesranjit banshpal
 
Spam filtering with Naive Bayes Algorithm
Spam filtering with Naive Bayes AlgorithmSpam filtering with Naive Bayes Algorithm
Spam filtering with Naive Bayes AlgorithmAkshay Pal
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...IRJET Journal
 
Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...Daniel Adenew
 
Detecting the presence of cyberbullying using computer software
Detecting the presence of cyberbullying using computer softwareDetecting the presence of cyberbullying using computer software
Detecting the presence of cyberbullying using computer softwareAshish Arora
 
Presentation on Sentiment Analysis
Presentation on Sentiment AnalysisPresentation on Sentiment Analysis
Presentation on Sentiment AnalysisRebecca Williams
 
Amazon Product Sentiment review
Amazon Product Sentiment reviewAmazon Product Sentiment review
Amazon Product Sentiment reviewLalit Jain
 
Presentation-Detecting Spammers on Social Networks
Presentation-Detecting Spammers on Social NetworksPresentation-Detecting Spammers on Social Networks
Presentation-Detecting Spammers on Social NetworksAshish Arora
 
SMS Spam Filter Design Using R: A Machine Learning Approach
SMS Spam Filter Design Using R: A Machine Learning ApproachSMS Spam Filter Design Using R: A Machine Learning Approach
SMS Spam Filter Design Using R: A Machine Learning ApproachReza Rahimi
 
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...Geetika Gautam
 
Seminar On Naive Bayes for Spam Filtering
Seminar On Naive Bayes for Spam Filtering Seminar On Naive Bayes for Spam Filtering
Seminar On Naive Bayes for Spam Filtering Asrarulhaq Maktedar
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSumit Raj
 
Classification and Regression
Classification and RegressionClassification and Regression
Classification and RegressionMegha Sharma
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Dev Sahu
 

What's hot (20)

final-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptxfinal-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptx
 
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660
 
E mail image spam filtering techniques
E mail image spam filtering techniquesE mail image spam filtering techniques
E mail image spam filtering techniques
 
Spam filtering with Naive Bayes Algorithm
Spam filtering with Naive Bayes AlgorithmSpam filtering with Naive Bayes Algorithm
Spam filtering with Naive Bayes Algorithm
 
Spam Email identification
Spam Email identificationSpam Email identification
Spam Email identification
 
Final Report(SuddhasatwaSatpathy)
Final Report(SuddhasatwaSatpathy)Final Report(SuddhasatwaSatpathy)
Final Report(SuddhasatwaSatpathy)
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
 
Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...
 
Detecting the presence of cyberbullying using computer software
Detecting the presence of cyberbullying using computer softwareDetecting the presence of cyberbullying using computer software
Detecting the presence of cyberbullying using computer software
 
Presentation on Sentiment Analysis
Presentation on Sentiment AnalysisPresentation on Sentiment Analysis
Presentation on Sentiment Analysis
 
Amazon Product Sentiment review
Amazon Product Sentiment reviewAmazon Product Sentiment review
Amazon Product Sentiment review
 
Presentation-Detecting Spammers on Social Networks
Presentation-Detecting Spammers on Social NetworksPresentation-Detecting Spammers on Social Networks
Presentation-Detecting Spammers on Social Networks
 
SMS Spam Filter Design Using R: A Machine Learning Approach
SMS Spam Filter Design Using R: A Machine Learning ApproachSMS Spam Filter Design Using R: A Machine Learning Approach
SMS Spam Filter Design Using R: A Machine Learning Approach
 
Handwritten Character Recognition
Handwritten Character RecognitionHandwritten Character Recognition
Handwritten Character Recognition
 
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
 
FAKE NEWS DETECTION PPT
FAKE NEWS DETECTION PPT FAKE NEWS DETECTION PPT
FAKE NEWS DETECTION PPT
 
Seminar On Naive Bayes for Spam Filtering
Seminar On Naive Bayes for Spam Filtering Seminar On Naive Bayes for Spam Filtering
Seminar On Naive Bayes for Spam Filtering
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
 
Classification and Regression
Classification and RegressionClassification and Regression
Classification and Regression
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
 

Similar to Spam Detection Using Natural Language processing

Cross breed Spam Categorization Method using Machine Learning Techniques
Cross breed Spam Categorization Method using Machine Learning TechniquesCross breed Spam Categorization Method using Machine Learning Techniques
Cross breed Spam Categorization Method using Machine Learning TechniquesIJSRED
 
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Venkat Projects
 
Prepare black list using bayesian approach to improve performance of spam fil...
Prepare black list using bayesian approach to improve performance of spam fil...Prepare black list using bayesian approach to improve performance of spam fil...
Prepare black list using bayesian approach to improve performance of spam fil...IAEME Publication
 
Thematic and self learning method for
Thematic and self learning method forThematic and self learning method for
Thematic and self learning method forijcsa
 
Presentation2.pptx
Presentation2.pptxPresentation2.pptx
Presentation2.pptxWanderer20
 
NetworkPaperthesis1
NetworkPaperthesis1NetworkPaperthesis1
NetworkPaperthesis1Dhara Shah
 
A Model for Fuzzy Logic Based Machine Learning Approach for Spam Filtering
A Model for Fuzzy Logic Based Machine Learning Approach for  Spam FilteringA Model for Fuzzy Logic Based Machine Learning Approach for  Spam Filtering
A Model for Fuzzy Logic Based Machine Learning Approach for Spam FilteringIOSR Journals
 
Sentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsSentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsEditor IJCATR
 
Network paperthesis1
Network paperthesis1Network paperthesis1
Network paperthesis1Dhara Shah
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Editor IJCATR
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Editor IJCATR
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Editor IJCATR
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Editor IJCATR
 
A multi layer architecture for spam-detection system
A multi layer architecture for spam-detection systemA multi layer architecture for spam-detection system
A multi layer architecture for spam-detection systemcsandit
 

Similar to Spam Detection Using Natural Language processing (20)

spam_msg_detection.pdf
spam_msg_detection.pdfspam_msg_detection.pdf
spam_msg_detection.pdf
 
Jt3616901697
Jt3616901697Jt3616901697
Jt3616901697
 
SAS Text Mining
SAS Text MiningSAS Text Mining
SAS Text Mining
 
Cross breed Spam Categorization Method using Machine Learning Techniques
Cross breed Spam Categorization Method using Machine Learning TechniquesCross breed Spam Categorization Method using Machine Learning Techniques
Cross breed Spam Categorization Method using Machine Learning Techniques
 
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
 
Prepare black list using bayesian approach to improve performance of spam fil...
Prepare black list using bayesian approach to improve performance of spam fil...Prepare black list using bayesian approach to improve performance of spam fil...
Prepare black list using bayesian approach to improve performance of spam fil...
 
402 406
402 406402 406
402 406
 
Thematic and self learning method for
Thematic and self learning method forThematic and self learning method for
Thematic and self learning method for
 
Email deliverability
Email deliverabilityEmail deliverability
Email deliverability
 
SPAM FILTERS
SPAM FILTERSSPAM FILTERS
SPAM FILTERS
 
Presentation2.pptx
Presentation2.pptxPresentation2.pptx
Presentation2.pptx
 
NetworkPaperthesis1
NetworkPaperthesis1NetworkPaperthesis1
NetworkPaperthesis1
 
A Model for Fuzzy Logic Based Machine Learning Approach for Spam Filtering
A Model for Fuzzy Logic Based Machine Learning Approach for  Spam FilteringA Model for Fuzzy Logic Based Machine Learning Approach for  Spam Filtering
A Model for Fuzzy Logic Based Machine Learning Approach for Spam Filtering
 
Sentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsSentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic Relations
 
Network paperthesis1
Network paperthesis1Network paperthesis1
Network paperthesis1
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
 
A multi layer architecture for spam-detection system
A multi layer architecture for spam-detection systemA multi layer architecture for spam-detection system
A multi layer architecture for spam-detection system
 

More from युनीक तुषार गुप्ता

More from युनीक तुषार गुप्ता (20)

Online second hand book store project report
Online second hand book store project reportOnline second hand book store project report
Online second hand book store project report
 
Basic operation of computer
Basic operation of computerBasic operation of computer
Basic operation of computer
 
Random access memory
Random access memoryRandom access memory
Random access memory
 
Vaccum tube
Vaccum tubeVaccum tube
Vaccum tube
 
Projector
ProjectorProjector
Projector
 
Memory card
Memory cardMemory card
Memory card
 
Components of computer
Components of computerComponents of computer
Components of computer
 
Digitizer
DigitizerDigitizer
Digitizer
 
Optical disc
Optical discOptical disc
Optical disc
 
Indira gandhi
Indira gandhiIndira gandhi
Indira gandhi
 
Kiran mazumdaar shaw
Kiran mazumdaar shawKiran mazumdaar shaw
Kiran mazumdaar shaw
 
Kumar mangalam birla
Kumar mangalam birlaKumar mangalam birla
Kumar mangalam birla
 
Rani lakshmi bai
Rani lakshmi baiRani lakshmi bai
Rani lakshmi bai
 
Nagavara ramarao narayana murthy
Nagavara ramarao narayana murthyNagavara ramarao narayana murthy
Nagavara ramarao narayana murthy
 
Dhiru bhai ambani
Dhiru bhai ambaniDhiru bhai ambani
Dhiru bhai ambani
 
Leadership
LeadershipLeadership
Leadership
 
Cognizant
CognizantCognizant
Cognizant
 
Where there's a will, there's a way
Where there's a will, there's a wayWhere there's a will, there's a way
Where there's a will, there's a way
 
Persistent system
Persistent systemPersistent system
Persistent system
 
Hcl
HclHcl
Hcl
 

Recently uploaded

Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 

Recently uploaded (20)

Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 

Spam Detection Using Natural Language processing

  • 1.
  • 2.  INTRODUCTION  NATURAL LANGUAGE PROCESSING  SPAMMING  RELATED WORK  N-GRAM MODELLING  WORD STEMMING  BAYESIAN CLASSIFICATION
  • 3.  PROPOSED MODEL  COMPONENTS  BLOCK DIAGRAM  WORKING  CONCLUSION  QUESTION & ANSWER  THANK YOU
  • 4. Natural Language Processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human (natural) languages.
  • 5. Spamming is the use of electronic messaging systems like e-mails and other digital delivery systems and broadcast media to send unwanted bulk messages indiscriminately.
  • 7.  An N-gram is an N-character slice of a longer string.  N-grams of several different lengths simultaneously are used in the processing while detecting spam emails.  In N-gram blank space is replaces by underscore character(“_”).
  • 8. Here the example , WORD = STAY  Bi-grams: _S, ST, TA, SY, Y_  Tri-grams: _ _S, _ST, STA, TAY, AY_, Y_  Quad-grams: _ _ _ S, _ _ ST, _STA, STAY, TAY_, AY_ _, Y_ _ _
  • 9. The following steps are used in a word stemming algorithm:  Remove all non-alpha characters. (Allow some characters like '/' ' ' '|' etc. which can be used together to look like some characters, such as / for 'V')  Remove all vowels from the word except the initial one.  Replace all digits and symbols by similar looking digits or characters.
  • 10.  Replace consecutive repeated characters by a single character and vice versa.  Use phonetic algorithms like sound ex on the resultant string.  Give it a numeric value depending on the operations performed over it.  Update the database for that particular keyword.
  • 11.  Bayesian Spam Filtering is a statistical method of classifying an email into spam and legitimate classes.  Naive Bayes classifiers work by correlating the use of tokens with spam and non-spam emails and then using Bayesian inference to calculate a probability that an email is or is not spam.
  • 12. Here, A= no. Of Ham meassage b= no. Of Spam messgae
  • 13.  Email Input  URL Source Check  URL Blacklist Database  Threshold Counter  Keyword Extractor  Classifier  NLP Engine
  • 14.
  • 15. WORKING  An unclassified email is taken as input for processing.  The URL Source checking block checks the URL of the incoming email and searches URL Blacklist Database to find a match. If the search is successful then, it categorizes the email as a spam or considers it to be ham and processes further.  At the same time there is a Threshold Counter which keeps track of the number of emails coming from this source over a period of time t.
  • 16.  The Email is then passed to the Keyword Extractor which will tokenize the message into keywords.  These keywords are passed to the Classifier which will classify the mail into a specific category E.  The email is then passed to the NLP Engine along with its category.
  • 17. Spam mails are a serious concern to and a major annoyance for many Internet users. The mode proposed as a solution in this paper is highly beneficial because it introduces a threshold counter which helps overcome congestion on the web server and also maintain the spam filter efficiency but at the same time, it also requires overhead storage space for the databases.