A Study on E-mail Image Spam Filtering
Techniques
Presented By:
Thakur Ranjit Banshpal
Agenda
• INTRODUCTION
• EMAIL SPAM
• IMAGE SPAM
• TYPES OF IMAGE SPAM
• TYPES OF SPAM CONTENT
• LIFE CYCLE OF SPAM
• ANTISPAM TECHNIQUES
• EXISTING TECHNIQUES
• CONCLUSION
• REFERENCES
Introduction
• ‘Spamming’:
• Spamming is the sending of unsolicited commercial messages in bulk
without first obtaining the explicit permission of the recipients.
• Examples include
– email spam,
– instant messaging spam,
– Usenet newsgroup spam, etc.
Email Spam
• Email spam is the sending of irrelevant, inappropriate and
unsolicited email messages to large numbers of people.
• Its purposes include advertising, promotion, and spreading
backdoors or other malicious programs.
• There are two basic forms of email spam:
1. Text-based spam mails
2. Image-based spam mails
Image Spam
• Spammers put their spam content into images.
• They embed text, such as advertisement text, in the images and
attach these images to emails.
• Anti-spam filters that analyze only the textual content of an email
cannot detect spam text embedded in images.
Types Of Image Spam
1. text-only images,
2. sliced images,
3. randomized images,
4. color modified images,
5. gray images,
6. wild background images,
7. multi-frame animated images and
8. stock splits
Types Of Spam Content
1. product advertisements,
2. financial,
3. adult,
4. Internet,
5. health,
6. scams,
7. leisure,
8. fraud,
9. political and
10. spiritual, etc.
Life Cycle of Spam
Anti Spam Techniques
• Spam can be stopped at several levels:
(i) Before spam is sent
(ii) After spam is sent
(iii) After spam is in the mailbox and
(iv) Legal solutions.
Before spam is sent
• Techniques such as blacklists and whitelists can be used to avoid
spam mails.
• In the online world, a blacklist is a list of senders known to be
responsible for generating large volumes of spam. Blacklisting can
be by IP address, person, company or domain.
• A whitelist is a predefined list of IP addresses that are allowed to
send email to and receive email from each other.
• To send email to a recipient protected by a whitelist, the sender
must be approved and verified by the owner of the whitelist.
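The list checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list contents, the `classify_sender` name, and the rule that a whitelist entry overrides a blacklist entry are all hypothetical and are not taken from the slides.

```python
import ipaddress

# Hypothetical example lists (assumptions for illustration only); a real
# deployment would load these from a maintained source.
BLACKLISTED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLACKLISTED_DOMAINS = {"bulk-offers.example"}
WHITELISTED_SENDERS = {"alice@partner.example"}


def classify_sender(sender_address: str, sender_ip: str) -> str:
    """Return 'accept', 'reject', or 'unknown' for a sender."""
    address = sender_address.strip().lower()
    # Assumed policy: an explicitly approved sender is accepted outright.
    if address in WHITELISTED_SENDERS:
        return "accept"
    # Blacklisting by domain.
    domain = address.rpartition("@")[2]
    if domain in BLACKLISTED_DOMAINS:
        return "reject"
    # Blacklisting by IP address (here: membership in a listed network).
    ip = ipaddress.ip_address(sender_ip)
    if any(ip in network for network in BLACKLISTED_NETWORKS):
        return "reject"
    # Not on any list: leave the decision to later filtering stages.
    return "unknown"
```

A sender that appears on neither list is returned as 'unknown' so that the later stages (report databases, content filters) can still examine the message.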
After Spam is Sent
• Spam database: gathers feedback from the user community, whose
members report a mail as spam when they receive it.
• A triggering algorithm identifies a mail as spam when the
number of reports for a particular message exceeds a given
threshold.
• Spam firewall: a firewall is a system designed to prevent
unauthorized access to or from a private network.
• Challenge-response spam filtering: each sender's email
address must be authorized before an email is delivered to the
receiver. The receiver can either authorize these email addresses
manually or challenge the sender to identify themselves.
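The report-and-threshold idea behind a spam database can be sketched as follows. The class name, the default threshold, and the choice to fingerprint a message by hashing its whitespace-normalized body are assumptions made for this example; the slides do not specify how messages are matched.

```python
import hashlib
from collections import defaultdict


def fingerprint(body: str) -> str:
    """Hypothetical message fingerprint: hash of the normalized body text."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SpamReportDatabase:
    """Collects user reports and flags a message once reports exceed a threshold."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        # fingerprint -> set of users who reported it (one vote per user)
        self._reporters = defaultdict(set)

    def report(self, message_fingerprint: str, user_id: str) -> bool:
        """Record a report and return whether the message is now considered spam."""
        self._reporters[message_fingerprint].add(user_id)
        return self.is_spam(message_fingerprint)

    def is_spam(self, message_fingerprint: str) -> bool:
        # "Exceeds" is read strictly: more reports than the threshold.
        return len(self._reporters.get(message_fingerprint, ())) > self.threshold
```

Counting distinct reporters rather than raw reports is a design choice of this sketch, so that a single user cannot trigger the threshold alone.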
After Spam is in Mailbox
• Spam filter: examines the incoming email and matches it against a
set of pre-defined rules.
• Heuristic filtering: each rule contributes a numerical score reflecting
the probability of the message being spam. The total spam score is then
compared against the user’s desired level of spam sensitivity (low,
medium or high).
• Bayesian filtering: a Bayesian spam filter takes one group of
legitimate email and another group of spam and compares how often
tokens such as words occur in each. It looks for recurring
patterns to form an “opinion” about a message. In spam filter terms,
that “opinion” becomes a rule which identifies spam.
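A minimal sketch of the Bayesian idea, assuming a naive Bayes model over word tokens with add-one smoothing. The class and method names are hypothetical, and production filters use more careful tokenization and token selection than shown here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one labelled message ('spam' or 'ham') to the training data."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from the word frequencies seen in training."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Class prior, smoothed so an empty class never yields log(0).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing; the extra +1 leaves room for unseen words.
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        difference = max(-700.0, min(700.0, log_scores["ham"] - log_scores["spam"]))
        return 1.0 / (1.0 + math.exp(difference))
```

The returned probability can then be compared against a sensitivity threshold, in the same way as the heuristic score described above.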
Legal Solution
• Federal regulations: the Controlling the Assault of Non-Solicited
Pornography And Marketing (CAN-SPAM) Act of 2003, signed on
December 16, 2003.
• Ciphersend combines encryption schemes with email to
prevent mailboxes from being spammed. Ciphersend uses
2048-bit encryption, a much longer key than the 128-bit, or at best
256-bit, encryption used by online banking services. (Key lengths
are directly comparable only within the same type of cipher.)
Existing Techniques
1. General Spam Characteristics
2. Email Transmission Protocol
3. Local Changes in Transmission
4. Language-Based Filters
5. Non-Content Features
6. Content Based Classification
7. Hybrid Filters
General Spam Characteristics
• More than 99% of spam falls into one or more of the categories given below:
(i) To advertise some goods, services, or ideas
(ii) To cheat users out of their private information and to deliver malicious
software
(iii) To cause a temporary crash of a mail server.
• The characteristics of spam traffic differ from those of legitimate mail traffic. In
particular, legitimate mail is concentrated in diurnal periods, while the spam arrival
rate is stable over time. This behavior of spam mail was reported by Gomes,
Cazita, J.M. Almeida, V. Almeida, and Meira.
• Pu and Webb analyzed the evolution of spamming techniques. They showed that
spam construction methods become extinct if filters cope with them effectively
or if other successful efforts are taken against them.
Email Transmission Protocol
SMTP
• The Simple Mail Transfer Protocol (SMTP) is the mechanism for delivery of email.
In the context of the JavaMail API:
• The JavaMail-based program communicates with the SMTP server of the company or
Internet Service Provider (ISP).
• That SMTP server relays the message to the SMTP server of the recipient(s),
from which the user(s) eventually retrieve it through POP or IMAP.
• This does not require the SMTP server to be an open relay, as authentication is
supported, but the SMTP server must be configured properly.
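The submission step above can be sketched with Python's standard `smtplib` and `email` modules, used here in place of the JavaMail API that the slide mentions. The function names are hypothetical, and the host, port, and credentials are placeholders the caller must supply; the sketch assumes the server supports STARTTLS.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email message."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_via_authenticated_smtp(message, host, port, username, password):
    """Submit a message to the provider's SMTP server after authenticating.

    Because the client logs in, the server does not have to be an open relay.
    """
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.starttls()                  # upgrade the connection to TLS
        server.login(username, password)   # authenticate before sending
        server.send_message(message)       # server relays toward the recipient
```

The provider's server then relays the message to the recipient's SMTP server, where it waits to be retrieved over POP or IMAP.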
POP
POP stands for Post Office Protocol. POP is the mechanism most
people on the Internet use to get their mail. It defines support for a
single mailbox for each user. The Post Office Protocol defines how
the email client should talk to the POP server.
POP can perform the following functions:
• Retrieve mail from an ISP and delete it on the server.
• Retrieve mail from an ISP but not delete it on the server.
• Ask whether new mail has arrived but not retrieve it.
• Peek at a few lines of a message to see whether it is worth
retrieving.
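The four POP functions listed above map onto calls in Python's standard `poplib` module, as the following sketch suggests. The helper names are hypothetical, and the host and credentials are placeholders; the helpers accept any connection object that offers `stat`, `top`, `retr`, and `dele`.

```python
import poplib


def connect(host, username, password):
    """Open an SSL-protected POP3 session (host and credentials are placeholders)."""
    connection = poplib.POP3_SSL(host)
    connection.user(username)
    connection.pass_(password)
    return connection


def count_messages(connection):
    """Ask whether mail has arrived without retrieving it."""
    message_count, _mailbox_size = connection.stat()
    return message_count


def peek(connection, message_number, body_lines=5):
    """Return the headers plus the first few body lines of one message."""
    _response, lines, _octets = connection.top(message_number, body_lines)
    return [line.decode("utf-8", errors="replace") for line in lines]


def fetch_all(connection, delete_after_fetch=False):
    """Retrieve every message, optionally deleting each one on the server."""
    messages = []
    for number in range(1, count_messages(connection) + 1):
        _response, lines, _octets = connection.retr(number)
        messages.append(b"\r\n".join(lines))
        if delete_after_fetch:
            # Marks the message for deletion; the server applies it on quit().
            connection.dele(number)
    return messages
```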
IMAP
IMAP is a more advanced protocol for receiving messages. IMAP
stands for Internet Message Access Protocol. It permits a "client"
email program to access remote message stores as if they were
local.
Key features of IMAP include:
• It is fully compatible with Internet messaging standards, e.g. MIME.
• It allows message access and management from more than one computer.
• It allows access without reliance on less efficient file access protocols.
• It provides support for "online", "offline", and "disconnected" access modes.
• It supports concurrent access to shared mailboxes.
• Client software needs no knowledge about the server's file store format.
• The main drawback of the commonly used Simple Mail Transfer
Protocol (SMTP) is that it provides no reliable mechanism for
checking the identity of the message source.
• Proposals that aim to overcome this disadvantage by providing better
ways of identifying the sender include:
– Designated Mailers Protocol (DMP),
– Trusted Email Open Standard (TEOS), and
– SenderID
Local Changes In Transmission
• Some solutions do not require global protocol changes but
propose to manage email differently at the local level.
• Li and Saito propose slowing down operations on messages
that are likely to be spam. They use the past behavior of senders
to quickly predict the category of a message. Spam mails are then
kept in a lower-priority queue and ham (legitimate) mails in a
higher-priority queue.
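The two-queue idea can be sketched with a single heap ordered by priority. This is not the method as published: the class name, the smoothed spam-ratio estimate, and the 0.5 cut-off are assumptions chosen only to make the mechanism concrete.

```python
import heapq
import itertools
from collections import defaultdict


class PrioritizedMailQueue:
    """Serves likely-ham messages before likely-spam ones, based on sender history."""

    def __init__(self, spam_threshold=0.5):
        self.spam_threshold = spam_threshold  # assumed cut-off, not from the source
        self._history = defaultdict(lambda: {"spam": 0, "ham": 0})
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves arrival order

    def record_outcome(self, sender, was_spam):
        """Remember how an earlier message from this sender was classified."""
        self._history[sender]["spam" if was_spam else "ham"] += 1

    def spam_likelihood(self, sender):
        """Smoothed fraction of the sender's past mail that was spam."""
        counts = self._history.get(sender)
        if not counts:
            return 0.5  # unknown sender: no evidence either way
        return (counts["spam"] + 1) / (counts["spam"] + counts["ham"] + 2)

    def enqueue(self, sender, message):
        low_priority = self.spam_likelihood(sender) > self.spam_threshold
        # Priority 0 is served first; likely spam gets priority 1.
        heapq.heappush(
            self._heap, (int(low_priority), next(self._arrival), sender, message)
        )

    def dequeue(self):
        _priority, _arrival, sender, message = heapq.heappop(self._heap)
        return sender, message
```

Likely spam is delayed rather than dropped, so a misjudged legitimate message is still delivered, only later.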
Language-Based Filters
• These filters are based on the language of the email body.
• They can be used to filter out spam written in foreign languages.
• Examples of statistical models applied to message text include dynamic
Markov compression and prediction by partial matching. They have been used
successfully with data extracted from both the bodies and the headers of messages.
• Medlock proposed smoothed higher-order N-gram language models. N-gram
language models are based on the assumption that the occurrence of a
certain word at a certain position in a sequence depends only on the
previous N-1 words.
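N-gram Approach
$$P(w_1 \ldots w_m) = \prod_{i=1}^{m} P(w_i \mid w_{i-n+1} \ldots w_{i-1})$$
where m is the length of the word sequence and n is the order of the model.
• Language model approach
• Looks for repeated patterns
• Each word depends probabilistically on the n-1 preceding
words.
• Calculating and comparing the N-gram profiles.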
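One way to read "calculating and comparing the N-gram profiles" is to train one N-gram model on spam and one on legitimate mail, then compare the log-probability each assigns to a new message. The sketch below assumes whitespace tokenization and add-one smoothing, which is simpler than the smoothing used in the cited work; the class name is hypothetical.

```python
import math
from collections import Counter


class NGramModel:
    """Word-level N-gram language model with add-one smoothing."""

    def __init__(self, n=2):
        self.n = n
        self.ngram_counts = Counter()
        self.context_counts = Counter()
        self.vocabulary = set()

    def _padded(self, words):
        # Start-of-message markers supply context for the first n-1 words.
        return ["<s>"] * (self.n - 1) + list(words)

    def train(self, text):
        words = text.lower().split()
        self.vocabulary.update(words)
        padded = self._padded(words)
        for i in range(self.n - 1, len(padded)):
            context = tuple(padded[i - self.n + 1:i])
            self.ngram_counts[context + (padded[i],)] += 1
            self.context_counts[context] += 1

    def log_probability(self, text):
        """Sum of log P(w_i | previous n-1 words) over the message."""
        padded = self._padded(text.lower().split())
        vocabulary_size = len(self.vocabulary) + 1  # +1 for unseen words
        total = 0.0
        for i in range(self.n - 1, len(padded)):
            context = tuple(padded[i - self.n + 1:i])
            count = self.ngram_counts[context + (padded[i],)]
            total += math.log(
                (count + 1) / (self.context_counts[context] + vocabulary_size)
            )
        return total
```

A message would then be labelled spam when the spam model's log-probability is the higher of the two.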
Non-content Features
• Methods based on structured analysis of the header and of meta-level
features, such as the number of attachments, use specific technical aspects
of email and so are specific to spam filtering.
• Leiba proposed a method that analyzes the SMTP path to detect spam. The
method analyzes the IP addresses in the reverse path and ascribes a
reputation to each according to the amounts of spam and legitimate mail delivered
through it. Both this and the following method can be viewed as
developments of the idea of blacklisting and whitelisting.
• Behavior-based filtering rests on extracting knowledge about the behavior
behind a given message or group of messages from their non-content features.
Spam is then detected by comparing this behavior with predefined or extracted
knowledge about the typical behaviors of malicious and normal users.
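The reputation idea behind SMTP path analysis can be illustrated with a much-simplified sketch. It is not the published algorithm: the class name, the plain spam ratio per relay, and the rule of scoring a path by its worst known relay are assumptions made for this example.

```python
from collections import defaultdict


class PathReputation:
    """Tracks, per relay IP, how much spam and legitimate mail passed through it."""

    def __init__(self):
        self._counts = defaultdict(lambda: [0, 0])  # ip -> [spam, legitimate]

    def observe(self, relay_ips, was_spam):
        """Update every relay on a classified message's delivery path."""
        for ip in relay_ips:
            self._counts[ip][0 if was_spam else 1] += 1

    def spam_ratio(self, ip):
        """Fraction of mail through this relay that was spam, or None if unseen."""
        spam, legitimate = self._counts.get(ip, (0, 0))
        if spam + legitimate == 0:
            return None
        return spam / (spam + legitimate)

    def score_path(self, relay_ips):
        """Assumed rule: a path is as suspicious as its worst known relay."""
        ratios = [self.spam_ratio(ip) for ip in relay_ips]
        known = [ratio for ratio in ratios if ratio is not None]
        return max(known) if known else None
```

Unlike a fixed blacklist, the reputation here changes as new classified mail is observed, which is why the slide describes the method as a development of blacklisting and whitelisting.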
Content Based Classification
• One popular practice when creating spam pages is “keyword
stuffing”. The keywords within a web page can be analyzed to
detect it: excessive appearance of keywords in the title
of a page is a clear indication of spam.
• Most available anti-spam techniques analyze the content and the
header of the incoming email. They try
to infer something about the kind of material contained in the
message by looking for specific patterns typical of a spam
message. For this reason, these filters are known as “content
based.”
Hybrid Filters
• A hybrid technique can be implemented using various
models, taking into account the resources available on the server. One
proposed framework combines white/black listing and
challenge-response methods. Bhuleskar, after identifying the advantages
and disadvantages of various filters, combines their advantages
and proposes a hybrid filter.
• Hybrid solutions need to be carefully designed, as the combination
might increase time complexity while increasing security and
accuracy.
Conclusion
We have seen various types of image spam and their content,
and we have discussed various solutions proposed for the
image spam problem. As spammers have innumerable techniques
for creating a spam image, research toward a perfect spam filter
remains fertile. Several methods have been proposed, and almost all
of them share the objectives of high processing speed and high
accuracy, so that they are applicable in time-critical
environments like the Internet. Future work includes
analysis and comparison of some of the techniques reviewed, in
terms of computation and time complexity along with accuracy.
References
• Delany, S.J., Cunningham, P., Tsymbal, A. and Coyle, L., “A
case-based technique for tracking concept drift in spam filtering,”
Knowledge-Based Systems, pp. 187–195, 2004.
• Drake, C., Oliver, J. and Koontz, E., “Anatomy of a phishing
email,” Proceedings of the First Conference on Email and Anti-Spam,
CEAS’2004, 2004.
• Fawcett, T., “‘In vivo’ spam filtering: a challenge problem
for data mining,” KDD Explorations, vol. 5, no. 2, pp. 140–148,
2003.
• Gomes, L.H., Cazita, C., Almeida, J.M., Almeida, V. and Meira, W.,
“Characterizing a spam traffic,” IMC ’04: Proceedings of the 4th
ACM SIGCOMM Conference on Internet Measurement, pp.
356–369, New York, NY, USA, ACM Press, ISBN 1-58113-821-0, 2005.
Thank You

More Related Content

What's hot

final-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptxfinal-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptx
infotowards
 
Spam and Anti-spam - Sudipta Bhattacharya
Spam and Anti-spam - Sudipta BhattacharyaSpam and Anti-spam - Sudipta Bhattacharya
Spam and Anti-spam - Sudipta Bhattacharya
sankhadeep
 
E Mail & Spam Presentation
E Mail & Spam PresentationE Mail & Spam Presentation
E Mail & Spam Presentation
newsan2001
 
Spoofing
SpoofingSpoofing
Spoofing
Sanjeev
 

What's hot (20)

miniproject.ppt.pptx
miniproject.ppt.pptxminiproject.ppt.pptx
miniproject.ppt.pptx
 
A Survey: SMS Spam Filtering
A Survey: SMS Spam FilteringA Survey: SMS Spam Filtering
A Survey: SMS Spam Filtering
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
 
Spam Email identification
Spam Email identificationSpam Email identification
Spam Email identification
 
Sms spam-detection
Sms spam-detectionSms spam-detection
Sms spam-detection
 
final-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptxfinal-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptx
 
Email spam detection
Email spam detectionEmail spam detection
Email spam detection
 
Presentation2.pptx
Presentation2.pptxPresentation2.pptx
Presentation2.pptx
 
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660
 
Spam and Anti-spam - Sudipta Bhattacharya
Spam and Anti-spam - Sudipta BhattacharyaSpam and Anti-spam - Sudipta Bhattacharya
Spam and Anti-spam - Sudipta Bhattacharya
 
Cyber Spamming & its Types
Cyber Spamming & its TypesCyber Spamming & its Types
Cyber Spamming & its Types
 
Spam
SpamSpam
Spam
 
E Mail & Spam Presentation
E Mail & Spam PresentationE Mail & Spam Presentation
E Mail & Spam Presentation
 
Spoofing
SpoofingSpoofing
Spoofing
 
Spam, security
Spam, securitySpam, security
Spam, security
 
Spam filtering with Naive Bayes Algorithm
Spam filtering with Naive Bayes AlgorithmSpam filtering with Naive Bayes Algorithm
Spam filtering with Naive Bayes Algorithm
 
Phishing Presentation
Phishing Presentation Phishing Presentation
Phishing Presentation
 
Email Security and Awareness
Email Security and AwarenessEmail Security and Awareness
Email Security and Awareness
 
Captcha ppt
Captcha pptCaptcha ppt
Captcha ppt
 
Email security presentation
Email security presentationEmail security presentation
Email security presentation
 

Similar to E mail image spam filtering techniques

Network paperthesis1
Network paperthesis1Network paperthesis1
Network paperthesis1
Dhara Shah
 
NetworkPaperthesis1
NetworkPaperthesis1NetworkPaperthesis1
NetworkPaperthesis1
Dhara Shah
 
Detecting Spambot as an Antispam Technique for Web Internet BBS
Detecting Spambot as an Antispam Technique for Web Internet BBSDetecting Spambot as an Antispam Technique for Web Internet BBS
Detecting Spambot as an Antispam Technique for Web Internet BBS
ijsrd.com
 

Similar to E mail image spam filtering techniques (20)

Network paperthesis1
Network paperthesis1Network paperthesis1
Network paperthesis1
 
NetworkPaperthesis1
NetworkPaperthesis1NetworkPaperthesis1
NetworkPaperthesis1
 
Identification of Spam Emails from Valid Emails by Using Voting
Identification of Spam Emails from Valid Emails by Using VotingIdentification of Spam Emails from Valid Emails by Using Voting
Identification of Spam Emails from Valid Emails by Using Voting
 
Analysis of an image spam in email based on content analysis
Analysis of an image spam in email based on content analysisAnalysis of an image spam in email based on content analysis
Analysis of an image spam in email based on content analysis
 
B0940509
B0940509B0940509
B0940509
 
SPAM FILTERS
SPAM FILTERSSPAM FILTERS
SPAM FILTERS
 
MINIMIZING THE TIME OF SPAM MAIL DETECTION BY RELOCATING FILTERING SYSTEM TO ...
MINIMIZING THE TIME OF SPAM MAIL DETECTION BY RELOCATING FILTERING SYSTEM TO ...MINIMIZING THE TIME OF SPAM MAIL DETECTION BY RELOCATING FILTERING SYSTEM TO ...
MINIMIZING THE TIME OF SPAM MAIL DETECTION BY RELOCATING FILTERING SYSTEM TO ...
 
Survey on spam filtering
Survey on spam filteringSurvey on spam filtering
Survey on spam filtering
 
Technical Background Overview Ppt
Technical Background Overview PptTechnical Background Overview Ppt
Technical Background Overview Ppt
 
402 406
402 406402 406
402 406
 
A multi layer architecture for spam-detection system
A multi layer architecture for spam-detection systemA multi layer architecture for spam-detection system
A multi layer architecture for spam-detection system
 
A multi layer architecture for spam-detection system
A multi layer architecture for spam-detection systemA multi layer architecture for spam-detection system
A multi layer architecture for spam-detection system
 
Thematic and self learning method for
Thematic and self learning method forThematic and self learning method for
Thematic and self learning method for
 
Identifying Valid Email Spam Emails Using Decision Tree
Identifying Valid Email Spam Emails Using Decision TreeIdentifying Valid Email Spam Emails Using Decision Tree
Identifying Valid Email Spam Emails Using Decision Tree
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
 
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
Spam Detection in Social Networks Using Correlation Based Feature Subset Sele...
 
DEVELOPMENT OF AN EFFECTIVE BAYESIAN APPROACH FOR SPAM FILTERING
DEVELOPMENT OF AN EFFECTIVE BAYESIAN APPROACH FOR SPAM FILTERINGDEVELOPMENT OF AN EFFECTIVE BAYESIAN APPROACH FOR SPAM FILTERING
DEVELOPMENT OF AN EFFECTIVE BAYESIAN APPROACH FOR SPAM FILTERING
 
Detecting Spambot as an Antispam Technique for Web Internet BBS
Detecting Spambot as an Antispam Technique for Web Internet BBSDetecting Spambot as an Antispam Technique for Web Internet BBS
Detecting Spambot as an Antispam Technique for Web Internet BBS
 

More from ranjit banshpal

using big-data methods analyse the Cross platform aviation
 using big-data methods analyse the Cross platform aviation using big-data methods analyse the Cross platform aviation
using big-data methods analyse the Cross platform aviation
ranjit banshpal
 

More from ranjit banshpal (15)

Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...
Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...
Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...
 
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHES
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHESSECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHES
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHES
 
Secure Image Retrieval based on Hybrid Features and Hashes
Secure Image Retrieval based on Hybrid Features and HashesSecure Image Retrieval based on Hybrid Features and Hashes
Secure Image Retrieval based on Hybrid Features and Hashes
 
LCT in day2 day life
LCT in day2 day lifeLCT in day2 day life
LCT in day2 day life
 
Fingerprint recognition
Fingerprint recognitionFingerprint recognition
Fingerprint recognition
 
“Web crawler”
“Web crawler”“Web crawler”
“Web crawler”
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...
 
Parallelization using open mp
Parallelization using open mpParallelization using open mp
Parallelization using open mp
 
Face recognition technology
Face recognition technologyFace recognition technology
Face recognition technology
 
using big-data methods analyse the Cross platform aviation
 using big-data methods analyse the Cross platform aviation using big-data methods analyse the Cross platform aviation
using big-data methods analyse the Cross platform aviation
 
Hybrid encryption
Hybrid encryption Hybrid encryption
Hybrid encryption
 
Autocorrelators1
Autocorrelators1Autocorrelators1
Autocorrelators1
 
Static Networks
Static NetworksStatic Networks
Static Networks
 
Ranjitbanshpal
RanjitbanshpalRanjitbanshpal
Ranjitbanshpal
 
Ranjitbanshpal1
Ranjitbanshpal1Ranjitbanshpal1
Ranjitbanshpal1
 

Recently uploaded

Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 

Recently uploaded (20)

Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 

E mail image spam filtering techniques

  • 1. 1 A Study on E-mail Image Spam Filtering Techniques Presented By: Thakur Ranjit Banshpal 1 1
  • 2. Agenda  INTRODUCTION  EMAIL SPAM  IMAGE SPAM  TYPES OF IMAGE SPAM  TYPES OF SPAM CONTENT  LIFE CYCLE OF SPAM  ANTISPAM TECHNIQUES  EXISTING TECHNIQUES  CONCLUSION  REFERENCES
  • 3. Introduction • ‘Spamming’: • The action of sending unrequested commercial messages in bulk quantity with obtaining explicit permission of the recipients is defined as ‘Spamming’. • Examples Include – email spam, – instant messaging spam , – Usenet newsgroup spam, etc.
  • 4. Email Spam  Email spam refers to sending irrelevant, inappropriate and unrequested email messages to numerous people.  The purpose of email spam is advertising, promotion, and spreading backdoors or malicious programs.  There are two basic forms of email spam 1. Text-based spam mails 2. Image-based spam mails
  • 5. Image Spam  Spammers put their spam content into the images.  They embed text such as advertisement text in the images and attach these images to emails.  Anti-spam filters that analyze content of email cannot detect spam text in images
  • 6. Types Of Image Spam 1. text-only images, 2. sliced images, 3. randomized images, 4. color modified images, 5. gray images, 6. wild background images, 7. multi-frame animated images and 8. stock splits
  • 7. Types Of Spam Content 1. products advertisement, 2. financial, 3. adult, 4. Internet, 5. health, 6. scams, 7. leisure, 8. fraud, 9. political and 10. spiritual. etc
  • 9. Anti Spam Techniques  Stopping spam exists at several levels, it can be (i) Before spam is sent (ii) After spam is sent (iii) After spam is in mailbox and (iv) Legal solutions.
  • 10. Before spam is sent  Techniques like Blacklists and Whitelists can be used to avoid spam mails.  In the online world, a blacklist refers to those people who are responsible for generating spam in a very big way. The blacklisting can be by IP address, person, company or domain.  A whitelist is a predefined list of IP addresses that are allowed to send email to and receive email from each other.  To send email to a whitelist, the sender must be approved and verified by the owner of the whitelist.
  • 11. After Spam is Sent  Use a spam database, which involves gathering feedback from the user community, who reports a mail as spam when they receive it.  A triggering algorithm identifies a mail as spam, when the number of reports for a particular message exceeds a given threshold.  Spam firewall, firewall is a system designed to prevent unauthorized access to or from a private network.  The Challenge Response Spam Filtering technique, each email address must be authorized before delivering an email to the receiver. The receiver can either authorize these email addresses manually, or can challenge the sender to identify themselves.
  • 12. After Spam is in Mailbox  A spam filter - Examine the incoming email and match it against a set of pre-defined rules.  Heuristic filtering - Each rule assigns a numerical score to the probability of the message being spam. The spam score is then measured against the user’s desired level of spam sensitivity (low, medium or high sensitivity).  Bayesian filtering - Bayesian spam filters can take one group of legitimate email and another group of spam and compare the values and data of each. Bayesian filters look for obvious repeating patterns to form an “opinion” on something. In spam filter terms that “opinion” becomes a rule which identifies spam.
  • 13. Legal Solution  Federal Regulations (CAN-SPAM act of 2003) Controlling the Assault of Non-Solicited Pornography And Marketing (CAN-SPAM) Act of 2003 – signed on 12/16/2003  Ciphersend, which combines encryption schemes with emails to prevent mailbox from being spammed. The Ciphersend uses 2048-bit encryption, which is much more than online banking services, which uses a 128-bit encryption or 256-bit at best.
  • 14. Existing Techniques 1. General Spam Characteristics 2. Email Transmission Protocol 3. Local Changes in Transmission 4. Language- Based Filters 5. Non-Content Features 6. Content Based Classification 7. Hybrid Filters
  • 15. General Spam Characteristics  More than 99% of spam falls into one or more of the categories given below (i) To advertise some goods, services, or ideas (ii) To cheat users out of their private information and to deliver malicious software (iii) To cause a temporary crash of a mail server.  Characteristics of spam traffic are different from those of legitimate mail traffic in particular legitimate mail is concentrated on diurnal periods, while spam arrival rate is stable over time. This behavior of spam mail was reported by Gomes, Cazita, Almeida, J.M. Virgı, and Meira.  Pu and Webb analyze the evolution of spamming techniques. They showed that spam constructing methods become extinct if filters are effective to cope with them or if other successful efforts are taken against them.
  • 16. Email Transmission Protocol SMTP  The Simple Mail Transfer Protocol (SMTP) is the mechanism for delivery of email. In the context of the JavaMail API,  The JavaMail-based program will communicate with the company or Internet Service Provider's (ISP's) SMTP server.  That SMTP server will relay the message on to the SMTP server of the recipient(s) to eventually be acquired by the user(s) through POP or IMAP.  This does not require the SMTP server to be an open relay, as authentication is supported, but it should be ensured that the SMTP server is configured properly.
  • 17. POP POP stands for Post Office Protocol. POP is the mechanism most people on the Internet use to get their mail. It defines support for a single mailbox for each user. The Post Office Protocol defines how the email client should talk to the POP server. POP can perform the following functions: (i) retrieve mail from an ISP and delete it on the server; (ii) retrieve mail from an ISP but not delete it on the server; (iii) ask whether new mail has arrived but not retrieve it; (iv) peek at a few lines of a message to see whether it is worth retrieving.
  • 18. IMAP IMAP is a more advanced protocol for receiving messages. IMAP stands for Internet Message Access Protocol. It permits a "client" email program to access remote message stores as if they were local. Key features of IMAP include:  It is fully compatible with Internet messaging standards, e.g. MIME.  It allows message access and management from more than one computer.  It allows access without reliance on less efficient file access protocols.  It provides support for "online", "offline", and "disconnected" access modes.  It supports concurrent access to shared mailboxes.  Client software needs no knowledge about the server's file store format.
  • 19.  The main drawback of the commonly used Simple Mail Transfer Protocol (SMTP) is that it provides no reliable mechanism for checking the identity of the message source.  Several proposals aim to overcome this disadvantage by providing better ways of sender identification, including Designated Mailers Protocol (DMP), Trusted Email Open Standard (TEOS), and SenderID.
  • 20. Local Changes In Transmission  Some solutions do not require global protocol changes but propose to manage email in a different way locally.  Li and Saito propose slowing down the operations on messages that are likely to be spam. The past behavior of senders is used for fast prediction of the message category. Spam mails are then kept in a lower-priority queue and ham mails in a higher-priority queue.
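The two-queue idea can be sketched with a single heap keyed by priority. This is a minimal illustration rather than the published method: the class name, the 0.5 spam-ratio threshold, and the choice to treat unknown senders as ham are all assumptions.

```python
import heapq
from collections import defaultdict
from itertools import count


class PriorityMailQueue:
    """Serve mail from senders with a good history before likely spam."""

    def __init__(self, spam_threshold=0.5):
        self.spam_threshold = spam_threshold          # hypothetical cut-off
        self.history = defaultdict(lambda: [0, 0])    # sender -> [spam, ham]
        self._heap = []
        self._order = count()  # tie-breaker: arrival order within a priority

    def record(self, sender, was_spam):
        """Remember how a past message from this sender was classified."""
        self.history[sender][0 if was_spam else 1] += 1

    def spam_ratio(self, sender):
        spam, ham = self.history[sender]
        total = spam + ham
        return spam / total if total else 0.0  # unknown senders count as ham

    def enqueue(self, sender, message):
        # Priority 0 (served first) for likely ham, 1 for likely spam.
        priority = 1 if self.spam_ratio(sender) > self.spam_threshold else 0
        heapq.heappush(self._heap, (priority, next(self._order), sender, message))

    def dequeue(self):
        _, _, sender, message = heapq.heappop(self._heap)
        return sender, message


q = PriorityMailQueue()
for _ in range(3):
    q.record("bulk@example.com", was_spam=True)
q.record("friend@example.org", was_spam=False)

q.enqueue("bulk@example.com", "Buy now")   # arrives first, but low priority
q.enqueue("friend@example.org", "Lunch?")
first = q.dequeue()
second = q.dequeue()
print(first, second)
```

Likely spam is still delivered, only later, so a wrong prediction delays a message instead of losing it.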
  • 21. Language-Based Filters  Filters based on the language of the email body  Can be used to filter out spam written in foreign languages  Statistical data compression models can also be used; examples include dynamic Markov compression and prediction by partial matching. They were successfully used with data extracted from both the bodies and the headers of messages.  Medlock proposed smoothed higher-order N-gram language models. N-gram language models are based on the assumption that the existence of a certain word at a certain position in a sequence depends only on the previous N-1 words.
  • 22. ISCF - 2006 N-gram Approach $P(w_1 \ldots w_m) = \prod_{i=1}^{m} P(w_i \mid w_{i-n+1} \ldots w_{i-1})$  Language model approach  Looks for repeated patterns  Each word depends probabilistically on the n-1 preceding words.  Calculating and comparing the N-gram profiles.
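The N-gram product can be turned into a small classifier by training one model on spam and one on ham and comparing the log-probability each assigns to a new message. The sketch below is illustrative only: it uses bigrams (n = 2) and add-one smoothing as the simplest stand-in, not the smoothing scheme from the cited work, and the training sentences are invented toy data.

```python
import math
from collections import Counter


class NGramModel:
    """Word-level N-gram language model with add-one smoothing."""

    def __init__(self, n=2):
        self.n = n
        self.ngrams = Counter()    # counts of (context..., word)
        self.contexts = Counter()  # counts of (context...)
        self.vocab = set()

    def _pad(self, text):
        # Start symbols give the first words a context of n-1 tokens.
        return ["<s>"] * (self.n - 1) + text.lower().split()

    def train(self, texts):
        for text in texts:
            words = self._pad(text)
            self.vocab.update(words)
            for i in range(self.n - 1, len(words)):
                context = tuple(words[i - self.n + 1:i])
                self.ngrams[context + (words[i],)] += 1
                self.contexts[context] += 1

    def log_prob(self, text):
        """Sum of log P(w_i | previous n-1 words) over the message."""
        words = self._pad(text)
        v = len(self.vocab) + 1  # +1 reserves probability mass for unseen words
        total = 0.0
        for i in range(self.n - 1, len(words)):
            context = tuple(words[i - self.n + 1:i])
            num = self.ngrams[context + (words[i],)] + 1
            den = self.contexts[context] + v
            total += math.log(num / den)
        return total


spam = NGramModel()
spam.train(["buy cheap pills now", "cheap pills buy now"])
ham = NGramModel()
ham.train(["see you at the meeting", "the meeting is at noon"])

message = "buy cheap pills"
label = "spam" if spam.log_prob(message) > ham.log_prob(message) else "ham"
print(label)
```

Working with log-probabilities avoids numerical underflow when the product runs over many words.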
  • 23. Non-content Features  Methods based on structured analysis of the header and of meta-level features, such as the number of attachments, exploit technical aspects of email and are therefore specific to spam filtering.  Leiba proposed detecting spam by analyzing the SMTP path. The method analyzes the IP addresses in the reverse path and ascribes a reputation to each according to the amount of spam and legitimate mail delivered through it. Both this and the subsequent method can be viewed as developments of the idea of blacklisting and whitelisting.  Behavior-based filtering extracts knowledge about the behavior behind a given message or group of messages from their non-content features. Spam is then detected by comparing this behavior with predefined or extracted knowledge about the typical behaviors of malicious and normal users.
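A reputation table over relay IP addresses might look like the sketch below. It does not reproduce the published SMTP-path algorithm: the smoothed spam fraction, the neutral score of 0.5 for unknown relays, and scoring a path by its worst relay are all assumptions made for illustration, and the IP addresses come from documentation ranges.

```python
from collections import defaultdict


class PathReputation:
    """Track how much spam and ham each relay IP has delivered."""

    def __init__(self):
        self.counts = defaultdict(lambda: {"spam": 0, "ham": 0})

    def learn(self, relay_ips, is_spam):
        """Update every relay on a message's path with the message label."""
        for ip in relay_ips:
            self.counts[ip]["spam" if is_spam else "ham"] += 1

    def ip_score(self, ip):
        """Smoothed fraction of spam seen via ip; 0.5 for unknown relays."""
        c = self.counts.get(ip, {"spam": 0, "ham": 0})
        return (c["spam"] + 1) / (c["spam"] + c["ham"] + 2)

    def path_score(self, relay_ips):
        """Score a message by the worst-reputed relay on its path."""
        return max((self.ip_score(ip) for ip in relay_ips), default=0.5)


rep = PathReputation()
for _ in range(4):
    rep.learn(["203.0.113.7"], is_spam=True)
    rep.learn(["192.0.2.10"], is_spam=False)

clean = rep.path_score(["192.0.2.10"])
mixed = rep.path_score(["192.0.2.10", "203.0.113.7"])
print(round(clean, 3), round(mixed, 3))
```

A score near 1 acts like a soft blacklist entry and a score near 0 like a soft whitelist entry, which is the sense in which the approach extends those ideas.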
  • 24. Content Based Classification  One popular practice when creating spam pages is “keyword stuffing”; analyzing the keywords within a web page can therefore help detect spam. Excessive appearance of keywords in the title of a page is a clear indication of spam.  Most available anti-spam techniques analyze the content and the header of the incoming email. They try to infer something about the kind of material contained in the message by looking for specific patterns typical of a spam message. For this reason, these filters are known as “content based.”
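One simple way to quantify "excessive appearance of keywords" is the share of a title taken up by its most frequent word. The sketch below is only an illustration: the function names, the 0.4 threshold, and the four-word minimum are assumptions, not values from the slides.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_density(text):
    """Fraction of all words accounted for by the single most frequent word."""
    words = tokenize(text)
    if not words:
        return 0.0
    _, top_count = Counter(words).most_common(1)[0]
    return top_count / len(words)


def looks_stuffed(title, threshold=0.4, min_words=4):
    """Flag titles dominated by one repeated keyword (hypothetical threshold)."""
    return len(tokenize(title)) >= min_words and keyword_density(title) > threshold


stuffed = looks_stuffed("cheap watches cheap watches buy cheap watches cheap")
normal = looks_stuffed("Minutes from the project meeting on Tuesday")
print(stuffed, normal)
```

Very short titles are exempted because a single repeated word dominates them trivially.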
  • 25. Hybrid Filters  The hybrid technique can be implemented using various models, depending on the resources available on the server. One proposed framework combines white/black listing with challenge-response methods. Bhuleskar, after identifying the advantages and disadvantages of various filters, combines the advantages of the various filtering techniques and proposes a hybrid filter.  Hybrid solutions need to be carefully designed, as the combination might increase time complexity while increasing security and accuracy.
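A combination of white/black listing with challenge-response could be organized as in the sketch below. The class and method names are hypothetical, and the way a challenge is issued and verified is left out; the sketch only shows how the two techniques hand off to each other.

```python
from enum import Enum


class Verdict(Enum):
    DELIVER = "deliver"
    REJECT = "reject"
    CHALLENGE = "challenge"


class HybridFilter:
    """List lookup first; unknown senders get a challenge."""

    def __init__(self, whitelist=(), blacklist=()):
        self.whitelist = set(whitelist)
        self.blacklist = set(blacklist)
        self.pending = {}  # sender -> messages held until the challenge is met

    def receive(self, sender, message):
        if sender in self.blacklist:
            return Verdict.REJECT
        if sender in self.whitelist:
            return Verdict.DELIVER
        # Unknown sender: hold the message and ask the sender to respond.
        self.pending.setdefault(sender, []).append(message)
        return Verdict.CHALLENGE

    def challenge_answered(self, sender):
        """Sender replied correctly: whitelist it and release held mail."""
        self.whitelist.add(sender)
        return self.pending.pop(sender, [])


f = HybridFilter(whitelist={"friend@example.org"}, blacklist={"bulk@example.com"})
v1 = f.receive("friend@example.org", "Lunch?")
v2 = f.receive("bulk@example.com", "Buy now")
v3 = f.receive("new@example.net", "Hello")
released = f.challenge_answered("new@example.net")
v4 = f.receive("new@example.net", "Follow-up")
print(v1, v2, v3, released, v4)
```

The cheap list lookups handle known senders, so the slower challenge step is paid only once per new sender, which is one way to limit the added time complexity.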
  • 26. Conclusion We have seen the various types of image spam and their content, and we have discussed various solutions proposed for the image spam problem. Because spammers have innumerable techniques for creating a spam image, research toward a perfect spam filter remains fertile. Several methods have been proposed, and almost all of them share the objectives of high processing speed and high accuracy, so that they are applicable in time-critical environments like the Internet. Future work includes analysis and comparison of the reviewed techniques in terms of computation and time complexity along with accuracy.
  • 27. References  Delany, S.J., Cunningham, P., Tsymbal, A. and Coyle, L., “A case-based technique for tracking concept drift in spam filtering,” Knowledge-Based Systems, pp. 187–195, 2004.  Drake, C., Oliver, J. and Koontz, E., “Anatomy of a phishing email,” Proceedings of the First Conference on Email and Anti-Spam, CEAS’2004, 2004.  Fawcett, T., “‘In vivo’ spam filtering: a challenge problem for data mining,” KDD Explorations, vol. 5, no. 2, pp. 140–148, 2003.  Gomes, L.H., Cazita, C., Almeida, J.M., Almeida, V. and Meira, W., “Characterizing a spam traffic,” IMC ’04: Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement, pp. 356–369, New York, NY, USA, ACM Press, ISBN 1-58113-821-0, 2005.