This document discusses various techniques for filtering image spam in emails. It begins with introducing email spam and image spam, then describes types of image spam and spam content. It discusses the lifecycle of spam and various antispam techniques, including techniques that operate before spam is sent, after it is sent, and after it reaches mailboxes. It also covers existing techniques like analyzing spam characteristics, transmission protocols, local changes, language-based filters, non-content features, content-based classification, and hybrid filters. In the end, it emphasizes that hybrid techniques can effectively combine various filtering models.
1. A Study on E-mail Image Spam Filtering Techniques
Presented By:
Thakur Ranjit Banshpal
2. Agenda
INTRODUCTION
EMAIL SPAM
IMAGE SPAM
TYPES OF IMAGE SPAM
TYPES OF SPAM CONTENT
LIFE CYCLE OF SPAM
ANTISPAM TECHNIQUES
EXISTING TECHNIQUES
CONCLUSION
REFERENCES
3. Introduction
• ‘Spamming’:
• The action of sending unrequested commercial messages in bulk quantity
without obtaining explicit permission of the recipients is defined as
‘Spamming’.
• Examples Include
– email spam,
– instant messaging spam ,
– Usenet newsgroup spam, etc.
4. Email Spam
Email spam refers to sending irrelevant, inappropriate and
unrequested email messages to numerous people.
The purpose of email spam is advertising, promotion, and
spreading backdoors or malicious programs.
There are two basic forms of email spam
1. Text-based spam mails
2. Image-based spam mails
5. Image Spam
Spammers put their spam content into the images.
They embed text such as advertisement text in the images and
attach these images to emails.
Anti-spam filters that analyze only the text content of an email cannot
detect spam text embedded in images.
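The limitation described above can be sketched with Python's standard `email` package. This is only an illustration: the keyword list, the function name `text_filter_flags`, and the placeholder image bytes are assumptions made for the example, not part of any real filter.

```python
from email.message import EmailMessage

# Hypothetical keyword list, for illustration only.
SPAM_KEYWORDS = {"lottery", "cheap loans", "prize"}

def text_filter_flags(msg: EmailMessage) -> bool:
    """Return True if any text part of the message contains a spam keyword."""
    for part in msg.walk():
        # Only text parts are inspected; image parts are skipped entirely.
        if part.get_content_maintype() == "text":
            body = part.get_content().lower()
            if any(keyword in body for keyword in SPAM_KEYWORDS):
                return True
    return False

# Text-based spam: the advertisement is in the message body.
text_spam = EmailMessage()
text_spam.set_content("You won the lottery! Claim your prize now.")

# Image-based spam: the body is harmless and the advertisement would sit
# inside the attached image (placeholder bytes here, not a real picture).
image_spam = EmailMessage()
image_spam.set_content("See attached.")
image_spam.add_attachment(b"placeholder image bytes", maintype="image",
                          subtype="png", filename="offer.png")

print(text_filter_flags(text_spam))   # the text filter catches this one
print(text_filter_flags(image_spam))  # the image passes unnoticed
```

A filter of this kind never looks inside the attachment, which is why image spam needs techniques beyond body-text analysis.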
9. Anti Spam Techniques
Spam can be stopped at several levels:
(i) Before spam is sent
(ii) After spam is sent
(iii) After spam is in the mailbox
(iv) Through legal solutions
10. Before spam is sent
Techniques like Blacklists and Whitelists can be used to avoid
spam mails.
In the online world, a blacklist is a list of senders known to be
responsible for generating spam in large volumes. Blacklisting
can be by IP address, person, company, or domain.
A whitelist is a predefined list of IP addresses that are allowed to
send email to and receive email from each other.
To send email to a whitelist, the sender must be approved and
verified by the owner of the whitelist.
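A minimal sketch of how such lists might be consulted before accepting a message. The list contents, the function name `classify_sender`, and the rule that a whitelist match wins over a blacklist match are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical list contents (documentation-only addresses and domains).
BLACKLISTED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLACKLISTED_DOMAINS = {"spam.example"}
WHITELISTED_SENDERS = {"alice@example.org"}

def classify_sender(sender: str, ip: str) -> str:
    """Return 'accept', 'reject', or 'unknown' for a sender address and source IP."""
    sender = sender.lower()
    # Assumption: an approved sender is accepted regardless of source IP.
    if sender in WHITELISTED_SENDERS:
        return "accept"
    domain = sender.rsplit("@", 1)[-1]
    address = ipaddress.ip_address(ip)
    # Blacklisting by domain or by IP range, as described above.
    if domain in BLACKLISTED_DOMAINS:
        return "reject"
    if any(address in network for network in BLACKLISTED_NETWORKS):
        return "reject"
    return "unknown"
```

Messages classified as `unknown` would be handed to the later filtering stages.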
11. After Spam is Sent
A spam database gathers feedback from the user community, whose
members report a mail as spam when they receive it.
A triggering algorithm identifies a mail as spam when the
number of reports for a particular message exceeds a given
threshold.
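The triggering idea can be sketched as follows. The class name `SpamReportDatabase`, the SHA-256 fingerprint of the normalized body, and the choice to count each user only once are assumptions for this example; real systems typically use fuzzier fingerprints.

```python
import hashlib
from collections import defaultdict

class SpamReportDatabase:
    """Collects user reports and flags a message once reports exceed a threshold."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        # Maps a message fingerprint to the set of users who reported it.
        self._reporters = defaultdict(set)

    @staticmethod
    def fingerprint(body: str) -> str:
        # Normalize case and whitespace so trivial variations share a fingerprint.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, user: str, body: str) -> None:
        self._reporters[self.fingerprint(body)].add(user)

    def is_spam(self, body: str) -> bool:
        reporters = self._reporters.get(self.fingerprint(body), ())
        return len(reporters) > self.threshold
```

Counting distinct reporters rather than raw reports keeps a single user from triggering the threshold alone.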
Spam firewall: a firewall is a system designed to prevent
unauthorized access to or from a private network.
In the challenge-response spam filtering technique, each sender's
email address must be authorized before an email is delivered to the
receiver. The receiver can either authorize these email addresses
manually or challenge the sender to identify themselves.
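The challenge-response flow might look like the sketch below. The class and method names are hypothetical, and the "challenge" is reduced to a random token that the sender must return; a real system would send the challenge by email.

```python
import secrets

class ChallengeResponseFilter:
    """Holds mail from unknown senders until they answer a challenge."""

    def __init__(self):
        self.authorized = set()  # senders allowed to deliver directly
        self.inbox = []          # delivered messages
        self._held = {}          # sender -> messages waiting for authorization
        self._tokens = {}        # outstanding challenge token -> sender

    def authorize(self, sender: str) -> None:
        """Manual authorization by the receiver; releases any held mail."""
        self.authorized.add(sender)
        self.inbox.extend(self._held.pop(sender, []))

    def receive(self, sender: str, message: str):
        """Deliver mail from authorized senders; otherwise hold it and
        return a challenge token that the sender must answer."""
        if sender in self.authorized:
            self.inbox.append(message)
            return None
        self._held.setdefault(sender, []).append(message)
        token = secrets.token_hex(8)
        self._tokens[token] = sender
        return token

    def respond(self, token: str) -> bool:
        """Handle a sender's answer; a valid token authorizes the sender."""
        sender = self._tokens.pop(token, None)
        if sender is None:
            return False
        self.authorize(sender)
        return True
```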
12. After Spam is in Mailbox
A spam filter - Examines the incoming email and matches it against a
set of pre-defined rules.
Heuristic filtering - Each rule assigns a numerical score reflecting
the probability of the message being spam. The total spam score is then
compared against the user’s desired level of spam sensitivity (low,
medium, or high sensitivity).
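A small sketch of rule-based scoring under stated assumptions: the four rules, their weights, and the three sensitivity thresholds below are invented for illustration and are not taken from any particular product.

```python
import re

# Hypothetical rules: (pattern, score added when the pattern matches).
RULES = [
    (re.compile(r"free|winner|prize", re.IGNORECASE), 2.0),
    (re.compile(r"click here", re.IGNORECASE), 1.5),
    (re.compile(r"!{3,}"), 1.0),             # runs of exclamation marks
    (re.compile(r"\b[A-Z]{5,}\b"), 1.0),     # long all-caps words
]

# Higher sensitivity means a lower score is enough to flag a message.
SENSITIVITY_THRESHOLDS = {"low": 4.0, "medium": 3.0, "high": 1.5}

def spam_score(text: str) -> float:
    """Sum the scores of all rules whose pattern occurs in the text."""
    return sum(score for pattern, score in RULES if pattern.search(text))

def is_spam(text: str, sensitivity: str = "medium") -> bool:
    return spam_score(text) >= SENSITIVITY_THRESHOLDS[sensitivity]
```

With these made-up weights, a message that only says "click here" is flagged at high sensitivity but passes at medium.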
Bayesian filtering - Bayesian spam filters take one group of
legitimate email and another group of spam and compare the
values and data of each. They look for obvious repeating
patterns to form an “opinion” about a message. In spam filter terms,
that “opinion” becomes a rule that identifies spam.
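The comparison of the two groups can be sketched as a small naive Bayes classifier. Whitespace tokenization and add-one smoothing are simplifying assumptions of this example, and the class name `NaiveBayesSpamFilter` is hypothetical.

```python
import math
from collections import Counter

class NaiveBayesSpamFilter:
    """Word-based naive Bayes; needs at least one training mail per class."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text: str):
        return text.lower().split()

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, words, label: str) -> float:
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set(self.word_counts["spam"]) | set(self.word_counts["ham"]))
        # Log prior: share of training mails that carry this label.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in words:
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((counts[word] + 1) / (total + vocab))
        return score

    def classify(self, text: str) -> str:
        words = self.tokenize(text)
        return max(("spam", "ham"), key=lambda label: self._log_score(words, label))
```

Usage, with a toy training set made up for the example:

```python
nb = NaiveBayesSpamFilter()
nb.train("cheap pills buy now", "spam")
nb.train("buy cheap watches now", "spam")
nb.train("meeting agenda for monday", "ham")
nb.train("lunch on monday", "ham")
print(nb.classify("buy cheap pills"))
```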
13. Legal Solution
Federal Regulations (CAN-SPAM act of 2003)
Controlling the Assault of Non-Solicited Pornography And
Marketing (CAN-SPAM) Act of 2003 – signed on 12/16/2003
Ciphersend combines encryption schemes with email to
prevent mailboxes from being spammed. Ciphersend uses
2048-bit encryption, a far larger key size than that of online banking
services, which use 128-bit encryption, or 256-bit at best.
14. Existing Techniques
1. General Spam Characteristics
2. Email Transmission Protocol
3. Local Changes in Transmission
4. Language- Based Filters
5. Non-Content Features
6. Content Based Classification
7. Hybrid Filters
15. General Spam Characteristics
More than 99% of spam falls into one or more of the categories given below
(i) To advertise some goods, services, or ideas
(ii) To cheat users out of their private information and to deliver malicious
software
(iii) To cause a temporary crash of a mail server.
The characteristics of spam traffic differ from those of legitimate mail traffic.
In particular, legitimate mail is concentrated in diurnal periods, while the spam
arrival rate is stable over time. This behavior of spam mail was reported by Gomes,
Cazita, J.M. Almeida, V. Almeida, and Meira.
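One simple way to quantify this difference is the coefficient of variation of hourly message counts. This is a sketch with a made-up measure and function name, not the analysis used in the cited study; the input must contain at least one arrival.

```python
from statistics import mean, pstdev

def hourly_variation(arrival_hours):
    """Coefficient of variation of per-hour message counts.

    0.0 means a perfectly flat arrival rate; larger values mean traffic
    is concentrated in particular hours of the day.
    """
    counts = [0] * 24
    for hour in arrival_hours:
        counts[hour % 24] += 1
    return pstdev(counts) / mean(counts)

# Synthetic data, invented for illustration only:
# legitimate mail arrives during working hours, spam around the clock.
legitimate = [hour for hour in range(9, 18) for _ in range(10)]
spam = [hour for hour in range(24) for _ in range(4)]

print(hourly_variation(legitimate))
print(hourly_variation(spam))
```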
Pu and Webb analyzed the evolution of spamming techniques. They showed that
spam construction methods become extinct if filters cope with them effectively
or if other successful efforts are taken against them.
16. Email Transmission Protocol
SMTP
The Simple Mail Transfer Protocol (SMTP) is the mechanism for delivery of email.
In the context of the JavaMail API, the JavaMail-based program communicates
with the SMTP server of the company or Internet Service Provider (ISP).
That SMTP server relays the message to the SMTP server of the recipient(s),
where it is eventually retrieved by the user(s) through POP or IMAP.
This does not require the SMTP server to be an open relay, as authentication is
supported, but the SMTP server must be configured properly.
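The slide describes this flow in terms of JavaMail; the sketch below shows the analogous steps with Python's standard `smtplib` instead. The host, port, and credentials passed to `submit` are placeholders that depend on the provider, and the sketch assumes the server supports STARTTLS.

```python
import smtplib
from email.message import EmailMessage

def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def submit(msg: EmailMessage, host: str, port: int, username: str, password: str) -> None:
    """Hand the message to the company or ISP SMTP server, which relays it onward.

    Authentication is what lets the server accept the mail without being
    an open relay.
    """
    with smtplib.SMTP(host, port) as server:
        server.starttls()                 # encrypt the session before logging in
        server.login(username, password)
        server.send_message(msg)

msg = build_message("alice@example.org", "bob@example.net", "Hello", "Hello Bob")
# submit(msg, "smtp.example.org", 587, "alice", "app-password")  # placeholder values
```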
17. POP
POP stands for Post Office Protocol. POP is the mechanism most
people on the Internet use to get their mail. It defines support for a
single mailbox for each user. The Post Office Protocol defines how
the email client should talk to the POP server.
POP can perform the following functions:
Retrieve mail from an ISP and delete it on the server.
Retrieve mail from an ISP but not delete it on the server.
Ask whether new mail has arrived but not retrieve it.
Peek at a few lines of a message to see whether it is worth
retrieving.
18. IMAP
IMAP is a more advanced protocol for receiving messages. IMAP
stands for Internet Message Access Protocol. It permits a "client"
email program to access remote message stores as if they were
local.
Key features of IMAP include:
It is fully compatible with Internet messaging standards, e.g. MIME.
It allows message access and management from more than one computer.
It allows access without reliance on less efficient file access protocols.
It provides support for "online", "offline", and "disconnected" access modes.
It supports concurrent access to shared mailboxes
Client software needs no knowledge about the server's file store format.
19. The main drawback of the commonly used Simple Mail Transfer
Protocol (SMTP) is that it provides no reliable mechanism of
checking the identity of the message source.
Proposals that overcome this disadvantage by providing better ways of
sender identification include:
Designated Mailers Protocol (DMP)
Trusted Email Open Standard (TEOS), and
SenderID
20. Local Changes In Transmission
Some solutions do not require global protocol changes but
propose to manage email in a different way locally.
Li and Saito propose slowing down operations on messages
that are likely to be spam, using the past behavior of senders
for fast prediction of the message category. Spam mails are then
kept in a lower-priority queue, while ham mails go into a
higher-priority queue.
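The two-queue idea can be sketched with a single heap ordered by predicted category. The class name, the per-sender spam fraction used as "past behavior", and the 0.5 cut-off are assumptions of this example rather than details of the cited proposals.

```python
import heapq
from itertools import count

class PrioritizedMailQueue:
    """Serves mail predicted to be ham before mail predicted to be spam."""

    HAM, SPAM = 0, 1  # the lower number is served first

    def __init__(self, spam_history):
        # Hypothetical history: sender -> fraction of their past mail that was spam.
        self.spam_history = spam_history
        self._heap = []
        self._order = count()  # tie-breaker keeps first-in, first-out order per class

    def predict(self, sender: str) -> int:
        """Fast prediction from sender history; unknown senders count as ham."""
        return self.SPAM if self.spam_history.get(sender, 0.0) > 0.5 else self.HAM

    def enqueue(self, sender: str, message: str) -> None:
        heapq.heappush(self._heap, (self.predict(sender), next(self._order), message))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]
```

Likely spam is not dropped here, only delayed, so a wrong prediction costs latency rather than a lost message.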
21. Language- Based Filters
Filters based on email body language
Can be used to filter out spam written in foreign languages
Examples of such models include dynamic Markov compression and prediction
by partial matching, which were successfully used with data extracted from
both the bodies and the headers of messages.
Medlock proposed smoothed higher-order N-gram language models.
N-gram language models are based on the
assumption that the existence of a certain word at a certain position in a
sequence depends only on the previous N-1 words.
22. ISCF - 2006
N–gram Approach
$P(w_1 \ldots w_n) = \prod_{i} P(w_i \mid w_{i-n+1} \ldots w_{i-1})$
Language Model Approach
Looks for repeated patterns
Each word depends probabilistically on the n-1 preceding
words.
Calculating and Comparing the N-Gram profiles.
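A minimal sketch of calculating and comparing N-gram profiles: one model is trained on spam, one on legitimate mail, and a message is assigned to the model that gives it the higher probability. Add-one smoothing is used here for brevity and is a stand-in, not the smoothing scheme of the cited work; the class and function names are hypothetical.

```python
import math
from collections import Counter

def ngrams(words, n):
    """Return one n-gram per word: the word preceded by its n-1 predecessors."""
    padded = ["<s>"] * (n - 1) + words
    return [tuple(padded[i:i + n]) for i in range(len(words))]

class NGramModel:
    def __init__(self, n: int = 2):
        self.n = n
        self.ngram_counts = Counter()    # counts of (context..., word)
        self.context_counts = Counter()  # counts of the n-1 word context
        self.vocab = set()

    def train(self, text: str) -> None:
        words = text.lower().split()
        self.vocab.update(words)
        for gram in ngrams(words, self.n):
            self.ngram_counts[gram] += 1
            self.context_counts[gram[:-1]] += 1

    def log_prob(self, text: str) -> float:
        """Log of the product of P(word | previous n-1 words), add-one smoothed."""
        words = text.lower().split()
        v = len(self.vocab) + 1  # one extra slot for unseen words
        return sum(
            math.log((self.ngram_counts[g] + 1) / (self.context_counts[g[:-1]] + v))
            for g in ngrams(words, self.n)
        )

def classify(text: str, spam_model: NGramModel, ham_model: NGramModel) -> str:
    return "spam" if spam_model.log_prob(text) > ham_model.log_prob(text) else "ham"
```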
23. Non-content Features
Methods based on structural analysis of the header and of meta-level
features, such as the number of attachments, use specific technical aspects of
email and so are specific to spam filtering.
Leiba proposed a method that detects spam by analyzing the SMTP path. The
method analyzes the IP addresses in the reverse path and ascribes a
reputation to each according to the amount of spam and legitimate mail delivered
through it. Both this and the subsequent method can be viewed as
developments of the idea of blacklisting and whitelisting.
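A loose sketch of reputation by delivery path, not the exact algorithm of the cited method: each IP address accumulates counts of spam and legitimate mail seen through it, and a new message is scored by the worst reputation along its path. The smoothing and the use of the maximum are assumptions of this example.

```python
from collections import defaultdict

class PathReputation:
    """Tracks how much spam and legitimate mail passed through each IP address."""

    def __init__(self):
        self.counts = defaultdict(lambda: [0, 0])  # ip -> [spam count, ham count]

    def record(self, path, is_spam: bool) -> None:
        """Update every IP on the delivery path of a message with a known label."""
        for ip in path:
            self.counts[ip][0 if is_spam else 1] += 1

    def reputation(self, ip: str) -> float:
        """Smoothed fraction of spam seen through this IP (0.5 when unknown)."""
        spam, ham = self.counts.get(ip, (0, 0))
        return (spam + 1) / (spam + ham + 2)

    def score(self, path) -> float:
        """Score a non-empty path by its least trustworthy hop."""
        return max(self.reputation(ip) for ip in path)
```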
Behavior-based filtering rests on extracting knowledge about the behavior
behind a given message or group of messages from their non-content features.
Spam is then detected by comparing this knowledge to predefined or extracted
knowledge about the typical behaviors of malicious and normal users.
24. Content Based Classification
One popular practice when creating spam pages is “keyword
stuffing”: filling a page with repeated keywords. Analyzing the
keywords within a page therefore helps detect spam; excessive
appearance of keywords in the title of a page is a clear indication of spam.
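Excessive keyword repetition can be measured as the share of a text taken up by its most frequent word. The function names, the 0.4 threshold, and the five-word minimum below are illustrative assumptions.

```python
import re
from collections import Counter

def tokenize(text: str):
    return re.findall(r"[a-z']+", text.lower())

def keyword_density(words) -> float:
    """Fraction of all words accounted for by the single most frequent word."""
    if not words:
        return 0.0
    _, top_count = Counter(words).most_common(1)[0]
    return top_count / len(words)

def looks_stuffed(text: str, threshold: float = 0.4, min_words: int = 5) -> bool:
    """Flag text where one keyword dominates; very short texts are never flagged."""
    words = tokenize(text)
    return len(words) >= min_words and keyword_density(words) >= threshold
```

The minimum length guards against flagging short, ordinary titles where any single word trivially makes up a large share.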
Most available anti-spam techniques analyze the content and the
header of the incoming email. They try
to infer the kind of material contained in the
message by looking for specific patterns typical of a spam
message. For this reason, these filters are known as “content-based.”
25. Hybrid Filters
The hybrid technique can be implemented using various
models, considering the resources available on the server. One
proposed framework combines white/black listing and
challenge-response methods. Bhuleskar, after identifying the advantages
and disadvantages of various filters, combines the advantages of
the various filtering techniques and proposes a hybrid filter.
Hybrid solutions need to be carefully designed, as the combination
might increase time complexity while increasing security and
accuracy.
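One way to keep the added time cost in check is to chain the filters so that cheap checks run first and later stages only see mail that is still undecided. The stage functions, the dictionary layout of a mail, and the fallback to a challenge are assumptions of this sketch, not the design of any framework cited above.

```python
from typing import Callable, Optional, Sequence

Verdict = Optional[str]           # "ham", "spam", or None when a stage cannot decide
Stage = Callable[[dict], Verdict]

def list_stage(whitelist, blacklist) -> Stage:
    """Cheap lookup stage based on white/black listing of the sender."""
    def stage(mail: dict) -> Verdict:
        if mail["sender"] in whitelist:
            return "ham"
        if mail["sender"] in blacklist:
            return "spam"
        return None
    return stage

def keyword_stage(keywords, min_hits: int = 2) -> Stage:
    """Content stage: flags mail whose body contains several spam keywords."""
    def stage(mail: dict) -> Verdict:
        body = mail["body"].lower()
        hits = sum(1 for keyword in keywords if keyword in body)
        return "spam" if hits >= min_hits else None
    return stage

def hybrid_filter(mail: dict, stages: Sequence[Stage], default: str = "challenge") -> str:
    """Return the first decisive verdict; undecided mail falls back to a challenge."""
    for stage in stages:
        verdict = stage(mail)
        if verdict is not None:
            return verdict
    return default
```

Usage, with made-up addresses and keywords:

```python
stages = [
    list_stage({"alice@example.org"}, {"bulk@spam.example"}),
    keyword_stage({"free", "prize", "winner"}),
]
print(hybrid_filter({"sender": "carol@example.net", "body": "hello"}, stages))
```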
26. Conclusion
We have seen various types of image spam and their content,
and we have discussed various solutions proposed for the
image spam problem. As spammers have innumerable techniques
for creating a spam image, research toward a perfect spam filter
remains fertile. Several methods have been proposed, and almost all
of them share the objectives of high
processing speed and high accuracy, which make them applicable in a
time-critical environment like the Internet. Future work includes
analysis and comparison of some of the reviewed techniques in
terms of computation and time complexity along with accuracy.
27. References
Delany, S.J., Cunningham, P., Tsymbal, A. and Coyle, L., “A
case-based technique for tracking concept drift in spam filtering,”
Knowledge-Based Systems, pp. 187–195, 2004.
Drake, C., Oliver, J. and Koontz, E., “Anatomy of a phishing
email,” Proceedings of the First Conference on Email and
Anti-Spam, CEAS’2004, 2004.
Fawcett, T., “‘In vivo’ spam filtering: a challenge problem
for data mining,” KDD Explorations, vol. 5, no. 2, pp. 140–148,
2003.
Gomes, L.H., Cazita, C., Almeida, J.M., Almeida, V. and Meira, W.,
“Characterizing a spam traffic,” IMC ’04: Proceedings of the 4th
ACM SIGCOMM Conference on Internet Measurement, pp.
356–369, New York, NY, USA, ACM Press, ISBN 1-58113-821-0,
2005.