The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
This document discusses techniques for detecting compromised machines ("zombies") that are involved in spamming activities on a network. It proposes using heuristic search and message partitioning/replication to minimize spam access from zombies while ensuring data confidentiality and integrity. Zombies are controlled by botnet herders and use various techniques to send large volumes of spam while remaining untraceable, such as exploiting vulnerabilities on Windows systems to use infected machines as mail relays or sending spam from dynamic IP addresses. The document analyzes spam sent from different IPs to examine the extent to which spam originates from a small number of hosts.
A novel hybrid approach of SVM combined with NLP and probabilistic neural ne...IJECEIAES
Phishing attacks are one of the slanting cyber-attacks that apply socially engineered messages that are imparted to individuals from expert hackers going for tricking clients to uncover their delicate data, the most mainstream correspondence channel to those messages is through clients' emails. Phishing has turned into a generous danger for web clients and a noteworthy reason for money related misfortunes. Therefore, different arrangements have been created to handle this issue. Deceitful emails, also called phishing emails, utilize a scope of impact strategies to convince people to react, for example, promising a fiscal reward or summoning a feeling of criticalness. Regardless of far reaching alerts and intends to instruct clients to distinguish phishing sends, these are as yet a pervasive practice and a worthwhile business. The creators accept that influence, as a style of human correspondence intended to impact others, has a focal job in fruitful advanced tricks. Cyber criminals have ceaselessly propelling their techniques for assault. The current strategies to recognize the presence of such malevolent projects and to keep them from executing are static, dynamic and hybrid analysis. In this work we are proposing a hybrid methodology for phishing detection incorporating feature extraction and classification of the mails using SVM. At last, alongside the chose features, the PNN characterizes the spam mails from the genuine mails with more exactness and accuracy.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
A multi classifier prediction model for phishing detectioneSAT Journals
Abstract Phishing is the technique of stealing personal and sensitive information from an email client by sending emails that impersonates like the ones from some trustworthy organizations. Phishing mails are a specific type of spam mails; however the effects of them are much more terrible than alternate sorts. Mostly the phishing attackers aim the clients of the financial organizations, so its detection needs high priority. Lots of research activities are done to detect the phished emails, in the proposed methodology a multi-classifier prediction model is introduced for detecting phished emails. Our contention is that solitary classifier prediction might not be satisfactory to urge the clearest picture of the phishing email; multi-classifier prediction has accuracy 99.8% with an FP rate of 0.8%. Keywords: Phishing email detection, Machine learning techniques, Multi-classifier prediction model, Majority voting
Cross breed Spam Categorization Method using Machine Learning TechniquesIJSRED
This document presents a hybrid spam filtering method using machine learning techniques of Naive Bayes and Markov Random Fields. It aims to efficiently identify and filter spam emails. The proposed method is evaluated based on accuracy, precision, and time taken. The results show the hybrid technique achieves a high true positive rate in detecting spam emails. Keywords discussed are email spam, Naive Bayes algorithm, and Markov Random Fields.
A multi layer architecture for spam-detection systemcsandit
As the email is becoming a prominent mode of communication so are the attempts to misuse it to
take undue advantage of its low cost and high reachability. However, as email communication
is very cheap, spammers are taking advantage of it for advertising their products, for
committing cybercrimes. So, researchers are working hard to combat with the spammers. Many
spam detections techniques and systems are built to fight spammers. But the spammers are
continuously finding new ways to defeat the existing filters. This paper describes the existing
spam filters techniques and proposes a multi-level architecture for spam email detection. We
present the analysis of the architecture to prove the effectiveness of the architecture.
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORKijcseit
The World Wide Web has become an important part of our everyday life for information communication
and knowledge dissemination. It helps to transact information timely, rapidly and easily. Identifying theft
and identity fraud are referred as two sides of cyber-crime in which hackers and malicious users obtain the
personal data of existing legitimate users to attempt fraud or deception motivation for financial gain.
Malicious URLs host unsolicited content (spam, phishing, drive-by exploits, etc.) and lure unsuspecting
users to become victims of scams (monetary loss, theft of private information, and malware installation),
and cause losses of billions of dollars every year. To detect such crimes systems should be fast and precise
with the ability to detect new malicious content. Traditionally, this detection is done mostly through the
usage of blacklists. However, blacklists cannot be exhaustive, and lack the ability to detect newly generated
malicious URLs. To improve the generality of malicious URL detectors, machine learning techniques have
been explored with increasing attention in recent years. In this paper, I use a simple algorithm to detect
and predicting URLs it is good or bad and compared with two other algorithms to know (SVM, LR).
Identifying Valid Email Spam Emails Using Decision TreeEditor IJCATR
The increasing use of e-mail and the growing trend of Internet users sending unsolicited bulk e-mail, the need for an antispam
filtering or have created, Filter large poster have been produced in this area, each with its own method and some parameters are
to recognize spam. The advantage of this method is the simultaneous use of two algorithms decision tree ID3 - Mamdani and Naive
Bayesian is fuzzy. The first two algorithms are then used to detect spam Bagging approach is to identify spam. In the evaluation of this
dataset contains a thousand letters have been analyzed by the software Weka charts provided in spam detection accuracy than previous
methods of improvement
This document discusses techniques for detecting compromised machines ("zombies") that are involved in spamming activities on a network. It proposes using heuristic search and message partitioning/replication to minimize spam access from zombies while ensuring data confidentiality and integrity. Zombies are controlled by botnet herders and use various techniques to send large volumes of spam while remaining untraceable, such as exploiting vulnerabilities on Windows systems to use infected machines as mail relays or sending spam from dynamic IP addresses. The document analyzes spam sent from different IPs to examine the extent to which spam originates from a small number of hosts.
A novel hybrid approach of SVM combined with NLP and probabilistic neural ne...IJECEIAES
Phishing attacks are one of the slanting cyber-attacks that apply socially engineered messages that are imparted to individuals from expert hackers going for tricking clients to uncover their delicate data, the most mainstream correspondence channel to those messages is through clients' emails. Phishing has turned into a generous danger for web clients and a noteworthy reason for money related misfortunes. Therefore, different arrangements have been created to handle this issue. Deceitful emails, also called phishing emails, utilize a scope of impact strategies to convince people to react, for example, promising a fiscal reward or summoning a feeling of criticalness. Regardless of far reaching alerts and intends to instruct clients to distinguish phishing sends, these are as yet a pervasive practice and a worthwhile business. The creators accept that influence, as a style of human correspondence intended to impact others, has a focal job in fruitful advanced tricks. Cyber criminals have ceaselessly propelling their techniques for assault. The current strategies to recognize the presence of such malevolent projects and to keep them from executing are static, dynamic and hybrid analysis. In this work we are proposing a hybrid methodology for phishing detection incorporating feature extraction and classification of the mails using SVM. At last, alongside the chose features, the PNN characterizes the spam mails from the genuine mails with more exactness and accuracy.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
A multi classifier prediction model for phishing detectioneSAT Journals
Abstract Phishing is the technique of stealing personal and sensitive information from an email client by sending emails that impersonates like the ones from some trustworthy organizations. Phishing mails are a specific type of spam mails; however the effects of them are much more terrible than alternate sorts. Mostly the phishing attackers aim the clients of the financial organizations, so its detection needs high priority. Lots of research activities are done to detect the phished emails, in the proposed methodology a multi-classifier prediction model is introduced for detecting phished emails. Our contention is that solitary classifier prediction might not be satisfactory to urge the clearest picture of the phishing email; multi-classifier prediction has accuracy 99.8% with an FP rate of 0.8%. Keywords: Phishing email detection, Machine learning techniques, Multi-classifier prediction model, Majority voting
Cross breed Spam Categorization Method using Machine Learning TechniquesIJSRED
This document presents a hybrid spam filtering method using machine learning techniques of Naive Bayes and Markov Random Fields. It aims to efficiently identify and filter spam emails. The proposed method is evaluated based on accuracy, precision, and time taken. The results show the hybrid technique achieves a high true positive rate in detecting spam emails. Keywords discussed are email spam, Naive Bayes algorithm, and Markov Random Fields.
A multi layer architecture for spam-detection systemcsandit
As the email is becoming a prominent mode of communication so are the attempts to misuse it to
take undue advantage of its low cost and high reachability. However, as email communication
is very cheap, spammers are taking advantage of it for advertising their products, for
committing cybercrimes. So, researchers are working hard to combat with the spammers. Many
spam detections techniques and systems are built to fight spammers. But the spammers are
continuously finding new ways to defeat the existing filters. This paper describes the existing
spam filters techniques and proposes a multi-level architecture for spam email detection. We
present the analysis of the architecture to prove the effectiveness of the architecture.
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORKijcseit
The World Wide Web has become an important part of our everyday life for information communication
and knowledge dissemination. It helps to transact information timely, rapidly and easily. Identifying theft
and identity fraud are referred as two sides of cyber-crime in which hackers and malicious users obtain the
personal data of existing legitimate users to attempt fraud or deception motivation for financial gain.
Malicious URLs host unsolicited content (spam, phishing, drive-by exploits, etc.) and lure unsuspecting
users to become victims of scams (monetary loss, theft of private information, and malware installation),
and cause losses of billions of dollars every year. To detect such crimes systems should be fast and precise
with the ability to detect new malicious content. Traditionally, this detection is done mostly through the
usage of blacklists. However, blacklists cannot be exhaustive, and lack the ability to detect newly generated
malicious URLs. To improve the generality of malicious URL detectors, machine learning techniques have
been explored with increasing attention in recent years. In this paper, I use a simple algorithm to detect
and predicting URLs it is good or bad and compared with two other algorithms to know (SVM, LR).
Identifying Valid Email Spam Emails Using Decision TreeEditor IJCATR
The increasing use of e-mail and the growing trend of Internet users sending unsolicited bulk e-mail, the need for an antispam
filtering or have created, Filter large poster have been produced in this area, each with its own method and some parameters are
to recognize spam. The advantage of this method is the simultaneous use of two algorithms decision tree ID3 - Mamdani and Naive
Bayesian is fuzzy. The first two algorithms are then used to detect spam Bagging approach is to identify spam. In the evaluation of this
dataset contains a thousand letters have been analyzed by the software Weka charts provided in spam detection accuracy than previous
methods of improvement
Honeywords for Password Security and ManagementIRJET Journal
This document summarizes a research paper on using honeywords to improve password security. Honeywords involve storing fake passwords (honeywords) along with users' real passwords in a hashed format. If an attacker steals the hashed password database, they cannot distinguish real passwords from honeywords. Any attempt to login with a honeyword would trigger an alarm. The document discusses how honeywords can detect password database breaches and prevent attackers from impersonating users. It also outlines the objectives of generating realistic honeywords and reducing storage costs compared to previous honeyword approaches.
Now a days Short Message Service(SMS) is most popular way to communication for mobile user because it is cheapest mode or version for communication than other mode.SMS is used for transmitting short length msg of around 160 character to different devices such as smart phones, cellular phones, PDAs using standardized communication protocols. The amount of Short Message Service (SMS) spam is increasing. SMS spam should be put into the spam folder, not the inbox. The growth of the mobile phone users has led to a dramatic increase in SMS spam messages. To avoid this problem SMS filtering Techniques are used. Our proposed approach filters SMS spam on an independent mobile phone on a large dataset and acceptable processing time. There are different approaches able to automatically detect and remove most of these messages, and the best-known ones are based on Bayesian decision theory and Support Vector Machines. Riya Mehta | Ankita Gandhi"A Survey: SMS Spam Filtering" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-3 , April 2018, URL: http://www.ijtsrd.com/papers/ijtsrd12850.pdf http://www.ijtsrd.com/computer-science/data-miining/12850/a-survey-sms-spam-filtering/riya-mehta
Analysis of an image spam in email based on content analysisijnlc
Researchers initially have addressed the problem of spam detection as a text classification or
categorization problem. However, as spammers’ continue to develop new techniques and the type of email
content becomes more disparate, text-based anti-spam approaches alone are not sufficiently enough in
preventing spam. In an attempt to defeat the anti-spam development technologies, spammers have recently
adopted the image spam trick to make the scrutiny of emails’ body text inefficient. The main idea behind
this project is to design a spam detection system. The system will be enabled to analyze the content of
emails, in particular the artificially generated image sent as attachment in an email. The system will
analyze the image content and classify the embedded image as spam or legitimate hence classify the email
accordingly.
IRJET- Identification of Clone Attacks in Social Networking SitesIRJET Journal
This document discusses identifying clone attacks in social networking sites. It proposes a two-level approach using fuzzy-sim algorithm for profile matching and three algorithms (Predictive FP Growth, Eclat, and Apriori) for user activity matching. An experiment compares the execution times of the three algorithms, finding that Predictive FP Growth has the fastest time of 181 milliseconds, making it best for user activity matching.
This document discusses email spam detection. It begins with the objectives of classifying emails as spam or ham (legitimate emails) and providing users knowledge about fake vs real emails. It introduces spam as unsolicited commercial email involving mass mailing, explains how spammers obtain emails and the spam lifecycle. The document also outlines different types of spam filters including header, Bayesian, permission and content filters. It provides a flowchart of the email processing and preprocessing steps like removing stop words and lemmatization (grouping word forms). Finally, it discusses the scope of the project, including increased security and reduced costs.
Clustering Categorical Data for Internet Security ApplicationsIJSTA
This document summarizes research on clustering categorical data for internet security applications. It discusses using clustering techniques for malware categorization, phishing website detection, and detecting secure emails. Feature extraction and categorization are generally used to automatically group file samples or websites. The document also reviews several related works applying clustering and other techniques for malware analysis, phishing detection, and analyzing privacy-breaching malware behavior.
This document discusses web spam detection using machine learning techniques. Specifically, it proposes an improved Naive Bayes classifier that incorporates user feedback and domain-specific features to better detect spam pages. The key points are:
1) Web spam has become a serious problem as internet usage has increased, threatening search engines and users. Spam pages aim to deceive search engines' ranking algorithms.
2) Existing spam detection techniques like content analysis are still lacking and Naive Bayes classifiers are commonly used but have limitations like treating all terms equally.
3) The paper proposes an improved Naive Bayes classifier that assigns different weights to terms based on user feedback and considers domain-specific features to reduce false positives and negatives and improve accuracy
Email spam, also known as junk email or unsolicited
bulk email(UBE), is a subset of electronic spam involving nearly
identical messages sent to numerous recipients by email. Clicking
on links in spam email may send users to phishing web sites or sites
that are hosting malware. Spam email may also include malware as
scripts or other executable file attachments. Definitions of spam
usually include the aspects that email is unsolicited and sent in bulk
In order to overcome spam problem many researchers have
been conducted and various method of anti-spam filtering have
been implemented. A spam filter is a set of instruction for
determining the status of the received email. Spam filters are used
to prevent spam email passing through the recipient. The main
challenge is how to design an effective spam filter that allows
desired email to pass through while blocking the unwanted email.
IRJET- An Effective Analysis of Anti Troll System using Artificial Intell...IRJET Journal
This document discusses various techniques for detecting trolls using artificial intelligence and machine learning. It first reviews related work on sentiment analysis, supervised machine learning for troll detection, real-time sentiment analysis, and analyzing vulnerabilities in social networks. It then analyzes the limitations of current troll detection systems and how AI/ML solutions can help overcome these. The literature survey covers key approaches used for troll detection, including sentiment analysis, supervised learning models, and analyzing post vulnerabilities.
The document presents a presentation on spam email detection. It introduces the topic and defines spam emails. It then discusses using a naïve Bayes classifier to classify emails as spam or not spam based on feature vectors. It identifies problems caused by spam emails and the objectives of detection. It reviews literature on spam prevention. The document outlines the project scope, requirements, feasibility study, testing approach using datasets, and output. It concludes that the method can effectively classify emails but may have limitations when dealing with large volumes of emails.
1. The document presents a research project aimed at developing an intelligent hybrid biometric system using machine learning techniques to improve security system performance.
2. The proposed approaches include a unimodal biometric approach using fingerprint minutiae and principal curves, a hybrid approach combining transformation and cryptosystem methods, and a watermarking approach using DNA and fingerprints.
3. The goal is to increase security system performance by decreasing the biometric system error rate.
lectronic-mail is widely used most suitable method of transferring messages electronically from one
person to another, rising from and going to any part of the world. Main features of Electronic mail is its speed,
dependability, well-equipped storage options and a large number of added services make it highly well-liked
among people from all sectors of business and society. But being popular it also has negative side too. Electronics
mails are preferred media for a large number of attacks over the internet.. A number of the most popular attacks over
the internet include spams. Some methods are essentially in detection of spam related mails but they have higher false
positives. A number of filters such as Checksum-based filters, Bayesian filters, machine learning based and
memory-based filters are usually used in order to recognize spams. As spammers constantly try to find a way to
avoid existing filters, a new filters need to be developed to catch spam. This paper proposes to find an
resourceful spam mail filtering method using user profile base ontology. Ontologies permit for machineunderstandable
semantics of data. It is main to interchange information with each other for more efficient spam
filtering. Thus, it is essential to build ontology and a framework for capable email filtering. Using ontology that is
particularly designed to filter spam, bunch of useless bulk email could be filtered out on the system. We propose a
user profile-based spam filter that classifies email based on the likelihood that User profile within it have been
included in spam or valid email.
This document summarizes a research paper that proposes a new email spam detection framework using feature selection and similarity coefficients. The framework first applies feature selection to identify the most important features for distinguishing spam and non-spam emails. It then calculates similarity coefficients between email pairs to quantify their similarity based on the selected features. The researchers claim this approach improves spam detection accuracy and rate by removing irrelevant and redundant features before classification. They test their framework on a standard email dataset and find that detection performance is better after applying feature selection compared to using all original features.
IRJET- Analysis and Detection of E-Mail Phishing using PysparkIRJET Journal
This document proposes a method to analyze and detect phishing emails using PySpark. It involves using text analysis and link analysis techniques such as applying a Naive Bayes classifier to email text and checking links using the VirusTotal API. The method aims to provide more accurate phishing detection compared to existing approaches. It discusses related work on phishing email detection using machine learning and analyzes the proposed system's design and implementation steps, which include data preprocessing, feature extraction, classification using Naive Bayes, and link analysis.
An Approach for Malicious Spam Detection in Email with Comparison of Differen...IRJET Journal
This document summarizes a research paper that proposes a model to improve detection of malicious spam emails through feature selection. The model employs a novel dataset for feature selection to optimize classification parameters, prediction accuracy, and computation time. Feature selection is expected to improve training time and classification accuracy. The paper also compares various classifiers, including Naive Bayes and Support Vector Machine, on the selected feature subset. The goal is to automatically learn to detect malicious spam emails, which threaten privacy and security by spreading malware, phishing links, and sensitive data theft.
How Anonymous Can Someone be on Twitter?George Sam
The document describes a study that aimed to identify Twitter users based on their writing styles in tweets. The researchers collected over 3 million tweets from 100 users and extracted features such as word counts and frequencies. They were able to correctly identify the author of tweets 67% of the time using tf-idf vectors and the extracted stylistic features.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Lopez era una persona muy positiva que siempre veía el lado bueno de las situaciones. A pesar de ser asaltado y baleado, continuó manteniendo una actitud positiva y eligió vivir. Él creía que todos los días tenemos la elección de cómo vivir la vida a través de nuestra actitud.
The International Journal of Engineering and Science (The IJES)theijes
This document summarizes a study on the environmental awareness of rural residents in Hamirpur District, Himachal Pradesh, India. The study surveyed 1208 residents across 25 villages. It assessed their awareness of different environmental issues through a questionnaire. The results showed high awareness of local issues like air, water, and noise pollution, but lower awareness of global issues like climate change. Most respondents gained environmental knowledge from TV and newspapers. While awareness levels were reasonably high, more work is still needed to increase awareness and promote environmentally responsible behavior.
Honeywords for Password Security and ManagementIRJET Journal
This document summarizes a research paper on using honeywords to improve password security. Honeywords involve storing fake passwords (honeywords) along with users' real passwords in a hashed format. If an attacker steals the hashed password database, they cannot distinguish real passwords from honeywords. Any attempt to login with a honeyword would trigger an alarm. The document discusses how honeywords can detect password database breaches and prevent attackers from impersonating users. It also outlines the objectives of generating realistic honeywords and reducing storage costs compared to previous honeyword approaches.
Now a days Short Message Service(SMS) is most popular way to communication for mobile user because it is cheapest mode or version for communication than other mode.SMS is used for transmitting short length msg of around 160 character to different devices such as smart phones, cellular phones, PDAs using standardized communication protocols. The amount of Short Message Service (SMS) spam is increasing. SMS spam should be put into the spam folder, not the inbox. The growth of the mobile phone users has led to a dramatic increase in SMS spam messages. To avoid this problem SMS filtering Techniques are used. Our proposed approach filters SMS spam on an independent mobile phone on a large dataset and acceptable processing time. There are different approaches able to automatically detect and remove most of these messages, and the best-known ones are based on Bayesian decision theory and Support Vector Machines. Riya Mehta | Ankita Gandhi"A Survey: SMS Spam Filtering" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-3 , April 2018, URL: http://www.ijtsrd.com/papers/ijtsrd12850.pdf http://www.ijtsrd.com/computer-science/data-miining/12850/a-survey-sms-spam-filtering/riya-mehta
Analysis of an image spam in email based on content analysisijnlc
Researchers initially have addressed the problem of spam detection as a text classification or
categorization problem. However, as spammers’ continue to develop new techniques and the type of email
content becomes more disparate, text-based anti-spam approaches alone are not sufficiently enough in
preventing spam. In an attempt to defeat the anti-spam development technologies, spammers have recently
adopted the image spam trick to make the scrutiny of emails’ body text inefficient. The main idea behind
this project is to design a spam detection system. The system will be enabled to analyze the content of
emails, in particular the artificially generated image sent as attachment in an email. The system will
analyze the image content and classify the embedded image as spam or legitimate hence classify the email
accordingly.
IRJET- Identification of Clone Attacks in Social Networking SitesIRJET Journal
This document discusses identifying clone attacks in social networking sites. It proposes a two-level approach using fuzzy-sim algorithm for profile matching and three algorithms (Predictive FP Growth, Eclat, and Apriori) for user activity matching. An experiment compares the execution times of the three algorithms, finding that Predictive FP Growth has the fastest time of 181 milliseconds, making it best for user activity matching.
This document discusses email spam detection. It begins with the objectives of classifying emails as spam or ham (legitimate emails) and providing users knowledge about fake vs real emails. It introduces spam as unsolicited commercial email involving mass mailing, explains how spammers obtain emails and the spam lifecycle. The document also outlines different types of spam filters including header, Bayesian, permission and content filters. It provides a flowchart of the email processing and preprocessing steps like removing stop words and lemmatization (grouping word forms). Finally, it discusses the scope of the project, including increased security and reduced costs.
Clustering Categorical Data for Internet Security ApplicationsIJSTA
This document summarizes research on clustering categorical data for internet security applications. It discusses using clustering techniques for malware categorization, phishing website detection, and detecting secure emails. Feature extraction and categorization are generally used to automatically group file samples or websites. The document also reviews several related works applying clustering and other techniques for malware analysis, phishing detection, and analyzing privacy-breaching malware behavior.
This document discusses web spam detection using machine learning techniques. Specifically, it proposes an improved Naive Bayes classifier that incorporates user feedback and domain-specific features to better detect spam pages. The key points are:
1) Web spam has become a serious problem as internet usage has increased, threatening search engines and users. Spam pages aim to deceive search engines' ranking algorithms.
2) Existing spam detection techniques like content analysis are still lacking and Naive Bayes classifiers are commonly used but have limitations like treating all terms equally.
3) The paper proposes an improved Naive Bayes classifier that assigns different weights to terms based on user feedback and considers domain-specific features to reduce false positives and negatives and improve accuracy
Email spam, also known as junk email or unsolicited
bulk email(UBE), is a subset of electronic spam involving nearly
identical messages sent to numerous recipients by email. Clicking
on links in spam email may send users to phishing web sites or sites
that are hosting malware. Spam email may also include malware as
scripts or other executable file attachments. Definitions of spam
usually include the aspects that email is unsolicited and sent in bulk
In order to overcome spam problem many researchers have
been conducted and various method of anti-spam filtering have
been implemented. A spam filter is a set of instruction for
determining the status of the received email. Spam filters are used
to prevent spam email passing through the recipient. The main
challenge is how to design an effective spam filter that allows
desired email to pass through while blocking the unwanted email.
IRJET- An Effective Analysis of Anti Troll System using Artificial Intell...IRJET Journal
This document discusses various techniques for detecting trolls using artificial intelligence and machine learning. It first reviews related work on sentiment analysis, supervised machine learning for troll detection, real-time sentiment analysis, and analyzing vulnerabilities in social networks. It then analyzes the limitations of current troll detection systems and how AI/ML solutions can help overcome these. The literature survey covers key approaches used for troll detection, including sentiment analysis, supervised learning models, and analyzing post vulnerabilities.
The document presents a presentation on spam email detection. It introduces the topic and defines spam emails. It then discusses using a naïve Bayes classifier to classify emails as spam or not spam based on feature vectors. It identifies problems caused by spam emails and the objectives of detection. It reviews literature on spam prevention. The document outlines the project scope, requirements, feasibility study, testing approach using datasets, and output. It concludes that the method can effectively classify emails but may have limitations when dealing with large volumes of emails.
1. The document presents a research project aimed at developing an intelligent hybrid biometric system using machine learning techniques to improve security system performance.
2. The proposed approaches include a unimodal biometric approach using fingerprint minutiae and principal curves, a hybrid approach combining transformation and cryptosystem methods, and a watermarking approach using DNA and fingerprints.
3. The goal is to increase security system performance by decreasing the biometric system error rate.
lectronic-mail is widely used most suitable method of transferring messages electronically from one
person to another, rising from and going to any part of the world. Main features of Electronic mail is its speed,
dependability, well-equipped storage options and a large number of added services make it highly well-liked
among people from all sectors of business and society. But being popular it also has negative side too. Electronics
mails are preferred media for a large number of attacks over the internet.. A number of the most popular attacks over
the internet include spams. Some methods are essentially in detection of spam related mails but they have higher false
positives. A number of filters such as Checksum-based filters, Bayesian filters, machine learning based and
memory-based filters are usually used in order to recognize spams. As spammers constantly try to find a way to
avoid existing filters, a new filters need to be developed to catch spam. This paper proposes to find an
resourceful spam mail filtering method using user profile base ontology. Ontologies permit for machineunderstandable
semantics of data. It is main to interchange information with each other for more efficient spam
filtering. Thus, it is essential to build ontology and a framework for capable email filtering. Using ontology that is
particularly designed to filter spam, bunch of useless bulk email could be filtered out on the system. We propose a
user profile-based spam filter that classifies email based on the likelihood that User profile within it have been
included in spam or valid email.
This document summarizes a research paper that proposes a new email spam detection framework using feature selection and similarity coefficients. The framework first applies feature selection to identify the most important features for distinguishing spam and non-spam emails. It then calculates similarity coefficients between email pairs to quantify their similarity based on the selected features. The researchers claim this approach improves spam detection accuracy and rate by removing irrelevant and redundant features before classification. They test their framework on a standard email dataset and find that detection performance is better after applying feature selection compared to using all original features.
IRJET- Analysis and Detection of E-Mail Phishing using PysparkIRJET Journal
This document proposes a method to analyze and detect phishing emails using PySpark. It involves using text analysis and link analysis techniques such as applying a Naive Bayes classifier to email text and checking links using the VirusTotal API. The method aims to provide more accurate phishing detection compared to existing approaches. It discusses related work on phishing email detection using machine learning and analyzes the proposed system's design and implementation steps, which include data preprocessing, feature extraction, classification using Naive Bayes, and link analysis.
An Approach for Malicious Spam Detection in Email with Comparison of Differen...IRJET Journal
This document summarizes a research paper that proposes a model to improve detection of malicious spam emails through feature selection. The model employs a novel dataset for feature selection to optimize classification parameters, prediction accuracy, and computation time. Feature selection is expected to improve training time and classification accuracy. The paper also compares various classifiers, including Naive Bayes and Support Vector Machine, on the selected feature subset. The goal is to automatically learn to detect malicious spam emails, which threaten privacy and security by spreading malware, phishing links, and sensitive data theft.
How Anonymous Can Someone be on Twitter?George Sam
The document describes a study that aimed to identify Twitter users based on their writing styles in tweets. The researchers collected over 3 million tweets from 100 users and extracted features such as word counts and frequencies. They were able to correctly identify the author of tweets 67% of the time using tf-idf vectors and the extracted stylistic features.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Lopez era una persona muy positiva que siempre veía el lado bueno de las situaciones. A pesar de ser asaltado y baleado, continuó manteniendo una actitud positiva y eligió vivir. Él creía que todos los días tenemos la elección de cómo vivir la vida a través de nuestra actitud.
The International Journal of Engineering and Science (The IJES)theijes
This document summarizes a study on the environmental awareness of rural residents in Hamirpur District, Himachal Pradesh, India. The study surveyed 1208 residents across 25 villages. It assessed their awareness of different environmental issues through a questionnaire. The results showed high awareness of local issues like air, water, and noise pollution, but lower awareness of global issues like climate change. Most respondents gained environmental knowledge from TV and newspapers. While awareness levels were reasonably high, more work is still needed to increase awareness and promote environmentally responsible behavior.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The document outlines the different phases of developing a multimedia project:
1. Analysis phase where the developers identify the project details, problem, objectives, and target users through tools like questionnaires and forms.
2. Design phase where the developers create flow charts and storyboards to layout the program and apply CASPER design principles.
3. Implementation phase where the developers use authoring tools to integrate the multimedia elements like text, graphics, audio, and video.
4. Testing phase where the developers use checklists to test the program for errors in content, interface, and navigation.
5. Evaluation phase where users provide feedback on the program's content and user interface through evaluation forms.
6
The Detection of Suspicious Email Based on Decision Tree ...IRJET Journal
This document summarizes a research paper that proposes a method for detecting suspicious emails using decision trees. The method extracts keywords and indicators from emails to classify them as suspicious, not suspicious, or possibly suspicious. An ID3 decision tree algorithm is used to analyze patterns in a training set of pre-classified emails and generate rules to classify new emails. The tree is built by recursively partitioning attributes based on their information gain. The resulting decision tree and rules can then be used to detect suspicious emails, which could help identify potential criminal activities or security threats.
This document summarizes a student project on building a spam classifier. It defines spam and the problems it causes. It then introduces the goal of building a tool to identify spam messages. It reviews literature on spamming and organized cybercrime. The proposed solution discusses features of a modern spam filter, including threat detection using AI and machine learning. It provides a block diagram of the spam classifier that includes collecting an email data set, pre-processing email content, extracting and selecting features, implementing a K-Nearest Neighbors algorithm, and analyzing performance.
Study of Various Techniques to Filter Spam EmailsIRJET Journal
This document discusses techniques for filtering spam emails. It begins with an introduction explaining the rise of spam emails due to increased internet and email usage. It then describes various techniques used to filter spam, including machine learning methods like Bayesian classification and non-machine learning methods like blacklists, whitelists, and header analysis. Next, it covers how data mining techniques like clustering, classification, and association rule mining can be applied to identify spam emails. Finally, it reviews several papers that have compared the performance of techniques like Naive Bayes, SVM, KNN, and neural networks for spam filtering.
PHISHING MITIGATION TECHNIQUES: A LITERATURE SURVEYIJNSA Journal
Email is a channel of communication which is considered to be a confidential medium of communication for exchange of information among individuals and organisations. The confidentiality consideration about e-mail is no longer the case as attackers send malicious emails to users to deceive them into disclosing their private personal information such as username, password, and bank card details, etc. In search of a solution to combat phishing cybercrime attacks, different approaches have been developed. However, the traditional exiting solutions have been limited in assisting email users to identify phishing emails from legitimate ones. This paper reveals the different email and website phishing solutions in phishing attack detection. It first provides a literature analysis of different existing phishing mitigation approaches. It then provides a discussion on the limitations of the techniques, before concluding with an explorationin to how phishing detection can be improved.
Phishing Website Detection using Classification AlgorithmsIRJET Journal
This document discusses using machine learning algorithms to classify phishing websites. It begins with background on phishing and then discusses prior research applying algorithms like random forest, decision trees, SVM and KNN to detect phishing websites. The paper aims to address phishing website classification using various classifiers and ensemble learning approaches. It tests classifiers like random forest, decision tree, KNN, AdaBoost and GradientBoost on a phishing testing dataset and evaluates performance using metrics like accuracy, f1-score, precision and recall. The proposed approach achieves 97% accuracy in classifying phishing websites according to experimental results.
A LITERATURE REVIEW ON PHISHING EMAIL DETECTION USING DATA MININGHeather Strinden
This document reviews literature on detecting phishing emails using data mining. It discusses how hybrid features that include both content and header information can be used to effectively classify emails as phishing or legitimate. Various techniques currently used for phishing email detection are examined, including network-level protections, authentication, client-side tools, user education, and server-side filters. Feature selection is important, as phishing emails often resemble legitimate emails, making detection complex. The review finds that server-side filters using machine learning classifiers on selected email features show promise as a solution.
Detection of Spam in Emails using Machine LearningIRJET Journal
The document discusses detecting spam emails using machine learning techniques. It evaluates various machine learning algorithms for classifying emails as spam or ham (not spam) and finds that the Multinomial Naive Bayes algorithm provides the highest accuracy. The study uses an email dataset to train classifiers including Naive Bayes, Decision Trees, Random Forest, KNN, SVM and evaluates their performance based on precision, accuracy and other metrics. It finds that while Naive Bayes performs best, it has some limitations and ensemble methods like random forests generally provide more robust performance. The paper contributes to improving email spam detection through comparative analysis of machine learning algorithms.
Improving Phishing URL Detection Using Fuzzy Association Miningtheijes
Phishing is the process to obtain sensitive information such as usernames, passwords, and credit card details by disguising as a trustworthy entity by the use of an electronic communication. Phishing attack continues to pose a solemn risk for web users and annoying threat within the field of electronic commerce. The Phishing detection using fuzzy and binary matrix construction method focuses on discerning the significant features that discriminate between legitimate and phishing URLs. The significant features are extracting the number of dots, length of the host etc., from each URL. These features are then subjected to associative rule mining-apriori and predictive apriori. The rules obtained are interpreted to emphasize the features that are more prevalent in phishing URLs. The key factors for the phished URLs are number of slashes in the URL, dot in the host portion of the URL and length of the URL. The pitfall of binary matrix method is the time complexity. So it impacts the overall speed of the system. The fuzzy based logic association rule mining algorithm was proposed to classify the legitimate and phishing URLs based on the features. The extracted features are converted to fuzzy membership values as “Low”,’ Medium’ and “High”. By applying association rule mining algorithm the rules are generated to detect the phishing URLs. The fuzzy based methodology provides efficient and high rate of phishing detection of URLs
This document summarizes 12 sources on topics related to cybersecurity threats such as spam, phishing, ransomware, and malware detection techniques. The sources discuss using machine learning algorithms and heuristic approaches to detect spam accounts, phishing websites, and ransomware variants. Some propose new techniques like using Twitter data and mobile messages to detect spam or combining feature selection with metaheuristic algorithms to identify phishing sites. Others analyze hyperlinks or topic distributions to identify phishing attacks or classify anti-phishing solutions. The document also mentions the rise of ransomware attacks and techniques used in related studies.
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORKijcseit
The World Wide Web has become an important part of our everyday life for information communication
and knowledge dissemination. It helps to transact information timely, rapidly and easily. Identifying theft
and identity fraud are referred as two sides of cyber-crime in which hackers and malicious users obtain the
personal data of existing legitimate users to attempt fraud or deception motivation for financial gain.
Malicious URLs host unsolicited content (spam, phishing, drive-by exploits, etc.) and lure unsuspecting
users to become victims of scams (monetary loss, theft of private information, and malware installation),
and cause losses of billions of dollars every year. To detect such crimes systems should be fast and precise
with the ability to detect new malicious content. Traditionally, this detection is done mostly through the
usage of blacklists. However, blacklists cannot be exhaustive, and lack the ability to detect newly generated
malicious URLs. To improve the generality of malicious URL detectors, machine learning techniques have
been explored with increasing attention in recent years. In this paper, I use a simple algorithm to detect
and predicting URLs it is good or bad and compared with two other algorithms to know (SVM, LR).
A multi layer architecture for spam-detection systemcsandit
As the email is becoming a prominent mode of commun
ication so are the attempts to misuse it to
take undue advantage of its low cost and high reach
ability. However, as email communication
is very cheap, spammers are taking advantage of it
for advertising their products, for
committing cybercrimes. So, researchers are working
hard to combat with the spammers. Many
spam detections techniques and systems are built to
fight spammers. But the spammers are
continuously finding new ways to defeat the existin
g filters. This paper describes the existing
spam filters techniques and proposes a multi-level
architecture for spam email detection. We
present the analysis of the architecture to prove t
he effectiveness of the architecture
This document discusses phishing attacks and ways to counter them. It begins with an abstract that introduces the topic of email phishing and its growing security problems. The main body is divided into sections that: 1) explain how phishing attacks work and their typical stages, from creating spoofed websites to tricking victims into providing sensitive information; 2) describe different types of phishing scams like spear phishing, whaling, and pharming; 3) outline warning signs that an email may be a phishing attempt, such as coming from an unknown sender or having odd writing; and 4) suggest awareness and technical solutions to help prevent falling victim to phishing.
Overview of Existing Methods of Spam Mining and Potential Usefulness of Sende...IDES Editor
This document summarizes an article that analyzes existing anti-spam techniques including content-based, non-content based, and hybrid methods. It also proposes using the sender's name as a parameter for fuzzy string matching to cluster related spam domains. The key methods discussed are naïve Bayesian filtering, support vector machines, neural networks, domain blacklisting, and an existing hybrid method that clusters spam based on subject and IP address similarity. Experimental results on a large spam dataset show the sender's name can effectively be used as an additional parameter for fuzzy matching, identifying large clusters of related spam.
OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...IJNSA Journal
Electronic mail, commonly known as email, is a crucial technology that enables streamlined operations and communications in corporate environments. Empowering swift and dependable transactions, email is a driving force behind heightened productivity and organizational effectiveness. However, its versatility also renders it susceptible to misuse by cybercriminals engaging in activities such as hacking, spoofing, phishing, email bombing, whaling, and spamming. As a result, effective and efficient data analysis is important in avoiding and detecting cyber-attacks and crime on times. To overcome the above challenges, a novel approach named Aquila Optimization (AO) is used in this paper to find the best set of hyperparameters of the Stacked Auto Encoder (SAE) classifier. The purpose of increasing the hyperparameters of the SAE using the AO is to obtain a higher text classification accuracy. Then the optimized SAE classifies the selected features into different classes. The experimental results showed that the proposed AO-SAE model outperforms the existing models such as Logistic Regression (LR) and Long Short-Term Model based Gated Current Unit (LSTM based GRU) in terms of Accuracy.
Identification of Spam Emails from Valid Emails by Using VotingEditor IJCATR
In recent years, the increasing use of e-mails has led to the emergence and increase of problems caused by mass unwanted
messages which are commonly known as spam. In this study, by using decision trees, support vector machine, Naïve Bayes theorem
and voting algorithm, a new version for identifying and classifying spams is provided. In order to verify the proposed method, a set of
a mails are chosen to get tested. First three algorithms try to detect spams, and then by using voting method, spams are identified. The
advantage of this method is utilizing a combination of three algorithms at the same time: decision tree, support vector machine and
Naïve Bayes method. During the evaluation of this method, a data set is analyzed by Weka software. Charts prepared in spam
detection indicate improved accuracy compared to the previous methods.
The document summarizes a research paper about AutoRE, a system that combines content-based and non-content-based approaches to detect spam emails generated by botnets in real-time. AutoRE first pre-processes URLs in emails to group related domains and then generates regular expressions to identify patterns. It verifies spam classifications using blacklists and behavioral analysis of email properties, sending times, and patterns. The document also discusses how AutoRE helped characterize botnets and their traffic, informing future research like systems that calculate sender reputations based on global email behavior analysis.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
This document presents an intelligent system to detect phishing attacks using data mining techniques. It discusses how phishing involves mimicking legitimate websites to steal private information. Various existing solutions have been proposed but cannot fully eliminate phishing. The proposed system uses classifiers like decision trees and random forests trained on features extracted from URLs to classify websites as legitimate or phishing. It aims to construct an accurate intelligent system for phishing detection using data mining techniques.
This document describes an intent-based approach to detecting email account compromise. It involves using n-gram analysis to identify suspicious phrases, analyzing URLs and sender behavior over the past 90 days, including volume of emails, recipients, and IP addresses. Suspicious emails are isolated and these features are extracted and correlated to detect if the email account has been compromised. The approach was tested on a large corporate email network and was able to successfully isolate a small percentage of emails for further analysis while avoiding most legitimate emails.
PUMMP: PHISHING URL DETECTION USING MACHINE LEARNING WITH MONOMORPHIC AND POL...IJCNCJournal
Phishing scams are increasing drastically, which affects Internet users in compromising personal
credentials. This paper proposes a novel feature utilization method for phishing URL detection called the
Polymorphic property of features. In the initial stage, the URL-related features (46 features) were
extracted. Later, a subset of features (19 out of 46) with the polymorphic property of features was
identified, and they were extracted from different parts of the URL (the domain and path). After extracting
the features, various machine learning classification algorithms were applied to build the machine
learning model using monomorphic treatment of features, polymorphic treatment of features, and both
monomorphic and polymorphic treatment of features. By the polymorphic property of features, we mean
that the same feature provides different interpretations when considered in different parts of the URL. The
machine learning models were built on two different datasets. A comparison of the machine learning
models derived from the two datasets reveals the fact that the model built with both monomorphic and
polymorphic treatment of features yielded higher accuracy in Phishing URL detection than the existing
works.
Similar to The International Journal of Engineering and Science (The IJES) (20)
PUMMP: PHISHING URL DETECTION USING MACHINE LEARNING WITH MONOMORPHIC AND POL...
The International Journal of Engineering and Science (The IJES)
1. The International Journal of Engineering And Science (IJES)
||Volume|| 2 ||Issue|| 1 ||Pages|| 62-64 ||2013||
ISSN: 2319 – 1813 ISBN: 2319 – 1805
Proposed Phishing mail detection using fuzzy classification
methods
1
Ami k. Trivedi, 2G.J.Sahani
1
(Department of Computer Engineering, Sardar Patel institute of technology, vasad, Gujarat
2
(Department of Computer Engineering, Sardar Patel institute of technology, vasad,
------------------------------------------------------Abstract------------------------------------------------
Phishing is an attack that deals with social engineering methodology to illegally acquire and use someone
else’s data on behalf of legitimate website for own benefit If. Even a few victims fall for the pitch, the effort is
profitable. Phishing mail misrepresent true sender and steal personal and financial account credentials. Most of
the detection techniques use decision tree, machine learning algorithms, genetic algorithm, clustering
techniques. In these techniques crisp logic is used. They classify email as spam and Not-spam email. Crisp logic
is often failed because it does not provide sharp boundaries. Several Artificial Intelligence (AI) techniques
including neural networks and fuzzy logic [7-9] are successfully applied to a wide variety of decision making
problems in real world. Up to our knowledge, there was not developed any system to the phishing mail detection
based on fuzzy classification method. In this work we would like to build a fuzzy rule generation system to detect
phishing mail. It classifies email into different category like very legitimate, legitimate, suspicious, phishy, very
phishy etc. The main advantage of fuzzy rule-based classification systems is that they do not require large
memory storage, their inference speed is very high and the users can carefully examine each fuzzy if-then rule.
-----------------------------------------------------------------------------------------------------------------
Date of Submission: 17, December, 2012 Date of Publication: 05, January 2013
---------------------------------------------------------------------------------------------------------------- -
I. Introduction
Mail spammers can be categorized based on II. Literature review and related work
Research on header analysis was recently
their intent. Some spammers are telemarketers, who
reported by Microsoft, IBM, and Cornell
broadcast unsolicited emails to several
University in the 2004 and 2005 anti-spam
hundred/thousands of email users. They do not
conferences [4] [5]. Barry Lieba‟s [4] filtering
have a specific target, but blindly send the
technique makes use of the path travelled by an
broadcast and expect a very limited rate of return.
email, which is extracted from the email header.
The next category of spammers comprise the opt-in
They have analysed the end units in the path but
spammers, who keep sending unsolicited messages
not the end-to-end path. They claim their approach
though you have little or no interest in them. In
complements the existing filters but does not work
some cases, they spam you with unrelated topics or
as a standalone mechanism. Christine E. Drake [5]
marketing material. Some of the examples are
in his paper, “Anatomy of Phishing Email”,
conference notices, professional news or meeting
discusses the tricks employed by the email
announcements. The third category of spammers is
scammers in phishing emails. In paper “Anti-
called phishers. Phishing attackers misrepresent the
phishing techniques: A Review” [1] authors
true sender and steal the consumers' personal
proposed content filtering methodology in which
identity data and financial account credentials.
content of email is filtered out using. Machine
These spammers send spoofed emails and lead
learning methods like Bayesian Additive
consumers to counterfeit websites designed to trick
Regression Trees (BART), support vector machine
recipients into divulging financial data such as
(SVM).They also used genetic algorithm to detect
credit card numbers, account usernames, passwords
phishing email but problem with this algorithm is
and social security numbers. By hijacking brand
that it is only work for one rule, if there are more
names of banks, e-retailers and credit card
than one rule algorithm becomes more complex.
companies, phishers often convince the recipients
Character based anti-phishing approach check for
to respond.
the URL which mostly used by the attackers, but
this approach result in false positive. In paper
“Detecting Phishing in email” [2] author suggest
classification on three kinds of analyses on the
header: DNS-based header analysis, Social
www.theijes.com The IJES Page 62
2. Proposed Phishing mail detection using fuzzy classification methods
Network analysis and Wantedness analysis. In the A fuzzy set A in X is characterized by a
DNS-based header analysis, they classified the membership function which is easily implemented
corpus into 8 buckets and used social network by fuzzy conditional statements. For example, if
analysis to further reduce the false positives. They the antecedent is true to some degree of
introduced a concept of wantedness and credibility, membership, then the consequent is also true to that
and derived equations to calculate the wantedness same degree. If antecedent Then consequent.
values of the email senders. Finally, credibility and In a fuzzy classification system, a case or
all the three analyses to classify the phishing an object can be classified by applying a set of
emails. Method resulted in far less false positives. fuzzy rules based on the linguistic values of its
They plan to perform two more analyses on the attributes. Every rule has a weight, which is a
incoming email traffic, i) End to End path Analysis, number between 0 and 1, and this is applied to the
by which they try to establish the legitimacy of the number given by the antecedent. It involves 2
path taken by an email; ii) Relay Analysis, by distinct parts. The first part involves evaluating the
which they verify the trustworthiness and antecedent, fuzzifying the input and applying any
reputation of the relays participating in the relaying necessary fuzzy operators. The second part requires
of emails. They combine all the methods i) Path application of that result to the consequent, known
analysis, ii) Relay analysis, iii) DNS-based header as inference. To build a fuzzy classification system,
analysis and iv) Social Network analysis for the most difficult task is to find a set of fuzzy rules
developing a email classifier, which classifies the pertaining to the specific classification problem. A
incoming email traffic as i) Phishing emails ii) fuzzy inference system is a rule-based system that
Socially Wanted emails (Legitimate emails) iii) uses fuzzy logic, rather than Boolean logic, to
Socially Unwanted emails (Spam emails).They are reason about data. Its basic structure includes four
only concentrated on email header not in the main components (1) a fuzzifier, which translates
content of the email. Paper “learning to detect crisp (real-valued) inputs into fuzzy values; (2) an
phishing in email”[3] proposed machine learning inference engine that applies a fuzzy reasoning
approach with 10-fold cross validation training data mechanism to obtain a fuzzy output; (3) a
which use random forest as classifier which only defuzzifier, which translates this latter output into a
categorized spam or no-spam email. They are not crisp value; and (4) a knowledge base, which
going to classify mail as legitimate, spam or contains both an ensemble of fuzzy rules, known as
phishing. the rule base, and an ensemble of membership
functions known as the database. The decision-
III. Fuzzy Systems making process is performed by the inference
Fuzzy logic was invented by Zadeh [6] in engine using the rules contained in the rule base.
1965 for handling uncertain and imprecise These fuzzy rules define the connection between
knowledge in real world applications. It has proved input and output fuzzy variables.
to be a powerful tool for decision-making, and to
handle and manipulate imprecise and noisy data. IV. Proposed fuzzy classification based
The first major commercial application was in the approach
area of cement kiln control. This requires that an To detect phishing mail, we are going to
operator monitor four internal states of the kiln, proposed four direct rule generation methods based
control four sets of operations, and dynamically on fuzzy classification. The first method generates
manage 40 or 50 "rules of thumb" about their fuzzy if-then rules using the mean and the standard
interrelationships, all with the goal of controlling a deviation of attribute values. The second approach
highly complex set of chemical interactions. One generates fuzzy if-then rules using the histogram of
such rule is "If the oxygen percentage is rather high attributes values. The third procedure generates
and the free-lime and kiln-drive torque rate is fuzzy if-then rules with certainty of each attribute
normal, decrease the flow of gas and slightly into homogeneous fuzzy sets. In the fourth
reduce the fuel rate". The notion central to fuzzy approach, only overlapping areas are partitioned.
systems is that truth values (in fuzzy logic) or The first two approaches generate a single fuzzy if-
membership values (in fuzzy sets) are indicated by then rule for each class by specifying the
a value on the range [0.0, 1.0], with 0.0 membership function of each antecedent fuzzy set
representing absolute Falseness and 1.0 using the information about attribute values of
representing absolute Truth. A fuzzy system is training patterns. The other two approaches are
characterized by a set of linguistic statements based based on fuzzy grids with homogeneous fuzzy
on expert knowledge. The expert knowledge is partitions of each attribute. To start with these
usually in the form of “if-then” rules. methods linguistic descriptors such as high, low,
medium are assigned to a range of values for each
key phishing characteristic indicator. Valid ranges
of the inputs are considered and divided into
classes, or fuzzy sets. For example, length of URL
www.theijes.com The IJES Page 63
3. Proposed Phishing mail detection using fuzzy classification methods
address can range from „low‟ to „high‟ with other References
values in between. We cannot specify clear [1] Gaurav, Madhuresh Mishra, Anurag Jain, “Anti-
boundaries between classes. The degree of Phishing Techniques: A Review”, International
belongingness of the values of the variables to any Journal of Engineering Research and
selected class is called the degree of membership; Applications, vol.2, Issue: 2, Mar-Apr 2012,
membership function is designed for each phishing pp350-355
[2] Srikanth Palla and Ram Dantu, “Detecting
characteristic indicator, which is a curve that Phishing in Emails”
defines how each point in the input space is [3] Ian fatte, Norman sadeh, Anthony Tomasic,
mapped to a membership value between [0, 1]. “Learning to detect Phishing in email”, Technical
Linguistic values are assigned for each phishing Report CMU-ISRI-06-112, June-2006.
indicator as low, moderate, and high while for [4] Barry Leiba, Joel Ossher, V.T. Rajan, Richard
spam mail as Very legitimate, Legitimate, Segal, Mark Wegman, “SMTP Path Analysis”
Suspicious, Phishy, and Very phishy (triangular First Conference on Email and Anti-Spam (CEAS)
and trapezoidal membership function). For each 2004 Proceedings.
input, their values range from 0 to 10 while for [5] Christine E. Drake, Jonathan J. Oliver, and Eugene
J. Koontz,”Anatomy of a Phishing Email”,
output, range from 0 to 100.Flow of our proposed Proceedings of First Conference on Email and
system is as follows: Anti-Spam (CEAS), July 2004.
[6] Zadeh, L.A., Fuzzy Logic IEEE Computer, pp. 83-
93 (1988).
[7] Krzysztof Trawi´nski “A Fuzzy Classification
System for Prediction of the Results of the
Basketball Games”, IEEE computer, 2010.
[8] Andreas Meier and Nicolas Werro,”A Fuzzy
Classification Model for Online Customers”,
Informatica 31, pp. 175-182.2007.
[9] Ravi. Jain, Ajith. Abraham,”A Comparative Study
of Fuzzy Classification Methods on Breast Cancer
Data” 7th International Work Conference on
Artificial and Natural Neural Networks,
IWANN’03, Spain, 2003.
Figure 1 : Flow of Proposed Fuzzy Classification System
V. Conclusion and Future Work
A motivation behind using fuzzy rule is
soft decision boundaries provide sharp transition
between classes and could become a crisp one if it
is needed. Machine learning techniques can be
applied to spam mail detection. Many systems are
built on fuzzy classification method like predicting
result of basketball game using fuzzy classification
method [7], fuzzy classification methods for breast
cancer data [9],online customer[8].By analysing
these papers we conclude that fuzzy classification
provide good result in decision making problems.
In future we are going to collect sample data for
email and then applying different fuzzy
classification methods on it and classified email as
spam or Anti-spam.
www.theijes.com The IJES Page 64