DETECTING SPAM BY USING
NAÏVE BAYES IN MACHINE LEARNING
ABSTRACT
Spamming is the use of messaging systems to send unsolicited messages (spam) to
large numbers of recipients for commercial advertising, for non-commercial
proselytizing, or for any prohibited purpose (especially the fraudulent purpose of
phishing). Spam arrives in volumes too large to count or review by hand in a timely
way, and when senders repeat the same text or message many times, a system cannot
process it quickly. Many cases have recently been reported of personal information
being stolen from users through messages. This project discusses how machine
learning helps in detecting spam. Machine learning is an application of artificial
intelligence that provides the ability to learn automatically and improve from data
without being explicitly programmed. With it, the algorithm is expected to score
messages as spam or ham more accurately.
STUDENT PROFILE
NAME : NUR AZZIEFA BINTI AZAHAR
MATRIC NO : BTBL17046746
SUPERVISOR : SIR FAISAL AMRI BIN ABIDIN @ BHARUN
COURSE : BACHELOR IN COMPUTER SCIENCE
(COMPUTER NETWORK SECURITY) WITH HONOURS
PROJECT : DETECTING SPAM BY USING NAÏVE BAYES IN
MACHINE LEARNING
PROBLEM STATEMENT
In previous work, tracing spam with machine learning was difficult because every message
had to be classified first, and it was hard to reach a clear outcome on whether a message
was spam or ham. Spammers also began to use tricks to get past spam filters, such as
random sender addresses or random characters appended to the beginning or end of the
message subject line. Other research found that spam filtering techniques are not
practical when a large number of messages is received, because the messages must be
filtered one by one. With that method it was difficult to tell spam from ham because
spam messages mostly outnumber what the receiver expects. Previous research also
reports that spam detection was not efficient, because all the messages the receiver
accepts must be classified to know whether each one is spam or ham. Spam wastes the
user's time, since unwanted junk messages have to be sorted, and it consumes storage
space. For these reasons, a machine learning technique was chosen for detecting spam,
because it can take an arbitrary message and determine whether it is spam or ham.
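As a brief sketch of how a Naïve Bayes classifier makes the spam-or-ham decision described above: this is the standard textbook formulation, not a formula taken from the poster, and the symbols $c$, $w_i$, $N_c$ and $V$ are notation introduced here for illustration.

```latex
% Decision rule: choose the class c that maximises the prior times the
% product of per-word likelihoods, assuming words are independent given c.
\hat{c} = \arg\max_{c \in \{\text{spam},\, \text{ham}\}}
          P(c) \prod_{i=1}^{n} P(w_i \mid c)

% Word likelihood with add-one (Laplace) smoothing, where
% count(w, c) is how often word w appears in messages of class c,
% N_c is the total number of words in class c, and |V| is the vocabulary size.
P(w \mid c) = \frac{\operatorname{count}(w, c) + 1}{N_c + |V|}
```

In practice the product is usually computed as a sum of logarithms to avoid numerical underflow on long messages.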
OBJECTIVES
To study the Naïve Bayes-based machine learning
technique, specifically for classifying spam.
To apply Naïve Bayes on the Kaggle platform and integrate
the dataset accordingly.
To test the Naïve Bayes-based machine learning
algorithm with the dataset from the Kaggle machine
learning repository.
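The kind of classifier named in these objectives can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (`tokenize`, `train`, `predict`) and the four toy messages are hypothetical, and the toy messages stand in for the Kaggle dataset, which is not reproduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(messages):
    """Count messages and words per class from (label, text) pairs."""
    class_counts = Counter()
    word_counts = {"spam": Counter(), "ham": Counter()}
    for label, text in messages:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Log prior: fraction of training messages with this label.
        score = math.log(class_counts[label] / total_messages)
        n_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothed word likelihood.
            score += math.log(
                (word_counts[label][word] + 1) / (n_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, used only to make the sketch runnable.
TOY_MESSAGES = [
    ("spam", "Win a free prize now"),
    ("spam", "Free entry claim your cash prize"),
    ("ham", "Are we still meeting for lunch"),
    ("ham", "Call me when you get home"),
]

model = train(TOY_MESSAGES)
print(predict(model, "claim your free prize"))
print(predict(model, "lunch at home"))
```

Working in log space keeps the scores stable for long messages, and skipping unseen words is one simple design choice among several for handling vocabulary the training data did not cover.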
SOFTWARE
GitHub, Kaggle
FRAMEWORK
RESULT ANALYSIS
[Figure: SPAM]
CONTRIBUTIONS & IMPROVEMENTS
CONTRIBUTIONS
From this project, it can be concluded that Kaggle Machine Learning is a
cloud-based collaborative tool capable of delivering analytics solutions
on particular data. This research leveraged Kaggle Machine Learning with an
algorithm written in Python in order to detect spam. The Naïve Bayes
classification technique was used to determine whether messages are spam
or ham based on the dataset provided.
FUTURE WORKS
One suggestion for this project is to widen its use beyond classifying
text messages only. The project could be improved so that it can detect,
analyse and classify not just text messages but also other formats, such
as a spam word cloud. These improvements should be made in order to get
the most accurate classification results.
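How accurate a spam/ham classification is, as discussed above, could be quantified with standard metrics. The sketch below is an assumption-laden illustration: the function name `evaluate` and the label lists are hypothetical and are not results from this project.

```python
def evaluate(true_labels, predicted_labels, positive="spam"):
    """Compute accuracy, precision and recall for the positive class."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive and truth == positive:
            tp += 1  # spam correctly flagged
        elif guess == positive:
            fp += 1  # ham wrongly flagged as spam
        elif truth == positive:
            fn += 1  # spam that slipped through
        else:
            tn += 1  # ham correctly passed
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels, for illustration only.
y_true = ["spam", "ham", "spam", "ham", "ham"]
y_pred = ["spam", "ham", "ham", "ham", "spam"]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Precision matters when wrongly flagging a legitimate message is costly, while recall matters when letting spam through is costly, so reporting both alongside accuracy gives a fuller picture.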
REFERENCES
Yang, Zhen, et al. "An approach to spam detection by naive Bayes ensemble based on decision induction."
Sixth International Conference on Intelligent Systems Design and Applications. Vol. 2. IEEE, 2006.
Freeman, David Mandell. "Using naive bayes to detect spammy names in social networks." Proceedings of
the 2013 ACM workshop on Artificial intelligence and security. 2013.
Granik, Mykhailo, and Volodymyr Mesyura. "Fake news detection using naive Bayes classifier." 2017 IEEE
First Ukraine Conference on Electrical and Computer Engineering (UKRCON). IEEE, 2017.
