SlideShare a Scribd company logo
1 of 6
Download to read offline
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 11, No. 1, February 2021, pp. 596~601
ISSN: 2088-8708, DOI: 10.11591/ijece.v11i1.pp596-601  596
Journal homepage: http://ijece.iaescore.com
Differential evolution detection models for SMS spam
Sarab M. Hameed
Department of Computer Science, College of Science, University of Baghdad, Iraq
Article Info ABSTRACT
Article history:
Received Nov 23, 2019
Revised Jun 17, 2020
Accepted Jun 27, 2020
With the growth of mobile phones, short message service (SMS) became an
essential text communication service. However, the low cost and ease use of
SMS led to an increase in SMS Spam. In this paper, the characteristics
of SMS spam has studied and a set of features has introduced to get rid of
SMS spam. In addition, the problem of SMS spam detection was addressed
as a clustering analysis that requires a metaheuristic algorithm to find
the clustering structures. Three differential evolution variants viz DE/rand/1,
jDE/rand/1, jDE/best/1, are adopted for solving the SMS spam problem.
Experimental results illustrate that the jDE/best/1 produces best results over
other variants in terms of accuracy, false-positive rate and false-negative rate.
Moreover, it surpasses the baseline methods.
Keywords:
Differential evolution
Feature extraction
Machine learning
SMS spam classification
SMS spam detection This is an open access article under the CC BY-SA license.
Corresponding Author:
Sarab M. Hameed,
Department of Computer Science, College of Science,
University of Baghdad,
Baghdad, Iraq.
Email: sarab.m@sc.uobaghdad.edu.iq
1. INTRODUCTION
The usage of short messaging service (SMS) became increasing rapidly as a result of its cheap cost
and ease of use. This directed to an increase in the amount of spam, which is unsolicited and undesired SMS
sent to a large number of receivers. The precise definition of spam does not exist. Essentially, spam regards
as an undesirable email, however, it is not all undesirable e-mails are spam.
SMS spam has a significant influence on users because users view each SMS they get, so SMS spam
affects the users immediately [1, 2]. In addition, to disturbing, users require a degree of secrecy with
their mobile phones and unaffected by spam and viruses intrusions [3-5]. Therefore, considerable attention
to the SMS spam problem is provided to develop a set of approaches to bypass this problem. Among
the approaches developed to combat the SMS spam is a classification of messages into SMS spam
and ham. The challenge is that the messages are short and contains few words and these words may be
abbreviated [6, 7].
Detecting SMS spam turns to be of significant worth due to the huge loss that could result from
the SMS spam. Two types of methods that are collaborative based and content-based methods can be used to
detect SMS spam. Collaborative based depends on usage and user experience. While the content-based
method concentrates on examining the textual content of messages [8, 9]. Zainal and Jali performed
a study of the distinguishing control of the features and examining it's informational or impact circumstance
in SMS spam messages classification [10]. Kaya and Ertuğrul introduced a method based on local ternary
patterns to extract two distinct features set low and up features from SMS messages and several machine
learning methods were applied for classifying SMS spam. The evaluation results over three separate SMS
datasets gained accuracy 93.318%, 87.15%, and 94.10% [11].
Int J Elec & Comp Eng ISSN: 2088-8708 
Differential evolution detection models for SMS spam (Sarab M. Hameed)
597
Sulaiman and Jali introduced a method for detection SMS spam based on well-known characters
used at transmitting SMS, SMS length and keywords. Five different algorithms and three different datasets
were used to verify the proposed method. The results indicated that the proposed method could produce
a reasonable detection rate [12].
Choudhary and Jain presented an approach for detecting spam messages using ten features and five
machine learning algorithms viz naïve Bayes, logistic regression, J48, decision table and random forest. SMS
Spam Corpus was used as an evaluation dataset and the results showed that the random forest is the best one
with detection rate equals 96.1% [13].
Abdulhamid et al. presented a review of the possible ways and challenges of spam detection and
future research direction that can help specialist researchers to know the areas that need further
improvement [1]. Kaliyar et al. proposed a general model to distinguish SMS spam messages based on
several machine learning algorithms. The proposed SMS spam filtering model was able to filter messages
from several families: Singapore, American, and Indian English. The results showed the proposed model
achieved a high precision using Indian English SMS [14].
Sharaff presented a comparative study of the effect of feature selection techniques on several
classifiers. The results showed that the feature selection technique impacts the performance of the classifier
and can assist in enhancing the performance of some classifiers [15]. Ojugo and Eboka developed SMS spam
detection method based on Bayesian Network. A set of features is deterministically selected from SMS using
the genetic algorithm (GA). The results showed that the feature selection technique impacts the performance
of the classifier [16].
A two-fold contribution is coupled in this paper. First, we raise the following question: How to
design an efficient feature extraction model to improve the accuracy. Second, to ensure more efficacious
performance, we exploit multiple variants of differential evolution (DE) models that offer positive
cooperation with a set of features for SMS spam to correctly divide the search space into two and
non-overlapping classes SMS spam and ham.
The rest of this paper is organized as follows. Section 2 elaborates on an approach for detecting
SMS spam by suggesting of a feature set. In Section 3, the results and discussion of the proposed approach
are reported. Finally, the concluding remarks and future work for other research directions are provided
in section 4.
2. RESEARCH METHOD
This section clarifies the methodology used for SMS spam detection which consists of two stages.
The first stage attempts to extract from SMS specific features that may be used to characterize SMS spam.
Based on the extracted features, three differential evolution alogirthms are used in the second stage to detect
SMS spam. Consider an SMS collection 𝕊 of 𝑁 text messages, i.e. 𝕊 = {𝑠1, … , 𝑠𝑁}. Each message, 𝑠𝑖, 1 ≤
𝑖 ≤ 𝑁 consists of words, numbers, etc. and its size is limited to 160 characters. Figure 1 demonstrates some
samples of SMS spam and ham in [17].
Figure 1. Sample of SMS spam and ham from SMS spam collection
 ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 11, No. 1, February 2021 : 596 - 601
598
The idea proposed here for dealing with the SMS spam problem is based on variant DE algorithms.
Up to the best of our knowledge, this is the first time to use DE algorithms as clustering for SMS spam.
The SMS spam detection task is to identify whether 𝑠𝑖 is SMS spam or ham using clustering is
ℂ ∶ 𝑠𝑖 → {𝑆𝑀𝑆 𝑠𝑝𝑎𝑚, ℎ𝑎𝑚}. Detecting distinctive features that have high discriminatory power to
distinguish between SMS spam and ham to be crucial in SMS spam detection. To support the clustering a set
of 𝑛 features 𝐹 = {𝑓1, 𝑓2, … , 𝑓𝑛} should be extracted from 𝕊.
This research investigates the challenge of SMS spam detection that is how the distinctive features 𝐹
are identified to classify the spam SMS messages. In this, the features of the ham and SMS spam messages
are extracted to form features set representing a sign for the corresponding message. These features are used
to train DE for identifying the centroids of the SMS spam and ham. Then, the trained centroids are given to
the testing stage to detect SMS spam or ham.
The first stage of the proposed SMS spam detection model is preproccessing. After retrieving
the SMS message from a data source, text preprocessing that is very important is applied to make
the messages to be analyzed. Several steps are involved in preprocessing in order to prepare the message to
an SMS spam detection. Firstly, the message is converted to lowercase to evade differentiation between
the same words that are different in case. Secondly, the punctuation is removed. After that, the messages are
tokenized based on delimited whites-pace to identify the tokens. Then, stop words are excluded from the set
of tokens recognized in the former step to obtain a message as a list of keywords. Finally, the tokens are
stemmed to recognize their roots.
Once applying preprocessing to the collection of messages, the message representation stage should
be performed to represent the message by distinct terms. Each term then is involved a weight calculated using
term-frequency inverse-sentence-frequency. Afterword, each message is represented as a vector of weights of
terms. In addition, some features are extracted from the raw data to further improve the message
representation as explained in what follows:
- The length of the message is considered one feature to be added to represent the message since SMS
spams length tends to be longer than ham.
- The presence of a number is considered as one feature to be added to represent the message since
the spammer prefers using the phone number to claim a lottery or prize.
The extracted feature set can be used by the DE to define the centroid of each cluster. The proposed
DE SMS spam detection consists of two stages: training and testing stages. The goal of the training stage is to
tune the value of two centroid vectors according to a set 𝐹 = {𝑓1, 𝑓2, … , 𝑓𝑛} of training messages.
While the goal of the testing stage is to classify the incoming message into SMS spam or ham based on
the two centroid vectors produced from the training stage. The following subsections illustrate how DE is
utilized for the clustering purpose based on the extracted features.
2.1. DE based clustering for SMS spam detection
After extracting the message features set, an optimization model based on DE that uses these
features to identify the centroids for clustering of SMS spam and ham is proposed. Three differential
evolution variants namely DE/rand/1, jDE/rand/1, jDE/best/1 are utilized. The detailed explanation of
the main characteristic components of the proposed models is provided in what follows.
2.2.1. Individual representation and population initialization
In the proposed models, cluster centroids are encoded in the individual. Each individual 𝐼 is
represented as a vector of 𝑛 genes that corresponds to the 2 cluster centroids. The mathematical formulation
of the individual is as follows:
𝐼 = {𝑖1, 𝑖2, … , 𝑖𝑛},
∀𝑗 ∈ {1, … , 𝑛}, 𝑖𝑗 ∈ [𝑚𝑖𝑛𝑗, 𝑚𝑎𝑥𝑗]
𝑚𝑖𝑛𝑗 : is the minimum value feature 𝑗 can get, and
𝑚𝑎𝑥𝑗: is the maximum value feature 𝑗 can get.
DE is a population-based optimization algorithm and starts with a population ℙ of 𝑁 solutions.
Formally speaking, ℙ can be formulated as follows:
ℙ = {𝐼1, 𝐼2, … , 𝐼𝑁}
where 𝑁 is the population size.
Int J Elec & Comp Eng ISSN: 2088-8708 
Differential evolution detection models for SMS spam (Sarab M. Hameed)
599
2.2.2. Mutation and crossover operations
This operation is considered as the main operation that is responsible for maintaining the population
diversity in evolution. The mutation equations for DE/rand/1, jDE/rand/1 and jDE/best/1 are in (1), (2),
and (3) respectively [18-20].
DE/rand/1 𝑉𝑖 = 𝐼𝑟1 + 𝐹(𝐼𝑟2 − 𝐼𝑟3) (1)
jDE/rand/1 𝑉𝑖 = 𝐼𝑟1 + 𝐹𝑖(𝐼𝑟2 − 𝐼𝑟3) (2)
jDE/best/1 𝑉𝑖 = 𝐼𝑏𝑒𝑠𝑡 + 𝐹𝑖(𝐼𝑟1 − 𝐼𝑟2) (3)
where 𝑟1, 𝑟2 and 𝑟3 are random numbers in [1,N] and they are mutually different, 𝑏𝑒𝑠𝑡 is the individual in
the current population. 𝐹 is a random number in (0,1) and
𝐹𝑖 = 𝐹𝑙 + 𝐹𝑢 × 𝑟𝑎𝑛𝑑 𝑖𝑓 𝑟 < 𝑇1
𝐹𝑙 is lower bound of mutation.
𝐹𝑢 is upper bound of mutation.
𝑇1 is the probability to alter 𝐹 factor.
The crossover operation aim is to combine the target 𝐼 vector with the donor vector 𝑉 to produce
the trial vector 𝑈. Equation (4) [21] illustrates the crossover operation for DE/rand/1 and (5) [22]
demonstrates the crossover operation for jDE/rand/1 and jDE/best/1.
𝑈𝑖,𝑗 = {
𝑉𝑖,𝑗 𝑖𝑓 𝑟1 ≤ 𝐶𝑅 𝑜𝑟 𝑗 = 𝐼𝑖,𝑟2
𝐼𝑖,𝑗 𝑖𝑓 𝑟1 > 𝐶𝑅 𝑎𝑛𝑑 𝑗 ≠ 𝐼𝑖,𝑟2
(4)
𝑈𝑖,𝑗 = {
𝑉𝑖,𝑗 𝑖𝑓 𝑟1 ≤ 𝐶𝑅𝑖 𝑜𝑟 𝑗 = 𝐼𝑖,𝑟2
𝐼𝑖,𝑗 𝑖𝑓 𝑟1 > 𝐶𝑅𝑖 𝑎𝑛𝑑 𝑗 ≠ 𝐼𝑖,𝑟2
(5)
where
𝐶𝑅 is a random number in (0,1] that determines the value of the trial vector 𝑈 which is inherited from
the donor 𝑉.
𝑟1 is a random numbers in (0,1]
𝑟2 is a random numbers in [1,n]
and
𝐶𝑅𝑖 = 𝑟𝑎𝑛𝑑 𝑖𝑓 𝑟 < 𝑇2
𝑇2 is the probability to alter CR factor.
2.2.3. Evaluation DE selection operation
According to the SMS spam problem, the formulation of the objective function requires maximizing
the accuracy that means the number of messages that are correctly detected as SMS spam (𝑇𝑃) and
the number of messages are correctly classified as ham(𝑇𝑁) should be maximized. In other words
the number of messages that are misclassified as SMS spam (𝐹𝑃) and the number of messages are
misclassified as ham (𝐹𝑁) should be minimized. Each individual is evaluated using the objective function as
in (6). The computation of the objective function for each individual is as follows. First, the centroids,
ℂ = {𝑆𝑀𝑆 𝑠𝑝𝑎𝑚, ℎ𝑎𝑚} encoded in the individual are extracted and the clusters are formed. Then, the cluster
is attained by assigning the messages 𝑠𝑖 to a cluster corresponding to the closest centroid. Euclidean squared
distance metric is adopted for the computation of the distance between a message and the centroid.
𝑀𝑎𝑥𝑖𝑚𝑖𝑧𝑒 𝑂𝑏𝑗𝐹𝑢𝑛(𝐼) =
𝑇𝑃+𝑇𝑁
𝑇𝑃+𝑇𝑁+𝐹𝑁+𝐹𝑃
(6)
After evaluation the individuals, DE selection operation is applied as shown in (7), the resultant
vector from this operation is the vector with the higher objective function that will be passed to the next
generation.
𝐼𝑖 = {
𝑈𝑖 𝑖𝑓 𝑂𝑏𝑗𝐹𝑢𝑛(𝑈𝑖) > 𝑂𝑏𝑗𝐹𝑢𝑛(𝐼𝑖)
𝐼𝑖 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
(7)
 ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 11, No. 1, February 2021 : 596 - 601
600
3. EXPERIMENTAL RESULTS AND DISCUSSION
SMS Spam Collection [17c] is a public dataset collected for SMS spam research that contains 5,574
English, real and non-encoded messages. Each message is labeled to indicate whether a given message is
a legitimate (ham) or SMS spam. It contains about 86.67% of ham and 13.33% of SMS spam. Evaluation
metrics represented by accuracy(𝐴𝑐𝑐) [23], false negative rate (𝐹𝑁𝑅) [24] and false positive rate (𝐹𝑃𝑅) [25]
are used for assessing the proposed SMS spam. Evaluation of the proposed SMS spam detection model is
performed by applying the k-fold cross-validation approach. In this paper, 3- fold cross-validation approach
and 2-fold cross-validation are adopted to show the impact of training and testing dataset on
the performance of the proposed model. The evaluation is presented in terms of average accuracy(𝐴𝑐𝑐),
false negative rate (𝐹𝑁𝑅) and false positive rate (𝐹𝑃𝑅) over k-fold. Table 1 reports the setting parameters of
DE models. Tables 2 and 3 present the results of the proposed approach under 3-fold and 2-fold for various
classification. From those two tables, which present the average performance of the three variant DE
algorithms DE/rand/1, jDE/rand/1, jDE/best/1 against k-means baseline method in terms of 𝐴𝑐𝑐, 𝐹𝑁𝑅 and
𝐹𝑁𝑅 over ten different runs, it is clear that the three variant DE models have successfully clustered
the messages into SMS spam and ham for the different k-fold approaches against k-means through reducing
𝐹𝑁𝑅 and 𝐹𝑃𝑅 and increasing the accuracy. Also, it can be observed that the jDE/best/1 produces the best
result over other variant DE models.
Table 1. DE parameters setting
Parameter Value
Population size, 𝑁 100
Maximum number of generations 100
Crossover probability, 𝐶𝑅 0.9
Mutation probability, 𝐹 0.5
Lower bound of mutation, 𝐹 0.1
Upper bound of mutation, 𝐹 0.9
probability to alter 𝐹, 𝑇1 0.1
probability to alter 𝐶𝑅, 𝑇2 0.1
Table 2. Comparative results obtained over
10 independent runs fort he proposed model against
K-means regarding 3-fold
Model 𝐴𝑐𝑐 𝐹𝑁𝑅 𝐹𝑃𝑅
DE/rand/1 0.9641 0.0116 0.1925
jDE/rand/1 0.9660 0.0122 0.1742
jDE/best/1 0.9667 0.0116 0.1738
k-means 0.8600 0.0682 0.6024
Table 3. Comparative results obtained over
10 independent runs for the proposed model against
K-means regarding 2-fold
Model 𝐴𝑐𝑐 𝐹𝑁𝑅 𝐹𝑃𝑅
DE/rand/1 0.9648 0.0128 0.1801
jDE/rand/1 0.9652 0.0120 0.1814
jDE/best/1 0.9671 0.0107 0.1756
k-means 0.8290 0.0991 0.6350
4. CONCLUSION
In this paper, the problem of SMS spam detection was addressed as a clustering analysis. Finding
the clustering structures that provide high discrimination between SMS spam and ham has been considered as
NP-hard which needs a metaheuristic algorithm. Three DE variants were utilized to distinguish incoming
messages into SMS spam and ham. Experimental results reveal that the DE/best/1 surpasses other variants
and k-means in terms of accuracy, false-negative rate, and false-positive rate. Future work should include an
investigation to consider multi-objective evolutionary algorithms that may produce better results.
REFERENCES
[1] S. M. Abdulhamid, et al., “A Review on Mobile SMS Spam Filtering Techniques,” in IEEE Access, vol. 5,
pp. 15650-15666, 2017.
[2] G. Rawashdeh, et al., “Comparative Between Optimization Feature Selection by Using Classifiers Algorithms on
Spam Email,” International Journal of Electrical and Computer Engineering (IJECE), vol. 9, no. 6, pp. 5479-5485,
2019.
[3] N. Desai and M. Narvekar, “Normalization of Noisy Text Data,” Procedia Computer Science, vol. 45, pp. 127-132,
2015.
[4] N. Jiang, et al., “Understanding SMS Spam in Large Cellular Network: Characteristics, Strategies and Defenses,”
International Workshop on Recent Advances in Intrusion Detection, pp. 328-347, 2013.
[5] H. Sajedi, et al., “SMS Spam Filtering Using Machine Learning Techniques: A Survey,” Machine Learning
Research, vol. 1, no. 1, pp. 1-14, 2016.
Int J Elec & Comp Eng ISSN: 2088-8708 
Differential evolution detection models for SMS spam (Sarab M. Hameed)
601
[6] T. M. Mahmoud and A. M. Mahfouz, “SMS Spam Filtering Technique Based on Artificial Immune System,”
International Journal of Computer Science Issues, vol. 9, no. 2, pp. 589-597, 2012.
[7] N. Sjarif, et al., “SMS Spam Message Detection Using Term Frequency-Inverse Document Frequency and Random
Forest Algorithm,” Procedia Computer Science, vol. 161, pp. 509-515, 2019.
[8] X. Qian, “SMS Spam Detection Using Noncontent Features,” IEEE Intelligent Systems, vol. 27, no. 6, pp. 44-51,
2012.
[9] A. Karami and L. Zhou, “Improving Static SMS Spam Detection by Using New Content-based Features,” The 20th
Americas Conference on Information Systems (AMCIS), 2014.
[10] K. Zainal and M. Z. Jali, “A Review of Feature Extraction Optimization in SMS Spam Messages Classification,” in
Berry M., et al. (eds), “Soft Computing in Data Science (SCDS),” Communications in Computer and Information
Science, vol. 652, pp. 158-170, 2016.
[11] Y. Kaya and O. F. Ertuğrul, “A Novel Feature Extraction Approach in SMS Spam filtering for Mobile
Communication: One-Dimensional Ternary Patterns,” Security and communication networks, 2016.
[12] N. F. Sulaiman and M. Z. Jali, “A New SMS Spam Detection Method Using both Content-Based and Non Content-
Based Features,” Advanced Computer and Communication Engineering Technology, pp. 505-514, 2016.
[13] N. Choudhary and A. K. Jain, “Towards Filtering of SMS Spam Messages Using Machine Learning Based
Technique,” in Singh D., et al., (eds), “Advanced Informatics for Computing Research,” Communications in
Computer and Information Science, vol. 712, pp. 18-30, 2017.
[14] R. K. Kaliyar, et al., “SMS Spam Filtering on Multiple Background Datasets Using Machine Learning Techniques:
A Novel Approach,” 2018 IEEE 8th International Advance Computing Conference (IACC), pp. 59-65, 2018.
[15] A. Sharaff, “Spam Detection in SMS Based on Feature Selection Techniques,” in Abraham A., et al., (eds),
“Emerging Technologies in Data Mining and Information Security,” Advances in Intelligent Systems and
Computing, vol. 813, pp. 555-563, 2019.
[16] A. A. Ojugo and A. O. Eboka, “Memetic Algorithm for Short Messaging Service Spam Filter Using Text
Normalization and Semantic Approach,” International Journal of Informatics and Communication Technology
(IJ-ICT), vol. 9, no.1, pp. 9-18, 2020.
[17] T. A. Almeida, et al., “Contributions to the Study of SMS Spam filtering: New Collection and Results,” in
Proceedings of the 2011 ACM Symposium on Document Engineering (DOCENG’11), pp. 259-262, 2011.
[18] M. Georgioudakis, et al., “On the Performance of Differential Evolution Variants in Constrained Structural
Optimization,” Procedia Manufacturing, vol. 44, pp. 371-378, 2020.
[19] T. Eltaeib and A. Mahmood, “Differential Evolution: A Survey and Analysis,” Applied Sciences, vol. 7,
p. 1945, 2018.
[20] M. Georgioudakisa and V. Plevris, “On the Performance of Differential Evolution Variants in Constrained
Structural Optimization,” Procedia Manufacturing, vol. 44, pp. 371-378, 2020.
[21] S. Das and P. N. Suganthan, “Differential Evolution: A Survey of the State-of-the-Art,” IEEE Transactions on
Evolutionary Computation, vol. 15, pp. 4-31, 2011.
[22] J. Brest, et al., “Dynamic optimization using Self-Adaptive Differential Evolution,” 2009 IEEE Congress on
Evolutionary Computation, Trondheim, pp. 415-422, 2009.
[23] K. S. Reddy and E. S. Reddy, “Integrated Approach to Detect Spam in Social Media Networks Using Hybrid
Features,” International Journal of Electrical and Computer Engineering (IJECE), vol. 9, no. 1, pp. 562-569, 2019.
[24] A. Boukhalfa, et al., “LSTM Deep Learning Method for Network Intrusion Detection System,” International
Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 3, pp. 3315-3322, 2020.
[25] S. Sheikhi, et al., “An Effective Model for SMS Spam Detection Using Content-based Features and Averaged
Neural Network,” International Journal of Engineering, Transactions B: Applications, vol. 33, no. 2, pp. 221-228,
2020.
BIOGRAPHY OF AUTHOR
Sarab M. Hameed received B.Sc. degree in computer science from university of Baghdad, in
1992, M. Sc. from university of Baghdad, in 1999 and Ph. D. from university of technology in
2005. She is currently assitant professor at department of computer science, university of
Baghdad. Her research interest includes computer and data security and soft computing in
the security field.

More Related Content

What's hot

A Comparative Study for SMS Spam Detection
A Comparative Study for SMS Spam DetectionA Comparative Study for SMS Spam Detection
A Comparative Study for SMS Spam Detectionijtsrd
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...IRJET Journal
 
S1240068 presentation
S1240068 presentationS1240068 presentation
S1240068 presentationKuriharaYuta1
 
Indonesian language email spam detection using N-gram and Naïve Bayes algorithm
Indonesian language email spam detection using N-gram and Naïve Bayes algorithmIndonesian language email spam detection using N-gram and Naïve Bayes algorithm
Indonesian language email spam detection using N-gram and Naïve Bayes algorithmjournalBEEI
 
E mail image spam filtering techniques
E mail image spam filtering techniquesE mail image spam filtering techniques
E mail image spam filtering techniquesranjit banshpal
 
Text Mining Project: Identification of Age and Gender in Social Networks
Text Mining Project: Identification of Age and Gender in Social NetworksText Mining Project: Identification of Age and Gender in Social Networks
Text Mining Project: Identification of Age and Gender in Social NetworksLucas M
 
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Venkat Projects
 
A Novel Text Classification Method Using Comprehensive Feature Weight
A Novel Text Classification Method Using Comprehensive Feature WeightA Novel Text Classification Method Using Comprehensive Feature Weight
A Novel Text Classification Method Using Comprehensive Feature WeightTELKOMNIKA JOURNAL
 
Semantic extraction of arabic
Semantic extraction of arabicSemantic extraction of arabic
Semantic extraction of arabiccsandit
 
Mills_Metafeatures.doc
Mills_Metafeatures.docMills_Metafeatures.doc
Mills_Metafeatures.docbutest
 
Spamming and Spam Filtering
Spamming and Spam FilteringSpamming and Spam Filtering
Spamming and Spam FilteringiNazneen
 
Communcation systems
Communcation systemsCommuncation systems
Communcation systemsMR Z
 

What's hot (16)

A Comparative Study for SMS Spam Detection
A Comparative Study for SMS Spam DetectionA Comparative Study for SMS Spam Detection
A Comparative Study for SMS Spam Detection
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
 
S1240068 presentation
S1240068 presentationS1240068 presentation
S1240068 presentation
 
Indonesian language email spam detection using N-gram and Naïve Bayes algorithm
Indonesian language email spam detection using N-gram and Naïve Bayes algorithmIndonesian language email spam detection using N-gram and Naïve Bayes algorithm
Indonesian language email spam detection using N-gram and Naïve Bayes algorithm
 
Sms spam-detection
Sms spam-detectionSms spam-detection
Sms spam-detection
 
20120140506007
2012014050600720120140506007
20120140506007
 
E mail image spam filtering techniques
E mail image spam filtering techniquesE mail image spam filtering techniques
E mail image spam filtering techniques
 
Text Mining Project: Identification of Age and Gender in Social Networks
Text Mining Project: Identification of Age and Gender in Social NetworksText Mining Project: Identification of Age and Gender in Social Networks
Text Mining Project: Identification of Age and Gender in Social Networks
 
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
 
A Novel Text Classification Method Using Comprehensive Feature Weight
A Novel Text Classification Method Using Comprehensive Feature WeightA Novel Text Classification Method Using Comprehensive Feature Weight
A Novel Text Classification Method Using Comprehensive Feature Weight
 
Semantic extraction of arabic
Semantic extraction of arabicSemantic extraction of arabic
Semantic extraction of arabic
 
Mills_Metafeatures.doc
Mills_Metafeatures.docMills_Metafeatures.doc
Mills_Metafeatures.doc
 
Spamming and Spam Filtering
Spamming and Spam FilteringSpamming and Spam Filtering
Spamming and Spam Filtering
 
Haicku submission
Haicku submissionHaicku submission
Haicku submission
 
ma52006id386
ma52006id386ma52006id386
ma52006id386
 
Communcation systems
Communcation systemsCommuncation systems
Communcation systems
 

Similar to Differential evolution detection models for SMS spam

Identification of Spam Emails from Valid Emails by Using Voting
Identification of Spam Emails from Valid Emails by Using VotingIdentification of Spam Emails from Valid Emails by Using Voting
Identification of Spam Emails from Valid Emails by Using VotingEditor IJCATR
 
Analysis of an image spam in email based on content analysis
Analysis of an image spam in email based on content analysisAnalysis of an image spam in email based on content analysis
Analysis of an image spam in email based on content analysisijnlc
 
Detection of Spam in Emails using Machine Learning
Detection of Spam in Emails using Machine LearningDetection of Spam in Emails using Machine Learning
Detection of Spam in Emails using Machine LearningIRJET Journal
 
Detecting spam mail using machine learning algorithm
Detecting spam mail using machine learning algorithmDetecting spam mail using machine learning algorithm
Detecting spam mail using machine learning algorithmIRJET Journal
 
Thematic and self learning method for
Thematic and self learning method forThematic and self learning method for
Thematic and self learning method forijcsa
 
An intelligent auto-response short message service categorization model using...
An intelligent auto-response short message service categorization model using...An intelligent auto-response short message service categorization model using...
An intelligent auto-response short message service categorization model using...IJECEIAES
 
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMA Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMIRJET Journal
 
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning ApproachesA Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approachesijtsrd
 
Practical Knowledge Representation
Practical Knowledge RepresentationPractical Knowledge Representation
Practical Knowledge Representationbutest
 
Spam filtering by using Genetic based Feature Selection
Spam filtering by using Genetic based Feature SelectionSpam filtering by using Genetic based Feature Selection
Spam filtering by using Genetic based Feature SelectionEditor IJCATR
 
OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...
OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...
OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...IJNSA Journal
 
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHMEMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHMIRJET Journal
 
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...IJNSA Journal
 
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...IJNSA Journal
 
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...IJNSA Journal
 
Cross breed Spam Categorization Method using Machine Learning Techniques
Cross breed Spam Categorization Method using Machine Learning TechniquesCross breed Spam Categorization Method using Machine Learning Techniques
Cross breed Spam Categorization Method using Machine Learning TechniquesIJSRED
 

Similar to Differential evolution detection models for SMS spam (20)

Identification of Spam Emails from Valid Emails by Using Voting
Identification of Spam Emails from Valid Emails by Using VotingIdentification of Spam Emails from Valid Emails by Using Voting
Identification of Spam Emails from Valid Emails by Using Voting
 
Analysis of an image spam in email based on content analysis
Analysis of an image spam in email based on content analysisAnalysis of an image spam in email based on content analysis
Analysis of an image spam in email based on content analysis
 
Detection of Spam in Emails using Machine Learning
Detection of Spam in Emails using Machine LearningDetection of Spam in Emails using Machine Learning
Detection of Spam in Emails using Machine Learning
 
Jt3616901697
Jt3616901697Jt3616901697
Jt3616901697
 
Detecting spam mail using machine learning algorithm
Detecting spam mail using machine learning algorithmDetecting spam mail using machine learning algorithm
Detecting spam mail using machine learning algorithm
 
Thematic and self learning method for
Thematic and self learning method forThematic and self learning method for
Thematic and self learning method for
 
An intelligent auto-response short message service categorization model using...
An intelligent auto-response short message service categorization model using...An intelligent auto-response short message service categorization model using...
An intelligent auto-response short message service categorization model using...
 
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMA Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVM
 
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning ApproachesA Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
 
Practical Knowledge Representation
Practical Knowledge RepresentationPractical Knowledge Representation
Practical Knowledge Representation
 
Spam filtering by using Genetic based Feature Selection
Spam filtering by using Genetic based Feature SelectionSpam filtering by using Genetic based Feature Selection
Spam filtering by using Genetic based Feature Selection
 
40120130406014 2
40120130406014 240120130406014 2
40120130406014 2
 
M dgx mde0mde=
M dgx mde0mde=M dgx mde0mde=
M dgx mde0mde=
 
OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...
OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...
OPTIMIZING HYPERPARAMETERS FOR ENHANCED EMAIL CLASSIFICATION AND FORENSIC ANA...
 
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHMEMAIL SPAM DETECTION USING HYBRID ALGORITHM
EMAIL SPAM DETECTION USING HYBRID ALGORITHM
 
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
 
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
 
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
EMAIL SPAM CLASSIFICATION USING HYBRID APPROACH OF RBF NEURAL NETWORK AND PAR...
 
spam_msg_detection.pdf
spam_msg_detection.pdfspam_msg_detection.pdf
spam_msg_detection.pdf
 
Cross breed Spam Categorization Method using Machine Learning Techniques
Cross breed Spam Categorization Method using Machine Learning TechniquesCross breed Spam Categorization Method using Machine Learning Techniques
Cross breed Spam Categorization Method using Machine Learning Techniques
 

More from IJECEIAES

Cloud service ranking with an integration of k-means algorithm and decision-m...
Cloud service ranking with an integration of k-means algorithm and decision-m...Cloud service ranking with an integration of k-means algorithm and decision-m...
Cloud service ranking with an integration of k-means algorithm and decision-m...IJECEIAES
 
Prediction of the risk of developing heart disease using logistic regression
Prediction of the risk of developing heart disease using logistic regressionPrediction of the risk of developing heart disease using logistic regression
Prediction of the risk of developing heart disease using logistic regressionIJECEIAES
 
Predictive analysis of terrorist activities in Thailand's Southern provinces:...
Predictive analysis of terrorist activities in Thailand's Southern provinces:...Predictive analysis of terrorist activities in Thailand's Southern provinces:...
Predictive analysis of terrorist activities in Thailand's Southern provinces:...IJECEIAES
 
Optimal model of vehicular ad-hoc network assisted by unmanned aerial vehicl...
Optimal model of vehicular ad-hoc network assisted by  unmanned aerial vehicl...Optimal model of vehicular ad-hoc network assisted by  unmanned aerial vehicl...
Optimal model of vehicular ad-hoc network assisted by unmanned aerial vehicl...IJECEIAES
 
Improving cyberbullying detection through multi-level machine learning
Improving cyberbullying detection through multi-level machine learningImproving cyberbullying detection through multi-level machine learning
Improving cyberbullying detection through multi-level machine learningIJECEIAES
 
Comparison of time series temperature prediction with autoregressive integrat...
Comparison of time series temperature prediction with autoregressive integrat...Comparison of time series temperature prediction with autoregressive integrat...
Comparison of time series temperature prediction with autoregressive integrat...IJECEIAES
 
Strengthening data integrity in academic document recording with blockchain a...
Strengthening data integrity in academic document recording with blockchain a...Strengthening data integrity in academic document recording with blockchain a...
Strengthening data integrity in academic document recording with blockchain a...IJECEIAES
 
Design of storage benchmark kit framework for supporting the file storage ret...
Design of storage benchmark kit framework for supporting the file storage ret...Design of storage benchmark kit framework for supporting the file storage ret...
Design of storage benchmark kit framework for supporting the file storage ret...IJECEIAES
 
Detection of diseases in rice leaf using convolutional neural network with tr...
Detection of diseases in rice leaf using convolutional neural network with tr...Detection of diseases in rice leaf using convolutional neural network with tr...
Detection of diseases in rice leaf using convolutional neural network with tr...IJECEIAES
 
A systematic review of in-memory database over multi-tenancy
A systematic review of in-memory database over multi-tenancyA systematic review of in-memory database over multi-tenancy
A systematic review of in-memory database over multi-tenancyIJECEIAES
 
Agriculture crop yield prediction using inertia based cat swarm optimization
Agriculture crop yield prediction using inertia based cat swarm optimizationAgriculture crop yield prediction using inertia based cat swarm optimization
Agriculture crop yield prediction using inertia based cat swarm optimizationIJECEIAES
 
Three layer hybrid learning to improve intrusion detection system performance
Three layer hybrid learning to improve intrusion detection system performanceThree layer hybrid learning to improve intrusion detection system performance
Three layer hybrid learning to improve intrusion detection system performanceIJECEIAES
 
Non-binary codes approach on the performance of short-packet full-duplex tran...
Non-binary codes approach on the performance of short-packet full-duplex tran...Non-binary codes approach on the performance of short-packet full-duplex tran...
Non-binary codes approach on the performance of short-packet full-duplex tran...IJECEIAES
 
Improved design and performance of the global rectenna system for wireless po...
Improved design and performance of the global rectenna system for wireless po...Improved design and performance of the global rectenna system for wireless po...
Improved design and performance of the global rectenna system for wireless po...IJECEIAES
 
Advanced hybrid algorithms for precise multipath channel estimation in next-g...
Advanced hybrid algorithms for precise multipath channel estimation in next-g...Advanced hybrid algorithms for precise multipath channel estimation in next-g...
Advanced hybrid algorithms for precise multipath channel estimation in next-g...IJECEIAES
 
Performance analysis of 2D optical code division multiple access through unde...
Performance analysis of 2D optical code division multiple access through unde...Performance analysis of 2D optical code division multiple access through unde...
Performance analysis of 2D optical code division multiple access through unde...IJECEIAES
 
On performance analysis of non-orthogonal multiple access downlink for cellul...
On performance analysis of non-orthogonal multiple access downlink for cellul...On performance analysis of non-orthogonal multiple access downlink for cellul...
On performance analysis of non-orthogonal multiple access downlink for cellul...IJECEIAES
 
Phase delay through slot-line beam switching microstrip patch array antenna d...
Phase delay through slot-line beam switching microstrip patch array antenna d...Phase delay through slot-line beam switching microstrip patch array antenna d...
Phase delay through slot-line beam switching microstrip patch array antenna d...IJECEIAES
 
A simple feed orthogonal excitation X-band dual circular polarized microstrip...
A simple feed orthogonal excitation X-band dual circular polarized microstrip...A simple feed orthogonal excitation X-band dual circular polarized microstrip...
A simple feed orthogonal excitation X-band dual circular polarized microstrip...IJECEIAES
 
A taxonomy on power optimization techniques for fifthgeneration heterogenous ...
A taxonomy on power optimization techniques for fifthgeneration heterogenous ...A taxonomy on power optimization techniques for fifthgeneration heterogenous ...
A taxonomy on power optimization techniques for fifthgeneration heterogenous ...IJECEIAES
 

More from IJECEIAES (20)

Cloud service ranking with an integration of k-means algorithm and decision-m...
Cloud service ranking with an integration of k-means algorithm and decision-m...Cloud service ranking with an integration of k-means algorithm and decision-m...
Cloud service ranking with an integration of k-means algorithm and decision-m...
 
Prediction of the risk of developing heart disease using logistic regression
Prediction of the risk of developing heart disease using logistic regressionPrediction of the risk of developing heart disease using logistic regression
Prediction of the risk of developing heart disease using logistic regression
 
Predictive analysis of terrorist activities in Thailand's Southern provinces:...
Predictive analysis of terrorist activities in Thailand's Southern provinces:...Predictive analysis of terrorist activities in Thailand's Southern provinces:...
Predictive analysis of terrorist activities in Thailand's Southern provinces:...
 
Optimal model of vehicular ad-hoc network assisted by unmanned aerial vehicl...
Optimal model of vehicular ad-hoc network assisted by  unmanned aerial vehicl...Optimal model of vehicular ad-hoc network assisted by  unmanned aerial vehicl...
Optimal model of vehicular ad-hoc network assisted by unmanned aerial vehicl...
 
Improving cyberbullying detection through multi-level machine learning
Improving cyberbullying detection through multi-level machine learningImproving cyberbullying detection through multi-level machine learning
Improving cyberbullying detection through multi-level machine learning
 
Comparison of time series temperature prediction with autoregressive integrat...
Comparison of time series temperature prediction with autoregressive integrat...Comparison of time series temperature prediction with autoregressive integrat...
Comparison of time series temperature prediction with autoregressive integrat...
 
Strengthening data integrity in academic document recording with blockchain a...
Strengthening data integrity in academic document recording with blockchain a...Strengthening data integrity in academic document recording with blockchain a...
Strengthening data integrity in academic document recording with blockchain a...
 
Design of storage benchmark kit framework for supporting the file storage ret...
Design of storage benchmark kit framework for supporting the file storage ret...Design of storage benchmark kit framework for supporting the file storage ret...
Design of storage benchmark kit framework for supporting the file storage ret...
 
Detection of diseases in rice leaf using convolutional neural network with tr...
Detection of diseases in rice leaf using convolutional neural network with tr...Detection of diseases in rice leaf using convolutional neural network with tr...
Detection of diseases in rice leaf using convolutional neural network with tr...
 
A systematic review of in-memory database over multi-tenancy
A systematic review of in-memory database over multi-tenancyA systematic review of in-memory database over multi-tenancy
A systematic review of in-memory database over multi-tenancy
 
Agriculture crop yield prediction using inertia based cat swarm optimization
Agriculture crop yield prediction using inertia based cat swarm optimizationAgriculture crop yield prediction using inertia based cat swarm optimization
Agriculture crop yield prediction using inertia based cat swarm optimization
 
Three layer hybrid learning to improve intrusion detection system performance
Three layer hybrid learning to improve intrusion detection system performanceThree layer hybrid learning to improve intrusion detection system performance
Three layer hybrid learning to improve intrusion detection system performance
 
Non-binary codes approach on the performance of short-packet full-duplex tran...
Non-binary codes approach on the performance of short-packet full-duplex tran...Non-binary codes approach on the performance of short-packet full-duplex tran...
Non-binary codes approach on the performance of short-packet full-duplex tran...
 
Improved design and performance of the global rectenna system for wireless po...
Improved design and performance of the global rectenna system for wireless po...Improved design and performance of the global rectenna system for wireless po...
Improved design and performance of the global rectenna system for wireless po...
 
Advanced hybrid algorithms for precise multipath channel estimation in next-g...
Advanced hybrid algorithms for precise multipath channel estimation in next-g...Advanced hybrid algorithms for precise multipath channel estimation in next-g...
Advanced hybrid algorithms for precise multipath channel estimation in next-g...
 
Performance analysis of 2D optical code division multiple access through unde...
Performance analysis of 2D optical code division multiple access through unde...Performance analysis of 2D optical code division multiple access through unde...
Performance analysis of 2D optical code division multiple access through unde...
 
On performance analysis of non-orthogonal multiple access downlink for cellul...
On performance analysis of non-orthogonal multiple access downlink for cellul...On performance analysis of non-orthogonal multiple access downlink for cellul...
On performance analysis of non-orthogonal multiple access downlink for cellul...
 
Phase delay through slot-line beam switching microstrip patch array antenna d...
Phase delay through slot-line beam switching microstrip patch array antenna d...Phase delay through slot-line beam switching microstrip patch array antenna d...
Phase delay through slot-line beam switching microstrip patch array antenna d...
 
A simple feed orthogonal excitation X-band dual circular polarized microstrip...
A simple feed orthogonal excitation X-band dual circular polarized microstrip...A simple feed orthogonal excitation X-band dual circular polarized microstrip...
A simple feed orthogonal excitation X-band dual circular polarized microstrip...
 
A taxonomy on power optimization techniques for fifthgeneration heterogenous ...
A taxonomy on power optimization techniques for fifthgeneration heterogenous ...A taxonomy on power optimization techniques for fifthgeneration heterogenous ...
A taxonomy on power optimization techniques for fifthgeneration heterogenous ...
 

Recently uploaded

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduitsrknatarajan
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 

Recently uploaded (20)

(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 

Differential evolution detection models for SMS spam

  • 1. International Journal of Electrical and Computer Engineering (IJECE) Vol. 11, No. 1, February 2021, pp. 596~601 ISSN: 2088-8708, DOI: 10.11591/ijece.v11i1.pp596-601  596 Journal homepage: http://ijece.iaescore.com Differential evolution detection models for SMS spam Sarab M. Hameed Department of Computer Science, College of Science, University of Baghdad, Iraq Article Info ABSTRACT Article history: Received Nov 23, 2019 Revised Jun 17, 2020 Accepted Jun 27, 2020 With the growth of mobile phones, short message service (SMS) became an essential text communication service. However, the low cost and ease use of SMS led to an increase in SMS Spam. In this paper, the characteristics of SMS spam has studied and a set of features has introduced to get rid of SMS spam. In addition, the problem of SMS spam detection was addressed as a clustering analysis that requires a metaheuristic algorithm to find the clustering structures. Three differential evolution variants viz DE/rand/1, jDE/rand/1, jDE/best/1, are adopted for solving the SMS spam problem. Experimental results illustrate that the jDE/best/1 produces best results over other variants in terms of accuracy, false-positive rate and false-negative rate. Moreover, it surpasses the baseline methods. Keywords: Differential evolution Feature extraction Machine learning SMS spam classification SMS spam detection This is an open access article under the CC BY-SA license. Corresponding Author: Sarab M. Hameed, Department of Computer Science, College of Science, University of Baghdad, Baghdad, Iraq. Email: sarab.m@sc.uobaghdad.edu.iq 1. INTRODUCTION The usage of short messaging service (SMS) became increasing rapidly as a result of its cheap cost and ease of use. This directed to an increase in the amount of spam, which is unsolicited and undesired SMS sent to a large number of receivers. The precise definition of spam does not exist. Essentially, spam regards as an undesirable email, however, it is not all undesirable e-mails are spam. SMS spam has a significant influence on users because users view each SMS they get, so SMS spam affects the users immediately [1, 2]. In addition, to disturbing, users require a degree of secrecy with their mobile phones and unaffected by spam and viruses intrusions [3-5]. Therefore, considerable attention to the SMS spam problem is provided to develop a set of approaches to bypass this problem. Among the approaches developed to combat the SMS spam is a classification of messages into SMS spam and ham. The challenge is that the messages are short and contains few words and these words may be abbreviated [6, 7]. Detecting SMS spam turns to be of significant worth due to the huge loss that could result from the SMS spam. Two types of methods that are collaborative based and content-based methods can be used to detect SMS spam. Collaborative based depends on usage and user experience. While the content-based method concentrates on examining the textual content of messages [8, 9]. Zainal and Jali performed a study of the distinguishing control of the features and examining it's informational or impact circumstance in SMS spam messages classification [10]. Kaya and Ertuğrul introduced a method based on local ternary patterns to extract two distinct features set low and up features from SMS messages and several machine learning methods were applied for classifying SMS spam. The evaluation results over three separate SMS datasets gained accuracy 93.318%, 87.15%, and 94.10% [11].
  • 2. Int J Elec & Comp Eng ISSN: 2088-8708  Differential evolution detection models for SMS spam (Sarab M. Hameed) 597 Sulaiman and Jali introduced a method for detection SMS spam based on well-known characters used at transmitting SMS, SMS length and keywords. Five different algorithms and three different datasets were used to verify the proposed method. The results indicated that the proposed method could produce a reasonable detection rate [12]. Choudhary and Jain presented an approach for detecting spam messages using ten features and five machine learning algorithms viz naïve Bayes, logistic regression, J48, decision table and random forest. SMS Spam Corpus was used as an evaluation dataset and the results showed that the random forest is the best one with detection rate equals 96.1% [13]. Abdulhamid et al. presented a review of the possible ways and challenges of spam detection and future research direction that can help specialist researchers to know the areas that need further improvement [1]. Kaliyar et al. proposed a general model to distinguish SMS spam messages based on several machine learning algorithms. The proposed SMS spam filtering model was able to filter messages from several families: Singapore, American, and Indian English. The results showed the proposed model achieved a high precision using Indian English SMS [14]. Sharaff presented a comparative study of the effect of feature selection techniques on several classifiers. The results showed that the feature selection technique impacts the performance of the classifier and can assist in enhancing the performance of some classifiers [15]. Ojugo and Eboka developed SMS spam detection method based on Bayesian Network. A set of features is deterministically selected from SMS using the genetic algorithm (GA). The results showed that the feature selection technique impacts the performance of the classifier [16]. A two-fold contribution is coupled in this paper. First, we raise the following question: How to design an efficient feature extraction model to improve the accuracy. Second, to ensure more efficacious performance, we exploit multiple variants of differential evolution (DE) models that offer positive cooperation with a set of features for SMS spam to correctly divide the search space into two and non-overlapping classes SMS spam and ham. The rest of this paper is organized as follows. Section 2 elaborates on an approach for detecting SMS spam by suggesting of a feature set. In Section 3, the results and discussion of the proposed approach are reported. Finally, the concluding remarks and future work for other research directions are provided in section 4. 2. RESEARCH METHOD This section clarifies the methodology used for SMS spam detection which consists of two stages. The first stage attempts to extract from SMS specific features that may be used to characterize SMS spam. Based on the extracted features, three differential evolution alogirthms are used in the second stage to detect SMS spam. Consider an SMS collection 𝕊 of 𝑁 text messages, i.e. 𝕊 = {𝑠1, … , 𝑠𝑁}. Each message, 𝑠𝑖, 1 ≤ 𝑖 ≤ 𝑁 consists of words, numbers, etc. and its size is limited to 160 characters. Figure 1 demonstrates some samples of SMS spam and ham in [17]. Figure 1. Sample of SMS spam and ham from SMS spam collection
  • 3.  ISSN: 2088-8708 Int J Elec & Comp Eng, Vol. 11, No. 1, February 2021 : 596 - 601 598 The idea proposed here for dealing with the SMS spam problem is based on variant DE algorithms. Up to the best of our knowledge, this is the first time to use DE algorithms as clustering for SMS spam. The SMS spam detection task is to identify whether 𝑠𝑖 is SMS spam or ham using clustering is ℂ ∶ 𝑠𝑖 → {𝑆𝑀𝑆 𝑠𝑝𝑎𝑚, ℎ𝑎𝑚}. Detecting distinctive features that have high discriminatory power to distinguish between SMS spam and ham to be crucial in SMS spam detection. To support the clustering a set of 𝑛 features 𝐹 = {𝑓1, 𝑓2, … , 𝑓𝑛} should be extracted from 𝕊. This research investigates the challenge of SMS spam detection that is how the distinctive features 𝐹 are identified to classify the spam SMS messages. In this, the features of the ham and SMS spam messages are extracted to form features set representing a sign for the corresponding message. These features are used to train DE for identifying the centroids of the SMS spam and ham. Then, the trained centroids are given to the testing stage to detect SMS spam or ham. The first stage of the proposed SMS spam detection model is preproccessing. After retrieving the SMS message from a data source, text preprocessing that is very important is applied to make the messages to be analyzed. Several steps are involved in preprocessing in order to prepare the message to an SMS spam detection. Firstly, the message is converted to lowercase to evade differentiation between the same words that are different in case. Secondly, the punctuation is removed. After that, the messages are tokenized based on delimited whites-pace to identify the tokens. Then, stop words are excluded from the set of tokens recognized in the former step to obtain a message as a list of keywords. Finally, the tokens are stemmed to recognize their roots. Once applying preprocessing to the collection of messages, the message representation stage should be performed to represent the message by distinct terms. Each term then is involved a weight calculated using term-frequency inverse-sentence-frequency. Afterword, each message is represented as a vector of weights of terms. In addition, some features are extracted from the raw data to further improve the message representation as explained in what follows: - The length of the message is considered one feature to be added to represent the message since SMS spams length tends to be longer than ham. - The presence of a number is considered as one feature to be added to represent the message since the spammer prefers using the phone number to claim a lottery or prize. The extracted feature set can be used by the DE to define the centroid of each cluster. The proposed DE SMS spam detection consists of two stages: training and testing stages. The goal of the training stage is to tune the value of two centroid vectors according to a set 𝐹 = {𝑓1, 𝑓2, … , 𝑓𝑛} of training messages. While the goal of the testing stage is to classify the incoming message into SMS spam or ham based on the two centroid vectors produced from the training stage. The following subsections illustrate how DE is utilized for the clustering purpose based on the extracted features. 2.1. DE based clustering for SMS spam detection After extracting the message features set, an optimization model based on DE that uses these features to identify the centroids for clustering of SMS spam and ham is proposed. Three differential evolution variants namely DE/rand/1, jDE/rand/1, jDE/best/1 are utilized. The detailed explanation of the main characteristic components of the proposed models is provided in what follows. 2.2.1. Individual representation and population initialization In the proposed models, cluster centroids are encoded in the individual. Each individual 𝐼 is represented as a vector of 𝑛 genes that corresponds to the 2 cluster centroids. The mathematical formulation of the individual is as follows: 𝐼 = {𝑖1, 𝑖2, … , 𝑖𝑛}, ∀𝑗 ∈ {1, … , 𝑛}, 𝑖𝑗 ∈ [𝑚𝑖𝑛𝑗, 𝑚𝑎𝑥𝑗] 𝑚𝑖𝑛𝑗 : is the minimum value feature 𝑗 can get, and 𝑚𝑎𝑥𝑗: is the maximum value feature 𝑗 can get. DE is a population-based optimization algorithm and starts with a population ℙ of 𝑁 solutions. Formally speaking, ℙ can be formulated as follows: ℙ = {𝐼1, 𝐼2, … , 𝐼𝑁} where 𝑁 is the population size.
  • 4. Int J Elec & Comp Eng ISSN: 2088-8708  Differential evolution detection models for SMS spam (Sarab M. Hameed) 599 2.2.2. Mutation and crossover operations This operation is considered as the main operation that is responsible for maintaining the population diversity in evolution. The mutation equations for DE/rand/1, jDE/rand/1 and jDE/best/1 are in (1), (2), and (3) respectively [18-20]. DE/rand/1 𝑉𝑖 = 𝐼𝑟1 + 𝐹(𝐼𝑟2 − 𝐼𝑟3) (1) jDE/rand/1 𝑉𝑖 = 𝐼𝑟1 + 𝐹𝑖(𝐼𝑟2 − 𝐼𝑟3) (2) jDE/best/1 𝑉𝑖 = 𝐼𝑏𝑒𝑠𝑡 + 𝐹𝑖(𝐼𝑟1 − 𝐼𝑟2) (3) where 𝑟1, 𝑟2 and 𝑟3 are random numbers in [1,N] and they are mutually different, 𝑏𝑒𝑠𝑡 is the individual in the current population. 𝐹 is a random number in (0,1) and 𝐹𝑖 = 𝐹𝑙 + 𝐹𝑢 × 𝑟𝑎𝑛𝑑 𝑖𝑓 𝑟 < 𝑇1 𝐹𝑙 is lower bound of mutation. 𝐹𝑢 is upper bound of mutation. 𝑇1 is the probability to alter 𝐹 factor. The crossover operation aim is to combine the target 𝐼 vector with the donor vector 𝑉 to produce the trial vector 𝑈. Equation (4) [21] illustrates the crossover operation for DE/rand/1 and (5) [22] demonstrates the crossover operation for jDE/rand/1 and jDE/best/1. 𝑈𝑖,𝑗 = { 𝑉𝑖,𝑗 𝑖𝑓 𝑟1 ≤ 𝐶𝑅 𝑜𝑟 𝑗 = 𝐼𝑖,𝑟2 𝐼𝑖,𝑗 𝑖𝑓 𝑟1 > 𝐶𝑅 𝑎𝑛𝑑 𝑗 ≠ 𝐼𝑖,𝑟2 (4) 𝑈𝑖,𝑗 = { 𝑉𝑖,𝑗 𝑖𝑓 𝑟1 ≤ 𝐶𝑅𝑖 𝑜𝑟 𝑗 = 𝐼𝑖,𝑟2 𝐼𝑖,𝑗 𝑖𝑓 𝑟1 > 𝐶𝑅𝑖 𝑎𝑛𝑑 𝑗 ≠ 𝐼𝑖,𝑟2 (5) where 𝐶𝑅 is a random number in (0,1] that determines the value of the trial vector 𝑈 which is inherited from the donor 𝑉. 𝑟1 is a random numbers in (0,1] 𝑟2 is a random numbers in [1,n] and 𝐶𝑅𝑖 = 𝑟𝑎𝑛𝑑 𝑖𝑓 𝑟 < 𝑇2 𝑇2 is the probability to alter CR factor. 2.2.3. Evaluation DE selection operation According to the SMS spam problem, the formulation of the objective function requires maximizing the accuracy that means the number of messages that are correctly detected as SMS spam (𝑇𝑃) and the number of messages are correctly classified as ham(𝑇𝑁) should be maximized. In other words the number of messages that are misclassified as SMS spam (𝐹𝑃) and the number of messages are misclassified as ham (𝐹𝑁) should be minimized. Each individual is evaluated using the objective function as in (6). The computation of the objective function for each individual is as follows. First, the centroids, ℂ = {𝑆𝑀𝑆 𝑠𝑝𝑎𝑚, ℎ𝑎𝑚} encoded in the individual are extracted and the clusters are formed. Then, the cluster is attained by assigning the messages 𝑠𝑖 to a cluster corresponding to the closest centroid. Euclidean squared distance metric is adopted for the computation of the distance between a message and the centroid. 𝑀𝑎𝑥𝑖𝑚𝑖𝑧𝑒 𝑂𝑏𝑗𝐹𝑢𝑛(𝐼) = 𝑇𝑃+𝑇𝑁 𝑇𝑃+𝑇𝑁+𝐹𝑁+𝐹𝑃 (6) After evaluation the individuals, DE selection operation is applied as shown in (7), the resultant vector from this operation is the vector with the higher objective function that will be passed to the next generation. 𝐼𝑖 = { 𝑈𝑖 𝑖𝑓 𝑂𝑏𝑗𝐹𝑢𝑛(𝑈𝑖) > 𝑂𝑏𝑗𝐹𝑢𝑛(𝐼𝑖) 𝐼𝑖 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒 (7)
  • 5.  ISSN: 2088-8708 Int J Elec & Comp Eng, Vol. 11, No. 1, February 2021 : 596 - 601 600 3. EXPERIMENTAL RESULTS AND DISCUSSION SMS Spam Collection [17c] is a public dataset collected for SMS spam research that contains 5,574 English, real and non-encoded messages. Each message is labeled to indicate whether a given message is a legitimate (ham) or SMS spam. It contains about 86.67% of ham and 13.33% of SMS spam. Evaluation metrics represented by accuracy(𝐴𝑐𝑐) [23], false negative rate (𝐹𝑁𝑅) [24] and false positive rate (𝐹𝑃𝑅) [25] are used for assessing the proposed SMS spam. Evaluation of the proposed SMS spam detection model is performed by applying the k-fold cross-validation approach. In this paper, 3- fold cross-validation approach and 2-fold cross-validation are adopted to show the impact of training and testing dataset on the performance of the proposed model. The evaluation is presented in terms of average accuracy(𝐴𝑐𝑐), false negative rate (𝐹𝑁𝑅) and false positive rate (𝐹𝑃𝑅) over k-fold. Table 1 reports the setting parameters of DE models. Tables 2 and 3 present the results of the proposed approach under 3-fold and 2-fold for various classification. From those two tables, which present the average performance of the three variant DE algorithms DE/rand/1, jDE/rand/1, jDE/best/1 against k-means baseline method in terms of 𝐴𝑐𝑐, 𝐹𝑁𝑅 and 𝐹𝑁𝑅 over ten different runs, it is clear that the three variant DE models have successfully clustered the messages into SMS spam and ham for the different k-fold approaches against k-means through reducing 𝐹𝑁𝑅 and 𝐹𝑃𝑅 and increasing the accuracy. Also, it can be observed that the jDE/best/1 produces the best result over other variant DE models. Table 1. DE parameters setting Parameter Value Population size, 𝑁 100 Maximum number of generations 100 Crossover probability, 𝐶𝑅 0.9 Mutation probability, 𝐹 0.5 Lower bound of mutation, 𝐹 0.1 Upper bound of mutation, 𝐹 0.9 probability to alter 𝐹, 𝑇1 0.1 probability to alter 𝐶𝑅, 𝑇2 0.1 Table 2. Comparative results obtained over 10 independent runs fort he proposed model against K-means regarding 3-fold Model 𝐴𝑐𝑐 𝐹𝑁𝑅 𝐹𝑃𝑅 DE/rand/1 0.9641 0.0116 0.1925 jDE/rand/1 0.9660 0.0122 0.1742 jDE/best/1 0.9667 0.0116 0.1738 k-means 0.8600 0.0682 0.6024 Table 3. Comparative results obtained over 10 independent runs for the proposed model against K-means regarding 2-fold Model 𝐴𝑐𝑐 𝐹𝑁𝑅 𝐹𝑃𝑅 DE/rand/1 0.9648 0.0128 0.1801 jDE/rand/1 0.9652 0.0120 0.1814 jDE/best/1 0.9671 0.0107 0.1756 k-means 0.8290 0.0991 0.6350 4. CONCLUSION In this paper, the problem of SMS spam detection was addressed as a clustering analysis. Finding the clustering structures that provide high discrimination between SMS spam and ham has been considered as NP-hard which needs a metaheuristic algorithm. Three DE variants were utilized to distinguish incoming messages into SMS spam and ham. Experimental results reveal that the DE/best/1 surpasses other variants and k-means in terms of accuracy, false-negative rate, and false-positive rate. Future work should include an investigation to consider multi-objective evolutionary algorithms that may produce better results. REFERENCES [1] S. M. Abdulhamid, et al., “A Review on Mobile SMS Spam Filtering Techniques,” in IEEE Access, vol. 5, pp. 15650-15666, 2017. [2] G. Rawashdeh, et al., “Comparative Between Optimization Feature Selection by Using Classifiers Algorithms on Spam Email,” International Journal of Electrical and Computer Engineering (IJECE), vol. 9, no. 6, pp. 5479-5485, 2019. [3] N. Desai and M. Narvekar, “Normalization of Noisy Text Data,” Procedia Computer Science, vol. 45, pp. 127-132, 2015. [4] N. Jiang, et al., “Understanding SMS Spam in Large Cellular Network: Characteristics, Strategies and Defenses,” International Workshop on Recent Advances in Intrusion Detection, pp. 328-347, 2013. [5] H. Sajedi, et al., “SMS Spam Filtering Using Machine Learning Techniques: A Survey,” Machine Learning Research, vol. 1, no. 1, pp. 1-14, 2016.
  • 6. Int J Elec & Comp Eng ISSN: 2088-8708  Differential evolution detection models for SMS spam (Sarab M. Hameed) 601 [6] T. M. Mahmoud and A. M. Mahfouz, “SMS Spam Filtering Technique Based on Artificial Immune System,” International Journal of Computer Science Issues, vol. 9, no. 2, pp. 589-597, 2012. [7] N. Sjarif, et al., “SMS Spam Message Detection Using Term Frequency-Inverse Document Frequency and Random Forest Algorithm,” Procedia Computer Science, vol. 161, pp. 509-515, 2019. [8] X. Qian, “SMS Spam Detection Using Noncontent Features,” IEEE Intelligent Systems, vol. 27, no. 6, pp. 44-51, 2012. [9] A. Karami and L. Zhou, “Improving Static SMS Spam Detection by Using New Content-based Features,” The 20th Americas Conference on Information Systems (AMCIS), 2014. [10] K. Zainal and M. Z. Jali, “A Review of Feature Extraction Optimization in SMS Spam Messages Classification,” in Berry M., et al. (eds), “Soft Computing in Data Science (SCDS),” Communications in Computer and Information Science, vol. 652, pp. 158-170, 2016. [11] Y. Kaya and O. F. Ertuğrul, “A Novel Feature Extraction Approach in SMS Spam filtering for Mobile Communication: One-Dimensional Ternary Patterns,” Security and communication networks, 2016. [12] N. F. Sulaiman and M. Z. Jali, “A New SMS Spam Detection Method Using both Content-Based and Non Content- Based Features,” Advanced Computer and Communication Engineering Technology, pp. 505-514, 2016. [13] N. Choudhary and A. K. Jain, “Towards Filtering of SMS Spam Messages Using Machine Learning Based Technique,” in Singh D., et al., (eds), “Advanced Informatics for Computing Research,” Communications in Computer and Information Science, vol. 712, pp. 18-30, 2017. [14] R. K. Kaliyar, et al., “SMS Spam Filtering on Multiple Background Datasets Using Machine Learning Techniques: A Novel Approach,” 2018 IEEE 8th International Advance Computing Conference (IACC), pp. 59-65, 2018. [15] A. Sharaff, “Spam Detection in SMS Based on Feature Selection Techniques,” in Abraham A., et al., (eds), “Emerging Technologies in Data Mining and Information Security,” Advances in Intelligent Systems and Computing, vol. 813, pp. 555-563, 2019. [16] A. A. Ojugo and A. O. Eboka, “Memetic Algorithm for Short Messaging Service Spam Filter Using Text Normalization and Semantic Approach,” International Journal of Informatics and Communication Technology (IJ-ICT), vol. 9, no.1, pp. 9-18, 2020. [17] T. A. Almeida, et al., “Contributions to the Study of SMS Spam filtering: New Collection and Results,” in Proceedings of the 2011 ACM Symposium on Document Engineering (DOCENG’11), pp. 259-262, 2011. [18] M. Georgioudakis, et al., “On the Performance of Differential Evolution Variants in Constrained Structural Optimization,” Procedia Manufacturing, vol. 44, pp. 371-378, 2020. [19] T. Eltaeib and A. Mahmood, “Differential Evolution: A Survey and Analysis,” Applied Sciences, vol. 7, p. 1945, 2018. [20] M. Georgioudakisa and V. Plevris, “On the Performance of Differential Evolution Variants in Constrained Structural Optimization,” Procedia Manufacturing, vol. 44, pp. 371-378, 2020. [21] S. Das and P. N. Suganthan, “Differential Evolution: A Survey of the State-of-the-Art,” IEEE Transactions on Evolutionary Computation, vol. 15, pp. 4-31, 2011. [22] J. Brest, et al., “Dynamic optimization using Self-Adaptive Differential Evolution,” 2009 IEEE Congress on Evolutionary Computation, Trondheim, pp. 415-422, 2009. [23] K. S. Reddy and E. S. Reddy, “Integrated Approach to Detect Spam in Social Media Networks Using Hybrid Features,” International Journal of Electrical and Computer Engineering (IJECE), vol. 9, no. 1, pp. 562-569, 2019. [24] A. Boukhalfa, et al., “LSTM Deep Learning Method for Network Intrusion Detection System,” International Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 3, pp. 3315-3322, 2020. [25] S. Sheikhi, et al., “An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network,” International Journal of Engineering, Transactions B: Applications, vol. 33, no. 2, pp. 221-228, 2020. BIOGRAPHY OF AUTHOR Sarab M. Hameed received B.Sc. degree in computer science from university of Baghdad, in 1992, M. Sc. from university of Baghdad, in 1999 and Ph. D. from university of technology in 2005. She is currently assitant professor at department of computer science, university of Baghdad. Her research interest includes computer and data security and soft computing in the security field.