In today’s world, social media has spread widely, and the social life of people have become deeply associated with social media use. They use it to communicate with each other, share events and news, and even run businesses. The huge growth in social media and the massive number of users has lured attackers to distribute harmful content through fake accounts, leading to a large number of people falling victim to those accounts. In this work, we propose a mechanism for identifying fake accounts on the social media site Twitter by using two methods to preprocess data and extract the most effective features, they are the spearman correlation coefficient and the chi-square test. For classification, we used supervised machine learning algorithms based on the ensemble system (stack method) by using random forest, support vector machine, and naive Bayes algorithms in the first level of the stack, and the logistic regression algorithm as a meta classifier. The stack ensemble system was shown to be effective in achieving the best results when compared to the algorithms used with it, with data accuracy reaching 99%.
Fake accounts detection system based on bidirectional gated recurrent unit n...IJECEIAES
Online social networks have become the most widely used medium to interact with friends and family, share news and important events or publish daily activities. However, this growing popularity has made social networks a target for suspicious exploitation such as the spreading of misleading or malicious information, making them less reliable and less trustworthy. In this paper, a fake account detection system based on the bidirectional gated recurrent unit (BiGRU) model is proposed. The focus has been on the content of users’ tweets to classify twitter user profile as legitimate or fake. Tweets are gathered in a single file and are transformed into a vector space using the global vectors (GloVe) word embedding technique in order to preserve the semantic and syntax context. Compared with the baseline models such as long short-term memory (LSTM) and convolutional neural networks (CNN), the results are promising and confirm that using GloVe with BiGRU classifier outperforms with 99.44% for accuracy and 99.25% for precision. To prove the efficiency of our approach the results obtained with GloVe were compared to Word2vec under the same conditions. Results confirm that GloVe with BiGRU classifier performs the best results for detection of fake Twitter accounts using only tweets content feature.
A DATA MINING APPROACH FOR FILTERING OUT SOCIAL SPAMMERS IN LARGE-SCALE TWITT...ijaia
Social networking services – such as Facebook.com and Twitter.com – are fast-growing enterprise
platform that has become a prevalent and essential component of daily life. Due to its popularity, Twitter
draws many spammers or other fake accounts to post malicious links and infiltrate legitimate users'
accounts with many spam messages. Therefore, it is crucial to recognize and screen spam tweets and spam
accounts. As a result, spam detection is highly needed but still a difficult challenge. This article applied
several Bio-inspired optimization algorithms to reduce the features' dimensions in the first stage. Then we
used several classification schemes in the second stage to enhance the spam detection rate in three real
Twitter data collections. The performance of the chosen classifiers also revealed that Random Forest and
C4.5 classifiers achieved the highest Accuracy, Precision, Recall, and F1-score even on class imbalance.
With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of information available, there has been an enormous increase in the rise of cyber-crimes, particularly targeted towards women. According to a 2019 report in the [4] Economics Times, India has witnessed a 457% rise in cybercrime in the five year span between 2011 and 2016. Most speculate that this is due to impact of social media such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social network, creation of user accounts in these sites usually needs just an email-id. A real life person can create multiple fake IDs and hence impostors can easily be made. Unlike the real world scenario where multiple rules and regulations are imposed to identify oneself in a unique manner (for example while issuing one’s passport or driver’s license), in the virtual world of social media, admission does not require any such checks. In this paper, we study the different accounts of Instagram, in particular and try to assess an account as fake or real using Machine Learning techniques namely Logistic Regression and Random Forest Algorithm.
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNINGijcsit
With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of information available, there has been an enormous increase in the rise of cyber-crimes, particularly targeted towards women. According to a 2019 report in the [4] Economics Times, India has witnessed a 457% rise in cybercrime in the five year span between 2011 and 2016. Most speculate that this is due to impact of social media such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social network, creation of user accounts in these sites usually needs just an email-id. A real life person can create multiple fake IDs and hence impostors can easily be made. Unlike the real world scenario where multiple rules and regulations are imposed to identify oneself in a unique manner (for example while issuing one’s passport or driver’s license), in the virtual world of social media, admission does not require any such checks. In this paper, we study the different accounts of Instagram, in particular and try to assess an account as fake or real using Machine Learning techniques namely Logistic Regression and Random Forest Algorithm.
Concept drift and machine learning model for detecting fraudulent transaction...IJECEIAES
In a streaming environment, data is continuously generated and processed in an ongoing manner, and it is necessary to detect fraudulent transactions quickly to prevent significant financial losses. Hence, this paper proposes a machine learning-based approach for detecting fraudulent transactions in a streaming environment, with a focus on addressing concept drift. The approach utilizes the extreme gradient boosting (XGBoost) algorithm. Additionally, the approach employs four algorithms for detecting continuous stream drift. To evaluate the effectiveness of the approach, two datasets are used: a credit card dataset and a Twitter dataset containing financial fraudrelated social media data. The approach is evaluated using cross-validation and the results demonstrate that it outperforms traditional machine learning models in terms of accuracy, precision, and recall, and is more robust to concept drift. The proposed approach can be utilized as a real-time fraud detection system in various industries, including finance, insurance, and e-commerce.
Fake accounts detection system based on bidirectional gated recurrent unit n...IJECEIAES
Online social networks have become the most widely used medium to interact with friends and family, share news and important events or publish daily activities. However, this growing popularity has made social networks a target for suspicious exploitation such as the spreading of misleading or malicious information, making them less reliable and less trustworthy. In this paper, a fake account detection system based on the bidirectional gated recurrent unit (BiGRU) model is proposed. The focus has been on the content of users’ tweets to classify twitter user profile as legitimate or fake. Tweets are gathered in a single file and are transformed into a vector space using the global vectors (GloVe) word embedding technique in order to preserve the semantic and syntax context. Compared with the baseline models such as long short-term memory (LSTM) and convolutional neural networks (CNN), the results are promising and confirm that using GloVe with BiGRU classifier outperforms with 99.44% for accuracy and 99.25% for precision. To prove the efficiency of our approach the results obtained with GloVe were compared to Word2vec under the same conditions. Results confirm that GloVe with BiGRU classifier performs the best results for detection of fake Twitter accounts using only tweets content feature.
A DATA MINING APPROACH FOR FILTERING OUT SOCIAL SPAMMERS IN LARGE-SCALE TWITT...ijaia
Social networking services – such as Facebook.com and Twitter.com – are fast-growing enterprise
platform that has become a prevalent and essential component of daily life. Due to its popularity, Twitter
draws many spammers or other fake accounts to post malicious links and infiltrate legitimate users'
accounts with many spam messages. Therefore, it is crucial to recognize and screen spam tweets and spam
accounts. As a result, spam detection is highly needed but still a difficult challenge. This article applied
several Bio-inspired optimization algorithms to reduce the features' dimensions in the first stage. Then we
used several classification schemes in the second stage to enhance the spam detection rate in three real
Twitter data collections. The performance of the chosen classifiers also revealed that Random Forest and
C4.5 classifiers achieved the highest Accuracy, Precision, Recall, and F1-score even on class imbalance.
With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of information available, there has been an enormous increase in the rise of cyber-crimes, particularly targeted towards women. According to a 2019 report in the [4] Economics Times, India has witnessed a 457% rise in cybercrime in the five year span between 2011 and 2016. Most speculate that this is due to impact of social media such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social network, creation of user accounts in these sites usually needs just an email-id. A real life person can create multiple fake IDs and hence impostors can easily be made. Unlike the real world scenario where multiple rules and regulations are imposed to identify oneself in a unique manner (for example while issuing one’s passport or driver’s license), in the virtual world of social media, admission does not require any such checks. In this paper, we study the different accounts of Instagram, in particular and try to assess an account as fake or real using Machine Learning techniques namely Logistic Regression and Random Forest Algorithm.
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNINGijcsit
With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of information available, there has been an enormous increase in the rise of cyber-crimes, particularly targeted towards women. According to a 2019 report in the [4] Economics Times, India has witnessed a 457% rise in cybercrime in the five year span between 2011 and 2016. Most speculate that this is due to impact of social media such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social network, creation of user accounts in these sites usually needs just an email-id. A real life person can create multiple fake IDs and hence impostors can easily be made. Unlike the real world scenario where multiple rules and regulations are imposed to identify oneself in a unique manner (for example while issuing one’s passport or driver’s license), in the virtual world of social media, admission does not require any such checks. In this paper, we study the different accounts of Instagram, in particular and try to assess an account as fake or real using Machine Learning techniques namely Logistic Regression and Random Forest Algorithm.
Concept drift and machine learning model for detecting fraudulent transaction...IJECEIAES
In a streaming environment, data is continuously generated and processed in an ongoing manner, and it is necessary to detect fraudulent transactions quickly to prevent significant financial losses. Hence, this paper proposes a machine learning-based approach for detecting fraudulent transactions in a streaming environment, with a focus on addressing concept drift. The approach utilizes the extreme gradient boosting (XGBoost) algorithm. Additionally, the approach employs four algorithms for detecting continuous stream drift. To evaluate the effectiveness of the approach, two datasets are used: a credit card dataset and a Twitter dataset containing financial fraudrelated social media data. The approach is evaluated using cross-validation and the results demonstrate that it outperforms traditional machine learning models in terms of accuracy, precision, and recall, and is more robust to concept drift. The proposed approach can be utilized as a real-time fraud detection system in various industries, including finance, insurance, and e-commerce.
Integrated approach to detect spam in social media networks using hybrid feat...IJECEIAES
Online social networking sites are becoming more popular amongst Internet users. The Internet users spend some amount of time on popular social networking sites like Facebook, Twitter and LinkedIn etc. Online social networks are considered to be much useful tool to the society used by Internet lovers to communicate and transmit information. These social networking platforms are useful to share information, opinions and ideas, make new friends, and create new friend groups. Social networking sites provide large amount of technical information to the users. This large amount of information in social networking sites attracts cyber criminals to misuse these sites information. These users create their own accounts and spread vulnerable information to the genuine users. This information may be advertising some product, send some malicious links etc to disturb the natural users on social sites. Spammer detection is a major problem now days in social networking sites. Previous spam detection techniques use different set of features to classify spam and non spam users. In this paper we proposed a hybrid approach which uses content based and user based features for identification of spam on Twitter network. In this hybrid approach we used decision tree induction algorithm and Bayesian network algorithm to construct a classification model. We have analysed the proposed technique on twitter dataset. Our analysis shows that our proposed methodology is better than some other existing techniques.
MapReduce-iterative support vector machine classifier: novel fraud detection...IJECEIAES
Fraud in healthcare insurance claims is one of the significant research challenges that affect the growth of the healthcare services. The healthcare frauds are happening through subscribers, companies and the providers. The development of a decision support is to automate the claim data from service provider and to offset the patient’s challenges. In this paper, a novel hybridized big data and statistical machine learning technique, named MapReduce based iterative support vector machine (MR-ISVM) that provide a set of sophisticated steps for the automatic detection of fraudulent claims in the health insurance databases. The experimental results have proven that the MR-ISVM classifier outperforms better in classification and detection than other support vector machine (SVM) kernel classifiers. From the results, a positive impact seen in declining the computational time on processing the healthcare insurance claims without compromising the classification accuracy is achieved. The proposed MR-ISVM classifier achieves 87.73% accuracy than the linear (75.3%) and radial basis function (79.98%).
Classification of instagram fake users using supervised machine learning algo...IJECEIAES
On Instagram, the number of followers is a common success indicator. Hence, followers selling services become a huge part of the market. Influencers become bombarded with fake followers and this causes a business owner to pay more than they should for a brand endorsement. Identifying fake followers becomes important to determine the authenticity of an influencer. This research aims to identify fake users' behavior and proposes supervised machine learning models to classify authentic and fake users. The dataset contains fake users bought from various sources, and authentic users. There are 17 features used, based on these sources: 6 metadata, 3 media info, 2 engagement, 2 media tags, 4 media similarity Five machine learning algorithms will be tested. Three different approaches of classification are proposed, i.e. classification to 2-classes and 4-classes, and classification with metadata. Random forest algorithm produces the highest accuracy for the 2-classes (authentic, fake) and 4-classes (authentic, active fake user, inactive fake user, spammer) classification, with accuracy up to 91.76%. The result also shows that the five metadata variables, i.e. number of posts, followers, biography length, following, and link availability are the biggest predictors for the users class. Additionally, descriptive statistics results reveal noticeable differences between fake and authentic users.
Abstract: Detection of fake news based on deep learning techniques is a major issue used to mislead people. For
the experiments, several types of datasets, models, and methodologies have been used to detect fake news. Also,
most of the datasets contain text id, tweets id, and user-based id and user-based features. To get the proper results
and accuracy various models like CNN (Convolution neural network), DEEP CNN, and LSTM (Long short-term
memory) are used
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal,
During the last decade, the social media has been regarded as a rich dominant source of information and news. Its unsupervised nature leads to the emergence and spread of fake news. Fake news detection has gained a great importance posing many challenges to the research community. One of the main challenges is the detection accuracy which is highly affected by the chosen and extracted features and the used classification algorithm. In this paper, we propose a context-based solution that relies on account features and random forest classifier to detect fake news. It achieves the precision of 99.8%. The system accuracy has been compared to other commonly used classifiers such as decision tree classifier, Gaussian Naïve Bayes and neural network which give precision of 98.4%, 92.6%, and 62.7% respectively. The experiments’ accuracy results show the possibility of distinguishing fake news and giving credibility scores for social media news with a relatively high performance.
Categorize balanced dataset for troll detectionvivatechijri
As we know cyber bullying is increasing day by day and Cyber troll is one of the cyber-aggressive actions that is not much different from cyberbullying in online abuse so that the victims feel uncomfortable. One of the most used social media platforms in which cyber trolling frequently happens is Twitter. Basically, it is found that during an investigation of cyberbullying cases a lot of information gathered is false which aims to give discomfort, hatred and waste lots of time. So, it is necessary to classify between cyberbullying tweets and normal tweets on twitter. There has already been research on classification of cyberbullying tweets and normal tweets using the Support vector machine (SVM) algorithm. But the drawback of the system is that it only gives 63.83% of accuracy. Firstly, we can improve the accuracy of the system by using the Recurrent Neural Network (RNN) And Secondly, for balancing the dataset we will be using Synthetic Minority Over-sampling Technique (SMOTE). We believe that using these techniques we will be able to increase the accuracy of the previous proposed.
Predicting depression using deep learning and ensemble algorithms on raw twit...IJECEIAES
Social network and microblogging sites such as Twitter are widespread amongst all generations nowadays where people connect and share their feelings, emotions, pursuits etc. Depression, one of the most common mental disorder, is an acute state of sadness where person loses interest in all activities. If not treated immediately this can result in dire consequences such as death. In this era of virtual world, people are more comfortable in expressing their emotions in such sites as they have become a part and parcel of everyday lives. The research put forth thus, employs machine learning classifiers on the twitter data set to detect if a person’s tweet indicates any sign of depression or not.
Machine learning is a sub-field of artificial intelligence (AI) that focuses on creating statistical models and algorithms that allow computers to learn and become more proficient at performing particular tasks. Machine learning algorithms create a mathematical model with the help of historical sample data, or “training data,” that assists in making predictions or judgments without being explicitly programmed.
Machine learning is a sub-field of artificial intelligence (AI) that focuses on creating statistical models and algorithms that allow computers to learn and become more proficient at performing particular tasks. Machine learning algorithms create a mathematical model with the help of historical sample data, or “training data,” that assists in making predictions or judgments without being explicitly programmed.
How to build machine learning apps.pdfJamieDornan2
Machine learning is a sub-field of artificial intelligence (AI) that focuses on creating statistical models and algorithms that allow computers to learn and become more proficient at performing particular tasks.
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption increased quickly, contributing to climate change
that is evident in unusual flooding and draughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing the climate change. The analysis's
findings discussed the relevant to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency and recommendations on how
to increase role of the women in addressing the climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from gridconnected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
More Related Content
Similar to Fake accounts detection on social media using stack ensemble system
Integrated approach to detect spam in social media networks using hybrid feat...IJECEIAES
Online social networking sites are becoming more popular amongst Internet users. The Internet users spend some amount of time on popular social networking sites like Facebook, Twitter and LinkedIn etc. Online social networks are considered to be much useful tool to the society used by Internet lovers to communicate and transmit information. These social networking platforms are useful to share information, opinions and ideas, make new friends, and create new friend groups. Social networking sites provide large amount of technical information to the users. This large amount of information in social networking sites attracts cyber criminals to misuse these sites information. These users create their own accounts and spread vulnerable information to the genuine users. This information may be advertising some product, send some malicious links etc to disturb the natural users on social sites. Spammer detection is a major problem now days in social networking sites. Previous spam detection techniques use different set of features to classify spam and non spam users. In this paper we proposed a hybrid approach which uses content based and user based features for identification of spam on Twitter network. In this hybrid approach we used decision tree induction algorithm and Bayesian network algorithm to construct a classification model. We have analysed the proposed technique on twitter dataset. Our analysis shows that our proposed methodology is better than some other existing techniques.
MapReduce-iterative support vector machine classifier: novel fraud detection...IJECEIAES
Fraud in healthcare insurance claims is one of the significant research challenges that affect the growth of the healthcare services. The healthcare frauds are happening through subscribers, companies and the providers. The development of a decision support is to automate the claim data from service provider and to offset the patient’s challenges. In this paper, a novel hybridized big data and statistical machine learning technique, named MapReduce based iterative support vector machine (MR-ISVM) that provide a set of sophisticated steps for the automatic detection of fraudulent claims in the health insurance databases. The experimental results have proven that the MR-ISVM classifier outperforms better in classification and detection than other support vector machine (SVM) kernel classifiers. From the results, a positive impact seen in declining the computational time on processing the healthcare insurance claims without compromising the classification accuracy is achieved. The proposed MR-ISVM classifier achieves 87.73% accuracy than the linear (75.3%) and radial basis function (79.98%).
Classification of instagram fake users using supervised machine learning algo...IJECEIAES
On Instagram, the number of followers is a common success indicator. Hence, followers selling services become a huge part of the market. Influencers become bombarded with fake followers and this causes a business owner to pay more than they should for a brand endorsement. Identifying fake followers becomes important to determine the authenticity of an influencer. This research aims to identify fake users' behavior and proposes supervised machine learning models to classify authentic and fake users. The dataset contains fake users bought from various sources, and authentic users. There are 17 features used, based on these sources: 6 metadata, 3 media info, 2 engagement, 2 media tags, 4 media similarity Five machine learning algorithms will be tested. Three different approaches of classification are proposed, i.e. classification to 2-classes and 4-classes, and classification with metadata. Random forest algorithm produces the highest accuracy for the 2-classes (authentic, fake) and 4-classes (authentic, active fake user, inactive fake user, spammer) classification, with accuracy up to 91.76%. The result also shows that the five metadata variables, i.e. number of posts, followers, biography length, following, and link availability are the biggest predictors for the users class. Additionally, descriptive statistics results reveal noticeable differences between fake and authentic users.
Abstract: Detection of fake news based on deep learning techniques is a major issue used to mislead people. For
the experiments, several types of datasets, models, and methodologies have been used to detect fake news. Also,
most of the datasets contain text id, tweets id, and user-based id and user-based features. To get the proper results
and accuracy various models like CNN (Convolution neural network), DEEP CNN, and LSTM (Long short-term
memory) are used
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal,
During the last decade, the social media has been regarded as a rich dominant source of information and news. Its unsupervised nature leads to the emergence and spread of fake news. Fake news detection has gained a great importance posing many challenges to the research community. One of the main challenges is the detection accuracy which is highly affected by the chosen and extracted features and the used classification algorithm. In this paper, we propose a context-based solution that relies on account features and random forest classifier to detect fake news. It achieves the precision of 99.8%. The system accuracy has been compared to other commonly used classifiers such as decision tree classifier, Gaussian Naïve Bayes and neural network which give precision of 98.4%, 92.6%, and 62.7% respectively. The experiments’ accuracy results show the possibility of distinguishing fake news and giving credibility scores for social media news with a relatively high performance.
Categorize balanced dataset for troll detectionvivatechijri
As we know cyber bullying is increasing day by day and Cyber troll is one of the cyber-aggressive actions that is not much different from cyberbullying in online abuse so that the victims feel uncomfortable. One of the most used social media platforms in which cyber trolling frequently happens is Twitter. Basically, it is found that during an investigation of cyberbullying cases a lot of information gathered is false which aims to give discomfort, hatred and waste lots of time. So, it is necessary to classify between cyberbullying tweets and normal tweets on twitter. There has already been research on classification of cyberbullying tweets and normal tweets using the Support vector machine (SVM) algorithm. But the drawback of the system is that it only gives 63.83% of accuracy. Firstly, we can improve the accuracy of the system by using the Recurrent Neural Network (RNN) And Secondly, for balancing the dataset we will be using Synthetic Minority Over-sampling Technique (SMOTE). We believe that using these techniques we will be able to increase the accuracy of the previous proposed.
Predicting depression using deep learning and ensemble algorithms on raw twit...IJECEIAES
Social network and microblogging sites such as Twitter are widespread amongst all generations nowadays where people connect and share their feelings, emotions, pursuits etc. Depression, one of the most common mental disorder, is an acute state of sadness where person loses interest in all activities. If not treated immediately this can result in dire consequences such as death. In this era of virtual world, people are more comfortable in expressing their emotions in such sites as they have become a part and parcel of everyday lives. The research put forth thus, employs machine learning classifiers on the twitter data set to detect if a person’s tweet indicates any sign of depression or not.
Machine learning is a sub-field of artificial intelligence (AI) that focuses on creating statistical models and algorithms that allow computers to learn and become more proficient at performing particular tasks. Machine learning algorithms create a mathematical model with the help of historical sample data, or “training data,” that assists in making predictions or judgments without being explicitly programmed.
Machine learning is a sub-field of artificial intelligence (AI) that focuses on creating statistical models and algorithms that allow computers to learn and become more proficient at performing particular tasks. Machine learning algorithms create a mathematical model with the help of historical sample data, or “training data,” that assists in making predictions or judgments without being explicitly programmed.
How to build machine learning apps.pdfJamieDornan2
Machine learning is a sub-field of artificial intelligence (AI) that focuses on creating statistical models and algorithms that allow computers to learn and become more proficient at performing particular tasks.
Similar to Fake accounts detection on social media using stack ensemble system (20)
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption increased quickly, contributing to climate change
that is evident in unusual flooding and draughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing the climate change. The analysis's
findings discussed the relevant to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency and recommendations on how
to increase role of the women in addressing the climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from gridconnected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
Enhancing battery system identification: nonlinear autoregressive modeling fo...IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
Smart grid deployment: from a bibliometric analysis to a surveyIJECEIAES
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and
significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ...IJECEIAES
One of the problems that are associated to power systems is islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences on the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detected zones, declining
power quality, and response times using the analytical hierarchy process
(AHP). The multi-criteria decision-making analysis shows that the overall
weight of passive methods (24.7%), active methods (7.8%), hybrid methods
(5.6%), remote methods (14.5%), signal processing-based methods (26.6%),
and computational intelligent-based methods (20.8%) based on the
comparison of all criteria together. Thus, it can be seen from the total weight
that hybrid approaches are the least suitable to be chosen, while signal
processing-based methods are the most appropriate islanding detection
method to be selected and implemented in power system with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi...IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m2
, the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38% compared to conventional
methods, respectively. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b...IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronize
the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is
designed,the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students’ learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embeddedsystem design approach targeting FPGA technologies that are fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that allows users to seamlessly incorporate into their FPGA design. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during the experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
Rapidly and remotely monitoring and receiving the solar cell systems status parameters, solar irradiance, temperature, and humidity, are critical issues in enhancement their efficiency. Hence, in the present article an improved smart prototype of internet of things (IoT) technique based on embedded system through NodeMCU ESP8266 (ESP-12E) was carried out experimentally. Three different regions at Egypt; Luxor, Cairo, and El-Beheira cities were chosen to study their solar irradiance profile, temperature, and humidity by the proposed IoT system. The monitoring data of solar irradiance, temperature, and humidity were live visualized directly by Ubidots through hypertext transfer protocol (HTTP) protocol. The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m 2 respectively during the solar day. The accuracy and rapidity of obtaining monitoring results using the proposed IoT system made it a strong candidate for application in monitoring solar cell systems. On the other hand, the obtained solar power radiation results of the three considered regions strongly candidate Luxor and Cairo as suitable places to build up a solar cells system station rather than El-Beheira.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threat to data security in IoT operations. Thereby, IoT security systems must keep an eye on and restrict unwanted events from occurring in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived towards identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to miss-classification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention consider supervised learning, which requires a large amount of labeled data to be trained. Consequently, such complex datasets are impossible to source in a large network like IoT. To address this problem, this proposed study introduces an efficient learning mechanism to strengthen the IoT security aspects. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with the related works, the experimental outcome shows that the model performs well in a benchmark dataset. It accomplishes an improved detection accuracy of approximately 99.21%.
Developing a smart system for infant incubators using the internet of things ...IJECEIAES
This research is developing an incubator system that integrates the internet of things and artificial intelligence to improve care for premature babies. The system workflow starts with sensors that collect data from the incubator. Then, the data is sent in real-time to the internet of things (IoT) broker eclipse mosquito using the message queue telemetry transport (MQTT) protocol version 5.0. After that, the data is stored in a database for analysis using the long short-term memory network (LSTM) method and displayed in a web application using an application programming interface (API) service. Furthermore, the experimental results produce as many as 2,880 rows of data stored in the database. The correlation coefficient between the target attribute and other attributes ranges from 0.23 to 0.48. Next, several experiments were conducted to evaluate the model-predicted value on the test data. The best results are obtained using a two-layer LSTM configuration model, each with 60 neurons and a lookback setting 6. This model produces an R 2 value of 0.934, with a root mean square error (RMSE) value of 0.015 and a mean absolute error (MAE) of 0.008. In addition, the R 2 value was also evaluated for each attribute used as input, with a result of values between 0.590 and 0.845.
A review on internet of things-based stingless bee's honey production with im...IJECEIAES
Honey is produced exclusively by honeybees and stingless bees which both are well adapted to tropical and subtropical regions such as Malaysia. Stingless bees are known for producing small amounts of honey and are known for having a unique flavor profile. Problem identified that many stingless bees collapsed due to weather, temperature and environment. It is critical to understand the relationship between the production of stingless bee honey and environmental conditions to improve honey production. Thus, this paper presents a review on stingless bee's honey production and prediction modeling. About 54 previous research has been analyzed and compared in identifying the research gaps. A framework on modeling the prediction of stingless bee honey is derived. The result presents the comparison and analysis on the internet of things (IoT) monitoring systems, honey production estimation, convolution neural networks (CNNs), and automatic identification methods on bee species. It is identified based on image detection method the top best three efficiency presents CNN is at 98.67%, densely connected convolutional networks with YOLO v3 is 97.7%, and DenseNet201 convolutional networks 99.81%. This study is significant to assist the researcher in developing a model for predicting stingless honey produced by bee's output, which is important for a stable economy and food security.
A trust based secure access control using authentication mechanism for intero...IJECEIAES
The internet of things (IoT) is a revolutionary innovation in many aspects of our society including interactions, financial activity, and global security such as the military and battlefield internet. Due to the limited energy and processing capacity of network devices, security, energy consumption, compatibility, and device heterogeneity are the long-term IoT problems. As a result, energy and security are critical for data transmission across edge and IoT networks. Existing IoT interoperability techniques need more computation time, have unreliable authentication mechanisms that break easily, lose data easily, and have low confidentiality. In this paper, a key agreement protocol-based authentication mechanism for IoT devices is offered as a solution to this issue. This system makes use of information exchange, which must be secured to prevent access by unauthorized users. Using a compact contiki/cooja simulator, the performance and design of the suggested framework are validated. The simulation findings are evaluated based on detection of malicious nodes after 60 minutes of simulation. The suggested trust method, which is based on privacy access control, reduced packet loss ratio to 0.32%, consumed 0.39% power, and had the greatest average residual energy of 0.99 mJoules at 10 nodes.
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbersIJECEIAES
In real world applications, data are subject to ambiguity due to several factors; fuzzy sets and fuzzy numbers propose a great tool to model such ambiguity. In case of hesitation, the complement of a membership value in fuzzy numbers can be different from the non-membership value, in which case we can model using intuitionistic fuzzy numbers as they provide flexibility by defining both a membership and a non-membership functions. In this article, we consider the intuitionistic fuzzy linear programming problem with intuitionistic polygonal fuzzy numbers, which is a generalization of the previous polygonal fuzzy numbers found in the literature. We present a modification of the simplex method that can be used to solve any general intuitionistic fuzzy linear programming problem after approximating the problem by an intuitionistic polygonal fuzzy number with n edges. This method is given in a simple tableau formulation, and then applied on numerical examples for clarity.
The performance of artificial intelligence in prostate magnetic resonance im...IJECEIAES
Prostate cancer is the predominant form of cancer observed in men worldwide. The application of magnetic resonance imaging (MRI) as a guidance tool for conducting biopsies has been established as a reliable and well-established approach in the diagnosis of prostate cancer. The diagnostic performance of MRI-guided prostate cancer diagnosis exhibits significant heterogeneity due to the intricate and multi-step nature of the diagnostic pathway. The development of artificial intelligence (AI) models, specifically through the utilization of machine learning techniques such as deep learning, is assuming an increasingly significant role in the field of radiology. In the realm of prostate MRI, a considerable body of literature has been dedicated to the development of various AI algorithms. These algorithms have been specifically designed for tasks such as prostate segmentation, lesion identification, and classification. The overarching objective of these endeavors is to enhance diagnostic performance and foster greater agreement among different observers within MRI scans for the prostate. This review article aims to provide a concise overview of the application of AI in the field of radiology, with a specific focus on its utilization in prostate MRI.
Seizure stage detection of epileptic seizure using convolutional neural networksIJECEIAES
According to the World Health Organization (WHO), seventy million individuals worldwide suffer from epilepsy, a neurological disorder. While electroencephalography (EEG) is crucial for diagnosing epilepsy and monitoring the brain activity of epilepsy patients, it requires a specialist to examine all EEG recordings to find epileptic behavior. This procedure needs an experienced doctor, and a precise epilepsy diagnosis is crucial for appropriate treatment. To identify epileptic seizures, this study employed a convolutional neural network (CNN) based on raw scalp EEG signals to discriminate between preictal, ictal, postictal, and interictal segments. The possibility of these characteristics is explored by examining how well timedomain signals work in the detection of epileptic signals using intracranial Freiburg Hospital (FH), scalp Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) databases, and Temple University Hospital (TUH) EEG. To test the viability of this approach, two types of experiments were carried out. Firstly, binary class classification (preictal, ictal, postictal each versus interictal) and four-class classification (interictal versus preictal versus ictal versus postictal). The average accuracy for stage detection using CHB-MIT database was 84.4%, while the Freiburg database's time-domain signals had an accuracy of 79.7% and the highest accuracy of 94.02% for classification in the TUH EEG database when comparing interictal stage to preictal stage.
Analysis of driving style using self-organizing maps to analyze driver behaviorIJECEIAES
Modern life is strongly associated with the use of cars, but the increase in acceleration speeds and their maneuverability leads to a dangerous driving style for some drivers. In these conditions, the development of a method that allows you to track the behavior of the driver is relevant. The article provides an overview of existing methods and models for assessing the functioning of motor vehicles and driver behavior. Based on this, a combined algorithm for recognizing driving style is proposed. To do this, a set of input data was formed, including 20 descriptive features: About the environment, the driver's behavior and the characteristics of the functioning of the car, collected using OBD II. The generated data set is sent to the Kohonen network, where clustering is performed according to driving style and degree of danger. Getting the driving characteristics into a particular cluster allows you to switch to the private indicators of an individual driver and considering individual driving characteristics. The application of the method allows you to identify potentially dangerous driving styles that can prevent accidents.
Hyperspectral object classification using hybrid spectral-spatial fusion and ...IJECEIAES
Because of its spectral-spatial and temporal resolution of greater areas, hyperspectral imaging (HSI) has found widespread application in the field of object classification. The HSI is typically used to accurately determine an object's physical characteristics as well as to locate related objects with appropriate spectral fingerprints. As a result, the HSI has been extensively applied to object identification in several fields, including surveillance, agricultural monitoring, environmental research, and precision agriculture. However, because of their enormous size, objects require a lot of time to classify; for this reason, both spectral and spatial feature fusion have been completed. The existing classification strategy leads to increased misclassification, and the feature fusion method is unable to preserve semantic object inherent features; This study addresses the research difficulties by introducing a hybrid spectral-spatial fusion (HSSF) technique to minimize feature size while maintaining object intrinsic qualities; Lastly, a soft-margins kernel is proposed for multi-layer deep support vector machine (MLDSVM) to reduce misclassification. The standard Indian pines dataset is used for the experiment, and the outcome demonstrates that the HSSF-MLDSVM model performs substantially better in terms of accuracy and Kappa coefficient.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...ssuser7dcef0
Power plants release a large amount of water vapor into the
atmosphere through the stack. The flue gas can be a potential
source for obtaining much needed cooling water for a power
plant. If a power plant could recover and reuse a portion of this
moisture, it could reduce its total cooling water intake
requirement. One of the most practical way to recover water
from flue gas is to use a condensing heat exchanger. The power
plant could also recover latent heat due to condensation as well
as sensible heat due to lowering the flue gas exit temperature.
Additionally, harmful acids released from the stack can be
reduced in a condensing heat exchanger by acid condensation. reduced in a condensing heat exchanger by acid condensation.
Condensation of vapors in flue gas is a complicated
phenomenon since heat and mass transfer of water vapor and
various acids simultaneously occur in the presence of noncondensable
gases such as nitrogen and oxygen. Design of a
condenser depends on the knowledge and understanding of the
heat and mass transfer processes. A computer program for
numerical simulations of water (H2O) and sulfuric acid (H2SO4)
condensation in a flue gas condensing heat exchanger was
developed using MATLAB. Governing equations based on
mass and energy balances for the system were derived to
predict variables such as flue gas exit temperature, cooling
water outlet temperature, mole fraction and condensation rates
of water and sulfuric acid vapors. The equations were solved
using an iterative solution technique with calculations of heat
and mass transfer coefficients and physical properties.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
6th International Conference on Machine Learning & Applications (CMLA 2024)ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Applications.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Immunizing Image Classifiers Against Localized Adversary Attacks
Fake accounts detection on social media using stack ensemble system
1. International Journal of Electrical and Computer Engineering (IJECE)
Vol. 12, No. 3, June 2022, pp. 3013~3022
ISSN: 2088-8708, DOI: 10.11591/ijece.v12i3.pp3013-3022 3013
Journal homepage: http://ijece.iaescore.com
Fake accounts detection on social media using stack ensemble
system
Amna Kadhim Ali, Abdulhussein Mohsin Abdullah²
Department of Computer Science, College of Computer Science and Information Technology, University of Basrah, Basrah, Iraq
Article Info ABSTRACT
Article history:
Received Mar 6, 2021
Revised Jan 4, 2022
Accepted Jan 21, 2022
In today’s world, social media has spread widely, and the social life of
people have become deeply associated with social media use. They use it to
communicate with each other, share events and news, and even run
businesses. The huge growth in social media and the massive number of
users has lured attackers to distribute harmful content through fake accounts,
leading to a large number of people falling victim to those accounts. In this
work, we propose a mechanism for identifying fake accounts on the social
media site Twitter by using two methods to preprocess data and extract the
most effective features, they are the spearman correlation coefficient and the
chi-square test. For classification, we used supervised machine learning
algorithms based on the ensemble system (stack method) by using random
forest, support vector machine, and naive Bayes algorithms in the first level
of the stack, and the logistic regression algorithm as a meta classifier. The
stack ensemble system was shown to be effective in achieving the best
results when compared to the algorithms used with it, with data accuracy
reaching 99%.
Keywords:
Classification
Combining system
Feature selection techniques
Machine learning
Twitter accounts
This is an open access article under the CC BY-SA license.
Corresponding Author:
Amna Kadhim Ali
Department of Computer Science, College of Computer Science and Information Technology, University
of Basrah
Basrah, Iraq
Email: amna.k.ali.itc.cs.p@uobasrah.edu.iq
1. INTRODUCTION
Social media use is becoming increasingly common, and it has become an essential part of daily life
around the world. Besides being a means of communication, it is also considered a means of gaining fame
and running a business. Social media sites are popular because of people’s interests in making friends,
posting pictures, tagging individuals in group photos, sharing their ideas and opinions on popular subjects,
maintaining good working relationships, and having a general interest in others.
Twitter is one of the social media platforms used for cooperation and communication between users.
It was initiated in 2006 [1], and in recent years, the number of users has reached millions. Users share short
messages, called tweets, of 140 characters or less, as well as pictures and videos, as the primary forms of
communication on the network. Regrettably, the emergence of social communication on Twitter has drawn
the attention of cybercriminals who leverage the trust between users to spread malicious content on the
network, resulting in a large number of victims. They create fake accounts [2] and use them to spread false
news or steal users’ accounts. Therefore, uncovering these accounts has become one of the major challenges
faced by social media sites at present [3].
A variety of methods have been proposed by researchers to classify fake accounts [4]–[6], some
using crowdsourcing [7] which rely on human effort to detect them, or using a graph [8], [9] by analyzing
2. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 12, No. 3, June 2022: 3013-3022
3014
network contents or using machine learning algorithms to classify accounts depending on specific features.
Ersahin et al. [10] introduce a method of detecting fake accounts from the Twitter dataset using a
classification algorithm called Naive Bayes. The accuracy of the pre-processed dataset was increased by
using a supervised discretization technique called entropy minimization discretization (EMD), to reach a
90.9% accurate result.
Previous research [11] implemented a machine learning pipeline for online social networks to
identify fake accounts. The framework classified groups of fake accounts instead of creating a forecast for
each individual account to determine if they were generated by the same person. Several classification
algorithms have been proposed, such as support vector machine (SVM), random forest, and deep neural
network.
A previous study [12] examined the identification of Twitter spam accounts to enhance the initial
detection of spammer classes by incorporating both managed principal component analysis (PCA) and k-
mean algorithms. To detect spam on social networks, several existing features were adopted, and new
features were added to improve performance. Three classification algorithms, multi-layer perceptron (MLP),
support vector machine, and random forest, were trained. The best results were found using the random forest
algorithm, which had an accuracy of 96.30%.
Another previous study [13] identified fake Instagram accounts as a problem of binary classification
and proposed a cost-sensitive technique for reducing required features. The technique was based on a genetic
algorithm to pick the best attributes for automatic classification of computation, correct the variance using the
synthetic minority over-sampling technique-nominal continuous (SMOTE-NC) algorithm in a false
computation dataset, and evaluate multiple methods of pattern recognition on pooled datasets. Ultimately,
with a rating of 86%, the support vector machine and neural network-based techniques achieved the highest
F1 score for robotic computing detection, and the neural network achieved the best F1 rating at 95%. In this
paper, spearman's correlation coefficient and the chi-square test were used to preprocess Twitter data to find
the best qualities for distinguishing between fake and real accounts [14], and the min-max normalization
method to scale the data between (0, 1). For data classification, we used machine learning algorithms based
on the stack ensemble system to increase the predictive strength of the algorithms and achieve the highest
accuracy in data classification.
2. RESEARCH METHOD
This section discusses the suggested method for detecting fake accounts on social media and
contains six basic steps. They are; dataset collection, data cleaning, features extraction and selection, data
scaling, a classification stage depending on the ensemble system (stack method), and an evaluation and
comparison stage. Figure 1 shows the phases of the technique adopted.
Figure 1. The steps of the technique adopted for the detection process
Twitter data collection
Data cleaning
Feature extraction and selection
Data scaling
Training data Testing data
Random Forest SVM Naïve Bayes
Evaluatin
&
comparison
Stack
ensemble
system
prediction
Logistic Regression
Real Fake
3. Int J Elec & Comp Eng ISSN: 2088-8708
Fake accounts detection on social media using stack ensemble system (Amna Kadhim Ali)
3015
2.1. Twitter data collection
The Management Information Base "MIB" dataset [15] is used in this research, consisting of five
datasets obtained from Twitter, two of them represent real accounts and three of them are fake accounts. The
sum of all accounts is 5,301 with 29 features. They can be explained: i) the fake project (TFP) consists of 469
real accounts, ii) elections 2013 (E13) consist of 1,481 real accounts, iii) fastfollowerz dataset consist of
1,169 fake accounts, iv) InterTwitter dataset consist of 1,337 fake accounts, and v) Twitter technology
dataset consist of 845 fake accounts.
2.2. Data cleaning
During the data collection process, some errors occur that lead to the loss of some data. This
problem leads to a decrease in the quality of the data and thus leads to low-quality results when analyzing
and exploring them. Our grouped data contains several blank fields, as shown in Figure 2, where the yellow
color denotes the empty fields. Keeping these empty fields negatively affects the classification process and
leads to inaccurate results, so this stage includes removing the columns of features that contain 30% or more
blank fields [16].
Figure 2. All features
2.3. Feature extraction and selection
Feature extraction is used to determine the optimal subset of features for model creation by
eliminating inappropriate or redundant features, thereby concentrating only on necessary features. The
purpose of this strategy is to minimize the training time for the prediction model by reducing
over-processing, to improve the model's generalizability, and to help researchers interpret the model. In the
proposed research, two methods are used to select the best features. These methods are:
2.3.1. Spearman rank correlation
Spearman correlation coefficient is one of the filtering methods used for feature selection [17],
which tests the intensity and orientation of the monotonic association between two quantitative variables.
They have values ranging from (-1) to (+1) to show the correlation degree. When the two variables are
independent, each correlation measure is entirely zero. The result of spearman is a table that contains the
correlation coefficients that link each variable in the dataset to the other variables. The following formula is
employed to calculate the spearman rank correlation:
𝑆 = 1 − (
6 ∑ d𝒾s2
m(m2−1)
) (1)
where S=spearman rank correlation, dis=represents the distinction between the respective variable ranks,
m=number of observations.
4. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 12, No. 3, June 2022: 3013-3022
3016
2.3.2. Chi-square test
The chi-square test is one of the statistical methods used to verify the independence of two
events [18]. Whenever the two features are independent, the calculated chi-value is small compared to the
critical chi value, meaning a large calculated chi-square value disproves the hypothesis of independence.
A large chi-square value indicates which feature is dependent on the response and can be used for model
training. The chi square formula is:
x2
𝑒 = ∑
(Oᵢ−Eᵢ)²
Eᵢ
(2)
where e=degree of freedom, O=observed value(s), E=expected value(s).
First, the spearman correlation coefficient is used to find the correlation between all the features,
whether numerical or qualitative, depending on the data rank, and extract only the correlated features. After
applying spearman's correlation coefficient, two features of the remaining dataset contain empty fields. They
are; default_profile and background_image. These features must be configured to use the statistical
chi-square test. To fill in these fields, the number of current values for the columns is calculated, and the
most common value was chosen to fill in the empty fields [19]. Then a chi-square test is implemented on
spearman's output to find the correlation between the features and the target (output), to choose the best
features that affect whether the account is fake or real to use in the classification process. The flow chart in
Figure 3 explains the feature extraction process.
Figure 3. Feature extraction flow chart
2.4. Data scaling (normalization)
Scaling of features is a technique used to normalize the range of individual data variables or
features. In this section, all numeric values in the selected features are listed between zero and one by using
Min-Max normalization to increase the processing speed. In Min-Max normalization, the minimum value of
the variable is converted to zero and the maximum value is converted to one, while the rest of the values are
converted to a decimal number between zero and one. The general formula is:
𝑣ʹ
=
v−minₐ
max ₐ− min ₐ
(𝑛𝑒𝑤_𝑚𝑎𝑥ₐ − 𝑛𝑒𝑤_𝑚𝑖𝑛ₐ) + 𝑛𝑒𝑤_𝑚𝑖𝑛ₐ (3)
whereas v= is an original value and v’=is the normalized value.
2.5. Detection model
We used the ensemble system by inserting features extracted from the datasets after normalizing
them into a stacking. Stacking is an ensemble learning method that combines several classifiers or regression
models through the meta-classifier or meta-regressor to improve predictive strength [20]. As shown in
Figure 4, based on a full training group, the basic-level models are trained, and then the meta-model is
trained on the model-like features of the basic level outputs. The algorithm below summarizes stacking.
Stacking Algorithm
Input: training data D={xᵢ, yᵢ}ᵢᵐ̳ₗ
Output: ensemble classifier H
Step 1: learn first-level classifiers
for t = 1 to T do
learn ht based on D
end for
Step 2: create a new prediction data set
New
feature set
after
cleaning
Extract the
correlation
between (all
features) by
using Spearman
Rank Correlation
Build a new
dataset from
correlated
features
Configure the
data use the
statistical Chi-
Square by filling
in the remaining
empty fields
Extract the
correlation
between (each
feature and the
target) by using
Chi-Square test
New
dataset
5. Int J Elec & Comp Eng ISSN: 2088-8708
Fake accounts detection on social media using stack ensemble system (Amna Kadhim Ali)
3017
for i=1 to m do
Dₕ={xʹᵢ, yᵢ}, where xʹᵢ={h₁ (xᵢ),…., hT (xᵢ)}
end for
Step 3: learn a meta-classifier
learn H based on Dₕ
return H
Figure 4. Stack ensemble system
The most important characteristic of the stack method is that it can benefit from the performance of
a group of well-performing models in a classification or regression task and can provide better predictions
than any individual model in the group. In our research, a group of the most common algorithms are used,
four different learning techniques are trained and tested depending on the stack method. These algorithms
are:
2.5.1. Random forest algorithm
Random forest (RF) is a powerful machine learning algorithm that performs the tasks of
classification and regression [21], [22]. The basic building block of a random forest is derived from the
decision tree. The model is obtained by dividing the data into bootstrapping samples depending on the
number of trees that we want to perform, building a simple prediction model within each section, and
combining their outputs based on the bagging ensemble learning technique to get to the final prediction.
2.5.2. Support vector machine algorithm
Support vector machine (SVM) is one of the most popular supervised learning algorithms that finds
the optimal hyperplane, which separates the data points into two-component by maximizing the margin,
which represents the distance from the decision surface to the closest data point [23], [24]. SVM is effective
in cases where the number of dimensions is greater than the number of samples given.
2.5.3. Naïve Bayes algorithm
Naive Bayes (NB) is a type of classifier of probabilities. It works on the theory of Bayes and deals
with both categorical variables and continuous variables [25], [26]. NB assumes that each pair of labeled-
value features is independent of each other, meaning that the presence of any particular feature in a class is
unrelated to the presence of other features. The NB equation is:
𝑃 (𝐴𝐵) =
P (A) P(BA)
𝑃(B)
(4)
2.5.4. Logistic regression algorithm
Logistic regression (LR) is one of the machine learning algorithms used in binary classification [27].
It is a simple and commonly used algorithm that measures the relationship between one variable and many
dependent variables (which we want to predict). As it uses its logistic function to estimate probabilities, to
make a prediction, these probabilities must be converted into binary values, a task known as the sigmoid
function. The sigmoid function is a curve in the form of an S that takes any number of real values and places
them in the range between zero and one.
In the proposed method, the first level algorithms of the stack, including random forest, SVM, and
naive Bayes, are trained on the training set, a k-fold validation is performed on each of these learners [28],
and the validated expected values are collected from all the first level algorithms to use them as inputs to the
meta classifier (logistic regression). The same steps are used to generate predictions on the test set. The
accounts in the test suite are classified into real and fake accounts based on the training suite that is provided.
The data was divided into a training and test group by choosing 75% as training data and 25% as testing data
using stratified sampling [29] to ensure an equal division and maintain the same proportion of classes.
6. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 12, No. 3, June 2022: 3013-3022
3018
Default classifier parameters have been used, just the random state parameter is changed to (one) for
each of the train-test-split and for the random forest algorithm to have steady and acceptable results. A
prediction for each of the three basic algorithms is made using the dataset. The classifiers are implemented
with 10-fold validation, where the data is divided into ten parts, such that every time the classifier is trained
for nine parts and tested on the basis of the tenth part, the training and testing process are replicated ten times.
Then, these predictions are fed into the meta-learner (logistic regression) to create the group prediction.
2.6. Evaluation and comparison
A confusion matrix is used as the main source of assessment in this research to evaluate false
detection models. The result of the confusion matrix of the stack is evaluated and compared with the results
of the algorithms that were used with it. Table 1 explains the confusion matrix in more detail. A confusion
matrix is a technique used to describe the performance of classification algorithms, and which gives a better
understanding of the classification model and the types of errors it can cause. All the results of the algorithms
are plotted in a confusion matrix to determine where the error occurred.
− Accuracy: reflects the number of correctly classified instances in both groups over the overall number
of all instances within a dataset.
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 𝑟𝑎𝑡𝑒 =
TP+TN
TP+FN+FP+TN
(5)
− Precision: is the proportion of accurate positive predictions to the total number of positive predictions.
𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 =
TP
TP+FP
(6)
− Recall: is the ratio of accurate positive predictions to the total number of positive examples in the set of
tests.
𝑅𝑒𝑐𝑎𝑙𝑙 =
TP
TP+FN
(7)
− F_measure: is the measure of model efficiency, a weighted average of model precision and recall.
𝐹 𝑀𝑒𝑎𝑠𝑢𝑟𝑒 = 2 ×
Precision ×Recall
Precision+Recall
(8)
Table 1. Confusion matrix
Actual Class Predicted Class
Positive Negative
Positive TP
correctly classified as positives
FN
incorrectly classified as negatives
Negative FP
incorrectly classified as positives
TN
correctly classified as negatives
3. RESULTS AND DISCUSSION
3.1. Pre-processing of dataset
This stage includes the results of data cleaning, followed by feature extraction and selection
methods, and normalization method. In data cleaning, the columns containing 30% empty fields were
deleted, and the features were reduced from 29 to 23. Feature extraction and selection involved using two
filtering methods. First, the spearman correlation coefficient was used to find the correlation between all data,
where the features were reduced to 7 in the dataset. Table 2 represents the result of spearman's correlation
coefficient.
Then, the chi-square test was implemented on spearman's output. Where the number of features has
reached five, they are as follows: (Statuses_count, Followers_count, Friends_count, Favourites_count, and
Listed_count) the values of these features are shown in Figure 5, where Figure 5 (a) is Statuses_count, which
represents the total number of tweets sent by the account. Figure 5(b) Followers_count is the number of
followers who follow the user. Figure 5(c) Friends_count is the number of friends who follow the account
holder. Figure 5(d) Favourites_count describe how many times each user account's tweets have been liked
over the course of the account's existence, and Figure 5(e) is Listed_count, which is the number of people
who have added the user to their list. The last stage of the data preprocessing process is applying Min-Max
7. Int J Elec & Comp Eng ISSN: 2088-8708
Fake accounts detection on social media using stack ensemble system (Amna Kadhim Ali)
3019
normalization to put all the data values on one level. All data values were converted into values between zero
and one.
Table 2. The result of Spearman rank correlation
Statuses
Count
Followers
Count
Friends
Count
Favorites
Count
Listed
Count
Default
Profile
Background
Image
Dataset
(output)
Statuses Count 1 0.84432 0.1598 0.7461 0.60878 0.00815 -0.0002 -0.7338
Followers Count 0.84432 1 0.233297 0.67047 0.63751 -0.0261 -0.00568 -0.6914
Friends Count 0.1598 0.233297 1 0.03005 0.15653 -0.2801 -0.04419 0.21315
Favorites Count 0.7461 0.67047 0.03005 1 0.60688 -0.1314 -0.0067 -0.7081
Listed Count 0.60878 0.63751 0.15653 0.60688 1 -0.1088 -0.0018 -0.5987
Default Profile 0.00815 -0.0261 -0.2801 -0.1314 -0.1088 1 0.16169 -0.1207
Background Image -0.0002 -0.00568 -0.04419 -0.0067 -0.0018 0.16169 1 -0.0310
Dataset (output) -0.7338 -0.6914 0.21315 -0.7081 -0.5987 -0.1207 -0.0310 1
(a) (b)
(c) (d)
(e)
Figure 5. Values of the selected account features (a) statuses_count, (b) followers_count, (c) friends_count,
(d) favourites_count, and (e) listed_count
Number of accounts Number of accounts
Number of accounts Number of accounts
Number of accounts
Total
number
of
tweets
Number
of
followers
Number
of
friends
Total
number
of
likes
Number
of
groups
8. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 12, No. 3, June 2022: 3013-3022
3020
3.2. Detecting of fake accounts
This stage includes the result of the confusion matrix of the stack method and the algorithms that
were used with it. As shown in Table 3, based on the entire set of suggested features, the stack method
achieved a high evaluation rate in terms of accuracy, precision, and F1_score, compared with using each
algorithm separately. The accuracy of the stack method was 99%, the precision was 99%, and the F1_score
was 99.2%. Figure 6 shows the visualization of the confusion matrix between the stack method and the
algorithms. Where Figure 6(a) is the accuracy, Figure 6(b) is the F1_score, Figure 6(c) is the recall, and
Figure 6(d) is the precision.
Table 3. A comparison between the stack system and the algorithms used
Name Accuracy F1_score Recall Precision
Stack 0.990196 0.992257 0.994033 0.990488
Random forest 0.988688 0.991055 0.991647 0.990465
SVM 0.913273 0.934992 0.986874 0.888292
Naïve Bayes 0.763952 0.841839 0.994033 0.730061
Logistic regression 0.674962 0.795444 1 0.660362
(a) (b)
(c) (d)
Figure 6. Visualization of the confusion matrix between the stack method and the algorithms: (a) accuracy,
(b) F1_score, (c) recall, (d) precision
4. CONCLUSION
In this paper, we proposed a model for fake accounts detection based on a collection of basic Twitter
features that are publicly accessible. These feature sets were derived from the profile details available in the
Tweets of users. To improve the detection model, the stack ensemble method based on four machine learning
algorithms was used, and two feature selection methods were implemented to determine which features in the
detection process were most influential. Initial work results show that successful results can be achieved in a
stack ensemble method by using random forest, SVM, and naive Bayes classification algorithms as base level
classifiers, and by using logistic regression as a meta classifier. By implementing this methodology, the
1
0.8
0.6
0.4
0.2
0
1
0.8
0.6
0.4
0.2
0
1
0.995
0.99
0.985
0.98
1
0.8
0.6
0.4
0.2
0
algoritms algoritms
algoritms algoritms
accuracy
F1_score
recall
precision
9. Int J Elec & Comp Eng ISSN: 2088-8708
Fake accounts detection on social media using stack ensemble system (Amna Kadhim Ali)
3021
accuracy of the data reached 99%. The results also revealed that the ensemble system has a significantly
higher impact on the accuracy of the detection process over using each algorithm separately. For future work,
much larger data could be collected using the same methodology as this work but using other machine
learning algorithms.
REFERENCES
[1] S. Edosomwan, S. K. Prakasan, D. Kouame, J. Watson, and T. Seymour, “The history of social media and its impact on business,”
The Journal of Applied Management and Entrepreneurship, vol. 16, no. 3, pp. 79–91, 2011.
[2] D. Yuan et al., “Detecting fake accounts in online social networks at the time of registrations,” in Proceedings of the 2019 ACM
SIGSAC Conference on Computer and Communications Security, Nov. 2019, pp. 1423–1438, doi: 10.1145/3319535.3363198.
[3] R. Kaur and S. Singh, “A survey of data mining and social network analysis based anomaly detection techniques,” Egyptian
Informatics Journal, vol. 17, no. 2, pp. 199–216, Jul. 2016, doi: 10.1016/j.eij.2015.11.004.
[4] M. Al-Qurishi, M. Al-Rakhami, A. Alamri, M. Alrubaian, S. M. M. Rahman, and M. S. Hossain, “Sybil defense techniques in
online social networks: a survey,” IEEE Access, vol. 5, pp. 1200–1219, 2017, doi: 10.1109/ACCESS.2017.2656635.
[5] K. Anand, J. Kumar, and K. Anand, “Anomaly detection in online social network: a survey,” in 2017 International Conference on
Inventive Communication and Computational Technologies (ICICCT), Mar. 2017, pp. 456–459, doi:
10.1109/ICICCT.2017.7975239.
[6] F. Masood et al., “Spammer detection and fake user identification on social networks,” IEEE Access, vol. 7, pp. 68140–68152,
2019, doi: 10.1109/ACCESS.2019.2918196.
[7] G. Wang et al., “Serf and turf: crowdturfing for fun and profit,” in Proceedings of the 21st international conference on World
Wide Web - WWW ’12, Nov. 2011, pp. 679–688, doi: 10.1145/2187836.2187928.
[8] M. Mohammadrezaei, M. E. Shiri, and A. M. Rahmani, “Identifying fake accounts on social networks based on graph analysis
and classification algorithms,” Security and Communication Networks, vol. 2018, pp. 1–8, Aug. 2018, doi:
10.1155/2018/5923156.
[9] M. BalaAnand, N. Karthikeyan, S. Karthik, R. Varatharajan, G. Manogaran, and C. B. Sivaparthipan, “An enhanced graph-based
semi-supervised learning algorithm to detect fake users on Twitter,” The Journal of Supercomputing, vol. 75, no. 9,
pp. 6085–6105, Sep. 2019, doi: 10.1007/s11227-019-02948-w.
[10] B. Ersahin, O. Aktas, D. Kilinc, and C. Akyol, “Twitter fake account detection,” in 2017 International Conference on Computer
Science and Engineering (UBMK), Oct. 2017, pp. 388–392, doi: 10.1109/UBMK.2017.8093420.
[11] G. A, Radhika S, and Jayalakshmi S. L., “Detecting fake accounts in media application using machine learning,” International
Journal of Advanced Networking and Applications (IJANA), pp. 234–237.
[12] K. S. Adewole, T. Han, W. Wu, H. Song, and A. K. Sangaiah, “Twitter spam account detection based on clustering and
classification methods,” The Journal of Supercomputing, vol. 76, no. 7, pp. 4802–4837, Jul. 2020, doi: 10.1007/s11227-018-2641-
x.
[13] F. C. Akyon and M. E. Kalfaoglu, “Instagram fake and automated account detection,” 2019 Innovations in Intelligent Systems and
Applications Conference (ASYU), 2019, pp. 1-7, doi: 10.1109/ASYU48272.2019.8946437.
[14] A. El Azab, A. M. Idrees, M. A. Mahmoud, and H. Hefny, “Fake account detection in Twitter based on minimum weighted
feature set,” International Journal of Computer, Electrical, Automation, Control and Information Engineering, vol. 10, no. 1,
pp. 13–18, 2016, doi: 10.5281/zenodo.1110582.
[15] S. Cresci, R. Di Pietro, M. Petrocchi, A. Spognardi, and M. Tesconi, “Fame for sale: efficient detection of fake Twitter
followers,” Decision Support Systems, vol. 80, pp. 56–71, Sep. 2015, doi: 10.1016/j.dss.2015.09.003.
[16] J. Clavel, G. Merceron, and G. Escarguel, “Missing data estimation in morphometrics: how much is too much?,” Systematic
Biology, vol. 63, no. 2, pp. 203–218, Mar. 2014, doi: 10.1093/sysbio/syt100.
[17] Ms. K. Nagalakshmi, M. P. Nanthini, M. A. Saranya, and Mrs. B. Revathi, “Detect fake identities using machine learning,” SSRG
International Journal of Computer Science and Engineering (SSRG – IJCSE), pp. 80–84, 2019.
[18] I. S. Thaseen, C. A. Kumar, and A. Ahmad, “Integrated intrusion detection model using chi-square feature selection and ensemble
of classifiers,” Arabian Journal for Science and Engineering, vol. 44, no. 4, pp. 3357–3368, Apr. 2019, doi: 10.1007/s13369-018-
3507-5.
[19] O. Olasehinde and K. Williams, “A machine learning framework for improving classification accuracy using stacked ensemble a
machine learning framework for improving classification accuracy using stacked ensemble,” ISTEAMS Multidisciplinary
Conference AlHikmah University. Ilorin, 2018.
[20] K. R. Purba, D. Asirvatham, and R. K. Murugesan, “Classification of Instagram fake users using supervised machine learning
algorithms,” International Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 3, pp. 2763–2772, Jun. 2020,
doi: 10.11591/ijece.v10i3.pp2763-2772.
[21] P. Bharadwaj and Z. Shao, “Fake news detection with semantic features and text mining,” International Journal on Natural
Language Computing, vol. 8, no. 3, pp. 17–22, Jun. 2019, doi: 10.5121/ijnlc.2019.8302.
[22] S. Y. Wani, M. M. Kirmani, and S. I. Ansarulla, “Prediction of fake profiles on Facebook using supervised machine learning
techniques-a theoretical model,” International Journal of Computer Science and Information Technologies (IJCSIT), vol. 7, no. 4,
pp. 1735–1738, 2016.
[23] L. K. Ramasamy, S. Kadry, Y. Nam, and M. N. Meqdad, “Performance analysis of sentiments in Twitter dataset using SVM
models,” International Journal of Electrical and Computer Engineering (IJECE), vol. 11, no. 3, pp. 2275–2284, Jun. 2021, doi:
10.11591/ijece.v11i3.pp2275-2284.
[24] H. A. Santoso, E. H. Rachmawanto, and U. Hidayati, “Fake Twitter account classification of fake news spreading using Naive
Bayes,” Scientific Journal of Informatics, vol. 7, no. 2, pp. 228–237, 2020.
[25] M. M. Saritas, “Performance analysis of ANN and naive Bayes classification algorithm for data classification,” International
Journal of Intelligent Systems and Applications in Engineering, vol. 7, no. 2, pp. 88–91, Jan. 2019, doi:
10.18201/ijisae.2019252786.
[26] I. Aydin, M. Sevi, and M. U. Salur, “Detection of fake Twitter accounts with machine learning algorithms,” in 2018 International
Conference on Artificial Intelligence and Data Processing (IDAP), Sep. 2018, pp. 1–4, doi: 10.1109/IDAP.2018.8620830.
[27] D. Berrar, “Cross-validation,” in Encyclopedia of Bioinformatics and Computational Biology, vol. 1–3, Elsevier, 2019,
pp. 542–545.
[28] J. Kaiser, “Dealing with missing values in data,” Journal of Systems Integration, pp. 42–51, 2014, doi: 10.20470/jsi.v5i1.178.
10. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 12, No. 3, June 2022: 3013-3022
3022
[29] A. M. Al-Zoubi, J. Alqatawna, and H. Paris, “Spam profile detection in social networks based on public features,” in 2017 8th
International Conference on Information and Communication Systems (ICICS), Apr. 2017, pp. 130–135, doi:
10.1109/IACS.2017.7921959.
BIOGRAPHIES OF AUTHORS
Amna Kadhim Ali received a B.Sc. in Computer Science in 2006 from the
University of Basrah, Iraq. Now, she is a student pursuing a master's degree in Artificial
Intelligence at the University of Basrah, Basrah, Iraq. She can be contacted by email:
amna.k.ali.itc.cs.p@uobasrah.edu.iq.
Abdulhussein M. Abdullah Professor at the University of Basrah, Basrah, Iraq.
He received an MSc in Computer Science in 1990 and a PhD degree from the University of
Basrah, Iraq in 2001. His areas of interest include Speech Recognition, the Semantic Web,
Image Processing, and Machine Learning. He can be contacted by email:
abduihussein.abdullah@uobasrah.edu.iq.