SlideShare a Scribd company logo
1 of 8
Download to read offline
International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019
DOI: 10.5121/ijcsit.2019.11507 83
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING
MACHINE LEARNING
Ananya Dey1
, Hamsashree Reddy2
, Manjistha Dey3
and Niharika Sinha4
1
National Institute of Technology, Tiruchirappalli, India
2
PES University, Bangalore, India
3
RV College of Engineering, Bangalore, India
4
Manipal Institute of Technology, Karnataka, India
ABSTRACT
With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of
information available, there has been an enormous increase in the rise of cyber-crimes, particularly targeted
towards women. According to a 2019 report in the [4] Economics Times, India has witnessed a 457% rise in
cybercrime in the five year span between 2011 and 2016. Most speculate that this is due to impact of social media
such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social
network, creation of user accounts in these sites usually needs just an email-id. A real life person can create
multiple fake IDs and hence impostors can easily be made. Unlike the real world scenario where multiple rules and
regulations are imposed to identify oneself in a unique manner (for example while issuing one’s passport or driver’s
license), in the virtual world of social media, admission does not require any such checks. In this paper, we study
the different accounts of Instagram, in particular and try to assess an account as fake or real using Machine
Learning techniques namely Logistic Regression and Random Forest Algorithm.
KEYWORDS
Logistic Regression, Random Forest Algorithm, median imputation, Maximum likelihood estimation, k cross
validation, overfitting, out of bag data, recall, identity theft, Angler phishing.
1. INTRODUCTION
Instagram is an online photo and video sharing social networking platform that has been available on both
Android and iOS since 2012. As of May 2019, there are over a billion users registered on Instagram.
In the recent years, Instagram has been found to be using third party apps, called bots. While these can
definitely impersonate a user and tarnish their reputation leading to ‘identity theft’, there has also been
greater instances of malicious ways of promoting the brand image of a company known as “influencer
marketing”. These days a number of businesses are using social media to heed to their customers’ needs
which has led to yet another malpractice called Angler phishing. All these malpractices have made it vital
to implement strong fraud detection techniques and hence we propose our solution.
2. LITERATURE SURVEY
Previously a lot of work has been done on other platforms like Facebook and Twitter, but not much work
has been done for Instagram. Each of these Social Medias are different in terms of features that have to be
International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019
84
considered, strategies used, etc. Few of the past work include [1] Worth its Weight in Likes: Towards
Detecting Fake Likes on Instagram: This particular paper concentrates on analysing likes and identifying
the genuine ones to reduce the effect of fake likes on Instagram influencer market. They used a simple
feed-forward neural network Multi-Layer Perceptron (MLP) which obtained a precision about 83%. [2]
Identifying Fake Profiles in LinkedIn: A number of features were considered to train the dataset using
neural networks, SVMs, and principal component analysis. The precision rate achieved was 84%. [3]
Detection of Fake Profiles in Social Media: This is a literature review to detecting fake social media
accounts classified into the approaches aimed on analysing individual accounts. So our 2 proposed
method is a novel approach in terms of the platform chosen and the algorithms used like Random Forest
for classification.
Figure1: Proposed System
3. MATERIALS AND METHODS
In this section we present the materials and methods used for the research work.
International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019
85
Data set information:
The dataset has been taken from https://www.kaggle.com/free4ever1/instagram-fake-spammer-genuine-
accounts.It consists of two CSV files- train.csv (19 KB) and test.csv (4 KB) .The dependent variable,
which is whether it is a fake or not fake account is categorical and it takes two values 0 (not fake) and 1
(fake) profile. The distribution of the training dataset is such that 50% is fake and the rest 50% is
legitimate. Below is a table to denote the parameters that have been considered (denoted in column
Profile feature), their range of values, each of their mean values and what each of the features denote.1.
Collecting the IP addresses which will be used as the training dataset
Table 1: Dataset Features
Figure 2: Snapshot of Training Dataset
Exploratory Data Analysis: This is a critical process of initial data investigation done so as to discover
patterns in the dataset and spot anomalies with the help of summary statistics and graphical
representation. Below are the various sub processes that were done.
a. Missing Value Treatment
The given dataset had no missing value. Missing values could occur in a dataset due to mostly real world
problems and can be treated either through deletion or imputation. The presence of missing values
reduces the data available to be analysed, compromising on the statistical power of the study, and
eventually the reliability of its results.
b. Outlier Detection
Outliers are extreme values that deviate from the usual data values in the dataset. If outliers are present in
the dataset, then the accuracy is reduced significantly as the training dataset learns from this noise in the
data and could give an over-fit model. After careful analysis using graphs, we conclude that the following
features had outliers present in them- nums/length username, full name words, description length, #posts,
#followers and #follows. To deal with these outliers, we used median imputation by calculating the
median of these set of values. Note that we do not include the outliers in the median calculation. Once we
get the median, we replace all the outliers with the calculated median value.
International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019
86
c. Bivariate Analysis
This is done to understand the relationship between two variables and the strength of association between
them. We calculated the correlation matrix and concluded absence of high multicollinearity between the
variables. This is one of the assumptions before building a logistic regression model.
Once data pre-processing is done, we can safely move into the algorithms. We have been provided with a
labelled training dataset and can therefore proceed with applying supervised learning algorithms that map
the input to the output .For the scope of this paper we have considered two commonly used classification
algorithms, i.e, Logistic Regression and the Random Forest algorithm. Each of them has been explained
in depth below.
1. Logistic Regression
The assumptions of this model include absence of outliers in the dataset and absence of high correlations
between the predictors, which have been taken care of in the preceding steps. In logistic regression, the
probabilities predicted using the logit function. The values greater than or equal to the decision boundary
belong to one class while the values lower than it belong to the other.
We first run the GLM function in R to perform regression and find out the beta coefficients and p values
for each of the features. The Beta coefficient, which is calculated based on maximum likelihood
estimation, is an indicator of how strongly the predictor variable indicates the dependent variable. Based
on the p values, we remove those variables whose 5 values are greater than 0.05 and re run the model.
Finally we performed K cross validation to check overfitting
Figure 3: Model Summery
2. Random forest Algorithm
This is a regression model performing well on classification model. Since there are a very few
assumptions attached to it, so data preparation is less challenging. We used the random Forest function in
International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019
87
R and set the n-tree parameter (denoting the number of trees) as 500 and the number of variables tried at
each split as 3. Random Record Selection is the first task in which each tree is trained on ⅔ of the total
training data and some variables are selected at random(say m) out of all the variables and these m
variables are used to split the node. For each tree, using the leftover (36.8%) data, the misclassification
rate is calculated. This gives us the Out Of Bag (OOB) error rate. The forest chooses the classification
having the most votes over all the trees in the forest. This is the RF score and the percent YES votes
received is the predicted probability.
4. RESULTS AND ANALYSIS
In this section we determine the results of the two classification models.
After creating the models using the training datasets, we apply the models on unseen data, i.e., and the
test dataset. We create the confusion matrix based on these and calculate various performance parameters
as discussed below.
1. Logistic Regression
Figure 4: Confusion matrix for Logistic Regression (T = True, F=False, P=Positive, N=Negative)
We calculate the different model metrics as follows-
 Precision- We divide the total number of correctly classified positive examples by the total
number of predicted positive examples.
Precision= 𝑇𝑃 / (𝑇𝑃+𝐹𝑃) = 57/ (57+8) = 87.6 %
 Recall- The ratio of the total number of correctly classified positive examples divide to the total
number of positive examples. Recall= 𝑇𝑃/((𝑇𝑃+𝐹𝑁) = 57/ (57+3) = 95 %
 F1 score- It is the harmonic mean of recall and precision. F1 score= (2∗𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛∗𝑅𝑒𝑐𝑎𝑙𝑙)/
(𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛+𝑅𝑒𝑐𝑎𝑙𝑙) = 91.15
 Accuracy- It is calculated using the following formula
(𝑇𝑃+𝑇𝑁) /(𝑇𝑃+𝑇𝑁+𝐹𝑃+𝐹𝑁) = (57+52) / (57+52+8+3) = 90.8 %
Based on this curve, we infer that using k=3 will give optimal results.
International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019
88
Figure 5: ROC Curve for Logistic Regression
2. Random Forest Algorithm
Figure 6: Confusion matrix for Logistic Regression (T = True, F=False, P=Positive, N=Negative)
On a similar note, we calculate the various model metrics for Random forest algorithm.
 Precision= 𝑇𝑃 / (𝑇𝑃+𝐹𝑃) = 55 / (55+4) = 93.2 %
 Recall= 𝑇𝑃 / (𝑇𝑃+𝐹𝑁) = 55/(55+5) = 91.6 %
 F1 score= (2∗𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛∗𝑅𝑒𝑐𝑎𝑙𝑙) / (𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛+𝑅𝑒𝑐𝑎𝑙𝑙) = 92.42
 Accuracy= (𝑇𝑃+𝑇𝑁) / (𝑇𝑃+𝑇𝑁+𝐹𝑃+𝐹𝑁) = (55+56) / (55+56+4+5) = 92.5 %
International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019
89
Figure 7: Variable Importance Graph
5. CONCLUSIONS
While going through previous similar research conducted on detection of fake profiles on social media
platforms, we realized that not a lot has been done on Instagram as a social network platform in particular.
Hence, we targeted our approach for the same. In this paper, we introduced a novel approach for detecting
fake user profiles on Instagram based on certain features using concepts of machine learning. We used
two models for this- Logistic Regression and Random Forest algorithms, achieving an accuracy of 90.8%
and 92.5% respectively. Such high accuracies have not been attained in previous work conducted for
other social media platforms (highest accuracy achieved before this was 86%).
REFERENCES
[1] Indira Sen,Anupama Aggarwal,Shiven Mian.2018."Worth its Weight in Likes: Towards Detecting Fake Likes
on Instagram". In ACM International Conference on Information and Knowledge Management.
[2] Shalinda Adikari, Kaushik Dutta. 2014. “Identifying Fake Profiles In LinkedIn”. In Pacific Asia Conference
on Information Systems.
[3] Aleksei Romanov, Alexander Semenov, Oleksiy Mazhelis and Jari Veijalainen.2017. "Detection of Fake
Profiles in Social Media”. In 13th International Conference on Web Information Systems and Technologies.
International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019
90
[4] https://telecom.economictimes.indiatimes.com/news/india-saw-457-rise-in-cybercrime-in-fiveyears-
study/67455224
[5] Todor Mihaylov, Preslav Nakov.2016. "Hunting for Troll Comments in News Community Forums". In
Association for Computational Linguistics.
[6] Ml-cheatsheet.readthedocs.io. (2019). Logistic Regression — ML Cheatsheet documentation. [Online]
Available at: https://ml cheatsheet.readthedocs.io/en/latest/logistic_regression.html#binarylogistic-regression
[Accessed 10 Jun. 2019].
[7] 3. Schoonjans, F. (2019). ROC curve analysis with MedCalc. [Online] MedCalc. Available at:
https://www.medcalc.org/manual/roc-curves.php [Accessed 10 Jun. 2019].
[8] Kietzmann, J.H., Hermkens, K., McCarthy, I.P., Silvestre,B.S., 2011. Social media? Get serious!
Understanding the functional building blocks of social media. Bus.Horiz., SPECIAL ISSUE: SOCIAL
MEDIA 54, 241251. doi:10.1016/j.bushor.2011.01.005.
[9] Krombholz, K., Hobel, H., Huber, M., Weippl, E., 2015.Advanced Social Engineering Attacks. J Inf
SecurAppl 22, 113–122. doi:10.1016/j.jisa.2014.09.005.

More Related Content

What's hot

Feature Extraction
Feature ExtractionFeature Extraction
Feature Extractionskylian
 
Fake news detection project
Fake news detection projectFake news detection project
Fake news detection projectHarshdaGhai
 
Overfitting & Underfitting
Overfitting & UnderfittingOverfitting & Underfitting
Overfitting & UnderfittingSOUMIT KAR
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsHariteja Bodepudi
 
Loan approval prediction based on machine learning approach
Loan approval prediction based on machine learning approachLoan approval prediction based on machine learning approach
Loan approval prediction based on machine learning approachEslam Nader
 
Artificial Intelligence Notes Unit 2
Artificial Intelligence Notes Unit 2Artificial Intelligence Notes Unit 2
Artificial Intelligence Notes Unit 2DigiGurukul
 
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNINGDETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNINGijcsit
 
Explainable AI (XAI) - A Perspective
Explainable AI (XAI) - A Perspective Explainable AI (XAI) - A Perspective
Explainable AI (XAI) - A Perspective Saurabh Kaushik
 
Fake News detection.pptx
Fake News detection.pptxFake News detection.pptx
Fake News detection.pptxSanad Bhowmik
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningEng Teong Cheah
 
Credit card fraud detection using random forest & cart algorithm
Credit card fraud detection using random forest & cart algorithmCredit card fraud detection using random forest & cart algorithm
Credit card fraud detection using random forest & cart algorithmVenkat Projects
 
Histogram Processing
Histogram ProcessingHistogram Processing
Histogram ProcessingAmnaakhaan
 
Machine Learning and its Applications
Machine Learning and its ApplicationsMachine Learning and its Applications
Machine Learning and its ApplicationsDr Ganesh Iyer
 
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A SurveyPrediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A Surveyrahulmonikasharma
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learningbutest
 
Face recognition technology
Face recognition technologyFace recognition technology
Face recognition technologyranjit banshpal
 
Machine learning session5(logistic regression)
Machine learning   session5(logistic regression)Machine learning   session5(logistic regression)
Machine learning session5(logistic regression)Abhimanyu Dwivedi
 

What's hot (20)

Feature Extraction
Feature ExtractionFeature Extraction
Feature Extraction
 
Fake news detection project
Fake news detection projectFake news detection project
Fake news detection project
 
Overfitting & Underfitting
Overfitting & UnderfittingOverfitting & Underfitting
Overfitting & Underfitting
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
 
Loan approval prediction based on machine learning approach
Loan approval prediction based on machine learning approachLoan approval prediction based on machine learning approach
Loan approval prediction based on machine learning approach
 
Artificial Intelligence Notes Unit 2
Artificial Intelligence Notes Unit 2Artificial Intelligence Notes Unit 2
Artificial Intelligence Notes Unit 2
 
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNINGDETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING
 
SIFT
SIFTSIFT
SIFT
 
Explainable AI (XAI) - A Perspective
Explainable AI (XAI) - A Perspective Explainable AI (XAI) - A Perspective
Explainable AI (XAI) - A Perspective
 
Fake News detection.pptx
Fake News detection.pptxFake News detection.pptx
Fake News detection.pptx
 
supervised learning
supervised learningsupervised learning
supervised learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Statistics for data science
Statistics for data science Statistics for data science
Statistics for data science
 
Credit card fraud detection using random forest & cart algorithm
Credit card fraud detection using random forest & cart algorithmCredit card fraud detection using random forest & cart algorithm
Credit card fraud detection using random forest & cart algorithm
 
Histogram Processing
Histogram ProcessingHistogram Processing
Histogram Processing
 
Machine Learning and its Applications
Machine Learning and its ApplicationsMachine Learning and its Applications
Machine Learning and its Applications
 
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A SurveyPrediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
 
Face recognition technology
Face recognition technologyFace recognition technology
Face recognition technology
 
Machine learning session5(logistic regression)
Machine learning   session5(logistic regression)Machine learning   session5(logistic regression)
Machine learning session5(logistic regression)
 

Similar to Detection of Fake Accounts in Instagram Using Machine Learning

Fake accounts detection on social media using stack ensemble system
Fake accounts detection on social media using stack ensemble  systemFake accounts detection on social media using stack ensemble  system
Fake accounts detection on social media using stack ensemble systemIJECEIAES
 
IRJET- Machine Learning: Survey, Types and Challenges
IRJET- Machine Learning: Survey, Types and ChallengesIRJET- Machine Learning: Survey, Types and Challenges
IRJET- Machine Learning: Survey, Types and ChallengesIRJET Journal
 
Empirical analysis of ensemble methods for the classification of robocalls in...
Empirical analysis of ensemble methods for the classification of robocalls in...Empirical analysis of ensemble methods for the classification of robocalls in...
Empirical analysis of ensemble methods for the classification of robocalls in...IJECEIAES
 
A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...IRJET Journal
 
A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...IRJET Journal
 
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET Journal
 
IRJET- Detection of Ranking Fraud in Mobile Applications
IRJET-  	  Detection of Ranking Fraud in Mobile ApplicationsIRJET-  	  Detection of Ranking Fraud in Mobile Applications
IRJET- Detection of Ranking Fraud in Mobile ApplicationsIRJET Journal
 
Classification of instagram fake users using supervised machine learning algo...
Classification of instagram fake users using supervised machine learning algo...Classification of instagram fake users using supervised machine learning algo...
Classification of instagram fake users using supervised machine learning algo...IJECEIAES
 
IRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic RegressionIRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic RegressionIRJET Journal
 
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET-  	  Improved Real-Time Twitter Sentiment Analysis using ML & Word2VecIRJET-  	  Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2VecIRJET Journal
 
Detecting Malicious Bots in Social Media Accounts Using Machine Learning Tech...
Detecting Malicious Bots in Social Media Accounts Using Machine Learning Tech...Detecting Malicious Bots in Social Media Accounts Using Machine Learning Tech...
Detecting Malicious Bots in Social Media Accounts Using Machine Learning Tech...IRJET Journal
 
Fake News Detection using Passive Aggressive and Naïve Bayes
Fake News Detection using Passive Aggressive and Naïve BayesFake News Detection using Passive Aggressive and Naïve Bayes
Fake News Detection using Passive Aggressive and Naïve BayesIRJET Journal
 
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)ieijjournal1
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdfStephenAmell4
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdfAnastasiaSteele10
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdfJamieDornan2
 
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET Journal
 
IRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using CobwebIRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using CobwebIRJET Journal
 
BITCOIN HEIST: RANSOMWARE ATTACKS PREDICTION USING DATA SCIENCE
BITCOIN HEIST: RANSOMWARE ATTACKS PREDICTION USING DATA SCIENCEBITCOIN HEIST: RANSOMWARE ATTACKS PREDICTION USING DATA SCIENCE
BITCOIN HEIST: RANSOMWARE ATTACKS PREDICTION USING DATA SCIENCEIRJET Journal
 

Similar to Detection of Fake Accounts in Instagram Using Machine Learning (20)

Fake accounts detection on social media using stack ensemble system
Fake accounts detection on social media using stack ensemble  systemFake accounts detection on social media using stack ensemble  system
Fake accounts detection on social media using stack ensemble system
 
IRJET- Machine Learning: Survey, Types and Challenges
IRJET- Machine Learning: Survey, Types and ChallengesIRJET- Machine Learning: Survey, Types and Challenges
IRJET- Machine Learning: Survey, Types and Challenges
 
Empirical analysis of ensemble methods for the classification of robocalls in...
Empirical analysis of ensemble methods for the classification of robocalls in...Empirical analysis of ensemble methods for the classification of robocalls in...
Empirical analysis of ensemble methods for the classification of robocalls in...
 
A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...
 
A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...A study of cyberbullying detection using Deep Learning and Machine Learning T...
A study of cyberbullying detection using Deep Learning and Machine Learning T...
 
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
 
IRJET- Detection of Ranking Fraud in Mobile Applications
IRJET-  	  Detection of Ranking Fraud in Mobile ApplicationsIRJET-  	  Detection of Ranking Fraud in Mobile Applications
IRJET- Detection of Ranking Fraud in Mobile Applications
 
Classification of instagram fake users using supervised machine learning algo...
Classification of instagram fake users using supervised machine learning algo...Classification of instagram fake users using supervised machine learning algo...
Classification of instagram fake users using supervised machine learning algo...
 
IRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic RegressionIRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic Regression
 
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET-  	  Improved Real-Time Twitter Sentiment Analysis using ML & Word2VecIRJET-  	  Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
 
Fake News Detection using Deep Learning
Fake News Detection using Deep LearningFake News Detection using Deep Learning
Fake News Detection using Deep Learning
 
Detecting Malicious Bots in Social Media Accounts Using Machine Learning Tech...
Detecting Malicious Bots in Social Media Accounts Using Machine Learning Tech...Detecting Malicious Bots in Social Media Accounts Using Machine Learning Tech...
Detecting Malicious Bots in Social Media Accounts Using Machine Learning Tech...
 
Fake News Detection using Passive Aggressive and Naïve Bayes
Fake News Detection using Passive Aggressive and Naïve BayesFake News Detection using Passive Aggressive and Naïve Bayes
Fake News Detection using Passive Aggressive and Naïve Bayes
 
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
CLASSIFIER SELECTION MODELS FOR INTRUSION DETECTION SYSTEM (IDS)
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
 
How to build machine learning apps.pdf
How to build machine learning apps.pdfHow to build machine learning apps.pdf
How to build machine learning apps.pdf
 
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
 
IRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using CobwebIRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using Cobweb
 
BITCOIN HEIST: RANSOMWARE ATTACKS PREDICTION USING DATA SCIENCE
BITCOIN HEIST: RANSOMWARE ATTACKS PREDICTION USING DATA SCIENCEBITCOIN HEIST: RANSOMWARE ATTACKS PREDICTION USING DATA SCIENCE
BITCOIN HEIST: RANSOMWARE ATTACKS PREDICTION USING DATA SCIENCE
 

More from AIRCC Publishing Corporation

Call for Papers - International Journal of Computer Science & Information Tec...
Call for Papers - International Journal of Computer Science & Information Tec...Call for Papers - International Journal of Computer Science & Information Tec...
Call for Papers - International Journal of Computer Science & Information Tec...AIRCC Publishing Corporation
 
Discover Cutting-Edge Research in Computer Science and Information Technology!
Discover Cutting-Edge Research in Computer Science and Information Technology!Discover Cutting-Edge Research in Computer Science and Information Technology!
Discover Cutting-Edge Research in Computer Science and Information Technology!AIRCC Publishing Corporation
 
Constraint-based and Fuzzy Logic Student Modeling for Arabic Grammar
Constraint-based and Fuzzy Logic Student Modeling for Arabic GrammarConstraint-based and Fuzzy Logic Student Modeling for Arabic Grammar
Constraint-based and Fuzzy Logic Student Modeling for Arabic GrammarAIRCC Publishing Corporation
 
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...AIRCC Publishing Corporation
 
Call for Articles - International Journal of Computer Science & Information T...
Call for Articles - International Journal of Computer Science & Information T...Call for Articles - International Journal of Computer Science & Information T...
Call for Articles - International Journal of Computer Science & Information T...AIRCC Publishing Corporation
 
Image Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkImage Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkAIRCC Publishing Corporation
 
International Journal of Computer Science & Information Technology (IJCSIT)
International Journal of Computer Science & Information Technology (IJCSIT)International Journal of Computer Science & Information Technology (IJCSIT)
International Journal of Computer Science & Information Technology (IJCSIT)AIRCC Publishing Corporation
 
Your Device May Know You Better Than You Know Yourself-Continuous Authenticat...
Your Device May Know You Better Than You Know Yourself-Continuous Authenticat...Your Device May Know You Better Than You Know Yourself-Continuous Authenticat...
Your Device May Know You Better Than You Know Yourself-Continuous Authenticat...AIRCC Publishing Corporation
 
A Comparative Study of Text Comprehension in IELTS Reading Exam using GPT-3
A Comparative Study of Text Comprehension in IELTS Reading Exam using GPT-3A Comparative Study of Text Comprehension in IELTS Reading Exam using GPT-3
A Comparative Study of Text Comprehension in IELTS Reading Exam using GPT-3AIRCC Publishing Corporation
 
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...AIRCC Publishing Corporation
 
Image Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkImage Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkAIRCC Publishing Corporation
 
Use of Neuronal Networks and Fuzzy Logic to Modelling the Foot Sizes
Use of Neuronal Networks and Fuzzy Logic to Modelling the Foot SizesUse of Neuronal Networks and Fuzzy Logic to Modelling the Foot Sizes
Use of Neuronal Networks and Fuzzy Logic to Modelling the Foot SizesAIRCC Publishing Corporation
 
Exploring the EV Charging Ecosystem and Performing an Experimental Assessment...
Exploring the EV Charging Ecosystem and Performing an Experimental Assessment...Exploring the EV Charging Ecosystem and Performing an Experimental Assessment...
Exploring the EV Charging Ecosystem and Performing an Experimental Assessment...AIRCC Publishing Corporation
 
Call for Papers - International Journal of Computer Science & Information Tec...
Call for Papers - International Journal of Computer Science & Information Tec...Call for Papers - International Journal of Computer Science & Information Tec...
Call for Papers - International Journal of Computer Science & Information Tec...AIRCC Publishing Corporation
 
Current Issue - February 2024, Volume 16, Number 1 - International Journal o...
Current Issue - February 2024, Volume 16, Number 1  - International Journal o...Current Issue - February 2024, Volume 16, Number 1  - International Journal o...
Current Issue - February 2024, Volume 16, Number 1 - International Journal o...AIRCC Publishing Corporation
 
Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...
Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...
Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...AIRCC Publishing Corporation
 
Call for Articles - International Journal of Computer Science & Information T...
Call for Articles - International Journal of Computer Science & Information T...Call for Articles - International Journal of Computer Science & Information T...
Call for Articles - International Journal of Computer Science & Information T...AIRCC Publishing Corporation
 
February 2024-: Top Read Articles in Computer Science & Information Technology
February 2024-: Top Read Articles in Computer Science & Information TechnologyFebruary 2024-: Top Read Articles in Computer Science & Information Technology
February 2024-: Top Read Articles in Computer Science & Information TechnologyAIRCC Publishing Corporation
 
Call for Articles- International Journal of Computer Science & Information T...
Call for Articles-  International Journal of Computer Science & Information T...Call for Articles-  International Journal of Computer Science & Information T...
Call for Articles- International Journal of Computer Science & Information T...AIRCC Publishing Corporation
 

More from AIRCC Publishing Corporation (20)

Call for Papers - International Journal of Computer Science & Information Tec...
Call for Papers - International Journal of Computer Science & Information Tec...Call for Papers - International Journal of Computer Science & Information Tec...
Call for Papers - International Journal of Computer Science & Information Tec...
 
The Smart Parking Management System - IJCSIT
The Smart Parking Management System  - IJCSITThe Smart Parking Management System  - IJCSIT
The Smart Parking Management System - IJCSIT
 
Discover Cutting-Edge Research in Computer Science and Information Technology!
Discover Cutting-Edge Research in Computer Science and Information Technology!Discover Cutting-Edge Research in Computer Science and Information Technology!
Discover Cutting-Edge Research in Computer Science and Information Technology!
 
Constraint-based and Fuzzy Logic Student Modeling for Arabic Grammar
Constraint-based and Fuzzy Logic Student Modeling for Arabic GrammarConstraint-based and Fuzzy Logic Student Modeling for Arabic Grammar
Constraint-based and Fuzzy Logic Student Modeling for Arabic Grammar
 
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
 
Call for Articles - International Journal of Computer Science & Information T...
Call for Articles - International Journal of Computer Science & Information T...Call for Articles - International Journal of Computer Science & Information T...
Call for Articles - International Journal of Computer Science & Information T...
 
Image Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkImage Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural Network
 
International Journal of Computer Science & Information Technology (IJCSIT)
International Journal of Computer Science & Information Technology (IJCSIT)International Journal of Computer Science & Information Technology (IJCSIT)
International Journal of Computer Science & Information Technology (IJCSIT)
 
Your Device May Know You Better Than You Know Yourself-Continuous Authenticat...
Your Device May Know You Better Than You Know Yourself-Continuous Authenticat...Your Device May Know You Better Than You Know Yourself-Continuous Authenticat...
Your Device May Know You Better Than You Know Yourself-Continuous Authenticat...
 
A Comparative Study of Text Comprehension in IELTS Reading Exam using GPT-3
A Comparative Study of Text Comprehension in IELTS Reading Exam using GPT-3A Comparative Study of Text Comprehension in IELTS Reading Exam using GPT-3
A Comparative Study of Text Comprehension in IELTS Reading Exam using GPT-3
 
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
From Clicks to Security: Investigating Continuous Authentication via Mouse Dy...
 
Image Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural NetworkImage Segmentation and Classification using Neural Network
Image Segmentation and Classification using Neural Network
 
Use of Neuronal Networks and Fuzzy Logic to Modelling the Foot Sizes
Use of Neuronal Networks and Fuzzy Logic to Modelling the Foot SizesUse of Neuronal Networks and Fuzzy Logic to Modelling the Foot Sizes
Use of Neuronal Networks and Fuzzy Logic to Modelling the Foot Sizes
 
Exploring the EV Charging Ecosystem and Performing an Experimental Assessment...
Exploring the EV Charging Ecosystem and Performing an Experimental Assessment...Exploring the EV Charging Ecosystem and Performing an Experimental Assessment...
Exploring the EV Charging Ecosystem and Performing an Experimental Assessment...
 
Call for Papers - International Journal of Computer Science & Information Tec...
Call for Papers - International Journal of Computer Science & Information Tec...Call for Papers - International Journal of Computer Science & Information Tec...
Call for Papers - International Journal of Computer Science & Information Tec...
 
Current Issue - February 2024, Volume 16, Number 1 - International Journal o...
Current Issue - February 2024, Volume 16, Number 1  - International Journal o...Current Issue - February 2024, Volume 16, Number 1  - International Journal o...
Current Issue - February 2024, Volume 16, Number 1 - International Journal o...
 
Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...
Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...
Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...
 
Call for Articles - International Journal of Computer Science & Information T...
Call for Articles - International Journal of Computer Science & Information T...Call for Articles - International Journal of Computer Science & Information T...
Call for Articles - International Journal of Computer Science & Information T...
 
February 2024-: Top Read Articles in Computer Science & Information Technology
February 2024-: Top Read Articles in Computer Science & Information TechnologyFebruary 2024-: Top Read Articles in Computer Science & Information Technology
February 2024-: Top Read Articles in Computer Science & Information Technology
 
Call for Articles- International Journal of Computer Science & Information T...
Call for Articles-  International Journal of Computer Science & Information T...Call for Articles-  International Journal of Computer Science & Information T...
Call for Articles- International Journal of Computer Science & Information T...
 

Recently uploaded

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 

Recently uploaded (20)

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 

Detection of Fake Accounts in Instagram Using Machine Learning

  • 1. International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019 DOI: 10.5121/ijcsit.2019.11507 83 DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING Ananya Dey1 , Hamsashree Reddy2 , Manjistha Dey3 and Niharika Sinha4 1 National Institute of Technology, Tiruchirappalli, India 2 PES University, Bangalore, India 3 RV College of Engineering, Bangalore, India 4 Manipal Institute of Technology, Karnataka, India ABSTRACT With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of information available, there has been an enormous increase in the rise of cyber-crimes, particularly targeted towards women. According to a 2019 report in the [4] Economics Times, India has witnessed a 457% rise in cybercrime in the five year span between 2011 and 2016. Most speculate that this is due to impact of social media such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social network, creation of user accounts in these sites usually needs just an email-id. A real life person can create multiple fake IDs and hence impostors can easily be made. Unlike the real world scenario where multiple rules and regulations are imposed to identify oneself in a unique manner (for example while issuing one’s passport or driver’s license), in the virtual world of social media, admission does not require any such checks. In this paper, we study the different accounts of Instagram, in particular and try to assess an account as fake or real using Machine Learning techniques namely Logistic Regression and Random Forest Algorithm. KEYWORDS Logistic Regression, Random Forest Algorithm, median imputation, Maximum likelihood estimation, k cross validation, overfitting, out of bag data, recall, identity theft, Angler phishing. 1. INTRODUCTION Instagram is an online photo and video sharing social networking platform that has been available on both Android and iOS since 2012. As of May 2019, there are over a billion users registered on Instagram. In the recent years, Instagram has been found to be using third party apps, called bots. While these can definitely impersonate a user and tarnish their reputation leading to ‘identity theft’, there has also been greater instances of malicious ways of promoting the brand image of a company known as “influencer marketing”. These days a number of businesses are using social media to heed to their customers’ needs which has led to yet another malpractice called Angler phishing. All these malpractices have made it vital to implement strong fraud detection techniques and hence we propose our solution. 2. LITERATURE SURVEY Previously a lot of work has been done on other platforms like Facebook and Twitter, but not much work has been done for Instagram. Each of these Social Medias are different in terms of features that have to be
  • 2. International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019 84 considered, strategies used, etc. Few of the past work include [1] Worth its Weight in Likes: Towards Detecting Fake Likes on Instagram: This particular paper concentrates on analysing likes and identifying the genuine ones to reduce the effect of fake likes on Instagram influencer market. They used a simple feed-forward neural network Multi-Layer Perceptron (MLP) which obtained a precision about 83%. [2] Identifying Fake Profiles in LinkedIn: A number of features were considered to train the dataset using neural networks, SVMs, and principal component analysis. The precision rate achieved was 84%. [3] Detection of Fake Profiles in Social Media: This is a literature review to detecting fake social media accounts classified into the approaches aimed on analysing individual accounts. So our 2 proposed method is a novel approach in terms of the platform chosen and the algorithms used like Random Forest for classification. Figure1: Proposed System 3. MATERIALS AND METHODS In this section we present the materials and methods used for the research work.
  • 3. International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019 85 Data set information: The dataset has been taken from https://www.kaggle.com/free4ever1/instagram-fake-spammer-genuine- accounts.It consists of two CSV files- train.csv (19 KB) and test.csv (4 KB) .The dependent variable, which is whether it is a fake or not fake account is categorical and it takes two values 0 (not fake) and 1 (fake) profile. The distribution of the training dataset is such that 50% is fake and the rest 50% is legitimate. Below is a table to denote the parameters that have been considered (denoted in column Profile feature), their range of values, each of their mean values and what each of the features denote.1. Collecting the IP addresses which will be used as the training dataset Table 1: Dataset Features Figure 2: Snapshot of Training Dataset Exploratory Data Analysis: This is a critical process of initial data investigation done so as to discover patterns in the dataset and spot anomalies with the help of summary statistics and graphical representation. Below are the various sub processes that were done. a. Missing Value Treatment The given dataset had no missing value. Missing values could occur in a dataset due to mostly real world problems and can be treated either through deletion or imputation. The presence of missing values reduces the data available to be analysed, compromising on the statistical power of the study, and eventually the reliability of its results. b. Outlier Detection Outliers are extreme values that deviate from the usual data values in the dataset. If outliers are present in the dataset, then the accuracy is reduced significantly as the training dataset learns from this noise in the data and could give an over-fit model. After careful analysis using graphs, we conclude that the following features had outliers present in them- nums/length username, full name words, description length, #posts, #followers and #follows. To deal with these outliers, we used median imputation by calculating the median of these set of values. Note that we do not include the outliers in the median calculation. Once we get the median, we replace all the outliers with the calculated median value.
  • 4. International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019 86 c. Bivariate Analysis This is done to understand the relationship between two variables and the strength of association between them. We calculated the correlation matrix and concluded absence of high multicollinearity between the variables. This is one of the assumptions before building a logistic regression model. Once data pre-processing is done, we can safely move into the algorithms. We have been provided with a labelled training dataset and can therefore proceed with applying supervised learning algorithms that map the input to the output .For the scope of this paper we have considered two commonly used classification algorithms, i.e, Logistic Regression and the Random Forest algorithm. Each of them has been explained in depth below. 1. Logistic Regression The assumptions of this model include absence of outliers in the dataset and absence of high correlations between the predictors, which have been taken care of in the preceding steps. In logistic regression, the probabilities predicted using the logit function. The values greater than or equal to the decision boundary belong to one class while the values lower than it belong to the other. We first run the GLM function in R to perform regression and find out the beta coefficients and p values for each of the features. The Beta coefficient, which is calculated based on maximum likelihood estimation, is an indicator of how strongly the predictor variable indicates the dependent variable. Based on the p values, we remove those variables whose 5 values are greater than 0.05 and re run the model. Finally we performed K cross validation to check overfitting Figure 3: Model Summery 2. Random forest Algorithm This is a regression model performing well on classification model. Since there are a very few assumptions attached to it, so data preparation is less challenging. We used the random Forest function in
  • 5. International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019 87 R and set the n-tree parameter (denoting the number of trees) as 500 and the number of variables tried at each split as 3. Random Record Selection is the first task in which each tree is trained on ⅔ of the total training data and some variables are selected at random(say m) out of all the variables and these m variables are used to split the node. For each tree, using the leftover (36.8%) data, the misclassification rate is calculated. This gives us the Out Of Bag (OOB) error rate. The forest chooses the classification having the most votes over all the trees in the forest. This is the RF score and the percent YES votes received is the predicted probability. 4. RESULTS AND ANALYSIS In this section we determine the results of the two classification models. After creating the models using the training datasets, we apply the models on unseen data, i.e., and the test dataset. We create the confusion matrix based on these and calculate various performance parameters as discussed below. 1. Logistic Regression Figure 4: Confusion matrix for Logistic Regression (T = True, F=False, P=Positive, N=Negative) We calculate the different model metrics as follows-  Precision- We divide the total number of correctly classified positive examples by the total number of predicted positive examples. Precision= 𝑇𝑃 / (𝑇𝑃+𝐹𝑃) = 57/ (57+8) = 87.6 %  Recall- The ratio of the total number of correctly classified positive examples divide to the total number of positive examples. Recall= 𝑇𝑃/((𝑇𝑃+𝐹𝑁) = 57/ (57+3) = 95 %  F1 score- It is the harmonic mean of recall and precision. F1 score= (2∗𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛∗𝑅𝑒𝑐𝑎𝑙𝑙)/ (𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛+𝑅𝑒𝑐𝑎𝑙𝑙) = 91.15  Accuracy- It is calculated using the following formula (𝑇𝑃+𝑇𝑁) /(𝑇𝑃+𝑇𝑁+𝐹𝑃+𝐹𝑁) = (57+52) / (57+52+8+3) = 90.8 % Based on this curve, we infer that using k=3 will give optimal results.
  • 6. International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019 88 Figure 5: ROC Curve for Logistic Regression 2. Random Forest Algorithm Figure 6: Confusion matrix for Logistic Regression (T = True, F=False, P=Positive, N=Negative) On a similar note, we calculate the various model metrics for Random forest algorithm.  Precision= 𝑇𝑃 / (𝑇𝑃+𝐹𝑃) = 55 / (55+4) = 93.2 %  Recall= 𝑇𝑃 / (𝑇𝑃+𝐹𝑁) = 55/(55+5) = 91.6 %  F1 score= (2∗𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛∗𝑅𝑒𝑐𝑎𝑙𝑙) / (𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛+𝑅𝑒𝑐𝑎𝑙𝑙) = 92.42  Accuracy= (𝑇𝑃+𝑇𝑁) / (𝑇𝑃+𝑇𝑁+𝐹𝑃+𝐹𝑁) = (55+56) / (55+56+4+5) = 92.5 %
  • 7. International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019 89 Figure 7: Variable Importance Graph 5. CONCLUSIONS While going through previous similar research conducted on detection of fake profiles on social media platforms, we realized that not a lot has been done on Instagram as a social network platform in particular. Hence, we targeted our approach for the same. In this paper, we introduced a novel approach for detecting fake user profiles on Instagram based on certain features using concepts of machine learning. We used two models for this- Logistic Regression and Random Forest algorithms, achieving an accuracy of 90.8% and 92.5% respectively. Such high accuracies have not been attained in previous work conducted for other social media platforms (highest accuracy achieved before this was 86%). REFERENCES [1] Indira Sen,Anupama Aggarwal,Shiven Mian.2018."Worth its Weight in Likes: Towards Detecting Fake Likes on Instagram". In ACM International Conference on Information and Knowledge Management. [2] Shalinda Adikari, Kaushik Dutta. 2014. “Identifying Fake Profiles In LinkedIn”. In Pacific Asia Conference on Information Systems. [3] Aleksei Romanov, Alexander Semenov, Oleksiy Mazhelis and Jari Veijalainen.2017. "Detection of Fake Profiles in Social Media”. In 13th International Conference on Web Information Systems and Technologies.
  • 8. International Journal of Computer Science & Information Technology (IJCSIT) Vol 11, No 5, October 2019 90 [4] https://telecom.economictimes.indiatimes.com/news/india-saw-457-rise-in-cybercrime-in-fiveyears- study/67455224 [5] Todor Mihaylov, Preslav Nakov.2016. "Hunting for Troll Comments in News Community Forums". In Association for Computational Linguistics. [6] Ml-cheatsheet.readthedocs.io. (2019). Logistic Regression — ML Cheatsheet documentation. [Online] Available at: https://ml cheatsheet.readthedocs.io/en/latest/logistic_regression.html#binarylogistic-regression [Accessed 10 Jun. 2019]. [7] 3. Schoonjans, F. (2019). ROC curve analysis with MedCalc. [Online] MedCalc. Available at: https://www.medcalc.org/manual/roc-curves.php [Accessed 10 Jun. 2019]. [8] Kietzmann, J.H., Hermkens, K., McCarthy, I.P., Silvestre,B.S., 2011. Social media? Get serious! Understanding the functional building blocks of social media. Bus.Horiz., SPECIAL ISSUE: SOCIAL MEDIA 54, 241251. doi:10.1016/j.bushor.2011.01.005. [9] Krombholz, K., Hobel, H., Huber, M., Weippl, E., 2015.Advanced Social Engineering Attacks. J Inf SecurAppl 22, 113–122. doi:10.1016/j.jisa.2014.09.005.