This document outlines the research methodology for a study on detecting fake profiles in online social networks. It discusses challenges in collecting data from social networks due to privacy and access restrictions. It proposes using an IMcrawler to extract user data from Facebook by scraping profiles. The research will then analyze user behavior and emotions based on collected text data. A fake profile detection model will be developed using profile and network features to identify suspicious connections on Facebook. Classification techniques will be evaluated for the model.
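As a rough illustration of how the classification techniques mentioned above might be scored, the sketch below computes precision, recall and F1 for a binary fake/genuine labelling. It assumes these metrics are among those used; the summary does not name them. The label vectors and the function names `confusion_counts` and `precision_recall_f1` are hypothetical and are not taken from the study.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["tp"] += 1
        elif pred == positive:
            counts["fp"] += 1
        elif truth == positive:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1); each is 0.0 when undefined."""
    c = confusion_counts(y_true, y_pred, positive)
    tp, fp, fn = c["tp"], c["fp"], c["fn"]
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = fake profile, 0 = genuine profile.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

if __name__ == "__main__":
    p, r, f = precision_recall_f1(y_true, y_pred)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Reporting precision and recall separately, rather than accuracy alone, is a common choice when fake profiles are much rarer than genuine ones, since a classifier that labels every profile as genuine would still score high accuracy.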
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...ijtsrd
Online Social Networks (OSNs) provide a variety of applications that let users network with family, friends and even strangers. One such application, the friend search engine, allows the general public to query an individual user's friend list and has been gaining popularity recently. Without proper design, this application may mistakenly disclose users' private relationship information. Existing work offers a privacy preservation solution that can effectively boost OSNs' sociability while protecting users' friendship privacy against attacks launched by individual malicious requestors. This project proposes an advanced collusion attack, in which a victim user's friendship privacy can be compromised through a series of carefully designed queries launched in coordination by multiple malicious requestors. The effect of the proposed collusion attack is validated on synthetic and real-world social network data sets. The study of advanced collusion attacks will help in designing a more robust and secure friend search engine on OSNs in the near future. R. Brintha | H. Parveen Bagum "Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Search Engine" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4 , June 2020, URL: https://www.ijtsrd.com/papers/ijtsrd31687.pdf Paper Url :https://www.ijtsrd.com/computer-science/world-wide-web/31687/retrieving-hidden-friends-a-collusion-privacy-attack-against-online-friend-search-engine/r-brintha
A4.1Proceedings of Student-Faculty Research Day, CSIS, Pa.docxjoyjonna282
A4.1
Proceedings of Student-Faculty Research Day, CSIS, Pace University, May 8th, 2009
Forensics Tools for Social Network Security Solutions
Janet Cheng, Jennifer Hoffman, Therese LaMarche, Ahmet Tavil, Amit Yavad, and Steve Kim
Seidenberg School of CSIS, Pace University, White Plains, NY 10606, USA
Abstract
The usage of Social Network Sites has increased
rapidly in recent years. Since the success of a Social
Network Site depends on the number of users it
attracts, there is pressure on providers of Social
Network sites to design systems that encourage
behavior which increases both the number of users
and their connections. However, like any fast-
growing technology, security has not been a high
priority in the development of Social Network Sites.
As a result, along with the benefits of Social Network
Sites, significant security risks have resulted.
Providing Social Network Site users with tools which
will help protect them is ideal. Tools are developed
for installation on a user’s computer to provide them
the ability to retrieve other online user information
via chat and social network websites. These tools will
also benefit law enforcement agents when crimes are
committed.
1. Introduction
This paper analyzes and extends the forensic tools
developed in an earlier study for protecting Social
Network Site users from security threats [14]. First,
we will identify the security issues found in Social
Network Sites. Second, we will demonstrate how our
tools can provide users with more information which
we hope will help prevent them from becoming
victims. Finally, if a crime has been committed, we
will detail the tools available to assist in
apprehending the perpetrator.
The tools we developed retrieve a Social Network Site
user's non-personally-identifiable information, such as
IP address, operating system, browser type, etc.
This information is retrieved upon virtual
contact from the other person, whether they simply
browse our personal page or contact us
via a Virtual Meeting, for example by chatting.
This paper covers methodologies used, test results,
and future goals.
The Social Network Site security issues are: [4]
Corporate Espionage; Cross Site Scripting, Viruses &
Worms; Social Network Site Aggregators; Spear
Phishing & Social Network specific Phishing;
Infiltration of Networks Leading to data leakage; I.D.
Theft; Bullying; Digital Dossier Aggregation
Vulnerabilities; Secondary Data Collection
Vulnerabilities; Face Recognition Vulnerabilities;
CBIR (Content-based Image Retrieval); Difficulty of
Complete Account Deletion; Spam; and Stalking.
2. Case Studies
There are many criminal activities arising from the
use of social network sites. For example, a mother
was convicted of computer fraud for her involvement
in creating a phony account on MySpace to trick a
teenager, who later committed suicide [15]. The
tools found in thi ...
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
International Journal of Engineering Research and DevelopmentIJERD Editor
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
Personal metadata and the opportunities and challenges of working with social networking sites, presentation by N. Osborne, SUNCAT Assistant Project Officer, given at CIGS Web2.0 metadata and issues seminar, Fri 30 Jan, 2009.
Cataloguing Your Friends and Neighbours: Personal Metadata and the Opportunit...Nicola Osborne
Presentation given by Nicola Osborne at the CIGS (Cataloguing and Indexing Group Scotland) Web 2.0 Seminar 2009, held at the National Library of Scotland on Friday 30th January 2009
A COMPREHENSIVE STUDY ON DATA EXTRACTION IN SINA WEIBOijaia
With the rapid growth of users in social networking services, data is generated in thousands of terabytes
every day. Practical frameworks for data extraction from social networking sites have not been well
investigated yet. In this paper, a methodology for data extraction with respect to Sina Weibo is discussed.
In order to design a proper method for data extraction, the properties of complex networks and the
challenges when extracting data from complex networks are discussed first. Then, the reason for choosing
Sina Weibo as the data source is given. After that, the methods for data gathering are introduced and the
techniques for data sampling and data clean-up are discussed. Over 1 million users and hundreds of
millions of social relations between them were extracted from Sina Weibo using the methods proposed in
this paper.
Social media websites are becoming more prevalent on the Internet. Users spend significantly more of their time online on sites such as Twitter, Facebook, and Instagram. People on social media share thoughts, views, and facts and make new acquaintances. Social media sites supply users with a great deal of useful information. This enormous quantity of social media information invites hackers to abuse data. These hackers establish fraudulent profiles of actual people and distribute useless material. Spam material might include commercials and harmful URLs that disrupt genuine users. This spam content is a massive problem in social networks. Spam identification is a vital procedure on social media networking platforms. In this paper, we have proposed an artificial intelligence technique for spam detection on the Twitter social network. In this approach, we employed a support vector machine (SVM), an artificial neural network (ANN), and a random forest (RF) technique to build a model. The results indicate that, compared with the RF and ANN algorithms, the suggested support vector machine algorithm has the greatest precision, recall, and F-measure. The findings of this paper would be useful in monitoring and tracking photos shared on social media for the identification of inappropriate content and forged images, and to safeguard social media from digital threats and attacks.
Authorization mechanism for multiparty data sharing in social networkeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
A Survey of Methods for Spotting Spammers on Twitterijtsrd
The explosive expansion of social networking sites as a means of information sharing, communication, storage, and management has attracted hackers who abuse the Web to take advantage of security flaws for their own nefarious ends. Every day, forged internet accounts are compromised. Online social networks (OSNs) are rife with impersonators, phishers, scammers, and spammers who are difficult to spot. Users who send unsolicited communications to a large audience with the objective of advertising a product, enticing victims to click on harmful links, or infecting users' systems purely for financial gain are known as spammers. Many studies have been conducted to identify spam profiles in OSNs. In this essay, we have discussed the methods currently in use to identify spam Twitter users. User-based features, content-based features, or a combination of both could be used to identify spammers. The current paper gives a summary of the traits, methodologies, detection rates, and restrictions, if any, for identifying spam profiles, primarily on Twitter. Hareesha Devi | Pankaj Verma | Ankit Dhiman "A Survey of Methods for Spotting Spammers on Twitter" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-7 | Issue-3 , June 2023, URL: https://www.ijtsrd.com.com/papers/ijtsrd57439.pdf Paper URL: https://www.ijtsrd.com.com/computer-science/artificial-intelligence/57439/a-survey-of-methods-for-spotting-spammers-on-twitter/hareesha-devi
2010 Catalyst Conference - Trends in Social Network AnalysisMarc Smith
Review of trends related to social network analysis in the enterprise. Presented at the 2010 Catalyst Conference in San Diego, CA, July 29, 2010. Presented with Mike Gotta, Gartner Group.
Identification of inference attacks on private Information from Social Networkseditorjournal
Online social networks, like
Facebook and Twitter, are increasingly utilized by
many people. These networks permit users to
publish details about themselves and to connect to
their friends. Some of the details revealed
inside these networks are meant to be
kept private. Yet it is possible to use
learning algorithms on released
data to predict private information,
which enables inference attacks. This paper
explores how to launch inference attacks
using released social networking details to
predict private information. It then
presents three possible sanitization
algorithms that could be used in various
situations. Then, it investigates the
effectiveness of these techniques and tries to
use collective inference
methods to determine sensitive attributes
of the user data set. It shows that the
effectiveness of both the local and
relational classification algorithms can be
reduced by using the sanitization methods described.
Researching Social Media – Big Data and Social Media AnalysisFarida Vis
Researching Social Media – Big Data and Social Media Analysis, presentation for the Social Media for Researchers: A Sheffield Universities Social Media Symposium, 23 September 2014
An iac approach for detecting profile cloningIJNSA Journal
Nowadays, Online Social Networks (OSNs) are popular websites on the internet, on which millions of users
register and share their personal information with others. Privacy threats and disclosure of personal
information are the most important concerns of OSN users. Recently, a new attack named the
Identity Cloned Attack has been detected on OSNs. In this attack the attacker tries to create a fake identity of a real
user in order to access private information of the user's friends which they do not publish on their public
profiles. In today's OSNs, there are some verification services, but they are not active services and they are
useful only for users who are familiar with online identity issues. In this paper, Identity Cloned Attacks are
explained in more detail and a new and precise method to detect profile cloning in online social networks
is proposed. In this method, first, the social network is represented as a graph; then, according to
similarities among users, this graph is divided into smaller communities. Afterwards, all of the profiles
similar to the real profile are gathered (from the same community), then the strength of relationship (between
each selected profile and the real profile) is calculated, and those which have the lower strength of
relationship are verified by a mutual friend system. In this study, in order to evaluate the effectiveness of
the proposed method, all steps are applied on a Facebook dataset, and finally this work is compared with
two previous works by applying them on the same dataset.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
2. 2
Table of Contents
Abstract
LIST OF FIGURES
LIST OF ABBREVIATIONS
1. Background and related research
2. Problem Statement
3. Research Questions
4. Research Aims and Objectives
5. Significance of the study
6. Scope of study
7. Research Methodology
7.1 Introduction
7.2 Dataset Description
7.3 Data Pre-processing
7.4 Model Development
7.5 Evaluation Metrics
8. Resource Requirements
9. Research Plan
10. References
ABSTRACT
An online social network (OSN) is an online platform that people use to build social, personal
or professional relationships with other OSN users who share common interests, habits,
histories and real-life links. In graph-theoretic terms, an OSN is regarded as a set of nodes
(persons, actors, organisations, etc.) linked by a set of edges (relationships, interactions,
distances, etc.). OSNs have changed the way people think, express themselves and socialize
with the outside world. With Web 2.0 technology, several OSNs, such as Facebook, Twitter,
LinkedIn, Instagram and ResearchGate, have been developed with a variety of functionalities.
In this study we try to bring everything relating to online fake profiles together in one place by
introducing different types of fake profiles (comprehensive profiles, cloned profiles and
online bots) on a variety of OSN websites, along with a variety of features that
differentiate between fake and real individuals. The problem of data access is also
addressed by presenting much-needed approaches for data collection and some
existing data sources. In addition, several machine learning approaches for designing fake
profile detection systems are explored. After a rigorous literature review, we proposed to gather
data from user accounts on the Facebook social network using a crawler based on iMacros
technology. We performed behaviour and emotion analysis on the collected data and
observed how people share their thoughts on social media. Based on the four profile
features ( , , ho ) along with a
network feature ( ), we introduced an approach to
detecting questionable (negative) connections that adversaries have generated, utilizing the
network of mutual friends of a Facebook profile. The three classification techniques (
) have shown better results on the test dataset.
LIST OF FIGURES
1.1 Types of fake profiles in online social networks
1.2 Intra site or same site Profile Cloning
1.3 Inter site or Cross site Profile Cloning
1.4 Categories of Bots in OSNs
1.5 Pictorial representation of an OSN Botnet
1.6 Dependency diagram of all the chapters of Thesis
2.1 Data Collection Techniques
2.2 (a) Data extraction using API
2.2 (b) A typical OSN data crawler
2.3 Pictorial representation of Data extraction program
3.1 Parallel Data Collection Approach
3.2 Algorithm for implementation of IMcrawler
3.3 Data Collection Framework
4.1 Proportion of personal attributes revealed by the users and participation of gender vector (male, female, NA)
4.2 Proportion of each personal attribute revealed as a function of gender (male and female)
4.3 Pattern of proportion of males and females providing personal details at different levels of information revealed scale
4.4 Correlation among Personal Attributes
1. BACKGROUND
Given the vast volume of online usage data, extracting it is a major challenge for researchers.
One study gives a thorough discussion of network data extraction techniques together with
their application domains. In general, the two common ways to retrieve data from OSNs are
APIs and HTML scraping. While APIs provide well-organized data, they come with
many restrictions. HTML scraping offers an alternative that can
overcome API constraints at the expense of technical difficulties. One paper proposed a
semantic framework for the collection and analysis of social network data through APIs, using
the open access resources offered by the family Doors. The authors evaluated the framework
on many well-known OSNs such as YouTube, Flickr and LiveJournal. To
extract the necessary details, the study used the APIs of these OSNs together with
HTML scraping. Data extraction schemes depend heavily on the policies of each online social
network. Other authors describe a process for retrieving personal
attributes and the list of top friends from the MySpace social network without becoming a
member. MySpace exposes a wealth of data to non-members, while networks such as Reddit,
Friendster and others do not reveal any information or content to external users. Various
studies have been performed specifically on the Facebook network, as it is one of the most
popular online networking websites and the most challenging in terms of privacy. Netvizz is a
Facebook application developed to help scientists gather data on personal networks,
groups and pages. Like every other API-based tool, though, Netvizz is restricted in
practice by the Facebook service's authorization and privacy model. First, the user must be
logged in to a Facebook account. Second, the user must explicitly grant access to
the various data. Third, the user can further limit the data provided to the app through
their privacy settings. Other authors used PhantomJS, a headless browser, to build
an HTML-based crawler that extracts the Facebook friend networks of users from a certain
area in Macao. They also discussed the technical difficulties of building an OSN data
extractor and feasible ways around them.
RELATED WORK
OSNs have changed the way people think, express themselves and socialize with the outside
world. A large range of platforms now supports social and professional networking, such as
Facebook, Twitter, Flickr, LinkedIn and ResearchGate. Because OSNs mirror real-life societies
and hold a vast amount of user content, they are highly relevant to researchers in
numerous disciplines such as marketing, sociology and politics. Marketing firms study OSNs
in order to devise viral marketing campaigns that reach their target customers; sociologists use
them to study human behaviour. The massive content held on these OSNs about users' social,
personal and professional lives has attracted not only scientists
but also cyber criminals. These cyber criminals infiltrate OSNs by creating fake profiles or by
launching a variety of identity theft attacks on existing users to steal their passwords,
such as cloning attacks, detecting attacks etc. Expert attackers in particular
often design various kinds of bots to control fake profiles without much human effort.
A growing number of attackers create forged identities on networks like Facebook and Twitter
in order to obtain people's social and personal details, to promote a specific
brand or individual, to defame a user, and so on. Adversaries may target professional
networks such as LinkedIn and ResearchGate to monitor members' activities or to gain the
trust of business and professional users in order to obtain personal information. They often
seek to establish professional, romantic or sexual relationships, or to receive financial
rewards, gifts or personal details. An overview of the numerous security and privacy
threats facing OSN users, with guidelines for protecting users both online and in the real
world, is available. Fake profiles are not necessarily harmful; users often create additional
profiles for fun and amusement, to connect with a particular group of friends, etc. Even so,
they are deemed illegitimate since they breach the terms and conditions of the service.
In this regard, OSN rules and regulations typically state that:
A user may own no more than one personal account.
No unauthorized or malicious material may be disseminated.
User details may not be collected, nor the network navigated, by automated means such as
bots and spiders.
According to Facebook, a fake account is one that a person maintains in addition to their
principal account. On popular social networks such as Facebook and Twitter, there are millions
of fake profiles, particularly in the markets of China and India. Providers of social networking
sites use a variety of measures to protect their users.
Researchers have proposed a number of approaches to mitigate fraudulent identities on
OSNs, but adversaries keep altering their behaviour and strategies to deceive
and evade these detection systems. In order to curtail unlawful and discriminatory
activities on OSNs, more advanced fake profile detection systems are needed. This section
provides a rigorous literature survey investigating the behaviour of fraudulent user accounts
on OSNs. The chapter continues with an analysis of the different features used by
researchers to identify and mitigate fake profiles on numerous websites. Next, it discusses
various machine learning approaches that can help to build effective fake profile detection
systems. Finally, the challenge of data unavailability is addressed by
presenting much-needed data collection strategies and existing data sources.
2. PROBLEM STATEMENT
Online social networks are an ideal forum for communicating, connecting and exchanging
web-based information. OSNs can be grouped into several categories according to the
functionality they offer their members, from applications that let people share their
information, news and ideas to applications for creating and maintaining social ties.
Today, people commonly use OSNs for their social activities. As a consequence, a vast
volume of information about users' social, personal and
professional lives is stored on these OSNs. Although these platforms have enriched people's
social lives, there are many problems with their use, and one of them is the proliferation of
fake accounts. The user data available on OSNs attract cyber criminals as well as
academics and social analysts. These cyber criminals exploit the openness and
vulnerabilities of an OSN through fake accounts and carry out illegal, deceptive and
disruptive acts, like identity theft, defamation, trolling, bullying and spamming. Fake profiles
are the preferred means for malicious users of social networks to spam, defraud or otherwise
abuse the system. Fake profile users have taken root in the top social networking sites to
perform illicit activities. According to a report [1], Facebook, the most popular social
networking site, identified and removed more than 580 million fake profiles from the
network in 2018, and more than 87 million fake profiles still exist on the platform. According
to another report, Twitter suspended more than 70 million fake accounts in 2018, and there
are more than 45 million fake accounts, which constitute more than 15% of total monthly
active users on the platform. The existence of fake profiles is one of the prominent problems
of this cyber age. Cyber intelligence agencies are struggling to curb these profiles, which
use OSNs to commit serious crimes every day. According to a news report by NDTV, a team of
Iranian hackers created around 14 false personas on various OSNs, including Facebook, to
stalk military and political figures in the United States. They were able to fool
around 2,000 users on the network by establishing friend connections with them. The hackers
initially sent non-malicious content to the victims in order to build trust
and afterwards used the fake accounts to send links that infected the victims' PCs with
malicious software. According to another report, Facebook detected and purged 32 fake
accounts that were engaged in a false political influence campaign. These accounts were
reported to have been created between March 2017 and May 2018. NBC News has reported
many cases where online identities were stolen to create fake profiles. For example,
Atlanta City Councilman Alex Wan discovered that his photo was being used by multiple fake
accounts to attract women. Scientists have proposed and implemented a number of
techniques to identify, combat and mitigate these fake profiles on OSNs, but attackers find
different ways to evade these systems and continue to deceive the network. Hence, to
eradicate the problem of fake profiles on OSNs, an efficient fake profile detection system is
needed.
3. RESEARCH QUESTIONS
How can the unavailability of ground-truth data and of efficient tools to harvest data be addressed?
What is the optimal feature set for fake profile detection?
4. RESEARCH AIMS AND OBJECTIVES
Design of IMcrawler for extracting data from OSNs in a convenient and efficient manner.
Identification of the optimal feature set for the detection of fake profiles in OSNs.
Design of a network- and profile-based suspicious link detection model for the Facebook social network.
Design of a fake profile detection model for the Facebook social network.
5. SIGNIFICANCE OF STUDY
OSNs have changed the way people think, express themselves and socialize with the outside
world. Platforms such as Facebook, Twitter, Flickr, LinkedIn and ResearchGate are now used
by people for their social and professional activities. Because OSNs mirror real-life societies
and hold a vast amount of user content, they are highly relevant to researchers in
numerous disciplines such as marketing, sociology and politics. However, not just the general
population, scholars and organisations, but also cyber criminals have been drawn to the huge
variety of these OSNs and their widespread popularity. On social networks such as Facebook,
Twitter and LinkedIn, cyber criminals create forged identities, which may be used for illegal
practices such as sending spam messages to users, casting biased votes, spreading rumours,
etc. Fake accounts take several forms, and their roles generally differ with the type of
network. Although researchers have developed many methods to detect fake profiles
on OSNs, the proliferation of these profiles remains one of the most prevalent problems of
the present cyber era. There
are some inherent challenges in designing an efficient fake profile detection system,
such as the unavailability of ground-truth data, the lack of a suitable approach for collecting
data from OSNs because of security and privacy concerns, the sampling of fake profiles from
the mass of real profiles for model training, and the identification of a robust feature set.
Apart from these issues, there is very little literature that brings everything related to fake
profiles together in a single place.
In order to overcome such issues and challenges, this thesis presents a rigorous survey of fake
profiles in OSNs. This thesis also presents a number of suitable approaches for harvesting the
user data from OSNs followed by behavioral and emotion analysis of the users. Furthermore,
the thesis proposes an efficient fake profile detection model based on a novel feature set.
6. SCOPE OF STUDY
The IMcrawler is developed exclusively for Facebook. However, researchers can easily extend
the IMcrawler to meet their data extraction requirements for other OSNs. OSN service
providers may use the proposed suspicious link detection model to alert their members with a
list of suspicious connections (links) from their friend lists, so that members can check the
flagged links themselves and filter their friend lists as required. In
addition, researchers may use the proposed methodology to design powerful fake profile
detection systems that help OSN users and service providers recognize suspicious
contacts on the network. The proposed method can also be used to classify a user's weaker and
stronger ties. In this report, however, we have used the approach to recognise suspicious
connections among Facebook users, which allows researchers to build effective fake profile
detection systems. More than 800 Facebook connections have already been gathered to
build the proposed classifier. One potential extension would be to further
expand the profile count within the dataset and to make it freely available to other
researchers for their analysis. The emotion analysis could be extended to understand the
effects of conflict on the mental health of people in conflict zones and to inform the provision
of mental health resources. The emotion analysis was performed only on the text content of
user posts. The research may, however, be generalized to
study emotions in the photographs and videos that users share. In addition, sarcasm
and emoticon analysis may be applied to evaluate the emotions of posts.
The research was performed on a Facebook dataset. It may, however, be
applied to other social networking sites to detect fake identities. One piece of future work
is to extend it to social networks such as Twitter and LinkedIn.
7. RESEARCH METHODOLOGY
The first prerequisite for any kind of study is a dataset large enough to support learning.
OSNs like Facebook hold billions of user accounts, and service providers
keep their data secured, which makes it extremely difficult for researchers to gather
the data. Due to privacy concerns, OSN data are not open to the public, and because the data
are massive, manual collection is complicated and time-consuming. The more popular
social networking platforms such as Facebook, Twitter and Flickr do provide access to
network data through their own well-specified APIs, such as the Graph API and REST
APIs; however, these APIs come with unavoidable restrictions such as request rate
limits and selective data access. Web scraping presents an alternative approach
in which content is automatically extracted from web pages. Although it overcomes the data
collection problem to a great degree, writing a scraper poses several problems, including:
Social networking sites like Facebook usually have a bot detection mechanism
built into their systems that can identify automated activity, meaning that
software-based data collection can get the account used for data
collection suspended.
Dynamic content loading through web technologies such as Ajax
and JavaScript complicates the job further, since such content is not included in a page's
source code. Moreover, user interactions with the page typically trigger the loading of
dynamic content, which means that a mechanism must be in place to
automate these interactions so that the dynamic content is loaded into the parent HTML.
Therefore, a tool is needed that can bypass the API restrictions and overcome the barriers to
data scraping. This chapter covers the design and implementation of IMcrawler, a data
crawler for the Facebook network, which solves the problems described above and allows end
users to collect data easily and conveniently. Facebook has one of the most complex systems of
privacy rules and is one of the most widely used social networking sites. Its API can only
be used to retrieve data from users who have already registered with the application. Unlike
the Twitter API, the Facebook API requires users to explicitly authorize access to their data.
The data that can be extracted from their accounts are determined by the
users' privacy settings and the permissions granted to the application. The entire data
collection system is also described step by step, followed by a crawl of the network that
stores the data in a usable format.
7.1 INTRODUCTION
Online social networks are an ideal forum for communicating, connecting and exchanging
web-based information. OSNs can be grouped into several categories according to the
functionality they offer their members, from applications that let people share their
information, news and ideas to applications for creating and maintaining social ties.
Today, people commonly use OSNs for their social activities. As a consequence, a vast
volume of information about users' social, personal and
professional lives is stored on these OSNs. Although these platforms have enriched people's
social lives, there are many problems with their use, and one of them is the proliferation of
fake accounts. The user data available on OSNs attract cyber criminals as well as
academics and social analysts. These cyber criminals exploit the openness and
vulnerabilities of an OSN through fake accounts and carry out illegal, deceptive and
disruptive acts, like identity theft, defamation, trolling, bullying and spamming. Fake profiles
are the preferred means for malicious users of social networks to spam, defraud or otherwise
abuse the system. Fake profile users have taken root in the top
social networking sites to perform illicit activities. According to a report [1], Facebook, the
most popular social networking site, identified and removed more than 580 million fake
profiles from the network in 2018, and more than 87 million fake profiles still exist
on the platform. According to another report, Twitter suspended more than 70 million fake
accounts in 2018 [3], and there are more than 45 million fake accounts, which constitute more
than 15% of total monthly active users on the platform.
Scientists have proposed and implemented a number of techniques to identify, combat and
mitigate these fake profiles on OSNs, but attackers find different ways to evade
these systems and continue to deceive the network. Hence, to eradicate the problem of fake
profiles on OSNs, an efficient fake profile detection system is needed.
7.2 DATASET DESCRIPTION
Datasets extracted using IMcrawler
This section briefly describes all the datasets extracted
with the help of the proposed data crawler. Four different datasets (Dataset_1, Dataset_2,
Dataset_3 and Dataset_4) have been created for different studies. Each dataset is described
below:
Dataset_1- (user_basic_info_and_wall_activity Dataset):
Dataset_1 covers two parts of a user profile, viz., profile information and wall activity.
The profile information holds the profile-based attributes Gender, Friend_Count,
Relationship_Status, Family_Members, Interested_In, Languages, Hometown, Birthday,
Phone_No., Address, Email_Id, Political_Views, Religious_Views, Social_links and
Website_Address. Wall activity contains the post-related features of a user:
owner, user, post_title, post_content, post_reactions, post_views, post_data_time.
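To make the two-part layout of Dataset_1 concrete, the record structure can be sketched as a pair of Python dataclasses. This is only an illustration: the class names `WallPost` and `UserRecord`, the field types, and the choice of which profile attributes to include are assumptions, not the crawler's actual output format. Field names follow the attribute list above, including its spelling of `post_data_time`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WallPost:
    """One wall-activity record; the types are assumed, the names come from the text."""
    owner: str
    user: str
    post_title: str
    post_content: str
    post_reactions: int = 0
    post_views: int = 0
    post_data_time: str = ""  # spelling kept from the attribute list


@dataclass
class UserRecord:
    """A subset of the profile attributes plus the wall activity of one user."""
    gender: Optional[str] = None        # None models an attribute the user has hidden
    friend_count: Optional[int] = None
    hometown: Optional[str] = None
    languages: List[str] = field(default_factory=list)
    wall: List[WallPost] = field(default_factory=list)

    def revealed_attributes(self) -> List[str]:
        """Names of the scalar profile attributes that are visible for this user."""
        scalar = {
            "gender": self.gender,
            "friend_count": self.friend_count,
            "hometown": self.hometown,
        }
        return [name for name, value in scalar.items() if value is not None]
```

Modelling hidden attributes as `None` keeps missing values explicit, which matters later when profiles with hidden details are filtered out during pre-processing.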
Dataset_2- (user_post_info Dataset):
Dataset_2 contains Facebook users' post attributes: user id, post id, post
content and home town. The Facebook profiles of two authors have been used as root nodes
to gather the necessary data. The first person is from Delhi, while the other is from Kashmir,
India. Both profiles play an important role in collecting user details from two regions, as
friend lists often consist of friends from the same region. An average of 30 posts is
collected from each user profile with the proposed data crawler.
Dataset_3- (user_mcc_and_profile_info Dataset):
Dataset_3 has been extracted from a user community on the Facebook network. From each
user profile in the community, four features are recorded: Work, Education, Home Town and
Current City.
Dataset_4- (user_post_emotion Dataset):
Dataset_4 includes the user id, post id, post content and label attributes. Two real and two
honeypot (fake) accounts serve as roots (seed nodes) for harvesting real and fake user data
from their friend lists. Dataset_4 contains more than 1,200 Facebook users with over 60k
posts, with more than 600 users in each user category.
7.3 DATA PRE-PROCESSING
The collected data are often raw and may contain missing values. Missing values are
common in social network data because people registered with social networks like Facebook
can hide details from other users, from friends and from the rest of the world. Profiles that are
not publicly accessible, or not accessible to friends of friends, are not considered. Before user
similarity is estimated, the derived user features are pre-processed with the Python
Natural Language Toolkit (NLTK) library, as shown in Algorithm 1. The steps are case
normalization (upper- and lower-case forms of a string are mapped to the same form),
stop-word removal (stop words such as "the", "a", "an" and "in" are dropped),
tokenization and stemming. Stemming reduces all variants of a term with the
same meaning to a root word, which makes the final similarity computation
easier. To simplify the calculation of a similarity score between two
profiles under different similarity measures,
a dictionary of terms was built on the basis of v. The goal of applying the
various text analytics here is to generate the derived data for multiple similarity measures.
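The pre-processing steps above can be sketched in a few lines of Python. This is a minimal illustration rather than Algorithm 1 itself: it uses only the standard library, the stop-word list is a short illustrative one, and `naive_stem` is a crude suffix stripper standing in for a real stemmer such as NLTK's `PorterStemmer`. The Jaccard coefficient is assumed here as one example of a similarity measure; the text does not say which measures were used.

```python
import re
from typing import List, Set

# Short illustrative stop-word list; NLTK's English list is much longer.
STOP_WORDS: Set[str] = {"the", "a", "an", "in", "of", "at", "and"}

# Naive suffix list standing in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_stem(token: str) -> str:
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> Set[str]:
    """Case-normalize, tokenize, drop stop words and stem a profile field."""
    return {naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two term sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

With this sketch, `preprocess("Studied at the University of Delhi")` and `preprocess("Studies in Delhi University")` reduce to the same term set, so their Jaccard score is 1.0 even though the raw strings differ in case, word order and inflection.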
7.4 MODEL DEVELOPMENT
So far, we have seen the various kinds of features used to identify fake and malicious
identities in OSNs. In online social network research, obtaining the
necessary dataset (one specific to fake accounts, for example) is seen as a major challenge.
Researchers have used different methods for collecting data from OSN sites. One study
gives a thorough discussion of network data extraction techniques together with their
application domains. In this section, we describe several methods for collecting the necessary
data from social network profiles. Data extraction with APIs supported by service providers,
the creation of a standalone crawler application, artificial data generation with available
resources, and the use of existing datasets are the most common approaches to collecting the
data needed. All four approaches are briefly explained below:
Data Collection using APIs
Data collection through APIs is currently the primary method in social network studies and is
strongly recommended. In general, OSN service providers support developers and regular
users in performing various data extraction operations through several libraries (packages).
Most researchers write their own code that communicates with the social network through an
API to collect problem-specific data.
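Because OSN APIs impose request rate limits, collection code of this kind usually paces its requests. The following is a minimal sketch under stated assumptions: `fetch_page` is a hypothetical callable supplied by the caller (for example, a wrapper around an HTTP client), and the response layout with `"data"` and `"next"` keys is invented for illustration and does not describe any specific OSN API.

```python
import time
from typing import Callable, Dict, List, Optional


class MinIntervalLimiter:
    """Allow at most one call every `min_interval` seconds."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least `min_interval` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def collect_all(fetch_page: Callable[[Optional[str]], Dict],
                limiter: MinIntervalLimiter) -> List[dict]:
    """Follow a cursor-paginated endpoint until no next cursor is returned."""
    items: List[dict] = []
    cursor: Optional[str] = None
    while True:
        limiter.wait()                # pace every request, including the first page
        page = fetch_page(cursor)     # hypothetical: returns {"data": [...], "next": cursor or None}
        items.extend(page["data"])
        cursor = page.get("next")
        if cursor is None:
            return items
```

Passing the clock and sleep functions in as parameters is a design choice for testability; in normal use the defaults (`time.monotonic` and `time.sleep`) apply.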
Bot-Based (Crawler) Approach
The bot-based approach involves building a standalone data crawler that collects data
from the social network. As in the API-based approach, information about users is
gathered, but the crawler uses no API to communicate with the social network; it
interacts with the social network directly. Data extraction programs can be written in various
programming languages such as JavaScript, Python and PHP. Any extraction program,
however, needs a number of seed profiles, usually chosen according to certain criteria, such as
a large number of friends, profile location, etc. The program uses the seed profiles to traverse
the network and retrieve data. Breadth-first search (BFS) is a traversal commonly used
by social networking scholars. It begins at the target profile (seed node)
and first explores all of its neighbouring nodes before moving on to the next level. When a
depth-first search (DFS) crawler is used, the attributes of one neighbour profile are extracted
first, down to a certain depth, instead of collecting all the neighbours of the target profile.
A data crawler can normally be assumed to need three things. First, a source file with the
URLs of the target profiles (seed profiles). Second, the data fields to be
extracted from user accounts. Third, a file in which the collected data are stored. The
following figure shows the data extraction program in principle.
Pictorial representation of Data extraction program
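The difference between the two traversal orders described in this subsection can be illustrated on a small in-memory friend graph. This is a sketch only: the graph is a plain dictionary of friend lists rather than live profile pages, and the function names `bfs_crawl` and `dfs_crawl` and the `max_depth` parameter are assumptions made for the example, not part of IMcrawler.

```python
from collections import deque
from typing import Dict, List, Set

FriendGraph = Dict[str, List[str]]


def bfs_crawl(graph: FriendGraph, seed: str, max_depth: int) -> List[str]:
    """Visit profiles level by level, starting from the seed profile."""
    visited: Set[str] = {seed}
    order: List[str] = [seed]
    queue = deque([(seed, 0)])
    while queue:
        profile, depth = queue.popleft()
        if depth == max_depth:
            continue                      # do not expand beyond the depth limit
        for friend in graph.get(profile, []):
            if friend not in visited:
                visited.add(friend)
                order.append(friend)
                queue.append((friend, depth + 1))
    return order


def dfs_crawl(graph: FriendGraph, seed: str, max_depth: int) -> List[str]:
    """Follow one chain of friends down to max_depth before backtracking."""
    visited: Set[str] = set()
    order: List[str] = []

    def visit(profile: str, depth: int) -> None:
        visited.add(profile)
        order.append(profile)
        if depth == max_depth:
            return
        for friend in graph.get(profile, []):
            if friend not in visited:
                visit(friend, depth + 1)

    visit(seed, 0)
    return order
```

On a graph where the seed has friends `a` and `b`, `a` has a further friend `c`, and `b` has a further friend `d`, the BFS crawl with depth 2 visits `seed, a, b, c, d`, while the DFS crawl visits `seed, a, c, b, d`.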
Artificial Data Generation
API-based and bot-based techniques are time-consuming data collection strategies and are
highly dependent on users' privacy and security settings. In some situations,
data are needed quickly to address a specific problem but are not readily available. Moreover,
because of privacy considerations, access to the data of interest may not be possible. In such
cases, a synthetic data sample is created with existing data generator packages, based
on the structure of a network or the characteristics of existing datasets.
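As a small illustration of generating data from a network configuration, the sketch below builds a random friendship graph in which every pair of users is connected with the same probability (the Erdős–Rényi model). It uses only the standard library rather than a dedicated generator package, and the function name `random_friend_graph` and its parameters are assumptions chosen for the example; real OSN graphs have structure that this simple model does not capture.

```python
import random
from itertools import combinations
from typing import Dict, Set


def random_friend_graph(n_users: int, p: float, seed: int = 0) -> Dict[int, Set[int]]:
    """Create an undirected friendship graph where each pair is linked with probability p."""
    rng = random.Random(seed)            # fixed seed makes the sample reproducible
    graph: Dict[int, Set[int]] = {u: set() for u in range(n_users)}
    for u, v in combinations(range(n_users), 2):
        if rng.random() < p:
            graph[u].add(v)              # friendship is mutual, so store both directions
            graph[v].add(u)
    return graph
```

Calling the function twice with the same `seed` yields the same graph, which makes experiments on the synthetic sample repeatable.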
Existing Dataset Study (EDS)
Researchers can also carry out experiments using data that others have gathered
and made accessible to the public. This method is regarded as secondary analysis, in which
the research is carried out on data collected by others. For example, for user analysis on the
BibSonomy social bookmarking platform, authors used a publicly available dataset.
Profile Selection Approaches
It is evident from the above that existing datasets and artificial data generation require no
profiles for data extraction, whereas in the other two cases (the API-based and crawler-based
approaches) the profiles (seed nodes) must be specified. Both real and fake profiles are
needed to extract data through an API or a crawler. Real accounts
can easily be found in large numbers on the social network.
Manual Approach
In the manual approach, suspect accounts are examined by hand and profiles engaged in
fraudulent activity are marked. There are several ways to examine and select fake profiles
manually. One approach is to gather a random sample of network profiles and label each
profile in the list on the basis of a set of
features that discriminate between real and forged profiles.
Honey Profile-Based Approach
As the name suggests, a honey profile is an OSN profile used to attract other (most likely
similar) users. Various kinds of honey profiles, or simply honeytraps, are built to attract both
real and fake users as required. For example, some people create honey
profiles that attract young people on the target network, while others create honey profiles that
attract the population at large. For fake profile collection, however, researchers build honey
profiles, for example porn profiles, that specifically attract fake profiles of the same
category.
Botnet-Based Approach
As stated, a botnet is a network of automated programs (bots) managed and monitored by a
human controller, the 'botherder', and programmed to carry out various tasks, such as
communicating with and attracting other network users, promoting goods and brands,
campaigning, and other activities.
7.5 EVALUATION METRICS
Fake OSN profiles may be identified using data crawling, optimal feature sets, machine
learning models, and so on. With thousands of fake profiles across numerous OSNs that aim to
mislead, more sophisticated methods are required to secure one's online presence. This can be
achieved at least in part if protection is informed by a thorough review of current
techniques and approaches for analyzing and identifying the different categories of fake
profiles, so that an efficient framework for fake profile detection can be designed. An
analysis of the features that current approaches use to distinguish fake from real profiles
has also been presented. The section also highlights several approaches to data crawling,
together with some existing data sources, to mitigate the data shortage faced by OSN
researchers. After a rigorous review of the existing literature, we realized the need for an
efficient data crawler that can extract data from user profiles on different OSNs.
8. RESOURCE AND REQUIREMENTS
Over time, researchers have used many kinds of features to build machine learning models that
distinguish real from fake profiles. Some studies, for example, rely on network features such
as friend growth, evolution of the OSN graph structure, clustering coefficient, and link
strength to identify fake accounts. In other studies, researchers used content-based
attributes such as URLs in posts, message similarity, message length, hashtags, tag counts,
capital letter counts, and word length to find fraudulent accounts. Other authors proposed a
solution based on the profile attributes a person displays on the social network, including
gender, relationship status, and education information. Several studies have also combined
attributes from more than one category in order to improve the performance of fake profile
detection models. We concentrate on emotion-based features of fake profiles. Users' emotions
have received little attention in the area of fake profile identification. The authors of one
paper say that emoticons, good fortune, valence, and enthusiasm may help in detecting spam
accounts. However, sentiment analysis has been carried out in many other areas of social
networks. For example, one paper analyzes comments made on MySpace to examine gender
differences in emotional behavior. That research concluded that women's comments had more
positive characteristics than men's comments. In another study, the six basic emotion types
proposed by the psychologists Ekman and Friesen (anger, disgust, fear, happiness, sadness,
and surprise) were used to describe the moods of blog posts. Related studies measure
emotional strength using a two-dimensional space. To evaluate emotions, researchers primarily
use lexicon-based and machine learning methods. For instance, the MPQA Subjectivity Lexicon
45 is used in some papers to assess sentiment in a context-aware recommendation framework. In
another paper, the authors used several machine learning algorithms, such as Naïve Bayes and
Support Vector Machine, to classify user reviews of movies.
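The content-based attributes mentioned in this section (URLs in posts, message length,
hashtags, capital letter counts, and word length) can be computed directly from post text.
The sketch below is illustrative only: the regular expressions, the function name
content_features, and the exact feature definitions are assumptions and do not reproduce the
feature set of any cited study.

```python
import re

# Assumed patterns: a URL is anything starting with http:// or https://,
# and a hashtag is '#' followed by word characters.
URL_PATTERN = re.compile(r"https?://\S+")
HASHTAG_PATTERN = re.compile(r"#\w+")
WORD_PATTERN = re.compile(r"[A-Za-z']+")


def content_features(post):
    """Return a dict of simple content-based features for one post."""
    urls = URL_PATTERN.findall(post)
    # Remove URLs first so their characters do not distort the other counts.
    text = URL_PATTERN.sub(" ", post)
    hashtags = HASHTAG_PATTERN.findall(text)
    words = WORD_PATTERN.findall(text)
    average_word_length = (
        sum(len(word) for word in words) / len(words) if words else 0.0
    )
    return {
        "message_length": len(post),
        "url_count": len(urls),
        "hashtag_count": len(hashtags),
        "capital_letter_count": sum(1 for ch in text if ch.isupper()),
        "average_word_length": average_word_length,
    }


# Made-up post used only to exercise the function.
example = "WIN a FREE phone now! https://example.com/deal #free #win"
features = content_features(example)
for name, value in features.items():
    print(name, value)
```

Feature vectors of this kind could then be passed to a classifier such as Naïve Bayes or a
Support Vector Machine; the choice of classifier is outside the scope of this sketch.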
9. RESEARCH PLAN
We attempt to bring everything related to fake profiles together in one place. We also
examined the different types of fake profiles in various OSNs, such as compromised profiles,
cloned profiles, and online bots (spam bots, social bots, like bots, and influential bots).
We also outlined several categories of features that can distinguish different kinds of fake
profiles from real ones, in order to improve fake profile detection systems. The chapter also
addresses the problem of data unavailability by presenting several data collection strategies
and some existing data sources. Many machine learning approaches have been tried in order to
design fake profile identification systems.
We present the architecture and implementation of an iMacros-based data crawler, called
IMcrawler, for extracting data from Facebook profiles. It will gather all the details that
are available on a user's profile page. The proposed crawler addresses the challenges
associated with existing approaches to data collection from online social networks. Using the
data extracted from Facebook accounts, we carry out a comprehensive behavioral and emotional
study of people who use Facebook. The information gathered is split into two broad
categories: profile information (profile features) and wall activity (post features). Profile
information consists of details provided by the users, and wall activity consists of actions
taken by the users on their timelines. In the behavioral study, we observe what details
people tend to share on the social network and whether there is a gender bias in the sharing
of personal information. Furthermore, we analyze what type of content people mostly post on
their timelines and which activities are performed most often on the network. We also design
a novel suspicious identity detection system based on the mutual clustering coefficient,
which quantifies the interaction among the mutual friends of two connected users in a
community. We introduce a method to identify suspicious connections in a user's community
based on the mutual clustering coefficient and users' profile details. Profile details are
used to measure user-to-user similarity. Finally, we propose an emotion-based mechanism for
distinguishing real from fake users on Facebook. We aim to explore the emotions that fake and
real users express in the text on their Facebook walls. Our hypothesis is that the emotions of
ordinary (real) users exhibit greater variation than those of fake users. To analyze the
content of user posts, we use Plutchik's eight basic emotions: fear, anger, sadness, joy,
surprise, disgust, trust, and anticipation. Data are retrieved from Facebook using the
IMcrawler.
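The hypothesis above, that real users show more emotional variation than fake users, could be
checked along the following lines. This is a hedged sketch, not the study's method: the tiny
word list TOY_LEXICON is invented for illustration (a real analysis would use an established
emotion lexicon), and Shannon entropy over the eight emotion categories is an assumed measure
of "variation" that the text does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon: one example word per Plutchik emotion.
TOY_LEXICON = {
    "afraid": "fear",
    "angry": "anger",
    "sad": "sadness",
    "happy": "joy",
    "wow": "surprise",
    "gross": "disgust",
    "trust": "trust",
    "soon": "anticipation",
}


def emotion_counts(posts):
    """Count how often each emotion appears across a user's posts."""
    counts = Counter()
    for post in posts:
        for token in re.findall(r"[a-z']+", post.lower()):
            emotion = TOY_LEXICON.get(token)
            if emotion is not None:
                counts[emotion] += 1
    return counts


def emotion_entropy(counts):
    """Shannon entropy (in bits) of the emotion distribution.

    Higher values mean the user's posts spread over more emotions;
    0.0 means a single emotion or no emotion words at all.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Made-up posts for two hypothetical users.
real_user_posts = [
    "So happy today",
    "I am sad and angry",
    "wow, did not expect that",
    "can't wait, coming soon",
]
fake_user_posts = ["happy happy deal", "happy to offer", "so happy"]

real_score = emotion_entropy(emotion_counts(real_user_posts))
fake_score = emotion_entropy(emotion_counts(fake_user_posts))
print(f"real user entropy: {real_score:.2f}, fake user entropy: {fake_score:.2f}")
```

Other measures of variation, such as the number of distinct emotions expressed or the
variance of per-post emotion scores, would fit the same structure.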
10 REFERENCES
[1] Andrew Hutchinson, “Facebook Outlines the Number of Fake Accounts on Their Platform in New
Report,” 2018. [Online]. Available: https://www.socialmediatoday.com/news/facebook-outlines-the-
number-of-fakeaccounts-on-their-platform-in-new-repo/523614/.
[2] K. R. Nicholas Fandos, “Facebook Identifies an Active Political Influence Campaign Using Fake
Accounts - The New York Times,” The New York Times, 2018. [Online]. Available:
https://www.nytimes.com/2018/07/31/us/politics/facebook-political-campaignmidterms.html.
[3] M. Vergeer, L. Hermans, and S. Sams, “Online social networks and microblogging in political
campaigning,” Party Polit., vol. 19, no. 3, pp. 477–501, May 2013.
[4] S. Staab et al., “Social Networks Applied,” IEEE Intell. Syst., vol. 20, no. 1, pp. 80–93,
[5] N. A. Christakis and J. H. Fowler, “Social contagion theory: examining dynamic social networks and
human behavior,” Stat. Med., vol. 32, no. 4, pp. 556–577.
[6] T. Aichner, “Measuring the degree of corporate socialmedia use,”
[7] E. Grabianowski, “How Online Dating Works,” HowStuffWorks.com, 2005. [Online]. Available:
https://people.howstuffworks.com/online-dating.htm.
[8] M. Y. Kharaji, F. S. Rizi, and M. R. Khayyambashi, “A New Approach for Finding Cloned Profiles in
Online Social Networks,” vol. 6,
[9] I. Zeifman, “Bot traffic is up to 61.5% of all website traffic,” 2013. [Online]. Available:
https://www.incapsula.com/blog/bot-traffic-report-2013.
[10] M. Varvello and G. M. Voelker, “Second Life: A Social Network of Humans and Bots,” NOSSDAV
’10 Proc. 20th Int. Work. Netw. Oper. Syst. Support Digit. Audio Video, pp. 9–14, 2010.