The document presents AMUSED, a semi-automated framework for annotating multi-modal social media data from multiple platforms like Twitter, YouTube, and Reddit. AMUSED aims to address challenges in timely data collection and annotation by combining machine and human annotation. It detects social media posts linked from fact-checked news articles and downloads and assigns a label to the posts. As a use case, AMUSED has been used to collect over 8,000 COVID-19 misinformation posts from social media and categorize them.
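The link-detection step described above can be sketched roughly as follows. This is a minimal illustration, not the AMUSED implementation: the host-to-platform table, the function names, and the idea of copying the fact-check verdict onto each linked post as its label are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical host-to-platform table (an assumption, not taken from AMUSED).
PLATFORM_HOSTS = {
    "twitter.com": "twitter",
    "youtube.com": "youtube",
    "youtu.be": "youtube",
    "reddit.com": "reddit",
}


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def platform_of(url):
    """Return the platform name for a URL, or None if it is not a known platform."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return PLATFORM_HOSTS.get(host)


def label_posts(article_html, verdict):
    """Find social media links in a fact-check article and attach the verdict.

    Assumption: the article's verdict is used directly as the post label.
    """
    collector = LinkCollector()
    collector.feed(article_html)
    posts = []
    for url in collector.links:
        platform = platform_of(url)
        if platform is not None:
            posts.append({"url": url, "platform": platform, "label": verdict})
    return posts
```

In a semi-automated setting, the output of such a step would typically be passed to a human annotator for checking rather than trusted as is.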
Survey of data mining techniques for social Firas Husseini
This document summarizes data mining techniques that have been used for social network analysis. It discusses how social networks generate massive amounts of data that present computational challenges due to their size, noise, and dynamism. It then reviews both traditional and recent unsupervised, semi-supervised, and supervised data mining techniques that have been applied to social network analysis to handle these challenges and discover useful knowledge from social network data, including graph theoretic techniques, tools for analyzing opinions and sentiment, and techniques for topic detection and tracking.
Online Data Preprocessing: A Case Study Approach IJECEIAES
Besides the Internet search facility and e-mails, social networking is now one of the three best uses of the Internet. A tremendous number of volunteers every day write articles, share photos, videos and links at a scope and scale never imagined before. However, because social network data are huge and come from heterogeneous sources, the data are highly susceptible to inconsistency, redundancy, noise, and loss. For data scientists, preparing the data and getting it into a standard format is critical because the quality of data is going to directly affect the performance of mining algorithms that are going to be applied next. Low-quality data will certainly limit the analysis and lower the quality of mining results. To this end, the goal of this study is to provide an overview of the different phases involved in data preprocessing, with a focus on social network data. As a case study, we will show how we applied preprocessing to the data that we collected for the Malaysian Flight MH370 that disappeared in 2014.
Sentiment analysis of comments in social media IJECEIAES
Social media platforms are witnessing significant growth in both size and purpose. One specific aspect of social media platforms is sentiment analysis, by which insights into the emotions and feelings of a person can be inferred from their posted text. Research related to sentiment analysis is attracting substantial interest as it is a promising field that can improve user experience and provide countless personalized services. Twitter is one of the most popular social media platforms; it has users from different regions with a variety of cultures and languages. It can thus provide a large and diverse body of data that can be used to improve decision making. In this paper, the sentiment orientation of the textual features and emoji-based components is studied, targeting tweets and comments posted in Arabic on Twitter during the 2018 World Cup. This study also measures the significance of analyzing texts including or excluding emojis. The data is obtained from thousands of extracted tweets, to find the results of sentiment analysis for texts and emojis separately. Results show that emojis support the sentiment orientation of the texts, and that neither texts nor emojis alone provide reliable information, as they complement each other to give the intended meaning.
IRJET- Identification of Prevalent News from Twitter and Traditional Media us... IRJET Journal
This document describes a study that uses community detection models to identify prevalent news topics discussed on both Twitter and traditional media like BBC. It collects tweets and news articles about sports over a one-month period. Keywords are extracted from the data and a graph is constructed to represent relationships between words. Three community detection models - Girvan-Newman clustering, CLIQUE, and Louvain - are used to cluster similar content and detect communities of keywords representing news topics. The number of unique Twitter users engaged with each topic is also calculated to rank topics by user attention. The goal is to analyze how information is distributed between social and traditional media and identify emerging topics with low coverage in traditional sources.
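Two of the steps in this summary, building the keyword graph and ranking topics by unique users, can be sketched as below. The community detection itself (Girvan-Newman, CLIQUE, Louvain) is not implemented here; the sketch assumes the keyword communities are already available as sets. All function names and the input shapes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(keyword_lists):
    """Build a weighted edge list: how often two keywords share a document."""
    edges = Counter()
    for keywords in keyword_lists:
        # Sort so each undirected edge has a single canonical key.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


def rank_topics_by_users(topics, tweets):
    """Rank topics by the number of distinct users who mention them.

    topics: {topic name: set of keywords} (assumed output of community detection)
    tweets: iterable of (user, list of keywords)
    """
    users = {name: set() for name in topics}
    for user, keywords in tweets:
        for name, topic_keywords in topics.items():
            if topic_keywords & set(keywords):
                users[name].add(user)
    # Most users first; ties broken alphabetically for a stable order.
    return sorted(
        ((name, len(members)) for name, members in users.items()),
        key=lambda item: (-item[1], item[0]),
    )
```

The edge weights from `cooccurrence_graph` are what a community detection routine would consume; counting distinct users rather than tweets keeps one very active account from dominating the ranking.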
SOCIAL MEDIA NEWS: MOTIVATION, PURPOSE AND USAGE ijcsit
This paper presents the results of an online survey conducted to analyse the use of the social web in
the context of daily news. The focus was on users’ motivation and habits in news consumption. Moreover,
users’ news behaviour was divided into three purposes, namely news consumption, news production and
news dissemination, to find out whether the usage has a passive or active character. In a second step it was
asked which social software is used for which purpose. In conclusion, users appreciate social software
for features such as interactivity and information that traditional media do not provide. Among the social
web platforms, users prefer social networking sites as well as video-sharing platforms. Social networking sites
also rank first in news production and dissemination.
This document summarizes research on detecting fake news using text analysis techniques. It discusses how social media consumption of news has increased and the challenges of identifying trustworthy sources. Various types of fake news are described based on visual/text content or the targeted audience. Methods for detection include clustering similar news reports and using predictive models to analyze linguistic features like punctuation, semantic levels, and readability. The proposed approach uses text summarization, web crawling to find related articles, latent semantic analysis to compare articles, and fuzzy logic to determine the authenticity score of a target news article. The goal is to develop a system to help users identify fake news on social media platforms.
Big data analytics and its impact on internet users Struggler Ever
Big data analytics tools are promising techniques for predicting the future in many aspects of our lives. The need for such predictive techniques has been increasing exponentially. Even though many challenges and risks still concern researchers and decision makers, the outcome of using these techniques will considerably revolutionize our world and lead it into a new era of technology.
IDENTITY DISCLOSURE PROTECTION IN DYNAMIC NETWORKS USING K W – STRUCTURAL DIV... IJITE
Data mining delivers accurate information to a requesting user after the raw data has been analyzed. Amid many developments, data mining faces pressing issues of security, privacy and integrity. One recent technique is privacy-preserving data publishing (PPDP), which enforces security for the digital information provided by governments, corporations, companies and individuals in social networks. People are embarrassed when an adversary tries to learn the sensitive information they have shared. Sensitive information is gathered through the vertex and multi-community identities of the user. Vertex identity denotes the user's own information, such as name, address, mobile number, etc. Multi-community identity denotes the community groups in which the user participates. To prevent such identity disclosures, this paper proposes the KW-structural diversity anonymity technique for protecting against vertex and multi-community identity disclosure. In the KW-structural diversity anonymity technique, k is the privacy level applied for users and W is the adversary's monitoring time.
Full Paper: Analytics: Key to go from generating big data to deriving busines... Piyush Malik
This document discusses how analytics can help organizations derive business value from big data. It describes how statistical analysis, machine learning, optimization and text mining can extract meaningful insights from social media, online commerce, telecommunications, smart utility meters, and improve security. While tools exist to analyze big data, challenges remain around data security, privacy, and developing skilled talent. The paper aims to illustrate how existing algorithms can generate value from different industry use cases.
Social Media Privacy Protection for Blockchain with Cyber Security Prediction... IRJET Journal
This document discusses privacy and security issues related to social media. It begins by introducing how social media has become integral to modern life but also presents privacy risks if users share personal information publicly. Some key privacy threats on social media mentioned include data breaches, passive attacks like unauthorized data collection, and active attacks trying to access other user accounts. The document then reviews literature around social media security and privacy concerns. It outlines common security risks like unmonitored accounts, human error, and vulnerabilities in third-party apps linked to social media profiles. Potential threats to social networks are categorized as data breaches, passive attacks, and active attacks. The document concludes that social networks pose significant security and privacy risks and all users should take steps to protect
MACHINE LEARNING ALGORITHMS FOR HETEROGENEOUS DATA: A COMPARATIVE STUDY IAEME Publication
In the present digital era, massive amounts of data are continuously generated
at exceptional and increasing scales. This data has become an important and
indispensable part of every economy, industry, organization, business and individual.
Handling these large datasets is one of the major challenges because of the
heterogeneity of their formats. There is a need for efficient data processing techniques to
handle the heterogeneous data and to meet the computational requirements of
processing this huge volume of data. The objective of this paper is to review, describe
and reflect on heterogeneous data and the complexity of processing it, and on the use
of machine learning algorithms, which play a major role in data analytics.
HADOOP based Recommendation Algorithm for Micro-video URL dbpublications
In recent years, social media applications have pervaded our daily lives, and Social Networking Sites (SNSs) depend on users for content generation. With respect to user interest, the content produced within individual SNSs leaves a significant share of interest-based content undiscovered. This led to features such as “like”, “share” and “hashtags” for delivering content from one platform to another. These allowed users to interact with multiple SNSs, but users could still only receive content for each SNS separately. Although Open Identity allowed users a single sign-in across multiple platforms, targeting multiple platforms remained an open problem. A Unified Access Model is proposed for internet-based content modeling, where the content for users can be images, videos or text. Short videos, termed “micro-videos”, are popular with both viewers and producers. This work provides a recommendation algorithm for micro-video URLs which, unlike traditional recommendation algorithms such as content-based recommendation, uses a big-data parallel computing framework. High-performance computing is achieved with the Slope One algorithm implemented using MapReduce and Hadoop. Hence, the proposed recommendation system for micro-video URLs can achieve high-performance parallel computing and can be used by producers and viewers.
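For readers unfamiliar with Slope One, a minimal single-machine sketch of the weighted variant follows. It is not the paper's MapReduce/Hadoop implementation; the function names and the nested-dictionary rating format are assumptions chosen for the example.

```python
from collections import defaultdict


def train_slope_one(ratings):
    """Compute average rating deviations between every pair of co-rated items.

    ratings: {user: {item: rating}}
    Returns (diffs, counts) where diffs[i][j] is the mean of r_i - r_j over
    users who rated both, and counts[i][j] is the number of such users.
    """
    diffs = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diffs[i][j] += r_i - r_j
                    counts[i][j] += 1
    # Turn the accumulated sums into averages.
    for i in diffs:
        for j in diffs[i]:
            diffs[i][j] /= counts[i][j]
    return diffs, counts


def predict(user_ratings, item, diffs, counts):
    """Weighted Slope One prediction of `item` for one user, or None if impossible."""
    numerator = 0.0
    denominator = 0
    for j, r_j in user_ratings.items():
        n = counts.get(item, {}).get(j, 0)
        if j != item and n > 0:
            numerator += (diffs[item][j] + r_j) * n
            denominator += n
    return numerator / denominator if denominator else None
```

The pairwise deviation sums in `train_slope_one` are independent per user, which is what makes the algorithm a natural fit for a map step (emit item-pair differences) followed by a reduce step (average them).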
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred... CSEIJJournal
In machine learning, the intelligence of a developed model is greatly influenced by the dataset used for the
target domain on which the model will be deployed. Social media platforms have experienced
an increasing number of hacker attacks in recent times. There are two
possible ways to identify a hacker on a platform. The first is to use the activities of the user, while the second is to use the details the
user supplied when registering the account. To proactively identify a social media user as a hacker, there
are relevant user details, called features, that can be used to determine whether a social media user is a
hacker or not. In this paper, an exploratory data analysis was carried out to determine the best features
that a predictive model can use to proactively identify hackers on a social media platform. A web
crawler was developed to mine the user dataset, on which exploratory data analysis was carried out to
select the best features, which could be used to correctly identify a hacker on a social media
platform.
Researching Social Media – Big Data and Social Media Analysis Farida Vis
Researching Social Media – Big Data and Social Media Analysis, presentation for the Social Media for Researchers: A Sheffield Universities Social Media Symposium, 23 September 2014
This document discusses different approaches for analyzing social media data to gain customer insights:
1) Channel reporting tools provide overviews of specific social media platforms but lack deeper insights.
2) Scorecard systems aggregate data across sources but users cannot enhance the data.
3) Text mining analyzes sentiment but network analysis examines relationships; each technique has limitations alone.
4) The document proposes combining text mining, network analysis, and other techniques using a predictive analytics platform to generate new insights, as was done successfully for a major European telecom company.
It provides examples analyzing publicly available Slashdot data to identify influencers and show how sentiment relates to influence.
There are various online networking sites, such as Facebook and Twitter, where students casually discuss their educational
experiences, opinions, emotions, and concerns about the learning process. Information from such open environments can
give valuable knowledge about opinions and emotions and help educational organizations gain insight into students’ educational
life. Analysing such data, on the other hand, can be challenging; therefore qualitative research and a substantial data
mining process are needed. Sentiment classification can be done using NLP (Natural Language Processing). For a social
network that provides microblogging services, such as Twitter, the incoming tweets can be classified into News, Opinions,
Events, Deals and Private Messages based on author information available in the tweets. This approach is similar to
Tweetstand, which classifies tweets into news and non-news. Even for e-commerce applications, virtual customer
environments can be created using social networking sites. Since the data is ever growing, using data mining techniques can become
difficult, hence data analysis tools can be used.
Comprehensive Social Media Security Analysis & XKeyscore Espionage Technology CSCJournals
Social networks offer users many services for sharing activities, events and ideas. Many attacks can happen to social networking websites because of the trust users place in them. This paper discusses cyber threats: we study the types of cyber threats, classify them, and give some suggestions for protecting social networking websites from a variety of attacks. Moreover, we give some anti-threat strategies and future trends.
Collusion-resistant multiparty data sharing in social networks IJECEIAES
The number of users on online social networks (OSNs) has grown tremendously over the past few years, with sites like Facebook amassing over a billion users. With the popularity of OSNs, the increase in privacy risk from the large volume of sensitive and private data is inevitable. While there are many features for access control for an individual user, most OSNs still need concrete mechanisms to preserve the privacy of data shared between multiple users. The proposed method uses metrics such as identity leakage (IL) and strength of interaction (SoI) to fine-tune the scenarios that use privacy risk and sharing loss to identify and resolve conflicts. In addition to conflict resolution, bot detection is also done to mitigate collusion attacks. The final decision to share the data item is then ascertained based on whether it passes the threshold condition for the above metrics.
Combating propaganda texts using transfer learning IAESIJAI
Recently, it has been observed that people are shifting away from traditional news media sources towards trusting social networks to gather news information. Social networks have become the primary news source, although the validity and reliability of the information provided are uncertain. Memes are a crucial content type that is very popular among young people and plays a vital role in social media. They spread quickly, in a peer-to-peer rather than a prescriptive manner. Unfortunately, promoters and propagandists have adopted memes to indirectly manipulate public opinion and influence attitudes using psychological and rhetorical techniques. This type of content could lead to unpleasant consequences in communities. This paper introduces an ensemble model system that addresses one of the most recent natural language processing research topics: detection of propaganda techniques in texts extracted from memes. The paper also explores state-of-the-art pre-trained language models. The proposed model also uses different optimization techniques, such as data augmentation and model ensembling. It has been evaluated using a reference dataset from SemEval-2021 Task 6. Our system outperforms the baseline and state-of-the-art results by achieving an F1-micro score of 0.604 on the test set.
Scraping and Clustering Techniques for the Characterization of Linkedin Profiles csandit
The socialization of the web has taken on a new dimension since the emergence of the Online
Social Networks (OSN) concept. The fact that each Internet user becomes a potential content
creator entails managing a large amount of data. This paper explores the most popular
professional OSN: LinkedIn. A scraping technique was implemented to collect around 5 million
public profiles. Applying natural language processing (NLP) techniques to classify the
educational background and to cluster the professional background of the collected profiles
allowed us to provide some insights about this OSN’s users and to evaluate the relationships between
educational degrees and professional careers.
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se... ijtsrd
Online Social Networks (OSNs) provide a variety of applications for users to network with family, friends and even strangers. One such application, the friend search engine, allows the general public to query individual users' friend lists and has been gaining popularity recently. Without proper design, this application may mistakenly disclose users' private relationship information. Existing work offers a privacy preservation solution that can effectively boost OSNs' sociability while protecting users' friendship privacy against attacks launched by individual malicious requestors. This project proposes an advanced collusion attack, in which a victim user's friendship privacy can be compromised through a series of carefully designed queries launched in coordination by multiple malicious requestors. The effect of the proposed collusion attack is validated on synthetic and real-world social network data sets. The work on advanced collusion attacks will help in designing a more robust and secure friend search engine on OSNs in the near future. R. Brintha | H. Parveen Bagum, "Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Search Engine", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4, June 2020. URL: https://www.ijtsrd.com/papers/ijtsrd31687.pdf Paper URL: https://www.ijtsrd.com/computer-science/world-wide-web/31687/retrieving-hidden-friends-a-collusion-privacy-attack-against-online-friend-search-engine/r-brintha
INCREASING THE INVESTMENT’S OPPORTUNITIES IN KINGDOM OF SAUDI ARABIA BY STUDY... ijcsit
Social networking sites are a significant source of information about the behavior of users and about
what occupies society across all ages, so that helpful information can be provided to specialists
and decision-makers. According to official sources, 98.43% of Saudi youth use social networking sites. The
study and analysis of social media data are done to provide the information necessary to increase
investment opportunities within the Kingdom of Saudi Arabia, by studying and analyzing what occupies
people on these sites through their tweets about the labor market and investment. Given the
huge volume of data and its randomness, the data will be surveyed and collected through
keywords, prioritized and ordered, and recorded as (positive - negative - mixed). The study's
analysis and conclusions will be based on data mining and its techniques of analysis and deduction.
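The keyword-based recording of tweets as positive, negative or mixed could look something like the sketch below. The abstract does not give the keyword lists or the matching rule, so both word sets and the whitespace tokenization are purely illustrative assumptions.

```python
# Hypothetical keyword lists; the study's actual keywords are not given.
POSITIVE = {"growth", "opportunity", "profit"}
NEGATIVE = {"unemployment", "loss", "risk"}


def label_tweet(text):
    """Label a tweet as positive, negative or mixed by keyword match.

    Returns None when no keyword from either list occurs in the text.
    """
    words = set(text.lower().split())
    has_positive = bool(words & POSITIVE)
    has_negative = bool(words & NEGATIVE)
    if has_positive and has_negative:
        return "mixed"
    if has_positive:
        return "positive"
    if has_negative:
        return "negative"
    return None
```

Tweets that return `None` would simply fall outside the collected sample, which matches the idea of collecting data through keywords.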
Terrorism Analysis through Social Media using Data Mining IRJET Journal
This document presents a study that uses deep learning models like Deep Neural Networks (DNN) and Convolutional Neural Networks (CNN) to analyze terrorism through detecting toxicity in social media text data. The study aims to classify text data into categories like toxicity, severe toxicity, obscenity, threat, insult or identity hate. It provides an overview of DNN and CNN models for text classification and compares their methodology, architecture and performance. The models are trained on preprocessed social media data related to terrorist activities and aim to accurately predict the toxicity level and classify tweets for concerned authorities to make informed decisions.
Organisational challenges of using social media marketing caliesch liebrich_2...www.rw-oberwallis.ch
This document discusses the organizational challenges of using social media marketing, specifically looking at the Facebook pages of KLM and Swiss International Airlines. It finds that KLM had a larger social media presence with more fan engagement. Both airlines faced customer service issues on their Facebook walls but KLM responded to more posts and comments. The key organizational challenges identified are: 1) Setting up internal networks to respond quickly to public queries, 2) Developing an authentic communication style aligned with corporate culture, 3) Coordinating responses across departments to multi-purpose queries, and 4) Continually adapting to the dynamic social media environment to better serve customers.
10 Best Printable Primary Writing Paper TemplateChristina Bauer
The document provides instructions for requesting writing assistance from HelpWriting.net. It outlines a 5-step process: 1) Create an account with a password and email. 2) Complete a 10-minute order form providing instructions, sources, and deadline. 3) Review bids from writers and select one based on qualifications. 4) Review the completed paper and authorize payment or request revisions. 5) Request revisions to ensure satisfaction, and HelpWriting.net guarantees original, high-quality work or a full refund.
Essay Writing Format For Kids. Browse Printable EssChristina Bauer
The document discusses both the positive and negative aspects of the 1969 moon landing. Positively, it achieved an important goal of the Space Race with the Soviet Union during the Cold War. However, it also describes the human toll of the space program, including the deaths of astronauts in training accidents. It then provides details of President John F. Kennedy's visit to Fort Worth, Texas on November 21st, 1963, the day before his assassination in Dallas.
More Related Content
Similar to AMUSED An Annotation Framework Of Multi-Modal Social Media Data
Full Paper: Analytics: Key to go from generating big data to deriving busines...Piyush Malik
This document discusses how analytics can help organizations derive business value from big data. It describes how statistical analysis, machine learning, optimization and text mining can extract meaningful insights from social media, online commerce, telecommunications, smart utility meters, and improve security. While tools exist to analyze big data, challenges remain around data security, privacy, and developing skilled talent. The paper aims to illustrate how existing algorithms can generate value from different industry use cases.
Social Media Privacy Protection for Blockchain with Cyber Security Prediction...IRJET Journal
This document discusses privacy and security issues related to social media. It begins by introducing how social media has become integral to modern life but also presents privacy risks if users share personal information publicly. Some key privacy threats on social media mentioned include data breaches, passive attacks like unauthorized data collection, and active attacks trying to access other user accounts. The document then reviews literature around social media security and privacy concerns. It outlines common security risks like unmonitored accounts, human error, and vulnerabilities in third-party apps linked to social media profiles. Potential threats to social networks are categorized as data breaches, passive attacks, and active attacks. The document concludes that social networks pose significant security and privacy risks and all users should take steps to protect
MACHINE LEARNING ALGORITHMS FOR HETEROGENEOUS DATA: A COMPARATIVE STUDYIAEME Publication
In the present digital era, massive amounts of data are continuously being generated
at exceptional and increasing scales. This data has become an important and
indispensable part of every economy, industry, organization, business, and individual.
Handling these large datasets is one of the major challenges because of the
heterogeneity of their formats. There is a need for efficient data processing techniques to
handle heterogeneous data and to meet the computational requirements of
processing this huge volume of data. The objective of this paper is to review, describe,
and reflect on heterogeneous data and the complexity of processing it, and on the use
of machine learning algorithms, which play a major role in data analytics.
HADOOP based Recommendation Algorithm for Micro-video URLdbpublications
In recent years, social media applications have pervaded our daily lives, which makes Social Networking Sites (SNSs) dependent on users for content generation. With respect to user interest, the content produced within individual SNSs leaves a significant amount of interest-based content undiscovered. This led to features such as “like”, “share”, and “hashtags” that deliver content from one platform to another. These allowed users to interact with multiple SNSs, but users still received content for each SNS separately. Although Open Identity allowed users a single sign-in across multiple platforms, targeting multiple platforms remained an open problem. A Unified Access Model is proposed for internet-based content modeling, where the content for users can be images, videos, or text. Short videos, termed “micro-videos”, are popular with both viewers and producers. This work provides a recommendation algorithm for micro-video URLs that, unlike traditional recommendation algorithms such as content-based recommendation, uses a big data parallel computing framework. High-performance computing is achieved by using the slope one algorithm with MapReduce and Hadoop techniques. Hence, the proposed recommendation system for micro-video URLs can achieve high-performance parallel computing and can be used by producers and viewers.
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...CSEIJJournal
In machine learning, the intelligence of a developed model is greatly influenced by the dataset used for the
target domain on which the model will be deployed. Social media platforms have experienced
more hacker attacks in recent times. There are two possible ways to identify a hacker on a platform.
The first is to use the activities of the user, while the second is to use the details the
user supplied when registering the account. To proactively identify a social media user as a hacker, there
are relevant user details, called features, that can be used to determine whether a social media user is a
hacker or not. In this paper, an exploratory data analysis was carried out to determine the best features
that can be used by a predictive model to proactively identify hackers on a social media platform. A web
crawler was developed to mine the user dataset on which the exploratory data analysis was carried out to
select the best features for correctly identifying a hacker on a social media
platform.
Researching Social Media – Big Data and Social Media AnalysisFarida Vis
Researching Social Media – Big Data and Social Media Analysis, presentation for the Social Media for Researchers: A Sheffield Universities Social Media Symposium, 23 September 2014
This document discusses different approaches for analyzing social media data to gain customer insights:
1) Channel reporting tools provide overviews of specific social media platforms but lack deeper insights.
2) Scorecard systems aggregate data across sources but users cannot enhance the data.
3) Text mining analyzes sentiment but network analysis examines relationships; each technique has limitations alone.
4) The document proposes combining text mining, network analysis, and other techniques using a predictive analytics platform to generate new insights, as was done successfully for a major European telecom company.
It provides examples analyzing publicly available Slashdot data to identify influencers and show how sentiment relates to influence.
There are various online networking sites, such as Facebook and Twitter, where students casually discuss their educational
experiences, opinions, emotions, and concerns about the learning process. Information from such open environments can
give valuable knowledge about opinions and emotions and help educational organizations gain insight into students’ educational
life. Analysing such data, on the other hand, can be challenging, so qualitative research and a significant data
mining process are needed. Sentiment classification can be done using NLP (Natural Language Processing). For a social
network that provides microblogging services, such as Twitter, incoming tweets can be classified into News, Opinions,
Events, Deals, and Private Messages based on the author information available in the tweets. This approach is similar to
Tweetstand, which classifies tweets into news and non-news. Even for e-commerce applications, virtual customer
environments can be created using social networking sites. Since the data is ever growing, using data mining techniques can become
difficult; hence, data analysis tools can be used.
Comprehensive Social Media Security Analysis & XKeyscore Espionage TechnologyCSCJournals
Social networks can offer many services to users for sharing activities, events, and ideas. Many attacks can happen to social networking websites because of the trust that users have given them. Cyber threats are discussed in this paper. We study the types of cyber threats, classify them, and give some suggestions for protecting social networking websites from a variety of attacks. Moreover, we give some anti-threat strategies along with future trends.
Collusion-resistant multiparty data sharing in social networksIJECEIAES
The number of users on online social networks (OSNs) has grown tremendously over the past few years, with sites like Facebook amassing over a billion users. With the popularity of OSNs, the increase in privacy risk from the large volume of sensitive and private data is inevitable. While there are many features for access control for an individual user, most OSNs still need concrete mechanisms to preserve the privacy of data shared between multiple users. The proposed method uses metrics such as identity leakage (IL) and strength of interaction (SoI) to fine-tune the scenarios that use privacy risk and sharing loss to identify and resolve conflicts. In addition to conflict resolution, bot detection is also done to mitigate collusion attacks. The final decision to share the data item is then ascertained based on whether it passes the threshold condition for the above metrics.
Combating propaganda texts using transfer learningIAESIJAI
Recently, it has been observed that people are shifting away from traditional news media sources towards trusting social networks to gather news information. Social networks have become the primary news source, although the validity and reliability of the information provided are uncertain. Memes are a crucial content type that is very popular among young people and plays a vital role in social media. They spread rapidly among people in a peer-to-peer manner rather than a prescriptive one. Unfortunately, promoters and propagandists have adopted memes to indirectly manipulate public opinion and influence attitudes using psychological and rhetorical techniques. This type of content could lead to unpleasant consequences in communities. This paper introduces an ensemble model system that addresses one of the most recent natural language processing research topics: detecting propaganda techniques in texts extracted from memes. The paper also explores state-of-the-art pre-trained language models. The proposed model also uses different optimization techniques, such as data augmentation and model ensembling. It has been evaluated using a reference dataset from SemEval-2021 Task 6. Our system outperforms the baseline and state-of-the-art results by achieving an F1-micro score of 0.604 on the test set.
Scraping and Clustering Techniques for the Characterization of Linkedin Profilescsandit
The socialization of the web has taken on a new dimension since the emergence of the Online
Social Networks (OSN) concept. The fact that each Internet user becomes a potential content
creator entails managing a large amount of data. This paper explores the most popular
professional OSN: LinkedIn. A scraping technique was implemented to collect around 5 million
public profiles. Applying natural language processing (NLP) techniques to classify the
educational background and to cluster the professional background of the collected profiles led
us to provide some insights about this OSN’s users and to evaluate the relationships between
educational degrees and professional careers.
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...ijtsrd
Online social networks (OSNs) provide a variety of applications that let users connect with family, friends, and even strangers. One such application, the friend search engine, allows the general public to query an individual user's friend list and has been gaining popularity recently. Without proper design, this application may inadvertently disclose users' private relationship information. Existing work has proposed a privacy-preservation solution that can effectively boost OSNs' sociability while protecting users' friendship privacy against attacks launched by individual malicious requestors. This project proposes an advanced collusion attack, in which a victim user's friendship privacy can be compromised through a series of carefully designed queries launched in coordination by multiple malicious requestors. The effect of the proposed collusion attack is validated on synthetic and real-world social network data sets. The study of advanced collusion attacks will help in designing a more robust and secure friend search engine on OSNs in the near future. R. Brintha | H. Parveen Bagum "Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Search Engine" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4, June 2020, URL: https://www.ijtsrd.com/papers/ijtsrd31687.pdf Paper URL: https://www.ijtsrd.com/computer-science/world-wide-web/31687/retrieving-hidden-friends-a-collusion-privacy-attack-against-online-friend-search-engine/r-brintha
24 Page Set Of Winter Themed Writing Paper ByChristina Bauer
I apologize, upon further review I do not feel comfortable providing a full summary of this document without having the full context and being able to verify the accuracy and appropriateness of the content. The document discusses technical topics related to physics and I do not have the expertise to fully comprehend or summarize it.
Summarize Paragraph In Short. Online assignment writing service.Christina Bauer
The document provides instructions for creating an account and submitting a paper writing request on the HelpWriting.net site. It outlines a 5-step process: 1) Create an account with an email and password. 2) Complete a form with paper details, sources, and deadline. 3) Writers will bid on the request and their qualifications can be reviewed. 4) Place a deposit to start the writing. 5) Review the completed paper and authorize final payment or request revisions. The summary highlights the key steps involved in obtaining writing help through the site.
How To Write About The Theme Of A Book CoverlChristina Bauer
The document provides instructions for requesting writing assistance from HelpWriting.net. It outlines a 5-step process: 1) Create an account with a password and email. 2) Complete an order form with instructions, sources, and deadline. 3) Review bids from writers and choose one. 4) Receive the paper and authorize payment if pleased. 5) Request revisions until satisfied.
Art Thesis Examples - What Is Art Essay Examples WhChristina Bauer
The document provides instructions for requesting writing assistance from HelpWriting.net. It outlines a 5-step process: 1) Create an account, 2) Complete an order form providing instructions and deadline, 3) Review bids from writers and select one, 4) Review the completed paper and authorize payment, 5) Request revisions until satisfied. It emphasizes that original, high-quality content is guaranteed or a full refund will be provided.
The document provides instructions for requesting and completing an assignment writing request through the HelpWriting.net website. It outlines a 5-step process: 1) Create an account with an email and password. 2) Complete a form with assignment details and attach samples. 3) Review bids from writers and select one. 4) Review the completed paper and authorize payment. 5) Request revisions until satisfied, with refunds offered for plagiarized work. The document promises original, high-quality content and full satisfaction of needs.
Thank You Writing Paper. Online assignment writing service.Christina Bauer
The critique analyzes a piece of music that opens quickly with the cello but is interrupted by the piano playing in a deep key, bringing anxiety. The piano and cello alternate taking the lead, with the deep vibratos of the cello associated with lament and the single notes of the piano with hope. Though the piano suggests hope, it always ends the movement in a deep, somber key, indicating an underlying melancholy.
Great Personal Narratives. How To Write A Personal NChristina Bauer
This document provides instructions for how to request and complete an assignment writing request on the HelpWriting.net website. It outlines a 5 step process: 1) Create an account with a password and email, 2) Complete an order form with instructions and deadline, 3) Review bids from writers and choose one, 4) Review the completed paper and authorize payment, 5) Request revisions until satisfied. It emphasizes that the site aims to provide original, high-quality content and offers refunds for plagiarized work.
How To Insert A Citation In Mla Style - LasgraceChristina Bauer
The document provides instructions for how to request an assignment writing service from HelpWriting.net in 5 steps:
1. Create an account with a password and email.
2. Complete a 10-minute order form providing instructions, sources, deadline, and attaching a sample work.
3. Review bids from writers and choose one based on qualifications, history, and feedback, then pay a deposit.
4. Review the completed paper and authorize final payment if pleased, or request revisions.
5. Request multiple revisions to ensure satisfaction, and HelpWriting.net promises original, high-quality work with refunds for plagiarism.
Autobiography Outline Template For Middle SchoolChristina Bauer
The document provides information about Operation Barbarossa, Germany's invasion of the Soviet Union in 1941. It begins by discussing the 1939 non-aggression pact between Germany and the Soviet Union, noting that while they agreed not to attack each other, Hitler planned to invade the Soviet Union to remove communists and exterminate Jews. The summary then explains that Germany launched Operation Barbarossa on June 22, 1941, attacking the Soviet Union with over 3 million soldiers, marking the start of the Great Patriotic War. Finally, it notes the Germans had the advantage over the Soviet air force at the beginning of the invasion.
In Herman Melville's novella Billy Budd, Sailor, the protagonist Billy Budd is portrayed as a naive but excellent sailor. While Billy is good-natured and diligent, he is emotionally underdeveloped and fails to recognize evil in others. These attributes, along with his impressment into the British Navy and the malicious intentions of others on his ship, ultimately lead to Billy's downfall. However, the essay concludes that while Billy faces wrongful acts, he is also a victim of his own naivety and inability to see the corruption in people.
Ghost Writing And Craft By Its MoNiques World TeacChristina Bauer
This document discusses social anxiety disorder and how it differs from just being shy. Social anxiety disorder is defined as a fear and anxiety of being negatively judged and evaluated by others. It is a chronic mental illness, unlike shyness which is a temporary feeling. While it is common for people to feel judged at times, social anxiety disorder causes excessive and disabling fear and anxiety in social situations. The document aims to raise awareness that social anxiety disorder is a real mental illness and not just an extreme form of shyness or something people pretend to have.
The document provides instructions for creating an account and submitting an assignment request on the HelpWriting.net website in 5 steps:
1. Create an account by providing a password and email.
2. Complete a 10-minute order form with instructions, sources, deadline, and attach a sample if wanting the writer to imitate your style.
3. Review bids from writers and choose one based on qualifications, history, and feedback, then pay a deposit to start the assignment.
4. Review the completed paper and authorize full payment if satisfied, or request free revisions.
5. You can request multiple revisions to ensure satisfaction, and HelpWriting guarantees original, high-quality work with a full refund
The Black Sox Scandal involved several members of the Chicago White Sox baseball team who were accused of intentionally losing the 1919 World Series against the Cincinnati Reds in exchange for money from gamblers; some key players had been offered bribes of $5,000 or more each to lose the series by Chicago gamblers allied with the notorious crime boss Arnold Rothstein; while the players were acquitted in court due to lack of evidence, they were banned from organized baseball for life due to the determination by the newly created Commissioner of Baseball that they had conspired to fix the series.
Paperback Writer Partitions The Beat. Online assignment writing service.Christina Bauer
1. Putnam and Campbell's book American Grace analyzes religion's role in both dividing and uniting American society over recent decades through a sociological lens.
2. The book traces America's social history to understand shifts in religious beliefs and behaviors. It finds religion has both divided Americans along denominational lines but also brought people together in new ways.
3. Putnam and Campbell take a nuanced approach, acknowledging both religion's positive and negative impacts on American social cohesion over time. Their sociological analysis provides insights into religion's complex role in the nation's social fabric.
Argument Analysis - Excelsior College OWL - EChristina Bauer
This document provides instructions for creating an account and requesting writing assistance from the HelpWriting.net website. It outlines a 5-step process: 1) Create an account with a password and email; 2) Complete a form with assignment details and attach samples; 3) Review bids from writers and select one; 4) Review the completed paper and authorize payment; 5) Request revisions until satisfied. It emphasizes the bidding system used to match clients with writers and promises original, high-quality work with refunds for plagiarism.
Short Essay For School Students O. Online assignment writing service.Christina Bauer
The document discusses how romance fiction focuses on women's triumphs over inequality, but this is primarily for Caucasian women. It notes that romance fiction seldom extends the same concern to other marginalized groups. While romance fiction aims to challenge justifications for women's subordination, it offers little insight into how sexism intersects with other forms of oppression like racism. The essay will examine how romance novels can incorporate intersectional perspectives by representing diverse protagonists and relationships.
My First Day At Secondary School. Online assignment writing service.Christina Bauer
The document provides instructions for requesting writing assistance from HelpWriting.net. It outlines a 5-step process: 1) Create an account with a password and email. 2) Complete an order form with instructions, sources, and deadline. 3) Review bids from writers and select one. 4) Review the completed paper and authorize payment. 5) Request revisions to ensure satisfaction, with a refund option for plagiarized work. The process aims to match clients with qualified writers to fully meet their needs for original, high-quality content.
Famous Quotes For Essays. 170 Writing Quotes By Famous AutChristina Bauer
The document provides instructions for using the HelpWriting.net service to request that writers complete assignments and papers. It outlines a 5-step process: 1) Create an account, 2) Submit a request with instructions and deadline, 3) Review writer bids and qualifications and select a writer, 4) Review the completed paper and authorize payment, 5) Request revisions if needed, knowing the service guarantees original work or a refund.
Executive Directors Chat Leveraging AI for Diversity, Equity, and InclusionTechSoup
Let’s explore the intersection of technology and equity in the final session of our DEI series. Discover how AI tools, like ChatGPT, can be used to support and enhance your nonprofit's DEI initiatives. Participants will gain insights into practical AI applications and get tips for leveraging technology to advance their DEI goals.
How to Add Chatter in the odoo 17 ERP ModuleCeline George
In Odoo, the chatter is like a chat tool that helps you work together on records. You can leave notes and track things, making it easier to talk with your team and partners. Inside chatter, all communication history, activity, and changes will be displayed.
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Dr. Vinod Kumar Kanvaria
Exploiting Artificial Intelligence for Empowering Researchers and Faculty,
International FDP on Fundamentals of Research in Social Sciences
at Integral University, Lucknow, 06.06.2024
By Dr. Vinod Kumar Kanvaria
How to Make a Field Mandatory in Odoo 17Celine George
In Odoo, making a field required can be done through both Python code and XML views. When you set the required attribute to True in Python code, it makes the field required across all views where it's used. Conversely, when you set the required attribute in XML views, it makes the field required only in the context of that particular view.
বাংলাদেশের অর্থনৈতিক সমীক্ষা ২০২৪ [Bangladesh Economic Review 2024 Bangla.pdf] কম্পিউটার , ট্যাব ও স্মার্ট ফোন ভার্সন সহ সম্পূর্ণ বাংলা ই-বুক বা pdf বই " সুচিপত্র ...বুকমার্ক মেনু 🔖 ও হাইপার লিংক মেনু 📝👆 যুক্ত ..
আমাদের সবার জন্য খুব খুব গুরুত্বপূর্ণ একটি বই ..বিসিএস, ব্যাংক, ইউনিভার্সিটি ভর্তি ও যে কোন প্রতিযোগিতা মূলক পরীক্ষার জন্য এর খুব ইম্পরট্যান্ট একটি বিষয় ...তাছাড়া বাংলাদেশের সাম্প্রতিক যে কোন ডাটা বা তথ্য এই বইতে পাবেন ...
তাই একজন নাগরিক হিসাবে এই তথ্য গুলো আপনার জানা প্রয়োজন ...।
বিসিএস ও ব্যাংক এর লিখিত পরীক্ষা ...+এছাড়া মাধ্যমিক ও উচ্চমাধ্যমিকের স্টুডেন্টদের জন্য অনেক কাজে আসবে ...
it describes the bony anatomy including the femoral head , acetabulum, labrum . also discusses the capsule , ligaments . muscle that act on the hip joint and the range of motion are outlined. factors affecting hip joint stability and weight transmission through the joint are summarized.
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection is now passed the 3 million mark and still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
AMUSED: An Annotation Framework of Multi-modal Social Media Data
Gautam Kishore Shahi (1) and Tim A. Majchrzak (2)
(1) University of Duisburg-Essen, Germany
(2) University of Agder, Norway
gautam.shahi@uni-due.de, timam@uia.no
Abstract. Social media nowadays is both an important news source and a
channel for spreading misinformation. Systematically studying social media
phenomena, however, has been challenging due to the lack of labelled data.
This paper presents the semi-automated annotation framework AMUSED for
gathering multi-lingual, multi-modal annotated data from social networking
sites. The framework is designed to mitigate the workload of collecting and
annotating social media data by cohesively combining machine and human
effort in the data collection process. From a given list of news articles,
AMUSED detects links to social media posts, downloads the data from the
respective social networking sites, and assigns a label to it. The
framework can fetch annotated data from multiple platforms such as Twitter,
YouTube, and Reddit. As a use case, we have implemented AMUSED to collect
COVID-19 misinformation data from different social media sites, based on
8 077 fact-checked articles, and to sort it into four categories of
misinformation.
Keywords: Data Annotation · Social media · Misinformation · News articles ·
Fact-checking
1 Introduction
With the growth of users on different social media sites, social media have
become part of our lives. They play an essential role in making
communication easier and more accessible. People and organisations use
social media to share and browse information, and especially during the
current pandemic social media sites get massive attention from users
[33,23]. Braun and Gillespie [5] conducted a study to analyse the public
discourse on social media sites and news organisations. Social media sites
make it possible to get more attention from users when sharing news or
user-generated content. Several statistical and computational studies have
been conducted using social media data [5]. But data gathering and
annotation are challenging and financially costly [19].
arXiv:2010.00502v2 [cs.SI] 10 Aug 2021
Social media data analytics research poses challenges in data collection,
data sampling, data annotation, data quality, and bias in data [17]. Data
annotation is the process of assigning a category to the data. Researchers
annotate social media data for research on hate speech, misinformation,
online mental health, etc. For supervised machine learning, labelled data
sets are required to understand the input patterns [26]. To build a
supervised or semi-supervised model on social media data, researchers face
two challenges: timely data collection and data annotation [30]. Timely
data collection is essential because some platforms restrict data access,
or the post itself is deleted by the social media platform or by the user
[32]. Data annotation poses another problem; it is conducted either
in-house (within a lab or organisation) or by using a crowdsourcing tool
such as Amazon Mechanical Turk (AMT) [4]. Both approaches require a fair
amount of effort to write the annotation guidelines. There is also a chance
of wrongly labelled data, leading to bias [10].
We propose a semi-automatic framework for data annotation from social media
platforms to address timely data collection and annotation. AMUSED gathers
labelled data from different social media platforms in multiple formats
(text, image, video). It can obtain annotated data on social issues like
misinformation, hate speech, or other critical social scenarios. AMUSED
addresses bias in the data (wrong labels assigned by annotators). Our
contribution is a semi-automatic approach for collecting labelled data from
different social media sites in multiple languages and data formats. Our
framework can be applied in many application domains for which it is
typically hard to gather data, for instance, misinformation or mob
lynching.
This paper is structured as follows. In Section 2 we discuss the background
of our work. We then present the working method of AMUSED in Section 3. In
Section 4 we give details on the implementation of AMUSED based on a case
study, and in Section 5 we report the results. We discuss our observations
in Section 6 and draw a conclusion in Section 7.
2 Background
The following section describes the background on data annotation, the
types of data on social media, and the problems of current annotation
techniques.
2.1 Data Annotation
Much research has been published that uses social media data. Typically, a
single work is limited to a few social media platforms or languages. Also,
the results are published with a limited amount of data. There are multiple
reasons for these limitations; one of the key reasons is the limited
availability of annotated data for the research [36,2]. Chapman et al. [8]
highlight the problem of getting labelled data for NLP-related problems. A
study has been conducted on data quality and the role of the annotator in
the performance of machine learning models. With poor data, it is hard to
build a generalisable classifier [15].
Researchers are dependent on in-house or crowd-based data annotation.
Recently, Alam et al. [3] used a crowd-based annotation technique and asked
people to volunteer for data annotation, but without significant success in
obtaining a large amount of labelled data. The current annotation technique
depends on the background expertise of the annotators. Finding past data on
an incident like mob lynching is challenging because of data restrictions
by social media platforms. It requires looking at a massive number of posts
and news articles, leading to much manual work. In addition, billions of
social media posts are sampled down to a few thousand posts for data
annotation, either by random sampling or by keyword sampling, leading to
sampling bias.
With in-house data annotation, it is challenging to hire an annotator with
background expertise in a domain. Another issue is the development of a
codebook with a proper explanation [13]. The entire process is financially
costly and time-consuming [12]. The problem with crowd-based annotation
tools like AMT is that the low cost may result in wrongly labelled data.
Many annotators may cheat, not perform the job properly, use robots, or
answer randomly [14,25].
Since the emergence of social media as a news resource [7], people have
used this resource in very different ways. They may share news, state a
personal opinion, or commit a social crime in the form of hate speech or
cyberbullying [22]. The COVID-19 pandemic arguably has led to a surge in
the spread of misinformation [28]. Nowadays, journalists cover common
issues like misinformation, mob lynching, and hate speech; they also link
the social media posts in their news articles [11]. To solve the problem of
data collection and annotation, the social media posts linked from news
articles can be used. Labelling social media posts is then done based on
the news article’s contents. To get a reliable label, the credibility of
the news sources must be considered [21]. For example, a professional news
website registered with the International Fact-Checking Network [20]
should, generally, be rather credible.
2.2 Data on Social Media Platforms
Social media sites allow users to create and view posts in multiple
formats. Every day, billions of posts containing images, text, and videos
are shared on social media sites such as Facebook, Twitter, YouTube and
Instagram [1]. Data are available in different formats, and each social
media platform applies restrictions on data crawling. For instance,
Facebook allows crawling data only related to public posts and groups.
Giglietto et al. discuss the requirement of multi-modal data for the study
of social phenomena [16]. Almost every social media platform allows users
to create or respond to social media posts in text. But each social media
platform has a different restriction on the length of the text. The content
and the writing style change with the character limit of the different
social media platforms. Images are also common across different social
media platforms. Platforms have restrictions on the size of the image. Some
platforms are primarily focused on video, whereas some are multi-modal.
Furthermore, for video, there are restrictions in terms of duration. This
influences the characteristics of usage.
2.3 Problems of Current Annotation Techniques
There are several problems with the current annotation approaches. First,
social media platforms restrict users when fetching data, and content can
disappear; for example, a user may delete tweets or videos on YouTube.
Without timely crawling, data access is lost.
Fig. 1. AMUSED: An Annotation Framework for Multi-modal Social Media data
Second, if the volume of data is high, filtering based on several criteria
like keyword, date, location etc., is needed. This filtering degrades the
data quality by excluding much data. For example, if we sample data using
hateful keywords for hate speech, we might lose many hate speech tweets
that do not contain any hateful words.
Third, getting a good annotator is a difficult task. Annotation quality
depends on the background expertise of the person. For crowdsourcing,
maintaining annotation quality is complicated. Moreover, maintaining good
agreement between multiple annotators is tedious. Fourth, the development
of annotation guidelines is tricky. Writing a good codebook requires domain
knowledge and consultation with experts.
Fifth, data annotation is costly and time-consuming [31]. Sixth, social
media content is available in multiple languages, but much research is
limited to English. Data annotation in other languages, especially
under-resourced languages, is difficult due to the lack of experienced
annotators.
3 Method
AMUSED’s elements are summarised in Figure 1. It follows nine steps.
Step 1: Domain Identification The first step is the identification of the
domain in which we want to gather the data. A domain could focus on a par-
ticular public discourse. For example, a domain could be fake news in the US
election, or hate speech in trending hashtags on Twitter. Domain selection helps
to find the relevant data sources.
Element | Definition | Example
News ID | Unique identifier of each news article. We use an acronym for the news source and a number to identify a news article. | PY9
Newssource URL | Unique identifier pointing to the news article. | https://factcheck.afp.com/video-actually-shows-anti-government-protest-belarus
News Title | The title of the news article. | A video shows a rally against coronavirus restrictions in the British capital of London.
Published date | Date when the article was published in online media. | 01 September 2020
News Class | Each fact-check article is published with a class such as false, true, or misleading. We store it in the class column. | False
Published-By | The name of the news website. | AFP, TheQuint
Country | Country where the news article is published. | Australia
Language | Language used for the news article. | English
Table 1. Description of attributes and their examples
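The attributes in Table 1 can be held in a small record type. The following is a minimal sketch rather than the authors' implementation: the class name NewsArticle and its field names are assumptions chosen for illustration, and the values are taken from the example column of Table 1 (with the language written as an ISO 639-1 code).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewsArticle:
    """One fact-checked article; the fields mirror the attributes in Table 1."""
    news_id: str         # acronym of the news source plus a number, e.g. "PY9"
    source_url: str      # URL pointing to the fact-check article
    title: str
    published_date: str  # kept as the raw string shown on the website
    news_class: str      # verdict given by the fact-checker, e.g. "False"
    published_by: str
    country: str
    language: str        # ISO 639-1 code such as "en"


# Hypothetical record assembled from the example column of Table 1.
example = NewsArticle(
    news_id="PY9",
    source_url="https://factcheck.afp.com/video-actually-shows-anti-government-protest-belarus",
    title="A video shows a rally against coronavirus restrictions in the British capital of London.",
    published_date="01 September 2020",
    news_class="False",
    published_by="AFP",
    country="Australia",
    language="en",
)
```

Freezing the dataclass keeps the crawled metadata immutable, so later steps can attach it to social media posts without accidentally changing it.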
Step 2: Data Source Data sources comprise news websites that mention a
particular topic. For example, many news websites have a separate section that
discusses the election or other ongoing issues.
Step 3: Web scraping AMUSED then crawls all news articles from the news
websites using a Python-based crawler. We fetch details such as the
published date, author, location, and news content (see Table 1).
Step 4: Language Identification After getting the details from the news
articles, we check their language. We use ISO 639-1 codes for naming the
languages. Based on the language, we can further filter articles and apply
a language-specific model for finding insights.
Step 5: Social Media Link From the crawled data, we fetch the anchor tags
(<a>) contained in the news content. We then filter the hyperlinks to
identify social media platforms and fetch the unique identifiers of the
posts.
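A minimal sketch of this step, using only the Python standard library, is given here. It is not the authors' crawler: the class AnchorCollector, the function social_media_links, the host list SOCIAL_HOSTS, and the sample HTML are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical, incomplete list of hosts treated as social media platforms.
SOCIAL_HOSTS = {
    "twitter.com": "Twitter",
    "facebook.com": "Facebook",
    "instagram.com": "Instagram",
    "youtube.com": "YouTube",
    "youtu.be": "YouTube",
    "reddit.com": "Reddit",
}


class AnchorCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def social_media_links(html):
    """Return (platform, url) pairs for anchors that point to a known platform."""
    collector = AnchorCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        host = urlparse(href).netloc.lower()
        # Drop one common subdomain prefix such as "www." or "m.".
        for prefix in ("www.", "m.", "mobile."):
            if host.startswith(prefix):
                host = host[len(prefix):]
                break
        platform = SOCIAL_HOSTS.get(host)
        if platform:
            links.append((platform, href))
    return links


sample_html = """
<p>The claim was shared <a href="https://twitter.com/someuser/status/1234567890">here</a>
and debunked by <a href="https://www.who.int/news">the WHO</a>.
See also <a href="https://www.youtube.com/watch?v=abc123">this video</a>.</p>
"""
links = social_media_links(sample_html)
```

Matching on the parsed host rather than on a substring of the whole URL avoids false positives from links that merely mention a platform name in their path.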
Step 6: Social Media Data Crawling We now fetch the data from the
respective social media platform. For this purpose, we built a crawler for
each social media platform, which consumes the unique identifiers obtained
in the previous step. For Twitter, we used a Python crawler based on
Tweepy, which crawls all details about a tweet. We collect the text, time,
likes, retweets, and user details such as name, location, and follower
count. Similarly, we built our own crawlers for other platforms. Due to the
data restrictions of Facebook and Instagram, we use CrowdTangle [34] to
fetch data from these two platforms, but it only gives numerical data like
likes and followers.
Step 7: Data Labelling We assign labels to the social media data based on
the label assigned to the news articles by journalists. News articles often
categorise a social media post, for example, as hate speech or propaganda.
We assign to each social media post the class that the journalist describes
in the news article. For example, suppose a news article a containing
social media post s has been published by a journalist j, and journalist j
has described the social media post s as misinformation. In that case, we
label the social media post s as misinformation. This eases the workload
because the collected social media posts have already been checked by a
journalist.
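The labelling rule of Step 7 amounts to copying the class of an article to every post it links. A minimal sketch, assuming a hypothetical dictionary layout with the keys "news_id", "news_class" and "post_ids" (the IDs "SN3" and "PY12" and all post IDs are invented for the example):

```python
def label_posts(articles):
    """Give every post linked from an article the class of that article."""
    labelled = []
    for article in articles:
        for post_id in article["post_ids"]:
            labelled.append({
                "post_id": post_id,
                "label": article["news_class"],
                "news_id": article["news_id"],  # kept for later verification
            })
    return labelled


articles = [
    {"news_id": "PY9", "news_class": "False", "post_ids": ["1001", "1002"]},
    {"news_id": "SN3", "news_class": "Misleading", "post_ids": ["2001"]},
    {"news_id": "PY12", "news_class": "True", "post_ids": []},  # no linked posts
]
labelled = label_posts(articles)
```

Keeping the article ID next to each label lets a human reviewer trace every label back to the article it came from.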
Step 8: Human Verification To check the correctness, a human verifies the
label assigned to the social media post. If the label is wrongly assigned,
the data is removed from the corpus. This step ensures that the corpus
contains relevant posts with correctly assigned labels. The human can
verify the labels for a random selection of news articles.
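The verification step can be sketched as drawing a random sample for manual checking and then removing the posts whose labels the reviewer rejects. The function names, the fixed seed, and the post layout below are assumptions for illustration only.

```python
import random


def sample_for_verification(posts, k, seed=0):
    """Draw up to k posts at random for manual checking (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(posts, min(k, len(posts)))


def drop_rejected(posts, rejected_ids):
    """Remove posts whose label a human reviewer marked as wrong."""
    rejected = set(rejected_ids)
    return [post for post in posts if post["post_id"] not in rejected]


posts = [{"post_id": str(i), "label": "False"} for i in range(10)]
to_check = sample_for_verification(posts, k=3)
# Suppose the reviewer rejects the first sampled post.
cleaned = drop_rejected(posts, [to_check[0]["post_id"]])
```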
Step 9: Data Enrichment We finally merge the social media data with the
details from the news articles. It helps to accumulate extra information, which
might allow for further analysis.
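Data enrichment is essentially a join between the labelled posts and the article metadata. A minimal sketch follows, assuming a hypothetical layout in which each post carries a "news_id" key and the article details are stored in a dictionary keyed by that ID.

```python
def enrich(posts, articles_by_id):
    """Attach article metadata to each labelled post.

    Posts whose article is missing are kept unchanged rather than dropped.
    """
    enriched = []
    for post in posts:
        article = articles_by_id.get(post["news_id"], {})
        merged = dict(post)
        for key, value in article.items():
            # Prefix article fields so they cannot overwrite post fields.
            merged["news_" + key] = value
        enriched.append(merged)
    return enriched


articles_by_id = {
    "PY9": {"published_by": "AFP", "country": "Australia", "language": "en"},
}
posts = [
    {"post_id": "1001", "news_id": "PY9", "label": "False"},
    {"post_id": "3001", "news_id": "XX1", "label": "False"},  # article not crawled
]
enriched = enrich(posts, articles_by_id)
```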
4 Implementation: A Case Study on Misinformation
While our framework allows for general application, its merits are best
understood by applying it to a specific domain. AMUSED can be helpful for
several domains, but news companies are quite active in the domain of
misinformation, especially during a crisis. Misinformation is often, yet
imprecisely, described as a piece of information that is shared
unintentionally or by mistake, without knowing the truthfulness of the
content [27].
There is an increasing amount of misinformation in the media, social media,
and other web sources; this has become a topic of much research attention
[38]. Nowadays, more than 100 fact-checking websites are working to tackle
the problem of misinformation [9].
People have spread vast amounts of misinformation during the COVID-19
pandemic and in relation to elections and disasters [18]. Due to the lack
of labelled data, it is challenging to analyse the misinformation properly.
As a case study, we apply AMUSED to data annotation for COVID-19
misinformation, following the steps illustrated in the prior section.
Step 1: Domain Identification Out of several possible application domains,
we consider the spread of misinformation in the context of COVID-19.
Misinformation likely worsens the negative effects of the pandemic [28].
The director of the World Health Organization (WHO) considers that we are
not only fighting a pandemic but also an infodemic [35,37]. One of the
fundamental problems is the lack of a sufficient corpus related to the
pandemic [27].
Step 2: Data Sources As data sources, we analysed 25 fact-checking websites
and decided to use Poynter and Snopes. We chose Poynter because it has a
central data hub that collects data from more than 98 fact-checking
websites, while Snopes is not integrated with Poynter but has more than 300
fact-checked articles on COVID-19.
Step 3: Web Scraping In this step, we fetched all the news articles from
Poynter and Snopes.
Step 4: Language Detection We collected data in multiple languages like
English, German, and Hindi. To identify the language of a news article, we
used langdetect, a Python-based library, and applied it to the textual
content of the news article.
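The language check can be sketched as grouping articles by the code a detector returns. To keep the example runnable without third-party packages, the detector is passed in as a parameter; the `detect` function of the langdetect library, which returns codes such as "en" or "de", could be supplied in its place. The helper names, the article layout, and the toy detector below are assumptions for illustration and say nothing about how the authors wired this step.

```python
from collections import defaultdict


def group_by_language(articles, detect):
    """Group article IDs by the language code that `detect` returns for their text.

    `detect` is any callable mapping a string to an ISO 639-1 code.
    Articles for which detection raises an exception are filed under "unknown".
    """
    groups = defaultdict(list)
    for article in articles:
        try:
            code = detect(article["text"])
        except Exception:
            code = "unknown"
        groups[code].append(article["news_id"])
    return dict(groups)


def toy_detect(text):
    """Stand-in detector for this example only; not a real language model."""
    if not text.strip():
        raise ValueError("no text to analyse")
    return "de" if " der " in f" {text.lower()} " else "en"


articles = [
    {"news_id": "PY1", "text": "The video does not show a protest in London."},
    {"news_id": "PY2", "text": "Das Video zeigt nicht den Protest der Anwohner."},
    {"news_id": "PY3", "text": ""},  # empty body, detection fails
]
groups = group_by_language(articles, toy_detect)
```

Filing failures under "unknown" instead of discarding them keeps articles with very short or empty bodies visible for manual inspection.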
Step 5: Social Media Link In the next step, while crawling the HTML, we
filter the URLs from the parsed DOM (Document Object Model) tree. We
analysed the URL patterns of different social media platforms and applied
keyword-based filtering to all hyperlinks in the DOM. For instance, the URL
of each tweet follows the pattern twitter.com/user name/status/tweetid. So,
in the collected hyperlinks, we searched for the keywords “twitter.com” and
“status”. This ensures that we have collected the hyperlinks referring to
tweets. This process is shown in Figure 2.
We followed the same approach for other social media platforms like
Facebook and Instagram. In the next step, we used regular expressions to
extract the unique ID of each social media post.
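The keyword filter and the subsequent ID extraction for tweets can be sketched as follows. The regular expression is an assumption derived from the URL pattern described in the text, not the expression used by the authors, and the sample hyperlinks are invented.

```python
import re

# Pattern assumed from the URL shape described in the text:
# twitter.com/<user name>/status/<tweet id>
TWEET_URL = re.compile(r"twitter\.com/(?P<user>[^/?#]+)/status/(?P<tweet_id>\d+)")


def tweet_ids(hyperlinks):
    """Return the unique tweet IDs found in a list of hyperlinks, in first-seen order."""
    seen = []
    for url in hyperlinks:
        # Cheap keyword filter first; the regex then extracts the ID.
        if "twitter.com" not in url or "status" not in url:
            continue
        match = TWEET_URL.search(url)
        if match and match.group("tweet_id") not in seen:
            seen.append(match.group("tweet_id"))
    return seen


hyperlinks = [
    "https://twitter.com/someuser/status/1234567890?s=20",
    "https://twitter.com/someuser",                      # profile, no status
    "https://mobile.twitter.com/other/status/987654321",
    "https://twitter.com/someuser/status/1234567890",    # duplicate post
    "https://example.org/status/twitter.com",            # keywords but wrong shape
]
ids = tweet_ids(hyperlinks)
```

The last sample link shows why the keyword filter alone is not sufficient: it contains both keywords but does not have the shape of a tweet URL, so only the regular expression rejects it.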
Fig. 2. An illustration of data collection from a social media platform
(Twitter) from a news article [27]
Step 6: Social Media Data Crawling We now have the unique identifier
of each social media post. We built a Python-based program for crawling the
data from the respective social media platform. The summary is given in Table 2.
Step 7: Data Labelling For data labelling, we used the label assigned in
the news articles: we mapped each social media post to its respective news
article and assigned the article’s label to the post. For example, a tweet
extracted from a news article is mapped to the class of the news article.
This process is shown in Figure 3.
Platform | Posts | Unique | Text | Image | Text+Image | Video
Facebook | 5 799 | 3 200 | 1 167 | 567 | 1 006 | 460
Instagram | 385 | 197 | - | 106 | 41 | 52
Pinterest | 5 | 3 | - | 3 | 0 | 0
Reddit | 67 | 33 | 16 | 10 | 7 | 0
TikTok | 43 | 18 | - | - | - | 18
Twitter | 3 142 | 1 758 | 1 300 | 116 | 143 | 199
Wikipedia | 393 | 176 | 106 | 34 | 20 | 16
YouTube | 2 087 | (916) | - | - | - | 916
Table 2. Summary of data collected
Fig. 3. An Illustration for annotation of social media posting using the label mentioned
in the news article.
Step 8: Human Verification We manually checked each social media post to
assess the correctness of the process. We provided the annotator with all
necessary information about the class mapping and asked them to verify it.
For example, in Figure 3, a human opens the news article using the
newssource URL and verifies the label assigned to the tweet. For COVID-19
misinformation, we checked the annotation by randomly choosing 100 social
media posts from each social media platform and comparing the label
assigned to the social media post with the label mentioned in the news
article. We measured the inter-coder reliability using Cohen’s kappa and
obtained values between 0.72 and 0.86, which indicates good agreement. We
further normalised the data labels into false, partially false, true and
others using the definitions mentioned in [27].
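For reference, Cohen's kappa compares the observed agreement p_o between two coders with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below implements this standard definition; the function name and the two label lists are invented for illustration and are not the study's data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each coder's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels from two coders for ten posts.
coder_1 = ["false", "false", "true", "false", "other",
           "false", "true", "false", "false", "other"]
coder_2 = ["false", "false", "true", "true", "other",
           "false", "true", "false", "false", "false"]
kappa = cohens_kappa(coder_1, coder_2)
```

In this toy example the coders agree on 8 of 10 posts (p_o = 0.8) and the chance agreement is p_e = 0.44, which gives a kappa of about 0.64.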
Step 9: Data Enrichment We then enriched the data by providing extra
information about the social media post. The first step is merging the
social media post with the respective news article, which adds information
like textual content, news source, and author. The detailed analysis of the
collected data is discussed in Section 5.
5 Results
For the use case of COVID-19 misinformation, we identified Poynter and
Snopes as the data sources, and we collected data from different social
media platforms. We found that around 51% of the news articles linked their
content to social media websites. Overall, we have collected 8,077
fact-checked news articles from 105 countries in 40 languages. We have
cleaned the hyperlinks collected using the AMUSED framework and filtered
the social media posts by removing the duplicates using the unique
identifier. Finally, we will release the data as open-source.
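Removing duplicates by unique identifier can be sketched as below. Because identifiers are only unique within one platform, the sketch uses the pair (platform, post ID) as the key; the function name and the sample posts are assumptions for illustration. The two counters correspond to the "Posts" and "Unique" columns of Table 2.

```python
from collections import Counter


def deduplicate(posts):
    """Keep the first post seen for each (platform, post_id) pair."""
    seen = set()
    unique = []
    for post in posts:
        key = (post["platform"], post["post_id"])
        if key not in seen:
            seen.add(key)
            unique.append(post)
    return unique


posts = [
    {"platform": "Twitter", "post_id": "111"},
    {"platform": "Twitter", "post_id": "111"},   # same tweet linked twice
    {"platform": "YouTube", "post_id": "111"},   # same ID, different platform
    {"platform": "Twitter", "post_id": "222"},
]
unique_posts = deduplicate(posts)
totals = Counter(post["platform"] for post in posts)
uniques = Counter(post["platform"] for post in unique_posts)
```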
SM Platform | False | Partially False | Other | True
Facebook | 2,776 | 325 | 94 | 6
Instagram | 166 | 28 | 2 | 1
Reddit | 21 | 9 | 2 | 1
Twitter | 1,318 | 234 | 50 | 13
Wikipedia | 154 | 18 | 3 | 1
YouTube | 739 | 164 | 13 | 0
Table 3. Summary of COVID-19 misinformation posts collected.
We plotted the data from those social media platforms that have more than
25 unique posts in Table 3, because platforms with fewer posts distort the
plot distribution. We dropped the plots for Pinterest (3), WhatsApp (23),
TikTok (25), and Reddit (43). The plot shows that most of the social media
posts are from Facebook and Twitter, followed by YouTube, Wikipedia and
Instagram. Table 3 also presents the class distribution of these posts.
Misinformation also appears to follow the COVID-19 situation in many
countries, as the number of social media posts decreased after June 2020. A
possible reason is either that the spread of misinformation has reduced or
that fact-checking websites are no longer focusing on this issue as much as
during the early stage.
6 Discussion
Our study highlighted the process of fetching labelled social media posts
from fact-checked news articles. Usually, a fact-checking website links
social media posts from multiple social media platforms. We tried to gather
data from various social media platforms, but most of the links we found
pointed to Facebook, Twitter, and YouTube. There are few unique posts from
Reddit (21), TikTok (9) etc., which shows that fact-checkers mainly focus
on analysing content from Facebook, Twitter, and YouTube.
Surprisingly, there are only three unique posts from Pinterest, and there
are no data available from Gab, ShareChat, and Snapchat, although Gab is
well known for harmful content and ShareChat is used by people in their
regional languages. Many people use Wikipedia as a reliable source of
information, but there are 393 links from Wikipedia. Hence, overall,
fact-checking websites are limited to some trending social media platforms
like Twitter or Facebook, while social media platforms like Gab and TikTok
are well known for malinformation or misinformation [6]. WhatsApp is an
instant messaging app used among friends or groups of people, so we only
found some hyperlinks that point to public WhatsApp groups. To increase the
visibility of fact-checked articles, journalists can also use the
schema.org vocabulary along with the Microdata, RDFa, or JSON-LD formats to
add details about misinformation to the news articles [29].
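As an illustration of such markup, the sketch below builds a JSON-LD object using the schema.org type ClaimReview. The choice of ClaimReview and of the properties shown is an assumption made for this example, since the text does not name a specific type; the function name, URLs, claim text, and publisher are placeholders.

```python
import json


def claim_review_jsonld(article_url, claim, verdict, publisher, date_published, post_url):
    """Build a schema.org ClaimReview object as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": article_url,
        "datePublished": date_published,  # ISO 8601 date
        "claimReviewed": claim,
        "author": {"@type": "Organization", "name": publisher},
        "reviewRating": {"@type": "Rating", "alternateName": verdict},
        # The reviewed claim, pointing at the social media post where it appeared.
        "itemReviewed": {
            "@type": "Claim",
            "appearance": {"@type": "CreativeWork", "url": post_url},
        },
    }
    return json.dumps(data, indent=2)


# Placeholder values for illustration only.
doc = claim_review_jsonld(
    article_url="https://example.org/fact-check/123",
    claim="A video shows a rally against coronavirus restrictions in London.",
    verdict="False",
    publisher="Example Fact Check",
    date_published="2020-09-01",
    post_url="https://twitter.com/someuser/status/1234567890",
)
parsed = json.loads(doc)
```

Markup of this kind states the verdict and the link to the reviewed post explicitly, so a crawler would not have to infer them from the article's HTML.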
AMUSED requires some effort but is still beneficial compared to random data
annotation, where we need to annotate thousands of social media posts and
the chances of finding misinformation among them are still significantly
lower.
Another aspect is the diversity of social media posts on the different
social media platforms. News articles often mention Facebook, Twitter, and
YouTube, yet only seldom Instagram, Pinterest, and TikTok, and Gab was not
mentioned at all. The reasons for this need to be explored. It would be
interesting to study the propagation of misinformation on different
platforms like TikTok and Gab in relation to the news coverage they get.
Such a cross-platform study would be particularly insightful with
contemporary topics such as misinformation on COVID-19. Such cross-platform
work could also be linked to classification models [26,24].
We have also analysed the multi-modality of the data on the social media
platforms; the number of social media posts is shown in Table 2. We further
classify the misinformation into four different categories, as discussed in
Step 8. The amount of misinformation shared as text is greater than that
shared as video or image. Thus, in Table 3 we present the textual
misinformation in four different categories. Apart from text,
misinformation is also shared as image, video, or in embedded formats such
as image-text.
While applying the AMUSED framework to misinformation on COVID-19, we found
that misinformation spreads across multiple source platforms, but it mainly
circulates on Facebook, Twitter, and YouTube. Our finding suggests
concentrating mitigation efforts on these platforms.
7 Conclusion and Future Work
In this paper, we presented a semi-automatic framework for social media
data annotation. The framework can be applied to several domains like
misinformation, mob lynching, and online abuse. As a part of the framework,
we also used Python-based crawlers for different social media websites.
After data labelling, the labels are cross-checked by a human, which
ensures a two-step verification of data annotation for the social media
posts. We also enrich each social media post by mapping it to the news
article, which provides additional information about the post. We have
implemented the proposed framework for collecting misinformation posts
related to COVID-19. One of the limitations of the framework is that,
presently, we do not address the multiple (possibly contradicting) labels
assigned by different fact-checkers over the same claim.
As future work, the framework can be extended for getting the annotated
data on other topics like hate speech, mob lynching etc. The framework will
be helpful in gathering annotated data for other domains from multiple social
media sites for further analysis.
AMUSED will decrease the labour cost and time for the data annotation
process. Our framework will also increase the data annotation quality because
we crawl the data from news articles published by an expert journalist.
References
1. Aggarwal, C.C.: An introduction to social network data analytics. In: Social net-
work data analytics, pp. 1–15. Springer (2011)
2. Ahmed, S., Pasquier, M., Qadah, G.: Key issues in conducting sentiment analysis
on arabic social media text. In: 2013 9th International Conference on Innovations
in Information Technology (IIT). pp. 72–77. IEEE (2013)
3. Alam, F., Dalvi, F., Shaar, S., Durrani, N., Mubarak, H., Nikolov, A., Martino,
G.D.S., Abdelali, A., Sajjad, H., Darwish, K., et al.: Fighting the covid-19 info-
demic in social media: A holistic perspective and a call to arms. arXiv preprint
arXiv:2007.07996 (2020)
4. Aroyo, L., Welty, C.: Truth is a lie: Crowd truth and the seven myths of human
annotation. AI Magazine 36(1), 15–24 (2015)
5. Braun, J., Gillespie, T.: Hosting the public discourse, hosting the public: When
online news and social media converge. Journalism Practice 5(4), 383–398 (2011)
6. Brennen, J.S., Simon, F., Howard, P.N., Nielsen, R.K.: Types, sources, and claims
of covid-19 misinformation. Reuters Institute 7, 3–1 (2020)
7. Caumont, A.: 12 trends shaping digital news. Pew Research Center 16 (2013)
8. Chapman, W.W., Nadkarni, P.M., Hirschman, L., D’avolio, L.W., Savova, G.K.,
Uzuner, O.: Overcoming barriers to nlp for clinical text: the role of shared tasks
and the need for additional creative solutions (2011)
9. Cherubini, F., Graves, L.: The rise of fact-checking sites in europe. Reuters Institute
for the Study of Journalism, University of Oxford.
http://reutersinstitute.politics.ox.ac.uk/our-research/rise-fact-checking-sites-europe (2016)
10. Cook, P., Stevenson, S.: Automatically identifying changes in the semantic orien-
tation of words. In: LREC (2010)
11. Cui, X., Liu, Y.: How does online news curate linked sources? a content analysis
of three online news media. Journalism 18(7), 852–870 (2017)
12. Duchenne, O., Laptev, I., Sivic, J., Bach, F., Ponce, J.: Automatic annotation of
human actions in video. In: 2009 IEEE 12th International Conference on Computer
Vision. pp. 1491–1498. IEEE (2009)
13. Forbush, T.B., Shen, S., South, B.R., DuValla, S.L.: What a catch! traits that
define good annotators. Studies in health technology and informatics 192, 1213–
1213 (2013)
14. Fort, K., Adda, G., Cohen, K.B.: Amazon mechanical turk: Gold mine or coal
mine? Computational Linguistics 37(2), 413–420 (2011)
15. Geiger, R.S., Yu, K., Yang, Y., Dai, M., Qiu, J., Tang, R., Huang, J.: Garbage
in, garbage out? do machine learning application papers in social computing re-
port where human-labeled training data comes from? In: Proceedings of the 2020
Conference on Fairness, Accountability, and Transparency. pp. 325–336 (2020)
16. Giglietto, F., Rossi, L., Bennato, D.: The open laboratory: Limits and possibili-
ties of using facebook, twitter, and youtube as a research data source. Journal of
technology in human services 30(3-4), 145–159 (2012)
17. Grant-Muller, S.M., Gal-Tzur, A., Minkov, E., Nocera, S., Kuflik, T., Shoor, I.:
Enhancing transport data collection through social media sources: methods, chal-
lenges and opportunities for textual data. IET Intelligent Transport Systems 9(4),
407–417 (2014)
18. Gupta, A., Lamba, H., Kumaraguru, P., Joshi, A.: Faking sandy: characterizing
and identifying fake images on twitter during hurricane sandy. In: Proceedings of
the 22nd international conference on World Wide Web. pp. 729–736 (2013)
19. Haertel, R.A.: Practical cost-conscious active learning for data annotation in
annotator-initiated environments. Brigham Young University-Provo (2013)
20. Institute, P.: The International Fact-Checking Network (2020), https://www.
poynter.org/ifcn/
21. Kohring, M., Matthes, J.: Trust in news media: Development and validation of a
multidimensional scale. Communication research 34(2), 231–252 (2007)
22. Mandl, T., Modha, S., Shahi, G.K., Jaiswal, A.K., Nandini, D., Patel, D.,
Majumder, P., Schäfer, J.: Overview of the HASOC track at FIRE 2020: Hate speech
and offensive content identification in indo-european languages. In: Mehta, P.,
Mandl, T., Majumder, P., Mitra, M. (eds.) Working Notes of FIRE 2020. CEUR
Workshop Proceedings, vol. 2826, pp. 87–111. CEUR-WS.org (2020)
23. McGahan, C., Katsion, J.: Secondary communication crisis: Social media news
information. Liberty University Research Week (2021)
24. Nandini, D., Capecci, E., Koefoed, L., Laña, I., Shahi, G.K., Kasabov, N.:
Modelling and analysis of temporal gene expression data using spiking neural
networks. In: International Conference on Neural Information Processing.
pp. 571–581. Springer (2018)
25. Sabou, M., Bontcheva, K., Derczynski, L., Scharl, A.: Corpus annotation through
crowdsourcing: Towards best practice guidelines. In: LREC. pp. 859–866 (2014)
26. Shahi, G.K., Bilbao, I., Capecci, E., Nandini, D., Choukri, M., Kasabov, N.:
Analysis, classification and marker discovery of gene expression data with evolving
spiking neural networks. In: International Conference on Neural Information
Processing. pp. 517–527. Springer (2018)
27. Shahi, G.K., Dirkson, A., Majchrzak, T.A.: An exploratory study of covid-19
misinformation on twitter. Online social networks and media p. 100104 (2021)
28. Shahi, G.K., Nandini, D.: FakeCovid – a multilingual cross-domain fact check news
dataset for covid-19. In: Workshop Proceedings of the 14th International AAAI
Conference on Web and Social Media (2020),
http://workshop-proceedings.icwsm.org/pdf/2020_14.pdf
29. Shahi, G.K., Nandini, D., Kumari, S.: Inducing schema.org markup from natural
language context. Kalpa Publications in Computing 10, 38–42 (2019)
30. Shu, K., Sliva, A., Wang, S., Tang, J., Liu, H.: Fake news detection on social media:
A data mining perspective. ACM SIGKDD explorations newsletter 19(1), 22–36
(2017)
31. Sorokin, A., Forsyth, D.: Utility data annotation with amazon mechanical turk. In:
2008 IEEE computer society conference on computer vision and pattern recognition
workshops. pp. 1–8. IEEE (2008)
32. Stieglitz, S., Mirbabaie, M., Ross, B., Neuberger, C.: Social media analytics –
challenges in topic discovery, data collection, and data preparation. International
journal of information management 39, 156–168 (2018)
33. Talwar, S., Dhir, A., Kaur, P., Zafar, N., Alrasheedy, M.: Why do people share
fake news? associations between the dark side of social media use and fake news
sharing behavior. Journal of Retailing and Consumer Services 51, 72–82 (2019)
34. Team, C.: Crowdtangle. facebook, menlo park, california, united states (2020)
35. The Guardian: The WHO v coronavirus: why it can’t handle the pandemic (2020),
https://www.theguardian.com/news/2020/apr/10/world-health-organization-who-v-coronavirus-why-it-cant-handle-pandemic
36. Thorson, K., Driscoll, K., Ekdale, B., Edgerly, S., Thompson, L.G., Schrock, A.,
Swartz, L., Vraga, E.K., Wells, C.: Youtube, twitter and the occupy movement:
Connecting content and circulation practices. Information, Communication &
Society 16(3), 421–451 (2013)
37. Zarocostas, J.: World Report How to fight an infodemic. The Lancet 395, 676
(2020). https://doi.org/10.1016/S0140-6736(20)30461-X
38. Zhou, X., Zafarani, R.: Fake news: A survey of research, detection methods, and
opportunities. CoRR abs/1812.00315 (2018), http://arxiv.org/abs/1812.00315