Abstract: Characterizing information diffusion on social platforms like Twitter enables us to understand the properties of underlying media and model communication patterns. As Twitter gains in popularity, it has also become a venue to broadcast rumors and misinformation. We use epidemiological models to characterize information cascades in twitter resulting from both news and rumors. Specifically, we use the SEIZ enhanced epidemic model that explicitly recognizes skeptics to characterize eight events across the world and spanning a range of event types. We demonstrate that our approach is accurate at capturing diffusion in these events. Our approach can be fruitfully combined with other strategies that use content modeling and graph theoretic features to detect (and possibly disrupt) rumors.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Slides: Epidemiological Modeling of News and Rumors on TwitterParang Saraf
Abstract: Characterizing information diffusion on social platforms like Twitter enables us to understand the properties of underlying media and model communication patterns. As Twitter gains in popularity, it has also become a venue to broadcast rumors and misinformation. We use epidemiological models to characterize information cascades in twitter resulting from both news and rumors. Specifically, we use the SEIZ enhanced epidemic model that explicitly recognizes skeptics to characterize eight events across the world and spanning a range of event types. We demonstrate that our approach is accurate at capturing diffusion in these events. Our approach can be fruitfully combined with other strategies that use content modeling and graph theoretic features to detect (and possibly disrupt) rumors.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
EMBERS AutoGSR: Automated Coding of Civil Unrest EventsParang Saraf
Abstract: We describe the EMBERS AutoGSR system that conducts automated coding of civil unrest events from news articles published in multiple languages. The nuts and bolts of the AutoGSR system constitute an ecosystem of filtering, ranking, and recommendation models to determine if an article reports a civil unrest event and, if so, proceed to identify and encode specific characteristics of the civil unrest event such as the when, where, who, and why of the protest. AutoGSR is a deployed system for the past 6 months continually processing data 24x7 in languages such as Spanish, Portuguese, English and encoding civil unrest events in 10 countries of Latin America: Argentina, Brazil, Chile, Colombia, Ecuador, El Salvador, Mexico, Paraguay, Uruguay, and Venezuela. We demonstrate the superiority of AutoGSR over both manual approaches and other state-of-the-art en- coding systems for civil unrest.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Slides: Epidemiological Modeling of News and Rumors on TwitterParang Saraf
Abstract: Characterizing information diffusion on social platforms like Twitter enables us to understand the properties of underlying media and model communication patterns. As Twitter gains in popularity, it has also become a venue to broadcast rumors and misinformation. We use epidemiological models to characterize information cascades in twitter resulting from both news and rumors. Specifically, we use the SEIZ enhanced epidemic model that explicitly recognizes skeptics to characterize eight events across the world and spanning a range of event types. We demonstrate that our approach is accurate at capturing diffusion in these events. Our approach can be fruitfully combined with other strategies that use content modeling and graph theoretic features to detect (and possibly disrupt) rumors.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
EMBERS AutoGSR: Automated Coding of Civil Unrest EventsParang Saraf
Abstract: We describe the EMBERS AutoGSR system that conducts automated coding of civil unrest events from news articles published in multiple languages. The nuts and bolts of the AutoGSR system constitute an ecosystem of filtering, ranking, and recommendation models to determine if an article reports a civil unrest event and, if so, proceed to identify and encode specific characteristics of the civil unrest event such as the when, where, who, and why of the protest. AutoGSR is a deployed system for the past 6 months continually processing data 24x7 in languages such as Spanish, Portuguese, English and encoding civil unrest events in 10 countries of Latin America: Argentina, Brazil, Chile, Colombia, Ecuador, El Salvador, Mexico, Paraguay, Uruguay, and Venezuela. We demonstrate the superiority of AutoGSR over both manual approaches and other state-of-the-art en- coding systems for civil unrest.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Toward Formal Reasoning with Epistemic Policies about Information Quality i...Brian Ulicny
Presentation by VIStology Inc. at Fusion 2011 Conference, Chicago, IL. July 2011.
Automatically evaluating the reliability and credibility of messages on Twitter.
FRAMEWORK FOR ANALYZING TWITTER TO DETECT COMMUNITY SUSPICIOUS CRIME ACTIVITYcscpconf
This research work discusses how an integrated open source intelligence framework can help the law enforcements and government entities who are investigating crimes based on statistical and graph analysis on Twitter data. The solution supports a real-time and off-line analysis of the tweets collections. The framework employs tools that support big data processing capabilities, to collect, process and analyze a huge amount of data. The outline solution supports content and textual based analysis, helping the investigators to dig into a person and the community linked to that person based on a tweet. Our solution supports an investigative processes composed of the following phases (i) find suspicious tweets and individuals based on hash tags analysis (ii) classify the user profile based on Twitter features (iii) identify influencers in the FOAF networks of the senders (iiii) analyze these influencers’ background and history to find hints of past or current criminal activity.
Using Tweets for Understanding Public Opinion During U.S. Primaries and Predi...Monica Powell
Abstract
Using social media for political analysis, especially during elections, has become popular in the past few years where many researchers and media now use social media to understand the public opinion and current trends. In this paper, we investigate methods for using Twitter to analyze public opinion and to predict U.S. Presidential Primary Election results. We analyzed over 13 million tweets from February 2016 to April 2016 during the primary elections, and we looked at tweets that mentioned either Hillary Clin- ton, Bernie Sanders, Donald Trump or Ted Cruz. First, we use the methods of sentiment analysis, geospatial analysis, net- work analysis, and visualizations tools to examine public opinion on twitter. We then use the twitter data and analysis results to propose a prediction model for predicting primary election results. Our results highlight the feasibility of using social media to look at public opinion and predict election results.
Predictions of links in graphs based on content and information propagations.
Lecture for the M. Sc. Data Science, Sapienza University of Rome, Spring 2016.
This project has been realized during the 2015-2016 master “Business Intelligence and Big Data Analytics” at Università di Milano - Bicocca.
Authors: Marco Fusi @marco_fusi, Raffaele Lorusso @rlorusso76
In the age of social media communication, it is easy to
modulate the minds of users and also instigate violent
actions being taken by them in some cases. There is a need
to have a system that can analyze the threat level of tweets
from influential users and rank their Twitter handles so
that dangerous tweets can be avoided going public on
Twitter before fact-checking which can hurt the sentiments
of people and can take the shape of violence. The study
aims to analyse and rank twitter users according to their
influential power and extremism of their tweets to help
prevent major protests and violent events. We scraped top
trending topics and fetched tweets using those hashtags.
We propose a custom ranking algorithm which considers
source based and content based features along with a
knowledge graph which generates the score and rank the
twitter users according to the scores. Our aim with this
study is to identify and rank extremist twitter users with
regards to their impact and influence. We use a technique
that takes into consideration both source based and
content-based features of tweets to generate the ranking of
the extremist twitter users having a high impact factor
Embeddings-Based Clustering for Target Specific StancesAmmar Rashed
We propose an unsupervised user stance detection method to capture fine grained divergences in a community across various topics. We employ pre-trained universal sentence encoders to represent users based on the content of their tweets on a particular topic. User vectors are projected onto a lower dimensional space using UMAP, then clustered using HDBSCAN. Our method performs better than previous approaches on two datasets in different domains, achieving precision and recall scores ranging between 0.89 and 0.97. We compiled a dataset of more than 300k tweets about UEFA Super Cup's 2019 final, and tagged 12k users as Liverpool FC or Chelsea FC fans. We utilized our method to analyze the stances of Twitter users noting a correlation between user stances towards various polarizing issues. We used the resultant clusters to quantify the polarization in various topics, and analyze the semantic divergence between clusters.
Computing Social Score of Web Artifacts - IRE Major Project Spring 2015Amar Budhiraja
Final Presentation on Computing Social Scores of Web Artifacts - Major Project IRE Spring 2015
Web artifacts are items like news articles, web pages, videos, pinterest pints,tweets or URLs.
The task is to find the social score of an artifact (something like popularity). This is quite trivial because it can be done using features like “number of times the item has been shared”, “number of people who like or dislike the item” and so on.
The challenge is to find the social score of an item, if it has been shared on multiple places. For example a URL shared on facebook and twitter. For the purpose of this study, we mined tweets and Facebook posts of time frame and created an engine out of SOLR and Lucene to accomplish the desired goals.
Information Contagion through Social Media: Towards a Realistic Model of the ...Axel Bruns
Paper by Axel Bruns, Patrik Wikström, Peta Mitchell, Brenda Moon, Felix Münch, Lucia Falzon, and Lucy Resnyansky presented at the ACSPRI 2016 conference, Sydney, 19-22 July 2016/
A general stochastic information diffusion model in social networks based on ...IJCNCJournal
Social networks are an important infrastructure for information, viruses and innovations propagation. Since users’
behavior has influenced by other users’ activity, some groups of people would be made regard to similarity of users’
interests. On the other hand, dealing with many events in real worlds, can be justified in social networks; spreading
disease is one instance of them. People’s manner and infection severity are more important parameters in
dissemination of diseases. Both of these reasons derive, whether the diffusion leads to an epidemic or not. SIRS is a
hybrid model of SIR and SIS disease models to spread contamination. A person in this model can be returned to
susceptible state after it removed. According to communities which are established on the social network, we use the
compartmental type of SIRS model. During this paper, a general compartmental information diffusion model would
be proposed and extracted some of the beneficial parameters to analyze our model. To adapt our model to realistic
behaviors, we use Markovian model, which would be helpful to create a stochastic manner of the proposed model.
In the case of random model, we can calculate probabilities of transaction between states and predicting value of
each state. The comparison between two mode of the model shows that, the prediction of population would be
verified in each state.
Toward Formal Reasoning with Epistemic Policies about Information Quality i...Brian Ulicny
Presentation by VIStology Inc. at Fusion 2011 Conference, Chicago, IL. July 2011.
Automatically evaluating the reliability and credibility of messages on Twitter.
FRAMEWORK FOR ANALYZING TWITTER TO DETECT COMMUNITY SUSPICIOUS CRIME ACTIVITYcscpconf
This research work discusses how an integrated open source intelligence framework can help the law enforcements and government entities who are investigating crimes based on statistical and graph analysis on Twitter data. The solution supports a real-time and off-line analysis of the tweets collections. The framework employs tools that support big data processing capabilities, to collect, process and analyze a huge amount of data. The outline solution supports content and textual based analysis, helping the investigators to dig into a person and the community linked to that person based on a tweet. Our solution supports an investigative processes composed of the following phases (i) find suspicious tweets and individuals based on hash tags analysis (ii) classify the user profile based on Twitter features (iii) identify influencers in the FOAF networks of the senders (iiii) analyze these influencers’ background and history to find hints of past or current criminal activity.
Using Tweets for Understanding Public Opinion During U.S. Primaries and Predi...Monica Powell
Abstract
Using social media for political analysis, especially during elections, has become popular in the past few years where many researchers and media now use social media to understand the public opinion and current trends. In this paper, we investigate methods for using Twitter to analyze public opinion and to predict U.S. Presidential Primary Election results. We analyzed over 13 million tweets from February 2016 to April 2016 during the primary elections, and we looked at tweets that mentioned either Hillary Clin- ton, Bernie Sanders, Donald Trump or Ted Cruz. First, we use the methods of sentiment analysis, geospatial analysis, net- work analysis, and visualizations tools to examine public opinion on twitter. We then use the twitter data and analysis results to propose a prediction model for predicting primary election results. Our results highlight the feasibility of using social media to look at public opinion and predict election results.
Predictions of links in graphs based on content and information propagations.
Lecture for the M. Sc. Data Science, Sapienza University of Rome, Spring 2016.
This project has been realized during the 2015-2016 master “Business Intelligence and Big Data Analytics” at Università di Milano - Bicocca.
Authors: Marco Fusi @marco_fusi, Raffaele Lorusso @rlorusso76
In the age of social media communication, it is easy to
modulate the minds of users and also instigate violent
actions being taken by them in some cases. There is a need
to have a system that can analyze the threat level of tweets
from influential users and rank their Twitter handles so
that dangerous tweets can be avoided going public on
Twitter before fact-checking which can hurt the sentiments
of people and can take the shape of violence. The study
aims to analyse and rank twitter users according to their
influential power and extremism of their tweets to help
prevent major protests and violent events. We scraped top
trending topics and fetched tweets using those hashtags.
We propose a custom ranking algorithm which considers
source based and content based features along with a
knowledge graph which generates the score and rank the
twitter users according to the scores. Our aim with this
study is to identify and rank extremist twitter users with
regards to their impact and influence. We use a technique
that takes into consideration both source based and
content-based features of tweets to generate the ranking of
the extremist twitter users having a high impact factor
Embeddings-Based Clustering for Target Specific StancesAmmar Rashed
We propose an unsupervised user stance detection method to capture fine grained divergences in a community across various topics. We employ pre-trained universal sentence encoders to represent users based on the content of their tweets on a particular topic. User vectors are projected onto a lower dimensional space using UMAP, then clustered using HDBSCAN. Our method performs better than previous approaches on two datasets in different domains, achieving precision and recall scores ranging between 0.89 and 0.97. We compiled a dataset of more than 300k tweets about UEFA Super Cup's 2019 final, and tagged 12k users as Liverpool FC or Chelsea FC fans. We utilized our method to analyze the stances of Twitter users noting a correlation between user stances towards various polarizing issues. We used the resultant clusters to quantify the polarization in various topics, and analyze the semantic divergence between clusters.
Computing Social Score of Web Artifacts - IRE Major Project Spring 2015Amar Budhiraja
Final Presentation on Computing Social Scores of Web Artifacts - Major Project IRE Spring 2015
Web artifacts are items like news articles, web pages, videos, pinterest pints,tweets or URLs.
The task is to find the social score of an artifact (something like popularity). This is quite trivial because it can be done using features like “number of times the item has been shared”, “number of people who like or dislike the item” and so on.
The challenge is to find the social score of an item, if it has been shared on multiple places. For example a URL shared on facebook and twitter. For the purpose of this study, we mined tweets and Facebook posts of time frame and created an engine out of SOLR and Lucene to accomplish the desired goals.
Information Contagion through Social Media: Towards a Realistic Model of the ...Axel Bruns
Paper by Axel Bruns, Patrik Wikström, Peta Mitchell, Brenda Moon, Felix Münch, Lucia Falzon, and Lucy Resnyansky presented at the ACSPRI 2016 conference, Sydney, 19-22 July 2016/
A general stochastic information diffusion model in social networks based on ...IJCNCJournal
Social networks are an important infrastructure for information, viruses and innovations propagation. Since users’
behavior has influenced by other users’ activity, some groups of people would be made regard to similarity of users’
interests. On the other hand, dealing with many events in real worlds, can be justified in social networks; spreading
disease is one instance of them. People’s manner and infection severity are more important parameters in
dissemination of diseases. Both of these reasons derive, whether the diffusion leads to an epidemic or not. SIRS is a
hybrid model of SIR and SIS disease models to spread contamination. A person in this model can be returned to
susceptible state after it removed. According to communities which are established on the social network, we use the
compartmental type of SIRS model. During this paper, a general compartmental information diffusion model would
be proposed and extracted some of the beneficial parameters to analyze our model. To adapt our model to realistic
behaviors, we use Markovian model, which would be helpful to create a stochastic manner of the proposed model.
In the case of random model, we can calculate probabilities of transaction between states and predicting value of
each state. The comparison between two mode of the model shows that, the prediction of population would be
verified in each state.
Evolution and Influence Measurement Association Information NetworkEditor IJCATR
This paper mainly using SIR model, based on least squares polynomial fitting GM (1,1) model, Information flow reasonable
quantitative analysis and evaluation of the impact of information and the public interest and public opinion to make a qualitative
analysis.First, the study of journalism and communication problems model. Using SIR model divided population into three categories:
Communicators, unknown person and rem. Then contra the news spread found the SIR model. Through qualitative analysis get
rationality of SIR model. Then get specific data unknown persons in accordance with the SIR model was built using the least squares
polynomial fitting based on those data, the unknown polynomial fitting.If the data is in line with the model it can be considered as
news data, otherwise, the data is not news.Besides, to validate the reliability of model and to forecast the trend of the information
flows today. Getting a group of reliable news media data through the local statistical offices personnel is consistent with the model.
Since there is no question as to the data to obtain information dissemination five times, so this article by setting the initial value of SIR
model, Since there is no question as to get information dissemination five periods of data, so this article by setting the initial value of
SIR model to obtain information dissemination five periods of data, and then use the GM (1,1) model predict unknown persons data
for each day, obtained dissemination of information specific data of each day. Whereby the trend today of unknown people the number
of information dissemination, through testing the GM (1,1) model, it showed that the use of GM (1,1) model to predict today's
information communication situation is entirely possible.
Comprehensive Social Media Security Analysis & XKeyscore Espionage TechnologyCSCJournals
Social networks can offer many services to the users for sharing activities events and their ideas. Many attacks can happened to the social networking websites due to trust that have been given by the users. Cyber threats are discussed in this paper. We study the types of cyber threats, classify them and give some suggestions to protect social networking websites of variety of attacks. Moreover, we gave some antithreats strategies with future trends.
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERINGijaia
Unsupervised machine learning techniques such as clustering are widely gaining use with the recent increase in social communication platforms like Twitter and Facebook. Clustering enables the finding of patterns in these unstructured datasets. We collected tweets matching hashtags linked to COVID-19 from a Kaggle dataset. We compared the performance of nine clustering algorithms using this dataset. We evaluated the generalizability of these algorithms using a supervised learning model. Finally, using a selected unsupervised learning algorithm we categorized the clusters. The top five categories are Safety,
Crime, Products, Countries and Health. This can prove helpful for bodies using large amount of Twitter data needing to quickly find key points in the data before going into further classification.
Categorizing 2019-n-CoV Twitter Hashtag Data by Clusteringgerogepatton
Unsupervised machine learning techniques such as clustering are widely gaining use with the recent increase in social communication platforms like Twitter and Facebook. Clustering enables the finding of patterns in these unstructured datasets. We collected tweets matching hashtags linked to COVID-19 from a Kaggle dataset. We compared the performance of nine clustering algorithms using this dataset. We evaluated the generalizability of these algorithms using a supervised learning model. Finally, using a selected unsupervised learning algorithm we categorized the clusters. The top five categories are Safety, Crime, Products, Countries and Health. This can prove helpful for bodies using large amount of Twitter data needing to quickly find key points in the data before going into further classification.
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERINGgerogepatton
Unsupervised machine learning techniques such as clustering are widely gaining use with the recent increase in social communication platforms like Twitter and Facebook. Clustering enables the finding of patterns in these unstructured datasets. We collected tweets matching hashtags linked to COVID-19 from a Kaggle dataset. We compared the performance of nine clustering algorithms using this dataset. We evaluated the generalizability of these algorithms using a supervised learning model. Finally, using a selected unsupervised learning algorithm we categorized the clusters. The top five categories are Safety, Crime, Products, Countries and Health. This can prove helpful for bodies using large amount of Twitter data needing to quickly find key points in the data before going into further classification.
International Journal of HRM and Organizational Behavior (IJHRMOB) is an online International Journal published half yearly. It is a peer reviewed journal aiming to communicate high quality original research work, reviews, in the fields of Human Resource Management and Organizational Behavior. The Journal publishes research and review articles.
https://ijhrmob.com/
Hyderabad Telangana
Enhancing prediction of user stance for social networks rumorsIJECEIAES
The spread of social media has led to a massive change in the way information is dispersed. It provides organizations and individuals wider opportunities of collaboration. But it also causes an emergence of malicious users and attention seekers to spread rumors and fake news. Understanding user stances in rumor posts is very important to identify the veracity of the underlying content as news becomes viral in a few seconds which can lead to mass panic and confusion. In this paper, different machine learning techniques were utilized to enhance the user stance prediction through a conversation thread towards a given rumor on Twitter platform. We utilized both conversation thread features as well as features related to users who participated in this conversation, in order to predict the users’ stances, in terms of supporting, denying, querying, or commenting (SDQC), towards the source tweet. Furthermore, different datasets for the stance-prediction task were explored to handle the data imbalance problem and data augmentation for minority classes was applied to enhance the results. The proposed framework outperforms the state-of-the-art results with macro F1-score of 0.7233
How does fakenews spread understanding pathways of disinformation spread thro...Araz Taeihagh
What are the pathways for spreading disinformation on social media platforms? This article addresses this question by collecting, categorising, and situating an extensive body of research on how application programming interfaces (APIs) provided by social media platforms facilitate the spread of disinformation. We first examine the landscape of official social media APIs, then perform quantitative research on the open-source code repositories GitHub and GitLab to understand the usage patterns of these APIs. By inspecting the code repositories, we classify developers' usage of the APIs as official and unofficial, and further develop a four-stage framework characterising pathways for spreading disinformation on social media platforms. We further highlight how the stages in the framework were activated during the 2016 US Presidential Elections, before providing policy recommendations for issues relating to access to APIs, algorithmic content, advertisements, and suggest rapid response to coordinate campaigns, development of collaborative, and participatory approaches as well as government stewardship in the regulation of social media platforms.
Expelling Information of Events from Critical Public Space using Social Senso...ijtsrd
Open foundation frameworks give a significant number of the administrations that are basic to the wellbeing, working, and security of society. A considerable lot of these frameworks, in any case, need persistent physical sensor checking to have the option to recognize disappointment occasions or harm that has struck these frameworks. We propose the utilization of social sensor enormous information to recognize these occasions. We center around two primary framework frameworks, transportation and vitality, and use information from Twitter streams to identify harm to spans, expressways, gas lines, and power foundation. Through a three step filtering approach and assignment to geographical cells, we are able to filter out noise in this data to produce relevant geo located tweets identifying failure events. Applying the strategy to real world data, we demonstrate the ability of our approach to utilize social sensor big data to detect damage and failure events in these critical public infrastructures. Samatha P. K | Dr. Mohamed Rafi "Expelling Information of Events from Critical Public Space using Social Sensor Big Data" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd25350.pdfPaper URL: https://www.ijtsrd.com/engineering/computer-engineering/25350/expelling-information-of-events-from-critical-public-space-using-social-sensor-big-data/samatha-p-k
Assessment of the main features of the model of dissemination of information ...IJECEIAES
Social networks provide a fairly wide range of data that allows one way or another to evaluate the effect of the dissemination of information. This article presents the results of a study that describes methods for determining the key parameters of the model needed to analyze and predict the dissemination of information in social networks. An approach based on the analysis of statistical data on user behavior in social networks is proposed. The process of evaluating the main features of the model is described, including the mathematical methods used for data analysis and information dissemination modeling. The study aims to understand the processes of information dissemination in social networks and develop recommendations for the effective use of social networks as a communication and brand promotion tool, as well as to consider the analytical properties of the classical susceptible-infected-removed (SIR) model and evaluate its applicability to the problem of information dissemination. The results of the study can be used to create algorithms and techniques that will effectively manage the process of information dissemination in social networks.
The Email Analyzer interface helps an analyst visualize the email network and identify local group of people who frequently exchange emails amongst themselves. This interface was developed as an entry for VAST Challenge 2014.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Slides: Safeguarding Abila through Multiple Data PerspectivesParang Saraf
Abstract: This paper introduces a system for visual analysis of news articles, emails, GPS tracking data, financial transactions and streaming micro-blog data. The system was developed in response to the 2014 VAST Grand Challenge and comprises of several interfaces for mining textual, network, spatio-temporal, financial, and streaming data.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Abstract: This paper introduces a system for visualization and analysis of geo-spatial and temporal data from call center and microblog sources. The system provides a streaming client for interacting with the data in real time. The system presents the data in a partitioned format using coordinated visualizations that allows the analyst to view the data in multiple dimension simultaneously. This allows the user to see patterns that occur in space and time. This project was developed in response to the VAST 2014 Mini-Challenge 3.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Abstract: This paper introduces a system for visual analysis of GPS tracking and financial data. The system was developed in response to VAST Mini-Challenge 2 and comprises of different interfaces for mining spatio-temporal and financial data.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Abstract: This paper introduces a system for visual analysis of news articles and emails. The system was developed in response to VAST MiniChallenge 1 and comprises different interfaces for mining textual data and network data.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
The News Analyzer interface provides novel ways to visualize and analyze news articles collected from different sources. An analyst can identify trending entities like name of people, organizations and locations in the news articles and can also see how the corresponding trends have evolved over time. This was developed as an entry for VAST Challenge 2014.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...Parang Saraf
Abstract: EMBERS is an anticipatory intelligence system forecasting population-level events in multiple countries of Latin America. A deployed system from 2012, EMBERS has been generating alerts 24x7 by ingesting a broad range of data sources including news, blogs, tweets, machine coded events, currency rates, and food prices. In this paper, we describe our experiences operating EMBERS continuously for nearly 4 years, with specific attention to the discoveries it has enabled, correct as well as missed forecasts, and lessons learnt from participating in a forecasting tournament including our perspectives on the limits of forecasting and ethical considerations.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Slides: Forex-Foreteller: Currency Trend Modeling using News ArticlesParang Saraf
Abstract: Financial markets are quite sensitive to unanticipated news and events. Identifying the effect of news on the market is a challenging task. In this demo, we present Forex-foreteller (FF) which mines news articles and makes forecasts about the movement of foreign currency markets. The system uses a combination of language models, topic clustering, and sentiment analysis to identify relevant news articles. These articles along with the historical stock index and currency exchange values are used in a linear regression model to make forecasts. The system has an interactive visualizer designed specifically for touch-sensitive devices which depicts forecasts along with the chronological news events and financial data used for making the forecasts.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Parang Saraf
Abstract: Topic modeling techniques have been widely used to uncover dominant themes hidden inside an unstructured document collection. Though these techniques first originated in the probabilistic analysis of word distributions, many deep learning approaches have been adopted recently. In this paper, we propose a novel neural network based architecture that produces distributed representation of topics to capture topical themes in a dataset. Unlike many state-of-the-art techniques for generating distributed representation of words and documents that directly use neighboring words for training, we leverage the outcome of a sophisticated deep neural network to estimate the topic labels of each document. The networks, for topic modeling and generation of distributed representations, are trained concurrently in a cascaded style with better runtime without sacrificing the quality of the topics. Empirical studies reported in the paper show that the distributed representations of topics represent intuitive themes using smaller dimensions than conventional topic modeling approaches.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
EMBERS AutoGSR: Automated Coding of Civil Unrest EventsParang Saraf
EMBERS AutoGSR is a novel, web based framework that generates a comprehensive database of validated civil unrest events using minimal human effort. AutoGSR is a deployed system for the past 6 months that is continually processing data 24X7 in an automated fashion. The system extracts civil unrest events of type "who protested where, when and why?" from news articles published in over 7 languages, and collected from 16 countries.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
DMAP: Data Aggregation and Presentation FrameworkParang Saraf
DMAP (Data Mining and Automation for Platforms) is an online framework that presents a wide variety of official data, news and information about companies. It was developed with an aim to act as a one-stop platform for displaying everything official related to a company and its competitors. It aggregates data from the following sources: Bing News, Companys' official blogs, RSS Sources, Facebook, Twitter, Google Trends, Crunchbase, Financial Data Sources, and Alexa Web Analytics.
The aggregated information is shown in an intuitive fashion that allows a user to perform exploratory analysis of a particular company. Specifically, the interface makes it easy to do comparative analysis of a company with respect to its competitors. The user can use filters to select a given set of companies, data sources and date ranges. All the information is presented within the context of the framework such that the user doesn't have to go to different domains. The fetched news articles are cleaned, enriched, and presented in a manner that allows an analyst to navigate through news articles using named entities. For each news article, named entities: people, location, organization, and name of companies are extracted. For a given search query, the interface returns matching news articles along with associated entities which are shown using word clouds. This allows for easy discovery of connections between entities.
The framework was developed as a part of the 2015 summer internship with The Center for Global Enterprise (CGE). Conceptualization, Design and Development of the framework was done by me during the three month period.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Concurrent Inference of Topic Models and Distributed Vector RepresentationsParang Saraf
Abstract: Topic modeling techniques have been widely used to uncover dominant themes hidden inside an unstructured document collection. Though these techniques first originated in the probabilistic analysis of word distributions, many deep learning approaches have been adopted recently. In this paper, we propose a novel neural network based architecture that produces distributed representation of topics to capture topical themes in a dataset. Unlike many state-of-the-art techniques for generating distributed representation of words and documents that directly use neighboring words for training, we leverage the outcome of a sophisticated deep neural network to estimate the topic labels of each document. The networks, for topic modeling and generation of distributed representations, are trained concurrently in a cascaded style with better runtime without sacrificing the quality of the topics. Empirical studies reported in the paper show that the distributed representations of topics represent intuitive themes using smaller dimensions than conventional topic modeling approaches.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Bayesian Model Fusion for Forecasting Civil UnrestParang Saraf
Abstract: With the rapid rise in social media, alternative news sources, and blogs, ordinary citizens have become information producers as much as information consumers. Highly charged prose, images, and videos spread virally, and stoke the embers of social unrest by alerting fellow citizens to relevant happenings and
spurring them into action. We are interested in using Big Data approaches to generate forecasts of civil unrest from open source indicators. The heterogenous nature of data coupled with the rich and diverse origins of civil unrest call for a multi-model approach to such forecasting. We present a modular approach wherein a collection of models use overlapping sources of data to independently forecast protests. Fusion of alerts into one single alert stream becomes a key system informatics problem and we present a statistical framework to accomplish such fusion. Given an alert from one of the numerous models, the decision space
for fusion has two possibilities: (i) release the alert or (ii) suppress the alert. Using a Bayesian decision theoretical framework, we present a fusion approach for releasing or suppressing alerts. The resulting system enables real-time decisions and more importantly tuning of precision and recall. Supplementary materials for this article are available online.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...Parang Saraf
Abstract: We describe the design, implementation, and evaluation of EMBERS, an automated, 24x7 continuous system for forecasting civil unrest across 10 countries of Latin America using open source indicators such as tweets, news sources, blogs, economic indicators, and other data sources. Unlike retrospective studies, EMBERS has been making forecasts into the future since Nov 2012 which have been (and continue to be) evaluated by an independent T&E team (MITRE). Of note, EMBERS has successfully forecast the June 2013 protests in Brazil and Feb 2014 violent protests in Venezuela. We outline the system architecture of EMBERS, individual models that leverage specific data sources, and a fusion and suppression engine that supports trading off specific evaluation criteria. EMBERS also provides an audit trail interface that enables the investigation of why specific
predictions were made along with the data utilized for forecasting. Through numerous evaluations, we demonstrate the superiority of EMBERS over baserate methods and its capability to forecast significant societal happenings.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Safeguarding Abila through Multiple Data PerspectivesParang Saraf
Award: VAST 2014 Grand Challenge Award: Effective Analysis and Presentation
Abstract: This paper introduces a system for visual analysis of news articles, emails, GPS tracking data, financial transactions and streaming micro-blog data. The system was developed in response to the 2014 VAST Grand Challenge and comprises of several interfaces for mining textual, network, spatio-temporal, financial, and streaming data.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Abstract: This paper introduces a system for visualization and analysis of geo-spatial and temporal data from call center and microblog sources. The system provides a streaming client for interacting with the data in real time. The system presents the data in a partitioned format using coordinated visualizations
that allows the analyst to view the data in multiple dimension
simultaneously. This allows the user to see patterns that occur in space and time. This project was developed in response to the VAST 2014 Mini-Challenge 3.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Award: This system won the "Honorable Mention for Effective Presentation" award.
Abstract: This paper introduces a system for visual analysis of GPS tracking and financial data. The system was developed in response to VAST Mini-Challenge 2 and comprises of different interfaces for mining spatio-temporal and financial data.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Abstract: This paper introduces a system for visual analysis of news articles and emails. The system was developed in response to VAST MiniChallenge 1 and comprises different interfaces for mining textual data and network data.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Forex-Foreteller: Currency Trend Modeling using News ArticlesParang Saraf
Abstract: Financial markets are quite sensitive to unanticipated news and events. Identifying the effect of news on the market is a challenging task. In this demo, we present Forex-foreteller (FF) which mines news articles and makes forecasts about the movement of foreign currency markets. The system uses a combination of language models, topic clustering, and sentiment analysis to identify relevant news articles. These articles along with the historical stock index and currency exchange values are used in a linear regression model to make forecasts. The system has an interactive visualizer designed specifically for touch-sensitive devices which depicts forecasts along with the chronological news events and financial data used for making the forecasts.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Show drafts
volume_up
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Epidemiological Modeling of News and Rumors on Twitter
1. Epidemiological Modeling of News and Rumors on Twitter
Fang Jin∗
, Edward Dougherty †
, Parang Saraf∗
, Yang Cao∗†
, Naren Ramakrishnan∗
∗
Department of Computer Science
†
Genetics, Bioinformatics, and Computational Biology Department
Virginia Tech, Blacksburg, VA 24061
∗
{jfang8, parang, ycao, naren}@cs.vt.edu, †
edougherty@vt.edu
ABSTRACT
Characterizing information diffusion on social platforms like
Twitter enables us to understand the properties of underly-
ing media and model communication patterns. As Twitter
gains in popularity, it has also become a venue to broadcast
rumors and misinformation. We use epidemiological mod-
els to characterize information cascades in twitter resulting
from both news and rumors. Specifically, we use the SEIZ
enhanced epidemic model that explicitly recognizes skeptics
to characterize eight events across the world and spanning
a range of event types. We demonstrate that our approach
is accurate at capturing diffusion in these events. Our ap-
proach can be fruitfully combined with other strategies that
use content modeling and graph theoretic features to detect
(and possibly disrupt) rumors.
Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications—
Data Mining; I.2.6 [Artificial Intelligence]: Learning—
Knowledge acquisition; Parameter learning
General Terms
Experimentation, Performance
Keywords
SIS, SEIZ, Epidemiological modeling, Rumor detection.
1. INTRODUCTION
Online social networks have become a staging ground for
modern movements, with the Arab Spring being the most
prominent example. Nine out of ten Egyptians and Tunisians
responded to a poll indicating that they used Facebook to
organize protests and spread awareness. As a precaution-
ary measure, governments have taken to blocking social net-
working websites, showcasing the importance of understand-
ing this phenomenon.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, or
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
The 7th SNA-KDD Workshop ’13 (SNA-KDD’13), August 11, 2013,
Chicago, United States.
Copyright 2013 ACM 978-1-4503-2330-7 ...$5.00.
Interestingly, the role of social networks is not limited to
helping organize the activities of disruptive elements. Many
key government and news agencies have also begun to em-
brace Twitter and other social platforms to disseminate in-
formation. After the tragic 2013 explosions at the Boston
Marathon, the FBI resorted to online social networks to
broadcast crucial information about the suspects. The viral
diffusion of information provided them with vital informa-
tion about the suspects. At the same time it is well known
that online activity on sites such as Reddit led to mistaken
identification of some individuals and the spread of several
rumors.
We were motivated to apply the latest in epidemiological
modeling to understand information diffusion on Twitter, in
relation to the spread of both news and rumors. Epidemi-
ological models provide a classical approach to study how
information diffuses. These models typically divide the to-
tal population into several compartments which reflect the
status of an individual. For instance, common compart-
ments denote susceptible (S), exposed (E), infected (I), and
recovered (R) individuals. Individuals transit from one com-
partment to another, with certain probabilities that have to
be estimated from data. The simplest model, SI, has two
states; susceptible (S) individuals get infected (I) by one of
their neighbors and stay infected thereinafter. While con-
ceptually easy to understand, it is also unrealistic for practi-
cal situations. The SIS model is popular in infectious disease
modeling wherein individuals can transition back and forth
between susceptible (S) and infected (I) states (e.g., think
of allergies and the common cold); this model is often used
as the baseline model for more sophisticated approaches.
The SIR model enables individuals to recover (R) but is not
suited for modeling news cascades on Twitter since there is
no intuitive mapping to what ‘recovering’ means. The SEIZ
model (susceptible, exposed, infected, skeptic) proposed by
Bettencourt et al. [1] takes the interesting approach of intro-
ducing an exposed state (E). Individuals in such a state take
some time before they begin to believe (I) in a story (i.e.,
get infected). While the authors of [1] used this approach to
model the adoption of Feynman diagrams by communities
of physicists, our work explores their use in modeling news
and rumors on Twitter.
The key contributions of this paper are:
• Our work is the first to employ the SEIZ model to
model real Twitter datasets. We employ non-linear
least squares optimization of the underlying systems
of ODEs over tweet data, and demonstrate how this
2. model is better at modeling rumor and news diffusion
than the traditional SIS model.
• We analyze eight representative stories (four true events
and four rumors) across a range of topics (politics, ter-
rorism, entertainment, and crime) and over several ge-
ographic regions (USA, Mexico, Venezuela, Cuba, Vat-
ican). While not an exhaustive list, this demonstrates
the wide applicability of the proposed model.
• We demonstrate the capability of the SEIZ model to
quantify compartment transition dynamics. We show-
case how such information could facilitate the devel-
opment of screening criteria for distinguishing rumors
from real news happenings on Twitter.
2. RELATED WORK
Information Diffusion.
Significant work has gone into research on information
diffusion on social media, e.g., see [4, 9, 16, 21]. Recently,
Matsubara etc. [10] conducted research on the rise and fall
patterns of information diffusion, and managed to capture
the power-law fall pattern and periodicities inherent in such
data. Gomez-Rodriguez et al. [6] built a cascade transmis-
sion model to track cascading process taking place over a
network; they traced overall blogs and news for a one-year
period and found that the top 1000 media sites and blogs
tend to have a core-periphery structure.
Epidemiological models.
Mathematical modeling of disease spread not only pro-
vides vital information about the propagation of the dis-
ease in a human network, but also offers insight into the
strategies that can be used to control them. The classifi-
cation of the human population into different groups forms
the basic premise of using epidemiological models for model-
ing information diffusion. The two widely used such models
are SIR (Susceptible, Infected, Recovered) and SIS (Sus-
ceptible, Infected, Susceptible) models. Newman et al. [14]
showed that a large class of standard epidemiological mod-
els, viz. the SIR models, can be solved exactly on a wide
variety of networks, and confirmed the correctness of solu-
tions with numerical simulations of SIR epidemics on net-
works. Kimura et al. [8] proposed the application of the SIS
model to study information diffusion where the nodes can
be activated multiple times. Zhao et al. [23] proposed an
SIHR (Spreaders, Ignorants, Hibernators, Removed) rumor
spreading model, with forgetting and remembering mech-
anisms to simulate rumor spreading in inhomogeneous net-
works. Xiong et al. [20] proposed a diffusion model with four
different states: susceptible, contacted, infected, and refrac-
tory (SCIR) to identify the threshold value of the spreading
rate approaches almost zero. Bettencourt et al. [1] proposed
the SEIZ (susceptible, exposed, infected, skeptic) model to
capture the adoption of Feynman diagrams by using the pub-
lication counts after World War II. They extract the general
features for idea spreading and estimate the idea adoption
process. Their result showed that the SEIZ model can fit the
long term idea adoption process with reasonable error, but
does not demonstrate whether this model can be applied on
large scale datasets, or whether can be applied on Twitter,
where the story unfolds in real-time.
Rumor modeling.
As far as we know, Daley [5] first proposed the similarity
between epidemics and rumors using mathematical analysis.
Some researchers have studied rumor propagation modeling
in different network topologies [13, 22]; however, they do not
provide any discussion of propagation differences between
news and rumors. Shah et al. [17] detect rumor sources in
network using maximum likelihood modeling. In [2], Budak
et al. prove that minimizing the spread of the misinforma-
tion (i.e., rumors) in social networks is an NP-hard problem
and also provide a greedy approximate solution. Castillo et
al. [3] delve into twitter content modeling, such as sentiment
analysis and hashtags to identify rumors, while Qazvinian
et al. [15] try to address this issue using broader linguistic
methods, to learn possible features of rumor and determine
whether a twitter user believes a rumor or not. More re-
lated work appears in [7, 19]. Our goal is to develop an
understanding of these processes using diffusion models.
3. DATASETS
We focus on twitter datasets that have reliable coverage of
the events being studied; the volume of tweets ranges from
as low as 791 to nearly three orders of magnitude greater.
As described in Table 1, the news and rumors studied were
drawn from a variety of regions and across a diversity of
topics. Data collection was aimed at gathering tweets highly
related to the events under study. We employed customized
sets of keywords and hashtags pertaining to each incident.
Finally, date range restrictions were used to define relevant
tweets for each event. It is also pertinent to note that the
tweets analyzed spanned a variety of languages: English,
Spanish, Italian, and Portuguese.
3.1 News topics
Boston Marathon Bombings. Two pressure cooker bombs
exploded near the finish line of 2013 Boston Marathon on
April 15, 14:49:12 local time, killing three people and in-
juring more than 264 others. The FBI released photographs
and surveillance videos on online social networks which spread
like wildfire and provided crucial leads for identifying the
suspects1
.
Pope Resignation. Pope Benedict XVI announced his res-
ignation on the morning of February 11, 2013. In nearly 6
centuries, this was the first time a pope has stepped down
from his office. This news received reactions from all across
the world2
.
Amuay Refinery Explosion. Propane and butane gas
leakage caused an explosion at the Amuay refinery in Venezuela
on August 25, 2012 1:11 am local time. The blast killed 48
people, injured 151 others and damaged 1600 homes3
.
Michelle Obama at the 2013 Oscars. In the 2013 Oscar
awards ceremony, a big surprise was the appearance of US
first lady Michelle Obama for presenting the ‘Best Picture’
award4
.
1
http://www.cnn.com/2013/04/15/us/boston-marathon-
explosions
2
http://www.cnn.com/2013/02/11/world/europe/pope-
resignation-q-and-a
3
http://www.cnn.com/2012/08/25/world/americas/venezuela-
refinery-blast
4
http://www.mediaite.com/tv/michelle-obama-makes-
cameo-at-the-oscars-announces-be st-picture-winner/
3. Table 1: Twitter datasets studied in this paper.
No. Dataset Date Area Type Country #Tweets Response
ratio
Keywords & Hashtag
1 Boston 04-15-2013 terrorism news USA 501259 68.3% Marathon, (#)bostonmarathon
2 Pope 02-11-2013 religion news Vatican 31365 56.75% Pope, (#)Benedict
3 Amuay 08-25-2012 accident news Venezuela 49015 62.89% Amuay, refinery, explosion
4 Michelle 02-24-2013 entertainmentnews USA 3762 54.45% Michelle Obama, Oscars
5 Obama 04-23-2013 politics rumor USA 791 46.14% White House, explosions
6 Doomsday 12-21-2012 mythology rumor Global 11833 52.19% Doomsday, Mayan, doom
7 Castro 10-16-2012 politics rumor Cuba 3862 54.45% Fidel Castro, Dr. Marquina
8 Riot 09-05-2012 crime rumor Mexico 4838 47.17% Antorcha Campesina, Nezahualcoyotl
(a) Amuay explosion (b) Castro rumor
Figure 1: Tweet volume.
3.2 Rumors
Obama injured. A fake associated press (AP) tweet orig-
inated on April 23, 2013 that President Obama was hurt
in White House explosions which caused a brief period of
instability in financial markets. The information was false
and it was determined that the Twitter account was hacked.
Doomsday. December 21, 2012 was rumored to be the
Doomsday as it marked the end date of a 5126 year long
cycle in the Mesoamerican long count calendar. This rumor
spread like wildfire and social networks were flooded with
panic and anxiety posts. Considering that we are still alive,
Doomsday turned out to be nothing more than a rumor on
a massive scale5
.
Fidel Castro’s death. On October 16, 2012 a Naples doc-
tor claimed that former Cuban leader, Fidel Castro suffered
a cerebral hemorrhage and is near a neurovegetative state.
However, on October 21, 2012, these rumors were denied by
Elias Jauva, former Venezuelan vice president, who released
pictures of him meeting Castro a few days back6
.
Riots and shooting in Mexico. A very interesting exam-
ple that highlights the perils of rumor spreading on social
networks pertains to the false reports of violence and im-
pending attack in Nezahualcoyotl, Mexico. (False) rumors
spreading on Twitter and Facebook about shootouts caused
(real) panic and chaos in Mexico City on September 5, 2012.
Interestingly, authorities themselves turned to Twitter to
deny these rumors7
.
3.3 Preliminary Analysis
5
http://en.wikipedia.org/wiki/Doomsday
6
http://www.inquisitr.com/371007/fidel-castro-allegedly-
appears-in-public-after-stroke-rumors/
7
http://www.foxnews.com/world/2012/09/08/tweets-false-
shootouts-cause-panic-in-mexico-city/
(a) Amuay explosion (b) Castro rumor
Figure 2: Followers/followees distributions. Follow-
ers: people who follow the person; Followees: people
who are followed by the person.
We compare the basic properties of news and rumor prop-
agation, by characterizing tweet volume over time, follower/followee
distributions, the ‘response ratio’ of a story, and the retweet
cascades. In order to maintain brevity, we show results from
only two stories in this section: one from our news collection
(the Amuay explosion) and one from our rumor collection
(Fidel Castro’s purported death).
Tweet Volume. For both examples, we plot the tweet
volume over time from the beginning of the story. Fig-
ure 1(a) shows the activity for the 2012 Amuay refinery
explosion example. An activity burst was formed immedi-
ately after the news was made public. The number of tweets
dropped progressively as the days went by. This activity
trend displays attributes similar to breaking news propaga-
tion as described by Mendoza et al. [11]. In contrast, Fig-
ure 1(b) depicts the volume of tweets about a rumor regard-
ing the health of the former Cuban leader Fidel Castro. Here
we see occasional spikes of tweet volume; note the increase
in tweet volume around October 21st, when the rumors were
officially denied.
Followers and Followees Distributions. Figure 2(a)
is a log-log scatter plot of the followers/followees distribu-
tion about the Amuay explosion news, and Figure 2(b) is
the corresponding plot about Fidel Castro’s death rumor.
There is no significant qualitative or quantitative difference
in this case; in particular both plots show that the number
of followees is less than the number of followers.
Response Ratio. A tweet can either be a post made
by the user’s initiative, or a responsive post to some other
user’s post (e.g., retweets and replies). As Starbird et al. [18]
discuss, retweets reveal how information propagates through
a social network: the ‘deeper’ a retweet, the more relevant
4. (a) Amuay Cascade (b) Castro Cascade
Figure 3: Retweet cascade for the Amuay Explosion news and Castro rumor. Each node is a user id, and
each edge connects the retweet user to the original user.
the tweet is for the community. Based on this idea, we define
the response ratio of a story as the fraction of responsive
tweets to the total number of tweets in the story. Table 1
lists the response ratio for all the 8 stories. As we can see,
response ratios for news are higher than that for the rumors.
Retweet Cascades. A retweet cascade reflects how the
social media network propagates information. Figure 3 de-
picts the evolution of the retweet graphs for the Amuay news
and Castro rumor dataset. For Amuay news, we plot four
graphs with intervals of 6 hours, depicting that a burst has
been formed during 6am-12am, only 5 hours after the acci-
dent. Fig. 3(b) shows the retweet graphs of the rumor for
several days. We can see even after one day, there is no burst
of tweets related to this rumor. Compared with the net-
work between the news and rumors, we find several features
about the rumor. 1) The network for the news instance is
more complex and users can obtain news from many sources,
while users obtain the rumor information only from limited
information centers. 2) There is an immediate burst after a
news is made public while there is no obvious burst for the
rumors.
4. OUR APPROACH
As stated earlier, we used compartmental population mod-
els to quantify the propagation of news and rumors on Twit-
ter, focusing primarily on the SIS and SEIZ models.
4.1 SIS
As described earlier, this model divides the population
into two compartments, or classes: susceptible and infected.
Note that in this model, infected individuals return to the
susceptible class on recovery because the disease confers no
immunity against reinfection.
In order to adapt this model for Twitter, we have given
new meaning to these terms. An individual is identified as
infected (I) if he posts a tweet about the topic of interest,
and susceptible (S) if he has not. A consequence of this in-
terpretation is that an individual posting a tweet is retained
to the infected compartment indefinitely; hence, he can not
propagate back to the susceptible class as is possible in an
epidemiological application. At any given time period t, we
use N(t) to denote the total population size, S(t) the suscep-
tible population size, and I(t) the infected population size,
such that N(t) = I(t) + S(t). As shown in Figure 4, the SIS
spreading rule can be summarized as follows:
Figure 4: SIS model framework
Figure 5: SEIZ model framework
• An individual that tweets about a topic is regarded as
infected.
• A susceptible person has not tweeted about the topic.
• A susceptible person coming into contact with an in-
fected individual (via a tweet) becomes infected him-
self, thus immediately posting a tweet.
• Susceptible individuals remain so until coming into
contact with an infected person.
The SIS model is mathematically represented by the follow-
ing system of ordinary differential equations (ODEs) [12]:
d[S]
dt
= −βSI + αI (1a)
d[I]
dt
= βSI − αI (1b)
5. Figure 6: Numerical implementation work-flow.
4.2 SEIZ
One drawback of the SIS model is that once a suscepti-
ble individual gets exposed to disease, he can only directly
transition to infected status. In fact, especially on Twit-
ter, this assumption does not work well; people’s ideologies
are complex and when they are exposed to news or rumors,
they may hold different views, take time to adopt an idea, or
even be skeptical to some facts. In this situation, they might
be persuaded to propagate a story, or commence only after
careful consideration themselves. Additionally, it is quite
conceivable that an individual can be exposed to a story
(i.e. received a tweet), yet never post a tweet themselves.
Based on this reasoning, we considered a more applicable,
robust model, the SEIZ model which was first used to study
the adoption of Feynman diagrams [1]. In the context of
Twitter, the different compartments of the SEIZ model can
be viewed as follows: Susceptible (S) represents a user who
has not heard about the news yet; infected (I) denotes a user
who has tweeted about the news; skeptic (Z) is a user who
has heard about the news but chooses not to tweet about it;
and exposed (E) represents a user who has received the news
via a tweet but has taken some time, an exposure delay, prior
to posting. We note that referring to the Z compartment as
skeptics is in no way an implication of belief or skepticism
of a news story or rumor. We adopt this terminology as this
was the nomenclature used by the original authors of the
SEIZ model [1].
A major improvement of the SEIZ model over the SIS
model is the incorporation of exposure delay. That is, an in-
dividual may be exposed to a story, but not instantaneously
tweet about it. After a period of time, he may believe it and
then be promoted to the infected compartment. Further,
it is now possible for an individual in this model to receive
a tweet, and not tweet about it themselves. As shown in
Figure 5, SEIZ rules can be summarized as follows:
• Skeptics recruit from the susceptible compartment with
rate b, but these actions may result either in turning
the individual into another skeptic (with probability l),
or it may have the unintended consequence of sending
that person into the exposed (E) compartment with
probability (1 − l).
• A susceptible individual will immediately believe a news
story or rumor with probability p, or that person will
move to the exposed (E) compartment with probabil-
ity (1 − p).
• Transitioning of individuals from the exposed compart-
ment to the infected class can be caused by one of
two separate mechanisms: (i) an individual in the ex-
posed class has further contact with an infected indi-
vidual (with contact rate ρ), and this additional con-
Table 2: Parameter definitions in SEIZ model[1]
Parameter Definition
β S-I contact rate
b S-Z contact rate
ρ E-I contact rate
Incubation rate
1/ Average Incubation Time
bl Effective rate of S -> Z
βρ Effective rate of S -> I
b(1-l) Effective rate of S -> E via contact with Z
β(1 − p) Effective rate of S -> E via contact with I
l S->Z Probability given contact with skeptics
1-l S->E Probability given contact with skeptics
p S->I Probability given contact with adopters
1-p S->E Probability given contact with adopters
tact promotes him to infected; (ii) an individual in
the exposed class may become infected purely by self-
adoption (with rate ), and not from additional contact
with those already infected.
The SEIZ model is mathematically represented by the fol-
lowing system of ODEs. A slight difference of our imple-
mentation of this model is that we do not incorporate vital
dynamics, which includes the rate at which individuals en-
ter and leave the population N (represented by µ [1]). In
epidemiological disease applications, this encompasses the
rate at which people become susceptible (e.g. born) and
deceased. In our application, a Twitter topic has a net du-
ration not exceeding several days. Thus, the net entrance
and exodus of Twitter users over these relatively short time
periods is not expected to noticeably impact compartment
sizes and our ultimate findings8
.
d[S]
dt
= −βS
I
N
− bS
Z
N
(2a)
d[E]
dt
= (1 − p)βS
I
N
+ (1 − l)bS
Z
N
− ρE
I
N
− E (2b)
d[I]
dt
= pβS
I
N
+ ρE
I
N
+ E (2c)
d[Z]
dt
= lbS
Z
N
(2d)
4.3 Practical Issues
During our adoption of the SIS and SEIZ models to under-
stand Twitter datasets, we were constrained by several fac-
tors. The first constraint was the unknowns in the models.
For example, we do not know the transition rates between
the compartments nor the initial sizes of the compartments.
Another constraint is the inability to quantify the total
population size. This value appears to simply be the total
number of Twitter accounts; however the value that we truly
want is the number of individuals who could be exposed
to the news or rumor topic. This value shows to be very
different from the total number of Twitter accounts. Con-
sider the ∼175 million (M) registered Twitter accounts. Of
8
http://www.statisticbrain.com/twitter-statistics/
6. Table 3: Fitting error of SIS and SEIZ models
Boston Pope Amuay Michelle Obama Doomsday Castro Riot Average
SIS 0.058 0.041 0.058 0.088 0.102 0.028 0.082 0.096 0.069
SEIZ 0.010 0.004 0.027 0.061 0.101 0.029 0.073 0.053 0.045
(a) SIS (b) SEIZ
Figure 7: Best fit modeling for Boston news.
(a) SIS (b) SEIZ
Figure 8: Best fit modeling for Pope news.
these, (i) ∼90 million have no followers, and (ii) ∼56 mil-
lion follow no one9
. To further complicate the matter, there
exists an abundance of “fake” Twitter accounts, which are
never used by any real person. They are simply sold to users
wishing to enhance their perceived popularity. Coupling
these facts with sporadic Twitter usage due to night-time
inactivity and user “unplugging”, it is clear that establish-
ing a reliable estimate of users who could receive a tweet is
quite difficult.
Synthesizing all of these factors, we assume the following
in our SEIZ model implementation:
1. We do not have reliable population specifics.
(a) We do not know N, total population size.
(b) We do not know S(t0), E(t0), I(t0), or Z(t0), the
initial values of each population compartment.
2. Infected individuals (I) submit a tweet.
3. Skeptics (Z) have been exposed to story, but do not
tweet.
4. Vital dynamics do not contribute to the overall popu-
lation size. Thus, N is a constant.
The implication of these assumptions is that total popula-
tion size N and initial population sizes for each compartment
S(t0), E(t0), I(t0), and Z(t0) are viewed as unknowns. They
are therefore treated as parameters in the parameter fit rou-
tine, and fit along with the other model parameters [1].
9
http://www.businessinsider.com/chart-of-the-day-how-
many-users-does-twitter-really-have-2011-3
(a) SIS (b) SEIZ
Figure 9: Best fit modeling for Amuay news.
(a) SIS (b) SEIZ
Figure 10: Best fit modeling for Michelle news.
4.4 Parameter Identification
For each of the population models (SIS and SEIZ), repre-
sented by equation sets 1 and 2, we performed a nonlinear
least squares fit of the model to Twitter data. As shown in
Figure 6, each step of this fitting process involved selecting
a set of parameter values (rate constants and probabilities
in equations 1 and 2, and initial compartment sizes), and
numerically solving the system of ODEs with these param-
eter values. The set of parameter values that minimized
|I(t) − tweets(t)| was identified as the optimal parameter
set.
The experimental implementation was done in Matlab.
The lsqnonlin function performed the least squares fit. The
ODE systems were solved with a forward Euler function that
we developed. This algorithm was selected due to its com-
putational efficiency, and used a time-step of no more than
0.05. This threshold demonstrated to be numerically stable;
in several instances we compared the forward Euler solution
to those generated by Matlab’s ode45 (5th
-order Explicit
Runge-Kutta with embedded 4th
-order error control), and
observed nearly identical solutions.
5. EXPERIMENTAL RESULTS
5.1 Fitting Results
For each of the Twitter datasets, we were interested in
quantifying the transitions of users through the different
compartments of the SIS and SEIZ models. Figures 7 - 14
display the results for the best fit of SIS and SEIZ mod-
els (Equations 1 and 2) to the eight Twitter stories. Also
7. (a) SIS (b) SEIZ
Figure 11: Best fit modeling for Obama news.
(a) SIS (b) SEIZ
Figure 12: Best fit modeling for Doomsday rumor.
displayed for each figure are the relative error in 2-norm
||I(t) − tweets(t)||2
||tweets(t)||2
and the mean error deviation
Pn
i=1 |I(ti) − tweets(ti)|
n
,
where n is the number of data points.
The error metrics for these eight stories clearly indicate
that the SEIZ model fits the Twitter data much more accu-
rately than the SIS model. Furthermore, the low relative er-
ror of the SEIZ model fit suggests that this model accurately
represents the Twitter data for each of the eight stories; see
Table 3. A common observation about all the eight stories
is that the SEIZ model is far more accurate in modelling the
initial spread of the news on Twitter as compared with the
SIS model. This behaviour can be explained by the delay
caused by individuals in the“Exposed”category taking some
time before posting a story themselves [1].
Given that the SEIZ model is superior to the SIS model in
this application, and that the SEIZ model demonstrates an
accurate representation of information diffusion on Twitter,
a natural question arises “How can this model help us?” The
answer is really simple. Since we have a mathematical model
for the Twitter data, we can study solutions to some of the
constraints as mentioned in the “Practical Issues” section. A
well fitted SEIZ model provides values for all contact rates
and transition probabilities as defined by Equation 2. These
parameters empower us to investigate the dynamics of news
and rumor spread on Twitter in a fashion that is not possible
without a mathematical model. Table 2 specifies the SEIZ
model parameters that we can now examine to assess news
and rumor propagation on Twitter.
5.2 Boston Marathon Bombing Analysis
To demonstrate a line of analysis that is now possible with
the SEIZ mathematical model, we use quantities from the
(a) SIS (b) SEIZ
Figure 13: Best fit modeling for Castro rumor.
(a) SIS (b) SEIZ
Figure 14: Best fit modeling for Riot news.
SEIZ model fit of the Boston Marathon bombing Twitter
data (Table 2). Results are summarized in Table 4.
Here we discuss the dynamics of all 4 compartments, so we
specially show all 4 compartments in the SEIZ time-course
plot only for Boston Marathon bombing (Figure 15(a)). These
results suggest that the effective rate of susceptible indi-
viduals becoming skeptics is much greater than those that
becoming infected. The decrease in S(t) occurs directly
with an increase in Z(t), and S(t) becomes stable at the
same time that Z(t). I(t) does increase as S(t) decreases,
but its rate of change is much slower, and the majority of
I(t) increase occurs after S(t) has stabilized to a minimal
value, demonstrating that the continued change in the in-
fected compartment has no further influence on the change
in the susceptible compartment.
Table 4 also demonstrates that the skeptics compartment
is more influential on transitioning susceptible users to the
exposed class than does infected users. Figure 15(a) shows
this as the increase in E(t) is strongly correlated with the
increase in Z(t). E(t) also peaks as Z(t) peaks, and E(t) be-
gins to decrease at a rate negative to that of the I(t) increase.
In fact, the increase in I(t) directly coincides with a compa-
rable decrease in E(t). These data suggest that the increase
in infected users is not due in large part by recruitment of
susceptible users, but rather from the natural transition to
the infected compartment by exposed individuals.
Putting this all together, we can deduce that virtually
all individuals are initially in the susceptible compartment.
Most susceptible users become skeptics from interaction with
skeptics, and those susceptible users that do transition to the
exposed class do so by their interaction with skeptics. The
infected compartment increases predominately from the ex-
posed class, and not from direct recruitment of susceptible
individuals. Thus, these findings suggest that it was in-fact
non-Twitter mediums that most greatly aided in the gener-
ation of Twitter propagation! Further, the ρ
ratio indicates
that the exposed users became infected more so due to infor-
8. (a) Boston (b) Pope (c) Amuay (d) Michelle
(e) Obama (f) Doomsday (g) Castro (h) Riot
Figure 15: SEIZ compartment time-course results.
Table 4: Ratios of SEIZ model for Boston dataset.
bl
βp
3.1E5
b(1 − l)
β(1 − p)
1.0E4
ρ
7.8
mation incubation and self-adoption, and not so much from
direct contact with infected users.
The remaining instances of SEIZ time-course plots are
shown in Figure 15, we can see how S, E, and I dynamic
change over time. These analyses exemplify the types of
analyses that can be used to study Twitter dynamics via
the SEIZ population model.
5.3 Rumor Detection
We next examined if our implementation of the SEIZ
model, applied to our Twitter examples, could be utilized to
facilitate the discrimination of true news from rumors. We
began by assembling an equation to relate the key parame-
ters of the SEIZ model. In our first attempt at performing
this, we restricted our attention to the exposed compart-
ment; this class has direct or indirect interconnections be-
tween the other three compartments, and is a key path to
the infected compartment. To exemplify this, consider the
extreme case where susceptible individuals are attempted
to be recruited by skeptics, and ultimately end up in the
infected compartment (Figure 5). This can only be accom-
plished by passing through the exposed compartment.
We quantify a ratio through E as the ratio of the sum
of the effective transition rates entering this compartment
(from S) to the sum of the transition rates exiting this com-
partment (to I). We define this ratio as RSI , using the
subscripts to denote the contributions from the susceptible
and infected compartments in this quantity:
Figure 16: RSI values for eight Twitter datasets.
RSI =
(1 − p)β + (1 − l)b
ρ +
(3)
RSI possesses all rate constants and probability values
of the SEIZ model and relates them to the exposed com-
partment with a kind of flux ratio, viz. the ratio of effects
entering E to those leaving E. A RSI value greater than
1 implies that the influx into the exposed compartment is
greater than the efflux. Similarly, a value less than 1 indi-
cates that members are added to the exposed group more
slowly than they are removed. We hypothesized that this
measure could potentially aid in the distinction of rumor
topics from news topics; all parameters of the SEIZ fit are
utilized in this measure, and they are related via the RSI
value to a key compartment of this model. If a distinction
between rumors and true news stories is to be seen with the
SEIZ model, we identify the RSI measure to be a probable
candidate in aiding this process.
We then computed RSI using the specific parameter val-
ues attained from our model fits of the eight cases (Fig-
ure 16). Here we can see that the true news about the
Boston Marathon bombing, Pope resignation, and Amuay
refinery explosion do in fact have much higher RSI values
than the rumor topics: Doomsday, Fidel Castro death, Mex-
ico City riots, and Barrack Obama injury which each have
much lower RSI values. However, the Michelle presence at
9. the Oscars, which is classified as true news, has a very low
RSI value. This particular case is interesting since Michelle
did not really show up to the 2013 Oscar Awards Ceremony.
She simply participated remotely via video telecast. It is
thus arguable that this topic could have been discussed in
the media in terms similar to rumors.
These findings suggest, for these specific topics, that the
parameters in the SEIZ model can potentially aid in the
challenge of distinguishing rumor versus true news. We are
not claiming that the RSI value is the unique measure to
accomplish this, nor are we claiming that the SEIZ model
itself is the sole tool to do this. As is suggested by our find-
ings, we postulate that a fit of a compartmental model, in
the spirit of the SEIZ model, to Twitter data provides valu-
able propagation information that can be coupled with other
data analysis strategies (e.g., content modeling) to augment
the accuracy and reliability of true news story and rumor
topic discrimination.
6. CONCLUSION
In this paper, we have demonstrated how true news and
rumor stories being propagated over Twitter can be mod-
elled by epidemiologically-based population models. We have
shown that the SEIZ model, in particular, is accurate in cap-
turing the information spread of a variety of news and rumor
topics, thereby generating a wealth of valuable parameters
to facilitate the analysis of these events. We then demon-
strated how these parameters can also be incorporated into
a strategy for supporting the identification of Twitter topics
as rumor or news. As of now, we are modeling propagation
over static data. In future, we plan to adapt this model for
capturing news and rumors in real-time.
Acknowledgments
Supported by the Intelligence Advanced Research Projects
Activity (IARPA) via Department of Interior National Busi-
ness Center (DoI/NBC) contract number D12PC000337, the
US Government is authorized to reproduce and distribute
reprints for Governmental purposes notwithstanding any copy-
right annotation thereon. Disclaimer: The views and con-
clusions contained herein are those of the authors and should
not be interpreted as necessarily representing the official
policies or endorsements, either expressed or implied, of
IARPA, DoI/NBC, or the US Government.
7. REFERENCES
[1] L. Bettencourt, A. Cintr´on-Arias, D. I. Kaiser, and
C. Castillo-Ch´avez. The power of a good idea:
Quantitative modeling of the spread of ideas from
epidemiological models. PHYSICA A, 364:513–536,
2006.
[2] C. Budak, D. Agrawal, and A. El Abbadi. Limiting
the spread of misinformation in social networks. In
Proc. WWW’11, pages 665–674. ACM, 2011.
[3] C. Castillo, M. Mendoza, and B. Poblete. Information
credibility on twitter. In Proc. WWW’11, pages
675–684. ACM, 2011.
[4] M. Cha, H. Haddadi, F. Benevenuto, and K. P.
Gummadi. Measuring user influence in twitter: The
million follower fallacy. In ICWSM’10, volume 14,
page 8, 2010.
[5] D. J. Daley and D. G. Kendall. Epidemics and
rumours. Nature, 204(4963):1118–1118, 1964.
[6] M. Gomez-Rodriguez, J. Leskovec, and A. Krause.
Inferring networks of diffusion and influence. In Proc.
SIGKDD’10, pages 1019–1028. ACM, 2010.
[7] V. Isham, S. Harden, and M. Nekovee. Stochastic
epidemics and rumours on finite random networks.
PHYSICA A, 389(3):561–576, 2010.
[8] M. Kimura, K. Saito, and H. Motoda. Efficient
estimation of influence functions for sis model on
social networks. In Proc. IJCAI’09, pages 2046–2051.
Morgan Kaufmann Publishers Inc., 2009.
[9] H. Kwak, C. Lee, H. Park, and S. Moon. What is
twitter, a social network or a news media? In Proc.
WWW’10, pages 591–600. ACM, 2010.
[10] Y. Matsubara, Y. Sakurai, B. A. Prakash, L. Li, and
C. Faloutsos. Rise and fall patterns of information
diffusion: model and implications. In Proc.
SIGKDD’12, pages 6–14. ACM, 2012.
[11] M. Mendoza, B. Poblete, and C. Castillo. Twitter
under crisis: Can we trust what we rt? In Proc.
SOMA ’10, pages 71–79. ACM, 2010.
[12] J. D. Murray. Mathematical biology, volume 2.
springer, 2002.
[13] M. Nekovee, Y. Moreno, G. Bianconi, and M. Marsili.
Theory of rumour spreading in complex social
networks. PHYSICA A, 374(1):457–470, 2007.
[14] M. E. Newman. Spread of epidemic disease on
networks. Physical review E, 66(1):016128, 2002.
[15] V. Qazvinian, E. Rosengren, D. R. Radev, and
Q. Mei. Rumor has it: Identifying misinformation in
microblogs. In Proc. EMNLP’11, pages 1589–1599.
Association for Computational Linguistics, 2011.
[16] D. M. Romero, B. Meeder, and J. Kleinberg.
Differences in the mechanics of information diffusion
across topics: idioms, political hashtags, and complex
contagion on twitter. In Proc. WWW’11, pages
695–704. ACM, 2011.
[17] D. Shah and T. Zaman. Rumors in a network: Who’s
the culprit? Information Theory, IEEE Transactions
on, 57(8):5163–5181, 2011.
[18] K. Starbird and L. Palen. (how) will the revolution be
retweeted?: information diffusion and the 2011
egyptian uprising. In Proc. ACM’12, pages 7–16.
ACM, 2012.
[19] D. Trpevski, W. K. Tang, and L. Kocarev. Model for
rumor spreading over networks. Physical Review E,
81(5):056102, 2010.
[20] F. Xiong, Y. Liu, Z.-j. Zhang, J. Zhu, and Y. Zhang.
An information diffusion model based on retweeting
mechanism for online social media. Physics Letters A,
376(30-31):2103–2108, 2012.
[21] J. Yang and S. Counts. Predicting the speed, scale,
and range of information diffusion in twitter. Proc.
AAAI’10, 2010.
[22] D. H. Zanette. Dynamics of rumor propagation on
small-world networks. Physical Review E,
65(4):041908, 2002.
[23] L. Zhao, H. Cui, X. Qiu, X. Wang, and J. Wang. Sir
rumor spreading model in the new media age.
PHYSICA A, 392(4):995–1003, 2012.