This document presents the findings of a study that analyzed different groups of Twitter users and their ability to predict the results of the 2012 U.S. Republican presidential primaries. The study found that more engaged users and those with right-leaning political preferences predicted election outcomes better than less engaged or left-leaning users. Specifically, right-leaning users correctly predicted the winners in 8 of 10 states when tweets from 56 days before the elections were considered. However, identifying voting intentions from social media remains challenging, and improved methods such as sentiment analysis could enhance prediction accuracy.
The document analyzes Twitter data from 250,840 U.S. users who disclose their religious affiliations. It finds moderate correlations between the Twitter data and survey results on the distribution of religious groups across states and within states. Classifiers can accurately identify religious affiliations based on Twitter content and connections, with network features performing better. The analysis also shows strong assortativity, with users much more likely to connect to others of the same religion. However, the study only includes users who publicly declare their religion.
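The assortativity finding above can be illustrated with Newman's assortativity coefficient for a discrete node attribute. The sketch below uses a toy edge list with hypothetical religion labels, not the study's data; it is a minimal pure-Python illustration of the measure, not the paper's method.

```python
def attribute_assortativity(edges, label):
    """Newman's assortativity coefficient for a discrete node attribute.

    edges: iterable of (u, v) pairs for an undirected graph;
    label: dict mapping node -> category.
    Returns r in [-1, 1]; r > 0 means same-category ties are
    over-represented relative to chance.
    """
    cats = sorted({label[u] for u, v in edges} | {label[v] for u, v in edges})
    idx = {c: i for i, c in enumerate(cats)}
    n = len(cats)
    e = [[0.0] * n for _ in range(n)]  # mixing matrix (edge-end fractions)
    m = 0
    for u, v in edges:
        i, j = idx[label[u]], idx[label[v]]
        e[i][j] += 1  # count each undirected edge in both directions
        e[j][i] += 1
        m += 2
    for i in range(n):
        for j in range(n):
            e[i][j] /= m
    trace = sum(e[i][i] for i in range(n))
    a = [sum(row) for row in e]                              # row marginals
    b = [sum(e[i][j] for i in range(n)) for j in range(n)]   # column marginals
    chance = sum(ai * bi for ai, bi in zip(a, b))
    return (trace - chance) / (1 - chance)

# Toy network: two faiths, mostly within-group ties (hypothetical data)
label = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
r = attribute_assortativity(edges, label)
print(round(r, 3))  # strongly positive: same-religion ties dominate
```

A network with only within-group edges yields r = 1.0; random mixing yields r near 0.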
Lu Chen, Wenbo Wang, Amit Sheth. Are Twitter Users Equal in Predicting Elections? A Study of User Groups in Predicting 2012 U.S. Republican Presidential Primaries. The 4th International Conference on Social Informatics (SocInfo2012), December 5-8, 2012, Lausanne, Switzerland.
http://knoesis.org/library/resource.php?id=1787
Twitter Based Sentiment Analysis of Each Presidential Candidate Using Long Sh... - CSCJournals
In the era of technology and the internet, people use online social media services such as Twitter, Instagram, Facebook, and Reddit to express their emotions. The idea behind this paper is to understand people's emotions on Twitter and their opinion of the 2020 Presidential Election. We collected 1.2 million tweets in total with keywords like "RealDonaldTrump", "JoeBiden", "Election2020" and other election-related keywords using the Twitter API, and then processed them with a natural language processing toolkit. A Bidirectional Long Short-Term Memory (BiLSTM) model was trained, achieving 93.45% accuracy on our test dataset. We then used the trained model to perform sentiment analysis on the rest of the dataset. With the sentiment analysis results and a comparison with the 2016 Presidential Election, we made predictions on who could win the 2020 U.S. Presidential Election using pre-election Twitter data. We also analyzed the impact of COVID-19 on people's sentiment about the election.
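The pipeline this abstract describes (keyword collection, preprocessing, sentiment scoring, per-candidate aggregation) can be sketched as follows. The keyword lists, sample tweets, and tiny lexicon scorer below are illustrative stand-ins: the paper itself uses the Twitter API and a trained BiLSTM classifier, not a word list.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the paper's keyword lists and sentiment model
CANDIDATE_KEYWORDS = {
    "Trump": {"realdonaldtrump", "trump"},
    "Biden": {"joebiden", "biden"},
}
POSITIVE = {"great", "win", "support", "love"}
NEGATIVE = {"bad", "lose", "corrupt", "hate"}

def tokenize(text):
    """Lowercase and keep word-like tokens, including #hashtags and @handles."""
    return re.findall(r"[a-z#@']+", text.lower())

def candidates_mentioned(tokens):
    toks = {t.lstrip("#@") for t in tokens}
    return [c for c, kws in CANDIDATE_KEYWORDS.items() if toks & kws]

def lexicon_sentiment(tokens):
    """Toy lexicon scorer standing in for the paper's BiLSTM classifier."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "pos" if score > 0 else "neg" if score < 0 else "neu"

def aggregate(tweet_texts):
    """Tally sentiment labels per mentioned candidate."""
    tally = {c: Counter() for c in CANDIDATE_KEYWORDS}
    for text in tweet_texts:
        toks = tokenize(text)
        label = lexicon_sentiment(toks)
        for cand in candidates_mentioned(toks):
            tally[cand][label] += 1
    return tally

tweets = [
    "I love @JoeBiden, great plan",
    "Trump will win, great rally",
    "#Election2020 Biden bad for economy",
]
print(aggregate(tweets))
```

In the real study each tweet would be labeled by the trained BiLSTM rather than a lexicon, but the collect-filter-score-aggregate shape of the analysis is the same.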
Robots in Jokowi and Prabowo Cyber Teams - Ismail Fahmi
1) The data found more bot activity for Jokowi's #01IndonesiaMaju hashtag compared to Prabowo's #2019GantiPresiden hashtag.
2) #2019GantiPresiden was consistently the top trending hashtag across all social media platforms.
3) Examples showed bot-like behavior for #01IndonesiaMaju including repetitive messages from accounts with few followers and high engagement from accounts with very low follower counts, while #2019GantiPresiden showed more natural patterns of engagement.
"Binders full of Tweets:" Twitter, gender and the 2012 elections - Rachel Reis Mourao
In 2012, women's issues became central to the U.S. elections. Political reporters focused on repeating candidates' assertions rather than providing context and analysis, influenced by Twitter, which many used to cover campaigns while on the move. This may have shaped how journalists discussed gender during the elections.
The document summarizes the results of a survey of NPR's Facebook fans conducted in July 2010. It finds that almost all respondents access Facebook daily, most get news online or through NPR's radio broadcasts, and three-quarters agree that Facebook is a major way to receive news from NPR. While comments on NPR's Facebook page are seen as generally polite and civil, only a minority of respondents often leave comments themselves. Respondents prefer that NPR post stories about offbeat news, hard news, and international events rather than sports or celebrity stories.
The document summarizes a presentation by team CDTW on the topic of fatal incidents between African American males and police. It includes the team members, purpose of presenting their research findings, and a disclaimer. It then outlines the problem statement, hypothesis, and describes two paths of data collection - one using Twitter data and the other using structured datasets. It discusses challenges with data collection and presents some visualizations and recommendations to improve the analysis.
This document summarizes research on using machine learning to build an ideologically balanced news diet. The researchers trained classification models on debate transcripts to predict whether news articles came from left-leaning or right-leaning media sources. The models achieved 84% accuracy but predicted that 79% of articles were from right-leaning sources, which did not match other data. The researchers discuss potential reasons for this and ways to improve the models in future iterations, such as using more training data sources and articles to better represent the ideological spectrum.
This document provides an introduction and background for a research study analyzing how the NFL utilizes social media to engage with fans and provide updates. It outlines the study's goals of exploring the NFL's current social media effectiveness and how results could help the league improve engagement. The methodology section describes collecting surveys from 100 students ages 18-25 at Eastern Connecticut State University to measure fans' social media usage and satisfaction with NFL updates received. Key terms are defined and the survey questions are provided.
Using Tweets for Understanding Public Opinion During U.S. Primaries and Predi... - Monica Powell
Abstract
Using social media for political analysis, especially during elections, has become popular in recent years, as many researchers and media outlets now use social media to understand public opinion and current trends. In this paper, we investigate methods for using Twitter to analyze public opinion and to predict U.S. Presidential Primary Election results. We analyzed over 13 million tweets from February 2016 to April 2016 during the primary elections, looking at tweets that mentioned Hillary Clinton, Bernie Sanders, Donald Trump, or Ted Cruz. First, we use sentiment analysis, geospatial analysis, network analysis, and visualization tools to examine public opinion on Twitter. We then use the Twitter data and analysis results to propose a model for predicting primary election results. Our results highlight the feasibility of using social media to gauge public opinion and predict election results.
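The prediction step in such abstracts can be illustrated with a minimal aggregation rule: per state, rank candidates by net positive mentions and pick the leader. The records and the rule below are hypothetical; the paper's actual model combines sentiment, geospatial, and network signals rather than a raw tally.

```python
from collections import defaultdict

# Hypothetical per-tweet records: (state, candidate, sentiment label)
records = [
    ("OH", "Clinton", "pos"), ("OH", "Sanders", "pos"), ("OH", "Sanders", "neg"),
    ("OH", "Clinton", "pos"),
    ("VT", "Sanders", "pos"), ("VT", "Sanders", "pos"), ("VT", "Clinton", "neg"),
]

def predict_winners(rows):
    """Naive rule: per state, score each candidate by (#pos - #neg) mentions
    and return the top-scoring candidate."""
    net = defaultdict(lambda: defaultdict(int))
    for state, cand, senti in rows:
        net[state][cand] += 1 if senti == "pos" else -1
    return {state: max(scores, key=scores.get) for state, scores in net.items()}

print(predict_winners(records))
```

Even this crude tally shows why volume alone misleads: Sanders has more Ohio mentions here, but Clinton leads once sentiment polarity is netted out.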
This document summarizes an analysis of ISIS Twitter activity. It identifies the problem of ISIS using Twitter to further its agenda and degrade the US, and the hypothesis that analyzing ISIS Twitter data can enhance understanding of their networks and tactics. It then describes collecting data from 10 self-identified jihadist Twitter accounts, performing network and text analytics on the data, and finding two distinct communities centered around British foreign fighters in Syria and information about Syrian Islamist groups. It concludes more data is needed but finds no correlation between US airstrikes and tweet volume except for one user, and that tweets generally contained religious statements or news about fighting in Syria.
Curating and Contextualizing Twitter Stories to Assist with Social Newsgathering - azubiaga
While journalism is evolving toward a rather open-minded participatory paradigm, social media presents overwhelming streams of data that make it difficult to identify the information of a journalist's interest. Given the increasing interest of journalists in broadening and democratizing news by incorporating social media sources, we have developed TweetGathering, a prototype tool that provides curated and contextualized access to news stories on Twitter. This tool was built with the aim of assisting journalists both with gathering and with researching news stories as users comment on them. Five journalism professionals who tested the tool found helpful characteristics that could assist them with gathering additional facts on breaking news, as well as facilitating discovery of potential information sources such as witnesses in the geographical locations of news.
A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej... - Yandex
This document discusses several studies that aimed to analyze social media data like tweets to track and predict political sentiment and elections. Some key findings include:
- Classifying tweets by political leaning using text or network analysis achieved over 80% accuracy in some cases. Hashtag propagation was able to identify pro/anti protest sentiment with over 90% accuracy.
- Analyzing the overlap in audiences of news sources on Twitter correlated with known media bias scores, suggesting social media data could quantify bias.
- A study of the 2010 US Senate election found the "vocal minority" of highly active tweeters may have different agendas than the "silent majority" of less active users.
- Other research explored detecting politically motivated
How India's Muslims are being demonised through WhatsApp groups (a critical study) - ZahidManiyar
This document summarizes a research paper that analyzes "fear speech" on WhatsApp groups in India. The researchers:
1) Define fear speech and distinguish it from hate speech, finding fear speech aims to instill fear of minority groups through misinformation rather than using toxic language.
2) Create a dataset of over 27,000 WhatsApp posts, manually labeling 8,000 as fear speech and 19,000 as non-fear speech, focusing on Islamophobic fear speech.
3) Develop models to identify fear speech automatically and conduct an online survey to understand the characteristics of users who share and consume fear speech.
This document discusses a topic modeling analysis of tweets from members of the 113th US Congress. The analysis aimed to identify patterns in party messaging and how individual members adopted or diverged from party messages. Topic modeling of over 180,000 tweets from 522 members identified 40 topics. Results showed that while members discussed a wide range of issues, both Democratic and Republican party accounts focused more on a few key messaging topics. However, some members diverged from party stances on certain issues like ISIL and Keystone pipeline.
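Topic modeling of the kind described above is typically done with latent Dirichlet allocation (LDA). The self-contained collapsed Gibbs sampler below shows the mechanics on a toy corpus of hypothetical tweet tokens; a real analysis of 180,000 tweets would use an optimized library (e.g. gensim or MALLET), and the specific document is silent on which tool was used.

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=42):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)
    ndk = [[0] * n_topics for _ in docs]               # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                                # topic totals
    z = []                                             # token-topic assignments
    for di, doc in enumerate(docs):
        zd = []
        for w in doc:
            t = rng.randrange(n_topics)                # random initialization
            zd.append(t)
            ndk[di][t] += 1; nkw[t][w] += 1; nk[t] += 1
        z.append(zd)
    for _ in range(n_iter):
        for di, doc in enumerate(docs):
            for wi, w in enumerate(doc):
                t = z[di][wi]                          # remove current assignment
                ndk[di][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
                # resample topic proportional to (doc-topic) * (topic-word) weight
                weights = [(ndk[di][k] + alpha) * (nkw[k][w] + beta) / (nk[k] + V * beta)
                           for k in range(n_topics)]
                t = rng.choices(range(n_topics), weights=weights)[0]
                z[di][wi] = t
                ndk[di][t] += 1; nkw[t][w] += 1; nk[t] += 1
    top_words = [sorted(nkw[k], key=nkw[k].get, reverse=True)[:3]
                 for k in range(n_topics)]
    return ndk, top_words

# Toy corpus: two clearly separable themes (hypothetical tweet tokens)
docs = [
    ["healthcare", "insurance", "premiums"], ["healthcare", "premiums", "coverage"],
    ["pipeline", "energy", "keystone"], ["pipeline", "keystone", "jobs"],
]
ndk, top_words = lda_gibbs(docs, n_topics=2)
print(top_words)
```

The per-document topic counts (`ndk`) are what let an analysis like the one above compare how concentrated each party's accounts are on a few messaging topics.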
This document discusses a partnership between Burger King and the NCAA for their March Madness basketball tournament. It highlights how the two brands are a good fit due to their shared large, engaged fan bases and social media reach. Burger King's marketing campaign for March Madness included custom TV spots, a bracket promotion, on-site fan activations at games, and jumbotron promotions. The goal was to integrate the Burger King brand into the tournament experience and conversations in an organic way.
This document summarizes a statistical analysis of shot quality in Major League Lacrosse from 2015-2016. Data was collected on shot location, shooter, assists, shot type, and result. Analysis found expected goals vary by field zone and identified tendencies of individual players and teams. Visualizations showed shot densities and comparisons to league averages. The analysis aims to provide new metrics to evaluate player and team performance beyond traditional stats. Future work could explore defensive contributions and market inefficiencies using salary data.
2nd Info Evaluation Source Group Scores Rust 10-3-14 - Buffy Hamilton
Students in a class evaluated 7 different information sources on a security breach involving the Secret Service using the CRAAP test to assess credibility. For Source 3, a CNN news video, most students gave it high marks for being recent, factual, and directly related to the topic. Source 5, an NPR podcast, was also viewed favorably by many students for directly addressing the issue in an unbiased way from a reputable source, though a few noted some opinion. Source 7, a Congressional Research Service report, provided many facts but was not always current and did not focus directly on the specific event in question.
This study finds support for agenda melding and further validates the Network Agenda Setting (NAS) model through a series of computer science methods applied to large Twitter datasets. The results demonstrate that during the 2012 U.S. presidential election, distinct audiences "melded" the agendas of various media differently. "Vertical" media best predicted Obama supporters' agendas on Twitter, whereas Romney supporters were best explained by Republican "horizontal" media. Moreover, Obama and Romney supporters relied on their politically affiliated horizontal media more than on the opposing party's media. Evidence for the findings is provided through the NAS model, which measures the agenda-setting effect not in terms of issue frequency alone but also in terms of the interconnections and relationships among issues within an agenda.
Social Media Research for Qualitative Data - ksawatzk
This document discusses social media research (SMR) as a qualitative research method. SMR involves analyzing existing data from social media platforms rather than conducting surveys or focus groups. It has advantages like accessing large amounts of variable data over time, but disadvantages like lack of demographics. The document outlines ethical considerations for SMR and software that can help analyze social media content. Overall it presents SMR as a viable primary data source for qualitative research.
3rd period info evaluation source group scores - Buffy Hamilton
The document summarizes student scores and notes from an evaluation of 7 information sources using the CRAAP test. For each source, students assessed criteria like currency, relevance, authority, accuracy, and purpose. The Washington Post article scored highest, while the American Thinker blog post and the 2010 reference article scored lowest due to biases, lack of currency or relevance to the topic of the Secret Service. The Congressional Research Service report received a high score for its recency and the expert knowledge of the author on homeland security topics.
A slide deck discussing the results of my semester-long analysis on the hashtag "fake news". Within the deck is a compilation of statistical charts to offer ideas on the significance of this hashtag, as well as a deep dive into the social dynamics attached to this topic.
This document summarizes the process an initiative group took to create a data visualization about deaths from police shootings in the United States. They began by choosing this topic and collecting relevant data. They developed personas to represent who would use the tool. Through iterations of brainstorming, sketching, and wireframing, they designed a map-based visualization that allowed filtering and comparing state-level data. They created a working prototype and refined it based on feedback to focus on one key persona and ensure the tool met policymakers' needs.
This document analyzes journalists' use of humor on Twitter during the first 2012 US presidential debate:
- 17.9% of tweets from 430 political journalists analyzing the debate contained attempts at humor. Newspaper reporters had a slightly higher percentage.
- The use of humor was only positively correlated with retweets, not other Twitter activities like mentions or hyperlinks.
- While journalists traditionally restrain humor for objectivity, Twitter allows more transparency and engagement with audiences. Humor on the platform marks a break from strict reporting practices and could challenge elite news discourse.
FINAL PRINT - Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC... - Nathan J Stone
This document is an introduction to a final project submitted by Nathan J. Stone for a master's degree. It discusses reader engagement with news across the New York Times and Facebook. It reviews literature on how people consume vast amounts of information through various sources like social media. It also discusses an experiment by NPR where they posted a fake news story to see how many people would comment without reading it. The introduction argues that for a democratic society, readers need to critically analyze and understand what they are reading rather than just scanning headlines. It will analyze reader comments on top stories from the New York Times and Facebook to examine the depth of engagement.
Partisans remain sharply divided in their views of the news media according to a 2018 Pew Research Center survey. The survey found:
1) Democrats (82%) are much more likely than Republicans (38%) to think news media criticism keeps political leaders from doing things they shouldn't, continuing a large partisan divide from 2017. This gap is the largest in over 30 years of surveys.
2) Most Americans (71%) think news will be accurate, but many (68%) believe news organizations cover up mistakes. Most also feel the media doesn't understand them or that they are disconnected from their news sources.
3) While few have high trust in social media for news (4%), more have trust in national
This study examines the relationship between perceptions of the credibility of partisan and balanced news sources and levels of political polarization. The researchers analyzed survey data from 16,305 Americans regarding their views on the credibility of various media sources (Fox News, MSNBC, the New York Times, etc.) and their levels of political polarization. They found that perceiving partisan sources (MSNBC specifically) as credible was linked to higher polarization. However, perceiving balanced sources (the New York Times, CNN, broadcast news) as credible was associated with lower polarization, even after controlling for political and demographic factors. The researchers discuss why perceptions of liberal versus conservative sources may differentially impact polarization. A limitation is that the analysis cannot establish causation.
Copy of danah boyd's draft of "Tweet, Tweet, Retweet: Conversational Aspects of Retweeting on Twitter." http://www.zephoria.org/thoughts/archives/2009/06/18/understanding_r.html (Embedded here to share easily. No claim over the content at all. Just a fanboy.)
Social Media: the good, the bad and the ugly - Josh Cowls
1. Social media can facilitate information sharing and communication, aiding disaster relief and public health efforts. However, when information is more mediated, people can be anti-social, offline power dynamics are replicated online, and behavior is difficult to measure accurately.
2. While social media aim to be horizontal, in reality prominent offline figures and media elites still hold sway. Measuring public opinion on social media also faces challenges regarding representativeness and reliability.
3. Those who have access to large social media datasets can use algorithms to potentially influence users or even predict criminal behavior, showing the power of "big data."
This document provides an introduction and background for a research study analyzing how the NFL utilizes social media to engage with fans and provide updates. It outlines the study's goals of exploring the NFL's current social media effectiveness and how results could help the league improve engagement. The methodology section describes collecting surveys from 100 students ages 18-25 at Eastern Connecticut State University to measure fans' social media usage and satisfaction with NFL updates received. Key terms are defined and the survey questions are provided.
Using Tweets for Understanding Public Opinion During U.S. Primaries and Predi...Monica Powell
Abstract
Using social media for political analysis, especially during elections, has become popular in the past few years where many researchers and media now use social media to understand the public opinion and current trends. In this paper, we investigate methods for using Twitter to analyze public opinion and to predict U.S. Presidential Primary Election results. We analyzed over 13 million tweets from February 2016 to April 2016 during the primary elections, and we looked at tweets that mentioned either Hillary Clin- ton, Bernie Sanders, Donald Trump or Ted Cruz. First, we use the methods of sentiment analysis, geospatial analysis, net- work analysis, and visualizations tools to examine public opinion on twitter. We then use the twitter data and analysis results to propose a prediction model for predicting primary election results. Our results highlight the feasibility of using social media to look at public opinion and predict election results.
This document summarizes an analysis of ISIS Twitter activity. It identifies the problem of ISIS using Twitter to further its agenda and degrade the US, and the hypothesis that analyzing ISIS Twitter data can enhance understanding of their networks and tactics. It then describes collecting data from 10 self-identified jihadist Twitter accounts, performing network and text analytics on the data, and finding two distinct communities centered around British foreign fighters in Syria and information about Syrian Islamist groups. It concludes more data is needed but finds no correlation between US airstrikes and tweet volume except for one user, and that tweets generally contained religious statements or news about fighting in Syria.
Curating and Contextualizing Twitter Stories to Assist with Social Newsgatheringazubiaga
While journalism is evolving toward a rather open-minded participatory paradigm, social media presents overwhelming streams of data that make it difficult to identify the information of a journalist's interest. Given the increasing interest of journalists in broadening and democratizing news by incorporating social media sources, we have developed TweetGathering, a prototype tool that provides curated and contextualized access to news stories on Twitter. This tool was built with the aim of assisting journalists both with gathering and with researching news stories as users comment on them. Five journalism professionals who tested the tool found helpful characteristics that could assist them with gathering additional facts on breaking news, as well as facilitating discovery of potential information sources such as witnesses in the geographical locations of news.
A Dream of Predicting Elections and Trading Stocks using Twitter - Yelena Mej...Yandex
This document discusses several studies that aimed to analyze social media data like tweets to track and predict political sentiment and elections. Some key findings include:
- Classifying tweets by political leaning using text or network analysis achieved over 80% accuracy in some cases. Hashtag propagation was able to identify pro/anti protest sentiment with over 90% accuracy.
- Analyzing the overlap in audiences of news sources on Twitter correlated with known media bias scores, suggesting social media data could quantify bias.
- A study of the 2010 US Senate election found the "vocal minority" of highly active tweeters may have different agendas than the "silent majority" of less active users.
- Other research explored detecting politically motivated
How india muslims are being demonised through whats app groups (a critical studyZahidManiyar
This document summarizes a research paper that analyzes "fear speech" on WhatsApp groups in India. The researchers:
1) Define fear speech and distinguish it from hate speech, finding fear speech aims to instill fear of minority groups through misinformation rather than using toxic language.
2) Create a dataset of over 27,000 WhatsApp posts, manually labeling 8,000 as fear speech and 19,000 as non-fear speech, focusing on Islamophobic fear speech.
3) Develop models to identify fear speech automatically and conduct an online survey to understand the characteristics of users who share and consume fear speech.
This document discusses a topic modeling analysis of tweets from members of the 113th US Congress. The analysis aimed to identify patterns in party messaging and how individual members adopted or diverged from party messages. Topic modeling of over 180,000 tweets from 522 members identified 40 topics. Results showed that while members discussed a wide range of issues, both Democratic and Republican party accounts focused more on a few key messaging topics. However, some members diverged from party stances on certain issues like ISIL and Keystone pipeline.
This document discusses a partnership between Burger King and the NCAA for their March Madness basketball tournament. It highlights how the two brands are a good fit due to their shared large, engaged fan bases and social media reach. Burger King's marketing campaign for March Madness included custom TV spots, a bracket promotion, on-site fan activations at games, and jumbotron promotions. The goal was to integrate the Burger King brand into the tournament experience and conversations in an organic way.
This document summarizes a statistical analysis of shot quality in Major League Lacrosse from 2015-2016. Data was collected on shot location, shooter, assists, shot type, and result. Analysis found expected goals vary by field zone and identified tendencies of individual players and teams. Visualizations showed shot densities and comparisons to league averages. The analysis aims to provide new metrics to evaluate player and team performance beyond traditional stats. Future work could explore defensive contributions and market inefficiencies using salary data.
2nd Info Evaluation Source Group Scores Rust 10-3-14Buffy Hamilton
Students in a class evaluated 7 different information sources on a security breach involving the Secret Service using the CRAAP test to assess credibility. For Source 3, a CNN news video, most students gave it high marks for being recent, factual, and directly related to the topic. Source 5, an NPR podcast, was also viewed favorably by many students for directly addressing the issue in an unbiased way from a reputable source, though a few noted some opinion. Source 7, a Congressional Research Service report, provided many facts but was not always current and did not focus directly on the specific event in question.
This study finds support for agenda melding and further validates the Network Agenda Setting (NAS) model through a series of computer science methods with large datasets on Twitter. The results demonstrate that during the 2012 U.S. presidential election, distinctive audiences “melded” agendas of various media differently. “Vertical” media best predicted Obama supporters’ agendas on Twitter whereas Romney supporters were best explained by Republican “horizontal” media. Moreover, Obama and Romney supporters relied on their politically affiliated horizontal media more than their opposing party’s media. Evidence for findings are provided through the NAS model, which measures the agenda-setting effect not in terms of issue frequency alone, but also in terms of the interconnections and relationships issues inside of an agenda.
Social Media Research for Qualitative Dataksawatzk
This document discusses social media research (SMR) as a qualitative research method. SMR involves analyzing existing data from social media platforms rather than conducting surveys or focus groups. It has advantages like accessing large amounts of variable data over time, but disadvantages like lack of demographics. The document outlines ethical considerations for SMR and software that can help analyze social media content. Overall it presents SMR as a viable primary data source for qualitative research.
3rd period info evaluation source group scores – Buffy Hamilton
The document summarizes student scores and notes from an evaluation of 7 information sources using the CRAAP test. For each source, students assessed criteria like currency, relevance, authority, accuracy, and purpose. The Washington Post article scored highest, while the American Thinker blog post and the 2010 reference article scored lowest due to biases, lack of currency or relevance to the topic of the Secret Service. The Congressional Research Service report received a high score for its recency and the expert knowledge of the author on homeland security topics.
A slide deck discussing the results of my semester-long analysis on the hashtag "fake news". Within the deck is a compilation of statistical charts to offer ideas on the significance of this hashtag, as well as a deep dive into the social dynamics attached to this topic.
This document summarizes the process an initiative group took to create a data visualization about deaths from police shootings in the United States. They began by choosing this topic and collecting relevant data. They developed personas to represent who would use the tool. Through iterations of brainstorming, sketching, and wireframing, they designed a map-based visualization that allowed filtering and comparing state-level data. They created a working prototype and refined it based on feedback to focus on one key persona and ensure the tool met policymakers' needs.
This document analyzes journalists' use of humor on Twitter during the first 2012 US presidential debate:
- 17.9% of tweets from 430 political journalists analyzing the debate contained attempts at humor. Newspaper reporters had a slightly higher percentage.
- The use of humor was only positively correlated with retweets, not other Twitter activities like mentions or hyperlinks.
- While journalists traditionally restrain humor for objectivity, Twitter allows more transparency and engagement with audiences. Humor on the platform marks a break from strict reporting practices and could challenge elite news discourse.
FINAL PRINT - Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC... – Nathan J Stone
This document is an introduction to a final project submitted by Nathan J. Stone for a master's degree. It discusses reader engagement with news across the New York Times and Facebook. It reviews literature on how people consume vast amounts of information through various sources like social media. It also discusses an experiment by NPR where they posted a fake news story to see how many people would comment without reading it. The introduction argues that for a democratic society, readers need to critically analyze and understand what they are reading rather than just scanning headlines. It will analyze reader comments on top stories from the New York Times and Facebook to examine the depth of engagement.
Partisans remain sharply divided in their views of the news media according to a 2018 Pew Research Center survey. The survey found:
1) Democrats (82%) are much more likely than Republicans (38%) to think news media criticism keeps political leaders from doing things they shouldn't, continuing a large partisan divide from 2017. This gap is the largest in over 30 years of surveys.
2) Most Americans (71%) think news will be accurate, but many (68%) believe news organizations cover up mistakes. Most also feel the media doesn't understand them or that they are disconnected from their news sources.
3) While few have high trust in social media for news (4%), more have trust in national
This study examines the relationship between perceptions of credibility of partisan and balanced news sources and levels of political polarization. The researchers analyzed survey data from 16,305 Americans regarding their views of credibility of various media sources (Fox News, MSNBC, New York Times, etc.) and their political polarization levels. They found that perceiving partisan sources (MSNBC specifically) as credible was linked to higher polarization. However, perceiving balanced sources (New York Times, CNN, broadcast news) as credible was associated with lower polarization levels, even after controlling for political and demographic factors. The researchers discuss why perceptions of liberal versus conservative sources may differentially impact polarization. Limitations include not proving causation.
Copy of danah boyd's draft of "Tweet Tweet Retweet: Conversational Aspects of Retweeting on Twitter." http://www.zephoria.org/thoughts/archives/2009/06/18/understanding_r.html (Embedded here to share easily. No claim over the content, at all. Just a fanboy.)
Social Media: the good, the bad and the ugly – Josh Cowls
1. Social media can facilitate information sharing and communication, aiding disaster relief and public health efforts. However, when information is more mediated, people can be anti-social, offline power dynamics are replicated online, and behavior is difficult to measure accurately.
2. While social media aim to be horizontal, in reality prominent offline figures and media elites still hold sway. Measuring public opinion on social media also faces challenges regarding representativeness and reliability.
3. Those who have access to large social media datasets can use algorithms to potentially influence users or even predict criminal behavior, showing the power of "big data."
Prof. Libby Hemphill
IIT Lewis College of Human Sciences
@libbyh
Prof. Edward Lee
IIT Chicago-Kent College of Law
@edleeprof
Using social media to mobilize people, whether for a product campaign or a political protest, is no easy task. This presentation will highlight some of the challenges organizers and entities face when trying to mobilize and sustain a campaign through social media. It is based on our empirical analysis of the ongoing efforts of political activists to engage the public about the NSA surveillance controversy by use of Twitter.
This document proposes a method to quantify the political leaning of Twitter users based on their tweet and retweet activity. It formulates the inference of political leaning as a convex optimization problem that incorporates two ideas: (1) a user's tweets and retweets should be consistent in sentiment, and (2) similar users tend to be retweeted by similar audiences. The method is evaluated on 119 million election-related tweets from the 2012 US presidential election and achieves 94% accuracy in classifying frequently retweeted sources. A quantitative analysis of the tweets also finds that parody accounts and less vocal users are more likely to be liberal, while hashtags usage changes significantly with political events.
This document summarizes research on how emotions, demographics, and sociability affect social interactions on Twitter. The researchers analyzed over 6 million geo-tagged tweets from 340,000 users in Los Angeles County. They found that weaker social ties on Twitter and greater mobility between different places were associated with more positive emotions, higher income, and education levels. In contrast, stronger ties and lower mobility correlated with less positive emotions and more Hispanic residents. The study provides empirical evidence that cognitive factors and socio-economic characteristics influence the structure and quality of online social interactions on Twitter.
This document summarizes a student's Twitter sentiment analysis project on the 2016 US presidential election. The student scraped tweets with hashtags #Election, #Hillary, and #Trump before and after the election to create datasets. Natural language processing was used to analyze sentiment and tweets were organized into a network graph. Key findings include: 1) Graphs after the election were tighter as retweets/impacts narrowed, 2) Sentiment on #Election lowered after results, 3) #Hillary tweets were fairly neutral amid controversies before the election. The student analyzed differences in sentiment, connections, and influencers between pre- and post-election graphs for each topic.
The document summarizes a study examining different generations' views of candidates' use of Twitter during presidential campaigns. Focus groups separated by age range discussed their Twitter usage and opinions on candidates' tweets. Younger participants focused more on candidates' reputations, while older groups discussed policy issues. All agreed candidates need an active social media presence to win elections. Twitter was not a major source of political news for any group, but they saw it as important for reaching young voters.
1. The study analyzed the relationship between social media popularity (Facebook fans and Twitter followers) and polling data of 8 Republican presidential candidates from June to December 2011.
2. The results showed a significant correlation between total Facebook fans and polling numbers, but no correlation with Facebook growth rates. There was also a correlation between total Twitter followers and polling numbers after adjusting for outliers.
3. While social media data cannot replace polling, it provides additional insights and may help predict election trends in near real-time, complementing traditional polling methods. More research is still needed to better understand the dynamics.
Increasing Voter Knowledge with Pre-Election Interventions on Facebook – MIT GOV/LAB
As part of our Data Science to Solve Social Problems series, Facebook Data Scientist Winter Mason presented on efforts to increase online civic engagement.
Group research project completed in the Spring Semester of 2016. Studied undergraduate students at Florida State University in order to gain knowledge on how they used social media platforms to gain information about the presidential election.
The document discusses a study on the role of Twitter in the 2010 Nevada Senate race between Harry Reid and Sharron Angle. It begins by providing background on Twitter and how it has been used in political campaigns. It then discusses different theories about how the internet and new technologies can impact political participation and engagement. Specifically, it examines the instrumental approach which posits that lower communication costs increase participation, and the psychological approach which argues individual motivations and attributes determine online political involvement. The document will analyze tweets from Reid and Angle's campaigns and compare them to mainstream media coverage, in order to understand how Twitter was utilized in this competitive Senate election.
The document proposes a mobile app called We The People (WTP) to address low voter turnout by modernizing democracy. WTP would be a database on politicians and candidates with bios and stances. It would allow users to take surveys for free and see results. Survey responses would integrate with social media to trend topics and gain audience. An MVP will test the market by asking local voters if they want to take surveys and have their voice heard. Politicians and groups would subscribe for survey data and ability to create their own surveys. The app aims to give users a voice and make money through subscriptions. It calls on users to join the beta and spread the word to take back democracy.
This document describes a system created by researchers to analyze sentiment in tweets about 2012 US presidential candidates in real-time. The system collects tweets through the Twitter API, preprocesses them by tokenizing text, matches tweets to candidates, analyzes sentiment using a model trained on Twitter language, aggregates sentiment results by candidate, and visualizes the analysis through interactive dashboards. It provides up-to-the-minute insight into how public opinion responds to political events as expressed on Twitter.
This document summarizes a project analyzing sentiment in tweets from the third 2016 US Presidential debate between Clinton and Trump. The team collected over 100,000 tweets using APIs and analyzed sentiment using the VADER analyzer, achieving 68% accuracy. Visualizations of keywords and sentiment were created and made available online. Sentiment analysis determined positive or negative emotions associated with text, and VADER was used as it is context-aware. Motivations included using tweets as a gauge of issue discussion and that social media is a popular place for political discussion.
This document discusses how sentiment analysis was used to analyze public opinion during the 2012 US Presidential debates between Barack Obama and Mitt Romney. Sentiment analysis involves evaluating subjective information in text data to determine sentiment. It can analyze large amounts of data from social media and news to classify sentiment as positive, negative, or neutral. The analysis of posts during and after the debates provided insights into how the public reacted to different moments and helped analysts understand who was perceived to have won each debate. Real-time analysis also allowed campaigns to track changes in sentiment over the course of live events.
This study examined the influence of social networking sites and interpersonal political discussion on civic and political participation and confidence in government. The study found that reliance on social networking sites was positively associated with civic participation but not political participation or confidence in government. Interpersonal political discussion was found to enhance political participation and help citizens develop higher quality opinions. The study suggests encouraging more interpersonal political discussion to stimulate civic and political participation.
This document provides guidelines for journalists on appropriately reporting opinion polls. It discusses [1] determining whether a poll meets professional standards, [2] deciding if a poll's findings have newsworthiness, and [3] the appropriate way to publish poll findings. Key points include checking a poll's methodology, sample size, and margin of error; using polls to enhance issues coverage rather than set the agenda; and providing full context and disclosure when publishing poll results. The guidelines aim to help journalists identify valid, reliable polls and determine the most meaningful way to communicate poll findings to their audience.
This document summarizes Facebook's research on civic engagement and political efficacy. It discusses the multi-disciplinary research team, the importance of civic engagement to Facebook's mission, and how the team conducts qualitative interviews, surveys, and analyzes interaction data to understand civic behaviors and gaps. The research is used to imagine new product ideas like helping people connect with local representatives. The process involves understanding needs, observing the current state, instrumenting the platform to measure priorities, imagining solutions, building products, and iterating based on results. While the impact on political efficacy is unclear, the research aims to engage users and ensure responsibility as the company explores this important area.
Twitter Based Election Prediction and Analysis – IRJET Journal
This document discusses using Twitter data to predict election outcomes through sentiment analysis. It begins with an introduction to election prediction methods and why social media data is being explored as an alternative. The paper then reviews related work on using features like user profiles, linguistic content, and sentiment analysis of tweets mentioning candidates. It describes the methodology used, including data collection from Twitter's API, preprocessing tweets, and performing sentiment analysis using both machine learning and lexicon-based approaches. The results section shows the sentiment analysis identified more positive tweets for Clinton and more negative tweets for Trump, suggesting Clinton would win. Emotion analysis found more tweets expressing sadness for Clinton and joy for Trump.
The document summarizes a social network analysis of the 2016 US presidential candidates Hillary Clinton and Donald Trump on Twitter. It introduces the purpose of analyzing their social media networks to understand their reach and how it impacts their campaigns. It then briefly reviews literature on previous research analyzing the role of social media in elections from 2004 to 2010. The research questions aim to study how information flows through each candidate's network and whether people with more connections act as influencers. Data was collected from Twitter using hashtags and analyzed using tools like NodeXL and Gephi.
The document analyzes the social networks of 2016 US presidential candidates Hillary Clinton and Donald Trump on Twitter. It finds that:
1. Donald Trump's network contained 3 communities - one for Trump supporters (green), one for Hillary Clinton supporters (blue), and one for Ted Cruz supporters (red).
2. Users with high betweenness centrality, like 'ibegoodnow' and 'thegreatfeather', may act as influential spreaders of information.
3. The clusters for Trump and Cruz were connected through multiple users, indicating they belong to the same party, whereas Clinton was only connected through one user.
Are Twitter Users Equal in Predicting Elections
1. Are Twitter Users Equal in Predicting Elections? A Study of User Groups in Predicting 2012 U.S. Republican Presidential Primaries

Lu Chen (chen@knoesis.org), Wenbo Wang (wenbo@knoesis.org), Amit Sheth (amit@knoesis.org)

Lu Chen, Wenbo Wang, Amit Sheth. Are Twitter Users Equal in Predicting Elections? A Study of User Groups in Predicting 2012 U.S. Republican Presidential Primaries. The 4th International Conference on Social Informatics (SocInfo2012), 2012.
2. There is a surge of interest in building systems that harness the power of social data to predict election results. Examples include:

- # of Facebook users talking about each candidate, and who is talking about which candidate: age, gender, state
- Twitter users' positive/negative opinions about each candidate
- Tweets from @BarackObama and @MittRomney organized by engagement on Twitter
- # of Facebook "likes" & Twitter "followers"
- Real-time semantic analysis of topic, opinion, emotion, and popularity about each candidate

Are Twitter Users Equal in Predicting Elections? Lu Chen, Wenbo Wang, Amit Sheth
3. One problem seems to be ignored: are social media users equal in predicting elections?

They may be from different countries and states.
They may have different political beliefs.
They may be of different ages.
They may engage in the elections in different ways and with different levels of involvement.
……
They may be … different in predicting elections…?

WHOSE opinion really matters?
4. 4
o We Study different groups of
social media users who engage in
the discussions of 2012 U.S.
Republican Presidential Primaries,
and compare the predictive power
among these user groups.
Data: Using Twitter Streaming API, we collected tweets that contain the words
“gingrich”, “romney”, “ron paul”, or “santorum” from 01/10/2012 to 03/05/2012 (Super
Tuesday was 03/06/2012). The dataset comprises 6,008,062 tweets from 933,343 users.
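The keyword filter behind this collection can be sketched as a simple predicate. This is a minimal illustration only; the actual study tracked these keywords through the Twitter Streaming API rather than filtering texts locally.

```python
# Candidate keywords tracked during collection (from the study's setup).
KEYWORDS = ["gingrich", "romney", "ron paul", "santorum"]

def matches_election_keywords(tweet_text: str) -> bool:
    """Return True if the tweet mentions any tracked candidate keyword."""
    text = tweet_text.lower()
    return any(kw in text for kw in KEYWORDS)

# Example: keep only election-related tweets from a stream of texts.
stream = [
    "Mitt Romney wins straw poll",
    "Nice weather today",
    "Ron Paul rally draws thousands",
]
election_tweets = [t for t in stream if matches_election_keywords(t)]
```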
6.
More than half of the users posted only one tweet. Only 8% of the
users posted more than 10 tweets.
A small group of users (0.23%) produced a large share of the tweets
(23.73%) – is tweet volume a reliable predictor?
The usage of hashtags and URLs reflects the users' intent to attract
people's attention to the topic they discuss. The more engaged users
show stronger such intent and are more involved in the election event.
7.
The original tweet-dominant group accounts for the biggest
proportion of users in every user engagement group.
A significant number of users (34.71% of all the users) belong to the
retweet-dominant group, whose voting intent might be more difficult
to detect.
Based on how users generate their tweets, i.e., their tweet mode, we
classified the users as original tweet-dominant, original tweet-prone, balanced,
retweet-prone, and retweet-dominant.
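The five tweet-mode groups can be sketched as a threshold rule on a user's retweet share. The cutoff values below are illustrative assumptions, not the paper's exact boundaries.

```python
def tweet_mode(n_original: int, n_retweet: int) -> str:
    """Classify a user by the share of retweets among their tweets.

    Threshold values here are illustrative, not the paper's exact cutoffs.
    """
    total = n_original + n_retweet
    if total == 0:
        return "no tweets"
    r = n_retweet / total  # retweet share
    if r < 0.1:
        return "original tweet-dominant"
    elif r < 0.4:
        return "original tweet-prone"
    elif r <= 0.6:
        return "balanced"
    elif r <= 0.9:
        return "retweet-prone"
    else:
        return "retweet-dominant"
```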
8.
More engaged users tend to post a mixture of content, with a similar
proportion of opinion and information, or a larger proportion of
information.
We use target-specific sentiment analysis techniques to classify each tweet as
positive or negative, i.e., whether the expressed opinion about a specific candidate is
positive or negative. Users are then categorized based on whether they post more
information or more opinion.
9.
Right-leaning users were (as expected) more involved in the Republican
primaries in several ways: more users, more tweets, more original
tweets, higher usage of hashtags and URLs.
We collected a set of Twitter users with known political preference from Twellow
(http://www.twellow.com/categories/politics). Based on the assumption that a user tends
to follow others who share the same political preference as his/hers, we identified the
left-leaning and right-leaning users utilizing their following/follower relations. We
tested this method on a dataset of 3,341 users, and it showed an accuracy of 0.9243.
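The follow-based labeling can be sketched as a majority vote over a seed set of users with known leanings. The seed accounts below are hypothetical placeholders, not Twellow data, and the majority-vote rule is a simplification of the study's method.

```python
from collections import Counter

# Hypothetical seed set: account -> known leaning. In the study, seeds
# came from Twellow's politics categories; these names are placeholders.
SEEDS = {"conservative_acct": "right", "gop_news": "right", "liberal_blog": "left"}

def infer_leaning(followed_accounts, seeds=SEEDS):
    """Majority vote over the seed accounts a user follows.

    Returns None when the user follows no seed account or the vote is tied.
    """
    votes = Counter(seeds[a] for a in followed_accounts if a in seeds)
    if not votes:
        return None
    (top, top_n), *rest = votes.most_common()
    if rest and rest[0][1] == top_n:
        return None  # tie: leaning undetermined
    return top
```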
10.
Pearson's r for the correlation between the number of users/tweets
from each state and the state's population is 0.9459/0.9667 (p < .0001).
We utilized background knowledge from LinkedGeoData to identify
states from users' profile location information.
If a user's state could not be inferred from the location in his/her profile, we
utilized the geographic locations of his/her tweets: a user was recognized as being from
a state if his/her tweets were from that state.
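The two-step state resolution can be sketched as a simple cascade. The LinkedGeoData parsing itself is omitted here; this only shows the fallback logic, and the handling of users whose tweets come from several states is an assumption.

```python
def infer_state(profile_state, tweet_states):
    """Resolve a user's home state.

    Prefer the state parsed from the profile location (via LinkedGeoData in
    the study; that parsing is not shown here). Otherwise fall back to the
    states of the user's geotagged tweets; returning None when tweets span
    multiple states is an assumption of this sketch.
    """
    if profile_state:
        return profile_state
    unique = set(tweet_states)
    if len(unique) == 1:
        return unique.pop()
    return None  # no geotagged tweets, or tweets from several states
```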
11. Predicting a User's Vote
• Basic idea: for which candidate the user shows the most support
– Frequent mentions
– Positive sentiment
Nm(c): the number of tweets mentioning candidate c
Npos(c): the number of positive tweets about candidate c
Nneg(c): the number of negative tweets about candidate c
λ (0 < λ < 1): smoothing parameter
γ (0 < γ < 1): discounts the score when the user does not express any opinion towards c
The score covers two cases: the user posted opinion about c, or the user mentioned c but did not post opinion about c. More mentions yield a higher score; more positive (or fewer negative) opinions yield a higher score.
12. Prediction Results
We examine the predictive power of different user groups in predicting the
results of Super Tuesday races in 10 states.
To predict the election results in a state, we used only the collection of
users who are identified from that state.
The results were evaluated in two ways: (1) the accuracy of predicting
winners, and (2) the error rate between the predicted percentage of votes
and the actual percentage of votes for each candidate.
We examined four time windows -- 7 days, 14 days, 28 days and 56 days
prior to the election day. In a specific time window, a user's vote was
assessed using only the set of tweets he/she created during this time.
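The per-state aggregation and the two evaluation measures can be sketched as follows; the vote counts and shares in the example are made up for illustration.

```python
def predict_state(user_votes, actual_shares):
    """Aggregate per-user predicted votes into state-level vote shares.

    user_votes: list of predicted candidate names, one per user in the state.
    actual_shares: dict mapping candidate -> actual vote fraction.
    Returns (predicted winner, mean absolute error over candidates),
    mirroring the study's two measures: winner accuracy and vote-share error.
    """
    n = len(user_votes)
    predicted = {c: user_votes.count(c) / n for c in actual_shares}
    winner = max(predicted, key=predicted.get)
    error = sum(abs(predicted[c] - actual_shares[c])
                for c in actual_shares) / len(actual_shares)
    return winner, error

# Made-up example: 10 users from one state.
votes = ["romney"] * 6 + ["santorum"] * 3 + ["gingrich"] * 1
winner, err = predict_state(votes, {"romney": 0.5, "santorum": 0.3, "gingrich": 0.2})
```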
13.
The prediction accuracy:
Engagement Degree: High > Low or Very Low
Tweet Mode: Original Tweet-Prone > Retweet-Prone
Content Type: In a draw
Political Preference: Right-Leaning >> Left-Leaning
14.
Revealing the challenge of identifying the vote intent of the “silent majority”.
Retweets may not necessarily reflect users' attitudes.
Prediction of a user’s vote based on more opinion tweets is not necessarily more accurate than prediction using more information tweets.
The right-leaning user group provides the most accurate prediction results. In the best case (56-day time window), it correctly predicted the winners in 8 out of 10 states, with an average prediction error of 0.1.
To some extent, this demonstrates the importance of identifying likely voters in electoral prediction.
15.
Our findings
Twitter users are not “equal”
in predicting elections!
The likely voters’ opinions matter more.
Some users’ opinions are more difficult to identify because
of their lower levels of engagement
or the implicit ways in which they express opinions.
16. More work needs to be
done…
• Identifying likely/actual voters
• Improving sentiment analysis
techniques
• Investigating possible data biases
(e.g., spam tweets and political
campaign tweets) and how they
might affect the results
and more …
17.
It is actually about tracking public opinion.
Polling or Social Media Analysis?
1. Sample size
2. Representativeness of the target population
3. Accurate measure of opinions
4. Timeliness
18.
1 Sample Size
Polling: thousands of people
Social Media Analysis: millions of people
19.
2 Representativeness of the Target Population
Polling Social Media Analysis
[1] Can Social Media Be Used for Political Polling? http://www.radian6.com/blog/2012/07/can-social-media-be-used-for-political-polling/
Polling:
About 95% of US homes can be reached by landline telephone and cell phone.
Sampling the target population randomly.
Weighting the sample to census estimates for demographic characteristics (gender, race, age, educational attainment, and region).
Social Media Analysis:
About 60% of American adults use social networking sites.
Difficult to do random sampling.
Limited demographic data (although with some work, this can be improved).
20.
3 Accurate measure of opinions
Polling: ask people what they think (“Who will you vote for?”)
Social Media Analysis: look at what people talk about and extract their opinions (not as accurate as polling)
21.
4 Timeliness
Polling: not able to track people’s opinion in real time
Social Media Analysis: what is happening now
22. Social Media Analysis – Promising but Very
Challenging
Promising:
• Increasing number of social media users
• Convenient and comfortable way to express opinions
• The analysis can be done in real time
• Lower cost
• A great complement (if not substitute) for polling
Challenging:
• Extracting demographic information
• Identifying the target population whose opinion matters, e.g., the likely voters in electoral prediction
• Discriminating personal opinion from the voice of mainstream media and political campaigns
• More accurate sentiment analysis/opinion mining, especially the identification of opinions about a specific object
23. Subjective Information Extraction, Lu Chen
Our Twitris+ system kept tracking
people’s opinion on the 2012 U.S.
Presidential Election in real time, and this
is what we saw on Election Day …
26.
Sentiment change about Barack Obama
Sentiment change about Mitt Romney
Positive/negative topics that contribute to such change
Analysis can be performed at the location or issue level
A key innovation in sentiment analysis, employed in Twitris+, is topic-specific sentiment
analysis: associating sentiment with an entity. The same sentiment phrase may be assigned
different polarities when associated with different entities.
Twitris+ tracks sentiment trends for different entities, and identifies topics/events that
contribute to sentiment changes. The results are updated every hour.
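Entity-conditioned polarity can be illustrated with a toy lexicon keyed by (entity, phrase) pairs. The entries and polarity values below are invented for illustration; they are not Twitris+ data or its actual model.

```python
# Toy target-specific lexicon: the same phrase maps to different
# polarities depending on the entity it is about. All entries are
# invented examples, not real Twitris+ data.
TARGET_LEXICON = {
    ("obama", "unbelievable"): -1,   # hypothetically sarcastic usage
    ("romney", "unbelievable"): +1,  # hypothetically admiring usage
    ("obama", "four more years"): +1,
}

def target_sentiment(entity, phrases):
    """Sum phrase polarities with respect to a single entity.

    Phrases with no lexicon entry for that entity contribute 0.
    """
    return sum(TARGET_LEXICON.get((entity, p), 0) for p in phrases)
```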
27. Twitris+ Insights in 2012 Presidential Debates
How was Obama doing in the first debate?
28.
How was Obama doing in the second debate?
Red Color: Negative Topics
Green Color: Positive Topics
29.
Obama vs. Romney in the third debate
30. Thank you !
More about this study:
http://wiki.knoesis.org/index.php/ElectionPrediction
Kno.e.sis Center:
http://knoesis.wright.edu/
Twitris+:
http://twitris.knoesis.org/
Semantics driven Analysis of Social Media:
http://knoesis.org/research/semweb/projects/socialmedia
Editor's Notes
Tweet volume alone may not be a reliable predictor, since a small group of users can produce a large share of the tweets, e.g., political campaign and promotion tweets.
Some of the Twellow preferences are self-declared.
There is a very strong correlation between the number of Twitter users/tweets from each state and the population of each state. A Pearson's correlation coefficient between 0.9 and 1.0 usually indicates a very strong correlation.
Categorized by engagement degree: the high-engagement users achieved better prediction results. This may be due to two reasons: (1) high-engagement users posted more tweets, and predictions made from more tweets are more reliable; (2) more engaged users were more involved in the election event and were more likely to vote.
Categorized by tweet mode: the original tweet-prone users achieved better prediction results, which might suggest the difficulty of identifying users' voting intent from retweets.
Categorized by content type: no significant difference was found between the two groups.
Categorized by political preference: the right-leaning user group achieved significantly better results than the left-leaning group.