Lu Chen, Wenbo Wang, Amit Sheth. Are Twitter Users Equal in Predicting Elections? A Study of User Groups in Predicting 2012 U.S. Republican Presidential Primaries. The 4th International Conference on Social Informatics (SocInfo2012), December 5-8, 2012, Lausanne, Switzerland.
http://knoesis.org/library/resource.php?id=1787
Are Twitter Users Equal in Predicting Elections? Insights from Republican Primaries and 2012 General Election
1. Are Twitter Users Equal in
Predicting Elections?
A Study of User Groups in Predicting 2012 U.S. Republican Presidential Primaries
(with additional insights into the 2012 General Election)
Lu Chen Wenbo Wang Amit Sheth
chen@knoesis.org wenbo@knoesis.org amit@knoesis.org
Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)
Wright State University, Dayton, OH, USA
Lu Chen, Wenbo Wang, Amit Sheth. Are Twitter Users Equal in Predicting Elections? A Study of User Groups in Predicting 2012 U.S. Republican Presidential Primaries. The 4th International Conference on Social Informatics (SocInfo2012), December 5-8, 2012, Lausanne, Switzerland.
2. There is a surge of interest in building systems that harness the
power of social data to predict election results.
• # of Facebook “likes” & Twitter “follower”
• Twitter users’ positive/negative opinions about each candidate
• # of Facebook users talking about each candidate; who is talking about which candidate: age, gender, state
• Tweets from @BarackObama and @MittRomney organized by engagement on Twitter
• Real time semantic analysis of topic, opinion, emotion, and popularity about each candidate
Are Twitter Users Equal in Predicting Elections? Lu Chen, Wenbo Wang, Amit Sheth 2
3. One problem seems to be ignored:
Are social media users equal
in predicting elections?
They may be from different countries and states.
They may have different political beliefs.
They may be of different ages.
They may engage in the elections in different ways
and with different levels of involvement.
……
They may be … different in predicting elections…?
WHOSE opinion really matters?
4. We studied different groups of social media users who engaged in discussions of the 2012 U.S. Republican Presidential Primaries, and compared the predictive power of these user groups.
Data: Using the Twitter Streaming API, we collected tweets that contain the words “gingrich”, “romney”, “ronpaul”, or “santorum” from 01/10/2012 to 03/05/2012 (Super Tuesday was 03/06/2012). The dataset comprises 6,008,062 tweets from 933,343 users.
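As an illustration of the keyword criterion above, here is a minimal sketch that checks which of the four tracked words a tweet contains. The function name `mentioned_keywords` and the case-insensitive substring matching are assumptions for illustration; in the study the filtering was done by the Twitter Streaming API at collection time.

```python
# The four tracked words quoted on the slide.
KEYWORDS = ("gingrich", "romney", "ronpaul", "santorum")

def mentioned_keywords(text: str) -> list[str]:
    """Return the tracked keywords found in the tweet text (case-insensitive).

    Hypothetical helper: substring matching is an assumption, not the
    Streaming API's exact matching rule.
    """
    lowered = text.lower()
    return [k for k in KEYWORDS if k in lowered]

sample = "Romney and #RonPaul trade barbs before Super Tuesday"
found = mentioned_keywords(sample)
```

A tweet is kept when the returned list is non-empty.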
5. User Categorization
1. Engagement Degree
2. Tweet Mode
3. Content Type
4. Political Preference
6. 1
More than half of the users posted only one tweet. Only 8% of the
users posted more than 10 tweets.
A small group of users (0.23%) produced a large share of the tweets (23.73%). Is tweet volume a reliable predictor?
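The concentration of tweet volume can be checked with a few lines of code. The sketch below computes the share of tweets posted by the most active fraction of users; the function name `top_user_share` and the toy data are assumptions for illustration, not the study's code or data.

```python
from collections import Counter

def top_user_share(tweet_authors, top_fraction):
    """Share of all tweets posted by the most active `top_fraction` of users."""
    counts = Counter(tweet_authors)                  # tweets per user
    if not counts:
        raise ValueError("no tweets given")
    ranked = sorted(counts.values(), reverse=True)   # most active users first
    k = max(1, round(len(ranked) * top_fraction))    # keep at least one user
    return sum(ranked[:k]) / sum(ranked)

# Toy data: one prolific user and four single-tweet users.
authors = ["a"] * 6 + ["b", "c", "d", "e"]
share = top_user_share(authors, 0.2)  # top 20% of 5 users = 1 user
```

With the toy data, the single most active user accounts for 6 of 10 tweets, which is the kind of skew that makes raw tweet counts a questionable proxy for the number of supporters.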
2
The usage of hashtags and URLs reflects the users' intent to draw people's attention to the topic they discuss. The more engaged users show this intent more strongly and are more involved in the election event.
7. Based on how users prefer to generate their tweets (their tweet mode), we classified the users as original tweet-dominant, original tweet-prone, balanced, retweet-prone, and retweet-dominant.
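A minimal sketch of such a five-way classification follows. The slide does not state the cut-off values, so the thresholds on the retweet ratio (0.2, 0.4, 0.6, 0.8) and the function name `tweet_mode` are assumptions for illustration only.

```python
def tweet_mode(n_original: int, n_retweet: int) -> str:
    """Classify a user by the fraction of their tweets that are retweets.

    The thresholds below are assumed for illustration; the slide does not
    give the actual cut-offs.
    """
    total = n_original + n_retweet
    if total == 0:
        raise ValueError("user has no tweets")
    retweet_ratio = n_retweet / total
    if retweet_ratio < 0.2:
        return "original tweet-dominant"
    if retweet_ratio < 0.4:
        return "original tweet-prone"
    if retweet_ratio <= 0.6:
        return "balanced"
    if retweet_ratio <= 0.8:
        return "retweet-prone"
    return "retweet-dominant"

mode = tweet_mode(n_original=9, n_retweet=1)  # 10% retweets
```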
3
Engagement
Degree
The original tweet-dominant group accounts for the biggest
proportion of users in every user engagement group.
A significant number of users (34.71% of all the users) belong to the retweet-dominant group, whose voting intent might be more difficult to detect.
8. We use target-specific sentiment analysis techniques to classify each tweet as
positive or negative – whether the expressed opinion about a specific candidate is
positive or negative. The users are categorized based on whether they post more
information or more opinion.
4
Engagement
Degree
More engaged users tend to post a mixture of content, with similar proportions of opinion and information, or a larger proportion of information.
9. We collected a set of Twitter users with known political preference from Twellow (http://www.twellow.com/categories/politics). Based on the assumption that a user tends to follow others who share the same political preference, we identified the left-leaning and right-leaning users using their following/follower relations. We tested this method on a dataset of 3,341 users, and it showed an accuracy of 0.9243.
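The follow-based idea can be sketched as a simple majority rule over labeled seed accounts. This is an illustrative simplification: it uses only the accounts a user follows, and the function name `political_preference`, the seed sets, and the tie handling are all assumptions rather than the authors' exact method.

```python
def political_preference(followees, left_seeds, right_seeds):
    """Label a user by which side's seed accounts they follow more.

    Assumed rule for illustration: simple majority, "unknown" on ties
    or when no seed account is followed.
    """
    left = len(followees & left_seeds)
    right = len(followees & right_seeds)
    if left > right:
        return "left-leaning"
    if right > left:
        return "right-leaning"
    return "unknown"

# Hypothetical seed accounts with known preference.
left_seeds = {"left_account_1", "left_account_2"}
right_seeds = {"right_account_1", "right_account_2", "right_account_3"}

label = political_preference(
    {"right_account_1", "right_account_3", "left_account_2", "news_site"},
    left_seeds,
    right_seeds,
)
```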
5
Right-leaning users were (as expected) more involved in the Republican primaries in several ways: more users, more tweets, more original tweets, and higher usage of hashtags and URLs.
10. We utilized the background knowledge from LinkedGeoData to identify the
states from user location information.
If the user's state could not be inferred from his/her location in the profile, we
utilized the geographic locations of his/her tweets. A user was recognized as being from a state if his/her tweets were from that state.
6
Pearson's r between each state's population and its number of users is 0.9459, and between its population and its number of tweets is 0.9667 (p < .0001).
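For reference, Pearson's r can be computed directly from paired values. The sketch below is a plain-Python implementation; the numbers in the example are made up for illustration and are not the study's per-state data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up example: state populations (millions) and user counts (thousands).
population = [1.0, 2.0, 3.0, 4.0]
users = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(population, users)  # perfectly linear toy data
```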
11. Predicting a User's Vote
• Basic idea: for which candidate the user shows the most support
– Frequent mentions: more mentions, higher score
– Positive sentiment: more positive/less negative opinions, higher score
The score for a candidate c covers two cases: (1) the user posted opinion about c; (2) the user mentioned c but did not post opinion about c.
Nm(c): the number of tweets mentioning the candidate c
Npos(c): the number of positive tweets about candidate c
Nneg(c): the number of negative tweets about candidate c
Smoothing parameter (between 0 and 1)
Discounting parameter (between 0 and 1): discounts the score when the user does not express any opinion towards c.
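The scoring formula itself does not survive in the text above, so the sketch below uses one plausible form built only from the quantities defined on the slide. The exact functional form, the function names `support_score` and `predict_vote`, and the default values of 0.5 for `smoothing` and `discount` are assumptions for illustration, not the authors' equation.

```python
def support_score(n_mentions, n_pos, n_neg, smoothing=0.5, discount=0.5):
    """Assumed support score of one user for one candidate c.

    n_mentions ~ Nm(c), n_pos ~ Npos(c), n_neg ~ Nneg(c).
    More mentions and a higher smoothed share of positive opinions
    both raise the score; with no opinion, mentions are discounted.
    """
    if n_pos + n_neg > 0:
        # Case 1: the user posted opinion about c.
        positive_share = (n_pos + smoothing) / (n_pos + n_neg + 2 * smoothing)
        return n_mentions * positive_share
    # Case 2: the user mentioned c but did not post opinion about c.
    return n_mentions * discount

def predict_vote(stats, smoothing=0.5, discount=0.5):
    """Pick the candidate with the highest score.

    `stats` maps candidate -> (n_mentions, n_pos, n_neg).
    """
    scores = {
        c: support_score(m, p, n, smoothing, discount)
        for c, (m, p, n) in stats.items()
    }
    return max(scores, key=scores.get)

user_stats = {"romney": (10, 6, 2), "santorum": (8, 0, 0)}
vote = predict_vote(user_stats)
```

In this toy case the user mentions both candidates often, but only the candidate with mostly positive opinions gets the undiscounted score, so that candidate is predicted as the user's vote.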
12. Prediction Results
We examine the predictive power of different user groups in predicting the
results of Super Tuesday races in 10 states.
To predict the election results in a state, we used only the collection of
users who are identified from that state.
We examined four time windows -- 7 days, 14 days, 28 days and 56 days
prior to the primary day. In a specific time window, a user's vote was
assessed using only the set of tweets he/she created during this time.
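The time-window restriction can be sketched as a date filter. The representation of tweets as `(date, text)` pairs and the function name `tweets_in_window` are assumptions for illustration; the primary day excluded from the window follows the "prior to the primary day" wording.

```python
from datetime import date, timedelta

def tweets_in_window(tweets, primary_day, window_days):
    """Keep tweets posted in the `window_days` days before the primary day.

    `tweets` is a list of (date, text) pairs (assumed representation).
    """
    start = primary_day - timedelta(days=window_days)
    return [t for t in tweets if start <= t[0] < primary_day]

super_tuesday = date(2012, 3, 6)
tweets = [
    (date(2012, 3, 5), "tweet A"),
    (date(2012, 2, 28), "tweet B"),
    (date(2012, 2, 20), "tweet C"),
    (date(2012, 1, 15), "tweet D"),
]
last_week = tweets_in_window(tweets, super_tuesday, 7)
full_period = tweets_in_window(tweets, super_tuesday, 56)
```

Note that a 56-day window before 03/06/2012 starts on 01/10/2012, the first day of the collected data.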
The results were evaluated in two ways: (1) the accuracy of predicting
winners, and (2) the error rate between the predicted percentage of votes
and the actual percentage of votes for each candidate.
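The two evaluation measures can be sketched as follows. Turning per-user votes into vote shares by simple counting, and reading the "error rate" as the mean absolute error over candidates, are assumptions for illustration; the function names and the numbers are hypothetical.

```python
from collections import Counter

def predicted_shares(user_votes):
    """Fraction of users predicted to vote for each candidate."""
    counts = Counter(user_votes)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def winner(shares):
    """Candidate with the largest share."""
    return max(shares, key=shares.get)

def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual vote shares."""
    gaps = [abs(predicted.get(c, 0.0) - a) for c, a in actual.items()]
    return sum(gaps) / len(gaps)

# Hypothetical state: 10 users with predicted votes, plus made-up results.
votes = ["romney"] * 5 + ["santorum"] * 3 + ["paul"] * 2
predicted = predicted_shares(votes)
actual = {"romney": 0.45, "santorum": 0.35, "paul": 0.10, "gingrich": 0.10}

correct_winner = winner(predicted) == winner(actual)
error = mean_absolute_error(predicted, actual)
```

Winner accuracy over a set of states is then the fraction of states for which `correct_winner` is true.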
13. 7
The prediction accuracy:
Engagement Degree: High > Low or Very Low
Tweet Mode: Original Tweet-Prone > Retweet-Prone
Content Type: In a draw
Political Preference: Right-Leaning >> Left-Leaning
14. 8
Revealing the challenge of identifying the vote intent of the “silent majority”.
Retweets may not necessarily reflect users' attitudes.
Predicting a user’s vote from more opinion tweets is not necessarily more accurate than predicting it from more information tweets.
The right-leaning user group provides the most accurate prediction result. In the best case (56-day time window), it correctly predicts the winners in 8 out of 10 states with an average prediction error of 0.1.
To some extent, this demonstrates the importance of identifying likely voters in electoral prediction.
15. Our findings
Twitter users are not “equal”
in predicting elections!
The likely voters’ opinions matter more.
Some users’ opinions are more difficult to identify because of their lower levels of engagement or the implicit ways in which they express opinions.
16. More work needs to be done…
• Identifying likely/actual voters
• Improving sentiment analysis
techniques
• Investigating possible data biases
(e.g., spam tweets and political
campaign tweets) and how they
might affect the results
and more …
17. It is actually about tracking public opinion.
Polling or Social Media Analysis?
1. Sample size
2. Representative of the target population
3. Accurate measure of opinions
4. Timeliness
18. 1 Sample Size
Polling: Thousands of people
Social Media Analysis: Millions of people
19. 2 Representative of the Target Population
Polling:
• About 95% of US homes can be reached by landline telephone and cell phone.
• Sampling the target population randomly.
• Weighting the sample to census estimates for demographic characteristics (gender, race, age, educational attainment, and region).
Social Media Analysis:
• About 60% of American adults use social networking sites.
• Difficult to do random sampling.
• Limited demographic data (although with some work, can be improved).
[1] Can Social Media Be Used for Political Polling? http://www.radian6.com/blog/2012/07/can-social-media-be-used-for-political-polling/
20. 3 Accurate measure of opinions
Polling: Ask people what they think (“Who will you vote for?”).
Social Media Analysis: Look at what people talk about and extract their opinions. Not as accurate as polling.
21. 4 Timeliness
Polling: Not able to track people’s opinion in real time
Social Media Analysis: What is happening now
22. Social Media Analysis – Promising but Very Challenging
Promising:
• Increasing number of social media users
• Convenient and comfortable way to express opinions
• The analysis can be done in real time
• Lower cost
• A great complement (if not substitute) for polling
Challenging:
• Extracting demographic information
• Identifying the target population whose opinions matter, e.g., the likely voters in electoral prediction
• Distinguishing personal opinion from the voice of mainstream media and political campaigns
• More accurate sentiment analysis/opinion mining, especially the identification of opinions about a specific object
23. Our Twitris+ system tracked people’s opinions on the 2012 U.S. Presidential Election in real time, and this is what we saw on Election Day …
Subjective Information Extraction, Lu Chen 23
24. The screenshots of Twitris+ were taken on Nov. 6th at 6 PM EST
25. Twitris+: http://twitris.knoesis.org/
Select event
Multi-faceted
Analysis
Select date
N-gram summaries
Related tweets Reference news Wikipedia articles
26. A key innovation in sentiment analysis, employed in Twitris+, is topic-specific sentiment analysis, which associates sentiment with an entity. The same sentiment phrase may be assigned different polarities when associated with different entities.
Twitris+ tracks sentiment trends about different entities, and identifies topics/events that contribute to sentiment changes. The result is updated every hour.
• Sentiment change about Barack Obama
• Sentiment change about Mitt Romney
• Positive/negative topics that contribute to such change
• Individual tweets related to chosen topic
• Analysis can be performed at location (e.g., by state) or issue-based level (e.g., economy, tax, social issues – women, …)
27. Twitris+ Insights in 2012 Presidential Debates
How was Obama doing in the first debate?
28. How was Obama doing in the second debate?
Red Color: Negative Topics
Green Color: Positive Topics
29. Obama vs. Romney in the third debate
Obama
Romney
You can find a lot more, e.g., analysis from network, demographic, emotion, temporal, … perspectives, at http://twitris.knoesis.org
30. Thank you!
More about this study:
http://wiki.knoesis.org/index.php/ElectionPrediction
Kno.e.sis Center:
http://knoesis.wright.edu/
Twitris+:
http://twitris.knoesis.org/
Semantics driven Analysis of Social Media:
http://knoesis.org/research/semweb/projects/socialmedia
Editor's Notes
Paper at: http://knoesis.org/library/resource.php?id=1787
Tweet volume alone may not be a reliable predictor, since a small group of users can produce a large share of the tweets, e.g., political campaign and promotion tweets.
Some of the Twellow preferences are self-declared.
There is a very strong correlation between the number of Twitter users/tweets from each state and the population of each state. A Pearson's correlation coefficient between 0.9 and 1.0 is usually taken to indicate a very strong correlation.
Categorized by engagement degree: the high engagement users achieved better prediction results. This may be due to two reasons: (1) high engagement users posted more tweets, and predictions based on more tweets are more reliable; (2) more engaged users were more involved in the election event and were more likely to vote.
Categorized by tweet mode: the original tweet-prone users achieved better prediction results. This might suggest the difficulty of identifying users' voting intent from retweets.
Categorized by content type: no significant difference was found between the two groups.
Categorized by political preference: the right-leaning user group achieved significantly better results than the left-leaning group.