Predicting Elections with Twitter

This is a presentation given at the ICWSM 2010 in Washington, DC (www.icwsm.org). You can watch a video of the presentation on videolectures.net

Twitter is a microblogging website where users read and write millions of short messages on a variety of topics every day. This study uses the context of the German federal election to investigate whether Twitter is used as a forum for political deliberation and whether online messages on Twitter validly mirror offline political sentiment. Using LIWC text analysis software, we conducted a content-analysis of over 100,000 messages containing a reference to either a political party or a politician. Our results show that Twitter is indeed used extensively for political deliberation. We find that the mere number of messages mentioning a party reflects the election result. Moreover, joint mentions of two parties are in line with real world political ties and coalitions. An analysis of the tweets’ political sentiment demonstrates close correspondence to the parties' and politicians’ political positions indicating that the content of Twitter messages plausibly reflects the offline political landscape. We discuss the use of microblogging message content as a valid indicator of political sentiment and derive suggestions for further research.

Transcript

  • 1. Predicting Elections with Twitter – What 140 Characters Reveal about Political Sentiment
    International Conference on Weblogs and Social Media, Washington, May 25, 2010
    Andranik Tumasjan, Timm O. Sprenger, Philipp G. Sandner, Isabell M. Welpe
    Contact author: sprenger@tum.de
    © Timm O. Sprenger
  • 2. Agenda
      • Introduction and related research
      • Data set and methodology
      • Results and implications
  • 3. The successful use of social media in the last presidential campaign has established Twitter as an integral part of the political campaign toolbox
      • The increasing use of Twitter as a means of … has triggered attempts to better understand political communication … and aggregate this information
      • The Library of Congress recently acquired the entire Twitter archive
  • 4. The goal of our study was to explore 3 research questions
      • 1 Deliberation: Does Twitter provide a platform for political deliberation online?
      • 2 Sentiment: How accurately can Twitter inform us about the electorate's political sentiment?
      • 3 Prediction: Can Twitter serve as a predictor of the election result?
  • 5. Existing research related to our research questions and resulting research gaps we try to address (SELECTED EXAMPLES)
      • 1 Deliberation: Does Twitter provide a platform for political deliberation online?
          Related research:
            • Twitter is not only used for one-way communication, but 31% of all tweets direct a specific addressee (Honeycutt & Herring, 2009)
            • Political internet discussion boards found to be dominated by a small number of heavy users (Koop & Jansen, 2009)
          Research gap:
            • Many contexts largely unexplored, e.g. the political debate online
            • Unclear whether findings apply to microblogging forums
      • 2 Sentiment: How accurately can Twitter inform us about the electorate's political sentiment?
          Related research:
            • 19% of all tweets contain mentions of a brand or product and statistically significant differences of customer sentiment can be extracted (Jansen et al., 2009)
            • Pessimism towards the ability of blogs to aggregate dispersed bits of information (Sunstein, 2008)
          Research gap:
            • Limited application to political sentiment
            • Few empirical studies to explore information aggregation in social media
      • 3 Prediction: Can Twitter serve as a predictor of the election result?
          Related research:
            • Some studies explore the reflection of the political landscape in "traditional" weblogs and social media (e.g., number of Facebook users a valid indicator of electoral success, Williams & Gulati, 2008)
            • Count of candidate mentions in the press can be a better predictor of election results than official election polls (Véronis, 2007)
          Research gap:
            • Unclear whether findings apply to microblogging forums
      • Previous studies have shown that social media can reflect the political landscape offline, but focused on "traditional" weblogs/message boards
  • 6. Agenda
      • Introduction and related research
      • Data set and methodology
      • Results and implications
  • 7. We examined more than 100,000 tweets and extracted their sentiment using LIWC
      Data set
      • 104,003 political tweets
      • Published between August 13th and September 19th, 2009 (one week prior to the election)
      • Collected all tweets containing the name of either
          • At least one of the 6 major parties
          • Selected prominent politicians
      Methodology
      • Linguistic Inquiry and Word Count (by James Pennebaker et al.)
      • Text analysis software developed to assess emotional, cognitive, and structural components of text samples using a psychometrically validated dictionary
      • Calculates the share of words in a text belonging to empirically defined psychological and structural dimensions
      • LIWC has been used widely in psychology and linguistics including to
          • Measure the sentiment levels in US Senatorial (Yu et al., 2008)
          • Profile politicians Twitter messages
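The word-share calculation described on this slide can be sketched in a few lines of Python. This is only an illustration of the general idea, not the LIWC software itself: the dictionary `TOY_DICTIONARY`, its categories, and the helper names `category_shares` and `profile_for_party` are all hypothetical, and the real LIWC dictionary is far larger. Pooling all matching tweets into one document follows the limitation noted later in the deck ("We treated all messages published in a given time frame as one document").

```python
import re
from collections import Counter

# Hypothetical mini-dictionary (category -> words). Illustrative only;
# it is NOT the LIWC dictionary, which is much larger and validated.
TOY_DICTIONARY = {
    "positive_emotion": {"good", "great", "hope", "win"},
    "negative_emotion": {"bad", "crisis", "attack", "lose"},
    "money": {"tax", "budget", "debt"},
}


def category_shares(text, dictionary=TOY_DICTIONARY):
    """Return, per category, the percentage of words in `text` found in it."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return {category: 0.0 for category in dictionary}
    counts = Counter()
    for word in words:
        for category, vocabulary in dictionary.items():
            if word in vocabulary:
                counts[category] += 1
    return {
        category: 100.0 * counts[category] / len(words)
        for category in dictionary
    }


def profile_for_party(tweets, party):
    """Pool every tweet mentioning `party` into one document and profile it."""
    pooled = " ".join(t for t in tweets if party.lower() in t.lower())
    return category_shares(pooled)
```

For example, `profile_for_party(["CDU tax plan is good", "SPD will lose"], "CDU")` profiles only the first tweet, since the second does not mention the party.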
  • 8. Background on the situation prior to the German federal election 2009 (PRE-ELECTION)
      Parties in the German Bundestag 2005-09 (slide label: Governing grand coalition)
      • Socialists: Grouping of former East German state party and leftist SPD-defectors; recent rise fragmented political spectrum
      • Social Democrats: SPD candidate for Chancellor Steinmeier rejected Die Linke as possible coalition partner
      • Green Party: Historical coalition partner of SPD, but with limited chances for government participation
      • Liberals: Publicly committed to coalition with CDU; spent 8 years in the opposition
      • Christian Democrats: Union*-Chancellor Merkel running for reelection; favoring a coalition with the FDP
      Potential coalition of CDU and FDP was leading by a slight majority in most polls
      * CDU and CSU, often referred to as the "Union", are sister parties which form one faction in the German parliament
  • 9. Agenda
      • Introduction and related research
      • Data set and methodology
      • Results and implications
  • 10. (Research question 1) While Twitter is used as a forum for political deliberation on substantive issues, this forum is dominated by heavy users
      Two widely accepted indicators of blog-based deliberation…

      The exchange of substantive issues
        Party       Sample tweet*
        CDU         CDU wants strict rules for internet
        CSU         CSU continues attacks on partner of choice FDP
        FDP         Whoever wants civil rights must choose FDP!
        Grüne       After the crisis only Green can help GREEN+
        SPD         Only a matter of time until the SPD dissolves
        Die Linke   Society for Human Rights recommends: No government participation for LINKE
      • 31% of all messages contain "@"-sign
      • 19% of all messages are retweets
      * Examples shortened for citation (e.g. omission of hyperlinks)

      Equality of participation
        User group          Users (total)   Users (share)   Messages (total)   Messages (share)
        One-time users              7,064           50.3%              7,064              10.2%
        Light (2-5)                 4,625           32.9%             13,353              19.3%
        Medium (6-20)               1,820           12.9%             18,191              26.2%
        Heavy (21-79)                 463            3.3%             15,990              23.1%
        Very heavy (80+)               84            0.6%             14,470              21.2%
        Total                      14,056            100%             69,318               100%
      • While the distribution of users across user groups is almost identical with the one found on internet message boards, we find even less equality of participation for the political debate on Twitter
      • Additional analyses have shown users to exhibit a party-bias in the volume and sentiment of their messages
      Source: Jansen&Koop (2009)
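As a rough check on the equality-of-participation table, the combined share of the two heaviest user groups can be recomputed from the listed counts. The sketch below is a hypothetical helper, not part of the study. It divides by the totals stated in the table's "Total" row; note that the per-group message counts as transcribed sum to 69,068 rather than the stated 69,318, so results are approximate.

```python
# (users, messages) per user group, copied from the slide's table.
USER_GROUPS = {
    "One-time users": (7064, 7064),
    "Light (2-5)": (4625, 13353),
    "Medium (6-20)": (1820, 18191),
    "Heavy (21-79)": (463, 15990),
    "Very heavy (80+)": (84, 14470),
}
TOTAL_USERS = 14056     # "Total" row of the slide
TOTAL_MESSAGES = 69318  # "Total" row of the slide


def combined_share(groups):
    """Return (user share %, message share %) for the named groups combined."""
    users = sum(USER_GROUPS[g][0] for g in groups)
    messages = sum(USER_GROUPS[g][1] for g in groups)
    return 100.0 * users / TOTAL_USERS, 100.0 * messages / TOTAL_MESSAGES


user_share, message_share = combined_share(["Heavy (21-79)", "Very heavy (80+)"])
print(f"{user_share:.1f}% of users wrote {message_share:.1f}% of messages")
```

The result is in line with the deck's conclusion that roughly 4% of users account for more than 40% of the messages.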
  • 11. (Research question 2) The online sentiment in tweets reflects nuanced offline differences between the politicians in our sample
      LIWC profiles*
      Leading candidates
      • Very similar profile for all leading candidates
      • Only polarizing political characters, such as liberal leader Westerwelle and socialist Lafontaine, deviate in line with their roles as opposition leaders
      • Messages mentioning Steinmeier, who was sending mixed signals regarding potential coalition partners, are the most tentative
      Other politicians
      • Positive outweigh negative emotions, except in the case of CSU leader Seehofer who in addition is associated the most with anger (he irritated many voters with his attacks on desired coalition partner FDP)
      • For Steinbrück and zu Guttenberg, the issues money and work reflect their roles as finance and economics minister
      * We focused on the 12 dimensions which a priori seemed best suited to profile sentiment and political issues
      Source: Jansen&Koop (2009)
  • 12. (Research question 2) The similarity of profiles is a plausible reflection of the political proximity between the parties
      Similarity of LIWC profiles: distance measure to quantify the similarity of sentiment profiles

        Group                              Distance*
        Politicians
          All politicians                  0.21
          Governing coalition              0.23
          Right coalition                  0.16
          Left coalition                   0.10
          Candidates for chancellor        0.02
          Leading candidates               0.10
          Other candidates                 0.24
        Parties
          All parties                      0.09
          Governing coalition              0.07
          Right coalition                  0.08
          Left coalition                   0.10
          Union                            0.01

      Key findings (politicians)
      • High convergence of the leading candidates
      • More divergence among politicians of the governing grand coalition than among those of a potential right wing coalition
      • The similar profiles of Merkel and Steinmeier mirror the consensus-driven style of their grand coalition
      Key findings (parties)
      • The fit of a potential right-wing coalition is almost as good as the fit in the governing coalition
      • Greatest divergence among parties on the left
      • Tight fit between sister parties CDU and CSU
      * Average distance from the mean profile per category across all 12 dimensions in percentage points
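The footnote describes the distance as the average distance from the mean profile across all dimensions. One plausible reading can be sketched as follows, assuming (this is an assumption, the slide does not say) that "distance" means the absolute difference in percentage points. The function names and the two-dimensional example values are hypothetical; the study used 12 LIWC dimensions.

```python
from statistics import mean


def mean_profile(profiles):
    """Dimension-wise mean of a group of profiles (lists of equal length)."""
    dims = len(profiles[0])
    return [mean(p[d] for p in profiles) for d in range(dims)]


def average_distance(profiles):
    """Average absolute deviation from the group's mean profile,
    taken over every member and every dimension (assumed reading)."""
    centre = mean_profile(profiles)
    deviations = [
        abs(p[d] - centre[d])
        for p in profiles
        for d in range(len(centre))
    ]
    return mean(deviations)


# Hypothetical two-dimensional profiles for two group members.
group = [[1.0, 2.0], [3.0, 2.0]]
print(average_distance(group))
```

Under this reading, a group whose members have identical profiles gets a distance of 0, and the value grows as the members' word shares diverge.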
  • 13. (Research question 3) The activity on Twitter prior to the election seems to validly reflect the election outcome
      The share of tweets can be considered a plausible reflection of the election results…

        Party        Mentions (total)   Mentions (share)   Vote share   Error
        CDU                    30,886              30.1%        29.0%    1.0%
        CSU                     5,748               5.6%         6.9%    1.3%
        SPD                    27,356              26.6%        24.5%    2.2%
        FDP                    17,737              17.3%        15.5%    1.7%
        Die Linke              12,689              12.4%        12.7%    0.3%
        Grüne                   8,250               8.0%        11.4%    3.3%
        MAE = 1.65%

        Research institute           MAE (last poll)
        Forsa                        0.84%
        Forschungsgruppe Wahlen      1.04%
        GMS                          1.48%
        Infratest/dimap              1.40%

      …and joint party mentions accurately reflect the political ties between parties

        Relative frequency of joint mentions**
                     CDU     CSU     SPD     FDP     Linke
        CSU          1.25*
        SPD          1.23*   0.71*
        FDP          1.04*   1.01    0.90*
        Die Linke    0.81*   0.79*   1.04*   0.97
        Grüne        0.84*   0.79*   0.98    1.06*   1.18*

      • An analysis of messages surrounding the TV debate between the main candidates has shown that tweets can also reflect the sentiment over time
      * Significant at the .05-level
      ** Measures how often two parties are mentioned together relative to the random probability
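Both quantities on this slide are simple to recompute. The sketch below derives tweet shares and the mean absolute error (MAE) from the listed mention counts and vote shares; the result should land close to the slide's 1.65%, with small differences expected because the slide's figures are rounded. The second function illustrates one possible reading of the joint-mention measure: the observed co-mention rate divided by the rate expected if the two parties were mentioned independently. That independence baseline is an assumption about what "random probability" means, and all function names and the toy data are hypothetical.

```python
# Mention counts and vote shares as listed on the slide.
TWEET_COUNTS = {
    "CDU": 30886, "CSU": 5748, "SPD": 27356,
    "FDP": 17737, "Die Linke": 12689, "Grüne": 8250,
}
VOTE_SHARES = {
    "CDU": 29.0, "CSU": 6.9, "SPD": 24.5,
    "FDP": 15.5, "Die Linke": 12.7, "Grüne": 11.4,
}


def tweet_shares(counts):
    """Convert raw mention counts into percentage shares."""
    total = sum(counts.values())
    return {party: 100.0 * n / total for party, n in counts.items()}


def mean_absolute_error(predicted, actual):
    """Mean absolute difference, in percentage points, over all parties."""
    return sum(abs(predicted[p] - actual[p]) for p in actual) / len(actual)


def joint_mention_ratio(tweets, party_a, party_b):
    """Observed co-mention rate relative to the rate expected under
    independence (assumed baseline). `tweets` is a list of sets of
    party names; both parties must occur at least once."""
    n = len(tweets)
    p_a = sum(party_a in t for t in tweets) / n
    p_b = sum(party_b in t for t in tweets) / n
    p_ab = sum(party_a in t and party_b in t for t in tweets) / n
    return p_ab / (p_a * p_b)


mae = mean_absolute_error(tweet_shares(TWEET_COUNTS), VOTE_SHARES)
print(f"MAE: {mae:.2f} percentage points")
```

A ratio above 1 means two parties appear together more often than chance would suggest, which matches how the slide's matrix is read (e.g. CDU and CSU at 1.25).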
  • 14. Our findings suggest the use of social media information content to complement insights regarding the public's political sentiment
      • 1 Deliberation: Does Twitter provide a platform for political deliberation online?
          • While we find evidence of a lively political debate on Twitter, this discussion is dominated by a small number of users: only 4% of all users account for more than 40% of the messages
      • 2 Sentiment: How accurately can Twitter inform us about the electorate's political sentiment?
          • Sentiment profiles plausibly reflect many nuances of the election campaign
          • Politicians evoke a more diverse set of profiles than parties
          • Similarity of profiles is indicative of the parties' proximity with respect to political issues
      • 3 Prediction: Can Twitter serve as a predictor of the election result?
          • In contrast with previous studies of political message boards, we find that the mere number of messages reflects the election results and even comes close to traditional election polls
          • Joint party mentions mirror closeness on political issues and likely coalitions
  • 15. Enter polls… Exit polls…
  • 16. Thank you for your attention! Stay tuned @TUitter_2010
  • 17. References (1/2)
      • Adamic, L. A., and Glance, N. 2005. The political blogosphere and the 2004 US election: Divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery, 36-43. Chicago, IL.
      • Albrecht, S.; Lübcke, M.; and Hartig-Perschke, R. 2007. Weblog Campaigning in the German Bundestag Election 2005. Social Science Computer Review, 25(4): 504-520.
      • Böhringer, M., and Richter, A. 2009. Adopting Social Software to the Intranet: A Case Study on Enterprise Microblogging. In Proceedings of the 9th Mensch & Computer Conference, 293-302. Berlin.
      • Drezner, D. W., and Farrell, H. 2007. Introduction: Blogs, politics and power: a special issue of Public Choice. Public Choice, 134(1-2): 1-13.
      • Farrell, H., and Drezner, D. W. 2008. The power and politics of blogs. Public Choice, 134(1-2): 15-30.
      • Gueorguieva, V. 2007. Voters, MySpace, and YouTube: The Impact of Alternative Communication Channels on the 2006 Election Cycle and Beyond. Social Science Computer Review, 26(3): 288-300.
      • Honeycutt, C., and Herring, S. C. 2009. Beyond microblogging: Conversation and collaboration via Twitter. In 42nd Hawaii International Conference on System Sciences, 1-10. Hawaii.
      • Huber, J., and Hauser, F. 2005. Systematic mispricing in experimental markets–evidence from political stock markets. In 10th Annual Workshop on Economic Heterogeneous Interacting Agents. Essex, UK.
      • Huberman, B. A.; Romero, D. M.; and Wu, F. 2008. Social Networks that Matter: Twitter Under the Microscope. Retrieved on December 15, 2009 from http://ssrn.com/abstract=1313405
      • Jansen, B. J.; Zhang, M.; Sobel, K.; and Chowdury, A. 2009. Twitter power: Tweets as electronic word of mouth. Journal of the American Society for Information Science and Technology, 60: 1-20.
      • Jansen, H. J., and Koop, R. 2005. Pundits, Ideologues, and Ranters: The British Columbia Election Online. Canadian Journal of Communication, 30(4): 613-632.
      • Java, A.; Song, X.; Finin, T.; and Tseng, B. 2007. Why we twitter: understanding microblogging usage and communities. In Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, 56-65. San Jose, CA: ACM.
      • Koop, R., and Jansen, H. J. 2009. Political Blogs and Blogrolls in Canada: Forums for Democratic Deliberation? Social Science Computer Review, 27(2): 155-173.
      • McKenna, L., and Pole, A. 2007. What do bloggers do: an average day on an average political blog. Public Choice, 134(1-2): 97-108.
      • Meckel, M., and Stanoevska-Slabeva, K. 2009. Auch Zwitschern muss man üben - Wie Politiker im deutschen Bundestagswahlkampf twitterten. Retrieved December 15 from: http://www.nzz.ch/nachrichten/kultur/medien/auch_zwitschern_muss_man_ueben_1.3994226.html
  • 18. References (2/2)
      • Nielsen Media Research. 2009. Das Phänomen Twitter in Deutschland. Retrieved December 15, 2009 from http://de.nielsen.com/news/NielsenPressemeldung04.08.2009-Twitter.shtml
      • Perlmutter, D. D. 2008. Political Blogging and Campaign 2008: A Roundtable. The International Journal of Press/Politics, 13(2): 160-170.
      • Pennebaker, J.; Chung, C.; and Ireland, M. 2007. The development and psychometric properties of LIWC2007. Austin, TX.
      • Schneider, S. M. 1996. Creating a Democratic Public Sphere Through Political Discussion. Social Science Computer Review, 14(3): 373-392.
      • Skemp, K. 2009. All A-Twitter about the Massachusetts Senate Primary. Retrieved December 15, 2009 from http://bostonist.com/2009/12/01/massachusetts_senate_primary_debate_twitter.php
      • Sunstein, C. 2007. Neither Hayek nor Habermas. Public Choice, 134(1-2): 87-95.
      • Tausczik, Y. R., and Pennebaker, J. W. 2009. The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. Journal of Language and Social Psychology.
      • Véronis, J. 2007. Citations dans la presse et résultats du premier tour de la présidentielle 2007. Retrieved December 15, 2009 from http://aixtal.blogspot.com/2007/04/2007-la-presse-fait-mieux-que-les.html
      • Williams, C., and Gulati, G. 2008. What is a Social Network Worth? Facebook and Vote Share in the 2008 Presidential Primaries. In Annual Meeting of the American Political Science Association, 1-17. Boston, MA.
      • Woodly, D. 2007. New competencies in democratic communication? Blogs, agenda setting and political participation. Public Choice, 134(1-2): 109-123.
      • Yu, B.; Kaufmann, S.; and Diermeier, D. 2008. Exploring the characteristics of opinion expressions for political opinion classification. In Proceedings of the 2008 international conference on Digital government research, 82-91. Montreal.
      • Zarrella, D. 2009a. State of the Twittersphere. Retrieved December 15, 2009 from: http://blog.hubspot.com/Portals/249/sotwitter09.pdf
      • Zarrella, D. 2009b. The Science of Retweets. Retrieved December 15, 2009 from: http://danzarrella.com/the-science-of-retweets-report.html
      • pearanalytics. 2009. Twitter Study. Retrieved December 15, 2009 from http://www.pearanalytics.com/wp-content/uploads/2009/08/Twitter-Study-August-2009.pdf
  • 19. Limitations of our research (EXAMPLES)
      • Sample selection
          Description:
            • Research on political bloggers (McKenna and Pole 2008) and similar demographics of Twitter users (pearanalytics 2009) suggest that our sample may not have been representative of the German electorate
          Mitigating arguments/research gaps:
            • The fact that these well-educated users "influence important actors within mainstream media who in turn frame issues for a wider public" (Farrell and Drezner 2008, p. 29) warrants special attention to Twitter as a source of opinion leadership.
      • Search algorithm
          Description:
            • Data were limited to the tweets containing the names of parties and politicians that we defined as search terms
            • We may have missed some replies belonging to a discussion thread because respondents do not necessarily repeat these names
          Mitigating arguments/research gaps:
            • Twitter users include hashtags in many messages (e.g., "#CDU"), so we believe the share of relevant replies to be small
            • Future research should try to capture the context by following embedded links or by searching for replies to an author.
      • Text analysis
          Description:
            • Our investigation was based on one particular text analysis software and used an existing dictionary not specifically tailored to classify such short messages as tweets
            • We treated all messages published in a given time frame as one document
          Mitigating arguments/research gaps:
            • We believe this effect to be negligible since LIWC is based on word count only and therefore should not be affected by grammatical errors
            • Further research should investigate sentiment one tweet at a time
      • Definition of sentiment
          Description:
            • We have examined overall political sentiment, but voters' attitudes and opinions may vary depending on specific political issues
          Mitigating arguments/research gaps:
            • Future sentiment analysis could address this issue by conducting a more detailed classification of content.
  • 20. Users exhibit a party-bias in the volume and sentiment of their messages (BACKUP)
      • Figure: Distribution of attention by user group
      • Figure: Share of tweets and sentiment of party-affiliated accounts