The Party is Over Here: Structure and Content in the 2010 Election

8,900 views
8,825 views

Published on

In this work, we study the use of Twitter by House, Senate and gubernatorial candidates during the midterm (2010) elections in the U.S. Our data includes almost 700 candidates and over 690k documents that they produced and cited in the 3.5 years leading to the elections. We utilize graph and text mining techniques to analyze differences between Democrats, Republicans and Tea Party candidates, and suggest a novel use of language modeling for estimating content cohesiveness. Our findings show significant differences in the usage patterns of social media, and suggest conservative candidates used this medium more effectively, conveying a coherent message and maintaining a dense graph of connections. Despite the lack of party leadership, we find Tea Party members display both structural and language-based cohesiveness. Finally, we investigate the relation between network structure, content and election results by creating a proof-of-concept model that predicts candidate victory with an accuracy of 88.0%.

Published in: News & Politics, Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
8,900
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
17
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

The Party is Over Here: Structure and Content in the 2010 Election

  1. 1. The Party is Over Here: Structure and Content in the 2010 Election<br />Avishay Livne, Matt Simmons, Eytan Adar, Lada Adamic<br />University of Michigan<br />ICWSM 2011<br />1<br />
  2. 2. Twitter and Politics<br />14% of Internet users engaged in political activity via social media in 2008 => 22% in 2010 [Smith 2008, 2010].<br />Political Polarization on Twitter – [Conover et al. 2011]<br />Characterizing Debate Performance via Aggregated Twitter Sentiment – [Diakopoulos& Shamma2010]<br />2<br />Seal of Approval<br />
  3. 3. Twitter in Politics<br />Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment – [Tumasjan et al. 2010]<br />Hypothesis: Portion of tweets mentioning a party => portion of votes this party will get.<br />Achieved MAE of 1.65%, traditional polls achieved 0.8%-1.48%.<br />German federal elections 2009.<br />In the same time a German octopus was used to predict the World Cup ‘10 results...<br />… correctly predicting 8/8 matches<br />Paul the Octopus<br />3<br />Our goal: Can we extract more information from Twitter?<br />
  4. 4. Agenda<br />Background<br />Data<br />Network analysis<br />Content analysis<br />Predicting election results<br />4<br />
  5. 5. Getting The Data<br />5<br />Unlike other studies we looked at candidates’ accounts.<br />Search: “Harry Reid site:twitter.com”<br />
  6. 6. 687 manually filtered users, 50% of the candidates.<br />Data<br />6<br />*Tea Party – political movement, supported Republicans.<br />Gained wide attention in 2010. <br />
  7. 7. Data<br />460K tweets over 2 years + 233K URLs.<br />4429 directed edges: AB == user A follows user B<br />7<br />Health Care Reform Bill Passed<br />
  8. 8. Usage Analysis<br />8<br />
  9. 9. Agenda<br />Background<br />Data<br />Network analysis<br />Content analysis<br />Predicting election results<br />9<br />
  10. 10. Graph Analysis<br />Edge(A,B) = A follows B<br />Democrats<br />Republicans<br />TeaParty<br />10<br />
  11. 11. Graph Analysis<br />Democratic<br />Republican<br />TeaParty<br />𝐷𝑒𝑛𝑠𝑖𝑡𝑦=𝐸𝑑𝑔𝑒𝑠𝑃𝑜𝑠𝑠𝑖𝑏𝑙𝑒 𝐸𝑑𝑔𝑒𝑠=𝐸𝑁𝑁−1<br /> <br />Supports previous studies [Adamic & Glance 2005]<br />11<br />
  12. 12. Agenda<br />Background & data<br />Network analysis<br />Content analysis<br />Predicting election results<br />12<br />
  13. 13. Language Modeling<br />How frequent each candidate/party used each term?<br />13<br />Extract distinguishing terms with KL-divergence<br />
  14. 14. Divergent terms<br />14<br />
  15. 15. Party Cohesiveness<br />How similar are pairs within each party<br />15<br />Similar <br />Not similar <br />
  16. 16. Language Similarity vs. Graph Distance<br />16<br />With retweets<br />Without retweets<br />Not similar <br />Similar <br />The closer two candidates are the more similar their content is.<br />This trend diminishes after 3 steps.<br />
  17. 17. Latent Topics<br />17<br />Distribution of documents over topics<br />Topics<br />1<br />LDA<br />Documents<br />Documents<br />2<br />Average over<br />party’s documents<br />Distribution of parties over topics<br />Topics<br />Which topics are more affiliated with each party<br />Compare distributions<br />Parties<br />
  18. 18. Topics<br />18<br />
  19. 19. Agenda<br />Background & data<br />Network analysis<br />Content analysis<br />Predicting election results<br />19<br />
  20. 20. Predicting Election Results - Individual Features<br />20<br />External<br />Network<br />Usage<br />Content<br />Logistic regression, 10-fold cross validation.<br />
  21. 21. Predicting Election Results<br />Models Comparison<br />21<br />Next, top model…<br />
  22. 22. 22<br />Predicting Election Results<br />Top model’s coefficients<br />
  23. 23. Summary<br />Twitter is prevalent in campaigns<br />Tea party more aggressive and Twitter-savvy<br />Republicans more aligned (network)<br />Democrats less cohesive (content)<br />Content similarity vs. network distance<br />Network and language improve prediction<br />23<br />
  24. 24. Future Work<br />More extensive topic & sentiment analysis<br />Temporal analysis<br />Integrating funding<br />Integrating constituents (the crowd)<br />24<br />
  25. 25. Thanks<br />Thank you!<br />Thanks to Abe Gong for helpful insights.<br />More information in the paper.<br />Contact info: avishay@umich.edu<br />25<br />

×