Large scale Twitter collection of 2012 US election

A short presentation on the data collection that the Social Media and Democracy Group at UW-Madison is doing around the 2012 US elections.


  1. Large scale Twitter collection of 2012 US election
     Alexander Hanna, Department of Sociology, University of Wisconsin-Madison
     ahanna@ssc.wisc.edu, @alexhanna
     September 14, 2012
     Outline: Research Agenda, Current Project, Case Study, Future Approaches
  2. A Twitter-specific Research Agenda
       • How different is the political Twitterverse from the rest of the social graph?
       • What are the different modes of engagement between different types of elite users and their followers?
       • How does information flow from elite users to others?
  5. Current Study
       • Structured to consider direct follow relationships
       • Constructing the political Twitterverse
  6. Political elites
       • Candidates in national races
       • Party leadership
       • Media: pundits, reporters, bloggers
       • Satirists
       • Celebrities
       • Advocacy groups
  7. Sampling Strategy
     Three levels: elites and followers
  8. Waves of collection
     Sampling at three different points:
       • Pre-primary: mid-January
       • Post-primary: June 26
       • Post-convention and pre-election: September 7
  9. Data Collection and Processing
       • Twitter RESTful API for collecting follower lists
       • Twitter Streaming API for collecting tweets
       • Two streams: a targeted sample stream and the "gardenhose" (10% sample of all of Twitter)
       • Hadoop/MapReduce for analysis
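The slides do not show the analysis code. As a rough illustration of the MapReduce pattern named above, here is a minimal sketch in plain Python (not Hadoop itself) that counts keyword mentions per day. It assumes tweets are stored one JSON object per line with the `created_at` and `text` fields of Twitter's v1.1 tweet format; the keyword list and the function names `map_tweet`, `reduce_counts`, and `count_mentions` are illustrative, not taken from the project.

```python
import json
from collections import defaultdict
from datetime import datetime

# Assumed keywords, chosen for illustration only.
KEYWORDS = ("trayvon", "zimmerman")


def map_tweet(line):
    """Map step: yield ((day, keyword), 1) for each keyword found in one tweet."""
    try:
        tweet = json.loads(line)
    except ValueError:
        return  # skip blank or malformed lines
    if not isinstance(tweet, dict) or "text" not in tweet or "created_at" not in tweet:
        return  # skip non-tweet stream messages such as delete notices
    created = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
    day = created.strftime("%Y-%m-%d")
    text = tweet["text"].lower()
    for keyword in KEYWORDS:
        if keyword in text:
            yield (day, keyword), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_mentions(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_tweet(line))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # Made-up sample records, not project data.
    sample = [
        json.dumps({"created_at": "Mon Mar 12 14:05:00 +0000 2012",
                    "text": "Justice for Trayvon"}),
        json.dumps({"created_at": "Mon Mar 12 18:30:00 +0000 2012",
                    "text": "Trayvon Martin and George Zimmerman"}),
        json.dumps({"delete": {"status": {"id": 1}}}),
        "",
    ]
    print(count_mentions(sample))
```

On the made-up sample, the two tweets produce a count of 2 for "trayvon" and 1 for "zimmerman" on 2012-03-12, and the delete notice and blank line are skipped. In a real Hadoop Streaming job the map and reduce steps would be separate scripts reading standard input; keeping them as functions here only makes the logic easy to test.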
  10. Data size and storage
       • Gardenhose:
           • 2.7 TB
           • 20-40 million tweets/day
           • 15-16 GB/day
       • Targeted sample:
           • 77,054 unique users
           • 103 GB
           • 500k-1 million tweets/day
           • Currently around 1 GB/day
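A quick back-of-the-envelope check relates these figures to each other. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and constant daily rates, neither of which the slides state, so the results are only rough orders of magnitude. The function names are illustrative.

```python
GB = 10**9   # assumed decimal gigabyte
TB = 10**12  # assumed decimal terabyte


def days_covered(total_bytes, daily_low, daily_high):
    """Range of collection days implied by a total size and a daily-size range."""
    return total_bytes / daily_high, total_bytes / daily_low


def bytes_per_tweet(daily_low, daily_high, tweets_low, tweets_high):
    """Range of average stored bytes per tweet implied by daily size and volume."""
    return daily_low / tweets_high, daily_high / tweets_low


if __name__ == "__main__":
    # Gardenhose: 2.7 TB total, 15-16 GB/day, 20-40 million tweets/day.
    low, high = days_covered(2.7 * TB, 15 * GB, 16 * GB)
    print(f"gardenhose days covered: {low:.0f} to {high:.0f}")
    low, high = bytes_per_tweet(15 * GB, 16 * GB, 20e6, 40e6)
    print(f"gardenhose bytes per tweet: {low:.0f} to {high:.0f}")

    # Targeted sample: about 1 GB/day, 500k-1 million tweets/day.
    low, high = bytes_per_tweet(1 * GB, 1 * GB, 500e3, 1e6)
    print(f"targeted bytes per tweet: {low:.0f} to {high:.0f}")
```

Under these assumptions the gardenhose total corresponds to roughly 170 to 180 days of collection at 375 to 800 bytes per tweet, and the targeted sample to roughly 1,000 to 2,000 bytes per tweet. The slides do not say whether the stored data is compressed, so the per-tweet figures should not be read as raw JSON sizes.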
  11. Case Study in Agenda Setting
       • Who establishes the media discourse?
       • How do different elements of media try to set the discourse?
  12. Trayvon Martin
       • February 26: Martin killed
       • March 8: CBS News interview with Martin's parents
       • Week of March 12: Media catches on; the case gets more coverage than the presidential race
       • April 11: State prosecutor files charges
       • April 19: Zimmerman released on bond
  13. Twitter mentions
     [Figure: line plot of Count (y-axis, scale up to 1.0) by Date (x-axis, 03/01 to 05/04), one line per keyword: trayvon, zimmerman.]
  14. Twitter vs. Google
     [Figure: line plot of Count (y-axis, 0.0 to 1.0) by Date (x-axis, 03/01 to 05/04), one line per series: trayvon, zimmerman, Gtrayvon, Gzimmerman. The "G" prefix presumably marks the Google series.]
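The slides do not say how the counts in these two plots were scaled. The y-axis tops out at 1.0, which is consistent with dividing each series by its own peak so that Twitter and Google series can share one axis. The sketch below shows that scaling under this assumption; the function name `scale_to_peak` and the sample numbers are made up for illustration.

```python
def scale_to_peak(series):
    """Divide every value by the series maximum so the peak equals 1.0.

    `series` maps a date label to a non-negative count.
    """
    if not series:
        return {}
    peak = max(series.values())
    if peak == 0:
        return {day: 0.0 for day in series}
    return {day: value / peak for day, value in series.items()}


if __name__ == "__main__":
    # Made-up daily counts, not values read from the plots.
    twitter_counts = {"03/19": 1200, "03/21": 4800, "03/23": 6000}
    print(scale_to_peak(twitter_counts))
```

Peak scaling discards absolute volume, so it only supports comparing the timing and shape of the curves, not their size.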
  15. Setting the agenda: Mentions of Trayvon
     [Figure: line plot of Ratio (y-axis, 0.01 to 0.05) by Date (x-axis, 03/01 to 05/04), one line per Level: 1, 2, 3.]
  16. Setting the agenda: Mentions of Zimmerman
     [Figure: line plot of Ratio (y-axis, 0.005 to 0.030) by Date (x-axis, 03/01 to 05/04), one line per Level: 1, 2, 3.]
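The slides do not define the "Ratio" plotted for each level. One plausible reading, assumed here, is the share of a level's tweets on a given day that mention the keyword, with the levels being the three sampling levels of elites and followers. The sketch below computes that share; the function name `mention_ratio` and the `(day, level, text)` record layout are illustrative choices.

```python
from collections import defaultdict


def mention_ratio(tweets, keyword):
    """Return {(day, level): share of that level's tweets that day mentioning keyword}.

    `tweets` is an iterable of (day, level, text) tuples.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    needle = keyword.lower()
    for day, level, text in tweets:
        key = (day, level)
        totals[key] += 1
        if needle in text.lower():
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


if __name__ == "__main__":
    # Made-up records, not project data.
    sample = [
        ("03/21", 1, "Justice for Trayvon"),
        ("03/21", 1, "Primary results tonight"),
        ("03/21", 2, "Zimmerman not arrested"),
        ("03/21", 2, "trayvon martin rally"),
    ]
    print(mention_ratio(sample, "trayvon"))
    print(mention_ratio(sample, "zimmerman"))
```

Dividing by each level's own daily volume keeps a large follower population from swamping the smaller elite group, which is what makes the per-level lines comparable.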
  17. Setting the agenda
       • No noticeable difference between mentions of Trayvon by elites and by followers
       • However, followers seem to catch on to Zimmerman more quickly
  18. Future Work
       • Incorporating network structure
           • Follower/friend networks
           • User mention networks
           • Retweet patterns
       • Computer-aided content analysis
           • Machine learning (supervised and unsupervised)
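As an illustration of what the mention and retweet networks listed above could look like in practice, here is a minimal sketch that builds weighted directed edge lists from tweet records. It assumes the field names of Twitter's v1.1 tweet JSON (`user.screen_name`, `entities.user_mentions`, `retweeted_status`); the function names and the sample accounts are hypothetical, and this is not the project's own code.

```python
from collections import Counter


def mention_edges(tweets):
    """Count directed (author, mentioned user) pairs across tweets."""
    edges = Counter()
    for tweet in tweets:
        author = tweet["user"]["screen_name"]
        for mention in tweet.get("entities", {}).get("user_mentions", []):
            edges[(author, mention["screen_name"])] += 1
    return edges


def retweet_edges(tweets):
    """Count directed (retweeter, original author) pairs across tweets."""
    edges = Counter()
    for tweet in tweets:
        original = tweet.get("retweeted_status")
        if original:
            edges[(tweet["user"]["screen_name"], original["user"]["screen_name"])] += 1
    return edges


if __name__ == "__main__":
    # Made-up tweets with hypothetical account names.
    sample = [
        {"user": {"screen_name": "follower_a"},
         "entities": {"user_mentions": [{"screen_name": "pundit_x"}]}},
        {"user": {"screen_name": "follower_a"},
         "entities": {"user_mentions": [{"screen_name": "pundit_x"}]},
         "retweeted_status": {"user": {"screen_name": "pundit_x"}}},
        {"user": {"screen_name": "follower_b"}, "entities": {"user_mentions": []}},
    ]
    print(mention_edges(sample))
    print(retweet_edges(sample))
```

The resulting counters can be loaded into a graph library as weighted edges. Retweets typically also list the original author in `user_mentions`, so an analysis that treats the two networks separately may want to exclude retweets from the mention count.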
  19. Thanks!
     ahanna@ssc.wisc.edu, @alexhanna