
Election 2010: The View from Twitter


Paper presented at the InASA 'Double Vision' conference, Sydney, 26 Nov. 2010.

Published in: Education, Technology
Transcript of "Election 2010: The View from Twitter"

  1. Election 2010: The View from Twitter. Axel Bruns / Jean Burgess, ARC Centre of Excellence for Creative Industries and Innovation, Brisbane. a.bruns@qut.edu.au – @snurb_dot_info; je.burgess@qut.edu.au – @jeanburgess. http://mappingonlinepublics.net – http://cci.edu.au/ Image by campoalto
  2. Project: New Media and Public Communication
     • ARC Discovery (2010-12) – A$410,000
     • Axel Bruns (CI), Jean Burgess (SRF) – QUT, Brisbane
     • Lars Kirchhoff, Thomas Nicolai (PIs) – Sociomantic Labs, Berlin
     • Project blog: http://mappingonlinepublics.net/
     Timeline (Year 1, Year 2, Year 3):
     • Social network sources: YouTube, Flickr, Twitter, blogs
     • Research tools: network crawler, content scraper, content analysis, network analysis
     • Research tool development and baseline data
     • Baseline information: data extraction, content creation statistics, patterns in terms and themes, baseline social networking map, interconnections between social network spaces
     • Content creation patterns
     • Changes over time: short-term statistics, regular / seasonal patterns
     • Cluster profiling: common themes / patterns, lead users
     • Focus on specific events
     • Cultural dynamics: rapid spread of new ideas, communication across clusters, thematic discourse analysis, relationship with mainstream media coverage
  3. Methodology – Twapperkeeper Analysis Capture Identification #Hashtag Archive Tweet Statistics and @Replies Patterns of Activity over Time Networks of @Replies (short/long term) Tweet Texts Keyword / Key Phrase Mapping
  4. Data Processing – Twitter • Typical data structure (#ausvotes):
  5. Data Processing – Twitter
     • Tools:
       • Gawk – Scripting tool for CSV processing (open source)
       • Excel – Data aggregation, pivot tables and charts
       • Leximancer / WordStat – Keyword extraction, co-occurrence matrices
       • Gephi – Network analysis and visualisation (open source)

     # Extract @replies for network visualisation
     #
     # this script takes a CSV archive of tweets, and reworks it into network data for visualisation
     #
     # expected data format:
     # text,to_user_id,from_user,id,from_user_id,iso_language_code,source,profile_image_url,geo_type,
     # geo_coordinates_0,geo_coordinates_1,created_at,time
     #
     # output format:
     # from,to,tweet,time,timestamp
     #
     # the script extracts @replies from tweets, and creates duplicates where multiple @replies are
     # present in the same tweet - e.g. the tweet "@one @two hello" from user @user results in
     # @user,@one,"@one @two hello" and @user,@two,"@one @two hello"
     #
     # Released under Creative Commons (BY, NC, SA) by Axel Bruns - a.bruns@qut.edu.au

     BEGIN {
         print "from,to,tweet,time,timestamp"
     }

     /@([A-Za-z0-9_]+)/ {
         a=0
         do {
             match(substr($1, a),/@([A-Za-z0-9_]+)?/,atArray)
             a=a+atArray[1, "start"]+atArray[1, "length"]
             if (atArray[1] != 0) print $3 "," atArray[1] "," $1 "," $12 "," $13
         } while(atArray[1, "start"] != 0)
     }

     # filter.awk - Filter list of tweets
     #
     # this script takes a CSV or other list of tweets, and removes any lines that don't match the search term
     # the script preserves the first line, expecting that it contains header information
     #
     # script expects command-line argument search={searchcriteria} _before_ the input CSV filename
     # enclose the search term in quotation marks if it contains any special characters
     #
     # e.g.: gawk -F , -f filter.awk search="(julia|gillard)" tweets.csv >filteredtweets.csv
     #
     # expected data format:
     # CSV or simple list of tweets, line-by-line
     #
     # output format:
     # same as above, listing only the matching tweets
     #
     # Released under Creative Commons (BY, NC, SA) by Axel Bruns - a.bruns@qut.edu.au

     BEGIN {
         getline
         print $0
     }

     tolower($0) ~ search { print $0 }
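For readers without gawk, the @reply extraction described in the first script can be sketched in Python. This is an illustration under assumptions, not the authors' code: the function name `reply_edges`, the cut-down two-column sample, and the tweets themselves are made up, and only the `text` and `from_user` column names come from the data format listed on the slide. It uses the `csv` module, which also copes with commas inside quoted tweet text.

```python
import csv
import io
import re
import sys

# Same character class as the slide's gawk pattern for @usernames.
MENTION = re.compile(r"@([A-Za-z0-9_]+)")


def reply_edges(rows):
    """Yield one (from_user, to_user, text) edge per @mention in each tweet."""
    for row in rows:
        for name in MENTION.findall(row["text"]):
            yield row["from_user"], name, row["text"]


# Hypothetical sample: two invented tweets in a cut-down archive layout.
sample = io.StringIO(
    'text,from_user\n'
    '"@one @two hello",user\n'
    '"no mentions here",other\n'
)

edges = list(reply_edges(csv.DictReader(sample)))

# Write the edge list as CSV, one row per sender/recipient pair.
writer = csv.writer(sys.stdout)
writer.writerow(["from", "to", "tweet"])
writer.writerows(edges)
```

As in the slide's description, a tweet that mentions two users produces two edges, and a tweet with no mentions produces none.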
  6. Prelude: Leadership #spill (chart, axis scale 0 to 7000) • 23 June, 19:00-00:00: Speculation • 24 June, 08:00-15:00: Party Vote & Aftermath
  7. #spill Discussion Network (Node size: indegree [most @replies received]; node colour: outdegree [most @replies sent])
  8. #ausvotes: Overall Activity (17 July – 24 Aug. 2010)
  9. #ausvotes: Discussion Network (17 July to 25 Aug. 2010 / All @replies / Node size: Indegree / Node colours: betweenness centrality)
  10. #ausvotes: Mentions of the Parties (normalised per day)
  11. #ausvotes: Mentions of the Leaders (normalised per day)
  12. #ausvotes: Mentions of the Leaders (cumulative)
  13. #ausvotes: Key Themes
  14. #ausvotes: Key Themes (normalised per day)
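The two measures named in this caption can be computed directly from an edge list of @replies. The sketch below is a minimal illustration in Python, not the authors' Gephi workflow: `degree_counts` and the four sample edges are hypothetical, and it counts every @reply (a weighted degree), which is an assumption about how the slide's figures were derived.

```python
from collections import Counter


def degree_counts(edges):
    """Count @replies received (indegree) and sent (outdegree) per user."""
    indegree = Counter()
    outdegree = Counter()
    for sender, recipient in edges:
        outdegree[sender] += 1
        indegree[recipient] += 1
    return indegree, outdegree


# Hypothetical (sender, recipient) pairs, invented for illustration.
edges = [("a", "b"), ("c", "b"), ("b", "a"), ("a", "c")]

indegree, outdegree = degree_counts(edges)
print("most replied-to:", indegree.most_common(1))
print("most active sender:", outdegree.most_common(1))
```

In a visualisation like the one described, the first counter would drive node size and the second node colour.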
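The slides do not say how the "normalised per day" and "cumulative" series were calculated. One plausible reading, sketched here in Python, is the share of each day's tweets that mention a term, alongside a running total of mentions. The function `daily_mentions`, the simple substring match, and the sample tweets are all assumptions made for illustration.

```python
from collections import Counter
from itertools import accumulate


def daily_mentions(tweets, term):
    """Return sorted days, per-day share of tweets containing term, and a running total."""
    totals = Counter()
    hits = Counter()
    for day, text in tweets:
        totals[day] += 1
        if term in text.lower():
            hits[day] += 1
    days = sorted(totals)
    # Assumed normalisation: mentions divided by all tweets captured that day.
    share = [hits[d] / totals[d] for d in days]
    cumulative = list(accumulate(hits[d] for d in days))
    return days, share, cumulative


# Hypothetical (day, text) pairs, invented for illustration.
tweets = [
    ("2010-07-17", "Gillard calls the election #ausvotes"),
    ("2010-07-17", "Here we go #ausvotes"),
    ("2010-07-18", "gillard on the trail #ausvotes"),
    ("2010-07-18", "Gillard again #ausvotes"),
]

days, share, cumulative = daily_mentions(tweets, "gillard")
for day, s, c in zip(days, share, cumulative):
    print(day, f"{s:.0%}", c)
```

Dividing by the day's total keeps high-volume days, such as election day, from dominating the comparison between terms.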
  15. #ausvotes: Distractions (normalised per day) Labor’s Twibbon Campaign → RTs
  16. #ausvotes: Distractions
  17. Notes and Limitations
      • Twapperkeeper relies on #hashtags
        • Problem if #hashtags are inconsistent/unclear
        • Follow-on @replies and retweets may not continue to use #hashtags
        • Casual commenters may not use #hashtags in the first place
        • May miss early developments – e.g. #hashtag standardisation
      • Twitter as a subset of society:
        • Broadband policy and Internet filter overrepresented, asylum seekers underrepresented
        • #hashtag use is a further sign of self-selection
      • Need to look to Twitter firehose for more comprehensive picture
      • Need to track baseline activity to understand how exceptional #ausvotes was
      • See more at mappingonlinepublics.net – up next: time-based animations...
      • Or find us at @snurb_dot_info and @jeanburgess
  18. http://mappingonlinepublics.net/ @snurb_dot_info @jeanburgess Image by campoalto