Butterfly Hunt: On Collecting #mla14 Tweets (#mla15 #s398)
Academic excellence for business and the professions
On Collecting #mla14 Tweets
Dr Ernesto Priego
Centre for Information Science
• 1. Twitter data collection: “butterfly hunt”, citizen archiving
• 2. How we collected and what we
• 3. #mla14 Twitter data summary
• 4. #mla15 so far
• 5. Why Twitter data collection matters
Conference Tweet Collection as Butterfly Hunt
Conference Tweet Collection as Citizen Archiving
• Citizen archivists are “the first responders
of history… arriving early on the scene to
gather, capture, describe and preserve
ephemeral artifacts of interest and helping
to ensure they survive over time to share
with the future.”
• “Almost any collection becomes interesting
once you get enough stuff in one place.”
-Butch Lazorchak, The Signal. Digital
Preservation. Library of Congress, May 8,
Data Collection; Data Sharing
• TAGS – Twitter Archiving Google Spreadsheet by Martin Hawksey
• Not as easy as just letting it run automatically: the devil is in the details
• Issues: API Limits, Spreadsheet Settings and Limits, Different Time
Zones, Spam, Character Encoding, Languages, Duplication, Bots,
• Automated methods and manual methods are required
• Online collaboration – Chris Zarate and I, across time zones,
• Shared on open access repositories:
• http://bit.ly/MLA14TwitterArchive (see also
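Two of the issues listed above, duplication and different time zones, can be illustrated with a minimal Python sketch. This is not the workflow used for the #mla14 archive. It assumes each archived tweet is a row with an `id_str` column and a `created_at` column in Twitter's classic timestamp format. Those column names, the `clean_archive` helper and the sample rows are assumptions made for illustration only.

```python
from datetime import datetime, timezone

# Assumed timestamp layout, e.g. "Thu Jan 09 14:03:21 +0000 2014".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def clean_archive(rows):
    """Drop duplicate tweets by ID and add a UTC timestamp to each row."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        tweet_id = row["id_str"]
        if tweet_id in seen_ids:
            continue  # same tweet collected twice
        seen_ids.add(tweet_id)
        created = datetime.strptime(row["created_at"], TWITTER_TIME_FORMAT)
        # Convert to UTC so collectors in different time zones agree.
        cleaned.append({**row, "created_utc": created.astimezone(timezone.utc)})
    cleaned.sort(key=lambda r: r["created_utc"])
    return cleaned


# Invented sample rows, not real #mla14 data.
sample_rows = [
    {"id_str": "1", "created_at": "Thu Jan 09 14:03:21 +0000 2014", "text": "Hello #mla14"},
    {"id_str": "1", "created_at": "Thu Jan 09 14:03:21 +0000 2014", "text": "Hello #mla14"},
    {"id_str": "2", "created_at": "Thu Jan 09 08:00:00 -0600 2014", "text": "Panel starting #mla14"},
]
cleaned_rows = clean_archive(sample_rows)
```

Spam, bots and character encoding problems are harder to detect with a rule this simple, which is one reason manual checking is still required alongside automated methods.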
#MLA14 Tweet Activity on Conference Days (9-12 January 2014)
#MLA14 Conference Days Participation
#MLA14 Most Frequent Word*:
digital: 1,474 occurrences
Top Ten #MLA14 user_lang*
#MLA14 Tweets and Geolocation
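Summaries like the word frequencies and `user_lang` counts above can be produced in several ways. The following Python sketch shows one possible approach and is not necessarily how the #mla14 figures were calculated. The stop word list, the choice to skip hashtags and mentions, the `text` and `user_lang` column names and the sample rows are all assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption, far from complete).
STOP_WORDS = {"the", "a", "an", "at", "on", "is", "of", "and", "to", "in"}


def word_counts(texts):
    """Count words across tweets, skipping hashtags, mentions and stop words."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[#@]?\w+", text.lower()):
            if token.startswith(("#", "@")) or token in STOP_WORDS:
                continue
            counts[token] += 1
    return counts


def top_languages(rows, n=10):
    """Return the n most common values of the user_lang column."""
    return Counter(row["user_lang"] for row in rows).most_common(n)


# Invented sample rows, not real #mla14 data.
sample_rows = [
    {"text": "Digital humanities panel at #mla14", "user_lang": "en"},
    {"text": "The digital archive is open #mla14", "user_lang": "en"},
    {"text": "@user on digital pedagogy", "user_lang": "es"},
]
counts = word_counts(row["text"] for row in sample_rows)
languages = top_languages(sample_rows)
```

Here `counts.most_common(1)` gives the single most frequent word and `languages` gives the ranked language codes. Different tokenisation or stop word choices would change the totals.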
Why Does It Matter?
• Historical record of scholarly social media participation and an
increasingly important manifestation of MLA Convention activity
• Twitter is experienced “live”, as it happens, but web and mobile
clients do not yet allow long-term archiving of hashtag activity or
• Tweets as ephemera; preservation for future analysis needed
• Twitter may or may not last; it may change significantly; Twitter API
rules and behaviour change very often and without warning
• Archiving and analysing Twitter evidence (data) may offer insights
into scholarly behaviour; disciplinary, thematic, linguistic and
socio-cultural trends over time; conference sentiment and feedback;
scholarly social network mapping, etc.
• Priego, Ernesto, and Zarate, Chris. “#MLA14 Twitter Archive, 9-12 January 2014”. Dataset. City
Research Online. 2014. Web.
http://openaccess.city.ac.uk/3083/ Accessed 7 January 2015.
• Priego, Ernesto. “#MLA14: A First Look (I)”. Far Away, Yet Close. MLA Commons. 16 January 2014.
Web. http://remoteparticipation.commons.mla.org/2014/01/16/mla14-a-first-look/ Accessed 7
January 2015 (four parts).
• Priego, Ernesto. “Some Thoughts on Why You Would Like to Archive and Share [Small] Twitter
Data Sets”. 28 May 2014. Web.
share-twitter-small-data/ Accessed 7 January 2015.