Timeline to date:
• April 2010: Twitter, Inc.’s ‘gift’ of a full continuous archive announced; substantial media coverage praising Twitter’s philanthropy
• January 2013: LoC blog update and whitepaper on the archive project
– 170b tweets ingested; 500m new tweets each day
– 400 inquiries from researchers; no access available as yet
• November 2015: Politico says LoC project is “in limbo”
Twitter as a First Draft of the Present – and the Challenges of Preserving It for the Future
Prof. Axel Bruns
Digital Media Research Centre
Queensland University of Technology
Brisbane, Australia

Dr. Katrin Weller
Computational Social Science
GESIS Leibniz Institute for the Social Sciences
Köln, Germany
HISTORY IS WRITTEN…
• …by the winners
• …with the records that survive
• What will survive of our time?
– Print journalism?
– Audiovisual materials?
– The Web?
Various Web archiving projects
• But what about social media?
IS TWITTER HISTORY?
• What is Twitter?
– Journalism: “a first rough draft of history” (as journalists see it)
– Twitter: real-time, real-life observations by a diverse, global userbase
– Twitter: tweets, but also images, video, links to external content
Twitter is a first draft of the present
• But – significant fears for long-term preservation:
– Concerns over Twitter, Inc.’s commercial sustainability
– Data access commercialised and unaffordable at scale
– Account and content deletions threaten completeness
– Embedded third-party URLs, images, audio, video may disappear
We must get serious about preserving Twitter (and other social media)
WHAT IS LOST?
• Deleted tweets
• Audiovisual contents in tweets: videos, images
• URLs and their contents
• Context information: user names, meaning of hashtags
• Context: Interfaces, look & feel
WHERE TO FROM HERE?
• Archiving Twitter:
– Alternative options in addition to LoC are needed
– Archiving without Twitter, Inc.’s support feasible only for smaller subsets
• e.g. TrISMA project, Australian Research Council / National Library of Australia
– Full archive would require gaining/buying access to Twitter firehose: very costly
• Archiving Twitter, fully:
– Need to capture more than just tweets alone
– Content of shared URLs, embedded images, audio, video
– Background information on accounts, underlying structure of follower networks
– Twitter user experience: interface design, content presentation, etc.
All of this is increasingly urgent, as content is already disappearing…
Full paper: http://eprints.qut.edu.au/95296/
This research is funded by the Australian Research Council through Future Fellowship
and LIEF grants FT130100703 and LE140100148.
Part of this work was conducted during a Digital Studies Fellowship at the Library of Congress’s John W. Kluge Center.