
Collaboration and Cash: Web Archiving Incentive Awards



This presentation was delivered in session 306 at the annual meeting of the Society of American Archivists (#saa15). These slides provide information about and lessons learned from the web archiving incentive awards program. Links provided are to facilitate further learning about the tools mentioned but are not a definitive set of resources about these tools.

Published in: Education


  1. 1. Collaboration and Cash: Web Archiving Incentive Awards Anna Perricci Columbia University Libraries Society of American Archivists, Session 306 August 21, 2015
  2. 2. Good morning Source:
  3. 3. Today’s session Taking an expansive view of outreach for web archives, the speakers discuss methods for encouraging use and engagement with collections of web content via various approaches, including collaborative collecting, cooperative collection development, promoting new research uses, fostering research and tool development, advocacy, and working directly with content creators. Attendees have the opportunity to discuss novel approaches to promoting the utility and value of web archives.
  4. 4. For more on collaborative collection development • Overview of grant funded collaborations for RESAW: to-save-more-of-the-web & • Focus on Ivy Plus / Borrow Direct for #CUWARC: • Progress on the Contemporary Composers Web Archive for IAML: • Process for building CAUSEWAY for ARLIS/NA: multiinstitutional-web-archiving-collaboration-for-the-collaborative- architecture-urbanism-and-sustainability-web-archive-causeway
  5. 5. High level summary
  6. 6. • We’re a little over four months out from the finish line for this grant (ends December 31, 2015) • Work on the incentive awards program is wrapping up and ready to be discussed • So far, distributed efforts have not reduced workload or costs, but the outcomes of the collaborative projects are enriched by the shared expertise and insights
  7. 7. Incentive awards projects to advance web archiving tools
  8. 8. Source of funds A 2012 summit on web archiving held at Columbia “showed broad agreement on the need for action in several areas as web archiving continues to grow. We need to find ways to share expertise and infrastructure, to better understand how researchers will use web archives, and work with website owners to make their content easier to collect.” - Bob Wolven, Associate University Librarian for Bibliographic Services and Collection Development For more on the collaborative web archiving grant: 5_CUL_Mellon_Web_Archiving_Grant.html
  9. 9. Visualizing Digital Collections of Web Archives Primary Investigators: Michele Weigle and Michael Nelson Institution: Old Dominion University Project purpose: Develop a tool for showing how a single web page changes over time For more information see:,, sults.jsp & B1dpUnKLglmM2iScjl&index=6
  10. 10. Tools for Managing Seed URIs Primary Investigators: Michael Nelson and Michele Weigle Institution: Old Dominion University Project purpose: Develop a tool that enables curators to detect when their web archives have drifted off topic and to discover new seed sites to include in collections For more information see: Detection & FRB1dpUnKLglmM2iScjl&index=6
  11. 11. Archiving Transactions Towards Uninterruptible Web Service Primary Investigators: Zhiwu Xie and Ed Fox Institution: Virginia Tech Project purpose: Create new tools or leverage existing ones so that when a web resource is unavailable because a key service has been interrupted, an archived copy is provided to the end user • Web archives can serve as a value-added collection to motivate web archiving as a tool for day-to-day IT operation For more information see: wQhBpFRB1dpUnKLglmM2iScjl &
  12. 12. Mitigating the Pervasive Problem of Link Rot in Scholarly Works and Preserving Online Content Primary Investigator: Jonathan Zittrain Institution: Harvard Library Innovation Lab Project purpose: Create APIs to extend the use of the technology supporting, a tool for authors and editors to make a copy of a cited resource for preservation and future access (focus on legal resources) For more information see: & Dab4lwQhBpFRB1dpUnKLglmM2iScjl
  13. 13. Free Law Project Primary Investigators: Brian Carver and Michael Lissner Organization: Free Law Project Purpose: Expand capacity to obtain opinions on appellate court websites & involve wider community in scraping work (using Juriscraper); capture recordings of oral arguments For more information see:, & 4lwQhBpFRB1dpUnKLglmM2iScjl
  14. 14. Warcbase: A Web Archives Browser Built on Modern “Big Data” Infrastructure Primary Investigator: Jimmy Lin Institution: University of Maryland Project purpose: Warcbase is an open-source platform for storing, managing, and analyzing web archives using current “big data” infrastructure and tools (e.g. HBase for storage, Hadoop for data analytics); further applications on ‘wimpy hardware’ (Raspberry Pi) were also demonstrated for personal digital archiving For more information see: & 4lwQhBpFRB1dpUnKLglmM2iScjl
  15. 15. Bonus points for bringing a user of the tool!
  16. 16. Oversight panel • An oversight panel reviewed and chose proposals to fund; project outcomes will be evaluated by oversight panelists Oversight panel for project selection: • Kris Carpenter (while at the Internet Archive) • Mark Phillips (University of North Texas) • Rob Sanderson (while at Los Alamos National Laboratory) • Perry Willett (California Digital Library) Oversight panel for project evaluation: • Mark Phillips (University of North Texas) • Martin Klein (UCLA) • Jefferson Bailey (Internet Archive)
  17. 17. Bringing order to what could have been a terrible mess of emails and attachments
  18. 18. We organized a conference Web Archiving Collaboration: New Tools and Models Image source: 72157655295804376/
  19. 19. Information sharing • Slides and video links: /web_resources_collection/Conf erences/program.html • Video playlist: ist?list=PLf1Dab4lwQhBpFRB1dp UnKLglmM2iScjl • Pictures:
  20. 20. Fair warning • Can’t say there are plans to repeat this program • Many steps needed to work out requirements for the sponsored projects office, invoicing and intellectual property agreements, etc. – Many, many, many steps… Image source:
  21. 21. Hopes for the future • Ideal: seeing more development of generalizable and extensible tools • Interoperability with Archive-It is helpful whenever possible though other services / approaches are being explored as well
  22. 22. Parts of a whole / parting thoughts • Identifying a need and trying to meet it can lead to novel approaches and associated challenges • Hopefully what we learned can serve as a reference for future efforts to develop digital tools that improve the preservation and accessibility of digital archives of any kind
  23. 23. Thanks! Anna Perricci Columbia University Libraries @AnnaPerricci Unattributed images in this presentation are by Anna Perricci