
Making Enterprise-Level Archive Tools Accessible for Personal Web Archiving

Published in: Technology

Mat Kelly, Michele C. Weigle, and Michael L. Nelson
{mkelly, mweigle, mln}@cs.odu.edu
Department of Computer Science, Old Dominion University, Norfolk, Virginia, USA

One-Click, User Instigated Preservation
  • “Archive Now!” button sets up crawl, initiates crawl and puts archive file in correct location to be indexed.
  • Once indexed, “View Archive” button shows all archives for URL in local Wayback.
  • Selecting the date in local Wayback displays the preserved webpage.
  • Wayback consumption can be checked with “Check Archive Status” button.

Features
  • Collection of Archiving Tools
  • Drag & Drop Installation And Removal
  • All Tools Can Reside on a Single Machine
  • Managed Through a Graphical User Interface (GUI)

Tools Installed Locally
  • CREATE ARCHIVES: Heritrix (Crawler)
  • REPLAY ARCHIVES: Wayback Machine
  • INSPECT ARCHIVES: WARC-Proxy
  • More to Come!

Interface for Tweaking: Advanced Options/Features
  • Specify Multiple URLs to be Included in the Crawl
  • Setup Crawls and Allow for Customization Prior to Execution (e.g., crawl period)
  • Start or Stop Services Not Currently Needed (e.g., initialize a long crawl but delay replay until later)

Support
  • GENERATED ARCHIVES ARE SAFE: Web ARChives (WARCs) reside on your hard drive, can be backed up for safe keeping like any other file
  • CROSS PLATFORM: Support for MacOS X, Windows and Linux
  • WORKS WITH EXISTING WARCS: Just drop in and local Wayback will index for replay
  • COMPATIBLE WITH OTHER ARCHIVING TOOLS: Use the WARC-generating preservation tool of your choice (e.g., WARCreate, Wget) in lieu of Heritrix

PDA 2013; College Park, MD; February 21, 2013
http://matkelly.com/wail
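
The "drop in" workflow described under Support (create a WARC with a tool such as Wget, then place it where the local Wayback can index it) could be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names and the `wayback/archives` folder are hypothetical, and the real location that the local Wayback watches is not stated in the poster. The only external facts assumed are Wget's `--warc-file` and `--page-requisites` options.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def build_wget_warc_command(url: str, warc_basename: str) -> list[str]:
    """Build a Wget command line that records the fetch of `url` as a WARC.

    Wget appends ".warc.gz" to the name passed to --warc-file by default.
    The command is only constructed here, not executed.
    """
    return ["wget", "--page-requisites", f"--warc-file={warc_basename}", url]


def drop_into_wayback(warc_path: Path, wayback_dir: Path) -> Path:
    """Copy a WARC into the folder the local Wayback indexes.

    `wayback_dir` is a hypothetical path chosen for this sketch.
    """
    wayback_dir.mkdir(parents=True, exist_ok=True)
    destination = wayback_dir / warc_path.name
    shutil.copy2(warc_path, destination)
    return destination


if __name__ == "__main__":
    # Show the command that would create the WARC.
    print(" ".join(build_wget_warc_command("http://example.com/", "example")))

    # Simulate the drop-in step with a placeholder file in a temp directory.
    with tempfile.TemporaryDirectory() as tmp:
        fake_warc = Path(tmp) / "example.warc.gz"
        fake_warc.write_bytes(b"placeholder, not a real WARC")
        placed = drop_into_wayback(fake_warc, Path(tmp) / "wayback" / "archives")
        print("copied to", placed)
```

Because the WARC is an ordinary file, the same copy step also covers the backup case mentioned above: pointing `wayback_dir` at a backup folder keeps a second copy for safe keeping.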
