Development of the CyberCemetery (2011)
 


Latest presentation on the development of the CyberCemetery, an archive of "dead" websites for now-defunct government agencies and commissions. The CyberCemetery archive is maintained by the University of North Texas (UNT) Libraries, an Affiliated Archive of the National Archives and Records Administration (NARA).


Usage Rights

© All Rights Reserved

  • 1995: Government Printing Office (GPO) publishes a report stating the need to preserve electronic government publications.
  • 1997: University of North Texas (UNT) talks to GPO about forming a partnership. UNT archives the website of the Advisory Commission on Intergovernmental Relations.
  • 1999: UNT/GPO partnership is expanded to provide permanent public access to multiple government websites, covering any government agency or commission which is no longer operating (and/or has issued a final report). The collection is named “CyberCemetery” due to its collection of websites from “dead” government agencies and commissions.
  • 2006: UNT/GPO partnership is expanded. It now includes the U.S. National Archives and Records Administration (NARA).
  • 2011: 73 websites archived, and more on the way!

Presentation Transcript

  • Development & Practice in the CyberCemetery Starr Hoffman Head, Government Documents Dept. University of North Texas Libraries 25 September 2011
  • Intro: What is the CyberCemetery?
  • Purpose: Why create a CyberCemetery?
  • Development
  • Archiving Process
  • Technical Details
  • User Demographics: Who uses the CyberCemetery?
  • Conclusion
  • What is the CyberCemetery?
    • http://digital.library.unt.edu/explore/collections/GDCC/
    • online archive of websites from U.S. government agencies or commissions that are no longer operating
    • “snapshot” of each website as it existed before “pulling the plug”
    • maintained by the University of North Texas Libraries
    • freely accessible world-wide
    • affiliated NARA archive (National Archives and Records Administration)
  • 1997 - present 2008 - present
  • Protect At-Risk Information:
    • 1990s: U.S. government information = online
      • born-digital
      • edited or removed without warning
  • Federal Depository Library Program (FDLP)
    • administered by U.S. Government Printing Office (GPO)
    • mission: to provide free, permanent public access to government information
    • online information complicates this mission
    • University of North Texas is a federal depository library
  • 1995: E-docs at risk. Government Printing Office (GPO) publishes a report stating the need to preserve electronic government publications.
  • 1997: GPO + UNT. University of North Texas (UNT) talks to GPO about forming a partnership.
  • 1997: ACIR archived. UNT archives the website of the Advisory Commission on Intergovernmental Relations (ACIR).
  • 1999: GPO + UNT = expanded. Permanent public access, expanded to multiple websites and to any agency or commission no longer operating.
  • 1999: CyberCemetery. The archive is named “CyberCemetery” because its websites are from “dead” agencies & commissions.
  • 2006: GPO + UNT + NARA. The partnership now includes the U.S. National Archives and Records Administration (NARA).
  • 2011: 73+ websites archived.
  • 1. Identify at-risk government agencies and commissions
    • contacted directly by agency/commission
    • contacted by GPO
    • read/listen to news
    • read government-related websites & blogs
    • targeted search-engine queries
      • (“final report” + .gov)
    • referrals from other librarians, patrons
  • 2. Evaluate the website
    • must be an official government website
    • the agency or commission must:
      • be closing
      • have issued a final report
      • or give some other indication that the website is at risk
  • 2. Evaluate the website (continued)
    • Questions for website administrator:
      • What operating system was used to host this website?
      • What webserver software was used for the hosting of this website?
      • Are server side includes (SSI) used in this website?
      • Was this website static HTML or a dynamic site?
      • If dynamic, what scripting languages were used for this website (PHP, Perl, Python)?
      • Was a database used for this website?
        • If so, what database was used for this website?
        • What methods were used to connect to the database?
      • Is there streaming media associated with this website?
      • Are there proprietary content types used in this website?
      • Are there any comments you would like to add?
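The evaluation criteria in step 2 can be expressed as a small predicate. This is only an illustrative sketch, not UNT's actual tooling: the `SiteCandidate` class, its field names, and the decision to treat the three at-risk indicators as alternatives (any one is enough) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SiteCandidate:
    """Hypothetical record describing a website under evaluation."""
    url: str
    is_official_government_site: bool
    agency_is_closing: bool = False
    final_report_issued: bool = False
    other_at_risk_indication: bool = False


def qualifies_for_archiving(site: SiteCandidate) -> bool:
    """Return True when the site meets the step 2 criteria.

    Assumption: the site must be an official government website AND
    show at least one of the at-risk indicators from the slide.
    """
    if not site.is_official_government_site:
        return False
    return (
        site.agency_is_closing
        or site.final_report_issued
        or site.other_at_risk_indication
    )


if __name__ == "__main__":
    candidate = SiteCandidate(
        url="https://example.gov/commission",  # placeholder URL
        is_official_government_site=True,
        final_report_issued=True,
    )
    print(qualifies_for_archiving(candidate))
```

The answers to the administrator questionnaire are not modeled here, since they inform how a site is harvested rather than whether it qualifies.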
  • 3. Harvest the website
    • software: Heritrix (from Internet Archive)
      • http://crawler.archive.org/
    • downloads content
    • bundles all content into WARC file
      • WARC = website in a single file
    • no manipulation of code or content
  • 4. Access archived website
    • software: Wayback (from Internet Archive)
      • http://archive-access.sourceforge.net/projects/wayback/
    • retrieves content from WARC
    • adds banner notifying archived status
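To make "WARC = website in a single file" concrete, here is a minimal sketch of what one WARC record looks like on disk: a block of named header fields, a blank line, then the unmodified payload. It uses only the Python standard library and is not how Heritrix or Wayback are implemented. The function names are hypothetical, and the simple `resource` record type is an assumption chosen for brevity; real crawls usually store full HTTP responses, often compressed.

```python
import uuid
from datetime import datetime, timezone
from typing import Dict, Tuple


def build_warc_resource_record(target_uri: str, payload: bytes,
                               content_type: str = "text/html") -> bytes:
    """Serialize one payload as a WARC/1.0 'resource' record."""
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {timestamp}",
        f"WARC-Target-URI: {target_uri}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    header_block = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # The payload is stored byte-for-byte; a record ends with two CRLFs.
    return header_block + payload + b"\r\n\r\n"


def read_warc_record(record: bytes) -> Tuple[Dict[str, str], bytes]:
    """Split a single record back into its header fields and payload."""
    header_block, _, rest = record.partition(b"\r\n\r\n")
    lines = header_block.decode("utf-8").split("\r\n")
    fields = dict(line.split(": ", 1) for line in lines[1:])
    length = int(fields["Content-Length"])
    return fields, rest[:length]


if __name__ == "__main__":
    page = b"<html><body>Final report</body></html>"
    record = build_warc_resource_record("http://example.gov/", page)
    fields, body = read_warc_record(record)
    print(fields["WARC-Target-URI"], body == page)
```

A complete WARC file is a sequence of such records, which is why an access tool can retrieve an individual page from it by reading the header fields.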
  • 5. Harvesting alternative: Donated content
    • directly receive files from agency or commission
    • Why not donated content?
      • Content could be altered
      • Harvesting = exact copy of online published content
    • Why donated content?
      • If content cannot be accessed by harvesting
        • flash video, large amounts of media
      • rarely necessary now
  • 6. Link Checking
    • Manual:
      • manually navigate original & archived sites
    • Automated:
      • Xenu Link Checker
        • http://home.snafu.de/tilman/xenulink.html
      • compare reports of original & archived sites
  • 7. Load to UNT Server
    • Upload archived website
    • Add navigation
    • Notify GPO (or agency/commission) that archived version is live
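The automated part of the link-checking step, comparing reports of the original and archived sites, could be sketched as follows. The report format here is an assumption: each report is reduced to a mapping from site-relative path to HTTP status code. Xenu's actual export format is not modeled, and the function name is hypothetical.

```python
from typing import Dict, List


def compare_link_reports(original: Dict[str, int],
                         archived: Dict[str, int]) -> Dict[str, List[str]]:
    """Find links that were lost or broken during archiving.

    Both arguments map a site-relative path to the HTTP status code
    a link checker reported for it (assumed format).
    """
    # Paths the checker found on the live site but not in the archive.
    missing = sorted(path for path in original if path not in archived)
    # Paths that worked on the live site but fail in the archive.
    broken_only_in_archive = sorted(
        path for path, status in original.items()
        if status == 200 and path in archived and archived[path] != 200
    )
    return {
        "missing_from_archive": missing,
        "broken_only_in_archive": broken_only_in_archive,
    }


if __name__ == "__main__":
    original_report = {"/index.html": 200, "/report.pdf": 200,
                       "/old.html": 404, "/video.mov": 200}
    archived_report = {"/index.html": 200, "/report.pdf": 404,
                       "/old.html": 404}
    print(compare_link_reports(original_report, archived_report))
```

Links that were already broken on the original site (such as `/old.html` above) are deliberately not flagged, since the goal stated in the slides is an exact copy of the published content.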
  • Backup
    • full backups to magnetic tape
    • performed each weekend
    • shipped to offsite storage company
      • Iron Mountain
      • http://www.ironmountain.com
  • Technical Details:
    • web files (HTML, XML)
    • text documents (.txt, .pdf, .doc)
    • spreadsheets & statistics (.xls)
    • presentations (.ppt)
    • media files:
      • images & photographs (.jpg, .gif, .png, .tiff)
      • audio (.mp3)
      • video (.wm, .mov, .rp)
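As an illustration of how the file types listed above might be used to inventory a harvested site, the sketch below groups files by extension. The category names and the `classify` helper are hypothetical. The extensions are copied from the slide as written, except `.html` and `.xml`, which are assumed from the "HTML, XML" entry.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable

# Extension groups taken from the slide; ".html" and ".xml" are assumed.
CATEGORIES = {
    "web": {".html", ".xml"},
    "text": {".txt", ".pdf", ".doc"},
    "spreadsheet": {".xls"},
    "presentation": {".ppt"},
    "image": {".jpg", ".gif", ".png", ".tiff"},
    "audio": {".mp3"},
    "video": {".wm", ".mov", ".rp"},
}


def classify(path: str) -> str:
    """Return the category for a file path, or 'other' if unlisted."""
    suffix = PurePosixPath(path).suffix.lower()
    for category, extensions in CATEGORIES.items():
        if suffix in extensions:
            return category
    return "other"


def inventory(paths: Iterable[str]) -> Counter:
    """Count how many files fall into each category."""
    return Counter(classify(path) for path in paths)


if __name__ == "__main__":
    files = ["index.html", "reports/final.PDF", "img/seal.png", "data.csv"]
    print(inventory(files))
```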
  • Who uses the CyberCemetery?
    • researchers
    • historians
    • students
    • government employees
    • general public
    • avg. +1,000,000 hits per month
    • peak visits in one day:
      • 9,996 on 11.03.2011
    • most popular site: 9/11 Commission
  • Conclusion
    • provides permanent public access
    • archive of “dead” government information
    • freely, globally available
    • 73 websites and growing
    • partnership between:
      • University of North Texas Libraries
      • U.S. Government Printing Office
      • National Archives and Records Administration
  • For further information:
    • http://www.library.unt.edu/govinfo/
    • http://digital.library.unt.edu/explore/collections/GDCC/
    • Starr Hoffman, Head, Government Documents Dept., University of North Texas Libraries
    • govinfo@unt.edu
    • starr.hoffman@gmail.com
    • http://geekyartistlibrarian.com/