Development of the CyberCemetery (2011)

Latest presentation on the development of the CyberCemetery, an archive of "dead" websites for now-defunct government agencies and commissions. The CyberCemetery archive is maintained by the University of North Texas (UNT) Libraries, an Affiliated Archive of the National Archives and Records Administration (NARA).

Published in: Education

  • 1995: Government Printing Office (GPO) publishes a report stating the need to preserve electronic government publications
  • 1997: University of North Texas (UNT) talks to GPO about forming a partnership; UNT archives the website of the Advisory Commission on Intergovernmental Relations
  • 1999: UNT/GPO partnership is expanded to provide permanent public access to multiple government websites, covering any government agency or commission which is no longer operating (and/or has issued a final report); the collection is named “CyberCemetery” due to its collection of websites from “dead” government agencies and commissions
  • 2006: UNT/GPO partnership is expanded and now includes the U.S. National Archives and Records Administration (NARA)
  • 2011: 73 websites archived, and more on the way!
  • Development of the CyberCemetery (2011)

    1. Development & Practice in the CyberCemetery
       Starr Hoffman, Head, Government Documents Dept., University of North Texas Libraries
       25 September 2011
    2. • Intro: What is the CyberCemetery?
       • Purpose: Why create a CyberCemetery?
       • Development
       • Archiving Process
       • Technical Details
       • User Demographics: Who uses the CyberCemetery?
       • Conclusion
    3. http://digital.library.unt.edu/explore/collections/GDCC/
    4. • online archive of websites from U.S. government agencies or commissions that are no longer operating
       http://digital.library.unt.edu/explore/collections/GDCC/
    5. • online archive of websites from U.S. government agencies or commissions that are no longer operating
       • “snapshot” of each website as it existed before “pulling the plug”
       • maintained by the University of North Texas Libraries
       • freely accessible world-wide
       • affiliated NARA archive (National Archives and Records Administration)
       http://digital.library.unt.edu/explore/collections/GDCC/
    6. 1997 - present 2008 - present
    7. • Protect At-Risk Information:
         • 1990’s: U.S. government information = online
         • born-digital
         • edited or removed without warning
       • Federal Depository Library Program (FDLP)
         • administered by U.S. Government Printing Office (GPO)
         • mission: to provide free, permanent public access to government information
         • online information complicates this mission
         • University of North Texas is a federal depository library
    8. 1995: e-docs at risk. Government Printing Office (GPO) publishes report stating need to preserve electronic government publications
    9. 1997: GPO + UNT. University of North Texas (UNT) talks to GPO about forming a partnership
    10. 1997: ACIR archived. UNT archives website of the Advisory Commission on Intergovernmental Relations (ACIR)
    11. 1999: GPO + UNT = expanded. Permanent public access, expanded to multiple websites, & any agency or commission no longer operating
    12. 1999: CyberCemetery. Archive is named “CyberCemetery” because websites are from “dead” agencies & commissions
    13. 2006: GPO + UNT + NARA. Partnership now includes the U.S. National Archives and Records Administration (NARA)
    14. 2011: 73+ websites archived
    15. 1. Identify at-risk government agencies and commissions
        • contacted directly by agency/commission
        • contacted by GPO
        • read/listen to news
        • read government-related websites & blogs
        • targeted search-engine queries
          • (“final report” + .gov)
        • referrals from other librarians, patrons
    16. 2. Evaluate the website
        • must be an official government website
        • the agency or commission must:
          • be closing
          • have issued a final report
          • or show other indication that the website is at-risk
    17. 2. Evaluate the website (continued)
        Questions for website administrator:
        • What operating system was used to host this website?
        • What webserver software was used for the hosting of this website?
        • Are server side includes (SSI) used in this website?
        • Was this website static html or a dynamic site?
        • If dynamic, what scripting languages were used for this website (php, perl, python)?
        • Was a database used for this website?
          • If so, what database was used for this website?
          • What methods were used to connect to the database?
        • Is there streaming media associated with this website?
        • Are there proprietary content types used in this website?
        • Are there any comments you would like to add?
    18. 3. Harvest the website
        • software: Heritrix (from Internet Archive)
          • http://crawler.archive.org/
        • downloads content
        • bundles all content into WARC file
          • WARC = website in a single file
        • no manipulation of code or content
        4. Access archived website
        • software: Wayback (from Internet Archive)
          • http://archive-access.sourceforge.net/projects/wayback/
        • retrieves content from WARC
        • add banner notifying archived status
    19. 5. Harvesting alternative: Donated content
        • directly receive files from agency or commission
        • Why not donated content?
          • Content could be altered
          • Harvesting = exact copy of online published content
        • Why donated content?
          • If content cannot be accessed by harvesting
            • flash video, large amounts of media
          • rarely necessary now
    20. 6. Link Checking
        • Manual:
          • manually navigate original & archived sites
        • Automated:
          • Xenu Link Checker
            • http://home.snafu.de/tilman/xenulink.html
          • compare reports of original & archived sites
        7. Load to UNT Server
        • Upload archived website
        • Add navigation
        • Notify GPO (or agency/commission) that archived version is live
    21. • Backup
          • full backups to magnetic tape
          • performed each weekend
          • shipped to offsite storage company
            • Iron Mountain
            • http://www.ironmountain.com
    22. • web files (HTML, XML)
        • text documents (.txt, .pdf, .doc)
        • spreadsheets & statistics (.xls)
        • presentations (.ppt)
        • media files:
          • images & photographs (.jpg, .gif, .png, .tiff)
          • audio (.mp3)
          • video (.wm, .mov, .rp)
    23. • researchers
        • historians
        • students
        • government employees
        • general public
        • avg. 1,000,000+ hits per month
        • peak visits in one day:
          • 9,996 on 11.03.2011
        • most popular site: 9/11 Commission
    24. • provides permanent public access
        • archive of “dead” government information
        • freely, globally available
        • 73 websites and growing
        • partnership between:
          • University of North Texas Libraries
          • U.S. Government Printing Office
          • National Archives and Records Administration
    25. FOR FURTHER INFORMATION:
        http://www.library.unt.edu/govinfo/
        http://digital.library.unt.edu/explore/collections/GDCC/
        Starr Hoffman, Head, Government Documents Dept., University of North Texas Libraries
        govinfo@unt.edu
        starr.hoffman@gmail.com
        http://geekyartistlibrarian.com/
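
Slide 18 describes a WARC as a "website in a single file" from which the access software retrieves content. As a rough illustration of that idea, the sketch below walks an uncompressed WARC byte stream and yields each record's headers and payload. It is not how Heritrix or Wayback work internally; the names `read_warc_records` and `make_record` and the example.org URLs are hypothetical, and the in-memory sample leaves out fields a real WARC record carries (such as WARC-Record-ID and WARC-Date) to stay short.

```python
import io


def read_warc_records(stream):
    """Yield (headers, payload) pairs from an uncompressed WARC byte stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line.strip() == b"":
            continue  # blank lines separate one record from the next
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # an empty line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The payload is exactly Content-Length bytes long.
        payload = stream.read(int(headers["Content-Length"]))
        yield headers, payload


def make_record(record_type, target_uri, body):
    """Build a simplified record for the demo (hypothetical helper)."""
    head = (
        "WARC/1.0\r\n"
        f"WARC-Type: {record_type}\r\n"
        f"WARC-Target-URI: {target_uri}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return head + body + b"\r\n\r\n"


# Two fake captures bundled into one in-memory "file".
sample = (
    make_record("response", "http://example.org/", b"<html>hello</html>")
    + make_record("response", "http://example.org/report.pdf", b"%PDF-")
)
records = list(read_warc_records(io.BytesIO(sample)))
for headers, payload in records:
    print(headers["WARC-Target-URI"], len(payload))
```

Because each record states its own length, a reader can skip from capture to capture without interpreting the payload, which is consistent with the slide's point that harvested content is stored without manipulation.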

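The automated link check on slide 20 compares reports of the original and archived sites. One way to picture that comparison is a set difference over the URLs each report lists. The sketch below assumes the URL lists have already been extracted from the link checker's reports, and that the archived copy keeps the original paths under a different host; both are assumptions for illustration, as are the function names and the `.example` hosts.

```python
from urllib.parse import urlsplit


def path_key(url):
    """Reduce a URL to its path plus query so that hosts can differ."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def missing_from_archive(original_urls, archived_urls):
    """Return paths found in the original report but not in the archived one."""
    archived = {path_key(u) for u in archived_urls}
    return sorted({path_key(u) for u in original_urls} - archived)


# Hypothetical URL lists standing in for the two link-checker reports.
original_report = [
    "http://agency.example/",
    "http://agency.example/reports/final.pdf",
    "http://agency.example/about.html",
]
archived_report = [
    "http://archive.example/",
    "http://archive.example/about.html",
]

missing = missing_from_archive(original_report, archived_report)
print(missing)
```

Any path left in `missing` would be a candidate for a re-harvest or for the manual navigation check mentioned on the same slide.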