Development & Practice in the CyberCemetery


                                     Starr Hoffman
                 Head, Government Documents Dept.
                   University of North Texas Libraries
                                  25 September 2011
•   Intro               What is the CyberCemetery?
•   Purpose             Why create a CyberCemetery?
•   Development
•   Archiving Process
•   Technical Details
•   User Demographics   Who uses the CyberCemetery?
•   Conclusion
http://digital.library.unt.edu/explore/collections/GDCC/
• online archive of websites from U.S. government agencies
  or commissions that are no longer operating
  • “snapshot” of each website as it existed before “pulling the plug”
• maintained by the University of North Texas Libraries
• freely accessible world-wide
• affiliated NARA archive (National Archives and Records
  Administration)




1997 - present   2008 - present
• Protect At-Risk Information:
  • 1990’s: U.S. government information = online
  • born-digital
  • edited or removed without warning

• Federal Depository Library Program (FDLP)
    • administered by U.S. Government Printing Office (GPO)
    • mission: to provide free, permanent public access to
      government information
    • online information complicates this mission
    • University of North Texas is a federal depository library
1995: e-docs at risk
  • Government Printing Office (GPO) publishes report stating need to
    preserve electronic government publications
1997: GPO + UNT
  • University of North Texas (UNT) talks to GPO about forming a partnership
1997: ACIR archived
  • UNT archives website of the Advisory Commission on Intergovernmental
    Relations (ACIR)
1999: GPO + UNT = expanded
  • permanent public access, expanded to multiple websites, & any agency or
    commission no longer operating
1999: CyberCemetery
  • archive is named “CyberCemetery” because websites are from “dead”
    agencies & commissions
2006: GPO + UNT + NARA
  • partnership now includes the U.S. National Archives and Records
    Administration (NARA)
2011
  • 73+ websites archived
1. Identify at-risk government agencies and commissions
  •     contacted directly by agency/commission
  •     contacted by GPO
  •     read/listen to news
  •     read government-related websites & blogs
  •     targeted search-engine queries
      •    (“final report” + .gov)
  •     referrals from other librarians, patrons
2. Evaluate the website
   • must be an official government website
   • the agency or commission must:
     •   be closing
     •   have issued a final report
     •   show other indication that the website is at risk
2. Evaluate the website (continued)
   Questions for website administrator:
   • What operating system was used to host this website?
   • What webserver software was used for the hosting of this website?
   • Are server side includes (SSI) used in this website?
   • Was this website static html or a dynamic site?
     •   If dynamic, what scripting languages were used for this website
         (php, perl, python)?
     •   Was a database used for this website?
     •   If so, what database was used for this website?
     •   What methods were used to connect to the database?
   • Is there streaming media associated with this website?
   • Are there proprietary content types used in this website?
   • Are there any comments you would like to add?
3.       Harvest the website
     •       software: Heritrix (from Internet Archive)
           •     http://crawler.archive.org/
           •     downloads content
           •     bundles all content into WARC file
           •     WARC = website in a single file
           •     no manipulation of code or content
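To make "WARC = website in a single file" concrete, here is a minimal sketch in Python of how captured responses can be concatenated into one WARC-style byte stream and walked again. It is an illustration only, not Heritrix's own code: the function names and the example.gov URLs are hypothetical, it assumes the WARC/1.0 record layout, and it writes only a handful of header fields.

```python
import io
import uuid
from datetime import datetime, timezone


def warc_record(target_uri: str, http_payload: bytes) -> bytes:
    """Serialize one captured HTTP response as a WARC/1.0 'response' record."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: response",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Date: {now}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        "Content-Type: application/http; msgtype=response",
        f"Content-Length: {len(http_payload)}",
    ]
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # Each record ends with two CRLFs, so records can simply be concatenated.
    return head + http_payload + b"\r\n\r\n"


def build_archive(captures: dict[str, bytes]) -> bytes:
    """Bundle many captured responses into one WARC byte stream."""
    return b"".join(warc_record(uri, body) for uri, body in captures.items())


def list_target_uris(warc_bytes: bytes) -> list[str]:
    """Walk the records, using Content-Length to skip over each payload."""
    stream = io.BytesIO(warc_bytes)
    uris = []
    while True:
        line = stream.readline()
        if not line:
            break  # end of archive
        if line.strip() != b"WARC/1.0":
            continue  # blank separator lines between records
        length = 0
        while True:
            header = stream.readline().strip()
            if not header:
                break  # blank line ends the record headers
            name, _, value = header.partition(b":")
            if name.lower() == b"warc-target-uri":
                uris.append(value.strip().decode("utf-8"))
            elif name.lower() == b"content-length":
                length = int(value)
        stream.seek(length, io.SEEK_CUR)  # jump past the stored payload
    return uris
```

The payload bytes are stored exactly as received, which matches the "no manipulation of code or content" point: the record headers describe the capture but the content itself is untouched.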

4.       Access archived website
     •       software: Wayback (from Internet Archive)
           •     http://archive-access.sourceforge.net/projects/wayback/
           •     retrieves content from WARC
           •     add banner notifying archived status
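One way to picture the banner step is as a change made when a page is served, leaving the stored WARC content as harvested. The Python sketch below inserts a notice after the opening body tag. It is not how the Wayback software implements its banner; the function name, the banner wording and the CSS class are assumptions for illustration.

```python
import re

# Hypothetical banner text; the real archive's wording is not given in the slides.
BANNER = (
    '<div class="archive-banner">This is an archived copy of a website '
    "that is no longer maintained.</div>"
)

_BODY_OPEN = re.compile(r"<body\b[^>]*>", re.IGNORECASE)


def add_banner(html: str, banner: str = BANNER) -> str:
    """Insert the banner right after the opening <body> tag.

    If the page has no <body> tag, the banner is prepended so the
    notice is never silently dropped.
    """
    match = _BODY_OPEN.search(html)
    if match is None:
        return banner + html
    return html[: match.end()] + banner + html[match.end():]
```

Because the function returns a new string, the archived bytes themselves stay unchanged and the notice exists only in what the visitor sees.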
5. Harvesting alternative: Donated content
  •       directly receive files from agency or commission

      •      Why not donated content?
             •   Content could be altered
             •   Harvesting = exact copy of online published content


      •      Why donated content?
             •   If content cannot be accessed by harvesting
             •   flash video, large amounts of media
             •   rarely necessary now
6. Link Checking
  •     Manual:
      •    manually navigate original & archived sites
  •     Automated:
      •    Xenu Link Checker
      •    http://home.snafu.de/tilman/xenulink.html
      •    compare reports of original & archived sites
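The report comparison can be sketched in a few lines of Python. This is a hedged illustration, not the department's actual procedure: it assumes each link-checker report has been reduced to a plain list of URLs (the slides do not describe the report format), and that the archived copy keeps the same paths as the original under a different host. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def url_key(url: str) -> str:
    """Reduce a URL to path plus query so two hosts can be compared."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def compare_reports(original_urls, archived_urls):
    """Return (missing, extra) as sorted lists of path keys.

    missing: found on the original site but not in the archived copy
    extra:   found in the archived copy but not on the original site
    """
    original = {url_key(u) for u in original_urls if u.strip()}
    archived = {url_key(u) for u in archived_urls if u.strip()}
    missing = sorted(original - archived)
    extra = sorted(archived - original)
    return missing, extra
```

Anything in the missing list is a candidate for a re-crawl or a manual check; entries in the extra list are often files the archive added itself, such as banner assets.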
7. Load to UNT Server
  •    Upload archived website
  •    Add navigation
  •    Notify GPO (or agency/commission) that archived version is
       live
• Backup
  • full backups to magnetic tape
  • performed each weekend
  • shipped to offsite storage company
     • Iron Mountain
     • http://www.ironmountain.com
• web files (HTML, XML)
• text documents (.txt, .pdf,
  .doc)
• spreadsheets & statistics
  (.xls)
• presentations (.ppt)
• media files:
  • images & photographs (.jpg,
    .gif, .png, .tiff)
  • audio (.mp3)
  • video (.wm, .mov, .rp)
•   researchers
•   historians
•   students
•   government employees
•   general public




• avg. 1,000,000+ hits per month
• peak visits in one day:
   • 9,996 on 11.03.2011
• most popular site: 9/11 Commission
•   provides permanent public access
•   archive of “dead” government information
•   freely, globally available
•   73 websites and growing

• partnership between:
    • University of North Texas Libraries
    • U.S. Government Printing Office
    • National Archives and Records Administration
FOR FURTHER
      INFORMATION:
http://www.library.unt.edu/govinfo/
http://digital.library.unt.edu/explore/collections/GDCC/


   Starr Hoffman
  Head, Government Documents Dept.
  University of North Texas Libraries
  govinfo@unt.edu


  starr.hoffman@gmail.com
  http://geekyartistlibrarian.com/


Editor's Notes

  1. 1995: Government Printing Office (GPO) publishes a report stating the need to preserve electronic government publications • 1997: University of North Texas (UNT) talks to GPO about forming a partnership; UNT archives the website of the Advisory Commission on Intergovernmental Relations • 1999: UNT/GPO partnership is expanded • permanent public access • multiple government websites • government agency or commission which is no longer operating (and/or has issued a final report) • the collection is named “CyberCemetery” due to its collection of websites from “dead” government agencies and commissions • 2006: UNT/GPO partnership is expanded; now includes the U.S. National Archives and Records Administration (NARA) • 2011: 73 websites archived, and more on the way!