148 john shaw2006fall



  1. Archiving: What is it and why should it be important to me? John Shaw, Director, Publishing Technologies, SAGE Publications, U.S.
  2. I. Archiving Overview; II. Types of Archives; III. A SAGE Example; IV. Risks, Questions, and More Questions
  3. Archiving Part I: Archiving Overview
  4. What is an Archive? An authoritative collection. Preserved and professionally managed in perpetuity: history, institutional commitment and policy, integrity regarding preservation. “…information needed for society’s memory.” (“Schellenberg in Cyberspace,” American Archivist 61:2 (Fall 1998), pp. 309-327.) Preservation first.
  5. What is a Repository? “A place where things can be stored and maintained; a storehouse.” [Society of American Archivists Glossary] “Depository” means the same; it also denotes a library that receives government documents for public access. Not all repositories are archives.
  6. Why Care? “Preserving information for decades or even centuries has proved important. Shang dynasty (12th century BC) Chinese astronomers inscribed eclipse observations on “oracle bones” (animal bones and tortoise shells). About 3200 years later researchers used these records, together with one from 1302 BC, to estimate that the accumulated clock error was just over 7 hours, and from this derived a value for the viscosity of the Earth’s mantle as it rebounds from the weight of the glaciers.”
  7. Why Care? “These timescales of many decades, even centuries, contrast with the typical 5-year lifetime for computing hardware and digital media.” “A Fresh Look at the Reliability of Long-term Digital Storage.” Baker, Mary, et al. EuroSys 06, April 18-21, 2006.
  8. Why Care? Preservation: digital information is impermanent. Publisher: safety, to ensure ongoing availability of your content. Your library customers: custodianship, to ensure continuity of the record of scientific progress. Very long view: epistemology, history of science and culture.
  9. What Should be Preserved? Scholarly content; research materials; Web-based, digitally born content.
  10. How e-Archives Differ. Mission: collection v. preservation. Access control: dark v. light. Deposits: why (voluntary v. mandated), who (author v. publisher), what (manuscripts v. final work), when (backfile v. current content). Future format migration. Rights transfer. Costs.
  11. Archiving Part II: Types of Archives
  12. Types of Archives: national archives; institutional repositories; community-based archives; product solution archives.
  13. Types of Archives: National. Dutch national library, Koninklijke Bibliotheek (KB); British Library; NIH PubMed Central? (“NIH’s digital repository for biomedical research”); Library of Congress?
  14. KB: Dutch National Library. Mission: legal deposit library (“…collect, catalogue and preserve all publications appearing in the Netherlands”); capable of ingesting 60,000 articles/day. Deposits: source files from publishers; automated, strict. Costs? Access control: local patron access; publisher sets remote access rules.
  15. KB: Dutch National Library. Migration: preservation research leader; committed to format migration. Archiving agreements with OUP, SAGE, Blackwell, Elsevier, Kluwer Academic, etc.
  16. The British Library Legal Deposit Pilot. Mission: legal deposit library; UK-published (to start). Pilot: legal deposit for e-journals; 23 volunteer publishers; secure infrastructure; uses DigiTool by Ex-Libris; shared with the other UK legal deposit libraries; to “scope and test” ingest, storage, retrieval. Cost?
  17. The British Library: Preservation and Migration. BL’s future for managing digital assets: preserve any type of digital material in perpetuity. Migration: ensure that users can view the material with contemporary applications; preserve the original look-and-feel where possible. Access control: “appropriate permissions”.
  18. PMC: US National Library of Medicine Journal Archive. Mission: make research more accessible. Free full-text archive of 230 journals. Deposit: publishers submit source files. Migration. Access control. Cost?
  19. PMC: Depository for NIH-Funded Research Articles. Authors of NIH-funded articles “encouraged” to deposit final manuscript: “after all modifications due to …peer review”; MS Word, PDF, etc.; with supplementary information; publisher can replace with published version. To be required soon?
  20. Library of Congress. National Digital Information Infrastructure and Preservation Program (NDIIPP), formed in 2000. Members: National Library of Medicine, the National Agricultural Library, the National Institute of Standards and Technology, the Research Libraries Group, the OCLC Online Computer Library Center, and the Council on Library and Information Resources. Preliminary investigation and software development phase. Primarily e-journal deposit. Future …???
  21. Types of Archives: Institutional. University with expansive focus: Stanford Digital Repository. Automated: LOCKSS.
  22. Stanford Digital Repository. Stanford Univ. Libraries initiative. Digital preservation serving Stanford University, the broader academic community, and publishers. Principles: trust, security, transparency. Costs?
  23. LOCKSS. Technology to preserve local library collection. Automated, self-correcting cache servers: requires LOCKSS server at library. Requires publisher participation. Builds collection of all resources which the institution licenses. Goes online to users if data source becomes unavailable: provides access to static “HTML images” of source. Costs.
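
The "self-correcting cache servers" idea on the LOCKSS slide can be illustrated with a small sketch. This is only a simplified majority-vote comparison of content digests, written for illustration; it is not the actual LOCKSS polling protocol, and the function and cache names are hypothetical.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest used to compare copies."""
    return hashlib.sha256(content).hexdigest()


def repair_by_majority(copies: dict[str, bytes]) -> dict[str, bytes]:
    """Replace any copy whose digest disagrees with the majority digest.

    `copies` maps a hypothetical cache name to the bytes it holds.
    If no digest has a strict majority, the copies are returned unchanged.
    """
    if not copies:
        return {}
    votes = Counter(digest(c) for c in copies.values())
    winner, count = votes.most_common(1)[0]
    if count * 2 <= len(copies):
        return dict(copies)  # no strict majority: leave everything alone
    good = next(c for c in copies.values() if digest(c) == winner)
    return {
        name: (c if digest(c) == winner else good)
        for name, c in copies.items()
    }
```

With three caches holding the same article and one damaged copy, the damaged copy is overwritten with the version the other two agree on; with an even split, nothing is changed.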
  24. Types of Archives: Product Solution. Non-profit organization: Portico.
  25. Portico. Mission: scholarly preservation; standalone archive; initiated by JSTOR, with grant funding. Deposits: source files from publisher. Migration: planned. Costs: publishers’ annual fee $250 to $75,000, based on annual revenue; libraries’ annual fee $1,500 to $24,000, based on Library Materials Expenditure.
  26. Portico: Access Control. Member libraries get access “when specific trigger events occur, and when titles are no longer available from the publisher or other source.” Trigger events include: publisher stops operations; publisher ceases to publish a title; publisher no longer offers back issues; catastrophic and sustained failure of a publisher’s delivery platform. Can also fulfill “perpetual access” subscription obligations.
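
The trigger-event rule on the Portico slide can be read as a simple access check. The sketch below is one possible reading of the slide wording, not Portico's actual policy engine; the enum members, function name, and parameters are all hypothetical.

```python
from __future__ import annotations

from enum import Enum, auto


class Trigger(Enum):
    """Trigger events listed on the slide (names are illustrative)."""
    PUBLISHER_STOPS_OPERATIONS = auto()
    TITLE_CEASES_PUBLICATION = auto()
    BACK_ISSUES_NO_LONGER_OFFERED = auto()
    PLATFORM_FAILURE = auto()  # catastrophic and sustained


def library_may_access(
    is_member: bool,
    events: set[Trigger],
    title_available_elsewhere: bool,
    perpetual_access_claim: bool = False,
) -> bool:
    """Return True if a library may read the archived title.

    Assumption: access opens when at least one trigger event has occurred
    and the title is no longer available from the publisher or another
    source, or when the archive is fulfilling a perpetual-access obligation.
    """
    if not is_member:
        return False
    if perpetual_access_claim:
        return True
    return bool(events) and not title_available_elsewhere
```

Under this reading the archive stays dark in normal operation: a member library with no trigger event and no perpetual-access claim gets no access.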
  27. Types of Archives: Community. Community-based and openly run: CLOCKSS.
  28. CLOCKSS (Controlled LOCKSS). Long-term global archiving solution: community-managed, failsafe repository for scholarly content; serves libraries and publishers in the event of a long-term business interruption; publishers’ participation is voluntary. A small number of library participants maintain the archive on behalf of the larger community: libraries preserve member publisher content whether they subscribe or not. Release only after a trigger event: collaborative decision by publisher, libraries, and society to release. “Cost sharing” for system, not access. Costs?
  29. Summary Table
                Agency   Primary mission   Data          A/C        Migration
      KB        Gov’t    Preserv           Pub           Twilight   Yes
      BL        Gov’t    Preserv           Pub           ?          Yes
      Portico   Ind.     Failsafe          Pub           Dark       Yes
      PMC       Gov’t    Access            Pub, Author   Light      Yes
      LoC       Gov’t    Preserv           Pub           ?          ?
      SDR       Inst.    Preserv           Pub           Twilight   Yes
      LOCKSS    Inst.    Failsafe          Pub           Dark       -
      CLOCKSS   Comm.    Failsafe          Pub           Dark       -
  30. Summary: How Repositories Differ. Stated purpose. Dark v. light. Complete backfile v. current only. Deposits: who (author v. publisher), what (manuscripts v. final work), why (voluntary v. mandated). Rights transfer. Access control. Costs.
  31. Archiving Part III: A SAGE Example
  32. Why Archive? SAGE’s commitment to customers and partners. Critical to society arrangements. Essential for new e-sales (consortia + single institutions): perpetual access. Business continuity. Long-term preservation. We are not archiving experts!
  33. Where to Archive? Dutch KB; CLOCKSS; LOCKSS; Portico; Library of Congress; British Library.
  34. How to Archive? Provide details of digital availability. Provide sample of content. Provide details of content format (DTD). Send all backfile for loading. Set up content flow for ongoing content.
  35. SAGE Experience with Dutch KB. Contract and negotiation. Contact with technical team. Delivery of samples and details of scope. Follow-up questions. Visit KB: find out what’s happening. Delivery of back content. Delivery of ongoing issues. Ongoing issue discrepancies.
  36. Archiving Part IV: Questions, Questions and More Questions
  37. Measurements of Success. Who is overseeing the archiving process and governance? Compliance? Accuracy and legitimacy? Financial stability?
  38. Resources
      “Archiving should be done by librarians and archivists, period.” Gordon Tibbitts, Blackwell Publishing. April 4, 2006, UKSG
      Portico: http://www.portico.org/
      LOCKSS: http://lockss.stanford.edu
      CLOCKSS: http://www.lockss.org/clockss/Home
      KB E-Depot: http://www.kb.nl/index-en.html
      Depot Digital Archiving at the national library of the Netherlands: http://www-5.ibm.com/be/pdf/en/events/nextlevel/presentation_kb_den_haag_edepot_ibm_brussels_v03.pdf
      “A Fresh Look at the Reliability of Long-term Digital Storage.” Baker, Mary, et al. EuroSys 06, April 18-21, 2006
      Digital Archives & Repositories: Why should I care? Bernard Hecker, HighWire Press, Publishers Meeting, October 2004
      Archive Overview. Bernard Hecker, HighWire Press, Publishers Meeting, April 2006
      Trusted Digital Repositories: Attributes and Responsibilities. An RLG-OCLC Report. © 2002 Research Libraries Group
      British Library: Project: JCLD Pilot Project in Anticipation of E-Journals, June 2005. Simon Inger
      Note: Presentation based on Digital Archives & Repositories: Why should I care? Bernard Hecker, HighWire Press, Publishers Meeting, October 2004; Archive Overview. Bernard Hecker, HighWire Press, Publishers Meeting, April 2006; Archiving: A SAGE Example. John Shaw. Publishers Meeting, April 2006
  39. Thank You! Contact info: John.Shaw@sagepub.com, www.sagepub.com